[ceph-users] Adding and removing monitors with Mimic's new centralized configuration

2019-06-17 Thread Robert Stanford
 Is it possible to add and remove monitors in Mimic, using the new
centralized configuration method?

 Regards
  R
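
For reference: as far as I understand, monitor membership lives in the
monmap rather than in the centralized configuration database, so the
classic commands still apply in Mimic; the config database handles options,
not membership.  A minimal sketch (the mon name and address are examples):

    ceph mon add mon3 192.168.0.13:6789     # add a monitor to the monmap
    ceph mon rm mon3                        # remove it again
    ceph config set mon mon_allow_pool_delete false   # centralized config is for options like this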
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Mimic multisite and latency

2018-12-05 Thread Robert Stanford
 I have Mimic Ceph clusters that are hundreds of miles apart. I want to use
them in a multisite configuration. Will the latency between them cause any
problems?

 Regards
R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RGW performance with lots of objects

2018-11-27 Thread Robert Stanford
In the old days, when I first installed Ceph with RGW, performance would
become very slow after storing 500+ million objects in my buckets. With
Luminous and index sharding, is this still a problem, or has it been
solved?

Regards
R
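
For checking where a bucket stands on index sharding, a hedged sketch
(the bucket name and shard count are examples; Luminous also has dynamic
resharding, controlled by rgw_dynamic_resharding):

    radosgw-admin bucket limit check                               # objects per shard and fill status
    radosgw-admin reshard add --bucket=mybucket --num-shards=128   # queue a manual reshard
    radosgw-admin reshard process                                  # run queued reshards now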
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Verifying the location of the wal

2018-10-28 Thread Robert Stanford
 Mehmet: it doesn't look like the wal is mentioned in the osd metadata.  I
only see bluefs slow, bluestore bdev, and bluefs db mentioned.
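
A hedged way to check all three signals at once (osd id 0 and the paths are
examples):

    ceph osd metadata 0 | grep -E 'partition_path|bluefs'            # a separate wal shows up as bluefs_wal_partition_path
    ls -l /var/lib/ceph/osd/ceph-0/ | grep block                     # a block.wal symlink only exists for a separate wal
    ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0   # labels for block, block.db and block.wal if present

If there is no block.wal symlink and no bluefs_wal_partition_path entry,
the wal should be sharing the db device (or the data device when no
separate db was given).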

On Sun, Oct 28, 2018 at 1:48 PM  wrote:

> IIRC there is a command like
>
> ceph osd metadata
>
> where you should be able to find information like this
>
> Hab
> - Mehmet
>
> Am 21. Oktober 2018 19:39:58 MESZ schrieb Robert Stanford <
> rstanford8...@gmail.com>:
>>
>>
>>  I did exactly this when creating my osds, and found that my total
>> utilization is about the same as the sum of the utilization of the pools,
>> plus (wal size * number osds).  So it looks like my wals are actually
>> sharing OSDs.  But I'd like to be 100% sure... so I am seeking a way to
>> find out
>>
>> On Sun, Oct 21, 2018 at 11:13 AM Serkan Çoban 
>> wrote:
>>
>>> wal and db device will be same if you use just db path during osd
>>> creation. i do not know how to verify this with ceph commands.
>>> On Sun, Oct 21, 2018 at 4:17 PM Robert Stanford 
>>> wrote:
>>> >
>>> >
>>> >  Thanks Serkan.  I am using --path instead of --dev (dev won't work
>>> because I'm using VGs/LVs).  The output shows block and block.db, but
>>> nothing about wal.db.  How can I learn where my wal lives?
>>> >
>>> >
>>> >
>>> >
>>> > On Sun, Oct 21, 2018 at 12:43 AM Serkan Çoban 
>>> wrote:
>>> >>
>>> >> ceph-bluestore-tool can show you the disk labels.
>>> >> ceph-bluestore-tool show-label --dev /dev/sda1
>>> >> On Sun, Oct 21, 2018 at 1:29 AM Robert Stanford <
>>> rstanford8...@gmail.com> wrote:
>>> >> >
>>> >> >
>>> >> >  An email from this list stated that the wal would be created in
>>> the same place as the db, if the db were specified when running ceph-volume
>>> lvm create, and the db were specified on that command line.  I followed
>>> those instructions and like the other person writing to this list today, I
>>> was surprised to find that my cluster usage was higher than the total of
>>> pools (higher by an amount the same as all my wal sizes on each node
>>> combined).  This leads me to think my wal actually is on the data disk and
>>> not the ssd I specified the db should go to.
>>> >> >
>>> >> >  How can I verify which disk the wal is on, from the command line?
>>> I've searched the net and not come up with anything.
>>> >> >
>>> >> >  Thanks and regards
>>> >> >  R
>>> >> >
>>> >> > ___
>>> >> > ceph-users mailing list
>>> >> > ceph-users@lists.ceph.com
>>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RGW stale buckets

2018-10-22 Thread Robert Stanford
 Someone deleted our rgw data pool to clean up.  They recreated it
afterward.  This is fine in one respect: we don't need the data.  But
listing with radosgw-admin still shows all the buckets.  How can we clean
things up and get rgw to understand what actually exists and what doesn't?
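
One hedged approach, assuming the rgw metadata pools survived while the
data pool did not, is to walk the bucket metadata and remove the entries
for buckets whose data is gone (the bucket name and instance id are
placeholders; take care not to remove entries for buckets you still need):

    radosgw-admin metadata list bucket                    # bucket entrypoints rgw still knows about
    radosgw-admin metadata get bucket:mybucket            # inspect one entry
    radosgw-admin bucket check --bucket=mybucket          # check index consistency
    radosgw-admin metadata rm bucket:mybucket             # drop the stale entrypoint
    radosgw-admin metadata rm bucket.instance:mybucket:<instance-id>   # and its instance record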
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Drive for Wal and Db

2018-10-22 Thread Robert Stanford
 That's very helpful, thanks.  In your first case above, your
bluefs_db_partition_path and bluestore_bdev_partition_path are the same.
Because I have separate data and db drives, mine are different.  Might this
explain something?  My root concern is that there is more utilization on
the cluster than what's in the pools, with the excess roughly equal to wal
size * number of osds...

On Mon, Oct 22, 2018 at 3:35 PM David Turner  wrote:

> My DB doesn't have a specific partition anywhere, but there's still a
> symlink for it to the data partition.  On my home cluster with all DB, WAL,
> and Data on the same disk without any partitions specified there is a block
> symlink but no block.wal symlink.
>
> For the cluster with a specific WAL partition, but no DB partition, my OSD
> paths looks like [1] this.  For my cluster with everything on the same
> disk, my OSD paths look like [2] this.  Unless you have a specific path for
> "bluefs_wal_partition_path" then it's going to find itself on the same
> partition as the db.
>
> [1] $ ceph osd metadata 5 | grep path
> "bluefs_db_partition_path": "/dev/dm-29",
> "bluefs_wal_partition_path": "/dev/dm-41",
> "bluestore_bdev_partition_path": "/dev/dm-29",
>
> [2] $ ceph osd metadata 5 | grep path
> "bluefs_db_partition_path": "/dev/dm-5",
> "bluestore_bdev_partition_path": "/dev/dm-5",
>
> On Mon, Oct 22, 2018 at 4:21 PM Robert Stanford 
> wrote:
>
>>
>>  Let me add, I have no block.wal file (which the docs suggest should be
>> there).
>> http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/
>>
>> On Mon, Oct 22, 2018 at 3:13 PM Robert Stanford 
>> wrote:
>>
>>>
>>>  We're out of sync, I think.  You have your DB on your data disk so your
>>> block.db symlink points to that disk, right?  There is however no wal
>>> symlink?  So how would you verify your WAL actually lived on your NVMe?
>>>
>>> On Mon, Oct 22, 2018 at 3:07 PM David Turner 
>>> wrote:
>>>
>>>> And by the data disk I mean that I didn't specify a location for the DB
>>>> partition.
>>>>
>>>> On Mon, Oct 22, 2018 at 4:06 PM David Turner 
>>>> wrote:
>>>>
>>>>> Track down where it says they point to?  Does it match what you
>>>>> expect?  It does for me.  I have my DB on my data disk and my WAL on a
>>>>> separate NVMe.
>>>>>
>>>>> On Mon, Oct 22, 2018 at 3:21 PM Robert Stanford <
>>>>> rstanford8...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>  David - is it ensured that wal and db both live where the symlink
>>>>>> block.db points?  I assumed that was a symlink for the db, but 
>>>>>> necessarily
>>>>>> for the wal, because it can live in a place different than the db.
>>>>>>
>>>>>> On Mon, Oct 22, 2018 at 2:18 PM David Turner 
>>>>>> wrote:
>>>>>>
>>>>>>> You can always just go to /var/lib/ceph/osd/ceph-{osd-num}/ and look
>>>>>>> at where the symlinks for block and block.wal point to.
>>>>>>>
>>>>>>> On Mon, Oct 22, 2018 at 12:29 PM Robert Stanford <
>>>>>>> rstanford8...@gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>  That's what they say, however I did exactly this and my cluster
>>>>>>>> utilization is higher than the total pool utilization by about the 
>>>>>>>> number
>>>>>>>> of OSDs * wal size.  I want to verify that the wal is on the SSDs too 
>>>>>>>> but
>>>>>>>> I've asked here and no one seems to know a way to verify this.  Do you?
>>>>>>>>
>>>>>>>>  Thank you, R
>>>>>>>>
>>>>>>>> On Mon, Oct 22, 2018 at 5:22 AM Maged Mokhtar 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you specify a db on ssd and data on hdd and not explicitly
>>>>>>>>> specify a
>>>>>>>>> device for wal, wal will be placed on same ssd partition with db.
>>>>>>>>> Placing only wal on ssd or creating separate devices for wal and
>>>>>>>>> db are
>>>>>>>

Re: [ceph-users] Drive for Wal and Db

2018-10-22 Thread Robert Stanford
 Let me add, I have no block.wal file (which the docs suggest should be
there).
http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/

On Mon, Oct 22, 2018 at 3:13 PM Robert Stanford 
wrote:

>
>  We're out of sync, I think.  You have your DB on your data disk so your
> block.db symlink points to that disk, right?  There is however no wal
> symlink?  So how would you verify your WAL actually lived on your NVMe?
>
> On Mon, Oct 22, 2018 at 3:07 PM David Turner 
> wrote:
>
>> And by the data disk I mean that I didn't specify a location for the DB
>> partition.
>>
>> On Mon, Oct 22, 2018 at 4:06 PM David Turner 
>> wrote:
>>
>>> Track down where it says they point to?  Does it match what you expect?
>>> It does for me.  I have my DB on my data disk and my WAL on a separate NVMe.
>>>
>>> On Mon, Oct 22, 2018 at 3:21 PM Robert Stanford 
>>> wrote:
>>>
>>>>
>>>>  David - is it ensured that wal and db both live where the symlink
>>>> block.db points?  I assumed that was a symlink for the db, but necessarily
>>>> for the wal, because it can live in a place different than the db.
>>>>
>>>> On Mon, Oct 22, 2018 at 2:18 PM David Turner 
>>>> wrote:
>>>>
>>>>> You can always just go to /var/lib/ceph/osd/ceph-{osd-num}/ and look
>>>>> at where the symlinks for block and block.wal point to.
>>>>>
>>>>> On Mon, Oct 22, 2018 at 12:29 PM Robert Stanford <
>>>>> rstanford8...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>  That's what they say, however I did exactly this and my cluster
>>>>>> utilization is higher than the total pool utilization by about the number
>>>>>> of OSDs * wal size.  I want to verify that the wal is on the SSDs too but
>>>>>> I've asked here and no one seems to know a way to verify this.  Do you?
>>>>>>
>>>>>>  Thank you, R
>>>>>>
>>>>>> On Mon, Oct 22, 2018 at 5:22 AM Maged Mokhtar 
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> If you specify a db on ssd and data on hdd and not explicitly
>>>>>>> specify a
>>>>>>> device for wal, wal will be placed on same ssd partition with db.
>>>>>>> Placing only wal on ssd or creating separate devices for wal and db
>>>>>>> are
>>>>>>> less common setups.
>>>>>>>
>>>>>>> /Maged
>>>>>>>
>>>>>>> On 22/10/18 09:03, Fyodor Ustinov wrote:
>>>>>>> > Hi!
>>>>>>> >
>>>>>>> > For sharing SSD between WAL and DB what should be placed on SSD?
>>>>>>> WAL or DB?
>>>>>>> >
>>>>>>> > - Original Message -
>>>>>>> > From: "Maged Mokhtar" 
>>>>>>> > To: "ceph-users" 
>>>>>>> > Sent: Saturday, 20 October, 2018 20:05:44
>>>>>>> > Subject: Re: [ceph-users] Drive for Wal and Db
>>>>>>> >
>>>>>>> > On 20/10/18 18:57, Robert Stanford wrote:
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > Our OSDs are BlueStore and are on regular hard drives. Each OSD
>>>>>>> has a partition on an SSD for its DB. Wal is on the regular hard drives.
>>>>>>> Should I move the wal to share the SSD with the DB?
>>>>>>> >
>>>>>>> > Regards
>>>>>>> > R
>>>>>>> >
>>>>>>> >
>>>>>>> > ___
>>>>>>> > ceph-users mailing list [ mailto:ceph-users@lists.ceph.com |
>>>>>>> ceph-users@lists.ceph.com ] [
>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com |
>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ]
>>>>>>> >
>>>>>>> > you should put wal on the faster device, wal and db could share
>>>>>>> the same ssd partition,
>>>>>>> >
>>>>>>> > Maged
>>>>>>> >
>>>>>>> > ___
>>>>>>> > ceph-users mailing list
>>>>>>> > ceph-users@lists.ceph.com
>>>>>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>> > ___
>>>>>>> > ceph-users mailing list
>>>>>>> > ceph-users@lists.ceph.com
>>>>>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>>
>>>>>>> ___
>>>>>>> ceph-users mailing list
>>>>>>> ceph-users@lists.ceph.com
>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>>
>>>>>> ___
>>>>>> ceph-users mailing list
>>>>>> ceph-users@lists.ceph.com
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>
>>>>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Drive for Wal and Db

2018-10-22 Thread Robert Stanford
 We're out of sync, I think.  You have your DB on your data disk, so your
block.db symlink points to that disk, right?  There is, however, no wal
symlink?  So how would you verify your WAL actually lives on your NVMe?

On Mon, Oct 22, 2018 at 3:07 PM David Turner  wrote:

> And by the data disk I mean that I didn't specify a location for the DB
> partition.
>
> On Mon, Oct 22, 2018 at 4:06 PM David Turner 
> wrote:
>
>> Track down where it says they point to?  Does it match what you expect?
>> It does for me.  I have my DB on my data disk and my WAL on a separate NVMe.
>>
>> On Mon, Oct 22, 2018 at 3:21 PM Robert Stanford 
>> wrote:
>>
>>>
>>>  David - is it ensured that wal and db both live where the symlink
>>> block.db points?  I assumed that was a symlink for the db, but necessarily
>>> for the wal, because it can live in a place different than the db.
>>>
>>> On Mon, Oct 22, 2018 at 2:18 PM David Turner 
>>> wrote:
>>>
>>>> You can always just go to /var/lib/ceph/osd/ceph-{osd-num}/ and look at
>>>> where the symlinks for block and block.wal point to.
>>>>
>>>> On Mon, Oct 22, 2018 at 12:29 PM Robert Stanford <
>>>> rstanford8...@gmail.com> wrote:
>>>>
>>>>>
>>>>>  That's what they say, however I did exactly this and my cluster
>>>>> utilization is higher than the total pool utilization by about the number
>>>>> of OSDs * wal size.  I want to verify that the wal is on the SSDs too but
>>>>> I've asked here and no one seems to know a way to verify this.  Do you?
>>>>>
>>>>>  Thank you, R
>>>>>
>>>>> On Mon, Oct 22, 2018 at 5:22 AM Maged Mokhtar 
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> If you specify a db on ssd and data on hdd and not explicitly specify
>>>>>> a
>>>>>> device for wal, wal will be placed on same ssd partition with db.
>>>>>> Placing only wal on ssd or creating separate devices for wal and db
>>>>>> are
>>>>>> less common setups.
>>>>>>
>>>>>> /Maged
>>>>>>
>>>>>> On 22/10/18 09:03, Fyodor Ustinov wrote:
>>>>>> > Hi!
>>>>>> >
>>>>>> > For sharing SSD between WAL and DB what should be placed on SSD?
>>>>>> WAL or DB?
>>>>>> >
>>>>>> > - Original Message -
>>>>>> > From: "Maged Mokhtar" 
>>>>>> > To: "ceph-users" 
>>>>>> > Sent: Saturday, 20 October, 2018 20:05:44
>>>>>> > Subject: Re: [ceph-users] Drive for Wal and Db
>>>>>> >
>>>>>> > On 20/10/18 18:57, Robert Stanford wrote:
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Our OSDs are BlueStore and are on regular hard drives. Each OSD has
>>>>>> a partition on an SSD for its DB. Wal is on the regular hard drives. 
>>>>>> Should
>>>>>> I move the wal to share the SSD with the DB?
>>>>>> >
>>>>>> > Regards
>>>>>> > R
>>>>>> >
>>>>>> >
>>>>>> > ___
>>>>>> > ceph-users mailing list [ mailto:ceph-users@lists.ceph.com |
>>>>>> ceph-users@lists.ceph.com ] [
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com |
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ]
>>>>>> >
>>>>>> > you should put wal on the faster device, wal and db could share the
>>>>>> same ssd partition,
>>>>>> >
>>>>>> > Maged
>>>>>> >
>>>>>> > ___
>>>>>> > ceph-users mailing list
>>>>>> > ceph-users@lists.ceph.com
>>>>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>> > ___
>>>>>> > ceph-users mailing list
>>>>>> > ceph-users@lists.ceph.com
>>>>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>
>>>>>> ___
>>>>>> ceph-users mailing list
>>>>>> ceph-users@lists.ceph.com
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>
>>>>> ___
>>>>> ceph-users mailing list
>>>>> ceph-users@lists.ceph.com
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>
>>>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Drive for Wal and Db

2018-10-22 Thread Robert Stanford
 David - is it ensured that wal and db both live where the symlink block.db
points?  I assumed that was a symlink for the db, but not necessarily for
the wal, because it can live in a place different from the db.

On Mon, Oct 22, 2018 at 2:18 PM David Turner  wrote:

> You can always just go to /var/lib/ceph/osd/ceph-{osd-num}/ and look at
> where the symlinks for block and block.wal point to.
>
> On Mon, Oct 22, 2018 at 12:29 PM Robert Stanford 
> wrote:
>
>>
>>  That's what they say, however I did exactly this and my cluster
>> utilization is higher than the total pool utilization by about the number
>> of OSDs * wal size.  I want to verify that the wal is on the SSDs too but
>> I've asked here and no one seems to know a way to verify this.  Do you?
>>
>>  Thank you, R
>>
>> On Mon, Oct 22, 2018 at 5:22 AM Maged Mokhtar 
>> wrote:
>>
>>>
>>> If you specify a db on ssd and data on hdd and not explicitly specify a
>>> device for wal, wal will be placed on same ssd partition with db.
>>> Placing only wal on ssd or creating separate devices for wal and db are
>>> less common setups.
>>>
>>> /Maged
>>>
>>> On 22/10/18 09:03, Fyodor Ustinov wrote:
>>> > Hi!
>>> >
>>> > For sharing SSD between WAL and DB what should be placed on SSD? WAL
>>> or DB?
>>> >
>>> > - Original Message -
>>> > From: "Maged Mokhtar" 
>>> > To: "ceph-users" 
>>> > Sent: Saturday, 20 October, 2018 20:05:44
>>> > Subject: Re: [ceph-users] Drive for Wal and Db
>>> >
>>> > On 20/10/18 18:57, Robert Stanford wrote:
>>> >
>>> >
>>> >
>>> >
>>> > Our OSDs are BlueStore and are on regular hard drives. Each OSD has a
>>> partition on an SSD for its DB. Wal is on the regular hard drives. Should I
>>> move the wal to share the SSD with the DB?
>>> >
>>> > Regards
>>> > R
>>> >
>>> >
>>> > ___
>>> > ceph-users mailing list [ mailto:ceph-users@lists.ceph.com |
>>> ceph-users@lists.ceph.com ] [
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com |
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ]
>>> >
>>> > you should put wal on the faster device, wal and db could share the
>>> same ssd partition,
>>> >
>>> > Maged
>>> >
>>> > ___
>>> > ceph-users mailing list
>>> > ceph-users@lists.ceph.com
>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> > ___
>>> > ceph-users mailing list
>>> > ceph-users@lists.ceph.com
>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Drive for Wal and Db

2018-10-22 Thread Robert Stanford
 That's what they say; however, I did exactly this and my cluster
utilization is higher than the total pool utilization by about the number
of OSDs * wal size.  I want to verify that the wal is on the SSDs too, but
I've asked here and no one seems to know a way to verify this.  Do you?

 Thank you, R

On Mon, Oct 22, 2018 at 5:22 AM Maged Mokhtar  wrote:

>
> If you specify a db on ssd and data on hdd and not explicitly specify a
> device for wal, wal will be placed on same ssd partition with db.
> Placing only wal on ssd or creating separate devices for wal and db are
> less common setups.
>
> /Maged
>
> On 22/10/18 09:03, Fyodor Ustinov wrote:
> > Hi!
> >
> > For sharing SSD between WAL and DB what should be placed on SSD? WAL or
> DB?
> >
> > - Original Message -
> > From: "Maged Mokhtar" 
> > To: "ceph-users" 
> > Sent: Saturday, 20 October, 2018 20:05:44
> > Subject: Re: [ceph-users] Drive for Wal and Db
> >
> > On 20/10/18 18:57, Robert Stanford wrote:
> >
> >
> >
> >
> > Our OSDs are BlueStore and are on regular hard drives. Each OSD has a
> partition on an SSD for its DB. Wal is on the regular hard drives. Should I
> move the wal to share the SSD with the DB?
> >
> > Regards
> > R
> >
> >
> > ___
> > ceph-users mailing list [ mailto:ceph-users@lists.ceph.com |
> ceph-users@lists.ceph.com ] [
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com |
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ]
> >
> > you should put wal on the faster device, wal and db could share the same
> ssd partition,
> >
> > Maged
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Verifying the location of the wal

2018-10-21 Thread Robert Stanford
 I did exactly this when creating my osds, and found that my total
utilization is about the same as the sum of the utilization of the pools,
plus (wal size * number of osds).  So it looks like my wals are actually
living on the OSD data disks rather than the SSD.  But I'd like to be 100%
sure... so I am seeking a way to find out.

On Sun, Oct 21, 2018 at 11:13 AM Serkan Çoban  wrote:

> wal and db device will be same if you use just db path during osd
> creation. i do not know how to verify this with ceph commands.
> On Sun, Oct 21, 2018 at 4:17 PM Robert Stanford 
> wrote:
> >
> >
> >  Thanks Serkan.  I am using --path instead of --dev (dev won't work
> because I'm using VGs/LVs).  The output shows block and block.db, but
> nothing about wal.db.  How can I learn where my wal lives?
> >
> >
> >
> >
> > On Sun, Oct 21, 2018 at 12:43 AM Serkan Çoban 
> wrote:
> >>
> >> ceph-bluestore-tool can show you the disk labels.
> >> ceph-bluestore-tool show-label --dev /dev/sda1
> >> On Sun, Oct 21, 2018 at 1:29 AM Robert Stanford <
> rstanford8...@gmail.com> wrote:
> >> >
> >> >
> >> >  An email from this list stated that the wal would be created in the
> same place as the db, if the db were specified when running ceph-volume lvm
> create, and the db were specified on that command line.  I followed those
> instructions and like the other person writing to this list today, I was
> surprised to find that my cluster usage was higher than the total of pools
> (higher by an amount the same as all my wal sizes on each node combined).
> This leads me to think my wal actually is on the data disk and not the ssd
> I specified the db should go to.
> >> >
> >> >  How can I verify which disk the wal is on, from the command line?
> I've searched the net and not come up with anything.
> >> >
> >> >  Thanks and regards
> >> >  R
> >> >
> >> > ___
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Verifying the location of the wal

2018-10-21 Thread Robert Stanford
 Thanks Serkan.  I am using --path instead of --dev (dev won't work because
I'm using VGs/LVs).  The output shows block and block.db, but nothing about
block.wal.  How can I learn where my wal lives?




On Sun, Oct 21, 2018 at 12:43 AM Serkan Çoban  wrote:

> ceph-bluestore-tool can show you the disk labels.
> ceph-bluestore-tool show-label --dev /dev/sda1
> On Sun, Oct 21, 2018 at 1:29 AM Robert Stanford 
> wrote:
> >
> >
> >  An email from this list stated that the wal would be created in the
> same place as the db, if the db were specified when running ceph-volume lvm
> create, and the db were specified on that command line.  I followed those
> instructions and like the other person writing to this list today, I was
> surprised to find that my cluster usage was higher than the total of pools
> (higher by an amount the same as all my wal sizes on each node combined).
> This leads me to think my wal actually is on the data disk and not the ssd
> I specified the db should go to.
> >
> >  How can I verify which disk the wal is on, from the command line?  I've
> searched the net and not come up with anything.
> >
> >  Thanks and regards
> >  R
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Verifying the location of the wal

2018-10-20 Thread Robert Stanford
 An email from this list stated that the wal would be created in the same
place as the db, if the db were specified on the ceph-volume lvm create
command line.  I followed those instructions and, like the other person
writing to this list today, I was surprised to find that my cluster usage
was higher than the total of the pools (higher by an amount equal to all my
wal sizes on each node combined).  This leads me to think my wal is
actually on the data disk and not on the ssd I specified the db should go
to.

 How can I verify which disk the wal is on, from the command line?  I've
searched the net and not come up with anything.

 Thanks and regards
 R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Drive for Wal and Db

2018-10-20 Thread Robert Stanford
 Our OSDs are BlueStore and are on regular hard drives.  Each OSD has a
partition on an SSD for its DB.  Wal is on the regular hard drives.  Should
I move the wal to share the SSD with the DB?

 Regards
R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Favorite SSD

2018-09-17 Thread Robert Stanford
 A while back the favorite SSD for Ceph was the Samsung SM863a.  Are there
any larger SSDs that are known to work well with Ceph?  I'd like around 1TB
if possible.  Is there any better alternative to the SM863a?

 Regards
   R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous RGW errors at start

2018-09-04 Thread Robert Stanford
 This was the issue: RGW could not create the pool, because doing so would
have exceeded the new (Luminous) limit on PGs per OSD.
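
For anyone hitting the same thing, a hedged way to see and, if needed,
temporarily work around the limit (the value 300 is only an example, and
reducing PG counts is the better long-term fix; the option may need a mon
restart to fully take effect):

    ceph osd df                                             # the PGS column shows PGs per OSD
    ceph tell mon.* injectargs '--mon_max_pg_per_osd=300'   # raise the limit at runtime
    # or set mon_max_pg_per_osd in ceph.conf on the mons and restart them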

On Tue, Sep 4, 2018 at 10:35 AM David Turner  wrote:

> I was confused what could be causing this until Janne's email.  I think
> they're correct that the cluster is preventing pool creation due to too
> many PGs per OSD.  Double check how many PGs you have in each pool and what
> your defaults are for that.
>
> On Mon, Sep 3, 2018 at 7:19 AM Janne Johansson 
> wrote:
>
>> Did you change the default pg_num or pgp_num so the pools that did show
>> up made it go past the mon_max_pg_per_osd ?
>>
>>
>> Den fre 31 aug. 2018 kl 17:20 skrev Robert Stanford <
>> rstanford8...@gmail.com>:
>>
>>>
>>>  I installed a new Luminous cluster.  Everything is fine so far.  Then I
>>> tried to start RGW and got this error:
>>>
>>> 2018-08-31 15:15:41.998048 7fc350271e80  0 rgw_init_ioctx ERROR:
>>> librados::Rados::pool_create returned (34) Numerical result out of range
>>> (this can be due to a pool or placement group misconfiguration, e.g. pg_num
>>> < pgp_num or mon_max_pg_per_osd exceeded)
>>> 2018-08-31 15:15:42.005732 7fc350271e80 -1 Couldn't init storage
>>> provider (RADOS)
>>>
>>>  I notice that the only pools that exist are the data and index RGW
>>> pools (no user or log pools like on Jewel).  What is causing this?
>>>
>>>  Thank you
>>>  R
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>>
>> --
>> May the most significant bit of your life be positive.
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Luminous RGW errors at start

2018-08-31 Thread Robert Stanford
 I installed a new Luminous cluster.  Everything is fine so far.  Then I
tried to start RGW and got this error:

2018-08-31 15:15:41.998048 7fc350271e80  0 rgw_init_ioctx ERROR:
librados::Rados::pool_create returned (34) Numerical result out of range
(this can be due to a pool or placement group misconfiguration, e.g. pg_num
< pgp_num or mon_max_pg_per_osd exceeded)
2018-08-31 15:15:42.005732 7fc350271e80 -1 Couldn't init storage provider
(RADOS)

 I notice that the only pools that exist are the data and index RGW pools
(no user or log pools like on Jewel).  What is causing this?

 Thank you
 R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Error EINVAL: (22) Invalid argument While using ceph osd safe-to-destroy

2018-08-26 Thread Robert Stanford
 I am following the procedure here:
http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/

 When I get to the part to run "ceph osd safe-to-destroy $ID" in a while
loop, I get an EINVAL error.  I get this error when I run "ceph osd
safe-to-destroy 0" on the command line by itself, too.  (As an extra note,
the while loop in the instructions looks broken; I had to change it to make
it work in bash.)
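
For what it's worth, a bash form of that loop should look roughly like this
(osd id 0 is an example; this does not by itself explain the EINVAL):

    ID=0
    while ! ceph osd safe-to-destroy osd.$ID ; do
        sleep 10    # wait until the PGs have drained off this OSD
    done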

 I know my ID is correct because I was able to use it in the previous step
(ceph osd out $ID).  I also substituted the actual number for $ID on the
command line and got the same error.  Why isn't this working?

Error: Error EINVAL: (22) Invalid argument While using ceph osd
safe-to-destroy

 Thank you
R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RGW pools don't show up in luminous

2018-08-24 Thread Robert Stanford
 Casey - this was exactly it.  My ceph-mgr had issues.  I didn't know this
was necessary for ceph df to work.  Thank you

R
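
For anyone else hitting this, a quick hedged check that a mgr is up and
active (the host name is an example):

    ceph -s | grep mgr                      # should show an active mgr
    ceph mgr dump | grep active_name        # name of the active mgr, if any
    systemctl status ceph-mgr@monitor01     # assuming the mgr is co-located on a mon host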

On Fri, Aug 24, 2018 at 8:56 AM Casey Bodley  wrote:

>
>
> On 08/23/2018 01:22 PM, Robert Stanford wrote:
> >
> >  I installed a new Ceph cluster with Luminous, after a long time
> > working with Jewel.  I created my RGW pools the same as always (pool
> > create default.rgw.buckets.data etc.), but they don't show up in ceph
> > df with Luminous.  Has the command changed?
> >
> >  Thanks
> >  R
> >
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> Hi Robert,
>
> Do you have a ceph-mgr running? I believe the accounting for 'ceph df'
> is performed by ceph-mgr in Luminous and beyond, rather than ceph-mon.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Dashboard can't activate in Luminous?

2018-08-23 Thread Robert Stanford
 I just installed a new luminous cluster.  When I run this command:
ceph mgr module enable dashboard

I get this response:
all mgr daemons do not support module 'dashboard'

All daemons are Luminous (I confirmed this by running ceph version).
Why would this error appear?

 Thank you
 R
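
A hedged set of checks for this error; it can mean the mgr daemons have not
(re)registered their available modules, for example because no mgr is
running or a mgr was not restarted after its package was upgraded (the host
name is an example):

    ceph versions                           # per-daemon version summary, including mgr
    ceph mgr module ls                      # modules the active mgr reports
    systemctl restart ceph-mgr@monitor01    # restart the mgr, then retry the enable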
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RGW pools don't show up in luminous

2018-08-23 Thread Robert Stanford
 I installed a new Ceph cluster with Luminous, after a long time working
with Jewel.  I created my RGW pools the same as always (pool create
default.rgw.buckets.data etc.), but they don't show up in ceph df with
Luminous.  Has the command changed?

 Thanks
 R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore options in ceph.conf not being used

2018-08-22 Thread Robert Stanford
 David - thanks again.  Your input the last week has been invaluable.

On Wed, Aug 22, 2018 at 2:41 PM David Turner  wrote:

> Yes, whatever you set your DB LV to at the time you create the Bluestore
> OSD, it will use all of that space for the db/wal.  If you increase the
> size after the initial creation, the space will not be used for the the
> DB.  You cannot resize it.
>
> On Wed, Aug 22, 2018 at 3:39 PM Robert Stanford 
> wrote:
>
>>
>>  In my case I am using the same values for lvcreate and in the ceph.conf
>> (bluestore* settings).  Since my lvs are the size I want the db to be, and
>> since I'm told that the wal will live in the same place automatically, it
>> sounds like setting my lv to be xGB ensures Ceph will use all of this for
>> db/wal automatically?
>>
>>  Thanks
>>  R
>>
>> On Wed, Aug 22, 2018 at 2:09 PM Alfredo Deza  wrote:
>>
>>> On Wed, Aug 22, 2018 at 2:48 PM, David Turner 
>>> wrote:
>>> > The config settings for DB and WAL size don't do anything.  For journal
>>> > sizes they would be used for creating your journal partition with
>>> ceph-disk,
>>> > but ceph-volume does not use them for creating bluestore OSDs.  You
>>> need to
>>> > create the partitions for the DB and WAL yourself and supply those
>>> > partitions to the ceph-volume command.  I have heard that they're
>>> working on
>>> > this for future releases, but currently those settings don't do
>>> anything.
>>>
>>> This is accurate; ceph-volume as of the latest release doesn't do anything
>>> with them because it doesn't create these for the user.
>>>
>>> We are getting close on getting that functionality rolled out, but not
>>> ready unless you are using master (please don't use master :))
>>>
>>>
>>> >
>>> > On Wed, Aug 22, 2018 at 1:34 PM Robert Stanford <
>>> rstanford8...@gmail.com>
>>> > wrote:
>>> >>
>>> >>
>>> >>  I have created new OSDs for Ceph Luminous.  In my Ceph.conf I have
>>> >> specified that the db size be 10GB, and the wal size be 1GB.  However
>>> when I
>>> >> type ceph daemon osd.0 perf dump I get: bluestore_allocated": 5963776
>>> >>
>>> >>  I think this means that the bluestore db is using the default, and
>>> not
>>> >> the value of bluestore block db size in the ceph.conf.  Why is this?
>>> >>
>>> >>  Thanks
>>> >>  R
>>> >>
>>> >> ___
>>> >> ceph-users mailing list
>>> >> ceph-users@lists.ceph.com
>>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >
>>> >
>>> > ___
>>> > ceph-users mailing list
>>> > ceph-users@lists.ceph.com
>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >
>>>
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore options in ceph.conf not being used

2018-08-22 Thread Robert Stanford
 In my case I am using the same values for lvcreate and in the ceph.conf
(bluestore* settings).  Since my lvs are the size I want the db to be, and
since I'm told that the wal will live in the same place automatically, it
sounds like setting my lv to be xGB ensures Ceph will use all of this for
db/wal automatically?

 Thanks
 R

On Wed, Aug 22, 2018 at 2:09 PM Alfredo Deza  wrote:

> On Wed, Aug 22, 2018 at 2:48 PM, David Turner 
> wrote:
> > The config settings for DB and WAL size don't do anything.  For journal
> > sizes they would be used for creating your journal partition with
> ceph-disk,
> > but ceph-volume does not use them for creating bluestore OSDs.  You need
> to
> > create the partitions for the DB and WAL yourself and supply those
> > partitions to the ceph-volume command.  I have heard that they're
> working on
> > this for future releases, but currently those settings don't do anything.
>
> This is accurate; ceph-volume as of the latest release doesn't do anything
> with them because it doesn't create these for the user.
>
> We are getting close on getting that functionality rolled out, but not
> ready unless you are using master (please don't use master :))
>
>
> >
> > On Wed, Aug 22, 2018 at 1:34 PM Robert Stanford  >
> > wrote:
> >>
> >>
> >>  I have created new OSDs for Ceph Luminous.  In my Ceph.conf I have
> >> specified that the db size be 10GB, and the wal size be 1GB.  However
> when I
> >> type ceph daemon osd.0 perf dump I get: bluestore_allocated": 5963776
> >>
> >>  I think this means that the bluestore db is using the default, and not
> >> the value of bluestore block db size in the ceph.conf.  Why is this?
> >>
> >>  Thanks
> >>  R
> >>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] BlueStore options in ceph.conf not being used

2018-08-22 Thread Robert Stanford
 I have created new OSDs for Ceph Luminous.  In my ceph.conf I have
specified that the db size be 10GB and the wal size be 1GB.  However, when
I type ceph daemon osd.0 perf dump I get: "bluestore_allocated": 5963776

 I think this means that the bluestore db is using the default, and not the
value of bluestore block db size in the ceph.conf.  Why is this?

 Thanks
 R
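
A hedged way to see what BlueFS actually got, rather than inferring it from
bluestore_allocated (which tracks allocations on the data device, not the
DB): the bluefs section of the perf dump reports the db and wal device
sizes (osd id 0 is an example):

    ceph daemon osd.0 perf dump | grep -E 'db_total_bytes|db_used_bytes|wal_total_bytes'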
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore upgrade steps broken

2018-08-17 Thread Robert Stanford
 This is helpful, thanks.  Since the example is only for block.db, does
that imply that the wal should (can efficiently) live on the same disk as
data?

 R

On Fri, Aug 17, 2018 at 10:50 AM Alfredo Deza  wrote:

> On Fri, Aug 17, 2018 at 11:47 AM, Robert Stanford
>  wrote:
> >
> >  What's more, I was planning on using this single journal device (SSD)
> for 4
> > OSDs.  With filestore I simply told each OSD to use this drive, sdb, on
> the
> > command line, and it would create a new partition on that drive every
> time I
> > created an OSD.  I thought it would be the same for BlueStore.  So that
> begs
> > the question, how does one set up an SSD to hold journals for multiple
> OSDs,
> > both db and wal?  Searching has yielded nothing.
>
> We are working on expanding the tooling to this for you, but until
> then, it is up to the user to create the LVs manually.
>
> This section might help out a bit on what you would need (for block.db):
>
>
> http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#block-and-block-db
> >
> >  R
> >
> >
> > On Fri, Aug 17, 2018 at 9:48 AM David Turner 
> wrote:
> >>
> >> > ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc
> --block.db
> >> > /dev/sdb --block.wal /dev/sdb
> >>
> >> That command can't work... You're telling it to use the entire /dev/sdb
> >> device for the db and then again to do it for the wal, but you can only
> use
> >> the entire device once.  There are 2 things wrong with that.  First, if
> >> you're putting db and wal on the same device you do not need to specify
> the
> >> wal.  Second if you are actually intending to use a partition on
> /dev/sdb
> >> instead of the entire block device for this single OSD, then you need to
> >> manually create a partition for it and supply that partition to the
> >> --block.db command.
> >>
> >> Likely the command you want will end up being this after you create a
> >> partition on the SSD for the db/wal.
> >> `ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc
> --block.db
> >> /dev/sdb1`
> >>
> >> On Fri, Aug 17, 2018 at 10:24 AM Robert Stanford <
> rstanford8...@gmail.com>
> >> wrote:
> >>>
> >>>
> >>>  I was using the ceph-volume create command, which I understand
> combines
> >>> the prepare and activate functions.
> >>>
> >>> ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc
> --block.db
> >>> /dev/sdb --block.wal /dev/sdb
> >>>
> >>>  That is the command context I've found on the web.  Is it wrong?
> >>>
> >>>  Thanks
> >>> R
> >>>
> >>> On Fri, Aug 17, 2018 at 5:55 AM Alfredo Deza  wrote:
> >>>>
> >>>> On Thu, Aug 16, 2018 at 9:00 PM, Robert Stanford
> >>>>  wrote:
> >>>> >
> >>>> >  I am following the steps to my filestore journal with a bluestore
> >>>> > journal
> >>>> >
> >>>> > (
> http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/).
> It
> >>>> > is broken at ceph-volume lvm create.  Here is my error:
> >>>> >
> >>>> > --> Zapping successful for: /dev/sdc
> >>>> > Preparing sdc
> >>>> > Running command: /bin/ceph-authtool --gen-print-key
> >>>> > Running command: /bin/ceph --cluster ceph --name
> client.bootstrap-osd
> >>>> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> >>>> > Running command: /bin/ceph --cluster ceph --name
> client.bootstrap-osd
> >>>> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
> >>>> > ff523216-350d-4ca0-9022-0c17662c2c3b 10
> >>>> > Running command: vgcreate --force --yes
> >>>> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a /dev/sdc
> >>>> >  stdout: Physical volume "/dev/sdc" successfully created.
> >>>> >  stdout: Volume group "ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a"
> >>>> > successfully created
> >>>> > Running command: lvcreate --yes -l 100%FREE -n
> >>>> > osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b
> >>>> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a
> >>>> >  stdout: Logical volume
> >>>> > "osd-block-ff523216-350d-4ca0-9022-

Re: [ceph-users] BlueStore upgrade steps broken

2018-08-17 Thread Robert Stanford
 What's more, I was planning on using this single journal device (SSD) for
4 OSDs.  With filestore I simply told each OSD to use this drive, sdb, on
the command line, and it would create a new partition on that drive every
time I created an OSD.  I thought it would be the same for BlueStore.  So
that begs the question, how does one set up an SSD to hold journals for
multiple OSDs, both db and wal?  Searching has yielded nothing.

 R
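
A minimal sketch of one way to carve a single SSD for several OSDs with LVM
(device names, VG/LV names and sizes are all assumptions; with db and wal
on the same SSD, separate wal LVs are not needed):

    vgcreate ceph-journals /dev/sdb
    for i in 0 1 2 3 ; do
        lvcreate -L 10G -n db-$i ceph-journals
    done
    ceph-volume lvm create --bluestore --data /dev/sdc --block.db ceph-journals/db-0
    # repeat with /dev/sdd and db-1, /dev/sde and db-2, ... for the other OSDs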


On Fri, Aug 17, 2018 at 9:48 AM David Turner  wrote:

> > ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc
> --block.db /dev/sdb --block.wal /dev/sdb
>
> That command can't work... You're telling it to use the entire /dev/sdb
> device for the db and then again to do it for the wal, but you can only use
> the entire device once.  There are 2 things wrong with that.  First, if
> you're putting db and wal on the same device you do not need to specify the
> wal.  Second if you are actually intending to use a partition on /dev/sdb
> instead of the entire block device for this single OSD, then you need to
> manually create a partition for it and supply that partition to the
> --block.db command.
>
> Likely the command you want will end up being this after you create a
> partition on the SSD for the db/wal.
> `ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc --block.db
> /dev/sdb1`
>
> On Fri, Aug 17, 2018 at 10:24 AM Robert Stanford 
> wrote:
>
>>
>>  I was using the ceph-volume create command, which I understand combines
>> the prepare and activate functions.
>>
>> ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc --block.db
>> /dev/sdb --block.wal /dev/sdb
>>
>>  That is the command context I've found on the web.  Is it wrong?
>>
>>  Thanks
>> R
>>
>> On Fri, Aug 17, 2018 at 5:55 AM Alfredo Deza  wrote:
>>
>>> On Thu, Aug 16, 2018 at 9:00 PM, Robert Stanford
>>>  wrote:
>>> >
>>> >  I am following the steps to my filestore journal with a bluestore
>>> journal
>>> > (http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/).
>>> It
>>> > is broken at ceph-volume lvm create.  Here is my error:
>>> >
>>> > --> Zapping successful for: /dev/sdc
>>> > Preparing sdc
>>> > Running command: /bin/ceph-authtool --gen-print-key
>>> > Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
>>> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
>>> > Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
>>> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
>>> > ff523216-350d-4ca0-9022-0c17662c2c3b 10
>>> > Running command: vgcreate --force --yes
>>> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a /dev/sdc
>>> >  stdout: Physical volume "/dev/sdc" successfully created.
>>> >  stdout: Volume group "ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a"
>>> > successfully created
>>> > Running command: lvcreate --yes -l 100%FREE -n
>>> > osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b
>>> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a
>>> >  stdout: Logical volume
>>> "osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b"
>>> > created.
>>> > --> blkid could not detect a PARTUUID for device: sdb
>>> > --> Was unable to complete a new OSD, will rollback changes
>>> > --> OSD will be destroyed, keeping the ID because it was provided with
>>> > --osd-id
>>> > Running command: ceph osd destroy osd.10 --yes-i-really-mean-it
>>> >  stderr: destroyed osd.10
>>> > -->  RuntimeError: unable to use device
>>> >
>>> >  Note that SDB is the SSD journal.  It has been zapped prior.
>>>
>>> I can't see what the actual command you used is, but I am guessing you
>>> did something like:
>>>
>>> ceph-volume lvm prepare --filestore --data /dev/sdb --journal /dev/sdb
>>>
>>> Which is not possible. There are a few ways you can do this (see:
>>> http://docs.ceph.com/docs/master/ceph-volume/lvm/prepare/#filestore )
>>>
>>> With a raw device and a pre-created partition (must have a PARTUUID):
>>>
>>> ceph-volume lvm prepare --data /dev/sdb --journal /dev/sdc1
>>>
>>> With LVs:
>>>
>>> ceph-volume lvm prepare --data vg/my-data --journal vg/my-journal
>>>
>>> With an LV for data and a partition:
>>>
>>> ceph-volume lvm prepare --data vg/my-data --journal /dev/sdc1
>>>
>>> >
>>> >  What is going wrong, and how can I fix it?
>>> >
>>> >  Thank you
>>> >  R
>>> >
>>> >
>>> > ___
>>> > ceph-users mailing list
>>> > ceph-users@lists.ceph.com
>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore upgrade steps broken

2018-08-17 Thread Robert Stanford
 I was using the ceph-volume create command, which I understand combines
the prepare and activate functions.

ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc --block.db
/dev/sdb --block.wal /dev/sdb

 That is the command context I've found on the web.  Is it wrong?

 Thanks
R

On Fri, Aug 17, 2018 at 5:55 AM Alfredo Deza  wrote:

> On Thu, Aug 16, 2018 at 9:00 PM, Robert Stanford
>  wrote:
> >
> >  I am following the steps to my filestore journal with a bluestore
> journal
> > (http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/).
> It
> > is broken at ceph-volume lvm create.  Here is my error:
> >
> > --> Zapping successful for: /dev/sdc
> > Preparing sdc
> > Running command: /bin/ceph-authtool --gen-print-key
> > Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> > Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
> > ff523216-350d-4ca0-9022-0c17662c2c3b 10
> > Running command: vgcreate --force --yes
> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a /dev/sdc
> >  stdout: Physical volume "/dev/sdc" successfully created.
> >  stdout: Volume group "ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a"
> > successfully created
> > Running command: lvcreate --yes -l 100%FREE -n
> > osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b
> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a
> >  stdout: Logical volume "osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b"
> > created.
> > --> blkid could not detect a PARTUUID for device: sdb
> > --> Was unable to complete a new OSD, will rollback changes
> > --> OSD will be destroyed, keeping the ID because it was provided with
> > --osd-id
> > Running command: ceph osd destroy osd.10 --yes-i-really-mean-it
> >  stderr: destroyed osd.10
> > -->  RuntimeError: unable to use device
> >
> >  Note that SDB is the SSD journal.  It has been zapped prior.
>
> I can't see what the actual command you used is, but I am guessing you
> did something like:
>
> ceph-volume lvm prepare --filestore --data /dev/sdb --journal /dev/sdb
>
> Which is not possible. There are a few ways you can do this (see:
> http://docs.ceph.com/docs/master/ceph-volume/lvm/prepare/#filestore )
>
> With a raw device and a pre-created partition (must have a PARTUUID):
>
> ceph-volume lvm prepare --data /dev/sdb --journal /dev/sdc1
>
> With LVs:
>
> ceph-volume lvm prepare --data vg/my-data --journal vg/my-journal
>
> With an LV for data and a partition:
>
> ceph-volume lvm prepare --data vg/my-data --journal /dev/sdc1
>
> >
> >  What is going wrong, and how can I fix it?
> >
> >  Thank you
> >  R
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] BlueStore upgrade steps broken

2018-08-16 Thread Robert Stanford
 I am following the steps to replace my filestore journal with a bluestore
journal (http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/).
It is broken at ceph-volume lvm create.  Here is my error:

--> Zapping successful for: /dev/sdc
Preparing sdc
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
--keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
--keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
ff523216-350d-4ca0-9022-0c17662c2c3b 10
Running command: vgcreate --force --yes
ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a /dev/sdc
 stdout: Physical volume "/dev/sdc" successfully created.
 stdout: Volume group "ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a"
successfully created
Running command: lvcreate --yes -l 100%FREE -n
osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b
ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a
 stdout: Logical volume "osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b"
created.
--> blkid could not detect a PARTUUID for device: sdb
--> Was unable to complete a new OSD, will rollback changes
--> OSD will be destroyed, keeping the ID because it was provided with
--osd-id
Running command: ceph osd destroy osd.10 --yes-i-really-mean-it
 stderr: destroyed osd.10
-->  RuntimeError: unable to use device

 Note that SDB is the SSD journal.  It has been zapped prior.

 What is going wrong, and how can I fix it?

 Thank you
 R
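
In case it helps anyone searching later: per the replies in this thread,
the immediate cause is that a whole raw device (sdb) was passed as
--block.db, while ceph-volume wants either an LV or a partition with a
PARTUUID.  A hedged sketch of the partition route (the size and devices are
examples):

    sgdisk --new=1:0:+10G /dev/sdb      # GPT partition, which gets a PARTUUID automatically
    ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/sdb1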
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Scope of ceph.conf rgw values

2018-08-16 Thread Robert Stanford
 I am turning off resharding for Luminous with rgw dynamic resharding =
false on the rgw server.  When I show the configuration on that server
(with ceph daemon), I see that it is false, like I expect.  When I show the
configuration on the monitor servers, that setting shows up as "true".  Do
I need to include this line (to disable resharding) on each host, or only
the rgw servers, then?

 Thanks
  R
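
A hedged way to check which value each daemon is actually running with, via
the admin socket (the socket and daemon names are examples and depend on
your rgw instance name); as far as I understand, the option only takes
effect where a radosgw process reads it, so the rgw hosts are what matter:

    ceph daemon /var/run/ceph/ceph-client.rgw.gateway1.asok config get rgw_dynamic_resharding
    ceph daemon mon.monitor01 config get rgw_dynamic_resharding    # mons just report their own, unused, copy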
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore wal vs. db size

2018-08-15 Thread Robert Stanford
 The workload is a relatively high rate of object reads and writes through
radosgw, Gbps+ in both directions.  The OSDs are spinning disks; the
journals (filestore until now) are on SSDs, four OSDs per journal disk.

On Wed, Aug 15, 2018 at 10:58 AM, Wido den Hollander  wrote:

>
>
> On 08/15/2018 05:57 PM, Robert Stanford wrote:
> >
> >  Thank you Wido.  I don't want to make any assumptions so let me verify,
> > that's 10GB of DB per 1TB storage on that OSD alone, right?  So if I
> > have 4 OSDs sharing the same SSD journal, each 1TB, there are 4 10 GB DB
> > partitions for each?
> >
>
> Yes, that is correct.
>
> Each OSD needs 10GB/1TB of storage of DB. So size your SSD according to
> your storage needs.
>
> However, it depends on the workload if you need to offload WAL+DB to a
> SSD. What is the workload?
>
> Wido
>
> > On Wed, Aug 15, 2018 at 1:59 AM, Wido den Hollander  > <mailto:w...@42on.com>> wrote:
> >
> >
> >
> > On 08/15/2018 04:17 AM, Robert Stanford wrote:
> > > I am keeping the wal and db for a ceph cluster on an SSD.  I am
> using
> > > the masif_bluestore_block_db_size / masif_bluestore_block_wal_size
> > > parameters in ceph.conf to specify how big they should be.  Should
> these
> > > values be the same, or should one be much larger than the other?
> > >
> >
> > This has been answered multiple times on this mailinglist in the last
> > months, a bit of searching would have helped.
> >
> > Nevertheless, 1GB for the WAL is sufficient and then allocate about
> 10GB
> > of DB per TB of storage. That should be enough in most use cases.
> >
> > Now, if you can spare more DB space, do so!
> >
> > Wido
> >
> > >  R
> > >
> > >
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com <mailto:ceph-users@lists.ceph.com>
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > <http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com>
> > >
> >
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore wal vs. db size

2018-08-15 Thread Robert Stanford
 Thank you Wido.  I don't want to make any assumptions, so let me verify:
that's 10GB of DB per 1TB of storage on that OSD alone, right?  So if I
have 4 OSDs sharing the same journal SSD, each 1TB, I need four 10GB DB
partitions, one per OSD?

On Wed, Aug 15, 2018 at 1:59 AM, Wido den Hollander  wrote:

>
>
> On 08/15/2018 04:17 AM, Robert Stanford wrote:
> > I am keeping the wal and db for a ceph cluster on an SSD.  I am using
> > the masif_bluestore_block_db_size / masif_bluestore_block_wal_size
> > parameters in ceph.conf to specify how big they should be.  Should these
> > values be the same, or should one be much larger than the other?
> >
>
> This has been answered multiple times on this mailinglist in the last
> months, a bit of searching would have helped.
>
> Nevertheless, 1GB for the WAL is sufficient and then allocate about 10GB
> of DB per TB of storage. That should be enough in most use cases.
>
> Now, if you can spare more DB space, do so!
>
> Wido
>
> >  R
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] BlueStore wal vs. db size

2018-08-14 Thread Robert Stanford
I am keeping the wal and db for a ceph cluster on an SSD.  I am using the
bluestore_block_db_size / bluestore_block_wal_size parameters in ceph.conf
to specify how big they should be.  Should these values be the same, or
should one be much larger than the other?

 R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Luminous upgrade instructions include bad commands

2018-08-10 Thread Robert Stanford
[root@monitor07]# ceph version
ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous
(stable)
[root@monitor07]# ceph mon feature ls
no valid command found; 10 closest matches:
mon compact
mon scrub
mon metadata {}
mon sync force {--yes-i-really-mean-it} {--i-know-what-i-am-doing}
mon dump {}
mon stat
mon getmap {}
mon add  
mon debug unset_feature persistent|optional 
{--yes-i-really-mean-it}
mon rm 
Error EINVAL: invalid command

[root@monitor07]# ceph versions
[same error as above]

 Both of these commands are referenced in the upgrade (from Jewel / Kraken)
instructions here: https://ceph.com/releases/v12-2-0-luminous-released/

 What gives?
 R
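
One hedged way to narrow this down: these subcommands are served by the
monitors, so if the mon daemons are still running the old Jewel/Kraken code
(even with Luminous packages installed), they will return EINVAL until they
are restarted on the new binaries.  Checking what the mons are actually
running (the mon name is an example):

    ceph daemon mon.monitor07 version    # version of the running mon process on this host
    ceph tell mon.monitor07 version      # or ask a specific mon over the network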
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-deploy] Cluster Name

2018-08-10 Thread Robert Stanford
 Just FYI: I asked about cluster names a month or two back and was told
that support for them is being phased out.  I've had all sorts of problems
using clusters with custom cluster names, and have stopped using them
myself.

On Fri, Aug 10, 2018 at 2:06 AM, Glen Baars 
wrote:

> I have now gotten this working. Thanks everyone for the help. The
> RBD-Mirror service is co-located on a MON server.
>
> Key points are:
>
> Start the services on the boxes with the following syntax ( depending on
> your config file names )
>
> On primary
> systemctl start ceph-rbd-mirror@primary
>
> On secondary
> systemctl start ceph-rbd-mirror@secondary
>
> Ensure this works on both boxes
> ceph --cluster secondary -n client.secondary -s
> ceph --cluster primary -n client.primary -s
>
> check the log files under - /var/log/ceph/ceph-client.primary.log and
> /var/log/ceph/ceph-client.secondary.log
>
> My primary server had these files in it.
>
> ceph.client.admin.keyring
> ceph.client.primary.keyring
> ceph.conf
> primary.client.primary.keyring
> primary.conf
> secondary.client.secondary.keyring
> secondary.conf
>
> Kind regards,
> Glen Baars
>
> -Original Message-
> From: Thode Jocelyn 
> Sent: Thursday, 9 August 2018 1:41 PM
> To: Erik McCormick 
> Cc: Glen Baars ; Vasu Kulkarni <
> vakul...@redhat.com>; ceph-users@lists.ceph.com
> Subject: RE: [ceph-users] [Ceph-deploy] Cluster Name
>
> Hi Erik,
>
> The thing is that the rbd-mirror service uses the /etc/sysconfig/ceph file
> to determine which configuration file to use (from CLUSTER_NAME). So you
> need to set this to the name you chose for rbd-mirror to work. However
> setting this CLUSTER_NAME variable in /etc/sysconfig/ceph makes it so that
> the mon, osd etc services will also use this variable. Because of this they
> cannot start anymore as all their path are set with "ceph" as cluster name.
>
> However there might be something that I missed which would make this point
> moot
>
> Best Regards
> Jocelyn Thode
>
> -Original Message-
> From: Erik McCormick [mailto:emccorm...@cirrusseven.com]
> Sent: Wednesday, 8 August 2018 16:39
> To: Thode Jocelyn 
> Cc: Glen Baars ; Vasu Kulkarni <
> vakul...@redhat.com>; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] [Ceph-deploy] Cluster Name
>
> I'm not using this feature, so maybe I'm missing something, but from the
> way I understand cluster naming to work...
>
> I still don't understand why this is blocking for you. Unless you are
> attempting to mirror between two clusters running on the same hosts (why
> would you do this?) then systemd doesn't come into play. The --cluster flag
> on the rbd command will simply set the name of a configuration file with
> the FSID and settings of the appropriate cluster. Cluster name is just a
> way of telling ceph commands and systemd units where to find the configs.
>
> So, what you end up with is something like:
>
> /etc/ceph/ceph.conf (your local cluster configuration) on both clusters
> /etc/ceph/local.conf (config of the source cluster. Just a copy of
> ceph.conf of the source clsuter) /etc/ceph/remote.conf (config of
> destination peer cluster. Just a copy of ceph.conf of the remote cluster).
>
> Run all your rbd mirror commands against local and remote names.
> However when starting things like mons, osds, mds, etc. you need no
> cluster name as it can use ceph.conf (cluster name of ceph).
>
> Am I making sense, or have I completely missed something?
>
> -Erik
>
> On Wed, Aug 8, 2018 at 8:34 AM, Thode Jocelyn 
> wrote:
> > Hi,
> >
> >
> >
> > We are still blocked by this problem on our end. Glen did you  or
> > someone else figure out something for this ?
> >
> >
> >
> > Regards
> >
> > Jocelyn Thode
> >
> >
> >
> > From: Glen Baars [mailto:g...@onsitecomputers.com.au]
> > Sent: Thursday, 2 August 2018 05:43
> > To: Erik McCormick 
> > Cc: Thode Jocelyn ; Vasu Kulkarni
> > ; ceph-users@lists.ceph.com
> > Subject: RE: [ceph-users] [Ceph-deploy] Cluster Name
> >
> >
> >
> > Hello Erik,
> >
> >
> >
> > We are going to use RBD-mirror to replicate the clusters. This seems
> > to need separate cluster names.
> >
> > Kind regards,
> >
> > Glen Baars
> >
> >
> >
> > From: Erik McCormick 
> > Sent: Thursday, 2 August 2018 9:39 AM
> > To: Glen Baars 
> > Cc: Thode Jocelyn ; Vasu Kulkarni
> > ; ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] [Ceph-deploy] Cluster Name
> >
> >
> >
> > Don't set a cluster name. It's no longer supported. It really only
> > matters if you're running two or more independent clusters on the same
> > boxes. That's generally inadvisable anyway.
> >
> >
> >
> > Cheers,
> >
> > Erik
> >
> >
> >
> > On Wed, Aug 1, 2018, 9:17 PM Glen Baars 
> wrote:
> >
> > Hello Ceph Users,
> >
> > Does anyone know how to set the Cluster Name when deploying with
> > Ceph-deploy? I have 3 clusters to configure and need to correctly set
> > the name.
> >
> > Kind regards,
> > Glen Baars
> >
> > -Original Message-
> > From: ceph-users  On Behalf Of Glen
> > Baars
> > Sent: 

[ceph-users] BlueStore performance: SSD vs on the same spinning disk

2018-08-07 Thread Robert Stanford
 I was surprised to see an email on this list a couple of days ago, which
said that write performance would actually fall with BlueStore.  I thought
the reason BlueStore existed was to increase performance.  Nevertheless, it
seems like filestore is going away and everyone should upgrade.

 My question is: I have SSDs for filestore journals, for spinning OSDs.
When upgrading to BlueStore, am I better off using the SSDs for wal/db, or
am I better off keeping everything (data, wal, db) on the spinning disks
(from a performance perspective)?

 Thanks
  R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrading journals to BlueStore: a conundrum

2018-08-06 Thread Robert Stanford
 Eugen: I've tried similar approaches in the past and it seems like it
won't work like that.  I have to zap the entire journal disk.  Also I plan
to use the configuration tunable for making the bluestore partition (wal,
db) larger than the default.
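
(If only one partition on the SSD belonged to the OSD being rebuilt, it should
be possible to zap just that partition instead of the whole device; a sketch,
assuming /dev/sdb1 is that journal partition and /dev/sdc the data disk. To
get a larger-than-default DB, simply make that partition or LV as big as you
want, since ceph-volume uses the whole thing.)

ceph-volume lvm zap /dev/sdb1       # zaps only this partition, not the SSD
ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/sdb1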

On Mon, Aug 6, 2018 at 2:30 PM, Eugen Block  wrote:

> Hi,
>
>  How then can one upgrade journals to BlueStore when there is more than one
>> journal on the same disk?
>>
>
> if you're using one SSD for multiple OSDs the disk probably has several
> partitions. So you could just zap one partition at a time and replace the
> OSD. Or am I misunderstanding the question?
>
> Regards,
> Eugen
>
>
> Zitat von Bastiaan Visser :
>
>
> As long as your fault domain is host (or even rack) you're good, just take
>> out the entire host and recreate all osd's on it.
>>
>>
>> - Original Message -
>> From: "Robert Stanford" 
>> To: "ceph-users" 
>> Sent: Monday, August 6, 2018 8:39:07 PM
>> Subject: [ceph-users] Upgrading journals to BlueStore: a conundrum
>>
>> According to the instructions to upgrade a journal to BlueStore (
>> http://docs.ceph.com/docs/master/rados/operations/bluestore-migration/),
>> the OSD that uses the journal is destroyed and recreated.
>>
>>  I am using SSD journals, and want to use them with BlueStore.  Reusing
>> the
>> SSD requires zapping the disk (ceph-disk zap).  But this would take down
>> all OSDs that use this journal, not just the one-at-a-time that I destroy
>> and recreate when following the upgrade instructions.
>>
>>  How then can one upgrade journals to BlueStore when there is more than
>> one
>> journal on the same disk?
>>
>>  R
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Upgrading journals to BlueStore: a conundrum

2018-08-06 Thread Robert Stanford
 According to the instructions to upgrade a journal to BlueStore (
http://docs.ceph.com/docs/master/rados/operations/bluestore-migration/),
the OSD that uses the journal is destroyed and recreated.

 I am using SSD journals, and want to use them with BlueStore.  Reusing the
SSD requires zapping the disk (ceph-disk zap).  But this would take down
all OSDs that use this journal, not just the one-at-a-time that I destroy
and recreate when following the upgrade instructions.

 How then can one upgrade journals to BlueStore when there is more than one
journal on the same disk?

 R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Converting to dynamic bucket resharding in Luminous

2018-07-27 Thread Robert Stanford
 I have a Jewel Ceph cluster with RGW index sharding enabled.  I've
configured the index to have 128 shards.  I am upgrading to Luminous.  What
will happen if I enable dynamic bucket index resharding in ceph.conf?  Will
it maintain my 128 shards (the buckets are currently empty), and will it
split them (to 256, and beyond) when they get full enough?

 - R
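
(A minimal sketch of what enabling it can look like on Luminous; the option
names are the upstream ones and the bucket name is a placeholder.)

# ceph.conf on the rgw hosts
[global]
rgw_dynamic_resharding = true
rgw_max_objs_per_shard = 100000    # a shard above this triggers a reshard

# then watch it work
radosgw-admin reshard list
radosgw-admin reshard status --bucket=mybucket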
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Implementing multi-site on an existing cluster

2018-07-24 Thread Robert Stanford
 I have a Luminous Ceph cluster that uses just rgw.  We want to turn it
into a mult-site installation.  Are there instructions online for this?
I've been unable to find them.

 -R
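
(The Luminous docs cover this under migrating a single-site system to
multi-site; very roughly, and with placeholder realm/zonegroup/zone names and
endpoint, the sequence on the existing cluster looks something like the
sketch below, after which the second cluster pulls the realm and creates a
secondary zone.)

radosgw-admin realm create --rgw-realm=myrealm --default
radosgw-admin zonegroup rename --rgw-zonegroup default --zonegroup-new-name=us
radosgw-admin zone rename --rgw-zone default --zone-new-name=us-east-1 --rgw-zonegroup=us
radosgw-admin zonegroup modify --rgw-realm=myrealm --rgw-zonegroup=us --endpoints http://rgw1:8080 --master --default
radosgw-admin zone modify --rgw-realm=myrealm --rgw-zonegroup=us --rgw-zone=us-east-1 --endpoints http://rgw1:8080 --master --default
radosgw-admin period update --commit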
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Converting to multisite

2018-07-23 Thread Robert Stanford
 I already have a set of default.rgw.* pools.  They are in use.  I want to
convert to multisite.  The tutorials show to create new pools
(zone.rgw.*).  Do I have to destroy my old pools and lose all data, in
order to convert to multisite?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Converting to BlueStore, and external journal devices

2018-07-19 Thread Robert Stanford
 Thank you.  Sounds like the typical configuration is just RocksDB on the
SSD, and both data and WAL on the OSD disk?
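
(For what it's worth, when only --block.db is given BlueStore keeps the WAL on
the same fast device as the DB, so a separate --block.wal is usually not
needed; a sketch, assuming /dev/sdc is a spinner and /dev/nvme0n1p1 a
partition on the SSD.)

ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/nvme0n1p1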

On Thu, Jul 19, 2018 at 9:00 AM, Eugen Block  wrote:

> Hi,
>
> if you have SSDs for RocksDB, you should provide that in the command
> (--block.db $DEV), otherwise Ceph will use the one provided disk for all
> data and RocksDB/WAL.
> Before you create that OSD you probably should check out the help page for
> that command, maybe there are more options you should be aware of, e.g. a
> separate WAL on NVMe.
>
> Regards,
> Eugen
>
>
> Zitat von Robert Stanford :
>
>
> I am following the steps here:
>> http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/
>>
>>  The final step is:
>>
>> ceph-volume lvm create --bluestore --data $DEVICE --osd-id $ID
>>
>>
>>  I notice this command doesn't specify a device to use as the journal.  Is
>> it implied that BlueStore will use the same (OSD) device for the function?
>> I don't think that's what I want (I have spinning disks for data, and SSDs
>> for journals).  Is there any reason *not* to specify which device to use
>> for a journal, when creating an OSD with BlueStore capability?
>>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Converting to BlueStore, and external journal devices

2018-07-19 Thread Robert Stanford
 I am following the steps here:
http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/

 The final step is:

ceph-volume lvm create --bluestore --data $DEVICE --osd-id $ID


 I notice this command doesn't specify a device to use as the journal.  Is
it implied that BlueStore will use the same (OSD) device for the function?
I don't think that's what I want (I have spinning disks for data, and SSDs
for journals).  Is there any reason *not* to specify which device to use
for a journal, when creating an OSD with BlueStore capability?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Increase tcmalloc thread cache bytes - still recommended?

2018-07-19 Thread Robert Stanford
 It seems that the Ceph community no longer recommends changing to
jemalloc.  However, the page below also recommends doing what's in this email's
subject:
https://ceph.com/geen-categorie/the-ceph-and-tcmalloc-performance-story/

 Is it still recommended to increase the tcmalloc thread cache bytes, or is
that recommendation old and no longer applicable?

 Thanks
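
(If you do experiment with it, the value is normally set in the environment
file the Ceph units read, not in ceph.conf; the path depends on the distro,
and 128 MB is just the commonly cited figure.)

# /etc/default/ceph on Debian/Ubuntu, /etc/sysconfig/ceph on RHEL/CentOS
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728   # 128 MB; restart OSDs after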
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] multisite and link speed

2018-07-17 Thread Robert Stanford
 I have ceph clusters in a zone configured as active/passive, or
primary/backup.  If the network link between the two clusters is slower
than the speed of data coming in to the active cluster, what will
eventually happen?  Will data pool on the active cluster until memory runs
out?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] tcmalloc performance still relevant?

2018-07-17 Thread Robert Stanford
Looking here:
https://ceph.com/geen-categorie/the-ceph-and-tcmalloc-performance-story/

 I see that it was a good idea to change to JEMalloc.  Is this still the
case, with up to date Linux and current Ceph?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD tuning no longer required?

2018-07-16 Thread Robert Stanford
 Golden advice.  Thank you Greg

On Mon, Jul 16, 2018 at 1:45 PM, Gregory Farnum  wrote:

> On Fri, Jul 13, 2018 at 2:50 AM Robert Stanford 
> wrote:
>
>>
>>  This is what leads me to believe it's other settings being referred to
>> as well:
>> https://ceph.com/community/new-luminous-rados-improvements/
>>
>> *"There are dozens of documents floating around with long lists of Ceph
>> configurables that have been tuned for optimal performance on specific
>> hardware or for specific workloads.  In most cases these ceph.conf
>> fragments tend to induce funny looks on developers’ faces because the
>> settings being adjusted seem counter-intuitive, unrelated to the
>> performance of the system, and/or outright dangerous.  Our goal is to make
>> Ceph work as well as we can out of the box without requiring any tuning at
>> all, so we are always striving to choose sane defaults.  And generally, we
>> discourage tuning by users. "*
>>
>> To me it's not just bluestore settings / sdd vs. hdd they're talking
>> about ("dozens of documents floating around"... "our goal... without any
>> tuning at all".  Am I off base?
>>
>
> Ceph is *extremely* tunable, because whenever we set up a new behavior
> (snapshot trimming sleeps, scrub IO priorities, whatever) and we're not
> sure how it should behave we add a config option. Most of these config
> options we come up with some value through testing or informed guesswork,
> set it in the config, and expect that users won't ever see it. Some of
> these settings we don't know what they should be, and we really hope the
> whole mechanism gets replaced before users see it, but they don't. Some of
> the settings should be auto-tuning or manually set to a different value for
> each deployment to get optimal performance.
> So there are lots of options for people to make things much better or much
> worse for themselves.
>
> However, by far the biggest impact and most common tunables are those that
> basically vary on if the OSD is using a hard drive or an SSD for its local
> storage — those are order-of-magnitude differences in expected latency and
> throughput. So we now have separate default tunables for those cases which
> are automatically applied.
>
> Could somebody who knows what they're doing tweak things even better for a
> particular deployment? Undoubtedly. But do *most* people know what they're
> doing that well? They don't.
> In particular, the old "fix it" configuration settings that a lot of
> people were sharing and using starting in the Cuttlefish days are rather
> dangerously out of date, and we no longer have defaults that are quite as
> stupid as some of those were.
>
> So I'd generally recommend you remove any custom tuning you've set up
> unless you have specific reason to think it will do better than the
> defaults for your currently-deployed release.
> -Greg
>
>
>>
>>  Regards
>>
>> On Thu, Jul 12, 2018 at 9:12 PM, Konstantin Shalygin 
>> wrote:
>>
>>>   I saw this in the Luminous release notes:
>>>>
>>>>   "Each OSD now adjusts its default configuration based on whether the
>>>> backing device is an HDD or SSD. Manual tuning generally not required"
>>>>
>>>>   Which tuning in particular?  The ones in my configuration are
>>>> osd_op_threads, osd_disk_threads, osd_recovery_max_active,
>>>> osd_op_thread_suicide_timeout, and osd_crush_chooseleaf_type, among
>>>> others.  Can I rip these out when I upgrade to
>>>> Luminous?
>>>>
>>>
>>> This mean that some "bluestore_*" settings tuned for nvme/hdd separately.
>>>
>>> Also with Luminous we have:
>>>
>>> osd_op_num_shards_(ssd|hdd)
>>>
>>> osd_op_num_threads_per_shard_(ssd|hdd)
>>>
>>> osd_recovery_sleep_(ssd|hdd)
>>>
>>>
>>>
>>>
>>> k
>>>
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Luminous dynamic resharding, when index max shards already set

2018-07-16 Thread Robert Stanford
I am upgrading my clusters to Luminous.  We are already using rados
gateway, and index max shards has been set for the rgw data pools.  Now we
want to use Luminous dynamic index resharding.  How do we make this
transition?

 Regards
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Bluestore and number of devices

2018-07-13 Thread Robert Stanford
 I'm using filestore now, with 4 data devices per journal device.

 I'm confused by this: "BlueStore manages either one, two, or (in certain
cases) three storage devices."
(
http://docs.ceph.com/docs/luminous/rados/configuration/bluestore-config-ref/
)

 When I convert my journals to bluestore, will they still be four data
devices (osds) per journal, or will they each require a dedicated journal
drive now?

 Regards
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD tuning no longer required?

2018-07-13 Thread Robert Stanford
 This is what leads me to believe it's other settings being referred to as
well:
https://ceph.com/community/new-luminous-rados-improvements/

*"There are dozens of documents floating around with long lists of Ceph
configurables that have been tuned for optimal performance on specific
hardware or for specific workloads.  In most cases these ceph.conf
fragments tend to induce funny looks on developers’ faces because the
settings being adjusted seem counter-intuitive, unrelated to the
performance of the system, and/or outright dangerous.  Our goal is to make
Ceph work as well as we can out of the box without requiring any tuning at
all, so we are always striving to choose sane defaults.  And generally, we
discourage tuning by users. "*

To me it's not just bluestore settings / sdd vs. hdd they're talking about
("dozens of documents floating around"... "our goal... without any tuning
at all".  Am I off base?

 Regards

On Thu, Jul 12, 2018 at 9:12 PM, Konstantin Shalygin  wrote:

>   I saw this in the Luminous release notes:
>>
>>   "Each OSD now adjusts its default configuration based on whether the
>> backing device is an HDD or SSD. Manual tuning generally not required"
>>
>>   Which tuning in particular?  The ones in my configuration are
>> osd_op_threads, osd_disk_threads, osd_recovery_max_active,
>> osd_op_thread_suicide_timeout, and osd_crush_chooseleaf_type, among
>> others.  Can I rip these out when I upgrade to
>> Luminous?
>>
>
> This mean that some "bluestore_*" settings tuned for nvme/hdd separately.
>
> Also with Luminous we have:
>
> osd_op_num_shards_(ssd|hdd)
>
> osd_op_num_threads_per_shard_(ssd|hdd)
>
> osd_recovery_sleep_(ssd|hdd)
>
>
>
>
> k
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSD tuning no longer required?

2018-07-12 Thread Robert Stanford
 I saw this in the Luminous release notes:

 "Each OSD now adjusts its default configuration based on whether the
backing device is an HDD or SSD. Manual tuning generally not required"

 Which tuning in particular?  The ones in my configuration are
osd_op_threads, osd_disk_threads, osd_recovery_max_active,
osd_op_thread_suicide_timeout, and osd_crush_chooseleaf_type, among
others.  Can I rip these out when I upgrade to
Luminous?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSDs for data drives

2018-07-11 Thread Robert Stanford
 Any opinions on the Dell DC S3520 (for journals)?  That's what I have,
stock and I wonder if I should replace them.

On Wed, Jul 11, 2018 at 8:34 AM, Simon Ironside 
wrote:

>
> On 11/07/18 14:26, Simon Ironside wrote:
>
> The 2TB Samsung 850 EVO for example is only rated for 300TBW (terabytes
>> written). Over the 5 year warranty period that's only 165GB/day, not even
>> 0.01 full drive writes per day. The SM863a part of the same size is rated
>> for 12,320TBW, over 3 DWPD.
>>
>
> Sorry, my maths is out above - that should be "not even 0.1 full drive
> writes per day".
>
> Simon
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSDs for data drives

2018-07-11 Thread Robert Stanford
Paul -

 That's extremely helpful, thanks.  I do have another cluster that uses
Samsung SM863a just for journal (spinning disks for data).  Do you happen
to have an opinion on those as well?

On Wed, Jul 11, 2018 at 4:03 AM, Paul Emmerich 
wrote:

> PM/SM863a are usually great disks and should be the default go-to option,
> they outperform
> even the more expensive PM1633 in our experience.
> (But that really doesn't matter if it's for the full OSD and not as
> dedicated WAL/journal)
>
> We got a cluster with a few hundred SanDisk Ultra II (discontinued, i
> believe) that was built on a budget.
> Not the best disk but great value. They have been running since ~3 years
> now with very few failures and
> okayish overall performance.
>
> We also got a few clusters with a few hundred SanDisk Extreme Pro, but we
> are not yet sure about their
> long-time durability as they are only ~9 months old (average of ~1000
> write IOPS on each disk over that time).
> Some of them report only 50-60% lifetime left.
>
> For NVMe, the Intel NVMe 750 is still a great disk
>
> Be carefuly to get these exact models. Seemingly similar disks might be
> just completely bad, for
> example, the Samsung PM961 is just unusable for Ceph in our experience.
>
> Paul
>
> 2018-07-11 10:14 GMT+02:00 Wido den Hollander :
>
>>
>>
>> On 07/11/2018 10:10 AM, Robert Stanford wrote:
>> >
>> >  In a recent thread the Samsung SM863a was recommended as a journal
>> > SSD.  Are there any recommendations for data SSDs, for people who want
>> > to use just SSDs in a new Ceph cluster?
>> >
>>
>> Depends on what you are looking for, SATA, SAS3 or NVMe?
>>
>> I have very good experiences with these drives running with BlueStore in
>> them in SuperMicro machines:
>>
>> - SATA: Samsung PM863a
>> - SATA: Intel S4500
>> - SAS: Samsung PM1633
>> - NVMe: Samsung PM963
>>
>> Running WAL+DB+DATA with BlueStore on the same drives.
>>
>> Wido
>>
>> >  Thank you
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSDs for data drives

2018-07-11 Thread Robert Stanford
 Wido -

 You're using the same SATA drive as journals and data drives both?  I want
to make sure my question was understood, since you mention BlueStore (maybe
you were just using them for journals; I want to make sure I understood).

 Thanks

On Wed, Jul 11, 2018 at 3:14 AM, Wido den Hollander  wrote:

>
>
> On 07/11/2018 10:10 AM, Robert Stanford wrote:
> >
> >  In a recent thread the Samsung SM863a was recommended as a journal
> > SSD.  Are there any recommendations for data SSDs, for people who want
> > to use just SSDs in a new Ceph cluster?
> >
>
> Depends on what you are looking for, SATA, SAS3 or NVMe?
>
> I have very good experiences with these drives running with BlueStore in
> them in SuperMicro machines:
>
> - SATA: Samsung PM863a
> - SATA: Intel S4500
> - SAS: Samsung PM1633
> - NVMe: Samsung PM963
>
> Running WAL+DB+DATA with BlueStore on the same drives.
>
> Wido
>
> >  Thank you
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] SSDs for data drives

2018-07-11 Thread Robert Stanford
 In a recent thread the Samsung SM863a was recommended as a journal SSD.
Are there any recommendations for data SSDs, for people who want to use
just SSDs in a new Ceph cluster?

 Thank you
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] size of journal partitions pretty small

2018-07-10 Thread Robert Stanford
 I installed my OSDs using ceph-disk.  The journals are SSDs and are 1TB.
I notice that Ceph has dedicated only 5GB to each of the four OSDs that use
the journal.

 1) Is this normal?

 2) Would performance increase if I made the partitions bigger?

 Thank you
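
(The 5GB comes from the ceph-disk default of osd journal size = 5120, in MB; a
larger value only applies to OSDs prepared after the change, and existing
journal partitions are not resized. A sketch:)

[osd]
osd journal size = 20480   # 20 GB, read by ceph-disk at prepare time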
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Journel SSD recommendation

2018-07-10 Thread Robert Stanford
 Do the recommendations apply to both data and journal SSDs equally?

On Tue, Jul 10, 2018 at 12:59 PM, Satish Patel  wrote:

> On Tue, Jul 10, 2018 at 11:51 AM, Simon Ironside
>  wrote:
> > Hi,
> >
> > On 10/07/18 16:25, Satish Patel wrote:
> >>
> >> Folks,
> >>
> >> I am in middle or ordering hardware for my Ceph cluster, so need some
> >> recommendation from communities.
> >>
> >> - What company/Vendor SSD is good ?
> >
> >
> > Samsung SM863a is the current favourite I believe.
>
> Thanks, I would also like to know about Intel SSD 3700 (Intel SSD SC
> 3700 Series SSDSC2BA400G3P), price also looking promising, Do you have
> opinion on it?  also should i get 1 SSD driver for journal or need
> two? I am planning to put 5 OSD per server
>
>
> >
> > The Intel DC S4600 is one to specifically avoid at the moment unless the
> > latest firmware has resolved some of the list member reported issues.
> >
> >> - What size should be good for Journal (for BlueStore)
> >
> >
> > ceph-disk defaults to a RocksDB partition that is 1% of the main device
> > size. That'll get you in the right ball park.
> >
> >> I have lots of Samsung 850 EVO but they are consumer, Do you think
> >> consume drive should be good for journal?
> >
> >
> > No :)
> >
> > Cheers,
> > Simon.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rgw non-ec pool and multipart uploads

2018-06-26 Thread Robert Stanford
 After I started using multipart uploads to RGW, Ceph automatically created
a non-ec pool.  It looks like it stores object pieces there until all the
pieces of a multipart upload arrive, then moves the completed piece to the
normal rgw data pool.  Is this correct?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw bucket listing (s3 ls s3://$bucketname) slow with ~2 billion objects

2018-05-01 Thread Robert Stanford
 I second the indexless bucket suggestion.  The downside being that you
can't use bucket policies like object expiration in that case.

On Tue, May 1, 2018 at 10:02 AM, David Turner <drakonst...@gmail.com> wrote:

> Any time using shared storage like S3 or cephfs/nfs/gluster/etc the
> absolute rule that I refuse to break is to never rely on a directory
> listing to know where objects/files are.  You should be maintaining a
> database of some sort or a deterministic naming scheme.  The only time a
> full listing of a directory should be required is if you feel like your
> tooling is orphaning files and you want to clean them up.  If I had someone
> with a bucket with 2B objects, I would force them to use an index-less
> bucket.
>
> That's me, though.  I'm sure there are ways to manage a bucket in other
> ways, but it sounds awful.
>
> On Tue, May 1, 2018 at 10:10 AM Robert Stanford <rstanford8...@gmail.com>
> wrote:
>
>>
>>  Listing will always take forever when using a high shard number, AFAIK.
>> That's the tradeoff for sharding.  Are those 2B objects in one bucket?
>> How's your read and write performance compared to a bucket with a lower
>> number (thousands) of objects, with that shard number?
>>
>> On Tue, May 1, 2018 at 7:59 AM, Katie Holly <8ld3j...@meo.ws> wrote:
>>
>>> One of our radosgw buckets has grown a lot in size, `rgw bucket stats
>>> --bucket $bucketname` reports a total of 2,110,269,538 objects with the
>>> bucket index sharded across 32768 shards, listing the root context of the
>>> bucket with `s3 ls s3://$bucketname` takes more than an hour which is the
>>> hard limit to first-byte on our nginx reverse proxy and the aws-cli times
>>> out long before that timeout limit is hit.
>>>
>>> The software we use supports sharding the data across multiple s3
>>> buckets but before I go ahead and enable this, has anyone ever had that
>>> many objects in a single RGW bucket and can let me know how you solved the
>>> problem of RGW taking a long time to read the full index?
>>>
>>> --
>>> Best regards
>>>
>>> Katie Holly
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw bucket listing (s3 ls s3://$bucketname) slow with ~2 billion objects

2018-05-01 Thread Robert Stanford
 Listing will always take forever when using a high shard number, AFAIK.
That's the tradeoff for sharding.  Are those 2B objects in one bucket?
How's your read and write performance compared to a bucket with a lower
number (thousands) of objects, with that shard number?

On Tue, May 1, 2018 at 7:59 AM, Katie Holly <8ld3j...@meo.ws> wrote:

> One of our radosgw buckets has grown a lot in size, `rgw bucket stats
> --bucket $bucketname` reports a total of 2,110,269,538 objects with the
> bucket index sharded across 32768 shards, listing the root context of the
> bucket with `s3 ls s3://$bucketname` takes more than an hour which is the
> hard limit to first-byte on our nginx reverse proxy and the aws-cli times
> out long before that timeout limit is hit.
>
> The software we use supports sharding the data across multiple s3 buckets
> but before I go ahead and enable this, has anyone ever had that many
> objects in a single RGW bucket and can let me know how you solved the
> problem of RGW taking a long time to read the full index?
>
> --
> Best regards
>
> Katie Holly
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSDs not starting if the cluster name is not ceph

2018-04-23 Thread Robert Stanford
 Thanks, yeah we will move away from it.  Sadly, this is one of many
little- (or non-) documented things that have made adapting Ceph for
large-scale use a pain.  Hopefully it will be worth it.

On Mon, Apr 23, 2018 at 4:25 PM, David Turner <drakonst...@gmail.com> wrote:

> If you can move away from having a non-default cluster name, do that.
> It's honestly worth the hassle if it's early enough in your deployment.
> Otherwise you'll end up needing to symlink a lot of things to the default
> ceph name.  Back when it was supported, we still needed to have
> /etc/ceph/ceph.conf symlinked to our actual config file for multiple ceph
> tools.  It was never a widely adopted feature, and the nature open-source
> had a lot of people contributing tools that had never used or thought about
> clusters with different names.
>
> On Fri, Apr 20, 2018 at 4:56 PM Robert Stanford <rstanford8...@gmail.com>
> wrote:
>
>>
>>  Thanks Gregory.  How much trouble I'd have saved if I'd only known
>> this...
>>
>> On Fri, Apr 20, 2018 at 3:41 PM, Gregory Farnum <gfar...@redhat.com>
>> wrote:
>>
>>> Not sure about this specific issue, but I believe we've deprecated the
>>> use of cluster names due to (very) low usage and trouble reliably testing
>>> for all the little things like this. :/
>>> -Greg
>>>
>>> On Fri, Apr 20, 2018 at 10:18 AM Robert Stanford <
>>> rstanford8...@gmail.com> wrote:
>>>
>>>>
>>>>  If I use another cluster name (other than the default "ceph"), I've
>>>> learned that I have to create symlinks in /var/lib/ceph/osd/ with
>>>> [cluster-name]-[osd-num] that symlink to ceph-[osd-num].  The ceph-disk
>>>> command doesn't seem to take a --cluster argument like other commands.
>>>>
>>>>  Is this a known issue, or am I missing something?
>>>> ___
>>>> ceph-users mailing list
>>>> ceph-users@lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSDs not starting if the cluster name is not ceph

2018-04-20 Thread Robert Stanford
 Thanks Gregory.  How much trouble I'd have saved if I'd only known this...

On Fri, Apr 20, 2018 at 3:41 PM, Gregory Farnum <gfar...@redhat.com> wrote:

> Not sure about this specific issue, but I believe we've deprecated the use
> of cluster names due to (very) low usage and trouble reliably testing for
> all the little things like this. :/
> -Greg
>
> On Fri, Apr 20, 2018 at 10:18 AM Robert Stanford <rstanford8...@gmail.com>
> wrote:
>
>>
>>  If I use another cluster name (other than the default "ceph"), I've
>> learned that I have to create symlinks in /var/lib/ceph/osd/ with
>> [cluster-name]-[osd-num] that symlink to ceph-[osd-num].  The ceph-disk
>> command doesn't seem to take a --cluster argument like other commands.
>>
>>  Is this a known issue, or am I missing something?
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSDs not starting if the cluster name is not ceph

2018-04-20 Thread Robert Stanford
 If I use another cluster name (other than the default "ceph"), I've
learned that I have to create symlinks in /var/lib/ceph/osd/ with
[cluster-name]-[osd-num] that symlink to ceph-[osd-num].  The ceph-disk
command doesn't seem to take a --cluster argument like other commands.

 Is this a known issue, or am I missing something?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Tens of millions of objects in a sharded bucket

2018-04-19 Thread Robert Stanford
 The rule of thumb is not to have tens of millions of objects in a radosgw
bucket, because reads will be slow.  If using bucket index sharding (with
128 or 256 shards), does this eliminate this concern?  Has anyone tried
tens of millions (20-40M) of objects with sharded indexes?

 Thank you
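
(For what it's worth, newer radosgw-admin builds can report how full each
bucket's index shards are, which is a quick sanity check that 128 or 256
shards keeps objects-per-shard in a comfortable range; availability of the
subcommand depends on the exact version, and the bucket name below is a
placeholder.)

radosgw-admin bucket limit check
radosgw-admin bucket stats --bucket=mybucket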
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fixing bad radosgw index

2018-04-16 Thread Robert Stanford
 This doesn't work for me:

for i in `radosgw-admin bucket list`; do radosgw-admin bucket unlink
--bucket=$i --uid=myuser; done   (tried with and without '=')

 Errors for each bucket:

failure: (2) No such file or directory2018-04-16 15:37:54.022423
7f7c250fbc80  0 could not get bucket info for bucket="bucket5",

On Mon, Apr 16, 2018 at 8:30 AM, Casey Bodley <cbod...@redhat.com> wrote:

>
>
> On 04/14/2018 12:54 PM, Robert Stanford wrote:
>
>
>  I deleted my default.rgw.buckets.data and default.rgw.buckets.index pools
> in an attempt to clean them out.  I brought this up on the list and
> received replies telling me essentially, "You shouldn't do that."  There
> was however no helpful advice on recovering.
>
>  When I run 'radosgw-admin bucket list' I get a list of all my old buckets
> (I thought they'd be cleaned out when I deleted and recreated
> default.rgw.buckets.index, but I was wrong.)  Deleting them with s3cmd and
> radosgw-admin does nothing; they still appear (though s3cmd will give a
> '404' error.)  Running radosgw-admin with 'bucket check' and '--fix' does
> nothing as well.  So, how do I get myself out of this mess.
>
>  On another, semi-related note, I've been deleting (existing) buckets and
> their contents with s3cmd (and --recursive); the space is never freed from
> ceph and the bucket still appears in s3cmd ls.  Looks like my radosgw has
> several issues, maybe all related to deleting and recreating the pools.
>
>  Thanks
>
>
> ___
> ceph-users mailing list
> ceph-us...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> The 'bucket list' command takes a user and prints the list of buckets they
> own - this list is read from the user object itself. You can remove these
> entries with the 'bucket unlink' command.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] High TCP retransmission rates, only with Ceph

2018-04-15 Thread Robert Stanford
 I should have been more clear.  The TCP retransmissions are on the OSD
host.

On Sun, Apr 15, 2018 at 1:48 PM, Paweł Sadowski <c...@sadziu.pl> wrote:

> On 04/15/2018 08:18 PM, Robert Stanford wrote:
>
>>
>>  Iperf gives about 7Gb/s between a radosgw host and one of my OSD hosts
>> (8 disks, 8 OSD daemons, one of 3 OSD hosts).  When I benchmark radosgw
>> with cosbench I see high TCP retransmission rates (from sar -n ETCP 1).  I
>> don't see this with iperf.  Why would Ceph, but not iperf, cause high TCP
>> retransmission rates?
>>
>
> Most probably your application (radosgw in this case) is not able to
> process requests fast enough and some packets are dropped.
>
> --
> PS
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] High TCP retransmission rates, only with Ceph

2018-04-15 Thread Robert Stanford
 Iperf gives about 7Gb/s between a radosgw host and one of my OSD hosts (8
disks, 8 OSD daemons, one of 3 OSD hosts).  When I benchmark radosgw with
cosbench I see high TCP retransmission rates (from sar -n ETCP 1).  I don't
see this with iperf.  Why would Ceph, but not iperf, cause high TCP
retransmission rates?

 Thank you
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fixing bad radosgw index

2018-04-14 Thread Robert Stanford
 I deleted my default.rgw.buckets.data and default.rgw.buckets.index pools
in an attempt to clean them out.  I brought this up on the list and
received replies telling me essentially, "You shouldn't do that."  There
was however no helpful advice on recovering.

 When I run 'radosgw-admin bucket list' I get a list of all my old buckets
(I thought they'd be cleaned out when I deleted and recreated
default.rgw.buckets.index, but I was wrong.)  Deleting them with s3cmd and
radosgw-admin does nothing; they still appear (though s3cmd will give a
'404' error.)  Running radosgw-admin with 'bucket check' and '--fix' does
nothing as well.  So, how do I get myself out of this mess?

 On another, semi-related note, I've been deleting (existing) buckets and
their contents with s3cmd (and --recursive); the space is never freed from
ceph and the bucket still appears in s3cmd ls.  Looks like my radosgw has
several issues, maybe all related to deleting and recreating the pools.

 Thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Cluster unusable after 50% full, even with index sharding

2018-04-13 Thread Robert Stanford
 I have 65TB stored on 24 OSDs on 3 hosts (8 OSDs per host).  SSD journals
and spinning disks.  Our performance before was acceptable for our purposes
- 300+MB/s simultaneous transmit and receive.  Now that we're up to about
50% of our total storage capacity (65/120TB, say), the write performance is
still ok, but the read performance is unworkable (35MB/s!)

 I am using index sharding, with 256 shards.  I don't see any CPUs
saturated on any host (we are using radosgw by the way, and the load is
light there as well).  The hard drives don't seem to be *too* busy (a
random OSD shows ~10 wa in top).  The network's fine, as we were doing much
better in terms of speed before we filled up.

  Is there anything we can do about this, short of replacing hardware?  Is
it really a limitation of Ceph that getting 50% full makes your cluster
unusable?  Index sharding has seemed to not help at all (I did some
benchmarking, with 128 shards and then 256; same result each time.)

 Or are we out of luck?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Purged a pool, buckets remain

2018-04-11 Thread Robert Stanford
 Ok.  How do I fix what's broken?  How do I "rebuild my index"?

 Thanks

On Wed, Apr 11, 2018 at 1:49 AM, Robin H. Johnson <robb...@gentoo.org>
wrote:

> On Tue, Apr 10, 2018 at 10:06:57PM -0500, Robert Stanford wrote:
> >  I used this command to purge my rgw data:
> >
> >  rados purge default.rgw.buckets.data --yes-i-really-really-mean-it
> >
> >  Now, when I list the buckets with s3cmd, I still see the buckets (s3cmd
> ls
> > shows a listing of them.)  When I try to delete one (s3cmd rb) I get
> this:
> ...
> >  I thought maybe the names were sticking around in
> > default.rgw.buckets.index, so I purged that too.  But no luck, the
> phantom
> > buckets are still there.
> The list of buckets is in the OMAP of the users.
>
> But as the others said, this was not a good way to go about trying to
> delete the data.
>
> The only case I can see is if you were playing around and wanted to
> completely stop using RGW in an existing cluster, and do CephFS or RBD
> instead.
>
> If you did want want to completely get rid of RGW data, you should wipe
> out ALL of the RGW pools, not just the data pool.
> "radosgw-admin zone get" will show them to you.
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
> E-Mail   : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Purged a pool, buckets remain

2018-04-10 Thread Robert Stanford
 I used this command to purge my rgw data:

 rados purge default.rgw.buckets.data --yes-i-really-really-mean-it

 Now, when I list the buckets with s3cmd, I still see the buckets (s3cmd ls
shows a listing of them.)  When I try to delete one (s3cmd rb) I get this:

  ERROR: S3 error: 404 (NoSuchKey)

 I thought maybe the names were sticking around in
default.rgw.buckets.index, so I purged that too.  But no luck, the phantom
buckets are still there.

 My questions are:

 1) How can I take care of these phantom buckets?

 2) How can I purge / delete all data in the pool, without remnants such as
these?

 Thank you
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-deploy: recommended?

2018-04-04 Thread Robert Stanford
 I read a couple of versions ago that ceph-deploy was not recommended for
production clusters.  Why was that?  Is this still the case?  We have a lot
of problems automating deployment without ceph-deploy.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph performance falls as data accumulates

2018-04-02 Thread Robert Stanford
 This is a known issue as far as I can tell, I've read about it several
times.  Ceph performs great (using radosgw), but as the OSDs fill up
performance falls sharply.  I am down to about half of the empty-cluster
performance at roughly 50% disk usage.

 My questions are: does adding more OSDs / disks to the cluster restore
performance?  (Is it the absolute number of objects that degrades
performance, or % occupancy?)  And, will the performance level off at some
point, or will it continue to get lower and lower as our disks fill?

 Thank you
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] multiple radosgw daemons per host, and performance

2018-03-26 Thread Robert Stanford
 When I am running at full load my radosgw process uses 100% of one CPU
core (and has many threads).  I have many idle cores.  Is it common for
people to run several radosgw processes on their gateways, to take
advantage of all their cores?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] object lifecycle scope

2018-02-06 Thread Robert Stanford
 Hello Ceph users.  Is object lifecycle (currently expiration) for rgw
implementable on a per-object basis, or is the smallest scope the bucket?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] object lifecycle and updating from jewel

2018-01-02 Thread Robert Stanford
 I would like to use the new object lifecycle feature of kraken /
luminous.  I have jewel, with buckets that have lots and lots of objects.
It won't be practical to move them, then move them back after upgrading.

 In order to use the object lifecycle feature of radosgw in
kraken/luminous, do I need to have buckets configured for this, before
installing data?  In the scenario above, am I out of luck?  Or is object
lifecycle functionality available as soon as radosgw is upgraded?

 Thank you
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] radosgw package for kraken missing on ubuntu

2017-12-28 Thread Robert Stanford
 I am installing with ceph-deploy using the instructions at
http://docs.ceph.com/docs/master/install/get-packages/

 ceph-deploy runs fine for the first node until it dies due to not finding
radosgw package.  I have verified on that node (apt-cache search radosgw,
apt-get install radosgw) that this package is not available.

 Since I have followed the instructions on the page to make the Ceph
packages available, all others work fine.  Why isn't radosgw included?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] using more than one pool for radosgw

2017-12-13 Thread Robert Stanford
 I have an indexless pool that I plan to use with RGW, to store up to
billions of objects.  I have the impression that because it's indexless,
the performance won't taper off as the buckets and pools grow.

 Would there be any point / enhancement from dividing our users into more
than one pool, or will we have the same performance keeping everything in
our indexless default.radosgw pool?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RGW uploaded objects integrity

2017-12-07 Thread Robert Stanford
I did some benchmarking with cosbench and found that successful uploads (as
shown in the output report) was not 100% unless I used the "hashCheck=True"
flag in the cosbench configuration file.  Under high load, the percent
successful was significantly lower (say, 80%).

Has anyone dealt with object integrity issues at high throughputs, when
using RGW?  Any suggestions on ensuring what goes in to the cluster is the
same as what comes out?  Do we have to manually check the integrity of the
objects we upload, like cosbench does, every time?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd max write size and large objects

2017-11-16 Thread Robert Stanford
 Thanks.  Does this mean that I can send >90MB objects to my RGWs and they
will break it up into manageable (<=90MB) chunks before storing them?  Or
if I'm going to store objects > 90MB do I need to change this parameter?

 I don't know if we'll be able to use libradosstriper, but thanks for
bringing it to my attention.  We are using the S3 interface of RGW
exclusively (nothing custom in there).


On Thu, Nov 16, 2017 at 9:41 AM, Wido den Hollander <w...@42on.com> wrote:

>
> > Op 16 november 2017 om 16:32 schreef Robert Stanford <
> rstanford8...@gmail.com>:
> >
> >
> >  Once 'osd max write size' (90MB by default I believe) is exceeded, does
> > Ceph reject the object (which is coming in through RGW), or does it break
> > it up into smaller objects (of max 'osd  max write size' size)?  If it
> > breaks them up, does it read the fragments in parallel when they're
> > requested by RGW?
> >
>
> The OSD rejects the write. However, RGW already does striping over RADOS
> objects.
>
> Usually you don't need to worry about this setting unless you are talking
> to RADOS directly.
>
> However, 90MB objects are big and you should look into libradosstriper.
>
> Wido
>
> >  Thanks
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] osd max write size and large objects

2017-11-16 Thread Robert Stanford
 Once 'osd max write size' (90MB by default I believe) is exceeded, does
Ceph reject the object (which is coming in through RGW), or does it break
it up into smaller objects (of max 'osd  max write size' size)?  If it
breaks them up, does it read the fragments in parallel when they're
requested by RGW?

 Thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance, and how much wiggle room there is with tunables

2017-11-13 Thread Robert Stanford
ceph osd pool create scbench 100 100
rados bench -p scbench 10 write --no-cleanup
rados bench -p scbench 10 seq
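
(One knob worth trying before deeper tuning is the benchmark's own
concurrency (-t, default 16 in-flight ops), to check whether the client
rather than the cluster is the limit; a sketch:)

rados bench -p scbench 10 write -t 32 --no-cleanup
rados bench -p scbench 10 seq -t 32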


On Mon, Nov 13, 2017 at 1:28 AM, Rudi Ahlers <rudiahl...@gmail.com> wrote:

> Would you mind telling me what rados command set you use, and share the
> output? I would like to compare it to our server as well.
>
> On Fri, Nov 10, 2017 at 6:29 AM, Robert Stanford <rstanford8...@gmail.com>
> wrote:
>
>>
>>  In my cluster, rados bench shows about 1GB/s bandwidth.  I've done some
>> tuning:
>>
>> [osd]
>> osd op threads = 8
>> osd disk threads = 4
>> osd recovery max active = 7
>>
>>
>> I was hoping to get much better bandwidth.  My network can handle it, and
>> my disks are pretty fast as well.  Are there any major tunables I can play
>> with to increase what will be reported by "rados bench"?  Am I pretty much
>> stuck around the bandwidth it reported?
>>
>>  Thank you
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
> --
> Kind Regards
> Rudi Ahlers
> Website: http://www.rudiahlers.co.za
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance, and how much wiggle room there is with tunables

2017-11-10 Thread Robert Stanford
 But sorry, this was about "rados bench" which is run inside the Ceph
cluster.  So there's no network between the "client" and my cluster.

On Fri, Nov 10, 2017 at 10:35 AM, Robert Stanford <rstanford8...@gmail.com>
wrote:

>
>  Thank you for that excellent observation.  Are there any rumors / has
> anyone had experience with faster clusters, on faster networks?  I wonder
> how Ceph can get ("it depends"), of course, but I wonder about numbers
> people have seen.
>
> On Fri, Nov 10, 2017 at 10:31 AM, Denes Dolhay <de...@denkesys.com> wrote:
>
>> So you are using a 40 / 100 gbit connection all the way to your client?
>>
>> John's question is valid because 10 gbit = 1.25GB/s ... subtract some
>> ethernet, ip, tcp and protocol overhead take into account some additional
>> network factors and you are about there...
>>
>>
>> Denes
>>
>> On 11/10/2017 05:10 PM, Robert Stanford wrote:
>>
>>
>>  The bandwidth of the network is much higher than that.  The bandwidth I
>> mentioned came from "rados bench" output, under the "Bandwidth (MB/sec)"
>> row.  I see from comparing mine to others online that mine is pretty good
>> (relatively).  But I'd like to get much more than that.
>>
>> Does "rados bench" show a near maximum of what a cluster can do?  Or is
>> it possible that I can tune it to get more bandwidth?
>>
>>
>> On Fri, Nov 10, 2017 at 3:43 AM, John Spray <jsp...@redhat.com> wrote:
>>
>>> On Fri, Nov 10, 2017 at 4:29 AM, Robert Stanford
>>> <rstanford8...@gmail.com> wrote:
>>> >
>>> >  In my cluster, rados bench shows about 1GB/s bandwidth.  I've done
>>> some
>>> > tuning:
>>> >
>>> > [osd]
>>> > osd op threads = 8
>>> > osd disk threads = 4
>>> > osd recovery max active = 7
>>> >
>>> >
>>> > I was hoping to get much better bandwidth.  My network can handle it,
>>> and my
>>> > disks are pretty fast as well.  Are there any major tunables I can
>>> play with
>>> > to increase what will be reported by "rados bench"?  Am I pretty much
>>> stuck
>>> > around the bandwidth it reported?
>>>
>>> Are you sure your 1GB/s isn't just the NIC bandwidth limit of the
>>> client you're running rados bench from?
>>>
>>> John
>>>
>>> >
>>> >  Thank you
>>> >
>>> > ___
>>> > ceph-users mailing list
>>> > ceph-users@lists.ceph.com
>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >
>>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-us...@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance, and how much wiggle room there is with tunables

2017-11-10 Thread Robert Stanford
 Thank you for that excellent observation.  Are there any rumors, or has
anyone had experience with faster clusters on faster networks?  I wonder
how fast Ceph can get ("it depends", of course), but I'm curious about the
numbers people have seen.

On Fri, Nov 10, 2017 at 10:31 AM, Denes Dolhay <de...@denkesys.com> wrote:

> So you are using a 40 / 100 gbit connection all the way to your client?
>
> John's question is valid because 10 Gbit/s = 1.25 GB/s ... subtract some
> Ethernet, IP, TCP and protocol overhead, take into account some additional
> network factors, and you are about there...
>
>
> Denes
>
> On 11/10/2017 05:10 PM, Robert Stanford wrote:
>
>
>  The bandwidth of the network is much higher than that.  The bandwidth I
> mentioned came from "rados bench" output, under the "Bandwidth (MB/sec)"
> row.  I see from comparing mine to others online that mine is pretty good
> (relatively).  But I'd like to get much more than that.
>
> Does "rados bench" show a near maximum of what a cluster can do?  Or is it
> possible that I can tune it to get more bandwidth?
>
>
> On Fri, Nov 10, 2017 at 3:43 AM, John Spray <jsp...@redhat.com> wrote:
>
>> On Fri, Nov 10, 2017 at 4:29 AM, Robert Stanford
>> <rstanford8...@gmail.com> wrote:
>> >
>> >  In my cluster, rados bench shows about 1GB/s bandwidth.  I've done some
>> > tuning:
>> >
>> > [osd]
>> > osd op threads = 8
>> > osd disk threads = 4
>> > osd recovery max active = 7
>> >
>> >
>> > I was hoping to get much better bandwidth.  My network can handle it,
>> and my
>> > disks are pretty fast as well.  Are there any major tunables I can play
>> with
>> > to increase what will be reported by "rados bench"?  Am I pretty much
>> stuck
>> > around the bandwidth it reported?
>>
>> Are you sure your 1GB/s isn't just the NIC bandwidth limit of the
>> client you're running rados bench from?
>>
>> John
>>
>> >
>> >  Thank you
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>>
>
>
>
> ___
> ceph-users mailing list
> ceph-us...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance, and how much wiggle room there is with tunables

2017-11-10 Thread Robert Stanford
 The bandwidth of the network is much higher than that.  The bandwidth I
mentioned came from "rados bench" output, under the "Bandwidth (MB/sec)"
row.  I see from comparing mine to others online that mine is pretty good
(relatively).  But I'd like to get much more than that.

Does "rados bench" show a near maximum of what a cluster can do?  Or is it
possible that I can tune it to get more bandwidth?
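
 One way to tell whether that number is close to a hardware ceiling is to
baseline the pieces separately and compare.  A rough sketch, assuming iperf3
and fio are installed and that /mnt/scratch is a throwaway directory on a
disk comparable to the OSD drives (both names are placeholders):

# raw network throughput between two cluster nodes
iperf3 -s                      # on node B
iperf3 -c nodeB -P 4           # on node A, 4 parallel streams

# raw sequential write throughput of a single disk
fio --name=seqwrite --directory=/mnt/scratch --rw=write --bs=4M \
    --size=4G --direct=1 --ioengine=libaio --iodepth=16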


On Fri, Nov 10, 2017 at 3:43 AM, John Spray <jsp...@redhat.com> wrote:

> On Fri, Nov 10, 2017 at 4:29 AM, Robert Stanford
> <rstanford8...@gmail.com> wrote:
> >
> >  In my cluster, rados bench shows about 1GB/s bandwidth.  I've done some
> > tuning:
> >
> > [osd]
> > osd op threads = 8
> > osd disk threads = 4
> > osd recovery max active = 7
> >
> >
> > I was hoping to get much better bandwidth.  My network can handle it,
> and my
> > disks are pretty fast as well.  Are there any major tunables I can play
> with
> > to increase what will be reported by "rados bench"?  Am I pretty much
> stuck
> > around the bandwidth it reported?
>
> Are you sure your 1GB/s isn't just the NIC bandwidth limit of the
> client you're running rados bench from?
>
> John
>
> >
> >  Thank you
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Performance, and how much wiggle room there is with tunables

2017-11-09 Thread Robert Stanford
 In my cluster, rados bench shows about 1GB/s bandwidth.  I've done some
tuning:

[osd]
osd op threads = 8
osd disk threads = 4
osd recovery max active = 7


I was hoping to get much better bandwidth.  My network can handle it, and
my disks are pretty fast as well.  Are there any major tunables I can play
with to increase what will be reported by "rados bench"?  Am I pretty much
stuck around the bandwidth it reported?
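
 The settings above are all server-side; rados bench itself also takes
client-side options that can change the reported figure considerably.  A
sketch (the pool name "bench" and the numbers are placeholders, not
recommendations):

# larger objects (16 MiB) and more concurrent ops than the defaults (4 MiB / 16)
rados bench -p bench 60 write -b 16777216 -t 32 --no-cleanup
rados bench -p bench 60 seq -t 32
rados bench -p bench 60 rand -t 32

# running the same bench from several client hosts at once shows whether a
# single client is the bottleneck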

 Thank you
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Radosgw: object lifecycle (expiration) not working?

2017-09-11 Thread Robert Stanford
 Greetings -

 I have created several test buckets in radosgw, to test different
expiration durations:

 $ s3cmd mb s3://test2d

 I set a lifecycle for each of these buckets:

 $ s3cmd setlifecycle lifecycle2d.xml s3://test2d --signature-v2

 The files look like this:


<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Rule>
    <ID>test2d</ID>
    <Status>Enabled</Status>
    <Expiration>
      <Days>2</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>

 I tried different ways of applying the lifecycle with s3cmd, and the
approach I've used above is the only one that doesn't return a 'feature not
implemented' (or similar) error message.

 I've uploaded a file to each bucket, and waited.  Several days have
passed.  The files have *not* been removed from any of the buckets, though
several of the lifecycle passes have run:

ceph01# radosgw-admin lc list
[
    {
        "bucket": ":stestbucket_1d:bc186feb-9fd2-4035-aa91-9ec0772eefeb.23420.1",
        "status": "COMPLETE"
    },
    {
        "bucket": ":stestbucket_2d:bc186feb-9fd2-4035-aa91-9ec0772eefeb.23420.2",
        "status": "COMPLETE"
    },
    {
        "bucket": ":stestbucket_5d:bc186feb-9fd2-4035-aa91-9ec0772eefeb.23420.3",
        "status": "COMPLETE"
    },
    {
        "bucket": ":stestbucket_7d:bc186feb-9fd2-4035-aa91-9ec0772eefeb.23420.4",
        "status": "COMPLETE"
    },
    {
        "bucket": ":t_testbucket_1d:bc186feb-9fd2-4035-aa91-9ec0772eefeb.40904.1",
        "status": "COMPLETE"
    },
    {
        "bucket": ":t_testbucket_2d:bc186feb-9fd2-4035-aa91-9ec0772eefeb.40904.2",
        "status": "COMPLETE"
    },
    {
        "bucket": ":t_testbucket_5d:bc186feb-9fd2-4035-aa91-9ec0772eefeb.40904.3",
        "status": "COMPLETE"
    },
    {
        "bucket": ":t_testbucket_7d:bc186feb-9fd2-4035-aa91-9ec0772eefeb.40904.4",
        "status": "UNINITIAL"
    },
    {
        "bucket": ":testbucket:bc186feb-9fd2-4035-aa91-9ec0772eefeb.5492.2",
        "status": "UNINITIAL"
    }
]

  The UNINITIAL status entries haven't run yet (not enough time has passed,
in the case of 7d).
An s3cmd ls confirms that the files are still in all the buckets.
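
 Two things that may be worth checking (assuming a reasonably recent s3cmd
and radosgw; the commands below are a sketch, not verified against this
exact version): what lifecycle configuration radosgw actually stored, and
whether a lifecycle pass can be forced by hand instead of waiting for the
normal work window (rgw_lifecycle_work_time, which by default limits
processing to roughly 00:00-06:00):

 $ s3cmd getlifecycle s3://test2d --signature-v2

 ceph01# radosgw-admin lc process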

  Additional relevant information: this is my second test run.  The first
time, object expiration seemed to work... except that all objects were
deleted at the first run, despite having a different "Days" number in the
XML.  I thought this might be due to the "ID" in the XML being the same for
each, so for the second test run (which you see above) I changed the "ID"
field as well as "Days".

 Anyone know what's going on here?

 Regards and thanks

 RS
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com