Re: [ceph-users] FreeBSD port net/ceph-devel released

2017-04-04 Thread Willem Jan Withagen
On 4-4-2017 21:05, Gregory Farnum wrote:
> [ Sorry for the empty email there. :o ]
> 
> On Tue, Apr 4, 2017 at 12:28 PM, Patrick Donnelly  wrote:
>> On Sat, Apr 1, 2017 at 4:58 PM, Willem Jan Withagen  wrote:
>>> On 1-4-2017 21:59, Wido den Hollander wrote:

>>>>> On 31 March 2017 at 19:15, Willem Jan Withagen wrote:
>>>>>
>>>>>
>>>>> On 31-3-2017 17:32, Wido den Hollander wrote:
>>>>>> Hi Willem Jan,
>>>>>>
>>>>>>> On 30 March 2017 at 13:56, Willem Jan Withagen wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm pleased to announce that my efforts to port to FreeBSD have
>>>>>>> resulted in a ceph-devel port commit in the ports tree.
>>>>>>>
>>>>>>> https://www.freshports.org/net/ceph-devel/
>>>>>>>
>>>>>>
>>>>>> Awesome work! I don't touch FreeBSD that much, but I can imagine that
>>>>>> people want this.
>>>>>>
>>>>>> Out of curiosity, does this run on ZFS under FreeBSD? Or what
>>>>>> Filesystem would you use behind FileStore with this? Or does
>>>>>> BlueStore work?
>>>>>
>>>>> Since I'm a huge ZFS fan, that is what I run it on.
>>>>
>>>> Cool! The ZIL, ARC and L2ARC can actually make that very fast. Interesting!
>>>
>>> Right, ZIL is magic, and more or less equal to the journal now used
>>> with OSDs, for exactly the same reason. Sad thing is that a write is
>>> now journaled 3 times: once by Ceph and twice by ZFS. Which means that
>>> the bandwidth used on the SSDs is double what it could be.
>>>
>>> Had some discussion about this, but disabling the Ceph journal is not
>>> just setting an option. Although I would like to test performance of an
>>> OSD with just the ZFS journal. But I expect that the OSD journal is
>>> rather firmly integrated.
>>
>> Disabling the OSD journal will never be viable. The journal is also
>> necessary for transactions and batch updates which cannot be done
>> atomically in FileStore.
> 
> To expand on Patrick's statement: You shouldn't get confused by the
> presence of options to disable journaling. They exist but only work on
> btrfs-backed FileStores and are *not* performant. You could do the
> same on zfs, but in order to provide the guarantees of the RADOS
> protocol, when in that mode the OSD just holds replies on all
> operations until it knows they've been persisted to disk and
> snapshotted, then sends back a commit. You can probably imagine the
> horrible IO patterns and bursty application throughput that result.

When I talked about this with Sage at CERN, I got the same answer. So
this is at least consistent. ;-)

And I have to admit that I do not understand the intricate details of
this part of Ceph. So at the moment I'm looking at it from a more global
view.

What, I guess, needs to be done is to get rid of at least one of the
SSD writes.
That is possible by mounting the journal disk as a separate VDEV (2
SSDs in mirror) and getting the max speed out of this.
The problem with all this is that the number of SSDs sort of blows up, and
very likely there is a lot of waste because the journals need not be
very large.

And yes, the other way would be to do BlueStore on ZVOL, where the
underlying VDEVs are carefully crafted. But first we need to get AIO
working. And I have not (yet) looked at that at all...

The first objective was to get a port of any sort, which I did last week.
The second is to take Luminous and make a "stable" port which is less of a
moving target.
Only then is AIO on the radar.

--WjW

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] FreeBSD port net/ceph-devel released

2017-04-04 Thread Gregory Farnum
[ Sorry for the empty email there. :o ]

On Tue, Apr 4, 2017 at 12:28 PM, Patrick Donnelly  wrote:
> On Sat, Apr 1, 2017 at 4:58 PM, Willem Jan Withagen  wrote:
>> On 1-4-2017 21:59, Wido den Hollander wrote:
>>>
>>>> On 31 March 2017 at 19:15, Willem Jan Withagen wrote:
>>>>
>>>>
>>>> On 31-3-2017 17:32, Wido den Hollander wrote:
>>>>> Hi Willem Jan,
>>>>>
>>>>>> On 30 March 2017 at 13:56, Willem Jan Withagen wrote:
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm pleased to announce that my efforts to port to FreeBSD have
>>>>>> resulted in a ceph-devel port commit in the ports tree.
>>>>>>
>>>>>> https://www.freshports.org/net/ceph-devel/
>>>>>>
>>>>>
>>>>> Awesome work! I don't touch FreeBSD that much, but I can imagine that
>>>>> people want this.
>>>>>
>>>>> Out of curiosity, does this run on ZFS under FreeBSD? Or what
>>>>> Filesystem would you use behind FileStore with this? Or does
>>>>> BlueStore work?
>>>>
>>>> Since I'm a huge ZFS fan, that is what I run it on.
>>>
>>> Cool! The ZIL, ARC and L2ARC can actually make that very fast. Interesting!
>>
>> Right, ZIL is magic, and more or less equal to the journal now used
>> with OSDs, for exactly the same reason. Sad thing is that a write is
>> now journaled 3 times: once by Ceph and twice by ZFS. Which means that
>> the bandwidth used on the SSDs is double what it could be.
>>
>> Had some discussion about this, but disabling the Ceph journal is not
>> just setting an option. Although I would like to test performance of an
>> OSD with just the ZFS journal. But I expect that the OSD journal is
>> rather firmly integrated.
>
> Disabling the OSD journal will never be viable. The journal is also
> necessary for transactions and batch updates which cannot be done
> atomically in FileStore.

To expand on Patrick's statement: You shouldn't get confused by the
presence of options to disable journaling. They exist but only work on
btrfs-backed FileStores and are *not* performant. You could do the
same on zfs, but in order to provide the guarantees of the RADOS
protocol, when in that mode the OSD just holds replies on all
operations until it knows they've been persisted to disk and
snapshotted, then sends back a commit. You can probably imagine the
horrible IO patterns and bursty application throughput that result.
o_0
-Greg
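
A toy sketch of the behaviour described above, to make the burstiness
concrete. It is not Ceph code: the class ToyJournallessOSD, the sync_every
parameter and the tick-based clock are all assumptions made up for
illustration. Writes are accepted immediately, but their replies are held
until the next simulated sync-and-snapshot, when they are all released
together.

```python
from dataclasses import dataclass, field


@dataclass
class ToyJournallessOSD:
    """Toy model: replies are withheld until the next sync-and-snapshot."""

    sync_every: int  # hypothetical: ticks between sync-and-snapshot points
    pending: list = field(default_factory=list)  # ops waiting for a sync
    acked: list = field(default_factory=list)    # (op, tick it was acked)

    def submit(self, op):
        # The write is accepted, but no commit reply is sent yet.
        self.pending.append(op)

    def tick(self, now):
        # Only at a sync point is everything known to be persisted and
        # snapshotted, so all held replies go out in a single burst.
        if now % self.sync_every == 0 and self.pending:
            self.acked.extend((op, now) for op in self.pending)
            self.pending.clear()


def simulate(num_ops, sync_every):
    """Submit one write per tick and return the OSD for inspection."""
    osd = ToyJournallessOSD(sync_every=sync_every)
    for now in range(1, num_ops + 1):
        osd.submit(f"write-{now}")
        osd.tick(now)
    return osd
```

With sync_every=3, six writes submitted one per tick are acknowledged in
two bursts, at ticks 3 and 6, and nothing is acknowledged in between. That
is the bursty application throughput the message refers to.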


Re: [ceph-users] FreeBSD port net/ceph-devel released

2017-04-04 Thread Gregory Farnum
On Tue, Apr 4, 2017 at 12:28 PM, Patrick Donnelly  wrote:
> On Sat, Apr 1, 2017 at 4:58 PM, Willem Jan Withagen  wrote:
>> On 1-4-2017 21:59, Wido den Hollander wrote:
>>>
>>>> On 31 March 2017 at 19:15, Willem Jan Withagen wrote:
>>>>
>>>>
>>>> On 31-3-2017 17:32, Wido den Hollander wrote:
>>>>> Hi Willem Jan,
>>>>>
>>>>>> On 30 March 2017 at 13:56, Willem Jan Withagen wrote:
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm pleased to announce that my efforts to port to FreeBSD have
>>>>>> resulted in a ceph-devel port commit in the ports tree.
>>>>>>
>>>>>> https://www.freshports.org/net/ceph-devel/
>>>>>>
>>>>>
>>>>> Awesome work! I don't touch FreeBSD that much, but I can imagine that
>>>>> people want this.
>>>>>
>>>>> Out of curiosity, does this run on ZFS under FreeBSD? Or what
>>>>> Filesystem would you use behind FileStore with this? Or does
>>>>> BlueStore work?
>>>>
>>>> Since I'm a huge ZFS fan, that is what I run it on.
>>>
>>> Cool! The ZIL, ARC and L2ARC can actually make that very fast. Interesting!
>>
>> Right, ZIL is magic, and more or less equal to the journal now used
>> with OSDs, for exactly the same reason. Sad thing is that a write is
>> now journaled 3 times: once by Ceph and twice by ZFS. Which means that
>> the bandwidth used on the SSDs is double what it could be.
>>
>> Had some discussion about this, but disabling the Ceph journal is not
>> just setting an option. Although I would like to test performance of an
>> OSD with just the ZFS journal. But I expect that the OSD journal is
>> rather firmly integrated.
>
> Disabling the OSD journal will never be viable. The journal is also
> necessary for transactions and batch updates which cannot be done
> atomically in FileStore.
>
> This is great work Willem. I'm especially looking forward to seeing
> BlueStore performance on a ZVol.
>
> --
> Patrick Donnelly


Re: [ceph-users] FreeBSD port net/ceph-devel released

2017-04-04 Thread Patrick Donnelly
On Sat, Apr 1, 2017 at 4:58 PM, Willem Jan Withagen  wrote:
> On 1-4-2017 21:59, Wido den Hollander wrote:
>>
>>> On 31 March 2017 at 19:15, Willem Jan Withagen wrote:
>>>
>>>
>>> On 31-3-2017 17:32, Wido den Hollander wrote:
>>>> Hi Willem Jan,
>>>>
>>>>> On 30 March 2017 at 13:56, Willem Jan Withagen wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I'm pleased to announce that my efforts to port to FreeBSD have
>>>>> resulted in a ceph-devel port commit in the ports tree.
>>>>>
>>>>> https://www.freshports.org/net/ceph-devel/
>>>>>
>>>>
>>>> Awesome work! I don't touch FreeBSD that much, but I can imagine that
>>>> people want this.
>>>>
>>>> Out of curiosity, does this run on ZFS under FreeBSD? Or what
>>>> Filesystem would you use behind FileStore with this? Or does
>>>> BlueStore work?
>>>
>>> Since I'm a huge ZFS fan, that is what I run it on.
>>
>> Cool! The ZIL, ARC and L2ARC can actually make that very fast. Interesting!
>
> Right, ZIL is magic, and more or less equal to the journal now used
> with OSDs, for exactly the same reason. Sad thing is that a write is
> now journaled 3 times: once by Ceph and twice by ZFS. Which means that
> the bandwidth used on the SSDs is double what it could be.
>
> Had some discussion about this, but disabling the Ceph journal is not
> just setting an option. Although I would like to test performance of an
> OSD with just the ZFS journal. But I expect that the OSD journal is
> rather firmly integrated.

Disabling the OSD journal will never be viable. The journal is also
necessary for transactions and batch updates which cannot be done
atomically in FileStore.

This is great work Willem. I'm especially looking forward to seeing
BlueStore performance on a ZVol.

-- 
Patrick Donnelly


Re: [ceph-users] FreeBSD port net/ceph-devel released

2017-04-01 Thread Willem Jan Withagen
On 1-4-2017 21:59, Wido den Hollander wrote:
> 
>> On 31 March 2017 at 19:15, Willem Jan Withagen wrote:
>>
>>
>> On 31-3-2017 17:32, Wido den Hollander wrote:
>>> Hi Willem Jan,
>>>
>>>> On 30 March 2017 at 13:56, Willem Jan Withagen wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> I'm pleased to announce that my efforts to port to FreeBSD have
>>>> resulted in a ceph-devel port commit in the ports tree.
>>>>
>>>> https://www.freshports.org/net/ceph-devel/
>>>>
>>>
>>> Awesome work! I don't touch FreeBSD that much, but I can imagine that
>>> people want this.
>>>
>>> Out of curiosity, does this run on ZFS under FreeBSD? Or what
>>> Filesystem would you use behind FileStore with this? Or does
>>> BlueStore work?
>>
>> Since I'm a huge ZFS fan, that is what I run it on.
> 
> Cool! The ZIL, ARC and L2ARC can actually make that very fast. Interesting!

Right, ZIL is magic, and more or less equal to the journal now used with
OSDs, for exactly the same reason. Sad thing is that a write is now
journaled 3 times: once by Ceph and twice by ZFS. Which means that the
bandwidth used on the SSDs is double what it could be.

Had some discussion about this, but disabling the Ceph journal is not
just setting an option. Although I would like to test performance of an
OSD with just the ZFS journal. But I expect that the OSD journal is
rather firmly integrated.

Now the really nice thing is that one does not need to worry about
caching for OSD performance. This is fully covered by ZFS, both by ARC
and L2ARC. And ZIL and L2ARC can again be constructed in all the shapes
and forms in which ZFS vdevs can be made.
So for the ZIL you'd build an SSD mirror: double the write speed, but
still redundant. For L2ARC I'd concatenate 2 SSDs to get the read
bandwidth. And contrary to some of the other caches, ZFS does not return
errors if the L2ARC devices go down (note that data errors are detected
by checksumming). So that again is one less thing to worry about.
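
As a rough illustration of the layout described above, the sketch below
only builds the zpool command lines and does not run anything. The pool
name "osdpool" and the device names ada1 to ada4 are hypothetical, and the
two helper functions are made up for this example. The "log mirror" and
"cache" vdev keywords are the standard zpool syntax for a mirrored ZIL
device and for L2ARC devices.

```python
import shlex


def zpool_add_log_mirror(pool, devices):
    """Build the argv for adding a mirrored log (ZIL) vdev to a pool."""
    if len(devices) < 2:
        raise ValueError("a mirror needs at least two devices")
    return ["zpool", "add", pool, "log", "mirror", *devices]


def zpool_add_cache(pool, devices):
    """Build the argv for adding cache (L2ARC) devices to a pool."""
    if not devices:
        raise ValueError("at least one cache device is required")
    # Cache devices are listed without a mirror keyword.
    return ["zpool", "add", pool, "cache", *devices]


if __name__ == "__main__":
    # Hypothetical pool and device names; the commands are only printed.
    print(shlex.join(zpool_add_log_mirror("osdpool", ["ada1", "ada2"])))
    print(shlex.join(zpool_add_cache("osdpool", ["ada3", "ada4"])))
```

Returning argument lists rather than running the commands keeps the sketch
safe to try on a machine that has no spare SSDs.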

> CRC and Compression from ZFS are also very nice.

I did not want to go into too much detail, but this is a large part of
the reason. I tried compression a bit, but it costs quite a bit of
performance at the Ceph end. Perhaps because the write to the journal is
synced, and thus has to wait on both compression and the synced write.

It also brings snapshots without much hassle. But I have not yet figured
out (or looked at) if and how btrfs snapshots are used.

The other challenge is Ceph deep scrubbing: checking for corruption
within files. ZFS is able to detect corruption all by itself due to
extensive file checksumming, and with something much stronger/better
than crc32. (just put on my fireproof suit)
So I'm not certain that deep-scrub would be obsolete, but I think its
frequency could perhaps go down, and/or it could be triggered by ZFS
errors after scrubbing a pool. Something that has much less impact
on performance.
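
The idea of letting ZFS errors trigger a deep-scrub could look roughly
like the sketch below. This is a hypothetical policy, not something Ceph
or the port implements. It assumes that `zpool status -x` prints
"all pools are healthy" when nothing is wrong, and the 28-day fallback
interval is an arbitrary number picked for the example.

```python
HEALTHY_MESSAGE = "all pools are healthy"


def zfs_reports_problems(status_x_output):
    """Return True if `zpool status -x` output says anything but healthy."""
    return status_x_output.strip() != HEALTHY_MESSAGE


def deep_scrub_wanted(status_x_output, days_since_last, max_days=28):
    """Hypothetical policy: deep-scrub on ZFS trouble, or when overdue.

    max_days is an assumed fallback so that deep-scrub still runs
    occasionally even if ZFS never reports an error.
    """
    return zfs_reports_problems(status_x_output) or days_since_last >= max_days
```

A healthy pool that was deep-scrubbed 3 days ago would be skipped, while
any pool for which ZFS reports a problem would be deep-scrubbed right away.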

In some of the talks I give, I always try to explain to people that RAID
and RAID controllers are the current dinosaurs of IT.

>> To be honest I have not tested on UFS, but I would expect that the xattr
>> are not long enough.
>>
>> BlueStore is not (yet) available because there is a different AIO
>> implementation on FreeBSD. But Sage thinks it is very doable to glue in
>> posix AIO. And one of my port reviewers has offered to look at it. So it
>> could be that BlueStore will be available in the foreseeable future.
>>
>> --WjW



Re: [ceph-users] FreeBSD port net/ceph-devel released

2017-04-01 Thread Wido den Hollander

> On 31 March 2017 at 19:15, Willem Jan Withagen wrote:
> 
> 
> On 31-3-2017 17:32, Wido den Hollander wrote:
> > Hi Willem Jan,
> > 
> >> On 30 March 2017 at 13:56, Willem Jan Withagen wrote:
> >> 
> >> 
> >> Hi,
> >> 
> >> I'm pleased to announce that my efforts to port to FreeBSD have
> >> resulted in a ceph-devel port commit in the ports tree.
> >> 
> >> https://www.freshports.org/net/ceph-devel/
> >> 
> > 
> > Awesome work! I don't touch FreeBSD that much, but I can imagine that
> > people want this.
> > 
> > Out of curiosity, does this run on ZFS under FreeBSD? Or what
> > Filesystem would you use behind FileStore with this? Or does
> > BlueStore work?
> 
> Since I'm a huge ZFS fan, that is what I run it on.

Cool! The ZIL, ARC and L2ARC can actually make that very fast. Interesting!

CRC and Compression from ZFS are also very nice.

> To be honest I have not tested on UFS, but I would expect that the xattr
> are not long enough.
> 
> BlueStore is not (yet) available because there is a different AIO
> implementation on FreeBSD. But Sage thinks it is very doable to glue in
> posix AIO. And one of my port reviewers has offered to look at it. So it
> could be that BlueStore will be available in the foreseeable future.
>
> --WjW


Re: [ceph-users] FreeBSD port net/ceph-devel released

2017-03-31 Thread Willem Jan Withagen
On 31-3-2017 17:32, Wido den Hollander wrote:
> Hi Willem Jan,
> 
>> On 30 March 2017 at 13:56, Willem Jan Withagen wrote:
>> 
>> 
>> Hi,
>> 
>> I'm pleased to announce that my efforts to port to FreeBSD have
>> resulted in a ceph-devel port commit in the ports tree.
>> 
>> https://www.freshports.org/net/ceph-devel/
>> 
> 
> Awesome work! I don't touch FreeBSD that much, but I can imagine that
> people want this.
> 
> Out of curiosity, does this run on ZFS under FreeBSD? Or what
> Filesystem would you use behind FileStore with this? Or does
> BlueStore work?

Since I'm a huge ZFS fan, that is what I run it on.
To be honest I have not tested on UFS, but I would expect that the xattrs
are not long enough.

BlueStore is not (yet) available because there is a different AIO
implementation on FreeBSD. But Sage thinks it is very doable to glue in
POSIX AIO. And one of my port reviewers has offered to look at it. So it
could be that BlueStore will be available in the foreseeable future.

--WjW


Re: [ceph-users] FreeBSD port net/ceph-devel released

2017-03-31 Thread Wido den Hollander
Hi Willem Jan,

> On 30 March 2017 at 13:56, Willem Jan Withagen wrote:
> 
> 
> Hi,
> 
> I'm pleased to announce that my efforts to port to FreeBSD have resulted
> in a ceph-devel port commit in the ports tree.
> 
> https://www.freshports.org/net/ceph-devel/
> 

Awesome work! I don't touch FreeBSD that much, but I can imagine that people 
want this.

Out of curiosity, does this run on ZFS under FreeBSD? Or what Filesystem would 
you use behind FileStore with this? Or does BlueStore work?

Wido

> I'd like to thank everybody that helped me by answering my questions,
> fixing my mistakes, undoing my Git mess. Especially Sage, Kefu and
> Haomei gave a lot of support.
> 
> The next release step will be to release a net/ceph port when the
> 'Luminous' version officially goes into release.
> 
> In the meantime I'll be updating the ceph-devel port to a more current
> state of affairs
> 
> Thanx,
> --WjW


Re: [ceph-users] FreeBSD port net/ceph-devel released

2017-03-30 Thread kefu chai
On Thu, Mar 30, 2017 at 7:56 PM, Willem Jan Withagen  wrote:
> Hi,
>
> I'm pleased to announce that my efforts to port to FreeBSD have resulted
> in a ceph-devel port commit in the ports tree.
>
> https://www.freshports.org/net/ceph-devel/
>
> I'd like to thank everybody that helped me by answering my questions,
> fixing my mistakes, undoing my Git mess. Especially Sage, Kefu and
> Haomei gave a lot of support.
>
> The next release step will be to release a net/ceph port when the
> 'Luminous' version officially goes into release.
>
> In the meantime I'll be updating the ceph-devel port to a more current
> state of affairs

Great job, Willem!


-- 
Regards
Kefu Chai


[ceph-users] FreeBSD port net/ceph-devel released

2017-03-30 Thread Willem Jan Withagen
Hi,

I'm pleased to announce that my efforts to port to FreeBSD have resulted
in a ceph-devel port commit in the ports tree.

https://www.freshports.org/net/ceph-devel/

I'd like to thank everybody that helped me by answering my questions,
fixing my mistakes, undoing my Git mess. Especially Sage, Kefu and
Haomei gave a lot of support.

The next release step will be to release a net/ceph port when the
'Luminous' version officially goes into release.

In the meantime I'll be updating the ceph-devel port to a more current
state of affairs.

Thanx,
--WjW