Re: [lustre-discuss] Quick ZFS pool question?

2016-10-17 Thread Riccardo Veraldi
I do not always have big files; I also have small files on Lustre, so I
found that in my scenario the default 128K recordsize fits my needs better.
In real life I do not expect much direct I/O, but before putting it
into production I was testing it, and the direct I/O performance was far
lower than on other similar Lustre partitions with ldiskfs.
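For reference, recordsize is a per-dataset ZFS property; a minimal sketch of
checking and changing it, with made-up pool/dataset names (it only affects
newly written files):

  # ZFS default recordsize is 128K
  zfs get recordsize ostpool/ost0
  # large streaming-file workloads often raise it
  # (values above 128K need the large_blocks pool feature enabled)
  zfs set recordsize=1M ostpool/ost0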



On 17/10/16 08:59, PGabriele wrote:
you can have a better understanding of the gap from this presentation: 
ZFS metadata performance improvements 



On 14 October 2016 at 08:42, Dilger, Andreas wrote:

On Oct 13, 2016 19:02, Riccardo Veraldi <riccardo.vera...@cnaf.infn.it> wrote:
>
> Hello,
> will the lustre 2.9.0 rpm be released on the Intel site ?
> Also the latest rpm for zfsonlinux  available is 0.6.5.8

The Lustre 2.9.0 packages will be released, when the release is
complete.
You are welcome to test the pre-release version from Git, if you are
interested.

You are also correct that the ZoL 0.7.0 release is not yet available.
There are still improvements when using ZoL 0.6.5.8, but some of these
patches only made it into 0.7.0.

Cheers, Andreas

> On 13/10/16 11:16, Dilger, Andreas wrote:
>> On Oct 13, 2016, at 10:32, E.S. Rosenberg <esr+lus...@mail.hebrew.edu> wrote:
>>> On Fri, Oct 7, 2016 at 9:16 AM, Xiong, Jinshan <jinshan.xi...@intel.com> wrote:
>>>
> On Oct 6, 2016, at 2:04 AM, Phill Harvey-Smith <p.harvey-sm...@warwick.ac.uk> wrote:
>
> Having tested a simple setup for lustre / zfs, I'd like to try and
> replicate on the test system what we currently have on the production
> system, which uses a much older version of lustre (2.0 IIRC).
>
> Currently we have a combined mgs / mds node and a single oss node.
> we have 3 filesystems : home, storage and scratch.
>
> The MGS/MDS node currently has the mgs on a separate block device and
> the 3 mds on a combined lvm volume.
>
> The OSS has an ost each (on separate disks) for scratch and home
> and two ost for storage.
>
> If we migrate this setup to a ZFS based one, will I need to create a
> separate zpool for each mdt / mgt / oss  or will I be able to create
> a single zpool and split it up between the individual mdt / oss blocks,
> if so how do I tell each filesystem how big it should be?
 We strongly recommend to create separate ZFS pools for OSTs,
otherwise grant, which is a Lustre internal space reserve
algorithm, won’t work properly.

 It’s possible to create a single zpool for MDTs and MGS, and
you can use ‘zfs set reservation= ’ to reserve
spaces for different targets.
>>> I thought ZFS was only recommended for OSTs and not for MDTs/MGS?
>> The MGT/MDT can definitely be on ZFS.  The performance of ZFS has been
>> trailing behind that of ldiskfs, but we've made significant performance
>> improvements with Lustre 2.9 and ZFS 0.7.0.  Many people use ZFS for the
>> MDT backend because of the checksums and integrated JBOD management, as
>> well as the ability to create snapshots, data compression, etc.
>>
>> Cheers, Andreas
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org




--
www: http://paciucci.blogspot.com



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Quick ZFS pool question?

2016-10-17 Thread PGabriele
you can have a better understanding of the gap from this presentation: ZFS
metadata performance improvements


On 14 October 2016 at 08:42, Dilger, Andreas 
wrote:

> On Oct 13, 2016 19:02, Riccardo Veraldi 
> wrote:
> >
> > Hello,
> > will the lustre 2.9.0 rpm be released on the Intel site ?
> > Also the latest rpm for zfsonlinux  available is 0.6.5.8
>
> The Lustre 2.9.0 packages will be released, when the release is complete.
> You are welcome to test the pre-release version from Git, if you are
> interested.
>
> You are also correct that the ZoL 0.7.0 release is not yet available.
> There are still improvements when using ZoL 0.6.5.8, but some of these
> patches only made it into 0.7.0.
>
> Cheers, Andreas
>
> > On 13/10/16 11:16, Dilger, Andreas wrote:
> >> On Oct 13, 2016, at 10:32, E.S. Rosenberg 
> wrote:
> >>> On Fri, Oct 7, 2016 at 9:16 AM, Xiong, Jinshan <
> jinshan.xi...@intel.com> wrote:
> >>>
> > On Oct 6, 2016, at 2:04 AM, Phill Harvey-Smith <
> p.harvey-sm...@warwick.ac.uk> wrote:
> >
> > Having tested a simple setup for lustre / zfs, I'd like to try and
> > replicate on the test system what we currently have on the production
> > system, which uses a much older version of lustre (2.0 IIRC).
> >
> > Currently we have a combined mgs / mds node and a single oss node.
> > we have 3 filesystems : home, storage and scratch.
> >
> > The MGS/MDS node currently has the mgs on a separate block device and
> > the 3 mds on a combined lvm volume.
> >
> > The OSS has an ost each (on separate disks) for scratch and home
> > and two ost for storage.
> >
> > If we migrate this setup to a ZFS based one, will I need to create a
> > separate zpool for each mdt / mgt / oss  or will I be able to create
> > a single zpool and split it up between the individual mdt / oss
> blocks,
> > if so how do I tell each filesystem how big it should be?
>  We strongly recommend to create separate ZFS pools for OSTs,
> otherwise grant, which is a Lustre internal space reserve algorithm, won’t
> work properly.
> 
>  It’s possible to create a single zpool for MDTs and MGS, and you can
> use ‘zfs set reservation= ’ to reserve spaces for different
> targets.
> >>> I thought ZFS was only recommended for OSTs and not for MDTs/MGS?
> >> The MGT/MDT can definitely be on ZFS.  The performance of ZFS has been
> >> trailing behind that of ldiskfs, but we've made significant performance
> >> improvements with Lustre 2.9 and ZFS 0.7.0.  Many people use ZFS for the
> >> MDT backend because of the checksums and integrated JBOD management, as
> >> well as the ability to create snapshots, data compression, etc.
> >>
> >> Cheers, Andreas
> >>
> >> ___
> >> lustre-discuss mailing list
> >> lustre-discuss@lists.lustre.org
> >> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> >>
> >
> > ___
> > lustre-discuss mailing list
> > lustre-discuss@lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>



-- 
www: http://paciucci.blogspot.com
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Quick ZFS pool question?

2016-10-14 Thread Dilger, Andreas
On Oct 13, 2016 19:02, Riccardo Veraldi  wrote:
> 
> Hello,
> will the lustre 2.9.0 rpm be released on the Intel site ?
> Also the latest rpm for zfsonlinux  available is 0.6.5.8

The Lustre 2.9.0 packages will be released when the release is complete.
You are welcome to test the pre-release version from Git, if you are 
interested.
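For anyone who wants to try it, a rough sketch of building the pre-release from
source follows; the repository URL, branch name, and options are assumptions
for illustration, not taken from this thread:

  git clone git://git.whamcloud.com/fs/lustre-release.git
  cd lustre-release
  git checkout b2_9            # assumed pre-release branch name
  sh autogen.sh
  ./configure --with-zfs       # assumes the ZoL/SPL development packages are installed
  make rpms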

You are also correct that the ZoL 0.7.0 release is not yet available.
There are still improvements when using ZoL 0.6.5.8, but some of these
patches only made it into 0.7.0.

Cheers, Andreas

> On 13/10/16 11:16, Dilger, Andreas wrote:
>> On Oct 13, 2016, at 10:32, E.S. Rosenberg  wrote:
>>> On Fri, Oct 7, 2016 at 9:16 AM, Xiong, Jinshan  
>>> wrote:
>>> 
> On Oct 6, 2016, at 2:04 AM, Phill Harvey-Smith 
>  wrote:
> 
> Having tested a simple setup for lustre / zfs, I'd like to try and
> replicate on the test system what we currently have on the production
> system, which uses a much older version of lustre (2.0 IIRC).
> 
> Currently we have a combined mgs / mds node and a single oss node.
> we have 3 filesystems : home, storage and scratch.
> 
> The MGS/MDS node currently has the mgs on a separate block device and
> the 3 mds on a combined lvm volume.
>
> The OSS has an ost each (on separate disks) for scratch and home
> and two ost for storage.
> 
> If we migrate this setup to a ZFS based one, will I need to create a
> separate zpool for each mdt / mgt / oss  or will I be able to create
> a single zpool and split it up between the individual mdt / oss blocks,
> if so how do I tell each filesystem how big it should be?
 We strongly recommend to create separate ZFS pools for OSTs, otherwise 
 grant, which is a Lustre internal space reserve algorithm, won’t work 
 properly.
 
 It’s possible to create a single zpool for MDTs and MGS, and you can use 
 ‘zfs set reservation= ’ to reserve spaces for different 
 targets.
>>> I thought ZFS was only recommended for OSTs and not for MDTs/MGS?
>> The MGT/MDT can definitely be on ZFS.  The performance of ZFS has been
>> trailing behind that of ldiskfs, but we've made significant performance
>> improvements with Lustre 2.9 and ZFS 0.7.0.  Many people use ZFS for the
>> MDT backend because of the checksums and integrated JBOD management, as
>> well as the ability to create snapshots, data compression, etc.
>> 
>> Cheers, Andreas
>> 
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>> 
> 
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Quick ZFS pool question?

2016-10-13 Thread Riccardo Veraldi

Hello,
will the lustre 2.9.0 rpm be released on the Intel site ?
Also the latest rpm for zfsonlinux  available is 0.6.5.8
thank you

Riccardo


On 13/10/16 11:16, Dilger, Andreas wrote:

On Oct 13, 2016, at 10:32, E.S. Rosenberg  wrote:

On Fri, Oct 7, 2016 at 9:16 AM, Xiong, Jinshan  wrote:


On Oct 6, 2016, at 2:04 AM, Phill Harvey-Smith  
wrote:

Having tested a simple setup for lustre / zfs, I'd like to try and
replicate on the test system what we currently have on the production
system, which uses a much older version of lustre (2.0 IIRC).

Currently we have a combined mgs / mds node and a single oss node.
we have 3 filesystems : home, storage and scratch.

The MGS/MDS node currently has the mgs on a separate block device and
the 3 mds on a combined lvm volume.

The OSS has an ost each (on separate disks) for scratch and home
and two ost for storage.

If we migrate this setup to a ZFS based one, will I need to create a
separate zpool for each mdt / mgt / oss  or will I be able to create
a single zpool and split it up between the individual mdt / oss blocks,
if so how do I tell each filesystem how big it should be?

We strongly recommend to create separate ZFS pools for OSTs, otherwise grant, 
which is a Lustre internal space reserve algorithm, won’t work properly.

It’s possible to create a single zpool for MDTs and MGS, and you can use ‘zfs set 
reservation= ’ to reserve spaces for different targets.

I thought ZFS was only recommended for OSTs and not for MDTs/MGS?

The MGT/MDT can definitely be on ZFS.  The performance of ZFS has been
trailing behind that of ldiskfs, but we've made significant performance
improvements with Lustre 2.9 and ZFS 0.7.0.  Many people use ZFS for the
MDT backend because of the checksums and integrated JBOD management, as
well as the ability to create snapshots, data compression, etc.

Cheers, Andreas

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Quick ZFS pool question?

2016-10-13 Thread Dilger, Andreas
On Oct 13, 2016, at 10:32, E.S. Rosenberg  wrote:
> 
> On Fri, Oct 7, 2016 at 9:16 AM, Xiong, Jinshan  
> wrote:
> 
>> > On Oct 6, 2016, at 2:04 AM, Phill Harvey-Smith 
>> >  wrote:
>> >
>> > Having tested a simple setup for lustre / zfs, I'd like to try and
>> > replicate on the test system what we currently have on the production
>> > system, which uses a much older version of lustre (2.0 IIRC).
>> >
>> > Currently we have a combined mgs / mds node and a single oss node.
>> > we have 3 filesystems : home, storage and scratch.
>> >
>> > The MGS/MDS node currently has the mgs on a separate block device and
>> > the 3 mds on a combined lvm volume.
>> >
>> > The OSS has an ost each (on separate disks) for scratch and home
>> > and two ost for storage.
>> >
>> > If we migrate this setup to a ZFS based one, will I need to create a
>> > separate zpool for each mdt / mgt / oss  or will I be able to create
>> > a single zpool and split it up between the individual mdt / oss blocks,
>> > if so how do I tell each filesystem how big it should be?
>> 
>> We strongly recommend to create separate ZFS pools for OSTs, otherwise 
>> grant, which is a Lustre internal space reserve algorithm, won’t work 
>> properly.
>> 
>> It’s possible to create a single zpool for MDTs and MGS, and you can use 
>> ‘zfs set reservation= ’ to reserve spaces for different 
>> targets.
> 
> I thought ZFS was only recommended for OSTs and not for MDTs/MGS?

The MGT/MDT can definitely be on ZFS.  The performance of ZFS has been
trailing behind that of ldiskfs, but we've made significant performance
improvements with Lustre 2.9 and ZFS 0.7.0.  Many people use ZFS for the
MDT backend because of the checksums and integrated JBOD management, as
well as the ability to create snapshots, data compression, etc.
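For illustration, the checksums and compression mentioned above are ordinary
per-dataset ZFS properties; a minimal sketch with a made-up dataset name:

  # checksums are on by default; compression is opt-in
  zfs get checksum,compression mdtpool/mdt0
  zfs set compression=lz4 mdtpool/mdt0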

Cheers, Andreas

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Quick ZFS pool question?

2016-10-13 Thread Michael Di Domenico
On Thu, Oct 13, 2016 at 1:54 PM, Mohr Jr, Richard Frank (Rick Mohr)
 wrote:
>
>> On Oct 13, 2016, at 12:32 PM, E.S. Rosenberg  
>> wrote:
>>
>> I thought ZFS was only recommended for OSTs and not for MDTs/MGS?
>
> ZFS usually has lower metadata performance for MDT than using ldiskfs which 
> is why some people recommend ZFS only for the OSTs.  However, ZFS has 
> features (like snapshots) that are useful for the MDT so some folks are 
> willing to accept a performance hit in order to take advantage of those 
> features.

Does anyone have actual figures on how much lower the performance
is?  Like 10%, 20%, 30%?  The performance gap was probably mentioned
somewhere else, but I missed it.

Our next internal Lustre build is going to use ZFS for both MDT and
OST, but the driver is a patchless kernel, not so much performance.
Still, if there's a tremendous performance gap, I'd like to be aware of
that.
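For what it's worth, one way to get numbers for your own hardware is an MPI
metadata benchmark such as mdtest; a rough sketch, assuming mdtest and an MPI
launcher are installed and the mount points below are made up:

  # same client-side metadata load against an ldiskfs-backed and a ZFS-backed MDT
  mpirun -np 8 mdtest -n 4096 -i 3 -F -d /mnt/lustre-ldiskfs/mdtest
  mpirun -np 8 mdtest -n 4096 -i 3 -F -d /mnt/lustre-zfs/mdtest
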
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Quick ZFS pool question?

2016-10-13 Thread Mohr Jr, Richard Frank (Rick Mohr)

> On Oct 13, 2016, at 12:32 PM, E.S. Rosenberg  
> wrote:
> 
> I thought ZFS was only recommended for OSTs and not for MDTs/MGS?

ZFS usually has lower metadata performance for MDT than using ldiskfs which is 
why some people recommend ZFS only for the OSTs.  However, ZFS has features 
(like snapshots) that are useful for the MDT so some folks are willing to 
accept a performance hit in order to take advantage of those features.
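As a rough illustration of the kind of ZFS-level operation being referred to
(the pool/dataset names here are invented):

  # take a point-in-time snapshot of the MDT dataset, then list snapshots
  zfs snapshot mdtpool/mdt0@before-upgrade
  zfs list -t snapshot -r mdtpool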

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Quick ZFS pool question?

2016-10-13 Thread E.S. Rosenberg
On Fri, Oct 7, 2016 at 9:16 AM, Xiong, Jinshan 
wrote:

>
> > On Oct 6, 2016, at 2:04 AM, Phill Harvey-Smith <
> p.harvey-sm...@warwick.ac.uk> wrote:
> >
> > Hi all,
> >
> > Having tested a simple setup for lustre / zfs, I'd like to try and
> replicate on the test system what we currently have on the production
> system, which uses a much older version of lustre (2.0 IIRC).
> >
> > Currently we have a combined mgs / mds node and a single oss node. we
> have 3 filesystems : home, storage and scratch.
> >
> > The MGS/MDS node currently has the mgs on a separate block device and
> the 3 mds on a combined lvm volume.
> >
> > The OSS has an ost each (on separate disks) for scratch and home and
> two ost for storage.
> >
> > If we migrate this setup to a ZFS based one, will I need to create a
> separate zpool for each mdt / mgt / oss  or will I be able to create a
> single zpool and split it up between the individual mdt / oss blocks, if so
> how do I tell each filesystem how big it should be?
>
> We strongly recommend to create separate ZFS pools for OSTs, otherwise
> grant, which is a Lustre internal space reserve algorithm, won’t work
> properly.
>
> It’s possible to create a single zpool for MDTs and MGS, and you can use
> ‘zfs set reservation= ’ to reserve spaces for different
> targets.
>
I thought ZFS was only recommended for OSTs and not for MDTs/MGS?
Eli

>
> Jinshan
>
> >
> > Cheers.
> >
> > Phill.
> > ___
> > lustre-discuss mailing list
> > lustre-discuss@lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Quick ZFS pool question?

2016-10-07 Thread Hans Henrik Happe
Just curious: if you set a reservation on a ZFS OST filesystem, will the algorithm work?
Also, will it go totally crazy, or just not be able to make good decisions
because something external is grabbing the space?

Cheers,
Hans Henrik

On October 7, 2016 8:16:54 AM GMT+02:00, "Xiong, Jinshan" wrote:
>
>> On Oct 6, 2016, at 2:04 AM, Phill Harvey-Smith wrote:
>> 
>> Hi all,
>> 
>> Having tested a simple setup for lustre / zfs, I'd like to try and
>> replicate on the test system what we currently have on the production
>> system, which uses a much older version of lustre (2.0 IIRC).
>> 
>> Currently we have a combined mgs / mds node and a single oss node. we
>> have 3 filesystems : home, storage and scratch.
>> 
>> The MGS/MDS node currently has the mgs on a separate block device and
>> the 3 mds on a combined lvm volume.
>> 
>> The OSS has an ost each (on separate disks) for scratch and home
>> and two ost for storage.
>> 
>> If we migrate this setup to a ZFS based one, will I need to create a
>> separate zpool for each mdt / mgt / oss  or will I be able to create a
>> single zpool and split it up between the individual mdt / oss blocks,
>> if so how do I tell each filesystem how big it should be?
>
> We strongly recommend to create separate ZFS pools for OSTs, otherwise
> grant, which is a Lustre internal space reserve algorithm, won’t work
> properly.
>
> It’s possible to create a single zpool for MDTs and MGS, and you can
> use ‘zfs set reservation= ’ to reserve spaces for
> different targets.
>
>Jinshan
>
>> 
>> Cheers.
>> 
>> Phill.
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
>___
>lustre-discuss mailing list
>lustre-discuss@lists.lustre.org
>http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Quick ZFS pool question?

2016-10-06 Thread Xiong, Jinshan

> On Oct 6, 2016, at 2:04 AM, Phill Harvey-Smith  
> wrote:
> 
> Hi all,
> 
> Having tested a simple setup for lustre / zfs, I'd like to try and replicate 
> on the test system what we currently have on the production system, which 
> uses a much older version of lustre (2.0 IIRC).
> 
> Currently we have a combined mgs / mds node and a single oss node. we have 3 
> filesystems : home, storage and scratch.
> 
> The MGS/MDS node currently has the mgs on a separate block device and the 3 
> mds on a combined lvm volume.
> 
> The OSS has an ost each (on separate disks) for scratch and home and two 
> ost for storage.
> 
> If we migrate this setup to a ZFS based one, will I need to create a separate 
> zpool for each mdt / mgt / oss  or will I be able to create a single zpool 
> and split it up between the individual mdt / oss blocks, if so how do I tell 
> each filesystem how big it should be?

We strongly recommend creating separate ZFS pools for OSTs; otherwise grant, 
which is a Lustre-internal space reservation algorithm, won’t work properly.

It’s possible to create a single zpool for MDTs and MGS, and you can use ‘zfs 
set reservation= ’ to reserve space for different targets.
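A rough sketch of the layout being described; the pool names, disks, and sizes
below are invented for illustration:

  # one pool per OST, so Lustre grant accounting sees the real free space
  zpool create ost0pool mirror /dev/sdb /dev/sdc
  zpool create ost1pool mirror /dev/sdd /dev/sde

  # MGT and MDTs sharing one pool, with per-target space reservations
  zpool create mdspool mirror /dev/sdf /dev/sdg
  # ... once the mgt / mdt0 target datasets exist on mdspool ...
  zfs set reservation=10G  mdspool/mgt
  zfs set reservation=500G mdspool/mdt0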

Jinshan

> 
> Cheers.
> 
> Phill.
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org