Re: [Lustre-discuss] New Test Framework Development - Requirements Capture

2012-12-07 Thread Gearing, Chris
Just a reminder that we have a meeting on Tuesday to discuss the new test
framework development, and that the wiki page on the OpenSFS site is ready
for capturing everybody's thoughts. The plan for the next meeting is to
discuss the requirements that have been captured.

Thanks

Chris

Call Info
Tuesday 11th December 16:00 UTC
 
Bridge Info
916-356-2663, Bridge: 1, Passcode: 1146033

Join online meeting
https://meet.intel.com/chris.gearing/M6HD8CYF
 





Hi,

During the last meeting we decided to set up a wiki page to allow everyone
to capture their thoughts, requirements and ideas for possible inclusion
in the new framework environment.

This page is now available on the OpenSFS wiki and we would welcome your
input before the 7th December. Please provide input, however big or small.

http://wiki.opensfs.org/New_test_framework

Many thanks

Chris


___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Applications of Lustre - streaming?

2012-12-07 Thread Jon Yeargers
Can Lustre be used to store data like streaming audio / video? I’ve been 
scolded about considering it for DB storage but I’m looking at the relative 
merits of Lustre vs HDFS.

I’m moving to a clustered DB setup and wondering about Cassandra / Lustre vs 
Hadoop (IE HBase / HDFS). One offers flexibility in terms of mixing hardware 
components while the other is a ‘one stop shop’.

Not trying to elicit a religious war – and yes, I’ve been reading as much as I 
can find about this. Just hoping for the opinion(s) of this side of the table.


Re: [Lustre-discuss] Applications of Lustre - streaming?

2012-12-07 Thread Colin Faber
Hi,

On 12/07/2012 10:26 AM, Jon Yeargers wrote:

> Can Lustre be used to store data like streaming audio / video?

Yes.

> I’ve been scolded about considering it for DB storage but I’m looking
> at the relative merits of Lustre vs HDFS.

DB reads and writes tend to produce small I/O, which Lustre does not handle
as well as large I/O.
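
As a rough illustration of the point above (not part of the original message), the sketch below counts how many separate requests it takes to move the same amount of data at two request sizes. The 4 KiB and 1 MiB sizes are assumptions chosen for illustration; they are not measurements of any database or of Lustre.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def requests_needed(total_bytes: int, io_size: int) -> int:
    """Return how many requests of io_size bytes are needed to move total_bytes."""
    if total_bytes < 0 or io_size <= 0:
        raise ValueError("total_bytes must be >= 0 and io_size must be > 0")
    # Ceiling division: a final partial chunk still costs one request.
    return -(-total_bytes // io_size)


if __name__ == "__main__":
    # Both request sizes are hypothetical, picked only to show the ratio.
    for label, size in (("4 KiB requests", 4 * KIB), ("1 MiB requests", MIB)):
        print(f"{label}: {requests_needed(GIB, size)} requests per GiB")
```

Each request carries a fixed cost (network round trip, locking, server thread time), so a workload made of many small requests pays that cost far more often for the same volume of data.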

> I’m moving to a clustered DB setup and wondering about Cassandra /
> Lustre vs Hadoop (IE HBase / HDFS). One offers flexibility in terms of
> mixing hardware components while the other is a ‘one stop shop’.

Honestly, I'm not sure. If you do perform some benchmarking between the two,
I, and I'm sure others, would be greatly interested in seeing how the
various FS technologies stack up!

> Not trying to elicit a religious war – and yes, I’ve been reading as
> much as I can find about this. Just hoping for the opinion(s) of this
> side of the table.


I don't think you'll find that here =)

-cf





Re: [Lustre-discuss] Applications of Lustre - streaming?

2012-12-07 Thread Dilger, Andreas
On 2012-12-07, at 10:26, Jon Yeargers <yearg...@ohsu.edu> wrote:
> Can Lustre be used to store data like streaming audio / video? I’ve been
> scolded about considering it for DB storage but I’m looking at the relative
> merits of Lustre vs HDFS.

I've been using Lustre for years with my home MythTV (Linux PVR) setup. The 
only major change I made was to reduce the readahead window size so that there 
wasn't lag when videos first start playing due to the large readahead window 
being filled.
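
A back-of-the-envelope sketch of why a smaller readahead window can shorten that start-up lag (not part of the original message). It assumes, as a simplification, that playback cannot begin until the whole window has been read; the window sizes and the 20 MB/s throughput are hypothetical figures, not values from this setup.

```python
def readahead_fill_seconds(window_mb: float, throughput_mb_per_s: float) -> float:
    """Estimate the time to fill a readahead window at a steady throughput."""
    if window_mb < 0 or throughput_mb_per_s <= 0:
        raise ValueError("window_mb must be >= 0 and throughput must be > 0")
    return window_mb / throughput_mb_per_s


if __name__ == "__main__":
    throughput = 20.0  # MB/s, an assumed figure for a modest home server
    for window in (40.0, 4.0):  # MB, hypothetical large and reduced windows
        delay = readahead_fill_seconds(window, throughput)
        print(f"{window:.0f} MB window: about {delay:.1f} s before playback starts")
```

Under these assumptions the delay scales linearly with the window size, so a tenfold smaller window gives a tenfold shorter wait.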

Of course, the suitability for a given workload depends on the hardware being
used. Lustre will definitely give you better performance for the same hardware
than HDFS, but if you need highly available data, the storage needs to be able
to fail over between servers.

Cheers, Andreas



Re: [Lustre-discuss] Applications of Lustre - streaming?

2012-12-07 Thread Jon Yeargers
The redundancy of HDFS is very appealing. I've been weighing the merits of this 
vs a RAID-6 / server on Lustre. HDFS recommends avoiding RAID for the very 
reason that the data is (typically) saved in several locations. 
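
To put rough numbers on that trade-off (not part of the original message), the sketch below compares usable capacity under replication with usable capacity under RAID-6 for the same raw disks. Three replicas is assumed here because it is the usual HDFS default; the 12 x 2 TB disk layout is hypothetical.

```python
def replicated_usable_tb(raw_tb: float, replicas: int) -> float:
    """Usable capacity when every block is stored `replicas` times."""
    if raw_tb < 0 or replicas < 1:
        raise ValueError("raw_tb must be >= 0 and replicas must be >= 1")
    return raw_tb / replicas


def raid6_usable_tb(disks: int, disk_tb: float) -> float:
    """Usable capacity of one RAID-6 array: two disks' worth goes to parity."""
    if disks < 4 or disk_tb < 0:
        raise ValueError("RAID-6 needs at least 4 disks and disk_tb must be >= 0")
    return (disks - 2) * disk_tb


if __name__ == "__main__":
    disks, disk_tb = 12, 2.0  # hypothetical server: 12 disks of 2 TB each
    raw = disks * disk_tb
    print(f"raw capacity:        {raw:.0f} TB")
    print(f"3x replication:      {replicated_usable_tb(raw, 3):.0f} TB usable")
    print(f"RAID-6 (one array):  {raid6_usable_tb(disks, disk_tb):.0f} TB usable")
```

The capacity figures are only half of the comparison: replicas placed on different hosts survive the loss of a whole server, while RAID-6 protects only against disk failures inside one server.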



Re: [Lustre-discuss] Applications of Lustre - streaming?

2012-12-07 Thread Jason Brooks
Hello,

The question of HDFS storage via Lustre has been in the foreground of my
thinking. The Hadoop HDFS processes are not aware of block devices: they only
know of a filesystem mount point where they begin storing HDFS data. THUS…

If we provide a filesystem interface (say a Lustre mount point) whose latencies
and throughput approach those of local disk storage (say, via InfiniBand), could
we not have the various Hadoop nodes store their data in the Lustre filesystem?
Would Hadoop even care?

I realize that this may not be a good place to bring it up. But there you go…

One of these days (with all of my ample spare time), I will benchmark it, and
report back, of course…

--jason



Re: [Lustre-discuss] Applications of Lustre - streaming?

2012-12-07 Thread Jon Yeargers
If it weren’t for the positive aspects of HDFS I wouldn’t really be considering 
HBase (over Cassandra).

Any notion of the merits of Lustre’s kernel-based mounts vs a FUSE-based mount
(HDFS)? Whichever filesystem I go with, I will need to store ‘flat files’ in it.



Re: [Lustre-discuss] Applications of Lustre - streaming?

2012-12-07 Thread Jason Brooks
I have used FUSE for other filesystems: it's great if all you need is access to
the data, but the performance is HORRIBLE.

--jason



Re: [Lustre-discuss] Applications of Lustre - streaming?

2012-12-07 Thread Jason Brooks
It is the question of how to handle redundancy that stops me from
immediately testing this idea of mine. Well, that and time, wherewithal,
etc...

Hadoop is great because it uses the speed and latency of local disks to
work with data, and does not require systems to be homogeneous. With the
data replicated, it not only has the storage redundancy, but also x-1
hosts that can work with the data if a host goes down.

The downside appears to be that Hadoop can't really handle many nodes
going down ungracefully. My personal goal with my idea is to make a host
a dumb compute node that I can shut down with impunity.




Re: [Lustre-discuss] Applications of Lustre - streaming?

2012-12-07 Thread Jeff Johnson
On 12/7/12 9:34 AM, Dilger, Andreas wrote:
> I've been using Lustre for years with my home MythTV (Linux PVR) setup.

Nerd. :)

-- 
--
Jeff Johnson
Co-Founder
Aeon Computing

jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101   f: 858-412-3845
m: 619-204-9061

/* New Address */
4170 Morena Boulevard, Suite D - San Diego, CA 92117



Re: [Lustre-discuss] noatime or atime_diff for Lustre 1.8.7?

2012-12-07 Thread Mohr Jr, Richard Frank (Rick Mohr)
On Dec 6, 2012, at 2:58 PM, Grigory Shamov wrote:

> So, on one of our OSS servers the load is now 160. According to collectl,
> only one OST does most of the job. (We don't do striping on this FS, unless
> users do it manually on their subdirectories.)

This sounds similar to situations we see every now and then. The load on the
OSS server climbs until it is roughly equal to the number of OSS threads
(which sounds like your case with load=oss_threads=160), but only a single OST
is performing any significant IO. This seems to arise when parallel jobs
access the same file which has stripe_count=1. The OSS is bombarded with so
many requests to a single OST that they backlog and tie up all the OSS threads.
At that point, all IO to the OSS slows to a crawl no matter which OST on the
OSS is being used. This becomes problematic because even a modest-sized job
can effectively DoS an OSS server.

When you encounter these problems, is the IO to the affected OST primarily
one-way (i.e. mostly reads or mostly writes)? In our cases, we tend to see
this when parallel jobs are reading from a common file. There are a couple of
things that I have found that help:

1) Increase the file striping a lot. This helps spread the load over more
OSTs. We have had success with striping even relatively small files (~10 GB)
over 100+ OSTs. Not only does it reduce load on the OSS, but it usually speeds
up the application significantly.

2) Make sure caching is enabled on the OSS. For us, this seems to help mostly
when lots of processes are reading in the same file.
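
The effect of item 1 can be sketched with a small model (not part of the original message). It assumes plain round-robin striping, where consecutive stripe-sized chunks of a file go to consecutive OSTs in the file's layout. The 1 MiB stripe size, the 160 readers and the stripe count of 16 are illustrative assumptions, and the index returned is a position within the file's stripe layout, not an actual OST number.

```python
from collections import Counter

KIB = 1024
MIB = 1024 * KIB


def ost_index_for_offset(offset: int, stripe_size: int, stripe_count: int) -> int:
    """Map a file offset to an index in the file's stripe layout (round-robin)."""
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe_size and stripe_count must be > 0")
    return (offset // stripe_size) % stripe_count


def requests_per_ost(offsets, stripe_size: int, stripe_count: int) -> Counter:
    """Count how many of the given read offsets land on each layout index."""
    return Counter(ost_index_for_offset(o, stripe_size, stripe_count) for o in offsets)


# Hypothetical job: 160 readers, each starting at a different 1 MiB chunk.
READER_OFFSETS = [i * MIB for i in range(160)]

if __name__ == "__main__":
    for count in (1, 16):
        load = requests_per_ost(READER_OFFSETS, MIB, count)
        print(f"stripe_count={count}: busiest OST serves "
              f"{max(load.values())} of {len(READER_OFFSETS)} requests")
```

With stripe_count=1 every request queues on the same OST, which matches the backlog described above; with a wider stripe the same requests spread evenly. The model ignores caching and request timing.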

Not sure if your situation is exactly like what I have seen, but maybe some of 
that info can help a bit.

-- 
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu




Re: [Lustre-discuss] noatime or atime_diff for Lustre 1.8.7?

2012-12-07 Thread Mark Day
> 2) Make sure caching is enabled on the oss.

How do you check for this, or enable it? Is it not enabled by default?

Cheers, Mark 
