Offlist exchange forwarded with permission.

----- Forwarded message from Kyle Schochenmaier <[EMAIL PROTECTED]> -----

From: Kyle Schochenmaier <[EMAIL PROTECTED]>
Date: Wed, 12 Nov 2008 16:37:39 -0600
To: Eugen Leitl <[EMAIL PROTECTED]>
Cc: Troy Benjegerdes <[EMAIL PROTECTED]>
Subject: Re: [Pvfs2-users] linux vserver and PVFS2

To answer a couple of your more basic questions about the filesystem:

Performance :
SATA disks provide great throughput as long as you don't overload them
with I/Os.  If your workload is fairly sequential, they will perform
on par with FC disks.
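If you want to sanity-check the sequential numbers on your own SATA
pair, a crude timing sketch like this gets you in the ballpark (Python;
the file path is a placeholder for somewhere on the array under test):

    import os, time

    path = "/mnt/test/seqwrite.tmp"   # placeholder: file on the array under test
    size_mb = 1024
    buf = b"\0" * (1 << 20)           # 1 MB writes, roughly sequential

    t0 = time.time()
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())          # make sure the data actually hit the disk
    print("%.0f MB/s" % (size_mb / (time.time() - t0)))
    os.remove(path)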
In my experience, scaling is fairly dependent on the disks and the
network, not so much on the filesystem.  When you get lots of spindles
moving and have really fast networks (InfiniBand, MX, 10GbE), the
filesystem's performance comes into play a bit more, and there are
performance tweaks you can make at that point.  For systems where the
disks have more bandwidth available than the network, scaling seems to
be really good - in some cases nearly linear - and again, you'd be
limited by network rates.

On 4 bonded GigE connections, I would imagine you will be limited by
network performance, since each node you add contributes another
50-80 MB/s.  But I may not exactly understand your setup; if you have
4x GigE per node, that's a lot different.
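To put rough numbers on the shared-bond case, here's a back-of-the-
envelope sketch (Python; the ~110 MB/s usable per GigE link and the
50-80 MB/s per-node figures are assumptions, not measurements):

    # Rough estimate: how many storage nodes saturate a 4x GigE bond?
    bond_capacity = 4 * 110                 # ~MB/s usable across 4 bonded GigE links
    per_node_low, per_node_high = 50, 80    # MB/s each node is assumed to add

    print("nodes to saturate bond: %.1f - %.1f"
          % (bond_capacity / per_node_high, bond_capacity / per_node_low))
    # -> roughly 5.5 - 8.8 nodes; past that the bond, not the disks, is the limit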

I'll leave the DRBD stuff for Troy ;-)

Typical failover/HA setups for PVFS2 are handled by a daemon called
Heartbeat, and for it to work you basically need multiple physical
nodes with access to the same LUNs (disks).  I don't know if this is
very simple to implement without SRP/iSCSI/etc.
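For illustration, a Heartbeat v1 haresources entry for an active/passive
pair of PVFS2 I/O servers might look roughly like this (the node name,
IP, device, and init-script name are all placeholders, not from a real
setup):

    # /etc/ha.d/haresources -- sketch only
    # preferred node, a floating service IP, the shared LUN to mount,
    # then the init script that starts the PVFS2 server after failover
    io-node-1 IPaddr::192.168.1.50/24 Filesystem::/dev/sdb1::/pvfs2-storage::ext3 pvfs2-server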

There is no mirroring done inside PVFS2.  I've looked into it, and
several others have looked into it and provided proofs of concept, but
none of that work has seen the light of day.

Hope this helps a little.

~Kyle

Kyle Schochenmaier



On Wed, Nov 12, 2008 at 3:09 PM, Eugen Leitl <[EMAIL PROTECTED]> wrote:
> On Wed, Nov 12, 2008 at 12:19:21PM -0600, Troy Benjegerdes wrote:
>> What's your application?
>
> Hosting some ~100 vservers/node.
>
>> I was just looking at InfiniBand card prices, and it might cost you less
>> than 4x GigE to get a 24-port IB switch and these cards...
>
> The 4 ports are onboard a 407 EUR Sun Fire X2100 M2 kit. GBit switches
> are almost free, too, and having several switches offers some redundancy.
>
>> http://www.colfaxdirect.com/store/pc/viewPrd.asp?idcategory=6&idproduct=12
>>
>>
>> For tolerating node failures, I would do some sort of software mirroring
>> across nodes, using something like DRBD or InfiniBand SRP.
>
> I've used DRBD, but then I'd be wasting even more active nodes.
> It would be nice to mix striping and mirroring at the PVFS2 level.
>
>>
>> Eugen Leitl wrote:
>> >I'm planning to eventually operate a 20+ node cluster of Debian boxes
>> >(2-4 cores AMD64, 8 GByte RAM, about 1-2 TByte RAID 1, 4x GBit Ether
>> >interfaces, probably with jumbo frames) with a unified filestore.
>> >
>> >A few questions I've been unable to answer by searching:
>> >
>> >Can I make Linux vserver guests' filesystems live on PVFS2
>> >if I don't use unification
>> >(http://linux-vserver.org/Frequently_Asked_Questions#Unification)?
>> >
>> >How much aggregate throughput can I expect with some 20
>> >nodes, with a modern 7 krpm SATA drive (RAID 1 pair, about
>> >80 MByte/s sustainable throughput, or so)?
>> >
>> >Is there a way to set up PVFS2 to tolerate 1-2 node losses in the
>> >above 20-node assembly, and how much of the raw storage would I lose
>> >that way?
>> >
>> >Thanks,
>> >
>> >
> --
> Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
> ______________________________________________________________
> ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
> 8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
>

----- End forwarded message -----
-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
