Hello again,

Ideally you could run a benchmark of your application and use blktrace+seekwatcher <http://oss.oracle.com/%7Emason/seekwatcher/> to capture and view some really accurate IO stats, then tune accordingly. Other than that it's complete guesswork: you have about 50G of potential FS cache there, which is 0.2% of your physical data capacity, so it all depends on your cache hit rates (relative to network performance) and your ability to handle the cache misses. Just running 'iostat' on your backend nodes during a benchmark is good enough as well.
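As a rough sketch of that capture (the device name, trace prefix and duration below are placeholders, not from your setup), a blktrace+seekwatcher run on one brick might look like:

```shell
# Trace block-layer IO on one brick's data device while the benchmark
# runs (requires root; /dev/sdb and the 300s window are placeholders).
blktrace -d /dev/sdb -o mytrace -w 300   # -w: stop tracing after 300 seconds

# Turn the captured trace into seek/throughput/IOPS graphs.
seekwatcher -t mytrace -o mytrace.png
```

You then eyeball the seek graph: a wall of scattered seeks means random IO the spindles will struggle with, long flat runs mean sequential access the FS cache and drives handle easily.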

From your network stats it appears access is actually quite low (64 Mbit/s in / 43 Mbit/s out), so most requests may be served from your FS cache; and even if they aren't, there is no way that load, even at 100% small-block random operations, will saturate your drives.

By bonding <http://en.wikipedia.org/wiki/Channel_bonding> I mean aggregation, trunking or teaming, depending on which networking school you went to. My rough and totally inaccurate back-of-a-napkin numbers are meant to indicate that 1 Gbit probably won't be enough and you might need to consider two or more Gbit interfaces. Based on my testing with six servers I can saturate a Gbit interface pretty easily (though not with your application, of course).
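To make the napkin math explicit (purely as an illustration; the client count and per-client demand here are assumptions, not measurements from your cluster):

```shell
# Back-of-a-napkin check: does aggregate client demand fit in one 1 Gbit link?
clients=6                 # assumed number of concurrent clients
per_client_mbit=200       # assumed sustained demand per client, Mbit/s
link_mbit=1000            # capacity of a single gigabit interface

total_mbit=$((clients * per_client_mbit))
echo "aggregate demand: ${total_mbit} Mbit/s vs ${link_mbit} Mbit/s per link"

if [ "$total_mbit" -gt "$link_mbit" ]; then
    # Round up to the number of bonded gigabit interfaces needed.
    needed=$(( (total_mbit + link_mbit - 1) / link_mbit ))
    echo "one link is not enough; bond at least ${needed} x 1 Gbit interfaces"
fi
```

Swap in your own measured per-client numbers and the arithmetic tells you how many interfaces to bond.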

Long story short, the answer to your original question, "Any guide line we should follow for calculating the memory requirements", is that there isn't one. It all comes down to your specific application requirements (and the money you're willing to spend).

The only advice I'd give, then, is:

   * Be sure to monitor your IO and know exactly what the numbers mean
     and what causes them.
   * Have a capacity plan with an eye on what you need to address any
     of the possible eventualities:

      1. Network throughput/latency - More/faster ports.
      2. Disk sequential read/write - More spindles or flash.
      3. Disk random read/write - More spindles or flash.
      4. File System cache misses - RAM increases on storage nodes.
      5. Single storage node overload - More nodes or striping that file.
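For the first point, a plain iostat run on each storage node during a benchmark is the simplest starting place (column notes below are for sysstat's extended output; names vary slightly between sysstat versions):

```shell
# Extended per-device stats, refreshed every 5 seconds (sysstat package).
iostat -x 5
# Columns worth watching:
#   r/s, w/s     - read/write requests per second (random IO pressure)
#   rkB/s, wkB/s - throughput (sequential pressure; older sysstat prints rsec/s)
#   await        - average time (ms) a request spends queued plus serviced
#   %util        - fraction of time the device was busy; near 100% = saturated
```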




On 12/30/2010 08:51 PM, admin iqtc wrote:
Hi,

Sorry Mark, but I don't understand exactly what you need. Could you give me
an example of the information you're asking for?

Regarding bonding, don't worry: all five current machines are bonded (1 Gbit
per interface) to the switch, and the new machine would be installed the
same way.

That switch load is from the HPC clusters to the gluster. The info is from
the trunking interface on the switch. Our network topology is as follows:
each gluster server (and the new one) is connected with bonding to an L2
switch, and from that switch 4x1 Gbit cables go to an L3 switch. Both
switches are configured so those 4 cables are trunked. The traffic load I
mentioned is from the L3 switch.

We may expand that trunking some day, but for now we aren't having any
trouble.

Thanks

2010/12/28 Mark "Naoki" Rogers<[email protected]>

Hi,

Your five machines should get you raw speeds of at least 300 MB/s sequential
and 300-500 random IOPS; your file-system cache alters things depending on
access patterns. Without knowing about those patterns I can't guess at the
most beneficial disk/memory ratios for you. If possible, run some synthetic
benchmarks for base-lining and then try to benchmark your application; even
a limited benchmark is fine, you can still extrapolate from there.
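One way to get those synthetic baselines (fio is my suggestion here, it isn't something named earlier in the thread; the mount point, sizes and runtimes are placeholders):

```shell
# Sequential read baseline against a mounted gluster volume
# (/mnt/gluster is a placeholder mount point).
fio --name=seqread --directory=/mnt/gluster --rw=read \
    --bs=1M --size=4g --direct=1 --runtime=60 --time_based

# Small-block random read/write, closer to a worst case for the spindles.
fio --name=randrw --directory=/mnt/gluster --rw=randrw \
    --bs=4k --size=1g --numjobs=4 --iodepth=16 --ioengine=libaio \
    --direct=1 --runtime=60 --time_based
```

Run the same jobs directly on a brick's local file system as well, and the difference between the two tells you how much the network and gluster layers cost you.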

The first thing you might hit, though, could be the 1 Gbit interfaces, so
keep an eye on those and perhaps have a plan to bond them, and get ready to
think about 10G on the larger node if needed.

Right now it seems the switch load is light; is that per port to the
storage bricks?



On 12/28/2010 05:38 PM, admin iqtc wrote:

Hi,

sorry for not giving more information on the first mail.

The setup would be straight distributed. The disks are SATA2 7200 RPM. At
the moment the 5 machines we're currently running have 5 disks of 1 TB each
(4 TB with RAID5). The new machine would have 12 disks of 2 TB, also with
RAID5, so approx. 23 TB.

We're using gluster for storage of an HPC cluster. That means data gets
copied from and to gluster all the time. For example, looking at the traffic
on the switch, the average is 64 Mbit/s IN (that is, writing) and 43 Mbit/s
OUT (that is, reading). That is across the 5 machines.

Is this enough?

Thanks!

_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

