Sorry, I should have noted that. The 380 MB/s is both read and write (I confirmed this with a developer).

We do need the NFS stack, as that's how all the code and the various instances work -- we have several "workers" that chop up video on the same namespace. It's not efficient, but that's how it has to be for now.

Redundancy in terms of the server? We have RAID volumes, if that's what you're referring to.

Here's a basic outline of the flow (as I understand it):


1. The video capture agent sends in a large video file (roughly 30 GB).
2. The administrative host receives it and writes it to NFS.
3. A process copies the file to another point in the same namespace.
4. Another instance picks up the file, reads it, starts processing (ffmpeg is involved), and writes the result back.


Something like that -- I may not have all the steps, but essentially there's a ton of I/O going on. I know our code model is not efficient, but it's complicated and can't just be changed (it's based on an open source product and there's some code baggage).
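In case a concrete picture helps, here's a very rough sketch of that flow in Python. The directory paths, the file extension, and the ffmpeg arguments are made-up placeholders, not our actual code; it's only meant to show the I/O pattern.

# Hypothetical sketch of the flow described above -- placeholder paths and
# arguments, not our real pipeline.
import shutil
import subprocess
from pathlib import Path

INCOMING = Path("/mnt/nfs/incoming")   # where the capture agent's upload lands
STAGING = Path("/mnt/nfs/staging")     # another point in the same NFS namespace
OUTPUT = Path("/mnt/nfs/output")       # where the processed result is written

def process(upload: Path) -> None:
    # Step 3: copy the file to another point in the same namespace
    # (a full read and write over the NFS mount).
    staged = STAGING / upload.name
    shutil.copy2(upload, staged)

    # Step 4: a worker instance reads the staged file, transcodes it with
    # ffmpeg, and writes the result back to the same NFS mount.
    out = OUTPUT / (upload.stem + ".mp4")
    subprocess.run(
        ["ffmpeg", "-i", str(staged), "-c:v", "libx264", str(out)],
        check=True,
    )

def main() -> None:
    # ".mov" is just an illustrative extension for the incoming uploads.
    for upload in sorted(INCOMING.glob("*.mov")):
        process(upload)

The point being that every ~30 GB file crosses the NFS mount several times: the initial write, the intra-namespace copy, and then the read and re-write during transcoding.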

We looked into another product that allegedly scaled out using multiple NFS heads with massive local cache (AWS instances) sharing the same space, but it was horrible and just didn't work for us.



Thank you.



On 7/14/15 3:06 PM, Mathieu Chateau wrote:
Hello,

Is it 380 MB/s in read or write? What level of redundancy do you need?
Do you really need the NFS stack, or just a mount point (so you could use the native Gluster protocol)?

Gluster load is mostly put on the clients, not the server (the clients do the synchronous writes to all replicas, and handle the memory caching).


Regards,
Mathieu CHATEAU
http://www.lotp.fr

2015-07-14 20:49 GMT+02:00 Forrest Aldrich <[email protected]>:

    I'm exploring solutions to help us achieve high throughput and
    scalability within the AWS environment.   Specifically, I work in
    a department where we handle and produce video content that
    results in very large files (30 GB or so) that must be written to
    NFS, chopped up and copied over on the same mount (there are some
    odd limits to the code we use, but that's outside the scope of
    this question).

    Currently, we're using a commercial vendor with AWS, with
    dedicated Direct Connect instances as the back end to our
    production.   We're maxing out at 350 to 380 MB/s, which is not
    enough.  We expect our capacity will double or even triple when we
    bring on more classes or even other entities, and we need to find a
    way to squeeze out as much I/O as we can.

    Our software model depends on NFS, there's no way around that
    presently.

    Since GlusterFS uses FUSE, I'm concerned about performance, which
    is a key issue.   Sounds like a striped volume would be appropriate.

    My basic understanding of Gluster is that it can combine several
    "bricks" (multiple dedicated EBS volumes, or even multiple
    instances of the above commercial vendor served up via NFS) into
    what client connections transparently see as a single namespace.
    The I/O could be distributed in this manner.

    I wonder if someone here with more experience might elaborate on
    whether GlusterFS could be used in the above scenario,
    specifically regarding I/O performance.  We'd really like to push
    throughput as high as possible, ideally to 700 MB/s or even
    1 GB/s and up.



    Thanks in advance.







_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
