On 11/26/2012 03:44 AM, Zohair Raza wrote:
I understand your point. In the meantime I also tried an NFS-like clustering setup, but it didn't help.

Master-master geo-replication is planned for 3.4: http://www.gluster.org/community/documentation/index.php/Planning34

I think it serves the same purpose. Does anyone know more about it?

Regards,
Zohair Raza

On Mon, Nov 26, 2012 at 2:58 PM, Robert Hajime Lanning <[email protected]> wrote:

    On 11/25/12 23:26, Zohair Raza wrote:

        Hi,

        Thanks for the reply.

        Can you please elaborate on the last line? I understand that
        reads will have no issues. I tried implementing a replicated
        volume, but the problem is that gluster starts uploading the
        file to node2 while it is still being copied. For example, a
        500MB file at site1 that is being copied from a LAN machine to
        node1 copies at the speed of my internet link. I want it to
        copy at the much faster LAN speed (in MBps).

        Is there any way to set the synchronization speed, or to have
        gluster sync only after the file has been copied?


    All the smarts are in the client.

    If you have a replica count of 2, then when a client is writing,
    it is writing to 2 bricks at the same time.  There is no such
    thing as queuing for later sync.
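
To put rough numbers on why the copy runs at WAN speed, here is a minimal sketch. It assumes a synchronous write finishes only when every brick has the data, so the slowest link sets the pace. The link speeds and the function name are made up for illustration; they are not measurements from this thread, and real throughput also depends on latency and protocol overhead.

```python
def replicated_write_seconds(size_mb, link_speeds_mbps):
    """Estimate the time for a synchronous replicated write.

    size_mb is the file size in megabytes; link_speeds_mbps lists the
    client-to-brick link speeds in megabits per second. The write is
    only done when the slowest brick has received everything.
    """
    slowest = min(link_speeds_mbps)
    return size_mb * 8 / slowest


# Hypothetical link speeds, chosen only for illustration.
LAN_MBPS = 1000
WAN_MBPS = 10

local_only = replicated_write_seconds(500, [LAN_MBPS])
replica_2 = replicated_write_seconds(500, [LAN_MBPS, WAN_MBPS])
print(f"one local brick: {local_only:.0f}s, replica 2 across the WAN: {replica_2:.0f}s")
```

Under these assumed speeds the same 500MB file takes about 100 times longer once the second brick sits behind the internet link.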

    What happens if a client at site A is writing to the same file as
    a client at site B?  If you have a delayed write to a remote site,
    how do you solve write conflicts?  You would need to completely
    understand the file format and its transactional state, so that
    the 2 separate writes can be merged without corrupting the file.

    If there is a conflict, there is no way to notify the process that
    was writing, because the write would have already returned as
    successful, since it was queued for later execution on the file.
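
The conflict described above can be shown with a toy model. This is only a sketch of a hypothetical "queue now, sync later" scheme, not how gluster works: the Site class, its methods, and the file contents are all invented for the example.

```python
class Site:
    """Toy model of one site holding its own copy of a file."""

    def __init__(self, name, content):
        self.name = name
        self.data = bytearray(content)
        self.queue = []  # writes applied locally but not yet sent to the peer

    def write(self, offset, payload):
        # Apply locally and report success right away; the remote copy
        # is only updated later, when sync_to() runs.
        self.data[offset:offset + len(payload)] = payload
        self.queue.append((offset, payload))
        return True

    def sync_to(self, peer):
        # Replay the queued writes on the peer, with no knowledge of
        # what the peer has written in the meantime.
        for offset, payload in self.queue:
            peer.data[offset:offset + len(payload)] = payload
        self.queue.clear()


site_a = Site("A", b"balance=100")
site_b = Site("B", b"balance=100")

# Both clients are told their write succeeded.
ok_a = site_a.write(8, b"150")
ok_b = site_b.write(8, b"070")

# The delayed sync happens afterwards, in both directions.
site_a.sync_to(site_b)
site_b.sync_to(site_a)

print(bytes(site_a.data), bytes(site_b.data))
```

Each site ends up holding the other site's write, so the two copies disagree, and neither writing process can be told, because both writes already returned success.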

    The only way to solve this is to have synchronized locks and
    synchronized writes.  It needs to behave like a local filesystem
    with 2 processes writing.

    Geo-replication solves this by saying one site is the master and
    all writes happen there.  The other site is a replica of the
    master, period.  This gives you a single source of truth about
    file state and no conflicts to mediate.

    For a database with ACID transactions and atomic data structures,
    you can design the data and data structures for multi-master
    replication. You can state that the latest update of an atomic
    structure wins, then design your application around that.  For a
    filesystem, you can't, as you do not have visibility into the
    structure of the files.
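
For contrast, this is roughly what "latest update of an atomic structure wins" looks like when the application owns the data structure. It is a minimal sketch with invented names (Versioned, merge), and it assumes timestamps are comparable across sites, with the site name breaking ties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    """One atomic value plus the metadata needed to pick a winner."""
    value: str
    timestamp: float
    site: str


def merge(x, y):
    # Last writer wins: the newer timestamp is kept, and the site name
    # breaks ties so both sites always pick the same record.
    return max(x, y, key=lambda r: (r.timestamp, r.site))


from_a = Versioned("shipped", timestamp=1353920400.0, site="A")
from_b = Versioned("cancelled", timestamp=1353920460.0, site="B")

# The merge gives the same answer whichever site runs it.
print(merge(from_a, from_b).value, merge(from_b, from_a).value)
```

This works because the whole record is replaced atomically and the application has agreed that the losing update may be discarded. A filesystem sees only byte ranges inside an opaque file, so it has no equivalent unit to apply the rule to.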

    The commercial NAS systems that have multi-master capabilities do
    it at the block level (not file) and do it synchronously.

    I currently do not know of a way to implement a multi-master
    asynchronous network filesystem, without introducing the
    possibility of file corruption.


I wrote up roughly the first third of why what you're asking for is difficult on my blog: http://www.joejulian.name/blog/why-replicated-filesystems-are-hard/
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
