On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan <[email protected]> wrote:
> Okay, so I have enabled sharding in my test volume and it did not help;
> stupidly enough, I also enabled it in a production "Distributed-Replicate"
> volume and it corrupted half of my VMs.
> I have updated Gluster to the latest version and nothing seems to have
> changed in my situation.
> Below is the info of my volume:
>

I was pointing at the settings in that email as an example of how a
corruption problem was fixed, not as a reason to enable sharding. I
wouldn't recommend enabling sharding if you haven't gotten the base volume
working yet on that cluster. What HBAs are you using, and what is the
filesystem layout of the bricks?
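If you can, post the output of something like the following from each node
(the brick paths are taken from your volume info below; substitute your
actual volume name for <volname>, and adjust paths as needed):

  df -hT /bricks/b001
  xfs_info /bricks/b001          # if the bricks are XFS; tune2fs -l for ext4
  gluster volume status <volname> detail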
> Number of Bricks: 3 x 2 = 6
> Transport-type: tcp
> Bricks:
> Brick1: gfs001:/bricks/b001/vmware
> Brick2: gfs002:/bricks/b004/vmware
> Brick3: gfs001:/bricks/b002/vmware
> Brick4: gfs002:/bricks/b005/vmware
> Brick5: gfs001:/bricks/b003/vmware
> Brick6: gfs002:/bricks/b006/vmware
> Options Reconfigured:
> performance.strict-write-ordering: on
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> network.remote-dio: enable
> performance.stat-prefetch: disable
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> cluster.eager-lock: enable
> features.shard-block-size: 16MB
> features.shard: on
> performance.readdir-ahead: off
>
> On 03/12/2016 08:11 PM, David Gossage wrote:
>
> On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan <[email protected]> wrote:
>
>> Both servers have HBAs, no RAID, and I can set up a replicated or
>> disperse volume without any issues.
>> The logs are clean; when I tried to migrate a VM and got the error,
>> nothing showed up in the logs.
>> I tried mounting the volume on my laptop and it mounted fine, but if I
>> use dd to create a data file it just hangs and I can't cancel it, and I
>> can't unmount it or anything; I just have to reboot.
>> The same servers have another volume on other bricks in a
>> distributed-replicate layout, and it works fine.
>> I have even tried the same setup in a virtual environment (created two
>> VMs, installed Gluster, and created a replicated striped volume) and got
>> the same thing: data corruption.
>
> I'd look through the mailing list archives for a thread called "Shard in
> Production", I think. The shard portion may not be relevant, but it does
> discuss certain settings that had to be applied to avoid corruption with
> VMs. You may also want to try disabling performance.readdir-ahead.
>
>> On 03/12/2016 07:02 PM, David Gossage wrote:
>>
>> On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan <[email protected]> wrote:
>>
>>> Thanks David.
>>>
>>> My settings were all defaults; I had just created the volume and
>>> started it. I have applied the settings you recommended, and it seems
>>> to be the same issue:
>>>
>>> Type: Striped-Replicate
>>> Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
>>> Status: Started
>>> Number of Bricks: 1 x 2 x 2 = 4
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gfs001:/bricks/t1/s
>>> Brick2: gfs002:/bricks/t1/s
>>> Brick3: gfs001:/bricks/t2/s
>>> Brick4: gfs002:/bricks/t2/s
>>> Options Reconfigured:
>>> performance.stat-prefetch: off
>>> network.remote-dio: on
>>> cluster.eager-lock: enable
>>> performance.io-cache: off
>>> performance.read-ahead: off
>>> performance.quick-read: off
>>> performance.readdir-ahead: on
>>
>> Is there a RAID controller perhaps doing any caching?
>>
>> Are any errors being reported in the Gluster logs during the migration
>> process?
>>
>> Since they aren't in use yet, have you tested making plain mirrored
>> (replica 2) volumes using different pairings of servers, two at a time,
>> to see if the problem follows a certain machine or network port?
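>> For example (the volume name and brick paths here are just
>> placeholders, not taken from your setup):
>>
>>   gluster volume create pairtest replica 2 \
>>       serverA:/bricks/test/brick serverB:/bricks/test/brick
>>   gluster volume start pairtest
>>
>> Then mount that from ESXi and rerun the same migration or dd test,
>> swapping which two servers (or which NICs) you use each time.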
>>> On 03/12/2016 03:25 PM, David Gossage wrote:
>>>
>>> On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan <[email protected]> wrote:
>>>
>>>> Dears,
>>>>
>>>> I have created a replicated striped volume with two bricks and two
>>>> servers, but I can't use it because when I mount it in ESXi and try
>>>> to migrate a VM to it, the data gets corrupted.
>>>> Does anyone have any idea why this is happening?
>>>>
>>>> Dell 2950 x2
>>>> Seagate 15k 600GB
>>>> CentOS 7.2
>>>> Gluster 3.7.8
>>>>
>>>> Appreciate your help.
>>>
>>> Most reports of this I have seen end up being settings related. Post
>>> your gluster volume info. Below are what I have seen as the most
>>> commonly recommended settings; I'd hazard a guess you may have the
>>> read-ahead cache or prefetch turned on.
>>>
>>> quick-read=off
>>> read-ahead=off
>>> io-cache=off
>>> stat-prefetch=off
>>> eager-lock=enable
>>> remote-dio=on
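>>> Applying those is just a series of volume set calls; for example
>>> (replace <volname> with your volume's name):
>>>
>>>   gluster volume set <volname> performance.quick-read off
>>>   gluster volume set <volname> performance.read-ahead off
>>>   gluster volume set <volname> performance.io-cache off
>>>   gluster volume set <volname> performance.stat-prefetch off
>>>   gluster volume set <volname> cluster.eager-lock enable
>>>   gluster volume set <volname> network.remote-dio on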
>>>>
>>>> Mahdi Adnan
>>>> System Admin

_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users