On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan <[email protected]> wrote:
> My HBAs are LSISAS1068E, and the filesystem is XFS.
> I tried EXT4 and it did not help.
> I have created a striped volume on one server with two bricks, same
> issue, and I tried a replicated volume with just sharding enabled,
> same issue; as soon as I disable the sharding it works just fine.
> Neither sharding nor striping works for me.
> I did follow up with some threads on the mailing list and tried some
> of the fixes that worked for the others; none worked for me. :(

Is it possible the LSI has write-cache enabled?
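Something like this should show the write-cache state of the disks
behind that HBA (a sketch, assuming sdparm is installed and /dev/sdX
stands in for one of your brick disks):

  # query the Write Cache Enable (WCE) bit on the drive's caching page
  sdparm --get=WCE /dev/sdX
  # if it reports WCE 1, a persistent disable would be:
  sdparm --clear=WCE --save /dev/sdX

For plain SATA disks, hdparm -W /dev/sdX reports the same setting.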
> On 03/13/2016 06:54 PM, David Gossage wrote:
>
> On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan <[email protected]> wrote:
>
>> Okay, so I have enabled shard in my test volume and it did not help.
>> Stupidly enough, I have enabled it in a production volume
>> "Distributed-Replicate" and it corrupted half of my VMs.
>> I have updated Gluster to the latest and nothing seems to have
>> changed in my situation.
>> Below is the info of my volume:
>
> I was pointing at the settings in that email as an example for fixing
> corruption. I wouldn't recommend enabling sharding if you haven't
> gotten the base working yet on that cluster. What HBAs are you using,
> and what is the filesystem layout for the bricks?
>
>> Number of Bricks: 3 x 2 = 6
>> Transport-type: tcp
>> Bricks:
>> Brick1: gfs001:/bricks/b001/vmware
>> Brick2: gfs002:/bricks/b004/vmware
>> Brick3: gfs001:/bricks/b002/vmware
>> Brick4: gfs002:/bricks/b005/vmware
>> Brick5: gfs001:/bricks/b003/vmware
>> Brick6: gfs002:/bricks/b006/vmware
>> Options Reconfigured:
>> performance.strict-write-ordering: on
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> network.remote-dio: enable
>> performance.stat-prefetch: disable
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> cluster.eager-lock: enable
>> features.shard-block-size: 16MB
>> features.shard: on
>> performance.readdir-ahead: off
>>
>> On 03/12/2016 08:11 PM, David Gossage wrote:
>>
>> On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan <[email protected]>
>> wrote:
>>
>>> Both servers have HBAs, no RAID, and I can set up a replicated or
>>> dispersed volume without any issues.
>>> The logs are clean, and when I tried to migrate a VM and got the
>>> error, nothing showed up in the logs.
>>> I tried mounting the volume on my laptop and it mounted fine, but
>>> if I use dd to create a data file it just hangs and I can't cancel
>>> it, and I can't unmount it or anything; I just have to reboot.
>>> The same servers have another volume on other bricks in a
>>> distributed replica, which works fine.
>>> I have even tried the same setup in a virtual environment (created
>>> two VMs, installed Gluster, and created a replicated striped
>>> volume) and again the same thing, data corruption.
>>
>> I'd look through the mail archives for a topic called "Shard in
>> Production", I think. The shard portion may not be relevant, but it
>> does discuss certain settings that had to be applied to avoid
>> corruption with VMs. You may also want to try disabling
>> performance.readdir-ahead.
>>
>>> On 03/12/2016 07:02 PM, David Gossage wrote:
>>>
>>> On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan <[email protected]>
>>> wrote:
>>>
>>>> Thanks David,
>>>>
>>>> My settings are all defaults; I had just created the pool and
>>>> started it.
>>>> I have set the settings as you recommended and it seems to be the
>>>> same issue:
>>>>
>>>> Type: Striped-Replicate
>>>> Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
>>>> Status: Started
>>>> Number of Bricks: 1 x 2 x 2 = 4
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: gfs001:/bricks/t1/s
>>>> Brick2: gfs002:/bricks/t1/s
>>>> Brick3: gfs001:/bricks/t2/s
>>>> Brick4: gfs002:/bricks/t2/s
>>>> Options Reconfigured:
>>>> performance.stat-prefetch: off
>>>> network.remote-dio: on
>>>> cluster.eager-lock: enable
>>>> performance.io-cache: off
>>>> performance.read-ahead: off
>>>> performance.quick-read: off
>>>> performance.readdir-ahead: on
>>>
>>> Is there a RAID controller perhaps doing any caching?
>>>
>>> Are any errors reported in the Gluster logs during the migration
>>> process? Since the bricks aren't in use yet, have you tested making
>>> just mirrored bricks using different pairings of servers, two at a
>>> time, to see if the problem follows a certain machine or network
>>> port?
>>>
>>>> On 03/12/2016 03:25 PM, David Gossage wrote:
>>>>
>>>> On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan <[email protected]>
>>>> wrote:
>>>>
>>>>> Dears,
>>>>>
>>>>> I have created a replicated striped volume with two bricks and
>>>>> two servers, but I can't use it: when I mount it in ESXi and try
>>>>> to migrate a VM to it, the data gets corrupted.
>>>>> Does anyone have any idea why this is happening?
>>>>>
>>>>> Dell 2950 x2
>>>>> Seagate 15k 600GB
>>>>> CentOS 7.2
>>>>> Gluster 3.7.8
>>>>>
>>>>> Appreciate your help.
>>>>
>>>> Most reports of this I have seen end up being settings related.
>>>> Post your gluster volume info. Below are what I have seen as the
>>>> most commonly recommended settings.
>>>> I'd hazard a guess you may have the read-ahead cache or prefetch
>>>> on.
>>>>
>>>> quick-read=off
>>>> read-ahead=off
>>>> io-cache=off
>>>> stat-prefetch=off
>>>> eager-lock=enable
>>>> remote-dio=on
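Applying those to an existing volume is just a series of volume set
calls; something like the following should do it, assuming the volume
is named "vmware" to match the brick paths above:

  gluster volume set vmware performance.quick-read off
  gluster volume set vmware performance.read-ahead off
  gluster volume set vmware performance.io-cache off
  gluster volume set vmware performance.stat-prefetch off
  gluster volume set vmware cluster.eager-lock enable
  gluster volume set vmware network.remote-dio enable

Afterwards, "gluster volume info" should list them under Options
Reconfigured.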
>>>>> Mahdi Adnan
>>>>> System Admin
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> [email protected]
>>>>> http://www.gluster.org/mailman/listinfo/gluster-users

_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
