Yes, DRBD will mirror the contents of block devices between hosts, either synchronously or asynchronously, so it gives us data redundancy across hosts. Perhaps we should use ZFS + DRBD for the MDT and OSTs?
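
As a rough sketch of what we have in mind (hostnames, IP addresses, ports, and devices below are only placeholders, not our real values), each OST disk would get one DRBD resource, e.g.:

    resource ost12 {
        net {
            protocol C;    # synchronous: a write completes only after
                           # both hosts have it on stable storage
        }
        on node22 {
            device    /dev/drbd12;
            disk      /dev/sdc1;          # backing disk for OST index 12
            address   192.0.2.22:7789;
            meta-disk internal;
        }
        on node23 {
            device    /dev/drbd12;
            disk      /dev/sdc1;
            address   192.0.2.23:7789;
            meta-disk internal;
        }
    }

mkfs.lustre would then be run on /dev/drbd12 instead of /dev/sdc1, and only the DRBD Primary node mounts the target.
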
Thanks
Yu

Patrick Farrell <[email protected]> wrote on Wed, Jun 27, 2018 at 9:28 PM:

> I'm a little puzzled - it can switch, but isn't the data on the failed
> disk lost...? That's why Andreas is suggesting RAID. Or is DRBD doing
> syncing of the disk? That seems like a really expensive way to get
> redundancy, since it would have to be full online mirroring, with all the
> costs in hardware and resource usage that implies...?
>
> ZFS is not a requirement; it generally performs a bit worse than ldiskfs
> but makes up for that with impressive features to improve data integrity
> and related things. Since it sounds like that's not a huge concern for
> you, I would stick with ldiskfs. It will likely be a little faster and is
> easier to set up.
>
> ------------------------------
> *From:* lustre-discuss <[email protected]> on
> behalf of yu sun <[email protected]>
> *Sent:* Wednesday, June 27, 2018 8:21:43 AM
> *To:* [email protected]
> *Cc:* [email protected]
> *Subject:* Re: [lustre-discuss] lctl ping node28@o2ib report Input/output
> error
>
> Yes, you are right - thanks for your great suggestions.
>
> We are currently using GlusterFS to store training data for ML, and we
> have started investigating Lustre as a replacement for GlusterFS, for
> performance reasons.
>
> First: yes, we do want maximum performance. Do you mean we should put
> each OST/MDT on a whole device rather than on a separate partition (with
> ZFS, for example) to get better performance?
>
> Second: we don't use any underlying RAID devices, and we do configure
> each OST on a separate disk. Since Lustre does not provide disk-level
> data redundancy, we use DRBD + Pacemaker + Corosync for data redundancy
> and HA - you can see we passed --servicenode to mkfs.lustre. I don't know
> how reliable this solution is, but it seems OK in our current tests: when
> one disk fails, Pacemaker fails over to the mirrored OST on the other
> machine automatically.
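>
> As a rough sketch of our Pacemaker setup (resource names, devices, and
> mount points here are illustrative, not our exact configuration), each
> OST is a Filesystem resource on top of its DRBD device:
>
>     pcs resource create ost12-fs ocf:heartbeat:Filesystem \
>         device=/dev/drbd12 directory=/lustre/ost12 fstype=lustre \
>         op monitor interval=30s
>
> The Filesystem agent mounts the OST with "mount -t lustre" on whichever
> node Pacemaker places it, and the --servicenode entries give clients both
> NIDs so they can follow a failover. (The real configuration also needs
> colocation/ordering constraints so the mount only runs on the DRBD
> Primary.)
>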
> We also wanted to use ZFS, and I tested a ZFS mirror. However, if the
> physical machine goes down, the data on that machine is lost, so we
> decided to use the solution described above.
>
> We are still testing, and any suggestions are appreciated 😆.
> Thanks, Andreas.
>
> Yours,
> Yu
>
> Andreas Dilger <[email protected]> wrote on Wed, Jun 27, 2018 at 7:07 PM:
>
> > On Jun 27, 2018, at 09:12, yu sun <[email protected]> wrote:
> >
> > client:
> > [email protected]:~$ mount -t lustre node28@o2ib1:node29@o2ib1:/project /mnt/lustre_data
> > mount.lustre: mount node28@o2ib1:node29@o2ib1:/project at /mnt/lustre_data failed: Input/output error
>
> Is the MGS running?
>
> > [email protected]:~$ lctl ping node28@o2ib1
> > failed to ping 10.82.143.202@o2ib1: Input/output error
> > [email protected]:~$
> >
> > mgs and mds:
> > mkfs.lustre --mgs --reformat --servicenode=node28@o2ib1 --servicenode=node29@o2ib1 /dev/sdb1
> > mkfs.lustre --fsname=project --mdt --index=0 --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode node28@o2ib1 --servicenode node29@o2ib1 --reformat --backfstype=ldiskfs /dev/sdc1
>
> Separate from the LNet issues, it is probably worthwhile to point out
> some issues with your configuration. You shouldn't use partitions on the
> OST and MDT devices if you want to get maximum performance. Partitioning
> can offset all of the filesystem IO from the RAID/sector alignment and
> hurt performance.
>
> Secondly, it isn't clear whether you are using underlying RAID devices,
> or configuring each OST on a separate disk. It looks like the latter -
> that you are making each disk a separate OST. That isn't a good idea for
> Lustre, since it does not (yet) have any redundancy at higher layers, and
> any disk failure would result in data loss. You currently need to have
> RAID-5/6 or ZFS for each OST/MDT, unless this is really a "scratch"
> filesystem where you don't care if the data is lost and reformatting the
> filesystem is OK (i.e. low cost is the primary goal, which is fine also,
> but not very common).
>
> We are working on Lustre-level data redundancy, and there is some support
> for this in the 2.11 release, but it is not yet in a state where you
> could reliably use it to mirror all of the files in the filesystem.
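>
> In the meantime, disk-level redundancy would mean something like a
> mirrored ZFS OST (pool and device names below are only examples, not
> taken from your setup):
>
>     mkfs.lustre --fsname=project --ost --index=12 \
>         --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 \
>         --backfstype=zfs ost12pool/ost12 mirror /dev/sdc /dev/sdd
>
> That protects against a single disk failure within a server, though not
> against the loss of the whole server.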
>
> Cheers, Andreas
>
> > ost:
> > ml-storage-ser22.nmg01:
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=12 /dev/sdc1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=13 /dev/sdd1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=14 /dev/sde1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=15 /dev/sdf1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=16 /dev/sdg1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=17 /dev/sdh1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=18 /dev/sdi1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=19 /dev/sdj1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=20 /dev/sdk1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=21 /dev/sdl1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=22 /dev/sdm1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node22@o2ib1 --servicenode=node23@o2ib1 --ost --index=23 /dev/sdn1
> > ml-storage-ser26.nmg01:
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=36 /dev/sdc1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=37 /dev/sdd1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=38 /dev/sde1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=39 /dev/sdf1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=40 /dev/sdg1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=41 /dev/sdh1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=42 /dev/sdi1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=43 /dev/sdj1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=44 /dev/sdk1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=45 /dev/sdl1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=46 /dev/sdm1
> > mkfs.lustre --fsname=project --reformat --mgsnode=node28@o2ib1 --mgsnode=node29@o2ib1 --servicenode=node26@o2ib1 --servicenode=node27@o2ib1 --ost --index=47 /dev/sdn1
> >
> > Thanks
> > Yu
> >
> > Mohr Jr, Richard Frank (Rick Mohr) <[email protected]> wrote on Wed, Jun 27, 2018 at 1:25 PM:
> >
> > > On Jun 27, 2018, at 12:52 AM, yu sun <[email protected]> wrote:
> > >
> > > > I have created the file /etc/modprobe.d/lustre.conf with this
> > > > content on all MDT, OST, and client nodes:
> > > >
> > > >     [email protected]:~$ cat /etc/modprobe.d/lustre.conf
> > > >     options lnet networks="o2ib1(eth3.2)"
> > > >
> > > > and I ran "lnetctl lnet configure --all" to make my static LNet
> > > > configuration take effect, but I still can't ping node28 from my
> > > > client ml-gpu-ser200.nmg01. I can mount as well as access Lustre on
> > > > client ml-gpu-ser200.nmg01.
> > >
> > > What options did you use when mounting the file system?
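> > >
> > > It may also be worth running:
> > >
> > >     lnetctl net show
> > >
> > > on the client to confirm that the o2ib1 network and NID were actually
> > > configured. If o2ib1 is missing from that output, the module options
> > > never took effect, which would explain the failed ping.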
> > >
> > > --
> > > Rick Mohr
> > > Senior HPC System Administrator
> > > National Institute for Computational Sciences
> > > http://www.nics.tennessee.edu
> >
> > _______________________________________________
> > lustre-discuss mailing list
> > [email protected]
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
> Cheers, Andreas
> ---
> Andreas Dilger
> Principal Lustre Architect
> Whamcloud

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
