Re: [Lustre-discuss] [zfs-discuss] Problems getting Lustre started with ZFS
On Wed, Oct 23, 2013 at 05:46:41PM +0100, Andrew Holway wrote:
> Hello,
>
> I have hit a wall trying to get lustre started. I have followed this to
> some extent:
>
> http://zfsonlinux.org/lustre-configure-single.html
>
> If someone could give me some guidance on how to get these services
> started it would be much appreciated.
>
> I am running on CentOS 6.4 and am getting my packages from:
>
> http://archive.zfsonlinux.org/epel/6/SRPMS/
>
> Thanks,
>
> Andrew
>
> [root@lustre1 ~]# zfs get lustre:svname
> NAME              PROPERTY       VALUE       SOURCE
> lustre-mdt0       lustre:svname  -           -
> lustre-mdt0/mdt0  lustre:svname  lustre:MDT  local
> lustre-mgs        lustre:svname  -           -
> lustre-mgs/mgs    lustre:svname  MGS         local
>
> [root@lustre1 ~]# /etc/init.d/lustre
> anaconda-ks.cfg  .bash_profile  .cshrc  install.log.syslog  .ssh/  .viminfo
> .bash_logout  .bashrc  install.log  ks-post-anaconda.log  .tcshrc
>
> [root@lustre1 ~]# /etc/init.d/lustre start
>
> [root@lustre1 ~]# /etc/init.d/lustre start lustre-MDT
> lustre-MDT is not a valid lustre label on this node
>
> [root@lustre1 ~]# /etc/init.d/lustre start MGS
> MGS is not a valid lustre label on this node

You need to configure an /etc/ldev.conf file. See ldev.conf(5). Make sure the first field of each line matches the output of `uname -n` on that node.
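As a rough illustration of the advice above, the following sketch prints candidate /etc/ldev.conf lines for the two datasets shown in the `zfs get` output. The field order (local host, foreign host or `-`, label, device) and the `zfs:` device prefix are taken from ldev.conf(5) as I understand it; the helper name `ldev_line` is hypothetical, and the labels are simply the ones tried in the thread, so check both against the man page and your own targets before using them.

```shell
#!/bin/sh
# Print one ldev.conf-style line: local host, foreign host ("-" for none),
# label, and a ZFS dataset. ldev_line is a hypothetical helper, not part of Lustre.
ldev_line() {
    host=$1
    label=$2
    dataset=$3
    printf '%s - %s zfs:%s\n' "$host" "$label" "$dataset"
}

# The first field must match `uname -n` on the node that runs the service.
node=$(uname -n)

# Labels and datasets below are the ones that appear in the thread (assumed, not verified).
ldev_line "$node" MGS lustre-mgs/mgs
ldev_line "$node" lustre-MDT lustre-mdt0/mdt0
```

The output is meant to be reviewed and then placed in /etc/ldev.conf by hand; the script itself writes nothing.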
> I have configured three OSS's with a single OST:
>
> Andrews-MacBook-Air:~ andrew$ for i in {201..204}; do ssh root@192.168.0.$i "hostname; zfs get lustre:svname"; done
> lustre1.calthrop.com
> NAME              PROPERTY       VALUE       SOURCE
> lustre-mdt0       lustre:svname  -           -
> lustre-mdt0/mdt0  lustre:svname  lustre:MDT  local
> lustre-mgs        lustre:svname  -           -
> lustre-mgs/mgs    lustre:svname  MGS         local
> lustre2.calthrop.com
> NAME              PROPERTY       VALUE       SOURCE
> lustre-ost0       lustre:svname  -           -
> lustre-ost0/ost0  lustre:svname  lustre:OST  local
> lustre3.calthrop.com
> NAME              PROPERTY       VALUE       SOURCE
> lustre-ost0       lustre:svname  -           -
> lustre-ost0/ost0  lustre:svname  lustre:OST  local

You need to use unique index numbers for each OST, e.g. OST, OST1, etc.

Ned
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
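To make the point about unique indexes concrete, here is a minimal dry-run sketch that prints (and does not execute) one mkfs.lustre command per OSS, each with a different --index. The flags are the ones used elsewhere in these threads; the helper name `print_ost_commands`, the `@tcp` MGS NID, and the host name lustre4 are assumptions for illustration only.

```shell
#!/bin/sh
# Dry run: print one mkfs.lustre command per OSS host, incrementing --index each time.
# print_ost_commands is a hypothetical helper; nothing is formatted by this script.
print_ost_commands() {
    mgsnid=$1
    shift
    index=0
    for host in "$@"; do
        printf '%s: mkfs.lustre --fsname=lustre --ost --backfstype=zfs --index=%d --mgsnode=%s lustre-ost0/ost0\n' \
            "$host" "$index" "$mgsnid"
        index=$((index + 1))
    done
}

# The MGS NID and the third host name are placeholders, not taken from the thread.
print_ost_commands 192.168.0.201@tcp lustre2 lustre3 lustre4
```

Printing the commands first makes it easy to confirm that no two OSTs in the same filesystem share an index before anything is formatted.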
Re: [Lustre-discuss] ldiskfs for MDT and zfs for OSTs?
On Tue, Oct 08, 2013 at 11:40:30AM -0400, Anjana Kar wrote:
> The git checkout was on Sep. 20. Was the patch before or after?

The bug was introduced on Sep. 10 and reverted on Sep. 24, so you hit the lucky window. :)

> The zpool create command successfully creates a raidz2 pool, and
> mkfs.lustre does not complain, but

The pool you created with zpool create was just for testing. I would recommend destroying that pool, rebuilding your lustre packages from the latest master (or better yet, a stable tag such as v2_4_1_0), and starting over with your original mkfs.lustre command. This would ensure that your pool is properly configured for use with lustre.

If you'd prefer to keep this pool, you should set canmount=off on the root dataset, as mkfs.lustre would have done:

    zfs set canmount=off lustre-ost0

> [root@cajal kar]# zpool list
> NAME         SIZE   ALLOC  FREE   CAP  DEDUP  HEALTH  ALTROOT
> lustre-ost0  36.2T  2.24M  36.2T  0%   1.00x  ONLINE  -
>
> [root@cajal kar]# /usr/sbin/mkfs.lustre --fsname=cajalfs --ost --backfstype=zfs --index=0 --mgsnode=10.10.101.171@o2ib lustre-ost0

This command seems to be missing the dataset name, i.e. lustre-ost0/ost0

> [root@cajal kar]# /sbin/service lustre start lustre-ost0
> lustre-ost0 is not a valid lustre label on this node

As mentioned elsewhere, this looks like an ldev.conf configuration error.

Ned
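The missing dataset name pointed out above is easy to catch before running mkfs.lustre. The sketch below is a hypothetical pre-flight check, not part of Lustre or ZFS: it only tests whether a target string has the pool/dataset shape and does not contact ZFS at all.

```shell
#!/bin/sh
# Succeed if the argument looks like pool/dataset rather than a bare pool name.
# has_dataset_name is a hypothetical helper; it inspects the string only.
has_dataset_name() {
    case $1 in
        ?*/?*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the two targets that appear in the thread.
for target in lustre-ost0 lustre-ost0/ost0; do
    if has_dataset_name "$target"; then
        echo "$target: has a dataset name"
    else
        echo "$target: bare pool name, dataset missing"
    fi
done
```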
Re: [Lustre-discuss] ldiskfs for MDT and zfs for OSTs?
On Mon, Oct 07, 2013 at 02:23:32PM -0400, Anjana Kar wrote:
> Here is the exact command used to create a raidz2 pool with 8+2 drives,
> followed by the error messages:
>
> mkfs.lustre --fsname=cajalfs --reformat --ost --backfstype=zfs --index=0 --mgsnode=10.10.101.171@o2ib lustre-ost0/ost0 raidz2 /dev/sda /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sdo /dev/sdq /dev/sds
>
> mkfs.lustre FATAL: Invalid filesystem name /dev/sds

It seems that either the version of mkfs.lustre you are using has a parsing bug, or there was some sort of syntax error in the actual command entered. If you are certain your command line is free from errors, please post the version of lustre you are using, or report the bug in the Lustre issue tracker.

Thanks,
Ned
Re: [Lustre-discuss] ldiskfs for MDT and zfs for OSTs?
I'm guessing your git checkout doesn't include this commit:

* 010a78e Revert LU-3682 tunefs: prevent tunefs running on a mounted device

It looks like the LU-3682 patch introduced a bug that could cause your issue, so it was reverted in the latest master.

Ned

On Mon, Oct 07, 2013 at 04:54:13PM -0400, Anjana Kar wrote:
> On 10/07/2013 04:27 PM, Ned Bass wrote:
> > On Mon, Oct 07, 2013 at 02:23:32PM -0400, Anjana Kar wrote:
> > > Here is the exact command used to create a raidz2 pool with 8+2
> > > drives, followed by the error messages:
> > >
> > > mkfs.lustre --fsname=cajalfs --reformat --ost --backfstype=zfs --index=0 --mgsnode=10.10.101.171@o2ib lustre-ost0/ost0 raidz2 /dev/sda /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sdo /dev/sdq /dev/sds
> > >
> > > mkfs.lustre FATAL: Invalid filesystem name /dev/sds
> >
> > It seems that either the version of mkfs.lustre you are using has a
> > parsing bug, or there was some sort of syntax error in the actual
> > command entered. If you are certain your command line is free from
> > errors, please post the version of lustre you are using, or report
> > the bug in the Lustre issue tracker.
> >
> > Thanks,
> > Ned
>
> For building this server, I followed steps from the walk-thru-build* for
> CentOS 6.4, and added --with-spl and --with-zfs when configuring lustre..
>
> *https://wiki.hpdd.intel.com/pages/viewpage.action?pageId=8126821
>
> spl and zfs modules were installed from source for the lustre 2.4 kernel
> 2.6.32.358.18.1.el6_lustre2.4
>
> Device sds appears to be valid, but I will try issuing the command using
> by-path names..
>
> -Anjana
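One way to test the guess at the top of this message is to ask git whether the revert commit is an ancestor of the checked-out HEAD. The sketch below wraps `git merge-base --is-ancestor` in a hypothetical helper named `checkout_has_commit`; the repository path in the usage comment is a placeholder, not a path from the thread.

```shell
#!/bin/sh
# Succeed if commit $2 is an ancestor of HEAD in the repository at $1.
# A non-zero status means "not an ancestor" or "commit unknown to this clone".
# checkout_has_commit is a hypothetical helper name.
checkout_has_commit() {
    git -C "$1" merge-base --is-ancestor "$2" HEAD
}

# Example usage (the path is a placeholder for your own Lustre checkout):
#   checkout_has_commit "$HOME/lustre-release" 010a78e && echo "revert is included"
```

If the helper fails, updating the checkout to a newer master or to a stable tag and rebuilding is the path suggested in the thread.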
Re: [Lustre-discuss] Can't install lustre-tests
On Mon, Jul 22, 2013 at 09:20:30AM -0700, Prakash Surya wrote:
> Interesting.. Any chance somebody can provide an example URL?

If you browse to Status -> Build Artifacts for a Jenkins build, there is a link to download all files as a zip file. There is a repodata directory there, so I suspect the unzipped archive could be used as a local yum repo.

However, I get a permission denied error, or sometimes an authentication popup, when I click the zip file link. I opened a Jira issue to report the permission problem.

https://jira.hpdd.intel.com/browse/LU-3615

Ned
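If the unzipped archive does work as a local yum repo, as suspected above, the repo definition could look roughly like the output of this sketch. It is untested against real Jenkins artifacts; the helper name `local_repo_stanza`, the repo id `lustre-local`, and the directory path are all placeholders.

```shell
#!/bin/sh
# Print a yum .repo stanza for a local directory that already contains repodata/.
# local_repo_stanza and the repo id "lustre-local" are hypothetical names.
local_repo_stanza() {
    dir=$1
    printf '[lustre-local]\n'
    printf 'name=Local Lustre build artifacts\n'
    printf 'baseurl=file://%s\n' "$dir"
    printf 'enabled=1\n'
    printf 'gpgcheck=0\n'
}

# Placeholder path: the directory where the zip file was unpacked.
local_repo_stanza /tmp/lustre-artifacts
```

The printed stanza would go into a file under /etc/yum.repos.d/ once the directory layout has been checked by hand.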
Re: [Lustre-discuss] /etc/init.d/lnet
Hi Brian,

On Sun, Jun 30, 2013 at 05:37:42PM +0000, Andrus, Brian Contractor wrote:
> All,
>
> I am finding that on reboot, my client systems hang, requiring a hard
> reset, because of the lnet service. It doesn't unload all the modules in
> the proper order for me.
> I get errors like "module osc has non-zero count".
> After some hunting, I see that lov needs to be unloaded before osc.
> I also find that fid and fld need to be unloaded before ptlrpc,
> and ofd needs to be unloaded before ost.
> Sometimes others as well.
>
> Is there an updated lnet script available that is more complete?
> For the time being, I have been modifying the stock one to include the
> 'missing' modules for processing.

There is a patch to improve the module unloading behavior here, but it has not yet landed:

http://review.whamcloud.com/5478/

It would be valuable to know if the proposed changes fix your issue.

Ned
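The ordering constraints reported in the quoted message can be written down as a simple dry run. The sketch below only prints rmmod commands in one order that satisfies them (lov before osc, ofd before ost, fid and fld before ptlrpc); the helper name `print_unload_plan` is hypothetical, and the list is not a complete Lustre module dependency chain, since the poster mentions that other modules are sometimes involved.

```shell
#!/bin/sh
# One unload order that satisfies the constraints reported in the thread:
# lov before osc, ofd before ost, fid and fld before ptlrpc.
unload_order="lov osc ofd ost fid fld ptlrpc"

# Dry run: print the rmmod commands instead of executing them.
# print_unload_plan is a hypothetical helper name.
print_unload_plan() {
    for mod in $unload_order; do
        echo "rmmod $mod"
    done
}

print_unload_plan
```

On a live system, the "Used by" column of lsmod shows which modules still hold a reference, which helps extend the list when an unload fails with a non-zero count.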
Re: [Lustre-discuss] [wc-discuss] Seeking contributors for Lustre User Manual
On Thu, Nov 15, 2012 at 12:12:40AM +0000, Dilger, Andreas wrote:
> I would prefer to see a fix immediately rather than someone filing a
> ticket to describe the fix, since the documentation fix should be
> self-describing. However, if there is a problem that isn't immediately
> resolved then a Jira ticket should be submitted in order to track the
> defect and allow assigning the work to someone.

LUDOC-11 seems to be a catch-all issue for submitting fixes to minor problems like typos. However it sounds like you're saying we can bypass Jira altogether for such patches. That would be nice; linking to a ticket with no useful content doesn't serve any purpose that I can see.

The "Making changes to the Lustre Manual source" article currently instructs the reader to file an LUDOC bug for change tracking in Jira as the first step. To avoid discouraging submission of minor fixes, perhaps a more lightweight process for that case should be covered first. In particular, say either that minor fixes should reference LUDOC-11 in the summary, or just omit the Jira reference altogether, whichever is appropriate.

Ned
Re: [Lustre-discuss] [wc-discuss] Seeking contributors for Lustre User Manual
On Sat, Nov 10, 2012 at 12:09:09AM +0000, Dilger, Andreas wrote:
> The manual source is hosted in a Git/Gerrit repository in Docbook XML
> format and can be downloaded at:
>
> git clone http://git.whamcloud.com/doc/manual lustre-manual

That doesn't work for me:

    % git clone http://review.whamcloud.com/doc/manual lustre-manual
    Cloning into lustre-manual...
    fatal: http://review.whamcloud.com/doc/manual/info/refs not found: did you run git update-server-info on the server?

I've only been able to check out the manual source directly from gerrit:

    git clone http://review.whamcloud.com/p/doc/manual lustre-manual

or,

    git clone ssh://nedb...@review.whamcloud.com:29418/doc/manual lustre-manual

Ned
Re: [Lustre-discuss] [wc-discuss] Seeking contributors for Lustre User Manual
On Tue, Nov 13, 2012 at 11:48:35AM -0800, Nathan Rutman wrote:
> Would it be easier to move the manual back to a Wiki? The low hassle
> factor of wikis has always been a draw for contribution. The openSFS
> site is up and running with MediaWiki now (wiki.opensfs.org).

Easier? Yes, probably. Better? I personally don't think so.

Wikis are great collaboration tools for informally sharing information, but I don't think the paradigm scales well for documents of this size and complexity. And a wiki isn't the right tool for producing a formal professional-quality document, which is what I think the Lustre manual should strive to be.

True, we would lower the bar for contributions, but for that we would sacrifice the following features that I consider essential:

- Ability to export to multiple formats (pdf, html, epub) from one source
- Consistency of formatting and navigation elements
- A review process for proposed changes that assures a high standard of quality

However, there are some short articles that probably do belong in the wiki and could be poached from the manual, e.g. installation and configuration procedures.

Ned