Re: [Lustre-discuss] Xyratex News Regarding Lustre - Press Release
Congrats!

On Feb 20, 2013 7:11 AM, Kevin Canady kevin_can...@xyratex.com wrote:

Greetings Community! Today we are very excited to announce that Xyratex has purchased Lustre® and its assets from Oracle. We intend for Lustre to remain an open-source, community-driven file system to be promoted by our community organizations. We undertook the acquisition because we realize its importance to the entire community and we want to help ensure that it will continue to deliver for all of us over the long term. This is critically important to the growth and vitality of Lustre; it's how it became what it is today, and it's how it will deliver the most value in the future.

Several members of the Lustre community have endorsed these plans, voiced their support of our purchase, and contributed quotes which were included in our announcement today. Special thanks to our two community leaders, Hugo Falter with EOFS and Norm Morse with OpenSFS, for their support to date and into the future as we work to further the collaboration around Lustre.

Expect to hear more soon as we head into the 2013 Lustre User Group meeting in San Diego: http://www.opensfs.org/events/lug13/ It should be an exciting event!

Best regards, Kevin P.

Kevin Canady
Director, Business Development, Lustre and HPC Services
kevin_can...@xyratex.com
O: 510-687-5475 C: 415-505-7701

Xyratex Advances Lustre® Initiative, Assumes Ownership of Related Assets
Xyratex plans to offer Lustre community and ClusterStor™ users significant value

Havant, UK – Feb. 19, 2013 – Xyratex Ltd (Nasdaq: XRTX), a leading provider of data storage technology, today announced it plans to advance the global Lustre® portfolio by supporting the community-oriented development of Lustre as an open source file system and continuing to work in conjunction with the broader community to help chart the best path forward for this key technology. Xyratex has recently acquired the original Lustre trademark, logo, website and associated intellectual property from Oracle, and will assume responsibility for providing support to Lustre customers going forward.

"Lustre is a powerful open source file system, and Xyratex strongly believes that all members of the Lustre community need to continue to play a part in the evolution of the code and the benefits it delivers over the long term," said Steve Barber, CEO of Xyratex. "We want to ensure that current Lustre customers get the best possible feature roadmap and support, and we intend to engage the entire community to advance the Lustre technology. We also appreciate Oracle's support of Lustre, and their efforts to ensure the long-term success of the technology."

The Lustre file system, first released in 2003, is a client/server-based, distributed architecture designed for large-scale, compute- and I/O-intensive, performance-sensitive applications. The Lustre architecture currently powers six of the top 10 high-performance computing (HPC) clusters in the world and more than 60 of the 100 largest HPC installations. It has emerged as a particularly popular choice in the meteorology, simulation, oil and gas, life science, rich media and finance sectors. This purchase also gives Xyratex the opportunity to continue to leverage Lustre and provide more value through its best-of-breed ClusterStor™ family of scale-out HPC data storage solutions.
ClusterStor delivers a new standard in file system performance, scalability and efficiency, and brings together what were previously discrete server, network and storage platforms with their own separate software layers. The results are integrated, modular, scale-out storage building blocks that enable systems to scale both performance and capacity while aggressively reducing space, power and administrative overhead.

"Cray has been using Lustre as our primary parallel file system for the past 10 years, and has deployed some of the largest and most successful Lustre installations in the world with a variety of storage products," said Barry Bolding, vice president of storage and data management at Cray. "We have recently worked with Xyratex to deploy successful Lustre installations in the government, energy, manufacturing and academic markets with the Cray Sonexion storage system, including the record-breaking NCSA installation running Lustre at over 1TB/sec. This announcement is another important step for Lustre and the OpenSFS community, and shows the promising future of the Lustre file system in supercomputing and Big Data."

"Xyratex' deep knowledge of Lustre, and ability to deploy and support it, has been critical in helping NCSA bring the Blue Waters system into production and making a new class of computational and data-focused petascale system usable for our scientific and engineering teams," said Dr. William Kramer, Blue Waters Deputy Director at the University of Illinois' National Center for Supercomputing Applications, whose Blue Waters supercomputer is amongst the fastest and
Re: [Lustre-discuss] [wc-discuss] Re: Lustre 2.2 production experience
We have deployed 2.1.1 for several clusters, each of which has hundreds of nodes.

On Jun 10, 2012 6:05 AM, Wojciech Turek wj...@cam.ac.uk wrote:

Thanks for a quick reply, Andreas. I slightly misunderstood the Lustre release process and thought that the next stable/production version was 2.2. I am then interested in the experience of people running Lustre 2.1. Cheers, Wojciech

On 9 June 2012 21:52, Andreas Dilger adil...@whamcloud.com wrote:

I think you'll find that there are not yet (m)any production deployments of 2.2. There are a number of production 2.1 deployments, and this is the current maintenance stream from Whamcloud. Cheers, Andreas

On 2012-06-09, at 14:33, Wojciech Turek wj...@cam.ac.uk wrote:

I am building a 1.5PB storage system which will employ Lustre as the main file system. The storage system will be extended at a later stage beyond 2PB. I am considering using Lustre 2.2 for the production environment. This Lustre storage system will replace our older 300TB system, which is currently running Lustre 1.8.8. I am quite happy with Lustre 1.8.8; however, for the new system Lustre 2.2 seems to be a better match. The storage system will be attached to a university-wide cluster (800 nodes), hence there will be quite a large range of applications using the filesystem. Could people with production deployments of Lustre 2.2 share their experience please? -- Wojciech Turek
Re: [Lustre-discuss] OSS1 Node issue
I have checked your logs. There are probably several OSTs on your oss1, and at least one of them is read-only; this has nothing to do with permissions. Running e2fsck on your OST device is recommended to resolve the rc = -30 problem.

On Tue, Feb 21, 2012 at 4:00 PM, VIJESH EK ekvij...@gmail.com wrote:

Dear Sir, Thanks for your immediate response... I have checked the OST permissions; it is in read-write mode, and no hard disk has failed in the storage console - all are in online working status. Herewith I have attached the detailed log information; kindly go through the logs and get back to me. Thanks & Regards, VIJESH

On Tue, Feb 21, 2012 at 1:13 PM, Larry tsr...@gmail.com wrote:

Hi, Your OST has become read-only; that's the reason. Generally, this is related to your hardware - for example, your storage is broken, or your ldiskfs file system is broken. You'd better check your storage and e2fsck the OST.

On Tue, Feb 21, 2012 at 2:52 PM, VIJESH EK ekvij...@gmail.com wrote:

Dear All, We have made the following changes on the exec nodes, but we are still getting the same errors in /var/log/messages.

1. We have changed the exec nodes' spool directory to a local directory by editing the file /home/appl/sge-root/default/common/configuration and changing the parameter execd_spool_dir. After changing this, the same error (below) is still appearing on the OSS1 node. This error is generated only on the OSS1 node.

Feb 6 18:32:10 oss1 kernel: LustreError: 9362:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:05 oss1 kernel: LustreError: 9422:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:06 oss1 kernel: LustreError: 9432:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:07 oss1 kernel: LustreError: 9369:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 6 18:32:10 oss1 kernel: LustreError: 9362:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30

Can you tell me how to change the master spool directory? Is it possible to change the directory in live mode? Kindly explain briefly, so that we can proceed to the next step. Thanks and Regards, VIJESH

On Fri, Feb 10, 2012 at 1:19 PM, Carlos Thomaz ctho...@ddn.com wrote:

Hi Vijesh. Are you running the SGE master spooling on Lustre?!?! What about the exec nodes' spooling?! I strongly recommend that you do not run the master spooling on Lustre, and if possible use local spooling on local disk for the exec nodes. SGE (at least until version 6.2u7) is known to get unstable when running the spooling on Lustre. Carlos

On Feb 10, 2012, at 1:18 AM, VIJESH EK ekvij...@gmail.com wrote:

Dear All, Kindly provide a solution for the issue below... Thanks & Regards, VIJESH E K

On Thu, Feb 9, 2012 at 3:26 PM, VIJESH EK ekvij...@gmail.com wrote:

Dear Sir, I am getting the error messages below continuously on the OSS1 node; it causes the SGE service to stop running intermittently...
Feb 5 04:03:37 oss1 kernel: LustreError: 9193:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:47 oss1 kernel: LustreError: 9164:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:47 oss1 kernel: LustreError: 28420:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:48 oss1 kernel: LustreError: 9266:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:50 oss1 kernel: LustreError: 9200:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:53 oss1 kernel: LustreError: 9230:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:03:57 oss1 kernel: LustreError: 9212:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:03 oss1 kernel: LustreError: 9262:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:08 oss1 kernel: LustreError: 9162:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:15 oss1 kernel: LustreError: 9271:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:23 oss1 kernel: LustreError: 9191:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30
Feb 5 04:04:32 oss1 kernel: LustreError: 9242:0:(filter_io_26.c:693:filter_commitrw_write()) error starting transaction: rc = -30

I have attached the detailed log information.
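A minimal sketch of the check both replies point at, assuming the affected OST is mounted at /mnt/ost1 on /dev/sdb (both names hypothetical). rc = -30 is -EROFS, i.e. the backing filesystem has gone read-only:

    # On oss1: look for an OST whose backing filesystem went read-only
    grep lustre /proc/mounts            # an "ro" flag marks the affected OST
    umount /mnt/ost1                    # take the OST out of service
    e2fsck -fy /dev/sdb                 # use the Lustre-patched e2fsprogs
    mount -t lustre /dev/sdb /mnt/ost1  # return it to service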
[Lustre-discuss] problem in lustre lnet routing
Hi all, I have a problem in setting up LNet routing. The MDS and OSSes have IB and GigE networks: 30.9.100.* for IB and 20.9.100.* for GigE. Most of the clients have IB, too, but a few of them do not, so I chose one client as an LNet router. Below are the configurations:

On the MDS and OSSes (IB: 30.9.100.*, GigE: 20.9.100.*), modprobe.conf:
options lnet networks=o2ib0(ib0) routes=tcp0 30.9.0.5@o2ib0

On the router (IB: 30.9.0.5, GigE: 20.9.0.5), modprobe.conf:
options lnet networks=o2ib0(ib0),tcp0(eth1) forwarding=enabled

On the GigE client (GigE: 20.9.0.2), modprobe.conf:
options lnet networks=tcp0(eth1) routes=o2ib0 20.9.0.5@tcp0

After LNet is configured, the client can lctl ping the MDS and every OSS. For example:

client:~ # lctl ping 30.9.100.31@o2ib
12345-0@lo
12345-30.9.100.31@o2ib

where 30.9.100.31 is the MDS. But mount -t lustre 30.9.100.31@o2ib0:30.9.100.32@o2ib0:/fnfs /mnt failed; the log says:

Nov 24 10:36:37 cn-fn02 kernel: [502743.285050] Lustre: OBD class driver, http://wiki.whamcloud.com/
Nov 24 10:36:37 cn-fn02 kernel: [502743.285056] Lustre: Lustre Version: 2.1.0
Nov 24 10:36:37 cn-fn02 kernel: [502743.285060] Lustre: Build Version: RC2-g9d71fe8-PRISTINE-2.6.32.12-0.7-default
Nov 24 10:36:37 cn-fn02 kernel: [502743.287057] Lustre: Lustre LU module (a17f6d00).
Nov 24 10:36:37 cn-fn02 kernel: [502743.358095] Lustre: Added LNI 20.9.0.2@tcp [8/256/0/180]
Nov 24 10:36:37 cn-fn02 kernel: [502743.358153] Lustre: Accept secure, port 988
Nov 24 10:36:37 cn-fn02 kernel: [502743.423409] Lustre: Lustre OSC module (a1a9b800).
Nov 24 10:36:37 cn-fn02 kernel: [502743.438668] Lustre: Lustre LOV module (a1b09500).
Nov 24 10:36:37 cn-fn02 kernel: [502743.460108] Lustre: Lustre client module (a1ba9a40).
Nov 24 10:36:37 cn-fn02 kernel: [502743.480266] Lustre: 4329:0:(sec.c:1474:sptlrpc_import_sec_adapt()) import MGC30.9.100.31@o2ib-MGC30.9.100.31@o2ib_0 netid 2: select flavor null
Nov 24 10:36:37 cn-fn02 kernel: [502743.485938] Lustre: MGC30.9.100.31@o2ib: Reactivating import
Nov 24 10:36:37 cn-fn02 kernel: [502743.517528] Lustre: 4329:0:(sec.c:1474:sptlrpc_import_sec_adapt()) import fnfs-MDT0000-mdc-8801b79afc00-30.9.100.31@o2ib netid 2: select flavor null
Nov 24 10:36:42 cn-fn02 kernel: [502748.508709] Lustre: 4401:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1386324633321488 sent from fnfs-MDT0000-mdc-8801b79afc00 to NID 20.9.100.31@tcp has timed out for sent delay: [sent 1322102197] [real_sent 0] [current 1322102202] [deadline 5s] [delay 0s] req@88019c603c00 x1386324633321488/t0(0) o-1->fnfs-MDT0000_UUID@30.9.100.31@o2ib:12/10 lens 368/512 e 0 to 1 dl 1322102202 ref 2 fl Rpc:XN//ffff rc 0/-1
Nov 24 10:37:07 cn-fn02 kernel: [502773.472069] Lustre: 4401:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1386324633321491 sent from fnfs-MDT0000-mdc-8801b79afc00 to NID 30.9.100.32@o2ib has timed out for slow reply: [sent 132210] [real_sent 132210] [current 1322102227] [deadline 5s] [delay 0s] req@88019b092400 x1386324633321491/t0(0) o-1->fnfs-MDT0000_UUID@30.9.100.32@o2ib:12/10 lens 368/512 e 0 to 1 dl 1322102227 ref 1 fl Rpc:XN/ffff/ rc 0/-1
Nov 24 10:37:27 cn-fn02 kernel: [502793.442762] Lustre: 4402:0:(import.c:526:import_select_connection()) fnfs-MDT0000-mdc-8801b79afc00: tried all connections, increasing latency to 5s
Nov 24 10:37:27 cn-fn02 kernel: [502793.442802] Lustre: 4401:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1386324633321493 sent from fnfs-MDT0000-mdc-8801b79afc00 to NID 20.9.100.31@tcp has failed due to network error: [sent 1322102247] [real_sent 1322102247] [current 1322102247] [deadline 10s] [delay -10s] req@8801b68ebc00 x1386324633321493/t0(0) o-1->fnfs-MDT0000_UUID@30.9.100.31@o2ib:12/10 lens 368/512 e 0 to 1 dl 1322102257 ref 1 fl Rpc:XN/ / rc 0/-1
Nov 24 10:38:02 cn-fn02 kernel: [502828.392144] Lustre: 4401:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1386324633321495 sent from fnfs-MDT0000-mdc-8801b79afc00 to NID 30.9.100.32@o2ib has timed out for slow reply: [sent 1322102272] [real_sent 1322102272] [current 1322102282] [deadline 10s] [delay 0s] req@88019c603c00 x1386324633321495/t0(0) o-1->fnfs-MDT0000_UUID@30.9.100.32@o2ib:12/10 lens 368/512 e 0 to 1 dl 1322102282 ref 1 fl Rpc:XN/ / rc 0/-1
Nov 24 10:38:17 cn-fn02 kernel: [502843.369501] Lustre: 4402:0:(import.c:526:import_select_connection()) fnfs-MDT0000-mdc-8801b79afc00: tried all connections, increasing latency to 10s
Nov 24 10:38:17 cn-fn02 kernel: [502843.369561] Lustre: 4401:0:(client.c:1778:ptlrpc_expire_one_request()) @@@ Request x1386324633321497 sent from fnfs-MDT0000-mdc-8801b79afc00 to NID 20.9.100.31@tcp has failed due to network error: [sent 1322102297] [real_sent 1322102297] [current 1322102297] [deadline 15s] [delay -15s] req@88019b082000 x1386324633321497/t0(0)
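For reference, the three configurations above gathered in one place. modprobe.conf normally needs quotes around multi-word values (the quotes may simply have been stripped from the post), and lctl can verify the route on each node. This is a sketch of the setup, not a diagnosis of the mount failure:

    # MDS/OSS modprobe.conf
    options lnet networks="o2ib0(ib0)" routes="tcp0 30.9.0.5@o2ib0"
    # router modprobe.conf
    options lnet networks="o2ib0(ib0),tcp0(eth1)" forwarding=enabled
    # TCP-only client modprobe.conf
    options lnet networks="tcp0(eth1)" routes="o2ib0 20.9.0.5@tcp0"

    # after reloading the lnet module on each node:
    lctl show_route                # the route should be listed as up
    lctl ping 30.9.100.31@o2ib    # from the client, via the router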
Re: [Lustre-discuss] help for endless e2fsck
An 8TB LUN is a big device. I once had a 3TB OST device error whose e2fsck consumed about 10 hours, with Lustre 1.8.1.1 and e2fsprogs-1.41.10.sun2. Maybe you can first upgrade your e2fsprogs?

2011/9/19 enqiang zhou eqz...@gmail.com:

It's an 8TB LUN and e2fsck has lasted for about 30 hours. I'm not sure if I should wait longer for it to finish. Below is part of e2fsck's log.

... ...
Illegal block number passed to ext2fs_test_block_bitmap #4243581855 for multiply claimed block map
Illegal block number passed to ext2fs_test_block_bitmap #2363489791 for multiply claimed block map
Illegal block number passed to ext2fs_test_block_bitmap #4091539423 for multiply claimed block map
Illegal block number passed to ext2fs_test_block_bitmap #398961 for multiply claimed block map
Illegal block number passed to ext2fs_test_block_bitmap #1339682798 for multiply claimed block map
Pass 1C: Scanning directories for inodes with multiply-claimed blocks 19:06
Pass 1D: Reconciling multiply-claimed blocks
... (inode #423776, mod time Tue Oct 6 18:10:49 1981)
... (inode #405328, mod time Tue Oct 6 18:10:49 1981)
... (inode #366432, mod time Tue Oct 6 18:10:49 1981)
... (inode #349536, mod time Tue Oct 6 18:10:49 1981)
... (inode #329824, mod time Tue Oct 6 18:10:49 1981)
... (inode #312928, mod time Tue Oct 6 18:10:49 1981)
... (inode #275296, mod time Tue Oct 6 18:10:49 1981)
... (inode #238688, mod time Tue Oct 6 18:10:49 1981)
... (inode #223056, mod time Tue Oct 6 18:10:49 1981)
... (inode #220768, mod time Tue Oct 6 18:10:49 1981)
... (inode #201056, mod time Tue Oct 6 18:10:49 1981)
... (inode #184160, mod time Tue Oct 6 18:10:49 1981)
... (inode #164448, mod time Tue Oct 6 18:10:49 1981)
... (inode #146528, mod time Tue Oct 6 18:10:49 1981)
... (inode #126816, mod time Tue Oct 6 18:10:49 1981)
... (inode #109920, mod time Tue Oct 6 18:10:49 1981)
... (inode #90208, mod time Tue Oct 6 18:10:49 1981)
... (inode #74576, mod time Tue Oct 6 18:10:49 1981)
... (inode #72288, mod time Tue Oct 6 18:10:49 1981)
... (inode #35680, mod time Tue Oct 6 18:10:49 1981)
Clone multiply-claimed blocks? yes
Illegal block number passed to ext2fs_test_block_bitmap #3449154175 for multiply claimed block map
Clone multiply-claimed blocks? yes
Illegal block number passed to ext2fs_test_block_bitmap #3449154175 for multiply claimed block map

I'd appreciate any suggestion anyone could give me!

2011/9/19, Larry tsr...@gmail.com:

You say e2fsck enters an endless loop; maybe you haven't given it enough time. By the way, you'd better attach some logs.

On 9/18/11, enqiang zhou eqz...@gmail.com wrote:

Hi all, We experienced a serious RAID problem and the OST on the RAID was corrupted; it could not be mounted. dmesg showed the message below when I tried to mount it as ldiskfs:

LDISKFS-fs error (device sdd): ldiskfs_check_descriptors: Checksum for group 14208 failed (51136!=40578)
LDISKFS-fs: group descriptors corrupted!

Then I tried to repair it using e2fsck, but it entered an endless loop; e2fsck never stopped! And I couldn't mount it as ldiskfs after I sent a kill signal to e2fsck. I also tried some advice found on the list, like tune2fs -O uninit_bg /dev/xxx, then e2fsck, but none of it helped. Our Lustre version is 1.8.1.1 with e2fsprogs-1.41.10.sun2. Can Mr Andreas Dilger give me some advice? Any help will be greatly appreciated. Thanks! Best Regards
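A sketch of Larry's suggestion, assuming a newer Lustre-patched e2fsprogs package is at hand (the package and device names are hypothetical). The -C 0 flag prints a progress bar, which helps distinguish a slow pass from a genuine loop:

    rpm -Uvh e2fsprogs-<newer>.x86_64.rpm        # assumption: a newer Lustre e2fsprogs build
    e2fsck -fy -C 0 /dev/sdd 2>&1 | tee /root/e2fsck-sdd.log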
Re: [Lustre-discuss] Where to download Lustre from since 01 Aug?
I believe there is no official version, just the Oracle version and the Whamcloud version.

On Tue, Aug 9, 2011 at 8:19 AM, Mathew Eis m...@usgs.gov wrote:

Hi List, We, too, got lost looking for downloads in the Oracle etherland... Are there any differences between the official version and the Whamcloud version? Also, I don't see previous releases such as 1.8.4 or 1.8.5 available; are 1.8.6 and 2.0.0 the only versions available through Whamcloud? Will OpenSFS be taking over the maintenance/hosting of Lustre now that Oracle seems to have dropped support? Thanks in advance! -- Mathew I Eis IT Specialist US Geological Survey
Re: [Lustre-discuss] Does the Whamcloud's lustre 1.8.6 rc1 support sles?
OK, I'll report it soon, thanks.

On Sat, Jun 11, 2011 at 1:20 AM, Andreas Dilger adil...@whamcloud.com wrote:

On 2011-06-10, at 9:42 AM, Larry wrote:

I built Lustre 1.8.6 RC1 on SLES 10 SP2, kernel 2.6.16.60-0.69.1_x86_64-smp, today, and got a lot of errors applying the ldiskfs kernel patches. Now I'm trying to update these kernel patches one by one. So does this version support SLES 10 SP2?

The specific version of each supported kernel is in lustre/ChangeLog. It reports 2.6.16.60-0.42.8 (SLES 10) as the supported SLES kernel, so it shouldn't be very different from the one you have, but one never knows what SLES is up to (they bumped SLES 11 from 2.6.27 to 2.6.32 for SP1). If you need to make any serious patch changes, you should file a bugzilla and/or Jira bug with the updates, using a clear topic like "Updated ldiskfs patches for SLES 10 2.6.16.60-0.69.1", so that others don't need to do this work again. Cheers, Andreas -- Andreas Dilger Principal Engineer Whamcloud, Inc.
Re: [Lustre-discuss] Does the Whamcloud's lustre 1.8.6 rc1 support sles?
Thanks, Peter. I have learned from Oracle's changelog that they still support SLES. Maybe I'll do some work to port these kernel patches to SLES and test them in the future.

On Sat, Jun 11, 2011 at 10:01 PM, Peter Jones pjo...@whamcloud.com wrote:

Sorry, I was in transit yesterday when this was posted or I would have replied sooner. You should note for the 1.8.6-wc release that Whamcloud is only supporting RHEL/CentOS 5 servers and clients and RHEL6 clients - we are not routinely building and testing SLES. We have done some exploratory work on extending to include SLES and may add this in future 1.8.x releases. SLES11 clients are already supported for the upcoming 2.1 community release. In the meantime, SLES users can still use the equivalent Oracle 1.8.6 release (though someone from Oracle would need to comment on the availability of this).

On 11-06-11 4:21 AM, Larry wrote:

OK, I'll report it soon, thanks.

On Sat, Jun 11, 2011 at 1:20 AM, Andreas Dilger adil...@whamcloud.com wrote:

On 2011-06-10, at 9:42 AM, Larry wrote:

I built Lustre 1.8.6 RC1 on SLES 10 SP2, kernel 2.6.16.60-0.69.1_x86_64-smp, today, and got a lot of errors applying the ldiskfs kernel patches. Now I'm trying to update these kernel patches one by one. So does this version support SLES 10 SP2?

The specific version of each supported kernel is in lustre/ChangeLog. It reports 2.6.16.60-0.42.8 (SLES 10) as the supported SLES kernel, so it shouldn't be very different from the one you have, but one never knows what SLES is up to (they bumped SLES 11 from 2.6.27 to 2.6.32 for SP1). If you need to make any serious patch changes, you should file a bugzilla and/or Jira bug with the updates, using a clear topic like "Updated ldiskfs patches for SLES 10 2.6.16.60-0.69.1", so that others don't need to do this work again. Cheers, Andreas -- Andreas Dilger Principal Engineer Whamcloud, Inc.

-- Peter Jones Whamcloud, Inc. www.whamcloud.com
[Lustre-discuss] Does the Whamcloud's lustre 1.8.6 rc1 support sles?
Hi everyone, I built Lustre 1.8.6 RC1 on SLES 10 SP2, kernel 2.6.16.60-0.69.1_x86_64-smp, today, and got a lot of errors applying the ldiskfs kernel patches. Now I'm trying to update these kernel patches one by one. So does this version support SLES 10 SP2? Thanks a lot. Best Regards, Larry
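One way to check Andreas' point before building: look up the supported kernel in lustre/ChangeLog and configure against the matching kernel source. A sketch; the grep pattern and kernel-source path are guesses, not the file's exact wording:

    # from the top of the lustre source tree
    grep -i sles lustre/ChangeLog | head       # supported SLES kernel versions
    # then configure against the matching kernel source (path hypothetical)
    ./configure --with-linux=/usr/src/linux-2.6.16.60-0.42.8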
Re: [Lustre-discuss] problem reading HDF files on 1.8.5 filesystem
Try mounting the Lustre filesystem with -o flock or -o localflock.

On Thu, May 5, 2011 at 4:47 AM, Christopher Walker cwal...@fas.harvard.edu wrote:

Hello, We have a user who is trying to post-process HDF files in R. Her script goes through a number (~2500) of files in a directory, opening and reading the contents. This usually goes fine, but occasionally the script dies with:

HDF5-DIAG: Error detected in HDF5 (1.9.4) thread 46944713368080:
#000: H5F.c line 1560 in H5Fopen(): unable to open file
major: File accessability
minor: Unable to open file
#001: H5F.c line 1337 in H5F_open(): unable to read superblock
major: File accessability
minor: Read failed
#002: H5Fsuper.c line 542 in H5F_super_read(): truncated file
major: File accessability
minor: File has been truncated
Error in hdf5load(file = myfile, load = FALSE, verbosity = 0, tidy = TRUE) : unable to open HDF file: /n/scratch2/moorcroft_lab/nlevine/Moore_sites_final/met/LT_spinup/ms67/analy/s67-E-1628-04-00-00-g01.h5
HDF5-DIAG: Error detected in HDF5 (1.9.4) thread 46944713368080:
#000: H5F.c line 2012 in H5Fclose(): decrementing file ID failed
major: Object atom
minor: Unable to close file
#001: H5I.c line 1340 in H5I_dec_ref(): can't locate ID
major: Object atom
minor: Unable to find atom information (already closed?)
Error in hdf5cleanup(16778754L) : unable to close HDF file

But this file definitely does exist -- any stat or ls command shows it without a problem. Further, once I 'ls' this file, if I rerun the same script, it successfully reads this file, but then dies on the next one with the same error. If I 'ls' the entire directory, the script runs to completion without a problem. strace output shows:

open("/n/scratch2/moorcroft_lab/nlevine/Moore_sites_final/met/LT_spinup/ms67/analy/s67-E-1628-04-00-00-g01.h5", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
lseek(3, 0, SEEK_SET) = 0
read(3, "\211HDF\r\n\32\n", 8) = 8
read(3, "\0", 1) = 1
read(3, "\0\0\0\0\10\10\0\4\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\377\377\377\377\377\377\377\377@...", 87) = 87
close(3) = 0
write(2, "HDF5-DIAG: Error detected in HDF...", 42) = 42

etc., which initially looks fine to me, followed by an abrupt close. NFS filesystems and our 1.6.7.2 filesystem have no such problems -- any suggestions? Thanks very much, Chris
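What the suggested fix looks like on a client, with hypothetical MGS NID, fsname and mount point:

    # cluster-wide coherent flock (safest, some overhead):
    mount -t lustre -o flock mgsnode@tcp0:/lustre /mnt/lustre
    # or node-local flock only (cheaper, consistent only within one node):
    mount -t lustre -o localflock mgsnode@tcp0:/lustre /mnt/lustre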
Re: [Lustre-discuss] e2fsck issue
You'd better unmount the clients and the OST/MDT first, then fsck them.

On Wed, Apr 13, 2011 at 4:29 PM, Christos Theodosiou ctheo...@grid.auth.gr wrote:

Hi all, I am trying to perform a file-system check on a mounted Lustre file-system. The e2fsck fails with the following message:

e2fsck -n -v --mdsdb /tmp/mdsdb /dev/msavg/lv001
e2fsck 1.41.10.sun2 (24-Feb-2010)
device /dev/mapper/msavg-lv001 mounted by lustre per /proc/fs/lustre/mds/lustrefs-MDT0000/mntdev
Warning! /dev/msavg/lv001 is mounted.
e2fsck: MMP: device currently active while trying to open /dev/msavg/lv001
The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else), then the superblock is corrupt, and you might try running e2fsck with an alternate superblock: e2fsck -b 32744 <device>

I tried setting the -b argument but the message persists. Do you have any suggestions on how to proceed? Best regards, Christos -- Christos Theodosiou Scientific Computational Center Aristotle University 54 124 Thessaloniki, Greece Tel: +30 2310 99 8988 Fax: +30 2310 99 4309 http://www.grid.auth.gr
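The order of operations Larry describes, as a sketch; the MDT mount point is hypothetical. The MMP complaint in the error above is exactly the multiple-mount protection that refuses to open an active device:

    umount /mnt/mdt                      # on the MDS, after stopping clients
    e2fsck -n -v --mdsdb /tmp/mdsdb /dev/msavg/lv001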
Re: [Lustre-discuss] e2fsck and related errors during recovering
Would it help to update e2fsprogs to the newest version? I once had a problem during e2fsck; after updating e2fsprogs, it was OK.

On Thu, Apr 7, 2011 at 2:29 AM, Andreas Dilger adil...@whamcloud.com wrote:

Having the actual error messages makes this kind of problem much easier to solve. At a guess, if the journal was removed by e2fsck you can re-add it with tune2fs -J size=400 /dev/{mdsdev}. As for lfsck, if you still need to run it, you need to make sure the same version of e2fsprogs is on all OSTs and the MDS. Cheers, Andreas

On 2011-04-06, at 1:26 AM, Werner Dilling dill...@zdv.uni-tuebingen.de wrote:

Hello, after a crash of our Lustre system (1.6.4) we have problems repairing the filesystem. Running the 1.6.4 e2fsck failed on the MDS filesystem, so we tried with the latest 1.8 version, which succeeded. But trying to mount the MDS as an ldiskfs filesystem failed with the standard error message ("bad superblock on ..."). We tried to get more info, and the file command (file -s -L /dev/...) produced "ext2 filesystem" instead of the "ext3 filesystem" we got from all OST filesystems. We were able to produce the MDS database which is needed to get info for lfs fsck, but using this database to create the OST databases failed with the error message: error getting mds_hdr (large number:8) in /tmp/msdb: Cannot allocate memory. So I assume the mdsdb is in bad shape, and my question is how we can proceed. I assume we have to create a correct version of the MDS filesystem, and how to do this is unknown. Any help and info is appreciated. Thanks, w.dilling
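A sketch of re-adding and then verifying the journal, following Andreas' command (device name as in his reply):

    tune2fs -J size=400 /dev/mdsdev            # re-create a 400MB journal
    dumpe2fs -h /dev/mdsdev | grep -i journal  # confirm the journal feature is back
    file -s -L /dev/mdsdev                     # should report an ext3 filesystem again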
Re: [Lustre-discuss] persistent client re-connect failure
If you *only* deactivate it on the MDS, then you can still see the OST on the client; it just will not be written to anymore.

On Mon, Mar 21, 2011 at 11:49 AM, Samuel Aparicio sapari...@bccrc.ca wrote:

Follow-up to this posting. I notice on the client that lctl device_list reports the following:

0 UP mgc MGC10.9.89.51@tcp 5a76b5b6-82bf-2053-8c17-e68ffe552edc 5
1 UP lov lustre-clilov-8100459a9c00 6775de4c-6c29-9316-a715-3472233477d1 4
2 UP mdc lustre-MDT0000-mdc-8100459a9c00 6775de4c-6c29-9316-a715-3472233477d1 5
3 UP osc lustre-OST0000-osc-8100459a9c00 6775de4c-6c29-9316-a715-3472233477d1 5
4 UP osc lustre-OST0001-osc-8100459a9c00 6775de4c-6c29-9316-a715-3472233477d1 5
5 UP osc lustre-OST0002-osc-8100459a9c00 6775de4c-6c29-9316-a715-3472233477d1 5
6 UP osc lustre-OST0003-osc-8100459a9c00 6775de4c-6c29-9316-a715-3472233477d1 4
7 UP osc lustre-OST0004-osc-8100459a9c00 6775de4c-6c29-9316-a715-3472233477d1 5
8 UP osc lustre-OST0005-osc-8100459a9c00 6775de4c-6c29-9316-a715-3472233477d1 5
9 UP osc lustre-OST0006-osc-8100459a9c00 6775de4c-6c29-9316-a715-3472233477d1 5
10 UP lov lustre-clilov-810c92f2b800 0ecd69f5-6793-fcb1-0e05-8851c99e5dc5 4
11 UP mdc lustre-MDT0000-mdc-810c92f2b800 0ecd69f5-6793-fcb1-0e05-8851c99e5dc5 5
12 UP osc lustre-OST0000-osc-810c92f2b800 0ecd69f5-6793-fcb1-0e05-8851c99e5dc5 5
13 UP osc lustre-OST0001-osc-810c92f2b800 0ecd69f5-6793-fcb1-0e05-8851c99e5dc5 5
14 UP osc lustre-OST0002-osc-810c92f2b800 0ecd69f5-6793-fcb1-0e05-8851c99e5dc5 5
15 UP osc lustre-OST0003-osc-810c92f2b800 0ecd69f5-6793-fcb1-0e05-8851c99e5dc5 4
16 UP osc lustre-OST0004-osc-810c92f2b800 0ecd69f5-6793-fcb1-0e05-8851c99e5dc5 5
17 UP osc lustre-OST0005-osc-810c92f2b800 0ecd69f5-6793-fcb1-0e05-8851c99e5dc5 5
18 UP osc lustre-OST0006-osc-810c92f2b800 0ecd69f5-6793-fcb1-0e05-8851c99e5dc5 5
19 UP lov lustre-clilov-81047a45c000 6a3d5815-4851-31b0-9400-c8892e11dae4 4
20 UP mdc lustre-MDT0000-mdc-81047a45c000 6a3d5815-4851-31b0-9400-c8892e11dae4 5
21 UP osc lustre-OST0000-osc-81047a45c000 6a3d5815-4851-31b0-9400-c8892e11dae4 5
22 UP osc lustre-OST0001-osc-81047a45c000 6a3d5815-4851-31b0-9400-c8892e11dae4 5
23 UP osc lustre-OST0002-osc-81047a45c000 6a3d5815-4851-31b0-9400-c8892e11dae4 5
24 UP osc lustre-OST0003-osc-81047a45c000 6a3d5815-4851-31b0-9400-c8892e11dae4 4
25 UP osc lustre-OST0004-osc-81047a45c000 6a3d5815-4851-31b0-9400-c8892e11dae4 5
26 UP osc lustre-OST0005-osc-81047a45c000 6a3d5815-4851-31b0-9400-c8892e11dae4 5
27 UP osc lustre-OST0006-osc-81047a45c000 6a3d5815-4851-31b0-9400-c8892e11dae4 5

However, OST3 is non-existent; it was deactivated on the MDS. Why would the clients think it exists?

Professor Samuel Aparicio BM BCh PhD FRCPath
Nan and Lorraine Robertson Chair UBC/BC Cancer Agency
675 West 10th, Vancouver V5Z 1L3, Canada.
office: +1 604 675 8200 lab website http://molonc.bccrc.ca
PLEASE SUPPORT MY FUNDRAISING FOR THE RIDE TO SEATTLE AND THE WEEKEND TO END WOMENS CANCERS. YOU CAN DONATE AT THE LINKS BELOW
Ride to Seattle Fundraiser
Weekend to End Womens Cancers

On Mar 20, 2011, at 8:41 PM, Samuel Aparicio wrote:

I am stuck with the following issue on a client attached to a Lustre system. We are running Lustre 1.8.5. Somehow connectivity to the OST failed at some point and the mount hung. After unmounting and re-mounting, the client attempts to reconnect. lctl ping shows the client to be connected, and normal ping to the OSS/MGS servers shows connectivity. Remounting the filesystem results in only some files being visible.
The kernel messages are as follows:

Lustre: setting import lustre-OST0003_UUID INACTIVE by administrator request
Lustre: lustre-OST0003-osc-8110238c7400.osc: set parameter active=0
Lustre: Skipped 3 previous similar messages
LustreError: 14114:0:(lov_obd.c:315:lov_connect_obd()) not connecting OSC ^\; administratively disabled
Lustre: Client lustre-client has started
LustreError: 14207:0:(file.c:995:ll_glimpse_size()) obd_enqueue returned rc -5, returning -EIO
LustreError: 14207:0:(file.c:995:ll_glimpse_size()) Skipped 1 previous similar message
LustreError: 14207:0:(file.c:995:ll_glimpse_size()) obd_enqueue returned rc -5, returning -EIO
LustreError: 14686:0:(file.c:995:ll_glimpse_size()) obd_enqueue returned rc -5, returning -EIO
Lustre: 22218:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request x1363662012007464 sent from lustre-OST0000-osc-8110238c7400 to NID 10.9.89.21@tcp 16s ago has timed out (16s prior to deadline). req@810459ce4c00 x1363662012007464/t0 o8->lustre-OST0000_UUID@10.9.89.21@tcp:28/4 lens 368/584 e 0 to 1 dl 1300678232 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 22218:0:(client.c:1476:ptlrpc_expire_one_request()) Skipped 182 previous similar messages
Lustre: 22219:0:(import.c:517:import_select_connection())
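If the dead OST should also stop being tried from the client side, one sketch (the device number 6 is an example value read off a listing like the one above):

    lctl dl | grep OST0003          # find the OSC device number for the removed OST
    lctl --device 6 deactivate      # deactivate that OSC on this client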
[Lustre-discuss] Does lustre 1.8 stop update and maintenance?
Hi all, has Lustre 1.8 stopped being updated and maintained? I have not seen any updates for a long time. Only Whamcloud has released Lustre 2.1. Does this mean Oracle has frozen the development of Lustre?
Re: [Lustre-discuss] Does lustre 1.8 stop update and maintenance?
Whamcloud is a great company! I think I should do something for Lustre...

On Wed, Mar 2, 2011 at 1:48 AM, Robert Read rr...@whamcloud.com wrote:

Hi, I cannot comment on Oracle's plans regarding Lustre, but Whamcloud does intend to continue supporting 1.8.x for some time. You can see activity related to 1.8.x (as well as 2.1) in http://jira.whamcloud.com. cheers, robert read Whamcloud, Inc

On Mar 1, 2011, at 4:48, Larry wrote:

Hi all, has Lustre 1.8 stopped being updated and maintained? I have not seen any updates for a long time. Only Whamcloud has released Lustre 2.1. Does this mean Oracle has frozen the development of Lustre?
Re: [Lustre-discuss] OST problem
Hi Lucius, Lustre manual chapter 15 tells you how to do it.

On Tue, Mar 1, 2011 at 1:05 PM, Lucius lucius...@hotmail.com wrote:

Hello everyone, I would like to extend an OSS which is still in current use. I would like to extend it with a server which has exactly the same hardware configuration, and I'd like to extend it in an active/active mode. I couldn't find any documentation about this, as most of the examples show how to use failnode during formatting. However, I need to extend the currently working system without losing data. Also, the tunefs.lustre examples show only the parameter configuration, but they won't tell you whether you need to synchronize the file system before setting the
How would the system know which OSTs should run on a given server, identified by its unique IP? Thank you in advance, Viktor
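A sketch of what chapter 15 amounts to for adding a failover partner to an existing OST, with hypothetical device names and NID; the OST must be unmounted while tunefs.lustre runs:

    umount /mnt/ost0
    tunefs.lustre --failnode=192.168.1.12@tcp0 /dev/ost0dev
    mount -t lustre /dev/ost0dev /mnt/ost0

After this, clients know both NIDs for the OST and will try the partner server when the primary is unavailable.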
[Lustre-discuss] An odd problem in my lustre 1.8.0
Dear all, I have an odd problem today in my Lustre 1.8.0. All of the OSSes and the MDS appear well, but one of the clients has a problem. When I create a file on OST5 (one of my OSTs) and dd or echo something to this file, the process hangs and never succeeds. For example:

client1:/home # lfs setstripe -o 5 test.txt
client1:/home # lfs getstripe test.txt
OBDS:
0: lustre-OST0000_UUID ACTIVE
1: lustre-OST0001_UUID ACTIVE
2: lustre-OST0002_UUID ACTIVE
3: lustre-OST0003_UUID ACTIVE
4: lustre-OST0004_UUID ACTIVE
5: lustre-OST0005_UUID ACTIVE
6: lustre-OST0006_UUID ACTIVE
7: lustre-OST0007_UUID ACTIVE
8: lustre-OST0008_UUID ACTIVE
9: lustre-OST0009_UUID ACTIVE
10: lustre-OST000a_UUID ACTIVE
11: lustre-OST000b_UUID ACTIVE
12: lustre-OST000c_UUID ACTIVE
13: lustre-OST000d_UUID ACTIVE
14: lustre-OST000e_UUID ACTIVE
15: lustre-OST000f_UUID ACTIVE
16: lustre-OST0010_UUID ACTIVE
test.txt
obdidx objid objid group
5 158029029 0x96b54e5 0

client1:/home # dd if=/dev/zero of=test.txt bs=1M count=100

then the dd process hangs and never returns. If I edit and save the file, its location changes to another OST, not OST5. For example:

client1:/home # dd if=/dev/zero of=test.txt bs=1M count=100 # (Ctrl-C)
1+0 records in
0+0 records out
0 bytes (0 B) copied, 173.488 seconds, 0.0 kB/s
client1:/home # vi test.txt # add something
client1:/home # lfs getstripe test.txt
OBDS:
0: lustre-OST0000_UUID ACTIVE
1: lustre-OST0001_UUID ACTIVE
2: lustre-OST0002_UUID ACTIVE
3: lustre-OST0003_UUID ACTIVE
4: lustre-OST0004_UUID ACTIVE
5: lustre-OST0005_UUID ACTIVE
6: lustre-OST0006_UUID ACTIVE
7: lustre-OST0007_UUID ACTIVE
8: lustre-OST0008_UUID ACTIVE
9: lustre-OST0009_UUID ACTIVE
10: lustre-OST000a_UUID ACTIVE
11: lustre-OST000b_UUID ACTIVE
12: lustre-OST000c_UUID ACTIVE
13: lustre-OST000d_UUID ACTIVE
14: lustre-OST000e_UUID ACTIVE
15: lustre-OST000f_UUID ACTIVE
16: lustre-OST0010_UUID ACTIVE
test.txt
obdidx objid objid group
6 159122026 0x97c026a 0

But both the client and the OSS seem fine. By the way, other clients and OSSes do not have this problem.

client1:/home # lfs check servers
lustre-MDT0000-mdc-810438d12c00 active.
lustre-OST000a-osc-810438d12c00 active.
lustre-OST000f-osc-810438d12c00 active.
lustre-OST000c-osc-810438d12c00 active.
lustre-OST0006-osc-810438d12c00 active.
lustre-OST000e-osc-810438d12c00 active.
lustre-OST0009-osc-810438d12c00 active.
lustre-OST0000-osc-810438d12c00 active.
lustre-OST000d-osc-810438d12c00 active.
lustre-OST0003-osc-810438d12c00 active.
lustre-OST0002-osc-810438d12c00 active.
lustre-OST0008-osc-810438d12c00 active.
lustre-OST000b-osc-810438d12c00 active.
lustre-OST0004-osc-810438d12c00 active.
lustre-OST0007-osc-810438d12c00 active.
lustre-OST0005-osc-810438d12c00 active.
lustre-OST0010-osc-810438d12c00 active.
lustre-OST0001-osc-810438d12c00 active.

I tried it many times. The logs reported some error messages only once. On the client:

Dec 19 18:28:57 client1 kernel: LustreError: 11-0: an error occurred while communicating with 12.12.71@o2ib. The ost_punch operation failed with -107
Dec 19 18:28:57 client1 kernel: LustreError: Skipped 1 previous similar message
Dec 19 18:28:57 client1 kernel: Lustre: lustre-OST0005-osc-810438d12c00: Connection to service lustre-OST0005 via nid 12.12.71@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Dec 19 18:28:57 client1 kernel: LustreError: 4570:0:(import.c:909:ptlrpc_connect_interpret()) lustre-OST0005_UUID went back in time (transno 189979771521 was previously committed, server now claims 0)!
See https://bugzilla.lustre.org/show_bug.cgi?id=9646

Dec 19 18:28:57 client1 kernel: LustreError: 167-0: This client was evicted by lustre-OST0005; in progress operations using this service will fail.
Dec 19 18:28:57 client1 kernel: LustreError: 7128:0:(rw.c:192:ll_file_punch()) obd_truncate fails (-5) ino 41729130
Dec 19 18:28:57 client1 kernel: Lustre: lustre-OST0005-osc-810438d12c00: Connection restored to service lustre-OST0005 using nid 12.12.71@o2ib.

On the OSS:

Dec 19 18:27:52 os6 kernel: LustreError: 0:0:(ldlm_lockd.c:305:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 12.12.12...@o2ib ns: filter-lustre-OST0005_UUID lock: 810087d66200/0xae56b014db6d6d0a lrc: 3/0,0 mode: PR/PR res: 158015656/0 rrc: 2 type: EXT [0-18446744073709551615] (req 0-18446744073709551615) flags: 0x10020 remote: 0xe02336632642c5fc expref: 27 pid: 5333 timeout 7284896273
Dec 19 18:28:57 os6 kernel: LustreError: 5407:0:(ldlm_lib.c:1826:target_send_reply_msg()) @@@ processing error (-107) req@8103dd91b400 x1343016412725286/t0 o10->?@?:0/0 lens 400/0 e 0 to 0 dl 1292754580 ref 1 fl Interpret:/0/0 rc -107/0

The MDS has no messages related to this. I don't know
Re: [Lustre-discuss] fsck.ext4 for device ... exited with signal 11.
Old versions of e2fsprogs actually have bugs like this; the newer the better, I think.

On Thu, Dec 2, 2010 at 7:11 AM, Craig Prescott presc...@hpc.ufl.edu wrote:

Andreas Dilger wrote:

Do you have enough RAM to run e2fsck on this node? Have you tried running it under gdb to see if it can catch the sig11 and print a stack trace?

Yup, plenty of RAM - we've got 32GB in this node. We've already started up fsck again using Colin's suggestion of e2fsprogs-1.41.12.2. So far so good. But if we need to fire it up under gdb, I guess that's what we'll do. Thanks, Craig Prescott UF HPC Center
Re: [Lustre-discuss] lmt version 3 release
Congrats! Let's give it a try.

On Wed, Nov 3, 2010 at 4:12 AM, Jim Garlick garl...@llnl.gov wrote:

Version 3 of the Lustre Monitoring Tool (LMT) is now available on gcode: http://code.google.com/p/lmt/

This is a major release that hopefully will improve LMT usability. It has been tested with Lustre 1.8.3. A few highlights:

* New ltop that works directly with Cerebro and has an expanded display.
* Auto-configuration of the MySQL database (the Lustre config is determined on the fly)
* Improved error handling and logging (configurable)
* New config file
* Code improvements for maintainability

For those upgrading from Version 2, the LMT schema has not changed, and the new monitor module is backwards compatible with the Version 2 metric modules. Upgrading consists of:

1. Setting up the new /etc/lmt/lmt.conf config file on the LMT server
2. Updating the lmt-server package on the LMT server and restarting cerebrod
3. Updating the lmt-server-agent package on the Lustre servers and restarting cerebrod

Please refer to the Installation wiki page on the above gcode site, and direct any issues to lmt-disc...@googlegroups.com and/or the LMT gcode issue tracker. Jim Garlick
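The three upgrade steps, sketched as commands; the package file names are hypothetical, while the config path and the cerebrod service name come from the announcement:

    # on the LMT server
    cp lmt.conf /etc/lmt/lmt.conf
    rpm -Uvh lmt-server-3*.rpm && service cerebrod restart
    # on each Lustre server
    rpm -Uvh lmt-server-agent-3*.rpm && service cerebrod restart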
Re: [Lustre-discuss] MPI-IO / ROMIO support for Lustre
We use localflock in order to work with MPI-IO; flock may consume more resources than localflock.

On Mon, Nov 1, 2010 at 10:35 PM, Mark Dixon m.c.di...@leeds.ac.uk wrote:

Hi, I'm trying to get the MPI-IO/ROMIO shipped with OpenMPI and MVAPICH2 working with our Lustre 1.8 filesystem. Looking back at the list archives, 3 different solutions have been offered:

1) Disable data sieving (change default library behaviour)
2) Mount Lustre with localflock (flock consistent only within a node)
3) Mount Lustre with flock (flock consistent across cluster)

However, it is not entirely clear which of these was considered the best. Could anyone who is using MPI-IO on Lustre comment on which they picked, please? I *think* the May 2008 list archive indicates I should be using (3), but I'd feel a whole lot better about it if I knew I wasn't alone :) Cheers, Mark -- Mark Dixon Email: m.c.di...@leeds.ac.uk HPC/Grid Systems Support Tel (int): 35429 Information Systems Services Tel (ext): +44(0)113 343 5429 University of Leeds, LS2 9JT, UK
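What option (2) or (3) looks like as a persistent client mount, with a hypothetical MGS NID and fsname; swap localflock for flock if MPI-IO jobs lock the same file from several nodes:

    # /etc/fstab entry on a client
    mgsnode@o2ib0:/lustre  /mnt/lustre  lustre  localflock,_netdev  0 0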
Re: [Lustre-discuss] Compiling lustre with snmp feature
Fortunately we don't need the checksum, so we haven't seen 20560 so far. But 19528 happened several days ago, so we have two choices:

1. Update to a higher version, e.g. 1.8.1 or 1.8.4. I'd like to update to 1.8.4, but does that mean I should update the OFED driver along with it? We try to avoid changing the OFED.
2. Patch the current version with attachments 23648 and 23751 for bz 19528. Considering the OFED driver, we may have to patch 1.8.0 instead of updating it.

On Tue, Oct 19, 2010 at 9:03 PM, Peter Jones peter.x.jo...@oracle.com wrote:

Hmm. My experience is that 20560 was the most disruptive issue in early 1.8.x releases, but that was fixed in 1.8.1.1. Larry, 19528 was fixed in 1.8.1. You can verify this by checking the patch and noting that the first release with a landed+ flag is 1.8.1. HTH

Larry wrote:

Which critical bug does 1.8.0 have that is fixed in 1.8.0.1? I know 1.8.0 has bug #19528, but I don't know whether it is fixed or not in 1.8.0.1.
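A sketch of choice 2, assuming the two bugzilla attachments have been saved locally (the file names are hypothetical):

    cd lustre-1.8.0
    patch -p1 < ~/bz19528-attachment-23648.patch
    patch -p1 < ~/bz19528-attachment-23751.patch
    # then rebuild against the unchanged kernel and OFED, and reinstall the packages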
Re: [Lustre-discuss] Compiling lustre with snmp feature
Which critical bug does 1.8.0 have that is fixed in 1.8.0.1? I know 1.8.0 has bug #19528, but I don't know whether it is fixed or not in 1.8.0.1.

On Mon, Oct 18, 2010 at 10:41 PM, Brian J. Murrell brian.murr...@oracle.com wrote:

On Mon, 2010-10-18 at 16:37 +0200, Alfonso Pardo wrote:

Hello,

Hi,

I tried to compile with the command: ./configure --with-linux=/usr/src/kernels/2.6.18-92.el5-x86_64/ --enable-snmp But I get an error at some point: checking for register_mib... no

You need to look in config.log and see why it's failing to find that.

I have CentOS 5.2 and Lustre version 1.8.0.

1.8.0 had a subsequent 1.8.0.1 release, which means that it fixed a critical bug. I would strongly advise upgrading, and since you are going to upgrade, it might as well be to 1.8.4, the latest release, where you will likely get more people's attention with questions and bug reports/fixes.

Any package to install?

I don't know off-hand, which is why I gave you instructions to discover what the problem is exactly. b.
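Following Brian's pointer, a sketch of digging the real failure out of config.log; the hint that register_mib needs the net-snmp development headers is an assumption, not something confirmed in the thread:

    grep -n -B5 register_mib config.log   # shows the compile/link error behind the "no"
    yum install net-snmp-devel            # then re-run ./configure --enable-snmp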
Re: [Lustre-discuss] How do you monitor your lustre?
The latest LMT (lmt-2.6.4-2) was updated on Sep 17, 2010. Why call it moribund?

On Thu, Sep 30, 2010 at 5:46 PM, Andreas Davour dav...@pdc.kth.se wrote:

I ask because the lmt project seems to be quite moribund. Is anyone else out there doing something? /andreas -- Systems Engineer PDC Center for High Performance Computing CSC School of Computer Science and Communication KTH Royal Institute of Technology SE-100 44 Stockholm, Sweden Phone: 087906658 A satellite, an earring, and a dust bunny are what made America great!
Re: [Lustre-discuss] Lustre 1.8.4 ETA
Just git pull from the Lustre repository and check out the 1.8.4 tag.

On Tue, Aug 17, 2010 at 9:50 PM, Wojciech Turek wj...@cam.ac.uk wrote:

Any idea when 1.8.4 will be released? Is the source code available somewhere so I can try to build it myself? Many thanks, Wojciech

On 3 August 2010 17:55, Johann Lombardi johann.lomba...@oracle.com wrote:

Hi James,

On Tue, Aug 03, 2010 at 10:42:00AM -0600, James Robnett wrote:

Wonderful news. On a related topic: can the build scripts be made available (or a cleansed variant)? It's not that cumbersome to write one's own, but if they already exist it'd be handy to re-use them rather than recreating them, or at least use them as a reference.

Our build scripts are - and have always been - available under the build directory (build/{lbuild,lbuild-rhel5,...}). Cheers, Johann

-- Wojciech Turek Senior System Architect High Performance Computing Service University of Cambridge Email: wj...@cam.ac.uk Tel: (+)44 1223 763517
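Spelled out, with the repository URL and tag name as assumptions (1.8.x tags followed the v1_8_x convention):

    git clone git://git.whamcloud.com/fs/lustre-release.git
    cd lustre-release
    git checkout v1_8_4        # in an existing clone: git pull && git checkout v1_8_4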
Re: [Lustre-discuss] Lustre::LFS + Lustre::Info (inc. lustre-info.pl) available on the CPAN
Good job, I'll download and learn from them.

On Wed, Jul 28, 2010 at 9:38 PM, Adrian Ulrich adr...@blinkenlights.ch wrote:

First: Sorry for the shameless self-advertising, but... I uploaded two Lustre-related modules to the CPAN:

#1: Lustre::Info provides easy access to information located at /proc/fs/lustre; it also comes with a 'performance monitoring' script called 'lustre-info.pl'
#2: Lustre::LFS offers IO::Dir and IO::File-like filehandles but with additional Lustre-specific features ($dir_fh->set_stripe...)

Examples and details:

Lustre::Info and lustre-info.pl
---
Lustre::Info provides a Perl-OO interface to Lustre's procfs information. (Confusing) example code to get the blockdevice of all OSTs:

my $l = Lustre::Info->new;
print join("\n", map( { $l->get_ost($_)->get_name.": ".$l->get_ost($_)->get_blockdevice } @{$l->get_ost_list}), '' ) if $l->is_ost;

..output:

$ perl test.pl
lustre1-OST001e: /dev/md17
lustre1-OST0016: /dev/md15
lustre1-OST000e: /dev/md13
lustre1-OST0006: /dev/md11

The module also includes a script called 'lustre-info.pl' that can be used to gather some live performance statistics. Use `--ost-stats' to get a quick overview of what's going on:

$ lustre-info.pl --ost-stats
lustre1-OST0006 (@ /dev/md11) : write= 5.594 MB/s, read= 0.000 MB/s, create= 0.0 R/s, destroy= 0.0 R/s, setattr= 0.0 R/s, preprw= 6.0 R/s
lustre1-OST000e (@ /dev/md13) : write= 3.997 MB/s, read= 0.000 MB/s, create= 0.0 R/s, destroy= 0.0 R/s, setattr= 0.0 R/s, preprw= 4.0 R/s
lustre1-OST0016 (@ /dev/md15) : write= 5.502 MB/s, read= 0.000 MB/s, create= 0.0 R/s, destroy= 0.0 R/s, setattr= 0.0 R/s, preprw= 6.0 R/s
lustre1-OST001e (@ /dev/md17) : write= 5.905 MB/s, read= 0.000 MB/s, create= 0.0 R/s, destroy= 0.0 R/s, setattr= 0.0 R/s, preprw= 6.7 R/s

You can also get client-OST details via `--monitor=MODE':

$ lustre-info.pl --monitor=ost --as-list # this will only show clients where read+write >= 1MB/s
client nid | lustre1-OST0006 | lustre1-OST000e | lustre1-OST0016 | lustre1-OST001e | +++ TOTALS +++ (MB/s)
10.201.46...@o2ib | r= 0.0, w= 0.0 | r= 0.0, w= 0.0 | r= 0.0, w= 0.0 | r= 0.0, w= 1.1 | read= 0.0, write= 1.1
10.201.47...@o2ib | r= 0.0, w= 0.0 | r= 0.0, w= 1.2 | r= 0.0, w= 2.0 | r= 0.0, w= 0.0 | read= 0.0, write= 3.2

There are many more options; check out `lustre-info.pl --help' for details!

Lustre::LFS::Dir and Lustre::LFS::File
---
These two packages behave like IO::File and IO::Dir, but both of them add some Lustre-only features to the returned filehandle. Quick example:

my $fh = Lustre::LFS::File->new; # $fh is a normal IO::File-like FH
$fh->open("> test") or die;
print $fh "Foo Bar!\n";
my $stripe_info = $fh->get_stripe or die "Not on a lustre filesystem?!\n";

Keep in mind that both Lustre modules are far from being complete: Lustre::Info really needs some MDT support, and Lustre::LFS is just a wrapper for /usr/bin/lfs: an XS version would be much better. But I'd love to hear some feedback if someone decides to play around with these modules + lustre-info.pl :-) Cheers, Adrian
Re: [Lustre-discuss] I/O errors with NAMD
We sometimes have the same problem when running NAMD on Lustre; the console log suggests a file lock expired, but I don't know why.

On Fri, Jul 23, 2010 at 8:12 AM, Wojciech Turek wj...@cam.ac.uk wrote:

Hi Richard, If the cause of the I/O errors is Lustre, there will be some message in the logs. I am seeing a similar problem with some applications that run on our cluster. The symptoms are always the same; just before the application crashes with an I/O error, the node gets evicted with a message like this:

LustreError: 167-0: This client was evicted by ddn_data-OST000f; in progress operations using this service will fail.

The OSS that mounts the OST from the above message has the following line in its log:

LustreError: 0:0:(ldlm_lockd.c:305:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 10.143@tcp ns: filter-ddn_data-OST000f_UUID lock: 81021a84ba00/0x744b1dd4481e38b2 lrc: 3/0,0 mode: PR/PR res: 34959884/0 rrc: 2 type: EXT [0-18446744073709551615] (req 0-18446744073709551615) flags: 0x20 remote: 0x1d34b900a905375d expref: 9 pid: 1506 timeout 8374258376

Can you please check your logs for similar messages? Best regards, Wojciech

On 22 July 2010 23:43, Andreas Dilger andreas.dil...@oracle.com wrote:

On 2010-07-22, at 14:59, Richard Lefebvre wrote:

I have a problem with the scalable molecular dynamics software NAMD. It writes restart files once in a while, but sometimes the binary write crashes. When it crashes is not constant. The only constant is that it happens when it writes to our Lustre file system; when it writes to something else, it is fine. I can't seem to find any errors in any of the /var/log/messages. Has anyone had any problems with NAMD?

Rarely has anyone complained about Lustre not providing error messages when there is a problem, so if there is nothing in /var/log/messages on either the client or the server, then it is hard to know whether it is a Lustre problem or not... If possible, you could try running the application under strace (limited to the I/O calls, or it would be much too much data) to see which system call the error is coming from. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.
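A sketch of Andreas' strace suggestion, limited to file I/O calls so the output stays manageable; the binary name, config file and output path are hypothetical:

    strace -f -e trace=open,read,write,close,lseek -o /tmp/namd.strace namd2 restart.conf
    grep -n ' = -' /tmp/namd.strace     # failing syscalls return negative values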
Re: [Lustre-discuss] I/O errors with NAMD
There are many kinds of reasons for a server to evict a client - maybe a network error, maybe a ptlrpcd bug - but in my experience, the only time I see the I/O error is when running NAMD on a Lustre filesystem. I sometimes see other eviction events, but none of them results in an I/O error. So besides the client eviction, there may be something else causing the I/O error.

On Fri, Jul 23, 2010 at 6:54 PM, Wojciech Turek wj...@cam.ac.uk wrote:

There is a similar thread on this mailing list: http://groups.google.com/group/lustre-discuss-list/browse_thread/thread/afe24159554cd3ff/8b37bababf848123?lnk=gst&q=I%2FO+error+on+clients#

Also, there is an open bug which reports a similar problem: https://bugzilla.lustre.org/show_bug.cgi?id=23190

On 23 July 2010 10:02, Larry tsr...@gmail.com wrote:

We sometimes have the same problem when running NAMD on Lustre; the console log suggests a file lock expired, but I don't know why.

On Fri, Jul 23, 2010 at 8:12 AM, Wojciech Turek wj...@cam.ac.uk wrote:

Hi Richard, If the cause of the I/O errors is Lustre, there will be some message in the logs. I am seeing a similar problem with some applications that run on our cluster. The symptoms are always the same; just before the application crashes with an I/O error, the node gets evicted with a message like this:

LustreError: 167-0: This client was evicted by ddn_data-OST000f; in progress operations using this service will fail.

The OSS that mounts the OST from the above message has the following line in its log:

LustreError: 0:0:(ldlm_lockd.c:305:waiting_locks_callback()) ### lock callback timer expired after 101s: evicting client at 10.143@tcp ns: filter-ddn_data-OST000f_UUID lock: 81021a84ba00/0x744b1dd4481e38b2 lrc: 3/0,0 mode: PR/PR res: 34959884/0 rrc: 2 type: EXT [0-18446744073709551615] (req 0-18446744073709551615) flags: 0x20 remote: 0x1d34b900a905375d expref: 9 pid: 1506 timeout 8374258376

Can you please check your logs for similar messages? Best regards, Wojciech

On 22 July 2010 23:43, Andreas Dilger andreas.dil...@oracle.com wrote:

On 2010-07-22, at 14:59, Richard Lefebvre wrote:

I have a problem with the scalable molecular dynamics software NAMD. It writes restart files once in a while, but sometimes the binary write crashes. When it crashes is not constant. The only constant is that it happens when it writes to our Lustre file system; when it writes to something else, it is fine. I can't seem to find any errors in any of the /var/log/messages. Has anyone had any problems with NAMD?

Rarely has anyone complained about Lustre not providing error messages when there is a problem, so if there is nothing in /var/log/messages on either the client or the server, then it is hard to know whether it is a Lustre problem or not... If possible, you could try running the application under strace (limited to the I/O calls, or it would be much too much data) to see which system call the error is coming from. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.
Re: [Lustre-discuss] NFS Export Issues (RESOLVED)
After installing the OS, the first thing I do is turn off SELinux; it seems to serve no purpose here and often causes a lot of trouble...

On Wed, Jul 21, 2010 at 10:24 PM, William Olson lustre_ad...@reachone.com wrote:

When it comes to inexplicable permission problems, have you checked whether SELinux is turned off on the NFS server?

I knew if I was patient somebody would point out the simple answer and make me look like an idiot!! hahahaha THANK YOU!! So, I set SELinux into permissive mode, adjusted iptables (it wasn't part of the original problem, but I didn't save my rules before rebooting) and guess what?.. It works. :) YAY! I think my sysadmin badge needs to be revoked for a day...

Regards,
Daniel.

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
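For reference, a quick sketch of switching SELinux to permissive mode as Daniel did; these are the standard commands on RHEL-family systems:

    # Switch to permissive mode immediately (lasts until reboot):
    setenforce 0

    # Confirm the current mode:
    getenforce

    # Make it persistent across reboots by editing /etc/selinux/config:
    sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config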
Re: [Lustre-discuss] Permanently delete OST
https://bugzilla.lustre.org/show_bug.cgi?id=18329

According to the dev team it is resolved and closed.

Regards,
Heiko
___

That shows the version being 1.6.6, but also that it was resolved on 1/22/10, which is very recent. This bug must be in the 1.8 tree as well (as that is my current version) and was not noticed? Would this fix automatically be applied to the 1.8 tree procedurally, or does a separate bug have to be reported? Also, if it has been fixed in 1.8, is there a scheduled release? Thanks; it will be nice to be able to tell what disk capacity is available.

Larry

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
Re: [Lustre-discuss] Permanently delete OST
https://bugzilla.lustre.org/show_bug.cgi?id=18329

According to the dev team it is resolved and closed.

Regards,
Heiko

By the way, I got hold of the source code and the patch listed for that bug in Heiko's message. I applied the changes to the lfs.c file, and now I no longer get the error code. That patch does work, but it has not been applied to 1.8.1.1 in the released RPMs. There is still no option for permanent removal, but at least lfs df works again.

Larry

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
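For anyone wanting to reproduce Larry's fix, a sketch of applying the patch from bug 18329 to the source tree and rebuilding the userspace tools; the patch file name is a placeholder and configure flags may differ between releases:

    # Assumes the 1.8.1.1 source is unpacked and the Bugzilla attachment
    # has been saved as lfs-statfs.patch (name is hypothetical):
    cd lustre-1.8.1.1
    patch -p0 < ../lfs-statfs.patch   # adjust the -p level to the patch paths

    # Build only the userspace utilities, then install the new lfs:
    ./configure --disable-modules
    make -C lustre/utils
    cp lustre/utils/lfs /usr/sbin/lfs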
Re: [Lustre-discuss] Permanently delete OST
On Tue, 2010-01-26 at 11:29 -0500, Brian J. Murrell wrote:

Yes, I looked at that bug yesterday. I don't see anything in there that provides any sort of --perm argument to completely purge an OST from the configuration.

b.

What is the latest on this? Also, after a system reboot, at the point where the first permanently inactive OST would be listed in lfs df, the output stops with the line:

error: llapi_obd_statfs failed: Bad address (-14)

This looks like the bug mentioned earlier against the 1.6.6 version of Lustre. I am running 1.8.1.1.

Larry

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
[Lustre-discuss] Permanently delete OST
For some reason this eludes my searches through the archives. I keep seeing documentation on how to deactivate an OST and copy files off it. The manual states that permanent removal of the OST can be accomplished with:

    lctl conf_param <OST name>.osc.active=0

I have run this, and in the proc info I now see that active is set to 0. However, the OST still exists in proc, and when running lfs df it shows up there as an inactive device. I want it removed from existence. The OST no longer physically exists, yet I am haunted by its persistence on the MGS/MDT.

TIA,
Larry

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
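For concreteness, here is the deactivation step from the manual with a hypothetical filesystem name (lustre) and OST index filled in; as this thread establishes, it marks the OST inactive but does not purge it from the configuration:

    # Run on the MGS; permanently marks the OST inactive for all clients:
    lctl conf_param lustre-OST000f.osc.active=0

    # The OST should now appear as inactive in the listing:
    lfs df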
Re: [Lustre-discuss] sata_mv or mv_sata: which is better?
Yes Brock, as Mike has mentioned, we also took this doc and provided it for our TACC customer: http://www.tacc.utexas.edu/resources/hpcsystems/ where we put 72 x4500s in place with this configuration. In addition, Sun's recent Linux HPC Software http://www.sun.com/software/products/hpcsoftware/index.xml includes the mv_sata driver and the software configurations needed to put the x4500 together as an OSS, which one can build upon and further configure with the SW RAID patches also included.

HTH

Mike Berg wrote on 08/07/08 11:11:

Brock,

It is recommended that mv_sata be used on the x4500. It has been a while since I built this up myself, and a few Lustre releases back, but I do understand the pain. I would hope that with Lustre 1.6.5.1 on RHEL 4.5 you can just build mv_sata against the provided Lustre kernel, alias it accordingly in modprobe.conf, create a new initrd, and then update grub. Unfortunately I don't have gear handy to give it a try. Please let me know your experiences if you pursue this.

Enclosed is a somewhat dated document on what we have found to be the best configuration of the x4500 for use with Lustre. Ignore the N1SM parts. We optimized for performance and RAS, with some sacrifices on capacity. Hopefully this is a useful reference.

Regards,
Mike Berg
Sr. Lustre Solutions Engineer
Sun Microsystems, Inc.
Office/Fax: (303) 547-3491
E-mail: [EMAIL PROTECTED]

On Aug 6, 2008, at 1:48 PM, Brock Palen wrote:

Is it still worth the effort to try to build mv_sata when working with an x4500? sata_mv from RHEL4 does not appear to show some of the stability problems discussed online before. I ask because the build system Sun provides with the driver does not play nicely with the Lustre kernel source packaging. If it is worth all the pain, have others already figured it out? Any help would be appreciated.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
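A rough sketch of the steps Mike describes for building mv_sata against the Lustre kernel; the driver source location, kernel version string, and module destination are all assumptions for illustration:

    # Build the Marvell driver against the Lustre-patched kernel source
    # (paths and version are placeholders for your own build):
    cd /usr/src/mv_sata
    make KERNEL_SRC=/usr/src/kernels/2.6.9-67.0.22.EL_lustre.1.6.5.1smp

    # Install the module and refresh module dependencies:
    cp mv_sata.ko /lib/modules/2.6.9-67.0.22.EL_lustre.1.6.5.1smp/extra/
    depmod -a 2.6.9-67.0.22.EL_lustre.1.6.5.1smp

    # Alias the controller in /etc/modprobe.conf, for example:
    #   alias scsi_hostadapter mv_sata

    # Rebuild the initrd so the module loads at boot, then update grub:
    mkinitrd -f /boot/initrd-2.6.9-67.0.22.EL_lustre.1.6.5.1smp.img \
        2.6.9-67.0.22.EL_lustre.1.6.5.1smp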
Re: [Lustre-discuss] Lustre Solution Delivery
Brennan,

One needs a SAM-QFS Linux client on a given node that is also a Lustre client. There can be multiples of these within a cluster. These clients can then move (copy) data between SAM and Lustre. This has been put in place at a number of sites for Sun customers, such as DKRZ in Germany.

This is a very basic solution today, and we would like to see tighter integration between the Lustre and SAM development efforts to make this a more robust offering, along with SAM software support for newer Linux kernels. It is my understanding that this is underway. For the time being, however, it will require the aforementioned type of client.

Larry

Peter Bojanic wrote on 03/26/08 15:48:

Hi Brennan,

Larry McIntosh of our Linux HPC team can advise you regarding our Lustre/SAM integration options.

Cheers,
Bojanic

On 26-Mar-08, at 17:47, Brennan [EMAIL PROTECTED] wrote:

What is the process for integrating a Lustre+SAMFS solution into an existing customer environment? The plan is to have CRS build the Lustre component, but Lustre and SAMFS will need to be configured and integrated into the customer computing environment. I am very familiar with the SAMFS integration, but not Lustre integration. Do we have resources in PS to provide the integration? Is this done by the CFS organization? Also, a small-scale benchmark of the solution may be required. Which benchmark center could provide Lustre support?

Thanks,
Jim Brennan
Digital Media Systems
Sun Systems Group
Universal City, CA
(310) 901-86777

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
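As a rough illustration of the dual-client data-mover node Larry describes (both mount points below are hypothetical):

    # On a node mounting both SAM-QFS (/sam) and Lustre (/lustre),
    # staging data from the archive into Lustre is an ordinary copy:
    cp -a /sam/project/dataset /lustre/scratch/project/

    # Results can be copied back to SAM-QFS, where the archiver
    # picks them up for migration to tape:
    cp -a /lustre/scratch/project/results /sam/project/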