[Lustre-discuss] how to add force_over_8tb to MDS

2011-07-14 Thread Theodore Omtzigt
I configured a Lustre file system on a collection of storage servers that have 12TB raw devices. I configured a combined MGS/MDS with the default configuration. On the OSTs however I added the force_over_8tb to the mountfsoptions. Two part question: 1- do I need to set that parameter on the

Re: [Lustre-discuss] how to add force_over_8tb to MDS

2011-07-14 Thread Michael Barnes
On Jul 14, 2011, at 1:15 PM, Theodore Omtzigt wrote:
> Two part question: 1- do I need to set that parameter on the MGS/MDS server as well
No, they are different filesystems. You shouldn't need to do this on the OSTs either. You must be using an older Lustre release.
> 2- if yes, how do I

Re: [Lustre-discuss] how to add force_over_8tb to MDS

2011-07-14 Thread Andreas Dilger
If you are seeing this problem it means you are using the ext3-based ldiskfs. Go back to the download site and get the lustre-ldiskfs and lustre-modules RPMs with ext4 in the name. That is the code that was tested with LUNs over 8TB. We kept these separate for some time to reduce risk for

[Lustre-discuss] potential issue with data corruption

2011-07-14 Thread Lisa Giacchetti
Hi, We are seeing a problem where some running jobs attempted to copy a file from local disk on a worker node to a Lustre file system. 14 of those files ended up empty or truncated. We have 7 OSSs with either 6 or 12 OSTs on each. All 14 files ended up being on an OST on one of the two

Re: [Lustre-discuss] how to add force_over_8tb to MDS

2011-07-14 Thread Theodore Omtzigt
Andreas: Thanks for taking a look at this. Unfortunately, I don't quite understand the guidance you present: If you are seeing 'this' problem. I haven't seen 'any' problems pertaining to 8tb yet, so I cannot place your guidance in the context of the question I posted. My question was

[Lustre-discuss] Packaged kerberized VM client image Re: Migrating virtual machines over Lustre using Proxmox

2011-07-14 Thread Josephine Palencia
Hi Paul, I wanted to signify our interest in your project as we have something similar and related. As part of the OSG ExTENCI project, we've set up a kerberized lustre fs that uses virtual (VM) lustre clients in remote sites. With proper network tuning/route analysis, we observe that it is

Re: [Lustre-discuss] how to add force_over_8tb to MDS

2011-07-14 Thread Theodore Omtzigt
Michael: The reason I had to do it on the OSTs is that, when I issued the mkfs.lustre command to build the OST, it errored out with the message that I should use the force_over_8tb mount option. I was not able to create an OST on that device without the force_over_8tb option. Your

Re: [Lustre-discuss] potential issue with data corruption

2011-07-14 Thread Lisa Giacchetti
I am running 1.8.3 on servers and clients. lisa On 7/14/11 12:59 PM, Lisa Giacchetti wrote: Hi, We are seeing a problem where some running jobs attempted to copy a file from local disk on a worker node to a lustre file system. 14 of those files ended up empty or truncated. We have 7 OSSs

Re: [Lustre-discuss] how to add force_over_8tb to MDS

2011-07-14 Thread Cliff White
--writeconf will erase parameters set via lctl conf_param, and will erase pools definitions. It will also allow you to set rather silly parameters that can prevent your filesystem from starting, such as incorrect server NIDs or incorrect failover NIDs. For this reason (and from a history of

Re: [Lustre-discuss] how to add force_over_8tb to MDS

2011-07-14 Thread Cliff White
This error message you are seeing is what Andreas was talking about - you must use the ext4-based version, as you will not need any option with LUNs of your size. The 'must use force_over_8tb' error is the key here: you most certainly want/need the *.ext4.rpm versions of stuff. cliffw On Thu, Jul

[Lustre-discuss] LNET o2ib networking and MTU

2011-07-14 Thread Adesanya, Adeyemi
Just need some clarification on this: We use the o2ib driver for Lustre IB communication. We also use IPoIB to define IP addresses for the IB interfaces in the network. Does the MTU configuration parameter impact Lustre in any way? My understanding is that LNET is only using IPoIB for

Re: [Lustre-discuss] potential issue with data corruption

2011-07-14 Thread Oleg Drokin
Hello! On Jul 14, 2011, at 1:59 PM, Lisa Giacchetti wrote: Jul 7 07:10:08 cmsls6 kernel: Lustre: 15431:0:(ldlm_lib.c:575:target_handle_reconnect()) cmsprod1-OST002d: c03badd9-c242-1507-6824-3a9648c8b21f reconnecting Jul 7 07:59:42 cmsls6 kernel: Lustre:

Re: [Lustre-discuss] potential issue with data corruption

2011-07-14 Thread Lisa Giacchetti
Oleg, thanks for your response. See my responses inline. lisa On 7/14/11 2:47 PM, Oleg Drokin wrote: Hello! On Jul 14, 2011, at 1:59 PM, Lisa Giacchetti wrote: Jul 7 07:10:08 cmsls6 kernel: Lustre: 15431:0:(ldlm_lib.c:575:target_handle_reconnect()) cmsprod1-OST002d:

Re: [Lustre-discuss] potential issue with data corruption

2011-07-14 Thread Oleg Drokin
Hello! On Jul 14, 2011, at 3:55 PM, Lisa Giacchetti wrote: Jul 7 07:10:08 cmsls6 kernel: Lustre: 15431:0:(ldlm_lib.c:575:target_handle_reconnect()) cmsprod1-OST002d: c03badd9-c242-1507-6824-3a9648c8b21f reconnecting Some of these errors seem really bad - like the bulk IO comm error or the

Re: [Lustre-discuss] potential issue with data corruption

2011-07-14 Thread Lisa Giacchetti
On 7/14/11 3:05 PM, Oleg Drokin wrote: Hello! On Jul 14, 2011, at 3:55 PM, Lisa Giacchetti wrote: Jul 7 07:10:08 cmsls6 kernel: Lustre: 15431:0:(ldlm_lib.c:575:target_handle_reconnect()) cmsprod1-OST002d: c03badd9-c242-1507-6824-3a9648c8b21f reconnecting Some of these errors seem really bad

Re: [Lustre-discuss] how to add force_over_8tb to MDS

2011-07-14 Thread Kevin Van Maren
With one other note: you should have used --mkfsoptions='-t ext4' when doing mkfs.lustre, and NOT the force option. Given that it is already formatted and you don't want to lose data, at least use the ext4 Lustre RPMs. Pretty sure you don't need a --writeconf -- you would either run as-is with
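
For readers following this thread, here is a minimal sketch of the kind of mkfs.lustre invocation the note above describes, for the 1.8-era releases under discussion. It only assembles and prints the command line; it does not run anything. The device path, filesystem name, and MGS NID are hypothetical placeholders, and the helper name build_ost_mkfs_command is invented for illustration. The only option taken from the thread is --mkfsoptions='-t ext4'.

```python
import shlex


def build_ost_mkfs_command(device, fsname, mgs_nid):
    """Return the mkfs.lustre argument list for an ext4-backed OST.

    The command is only assembled here, never executed. All three
    arguments are placeholders supplied by the caller.
    """
    return [
        "mkfs.lustre",
        "--ost",
        f"--fsname={fsname}",
        f"--mgsnode={mgs_nid}",
        # Select the ext4-based ldiskfs instead of forcing the
        # ext3-based one past 8TB with force_over_8tb.
        "--mkfsoptions=-t ext4",
        device,
    ]


# Hypothetical values: substitute your own device, fsname, and MGS NID.
cmd = build_ost_mkfs_command("/dev/sdX", "testfs", "10.0.0.1@tcp")
print(shlex.join(cmd))
```

Printing the command before running it makes it easy to confirm that force_over_8tb is absent and that the target device is the intended one, since formatting is destructive.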

Re: [Lustre-discuss] New wc-discuss Lustre Mailing List

2011-07-14 Thread Isaac Huang
On Tue, Jul 12, 2011 at 02:12:38PM -0700, Peter Jones wrote: Isaac, if you (or anyone else for that matter) are having trouble joining the group, let me know privately at pjo...@whamcloud.com which email address you would like to use and I will add you manually. Thanks Peter, I got an

Re: [Lustre-discuss] LNET o2ib networking and MTU

2011-07-14 Thread Isaac Huang
On Thu, Jul 14, 2011 at 12:43:32PM -0700, Adesanya, Adeyemi wrote: Just need some clarification on this: We use the o2ib driver for Lustre IB communication. We also use IPoIB to define IP addresses for the IB interfaces in the network. Does the MTU configuration parameter impact Lustre