Re: [Lustre-discuss] recovering formatted OST

2010-10-26 Thread Bernd Schubert
Hello Wojciech,

I think both would work, but why not just create a small OST with 
mkfs.lustre on a loopback device, and then copy those files over to your 
recovered filesystem? 
Hmm, well, e2fsck might not have fixed all issues, in which case a reformat 
might indeed be helpful.

Also note: EAs on OST objects are nice to have, but not absolutely required. 

Cheers,
Bernd


On Tuesday, October 26, 2010, Wojciech Turek wrote:
 Bernd, I would like to clarify whether I understood your suggestion correctly:
 
 1) create a new OST but using old index and old label
 2) mount it as ldiskfs and copy recovered objects (using tar or rsync with
 xattrs support) from the old OST to the new OST
 3) run --writeconf on MDT and OST of that filesystem
 4) mount MDT and all OSTs
 
 
 I guess I could also do it this way:
 
 1) back up the restored objects using tar or rsync with xattrs support
 2) format the old OST with the old index and old label
 3) restore the objects from the backup
 
 Do you think that would work?
 
 Best regards,
 
 Wojciech
 
 On 22 October 2010 18:52, Bernd Schubert bernd.schub...@fastmail.fm wrote:
  Hmm, I would probably format a small fake device on a ramdisk, copy the
  files over, run tunefs.lustre --writeconf on the MDT and then start
  everything (including all OSTs) again.
  
  
  Cheers,
  
  On Friday, October 22, 2010, Wojciech Turek wrote:
   I have tried Bernd's suggestion and it seems to have worked: after
   running e2fsck -D, ll_recover_lost_found_objs didn't cause a kernel
   panic but moved a number of objects to the O directory. The problem is
   that I do not have a last_rcvd file, so the OST has no index at the
   moment. What would be the next step to enable access to those files in
   the filesystem?
   
   Best regards,
   
   Wojciech
   
   On 22 October 2010 17:15, Andreas Dilger andreas.dil...@oracle.com wrote:
    On 2010-10-22, at 5:42, Bernd Schubert bernd.schub...@fastmail.fm wrote:
     Hmm, e2fsck didn't catch that? rec_len is the length of a directory
     entry, so after how many bytes the next entry follows.
    
    I agree that e2fsck should have caught that.
    
     You can try to force e2fsck to do something about that: e2fsck -D
    
    No, I would recommend against using -D at this point. That will cause it
    to re-write the directory contents, and given that the filesystem was
    previously corrupted I would prefer making as few changes as possible
    before the data is estranged.
    
    Wojciech,
    note that if you are able to mount the filesystem you could just copy all
    of the objects (with xattrs!) from lost+found on the bad filesystem,
    along with the last_rcvd file (if you can find it) into a new ldiskfs
    filesystem and then run ll_recover_lost_found_objs on that.

     On Friday, October 22, 2010, Wojciech Turek wrote:
      OK, removing and recreating the journal fixed that problem and I am
      able to mount the device as an ldiskfs filesystem. Now I hit another
      wall when trying to run ll_recover_lost_found_objs.
      The first time I run ll_recover_lost_found_objs -d /mnt/ost/lost+found
      it only creates the O dir and exits. When I repeat the command, the
      kernel panics. Any idea what could be the problem here?
 
 
      LDISKFS-fs error (device dm-4): ldiskfs_readdir: bad entry in
      directory #6831: rec_len is smaller than minimal - offset=0, inode=0,
      rec_len=0, name_len=0
      Aborting journal on device dm-4.
      Unable to handle kernel NULL pointer dereference at
      RIP:
       [88033448] :jbd:journal_commit_transaction+0xc5b/0x12db
      PGD 1a118d067 PUD 1ce7e7067 PMD 0
      Oops: 0002 [1] SMP
      last sysfs file: /class/infiniband_mad/umad0/port
      CPU 3
      Modules linked in: ldiskfs(U) crc16(U) autofs4(U) hidp(U) l2cap(U)
      bluetooth(U) rdma_ucm(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U)
      ipoib_helper(U) ib_cm(U) ipv6(U) xfrm_nalgo(U) crypto_api(U)
      ib_uverbs(U) ib_umad(U) mlx4_vnic(U) mlx4_vnic_helper(U) ib_sa(U)
      ib_mthca(U) mptctl(U) dm_mirror(U) video(U) backlight(U) sbs(U)
      power_meter(U) hwmon(U) i2c_ec(U) i2c_core(U) dell_wmi(U) wmi(U)
      button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U)
      parport_pc(U) lp(U) parport(U) sr_mod(U) cdrom(U) mlx4_ib(U) ib_mad(U)
      ib_core(U) joydev(U) mlx4_core(U) usb_storage(U) pcspkr(U) shpchp(U)
      serio_raw(U) i5000_edac(U) edac_mc(U) dm_raid45(U) dm_message(U)
      dm_region_hash(U) dm_log(U) dm_mod(U) dm_mem_cache(U) nfs(U) lockd(U)
      fscache(U) nfs_acl(U) sunrpc(U) mptsas(U) mptscsih(U) mptbase(U)
      scsi_transport_sas(U) mppVhba(U) megaraid_sas(U) mppUpper(U) sg(U)
      sd_mod(U) scsi_mod(U) bnx2(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U)
      ehci_hcd(U)
      Pid: 11360, comm: 

Re: [Lustre-discuss] recovering formatted OST

2010-10-26 Thread Lisa Giacchetti

Wojciech,
 since you have successfully done step #4, can you tell me what you used 
in the reformat for the old index id?
 I tried to do this a few weeks ago and was not successful at reformatting 
an OST with the old index because I am not clear on what the index is. I 
asked on this list at that time for input and did not get much.

 If you could provide the exact command you used that would be good too.

lisa


On 10/26/10 10:31 AM, Wojciech Turek wrote:
Since some of our users started to recover their data from backups or 
by other means (rerunning jobs etc.) into the original locations, I 
don't think it would be a good idea to put the recovered OST back in 
service as it is, as that may cause some of the users' new files to be 
overwritten by the recovered files.


To avoid that scenario I decided to reformat the old OST and put it 
back into the filesystem as empty.

1) First I have created a backup of the recovered object files
2) then using lfs find and lfs getstripe on the client  I created a 
list of files and object ids from the formatted OST
3) using backup from point 1 and information from point 2 I copied 
objects to a new location on the filesystem and renamed them to their 
original name. Now users can interrogate those files and choose which 
they want to keep.

4) I reformatted old OST with old index id and old label




Before I mount that OST into filesystem I want to make sure that MDS 
detects it as empty OST and does not try to recreate missing objects. 
Would it be enough to remove lov_objid from MDT and let it create new 
lov_objid based on information from OSTs, or do I need to first unlink 
all missing files from the client?


Best regards,

Wojciech


Re: [Lustre-discuss] recovering formatted OST

2010-10-26 Thread Bernd Schubert
On Tuesday, October 26, 2010, Wojciech Turek wrote:
 Hi,
 
 There is a LAST_ID file on the OST and indeed it equals the highest object
 number:
 
 [r...@oss09 ~]# od -Ax -td8 /tmp/LAST_ID
 00  2490599
 08
 
 [r...@oss09 ~]# ls -1s /mnt/ost/O/0/d* | grep -v [a-z] | sort -k2 -n | tail -1
   8 2490599
 
 However, the MDS seems to think differently:
 
 [r...@mds03 ~]# lctl get_param osc.*.prealloc_last_id | grep OST0010
 osc.scratch2-OST0010-osc.prealloc_last_id=1

Yeah.

 
 Is this caused by deactivating the OST on the MDS? I deactivated the OST
 on the MDS using this command:
 
 lctl --device 19 conf_param scratch2-OST0010.osc.active=0
 
 I looked into lov_objid reported by the MDS but I am not sure how to
 interpret the output correctly
 [r...@mds03 ~]# od -Ax -td8 /tmp/lov_objid
 00  2073842  2100049
 10  2115247  2038471
 20  2119821  2190996
 30  2029234  2354424
 40  2160856  2167105
 50  1970351  2059045
 60  2706486  2571655
 70  2662262  2628346
 80  2490688  2668926
 90  2631587  2643791
 a0
 
 So my question is how I can find out if my LAST_ID is fine?

Above you deactivated OST0010 (hex), so OST index 16 in decimal (counting 
starts with zero). With 8 bytes per entry, that is the value at offset 0x80 
in lov_objid, so it should be 2490688 then.
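
The lookup can be sketched in shell. This is only a minimal sketch: it assumes, as the od output above suggests, that lov_objid is a flat array of 8-byte integers indexed by OST index, and that the copy is read on a host with the same byte order it was written with. The function names are made up for illustration and are not Lustre tools.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a copy of lov_objid.

# "OST0010" or "scratch2-OST0010" -> 16
# (the digits after "OST" are the index in hexadecimal).
ost_index() {
    printf '%d\n' "0x${1##*OST}"
}

# Byte offset of an OST's entry, assuming 8 bytes per entry.
lov_objid_offset() {
    echo $(( $1 * 8 ))
}

# Print the entry for OST index $2 from the lov_objid copy in file $1.
lov_objid_entry() {
    od -An -td8 -j "$(lov_objid_offset "$2")" -N 8 "$1" | tr -d ' \n'
    echo
}
```

With the dump shown above, `lov_objid_entry /tmp/lov_objid "$(ost_index OST0010)"` should read the entry at offset 0x80.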

I still wonder if we could convince e2fsck to set that last_id value on the 
OST itself. It can already correct a wrong last_id value, but it sets it 
to the last_id it finds on disk 
(https://bugzilla.lustre.org/show_bug.cgi?id=22734). Setting it to the MDS 
value should also work, but firstly, for sanity reasons, it falls back to the 
on-disk value if the values differ too much (1), and secondly, I figured out 
with those patches that using the MDS value is broken (it did not get 
broken by my patches, but my patches revealed it...). 

Cheers,
Bernd


-- 
Bernd Schubert
DataDirect Networks
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] recovering formatted OST

2010-10-26 Thread Bernd Schubert
Hello Lisa,

The OST index and the fsname identify the OST to the MGS, MDS and clients. 

If you reformat an OST and do not re-use the old index, it will leave a 
hole, as the new OST gets another index. OST holes are an uncommon 
scenario, which often triggers some bugs...
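
As a rough illustration of how the index shows up in practice: the OST label has the form fsname-OSTxxxx, with the index as four hex digits (compare scratch2-OST0010 elsewhere in this thread). The sketch below only prints a candidate mkfs.lustre command and does not run it. The helper names, the device path and the MGS NID are placeholders, and the options should be checked against the mkfs.lustre man page of the installed version.

```shell
#!/bin/sh
# Label for an OST: <fsname>-OST<index as 4 hex digits>.
ost_label() {
    printf '%s-OST%04x\n' "$1" "$2"
}

# Print (do not run) a reformat command that reuses the old index.
# Arguments: fsname, decimal index, MGS NID, block device (all examples).
print_reformat_cmd() {
    echo "mkfs.lustre --ost --reformat --fsname=$1 --index=$2 --mgsnode=$3 $4"
}

ost_label scratch2 16
print_reformat_cmd scratch2 16 10.0.0.1@o2ib /dev/sdX
```

Printing the command first makes it easy to compare the resulting label with the one the MDS already knows before anything is formatted.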


Cheers,
Bernd


Re: [Lustre-discuss] system disk with external journals for OSTs formatted

2010-10-26 Thread Wojciech Turek
Hi Alex,

So if I understand you correctly, you have accidentally destroyed your
external journals, so it seems that your OSTs are missing their journals.
Maybe the fix will be to recreate the journals on the OSTs.

regards,

Wojciech

On 26 October 2010 20:42, Alexander Bugl alexander.b...@zmaw.de wrote:

 Hi,

 we had an accident with a Sun Fire X4540 Thor System with 48 HDDs:

 The first two disks, sda and sdb, contain several partitions: one for the /
 file system, one for swap (not used) and 5 small partitions used as external
 journals for the OSTs, which reside on the 46 other HDDs.

 [r...@soss10 ~]# fdisk -l /dev/sda /dev/sdb

 Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
 255 heads, 63 sectors/track, 121601 cylinders
 Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot      Start         End      Blocks   Id  System
 /dev/sda1   *           1        6527    52428096   fd  Linux raid autodetect
 /dev/sda2            6528       10704    33551752+  fd  Linux raid autodetect
 /dev/sda3           10705      121601   890780152+   5  Extended
 /dev/sda5           10705       10953         261   fd  Linux raid autodetect
 /dev/sda6           10954       11202         261   fd  Linux raid autodetect
 /dev/sda7           11203       11451         261   fd  Linux raid autodetect
 /dev/sda8           11452       11700         261   fd  Linux raid autodetect
 /dev/sda9           11701       11949         261   fd  Linux raid autodetect

 Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
 255 heads, 63 sectors/track, 121601 cylinders
 Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot      Start         End      Blocks   Id  System
 /dev/sdb1   *           1        6527    52428096   fd  Linux raid autodetect
 /dev/sdb2            6528       10704    33551752+  fd  Linux raid autodetect
 /dev/sdb3           10705      121601   890780152+   5  Extended
 /dev/sdb5           10705       10953         261   fd  Linux raid autodetect
 /dev/sdb6           10954       11202         261   fd  Linux raid autodetect
 /dev/sdb7           11203       11451         261   fd  Linux raid autodetect
 /dev/sdb8           11452       11700         261   fd  Linux raid autodetect
 /dev/sdb9           11701       11949         261   fd  Linux raid autodetect

 The md devices are:
 md14 : active raid6 sdw[0] sdav[9] sdan[8] sdaf[7] sdx[6] sdp[5] sdh[4] sdau[3] sdam[2] sdae[1]
       7814099968 blocks level 6, 64k chunk, algorithm 2 [10/10] [UUUUUUUUUU]

 md13 : active raid6 sdak[0] sdo[9] sdg[8] sdat[7] sdal[6] sdad[5] sdv[4] sdn[3] sdf[2] sdas[1]
       7814099968 blocks level 6, 64k chunk, algorithm 2 [10/10] [UUUUUUUUUU]

 md12 : active raid6 sdd[0] sdac[9] sdu[8] sdm[7] sde[6] sdar[5] sdaj[4] sdab[3] sdt[2] sdl[1]
       7814099968 blocks level 6, 64k chunk, algorithm 2 [10/10] [UUUUUUUUUU]

 md11 : active raid6 sdah[0] sdaq[7] sdai[6] sdaa[5] sds[4] sdk[3] sdc[2] sdap[1]
       5860574976 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

 md10 : active raid6 sdi[0] sdz[7] sdao[6] sdag[5] sdy[4] sdr[3] sdq[2] sdj[1]
       5860574976 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

 md1 : active raid1 sdb2[1] sda2[0]
  33551680 blocks [2/2] [UU]

 md20 : active raid1 sdb5[1] sda5[0]
  136 blocks [2/2] [UU]

 md21 : active raid1 sdb6[1] sda6[0]
  136 blocks [2/2] [UU]

 md22 : active raid1 sdb7[1] sda7[0]
  136 blocks [2/2] [UU]

 md23 : active raid1 sdb8[1] sda8[0]
  136 blocks [2/2] [UU]

 md24 : active raid1 sdb9[1] sda9[0]
  136 blocks [2/2] [UU]

 md0 : active raid1 sdb1[1] sda1[0]
  52428032 blocks [2/2] [UU]

 The original OSTs had been created using a command like:
 mkfs.lustre --ost --fsname=${FSNAME} --mgsnode=${mgsno...@o2ib \
--reformat --mkfsoptions=-m 0 -J device=/dev/md20 \
--param ost.quota_type=ug /dev/md10 
 (the pairs md21/md11, md22/md12, ..., respectively)

 Accidentally we started a fresh installation, which could not be aborted
 fast enough -- the partition information on sda and sdb was erased.
 The other 46 disks should not have been harmed, though.

 We started a reinstallation which only formatted the first 2 partitions and
 which recreated the partition layout on sda and sdb; all of the md devices
 resynced without problems.

 When we now try to mount any of the 5 OSTs, we get the following error:

 [r...@soss10 ~]# mount /dev/md14
 mount.lustre: mount /dev/md14 at /lustre/ost4 failed: Invalid argument
 This may have multiple causes.
 Are the mount options correct?
 Check the syslog for more info.

 syslog says:
 Oct 26 21:34:55 soss10 kernel: LDISKFS-fs error (device md14):
 ldiskfs_check_descriptors: Block bitmap for group 1920 not in group (block
 268482810)!
 Oct 26 21:34:55 soss10 kernel: LDISKFS-fs: group descriptors corrupted!
 Oct 26 21:34:55 soss10 kernel: LustreError: 10719:0:
 (obd_mount.c:1292:server_kernel_mount()) premount /dev/md14:0x0 ldiskfs
 failed: -22, ldiskfs2 failed: -19.  Is the 

Re: [Lustre-discuss] finding clients that is opening/closing files

2010-10-26 Thread Brock Palen
This was very helpful, I found the culprit. 

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Oct 26, 2010, at 3:42 PM, Wojciech Turek wrote:

 One way is to check the /proc/fs/lustre/mds/*/exports/*/stats files, which 
 contain per-client statistics. They can be cleared by writing 0 to the 
 file; after that, look for the files that accumulate lots of operations.
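
A minimal sketch of that check in shell, ranking clients by their open counter. It assumes each per-export stats file contains a line whose first field is "open" and whose second field is the count, and that the export directory name is the client NID. Both are assumptions about the file layout, and the function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: list "<open count> <client NID>" for every export,
# busiest client first. $1 overrides the root directory (useful for testing).
top_openers() {
    root="${1:-/proc/fs/lustre/mds}"
    for f in "$root"/*/exports/*/stats; do
        [ -r "$f" ] || continue
        nid=$(basename "$(dirname "$f")")
        # Second field of the "open" line is taken to be the request count.
        count=$(awk '$1 == "open" { print $2 }' "$f")
        echo "${count:-0} $nid"
    done | sort -rn
}
```

Clearing the counters first (writing 0 to each stats file, as described above) and running `top_openers | head` a minute later should show which clients are generating the load.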
 
 
 On 26 October 2010 20:10, Brock Palen bro...@umich.edu wrote:
 I have what I think is a badly behaving user; look at
 /proc/fs/lustre/mds/nobackup-MDT/stats
 
 The open/close counters are running at about 1000/s.
 
 I would like to track down which clients this is coming from and knock the 
 users about fixing their code that is doing this.
 
 How does one look at 'stats by node'? Do I need to look at all clients, or 
 can I get this information from the MDS?
 Thanks!
 
 Brock Palen
 www.umich.edu/~brockp
 Center for Advanced Computing
 bro...@umich.edu
 (734)936-1985
 
 
 
 
 
 
 -- 
 Wojciech Turek
 
 Senior System Architect
 
 High Performance Computing Service
 University of Cambridge
 Email: wj...@cam.ac.uk
 Tel: (+)44 1223 763517 



Re: [Lustre-discuss] finding clients that is opening/closing files

2010-10-26 Thread Joshua Walgenbach

You could check /proc/fs/lustre/mds/client-MDT/exports/NID/stats on 
your MDS.

That contains per client stats for the MDS.

-Josh



[Lustre-discuss] How does lustre deal with different types of media (ssd, sata, tape)

2010-10-26 Thread 3piece
I would like to make a SAN implementing Lustre but want to know how it
deals with different kinds of storage media such as SSD, SATA disks and
tape (or possibly cloud storage)?


-- 
Best Regards,
James Moyle
+84 (0)983 117 302
m...@3piece.asia


Re: [Lustre-discuss] How does lustre deal with different types of media (ssd, sata, tape)

2010-10-26 Thread Brian J. Murrell
On Wed, 2010-10-27 at 02:26 +0200, 3piece wrote:
 I would like to make a SAN implementing lustre but want to know how it
 deals with different kinds of storage media such as SSD, SATA disks and
 Tape(Or possibly cloud storage)?

I'm not sure I understand exactly what you are trying to do, but as far as
storage is concerned, as long as whatever storage you want to use for
your Lustre filesystem is abstracted as a block device on Linux, Lustre
can use it.

b.




Re: [Lustre-discuss] How does lustre deal with different types of media (ssd, sata, tape)

2010-10-26 Thread Wojciech Turek
Lustre allows you to group OSTs (and so the block devices behind them) into
pools. This feature enables the Lustre administrator to create policies for
file striping, for example based on the storage media type.
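
A rough sketch of what setting up such a pool could look like. The helper only prints the commands and does not run them. The filesystem name, pool name, directory and OST names are examples, and the exact syntax should be checked against the lctl and lfs documentation for the installed Lustre version.

```shell
#!/bin/sh
# Print (do not run) the commands to define a pool and stripe a directory on it.
# Arguments: fsname, pool name, directory, then one or more OST names.
print_pool_cmds() {
    fsname=$1; pool=$2; dir=$3
    shift 3
    echo "lctl pool_new $fsname.$pool"
    for ost in "$@"; do
        echo "lctl pool_add $fsname.$pool $ost"
    done
    echo "lfs setstripe --pool $pool $dir"
}

print_pool_cmds scratch2 ssd /mnt/scratch2/fast scratch2-OST0000 scratch2-OST0001
```

New files created under the directory would then be allocated on the OSTs in that pool, so one pool per media type gives a simple placement policy.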



Re: [Lustre-discuss] recovering formatted OST

2010-10-26 Thread Andreas Dilger
On 2010-10-27, at 03:27, Wojciech Turek wrote:
 After the recovery the OST has around 95000 objects left but LAST_ID is set 
 to 2490599 which is the highest object number left on that OST
 
 What is worrying me now is that the old OST's LAST_ID value is quite high

The OST ID values are sequential and are only used once, so the LAST_ID value 
being higher than the number of existing objects is totally normal.

 [r...@mds03 ~]# lctl get_param osc.*.prealloc_last_id | grep OST0010
 osc.ddn_data-OST0010-osc-8101dc723c00.prealloc_last_id=1

This is the client filesystem mount OSC, so the value here is irrelevant.

 osc.scratch2-OST0010-osc.prealloc_last_id=2490631

This is the MDS OSC, and it looks correct.

 Is this going to affect the operation of that OST or is this OK  and OST will 
 carry on from that number with no problems?

Yes, it appears that it is working correctly.
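
The comparison made in this thread, between the last object id the MDS has recorded for the OST and the LAST_ID stored on the OST, comes down to simple arithmetic. A minimal sketch with made-up function names follows. The limit used by the MDS sanity check is not stated reliably in this thread, so it is left as a parameter.

```shell
#!/bin/sh
# Gap between the MDS value (from lov_objid) and the OST's LAST_ID.
last_id_gap() {
    echo $(( $1 - $2 ))
}

# Succeeds if the gap does not exceed a caller-supplied limit ($3).
gap_ok() {
    [ "$(last_id_gap "$1" "$2")" -le "$3" ]
}

# Values quoted in this thread: lov_objid entry and on-disk LAST_ID.
last_id_gap 2490688 2490599
```

For the values quoted in the thread this prints 89, the difference Wojciech calculated.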

 On 26 October 2010 19:55, Wojciech Turek wj...@cam.ac.uk wrote:
 In that case LAST_ID seems to be fine, as the OST shows 2490599 and the MDT 
 shows 2490688, so the difference is 89. I don't understand why you said that 
 the difference is over 10.
 
 
  [r...@oss09 ~]# od -Ax -td8 /tmp/LAST_ID
   00  2490599
   08
 
  [r...@mds03 ~]# od -Ax -td8 /tmp/lov_objid
   00  2073842  2100049
   10  2115247  2038471
   20  2119821  2190996
   30  2029234  2354424
   40  2160856  2167105
   50  1970351  2059045
   60  2706486  2571655
   70  2662262  2628346
   80  2490688  2668926
   90  2631587  2643791
   a0
 
 What I don't understand is why lctl reports last_id=1 for that OST 
 
 lctl get_param osc.*.prealloc_last_id | grep OST0010
 osc.scratch2-OST0010-osc.prealloc_last_id=1
 
 Unless this is because that OST is deactivated on the MDT ? 
 
 On 26 October 2010 19:49, Bernd Schubert bs_li...@aakef.fastmail.fm wrote:
 That is the value in the lov_objid.
 
 Cheers,
 Bernd
 
 On Tuesday, October 26, 2010, Wojciech Turek wrote:
  I can not find where MDT stores that LAST_ID value for the OST?
 
  On 26 October 2010 19:10, Bernd Schubert bs_li...@aakef.fastmail.fm wrote:
   I think the difference is quite huge (over 10 files). But the MDS has
   a sanity check and will refuse to activate this OST if the difference
   is larger than 2 files.
  
   So one way or the other you need to correct it (either increase the
   LAST_ID value on the OST or on the MDS).
  
  
   Cheers,
   Bernd
  
   On Tuesday, October 26, 2010, Wojciech Turek wrote:
    OK, I have created a filesystem on a loopback device. I mounted it as
    ldiskfs and copied the CONFIGS directory back to my old OST. Now
    tunefs.lustre returns the correct info.
    
    The last_id on the OST is smaller than the number in the MDT lov_objid,
    which is good.
    
    Can I ignore this?
    lctl get_param osc.*.prealloc_last_id | grep OST0010
    osc.scratch2-OST0010-osc.prealloc_last_id=1
    
    I guess when I restart the whole filesystem after writeconf the MDT
    should correct that?
   
best regards,
   
Wojciech
   
    On 26 October 2010 18:05, Bernd Schubert bs_li...@aakef.fastmail.fm wrote:
     Hello Wojciech,
    
     tunefs.lustre has to complain as the files are missing. If you copy
     over the files from the loopback device (yes, same index and label),
     tunefs.lustre should work.

 Cheers,
 Bernd

 On Tuesday, October 26, 2010, Wojciech Turek wrote:
  Hi Bernd,
 
      I am not quite clear how creating a new OST on a loopback device
      would help. Shall I create a new OST on a loopback device, formatting
      it with the old index and label, then copy the recovered objects to
      that OST and mount it in the filesystem?
     
      I think I need to reformat the old OST before mounting it as a lustre
      type filesystem: although fsck recovered some objects (and I can
      access them by mounting the OST as ldiskfs), if you run tunefs.lustre
      on that OST device, tunefs.lustre complains that it doesn't find any
      lustre filesystem.
     
      As for the EAs, I have created a backup of the recovered objects
      preserving EAs.
 
  Best regards,
 
  Wojciech
 

[Lustre-discuss] Lustre Windows native client?

2010-10-26 Thread Germán Latorre

Hello,

Here 
(http://wiki.lustre.org/index.php/Windows_Native_Client#Download_and_Support) 
it says that alpha and beta releases are available via the early access 
program.  How could we access those releases?


Thanks a million in advance,

Germán.