Re: [lustre-discuss] online LFSCK - detailed output?

2017-03-02 Thread Dilger, Andreas
On Mar 2, 2017, at 17:51, Yong, Fan wrote:
> 
> Hi Steve,
> 
> The online LFSCK produces two main kinds of output: one is the LFSCK progress
> and statistics, the other is the LFSCK logs describing which inconsistencies
> were found and fixed.
> 
> The former depends on the LFSCK type and can be found via:
> 1. namespace LFSCK: "lctl get_param -n mdd.${MDT_DEV}.lfsck_namespace"
> 2. layout LFSCK: "lctl get_param -n obdfilter.${DEV}.lfsck_layout"
> 3. oi_scrub: "lctl get_param -n osd-ldiskfs.${MDT_DEV}.oi_scrub"
> 
> The latter must be dumped on the relevant server (MDT and/or OST) via
> "lctl dk" with lfsck debug enabled (lctl set_param debug=+lfsck).

It is also possible to print LFSCK messages directly to the console with:

   lctl set_param printk=+lfsck


> Cheers,
> Nasf
> -----Original Message-----
> From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On 
> Behalf Of Steve Barnet
> Sent: Friday, March 3, 2017 2:28 AM
> To: lustre-discuss@lists.lustre.org
> Subject: [lustre-discuss] online LFSCK - detailed output?
> 
> Hi all,
> 
>   I have a filesystem that definitely has some consistency errors (very long 
> tail of woe). I'm planning to run the online LFSCK to find and fix these 
> errors. At the moment, I'm just doing a dry run to get a feel for how much 
> corruption it's finding.
> 
>   One thing that would be very interesting would be more detailed output 
> about corrective actions that *would be* taken. The traditional filesystem 
> fsck provides this sort of information with the right flags. You can also 
> capture the output of a live run which is occasionally very helpful.
> 
>   I've been poking around the documentation, and I don't see that it is 
> possible to do this for the online check. Is there any option (possibly 
> undocumented :-) ) for doing this? We're running the 2.8.0 community edition.
> 
> TIA!
> 
> Best,
> 
> ---Steve
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation


Re: [lustre-discuss] error destroying object rc 115

2017-03-02 Thread Scott Shaw
Hi,
We are running into an issue with one OST that is logging errors when
destroying objects, and I am not familiar with the process for addressing it.
I have migrated the objects on this OST and redistributed them to other OSTs
so we don't lose any user data. Below is the messages file output from the OSS
server hosting the OST002d target, showing the errors.

/var/log/messages from the OSS server:
Mar  2 22:07:29 oss4 kernel: LustreError: 12887:0:(ofd_dev.c:1872:ofd_destroy_hdl()) sgilfs-OST002d: error destroying object [0x1002d:0x41799d:0x0]: -115
Mar  2 22:07:29 oss4 kernel: LustreError: 12887:0:(ofd_dev.c:1872:ofd_destroy_hdl()) Skipped 117 previous similar messages
Mar  2 22:08:11 oss4 kernel: Lustre: sgilfs-OST002d-osd: FID [0x1002d:0x3fa284:0x0] != self_fid [0x1002d:0x2d1e07:0x0]
Mar  2 22:08:11 oss4 kernel: Lustre: sgilfs-OST002d-osd: FID [0x1002d:0x41e662:0x0] != self_fid [0x1002d:0x2d1915:0x0]
Mar  2 22:08:11 oss4 kernel: Lustre: sgilfs-OST002d-o: trigger OI scrub by RPC for [0x1002d:0x41e662:0x0], rc = 0 [1]
Mar  2 22:08:53 oss4 kernel: Lustre: sgilfs-OST002d-osd: FID [0x1002d:0x3fa284:0x0] != self_fid [0x1002d:0x2d1e07:0x0]
Mar  2 22:08:53 oss4 kernel: Lustre: sgilfs-OST002d-osd: FID [0x1002d:0x41e662:0x0] != self_fid [0x1002d:0x2d1915:0x0]
Mar  2 22:08:53 oss4 kernel: Lustre: sgilfs-OST002d-o: trigger OI scrub by RPC for [0x1002d:0x41e662:0x0], rc = 0 [1]
...

{these errors keep cycling in the messages file}
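
For anyone wanting to summarize a long run of these messages, here is a
minimal Python sketch that pulls the object FIDs and return codes out of log
text like the excerpt above. The regular expressions and function names are my
own assumptions based only on the lines shown here, not an official Lustre log
format. On Linux, errno 115 is EINPROGRESS, which is why the constant is named
that way.

```python
import re
from collections import Counter

# Assumption: Linux errno numbering, where 115 is EINPROGRESS.
LINUX_EINPROGRESS = 115

# Patterns are guesses derived from the log excerpt in this message.
DESTROY_RE = re.compile(
    r"(?P<target>\S+): error destroying object \[(?P<fid>[^\]]+)\]: (?P<rc>-?\d+)"
)
MISMATCH_RE = re.compile(
    r"FID \[(?P<fid>[^\]]+)\] != self_fid \[(?P<self_fid>[^\]]+)\]"
)

SAMPLE = """\
Mar  2 22:07:29 oss4 kernel: LustreError:
12887:0:(ofd_dev.c:1872:ofd_destroy_hdl()) sgilfs-OST002d: error destroying
object [0x1002d:0x41799d:0x0]: -115
Mar  2 22:08:11 oss4 kernel: Lustre: sgilfs-OST002d-osd: FID
[0x1002d:0x3fa284:0x0] != self_fid [0x1002d:0x2d1e07:0x0]
Mar  2 22:08:11 oss4 kernel: Lustre: sgilfs-OST002d-osd: FID
[0x1002d:0x41e662:0x0] != self_fid [0x1002d:0x2d1915:0x0]
Mar  2 22:08:53 oss4 kernel: Lustre: sgilfs-OST002d-osd: FID
[0x1002d:0x3fa284:0x0] != self_fid [0x1002d:0x2d1e07:0x0]
Mar  2 22:08:53 oss4 kernel: Lustre: sgilfs-OST002d-osd: FID
[0x1002d:0x41e662:0x0] != self_fid [0x1002d:0x2d1915:0x0]
"""


def normalize(text):
    """Collapse all whitespace so mail-wrapped log lines match again."""
    return " ".join(text.split())


def destroy_errors(text):
    """Return (target, fid, rc) for each 'error destroying object' message."""
    return [
        (m.group("target"), m.group("fid"), int(m.group("rc")))
        for m in DESTROY_RE.finditer(normalize(text))
    ]


def fid_mismatches(text):
    """Count how often each (FID, self_fid) mismatch pair is reported."""
    return Counter(
        (m.group("fid"), m.group("self_fid"))
        for m in MISMATCH_RE.finditer(normalize(text))
    )


if __name__ == "__main__":
    for target, fid, rc in destroy_errors(SAMPLE):
        note = " (EINPROGRESS on Linux)" if -rc == LINUX_EINPROGRESS else ""
        print(f"{target}: destroy failed for [{fid}], rc={rc}{note}")
    for (fid, self_fid), count in fid_mismatches(SAMPLE).items():
        print(f"[{fid}] != [{self_fid}] seen {count} times")
```

Counting the mismatch pairs makes it easier to see whether the same few
objects are reported on every cycle, as they are in the excerpt.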

From the MDS server I initiated an LFSCK with the "-o" option so the MDT
database would sync with the OST objects, but oi_scrub shows no errors and the
OST oi_scrub cycles from completed to scanning.

I read the available documentation, but I am not clear on how to determine
which object oi_scrub is having an issue with. Can someone help and provide
steps to fix this issue?

UUID                   1K-blocks      Used   Available Use% Mounted on
sgilfs-OST002d_UUID  15283534752  20228068 14470350844   0% /gbc-lustre[OST:45]

If I create a large 50GB sequential file I can see the used capacity increase,
but if I remove the file the used capacity does not decrease by 50GB.

On the MDS server oi_scrub shows no errors. 
sgilfs-MDT]# cat oi_scrub
name: OI_scrub
magic: 0x4c5fd252
oi_files: 64
status: completed
flags:
param:
time_since_last_completed: 21180 seconds
time_since_latest_start: 22628 seconds
time_since_last_checkpoint: 21180 seconds
latest_start_position: 137533015
last_checkpoint_position: 288358401
first_failure_position: N/A
checked: 77862246
updated: 0
failed: 0
prior_updated: 0
noscrub: 40241
igif: 1
success_count: 4
run_time: 2066 seconds
average_speed: 37687 objects/sec
real-time_speed: N/A
current_position: N/A
lf_scanned: 0
lf_reparied: 0
lf_failed: 0

oss4 sgilfs-OST002d]# cat oi_scrub
name: OI_scrub
magic: 0x4c5fd252
oi_files: 64
status: completed
flags:
param:
time_since_last_completed: 344 seconds
time_since_latest_start: 346 seconds
time_since_last_checkpoint: 344 seconds
latest_start_position: 12
last_checkpoint_position: 29868033
first_failure_position: N/A
checked: 809524
updated: 0
failed: 0
prior_updated: 0
noscrub: 0
igif: 1
success_count: 2846929
run_time: 2 seconds
average_speed: 404762 objects/sec
real-time_speed: N/A
current_position: N/A
lf_scanned: 0
lf_reparied: 0
lf_failed: 0
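
To compare the MDT and OST outputs above field by field, a small Python sketch
like the following can parse the "key: value" lines into a dictionary. The
function names are hypothetical and the sample is a subset of the OST output
above. Reading a very high success_count next to a start only a few minutes
old as a sign that the scrub keeps being re-triggered is my assumption, not
something the output states.

```python
OST_SAMPLE = """\
name: OI_scrub
status: completed
flags:
time_since_last_completed: 344 seconds
time_since_latest_start: 346 seconds
checked: 809524
updated: 0
failed: 0
success_count: 2846929
run_time: 2 seconds
real-time_speed: N/A
"""


def parse_oi_scrub(text):
    """Parse 'key: value' lines from oi_scrub output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def leading_int(value):
    """Return the leading integer of values like '346 seconds', else None."""
    parts = value.split()
    if parts and parts[0].isdigit():
        return int(parts[0])
    return None


if __name__ == "__main__":
    fields = parse_oi_scrub(OST_SAMPLE)
    for key in ("status", "success_count", "run_time", "time_since_latest_start"):
        print(f"{key}: {fields[key]}")
```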


e2fsck returns no errors on the target device.

We're currently using the following releases. 
Lustre release 2.7.16.11
IEEL release 3.0.1.4
OS: RHEL 7.2
e2fsprogs release 1.42.13.wc5


Any help would be greatly appreciated!

Thanks,
Scott Shaw



Re: [lustre-discuss] online LFSCK - detailed output?

2017-03-02 Thread Yong, Fan
Hi Steve,

The online LFSCK produces two main kinds of output: one is the LFSCK progress
and statistics, the other is the LFSCK logs describing which inconsistencies
were found and fixed.

The former depends on the LFSCK type and can be found via:
1. namespace LFSCK: "lctl get_param -n mdd.${MDT_DEV}.lfsck_namespace"
2. layout LFSCK: "lctl get_param -n obdfilter.${DEV}.lfsck_layout"
3. oi_scrub: "lctl get_param -n osd-ldiskfs.${MDT_DEV}.oi_scrub"

The latter must be dumped on the relevant server (MDT and/or OST) via "lctl
dk" with lfsck debug enabled (lctl set_param debug=+lfsck).
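
As a rough sketch of the steps above, the Python helper below only assembles
the lctl command lines named in this message; it does not run them. The
function names are hypothetical, and the device names testfs-MDT0000 and
testfs-OST0000 are made-up placeholders for ${MDT_DEV} and ${DEV}.

```python
def lfsck_status_commands(mdt_dev, dev):
    """Return the status command for each LFSCK type, as quoted above."""
    return {
        "namespace": f"lctl get_param -n mdd.{mdt_dev}.lfsck_namespace",
        "layout": f"lctl get_param -n obdfilter.{dev}.lfsck_layout",
        "oi_scrub": f"lctl get_param -n osd-ldiskfs.{mdt_dev}.oi_scrub",
    }


def lfsck_log_commands():
    """Return the commands for enabling lfsck debug and dumping the log."""
    return ["lctl set_param debug=+lfsck", "lctl dk"]


if __name__ == "__main__":
    # Placeholder device names; substitute the real targets on each server.
    commands = lfsck_status_commands("testfs-MDT0000", "testfs-OST0000")
    for kind, command in commands.items():
        print(f"{kind}: {command}")
    for command in lfsck_log_commands():
        print(command)
```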

Cheers,
Nasf
-----Original Message-----
From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf 
Of Steve Barnet
Sent: Friday, March 3, 2017 2:28 AM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] online LFSCK - detailed output?

Hi all,

   I have a filesystem that definitely has some consistency errors (very long 
tail of woe). I'm planning to run the online LFSCK to find and fix these 
errors. At the moment, I'm just doing a dry run to get a feel for how much 
corruption it's finding.

   One thing that would be very interesting would be more detailed output about 
corrective actions that *would be* taken. The traditional filesystem fsck 
provides this sort of information with the right flags. You can also capture the 
output of a live run which is occasionally very helpful.

   I've been poking around the documentation, and I don't see that it is 
possible to do this for the online check. Is there any option (possibly 
undocumented :-) ) for doing this? We're running the 2.8.0 community edition.

TIA!

Best,

---Steve


Re: [lustre-discuss] Chunk of file -> LNET node

2017-03-02 Thread Dilger, Andreas
On Mar 2, 2017, at 12:31, François Tessier wrote:
> 
> Hello,
> 
> Correct me if I'm wrong: when a file is created on a Lustre fs, a set of
> OSTs (depending on the stripe count) is assigned.

... a set of OST objects is assigned.

> It means that the chunks of file (of size stripe_size) will be distributed
> among these OSTs. To each OST corresponds a set of LNET nodes.

I'd say "Each OST is hosted by an OSS node".

> From an application point of view, when the file is effectively written, the
> chunks are sent to the OST(s) through the corresponding set of LNET nodes.

s/LNET/OSS/ yes.

> My questions are:
> 
> - How to know (if possible using the Lustre API), for each chunk, what
> is the corresponding LNET node?

After the fact this is relatively straightforward.  You can use the FIEMAP
ioctl (via the "filefrag" utility from Lustre e2fsprogs) running on any client
to report exactly the placement of each byte of the file on each OST.

In advance of actual file IO (or also after the fact), the formula for each
file is basically:

fetch file layout via llapi_layout_get_by_path() or similar
stripe_index = (logical file offset / stripe_size) % stripe_count
OST index = llapi_layout_ost_index_get(layout, stripe_index)
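
The arithmetic above can be sketched in Python as follows. This is only an
illustration of the formula: the list ost_indices is a hypothetical stand-in
for what llapi_layout_ost_index_get() would report for each stripe, the
function names are made up, and object_offset() assumes plain round-robin
(RAID-0) striping.

```python
MIB = 1024 * 1024


def stripe_index(offset, stripe_size, stripe_count):
    """Stripe number (0..stripe_count-1) holding a logical file offset."""
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0, stripe_size and stripe_count > 0")
    return (offset // stripe_size) % stripe_count


def object_offset(offset, stripe_size, stripe_count):
    """Offset inside the OST object, assuming plain RAID-0 striping."""
    chunk = offset // stripe_size              # which stripe_size chunk of the file
    full_rounds = chunk // stripe_count        # complete passes over all stripes
    return full_rounds * stripe_size + offset % stripe_size


def ost_for_offset(offset, stripe_size, ost_indices):
    """Map a file offset to an OST index using a per-stripe OST list."""
    return ost_indices[stripe_index(offset, stripe_size, len(ost_indices))]


if __name__ == "__main__":
    # Hypothetical layout: 1 MiB stripes over four OSTs with these indices.
    ost_indices = [12, 45, 3, 7]
    offset = 5 * MIB + 10
    print("stripe:", stripe_index(offset, MIB, len(ost_indices)))
    print("OST index:", ost_for_offset(offset, MIB, ost_indices))
    print("object offset:", object_offset(offset, MIB, len(ost_indices)))
```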

> - Is this distribution decided at file creation? In other words, is this
> distribution based only on offsets in file?

Yes, round-robin (RAID-0) striping is currently the only form of file layout,
and the OST object allocation is done when the file is first opened.  The
OST object used is round-robin based only on file offset, as shown above.
It is possible to "change" the layout of a file after it was written using the
"lfs migrate" command, though this is essentially rewriting the file content
after the fact to map to new objects/OSTs as requested.

We are also working on two new features: Progressive File Layouts (PFL) for the
Lustre 2.10 release (see
http://wiki.lustre.org/images/1/1a/Progressive-File-Layouts_Hammond.pdf ) and
Data-on-MDT (DoM) for 2.11 (see
http://wiki.lustre.org/images/8/8f/LUG2014-DataOnMDT-Pershin.pdf ).
These will allow each file's layout to have different segments based on the
file offset, so that it is possible to have a different stripe count, stripe
size, and even different classes of storage based on the file offset (e.g. SSD
for the first 1MB index, HDD for the rest of the file).

This will allow a great deal of flexibility for file layouts if
applications/libraries need it, and will improve "out of the box" performance
for users that don't want to deal with the details.

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation


[lustre-discuss] Chunk of file -> LNET node

2017-03-02 Thread François Tessier
Hello,

Correct me if I'm wrong: when a file is created on a Lustre fs, a set of
OSTs (depending on the stripe count) is assigned. It means that the
chunks of file (of size stripe_size) will be distributed among these
OSTs. To each OST corresponds a set of LNET nodes. From an application
point of view, when the file is effectively written, the chunks are sent
to the OST(s) through the corresponding set of LNET nodes.

My questions are:

- How to know (if possible using the Lustre API), for each chunk, what
is the corresponding LNET node?

- Is this distribution decided at file creation? In other words, is this
distribution based only on offsets in file?

Thanks,

François

-- 
--
François TESSIER, Ph.D.
Postdoctoral Appointee
Argonne National Laboratory
LCF Division - Bldg 240, 4E 19
Tel : +1 (630)-252-5068
http://www.francoistessier.info



[lustre-discuss] online LFSCK - detailed output?

2017-03-02 Thread Steve Barnet

Hi all,

  I have a filesystem that definitely has some consistency
errors (very long tail of woe). I'm planning to run the
online LFSCK to find and fix these errors. At the moment,
I'm just doing a dry run to get a feel for how much corruption
it's finding.

  One thing that would be very interesting would be more detailed
output about corrective actions that *would be* taken. The
traditional filesystem fsck provides this sort of information
with the right flags. You can also capture the output of a live
run which is occasionally very helpful.

  I've been poking around the documentation, and I don't see
that it is possible to do this for the online check. Is there
any option (possibly undocumented :-) ) for doing this? We're
running the 2.8.0 community edition.

TIA!

Best,

---Steve


[lustre-discuss] Lustre Usage Survey: Input Requested

2017-03-02 Thread OpenSFS Administration
Dear Lustre Community,

The OpenSFS Lustre Working Group has launched the sixth survey for 
organizations using Lustre. We are looking for trends in Lustre usage to assist 
with future planning on releases and will present the results at LUG.

Please complete this short survey (https://www.surveymonkey.com/r/GB89BM2) to 
make sure your organization's voice is heard!

Responses to the survey are due by March 31st. Note that all questions are 
optional, so it is ok to submit a partially completed survey if you prefer not 
to disclose some information.

Best regards,

OpenSFS Administration

OpenSFS Administration
3855 SW 153rd Drive Beaverton, OR 97003 USA
Phone: +1 503-619-0561 | Fax: +1 503-644-6708
Twitter: @OpenSFS
Email: ad...@opensfs.org | Website: www.opensfs.org





[lustre-discuss] seclabel

2017-03-02 Thread Robin Humble
Hiya,

I'm updating an image for a root-on-lustre cluster from centos6 to 7
and I've hit a little snag. I can't seem to mount lustre so that it
understands seclabel, i.e. setcap/getcap don't work. the upshot is that
root can use ping (and a few other tools), but users can't.

any idea what I'm doing wrong?

from what little I understand about it I think seclabel is a form of
xattr.

cheers,
robin