[lustre-discuss] FID used by two objects

2017-07-17 Thread wanglu
Hello, 

One OST of our system cannot be mounted in Lustre mode after a severe disk
error and a five-day e2fsck. Here are the errors we got during the mount
operation:
# grep FID /var/log/messages
Jul 17 20:15:21 oss04 kernel: LustreError: 
13089:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID 
[0x20005:0x1:0x0] is used by two objects: 86/3303188178 48085/1708371613
Jul 17 20:38:41 oss04 kernel: LustreError: 
13988:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID 
[0x20005:0x1:0x0] is used by two objects: 86/3303188178 48086/3830163079
Jul 17 20:49:55 oss04 kernel: LustreError: 
14221:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID 
[0x20005:0x1:0x0] is used by two objects: 86/3303188178 48087/538285899
Jul 18 11:39:25 oss04 kernel: LustreError: 
31071:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID 
[0x20005:0x1:0x0] is used by two objects: 86/3303188178 48088/2468309129
Jul 18 11:39:56 oss04 kernel: LustreError: 
31170:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID 
[0x20005:0x1:0x0] is used by two objects: 86/3303188178 48089/2021195118
Jul 18 12:04:31 oss04 kernel: LustreError: 
32127:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID 
[0x20005:0x1:0x0] is used by two objects: 86/3303188178 48090/956682248

The mount operation then fails with error -17:
Jul 18 12:04:31 oss04 kernel: LustreError: 
32127:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID 
[0x20005:0x1:0x0] is used by two objects: 86/3303188178 48090/956682248
Jul 18 12:04:31 oss04 kernel: LustreError: 
32127:0:(qsd_lib.c:418:qsd_qtype_init()) lustre-OST0036: can't open slave index 
copy [0x20006:0x2:0x0] -17
Jul 18 12:04:31 oss04 kernel: LustreError: 
32127:0:(obd_mount_server.c:1723:server_fill_super()) Unable to start targets: 
-17
Jul 18 12:04:31 oss04 kernel: Lustre: Failing over lustre-OST0036
Jul 18 12:04:32 oss04 kernel: Lustre: server umount lustre-OST0036 complete

If we run e2fsck again, it reports that inode 480xx is unattached with a
reference count of 2, and reconnects it to lost+found:
# e2fsck -f /dev/sdn 
e2fsck 1.42.12.wc1 (15-Sep-2014)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 48090
Connect to /lost+found? yes
Inode 48090 ref count is 2, should be 1.  Fix? yes
Pass 5: Checking group summary information

lustre-OST0036: * FILE SYSTEM WAS MODIFIED *
lustre-OST0036: 238443/549322752 files (4.4% non-contiguous), 
1737885841/2197287936 blocks

Is it possible to find the file corresponding to 86/3303188178 and delete it?
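
In ldiskfs mode, I guess debugfs (from e2fsprogs) could be used to check
inode 86, something like:

# debugfs -R "stat <86>" /dev/sdn
# debugfs -R "ncheck 86" /dev/sdn

(stat <86> should show the inode's generation number to compare with
3303188178, and ncheck 86 should print the path name(s) pointing to inode 86.)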

P.S.
   1. In ldiskfs mode, most of the files on the disk are OK to read, while
      some of them fail to read.
   2. There are about 240,000 objects in the OST:
      [root@oss04 d0]# df -i /lustre/ostc
      Filesystem        Inodes  IUsed     IFree IUse% Mounted on
      /dev/sdn       549322752 238443 549084309    1% /lustre/ostc
   3. Lustre version 2.5.3; e2fsprogs version 1.42.12.wc1 (see the e2fsck
      output above).

 Thank You!


Computing Center, the Institute of High Energy Physics, CAS, China
Wang Lu                        Tel: (+86) 10 8823 6087
P.O. Box 918-7                 Fax: (+86) 10 8823 6839
Beijing 100049, P.R. China     Email: lu.w...@ihep.ac.cn
===


Re: [lustre-discuss] Lustre 2.10.0 ZFS version

2017-07-17 Thread Cowe, Malcolm J
The further complication is that the Lustre kmod packages, including 
kmod-zfs-osd, are compiled against the “lustre-patched” kernel 
(3.10.0-514.21.1.el7_lustre.x86_64), rather than the unpatched OS distribution 
kernel that the ZoL packages are no doubt compiled against. The move to 
patchless kernels for LDISKFS (which more or less works today, provided you 
don’t need project quotas) will further simplify binary distribution of the 
Lustre modules.
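
For what it's worth, the ksym mismatch Götz describes below can usually be
confirmed by comparing the kernel symbol checksums the packages require and
provide, roughly:

rpm -q --requires kmod-lustre-osd-zfs | grep ksym
rpm -q --provides kmod-zfs | grep ksym

Any required ksym without a matching provided checksum is what blocks the
install.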

Malcolm.

On 18/7/17, 9:01 am, "lustre-discuss on behalf of Dilger, Andreas" wrote:

To be clear - we do not _currently_ build the Lustre RPMs against a binary 
RPM from ZoL, but rather build our own ZFS RPM packages, then build the Lustre 
RPMs against those packages.  This was done because ZoL didn't provide binary 
RPM packages when we started using ZFS, and we are currently not able to ship 
the binary RPM packages ourselves.

We are planning to change the Lustre build process to use the ZoL 
pre-packaged binary RPMs for Lustre 2.11, so that the binary RPM packages we 
build can be used together with the ZFS RPMs installed by end users.  If that 
change is not too intrusive, we will also try to backport this to b2_10 for a 
2.10.x maintenance release.

Cheers, Andreas

On Jul 17, 2017, at 10:42, Götz Waschk  wrote:
> 
> Hi Peter,
> 
> I wasn't able to install the official binary build of
> kmod-lustre-osd-zfs, even with kmod-zfs-0.6.5.9-1.el7_3.centos from
> from zfsonlinux.org, the ksym deps do not match. For me, it is always
> rebuilding the lustre source rpm against the zfs kmod packages.
> 
> Regards, Götz Waschk
> 
> On Mon, Jul 17, 2017 at 2:39 PM, Jones, Peter A wrote:
>> 0.6.5.9 according to lustre/Changelog. We have tested with pre-release
>> versions of 0.7 during the release cycle too if that’s what you’re wondering.
>> 
>> 
>> 
>> 
>> On 7/17/17, 1:55 AM, "lustre-discuss on behalf of Götz Waschk" wrote:
>> 
>>> Hi everyone,
>>> 
>>> which version of kmod-zfs was the official Lustre 2.10.0 binary
>>> release for CentOS 7.3 built against?
>>> 
>>> Regards, Götz Waschk

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation









Re: [lustre-discuss] Lustre 2.10.0 ZFS version

2017-07-17 Thread Dilger, Andreas
To be clear - we do not _currently_ build the Lustre RPMs against a binary RPM 
from ZoL, but rather build our own ZFS RPM packages, then build the Lustre RPMs 
against those packages.  This was done because ZoL didn't provide binary RPM 
packages when we started using ZFS, and we are currently not able to ship the 
binary RPM packages ourselves.

We are planning to change the Lustre build process to use the ZoL pre-packaged 
binary RPMs for Lustre 2.11, so that the binary RPM packages we build can be 
used together with the ZFS RPMs installed by end users.  If that change is not 
too intrusive, we will also try to backport this to b2_10 for a 2.10.x 
maintenance release.

Cheers, Andreas

On Jul 17, 2017, at 10:42, Götz Waschk  wrote:
> 
> Hi Peter,
> 
> I wasn't able to install the official binary build of
> kmod-lustre-osd-zfs, even with kmod-zfs-0.6.5.9-1.el7_3.centos from
> from zfsonlinux.org, the ksym deps do not match. For me, it is always
> rebuilding the lustre source rpm against the zfs kmod packages.
> 
> Regards, Götz Waschk
> 
> On Mon, Jul 17, 2017 at 2:39 PM, Jones, Peter A wrote:
>> 0.6.5.9 according to lustre/Changelog. We have tested with pre-release 
>> versions of 0.7 during the release cycle too if that’s what you’re wondering.
>> 
>> 
>> 
>> 
>> On 7/17/17, 1:55 AM, "lustre-discuss on behalf of Götz Waschk"
>> <goetz.was...@gmail.com> wrote:
>> 
>>> Hi everyone,
>>> 
>>> which version of kmod-zfs was the official Lustre 2.10.0 binary
>>> release for CentOS 7.3 built against?
>>> 
>>> Regards, Götz Waschk

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation









Re: [lustre-discuss] set OSTs read only ?

2017-07-17 Thread Bob Ball

Thanks for all this, Andreas.  Always appreciated.

bob

On 7/17/2017 12:00 AM, Dilger, Andreas wrote:
When you write "MGS", you really mean "MDS". The MGS would be the 
place for this if you were changing the config to permanently 
deactivate the OSTs via "lctl conf_param". To temporarily do this, the 
commands should be run on the MDS via "lctl set_param".  In most cases 
the MDS and MGS are co-located, so the distinction is irrelevant, but 
good to get it right for the record.
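
For example (fsname and index below are placeholders, not your actual names):

# permanent, on the MGS (written to the configuration log):
lctl conf_param <fsname>-OST0036.osc.active=0

# temporary, on the MDS (lost on the next MDS restart):
lctl set_param osp.<fsname>-OST0036-osc-MDT0000.active=0
# or, equivalently, the older form:
lctl --device <fsname>-OST0036-osc-MDT0000 deactivate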


The problem of objects not being unlinked until after the MDS is 
restarted has been fixed.


Also, with 2.9 and later it is possible to use "lctl set_param 
osp..create_count=0" to stop new file allocation on that OST 
without blocking unlinks at all, which is best for emptying old OSTs, 
rather than using "deactivate".
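
Something like the following on the MDS (the target name is a placeholder;
LUDOC-305, quoted later in this thread, documents the parameter as
max_create_count):

# stop precreation of new objects on this OST:
lctl set_param osp.<fsname>-OST0036-osc-MDT0000.max_create_count=0
# check the original value first with lctl get_param, and restore it
# (20000 is the usual default) when the OST should take new files again:
lctl set_param osp.<fsname>-OST0036-osc-MDT0000.max_create_count=20000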


As for marking the OSTs read-only, neither of these solutions will 
prevent clients from modifying the OST filesystems, only from creating 
new files (assuming all OSTs are set this way).


You might consider trying "mount -o remount,ro" on the MDT and OST 
filesystems on the servers to see if this works (I haven't tested this 
myself). The problem might be that this prevents new clients from 
mounting.
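
Roughly, on each server (mount points here are only examples):

mount -o remount,ro /mnt/mdt
mount -o remount,ro /mnt/ost0036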


It probably makes sense to add server-side read-only mounting as a 
feature. Could you please file a ticket in Jira about this?


Cheers, Andreas

On Jul 16, 2017, at 09:16, Bob Ball wrote:


I agree with Raj.  Also, I have noted with Lustre 2.7 that the space 
is not actually freed after re-activation of the OST until the mgs 
is restarted.  I don't recall the reason for this, or whether it was 
fixed in later Lustre versions.


Remember, this is done on the mgs, not on the clients.  If you do it 
on a client, the behavior is as you thought.


bob

On 7/16/2017 11:10 AM, Raj wrote:


No. Deactivating an OST will not allow new objects (files) to be created. 
But clients can read AND modify existing objects (append to the 
file). Also, it will not free any space from deleted objects until 
the OST is activated again.



On Sun, Jul 16, 2017, 9:29 AM E.S. Rosenberg wrote:


On Thu, Jul 13, 2017 at 5:49 AM, Bob Ball wrote:

On the mgs/mdt do something like:
lctl --device -OST0019-osc-MDT deactivate

No further files will be assigned to that OST.  Reverse with
"activate".  Or reboot the mgs/mdt as this is not
persistent.  "lctl dl" will tell you exactly what that
device name should be for you.

Doesn't that also disable reads from the OST though?


bob


On 7/12/2017 6:04 PM, Alexander I Kulyavtsev wrote:

You may find advice from Andreas on this list (also
attached below). I did not try setting fail_loc myself.

In 2.9 there is the setting osp.*.max_create_count=0, described
in LUDOC-305.

We used to set the OST degraded as described in the Lustre manual.
It works most of the time, but at some point I saw Lustre errors
in the logs for some ops. Sorry, I do not recall the details.
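
The degraded flag from the manual is set per OST on the OSS, roughly
like this (placeholder target name):

lctl set_param obdfilter.<fsname>-OST0036.degraded=1
# and cleared again afterwards with:
lctl set_param obdfilter.<fsname>-OST0036.degraded=0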

I am still not sure either of these approaches will work for
you: setting the OST degraded or setting fail_loc will make some
OSTs be selected instead of others.
You may want to verify whether these settings trigger a clean
error on the user side (instead of blocking) when all OSTs are
degraded.

The other, and also simpler, approach would be to enable Lustre
quotas and set the quota below the used space for all users (or
groups).

Alex.


*From: *"Dilger, Andreas" >
*Subject: **Re: [lustre-discuss] lustre 2.5.3 ost not
draining*
*Date: *July 28, 2015 at 11:51:38 PM CDT
*Cc: *"lustre-discuss@lists.lustre.org
"
>

Setting it degraded means the MDS will avoid allocations
on that OST
unless there aren't enough OSTs to meet the request (e.g.
stripe_count =
-1), so it should work.

That is actually a very interesting workaround for this
problem, and it
will work for older versions of Lustre as well.  It
doesn't disable the
OST completely, which is fine if you are doing space
balancing (and may
even be desirable to allow apps that need more bandwidth
for a widely
striped file), but it isn't good if you are trying to
empty the OST
completely to remove it.

It looks like another approach would be to mark the OST as
having no free
space using OBD_FAIL_OST_ENOINO (0x229) fault injection on
that OST:

  lctl set_param fail_loc=0x229 

Re: [lustre-discuss] Lustre 2.10.0 ZFS version

2017-07-17 Thread Jones, Peter A
0.6.5.9 according to lustre/Changelog. We have tested with pre-release versions 
of 0.7 during the release cycle too if that’s what you’re wondering.




On 7/17/17, 1:55 AM, "lustre-discuss on behalf of Götz Waschk" wrote:

>Hi everyone,
>
>which version of kmod-zfs was the official Lustre 2.10.0 binary
>release for CentOS 7.3 built against?
>
>Regards, Götz Waschk


[lustre-discuss] Lustre 2.10.0 ZFS version

2017-07-17 Thread Götz Waschk
Hi everyone,

which version of kmod-zfs was the official Lustre 2.10.0 binary
release for CentOS 7.3 built against?

Regards, Götz Waschk