Hi folks,

Just going to piggyback on this thread, as we are experiencing exactly the same 
thing.  We're ldiskfs, though, not ZFS.  We were on 2.10.3 on all servers and, 
when this first occurred, it brought down our MDT with the following ASSERT:

Apr 10 01:22:09 hpcmds01.adqimr.ad.lan kernel: LustreError: 
183448:0:(osp_precreate.c:634:osp_precreate_send()) qimrb-OST008f-osc-MDT0000: 
precreate fid [0x1008f0000:0xa65bd3f:0x0] < local used fid 
[0x1008f0000:0xa65bd3f:0x0]:
Apr 10 01:22:09 hpcmds01.adqimr.ad.lan kernel: LustreError: 
57275:0:(osp_precreate.c:1311:osp_precreate_ready_condition()) 
qimrb-OST008f-osc-MDT0000: precreate failed opd_pre_status -116
Apr 10 01:22:09 hpcmds01.adqimr.ad.lan kernel: LustreError: 
183448:0:(osp_precreate.c:1259:osp_precreate_thread()) 
qimrb-OST008f-osc-MDT0000: cannot precreate objects: rc = -116
Apr 10 01:22:09 hpcmds01.adqimr.ad.lan kernel: LustreError: 
8475:0:(lod_qos.c:1624:lod_alloc_qos()) ASSERTION( nfound <= inuse->op_count ) 
failed: nfound:19, op_count:0
Apr 10 01:22:09 hpcmds01.adqimr.ad.lan kernel: LustreError: 
8475:0:(lod_qos.c:1624:lod_alloc_qos()) LBUG

Attempts to remount the MDT resulted in repeated crashes.  Thinking it was 
https://jira.whamcloud.com/browse/LU-10297, we brought the MDS up to 2.10.4 and 
were immediately bitten by https://jira.whamcloud.com/browse/LU-11227, as we 
have deactivated OSTs, so we quickly upgraded the MDS/MGS to 2.10.6.  We're 
still seeing the -52/-116 errors, and we have three OSTs on which we similarly 
can't create objects with an explicit "lfs setstripe -i".  The OSSs are still 
on 2.10.3.
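(For anyone wanting to reproduce the check: this is roughly how we test object 
creation on a specific OST.  The mount point and OST index below are 
placeholders; substitute your own.)

```shell
# Try to force a new file onto a specific OST by index.  On an affected OST
# the create may fail outright, or the object may silently land on a
# different OST -- either way, getstripe shows where it actually ended up.
lfs setstripe -i 0x8f -c 1 /lustre/precreate-test
lfs getstripe /lustre/precreate-test
```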

Not sure if I should reply to Marco's request for the node list, "lfs df" 
output and "get_param"s here, or open a Jira ticket.  Leaning towards the 
latter, but I'll spend some time in Jira today first to make sure it's not a 
duplicate.

We're currently up and running again but are looking to resolve the remaining 
unusable OSTs.  And, like Amit, we're working towards a 2.12 upgrade in the 
near future but we just haven't got there yet.

Cheers,
Scott

________________________________
From: lustre-discuss <[email protected]> on behalf of 
Marco Grossi <[email protected]>
Sent: Tuesday, 10 March 2020 9:22 PM
To: Kumar, Amit <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [lustre-discuss] unable to precreate -52/-116

Hi Amit,

Sounds definitely different from my case.

The only JIRA issue logging a "precreate fid < local used fid" is:
https://jira.whamcloud.com/browse/LU-11536

What puzzles me is the "rc = -52" from "ofd_create_hdl"; if I mapped it
correctly, it is a -EBADE error, i.e. "Invalid exchange".
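The mapping is easy to double-check against the kernel errno table (this 
assumes Linux errno numbering; Python's errno module is a quick way to look 
it up): -52 is EBADE ("Invalid exchange") and -116 is ESTALE ("Stale file 
handle").

```shell
# Decode the error numbers seen in the logs (Linux errno numbering assumed).
python3 -c 'import errno
for n in (52, 116):
    print(-n, errno.errorcode[n])'
# -52 EBADE
# -116 ESTALE
```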

Can you provide:
- the HA node list, and how the MGS, MDT and OSTs are distributed across the nodes

As well as the output of:
- lfs df
- lfs df -i
- lctl get_param osp.*scratch0-OST0029*.prealloc*
- lctl get_param obdfilter.*scratch0-OST0029*.last_id
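For anyone else gathering the same data, a sketch of where each command runs 
(the filesystem name and OST index are taken from Amit's logs; substitute your 
own):

```shell
# On any client with the filesystem mounted: space and inode usage per OST.
lfs df /path/to/mountpoint
lfs df -i /path/to/mountpoint

# On the MDS: the precreate window state for the problem OST.
lctl get_param osp.*scratch0-OST0029*.prealloc*

# On the OSS serving that OST: the last object id handed out.
lctl get_param obdfilter.*scratch0-OST0029*.last_id
```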

Regards,
Marco


On 3/9/20 5:23 PM, Kumar, Amit wrote:
> Hi Marco,
>
> Thank you for the response on this issue.
>
> We have an HA setup; I tried failing the MDT over to the secondary pair and 
> then failing it back.  This did not help.
> I also tried restarting the MDS servers; that did not help.
> I rebooted the OSS servers as well; that did not help.
> I also tried completely stopping and unmounting the MDS for a little while, 
> and that did not help either.
>
> This error reliably comes back right after the MDT is mounted.  Additionally, 
> I am not able to manually create any files on that particular OST.  Any other 
> thoughts?
>
> Thank you,
> Amit
>
> -----Original Message-----
> From: Marco Grossi <[email protected]>
> Sent: Monday, March 9, 2020 11:23 AM
> To: Kumar, Amit <[email protected]>
> Cc: [email protected]
> Subject: Re: [lustre-discuss] unable to precreate -52/-116
>
> Hi Amit,
>
> We had a similar issue after a set_param of "max_create_count=0".
>
> In our case re-mounting the MDT (not the OST) fixed the issue.
>
> Hope it helps.
>
> Regards,
> Marco
>
>
> On 3/3/20 8:25 PM, Kumar, Amit wrote:
>> Dear Lustre,
>>
>>
>>
>> Recently we had a degraded (not failed) RAID array and had to wait a while
>> for a compatible disk, as we had received an incompatible one and it took
>> over a week to get the correct one back in place.
>>
>>
>>
>> During this wait I disabled the OST first, then noticed continuous IO
>> to the OST and disabled object creation on it as well.  Everything looked
>> normal after that, and once the disk was replaced I re-enabled object
>> creation and enabled the OST.  Since then I have started seeing these
>> messages on the OST:
>>
>> .(ofd_dev.c:1784:ofd_create_hdl()) scratch0-OST0029: unable to
>> precreate: rc = -52
>>
>> And the following messages on the MDS:
>>
>> .(osp_precreate.c:1282:osp_precreate_thread())
>> scratch0-OST0029-osc-MDT0000: cannot precreate objects: rc = -116
>>
>> .(osp_precreate.c:657:osp_precreate_send())
>> scratch0-OST0029-osc-MDT0000: precreate fid
>> [0x100290000:0x101b39a:0x0] < local used fid
>> [0x100290000:0x101b39a:0x0]: rc = -116
>>
>>
>>
>> These messages don't seem to stop.  I am wondering what impact these
>> errors could have in the long run.  I have noticed I am not able to create
>> files on this particular OST using lfs setstripe; when I do so it gets
>> me an object on another OST by default.  I just want to make sure this is
>> not causing any data loss, for the files currently on that OST or for new
>> writes.
>>
>> We plan to upgrade to 2.12 in the summer downtime and assume that it
>> includes a fix, based on LU-9442 & LU-11186.  We are currently running the
>> servers on Lustre 10.4.1 over ZFS 0.7.9-1.
>>
>>
>>
>> Any help is greatly appreciated.
>>
>>
>>
>> Thank you,
>> Amit
>>
>>
>> _______________________________________________
>> lustre-discuss mailing list
>> [email protected]
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>
> --
> Marco Grossi
> ICHEC Systems Team
>
>
>

--
Marco Grossi
ICHEC Systems Team
