Hello,

Indeed, there is an issue with Lustre changelog UNLINK records: when a file
is deleted while it is still open, no record ever reports the actual
deletion (the UNLINK_LAST flag is never set).
I don't know whether there is already a Jira ticket about that.
 
Thomas

-----Original Message-----
From: Davide Tacchella [mailto:dtacche...@cray.com]
Sent: Thursday, 13 June 2019 09:30
To: gerald.ho...@hpe.com; robinhood-support@lists.sourceforge.net
Subject: Re: [robinhood-support] Missing attribute 'fullpath'

Hello,
We see this regularly on RH 3.1.5. We haven't had time to nail it down, but
our workaround is to scan the FS regularly to purge DB orphans.
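
To make "DB orphans" concrete: in the report quoted below, the FID has a row in ENTRIES but none in NAMES. The following is only a minimal sketch of how such candidates could be listed, not what robinhood's scan does. The table names ENTRIES and NAMES and the `id` column come from the quoted MariaDB output; the in-memory SQLite database, the `name` column and the second FID are assumptions made so the example runs on its own.

```python
import sqlite3

# ENTRIES, NAMES and the `id` column come from the quoted MariaDB output;
# everything else in the demo schema is a simplified assumption.
NAMELESS_ENTRIES_SQL = """
    SELECT e.id
    FROM ENTRIES AS e
    LEFT JOIN NAMES AS n ON n.id = e.id
    WHERE n.id IS NULL
    ORDER BY e.id
"""


def find_nameless_entries(conn):
    """Return the ids present in ENTRIES that have no row in NAMES."""
    return [row[0] for row in conn.execute(NAMELESS_ENTRIES_SQL)]


def make_demo_db():
    """Build an in-memory stand-in for the robinhood database (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ENTRIES (id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE NAMES (id TEXT, name TEXT)")
    conn.executemany(
        "INSERT INTO ENTRIES (id) VALUES (?)",
        [("0x200128c87:0x1db1d:0x0",), ("0x1:0x2:0x0",)],  # second FID is made up
    )
    # Only the made-up FID still has a name, so the reported one is nameless.
    conn.execute(
        "INSERT INTO NAMES (id, name) VALUES (?, ?)", ("0x1:0x2:0x0", "kept_file")
    )
    return conn
```

A nameless entry is only a candidate: it would still be worth confirming with `lfs fid2path` that the FID is really gone before purging anything.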

One possible reproducer is:
- open file
- delete file
- sleep 1h
- close file

Open, delete, close is perfectly acceptable in Unix: the file is only
really deleted once the last file descriptor is closed.
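
The steps above can be sketched as a small script. This is only an illustration of the open/delete/sleep/close sequence, not a confirmed reproducer. The function name and the target directory are assumptions, and the hold time is a parameter so it can be tried quickly before using the full hour.

```python
import os
import tempfile
import time


def open_unlink_close(directory, hold_seconds):
    """Open a file, delete it while it is open, wait, then close it.

    Returns (exists_after_unlink, readable_while_open).
    """
    fd, path = tempfile.mkstemp(dir=directory)    # open file
    try:
        os.write(fd, b"payload")
        os.unlink(path)                           # delete file: the name is gone
        exists_after_unlink = os.path.exists(path)
        time.sleep(hold_seconds)                  # 3600 for "sleep 1h"
        os.lseek(fd, 0, os.SEEK_SET)
        # The data is still reachable through the open descriptor.
        readable_while_open = os.read(fd, 7) == b"payload"
    finally:
        os.close(fd)                              # close file: last reference dropped
    return exists_after_unlink, readable_while_open


# Example call (hypothetical directory on the Lustre mount):
#   open_unlink_close("/lustre/scratch", 3600)
```

Running it against a directory on the Lustre filesystem, and then checking whether the FID is still in the robinhood DB after the close, would be one way to test the theory.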

Another possible cause of these DB orphans is Changelog events arriving out
of order, which causes events to be removed from the CL queue without being
processed.

Davide

On Thu, 2019-06-13 at 07:03 +0000, Hofer, Gerald (HPE Pointnext South
Pacific Delivery) wrote:
> I have an issue with the lhsm_archive where several FIDs produce this
> error message during the archive:
>  
> 2019/06/13 16:27:31 [32655/21] Policy | Missing attribute 'fullpath' for evaluating boolean expression on [0x200128c87:0x1db1d:0x0]
> 2019/06/13 16:27:31 [32655/21] Policy | [0x200128c87:0x1db1d:0x0]: attribute is missing for checking ignore_fileclass rule
> 2019/06/13 16:27:31 [32655/21] lhsm_archive | Warning: cannot determine if entry  is whitelisted: skipping it.
> 2019/06/13 16:27:31 [32655/21] Policy | [0x200128c87:0x1db1d:0x0]: attribute is missing for checking fileset 'scratch'
>  
> The reason for this error is that the entry exists in the database
> but does not have a path. And because I have a fileclass that is
> based on a tree, the check is failing:
>  
> FileClass scratch {
>         definition {
>             tree == "/lustre/scratch"
>         }
> }
>  
> The entry is in the database – the only entry that is missing is the
> NAMES entry. But it must have existed at some stage since the STRIPE
> entries are there.
>  
>  
> MariaDB [robinhood_lustre]> select * from ENTRIES where id='0x200128c87:0x1db1d:0x0';
> +-------------------------+----------+---------+--------+--------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+-------------+--------------+-------------+-------------+-------------+-------------+-------------+-------------+-----------+
> | id                      | uid      | gid     | size   | blocks | creation_time | last_access | last_mod   | last_mdchange | type | mode | nlink | md_update  | invalid | fileclass   | class_update | lhsm_status | lhsm_archid | lhsm_norels | lhsm_noarch | lhsm_lstarc | lhsm_lstrst | lhsm_uuid |
> +-------------------------+----------+---------+--------+--------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+-------------+--------------+-------------+-------------+-------------+-------------+-------------+-------------+-----------+
> | 0x200128c87:0x1db1d:0x0 | n9614532 | default | 815808 |   1600 |    1560317119 |  1560317125 | 1560317113 |    1560317119 | file |  484 |     0 | 1560323027 |       0 | +std_files+ |   1560407251 | modified    |           1 |           0 |           0 |           0 |           0 | NULL      |
> +-------------------------+----------+---------+--------+--------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+-------------+--------------+-------------+-------------+-------------+-------------+-------------+-------------+-----------+
> 1 row in set (0.00 sec)
>  
> MariaDB [robinhood_lustre]> select * from NAMES where id='0x200128c87:0x1db1d:0x0';
> Empty set (0.00 sec)
>  
> MariaDB [robinhood_lustre]> select * from STRIPE_INFO where id='0x200128c87:0x1db1d:0x0';
> +-------------------------+-----------+--------------+-------------+-----------+
> | id                      | validator | stripe_count | stripe_size | pool_name |
> +-------------------------+-----------+--------------+-------------+-----------+
> | 0x200128c87:0x1db1d:0x0 |         0 |            1 |     1048576 |           |
> +-------------------------+-----------+--------------+-------------+-----------+
> 1 row in set (0.00 sec)
>  
> MariaDB [robinhood_lustre]> select * from STRIPE_ITEMS where id='0x200128c87:0x1db1d:0x0';
> +-------------------------+--------------+--------+----------------------+
> | id                      | stripe_index | ostidx | details              |
> +-------------------------+--------------+--------+----------------------+
> | 0x200128c87:0x1db1d:0x0 |            0 |     12 |     �                |
> +-------------------------+--------------+--------+----------------------+
> 1 row in set (0.00 sec)
>  
>  
> That particular file does not exist any more in Lustre:
> [root@robinhood robinhood]# lfs fid2path /lustre 0x200128c87:0x1db1d:0x0
> fid2path: error on FID 0x200128c87:0x1db1d:0x0: No such file or directory
>  
> So something went wrong, and the entry was never deleted from the
> database.
>  
> This now happens fairly regularly. A restart of robinhood does seem to
> clear out these errors, but new ones accumulate.
>  
> I am running robinhood 3.1.5 on RHEL 7.6 and Lustre 2.10.5.
>  
> My assumption at this stage is that I am hitting some timing issue,
> where a file gets deleted while it is being processed by the
> changelog, resulting in an incomplete database entry.
> I have not seen this in the past, and I have not changed anything for
> a while, so it is odd that it has appeared now.
>  
> Has anyone seen this before?
> Any theory on how it could happen?
>  
>  
> Thanks,
> Gerald
>  
>  
>  
>  
> Gerald Hofer
> Technical Consultant
> High Performance Computing
> HPE Pointnext South Pacific Delivery
> 
> +61 418 888 567  Mobile
> 
> Brisbane/Queensland
> hpe.com/pointnext
> 
> 
> 
>  
> _______________________________________________
> robinhood-support mailing list
> robinhood-support@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/robinhood-support
