Re: [lustre-discuss] [EXTERNAL] Re: storing Lustre jobid in file xattrs: seeking feedback

2023-05-15 Thread Andreas Dilger via lustre-discuss
Note that there have been some requests to increase the jobid size (LU-16765) 
so any tools that are accessing the xattr shouldn't assume the jobid is only 32 
bytes in size.
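
To illustrate the point about not assuming a 32-byte jobid, here is a minimal sketch of how a tool might read the value. It assumes the xattr is named "user.job" (the default proposed in this thread, not a settled interface); the helper names are hypothetical.

```python
import errno
import os
from typing import Optional

# Assumption: "user.job" is the default name proposed in this thread.
DEFAULT_JOB_XATTR = "user.job"


def decode_jobid(raw: bytes) -> str:
    """Decode a raw xattr value, dropping any trailing NUL padding."""
    return raw.rstrip(b"\x00").decode("utf-8", errors="replace")


def read_jobid(path: str, xattr_name: str = DEFAULT_JOB_XATTR) -> Optional[str]:
    """Return the jobid stored on `path`, or None if the xattr is absent.

    os.getxattr() (Linux only) returns the whole stored value, so no
    fixed 32-byte buffer is baked into this code.
    """
    try:
        raw = os.getxattr(path, xattr_name)
    except OSError as exc:
        # ENODATA: xattr not set; ENOTSUP: filesystem has no xattr support.
        if exc.errno in (errno.ENODATA, errno.ENOTSUP):
            return None
        raise
    return decode_jobid(raw)
```

Because the value is handled as a variable-length byte string, a larger jobid limit would not need any change on the tool side.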

On May 14, 2023, at 13:11, Bertschinger, Thomas Andrew Hjorth 
<bertschin...@lanl.gov> wrote:

Thanks for the responses.

I like the idea of allowing the xattr name to be a parameter, because while it 
increases the complexity, it seems safer.

The main difficulty I can think of is that user tools that query the jobid will 
need to get the value of the parameter first in order to query the correct 
xattr. Additionally, if the parameter is changed, jobids from old files may be 
missed. This doesn't seem like a big risk however, because I imagine this value 
would be changed rarely if ever.
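
The two-step lookup described above could be sketched roughly as follows. This assumes the parameter ends up being called "mdt.*.job_xattr" as suggested earlier in the thread, that "NONE" means the feature is off, and that the tool runs somewhere the parameter is readable via "lctl get_param". The function names are hypothetical.

```python
import subprocess
from typing import Optional


def parse_job_xattr(output: str) -> Optional[str]:
    """Reduce `lctl get_param -n` output (one line per MDT) to one xattr name.

    Returns None when the feature is off or no value was reported.
    """
    values = {line.strip() for line in output.splitlines() if line.strip()}
    if not values:
        return None
    if len(values) > 1:
        # MDTs disagree; a tool cannot know which xattr to query.
        raise ValueError(f"inconsistent job_xattr settings: {sorted(values)}")
    value = values.pop()
    return None if value.upper() == "NONE" else value


def current_job_xattr() -> Optional[str]:
    """Ask Lustre for the configured xattr name (assumed parameter name)."""
    result = subprocess.run(
        ["lctl", "get_param", "-n", "mdt.*.job_xattr"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_job_xattr(result.stdout)
```

A tool would call current_job_xattr() once and then query that name on each file. Files created before a parameter change would still carry the old name, which is the risk mentioned above.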

As for limiting the name to 7 characters, I believe Andreas is referring to the 
xattr name itself, not the contents of the xattr, so there should be no problem 
with storing the full length of a jobid (32 characters) -- but let me know if I 
am not interpreting that correctly.

- Tom Bertschinger

From: Jeff Johnson <jeff.john...@aeoncomputing.com>
Sent: Friday, May 12, 2023 4:56 PM
To: Andreas Dilger
Cc: Bertschinger, Thomas Andrew Hjorth; lustre-discuss@lists.lustre.org
Subject: [EXTERNAL] Re: [lustre-discuss] storing Lustre jobid in file xattrs: 
seeking feedback

Just a thought: instead of embedding the jobname itself, perhaps store just the 
least significant 7 characters of a SHA-1 hash of the jobname. Small chance of 
collision, easy to cross-reference to the jobid when needed.
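
A rough sketch of the hashing idea, reading "least significant 7 characters" as the last 7 hex digits of the SHA-1 digest (an interpretation, not something the thread specifies). Since a hash cannot be reversed, the cross-reference needs a lookup table kept elsewhere; the table below is a hypothetical in-memory stand-in.

```python
import hashlib
from typing import Dict, Iterable


def short_job_hash(jobname: str, length: int = 7) -> str:
    """Return the last `length` hex digits of the SHA-1 of `jobname`."""
    digest = hashlib.sha1(jobname.encode("utf-8")).hexdigest()
    return digest[-length:]


def build_lookup(jobnames: Iterable[str]) -> Dict[str, str]:
    """Map short hashes back to job names, refusing silent collisions."""
    table: Dict[str, str] = {}
    for name in jobnames:
        key = short_job_hash(name)
        if key in table and table[key] != name:
            raise ValueError(f"hash collision: {table[key]!r} vs {name!r}")
        table[key] = name
    return table
```

Seven hex digits give 28 bits, so collisions become likely once a site has on the order of tens of thousands of distinct job names.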

--Jeff


On Fri, May 12, 2023 at 3:08 PM Andreas Dilger via lustre-discuss 
<lustre-discuss@lists.lustre.org> wrote:
Hi Thomas,
thanks for working on this functionality and raising this question.

As you know, I'm inclined toward the user.job xattr, but I think it is never a 
good idea to unilaterally make policy decisions in the kernel that cannot be 
changed.

As such, it probably makes sense to have a tunable parameter like 
"mdt.*.job_xattr=user.job" and then this could be changed in the future if 
there is some conflict (e.g. some site already uses the "user.job" xattr for 
some other purpose).

I don't think the job_xattr should allow totally arbitrary values (e.g. 
overwriting trusted.lov or trusted.lma or security.* would be bad). One option 
is to only allow a limited selection of valid xattr namespaces, and possibly 
names:

 *   NONE to turn this feature off
 *   user, or trusted or system (if admin wants to restrict the ability of 
regular users to change this value?), with ".job" added automatically
 *   user.* (or trusted.* or system.*) to also allow specifying the xattr name

If we allow the xattr name portion to be specified (which I'm not sure about, 
but putting it out for completeness), it should have some reasonable limits:

 *   <= 7 characters long to avoid wasting valuable xattr space in the inode
 *   should not conflict with other known xattrs, which is tricky if we allow 
the name to be arbitrary. Possibly if in trusted (and system?) it should only 
allow trusted.job to avoid future conflicts?
 *   maybe restrict it to contain "job" (or maybe "pbs", "slurm", ...) to 
reduce the chance of namespace clashes in user or system? However, I'm 
reluctant to restrict names in user since this shouldn't have any fatal side 
effects (e.g. data corruption like in trusted or system), and the admin is 
supposed to know what they are doing...
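
The limits listed above could be checked along these lines. This is only a sketch of the proposal as stated in this message, written in Python for illustration rather than as kernel code. It leaves out the tentative rule about names containing "job", and the function name is hypothetical.

```python
from typing import Optional

ALLOWED_NAMESPACES = ("user", "trusted", "system")
MAX_NAME_LEN = 7  # proposed limit on the name part, not on the jobid value


def normalize_job_xattr(value: str) -> Optional[str]:
    """Validate a proposed job_xattr setting and return the full xattr name.

    Returns None for "NONE" (feature off); raises ValueError if rejected.
    """
    if value == "NONE":
        return None
    namespace, sep, name = value.partition(".")
    if namespace not in ALLOWED_NAMESPACES:
        raise ValueError(f"namespace not allowed: {namespace!r}")
    if not sep:
        # Bare namespace: ".job" is added automatically.
        return f"{namespace}.job"
    if not name or len(name) > MAX_NAME_LEN:
        raise ValueError(f"name must be 1-{MAX_NAME_LEN} characters: {name!r}")
    if namespace != "user" and name != "job":
        # Only <namespace>.job outside "user", to avoid clobbering xattrs
        # such as trusted.lov or trusted.lma.
        raise ValueError(f"only {namespace}.job is allowed in {namespace!r}")
    return f"{namespace}.{name}"
```

With these rules "user" and "user.job" both resolve to "user.job", while "trusted.lov" and anything under "security" are refused.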

On May 4, 2023, at 15:53, Bertschinger, Thomas Andrew Hjorth via lustre-discuss 
<lustre-discuss@lists.lustre.org> wrote:

Hello Lustre Users,

There has been interest in a proposed feature 
https://jira.whamcloud.com/browse/LU-13031 to store the jobid with each Lustre 
file at create time, in an extended attribute. An open question is which xattr 
namespace to use: "user", the Lustre-specific namespace "lustre", 
"trusted", or even perhaps "system".

The correct namespace likely depends on how this xattr will be used. For 
example, will interoperability with other filesystems be important? Different 
namespaces have their own limitations so the correct choice depends on the use 
cases.

I'm looking for feedback on applications for this feature. If you have thoughts 
on how you could use this, please feel free to share them so that we design it 
in a way that meets your needs.

Thanks!

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] Re: storing Lustre jobid in file xattrs: seeking feedback

2023-05-14 Thread Bertschinger, Thomas Andrew Hjorth via lustre-discuss
Thanks for the responses.

I like the idea of allowing the xattr name to be a parameter, because while it 
increases the complexity, it seems safer.

The main difficulty I can think of is that user tools that query the jobid will 
need to get the value of the parameter first in order to query the correct 
xattr. Additionally, if the parameter is changed, jobids from old files may be 
missed. This doesn't seem like a big risk however, because I imagine this value 
would be changed rarely if ever.

As for limiting the name to 7 characters, I believe Andreas is referring to the 
xattr name itself, not the contents of the xattr, so there should be no problem 
with storing the full length of a jobid (32 characters) -- but let me know if I 
am not interpreting that correctly.

- Tom Bertschinger

From: Jeff Johnson 
Sent: Friday, May 12, 2023 4:56 PM
To: Andreas Dilger
Cc: Bertschinger, Thomas Andrew Hjorth; lustre-discuss@lists.lustre.org
Subject: [EXTERNAL] Re: [lustre-discuss] storing Lustre jobid in file xattrs: 
seeking feedback

Just a thought: instead of embedding the jobname itself, perhaps store just the 
least significant 7 characters of a SHA-1 hash of the jobname. Small chance of 
collision, easy to cross-reference to the jobid when needed.

--Jeff


On Fri, May 12, 2023 at 3:08 PM Andreas Dilger via lustre-discuss 
<lustre-discuss@lists.lustre.org> wrote:
Hi Thomas,
thanks for working on this functionality and raising this question.

As you know, I'm inclined toward the user.job xattr, but I think it is never a 
good idea to unilaterally make policy decisions in the kernel that cannot be 
changed.

As such, it probably makes sense to have a tunable parameter like 
"mdt.*.job_xattr=user.job" and then this could be changed in the future if 
there is some conflict (e.g. some site already uses the "user.job" xattr for 
some other purpose).

I don't think the job_xattr should allow totally arbitrary values (e.g. 
overwriting trusted.lov or trusted.lma or security.* would be bad). One option 
is to only allow a limited selection of valid xattr namespaces, and possibly 
names:

  *   NONE to turn this feature off
  *   user, or trusted or system (if admin wants to restrict the ability of 
regular users to change this value?), with ".job" added automatically
  *   user.* (or trusted.* or system.*) to also allow specifying the xattr name

If we allow the xattr name portion to be specified (which I'm not sure about, 
but putting it out for completeness), it should have some reasonable limits:

  *   <= 7 characters long to avoid wasting valuable xattr space in the inode
  *   should not conflict with other known xattrs, which is tricky if we allow 
the name to be arbitrary. Possibly if in trusted (and system?) it should only 
allow trusted.job to avoid future conflicts?
  *   maybe restrict it to contain "job" (or maybe "pbs", "slurm", ...) to 
reduce the chance of namespace clashes in user or system? However, I'm 
reluctant to restrict names in user since this shouldn't have any fatal side 
effects (e.g. data corruption like in trusted or system), and the admin is 
supposed to know what they are doing...

On May 4, 2023, at 15:53, Bertschinger, Thomas Andrew Hjorth via lustre-discuss 
<lustre-discuss@lists.lustre.org> wrote:

Hello Lustre Users,

There has been interest in a proposed feature 
https://jira.whamcloud.com/browse/LU-13031 to store the jobid with each Lustre 
file at create time, in an extended attribute. An open question is which xattr 
namespace to use: "user", the Lustre-specific namespace "lustre", 
"trusted", or even perhaps "system".

The correct namespace likely depends on how this xattr will be used. For 
example, will interoperability with other filesystems be important? Different 
namespaces have their own limitations so the correct choice depends on the use 
cases.

I'm looking for feedback on applications for this feature. If you have thoughts 
on how you could use this, please feel free to share them so that we design it 
in a way that meets your needs.

Thanks!