Re: [lustre-discuss] [EXTERNAL] permission denied for some files

2020-12-15 Thread Chad DeWitt
Hi Robert,

Just rolling the dice, but is it possible that the append-only attribute is set
on the files in question?

For instance, I have the attribute set on a file named *testing* (the
append attribute is shown by the 'a' flag):

# lsattr testing
-----a---------- testing

When I try to remove *testing* with the append flag set:

# rm testing
rm: remove regular empty file ‘testing’? y
rm: cannot remove ‘testing’: Operation not permitted


If that flag is set on your problematic files, unsetting it will
allow you to delete them:

# chattr -a testing
# rm testing
rm: remove regular empty file ‘testing’? y


BTW - You have to use lsattr to see the append flag; a normal ls will not
show it...

Chad



Chad DeWitt, CISSP | University Research Computing

UNC Charlotte | Office of OneIT

ccdew...@uncc.edu | https://oneit.uncc.edu




On Tue, Dec 15, 2020 at 8:02 AM Robert Redl  wrote:

> Dear Lustre Users,
>
> we have some files in the file system that are not writable anymore. Symptoms:
>
> - normal rm fails: rm: cannot remove 'file': Operation not permitted.
> - lfs rmfid fails without error message. The file is just not deleted.
> - lfs migrate fails: file: no write permission, skipped
> - all commands are executed as root on a node which is listed in
> nosquash_nids.
> - other files with the same permissions located in the same folder are
> writable.
> - the affected files are readable. Creating a copy is possible, but
> deleting the original file is not.
> - the problem remains after running lfsck.
>
> We use Lustre 2.12.5 on CentOS 7 with a ZFS backend.
>
> Has anyone had such issues before? Any ideas how to delete these files?
>
> Thank you!
> Robert
> --
>
> Dr. Robert Redl
> Scientific Programmer, "Waves to Weather" (SFB/TRR165)
> Meteorologisches Institut
> Ludwig-Maximilians-Universität München
> Theresienstr. 37, 80333 München, Germany
>


Re: [lustre-discuss] [EXTERNAL] Re: Disk quota exceeded while quota is not filled

2020-08-26 Thread Chad DeWitt
Sure, David.

Unfortunately, I do not have experience with project quotas, but I would not
expect the mere absence of defined project quotas to be causing any ill
effects.

From the manual, this appears to be what you are encountering (25.5. Quota
Allocation):

*Note*
*It is very important to note that the block quota is consumed per OST and
the inode quota per MDS. Therefore, when the quota is consumed on one OST
(resp. MDT), the client may not be able to create files regardless of the
quota available on other OSTs (resp. MDTs).*


(The group, md_kaplan, has hit its quota on 19 OSTs.) I am not sure if there
is a way to "free up" the ~500GB still allowed to md_kaplan. Maybe unset and
then reset the group's quota? Maybe stripe any large files owned by md_kaplan
so the data is spread amongst the OSTs? (A rough sketch of both ideas follows.)
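
Hedged example of both (the gid and mount point come from your output, the
limit value mirrors the reported 5368709120k hard limit, and the file path
is hypothetical):

lfs setquota -g md_kaplan -b 0 -B 0 -i 0 -I 0 /storage   # clear block/inode limits
lfs setquota -g md_kaplan -B 5368709120k /storage        # re-apply the block hard limit
lfs migrate -c -1 /storage/path/to/large_file            # restripe one big file across all OSTs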

Cheers,
Chad

On Wed, Aug 26, 2020 at 11:41 AM David Cohen 
wrote:

> Thank you Chad for answering,
> We are using the patched kernel on the MDT/OSS
> The problem is in the group space quota.
> In any case I enabled project quota just for future purposes.
> There are no defined projects, do you think it can still pose a problem?
>
> Best,
> David
>
>
>
>
> On Wed, Aug 26, 2020 at 3:18 PM Chad DeWitt  wrote:
>
>> Hi David,
>>
>> Hope you're doing well.
>>
>> This is a total shot in the dark, but depending on the kernel version you
>> are running, you may need a patched kernel to use project quotas. I'm not
>> sure what the symptoms would be, but it may be worth turning off project
>> quotas and seeing if doing so resolves your issue:
>>
>> lctl conf_param technion.quota.mdt=none
>> lctl conf_param technion.quota.mdt=ug
>> lctl conf_param technion.quota.ost=none
>> lctl conf_param technion.quota.ost=ug
>>
>> (Looks like you have been running project quota on your MDT for a while
>> without issue, so this may be a dead end.)
>>
>> Here's more info concerning when a patched kernel is necessary for
>> project quotas (25.2.  Enabling Disk Quotas):
>>
>> http://doc.lustre.org/lustre_manual.xhtml
>>
>>
>> Cheers,
>> Chad
>>
>> 
>>
>> Chad DeWitt, CISSP | University Research Computing
>>
>> UNC Charlotte | Office of OneIT
>>
>> ccdew...@uncc.edu
>>
>> 
>>
>>
>>
>> On Tue, Aug 25, 2020 at 3:04 AM David Cohen <
>> cda...@physics.technion.ac.il> wrote:
>>
>>>
>>> Hi,
>>> Still hoping for a reply...
>>>
>>> It seems to me that old groups are more affected by the issue than new
>>> ones that were created after a major disk migration.
>>> It seems that the quota enforcement is somehow based on a counter other
>>> than the accounting, as the accounting produces the same numbers as du.
>>> So if quota is calculated separately from accounting, it is possible
>>> that quota is broken and keeps values from removed disks, while accounting
>>> is correct.
>>> So following that suspicion I tried to force the FS to recalculate quota.
>>> I tried:
>>> lctl conf_param technion.quota.ost=none
>>> and back to:
>>> lctl conf_param technion.quota.ost=ugp
>>>
>>> I tried running on mds and all ost:
>>> tune2fs -O ^quota
>>> and on again:
>>> tune2fs -O quota
>>> and after each attempt, also:
>>> lctl lfsck_start -A -t all -o -e continue
>>>
>>> But still the problem persists and groups under the quota usage get
>>> blocked with "quota exceeded"
>>>
>>> Best,
>>> David
>>>
>>>
>>> On Sun, Aug 16, 2020 at 8:41 AM David Cohen <
>>> cda...@physics.technion.ac.il> wrote:
>>>
>>>> Hi,
>>>> Adding some more information.
>>>> A few months ago the data on the Lustre fs was migrated to new physical
>>>> storage.
>>>> After successful migration the old ost were marked as active=0
>>>> (lctl conf_param technion-OST0001.osc.active=0)
>>>>
>>>> Since then all the clients were unmounted and mounted.
>>>> tunefs.lustre --writeconf was executed on the mgs/mdt and all the ost.
>>>> lctl dl doesn't show the old OSTs anymore, but when querying the quota
>>>> they still appear.
>>>> As I see that new users are less affected by the "quota exceeded"

Re: [lustre-discuss] [EXTERNAL] Re: Disk quota exceeded while quota is not filled

2020-08-26 Thread Chad DeWitt
Hi David,

Hope you're doing well.

This is a total shot in the dark, but depending on the kernel version you
are running, you may need a patched kernel to use project quotas. I'm not
sure what the symptoms would be, but it may be worth turning off project
quotas and seeing if doing so resolves your issue:

lctl conf_param technion.quota.mdt=none
lctl conf_param technion.quota.mdt=ug
lctl conf_param technion.quota.ost=none
lctl conf_param technion.quota.ost=ug

(Looks like you have been running project quota on your MDT for a while
without issue, so this may be a dead end.)

Here's more info concerning when a patched kernel is necessary for
project quotas (25.2.  Enabling Disk Quotas):

http://doc.lustre.org/lustre_manual.xhtml


Cheers,
Chad



Chad DeWitt, CISSP | University Research Computing

UNC Charlotte | Office of OneIT

ccdew...@uncc.edu





On Tue, Aug 25, 2020 at 3:04 AM David Cohen 
wrote:

>
> Hi,
> Still hoping for a reply...
>
> It seems to me that old groups are more affected by the issue than new
> ones that were created after a major disk migration.
> It seems that the quota enforcement is somehow based on a counter other
> than the accounting, as the accounting produces the same numbers as du.
> So if quota is calculated separately from accounting, it is possible that
> quota is broken and keeps values from removed disks, while accounting is
> correct.
> So following that suspicion I tried to force the FS to recalculate quota.
> I tried:
> lctl conf_param technion.quota.ost=none
> and back to:
> lctl conf_param technion.quota.ost=ugp
>
> I tried running on mds and all ost:
> tune2fs -O ^quota
> and on again:
> tune2fs -O quota
> and after each attempt, also:
> lctl lfsck_start -A -t all -o -e continue
>
> But still the problem persists and groups under the quota usage get
> blocked with "quota exceeded"
>
> Best,
> David
>
>
> On Sun, Aug 16, 2020 at 8:41 AM David Cohen 
> wrote:
>
>> Hi,
>> Adding some more information.
>> A few months ago the data on the Lustre fs was migrated to new physical
>> storage.
>> After successful migration the old ost were marked as active=0
>> (lctl conf_param technion-OST0001.osc.active=0)
>>
>> Since then all the clients were unmounted and mounted.
>> tunefs.lustre --writeconf was executed on the mgs/mdt and all the ost.
>> lctl dl doesn't show the old OSTs anymore, but when querying the quota they
>> still appear.
>> As I see that new users are less affected by the "quota exceeded" problem
>> (blocked from writing while quota is not filled),
>> I suspect that quota calculation is still summing values from the old ost:
>>
>> *lfs quota -g -v md_kaplan /storage/*
>> Disk quotas for grp md_kaplan (gid 10028):
>>      Filesystem  kbytes      quota      limit   grace   files   quota   limit   grace
>>       /storage/ 4823987000       0 5368709120       -  143596       0       0       -
>> technion-MDT0000_UUID
>>                     37028       -          0       -  143596       -       0       -
>> quotactl ost0 failed.
>> quotactl ost1 failed.
>> quotactl ost2 failed.
>> quotactl ost3 failed.
>> quotactl ost4 failed.
>> quotactl ost5 failed.
>> quotactl ost6 failed.
>> quotactl ost7 failed.
>> quotactl ost8 failed.
>> quotactl ost9 failed.
>> quotactl ost10 failed.
>> quotactl ost11 failed.
>> quotactl ost12 failed.
>> quotactl ost13 failed.
>> quotactl ost14 failed.
>> quotactl ost15 failed.
>> quotactl ost16 failed.
>> quotactl ost17 failed.
>> quotactl ost18 failed.
>> quotactl ost19 failed.
>> quotactl ost20 failed.
>> technion-OST0015_UUID
>>                114429464*      -  114429464       -       -       -       -       -
>> technion-OST0016_UUID
>>                 92938588       -   92938592       -       -       -       -       -
>> technion-OST0017_UUID
>>                128496468*      -  128496468       -       -       -       -       -
>> technion-OST0018_UUID
>>                191478704*      -  191478704       -       -       -       -       -
>> technion-OST0019_UUID
>>                107720552       -  107720560       -       -       -       -       -
>> technion-OST001a_UUID
>>                165631952*      -  165631952       -       -       -       -       -
>> technion-OST001b_UUID
>>                  4607141

Re: [lustre-discuss] new install client locks up on ls /lustre

2020-07-08 Thread Chad DeWitt
Hi Sid,

Hope you're doing well.

This link may help:

http://wiki.lustre.org/Mounting_a_Lustre_File_System_on_Client_Nodes


Just for general troubleshooting, you may want to ensure that both the
firewall and SELinux are disabled for all your Lustre virtuals.
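
A hedged sketch of quick checks on each node (the MGS NID comes from your
lctl dl output below; adjust to your setup):

systemctl status firewalld    # should be inactive/disabled
getenforce                    # should report Permissive or Disabled
lctl ping 10.140.95.118@tcp   # basic LNet connectivity to the MGS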

Cheers,
Chad



Chad DeWitt, CISSP

UNC Charlotte | OneIT – University Research Computing

ccdew...@uncc.edu | www.uncc.edu






On Wed, Jul 8, 2020 at 7:36 PM Sid Young  wrote:

> Hi all,
>
> I'm new-ish to Lustre and I've just created a Lustre 2.12.5 cluster using
> the RPMs from Whamcloud for CentOS 7.8, with 1 MDT/MGS and 1 OSS with 3
> OSTs (20GB each).
> Everything is formatted as ldiskfs and it's running on a VMware platform
> as a test bed using TCP.
> The MDT mounts ok, the OST's mount and on my client I can mount the
> /lustre mount point (58GB) and I can ping everything via LNet; however,
> as soon as I try to do an ls -l /lustre or any kind of I/O, the client locks
> solid till I reboot it.
>
> I've tried to work out how to run basic diagnostics, to no avail, so I am
> stumped why I don't see a directory listing for what should be an empty 60G
> disk.
>
> On the MDS I ran this:
> [root@lustre-mds tests]# lctl dl
>   0 UP osd-ldiskfs lustre-MDT0000-osd lustre-MDT0000-osd_UUID 10
>   1 UP mgs MGS MGS 8
>   2 UP mgc MGC10.140.95.118@tcp acdb253b-b7a8-a949-0bf2-eaa17dc8dca4 4
>   3 UP mds MDS MDS_uuid 2
>   4 UP lod lustre-MDT0000-mdtlov lustre-MDT0000-mdtlov_UUID 3
>   5 UP mdt lustre-MDT0000 lustre-MDT0000_UUID 12
>   6 UP mdd lustre-MDD0000 lustre-MDD0000_UUID 3
>   7 UP qmt lustre-QMT0000 lustre-QMT0000_UUID 3
>   8 UP lwp lustre-MDT0000-lwp-MDT0000 lustre-MDT0000-lwp-MDT0000_UUID 4
>   9 UP osp lustre-OST0000-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 4
>  10 UP osp lustre-OST0001-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 4
>  11 UP osp lustre-OST0002-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 4
> [root@lustre-mds tests]#
>
> So it looks like everything is running; even dmesg on the client
> reports:
>
> [7.998649] Lustre: Lustre: Build Version: 2.12.5
> [8.016113] LNet: Added LNI 10.140.95.65@tcp [8/256/0/180]
> [8.016214] LNet: Accept secure, port 988
> [   10.992285] Lustre: Mounted lustre-client
>
>
> Any pointer where to look? /var/log/messages shows no errors
>
>
> Sid Young
>


[lustre-discuss] Questions about LU-13645

2020-07-02 Thread Chad DeWitt
Good afternoon, All

I hope everyone is doing well.

I had a few questions concerning *LU-13645 - Various data corruptions
possible in lustre* [https://jira.whamcloud.com/browse/LU-13645].

We are looking at deploying Lustre 2.12.5 and while browsing JIRA, this
particular issue caught my attention. I read the issue; however, I can't
determine whether it is severe enough to prevent upgrading.

Is anyone familiar with this issue and its severity/prevalence? Has anyone
else upgraded to 2.12.5?

Thank you,
Chad



Chad DeWitt, CISSP

UNC Charlotte | OneIT – University Research Computing






Re: [lustre-discuss] High MDS load

2020-05-28 Thread Chad DeWitt
Hi Heath,

Hope you're doing well!

Your mileage may vary (and quite frankly, there may be better approaches),
but this is a quick-and-dirty set of steps to find which client is issuing
a large number of metadata operations:


   - Log into the affected MDS.


   - Change into the exports directory.

cd /proc/fs/lustre/mdt/**/exports/


   - OPTIONAL: Set all your stats to zero and clear out stale clients. (If
   you don't want to do this step, you don't really have to, but it does make
   it easier to see the stats if you are starting with a clean slate. In fact,
   you may want to skip this the first time through and just look for high
   numbers. If a particular client is the source of the issue, the stats
   should clearly be higher for that client when compared to the others.)

echo "C" > clear


   - Wait for a few seconds and dump the stats.

for client in $( ls -d */ ) ; do echo && echo && echo ${client} && cat ${client}/stats && echo ; done


You'll get a listing of stats for each mounted client like so:

open                    278676 samples [reqs]
close                   278629 samples [reqs]
mknod                     2320 samples [reqs]
unlink                     495 samples [reqs]
mkdir                      575 samples [reqs]
rename                    1534 samples [reqs]
getattr                 277552 samples [reqs]
setattr                    550 samples [reqs]
getxattr                  2742 samples [reqs]
statfs                  350058 samples [reqs]
samedir_rename            1534 samples [reqs]


(Don't worry if some of the clients give back what appears to be empty
stats. That just means they are mounted, but have not yet performed any
metadata operations.) From this data, you are looking for any "high"
samples.  The client with the high samples is usually the culprit.  For the
example client stats above, I would look to see what process(es) on this
client is listing, opening, and then closing files in Lustre... The
advantage with this method is you are seeing exactly which metadata
operations are occurring. (I know there are also various utilities included
with Lustre that may give this information as well, but I just go to the
source.)

Once you find the client, you can use various commands, such as mount and
lsof to get a better understanding of what may be hitting Lustre.

Some of the more common issues I've found that can cause a high MDS load:

   - Listing a directory containing a large number of files. (Instead, unalias
   ls or, better yet, use lfs find; see the sketch after this list.)
   - Removing many files.
   - Opening and closing many files. (It may be better to move that data over
   to another file system, such as XFS, etc.  We keep some of our deep learning
   off Lustre because of the sheer number of small files.)
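
As a rough illustration of that first point (the directory is hypothetical;
lfs find asks the MDS directly instead of stat-ing every entry for coloring):

lfs find /lustre/some/bigdir --maxdepth 1   # list entries without per-file stat RPCs
\ls /lustre/some/bigdir                     # or bypass the ls alias entirely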

Of course the actual mitigation of the load depends on what the user is
attempting to do...

I hope this helps...

Cheers,
Chad

--------

Chad DeWitt, CISSP

UNC Charlotte | ITS – University Research Computing

ccdew...@uncc.edu | www.uncc.edu






On Thu, May 28, 2020 at 11:37 AM Peeples, Heath 
wrote:

> I have 2 MDSs and periodically the load on one of them (either at one time
> or another) peaks above 300, causing the file system to basically stop.  This
> lasts for a few minutes and then goes away.  We can’t identify any one user
> running jobs at the times we see this, so it’s hard to pinpoint this on a
> user doing something to cause it.   Could anyone point me in the direction
> of how to begin debugging this?  Any help is greatly appreciated.
>
>
>
> Heath


Re: [lustre-discuss] List of Files on Busted OST

2019-03-13 Thread Chad DeWitt
Hi Paul,

lfs find may do what you want:

lfs find *lustre_mount_point* --ost *IDs_of_OSTs*

*IDs_of_OSTs* is comma delimited


It should even grab files that are striped and have only a portion of their
data on the specified OSTs.
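
For example, a hypothetical run against two dead OSTs (indices made up;
redirect the list somewhere off Lustre):

lfs find /lustre --ost 21,22 > /root/files_on_dead_osts.txt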

-cd



Chad DeWitt, CISSP

UNC Charlotte | ITS – University Research Computing

9201 University City Blvd. | Charlotte, NC 28223

ccdew...@uncc.edu | www.uncc.edu






On Wed, Mar 13, 2019 at 10:18 AM Paul Edmon  wrote:

> I have an OSS that is offline.  Is there a way to poll the MDT and grab a
> list of files that are on the affected OSTs?  Or is the only method
> just scanning the whole filesystem with normal find combined with lfs
> getstripe commands?
>
> -Paul Edmon-
>


Re: [lustre-discuss] Migrating files doesn't free space on the OST

2019-01-17 Thread Chad DeWitt
Hi Jason,

I do not know if this will help you or not, but I had a situation in 2.8.0
where an OST filled up and I marked it as disabled on the MDS:

lctl dl | grep osc
...Grab the *device_id* of the full OST and then deactivate it...
lctl --device *device_id* deactivate

IIRC, this allowed the data to be read, but deletes were not processed.
When I re-activated the OST, the deletes were processed and space
started clearing.  I think you stated you had the OST deactivated.  If you
still do, try to reactivate it.

lctl --device *device_id* activate

Once you reactivate the OST, the deletes will start processing within 10 -
30 seconds...  Just use lfs df -h to watch...
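
A minimal sketch, run on the MDS (the device_id 23 here is hypothetical;
take the real one from lctl dl):

lctl dl | grep osc           # note the numeric device_id of the full OST
lctl --device 23 activate    # hypothetical device_id
watch -n 10 'lfs df -h'      # free space should start climbing shortly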

-cd




Chad DeWitt, CISSP

UNC Charlotte | ITS – University Research Computing

9201 University City Blvd. | Charlotte, NC 28223

ccdew...@uncc.edu | www.uncc.edu






On Thu, Jan 17, 2019 at 2:38 PM Jason Williams  wrote:

> Hello Alexander,
>
>
> Thank you for your reply.
>
> - We are not using zfs, it's an LDISKFS backing store, so no snapshots.
>
> - I have re-run lfs getstripe to make sure the file is indeed moving
>
> - I just looked for lfsck but I don't seem to have it.  We are running
> 2.10.4 so I don't know what version that appeared in.
>
> - I will try to have a look into the jobstats and see what I can find, but
> I made sure the files I moved were not in use when I moved them.
>
>
>
> --
> Jason Williams
> Assistant Director
> Systems and Data Center Operations.
> Maryland Advanced Research Computing Center (MARCC)
> Johns Hopkins University
> jas...@jhu.edu
>
>
>
> --
> *From:* Alexander I Kulyavtsev 
> *Sent:* Thursday, January 17, 2019 12:56 PM
> *To:* Jason Williams; lustre-discuss@lists.lustre.org
> *Subject:* Re: Migrating files doesn't free space on the OST
>
>
> - you can re-run the command to find files residing on the ost, to see if
> the files are new or old.
>
> - zfs may have snapshots, if you ever took any; snapshots take space.
>
> - removing data or snapshots has some lag to release the blocks (tens of
> minutes) but I guess that is completed by now.
>
> - there can be orphan objects on the OST if you had crashes. On older
> lustre versions, if the ost was emptied out, you can mount the underlying fs
> as ext4 or zfs, set the mount to readonly, and browse the ost objects - you
> may see if there are some orphan objects left. On newer lustre releases you
> can probably run lfsck (the lustre scanner).
>
> - to find which hosts / jobs are currently writing to lustre, you may enable
> lustre jobstats: clear the counters and parse the stats files in /proc. There
> was an xltop tool on github for older lustre versions without jobstats, but
> it has not been updated for a while.
>
> - depending on the lustre version you have, the implementation of lfs migrate
> is different. The older version copied the file under another name to another
> ost, renamed the files, and removed the old file. If migration is done on a
> file open for write by an application, the data will not be released until
> the file is closed (and the data in the new file are wrong). The recent
> implementation of migrate swaps the file objects with the file layout lock
> taken. I cannot tell if it is safe for active writes.
>
> - not releasing space can be a bug - did you check jira on whamcloud? What
> version of lustre do you have? Is it ldiskfs or zfs based? zfs version?
>
>
> Alex.
>
>
> --
> *From:* lustre-discuss  on
> behalf of Jason Williams 
> *Sent:* Wednesday, January 16, 2019 10:25 AM
> *To:* lustre-discuss@lists.lustre.org
> *Subject:* [lustre-discuss] Migrating files doesn't free space on the OST
>
>
> I am trying to migrate files I know are not in use off of the full OST
> that I have using lfs migrate.  I have verified up and down that the files
> I am moving are on that OST and that after the migrate lfs getstripe indeed
> shows they are no longer on that OST since it's disabled in the MDS.
>
>
> The problem is, the used space on the OST is not going down.
>
>
> I see one of at least two issues:
>
> - the OST is just not freeing the space for some reason or another ( I
> don't know)
>
> - Or someone is writing to existing files just as fast as I am clearing
> the data (possible, bu

Re: [lustre-discuss] lnet can't bind to 988

2018-06-08 Thread Chad DeWitt
Maybe this applies in your situation?

https://build.hpdd.intel.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.xhtml#idm140687082747200
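
In case it helps while you dig, a hedged sketch of what I'd check (the ss
filter is just a newer alternative to netstat/lsof, and the accept_port line
is an assumption about your modprobe config, not something from your logs):

ss -tlnp | grep :988             # anything already bound to the acceptor port?
lsmod | grep -E 'lnet|lustre'    # leftover modules from a failed mount attempt?
# if there is a genuine conflict, the acceptor can be moved, e.g. in
# /etc/modprobe.d/lustre.conf:
options lnet accept_port=989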



Chad DeWitt, CISSP

UNC Charlotte | ITS – University Research Computing

ccdew...@uncc.edu | www.uncc.edu



On Fri, Jun 8, 2018 at 11:33 AM, Ben Evans  wrote:

> I've found that doing "modprobe lustre" until it succeeds works, but
> that's just on my own dev VMs
>
> -Ben Evans
>
> On 6/8/18, 11:17 AM, "lustre-discuss on behalf of Michael Di Domenico"
>  mdidomeni...@gmail.com> wrote:
>
> >i'm having trouble with 2.10.4 clients running on rhel 7.5 kernel 862.3.2
> >
> >at times when the box boots lustre wont mount, lnet bops out and
> >complains about port 988 being in use
> >
> >however, when i run netstat or lsof commands, i cannot find port 988
> >listed against anything
> >
> >is there some way to trace deeper to see what lnet is really complaining
> >about
> >
> >usually rebooting the box fixes the issue, but this seems a little
> >mysterious


[lustre-discuss] Lustre [2.8.0] and the Linux Automounter

2017-06-19 Thread Chad DeWitt
Good morning, All.

We are considering using Lustre [2.8.0] with the Linux automounter.  Is
anyone using this combination successfully?  Are there any caveats?

(I did check JIRA, but only found two tickets concerning 1.x Lustre.)
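
In case a concrete sketch helps the discussion, this is roughly the autofs
direct map we have in mind (hypothetical MGS NID and fsname):

# /etc/auto.master
/-      /etc/auto.lustre

# /etc/auto.lustre
/lustre  -fstype=lustre,flock  10.0.0.10@tcp:/lustre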

Thank you in advance,
Chad



Chad DeWitt, CISSP | HPC Storage Administrator

UNC Charlotte | ITS – University Research Computing

