Re: [lustre-discuss] LUG 2018

2018-06-20 Thread Shawn Hall
Recordings were taken, but it appears they have not been put online yet.
I'll check with Kevin from Argonne this afternoon to see where we are on
getting those posted.

Shawn

From: lustre-discuss  on behalf of 
"E.S. Rosenberg" 
Date: Wednesday, June 20, 2018 at 1:51 PM
To: Andreas Dilger 
Cc: Alexander I Kulyavtsev , Lustre discussion 

Subject: Re: [lustre-discuss] LUG 2018

Thanks, those are great overviews. Does the fact that there are only links to
slides mean that no recordings were put online (on YouTube) this year?

On Wed, Jun 20, 2018 at 8:46 PM, Andreas Dilger <adil...@whamcloud.com> wrote:
There is also a semi-complete archive of past LUG events at:

http://wiki.lustre.org/Category:Events

like http://wiki.lustre.org/Lustre_User_Group_2018 and similar.  While
the page layout isn't as fancy, this archive goes back a lot further
than the official OpenSFS pages, and holds presentations, video links,
etc. for LUG sites that no longer exist (e.g. when LUG was organized
with a 3rd-party provider).

It doesn't include presentations from e.g. China LUG or Japan LUG or
other events that I don't have the slides for, but if someone has a
copy of those slides it would be useful to make a page to host them.

Cheers, Andreas

On Jun 20, 2018, at 11:25, Alexander I Kulyavtsev <a...@fnal.gov> wrote:
>
> Slides at:
> http://opensfs.org/lug-2018-agenda/
> -A.
>
> From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org>
>  on behalf of "E.S. Rosenberg" <esr+lus...@mail.hebrew.edu>
> Date: Wednesday, June 20, 2018 at 12:20 PM
> To: Lustre discussion <lustre-discuss@lists.lustre.org>
> Subject: [lustre-discuss] LUG 2018
>
> Hi all,
> Are the talks online yet?
> Thanks,
> Eli
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

---
Andreas Dilger
Principal Lustre Architect
Whamcloud
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] LUG 2018

2018-06-20 Thread E.S. Rosenberg
Thanks, those are great overviews. Does the fact that there are only links
to slides mean that no recordings were put online (on YouTube) this year?

On Wed, Jun 20, 2018 at 8:46 PM, Andreas Dilger 
wrote:

> There is also a semi-complete archive of past LUG events at:
>
> http://wiki.lustre.org/Category:Events
>
> like http://wiki.lustre.org/Lustre_User_Group_2018 and similar.  While
> the page layout isn't as fancy, this archive goes back a lot further
> than the official OpenSFS pages, and holds presentations, video links,
> etc. for LUG sites that no longer exist (e.g. when LUG was organized
> with a 3rd-party provider).
>
> It doesn't include presentations from e.g. China LUG or Japan LUG or
> other events that I don't have the slides for, but if someone has a
> copy of those slides it would be useful to make a page to host them.
>
> Cheers, Andreas
>
> On Jun 20, 2018, at 11:25, Alexander I Kulyavtsev  wrote:
> >
> > Slides at:
> > http://opensfs.org/lug-2018-agenda/
> > -A.
> >
> > From: lustre-discuss  on
> behalf of "E.S. Rosenberg" 
> > Date: Wednesday, June 20, 2018 at 12:20 PM
> > To: Lustre discussion 
> > Subject: [lustre-discuss] LUG 2018
> >
> > Hi all,
> > Are the talks online yet?
> > Thanks,
> > Eli
>
> ---
> Andreas Dilger
> Principal Lustre Architect
> Whamcloud


Re: [lustre-discuss] LUG 2018

2018-06-20 Thread Andreas Dilger
There is also a semi-complete archive of past LUG events at:

http://wiki.lustre.org/Category:Events

like http://wiki.lustre.org/Lustre_User_Group_2018 and similar.  While
the page layout isn't as fancy, this archive goes back a lot further
than the official OpenSFS pages, and holds presentations, video links,
etc. for LUG sites that no longer exist (e.g. when LUG was organized
with a 3rd-party provider).

It doesn't include presentations from e.g. China LUG or Japan LUG or
other events that I don't have the slides for, but if someone has a
copy of those slides it would be useful to make a page to host them.

Cheers, Andreas

On Jun 20, 2018, at 11:25, Alexander I Kulyavtsev  wrote:
> 
> Slides at:
> http://opensfs.org/lug-2018-agenda/
> -A.
> 
> From: lustre-discuss  on behalf of 
> "E.S. Rosenberg" 
> Date: Wednesday, June 20, 2018 at 12:20 PM
> To: Lustre discussion 
> Subject: [lustre-discuss] LUG 2018
> 
> Hi all,
> Are the talks online yet?
> Thanks,
> Eli

---
Andreas Dilger
Principal Lustre Architect
Whamcloud










Re: [lustre-discuss] dealing with maybe dead OST

2018-06-20 Thread Andreas Dilger
On Jun 19, 2018, at 09:33, Robin Humble  wrote:
> 
> Hi,
> 
> so we've maybe lost 1 OST out of a filesystem with 115 OSTs. we may
> still be able to get the OST back, but it's been a month now so
> there's pressure to get the cluster back and working and leave the
> files missing for now...
> 
> the complication is that because the OST might come back to life we
> would like to avoid the users rm'ing their broken files and potentially
> deleting them forever.
> 
> lustre is 2.5.41 ldiskfs centos6.x x86_64.
> 
> ideally I think we'd move all the ~2M files on the OST to a root access
> only "shadow" directory tree in lustre that's populated purely with
> files from the dead OST.
> if we manage to revive the OST then these can magically come back to
> life and we can mv them back into their original locations.
> 
> but currently
>  mv: cannot stat 'some_file': Cannot send after transport endpoint shutdown
> the OST is deactivated on the client. the client hangs if the OST isn't
> deactivated. the OST is still UP & activated on the MDS.
> 
> is there a way to mv files when their OST is unreachable?
> 
> seems like mv is an MDT operation so it should be possible somehow?

This problem is purely of GNU fileutils' making.  The tools are very stat()
happy and will stat() a file and its parent directory several times during
mv, cp, rm, etc. "just to make sure" rather than going ahead and just
trying the operation.  You can see this by running "strace mv <source> <target>",
especially if the two paths are in different directories.

I don't think there is a low-level "rename" tool that is like "unlink"
that will just do the rename() call without all of the overhead.  The
"rename" command is (AFAICS) meant to rename a batch of files with some
common substring in the filename (like "rename foo bar foo*.txt").

In the Lustre source tree there is a very simple C program that only calls

rename(argv[1], argv[2]);

without doing stat() or anything else.  This is lustre/tests/mrename.c
that you could use together with "lfs find", something like:

    cd /mnt/myfs
    mkdir -p .broken_ost0012
    # list regular files with objects on the dead OST, then rename each one
    lfs find . -type f --ost myfs-OST0012 |
    while IFS= read -r F; do
        mkdir -p ".broken_ost0012/$(dirname "$F")"
        mrename "$F" ".broken_ost0012/$F"
    done

(this is completely untested, but something similar should work).
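
The same loop can be sketched in Python, since os.rename() is reported
elsewhere in this thread to work on a client with the OST deactivated. This is
only an illustration: the function name move_to_shadow and its arguments are
hypothetical, the path list is assumed to come from "lfs find" output, and it
has not been run against a Lustre filesystem with a dead OST. It skips paths
already under the shadow directory, because "lfs find ." may also walk into it.

```python
import os


def move_to_shadow(root, shadow_dir, rel_paths):
    """Rename each path (relative to root) into root/shadow_dir.

    The relative directory layout is preserved. The files themselves are
    never opened or stat()ed here; only directory creation and a single
    rename() per file are issued.
    """
    moved, failed = [], []
    for rel in rel_paths:
        rel = os.path.normpath(rel)  # "./a/b" -> "a/b"
        # Skip anything that already lives in the shadow tree.
        if rel == shadow_dir or rel.startswith(shadow_dir + os.sep):
            continue
        src = os.path.join(root, rel)
        dst = os.path.join(root, shadow_dir, rel)
        try:
            # Create the parent directories of the destination.
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            os.rename(src, dst)  # one rename() call, no stat() of src
            moved.append(rel)
        except OSError as exc:
            # Record the failure and carry on with the remaining files.
            failed.append((rel, exc))
    return moved, failed
```

One way to use it would be to save the "lfs find" output to a file, read it
line by line (stripping the trailing newline), and pass the resulting list
with root="/mnt/myfs" and shadow_dir=".broken_ost0012". The shadow directory
would still need to be made root-only by hand.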

> the only thing I've thought of seems pretty out there...
> mount the MDT as ldiskfs and mv the affected files into the shadow
> tree at the ldiskfs level.
> ie. with lustre running and mounted, create an empty shadow tree of
> all dirs under eg. /lustre/shadow/, and then at the ldiskfs level on
> the MDT:
>  for f in ; do
> mv /mnt/mdt0/ROOT/$f /mnt/mdt0/ROOT/shadow/$f
>  done
> 
> would that work?

This would work to some degree, but the "link" xattr on each file
would not be updated, so "lfs fid2path" would be broken until a
full LFSCK is run.

> maybe we'd also have to rebuild OI's and lfsck - something along the
> lines of the MDT restore procedure in the manual. hopefully that would
> all work with an OST deactivated.
> 
> 
> alternatively, should we just unlink all the currently dead files from
> lustre now, and then if the OST comes back can we reconstruct the paths
> and filenames from the FID in xattrs's on the revived OST?
> I suspect unlink is final though and this wouldn't work... ?

That would be possible, but overly complex, since the inodes would be
removed from the MDT and you'd need to reconstruct them with LFSCK and
find the names, as LFSCK would dump them all into $MNT/.lustre/lost+found.

> we can also take an lvm snapshot of the MDT and refer to that later I
> suppose, but I'm not sure how that might help us.

It should be possible to copy the unlinked files from the backup MDT
to the current MDT (via ldiskfs), along with an LFSCK run to rebuild
the OI files.  It is always a good idea to have an MDT device-level
backup before you do anything drastic like this.  However, for the
meantime I think that renaming the broken files to a root-only directory
is the safest.

Cheers, Andreas

---
Andreas Dilger
Principal Lustre Architect
Whamcloud








Re: [lustre-discuss] LUG 2018

2018-06-20 Thread E.S. Rosenberg
I saw the slides, which is why I was wondering about the recordings (if
there are any).

On Wed, Jun 20, 2018 at 8:25 PM, Alexander I Kulyavtsev 
wrote:

> Slides at:
> http://opensfs.org/lug-2018-agenda/
> -A.
>
> From: lustre-discuss  on behalf
> of "E.S. Rosenberg" 
> Date: Wednesday, June 20, 2018 at 12:20 PM
> To: Lustre discussion 
> Subject: [lustre-discuss] LUG 2018
>
> Hi all,
> Are the talks online yet?
> Thanks,
> Eli


Re: [lustre-discuss] LUG 2018

2018-06-20 Thread Alexander I Kulyavtsev
Slides at:
http://opensfs.org/lug-2018-agenda/
-A.

From: lustre-discuss  on behalf of 
"E.S. Rosenberg" 
Date: Wednesday, June 20, 2018 at 12:20 PM
To: Lustre discussion 
Subject: [lustre-discuss] LUG 2018

Hi all,
Are the talks online yet?
Thanks,
Eli




[lustre-discuss] LUG 2018

2018-06-20 Thread E.S. Rosenberg
Hi all,
Are the talks online yet?
Thanks,
Eli


Re: [lustre-discuss] Automated tests

2018-06-20 Thread Peter Jones
There are, but they are temporarily out of service while being relocated.

On 2018-06-20, 8:40 AM, "lustre-discuss on behalf of George Melikov" 
 wrote:

Hello,

Are there any public automated test platforms for the latest Lustre code?

I've found buildbots http://build.lustre.org/console , but they don't run 
any tests.

Thanks!


Sincerely,
George Melikov,
Tel. 7-915-278-39-36
Skype: georgemelikov

Best regards,
George Melikov,
m...@gmelikov.ru
Mobile: +7 9152783936
Skype: georgemelikov


[lustre-discuss] Automated tests

2018-06-20 Thread George Melikov
Hello,

Are there any public automated test platforms for the latest Lustre code?

I've found buildbots http://build.lustre.org/console , but they don't run any 
tests.

Thanks!


Sincerely,
George Melikov,
Tel. 7-915-278-39-36
Skype: georgemelikov

Best regards,
George Melikov,
m...@gmelikov.ru
Mobile: +7 9152783936
Skype: georgemelikov


Re: [lustre-discuss] dealing with maybe dead OST

2018-06-20 Thread Robin Humble
On Wed, Jun 20, 2018 at 10:20:09AM -0400, Robin Humble wrote:
>On Tue, Jun 19, 2018 at 08:54:53PM +, Cowe, Malcolm J wrote:
>>Would using hard links work, instead of mv?

ah. success! looks like it's just that gnu 'mv' and 'ln' are way too
smart for their own good. you got me thinking... what are 'mv' and 'ln'
doing lstat() for anyway?

so I wrote a few lines of C and stdio's rename() "just works" on the
client, even when the OST is disabled (as it damn well should).
too easy...
happily python's os.rename() works too ('cos I am lazy)
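
For the reverse step mentioned in the original post (moving files back to
their original locations if the OST is revived), a sketch along the same lines
might look like this. The name restore_from_shadow and the layout of the
shadow tree are assumptions for illustration, not something tested in this
thread. It refuses to overwrite a path that already exists at the
destination.

```python
import os


def restore_from_shadow(root, shadow_dir):
    """Rename every file under root/shadow_dir back to the same
    relative path under root, and return the restored relative paths."""
    shadow_root = os.path.join(root, shadow_dir)
    restored = []
    for dirpath, _dirnames, filenames in os.walk(shadow_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, shadow_root)
            dst = os.path.join(root, rel)
            if os.path.lexists(dst):
                # Something was created at the old path in the meantime;
                # leave the shadow copy alone rather than clobber it.
                continue
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            os.rename(src, dst)
            restored.append(rel)
    return restored
```

Files that are skipped stay in the shadow tree, so they can be reviewed and
moved by hand afterwards.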

whoo! no need to mess with the MDT. that's a relief.

thanks :)

cheers,
robin


Re: [lustre-discuss] dealing with maybe dead OST

2018-06-20 Thread Robin Humble
Hi Malcolm,

thanks for replying.

On Tue, Jun 19, 2018 at 08:54:53PM +, Cowe, Malcolm J wrote:
>Would using hard links work, instead of mv?

hmm, interesting idea, but no:
  # ln some_file /lustre/shadow/some_file
  ln: failed to access 'some_file' Cannot send after transport endpoint shutdown

ln is trying to lstat() which fails. I think almost all client
operations are going to fail with a deactivated/down OST.

things like 'lfs getstripe' (pure MDS ops) work ok.

or did you mean doing hard links on the MDT?

unless there's a purely MDS lustre tool to do a mv/rename operation on
the MDT, then I think the only option is to mess around with the low
level stuff on the MDT when it's mounted as ldiskfs and hope I don't
break too much...

there used to be a 'lfs mv' (now 'lfs migrate') but that isn't quite the
mv operations I'm after.

any advice or war stories (especially "this is a waste of your time -
it will never work because of X,Y,Z") would be much appreciated :)

time to read more of the lustre manual now...

cheers,
robin


>Malcolm.
> 
>
>On 20/6/18, 1:34 am, "lustre-discuss on behalf of Robin Humble"
><rjh+lus...@cita.utoronto.ca> wrote:
>
>Hi,
>
>so we've maybe lost 1 OST out of a filesystem with 115 OSTs. we may
>still be able to get the OST back, but it's been a month now so
>there's pressure to get the cluster back and working and leave the
>files missing for now...
>
>the complication is that because the OST might come back to life we
>would like to avoid the users rm'ing their broken files and potentially
>deleting them forever.
>
>lustre is 2.5.41 ldiskfs centos6.x x86_64.
>
>ideally I think we'd move all the ~2M files on the OST to a root access
>only "shadow" directory tree in lustre that's populated purely with
>files from the dead OST.
>if we manage to revive the OST then these can magically come back to
>life and we can mv them back into their original locations.
>
>but currently
>  mv: cannot stat 'some_file': Cannot send after transport endpoint 
> shutdown
>the OST is deactivated on the client. the client hangs if the OST isn't
>deactivated. the OST is still UP & activated on the MDS.
>
>is there a way to mv files when their OST is unreachable?
>
>seems like mv is an MDT operation so it should be possible somehow?
>
>
>the only thing I've thought of seems pretty out there...
>mount the MDT as ldiskfs and mv the affected files into the shadow
>tree at the ldiskfs level.
>ie. with lustre running and mounted, create an empty shadow tree of
>all dirs under eg. /lustre/shadow/, and then at the ldiskfs level on
>the MDT:
>  for f in ; do
> mv /mnt/mdt0/ROOT/$f /mnt/mdt0/ROOT/shadow/$f
>  done
>
>would that work?
>maybe we'd also have to rebuild OI's and lfsck - something along the
>lines of the MDT restore procedure in the manual. hopefully that would
>all work with an OST deactivated.
>
>
>alternatively, should we just unlink all the currently dead files from
>lustre now, and then if the OST comes back can we reconstruct the paths
>and filenames from the FID in xattrs's on the revived OST?
>I suspect unlink is final though and this wouldn't work... ?
>
>we can also take an lvm snapshot of the MDT and refer to that later I
>suppose, but I'm not sure how that might help us.
>
>as you can probably tell I haven't had to deal with this particular
>situation before :)
>
>thanks for any help.
>
>cheers,
>robin