[Lustre-discuss] lustre startup sequence Re: OSTs not activating following MGS/MDS move

2013-03-07 Thread Alex Kulyavtsev
Hi Colin.
This is not what the manual says.

Should it be corrected, then? Or should we add a description of the startup 
sequence for different situations (first start, restart)?

The manual (or the online information) does not describe a graceful shutdown 
sequence for a separate MGS/MDT configuration; it would be nice to add that too.

Alex.

E.g.

http://wiki.lustre.org/manual/LustreManual20_HTML/LustreOperations.html#50438194_24122
and similar

http://build.whamcloud.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.xhtml#dbdoclet.50438194_24122
 13.2  Starting Lustre
 
 The startup order of Lustre components depends on whether you have a combined 
 MGS/MDT or these components are separate.
 
 If you have a combined MGS/MDT, the recommended startup order is OSTs, then 
 the MGS/MDT, and then clients.
 
 If the MGS and MDT are separate, the recommended startup order is: MGS, then 
 OSTs, then the MDT, and then clients.
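
To make the discrepancy concrete, here is a minimal sketch that only writes down the orders quoted in this thread: the two from manual section 13.2 above, and the initial-registration order from Colin's message quoted below. The names `MANUAL_ORDER`, `INITIAL_REGISTRATION_ORDER` and `startup_order` are hypothetical and exist only for this illustration. The sketch does not settle which order is correct.

```python
# Orders transcribed from this thread; all names here are hypothetical.
MANUAL_ORDER = {
    # Manual 13.2, combined MGS/MDT
    "combined": ["OSTs", "MGS/MDT", "clients"],
    # Manual 13.2, separate MGS and MDT
    "separate": ["MGS", "OSTs", "MDT", "clients"],
}

# Order given in Colin's message for initial mount and registration.
INITIAL_REGISTRATION_ORDER = ["MGS", "MDT", "OSTs", "clients"]


def startup_order(combined_mgs_mdt, first_mount=False):
    """Return the mount order as stated in the thread (not verified here)."""
    if first_mount:
        return list(INITIAL_REGISTRATION_ORDER)
    key = "combined" if combined_mgs_mdt else "separate"
    return list(MANUAL_ORDER[key])


if __name__ == "__main__":
    print(" -> ".join(startup_order(combined_mgs_mdt=False)))
    print(" -> ".join(startup_order(combined_mgs_mdt=False, first_mount=True)))
```

Comparing the two printed lines shows the point of the question: for a separate MGS/MDT, the manual puts the OSTs before the MDT, while the registration order puts the MDT first.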



On Mar 7, 2013, at 9:51 AM, Colin Faber wrote:

 Hi Christopher,
 
 In general this can happen when your initial remount of the various 
 services is in the wrong order.
 
 Such as MGS - OST - MDT - Client, or MGS - MDT - Clients - OST, etc.
 
 During initial mount and registration it's critical that your mount be 
 in the correct order:
 
 MGS - MDT - OST(s) - Client(s)
 
 CATALOG corruption or an out-of-order sequence is rarer on an active file 
 system, but it is possible. The simple fix here, as described below, is to 
 just truncate it, and all should be well again.
 
 -cf
 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Document Database Re: [wc-discuss] Seeking contributors for Lustre User Manual

2012-11-14 Thread Alex Kulyavtsev

On Nov 13, 2012, at 3:26 PM, Ned Bass wrote:

 On Tue, Nov 13, 2012 at 11:48:35AM -0800, Nathan Rutman wrote:
 Would it be easier to move the manual back to a Wiki?  The low hassle
 factor of wikis has always been a draw for contribution.  The openSFS
 site is up and running with MediaWiki now (wiki.opensfs.org).
 
 Easier? Yes, probably. Better? I personally don't think so.  Wikis are
 great collaboration tools for informally sharing information, but I
 don't think the paradigm scales well for documents of this size and
 complexity. And a wiki isn't the right tool for producing a formal
 professional-quality document, which is what I think the Lustre manual
 should strive to be.
 
 True, we would lower the bar for contributions, but for that we would
 sacrifice the following features that I consider essential.
 
 - Ability to export to multiple formats (pdf, html, epub) from one source

http://www.docbook.org ?

 - Consistency of formatting and navigation elements
 - A review process for proposed changes that assures a high standard of 
 quality
- ability to track changes between document versions to incrementally update 
'higher level' documents

 
 However, there are some short articles that probably do belong in the
 wiki that could be poached from the manual, i.e. installation and
 configuration procedures, etc.
Right. 
And also the other way around: detailed articles written on the wiki by developers 
can later be 'harvested' by a professional writer into a manual chapter that 
refers to the wiki for details. 
Lowering the entry bar is vital to encourage developers to write or update 
documentation.

DB:
In addition to the wiki and the manual, it would be nice to have a Document 
Database where conference reports, RFCs, RFPs, HLDs, DLDs, ... can be committed, 
updated, and later searched. 
Something like DocDB:
http://sourceforge.net/projects/docdb-v/
Documents can be in any format.
DocDB was created to keep track of documentation in a large collaboration, the 
BTeV experiment, and was then used by several others. DocDB can manage 
access rights to some documents. 

I think we need all three (wiki, DocDB, and manuals); they serve different 
purposes.

KB:
Right now, Lustre support tips and hints live on the lustre-discuss list. It 
is tedious to search emails (no tags, no links), and when an answer is found, 
there is no guarantee it is still relevant.
It would be useful to accumulate tips and best practices in a Knowledge Base and 
to have mechanisms to update it, e.g. instead of answering directly on the list, 
create an entry in the KB and post the reference to the list.

Alex.

 
 Ned


Re: [Lustre-discuss] Tar backup of MDT runs extremely slow, tar pauses on pointers to very large files

2012-05-30 Thread Alex Kulyavtsev

Is this the same issue as in the backup MDT question (and follow-up) at
http://lists.lustre.org/pipermail/lustre-discuss/2009-April/010151.html
caused by sparse files on the MDT?  Does tar take a lot of CPU?
Alex.
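
One way to check the sparse-file guess is to compare a file's apparent size with the space actually allocated to it. The sketch below assumes a Linux/Unix system where `st_blocks` is reported in 512-byte units; the helper names `sparse_report` and `looks_sparse` are hypothetical. It only reports sizes and does not touch Lustre itself.

```python
import os
import sys


def sparse_report(path):
    """Return (apparent_bytes, allocated_bytes) for one file."""
    st = os.stat(path)
    apparent = st.st_size
    # On Linux, st_blocks counts 512-byte units regardless of fs block size.
    allocated = st.st_blocks * 512
    return apparent, allocated


def looks_sparse(apparent, allocated):
    """A file is treated as sparse if it occupies less space than its size."""
    return allocated < apparent


if __name__ == "__main__":
    for name in sys.argv[1:]:
        apparent, allocated = sparse_report(name)
        flag = "sparse" if looks_sparse(apparent, allocated) else "dense"
        print(f"{name}: size={apparent} allocated={allocated} ({flag})")
```

If the slow files turn out to have a huge apparent size but almost no allocated blocks, tar is probably reading through the holes as zeros; GNU tar's `--sparse` option may be worth testing in that case.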

On May 30, 2012, at 5:02 PM, Andreas Dilger wrote:


The tar backup of the MDT is taking a very long time. So far it has
backed up 1.6GB of the 5.0GB used in nine hours. In watching the tar
process pointers to small or average size files are backed up quickly
and at a consistent pace. When tar encounters a pointer/inode
belonging to a very large file (100GB+) the tar process stalls on that
file for a very long time, as if it were trying to archive the real
filesize amount of data rather than the pointer/inode.




[Lustre-discuss] 2.1.1? Re: 1.8.x server for el6

2012-01-05 Thread Alex Kulyavtsev
On Jan 5, 2012, at 12:12 PM, Peter Jones wrote:
 On 12-01-05 9:21 AM, Andreas Dilger wrote:
...
 For new deployments the recommended version is 2.1.0 with RHEL6.1. We are
 starting work on a 2.1.1 maintenance release for the spring.
 While it is not often that I would disagree with Andreas, I would say
 that the answer on this point depends upon your timing. Right now, if
 stability is your primary driver (and it sounds like it is) then I would
 recommend 1.8.7-wc1. The early feedback from 2.1 is very encouraging,
 but I think that we need a little more production feedback before we
 could confidently assert that 2.1.x is the default option.

Peter,
if for some reason (features) we need to stick with 2.1.x and rebuild 
2.1.0 with 2.1.1 patches to get more stability, is there
- a separate branch for it (2.1.1)?
- a JIRA tracker for bug fixes for the 2.1.1 release?

Alex.



Re: [Lustre-discuss] Lustre 2.0 client cache size

2011-03-31 Thread Alex Kulyavtsev
We used POSIX_FADV_DONTNEED in a distributed iozone test to clear the cache 
on the slave client after the initial write and before the client reads back 
the same data.  It helped us see real data rates instead of unrealistically 
high read rates due to caching. Perhaps it was a non-Lustre (NFS) file 
server.

If you do scripting, a small executable like the one at
   http://www.citi.umich.edu/projects/asci/benchmarks.html  (scroll 
down to Clearcache)
can be called after cp or dd.
Alex.
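
For scripting without a separate binary, the same idea can be sketched in Python with `os.posix_fadvise` (available on Linux/Unix, Python 3.3+). This is an illustration of the approach, not the Clearcache tool; the function name `drop_file_cache` is hypothetical. The advice is only a hint to the kernel, and how a given Lustre client version honors it is not verified here.

```python
import os
import sys


def drop_file_cache(path):
    """Ask the kernel to drop cached pages of one file (a hint, not a guarantee)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages cannot be discarded, so flush them to storage first.
        os.fsync(fd)
        # offset=0, len=0 means "from the start to the end of the file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)


if __name__ == "__main__":
    for name in sys.argv[1:]:
        drop_file_cache(name)
```

Saved as a script, it can be run on the file after cp or dd and before the read-back, so neither tool needs patching.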


On Mar 19, 2011, at 12:47 AM, Jay wrote:

 After checking 2.6.35 kernel source code, POSIX_FADV_NOREUSE  
 actually doesn't do anything. So I don't know how it helps. Probably  
 we should do POSIX_FADV_DONTNEED after reading?

 Jay

 On Mar 18, 2011, at 1:07 AM, DEGREMONT Aurelien wrote:

 ... snip...
 Hmm... I do not want to patch 'cp' or 'dd' :)