[Lustre-discuss] inconsistent client behavior when creating an empty directory

2011-08-09 Thread Andrej Filipcic

Hi,

the following code does not work as expected:
-
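#define _GNU_SOURCE   /* needed for setresuid() */
#include <sys/stat.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char** argv) {

  int rc;
  /* first attempt, run as root */
  rc=mkdir(argv[1],S_IRWXU);
  if(rc) perror("failed create dir");
  chown(argv[1],4103,4100);

  struct stat buf;
  /* stat(argv[1],&buf); */

  /* drop to the unprivileged user, then retry the mkdir */
  setresuid(0,4103,4100);
  rc=mkdir(argv[1],S_IRWXU);
  if(rc) perror("failed create dir as user");
}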
-

initial status:

# ls -ld /lustre/test
drwxr-xr-x 2 root root 4096 Aug  9 14:59 /lustre/test
# ls -l /lustre/test
total 0

1) running the test program:

# /tmp/test /lustre/test/testdir
failed create dir as user: Permission denied
# ls -l /lustre/test
total 4
drwx------ 2 griduser03 grid 4096 Aug  9 15:02 testdir

griduser03, grid correspond to uid=4103,gid=4100


2) running the test program, but with uncommented stat call:
# /tmp/test /lustre/test/testdir
failed create dir as user: File exists
# ls -l /lustre/test
total 4
drwx------ 2 griduser03 grid 4096 Aug  9 15:04 testdir


The code first creates testdir as root and changes its ownership to uid 4103. 
Then it tries to (re)create the same directory with the user's privileges. 

If stat is called, the code behaves as expected (case 2). If it is not (case 
1), the second mkdir returns EACCES, although it should return EEXIST. Is this 
behavior expected or is it a client bug? The client runs Lustre 1.8.6.

The code just illustrates what is actually done in a complex piece of software.

Andrej

-- 
   prof. dr. Andrej Filipcic,   E-mail: andrej.filip...@ijs.si
   Department of Experimental High Energy Physics - F9
   Jozef Stefan Institute, Jamova 39, P.o.Box 3000
   SI-1001 Ljubljana, Slovenia
   Tel.: +386-1-477-3674    Fax: +386-1-477-3166

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Parallel Data Storage Workshop 2011 CFP

2011-08-09 Thread Andreas Dilger

6th Parallel Data Storage Workshop
Sunday, November 13, 2011, 9:00am - 5:30pm
http://www.pdsi-scidac.org/events/PDSW11/
Held in conjunction with SC11 in Seattle, WA 


PDSW11 CALL FOR PAPERS

Workshop Abstract: Computational scientists are no longer satisfied with
petascale infrastructures.  Their demands for finer and finer spatial and
temporal resolutions are driving parallel storage systems to larger and larger
scales of parallelism and concurrency.  This scale creates new problems and
exacerbates old ones in areas such as storage capacity, performance,
concurrency, data retrieval, reliability, availability, and manageability.
Additionally, new technologies such as cloud storage are encouraging scientists
to preserve more old data and to expand their analyses to include data 
from a wider range of previous computations.  Paying special attention to
issues in which community collaboration can be crucial such as problem
identification, workload capture, solution interoperability, standards with
community buy-in, and shared tools, this one-day workshop seeks contributions
in the form of papers and posters on relevant topics, including but not limited
to: 

* performance and benchmarking results and tools, 
* failure tolerance, 
* APIs and protocols for high performance features, 
* parallel file systems, 
* high bandwidth storage architectures, 
* wide area file systems, 
* metadata intensive workloads, 
* information extraction,
* autonomics for HPC storage, 
* checkpoint/restart,  
* virtualization for storage systems,
* archival storage advances, and 
* resource management innovations.


Paper Submissions:

Due: Friday, September 16, 2011, 11:59 PM PDT
Notification: Tuesday, October 11, 2011
Camera-ready due: Sunday, November 6, 2011
Slides due: Friday, Nov. 11, 2011 

The parallel data storage workshop holds a peer reviewed competitive process
for selecting extended abstracts and short papers.  Submit a not previously
published extended abstract of up to 5 pages, not less than 10 point font, in a
PDF file as instructed on the workshop web site.  Submitted papers will be
reviewed under the supervision of the workshop program committee. Submissions
should indicate authors and affiliations.  Final papers must not be longer than
5 pages.  Selected papers and associated talk slides will be made available on
the workshop web site; the papers will also be published in the digital library
of the IEEE or ACM. 


Poster Submissions:

Due: Monday, November 7, 2011, 11:59 PM PDT
Notification: Wednesday, November 9, 2011

The PDSW program committee highly encourages authors of accepted papers to
present posters of their work. Additional submissions for technical poster
presentation will be considered if they are marked as such and include title
and author list and a short abstract. Further specifications for poster
production will be available on the workshop web site.


Program Committee:

John Bent, Los Alamos National Laboratory (PC Chair)
Randal Burns, Johns Hopkins University 
Andreas Dilger, Whamcloud, Inc. 
Yong Chen, Texas Tech University 
Haryadi Gunawi, University of California, Berkeley 
Adam Manzanares, Los Alamos National Laboratory 
Dutch Meyer, University of British Columbia 
Ethan Miller, University of California, Santa Cruz 
Ron Oldfield, Sandia National Laboratories 
Vijayan Prabhakaran, Microsoft Research 
Karsten Schwan, Georgia Tech 
Brad Settlemyer, Oak Ridge National Laboratory 
Raju Rangaswami, Florida International University 
Doug Thain, University of Notre Dame 
Rob Ross, Argonne National Laboratory 


Steering Committee:

Scott Brandt, University of California, Santa Cruz
Evan J. Felix, Pacific Northwest National Laboratory
Garth A. Gibson, Carnegie Mellon University and Panasas Inc.
Gary Grider, Los Alamos National Laboratory
Peter Honeyman, University of Michigan, Ann Arbor, Center for Information 
Technology Integration
Bill Kramer, National Center for Supercomputing Applications/University of 
Illinois Urbana-Champaign
Darrell Long, University of California, Santa Cruz
Carlos Maltzahn, University of California, Santa Cruz
Philip C. Roth, Oak Ridge National Laboratory
John Shalf, National Energy Research Scientific Computing Center, Lawrence 
Berkeley National Laboratory
Lee Ward, Sandia National Laboratories





Re: [Lustre-discuss] inconsistent client behavior when creating an empty directory

2011-08-09 Thread Kevin Van Maren
This appears to be the same issue as 
https://bugzilla.lustre.org/show_bug.cgi?id=23459

Kevin




[Lustre-discuss] New to Lustre, test install.

2011-08-09 Thread Ray Muno
Now that I have located what I want to use for a Lustre deployment test, I am 
running into a few issues.

(If there is a searchable archive for this mailing list, I would have 
started there. I only found it archived by date.)

I have a fresh install of CentOS 5.6.

I installed Lustre from the pre-built RPMs available on Whamcloud's 
server.

I followed the "Walk-thru- Deploying a Lustre pre-built kernel" page, which 
seems to be a bit out of date.  There are some errors on this page 
concerning the installation of Ldiskfs. That section seems to be an edited 
clone of the Lustre Modules section.

http://wiki.whamcloud.com/display/PUB/Walk-thru-+Deploying+a+Lustre+pre-built+kernel

 From there I went to testing.

http://wiki.whamcloud.com/display/PUB/Testing+a+Lustre+filesystem

When I run the test suite, as indicated, I do not get very far.

# /usr/lib64/lustre/tests/llmount.sh
Stopping clients: nike-lustre-oss-0-0.local /mnt/lustre (opts:)
Stopping clients: nike-lustre-oss-0-0.local /mnt/lustre2 (opts:)
Loading modules from /usr/lib64/lustre/tests/..
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet options: 'networks=tcp0 accept=all'
Formatting mgs, mds, osts
Checking servers environments
Checking clients nike-lustre-oss-0-0.local environments
Setup mgs, mdt, osts
Starting mds: -o loop  /tmp/lustre-mdt /mnt/mds
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet.debug_mb=24
error: set_param: writing to file /proc/sys/lnet/debug_mb: Invalid argument


-Ray Muno
  University of Minnesota


Re: [Lustre-discuss] [bug?] mdc_enter_request() problems

2011-08-09 Thread Kevin Van Maren
chas williams - CONTRACTOR wrote:
> On Mon, 08 Aug 2011 12:03:25 -0400
> chas williams - CONTRACTOR c...@cmf.nrl.navy.mil wrote:
>
>> later mdc_exit_request() finds this mcw by iterating the list.
>> seeing as mcw was allocated on the stack, i dont think you can do this.
>> mcw might have been reused by the time mdc_exit_request() gets around
>> to removing it.
>
> nevermind. i see this has been fixed in later releases apparently (i
> was looking at 1.8.5). if l_wait_event() returns early (like
> from being interrupted) mdc_enter_request() does the cleanup itself now.

That code is unchanged in 1.8.6.

Kevin



Re: [Lustre-discuss] New to Lustre, test install.

2011-08-09 Thread Peter Jones
Ray

If your questions relate to content on a Whamcloud wiki relating to a 
Whamcloud release it would be more appropriate to post to the Whamcloud 
discuss mailing list - 
https://groups.google.com/a/whamcloud.com/group/wc-discuss ;-)

Peter


-- 
Peter Jones
Whamcloud, Inc.
www.whamcloud.com



Re: [Lustre-discuss] [bug?] mdc_enter_request() problems

2011-08-09 Thread chas williams - CONTRACTOR
On Tue, 09 Aug 2011 10:29:43 -0600
Kevin Van Maren kevin.van.ma...@oracle.com wrote:

> chas williams - CONTRACTOR wrote:
>> nevermind. i see this has been fixed in later releases apparently (i
>> was looking at 1.8.5). if l_wait_event() returns early (like
>> from being interrupted) mdc_enter_request() does the cleanup itself now.
>
> That code is unchanged in 1.8.6.

it appears to have been fixed in the 2.x releases.  i think this is the
relevant change http://review.whamcloud.com/#change,506
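
The hazard discussed in this thread, a waiter allocated on the stack that is still linked into a shared list after its function returns, can be illustrated with a small userspace sketch. All names below (waiter, wait_queue, enter_request, and the two simulated wait functions) are hypothetical and this is not Lustre source code. It also omits the locking that a real implementation would need around every list operation. It only shows the idea described above: if the wait ends early, the waiter unlinks itself before its stack frame disappears.

```c
#include <assert.h>

/* Hypothetical names throughout; this is NOT Lustre source code. */
struct waiter {
    struct waiter *prev, *next;
};

struct wait_queue {
    struct waiter head;            /* sentinel of a circular list */
};

void queue_init(struct wait_queue *q)
{
    q->head.prev = q->head.next = &q->head;
}

int queue_empty(const struct wait_queue *q)
{
    return q->head.next == &q->head;
}

static void waiter_add(struct wait_queue *q, struct waiter *w)
{
    w->prev = q->head.prev;
    w->next = &q->head;
    q->head.prev->next = w;
    q->head.prev = w;
}

static void waiter_del(struct waiter *w)
{
    w->prev->next = w->next;
    w->next->prev = w->prev;
    w->prev = w->next = w;
}

/* Simulated normal wakeup: the waker unlinks the first waiter. */
int woken_wait(struct wait_queue *q)
{
    assert(!queue_empty(q));
    waiter_del(q->head.next);
    return 0;
}

/* Simulated interruption: nobody unlinks the waiter. */
int interrupted_wait(struct wait_queue *q)
{
    (void)q;
    return -1;
}

int enter_request(struct wait_queue *q, int (*wait_fn)(struct wait_queue *))
{
    struct waiter w;               /* lives on this function's stack */
    int rc;

    waiter_add(q, &w);
    rc = wait_fn(q);
    /* If the wait ended early, the waker never unlinked us, so we must
     * do it here; otherwise the list would keep a pointer into a stack
     * frame that is about to be reused. */
    if (rc != 0)
        waiter_del(&w);
    return rc;
}
```

In both the woken and the interrupted case the queue is empty again once enter_request returns, so nothing is left pointing at the dead stack frame.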


Re: [Lustre-discuss] New to Lustre, test install.

2011-08-09 Thread Oleg Drokin
Hello!

On Aug 9, 2011, at 12:20 PM, Ray Muno wrote:
> When I run the test suite, as indicated, I do not get very far.
>
> # /usr/lib64/lustre/tests/llmount.sh
> ...
> lnet.debug_mb=24
> error: set_param: writing to file /proc/sys/lnet/debug_mb: Invalid argument

If you have a lot of CPU cores, try setting the DEBUG_SIZE environment variable 
to something bigger, like double your number of CPU cores. (This is the size of 
the debug buffers in megabytes. If you are not interested in the debug buffers, 
set it to 0.)

Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.



[Lustre-discuss] Lustre-1.8.4 : BUG soft lock up

2011-08-09 Thread Jeff Johnson
Greetings,

The below console output is from a 1.8.4 OST (RHEL5.5, 
2.6.18-194.3.1.el5_lustre.1.8.4, x86_64). Not saying it is a Lustre bug 
for sure. Just wondering if anyone has seen this or something very 
similar. Updating to 1.8.6 WC variant isn't an option at this time.

If anyone has some insight into this I'd appreciate the feedback.

Thanks,

--Jeff

BUG: soft lockup - CPU#6 stuck for 10s! [kswapd0:409]
CPU 6:
Modules linked in: obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) 
jbd2(U) crc16(U) lustre(U) lov(U) mdc(U) lquota(U)
osc(U) ksocklnd(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) 
autofs4(U) hidp(U) l2cap(U) bluetooth(U)
lockd(U) sunrpc(U) ip6t_REJECT(U) xt_tcpudp(U) ip6table_filter(U) ip6_tables(U) 
x_tables(U) ib_iser(U) libiscsi2(U)
scsi_transport_iscsi2(U) scsi_transport_iscsi(U) ib_srp(U) rds(U) ib_sdp(U) 
ib_ipoib(U) ipoib_helper(U) ipv6(U) xfrm_nalgo(U)
crypto_api(U) rdma_ucm(U) rdma_cm(U) ib_ucm(U) ib_uverbs(U) ib_umad(U) ib_cm(U) 
iw_cm(U) ib_addr(U) ib_sa(U) mptsas(U) mptctl(U)
dm_mirror(U) dm_multipath(U) scsi_dh(U) video(U) backlight(U) sbs(U) 
power_meter(U) hwmon(U) i2c_ec(U) dell_wmi(U) wmi(U)
button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) 
parport(U) mlx4_ib(U) ib_mad(U) ib_core(U)
mlx4_en(U) joydev(U) shpchp(U) sg(U) mlx4_core(U) e1000e(U) serio_raw(U) 
pcspkr(U) i2c_i801(U) i2c_core(U) dm_raid45(U)
dm_message(U) dm_region_hash(U) dm_log(U) dm_mod(U) dm_mem_cache(U) mptspi(U) 
scsi_transport_spi(U) mptscsih(U) mptbase(U)
scsi_transport_sas(U) ata_piix(U) libata(U) sd_mod(U) scsi_mod(U) raid1(U) 
ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Pid: 409, comm: kswapd0 Tainted: G  2.6.18-194.3.1.el5_lustre.1.8.4 #1
RIP: 0010:[801011bf]  [801011bf] dqput+0x105/0x19f
RSP: 0018:8101be805cd0  EFLAGS: 0202
RAX: 81012e03f000 RBX:  RCX: 81012e03f000
RDX: ffe2 RSI: 0002 RDI: 81012f4f01c0
RBP: 81007fb4c918 R08: 81018b00 R09: 81007fb4c918
R10: 8101be805c60 R11: 8b6448f0 R12: 8101be805c60
R13: 8b6448f0 R14: ffe2 R15: 8b6448f0
FS:  () GS:8101bfc2adc0() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 00402000 CR3: 00201000 CR4: 06e0

Call Trace:
  [8010182b] dquot_drop+0x30/0x5e
  [8b647e83] :ldiskfs:ldiskfs_dquot_drop+0x43/0x70
  [80022d99] clear_inode+0xb4/0x123
  [80034e52] dispose_list+0x41/0xe0
  [8002d6a7] shrink_icache_memory+0x1b7/0x1e6
  [8003f466] shrink_slab+0xdc/0x153
  [80057e59] kswapd+0x343/0x46c
  [800a0ab2] autoremove_wake_function+0x0/0x2e
  [80057b16] kswapd+0x0/0x46c
  [800a089a] keventd_create_kthread+0x0/0xc4
  [80032890] kthread+0xfe/0x132
  [8009d728] request_module+0x0/0x14d
  [8005dfb1] child_rip+0xa/0x11
  [800a089a] keventd_create_kthread+0x0/0xc4
  [80032792] kthread+0x0/0x132
  [8005dfa7] child_rip+0x0/0x11


-- 
Jeff Johnson
Manager
Aeon Computing

jeff.johnson at aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101   f: 858-412-3845

4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117
