I have finally discovered that the problem (now partly solved) was with NAMD, not the OS.
francesco pietra

---------- Forwarded message ----------
From: Francesco Pietra <[email protected]>
Date: Sat, Mar 10, 2012 at 7:33 AM
Subject: Fwd: Failure to activate node zero in shared memory machine
To: amd64 Debian <[email protected]>

I forgot to add that I tried with both the AMBER ff and the CHARMM ff (all27), in both cases also with previously proven systems and scripts. Also, I am using the precompiled NAMD (self-contained parallelization), not message passing from Debian.

francesco

---------- Forwarded message ----------
From: Francesco Pietra <[email protected]>
Date: Fri, Mar 9, 2012 at 7:14 PM
Subject: Failure to activate node zero in shared memory machine
To: amd64 Debian <[email protected]>

Hello:
I was running NAMD-CUDA 2.8 4JUN2011nb (a molecular dynamics simulation code) successfully on nvidia 280.13-1. I am now back to NAMD after a few months, on the same machine, now with nvidia 295.20-1 (the version that matches the Debian amd64 xserver and all libraries).

First activating CUDA:

# nvidia-smi -L
# nvidia-smi -pm 1

then launching NAMD, which fails at node zero:

Charmrun> charmrun started...
Charmrun> node programs all started
Charmrun> error 0 attaching to node:
Timeout waiting for node-program to connect

Charmrun> adding client 0: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 1: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 2: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 3: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 4: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 5: "127.0.0.1", IP:127.0.0.1
Charmrun> Charmrun = 127.0.0.1, port = 41824
Charmrun> start 0 node program on localhost.
Charmrun> start 1 node program on localhost.
Charmrun> start 2 node program on localhost.
Charmrun> start 3 node program on localhost.
Charmrun> start 4 node program on localhost.
Charmrun> start 5 node program on localhost.
Charmrun> Waiting for 0-th client to connect.

Hardware:

Gigabyte Technology Co., Ltd. GA-890FXA-UD5/GA-890FXA-UD5, BIOS F6 11/24/2010
AMD Phenom(tm) II X6 1075T Processor (6 cpu cores) (version 2.20.00)
16GB RAM
Two GTX-580

Scanning NUMA topology in Northbridge 24
[ 0.000000] No NUMA configuration found
(SHOULD NUMA BE ACTIVATED? It was not when running in parallel in the past.)

All nvidia tests were OK:

francesco@gig64:~/1PLC$ dpkg -l | grep nvidia
ii  glx-alternative-nvidia      0.2.1       allows the selection of NVIDIA as GLX provider
ii  libgl1-nvidia-alternatives  295.20-1    transition libGL.so* diversions to glx-alternative-nvidia
ii  libgl1-nvidia-glx           295.20-1    NVIDIA binary OpenGL libraries
ii  libglx-nvidia-alternatives  295.20-1    transition libgl.so diversions to glx-alternative-nvidia
ii  libnvidia-compiler-ia32     295.20-1    NVIDIA runtime compiler library (32-bit)
ii  libnvidia-ml1               295.20-1    NVIDIA management library (NVML) runtime library
ii  nvidia-alternative          295.20-1    allows the selection of NVIDIA as GLX provider
ii  nvidia-compute-profiler     4.0.17-3    NVIDIA Compute Visual Profiler
ii  nvidia-cuda-dev             4.0.17-3    NVIDIA CUDA development files
ii  nvidia-cuda-doc             4.1.28-1    NVIDIA CUDA and OpenCL documentation
ii  nvidia-cuda-gdb             4.1.28-1    NVIDIA CUDA GDB
ii  nvidia-cuda-toolkit         4.0.17-3    NVIDIA CUDA toolkit
ii  nvidia-glx                  295.20-1    NVIDIA metapackage
ii  nvidia-installer-cleanup    20111111+3  Cleanup after driver installation with the nvidia-installer
ii  nvidia-kernel-common        20111111+3  NVIDIA binary kernel module support files
ii  nvidia-kernel-dkms          295.20-1    NVIDIA binary kernel module DKMS source
ii  nvidia-libopencl1           295.20-1    NVIDIA OpenCL library
ii  nvidia-libopencl1-ia32      295.20-1    NVIDIA OpenCL 32-bit library
ii  nvidia-opencl-common        295.20-1    NVIDIA OpenCL driver
ii  nvidia-opencl-dev           4.0.17-3    NVIDIA OpenCL development files
ii  nvidia-opencl-icd-ia32      295.20-1    NVIDIA OpenCL ICD (32-bit)
ii  nvidia-smi                  295.20-1    NVIDIA System Management Interface
ii  nvidia-support              20111111+3  NVIDIA binary graphics driver support files
ii  nvidia-vdpau-driver         295.20-1    NVIDIA vdpau driver
ii  nvidia-xconfig              295.20-1    X configuration tool for non-free NVIDIA drivers
ii  xserver-xorg-video-nvidia   295.20-1    NVIDIA binary Xorg driver
francesco@gig64:~/1PLC$

root@gig64:/home/francesco/1PLC# modinfo nvidia
filename:       /lib/modules/2.6.38-2-amd64/updates/dkms/nvidia.ko
alias:          char-major-195-*
version:        295.20
supported:      external
license:        NVIDIA
alias:          pci:v000010DEd00000E00sv*sd*bc04sc80i00*
alias:          pci:v000010DEd00000AA3sv*sd*bc0Bsc40i00*
alias:          pci:v000010DEd*sv*sd*bc03sc02i00*
alias:          pci:v000010DEd*sv*sd*bc03sc00i00*
depends:        i2c-core
vermagic:       2.6.38-2-amd64 SMP mod_unload modversions
parm:           NVreg_EnableVia4x:int
parm:           NVreg_EnableALiAGP:int
parm:           NVreg_ReqAGPRate:int
parm:           NVreg_EnableAGPSBA:int
parm:           NVreg_EnableAGPFW:int
parm:           NVreg_Mobile:int
parm:           NVreg_ResmanDebugLevel:int
parm:           NVreg_RmLogonRC:int
parm:           NVreg_ModifyDeviceFiles:int
parm:           NVreg_DeviceFileUID:int
parm:           NVreg_DeviceFileGID:int
parm:           NVreg_DeviceFileMode:int
parm:           NVreg_RemapLimit:int
parm:           NVreg_UpdateMemoryTypes:int
parm:           NVreg_InitializeSystemMemoryAllocations:int
parm:           NVreg_UseVBios:int
parm:           NVreg_RMEdgeIntrCheck:int
parm:           NVreg_UsePageAttributeTable:int
parm:           NVreg_EnableMSI:int
parm:           NVreg_MapRegistersEarly:int
parm:           NVreg_RegisterForACPIEvents:int
parm:           NVreg_RegistryDwords:charp
parm:           NVreg_RmMsg:charp
parm:           NVreg_NvAGP:int
root@gig64:/home/francesco/1PLC#

Thanks a lot for any advice.

francesco pietra

--
To UNSUBSCRIBE, email to [email protected]
with a subject of "unsubscribe". Trouble? Contact [email protected]
Archive: http://lists.debian.org/CAEv0nmt9znELnBAtqHcxVgmdXWbnnsa6kOHcPq3TZ=ethqz...@mail.gmail.com

