Hello:

With a gaming machine

Gigabyte GA 890FXAUD5
Six-core AMD Phenom II 1075T
2x GTX 470
Debian GNU/Linux amd64 wheezy

I have been running the NAMD code (molecular dynamics simulations) successfully. Now I am having problems getting the GTX 470 cards to work, and I can't tell whether it is a hardware or a software problem and, if software, whether the OS is involved. I am submitting the same problem to the NAMD list, as it might be NAMD specific.

When the code works, the top of the log file says:

Info: Based on Charm++/Converse 60303 for net-linux-x86_64-iccstatic
Info: Built Sat Jun 4 02:22:51 CDT 2011 by jim on lisboa.ks.uiuc.edu
Info: 1 NAMD CVS-2011-06-04 Linux-x86_64-CUDA 6 gig64 francesco
Info: Running on 6 processors, 6 nodes, 1 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.00650811 s
Pe 5 sharing CUDA device 1 first 1 next 1
Pe 2 sharing CUDA device 0 first 0 next 4
Did not find +devices i,j,k,... argument, using all
Pe 5 physical rank 5 binding to CUDA device 1 on gig64: 'GeForce GTX 470' Mem: 1279MB Rev: 2.0
Pe 2 physical rank 2 binding to CUDA device 0 on gig64: 'GeForce GTX 470' Mem: 1279MB Rev: 2.0
Pe 0 sharing CUDA device 0 first 0 next 2
Pe 3 sharing CUDA device 1 first 1 next 5
Pe 1 sharing CUDA device 1 first 1 next 3
Pe 1 physical rank 1 binding to CUDA device 1 on gig64: 'GeForce GTX 470' Mem: 1279MB Rev: 2.0
Pe 0 physical rank 0 binding to CUDA device 0 on gig64: 'GeForce GTX 470' Mem: 1279MB Rev: 2.0
Pe 3 physical rank 3 binding to CUDA device 1 on gig64: 'GeForce GTX 470' Mem: 1279MB Rev: 2.0
Pe 4 sharing CUDA device 0 first 0 next 0
Pe 4 physical rank 4 binding to CUDA device 0 on gig64: 'GeForce GTX 470' Mem: 1279MB Rev: 2.0
Info: 1.64104 MB of memory in use based on CmiMemoryUsage
Info: Configuration file is min-02.conf

When it fails:

Info: Based on Charm++/Converse 60303 for net-linux-x86_64-iccstatic
Info: Built Sat Jun 4 02:22:51 CDT 2011 by jim on lisboa.ks.uiuc.edu
Info: 1 NAMD CVS-2011-06-04 Linux-x86_64-CUDA 6 gig64 francesco
Info: Running on 6 processors, 6 nodes, 1 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.0124412 s
Pe 5 sharing CUDA device 0 first 0 next 0
Pe 5 physical rank 5 binding to CUDA device 0 on gig64: 'Device Emulation (CPU)' Mem: 0MB Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 5 (gig64 device 0): no CUDA-capable device is available
------------- Processor 5 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 5 (gig64 device 0): no CUDA-capable device is available
Did not find +devices i,j,k,... argument, using all
Pe 0 sharing CUDA device 0 first 0 next 1
Pe 0 physical rank 0 binding to CUDA device 0 on gig64: 'Device Emulation (CPU)' Mem: 0MB Rev: 9999.9999
Pe 3 sharing CUDA device 0 first 0 next 4
Pe 3 physical rank 3 binding to CUDA device 0 on gig64: 'Device Emulation (CPU)' Mem: 0MB Rev: 9999.9999
Pe 1 sharing CUDA device 0 first 0 next 2
Pe 1 physical rank 1 binding to CUDA device 0 on gig64: 'Device Emulation (CPU)' Mem: 0MB Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 0 (gig64 device 0): no CUDA-capable device is available
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 0 (gig64 device 0): no CUDA-capable device is available
FATAL ERROR: CUDA error cudaStreamCreate on Pe 3 (gig64 device 0): no CUDA-capable device is available
------------- Processor 3 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 3 (gig64 device 0): no CUDA-capable device is available
FATAL ERROR: CUDA error cudaStreamCreate on Pe 1 (gig64 device 0): no CUDA-capable device is available
------------- Processor 1 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 1 (gig64 device 0): no CUDA-capable device is available
Pe 2 sharing CUDA device 0 first 0 next 3
Pe 2 physical rank 2 binding to CUDA device 0 on gig64: 'Device Emulation (CPU)' Mem: 0MB Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 2 (gig64 device 0): no CUDA-capable device is available
------------- Processor 2 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 2 (gig64 device 0): no CUDA-capable device is available
Pe 4 sharing CUDA device 0 first 0 next 5
Pe 4 physical rank 4 binding to CUDA device 0 on gig64: 'Device Emulation (CPU)' Mem: 0MB Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 4 (gig64 device 0): no CUDA-capable device is available
------------- Processor 4 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 4 (gig64 device 0): no CUDA-capable device is available
[0] Stack Traceback:
--------------------------------

In both cases, the kernel module files

/var/lib/dkms/nvidia/270.41.19/2.6.38-2-amd64/x86_64/module/nvidia.ko
/lib/module/2.6.38-2-amd64/update/dkms/nvidia.ko

are in order.

I tried:

nvidia-smi -r (or nvidia-smi -a)

NVIDIA: could not open the device file /dev/nvidia1 (no such file)
Failed to initialize NVML: unknown error.

I am unsure whether these commands are for Tesla cards only.

Thanks for any advice

francesco pietra
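
Since nvidia-smi reports that /dev/nvidia1 does not exist, one thing that could be compared between the working and the failing state is whether the driver's device files are present at all. The following is a minimal Python sketch of such a check, not a fix. It assumes the usual NVIDIA naming, /dev/nvidiactl plus one /dev/nvidiaN per card (N = 0, 1 for the two GTX 470s); the function name missing_nvidia_nodes is made up for illustration.

```python
import os


def missing_nvidia_nodes(gpu_count, dev_dir="/dev"):
    """Return the expected NVIDIA device files that are absent from dev_dir.

    Assumption: the driver uses 'nvidiactl' plus 'nvidia0' .. 'nvidia<N-1>'.
    """
    expected = ["nvidiactl"] + ["nvidia%d" % i for i in range(gpu_count)]
    paths = [os.path.join(dev_dir, name) for name in expected]
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    # Two cards on this machine, so nvidia0 and nvidia1 are expected.
    missing = missing_nvidia_nodes(gpu_count=2)
    if missing:
        print("Missing device files: " + ", ".join(missing))
    else:
        print("All expected NVIDIA device files are present.")
```

If a file is reported missing only in the failing state, that would point toward the driver/OS setup rather than NAMD or the cards themselves, though it would not by itself say why the file was not created.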

