I forgot to copy the list.
f.
---------- Forwarded message ----------
From: Francesco Pietra <chiendar...@gmail.com>
Date: Thu, Jun 16, 2011 at 4:11 PM
Subject: Re: Fwd: "cuda error cudastreamcreate",
To: Brian Morris <cymraeg...@gmail.com>

Oh, no, absolutely no. Where are the scientific OpenCL applications? And
not only for that.
f.

On Thu, Jun 16, 2011 at 3:59 AM, Brian Morris <cymraeg...@gmail.com> wrote:
> Why are you using CUDA rather than OpenCL? Nvidia has said they are cutting
> back on their GPU business and moving into CPUs for tablets which are now
> appearing on the market. If you have to move to AMD/ATI in the future OpenCL
> will still work, but CUDA will not.
>
>
> On Wed, Jun 15, 2011 at 8:22 AM, Francesco Pietra <chiendar...@gmail.com>
> wrote:
>>
>> Running "nvidia-smi -L" as root restores the visibility of the graphics
>> cards. At every boot that visibility vanishes. So, it is a small
>> problem, or no problem. francesco
>>
>>
>> ---------- Forwarded message ----------
>> From: Francesco Pietra <chiendar...@gmail.com>
>> Date: Wed, Jun 15, 2011 at 4:37 PM
>> Subject: Fwd: Fwd: "cuda error cudastreamcreate",
>> To: Lennart Sorensen <lsore...@csclub.uwaterloo.ca>, amd64 Debian
>> <debian-amd64@lists.debian.org>
>>
>>
>> The simulation (pressure equilibration) was completed successfully.
>> The next run (just a continuation of the previous pressure equilibration)
>> failed, again with 'Device Emulation (CPU)'; see the log file below.
>> Attempted again, same error.
>>
>> # modinfo nvidia
>> filename: /lib/modules/2.6.38-2-amd64/updates/dkms/nvidia.ko
>> alias: char-major-195-*
>> supported: external
>> license: NVIDIA
>> alias: pci:v000010DEd00000E00sv*sd*bc04sc80i00*
>> alias: pci:v000010DEd00000AA3sv*sd*bc0Bsc40i00*
>> alias: pci:v000010DEd*sv*sd*bc03sc02i00*
>> alias: pci:v000010DEd*sv*sd*bc03sc00i00*
>> depends: i2c-core
>> vermagic: 2.6.38-2-amd64 SMP mod_unload modversions
>> parm: NVreg_EnableVia4x:int
>> parm: NVreg_EnableALiAGP:int
>> parm: NVreg_ReqAGPRate:int
>> parm: NVreg_EnableAGPSBA:int
>> parm: NVreg_EnableAGPFW:int
>> parm: NVreg_Mobile:int
>> parm: NVreg_ResmanDebugLevel:int
>> parm: NVreg_RmLogonRC:int
>> parm: NVreg_ModifyDeviceFiles:int
>> parm: NVreg_DeviceFileUID:int
>> parm: NVreg_DeviceFileGID:int
>> parm: NVreg_DeviceFileMode:int
>> parm: NVreg_RemapLimit:int
>> parm: NVreg_UpdateMemoryTypes:int
>> parm: NVreg_InitializeSystemMemoryAllocations:int
>> parm: NVreg_UseVBios:int
>> parm: NVreg_RMEdgeIntrCheck:int
>> parm: NVreg_UsePageAttributeTable:int
>> parm: NVreg_EnableMSI:int
>> parm: NVreg_MapRegistersEarly:int
>> parm: NVreg_RegisterForACPIEvents:int
>> parm: NVreg_RegistryDwords:charp
>> parm: NVreg_RmMsg:charp
>> parm: NVreg_NvAGP:int
>>
>> However:
>>
>> $ nvidia-smi -L
>> Could not open device /dev/nvidia1 (no such file)
>> Failed to initialize NVML: unknown error.
>>
>>
>> I am unable to draw technical conclusions from this 'unknown error'. I
>> wonder whether other information can be extracted to fix the problems.
>>
>> Thanks for advice.
>>
>> francesco
>>
>>
>> Info: Based on Charm++/Converse 60303 for net-linux-x86_64-iccstatic
>> Info: Built Sat Jun 4 02:22:51 CDT 2011 by jim on lisboa.ks.uiuc.edu
>> Info: 1 NAMD CVS-2011-06-04 Linux-x86_64-CUDA 6 gig64 francesco
>> Info: Running on 6 processors, 6 nodes, 1 physical nodes.
>> Info: CPU topology information available.
>> Info: Charm++/Converse parallel runtime startup completed at 0.00658393 s
>> Pe 2 sharing CUDA device 0 first 0 next 3
>> Pe 2 physical rank 2 binding to CUDA device 0 on gig64: 'Device Emulation (CPU)' Mem: 0MB Rev: 9999.9999
>> FATAL ERROR: CUDA error cudaStreamCreate on Pe 2 (gig64 device 0): no CUDA-capable device is available
>>
>>
>> ---------- Forwarded message ----------
>> From: Francesco Pietra <chiendar...@gmail.com>
>> Date: Wed, Jun 15, 2011 at 9:04 AM
>> Subject: Re: Fwd: "cuda error cudastreamcreate",
>> To: Fabricio Cannini <fabri...@versatushpc.com.br>, Lennart Sorensen
>> <lsore...@csclub.uwaterloo.ca>, amd64 Debian
>> <debian-amd64@lists.debian.org>
>>
>>
>> The "nvidia-smi -L" output was for a machine of Jim Phillips, the
>> main developer of NAMD. He provided that to show that it should also
>> work with my GTX 470 cards.
>>
>> That said, my problems seem to have been solved by following Lennart's
>> instructions. The driver was rebuilt, dated 15 June, and the NAMD simulation
>> could be started normally. However, we have to wait before claiming
>> full victory. Please see below.
>>
>> In retrospect, the nvidia.ko I had before, dated 5 June, must have
>> also been built within Debian. Renaming it no_nvidia.ko prevented
>> rebuilding for the reasons that Lennart clarified.
>>
>> For some reason, the previous installation of nvidia.ko must have had
>> some problems, as, for example, "nvidia-smi -L" did not work (there
>> was a single installation of nvidia-smi, "nvidia-smi 270.41.19-1"),
>> while the "modinfo nvidia" output was correct.
>> Now, both are correct:
>>
>> $ nvidia-smi -L
>> GPU 0: GeForce GTX 470 (UUID: N/A)
>> GPU 1: GeForce GTX 470 (UUID: N/A)
>>
>> # modinfo nvidia
>> filename: /lib/modules/2.6.38-2-amd64/updates/dkms/nvidia.ko
>> alias: char-major-195-*
>> supported: external
>> license: NVIDIA
>> alias: pci:v000010DEd00000E00sv*sd*bc04sc80i00*
>> alias: pci:v000010DEd00000AA3sv*sd*bc0Bsc40i00*
>> alias: pci:v000010DEd*sv*sd*bc03sc02i00*
>> alias: pci:v000010DEd*sv*sd*bc03sc00i00*
>> depends: i2c-core
>> vermagic: 2.6.38-2-amd64 SMP mod_unload modversions
>> parm: NVreg_EnableVia4x:int
>> parm: NVreg_EnableALiAGP:int
>> parm: NVreg_ReqAGPRate:int
>> parm: NVreg_EnableAGPSBA:int
>> parm: NVreg_EnableAGPFW:int
>> parm: NVreg_Mobile:int
>> parm: NVreg_ResmanDebugLevel:int
>> parm: NVreg_RmLogonRC:int
>> parm: NVreg_ModifyDeviceFiles:int
>> parm: NVreg_DeviceFileUID:int
>> parm: NVreg_DeviceFileGID:int
>> parm: NVreg_DeviceFileMode:int
>> parm: NVreg_RemapLimit:int
>> parm: NVreg_UpdateMemoryTypes:int
>> parm: NVreg_InitializeSystemMemoryAllocations:int
>> parm: NVreg_UseVBios:int
>> parm: NVreg_RMEdgeIntrCheck:int
>> parm: NVreg_UsePageAttributeTable:int
>> parm: NVreg_EnableMSI:int
>> parm: NVreg_MapRegistersEarly:int
>> parm: NVreg_RegisterForACPIEvents:int
>> parm: NVreg_RegistryDwords:charp
>> parm: NVreg_RmMsg:charp
>> parm: NVreg_NvAGP:int
>>
>>
>> I said above that time will show if the system is stable. In fact,
>> this morning, the NAMD simulation did not start (I am using the console
>> history to recall commands, so there was no typing error). I had not
>> carried out any amd64 upgrade in between.
>> From the simulation log:
>>
>>
>> Info: Charm++/Converse parallel runtime startup completed at 0.00989103 s
>> Pe 2 sharing CUDA device 0 first 0 next 3
>> Pe 2 physical rank 2 binding to CUDA device 0 on gig64: 'Device Emulation (CPU)' Mem: 0MB Rev: 9999.9999
>> FATAL ERROR: CUDA error cudaStreamCreate on Pe 2 (gig64 device 0): no CUDA-capable device is available
>>
>> 'Device Emulation (CPU)' indicates (for reasons unclear to me)
>> that things have gone wrong.
>>
>> On a second identical attempt (after having explored the driver
>> location and carried out info commands), the NAMD simulation started, with
>> the correct log output:
>>
>> Info: Based on Charm++/Converse 60303 for net-linux-x86_64-iccstatic
>> Info: Built Sat Jun 4 02:22:51 CDT 2011 by jim on lisboa.ks.uiuc.edu
>> Info: 1 NAMD CVS-2011-06-04 Linux-x86_64-CUDA 6 gig64 francesco
>> Info: Running on 6 processors, 6 nodes, 1 physical nodes.
>> Info: CPU topology information available.
>> Info: Charm++/Converse parallel runtime startup completed at 0.00650811 s
>>
>>
>> We will see whether failure or success recurs (a simulation now
>> lasts several hours, which would be days on an 8-processor
>> machine). If failure occurs again, there are many possible
>> reasons, including problems with the NAMD code.
>>
>> I was so disheartened yesterday that I alluded to a change of driver
>> source, which was unfair.
>>
>> Thanks a lot
>> francesco
>>
>> On Wed, Jun 15, 2011 at 2:22 AM, Fabricio Cannini
>> <fabri...@versatushpc.com.br> wrote:
>> > On Tuesday, 14 June 2011, at 16:01:57, Lennart Sorensen wrote:
>> >> On Tue, Jun 14, 2011 at 07:23:38PM +0200, Francesco Pietra wrote:
>> >> > I forgot to answer: yes, sometimes it works, sometimes not, everything
>> >> > being the same.
>> >> >
>> >> > As a matter of fact, after a day of failure, I have now renamed back
>> >> >
>> >> > /lib/modules/2.6.38-2-amd64/updates/dkms/no_nvidia.ko
>> >> >
>> >> > to
>> >> >
>> >> > /lib/modules/2.6.38-2-amd64/updates/dkms/nvidia.ko
>> >> >
>> >> > and the NAMD simulation started normally using both GTX 470 cards. The
>> >> > machine had not been touched either.
>> >>
>> >> I wonder if having the 9800 card in there along with the 470 gtx cards
>> >> is confusing the driver. Maybe the card order is getting swapped around
>> >> on some boots.
>> >>
>> >> What is the 9800 doing in the box anyhow?
>> >
>> > Hi All.
>> >
>> > I'm thinking the same as Lennart. It seems to me that the order in which
>> > the cards are named varies, thus confusing the application(s). I'd try to
>> > fix the order in /etc/X11/xorg.conf and see if it works. Look in the CUDA
>> > docs for how to do that.
>> >
>> > Good luck.
>> >
>> >
>> > --
>> > To UNSUBSCRIBE, email to debian-amd64-requ...@lists.debian.org
>> > with a subject of "unsubscribe". Trouble? Contact
>> > listmas...@lists.debian.org
>> > Archive: http://lists.debian.org/201106142122.04376.fcann...@gmail.com
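
The two symptoms reported in this thread (a missing /dev/nvidiaN node, and
NAMD binding to 'Device Emulation (CPU)') could be checked before launching
a long run. What follows is only a minimal Python sketch under stated
assumptions: the function names are hypothetical, the node names
/dev/nvidia0 ... /dev/nvidiaN-1 plus /dev/nvidiactl are assumed from the
usual NVIDIA Linux driver layout, two cards are assumed as in the
"nvidia-smi -L" listing above, and the log test is a plain substring match
on the message quoted in the thread. It is not part of NAMD or of the
NVIDIA tools.

```python
import os

# Text NAMD printed in this thread when no real GPU was visible.
EMULATION_MARKER = "Device Emulation (CPU)"


def expected_device_nodes(gpu_count, dev_dir="/dev"):
    """Return the device node paths assumed for gpu_count cards."""
    nodes = [os.path.join(dev_dir, "nvidia%d" % i) for i in range(gpu_count)]
    nodes.append(os.path.join(dev_dir, "nvidiactl"))
    return nodes


def missing_device_nodes(gpu_count, dev_dir="/dev"):
    """Return the expected nodes that do not exist under dev_dir."""
    return [path for path in expected_device_nodes(gpu_count, dev_dir)
            if not os.path.exists(path)]


def log_shows_cpu_emulation(log_text):
    """True if the NAMD log text reports the CPU emulation fallback."""
    return EMULATION_MARKER in log_text


if __name__ == "__main__":
    # Assumption: two GTX 470 cards, as listed by "nvidia-smi -L" above.
    missing = missing_device_nodes(2)
    if missing:
        print("missing device nodes: " + ", ".join(missing))
    else:
        print("all expected device nodes are present")
```

A report that /dev/nvidia1 is missing would correspond to the "Could not
open device /dev/nvidia1" error quoted above; in this thread, running
"nvidia-smi -L" as root was what made the cards visible again.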