Re: Tesla c1060 driver installation

2011-08-21 Thread Phil Perry

On 21/08/11 04:48, Jon Peatfield wrote:


btw there are plenty of rpms of the nvidia drivers using dkms for the
auto-kernel-module rebuilding (and probably others using kabi tracking).
We use locally maintained rpms based on the DAG srpms but with some
local tweaks (which might make them not ideal for others) and updated to
a version of the nvidia binary blobs that we just download from nvidia
whenever we feel the need for an update...

Until recently we were using nvidia version 190.42, but are in the
middle of updating to 280.13 at the moment - so far it seems to be fine
and we plan to roll it out to the rest of our sl5 boxes next Wednesday...

That said we do this mainly for X support - that these drivers also
support CUDA is mostly (for us) a bonus though we do have one box with a
C1060 card using it...



Dag's dkms drivers have been unmaintained for some time and are 
deprecated in favour of the nvidia kmod packages available in elrepo. 
These are currently well maintained, and version 280.13 has been 
available since its upstream release.


http://elrepo.org/
http://elrepo.org/tiki/kmod-nvidia
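
For anyone following along, the elrepo route on SL5 looks roughly like
this (the elrepo-release package version below is illustrative; check
elrepo.org for the current one):

```shell
# Import the elrepo signing key and enable the repository (EL5).
# NOTE: the release-package version in this URL is illustrative, not exact.
rpm --import http://elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://elrepo.org/elrepo-release-5-3.el5.elrepo.noarch.rpm

# Install the packaged NVIDIA kernel module plus matching userspace driver;
# kernel updates are then handled for you by the kmod packaging.
yum install kmod-nvidia
```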


Re: Tesla c1060 driver installation

2011-08-20 Thread Nico Kadel-Garcia
On Sat, Aug 20, 2011 at 12:08 AM, Predrag Punosevac ppunose...@devio.us wrote:
 Nico Kadel-Garcia nka...@gmail.com wrote:

 On Fri, Aug 19, 2011 at 10:25 PM, Predrag Punosevac ppunose...@devio.us 
 wrote:
  Dear All,
 
  I apologize to all of you who find this question trivial. I am
  completely new to Linux and to Scientific Linux in particular, albeit a
  Unix (OpenBSD and Solaris) user of over 20 years.
 
  I have been entrusted with the installation and configuration of NVidia
  Tesla c1060 on our university test rig running i386_64 Scientific Linux
  5.5.

 Wonderful for you! First, may I suggest that you figure out whether
 you mean i386 or x86_64 Scientific Linux? And second, if feasible,

 x86_64 (amd64) of course, because I have a lot of RAM which cannot be
 fully accessed even with a PAE-enabled kernel on i386. In my baby tests SL
 5.5 i386 was limited to 12GB of RAM.

Cool.


 can I encourage you to update to version 5.6? There are a number of
 very useful updates and integration improvements in that release.

 I could install even 6.1. The only reason I went with 5.5 was that
 NVidia claimed that was the officially supported version. I am also a bit
 concerned about other applications and their availability for SL 6.1. This
 thing must run MATLAB, Maple, Mathematica, SciPy, Numpy and be
 accessible not only via ssh but also via NoMachine NX. In particular NX
 is closed source for version 4.0 and above so I am not sure if the free
 version of server will even install let alone run on SL 6.1.

Wow, you do have a suite of tools that might add up to some support
issues. Since our favorite upstream vendor's version 6 has been out
since October of 2010, I suspect that all of those packages are now
compatible with SL 6.0 or SL 6.1. I can attest to NoMachine NX version 3
being compatible: SL 6 has the same bugs as SL 5, because the OpenSSH
that NX bundles is actually compiled on RHEL 3 and the xauth command is
not where the customized SSH server expects it by default. Just remember
to set XauthLocation in the relevant sshd_config file.
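
To see where xauth actually lives (the path to the NX-bundled sshd's
sshd_config is install-specific, so the /usr/bin/xauth fallback below is
only the typical EL location):

```shell
# Find the real xauth path to use for the XauthLocation directive in the
# NX server's sshd_config; fall back to the typical EL location.
command -v xauth || echo "/usr/bin/xauth"
```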

I wouldn't sweat the binary NoMachine implementation. While I intensely
dislike closed source code, the freeware rebuilds of NX-based tools,
such as neatx and freenx, are all abandonware, and NoMachine's
implementation is noticeably superior, especially for the Windows
clients. And hey, with PuTTY 0.61 out and supporting genuine GSSAPI,
I'm hoping that it can support genuine single sign-on.

  After a bit of poking around I managed to kill the X server and install
  gcc as directed by the NVidia driver installation script. However, due to
  the lack of pre-compiled kernel interfaces on the NVidia ftp server I am
  forced by the installer to compile a kernel interface. This is where my
  troubles

If you have to do this again, you should be able to run su or sudo
and run the command telinit 3. That should switch you to runlevel
3, which doesn't have that X server running.
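
That is, something like this (both commands need root):

```shell
# Drop to runlevel 3 so the X server is not running while you install:
telinit 3

# Confirm the switch; the second field of the output should be 3:
runlevel
```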

 The NVidia installer scripts can !@#$!@@@#$ my !@!@$#!$. I've
 personally had to rewrite them far too many times, and my updated
 versions have been ignored. They do not play well with updates to the
 OpenGL libraries, which they replace without informing the RPM system
 of the replacement, they do not uninstall gracefully unless they've
 been heavily edited since I last looked, and RPM has no way of knowing
 about them to deal with kernel updates.

 I have heard of the update issues. Obviously, I am not happy running
 NVidia binary blobs, period, but I have no choice.

By the way, if this hasn't changed: if you ever have to update the
manually installed NVidia drivers, first *uninstall* the old ones,
then install the updates.
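
In other words, roughly (run as root, outside X; the .run filenames
below are illustrative):

```shell
# 1. Remove the previously installed driver FIRST:
nvidia-installer --uninstall    # or: sh NVIDIA-Linux-x86_64-OLD.run --uninstall

# 2. Only then install the new version:
sh NVIDIA-Linux-x86_64-280.13.run
```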

 There are good RPMs, and notes, on the process, at
 http://rpmfusion.org/Howto/nVidia. Scientific Linux plays as nicely as
 it feasibly can with such third-party repositories.


 Thank you so much for that info!

  begin. I have no source code for the kernel. I used yum to install
  kernel-devel.rpm and all other rpms (since I didn't find kernel-src.rpm)
  which contain kernel in the name. Nevertheless, the script still complains
  about the lack of the kernel source code. Could you please tell me where
  I can get the kernel source and where it is supposed to be placed on Linux?

 Have you updated the kernel and rebooted since the last kernel update?
 One thing that the NVidia installers have traditionally been horrid
 about is detecting what your current kernel is, versus what kernel
 will be at boot time. I've traditionally dealt with this by having an
 init script run at boot time to re-install the NVidia drivers, just in
 case, but the modern kmod based tools are supposed to do this for
 you.
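
Roughly, that boot-time hack looks like this (the module path is the
installer's default; the .run path and filename are illustrative):

```shell
#!/bin/sh
# Boot-time check: re-run the NVIDIA installer if no nvidia module exists
# for the kernel we actually booted into.
KVER="$(uname -r)"
MODULE="/lib/modules/${KVER}/kernel/drivers/video/nvidia.ko"
if [ ! -e "$MODULE" ]; then
    echo "no nvidia module for ${KVER}; re-running installer"
    # sh /root/NVIDIA-Linux-x86_64-280.13.run -s
fi
```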

 No, I have not updated anything. I ran the NVidia installation script on
 the release version of SL 5.5 in the hope of getting a working installation
 while learning about SL, CUDA, and having things like MATLAB utilize Tesla.

It sounds like you'd previously done a kernel update, but not rebooted
with the new kernel. That's probably why you had a 

Re: Tesla c1060 driver installation

2011-08-20 Thread Nico Kadel-Garcia
On Sat, Aug 20, 2011 at 6:43 PM, Akemi Yagi amy...@gmail.com wrote:
 On Sat, Aug 20, 2011 at 6:59 AM, Nico Kadel-Garcia nka...@gmail.com wrote:
 On Sat, Aug 20, 2011 at 12:08 AM, Predrag Punosevac ppunose...@devio.us 
 wrote:

 I am also a bit
 concerned about other applications and their availability for SL 6.1. This
 thing must run MATLAB, Maple, Mathematica, SciPy, Numpy and be
 accessible not only via ssh but also via NoMachine NX. In particular NX
 is closed source for version 4.0 and above so I am not sure if the free
 version of server will even install let alone run on SL 6.1.

 Wow, you do have a suite of tools that might add up to some support
 issues. Since our favorite upstream vendor's version 6 has been out
 since October of 2010, I suspect that all of those packages are now
 compatible with SL 6.0 or SL 6.1.

 I wouldn't be surprised if some of the applications mentioned are not
 compatible with EL 6.  I have an EL-5 box running VMware Workstation 7
 but cannot upgrade it to EL 6 because this VMware product does not
 support RHEL 6.0 as a host, nor RHEL 6.1 as host or guest.
 This is rather surprising; nine months after the release of RHEL 6.0,
 it is still not supported. VMware WS is not free, and one would think
 a company like VMware should do a better job for paying customers.

 I wouldn't sweat the binary NoMachine implementation. While I intensely
 dislike closed source code, the freeware rebuilds of NX-based tools,
 such as neatx and freenx, are all abandonware, and NoMachine's
 implementation is noticeably superior, especially for the Windows
 clients. And hey, with PuTTY 0.61 out and supporting genuine GSSAPI,
 I'm hoping that it can support genuine single sign-on.

 nx/freenx is indeed nice. Unfortunately, the version for EL6 is still
 under testing. I have been running it just fine on EL6.0 as well as on
 6.1. It just has to be finalized and published (from the CentOS extras
 repository). Anyone wishing to give it a try can download the testing
 version from:

 http://centos.toracat.org/misc/nx-freenx/6/

 The current version is:

 freenx-0.7.3-7.el6.ay
 nx-3.4.0-7.el6.ay

And the nx code is about to leave GPL licensing (according to the
company that owns it, www.nomachine.com), with the release of version
4. And FreeNX hasn't had a software update in over three years. It's
abandonware, like all the other freeware NX wrappers.

And by the way, I do believe I personally *wrote* the last updates
from CentOS for those tools: I certainly submitted my updates for RHEL
5.6 and RHEL 6.0 compatibility, and I haven't noticed anyone tackling
the project of porting the features of the commercial NX 4.x alpha
releases to any other new GPL releases. I do wish that NoMachine would
publish them under GPL, and I have written to them about it, in
combination with buying some licenses.



Re: Tesla c1060 driver installation

2011-08-20 Thread Jon Peatfield

On Fri, 19 Aug 2011, Predrag Punosevac wrote:


Dear All,

I apologize to all of you who find this question trivial. I am
completely new to Linux and to Scientific Linux in particular, albeit a
Unix (OpenBSD and Solaris) user of over 20 years.

I have been entrusted with the installation and configuration of NVidia
Tesla c1060 on our university test rig running i386_64 Scientific Linux
5.5.

After a bit of poking around I managed to kill the X server and install
gcc as directed by the NVidia driver installation script. However, due to
the lack of pre-compiled kernel interfaces on the NVidia ftp server I am
forced by the installer to compile a kernel interface. This is where my
troubles begin. I have no source code for the kernel. I used yum to install
kernel-devel.rpm and all other rpms (since I didn't find kernel-src.rpm)
which contain kernel in the name. Nevertheless, the script still complains
about the lack of the kernel source code. Could you please tell me where
I can get the kernel source and where it is supposed to be placed on Linux?


You don't actually need the full kernel source to 'build' the nvidia 
kernel interfaces; the kernel-devel package alone provides enough of the 
headers etc. to do it.
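
In other words, the headers just have to match the *running* kernel:

```shell
# Build the exact kernel-devel package name for the running kernel,
# then install it (install line commented out; it needs root).
KVER="$(uname -r)"
PKG="kernel-devel-${KVER}"
echo "$PKG"
# yum install -y "$PKG"
```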


btw there are plenty of rpms of the nvidia drivers using dkms for the 
auto-kernel-module rebuilding (and probably others using kabi tracking). 
We use locally maintained rpms based on the DAG srpms but with some local 
tweaks (which might make them not ideal for others) and updated to a 
version of the nvidia binary blobs that we just download from nvidia 
whenever we feel the need for an update...


Until recently we were using nvidia version 190.42, but are in the middle 
of updating to 280.13 at the moment - so far it seems to be fine and we 
plan to roll it out to the rest of our sl5 boxes next Wednesday...


That said we do this mainly for X support - that these drivers also 
support CUDA is mostly (for us) a bonus though we do have one box with a 
C1060 card using it...


--
/\
| Computers are different from telephones.  Computers do not ring. |
|   -- A. Tanenbaum, Computer Networks, p. 32  |
-|
| Jon Peatfield, _Computer_ Officer, DAMTP,  University of Cambridge |
| Mail:  jp...@damtp.cam.ac.uk Web:  http://www.damtp.cam.ac.uk/ |
\/


Tesla c1060 driver installation

2011-08-19 Thread Predrag Punosevac
Dear All,

I apologize to all of you who find this question trivial. I am
completely new to Linux and to Scientific Linux in particular, albeit a
Unix (OpenBSD and Solaris) user of over 20 years.

I have been entrusted with the installation and configuration of NVidia
Tesla c1060 on our university test rig running i386_64 Scientific Linux
5.5. 

After a bit of poking around I managed to kill the X server and install
gcc as directed by the NVidia driver installation script. However, due to
the lack of pre-compiled kernel interfaces on the NVidia ftp server I am
forced by the installer to compile a kernel interface. This is where my
troubles begin. I have no source code for the kernel. I used yum to install
kernel-devel.rpm and all other rpms (since I didn't find kernel-src.rpm)
which contain kernel in the name. Nevertheless, the script still complains
about the lack of the kernel source code. Could you please tell me where
I can get the kernel source and where it is supposed to be placed on Linux?

I would welcome any other tip or howto or pointer to documentation since
I really want to do science instead of playing with system
administration.

Thank you,
Predrag Punosevac

P.S. Is there a TeXLive rpm for Scientific Linux? I saw teTeX, which is
probably enough for this machine, but if TeXLive is available, why not.


Re: Tesla c1060 driver installation

2011-08-19 Thread Doug Johnson
Greetings,

You can install the kernel development headers with yum. On SL5, the
packages are labeled as:

[hendrix]$ rpm -q -a | grep kernel | grep devel
kernel-devel-2.6.18-238.5.1.el5
kernel-devel-2.6.18-238.19.1.el5
kernel-devel-2.6.18-238.9.1.el5
kernel-devel-2.6.18-238.12.1.el5

I imagine it is similar on SL6, so you would do something like:

yum install kernel-devel-`uname -r`

Good luck,
doug

 
 Dear All,
 
 I apologize to all of you who find this question trivial. I am
 completely new to Linux and to Scientific Linux in particular, albeit a
 Unix (OpenBSD and Solaris) user of over 20 years.
 
 I have been entrusted with the installation and configuration of NVidia
 Tesla c1060 on our university test rig running i386_64 Scientific Linux
 5.5. 
 
 After a bit of poking around I managed to kill the X server and install
 gcc as directed by the NVidia driver installation script. However, due to
 the lack of pre-compiled kernel interfaces on the NVidia ftp server I am
 forced by the installer to compile a kernel interface. This is where my
 troubles begin. I have no source code for the kernel. I used yum to install
 kernel-devel.rpm and all other rpms (since I didn't find kernel-src.rpm)
 which contain kernel in the name. Nevertheless, the script still complains
 about the lack of the kernel source code. Could you please tell me where
 I can get the kernel source and where it is supposed to be placed on Linux?
 
 I would welcome any other tip or howto or pointer to documentation since
 I really want to do science instead of playing with system
 administration.
 
 Thank you,
 Predrag Punosevac
 
 P.S. Is there a TeXLive rpm for Scientific Linux? I saw teTeX, which is
 probably enough for this machine, but if TeXLive is available, why not.
 

 
   Doug Johnson    email: drj...@pizero.colorado.edu
   B390, Duane Physics (303)-492-4506 Office 
   Boulder, CO 80309   (303)-492-5119 FAX
   http://www.aaccchildren.org   
 You cannot see. You think I cannot see?
 Of all things, to live in darkness must be worst.
 Fear is the only darkness.



Re: Tesla c1060 driver installation

2011-08-19 Thread Predrag Punosevac
Nico Kadel-Garcia nka...@gmail.com wrote:

 On Fri, Aug 19, 2011 at 10:25 PM, Predrag Punosevac ppunose...@devio.us 
 wrote:
  Dear All,
 
  I apologize to all of you who find this question trivial. I am
  completely new to Linux and to Scientific Linux in particular, albeit a
  Unix (OpenBSD and Solaris) user of over 20 years.
 
  I have been entrusted with the installation and configuration of NVidia
  Tesla c1060 on our university test rig running i386_64 Scientific Linux
  5.5.

 Wonderful for you! First, may I suggest that you figure out whether
 you mean i386 or x86_64 Scientific Linux? And second, if feasible,

x86_64 (amd64) of course, because I have a lot of RAM which cannot be
fully accessed even with a PAE-enabled kernel on i386. In my baby tests SL
5.5 i386 was limited to 12GB of RAM.


 can I encourage you to update to version 5.6? There are a number of
 very useful updates and integration improvements in that release.

I could install even 6.1. The only reason I went with 5.5 was that
NVidia claimed that was the officially supported version. I am also a bit
concerned about other applications and their availability for SL 6.1. This
thing must run MATLAB, Maple, Mathematica, SciPy, Numpy and be
accessible not only via ssh but also via NoMachine NX. In particular NX
is closed source for version 4.0 and above so I am not sure if the free
version of server will even install let alone run on SL 6.1.



  After a bit of poking around I managed to kill the X server and install
  gcc as directed by the NVidia driver installation script. However, due to
  the lack of pre-compiled kernel interfaces on the NVidia ftp server I am
  forced by the installer to compile a kernel interface. This is where my
  troubles

 The NVidia installer scripts can !@#$!@@@#$ my !@!@$#!$. I've
 personally had to rewrite them far too many times, and my updated
 versions have been ignored. They do not play well with updates to the
 OpenGL libraries, which they replace without informing the RPM system
 of the replacement, they do not uninstall gracefully unless they've
 been heavily edited since I last looked, and RPM has no way of knowing
 about them to deal with kernel updates.

I have heard of the update issues. Obviously, I am not happy running
NVidia binary blobs, period, but I have no choice.



 There are good RPMs, and notes, on the process, at
 http://rpmfusion.org/Howto/nVidia. Scientific Linux plays as nicely as
 it feasibly can with such third-party repositories.


Thank you so much for that info!

  begin. I have no source code for the kernel. I used yum to install
  kernel-devel.rpm and all other rpms (since I didn't find kernel-src.rpm)
  which contain kernel in the name. Nevertheless, the script still complains
  about the lack of the kernel source code. Could you please tell me where
  I can get the kernel source and where it is supposed to be placed on Linux?

 Have you updated the kernel and rebooted since the last kernel update?
 One thing that the NVidia installers have traditionally been horrid
 about is detecting what your current kernel is, versus what kernel
 will be at boot time. I've traditionally dealt with this by having an
 init script run at boot time to re-install the NVidia drivers, just in
 case, but the modern kmod based tools are supposed to do this for
 you.

No, I have not updated anything. I ran the NVidia installation script on
the release version of SL 5.5 in the hope of getting a working installation
while learning about SL, CUDA, and having things like MATLAB utilize Tesla.

Thank you so much for your frank and helpful post.

Cheers,
Predrag Punosevac


  I would welcome any other tip or howto or pointer to documentation since
  I really want to do science instead of playing with system
  administration.
 
  Thank you,
  Predrag Punosevac
 
  P.S. Is there a TeXLive rpm for Scientific Linux? I saw teTeX, which is
  probably enough for this machine, but if TeXLive is available, why not.
 

 http://rpm.pbone.net is your friend for this. I see it apparently
 built into Scientific Linux 6,