Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-16 Thread tuxic
On 08/16 09:26, Nikos Chantziaras wrote:
> On 16/08/18 09:24, Nikos Chantziaras wrote:
> > On 15/08/18 20:24, tu...@posteo.de wrote:
> > > Secure Connection Failed
> > > 
> > > An error occurred during a connection to nvidia.com. Peer attempted
> > > old style (potentially vulnerable) handshake. Error code:
> > > SSL_ERROR_UNSAFE_NEGOTIATION
> > 
> > Click "Advanced" and then "add exception". If you uncheck the
> > "permanent" checkbox, the exception will not be saved and will only be
> > valid for this session.
> 
> Or use this URL instead:
> 
> https://www.nvidia.com/drivers
> 
> 

...my comment about the not-so-well-implemented NVIDIA homepage
was only a slightly ironic/cynical additional "sigh" in all that
trouble...





Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread tuxic
On 08/15 08:13, Nikos Chantziaras wrote:
> On 15/08/18 18:45, tu...@posteo.de wrote:
> > 
> > I put nvidia-uvm explicitly into /etc/conf.d/modules - which was never
> > necessary before - and it shows the same problem: no CUDA
> > devices.
> > 
> > I think I will dream tonight of no CUDA devices... ;(
> 
> Or you might want to use the LTS (Long Term Support) driver series for now,
> which is 390.x (390.77 being the latest of that series.)
> 
> You can see what the latest LTS series is by going here:
> 
>   https://nvidia.com/drivers
> 
> Select your GPU and "Linux 64-bit" and click search. This will tell you what
> the currently recommended "stable" driver is.
> 
> 


And the show must go on:


Secure Connection Failed

An error occurred during a connection to nvidia.com. Peer attempted old style 
(potentially vulnerable) handshake. Error code: SSL_ERROR_UNSAFE_NEGOTIATION

The page you are trying to view cannot be shown because the authenticity of 
the received data could not be verified.
Please contact the website owners to inform them of this problem.

Learn more…

Report errors like this to help Mozilla identify and block malicious sites


Sigh...



Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread Corentin “Nado” Pazdera
August 15, 2018 5:45 PM, tu...@posteo.de wrote:

> I put nvidia-uvm explicitly into /etc/conf.d/modules - which was never
> necessary before - and it shows the same problem: no CUDA
> devices.
> 
> I think I will dream tonight of no CUDA devices... ;(
> 
> On 08/15 05:11, tu...@posteo.de wrote:
> 
>> On 08/15 02:32, Corentin “Nado” Pazdera wrote:
>>> August 15, 2018 2:59 PM, tu...@posteo.de wrote:
>>> 
>>>> Yes, I did reboot the system. In my initial mail I mentioned a tool
>>>> called CUDA-Z and Blender, both of which report a missing CUDA device.
>>> 
>>> Ok, so you do not have a specific error which might have been thrown by the
>>> module?
>>> Other ideas, check dev-util/nvidia-cuda-toolkit version and double check
>>> nvidia/nvidia_uvm with modinfo to ensure they are installed and loaded
>>> correctly with the right version?
>>> Could you also run /opt/cuda/extras/demo_suite/deviceQuery (from
>>> nvidia-cuda-toolkit) and show its output?
>>> 
>>> My installation works, so at least we know their version is not completely
>>> broken...
>>> Driver version: 396.51
>>> Cuda version: 9.2.88
>>> 
>>> --
>>> Corentin “Nado” Pazdera
>> 
>> I compiled the new version of the driver again and rebooted the
>> system.
>> 
>> # dmesg | grep -i nvidia:
>> 
>> [ 11.375631] nvidia_drm: module license 'MIT' taints kernel.
>> [ 12.313260] nvidia-nvlink: Nvlink Core is being initialized, major device 
>> number 246
>> [ 12.313586] nvidia :07:00.0: vgaarb: changed VGA decodes:
>> olddecodes=io+mem,decodes=none:owns=io+mem
>> [ 12.313691] nvidia :02:00.0: enabling device ( -> 0003)
>> [ 12.313737] nvidia :02:00.0: vgaarb: changed VGA decodes:
>> olddecodes=io+mem,decodes=none:owns=none
>> [ 12.313826] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 396.51 Tue Jul 
>> 31 10:43:06 PDT 2018
>> (using threaded interrupts)
>> [ 12.491106] input: HDA NVidia HDMI as
>> /devices/pci:00/:00:0b.0/:02:00.1/sound/card2/input9
>> [ 12.492291] input: HDA NVidia HDMI as
>> /devices/pci:00/:00:0b.0/:02:00.1/sound/card2/input10
>> [ 12.493772] input: HDA NVidia HDMI as
>> /devices/pci:00/:00:02.0/:07:00.1/sound/card1/input11
>> [ 12.494605] input: HDA NVidia HDMI as
>> /devices/pci:00/:00:02.0/:07:00.1/sound/card1/input12
>> [ 13.963644] caller _nv001112rm+0xe3/0x1d0 [nvidia] mapping multiple BARs
>> [ 34.236553] caller _nv001112rm+0xe3/0x1d0 [nvidia] mapping multiple BARs
>> [ 34.516495] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for 
>> UNIX platforms 396.51
>> Tue Jul 31 14:52:09 PDT 2018
>> 
>> # modprobe -a nvidia-uvm
>> 
>> # dmesg | grep uvm
>> 
>> [ 209.441956] nvidia-uvm: Loaded the UVM driver in 8 mode, major device 
>> number 245
>> 
>> # /opt/cuda/extras/demo_suite/deviceQuery
>> /opt/cuda/extras/demo_suite/deviceQuery Starting...
>> 
>> CUDA Device Query (Runtime API) version (CUDART static linking)
>> 
>> cudaGetDeviceCount returned 30
>> -> unknown error
>> Result = FAIL
>> [1] 5086 exit 1 /opt/cuda/extras/demo_suite/deviceQuery
>> 
>> CUDA-Z shows also "no CUDA device"
>> 
>> # modinfo nvidia-uvm
>> filename: /lib/modules/4.18.0-RT/video/nvidia-uvm.ko
>> supported: external
>> license: MIT
>> depends: nvidia
>> name: nvidia_uvm
>> vermagic: 4.18.0-RT SMP preempt mod_unload
>> parm: uvm_perf_prefetch_enable:uint
>> parm: uvm_perf_prefetch_threshold:uint
>> parm: uvm_perf_prefetch_min_faults:uint
>> parm: uvm_perf_thrashing_enable:uint
>> parm: uvm_perf_thrashing_threshold:uint
>> parm: uvm_perf_thrashing_pin_threshold:uint
>> parm: uvm_perf_thrashing_lapse_usec:uint
>> parm: uvm_perf_thrashing_nap_usec:uint
>> parm: uvm_perf_thrashing_epoch_msec:uint
>> parm: uvm_perf_thrashing_max_resets:uint
>> parm: uvm_perf_thrashing_pin_msec:uint
>> parm: uvm_perf_map_remote_on_native_atomics_fault:uint
>> parm: uvm_hmm:Enable (1) or disable (0) HMM mode. Default: 0. Ignored if 
>> CONFIG_HMM is not set, or
>> if NEXT settings conflict with HMM. (int)
>> parm: uvm_global_oversubscription:Enable (1) or disable (0) global 
>> oversubscription support. (int)
>> parm: uvm_leak_checker:Enable uvm memory leak checking. 0 = disabled, 1 = 
>> count total bytes
>> allocated and freed, 2 = per-allocation origin tracking. (int)
>> parm: uvm_force_prefetch_fault_support:uint
>> parm: uvm_debug_enable_push_desc:Enable push description tracking (int)
>> parm: uvm_page_table_location:Set the location for UVM-allocated page 
>> tables. Choices are: vid,
>> sys. (charp)
>> parm: uvm_perf_access_counter_mimc_migration_enable:Whether MIMC access 
>> counters will trigger
>> migrations.Valid values: <= -1 (default policy), 0 (off), >= 1 (on) (int)
>> parm: uvm_perf_access_counter_momc_migration_enable:Whether MOMC access 
>> counters will trigger
>> migrations.Valid values: <= -1 (default policy), 0 (off), >= 1 (on) (int)
>> parm: uvm_perf_access_counter_batch_count:uint
>> parm: uvm_perf_access_counter_granularity:Size of the physical memory region 
>> tracked by each
>> counter. Valid values 

Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread tuxic


I put nvidia-uvm explicitly into /etc/conf.d/modules - which was never
necessary before - and it shows the same problem: no CUDA
devices.

I think I will dream tonight of no CUDA devices... ;(


On 08/15 05:11, tu...@posteo.de wrote:
> On 08/15 02:32, Corentin “Nado” Pazdera wrote:
> > August 15, 2018 2:59 PM, tu...@posteo.de wrote:
> > 
> > > Yes, I did reboot the system. In my initial mail I mentioned a tool
> > > called CUDA-Z and Blender, both of which report a missing CUDA device.
> > 
> > Ok, so you do not have a specific error which might have been thrown by the 
> > module?
> > Other ideas, check dev-util/nvidia-cuda-toolkit version and double check 
> > nvidia/nvidia_uvm with modinfo to ensure they are installed and loaded 
> > correctly with the right version?
> > Could you also run /opt/cuda/extras/demo_suite/deviceQuery (from 
> > nvidia-cuda-toolkit) and show its output?
> > 
> > My installation works, so at least we know their version is not completely 
> > broken...
> > Driver version: 396.51
> > Cuda version: 9.2.88
> > 
> > --
> > Corentin “Nado” Pazdera
> > 
> 
> I compiled the new version of the driver again and rebooted the
> system.
> 
> # dmesg | grep -i nvidia:
> 
> [   11.375631] nvidia_drm: module license 'MIT' taints kernel.
> [   12.313260] nvidia-nvlink: Nvlink Core is being initialized, major device 
> number 246
> [   12.313586] nvidia :07:00.0: vgaarb: changed VGA decodes: 
> olddecodes=io+mem,decodes=none:owns=io+mem
> [   12.313691] nvidia :02:00.0: enabling device ( -> 0003)
> [   12.313737] nvidia :02:00.0: vgaarb: changed VGA decodes: 
> olddecodes=io+mem,decodes=none:owns=none
> [   12.313826] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  396.51  Tue 
> Jul 31 10:43:06 PDT 2018 (using threaded interrupts)
> [   12.491106] input: HDA NVidia HDMI as 
> /devices/pci:00/:00:0b.0/:02:00.1/sound/card2/input9
> [   12.492291] input: HDA NVidia HDMI as 
> /devices/pci:00/:00:0b.0/:02:00.1/sound/card2/input10
> [   12.493772] input: HDA NVidia HDMI as 
> /devices/pci:00/:00:02.0/:07:00.1/sound/card1/input11
> [   12.494605] input: HDA NVidia HDMI as 
> /devices/pci:00/:00:02.0/:07:00.1/sound/card1/input12
> [   13.963644] caller _nv001112rm+0xe3/0x1d0 [nvidia] mapping multiple BARs
> [   34.236553] caller _nv001112rm+0xe3/0x1d0 [nvidia] mapping multiple BARs
> [   34.516495] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for 
> UNIX platforms  396.51  Tue Jul 31 14:52:09 PDT 2018
> 
> # modprobe -a nvidia-uvm
> 
> # dmesg | grep uvm
> 
> [  209.441956] nvidia-uvm: Loaded the UVM driver in 8 mode, major device 
> number 245
> 
> 
> # /opt/cuda/extras/demo_suite/deviceQuery
> /opt/cuda/extras/demo_suite/deviceQuery Starting...  
> 
>  CUDA Device Query (Runtime API) version (CUDART static linking)
> 
> cudaGetDeviceCount returned 30
> -> unknown error
> Result = FAIL
> [1]5086 exit 1 /opt/cuda/extras/demo_suite/deviceQuery
> 
> CUDA-Z shows also "no CUDA device" 
> 
> # modinfo nvidia-uvm
> filename:   /lib/modules/4.18.0-RT/video/nvidia-uvm.ko
> supported:  external
> license:MIT
> depends:nvidia
> name:   nvidia_uvm
> vermagic:   4.18.0-RT SMP preempt mod_unload 
> parm:   uvm_perf_prefetch_enable:uint
> parm:   uvm_perf_prefetch_threshold:uint
> parm:   uvm_perf_prefetch_min_faults:uint
> parm:   uvm_perf_thrashing_enable:uint
> parm:   uvm_perf_thrashing_threshold:uint
> parm:   uvm_perf_thrashing_pin_threshold:uint
> parm:   uvm_perf_thrashing_lapse_usec:uint
> parm:   uvm_perf_thrashing_nap_usec:uint
> parm:   uvm_perf_thrashing_epoch_msec:uint
> parm:   uvm_perf_thrashing_max_resets:uint
> parm:   uvm_perf_thrashing_pin_msec:uint
> parm:   uvm_perf_map_remote_on_native_atomics_fault:uint
> parm:   uvm_hmm:Enable (1) or disable (0) HMM mode. Default: 0. 
> Ignored if CONFIG_HMM is not set, or if NEXT settings conflict with HMM. (int)
> parm:   uvm_global_oversubscription:Enable (1) or disable (0) global 
> oversubscription support. (int)
> parm:   uvm_leak_checker:Enable uvm memory leak checking. 0 = 
> disabled, 1 = count total bytes allocated and freed, 2 = per-allocation 
> origin tracking. (int)
> parm:   uvm_force_prefetch_fault_support:uint
> parm:   uvm_debug_enable_push_desc:Enable push description tracking 
> (int)
> parm:   uvm_page_table_location:Set the location for UVM-allocated 
> page tables. Choices are: vid, sys. (charp)
> parm:   uvm_perf_access_counter_mimc_migration_enable:Whether MIMC 
> access counters will trigger migrations.Valid values: <= -1 (default policy), 
> 0 (off), >= 1 (on) (int)
> parm:   uvm_perf_access_counter_momc_migration_enable:Whether MOMC 
> access counters will trigger migrations.Valid values: <= -1 (default policy), 
> 0 (off), >= 1 (on) (int)
> 

Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread tuxic
On 08/15 02:32, Corentin “Nado” Pazdera wrote:
> August 15, 2018 2:59 PM, tu...@posteo.de wrote:
> 
> > Yes, I did reboot the system. In my initial mail I mentioned a tool
> > called CUDA-Z and Blender, both of which report a missing CUDA device.
> 
> Ok, so you do not have a specific error which might have been thrown by the 
> module?
> Other ideas, check dev-util/nvidia-cuda-toolkit version and double check 
> nvidia/nvidia_uvm with modinfo to ensure they are installed and loaded 
> correctly with the right version?
> Could you also run /opt/cuda/extras/demo_suite/deviceQuery (from 
> nvidia-cuda-toolkit) and show its output?
> 
> My installation works, so at least we know their version is not completely 
> broken...
> Driver version: 396.51
> Cuda version: 9.2.88
> 
> --
> Corentin “Nado” Pazdera
> 

I compiled the new version of the driver again and rebooted the
system.

# dmesg | grep -i nvidia:

[   11.375631] nvidia_drm: module license 'MIT' taints kernel.
[   12.313260] nvidia-nvlink: Nvlink Core is being initialized, major device 
number 246
[   12.313586] nvidia :07:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=io+mem
[   12.313691] nvidia :02:00.0: enabling device ( -> 0003)
[   12.313737] nvidia :02:00.0: vgaarb: changed VGA decodes: 
olddecodes=io+mem,decodes=none:owns=none
[   12.313826] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  396.51  Tue Jul 
31 10:43:06 PDT 2018 (using threaded interrupts)
[   12.491106] input: HDA NVidia HDMI as 
/devices/pci:00/:00:0b.0/:02:00.1/sound/card2/input9
[   12.492291] input: HDA NVidia HDMI as 
/devices/pci:00/:00:0b.0/:02:00.1/sound/card2/input10
[   12.493772] input: HDA NVidia HDMI as 
/devices/pci:00/:00:02.0/:07:00.1/sound/card1/input11
[   12.494605] input: HDA NVidia HDMI as 
/devices/pci:00/:00:02.0/:07:00.1/sound/card1/input12
[   13.963644] caller _nv001112rm+0xe3/0x1d0 [nvidia] mapping multiple BARs
[   34.236553] caller _nv001112rm+0xe3/0x1d0 [nvidia] mapping multiple BARs
[   34.516495] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for 
UNIX platforms  396.51  Tue Jul 31 14:52:09 PDT 2018

# modprobe -a nvidia-uvm

# dmesg | grep uvm

[  209.441956] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 
245


# /opt/cuda/extras/demo_suite/deviceQuery
/opt/cuda/extras/demo_suite/deviceQuery Starting...  

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 30
-> unknown error
Result = FAIL
[1]5086 exit 1 /opt/cuda/extras/demo_suite/deviceQuery

CUDA-Z shows also "no CUDA device" 

# modinfo nvidia-uvm
filename:   /lib/modules/4.18.0-RT/video/nvidia-uvm.ko
supported:  external
license:MIT
depends:nvidia
name:   nvidia_uvm
vermagic:   4.18.0-RT SMP preempt mod_unload 
parm:   uvm_perf_prefetch_enable:uint
parm:   uvm_perf_prefetch_threshold:uint
parm:   uvm_perf_prefetch_min_faults:uint
parm:   uvm_perf_thrashing_enable:uint
parm:   uvm_perf_thrashing_threshold:uint
parm:   uvm_perf_thrashing_pin_threshold:uint
parm:   uvm_perf_thrashing_lapse_usec:uint
parm:   uvm_perf_thrashing_nap_usec:uint
parm:   uvm_perf_thrashing_epoch_msec:uint
parm:   uvm_perf_thrashing_max_resets:uint
parm:   uvm_perf_thrashing_pin_msec:uint
parm:   uvm_perf_map_remote_on_native_atomics_fault:uint
parm:   uvm_hmm:Enable (1) or disable (0) HMM mode. Default: 0. Ignored 
if CONFIG_HMM is not set, or if NEXT settings conflict with HMM. (int)
parm:   uvm_global_oversubscription:Enable (1) or disable (0) global 
oversubscription support. (int)
parm:   uvm_leak_checker:Enable uvm memory leak checking. 0 = disabled, 
1 = count total bytes allocated and freed, 2 = per-allocation origin tracking. 
(int)
parm:   uvm_force_prefetch_fault_support:uint
parm:   uvm_debug_enable_push_desc:Enable push description tracking 
(int)
parm:   uvm_page_table_location:Set the location for UVM-allocated page 
tables. Choices are: vid, sys. (charp)
parm:   uvm_perf_access_counter_mimc_migration_enable:Whether MIMC 
access counters will trigger migrations.Valid values: <= -1 (default policy), 0 
(off), >= 1 (on) (int)
parm:   uvm_perf_access_counter_momc_migration_enable:Whether MOMC 
access counters will trigger migrations.Valid values: <= -1 (default policy), 0 
(off), >= 1 (on) (int)
parm:   uvm_perf_access_counter_batch_count:uint
parm:   uvm_perf_access_counter_granularity:Size of the physical memory 
region tracked by each counter. Valid values asof Volta: 64k, 2m, 16m, 16g 
(charp)
parm:   uvm_perf_access_counter_threshold:Number of remote accesses on 
a region required to trigger a notification.Valid values: [1, 65535] (uint)
parm:   uvm_perf_reenable_prefetch_faults_lapse_msec:uint
parm:   
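After a deviceQuery failure like the one above, one thing worth ruling out is whether the device nodes the CUDA runtime opens exist and are accessible to the current user. The following is only a minimal sketch. It assumes the conventional node names /dev/nvidiactl, /dev/nvidia0, /dev/nvidia1 and /dev/nvidia-uvm (the thread itself only mentions /dev/nvidiactl), and check_node is a hypothetical helper, not part of any NVIDIA tool.

```shell
#!/bin/sh
# check_node is a hypothetical helper: it reports whether a device node
# exists and whether the current user can open it for reading and writing.
check_node() {
    if [ ! -e "$1" ]; then
        echo "$1: missing"
    elif [ -r "$1" ] && [ -w "$1" ]; then
        echo "$1: ok"
    else
        echo "$1: no access"
    fi
}

# Node names are assumptions based on the usual NVIDIA layout;
# a second card would normally appear as /dev/nvidia1.
for node in /dev/nvidiactl /dev/nvidia0 /dev/nvidia1 /dev/nvidia-uvm; do
    check_node "$node"
done
```

A "missing" result for /dev/nvidia-uvm while the module is loaded would point at device node creation rather than at the module itself, and "no access" would point at the group and mode settings in modprobe.d. Neither outcome is confirmed anywhere in this thread.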

Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread Corentin “Nado” Pazdera
August 15, 2018 2:59 PM, tu...@posteo.de wrote:

> Yes, I did reboot the system. In my initial mail I mentioned a tool
> called CUDA-Z and Blender, both of which report a missing CUDA device.

Ok, so you do not have a specific error which might have been thrown by the 
module?
Other ideas, check dev-util/nvidia-cuda-toolkit version and double check 
nvidia/nvidia_uvm with modinfo to ensure they are installed and loaded 
correctly with the right version?
Could you also run /opt/cuda/extras/demo_suite/deviceQuery (from 
nvidia-cuda-toolkit) and show its output?

My installation works, so at least we know their version is not completely 
broken...
Driver version: 396.51
Cuda version: 9.2.88

--
Corentin “Nado” Pazdera
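The "installed and loaded correctly" part of the suggestion above can be sketched as a loop over /sys/module, where every loaded module has a directory. This is an illustration only: the list of module names is taken from the nvidia-rmmod.conf shown elsewhere in the thread, sysfs_name is a hypothetical helper, and not every module exposes a version attribute (the modinfo output for nvidia-uvm in this thread shows none).

```shell
#!/bin/sh
# sysfs_name is a hypothetical helper: module names may be written with
# hyphens, but /sys/module always uses underscores.
sysfs_name() {
    echo "$1" | tr '-' '_'
}

for mod in nvidia nvidia-uvm nvidia-modeset nvidia-drm; do
    dir="/sys/module/$(sysfs_name "$mod")"
    if [ ! -d "$dir" ]; then
        echo "$mod: not loaded"
    elif [ -r "$dir/version" ]; then
        echo "$mod: loaded, version $(cat "$dir/version")"
    else
        echo "$mod: loaded (no version attribute)"
    fi
done
```

The version printed for the nvidia module can then be compared by eye with the "Driver version" quoted in the message.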



Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread tuxic
On 08/15 12:45, Corentin “Nado” Pazdera wrote:
> August 15, 2018 2:02 PM, tu...@posteo.de wrote:
> > ...sorry, I am not a native speaker... I don't understand.
> > 
> > I do not know how to disable CUDA on both cards.
> > So... since it works perfectly with the old driver, I would think:
> > No, CUDA is enabled (or at least the old driver does this for me).
> > 
> > Then I do an "emerge " and CUDA stops
> > working. I do not change anything else, nor do I know who or what could
> > disable CUDA on both cards ... except for the driver itself.
> > 
> > This is weird.
> > 
> > Again, for logical reasons I think that the culprit is either the
> > driver itself or a missing (and therefore undocumented) configuration
> > step needed for the new drivers.
> > 
> > The cards are not "old" in any sense.
> 
> Ok, what I meant is: how did you check that CUDA "was not working"? And could
> you check it on both cards?
> 
> Also, as realnc said, did you reboot?
> 
> --
> Corentin “Nado” Pazdera
> 

Yes, I did reboot the system. In my initial mail I mentioned a tool
called CUDA-Z and Blender, both of which report a missing CUDA device.





Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread tuxic
On 08/15 03:35, Nikos Chantziaras wrote:
> On 15/08/18 15:02, tu...@posteo.de wrote:
> > Then I do an "emerge " and CUDA stops
> > working. I do not change anything else, nor do I know who or what could
> > disable CUDA on both cards ... except for the driver itself.
> 
> Dumb question, but just to be sure: did you reboot after upgrading the
> driver? The driver never worked for me correctly, unless I reboot. Unloading
> the driver with "modprobe -r" and loading the new one doesn't work
> correctly, only rebooting does.
> 
> 

Yes, I did reboot the system
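The reboot question can also be checked mechanically: a module left resident from before the upgrade would report a different version than the module file now on disk. A minimal sketch, assuming the nvidia module exposes /sys/module/nvidia/version while loaded and that `modinfo -F version` reads the installed file. versions_match is a hypothetical helper.

```shell
#!/bin/sh
# versions_match is a hypothetical helper: compare the version of the
# loaded module ($1) with the version of the module file on disk ($2).
versions_match() {
    if [ -n "$1" ] && [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "mismatch: loaded='$1' on-disk='$2'"
        return 1
    fi
}

# /sys/module/nvidia/version only exists while the module is loaded.
if [ -r /sys/module/nvidia/version ]; then
    loaded=$(cat /sys/module/nvidia/version)
    ondisk=$(modinfo -F version nvidia 2>/dev/null)
    versions_match "$loaded" "$ondisk" || echo "a reboot may be needed"
else
    echo "nvidia module not loaded (or it exposes no version attribute)"
fi
```

In this thread the dmesg output already shows 396.51 being loaded after the reboot, so a match would be expected here. The check is more useful right after an upgrade.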




Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread Corentin “Nado” Pazdera
August 15, 2018 2:02 PM, tu...@posteo.de wrote:
> ...sorry, I am not a native speaker... I don't understand.
> 
> I do not know how to disable CUDA on both cards.
> So... since it works perfectly with the old driver, I would think:
> No, CUDA is enabled (or at least the old driver does this for me).
> 
> Then I do an "emerge " and CUDA stops
> working. I do not change anything else, nor do I know who or what could
> disable CUDA on both cards ... except for the driver itself.
> 
> This is weird.
> 
> Again, for logical reasons I think that the culprit is either the
> driver itself or a missing (and therefore undocumented) configuration
> step needed for the new drivers.
> 
> The cards are not "old" in any sense.

Ok, what I meant is: how did you check that CUDA "was not working"? And could
you check it on both cards?

Also, as realnc said, did you reboot?

--
Corentin “Nado” Pazdera



Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread tuxic
On 08/15 11:39, Corentin “Nado” Pazdera wrote:
> August 15, 2018 1:16 PM, tu...@posteo.de wrote:
> 
> > The wiki-page is old...it speaks of nvidia-driver-174.
> 
> Yeah, for legacy cards... If you check the history, it's also been updated
> quite frequently.
> 
> > modprobe.d/nvidia.conf:
> > 
> > # Nvidia drivers support
> > alias char-major-195 nvidia
> > alias /dev/nvidiactl char-major-195
> > 
> > # To tweak the driver the following options can be used, note that
> > # you should be careful, as it could cause instability!! For more
> > # options see /usr/share/doc/nvidia-drivers-396.24-r1/README
> > #
> > # !!! SECURITY WARNING !!!
> > # DO NOT MODIFY OR REMOVE THE DEVICE FILE RELATED OPTIONS UNLESS YOU KNOW
> > # WHAT YOU ARE DOING.
> > # ONLY ADD TRUSTED USERS TO THE VIDEO GROUP, THESE USERS MAY BE ABLE TO CRASH,
> > # COMPROMISE, OR IRREPARABLY DAMAGE THE MACHINE.
> > options nvidia NVreg_DeviceFileMode=432 NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=27 NVreg_ModifyDeviceFiles=1
> > 
> > modprobe.d/nvidia-rmmod.conf
> > 
> > # Nvidia UVM support
> > remove nvidia modprobe -r --ignore-remove nvidia-drm nvidia-modeset nvidia-uvm nvidia
> > 
> > All of these configurations have worked for years, up to
> > x11-drivers/nvidia-drivers-396.24-r1.
> > After that, no CUDA device was found.
> > Logically, I would tend to think that it is something
> > version-specific and not a global setting that has been valid since
> > nvidia-driver-174.
> 
> Updates that bring breaking changes may require changing a config file. It
> might not be the cause, I agree, but it's possible.
> 
> Is CUDA disabled on both cards? I have a 970Ti; although my MB is different,
> we might try to compare the big differences in our systems.
> 
> --
> Corentin “Nado” Pazdera
> 

...sorry, I am not a native speaker... I don't understand.

I do not know how to disable CUDA on both cards.
So... since it works perfectly with the old driver, I would think:
No, CUDA is enabled (or at least the old driver does this for me).

Then I do an "emerge " and CUDA stops
working. I do not change anything else, nor do I know who or what could
disable CUDA on both cards ... except for the driver itself.

This is weird.

Again, for logical reasons I think that the culprit is either the
driver itself or a missing (and therefore undocumented) configuration
step needed for the new drivers.

The cards are not "old" in any sense. 




Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread Corentin “Nado” Pazdera
August 15, 2018 1:16 PM, tu...@posteo.de wrote:

> The wiki-page is old...it speaks of nvidia-driver-174.

Yeah, for legacy cards... If you check the history, it's also been updated
quite frequently.

> modprobe.d/nvidia.conf:
> 
> # Nvidia drivers support
> alias char-major-195 nvidia
> alias /dev/nvidiactl char-major-195
> 
> # To tweak the driver the following options can be used, note that
> # you should be careful, as it could cause instability!! For more
> # options see /usr/share/doc/nvidia-drivers-396.24-r1/README
> #
> # !!! SECURITY WARNING !!!
> # DO NOT MODIFY OR REMOVE THE DEVICE FILE RELATED OPTIONS UNLESS YOU KNOW
> # WHAT YOU ARE DOING.
> # ONLY ADD TRUSTED USERS TO THE VIDEO GROUP, THESE USERS MAY BE ABLE TO CRASH,
> # COMPROMISE, OR IRREPARABLY DAMAGE THE MACHINE.
> options nvidia NVreg_DeviceFileMode=432 NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=27 NVreg_ModifyDeviceFiles=1
> 
> modprobe.d/nvidia-rmmod.conf
> 
> # Nvidia UVM support
> remove nvidia modprobe -r --ignore-remove nvidia-drm nvidia-modeset nvidia-uvm nvidia
> 
> All of these configurations have worked for years, up to
> x11-drivers/nvidia-drivers-396.24-r1.
> After that, no CUDA device was found.
> Logically, I would tend to think that it is something
> version-specific and not a global setting that has been valid since
> nvidia-driver-174.

Updates that bring breaking changes may require changing a config file. It
might not be the cause, I agree, but it's possible.

Is CUDA disabled on both cards? I have a 970Ti; although my MB is different,
we might try to compare the big differences in our systems.

--
Corentin “Nado” Pazdera



Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread tuxic
On 08/15 10:33, Corentin “Nado” Pazdera wrote:
> August 15, 2018 4:19 AM, tu...@posteo.de wrote:
> 
> > On 08/14 11:16, Nikos Chantziaras wrote:
> > 
> >> On 14/08/18 13:35, tu...@posteo.de wrote:
> >>> Hi,
> >>> 
> >>> after upgrading to nvidia-drivers-396.51 no CUDA devices were found.
> >>> The last version that works for me is nvidia-drivers-396.24-r1.
> >> 
> >> Do you have the "uvm" USE flag set? It might be required for CUDA, but it's
> >> disabled by default (perhaps wrongly, because USE flags should follow
> >> upstream defaults unless there's a reason not to.)
> > 
> > Yes it is:
> > 
> > (this is the version which is currently still working)
> > Installed versions: 396.24-r1(0/396)^md(08:31:04 PM 08/14/2018)(X driver 
> > kms static-libs tools uvm
> > -acpi -compat -gtk3 -multilib -pax_kernel -wayland ABI_MIPS="-n32 -n64 
> > -o32" ABI_PPC="-32 -64"
> > ABI_S390="-32 -64" ABI_X86="64 -32 -x32" KERNEL="linux -FreeBSD")
> > 
> > and set via /etc/portage/package.use
> > 
> > # required by app-admin/conky-1.10.6-r1::gentoo[nvidia,X]
> > # required by @selected
> > # required by @world (argument)
> > >=x11-drivers/nvidia-drivers-378.13 static-libs uvm
> > 
> > What else could be the reason for the problem?
> > How can I fix it?
> 
> Can you also show the content of your modprobe.d file?
> Did you read the whole wiki page? Did you check for MSI interrupts?
> https://wiki.gentoo.org/wiki/NVidia/nvidia-drivers#Driver_fails_to_initialize_when_MSI_interrupts_are_enabled
> 
> Regards,
> --
> Corentin “Nado” Pazdera
> 

The wiki-page is old...it speaks of nvidia-driver-174.


modprobe.d/nvidia.conf:

# Nvidia drivers support
alias char-major-195 nvidia
alias /dev/nvidiactl char-major-195

# To tweak the driver the following options can be used, note that
# you should be careful, as it could cause instability!! For more 
# options see /usr/share/doc/nvidia-drivers-396.24-r1/README 
#
# !!! SECURITY WARNING !!!
# DO NOT MODIFY OR REMOVE THE DEVICE FILE RELATED OPTIONS UNLESS YOU KNOW
# WHAT YOU ARE DOING.
# ONLY ADD TRUSTED USERS TO THE VIDEO GROUP, THESE USERS MAY BE ABLE TO CRASH,
# COMPROMISE, OR IRREPARABLY DAMAGE THE MACHINE.
options nvidia NVreg_DeviceFileMode=432 NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=27 NVreg_ModifyDeviceFiles=1


modprobe.d/nvidia-rmmod.conf

# Nvidia UVM support
remove nvidia modprobe -r --ignore-remove nvidia-drm nvidia-modeset nvidia-uvm nvidia


All of these configurations have worked for years, up to
x11-drivers/nvidia-drivers-396.24-r1.
After that, no CUDA device was found.
Logically, I would tend to think that it is something
version-specific and not a global setting that has been valid since
nvidia-driver-174.

Regards,
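
A side note on the nvidia.conf options shown above: NVreg_DeviceFileMode is given in decimal, so 432 corresponds to octal 660 (read/write for owner and group), and NVreg_DeviceFileGID=27 is the group the config comment calls the video group. A small sketch for confirming both on a given machine follows. mode_octal is a hypothetical helper, and whether GID 27 really is "video" on a particular system is an assumption to verify.

```shell
#!/bin/sh
# mode_octal is a hypothetical helper that prints a decimal file mode in octal.
mode_octal() {
    printf '%o\n' "$1"
}

mode_octal 432    # the NVreg_DeviceFileMode value from nvidia.conf; prints 660

# Show which group has GID 27 (NVreg_DeviceFileGID), then the groups of the
# current user, so that membership can be compared.
if command -v getent >/dev/null 2>&1; then
    getent group 27 || echo "no group with GID 27"
fi
id -nG
```

If the user running Blender or CUDA-Z is not in that group, the device files created with mode 660 would not be accessible to them.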




Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-15 Thread Corentin “Nado” Pazdera
August 15, 2018 4:19 AM, tu...@posteo.de wrote:

> On 08/14 11:16, Nikos Chantziaras wrote:
> 
>> On 14/08/18 13:35, tu...@posteo.de wrote:
>>> Hi,
>>> 
>>> after upgrading to nvidia-drivers-396.51 no CUDA devices were found.
>>> The last version that works for me is nvidia-drivers-396.24-r1.
>> 
>> Do you have the "uvm" USE flag set? It might be required for CUDA, but it's
>> disabled by default (perhaps wrongly, because USE flags should follow
>> upstream defaults unless there's a reason not to.)
> 
> Yes it is:
> 
> (this is the version which is currently still working)
> Installed versions: 396.24-r1(0/396)^md(08:31:04 PM 08/14/2018)(X driver kms 
> static-libs tools uvm
> -acpi -compat -gtk3 -multilib -pax_kernel -wayland ABI_MIPS="-n32 -n64 -o32" 
> ABI_PPC="-32 -64"
> ABI_S390="-32 -64" ABI_X86="64 -32 -x32" KERNEL="linux -FreeBSD")
> 
> and set via /etc/portage/package.use
> 
> # required by app-admin/conky-1.10.6-r1::gentoo[nvidia,X]
> # required by @selected
> # required by @world (argument)
> >=x11-drivers/nvidia-drivers-378.13 static-libs uvm
> 
> What else could be the reason for the problem?
> How can I fix it?

Can you also show the content of your modprobe.d file?
Did you read the whole wiki page? Did you check for MSI interrupts?
https://wiki.gentoo.org/wiki/NVidia/nvidia-drivers#Driver_fails_to_initialize_when_MSI_interrupts_are_enabled

Regards,
--
Corentin “Nado” Pazdera
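
For the MSI question above, the interrupt type the driver is currently using can be read from /proc/interrupts. The following is a rough sketch only. irq_mode is a hypothetical helper, and it assumes the interrupt type column contains the string "MSI" when message-signaled interrupts are in use. The module option usually cited for turning MSI off is NVreg_EnableMSI=0; treat that as an assumption and confirm it against the driver README referenced in nvidia.conf.

```shell
#!/bin/sh
# irq_mode is a hypothetical helper: classify one /proc/interrupts line
# by whether its interrupt type column mentions MSI.
irq_mode() {
    case "$1" in
        *MSI*) echo "MSI" ;;
        *)     echo "legacy" ;;
    esac
}

# Print every interrupt line that belongs to the nvidia driver, prefixed
# with the detected mode. Prints nothing if the driver has no interrupts.
if [ -r /proc/interrupts ]; then
    grep -i nvidia /proc/interrupts | while IFS= read -r line; do
        echo "$(irq_mode "$line"): $line"
    done
fi
```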



Re: [gentoo-user] Re: "No CUDA device found" with nvidia-drivers newer than nvidia-drivers-396.24-r1(

2018-08-14 Thread tuxic
On 08/14 11:16, Nikos Chantziaras wrote:
> On 14/08/18 13:35, tu...@posteo.de wrote:
> > Hi,
> > 
> > after upgrading to nvidia-drivers-396.51 no CUDA devices were found.
> > The last version that works for me is nvidia-drivers-396.24-r1.
> 
> Do you have the "uvm" USE flag set? It might be required for CUDA, but it's
> disabled by default (perhaps wrongly, because USE flags should follow
> upstream defaults unless there's a reason not to.)
> 
> 
Yes it is:

(this is the version which is currently still working)
Installed versions:  396.24-r1(0/396)^md(08:31:04 PM 08/14/2018)(X driver 
kms static-libs tools uvm -acpi -compat -gtk3 -multilib -pax_kernel -wayland 
ABI_MIPS="-n32 -n64 -o32" ABI_PPC="-32 -64" ABI_S390="-32 -64" ABI_X86="64 -32 
-x32" KERNEL="linux -FreeBSD")

and set via /etc/portage/package.use 

# required by app-admin/conky-1.10.6-r1::gentoo[nvidia,X]
# required by @selected
# required by @world (argument)
>=x11-drivers/nvidia-drivers-378.13 static-libs uvm

What else could be the reason for the problem?
How can I fix it?
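
Since the package.use atom above is `>=x11-drivers/nvidia-drivers-378.13`, it should also apply to 396.51, but it can be worth confirming what the installed package was actually built with. A minimal sketch, assuming Portage records the build-time flags in a USE file under /var/db/pkg. has_use_flag is a hypothetical helper.

```shell
#!/bin/sh
# has_use_flag is a hypothetical helper: succeed if flag $1 appears as a
# whole word in the space-separated flag list $2.
has_use_flag() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumption: Portage records the flags a package was built with in
# /var/db/pkg/<category>/<package>-<version>/USE.
for f in /var/db/pkg/x11-drivers/nvidia-drivers-*/USE; do
    [ -r "$f" ] || continue
    if has_use_flag uvm "$(cat "$f")"; then
        echo "$f: built with uvm"
    else
        echo "$f: built WITHOUT uvm"
    fi
done
```

On a machine without the package installed the loop prints nothing.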