Re: [qubes-users] Frequent 3.2 crashes , How to troubleshoot?

2018-01-06 Thread tezeb
On 01/06/2018 09:20 PM, yreb...@riseup.net wrote:
> On 2018-01-06 08:23, awokd wrote:
>> On Sat, January 6, 2018 5:32 pm, yreb...@riseup.net wrote:
>>
>>>
>>> The "OOM" bug, as I read it on the URL, seems to indicate only that "X"
>>> crashes, in my case more often the whole system has rebooted, but perhaps
>>> the OOM could also cause that?
>>>
>>> Plus "grep" seems to find  only 2 entries , and I've had many such
>>> crashes :)
>>
>> Look through that journalctl log manually and try to find more crashes.
>> Might be something else besides OOM causing them too. Also, look through
>> /var/log/xen/console/hypervisor.log for crash messages.
> 
> sadly, it being in dom0 complicates that, as journalctl is so huge ;
> besides |more and |less  any other suggestions  on  examining logs in
> dom0? 
> 
> or particular terms to grep ?  "crash" didn't seem to do much :)
> 

Judging by the end result (a crash) and the fact that my Qubes worked
well until recently, I believe I am hitting the same issue as you. I
don't have any "oom" entries in journalctl, but I have seen a few
crashes recently.


These are the last lines before "-- Reboot --" in journalctl, but they
do not repeat earlier in the log, so I doubt they are the cause:

Jan 06 20:15:03 dom0 block-cleaner-daemon.py[2759]: libxl: error: libxl_device.c:369:libxl__device_disk_set_backend: no suitable backend for disk xvdd
Jan 06 20:15:03 dom0 block-cleaner-daemon.py[2759]: libxl_device_disk_remove failed.
Jan 06 20:15:04 dom0 block-cleaner-daemon.py[2759]: libxl: error: libxl_device.c:369:libxl__device_disk_set_backend: no suitable backend for disk xvdc
Jan 06 20:15:04 dom0 block-cleaner-daemon.py[2759]: libxl_device_disk_remove failed.
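
To compare what was logged just before every reboot, rather than only the most recent one, the lines that precede each "-- Reboot --" marker can be pulled out in one pass. This is only a rough sketch: before_reboot is a made-up helper name, and it assumes the journal text contains the literal "-- Reboot --" marker line mentioned above.

```shell
#!/bin/sh
# Print the last N lines (default 5) that precede every "-- Reboot --"
# marker found on stdin, followed by the marker itself.
# before_reboot is a hypothetical helper, not a Qubes or systemd tool.
before_reboot() {
    n="${1:-5}"
    awk -v n="$n" '
        /^-- Reboot --$/ {
            # Replay at most n buffered lines, oldest first.
            first = (count > n) ? count - n + 1 : 1
            for (i = first; i <= count; i++) print ring[i % n]
            print
            count = 0
            next
        }
        # Keep only the most recent n lines in a small ring buffer.
        { count++; ring[count % n] = $0 }
    '
}

# Possible use in dom0:
#   sudo journalctl --no-pager | before_reboot 10
```

If the same messages show up before several reboots they are worth a closer look; if they differ each time, they are probably unrelated to the crash.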

-- 
You received this message because you are subscribed to the Google Groups 
"qubes-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to qubes-users+unsubscr...@googlegroups.com.
To post to this group, send email to qubes-users@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/qubes-users/9e1178a2-d397-1c76-06b8-37c8787c3d15%40outoftheblue.pl.
For more options, visit https://groups.google.com/d/optout.


Re: [qubes-users] Frequent 3.2 crashes , How to troubleshoot?

2018-01-06 Thread yrebstv
On 2018-01-06 08:23, awokd wrote:
> On Sat, January 6, 2018 5:32 pm, yreb...@riseup.net wrote:
> 
>>
>> The "OOM" bug, as I read it on the URL, seems to indicate only that "X"
>> crashes, in my case more often the whole system has rebooted, but perhaps
>> the OOM could also cause that?
>>
>> Plus "grep" seems to find  only 2 entries , and I've had many such
>> crashes :)
> 
> Look through that journalctl log manually and try to find more crashes.
> Might be something else besides OOM causing them too. Also, look through
> /var/log/xen/console/hypervisor.log for crash messages.

Sadly, being in dom0 complicates that, as the journalctl output is so
huge. Besides piping it to more or less, are there any other suggestions
for examining logs in dom0?

Or particular terms to grep for? "crash" didn't seem to find much :)
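
One way to make the journal more manageable is to limit it to a single boot or priority level before searching it. A minimal sketch follows: the journalctl options in the comments are standard ones, but crash_terms is a made-up helper and its keyword list is only a guess at what might be worth searching for, not a list taken from this thread.

```shell
#!/bin/sh
# Keep only lines on stdin that match a few crash-related keywords,
# ignoring case. The keyword list is an assumption for illustration;
# extend or trim it as needed.
crash_terms() {
    grep -i -E 'oom-killer|out of memory|panic|segfault|call trace'
}

# Possible use in dom0:
#   sudo journalctl --list-boots            # enumerate the recorded boots
#   sudo journalctl -b -1 -e                # previous boot, jump to the end
#   sudo journalctl -b -1 -p err            # previous boot, errors and worse
#   sudo journalctl -b -1 -k | crash_terms  # previous boot, kernel messages
```

Looking at the end of the boot that crashed (-b -1 -e) is often quicker than searching the whole journal for a single word.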



Re: [qubes-users] Frequent 3.2 crashes , How to troubleshoot?

2018-01-06 Thread 'awokd' via qubes-users
On Sat, January 6, 2018 5:32 pm, yreb...@riseup.net wrote:

>
> The "OOM" bug, as I read it on the URL, seems to indicate only that "X"
> crashes, in my case more often the whole system has rebooted, but perhaps
> the OOM could also cause that?
>
> Plus "grep" seems to find  only 2 entries , and I've had many such
> crashes :)

Look through that journalctl log manually and try to find more crashes.
Might be something else besides OOM causing them too. Also, look through
/var/log/xen/console/hypervisor.log for crash messages.



Re: [qubes-users] Frequent 3.2 crashes , How to troubleshoot?

2018-01-06 Thread yrebstv
On 2018-01-06 03:13, awokd wrote:
> On Sat, January 6, 2018 4:06 am, yreb...@riseup.net wrote:
> 
>> Despite not using the Wireless, once in a while it pops up asking me if
>> I want to connect , and I just close the window. I'm really not sure, if
>> there is a better way to "disable" it, or if that explains  the   entry
>> above  ?
> 
> Unless that helped with the crashes, go ahead and keep using wireless. I
> think this is actually what you are hitting:
> https://github.com/QubesOS/qubes-issues/issues/3079. You can see if it's
> the same issue by doing a "sudo journalctl" after a crash and looking for
> those messages about oom-killer.

I browsed through that URL and did this on my system:

[quser4@dom0 Desktop]$ journalctl|grep oom
Dec 27 21:11:28 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|__GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0
Dec 27 21:11:28 dom0 kernel:  [] oom_kill_process+0x219/0x3e0
Dec 27 21:11:28 dom0 kernel: [ pid ]   uid  tgid total_vm  rss nr_ptes nr_pmds swapents oom_score_adj name
Dec 27 21:11:28 dom0 kernel: oom_reaper: reaped process 5560 (Xorg), now anon-rss:0kB, file-rss:0kB, shmem-rss:270816kB
Jan 03 12:25:17 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|__GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0
Jan 03 12:25:17 dom0 kernel:  [] oom_kill_process+0x219/0x3e0
Jan 03 12:25:17 dom0 kernel: [ pid ]   uid  tgid total_vm  rss nr_ptes nr_pmds swapents oom_score_adj name
Jan 03 12:25:17 dom0 kernel: oom_reaper: reaped process 5545 (Xorg), now anon-rss:0kB, file-rss:0kB, shmem-rss:422640kB

The "OOM" bug, as I read it at that URL, seems to involve only "X"
crashing. In my case, more often the whole system has rebooted, but
perhaps the OOM could also cause that?

Plus, "grep" seems to find only 2 occurrences, and I've had many more
crashes than that :)
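
Since each OOM kill produces several matching lines, counting only the "invoked oom-killer" lines gives one count per event, which is easier to compare against the number of crashes remembered. A rough sketch, assuming the default journalctl timestamp layout shown above; oom_per_day is a made-up helper name.

```shell
#!/bin/sh
# Count "invoked oom-killer" lines per calendar day from journal text on
# stdin, in the order the days first appear. Output: "<Mon> <DD> <count>".
# oom_per_day is a hypothetical helper; it assumes each journal line
# starts with "Mon DD HH:MM:SS".
oom_per_day() {
    awk '
        /invoked oom-killer/ {
            day = $1 " " $2
            if (!(day in hits)) order[++days] = day
            hits[day]++
        }
        END {
            for (i = 1; i <= days; i++) print order[i], hits[order[i]]
        }
    '
}

# Possible use in dom0:
#   sudo journalctl --no-pager | oom_per_day
```

If the days listed do not line up with the days the machine rebooted, the reboots probably have a different cause than the OOM kills.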




Re: [qubes-users] Frequent 3.2 crashes , How to troubleshoot?

2018-01-06 Thread 'awokd' via qubes-users
On Sat, January 6, 2018 4:06 am, yreb...@riseup.net wrote:

> Despite not using the Wireless, once in a while it pops up asking me if
> I want to connect , and I just close the window. I'm really not sure, if
> there is a better way to "disable" it, or if that explains  the   entry
> above  ?

Unless that helped with the crashes, go ahead and keep using wireless. I
think this is actually what you are hitting:
https://github.com/QubesOS/qubes-issues/issues/3079. You can see if it's
the same issue by doing a "sudo journalctl" after a crash and looking for
those messages about oom-killer.



Re: [qubes-users] Frequent 3.2 crashes , How to troubleshoot?

2018-01-05 Thread yrebstv
On 2018-01-05 14:31, awokd wrote:
> On Sat, January 6, 2018 12:14 am, yreb...@riseup.net wrote:
>> (XEN) [VT-D]DMAR:[DMA Write] Request device [:04:00.0] fault addr
>> fff0, iommu reg = 82c0009f4000 (XEN) [VT-D]DMAR: reason 05 - PTE
>> Write access is not set
>>
>>
>>
>> ..repeat another 300 times :)
> 
> I'm not seeing any of the memory balancing log messages I was expecting,
> maybe they aren't listed there in 3.2? I'll check on my system later.
> 
> Is it possible to remove or disable device 04:00.0 for a while to see if
> that's causing your issue? I'm guessing it's an Ethernet card. You can
> check with "lspci".
> 
> "sudo journalctl -b" might also give you some clues.


Hi, yes, I checked with lspci. 04:00.0 is the wireless adapter rather
than an Ethernet card:

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V (rev 31)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
04:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8821AE 802.11ac PCIe Wireless Network Adapter

Although I don't use the wireless, once in a while it pops up asking me
if I want to connect, and I just close the window. I'm really not sure
if there is a better way to "disable" it, or if that explains the entry
above?

OK, I'll wait to hear back on which log.



Re: [qubes-users] Frequent 3.2 crashes , How to troubleshoot?

2018-01-05 Thread 'awokd' via qubes-users
On Sat, January 6, 2018 12:14 am, yreb...@riseup.net wrote:
> (XEN) [VT-D]DMAR:[DMA Write] Request device [:04:00.0] fault addr
> fff0, iommu reg = 82c0009f4000 (XEN) [VT-D]DMAR: reason 05 - PTE
> Write access is not set
>
>
>
> ..repeat another 300 times :)

I'm not seeing any of the memory balancing log messages I was expecting,
maybe they aren't listed there in 3.2? I'll check on my system later.

Is it possible to remove or disable device 04:00.0 for a while to see if
that's causing your issue? I'm guessing it's an Ethernet card. You can
check with "lspci".
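
When the same fault repeats hundreds of times, it can help to reduce the "xl dmesg" output to a per-device count before looking the device up. A minimal sketch, assuming the fault lines have the "[VT-D]DMAR:[DMA Write] Request device [...]" shape quoted above; dmar_faults_by_device is a made-up helper name.

```shell
#!/bin/sh
# Count VT-d DMAR fault lines per PCI device from "xl dmesg" text on
# stdin. Output: "<device> <count>", one device per line.
# dmar_faults_by_device is a hypothetical helper name.
dmar_faults_by_device() {
    sed -n 's/.*\[VT-D\]DMAR:\[DMA [A-Za-z]*\] Request device \[\([^]]*\)\].*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Possible use in dom0:
#   sudo xl dmesg | dmar_faults_by_device
#   lspci -s 04:00.0        # show only the device at bus 04, slot 00.0
```

A single device accounting for nearly all of the faults is a reasonable first candidate to disable for a while.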

"sudo journalctl -b" might also give you some clues.





Re: [qubes-users] Frequent 3.2 crashes , How to troubleshoot?

2018-01-05 Thread yrebstv
On 2018-01-05 13:35, awokd wrote:
> On Fri, January 5, 2018 6:54 pm, yreb...@riseup.net wrote:
>> Hello,  I've had a stable system for >6 months, but in the last month,
>> I'd say I've had 4-6 total crashes, where the machine reboots itself ,
>> then maybe 2-4  crashes where  the system doesn't reboot, but closes all
>> VMs and asks me to re-login .
> 
> Look in "xl dmesg" for memory balancing errors. You might need to add
> loglvl=all to your Xen command line.


1) Thanks for responding. I have always had these ACPI complaints when I
boot. They may have grown to about 5-6 lines, but they flash by and the
system continues to boot, FWIW.

2) Is the output below what a "memory balancing error" might look like?
If so, what, if anything, would you advise changing in the BIOS or
elsewhere?

3) Meanwhile, please excuse the long post below of the output of

$ xl dmesg   (actually I used the "copy from dom0 log" feature)

--
(XEN) [VT-D]DMAR:[DMA Write] Request device [:04:00.0] fault addr fff0, iommu reg = 82c0009f4000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [:04:00.0] fault addr fff0, iommu reg = 82c0009f4000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [:04:00.0] fault addr fff0, iommu reg = 82c0009f4000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [:04:00.0] fault addr fff0, iommu reg = 82c0009f4000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [:04:00.0] fault addr fff0, iommu reg = 82c0009f4000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [:04:00.0] fault addr fff0, iommu reg = 82c0009f4000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [:04:00.0] fault addr fff0, iommu reg = 82c0009f4000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
(XEN) [VT-D]DMAR:[DMA Write] Request device [:04:00.0] fault addr fff0, iommu reg = 82c0009f4000
(XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
 Xen 4.6.6-35.fc23
(XEN) Xen version 4.6.6 (user@[unknown]) (gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)) debug=n Tue Nov 28 12:59:56 UTC 2017
(XEN) Latest ChangeSet: 
(XEN) Bootloader: EFI
(XEN) Command line: loglvl=all dom0_mem=min:1024M dom0_mem=max:4096M
(XEN) Video information:
(XEN)  VGA is graphics mode 1920x1080, 32 bpp
(XEN) Disc information:
(XEN)  Found 0 MBR signatures
(XEN)  Found 2 EDD information structures
(XEN) EFI RAM map:
(XEN)   - 00058000 (usable)
(XEN)  00058000 - 00059000 (reserved)
(XEN)  00059000 - 0009f000 (usable)
(XEN)  0009f000 - 000a (reserved)
(XEN)  0010 - 5cec8000 (usable)
(XEN)  5cec8000 - 5cec9000 (ACPI NVS)
(XEN)  5cec9000 - 5cef3000 (reserved)
(XEN)  5cef3000 - 5cf43000 (usable)
(XEN)  5cf43000 - 5dc64000 (reserved)
(XEN)  5dc64000 - 76e59000 (usable)
(XEN)  76e59000 - 777b2000 (reserved)
(XEN)  777b2000 - 77f99000 (ACPI NVS)
(XEN)  77f99000 - 77ffe000 (ACPI data)
(XEN)  77ffe000 - 77fff000 (usable)
(XEN)  7800 - 7810 (reserved)
(XEN)  e000 - f000 (reserved)
(XEN)  fe00 - fe011000 (reserved)
(XEN)  fec0 - fec01000 (reserved)
(XEN)  fee0 - fee01000 (reserved)
(XEN)  ff00 - 0001 (reserved)
(XEN)  0001 - 00047600 (usable)
(XEN) ACPI: RSDP 77F3, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT 77F300A0, 00C4 (r1 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: FACP 77F51188, 010C (r5 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: DSDT 77F30200, 20F88 (r2 ALASKA   A M I   1072009 INTL 20120913)
(XEN) ACPI: FACS 77F98F80, 0040
(XEN) ACPI: APIC 77F51298, 0084 (r3 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: FPDT 77F51320, 0044 (r1 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: FIDT 77F51368, 009C (r1 ALASKA   A M I   1072009 AMI 10013)
(XEN) ACPI: MCFG 77F51408, 003C (r1 ALASKAA M I  1072009 MSFT 97)
(XEN) ACPI: HPET 77F51448, 0038 (r1 ALASKAA M I  1072009 

Re: [qubes-users] Frequent 3.2 crashes , How to troubleshoot?

2018-01-05 Thread 'awokd' via qubes-users
On Fri, January 5, 2018 6:54 pm, yreb...@riseup.net wrote:
> Hello,  I've had a stable system for >6 months, but in the last month,
> I'd say I've had 4-6 total crashes, where the machine reboots itself ,
> then maybe 2-4  crashes where  the system doesn't reboot, but closes all
> VMs and asks me to re-login .

Look in "xl dmesg" for memory balancing errors. You might need to add
loglvl=all to your Xen command line.
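
Before editing the boot configuration, it may be worth checking whether loglvl=all is already in effect, since Xen prints its command line near the top of "xl dmesg". A minimal sketch, assuming the "(XEN) Command line:" line is still in the buffer; xen_has_loglvl_all is a made-up helper name.

```shell
#!/bin/sh
# Succeed (exit status 0) if the Xen command line reported in "xl dmesg"
# text on stdin contains the option loglvl=all as a whole word.
# xen_has_loglvl_all is a hypothetical helper name.
xen_has_loglvl_all() {
    grep '^(XEN) Command line:' | grep -q -w 'loglvl=all'
}

# Possible use in dom0:
#   sudo xl dmesg | xen_has_loglvl_all && echo "loglvl=all is active"
```

If the message buffer has already wrapped and the command-line entry has scrolled out, the check fails even when the option is set, so a negative result is only a hint.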
