Re: FreeBSD 12.x, virtio and alicloud (aliyun.com)
Hello.

Well, with this patch I'm getting a trap immediately during the kernel boot phase:

===Cut===
virtio_pci2: virtqueue 0 (vtnet0-0 rx) does not exists (size is zero)
virtio_pci2: cannot allocate virtqueue 0: 19
vtnet0: cannot allocate virtqueues
===Cut===

See the screenshot below: https://enazadev.ru/stub-data/freebsd12-patched-trap.png

On 05.11.2020 11:06, Cevin wrote:
The problem seems to have been fixed, but the code is still in review. For more details, see https://reviews.freebsd.org/D26915#601420

On Thu, Nov 5, 2020 at 12:35 PM, Eugene M. Zheganin wrote:
Hello,

Guys, does anyone have a VM running at AliCloud, the Chinese provider (one of the biggest, if not the biggest one)? They seem to provide stock FreeBSD 11.x images on some RedHat-based Linux with VirtIO, and these run just fine (at least I took a look at their kernel and it seems to be a stock GENERIC), but after a source upgrade to 12.2 it cannot mountroot, because literally no disks are found after the kernel boot stage. This, in turn, is caused by a bunch of repeated virtio errors, which look like (screenshot provided in the link):

virtio_pci1: cannot map I/O space
device_attach: virtio_pci1 attach returned 6

(https://enazadev.ru/stub-data/freebsd12-alicloud-cannot-map-io.png)

So not only can vtbd0 not be attached, but also the network adapter. Surprisingly, virtio_console and the memory balloon device seem to be working. I've taken a look at various VirtIO cases in the bug tracker and compiled a kernel without netmap (yeah, after some consideration this could help only with the virtio_net part), but this doesn't help.

Is this some sort of regression that needs to be reported? Is there some kind of known workaround? I also have a running 11.3 on a second VM, so I can provide any necessary details if needed.

Thanks.
Eugene.
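For anyone who wants to test the fix before it lands, a minimal sketch of pulling the diff from the review and rebuilding the kernel; the download URL form and the patch strip level are assumptions and depend on how the diff was generated:

# fetch the raw diff from the Phabricator review (URL form is an assumption)
fetch -o /tmp/D26915.diff 'https://reviews.freebsd.org/D26915?download=true'
cd /usr/src
patch -C < /tmp/D26915.diff    # dry run first; try -p1 if the paths don't match
patch < /tmp/D26915.diff
make -j8 buildkernel KERNCONF=GENERIC && make installkernel KERNCONF=GENERIC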
FreeBSD 12.x, virtio and alicloud (aliyun.com)
Hello,

Guys, does anyone have a VM running at AliCloud, the Chinese provider (one of the biggest, if not the biggest one)? They seem to provide stock FreeBSD 11.x images on some RedHat-based Linux with VirtIO, and these run just fine (at least I took a look at their kernel and it seems to be a stock GENERIC), but after a source upgrade to 12.2 it cannot mountroot, because literally no disks are found after the kernel boot stage. This, in turn, is caused by a bunch of repeated virtio errors, which look like (screenshot provided in the link):

virtio_pci1: cannot map I/O space
device_attach: virtio_pci1 attach returned 6

(https://enazadev.ru/stub-data/freebsd12-alicloud-cannot-map-io.png)

So not only can vtbd0 not be attached, but also the network adapter. Surprisingly, virtio_console and the memory balloon device seem to be working. I've taken a look at various VirtIO cases in the bug tracker and compiled a kernel without netmap (yeah, after some consideration this could help only with the virtio_net part), but this doesn't help.

Is this some sort of regression that needs to be reported? Is there some kind of known workaround? I also have a running 11.3 on a second VM, so I can provide any necessary details if needed.

Thanks.
Eugene.
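Not from the original thread, but a quick way to see which BARs the hypervisor actually exposes for the virtio functions is to compare pciconf output between the working 11.x VM and the failing 12.2 one; a sketch:

# list PCI devices with vendor strings and base address registers (BARs);
# for the failing functions, check whether only memory BARs (and no I/O BARs) are present
pciconf -lvb | grep -A6 '^virtio_pci'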
Re: pf and hnX interfaces
Hello,

On 13.10.2020 14:19, Kristof Provost wrote:
Are these symptoms of a bug? Perhaps. It can also be a symptom of resource exhaustion. Are there any signs of memory allocation failures, or incrementing error counters (in netstat or in pfctl)?

Well, the only signs of resource exhaustion I know of so far are:

- "PF state limit reached" in /var/log/messages (none so far)
- mbuf starvation in netstat -m (zero so far)
- various queue failure counters in netstat -s -p tcp, but since this only applies to TCP it's hardly related (although it seems there are none of those either).

So, what should I take a look at?

Disabled PF shows this in pfctl -s info:

[root@gw1:/var/log]# pfctl -s info
Status: Disabled for 0 days 00:41:42           Debug: Urgent

State Table                          Total             Rate
  current entries                     9634
  searches                     24212900618      9677418.3/s
  inserts                        222708269        89012.1/s
  removals                       222698635        89008.2/s
Counters
  match                          583327668       233144.6/s
  bad-offset                             0            0.0/s
  fragment                               1            0.0/s
  short                                  0            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                          76057           30.4/s
  proto-cksum                         9669            3.9/s
  state-mismatch                   3007108         1201.9/s
  state-insert                       13236            5.3/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s
  map-failed                             0            0.0/s

And these gazillions of searches bother me a lot, although this seems to be just a counting bug since the last PF reload, because the rate has been steadily decreasing from 20 million. To be honest, I doubt 10 million searches per second can be reached at 22 Kpps. Definitely a math bug.

Eugene.
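A few more places to look for exhaustion besides the ones listed above - just a sketch of standard checks, nothing exotic:

pfctl -s memory          # configured hard limits (states, src-nodes, frags, table-entries)
pfctl -s states | wc -l  # how close the state table is to the states limit
netstat -m               # mbuf denials/delays
vmstat -z | grep -i pf   # FAIL column for the pf UMA zones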
pf and hnX interfaces
Hello,

I'm running a FreeBSD 12.1 server as a VM under Hyper-V. And although this letter will give the impression of yet another lame post blaming FreeBSD for all the issues while the author should blame himself, I'm at the moment out of other explanations.

The thing is: I'm getting loads of sendmail errors like:

===Cut===
Oct 13 13:49:33 gw1 sm-mta[95760]: 09D8mN2P092173: SYSERR(root): putbody: write error: Permission denied
Oct 13 13:49:33 gw1 sm-mta[95760]: 09D8mN2P092173: SYSERR(root): timeout writing message to .mail.protection.outlook.com.: Permission denied
===Cut===

The relay address is just random. The thing is, I can successfully connect to it via telnet. Even send some commands. But when this is done by sendmail - and when it's actually sending messages - I get random errors. At first I was blaming myself and trying to find the rule that actually blocks something. I ended up having none of the block rules without a log clause, and at the same time tcpdump -netti pflog0 shows no dropped packets, but sendmail still eventually complains. If it matters, I have a relatively high packet rate on this interface, about 25 Kpps.

I've also found several postings mentioning that hnX handles TSO and LRO badly, so I switched them off. No luck, however, with vlanhwtag and vlanmtu, which for some reason just cannot be switched off. if_hn also lacks a man page for some reason, so it's unclear how to tweak it right.

And the most mysterious part - when I switch pf off, the errors stop appearing. This would clearly mean that pf blocks some packets, but then again, in that case pflog0 would show them, right (and yes - it's "UP")?

Is there some issue with pf and hn interfaces that I'm unaware of? Are these symptoms of a bug?

Thanks.
Eugene.
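For reference, the offload knobs mentioned above can be toggled roughly like this; hn0 and the address are examples, and the rc.conf line just shows where the flags go to make the change persistent:

# runtime
ifconfig hn0 -tso -lro -txcsum -rxcsum
# persistent, /etc/rc.conf (append the flags to the existing ifconfig_hn0 line)
ifconfig_hn0="inet 192.0.2.10/24 -tso -lro"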
Re: spa_namespace_lock and concurrent zfs commands
On 09.09.2020 17:29, Eugene M. Zheganin wrote:
Hello,

I'm using a sort of FreeBSD ZFS appliance with a custom API, and I'm suffering from huge timeouts when a large number (dozens, actually) of concurrent zfs/zpool commands are issued (get/create/destroy/snapshot/clone mostly). Are there any tunables that could help mitigate this? Once I took part in reporting https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203906 , but that time the issue got resolved somehow. Now I have another set of FreeBSD SANs and it's back. I've read https://wiki.freebsd.org/AndriyGapon/AvgZFSLocking and I realize this probably doesn't have a quick solution, but still...

This actually looks like the following (sometimes it takes several [dozens of] minutes):

root@cg-mr-prod-stg09:/usr/ports/sysutils/smartmontools # zfs get volmode
load: 3.58  cmd: zfs 70231 [spa_namespace_lock] 16.38r 0.00u 0.00s 0% 3872k
load: 3.58  cmd: zfs 70231 [spa_namespace_lock] 16.59r 0.00u 0.00s 0% 3872k
load: 3.58  cmd: zfs 70231 [spa_namespace_lock] 16.76r 0.00u 0.00s 0% 3872k
load: 3.58  cmd: zfs 70231 [spa_namespace_lock] 16.90r 0.00u 0.00s 0% 3872k
load: 3.58  cmd: zfs 70231 [spa_namespace_lock] 17.04r 0.00u 0.00s 0% 3872k
load: 3.58  cmd: zfs 70231 [spa_namespace_lock] 17.17r 0.00u 0.00s 0% 3872k

root@cg-mr-prod-stg09:~ # ps ax | grep volmode
70231  5  D+  0:00.00 zfs get volmode
70233  6  S+  0:00.00 grep volmode

root@cg-mr-prod-stg09:~ # procstat -kk 70231
  PID    TID COMM    TDNAME    KSTACK
70231 101598 zfs     -         mi_switch+0xe2 sleepq_wait+0x2c _sx_xlock_hard+0x459 spa_all_configs+0x1aa zfs_ioc_pool_configs+0x19 zfsdev_ioctl+0x72e devfs_ioctl+0xad VOP_IOCTL_APV+0x7c vn_ioctl+0x16a devfs_ioctl_f+0x1f kern_ioctl+0x2be sys_ioctl+0x15d amd64_syscall+0x364 fast_syscall_common+0x101
root@cg-mr-prod-stg09:~ #

Thanks.
Eugene.
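One possible way to reduce the number of concurrent zfs invocations fighting over spa_namespace_lock is to batch queries instead of running one "zfs get" per volume; a sketch, not a confirmed fix for the lock contention itself:

# one process retrieves volmode for every volume instead of N parallel "zfs get volmode <vol>"
zfs get -H -t volume -o name,value volmode
# equivalent parseable form, convenient for an API backend
zfs list -H -t volume -o name,volmode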
spa_namespace_lock and concurrent zfs commands
Hello,

I'm using a sort of FreeBSD ZFS appliance with a custom API, and I'm suffering from huge timeouts when a large number (dozens, actually) of concurrent zfs/zpool commands are issued (get/create/destroy/snapshot/clone mostly). Are there any tunables that could help mitigate this? Once I took part in reporting https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203906 , but that time the issue got resolved somehow. Now I have another set of FreeBSD SANs and it's back. I've read https://wiki.freebsd.org/AndriyGapon/AvgZFSLocking and I realize this probably doesn't have a quick solution, but still...

Thanks.
Eugene.
Re: running out of ports: every client port is used only once in outgoing connection
Hello,

On 27.08.2020 23:01, Eugene M. Zheganin wrote:
And as soon as I switch to it from DNS RR I start getting "Can't assign outgoing address when connecting to ...". The usual approach would be to assign multiple IP aliases to the destination backends, so I would get more socket tuples. So, it seems FreeBSD isn't reusing client ports out of the box. Linux, on the other hand, does reuse ports for client connections, as long as the socket tuple stays unique. How do I get the same behavior on FreeBSD?

Found this: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=174087
and this: https://svnweb.freebsd.org/base?view=revision&revision=361228

How do I determine whether the latter is merged into STABLE (is it?)?

Thanks.
Eugene.
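If the source tree is an svn checkout of the stable branch, one way to check whether a head revision has been merged is svn mergeinfo; a sketch, and the exact output format may differ between svn versions:

cd /usr/src
svnlite mergeinfo --show-revs merged ^/head . | grep 361228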
running out of ports: every client port is used only once in outgoing connection
Hello,

I have a situation where I'm running out of client ports on a huge reverse proxy. Say I have an nginx upstream like this:

upstream geoplatform {
    hash $hashkey consistent;
    server 127.0.0.1:4079 fail_timeout=10s;
    server 127.0.0.1:4080 fail_timeout=10s;
    server 10.100.34.5:4079 fail_timeout=10s;
    server 10.100.34.5:4080 fail_timeout=10s;
    server 10.100.34.7:4079 fail_timeout=10s;
    server 10.100.34.7:4080 fail_timeout=10s;
    server 10.100.34.8:4079 fail_timeout=10s;
    server 10.100.34.8:4080 fail_timeout=10s;
}

And as soon as I switch to it from DNS RR I start getting "Can't assign outgoing address when connecting to ...". The usual approach would be to assign multiple IP aliases to the destination backends, so I would get more socket tuples. So I did this:

upstream geoplatform {
    hash $hashkey consistent;
    server 127.0.0.1:4079 fail_timeout=10s;
    server 127.0.0.1:4080 fail_timeout=10s;
    server 127.0.0.2:4079 fail_timeout=10s;
    server 127.0.0.2:4080 fail_timeout=10s;
    server 127.0.0.3:4079 fail_timeout=10s;
    server 127.0.0.3:4080 fail_timeout=10s;
    server 10.100.34.5:4079 fail_timeout=10s;
    server 10.100.34.5:4080 fail_timeout=10s;
    server 10.100.33.8:4079 fail_timeout=10s;
    server 10.100.33.8:4080 fail_timeout=10s;
    server 10.100.33.9:4079 fail_timeout=10s;
    server 10.100.33.9:4080 fail_timeout=10s;
    server 10.100.33.10:4079 fail_timeout=10s;
    server 10.100.33.10:4080 fail_timeout=10s;
    server 10.100.34.7:4079 fail_timeout=10s;
    server 10.100.34.7:4080 fail_timeout=10s;
    server 10.100.34.8:4079 fail_timeout=10s;
    server 10.100.34.8:4080 fail_timeout=10s;
    server 10.100.34.10:4079 fail_timeout=10s;
    server 10.100.34.10:4080 fail_timeout=10s;
    server 10.100.34.11:4079 fail_timeout=10s;
    server 10.100.34.11:4080 fail_timeout=10s;
    server 10.100.34.12:4079 fail_timeout=10s;
    server 10.100.34.12:4080 fail_timeout=10s;
}

Surprisingly, this didn't work. So... I just checked whether I really have that many connections. It seems I start running into trouble at about 130K connections, but even with the initial upstream configuration I should be able to handle 65535 - 10K (since net.inet.ip.portrange.first is 10K) = 55535 ports, and 55535 * 8 ~ 450K connections.

Looks like the client port is not reused at all in socket tuples! Indeed it isn't: the line below was taken when there were no free ports left, since the nearby console window was flooded with "Can't assign requested address", so I would expect 10.100.34.6.57026 (the local IP-port pair) to be used in as many connections as there are servers. But it occurs only once:

# netstat -an | grep 10.100.34.6.57026
tcp4       0      0 10.100.34.6.57026      10.100.34.5.4079       ESTABLISHED
[root@geo2ng:vhost.d/balancer]#

Second test: let's count how many times each port is used in netstat -an:

# netstat -an -p tcp | grep -v LISTEN | grep 10.100 | awk '{print $4}' | sort | uniq -c | more | grep -v 1\ 
(none)

So, it seems FreeBSD isn't reusing client ports out of the box. Linux, on the other hand, does reuse ports for client connections, as long as the socket tuple stays unique. How do I get the same behavior on FreeBSD?

Thanks.
Eugene.
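Not a fix for the missing tuple reuse, but widening the ephemeral port range buys some headroom; a sketch with an example value:

# runtime
sysctl net.inet.ip.portrange.first=1024
# persistent, /etc/sysctl.conf
net.inet.ip.portrange.first=1024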
CARP under Hyper-V: weird things happen
Hello,

I'm running 12.0-REL in a VM under W2016S with CARP enabled, paired with a bare-metal FreeBSD server. All of a sudden I realized that this machine is unable to become a CARP MASTER - because it sees its own CARP announces, but instead of seeing them only from the CARP synthetic MAC address, it sees additional extra packets with several MACs derived from the original one (I'm well aware of -MacAddressSpoof on the SetVmNetworkAdapterVlan switch, and it's running with this thing on, but still). These packets almost always (but not 100% of the time) accompany each valid CARP advertisement.

Say, we have a CARP-enabled interface:

vlan2: flags=8943 metric 0 mtu 1500
        description: AS WAN
        options=8
        ether 00:15:5d:0a:79:12
        inet 91.206.242.9/28 broadcast 91.206.242.15
        inet 91.206.242.12/28 broadcast 91.206.242.15 vhid 3
        groups: vlan
        carp: BACKUP vhid 3 advbase 1 advskew 250
        vlan: 2 vlanpcp: 0 parent interface: hn1
        media: Ethernet autoselect (10Gbase-T )
        status: active
        nd6 options=29

Notice the MAC, and now look at this:

===Cut===
[root@gw1:~]# tcpdump -T carp -nepi vlan2 carp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vlan2, link-type EN10MB (Ethernet), capture size 262144 bytes
20:45:54.152619 00:00:5e:00:01:03 > 01:00:5e:00:00:12, ethertype IPv4 (0x0800), length 70: 91.206.242.9 > 224.0.0.18: CARPv2-advertise 36: vhid=3 advbase=1 advskew=100 authlen=7 counter=13769798250643227035
^^^ this is the ordinary and valid CARP advertisement, notice the synthetic MAC which requires mac address spoofing to be enabled.
20:45:54.152880 9c:8e:99:0f:79:42 > 01:00:5e:00:00:12, ethertype IPv4 (0x0800), length 70: 91.206.242.9 > 224.0.0.18: CARPv2-advertise 36: vhid=3 advbase=1 advskew=100 authlen=7 counter=13769798250643227035
^^^ this is some insanity happening
20:45:54.153234 9c:8e:99:0f:79:42 > 01:00:5e:00:00:12, ethertype IPv4 (0x0800), length 70: 91.206.242.9 > 224.0.0.18: CARPv2-advertise 36: vhid=3 advbase=1 advskew=100 authlen=7 counter=13769798250643227035
^^^ and again
20:45:54.153401 9c:8e:99:0f:79:42 > 01:00:5e:00:00:12, ethertype IPv4 (0x0800), length 70: 91.206.242.9 > 224.0.0.18: CARPv2-advertise 36: vhid=3 advbase=1 advskew=100 authlen=7 counter=13769798250643227035
^^^ and again
20:45:57.562470 00:00:5e:00:01:03 > 01:00:5e:00:00:12, ethertype IPv4 (0x0800), length 70: 91.206.242.9 > 224.0.0.18: CARPv2-advertise 36: vhid=3 advbase=1 advskew=100 authlen=7 counter=13769798250643227036
^^^ valid CARP advertisement, next one-second advbase cycle
20:45:57.562874 9c:8e:99:0f:79:3c > 01:00:5e:00:00:12, ethertype IPv4 (0x0800), length 70: 91.206.242.9 > 224.0.0.18: CARPv2-advertise 36: vhid=3 advbase=1 advskew=100 authlen=7 counter=13769798250643227036
^^^ more insane stuff, notice the NEW (sic !) MAC-address
20:45:57.562955 9c:8e:99:0f:79:3c > 01:00:5e:00:00:12, ethertype IPv4 (0x0800), length 70: 91.206.242.9 > 224.0.0.18: CARPv2-advertise 36: vhid=3 advbase=1 advskew=100 authlen=7 counter=13769798250643227036
20:45:57.562989 9c:8e:99:0f:79:3c > 01:00:5e:00:00:12, ethertype IPv4 (0x0800), length 70: 91.206.242.9 > 224.0.0.18: CARPv2-advertise 36: vhid=3 advbase=1 advskew=100 authlen=7 counter=13769798250643227036
^C
8 packets captured
3195 packets received by filter
===Cut===

Does anyone have, by any chance, some idea about what's happening? As soon as I stop the CARP stack on this VM these "mad" MACs aren't received anymore, so I'm pretty confident they are somehow produced on the Hyper-V side.
Another weird thing is that vlan1 refuses to work (it seems packets are never received on the VM side) unless it's configured on another adapter in -Untagged mode (once again, the powershell term for SetVmNetworkAdapterVlan).

Thanks.
Eugene.
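A capture filter that keeps only the anomalous frames (CARP is IP protocol 112; the MAC below is the synthetic vhid-3 address from the capture above) might make it easier to correlate them with events on the Hyper-V side; just a sketch:

tcpdump -nei vlan2 'ip proto 112 and not ether src 00:00:5e:00:01:03'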
ipsec/gif(4) tunnel not working: traffic not appearing on the gif(4) interface after deciphering
Hello,

I have a FreeBSD 11.1 box with 2 public IPs that has two tunnels to another FreeBSD box with 1 public IP. One of these tunnels is working, the other isn't. Long story short: I have some experience with ipsec tunnel setup, and I supposed that I had configured everything properly; to illustrate this I've loaded if_enc(4) on the 11.1 box and it does show the traffic for the second gif.

Here I ping the target troublesome host (2 public IPs) from the remote one (1 public IP), and tcpdump is launched on the receiver:

===Cut===
# tcpdump -npi enc0 host 83.222.68.177
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enc0, link-type ENC (OpenBSD encapsulated IP), capture size 262144 bytes
12:00:58.218256 (authentic): SPI 0x0c00b77c: IP 188.17.155.29 > 83.222.68.177: ESP(spi=0x0ffc906c,seq=0x14c), length 132
12:00:58.218271 (authentic,confidential): SPI 0x0ffc906c: IP 188.17.155.29 > 83.222.68.177: IP 172.16.0.68 > 172.16.0.67: ICMP echo request, id 24591, seq 121, length 64 (ipip-proto-4)
12:00:59.232761 (authentic): SPI 0x0c00b77c: IP 188.17.155.29 > 83.222.68.177: ESP(spi=0x0ffc906c,seq=0x14d), length 132
12:00:59.232773 (authentic,confidential): SPI 0x0ffc906c: IP 188.17.155.29 > 83.222.68.177: IP 172.16.0.68 > 172.16.0.67: ICMP echo request, id 24591, seq 122, length 64 (ipip-proto-4)
^C
12 packets captured
574 packets received by filter
0 packets dropped by kernel
===Cut===

From this output I conclude that IPSec is working, since the kernel is able to decipher the packets. But for some mysterious reason this traffic isn't showing up on the gif(4) (of course I have allowed all the traffic on the enc(4) itself); tcpdump shows nothing. If pinging in the opposite direction, tcpdump shows outgoing packets, enc(4) shows both (the remote replies successfully), but once again, there are no incoming packets on the gif(4).

There would be a simple answer if I had just misconfigured addressing on the gif(4), but I see no errors:

===Cut===
# ifconfig gif3
gif3: flags=8051 metric 0 mtu 1400
        description: idk2 <---> alamics
        options=8
        tunnel inet 83.222.68.177 --> 188.17.155.29
        inet 172.16.0.67 --> 172.16.0.68 netmask 0x
        nd6 options=29
        groups: gif
===Cut===

Since I don't have identical tunnel IP pairs I don't need net.link.gif.parallel_tunnels (right?), so my final guess - either there's something about having two tunnels to the same destination, or some bug in 11.1.

Any ideas?

Eugene.
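For completeness, these are the knobs that decide where enc(4) taps and filters IPsec traffic, plus a pf rule for the decapsulated IPIP (protocol 4) packets; the values and the rule are only a starting point to check, not a confirmed fix:

# where enc0 taps/filters traffic (see enc(4))
sysctl net.enc.in.ipsec_bpf_mask net.enc.in.ipsec_filter_mask
sysctl net.enc.out.ipsec_bpf_mask net.enc.out.ipsec_filter_mask

# pf.conf sketch: make sure the inner ipip-proto-4 packets are allowed through enc0
pass in quick on enc0 inet proto ipencap all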
11-STABLE, gstat and swap: uneven mirror disk usage
Hello, Am I right concluding that there's something wrong in either how freebsd works with swap partition, or with how gstat reports its activity ? Because on the consistently woring mirror the situation when only one disk member is used and the other is not for both reads and writes just cannot happen: last pid: 24590; load averages: 6.54, 8.61, 9.86 up 16+04:11:06 15:22:24 79 processes: 1 running, 78 sleeping CPU: 0.2% user, 0.0% nice, 15.0% system, 1.8% interrupt, 83.0% idle Mem: 55M Active, 1092K Inact, 2136K Laundry, 61G Wired, M Free ARC: 33G Total, 1799M MFU, 26G MRU, 3808M Anon, 408M Header, 1032M Other 26G Compressed, 31G Uncompressed, 1.18:1 Ratio Swap: 32G Total, 1665M Used, 30G Free, 5% Inuse, 5620K In, 2276K Out # gstat -do dT: 1.004s w: 1.000s L(q) ops/sr/s kBps ms/rw/s kBps ms/wd/s kBps ms/do/s ms/o %busy Name 0 19 191371.0 0 00.0 0 0 0.0 00.02.0| da0 16342342 2774 38.5 0 00.0 0 0 0.0 00.0 100.0| da1 2 5517 4192 386290.6 1189 255310.3 0 0 0.01350.8 67.3| da2 2 5408 4139 392030.6 1134 253320.4 0 0 0.01350.9 69.5| da3 0 5534 4193 374720.6 1205 255620.4 0 0 0.01350.9 69.3| da4 1 5501 4170 372380.6 1196 254040.3 0 0 0.01350.9 66.8| da5 0 5979 4485 424870.6 1366 285700.3 0 0 0.01280.7 71.0| da6 0 5897 4521 411090.6 1247 285780.3 0 0 0.01280.9 70.1| da7 0 5922 4491 416630.5 1302 288150.3 0 0 0.01280.8 69.1| da8 0 6071 4611 403990.5 1332 282910.3 0 0 0.01280.8 68.6| da9 2 5681 4267 414760.5 1286 249510.3 0 0 0.01270.7 66.4| da10 2 5515 4147 406520.6 1241 247520.3 0 0 0.01271.0 66.6| da11 2 5768 4284 414320.5 1357 248960.3 0 0 0.01270.8 64.8| da12 0 5608 4209 395650.5 1271 247520.3 0 0 0.01270.8 64.6| da13 0 5217 3747 372030.5 1333 259080.2 0 0 0.01370.6 60.6| da14 0 5151 3725 382260.5 1288 257530.2 0 0 0.01370.6 61.0| da15 0 5157 3722 367360.5 1297 259080.3 0 0 0.01370.7 63.4| da16 0 4948 3645 363750.7 1165 256550.4 0 0 0.01370.8 67.0| da17 0 0 0 00.0 0 00.0 0 0 0.0 00.00.0| da18 0 0 0 00.0 0 00.0 0 0 0.0 00.00.0| da0p1 0 0 0 00.0 0 00.0 0 0 0.0 00.00.0| da0p2 0 17 171330.7 0 00.0 0 0 0.0 00.01.2| da0p3 0 2 2 44.0 0 00.0 0 0 0.0 00.00.8| da0p4 0 0 0 00.0 0 00.0 0 0 0.0 00.00.0| gpt/boot0 0 0 0 00.0 0 00.0 0 0 0.0 00.00.0| gptid/2cdc4b9e-c5c6-11 e5-a23b-0cc47ad2b886 0 17 171330.7 0 00.0 0 0 0.0 00.01.2| gpt/zroot0 0 2 2 44.0 0 00.0 0 0 0.0 00.00.8| gpt/userdata0 0 0 0 00.0 0 00.0 0 0 0.0 00.00.0| da1p1 15336336 2263 35.5 0 00.0 0 0 0.0 00.0 100.0| da1p2 1 4 4510 187.7 0 00.0 0 0 0.0 00.0 74.8| da1p3 0 2 2 1 243.0 0 00.0 0 0 0.0 00.0 48.4| da1p4 0 0 0 00.0 0 00.0 0 0 0.0 00.00.0| diskid/DISK-761S1047TB 4V 0 0 0 00.0 0 00.0 0 0 0.0 00.00.0| gpt/boot1 0 0 0 00.0 0 00.0 0 0 0.0 00.00.0| gptid/2ce52799-c5c6-11 e5-a23b-0cc47ad2b886 15336336 2263 35.5 0 00.0 0 0 0.0 00.0 100.0| mirror/swap 1 4 4510 187.7 0 00.0 0 0 0.0 00.0 74.8| gpt/zroot1 0 2 2 1 243.0 0 00.0 0 0 0.0 00.0 48.4| gpt/userdata1 # gmirror status NameStatus Components mirror/swap COMPLETE da0p2 (ACTIVE) da1p2 (ACTIVE) Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list
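One thing worth checking for a read pattern like the one above is the mirror's balance algorithm - with "prefer" (or a skewed priority) all reads legitimately go to a single member. A sketch, assuming the gmirror name "swap" from the output above:

gmirror list swap | grep -i balance     # show the current balance algorithm and priorities
gmirror configure -b load swap          # e.g. switch to load-based read balancing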
Re: Where is my memory on 'fresh' 11-STABLE? It should be used by ARC, but it is not used for it anymore.
Hello,

On 20.11.2018 15:42, Lev Serebryakov wrote:
I have a server which is mostly a torrent box. It uses ZFS and is equipped with 16GiB of physical memory. It is running 11-STABLE (r339914 now). I've updated it to r339914 from some 11.1-STABLE revision 3 weeks ago. I was used to seeing 13-14GiB of memory in the ZFS ARC and it was Ok. Sometimes it "locks" under heavy disk load due to ARC memory pressure, but it was bearable, and as ZFS is the main reason this server exists, I didn't limit the ARC. But the new revision (r339914) shows very strange behavior: the ARC is no more than 4GiB, but the kernel has 15GiB wired:

Mem: 22M Active, 656M Inact, 62M Laundry, 15G Wired, 237M Free
ARC: 4252M Total, 2680M MFU, 907M MRU, 3680K Anon, 15M Header, 634M Other
     2789M Compressed, 3126M Uncompressed, 1.12:1 Ratio

These are typical numbers for the last week: 15G wired, 237M Free, but only 4252M ARC! Where is the other 11G of memory?!
[...]
And the total USED/FREE numbers are very strange to me:

$ vmstat -z | tr : , | awk -F , '1{u+=$2*$4; f+=$2*$5} END{print u,f}'
5717965420 9328951088
$

So, only ~5.7G is used and 9.3G is free! But why is this memory not used by the ARC anymore, and why is it wired and not free?

I'm getting pretty much the same story on a recent 11-STABLE from the 9th of November. Previous versions didn't raise that many questions about memory usage (and I run several 11-STABLEs).

Eugene.
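Building on the vmstat -z one-liner above, the same fields can be sorted per zone to see which UMA zones actually hold the memory; a sketch:

vmstat -z | tr : , | awk -F, 'NR>1 {printf "%14.0f  %s\n", $2*$4, $1}' | sort -rn | head -20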
Re: plenty of memory, but system is intensively swapping
Hello,

On 20.11.2018 16:22, Trond Endrestøl wrote:
I know others have created a daemon that observes the ARC and the amount of wired and free memory, and when these values exceed some threshold, the daemon will allocate a number of gigabytes, writing zero to the first byte or word of every page, and then freeing the allocated memory before going back to sleep. The ARC will release most of its allocations and the kernel will also release some but not all of its wired memory, and some user pages are likely to be thrown onto the swap device, turning the user experience into a mild nightmare while waiting for applications to be paged back into memory. ZFS seems to be the common factor in most, if not all, of these cases. I created my own and not so sophisticated C program that I run every now and then:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
  const size_t pagesize = (size_t)getpagesize();
  const size_t gigabyte = 1024ULL * 1024ULL * 1024ULL;
  size_t amount, n = 1ULL;
  char *p, *offset;

  if (argc > 1) {
    sscanf(argv[1], "%zu", &n);
  }

  amount = n * gigabyte;

  if (amount > 0ULL) {
    if ( (p = malloc(amount)) != NULL) {
      for (offset = p; offset < p + amount; offset += pagesize) {
        *offset = '\0';
      }
      free(p);
    }
    else {
      fprintf(stderr,
              "%s:%s:%d: unable to allocate %zu gigabyte%s\n",
              argv[0], __FILE__, __LINE__,
              n, (n == 1ULL) ? "" : "s");
      return 2;
    }
  }
  else {
    return 1;
  }

  return 0;
} // main()

// allocate_gigabytes.c

Jeez, thanks a lot, this stuff is working. Now the system has 8 gigs of free memory and has stopped swapping. Well, the next question is addressed to the core team, which I suppose reads this ML eventually - why don't we have something similar as a watchdog in the base system? I understand that this solution is architecturally ugly, but it's worse not to have any at all, and this one works. At least I'm about to run it periodically.

Trond, thanks again.

Eugene.
Re: plenty of memory, but system is intensively swapping
Hello,

On 20.11.2018 15:12, Trond Endrestøl wrote:
On freebsd-hackers the other day, https://lists.freebsd.org/pipermail/freebsd-hackers/2018-November/053575.html, it was suggested to set vm.pageout_update_period=0. This sysctl is at 600 initially. ZFS' ARC needs to be capped, otherwise it will eat most, if not all, of your memory.

Well, as you can see, the ARC ate only half, and the other half is eaten by the kernel. So far I suppose that if I cap the ARC, the kernel will simply eat the rest.

Eugene.
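For reference, capping the ARC is a one-line loader tunable; the value below is only an example and takes effect after a reboot:

# /boot/loader.conf
vfs.zfs.arc_max="8G"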
plenty of memory, but system is intensively swapping
Hello, I have a recent FreeBSD 11-STABLE which is mainly used as an iSCSI target. The system has 64G of RAM but is swapping intensively. Yup, about of half of the memory is used as ZFS ARC (isn't capped in loader.conf), and another half is eaten by the kernel, but it oly uses only about half of it (thus 25% of the total amount). Could this be tweaked by some sysctl oids (I suppose not, but worth asking). top, vmstat 1 snapshots and zfs-stats -a are listed below. Thanks. [root@san01:nginx/vhost.d]# vmstat 1 procs memory pagedisks faults cpu r b w avm fre flt re pi pofr sr da0 da1 in sycs us sy id 0 0 38 23G 609M 1544 68 118 64 895 839 0 0 3644 2678 649 0 13 87 0 0 53 23G 601M 1507 185 742 315 1780 33523 651 664 56438 785 476583 0 28 72 0 0 53 23G 548M 1727 330 809 380 2377 33256 758 763 5 1273 468545 0 26 73 0 0 53 23G 528M 1702 239 660 305 1347 32335 611 631 59962 1025 490365 0 22 78 0 0 52 23G 854M 2409 309 693 203 97943 16944 525 515 64309 1570 540533 0 29 71 3 0 54 23G 1.1G 2756 639 641 149 124049 19531 542 538 64777 1576 553946 0 35 65 0 0 53 23G 982M 1694 236 680 282 2754 35602 597 603 66540 1385 583687 0 28 72 0 0 41 23G 867M 1882 223 767 307 1162 34936 682 638 67284 780 568818 0 33 67 0 0 39 23G 769M 1542 167 673 336 1187 35123 646 610 65925 1176 551623 0 23 77 2 0 41 23G 700M 3602 535 688 327 2192 37109 622 594 65862 4256 518934 0 33 67 0 0 54 23G 650M 2957 219 726 464 4838 36464 852 868 65384 4110 558132 1 37 62 0 0 54 23G 641M 1576 245 730 344 1139 33681 740 679 67216 970 560379 0 31 69 [root@san01:nginx/vhost.d]# top last pid: 55190; load averages: 11.32, 12.15, 10.76 up 10+16:05:14 14:38:58 101 processes: 1 running, 100 sleeping CPU: 0.2% user, 0.0% nice, 28.9% system, 1.6% interrupt, 69.3% idle Mem: 85M Active, 1528K Inact, 12K Laundry, 62G Wired, 540M Free ARC: 31G Total, 19G MFU, 6935M MRU, 2979M Anon, 556M Header, 1046M Other 25G Compressed, 34G Uncompressed, 1.39:1 Ratio Swap: 32G Total, 1186M Used, 31G Free, 3% Inuse, 7920K In, 3752K Out PID USERNAME THR PRI NICE SIZERES STATE C TIMEWCPU COMMAND 40132 root 131 520 3152M 75876K uwait 14 36:59 6.10% java 55142 root 1 200 7904K 2728K CPU20 20 0:00 0.72% top 20026 root 1 200 106M 5676K nanslp 28 1:23 0.60% gstat 53642 root 1 200 7904K 2896K select 14 0:03 0.58% top 977 zfsreplica 1 200 30300K 3568K kqread 21 4:00 0.42% uwsgi 968 zfsreplica 1 200 30300K 2224K swread 11 2:03 0.21% uwsgi 973 zfsreplica 1 200 30300K 2264K swread 13 12:26 0.13% uwsgi 53000 www1 200 23376K 1372K kqread 24 0:00 0.05% nginx 1292 root 1 200 6584K 2040K select 29 0:23 0.04% blacklistd 776 zabbix 1 200 12408K 4236K nanslp 26 4:42 0.03% zabbix_agentd 1289 root 1 200 67760K 5148K select 13 9:50 0.03% bsnmpd 777 zabbix 1 200 12408K 1408K select 25 5:06 0.03% zabbix_agentd 785 zfsreplica 1 200 27688K 3960K kqread 28 2:04 0.02% uwsgi 975 zfsreplica 1 200 30300K 464K kqread 18 2:33 0.02% uwsgi 974 zfsreplica 1 200 30300K 480K kqread 30 3:39 0.02% uwsgi 965 zfsreplica 1 200 30300K 464K kqread 4 3:23 0.02% uwsgi 976 zfsreplica 1 200 30300K 464K kqread 14 2:59 0.01% uwsgi 972 zfsreplica 1 200 30300K 464K kqread 10 2:57 0.01% uwsgi 963 zfsreplica 1 200 30300K 460K kqread 3 2:45 0.01% uwsgi 971 zfsreplica 1 200 30300K 464K kqread 13 3:16 0.01% uwsgi 69644 emz1 200 13148K 4596K select 24 0:05 0.01% sshd 18203 vryabov1 200 13148K 4624K select 9 0:02 0.01% sshd 636 root 1 200 6412K 1884K select 17 4:10 0.01% syslogd 51266 emz1 200 13148K 4576K select 5 0:00 0.01% sshd 964 zfsreplica 1 200 30300K 460K kqread 18 11:02 0.01% uwsgi 962 zfsreplica 1 200 30300K 460K kqread 
28 6:56 0.01% uwsgi 969 zfsreplica 1 200 30300K 464K kqread 12 2:07 0.01% uwsgi 967 zfsreplica 1 200 30300K 464K kqread 27 5:18 0.01% uwsgi 970 zfsreplica 1 200 30300K 464K kqread 0 4:25 0.01% uwsgi 966 zfsreplica 1 220 30300K 468K kqread 14 4:29 0.01% uwsgi 53001 www1 200 23376K 1256K kqread 10 0:00 0.01% nginx 791 zfsreplica 1 200 27664K 4244K kqread 17 1:34 0.01% uwsgi 52431 root 1 200 17132K 4492K select 21 0:00 0.01% mc 70013 root 1 200 17132K 4492K select 4 0:03 0.01% mc 870 root 1 200 12448K 12544K select 19
Re: ZFS: Can't find pool by guid
Hello.

On 28.04.2018 17:46, Willem Jan Withagen wrote:
Hi, I upgraded a server from 10.4 to 11.1 and now all of a sudden the server complains about: ZFS: Can't find pool by guid And I end up in the boot prompt: lsdev gives disk0 with, on p1, the partition that the zroot is/was. This is an active server, so redoing the install and such is not going to be really workable. So how do I get this to boot?

The basic scenario for this is when you have a "shadow" pool on the bootable disks alongside the actual root pool - for example, once you had a zfs pool on some disks that were in dedicated mode, then you extracted these disks without clearing the zpool labels (and 'zpool destroy' never clears the zpool labels) and installed the system onto them. This way 'zpool import' will show the old pool, which has no live replicas and no live vdevs. The system on it may be bootable (and probably will be) until the data gets redistributed in some way; after that gptzfsboot will start to see the old pool remains, will try to detect whether this pool has a bootfs on it - but in this case there's no valid pool - so it will run into an error and stop working. Actually, the newer 11.2 gptzfsboot loader has better support for this - it clearly states the pool found and mentions the error - thanks to all the guys who did a great job on this, seriously.

The way to resolve this is to detach disks sequentially from the root pool (or offline them in case of raidz), run 'zpool labelclear' on them (please keep in mind that 'labelclear' is evil and ignorant, and breaks things including the GPT table), then attach them back, resilver, and repeat this until 'zpool import' shows no old disassembled pools. Determining which disks have the old labels can be done with 'zdb -l /dev/<disk> | grep name:'.

I understand that your situation was resolved long ago; I'm writing this merely to establish a knowledge point in case someone else steps on this too, like I did yesterday.

Eugene.
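A condensed sketch of the procedure described above, for a two-way mirror; the device names are examples, and labelclear is destructive, so double-check the target before running it:

zdb -l /dev/da1p3 | grep -E 'name:|guid:'   # confirm the stale label really is on this member
zpool detach zroot da1p3                    # drop it from the live pool
zpool labelclear -f /dev/da1p3              # wipe all ZFS labels on it
zpool attach zroot da0p3 da1p3              # re-add it and resilver
zpool status zroot                          # wait for the resilver before doing the next member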
FreeBSD CTL device data/id questions
Hi,

I have a bunch of dumb (not sarcasm) questions concerning the FreeBSD CTL layer and iSCSI target management:

- is the "FREEBSD CTLDISK 0001" line that ctladm lunlist presents, and that the initiators see as the hardware id, hardcoded somewhere, especially the "CTLDISK 0001" part?

- I am able to change the "FREEBSD" part of the above, but only from the configuration file (ctl.conf), not from ctladm at the create stage (when I'm issuing -o vendor there's no error, and the vendor in the devlist changes but not in the lunlist; however, I do see changed vendors in the lunlist when they come from the config).

- is the desire to change the "FREEBSD CTLDISK 0001" part weird? I'm currently considering it as a part of production maintenance, but I'm not sure. Is there a way to change it without touching the code?

Thanks.
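For the config-file route, the inquiry strings can be overridden per LUN in ctl.conf roughly like this; the target name, backing device and values are made up for illustration, see ctl.conf(5) for the exact option list:

target iqn.2018-01.ru.example:target0 {
        portal-group pg0
        lun 0 {
                path /dev/zvol/data/lun0
                option vendor   "ACME"
                option product  "VDISK"
                option revision "0001"
        }
}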
Re: extract the process arguments from the crashdump
Hello,

On 14.05.2018 18:12, Konstantin Belousov wrote:
On Mon, May 14, 2018 at 05:32:21PM +0500, Eugene M. Zheganin wrote:
Well, unfortunately this gives me exactly the same information as the core.X.txt file contains - process names without arguments, and I really want to know what arguments ctladm had when the system crashed:
Most likely the in-kernel cache for the process arguments was dropped.

Is there anything I can do to prevent this from happening, so that if the system crashes next time the process arguments will be extractable (in this case I would still like to get the arguments information)?

Thanks.
Eugene.
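If the argument string is longer than the kernel's argument cache, it isn't kept in the process structure and thus cannot be recovered from a dump, so raising the cache limit beforehand should help; a sketch, the value is an example:

sysctl kern.ps_arg_cache_limit            # show the current (fairly small) default
sysctl kern.ps_arg_cache_limit=1024       # runtime
# persistent, /etc/sysctl.conf
kern.ps_arg_cache_limit=1024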
Re: extract the process arguments from the crashdump
Hello, On 14.05.2018 16:15, Konstantin Belousov wrote: On Mon, May 14, 2018 at 01:02:28PM +0500, Eugene M. Zheganin wrote: Hello, Is there any way to extract the process arguments from the system crashdump ? If yes, could anyone please explain to me how do I do it. ps -M vmcore.file -N /boot/mykernel/kernel -auxww Even if I ask ps explicitely to give me args, for some reason it ignores the format and 'args' keyword seems to be an alias for 'comm', but with square brackets: [root@san1:esx/r332096M]# ps -M vmcore.4 -N /boot/kernel/kernel -axo 'pid,ppid,comm,args' PID PPID COMMANDCOMMAND 0 0 kernel [kernel] 1 0 init [init] 2 0 crypto [crypto] 3 0 crypto returns [crypto returns] 4 0 cam[cam] 5 0 soaiod1[soaiod1] 6 0 soaiod2[soaiod2] 7 0 soaiod3[soaiod3] 8 0 soaiod4[soaiod4] 9 0 zfskern[zfskern] 10 0 audit [audit] 11 0 idle [idle] 12 0 intr [intr] 13 0 geom [geom] 14 0 usb[usb] 15 0 sctp_iterator [sctp_iterator] 16 0 pf purge [pf purge] 17 0 rand_harvestq [rand_harvestq] 18 0 enc_daemon0[enc_daemon0] 19 0 enc_daemon1[enc_daemon1] 20 0 enc_daemon2[enc_daemon2] 21 0 g_mirror swap [g_mirror swap] 22 0 pagedaemon [pagedaemon] 23 0 vmdaemon [vmdaemon] 24 0 pagezero [pagezero] 25 0 bufdaemon [bufdaemon] 26 0 bufspacedaemon [bufspacedaemon] 27 0 syncer [syncer] 28 0 vnlru [vnlru] 114 1 adjkerntz [adjkerntz] 593 1 moused [moused] 606 1 devd [devd] 701 1 syslogd[syslogd] 784 1 watchdogd [watchdogd] 866 0 ctl[ctl] 868 1 ctld [ctld] 894 1 zabbix_agentd [zabbix_agentd] 898 894 zabbix_agentd [zabbix_agentd] 901 894 zabbix_agentd [zabbix_agentd] 905 894 zabbix_agentd [zabbix_agentd] 907 894 zabbix_agentd [zabbix_agentd] 949 1 ntpd [ntpd] 968 1 nginx [nginx] 978 0 ng_queue [ng_queue] 1069 1 sshd [sshd] 1151 1 sendmail [sendmail] 1154 1 sendmail [sendmail] 1158 1 cron [cron] 1197 1 bsnmpd [bsnmpd] 1200 1 blacklistd [blacklistd] 1210 1 getty [getty] 1211 1 getty [getty] 1212 1 getty [getty] 1213 1 getty [getty] 1214 1 getty [getty] 1215 1 getty [getty] 1216 1 getty [getty] 1217 1 getty [getty] 1218 1 getty [getty] 12970 968 nginx [nginx] 12971 968 nginx [nginx] 12972 968 nginx [nginx] 12973 968 nginx [nginx] 12974 968 nginx [nginx] 12975 968 nginx [nginx] 12976 968 nginx [nginx] 12977 968 nginx [nginx] 12978 968 nginx [nginx] 12979 968 nginx [nginx] 12980 968 nginx [nginx] 12981 968 nginx [nginx] 12982 968 nginx [nginx] 12983 968 nginx [nginx] 12984 968 nginx [nginx] 12985 968 nginx [nginx] 12986 968 nginx [nginx] 32835 1069 sshd [sshd] 32884 32835 sshd [sshd] 32885 32884 zsh[zsh] 32929 32885 su [su] 32948 32929 csh[csh] 32964 32948 sh [sh] 32965 32964 mc [mc] 32966 32965 csh[csh] 48747 67993 sudo [sudo] 48750 67988 sudo [sudo] 48757 48750 zfs[zfs] 48758 48747 zfs[zfs] 48759 67990 sudo [sudo] 48762 48759 zfs[zfs] 48765 67997 sudo [sudo] 48766 48765 zfs[zfs] 48769 67984 sudo [sudo] 48770 48769 zfs[zfs] 48771 67996 sudo [sudo] 48772 48771 zfs[zfs] 48785 67991 sudo [sudo] 48786 48785 ctladm [ctladm] 48787 67983 sudo [sudo] 48788 48787 ctladm [ctladm] 48789 67986 sudo [sudo] 48790 48789 ctladm [ctladm] 48791 67985 sudo [sudo] 48792 48791 ctladm [ctladm] 48796 67987 sudo [sudo] 48797 48796 zfs[zfs] 67980 1 uwsgi [uwsgi] 67981 67980 uwsgi [uwsgi] 67982 67980 uwsgi [uwsgi] 67983 67980 uwsgi [uwsgi] 67984 67980 uwsgi [uwsgi] 67985 67980 uwsgi [uwsgi] 67986 67980 uwsgi [uwsgi] 67987 67980 uwsgi [uwsgi] 67988 67980 uwsgi [uwsgi] 67989 67980 uwsgi [uwsgi] 67990 67980 uwsgi [uwsgi] 67991 67980 uwsgi [uwsgi] 67992 67980 uwsgi [uwsgi] 67993 67980 uwsgi [uwsgi] 67994 67980 uwsgi [uwsgi] 67995 67980 uwsgi [uwsgi] 67996
Re: extract the process arguments from the crashdump
Hello, On 14.05.2018 16:15, Konstantin Belousov wrote: On Mon, May 14, 2018 at 01:02:28PM +0500, Eugene M. Zheganin wrote: Hello, Is there any way to extract the process arguments from the system crashdump ? If yes, could anyone please explain to me how do I do it. ps -M vmcore.file -N /boot/mykernel/kernel -auxww Well, unfortunately this gives me exactly same information as the core.X.txt file contains - process names without arguments, and I really want to know what arguments ctladm had when the system has crashed: [root@san1:esx/r332096M]# ps -M vmcore.4 -N /boot/kernel/kernel -auxww USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND root 0 0,0 0,0 0 0 - DLs 1янв.70 2866:37,17 [kernel] root 1 0,0 0,0542416 - DLs 1янв.70 0:03,95 [init] root 2 0,0 0,0 0 0 - DL1янв.70 0:00,00 [crypto] root 3 0,0 0,0 0 0 - DL1янв.70 0:00,00 [crypto returns] root 4 0,0 0,0 0 0 - RL1янв.70 175:44,92 [cam] root 5 0,0 0,0 0 0 - DL1янв.70 0:00,07 [soaiod1] root 6 0,0 0,0 0 0 - DL1янв.70 0:00,07 [soaiod2] root 7 0,0 0,0 0 0 - DL1янв.70 0:00,07 [soaiod3] root 8 0,0 0,0 0 0 - DL1янв.70 0:00,07 [soaiod4] root 9 0,0 0,0 0 0 - DL1янв.70 181:27,20 [zfskern] root 10 0,0 0,0 0 0 - DL1янв.70 0:00,00 [audit] root 11 0,0 0,0 0 0 - RL1янв.70 183810:56,57 [idle] root 12 0,0 0,0 0 0 - WL1янв.70 131:37,76 [intr] root 13 0,0 0,0 0 0 - DL1янв.70 1:33,61 [geom] root 14 0,0 0,0 0 0 - DL1янв.70 0:36,74 [usb] root 15 0,0 0,0 0 0 - DL1янв.70 0:00,00 [sctp_iterator] root 16 0,0 0,0 0 0 - DL1янв.70 1:38,61 [pf purge] root 17 0,0 0,0 0 0 - DL1янв.70 1:11,87 [rand_harvestq] root 18 0,0 0,0 0 0 - DL1янв.70 0:00,37 [enc_daemon0] root 19 0,0 0,0 0 0 - DL1янв.70 0:00,38 [enc_daemon1] root 20 0,0 0,0 0 0 - DL1янв.70 0:05,20 [enc_daemon2] root 21 0,0 0,0 0 0 - DL1янв.70 1:03,00 [g_mirror swap] root 22 0,0 0,0 0 0 - DL1янв.70 10:19,64 [pagedaemon] root 23 0,0 0,0 0 0 - DL1янв.70 0:18,40 [vmdaemon] root 24 0,0 0,0 0 0 - DL1янв.70 0:00,01 [pagezero] root 25 0,0 0,0 0 0 - DL1янв.70 0:01,71 [bufdaemon] root 26 0,0 0,0 0 0 - DL1янв.70 0:01,95 [bufspacedaemon] root 27 0,0 0,0 0 0 - DL1янв.70 2:20,07 [syncer] root 28 0,0 0,0 0 0 - DL1янв.70 0:03,19 [vnlru] root 114 0,0 0,06288 0 - DWs - 0:00,00 [adjkerntz] root 593 0,0 0,06600 1860 - Ds1янв.70 0:00,00 [moused] root 606 0,0 0,09180 620 - Ds1янв.70 0:07,76 [devd] root 701 0,0 0,06420 1928 - Ds1янв.70 0:26,92 [syslogd] root 784 0,0 0,03564 3612 - Ds1янв.70 0:01,46 [watchdogd] root 866 0,0 0,0 0 0 - DL1янв.70 42:20,99 [ctl] root 868 0,0 0,0 224200 2248 - Ds1янв.70 20:03,85 [ctld] zabbix 894 0,0 0,0 12424 0 - DW - 0:00,00 [zabbix_agentd] zabbix 898 0,0 0,0 12424 4504 - D 1янв.70 1:02,34 [zabbix_agentd] zabbix 901 0,0 0,0 12424 0 - DW - 0:00,00 [zabbix_agentd] zabbix 905 0,0 0,0 12424 1580 - D 1янв.70 3:03,14 [zabbix_agentd] zabbix 907 0,0 0,0 12424 1376 - D 1янв.70 3:05,45 [zabbix_agentd] root 949 0,0 0,0 12452 12532 - Ds1янв.70 0:19,90 [ntpd] root 968 0,0 0,0 1063848 0 - DWs - 0:00,00 [nginx] root 978 0,0 0,0 0 0 - DL1янв.70 0:00,00 [ng_queue] root1069 0,0 0,0 12848 3780 - Ds1янв.70 0:06,33 [sshd] root1151 0,0 0,0 10452 4304 - Ds1янв.70 0:09,25 [sendmail] smmsp 1154 0,0 0,0 10452 0 - DWs - 0:00,00 [sendmail] root1158 0,0 0,06464 0 - DWs - 0:00,00 [cron] root1197 0,0 0,0 10060 5268 - Ds1янв.70 4:51,59 [bsnmpd] root1200 0,0 0,06600 2112 - Ds1янв.70 0:04,13 [blacklistd] root1210 0,0 0,06408 1844 - Ds+ 1янв.70 0:00,00 [getty] root1211 0,0 0,06408 1844 - Ds+ 1янв.70 0:00,00 [getty] root1212 0,0 0,06408 1844 - Ds+ 1янв.70 0:00,00 [getty] root1213 0,0 0,06408 1844 - Ds+ 1янв.70 0:00,00 [getty] root1214 0,0 
0,06408 1844 - Ds+ 1янв.70 0:00,00 [getty] root1215 0,0 0,0
extract the process arguments from the crashdump
Hello,

Is there any way to extract the process arguments from a system crashdump? If yes, could anyone please explain to me how to do it.

Thanks.
Eugene.
clear old pools remains from active vdevs
Hello,

I have some active vdev disk members that used to be in a pool that clearly has not been destroyed properly, so in the "zpool import" output I'm seeing something like:

# zpool import
   pool: zroot
     id: 14767697319309030904
  state: UNAVAIL
 status: The pool was last accessed by another system.
 action: The pool cannot be imported due to damaged devices or data.
    see: http://illumos.org/msg/ZFS-8000-EY
 config:

        zroot                    UNAVAIL  insufficient replicas
          mirror-0               UNAVAIL  insufficient replicas
            5291726022575795110  UNAVAIL  cannot open
            2933754417879630350  UNAVAIL  cannot open

   pool: esx
     id: 8314148521324214892
  state: UNAVAIL
 status: The pool was last accessed by another system.
 action: The pool cannot be imported due to damaged devices or data.
    see: http://illumos.org/msg/ZFS-8000-EY
 config:

        esx                       UNAVAIL  insufficient replicas
          mirror-0                UNAVAIL  insufficient replicas
            10170732803757341731  UNAVAIL  cannot open
            9207269511643803468   UNAVAIL  cannot open

Is there any _safe_ way to get rid of this? I'm asking because the gptzfsboot loader in recent -STABLE stumbles upon this and refuses to boot the system (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227772). The workaround is to use the 11.1 loader, but I'm afraid this behavior will now be the intended one.

Eugene.
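To find which providers still carry the stale labels before deciding how to clear them, a quick scan over the disks can help; partitions (daXpY, adaXpY, ...) may need to be checked separately as well:

for d in $(sysctl -n kern.disks); do
        echo "== /dev/${d}"
        zdb -l /dev/${d} 2>/dev/null | grep -E 'name:|guid:'
done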
vputx: usecount not zero for vnode
Hello,

What is this panic? I got it just recently on a fresh -STABLE r332466:

Apr 18 17:52:39 san1 kernel: vputx: usecount not zero for vnode
Apr 18 17:52:39 san1 kernel: 0xf80f4d7d1760: tag devfs, type VCHR
Apr 18 17:52:39 san1 kernel: usecount -1, writecount -1, refcount 0

"vputx: usecount not zero for vnode" can be found in just one place, /usr/src/sys/kern/vfs_subr.c, and it's followed by a panic() call. Unfortunately, I have no crashdump of it (though I usually get crashdumps on this machine).

Eugene.
HAST, cyclic signal 6, and inability to start
Hi. About a month ago I was experimenting with HAST on my servers, and, though I did have a complications with signal 6 on init phase, I was able to start it and it was working in test mode for a couple of weeks. After that I had to reboo both of them and now it doesn't start al all - both node hast is crashing on signal 6, and I'm unable to launch it as primary in either one. As soon as I switch from init or secondary to primary on either node - bad things are starting to happen - cyclic signal 6 for hastd and hangups for hastctl. Both nodes are running FreeBSD 11.1-RELEASE-pX (p1 and p6). Here's an extempt from the PR https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227461 I just created: ===Cut=== Node A (11.1-RELEASE-p1): [root@gw0:/var/log]# service hastd start Starting hastd. [root@gw0:/var/log]# hastctl status NameStatus Role Components hasta -init /dev/gpt/hasta tcp4://192.168.0.247 hastb -init /dev/gpt/hastb tcp4://192.168.0.247 [root@gw0:/var/log]# hastctl role secondary hasta [root@gw0:/var/log]# hastctl role secondary hastb [root@gw0:/var/log]# hastctl status NameStatus Role Components hasta -secondary /dev/gpt/hasta tcp4://192.168.0.247 hastb -secondary /dev/gpt/hastb tcp4://192.168.0.247 Node B (11.1-RELEASE-p6): [root@gw1:/var/log]# service hastd start Starting hastd. [root@gw1:/var/log]# hastctl status NameStatus Role Components hasta -init /dev/gpt/hasta tcp4://192.168.0.248 hastb -init /dev/gpt/hastb tcp4://192.168.0.248 [root@gw1:/var/log]# hastctl role promary hasta usage: hastctl create [-d] [-c config] [-e extentsize] [-k keepdirty] [-m mediasize] name ... hastctl role [-d] [-c config] all | name ... hastctl list [-d] [-c config] [all | name ...] hastctl status [-d] [-c config] [all | name ...] hastctl dump [-d] [-c config] [all | name ...] [root@gw1:/var/log]# hastctl role primary hasta [root@gw1:/var/log]# hastctl role primary hastb [root@gw1:/var/log]# hastctl status (hangs) Node B dmesg: pid 26813 (hastd), uid 0: exited on signal 6 (core dumped) pid 26814 (hastd), uid 0: exited on signal 6 (core dumped) pid 26815 (hastd), uid 0: exited on signal 6 (core dumped) pid 26816 (hastd), uid 0: exited on signal 6 (core dumped) pid 26817 (hastd), uid 0: exited on signal 6 (core dumped) pid 26822 (hastd), uid 0: exited on signal 6 (core dumped) pid 26825 (hastd), uid 0: exited on signal 6 (core dumped) pid 26828 (hastd), uid 0: exited on signal 6 (core dumped) pid 26829 (hastd), uid 0: exited on signal 6 (core dumped) pid 26830 (hastd), uid 0: exited on signal 6 (core dumped) pid 26831 (hastd), uid 0: exited on signal 6 (core dumped) pid 26833 (hastd), uid 0: exited on signal 6 (core dumped) pid 26836 (hastd), uid 0: exited on signal 6 (core dumped) pid 26837 (hastd), uid 0: exited on signal 6 (core dumped) Node B messages: Apr 12 15:02:49 gw1 kernel: pid 26891 (hastd), uid 0: exited on signal 6 (core dumped) Apr 12 15:02:50 gw1 hastd[26679]: [hastb] (primary) Worker process killed (pid=26891, signal=6). Apr 12 15:02:50 gw1 hastd[26893]: [hasta] (primary) Descriptor 7 is open (pipe or FIFO), but should be closed. Apr 12 15:02:50 gw1 hastd[26893]: [hasta] (primary) Aborted at function descriptors_assert, file /usr/src/sbin/hastd/hastd.c, line 303. Apr 12 15:02:50 gw1 kernel: pid 26893 (hastd), uid 0: exited on signal 6 (core dumped) Apr 12 15:02:51 gw1 hastd[26679]: [hasta] (primary) Worker process killed (pid=26893, signal=6). Apr 12 15:02:51 gw1 hastd[26896]: [hastb] (primary) Descriptor 7 is open (pipe or FIFO), but should be closed. 
Apr 12 15:02:51 gw1 hastd[26896]: [hastb] (primary) Aborted at function descriptors_assert, file /usr/src/sbin/hastd/hastd.c, line 303. Apr 12 15:02:52 gw1 kernel: pid 26896 (hastd), uid 0: exited on signal 6 (core dumped) Apr 12 15:02:52 gw1 hastd[26679]: [hastb] (primary) Worker process killed (pid=26896, signal=6). Apr 12 15:02:52 gw1 hastd[26900]: [hasta] (primary) Descriptor 7 is open (pipe or FIFO), but should be closed. Apr 12 15:02:52 gw1 hastd[26900]: [hasta] (primary) Aborted at function descriptors_assert, file /usr/src/sbin/hastd/hastd.c, line 303. Apr 12 15:02:53 gw1 kernel: pid 26900 (hastd), uid 0: exited on signal 6 (core dumped) Apr 12 15:02:54 gw1 hastd[26679]: [hasta] (primary) Worker process killed (pid=26900, signal=6). Apr 12 15:02:54 gw1 hastd[26904]: [hastb] (primary) Descriptor 7 is open (pipe or FIFO), but should be closed. Apr 12 15:02:54 gw1 hastd[26904]: [hastb] (primary) Aborted at function descriptors_assert, file /usr/src/sbin/hastd/hastd.c, line 303. Apr 12 15:02:54 gw1 kernel: pid 26904 (hastd), uid 0: exited on signal 6 (core dumped) Now when I'm trying to switch A to primary: [root@gw0:/var/log]# hastctl role primary hastb [root@gw0:/var/log]# hastctl role
Re: TRIM, iSCSI and %busy waves
Hi,

On 05.04.2018 20:15, Eugene M. Zheganin wrote:
You can indeed tune things; here are the relevant sysctls:

sysctl -a | grep trim | grep -v kstat
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vdev.trim_max_pending: 1
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.trim_on_init: 1

Well, I've already seen these. How do I tune them? The idea of just tampering with them and seeing what happens doesn't look like a bright one to me. Do I increase or decrease them? Which ones do I have to avoid?

So, about these - are there any best practices to fine-tune them? How do you tune them (I won't blame anyone for examples)? Or are they "just there" and nobody touches them?

Thanks.
Eugene.
Re: TRIM, iSCSI and %busy waves
Hello,

On 05.04.2018 20:00, Warner Losh wrote:
I'm also having a couple of iSCSI issues that I'm dealing with through a bounty, so maybe this is related somehow. Or maybe not. Due to some issues in the iSCSI stack my system sometimes reboots, and then these "waves" stop for some time. So, my question is - can I fine-tune TRIM operations? So they don't consume the whole disk at 100%. I see several sysctl oids, but they aren't well-documented.
You might be able to set the delete method.

Set it to what? It's not like I'm seeing FreeBSD for the first time, but from what I see in sysctl - all of those "sysctl -a | grep trim" oids are numeric.

P.S. This is 11.x, disks are Toshibas, and they are attached via an LSI HBA.
Which LSI HBA?

A SAS9300-4i4e one.

Eugene.

Thanks.
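On the delete method: for da(4) devices it is exposed per disk via sysctl, so it can at least be inspected and, if needed, changed; the disk number and the value below are examples, see da(4) for the full list:

sysctl kern.cam.da.0.delete_method          # show the current method (UNMAP, WS16, ATA_TRIM, NONE, ...)
sysctl kern.cam.da.0.delete_method=DISABLE  # example: make BIO_DELETE a no-op for da0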
Re: TRIM, iSCSI and %busy waves
Hello,

On 05.04.2018 19:57, Steven Hartland wrote:
You can indeed tune things; here are the relevant sysctls:

sysctl -a | grep trim | grep -v kstat
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vdev.trim_max_pending: 1
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.trim_on_init: 1

Well, I've already seen these. How do I tune them? The idea of just tampering with them and seeing what happens doesn't look like a bright one to me. Do I increase or decrease them? Which ones do I have to avoid?

Eugene.
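As a concrete starting point: lowering the concurrent TRIM load and spreading it over more transaction groups can be expressed with the oids listed above. The values below are arbitrary examples rather than recommendations, and some of these may only be settable from /boot/loader.conf:

# /etc/sysctl.conf
vfs.zfs.vdev.trim_max_active=8     # fewer concurrent TRIMs per vdev (default above: 64)
vfs.zfs.trim.txg_delay=64          # delay TRIM by more txgs (default above: 32)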
TRIM, iSCSI and %busy waves
Hi,

I have a production iSCSI system (on zfs, of course) with 15 ssd disks and it's often suffering from TRIMs. Well, I know what TRIM is for, and I know it's a good thing, but sometimes (actually often) I see in gstat that my disks are overwhelmed by TRIM waves. This looks like a "wave" of 20K 100%-busy delete operations starting on the first pool disk, then reaching the second, then the third,... - by the time it reaches the 15th disk the first one is freed from TRIM operations, and in 20-40 seconds the wave begins again.

I'm also having a couple of iSCSI issues that I'm dealing with through a bounty, so maybe this is related somehow. Or maybe not. Due to some issues in the iSCSI stack my system sometimes reboots, and then these "waves" stop for some time.

So, my question is - can I fine-tune TRIM operations? So they don't consume the whole disk at 100%. I see several sysctl oids, but they aren't well-documented.

P.S. This is 11.x, disks are Toshibas, and they are attached via an LSI HBA.

Thanks.
Eugene.
Re: another question about zfs compression numbers
Hi,

On 04.04.2018 12:35, Patrick M. Hausen wrote:
Hi all,
On 04.04.2018 at 09:21, Eugene M. Zheganin <eug...@zhegan.in> wrote:
I'm just trying to understand these numbers: the file size is 232G, its actual size on the lz4-compressed dataset is 18G, so then why is the compressratio only 1.86x? And why is logicalused 34.2G? On one hand, 34.2G exactly fits the 1.86x compressratio, but I still don't get it. The dataset is on raidz, 3 spans across 5-disk vdevs, 15 disks in total, if it matters.
A sparse file, possibly? The ZFS numbers refer to blocks. "Skipping" zeroes at the VFS layer is not taken into account as far as I know. Seriously, how should it? If I'm not mistaken, ZFS will never get to see these empty blocks.

Looks so, thanks. Although it's a mysql tablespace file. But yeah, in a hex viewer it looks like it's filled with zeroes in many places.

Eugene.
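A quick way to check the sparse-file theory is to compare the apparent size with the blocks actually allocated (du -A reports the apparent size); a sketch using the file from the thread:

ls -lh mp_userCoordsHistory.ibd    # apparent (logical) length
du -Ah mp_userCoordsHistory.ibd    # apparent size as du computes it
du -h  mp_userCoordsHistory.ibd    # blocks actually allocated on the dataset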
another question about zfs compression numbers
Hello, I'm just trying to understand these numbers: file size is 232G, it's actual size on the lz4-compressed dataset is 18G, so then why is the compressratio only 1.86x ? And why logicalused is 34.2G ? On one hand, 34.2G exactlyfits to the 1.86x compresstaio, but still I don't get it. dataset is on raidz, 3 spans across 5 disk vdevs, with total of 15 disks if it matters: # du -h 18G. # ls -lh total 19316318 -rw-r- 1 root wheel 232G 4 апр. 11:29 mp_userCoordsHistory.ibd # zfs get all data/test NAME PROPERTY VALUE SOURCE data/test type filesystem - data/test creation ср апр. 4 10:31 2018 - data/test used 18,4G - data/test available 9,95T - data/test referenced18,4G - data/test compressratio 1.86x - data/test mounted yes - data/test quota none default data/test reservation none default data/test recordsize128K default data/test mountpoint/data/test default data/test sharenfs off default data/test checksum on default data/test compression lz4 local data/test atime on default data/test devices on default data/test exec on default data/test setuidon default data/test readonly off default data/test jailedoff default data/test snapdir hidden default data/test aclmode discard default data/test aclinheritrestricted default data/test canmount on default data/test xattr off temporary data/test copies1 default data/test version 5 - data/test utf8only off - data/test normalization none- data/test casesensitivity sensitive - data/test vscan off default data/test nbmandoff default data/test sharesmb off default data/test refquota none default data/test refreservationnone default data/test primarycache all default data/test secondarycacheall default data/test usedbysnapshots 0 - data/test usedbydataset 18,4G - data/test usedbychildren0 - data/test usedbyrefreservation 0 - data/test logbias latency default data/test dedup on inherited from data data/test mlslabel - data/test sync standard default data/test refcompressratio 1.86x - data/test written 18,4G - data/test logicalused 34,2G - data/test logicalreferenced 34,2G - data/test volmode default default data/test filesystem_limit none default data/test snapshot_limitnone default data/test filesystem_count none default data/test snapshot_countnone default data/test redundant_metadataall default # zpool status pool: data state: ONLINE scan: scrub repaired 0 in 28h24m with 0 errors on Thu Feb 15 13:26:36 2018 config: NAMESTATE READ WRITE CKSUM dataONLINE 0 0 0 raidz1-0 ONLINE 0 0 0 da2 ONLINE 0 0 0 da3 ONLINE 0 0 0 da4 ONLINE 0 0 0 da5 ONLINE 0 0 0 da6 ONLINE 0 0 0 raidz1-1 ONLINE 0 0 0 da7 ONLINE 0 0 0 da8 ONLINE 0 0 0 da9 ONLINE 0 0 0 da10ONLINE 0 0 0 da11ONLINE 0 0 0 raidz1-2 ONLINE 0 0 0 da12ONLINE 0 0 0 da13ONLINE 0 0 0 da14ONLINE 0 0 0 da15ONLINE 0 0 0 da16ONLINE 0 0 0 ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: panic: vdrop: holdcnt 0
Hello,

On 22.03.2018 18:05, Eugene M. Zheganin wrote:
today I eventually got "panic: vdrop: holdcnt 0" on an iSCSI host, on 11.1. Since I don't see any decent information on this, I just wanted to ask: what does this kind of panic generally mean, and where do I go with this? The only PR I see is about 9.[, and the author there got multiple reproducible crashes, not just one.

To be more specific, here's a backtrace:

Unread portion of the kernel message buffer:
panic: vdrop: holdcnt 0
cpuid = 11
KDB: stack backtrace:
#0 0x80aadac7 at kdb_backtrace+0x67
#1 0x80a6bba6 at vpanic+0x186
#2 0x80a6ba13 at panic+0x43
#3 0x80b28739 at _vdrop+0x3e9
#4 0x80b295d8 at vputx+0x2f8
#5 0x80b38342 at vn_close1+0x182
#6 0x82639d1a at ctl_be_block_ioctl+0x86a
#7 0x82632bdc at ctl_ioctl+0x48c
#8 0x8093ae38 at devfs_ioctl_f+0x128
#9 0x80ac9415 at kern_ioctl+0x255
#10 0x80ac914f at sys_ioctl+0x16f
#11 0x80ee0394 at amd64_syscall+0x6c4
#12 0x80ec39bb at Xfast_syscall+0xfb
Uptime: 1d17h36m25s

I also have the full crashdump and related files, in case someone shows interest. This is currently the only panic I got, but it happened on a production system, which is, by the way, 11.1-RELEASE-p6 r329259M, an iSCSI host; the M stands for the iSCSI holdoff patch which was recently MFC'd but appeared after 11.1 was released.

Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
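For whoever wants to dig into it, the crashdump can be opened along these lines (a sketch; paths assume the default /var/crash layout and a kernel with symbols, adjust the vmcore number):
===Cut===
kgdb /boot/kernel/kernel /var/crash/vmcore.0
(kgdb) bt                # the same backtrace as quoted above
(kgdb) frame 3           # _vdrop(), where the assertion fired
(kgdb) info locals
===Cut===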
panic: vdrop: holdcnt 0
Hi,

today I eventually got "panic: vdrop: holdcnt 0" on an iSCSI host, on 11.1. Since I don't see any decent information on this, I just wanted to ask: what does this kind of panic generally mean, and where do I go with this? The only PR I see is about 9.[, and the author there got multiple reproducible crashes, not just one.

Thanks.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
HAST, configuration, this actually looks insane
Hi,

I'm trying to configure HAST on FreeBSD, and suddenly it appears to be a mind-breaking procedure. I totally don't get it, thus it doesn't work, dumps cores and behaves weirdly.

First of all, in the configuration-file paradigm used widely across the IT industry, the local view is usually described first and the remote one mentioned afterwards. Here both the local and the remote views are described in the configuration file (and they aren't named explicitly "local" and "remote", since both are "remote"), and, as I understand the Handbook article, the file must be _the_same_ on both nodes. So, given that the sections are named arbitrarily and the local hostname isn't mentioned or linked anywhere, how do I configure this considering that I have _different_ GEOM providers on different machines?

So, let's say I have written this configuration file:

resource hasta {
on gw0 {
local /dev/ada2p3
remote 192.168.0.247
}
on gw1 {
local /dev/ada0p4
remote 192.168.0.248
}
}
resource hastb {
on gw0 {
local /dev/ada3p3
remote 192.168.0.247
}
on gw1 {
local /dev/ada1p4
remote 192.168.0.248
}
}

The main question: which IP do I mention where? As far as I understand, I should mention the "remote" IP in the "local" device block, and vice versa, but first of all this doesn't work (dumps cores, complains about FIFOs, and so on), and second of all, how does hastd itself find out who's local and who's remote? Thank god I have a GEOM configuration which cannot be applied on both nodes, so only the correct node would have the GEOM providers mentioned - otherwise I suspect this would corrupt my data and make a total mess of it.

With this configuration file hastd doesn't work. The "create" stage goes smoothly, but then on one node (the one with /dev/ada1p4 and /dev/ada0p4) hastd just loops crashing:

Mar 18 20:48:47 gw1 hastd[92215]: [hasta] (primary) Descriptor 7 is open (pipe or FIFO), but should be closed.
Mar 18 20:48:47 gw1 hastd[92215]: [hasta] (primary) Aborted at function descriptors_assert, file /usr/src/sbin/hastd/hastd.c, line 303.
Mar 18 20:48:47 gw1 kernel: pid 92215 (hastd), uid 0: exited on signal 6 (core dumped)
Mar 18 20:48:52 gw1 hastd[92204]: [hasta] (primary) Worker process killed (pid=92215, signal=6).
Mar 18 20:48:53 gw1 hastd[9]: [hasta] (primary) Descriptor 7 is open (pipe or FIFO), but should be closed.
Mar 18 20:48:53 gw1 hastd[9]: [hasta] (primary) Aborted at function descriptors_assert, file /usr/src/sbin/hastd/hastd.c, line 303.
Mar 18 20:48:53 gw1 kernel: pid 9 (hastd), uid 0: exited on signal 6 (core dumped)
Mar 18 20:48:58 gw1 hastd[92204]: [hasta] (primary) Worker process killed (pid=9, signal=6).
Mar 18 20:48:59 gw1 hastd[92223]: [hasta] (primary) Descriptor 7 is open (pipe or FIFO), but should be closed.
Mar 18 20:48:59 gw1 hastd[92223]: [hasta] (primary) Aborted at function descriptors_assert, file /usr/src/sbin/hastd/hastd.c, line 303.
Mar 18 20:48:59 gw1 kernel: pid 92223 (hastd), uid 0: exited on signal 6 (core dumped)
Mar 18 20:49:01 gw1 hastd[92204]: [hasta] (primary) Worker process killed (pid=92223, signal=6).
Mar 18 20:49:02 gw1 hastd[92225]: [hasta] (primary) Descriptor 7 is open (pipe or FIFO), but should be closed.
Mar 18 20:49:02 gw1 hastd[92225]: [hasta] (primary) Aborted at function descriptors_assert, file /usr/src/sbin/hastd/hastd.c, line 303.
Mar 18 20:49:02 gw1 hastd[92204]: Unable to receive control header: Socket is not connected.
Mar 18 20:49:02 gw1 kernel: pid 92225 (hastd), uid 0: exited on signal 6 (core dumped)
Mar 18 20:49:02 gw1 hastd[92204]: Unable to send control response: Broken pipe.
Mar 18 20:49:07 gw1 hastd[92204]: [hasta] (primary) Worker process killed (pid=92225, signal=6).
Mar 18 20:49:08 gw1 hastd[92230]: [hasta] (primary) Descriptor 7 is open (pipe or FIFO), but should be closed.
Mar 18 20:49:08 gw1 hastd[92230]: [hasta] (primary) Aborted at function descriptors_assert, file /usr/src/sbin/hastd/hastd.c, line 303.
Mar 18 20:49:08 gw1 kernel: pid 92230 (hastd), uid 0: exited on signal 6 (core dumped)

Thanks.
Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
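For the record, hastd(8) picks the "on <name>" section whose name matches the local hostname, and "remote" in that section must point at the *other* node. Assuming gw0 is 192.168.0.247 and gw1 is 192.168.0.248, the file (kept identical on both nodes) would look roughly like this sketch:
===Cut===
resource hasta {
        on gw0 {
                # provider local to gw0, address of the peer gw1
                local /dev/ada2p3
                remote 192.168.0.248
        }
        on gw1 {
                # provider local to gw1, address of the peer gw0
                local /dev/ada0p4
                remote 192.168.0.247
        }
}
===Cut===
If that address assignment is right, then in the configuration quoted above each node's "remote" points back at itself rather than at the peer, which would be consistent with the weird behaviour described.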
Re: mc, xterm-clear, Ctrl-O and Home/End dilemma
Hi,

On 22.12.2017 00:38, Marek Zarychta wrote:
Maybe switching to an X-window-driven desktop environment at home should be taken into consideration in this case. Both ncurses and slang versions of misc/mc work fine (key bindings, border drawing etc.) for an ssh(1) client called from an xterm-capable terminal.

Yup, when I work from FreeBSD on Xorg, everything is fine, but not from putty.

Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: mc, xterm-clear, Ctrl-O and Home/End dilemma
Hi,

On 21.12.2017 23:20, Eugene M. Zheganin wrote:
Hi, So, there's a puzzle of minor issues and I wanted to ask how you guys deal with it.
- with standard ncurses misc/mc there are no borders in mc in putty, and Ctrl-O flushes the output beneath the panels.
- with slang misc/mc Ctrl-O flushes the output beneath the panels (and I lived with this for years, but then discovered xterm-clear).
- with slang and xterm-clear Home/End doesn't work in putty.
Everything else is fine, but this hurts. I use my FreeBSD desktop at work and from home via putty, so I really want to solve this without re-learning keys each time (and it seems like they aren't saved on "Save setup"). Ideas ?

So, I figured it out, thanks to https://midnight-commander.org/ticket/2633 - two things should be done on each FreeBSD host mc is run on, so as not to ruin other ssh sessions:

- a wrapper that resides in PATH earlier than the mc binary:

#!/bin/sh
#
# simple knob to fix mc Ctrl-O without ruining remote Linux sshs
#
if [ "$TERM" = "xterm" ]; then
TERM=xterm-clear
fi
/usr/local/bin/mc "$@"

- a fix to /usr/local/share/mc/mc.lib:

[terminal:xterm-clear]
copy=xterm

Then everything works: remote ssh sessions are not affected (like Linuxes/other OSes that don't have xterm-clear), putty works fine, Home/End works fine.

Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
mc, xterm-clear, Ctrl-O and Home/End dilemma
Hi,

So, there's a puzzle of minor issues and I wanted to ask how you guys deal with it.
- with standard ncurses misc/mc there are no borders in mc in putty, and Ctrl-O flushes the output beneath the panels.
- with slang misc/mc Ctrl-O flushes the output beneath the panels (and I lived with this for years, but then discovered xterm-clear).
- with slang and xterm-clear Home/End doesn't work in putty.
Everything else is fine, but this hurts. I use my FreeBSD desktop at work and from home via putty, so I really want to solve this without re-learning keys each time (and it seems like they aren't saved on "Save setup"). Ideas ?

Thanks.
Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
ctladm - create the target using only cli
Hi, my company is developing some sort of API for iSCSI managing, and at this time we are trying to figure out how to create and delete targets using ctladm and not usingb the configuration file. And the relationship between LUNs and ports are unclear to us if we don't use the configuration file. As abount the creation of LUN everyting is clear: ctladm create -b block -o file=/dev/zvol/dataflash/kvm/emz -o ctld_name=iqn.2016-04.net.playkey.iscsi:emz666,lun,0 -o scsiname=iqn.2016-04.net.playkey.iscsi:emz666,lun,0 -d EMZ666 -S EMZ666 But what next ? How do I create the port ? Because without port there's no way the iSCSI initiator would see the target. And there's no port, create created only the LUN: [root@san:~]# ctladm portlist Port Online Frontend Name pp vp 0YEStpc tpc 0 0 1YEScamsim camsim 0 0 naa.50079e708702 2YESioctlioctl0 0 3YESiscsiiscsi257 1 iqn.2016-04.net.playkey.iscsi:games,t,0x0101 4YESiscsiiscsi257 2 iqn.2016-04.net.playkey.iscsi:games-worker01,t,0x0101 5YESiscsiiscsi257 3 iqn.2016-04.net.playkey.iscsi:games-worker02,t,0x0101 6YESiscsiiscsi257 4 iqn.2016-04.net.playkey.iscsi:games-worker03,t,0x0101 7YESiscsiiscsi257 5 iqn.2016-04.net.playkey.iscsi:games-worker04,t,0x0101 8YESiscsiiscsi257 6 iqn.2016-04.net.playkey.iscsi:games-worker05,t,0x0101 9YESiscsiiscsi257 7 iqn.2016-04.net.playkey.iscsi:cirrascale1,t,0x0101 10 YESiscsiiscsi257 8 iqn.2016-04.net.playkey.iscsi:userdata1,t,0x0101 11 YESiscsiiscsi257 9 iqn.2016-04.net.playkey.iscsi:userdata2,t,0x0101 12 YESiscsiiscsi257 10 iqn.2016-04.net.playkey.iscsi:foobar,t,0x0101 13 YESiscsiiscsi257 11 iqn.2016-04.net.playkey.iscsi:guest1,t,0x0101 14 YESiscsiiscsi257 12 iqn.2016-04.net.playkey.iscsi:win7,t,0x0101 15 YESiscsiiscsi257 13 iqn.2016-04.net.playkey.iscsi:guest2,t,0x0101 16 YESiscsiiscsi257 14 iqn.2016-04.net.playkey.iscsi:guest3,t,0x0101 17 YESiscsiiscsi257 19 iqn.2016-04.net.playkey.iscsi:guest4,t,0x0101 19 YESiscsiiscsi257 17 iqn.2016-04.net.playkey.iscsi:zeppelin,t,0x0101 (obviously there's no entity named 'emz') [root@san:~]# ctladm devlist LUN Backend Size (Blocks) BS Serial NumberDevice ID 0 block6442450944 512 MYSERIAL 0 MYDEVID 0 1 block6442450944 512 MYSERIAL 1 MYDEVID 1 2 block6442450944 512 MYSERIAL 2 MYDEVID 2 3 block6442450944 512 MYSERIAL 3 MYDEVID 3 4 block6442450944 512 MYSERIAL 4 MYDEVID 4 5 block6442450944 512 MYSERIAL 5 MYDEVID 5 6 block 62914560 512 MYSERIAL 6 MYDEVID 6 7 block 104857600 512 MYSERIAL 7 MYDEVID 7 8 block 104857600 512 MYSERIAL 8 MYDEVID 8 9 block 2048 512 MYSERIAL 9 MYDEVID 9 10 block 104857600 512 MYSERIAL 10 MYDEVID 10 11 block 104857600 512 MYSERIAL 11 win7 12 block 104857600 512 MYSERIAL 12 guest2 13 block 104857600 512 MYSERIAL 13 guest3 16 block 104857600 512 MYSERIAL 16 zeppelin 15 block 104857600 512 666 guest4 14 block 104857600 512 MYSERIAL MYDEVID 17 block204800 512 EMZ666 EMZ666 <--- this is the device I want to create the target from. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
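Not an answer to the pure-ctladm question, but for completeness: the documented way to get an iSCSI port for that LUN is a target stanza in ctl.conf plus a ctld reload, roughly like this sketch (names taken from the devlist above; the portal-group name is an assumption):
===Cut===
target iqn.2016-04.net.playkey.iscsi:emz666 {
        portal-group playkey
        auth-type none
        lun 0 {
                path /dev/zvol/dataflash/kvm/emz
                option scsiname iqn.2016-04.net.playkey.iscsi:emz666,lun,0
        }
}
===Cut===
followed by "service ctld reload"; the iscsi frontend port for the new target should then show up in ctladm portlist like the existing ones.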
hw.vga.textmode=1 and the installation media
Hi,

it would be really nice if 11.2 and subsequent versions came with hw.vga.textmode=1 as the default on the installation media. Because, you know, there's a problem with some vendors (like HP) whose servers are incapable of showing graphics in IPMI with the default hw.vga.textmode=0 (yeah, I'm aware that most vendors don't have this issue), and there's still a bug that prevents this from being set from the loader prompt - the USB keyboard doesn't work there, at least in 11.0 (this seems to be some sort of FreeBSD "holy cow", along with sshd starting last, after all the local daemons; I would ask again to fix the latter as I did in previous years, but it really seems to be a cornerstone FreeBSD is built upon).

Thanks.
Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
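For anyone hitting the same thing on an already installed system, the workaround is a single loader.conf line; the request above is about getting the same default baked into the installer media:
===Cut===
# /boot/loader.conf
hw.vga.textmode="1"
===Cut===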
Re: zfs, iSCSI and volmode=dev
Hi,

On 27.09.2017 16:07, Edward Napierala wrote:
2017-08-30 11:45 GMT+02:00 Eugene M. Zheganin <e...@norma.perm.ru>:
Hi, I have an iSCSI production system that exports a large number of zvols as iSCSI targets. The system is running FreeBSD 11.0-RELEASE-p7 and initially all of the zvols were configured with the default volmode. I've read that it's recommended to use them in dev mode, so the system isn't bothered with all of these geom structures, so I switched all of the zvols to dev mode, then exported/imported the pools. Surprisingly, the performance dropped roughly 10 times (200-300 Mbits/sec against 3-4 Gbits/sec previously). After observing for 5 minutes the ESXes trying to boot up, and doing this extremely slowly, I switched the volmode back to default, then again exported/imported the pools. The performance went back to normal. So... why did this happen ? The result seems to be counter-intuitive. At least not obvious to me.

I don't really have an answer - mav@ would be the best person to ask. Based on his description, "ZVOLs in GEOM mode don't support DPO/FUA cache control bits, had to chunk large I/Os into MAXPHYS-sized pieces and go through GEOM." It also used to be the case that TRIM was only supported in the "dev" mode, but that changed a while ago.

Yeah, but you mean dev should be faster by design. That was my first thought too, but it seems to be the opposite: the default volmode is geom, and it's much faster than dev.

Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
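For reference, the switch being discussed is just a property change that only takes effect once the zvols are re-exposed (export/import or reboot); a sketch with placeholder pool/zvol names:
===Cut===
zfs set volmode=dev data/somevol     # or volmode=geom / volmode=default
zpool export data && zpool import data
zfs get volmode data/somevol
===Cut===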
iSCSI: LUN modification error: LUN XXX is not managed by the block backend and LUN device confusion
Hi, I got one more problem while dealing iSCSI targets in the production (yeah, I'm boring and stubborn). The environment is as in previous questions (a production site, hundreds of VMs and hundreds of disks). I've encountered this issue before, but this time i decided to ask whether it's possible that the reason is my inadequate actions. We have two types of disks in production, one is called "userdata[number]" and another is called "games[something+number]". The iSCSI targets are named appropriately, plus userdata disks have the scsiname[number] option. Number simply indicates the VM it should be attached to. But sometimes some weird confusion happens, and I have two sorts of things, let me show them using one LUN as example (in reality right now I have 6 LUNs like this). So from now we are considering the 310 as a VM tag, and two disks, userdata310 and games disk. So imagine a piece of ctl.conf like this: ===Cut=== # # worker310 # target iqn.2016-04.net.playkey.iscsi:userdata-worker310 { initiator-portal 10.0.3.142/32 portal-group playkey auth-type none lun 0 { option scsiname userdata310 path /dev/zvol/data/userdata/worker310 } } # # worker310 # target iqn.2016-04.net.playkey.iscsi:gamestop-worker310 { initiator-portal 10.0.3.142/32 portal-group playkey auth-type none lun 0 { path /dev/zvol/data/reference-ver13_1233-worker310 } } ===Cut=== When the issue happens, I got the following line in the log: Oct 4 12:00:55 san1 ctld[777]: LUN modification error: LUN 547 is not managed by the block backend Oct 4 12:00:55 san1 ctld[777]: failed to modify lun "iqn.2016-04.net.playkey.iscsi:userdata-worker310,lun,0", CTL lun 547 In the "ctladm devlist -v" I see this about the LUN 547: 547 block 10737418240 512 MYSERIAL 738 MYDEVID 738 lun_type=0 num_threads=14 file=/dev/zvol/data/reference-ver13_1233-worker228 ctld_name=iqn.2016-04.net.playkey.iscsi:gamestop-worker228,lun,0 scsiname=userdata310 So, notice, that the userdata disk for VM310 has the devices for completely different VM (according to their names). Weird ! One may think that this is simply the misconfiguration and the games disk for worker228 VM simply has the erroneous scsiname option tag. But no, it hasn't: # # worker228 # target iqn.2016-04.net.playkey.iscsi:gamestop-worker228 { initiator-portal [...obfuscated...]/32 portal-group playkey auth-type none lun 0 { path /dev/zvol/data/reference-ver13_1233-worker228 } } The workaround to this is simply to comment the troublesome LUNs/targets in the ctl.conf, reload, uncomment and reload again. Am I doing something wrong ? Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: ctld: only 579 iSCSI targets can be created
Hi. On 02.10.2017 15:03, Edward Napierala wrote: Thanks for the packet trace. What happens there is that the Windows initiator logs in, requests Discovery ("SendTargets=All"), receives the list of targets, as expected, and then... sends "SendTargets=All" again, instead of logging off. This results in ctld(8) dropping the session. The initiator then starts the Discovery session again, but this time it only logs in and then out, without actually requesting the target list. Perhaps you could work around this by using "discovery-filter", as documented in ctl.conf(5)? Thanks a lot, that did it. Seems like that Microsoft initiator has some limitation after crossing the number of 512 targets, and this happens somewhere near 573. When discovery is portal-filtered everything seems to be working just fine. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
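For the archives, the workaround boils down to one extra line in the portal-group; a sketch using the group name and listen address seen elsewhere in this thread:
===Cut===
portal-group playkey {
        listen 10.0.2.4
        discovery-filter portal
}
===Cut===
With "discovery-filter portal" ctld only reports targets whose initiator-portal matches the connecting initiator, so each initiator sees only its own targets and the discovery response stays well under whatever limit the Microsoft initiator trips over.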
Re: ctld: only 579 iSCSI targets can be created
Hi,

Edward Tomasz Napierała wrote 2017-09-22 12:15:
There are two weird things here. The first is that the error is coming from ctld(8) - the userspace daemon, not the kernel. The second is that those invalid opcodes are actually both valid - they are the Text Request, and the Logout Request with the Immediate flag set, exactly what you'd expect for a discovery session. Do you have a way to do a packet dump?

Sure. Here it is: http://enaza.ru/stub-data/iscsi-protocol-error.pcap
Target IP is 10.0.2.4, initiator IP is 10.0.3.127. During the session captured in this file I got the following in the messages:

Sep 22 15:38:11 san1 ctld[61373]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x4
Sep 22 15:38:11 san1 ctld[61374]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x46

This error happens when the initiator is trying to connect a disk from a discovered target. The target is running FreeBSD 11.0-STABLE #1 r310734M, where M is for CTL_MAX_PORTS 1024 (an old version, yup; I have a suspicion, which I still failed to prove, that more recent versions have some iSCSI vs ZFS conflict, but that's another story). The initiator is running Windows 7 Professional x64, inside an ESX virtual machine. This happens only when some unclear threshold is crossed; the previous ~two hundred initiators run Windows 7 Professional too.

If you need any additional data/diagnostics please let me know.

Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: ctld: only 579 iSCSI targets can be created
Hi,

Eugene M. Zheganin wrote 2017-09-22 10:36:
Hi, I have an old 11-STABLE as an iSCSI server, but out of the blue I encountered a weird problem: only 579 targets can be created. I mean, I am fully aware that the out-of-the-box limit is 128 targets, which is enforced by the CTL_MAX_PORTS define, and I've set it to 1024 (and of course rebuilt and installed a new kernel), but when I add more than 579 targets I start to get the protocol errors:

Follow-up: I counted it wrong, so actually 573 targets.

Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
ctld: only 579 iSCSI targets can be created
Hi,

I have an old 11-STABLE as an iSCSI server, but out of the blue I encountered a weird problem: only 579 targets can be created. I mean, I am fully aware that the out-of-the-box limit is 128 targets, which is enforced by the CTL_MAX_PORTS define, and I've set it to 1024 (and of course rebuilt and installed a new kernel), but when I add more than 579 targets I start to get the protocol errors:

Sep 22 10:16:48 san1 ctld[8657]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x4
Sep 22 10:16:48 san1 ctld[8658]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x46
Sep 22 10:17:31 san1 ctld[8746]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x4
Sep 22 10:17:31 san1 ctld[8747]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x46
Sep 22 10:19:58 san1 ctld[9190]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x4
Sep 22 10:19:58 san1 ctld[9191]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x46
Sep 22 10:21:33 san1 ctld[9518]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x4
Sep 22 10:21:33 san1 ctld[9519]: 10.0.3.127 (iqn.1991-05.com.microsoft:worker296): protocol error: received invalid opcode 0x46

So, the question is: is it possible to have more than 579 targets and, if yes, how can this be achieved ? Right now I'm experimenting with extending LUNs, not targets. One may think that I am merely greedy, but the thing is, I really have hundreds of initiators, and it's just logical to have as many targets as I do.

Thanks.
Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
zfs, iSCSI and volmode=dev
Hi,

I have an iSCSI production system that exports a large number of zvols as iSCSI targets. The system is running FreeBSD 11.0-RELEASE-p7 and initially all of the zvols were configured with the default volmode. I've read that it's recommended to use them in dev mode, so the system isn't bothered with all of these geom structures, so I switched all of the zvols to dev mode, then exported/imported the pools. Surprisingly, the performance dropped roughly 10 times (200-300 Mbits/sec against 3-4 Gbits/sec previously). After observing for 5 minutes the ESXes trying to boot up, and doing this extremely slowly, I switched the volmode back to default, then again exported/imported the pools. The performance went back to normal.

So... why did this happen ? The result seems to be counter-intuitive. At least not obvious to me.

Thanks.
Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: zfs listing and CPU
On 13.08.2017 16:13, Tenzin Lhakhang wrote:
You may want to have an async zfs-get program/script that regularly does a zfs get -Ho and stores them in a local cache (redis or your own program) at a set interval, and then the API can hit the cache instead of directly running get or list.

I cannot, because the cache will become stale on the first new entity creation, which happens all the time.

- Some silly person will try to benchmark your zfs web-API and overload your server with zfs processes.
- Example: let me run [ ab -c 10 -n 1 http://yourserver/zfs-api/list ] -- let me run 10 concurrent connections with a total of 10k requests to your API (it's a simple one-liner -- people will be tempted to benchmark like this). Example: https://github.com/tlhakhan/ideal-potato/blob/master/zdux/routers/zfs/service.js#L9 - This is a JS example, but you can easily script it in another language (golang) for cache separation and another program for the API.
Also, zfs does have a -c property to get cached values -- these values are stored in an internal zfs process cache.

The -c doesn't help if you have 1000(0)s of filesystems, a single list can still take minutes. Sending the list is also several megabytes. And FreeBSD's zfs doesn't have it anyway.

Thanks.
Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
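Short of a cache, one thing that sometimes helps is asking only for the properties and dataset types the API really needs instead of 'zfs list -t all'; a sketch (the dataset name is taken from the pools seen elsewhere in this archive):
===Cut===
# names only, no header, skip snapshots:
zfs list -H -o name -t filesystem,volume
# or scope the listing to one subtree:
zfs list -H -o name,used,volsize -r -t volume data/userdata
===Cut===
Whether it helps much depends on where the time actually goes, but it avoids fetching properties and datasets the API never uses.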
Re: zfs listing and CPU
Hi,

On 12.08.2017 20:50, Paul Kraus wrote:
On Aug 11, 2017, at 2:28 AM, Eugene M. Zheganin <e...@norma.perm.ru> wrote:
Why does the zfs listing eat so much of the CPU ?
47114 root 1 200 40432K 3840K db->db 4 0:05 26.84% zfs
47099 root 1 200 40432K 3840K zio->i 17 0:05 26.83% zfs
47106 root 1 200 40432K 3840K db->db 21 0:05 26.81% zfs
47150 root 1 200 40432K 3428K db->db 13 0:03 26.31% zfs
47141 root 1 200 40432K 3428K zio->i 28 0:03 26.31% zfs
47135 root 1 200 40432K 3312K g_wait 9 0:03 25.51% zfs
This is from winter 2017 11-STABLE (r310734), one of the 'zfs'es is cloning, and all the others are 'zfs list -t all'. I have like 25 gigs of free RAM, do I have any chance of speeding this up using maybe some caching or some sysctl tuning ? We are using a simple ZFS web API that may issue concurrent or sequential listing requests, so as you can see they sometimes do stack.

How many snapshots do you have ? I have only seen this behavior with LOTS (not hundreds, but thousands) of snapshots.

[root@san1:~]# zfs list -t snapshot | wc -l
88

What does your `iostat -x 1` look like ? I expect that you are probably saturating your drives with random I/O.

Well, it's really long, and the disks are indeed busy with random I/O, but busy only at 20-30%. As for iostat, it's really long because I have hundreds (not thousands) of zvols, and they do show up in iostat -x. But nothing unusual besides that.

Thanks.
Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
zfs listing and CPU
Hi, Why does the zfs listing eat so much of the CPU ? last pid: 47151; load averages: 3.97, 6.35, 6.13up 1+23:21:18 09:15:13 146 processes: 3 running, 142 sleeping, 1 waiting CPU: 0.0% user, 0.0% nice, 30.5% system, 0.3% interrupt, 69.2% idle Mem: 44M Active, 360M Inact, 37G Wired, 25G Free ARC: 32G Total, 14G MFU, 17G MRU, 312M Anon, 803M Header, 523M Other Swap: 32G Total, 185M Used, 32G Free PID USERNAME THR PRI NICE SIZERES STATE C TIME WCPU COMMAND 11 root 32 155 ki31 0K 512K CPU00 1104.1 2666.23% idle 0 root9880 -16- 0K 154M swapin 11 314.0H 281.29% kernel 13 root 3 -8- 0K48K gread 0 20.9H 28.71% geom 47114 root 1 200 40432K 3840K db->db 4 0:05 26.84% zfs 47099 root 1 200 40432K 3840K zio->i 17 0:05 26.83% zfs 47106 root 1 200 40432K 3840K db->db 21 0:05 26.81% zfs 47150 root 1 200 40432K 3428K db->db 13 0:03 26.31% zfs 47141 root 1 200 40432K 3428K zio->i 28 0:03 26.31% zfs 47135 root 1 200 40432K 3312K g_wait 9 0:03 25.51% zfs 4 root 7 -16- 0K 112K - 20 975:01 19.73% cam 5 root2494 -8- 0K 39952K arc_re 18 20.2H 18.58% zfskern 12 root 65 -60- 0K 1040K WAIT0 17.8H 15.64% intr 22 root 2 -16- 0K32K psleep 3 66:34 7.31% pagedaemon 590 root 10 -16- 0K 160K - 21 177:02 2.96% ctl [...] This is from winter 2017 11-STABLE (r310734), one of the 'zfs'es is cloning, and all the others are 'zfs list -t all'. I have like 25 gigs of free RAM, do I have any chance of speeding this up using may be some caching or some sysctl tuning ? We are using a simple ZFS web API that may issue concurrent or sequential listing requests, so as you can see they sometimes do stack. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: a strange and terrible saga of the cursed iSCSI ZFS SAN
On 05.08.2017 22:08, Eugene M. Zheganin wrote: Hi, I got a problem that I cannot solve just by myself. I have a iSCSI zfs SAN system that crashes, corrupting it's data. I'll be short, and try to describe it's genesis shortly: 1) autumn 2016, SAN is set up, supermicro server, external JBOD, sandisk ssds, several redundant pools, FreeBSD 11.x (probably release, don't really remember - see below). 2) this is working just fine until early spring 2017 3) system starts to crash (various panics): panic: general protection fault panic: page fault panic: Solaris(panic): zfs: allocating allocated segment(offset=6599069589504 size=81920) panic: page fault panic: page fault panic: Solaris(panic): zfs: allocating allocated segment(offset=8245779054592 size=8192) panic: page fault panic: page fault panic: page fault panic: Solaris(panic): zfs: allocating allocated segment(offset=1792100934656 size=46080) 4) we memtested it immidiately, no problems found. 5) we switch sandisks to toshibas, we switch also the server to an identical one, JBOD to an identical one, leaving same cables. 6) crashes don't stop. 7) we found that field engineers physically damaged (sic!) the SATA cables (main one and spare ones), and that 90% of the disks show ICRC SMART errors. 8) we replaced the cable (brand new HP one). 9) ATA SMART errors stopped increasing. 10) crashes continue. 11) we decided that probably when ZFS was moved over damaged cables between JBODs it was somehow damaged too, so now it's panicking because of that. so we wiped the data completely, reinitialized the SAN system and put it back into the production. we even dd'ed each disk with zeroes (!) - just in case. Important note: the data was imported using zfs send from another, stable system that is runing in production in another DC. 12) today we got another panic. btw the pools look now like this: # zpool status -v pool: data state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://illumos.org/msg/ZFS-8000-8A scan: none requested config: NAMESTATE READ WRITE CKSUM dataONLINE 0 062 raidz1-0 ONLINE 0 0 0 da2 ONLINE 0 0 0 da3 ONLINE 0 0 0 da4 ONLINE 0 0 0 da5 ONLINE 0 0 0 da6 ONLINE 0 0 0 raidz1-1 ONLINE 0 0 0 da7 ONLINE 0 0 0 da8 ONLINE 0 0 0 da9 ONLINE 0 0 0 da10ONLINE 0 0 0 da11ONLINE 0 0 0 raidz1-2 ONLINE 0 062 da12ONLINE 0 0 0 da13ONLINE 0 0 0 da14ONLINE 0 0 0 da15ONLINE 0 0 0 da16ONLINE 0 0 0 errors: Permanent errors have been detected in the following files: data/userdata/worker208:<0x1> pool: userdata state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://illumos.org/msg/ZFS-8000-8A scan: none requested config: NAME STATE READ WRITE CKSUM userdata ONLINE 0 0 216K mirror-0 ONLINE 0 0 432K gpt/userdata0 ONLINE 0 0 432K gpt/userdata1 ONLINE 0 0 432K errors: Permanent errors have been detected in the following files: userdata/worker36:<0x1> userdata/worker30:<0x1> userdata/worker31:<0x1> userdata/worker35:<0x1> 12) somewhere between p.5 and p.10 the pool became deduplicated (not directly connected to the problem, just for production reasons). So, concluding: we had bad hardware, we replaced EACH piece (server, adapter, JBOD, cable, disks), and crashes just don't stop. 
We have 5 another iSCSI SAN systems, almost fully identical that don't crash. Crashes on this particular system began when it was running same set of versions that stable systems. So far my priority version is that something was broken in the iSCSI+zfs stack somewhere between r310734 (most recent version on my SAN systems that works) and r320056 (probably earlier, but r320056 is the first revision with documented crash). So I downgraded back to r310734 (from a 11.1-RELEASE, which is affected, if I'm right). Some things speak pro this version: - the system was stable pre-s
Re: a strange and terrible saga of the cursed iSCSI ZFS SAN
Hi, On 05.08.2017 22:08, Eugene M. Zheganin wrote: pool: userdata state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://illumos.org/msg/ZFS-8000-8A scan: none requested config: NAME STATE READ WRITE CKSUM userdata ONLINE 0 0 216K mirror-0 ONLINE 0 0 432K gpt/userdata0 ONLINE 0 0 432K gpt/userdata1 ONLINE 0 0 432K That would be funny, if not that sad, but while writing this message, the pool started to look like below (I just asked zpool status twice in a row, comparing to what it was): [root@san1:~]# zpool status userdata pool: userdata state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://illumos.org/msg/ZFS-8000-8A scan: none requested config: NAME STATE READ WRITE CKSUM userdata ONLINE 0 0 728K mirror-0 ONLINE 0 0 1,42M gpt/userdata0 ONLINE 0 0 1,42M gpt/userdata1 ONLINE 0 0 1,42M errors: 4 data errors, use '-v' for a list [root@san1:~]# zpool status userdata pool: userdata state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://illumos.org/msg/ZFS-8000-8A scan: none requested config: NAME STATE READ WRITE CKSUM userdata ONLINE 0 0 730K mirror-0 ONLINE 0 0 1,43M gpt/userdata0 ONLINE 0 0 1,43M gpt/userdata1 ONLINE 0 0 1,43M errors: 4 data errors, use '-v' for a list So, you see, the error rate is like speed of light. And I'm not sure if the data access rate is that enormous, looks like they are increasing on their own. So may be someone have an idea on what this really means. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
a strange and terrible saga of the cursed iSCSI ZFS SAN
Hi, I got a problem that I cannot solve just by myself. I have a iSCSI zfs SAN system that crashes, corrupting it's data. I'll be short, and try to describe it's genesis shortly: 1) autumn 2016, SAN is set up, supermicro server, external JBOD, sandisk ssds, several redundant pools, FreeBSD 11.x (probably release, don't really remember - see below). 2) this is working just fine until early spring 2017 3) system starts to crash (various panics): panic: general protection fault panic: page fault panic: Solaris(panic): zfs: allocating allocated segment(offset=6599069589504 size=81920) panic: page fault panic: page fault panic: Solaris(panic): zfs: allocating allocated segment(offset=8245779054592 size=8192) panic: page fault panic: page fault panic: page fault panic: Solaris(panic): zfs: allocating allocated segment(offset=1792100934656 size=46080) 4) we memtested it immidiately, no problems found. 5) we switch sandisks to toshibas, we switch also the server to an identical one, JBOD to an identical one, leaving same cables. 6) crashes don't stop. 7) we found that field engineers physically damaged (sic!) the SATA cables (main one and spare ones), and that 90% of the disks show ICRC SMART errors. 8) we replaced the cable (brand new HP one). 9) ATA SMART errors stopped increasing. 10) crashes continue. 11) we decided that probably when ZFS was moved over damaged cables between JBODs it was somehow damaged too, so now it's panicking because of that. so we wiped the data completely, reinitialized the SAN system and put it back into the production. we even dd'ed each disk with zeroes (!) - just in case. Important note: the data was imported using zfs send from another, stable system that is runing in production in another DC. 12) today we got another panic. btw the pools look now like this: # zpool status -v pool: data state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://illumos.org/msg/ZFS-8000-8A scan: none requested config: NAMESTATE READ WRITE CKSUM dataONLINE 0 062 raidz1-0 ONLINE 0 0 0 da2 ONLINE 0 0 0 da3 ONLINE 0 0 0 da4 ONLINE 0 0 0 da5 ONLINE 0 0 0 da6 ONLINE 0 0 0 raidz1-1 ONLINE 0 0 0 da7 ONLINE 0 0 0 da8 ONLINE 0 0 0 da9 ONLINE 0 0 0 da10ONLINE 0 0 0 da11ONLINE 0 0 0 raidz1-2 ONLINE 0 062 da12ONLINE 0 0 0 da13ONLINE 0 0 0 da14ONLINE 0 0 0 da15ONLINE 0 0 0 da16ONLINE 0 0 0 errors: Permanent errors have been detected in the following files: data/userdata/worker208:<0x1> pool: userdata state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://illumos.org/msg/ZFS-8000-8A scan: none requested config: NAME STATE READ WRITE CKSUM userdata ONLINE 0 0 216K mirror-0 ONLINE 0 0 432K gpt/userdata0 ONLINE 0 0 432K gpt/userdata1 ONLINE 0 0 432K errors: Permanent errors have been detected in the following files: userdata/worker36:<0x1> userdata/worker30:<0x1> userdata/worker31:<0x1> userdata/worker35:<0x1> 12) somewhere between p.5 and p.10 the pool became deduplicated (not directly connected to the problem, just for production reasons). So, concluding: we had bad hardware, we replaced EACH piece (server, adapter, JBOD, cable, disks), and crashes just don't stop. We have 5 another iSCSI SAN systems, almost fully identical that don't crash. 
Crashes on this particular system began when it was running same set of versions that stable systems. So, besides calling an exorcist, I would like to hear what other options do I have, I really would. And I want to also ask - what happens when the system's memory isn't enough for deduplication - does it crash, or does the problem of mounting the pool appear, like some articles mention ? This message could been encumbered by junky data like the exact FreeBSD releases we ran (asuuming it's normal for some 11.x revisions to crash and damage the data, and some - not, which I
panic: dva_get_dsize_sync(): bad DVA on 2016 11-STABLE.
Hi, today I got the following panic on the December 2016 11-STABLE: FreeBSD san02.playkey.net 11.0-STABLE FreeBSD 11.0-STABLE #0 r310734M: Thu Dec 29 19:22:30 UTC 2016 emz@san02:/usr/obj/usr/src/sys/GENERIC amd64 panic: dva_get_dsize_sync(): bad DVA 4294967295:2086400 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"... Unread portion of the kernel message buffer: panic: dva_get_dsize_sync(): bad DVA 4294967295:2086400 cpuid = 2 KDB: stack backtrace: #0 0x80b023a7 at kdb_backtrace+0x67 #1 0x80ab88e6 at vpanic+0x186 #2 0x80ab8753 at panic+0x43 #3 0x8226a148 at bp_get_dsize+0x128 #4 0x8222cc29 at dmu_tx_count_write+0x589 #5 0x8222c675 at dmu_tx_hold_write+0x35 #6 0x822c573e at zvol_strategy+0x21e #7 0x809f6a10 at g_io_request+0x4a0 #8 0x8283a45f at ctl_be_block_dispatch_dev+0x20f #9 0x8283bddc at ctl_be_block_worker+0x6c #10 0x80b1484a at taskqueue_run_locked+0x14a #11 0x80b15a38 at taskqueue_thread_loop+0xe8 #12 0x80a70785 at fork_exit+0x85 #13 0x80f55f2e at fork_trampoline+0xe Uptime: 78d7h43m31s My question is (since I din't find much on this) what does this "bad DVA" mean ? I've read that this may indicate the on-disk zfs corrupton, but I'm not suer about it. Is this fixable in any way ? Do I have to prepare to recreate the pool (btw I have three pools) from scratch, and how do I determine which one has the corruption. Some [useless ?] zfs info: # zpool status pool: data state: ONLINE scan: none requested config: NAMESTATE READ WRITE CKSUM dataONLINE 0 0 0 raidz1-0 ONLINE 0 0 0 da2 ONLINE 0 0 0 da3 ONLINE 0 0 0 da4 ONLINE 0 0 0 da5 ONLINE 0 0 0 raidz1-1 ONLINE 0 0 0 da6 ONLINE 0 0 0 da7 ONLINE 0 0 0 da8 ONLINE 0 0 0 da9 ONLINE 0 0 0 raidz1-2 ONLINE 0 0 0 da10ONLINE 0 0 0 da11ONLINE 0 0 0 da12ONLINE 0 0 0 da13ONLINE 0 0 0 raidz1-3 ONLINE 0 0 0 da14ONLINE 0 0 0 da15ONLINE 0 0 0 da16ONLINE 0 0 0 da17ONLINE 0 0 0 errors: No known data errors pool: userdata state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM userdata ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 gpt/userdata0 ONLINE 0 0 0 gpt/userdata1 ONLINE 0 0 0 errors: No known data errors pool: zroot state: ONLINE scan: none requested config: NAMESTATE READ WRITE CKSUM zroot ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 gpt/zroot0 ONLINE 0 0 0 gpt/zroot1 ONLINE 0 0 0 errors: No known data errors Hardware: # camcontrol devlist at scbus4 target 0 lun 0 (pass0,ses0) at scbus11 target 0 lun 0 (pass1,ses1) at scbus12 target 4 lun 0 (pass2,da0) at scbus12 target 5 lun 0 (pass3,da1) at scbus12 target 8 lun 0 (pass4,da2) at scbus12 target 9 lun 0 (pass5,da3) at scbus12 target 10 lun 0 (pass6,da4) at scbus12 target 11 lun 0 (pass7,da5) at scbus12 target 12 lun 0 (pass8,da6) at scbus12 target 13 lun 0 (pass9,da7) at scbus12 target 14 lun 0 (pass10,da8) at scbus12 target 15 lun 0 (pass11,da9) at scbus12 target 16 lun 0 (pass12,da10) at scbus12 target 17 lun 0 (pass13,da11) at scbus12 target 18 lun 0 (pass14,da12) at scbus12 target 19 lun 0 (pass15,da13) at scbus12 target 20 lun 0 (pass16,da14) at scbus12 target 21 lun 0 (pass17,da15) at scbus12 target 22 lun 0 (pass18,da16) at scbus12 target 23 lun 0 (pass19,da17) at scbus12 target 24 lun 0 (pass20,da18) at scbus12 target 32 lun 0 (pass21,ses2) I also have the zdb 
-uuumdC for each pool, here they are in case someone needs them: https://enaza.ru/stub-data/zdb-uuumdC-data.txt https://enaza.ru/stub-data/zdb-uuumdC-userdata.txt
Re: some general zfs tuning (for iSCSI)
Hi.

On 02.08.2017 17:43, Ronald Klop wrote:
On Fri, 28 Jul 2017 12:56:11 +0200, Eugene M. Zheganin <e...@norma.perm.ru> wrote:
Hi, I'm using several FreeBSD zfs installations as iSCSI production systems; they basically consist of an LSI HBA and a JBOD with a bunch of SSD disks (12-24, Intel, Toshiba or Sandisk (avoid Sandisks btw)). And I observe a problem very often: gstat shows 20-30% of disk load, but the system reacts very slowly: cloning a dataset takes 10 seconds, similar operations aren't lightspeeding either. To my knowledge, until the disks are 90-100% busy, this shouldn't happen. My systems are equipped with 32-64 gigs of RAM, and the only tuning I use is limiting the ARC size (in a very tender manner - to at least 16 gigs) and playing with TRIM. The number of datasets is high enough - hundreds of clones, dozens of snapshots, most of the data objects are zvols. Pools aren't overfilled, most are filled up to 60-70% (no questions about low-space pools, but even in that case the situation is clearer - %busy goes up into the sky). So, my question is - is there some obvious zfs tuning not mentioned in the Handbook ? On the other side, the Handbook isn't very clear on how to tune zfs; it's written mostly in the manner of "these are the sysctl oids you can play with". Of course I have seen several ZFS tuning guides, like the OpenSolaris one, but they are mostly file- and application-specific. Is there some special approach to tuning ZFS in an environment with loads of disks ? I don't know, like tuning the vdev cache or something similar ?

What version of FreeBSD are you running?

Well, different ones. Mostly some versions of 11.0-RELEASE-pX and 11-STABLE.

What is the system doing during all this?

What do you mean by "what" ? Nothing else except serving iSCSI - it's the main purpose of every one of these servers.

How are your pools setup (raidz1/2/3, mirror, 3mirror)?

zroot is a mirrored two-disk pool, the others are raidz, mostly spans of multiple 5-disk raidzs.

How is your iSCSI configured and what are the clients doing with it?

Using the kernel ctld of course. As you may know, ctl.conf doesn't expose any performance tweaks, it's just a way of organizing the authorization layer. Clients are VMWare ESX hypervisors, using iSCSI as disk devices, as ESX SRs, and as direct iSCSI disks in Windows VMs.

Is the data distributed evenly on all disks?

It's not. Does it ever distribute evenly anywhere ?

Do the clients write a lot of sync data?

What exactly do you mean by "sync data" ?

Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
some general zfs tuning (for iSCSI)
Hi,

I'm using several FreeBSD zfs installations as iSCSI production systems; they basically consist of an LSI HBA and a JBOD with a bunch of SSD disks (12-24, Intel, Toshiba or Sandisk (avoid Sandisks btw)). And I observe a problem very often: gstat shows 20-30% of disk load, but the system reacts very slowly: cloning a dataset takes 10 seconds, similar operations aren't lightspeeding either. To my knowledge, until the disks are 90-100% busy, this shouldn't happen. My systems are equipped with 32-64 gigs of RAM, and the only tuning I use is limiting the ARC size (in a very tender manner - to at least 16 gigs) and playing with TRIM. The number of datasets is high enough - hundreds of clones, dozens of snapshots, most of the data objects are zvols. Pools aren't overfilled, most are filled up to 60-70% (no questions about low-space pools, but even in that case the situation is clearer - %busy goes up into the sky).

So, my question is - is there some obvious zfs tuning not mentioned in the Handbook ? On the other side, the Handbook isn't very clear on how to tune zfs; it's written mostly in the manner of "these are the sysctl oids you can play with". Of course I have seen several ZFS tuning guides, like the OpenSolaris one, but they are mostly file- and application-specific. Is there some special approach to tuning ZFS in an environment with loads of disks ? I don't know, like tuning the vdev cache or something similar ?

Thanks.
Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
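For completeness, the "tender" ARC limit mentioned above is the usual loader tunable; a sketch matching the 16G figure from this message (the TRIM knob is the 11.x one and is only there for comparison runs):
===Cut===
# /boot/loader.conf
vfs.zfs.arc_max="16G"
# disable TRIM temporarily to see whether it contributes to the slowdown:
vfs.zfs.trim.enabled="0"
===Cut===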
ctl.conf includes
Hi,

any chance we will get an "include" directive for ctl.conf ? Because, for instance, I'm using a bunch of custom APIs on top of iSCSI/zfs, and the inability to split ctl.conf into a set of per-target config files complicates a lot of things. I understand clearly that this is only my problem, but I'm writing this in case someone needs it too, so maybe I'm not alone in asking for ctl.conf includes. I am aware that ctladm allows many things, including creating and deleting targets on the fly, but the problem is keeping this configuration saved in a consistent state.

Thanks.
Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
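Until an include directive exists, one workaround is to keep one file per target and glue them together before reloading; a sketch with made-up paths:
===Cut===
#!/bin/sh
# rebuild ctl.conf from per-target fragments and apply it
set -e
FRAGDIR=/usr/local/etc/ctl.conf.d
TMP=$(mktemp)
cat /usr/local/etc/ctl.conf.head "$FRAGDIR"/*.conf > "$TMP"
chmod 600 "$TMP"
mv "$TMP" /etc/ctl.conf
service ctld reload
===Cut===
Not as clean as a real include, but the per-target files stay the single source of truth while ctld still sees one complete ctl.conf.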
Re: cannot destroy faulty zvol
Hi. On 23.07.2017 0:28, Eugene M. Zheganin wrote: Hi, On 22.07.2017 17:08, Eugene M. Zheganin wrote: is this weird error "cannot destroy: already exists" related to the fact that the zvol is faulty ? Does it indicate that metadata is probably faulty too ? Anyway, is there a way to destroy this dataset ? Follow-up: I sent a similar zvol of the thexactly same size into the faulty one, zpool errors are gone, still cannot destroy the zvol. Is this a zfs bug ? Seems like it. zdb shows some "invisible" dataset not shown by zfs list -t all, one of them is a child dataset of the one I'm trying to destroy: # zdb -d zfsroot Dataset mos [META], ID 0, cr_txg 4, 60.7M, 1078 objects Dataset zfsroot/usr/src [ZPL], ID 1181, cr_txg 59832, 1.03G, 158964 objects Dataset zfsroot/usr/home [ZPL], ID 1197, cr_txg 59911, 21.7M, 551 objects Dataset zfsroot/usr/ports [ZPL], ID 1189, cr_txg 59881, 887M, 335670 objects Dataset zfsroot/usr [ZPL], ID 1173, cr_txg 59829, 23.0K, 7 objects Dataset zfsroot/tmp [ZPL], ID 1253, cr_txg 59931, 6.66G, 480610 objects Dataset zfsroot/userdata/worker226 [ZVOL], ID 940, cr_txg 59580, 12.0K, 2 objects Dataset zfsroot/userdata/worker251 [ZVOL], ID 948, cr_txg 59583, 132M, 2 objects Dataset zfsroot/userdata/worker152 [ZVOL], ID 924, cr_txg 59551, 928M, 2 objects Dataset zfsroot/userdata/worker125 [ZVOL], ID 932, cr_txg 59566, 997M, 2 objects Dataset zfsroot/userdata/worker158 [ZVOL], ID 916, cr_txg 59536, 498M, 2 objects Dataset zfsroot/userdata/worker214 [ZVOL], ID 908, cr_txg 59530, 736M, 2 objects Dataset zfsroot/userdata/worker160 [ZVOL], ID 900, cr_txg 59524, 774M, 2 objects Dataset zfsroot/userdata/worker184 [ZVOL], ID 892, cr_txg 59518, 609M, 2 objects Dataset zfsroot/userdata/worker235 [ZVOL], ID 1012, cr_txg 59663, 1.62G, 2 objects Dataset zfsroot/userdata/worker242 [ZVOL], ID 1021, cr_txg 59674, 96.1M, 2 objects Dataset zfsroot/userdata/worker248 [ZVOL], ID 1004, cr_txg 59660, 153M, 2 objects Dataset zfsroot/userdata/worker141 [ZVOL], ID 988, cr_txg 59631, 1014M, 2 objects Dataset zfsroot/userdata/worker136 [ZVOL], ID 996, cr_txg 59646, 995M, 2 objects Dataset zfsroot/userdata/worker207 [ZVOL], ID 980, cr_txg 59617, 577M, 2 objects Dataset zfsroot/userdata/worker179 [ZVOL], ID 972, cr_txg 59602, 801M, 2 objects Dataset zfsroot/userdata/worker197 [ZVOL], ID 964, cr_txg 59595, 383M, 2 objects Dataset zfsroot/userdata/worker173 [ZVOL], ID 956, cr_txg 59586, 1.26G, 2 objects Dataset zfsroot/userdata/worker190 [ZVOL], ID 1085, cr_txg 59757, 236M, 2 objects Dataset zfsroot/userdata/worker174 [ZVOL], ID 1077, cr_txg 59743, 2.11G, 2 objects Dataset zfsroot/userdata/worker200 [ZVOL], ID 1069, cr_txg 59732, 260M, 2 objects Dataset zfsroot/userdata/worker131 [ZVOL], ID 1053, cr_txg 59711, 792M, 2 objects Dataset zfsroot/userdata/worker146 [ZVOL], ID 1061, cr_txg 59725, 418M, 2 objects Dataset zfsroot/userdata/worker245 [ZVOL], ID 1037, cr_txg 59692, 208M, 2 objects Dataset zfsroot/userdata/worker232 [ZVOL], ID 1045, cr_txg 59695, 527M, 2 objects Dataset zfsroot/userdata/worker238 [ZVOL], ID 1029, cr_txg 59677, 1.94G, 2 objects Dataset zfsroot/userdata/worker167 [ZVOL], ID 1165, cr_txg 59823, 4.43G, 2 objects Dataset zfsroot/userdata/worker189 [ZVOL], ID 1157, cr_txg 59817, 326M, 2 objects Dataset zfsroot/userdata/worker183 [ZVOL], ID 1149, cr_txg 59811, 1.18G, 2 objects Dataset zfsroot/userdata/worker219 [ZVOL], ID 1141, cr_txg 59808, 12.0K, 2 objects Dataset zfsroot/userdata/worker213 [ZVOL], ID 1133, cr_txg 59802, 1.04G, 2 objects Dataset zfsroot/userdata/worker122 [ZVOL], ID 1117, 
cr_txg 59782, 1.05G, 2 objects Dataset zfsroot/userdata/worker155 [ZVOL], ID 1125, cr_txg 59790, 963M, 2 objects Dataset zfsroot/userdata/worker128 [ZVOL], ID 1109, cr_txg 59769, 1.67G, 2 objects Dataset zfsroot/userdata/worker256 [ZVOL], ID 1093, cr_txg 59763, 602K, 2 objects Dataset zfsroot/userdata/worker221 [ZVOL], ID 1101, cr_txg 59766, 12.0K, 2 objects Dataset zfsroot/userdata/worker126 [ZVOL], ID 666, cr_txg 59194, 781M, 2 objects Dataset zfsroot/userdata/worker151 [ZVOL], ID 674, cr_txg 59205, 435M, 2 objects Dataset zfsroot/userdata/worker252 [ZVOL], ID 650, cr_txg 59188, 127M, 2 objects Dataset zfsroot/userdata/worker225 [ZVOL], ID 658, cr_txg 59191, 12.0K, 2 objects Dataset zfsroot/userdata/worker187 [ZVOL], ID 642, cr_txg 59171, 2.55G, 2 objects Dataset zfsroot/userdata/worker169 [ZVOL], ID 634, cr_txg 59157, 359M, 2 objects Dataset zfsroot/userdata/worker163 [ZVOL], ID 626, cr_txg 59139, 2.30G, 2 objects Dataset zfsroot/userdata/worker217 [ZVOL], ID 618, cr_txg 59136, 12.0K, 2 objects Dataset zfsroot/userdata/worker135 [ZVOL], ID 731, cr_txg 59301, 1.36G, 2 objects Dataset zfsroot/userdata/worker142 [ZVOL], ID 739, cr_txg 59315, 468M, 2 objects Dataset zfsroot/userdata/worker148 [ZVOL], ID 723, cr_txg 59288, 1.21G, 2 objects Dataset zfsroot/userdata/worker241 [ZVOL], ID 707, cr_txg 59265, 758M, 2 objects Dataset zfsroot/userdata/
Re: cannot destroy faulty zvol
Hi,

On 22.07.2017 17:08, Eugene M. Zheganin wrote:
is this weird error "cannot destroy: already exists" related to the fact that the zvol is faulty ? Does it indicate that metadata is probably faulty too ? Anyway, is there a way to destroy this dataset ?

Follow-up: I sent a similar zvol of exactly the same size into the faulty one, the zpool errors are gone, but I still cannot destroy the zvol. Is this a zfs bug ?

Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
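"cannot destroy ...: dataset already exists" usually points at something still derived from the zvol - a clone, or the hidden %recv child left behind by an interrupted zfs receive (and a send into this volume did happen above). A hedged checklist, not a guaranteed fix:
===Cut===
# anything still hanging off the zvol or its snapshots?
zfs list -t all -r zfsroot/userdata/worker182-bad
zfs get -r origin,clones zfsroot/userdata/worker182-bad
# zdb can show datasets that zfs list hides, e.g. a leftover %recv child:
zdb -d zfsroot | grep worker182-bad
# if such a child shows up, destroying it directly has been known to unblock the parent:
zfs destroy zfsroot/userdata/worker182-bad/%recv
===Cut===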
cannot destroy faulty zvol
Hi,

I cannot destroy a zvol for a reason that I don't understand:

[root@san1:~]# zfs list -t all | grep worker182
zfsroot/userdata/worker182-bad 1,38G 1,52T 708M -
[root@san1:~]# zfs destroy -R zfsroot/userdata/worker182-bad
cannot destroy 'zfsroot/userdata/worker182-bad': dataset already exists
[root@san1:~]#

Also notice that this zvol is faulty:

pool: zfsroot
state: ONLINE
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
see: http://illumos.org/msg/ZFS-8000-8A
scan: scrub in progress since Sat Jul 22 15:01:37 2017
18,7G scanned out of 130G at 75,0M/s, 0h25m to go
0 repaired, 14,43% done
config:
NAME STATE READ WRITE CKSUM
zfsroot ONLINE 0 0 4
mirror-0 ONLINE 0 0 8
gpt/zroot0 ONLINE 0 0 8
gpt/zroot1 ONLINE 0 0 8
errors: Permanent errors have been detected in the following files:
zfsroot/userdata/worker182-bad:<0x1>
<0xc7>:<0x1>

Is this weird error "cannot destroy: already exists" related to the fact that the zvol is faulty ? Does it indicate that the metadata is probably faulty too ? Anyway, is there a way to destroy this dataset ?

Thanks.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
mdconfig and UDF
Hi.

Is there any chance to mount a UDF filesystem under FreeBSD with mdconfig and an ISO image ? mount -t cd9660 /dev/md0 /mnt/cdrom gives me a readme.txt saying "This is UDF, you idiot", and mount -t udf /dev/md0 /mnt/cdrom gives me:

# mount -t udf /dev/md0 cdrom
mount_udf: /dev/md0: Invalid argument

So... Thanks.
Eugene.
___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
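For reference, the full sequence being attempted is roughly this (the image path is a placeholder):
===Cut===
mdconfig -a -t vnode -f /path/to/image.iso -u 0
mount -t udf /dev/md0 /mnt/cdrom
# and to tear it down:
umount /mnt/cdrom
mdconfig -d -u 0
===Cut===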
Re: redundant zfs pool, system traps and tonns of corrupted files
Hi, On 29.06.2017 16:37, Eugene M. Zheganin wrote: Hi. Say I'm having a server that traps more and more often (different panics: zfs panics, GPFs, fatal traps while in kernel mode etc), and then I realize it has tonns of permanent errors on all of it's pools that scrub is unable to heal. Does this situation mean it's a bad memory case ? Unfortunately I switched the hardware to an identical server prior to encountering zpools have errors, so I'm not use when did they appear. Right now I'm about to run a memtest on an old hardware. So, whadda you say - does it point at the memory as the root problem ? I'm also not quite getting the situation when I have errors on a vdev level, but 0 errors on a lower device layer (could someone please explain this): pool: esx state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://illumos.org/msg/ZFS-8000-8A scan: resilvered 3,74G in 0h5m with 0 errors on Tue Dec 27 05:14:32 2016 config: NAMESTATE READ WRITE CKSUM esx ONLINE 0 0 99,0K raidz1-0 ONLINE 0 0 113K da0 ONLINE 0 0 0 da1 ONLINE 0 0 0 da2 ONLINE 0 0 2 da3 ONLINE 0 0 0 da5 ONLINE 0 0 0 raidz1-1 ONLINE 0 0 84,7K da12ONLINE 0 0 0 da13ONLINE 0 0 1 da14ONLINE 0 0 0 da15ONLINE 0 0 0 da16ONLINE 0 0 0 errors: 25 data errors, use '-v' for a list pool: gamestop state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://illumos.org/msg/ZFS-8000-8A scan: scrub in progress since Thu Jun 29 12:30:21 2017 1,67T scanned out of 4,58T at 1002M/s, 0h50m to go 0 repaired, 36,44% done config: NAMESTATE READ WRITE CKSUM gamestopONLINE 0 0 1 raidz1-0 ONLINE 0 0 2 da6 ONLINE 0 0 0 da7 ONLINE 0 0 0 da8 ONLINE 0 0 0 da9 ONLINE 0 0 0 da11ONLINE 0 0 0 errors: 10 data errors, use '-v' for a list P.S. This is a FreeBSD 11.1-BETA2 r320056M (M stands for CTL_MAX_PORTS = 1024), with ECC memory. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
redundant zfs pool, system traps and tons of corrupted files
Hi. Say I have a server that traps more and more often (different panics: zfs panics, GPFs, fatal traps while in kernel mode, etc.), and then I realize it has tons of permanent errors on all of its pools that scrub is unable to heal. Does this situation mean it's a bad-memory case ? Unfortunately I switched the hardware to an identical server prior to noticing that the zpools have errors, so I'm not sure when they appeared. Right now I'm about to run a memtest on the old hardware. So, what would you say - does it point at the memory as the root problem ? Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
system is unresponsive and the amount of wired memory is cycling - zfs/iscsi ?
Hi. I'm using a FreeBSD 11.0-R server as a SAN system (with the native iSCSI target). It has 12 disks attached via an external enclosure and a Megaraid SAS 3003 mrsas(4) controller. Actually I'm using several FreeBSD boxes in similar configurations as SAN systems, but this one is frequently unresponsive. This often looks like top/gstat/dmesg are locked on start, waiting for something; then, after several minutes, everything is back to normal. At the same time ls and zpool list work fine, but zfs clone does not. In the dmesg I get tons of messages like

ctl_datamove: tag 0x1086bc on (16:34:0) aborted
ctl_datamove: tag 0x1086a8 on (16:34:0) aborted
ctl_datamove: tag 0x10ec50 on (7:34:0) aborted
ctl_datamove: tag 0xf1cd6 on (13:34:0) aborted
ctl_datamove: tag 0xfcc88 on (15:34:0) aborted
ctl_datamove: tag 0xd3ed1 on (21:34:0) aborted
ctl_datamove: tag 0x1056f8 on (1:34:0) aborted

Not sure if they are related to the lock issue I'm having, so I thought I would just mention them. It seems like the system is starving for some resource, but I really have no idea what it could be. When this lock is happening, a launched top stops updating its output, so I don't know what is going on. I let the system run with default zfs settings, and top shows that the ARC has eaten all of the memory. One more important thing - just before top stops refreshing the screen it shows that the ARC has eaten all the memory, and then, after the lock is gone, top shows the server has 11G of free memory back, so this happens in cycles and definitely has something to do with the amount of wired memory. Should I tune the zfs subsystem in some way ? If yes - what exactly should I tune ? I'm also running this system with a CTL patch increasing the target limit to 1024; maybe this is eating some additional kernel memory, I don't know if this is the source of the locking issue. I'm running other systems with the same patch, but they run smoothly. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
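If the stalls really track ARC growth, a common first step is to cap the ARC so the kernel never has to reclaim many gigabytes of wired memory at once. This is only a sketch; the value is a placeholder and should leave headroom for CTL and the rest of the kernel:

===Cut===
# echo 'vfs.zfs.arc_max="8G"' >> /boot/loader.conf
# shutdown -r now     # loader tunable, takes effect on the next boot
===Cut===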
Re: freebsd on Intel s5000pal - kernel reboots the system
Hi. On 18.04.2017 14:44, Konstantin Belousov wrote: You did not provide any information about your issue. It is not known even whether the loader breaks for you, or a kernel starts booting and failing. Ideally, you would use serial console and provide the log of everything printed on it, before the reboot. What kind of boot do you use, legacy BIOS or EFI ? What version of FreeBSD ? Oh, yeah, I'm sorry. Nah, the loader is fine; the server reboots when the kernel itself is initializing the devices. It starts initializing various PCI stuff, then reboots. I'm aware of the serial console, but, unfortunately, this server has some proprietary RS-232 jack in the form factor of 8P8C, and the only cable I have with these jacks is a Cisco "blue" cable, but it seems they use a different pin-out scheme. Thus, so far, no serial console, but I'm working on it. The FreeBSD version I've tried to use is 11-RELEASE amd64. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
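For the record, once a working cable is found, redirecting the console takes only a few loader variables (set on the install media or at the loader prompt; the speed is a guess and must match the BIOS serial redirection settings), and a verbose boot can be forced with boot -v to see the last device touched before the reset:

===Cut===
# cat >> /boot/loader.conf <<'EOF'
boot_multicons="YES"
boot_serial="YES"
console="comconsole,vidconsole"
comconsole_speed="115200"
EOF
===Cut===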
freebsd on Intel s5000pal - kernel reboots the system
Hi, I need to install FreeBSD on an Intel system with an s5000pal mainboard. The problem is that at the kernel loading stage FreeBSD reboots the server - always and silently, without trapping. I have unplugged all of the discrete PCI controllers, leaving only the onboard ones. It still reboots. I suspected this was some kind of hardware problem, so I ran memtest (no errors for two hours) and even switched the server (I have two identical ones). The new server reboots too. So it looks like some kind of FreeBSD issue. I've updated the BIOS on both and tried to boot without ACPI - this doesn't help (and without ACPI FreeBSD refuses to even start loading the kernel). So I'll really appreciate any ideas on how to solve this. I tried to boot CentOS - it boots just fine. I've also tried to play with various BIOS settings, but this makes no difference at all. From what I see, FreeBSD reboots the server during PCI initialization. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
zpool list shows nonsense on raidz pools, at least it looks like it to me
Hi, It's not my first letter where I fail to understand the space usage from zfs utilities, and in previous ones I was kind of convinced that I just read it wrong, but not this time I guess. See for yourself: [emz@san01:~]> zpool list data NAME SIZE ALLOC FREE EXPANDSZ FRAGCAP DEDUP HEALTH ALTROOT data 17,4T 7,72T 9,66T -46%44% 1.00x ONLINE - Here' as I understand it, zpool says that less than a half of the pool is used. As far as I know this is very complicated when it comes to the radiz pools. Let's see: [emz@san01:~]> zfs list -t all data NAME USED AVAIL REFER MOUNTPOINT data 13,3T 186G 27,2K /data So, if we won't investigate further, it looks like that only 186G is free. Spoiling - this is the real free space amount, because I've just managed to free 160 gigs of data, and I really know I was short on space when sending 30 Gb dataset, because zfs was saying "Not enough free space". So, let's investigate further: [emz@san01:~]> zfs list -t all | more NAMEUSED AVAIL REFER MOUNTPOINT data 13,3T 186G 27,2K /data data/esx 5,23T 186G 27,2K /data/esx data/esx/boot-esx018,25G 193G 561M - data/esx/boot-esx028,25G 193G 561M - data/esx/boot-esx038,25G 193G 561M - data/esx/boot-esx048,25G 193G 561M - data/esx/boot-esx058,25G 193G 561M - data/esx/boot-esx068,25G 193G 561M - data/esx/boot-esx078,25G 193G 962M - data/esx/boot-esx088,25G 193G 562M - data/esx/boot-esx098,25G 193G 562M - data/esx/boot-esx108,25G 193G 595M - data/esx/boot-esx118,25G 193G 539M - data/esx/boot-esx128,25G 193G 539M - data/esx/boot-esx138,25G 193G 539M - data/esx/boot-esx148,25G 193G 539M - data/esx/boot-esx158,25G 193G 539M - data/esx/boot-esx168,25G 193G 541M - data/esx/boot-esx178,25G 193G 540M - data/esx/boot-esx188,25G 193G 539M - data/esx/boot-esx198,25G 193G 542M - data/esx/boot-esx208,25G 194G 12,8K - data/esx/boot-esx218,25G 194G 12,8K - data/esx/boot-esx228,25G 193G 913M - data/esx/boot-esx238,25G 193G 558M - data/esx/boot-esx248,25G 194G 12,8K - data/esx/boot-esx258,25G 194G 12,8K - data/esx/boot-esx268,25G 194G 12,8K - data/esx/shared5,02T 2,59T 2,61T - data/reference 6,74T 4,17T 2,73T - data/reference@ver7_214 127M - 2,73T - data/reference@ver2_73912,8M - 2,73T - data/reference@ver2_7405,80M - 2,73T - data/reference@ver2_7414,55M - 2,73T - data/reference@ver2_742 993K - 2,73T - data/reference-ver2_739-worker100 1,64G 186G 2,73T - data/reference-ver2_739-worker101 254M 186G 2,73T - data/reference-ver2_739-worker102 566K 186G 2,73T - data/reference-ver2_739-worker103 260M 186G 2,73T - data/reference-ver2_739-worker104 8,74G 186G 2,73T - data/reference-ver2_739-worker105 4,19G 186G 2,73T - data/reference-ver2_739-worker106 1,72G 186G 2,73T - data/reference-ver2_739-worker107 282M 186G 2,73T - data/reference-ver2_739-worker108 1,27M 186G 2,73T - data/reference-ver2_739-worker109 8,74G 186G 2,73T - data/reference-ver2_739-worker110 8,74G 186G 2,73T - data/reference-ver2_739-worker111 8,74G 186G 2,73T - data/reference-ver2_739-worker112 8,74G 186G 2,73T - data/reference-ver2_739-worker113 838K 186G 2,73T - data/reference-ver2_739-worker114 8,74G 186G 2,73T - data/reference-ver2_739-worker115 8,74G 186G 2,73T - data/reference-ver2_739-worker116 8,74G 186G 2,73T - data/reference-ver2_739-worker117 8,74G 186G 2,73T - data/reference-ver2_739-worker118 8,74G 186G 2,73T - data/reference-ver2_739-worker119 8,74G 186G 2,73T - data/reference-ver2_739-worker120 8,74G 186G 2,73T - data/reference-ver2_739-worker121 8,74G 186G 2,73T - data/reference-ver2_739-worker122 8,74G 186G 2,73T - data/reference-ver2_739-worker123 8,74G 186G 
2,73T - data/reference-ver2_739-worker124 8,74G 186G 2,73T - data/reference-ver2_739-worker125 8,74G 186G 2,73T - data/reference-ver2_739-worker126 8,74G 186G 2,73T - data/reference-ver2_739-worker127 8,74G 186G 2,73T - data/reference-ver2_739-worker128 8,74G 186G 2,73T - data/reference-ver2_739-worker129 8,74G 186G 2,73T - data/reference-ver2_739-worker130 8,74G 186G 2,73T - data/reference-ver2_739-worker131 8,74G 186G 2,73T
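On pools dominated by zvols, most of the gap between zpool's FREE and zfs's AVAIL usually sits in refreservations (every non-sparse zvol reserves its full volsize up front) plus the raidz parity that zpool list counts as raw space. A quick, read-only way to see where the space is pinned, assuming the layout above:

===Cut===
# zfs list -o space -r data | head -20
# zfs get -r -t volume refreservation,usedbyrefreservation data | head -20
===Cut===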
about that DFBSD performance test
Hi. Some have probably seen this already - http://lists.dragonflybsd.org/pipermail/users/2017-March/313254.html So, could anyone explain why FreeBSD lost by that much ? The test is split into two parts: the nginx part and the IPv4 forwarding part. I understand that the nginx loss was due to the SO_REUSEPORT feature, which we do formally have, but in DFBSD and Linux it provides a kernel socket multiplexor, which eliminates locking, and ours does not. I have only found traces of a discussion saying the DFBSD implementation is too hackish. Well, hackish or not, it turns out to be 4 times faster. The IPv4 forwarding loss is pure defeat, though. Please note that although they use HEAD in these tests, they also mention that it is the GENERIC-NODEBUG kernel, which means this isn't related to the WITNESS stuff. Please also don't consider this trolling; I've been a big FreeBSD fan through the years, so I'm asking because I'm kind of concerned. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: reset not working like 70% of the time
Hi. On 25.01.2017 15:15, Kurt Jaeger wrote: > Hi! > >> does anyone suffer from this too ? Right now (and for the several last >> years) a 100% decent way to reset a terminal session (for instance, >> after a connection reset, after accidentally displaying a binary file >> with symbols that are treated as terminal control sequences, after >> breaking a cu session, etc) is to launch midnight commander and then >> quit from it. And the reset is working in like 30% of cases only. > [...] >> Am I the only person seeing this ? > I had some cases in the past where xterm was hanging, too -- but > not with *that* high rate of problems. > > Most of the time, xterm's Full Reset option works fine. > > The question is: how to debug that... ?

Typical cases are:
- newlines aren't working properly, the current line just gets cleared and that's all (no scrolling up one line happens)
- mouse clicking produces some input symbols on the terminal line
- Ctrl-C stops working (just nothing happens)

I'm seeing all of these in my konsole terminal window while working with local and remote hosts (mostly with remotes), and typing 'reset' usually just does nothing. God bless the Midnight Commander ! Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
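What midnight commander effectively does on exit is reinitialize the terminal, and the same can usually be had directly. If the tty is broken enough that Enter no longer works, typing the command blind and ending it with Ctrl-J instead of Enter often still gets it through:

===Cut===
$ stty sane
$ printf '\033c'     # RIS, a full terminal reset
$ tput reset
===Cut===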
reset not working like 70% of the time
Hi, does anyone suffer from this too ? Right now (and for the several last years) a 100% decent way to reset a terminal session (for instance, after a connection reset, after accidentally displaying a binary file with symbols that are treated as terminal control sequences, after breaking a cu session, etc.) is to launch midnight commander and then quit from it. And the reset is working in like 30% of cases only. Unlike in Linux, where it's 100% functional. Am I the only person seeing this ? Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: decent 40G network adapters
Hi. On 18.01.2017 15:03, Slawa Olhovchenkov wrote: > I am use Chelsio and Solarflare. > Not sure about you workload -- I am have 40K+ TCP connections, you > workload need different tuning. > Do you planed to utilise both ports? > For this case you need PCIe 16x card. This is Chelsio T6 and > Solarflare 9200. Thanks. No, the number of connections in my case will be small - hundreds, and right now the target servers are utilizing 4-5 Gbit/sec of bandwidth, so I'm looking for something higher-performing, that's all. The pps number is also way below Mpps - at this time it's about 200 kpps, so I really hope I won't be facing a situation with millions of pps, though it seems it will be slightly above 1 Mpps. Hopefully with the help of the community I'll be able to tune the servers to handle this ! :) Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
decent 40G network adapters
Hi. Could someone recommend a decent 40Gbit adapter that is proven to work under FreeBSD ? The intended purpose is iSCSI traffic, not much pps, but rates definitely above 10G. I've tried Supermicro-manufactured Intel XL710 ones (two boards, different servers - same sad story: packet loss, unresponsive server, spikes); it seems they have a problem in the driver (or firmware), and though Intel support states this is because Supermicro tampered with the adapter, I'm still suspicious about ixl(4). I've also seen in the ML that a guy reported the exact same problem with ixl(4) as I found. So, what would you say ? Chelsio ? Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: camcontrol rescan seems to be broken
Hi. On 22.12.2016 23:46, Warner Losh wrote: Sure sounds like your binaries are cross-threaded with the kernel. what was "file `which camcontrol`" tell you? I just got this on the FreeBSD 11.0-RELEASE Live CD, when trying to rescan a SCSI bus on an LSI3008 adapter. It looks more like a bug in 11.0-RELEASE, since I downloaded the image from the official ftp. Does it work for you ? On the original machine I was talking about in this thread:

# file `which camcontrol`
/sbin/camcontrol: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 11.0 (1100122), FreeBSD-style, stripped
# uname -U
1100122
# uname -K
1100122

P.S. Does `camcontrol rescan all` work for anyone reading this ? Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: camcontrol rescan seems to be broken
Hi. On 22.12.2016 11:51, Eugene M. Zheganin wrote: Hi, could anyone tell me where am I wrong: # camcontrol rescan all camcontrol: CAMIOCOMMAND ioctl failed: Invalid argument # uname -U 1100122 # uname -K 1100122 # uname -a FreeBSD bsdrookie.norma.com. 11.0-RELEASE-p5 FreeBSD 11.0-RELEASE-p5 #0 r310364: Wed Dec 21 19:03:58 YEKT 2016 e...@bsdrookie.norma.com.:/usr/obj/usr/src/sys/BSDROOKIE amd64 Furthermore, there's definitely something wrong on my machine, because smartctl complains too: # smartctl -a /dev/ada1 smartctl 6.4 2015-06-04 r4109 [FreeBSD 11.0-RELEASE-p5 amd64] (local build) Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org error sending CAMIOCOMMAND ioctl: Inappropriate ioctl for device # smartctl -a /dev/ada0 smartctl 6.4 2015-06-04 r4109 [FreeBSD 11.0-RELEASE-p5 amd64] (local build) Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org error sending CAMIOCOMMAND ioctl: Inappropriate ioctl for device Unable to get CAM device list /dev/ada0: Unable to detect device type Please specify device type with the -d option. Use smartctl -h to get a usage summary ktrace part, if it will help: [...] 4291 smartctl NAMI "/dev/xpt0" 4291 smartctl RET openat 3 4291 smartctl CALL ioctl(0x3,0xc4d81802,0x7fffa978) 4291 smartctl RET ioctl -1 errno 25 Inappropriate ioctl for device Does anyone have an idea on how to fix things back to normal ? Thanks. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: cannot detach vdev from zfs pool
Hi. On 22.12.2016 21:26, Alan Somers wrote: I'm not surprised to see this kind of error in a ZFS on GELI on Zvol pool. ZFS on Zvols has known deadlocks, even without involving GELI. GELI only makes it worse, because it foils the recursion detection in zvol_open. I wouldn't bother opening a PR if I were you, because it probably wouldn't add any new information. Sorry it didn't meet your expectations, -Alan Oh, so that's why it happened. Okay, that's perfectly fine with me. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
cannot detach vdev from zfs pool
Hi, Recently I decided to remove the bogus zfs-inside-geli-inside-zvol pool, since it's now officially unsupported. So I needed to reslice my disk, hence to detach one of the disks from a mirrored pool. I issued 'zpool detach zroot gpt/zroot1' and my system livelocked almost immediately, so I pressed reset. Now I get this:

# zpool status zroot
  pool: zroot
 state: DEGRADED
status: One or more devices has been taken offline by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.
action: Online the device using 'zpool online' or replace the device with 'zpool replace'.
  scan: resilvered 687G in 5h26m with 0 errors on Sat Oct 17 19:41:49 2015
config:

        NAME                     STATE     READ WRITE CKSUM
        zroot                    DEGRADED     0     0     0
          mirror-0               DEGRADED     0     0     0
            gpt/zroot0           ONLINE       0     0     0
            1151243332124505229  OFFLINE      0     0     0  was /dev/gpt/zroot1

errors: No known data errors

This isn't a big deal by itself, since I was able to create a second zfs pool and I'm now relocating my data to it, although I should say that this is a very disturbing sequence of events, because I'm now unable to even delete the UNAVAIL vdev from the pool. I tried to boot from a FreeBSD USB stick and detach it there, but all I discovered was that the zfs subsystem locks up on the command 'zpool detach zroot 1151243332124505229'. I waited for several minutes but nothing happened, and furthermore subsequent zpool/zfs commands hang too. Is this worth submitting a PR, or does it need additional investigation ? In general I intend to destroy this pool after relocating the data, but I'm afraid someone (or even myself again) could step on this later. Both disks are healthy, and I don't see any complaints in dmesg. I'm running FreeBSD 11.0-RELEASE-p5 here. The pool was initially created somewhere under 9.0, I guess. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
camcontrol rescan seems to be broken
Hi, could anyone tell me where am I wrong: # camcontrol rescan all camcontrol: CAMIOCOMMAND ioctl failed: Invalid argument # uname -U 1100122 # uname -K 1100122 # uname -a FreeBSD bsdrookie.norma.com. 11.0-RELEASE-p5 FreeBSD 11.0-RELEASE-p5 #0 r310364: Wed Dec 21 19:03:58 YEKT 2016 e...@bsdrookie.norma.com.:/usr/obj/usr/src/sys/BSDROOKIE amd64 # camcontrol devlist at scbus0 target 0 lun 0 (pass0,ada0) at scbus1 target 0 lun 0 (pass1,ada1) at scbus2 target 0 lun 0 (pass2,ada2) at scbus3 target 0 lun 0 (pass3,ada3) at scbus4 target 0 lun 0 (cd0,pass4) at scbus5 target 0 lun 0 (pass5,ses0) at scbus6 target 0 lun 0 (da0,pass6) # egrep 'ahci0|ada' /var/run/dmesg.boot ahci0: port 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf020-0xf03f mem 0xf7202000-0xf72027ff irq 19 at device 31.2 on pci0 ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported ahcich0: at channel 0 on ahci0 ahcich1: at channel 1 on ahci0 ahcich2: at channel 2 on ahci0 ahcich3: at channel 3 on ahci0 ahcich4: at channel 4 on ahci0 ahciem0: on ahci0 ada0 at ahcich0 bus 0 scbus0 target 0 lun 0 ada0: ATA8-ACS SATA 3.x device ada0: Serial Number Z1D45ETV ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes) ada0: Command Queueing enabled ada0: 953869MB (1953525168 512 byte sectors) ada0: quirks=0x1<4K> ada1 at ahcich1 bus 0 scbus1 target 0 lun 0 ada1: ATA8-ACS SATA 2.x device ada1: Serial Number 5XW0K4AP ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada1: Command Queueing enabled ada1: 1907729MB (3907029168 512 byte sectors) ada2 at ahcich2 bus 0 scbus2 target 0 lun 0 ada2: ATA8-ACS SATA 3.x device ada2: Serial Number Z1E2SLEV ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada2: Command Queueing enabled ada2: 1907729MB (3907029168 512 byte sectors) ada2: quirks=0x1<4K> ada3 at ahcich3 bus 0 scbus3 target 0 lun 0 ada3: ATA8-ACS SATA 2.x device ada3: Serial Number WD-WMAYP0516506 ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada3: Command Queueing enabled ada3: 476940MB (976773168 512 byte sectors) GEOM: ada3: the primary GPT table is corrupt or invalid. GEOM: ada3: using the secondary instead -- recovery strongly advised. GEOM_ELI: Device ada2p5.eli created. 
Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Upgrading boot from GPT(BIOS) to GPT(UEFI)
Hi. On 19.12.2016 11:51, Warner Losh wrote: > On Sun, Dec 18, 2016 at 11:34 PM, Eugene M. Zheganin <e...@norma.perm.ru> > wrote: >> I tried the UEFI boot sequence on a Supermicro server. It boots only >> manually, gives some cryptic error while booting automatically. When >> entering the path to the EFI loader in a appearing prompt - it boots >> fine, but this kills the idea. >> >> I've written a message here about this, so far nobody answered (August, >> 14th, "FreeBSD doesn't boot automatically from UEFI"). >> >> Now it runs on gptzfsboot again, so > Which SuperMicro board? Our X9's have big issues with UEFI (though > some versions of the boards seem to work). The X10's are rock solid. > The affected server has X9SCL/X9SCM, yup. Is there some workaround to this, like flashing newer BIOS ? Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Upgrading boot from GPT(BIOS) to GPT(UEFI)
Hi. On 16.12.2016 22:08, Fernando Herrero Carrón wrote: > I am reading uefi(8) and it looks like FreeBSD 11 should be able to boot > using UEFI straight into ZFS, so I am thinking of converting that > freebsd-boot partition to an EFI partition, creating a FAT filesystem and > copying /boot/boot.efi there. > > How good of an idea is that? Would it really be that simple or am I missing > something? My only reason for wanting to boot with UEFI is faster boot, > everything is working fine otherwise. > I tried the UEFI boot sequence on a Supermicro server. It boots only manually, gives some cryptic error while booting automatically. When entering the path to the EFI loader in a appearing prompt - it boots fine, but this kills the idea. I've written a message here about this, so far nobody answered (August, 14th, "FreeBSD doesn't boot automatically from UEFI"). Now it runs on gptzfsboot again, so Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
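For reference, the conversion itself is small on a GPT disk with some free space; a sketch assuming ada0 is the boot disk and the new efi partition ends up at index 2 (both placeholders), using the prebuilt FAT image shipped with 10.x/11.x:

===Cut===
# gpart add -a 4k -s 800k -t efi ada0
# gpart bootcode -p /boot/boot1.efifat -i 2 ada0
===Cut===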
iscsi limit to 255 entities
Hi. I kind of stepped on a limit of 255 targets (a bunch of VMs); what is the possible workaround for this, besides running a second ctld in bhyve ? I guess I cannot run ctld inside a jail, since it's the kernel daemon, right ? Is the 255 limit a limit on entities - I mean, can I run like 255 LUNs in 255 targets ? Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
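If the limit being hit is on targets/ports rather than LUNs, one workaround that avoids a second ctld entirely is packing several LUNs into each target; whether that fits depends on how the initiators are grouped. A minimal ctl.conf sketch (names and paths are placeholders):

===Cut===
target iqn.2016-12.com.example:vmgroup0 {
        portal-group pg0
        lun 0 {
                path /dev/zvol/data/vm0
        }
        lun 1 {
                path /dev/zvol/data/vm1
        }
}
===Cut===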
Re: [ZFS] files in a weird situtation
Hi, On 18.12.2016 02:01, David Marec wrote: > > A pass with `zfs scrub` didn't help. > > Any clue is welcome. What's that `dmu_bonus_hold` stands for ? > Just out of the curiosity - is it on a redundant pool and does the 'zpool status' report any error ? Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
sonewconn: pcb [...]: Listen queue overflow to human-readable form
Hi. Sometimes on one of my servers I got dmesg full of sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (6 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (2 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (1 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (15 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (12 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (10 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (16 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (16 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (22 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (6 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (6 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (1 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (9 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (5 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (18 occurrences) sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in queue awaiting acceptance (4 occurrences) but at the time of investigation the socket is already closed and lsof cannot show me the owner. I wonder if the kernel can itself decode this output and write it in the human-readable form ? Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
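Since the pcb pointer is useless once the socket is gone, the practical approach is to catch it live: netstat -L lists every listening socket with its current and maximum accept queue, which usually identifies the overflowing listener by port, and the global backlog ceiling is a sysctl:

===Cut===
# netstat -Lan                  # qlen/incqlen/maxqlen per listening socket
# sysctl kern.ipc.somaxconn     # global cap on listen() backlogs
===Cut===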
Re: webcamd panic - is it just me?
Hi. On 06.12.2016 18:43, Anton Shterenlikht wrote: > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=215000 > > I think this started after moving from 10.3 to 11.0. > > Does nobody else see this panic? > > Saw webcamd-initiated panic once, on an 11.x too; don't remember the details, so I cannot tell whether mine was identical or not. Furthermore, after my camera has lost the ability to show 640x480 mode under FreeBSD somewhere between 8.x i386 and 8.x amd64, I don't use it often - just a couple tryouts per year, just to see this wasn't accidentally fixed. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
vfs.zfs.vdev.bio_delete_disable - crash while changing on the fly
Hi. Recently I've encountered the issue with "slow TRIM" and Sandisk SSDs, so I was told to try to disable TRIM and see what happens (thanks a lot by the way, that did it). But changing the vfs.zfs.vdev.bio_delete_disable on the fly can lead to the system crash with the probability of 50%. Is it just me or is this already known ? If it's known, why isn't this oid in a read-only list ? Thanks. P.S. My box tried to dump a core, but after a reboot savecore got nothing, so you just have to believe me. ;) Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Sandisk CloudSpeed Gen. II Eco Channel SSD vs ZFS = we're in hell
Hi. On 28.11.2016 23:07, Steven Hartland wrote: > Check your gstat with -dp so you also see deletes, it may be that your > drives have a very slow TRIM. > Indeed, I see a bunch of delete operations, and when TRIM disabled my engineers report that the performance is greatly increasing. Is this it ? Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
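For completeness, on the 10.x/11.x ZFS the TRIM machinery is governed by a loader tunable, so the usual way to run without it is from loader.conf rather than toggling sysctls on a live box:

===Cut===
# sysctl vfs.zfs.trim.enabled                            # 1 by default
# echo 'vfs.zfs.trim.enabled=0' >> /boot/loader.conf     # takes effect after reboot
===Cut===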
Sandisk CloudSpeed Gen. II Eco Channel SSD vs ZFS = we're in hell
Hi, recently we bought a bunch of "Sandisk CloudSpeed Gen. II Eco Channel" disks (the model name by itself should already have made me suspicious) for use with a zfs SAN on FreeBSD; we plugged them into an LSI SAS3008 and now we are experiencing performance that I would call "literally awful". I'm already using several zfs SANs on FreeBSD with Intel/Samsung SSD drives, including the LSI SAS3008 controller, but never saw anything like this (and yes, these are all SSDs):

dT: 1.004s w: 1.000s
 L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
 75472 78367 104.4 12 1530 94.8 113.4| da0
 75475 81482 79.2 12 1530 94.5 113.1| da1
 69490 96626 106.9 12 1530 124.9 149.4| da2
 75400 72382 51.5 10 1275 93.7 93.4| da3
 0 0 0 00.0 0 00.00.0| da4
 75400 72382 55.0 10 1275 93.9 93.7| da5
 2 3975 3975 240200.3 0 00.0 21.0| da6
 0 3967 3967 241440.3 0 00.0 21.4| da7
 1 3929 3929 242590.3 0 00.0 21.6| da8
 0 3998 3998 239330.3 0 00.0 21.2| da9
 0 0 0 00.0 0 00.00.0| da10
 0 4037 4037 237100.2 0 00.0 21.3| da11
 0 0 0 00.0 0 00.00.0| da12
 0 0 0 00.0 0 00.00.0| da13
 0 0 0 00.0 0 00.00.0| da14
 0 0 0 00.0 0 00.00.0| da15
 0 0 0 00.0 0 00.00.0| da16

The disks are organized in raidz1 pools (which is slower than raid1 or 10, but, considering the performance of SSDs, we had no problems with Intel or Samsung drives), and the controller is flashed with the latest firmware available (an identical controller with Samsung drives performs just fine). The disks are 512e/4K drives, and "diskinfo -v"/"camcontrol identify" both report that they have a 4K stripesize/physical sector. The pools are organized using dedicated disks, so, considering all of the above, I don't see any possibility to explain this with alignment errors. No errors are seen in dmesg. So, right at this time, I'm out of ideas. Everything points to these Sandisk drives being the root of the problem, but I don't see how this is possible - according to various benchmarks (taken, however, with regular drives, not "Channel" ones, and so far I haven't figured out what the difference between "Channel" and non-"Channel" ones is, but they run different firmware branches) they ought to be okay (or so it seems), just ordinary SSDs. If someone has an explanation for this awful performance, please let me know. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: 11.0-RELEASE-p2: panic: vm_page_unwire: page 0x[...]'s wire count is zero
Hi. On 27.10.2016 15:01, Eugene M. Zheganin wrote: > Has anyone seen this, and what are my actions ? I've googled a bit, saw > some references mentioning FreeBSD 9.x and ZERO_COPY_SOCKETS, but I > have neither, so now I'm trying to understand what my next step should > be - do I report this ? > > And, unfortunately, it's repeatable: got another one just now. I guess I'll have to downgrade to 10-STABLE. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
11.0-RELEASE-p2: panic: vm_page_unwire: page 0x[...]'s wire count is zero
Hi, I've upgraded one of my old FreeBSD from 10-STABLE (which was leaking the wired memory and this was fixed in the spring; other than that it was pretty much stable) to the 11.0-RELEASE-p2, and almost immidiately got myself a panic: ===Cut=== # more core.txt.0 calypso.enaza.ru dumped core - see /var/crash/vmcore.0 Thu Oct 27 14:47:50 YEKT 2016 FreeBSD calypso.enaza.ru 11.0-RELEASE-p2 FreeBSD 11.0-RELEASE-p2 #0 r307991: Thu Oct 27 13:41:31 YEKT 2016 e...@calypso.enaza.ru:/usr/obj/usr/src/sys/CALYPSO amd64 panic: vm_page_unwire: page 0xf8023ca8b2f8's wire count is zero GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"... Unread portion of the kernel message buffer: panic: vm_page_unwire: page 0xf8023ca8b2f8's wire count is zero cpuid = 3 KDB: stack backtrace: #0 0x80b1b417 at kdb_backtrace+0x67 #1 0x80ad0782 at vpanic+0x182 #2 0x80ad05f3 at panic+0x43 #3 0x80e693b3 at vm_page_unwire+0x73 #4 0x80accb40 at sf_ext_free+0xb0 #5 0x80aa7ad0 at mb_free_ext+0xc0 #6 0x80aa81a8 at m_freem+0x38 #7 0x80cf7453 at tcp_do_segment+0x28a3 #8 0x80cf3edc at tcp_input+0xd1c #9 0x80c64c5f at ip_input+0x15f #10 0x80bfa135 at netisr_dispatch_src+0xa5 #11 0x80be2b9a at ether_demux+0x12a #12 0x80be37f2 at ether_nh_input+0x322 #13 0x80bfa135 at netisr_dispatch_src+0xa5 #14 0x80be2e16 at ether_input+0x26 #15 0x8055983c at igb_rxeof+0x81c #16 0x80558b92 at igb_msix_que+0x152 #17 0x80a8a7af at intr_event_execute_handlers+0x20f Uptime: 28m25s Dumping 3780 out of 8147 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91% Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/zfs.ko.debug...done. done. Loaded symbols for /boot/kernel/zfs.ko Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /usr/lib/debug//boot/kernel/opensolaris.ko.debug...done. done. Loaded symbols for /boot/kernel/opensolaris.ko #0 doadump (textdump=) at pcpu.h:221 221 pcpu.h: No such file or directory. 
in pcpu.h (kgdb) #0 doadump (textdump=) at pcpu.h:221 #1 0x80ad0209 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:366 #2 0x80ad07bb in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:759 #3 0x80ad05f3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:690 #4 0x80e693b3 in vm_page_unwire (m=, queue=) at /usr/src/sys/vm/vm_page.c:3136 #5 0x80accb40 in sf_ext_free (arg1=0xf8023ca8b2f8, arg2=0x0) at /usr/src/sys/kern/kern_sendfile.c:140 #6 0x80aa7ad0 in mb_free_ext (m=0xf80022f27c00) at /usr/src/sys/kern/kern_mbuf.c:678 #7 0x80aa81a8 in m_freem (mb=) at mbuf.h:1180 #8 0x80cf7453 in tcp_do_segment (m=, th=, so=0xf80022ba6a20, tp=, drop_hdrlen=52, tlen=, iptos=, ti_locked=Cannot access memory at address 0x1 ) at /usr/src/sys/netinet/tcp_input.c:1764 #9 0x80cf3edc in tcp_input (mp=, offp=, proto=) at /usr/src/sys/netinet/tcp_input.c:1442 #10 0x80c64c5f in ip_input (m=Cannot access memory at address 0x0 ) at /usr/src/sys/netinet/ip_input.c:809 #11 0x80bfa135 in netisr_dispatch_src (proto=1, source=, m=0x0) at /usr/src/sys/net/netisr.c:1121 #12 0x80be2b9a in ether_demux (ifp=, m=0x0) at /usr/src/sys/net/if_ethersubr.c:850 #13 0x80be37f2 in ether_nh_input (m=) at /usr/src/sys/net/if_ethersubr.c:639 #14 0x80bfa135 in netisr_dispatch_src (proto=5, source=, m=0x0) at /usr/src/sys/net/netisr.c:1121 #15 0x80be2e16 in ether_input (ifp=, m=0x0) at /usr/src/sys/net/if_ethersubr.c:759 #16 0x8055983c in igb_rxeof (count=583080448) at /usr/src/sys/dev/e1000/if_igb.c:4957 #17 0x80558b92 in igb_msix_que (arg=0xf8000649b538) at /usr/src/sys/dev/e1000/if_igb.c:1612 #18 0x80a8a7af in intr_event_execute_handlers ( p=, ie=) at /usr/src/sys/kern/kern_intr.c:1262 #19 0x80a8aa16 in ithread_loop (arg=) at /usr/src/sys/kern/kern_intr.c:1275 #20 0x80a873f5 in fork_exit ( callout=0x80a8a950 , arg=0xf80006497900, frame=0xfe01f0b1cc00) at /usr/src/sys/kern/kern_fork.c:1038 #21 0x80fc112e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611 #22 0x in ?? () Current language: auto; currently minimal (kgdb) ===Cut=== Has anyone seen this, and what are my actions ? I've google a bit, saw some references mentioning FreeBSD 9.x and ZERO_COPY_SOCKETS, but I don't have neither, so
Re: zfs, a directory that used to hold lot of files and listing pause
Hi. On 21.10.2016 15:20, Slawa Olhovchenkov wrote: ZFS prefetch affect performance dpeneds of workload (independed of RAM size): for some workloads wins, for some workloads lose (for my workload prefetch is lose and manualy disabled with 128GB RAM). Anyway, this system have only 24MB in ARC by 2.3GB free, this is may be too low for this workload. You mean - "for getting a list of a directory with 20 subdirectories" ? Why then does only this directory have this issue with pause, not /usr/ports/..., which has more directories in it ? (and yes, /usr/ports/www isn't empty and holds 2410 entities) /usr/bin/time -h ls -1 /usr/ports/www [...] 0.14s real 0.00s user 0.00s sys Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: zfs, a directory that used to hold lot of files and listing pause
Hi. On 21.10.2016 9:22, Steven Hartland wrote: On 21/10/2016 04:52, Eugene M. Zheganin wrote: Hi. On 20.10.2016 21:17, Steven Hartland wrote: Do you have atime enabled for the relevant volume? I do. If so disable it and see if that helps: zfs set atime=off Nah, it doesn't help at all. As per with Jonathon what does gstat -pd and top -SHz show? gstat (while ls'ing): dT: 1.005s w: 1.000s L(q) ops/sr/s kBps ms/rw/s kBps ms/wd/s kBps ms/d %busy Name 1 49 49 2948 13.5 0 00.0 0 0 0.0 65.0| ada0 0 32 32 1798 11.1 0 00.0 0 0 0.0 35.3| ada1 gstat (while idling): dT: 1.003s w: 1.000s L(q) ops/sr/s kBps ms/rw/s kBps ms/wd/s kBps ms/d %busy Name 0 0 0 00.0 0 00.0 0 0 0.00.0| ada0 0 2 22550.8 0 00.0 0 0 0.00.1| ada1 top -SHz output doesn't really differ while ls'ing or idling: last pid: 12351; load averages: 0.46, 0.49, 0.46 up 39+14:41:02 14:03:05 376 processes: 3 running, 354 sleeping, 19 waiting CPU: 5.8% user, 0.0% nice, 16.3% system, 0.0% interrupt, 77.9% idle Mem: 21M Active, 646M Inact, 931M Wired, 2311M Free ARC: 73M Total, 3396K MFU, 21M MRU, 545K Anon, 1292K Header, 47M Other Swap: 4096M Total, 4096M Free PID USERNAME PRI NICE SIZERES STATE C TIMEWCPU COMMAND 600 root390 27564K 5072K nanslp 1 295.0H 24.56% monit 0 root -170 0K 2608K - 1 75:24 0.00% kernel{zio_write_issue} 767 freeswitch 200 139M 31668K uwait 0 48:29 0.00% freeswitch{freeswitch} 683 asterisk200 806M 483M uwait 0 41:09 0.00% asterisk{asterisk} 0 root-80 0K 2608K - 0 37:43 0.00% kernel{metaslab_group_t} [... others lines are just 0% ...] Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: zfs, a directory that used to hold lot of files and listing pause
Hi. On 20.10.2016 21:17, Steven Hartland wrote: Do you have atime enabled for the relevant volume? I do. If so disable it and see if that helps: zfs set atime=off Nah, it doesn't help at all. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: zfs, a directory that used to hold lot of files and listing pause
Hi. On 20.10.2016 19:18, Dr. Nikolaus Klepp wrote: I've the same issue, but only if the ZFS resides on a LSI MegaRaid and one RAID0 for each disk. Not in my case, both pool disks are attached to the Intel ICH7 SATA300 controller. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: zfs, a directory that used to hold lot of files and listing pause
Hi, On 20.10.2016 19:12, Pete French wrote: Have ignored this thread until now, but I observed the same behaviour on my systems over the last week or so. In my case it's an exim spool directory, which was hugely full at some point (thousands of files) and now takes an awfully long time to open and list. I delete and remake them and the problem goes away, but I believe it is the same thing. I am running 10.3-STABLE, r303832 Yup, saw this once on a sendmail spool directory. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: zfs, a directory that used to hold lot of files and listing pause
Hi. On 20.10.2016 19:03, Miroslav Lachman wrote: What about snapshots? Are there any snapshots on this filesystem? Nope.

# zfs list -t all
NAME                        USED  AVAIL  REFER  MOUNTPOINT
zroot                       245G   201G  1.17G  legacy
zroot/tmp                  10.1M   201G  10.1M  /tmp
zroot/usr                  9.78G   201G  7.36G  /usr
zroot/usr/home             77.9M   201G  77.9M  /usr/home
zroot/usr/ports            1.41G   201G   857M  /usr/ports
zroot/usr/ports/distfiles   590M   201G   590M  /usr/ports/distfiles
zroot/usr/ports/packages    642K   201G   642K  /usr/ports/packages
zroot/usr/src               949M   201G   949M  /usr/src
zroot/var                   234G   201G   233G  /var
zroot/var/crash            21.5K   201G  21.5K  /var/crash
zroot/var/db                127M   201G   121M  /var/db
zroot/var/db/pkg           6.28M   201G  6.28M  /var/db/pkg
zroot/var/empty              20K   201G    20K  /var/empty
zroot/var/log               631M   201G   631M  /var/log
zroot/var/mail             24.6M   201G  24.6M  /var/mail
zroot/var/run                54K   201G    54K  /var/run
zroot/var/tmp               198K   201G   198K  /var/tmp

Or scrub running in the background? No. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: zfs, a directory that used to hold lot of files and listing pause
Hi. On 20.10.2016 18:54, Nicolas Gilles wrote: Looks like it's not taking up any processing time, so my guess is the lag probably comes from stalled I/O ... bad disk? Well, I cannot rule this out completely, but the first time I saw this lag on this particular server was about two months ago, and I guess two months is enough time for zfs on a redundant pool to get errors, but as you can see:

# zpool status
  pool: zroot
 state: ONLINE
status: One or more devices are configured to use a non-native block size. Expect reduced performance.
action: Replace affected devices with devices that support the configured block size, or migrate data to a properly configured pool.
  scan: resilvered 5.74G in 0h31m with 0 errors on Wed Jun 8 11:54:14 2016
config:

        NAME            STATE     READ WRITE CKSUM
        zroot           ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            gpt/zroot0  ONLINE       0     0     0  block size: 512B configured, 4096B native
            gpt/zroot1  ONLINE       0     0     0

errors: No known data errors

there's none. Yup, the disks have different sector sizes, but this issue happened with one particular directory, not all of them. So I guess this is irrelevant. Does a second "ls" return immediately (i.e. the metadata has been cached) ? Nope. Although the lag varies slightly:

4.79s real 0.00s user 0.02s sys
5.51s real 0.00s user 0.02s sys
4.78s real 0.00s user 0.02s sys
6.88s real 0.00s user 0.02s sys

Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
zfs, a directory that used to hold lot of files and listing pause
Hi. I have FreeBSD 10.2-STABLE r289293 (but I have observed this situation on different releases) and zfs. I also have one directory that used to hold a lot of (tens of thousands of) files. It surely takes a lot of time to get a listing of it. But now I have 2 files and a couple of dozen directories in it (I sorted the files into directories). Surprisingly, there's still a lag between "ls" and its output: ===Cut===
# /usr/bin/time -h ls
.recycle  2016-01  2016-04  2016-07  2016-10     sort-files.sh
2014      2016-02  2016-05  2016-08  ktrace.out  sort-months.sh
2015      2016-03  2016-06  2016-09  old         sounds
5.75s real 0.00s user 0.02s sys
===Cut=== I've seen this situation before, on other servers, so it's not the first time I encounter this. However, it's not 100% reproducible (I mean, if I fill the directory with dozens of thousands of files, I will not necessarily get this lag after the deletion). Has anyone seen this, and does anyone know how to resolve it ? It's not a critical issue, but it makes things uncomfortable here. One method I'm aware of: you can move the contents of this directory to some other place, then delete it and create it again. But it's kind of a nasty workaround. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
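The move-and-recreate workaround mentioned above, as a sketch (the directory name is a placeholder); the usual explanation is that a directory which once grew a large ZAP object keeps it even after the entries are gone, while a freshly created directory starts small again:

===Cut===
# mkdir dir.new
# tar -cf - -C dir . | tar -xf - -C dir.new
# mv dir dir.old && mv dir.new dir && rm -rf dir.old
===Cut===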
Re: I'm upset about FreeBSD
Hi. On 17.10.2016 5:44, Rostislav Krasny wrote: Hi, I've been using FreeBSD for many years. Not as my main operating system, though. But anyway several bugs and patches were contributed and somebody even added my name into the additional contributors list. That's pleasing but today I tried to install the FreeBSD 11.0 and I'm upset about this operating system. First of all I faced an old problem that I reported here a year ago: http://comments.gmane.org/gmane.os.freebsd.stable/96598 Completely new USB flash drive flashed by the FreeBSD-11.0-RELEASE-i386-mini-memstick.img file kills every Windows again. If I use the Rufus util to write the img file (using DD mode) the Windows dies immediately after the flashing. If I use the Win32DiskImager (suggested by the Handbook) it doesn't reinitialize the USB storage and Windows dies only if I remove and put that USB flash drive again or boot Windows when it is connected. Nothing was done to fix this nasty bug for a year. I saw this particular bug, and I must say - man, it's not FreeBSD, it's Rufus. So far Windows doesn't have any decent tool to write the image with. As for Rufus - somehow it does produce broken images on a USB stick (not always, though), which makes every Windows installation BSOD immediately after insertion. And this continues until you reinitialize the stick's boot area. My opinion on this hasn't changed regardless of the operating system: if something traps after something valid happens (like a USB flash drive being inserted), that's an OS problem, not the problem of whoever triggered it. Especially in the case when the inserted USB flash drive contains no FS that Windows can recognize and mount out-of-the-box. A non-bugged OS just shouldn't trap on whatever is inserted into its USB port, because otherwise it feels like the cheap The Net movie with Sandra Bullock. FreeBSD has many problems (and I'm upset with it too), but this just isn't one of them. Simply because such a thing never happens when you create the image using dd on any OS that has it natively. So it's a bad experience with Rufus, not with FreeBSD. P.S. By the way, win32diskimager is a total mess too. Sometimes it just does nothing instead of writing an image. I did try almost all of the free win32 tools to write images with, and didn't find any that would completely satisfy me. Rufus would be the best, if it didn't have this ridiculous bug with the BSOD. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
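For reference, the usual dd invocation for writing memstick images on an OS that has dd natively, assuming the stick shows up as da0 (double-check with dmesg before writing, since this overwrites the whole device):

===Cut===
# dd if=FreeBSD-11.0-RELEASE-i386-mini-memstick.img of=/dev/da0 bs=1m conv=sync
===Cut===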
Re: zfs/raidz: seems like I'm failing with math
Hi. On 16.10.2016 23:42, Gary Palmer wrote: You're confusing disk manufacturer gigabytes with real (power of two) gigabytes. The below turns 960 197 124 096 into real gigabytes Yup, I thought that smartctl is better than that and already displayed the size with base 1024. :) Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: zfs/raidz: seems like I'm failing with math
Hi. On 16.10.2016 22:06, Alan Somers wrote: It's raw size, but the discrepancy is between 1000 and 1024. Smartctl is reporting base 10 size, but zpool is reporting base 1024.. 960197124096.0*6/1024**4 = 5.24 TB, which is pretty close to what zpool says. Thanks ! It does explain it. But then again, on a pool that has been just created, I check the properties of the root dataset (I'm posting all the properties, just to display there's no child datasets or data on the pool): ===Cut=== # zfs get all gamestop NAME PROPERTY VALUE SOURCE gamestop type filesystem - gamestop creation sun oct 16 19:02 2016 - gamestop used 403K - gamestop available 4,04T - gamestop referenced153K - gamestop compressratio 1.00x - gamestop mounted yes- gamestop quota none default gamestop reservation none default gamestop recordsize128K default gamestop mountpoint/gamestop default gamestop sharenfs offdefault gamestop checksum on default gamestop compression offdefault gamestop atime on default gamestop devices on default gamestop exec on default gamestop setuidon default gamestop readonly offdefault gamestop jailedoffdefault gamestop snapdir hidden default gamestop aclmode discarddefault gamestop aclinheritrestricted default gamestop canmount on default gamestop xattr offtemporary gamestop copies1 default gamestop version 5 - gamestop utf8only off- gamestop normalization none - gamestop casesensitivity sensitive - gamestop vscan offdefault gamestop nbmandoffdefault gamestop sharesmb offdefault gamestop refquota none default gamestop refreservationnone default gamestop primarycache alldefault gamestop secondarycachealldefault gamestop usedbysnapshots 0 - gamestop usedbydataset 153K - gamestop usedbychildren249K - gamestop usedbyrefreservation 0 - gamestop logbias latencydefault gamestop dedup offdefault gamestop mlslabel - gamestop sync standard default gamestop refcompressratio 1.00x - gamestop written 153K - gamestop logicalused 26,5K - gamestop logicalreferenced 9,50K - gamestop volmode defaultdefault gamestop filesystem_limit none default gamestop snapshot_limitnone default gamestop filesystem_count none default gamestop snapshot_countnone default gamestop redundant_metadataalldefault ===Cut=== Only 4.03T is available. Looks like it's the actual size, since it's zfs and not zpool. But 960197124096 bytes * 5 / 1024^4 gives me 4.366 Tb, and not the 4.03 T. Where did about 300 gigs go ? I'm really trying to understand, not to catch some questionable logic or find errors. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
zfs/raidz: seems like I'm failing with math
Hi. FreeBSD 11.0-RC1 r303979, zfs raidz1: ===Cut=== # zpool status gamestop pool: gamestop state: ONLINE scan: none requested config: NAMESTATE READ WRITE CKSUM gamestopONLINE 0 0 0 raidz1-0 ONLINE 0 0 0 da0 ONLINE 0 0 0 da1 ONLINE 0 0 0 da2 ONLINE 0 0 0 da3 ONLINE 0 0 0 da4 ONLINE 0 0 0 da5 ONLINE 0 0 0 ===Cut=== 6 disks 960 Gbs each: ===Cut=== # smartctl -a /dev/da0 smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-RC1 amd64] (local build) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Samsung based SSDs Device Model: SAMSUNG MZ7KM960HAHP-5 Serial Number:S2HTNX0H507466 LU WWN Device Id: 5 002538 c402bdac1 Firmware Version: GXM1003Q User Capacity:960 197 124 096 bytes [960 GB] Sector Size: 512 bytes logical/physical [...] ===Cut=== But: ===Cut=== # zpool list gamestop NAME SIZE ALLOC FREE EXPANDSZ FRAGCAP DEDUP HEALTH ALTROOT gamestop 5,22T 4,38T 861G -24%83% 1.00x ONLINE - ===Cut=== Why 5.22T ? If zpool is displaying raw size, it should be 960 x 6 = 5760 Gb = 5.65 T. If it's displaying the actual data, then it should be 960 x 5 = 4800 Gb = 4.68 T. 5.22 T is neither of these. I'm stuck, please explain. :) Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
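The arithmetic behind the reply above, spelled out (decimal drive bytes converted to binary terabytes):

===Cut===
# echo 'scale=3; 960197124096 * 6 / 1024^4' | bc
5.239
# echo 'scale=3; 960197124096 * 5 / 1024^4' | bc
4.366
===Cut===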
zvol clone diffs
Hi. I should mention from the start that this is a question about an engineering task, not about a FreeBSD issue. I have a set of zvol clones that I redistribute over iSCSI. Several Windows VMs use these clones as disks via their embedded iSCSI initiators (each clone represents a disk with an NTFS partition, is imported as a "foreign" disk and functions just fine). In my opinion, they should not have any need to do additional writes on these clones (each VM should only read data, from my point of view). But zfs shows that they do, and sometimes they write a lot of data, so clearly facts and expectations differ a lot - obviously I didn't take something into account. Is there any way to figure out what these writes are ? Because I cannot come up with any simple enough method. Thanks. Eugene. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
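A hedged way to quantify (if not fully identify) those writes: snapshot each clone and watch the written property, or ask zfs send for a dry-run estimate of the increment; the dataset names below are placeholders:

===Cut===
# zfs snapshot data/clones/vm01@t0
# sleep 3600
# zfs snapshot data/clones/vm01@t1
# zfs get written@t0 data/clones/vm01          # bytes written since the t0 snapshot
# zfs send -n -v -i @t0 data/clones/vm01@t1    # dry-run size of the t0->t1 increment
===Cut===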
zfs/raidz and creation pause/blocking
Hi. Recently I spent a lot of time setting up various zfs installations, and I got a question. Often when creating a raidz on disks considerably big (>~ 1T) I'm seeing a weird stuff: "zpool create" blocks, and waits for several minutes. In the same time system is fully responsive and I can see in gstat that the kernel starts to tamper all the pool candidates sequentially at 100% busy with iops around zero (in the example below, taken from a live system, it's doing something with da11): (zpool create gamestop raidz da5 da7 da8 da9 da10 da11) dT: 1.064s w: 1.000s L(q) ops/sr/s kBps ms/rw/s kBps ms/w %busy Name 0 0 0 00.0 0 00.00.0| da0 0 0 0 00.0 0 00.00.0| da1 0 0 0 00.0 0 00.00.0| da2 0 0 0 00.0 0 00.00.0| da3 0 0 0 00.0 0 00.00.0| da4 0 0 0 00.0 0 00.00.0| da5 0 0 0 00.0 0 00.00.0| da6 0 0 0 00.0 0 00.00.0| da7 0 0 0 00.0 0 00.00.0| da8 0 0 0 00.0 0 00.00.0| da9 0 0 0 00.0 0 00.00.0| da10 150 3 0 00.0 0 00.0 112.6| da11 0 0 0 00.0 0 00.00.0| da0p1 0 0 0 00.0 0 00.00.0| da0p2 0 0 0 00.0 0 00.00.0| da0p3 0 0 0 00.0 0 00.00.0| da1p1 0 0 0 00.0 0 00.00.0| da1p2 0 0 0 00.0 0 00.00.0| da1p3 0 0 0 00.0 0 00.00.0| da0p4 0 0 0 00.0 0 00.00.0| gpt/boot0 0 0 0 00.0 0 00.00.0| gptid/22659641-7ee6-11e6-9b56-0cc47aa41194 0 0 0 00.0 0 00.00.0| gpt/zroot0 0 0 0 00.0 0 00.00.0| gpt/esx0 0 0 0 00.0 0 00.00.0| gpt/boot1 0 0 0 00.0 0 00.00.0| gptid/23c1fbec-7ee6-11e6-9b56-0cc47aa41194 0 0 0 00.0 0 00.00.0| gpt/zroot1 0 0 0 00.0 0 00.00.0| mirror/mirror 0 0 0 00.0 0 00.00.0| da1p4 0 0 0 00.0 0 00.00.0| gpt/esx1 The most funny thing is that da5,7-11 are SSD, with a capability of like 30K iops at their least. So I wonder what is happening during this and why does it take that long. Because usually pools are creating very fast. Thanks. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
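Two hedged ways to confirm what the pause is: look at the kernel stack of the blocked zpool process, and check whether whole-device TRIM on vdev creation is enabled (the tunable below exists on the 10.x/11.x ZFS; a whole-device TRIM of a near-1T SSD can take minutes in the drive firmware even though normal I/O stays fast):

===Cut===
# procstat -kk $(pgrep -x zpool)     # where the blocked zpool create is sitting
# sysctl vfs.zfs.vdev.trim_on_init   # 1 = TRIM the whole device when it joins a pool
===Cut===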