cifs vs Go: EINTR and ENOENT errors from getdents64
Hi, author of gocryptfs here, an encrypted overlay filesystem written in Go. A gocryptfs user reported [1] hitting EINTR errors when gocryptfs is used on a cifs mount. I wrote a minimal reproducer, getdents.go [2], that runs getdents64 in a loop and I can reproduce the issue easily. I additionally see ENOENT errors. The errors can also be seen via the high-level Readdirnames() function that the Go stdlib provides [6]. I am running kernel 5.6.13 and Samba 4.12.2 for this test. I have a cifs share mounted as shown in [3], populated with 1001 empty files using touch $(seq 1000 2000) . When running getdents.go [2] against this mount, good cases look like this (output from getdents.go): --- example 0 --- unix.Getdents: n=4176; n=4176; n=4176; n=4176; n=4176; n=3192; n=0; err=; total 24072 bytes unix.Getdents: n=4176; n=4176; n=4176; n=4176; n=24; n=4152; n=3192; n=0; err=; total 24072 bytes INTR errors look like this (getdents.go output and corresponding kernel debug messages that were enabled acc. to [4]): --- example 1 --- unix.Getdents: n=-1; err=interrupted system call; total 0 bytes [168340.499471] fs/cifs/file.c: CIFS VFS: leaving cifs_closedir (xid = 24238) rc = 0 [168340.599686] fs/cifs/readdir.c: Full path: start at: 0 [168340.599696] fs/cifs/transport.c: signal is pending before sending any data [168340.599698] smb2_add_credits: 310 callbacks suppressed [168340.599699] fs/cifs/smb2ops.c: add 1 credits total=8192 [168340.599701] fs/cifs/smb2ops.c: query_dir_first: open failed rc=-4 [168340.599702] fs/cifs/readdir.c: initiate cifs search rc -4 [168340.599979] fs/cifs/file.c: Closedir inode = 0x67f9a79c [168340.599981] fs/cifs/file.c: CIFS VFS: in cifs_closedir as Xid: 24240 with uid: 1026 [168340.599982] fs/cifs/file.c: Freeing private data in close dir [168340.599983] fs/cifs/file.c: CIFS VFS: leaving cifs_closedir (xid = 24240) rc = 0 --- example 2 --- unix.Getdents: n=-1; err=interrupted system call; total 0 bytes [168453.792894] Status code returned 0x8006 STATUS_NO_MORE_FILES [168453.893360] fs/cifs/transport.c: signal is pending before sending any data [168453.893363] fs/cifs/smb2ops.c: query_dir_first: open failed rc=-4 ENOENT errors look like this: --- example 3 --- unix.Getdents: n=4176; n=4176; n=4176; n=4176; n=4176; n=-1; err=no such file or directory; total 20880 bytes [168517.187072] Status code returned 0x8006 STATUS_NO_MORE_FILES [168517.300252] fs/cifs/transport.c: signal is pending before sending any data [168517.300255] fs/cifs/readdir.c: fce error -2 --- example 4 --- unix.Getdents: n=4176; n=4176; n=4176; n=1440; n=2736; n=-1; err=no such file or directory; total 16704 bytes [168650.603145] Status code returned 0x8006 STATUS_NO_MORE_FILES [168650.713831] fs/cifs/transport.c: signal is pending before sending any data [168650.713835] fs/cifs/readdir.c: fce error -2 I have the same thing written in C does not show any problems [5]. The Go runtime uses signals heavily, could this be a reason for the behavoir I am seeing? I can trigger the problem more quickly when I resize the terminal window (causing SIGWINCH). Note that the Go runtime sets SA_RESTART on all signal handlers. I can handle the EINTR errors by retrying, no problem. Should I also retry when I get ENOENT? Thanks, Jakob [1] https://github.com/rfjakob/gocryptfs/issues/483 [2] https://github.com/rfjakob/gocryptfs/blob/master/contrib/getdents-debug/getdents/getdents.go [3] //127.0.0.1/samba-test on /mnt/samba-test type cifs (rw,relatime,vers=3.1.1,cache=strict,username=jakob,uid=1026,forceuid,gid=1026,forcegid,addr=127.0.0.1,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mappos ix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,user=jakob) [4] https://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Enabling_Debugging [5] https://github.com/rfjakob/gocryptfs/blob/master/contrib/getdents-debug/getdents_c/getdents.c [6] https://github.com/golang/go/issues/39237
Re: [PATCH v5 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 25.04.18 17:34, Marc Kleine-Budde wrote: Applied to linux-can-next. Thanks, Marc Marc, is the patch supposed to show up on https://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next.git/log/ or am I looking at the wrong tree? Thanks, Jakob
Re: [PATCH v5 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 25.04.18 17:34, Marc Kleine-Budde wrote: Applied to linux-can-next. Thanks, Marc Marc, is the patch supposed to show up on https://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next.git/log/ or am I looking at the wrong tree? Thanks, Jakob
Re: [regression, bisected] rockchip rk3399 video output breakage
On 24.04.18 15:33, JeffyChen wrote: [ 36.076577] WARNING: CPU: 1 PID: 83 at drivers/gpu/drm/rockchip/rockchip_drm_vop.c:1004 vop_crtc_atomic_flush+0x1c0/0x1c8 this looks like an issue recently reported by heiko, we found that might due to an unbalanced irq disable in vop driver. my test patch is: +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c @@ -1601,6 +1601,8 @@ static void vop_unbind(struct device *dev, struct device *master, void *data) { struct vop *vop = dev_get_drvdata(dev); + // Pair with the initial disable_irq() + enable_irq(vop->irq); and i think sandy would send the real patch soon(maybe already sent?) Works fine with your patch! Thank you very much Jeffy. Best regards, Jakob
Re: [regression, bisected] rockchip rk3399 video output breakage
On 24.04.18 15:33, JeffyChen wrote: [ 36.076577] WARNING: CPU: 1 PID: 83 at drivers/gpu/drm/rockchip/rockchip_drm_vop.c:1004 vop_crtc_atomic_flush+0x1c0/0x1c8 this looks like an issue recently reported by heiko, we found that might due to an unbalanced irq disable in vop driver. my test patch is: +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c @@ -1601,6 +1601,8 @@ static void vop_unbind(struct device *dev, struct device *master, void *data) { struct vop *vop = dev_get_drvdata(dev); + // Pair with the initial disable_irq() + enable_irq(vop->irq); and i think sandy would send the real patch soon(maybe already sent?) Works fine with your patch! Thank you very much Jeffy. Best regards, Jakob
Re: [regression, bisected] rockchip rk3399 video output breakage
On 24.04.18 14:37, JeffyChen wrote: right, i think it's a known issue, as the iommu failed to get clks: [ 1.525153] rk_iommu ff8f3f00.iommu: Failed to get clk 'iface': -2 [ 1.525316] rk_iommu: probe of ff8f3f00.iommu failed with error -2 [ 1.525484] rk_iommu ff903f00.iommu: Failed to get clk 'iface': -2 [ 1.525643] rk_iommu: probe of ff903f00.iommu failed with error -2 there's a followed patch to add those clks to the dtsi, and also a fix patch to make those clks optional(by heiko): https://www.spinics.net/lists/arm-kernel/msg645696.html http://lists.infradead.org/pipermail/linux-rockchip/2018-April/020349.html Thanks, I tested both, and both get the screen to display some kernel output! However, I am getting some nasty error messages and the screen seems to refresh only once every 30 seconds: [ 15.586502] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:30:crtc-0] flip_done timed out [ 25.826490] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CONNECTOR:37:HDMI-A-1] flip_done timed out [ 36.066490] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [PLANE:28:plane-0] flip_done timed out [ 36.066504] [drm:drm_calc_timestamping_constants] crtc 30: hwmode: htotal 2200, vtotal 1125, vdisplay 1080 [ 36.066508] [drm:drm_calc_timestamping_constants] crtc 30: clock 148500 kHz framedur 1666 linedur 14814 [ 36.076535] rockchip-vop ff8f.vop: [drm:vop_crtc_atomic_flush] *ERROR* VOP vblank IRQ stuck for 10 ms [ 36.076577] WARNING: CPU: 1 PID: 83 at drivers/gpu/drm/rockchip/rockchip_drm_vop.c:1004 vop_crtc_atomic_flush+0x1c0/0x1c8 Full dmesg with patch "arm64: dts: rockchip: add clocks in iommu nodes": https://gist.github.com/jakob-tsd/3fd49894d52dcd8a409eb9e6136b2d39 Full dmesg with patch "iommu/rockchip: make clock handling optional": https://gist.github.com/jakob-tsd/da96572a40d11f0f6dff3ee481098138 (looks the same) Thanks, Jakob
Re: [regression, bisected] rockchip rk3399 video output breakage
On 24.04.18 14:37, JeffyChen wrote: right, i think it's a known issue, as the iommu failed to get clks: [ 1.525153] rk_iommu ff8f3f00.iommu: Failed to get clk 'iface': -2 [ 1.525316] rk_iommu: probe of ff8f3f00.iommu failed with error -2 [ 1.525484] rk_iommu ff903f00.iommu: Failed to get clk 'iface': -2 [ 1.525643] rk_iommu: probe of ff903f00.iommu failed with error -2 there's a followed patch to add those clks to the dtsi, and also a fix patch to make those clks optional(by heiko): https://www.spinics.net/lists/arm-kernel/msg645696.html http://lists.infradead.org/pipermail/linux-rockchip/2018-April/020349.html Thanks, I tested both, and both get the screen to display some kernel output! However, I am getting some nasty error messages and the screen seems to refresh only once every 30 seconds: [ 15.586502] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:30:crtc-0] flip_done timed out [ 25.826490] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CONNECTOR:37:HDMI-A-1] flip_done timed out [ 36.066490] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [PLANE:28:plane-0] flip_done timed out [ 36.066504] [drm:drm_calc_timestamping_constants] crtc 30: hwmode: htotal 2200, vtotal 1125, vdisplay 1080 [ 36.066508] [drm:drm_calc_timestamping_constants] crtc 30: clock 148500 kHz framedur 1666 linedur 14814 [ 36.076535] rockchip-vop ff8f.vop: [drm:vop_crtc_atomic_flush] *ERROR* VOP vblank IRQ stuck for 10 ms [ 36.076577] WARNING: CPU: 1 PID: 83 at drivers/gpu/drm/rockchip/rockchip_drm_vop.c:1004 vop_crtc_atomic_flush+0x1c0/0x1c8 Full dmesg with patch "arm64: dts: rockchip: add clocks in iommu nodes": https://gist.github.com/jakob-tsd/3fd49894d52dcd8a409eb9e6136b2d39 Full dmesg with patch "iommu/rockchip: make clock handling optional": https://gist.github.com/jakob-tsd/da96572a40d11f0f6dff3ee481098138 (looks the same) Thanks, Jakob
[regression, bisected] rockchip rk3399 video output breakage
I am working on getting HDMI output enabled in mainline Linux for our RK3399-Q7 module. It works fine on v4.16, but testing with v4.17-rc2 I get this, and the screen stays black: [7.142712] alloc_contig_range: [7f061, 7f062) PFNs busy [7.148862] alloc_contig_range: [7f066, 7f067) PFNs busy [7.155041] alloc_contig_range: [7f067, 7f068) PFNs busy [7.161139] alloc_contig_range: [7f070, 7f071) PFNs busy [7.167295] alloc_contig_range: [7f071, 7f072) PFNs busy [7.173413] alloc_contig_range: [7f072, 7f073) PFNs busy [7.181387] alloc_contig_range: [7f073, 7f074) PFNs busy [7.190623] alloc_contig_range: [7f074, 7f075) PFNs busy [7.196668] alloc_contig_range: [7f075, 7f076) PFNs busy [7.202793] alloc_contig_range: [7f076, 7f077) PFNs busy [7.254911] hub 7-1:1.0: USB hub found [7.259878] hub 7-1:1.0: 4 ports detected [7.322543] rockchip-drm display-subsystem: master bind failed: -12 [7.329619] rockchip-vop: probe of ff90.vop failed with error -12 Full dmesg: https://gist.github.com/jakob-tsd/33cf395e355bf9bb6956c36438d999e7 I have bisected the "master bind failed" down to: commit 9176a303d971dc0fb35469c531c0d263667d2277 Author: Jeffy ChenDate: Fri Mar 23 15:38:10 2018 +0800 iommu/rockchip: Use IOMMU device for dma mapping operations Use the first registered IOMMU device for dma mapping operations, and drop the domain platform device. This is similar to exynos iommu driver. Signed-off-by: Jeffy Chen Reviewed-by: Tomasz Figa Reviewed-by: Robin Murphy Signed-off-by: Joerg Roedel Moving to one commit earlier brings the screen to life. Just with colorful garbage, but I guess that's a different problem. Is this a known issue with the IOMMU change? Thanks, Jakob
[regression, bisected] rockchip rk3399 video output breakage
I am working on getting HDMI output enabled in mainline Linux for our RK3399-Q7 module. It works fine on v4.16, but testing with v4.17-rc2 I get this, and the screen stays black: [7.142712] alloc_contig_range: [7f061, 7f062) PFNs busy [7.148862] alloc_contig_range: [7f066, 7f067) PFNs busy [7.155041] alloc_contig_range: [7f067, 7f068) PFNs busy [7.161139] alloc_contig_range: [7f070, 7f071) PFNs busy [7.167295] alloc_contig_range: [7f071, 7f072) PFNs busy [7.173413] alloc_contig_range: [7f072, 7f073) PFNs busy [7.181387] alloc_contig_range: [7f073, 7f074) PFNs busy [7.190623] alloc_contig_range: [7f074, 7f075) PFNs busy [7.196668] alloc_contig_range: [7f075, 7f076) PFNs busy [7.202793] alloc_contig_range: [7f076, 7f077) PFNs busy [7.254911] hub 7-1:1.0: USB hub found [7.259878] hub 7-1:1.0: 4 ports detected [7.322543] rockchip-drm display-subsystem: master bind failed: -12 [7.329619] rockchip-vop: probe of ff90.vop failed with error -12 Full dmesg: https://gist.github.com/jakob-tsd/33cf395e355bf9bb6956c36438d999e7 I have bisected the "master bind failed" down to: commit 9176a303d971dc0fb35469c531c0d263667d2277 Author: Jeffy Chen Date: Fri Mar 23 15:38:10 2018 +0800 iommu/rockchip: Use IOMMU device for dma mapping operations Use the first registered IOMMU device for dma mapping operations, and drop the domain platform device. This is similar to exynos iommu driver. Signed-off-by: Jeffy Chen Reviewed-by: Tomasz Figa Reviewed-by: Robin Murphy Signed-off-by: Joerg Roedel Moving to one commit earlier brings the screen to life. Just with colorful garbage, but I guess that's a different problem. Is this a known issue with the IOMMU change? Thanks, Jakob
[PATCH] can: dev: increase bus-off message severity
bus-off is usually caused by hardware malfunction or configuration error (baud rate mismatch) and causes a complete loss of communication. Increase the "bus-off" message's severity from netdev_dbg to netdev_info to make it visible to the user. A can interface going into bus-off is similar in severity to ethernet's "Link is Down" message, which is also printed at info level. It is debatable whether the the "restarted" message should also be changed to netdev_info to make the interface state changes comprehensible from the kernel log. I have chosen to keep the "restarted" message at dbg for now as the "bus-off" message should be enough for the user to notice and investigate the problem. Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzac...@theobroma-systems.com> Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- drivers/net/can/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index b1779566c5bb..3c71f1cb205f 100644 --- a/drivers/net/can/dev.c +++ b/drivers/net/can/dev.c @@ -605,7 +605,7 @@ void can_bus_off(struct net_device *dev) { struct can_priv *priv = netdev_priv(dev); - netdev_dbg(dev, "bus-off\n"); + netdev_info(dev, "bus-off\n"); netif_carrier_off(dev); -- 2.11.0
[PATCH] can: dev: increase bus-off message severity
bus-off is usually caused by hardware malfunction or configuration error (baud rate mismatch) and causes a complete loss of communication. Increase the "bus-off" message's severity from netdev_dbg to netdev_info to make it visible to the user. A can interface going into bus-off is similar in severity to ethernet's "Link is Down" message, which is also printed at info level. It is debatable whether the the "restarted" message should also be changed to netdev_info to make the interface state changes comprehensible from the kernel log. I have chosen to keep the "restarted" message at dbg for now as the "bus-off" message should be enough for the user to notice and investigate the problem. Signed-off-by: Jakob Unterwurzacher Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- drivers/net/can/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index b1779566c5bb..3c71f1cb205f 100644 --- a/drivers/net/can/dev.c +++ b/drivers/net/can/dev.c @@ -605,7 +605,7 @@ void can_bus_off(struct net_device *dev) { struct can_priv *priv = netdev_priv(dev); - netdev_dbg(dev, "bus-off\n"); + netdev_info(dev, "bus-off\n"); netif_carrier_off(dev); -- 2.11.0
[PATCH v5 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
The UCAN driver supports the microcontroller-based USB/CAN adapters from Theobroma Systems. There are two form-factors that run essentially the same firmware: * Seal: standalone USB stick ( https://www.theobroma-systems.com/seal ) * Mule: integrated on the PCB of various System-on-Modules from Theobroma Systems like the A31-µQ7 and the RK3399-Q7 ( https://www.theobroma-systems.com/rk3399-q7 ) The USB wire protocol has been designed to be as generic and hardware-indendent as possible in the hope of being useful for implementation on other microcontrollers. Signed-off-by: Martin Elshuber <martin.elshu...@theobroma-systems.com> Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzac...@theobroma-systems.com> Signed-off-by: Philipp Tomsich <philipp.toms...@theobroma-systems.com> Acked-by: Wolfgang Grandegger <w...@grandegger.com> --- Documentation/networking/can_ucan_protocol.rst | 332 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1613 5 files changed, 1957 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c diff --git a/Documentation/networking/can_ucan_protocol.rst b/Documentation/networking/can_ucan_protocol.rst new file mode 100644 index ..4cef88d24fc7 --- /dev/null +++ b/Documentation/networking/can_ucan_protocol.rst @@ -0,0 +1,332 @@ += +The UCAN Protocol += + +UCAN is the protocol used by the microcontroller-based USB-CAN +adapter that is integrated on System-on-Modules from Theobroma Systems +and that is also available as a standalone USB stick. + +The UCAN protocol has been designed to be hardware-independent. +It is modeled closely after how Linux represents CAN devices +internally. All multi-byte integers are encoded as Little Endian. + +All structures mentioned in this document are defined in +``drivers/net/can/usb/ucan.c``. + +USB Endpoints += + +UCAN devices use three USB endpoints: + +CONTROL endpoint + The driver sends device management commands on this endpoint + +IN endpoint + The device sends CAN data frames and CAN error frames + +OUT endpoint + The driver sends CAN data frames on the out endpoint + + +CONTROL Messages + + +UCAN devices are configured using vendor requests on the control pipe. + +To support multiple CAN interfaces in a single USB device all +configuration commands target the corresponding interface in the USB +descriptor. + +The driver uses ``ucan_ctrl_command_in/out`` and +``ucan_device_request_in`` to deliver commands to the device. + +Setup Packet + + += = +``bmRequestType`` Direction | Vendor | (Interface or Device) +``bRequest`` Command Number +``wValue`` Subcommand Number (16 Bit) or 0 if not used +``wIndex`` USB Interface Index (0 for device commands) +``wLength``* Host to Device - Number of bytes to transmit + * Device to Host - Maximum Number of bytes to + receive. If the device send less. Commom ZLP + semantics are used. += = + +Error Handling +-- + +The device indicates failed control commands by stalling the +pipe. + +Device Commands +--- + +UCAN_DEVICE_GET_FW_STRING +~ + +*Dev2Host; optional* + +Request the device firmware string. + + +Interface Commands +-- + +UCAN_COMMAND_START +~~ + +*Host2Dev; mandatory* + +Bring the CAN interface up. + +Payload Format + ``ucan_ctl_payload_t.cmd_start`` + + +mode or mask of ``UCAN_MODE_*`` + + +UCAN_COMMAND_STOP +~~ + +*Host2Dev; mandatory* + +Stop the CAN interface + +Payload Format + *empty* + +UCAN_COMMAND_RESET +~~ + +*Host2Dev; mandatory* + +Reset the CAN controller (including error counters) + +Payload Format + *empty* + +UCAN_COMMAND_GET + + +*Host2Dev; mandatory* + +Get Information from the Device + +Subcommands +^^^ + +UCAN_COMMAND_GET_INFO + Request the device information structure ``ucan_ctl_payload_t.device_info``. + + See the ``device_info`` field for details, and + ``uapi/linux/can/netlink.h`` for an explanation of the + ``can_bittiming fields``. + + Payload Format +``ucan_ctl_payload_t.device_info`` + +UCAN_COMMAND_GET_PROTOCOL_VERSION + + Request the device protocol version + ``ucan_ctl_payload_t.protocol_version``. The current protocol version is 3. + + Payload Format +``ucan_ctl_payload_t.protocol_version`` + +.. note:: Devices that do not implement
[PATCH v5 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
The UCAN driver supports the microcontroller-based USB/CAN adapters from Theobroma Systems. There are two form-factors that run essentially the same firmware: * Seal: standalone USB stick ( https://www.theobroma-systems.com/seal ) * Mule: integrated on the PCB of various System-on-Modules from Theobroma Systems like the A31-µQ7 and the RK3399-Q7 ( https://www.theobroma-systems.com/rk3399-q7 ) The USB wire protocol has been designed to be as generic and hardware-indendent as possible in the hope of being useful for implementation on other microcontrollers. Signed-off-by: Martin Elshuber Signed-off-by: Jakob Unterwurzacher Signed-off-by: Philipp Tomsich Acked-by: Wolfgang Grandegger --- Documentation/networking/can_ucan_protocol.rst | 332 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1613 5 files changed, 1957 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c diff --git a/Documentation/networking/can_ucan_protocol.rst b/Documentation/networking/can_ucan_protocol.rst new file mode 100644 index ..4cef88d24fc7 --- /dev/null +++ b/Documentation/networking/can_ucan_protocol.rst @@ -0,0 +1,332 @@ += +The UCAN Protocol += + +UCAN is the protocol used by the microcontroller-based USB-CAN +adapter that is integrated on System-on-Modules from Theobroma Systems +and that is also available as a standalone USB stick. + +The UCAN protocol has been designed to be hardware-independent. +It is modeled closely after how Linux represents CAN devices +internally. All multi-byte integers are encoded as Little Endian. + +All structures mentioned in this document are defined in +``drivers/net/can/usb/ucan.c``. + +USB Endpoints += + +UCAN devices use three USB endpoints: + +CONTROL endpoint + The driver sends device management commands on this endpoint + +IN endpoint + The device sends CAN data frames and CAN error frames + +OUT endpoint + The driver sends CAN data frames on the out endpoint + + +CONTROL Messages + + +UCAN devices are configured using vendor requests on the control pipe. + +To support multiple CAN interfaces in a single USB device all +configuration commands target the corresponding interface in the USB +descriptor. + +The driver uses ``ucan_ctrl_command_in/out`` and +``ucan_device_request_in`` to deliver commands to the device. + +Setup Packet + + += = +``bmRequestType`` Direction | Vendor | (Interface or Device) +``bRequest`` Command Number +``wValue`` Subcommand Number (16 Bit) or 0 if not used +``wIndex`` USB Interface Index (0 for device commands) +``wLength``* Host to Device - Number of bytes to transmit + * Device to Host - Maximum Number of bytes to + receive. If the device send less. Commom ZLP + semantics are used. += = + +Error Handling +-- + +The device indicates failed control commands by stalling the +pipe. + +Device Commands +--- + +UCAN_DEVICE_GET_FW_STRING +~ + +*Dev2Host; optional* + +Request the device firmware string. + + +Interface Commands +-- + +UCAN_COMMAND_START +~~ + +*Host2Dev; mandatory* + +Bring the CAN interface up. + +Payload Format + ``ucan_ctl_payload_t.cmd_start`` + + +mode or mask of ``UCAN_MODE_*`` + + +UCAN_COMMAND_STOP +~~ + +*Host2Dev; mandatory* + +Stop the CAN interface + +Payload Format + *empty* + +UCAN_COMMAND_RESET +~~ + +*Host2Dev; mandatory* + +Reset the CAN controller (including error counters) + +Payload Format + *empty* + +UCAN_COMMAND_GET + + +*Host2Dev; mandatory* + +Get Information from the Device + +Subcommands +^^^ + +UCAN_COMMAND_GET_INFO + Request the device information structure ``ucan_ctl_payload_t.device_info``. + + See the ``device_info`` field for details, and + ``uapi/linux/can/netlink.h`` for an explanation of the + ``can_bittiming fields``. + + Payload Format +``ucan_ctl_payload_t.device_info`` + +UCAN_COMMAND_GET_PROTOCOL_VERSION + + Request the device protocol version + ``ucan_ctl_payload_t.protocol_version``. The current protocol version is 3. + + Payload Format +``ucan_ctl_payload_t.protocol_version`` + +.. note:: Devices that do not implement this command use the old + protocol version 1 + +UCAN_COMMAND_SET_BITTIMING +~~ + +*Host2Dev; mandatory* + +Setup bittiming by sending
[PATCH v5 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
This is v5 of the Theobroma Systems CAN/USB "UCAN" adapter driver upstreaming effort. v4 -> v5 changes: * describe bus-off handling in rst documentation * fix inverted if in tx_complete * try to fit overlong strings in 80 chars * use early returns instead of goto where possible * add spinlock around can_put/get/free_echo skb * remove BUS-OFF printk * fix missing true path in ucan_release_context * add ACK from Wolfgang Grandegger v3 -> v4 changes: * get rid of a few repeated le16_to_cpu casts by storing the value once * fix canid masking logic * drop __func__ from log messages. Use netdev_* where possible, use UCAN_DRIVER_NAME where not. * drop device report log output (data is available via ip) * fix issues catched by sparse v2 -> v3 changes: * count error frames as data packets * use canid_t for all can ids * use BIT(x) instead of (1 << x) * use __le16 / __le32 for little-endian fields * add spinlock around context allocation (fixes a possible race) * fix comment style * use WARN_ON return value * fix state logic bug that did not allow return to ERROR_ACTIVE * drop echo_index from context_array (not needed) * rename "tx_contexts" -> "context_array" to prevent confusion * add __func__ to all errors and warnings, and to info where it made sense Jakob Unterwurzacher (1): can: ucan: add driver for Theobroma Systems UCAN devices Documentation/networking/can_ucan_protocol.rst | 332 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1613 5 files changed, 1957 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c -- 2.11.0 Cc: Martin Elshuber <martin.elshu...@theobroma-systems.com> Cc: Philipp Tomsich <philipp.toms...@theobroma-systems.com> Cc: Wolfgang Grandegger <w...@grandegger.com> Cc: Marc Kleine-Budde <m...@pengutronix.de> Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org
[PATCH v5 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
This is v5 of the Theobroma Systems CAN/USB "UCAN" adapter driver upstreaming effort. v4 -> v5 changes: * describe bus-off handling in rst documentation * fix inverted if in tx_complete * try to fit overlong strings in 80 chars * use early returns instead of goto where possible * add spinlock around can_put/get/free_echo skb * remove BUS-OFF printk * fix missing true path in ucan_release_context * add ACK from Wolfgang Grandegger v3 -> v4 changes: * get rid of a few repeated le16_to_cpu casts by storing the value once * fix canid masking logic * drop __func__ from log messages. Use netdev_* where possible, use UCAN_DRIVER_NAME where not. * drop device report log output (data is available via ip) * fix issues catched by sparse v2 -> v3 changes: * count error frames as data packets * use canid_t for all can ids * use BIT(x) instead of (1 << x) * use __le16 / __le32 for little-endian fields * add spinlock around context allocation (fixes a possible race) * fix comment style * use WARN_ON return value * fix state logic bug that did not allow return to ERROR_ACTIVE * drop echo_index from context_array (not needed) * rename "tx_contexts" -> "context_array" to prevent confusion * add __func__ to all errors and warnings, and to info where it made sense Jakob Unterwurzacher (1): can: ucan: add driver for Theobroma Systems UCAN devices Documentation/networking/can_ucan_protocol.rst | 332 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1613 5 files changed, 1957 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c -- 2.11.0 Cc: Martin Elshuber Cc: Philipp Tomsich Cc: Wolfgang Grandegger Cc: Marc Kleine-Budde Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org
[PATCH v4 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
This is v4 of the Theobroma Systems CAN/USB "UCAN" adapter driver upstreaming effort. v3 -> v4 changes: * get rid of a few repeated le16_to_cpu casts by storing the value once * fix canid masking logic * drop __func__ from log messages. Use netdev_* where possible, use UCAN_DRIVER_NAME where not. * drop device report log output (data is available via ip) * fix issues catched by sparse v2 -> v3 changes: * count error frames as data packets * use canid_t for all can ids * use BIT(x) instead of (1 << x) * use __le16 / __le32 for little-endian fields * add spinlock around context allocation (fixes a possible race) * fix comment style * use WARN_ON return value * fix state logic bug that did not allow return to ERROR_ACTIVE * drop echo_index from context_array (not needed) * rename "tx_contexts" -> "context_array" to prevent confusion * add __func__ to all errors and warnings, and to info where it made sense Jakob Unterwurzacher (1): can: ucan: add driver for Theobroma Systems UCAN devices Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1596 5 files changed, 1923 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c -- 2.11.0 Cc: Martin Elshuber <martin.elshu...@theobroma-systems.com> Cc: Philipp Tomsich <philipp.toms...@theobroma-systems.com> Cc: Wolfgang Grandegger <w...@grandegger.com> Cc: Marc Kleine-Budde <m...@pengutronix.de> Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org
[PATCH v4 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
The UCAN driver supports the microcontroller-based USB/CAN adapters from Theobroma Systems. There are two form-factors that run essentially the same firmware: * Seal: standalone USB stick ( https://www.theobroma-systems.com/seal ) * Mule: integrated on the PCB of various System-on-Modules from Theobroma Systems like the A31-µQ7 and the RK3399-Q7 ( https://www.theobroma-systems.com/rk3399-q7 ) The USB wire protocol has been designed to be as generic and hardware-indendent as possible in the hope of being useful for implementation on other microcontrollers. Signed-off-by: Martin Elshuber <martin.elshu...@theobroma-systems.com> Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzac...@theobroma-systems.com> Signed-off-by: Philipp Tomsich <philipp.toms...@theobroma-systems.com> --- Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1596 5 files changed, 1923 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c diff --git a/Documentation/networking/can_ucan_protocol.rst b/Documentation/networking/can_ucan_protocol.rst new file mode 100644 index ..d859b36200b4 --- /dev/null +++ b/Documentation/networking/can_ucan_protocol.rst @@ -0,0 +1,315 @@ += +The UCAN Protocol += + +UCAN is the protocol used by the microcontroller-based USB-CAN +adapter that is integrated on System-on-Modules from Theobroma Systems +and that is also available as a standalone USB stick. + +The UCAN protocol has been designed to be hardware-independent. +It is modeled closely after how Linux represents CAN devices +internally. All multi-byte integers are encoded as Little Endian. + +All structures mentioned in this document are defined in +``drivers/net/can/usb/ucan.c``. + +USB Endpoints += + +UCAN devices use three USB endpoints: + +CONTROL endpoint + The driver sends device management commands on this endpoint + +IN endpoint + The device sends CAN data frames and CAN error frames + +OUT endpoint + The driver sends CAN data frames on the out endpoint + + +CONTROL Messages + + +UCAN devices are configured using vendor requests on the control pipe. + +To support multiple CAN interfaces in a single USB device all +configuration commands target the corresponding interface in the USB +descriptor. + +The driver uses ``ucan_ctrl_command_in/out`` and +``ucan_device_request_in`` to deliver commands to the device. + +Setup Packet + + += = +``bmRequestType`` Direction | Vendor | (Interface or Device) +``bRequest`` Command Number +``wValue`` Subcommand Number (16 Bit) or 0 if not used +``wIndex`` USB Interface Index (0 for device commands) +``wLength``* Host to Device - Number of bytes to transmit + * Device to Host - Maximum Number of bytes to + receive. If the device send less. Commom ZLP + semantics are used. += = + +Error Handling +-- + +The device indicates failed control commands by stalling the +pipe. + +Device Commands +--- + +UCAN_DEVICE_GET_FW_STRING +~ + +*Dev2Host; optional* + +Request the device firmware string. + + +Interface Commands +-- + +UCAN_COMMAND_START +~~ + +*Host2Dev; mandatory* + +Bring the CAN interface up. + +Payload Format + ``ucan_ctl_payload_t.cmd_start`` + + +mode or mask of ``UCAN_MODE_*`` + + +UCAN_COMMAND_STOP +~~ + +*Host2Dev; mandatory* + +Stop the CAN interface + +Payload Format + *empty* + +UCAN_COMMAND_RESET +~~ + +*Host2Dev; mandatory* + +Reset the CAN controller (including error counters) + +Payload Format + *empty* + +UCAN_COMMAND_GET + + +*Host2Dev; mandatory* + +Get Information from the Device + +Subcommands +^^^ + +UCAN_COMMAND_GET_INFO + Request the device information structure ``ucan_ctl_payload_t.device_info``. + + See the ``device_info`` field for details, and + ``uapi/linux/can/netlink.h`` for an explanation of the + ``can_bittiming fields``. + + Payload Format +``ucan_ctl_payload_t.device_info`` + +UCAN_COMMAND_GET_PROTOCOL_VERSION + + Request the device protocol version + ``ucan_ctl_payload_t.protocol_version``. The current protocol version is 3. + + Payload Format +``ucan_ctl_payload_t.protocol_version`` + +.. note:: Devices that do not implement this command use the old + protocol version 1 + +UCAN_COM
[PATCH v4 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
This is v4 of the Theobroma Systems CAN/USB "UCAN" adapter driver upstreaming effort. v3 -> v4 changes: * get rid of a few repeated le16_to_cpu casts by storing the value once * fix canid masking logic * drop __func__ from log messages. Use netdev_* where possible, use UCAN_DRIVER_NAME where not. * drop device report log output (data is available via ip) * fix issues catched by sparse v2 -> v3 changes: * count error frames as data packets * use canid_t for all can ids * use BIT(x) instead of (1 << x) * use __le16 / __le32 for little-endian fields * add spinlock around context allocation (fixes a possible race) * fix comment style * use WARN_ON return value * fix state logic bug that did not allow return to ERROR_ACTIVE * drop echo_index from context_array (not needed) * rename "tx_contexts" -> "context_array" to prevent confusion * add __func__ to all errors and warnings, and to info where it made sense Jakob Unterwurzacher (1): can: ucan: add driver for Theobroma Systems UCAN devices Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1596 5 files changed, 1923 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c -- 2.11.0 Cc: Martin Elshuber Cc: Philipp Tomsich Cc: Wolfgang Grandegger Cc: Marc Kleine-Budde Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org
[PATCH v4 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
The UCAN driver supports the microcontroller-based USB/CAN adapters from Theobroma Systems. There are two form-factors that run essentially the same firmware: * Seal: standalone USB stick ( https://www.theobroma-systems.com/seal ) * Mule: integrated on the PCB of various System-on-Modules from Theobroma Systems like the A31-µQ7 and the RK3399-Q7 ( https://www.theobroma-systems.com/rk3399-q7 ) The USB wire protocol has been designed to be as generic and hardware-indendent as possible in the hope of being useful for implementation on other microcontrollers. Signed-off-by: Martin Elshuber Signed-off-by: Jakob Unterwurzacher Signed-off-by: Philipp Tomsich --- Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1596 5 files changed, 1923 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c diff --git a/Documentation/networking/can_ucan_protocol.rst b/Documentation/networking/can_ucan_protocol.rst new file mode 100644 index ..d859b36200b4 --- /dev/null +++ b/Documentation/networking/can_ucan_protocol.rst @@ -0,0 +1,315 @@ += +The UCAN Protocol += + +UCAN is the protocol used by the microcontroller-based USB-CAN +adapter that is integrated on System-on-Modules from Theobroma Systems +and that is also available as a standalone USB stick. + +The UCAN protocol has been designed to be hardware-independent. +It is modeled closely after how Linux represents CAN devices +internally. All multi-byte integers are encoded as Little Endian. + +All structures mentioned in this document are defined in +``drivers/net/can/usb/ucan.c``. + +USB Endpoints += + +UCAN devices use three USB endpoints: + +CONTROL endpoint + The driver sends device management commands on this endpoint + +IN endpoint + The device sends CAN data frames and CAN error frames + +OUT endpoint + The driver sends CAN data frames on the out endpoint + + +CONTROL Messages + + +UCAN devices are configured using vendor requests on the control pipe. + +To support multiple CAN interfaces in a single USB device all +configuration commands target the corresponding interface in the USB +descriptor. + +The driver uses ``ucan_ctrl_command_in/out`` and +``ucan_device_request_in`` to deliver commands to the device. + +Setup Packet + + += = +``bmRequestType`` Direction | Vendor | (Interface or Device) +``bRequest`` Command Number +``wValue`` Subcommand Number (16 Bit) or 0 if not used +``wIndex`` USB Interface Index (0 for device commands) +``wLength``* Host to Device - Number of bytes to transmit + * Device to Host - Maximum Number of bytes to + receive. If the device send less. Commom ZLP + semantics are used. += = + +Error Handling +-- + +The device indicates failed control commands by stalling the +pipe. + +Device Commands +--- + +UCAN_DEVICE_GET_FW_STRING +~ + +*Dev2Host; optional* + +Request the device firmware string. + + +Interface Commands +-- + +UCAN_COMMAND_START +~~ + +*Host2Dev; mandatory* + +Bring the CAN interface up. + +Payload Format + ``ucan_ctl_payload_t.cmd_start`` + + +mode or mask of ``UCAN_MODE_*`` + + +UCAN_COMMAND_STOP +~~ + +*Host2Dev; mandatory* + +Stop the CAN interface + +Payload Format + *empty* + +UCAN_COMMAND_RESET +~~ + +*Host2Dev; mandatory* + +Reset the CAN controller (including error counters) + +Payload Format + *empty* + +UCAN_COMMAND_GET + + +*Host2Dev; mandatory* + +Get Information from the Device + +Subcommands +^^^ + +UCAN_COMMAND_GET_INFO + Request the device information structure ``ucan_ctl_payload_t.device_info``. + + See the ``device_info`` field for details, and + ``uapi/linux/can/netlink.h`` for an explanation of the + ``can_bittiming fields``. + + Payload Format +``ucan_ctl_payload_t.device_info`` + +UCAN_COMMAND_GET_PROTOCOL_VERSION + + Request the device protocol version + ``ucan_ctl_payload_t.protocol_version``. The current protocol version is 3. + + Payload Format +``ucan_ctl_payload_t.protocol_version`` + +.. note:: Devices that do not implement this command use the old + protocol version 1 + +UCAN_COMMAND_SET_BITTIMING +~~ + +*Host2Dev; mandatory* + +Setup bittiming by sending the the structure
Re: [PATCH v3 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 23.03.18 11:04, Wolfgang Grandegger wrote: But I'm open to other suggestions (use a fixed "ucan: " prefix?) or to drop it entirely if you think it is not worth it. But there is already a device prefix, e.g.: peak_usb 1-6:1.0: PEAK-System PCAN-USB adapter hwrev 28 serial (1 channel) peak_usb 1-6:1.0 can0: attached to PCAN-USB channel 0 (device 255) Interesting. Looks like the UCAN driver is missing something to make this work: [ 67.792947] usb 5-1.4: ucan_probe: registered UCAN device at can0 I'll take a closer look. Thanks, Jakob
Re: [PATCH v3 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 23.03.18 11:04, Wolfgang Grandegger wrote: But I'm open to other suggestions (use a fixed "ucan: " prefix?) or to drop it entirely if you think it is not worth it. But there is already a device prefix, e.g.: peak_usb 1-6:1.0: PEAK-System PCAN-USB adapter hwrev 28 serial (1 channel) peak_usb 1-6:1.0 can0: attached to PCAN-USB channel 0 (device 255) Interesting. Looks like the UCAN driver is missing something to make this work: [ 67.792947] usb 5-1.4: ucan_probe: registered UCAN device at can0 I'll take a closer look. Thanks, Jakob
Re: [PATCH v3 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 23.03.18 09:32, Wolfgang Grandegger wrote: * add __func__ to all errors and warnings, and to info where it made sense The final output messages in the driver should especially be useful for the end user... and not the developer! This is also true for the function names. You already use more "__func__" than all other CAN drivers together. Just my opinion! The idea was to make it clear which driver printed the message. In my opinion, this is a problem: drivers/net/can/usb$ git grep "No memory left for USB buffer" ems_usb.c: netdev_err(netdev, "No memory left for USB buffer\n"); ems_usb.c: netdev_err(netdev, "No memory left for USB buffer\n"); esd_usb2.c: "No memory left for USB buffer\n"); esd_usb2.c: netdev_err(netdev, "No memory left for USB buffer\n"); gs_usb.c: netdev_err(netdev, "No memory left for USB buffer\n"); gs_usb.c: "No memory left for USB buffer\n"); kvaser_usb.c:"No memory left for USB buffer\n"); mcba_usb.c: netdev_err(netdev, "No memory left for USB buffer\n"); usb_8dev.c: netdev_err(netdev, "No memory left for USB buffer\n"); usb_8dev.c: netdev_err(netdev, "No memory left for USB buffer\n"); But I'm open to other suggestions (use a fixed "ucan: " prefix?) or to drop it entirely if you think it is not worth it. Thanks for the feedback, Jakob
Re: [PATCH v3 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 23.03.18 09:32, Wolfgang Grandegger wrote: * add __func__ to all errors and warnings, and to info where it made sense The final output messages in the driver should especially be useful for the end user... and not the developer! This is also true for the function names. You already use more "__func__" than all other CAN drivers together. Just my opinion! The idea was to make it clear which driver printed the message. In my opinion, this is a problem: drivers/net/can/usb$ git grep "No memory left for USB buffer" ems_usb.c: netdev_err(netdev, "No memory left for USB buffer\n"); ems_usb.c: netdev_err(netdev, "No memory left for USB buffer\n"); esd_usb2.c: "No memory left for USB buffer\n"); esd_usb2.c: netdev_err(netdev, "No memory left for USB buffer\n"); gs_usb.c: netdev_err(netdev, "No memory left for USB buffer\n"); gs_usb.c: "No memory left for USB buffer\n"); kvaser_usb.c:"No memory left for USB buffer\n"); mcba_usb.c: netdev_err(netdev, "No memory left for USB buffer\n"); usb_8dev.c: netdev_err(netdev, "No memory left for USB buffer\n"); usb_8dev.c: netdev_err(netdev, "No memory left for USB buffer\n"); But I'm open to other suggestions (use a fixed "ucan: " prefix?) or to drop it entirely if you think it is not worth it. Thanks for the feedback, Jakob
[PATCH v3 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
This is v3 of the Theobroma Systems CAN/USB adapter driver upstreaming effort. Featured v2 -> v3 changes: * count error frames as data packets * use canid_t for all can ids * use BIT(x) instead of (1 << x) * use __le16 / __le32 for little-endian fields * add spinlock around context allocation (fixes a possible race) * fix comment style * use WARN_ON return value * fix state logic bug that did not allow return to ERROR_ACTIVE * drop echo_index from context_array (not needed) * rename "tx_contexts" -> "context_array" to prevent confusion * add __func__ to all errors and warnings, and to info where it made sense Jakob Unterwurzacher (1): can: ucan: add driver for Theobroma Systems UCAN devices Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1628 5 files changed, 1955 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c -- 2.11.0 Cc: Martin Elshuber <martin.elshu...@theobroma-systems.com> Cc: Philipp Tomsich <philipp.toms...@theobroma-systems.com> Cc: Wolfgang Grandegger <w...@grandegger.com> Cc: Marc Kleine-Budde <m...@pengutronix.de> Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org
[PATCH v3 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
This is v3 of the Theobroma Systems CAN/USB adapter driver upstreaming effort. Featured v2 -> v3 changes: * count error frames as data packets * use canid_t for all can ids * use BIT(x) instead of (1 << x) * use __le16 / __le32 for little-endian fields * add spinlock around context allocation (fixes a possible race) * fix comment style * use WARN_ON return value * fix state logic bug that did not allow return to ERROR_ACTIVE * drop echo_index from context_array (not needed) * rename "tx_contexts" -> "context_array" to prevent confusion * add __func__ to all errors and warnings, and to info where it made sense Jakob Unterwurzacher (1): can: ucan: add driver for Theobroma Systems UCAN devices Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1628 5 files changed, 1955 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c -- 2.11.0 Cc: Martin Elshuber Cc: Philipp Tomsich Cc: Wolfgang Grandegger Cc: Marc Kleine-Budde Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org
[PATCH v3 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
The UCAN driver supports the microcontroller-based USB/CAN adapters from Theobroma Systems. There are two form-factors that run essentially the same firmware: * Seal: standalone USB stick ( https://www.theobroma-systems.com/seal ) * Mule: integrated on the PCB of various System-on-Modules from Theobroma Systems like the A31-µQ7 and the RK3399-Q7 ( https://www.theobroma-systems.com/rk3399-q7 ) The USB wire protocol has been designed to be as generic and hardware-indendent as possible in the hope of being useful for implementation on other microcontrollers. Signed-off-by: Martin Elshuber <martin.elshu...@theobroma-systems.com> Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzac...@theobroma-systems.com> Signed-off-by: Philipp Tomsich <philipp.toms...@theobroma-systems.com> --- Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1628 5 files changed, 1955 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c diff --git a/Documentation/networking/can_ucan_protocol.rst b/Documentation/networking/can_ucan_protocol.rst new file mode 100644 index ..d859b36200b4 --- /dev/null +++ b/Documentation/networking/can_ucan_protocol.rst @@ -0,0 +1,315 @@ += +The UCAN Protocol += + +UCAN is the protocol used by the microcontroller-based USB-CAN +adapter that is integrated on System-on-Modules from Theobroma Systems +and that is also available as a standalone USB stick. + +The UCAN protocol has been designed to be hardware-independent. +It is modeled closely after how Linux represents CAN devices +internally. All multi-byte integers are encoded as Little Endian. + +All structures mentioned in this document are defined in +``drivers/net/can/usb/ucan.c``. + +USB Endpoints += + +UCAN devices use three USB endpoints: + +CONTROL endpoint + The driver sends device management commands on this endpoint + +IN endpoint + The device sends CAN data frames and CAN error frames + +OUT endpoint + The driver sends CAN data frames on the out endpoint + + +CONTROL Messages + + +UCAN devices are configured using vendor requests on the control pipe. + +To support multiple CAN interfaces in a single USB device all +configuration commands target the corresponding interface in the USB +descriptor. + +The driver uses ``ucan_ctrl_command_in/out`` and +``ucan_device_request_in`` to deliver commands to the device. + +Setup Packet + + += = +``bmRequestType`` Direction | Vendor | (Interface or Device) +``bRequest`` Command Number +``wValue`` Subcommand Number (16 Bit) or 0 if not used +``wIndex`` USB Interface Index (0 for device commands) +``wLength``* Host to Device - Number of bytes to transmit + * Device to Host - Maximum Number of bytes to + receive. If the device send less. Commom ZLP + semantics are used. += = + +Error Handling +-- + +The device indicates failed control commands by stalling the +pipe. + +Device Commands +--- + +UCAN_DEVICE_GET_FW_STRING +~ + +*Dev2Host; optional* + +Request the device firmware string. + + +Interface Commands +-- + +UCAN_COMMAND_START +~~ + +*Host2Dev; mandatory* + +Bring the CAN interface up. + +Payload Format + ``ucan_ctl_payload_t.cmd_start`` + + +mode or mask of ``UCAN_MODE_*`` + + +UCAN_COMMAND_STOP +~~ + +*Host2Dev; mandatory* + +Stop the CAN interface + +Payload Format + *empty* + +UCAN_COMMAND_RESET +~~ + +*Host2Dev; mandatory* + +Reset the CAN controller (including error counters) + +Payload Format + *empty* + +UCAN_COMMAND_GET + + +*Host2Dev; mandatory* + +Get Information from the Device + +Subcommands +^^^ + +UCAN_COMMAND_GET_INFO + Request the device information structure ``ucan_ctl_payload_t.device_info``. + + See the ``device_info`` field for details, and + ``uapi/linux/can/netlink.h`` for an explanation of the + ``can_bittiming fields``. + + Payload Format +``ucan_ctl_payload_t.device_info`` + +UCAN_COMMAND_GET_PROTOCOL_VERSION + + Request the device protocol version + ``ucan_ctl_payload_t.protocol_version``. The current protocol version is 3. + + Payload Format +``ucan_ctl_payload_t.protocol_version`` + +.. note:: Devices that do not implement this command use the old + protocol version 1 + +UCAN_COM
[PATCH v3 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
The UCAN driver supports the microcontroller-based USB/CAN adapters from Theobroma Systems. There are two form-factors that run essentially the same firmware: * Seal: standalone USB stick ( https://www.theobroma-systems.com/seal ) * Mule: integrated on the PCB of various System-on-Modules from Theobroma Systems like the A31-µQ7 and the RK3399-Q7 ( https://www.theobroma-systems.com/rk3399-q7 ) The USB wire protocol has been designed to be as generic and hardware-indendent as possible in the hope of being useful for implementation on other microcontrollers. Signed-off-by: Martin Elshuber Signed-off-by: Jakob Unterwurzacher Signed-off-by: Philipp Tomsich --- Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1628 5 files changed, 1955 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c diff --git a/Documentation/networking/can_ucan_protocol.rst b/Documentation/networking/can_ucan_protocol.rst new file mode 100644 index ..d859b36200b4 --- /dev/null +++ b/Documentation/networking/can_ucan_protocol.rst @@ -0,0 +1,315 @@ += +The UCAN Protocol += + +UCAN is the protocol used by the microcontroller-based USB-CAN +adapter that is integrated on System-on-Modules from Theobroma Systems +and that is also available as a standalone USB stick. + +The UCAN protocol has been designed to be hardware-independent. +It is modeled closely after how Linux represents CAN devices +internally. All multi-byte integers are encoded as Little Endian. + +All structures mentioned in this document are defined in +``drivers/net/can/usb/ucan.c``. + +USB Endpoints += + +UCAN devices use three USB endpoints: + +CONTROL endpoint + The driver sends device management commands on this endpoint + +IN endpoint + The device sends CAN data frames and CAN error frames + +OUT endpoint + The driver sends CAN data frames on the out endpoint + + +CONTROL Messages + + +UCAN devices are configured using vendor requests on the control pipe. + +To support multiple CAN interfaces in a single USB device all +configuration commands target the corresponding interface in the USB +descriptor. + +The driver uses ``ucan_ctrl_command_in/out`` and +``ucan_device_request_in`` to deliver commands to the device. + +Setup Packet + + += = +``bmRequestType`` Direction | Vendor | (Interface or Device) +``bRequest`` Command Number +``wValue`` Subcommand Number (16 Bit) or 0 if not used +``wIndex`` USB Interface Index (0 for device commands) +``wLength``* Host to Device - Number of bytes to transmit + * Device to Host - Maximum Number of bytes to + receive. If the device send less. Commom ZLP + semantics are used. += = + +Error Handling +-- + +The device indicates failed control commands by stalling the +pipe. + +Device Commands +--- + +UCAN_DEVICE_GET_FW_STRING +~ + +*Dev2Host; optional* + +Request the device firmware string. + + +Interface Commands +-- + +UCAN_COMMAND_START +~~ + +*Host2Dev; mandatory* + +Bring the CAN interface up. + +Payload Format + ``ucan_ctl_payload_t.cmd_start`` + + +mode or mask of ``UCAN_MODE_*`` + + +UCAN_COMMAND_STOP +~~ + +*Host2Dev; mandatory* + +Stop the CAN interface + +Payload Format + *empty* + +UCAN_COMMAND_RESET +~~ + +*Host2Dev; mandatory* + +Reset the CAN controller (including error counters) + +Payload Format + *empty* + +UCAN_COMMAND_GET + + +*Host2Dev; mandatory* + +Get Information from the Device + +Subcommands +^^^ + +UCAN_COMMAND_GET_INFO + Request the device information structure ``ucan_ctl_payload_t.device_info``. + + See the ``device_info`` field for details, and + ``uapi/linux/can/netlink.h`` for an explanation of the + ``can_bittiming fields``. + + Payload Format +``ucan_ctl_payload_t.device_info`` + +UCAN_COMMAND_GET_PROTOCOL_VERSION + + Request the device protocol version + ``ucan_ctl_payload_t.protocol_version``. The current protocol version is 3. + + Payload Format +``ucan_ctl_payload_t.protocol_version`` + +.. note:: Devices that do not implement this command use the old + protocol version 1 + +UCAN_COMMAND_SET_BITTIMING +~~ + +*Host2Dev; mandatory* + +Setup bittiming by sending the the structure
Re: [bug, bisected] pfifo_fast causes packet reordering
On 21.03.18 21:52, John Fastabend wrote: Can you try this, diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index d4907b5..1e596bd 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -30,6 +30,7 @@ struct qdisc_rate_table { enum qdisc_state_t { __QDISC_STATE_SCHED, __QDISC_STATE_DEACTIVATED, + __QDISC_STATE_RUNNING, }; [...] Tested, looks good. No OOO observed, no side effects observed, iperf numbers on Gigabit Ethernet look the same. Thanks, Jakob
Re: [bug, bisected] pfifo_fast causes packet reordering
On 21.03.18 21:52, John Fastabend wrote: Can you try this, diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index d4907b5..1e596bd 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -30,6 +30,7 @@ struct qdisc_rate_table { enum qdisc_state_t { __QDISC_STATE_SCHED, __QDISC_STATE_DEACTIVATED, + __QDISC_STATE_RUNNING, }; [...] Tested, looks good. No OOO observed, no side effects observed, iperf numbers on Gigabit Ethernet look the same. Thanks, Jakob
Re: [bug, bisected] pfifo_fast causes packet reordering
On 21.03.18 19:43, John Fastabend wrote: Thats my theory at least. Are you able to test a patch if I generate one to fix this? Yes, no problem. I just tested with the flag change you suggested (see below, I had to keep TCQ_F_CPUSTATS to prevent a crash) and I have NOT seen OOO so far. Thanks, Jakob diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index 190570f21b20..51b68ef4977b 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -792,7 +792,7 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = { .dump = pfifo_fast_dump, .change_tx_queue_len = pfifo_fast_change_tx_queue_len, .owner = THIS_MODULE, - .static_flags = TCQ_F_NOLOCK | TCQ_F_CPUSTATS, + .static_flags = TCQ_F_CPUSTATS, }; EXPORT_SYMBOL(pfifo_fast_ops);
Re: [bug, bisected] pfifo_fast causes packet reordering
On 21.03.18 19:43, John Fastabend wrote: Thats my theory at least. Are you able to test a patch if I generate one to fix this? Yes, no problem. I just tested with the flag change you suggested (see below, I had to keep TCQ_F_CPUSTATS to prevent a crash) and I have NOT seen OOO so far. Thanks, Jakob diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index 190570f21b20..51b68ef4977b 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -792,7 +792,7 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = { .dump = pfifo_fast_dump, .change_tx_queue_len = pfifo_fast_change_tx_queue_len, .owner = THIS_MODULE, - .static_flags = TCQ_F_NOLOCK | TCQ_F_CPUSTATS, + .static_flags = TCQ_F_CPUSTATS, }; EXPORT_SYMBOL(pfifo_fast_ops);
Re: [bug, bisected] pfifo_fast causes packet reordering
On 16.03.18 11:26, Jakob Unterwurzacher wrote: On 15.03.18 23:30, John Fastabend wrote: I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py Great thanks, can you also run this with taskset to bind to a single CPU, # taskset 0x1 ./pifof_stress.py And let me know if you still see the OOO. Interesting. Looks like it depends on which core it runs on. CPU0 is clean, CPU1 is not. So we are at v4.16-rc6 now - have you managed to reproduce this is or should I try to get the revert correct? Best regards, Jakob
Re: [bug, bisected] pfifo_fast causes packet reordering
On 16.03.18 11:26, Jakob Unterwurzacher wrote: On 15.03.18 23:30, John Fastabend wrote: I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py Great thanks, can you also run this with taskset to bind to a single CPU, # taskset 0x1 ./pifof_stress.py And let me know if you still see the OOO. Interesting. Looks like it depends on which core it runs on. CPU0 is clean, CPU1 is not. So we are at v4.16-rc6 now - have you managed to reproduce this is or should I try to get the revert correct? Best regards, Jakob
Re: [bug, bisected] pfifo_fast causes packet reordering
On 19.03.18 13:32, Paolo Abeni wrote: Is not clear to me if you can reproduce the bug with the vanilla kernel, or if you need some out-of-tree nic driver. Can you please clarify which NIC/driver are you using? Yes I reproduced it with a vanilla kernel. I use two off-the-shelf USB NICs, lsusb says: Bus 005 Device 013: ID 0b95:7720 ASIX Electronics Corp. AX88772 Bus 005 Device 014: ID 0b95:772b ASIX Electronics Corp. AX88772B Photo of the setup: https://github.com/jakob-tsd/pfifo_stress/blob/master/usb_nics.jpg Best regards, Jakob
Re: [bug, bisected] pfifo_fast causes packet reordering
On 19.03.18 13:32, Paolo Abeni wrote: Is not clear to me if you can reproduce the bug with the vanilla kernel, or if you need some out-of-tree nic driver. Can you please clarify which NIC/driver are you using? Yes I reproduced it with a vanilla kernel. I use two off-the-shelf USB NICs, lsusb says: Bus 005 Device 013: ID 0b95:7720 ASIX Electronics Corp. AX88772 Bus 005 Device 014: ID 0b95:772b ASIX Electronics Corp. AX88772B Photo of the setup: https://github.com/jakob-tsd/pfifo_stress/blob/master/usb_nics.jpg Best regards, Jakob
Re: [bug, bisected] pfifo_fast causes packet reordering
On 15.03.18 23:30, John Fastabend wrote: I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py Great thanks, can you also run this with taskset to bind to a single CPU, # taskset 0x1 ./pifof_stress.py And let me know if you still see the OOO. Interesting. Looks like it depends on which core it runs on. CPU0 is clean, CPU1 is not. Clean: taskset --cpu-list 0 ./pfifo_stress.py Broken: taskset --cpu-list 1 ./pfifo_stress.py Maybe related: CPU0 is where USB interrupts are handled: root@rk3399-q7:~# cat /proc/interrupts CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 217:2175353 0 0 0 0 0 GICv3 142 Level xhci-hcd:usb5 Thanks, Jakob
Re: [bug, bisected] pfifo_fast causes packet reordering
On 15.03.18 23:30, John Fastabend wrote: I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py Great thanks, can you also run this with taskset to bind to a single CPU, # taskset 0x1 ./pifof_stress.py And let me know if you still see the OOO. Interesting. Looks like it depends on which core it runs on. CPU0 is clean, CPU1 is not. Clean: taskset --cpu-list 0 ./pfifo_stress.py Broken: taskset --cpu-list 1 ./pfifo_stress.py Maybe related: CPU0 is where USB interrupts are handled: root@rk3399-q7:~# cat /proc/interrupts CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 217:2175353 0 0 0 0 0 GICv3 142 Level xhci-hcd:usb5 Thanks, Jakob
Re: [bug, bisected] pfifo_fast causes packet reordering
On 14.03.18 05:03, John Fastabend wrote: On 03/13/2018 11:35 AM, Dave Taht wrote: On Tue, Mar 13, 2018 at 11:24 AM, Jakob Unterwurzacher <jakob.unterwurzac...@theobroma-systems.com> wrote: During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are delivered out-of-order. Is the stress-testing tool available somewhere? What type of packets are being sent? I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py Here is what I get: root@rk3399-q7:~# ./pfifo_stress.py [...] expected ctr 0xcdc, received 0xcdd expected ctr 0xcde, received 0xcdc expected ctr 0xcdd, received 0xcde expected ctr 0xe3c, received 0xe3d expected ctr 0xe3e, received 0xe3c expected ctr 0xe3d, received 0xe3e expected ctr 0x1097, received 0x1098 expected ctr 0x1099, received 0x1097 expected ctr 0x1098, received 0x1099 expected ctr 0x17c0, received 0x17c1 expected ctr 0x17c2, received 0x17c0 [...] Best regards, Jakob
Re: [bug, bisected] pfifo_fast causes packet reordering
On 14.03.18 05:03, John Fastabend wrote: On 03/13/2018 11:35 AM, Dave Taht wrote: On Tue, Mar 13, 2018 at 11:24 AM, Jakob Unterwurzacher wrote: During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are delivered out-of-order. Is the stress-testing tool available somewhere? What type of packets are being sent? I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py Here is what I get: root@rk3399-q7:~# ./pfifo_stress.py [...] expected ctr 0xcdc, received 0xcdd expected ctr 0xcde, received 0xcdc expected ctr 0xcdd, received 0xcde expected ctr 0xe3c, received 0xe3d expected ctr 0xe3e, received 0xe3c expected ctr 0xe3d, received 0xe3e expected ctr 0x1097, received 0x1098 expected ctr 0x1099, received 0x1097 expected ctr 0x1098, received 0x1099 expected ctr 0x17c0, received 0x17c1 expected ctr 0x17c2, received 0x17c0 [...] Best regards, Jakob
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 11:04, Wolfgang Grandegger wrote: (000.000443) can0 2034 [8] 00 0C 00 00 00 00 78 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{120}{0}} (000.000444) can0 2034 [8] 00 3C 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} Just, controller-problem{rx-error-passive,tx-error-passive} [...] Back to error active is missing!? That was indeed missing. We have fixed the missing back-to-error-active in our firmware. Also, we no longer send the controller status in every error frame, but only on state changes (see below) which seems to be how other drivers are handling things. Thanks, Jakob *** test output *** Disconnect cable, send one frame root@rk3399-q7:~# candump -td -e any,0:0,# | head -n 100 (000.00) can0 6E7 [2] 7A F9 (000.000558) can0 2030 [8] 00 00 00 00 00 00 08 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{8}{0}} [...] (000.000567) can0 2034 [8] 00 0C 00 00 00 00 60 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{96}{0}} [...] (000.000527) can0 2034 [8] 00 30 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-passive,tx-error-passive} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{128}{0}} [...] Reconnect cable (000.000687) can1 6E7 [2] 7A F9 (000.15) can0 2004 [8] 00 0C 00 00 00 00 7F 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} error-counter-tx-rx{{127}{0}} Send more frames (046.485245) can0 61B [2] E2 D8 (000.000621) can1 61B [2] E2 D8 [...] (000.199224) can0 3E6 [0] (000.000477) can1 3E6 [0] (000.44) can0 2004 [8] 00 40 00 00 00 00 5F 00 ERRORFRAME controller-problem{back-to-error-active} error-counter-tx-rx{{95}{0}}
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 11:04, Wolfgang Grandegger wrote: (000.000443) can0 2034 [8] 00 0C 00 00 00 00 78 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{120}{0}} (000.000444) can0 2034 [8] 00 3C 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} Just, controller-problem{rx-error-passive,tx-error-passive} [...] Back to error active is missing!? That was indeed missing. We have fixed the missing back-to-error-active in our firmware. Also, we no longer send the controller status in every error frame, but only on state changes (see below) which seems to be how other drivers are handling things. Thanks, Jakob *** test output *** Disconnect cable, send one frame root@rk3399-q7:~# candump -td -e any,0:0,# | head -n 100 (000.00) can0 6E7 [2] 7A F9 (000.000558) can0 2030 [8] 00 00 00 00 00 00 08 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{8}{0}} [...] (000.000567) can0 2034 [8] 00 0C 00 00 00 00 60 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{96}{0}} [...] (000.000527) can0 2034 [8] 00 30 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-passive,tx-error-passive} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{128}{0}} [...] Reconnect cable (000.000687) can1 6E7 [2] 7A F9 (000.15) can0 2004 [8] 00 0C 00 00 00 00 7F 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} error-counter-tx-rx{{127}{0}} Send more frames (046.485245) can0 61B [2] E2 D8 (000.000621) can1 61B [2] E2 D8 [...] (000.199224) can0 3E6 [0] (000.000477) can1 3E6 [0] (000.44) can0 2004 [8] 00 40 00 00 00 00 5F 00 ERRORFRAME controller-problem{back-to-error-active} error-counter-tx-rx{{95}{0}}
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 11:04, Wolfgang Grandegger wrote: controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} Just, controller-problem{rx-error-passive,tx-error-passive} Ok. (1b) cable gets connected: (000.000883) can0 2034 [8] 00 3C 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{128}{0}} (000.000996) can0 2004 [8] 00 0C 00 00 00 00 7F 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} error-counter-tx-rx{{127}{0}} Back to error active is missing!? Have a look to: https://elixir.bootlin.com/linux/latest/source/drivers/net/can/dev.c#L364 Should be sent out once the error counter drops further, will check. Thanks, Jakob
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 11:04, Wolfgang Grandegger wrote: controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} Just, controller-problem{rx-error-passive,tx-error-passive} Ok. (1b) cable gets connected: (000.000883) can0 2034 [8] 00 3C 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{128}{0}} (000.000996) can0 2004 [8] 00 0C 00 00 00 00 7F 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} error-counter-tx-rx{{127}{0}} Back to error active is missing!? Have a look to: https://elixir.bootlin.com/linux/latest/source/drivers/net/can/dev.c#L364 Should be sent out once the error counter drops further, will check. Thanks, Jakob
Re: [bug, bisected] pfifo_fast causes packet reordering
On 14.03.18 05:03, John Fastabend wrote: During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are delivered out-of-order. Is the stress-testing tool available somewhere? What type of packets are being sent? Not public, no, the problem is that you'd need a CAN adapter as well. The test is simple, sending CAN frames with an increasing counter and a random length payload: [ tx thread rx thread] | ^ v | [ interface 0 ] cable > [ interface 1 ] I'll see if I can come up with a UDP testcase that works with normal ethernet interfaces. Is this a single queue device or a multiqueue device? Running 'tc -s qdisc show dev foo' would help some. Here you go: root@rk3399-q7:~# tc -s qdisc show dev can0 qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 If we introduced a OOO edge case somewhere that was not intended so I'll take a look into it. But, if you can provide a bit more details on how stress testing is done to cause the issue that would help. Will do. Thanks, Jakob
Re: [bug, bisected] pfifo_fast causes packet reordering
On 14.03.18 05:03, John Fastabend wrote: During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are delivered out-of-order. Is the stress-testing tool available somewhere? What type of packets are being sent? Not public, no, the problem is that you'd need a CAN adapter as well. The test is simple, sending CAN frames with an increasing counter and a random length payload: [ tx thread rx thread] | ^ v | [ interface 0 ] cable > [ interface 1 ] I'll see if I can come up with a UDP testcase that works with normal ethernet interfaces. Is this a single queue device or a multiqueue device? Running 'tc -s qdisc show dev foo' would help some. Here you go: root@rk3399-q7:~# tc -s qdisc show dev can0 qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 If we introduced a OOO edge case somewhere that was not intended so I'll take a look into it. But, if you can provide a bit more details on how stress testing is done to cause the issue that would help. Will do. Thanks, Jakob
Re: rx_packets/bytes stats for error frames
On 14.03.18 10:46, Wolfgang Grandegger wrote: We do count them, as errors! This is what happens when you transmit a single CAN frame with nothing connected: "TX errors" shoots up but "RX packets" stays zero. This is handled not consistent in the existing CAN drivers. In flexcan all and c_can (all but rx overflow) are counted as rx_packets and rx_bytes. (I haven't looked at the other drivers.) I tend to count the error frames as ordinary frames. +1, I think we should count the packets and bytes delivered to the network layer. Documentation/ABI/testing/sysfs-class-net-statistics says: What: /sys/class//statistics/rx_packets Date: April 2005 KernelVersion: 2.6.12 Contact:net...@vger.kernel.org Description: Indicates the total number of good packets received by this network device. I considered error frames not "good packets", but I can change that for v3, no problem. Best regards, Jakob
Re: rx_packets/bytes stats for error frames
On 14.03.18 10:46, Wolfgang Grandegger wrote: We do count them, as errors! This is what happens when you transmit a single CAN frame with nothing connected: "TX errors" shoots up but "RX packets" stays zero. This is handled not consistent in the existing CAN drivers. In flexcan all and c_can (all but rx overflow) are counted as rx_packets and rx_bytes. (I haven't looked at the other drivers.) I tend to count the error frames as ordinary frames. +1, I think we should count the packets and bytes delivered to the network layer. Documentation/ABI/testing/sysfs-class-net-statistics says: What: /sys/class//statistics/rx_packets Date: April 2005 KernelVersion: 2.6.12 Contact:net...@vger.kernel.org Description: Indicates the total number of good packets received by this network device. I considered error frames not "good packets", but I can change that for v3, no problem. Best regards, Jakob
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 10:25, Wolfgang Grandegger wrote: Counting the state changes is one thing but you should also generate error messages for them. The usual test here is: $ candump -td -e any,0:0,# should report proper state changes, if you send messages with 1. no cable connected 2. CAN high and low short-circuited Also downwards if the hardware error is gone and you continue to send messages. Yes, we do, the hardware does it. Testcases: (1) no cable connected root@rk3399-q7:~# candump -td -e any,0:0,# | head -n 1000 (000.00) can0 178 [0] (000.000410) can0 2030 [8] 00 00 00 00 00 00 08 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{8}{0}} (000.000445) can0 2030 [8] 00 00 00 00 00 00 10 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{16}{0}} (000.000425) can0 2030 [8] 00 00 00 00 00 00 18 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{24}{0}} (000.000451) can0 2030 [8] 00 00 00 00 00 00 20 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{32}{0}} (000.000429) can0 2030 [8] 00 00 00 00 00 00 28 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{40}{0}} (000.000448) can0 2030 [8] 00 00 00 00 00 00 30 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{48}{0}} (000.000433) can0 2030 [8] 00 00 00 00 00 00 38 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{56}{0}} (000.000437) can0 2030 [8] 00 00 00 00 00 00 40 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{64}{0}} (000.000443) can0 2030 [8] 00 00 00 00 00 00 48 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{72}{0}} (000.000437) can0 2030 [8] 00 00 00 00 00 00 50 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{80}{0}} (000.000498) can0 2030 [8] 00 00 00 00 00 00 58 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{88}{0}} (000.000394) can0 2034 [8] 00 0C 00 00 00 00 60 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{96}{0}} (000.000433) can0 2034 [8] 00 0C 00 00 00 00 68 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{104}{0}} (000.000437) can0 2034 [8] 00 0C 00 00 00 00 70 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{112}{0}} (000.000443) can0 2034 [8] 00 0C 00 00 00 00 78 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{120}{0}} (000.000444) can0 2034 [8] 00 3C 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{128}{0}} (000.000495) can0 2024 [8] 00 3C 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} no-acknowledgement-on-tx error-counter-tx-rx{{128}{0}} Repeats ad infinitum... (1b) cable gets connected: (000.000883) can0 2034 [8] 00 3C 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{128}{0}} (000.000996) can0 2004 [8] 00 0C 00 00 00 00 7F 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} error-counter-tx-rx{{127}{0}} (2) short circuit root@rk3399-q7:~# candump -td -e any,0:0,# | head -n 1000 (000.00) can0 6AB [7] 33 39 62 29 00 F5 31 (000.000201) can0 2018 [8] 00 00 08 00 00 00 18 00 ERRORFRAME protocol-violation{{tx-dominant-bit-error}{}} transceiver-status error-counter-tx-rx{{24}{0}} (000.000141) can0 200C [8] 00 3C 08 00 00 00 88 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} protocol-violation{{tx-dominant-bit-error}{}}
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 10:25, Wolfgang Grandegger wrote: Counting the state changes is one thing but you should also generate error messages for them. The usual test here is: $ candump -td -e any,0:0,# should report proper state changes, if you send messages with 1. no cable connected 2. CAN high and low short-circuited Also downwards if the hardware error is gone and you continue to send messages. Yes, we do, the hardware does it. Testcases: (1) no cable connected root@rk3399-q7:~# candump -td -e any,0:0,# | head -n 1000 (000.00) can0 178 [0] (000.000410) can0 2030 [8] 00 00 00 00 00 00 08 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{8}{0}} (000.000445) can0 2030 [8] 00 00 00 00 00 00 10 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{16}{0}} (000.000425) can0 2030 [8] 00 00 00 00 00 00 18 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{24}{0}} (000.000451) can0 2030 [8] 00 00 00 00 00 00 20 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{32}{0}} (000.000429) can0 2030 [8] 00 00 00 00 00 00 28 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{40}{0}} (000.000448) can0 2030 [8] 00 00 00 00 00 00 30 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{48}{0}} (000.000433) can0 2030 [8] 00 00 00 00 00 00 38 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{56}{0}} (000.000437) can0 2030 [8] 00 00 00 00 00 00 40 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{64}{0}} (000.000443) can0 2030 [8] 00 00 00 00 00 00 48 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{72}{0}} (000.000437) can0 2030 [8] 00 00 00 00 00 00 50 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{80}{0}} (000.000498) can0 2030 [8] 00 00 00 00 00 00 58 00 ERRORFRAME transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{88}{0}} (000.000394) can0 2034 [8] 00 0C 00 00 00 00 60 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{96}{0}} (000.000433) can0 2034 [8] 00 0C 00 00 00 00 68 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{104}{0}} (000.000437) can0 2034 [8] 00 0C 00 00 00 00 70 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{112}{0}} (000.000443) can0 2034 [8] 00 0C 00 00 00 00 78 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{120}{0}} (000.000444) can0 2034 [8] 00 3C 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{128}{0}} (000.000495) can0 2024 [8] 00 3C 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} no-acknowledgement-on-tx error-counter-tx-rx{{128}{0}} Repeats ad infinitum... (1b) cable gets connected: (000.000883) can0 2034 [8] 00 3C 00 00 00 00 80 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} transceiver-status no-acknowledgement-on-tx error-counter-tx-rx{{128}{0}} (000.000996) can0 2004 [8] 00 0C 00 00 00 00 7F 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning} error-counter-tx-rx{{127}{0}} (2) short circuit root@rk3399-q7:~# candump -td -e any,0:0,# | head -n 1000 (000.00) can0 6AB [7] 33 39 62 29 00 F5 31 (000.000201) can0 2018 [8] 00 00 08 00 00 00 18 00 ERRORFRAME protocol-violation{{tx-dominant-bit-error}{}} transceiver-status error-counter-tx-rx{{24}{0}} (000.000141) can0 200C [8] 00 3C 08 00 00 00 88 00 ERRORFRAME controller-problem{rx-error-warning,tx-error-warning,rx-error-passive,tx-error-passive} protocol-violation{{tx-dominant-bit-error}{}}
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 10:17, Wolfgang Grandegger wrote: Counting the state changes is one thing but you should also generate error messages for them. We do for BUS-OFF already (see below), but not for other states - will do in v3. + /* we switched into a worse state */ + up->can.state = new_state; + switch (new_state) { + case CAN_STATE_BUS_OFF: + can_stats->bus_off++; + can_bus_off(up->netdev); + netdev_info(up->netdev, + "link has gone into BUS-OFF state\n"); + break;
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 10:17, Wolfgang Grandegger wrote: Counting the state changes is one thing but you should also generate error messages for them. We do for BUS-OFF already (see below), but not for other states - will do in v3. + /* we switched into a worse state */ + up->can.state = new_state; + switch (new_state) { + case CAN_STATE_BUS_OFF: + can_stats->bus_off++; + can_bus_off(up->netdev); + netdev_info(up->netdev, + "link has gone into BUS-OFF state\n"); + break;
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 10:11, Wolfgang Grandegger wrote: + /* handle error frames */ + canid = le32_to_cpu(m->msg.can_msg.id); + if (canid & CAN_ERR_FLAG) { + ucan_handle_error_frame(up, m, canid); + /* drop frame if berr-reporting is off */ + if (!(up->can.ctrlmode & CAN_CTRLMODE_BERR_REPORTING)) + return; + } You still do not generate error messages for state changes, IIUC. The hardware generates error frames on state changes - is that what you mean? In our testing, the state shows up (and updates) correctly in ip -details -statistics link show can0 Best regards, Jakob
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 10:11, Wolfgang Grandegger wrote: + /* handle error frames */ + canid = le32_to_cpu(m->msg.can_msg.id); + if (canid & CAN_ERR_FLAG) { + ucan_handle_error_frame(up, m, canid); + /* drop frame if berr-reporting is off */ + if (!(up->can.ctrlmode & CAN_CTRLMODE_BERR_REPORTING)) + return; + } You still do not generate error messages for state changes, IIUC. The hardware generates error frames on state changes - is that what you mean? In our testing, the state shows up (and updates) correctly in ip -details -statistics link show can0 Best regards, Jakob
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 08:51, Marc Kleine-Budde wrote: + memcpy(cf->data, m->msg.can_msg.data, cf->can_dlc); + + /* don't count error frames as real packets */ + if (!(canid & CAN_ERR_FLAG)) { + stats->rx_packets++; + stats->rx_bytes += cf->can_dlc; + } Please count them, too. We do count them, as errors! This is what happens when you transmit a single CAN frame with nothing connected: "TX errors" shoots up but "RX packets" stays zero. root@rk3399-q7:~# ip -details -statistics link show can0 [...] RX: bytes packets errors dropped overrun mcast 0 00 0 0 0 TX: bytes packets errors dropped carrier collsns 0 020425 0 0 0
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 14.03.18 08:51, Marc Kleine-Budde wrote: + memcpy(cf->data, m->msg.can_msg.data, cf->can_dlc); + + /* don't count error frames as real packets */ + if (!(canid & CAN_ERR_FLAG)) { + stats->rx_packets++; + stats->rx_bytes += cf->can_dlc; + } Please count them, too. We do count them, as errors! This is what happens when you transmit a single CAN frame with nothing connected: "TX errors" shoots up but "RX packets" stays zero. root@rk3399-q7:~# ip -details -statistics link show can0 [...] RX: bytes packets errors dropped overrun mcast 0 00 0 0 0 TX: bytes packets errors dropped carrier collsns 0 020425 0 0 0
[bug, bisected] pfifo_fast causes packet reordering
During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are delivered out-of-order. We have tracked the problem down to the driver interface level, and it seems that the driver's net_device_ops.ndo_start_xmit() function gets the packets handed over in the wrong order. This behavior was not observed on Linux v4.15 and I have bisected the problem down to this patch: commit c5ad119fb6c09b0297446be05bd66602fa564758 Author: John FastabendDate: Thu Dec 7 09:58:19 2017 -0800 net: sched: pfifo_fast use skb_array This converts the pfifo_fast qdisc to use the skb_array data structure and set the lockless qdisc bit. pfifo_fast is the first qdisc to support the lockless bit that can be a child of a qdisc requiring locking. So we add logic to clear the lock bit on initialization in these cases when the qdisc graft operation occurs. This also removes the logic used to pick the next band to dequeue from and instead just checks a per priority array for packets from top priority to lowest. This might need to be a bit more clever but seems to work for now. Signed-off-by: John Fastabend Signed-off-by: David S. Miller The patch does not revert cleanly, but moving to one commit earlier makes the problem go away. Selecting the "fq" scheduler instead of "pfifo_fast" makes the problem go away as well. Is this an unintended side-effect of the patch or is there something the driver has to do to request in-order delivery? Thanks, Jakob
[bug, bisected] pfifo_fast causes packet reordering
During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are delivered out-of-order. We have tracked the problem down to the driver interface level, and it seems that the driver's net_device_ops.ndo_start_xmit() function gets the packets handed over in the wrong order. This behavior was not observed on Linux v4.15 and I have bisected the problem down to this patch: commit c5ad119fb6c09b0297446be05bd66602fa564758 Author: John Fastabend Date: Thu Dec 7 09:58:19 2017 -0800 net: sched: pfifo_fast use skb_array This converts the pfifo_fast qdisc to use the skb_array data structure and set the lockless qdisc bit. pfifo_fast is the first qdisc to support the lockless bit that can be a child of a qdisc requiring locking. So we add logic to clear the lock bit on initialization in these cases when the qdisc graft operation occurs. This also removes the logic used to pick the next band to dequeue from and instead just checks a per priority array for packets from top priority to lowest. This might need to be a bit more clever but seems to work for now. Signed-off-by: John Fastabend Signed-off-by: David S. Miller The patch does not revert cleanly, but moving to one commit earlier makes the problem go away. Selecting the "fq" scheduler instead of "pfifo_fast" makes the problem go away as well. Is this an unintended side-effect of the patch or is there something the driver has to do to request in-order delivery? Thanks, Jakob
Re: [PATCH v2 0/1] Open questions
This email addresses open questions from the v1 review cycle. Three emails are answered, you can skip to sections by searching for the header text: * On 07.02.18 08:17, Wolfgang Grandegger wrote * On 09.02.18 11:40, Marc Kleine-Budde wrote * On 06.02.18 12:58, Andri Yngvason wrote Best regards, Jakob On 07.02.18 08:17, Wolfgang Grandegger wrote: > - please remove dev_dbg's just useful for driver development, >especially in the TX and RX path. Done > - I do not see CAN error states being handled. It's >mandatory! Done > - The TX done is handled when the transmission callback is done. >This has some unnice side effects. A real TX done message would >be much better. Done > Is "enpoint" a typo? Here an in a few other places! Done >> +/* Callback when the device sends the IRQ sate * > > Typo! Done; Interrupt Endpoint is gone in favor of TX_COMPLETE message (Used for Flow Control). >> +if (cf->can_dlc > sizeof(m->msg.can_msg.data)) >> +goto err_freeskb; > > Is'nt this worth an error message? > >> + >> +if (cf->can_dlc < 0) >> +goto err_freeskb; > > Ditto. sizeof: Added error message can_dlc < 0: if statement is unneccessary (can_dlc is u8, and comparision always evaluates fasle); statment deleted >> +if (cf->can_id & CAN_RTR_FLAG) { >> +cf->can_id |= CAN_RTR_FLAG; > > This flags is already set. Fixed >> +cf->can_id |= CAN_ERR_FLAG; > > Ditto. Fixed; Error Frame handling revised >> +ret = usb_submit_urb(up->irq_urb, GFP_KERNEL); >> +if (ret) { >> +kfree(up->irq_data); >> +usb_free_urb(up->irq_urb); > > Please use labels to cleanup in reverse order. > >> +goto err; >> +} Done; Interrupt Endpoint is gone in favor of TX_COMPLETE message (Used for Flow Control). >> +urb = usb_alloc_urb(0, GFP_KERNEL); >> +if (!urb) >> +goto err; > > What about: > > kfree(up->irq_data); > usb_free_urb(up->irq_urb); Revised open code; Interrupt Endpoint is gone ... you know already about this. >> +/* Infvalid interface Settings */ > > Typo. Fixed; probe code revised >> +if (up->in_ep->wMaxPacketSize < sizeof(struct ucan_message_in)) >> +goto err_free_candev; >> +if (up->out_ep->wMaxPacketSize < sizeof(struct ucan_message_out)) >> +goto err_free_candev; >> +if (up->irq_ep->wMaxPacketSize < sizeof(u32)) >> +goto err_free_candev; > > Could be or'ed together. I believe separating them support readability On 09.02.18 11:40, Marc Kleine-Budde wrote: > Does the controller have a serial number and expose it via the USB? Does > it show up as ID_SERIAL_SHORT by "udevadm info --path="? Yes! # udevadm info /sys/class/net/can0 | grep ID_SERIAL_SHORT E: ID_SERIAL_SHORT=004000405750511920353631 >> +++ b/Documentation/networking/can_ucan_protocol.txt > > Can you convert the Textfilen into .rst > (http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html)? Done; Updated file reflecting protocol changes >> +The UCAN protocol has been designed to be hardware-independent. >> +It is modeled closely after how Linux represents CAN devices > > modelled We believe it is modeled in en_US (modelled is used in en_GB) >> +All structures mentioned in this document are defined in >> +drivers/net/can/usb/ucan.h . > > Please fold the .h into the .c file. As no one besides the driver uses them. Done >> +IN enpoint: The device sends CAN frames and device > > endpoint Done >> +the device uses the "len" field to jump to the next >> +ucan_message_out value. Each ucan_message_out must be aliged > > aligned Done >> +* UCAN_OUT_COMMAND: Send a simple command to the device (parameters: cmd, >> + subcmd, val). See the comments in drivers/net/can/usb/ucan.h for details. > > merge .h -> .c Done >> + UCAN is an microcontroller-based USB-CAN interface that >> + is integrated on System-on-Modules made by Theobroma Systems >> + (https://www.theobroma-systems.com/som-products). > > Why not link to the controller's website directly? Did not exist at the time, now it does and we are linking to https://www.theobroma-systems.com/seal . >> + * Copyright (C) 2015 Theobroma Systems Design und Consulting GmbH >> + * >> + * This driver is inspired by the 4.0.0 version of drivers/net/can/usb/ems_usb.c > > Do you need to add the ems_usb.c's copyright here? We used the ems driver as starting point for the v1 driver that we posted. However, we are now at protocol version 3 and not much is left of the ems driver, we think we can drop this statement. >> +#include "ucan.h" >> + >> +MODULE_LICENSE("GPL v2"); >> +MODULE_AUTHOR("Martin Elshuber, Theobroma Systems Design und Consulting GmbH"); >> +MODULE_DESCRIPTION("Driver for Theobroma Systems UCAN devices"); > > I think we usually have these at the end of the driver. Done, moved to the end >> +#define MAX_TX_URBS 8 >> +#define
Re: [PATCH v2 0/1] Open questions
This email addresses open questions from the v1 review cycle. Three emails are answered, you can skip to sections by searching for the header text: * On 07.02.18 08:17, Wolfgang Grandegger wrote * On 09.02.18 11:40, Marc Kleine-Budde wrote * On 06.02.18 12:58, Andri Yngvason wrote Best regards, Jakob On 07.02.18 08:17, Wolfgang Grandegger wrote: > - please remove dev_dbg's just useful for driver development, >especially in the TX and RX path. Done > - I do not see CAN error states being handled. It's >mandatory! Done > - The TX done is handled when the transmission callback is done. >This has some unnice side effects. A real TX done message would >be much better. Done > Is "enpoint" a typo? Here an in a few other places! Done >> +/* Callback when the device sends the IRQ sate * > > Typo! Done; Interrupt Endpoint is gone in favor of TX_COMPLETE message (Used for Flow Control). >> +if (cf->can_dlc > sizeof(m->msg.can_msg.data)) >> +goto err_freeskb; > > Is'nt this worth an error message? > >> + >> +if (cf->can_dlc < 0) >> +goto err_freeskb; > > Ditto. sizeof: Added error message can_dlc < 0: if statement is unneccessary (can_dlc is u8, and comparision always evaluates fasle); statment deleted >> +if (cf->can_id & CAN_RTR_FLAG) { >> +cf->can_id |= CAN_RTR_FLAG; > > This flags is already set. Fixed >> +cf->can_id |= CAN_ERR_FLAG; > > Ditto. Fixed; Error Frame handling revised >> +ret = usb_submit_urb(up->irq_urb, GFP_KERNEL); >> +if (ret) { >> +kfree(up->irq_data); >> +usb_free_urb(up->irq_urb); > > Please use labels to cleanup in reverse order. > >> +goto err; >> +} Done; Interrupt Endpoint is gone in favor of TX_COMPLETE message (Used for Flow Control). >> +urb = usb_alloc_urb(0, GFP_KERNEL); >> +if (!urb) >> +goto err; > > What about: > > kfree(up->irq_data); > usb_free_urb(up->irq_urb); Revised open code; Interrupt Endpoint is gone ... you know already about this. >> +/* Infvalid interface Settings */ > > Typo. Fixed; probe code revised >> +if (up->in_ep->wMaxPacketSize < sizeof(struct ucan_message_in)) >> +goto err_free_candev; >> +if (up->out_ep->wMaxPacketSize < sizeof(struct ucan_message_out)) >> +goto err_free_candev; >> +if (up->irq_ep->wMaxPacketSize < sizeof(u32)) >> +goto err_free_candev; > > Could be or'ed together. I believe separating them support readability On 09.02.18 11:40, Marc Kleine-Budde wrote: > Does the controller have a serial number and expose it via the USB? Does > it show up as ID_SERIAL_SHORT by "udevadm info --path="? Yes! # udevadm info /sys/class/net/can0 | grep ID_SERIAL_SHORT E: ID_SERIAL_SHORT=004000405750511920353631 >> +++ b/Documentation/networking/can_ucan_protocol.txt > > Can you convert the Textfilen into .rst > (http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html)? Done; Updated file reflecting protocol changes >> +The UCAN protocol has been designed to be hardware-independent. >> +It is modeled closely after how Linux represents CAN devices > > modelled We believe it is modeled in en_US (modelled is used in en_GB) >> +All structures mentioned in this document are defined in >> +drivers/net/can/usb/ucan.h . > > Please fold the .h into the .c file. As no one besides the driver uses them. Done >> +IN enpoint: The device sends CAN frames and device > > endpoint Done >> +the device uses the "len" field to jump to the next >> +ucan_message_out value. Each ucan_message_out must be aliged > > aligned Done >> +* UCAN_OUT_COMMAND: Send a simple command to the device (parameters: cmd, >> + subcmd, val). See the comments in drivers/net/can/usb/ucan.h for details. > > merge .h -> .c Done >> + UCAN is an microcontroller-based USB-CAN interface that >> + is integrated on System-on-Modules made by Theobroma Systems >> + (https://www.theobroma-systems.com/som-products). > > Why not link to the controller's website directly? Did not exist at the time, now it does and we are linking to https://www.theobroma-systems.com/seal . >> + * Copyright (C) 2015 Theobroma Systems Design und Consulting GmbH >> + * >> + * This driver is inspired by the 4.0.0 version of drivers/net/can/usb/ems_usb.c > > Do you need to add the ems_usb.c's copyright here? We used the ems driver as starting point for the v1 driver that we posted. However, we are now at protocol version 3 and not much is left of the ems driver, we think we can drop this statement. >> +#include "ucan.h" >> + >> +MODULE_LICENSE("GPL v2"); >> +MODULE_AUTHOR("Martin Elshuber, Theobroma Systems Design und Consulting GmbH "); >> +MODULE_DESCRIPTION("Driver for Theobroma Systems UCAN devices"); > > I think we usually have these at the end of the driver. Done, moved to the end >> +#define MAX_TX_URBS 8 >> +#define MAX_RX_URBS 8 >> +#define
[PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
The UCAN driver supports the microcontroller-based USB/CAN adapters from Theobroma Systems. There are two form-factors that run essentially the same firmware: * Seal: standalone USB stick ( https://www.theobroma-systems.com/seal ) * Mule: integrated on the PCB of various System-on-Modules from Theobroma Systems like the A31-µQ7 and the RK3399-Q7 ( https://www.theobroma-systems.com/rk3399-q7 ) The USB wire protocol has been designed to be as generic and hardware-indendent as possible in the hope of being useful for implementation on other microcontrollers. Signed-off-by: Martin Elshuber <martin.elshu...@theobroma-systems.com> Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzac...@theobroma-systems.com> Signed-off-by: Philipp Tomsich <philipp.toms...@theobroma-systems.com> --- Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1587 5 files changed, 1914 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c diff --git a/Documentation/networking/can_ucan_protocol.rst b/Documentation/networking/can_ucan_protocol.rst new file mode 100644 index ..d859b36200b4 --- /dev/null +++ b/Documentation/networking/can_ucan_protocol.rst @@ -0,0 +1,315 @@ += +The UCAN Protocol += + +UCAN is the protocol used by the microcontroller-based USB-CAN +adapter that is integrated on System-on-Modules from Theobroma Systems +and that is also available as a standalone USB stick. + +The UCAN protocol has been designed to be hardware-independent. +It is modeled closely after how Linux represents CAN devices +internally. All multi-byte integers are encoded as Little Endian. + +All structures mentioned in this document are defined in +``drivers/net/can/usb/ucan.c``. + +USB Endpoints += + +UCAN devices use three USB endpoints: + +CONTROL endpoint + The driver sends device management commands on this endpoint + +IN endpoint + The device sends CAN data frames and CAN error frames + +OUT endpoint + The driver sends CAN data frames on the out endpoint + + +CONTROL Messages + + +UCAN devices are configured using vendor requests on the control pipe. + +To support multiple CAN interfaces in a single USB device all +configuration commands target the corresponding interface in the USB +descriptor. + +The driver uses ``ucan_ctrl_command_in/out`` and +``ucan_device_request_in`` to deliver commands to the device. + +Setup Packet + + += = +``bmRequestType`` Direction | Vendor | (Interface or Device) +``bRequest`` Command Number +``wValue`` Subcommand Number (16 Bit) or 0 if not used +``wIndex`` USB Interface Index (0 for device commands) +``wLength``* Host to Device - Number of bytes to transmit + * Device to Host - Maximum Number of bytes to + receive. If the device send less. Commom ZLP + semantics are used. += = + +Error Handling +-- + +The device indicates failed control commands by stalling the +pipe. + +Device Commands +--- + +UCAN_DEVICE_GET_FW_STRING +~ + +*Dev2Host; optional* + +Request the device firmware string. + + +Interface Commands +-- + +UCAN_COMMAND_START +~~ + +*Host2Dev; mandatory* + +Bring the CAN interface up. + +Payload Format + ``ucan_ctl_payload_t.cmd_start`` + + +mode or mask of ``UCAN_MODE_*`` + + +UCAN_COMMAND_STOP +~~ + +*Host2Dev; mandatory* + +Stop the CAN interface + +Payload Format + *empty* + +UCAN_COMMAND_RESET +~~ + +*Host2Dev; mandatory* + +Reset the CAN controller (including error counters) + +Payload Format + *empty* + +UCAN_COMMAND_GET + + +*Host2Dev; mandatory* + +Get Information from the Device + +Subcommands +^^^ + +UCAN_COMMAND_GET_INFO + Request the device information structure ``ucan_ctl_payload_t.device_info``. + + See the ``device_info`` field for details, and + ``uapi/linux/can/netlink.h`` for an explanation of the + ``can_bittiming fields``. + + Payload Format +``ucan_ctl_payload_t.device_info`` + +UCAN_COMMAND_GET_PROTOCOL_VERSION + + Request the device protocol version + ``ucan_ctl_payload_t.protocol_version``. The current protocol version is 3. + + Payload Format +``ucan_ctl_payload_t.protocol_version`` + +.. note:: Devices that do not implement this command use the old + protocol version 1 + +UCAN_COM
[PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
The UCAN driver supports the microcontroller-based USB/CAN adapters from Theobroma Systems. There are two form-factors that run essentially the same firmware: * Seal: standalone USB stick ( https://www.theobroma-systems.com/seal ) * Mule: integrated on the PCB of various System-on-Modules from Theobroma Systems like the A31-µQ7 and the RK3399-Q7 ( https://www.theobroma-systems.com/rk3399-q7 ) The USB wire protocol has been designed to be as generic and hardware-indendent as possible in the hope of being useful for implementation on other microcontrollers. Signed-off-by: Martin Elshuber Signed-off-by: Jakob Unterwurzacher Signed-off-by: Philipp Tomsich --- Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1587 5 files changed, 1914 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c diff --git a/Documentation/networking/can_ucan_protocol.rst b/Documentation/networking/can_ucan_protocol.rst new file mode 100644 index ..d859b36200b4 --- /dev/null +++ b/Documentation/networking/can_ucan_protocol.rst @@ -0,0 +1,315 @@ += +The UCAN Protocol += + +UCAN is the protocol used by the microcontroller-based USB-CAN +adapter that is integrated on System-on-Modules from Theobroma Systems +and that is also available as a standalone USB stick. + +The UCAN protocol has been designed to be hardware-independent. +It is modeled closely after how Linux represents CAN devices +internally. All multi-byte integers are encoded as Little Endian. + +All structures mentioned in this document are defined in +``drivers/net/can/usb/ucan.c``. + +USB Endpoints += + +UCAN devices use three USB endpoints: + +CONTROL endpoint + The driver sends device management commands on this endpoint + +IN endpoint + The device sends CAN data frames and CAN error frames + +OUT endpoint + The driver sends CAN data frames on the out endpoint + + +CONTROL Messages + + +UCAN devices are configured using vendor requests on the control pipe. + +To support multiple CAN interfaces in a single USB device all +configuration commands target the corresponding interface in the USB +descriptor. + +The driver uses ``ucan_ctrl_command_in/out`` and +``ucan_device_request_in`` to deliver commands to the device. + +Setup Packet + + += = +``bmRequestType`` Direction | Vendor | (Interface or Device) +``bRequest`` Command Number +``wValue`` Subcommand Number (16 Bit) or 0 if not used +``wIndex`` USB Interface Index (0 for device commands) +``wLength``* Host to Device - Number of bytes to transmit + * Device to Host - Maximum Number of bytes to + receive. If the device send less. Commom ZLP + semantics are used. += = + +Error Handling +-- + +The device indicates failed control commands by stalling the +pipe. + +Device Commands +--- + +UCAN_DEVICE_GET_FW_STRING +~ + +*Dev2Host; optional* + +Request the device firmware string. + + +Interface Commands +-- + +UCAN_COMMAND_START +~~ + +*Host2Dev; mandatory* + +Bring the CAN interface up. + +Payload Format + ``ucan_ctl_payload_t.cmd_start`` + + +mode or mask of ``UCAN_MODE_*`` + + +UCAN_COMMAND_STOP +~~ + +*Host2Dev; mandatory* + +Stop the CAN interface + +Payload Format + *empty* + +UCAN_COMMAND_RESET +~~ + +*Host2Dev; mandatory* + +Reset the CAN controller (including error counters) + +Payload Format + *empty* + +UCAN_COMMAND_GET + + +*Host2Dev; mandatory* + +Get Information from the Device + +Subcommands +^^^ + +UCAN_COMMAND_GET_INFO + Request the device information structure ``ucan_ctl_payload_t.device_info``. + + See the ``device_info`` field for details, and + ``uapi/linux/can/netlink.h`` for an explanation of the + ``can_bittiming fields``. + + Payload Format +``ucan_ctl_payload_t.device_info`` + +UCAN_COMMAND_GET_PROTOCOL_VERSION + + Request the device protocol version + ``ucan_ctl_payload_t.protocol_version``. The current protocol version is 3. + + Payload Format +``ucan_ctl_payload_t.protocol_version`` + +.. note:: Devices that do not implement this command use the old + protocol version 1 + +UCAN_COMMAND_SET_BITTIMING +~~ + +*Host2Dev; mandatory* + +Setup bittiming by sending the the structure
[PATCH v2 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
This is the second iteration of the Theobroma Systems CAN/USB adapter driver. Thanks to feedback from the list and additonal internal stress-testing, the driver, the wire protocol and the device firmware have been improved. Among other changes, error states are now handled, and an explicit tx completion message has been added. A few questions and review comments are still open. I will post an email gathering all answers in a reply to this cover letter. Jakob Unterwurzacher (1): can: ucan: add driver for Theobroma Systems UCAN devices Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1587 5 files changed, 1914 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c -- 2.11.0 Cc: Martin Elshuber <martin.elshu...@theobroma-systems.com> Cc: Philipp Tomsich <philipp.toms...@theobroma-systems.com> Cc: Wolfgang Grandegger <w...@grandegger.com> Cc: Marc Kleine-Budde <m...@pengutronix.de> Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org
[PATCH v2 0/1] can: ucan: add driver for Theobroma Systems UCAN devices
This is the second iteration of the Theobroma Systems CAN/USB adapter driver. Thanks to feedback from the list and additonal internal stress-testing, the driver, the wire protocol and the device firmware have been improved. Among other changes, error states are now handled, and an explicit tx completion message has been added. A few questions and review comments are still open. I will post an email gathering all answers in a reply to this cover letter. Jakob Unterwurzacher (1): can: ucan: add driver for Theobroma Systems UCAN devices Documentation/networking/can_ucan_protocol.rst | 315 + Documentation/networking/index.rst |1 + drivers/net/can/usb/Kconfig| 10 + drivers/net/can/usb/Makefile |1 + drivers/net/can/usb/ucan.c | 1587 5 files changed, 1914 insertions(+) create mode 100644 Documentation/networking/can_ucan_protocol.rst create mode 100644 drivers/net/can/usb/ucan.c -- 2.11.0 Cc: Martin Elshuber Cc: Philipp Tomsich Cc: Wolfgang Grandegger Cc: Marc Kleine-Budde Cc: linux-...@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 13.03.18 18:44, Marc Kleine-Budde wrote: On 03/13/2018 06:35 PM, Jakob Unterwurzacher wrote: [...] Please mark all multibyte values going over the USB as either le or be. This is documented in can_ucan_protocol.rst, All multi-byte integers are encoded as Little Endian. but it's probably worth mentioning here as well. Please use standard 'net' comment style: /* this is a multiline * comment */ + struct ucan_ctl_cmd_start cmd_start; + /*** +* Setup Bittiming +* bmRequest == UCAN_COMMAND_SET_BITTIMING +***/ Understood. Thanks, Jakob
Re: [PATCH v2 1/1] can: ucan: add driver for Theobroma Systems UCAN devices
On 13.03.18 18:44, Marc Kleine-Budde wrote: On 03/13/2018 06:35 PM, Jakob Unterwurzacher wrote: [...] Please mark all multibyte values going over the USB as either le or be. This is documented in can_ucan_protocol.rst, All multi-byte integers are encoded as Little Endian. but it's probably worth mentioning here as well. Please use standard 'net' comment style: /* this is a multiline * comment */ + struct ucan_ctl_cmd_start cmd_start; + /*** +* Setup Bittiming +* bmRequest == UCAN_COMMAND_SET_BITTIMING +***/ Understood. Thanks, Jakob
Re: [PATCH v2 0/1] Open questions
On 13.03.18 18:42, Dr. Philipp Tomsich wrote: On 13 Mar 2018, at 18:40, Jakob Unterwurzacher <jakob.unterwurzac...@theobroma-systems.com> wrote: +/* get the urb context */ +if (WARN_ON(!context)) +return; Can this happen? Not unless there is a bug in the code. But we want to get a message before dereferencing a pointer. Why not use BUG_ON(!context)? I believe BUG_ON kills the kernel - the intention was to get loud message, but we'd rather not crash the user's machine. Best regards, Jakob
Re: [PATCH v2 0/1] Open questions
On 13.03.18 18:42, Dr. Philipp Tomsich wrote: On 13 Mar 2018, at 18:40, Jakob Unterwurzacher wrote: +/* get the urb context */ +if (WARN_ON(!context)) +return; Can this happen? Not unless there is a bug in the code. But we want to get a message before dereferencing a pointer. Why not use BUG_ON(!context)? I believe BUG_ON kills the kernel - the intention was to get loud message, but we'd rather not crash the user's machine. Best regards, Jakob
fuse readdirplus skip one entry when interrupted by signal
A user running a Haskell program [1] noticed a problem with fuse's readdirplus: when it is interrupted by a signal, it skips one directory entry. The problem is most apparent with Haskell as it uses SIGVTALRM to interrupt it's own green threads. A minimal reproducer in C, "ls-count.c", is available [2]. The problem has been reproduced against libfuse's "passthrough_fh.c", but also against gocryptfs, which uses go-fuse instead of libfuse. This suggest that the bug is in kernel-space, which also the opinion of libfuse upstream [3]. What "ls-count.c" does is that it loops over readdir while sending itself SIGVTALRM. When the count of directory entries changes, it exits: $ ./ls-count b ls-count: counts do not match: 2 vs 1 strace against ls-count shows that we get one entry, when we should get two ("." and ".."): getdents(3, /* 1 entries */, 32768) = 24 --- SIGVTALRM --- rt_sigreturn({mask=[]}) = 24 getdents(3, /* 0 entries */, 32768) = 0 The debug output from go-fuse [4] shows what seems to be happening: Dispatch 548: READDIRPLUS, NodeId: 1. data: {Fh 3 off 0 sz 4096} Serialize 548: READDIRPLUS code: OK value: 320 bytes data Dispatch 549: READDIRPLUS, NodeId: 1. data: {Fh 3 off 2 sz 4096} Serialize 549: READDIRPLUS code: OK value: The kernel starts reading the directory from "off 0", where it is interrupted, and only returns one entry to userspace. Then it continues reading at "off 2". Offset 1 is skipped. I can reliably reproduce this within 1 second against kernel 4.12.5. Best regards, Jakob [1] https://github.com/hanwen/go-fuse/issues/191 [2] https://gist.githubusercontent.com/rfjakob/79581292a037ae7cb068067cb6207ef8/raw/f71494a291cfded8a96d02c3f0ee2983457591cc/ls-count.c [3] https://github.com/libfuse/libfuse/issues/214 [4] gocryptfs -fg -fusedebug -nosyslog
fuse readdirplus skip one entry when interrupted by signal
A user running a Haskell program [1] noticed a problem with fuse's readdirplus: when it is interrupted by a signal, it skips one directory entry. The problem is most apparent with Haskell as it uses SIGVTALRM to interrupt it's own green threads. A minimal reproducer in C, "ls-count.c", is available [2]. The problem has been reproduced against libfuse's "passthrough_fh.c", but also against gocryptfs, which uses go-fuse instead of libfuse. This suggest that the bug is in kernel-space, which also the opinion of libfuse upstream [3]. What "ls-count.c" does is that it loops over readdir while sending itself SIGVTALRM. When the count of directory entries changes, it exits: $ ./ls-count b ls-count: counts do not match: 2 vs 1 strace against ls-count shows that we get one entry, when we should get two ("." and ".."): getdents(3, /* 1 entries */, 32768) = 24 --- SIGVTALRM --- rt_sigreturn({mask=[]}) = 24 getdents(3, /* 0 entries */, 32768) = 0 The debug output from go-fuse [4] shows what seems to be happening: Dispatch 548: READDIRPLUS, NodeId: 1. data: {Fh 3 off 0 sz 4096} Serialize 548: READDIRPLUS code: OK value: 320 bytes data Dispatch 549: READDIRPLUS, NodeId: 1. data: {Fh 3 off 2 sz 4096} Serialize 549: READDIRPLUS code: OK value: The kernel starts reading the directory from "off 0", where it is interrupted, and only returns one entry to userspace. Then it continues reading at "off 2". Offset 1 is skipped. I can reliably reproduce this within 1 second against kernel 4.12.5. Best regards, Jakob [1] https://github.com/hanwen/go-fuse/issues/191 [2] https://gist.githubusercontent.com/rfjakob/79581292a037ae7cb068067cb6207ef8/raw/f71494a291cfded8a96d02c3f0ee2983457591cc/ls-count.c [3] https://github.com/libfuse/libfuse/issues/214 [4] gocryptfs -fg -fusedebug -nosyslog
Re: [PATCH] ASoC: sgtl5000: add support for a fixed sysclk frequency
> On 16. Oct 2017, at 13:19, Fabio Estevamwrote: > >> I think I see a way to fix this without adding a DTS property, I’ll send out >> a patch later this afternoon. > > Ok, great. Hi Fabio, I think I found a way to get our use-case working with just DTS configuration for simple-audio-card. When you define the “clocks” property for the "simple-audio-card,cpu” sub-node, set_sysclk is called only once, on startup - like the IMX driver does. We should get away without changes to the codec driver. Thanks & best regards, Jakob
Re: [PATCH] ASoC: sgtl5000: add support for a fixed sysclk frequency
> On 16. Oct 2017, at 13:19, Fabio Estevam wrote: > >> I think I see a way to fix this without adding a DTS property, I’ll send out >> a patch later this afternoon. > > Ok, great. Hi Fabio, I think I found a way to get our use-case working with just DTS configuration for simple-audio-card. When you define the “clocks” property for the "simple-audio-card,cpu” sub-node, set_sysclk is called only once, on startup - like the IMX driver does. We should get away without changes to the codec driver. Thanks & best regards, Jakob
Re: [PATCH] ASoC: sgtl5000: add support for a fixed sysclk frequency
> On 16. Oct 2017, at 12:51, Fabio Estevam <feste...@gmail.com> wrote: > > Hi Jakob, > > On Mon, Oct 16, 2017 at 7:46 AM, Jakob Unterwurzacher > <jakob.unterwurzac...@theobroma-systems.com> wrote: >> Hi Fabio, that’s interesting. >> >> Looks like the fsl,imx-audio-sgtl5000 sound card driver that is used on the >> imx51-babbage >> does not try to adjust MCLK in response to the playback sample rate. >> >> The simple-audio-card driver does that (it calls snd_soc_dai_set_sysclk >> whenever playback >> starts). > > Could you please test this with linux-next tree? > > Looks like that the recent b0a7043d5c2c("ASoC: fsl_ssi: Caculate bit > clock rate using slot number and width") would avoid the issue. > Hi Fabio, if this ( https://patchwork.kernel.org/patch/9952385/ ) is the patch, then it seems to be specific to the Freescale I2S controller. We use the SGTL5000 together with a Rockchip CPU. I think I see a way to fix this without adding a DTS property, I’ll send out a patch later this afternoon. Best regards, Jakob
Re: [PATCH] ASoC: sgtl5000: add support for a fixed sysclk frequency
> On 16. Oct 2017, at 12:51, Fabio Estevam wrote: > > Hi Jakob, > > On Mon, Oct 16, 2017 at 7:46 AM, Jakob Unterwurzacher > wrote: >> Hi Fabio, that’s interesting. >> >> Looks like the fsl,imx-audio-sgtl5000 sound card driver that is used on the >> imx51-babbage >> does not try to adjust MCLK in response to the playback sample rate. >> >> The simple-audio-card driver does that (it calls snd_soc_dai_set_sysclk >> whenever playback >> starts). > > Could you please test this with linux-next tree? > > Looks like that the recent b0a7043d5c2c("ASoC: fsl_ssi: Caculate bit > clock rate using slot number and width") would avoid the issue. > Hi Fabio, if this ( https://patchwork.kernel.org/patch/9952385/ ) is the patch, then it seems to be specific to the Freescale I2S controller. We use the SGTL5000 together with a Rockchip CPU. I think I see a way to fix this without adding a DTS property, I’ll send out a patch later this afternoon. Best regards, Jakob
Re: [PATCH] ASoC: sgtl5000: add support for a fixed sysclk frequency
Hi Fabio, that’s interesting. Looks like the fsl,imx-audio-sgtl5000 sound card driver that is used on the imx51-babbage does not try to adjust MCLK in response to the playback sample rate. The simple-audio-card driver does that (it calls snd_soc_dai_set_sysclk whenever playback starts). I think we can do without the new dts property when we fix sgtl5000_set_dai_sysclk to actually call clk_set_rate on the MCLK instead of blindly setting any requested value. On a fixed clock, clk_set_rate is a no-op if the requested rate matches the fixed rate and errors out otherwise. I think this would be the more correct fix, but it carries the risk of breaking existing boards that rely on the current behaviour. What do you think? Best regards, Jakob > On 16. Oct 2017, at 01:09, Fabio Estevam <fabio.este...@nxp.com> wrote: > > [Sorry for the top-posting. I did not see your message in the mailing list] > > Hi Jakob, > > arch/arm/boot/dts/imx51-babbage.dts has SYS_MCLK generated by an external > fixed oscillator and it works fine. > > Do you really need this new property? > From: Jakob Unterwurzacher <jakob.unterwurzac...@theobroma-systems.com> > Sent: Friday, October 6, 2017 5:34:44 AM > To: linux-kernel@vger.kernel.org > Cc: Jakob Unterwurzacher; Klaus Goger; Liam Girdwood; Mark Brown; Rob > Herring; Mark Rutland; Jaroslav Kysela; Takashi Iwai; Richard Leitner; Fabio > Estevam; Bhumika Goyal; alsa-de...@alsa-project.org; > devicet...@vger.kernel.org > Subject: [PATCH] ASoC: sgtl5000: add support for a fixed sysclk frequency > > The system clock, also called SYS_MCLK in the SGTL5000 datasheet, > is usually generated by the I2S master with a frequency proportional > to the sample rate. > > However, this is not always the case. Compute modules following > the Qseven specification, for example, do not even have a pin for > I2S SYS_MCLK on their board-to-board connector. Only I2S_WS, I2S_RST# > I2S_CLK, I2S_SDI and I2S_SDO are specified. > > Instead, SYS_MCLK is usually generated by an external fixed oscillator. > > Add support for this scenario, enabled by the new "sysclk-fixed" > devicetree property. > > Tested on a "Puma" RK3399-Q7 compute module with a "Haikou" baseboard > that has a fixed 24.576 MHz oscillator connected to the codec's SYS_MCLK. > > Signed-off-by: Jakob Unterwurzacher > <jakob.unterwurzac...@theobroma-systems.com> > Cc: Klaus Goger <klaus.go...@theobroma-systems.com> > --- > Documentation/devicetree/bindings/sound/sgtl5000.txt | 5 + > sound/soc/codecs/sgtl5000.c | 12 > 2 files changed, 17 insertions(+) > > diff --git a/Documentation/devicetree/bindings/sound/sgtl5000.txt > b/Documentation/devicetree/bindings/sound/sgtl5000.txt > index 7a73a9d62015..0d0dd58b2d4a 100644 > --- a/Documentation/devicetree/bindings/sound/sgtl5000.txt > +++ b/Documentation/devicetree/bindings/sound/sgtl5000.txt > @@ -35,6 +35,11 @@ VDDIO1.8V2.5V3.3V > 2 = 3.33 mA 5.74 mA 8.03 mA > 3 = 4.99 mA 8.61 mA 12.05 mA > > +- sysclk-fixed : If this property exists, the sysclk is considered to be > + generated by a fixed-frequency oscillator. The frequency value is read > + from the clock defined in the "clocks" property. set_sysclk requests > + are ignored. > + > Example: > > codec: sgtl5000@0a { > diff --git a/sound/soc/codecs/sgtl5000.c b/sound/soc/codecs/sgtl5000.c > index f2bb4feba3b6..be3aea89371c 100644 > --- a/sound/soc/codecs/sgtl5000.c > +++ b/sound/soc/codecs/sgtl5000.c > @@ -133,6 +133,7 @@ struct sgtl5000_priv { > u8 micbias_resistor; > u8 micbias_voltage; > u8 lrclk_strength; > + u8 sysclk_fixed; /* sysclk rate cannot be changed */ > }; > > /* > @@ -613,6 +614,10 @@ static int sgtl5000_set_dai_sysclk(struct snd_soc_dai > *codec_dai, > > switch (clk_id) { > case SGTL5000_SYSCLK: > + if (sgtl5000->sysclk_fixed) { > + dev_dbg(codec->dev, "sysclk-fixed: ignoring > set_sysclk request\n"); > + return 0; > + } > sgtl5000->sysclk = freq; > break; > default: > @@ -1444,6 +1449,13 @@ static int sgtl5000_i2c_probe(struct i2c_client > *client, > } else { > sgtl5000->micbias_voltage = 0; > } > + if (of_property_read_bool(np, "sysclk-fixed")) { > + sgtl5000->sysclk = clk_get_rate(sgtl5000->mclk); > + sgtl5000->sysclk_fixed = 1; > + dev_dbg(>dev, > + "sysclk-fixed: setting sysclk to fixed value > %d\n", > + sgtl5000->sysclk); > + } > } > > sgtl5000->lrclk_strength = I2S_LRCLK_STRENGTH_LOW; > -- > 2.11.0
Re: [PATCH] ASoC: sgtl5000: add support for a fixed sysclk frequency
Hi Fabio, that’s interesting. Looks like the fsl,imx-audio-sgtl5000 sound card driver that is used on the imx51-babbage does not try to adjust MCLK in response to the playback sample rate. The simple-audio-card driver does that (it calls snd_soc_dai_set_sysclk whenever playback starts). I think we can do without the new dts property when we fix sgtl5000_set_dai_sysclk to actually call clk_set_rate on the MCLK instead of blindly setting any requested value. On a fixed clock, clk_set_rate is a no-op if the requested rate matches the fixed rate and errors out otherwise. I think this would be the more correct fix, but it carries the risk of breaking existing boards that rely on the current behaviour. What do you think? Best regards, Jakob > On 16. Oct 2017, at 01:09, Fabio Estevam wrote: > > [Sorry for the top-posting. I did not see your message in the mailing list] > > Hi Jakob, > > arch/arm/boot/dts/imx51-babbage.dts has SYS_MCLK generated by an external > fixed oscillator and it works fine. > > Do you really need this new property? > From: Jakob Unterwurzacher > Sent: Friday, October 6, 2017 5:34:44 AM > To: linux-kernel@vger.kernel.org > Cc: Jakob Unterwurzacher; Klaus Goger; Liam Girdwood; Mark Brown; Rob > Herring; Mark Rutland; Jaroslav Kysela; Takashi Iwai; Richard Leitner; Fabio > Estevam; Bhumika Goyal; alsa-de...@alsa-project.org; > devicet...@vger.kernel.org > Subject: [PATCH] ASoC: sgtl5000: add support for a fixed sysclk frequency > > The system clock, also called SYS_MCLK in the SGTL5000 datasheet, > is usually generated by the I2S master with a frequency proportional > to the sample rate. > > However, this is not always the case. Compute modules following > the Qseven specification, for example, do not even have a pin for > I2S SYS_MCLK on their board-to-board connector. Only I2S_WS, I2S_RST# > I2S_CLK, I2S_SDI and I2S_SDO are specified. > > Instead, SYS_MCLK is usually generated by an external fixed oscillator. > > Add support for this scenario, enabled by the new "sysclk-fixed" > devicetree property. > > Tested on a "Puma" RK3399-Q7 compute module with a "Haikou" baseboard > that has a fixed 24.576 MHz oscillator connected to the codec's SYS_MCLK. > > Signed-off-by: Jakob Unterwurzacher > > Cc: Klaus Goger > --- > Documentation/devicetree/bindings/sound/sgtl5000.txt | 5 + > sound/soc/codecs/sgtl5000.c | 12 > 2 files changed, 17 insertions(+) > > diff --git a/Documentation/devicetree/bindings/sound/sgtl5000.txt > b/Documentation/devicetree/bindings/sound/sgtl5000.txt > index 7a73a9d62015..0d0dd58b2d4a 100644 > --- a/Documentation/devicetree/bindings/sound/sgtl5000.txt > +++ b/Documentation/devicetree/bindings/sound/sgtl5000.txt > @@ -35,6 +35,11 @@ VDDIO1.8V2.5V3.3V > 2 = 3.33 mA 5.74 mA 8.03 mA > 3 = 4.99 mA 8.61 mA 12.05 mA > > +- sysclk-fixed : If this property exists, the sysclk is considered to be > + generated by a fixed-frequency oscillator. The frequency value is read > + from the clock defined in the "clocks" property. set_sysclk requests > + are ignored. > + > Example: > > codec: sgtl5000@0a { > diff --git a/sound/soc/codecs/sgtl5000.c b/sound/soc/codecs/sgtl5000.c > index f2bb4feba3b6..be3aea89371c 100644 > --- a/sound/soc/codecs/sgtl5000.c > +++ b/sound/soc/codecs/sgtl5000.c > @@ -133,6 +133,7 @@ struct sgtl5000_priv { > u8 micbias_resistor; > u8 micbias_voltage; > u8 lrclk_strength; > + u8 sysclk_fixed; /* sysclk rate cannot be changed */ > }; > > /* > @@ -613,6 +614,10 @@ static int sgtl5000_set_dai_sysclk(struct snd_soc_dai > *codec_dai, > > switch (clk_id) { > case SGTL5000_SYSCLK: > + if (sgtl5000->sysclk_fixed) { > + dev_dbg(codec->dev, "sysclk-fixed: ignoring > set_sysclk request\n"); > + return 0; > + } > sgtl5000->sysclk = freq; > break; > default: > @@ -1444,6 +1449,13 @@ static int sgtl5000_i2c_probe(struct i2c_client > *client, > } else { > sgtl5000->micbias_voltage = 0; > } > + if (of_property_read_bool(np, "sysclk-fixed")) { > + sgtl5000->sysclk = clk_get_rate(sgtl5000->mclk); > + sgtl5000->sysclk_fixed = 1; > + dev_dbg(>dev, > + "sysclk-fixed: setting sysclk to fixed value > %d\n", > + sgtl5000->sysclk); > + } > } > > sgtl5000->lrclk_strength = I2S_LRCLK_STRENGTH_LOW; > -- > 2.11.0
Re: tmpfs returns incorrect data on concurrent pread() and truncate()
First of all, thanks for your replies! On 02.11.2016 00:51, Hugh Dickins wrote: > I don't think that there will be much appetite for changing the > kernel's VFS to prevent that. I hope that gocryptfs can provide > the serialization that it needs for itself, or otherwise handle > those zeroes without crashing. Yes, I dropped the nonce sanity check, the MAC will catch the corruption anyway. Just FYI, xfstests generic/075 discovered this. The test itself does not check the data it gets back, which is why it usually passes. Best regards, Jakob
Re: tmpfs returns incorrect data on concurrent pread() and truncate()
First of all, thanks for your replies! On 02.11.2016 00:51, Hugh Dickins wrote: > I don't think that there will be much appetite for changing the > kernel's VFS to prevent that. I hope that gocryptfs can provide > the serialization that it needs for itself, or otherwise handle > those zeroes without crashing. Yes, I dropped the nonce sanity check, the MAC will catch the corruption anyway. Just FYI, xfstests generic/075 discovered this. The test itself does not check the data it gets back, which is why it usually passes. Best regards, Jakob
tmpfs returns incorrect data on concurrent pread() and truncate()
tmpfs seems to be incorrectly returning 0-bytes when reading from a file that is concurrently being truncated. This is causing crashes in gocryptfs, a cryptographic FUSE overlay, when it reads a nonce from disk that should absolutely positively never be all-zero. I have written a reproducer in C that triggers this issue on both boxes I tested, Linux 4.7.2 and 3.16.7, both amd64. It can be downloaded from here: https://gist.github.com/rfjakob/d01281c737db38075767f90bf03fc475 or, alternatively, I have attached it to this email at the bottom. The reproducer: 1) Creates a 10MB file filled with 'x' at /dev/shm/x 2) Spawns a thread that truncates the file 3 bytes at a time 3) Spawns another thread that pread()s the file 1 byte at a time starting from the top 4) Prints "wrong data" whenever the pread() gets something that is not 'x' or an empty result. Example run: $ gcc -Wall -lpthread truncate_read.c && ./a.out wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 [...] Best regards, Jakob - truncate_read.c --8<--- // Compile and run: // gcc -Wall -lpthread truncate_read.c && ./a.out #include #include #include #include #include #include #include #include int fd; int l = 10*1024*1024; pthread_t tid[2]; void* read_thread(void *arg){ int o,n; char b; for(o=l; o>0; o--) { b = 'a'; n = pread(fd, , 1, o); if(n==0) { continue; } if(b != 'x') { printf("wrong data: %x\n", b); } } return NULL; } void* truncate_thread(void *arg){ // Parent = Truncater int o,n; // "3" seems to be the sweet spot to trigger the most errors. for(o=l; o>0; o-=3) { n = ftruncate(fd, o); if(n!=0) { perror("ftruncate err"); } } return NULL; } int main(int argc, char *argv[]) { fd = open("/dev/shm/x", O_RDWR | O_TRUNC | O_CREAT, 0666); if(fd < 0) { printf("open failed\n"); exit(1); } char* x = malloc(l); memset(x, 'x', l); write(fd, x, l); pthread_create(&(tid[0]), NULL, _thread, NULL); pthread_create(&(tid[1]), NULL, _thread, NULL); void *res; pthread_join(tid[0], ); pthread_join(tid[1], ); return 0; }
tmpfs returns incorrect data on concurrent pread() and truncate()
tmpfs seems to be incorrectly returning 0-bytes when reading from a file that is concurrently being truncated. This is causing crashes in gocryptfs, a cryptographic FUSE overlay, when it reads a nonce from disk that should absolutely positively never be all-zero. I have written a reproducer in C that triggers this issue on both boxes I tested, Linux 4.7.2 and 3.16.7, both amd64. It can be downloaded from here: https://gist.github.com/rfjakob/d01281c737db38075767f90bf03fc475 or, alternatively, I have attached it to this email at the bottom. The reproducer: 1) Creates a 10MB file filled with 'x' at /dev/shm/x 2) Spawns a thread that truncates the file 3 bytes at a time 3) Spawns another thread that pread()s the file 1 byte at a time starting from the top 4) Prints "wrong data" whenever the pread() gets something that is not 'x' or an empty result. Example run: $ gcc -Wall -lpthread truncate_read.c && ./a.out wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 wrong data: 0 [...] Best regards, Jakob - truncate_read.c --8<--- // Compile and run: // gcc -Wall -lpthread truncate_read.c && ./a.out #include #include #include #include #include #include #include #include int fd; int l = 10*1024*1024; pthread_t tid[2]; void* read_thread(void *arg){ int o,n; char b; for(o=l; o>0; o--) { b = 'a'; n = pread(fd, , 1, o); if(n==0) { continue; } if(b != 'x') { printf("wrong data: %x\n", b); } } return NULL; } void* truncate_thread(void *arg){ // Parent = Truncater int o,n; // "3" seems to be the sweet spot to trigger the most errors. for(o=l; o>0; o-=3) { n = ftruncate(fd, o); if(n!=0) { perror("ftruncate err"); } } return NULL; } int main(int argc, char *argv[]) { fd = open("/dev/shm/x", O_RDWR | O_TRUNC | O_CREAT, 0666); if(fd < 0) { printf("open failed\n"); exit(1); } char* x = malloc(l); memset(x, 'x', l); write(fd, x, l); pthread_create(&(tid[0]), NULL, _thread, NULL); pthread_create(&(tid[1]), NULL, _thread, NULL); void *res; pthread_join(tid[0], ); pthread_join(tid[1], ); return 0; }
Re: [fuse-devel] Horrible mmap write performance (kernel writeback issue?)
On 12.04.2016 13:09, Tejun Heo wrote: >> >> Probably you want to look into: >> https://lkml.org/lkml/2016/3/10/21 >> >> The patch mentioned above solves the issue for me. > > Heh, I tracked it down to wb_over_bg_thresh() and fell asleep. Yeah, > that is the right fix. Works wonderfully now, thanks to everybody involved. Is it too late for 4.6? Best regards, Jakob
Re: [fuse-devel] Horrible mmap write performance (kernel writeback issue?)
On 12.04.2016 13:09, Tejun Heo wrote: >> >> Probably you want to look into: >> https://lkml.org/lkml/2016/3/10/21 >> >> The patch mentioned above solves the issue for me. > > Heh, I tracked it down to wb_over_bg_thresh() and fell asleep. Yeah, > that is the right fix. Works wonderfully now, thanks to everybody involved. Is it too late for 4.6? Best regards, Jakob
Re: [fuse-devel] Horrible mmap write performance (kernel writeback issue?)
On 30.03.2016 20:47, Tejun Heo wrote: > Hmmm... cgroup writeback support shouldn't affect fuse at all as the > backing device doesn't enable cgroup support. I probably made some > silly mistake. Is there a simple reproducer I can play with? Hi Tejun! A simple reproducer is at https://github.com/rfjakob/mmapwrite . What seems to be happening in the kernel is that the estimated device bandwith drops to zero. I'm not even sure how this works for FUSE, but that's what I gathered from some printk debugging. What I also found is that once mmapwrite is hung, you can unblock it for some time by running something like cat /dev/zero > /var/tmp/foo mmapwrite will then steam ahead as long as cat is writing, even though encfs writes to /tmp (tmpfs) and /var is on the ext4 disk. Note that the hang happens regardless of the backing device, on both tmpfs and ext4. Best regards, Jakob
Re: [fuse-devel] Horrible mmap write performance (kernel writeback issue?)
On 30.03.2016 20:47, Tejun Heo wrote: > Hmmm... cgroup writeback support shouldn't affect fuse at all as the > backing device doesn't enable cgroup support. I probably made some > silly mistake. Is there a simple reproducer I can play with? Hi Tejun! A simple reproducer is at https://github.com/rfjakob/mmapwrite . What seems to be happening in the kernel is that the estimated device bandwith drops to zero. I'm not even sure how this works for FUSE, but that's what I gathered from some printk debugging. What I also found is that once mmapwrite is hung, you can unblock it for some time by running something like cat /dev/zero > /var/tmp/foo mmapwrite will then steam ahead as long as cat is writing, even though encfs writes to /tmp (tmpfs) and /var is on the ext4 disk. Note that the hang happens regardless of the backing device, on both tmpfs and ext4. Best regards, Jakob
Re: [fuse-devel] Horrible mmap write performance (kernel writeback issue?)
On 16.03.2016 10:44, Miklos Szeredi wrote: > On Tue, Mar 15, 2016 at 8:55 AM, Jakob Unterwurzacher > <jakob...@gmail.com> wrote: >> Just for anybody finding this thread: This still happens in v4.4, it >> just took longer to trigger. >> >> I have posted more details to linux-kernel (copy-pasted below), >> http://thread.gmane.org/gmane.linux.kernel/2132944 > > Okay, so you can reproduce this relatively quickly. Can you try "git > bisect" to find exactly which commit is responsible? > > Thanks, > Miklos That took a while, but it looks like it got it: > commit 947e9762a8ddefda38aa21e249e6a4fec215cd12 > Author: Tejun Heo <t...@kernel.org> > Date: Fri May 22 18:23:32 2015 -0400 > > writeback: update wb_over_bg_thresh() to use wb_domain aware operations Note that this commens seems to only activate changes that happened in the commit before, aa661bb: > commit aa661bbe1e61ce80ca4ae98804f673ede94b0827 > Author: Tejun Heo <t...@kernel.org> > Date: Fri May 22 18:23:31 2015 -0400 > > writeback: move over_bground_thresh() to mm/page-writeback.c Anyway, I can reliably reboot between aa661bb and 947e976 and always get the same results: aa661bb passes 947e976 fails I you want to reproduce, clone https://github.com/rfjakob/mmapwrite.git and run ./encfs-test.sh (needs encfs installed). On the bad kernel, it will hang within a few seconds. Thanks, Jakob
Re: [fuse-devel] Horrible mmap write performance (kernel writeback issue?)
On 16.03.2016 10:44, Miklos Szeredi wrote: > On Tue, Mar 15, 2016 at 8:55 AM, Jakob Unterwurzacher > wrote: >> Just for anybody finding this thread: This still happens in v4.4, it >> just took longer to trigger. >> >> I have posted more details to linux-kernel (copy-pasted below), >> http://thread.gmane.org/gmane.linux.kernel/2132944 > > Okay, so you can reproduce this relatively quickly. Can you try "git > bisect" to find exactly which commit is responsible? > > Thanks, > Miklos That took a while, but it looks like it got it: > commit 947e9762a8ddefda38aa21e249e6a4fec215cd12 > Author: Tejun Heo > Date: Fri May 22 18:23:32 2015 -0400 > > writeback: update wb_over_bg_thresh() to use wb_domain aware operations Note that this commens seems to only activate changes that happened in the commit before, aa661bb: > commit aa661bbe1e61ce80ca4ae98804f673ede94b0827 > Author: Tejun Heo > Date: Fri May 22 18:23:31 2015 -0400 > > writeback: move over_bground_thresh() to mm/page-writeback.c Anyway, I can reliably reboot between aa661bb and 947e976 and always get the same results: aa661bb passes 947e976 fails I you want to reproduce, clone https://github.com/rfjakob/mmapwrite.git and run ./encfs-test.sh (needs encfs installed). On the bad kernel, it will hang within a few seconds. Thanks, Jakob
Regression for mmap writes through FUSE in 4.2+
I have noticed an annoying regression that was introduced in 4.2 and is still there in 4.4. mmap writes to FUSE filesystems are throttled down to basically zero. Reproducer: https://github.com/rfjakob/mmapwrite , testing against encfs: $ mmapwrite /tmp/encfs-mnt/foo 1 .. 107.01 MB/s 2 .. 101.98 MB/s [...] 68 .. 106.79 MB/s 69 .. 105.09 MB/s 70 .. 2.02 MB/s 71 .. 1.77 MB/s 72 .. 0.42 MB/s 73 (hangs) I have tested kernels from 4.0 and this seems to have been introduced in 4.2: 4.0 ... 140MB/s permanent 4.1 ... 140MB/s permanent 4.2 ... 100MB/s at the start, sudden slowdown to 1MB/s after ~5GB 4.3 ... 100MB/s at the start, sudden slowdown to 1MB/s after ~1.5GB 4.4-rc4 ... 100MB/s at the start, slowly ramps down, 0.3MB/s after ~2GB 4.4 ... 100MB/s at the start, sudden slowdown after ~3GB Is there a way to disable the throttling? Or at least exempt FUSE until there is a proper fix? Thanks, Jakob
Regression for mmap writes through FUSE in 4.2+
I have noticed an annoying regression that was introduced in 4.2 and is still there in 4.4. mmap writes to FUSE filesystems are throttled down to basically zero. Reproducer: https://github.com/rfjakob/mmapwrite , testing against encfs: $ mmapwrite /tmp/encfs-mnt/foo 1 .. 107.01 MB/s 2 .. 101.98 MB/s [...] 68 .. 106.79 MB/s 69 .. 105.09 MB/s 70 .. 2.02 MB/s 71 .. 1.77 MB/s 72 .. 0.42 MB/s 73 (hangs) I have tested kernels from 4.0 and this seems to have been introduced in 4.2: 4.0 ... 140MB/s permanent 4.1 ... 140MB/s permanent 4.2 ... 100MB/s at the start, sudden slowdown to 1MB/s after ~5GB 4.3 ... 100MB/s at the start, sudden slowdown to 1MB/s after ~1.5GB 4.4-rc4 ... 100MB/s at the start, slowly ramps down, 0.3MB/s after ~2GB 4.4 ... 100MB/s at the start, sudden slowdown after ~3GB Is there a way to disable the throttling? Or at least exempt FUSE until there is a proper fix? Thanks, Jakob