Re: [Intel-wired-lan] [net-next] igb: add function to set I210 transmit mode

2016-08-13 Thread Richard Cochran
On Sat, Aug 13, 2016 at 08:27:38AM -0700, Alexander Duyck wrote:
> I really don't think this patch is going to work.  If you are going to
> implement something like this and have a hope to get it accepted into
> the Linux kernel you need to come up with a solution that will work
> for more than this one device.  We don't want the drivers having to
> carry around their own sysfs controls for things that really are not
> proprietary to the device.  There needs to be a generic kernel
> interface for this.  The fact is something like QAV more than likely
> exists on other devices as well so it may be worthwhile to look into
> seeing if you could come up with some way of interfacing this with
> either ethtool, iproute2, or maybe even the DCB/LLDP utilities since
> this is essentially splitting the Tx into two separate traffic
> classes.

Yes to all of this.
 
> Also for this kind of patch it would be best to include the netdev
> mailing list.  That way it can be reviewed by a wider audience and you
> are much more likely to get this accepted upstream rather than have it
> rejected when Jeff Kirsher attempts to submit it.

Right.  We just had a discussion about implementing TSN, and we will
need proper infrastructure in place *before* we start hacking
drivers.

Thanks,
Richard


Re: [Intel-wired-lan] [net-next] igb: add function to set I210 transmit mode

2016-08-13 Thread Alexander Duyck
On Tue, Aug 9, 2016 at 11:48 PM, Gangfeng  wrote:
> From: Gangfeng Huang 
>
> I210 supports two transmit modes, legacy and Qav. The transmit mode is
> configured through the TQAVCTRL.QavMode register field. Before this
> patch, the igb driver only supported legacy mode. This patch makes it
> possible to configure the transmit mode.
>
> Example:
> Get the transmit mode:
> $ cat /sys/class/net/eth0/qav_mode
> 0
> Set the transmit mode to Qav mode:
> $ echo 1 > /sys/class/net/eth0/qav_mode
>
> Tested:
> Setting /sys/class/net/eth0/qav_mode to Qav mode,
>  1) Switch back and forth between Qav mode and legacy mode
>  2) Send/recv packets in both modes.
>
> Signed-off-by: Gangfeng Huang 

I really don't think this patch is going to work.  If you are going to
implement something like this and have a hope to get it accepted into
the Linux kernel you need to come up with a solution that will work
for more than this one device.  We don't want the drivers having to
carry around their own sysfs controls for things that really are not
proprietary to the device.  There needs to be a generic kernel
interface for this.  The fact is something like QAV more than likely
exists on other devices as well so it may be worthwhile to look into
seeing if you could come up with some way of interfacing this with
either ethtool, iproute2, or maybe even the DCB/LLDP utilities since
this is essentially splitting the Tx into two separate traffic
classes.

Also for this kind of patch it would be best to include the netdev
mailing list.  That way it can be reviewed by a wider audience and you
are much more likely to get this accepted upstream rather than have it
rejected when Jeff Kirsher attempts to submit it.

> ---
>  drivers/net/ethernet/intel/igb/e1000_defines.h |  21 +++
>  drivers/net/ethernet/intel/igb/e1000_regs.h|   7 +
>  drivers/net/ethernet/intel/igb/igb.h   |   5 +
>  drivers/net/ethernet/intel/igb/igb_main.c  | 178 
> -
>  4 files changed, 209 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
> index cf3846b..f13d6a7 100644
> --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> @@ -360,6 +360,7 @@
>  #define MAX_JUMBO_FRAME_SIZE   0x2600
>
>  /* PBA constants */
> +#define E1000_PBA_32K 0x0020
>  #define E1000_PBA_34K 0x0022
>  #define E1000_PBA_64K 0x0040 /* 64KB */
>
> @@ -1028,4 +1029,24 @@
>  #define E1000_VLAPQF_P_VALID(_n)  (0x1 << (3 + (_n) * 4))
>  #define E1000_VLAPQF_QUEUE_MASK   0x03
>
> +/* Queue mode, 0=strict, 1=SR mode */
> +#define E1000_TQAVCC_QUEUEMODE 0x8000
> +/* Transmit mode, 0=legacy, 1=QAV */
> +#define E1000_TQAVCTRL_TXMODE  0x0001
> +/* Report DMA time of tx packets */
> +#define E1000_TQAVCTRL_1588_STAT_EN    0x0004
> +#define E1000_TQAVCTRL_DATA_FETCH_ARB  0x0010 /* Data fetch arbitration */
> +#define E1000_TQAVCTRL_DATA_TRAN_ARB   0x0100 /* Data tx arbitration */
> +#define E1000_TQAVCTRL_DATA_TRAN_TIM   0x0200 /* Data launch time valid */
> +/* Stall SP to guarantee SR */
> +#define E1000_TQAVCTRL_SP_WAIT_SR  0x0400
> +#define E1000_TQAVCTRL_FETCH_TM_SHIFT  (16)
> +
> +#define E1000_TXPBSIZE_TX0PB_SHIFT 0
> +#define E1000_TXPBSIZE_TX1PB_SHIFT 6
> +#define E1000_TXPBSIZE_TX2PB_SHIFT 12
> +#define E1000_TXPBSIZE_TX3PB_SHIFT 18
> +
> +#define E1000_DTXMXPKTSZ_DEFAULT 0x0098
> +
>  #endif
> diff --git a/drivers/net/ethernet/intel/igb/e1000_regs.h b/drivers/net/ethernet/intel/igb/e1000_regs.h
> index 9b66b6f..7cffabc 100644
> --- a/drivers/net/ethernet/intel/igb/e1000_regs.h
> +++ b/drivers/net/ethernet/intel/igb/e1000_regs.h
> @@ -138,6 +138,12 @@
>  #define E1000_FCRTC 0x02170 /* Flow Control Rx high watermark */
>  #define E1000_PCIEMISC 0x05BB8 /* PCIE misc config register */
>
> +/* High credit registers where _n can be 0 or 1. */
> +#define E1000_TQAVHC(_n)   (0x300C + 0x40 * (_n))
> +/* QAV Tx mode control registers where _n can be 0 or 1. */
> +#define E1000_TQAVCC(_n)   (0x3004 + 0x40 * (_n))
> +#define E1000_TQAVCTRL 0x3570 /* Tx Qav Control registers */
> +
>  /* TX Rate Limit Registers */
>  #define E1000_RTTDQSEL 0x3604 /* Tx Desc Plane Queue Select - WO */
>  #define E1000_RTTBCNRM 0x3690 /* Tx BCN Rate-scheduler MMW */
> @@ -204,6 +210,7 @@
>  #define E1000_TDFT 0x03418  /* TX Data FIFO Tail - RW */
>  #define E1000_TDFHS 0x03420  /* TX Data FIFO Head Saved - RW */
>  #define E1000_TDFPC 0x03430  /* TX Data FIFO Packet Count - RW */
> +#define E1000_DTXMXPKT 0x0355C  /* DMA TX Maximum Packet Size */
>  #define E1000_DTXCTL   0x03590  /* DMA TX Control - RW */
>  #define E1000_CRCERRS  0x04000  /* CRC Error Count - R/clr */
>  #define E1000_ALGNERRC 0x04004  /* Alignment Error Count - R/clr */
> diff --git 
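
For readers following the register discussion: the mode switch described in the patch comes down to a read-modify-write of one bit in TQAVCTRL. Below is a minimal userspace sketch, assuming the bit value quoted in the patch (E1000_TQAVCTRL_TXMODE, bit 0). The `demo_` names are hypothetical, and a real driver would go through its register accessors rather than operate on a plain integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of TQAVCTRL selects the transmit mode (0 = legacy, 1 = Qav),
 * following the E1000_TQAVCTRL_TXMODE define quoted in the patch. */
#define DEMO_TQAVCTRL_TXMODE 0x1u

/* Hypothetical helper: return the register value with the mode bit
 * set or cleared, leaving every other bit untouched. */
static uint32_t demo_set_qav_mode(uint32_t tqavctrl, bool qav)
{
	if (qav)
		return tqavctrl | DEMO_TQAVCTRL_TXMODE;
	return tqavctrl & ~DEMO_TQAVCTRL_TXMODE;
}

/* Hypothetical helper: report whether the value selects Qav mode. */
static bool demo_is_qav_mode(uint32_t tqavctrl)
{
	return (tqavctrl & DEMO_TQAVCTRL_TXMODE) != 0;
}
```

The same two helpers would sit behind whatever user-facing interface is chosen (sysfs, ethtool, or a traffic-class API), which is why the review focuses on the interface rather than on the register write itself.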

[PATCH net-next 0/2] libbpf: minor fix and API update

2016-08-13 Thread Eric Leblond

Hello,

Here's a small patchset on libbpf fixing two issues I've encountered
when adding some eBPF related features to Suricata.

Patchset statistics:
 tools/lib/bpf/libbpf.c | 16 +++-
 tools/lib/bpf/libbpf.h |  4 +++-
 2 files changed, 10 insertions(+), 10 deletions(-)

BR,
--
Eric Leblond


[PATCH net-next 2/2] tools lib bpf: export function to set type

2016-08-13 Thread Eric Leblond
The current API did not allow the user to set a type such as socket
filter. To avoid adding a setter function for each type, this patch
simply exports a set function that takes the type as a parameter.

Signed-off-by: Eric Leblond 
---
 tools/lib/bpf/libbpf.c | 15 ++-
 tools/lib/bpf/libbpf.h |  3 +++
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 7872ff6..ff2a8c6 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1336,26 +1336,23 @@ int bpf_program__nth_fd(struct bpf_program *prog, int n)
return fd;
 }
 
-static void bpf_program__set_type(struct bpf_program *prog,
+int bpf_program__set_type(struct bpf_program *prog,
  enum bpf_prog_type type)
 {
+   if (!prog)
+   return -EINVAL;
prog->type = type;
+   return 0;
 }
 
 int bpf_program__set_tracepoint(struct bpf_program *prog)
 {
-   if (!prog)
-   return -EINVAL;
-   bpf_program__set_type(prog, BPF_PROG_TYPE_TRACEPOINT);
-   return 0;
+   return bpf_program__set_type(prog, BPF_PROG_TYPE_TRACEPOINT);
 }
 
 int bpf_program__set_kprobe(struct bpf_program *prog)
 {
-   if (!prog)
-   return -EINVAL;
-   bpf_program__set_type(prog, BPF_PROG_TYPE_KPROBE);
-   return 0;
+   return bpf_program__set_type(prog, BPF_PROG_TYPE_KPROBE);
 }
 
 static bool bpf_program__is_type(struct bpf_program *prog,
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index a6c5cde..6a84d7a 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -173,6 +173,9 @@ int bpf_program__set_kprobe(struct bpf_program *prog);
 bool bpf_program__is_tracepoint(struct bpf_program *prog);
 bool bpf_program__is_kprobe(struct bpf_program *prog);
 
+int bpf_program__set_type(struct bpf_program *prog,
+ enum bpf_prog_type type);
+
 /*
  * We don't need __attribute__((packed)) now since it is
  * unnecessary for 'bpf_map_def' because they are all aligned.
-- 
2.8.1
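
The refactoring in this patch can be sketched outside libbpf as follows. This is a minimal, self-contained illustration and not the libbpf code: the `demo_` types and functions are hypothetical stand-ins for `struct bpf_program`, `enum bpf_prog_type`, and the setters touched by the diff.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for enum bpf_prog_type and struct bpf_program. */
enum demo_prog_type {
	DEMO_PROG_TYPE_UNSPEC,
	DEMO_PROG_TYPE_SOCKET_FILTER,
	DEMO_PROG_TYPE_KPROBE,
	DEMO_PROG_TYPE_TRACEPOINT,
};

struct demo_program {
	enum demo_prog_type type;
};

/* Generic setter: the NULL check lives here once instead of being
 * repeated in every type-specific wrapper. */
int demo_program_set_type(struct demo_program *prog, enum demo_prog_type type)
{
	if (!prog)
		return -EINVAL;
	prog->type = type;
	return 0;
}

/* Type-specific wrappers become one-liners. */
int demo_program_set_kprobe(struct demo_program *prog)
{
	return demo_program_set_type(prog, DEMO_PROG_TYPE_KPROBE);
}

int demo_program_set_tracepoint(struct demo_program *prog)
{
	return demo_program_set_type(prog, DEMO_PROG_TYPE_TRACEPOINT);
}
```

With the generic setter exported, a caller that needs a socket filter can pass that type directly, without waiting for a dedicated wrapper to be added.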



[PATCH net-next 1/2] tools lib bpf: suppress useless include

2016-08-13 Thread Eric Leblond
The include of err.h is not explicitly needed by the exported
functions, and it caused include conflicts with some existing
code by redefining some macros.

Signed-off-by: Eric Leblond 
---
 tools/lib/bpf/libbpf.c | 1 +
 tools/lib/bpf/libbpf.h | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b699aea..7872ff6 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index dd7a513..a6c5cde 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -23,7 +23,6 @@
 
 #include 
 #include 
-#include 
 
 enum libbpf_errno {
__LIBBPF_ERRNO__START = 4000,
-- 
2.8.1



Re: [PATCH v2 2/3] VSOCK: Add vsockmon device

2016-08-13 Thread zhuyj
+#define DEFAULT_MTU (VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + sizeof(struct af_vsockmon_hdr))

It would be better to parenthesize the macro body and drop the trailing
semicolon, as above.
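
A short sketch of why the parentheses matter. The names and values below are hypothetical stand-ins (they are not the kernel definitions); the point is only that an unparenthesized `a + b` macro body changes meaning once it is used inside a larger expression.

```c
#include <stddef.h>

/* Hypothetical stand-ins for VIRTIO_VSOCK_MAX_PKT_BUF_SIZE and
 * struct af_vsockmon_hdr; the sizes are illustrative only. */
#define DEMO_PKT_BUF_SIZE 65536u

struct demo_hdr {
	unsigned char bytes[28];
};

/* Unparenthesized body: expands textually into the caller's expression. */
#define DEMO_MTU_UNSAFE DEMO_PKT_BUF_SIZE + sizeof(struct demo_hdr)
/* Parenthesized body: always evaluated as a single value. */
#define DEMO_MTU_SAFE (DEMO_PKT_BUF_SIZE + sizeof(struct demo_hdr))

/* Expands to 2 * 65536u + sizeof(struct demo_hdr): only the first term
 * is doubled. */
static size_t demo_twice_unsafe(void)
{
	return 2 * DEMO_MTU_UNSAFE;
}

/* Expands to 2 * (65536u + sizeof(struct demo_hdr)): the whole sum is
 * doubled, which is what the caller meant. */
static size_t demo_twice_safe(void)
{
	return 2 * DEMO_MTU_SAFE;
}
```

A trailing semicolon in the macro body causes a related problem: the macro can then no longer be used in the middle of an expression or as a function argument.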

On Sat, Aug 13, 2016 at 6:21 PM,   wrote:
> From: Gerard Garcia 
>
> Add vsockmon virtual network device that receives packets from the vsock
> transports and exposes them to user space.
>
> Based on the nlmon device.
>
> Signed-off-by: Gerard Garcia 
> ---
>  drivers/net/Kconfig   |   8 ++
>  drivers/net/Makefile  |   1 +
>  drivers/net/vsockmon.c| 168 
> ++
>  include/uapi/linux/Kbuild |   1 +
>  include/uapi/linux/vsockmon.h |  38 ++
>  5 files changed, 216 insertions(+)
>  create mode 100644 drivers/net/vsockmon.c
>  create mode 100644 include/uapi/linux/vsockmon.h
>
> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> index 0c5415b..42c43b6 100644
> --- a/drivers/net/Kconfig
> +++ b/drivers/net/Kconfig
> @@ -330,6 +330,14 @@ config NET_VRF
>   This option enables the support for mapping interfaces into VRF's. 
> The
>   support enables VRF devices.
>
> +config VSOCKMON
> +tristate "Virtual vsock monitoring device"
> +depends on VHOST_VSOCK
> +---help---
> + This option enables a monitoring net device for vsock sockets. It is
> + mostly intended for developers or support to debug vsock issues. If
> + unsure, say N.
> +
>  endif # NET_CORE
>
>  config SUNGEM_PHY
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index 7336cbd..e2188d4 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -28,6 +28,7 @@ obj-$(CONFIG_GENEVE) += geneve.o
>  obj-$(CONFIG_GTP) += gtp.o
>  obj-$(CONFIG_NLMON) += nlmon.o
>  obj-$(CONFIG_NET_VRF) += vrf.o
> +obj-$(CONFIG_VSOCKMON) += vsockmon.o
>
>  #
>  # Networking Drivers
> diff --git a/drivers/net/vsockmon.c b/drivers/net/vsockmon.c
> new file mode 100644
> index 000..9ad4f0a
> --- /dev/null
> +++ b/drivers/net/vsockmon.c
> @@ -0,0 +1,168 @@
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +/* Virtio transport max packet size plus header */
> +#define DEFAULT_MTU VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + sizeof(struct af_vsockmon_hdr);
> +
> +struct pcpu_lstats {
> +   u64 rx_packets;
> +   u64 rx_bytes;
> +   struct u64_stats_sync syncp;
> +};
> +
> +static int vsockmon_dev_init(struct net_device *dev)
> +{
> +   dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats);
> +   return dev->lstats == NULL ? -ENOMEM : 0;
> +}
> +
> +static void vsockmon_dev_uninit(struct net_device *dev)
> +{
> +   free_percpu(dev->lstats);
> +}
> +
> +struct vsockmon {
> +   struct vsock_tap vt;
> +};
> +
> +static int vsockmon_open(struct net_device *dev)
> +{
> +   struct vsockmon *vsockmon = netdev_priv(dev);
> +
> +   vsockmon->vt.dev = dev;
> +   vsockmon->vt.module = THIS_MODULE;
> +   return vsock_add_tap(&vsockmon->vt);
> +}
> +
> +static int vsockmon_close(struct net_device *dev) {
> +   struct vsockmon *vsockmon = netdev_priv(dev);
> +
> +   return vsock_remove_tap(&vsockmon->vt);
> +}
> +
> +static netdev_tx_t vsockmon_xmit(struct sk_buff *skb, struct net_device *dev)
> +{
> +   int len = skb->len;
> +   struct pcpu_lstats *stats = this_cpu_ptr(dev->lstats);
> +
> +   u64_stats_update_begin(&stats->syncp);
> +   stats->rx_bytes += len;
> +   stats->rx_packets++;
> +   u64_stats_update_end(&stats->syncp);
> +
> +   dev_kfree_skb(skb);
> +
> +   return NETDEV_TX_OK;
> +}
> +
> +static struct rtnl_link_stats64 *
> +vsockmon_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
> +{
> +   int i;
> +   u64 bytes = 0, packets = 0;
> +
> +   for_each_possible_cpu(i) {
> +   const struct pcpu_lstats *vstats;
> +   u64 tbytes, tpackets;
> +   unsigned int start;
> +
> +   vstats = per_cpu_ptr(dev->lstats, i);
> +
> +   do {
> +   start = u64_stats_fetch_begin_irq(&vstats->syncp);
> +   tbytes = vstats->rx_bytes;
> +   tpackets = vstats->rx_packets;
> +   } while (u64_stats_fetch_retry_irq(&vstats->syncp, start));
> +
> +   packets += tpackets;
> +   bytes += tbytes;
> +   }
> +
> +   stats->rx_packets = packets;
> +   stats->tx_packets = 0;
> +
> +   stats->rx_bytes = bytes;
> +   stats->tx_bytes = 0;
> +
> +   return stats;
> +}
> +
> +static int vsockmon_is_valid_mtu(int new_mtu)
> +{
> +   return new_mtu >= (int) sizeof(struct af_vsockmon_hdr);
> +}
> +
> +static int vsockmon_change_mtu(struct net_device *dev, int new_mtu)
> +{
> +   if (!vsockmon_is_valid_mtu(new_mtu))
> +   return -EINVAL;
> +
> +   dev->mtu = new_mtu;
> +   return 0;
> +}
> +
> +static const struct net_device_ops vsockmon_ops = {
> +   .ndo_init = vsockmon_dev_init,
> +   

Re: [Patch net v3 5/5] net_sched: convert tcf_exts from list to pointer array

2016-08-13 Thread Jamal Hadi Salim


Just a minor comment below:

On 16-08-11 08:41 PM, Cong Wang wrote:



+static inline void
+tcf_exts_to_list(const struct tcf_exts *exts, struct list_head *actions)
+{


to:
static inline void tcf_exts_to_list(const struct tcf_exts *exts,
struct list_head *actions)

cheers,
jamal



[PATCH 2/2] net: ethernet: mediatek: add the missing of_node_put() after node is used done

2016-08-13 Thread Sean Wang
This patch adds the missing of_node_put() once the node obtained from
of_parse_phandle(), or from of_node_get() in the fixed_phy case, is no
longer used.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index f5d2745..88b04dd 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -269,6 +269,8 @@ static int mtk_phy_connect(struct mtk_mac *mac)
ADVERTISED_Autoneg;
phy_start_aneg(mac->phy_dev);
 
+   of_node_put(np);
+
return 0;
 }
 
-- 
1.7.9.5
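
The leak being fixed is a reference-count imbalance: each reference taken with of_parse_phandle() or of_node_get() has to be dropped with of_node_put(). Below is a minimal userspace sketch of that pairing. The `demo_` names are hypothetical stand-ins for the device-tree API, and the error path is added for illustration only; it is not part of the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: only the reference count. */
struct demo_node {
	int refcount;
};

/* Take a reference, mirroring the role of of_node_get(). */
static struct demo_node *demo_node_get(struct demo_node *np)
{
	if (np)
		np->refcount++;
	return np;
}

/* Drop a reference, mirroring the role of of_node_put(). */
static void demo_node_put(struct demo_node *np)
{
	if (np)
		np->refcount--;
}

/* Hold a reference for the duration of the setup and release it on
 * every exit path, so the count is unchanged when the function returns. */
static int demo_phy_connect(struct demo_node *parent, int fail)
{
	struct demo_node *np = demo_node_get(parent);

	if (!np)
		return -1;
	if (fail) {
		demo_node_put(np);	/* error path */
		return -1;
	}
	demo_node_put(np);		/* success path */
	return 0;
}
```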



[PATCH 1/2] net: ethernet: mediatek: fixed that initializing u64_stats_sync is missing

2016-08-13 Thread Sean Wang
Fix a runtime warning seen when lockdep is enabled, caused by
u64_stats_sync not being initialized, by adding the missing
u64_stats_init() call.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 3a4726e..f5d2745 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1748,6 +1748,7 @@ static int mtk_add_mac(struct mtk_eth *eth, struct device_node *np)
goto free_netdev;
}
spin_lock_init(&mac->hw_stats->stats_lock);
+   u64_stats_init(&mac->hw_stats->syncp);
mac->hw_stats->reg_offset = id * MTK_STAT_OFFSET;
 
SET_NETDEV_DEV(eth->netdev[id], eth->dev);
-- 
1.7.9.5



Re: [Patch net 0/5] net_sched: tc action fixes and updates

2016-08-13 Thread Jamal Hadi Salim

On 16-08-11 08:08 PM, Cong Wang wrote:

On Thu, Aug 11, 2016 at 9:20 AM, Jamal Hadi Salim  wrote:

On 16-08-10 04:06 PM, Cong Wang wrote:


On Wed, Aug 10, 2016 at 7:34 AM, Jamal Hadi Salim wrote:


On 16-08-08 04:46 PM, Cong Wang wrote:




tcf_exts_exec() is the culprit - the conversion from flex_array
to linked list in the fast path, to be specific.



Ah, this reminds me that I don't have to use flex_array. Initially
I thought the tcf_exts could hold as many actions as it wants,
but actually there is an upper bound, TCA_ACT_MAX_PRIO.
IOW, a regular dynamic array is just enough here.



Yes, a regular array would be enough.



I just replaced the flex_array with a regular one, it works fine
for me too, at least no crash with all of my test cases.






No problem Cong - except we have a kernel that crashes right now.
BTW: I just thought of another test which uses a different code
path
# add a policer rule
sudo $TC actions add action police rate 1kbit burst 90k drop
#dump rules..
sudo $TC -s actions ls action police





I tested a lot more this time.
Good news: performance regression now resolved.
Some bad news - there's still one more oops:

sudo $TC actions add action police rate 1kbit burst 90k drop index 1

Note how I explicitly specified the index.
If I leave out the index, all works fine. I'll continue to see
if there are any other issues for the next while and will email.
I think you are close, so I will also make small comments on the
patches because you are going to make another update.

cheers,
jamal
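
The point made in the quoted exchange, that a fixed upper bound on the number of actions makes a plain array sufficient, can be sketched as below. Assumptions: `DEMO_MAX_ACTIONS` stands in for TCA_ACT_MAX_PRIO (the value used here is illustrative), and the `demo_` types are hypothetical; this is not the tcf_exts layout from the patch.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_ACTIONS 32	/* stand-in for TCA_ACT_MAX_PRIO */

/* Hypothetical stand-in for a tc action. */
struct demo_action {
	int index;
};

/* Bounded container: a counter plus a fixed-size pointer array. */
struct demo_exts {
	int nr_actions;
	struct demo_action *actions[DEMO_MAX_ACTIONS];
};

/* Append an action, refusing to go past the fixed upper bound. */
static int demo_exts_add(struct demo_exts *exts, struct demo_action *act)
{
	if (!exts || !act)
		return -EINVAL;
	if (exts->nr_actions >= DEMO_MAX_ACTIONS)
		return -ENOSPC;
	exts->actions[exts->nr_actions++] = act;
	return 0;
}

/* Fast-path style walk: a simple indexed loop over the array, with no
 * list traversal or conversion step. */
static int demo_exts_sum(const struct demo_exts *exts)
{
	int i, sum = 0;

	for (i = 0; i < exts->nr_actions; i++)
		sum += exts->actions[i]->index;
	return sum;
}
```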


[PATCH v2 2/3] VSOCK: Add vsockmon device

2016-08-13 Thread ggarcia
From: Gerard Garcia 

Add vsockmon virtual network device that receives packets from the vsock
transports and exposes them to user space.

Based on the nlmon device.

Signed-off-by: Gerard Garcia 
---
 drivers/net/Kconfig   |   8 ++
 drivers/net/Makefile  |   1 +
 drivers/net/vsockmon.c| 168 ++
 include/uapi/linux/Kbuild |   1 +
 include/uapi/linux/vsockmon.h |  38 ++
 5 files changed, 216 insertions(+)
 create mode 100644 drivers/net/vsockmon.c
 create mode 100644 include/uapi/linux/vsockmon.h

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 0c5415b..42c43b6 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -330,6 +330,14 @@ config NET_VRF
  This option enables the support for mapping interfaces into VRF's. The
  support enables VRF devices.
 
+config VSOCKMON
+tristate "Virtual vsock monitoring device"
+depends on VHOST_VSOCK
+---help---
+ This option enables a monitoring net device for vsock sockets. It is
+ mostly intended for developers or support to debug vsock issues. If
+ unsure, say N.
+
 endif # NET_CORE
 
 config SUNGEM_PHY
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 7336cbd..e2188d4 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -28,6 +28,7 @@ obj-$(CONFIG_GENEVE) += geneve.o
 obj-$(CONFIG_GTP) += gtp.o
 obj-$(CONFIG_NLMON) += nlmon.o
 obj-$(CONFIG_NET_VRF) += vrf.o
+obj-$(CONFIG_VSOCKMON) += vsockmon.o
 
 #
 # Networking Drivers
diff --git a/drivers/net/vsockmon.c b/drivers/net/vsockmon.c
new file mode 100644
index 000..9ad4f0a
--- /dev/null
+++ b/drivers/net/vsockmon.c
@@ -0,0 +1,168 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Virtio transport max packet size plus header */
+#define DEFAULT_MTU VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + sizeof(struct af_vsockmon_hdr);
+
+struct pcpu_lstats {
+   u64 rx_packets;
+   u64 rx_bytes;
+   struct u64_stats_sync syncp;
+};
+
+static int vsockmon_dev_init(struct net_device *dev)
+{
+   dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats);
+   return dev->lstats == NULL ? -ENOMEM : 0;
+}
+
+static void vsockmon_dev_uninit(struct net_device *dev)
+{
+   free_percpu(dev->lstats);
+}
+
+struct vsockmon {
+   struct vsock_tap vt;
+};
+
+static int vsockmon_open(struct net_device *dev)
+{
+   struct vsockmon *vsockmon = netdev_priv(dev);
+
+   vsockmon->vt.dev = dev;
+   vsockmon->vt.module = THIS_MODULE;
+   return vsock_add_tap(&vsockmon->vt);
+}
+
+static int vsockmon_close(struct net_device *dev) {
+   struct vsockmon *vsockmon = netdev_priv(dev);
+
+   return vsock_remove_tap(&vsockmon->vt);
+}
+
+static netdev_tx_t vsockmon_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+   int len = skb->len;
+   struct pcpu_lstats *stats = this_cpu_ptr(dev->lstats);
+
+   u64_stats_update_begin(&stats->syncp);
+   stats->rx_bytes += len;
+   stats->rx_packets++;
+   u64_stats_update_end(&stats->syncp);
+
+   dev_kfree_skb(skb);
+
+   return NETDEV_TX_OK;
+}
+
+static struct rtnl_link_stats64 *
+vsockmon_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
+{
+   int i;
+   u64 bytes = 0, packets = 0;
+
+   for_each_possible_cpu(i) {
+   const struct pcpu_lstats *vstats;
+   u64 tbytes, tpackets;
+   unsigned int start;
+
+   vstats = per_cpu_ptr(dev->lstats, i);
+
+   do {
+   start = u64_stats_fetch_begin_irq(&vstats->syncp);
+   tbytes = vstats->rx_bytes;
+   tpackets = vstats->rx_packets;
+   } while (u64_stats_fetch_retry_irq(&vstats->syncp, start));
+
+   packets += tpackets;
+   bytes += tbytes;
+   }
+
+   stats->rx_packets = packets;
+   stats->tx_packets = 0;
+
+   stats->rx_bytes = bytes;
+   stats->tx_bytes = 0;
+
+   return stats;
+}
+
+static int vsockmon_is_valid_mtu(int new_mtu)
+{
+   return new_mtu >= (int) sizeof(struct af_vsockmon_hdr);
+}
+
+static int vsockmon_change_mtu(struct net_device *dev, int new_mtu)
+{
+   if (!vsockmon_is_valid_mtu(new_mtu))
+   return -EINVAL;
+
+   dev->mtu = new_mtu;
+   return 0;
+}
+
+static const struct net_device_ops vsockmon_ops = {
+   .ndo_init = vsockmon_dev_init,
+   .ndo_uninit = vsockmon_dev_uninit,
+   .ndo_open = vsockmon_open,
+   .ndo_stop = vsockmon_close,
+   .ndo_start_xmit = vsockmon_xmit,
+   .ndo_get_stats64 = vsockmon_get_stats64,
+   .ndo_change_mtu = vsockmon_change_mtu,
+};
+
+static u32 always_on(struct net_device *dev)
+{
+   return 1;
+}
+
+static const struct ethtool_ops vsockmon_ethtool_ops = {
+   .get_link = always_on,
+};
+
+static void vsockmon_setup(struct net_device *dev)
+{
+   dev->type = 

[PATCH v2 1/3] VSOCK: Add vsockmon tap functions

2016-08-13 Thread ggarcia
From: Gerard Garcia 

Add tap functions that can be used by the vsock transports to
deliver packets to vsockmon virtual network devices.

Signed-off-by: Gerard Garcia 
---
 include/net/af_vsock.h   |  13 +
 include/uapi/linux/if_arp.h  |   1 +
 net/vmw_vsock/Makefile   |   2 +-
 net/vmw_vsock/af_vsock_tap.c | 113 +++
 4 files changed, 128 insertions(+), 1 deletion(-)
 create mode 100644 net/vmw_vsock/af_vsock_tap.c

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index f275896..f7c51b1 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -185,4 +185,17 @@ struct sock *vsock_find_connected_socket(struct sockaddr_vm *src,
 void vsock_remove_sock(struct vsock_sock *vsk);
 void vsock_for_each_connected_socket(void (*fn)(struct sock *sk));
 
+/ TAP /
+
+struct vsock_tap {
+   struct net_device *dev;
+   struct module *module;
+   struct list_head list;
+};
+
+int vsock_init_tap(void);
+int vsock_add_tap(struct vsock_tap *vt);
+int vsock_remove_tap(struct vsock_tap *vt);
+void vsock_deliver_tap(struct sk_buff *skb);
+
 #endif /* __AF_VSOCK_H__ */
diff --git a/include/uapi/linux/if_arp.h b/include/uapi/linux/if_arp.h
index 4d024d7..cf73510 100644
--- a/include/uapi/linux/if_arp.h
+++ b/include/uapi/linux/if_arp.h
@@ -95,6 +95,7 @@
 #define ARPHRD_IP6GRE  823 /* GRE over IPv6*/
 #define ARPHRD_NETLINK 824 /* Netlink header   */
 #define ARPHRD_6LOWPAN 825 /* IPv6 over LoWPAN */
+#define ARPHRD_VSOCKMON 826 /* Vsock monitor header */
 
 #define ARPHRD_VOID  0xFFFF  /* Void type, nothing is known */
 #define ARPHRD_NONE  0xFFFE/* zero header length */
diff --git a/net/vmw_vsock/Makefile b/net/vmw_vsock/Makefile
index bc27c70..09fc2eb 100644
--- a/net/vmw_vsock/Makefile
+++ b/net/vmw_vsock/Makefile
@@ -3,7 +3,7 @@ obj-$(CONFIG_VMWARE_VMCI_VSOCKETS) += vmw_vsock_vmci_transport.o
 obj-$(CONFIG_VIRTIO_VSOCKETS) += vmw_vsock_virtio_transport.o
 obj-$(CONFIG_VIRTIO_VSOCKETS_COMMON) += vmw_vsock_virtio_transport_common.o
 
-vsock-y += af_vsock.o vsock_addr.o
+vsock-y += af_vsock.o af_vsock_tap.o vsock_addr.o
 
 vmw_vsock_vmci_transport-y += vmci_transport.o vmci_transport_notify.o \
vmci_transport_notify_qstate.o
diff --git a/net/vmw_vsock/af_vsock_tap.c b/net/vmw_vsock/af_vsock_tap.c
new file mode 100644
index 000..ff242b1
--- /dev/null
+++ b/net/vmw_vsock/af_vsock_tap.c
@@ -0,0 +1,113 @@
+/*
+ * Tap functions for AF_VSOCK sockets.
+ *
+ * Code based on net/netlink/af_netlink.c tap functions.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+
+static DEFINE_SPINLOCK(vsock_tap_lock);
+static struct list_head vsock_tap_all __read_mostly =
+   LIST_HEAD_INIT(vsock_tap_all);
+
+int vsock_add_tap(struct vsock_tap *vt) {
+   if (unlikely(vt->dev->type != ARPHRD_VSOCKMON))
+   return -EINVAL;
+
+   __module_get(vt->module);
+
+   spin_lock(&vsock_tap_lock);
+   list_add_rcu(&vt->list, &vsock_tap_all);
+   spin_unlock(&vsock_tap_lock);
+
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(vsock_add_tap);
+
+int __vsock_remove_tap(struct vsock_tap *vt) {
+   bool found = false;
+   struct vsock_tap *tmp;
+
+   spin_lock(&vsock_tap_lock);
+
+   list_for_each_entry(tmp, &vsock_tap_all, list) {
+   if (vt == tmp) {
+   list_del_rcu(&vt->list);
+   found = true;
+   goto out;
+   }
+   }
+
+   pr_warn("__vsock_remove_tap: %p not found\n", vt);
+out:
+   spin_unlock(&vsock_tap_lock);
+
+   if (found)
+   module_put(vt->module);
+
+   return found ? 0 : -ENODEV;
+}
+
+int vsock_remove_tap(struct vsock_tap *vt)
+{
+   int ret;
+
+   ret = __vsock_remove_tap(vt);
+   synchronize_net();
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(vsock_remove_tap);
+
+static int __vsock_deliver_tap_skb(struct sk_buff *skb,
+struct net_device *dev)
+{
+   int ret = 0;
+   struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
+
+   if (nskb) {
+   dev_hold(dev);
+
+   nskb->dev = dev;
+   ret = dev_queue_xmit(nskb);
+   if (unlikely(ret > 0))
+   ret = net_xmit_errno(ret);
+
+   dev_put(dev);
+   }
+
+   return ret;
+}
+
+static void __vsock_deliver_tap(struct sk_buff *skb)
+{
+   int ret;
+   struct vsock_tap *tmp;
+
+   list_for_each_entry_rcu(tmp, &vsock_tap_all, list) {
+   ret = __vsock_deliver_tap_skb(skb, tmp->dev);
+   

[PATCH v2 3/3] VSOCK: Add virtio vsock vsockmon hooks

2016-08-13 Thread ggarcia
From: Gerard Garcia 

Add hooks to the virtio transport host driver to deliver a copy of
the received and sent messages to all vsockmon virtual network devices.

Signed-off-by: Gerard Garcia 
---
 drivers/vhost/vsock.c | 72 +++
 1 file changed, 72 insertions(+)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index e3b30ea..4670c3c 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -15,8 +15,10 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
+#include 
 #include "vhost.h"
 
 #define VHOST_VSOCK_DEFAULT_HOST_CID   2
@@ -45,6 +47,68 @@ struct vhost_vsock {
u32 guest_cid;
 };
 
+static struct sk_buff *
+virtio_vsock_pkt_to_vsockmon_skb(struct virtio_vsock_pkt *pkt)
+{
+   struct sk_buff *skb;
+   struct af_vsockmon_hdr *hdr;
+   unsigned char *t_hdr, *payload;
+
+   u32 skb_len = sizeof(*hdr) + sizeof(pkt->hdr) +
+   pkt->len;
+
+   skb = alloc_skb(skb_len, GFP_ATOMIC);
+   if (!skb)
+   return NULL;
+
+   hdr = (struct af_vsockmon_hdr *) skb_put(skb, sizeof(*hdr));
+
+   hdr->src_cid = pkt->hdr.src_cid;
+   hdr->src_port = pkt->hdr.src_port;
+   hdr->dst_cid = pkt->hdr.dst_cid;
+   hdr->dst_port = pkt->hdr.dst_port;
+   hdr->t = cpu_to_le16(AF_VSOCK_T_VIRTIO);
+   hdr->len = cpu_to_le16(sizeof(pkt->hdr));
+
+   switch (le16_to_cpu(pkt->hdr.op)) {
+   case VIRTIO_VSOCK_OP_REQUEST:
+   case VIRTIO_VSOCK_OP_RESPONSE:
+   hdr->op = cpu_to_le16(AF_VSOCK_OP_CONNECT);
+   break;
+   case VIRTIO_VSOCK_OP_RST:
+   case VIRTIO_VSOCK_OP_SHUTDOWN:
+   hdr->op = cpu_to_le16(AF_VSOCK_OP_DISCONNECT);
+   break;
+   case VIRTIO_VSOCK_OP_RW:
+   hdr->op = cpu_to_le16(AF_VSOCK_OP_PAYLOAD);
+   break;
+   case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
+   case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
+   hdr->op = cpu_to_le16(AF_VSOCK_OP_CONTROL);
+   break;
+   default:
+   hdr->op = cpu_to_le16(AF_VSOCK_OP_UNKNOWN);
+   break;
+   }
+
+   t_hdr = skb_put(skb, sizeof(pkt->hdr));
+   memcpy(t_hdr, &pkt->hdr, sizeof(pkt->hdr));
+
+   if (pkt->len) {
+   payload = skb_put(skb, pkt->len);
+   memcpy(payload, pkt->buf, pkt->len);
+   }
+
+   return skb;
+}
+
+static void vsock_deliver_tap_pkt(struct virtio_vsock_pkt *pkt)
+{
+   struct sk_buff *skb = virtio_vsock_pkt_to_vsockmon_skb(pkt);
+   if (skb)
+   vsock_deliver_tap(skb);
+}
+
 static u32 vhost_transport_get_local_cid(void)
 {
return VHOST_VSOCK_DEFAULT_HOST_CID;
@@ -168,6 +232,11 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
restart_tx = true;
}
 
+   /* Deliver to monitoring devices all correctly transmitted
+* packets.
+*/
+   vsock_deliver_tap_pkt(pkt);
+
virtio_transport_free_pkt(pkt);
}
if (added)
@@ -338,6 +407,9 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
 
len = pkt->len;
 
+   /* Deliver to monitoring devices all received packets */
+   vsock_deliver_tap_pkt(pkt);
+
/* Only accept correctly addressed packets */
if (le64_to_cpu(pkt->hdr.src_cid) == vsock->guest_cid)
virtio_transport_recv_pkt(pkt);
-- 
2.9.1
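
The switch statement in virtio_vsock_pkt_to_vsockmon_skb() reduces the transport-specific operation to a small generic set, so that a consumer can follow a stream without knowing the transport. Here is a standalone sketch of that mapping. The `DEMO_` enumerators are hypothetical and their numeric values are illustrative; they are not taken from the virtio or vsockmon headers.

```c
/* Hypothetical transport-level operations (values are illustrative). */
enum demo_virtio_op {
	DEMO_VIRTIO_OP_REQUEST = 1,
	DEMO_VIRTIO_OP_RESPONSE,
	DEMO_VIRTIO_OP_RST,
	DEMO_VIRTIO_OP_SHUTDOWN,
	DEMO_VIRTIO_OP_RW,
	DEMO_VIRTIO_OP_CREDIT_UPDATE,
	DEMO_VIRTIO_OP_CREDIT_REQUEST,
};

/* Hypothetical generic operations exposed in the monitor header. */
enum demo_generic_op {
	DEMO_OP_UNKNOWN,
	DEMO_OP_CONNECT,
	DEMO_OP_DISCONNECT,
	DEMO_OP_CONTROL,
	DEMO_OP_PAYLOAD,
};

/* Collapse the transport operation into its generic category; anything
 * unrecognized is reported as unknown rather than dropped. */
static enum demo_generic_op demo_map_op(unsigned int virtio_op)
{
	switch (virtio_op) {
	case DEMO_VIRTIO_OP_REQUEST:
	case DEMO_VIRTIO_OP_RESPONSE:
		return DEMO_OP_CONNECT;
	case DEMO_VIRTIO_OP_RST:
	case DEMO_VIRTIO_OP_SHUTDOWN:
		return DEMO_OP_DISCONNECT;
	case DEMO_VIRTIO_OP_RW:
		return DEMO_OP_PAYLOAD;
	case DEMO_VIRTIO_OP_CREDIT_UPDATE:
	case DEMO_VIRTIO_OP_CREDIT_REQUEST:
		return DEMO_OP_CONTROL;
	default:
		return DEMO_OP_UNKNOWN;
	}
}
```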



[PATCH v2 0/3] VSOCK: vsockmon virtual device to monitor AF_VSOCK sockets.

2016-08-13 Thread ggarcia
From: Gerard Garcia 

This patch applies over the mst vhost git repository:
http://git.kernel.org/cgit/linux/kernel/git/mst/vhost.git

v2:
 * Clone skb before transmitting them to vsockmon.
 * Use consume_skb() instead of kfree_skb().
 * Pass skb lifetime responsibility to tap functions.
 * Remove t_hdr member from vsockmon header to avoid problems when/if it
   changes its size if more transports are supported.

This has already been sent as an RFC, where several issues were fixed.
This is the summary of changes from the first RFC:

v2:
 * Do not clone skb, instead take ownership before transmitting.
 * Split tap functions from af_vsock.c.
 * Simplify vsockmon header to remove unnecessary padding and
set little endian byte order.
 * Various simple fixes from the comments received to the first RFC.

Additionally, first patch version changes:
 * Add len field to the vsockmon header to ease parsing.
 * Pack vsockmon header.
 * Various simple fixes and styling.

Overview:

Virtual socket transports operate at kernel level; therefore, there is no easy
way to see the traffic exchanged between virtual machines and hypervisors that
communicate using AF_VSOCK sockets. In addition, being able to see the control
messages exchanged by the transports may be useful for debugging and
optimization purposes. This patch adds a virtual device that may be used to see
the traffic exchanged between virtual machines and hypervisors through AF_VSOCK
sockets.

Its structure is based on the nlmon device and this version just targets the
virtio transport, but support for the VMCI transport can be easily implemented.
The vsockmon header contains a generic header and includes the header specific
to the transport. The generic header makes it possible to follow an AF_VSOCK
stream without having to dig into the details of the transport, while the
transport header gives more detail, which may be useful for troubleshooting
and debugging.
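
As a rough illustration of the generic-plus-transport layout described above,
here is a minimal sketch of how a consumer might skip both headers to reach the
payload. The struct, its field names and widths, and the helper names are all
hypothetical; they are not taken from include/uapi/linux/vsockmon.h in this
series. The only properties borrowed from the cover letter are that the header
is packed, little endian, and carries a len field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical generic header, for illustration only: the field names and
 * widths are assumptions, not the layout from the patch. It is packed and
 * little endian, and it carries the length of the transport header that
 * follows it.
 */
struct example_mon_hdr {
	uint64_t src_cid;
	uint64_t dst_cid;
	uint32_t src_port;
	uint32_t dst_port;
	uint16_t op;
	uint16_t len;	/* bytes of transport-specific header that follow */
} __attribute__((packed));

/* Read a little-endian 16-bit value regardless of host byte order. */
static uint16_t example_get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Return the offset of the payload inside a captured frame, skipping the
 * generic header and the transport header, or -1 if the frame is too short.
 */
static long example_payload_offset(const uint8_t *frame, size_t frame_len)
{
	size_t generic = sizeof(struct example_mon_hdr);
	uint16_t t_len;

	assert(frame != NULL);
	if (frame_len < generic)
		return -1;

	t_len = example_get_le16(frame + offsetof(struct example_mon_hdr, len));
	if (frame_len - generic < t_len)
		return -1;

	return (long)(generic + t_len);
}
```

A parser that only cares about the stream can stop after the generic header,
while a debugging tool can use the length to locate and decode the transport
header as well.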

Testing:

To set up a vsockmon device:

ip link add type vsockmon
ip link set vsockmon0 up

The Wireshark development version (master branch) includes a vsock dissector
that is capable of parsing packets received through vsockmon. The dissector
needs to be manually selected.

Thanks to Stefan Hajnoczi for his help.

Gerard Garcia (3):
  VSOCK: Add vsockmon tap functions
  VSOCK: Add vsockmon device
  VSOCK: Add virtio vsock vsockmon hooks

 drivers/net/Kconfig   |   8 ++
 drivers/net/Makefile  |   1 +
 drivers/net/vsockmon.c| 168 ++
 drivers/vhost/vsock.c |  72 ++
 include/net/af_vsock.h|  13 
 include/uapi/linux/Kbuild |   1 +
 include/uapi/linux/if_arp.h   |   1 +
 include/uapi/linux/vsockmon.h |  38 ++
 net/vmw_vsock/Makefile|   2 +-
 net/vmw_vsock/af_vsock_tap.c  | 113 
 10 files changed, 416 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/vsockmon.c
 create mode 100644 include/uapi/linux/vsockmon.h
 create mode 100644 net/vmw_vsock/af_vsock_tap.c

-- 
2.9.1



[PATCH] net: macb: add phy-handle support for the macb

2016-08-13 Thread Kedareswara rao Appana
This patch adds support for the 'phy-handle' binding which allows for a
system to specifically select a phy which can be attached via any MDIO
bus available in the system.

Signed-off-by: Kedareswara rao Appana 
---
 Documentation/devicetree/bindings/net/macb.txt |  3 +++
 drivers/net/ethernet/cadence/macb.c| 11 +--
 drivers/net/ethernet/cadence/macb.h|  1 +
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/macb.txt b/Documentation/devicetree/bindings/net/macb.txt
index b5a42df..3cfff7c 100644
--- a/Documentation/devicetree/bindings/net/macb.txt
+++ b/Documentation/devicetree/bindings/net/macb.txt
@@ -23,6 +23,9 @@ Required properties:
Optional elements: 'tx_clk'
 - clocks: Phandles to input clocks.
 
+Optional properties:
+- phy-handle   : See ethernet.txt file in the same directory.
+
 Optional properties for PHY child node:
 - reset-gpios : Should specify the gpio for phy reset
 - magic-packet : If present, indicates that the hardware supports waking
diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 6b797e3..4ddd45d 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -421,7 +421,7 @@ static int macb_mii_probe(struct net_device *dev)
 static int macb_mii_init(struct macb *bp)
 {
struct macb_platform_data *pdata;
-   struct device_node *np;
+   struct device_node *np, *np1;
int err = -ENXIO, i;
 
/* Enable management port */
@@ -445,7 +445,13 @@ static int macb_mii_init(struct macb *bp)
dev_set_drvdata(&bp->dev->dev, bp->mii_bus);
 
np = bp->pdev->dev.of_node;
-   if (np) {
+   np1 = of_get_parent(bp->phy_node);
+   if (np1) {
+   of_node_put(np1);
+   err = of_mdiobus_register(bp->mii_bus, np1);
+   if (err)
+   goto err_out_unregister_bus;
+   } else if (np) {
/* try dt phy registration */
err = of_mdiobus_register(bp->mii_bus, np);
 
@@ -3016,6 +3022,7 @@ static int macb_probe(struct platform_device *pdev)
} else {
bp->phy_interface = err;
}
+   bp->phy_node = of_parse_phandle(np, "phy-handle", 0);
 
/* IP specific init */
err = init(pdev);
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index aa3aeec..83d1617 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -855,6 +855,7 @@ struct macb {
unsigned intjumbo_max_len;
 
u32 wol;
+   struct device_node  *phy_node;
 };
 
 static inline bool macb_is_gem(struct macb *bp)
-- 
2.1.2



Re: [PATCH v2 1/1] VSOCK: remove more space available check filling TXvq

2016-08-13 Thread Gerard Garcia

On 08/13/2016 02:31 AM, David Miller wrote:
> From: ggar...@abra.uab.cat
> Date: Wed, 10 Aug 2016 17:24:34 +0200
>
>> From: Gerard Garcia
>>
>> Remove unnecessary use of enable/disable callback notifications
>> and the incorrect more space available check.
>>
>> The virtio_transport_tx_work handles when the TX virtqueue
>> has more buffers available.
>>
>> Signed-off-by: Gerard Garcia
>> Acked-by: Stefan Hajnoczi
>
> This does not apply cleanly to the current net GIT tree.

I'm sorry, I should have said that it applies over the mst vhost tree.


Re: [PATCH net] sctp: fix a success return may hide an error

2016-08-13 Thread Xin Long
>
> This style of error handling is dangerous.  The first error can be
> lost.
>
> For example, if sctp_outq_flush_rtx() earlier in this function returns
> an error, it will be lost if any invocation of the function
> sctp_packet_transmit() at the end function signals an error.
>
> I think you should always preserve the first error that is recorded
> into 'error'.
>
> I also wonder about why sctp_outq_flush_rtx() errors are completely
> ignored and don't influence the control flow here in any way.

Yes, the first error can be lost.
Here we just keep the last error. We don't really have to return the
first error or return it on the first failure.

[1]
Both sctp_outq_flush_rtx and sctp_packet_transmit can ONLY
return one error (-ENOMEM), as sctp_outq_flush_rtx also calls
sctp_packet_transmit.

[2]
The original code does not return immediately when
sctp_outq_flush_rtx returns an error. I guess it just doesn't want
to stop flushing out transport_list only because it fails to flush
rtx.
Even sctp_packet_transmit_chunk in sctp_outq_flush just puts the
error into sk->sk_err instead of returning immediately.

So we cannot return the error at the first failure, as per [2], and the
error here is always -ENOMEM, as per [1].
I think returning the last error here is ok, or at least not dangerous,
and it also fixes the issue "a success return may hide an error" with
clear code. :)
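
To make the three behaviours in this exchange concrete, here is a minimal,
self-contained sketch. It is not the sctp code: the helper names and the idea
of feeding in an array of step results are assumptions made purely for
illustration. It contrasts a plain overwrite (where a later success can hide
an earlier error), keeping the last error, and keeping the first error.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Overwrite pattern: every result replaces the previous one, so a later
 * success (0) hides an earlier failure.
 */
static int example_status_overwrite(const int *results, size_t n)
{
	int error = 0;
	size_t i;

	for (i = 0; i < n; i++)
		error = results[i];
	return error;
}

/* Keep the last failure: only non-zero results are recorded. */
static int example_status_keep_last(const int *results, size_t n)
{
	int error = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (results[i])
			error = results[i];
	return error;
}

/* Keep the first failure: later failures never replace an earlier one. */
static int example_status_keep_first(const int *results, size_t n)
{
	int error = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (results[i] && !error)
			error = results[i];
	return error;
}
```

All three variants run every step without stopping early. When every failing
step can only report -ENOMEM, as argued in [1], keeping the first and keeping
the last error return the same value; they differ only if distinct error codes
are possible.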


Re: [PATCH net] bpf: fix bpf_skb_in_cgroup helper naming

2016-08-13 Thread Martin KaFai Lau
On Fri, Aug 12, 2016 at 10:17:17PM +0200, Daniel Borkmann wrote:
> While hashing out BPF's current_task_under_cgroup helper bits, it came
> to discussion that the skb_in_cgroup helper name was suboptimally chosen.
>
> Tejun says:
>
>   So, I think in_cgroup should mean that the object is in that
>   particular cgroup while under_cgroup in the subhierarchy of that
>   cgroup. Let's rename the other subhierarchy test to under too. I
>   think that'd be a lot less confusing going forward.
>
>   [...]
>
>   It's more intuitive and gives us the room to implement the real
>   "in" test if ever necessary in the future.
>
> Since this touches uapi bits, we need to change this as long as v4.8
> is not yet officially released. Thus, change the helper enum and rename
> related bits.
Thanks for working on this, and a belated

Acked-by: Martin KaFai Lau
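
The naming distinction quoted above ("in" meaning exactly that cgroup, "under"
meaning anywhere in its subhierarchy) can be sketched as follows. This is a
hypothetical stand-alone model, not the kernel's struct cgroup or the BPF
helper itself, and it assumes that "under" also matches the group itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hierarchy node; not the kernel's struct cgroup. */
struct example_group {
	const struct example_group *parent;	/* NULL for the root */
};

/* "in": the object belongs to exactly this group. */
static bool example_in_group(const struct example_group *obj_grp,
			     const struct example_group *grp)
{
	return obj_grp == grp;
}

/*
 * "under": the object's group is grp itself or any descendant of it
 * (treating the group itself as part of its subhierarchy is an assumption).
 */
static bool example_under_group(const struct example_group *obj_grp,
				const struct example_group *grp)
{
	for (; obj_grp; obj_grp = obj_grp->parent)
		if (obj_grp == grp)
			return true;
	return false;
}
```

With a root, a child and a leaf chained by parent pointers, the leaf is "in"
only the leaf group but "under" the leaf, the child and the root.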