On 2018-04-10 03:19, Bruce Ashfield wrote:
> On 2018-04-05 3:08 PM, Bruce Ashfield wrote:
>> On 2018-03-28 11:25 PM, He Zhe wrote:
>>> This is a net statistics collector for atop from:
>>> http://atoptool.nl/netatop.php
>>> https://atoptool.nl/download/netatop-2.0.tar.gz
>>>
>>> Two changes:
>>> Adjust paths for netatop.h and netatopversion.h
>>> Add module_init and module_exit
>>
>> Which kernel versions have you tested this on? Is there a way we could get
>> a README that describes how to build and test it?
>
> Also, is there a recipe that builds this that I can see/test?
Sorry for slow reply, I'm on vacation and will handle this asap. Is this expected to be built as a module via a recipe? Or as a patch/feature in yocto-kernel-cache? Zhe > > Bruce > >> >> Bruce >> >>> >>> Signed-off-by: He Zhe <[email protected]> >>> --- >>> drivers/staging/netatop/netatop.c | 1843 >>> ++++++++++++++++++++++++++++++ >>> drivers/staging/netatop/netatop.h | 47 + >>> drivers/staging/netatop/netatopversion.h | 2 + >>> 3 files changed, 1892 insertions(+) >>> create mode 100644 drivers/staging/netatop/netatop.c >>> create mode 100644 drivers/staging/netatop/netatop.h >>> create mode 100644 drivers/staging/netatop/netatopversion.h >>> >>> diff --git a/drivers/staging/netatop/netatop.c >>> b/drivers/staging/netatop/netatop.c >>> new file mode 100644 >>> index 000000000000..3baa6520416a >>> --- /dev/null >>> +++ b/drivers/staging/netatop/netatop.c >>> @@ -0,0 +1,1843 @@ >>> +/* >>> +** This module uses the netfilter interface to maintain statistics >>> +** about the network traffic per task, on level of thread group >>> +** and individual thread. >>> +** >>> +** General setup >>> +** ------------- >>> +** Once the module is active, it is called for every packet that is >>> +** transmitted by a local process and every packet that is received >>> +** from an interface. Not only the packets that contain the user data >>> +** are passed but also the TCP related protocol packets (SYN, ACK, ...). >>> +** >>> +** When the module discovers a packet for a connection (TCP) or local >>> +** port (UDP) that is new, it creates a sockinfo structure. As soon as >>> +** possible the sockinfo struct will be connected to a taskinfo struct >>> +** that represents the proces or thread that is related to the socket. >>> +** However, the task can only be determined when a packet is transmitted, >>> +** i.e. the module is called during system call handling in the context >>> +** of the transmitting process. At that moment the tgid (process) and >>> +** pid (thread) can be obtained from the process administration to >>> +** be stored in the module's own taskinfo structs (one for the process, >>> +** one for the thread). >>> +** For the time that the sockinfo struct can not be related to a taskinfo >>> +** struct (e.g. when only packets are received), counters are maintained >>> +** temporarily in the sockinfo struct. As soon as a related taskinfo struct >>> +** is discovered when the task transmits, counters will be maintained in >>> +** the taskinfo struct itself. >>> +** When packets are only received for a socket (e.g. another machine is >>> +** sending UDP packets to the local machine) while the local task >>> +** never responds, no match to a process can be made and the packets >>> +** remain unidentified by the netatop module. At least one packet should >>> +** have been sent by a local process to be able to match packets for such >>> +** socket. >>> +** In the file /proc/netatop counters can be found that show the total >>> +** number of packets sent/received and how many of these packets were >>> +** unidentified (i.e. not accounted to a process/thread). >>> +** >>> +** Garbage collection >>> +** ------------------ >>> +** The module uses a garbage collector to cleanup the unused sockinfo >>> +** structs if connections do not exist any more (TCP) or have not been >>> +** used for some time (TCP/UDP). >>> +** Furthermore, the garbage collector checks if the taskinfo structs >>> +** still represent existing processes or threads. 
If not, the taskinfo >>> struct >>> +** is destroyed (in case of a thread) or it is moved to a separate list of >>> +** finished processes (in case of a process). Analysis programs can read >>> +** the taskinfo of such finished process. When the taskinfo of a finished >>> +** process is not read within 15 seconds, the taskinfo will be destroyed. >>> +** >>> +** A garbage collector cycle can be triggered by issueing a getsockopt >>> +** call from an analysis program (e.g. atop). Apart from that, a time-based >>> +** garbage collector cycle is issued anyhow every 15 seconds by the >>> +** knetatop kernel thread. >>> +** >>> +** Interface with user mode >>> +** ------------------------ >>> +** Programs can open an IP socket and use the getsockopt() system call >>> +** to issue commands to this module. With the command ATOP_GETCNT_TGID >>> +** the current counters can be obtained on process level (thread group) >>> +** and with the command ATOP_GETCNT_PID the counters on thread level. >>> +** For both commands, the tgid/pid has to be passed of the required thread >>> +** (group). When the required thread (group) does not exist, an errno ESRCH >>> +** is given. >>> +** >>> +** The command ATOP_GETCNT_EXIT can be issued to obtain the counters of >>> +** an exited process. As stated above, such command has to be issued >>> +** within 15 seconds after a process has been declared 'finished' by >>> +** the garbage collector. Whenever this command is issued and no exited >>> +** process is in the exitlist, the requesting process is blocked until >>> +** an exited process is available. >>> +** >>> +** The command NETATOP_FORCE_GC activates the garbage collector of the >>> +** netatop module to determine if sockinfo's of old connections/ports >>> +** can be destroyed and if taskinfo's of exited processes can be >>> +** The command NETATOP_EMPTY_EXIT can be issued to wait until the exitlist >>> +** with the taskinfo's of exited processes is empty. >>> +** ---------------------------------------------------------------------- >>> +** Copyright (C) 2012 Gerlof Langeveld ([email protected]) >>> +** >>> +** This program is free software; you can redistribute it and/or modify >>> +** it under the terms of the GNU General Public License version 2 as >>> +** published by the Free Software Foundation. 
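(Illustration, not part of the patch: the user-mode interface described above
can be exercised with a small getsockopt() client along these lines. This is
a sketch only; it assumes an ordinary IPv4 socket with SOL_IP as the level,
which is the usual way nf_register_sockopt() handlers are reached, and it
needs CAP_NET_ADMIN. NETATOP_GETCNT_TGID and struct netpertask come from the
netatop.h added further down in this patch.)

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #include "netatop.h"

    int main(int argc, char *argv[])
    {
            struct netpertask npt;
            socklen_t len = sizeof npt;
            int sock;

            if (argc != 2) {
                    fprintf(stderr, "usage: %s <tgid>\n", argv[0]);
                    return 1;
            }

            /* any IPv4 socket will do; the module answers getsockopt() at IP level */
            sock = socket(PF_INET, SOCK_DGRAM, 0);
            if (sock == -1) {
                    perror("socket");
                    return 1;
            }

            memset(&npt, 0, sizeof npt);
            npt.id = (pid_t)atoi(argv[1]);  /* thread group (process) to query */

            if (getsockopt(sock, SOL_IP, NETATOP_GETCNT_TGID, &npt, &len) == -1) {
                    perror("getsockopt");   /* ESRCH: tgid not (yet) known to netatop */
                    close(sock);
                    return 1;
            }

            printf("%s: tcp snd/rcv %llu/%llu bytes, udp snd/rcv %llu/%llu bytes\n",
                   npt.command,
                   npt.tc.tcpsndbytes, npt.tc.tcprcvbytes,
                   npt.tc.udpsndbytes, npt.tc.udprcvbytes);

            close(sock);
            return 0;
    }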
>>> +*/ >>> +#include <linux/init.h> >>> +#include <linux/kernel.h> >>> +#include <linux/module.h> >>> +#include <linux/netfilter.h> >>> +#include <linux/netfilter_ipv4.h> >>> +#include <linux/sched.h> >>> +#include <linux/skbuff.h> >>> +#include <linux/types.h> >>> +#include <linux/file.h> >>> +#include <linux/kthread.h> >>> +#include <linux/seq_file.h> >>> +#include <linux/version.h> >>> +#include <net/sock.h> >>> +#include <linux/ip.h> >>> +#include <linux/tcp.h> >>> +#include <linux/udp.h> >>> +#include <linux/proc_fs.h> >>> +#include <net/ip.h> >>> +#include <net/tcp.h> >>> +#include <net/udp.h> >>> + >>> +#include "netatop.h" >>> +#include "netatopversion.h" >>> + >>> +MODULE_LICENSE("GPL"); >>> +MODULE_AUTHOR("Gerlof Langeveld <[email protected]>"); >>> +MODULE_DESCRIPTION("Per-task network statistics"); >>> +MODULE_VERSION(NETATOPVERSION); >>> + >>> +#define GCINTERVAL (HZ*15) // interval garbage collector >>> (jiffies) >>> +#define GCMAXUDP (HZ*16) // max inactivity for UDP >>> (jiffies) >>> +#define GCMAXTCP (HZ*1800) // max inactivity for TCP (jiffies) >>> +#define GCMAXUNREF (HZ*60) // max time without taskref >>> (jiffies) >>> + >>> +#define SILIMIT (4096*1024) // maximum memory for sockinfo >>> structs >>> +#define TILIMIT (2048*1024) // maximum memory for taskinfo >>> structs >>> + >>> +#define NF_IP_PRE_ROUTING 0 >>> +#define NF_IP_LOCAL_IN 1 >>> +#define NF_IP_FORWARD 2 >>> +#define NF_IP_LOCAL_OUT 3 >>> +#define NF_IP_POST_ROUTING 4 >>> + >>> +/* >>> +** struct that maintains statistics about the network >>> +** traffic caused per thread or thread group >>> +*/ >>> +struct chainer { >>> + void *next; >>> + void *prev; >>> +}; >>> + >>> +struct taskinfobucket; >>> + >>> +struct taskinfo { >>> + struct chainer ch; >>> + >>> + pid_t id; // tgid or pid >>> + char type; // 'g' (thread group) or >>> + // 't' (thread) >>> + unsigned char state; // see below >>> + char command[COMLEN]; >>> + unsigned long btime; // start time of process >>> + unsigned long long exittime; // time inserted in exitlist >>> + >>> + struct taskcount tc; >>> +}; >>> + >>> +static struct kmem_cache *ticache; // taskinfo cache >>> + >>> +// state values above >>> +#define CHECKED 1 // verified that task still exists >>> +#define INDELETE 2 // task exited but still in hash list >>> +#define FINISHED 3 // task on exit list >>> + >>> +/* >>> +** hash tables to find a particular thread group or thread >>> +*/ >>> +#define TBUCKS 1024 // must be multiple of 2! 
>>> +#define THASH(x, t) (((x)+t)&(TBUCKS-1)) >>> + >>> +struct taskinfobucket { >>> + struct chainer ch; >>> + spinlock_t lock; >>> +} thash[TBUCKS]; >>> + >>> +static unsigned long nrt; // current number of taskinfo allocated >>> +static unsigned long nrt_ovf; // no taskinfo allocated due to overflow >>> +static DEFINE_SPINLOCK(nrtlock); >>> + >>> + >>> +static struct taskinfo *exithead; // linked list of exited processes >>> +static struct taskinfo *exittail; >>> +static DEFINE_SPINLOCK(exitlock); >>> + >>> +static DECLARE_WAIT_QUEUE_HEAD(exitlist_filled); >>> +static DECLARE_WAIT_QUEUE_HEAD(exitlist_empty); >>> + >>> +static unsigned long nre; // current number of taskinfo on exitlist >>> + >>> +/* >>> +** structs that uniquely identify a TCP connection (host endian format) >>> +*/ >>> +struct tcpv4_ident { >>> + uint32_t laddr; /* local IP address */ >>> + uint32_t raddr; /* remote IP address */ >>> + uint16_t lport; /* local port number */ >>> + uint16_t rport; /* remote port number */ >>> +}; >>> + >>> +struct tcpv6_ident { >>> + struct in6_addr laddr; /* local IP address */ >>> + struct in6_addr raddr; /* remote IP address */ >>> + uint16_t lport; /* local port number */ >>> + uint16_t rport; /* remote port number */ >>> +}; >>> + >>> +/* >>> +** struct to maintain the reference from a socket >>> +** to a thread and thread-group >>> +*/ >>> +struct sockinfo { >>> + struct chainer ch; >>> + >>> + unsigned char last_state; // last known state of socket >>> + uint8_t proto; // protocol >>> + >>> + union keydef { >>> + uint16_t udp; // UDP ident (only portnumber) >>> + struct tcpv4_ident tcp4; // TCP connection ident IPv4 >>> + struct tcpv6_ident tcp6; // TCP connection ident IPv6 >>> + } key; >>> + >>> + struct taskinfo *tgp; // ref to thread group >>> + struct taskinfo *thp; // ref to thread (or NULL) >>> + >>> + short tgh; // hash number of thread group >>> + short thh; // hash number of thread >>> + >>> + unsigned int sndpacks; // temporary counters in case >>> + unsigned int rcvpacks; // known yet >>> + unsigned long sndbytes; // no relation to process is >>> + unsigned long rcvbytes; >>> + >>> + unsigned long long lastact; // last updated (jiffies) >>> +}; >>> + >>> +static struct kmem_cache *sicache; // sockinfo cache >>> + >>> +/* >>> +** hash table to find a socket reference >>> +*/ >>> +#define SBUCKS 1024 // must be multiple of 2! 
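(Side note, not part of the patch: the "& (TBUCKS-1)" masking used by THASH
above — and the same pattern in the sockinfo hash below — only spreads entries
over all buckets when the bucket count is a power of two, which 1024 is, not
merely a "multiple of 2" as the comments say. A tiny worked example of the
masking:)

    /* illustration only: with TBUCKS = 1024 = 2^10, the mask keeps the
     * low 10 bits, so bucket numbers 0..1023 are all reachable */
    #define TBUCKS  1024
    #define THASH(x, t)  (((x) + (t)) & (TBUCKS - 1))

    int bucket = THASH(12345, 'g');   /* (12345 + 'g') & 1023 = 160 */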
>>> +#define SHASHTCP4(x) (((x).raddr+(x).lport+(x).rport)&(SBUCKS-1)) >>> +#define SHASHUDP(x) ((x)&(SBUCKS-1)) >>> + >>> +struct { >>> + struct chainer ch; >>> + spinlock_t lock; >>> +} shash[SBUCKS]; >>> + >>> +static unsigned long nrs; // current number sockinfo allocated >>> +static unsigned long nrs_ovf; // no sockinfo allocated due to overflow >>> +static DEFINE_SPINLOCK(nrslock); >>> + >>> +/* >>> +** various static counters >>> +*/ >>> +static unsigned long icmpsndbytes; >>> +static unsigned long icmpsndpacks; >>> +static unsigned long icmprcvbytes; >>> +static unsigned long icmprcvpacks; >>> + >>> +static unsigned long tcpsndpacks; >>> +static unsigned long tcprcvpacks; >>> +static unsigned long udpsndpacks; >>> +static unsigned long udprcvpacks; >>> +static unsigned long unidentudpsndpacks; >>> +static unsigned long unidentudprcvpacks; >>> +static unsigned long unidenttcpsndpacks; >>> +static unsigned long unidenttcprcvpacks; >>> + >>> +static unsigned long unknownproto; >>> + >>> +static DEFINE_MUTEX(gclock); >>> +static unsigned long long gclast; // last garbage collection >>> (jiffies) >>> + >>> +static struct task_struct *knetatop_task; >>> + >>> +static struct timespec boottime; >>> + >>> +/* >>> +** function prototypes >>> +*/ >>> +static void analyze_tcpv4_packet(struct sk_buff *, >>> + const struct net_device *, int, char, >>> + struct iphdr *, void *); >>> + >>> +static void analyze_udp_packet(struct sk_buff *, >>> + const struct net_device *, int, char, >>> + struct iphdr *, void *); >>> + >>> +static int sock2task(char, struct sockinfo *, >>> + struct taskinfo **, short *, >>> + struct sk_buff *, const struct net_device *, >>> + int, char); >>> + >>> +static void update_taskcounters(struct sk_buff *, >>> + const struct net_device *, >>> + struct taskinfo *, char); >>> + >>> +static void update_sockcounters(struct sk_buff *, >>> + const struct net_device *, >>> + struct sockinfo *, char); >>> + >>> +static void sock2task_sync(struct sk_buff *, >>> + struct sockinfo *, struct taskinfo *); >>> + >>> +static void register_unident(struct sockinfo *); >>> + >>> +static int calc_reallen(struct sk_buff *, >>> + const struct net_device *); >>> + >>> +static void get_tcpv4_ident(struct iphdr *, void *, >>> + char, union keydef *); >>> + >>> +static struct sockinfo *find_sockinfo(int, union keydef *, int, int); >>> +static struct sockinfo *make_sockinfo(int, union keydef *, int, int); >>> + >>> +static void wipesockinfo(void); >>> +static void wipetaskinfo(void); >>> +static void wipetaskexit(void); >>> + >>> +static void garbage_collector(void); >>> +static void gctaskexit(void); >>> +static void gcsockinfo(void); >>> +static void gctaskinfo(void); >>> + >>> +static void move_taskinfo(struct taskinfo *); >>> +static void delete_taskinfo(struct taskinfo *); >>> +static void delete_sockinfo(struct sockinfo *); >>> + >>> +static struct taskinfo *get_taskinfo(pid_t, char); >>> + >>> +static int getsockopt(struct sock *, int, void *, int *); >>> + >>> +static int netatop_open(struct inode *inode, struct file *file); >>> + >>> +/* >>> +** hook definitions >>> +*/ >>> +static struct nf_hook_ops hookin_ipv4; >>> +static struct nf_hook_ops hookout_ipv4; >>> + >>> +/* >>> +** getsockopt definitions for communication with user space >>> +*/ >>> +static struct nf_sockopt_ops sockopts = { >>> + .pf = PF_INET, >>> + .get_optmin = NETATOP_BASE_CTL, >>> + .get_optmax = NETATOP_BASE_CTL+6, >>> + .get = getsockopt, >>> + .owner = THIS_MODULE, >>> +}; >>> + >>> +static struct file_operations 
netatop_proc_fops = { >>> + .open = netatop_open, >>> + .read = seq_read, >>> + .llseek = seq_lseek, >>> + .release = single_release, >>> + .owner = THIS_MODULE, >>> +}; >>> + >>> + >>> +/* >>> +** hook function to be called for every incoming local packet >>> +*/ >>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 10, 0) >>> +# if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 4, 0) >>> +# define HOOK_ARG_TYPE void *priv >>> +# else >>> +# define HOOK_ARG_TYPE const struct nf_hook_ops *ops >>> +# endif >>> +#else >>> +# define HOOK_ARG_TYPE unsigned int hooknum >>> +#endif >>> + >>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 1, 0) >>> +#define HOOK_STATE_ARGS const struct nf_hook_state *state >>> + >>> +#define DEV_IN state->in >>> +#define DEV_OUT state->out >>> +#else >>> +# ifndef __GENKSYMS__ >>> +# define HOOK_STATE_ARGS const struct net_device *in, \ >>> + const struct net_device *out, \ >>> + const struct nf_hook_state *state >>> +# else >>> +# define HOOK_STATE_ARGS const struct net_device *in, \ >>> + const struct net_device *out, \ >>> + int (*okfn)(struct sk_buff *) >>> +# endif >>> +#define DEV_IN in >>> +#define DEV_OUT out >>> +#endif >>> + >>> + >>> +static unsigned int >>> +ipv4_hookin(HOOK_ARG_TYPE, struct sk_buff *skb, HOOK_STATE_ARGS) >>> +{ >>> + struct iphdr *iph; >>> + void *trh; >>> + >>> + if (skb == NULL) // useless socket buffer? >>> + return NF_ACCEPT; >>> + >>> + /* >>> + ** get pointer to IP header and transport header >>> + */ >>> + iph = (struct iphdr *)skb_network_header(skb); >>> + trh = ((char *)iph + (iph->ihl * 4)); >>> + >>> + /* >>> + ** react on protocol number >>> + */ >>> + switch (iph->protocol) { >>> + case IPPROTO_TCP: >>> + tcprcvpacks++; >>> + analyze_tcpv4_packet(skb, DEV_IN, 0, 'i', iph, trh); >>> + break; >>> + >>> + case IPPROTO_UDP: >>> + udprcvpacks++; >>> + analyze_udp_packet(skb, DEV_IN, 0, 'i', iph, trh); >>> + break; >>> + >>> + case IPPROTO_ICMP: >>> + icmprcvpacks++; >>> + icmprcvbytes += skb->len + DEV_IN->hard_header_len + 4; >>> + break; >>> + >>> + default: >>> + unknownproto++; >>> + } >>> + >>> + // accept every packet after stats gathering >>> + return NF_ACCEPT; >>> +} >>> + >>> +/* >>> +** hook function to be called for every outgoing local packet >>> +*/ >>> +static unsigned int >>> +ipv4_hookout(HOOK_ARG_TYPE, struct sk_buff *skb, HOOK_STATE_ARGS) >>> +{ >>> + int in_syscall = !in_interrupt(); >>> + struct iphdr *iph; >>> + void *trh; >>> + >>> + if (skb == NULL) // useless socket buffer? 
>>> + return NF_ACCEPT; >>> + >>> + /* >>> + ** get pointer to IP header and transport header >>> + */ >>> + iph = (struct iphdr *)skb_network_header(skb); >>> + trh = skb_transport_header(skb); >>> + >>> + /* >>> + ** react on protocol number >>> + */ >>> + switch (iph->protocol) { >>> + case IPPROTO_TCP: >>> + tcpsndpacks++; >>> + analyze_tcpv4_packet(skb, DEV_OUT, in_syscall, 'o', iph, trh); >>> + break; >>> + >>> + case IPPROTO_UDP: >>> + udpsndpacks++; >>> + analyze_udp_packet(skb, DEV_OUT, in_syscall, 'o', iph, trh); >>> + break; >>> + >>> + case IPPROTO_ICMP: >>> + icmpsndpacks++; >>> + icmpsndbytes += skb->len + DEV_OUT->hard_header_len + 4; >>> + break; >>> + >>> + default: >>> + unknownproto++; >>> + } >>> + >>> + // accept every packet after stats gathering >>> + return NF_ACCEPT; >>> +} >>> + >>> +/* >>> +** generic function (for input and output) to analyze the current packet >>> +*/ >>> +static void >>> +analyze_tcpv4_packet(struct sk_buff *skb, >>> + const struct net_device *ndev, // interface description >>> + int in_syscall, // called during system call? >>> + char direction, // incoming ('i') or outgoing ('o') >>> + struct iphdr *iph, void *trh) >>> +{ >>> + union keydef key; >>> + struct sockinfo *sip; >>> + int bs; // hash bucket for sockinfo >>> + unsigned long sflags; >>> + >>> + /* >>> + ** determine tcpv4_ident that identifies this TCP packet >>> + ** and calculate hash bucket in sockinfo hash >>> + */ >>> + get_tcpv4_ident(iph, trh, direction, &key); >>> + >>> + /* >>> + ** check if we have seen this tcpv4_ident before with a >>> + ** corresponding thread and thread group >>> + */ >>> + bs = SHASHTCP4(key.tcp4); >>> + >>> + spin_lock_irqsave(&shash[bs].lock, sflags); >>> + >>> + if ( (sip = find_sockinfo(IPPROTO_TCP, &key, sizeof key.tcp4, bs)) >>> + == NULL) { >>> + // no sockinfo yet: create one >>> + if ( (sip = make_sockinfo(IPPROTO_TCP, &key, >>> + sizeof key.tcp4, bs)) == NULL) { >>> + if (direction == 'i') >>> + unidenttcprcvpacks++; >>> + else >>> + unidenttcpsndpacks++; >>> + goto unlocks; >>> + } >>> + } >>> + >>> + if (skb->sk) >>> + sip->last_state = skb->sk->sk_state; >>> + >>> + /* >>> + ** if needed (re)connect the sockinfo to a taskinfo and update >>> + ** the counters >>> + */ >>> + >>> + // connect to thread group and update >>> + if (sock2task('g', sip, &sip->tgp, &sip->tgh, >>> + skb, ndev, in_syscall, direction)) { >>> + // connect to thread and update >>> + (void) sock2task('t', sip, &sip->thp, &sip->thh, >>> + skb, ndev, in_syscall, direction); >>> + } >>> + >>> +unlocks: >>> + spin_unlock_irqrestore(&shash[bs].lock, sflags); >>> +} >>> + >>> + >>> +/* >>> +** generic function (for input and output) to analyze the current packet >>> +*/ >>> +static void >>> +analyze_udp_packet(struct sk_buff *skb, >>> + const struct net_device *ndev, // interface description >>> + int in_syscall, // called during system call? >>> + char direction, // incoming ('i') or outgoing ('o') >>> + struct iphdr *iph, void *trh) >>> +{ >>> + struct udphdr *udph = (struct udphdr *)trh; >>> + uint16_t udplocal = (direction == 'i' ? 
>>> + ntohs(udph->dest) : ntohs(udph->source)); >>> + int bs; // hash bucket for sockinfo >>> + >>> + union keydef key; >>> + struct sockinfo *sip; >>> + unsigned long sflags; >>> + >>> + /* >>> + ** check if we have seen this local UDP port before with a >>> + ** corresponding thread and thread group >>> + */ >>> + key.udp = udplocal; >>> + bs = SHASHUDP(udplocal); >>> + >>> + spin_lock_irqsave(&shash[bs].lock, sflags); >>> + >>> + if ( (sip = find_sockinfo(IPPROTO_UDP, &key, sizeof key.udp, bs)) >>> + == NULL) { >>> + // no sockinfo yet: create one >>> + if ( (sip = make_sockinfo(IPPROTO_UDP, &key, >>> + sizeof key.udp, bs)) == NULL) { >>> + if (direction == 'i') >>> + unidentudprcvpacks++; >>> + else >>> + unidentudpsndpacks++; >>> + goto unlocks; >>> + } >>> + } >>> + >>> + /* >>> + ** if needed (re)connect the sockinfo to a taskinfo and update >>> + ** the counters >>> + */ >>> + >>> + // connect to thread group and update >>> + if (sock2task('g', sip, &sip->tgp, &sip->tgh, >>> + skb, ndev, in_syscall, direction)) { >>> + // connect to thread and update >>> + (void) sock2task('t', sip, &sip->thp, &sip->thh, >>> + skb, ndev, in_syscall, direction); >>> + } >>> + >>> +unlocks: >>> + spin_unlock_irqrestore(&shash[bs].lock, sflags); >>> +} >>> + >>> +/* >>> +** connect the sockinfo to the correct taskinfo and update the counters >>> +*/ >>> +static int >>> +sock2task(char idtype, struct sockinfo *sip, struct taskinfo **tipp, >>> + short *hash, struct sk_buff *skb, const struct net_device *ndev, >>> + int in_syscall, char direction) >>> +{ >>> + pid_t curid; >>> + unsigned long tflags; >>> + >>> + if (*tipp == NULL) { >>> + /* >>> + ** no taskinfo connected yet for this reference from >>> + ** sockinfo; to connect to a taskinfo, we must >>> + ** be in system call handling now --> verify >>> + */ >>> + if (!in_syscall) { >>> + if (idtype == 'g') >>> + update_sockcounters(skb, ndev, sip, direction); >>> + >>> + return 0; // failed >>> + } >>> + >>> + /* >>> + ** try to find existing taskinfo or create new taskinfo >>> + */ >>> + curid = (idtype == 'g' ? current->tgid : current->pid); >>> + >>> + *hash = THASH(curid, idtype); // calc hashQ >>> + >>> + spin_lock_irqsave(&thash[*hash].lock, tflags); >>> + >>> + if ( (*tipp = get_taskinfo(curid, idtype)) == NULL) { >>> + /* >>> + ** not possible to connect >>> + */ >>> + spin_unlock_irqrestore(&thash[*hash].lock, tflags); >>> + >>> + if (idtype == 'g') >>> + update_sockcounters(skb, ndev, sip, direction); >>> + >>> + return 0; // failed >>> + } >>> + >>> + /* >>> + ** new connection made: >>> + ** update task counters with sock counters >>> + */ >>> + sock2task_sync(skb, sip, *tipp); >>> + } else { >>> + /* >>> + ** already related to thread group or thread >>> + ** lock existing task >>> + */ >>> + spin_lock_irqsave(&thash[*hash].lock, tflags); >>> + >>> + /* >>> + ** check if socket has been passed to another process in the >>> + ** meantime, like programs as xinetd use to do >>> + ** if so, connect sockinfo to the new task >>> + */ >>> + if (in_syscall) { >>> + curid = (idtype == 'g' ? 
current->tgid : current->pid); >>> + >>> + if ((*tipp)->id != curid) { >>> + spin_unlock_irqrestore(&thash[*hash].lock, >>> + tflags); >>> + *hash = THASH(curid, idtype); >>> + >>> + spin_lock_irqsave(&thash[*hash].lock, tflags); >>> + >>> + if ( (*tipp = get_taskinfo(curid, idtype)) >>> + == NULL) { >>> + spin_unlock_irqrestore( >>> + &thash[*hash].lock, tflags); >>> + return 0; >>> + } >>> + } >>> + } >>> + } >>> + >>> + update_taskcounters(skb, ndev, *tipp, direction); >>> + >>> + spin_unlock_irqrestore(&thash[*hash].lock, tflags); >>> + >>> + return 1; >>> +} >>> + >>> +/* >>> +** update the statistics of a particular thread group or thread >>> +*/ >>> +static void >>> +update_taskcounters(struct sk_buff *skb, const struct net_device *ndev, >>> + struct taskinfo *tip, char direction) >>> +{ >>> + struct iphdr *iph = (struct iphdr *)skb_network_header(skb); >>> + int reallen = calc_reallen(skb, ndev); >>> + >>> + switch (iph->protocol) { >>> + case IPPROTO_TCP: >>> + if (direction == 'i') { >>> + tip->tc.tcprcvpacks++; >>> + tip->tc.tcprcvbytes += reallen; >>> + } else { >>> + tip->tc.tcpsndpacks++; >>> + tip->tc.tcpsndbytes += reallen; >>> + } >>> + break; >>> + >>> + case IPPROTO_UDP: >>> + if (direction == 'i') { >>> + tip->tc.udprcvpacks++; >>> + tip->tc.udprcvbytes += reallen; >>> + } else { >>> + tip->tc.udpsndpacks++; >>> + tip->tc.udpsndbytes += reallen; >>> + } >>> + } >>> +} >>> + >>> +/* >>> +** update the statistics of a sockinfo without a connected task >>> +*/ >>> +static void >>> +update_sockcounters(struct sk_buff *skb, const struct net_device *ndev, >>> + struct sockinfo *sip, char direction) >>> +{ >>> + int reallen = calc_reallen(skb, ndev); >>> + >>> + if (direction == 'i') { >>> + sip->rcvpacks++; >>> + sip->rcvbytes += reallen; >>> + } else { >>> + sip->sndpacks++; >>> + sip->sndbytes += reallen; >>> + } >>> +} >>> + >>> +/* >>> +** add the temporary counters in the sockinfo to the new connected task >>> +*/ >>> +static void >>> +sock2task_sync(struct sk_buff *skb, struct sockinfo *sip, struct taskinfo >>> *tip) >>> +{ >>> + struct iphdr *iph = (struct iphdr *)skb_network_header(skb); >>> + >>> + switch (iph->protocol) { >>> + case IPPROTO_TCP: >>> + tip->tc.tcprcvpacks += sip->rcvpacks; >>> + tip->tc.tcprcvbytes += sip->rcvbytes; >>> + tip->tc.tcpsndpacks += sip->sndpacks; >>> + tip->tc.tcpsndbytes += sip->sndbytes; >>> + break; >>> + >>> + case IPPROTO_UDP: >>> + tip->tc.udprcvpacks += sip->rcvpacks; >>> + tip->tc.udprcvbytes += sip->rcvbytes; >>> + tip->tc.udpsndpacks += sip->sndpacks; >>> + tip->tc.udpsndbytes += sip->sndbytes; >>> + } >>> +} >>> + >>> +static void >>> +register_unident(struct sockinfo *sip) >>> +{ >>> + switch (sip->proto) { >>> + case IPPROTO_TCP: >>> + unidenttcprcvpacks += sip->rcvpacks; >>> + unidenttcpsndpacks += sip->sndpacks; >>> + break; >>> + >>> + case IPPROTO_UDP: >>> + unidentudprcvpacks += sip->rcvpacks; >>> + unidentudpsndpacks += sip->sndpacks; >>> + } >>> +} >>> + >>> +/* >>> +** calculate the number of bytes that are really sent or received >>> +*/ >>> +static int >>> +calc_reallen(struct sk_buff *skb, const struct net_device *ndev) >>> +{ >>> + /* >>> + ** calculate the real load of this packet on the network: >>> + ** >>> + ** - length of IP header, TCP/UDP header and data (skb->len) >>> + ** >>> + ** since packet assembly/disassembly is done by the IP layer >>> + ** (we get an input packet that has been assembled already and >>> + ** an output packet that still has to be disassembled), additional >>> + ** IP 
headers/interface headers and interface headers have >>> + ** to be calculated for packets that are larger than the mtu >>> + ** >>> + ** - interface header length + 4 bytes crc >>> + */ >>> + int reallen = skb->len; >>> + >>> + if (reallen > ndev->mtu) >>> + reallen += (reallen / ndev->mtu) * >>> + (sizeof(struct iphdr) + ndev->hard_header_len + 4); >>> + >>> + reallen += ndev->hard_header_len + 4; >>> + >>> + return reallen; >>> +} >>> + >>> +/* >>> +** find the tcpv4_ident for the current packet, represented by >>> +** the skb_buff >>> +*/ >>> +static void >>> +get_tcpv4_ident(struct iphdr *iph, void *trh, char direction, union keydef >>> *key) >>> +{ >>> + struct tcphdr *tcph = (struct tcphdr *)trh; >>> + >>> + memset(key, 0, sizeof *key); // important for memcmp later on >>> + >>> + /* >>> + ** determine local/remote IP address and >>> + ** determine local/remote port number >>> + */ >>> + switch (direction) { >>> + case 'i': // incoming packet >>> + key->tcp4.laddr = ntohl(iph->daddr); >>> + key->tcp4.raddr = ntohl(iph->saddr); >>> + key->tcp4.lport = ntohs(tcph->dest); >>> + key->tcp4.rport = ntohs(tcph->source); >>> + break; >>> + >>> + case 'o': // outgoing packet >>> + key->tcp4.laddr = ntohl(iph->saddr); >>> + key->tcp4.raddr = ntohl(iph->daddr); >>> + key->tcp4.lport = ntohs(tcph->source); >>> + key->tcp4.rport = ntohs(tcph->dest); >>> + } >>> +} >>> + >>> +/* >>> +** search for the sockinfo holding the given address info >>> +** the appropriate hash bucket must have been locked before calling >>> +*/ >>> +static struct sockinfo * >>> +find_sockinfo(int proto, union keydef *identp, int identsz, int hash) >>> +{ >>> + struct sockinfo *sip = shash[hash].ch.next; >>> + >>> + /* >>> + ** search for appropriate struct >>> + */ >>> + while (sip != (void *)&shash[hash].ch) { >>> + if ( memcmp(&sip->key, identp, identsz) == 0 && >>> + sip->proto == proto) { >>> + sip->lastact = jiffies_64; >>> + return sip; >>> + } >>> + >>> + sip = sip->ch.next; >>> + } >>> + >>> + return NULL; // not existing >>> +} >>> + >>> +/* >>> +** create a new sockinfo and fill >>> +** the appropriate hash bucket must have been locked before calling >>> +*/ >>> +static struct sockinfo * >>> +make_sockinfo(int proto, union keydef *identp, int identsz, int hash) >>> +{ >>> + struct sockinfo *sip; >>> + unsigned long flags; >>> + >>> + /* >>> + ** check if the threshold of memory used for sockinfo structs >>> + ** is reached to avoid that a fork bomb of processes opening >>> + ** a socket leads to memory overload >>> + */ >>> + if ( (nrs+1) * sizeof(struct sockinfo) > SILIMIT) { >>> + spin_lock_irqsave(&nrslock, flags); >>> + nrs_ovf++; >>> + spin_unlock_irqrestore(&nrslock, flags); >>> + return NULL; >>> + } >>> + >>> + if ( (sip = kmem_cache_alloc(sicache, GFP_ATOMIC)) == NULL) >>> + return NULL; >>> + >>> + spin_lock_irqsave(&nrslock, flags); >>> + nrs++; >>> + spin_unlock_irqrestore(&nrslock, flags); >>> + >>> + /* >>> + ** insert new struct in doubly linked list >>> + */ >>> + memset(sip, '\0', sizeof *sip); >>> + >>> + sip->ch.next = &shash[hash].ch; >>> + sip->ch.prev = shash[hash].ch.prev; >>> + ((struct sockinfo *)shash[hash].ch.prev)->ch.next = sip; >>> + shash[hash].ch.prev = sip; >>> + >>> + sip->proto = proto; >>> + sip->lastact = jiffies_64; >>> + sip->key = *identp; >>> + >>> + return sip; >>> +} >>> + >>> +/* >>> +** search the taskinfo structure holding the info about the given id/type >>> +** if such taskinfo is not yet present, create a new one >>> +*/ >>> +static struct taskinfo * >>> 
+get_taskinfo(pid_t id, char type) >>> +{ >>> + int bt = THASH(id, type); >>> + struct taskinfo *tip = thash[bt].ch.next; >>> + unsigned long tflags; >>> + >>> + /* >>> + ** search if id exists already >>> + */ >>> + while (tip != (void *)&thash[bt].ch) { >>> + if (tip->id == id && tip->type == type) >>> + return tip; >>> + >>> + tip = tip->ch.next; >>> + } >>> + >>> + /* >>> + ** check if the threshold of memory used for taskinfo structs >>> + ** is reached to avoid that a fork bomb of processes opening >>> + ** a socket lead to memory overload >>> + */ >>> + if ( (nre+nrt+1) * sizeof(struct taskinfo) > TILIMIT) { >>> + spin_lock_irqsave(&nrtlock, tflags); >>> + nrt_ovf++; >>> + spin_unlock_irqrestore(&nrtlock, tflags); >>> + return NULL; >>> + } >>> + >>> + /* >>> + ** id not known yet >>> + ** add new entry to hash list >>> + */ >>> + if ( (tip = kmem_cache_alloc(ticache, GFP_ATOMIC)) == NULL) >>> + return NULL; >>> + >>> + spin_lock_irqsave(&nrtlock, tflags); >>> + nrt++; >>> + spin_unlock_irqrestore(&nrtlock, tflags); >>> + >>> + /* >>> + ** insert new struct in doubly linked list >>> + ** and fill values >>> + */ >>> + memset(tip, '\0', sizeof *tip); >>> + >>> + tip->ch.next = &thash[bt].ch; >>> + tip->ch.prev = thash[bt].ch.prev; >>> + ((struct taskinfo *)thash[bt].ch.prev)->ch.next = tip; >>> + thash[bt].ch.prev = tip; >>> + >>> + tip->id = id; >>> + tip->type = type; >>> + >>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 17, 0) >>> + tip->btime = div_u64((current->real_start_time + >>> + (boottime.tv_sec * NSEC_PER_SEC + >>> + boottime.tv_sec)), NSEC_PER_SEC); >>> +#else >>> + // current->real_start_time is type u64 >>> + tip->btime = current->real_start_time.tv_sec + boottime.tv_sec; >>> + >>> + if (current->real_start_time.tv_nsec + boottime.tv_nsec > NSEC_PER_SEC) >>> + tip->btime++; >>> +#endif >>> + >>> + strncpy(tip->command, current->comm, COMLEN); >>> + >>> + return tip; >>> +} >>> + >>> +/* >>> +** garbage collector that removes: >>> +** - exited tasks that are no longer used by user mode programs >>> +** - sockinfo's that are not used any more >>> +** - taskinfo's that do not exist any more >>> +** >>> +** a mutex avoids that the garbage collector runs several times in parallel >>> +** >>> +** this function may only be called in process context! 
>>> +*/ >>> +static void >>> +garbage_collector(void) >>> +{ >>> + mutex_lock(&gclock); >>> + >>> + if (jiffies_64 < gclast + (HZ/2)) { // maximum 2 GC cycles per second >>> + mutex_unlock(&gclock); >>> + return; >>> + } >>> + >>> + gctaskexit(); // remove remaining taskinfo structs from exit list >>> + >>> + gcsockinfo(); // clean up sockinfo structs in shash list >>> + >>> + gctaskinfo(); // clean up taskinfo structs in thash list >>> + >>> + gclast = jiffies_64; >>> + >>> + mutex_unlock(&gclock); >>> +} >>> + >>> +/* >>> +** tasks in the exitlist can be read by a user mode process for a limited >>> +** amount of time; this function removes all taskinfo structures that have >>> +** not been read within that period of time >>> +** notice that exited processes are chained to the tail, so the oldest >>> +** can be found at the head >>> +*/ >>> +static void >>> +gctaskexit() >>> +{ >>> + unsigned long flags; >>> + struct taskinfo *tip; >>> + >>> + spin_lock_irqsave(&exitlock, flags); >>> + >>> + for (tip=exithead; tip;) { >>> + if (jiffies_64 < tip->exittime + GCINTERVAL) >>> + break; >>> + >>> + // remove taskinfo from exitlist >>> + exithead = tip->ch.next; >>> + kmem_cache_free(ticache, tip); >>> + nre--; >>> + tip = exithead; >>> + } >>> + >>> + /* >>> + ** if list empty now, then exithead and exittail both NULL >>> + ** wakeup waiters for emptylist >>> + */ >>> + if (nre == 0) { >>> + exittail = NULL; >>> + wake_up_interruptible(&exitlist_empty); >>> + } >>> + >>> + spin_unlock_irqrestore(&exitlock, flags); >>> +} >>> + >>> +/* >>> +** cleanup sockinfo structures that are connected to finished processes >>> +*/ >>> +static void >>> +gcsockinfo() >>> +{ >>> + int i; >>> + struct sockinfo *sip, *sipsave; >>> + unsigned long sflags, tflags; >>> + struct pid *pid; >>> + >>> + /* >>> + ** go through all sockinfo hash buckets >>> + */ >>> + for (i=0; i < SBUCKS; i++) { >>> + if (shash[i].ch.next == (void *)&shash[i].ch) >>> + continue; // quick return without lock >>> + >>> + spin_lock_irqsave(&shash[i].lock, sflags); >>> + >>> + sip = shash[i].ch.next; >>> + >>> + /* >>> + ** search all sockinfo structs chained in one bucket >>> + */ >>> + while (sip != (void *)&shash[i].ch) { >>> + /* >>> + ** TCP connections that were not in >>> + ** state ESTABLISHED or LISTEN can be >>> + ** eliminated >>> + */ >>> + if (sip->proto == IPPROTO_TCP) { >>> + switch (sip->last_state) { >>> + case TCP_ESTABLISHED: >>> + case TCP_LISTEN: >>> + break; >>> + >>> + default: >>> + sipsave = sip->ch.next; >>> + delete_sockinfo(sip); >>> + sip = sipsave; >>> + continue; >>> + } >>> + } >>> + >>> + /* >>> + ** check if this sockinfo has no relation >>> + ** for a while with a thread group >>> + ** if so, delete the sockinfo >>> + */ >>> + if (sip->tgp == NULL) { >>> + if (sip->lastact + GCMAXUNREF < jiffies_64) { >>> + register_unident(sip); >>> + sipsave = sip->ch.next; >>> + delete_sockinfo(sip); >>> + sip = sipsave; >>> + } else { >>> + sip = sip->ch.next; >>> + } >>> + continue; >>> + } >>> + >>> + /* >>> + ** check if referred thread group is >>> + ** already marked as 'indelete' during this >>> + ** sockinfo search >>> + ** if so, delete this sockinfo >>> + */ >>> + spin_lock_irqsave(&thash[sip->tgh].lock, tflags); >>> + >>> + if (sip->tgp->state == INDELETE) { >>> + spin_unlock_irqrestore(&thash[sip->tgh].lock, >>> + tflags); >>> + sipsave = sip->ch.next; >>> + delete_sockinfo(sip); >>> + sip = sipsave; >>> + continue; >>> + } >>> + >>> + /* >>> + ** check if referred thread group still exists; >>> + ** this 
step will be skipped if we already verified >>> + ** the existance of the thread group earlier during >>> + ** this garbage collection cycle >>> + */ >>> + if (sip->tgp->state != CHECKED) { >>> + /* >>> + ** connected thread group not yet verified >>> + ** during this cycle, so check if it still >>> + ** exists >>> + ** if not, mark the thread group as 'indelete' >>> + ** (it can not be deleted right now because >>> + ** we might find other sockinfo's referring >>> + ** to this thread group during the current >>> + ** cycle) and delete this sockinfo >>> + ** if the thread group exists, just mark >>> + ** it as 'checked' for this cycle >>> + */ >>> + rcu_read_lock(); >>> + pid = find_vpid(sip->tgp->id); >>> + rcu_read_unlock(); >>> + >>> + if (pid == NULL) { >>> + sip->tgp->state = INDELETE; >>> + spin_unlock_irqrestore( >>> + &thash[sip->tgh].lock, tflags); >>> + >>> + sipsave = sip->ch.next; >>> + delete_sockinfo(sip); >>> + sip = sipsave; >>> + continue; >>> + } else { >>> + sip->tgp->state = CHECKED; >>> + } >>> + } >>> + >>> + spin_unlock_irqrestore(&thash[sip->tgh].lock, tflags); >>> + >>> + /* >>> + ** check if this sockinfo has a relation with a thread >>> + ** if not, skip further handling of this sockinfo >>> + */ >>> + if (sip->thp == NULL) { >>> + sip = sip->ch.next; >>> + continue; >>> + } >>> + >>> + /* >>> + ** check if referred thread is already marked >>> + ** as 'indelete' during this sockinfo search >>> + ** if so, break connection >>> + */ >>> + spin_lock_irqsave(&thash[sip->thh].lock, tflags); >>> + >>> + if (sip->thp->state == INDELETE) { >>> + spin_unlock_irqrestore(&thash[sip->thh].lock, >>> + tflags); >>> + sip->thp = NULL; >>> + sip = sip->ch.next; >>> + continue; >>> + } >>> + >>> + /* >>> + ** check if referred thread is already checked >>> + ** during this sockinfo search >>> + */ >>> + if (sip->thp->state == CHECKED) { >>> + spin_unlock_irqrestore(&thash[sip->thh].lock, >>> + tflags); >>> + sip = sip->ch.next; >>> + continue; >>> + } >>> + >>> + /* >>> + ** connected thread not yet verified >>> + ** check if it still exists >>> + ** if not, mark it as 'indelete' and break connection >>> + ** if thread exists, mark it 'checked' >>> + */ >>> + rcu_read_lock(); >>> + pid = find_vpid(sip->thp->id); >>> + rcu_read_unlock(); >>> + >>> + if (pid == NULL) { >>> + sip->thp->state = INDELETE; >>> + sip->thp = NULL; >>> + } else { >>> + sip->thp->state = CHECKED; >>> + } >>> + >>> + spin_unlock_irqrestore(&thash[sip->thh].lock, tflags); >>> + >>> + /* >>> + ** check if a TCP port has not been used >>> + ** for some time --> destroy even if the thread >>> + ** (group) is still there >>> + */ >>> + if (sip->proto == IPPROTO_TCP && >>> + sip->lastact + GCMAXTCP < jiffies_64) { >>> + sipsave = sip->ch.next; >>> + delete_sockinfo(sip); >>> + sip = sipsave; >>> + continue; >>> + } >>> + >>> + /* >>> + ** check if a UDP port has not been used >>> + ** for some time --> destroy even if the thread >>> + ** (group) is still there >>> + ** e.g. 
outgoing DNS requests (to remote port 53) are >>> + ** issued every time with another source port being >>> + ** a new object that should not be kept too long; >>> + ** local well-known ports are useful to keep >>> + */ >>> + if (sip->proto == IPPROTO_UDP && >>> + sip->lastact + GCMAXUDP < jiffies_64 && >>> + sip->key.udp > 1024) { >>> + sipsave = sip->ch.next; >>> + delete_sockinfo(sip); >>> + sip = sipsave; >>> + continue; >>> + } >>> + >>> + sip = sip->ch.next; >>> + } >>> + >>> + spin_unlock_irqrestore(&shash[i].lock, sflags); >>> + } >>> +} >>> + >>> +/* >>> +** remove taskinfo structures of finished tasks from hash list >>> +*/ >>> +static void >>> +gctaskinfo() >>> +{ >>> + int i; >>> + struct taskinfo *tip, *tipsave; >>> + unsigned long tflags; >>> + struct pid *pid; >>> + >>> + /* >>> + ** go through all taskinfo hash buckets >>> + */ >>> + for (i=0; i < TBUCKS; i++) { >>> + if (thash[i].ch.next == (void *)&thash[i].ch) >>> + continue; // quick return without lock >>> + >>> + spin_lock_irqsave(&thash[i].lock, tflags); >>> + >>> + tip = thash[i].ch.next; >>> + >>> + /* >>> + ** check all taskinfo structs chained to this bucket >>> + */ >>> + while (tip != (void *)&thash[i].ch) { >>> + switch (tip->state) { >>> + /* >>> + ** remove INDELETE tasks from the hash buckets >>> + ** -- move thread group to exitlist >>> + ** -- destroy thread right away >>> + */ >>> + case INDELETE: >>> + tipsave = tip->ch.next; >>> + >>> + if (tip->type == 'g') >>> + move_taskinfo(tip); // thread group >>> + else >>> + delete_taskinfo(tip); // thread >>> + >>> + tip = tipsave; >>> + break; >>> + >>> + case CHECKED: >>> + tip->state = 0; >>> + tip = tip->ch.next; >>> + break; >>> + >>> + default: // not checked yet >>> + rcu_read_lock(); >>> + pid = find_vpid(tip->id); >>> + rcu_read_unlock(); >>> + >>> + if (pid == NULL) { >>> + tipsave = tip->ch.next; >>> + >>> + if (tip->type == 'g') >>> + move_taskinfo(tip); >>> + else >>> + delete_taskinfo(tip); >>> + >>> + tip = tipsave; >>> + } else { >>> + tip = tip->ch.next; >>> + } >>> + } >>> + } >>> + >>> + spin_unlock_irqrestore(&thash[i].lock, tflags); >>> + } >>> +} >>> + >>> + >>> +/* >>> +** remove all sockinfo structs >>> +*/ >>> +static void >>> +wipesockinfo() >>> +{ >>> + struct sockinfo *sip, *sipsave; >>> + int i; >>> + unsigned long sflags; >>> + >>> + for (i=0; i < SBUCKS; i++) { >>> + spin_lock_irqsave(&shash[i].lock, sflags); >>> + >>> + sip = shash[i].ch.next; >>> + >>> + /* >>> + ** free all structs chained in one bucket >>> + */ >>> + while (sip != (void *)&shash[i].ch) { >>> + sipsave = sip->ch.next; >>> + delete_sockinfo(sip); >>> + sip = sipsave; >>> + } >>> + >>> + spin_unlock_irqrestore(&shash[i].lock, sflags); >>> + } >>> +} >>> + >>> +/* >>> +** remove all taskinfo structs from hash list >>> +*/ >>> +static void >>> +wipetaskinfo() >>> +{ >>> + struct taskinfo *tip, *tipsave; >>> + int i; >>> + unsigned long tflags; >>> + >>> + for (i=0; i < TBUCKS; i++) { >>> + spin_lock_irqsave(&thash[i].lock, tflags); >>> + >>> + tip = thash[i].ch.next; >>> + >>> + /* >>> + ** free all structs chained in one bucket >>> + */ >>> + while (tip != (void *)&thash[i].ch) { >>> + tipsave = tip->ch.next; >>> + delete_taskinfo(tip); >>> + tip = tipsave; >>> + } >>> + >>> + spin_unlock_irqrestore(&thash[i].lock, tflags); >>> + } >>> +} >>> + >>> +/* >>> +** remove all taskinfo structs from exit list >>> +*/ >>> +static void >>> +wipetaskexit() >>> +{ >>> + gctaskexit(); >>> +} >>> + >>> +/* >>> +** move one taskinfo struct from hash bucket to exitlist >>> +*/ 
>>> +static void >>> +move_taskinfo(struct taskinfo *tip) >>> +{ >>> + unsigned long flags; >>> + >>> + /* >>> + ** remove from hash list >>> + */ >>> + ((struct taskinfo *)tip->ch.next)->ch.prev = tip->ch.prev; >>> + ((struct taskinfo *)tip->ch.prev)->ch.next = tip->ch.next; >>> + >>> + spin_lock_irqsave(&nrtlock, flags); >>> + nrt--; >>> + spin_unlock_irqrestore(&nrtlock, flags); >>> + >>> + /* >>> + ** add to exitlist >>> + */ >>> + tip->ch.next = NULL; >>> + tip->state = FINISHED; >>> + tip->exittime = jiffies_64; >>> + >>> + spin_lock_irqsave(&exitlock, flags); >>> + >>> + if (exittail) { // list filled? >>> + exittail->ch.next = tip; >>> + exittail = tip; >>> + } else { // list empty >>> + exithead = exittail = tip; >>> + } >>> + >>> + nre++; >>> + >>> + wake_up_interruptible(&exitlist_filled); >>> + >>> + spin_unlock_irqrestore(&exitlock, flags); >>> +} >>> + >>> +/* >>> +** remove one taskinfo struct for the hash bucket chain >>> +*/ >>> +static void >>> +delete_taskinfo(struct taskinfo *tip) >>> +{ >>> + unsigned long flags; >>> + >>> + ((struct taskinfo *)tip->ch.next)->ch.prev = tip->ch.prev; >>> + ((struct taskinfo *)tip->ch.prev)->ch.next = tip->ch.next; >>> + >>> + kmem_cache_free(ticache, tip); >>> + >>> + spin_lock_irqsave(&nrtlock, flags); >>> + nrt--; >>> + spin_unlock_irqrestore(&nrtlock, flags); >>> +} >>> + >>> +/* >>> +** remove one sockinfo struct for the hash bucket chain >>> +*/ >>> +static void >>> +delete_sockinfo(struct sockinfo *sip) >>> +{ >>> + unsigned long flags; >>> + >>> + ((struct sockinfo *)sip->ch.next)->ch.prev = sip->ch.prev; >>> + ((struct sockinfo *)sip->ch.prev)->ch.next = sip->ch.next; >>> + >>> + kmem_cache_free(sicache, sip); >>> + >>> + spin_lock_irqsave(&nrslock, flags); >>> + nrs--; >>> + spin_unlock_irqrestore(&nrslock, flags); >>> +} >>> + >>> +/* >>> +** read function for /proc/netatop >>> +*/ >>> +static int >>> +netatop_show(struct seq_file *m, void *v) >>> +{ >>> + seq_printf(m, "tcpsndpacks: %12lu (unident: %9lu)\n" >>> + "tcprcvpacks: %12lu (unident: %9lu)\n" >>> + "udpsndpacks: %12lu (unident: %9lu)\n" >>> + "udprcvpacks: %12lu (unident: %9lu)\n\n" >>> + "icmpsndpacks: %12lu\n" >>> + "icmprcvpacks: %12lu\n\n" >>> + "#sockinfo: %12lu (overflow: %8lu)\n" >>> + "#taskinfo: %12lu (overflow: %8lu)\n" >>> + "#taskexit: %12lu\n\n" >>> + "modversion: %14s\n", >>> + tcpsndpacks, unidenttcpsndpacks, >>> + tcprcvpacks, unidenttcprcvpacks, >>> + udpsndpacks, unidentudpsndpacks, >>> + udprcvpacks, unidentudprcvpacks, >>> + icmpsndpacks, icmprcvpacks, >>> + nrs, nrs_ovf, >>> + nrt, nrt_ovf, >>> + nre, NETATOPVERSION); >>> + return 0; >>> +} >>> + >>> +static int >>> +netatop_open(struct inode *inode, struct file *file) >>> +{ >>> + return single_open(file, netatop_show, NULL); >>> +} >>> + >>> +/* >>> +** called when user spce issues system call getsockopt() >>> +*/ >>> +static int >>> +getsockopt(struct sock *sk, int cmd, void __user *user, int *len) >>> +{ >>> + int bt; >>> + struct taskinfo *tip; >>> + char tasktype = 't'; >>> + struct netpertask npt; >>> + unsigned long tflags; >>> + >>> + /* >>> + ** verify the proper privileges >>> + */ >>> + if (!capable(CAP_NET_ADMIN)) >>> + return -EPERM; >>> + >>> + /* >>> + ** react on command >>> + */ >>> + switch (cmd) { >>> + case NETATOP_PROBE: >>> + break; >>> + >>> + case NETATOP_FORCE_GC: >>> + garbage_collector(); >>> + break; >>> + >>> + case NETATOP_EMPTY_EXIT: >>> + while (nre > 0) { >>> + if (wait_event_interruptible(exitlist_empty, nre == 0)) >>> + return -ERESTARTSYS; >>> + } >>> + break; 
>>> + >>> + case NETATOP_GETCNT_EXIT: >>> + if (nre == 0) >>> + wake_up_interruptible(&exitlist_empty); >>> + >>> + if (*len < sizeof(pid_t)) >>> + return -EINVAL; >>> + >>> + if (*len > sizeof npt) >>> + *len = sizeof npt; >>> + >>> + spin_lock_irqsave(&exitlock, tflags); >>> + >>> + /* >>> + ** check if an exited process is present >>> + ** if not, wait for it... >>> + */ >>> + while (nre == 0) { >>> + spin_unlock_irqrestore(&exitlock, tflags); >>> + >>> + if ( wait_event_interruptible(exitlist_filled, nre > 0)) >>> + return -ERESTARTSYS; >>> + >>> + spin_lock_irqsave(&exitlock, tflags); >>> + } >>> + >>> + /* >>> + ** get first eprocess from exitlist and remove it from there >>> + */ >>> + tip = exithead; >>> + >>> + if ( (exithead = tip->ch.next) == NULL) >>> + exittail = NULL; >>> + >>> + nre--; >>> + >>> + spin_unlock_irqrestore(&exitlock, tflags); >>> + >>> + /* >>> + ** pass relevant info to user mode >>> + ** and free taskinfo struct >>> + */ >>> + npt.id = tip->id; >>> + npt.tc = tip->tc; >>> + npt.btime = tip->btime; >>> + memcpy(npt.command, tip->command, COMLEN); >>> + >>> + if (copy_to_user(user, &npt, *len) != 0) >>> + return -EFAULT; >>> + >>> + kmem_cache_free(ticache, tip); >>> + >>> + return 0; >>> + >>> + case NETATOP_GETCNT_TGID: >>> + tasktype = 'g'; >>> + >>> + case NETATOP_GETCNT_PID: >>> + if (*len < sizeof(pid_t)) >>> + return -EINVAL; >>> + >>> + if (*len > sizeof npt) >>> + *len = sizeof npt; >>> + >>> + if (copy_from_user(&npt, user, *len) != 0) >>> + return -EFAULT; >>> + >>> + /* >>> + ** search requested id in taskinfo hash >>> + */ >>> + bt = THASH(npt.id, tasktype); // calculate hash >>> + >>> + if (thash[bt].ch.next == (void *)&thash[bt].ch) >>> + return -ESRCH; // quick return without lock >>> + >>> + spin_lock_irqsave(&thash[bt].lock, tflags); >>> + >>> + tip = thash[bt].ch.next; >>> + >>> + while (tip != (void *)&thash[bt].ch) { >>> + // is this the one? 
>>> + if (tip->id == npt.id && tip->type == tasktype) { >>> + /* >>> + ** found: copy results to user space >>> + */ >>> + memcpy(npt.command, tip->command, COMLEN); >>> + npt.tc = tip->tc; >>> + npt.btime = tip->btime; >>> + >>> + spin_unlock_irqrestore(&thash[bt].lock, tflags); >>> + >>> + if (copy_to_user(user, &npt, *len) != 0) >>> + return -EFAULT; >>> + else >>> + return 0; >>> + } >>> + >>> + tip = tip->ch.next; >>> + } >>> + >>> + spin_unlock_irqrestore(&thash[bt].lock, tflags); >>> + return -ESRCH; >>> + >>> + default: >>> + printk(KERN_INFO "unknown getsockopt command %d\n", cmd); >>> + return -EINVAL; >>> + } >>> + >>> + return 0; >>> +} >>> + >>> +/* >>> +** kernel mode thread: initiate garbage collection every N seconds >>> +*/ >>> +static int netatop_thread(void *dummy) >>> +{ >>> + while (!kthread_should_stop()) { >>> + /* >>> + ** do garbage collection >>> + */ >>> + garbage_collector(); >>> + >>> + /* >>> + ** wait a while >>> + */ >>> + (void) schedule_timeout_interruptible(GCINTERVAL); >>> + } >>> + >>> + return 0; >>> +} >>> + >>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 13, 0) >>> +static int __net_init netatop_nf_register(struct net *net) >>> +{ >>> + int err; >>> + >>> + err = nf_register_net_hooks(net, &hookin_ipv4, 1); >>> + if (err) >>> + return err; >>> + >>> + err = nf_register_net_hooks(net, &hookout_ipv4, 1); >>> + if (err) >>> + nf_unregister_net_hooks(net, &hookin_ipv4, 1); >>> + >>> + return err; >>> +} >>> + >>> +static void __net_exit netatop_nf_unregister(struct net *net) >>> +{ >>> + nf_unregister_net_hooks(net, &hookin_ipv4, 1); >>> + nf_unregister_net_hooks(net, &hookout_ipv4, 1); >>> +} >>> + >>> +static struct pernet_operations netatop_net_ops = { >>> + .init = netatop_nf_register, >>> + .exit = netatop_nf_unregister, >>> +}; >>> +#endif >>> + >>> +/* >>> +** called when module loaded >>> +*/ >>> +int >>> +init_module() >>> +{ >>> + int i; >>> + >>> + /* >>> + ** initialize caches for taskinfo and sockinfo >>> + */ >>> + ticache = kmem_cache_create("Netatop_taskinfo", >>> + sizeof (struct taskinfo), 0, 0, NULL); >>> + if (!ticache) >>> + return -EFAULT; >>> + >>> + sicache = kmem_cache_create("Netatop_sockinfo", >>> + sizeof (struct sockinfo), 0, 0, NULL); >>> + if (!sicache) { >>> + kmem_cache_destroy(ticache); >>> + return -EFAULT; >>> + } >>> + >>> + /* >>> + ** initialize hash table for taskinfo and sockinfo >>> + */ >>> + for (i=0; i < TBUCKS; i++) { >>> + thash[i].ch.next = &thash[i].ch; >>> + thash[i].ch.prev = &thash[i].ch; >>> + spin_lock_init(&thash[i].lock); >>> + } >>> + >>> + for (i=0; i < SBUCKS; i++) { >>> + shash[i].ch.next = &shash[i].ch; >>> + shash[i].ch.prev = &shash[i].ch; >>> + spin_lock_init(&shash[i].lock); >>> + } >>> + >>> + getboottime(&boottime); >>> + >>> + /* >>> + ** register getsockopt for user space communication >>> + */ >>> + if (nf_register_sockopt(&sockopts) < 0) { >>> + kmem_cache_destroy(ticache); >>> + kmem_cache_destroy(sicache); >>> + return -1; >>> + } >>> + >>> + /* >>> + ** create a new kernel mode thread for time-driven garbage collection >>> + ** after creation, the thread waits until it is woken up >>> + */ >>> + knetatop_task = kthread_create(netatop_thread, NULL, "knetatop"); >>> + >>> + if (IS_ERR(knetatop_task)) { >>> + nf_unregister_sockopt(&sockopts); >>> + kmem_cache_destroy(ticache); >>> + kmem_cache_destroy(sicache); >>> + return -1; >>> + } >>> + >>> + /* >>> + ** prepare hooks for input and output packets >>> + */ >>> + hookin_ipv4.hooknum = NF_IP_LOCAL_IN; // input packs >>> + 
hookin_ipv4.hook = ipv4_hookin; // func to call >>> + hookin_ipv4.pf = PF_INET; // IPV4 packets >>> + hookin_ipv4.priority = NF_IP_PRI_FIRST; // highest prio >>> + >>> + hookout_ipv4.hooknum = NF_IP_LOCAL_OUT; // output packs >>> + hookout_ipv4.hook = ipv4_hookout; // func to call >>> + hookout_ipv4.pf = PF_INET; // IPV4 packets >>> + hookout_ipv4.priority = NF_IP_PRI_FIRST; // highest prio >>> + >>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 13, 0) >>> + i = register_pernet_subsys(&netatop_net_ops); >>> + if (i) >>> + return i; >>> +#else >>> + nf_register_hook(&hookin_ipv4); // register hook >>> + nf_register_hook(&hookout_ipv4); // register hook >>> +#endif >>> + >>> + /* >>> + ** create a /proc-entry to produce status-info on request >>> + */ >>> + proc_create("netatop", 0444, NULL, &netatop_proc_fops); >>> + >>> + /* >>> + ** all admi prepared; kick off kernel mode thread >>> + */ >>> + wake_up_process(knetatop_task); >>> + >>> + return 0; // return success >>> +} >>> + >>> +/* >>> +** called when module unloaded >>> +*/ >>> +void >>> +cleanup_module() >>> +{ >>> + /* >>> + ** tell kernel daemon to stop >>> + */ >>> + kthread_stop(knetatop_task); >>> + >>> + /* >>> + ** unregister netfilter hooks and other miscellaneous stuff >>> + */ >>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 13, 0) >>> + unregister_pernet_subsys(&netatop_net_ops); >>> +#else >>> + nf_unregister_hook(&hookin_ipv4); >>> + nf_unregister_hook(&hookout_ipv4); >>> +#endif >>> + >>> + remove_proc_entry("netatop", NULL); >>> + >>> + nf_unregister_sockopt(&sockopts); >>> + >>> + /* >>> + ** destroy allocated stats >>> + */ >>> + wipesockinfo(); >>> + wipetaskinfo(); >>> + wipetaskexit(); >>> + >>> + /* >>> + ** destroy caches >>> + */ >>> + kmem_cache_destroy(ticache); >>> + kmem_cache_destroy(sicache); >>> +} >>> + >>> +module_init(init_module); >>> +module_exit(cleanup_module); >>> diff --git a/drivers/staging/netatop/netatop.h >>> b/drivers/staging/netatop/netatop.h >>> new file mode 100644 >>> index 000000000000..a14e6fa5db52 >>> --- /dev/null >>> +++ b/drivers/staging/netatop/netatop.h >>> @@ -0,0 +1,47 @@ >>> +#define COMLEN 16 >>> + >>> +struct taskcount { >>> + unsigned long long tcpsndpacks; >>> + unsigned long long tcpsndbytes; >>> + unsigned long long tcprcvpacks; >>> + unsigned long long tcprcvbytes; >>> + >>> + unsigned long long udpsndpacks; >>> + unsigned long long udpsndbytes; >>> + unsigned long long udprcvpacks; >>> + unsigned long long udprcvbytes; >>> + >>> + /* space for future extensions */ >>> +}; >>> + >>> +struct netpertask { >>> + pid_t id; // tgid or tid (depending on command) >>> + unsigned long btime; >>> + char command[COMLEN]; >>> + >>> + struct taskcount tc; >>> +}; >>> + >>> + >>> +/* >>> +** getsocktop commands >>> +*/ >>> +#define NETATOP_BASE_CTL 15661 >>> + >>> +// just probe if the netatop module is active >>> +#define NETATOP_PROBE (NETATOP_BASE_CTL) >>> + >>> +// force garbage collection to make finished processes available >>> +#define NETATOP_FORCE_GC (NETATOP_BASE_CTL+1) >>> + >>> +// wait until all finished processes are read (blocks until done) >>> +#define NETATOP_EMPTY_EXIT (NETATOP_BASE_CTL+2) >>> + >>> +// get info for finished process (blocks until available) >>> +#define NETATOP_GETCNT_EXIT (NETATOP_BASE_CTL+3) >>> + >>> +// get counters for thread group (i.e. 
process): input is 'id' (pid) >>> +#define NETATOP_GETCNT_TGID (NETATOP_BASE_CTL+4) >>> + >>> +// get counters for thread: input is 'id' (tid) >>> +#define NETATOP_GETCNT_PID (NETATOP_BASE_CTL+5) >>> diff --git a/drivers/staging/netatop/netatopversion.h >>> b/drivers/staging/netatop/netatopversion.h >>> new file mode 100644 >>> index 000000000000..eb011f5a13c9 >>> --- /dev/null >>> +++ b/drivers/staging/netatop/netatopversion.h >>> @@ -0,0 +1,2 @@ >>> +#define NETATOPVERSION "2.0" >>> +#define NETATOPDATE "2017/09/30 12:06:19" >>> >> > > -- _______________________________________________ linux-yocto mailing list [email protected] https://lists.yoctoproject.org/listinfo/linux-yocto
