Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.

2017-12-19 Thread Marcelo Ricardo Leitner
On Tue, Dec 19, 2017 at 03:38:16PM +, Ilya Lesokhin wrote:
> Tuesday, December 19, 2017 5:12 PM, Marcelo Ricardo Leitner wrote:
> 
> > > I'm not quite sure what you mean by "no net_device's are registered"
> > > Presumably you mean there is no device that implements the
> > > NETIF_F_HW_TLS_TX capability yet.
> > 
> > Not really. Let me try again. This patchset is using the expression
> > "tls_device". When I read that, I expect a new interface type, like a
> > tunnel, that would be created on top of another interface that has the
> > offloading capability. That's why I'm confused. IMHO "tls_offload" is
> > a better fit. Makes sense?
> > 
> 
> We don't expose a new interface. An existing netdev does the offload.
> 
> The xfrm layer also calls the offload layer xfrm_device, and it also
> doesn't need to add another interface to offload ipsec to a netdev.

Hm right, there is xfrm_dev_init() and others, but there is also
XFRM_OFFLOAD as the config define and not XFRM_DEVICE.

> 
> I thought about calling it tls_hw or tls_hw_offload.
> The problem is that the important distinction here is that the 
> offload is done by a netdev.
> tls_sw can also use hw offload if you have the required 
> memory to memory crypto engine and crypto_alloc_aead("gcm(aes)", 0, 0); 
> decides on using it.

Now I can see the confusion in both ways, thanks.
And now I don't have a preference either.

  Marcelo


RE: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.

2017-12-19 Thread Ilya Lesokhin
Tuesday, December 19, 2017 5:12 PM, Marcelo Ricardo Leitner wrote:

> > I'm not quite sure what you mean by "no net_device's are registered"
> > Presumably you mean there is no device that implements the
> > NETIF_F_HW_TLS_TX capability yet.
> 
> Not really. Let me try again. This patchset is using the expression 
> "tls_device".
> When I read that, I expect a new interface type, like a tunnel, that would be
> created on top of another interface that has the offloading capability. That's
> why I'm confused. IMHO "tls_offload" is a better fit. Makes sense?
> 

We don't expose a new interface. An existing netdev does the offload.

The xfrm layer also calls the offload layer xfrm_device, and it also
doesn't need to add another interface to offload ipsec to a netdev.

I thought about calling it tls_hw or tls_hw_offload.
The problem is that the important distinction here is that the 
offload is done by a netdev.
tls_sw can also use hw offload if you have the required 
memory to memory crypto engine and crypto_alloc_aead("gcm(aes)", 0, 0); 
decides on using it.



Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.

2017-12-19 Thread Marcelo Ricardo Leitner
On Tue, Dec 19, 2017 at 07:31:24AM +, Ilya Lesokhin wrote:
> On Monday, December 18, 2017 9:54 PM, Marcelo Ricardo Leitner wrote:
> 
> > On Mon, Dec 18, 2017 at 01:10:33PM +0200, Ilya Lesokhin wrote:
> > > This patch adds a generic infrastructure to offload TLS crypto to
> > > network devices. It enables the kernel TLS socket to skip encryption
> > > and authentication operations on the transmit side of the data path,
> > > leaving those computationally expensive operations to the NIC.
> > 
> > I have a hard time understanding why this was named 'tls_device' if no
> > net_device's are registered.
> > 
> I'm not quite sure what you mean by "no net_device's are registered"
> Presumably you mean there is no device that implements the 
> NETIF_F_HW_TLS_TX capability yet.

Not really. Let me try again. This patchset is using the expression
"tls_device". When I read that, I expect a new interface type, like a
tunnel, that would be created on top of another interface that has the
offloading capability. That's why I'm confused. IMHO "tls_offload" is
a better fit. Makes sense?

> I'll just say that the IPSEC device offload infrastructure was also submitted
> https://github.com/torvalds/linux/commit/d77e38e612a017480157fe6d2c1422f42cb5b7e3
> before the first implementation
> https://github.com/torvalds/linux/commit/bebb23e6cb02d2fc752905e39d09ff6152852c6c
> 
> And we did provide a link to an implementation 
> https://github.com/Mellanox/tls-offload/tree/tls_device_v3
> for people who want to take a look.
> Unfortunately it is not ready for upstream submission yet

Yep, although I still have to get there.

Thanks,
Marcelo


Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.

2017-12-19 Thread kbuild test robot
Hi Ilya,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/tls-Add-generic-NIC-offload-infrastructure/20171219-140819
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)


Please review and possibly fold the followup patch.

vim +649 net/tls/tls_device.c

   547  
   548  int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
   549  {
   550          struct tls_crypto_info *crypto_info;
   551          struct tls_offload_context *offload_ctx;
   552          struct tls_record_info *start_marker_record;
   553          u16 nonece_size, tag_size, iv_size, rec_seq_size;
   554          char *iv, *rec_seq;
   555          int rc;
   556          struct net_device *netdev;
   557          struct sk_buff *skb;
   558  
   559          if (!ctx) {
   560                  rc = -EINVAL;
   561                  goto out;
   562          }
   563  
   564          if (ctx->priv_ctx) {
   565                  rc = -EEXIST;
   566                  goto out;
   567          }
   568  
   569          /* We support starting offload on multiple sockets
   570           * concurrently, So we only need a read lock here.
   571           */
   572          percpu_down_read(&device_offload_lock);
   573          netdev = get_netdev_for_sock(sk);
   574          if (!netdev) {
   575                  pr_err("%s: netdev not found\n", __func__);
   576                  rc = -EINVAL;
   577                  goto release_lock;
   578          }
   579  
   580          if (!(netdev->features & NETIF_F_HW_TLS_TX)) {
   581                  rc = -ENOTSUPP;
   582                  goto release_netdev;
   583          }
   584  
   585          /* Avoid offloading if the device is down
   586           * We don't want to offload new flows after
   587           * the NETDEV_DOWN event
   588           */
   589          if (!(netdev->flags & IFF_UP)) {
   590                  rc = -EINVAL;
   591                  goto release_lock;
   592          }
   593  
   594          crypto_info = &ctx->crypto_send;
   595          switch (crypto_info->cipher_type) {
   596          case TLS_CIPHER_AES_GCM_128: {
   597                  nonece_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
   598                  tag_size = TLS_CIPHER_AES_GCM_128_TAG_SIZE;
   599                  iv_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
   600                  iv = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->iv;
   601                  rec_seq_size = TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE;
   602                  rec_seq =
   603                   ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->rec_seq;
   604                  break;
   605          }
   606          default:
   607                  rc = -EINVAL;
   608                  goto release_netdev;
   609          }
   610  
   611          start_marker_record = kmalloc(sizeof(*start_marker_record), GFP_KERNEL);
   612          if (!start_marker_record) {
   613                  rc = -ENOMEM;
   614                  goto release_netdev;
   615          }
   616  
   617          rc = attach_sock_to_netdev(sk, netdev, ctx);
   618          if (rc)
   619                  goto free_marker_record;
   620  
   621          ctx->netdev = netdev;
   622  
   623          ctx->prepend_size = TLS_HEADER_SIZE + nonece_size;
   624          ctx->tag_size = tag_size;
   625          ctx->iv_size = iv_size;
   626          ctx->iv = kmalloc(iv_size + TLS_CIPHER_AES_GCM_128_SALT_SIZE,
   627                            GFP_KERNEL);
   628          if (!ctx->iv) {
   629                  rc = -ENOMEM;
   630                  goto detach_sock;
   631          }
   632  
   633          memcpy(ctx->iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, rec_seq, iv_size);
   634  
   635          ctx->rec_seq_size = rec_seq_size;
   636          ctx->rec_seq = kmalloc(rec_seq_size, GFP_KERNEL);
   637          if (!ctx->rec_seq) {
   638                  rc = -ENOMEM;
   639                  goto err_iv;
   640          }
   641          memcpy(ctx->rec_seq, rec_seq, rec_seq_size);
   642  
   643          offload_ctx = ctx->priv_ctx;
   644          memcpy(&offload_ctx->unacked_record_sn, rec_seq,
   645                 sizeof(offload_ctx->unacked_record_sn));
   646  
   647          /* start at rec_seq -1 to account for the start marker record */
   648          offload_ctx->unacked_record_sn =
 > 649                  be64_to_cpu(offload_ctx->unacked_record_sn) - 1;
   650  
   651          rc = tls_sw_fallback_init(sk, offload_ctx, crypto_info);
   652          if (rc)
   653                  goto err_iv;
   654  
   655          start_marker_record->end_seq = tcp_sk(sk)->write_seq;
   656          start_marker_record->len = 0;
   657          start_marker_record->num_frags = 0;
   65
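
[Editor's sketch, not part of the robot's report.] The report omits the sparse message itself, so the following is an assumption: with __CHECK_ENDIAN__, sparse typically objects when be64_to_cpu() is applied to a plain u64, which is what line 649 does after the memcpy() into unacked_record_sn. A userspace sketch of the same computation that decodes the big-endian rec_seq bytes explicitly, so no value is ever held in the wrong byte order, could look like this. The helper names load_be64() and first_unacked_record_sn() are hypothetical.

```c
#include <stdint.h>

/* Decode 8 big-endian bytes into a host-order 64-bit value. */
static uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Mirror of "start at rec_seq - 1 to account for the start marker
 * record": rec_seq is the wire-format (big-endian) record sequence.
 */
static uint64_t first_unacked_record_sn(const unsigned char rec_seq[8])
{
	return load_be64(rec_seq) - 1;
}
```

In kernel code the equivalent would presumably be to keep the copied value in a __be64 temporary before converting it; that is a guess at the intended fix, not something the report states.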

RE: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.

2017-12-18 Thread Ilya Lesokhin
On Monday, December 18, 2017 9:54 PM, Marcelo Ricardo Leitner wrote:

> On Mon, Dec 18, 2017 at 01:10:33PM +0200, Ilya Lesokhin wrote:
> > This patch adds a generic infrastructure to offload TLS crypto to
> > network devices. It enables the kernel TLS socket to skip encryption
> > and authentication operations on the transmit side of the data path,
> > leaving those computationally expensive operations to the NIC.
> 
> I have a hard time understanding why this was named 'tls_device' if no
> net_device's are registered.
> 
I'm not quite sure what you mean by "no net_device's are registered".
Presumably you mean there is no device that implements the
NETIF_F_HW_TLS_TX capability yet.
I'll just say that the IPSEC device offload infrastructure was also submitted
https://github.com/torvalds/linux/commit/d77e38e612a017480157fe6d2c1422f42cb5b7e3
before the first implementation
https://github.com/torvalds/linux/commit/bebb23e6cb02d2fc752905e39d09ff6152852c6c

And we did provide a link to an implementation 
https://github.com/Mellanox/tls-offload/tree/tls_device_v3
for people who want to take a look.
Unfortunately, it is not ready for upstream submission yet.


> > +   percpu_down_read(&device_offload_lock);
> > +   netdev = get_netdev_for_sock(sk);
> > +   if (!netdev) {
> > +   pr_err("%s: netdev not found\n", __func__);
> 
> _ratelimit?
> 

Thanks, we will fix it in the future.


Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.

2017-12-18 Thread kbuild test robot
Hi Ilya,

I love your patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/tls-Add-generic-NIC-offload-infrastructure/20171219-140819
config: tile-allmodconfig (attached as .config)
compiler: tilegx-linux-gcc (GCC) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=tile 

All errors (new ones prefixed by >>):

   net/tls/tls_device_fallback.c: In function 'update_chksum':
>> net/tls/tls_device_fallback.c:190:16: error: implicit declaration of function 'csum_ipv6_magic'; did you mean 'csum_tcpudp_magic'? [-Werror=implicit-function-declaration]
      th->check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
                   ^~~
                   csum_tcpudp_magic
   cc1: some warnings being treated as errors
   cc1: some warnings being treated as errors

vim +190 net/tls/tls_device_fallback.c

   166  
   167  static inline void update_chksum(struct sk_buff *skb, int headln)
   168  {
   169          /* Can't use icsk->icsk_af_ops->send_check here because the ip addresses
   170           * might have been changed by NAT.
   171           */
   172  
   173          const struct ipv6hdr *ipv6h;
   174          const struct iphdr *iph;
   175          struct tcphdr *th = tcp_hdr(skb);
   176          int datalen = skb->len - headln;
   177  
   178          /* We only changed the payload so if we are using partial we don't
   179           * need to update anything.
   180           */
   181          if (likely(skb->ip_summed == CHECKSUM_PARTIAL))
   182                  return;
   183  
   184          skb->ip_summed = CHECKSUM_PARTIAL;
   185          skb->csum_start = skb_transport_header(skb) - skb->head;
   186          skb->csum_offset = offsetof(struct tcphdr, check);
   187  
   188          if (skb->sk->sk_family == AF_INET6) {
   189                  ipv6h = ipv6_hdr(skb);
 > 190                  th->check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
   191                                               datalen, IPPROTO_TCP, 0);
   192          } else {
   193                  iph = ip_hdr(skb);
   194                  th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, datalen,
   195                                                 IPPROTO_TCP, 0);
   196          }
   197  }
   198  
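
[Editor's sketch, not part of the robot's report.] The build error is a missing declaration; the likely fix is to include the header that declares csum_ipv6_magic() (in mainline that is <net/ip6_checksum.h>), though the report does not say so. As for what update_chksum() computes: with CHECKSUM_PARTIAL the TCP checksum field is seeded with the non-complemented pseudo-header sum, and the device finishes the sum over the segment. A userspace model of the IPv4 seed is sketched below. The function names are hypothetical, and addresses are taken as host-order integers purely for readability.

```c
#include <stdint.h>

/* Fold a 32-bit one's-complement accumulator down to 16 bits. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* One's-complement sum of the IPv4 TCP pseudo-header:
 * source address, destination address, protocol, TCP length.
 * This models the value ~csum_tcpudp_magic(...) leaves in th->check,
 * read as a big-endian 16-bit word.
 */
static uint16_t tcp4_pseudo_sum(uint32_t saddr, uint32_t daddr,
				uint16_t tcp_len)
{
	uint32_t sum = 0;

	sum += saddr >> 16;
	sum += saddr & 0xffff;
	sum += daddr >> 16;
	sum += daddr & 0xffff;
	sum += 6;		/* IPPROTO_TCP */
	sum += tcp_len;		/* TCP header plus payload, in bytes */
	return fold16(sum);
}
```

For 192.168.0.1 to 192.168.0.2 with a 20-byte segment, the words add up to 0x1816d, which folds to 0x816e.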

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.

2017-12-18 Thread kbuild test robot
Hi Ilya,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/tls-Add-generic-NIC-offload-infrastructure/20171219-140819
config: xtensa-allmodconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=xtensa 

All warnings (new ones prefixed by >>):

   net//tls/tls_device_fallback.c: In function 'tls_sw_fallback':
>> net//tls/tls_device_fallback.c:360:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]
}
^

vim +360 net//tls/tls_device_fallback.c

   214  
   215  /* This function may be called after the user socket is already
   216   * closed so make sure we don't use anything freed during
   217   * tls_sk_proto_close here
   218   */
   219  struct sk_buff *tls_sw_fallback(struct sock *sk, struct sk_buff *skb)
   220  {
   221          int tcp_header_size = tcp_hdrlen(skb);
   222          int tcp_payload_offset = skb_transport_offset(skb) + tcp_header_size;
   223          int payload_len = skb->len - tcp_payload_offset;
   224          struct tls_context *tls_ctx = tls_get_ctx(sk);
   225          struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
   226          int remaining, buf_len, resync_sgs, rc, i = 0;
   227          void *buf, *dummy_buf, *iv, *aad;
   228          struct scatterlist sg_in[2 * (MAX_SKB_FRAGS + 1)];
   229          struct scatterlist sg_out[3];
   230          u32 tcp_seq = ntohl(tcp_hdr(skb)->seq);
   231          struct aead_request *aead_req;
   232          struct sk_buff *nskb = NULL;
   233          struct tls_record_info *record;
   234          unsigned long flags;
   235          s32 sync_size;
   236          u64 rcd_sn;
   237  
   238          if (!payload_len)
   239                  return skb;
   240  
   241          sg_init_table(sg_in, ARRAY_SIZE(sg_in));
   242          sg_init_table(sg_out, ARRAY_SIZE(sg_out));
   243  
   244          spin_lock_irqsave(&ctx->lock, flags);
   245          record = tls_get_record(ctx, tcp_seq, &rcd_sn);
   246          if (!record) {
   247                  spin_unlock_irqrestore(&ctx->lock, flags);
   248                  WARN(1, "Record not found for seq %u\n", tcp_seq);
   249                  goto free_orig;
   250          }
   251  
   252          sync_size = tcp_seq - tls_record_start_seq(record);
   253          if (sync_size < 0) {
   254                  int is_start_marker = tls_record_is_start_marker(record);
   255  
   256                  spin_unlock_irqrestore(&ctx->lock, flags);
   257                  if (!is_start_marker)
   258                  /* This should only occur if the relevant record was
   259                   * already acked. In that case it should be ok
   260                   * to drop the packet and avoid retransmission.
   261                   *
   262                   * There is a corner case where the packet contains
   263                   * both an acked and a non-acked record.
   264                   * We currently don't handle that case and rely
   265                   * on TCP to retranmit a packet that doesn't contain
   266                   * already acked payload.
   267                   */
   268                          goto free_orig;
   269  
   270                  if (payload_len > -sync_size) {
   271                          WARN(1, "Fallback of partially offloaded packets is not supported\n");
   272                          goto free_orig;
   273                  } else {
   274                          return skb;
   275                  }
   276          }
   277  
   278          remaining = sync_size;
   279          while (remaining > 0) {
   280                  skb_frag_t *frag = &record->frags[i];
   281  
   282                  __skb_frag_ref(frag);
   283                  sg_set_page(sg_in + i, skb_frag_page(frag),
   284                              skb_frag_size(frag), frag->page_offset);
   285  
   286                  remaining -= skb_frag_size(frag);
   287  
   288                  if (remaining < 0)
   289                          sg_in[i].length += remaining;
   290  
   291                  i++;
   292          }
   293          spin_unlock_irqrestore(&ctx->lock, flags);
   294          resync_sgs = i;
   295  
   296          aead_req = tls_alloc_aead_request(ctx->aead_send, GFP_ATOMIC);
   297          if (!aead_req)
   298                  goto put_sg;
   299  
   300          buf_len = TLS_CIPHER_AES_GCM_128_SALT_SIZE +
   301                    TLS_CIPHER_AES_GCM_128_IV_SIZE +
   302                    TLS_AAD_SPACE_SIZE +
   303                    sync_size +
   304                    tls_ctx->tag_size;
   305  buf = kma

Re: [PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.

2017-12-18 Thread Marcelo Ricardo Leitner
On Mon, Dec 18, 2017 at 01:10:33PM +0200, Ilya Lesokhin wrote:
> This patch adds a generic infrastructure to offload TLS crypto to
> network devices. It enables the kernel TLS socket to skip encryption
> and authentication operations on the transmit side of the data path,
> leaving those computationally expensive operations to the NIC.

I have a hard time understanding why this was named 'tls_device' if no
net_device's are registered.

> 
> The NIC offload infrastructure builds TLS records and pushes them to
> the TCP layer just like the SW KTLS implementation and using the same API.
> TCP segmentation is mostly unaffected. Currently the only exception is
> that we prevent mixed SKBs where only part of the payload requires
> offload. In the future we are likely to add a similar restriction
> following a change cipher spec record.
> 
> The notable differences between SW KTLS and NIC offloaded TLS
> implementations are as follows:
> 1. The offloaded implementation builds "plaintext TLS records": records
> that contain plaintext instead of ciphertext and placeholder bytes
> instead of authentication tags.
> 2. The offloaded implementation maintains a mapping from TCP sequence
> number to TLS records. Thus given a TCP SKB sent from a NIC offloaded
> TLS socket, we can use the tls NIC offload infrastructure to obtain
> enough context to encrypt the payload of the SKB.
> A TLS record is released when the last byte of the record is ack'ed;
> this is done through the new icsk_clean_acked callback.
> 
> The infrastructure should be extendable to support various NIC offload
> implementations.  However it is currently written with the
> implementation below in mind:
> The NIC assumes that packets from each offloaded stream are sent as
> plaintext and in-order. It keeps track of the TLS records in the TCP
> stream. When a packet marked for offload is transmitted, the NIC
> encrypts the payload in-place and puts authentication tags in the
> relevant place holders.
> 
> The responsibility for handling out-of-order packets (e.g. TCP
> retransmission, qdisc drops) falls on the netdev driver.
> 
> The netdev driver keeps track of the expected TCP SN from the NIC's
> perspective.  If the next packet to transmit matches the expected TCP
> SN, the driver advances the expected TCP SN, and transmits the packet
> with TLS offload indication.
> 
> If the next packet to transmit does not match the expected TCP SN, the
> driver calls the TLS layer to obtain the TLS record that includes the
> TCP sequence number of the packet for transmission. Using this TLS
> record, the driver posts a work entry on the transmit queue to
> reconstruct the NIC TLS state required for the offload of the
> out-of-order packet. It updates the expected TCP SN accordingly and
> transmits the now in-order packet.
> The same queue is used for packet transmission and TLS context
> reconstruction to avoid the need for flushing the transmit queue before
> issuing the context reconstruction request.
> 
> Signed-off-by: Boris Pismenny 
> Signed-off-by: Ilya Lesokhin 
> Signed-off-by: Aviad Yehezkel 
> ---
>  include/net/tls.h             |  62 +++-
>  net/tls/Kconfig               |   9 +
>  net/tls/Makefile              |   3 +
>  net/tls/tls_device.c          | 800 ++
>  net/tls/tls_device_fallback.c | 405 +
>  net/tls/tls_main.c            |  33 +-
>  6 files changed, 1305 insertions(+), 7 deletions(-)
>  create mode 100644 net/tls/tls_device.c
>  create mode 100644 net/tls/tls_device_fallback.c
> 
> diff --git a/include/net/tls.h b/include/net/tls.h
> index 936cfc5cab7d..9c1b5d13d9a7 100644
> --- a/include/net/tls.h
> +++ b/include/net/tls.h
> @@ -75,6 +75,29 @@ struct tls_sw_context {
>   struct scatterlist sg_aead_out[2];
>  };
>  
> +struct tls_record_info {
> + struct list_head list;
> + u32 end_seq;
> + int len;
> + int num_frags;
> + skb_frag_t frags[MAX_SKB_FRAGS];
> +};
> +
> +struct tls_offload_context {
> + struct crypto_aead *aead_send;
> +
> + struct list_head records_list;
> + struct scatterlist sg_tx_data[MAX_SKB_FRAGS];
> + void (*sk_destruct)(struct sock *sk);
> + struct tls_record_info *open_record;
> + struct tls_record_info *retransmit_hint;
> + u64 hint_record_sn;
> + u64 unacked_record_sn;
> +
> + u32 expected_seq;
> + spinlock_t lock;/* protects records list */
> +};
> +
>  enum {
>   TLS_PENDING_CLOSED_RECORD
>  };
> @@ -85,6 +108,10 @@ struct tls_context {
>   struct tls12_crypto_info_aes_gcm_128 crypto_send_aes_gcm_128;
>   };
>  
> + struct list_head list;
> + struct net_device *netdev;
> + refcount_t refcount;
> +
>   void *priv_ctx;
>  
>   u8 tx_conf:2;
> @@ -129,9 +156,29 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
>  void tls_sw_close(struct sock *sk, long timeout);
>  void tls_sw_free_tx_resources(struct sock *sk);
>  
> -void tls_sk_destruct(struct sock *

[PATCH v3 net-next 6/6] tls: Add generic NIC offload infrastructure.

2017-12-18 Thread Ilya Lesokhin
This patch adds a generic infrastructure to offload TLS crypto to
network devices. It enables the kernel TLS socket to skip encryption
and authentication operations on the transmit side of the data path,
leaving those computationally expensive operations to the NIC.

The NIC offload infrastructure builds TLS records and pushes them to
the TCP layer just like the SW KTLS implementation and using the same API.
TCP segmentation is mostly unaffected. Currently the only exception is
that we prevent mixed SKBs where only part of the payload requires
offload. In the future we are likely to add a similar restriction
following a change cipher spec record.

The notable differences between SW KTLS and NIC offloaded TLS
implementations are as follows:
1. The offloaded implementation builds "plaintext TLS records": records
that contain plaintext instead of ciphertext and placeholder bytes
instead of authentication tags.
2. The offloaded implementation maintains a mapping from TCP sequence
number to TLS records. Thus given a TCP SKB sent from a NIC offloaded
TLS socket, we can use the tls NIC offload infrastructure to obtain
enough context to encrypt the payload of the SKB.
A TLS record is released when the last byte of the record is ack'ed;
this is done through the new icsk_clean_acked callback.
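
[Editor's sketch, not part of the patch.] The sequence-number-to-record mapping in point 2 can be modelled in userspace as an ordered array of records keyed by end_seq. The field names end_seq and len follow struct tls_record_info in the diff; the array representation and the helpers seq_before(), find_record() and releasable() are assumptions for illustration (the patch itself keeps a linked list under a spinlock).

```c
#include <stddef.h>
#include <stdint.h>

struct record {
	uint32_t end_seq;	/* TCP seq just past the record's last byte */
	uint32_t len;		/* record length in bytes */
};

/* Wraparound-safe "a is before b" for 32-bit TCP sequence numbers. */
static int seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Index of the record covering seq, or -1 if none does.
 * recs[] is in stream order, so end_seq only moves forward.
 */
static int find_record(const struct record *recs, size_t n, uint32_t seq)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uint32_t start = recs[i].end_seq - recs[i].len;

		if (!seq_before(seq, start) &&
		    seq_before(seq, recs[i].end_seq))
			return (int)i;
	}
	return -1;
}

/* Number of leading records whose last byte is covered by the
 * cumulative ACK, i.e. the records that may be released.
 */
static size_t releasable(const struct record *recs, size_t n,
			 uint32_t acked)
{
	size_t i = 0;

	while (i < n && !seq_before(acked, recs[i].end_seq))
		i++;
	return i;
}
```

With records covering [1000, 1100) and [1100, 1300), a retransmitted segment starting at 1100 maps to the second record, and an ACK of 1100 makes only the first record releasable.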

The infrastructure should be extendable to support various NIC offload
implementations.  However it is currently written with the
implementation below in mind:
The NIC assumes that packets from each offloaded stream are sent as
plaintext and in-order. It keeps track of the TLS records in the TCP
stream. When a packet marked for offload is transmitted, the NIC
encrypts the payload in-place and puts authentication tags in the
relevant placeholders.
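
[Editor's sketch, not part of the patch.] A "plaintext TLS record" as described above can be pictured as header, explicit nonce, plaintext, and a tag-sized placeholder that the NIC later overwrites. The sizes below are the TLS 1.2 AES-GCM-128 ones (5-byte header, 8-byte explicit nonce, 16-byte tag). The function name build_plaintext_record() and the choice to zero the placeholder are assumptions; the patch description does not say what the placeholder bytes contain.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
	HDR_SIZE = 5,		/* type (1), version (2), length (2) */
	NONCE_SIZE = 8,		/* explicit nonce in each record */
	TAG_SIZE = 16,		/* AES-GCM authentication tag */
	MAX_PLAINTEXT = 16384	/* TLS limit on plaintext per record */
};

/* Lay out header | nonce | plaintext | placeholder in out.
 * Returns the record size, or 0 if the input does not fit.
 */
static size_t build_plaintext_record(uint8_t *out, size_t cap, uint8_t type,
				     const uint8_t *nonce,
				     const uint8_t *data, size_t len)
{
	size_t body = NONCE_SIZE + len + TAG_SIZE;
	size_t total = HDR_SIZE + body;

	if (len > MAX_PLAINTEXT || total > cap)
		return 0;

	out[0] = type;
	out[1] = 0x03;		/* TLS 1.2 version bytes */
	out[2] = 0x03;
	out[3] = (uint8_t)(body >> 8);
	out[4] = (uint8_t)(body & 0xff);
	memcpy(out + HDR_SIZE, nonce, NONCE_SIZE);
	memcpy(out + HDR_SIZE + NONCE_SIZE, data, len);
	/* Placeholder only: the device fills in the real tag. */
	memset(out + HDR_SIZE + NONCE_SIZE + len, 0, TAG_SIZE);
	return total;
}
```

Because the record already has its final on-wire length, TCP sequence numbers stay valid whether the NIC or the software fallback does the encryption.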

The responsibility for handling out-of-order packets (e.g. TCP
retransmission, qdisc drops) falls on the netdev driver.

The netdev driver keeps track of the expected TCP SN from the NIC's
perspective.  If the next packet to transmit matches the expected TCP
SN, the driver advances the expected TCP SN, and transmits the packet
with TLS offload indication.

If the next packet to transmit does not match the expected TCP SN, the
driver calls the TLS layer to obtain the TLS record that includes the
TCP sequence number of the packet for transmission. Using this TLS
record, the driver posts a work entry on the transmit queue to
reconstruct the NIC TLS state required for the offload of the
out-of-order packet. It updates the expected TCP SN accordingly and
transmits the now in-order packet.
The same queue is used for packet transmission and TLS context
reconstruction to avoid the need for flushing the transmit queue before
issuing the context reconstruction request.
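
[Editor's sketch, not part of the patch.] The driver-side decision described in the last three paragraphs reduces to comparing each packet's TCP sequence number with the one the NIC expects next. The model below is a userspace illustration only. The names tx_state, tx_action and tx_classify() are hypothetical, and the real resync step (posting a context reconstruction entry on the transmit queue) is represented by a return value.

```c
#include <stdint.h>

enum tx_action {
	TX_OFFLOAD,		/* in order: send with offload indication */
	TX_RESYNC_THEN_OFFLOAD	/* out of order: rebuild NIC state first */
};

struct tx_state {
	uint32_t expected_seq;	/* next TCP SN from the NIC's perspective */
};

/* Classify one outgoing packet and advance the expected TCP SN.
 * In both cases the packet ends up in order from the NIC's point of
 * view, so the expected SN becomes the end of this packet's payload.
 */
static enum tx_action tx_classify(struct tx_state *st, uint32_t seq,
				  uint32_t payload_len)
{
	enum tx_action act = (seq == st->expected_seq) ?
			     TX_OFFLOAD : TX_RESYNC_THEN_OFFLOAD;

	st->expected_seq = seq + payload_len;	/* wraps modulo 2^32 */
	return act;
}
```

A retransmission of an already-sent segment therefore takes the resync path once, after which the stream continues on the fast path.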

Signed-off-by: Boris Pismenny 
Signed-off-by: Ilya Lesokhin 
Signed-off-by: Aviad Yehezkel 
---
 include/net/tls.h             |  62 +++-
 net/tls/Kconfig               |   9 +
 net/tls/Makefile              |   3 +
 net/tls/tls_device.c          | 800 ++
 net/tls/tls_device_fallback.c | 405 +
 net/tls/tls_main.c            |  33 +-
 6 files changed, 1305 insertions(+), 7 deletions(-)
 create mode 100644 net/tls/tls_device.c
 create mode 100644 net/tls/tls_device_fallback.c

diff --git a/include/net/tls.h b/include/net/tls.h
index 936cfc5cab7d..9c1b5d13d9a7 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -75,6 +75,29 @@ struct tls_sw_context {
    struct scatterlist sg_aead_out[2];
 };
 
+struct tls_record_info {
+   struct list_head list;
+   u32 end_seq;
+   int len;
+   int num_frags;
+   skb_frag_t frags[MAX_SKB_FRAGS];
+};
+
+struct tls_offload_context {
+   struct crypto_aead *aead_send;
+
+   struct list_head records_list;
+   struct scatterlist sg_tx_data[MAX_SKB_FRAGS];
+   void (*sk_destruct)(struct sock *sk);
+   struct tls_record_info *open_record;
+   struct tls_record_info *retransmit_hint;
+   u64 hint_record_sn;
+   u64 unacked_record_sn;
+
+   u32 expected_seq;
+   spinlock_t lock;    /* protects records list */
+};
+
 enum {
    TLS_PENDING_CLOSED_RECORD
 };
@@ -85,6 +108,10 @@ struct tls_context {
        struct tls12_crypto_info_aes_gcm_128 crypto_send_aes_gcm_128;
    };
 
+   struct list_head list;
+   struct net_device *netdev;
+   refcount_t refcount;
+
    void *priv_ctx;
 
    u8 tx_conf:2;
@@ -129,9 +156,29 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
 void tls_sw_close(struct sock *sk, long timeout);
 void tls_sw_free_tx_resources(struct sock *sk);
 
-void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
-void tls_icsk_clean_acked(struct sock *sk);
+void tls_clear_device_offload(struct sock *sk, struct tls_context *ctx);
+int tls_set_device_offload(struct sock *sk, struct tls_context *ctx);
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_device_sendpage(struct sock *sk, struct page *