date:20121226

Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-26 Thread Eric W. Biederman

The syscall ABI still has the wrong semantics.

Aka totally unmaintainable and unmergeable.

The concept of domU support is also strange.  What does domU support even mean 
when the dom0 support is loading a kernel to pick up Xen when Xen falls over?

I expect a lot of decisions about what code can be shared and what code can't 
are going to be driven by the simple question: what does the syscall mean?

Sharing machine_kexec.c and relocate_kernel.S does not make much sense to me 
when what you are doing is effectively passing your arguments through to the 
Xen version of kexec.

Either Xen has its own version of those routines or I expect the Xen version 
of kexec is buggy.   I can't imagine what sharing that code would mean.  By the 
same token I can't see any need to duplicate the code either.

Furthermore since this is just passing data from one version of the syscall to 
another I expect you can share the majority of the code across all 
architectures that implement Xen.  The only part I can see being arch specific 
is the Xen syscall stub.

With respect to the proposed semantics of silently giving the kexec system call 
a different meaning when running under Xen: /sbin/kexec has to act somewhat 
differently when loading code into the Xen hypervisor, so there is no point in 
not making that explicit in the ABI.

Eric

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH 2/2] vhost: handle polling failure

2012-12-26 Thread Jason Wang

Currently, polling errors are ignored in vhost. This may lead to some issues
(e.g. a kernel crash when passing a tap fd to vhost before calling TUNSETIFF).
Fix this by:

- extending the idea of vhost_net_poll_state to all vhost_polls
- changing the state only when polling succeeds
- making vhost_poll_start() report errors to the caller, which could be used
  by the caller or userspace.
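
A minimal user-space sketch of the three points above, not the actual vhost
code. struct poll_sketch, poll_start_sketch() and the backend_poll callback are
hypothetical stand-ins for struct vhost_poll, vhost_poll_start() and the
backend file's poll method. The state changes only after the backend accepts
the poll, and a failure is handed back to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-poll state, generalizing the removed vhost_net_poll_state. */
enum poll_state_sketch { POLL_STOPPED = 0, POLL_STARTED = 1 };

struct poll_sketch {
	enum poll_state_sketch state;
	/* Stand-in for the backend's poll method; returns 0 or a -errno value. */
	int (*backend_poll)(void *backend);
};

/* Returns 0 on success (or if already started), a negative errno on failure. */
static int poll_start_sketch(struct poll_sketch *p, void *backend)
{
	int err;

	assert(p != NULL && p->backend_poll != NULL);
	if (p->state == POLL_STARTED)
		return 0;

	err = p->backend_poll(backend);
	if (err)
		return err;		/* state is left untouched on failure */

	p->state = POLL_STARTED;
	return 0;
}

static void poll_stop_sketch(struct poll_sketch *p)
{
	if (p->state != POLL_STARTED)
		return;
	p->state = POLL_STOPPED;
}

/* Two toy backends: one accepts polling, one stands in for a backend
 * that cannot be polled. */
static int backend_ok(void *backend) { (void)backend; return 0; }
static int backend_broken(void *backend) { (void)backend; return -EINVAL; }
```

A caller that gets a non-zero return from poll_start_sketch() can refuse the
backend or pass the error up, instead of carrying on with a poll that was
never armed.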

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c   |   75 +
 drivers/vhost/vhost.c |   16 +-
 drivers/vhost/vhost.h |   11 ++-
 3 files changed, 50 insertions(+), 52 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 629d6b5..56e7f5a 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -64,20 +64,10 @@ enum {
VHOST_NET_VQ_MAX = 2,
 };
 
-enum vhost_net_poll_state {
-   VHOST_NET_POLL_DISABLED = 0,
-   VHOST_NET_POLL_STARTED = 1,
-   VHOST_NET_POLL_STOPPED = 2,
-};
-
 struct vhost_net {
struct vhost_dev dev;
struct vhost_virtqueue vqs[VHOST_NET_VQ_MAX];
struct vhost_poll poll[VHOST_NET_VQ_MAX];
-   /* Tells us whether we are polling a socket for TX.
-* We only do this when socket buffer fills up.
-* Protected by tx vq lock. */
-   enum vhost_net_poll_state tx_poll_state;
/* Number of TX recently submitted.
 * Protected by tx vq lock. */
unsigned tx_packets;
@@ -155,24 +145,6 @@ static void copy_iovec_hdr(const struct iovec *from, 
struct iovec *to,
}
 }
 
-/* Caller must have TX VQ lock */
-static void tx_poll_stop(struct vhost_net *net)
-{
-   if (likely(net->tx_poll_state != VHOST_NET_POLL_STARTED))
-   return;
-   vhost_poll_stop(net->poll + VHOST_NET_VQ_TX);
-   net->tx_poll_state = VHOST_NET_POLL_STOPPED;
-}
-
-/* Caller must have TX VQ lock */
-static void tx_poll_start(struct vhost_net *net, struct socket *sock)
-{
-   if (unlikely(net->tx_poll_state != VHOST_NET_POLL_STOPPED))
-   return;
-   vhost_poll_start(net->poll + VHOST_NET_VQ_TX, sock->file);
-   net->tx_poll_state = VHOST_NET_POLL_STARTED;
-}
-
 /* In case of DMA done not in order in lower device driver for some reason.
  * upend_idx is used to track end of used idx, done_idx is used to track head
  * of used idx. Once lower device DMA done contiguously, we will signal KVM
@@ -252,7 +224,7 @@ static void handle_tx(struct vhost_net *net)
wmem = atomic_read(&sock->sk->sk_wmem_alloc);
if (wmem >= sock->sk->sk_sndbuf) {
mutex_lock(&vq->mutex);
-   tx_poll_start(net, sock);
+   vhost_poll_start(net->poll + VHOST_NET_VQ_TX, sock->file);
mutex_unlock(&vq->mutex);
return;
}
@@ -261,7 +233,7 @@ static void handle_tx(struct vhost_net *net)
vhost_disable_notify(&net->dev, vq);
 
if (wmem < sock->sk->sk_sndbuf / 2)
-   tx_poll_stop(net);
+   vhost_poll_stop(net->poll + VHOST_NET_VQ_TX);
hdr_size = vq->vhost_hlen;
zcopy = vq->ubufs;
 
@@ -283,7 +255,8 @@ static void handle_tx(struct vhost_net *net)
 
wmem = atomic_read(&sock->sk->sk_wmem_alloc);
if (wmem >= sock->sk->sk_sndbuf * 3 / 4) {
-   tx_poll_start(net, sock);
+   vhost_poll_start(net->poll + VHOST_NET_VQ_TX,
+sock->file);
set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
break;
}
@@ -294,7 +267,8 @@ static void handle_tx(struct vhost_net *net)
(vq->upend_idx - vq->done_idx) :
(vq->upend_idx + UIO_MAXIOV - vq->done_idx);
if (unlikely(num_pends > VHOST_MAX_PEND)) {
-   tx_poll_start(net, sock);
+   vhost_poll_start(net->poll + VHOST_NET_VQ_TX,
+sock->file);
set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
break;
}
@@ -360,7 +334,8 @@ static void handle_tx(struct vhost_net *net)
}
vhost_discard_vq_desc(vq, 1);
if (err == -EAGAIN || err == -ENOBUFS)
-   tx_poll_start(net, sock);
+   vhost_poll_start(net->poll + VHOST_NET_VQ_TX,
+sock->file);
break;
}
if (err != len)
@@ -623,7 +598,6 @@ static int vhost_net_open(struct inode *inode, struct file 
*f)
 
vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
-

[PATCH 1/2] vhost_net: correct error hanlding in vhost_net_set_backend()

2012-12-26 Thread Jason Wang

Fix the leak of oldubufs and the fd refcount when initializing the used ring fails.
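
The unwinding order can be illustrated with a small user-space sketch. It is
modeled on the err_used/err_ubufs labels in the diff below, but struct
ref_sketch, set_backend_sketch() and the error codes are hypothetical. A
failure after the new backend is installed releases the old reference and
clears the local pointer, so the fall-through label does not drop the
reference that is now owned by the installed backend.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy reference counter standing in for a socket file reference. */
struct ref_sketch { int count; };

static void ref_get(struct ref_sketch *r) { r->count++; }
static void ref_put(struct ref_sketch *r) { assert(r->count > 0); r->count--; }

/* Hypothetical backend switch with two possible failure points. */
static int set_backend_sketch(struct ref_sketch **installed,
			      struct ref_sketch *sock,
			      int fail_before_install,
			      int fail_after_install)
{
	struct ref_sketch *oldsock = *installed;
	int r;

	ref_get(sock);			/* like taking the fd reference */

	if (fail_before_install) {
		r = -ENOMEM;		/* arbitrary error for the sketch */
		goto err_ubufs;
	}

	*installed = sock;		/* the new backend now owns its reference */

	if (fail_after_install) {
		r = -EFAULT;		/* arbitrary error for the sketch */
		sock = NULL;		/* do not drop the installed reference */
		goto err_used;
	}

	if (oldsock)
		ref_put(oldsock);
	return 0;

err_used:
	if (oldsock)
		ref_put(oldsock);	/* the replaced backend would otherwise leak */
err_ubufs:
	if (sock)
		ref_put(sock);
	return r;
}
```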

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c |   14 +++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index ebd08b2..629d6b5 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -834,8 +834,10 @@ static long vhost_net_set_backend(struct vhost_net *n, 
unsigned index, int fd)
vhost_net_enable_vq(n, vq);
 
r = vhost_init_used(vq);
-   if (r)
-   goto err_vq;
+   if (r) {
+   sock = NULL;
+   goto err_used;
+   }
 
n->tx_packets = 0;
n->tx_zcopy_err = 0;
@@ -859,8 +861,14 @@ static long vhost_net_set_backend(struct vhost_net *n, 
unsigned index, int fd)
mutex_unlock(&n->dev.mutex);
return 0;
 
+err_used:
+   if (oldubufs)
+   vhost_ubuf_put_and_wait(oldubufs);
+   if (oldsock)
+   fput(oldsock->file);
 err_ubufs:
-   fput(sock->file);
+   if (sock)
+   fput(sock->file);
 err_vq:
mutex_unlock(&vq->mutex);
 err:
-- 
1.7.1

Re: [PATCH v3 01/11] kexec: introduce kexec firmware support

2012-12-26 Thread Eric W. Biederman

Daniel Kiper  writes:

> Some kexec/kdump implementations (e.g. Xen PVOPS) could not use default
> Linux infrastructure and require some support from firmware and/or hypervisor.
> To cope with that problem kexec firmware infrastructure was introduced.
> It allows a developer to use all kexec/kdump features of given firmware
> or hypervisor.

As this stands this patch is wrong.

You need to pass an additional flag from userspace through /sbin/kexec
that says load the kexec image in the firmware.  A global variable here
is not ok.

As I understand it you are loading a kexec-on-Xen-panic image, which
is semantically different from a kexec-on-Linux-panic image.  It is not
ok to have a silly global variable kexec_use_firmware.
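
A rough user-space sketch of what an explicit request could look like.
KEXEC_SKETCH_FIRMWARE is a hypothetical flag bit with an arbitrary value and is
not part of the real kexec_load() ABI; kexec_load_sketch() and its arguments
are likewise invented for illustration. The target is chosen by what userspace
asked for, not by a global that silently changes the meaning of the call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bit; NOT part of the real kexec_load() ABI. */
#define KEXEC_SKETCH_FIRMWARE 0x00000004UL

enum load_target_sketch { LOAD_INTO_LINUX, LOAD_INTO_FIRMWARE };

/*
 * Pick the load target from the flags userspace passed in.
 * have_firmware stands in for "a firmware/hypervisor kexec is available".
 */
static int kexec_load_sketch(unsigned long flags, int have_firmware,
			     enum load_target_sketch *target)
{
	assert(target != NULL);

	if (flags & KEXEC_SKETCH_FIRMWARE) {
		if (!have_firmware)
			return -ENODEV;	/* explicit request cannot be honoured */
		*target = LOAD_INTO_FIRMWARE;
		return 0;
	}

	*target = LOAD_INTO_LINUX;	/* default: the ordinary Linux image */
	return 0;
}
```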

Furthermore it is not ok to have conditional code outside of header
files.
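
For illustration, the usual way to keep conditionals in headers is a static
inline stub. This is only a sketch: CONFIG_SKETCH_FIRMWARE_KEXEC,
firmware_kexec_load_sketch() and load_image_sketch() are hypothetical names,
not anything from the patch.

```c
#include <assert.h>
#include <errno.h>

/* --- what would live in a header (hypothetical names) --- */
#ifdef CONFIG_SKETCH_FIRMWARE_KEXEC
int firmware_kexec_load_sketch(unsigned long entry);
#else
/* Stub compiled when the feature is off, so callers need no #ifdef. */
static inline int firmware_kexec_load_sketch(unsigned long entry)
{
	(void)entry;
	return -ENOSYS;
}
#endif

/* --- what would live in a .c file: no conditional compilation here --- */
static int load_image_sketch(unsigned long entry, int want_firmware)
{
	assert(entry != 0);
	if (want_firmware)
		return firmware_kexec_load_sketch(entry);
	return 0;
}
```

With the option undefined, the caller still compiles unchanged and simply
sees -ENOSYS from the stub.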

Eric

Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-26 Thread H. Peter Anvin


On 12/26/2012 06:18 PM, Daniel Kiper wrote:

Hi,

This set of patches contains the initial kexec/kdump implementation for Xen v3.
Currently only dom0 is supported; however, almost all infrastructure
required for domU support is ready.

Jan Beulich suggested merging the Xen x86 assembler code with the baremetal x86
code. This could simplify the kernel code and reduce its size a bit. However,
this solution requires some changes in the baremetal x86 code. First of all, the
code which establishes the transition page table should be moved back from
machine_kexec_$(BITS).c to relocate_kernel_$(BITS).S. Another important thing
which should be changed in that case is the format of the page_list array: the
Xen kexec hypercall requires physical addresses to alternate with virtual ones.
These and other required changes have not been made in this version because I am
not sure that the solution will be accepted by the kexec/kdump maintainers.
I hope that this email will spark a discussion about that topic.
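
To make the alternating page_list format concrete, here is a small user-space
sketch. The index names are modeled on the XK_MA_*/XK_VA_* constants in patch
04/11 of this series, but the SK_* names, struct page_pair_sketch and
fill_page_list_sketch() are hypothetical, and the addresses used with it are
made up.

```c
#include <assert.h>
#include <stddef.h>

/* Two pages, each described by a (machine, virtual) pair of slots. */
#define SKETCH_NO_PAGES 4

enum {
	SK_MA_CONTROL_PAGE = 0,
	SK_VA_CONTROL_PAGE = 1,
	SK_MA_PGD_PAGE     = 2,
	SK_VA_PGD_PAGE     = 3,
};

struct page_pair_sketch {
	unsigned long maddr;	/* machine (physical) address */
	unsigned long vaddr;	/* virtual address of the same page */
};

/* Even slots receive machine addresses, odd slots the matching virtual ones. */
static void fill_page_list_sketch(unsigned long *page_list,
				  const struct page_pair_sketch *pairs,
				  size_t npairs)
{
	size_t i;

	assert(page_list != NULL && pairs != NULL);
	for (i = 0; i < npairs; i++) {
		page_list[2 * i]     = pairs[i].maddr;
		page_list[2 * i + 1] = pairs[i].vaddr;
	}
}
```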



I want a detailed list of the constraints that this assumes and 
therefore imposes on the native implementation as a result of this.  We 
have had way too many patches where Xen PV hacks effectively nailgun 
arbitrary, and sometimes poor, design decisions in place and now we 
can't fix them.


-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

Re: [PATCH v3 06/11] x86/xen: Add i386 kexec/kdump implementation

2012-12-26 Thread H. Peter Anvin


On 12/26/2012 06:18 PM, Daniel Kiper wrote:

Add i386 kexec/kdump implementation.

v2 - suggestions/fixes:
- allocate transition page table pages below 4 GiB
  (suggested by Jan Beulich).



Why?

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

Re: [RFC PATCH] virtio-net: reset virtqueue affinity when doing cpu hotplug

2012-12-26 Thread Wanlong Gao

On 12/27/2012 11:28 AM, Jason Wang wrote:
> On 12/26/2012 06:19 PM, Wanlong Gao wrote:
>> On 12/26/2012 06:06 PM, Jason Wang wrote:
>>> On 12/26/2012 03:06 PM, Wanlong Gao wrote:
>>>> Add a cpu notifier to virtio-net, so that we can reset the
>>>> virtqueue affinity if the cpu hotplug happens. It improves
>>>> the performance through enabling or disabling the virtqueue
>>>> affinity after doing cpu hotplug.
>>> Hi Wanlong:
>>>
>>> Thanks for looking at this.
>>>> Cc: Rusty Russell 
>>>> Cc: "Michael S. Tsirkin" 
>>>> Cc: Jason Wang 
>>>> Cc: virtualization@lists.linux-foundation.org
>>>> Cc: net...@vger.kernel.org
>>>> Signed-off-by: Wanlong Gao 
>>>> ---
>>>>  drivers/net/virtio_net.c | 39 ++-
>>>>  1 file changed, 38 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>> index a6fcf15..9710cf4 100644
>>>> --- a/drivers/net/virtio_net.c
>>>> +++ b/drivers/net/virtio_net.c
>>>> @@ -26,6 +26,7 @@
>>>>  #include 
>>>>  #include 
>>>>  #include 
>>>> +#include 
>>>>  
>>>>  static int napi_weight = 128;
>>>>  module_param(napi_weight, int, 0444);
>>>> @@ -34,6 +35,8 @@ static bool csum = true, gso = true;
>>>>  module_param(csum, bool, 0444);
>>>>  module_param(gso, bool, 0444);
>>>>  
>>>> +static bool cpu_hotplug = false;
>>>> +
>>>>  /* FIXME: MTU in config. */
>>>>  #define MAX_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
>>>>  #define GOOD_COPY_LEN 128
>>>> @@ -1041,6 +1044,26 @@ static void virtnet_set_affinity(struct 
>>>> virtnet_info *vi, bool set)
>>>>     vi->affinity_hint_set = false;
>>>>  }
>>>>  
>>>> +static int virtnet_cpu_callback(struct notifier_block *nfb,
>>>> + unsigned long action, void *hcpu)
>>>> +{
>>>> +  switch(action) {
>>>> +  case CPU_ONLINE:
>>>> +  case CPU_ONLINE_FROZEN:
>>>> +  case CPU_DEAD:
>>>> +  case CPU_DEAD_FROZEN:
>>>> +  cpu_hotplug = true;
>>>> +  break;
>>>> +  default:
>>>> +  break;
>>>> +  }
>>>> +  return NOTIFY_OK;
>>>> +}
>>>> +
>>>> +static struct notifier_block virtnet_cpu_notifier = {
>>>> +  .notifier_call = virtnet_cpu_callback,
>>>> +};
>>>> +
>>>>  static void virtnet_get_ringparam(struct net_device *dev,
>>>>     struct ethtool_ringparam *ring)
>>>>  {
>>>> @@ -1131,7 +1154,14 @@ static int virtnet_change_mtu(struct net_device 
>>>> *dev, int new_mtu)
>>>>   */
>>>>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff 
>>>> *skb)
>>>>  {
>>>> -  int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>>>> +  int txq;
>>>> +
>>>> +  if (unlikely(cpu_hotplug == true)) {
>>>> +  virtnet_set_affinity(netdev_priv(dev), true);
>>>> +  cpu_hotplug = false;
>>>> +  }
>>>> +
>>> Why don't you just do this in callback?
>> The callback only gives us an "hcpu"; we can't get the virtnet_info from the
>> callback. Am I missing something?
> 
> Well, I think you can just embed the notifier block into virtnet_info,
> then use something like container_of in the callback to make the
> notifier per device. This also addresses Eric's concern.

Yeah, thank you very much for your suggestion. I'll try it.

>>> btw. Does qemu/kvm support cpu-hotplug now?
>> From http://www.linux-kvm.org/page/CPUHotPlug, I saw that qemu-kvm can
>> support hotplug but it failed to be merged into qemu.git, right?
> 
> Not sure. I just tried the latest qemu, and it does not even have a cpu_set command.

Adding Igor to CC,

As far as I know, hotplug support was removed from qemu, and Igor wants to
rework it, but that work has not been completed yet?
I'm not sure about that. Igor, could you send out your tech-preview patches?

Thanks,
Wanlong Gao

> 
> Thanks
>>
>> Thanks,
>> Wanlong Gao
>>
> 
> 

Re: [RFC PATCH] virtio-net: reset virtqueue affinity when doing cpu hotplug

2012-12-26 Thread Jason Wang

On 12/26/2012 06:19 PM, Wanlong Gao wrote:
> On 12/26/2012 06:06 PM, Jason Wang wrote:
>> On 12/26/2012 03:06 PM, Wanlong Gao wrote:
>>> Add a cpu notifier to virtio-net, so that we can reset the
>>> virtqueue affinity if the cpu hotplug happens. It improves
>>> the performance through enabling or disabling the virtqueue
>>> affinity after doing cpu hotplug.
>> Hi Wanlong:
>>
>> Thanks for looking at this.
>>> Cc: Rusty Russell 
>>> Cc: "Michael S. Tsirkin" 
>>> Cc: Jason Wang 
>>> Cc: virtualization@lists.linux-foundation.org
>>> Cc: net...@vger.kernel.org
>>> Signed-off-by: Wanlong Gao 
>>> ---
>>>  drivers/net/virtio_net.c | 39 ++-
>>>  1 file changed, 38 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index a6fcf15..9710cf4 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -26,6 +26,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>  
>>>  static int napi_weight = 128;
>>>  module_param(napi_weight, int, 0444);
>>> @@ -34,6 +35,8 @@ static bool csum = true, gso = true;
>>>  module_param(csum, bool, 0444);
>>>  module_param(gso, bool, 0444);
>>>  
>>> +static bool cpu_hotplug = false;
>>> +
>>>  /* FIXME: MTU in config. */
>>>  #define MAX_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
>>>  #define GOOD_COPY_LEN  128
>>> @@ -1041,6 +1044,26 @@ static void virtnet_set_affinity(struct virtnet_info 
>>> *vi, bool set)
>>> vi->affinity_hint_set = false;
>>>  }
>>>  
>>> +static int virtnet_cpu_callback(struct notifier_block *nfb,
>>> +  unsigned long action, void *hcpu)
>>> +{
>>> +   switch(action) {
>>> +   case CPU_ONLINE:
>>> +   case CPU_ONLINE_FROZEN:
>>> +   case CPU_DEAD:
>>> +   case CPU_DEAD_FROZEN:
>>> +   cpu_hotplug = true;
>>> +   break;
>>> +   default:
>>> +   break;
>>> +   }
>>> +   return NOTIFY_OK;
>>> +}
>>> +
>>> +static struct notifier_block virtnet_cpu_notifier = {
>>> +   .notifier_call = virtnet_cpu_callback,
>>> +};
>>> +
>>>  static void virtnet_get_ringparam(struct net_device *dev,
>>> struct ethtool_ringparam *ring)
>>>  {
>>> @@ -1131,7 +1154,14 @@ static int virtnet_change_mtu(struct net_device 
>>> *dev, int new_mtu)
>>>   */
>>>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff 
>>> *skb)
>>>  {
>>> -   int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>>> +   int txq;
>>> +
>>> +   if (unlikely(cpu_hotplug == true)) {
>>> +   virtnet_set_affinity(netdev_priv(dev), true);
>>> +   cpu_hotplug = false;
>>> +   }
>>> +
>> Why don't you just do this in callback?
> The callback only gives us an "hcpu"; we can't get the virtnet_info from the
> callback. Am I missing something?

Well, I think you can just embed the notifier block into virtnet_info,
then use something like container_of in the callback to make the
notifier per device. This also addresses Eric's concern.
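
Roughly like the user-space sketch below. The *_sketch types are simplified
stand-ins for struct notifier_block and struct virtnet_info,
container_of_sketch() mirrors the kernel's container_of(), and the counter
only marks where the real driver would call virtnet_set_affinity().

```c
#include <assert.h>
#include <stddef.h>

/* User-space equivalent of the kernel's container_of(). */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct notifier_block_sketch {
	int (*notifier_call)(struct notifier_block_sketch *nb,
			     unsigned long action, void *hcpu);
};

struct virtnet_info_sketch {
	int affinity_resets;		/* counts callback hits in this sketch */
	struct notifier_block_sketch nb;	/* embedded, one per device */
};

static int virtnet_cpu_callback_sketch(struct notifier_block_sketch *nb,
				       unsigned long action, void *hcpu)
{
	/* Recover the owning device from the embedded notifier block. */
	struct virtnet_info_sketch *vi =
		container_of_sketch(nb, struct virtnet_info_sketch, nb);

	(void)action;
	(void)hcpu;
	assert(vi != NULL);
	vi->affinity_resets++;	/* real code would reset the affinity here */
	return 0;
}
```

Each device registers its own nb, so the callback reaches the right
virtnet_info without any global flag.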
>> btw. Does qemu/kvm support cpu-hotplug now?
> From http://www.linux-kvm.org/page/CPUHotPlug, I saw that qemu-kvm can
> support hotplug but it failed to be merged into qemu.git, right?

Not sure. I just tried the latest qemu, and it does not even have a cpu_set command.

Thanks
>
> Thanks,
> Wanlong Gao
>

Re: [RFC PATCH] virtio-net: reset virtqueue affinity when doing cpu hotplug

2012-12-26 Thread Jason Wang

On 12/26/2012 06:46 PM, Michael S. Tsirkin wrote:
> On Wed, Dec 26, 2012 at 03:06:54PM +0800, Wanlong Gao wrote:
>> Add a cpu notifier to virtio-net, so that we can reset the
>> virtqueue affinity if the cpu hotplug happens. It improves
>> the performance through enabling or disabling the virtqueue
>> affinity after doing cpu hotplug.
>>
>> Cc: Rusty Russell 
>> Cc: "Michael S. Tsirkin" 
>> Cc: Jason Wang 
>> Cc: virtualization@lists.linux-foundation.org
>> Cc: net...@vger.kernel.org
>> Signed-off-by: Wanlong Gao 
> Thanks for looking into this.
> Some comments:
>
> 1. Looks like the logic in
> virtnet_set_affinity (and in virtnet_select_queue)
> will not work very well when CPU IDs are not
> consecutive. This can happen with hot unplug.
>
> Maybe we should add a VQ allocator, and define
> a per-cpu variable specifying the VQ instead
> of using CPU ID.

Yes, and generate the affinity hint based on the mapping. Btw, what does
VQ allocator mean here?
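
One possible reading, as a user-space sketch only: walk the online CPUs and
hand out VQ indices in order, storing the result in a per-cpu slot so that
gaps in the CPU IDs do not matter. assign_vqs_sketch(), SKETCH_NR_CPUS and
SKETCH_NO_VQ are hypothetical, and a plain array stands in for a real per-cpu
variable.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 8
#define SKETCH_NO_VQ   (-1)

/*
 * online[cpu] is non-zero for online CPUs. Online CPUs receive VQ indices
 * 0, 1, 2, ... in order, wrapping around when there are more CPUs than VQs.
 * Offline CPUs get SKETCH_NO_VQ.
 */
static void assign_vqs_sketch(const int *online, int num_vqs, int *cpu_to_vq)
{
	int cpu, next = 0;

	assert(online != NULL && cpu_to_vq != NULL && num_vqs > 0);
	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (!online[cpu]) {
			cpu_to_vq[cpu] = SKETCH_NO_VQ;
			continue;
		}
		cpu_to_vq[cpu] = next;
		next = (next + 1) % num_vqs;
	}
}
```

The affinity hint could then be generated from the same table, and queue
selection would read the table entry for the current CPU rather than use the
CPU ID directly.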
>
>
> 2. The below code seems racy e.g. when CPU is added
>   during device init.
>
> 3. using a global cpu_hotplug seems inelegant.
> In any case we should document what is the
> meaning of this variable.
>
>> ---
>>  drivers/net/virtio_net.c | 39 ++-
>>  1 file changed, 38 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index a6fcf15..9710cf4 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -26,6 +26,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  static int napi_weight = 128;
>>  module_param(napi_weight, int, 0444);
>> @@ -34,6 +35,8 @@ static bool csum = true, gso = true;
>>  module_param(csum, bool, 0444);
>>  module_param(gso, bool, 0444);
>>  
>> +static bool cpu_hotplug = false;
>> +
>>  /* FIXME: MTU in config. */
>>  #define MAX_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
>>  #define GOOD_COPY_LEN   128
>> @@ -1041,6 +1044,26 @@ static void virtnet_set_affinity(struct virtnet_info 
>> *vi, bool set)
>>  vi->affinity_hint_set = false;
>>  }
>>  
>> +static int virtnet_cpu_callback(struct notifier_block *nfb,
>> +   unsigned long action, void *hcpu)
>> +{
>> +switch(action) {
>> +case CPU_ONLINE:
>> +case CPU_ONLINE_FROZEN:
>> +case CPU_DEAD:
>> +case CPU_DEAD_FROZEN:
>> +cpu_hotplug = true;
>> +break;
>> +default:
>> +break;
>> +}
>> +return NOTIFY_OK;
>> +}
>> +
>> +static struct notifier_block virtnet_cpu_notifier = {
>> +.notifier_call = virtnet_cpu_callback,
>> +};
>> +
>>  static void virtnet_get_ringparam(struct net_device *dev,
>>  struct ethtool_ringparam *ring)
>>  {
>> @@ -1131,7 +1154,14 @@ static int virtnet_change_mtu(struct net_device *dev, 
>> int new_mtu)
>>   */
>>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
>>  {
>> -int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>> +int txq;
>> +
>> +if (unlikely(cpu_hotplug == true)) {
>> +virtnet_set_affinity(netdev_priv(dev), true);
>> +cpu_hotplug = false;
>> +}
>> +
>> +txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>>smp_processor_id();
>>  
>>  while (unlikely(txq >= dev->real_num_tx_queues))
>> @@ -1248,6 +1278,8 @@ static void virtnet_del_vqs(struct virtnet_info *vi)
>>  {
>>  struct virtio_device *vdev = vi->vdev;
>>  
>> +unregister_hotcpu_notifier(&virtnet_cpu_notifier);
>> +
>>  virtnet_set_affinity(vi, false);
>>  
>>  vdev->config->del_vqs(vdev);
>> @@ -1372,6 +1404,11 @@ static int init_vqs(struct virtnet_info *vi)
>>  goto err_free;
>>  
>>  virtnet_set_affinity(vi, true);
>> +
>> +ret = register_hotcpu_notifier(&virtnet_cpu_notifier);
>> +if (ret)
>> +goto err_free;
>> +
>>  return 0;
>>  
>>  err_free:
>> -- 
>> 1.8.0

Re: [PATCH v3 02/11] x86/kexec: Add extra pointers to transition page table PGD, PUD, PMD and PTE

2012-12-26 Thread H. Peter Anvin

Hmm... this code is being redone at the moment... this might conflict.

Daniel Kiper  wrote:

>Some implementations (e.g. Xen PVOPS) cannot use part of the identity page
>table to construct the transition page table. This means that they require
>separate PUDs, PMDs and PTEs for the virtual and physical (identity) mappings.
>To satisfy that requirement, add extra pointers to PGD, PUD, PMD and PTE and
>align the existing code.
>
>Signed-off-by: Daniel Kiper 
>---
> arch/x86/include/asm/kexec.h   |   10 +++---
> arch/x86/kernel/machine_kexec_64.c |   12 ++--
> 2 files changed, 13 insertions(+), 9 deletions(-)
>
>diff --git a/arch/x86/include/asm/kexec.h
>b/arch/x86/include/asm/kexec.h
>index 6080d26..cedd204 100644
>--- a/arch/x86/include/asm/kexec.h
>+++ b/arch/x86/include/asm/kexec.h
>@@ -157,9 +157,13 @@ struct kimage_arch {
> };
> #else
> struct kimage_arch {
>-  pud_t *pud;
>-  pmd_t *pmd;
>-  pte_t *pte;
>+  pgd_t *pgd;
>+  pud_t *pud0;
>+  pud_t *pud1;
>+  pmd_t *pmd0;
>+  pmd_t *pmd1;
>+  pte_t *pte0;
>+  pte_t *pte1;
> };
> #endif
> 
>diff --git a/arch/x86/kernel/machine_kexec_64.c
>b/arch/x86/kernel/machine_kexec_64.c
>index b3ea9db..976e54b 100644
>--- a/arch/x86/kernel/machine_kexec_64.c
>+++ b/arch/x86/kernel/machine_kexec_64.c
>@@ -137,9 +137,9 @@ out:
> 
> static void free_transition_pgtable(struct kimage *image)
> {
>-  free_page((unsigned long)image->arch.pud);
>-  free_page((unsigned long)image->arch.pmd);
>-  free_page((unsigned long)image->arch.pte);
>+  free_page((unsigned long)image->arch.pud0);
>+  free_page((unsigned long)image->arch.pmd0);
>+  free_page((unsigned long)image->arch.pte0);
> }
> 
> static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
>@@ -157,7 +157,7 @@ static int init_transition_pgtable(struct kimage
>*image, pgd_t *pgd)
>   pud = (pud_t *)get_zeroed_page(GFP_KERNEL);
>   if (!pud)
>   goto err;
>-  image->arch.pud = pud;
>+  image->arch.pud0 = pud;
>   set_pgd(pgd, __pgd(__pa(pud) | _KERNPG_TABLE));
>   }
>   pud = pud_offset(pgd, vaddr);
>@@ -165,7 +165,7 @@ static int init_transition_pgtable(struct kimage
>*image, pgd_t *pgd)
>   pmd = (pmd_t *)get_zeroed_page(GFP_KERNEL);
>   if (!pmd)
>   goto err;
>-  image->arch.pmd = pmd;
>+  image->arch.pmd0 = pmd;
>   set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
>   }
>   pmd = pmd_offset(pud, vaddr);
>@@ -173,7 +173,7 @@ static int init_transition_pgtable(struct kimage
>*image, pgd_t *pgd)
>   pte = (pte_t *)get_zeroed_page(GFP_KERNEL);
>   if (!pte)
>   goto err;
>-  image->arch.pte = pte;
>+  image->arch.pte0 = pte;
>   set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
>   }
>   pte = pte_offset_kernel(pmd, vaddr);

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.

[PATCH v3 04/11] x86/xen: Introduce architecture dependent data for kexec/kdump

2012-12-26 Thread Daniel Kiper

Introduce architecture dependent constants, structures and
functions required by Xen kexec/kdump implementation.

Signed-off-by: Daniel Kiper 
---
 arch/x86/include/asm/xen/hypercall.h |6 +++
 arch/x86/include/asm/xen/kexec.h |   79 ++
 2 files changed, 85 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/xen/kexec.h

diff --git a/arch/x86/include/asm/xen/hypercall.h 
b/arch/x86/include/asm/xen/hypercall.h
index c20d1ce..e76a1b8 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -459,6 +459,12 @@ HYPERVISOR_hvm_op(int op, void *arg)
 }
 
 static inline int
+HYPERVISOR_kexec_op(unsigned long op, void *args)
+{
+   return _hypercall2(int, kexec_op, op, args);
+}
+
+static inline int
 HYPERVISOR_tmem_op(
struct tmem_op *op)
 {
diff --git a/arch/x86/include/asm/xen/kexec.h b/arch/x86/include/asm/xen/kexec.h
new file mode 100644
index 000..d09b52f
--- /dev/null
+++ b/arch/x86/include/asm/xen/kexec.h
@@ -0,0 +1,79 @@
+/*
+ * Copyright (c) 2011 Daniel Kiper
+ * Copyright (c) 2012 Daniel Kiper, Oracle Corporation
+ *
+ * kexec/kdump implementation for Xen was written by Daniel Kiper.
+ * Initial work on it was sponsored by Google under Google Summer
+ * of Code 2011 program and Citrix. Konrad Rzeszutek Wilk from Oracle
+ * was the mentor for this project.
+ *
+ * Some ideas are taken from:
+ *   - native kexec/kdump implementation,
+ *   - kexec/kdump implementation for Xen Linux Kernel Ver. 2.6.18,
+ *   - PV-GRUB.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program.  If not, see .
+ */
+
+#ifndef _ASM_X86_XEN_KEXEC_H
+#define _ASM_X86_XEN_KEXEC_H
+
+#define KEXEC_XEN_NO_PAGES 17
+
+#define XK_MA_CONTROL_PAGE 0
+#define XK_VA_CONTROL_PAGE 1
+#define XK_MA_PGD_PAGE 2
+#define XK_VA_PGD_PAGE 3
+#define XK_MA_PUD0_PAGE 4
+#define XK_VA_PUD0_PAGE 5
+#define XK_MA_PUD1_PAGE 6
+#define XK_VA_PUD1_PAGE 7
+#define XK_MA_PMD0_PAGE 8
+#define XK_VA_PMD0_PAGE 9
+#define XK_MA_PMD1_PAGE 10
+#define XK_VA_PMD1_PAGE 11
+#define XK_MA_PTE0_PAGE 12
+#define XK_VA_PTE0_PAGE 13
+#define XK_MA_PTE1_PAGE 14
+#define XK_VA_PTE1_PAGE 15
+#define XK_MA_TABLE_PAGE   16
+
+#ifndef __ASSEMBLY__
+struct xen_kexec_image {
+   unsigned long page_list[KEXEC_XEN_NO_PAGES];
+   unsigned long indirection_page;
+   unsigned long start_address;
+};
+
+struct xen_kexec_load {
+   int type;
+   struct xen_kexec_image image;
+};
+
+extern unsigned int xen_kexec_control_code_size;
+
+#ifdef CONFIG_X86_32
+extern void xen_relocate_kernel(unsigned long indirection_page,
+   unsigned long *page_list,
+   unsigned long start_address,
+   unsigned int has_pae,
+   unsigned int preserve_context);
+#else
+extern void xen_relocate_kernel(unsigned long indirection_page,
+   unsigned long *page_list,
+   unsigned long start_address,
+   unsigned int preserve_context);
+#endif
+#endif
+#endif /* _ASM_X86_XEN_KEXEC_H */
-- 
1.5.6.5

[PATCH v3 06/11] x86/xen: Add i386 kexec/kdump implementation

2012-12-26 Thread Daniel Kiper

Add i386 kexec/kdump implementation.

v2 - suggestions/fixes:
   - allocate transition page table pages below 4 GiB
 (suggested by Jan Beulich).
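
In the diff below, mf_kexec_kimage_alloc_pages() turns the caller's limit into
an address width before calling xen_create_contiguous_region(). A user-space
sketch of just that computation, where ilog2_sketch() and
address_bits_sketch() are hypothetical stand-ins for the kernel's ilog2() and
the inline expression in the patch:

```c
#include <assert.h>
#include <limits.h>

/* Floor of log2(v) for v > 0, like the kernel's ilog2(). */
static unsigned int ilog2_sketch(unsigned long v)
{
	unsigned int bits = 0;

	assert(v != 0);
	while (v >>= 1)
		bits++;
	return bits;
}

/* No limit means the full width of unsigned long; otherwise log2(limit). */
static unsigned int address_bits_sketch(unsigned long limit)
{
	if (limit == ULONG_MAX)
		return (unsigned int)(sizeof(unsigned long) * CHAR_BIT);
	return ilog2_sketch(limit);
}
```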

Signed-off-by: Daniel Kiper 
---
 arch/x86/xen/machine_kexec_32.c   |  226 ++
 arch/x86/xen/relocate_kernel_32.S |  323 +
 2 files changed, 549 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/xen/machine_kexec_32.c
 create mode 100644 arch/x86/xen/relocate_kernel_32.S

diff --git a/arch/x86/xen/machine_kexec_32.c b/arch/x86/xen/machine_kexec_32.c
new file mode 100644
index 000..011a5e8
--- /dev/null
+++ b/arch/x86/xen/machine_kexec_32.c
@@ -0,0 +1,226 @@
+/*
+ * Copyright (c) 2011 Daniel Kiper
+ * Copyright (c) 2012 Daniel Kiper, Oracle Corporation
+ *
+ * kexec/kdump implementation for Xen was written by Daniel Kiper.
+ * Initial work on it was sponsored by Google under Google Summer
+ * of Code 2011 program and Citrix. Konrad Rzeszutek Wilk from Oracle
+ * was the mentor for this project.
+ *
+ * Some ideas are taken from:
+ *   - native kexec/kdump implementation,
+ *   - kexec/kdump implementation for Xen Linux Kernel Ver. 2.6.18,
+ *   - PV-GRUB.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#define __ma(vaddr) (virt_to_machine(vaddr).maddr)
+
+static void *alloc_pgtable_page(struct kimage *image)
+{
+   struct page *page;
+
+   page = firmware_kimage_alloc_control_pages(image, 0);
+
+   if (!page || !page_address(page))
+   return NULL;
+
+   memset(page_address(page), 0, PAGE_SIZE);
+
+   return page_address(page);
+}
+
+static int alloc_transition_pgtable(struct kimage *image)
+{
+   image->arch.pgd = alloc_pgtable_page(image);
+
+   if (!image->arch.pgd)
+   return -ENOMEM;
+
+   image->arch.pmd0 = alloc_pgtable_page(image);
+
+   if (!image->arch.pmd0)
+   return -ENOMEM;
+
+   image->arch.pmd1 = alloc_pgtable_page(image);
+
+   if (!image->arch.pmd1)
+   return -ENOMEM;
+
+   image->arch.pte0 = alloc_pgtable_page(image);
+
+   if (!image->arch.pte0)
+   return -ENOMEM;
+
+   image->arch.pte1 = alloc_pgtable_page(image);
+
+   if (!image->arch.pte1)
+   return -ENOMEM;
+
+   return 0;
+}
+
+struct page *mf_kexec_kimage_alloc_pages(gfp_t gfp_mask,
+   unsigned int order,
+   unsigned long limit)
+{
+   struct page *pages;
+   unsigned int address_bits, i;
+
+   pages = alloc_pages(gfp_mask, order);
+
+   if (!pages)
+   return NULL;
+
+   address_bits = (limit == ULONG_MAX) ? BITS_PER_LONG : ilog2(limit);
+
+   /* Relocate set of pages below given limit. */
+   if (xen_create_contiguous_region((unsigned long)page_address(pages),
+   order, address_bits)) {
+   __free_pages(pages, order);
+   return NULL;
+   }
+
+   BUG_ON(PagePrivate(pages));
+
+   pages->mapping = NULL;
+   set_page_private(pages, order);
+
+   for (i = 0; i < (1 << order); ++i)
+   SetPageReserved(pages + i);
+
+   return pages;
+}
+
+void mf_kexec_kimage_free_pages(struct page *page)
+{
+   unsigned int i, order;
+
+   order = page_private(page);
+
+   for (i = 0; i < (1 << order); ++i)
+   ClearPageReserved(page + i);
+
+   xen_destroy_contiguous_region((unsigned long)page_address(page), order);
+   __free_pages(page, order);
+}
+
+unsigned long mf_kexec_page_to_pfn(struct page *page)
+{
+   return pfn_to_mfn(page_to_pfn(page));
+}
+
+struct page *mf_kexec_pfn_to_page(unsigned long mfn)
+{
+   return pfn_to_page(mfn_to_pfn(mfn));
+}
+
+unsigned long mf_kexec_virt_to_phys(volatile void *address)
+{
+   return virt_to_machine(address).maddr;
+}
+
+void *mf_kexec_phys_to_virt(unsigned long address)
+{
+   return phys_to_virt(machine_to_phys(XMADDR(address)).paddr);
+}
+
+int mf_kexec_prepare(struct kimage *image)
+{
+#ifdef CONFIG_KEXEC_JUMP
+   if (image->preserve_context) {
+   pr_info_once("kexec: Context preservation is not "
+

[PATCH v3 05/11] x86/xen: Register resources required by kexec-tools

2012-12-26 Thread Daniel Kiper

Register resources required by kexec-tools.

v2 - suggestions/fixes:
   - change logging level
 (suggested by Konrad Rzeszutek Wilk).

Signed-off-by: Daniel Kiper 
---
 arch/x86/xen/kexec.c |  150 ++
 1 files changed, 150 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/xen/kexec.c

diff --git a/arch/x86/xen/kexec.c b/arch/x86/xen/kexec.c
new file mode 100644
index 000..7ec4c45
--- /dev/null
+++ b/arch/x86/xen/kexec.c
@@ -0,0 +1,150 @@
+/*
+ * Copyright (c) 2011 Daniel Kiper
+ * Copyright (c) 2012 Daniel Kiper, Oracle Corporation
+ *
+ * kexec/kdump implementation for Xen was written by Daniel Kiper.
+ * Initial work on it was sponsored by Google under Google Summer
+ * of Code 2011 program and Citrix. Konrad Rzeszutek Wilk from Oracle
+ * was the mentor for this project.
+ *
+ * Some ideas are taken from:
+ *   - native kexec/kdump implementation,
+ *   - kexec/kdump implementation for Xen Linux Kernel Ver. 2.6.18,
+ *   - PV-GRUB.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+
+unsigned long xen_vmcoreinfo_maddr = 0;
+unsigned long xen_vmcoreinfo_max_size = 0;
+
+static int __init xen_init_kexec_resources(void)
+{
+   int rc;
+   static struct resource xen_hypervisor_res = {
+   .name = "Hypervisor code and data",
+   .flags = IORESOURCE_BUSY | IORESOURCE_MEM
+   };
+   struct resource *cpu_res;
+   struct xen_kexec_range xkr;
+   struct xen_platform_op cpuinfo_op;
+   uint32_t cpus, i;
+
+   if (!xen_initial_domain())
+   return 0;
+
+   if (strstr(boot_command_line, "crashkernel="))
+   pr_warn("kexec: Ignoring crashkernel option. "
+   "It should be passed to Xen hypervisor.\n");
+
+   /* Register Crash kernel resource. */
+   xkr.range = KEXEC_RANGE_MA_CRASH;
+   rc = HYPERVISOR_kexec_op(KEXEC_CMD_kexec_get_range, &xkr);
+
+   if (rc) {
+   pr_warn("kexec: %s: HYPERVISOR_kexec_op(KEXEC_RANGE_MA_CRASH)"
+   ": %i\n", __func__, rc);
+   return rc;
+   }
+
+   if (!xkr.size)
+   return 0;
+
+   crashk_res.start = xkr.start;
+   crashk_res.end = xkr.start + xkr.size - 1;
+   insert_resource(&iomem_resource, &crashk_res);
+
+   /* Register Hypervisor code and data resource. */
+   xkr.range = KEXEC_RANGE_MA_XEN;
+   rc = HYPERVISOR_kexec_op(KEXEC_CMD_kexec_get_range, &xkr);
+
+   if (rc) {
+   pr_warn("kexec: %s: HYPERVISOR_kexec_op(KEXEC_RANGE_MA_XEN)"
+   ": %i\n", __func__, rc);
+   return rc;
+   }
+
+   xen_hypervisor_res.start = xkr.start;
+   xen_hypervisor_res.end = xkr.start + xkr.size - 1;
+   insert_resource(&iomem_resource, &xen_hypervisor_res);
+
+   /* Determine maximum number of physical CPUs. */
+   cpuinfo_op.cmd = XENPF_get_cpuinfo;
+   cpuinfo_op.u.pcpu_info.xen_cpuid = 0;
+   rc = HYPERVISOR_dom0_op(&cpuinfo_op);
+
+   if (rc) {
+   pr_warn("kexec: %s: HYPERVISOR_dom0_op(): %i\n", __func__, rc);
+   return rc;
+   }
+
+   cpus = cpuinfo_op.u.pcpu_info.max_present + 1;
+
+   /* Register CPUs Crash note resources. */
+   cpu_res = kcalloc(cpus, sizeof(struct resource), GFP_KERNEL);
+
+   if (!cpu_res) {
+   pr_warn("kexec: %s: kcalloc(): %i\n", __func__, -ENOMEM);
+   return -ENOMEM;
+   }
+
+   for (i = 0; i < cpus; ++i) {
+   xkr.range = KEXEC_RANGE_MA_CPU;
+   xkr.nr = i;
+   rc = HYPERVISOR_kexec_op(KEXEC_CMD_kexec_get_range, &xkr);
+
+   if (rc) {
+   pr_warn("kexec: %s: cpu: %u: HYPERVISOR_kexec_op"
+   "(KEXEC_RANGE_MA_CPU): %i\n", __func__, i, rc);
+   continue;
+   }
+
+   cpu_res->name = "Crash note";
+   cpu_res->start = xkr.start;
+   cpu_res->end = xkr.start + xkr.size - 1;
+   cpu_res->flags = IORESOURCE_BUSY | IORESOURCE_MEM;
+   insert_resource(&iomem_resource, cpu_res++);
+   }
+
+   /* Get vmcoreinfo address and maxim

[PATCH v3 07/11] x86/xen: Add x86_64 kexec/kdump implementation

2012-12-26 Thread Daniel Kiper

Add x86_64 kexec/kdump implementation.

Signed-off-by: Daniel Kiper 
---
 arch/x86/xen/machine_kexec_64.c   |  318 +
 arch/x86/xen/relocate_kernel_64.S |  309 +++
 2 files changed, 627 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/xen/machine_kexec_64.c
 create mode 100644 arch/x86/xen/relocate_kernel_64.S

diff --git a/arch/x86/xen/machine_kexec_64.c b/arch/x86/xen/machine_kexec_64.c
new file mode 100644
index 000..2600342
--- /dev/null
+++ b/arch/x86/xen/machine_kexec_64.c
@@ -0,0 +1,318 @@
+/*
+ * Copyright (c) 2011 Daniel Kiper
+ * Copyright (c) 2012 Daniel Kiper, Oracle Corporation
+ *
+ * kexec/kdump implementation for Xen was written by Daniel Kiper.
+ * Initial work on it was sponsored by Google under Google Summer
+ * of Code 2011 program and Citrix. Konrad Rzeszutek Wilk from Oracle
+ * was the mentor for this project.
+ *
+ * Some ideas are taken from:
+ *   - native kexec/kdump implementation,
+ *   - kexec/kdump implementation for Xen Linux Kernel Ver. 2.6.18,
+ *   - PV-GRUB.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#define __ma(vaddr)(virt_to_machine(vaddr).maddr)
+
+static void init_level2_page(pmd_t *pmd, unsigned long addr)
+{
+   unsigned long end_addr = addr + PUD_SIZE;
+
+   while (addr < end_addr) {
+   native_set_pmd(pmd++, native_make_pmd(addr | __PAGE_KERNEL_LARGE_EXEC));
+   addr += PMD_SIZE;
+   }
+}
+
+static int init_level3_page(struct kimage *image, pud_t *pud,
+   unsigned long addr, unsigned long last_addr)
+{
+   pmd_t *pmd;
+   struct page *page;
+   unsigned long end_addr = addr + PGDIR_SIZE;
+
+   while ((addr < last_addr) && (addr < end_addr)) {
+   page = firmware_kimage_alloc_control_pages(image, 0);
+
+   if (!page)
+   return -ENOMEM;
+
+   pmd = page_address(page);
+   init_level2_page(pmd, addr);
+   native_set_pud(pud++, native_make_pud(__ma(pmd) | _KERNPG_TABLE));
+   addr += PUD_SIZE;
+   }
+
+   /* Clear the unused entries. */
+   while (addr < end_addr) {
+   native_pud_clear(pud++);
+   addr += PUD_SIZE;
+   }
+
+   return 0;
+}
+
+
+static int init_level4_page(struct kimage *image, pgd_t *pgd,
+   unsigned long addr, unsigned long last_addr)
+{
+   int rc;
+   pud_t *pud;
+   struct page *page;
+   unsigned long end_addr = addr + PTRS_PER_PGD * PGDIR_SIZE;
+
+   while ((addr < last_addr) && (addr < end_addr)) {
+   page = firmware_kimage_alloc_control_pages(image, 0);
+
+   if (!page)
+   return -ENOMEM;
+
+   pud = page_address(page);
+   rc = init_level3_page(image, pud, addr, last_addr);
+
+   if (rc)
+   return rc;
+
+   native_set_pgd(pgd++, native_make_pgd(__ma(pud) | _KERNPG_TABLE));
+   addr += PGDIR_SIZE;
+   }
+
+   /* Clear the unused entries. */
+   while (addr < end_addr) {
+   native_pgd_clear(pgd++);
+   addr += PGDIR_SIZE;
+   }
+
+   return 0;
+}
+
+static void free_transition_pgtable(struct kimage *image)
+{
+   free_page((unsigned long)image->arch.pgd);
+   free_page((unsigned long)image->arch.pud0);
+   free_page((unsigned long)image->arch.pud1);
+   free_page((unsigned long)image->arch.pmd0);
+   free_page((unsigned long)image->arch.pmd1);
+   free_page((unsigned long)image->arch.pte0);
+   free_page((unsigned long)image->arch.pte1);
+}
+
+static int alloc_transition_pgtable(struct kimage *image)
+{
+   image->arch.pgd = (pgd_t *)get_zeroed_page(GFP_KERNEL);
+
+   if (!image->arch.pgd)
+   goto err;
+
+   image->arch.pud0 = (pud_t *)get_zeroed_page(GFP_KERNEL);
+
+   if (!image->arch.pud0)
+   goto err;
+
+   image->arch.pud1 = (pud_t *)get_zeroed_page(GFP_KERNEL);
+
+   if (!image->arch.pud1)
+   goto err;
+
+   image->arch.pmd0 = (pmd_t *)get_zeroed_page(GFP_KERNEL);
+
+   if (!image->arch.pmd0)

[PATCH v3 08/11] x86/xen: Add kexec/kdump Kconfig and makefile rules

2012-12-26 Thread Daniel Kiper

Add kexec/kdump Kconfig and makefile rules.

Signed-off-by: Daniel Kiper 
---
 arch/x86/Kconfig  |3 +++
 arch/x86/xen/Kconfig  |1 +
 arch/x86/xen/Makefile |3 +++
 3 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 79795af..e2746c4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1600,6 +1600,9 @@ config KEXEC_JUMP
  Jump between original kernel and kexeced kernel and invoke
  code in physical address mode via KEXEC
 
+config KEXEC_FIRMWARE
+   def_bool n
+
 config PHYSICAL_START
hex "Physical address where the kernel is loaded" if (EXPERT || CRASH_DUMP)
default "0x100"
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 131dacd..8469c1c 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -7,6 +7,7 @@ config XEN
select PARAVIRT
select PARAVIRT_CLOCK
select XEN_HAVE_PVMMU
+   select KEXEC_FIRMWARE if KEXEC
depends on X86_64 || (X86_32 && X86_PAE && !X86_VISWS)
depends on X86_TSC
help
diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index 96ab2c0..99952d7 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -22,3 +22,6 @@ obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= spinlock.o
 obj-$(CONFIG_XEN_DEBUG_FS) += debugfs.o
 obj-$(CONFIG_XEN_DOM0) += apic.o vga.o
 obj-$(CONFIG_SWIOTLB_XEN)  += pci-swiotlb-xen.o
+obj-$(CONFIG_KEXEC_FIRMWARE)   += kexec.o
+obj-$(CONFIG_KEXEC_FIRMWARE)   += machine_kexec_$(BITS).o
+obj-$(CONFIG_KEXEC_FIRMWARE)   += relocate_kernel_$(BITS).o
-- 
1.5.6.5

[PATCH v3 09/11] x86/xen/enlighten: Add init and crash kexec/kdump hooks

2012-12-26 Thread Daniel Kiper

Add init and crash kexec/kdump hooks.

Signed-off-by: Daniel Kiper 
---
 arch/x86/xen/enlighten.c |   11 +++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 138e566..5025bba 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1276,6 +1277,12 @@ static void xen_machine_power_off(void)
 
 static void xen_crash_shutdown(struct pt_regs *regs)
 {
+#ifdef CONFIG_KEXEC_FIRMWARE
+   if (kexec_crash_image) {
+   crash_save_cpu(regs, safe_smp_processor_id());
+   return;
+   }
+#endif
xen_reboot(SHUTDOWN_crash);
 }
 
@@ -1353,6 +1360,10 @@ asmlinkage void __init xen_start_kernel(void)
 
xen_init_mmu_ops();
 
+#ifdef CONFIG_KEXEC_FIRMWARE
+   kexec_use_firmware = true;
+#endif
+
/* Prevent unwanted bits from being set in PTEs. */
__supported_pte_mask &= ~_PAGE_GLOBAL;
 #if 0
-- 
1.5.6.5

[PATCH v3 11/11] x86: Add Xen kexec control code size check to linker script

2012-12-26 Thread Daniel Kiper

Add Xen kexec control code size check to linker script.

Signed-off-by: Daniel Kiper 
---
 arch/x86/kernel/vmlinux.lds.S |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 22a1530..f18786a 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -360,5 +360,10 @@ INIT_PER_CPU(irq_stack_union);
 
 . = ASSERT(kexec_control_code_size <= KEXEC_CONTROL_CODE_MAX_SIZE,
"kexec control code size is too big");
-#endif
 
+#ifdef CONFIG_XEN
+. = ASSERT(xen_kexec_control_code_size - xen_relocate_kernel <=
+   KEXEC_CONTROL_CODE_MAX_SIZE,
+   "Xen kexec control code size is too big");
+#endif
+#endif
-- 
1.5.6.5

[PATCH v3 10/11] drivers/xen: Export vmcoreinfo through sysfs

2012-12-26 Thread Daniel Kiper

Export vmcoreinfo through sysfs.

Signed-off-by: Daniel Kiper 
---
 drivers/xen/sys-hypervisor.c |   42 +-
 1 files changed, 41 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/sys-hypervisor.c b/drivers/xen/sys-hypervisor.c
index 96453f8..9dd290c 100644
--- a/drivers/xen/sys-hypervisor.c
+++ b/drivers/xen/sys-hypervisor.c
@@ -368,6 +368,41 @@ static void xen_properties_destroy(void)
sysfs_remove_group(hypervisor_kobj, &xen_properties_group);
 }
 
+#ifdef CONFIG_KEXEC_FIRMWARE
+static ssize_t vmcoreinfo_show(struct hyp_sysfs_attr *attr, char *buffer)
+{
+   return sprintf(buffer, "%lx %lx\n", xen_vmcoreinfo_maddr,
+   xen_vmcoreinfo_max_size);
+}
+
+HYPERVISOR_ATTR_RO(vmcoreinfo);
+
+static int __init xen_vmcoreinfo_init(void)
+{
+   if (!xen_vmcoreinfo_max_size)
+   return 0;
+
+   return sysfs_create_file(hypervisor_kobj, &vmcoreinfo_attr.attr);
+}
+
+static void xen_vmcoreinfo_destroy(void)
+{
+   if (!xen_vmcoreinfo_max_size)
+   return;
+
+   sysfs_remove_file(hypervisor_kobj, &vmcoreinfo_attr.attr);
+}
+#else
+static int __init xen_vmcoreinfo_init(void)
+{
+   return 0;
+}
+
+static void xen_vmcoreinfo_destroy(void)
+{
+}
+#endif
+
 static int __init hyper_sysfs_init(void)
 {
int ret;
@@ -390,9 +425,14 @@ static int __init hyper_sysfs_init(void)
ret = xen_properties_init();
if (ret)
goto prop_out;
+   ret = xen_vmcoreinfo_init();
+   if (ret)
+   goto vmcoreinfo_out;
 
goto out;
 
+vmcoreinfo_out:
+   xen_properties_destroy();
 prop_out:
xen_sysfs_uuid_destroy();
 uuid_out:
@@ -407,12 +447,12 @@ out:
 
 static void __exit hyper_sysfs_exit(void)
 {
+   xen_vmcoreinfo_destroy();
xen_properties_destroy();
xen_compilation_destroy();
xen_sysfs_uuid_destroy();
xen_sysfs_version_destroy();
xen_sysfs_type_destroy();
-
 }
 module_init(hyper_sysfs_init);
 module_exit(hyper_sysfs_exit);
-- 
1.5.6.5
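
The attribute added by vmcoreinfo_show() above prints two bare hexadecimal numbers separated by a space ("%lx %lx\n"). A userspace consumer could read that line back roughly as sketched below. parse_xen_vmcoreinfo() is a hypothetical helper written for this illustration, not part of kexec-tools, and the location of the file (presumably under /sys/hypervisor) is an assumption.

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse the "<maddr> <max_size>\n" line produced by
 * the vmcoreinfo sysfs attribute. Both fields are hex without a 0x prefix.
 * Returns 0 on success and -1 if the line does not contain two hex numbers.
 */
static int parse_xen_vmcoreinfo(const char *line,
				unsigned long *maddr,
				unsigned long *max_size)
{
	if (sscanf(line, "%lx %lx", maddr, max_size) != 2)
		return -1;

	return 0;
}
```

Checking the sscanf() return value matters here because the attribute is only created when xen_vmcoreinfo_max_size is non-zero, so callers should be prepared for missing or malformed input.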

[PATCH v3 03/11] xen: Introduce architecture independent data for kexec/kdump

2012-12-26 Thread Daniel Kiper

Introduce architecture independent constants and structures
required by Xen kexec/kdump implementation.

Signed-off-by: Daniel Kiper 
---
 include/xen/interface/xen.h |   33 +
 1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
index 886a5d8..09c16ab 100644
--- a/include/xen/interface/xen.h
+++ b/include/xen/interface/xen.h
@@ -57,6 +57,7 @@
 #define __HYPERVISOR_event_channel_op 32
 #define __HYPERVISOR_physdev_op   33
 #define __HYPERVISOR_hvm_op   34
+#define __HYPERVISOR_kexec_op 37
 #define __HYPERVISOR_tmem_op  38
 
 /* Architecture-specific hypercall definitions. */
@@ -231,7 +232,39 @@ DEFINE_GUEST_HANDLE_STRUCT(mmuext_op);
 #define VMASST_TYPE_pae_extended_cr3 3
 #define MAX_VMASST_TYPE 3
 
+/*
+ * Commands to HYPERVISOR_kexec_op().
+ */
+#define KEXEC_CMD_kexec0
+#define KEXEC_CMD_kexec_load   1
+#define KEXEC_CMD_kexec_unload 2
+#define KEXEC_CMD_kexec_get_range  3
+
+/*
+ * Memory ranges for kdump (utilized by HYPERVISOR_kexec_op()).
+ */
+#define KEXEC_RANGE_MA_CRASH   0
+#define KEXEC_RANGE_MA_XEN 1
+#define KEXEC_RANGE_MA_CPU 2
+#define KEXEC_RANGE_MA_XENHEAP 3
+#define KEXEC_RANGE_MA_BOOT_PARAM  4
+#define KEXEC_RANGE_MA_EFI_MEMMAP  5
+#define KEXEC_RANGE_MA_VMCOREINFO  6
+
 #ifndef __ASSEMBLY__
+struct xen_kexec_exec {
+   int type;
+};
+
+struct xen_kexec_range {
+   int range;
+   int nr;
+   unsigned long size;
+   unsigned long start;
+};
+
+extern unsigned long xen_vmcoreinfo_maddr;
+extern unsigned long xen_vmcoreinfo_max_size;
 
 typedef uint16_t domid_t;
 
-- 
1.5.6.5

[PATCH v3 02/11] x86/kexec: Add extra pointers to transition page table PGD, PUD, PMD and PTE

2012-12-26 Thread Daniel Kiper

Some implementations (e.g. Xen PVOPS) cannot use part of the identity page table
to construct the transition page table. This means that they require separate PUDs,
PMDs and PTEs for the virtual and the physical (identity) mapping. To satisfy that
requirement, add extra pointers to PGD, PUD, PMD and PTE and align the existing code.

Signed-off-by: Daniel Kiper 
---
 arch/x86/include/asm/kexec.h   |   10 +++---
 arch/x86/kernel/machine_kexec_64.c |   12 ++--
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 6080d26..cedd204 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -157,9 +157,13 @@ struct kimage_arch {
 };
 #else
 struct kimage_arch {
-   pud_t *pud;
-   pmd_t *pmd;
-   pte_t *pte;
+   pgd_t *pgd;
+   pud_t *pud0;
+   pud_t *pud1;
+   pmd_t *pmd0;
+   pmd_t *pmd1;
+   pte_t *pte0;
+   pte_t *pte1;
 };
 #endif
 
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index b3ea9db..976e54b 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -137,9 +137,9 @@ out:
 
 static void free_transition_pgtable(struct kimage *image)
 {
-   free_page((unsigned long)image->arch.pud);
-   free_page((unsigned long)image->arch.pmd);
-   free_page((unsigned long)image->arch.pte);
+   free_page((unsigned long)image->arch.pud0);
+   free_page((unsigned long)image->arch.pmd0);
+   free_page((unsigned long)image->arch.pte0);
 }
 
 static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
@@ -157,7 +157,7 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
pud = (pud_t *)get_zeroed_page(GFP_KERNEL);
if (!pud)
goto err;
-   image->arch.pud = pud;
+   image->arch.pud0 = pud;
set_pgd(pgd, __pgd(__pa(pud) | _KERNPG_TABLE));
}
pud = pud_offset(pgd, vaddr);
@@ -165,7 +165,7 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
pmd = (pmd_t *)get_zeroed_page(GFP_KERNEL);
if (!pmd)
goto err;
-   image->arch.pmd = pmd;
+   image->arch.pmd0 = pmd;
set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
}
pmd = pmd_offset(pud, vaddr);
@@ -173,7 +173,7 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd)
pte = (pte_t *)get_zeroed_page(GFP_KERNEL);
if (!pte)
goto err;
-   image->arch.pte = pte;
+   image->arch.pte0 = pte;
set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
}
pte = pte_offset_kernel(pmd, vaddr);
-- 
1.5.6.5
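
To picture why the virtual and the identity mapping of the same page can need separate lower-level tables, it helps to look at how a 64-bit address is split into table indices with 4-level paging and 4 KiB pages (9 bits per level). The helpers below are a userspace sketch with hypothetical names (pgd_idx() and friends); they are not kernel macros and are not taken from the patch.

```c
#include <stdint.h>

/* Each x86_64 paging level indexes 512 entries, i.e. 9 address bits. */
#define PT_INDEX_MASK 0x1ffULL

/* Hypothetical helpers: table index of an address at each level. */
static unsigned int pgd_idx(uint64_t addr)
{
	return (unsigned int)((addr >> 39) & PT_INDEX_MASK);
}

static unsigned int pud_idx(uint64_t addr)
{
	return (unsigned int)((addr >> 30) & PT_INDEX_MASK);
}

static unsigned int pmd_idx(uint64_t addr)
{
	return (unsigned int)((addr >> 21) & PT_INDEX_MASK);
}

static unsigned int pte_idx(uint64_t addr)
{
	return (unsigned int)((addr >> 12) & PT_INDEX_MASK);
}
```

For instance, a kernel virtual address such as 0xffff880001000000 and the identity address 0x1000000 share PMD index 8 but use PGD entries 272 and 0 respectively, so each mapping walks its own chain of tables below the PGD unless those tables can be shared.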

[PATCH v3 01/11] kexec: introduce kexec firmware support

2012-12-26 Thread Daniel Kiper

Some kexec/kdump implementations (e.g. Xen PVOPS) cannot use the default
Linux infrastructure and require some support from firmware and/or a hypervisor.
To cope with that problem, the kexec firmware infrastructure is introduced.
It allows a developer to use all kexec/kdump features of a given firmware
or hypervisor.
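
As a rough illustration of the idea (not the code from this patch), the choice between the native and the firmware path can be pictured as a single flag check. kexec_use_firmware is the flag declared in include/linux/kexec.h by this patch; native_load(), firmware_load() and kexec_load_dispatch() are hypothetical names invented for this userspace sketch, and the real dispatch in kernel/kexec.c may look different.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the native and the firmware code paths. */
static long native_load(unsigned long entry)
{
	(void)entry;
	return 0;	/* pretend the native path handled the request */
}

static long firmware_load(unsigned long entry)
{
	(void)entry;
	return 1;	/* pretend the firmware path handled the request */
}

/* Same name as the flag the patch declares; defaults to the native path. */
static bool kexec_use_firmware;

/* Route the request to whichever backend is currently selected. */
static long kexec_load_dispatch(unsigned long entry)
{
	if (kexec_use_firmware)
		return firmware_load(entry);

	return native_load(entry);
}
```

Patch 09/11 of this series sets kexec_use_firmware to true in xen_start_kernel(), so in terms of this sketch every later call on Xen would take the firmware branch.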

v3 - suggestions/fixes:
   - replace kexec_ops struct by kexec firmware infrastructure
 (suggested by Eric Biederman).

v2 - suggestions/fixes:
   - add comment for kexec_ops.crash_alloc_temp_store member
 (suggested by Konrad Rzeszutek Wilk),
   - simplify kexec_ops usage
 (suggested by Konrad Rzeszutek Wilk).

Signed-off-by: Daniel Kiper 
---
 include/linux/kexec.h   |   26 ++-
 kernel/Makefile |1 +
 kernel/kexec-firmware.c |  743 +++
 kernel/kexec.c  |   46 +++-
 4 files changed, 809 insertions(+), 7 deletions(-)
 create mode 100644 kernel/kexec-firmware.c

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index d0b8458..9568457 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -116,17 +116,34 @@ struct kimage {
 #endif
 };
 
-
-
 /* kexec interface functions */
 extern void machine_kexec(struct kimage *image);
 extern int machine_kexec_prepare(struct kimage *image);
 extern void machine_kexec_cleanup(struct kimage *image);
+extern struct page *mf_kexec_kimage_alloc_pages(gfp_t gfp_mask,
+   unsigned int order,
+   unsigned long limit);
+extern void mf_kexec_kimage_free_pages(struct page *page);
+extern unsigned long mf_kexec_page_to_pfn(struct page *page);
+extern struct page *mf_kexec_pfn_to_page(unsigned long mfn);
+extern unsigned long mf_kexec_virt_to_phys(volatile void *address);
+extern void *mf_kexec_phys_to_virt(unsigned long address);
+extern int mf_kexec_prepare(struct kimage *image);
+extern int mf_kexec_load(struct kimage *image);
+extern void mf_kexec_cleanup(struct kimage *image);
+extern void mf_kexec_unload(struct kimage *image);
+extern void mf_kexec_shutdown(void);
+extern void mf_kexec(struct kimage *image);
 extern asmlinkage long sys_kexec_load(unsigned long entry,
unsigned long nr_segments,
struct kexec_segment __user *segments,
unsigned long flags);
+extern long firmware_sys_kexec_load(unsigned long entry,
+   unsigned long nr_segments,
+   struct kexec_segment __user *segments,
+   unsigned long flags);
 extern int kernel_kexec(void);
+extern int firmware_kernel_kexec(void);
 #ifdef CONFIG_COMPAT
 extern asmlinkage long compat_sys_kexec_load(unsigned long entry,
unsigned long nr_segments,
@@ -135,7 +152,10 @@ extern asmlinkage long compat_sys_kexec_load(unsigned long entry,
 #endif
 extern struct page *kimage_alloc_control_pages(struct kimage *image,
unsigned int order);
+extern struct page *firmware_kimage_alloc_control_pages(struct kimage *image,
+   unsigned int order);
 extern void crash_kexec(struct pt_regs *);
+extern void firmware_crash_kexec(struct pt_regs *);
 int kexec_should_crash(struct task_struct *);
 void crash_save_cpu(struct pt_regs *regs, int cpu);
 void crash_save_vmcoreinfo(void);
@@ -168,6 +188,8 @@ unsigned long paddr_vmcoreinfo_note(void);
 #define VMCOREINFO_CONFIG(name) \
vmcoreinfo_append_str("CONFIG_%s=y\n", #name)
 
+extern bool kexec_use_firmware;
+
 extern struct kimage *kexec_image;
 extern struct kimage *kexec_crash_image;
 
diff --git a/kernel/Makefile b/kernel/Makefile
index 6c072b6..bc96b2f 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -58,6 +58,7 @@ obj-$(CONFIG_MODULE_SIG) += module_signing.o modsign_pubkey.o modsign_certificat
 obj-$(CONFIG_KALLSYMS) += kallsyms.o
 obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o
 obj-$(CONFIG_KEXEC) += kexec.o
+obj-$(CONFIG_KEXEC_FIRMWARE) += kexec-firmware.o
 obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o
 obj-$(CONFIG_COMPAT) += compat.o
 obj-$(CONFIG_CGROUPS) += cgroup.o
diff --git a/kernel/kexec-firmware.c b/kernel/kexec-firmware.c
new file mode 100644
index 000..f6ddd4c
--- /dev/null
+++ b/kernel/kexec-firmware.c
@@ -0,0 +1,743 @@
+/*
+ * Copyright (C) 2002-2004 Eric Biederman  
+ * Copyright (C) 2012 Daniel Kiper, Oracle Corporation
+ *
+ * Most of the code here is a copy of kernel/kexec.c.
+ *
+ * This source code is licensed under the GNU General Public License,
+ * Version 2.  See the file COPYING for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+/*
+ * KIMAGE_NO_DEST is an impossible destination address..., for
+ * allocating pages whose destinat

[PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-26 Thread Daniel Kiper


Hi,

This set of patches contains the initial kexec/kdump implementation for Xen, v3.
Currently only dom0 is supported; however, almost all of the infrastructure
required for domU support is ready.

Jan Beulich suggested merging the Xen x86 assembler code with the baremetal x86 code.
This could simplify the kernel code and reduce its size a bit. However, this solution
requires some changes in the baremetal x86 code. First of all, the code which establishes
the transition page table would have to be moved back from machine_kexec_$(BITS).c to
relocate_kernel_$(BITS).S. Another important thing which would have to change in that
case is the format of the page_list array: the Xen kexec hypercall requires physical
addresses to alternate with virtual ones. These and other required changes have not been
made in this version because I am not sure that the solution will be accepted by the
kexec/kdump maintainers. I hope that this email will spark a discussion about that topic.
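
The page_list layout mentioned above can be sketched as follows. This is a userspace illustration under stated assumptions: fill_page_list() is a hypothetical helper, and the physical-before-virtual order within each pair is assumed here rather than taken from the Xen interface headers.

```c
#include <stddef.h>

/*
 * Hypothetical helper: interleave n physical and n virtual addresses as
 * PA0, VA0, PA1, VA1, ... into page_list.
 * Returns the number of slots written, or 0 if page_list is too small
 * (in which case nothing is written).
 */
static size_t fill_page_list(unsigned long *page_list, size_t capacity,
			     const unsigned long *phys,
			     const unsigned long *virt, size_t n)
{
	size_t i;

	if (2 * n > capacity)
		return 0;

	for (i = 0; i < n; i++) {
		page_list[2 * i] = phys[i];
		page_list[2 * i + 1] = virt[i];
	}

	return 2 * n;
}
```

An array in this shape needs twice as many slots as one holding only physical addresses, which is one reason the two formats cannot simply be swapped without touching the baremetal code.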

Daniel

 arch/x86/Kconfig |3 +
 arch/x86/include/asm/kexec.h |   10 +-
 arch/x86/include/asm/xen/hypercall.h |6 +
 arch/x86/include/asm/xen/kexec.h |   79 
 arch/x86/kernel/machine_kexec_64.c   |   12 +-
 arch/x86/kernel/vmlinux.lds.S|7 +-
 arch/x86/xen/Kconfig |1 +
 arch/x86/xen/Makefile|3 +
 arch/x86/xen/enlighten.c |   11 +
 arch/x86/xen/kexec.c |  150 +++
 arch/x86/xen/machine_kexec_32.c  |  226 +++
 arch/x86/xen/machine_kexec_64.c  |  318 +++
 arch/x86/xen/relocate_kernel_32.S|  323 +++
 arch/x86/xen/relocate_kernel_64.S|  309 ++
 drivers/xen/sys-hypervisor.c |   42 ++-
 include/linux/kexec.h|   26 ++-
 include/xen/interface/xen.h  |   33 ++
 kernel/Makefile  |1 +
 kernel/kexec-firmware.c  |  743 ++
 kernel/kexec.c   |   46 ++-
 20 files changed, 2331 insertions(+), 18 deletions(-)

Daniel Kiper (11):
  kexec: introduce kexec firmware support
  x86/kexec: Add extra pointers to transition page table PGD, PUD, PMD and PTE
  xen: Introduce architecture independent data for kexec/kdump
  x86/xen: Introduce architecture dependent data for kexec/kdump
  x86/xen: Register resources required by kexec-tools
  x86/xen: Add i386 kexec/kdump implementation
  x86/xen: Add x86_64 kexec/kdump implementation
  x86/xen: Add kexec/kdump Kconfig and makefile rules
  x86/xen/enlighten: Add init and crash kexec/kdump hooks
  drivers/xen: Export vmcoreinfo through sysfs
  x86: Add Xen kexec control code size check to linker script

CISTI'2013 Doctoral Symposium - CFP, Lisbon, June 19 - 23, 2013

2012-12-26 Thread Maria Lemos


***
CISTI'2013 DOCTORAL SYMPOSIUM
8th Iberian Conference on Information Systems and Technologies
Lisbon, Portugal, June 19 - 23, 2013
http://www.aisti.eu/cisti2013/index.php?option=com_content&view=article&id=64&Itemid=68&lang=en
***

INTRODUCTION

The purpose of the CISTI'2013 Doctoral Symposium 
(http://www.aisti.eu/cisti2013/index.php?option=com_content&view=article&id=64&Itemid=68&lang=en)
 is to provide graduate students with a setting where they can informally present 
and discuss their work, collect valuable expert opinions and share new 
ideas, methods and applications. The Doctoral Symposium is an excellent 
opportunity for PhD students to present and discuss their work in a workshop 
format. Each presentation will be evaluated by a panel composed of at least 
three Information Systems and Technologies experts. 

CONTRIBUTIONS SUBMISSION

The Doctoral Symposium is open to PhD students whose research area includes 
the themes proposed for this Conference. Submissions must include an extended 
abstract (maximum 4 pages), following the Conference style guide. All selected 
contributions will be handed out along with the Conference Proceedings, on a CD 
with an ISBN. These contributions will be sent for indexing by EBSCO and 
EI-Compendex.

Submissions must include the field, the PhD institution and the number of 
months devoted to the development of the work. Additionally, they should 
include in a clear and succinct manner:

The problem approached and its significance or relevance
The research objectives and related investigation topics
A brief display of what is already known
A proposed solution methodology for the problem
Expected results

IMPORTANT DATES

Deadline for proposal submission: February 15, 2013
Notification of acceptance: March 29, 2013
Deadline for submission of final versions: April 12, 2013
Payment of the registration fee, to ensure inclusion of the accepted 
contribution in the conference proceedings: April 12, 2013

SCIENTIFIC AND ORGANIZING COMMITTEE

Manuel Pérez Cota, Universidad de Vigo (Chair)
Adolfo Lozano Tello, Universidad de Extremadura
Alberto J. Bugarín Diz, Universidad de Santiago de Compostela
Álvaro Rocha, Universidade Fernando Pessoa
Ana Maria Ramalho Correia, Universidade Nova de Lisboa, ISEGI
António Palma dos Reis, Universidade Técnica de Lisboa, ISEG
Arturo Mendez Penín, Universidade de Vigo
Carlos Ferrás Sexto, Universidad de Santiago de Compostela
David Fonseca, Universidad Ramón Llul
Ernesto Redondo, Universidad Politécnica de Cataluña
Feliz Gouveia, Universidade Fernando Pessoa
Francisco Restivo, Universidade Católica Portuguesa - Braga
Guilhermina Miranda, Universidade de Lisboa
Gonzalo Cuevas Agustín, Universidad Politécnica de Madrid
Héctor Jorge García Neder, Universidad Tecnológica Nacional
João Álvaro Carvalho, Universidade do Minho
João Barroso, Universidade de Trás-os-Montes e Alto Douro
Jörg Thomaschewski, University of Applied Sciences of Emden-Leer
José Antonio Calvo-Manzano Villalón, Universidad Politécnica de Madrid
José Bulas Cruz, Universidade de Trás-os-Montes e Alto Douro
José Tribolet, Universidade Técnica de Lisboa, IST
Leandro Rodríguez Liñares, Universidade de Vigo
Luís Paulo Reis, Universidade do Minho
María José Lado Touriño, Universidade de Vigo
Maria Manuela Cruz Cunha, Instituto Politécnico do Cávado e do Ave
Marco Painho, Universidade Nova de Lisboa, ISEGI
Mario Alberto Groppo, Universidad Tecnológica Nacional
Nuno Ribeiro, Universidade Fernando Pessoa
Pilar Mareca, Universidade Politécnica de Madrid
Ramiro Gonçalves, Universidade de Trás-os-Montes e Alto Douro
Tomas San Feliu Gilabert, Universidad Politécnica de Madrid
Vicente Alcober, Universidad Politécnica de Madrid

-
CISTI'2013 Team
http://www.aisti.eu/cisti2013
ais...@gmail.com

---

Re: [RFC PATCH] virtio-net: reset virtqueue affinity when doing cpu hotplug

2012-12-26 Thread Michael S. Tsirkin

On Wed, Dec 26, 2012 at 03:06:54PM +0800, Wanlong Gao wrote:
> Add a cpu notifier to virtio-net, so that we can reset the
> virtqueue affinity if a cpu hotplug happens. It improves
> the performance by enabling or disabling the virtqueue
> affinity after a cpu hotplug.
> 
> Cc: Rusty Russell 
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Cc: virtualization@lists.linux-foundation.org
> Cc: net...@vger.kernel.org
> Signed-off-by: Wanlong Gao 

Thanks for looking into this.
Some comments:

1. Looks like the logic in
virtnet_set_affinity (and in virtnet_select_queue)
will not work very well when CPU IDs are not
consecutive. This can happen with hot unplug.

Maybe we should add a VQ allocator, and define
a per-cpu variable specifying the VQ instead
of using the CPU ID.


2. The below code seems racy, e.g. when a CPU is added
during device init.

3. Using a global cpu_hotplug seems inelegant.
In any case we should document the
meaning of this variable.
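
To make the allocator idea from point 1 concrete, here is a minimal userspace sketch. It is not virtio-net code: assign_vqs(), NO_VQ and the plain arrays standing in for the online mask and the per-cpu variable are all hypothetical. It hands out queue indices to online CPUs in ID order, so gaps left by hot unplug do not matter.

```c
#include <stdbool.h>
#include <stddef.h>

#define NO_VQ (-1)	/* marker for CPUs that have no queue assigned */

/*
 * Hypothetical allocator: walk CPU IDs in order and give online CPUs the
 * queue indices 0, 1, 2, ..., wrapping around at nr_queues. Offline CPUs
 * (or all CPUs, if there are no queues) get NO_VQ.
 */
static void assign_vqs(const bool *online, int *cpu_to_vq,
		       size_t nr_cpu_ids, int nr_queues)
{
	size_t cpu;
	int next = 0;

	for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
		if (!online[cpu] || nr_queues <= 0) {
			cpu_to_vq[cpu] = NO_VQ;
			continue;
		}

		cpu_to_vq[cpu] = next;
		next = (next + 1) % nr_queues;
	}
}
```

With CPUs 0, 2 and 3 online and two queues, this yields the mapping 0, NO_VQ, 1, 0. A queue lookup would then read the per-cpu value instead of using smp_processor_id() directly.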

> ---
>  drivers/net/virtio_net.c | 39 ++-
>  1 file changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index a6fcf15..9710cf4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  static int napi_weight = 128;
>  module_param(napi_weight, int, 0444);
> @@ -34,6 +35,8 @@ static bool csum = true, gso = true;
>  module_param(csum, bool, 0444);
>  module_param(gso, bool, 0444);
>  
> +static bool cpu_hotplug = false;
> +
>  /* FIXME: MTU in config. */
>  #define MAX_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
>  #define GOOD_COPY_LEN	128
> @@ -1041,6 +1044,26 @@ static void virtnet_set_affinity(struct virtnet_info *vi, bool set)
>  	vi->affinity_hint_set = false;
>  }
>  
> +static int virtnet_cpu_callback(struct notifier_block *nfb,
> +				unsigned long action, void *hcpu)
> +{
> +	switch(action) {
> +	case CPU_ONLINE:
> +	case CPU_ONLINE_FROZEN:
> +	case CPU_DEAD:
> +	case CPU_DEAD_FROZEN:
> +		cpu_hotplug = true;
> +		break;
> +	default:
> +		break;
> +	}
> +	return NOTIFY_OK;
> +}
> +
> +static struct notifier_block virtnet_cpu_notifier = {
> +	.notifier_call = virtnet_cpu_callback,
> +};
> +
>  static void virtnet_get_ringparam(struct net_device *dev,
>  				  struct ethtool_ringparam *ring)
>  {
> @@ -1131,7 +1154,14 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
>   */
>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
>  {
> -	int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
> +	int txq;
> +
> +	if (unlikely(cpu_hotplug == true)) {
> +		virtnet_set_affinity(netdev_priv(dev), true);
> +		cpu_hotplug = false;
> +	}
> +
> +	txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>  		  smp_processor_id();
>  
>  	while (unlikely(txq >= dev->real_num_tx_queues))
> @@ -1248,6 +1278,8 @@ static void virtnet_del_vqs(struct virtnet_info *vi)
>  {
>  	struct virtio_device *vdev = vi->vdev;
>  
> +	unregister_hotcpu_notifier(&virtnet_cpu_notifier);
> +
>  	virtnet_set_affinity(vi, false);
>  
>  	vdev->config->del_vqs(vdev);
> @@ -1372,6 +1404,11 @@ static int init_vqs(struct virtnet_info *vi)
>  		goto err_free;
>  
>  	virtnet_set_affinity(vi, true);
> +
> +	ret = register_hotcpu_notifier(&virtnet_cpu_notifier);
> +	if (ret)
> +		goto err_free;
> +
>  	return 0;
>  
>  err_free:
> -- 
> 1.8.0

Re: [RFC PATCH] virtio-net: reset virtqueue affinity when doing cpu hotplug

2012-12-26 Thread Wanlong Gao

On 12/26/2012 06:06 PM, Jason Wang wrote:
> On 12/26/2012 03:06 PM, Wanlong Gao wrote:
>> Add a cpu notifier to virtio-net, so that we can reset the
>> virtqueue affinity when cpu hotplug happens. This improves
>> performance by enabling or disabling the virtqueue
>> affinity after cpu hotplug.
> 
> Hi Wanlong:
> 
> Thanks for looking at this.
>> Cc: Rusty Russell 
>> Cc: "Michael S. Tsirkin" 
>> Cc: Jason Wang 
>> Cc: virtualization@lists.linux-foundation.org
>> Cc: net...@vger.kernel.org
>> Signed-off-by: Wanlong Gao 
>> ---
>>  drivers/net/virtio_net.c | 39 ++-
>>  1 file changed, 38 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index a6fcf15..9710cf4 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -26,6 +26,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  static int napi_weight = 128;
>>  module_param(napi_weight, int, 0444);
>> @@ -34,6 +35,8 @@ static bool csum = true, gso = true;
>>  module_param(csum, bool, 0444);
>>  module_param(gso, bool, 0444);
>>  
>> +static bool cpu_hotplug = false;
>> +
>>  /* FIXME: MTU in config. */
>>  #define MAX_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
>>  #define GOOD_COPY_LEN   128
>> @@ -1041,6 +1044,26 @@ static void virtnet_set_affinity(struct virtnet_info *vi, bool set)
>>  	vi->affinity_hint_set = false;
>>  }
>>  
>> +static int virtnet_cpu_callback(struct notifier_block *nfb,
>> +				unsigned long action, void *hcpu)
>> +{
>> +	switch(action) {
>> +	case CPU_ONLINE:
>> +	case CPU_ONLINE_FROZEN:
>> +	case CPU_DEAD:
>> +	case CPU_DEAD_FROZEN:
>> +		cpu_hotplug = true;
>> +		break;
>> +	default:
>> +		break;
>> +	}
>> +	return NOTIFY_OK;
>> +}
>> +
>> +static struct notifier_block virtnet_cpu_notifier = {
>> +	.notifier_call = virtnet_cpu_callback,
>> +};
>> +
>>  static void virtnet_get_ringparam(struct net_device *dev,
>>  				  struct ethtool_ringparam *ring)
>>  {
>> @@ -1131,7 +1154,14 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
>>   */
>>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
>>  {
>> -	int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>> +	int txq;
>> +
>> +	if (unlikely(cpu_hotplug == true)) {
>> +		virtnet_set_affinity(netdev_priv(dev), true);
>> +		cpu_hotplug = false;
>> +	}
>> +
> 
> Why don't you just do this in callback?

The callback only gives us "hcpu"; we can't get the virtnet_info from it.
Am I missing something?
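
One common way to reach per-device state from a notifier callback is to embed the notifier_block in the device structure and recover the outer structure with container_of(). The sketch below is a user-space illustration under that assumption, not the patch's code: the struct definitions are simplified stand-ins for the kernel types, the nb and affinity_resets members are hypothetical, and the macro omits the type checking done by the kernel's container_of().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nfb,
			     unsigned long action, void *hcpu);
};

/* Simplified stand-in for the driver's per-device structure. */
struct virtnet_info {
	int affinity_resets;		/* hypothetical counter, illustration only */
	struct notifier_block nb;	/* hypothetical embedded notifier */
};

/* Same idea as the kernel's container_of(), without its type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int virtnet_cpu_callback(struct notifier_block *nfb,
				unsigned long action, void *hcpu)
{
	struct virtnet_info *vi;

	assert(nfb != NULL);
	/* Step back from the embedded member to the enclosing device. */
	vi = container_of(nfb, struct virtnet_info, nb);

	(void)action;
	(void)hcpu;
	vi->affinity_resets++;	/* stand-in for virtnet_set_affinity(vi, true) */
	return 0;
}
```

Each device would then register its own embedded notifier, which would also make the global flag unnecessary.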

> 
> btw. Does qemu/kvm support cpu-hotplug now?

From http://www.linux-kvm.org/page/CPUHotPlug, I saw that qemu-kvm can support
hotplug, but it failed to get merged into qemu.git, right?

Thanks,
Wanlong Gao


Re: [RFC PATCH] virtio-net: reset virtqueue affinity when doing cpu hotplug

2012-12-26 Thread Jason Wang

On 12/26/2012 03:06 PM, Wanlong Gao wrote:
> Add a cpu notifier to virtio-net, so that we can reset the
> virtqueue affinity when cpu hotplug happens. This improves
> performance by enabling or disabling the virtqueue
> affinity after cpu hotplug.

Hi Wanlong:

Thanks for looking at this.
> Cc: Rusty Russell 
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Cc: virtualization@lists.linux-foundation.org
> Cc: net...@vger.kernel.org
> Signed-off-by: Wanlong Gao 
> ---
>  drivers/net/virtio_net.c | 39 ++-
>  1 file changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index a6fcf15..9710cf4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  static int napi_weight = 128;
>  module_param(napi_weight, int, 0444);
> @@ -34,6 +35,8 @@ static bool csum = true, gso = true;
>  module_param(csum, bool, 0444);
>  module_param(gso, bool, 0444);
>  
> +static bool cpu_hotplug = false;
> +
>  /* FIXME: MTU in config. */
>  #define MAX_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
>  #define GOOD_COPY_LEN	128
> @@ -1041,6 +1044,26 @@ static void virtnet_set_affinity(struct virtnet_info *vi, bool set)
>  	vi->affinity_hint_set = false;
>  }
>  
> +static int virtnet_cpu_callback(struct notifier_block *nfb,
> +				unsigned long action, void *hcpu)
> +{
> +	switch(action) {
> +	case CPU_ONLINE:
> +	case CPU_ONLINE_FROZEN:
> +	case CPU_DEAD:
> +	case CPU_DEAD_FROZEN:
> +		cpu_hotplug = true;
> +		break;
> +	default:
> +		break;
> +	}
> +	return NOTIFY_OK;
> +}
> +
> +static struct notifier_block virtnet_cpu_notifier = {
> +	.notifier_call = virtnet_cpu_callback,
> +};
> +
>  static void virtnet_get_ringparam(struct net_device *dev,
>  				  struct ethtool_ringparam *ring)
>  {
> @@ -1131,7 +1154,14 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
>   */
>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
>  {
> -	int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
> +	int txq;
> +
> +	if (unlikely(cpu_hotplug == true)) {
> +		virtnet_set_affinity(netdev_priv(dev), true);
> +		cpu_hotplug = false;
> +	}
> +

Why don't you just do this in callback?

btw. Does qemu/kvm support cpu-hotplug now?
> +	txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>  		  smp_processor_id();
>  
>  	while (unlikely(txq >= dev->real_num_tx_queues))
> @@ -1248,6 +1278,8 @@ static void virtnet_del_vqs(struct virtnet_info *vi)
>  {
>  	struct virtio_device *vdev = vi->vdev;
>  
> +	unregister_hotcpu_notifier(&virtnet_cpu_notifier);
> +
>  	virtnet_set_affinity(vi, false);
>  
>  	vdev->config->del_vqs(vdev);
> @@ -1372,6 +1404,11 @@ static int init_vqs(struct virtnet_info *vi)
>  		goto err_free;
>  
>  	virtnet_set_affinity(vi, true);
> +
> +	ret = register_hotcpu_notifier(&virtnet_cpu_notifier);
> +	if (ret)
> +		goto err_free;
> +
>  	return 0;
>  
>  err_free:


