Re: [Xen-devel] [v3 01/15] Vt-d Posted-intterrupt (PI) design

2015-06-24 Thread Wu, Feng


 -Original Message-
 From: Meng Xu [mailto:xumengpa...@gmail.com]
 Sent: Wednesday, June 24, 2015 2:16 PM
 To: Wu, Feng
 Cc: xen-devel@lists.xen.org; Tian, Kevin; Keir Fraser; George Dunlap; Andrew
 Cooper; Jan Beulich; Zhang, Yang Z
 Subject: Re: [Xen-devel] [v3 01/15] Vt-d Posted-intterrupt (PI) design
 
 Hi Feng,
 
 One minor thing:
 
  +Important Definitions
  +==
  +There are some changes to IRTE and posted-interrupt descriptor after
  +VT-d PI is introduced:
  +IRTE:
 It seems that you forgot to define IRTE. :-)
 
 I guess it stands for Interrupt Remapping Table Entry? (Probably I'm wrong. :-))

Yes, you're right. I will add this in the next version. Thanks for pointing it 
out!

Thanks,
Feng

 
 Thanks,
 
 Meng
 
 
 ---
 Meng Xu
 PhD Student in Computer and Information Science
 University of Pennsylvania
 http://www.cis.upenn.edu/~mengxu/
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v4][PATCH 03/19] xen/vtd: create RMRR mapping

2015-06-24 Thread Chen, Tiejun

>> Note actually we just need p2m_remove_page() to unmap these mapping on
>> both ept and vt-d sides, and guest_physmap_remove_page is really a
>> wrapper of p2m_remove_page().
>
> And I agree with Tim regarding the desire to avoid code duplication.
> Yet that's no reason to do it asymmetrically:
> clear_identity_p2m_entry() could still be an inline (or macro) wrapper
> around guest_physmap_remove_page(). That way, apart from making

I can define that as a macro close to set_identity_p2m_entry() in p2m.h.

> the code above look nicer, if the latter needs extending in the future
> for some reason, simply converting the wrapper to a real function is
> possible without needing to touch the call site(s).
>
> This would need to go into patch 2; I wonder whether folding that

Yes.

> and this one wouldn't be warranted, avoiding the former adding
> (at that point) dead code.

Are you saying to fold patch #2 and patch #3? But shouldn't we always
define a new helper first and then use it subsequently, even if that
takes two separate patches?

Thanks
Tiejun

> Jan







Re: [Xen-devel] [PATCH v6 COLO 01/15] docs: add colo readme

2015-06-24 Thread Yang Hongyang



On 06/16/2015 06:56 PM, Ian Campbell wrote:

On Mon, 2015-06-08 at 11:45 +0800, Yang Hongyang wrote:

add colo readme, refer to
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping

Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com


This is fine as far as it goes but I wonder if perhaps
docs/README.{remus,colo} ought to be moved into docs/misc, perhaps
converted to markdown (which should be trivial) and perhaps merged into
a single document about checkpointing?


Agreed that we can add a checkpointing.txt to docs/misc, and describe
remus/COLO in that file. But can we do this later, when the COLO feature is
merged? At that time we can do this within one patch.



The reason for the move is twofold, first it is a bit atypical for docs
to live in the top-level docs dir and secondly moving it into misc will
cause it to appear automatically at
http://xenbits.xen.org/docs/unstable/ etc.

Ian.

---
  docs/README.colo | 9 +
  1 file changed, 9 insertions(+)
  create mode 100644 docs/README.colo

diff --git a/docs/README.colo b/docs/README.colo
new file mode 100644
index 000..466eb72
--- /dev/null
+++ b/docs/README.colo
@@ -0,0 +1,9 @@
+COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop Service)
+project is a high availability solution. Both primary VM (PVM) and secondary VM
+(SVM) run in parallel. They receive the same request from client, and generate
+response in parallel too. If the response packets from PVM and SVM are
+identical, they are released immediately. Otherwise, a VM checkpoint (on demand)
+is conducted.
+
+See the website at http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
+for details.



.



--
Thanks,
Yang.



Re: [Xen-devel] [v3 01/15] Vt-d Posted-intterrupt (PI) design

2015-06-24 Thread Meng Xu
Hi Feng,

One minor thing:

 +Important Definitions
 +==
 +There are some changes to IRTE and posted-interrupt descriptor after
 +VT-d PI is introduced:
 +IRTE:
It seems that you forgot to define IRTE. :-)

I guess it stands for Interrupt Remapping Table Entry? (Probably I'm wrong. :-))

Thanks,

Meng


---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/



Re: [Xen-devel] [PATCH v2 01/12] VMX: VMFUNC and #VE definitions and detection.

2015-06-24 Thread Andrew Cooper
On 22/06/15 19:56, Ed White wrote:
 Currently, neither is enabled globally but may be enabled on a per-VCPU
 basis by the altp2m code.

 Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as
 that bit is now hardware-defined.

 Signed-off-by: Ed White edmund.h.wh...@intel.com

Reviewed-by: Andrew Cooper andrew.coop...@citrix.com



Re: [Xen-devel] (xen 4.6 unstable) triple fault when execute fxsave during the procedure of guest iso install

2015-06-24 Thread Razvan Cojocaru
On 06/24/2015 12:14 PM, Fanhenglong wrote:
 I want to debug the procedure of windows os install with windbg,
 
 windbg executes instruction(fxsave) after the blank vm is started and
 before guest iso start to install,
 
 fxsave trigger the following code path:
 vmx_vmexit_handler(EXIT_REASON_EPT_VIOLATION)
 ->ept_handle_violation
 ->hvm_hap_nested_page_fault
 ->handle_mmio_with_translation
 ->handle_mmio
 ->hvm_emulate_one
 ->x86_emulate
 
 *X86_emulate return X86EMUL_UNHANDLEABLE*

How are you using Xen in this case? Are you by any chance using the
vm_event system in a way that sends back an emulate vm_event response
from userspace?

You might want to look at x86_emulate() in
xen/arch/x86/x86_emulate/x86_emulate.c and see if (and how) fxsave is
being handled.


HTH,
Razvan



Re: [Xen-devel] [v4][PATCH 03/19] xen/vtd: create RMRR mapping

2015-06-24 Thread Jan Beulich
 On 24.06.15 at 03:11, tiejun.c...@intel.com wrote:
 On 2015/6/23 18:12, Jan Beulich wrote:
 On 23.06.15 at 11:57, tiejun.c...@intel.com wrote:
 --- a/xen/drivers/passthrough/vtd/iommu.c
 +++ b/xen/drivers/passthrough/vtd/iommu.c
 @@ -1839,7 +1839,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,

   while ( base_pfn < end_pfn )
   {
 -if ( intel_iommu_unmap_page(d, base_pfn) )
 +if ( guest_physmap_remove_page(d, base_pfn, base_pfn, 0) )
 
 Yeah, I also thought this may bring some confusions in this context.
 
   ret = -ENXIO;
   base_pfn++;
   }
 @@ -1855,8 +1855,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,

   while ( base_pfn < end_pfn )
   {
 -int err = intel_iommu_map_page(d, base_pfn, base_pfn,
 -   IOMMUF_readable|IOMMUF_writable);
 +int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw);

 Shouldn't the two continue to be the inverse of one another?
 
 Initially, instead of using guest_physmap_remove_page, I was trying to 
 introduce a new helper, clear_identity_p2m_entry, which can wrap 
 p2m_remove_page().
 
 But Tim just thought we'd better avoid duplicating code,
 
 http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg02970.html 
 
 Maybe guest_physmap_remove_page() does what you want,
 
 Note actually we just need p2m_remove_page() to unmap these mapping on 
 both ept and vt-d sides, and guest_physmap_remove_page is really a 
 wrapper of p2m_remove_page().

And I agree with Tim regarding the desire to avoid code duplication.
Yet that's no reason to do it asymmetrically:
clear_identity_p2m_entry() could still be an inline (or macro) wrapper
around guest_physmap_remove_page(). That way, apart from making
the code above look nicer, if the latter needs extending in the future
for some reason, simply converting the wrapper to a real function is
possible without needing to touch the call site(s).

This would need to go into patch 2; I wonder whether folding that
and this one wouldn't be warranted, avoiding the former adding
(at that point) dead code.
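
For illustration only (not taken from the patch): a minimal sketch of the symmetric wrapper described above. The struct domain layout and the body of guest_physmap_remove_page() are stand-ins invented so the fragment compiles on its own; only the four-argument call shape is taken from the hunk quoted earlier, and the real Xen prototypes may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Xen's struct domain; the field is hypothetical and only
 * records what the stub below was asked to remove. */
struct domain {
    unsigned long last_removed_gfn;
};

/* Stub using the argument shape seen in the patch (d, gfn, mfn, order).
 * The body is purely illustrative, not the hypervisor's implementation. */
static int guest_physmap_remove_page(struct domain *d, unsigned long gfn,
                                     unsigned long mfn, unsigned int page_order)
{
    if ( gfn != mfn || page_order != 0 )
        return -1;          /* the sketch only models 4k identity entries */
    d->last_removed_gfn = gfn;
    return 0;
}

/* Inverse of set_identity_p2m_entry(): a thin inline wrapper that can
 * later become a real function without touching any call site. */
static inline int clear_identity_p2m_entry(struct domain *d, unsigned long gfn)
{
    assert(d != NULL);
    return guest_physmap_remove_page(d, gfn, gfn, 0);
}
```

With such a wrapper the unmap loop would call clear_identity_p2m_entry(d, base_pfn), mirroring set_identity_p2m_entry() in the map loop.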

Jan




Re: [Xen-devel] [linux-arm-xen test] 58849: regressions - FAIL

2015-06-24 Thread Ian Campbell
On Wed, 2015-06-24 at 06:03 +, osstest service user wrote:
 flight 58849 linux-arm-xen real [real]
 http://logs.test-lab.xenproject.org/osstest/logs/58849/
 
 Regressions :-(
 
 Tests which did not succeed and are blocking,
 including tests which could not be run:
  test-armhf-armhf-xl-cubietruck 11 guest-start fail REGR. vs. 58830

This was:
http://logs.test-lab.xenproject.org/osstest/logs/58849/test-armhf-armhf-xl-cubietruck/cubietruck-braque---var-log-kern.log

Jun 24 04:09:13 cubietruck-braque kernel: [  807.637687] [ cut here ]
Jun 24 04:09:13 cubietruck-braque kernel: [  807.637756] kernel BUG at drivers/xen/grant-table.c:923!
Jun 24 04:09:13 cubietruck-braque kernel: [  807.637784] Internal error: Oops - BUG: 0 [#1] SMP ARM
Jun 24 04:09:13 cubietruck-braque kernel: [  807.637810] Modules linked in: xen_gntalloc bridge stp ipv6 llc brcmfmac brcmutil cfg80211
Jun 24 04:09:13 cubietruck-braque kernel: [  807.637899] CPU: 0 PID: 16206 Comm: vif1.0-q0-guest Not tainted 3.16.7-ckt12+ #1
Jun 24 04:09:13 cubietruck-braque kernel: [  807.637936] task: c12fc480 ti: d2d3c000 task.ti: d2d3c000
Jun 24 04:09:13 cubietruck-braque kernel: [  807.637977] PC is at gnttab_batch_copy+0xd4/0xe0
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638004] LR is at gnttab_batch_copy+0x1c/0xe0
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638030] pc : [c04abf7c]
lr : [c04abec4]psr: a013
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638030] sp : d2d3deb0  ip : 
deadbeef  fp : d2d3df3c
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638091] r10: 0001  r9 : 
  r8 : 0008
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638124] r7 : 0001  r6 : 
0001  r5 :   r4 : e1e38d30
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638159] r3 : 0001  r2 : 
deadbeef  r1 : deadbeef  r0 : fff2
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638193] Flags: NzCv  IRQs on  
FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638227] Control: 10c5387d  
Table: 7b50406a  DAC: 0015
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638257] Process 
vif1.0-q0-guest (pid: 16206, stack limit = 0xd2d3c248)
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638287] Stack: (0xd2d3deb0 to 
0xd2d3e000)
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638316] dea0:  
   0001   e1e3
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638353] dec0: 0001 
c05d7c44 003e 0ec2 d2d3df3c   0001
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638391] dee0: dbbb7a80 
  0008  d2d3df20 e1e38cfc e1e38d30
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638430] df00: 0001 
 0001  e1e38d30 e1e63530 003e 0208
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638468] df20: d9f0f480 
d9f0f480 0001  d2d3df2c d2d3df34 d2d3df34 
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638505] df40:  
db34e380  e1e3 c05d776c   
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638543] df60:  
c0264138    e1e3  
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638581] df80: d2d3df80 
d2d3df80   d2d3df90 d2d3df90 d2d3dfac db34e380
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638638] dfa0: c026406c 
  c020f038    
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638686] dfc0:  
      
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638723] dfe0:  
   0013   
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638785] [c04abf7c] (gnttab_batch_copy) from [c05d7c44] (xenvif_kthread_guest_rx+0x4d8/0xbc0)
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638841] [c05d7c44] (xenvif_kthread_guest_rx) from [c0264138] (kthread+0xcc/0xe8)
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638887] [c0264138] (kthread) from [c020f038] (ret_from_fork+0x14/0x3c)
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638929] Code: 0ae5 eaed e8bd80f8 e7f001f2 (e7f001f2)
Jun 24 04:09:13 cubietruck-braque kernel: [  807.638978] ---[ end trace 98c74482d9a5771d ]---

Which looks familiar, although I can't seem to find it, does anyone
remember it? Are we missing a backport perhaps?

This is the 3.16.y based linux-arm-xen tree, which was recently updated
from a baseline of v3.16.4-ckt7 to v3.16.7-ckt12 (flight 58830) in both
cases plus xen_arch_need_swiotlb for swiotlb stuff.

This was the next flight, which only added "xen: netback: read
hotplug script once at start of day.", which I don't think is related to
the failure; I suspect the failure is intermittent.

Ian.



Re: [Xen-devel] [PATCH] tools: libxl: Take the userdata lock around maxmem changes

2015-06-24 Thread Ian Campbell
On Tue, 2015-06-23 at 19:50 +0100, Wei Liu wrote:
 On Tue, Jun 23, 2015 at 05:38:17PM +0100, Ian Campbell wrote:
  On Tue, 2015-06-23 at 17:32 +0100, Wei Liu wrote:
   On Tue, Jun 23, 2015 at 03:58:32PM +0100, Ian Campbell wrote:
There is an issue in libxl_set_memory_target whereby the target and
the max mem can get out of sync. This is because the call to
xc_domain_setmaxmem is not tied in any way to the xenstore transaction
which controls updates to the xenstore side of things.

Consider a domain with 1M of RAM (==target and maxmem for the sake of
argument) and two simultaneous calls to libxl_set_memory_target, both
with relative=0 and enforce=1, one with target=3 and the other with
target=5.

target=5 call                   target=3 call

transaction start
                                transaction start
write target=5 to xenstore
                                write target=3 to xenstore
setmaxmem(5)
                                setmaxmem(3)

transaction commit = success
                                transaction commit = EAGAIN
At this point maxmem=3 while target=5.

In reality the target=3 case will then retry and eventually (hopefully)
succeed with target=maxmem=3, however the bad state will persist for
some window which is undesirable. (On failure other than EAGAIN all
bets are off anyway, but in that case we will likely stick in the bad
state until someone else sets the memory).

To fix this we slightly abuse the userdata lock which is used to
protect updates to the domain's json configuration. Abused because
maxmem is not actually stored in there, but is kept by Xen. However
the lock protects some semantically similar things and is convenient
to use here too.
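
For illustration only: a toy, single-threaded model of the interleaving described above, and of the same two calls run back to back as they would be with a lock held across the whole operation. Every name here (toy_domain, toy_call, toy_unlocked, toy_locked and the helpers) is hypothetical and none of it is libxl code; "target" stands for the xenstore node and "maxmem" for the limit kept by Xen.

```c
#include <assert.h>
#include <stddef.h>

/* Toy state: the committed xenstore target and the hypervisor-side maxmem. */
struct toy_domain {
    int target;
    int maxmem;
};

/* One in-flight call: the requested value and what it staged in its
 * (not yet committed) transaction. */
struct toy_call {
    int want;
    int staged;
};

static void toy_write_target(struct toy_call *c)
{
    c->staged = c->want;
}

/* setmaxmem takes effect immediately; it is not part of the transaction. */
static void toy_setmaxmem(struct toy_domain *d, const struct toy_call *c)
{
    d->maxmem = c->want;
}

/* A successful commit publishes the staged value. */
static void toy_commit(struct toy_domain *d, const struct toy_call *c)
{
    assert(d != NULL && c != NULL);
    d->target = c->staged;
}

/* Interleaving from the description: the target=5 commit succeeds and the
 * target=3 commit gets EAGAIN (modelled by not committing it). */
static struct toy_domain toy_unlocked(void)
{
    struct toy_domain d = { 1, 1 };
    struct toy_call a = { 5, 0 }, b = { 3, 0 };

    toy_write_target(&a);
    toy_write_target(&b);
    toy_setmaxmem(&d, &a);
    toy_setmaxmem(&d, &b);
    toy_commit(&d, &a);         /* success */
    /* b: EAGAIN, nothing committed before the retry */
    return d;
}

/* With a lock around the whole operation the calls are serialised. */
static struct toy_domain toy_locked(void)
{
    struct toy_domain d = { 1, 1 };
    struct toy_call a = { 5, 0 }, b = { 3, 0 };

    toy_write_target(&a); toy_setmaxmem(&d, &a); toy_commit(&d, &a);
    toy_write_target(&b); toy_setmaxmem(&d, &b); toy_commit(&d, &b);
    return d;
}
```

In this model toy_unlocked() ends with target=5 and maxmem=3, the bad window described above, while toy_locked() always ends with target equal to maxmem.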

libxl_domain_setmaxmem also takes the lock, since it reads
memory/target from xenstore before calling xc_domain_setmaxmem there
is a small (but perhaps not very interesting) race there too.

There is one more use of xc_domain_setmaxmem in libxl__build_pre.
However taking a lock around this would be tricky since the xenstore
parts are not done until libxl__build_post. I think this one could be
argued to be OK since the domid is not public yet, that is it has
not been returned to the application yet (as the result of the create
operation). Toolstacks which go round fiddling with random domid's
which they find lying on the floor should be taught to do better.

Add a doc note that taking the userdata lock requires the CTX_LOCK to
be held.

Signed-off-by: Ian Campbell ian.campb...@citrix.com
---
This applies on top of Wei's revert of "libxl_set_memory_target:
retain the same maxmem offset on top of the current target".

I couldn't quite rule out some race (for transaction=EAGAIN, for
!EAGAIN there are obvious ones) which resulted in the incorrect state
being in place after both entities exit, but couldn't construct an
actual case.
   
   Not sure I follow. With this patch you lock out other contenders to
   even start a transaction so the EAGAIN vs !EAGAIN should be moot. You
   can safely rollback in !EAGAIN case, right?
  
  Sorry, I meant a pre-existing race which was fixed by this patch, rather
  than one that continues to exist with this fix.
  
 
 Is there inconsistency after this fix?

That post-changelog note was regarding the original code before the
patch, because I felt the example race given in the code was a
relatively minor one (since EAGAIN will cause it to be fixed up pretty
quickly in the real world) and I was hoping to include an example of a
much more serious race in the commit message but hadn't been able to
construct one.

  I think not, because you can roll
 back maxmem and pod target to previous values -- but the rollback was
 not implemented here though.

It's not necessary, I think, because with EAGAIN failures we will always
try again and (eventually) succeed. Other kinds of transaction failure
are of the "xenstore is completely b0rked" kind and all bets are pretty
much off in that case.

What this patch prevents is getting into some weird state which has
aspects of two separate calls to set_memory_target, which could be much
worse than a transient state as part of a single call, especially if we
could exit with success on both cases but in some hybrid state.

There will always be a small window between the xc_domain_setmaxmem and
the transaction commit.

Ian.




Re: [Xen-devel] stable trees (was: [xen-4.2-testing test] 58584: regressions)

2015-06-24 Thread Ian Campbell
On Fri, 2015-06-19 at 12:07 +0100, Ian Campbell wrote:
 On Fri, 2015-06-19 at 10:51 +0100, Jan Beulich wrote:
   On 18.06.15 at 16:22, ian.campb...@citrix.com wrote:
   On Thu, 2015-06-18 at 12:37 +0100, Jan Beulich wrote:
On 17.06.15 at 12:26, ian.jack...@eu.citrix.com wrote:
Jan Beulich writes (stable trees (was: [xen-4.2-testing test] 58584: 
regressions)):
Which leaves several options:
- the problem was always there, but hidden by some factor in the
  old osstest instance,

I think this is most likely.  The old system had much older hosts.

I think this is a race that we now happen to lose most of the time.
   
   For verification purposes, would it be possible to set up a couple of
   flights on the old instance for one of the stable trees?
   
   I can try and run something adhoc on the old system if you can let me
   know exactly which jobs (test-*-*-*) and branches you are interested in.
  
  Any or all of test-amd64-*-xl-qemuu-win* (not sure whether you
  can specify wildcards), and I guess stable-4.5 (or staging-4.5)
  would be the most natural branch choice.
 
 I think the tools can do wildcards, yes.
 
 I've kicked off a full adhoc xen-4.5-testing flight so I have a local
 template to copy the jobs from for some repeated runs with just the
 problem flights (it's just easier to do that than to invent a cut-down
 flight from scratch...).

After that baseline I ran a few tests of just the windows + qemuu stuff:
http://xenbits.xen.org/people/ianc/tmp/adhoc/37619/

was allowing free rein on the machines and was mostly successful, apart
from the windows-install failure on lake-frog. Looking at the test
history this seems to have always been a problem on the old infra.
*-frog are AMD Opteron(tm) Processor 6168 which is as close as the old
infra has to the new colos merlot[01] which is AMD Opteron(tm)
Processor 6376.

With that in mind I reran with things limited to the two frog-* boxes
and got http://xenbits.xen.org/people/ianc/tmp/adhoc/37624/.

The windows-install failure of winxpsp3 persisted but there was no migration
failure elsewhere.

It's not a lot of data, but in comparison with the results in the colo:
http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-xl-qemuu-win7-amd64/xen-4.5-testing.html
 
it looks like it's the newer system which is exposing the issue.

Ian.




Re: [Xen-devel] [PATCH 8/9] x86/pvh: Don't try to get l4 table for PVH guests in vcpu_destroy_pagetables()

2015-06-24 Thread Jan Beulich
 On 24.06.15 at 05:05, boris.ostrov...@oracle.com wrote:
 On 06/23/2015 09:38 AM, Jan Beulich wrote:
 On 20.06.15 at 05:09, boris.ostrov...@oracle.com wrote:
 --- a/xen/arch/x86/mm.c
 +++ b/xen/arch/x86/mm.c
 @@ -2652,7 +2652,7 @@ int vcpu_destroy_pagetables(struct vcpu *v)
   if ( rc )
   return rc;
   
 -if ( is_pv_32on64_vcpu(v) )
 +if ( is_pv_32on64_vcpu(v) && !is_pvh_vcpu(v) )
 This looks wrong - is_pv_32on64_vcpu() should imply is_pv_vcpu()
 and hence !is_pvh_vcpu().
 
 
 That's because I kept
  d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 1;
 in switch_compat() for both PV and PVH. I should probably only set 
 has_32bit_shinfo for PVH.

Right.

Jan




Re: [Xen-devel] [PATCH 5/9] x86/pvh: Set PVH guest's mode in XEN_DOMCTL_set_address_size

2015-06-24 Thread Jan Beulich
 On 24.06.15 at 04:53, boris.ostrov...@oracle.com wrote:
 On 06/23/2015 09:22 AM, Jan Beulich wrote:

 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -2320,12 +2320,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
   v->arch.hvm_vcpu.inject_trap.vector = -1;
   
   if ( is_pvh_domain(d) )
 -{
 -v->arch.hvm_vcpu.hcall_64bit = 1;/* PVH 32bitfixme. */
 -/* This is for hvm_long_mode_enabled(v). */
 -v->arch.hvm_vcpu.guest_efer = EFER_LMA | EFER_LME;
   return 0;
 -}
 With this removed, is there any guarantee that hvm_set_mode()
 will be called for each vCPU?
 
 IIUC, the toolstack is required to call XEN_DOMCTL_set_address_size, which 
 results in a call to switch_compat/native(), which loops over all VCPUs, 
 calling set_mode.

I don't recall this being a strict requirement. I think a PV 64-bit
guest would start fine without.

Jan




Re: [Xen-devel] [PATCH v3 04/18] x86/hvm: make sure translated MMIO reads or writes fall within a page

2015-06-24 Thread Jan Beulich
 On 23.06.15 at 18:32, paul.durr...@citrix.com wrote:
 Ok. If you believe it's the right thing to do, I'm happy to drop this patch 
 out of the series. I'll send v4 tomorrow.

Perhaps worth waiting a little for further review comments (fwiw I
didn't get to look at 5 and onwards so far)? But of course if you
don't mind doing an extra round...

Jan




[Xen-devel] [linux-arm-xen test] 58849: regressions - FAIL

2015-06-24 Thread osstest service user
flight 58849 linux-arm-xen real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58849/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-cubietruck 11 guest-start fail REGR. vs. 58830

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-xsm       12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-sedf-pin  12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-sedf      12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl           12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt      12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-arndale   12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-xsm  12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-credit2   12 migrate-support-check        fail   never pass

version targeted for testing:
 linux                64972ceb0b0cafc91a09764bc731e1b7f0503b5c
baseline version:
 linux                9f51b5de8c3fdd01a9d692da5633449cc6936688


People who touched revisions under test:
  David S. Miller da...@davemloft.net
  Ian Campbell ian.campb...@citrix.com
  Luis Henriques luis.henriq...@canonical.com
  Wei Liu wei.l...@citrix.com


jobs:
 build-armhf-xsm                                              pass
 build-armhf                                                  pass
 build-armhf-libvirt                                          pass
 build-armhf-pvops                                            pass
 test-armhf-armhf-xl                                          pass
 test-armhf-armhf-libvirt-xsm                                 pass
 test-armhf-armhf-xl-xsm                                      pass
 test-armhf-armhf-xl-arndale                                  pass
 test-armhf-armhf-xl-credit2                                  pass
 test-armhf-armhf-xl-cubietruck                               fail
 test-armhf-armhf-libvirt                                     pass
 test-armhf-armhf-xl-multivcpu                                pass
 test-armhf-armhf-xl-sedf-pin                                 pass
 test-armhf-armhf-xl-sedf                                     pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 64972ceb0b0cafc91a09764bc731e1b7f0503b5c
Author: Ian Campbell ian.campb...@citrix.com
Date:   Mon Jun 1 11:30:24 2015 +0100

xen: netback: read hotplug script once at start of day.

commit 31a418986a5852034d520a5bab546821ff1ccf3d upstream.

When we come to tear things down in netback_remove() and generate the
uevent it is possible that the xenstore directory has already been
removed (details below).

In such cases netback_uevent() won't be able to read the hotplug
script and will write a xenstore error node.

A recent change to the hypervisor exposed this race such that we now
sometimes lose it (where apparently we didn't ever before).

Instead read the hotplug script configuration during setup and use it
for the lifetime of the backend device.

The apparently more obvious fix of moving the transition to
state=Closed in netback_remove() to after the uevent does not work
because it is possible that we are already in state=Closed (in
reaction to the guest having disconnected as it shutdown). Being
already in Closed means the toolstack is at liberty to start tearing
down the xenstore directories. In principle it might be possible to
arrange to unregister the device sooner (e.g on transition to Closing)
such that xenstore would still be there but this state machine is
fragile and prone to anger...

A modern Xen system only relies on the hotplug uevent for driver
domains, when the backend is in the same domain as the toolstack it
will run the necessary setup/teardown directly in the correct sequence
wrt xenstore changes.

Signed-off-by: Ian Campbell ian.campb...@citrix.com
Acked-by: Wei Liu wei.l...@citrix.com
Signed-off-by: David S. Miller da...@davemloft.net
Signed-off-by: Luis Henriques luis.henriq...@canonical.com
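
For illustration only (not the netback code): a minimal sketch of the "read once at setup, reuse for the lifetime of the device" pattern the commit message describes. All names (toy_store, toy_backend and the toy_backend_* functions) are hypothetical, and xenstore is modelled as a single pointer that becomes NULL once the toolstack has removed the directory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the backend's xenstore directory. */
struct toy_store {
    const char *script;         /* NULL once the directory is gone */
};

/* Hypothetical backend device holding its own copy of the script path. */
struct toy_backend {
    char *hotplug_script;
};

/* Setup: copy the script path while the store is still known to exist. */
static int toy_backend_setup(struct toy_backend *be, const struct toy_store *xs)
{
    size_t len;

    if ( xs->script == NULL )
        return -1;
    len = strlen(xs->script) + 1;
    be->hotplug_script = malloc(len);
    if ( be->hotplug_script == NULL )
        return -1;
    memcpy(be->hotplug_script, xs->script, len);
    return 0;
}

/* uevent time: use only the cached copy and never re-read the store, so a
 * directory that has already been removed cannot cause an error here. */
static const char *toy_backend_uevent_script(const struct toy_backend *be)
{
    assert(be != NULL);
    return be->hotplug_script;
}

/* Teardown: release the cached copy. */
static void toy_backend_free(struct toy_backend *be)
{
    free(be->hotplug_script);
    be->hotplug_script = NULL;
}
```

The point of the pattern is that the teardown path no longer depends on the ordering between the uevent and the removal of the xenstore directory.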



[Xen-devel] [PATCH 1/2] NetBSDRump: provide evtchn.h and privcmd.h

2015-06-24 Thread Wei Liu
Xen's build system has a target for rump kernel called NetBSDRump. We
want to build libxc against rump kernel, so we need to copy NetBSD's
evtchn.h and privcmd.h to NetBSDRump. These copies are not very likely to
diverge from NetBSD's copies, but we don't preclude that possibility.

Signed-off-by: Wei Liu wei.l...@citrix.com
---
 tools/include/xen-sys/NetBSDRump/evtchn.h  | 86 ++
 tools/include/xen-sys/NetBSDRump/privcmd.h | 81 ++--
 2 files changed, 164 insertions(+), 3 deletions(-)
 create mode 100644 tools/include/xen-sys/NetBSDRump/evtchn.h

diff --git a/tools/include/xen-sys/NetBSDRump/evtchn.h b/tools/include/xen-sys/NetBSDRump/evtchn.h
new file mode 100644
index 000..2d8a1f9
--- /dev/null
+++ b/tools/include/xen-sys/NetBSDRump/evtchn.h
@@ -0,0 +1,86 @@
+/* $NetBSD: evtchn.h,v 1.1.1.1 2007/06/14 19:39:45 bouyer Exp $ */
+/**
+ * evtchn.h
+ * 
+ * Interface to /dev/xen/evtchn.
+ * 
+ * Copyright (c) 2003-2005, K A Fraser
+ * 
+ * This file may be distributed separately from the Linux kernel, or
+ * incorporated into other software packages, subject to the following license:
+ * 
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the Software), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ * 
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ * 
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef __NetBSD_EVTCHN_H__
+#define __NetBSD_EVTCHN_H__
+
+/*
+ * Bind a fresh port to VIRQ @virq.
+ */
+#define IOCTL_EVTCHN_BIND_VIRQ \
+   _IOWR('E', 4, struct ioctl_evtchn_bind_virq)
+struct ioctl_evtchn_bind_virq {
+   unsigned int virq;
+   unsigned int port;
+};
+
+/*
+ * Bind a fresh port to remote @remote_domain, @remote_port.
+ */
+#define IOCTL_EVTCHN_BIND_INTERDOMAIN  \
+   _IOWR('E', 5, struct ioctl_evtchn_bind_interdomain)
+struct ioctl_evtchn_bind_interdomain {
+   unsigned int remote_domain, remote_port;
+   unsigned int port;
+};
+
+/*
+ * Allocate a fresh port for binding to @remote_domain.
+ */
+#define IOCTL_EVTCHN_BIND_UNBOUND_PORT \
+   _IOWR('E', 6, struct ioctl_evtchn_bind_unbound_port)
+struct ioctl_evtchn_bind_unbound_port {
+   unsigned int remote_domain;
+   unsigned int port;
+};
+
+/*
+ * Unbind previously allocated @port.
+ */
+#define IOCTL_EVTCHN_UNBIND\
+   _IOW('E', 7, struct ioctl_evtchn_unbind)
+struct ioctl_evtchn_unbind {
+   unsigned int port;
+};
+
+/*
+ * Send event to previously allocated @port.
+ */
+#define IOCTL_EVTCHN_NOTIFY\
+   _IOW('E', 8, struct ioctl_evtchn_notify)
+struct ioctl_evtchn_notify {
+   unsigned int port;
+};
+
+/* Clear and reinitialise the event buffer. Clear error condition. */
+#define IOCTL_EVTCHN_RESET \
+   _IO('E', 9)
+
+#endif /* __NetBSD_EVTCHN_H__ */
diff --git a/tools/include/xen-sys/NetBSDRump/privcmd.h b/tools/include/xen-sys/NetBSDRump/privcmd.h
index efdcae9..1296b30 100644
--- a/tools/include/xen-sys/NetBSDRump/privcmd.h
+++ b/tools/include/xen-sys/NetBSDRump/privcmd.h
@@ -1,6 +1,36 @@
+/* NetBSD: xenio.h,v 1.3 2005/05/24 12:07:12 yamt Exp $*/
 
-#ifndef __NetBSDRump_PRIVCMD_H__
-#define __NetBSDRump_PRIVCMD_H__
+/**
+ * privcmd.h
+ * 
+ * Copyright (c) 2003-2004, K A Fraser
+ * 
+ * This file may be distributed separately from the Linux kernel, or
+ * incorporated into other software packages, subject to the following license:
+ * 
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the Software), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ * 
+ * The above copyright notice and this 

[Xen-devel] [PATCH 0/2] Build libxc on rump kernel

2015-06-24 Thread Wei Liu
I have upstreamed a privcmd driver for rump kernel. That driver has the same
semantics as the NetBSD one so we can just use xc_netbsd for rump kernel.

Wei.

Wei Liu (2):
  NetBSDRump: provide evtchn.h and privcmd.h
  libxc: use xc_netbsd.c for rump kernel

 tools/include/xen-sys/NetBSDRump/evtchn.h  | 86 ++
 tools/include/xen-sys/NetBSDRump/privcmd.h | 81 ++--
 tools/libxc/Makefile   |  1 +
 3 files changed, 165 insertions(+), 3 deletions(-)
 create mode 100644 tools/include/xen-sys/NetBSDRump/evtchn.h

-- 
1.9.1




[Xen-devel] [PATCH 2/2] libxc: use xc_netbsd.c for rump kernel

2015-06-24 Thread Wei Liu
Signed-off-by: Wei Liu wei.l...@citrix.com
---
 tools/libxc/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 55782c8..153b79e 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -48,6 +48,7 @@ CTRL_SRCS-$(CONFIG_Linux) += xc_linux.c xc_linux_osdep.c
 CTRL_SRCS-$(CONFIG_FreeBSD) += xc_freebsd.c xc_freebsd_osdep.c
 CTRL_SRCS-$(CONFIG_SunOS) += xc_solaris.c
 CTRL_SRCS-$(CONFIG_NetBSD) += xc_netbsd.c
+CTRL_SRCS-$(CONFIG_NetBSDRump) += xc_netbsd.c
 CTRL_SRCS-$(CONFIG_MiniOS) += xc_minios.c
 
 GUEST_SRCS-y :=
-- 
1.9.1




Re: [Xen-devel] (xen 4.6 unstable) triple fault when execute fxsave during the procedure of guest iso install

2015-06-24 Thread Wei Liu
On Wed, Jun 24, 2015 at 10:31:57AM +0100, Andrew Cooper wrote:
> On 24/06/15 10:25, Razvan Cojocaru wrote:
> > On 06/24/2015 12:14 PM, Fanhenglong wrote:
> > > I want to debug the procedure of windows os install with windbg,
> > >
> > > windbg executes instruction(fxsave) after the blank vm is started and
> > > before guest iso start to install,
> > >
> > > fxsave trigger the following code path:
> > > vmx_vmexit_handler(EXIT_REASON_EPT_VIOLATION)
> > > ->ept_handle_violation
> > > ->hvm_hap_nested_page_fault
> > > ->handle_mmio_with_translation
> > > ->handle_mmio
> > > ->hvm_emulate_one
> > > ->x86_emulate
> > >
> > > *X86_emulate return X86EMUL_UNHANDLEABLE*
> > How are you using Xen in this case? Are you by any chance using the
> > vm_event system in a way that sends back an emulate vm_event response
> > from userspace?
> >
> > You might want to look at x86_emulate() in
> > xen/arch/x86/x86_emulate/x86_emulate.c and see if (and how) fxsave is
> > being handled.
>
> The fxsave instruction has no emulation implementation.
>
> 0f ae 07 is fxsave (%rdi) which means that either introspection is
> active, or %rdi is a pointer into an MMIO region.
 

So I think this is not a regression? (I'm now trying to identify
possible blockers for the release)

Wei.

> ~Andrew
 


[Xen-devel] [PATCH v4 10/17] x86/hvm: revert 82ed8716b fix direct PCI port I/O emulation retry...

2015-06-24 Thread Paul Durrant
...and error handling

NOTE: A straight reversion was not possible because of subsequent changes
  in the code so this is at least partially a manual reversion.

By limiting hvmemul_do_io_addr() to reps falling within the pages on which
a reference has already been taken, we can guarantee that calls to
hvm_copy_to/from_guest_phys() will not hit the HVMCOPY_gfn_paged_out
or HVMCOPY_gfn_shared cases. Thus we can remove the retry logic from
the intercept code and simplify it significantly.

Normally hvmemul_do_io_addr() will only reference a single page at a time.
It will, however, take an extra page reference for I/O spanning a page
boundary.

It is still important to know, upon returning from x86_emulate(), whether
the number of reps was reduced so the mmio_retry flag is retained for that
purpose.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Keir Fraser <k...@xen.org>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
---
 xen/arch/x86/hvm/emulate.c |   86 +++-
 xen/arch/x86/hvm/hvm.c |4 ++
 xen/arch/x86/hvm/intercept.c   |   52 +---
 xen/include/asm-x86/hvm/vcpu.h |2 +-
 4 files changed, 74 insertions(+), 70 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 4e2fdf1..eefe860 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -52,7 +52,7 @@ static void hvmtrace_io_assist(ioreq_t *p)
 }
 
 static int hvmemul_do_io(
-bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
+bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
 uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
 {
 struct vcpu *curr = current;
@@ -61,6 +61,7 @@ static int hvmemul_do_io(
 .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
 .addr = addr,
 .size = size,
+.count = reps,
 .dir = dir,
 .df = df,
 .data = data,
@@ -126,15 +127,6 @@ static int hvmemul_do_io(
 HVMIO_dispatched : HVMIO_awaiting_completion;
 vio->io_size = size;
 
-/*
- * When retrying a repeated string instruction, force exit to guest after
- * completion of the retried iteration to allow handling of interrupts.
- */
-if ( vio->mmio_retrying )
-*reps = 1;
-
-p.count = *reps;
-
 if ( dir == IOREQ_WRITE )
 {
 if ( !data_is_addr )
@@ -148,17 +140,9 @@ static int hvmemul_do_io(
 switch ( rc )
 {
 case X86EMUL_OKAY:
-case X86EMUL_RETRY:
-*reps = p.count;
 p.state = STATE_IORESP_READY;
-if ( !vio->mmio_retry )
-{
-hvm_io_assist(&p);
-vio->io_state = HVMIO_none;
-}
-else
-/* Defer hvm_io_assist() invocation to hvm_do_resume(). */
-vio->io_state = HVMIO_handle_mmio_awaiting_completion;
+hvm_io_assist(&p);
+vio->io_state = HVMIO_none;
 break;
 case X86EMUL_UNHANDLEABLE:
 {
@@ -236,7 +220,7 @@ static int hvmemul_do_io_buffer(
 
 BUG_ON(buffer == NULL);
 
-rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0,
+rc = hvmemul_do_io(is_mmio, addr, *reps, size, dir, df, 0,
(uintptr_t)buffer);
 if ( rc == X86EMUL_UNHANDLEABLE && dir == IOREQ_READ )
 memset(buffer, 0xff, size);
@@ -288,17 +272,66 @@ static int hvmemul_do_io_addr(
 bool_t is_mmio, paddr_t addr, unsigned long *reps,
 unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa)
 {
-struct page_info *ram_page;
+struct vcpu *v = current;
+unsigned long ram_gmfn = paddr_to_pfn(ram_gpa);
+struct page_info *ram_page[2];
+int nr_pages = 0;
+unsigned long count;
 int rc;
 
-rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), &ram_page);
+rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]);
 if ( rc != X86EMUL_OKAY )
-return rc;
+goto out;
 
-rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1,
+nr_pages++;
+
+/* Determine how many reps will fit within this page */
+for ( count = 0; count < *reps; count++ )
+{
+paddr_t start, end;
+
+if ( df )
+{
+start = ram_gpa - count * size;
+end = ram_gpa + size - 1;
+}
+else
+{
+start = ram_gpa;
+end = ram_gpa + (count + 1) * size - 1;
+}
+
+if ( paddr_to_pfn(start) != ram_gmfn ||
+ paddr_to_pfn(end) != ram_gmfn )
+break;
+}
+
+if ( count == 0 )
+{
+/*
+ * This access must span two pages, so grab a reference to
+ * the next page and do a single rep.
+ */
+rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1,
+  &ram_page[nr_pages]);
+if ( rc != X86EMUL_OKAY )
+goto out;
+
+nr_pages++;
+count = 1;
+}
+
+rc = 
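
The rep-limiting rule in the patch above (count how many repetitions stay inside the page that is already referenced, and fall back to one rep when the very first access straddles a boundary) can be tried out in isolation. The following is only a sketch that mirrors the loop in the hunk outside of Xen; the 4 KiB page size and all sketch_* names are assumptions made for illustration and are not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size of 4 KiB; this constant is illustrative only. */
#define SKETCH_PAGE_SHIFT 12

/* Frame number of a guest physical address (hypothetical helper). */
static uint64_t sketch_pfn(uint64_t addr)
{
    return addr >> SKETCH_PAGE_SHIFT;
}

/*
 * Number of reps (0..reps) whose data lies wholly within the page that
 * contains gpa.  df != 0 means the direction flag is set, so successive
 * reps move towards lower addresses.  A result of 0 means that even the
 * first access crosses a page boundary.
 */
static unsigned long sketch_reps_in_page(uint64_t gpa, unsigned int size,
                                         unsigned long reps, int df)
{
    uint64_t pfn = sketch_pfn(gpa);
    unsigned long count;

    assert(size > 0);

    for ( count = 0; count < reps; count++ )
    {
        uint64_t start, end;

        if ( df )
        {
            start = gpa - (uint64_t)count * size;
            end = gpa + size - 1;
        }
        else
        {
            start = gpa;
            end = gpa + ((uint64_t)count + 1) * size - 1;
        }

        if ( sketch_pfn(start) != pfn || sketch_pfn(end) != pfn )
            break;
    }

    return count;
}
```

A caller would clamp *reps to the returned value, or take the second page reference and issue a single rep when the value is 0, as the hunk does.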

[Xen-devel] [PATCH v4 13/17] x86/hvm: remove HVMIO_dispatched I/O state

2015-06-24 Thread Paul Durrant
By removing the HVMIO_dispatched state and making all pending emulations
(i.e. all those not handled by the hypervisor) use HVMIO_awaiting_completion,
various code-paths can be simplified.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Keir Fraser <k...@xen.org>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
---
 xen/arch/x86/hvm/emulate.c  |   12 +++-
 xen/arch/x86/hvm/hvm.c  |   12 +++-
 xen/arch/x86/hvm/io.c   |   12 
 xen/arch/x86/hvm/vmx/realmode.c |2 +-
 xen/include/asm-x86/hvm/vcpu.h  |8 +++-
 5 files changed, 18 insertions(+), 28 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 111987c..c10adad 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -138,20 +138,14 @@ static int hvmemul_do_io(
 if ( data_is_addr || dir == IOREQ_WRITE )
 return X86EMUL_UNHANDLEABLE;
 goto finish_access;
-case HVMIO_dispatched:
-/* May have to wait for previous cycle of a multi-write to complete. */
-if ( is_mmio && !data_is_addr && (dir == IOREQ_WRITE) &&
- (addr == (vio->mmio_large_write_pa +
-   vio->mmio_large_write_bytes)) )
-return X86EMUL_RETRY;
-/* fallthrough */
 default:
 return X86EMUL_UNHANDLEABLE;
 }
 
-vio->io_state = (data_is_addr || dir == IOREQ_WRITE) ?
-HVMIO_dispatched : HVMIO_awaiting_completion;
+vio->io_state = HVMIO_awaiting_completion;
 vio->io_size = size;
+vio->io_dir = dir;
+vio->io_data_is_addr = data_is_addr;
 
 if ( dir == IOREQ_WRITE )
 {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 39f40ad..4458fa4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -416,22 +416,16 @@ static void hvm_io_assist(ioreq_t *p)
 {
 struct vcpu *curr = current;
 struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-enum hvm_io_state io_state;
 
 p->state = STATE_IOREQ_NONE;
 
-io_state = vio->io_state;
-vio->io_state = HVMIO_none;
-
-switch ( io_state )
+if ( HVMIO_NEED_COMPLETION(vio) )
 {
-case HVMIO_awaiting_completion:
 vio->io_state = HVMIO_completed;
 vio->io_data = p->data;
-break;
-default:
-break;
 }
+else
+vio->io_state = HVMIO_none;
 
 msix_write_completion(curr);
 vcpu_end_shutdown_deferral(curr);
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 27150e9..12fc923 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -90,9 +90,7 @@ int handle_mmio(void)
 
 rc = hvm_emulate_one(ctxt);
 
-if ( rc != X86EMUL_RETRY )
-vio->io_state = HVMIO_none;
-if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+if ( HVMIO_NEED_COMPLETION(vio) || vio->mmio_retry )
 vio->io_completion = HVMIO_mmio_completion;
 else
 vio->mmio_access = (struct npfec){};
@@ -142,6 +140,9 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
 
 rc = hvmemul_do_pio_buffer(port, size, dir, data);
 
+if ( HVMIO_NEED_COMPLETION(vio) )
+vio->io_completion = HVMIO_pio_completion;
+
 switch ( rc )
 {
 case X86EMUL_OKAY:
@@ -154,11 +155,6 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
 }
 break;
 case X86EMUL_RETRY:
-if ( vio->io_state != HVMIO_awaiting_completion )
-return 0;
-/* Completion in hvm_io_assist() with no re-emulation required. */
-ASSERT(dir == IOREQ_READ);
-vio->io_completion = HVMIO_pio_completion;
 break;
 default:
 gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 76ff9a5..5e56a1f 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -111,7 +111,7 @@ void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 
 rc = hvm_emulate_one(hvmemul_ctxt);
 
-if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+if ( HVMIO_NEED_COMPLETION(vio) || vio->mmio_retry )
 vio->io_completion = HVMIO_realmode_completion;
 
 if ( rc == X86EMUL_UNHANDLEABLE )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index c4d96a8..2830057 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -32,7 +32,6 @@
 
 enum hvm_io_state {
 HVMIO_none = 0,
-HVMIO_dispatched,
 HVMIO_awaiting_completion,
 HVMIO_completed
 };
@@ -55,6 +54,13 @@ struct hvm_vcpu_io {
 unsigned long  io_data;
 int io_size;
 enum hvm_io_completion io_completion;
+uint8_t io_dir;
+uint8_t io_data_is_addr;
+
+#define HVMIO_NEED_COMPLETION(_vio) \
+( ((_vio)->io_state == HVMIO_awaiting_completion) && \
+  !(_vio)->io_data_is_addr && \
+

Re: [Xen-devel] [RFC PATCH v3 02/18] xen: Add log2 functionality

2015-06-24 Thread Vijay Kilari
On Mon, Jun 22, 2015 at 6:47 PM, Jan Beulich <jbeul...@suse.com> wrote:
> On 22.06.15 at 14:01, <vijay.kil...@gmail.com> wrote:
>
> First of all, please Cc _all_ relevant maintainers.
>
> > --- a/xen/include/xen/bitops.h
> > +++ b/xen/include/xen/bitops.h
> > @@ -117,6 +117,14 @@ static inline int generic_fls64(__u64 x)
> >  # endif
> >  #endif
> >
> > +static inline unsigned fls_long(unsigned long l)
> > +{
> > +if (sizeof(l) == 4)
> > +return fls(l);
> > +
> > +return fls64(l);
> > +}
>
> I'm not really opposed to this, but did you really verify that there's
> no suitable functionality in tree already (even if named differently)?
> I can't, e.g., see why flsl() wouldn't fit your needs.

#define fls64 flsl
So flsl also should be fine
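
To make the behaviour under discussion concrete, here is a rough standalone sketch of a "find last set" helper that dispatches on the width of unsigned long, as the proposed fls_long() does. It is not the Xen or Linux implementation: the sketch_* names are made up for illustration, and the bit scan is a plain loop rather than the optimised in-tree versions.

```c
#include <assert.h>

/*
 * Position of the most significant set bit, counting from 1.
 * Returns 0 when x is 0.  Hypothetical stand-in for fls64().
 */
static unsigned int sketch_fls64(unsigned long long x)
{
    unsigned int pos = 0;

    while ( x != 0 )
    {
        pos++;
        x >>= 1;
    }

    return pos;
}

/* 32-bit variant, a hypothetical stand-in for fls(). */
static unsigned int sketch_fls(unsigned int x)
{
    return sketch_fls64(x);
}

/*
 * Same shape as the proposed fls_long(): sizeof(l) is a compile-time
 * constant, so the compiler can drop the branch that is not taken.
 */
static unsigned int sketch_fls_long(unsigned long l)
{
    assert(sizeof(l) == 4 || sizeof(l) == 8);

    if ( sizeof(l) == 4 )
        return sketch_fls((unsigned int)l);

    return sketch_fls64(l);
}
```

If flsl() already has these semantics for unsigned long, as the reply above suggests, the extra wrapper adds nothing.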


> > --- /dev/null
> > +++ b/xen/include/xen/log2.h
> > @@ -0,0 +1,205 @@
> > +/*
> > + * Copyright (C) 2006 Red Hat, Inc. All Rights Reserved.
> > + * Written by David Howells (dhowe...@redhat.com)
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version
> > + * 2 of the License, or (at your option) any later version.
> > + */
> > +
> > +#ifndef _LINUX_LOG2_H
> > +#define _LINUX_LOG2_H
>
> LINUX?

Most of the include/xen/*.h files have LINUX kept.
Anyway, I will remove it.


> > +/*
> > + * deal with unrepresentable constant logarithms
> > + */
> > +extern __attribute__((const))
> > +int ilog2_NaN(void);
>
> -ETOOMANYUNDERSCORES

I have taken this from Linux. I have not introduced it.


> > +#if 1
>
> ??? (at least one more below)
>
> Also, are you really needing all of what you add here?

I have taken the complete log2 file from Linux.

Regards
Vijay



Re: [Xen-devel] (xen 4.6 unstable) triple fault when execute fxsave during the procedure of guest iso install

2015-06-24 Thread Razvan Cojocaru
On 06/24/2015 12:31 PM, Andrew Cooper wrote:
> On 24/06/15 10:25, Razvan Cojocaru wrote:
> > On 06/24/2015 12:14 PM, Fanhenglong wrote:
> > > I want to debug the procedure of windows os install with windbg,
> > >
> > > windbg executes instruction(fxsave) after the blank vm is started and
> > > before guest iso start to install,
> > >
> > > fxsave trigger the following code path:
> > > vmx_vmexit_handler(EXIT_REASON_EPT_VIOLATION)
> > > ->ept_handle_violation
> > > ->hvm_hap_nested_page_fault
> > > ->handle_mmio_with_translation
> > > ->handle_mmio
> > > ->hvm_emulate_one
> > > ->x86_emulate
> > >
> > > *X86_emulate return X86EMUL_UNHANDLEABLE*
> > How are you using Xen in this case? Are you by any chance using the
> > vm_event system in a way that sends back an emulate vm_event response
> > from userspace?
> >
> > You might want to look at x86_emulate() in
> > xen/arch/x86/x86_emulate/x86_emulate.c and see if (and how) fxsave is
> > being handled.
>
> The fxsave instruction has no emulation implementation.
>
> 0f ae 07 is fxsave (%rdi) which means that either introspection is
> active, or %rdi is a pointer into an MMIO region.

I see, these are the cases we wanted to treat with the old patch (I
think it was called "xen: Handle resumed instruction based on previous
mem_event reply" - the early versions, with RFC) that sometimes bypassed
the emulator in the introspection case. Without that, there's always
going to be a potential current or future instruction not emulated, and
then something like this happens.
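
For reference, the byte sequence quoted above (0f ae 07) can be taken apart by hand: 0f ae is the two-byte opcode group, and the ModRM byte 07 has mod=0, reg=0 and rm=7, i.e. the /0 member of the group (fxsave) with a plain register-indirect memory operand. A minimal sketch of that decoding follows; it is not taken from x86_emulate.c, it ignores prefixes, and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three fields of an x86 ModRM byte. */
struct sketch_modrm {
    unsigned int mod;   /* bits 7:6, addressing mode */
    unsigned int reg;   /* bits 5:3, opcode extension for group opcodes */
    unsigned int rm;    /* bits 2:0, base register (7 = %rdi, no REX.B) */
};

static struct sketch_modrm sketch_decode_modrm(uint8_t byte)
{
    struct sketch_modrm m;

    m.mod = (byte >> 6) & 3;
    m.reg = (byte >> 3) & 7;
    m.rm = byte & 7;

    return m;
}

/*
 * Returns 1 if the three bytes at insn encode 0f ae /0 with a memory
 * operand (fxsave), 0 otherwise.  Prefixes are not handled here.
 */
static int sketch_is_fxsave(const uint8_t *insn)
{
    struct sketch_modrm m;

    assert(insn != NULL);

    if ( insn[0] != 0x0f || insn[1] != 0xae )
        return 0;

    m = sketch_decode_modrm(insn[2]);

    /* mod == 3 would be a register operand, which fxsave does not take. */
    return m.reg == 0 && m.mod != 3;
}
```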


Cheers,
Razvan



Re: [Xen-devel] [RFC PATCH v3 10/18] xen/arm: ITS: Add APIs to add and assign device

2015-06-24 Thread Julien Grall
Hi Vijay,

On 22/06/15 13:01, vijay.kil...@gmail.com wrote:
> From: Vijaya Kumar K <vijaya.ku...@caviumnetworks.com>
>
> Add APIs to add devices to RB-tree, assign and remove
> devices to domain.
>
> Signed-off-by: Vijaya Kumar K <vijaya.ku...@caviumnetworks.com>
> ---
>  xen/arch/arm/gic-v3-its.c |  246 -
>  xen/include/asm-arm/gic-its.h |4 +-
>  2 files changed, 246 insertions(+), 4 deletions(-)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 2a4fa97..4471669 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -94,6 +94,7 @@ static LIST_HEAD(its_nodes);
>  static DEFINE_SPINLOCK(its_lock);
>  static struct rdist_prop  *gic_rdists;
>  static struct rb_root rb_its_dev;
> +static DEFINE_SPINLOCK(rb_its_dev_lock);

Can't you use the its_lock to handle locking for the rb_its_dev?
  
>  #define gic_data_rdist() (per_cpu(rdist, smp_processor_id()))
>  #define gic_data_rdist_rd_base() (per_cpu(rdist, smp_processor_id()).rbase)
> @@ -152,6 +153,21 @@ u32 its_get_nr_events(void)
>  return (1 << id_bits);
>  }
>
> +static struct its_node * its_get_phys_node(u32 dev_id)
> +{
> +struct its_node *its;
> +
> +/* TODO: For now return ITS0 node.
> + * Need Query PCI helper function to get on which
> + * ITS node the device is attached
> + */
> +list_for_each_entry(its, &its_nodes, entry)
> +{
> +return its;
> +}
> +
> +return NULL;
> +}
>  /* RB-tree helpers for its_device */
>  struct its_device * find_its_device(struct rb_root *root, u32 devid)
>  {
> @@ -459,7 +475,7 @@ void its_send_inv(struct its_device *dev, struct its_collection *col,
>  its_send_single_command(dev->its, its_build_inv_cmd, &desc);
>  }
>
> -void its_send_mapd(struct its_device *dev, int valid)
> +static void its_send_mapd(struct its_device *dev, int valid)

I would prefer to see this static where the function has been introduced
and delay the compilation until the end.

>  {
>  struct its_cmd_desc desc;
>
> @@ -493,7 +509,7 @@ void its_send_mapvi(struct its_device *dev, struct its_collection *col,
>  its_send_single_command(dev->its, its_build_mapvi_cmd, &desc);
>  }
>
> -void its_send_movi(struct its_device *dev, struct its_collection *col,
> +static void its_send_movi(struct its_device *dev, struct its_collection *col,
>  u32 event)

Ditto

>  {
>  struct its_cmd_desc desc;
> @@ -596,7 +612,7 @@ int its_lpi_init(u32 id_bits)
>  return 0;
>  }
>
> -unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids)
> +static unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids)

Ditto

>  {
>  unsigned long *bitmap = NULL;
>  int chunk_id;
> @@ -636,6 +652,230 @@ out:
>  return bitmap;
>  }
>
> +static void its_lpi_free(unsigned long *bitmap, int base, int nr_ids)
> +{
> +int lpi;
> +
> +spin_lock(&lpi_lock);
> +
> +for ( lpi = base; lpi < (base + nr_ids); lpi += IRQS_PER_CHUNK )
> +{
> +int chunk = its_lpi_to_chunk(lpi);
> +
> +BUG_ON(chunk > lpi_chunks);
> +if ( test_bit(chunk, lpi_bitmap) )
> +clear_bit(chunk, lpi_bitmap);
> +else
> +its_err("Bad LPI chunk %d\n", chunk);
> +}
> +
> +spin_unlock(&lpi_lock);
> +
> +xfree(bitmap);
> +}
> +
> +static int its_alloc_device_irq(struct its_device *dev, u32 *hwirq)
> +{
> +int idx;
> +
> +idx = find_first_zero_bit(dev->lpi_map, dev->nr_lpis);
> +if ( idx == dev->nr_lpis )
> +return -ENOSPC;
> +
> +*hwirq = dev->lpi_base + idx;

You could use its_get_plpi here.

> +set_bit(idx, dev->lpi_map);
> +
> +return 0;
> +}
> +
> +static u32 its_get_plpi(struct its_device *dev, u32 event)
> +{
> +ASSERT(event < dev->nr_lpis);
> +return dev->lpi_base + event;
> +}
> +
> +/* Device assignment. Should be called from pci_device_add */

I guess you mean PHYSDEVOP_pci_device_add?

> +int its_add_device(struct domain *d, u32 devid)

Why do you pass a domain as a parameter? This function only adds a device
into the ITS, there is nothing related to a specific domain.

For more abstraction, I would create a helper which takes a struct
device as a parameter, retrieves all the necessary information
(devid, number of MSI...) and then calls this function.

> +{
> +struct its_device *dev;
> +unsigned long *lpi_map;
> +void *itt;
> +int lpi_base, nr_lpis, sz;
> +u32 i, nr_ites, plpi, nr_cpus;
> +struct its_collection *col;
> +
> +spin_lock(&rb_its_dev_lock);
> +dev = find_its_device(&rb_its_dev, devid);
> +if ( dev )

This check is not useful. As you release the lock before adding the
device someone else can have time to add the device in the RB-tree.
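
To illustrate the race described here: a lookup is only meaningful if the lock is still held when the new entry is inserted. The following is a generic sketch of that pattern, not ITS code. It uses a POSIX mutex and a singly linked list in place of the Xen spinlock and RB-tree, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Minimal stand-in for a device record; a list replaces the RB-tree. */
struct sketch_dev {
    uint32_t devid;
    struct sketch_dev *next;
};

static struct sketch_dev *sketch_devs;
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Add a device, failing with -EEXIST if it is already present.  The
 * allocation happens before the lock is taken, and the existence check
 * and the insertion happen under one continuous hold of the lock, so no
 * other caller can slip the same devid in between the two.
 */
static int sketch_add_device(uint32_t devid)
{
    struct sketch_dev *dev, *it;

    dev = malloc(sizeof(*dev));
    if ( dev == NULL )
        return -ENOMEM;
    dev->devid = devid;

    pthread_mutex_lock(&sketch_lock);

    for ( it = sketch_devs; it != NULL; it = it->next )
    {
        if ( it->devid == devid )
        {
            pthread_mutex_unlock(&sketch_lock);
            free(dev);
            return -EEXIST;
        }
    }

    dev->next = sketch_devs;
    sketch_devs = dev;
    assert(sketch_devs == dev);

    pthread_mutex_unlock(&sketch_lock);

    return 0;
}
```

Whether the setup work in its_add_device() can be moved in front of the lock like this depends on the rest of the function, which is not shown in this excerpt.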

> +{
> +spin_unlock(&rb_its_dev_lock);
> +dprintk(XENLOG_G_ERR, "ITS:d%dv%d Device already exists dev 0x%x\n",
> +d->domain_id, current->vcpu_id, dev->device_id);
> +return -EEXIST;
> +}
> +spin_unlock(&rb_its_dev_lock);
> +
> +DPRINTK("ITS:d%dv%d Add device devid 0x%x\n",
 

Re: [Xen-devel] Hyper and Xen Project

2015-06-24 Thread Stefano Stabellini
Hi Wang,

I don't know the answer, so I CCed xen-devel (the Xen development list)
and a few people that I think will be able to help.

Cheers,

Stefano

On Wed, 24 Jun 2015, Wang Xu wrote:
 A problem about channel, where do I found the channel name in the guest, In 
 the document, it says I could found it in
 sysfs, but looks there isn't a name property:

 | root@test-container-create-ubuntu:/sys/bus/xen/devices# udevadm  info 
 --attribute-walk  --path=/devices/console-1
 |
 [...]
 |
 |   looking at device '/devices/console-1':
 |     KERNEL==console-1
 |     SUBSYSTEM==xen
 |     DRIVER==xenconsole
 |     ATTR{devtype}==console
 |     ATTR{nodename}==device/console/1

 and I directly test `/dev/hvc1`, and it could communicate with the outside 
 socket. Is there some mistake in my channel 
 name configuration?

 | static void hyper_config_channel(libxl_device_channel* ch, const char* name, const char* sock, int devid) {
 |     libxl_device_channel_init(ch);
 |     ch->backend_domid = 0;
 |     ch->name = strdup(name);
 |     ch->devid = devid;
 |     ch->connection = LIBXL_CHANNEL_CONNECTION_SOCKET;
 |     ch->u.socket.path = strdup(sock);
 | }

 I tried to look at the oVirt code as it is mentioned in the dock, but I did 
 not find xen console in its guest agent code.

So the issue is that the name you assign here to the channel, doesn't
come up anywhere in the guest. Is that correct?


 Thank you!


 On Tue, Jun 23, 2015 at 7:30 PM, Stefano Stabellini 
 stefano.stabell...@eu.citrix.com wrote:
   On Tue, 23 Jun 2015, Wang Xu wrote:
On Sat, Jun 20, 2015 at 1:10 AM Stefano Stabellini 
 stefano.stabell...@eu.citrix.com wrote:
          Integrating hyper with Xen using libxl was the right decision 
 and it
          looks like you did a good job. I think that you can go ahead 
 with the
          PR!
   
   
          But I did have a few issues building hyper. I am getting:
   
          hyperd.go:11:2: cannot find package hyper/daemon in any of:
          [...]
   
I tried with a clean 0.2-dev branch
 ./autogen.sh
 ./configure
 make
   
It looks ok, are you work on the 0.2-dev branch, I did not write the 
 branch name in the instruction of
   Readme, sorry for
that.  

   No worries, the most important part at this stage is the code, and that
   looks OK :-)
   Yes, I was using 0.2-dev and followed those steps. As I usually don't
   program in go, it is likely that my go working environment is missing
   something, or my go paths are wrong. This is the full error message:

   CGO_LDFLAGS=-Lhypervisor/xen -lxenlight -lxenctrl -lhyperxl godep go 
 build hyperd.go
   hyperd.go:11:2: cannot find package hyper/daemon in any of:
           /local/scratch/sstabellini/go/src/hyper/daemon (from $GOROOT)
           
 /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/daemon (from 
 $GOPATH)
   hyperd.go:10:2: cannot find package hyper/engine in any of:
           /local/scratch/sstabellini/go/src/hyper/engine (from $GOROOT)
           
 /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/engine (from 
 $GOPATH)
   hyperd.go:12:2: cannot find package hyper/lib/glog in any of:
           /local/scratch/sstabellini/go/src/hyper/lib/glog (from $GOROOT)
           
 /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/lib/glog (from 
 $GOPATH)
   hyperd.go:13:2: cannot find package hyper/utils in any of:
           /local/scratch/sstabellini/go/src/hyper/utils (from $GOROOT)
           
 /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/utils (from 
 $GOPATH)
   godep: go exit status 1


          Looking through the code, it seems that you are adding a
          virtio-serial-pci device, why do you need it?  It is not used 
 very much
          on Xen; the regular Xen uart is specified by setting
          b_info-u.hvm.serial to pty, and it looks like you are 
 already doing
          that. If you need more than one console, you can have a list 
 setting
          b_info-u.hvm.serial_list.
   
What the difference between u.hvm.serial_list and channels in 
 domain_config. The channel looks having more
   features.

   Actually I think that you are right: channels are better tested and more
   flexible.


          virtio-9p-pci is also not used very much with Xen, but as we 
 don't have
          an alternative yet, I think it is good for now.
   
   
          Thanks again,
   
          Stefano
   
   
   
          On Fri, 19 Jun 2015, Sarah Conway wrote:
           Hi Xu,
           I'd be happy to work with you on some PR to promote this 
 work. I'll be in touch with some next steps
   next
          week and look
           forward to Stefano's feedback.
          
           Sarah 

Re: [Xen-devel] [PATCH v4 00/17] x86/hvm: I/O emulation cleanup and fix

2015-06-24 Thread Paul Durrant
> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 24 June 2015 13:16
> To: Paul Durrant
> Cc: xen-de...@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH v4 00/17] x86/hvm: I/O emulation cleanup
> and fix
>
> On 24.06.15 at 13:24, <paul.durr...@citrix.com> wrote:
> > This patch series re-works much of the code involved in emulation of port
> > and memory mapped I/O for HVM guests.
> >
> > The code has become very convoluted and, at least by inspection, certain
> > emulations will apparently malfunction.
> >
> > The series is broken down into 17 patches (which are also available in
> > my xenbits repo:
> > http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
> > on the emulation27 branch) as follows:
> >
> > 0001-x86-hvm-simplify-hvmemul_do_io.patch
> > 0002-x86-hvm-remove-hvm_io_pending-check-in-hvmemul_do_io.patch
> > 0003-x86-hvm-remove-extraneous-parameter-from-hvmtrace_io.patch
>
> In the event there were any, however minor, changes to these
> three - they went in yesterday evening.
>

Ok. Thanks. Having not seen your acked-by on list, I didn't check. I'll drop 
these from any subsequent post.

  Paul

> Jan




[Xen-devel] [PATCH] x86: fix build with old gas

2015-06-24 Thread Jan Beulich
.string8 is only supported by gas 2.19 and newer.

Signed-off-by: Jan Beulich <jbeul...@suse.com>

--- a/xen/include/asm-x86/bug.h
+++ b/xen/include/asm-x86/bug.h
@@ -79,7 +79,7 @@ extern const struct bug_frame __start_bu
 .L\@ud: ud2a
 
 .pushsection .rodata.str1, "aMS", @progbits, 1
- .L\@s1: .string8 "\file_str"
+ .L\@s1: .asciz "\file_str"
 .popsection
 
 .pushsection .bug_frames.\type, "a", @progbits
@@ -91,7 +91,7 @@ extern const struct bug_frame __start_bu
 
 .if \second_frame
 .pushsection .rodata.str1, "aMS", @progbits, 1
-.L\@s2: .string8 "\msg"
+.L\@s2: .asciz "\msg"
 .popsection
 .long 0, (.L\@s2 - .L\@bf)
 .endif





Re: [Xen-devel] [PATCH 1/2] xen{trace/analyze}: don't use 64bit versions of libc functions

2015-06-24 Thread Roger Pau Monné
On 24/06/15 at 13:11, Roger Pau Monné wrote:
> On 22/06/15 at 16:48, Roger Pau Monné wrote:
> > On 22/06/15 at 12:09, George Dunlap wrote:
> > > On 06/22/2015 10:59 AM, Roger Pau Monné wrote:
> > > > On 22/06/15 at 11:08, George Dunlap wrote:
> > > > > On 06/19/2015 09:58 AM, Roger Pau Monne wrote:
> > > > > > This is not needed, neither encouraged. Configure already checks
> > > > > > _FILE_OFFSET_BITS and appends it when needed, so that the right functions
> > > > > > are used. Also remove the usage of loff_t and O_LARGEFILE for the same
> > > > > > reason.
> > > > >
> > > > > Just so I understand -- are you saying that configure at the tools
> > > > > directory level will notice that Linux can handle 64-bit file operations
> > > > > and use them automatically?
> > > >
> > > > Yes, according to the man page [1]:
> > > >
> > > > Over time, increases in the size of the stat structure have led to
> > > > three successive versions of stat(): sys_stat() (slot __NR_oldstat),
> > > > sys_newstat() (slot __NR_stat), and sys_stat64() (new in kernel 2.4;
> > > > slot __NR_stat64). The glibc stat() wrapper function hides these details
> > > > from applications, invoking the most recent version of the system call
> > > > provided by the kernel, and repacking the returned information if
> > > > required for old binaries. Similar remarks apply for fstat() and lstat().
> > >
> > > OK, if you can confirm that you've actually tested this on a file larger
> > > than 4GiB, then:
> >
> > No, I have only build tested it since I was trying to unbreak the build.
> > I don't think I will have time to test this until tomorrow, sorry for
> > the delay.
>
> I've now tested this with a ~5GB file and it seems to work fine, I
> haven't seen any error and the output looks reasonable. This was on a
> 64bit Dom0, if someone has a 32bit Dom0 it would be good to test it
> there also.

I've also tested on a 32bit Dom0, with and without the patches in this
series and I always end up getting the same strange output from xenalyze:

# xenalyze trace.file
No output defined, using summary.
Using VMX hardware-assisted virtualization.
scan_for_new_pcpu: Activating pcpu 0 at offset 0
Creating vcpu 0 for dom 32768
scan_for_new_pcpu: Activating pcpu 1 at offset 10376
Creating vcpu 1 for dom 32768
scan_for_new_pcpu: Activating pcpu 4 at offset 10848
Creating vcpu 4 for dom 32768
scan_for_new_pcpu: Activating pcpu 6 at offset 11176
Creating vcpu 6 for dom 32768
init_pcpus: through first trace write, done for now.
Creating domain 0
Creating vcpu 0 for dom 0
Using first_tsc for d0v0 (8109 cycles)
Creating domain 32767
Creating vcpu 1 for dom 32767
Creating vcpu 1 for dom 0
Creating vcpu 2 for dom 0
Creating vcpu 4 for dom 32767
Using first_tsc for d32767v4 (9407 cycles)
Creating vcpu 6 for dom 32767
Using first_tsc for d32767v6 (8755 cycles)
process_cpu_change: Activating pcpu 5 at offset 16664
Creating vcpu 5 for dom 32768
scan_for_new_pcpu: Activating pcpu 7 at offset 17812
Creating vcpu 7 for dom 32768
Creating vcpu 3 for dom 0
Using first_tsc for d0v3 (3369172 cycles)
Creating vcpu 0 for dom 32767
Creating vcpu 6 for dom 0
Creating vcpu 5 for dom 32767
Using first_tsc for d32767v5 (7868 cycles)
Creating vcpu 7 for dom 0
Creating vcpu 7 for dom 32767
Using first_tsc for d32767v7 (7693 cycles)
process_cpu_change: Activating pcpu 2 at offset 61284
Creating vcpu 2 for dom 32768
process_cpu_change: Activating pcpu 3 at offset 62128
Creating vcpu 3 for dom 32768
Creating vcpu 5 for dom 0
Creating vcpu 3 for dom 32767
Using first_tsc for d32767v3 (24609 cycles)
Creating vcpu 4 for dom 0
Creating vcpu 2 for dom 32767
Using first_tsc for d32767v2 (2575 cycles)
WARNING: Unexpected vcpu data type for d0v0 on proc 1! Expected 1 got 2.
Not processing
]   84007(8:4:7) 0 [ ]
WARNING: Unexpected vcpu data type for d0v0 on proc 1! Expected 1 got 2.
Not processing
]   84006(8:4:6) 0 [ ]
WARNING: Unexpected vcpu data type for d0v2 on proc 6! Expected 1 got 2.
Not processing
]   84008(8:4:8) 0 [ ]
WARNING: Unexpected vcpu data type for d0v2 on proc 6! Expected 1 got 2.
Not processing
]   84008(8:4:8) 0 [ ]
WARNING: Unexpected vcpu data type for d0v3 on proc 0! Expected 1 got 2.
Not processing
]   84006(8:4:6) 0 [ ]
Creating domain 90
Creating vcpu 0 for dom 90
Creating domain 89
Creating vcpu 0 for dom 89
Unknown hvm event: 84011
h-exit_reason 7b  exit_reason_max 38!
]   81002(8:1:2) 2 [ 7b 100d9e ]

And that's all. Since this does not seem to be related to these fixes, I think
they should be applied.
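
As background to the _FILE_OFFSET_BITS point made earlier in this thread: with glibc, defining the macro to 64 before any system header widens off_t, so the plain libc file functions can address offsets beyond 4GiB without the explicit 64-bit variants. The snippet below is a rough way to check this from a test program. It is not part of the patches; in the real build the define is expected to come from configure rather than from the source file, and the sketch_* names are made up.

```c
/* Must come before any system header for it to take effect. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <limits.h>
#include <sys/types.h>

/* Width of off_t in bits as seen by this translation unit. */
static unsigned int sketch_off_t_bits(void)
{
    return (unsigned int)(sizeof(off_t) * CHAR_BIT);
}

/* Returns 1 if an offset of `bytes` can be represented in off_t. */
static int sketch_offset_fits(unsigned long long bytes)
{
    unsigned int bits = sketch_off_t_bits();

    assert(bits >= 32);

    /* off_t is signed, so one bit is taken by the sign. */
    if ( bits >= 64 )
        return bytes <= (unsigned long long)LLONG_MAX;

    return bytes < (1ULL << (bits - 1));
}
```

On a 32bit build without the define, sketch_offset_fits(5ULL << 30) would be expected to return 0, which is the case the ~5GB test file exercises.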

Roger.




Re: [Xen-devel] Hyper and Xen Project

2015-06-24 Thread Dave Scott

 On 24 Jun 2015, at 12:48, Stefano Stabellini 
 stefano.stabell...@eu.citrix.com wrote:
 
 Hi Wang,
 
 I don't know the answer, so I CCed xen-devel (the Xen development list)
 and a few people that I think will be able to help.
 
 Cheers,
 
 Stefano
 
 On Wed, 24 Jun 2015, Wang Xu wrote:
 A problem about channel, where do I found the channel name in the guest, In 
 the document, it says I could found it in
 sysfs, but looks there isn't a name property:
 
 | root@test-container-create-ubuntu:/sys/bus/xen/devices# udevadm  info 
 --attribute-walk  --path=/devices/console-1
 |
 [...]
 |
 |   looking at device '/devices/console-1':
 | KERNEL==console-1
 | SUBSYSTEM==xen
 | DRIVER==xenconsole
 | ATTR{devtype}==console
 | ATTR{nodename}==device/console/1”
 

I don’t think the frontend driver in Linux knows about the name key. In my
testing I wrote a udev script which looks up the ‘name’ key directly in
xenstore and creates a named device node from it. For reference, my script is
here:

https://github.com/mirage/mirage-console/blob/master/udev/xenconsole-setup-tty

Cheers,
Dave

 I also tested `/dev/hvc1` directly, and it can communicate with the outside
 socket. Is there a mistake in my channel name configuration?
 
 | static void hyper_config_channel(libxl_device_channel* ch, const char* name,
 |                                  const char* sock, int devid) {
 |     libxl_device_channel_init(ch);
 |     ch->backend_domid = 0;
 |     ch->name = strdup(name);
 |     ch->devid = devid;
 |     ch->connection = LIBXL_CHANNEL_CONNECTION_SOCKET;
 |     ch->u.socket.path = strdup(sock);
 | }
 
 I tried to look at the oVirt code, as it is mentioned in the doc, but I did
 not find a xen console in its guest agent code.
 
 So the issue is that the name you assign here to the channel, doesn't
 come up anywhere in the guest. Is that correct?


 
 
 Thank you!
 
 
 On Tue, Jun 23, 2015 at 7:30 PM, Stefano Stabellini 
 stefano.stabell...@eu.citrix.com wrote:
  On Tue, 23 Jun 2015, Wang Xu wrote:
 On Sat, Jun 20, 2015 at 1:10 AM Stefano Stabellini 
 stefano.stabell...@eu.citrix.com wrote:
Integrating hyper with Xen using libxl was the right decision and it
looks like you did a good job. I think that you can go ahead with the
PR!
 
 
But I did have a few issues building hyper. I am getting:
 
hyperd.go:11:2: cannot find package hyper/daemon in any of:
[...]
 
 I tried with a clean 0.2-dev branch
 ./autogen.sh
 ./configure
 make
 
 It looks OK. Are you working on the 0.2-dev branch? I did not write the branch
 name in the instructions in the Readme, sorry for that.
 
  No worries, the most important part at this stage is the code, and that
  looks OK :-)
  Yes, I was using 0.2-dev and followed those steps. As I usually don't
  program in go, it is likely that my go working environment is missing
  something, or my go paths are wrong. This is the full error message:
 
  CGO_LDFLAGS="-Lhypervisor/xen -lxenlight -lxenctrl -lhyperxl" godep go build hyperd.go
  hyperd.go:11:2: cannot find package hyper/daemon in any of:
  /local/scratch/sstabellini/go/src/hyper/daemon (from $GOROOT)
  
 /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/daemon (from 
 $GOPATH)
  hyperd.go:10:2: cannot find package hyper/engine in any of:
  /local/scratch/sstabellini/go/src/hyper/engine (from $GOROOT)
  
 /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/engine (from 
 $GOPATH)
  hyperd.go:12:2: cannot find package hyper/lib/glog in any of:
  /local/scratch/sstabellini/go/src/hyper/lib/glog (from $GOROOT)
  
 /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/lib/glog (from 
 $GOPATH)
  hyperd.go:13:2: cannot find package hyper/utils in any of:
  /local/scratch/sstabellini/go/src/hyper/utils (from $GOROOT)
  
 /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/utils (from 
 $GOPATH)
  godep: go exit status 1
 
 
Looking through the code, it seems that you are adding a
virtio-serial-pci device, why do you need it?  It is not used very much
on Xen; the regular Xen uart is specified by setting
b_info->u.hvm.serial to pty, and it looks like you are already doing
that. If you need more than one console, you can have a list setting
b_info->u.hvm.serial_list.
 
 What is the difference between u.hvm.serial_list and channels in
 domain_config? Channels look like they have more features.
 
  Actually I think that you are right: channels are better tested and more
  flexible.
 
 
virtio-9p-pci is also not used very much with Xen, but as we don't have
an alternative yet, I think it is good for now.
 
 
Thanks again,
 
Stefano
 
 
 
On Fri, 19 Jun 2015, Sarah Conway wrote:
 Hi Xu,
 I'd be happy to work with you on some PR to 

Re: [Xen-devel] [PATCH 1/2] NetBSDRump: provide evtchn.h and privcmd.h

2015-06-24 Thread Roger Pau Monné
El 24/06/15 a les 12.10, Wei Liu ha escrit:
   +#define IOCTL_PRIVCMD_MMAP \
 +_IOW('P', 2, privcmd_mmap_t)
 +#define IOCTL_PRIVCMD_MMAPBATCH\
 +_IOW('P', 3, privcmd_mmapbatch_t)

FWIW you could have gotten away with just implementing
IOCTL_PRIVCMD_MMAPBATCH, this is what I did on FreeBSD.

Roger.




Re: [Xen-devel] [PATCH v2 04/12] x86/altp2m: basic data structures and support routines.

2015-06-24 Thread Andrew Cooper
On 22/06/15 19:56, Ed White wrote:
 diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
 index 3d8f4dc..a1529c0 100644
 --- a/xen/include/asm-x86/hvm/vcpu.h
 +++ b/xen/include/asm-x86/hvm/vcpu.h
 @@ -118,6 +118,13 @@ struct nestedvcpu {
  
  #define vcpu_nestedhvm(v) (&(v)->arch.hvm_vcpu.nvcpu)
  
 +struct altp2mvcpu {
 +uint16_t p2midx; /* alternate p2m index */
 +uint64_t veinfo_gfn; /* #VE information page guest pfn */

Please use the recently-introduced pfn_t here.  pfn is a more
appropriate term than gfn in this case.

~Andrew



Re: [Xen-devel] [PATCH 1/2] xen{trace/analyze}: don't use 64bit versions of libc functions

2015-06-24 Thread Roger Pau Monné
El 22/06/15 a les 16.48, Roger Pau Monné ha escrit:
 El 22/06/15 a les 12.09, George Dunlap ha escrit:
 On 06/22/2015 10:59 AM, Roger Pau Monné wrote:
 El 22/06/15 a les 11.08, George Dunlap ha escrit:
 On 06/19/2015 09:58 AM, Roger Pau Monne wrote:
 This is neither needed nor encouraged. Configure already checks
 _FILE_OFFSET_BITS and appends it when needed, so that the right functions
 are used. Also remove the usage of loff_t and O_LARGEFILE for the same
 reason.

 Just so I understand -- are you saying that configure at the tools
 directory level will notice that Linux can handle 64-bit file operations
 and use them automatically?

 Yes, according to the man page [1]:

 Over time, increases in the size of the stat structure have led to
 three successive versions of stat(): sys_stat() (slot __NR_oldstat),
 sys_newstat() (slot __NR_stat), and sys_stat64() (new in kernel 2.4;
 slot __NR_stat64). The glibc stat() wrapper function hides these details
 from applications, invoking the most recent version of the system call
 provided by the kernel, and repacking the returned information if
 required for old binaries. Similar remarks apply for fstat() and lstat().

 OK, if you can confirm that you've actually tested this on a file larger
 than 4GiB, then:
 
 No, I have only build tested it since I was trying to unbreak the build.
 I don't think I will have time to test this until tomorrow, sorry for
 the delay.

I've now tested this with a ~5GB file and it seems to work fine: I
haven't seen any errors and the output looks reasonable. This was on a
64bit Dom0; if someone has a 32bit Dom0 it would be good to test it
there as well.
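
For anyone who wants to see the effect in isolation, here is a minimal
sketch (not code from the patch). It assumes a glibc-style libc where
defining _FILE_OFFSET_BITS=64 before any include makes off_t 64 bits wide,
so the plain open()/fstat() calls handle files above 4GiB without
O_LARGEFILE or the *64 variants. In the Xen tree the define would come from
configure via CFLAGS; file_size() is a hypothetical helper used only for
illustration.

```c
#define _FILE_OFFSET_BITS 64   /* assumption: normally supplied via CFLAGS */

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* With the define above, off_t (and st_size) must be 64 bits wide. */
static_assert(sizeof(off_t) == 8, "off_t is expected to be 64-bit");

/* Hypothetical helper: size of a file in bytes, or -1 on error. */
static int64_t file_size(const char *path)
{
    struct stat st;
    int64_t size = -1;
    int fd = open(path, O_RDONLY);   /* no O_LARGEFILE needed */

    if ( fd < 0 )
        return -1;

    if ( fstat(fd, &st) == 0 )       /* plain fstat(), not fstat64() */
        size = (int64_t)st.st_size;

    close(fd);
    return size;
}
```

On a 64bit build off_t is 64 bits wide anyway, so the define only changes
behaviour on 32bit builds, which is why a 32bit Dom0 test is the
interesting one.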

Roger.




[Xen-devel] [PATCH v4 04/17] x86/hvm: remove multiple open coded 'chunking' loops

2015-06-24 Thread Paul Durrant
...in hvmemul_read/write()

Add hvmemul_phys_mmio_access() and hvmemul_linear_mmio_access() functions
to reduce code duplication.

NOTE: This patch also introduces a change in 'chunking' around a page
  boundary. Previously (for example) an 8 byte access at the last
  byte of a page would get carried out as 8 single-byte accesses.
  It will now be carried out as a single-byte access, followed by
  a 4-byte access, a 2-byte access and then another single-byte
  access.
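
The splitting described in the NOTE can be reproduced with a small
standalone sketch. This is not the patch code: split_access(), fls_sketch()
and SKETCH_PAGE_SIZE are hypothetical names, and the sketch only records the
chunk sizes instead of issuing MMIO. It assumes 4k pages, first splits the
access at page boundaries, then picks the largest power of 2 not exceeding
sizeof(long) and halves it whenever it no longer fits.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumption: 4k pages */

/* 1-based index of the highest set bit, 0 for x == 0 (stand-in for fls()). */
static int fls_sketch(unsigned int x)
{
    int n = 0;

    while ( x )
    {
        n++;
        x >>= 1;
    }
    return n;
}

/*
 * Record in out[] the sizes of the individual accesses that an access of
 * 'size' bytes at 'addr' would be split into. Returns the number of chunks.
 */
static size_t split_access(unsigned long addr, unsigned int size,
                           unsigned int *out, size_t max_out)
{
    size_t n = 0;

    while ( size != 0 )
    {
        unsigned int page_off = addr & (SKETCH_PAGE_SIZE - 1);
        unsigned int in_page = SKETCH_PAGE_SIZE - page_off;
        unsigned int part = size < in_page ? size : in_page;
        unsigned int chunk = 1u << (fls_sketch(part) - 1);

        if ( chunk > sizeof(long) )
            chunk = sizeof(long);

        addr += part;
        size -= part;

        /* Emit chunks for this page, shrinking the chunk when needed. */
        while ( part != 0 )
        {
            while ( chunk > part )
                chunk >>= 1;
            assert(n < max_out);
            out[n++] = chunk;
            part -= chunk;
        }
    }

    return n;
}
```

For an 8 byte access starting at the last byte of a page (e.g. address
0xfff) this yields chunk sizes 1, 4, 2, 1, matching the sequence given in
the NOTE above.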

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c |  225 +++-
 1 file changed, 118 insertions(+), 107 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 935eab3..4d11c6c 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -540,6 +540,119 @@ static int hvmemul_virtual_to_linear(
 return X86EMUL_EXCEPTION;
 }
 
+static int hvmemul_phys_mmio_access(
+    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t **buffer)
+{
+    unsigned long one_rep = 1;
+    unsigned int chunk;
+    int rc;
+
+    /* Accesses must fall within a page */
+    BUG_ON((gpa & (PAGE_SIZE - 1)) + size > PAGE_SIZE);
+
+    /*
+     * hvmemul_do_io() cannot handle non-power-of-2 accesses or
+     * accesses larger than sizeof(long), so choose the highest power
+     * of 2 not exceeding sizeof(long) as the 'chunk' size.
+     */
+    chunk = 1 << (fls(size) - 1);
+    if ( chunk > sizeof (long) )
+        chunk = sizeof (long);
+
+    for ( ;; )
+    {
+        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
+                                    *buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        /* Advance to the next chunk */
+        gpa += chunk;
+        *buffer += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        /*
+         * If the chunk now exceeds the remaining size, choose the next
+         * lowest power of 2 that will fit.
+         */
+        while ( chunk > size )
+            chunk >>= 1;
+    }
+
+    return rc;
+}
+
+static int hvmemul_linear_mmio_access(
+    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t translate)
+{
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    unsigned long page_off = gla & (PAGE_SIZE - 1);
+    unsigned int chunk;
+    paddr_t gpa;
+    unsigned long one_rep = 1;
+    int rc;
+
+    chunk = min_t(unsigned int, size, PAGE_SIZE - page_off);
+
+    if ( translate )
+        gpa = pfn_to_paddr(vio->mmio_gpfn) | page_off;
+    else
+    {
+        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                    hvmemul_ctxt);
+        if ( rc != X86EMUL_OKAY )
+            return rc;
+    }
+
+    for ( ;; )
+    {
+        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, &buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        gla += chunk;
+        gpa += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        ASSERT((gla & (PAGE_SIZE - 1)) == 0);
+        chunk = min_t(unsigned int, size, PAGE_SIZE);
+        if ( !translate )
+        {
+            rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                        hvmemul_ctxt);
+            if ( rc != X86EMUL_OKAY )
+                return rc;
+        }
+    }
+
+    return rc;
+}
+
+static inline int hvmemul_linear_mmio_read(
+    unsigned long gla, unsigned int size, void *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt,
+    bool_t translate)
+{
+    return hvmemul_linear_mmio_access(gla, size, IOREQ_READ, buffer,
+                                      pfec, hvmemul_ctxt, translate);
+}
+
+static inline int hvmemul_linear_mmio_write(
+    unsigned long gla, unsigned int size, void *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt,
+    bool_t translate)
+{
+    return hvmemul_linear_mmio_access(gla, size, IOREQ_WRITE, buffer,
+                                      pfec, hvmemul_ctxt, translate);
+}
+
 static int __hvmemul_read(
 enum x86_segment seg,
 unsigned long offset,
@@ -550,51 +663,19 @@ static int __hvmemul_read(
 {
 struct vcpu *curr = current;
 unsigned long addr, reps = 1;
-unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
 uint32_t pfec = PFEC_page_present;
 struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-paddr_t gpa;
 int rc;
 
 rc = hvmemul_virtual_to_linear(
 seg, offset, bytes, &reps, access_type, hvmemul_ctxt, &addr);
 if ( rc != X86EMUL_OKAY )
 return rc;
-off = addr & (PAGE_SIZE - 1);
-/*
- * We only need to handle sizes actual instruction operands can have. All
- * such sizes are either powers 

Re: [Xen-devel] [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI

2015-06-24 Thread Boris Ostrovsky

On 06/24/2015 06:14 AM, Roger Pau Monné wrote:

El 24/06/15 a les 12.05, Jan Beulich ha escrit:

On 24.06.15 at 11:47, roger@citrix.com wrote:

What needs to be done (ordered by priority):

  - Clean up the patches, this patch series was done in less than a week.
  - Finish the boot ABI (this would also be needed for PVH anyway).
  - Convert the rest of xc_dom_*loaders in order to use the physical
entry point when present, right now xc_dom_elfloader is the only one
usable with HVMlite. This is quite trivial (see patch 10, it's a 4
LOC change).
  - Dom0 support.
  - Migration.
  - PCI pass-through.

IMHO this is what we agreed to do with PVH, make it an HVM guest without
a device model and without the emulated devices inside of Xen. Sooner or
later we would need to make that change anyway in order to properly
integrate PVH into Xen, and we get a bunch of new features for free as
compared to PVH.

I don't think of this as "throw PVH out of the window and start
something completely new from scratch"; we are going to reuse some of
the code paths used by PVH inside of Xen. From a guest POV the changes
needed to move from PVH to HVMlite concern the boot ABI only,
which we already agreed should be changed anyway.

I have to admit that I'm having a hard time getting a clear
picture of what the intention now is, namely with feature freeze
being in about 2.5 weeks: If we assume that this series gets ready
in time, should we drop Boris' 32-bit support patches? It would then
be unfortunate if the series here didn't get ready.

TBH I'm not going to make any promises of this being ready before the
4.6 feature freeze, not until I get some feedback from the tools
maintainers regarding the libxc changes to unify the PV and HVM domain
creation paths.


FWIW, I gave this a quick spin on Monday and crashed the hypervisor on a 
NULL pointer right away in vapic code. Which, I assume, is not 
surprising since we are not supposed to be there in the first place.


I'll try it again later today (I was out yesterday), maybe I messed 
something up.





Otoh I don't think this and Boris' code conflict, and what we got in
the tree PVH-wise is kind of a mess right now anyway, so adding just a
few more bits to it (actually getting rid of some fixme-s, i.e.
reducing messiness) seems acceptable. I'd therefore be inclined to take
the rest of Boris' series once ready, and if the series here gets ready
too it could then also go in. That would then mean someone (perhaps
after 4.6 has branched) cleaning up any no longer necessary
PVH special cases, unifying things towards what we now seem to
call HVMlite.

I'm not against merging the 32bit support series for PVH, but I'm
certainly not going to invest time in adding 32bit PVH entry points to
any OSes.


What about Tim's proposal 
(http://lists.xen.org/archives/html/xen-devel/2014-12/msg00596.html)? 
Can this work be made part of it? At least, make it extendable to that?


-boris



Re: [Xen-devel] [PATCH 1/2] xen{trace/analyze}: don't use 64bit versions of libc functions

2015-06-24 Thread George Dunlap
On 06/24/2015 02:02 PM, Roger Pau Monné wrote:
 El 24/06/15 a les 13.11, Roger Pau Monné ha escrit:
 El 22/06/15 a les 16.48, Roger Pau Monné ha escrit:
 El 22/06/15 a les 12.09, George Dunlap ha escrit:
 On 06/22/2015 10:59 AM, Roger Pau Monné wrote:
 El 22/06/15 a les 11.08, George Dunlap ha escrit:
 On 06/19/2015 09:58 AM, Roger Pau Monne wrote:
 This is neither needed nor encouraged. Configure already checks
 _FILE_OFFSET_BITS and appends it when needed, so that the right 
 functions
 are used. Also remove the usage of loff_t and O_LARGEFILE for the same
 reason.

 Just so I understand -- are you saying that configure at the tools
 directory level will notice that Linux can handle 64-bit file operations
 and use them automatically?

 Yes, according to the man page [1]:

 Over time, increases in the size of the stat structure have led to
 three successive versions of stat(): sys_stat() (slot __NR_oldstat),
 sys_newstat() (slot __NR_stat), and sys_stat64() (new in kernel 2.4;
 slot __NR_stat64). The glibc stat() wrapper function hides these details
 from applications, invoking the most recent version of the system call
 provided by the kernel, and repacking the returned information if
 required for old binaries. Similar remarks apply for fstat() and lstat().

 OK, if you can confirm that you've actually tested this on a file larger
 than 4GiB, then:

 No, I have only build tested it since I was trying to unbreak the build.
 I don't think I will have time to test this until tomorrow, sorry for
 the delay.

 I've now tested this with a ~5GB file and it seems to work fine: I
 haven't seen any errors and the output looks reasonable. This was on a
 64bit Dom0; if someone has a 32bit Dom0 it would be good to test it
 there as well.
 
 I've also tested on a 32bit Dom0, with and without the patches in this
 series and I always end up getting the same strange output from xenalyze:
 
 # xenalyze trace.file
 No output defined, using summary.
 Using VMX hardware-assisted virtualization.
 scan_for_new_pcpu: Activating pcpu 0 at offset 0
 Creating vcpu 0 for dom 32768
 scan_for_new_pcpu: Activating pcpu 1 at offset 10376
 Creating vcpu 1 for dom 32768
 scan_for_new_pcpu: Activating pcpu 4 at offset 10848
 Creating vcpu 4 for dom 32768
 scan_for_new_pcpu: Activating pcpu 6 at offset 11176
 Creating vcpu 6 for dom 32768
 init_pcpus: through first trace write, done for now.
 Creating domain 0
 Creating vcpu 0 for dom 0
 Using first_tsc for d0v0 (8109 cycles)
 Creating domain 32767
 Creating vcpu 1 for dom 32767
 Creating vcpu 1 for dom 0
 Creating vcpu 2 for dom 0
 Creating vcpu 4 for dom 32767
 Using first_tsc for d32767v4 (9407 cycles)
 Creating vcpu 6 for dom 32767
 Using first_tsc for d32767v6 (8755 cycles)
 process_cpu_change: Activating pcpu 5 at offset 16664
 Creating vcpu 5 for dom 32768
 scan_for_new_pcpu: Activating pcpu 7 at offset 17812
 Creating vcpu 7 for dom 32768
 Creating vcpu 3 for dom 0
 Using first_tsc for d0v3 (3369172 cycles)
 Creating vcpu 0 for dom 32767
 Creating vcpu 6 for dom 0
 Creating vcpu 5 for dom 32767
 Using first_tsc for d32767v5 (7868 cycles)
 Creating vcpu 7 for dom 0
 Creating vcpu 7 for dom 32767
 Using first_tsc for d32767v7 (7693 cycles)
 process_cpu_change: Activating pcpu 2 at offset 61284
 Creating vcpu 2 for dom 32768
 process_cpu_change: Activating pcpu 3 at offset 62128
 Creating vcpu 3 for dom 32768
 Creating vcpu 5 for dom 0
 Creating vcpu 3 for dom 32767
 Using first_tsc for d32767v3 (24609 cycles)
 Creating vcpu 4 for dom 0
 Creating vcpu 2 for dom 32767
 Using first_tsc for d32767v2 (2575 cycles)
 WARNING: Unexpected vcpu data type for d0v0 on proc 1! Expected 1 got 2.
 Not processing
 ]   84007(8:4:7) 0 [ ]
 WARNING: Unexpected vcpu data type for d0v0 on proc 1! Expected 1 got 2.
 Not processing
 ]   84006(8:4:6) 0 [ ]
 WARNING: Unexpected vcpu data type for d0v2 on proc 6! Expected 1 got 2.
 Not processing
 ]   84008(8:4:8) 0 [ ]
 WARNING: Unexpected vcpu data type for d0v2 on proc 6! Expected 1 got 2.
 Not processing
 ]   84008(8:4:8) 0 [ ]
 WARNING: Unexpected vcpu data type for d0v3 on proc 0! Expected 1 got 2.
 Not processing
 ]   84006(8:4:6) 0 [ ]
 Creating domain 90
 Creating vcpu 0 for dom 90
 Creating domain 89
 Creating vcpu 0 for dom 89
 Unknown hvm event: 84011
 h-exit_reason 7b  exit_reason_max 38!
 ]   81002(8:1:2) 2 [ 7b 100d9e ]
 
 And that's all. Since this does not seem to be related to these fixes, I think
 they should be applied.

+1.

(Ack is already there.)



Re: [Xen-devel] [PATCH v2 02/12] VMX: implement suppress #VE.

2015-06-24 Thread Andrew Cooper
On 22/06/15 19:56, Ed White wrote:
 In preparation for selectively enabling #VE in a later patch, set
 suppress #VE on all EPTE's.

 Suppress #VE should always be the default condition for two reasons:
 it is generally not safe to deliver #VE into a guest unless that guest
 has been modified to receive it; and even then for most EPT violations only
 the hypervisor is able to handle the violation.

 Signed-off-by: Ed White edmund.h.wh...@intel.com
 ---
  xen/arch/x86/mm/p2m-ept.c | 25 -
  1 file changed, 24 insertions(+), 1 deletion(-)

 diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
 index a6c9adf..5de3387 100644
 --- a/xen/arch/x86/mm/p2m-ept.c
 +++ b/xen/arch/x86/mm/p2m-ept.c
 @@ -41,7 +41,7 @@
  #define is_epte_superpage(ept_entry) ((ept_entry)->sp)
  static inline bool_t is_epte_valid(ept_entry_t *e)
  {
 -return (e->epte != 0 && e->sa_p2mt != p2m_invalid);
 +return ((e->epte & ~(1ul << 63)) != 0 && e->sa_p2mt != p2m_invalid);

It might be nice to leave a comment explaining that epte.suppress_ve is
not considered as part of validity.  This avoids a rather opaque mask
against a magic number.

Otherwise, Reviewed-by: Andrew Cooper andrew.coop...@citrix.com

  }
  
  /* returns : 0 for success, -errno otherwise */
 @@ -219,6 +219,8 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, 
 ept_entry_t *entry,
  static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t 
 *ept_entry)
  {
  struct page_info *pg;
 +ept_entry_t *table;
 +unsigned int i;
  
  pg = p2m_alloc_ptp(p2m, 0);
  if ( pg == NULL )
 @@ -232,6 +234,15 @@ static int ept_set_middle_entry(struct p2m_domain *p2m, 
 ept_entry_t *ept_entry)
  /* Manually set A bit to avoid overhead of MMU having to write it later. 
 */
  ept_entry->a = 1;
  
 +ept_entry->suppress_ve = 1;
 +
 +table = __map_domain_page(pg);
 +
 +for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
 +table[i].suppress_ve = 1;
 +
 +unmap_domain_page(table);
 +
  return 1;
  }
  
 @@ -281,6 +292,7 @@ static int ept_split_super_page(struct p2m_domain *p2m, 
 ept_entry_t *ept_entry,
  epte->sp = (level > 1);
  epte->mfn += i * trunk;
  epte->snp = (iommu_enabled && iommu_snoop);
 +epte->suppress_ve = 1;
  
  ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
  
 @@ -790,6 +802,8 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, 
 mfn_t mfn,
  ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
  }
  
 +new_entry.suppress_ve = 1;
 +
  rc = atomic_write_ept_entry(ept_entry, new_entry, target);
  if ( unlikely(rc) )
  old_entry.epte = 0;
 @@ -,6 +1125,8 @@ static void ept_flush_pml_buffers(struct p2m_domain 
 *p2m)
  int ept_p2m_init(struct p2m_domain *p2m)
  {
  struct ept_data *ept = &p2m->ept;
 +ept_entry_t *table;
 +unsigned int i;
  
  p2m->set_entry = ept_set_entry;
  p2m->get_entry = ept_get_entry;
 @@ -1134,6 +1150,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
  p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
  }
  
 +table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
 +
 +for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
 +table[i].suppress_ve = 1;
 +
 +unmap_domain_page(table);
 +
  if ( !zalloc_cpumask_var(&ept->synced_mask) )
  return -ENOMEM;
  




[Xen-devel] [rumpuserxen test] 58871: regressions - FAIL

2015-06-24 Thread osstest service user
flight 58871 rumpuserxen real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58871/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-rumpuserxen             5 rumpuserxen-build   fail REGR. vs. 33866
 build-i386-rumpuserxen              5 rumpuserxen-build   fail REGR. vs. 33866

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386    1 build-check(1)      blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)      blocked  n/a

version targeted for testing:
 rumpuserxen  3b91e44996ea6ae1276bce1cc44f38701c53ee6f
baseline version:
 rumpuserxen  30d72f3fc5e35cd53afd82c8179cc0e0b11146ad


People who touched revisions under test:
  Antti Kantee po...@iki.fi
  Ian Jackson ian.jack...@eu.citrix.com
  Martin Lucina mar...@lucina.net
  Wei Liu l...@liuw.name


jobs:
 build-amd64-xsm                          pass
 build-i386-xsm                           pass
 build-amd64                              pass
 build-i386                               pass
 build-amd64-pvops                        pass
 build-i386-pvops                         pass
 build-amd64-rumpuserxen                  fail
 build-i386-rumpuserxen                   fail
 test-amd64-amd64-rumpuserxen-amd64       blocked
 test-amd64-i386-rumpuserxen-i386         blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 535 lines long.)



[Xen-devel] [PATCH v4 15/17] x86/hvm: use ioreq_t to track in-flight state

2015-06-24 Thread Paul Durrant
Use an ioreq_t rather than open coded state, size, dir and data fields
in struct hvm_vcpu_io. This also allows PIO completion to be handled
similarly to MMIO completion by re-issuing the handle_pio() call.

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c   |   35 +--
 xen/arch/x86/hvm/hvm.c   |   15 ---
 xen/arch/x86/hvm/svm/nestedsvm.c |2 +-
 xen/arch/x86/hvm/vmx/realmode.c  |4 ++--
 xen/include/asm-x86/hvm/vcpu.h   |   12 
 5 files changed, 36 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 6f538bf..6c50ef5 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -92,6 +92,7 @@ static int hvmemul_do_io(
 .df = df,
 .data = data,
 .data_is_ptr = data_is_addr, /* ioreq_t field name is misleading */
+.state = STATE_IOREQ_READY,
 };
 void *p_data = (void *)data;
 int rc;
@@ -129,12 +130,24 @@ static int hvmemul_do_io(
 }
 }
 
-switch ( vio->io_state )
+switch ( vio->io_req.state )
 {
 case STATE_IOREQ_NONE:
 break;
 case STATE_IORESP_READY:
-vio->io_state = STATE_IOREQ_NONE;
+vio->io_req.state = STATE_IOREQ_NONE;
+p = vio->io_req;
+
+/* Verify the emulation request has been correctly re-issued */
+if ( (p.type != (is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO)) ||
+ (p.addr != addr) ||
+ (p.size != size) ||
+ (p.count != reps) ||
+ (p.dir != dir) ||
+ (p.df != df) ||
+ (p.data_is_ptr != data_is_addr) )
+domain_crash(curr->domain);
+
 if ( data_is_addr || dir == IOREQ_WRITE )
 return X86EMUL_UNHANDLEABLE;
 goto finish_access;
@@ -142,11 +155,6 @@ static int hvmemul_do_io(
 return X86EMUL_UNHANDLEABLE;
 }
 
-vio->io_state = STATE_IOREQ_READY;
-vio->io_size = size;
-vio->io_dir = dir;
-vio->io_data_is_addr = data_is_addr;
-
 if ( dir == IOREQ_WRITE )
 {
 if ( !data_is_addr )
@@ -155,13 +163,14 @@ static int hvmemul_do_io(
 hvmtrace_io_assist(&p);
 }
 
+vio->io_req = p;
+
 rc = hvm_io_intercept(&p);
 
 switch ( rc )
 {
 case X86EMUL_OKAY:
-vio->io_data = p.data;
-vio->io_state = STATE_IOREQ_NONE;
+vio->io_req.state = STATE_IOREQ_NONE;
 break;
 case X86EMUL_UNHANDLEABLE:
 {
@@ -172,15 +181,13 @@ static int hvmemul_do_io(
 if ( !s )
 {
 rc = hvm_process_io_intercept(&null_handler, &p);
-if ( rc == X86EMUL_OKAY )
-vio->io_data = p.data;
-vio->io_state = STATE_IOREQ_NONE;
+vio->io_req.state = STATE_IOREQ_NONE;
 }
 else
 {
 rc = hvm_send_assist_req(s, &p);
 if ( rc != X86EMUL_RETRY )
-vio->io_state = STATE_IOREQ_NONE;
+vio->io_req.state = STATE_IOREQ_NONE;
 else if ( data_is_addr || dir == IOREQ_WRITE )
 rc = X86EMUL_OKAY;
 }
@@ -199,7 +206,7 @@ static int hvmemul_do_io(
 hvmtrace_io_assist(&p);
 
 if ( !data_is_addr )
-memcpy(p_data, &vio->io_data, size);
+memcpy(p_data, &p.data, size);
 }
 
 if ( is_mmio && !data_is_addr )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7411287..8abf29b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -421,11 +421,11 @@ static void hvm_io_assist(ioreq_t *p)
 
 if ( HVMIO_NEED_COMPLETION(vio) )
 {
-vio->io_state = STATE_IORESP_READY;
-vio->io_data = p->data;
+vio->io_req.state = STATE_IORESP_READY;
+vio->io_req.data = p->data;
 }
 else
-vio->io_state = STATE_IOREQ_NONE;
+vio->io_req.state = STATE_IOREQ_NONE;
 
 msix_write_completion(curr);
 vcpu_end_shutdown_deferral(curr);
@@ -501,11 +501,12 @@ void hvm_do_resume(struct vcpu *v)
 (void)handle_mmio();
 break;
 case HVMIO_pio_completion:
-if ( vio->io_size == 4 ) /* Needs zero extension. */
-guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
+if ( vio->io_req.size == 4 ) /* Needs zero extension. */
+guest_cpu_user_regs()->rax = (uint32_t)vio->io_req.data;
 else
-memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
-vio->io_state = STATE_IOREQ_NONE;
+memcpy(&guest_cpu_user_regs()->rax, &vio->io_req.data,
+   vio->io_req.size);
+vio->io_req.state = STATE_IOREQ_NONE;
 break;
 case HVMIO_realmode_completion:
 {
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index 8b165c6..78667a2 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c

[Xen-devel] [PATCH v4 14/17] x86/hvm: remove hvm_io_state enumeration

2015-06-24 Thread Paul Durrant
Emulation request status is already covered by STATE_IOREQ_XXX values so
just use those. The mapping is:

HVMIO_none                -> STATE_IOREQ_NONE
HVMIO_awaiting_completion -> STATE_IOREQ_READY
HVMIO_completed           -> STATE_IORESP_READY
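
The mapping can also be written down as a small standalone function, which
is handy when reading older code against the new state names. This is only
an illustrative sketch: both enums below are local stand-ins declared for
the example (the numeric values are not taken from the Xen headers), and
map_io_state() is a hypothetical helper that does not exist in the tree.

```c
#include <assert.h>

/* Local stand-in for the enumeration removed by this patch. */
enum old_io_state {
    HVMIO_none,
    HVMIO_awaiting_completion,
    HVMIO_completed,
};

/* Local stand-in for the ioreq states; values are illustrative only. */
enum ioreq_state {
    STATE_IOREQ_NONE,
    STATE_IOREQ_READY,
    STATE_IOREQ_INPROCESS,
    STATE_IORESP_READY,
};

/* Hypothetical helper encoding the mapping from the commit message. */
static enum ioreq_state map_io_state(enum old_io_state s)
{
    switch ( s )
    {
    case HVMIO_awaiting_completion:
        return STATE_IOREQ_READY;
    case HVMIO_completed:
        return STATE_IORESP_READY;
    case HVMIO_none:
    default:
        return STATE_IOREQ_NONE;
    }
}
```

Note that nothing maps to STATE_IOREQ_INPROCESS: the old enumeration had no
equivalent of it.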

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c   |   14 +++---
 xen/arch/x86/hvm/hvm.c   |6 +++---
 xen/arch/x86/hvm/svm/nestedsvm.c |2 +-
 xen/arch/x86/hvm/vmx/realmode.c  |4 ++--
 xen/include/asm-x86/hvm/vcpu.h   |   10 ++
 5 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index c10adad..6f538bf 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -131,10 +131,10 @@ static int hvmemul_do_io(
 
 switch ( vio->io_state )
 {
-case HVMIO_none:
+case STATE_IOREQ_NONE:
 break;
-case HVMIO_completed:
-vio->io_state = HVMIO_none;
+case STATE_IORESP_READY:
+vio->io_state = STATE_IOREQ_NONE;
 if ( data_is_addr || dir == IOREQ_WRITE )
 return X86EMUL_UNHANDLEABLE;
 goto finish_access;
@@ -142,7 +142,7 @@ static int hvmemul_do_io(
 return X86EMUL_UNHANDLEABLE;
 }
 
-vio->io_state = HVMIO_awaiting_completion;
+vio->io_state = STATE_IOREQ_READY;
 vio->io_size = size;
 vio->io_dir = dir;
 vio->io_data_is_addr = data_is_addr;
@@ -161,7 +161,7 @@ static int hvmemul_do_io(
 {
 case X86EMUL_OKAY:
 vio->io_data = p.data;
-vio->io_state = HVMIO_none;
+vio->io_state = STATE_IOREQ_NONE;
 break;
 case X86EMUL_UNHANDLEABLE:
 {
@@ -174,13 +174,13 @@ static int hvmemul_do_io(
 rc = hvm_process_io_intercept(&null_handler, &p);
 if ( rc == X86EMUL_OKAY )
 vio->io_data = p.data;
-vio->io_state = HVMIO_none;
+vio->io_state = STATE_IOREQ_NONE;
 }
 else
 {
 rc = hvm_send_assist_req(s, &p);
 if ( rc != X86EMUL_RETRY )
-vio->io_state = HVMIO_none;
+vio->io_state = STATE_IOREQ_NONE;
 else if ( data_is_addr || dir == IOREQ_WRITE )
 rc = X86EMUL_OKAY;
 }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4458fa4..7411287 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -421,11 +421,11 @@ static void hvm_io_assist(ioreq_t *p)
 
 if ( HVMIO_NEED_COMPLETION(vio) )
 {
-vio->io_state = HVMIO_completed;
+vio->io_state = STATE_IORESP_READY;
 vio->io_data = p->data;
 }
 else
-vio->io_state = HVMIO_none;
+vio->io_state = STATE_IOREQ_NONE;
 
 msix_write_completion(curr);
 vcpu_end_shutdown_deferral(curr);
@@ -505,7 +505,7 @@ void hvm_do_resume(struct vcpu *v)
 guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
 else
 memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
-vio->io_state = HVMIO_none;
+vio->io_state = STATE_IOREQ_NONE;
 break;
 case HVMIO_realmode_completion:
 {
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index be5797a..8b165c6 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1231,7 +1231,7 @@ enum hvm_intblk nsvm_intr_blocked(struct vcpu *v)
  * Delay the injection because this would result in delivering
  * an interrupt *within* the execution of an instruction.
  */
-if ( v->arch.hvm_vcpu.hvm_io.io_state != HVMIO_none )
+if ( v->arch.hvm_vcpu.hvm_io.io_state != STATE_IOREQ_NONE )
 return hvm_intblk_shadow;
 
 if ( !nv->nv_vmexit_pending && n2vmcb->exitintinfo.bytes != 0 ) {
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 5e56a1f..4135ad4 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -205,7 +205,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 
 vmx_realmode_emulate_one(hvmemul_ctxt);
 
-if ( vio-io_state != HVMIO_none || vio-mmio_retry )
+if ( vio-io_state != STATE_IOREQ_NONE || vio-mmio_retry )
 break;
 
 /* Stop emulating unless our segment state is not safe */
@@ -219,7 +219,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
 }
 
 /* Need to emulate next time if we've started an IO operation */
-if ( vio-io_state != HVMIO_none )
+if ( vio-io_state != STATE_IOREQ_NONE )
 curr-arch.hvm_vmx.vmx_emulate = 1;
 
 if ( !curr-arch.hvm_vmx.vmx_emulate  !curr-arch.hvm_vmx.vmx_realmode )
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 2830057..f797518 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -30,12 +30,6 

[Xen-devel] [PATCH v4 11/17] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io()

2015-06-24 Thread Paul Durrant
Removing the calls in hvmemul_do_io() (replaced by a single
assignment) and hvm_complete_assist_request() (replaced by a call to
process_portio_intercept() with a suitable set of ops) allows
hvm_io_assist() to be moved into hvm.c and made static (and hence a
candidate for inlining).

This patch also fixes the I/O state test at the end of hvm_io_assist()
to check the correct value. Since the ioreq server patch series was
integrated, the current ioreq state is no longer an indicator of in-flight
I/O state, because an I/O scheduled by re-emulation may be targeted at a
different ioreq server.
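
For readers outside the series, the "suitable set of ops" idea can be sketched in isolation. This is only an illustration under stated assumptions: every sketch_* and SKETCH_* name is hypothetical and stands in for the Xen types, and the dispatch function is not the real intercept path. Reads with no backing device model return all ones, and writes are dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for Xen's X86EMUL_* codes and IOREQ directions. */
enum { SKETCH_OKAY = 0 };
enum { SKETCH_WRITE = 0, SKETCH_READ = 1 };

struct sketch_handler;

struct sketch_ops {
    int (*read)(const struct sketch_handler *h, uint64_t addr,
                uint32_t size, uint64_t *data);
    int (*write)(const struct sketch_handler *h, uint64_t addr,
                 uint32_t size, uint64_t data);
};

struct sketch_handler {
    const struct sketch_ops *ops;
};

/* Reads with no backing device model return all ones. */
static int null_read(const struct sketch_handler *h, uint64_t addr,
                     uint32_t size, uint64_t *data)
{
    (void)h; (void)addr; (void)size;
    *data = ~UINT64_C(0);
    return SKETCH_OKAY;
}

/* Writes with no backing device model are silently dropped. */
static int null_write(const struct sketch_handler *h, uint64_t addr,
                      uint32_t size, uint64_t data)
{
    (void)h; (void)addr; (void)size; (void)data;
    return SKETCH_OKAY;
}

static const struct sketch_ops null_ops = {
    .read = null_read,
    .write = null_write
};

static const struct sketch_handler null_handler = {
    .ops = &null_ops
};

/* Dispatch one access through a handler, as a generic intercept might. */
static int sketch_process(const struct sketch_handler *h, int dir,
                          uint64_t addr, uint32_t size, uint64_t *data)
{
    assert(h != NULL && h->ops != NULL && data != NULL);

    return (dir == SKETCH_READ) ? h->ops->read(h, addr, size, data)
                                : h->ops->write(h, addr, size, *data);
}
```

The point of routing the "no device model" case through the same dispatch is that the caller no longer needs a special completion function for it.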

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c   |   34 +---
 xen/arch/x86/hvm/hvm.c   |   70 +++---
 xen/arch/x86/hvm/intercept.c |4 +--
 xen/arch/x86/hvm/io.c|   39 ---
 xen/include/asm-x86/hvm/io.h |3 +-
 5 files changed, 73 insertions(+), 77 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index eefe860..111987c 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -51,6 +51,32 @@ static void hvmtrace_io_assist(ioreq_t *p)
 trace_var(event, 0/*!cycles*/, size, buffer);
 }
 
+static int null_read(struct hvm_io_handler *io_handler,
+                     uint64_t addr,
+                     uint64_t size,
+                     uint64_t *data)
+{
+    *data = ~0ul;
+    return X86EMUL_OKAY;
+}
+
+static int null_write(struct hvm_io_handler *handler,
+                      uint64_t addr,
+                      uint64_t size,
+                      uint64_t data)
+{
+    return X86EMUL_OKAY;
+}
+
+static const struct hvm_io_ops null_ops = {
+    .read = null_read,
+    .write = null_write
+};
+
+static struct hvm_io_handler null_handler = {
+    .ops = &null_ops
+};
+
 static int hvmemul_do_io(
 bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
 uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
@@ -140,8 +166,7 @@ static int hvmemul_do_io(
 switch ( rc )
 {
 case X86EMUL_OKAY:
-p.state = STATE_IORESP_READY;
-hvm_io_assist(p);
+vio-io_data = p.data;
 vio-io_state = HVMIO_none;
 break;
 case X86EMUL_UNHANDLEABLE:
@@ -152,8 +177,9 @@ static int hvmemul_do_io(
 /* If there is no suitable backing DM, just ignore accesses */
 if ( !s )
 {
-hvm_complete_assist_req(p);
-rc = X86EMUL_OKAY;
+rc = hvm_process_io_intercept(null_handler, p);
+if ( rc == X86EMUL_OKAY )
+vio-io_data = p.data;
 vio-io_state = HVMIO_none;
 }
 else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 626c431..3365abb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -411,6 +411,45 @@ bool_t hvm_io_pending(struct vcpu *v)
 return 0;
 }
 
+static void hvm_io_assist(ioreq_t *p)
+{
+    struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
+    enum hvm_io_state io_state;
+
+    p->state = STATE_IOREQ_NONE;
+
+    io_state = vio->io_state;
+    vio->io_state = HVMIO_none;
+
+    switch ( io_state )
+    {
+    case HVMIO_awaiting_completion:
+        vio->io_state = HVMIO_completed;
+        vio->io_data = p->data;
+        break;
+    case HVMIO_handle_mmio_awaiting_completion:
+        vio->io_state = HVMIO_completed;
+        vio->io_data = p->data;
+        (void)handle_mmio();
+        break;
+    case HVMIO_handle_pio_awaiting_completion:
+        if ( vio->io_size == 4 ) /* Needs zero extension. */
+            guest_cpu_user_regs()->rax = (uint32_t)p->data;
+        else
+            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
+        break;
+    default:
+        break;
+    }
+
+    if ( p->state == STATE_IOREQ_NONE )
+    {
+        msix_write_completion(curr);
+        vcpu_end_shutdown_deferral(curr);
+    }
+}
+
 static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
 {
 /* NB. Optimised for common case (p-state == STATE_IOREQ_NONE). */
@@ -2667,37 +2706,6 @@ int hvm_send_assist_req(struct hvm_ioreq_server *s, 
ioreq_t *proto_p)
 return X86EMUL_UNHANDLEABLE;
 }
 
-void hvm_complete_assist_req(ioreq_t *p)
-{
-switch ( p-type )
-{
-case IOREQ_TYPE_PCI_CONFIG:
-ASSERT_UNREACHABLE();
-break;
-case IOREQ_TYPE_COPY:
-case IOREQ_TYPE_PIO:
-if ( p-dir == IOREQ_READ )
-{
-if ( !p-data_is_ptr )
-p-data = ~0ul;
-else
-{
-int i, step = p-df ? -p-size : p-size;
-uint32_t data = ~0;
-
-for ( i = 0; i  p-count; i++ )
-hvm_copy_to_guest_phys(p-data + step * i, data,
-   

[Xen-devel] [PATCH v4 12/17] x86/hvm: split I/O completion handling from state model

2015-06-24 Thread Paul Durrant
The state of in-flight I/O and how its completion will be handled are
logically separate, and conflating the two makes the code unnecessarily
confusing.
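
As a rough illustration of the split (not the patch itself), the sketch below keeps "what the request is doing" and "what to do once it is answered" in two separate enums. All names are hypothetical, the MMIO case only counts re-runs instead of calling an emulator, and the memcpy() path assumes a little-endian host, as on x86.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* What the in-flight request is doing (hypothetical names). */
enum sketch_io_state { IO_NONE, IO_AWAITING, IO_COMPLETED };

/* What to do once the request has been answered (hypothetical names). */
enum sketch_io_completion { COMPL_NONE, COMPL_MMIO, COMPL_PIO };

struct sketch_vio {
    enum sketch_io_state state;
    enum sketch_io_completion completion;
    uint64_t data;            /* value returned by the emulator */
    unsigned int size;        /* access size in bytes */
    unsigned int mmio_reruns; /* counts re-emulations, for illustration */
};

/* Resume path: consume the completion action exactly once. */
static void sketch_resume(struct sketch_vio *vio, uint64_t *rax)
{
    enum sketch_io_completion c = vio->completion;

    assert(vio->size <= sizeof(*rax));
    vio->completion = COMPL_NONE;

    switch ( c )
    {
    case COMPL_NONE:
        break;
    case COMPL_MMIO:
        vio->mmio_reruns++;   /* stands in for re-running the emulator */
        break;
    case COMPL_PIO:
        if ( vio->size == 4 ) /* 32-bit results zero-extend */
            *rax = (uint32_t)vio->data;
        else                  /* narrower results leave upper bytes alone */
            memcpy(rax, &vio->data, vio->size);
        vio->state = IO_NONE;
        break;
    }
}
```

Because the completion action is cleared before it is acted on, a second resume with nothing pending is a no-op.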

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/hvm.c|   50 -
 xen/arch/x86/hvm/io.c |6 ++---
 xen/arch/x86/hvm/vmx/realmode.c   |   27 ++--
 xen/include/asm-x86/hvm/vcpu.h|   16 
 xen/include/asm-x86/hvm/vmx/vmx.h |1 +
 5 files changed, 68 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 3365abb..39f40ad 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -59,6 +59,7 @@
 #include asm/hvm/trace.h
 #include asm/hvm/nestedhvm.h
 #include asm/hvm/event.h
+#include asm/hvm/vmx/vmx.h
 #include asm/mtrr.h
 #include asm/apic.h
 #include public/sched.h
@@ -428,26 +429,12 @@ static void hvm_io_assist(ioreq_t *p)
 vio-io_state = HVMIO_completed;
 vio-io_data = p-data;
 break;
-case HVMIO_handle_mmio_awaiting_completion:
-vio-io_state = HVMIO_completed;
-vio-io_data = p-data;
-(void)handle_mmio();
-break;
-case HVMIO_handle_pio_awaiting_completion:
-if ( vio-io_size == 4 ) /* Needs zero extension. */
-guest_cpu_user_regs()-rax = (uint32_t)p-data;
-else
-memcpy(guest_cpu_user_regs()-rax, p-data, vio-io_size);
-break;
 default:
 break;
 }
 
-if ( p-state == STATE_IOREQ_NONE )
-{
-msix_write_completion(curr);
-vcpu_end_shutdown_deferral(curr);
-}
+msix_write_completion(curr);
+vcpu_end_shutdown_deferral(curr);
 }
 
 static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
@@ -482,6 +469,7 @@ void hvm_do_resume(struct vcpu *v)
 struct hvm_vcpu_io *vio = v-arch.hvm_vcpu.hvm_io;
 struct domain *d = v-domain;
 struct hvm_ioreq_server *s;
+enum hvm_io_completion io_completion;
 
 check_wakeup_from_wait();
 
@@ -508,8 +496,36 @@ void hvm_do_resume(struct vcpu *v)
 }
 }
 
-    if ( vio->mmio_retry )
+    io_completion = vio->io_completion;
+    vio->io_completion = HVMIO_no_completion;
+
+    switch ( io_completion )
+    {
+    case HVMIO_no_completion:
+        break;
+    case HVMIO_mmio_completion:
         (void)handle_mmio();
+        break;
+    case HVMIO_pio_completion:
+        if ( vio->io_size == 4 ) /* Needs zero extension. */
+            guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
+        else
+            memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
+        vio->io_state = HVMIO_none;
+        break;
+    case HVMIO_realmode_completion:
+    {
+        struct hvm_emulate_ctxt ctxt;
+
+        hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
+        vmx_realmode_emulate_one(&ctxt);
+        hvm_emulate_writeback(&ctxt);
+
+        break;
+    }
+    default:
+        BUG();
+    }
 
 /* Inject pending hw/sw trap */
 if ( v-arch.hvm_vcpu.inject_trap.vector != -1 ) 
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 61df6dd..27150e9 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -92,8 +92,8 @@ int handle_mmio(void)
 
 if ( rc != X86EMUL_RETRY )
 vio-io_state = HVMIO_none;
-if ( vio-io_state == HVMIO_awaiting_completion )
-vio-io_state = HVMIO_handle_mmio_awaiting_completion;
+if ( vio-io_state == HVMIO_awaiting_completion || vio-mmio_retry )
+vio-io_completion = HVMIO_mmio_completion;
 else
 vio-mmio_access = (struct npfec){};
 
@@ -158,7 +158,7 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
 return 0;
 /* Completion in hvm_io_assist() with no re-emulation required. */
 ASSERT(dir == IOREQ_READ);
-vio-io_state = HVMIO_handle_pio_awaiting_completion;
+vio-io_completion = HVMIO_pio_completion;
 break;
 default:
 gdprintk(XENLOG_ERR, Weird HVM ioemulation status %d.\n, rc);
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index fe8b4a0..76ff9a5 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -101,15 +101,19 @@ static void realmode_deliver_exception(
 }
 }
 
-static void realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
+void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
 struct vcpu *curr = current;
+struct hvm_vcpu_io *vio = curr-arch.hvm_vcpu.hvm_io;
 int rc;
 
 perfc_incr(realmode_emulations);
 
 rc = hvm_emulate_one(hvmemul_ctxt);
 
+if ( vio-io_state == HVMIO_awaiting_completion || vio-mmio_retry )
+vio-io_completion = HVMIO_realmode_completion;
+
 if ( rc == X86EMUL_UNHANDLEABLE )
 {
 gdprintk(XENLOG_ERR, Failed to emulate 

[Xen-devel] [PATCH v4 17/17] x86/hvm: track large memory mapped accesses by buffer offset

2015-06-24 Thread Paul Durrant
The code in hvmemul_do_io() that tracks large reads or writes, to avoid
re-issue of component I/O, is defeated by accesses across a page boundary
because it uses the physical address. The code is also only relevant to
memory mapped I/O to or from a buffer.

This patch refactors the code and moves it into hvmemul_phys_mmio_access(),
where it is relevant, and tracks using buffer offset rather than address.
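
A minimal sketch of why the buffer offset is the more robust key, assuming a hypothetical device_read() and a fixed-size cache (neither is taken from the patch): two chunks of one access can land on unrelated physical pages, so their addresses are not contiguous, but their offsets into the caller's buffer always are.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CACHE_MAX 64

struct sketch_cache {
    unsigned int size;                 /* bytes already completed */
    uint8_t buffer[SKETCH_CACHE_MAX];  /* data seen for those bytes */
};

/* Hypothetical device read: counts calls, returns a recognisable pattern. */
static unsigned int device_reads;

static void device_read(uint64_t gpa, uint8_t *dst, unsigned int len)
{
    unsigned int i;

    device_reads++;
    for ( i = 0; i < len; i++ )
        dst[i] = (uint8_t)(gpa + i);
}

/*
 * Read one chunk into buffer[off .. off + chunk). A chunk completed in an
 * earlier emulation pass is replayed from the cache; otherwise the device
 * is accessed and the result recorded against the buffer offset.
 */
static void cached_read(struct sketch_cache *c, uint64_t gpa,
                        uint8_t *buffer, unsigned int off,
                        unsigned int chunk)
{
    assert(off + chunk <= SKETCH_CACHE_MAX);

    if ( off + chunk <= c->size )
    {
        memcpy(&buffer[off], &c->buffer[off], chunk);
        return;
    }

    assert(off == c->size);   /* chunks are expected in order */
    device_read(gpa, &buffer[off], chunk);
    memcpy(&c->buffer[off], &buffer[off], chunk);
    c->size += chunk;
}
```

On re-emulation the same sequence of calls is made again, and each chunk that was already done is served from the cache without touching the device.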

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c |   98 
 xen/include/asm-x86/hvm/vcpu.h |   16 ---
 2 files changed, 48 insertions(+), 66 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index aa68787..4424dfc 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -107,29 +107,6 @@ static int hvmemul_do_io(
 return X86EMUL_UNHANDLEABLE;
 }
 
-if ( is_mmio  !data_is_addr )
-{
-/* Part of a multi-cycle read or write? */
-if ( dir == IOREQ_WRITE )
-{
-paddr_t pa = vio-mmio_large_write_pa;
-unsigned int bytes = vio-mmio_large_write_bytes;
-if ( (addr = pa)  ((addr + size) = (pa + bytes)) )
-return X86EMUL_OKAY;
-}
-else
-{
-paddr_t pa = vio-mmio_large_read_pa;
-unsigned int bytes = vio-mmio_large_read_bytes;
-if ( (addr = pa)  ((addr + size) = (pa + bytes)) )
-{
-memcpy(p_data, vio-mmio_large_read[addr - pa],
-   size);
-return X86EMUL_OKAY;
-}
-}
-}
-
 switch ( vio-io_req.state )
 {
 case STATE_IOREQ_NONE:
@@ -209,33 +186,6 @@ static int hvmemul_do_io(
 memcpy(p_data, p.data, size);
 }
 
-if ( is_mmio  !data_is_addr )
-{
-/* Part of a multi-cycle read or write? */
-if ( dir == IOREQ_WRITE )
-{
-paddr_t pa = vio-mmio_large_write_pa;
-unsigned int bytes = vio-mmio_large_write_bytes;
-if ( bytes == 0 )
-pa = vio-mmio_large_write_pa = addr;
-if ( addr == (pa + bytes) )
-vio-mmio_large_write_bytes += size;
-}
-else
-{
-paddr_t pa = vio-mmio_large_read_pa;
-unsigned int bytes = vio-mmio_large_read_bytes;
-if ( bytes == 0 )
-pa = vio-mmio_large_read_pa = addr;
-if ( (addr == (pa + bytes)) 
- ((bytes + size) = sizeof(vio-mmio_large_read)) )
-{
-memcpy(vio-mmio_large_read[bytes], p_data, size);
-vio-mmio_large_read_bytes += size;
-}
-}
-}
-
 return X86EMUL_OKAY;
 }
 
@@ -601,8 +551,11 @@ static int hvmemul_virtual_to_linear(
 }
 
 static int hvmemul_phys_mmio_access(
-paddr_t gpa, unsigned int size, uint8_t dir, uint8_t **buffer)
+paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer,
+unsigned int *off)
 {
+    struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
 unsigned long one_rep = 1;
 unsigned int chunk;
 int rc;
@@ -621,14 +574,41 @@ static int hvmemul_phys_mmio_access(
 
 for ( ;; )
 {
-        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
-                                    *buffer);
-        if ( rc != X86EMUL_OKAY )
-            break;
+        /* Have we already done this chunk? */
+        if ( (*off + chunk) <= vio->mmio_cache[dir].size )
+        {
+            ASSERT(*off + chunk <= vio->mmio_cache[dir].size);
+
+            if ( dir == IOREQ_READ )
+                memcpy(&buffer[*off],
+                       &vio->mmio_cache[IOREQ_READ].buffer[*off],
+                       chunk);
+            else
+            {
+                if ( memcmp(&buffer[*off],
+                            &vio->mmio_cache[IOREQ_WRITE].buffer[*off],
+                            chunk) != 0 )
+                    domain_crash(curr->domain);
+            }
+        }
+        else
+        {
+            ASSERT(*off == vio->mmio_cache[dir].size);
+
+            rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
+                                        &buffer[*off]);
+            if ( rc != X86EMUL_OKAY )
+                break;
+
+            /* Note that we have now done this chunk */
+            memcpy(&vio->mmio_cache[dir].buffer[*off],
+                   &buffer[*off], chunk);
+            vio->mmio_cache[dir].size += chunk;
+        }
 
 /* Advance to the next chunk */
 gpa += chunk;
-*buffer += chunk;
+*off += chunk;
 size -= chunk;
 
 if ( size == 0 )
@@ -651,7 +631,7 @@ static int hvmemul_linear_mmio_access(
 {
 struct hvm_vcpu_io *vio = current-arch.hvm_vcpu.hvm_io;
 unsigned long page_off = 

[Xen-devel] [PATCH v4 16/17] x86/hvm: always re-emulate I/O from a buffer

2015-06-24 Thread Paul Durrant
If memory mapped I/O is 'chunked' then the I/O must be re-emulated,
otherwise only the first chunk will be processed. This patch makes
sure all I/O from a buffer is re-emulated regardless of whether it
is a read or a write.

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c |4 ++--
 xen/arch/x86/hvm/hvm.c |   13 -
 xen/include/asm-x86/hvm/vcpu.h |3 +--
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 6c50ef5..aa68787 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -148,7 +148,7 @@ static int hvmemul_do_io(
  (p.data_is_ptr != data_is_addr) )
 domain_crash(curr-domain);
 
-if ( data_is_addr || dir == IOREQ_WRITE )
+if ( data_is_addr )
 return X86EMUL_UNHANDLEABLE;
 goto finish_access;
 default:
@@ -188,7 +188,7 @@ static int hvmemul_do_io(
 rc = hvm_send_assist_req(s, p);
 if ( rc != X86EMUL_RETRY )
 vio-io_req.state = STATE_IOREQ_NONE;
-else if ( data_is_addr || dir == IOREQ_WRITE )
+else if ( data_is_addr )
 rc = X86EMUL_OKAY;
 }
 break;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 8abf29b..c062c9f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -501,11 +501,14 @@ void hvm_do_resume(struct vcpu *v)
 (void)handle_mmio();
 break;
 case HVMIO_pio_completion:
-if ( vio-io_req.size == 4 ) /* Needs zero extension. */
-guest_cpu_user_regs()-rax = (uint32_t)vio-io_req.data;
-else
-memcpy(guest_cpu_user_regs()-rax, vio-io_req.data,
-   vio-io_req.size);
+if ( vio-io_req.dir == IOREQ_READ )
+{
+if ( vio-io_req.size == 4 ) /* Needs zero extension. */
+guest_cpu_user_regs()-rax = (uint32_t)vio-io_req.data;
+else
+memcpy(guest_cpu_user_regs()-rax, vio-io_req.data,
+   vio-io_req.size);
+}
 vio-io_req.state = STATE_IOREQ_NONE;
 break;
 case HVMIO_realmode_completion:
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 7338638..008c8fa 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -49,8 +49,7 @@ struct hvm_vcpu_io {
 
 #define HVMIO_NEED_COMPLETION(_vio) \
 ( ((_vio)-io_req.state == STATE_IOREQ_READY) \
-  !(_vio)-io_req.data_is_ptr  \
-  ((_vio)-io_req.dir == IOREQ_READ) )
+  !(_vio)-io_req.data_is_ptr )
 
 /*
  * HVM emulation:
-- 
1.7.10.4




Re: [Xen-devel] [libvirt test] 58119: regressions - FAIL

2015-06-24 Thread Dario Faggioli
On Tue, 2015-06-23 at 14:38 +0100, Ian Campbell wrote:
 On Tue, 2015-06-23 at 12:15 +0100, Anthony PERARD wrote:
  On Mon, Jun 08, 2015 at 10:22:28AM +0100, Ian Campbell wrote:
   On Mon, 2015-06-08 at 04:37 +, osstest service user wrote:
flight 58119 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58119/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
   
   This has been failing for a while now, sorry for not bringing it to your
   attention sooner.
  
   libxl: debug: libxl_event.c:638:libxl__ev_xswatch_deregister: watch 
   w=0x7f805c25b248 wpath=/local/domain/0/device-model/1/state token=3/0: 
   deregister slotnum=3
   libxl: error: libxl_exec.c:393:spawn_watch_event: domain 1 device model: 
   startup timed out
   libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch 
   w=0x7f805c25b248: deregister unregistered
   libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch 
   w=0x7f805c25b248: deregister unregistered
   libxl: error: libxl_dm.c:1564:device_model_spawn_outcome: domain 1 device 
   model: spawn failed (rc=-3)
   libxl: error: libxl_create.c:1373:domcreate_devmodel_started: device 
   model did not start: -3
  
  Hi,
  
  I've tried to debug this "device model: startup timed out" issue that I'm
  seeing on OpenStack. What I've done is strace every single QEMU. It appears
  that QEMU takes more than 10s to load...
 
 Looking through
 http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-libvirt/ALL.html
  when it passes the collected var-log-libvirt-libxl-libxl-driver.log.gz seems 
 to indicate that the device model is successfully spawned in 2-4s.
 
 The same is true of the tests run on the Cambridge instance.
 
So, can we take Anthony's code/instrumentation for stracing QEMU and do
the same in the ad-hoc run on the test on merlot?

The goal would be to have something like what he attached to his email
(the strace output) for our failing case on merlot.

That's assuming that what Anthony has done to get the traces could be
put in a patch to libxl and/or libvirt, apply it to some branch, and
make the ad-hoc test pick code for the proper components from such
branch... which, I think, should all be doable, or am I talking
nonsense?

Regards,
Dario

-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)




Re: [Xen-devel] [PATCH v4 04/17] x86/hvm: remove multiple open coded 'chunking' loops

2015-06-24 Thread Paul Durrant
 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: 24 June 2015 13:34
 To: Paul Durrant
 Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: Re: [PATCH v4 04/17] x86/hvm: remove multiple open coded
 'chunking' loops
 
  On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
  +static int hvmemul_phys_mmio_access(
  +paddr_t gpa, unsigned int size, uint8_t dir, uint8_t **buffer)
 
 As much as the earlier offset you returned via indirection to the
 caller was unnecessary, the indirection here seems pointless too.
 All callers know how (or don't care) to update the buffer pointer.

Ok. Personally I'd prefer one thing to be in charge of updating the pointer 
though.

 
  +static int hvmemul_linear_mmio_access(
  +unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
  +uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t
 translate)
  +{
  +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
  +    unsigned long page_off = gla & (PAGE_SIZE - 1);
  +unsigned int chunk;
  +paddr_t gpa;
  +unsigned long one_rep = 1;
  +int rc;
  +
  +chunk = min_t(unsigned int, size, PAGE_SIZE - page_off);
  +
  +if ( translate )
  +gpa = pfn_to_paddr(vio-mmio_gpfn) | page_off;
 
 translate as name for the parameter signaling that the translation
 is known is kind of odd - translated or known_gpfn or some such?
 Or invert the meaning?
 

Ok. I think I'll go with the latter.

  +else
  +{
  +rc = hvmemul_linear_to_phys(gla, gpa, chunk, one_rep, pfec,
  +hvmemul_ctxt);
  +if ( rc != X86EMUL_OKAY )
  +return rc;
  +}
  +
  +for ( ;; )
  +{
  +rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
  +if ( rc != X86EMUL_OKAY )
  +break;
  +
  +gla += chunk;
  +gpa += chunk;
  +size -= chunk;
  +
  +if ( size == 0 )
  +break;
  +
  +        ASSERT((gla & (PAGE_SIZE - 1)) == 0);
  +chunk = min_t(unsigned int, size, PAGE_SIZE);
 
 I think you could just assert that size is now less than PAGE_SIZE.

True.

 
  +if ( !translate )
  +{
  +rc = hvmemul_linear_to_phys(gla, gpa, chunk, one_rep, pfec,
  +hvmemul_ctxt);
  +if ( rc != X86EMUL_OKAY )
  +return rc;
  +}
 
 This must be done unconditionally (and gpa doesn't need updating
 above then), as the known translation is only for the first byte
 (and whatever falls on the same page).
 

Yes indeed, I somehow missed that before.
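
To make the point concrete, a page-by-page walk might look like the sketch below. It is only a sketch under stated assumptions: sketch_linear_to_phys() is a made-up translation that maps each linear page to an unrelated frame, and the "access" merely records the physical address of each chunk. A known translation is trusted for the first chunk only, and every later page is translated again.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical translation: each linear page maps to an unrelated frame. */
static unsigned int translations;

static uint64_t sketch_linear_to_phys(uint64_t gla)
{
    uint64_t frame = (gla / SKETCH_PAGE_SIZE) * 7 + 3;  /* arbitrary */

    translations++;
    return frame * SKETCH_PAGE_SIZE + (gla & (SKETCH_PAGE_SIZE - 1));
}

/*
 * Walk a linear range page by page. If gpa_known is set, gpa is trusted
 * for the first chunk only. Returns the number of chunks issued and
 * stores each chunk's physical address in gpas[].
 */
static unsigned int sketch_linear_access(uint64_t gla, unsigned int size,
                                         int gpa_known, uint64_t gpa,
                                         uint64_t *gpas, unsigned int max)
{
    unsigned int page_off = (unsigned int)(gla & (SKETCH_PAGE_SIZE - 1));
    unsigned int room = SKETCH_PAGE_SIZE - page_off;
    unsigned int chunk = size < room ? size : room;
    unsigned int n = 0;

    if ( !gpa_known )
        gpa = sketch_linear_to_phys(gla);

    while ( size != 0 && n < max )
    {
        gpas[n++] = gpa;      /* stands in for the physical access */

        gla += chunk;
        size -= chunk;
        if ( size == 0 )
            break;

        assert((gla & (SKETCH_PAGE_SIZE - 1)) == 0);
        chunk = size < SKETCH_PAGE_SIZE ? size : SKETCH_PAGE_SIZE;
        gpa = sketch_linear_to_phys(gla);   /* unconditional */
    }

    return n;
}
```

With this shape there is no need to advance gpa by chunk inside the loop, since it is recomputed for every new page.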

  Paul

 Jan




Re: [Xen-devel] [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI

2015-06-24 Thread Roger Pau Monné
El 24/06/15 a les 12.05, Jan Beulich ha escrit:
 On 24.06.15 at 11:47, roger@citrix.com wrote:
 What needs to be done (ordered by priority):

  - Clean up the patches, this patch series was done in less than a week.
  - Finish the boot ABI (this would also be needed for PVH anyway).
  - Convert the rest of xc_dom_*loaders in order to use the physical
entry point when present, right now xc_dom_elfloader is the only one
usable with HVMlite. This is quite trivial (see patch 10, it's a 4
LOC change).
  - Dom0 support.
  - Migration.
  - PCI pass-through.

 IMHO this is what we agreed to do with PVH, make it an HVM guest without
 a device model and without the emulated devices inside of Xen. Sooner or
 later we would need to make that change anyway in order to properly
 integrate PVH into Xen, and we get a bunch of new features for free as
 compared to PVH.

 I don't think of this as throw PVH out of the window and start
 something completely new from scratch, we are going to reuse some of
 the code paths used by PVH inside of Xen. From a guest POV the changes
 needed to move from PVH into HVMlite are regarding the boot ABI only,
 which we already agreed that should be changed anyway.
 
 I have to admit that I'm having a hard time making myself a clear
 picture of what the intention now is, namely with feature freeze
 being in about 2.5 weeks: If we assume that this series gets ready
 in time, should we drop Boris' 32-bit support patches? Would then
 be unfortunate if the series here didn't get ready.

TBH I'm not going to make any promises of this being ready before the
4.6 feature freeze, not until I get some feedback from the tools
maintainers regarding the libxc changes to unify the PV and HVM domain
creation paths.

 Otoh I don't think this and Boris' code conflict, and what we got in
 the tree PVH-wise is kind of a mess right now anyway, so adding to
 it just a few more bits (actually getting rid of some fixme-s, i.e.
 reducing messiness), so I'd be inclined to take the rest of Boris'
 series once ready, and if the series here gets ready too it could
 then also go in. Which would then mean for someone (perhaps
 after 4.6 was branched) to clean up any no longer necessary
 PVH special cases, unifying things towards what we seem to now
 call HVMlite.

I'm not against merging the 32bit support series for PVH, but I'm
certainly not going to invest time in adding 32bit PVH entry points to
any OSes.

Roger.




Re: [Xen-devel] [RFC PATCH v3 09/18] xen/arm: ITS: Add virtual ITS commands support

2015-06-24 Thread Julien Grall
Hi Vijay,

On 22/06/15 13:01, vijay.kil...@gmail.com wrote:
 From: Vijaya Kumar K vijaya.ku...@caviumnetworks.com
 
 Add Virtual ITS command processing support to Virtual ITS driver
 
 Signed-off-by: Vijaya Kumar K vijaya.ku...@caviumnetworks.com
 ---
  xen/arch/arm/gic-v3-its.c  |7 +
  xen/arch/arm/vgic-v3-its.c |  393 
 
  2 files changed, 400 insertions(+)
 
 diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
 index 535fc53..2a4fa97 100644
 --- a/xen/arch/arm/gic-v3-its.c
 +++ b/xen/arch/arm/gic-v3-its.c
 @@ -89,6 +89,7 @@ struct its_node {
  
  #define ITS_ITT_ALIGNSZ_256
  
 +static u32 id_bits;
  static LIST_HEAD(its_nodes);
  static DEFINE_SPINLOCK(its_lock);
  static struct rdist_prop  *gic_rdists;
 @@ -146,6 +147,11 @@ void dump_cmd(its_cmd_block *cmd)
  }
  #endif
  
 +u32 its_get_nr_events(void)
 +{
  +    return (1 << id_bits);
 +}
 +
  /* RB-tree helpers for its_device */
  struct its_device * find_its_device(struct rb_root *root, u32 devid)
  {
 @@ -1044,6 +1050,7 @@ static int its_probe(struct dt_device_node *node)
      its->phys_size = its_size;
      typer = readl_relaxed(its_base + GITS_TYPER);
      its->ite_size = ((typer >> 4) & 0xf) + 1;
 +    id_bits = GITS_TYPER_IDBITS(typer);
  
  its-cmd_base = xzalloc_bytes(ITS_CMD_QUEUE_SZ);
  if ( !its-cmd_base )
 diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
 index ea52a87..0671434 100644
 --- a/xen/arch/arm/vgic-v3-its.c
 +++ b/xen/arch/arm/vgic-v3-its.c
 @@ -256,6 +256,399 @@ int remove_vits_device(struct rb_root *root, struct 
 vits_device *dev)
  return 0;
  }
  
 +static int vgic_its_process_sync(struct vcpu *v, struct vgic_its *vits,
 + its_cmd_block *virt_cmd)

virt_cmd is not modified, please use const.

 +{
 +/* XXX: Ignored */

IMHO, XXX means TODO, which is not the case here.

 +DPRINTK(vITS:d%dv%d SYNC: ta 0x%x \n,
 + v-domain-domain_id, v-vcpu_id, virt_cmd-sync.ta);

You can use %pv rather than d%dv%d and directly pass the vcpu.

 +
 +return 0;
 +}
 +
 +static int vgic_its_process_mapvi(struct vcpu *v, struct vgic_its *vits,
 +  its_cmd_block *virt_cmd)

Please use const for the virt_cmd.

 +{
 +struct vitt entry;
 +struct vits_device *vdev;
 +uint8_t vcol_id, cmd;
 +uint32_t vid, dev_id, event;

struct domain *d = v-domain for a better abstraction in the code.

 +
 +vcol_id = virt_cmd-mapvi.col;
 +vid = virt_cmd-mapvi.phy_id;
 +dev_id = its_decode_devid(v-domain, virt_cmd);

AFAIU, its_decode_devid will return a physical devID, although your
function find_vits_device works on a virtual devID. This will also
not work on a fake device. Did I miss something?

 +cmd = virt_cmd-mapvi.cmd;
 +
 +DPRINTK(vITS:d%dv%d MAPVI: dev_id 0x%x vcol_id %d vid %d \n,
 + v-domain-domain_id, v-vcpu_id, dev_id, vcol_id, vid);
 +
 +if ( vcol_id  (v-domain-max_vcpus + 1) ||  vid  its_get_nr_events() )

Checking the validity is pointless as a malicious guest can rewrite the
ITT. We only need to check it when the LPI is effectively injected.

 +return -EINVAL;
 +
 +/* XXX: Enable validation later */

What do you mean?

 +vdev = find_vits_device(v-domain-arch.vits-dev_root, dev_id);
 +if ( !vdev  !vdev-its_dev )
 +return -EINVAL;

This denies the possibility of having a fake device in the domain.

Anyway, this check is not necessary either.

 +
 +entry.valid = true;
 +entry.vcollection = vcol_id;
 +entry.vlpi = vid;
 +
 +if ( cmd == GITS_CMD_MAPI )
 +vits_set_vitt_entry(v-domain, dev_id, vid, entry);
 +else
 +{
 +event = virt_cmd-mapvi.event;
 +if ( event  its_get_nr_events() )

You have hardcoded the number of events in the vGIC but you are using the
physical ITS to check the value.

IMHO, we should introduce a new field in the vITS to specify the number
of events supported by the domain. For DOM0 it will be equal to the
physical ITS.

But this check is also unnecessary.
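
Something along these lines is what I have in mind. It is only a sketch: the struct layouts, field names and the helper are hypothetical and not taken from the series. The limit lives in the per-domain vITS, and the range check happens when the LPI is about to be injected rather than when the command is parsed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-domain virtual ITS state. */
struct sketch_vits {
    uint32_t nr_events;   /* events this domain's vITS exposes */
};

/* Hypothetical ITT entry, as a guest could have written it. */
struct sketch_vitt {
    bool valid;
    uint8_t vcollection;
    uint32_t vlpi;
};

/*
 * Validate at injection time against the virtual limit, not the host's.
 * The guest may have rewritten the ITT since the mapping command ran, so
 * nothing checked at command time can be relied upon here.
 */
static bool sketch_can_inject(const struct sketch_vits *vits,
                              const struct sketch_vitt *entry,
                              uint32_t event)
{
    assert(vits != NULL && entry != NULL);

    if ( event >= vits->nr_events )
        return false;

    return entry->valid;
}
```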

 +return -EINVAL;
 +
 +vits_set_vitt_entry(v-domain, dev_id, event, entry);
 +}
 +
 +return 0;
 +}
 +
 +static int vgic_its_process_movi(struct vcpu *v, struct vgic_its *vits,
 + its_cmd_block *virt_cmd)

virt_cmd is not modified, please use const.

 +{
 +struct vitt entry;
 +struct vits_device *vdev;
 +uint32_t dev_id, event;
 +uint8_t vcol_id;

struct domain *d = v-domain for a better abstraction in the code.

 +
 +dev_id = its_decode_devid(v-domain, virt_cmd);
 +vcol_id = virt_cmd-movi.col;
 +event = virt_cmd-movi.event;
 +
 +DPRINTK(vITS:d%dv%d MOVI: dev_id 0x%x vcol_id %d event %d\n,
 +v-domain-domain_id, v-vcpu_id, dev_id, vcol_id, event);
 +if ( vcol_id  (v-domain-max_vcpus + 1)  || event  
 its_get_nr_events() )
 +return -EINVAL;
 +
 +/* Enable validation later 

Re: [Xen-devel] [PATCH 1/2] NetBSDRump: provide evtchn.h and privcmd.h

2015-06-24 Thread Wei Liu
On Wed, Jun 24, 2015 at 12:22:46PM +0200, Roger Pau Monné wrote:
 El 24/06/15 a les 12.10, Wei Liu ha escrit:
+#define IOCTL_PRIVCMD_MMAP \
  +_IOW('P', 2, privcmd_mmap_t)
  +#define IOCTL_PRIVCMD_MMAPBATCH\
  +_IOW('P', 3, privcmd_mmapbatch_t)
 
 FWIW you could have gotten away with just implementing
 IOCTL_PRIVCMD_MMAPBATCH, this is what I did on FreeBSD.
 

I was too lazy to change libxc code so I implemented both in rump
kernel. It's just plumbing through some minios functions.

Wei.

 Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 06/17] x86/hvm: unify internal portio and mmio intercepts

2015-06-24 Thread Paul Durrant
The implementation of mmio and portio intercepts is unnecessarily different.
This leads to much code duplication. This patch unifies much of the
intercept handling, leaving only distinct handlers for stdvga mmio and dpci
portio. Subsequent patches will unify those handlers.
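
The shape of the unification can be sketched independently of the Xen sources. Everything below is an assumption made for illustration (the sketch_* names, the two-entry table, the range-based accept); the real handlers carry more state. The idea is that one table and one lookup loop serve both port and memory-mapped accesses, with the differences hidden behind an ops structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_type { SKETCH_PIO, SKETCH_MMIO };   /* hypothetical */

struct sketch_io_handler;

struct sketch_io_ops {
    bool (*accept)(const struct sketch_io_handler *h, uint64_t addr,
                   uint32_t size);
    int (*read)(const struct sketch_io_handler *h, uint64_t addr,
                uint32_t size, uint64_t *data);
};

struct sketch_io_handler {
    enum sketch_type type;
    uint64_t base, len;               /* claimed range */
    const struct sketch_io_ops *ops;
};

/* Accept an access that lies entirely inside the claimed range. */
static bool range_accept(const struct sketch_io_handler *h, uint64_t addr,
                         uint32_t size)
{
    return addr >= h->base && addr + size <= h->base + h->len;
}

/* Illustrative read: returns the handler's base so callers can tell. */
static int base_read(const struct sketch_io_handler *h, uint64_t addr,
                     uint32_t size, uint64_t *data)
{
    (void)addr; (void)size;
    *data = h->base;
    return 0;
}

static const struct sketch_io_ops range_ops = {
    .accept = range_accept,
    .read = base_read
};

/* Illustrative addresses only. */
static const struct sketch_io_handler sketch_table[] = {
    { SKETCH_PIO,  0xcf8,      4,     &range_ops },
    { SKETCH_MMIO, 0xfed00000, 0x400, &range_ops },
};

/* One lookup loop serves both port and memory-mapped accesses. */
static const struct sketch_io_handler *
sketch_find(const struct sketch_io_handler *tbl, size_t n,
            enum sketch_type type, uint64_t addr, uint32_t size)
{
    size_t i;

    assert(tbl != NULL || n == 0);

    for ( i = 0; i < n; i++ )
        if ( tbl[i].type == type && tbl[i].ops->accept(&tbl[i], addr, size) )
            return &tbl[i];

    return NULL;
}
```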

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c|   11 +-
 xen/arch/x86/hvm/hpet.c   |4 +-
 xen/arch/x86/hvm/hvm.c|7 +-
 xen/arch/x86/hvm/intercept.c  |  502 +
 xen/arch/x86/hvm/stdvga.c |   30 +-
 xen/arch/x86/hvm/vioapic.c|4 +-
 xen/arch/x86/hvm/vlapic.c |5 +-
 xen/arch/x86/hvm/vmsi.c   |7 +-
 xen/drivers/passthrough/amd/iommu_guest.c |   30 +-
 xen/include/asm-x86/hvm/domain.h  |1 +
 xen/include/asm-x86/hvm/io.h  |  119 +++
 11 files changed, 350 insertions(+), 370 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 4d11c6c..9ced81b 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -143,16 +143,7 @@ static int hvmemul_do_io(
 hvmtrace_io_assist(p);
 }
 
-if ( is_mmio )
-{
-rc = hvm_mmio_intercept(p);
-if ( rc == X86EMUL_UNHANDLEABLE )
-rc = hvm_buffered_io_intercept(p);
-}
-else
-{
-rc = hvm_portio_intercept(p);
-}
+rc = hvm_io_intercept(p);
 
 switch ( rc )
 {
diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index 9585ca8..8958873 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -504,7 +504,7 @@ static int hpet_range(struct vcpu *v, unsigned long addr)
  (addr  (HPET_BASE_ADDRESS + HPET_MMAP_SIZE)) );
 }
 
-const struct hvm_mmio_ops hpet_mmio_ops = {
+static const struct hvm_mmio_ops hpet_mmio_ops = {
 .check = hpet_range,
 .read  = hpet_read,
 .write = hpet_write
@@ -659,6 +659,8 @@ void hpet_init(struct domain *d)
 h-hpet.comparator64[i] = ~0ULL;
 h-pt[i].source = PTSRC_isa;
 }
+
+register_mmio_handler(d, hpet_mmio_ops);
 }
 
 void hpet_deinit(struct domain *d)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 535d622..c10db78 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1465,11 +1465,12 @@ int hvm_domain_initialise(struct domain *d)
 goto fail0;
 
 d->arch.hvm_domain.params = xzalloc_array(uint64_t, HVM_NR_PARAMS);
-d->arch.hvm_domain.io_handler = xmalloc(struct hvm_io_handler);
+d->arch.hvm_domain.io_handler = xzalloc_array(struct hvm_io_handler,
+  NR_IO_HANDLERS);
 rc = -ENOMEM;
 if ( !d->arch.hvm_domain.params || !d->arch.hvm_domain.io_handler )
 goto fail1;
-d->arch.hvm_domain.io_handler->num_slot = 0;
+d->arch.hvm_domain.io_handler_count = 0;
 
 /* Set the default IO Bitmap. */
 if ( is_hardware_domain(d) )
@@ -1506,6 +1507,8 @@ int hvm_domain_initialise(struct domain *d)
 
 rtc_init(d);
 
+msixtbl_init(d);
+
 register_portio_handler(d, 0xe9, 1, hvm_print_line);
 register_portio_handler(d, 0xcf8, 4, hvm_access_cf8);
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index cc44733..4db024e 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -32,205 +32,97 @@
 #include <xen/event.h>
 #include <xen/iommu.h>
 
-static const struct hvm_mmio_ops *const
-hvm_mmio_handlers[HVM_MMIO_HANDLER_NR] =
+static bool_t hvm_mmio_accept(struct hvm_io_handler *handler,
+  uint64_t addr,
+  uint64_t size)
 {
-&hpet_mmio_ops,
-&vlapic_mmio_ops,
-&vioapic_mmio_ops,
-&msixtbl_mmio_ops,
-&iommu_mmio_ops
-};
+BUG_ON(handler->type != IOREQ_TYPE_COPY);
+
+return handler->u.mmio.ops->check(current, addr);
+}
 
-static int hvm_mmio_access(struct vcpu *v,
-   ioreq_t *p,
-   hvm_mmio_read_t read,
-   hvm_mmio_write_t write)
+static int hvm_mmio_read(struct hvm_io_handler *handler,
+ uint64_t addr,
+ uint64_t size,
+ uint64_t *data)
 {
-struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
-unsigned long data;
-int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
+BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
-if ( !p->data_is_ptr )
-{
-if ( p->dir == IOREQ_READ )
-{
-if ( vio->mmio_retrying )
-{
-if ( vio->mmio_large_read_bytes != p->size )
-return X86EMUL_UNHANDLEABLE;
-memcpy(&data, vio->mmio_large_read, p->size);
-vio->mmio_large_read_bytes = 0;
-   

[Xen-devel] [PATCH v4 01/17] x86/hvm: simplify hvmemul_do_io()

2015-06-24 Thread Paul Durrant
Currently hvmemul_do_io() handles paging for I/O to/from a guest address
inline. This causes every exit point to have to execute:

if ( ram_page )
put_page(ram_page);

This patch introduces wrapper hvmemul_do_io_addr() and
hvmemul_do_io_buffer() functions. The latter is used for I/O to/from a Xen
buffer and thus the complexity of paging can be restricted only to the
former, making the common hvmemul_do_io() function less convoluted.

This patch also tightens up some types and introduces pio/mmio wrappers
for the above functions with comments to document their semantics.
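
As a rough illustration of the refactoring described above (not the
patch's actual code), the pattern is a common routine with several exit
points, wrapped by one function that takes and drops the page reference
and by another that needs no reference at all. Every name in this sketch
(do_io, do_io_addr, do_io_buffer, struct page, the EMUL_* codes) is a
hypothetical stand-in, not a Xen identifier:

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these are Xen's real types or functions. */
enum { EMUL_OKAY, EMUL_UNHANDLEABLE, EMUL_RETRY };

struct page { int refcount; };

static int acquire_page(struct page *pg) { pg->refcount++; return EMUL_OKAY; }
static void release_page(struct page *pg) { pg->refcount--; }

/* Common path: several exit points, none of which touches a page reference. */
static int do_io(unsigned int size, int data_is_addr)
{
    (void)data_is_addr;

    /* Reject zero, oversized and non-power-of-two sizes. */
    if ( size == 0 || size > sizeof(long) || (size & (size - 1)) )
        return EMUL_UNHANDLEABLE;

    return EMUL_OKAY;
}

/* I/O to/from a guest address: the only place that takes and drops the ref. */
static int do_io_addr(struct page *pg, unsigned int size)
{
    int rc = acquire_page(pg);

    if ( rc != EMUL_OKAY )
        return rc;

    rc = do_io(size, 1);
    release_page(pg);

    return rc;
}

/* I/O to/from a hypervisor-owned buffer: no paging concerns at all. */
static int do_io_buffer(unsigned int size)
{
    return do_io(size, 0);
}
```

With this split, an early return inside do_io() cannot leak the
reference, because release_page() is called in exactly one place.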

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c|  278 -
 xen/arch/x86/hvm/io.c |4 +-
 xen/include/asm-x86/hvm/emulate.h |   17 ++-
 3 files changed, 198 insertions(+), 101 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index ac9c9d6..9d7af0c 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -51,41 +51,23 @@ static void hvmtrace_io_assist(int is_mmio, ioreq_t *p)
 }
 
 static int hvmemul_do_io(
-int is_mmio, paddr_t addr, unsigned long *reps, int size,
-paddr_t ram_gpa, int dir, int df, void *p_data)
+bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
+uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
 {
 struct vcpu *curr = current;
-struct hvm_vcpu_io *vio;
+struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
 ioreq_t p = {
 .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
 .addr = addr,
 .size = size,
 .dir = dir,
 .df = df,
-.data = ram_gpa,
-.data_is_ptr = (p_data == NULL),
+.data = data,
+.data_is_ptr = data_is_addr, /* ioreq_t field name is misleading */
 };
-unsigned long ram_gfn = paddr_to_pfn(ram_gpa);
-p2m_type_t p2mt;
-struct page_info *ram_page;
+void *p_data = (void *)data;
 int rc;
 
-/* Check for paged out page */
-ram_page = get_page_from_gfn(curr->domain, ram_gfn, &p2mt, P2M_UNSHARE);
-if ( p2m_is_paging(p2mt) )
-{
-if ( ram_page )
-put_page(ram_page);
-p2m_mem_paging_populate(curr->domain, ram_gfn);
-return X86EMUL_RETRY;
-}
-if ( p2m_is_shared(p2mt) )
-{
-if ( ram_page )
-put_page(ram_page);
-return X86EMUL_RETRY;
-}
-
 /*
  * Weird-sized accesses have undefined behaviour: we discard writes
  * and read all-ones.
@@ -93,23 +75,10 @@ static int hvmemul_do_io(
 if ( unlikely((size > sizeof(long)) || (size & (size - 1))) )
 {
 gdprintk(XENLOG_WARNING, "bad mmio size %d\n", size);
-ASSERT(p_data != NULL); /* cannot happen with a REP prefix */
-if ( dir == IOREQ_READ )
-memset(p_data, ~0, size);
-if ( ram_page )
-put_page(ram_page);
 return X86EMUL_UNHANDLEABLE;
 }
 
-if ( !p.data_is_ptr && (dir == IOREQ_WRITE) )
-{
-memcpy(&p.data, p_data, size);
-p_data = NULL;
-}
-
-vio = &curr->arch.hvm_vcpu.hvm_io;
-
-if ( is_mmio && !p.data_is_ptr )
+if ( is_mmio && !data_is_addr )
 {
 /* Part of a multi-cycle read or write? */
 if ( dir == IOREQ_WRITE )
@@ -117,11 +86,7 @@ static int hvmemul_do_io(
 paddr_t pa = vio->mmio_large_write_pa;
 unsigned int bytes = vio->mmio_large_write_bytes;
 if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-{
-if ( ram_page )
-put_page(ram_page);
 return X86EMUL_OKAY;
-}
 }
 else
 {
@@ -131,8 +96,6 @@ static int hvmemul_do_io(
 {
 memcpy(p_data, &vio->mmio_large_read[addr - pa],
size);
-if ( ram_page )
-put_page(ram_page);
 return X86EMUL_OKAY;
 }
 }
@@ -144,40 +107,28 @@ static int hvmemul_do_io(
 break;
 case HVMIO_completed:
 vio->io_state = HVMIO_none;
-if ( p_data == NULL )
-{
-if ( ram_page )
-put_page(ram_page);
+if ( data_is_addr || dir == IOREQ_WRITE )
 return X86EMUL_UNHANDLEABLE;
-}
 goto finish_access;
 case HVMIO_dispatched:
 /* May have to wait for previous cycle of a multi-write to complete. */
-if ( is_mmio && !p.data_is_ptr && (dir == IOREQ_WRITE) &&
+if ( is_mmio && !data_is_addr && (dir == IOREQ_WRITE) &&
  (addr == (vio->mmio_large_write_pa +
 vio->mmio_large_write_bytes)) )
-{
-if ( ram_page )
-put_page(ram_page);
 return X86EMUL_RETRY;
-}
 /* fallthrough */
 default:
-

[Xen-devel] [PATCH v4 09/17] x86/hvm: unify stdvga mmio intercept with standard mmio intercept

2015-06-24 Thread Paul Durrant
It's clear from the following check in hvmemul_rep_movs:

if ( sp2mt == p2m_mmio_direct || dp2mt == p2m_mmio_direct ||
 (sp2mt == p2m_mmio_dm && dp2mt == p2m_mmio_dm) )
return X86EMUL_UNHANDLEABLE;

that mmio <-> mmio copy is not handled. This means the code in the
stdvga mmio intercept that explicitly handles mmio <-> mmio copy when
hvm_copy_to/from_guest_phys() fails is never going to be executed.

This patch therefore adds a check in hvmemul_do_io_addr() to make sure
mmio <-> mmio is disallowed and then registers standard mmio intercept ops
in stdvga_init().

With this patch all mmio and portio handled within Xen now goes through
process_io_intercept().
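
To make the consequence of the quoted check concrete, here is a minimal
stand-alone sketch of the same predicate. The enum and function names
are hypothetical and only mirror p2m_mmio_direct and p2m_mmio_dm from
the check; this is not code from the tree:

```c
#include <stdbool.h>

/* Hypothetical page-type tags; the names only mirror those in the check. */
enum page_kind { KIND_RAM, KIND_MMIO_DIRECT, KIND_MMIO_DM };

/* True when a copy between the two page kinds would be refused. */
static bool copy_refused(enum page_kind src, enum page_kind dst)
{
    return src == KIND_MMIO_DIRECT || dst == KIND_MMIO_DIRECT ||
           (src == KIND_MMIO_DM && dst == KIND_MMIO_DM);
}
```

A copy with emulated MMIO on exactly one side is still accepted, while
one with emulated MMIO on both sides is refused, which is why a handler
for the latter case has nothing left to do.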

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c   |9 +++
 xen/arch/x86/hvm/intercept.c |7 --
 xen/arch/x86/hvm/stdvga.c|  173 +-
 xen/include/asm-x86/hvm/io.h |1 -
 4 files changed, 44 insertions(+), 146 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 9ced81b..4e2fdf1 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -267,6 +267,15 @@ static int hvmemul_acquire_page(unsigned long gmfn, struct 
page_info **page)
 return X86EMUL_RETRY;
 }
 
+/* This code should not be reached if the gmfn is not RAM */
+if ( p2m_is_mmio(p2mt) )
+{
+domain_crash(curr_d);
+
+put_page(*page);
+return X86EMUL_UNHANDLEABLE;
+}
+
 return X86EMUL_OKAY;
 }
 
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 5633959..625e585 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -279,13 +279,6 @@ int hvm_io_intercept(ioreq_t *p)
 {
 struct hvm_io_handler *handler;
 
-if ( p->type == IOREQ_TYPE_COPY )
-{
-int rc = stdvga_intercept_mmio(p);
-if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
-return rc;
-}
-
 handler = hvm_find_io_handler(p);
 
 if ( handler == NULL )
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index dcd532a..639da6a 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -275,9 +275,10 @@ static uint8_t stdvga_mem_readb(uint64_t addr)
 return ret;
 }
 
-static uint64_t stdvga_mem_read(uint64_t addr, uint64_t size)
+static int stdvga_mem_read(struct vcpu *v, unsigned long addr,
+   unsigned long size, unsigned long *p_data)
 {
-uint64_t data = 0;
+unsigned long data = 0;
 
 switch ( size )
 {
@@ -313,7 +314,9 @@ static uint64_t stdvga_mem_read(uint64_t addr, uint64_t 
size)
 break;
 }
 
-return data;
+*p_data = data;
+
+return X86EMUL_OKAY;
 }
 
 static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
@@ -424,8 +427,17 @@ static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
 }
 }
 
-static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
+static int stdvga_mem_write(struct vcpu *v, unsigned long addr,
+unsigned long size, unsigned long data)
 {
+ioreq_t p = { .type = IOREQ_TYPE_COPY,
+  .addr = addr,
+  .size = size,
+  .count = 1,
+  .dir = IOREQ_WRITE,
+  .data = data,
+};
+
 /* Intercept mmio write */
 switch ( size )
 {
@@ -460,153 +472,36 @@ static void stdvga_mem_write(uint64_t addr, uint64_t 
data, uint64_t size)
 gdprintk(XENLOG_WARNING, "invalid io size: %"PRId64"\n", size);
 break;
 }
-}
-
-static uint32_t read_data;
-
-static int mmio_move(struct hvm_hw_stdvga *s, ioreq_t *p)
-{
-int i;
-uint64_t addr = p->addr;
-p2m_type_t p2mt;
-struct domain *d = current->domain;
-int step = p->df ? -p->size : p->size;
 
-if ( p->data_is_ptr )
-{
-uint64_t data = p->data, tmp;
-
-if ( p->dir == IOREQ_READ )
-{
-for ( i = 0; i < p->count; i++ )
-{
-tmp = stdvga_mem_read(addr, p->size);
-if ( hvm_copy_to_guest_phys(data, &tmp, p->size) !=
- HVMCOPY_okay )
-{
-struct page_info *dp = get_page_from_gfn(
-d, data >> PAGE_SHIFT, &p2mt, P2M_ALLOC);
-/*
- * The only case we handle is vga_mem <-> vga_mem.
- * Anything else disables caching and leaves it to qemu-dm.
- */
-if ( (p2mt != p2m_mmio_dm) || (data < VGA_MEM_BASE) ||
- ((data + p->size) > (VGA_MEM_BASE + VGA_MEM_SIZE)) )
-{
-if ( dp )
-put_page(dp);
-return 0;
-}
-ASSERT(!dp);
- 

[Xen-devel] [PATCH v4 03/17] x86/hvm: remove extraneous parameter from hvmtrace_io_assist()

2015-06-24 Thread Paul Durrant
The is_mmio parameter can be inferred from the ioreq type.

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index b412302..935eab3 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -23,8 +23,9 @@
 #include <asm/hvm/support.h>
 #include <asm/hvm/svm/svm.h>
 
-static void hvmtrace_io_assist(int is_mmio, ioreq_t *p)
+static void hvmtrace_io_assist(ioreq_t *p)
 {
+bool_t is_mmio = (p->type == IOREQ_TYPE_COPY);
 unsigned int size, event;
 unsigned char buffer[12];
 
@@ -139,7 +140,7 @@ static int hvmemul_do_io(
 if ( !data_is_addr )
 memcpy(&p.data, p_data, size);
 
-hvmtrace_io_assist(is_mmio, &p);
+hvmtrace_io_assist(&p);
 }
 
 if ( is_mmio )
@@ -200,7 +201,7 @@ static int hvmemul_do_io(
  finish_access:
 if ( dir == IOREQ_READ )
 {
-hvmtrace_io_assist(is_mmio, &p);
+hvmtrace_io_assist(&p);
 
 if ( !data_is_addr )
 memcpy(p_data, &vio->io_data, size);
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 08/17] x86/hvm: unify dpci portio intercept with standard portio intercept

2015-06-24 Thread Paul Durrant
This patch re-works the dpci portio intercepts so that they can be unified
with standard portio handling thereby removing a substantial amount of
code duplication.
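
The unification can be pictured as each registered handler carrying its
own pointer to a small table of operations, so the dispatch loop no
longer needs to special-case a handler family. The sketch below is only
an illustration under that assumption; struct handler, struct io_ops,
find_handler() and the two example op tables are hypothetical names and
are far simpler than the structures touched by this patch:

```c
#include <stddef.h>
#include <stdint.h>

struct handler;

/* Per-family operations, reached through the handler itself. */
struct io_ops {
    int (*accept)(const struct handler *h, uint64_t addr, uint64_t size);
    int (*read)(const struct handler *h, uint64_t addr, uint64_t size,
                uint64_t *data);
};

struct handler {
    const struct io_ops *ops;
    uint64_t start, end;            /* half-open range [start, end) */
};

/* Accept only accesses that lie entirely inside the handler's range. */
static int range_accept(const struct handler *h, uint64_t addr, uint64_t size)
{
    return addr >= h->start && addr <= h->end && size <= h->end - addr;
}

static int read_ones(const struct handler *h, uint64_t addr, uint64_t size,
                     uint64_t *data)
{
    (void)h; (void)addr; (void)size;
    *data = ~0ULL;
    return 0;
}

static int read_offset(const struct handler *h, uint64_t addr, uint64_t size,
                       uint64_t *data)
{
    (void)size;
    *data = addr - h->start;
    return 0;
}

static const struct io_ops ones_ops   = { range_accept, read_ones };
static const struct io_ops offset_ops = { range_accept, read_offset };

/* One lookup loop for every handler family: ask each handler's own ops. */
static const struct handler *find_handler(const struct handler *tbl, size_t n,
                                          uint64_t addr, uint64_t size)
{
    for ( size_t i = 0; i < n; i++ )
        if ( tbl[i].ops->accept(&tbl[i], addr, size) )
            return &tbl[i];

    return NULL;
}
```

Adding another family then means adding another op table and a
registration call, with no new branch in the lookup or dispatch code.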

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/hvm.c |2 +
 xen/arch/x86/hvm/intercept.c   |   22 ++--
 xen/arch/x86/hvm/io.c  |  225 +---
 xen/include/asm-x86/hvm/io.h   |8 ++
 xen/include/asm-x86/hvm/vcpu.h |2 +
 xen/include/xen/iommu.h|1 -
 6 files changed, 88 insertions(+), 172 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c10db78..f8486f4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1486,6 +1486,8 @@ int hvm_domain_initialise(struct domain *d)
 else
 d->arch.hvm_domain.io_bitmap = hvm_io_bitmap;
 
+register_dpci_portio_handler(d);
+
 if ( is_pvh_domain(d) )
 {
 register_portio_handler(d, 0, 0x10003, handle_pvh_io);
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 5e8d8b2..5633959 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -116,10 +116,7 @@ static int hvm_process_io_intercept(struct hvm_io_handler 
*handler,
 ioreq_t *p)
 {
 struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
-const struct hvm_io_ops *ops =
-(p->type == IOREQ_TYPE_COPY) ?
-&mmio_ops :
-&portio_ops;
+const struct hvm_io_ops *ops = handler->ops;
 int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
 uint64_t data;
 uint64_t addr;
@@ -237,16 +234,13 @@ static struct hvm_io_handler *hvm_find_io_handler(ioreq_t 
*p)
 {
 struct vcpu *curr = current;
 struct domain *curr_d = curr->domain;
-const struct hvm_io_ops *ops =
-(p->type == IOREQ_TYPE_COPY) ?
-&mmio_ops :
-&portio_ops;
 unsigned int i;
 
 for ( i = 0; i < curr_d->arch.hvm_domain.io_handler_count; i++ )
 {
 struct hvm_io_handler *handler =
 &curr_d->arch.hvm_domain.io_handler[i];
+const struct hvm_io_ops *ops = handler->ops;
 uint64_t start, end, count = p->count, size = p->size;
 
 if ( handler->type != p->type )
@@ -285,13 +279,7 @@ int hvm_io_intercept(ioreq_t *p)
 {
 struct hvm_io_handler *handler;
 
-if ( p->type == IOREQ_TYPE_PIO )
-{
-int rc = dpci_ioport_intercept(p);
-if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
-return rc;
-}
-else if ( p->type == IOREQ_TYPE_COPY )
+if ( p->type == IOREQ_TYPE_COPY )
 {
 int rc = stdvga_intercept_mmio(p);
 if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) )
@@ -306,7 +294,7 @@ int hvm_io_intercept(ioreq_t *p)
 return hvm_process_io_intercept(handler, p);
 }
 
-static struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
+struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
 {
 unsigned int i = d->arch.hvm_domain.io_handler_count++;
 
@@ -321,6 +309,7 @@ void register_mmio_handler(struct domain *d, const struct 
hvm_mmio_ops *ops)
 struct hvm_io_handler *handler = hvm_next_io_handler(d);
 
 handler->type = IOREQ_TYPE_COPY;
+handler->ops = &mmio_ops;
 handler->u.mmio.ops = ops;
 }
 
@@ -330,6 +319,7 @@ void register_portio_handler(struct domain *d, uint32_t 
addr,
 struct hvm_io_handler *handler = hvm_next_io_handler(d);
 
 handler->type = IOREQ_TYPE_PIO;
+handler->ops = &portio_ops;
 handler->u.portio.start = addr;
 handler->u.portio.end = addr + size;
 handler->u.portio.action = action;
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index c0964ec..51ef19a 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -208,185 +208,100 @@ void hvm_io_assist(ioreq_t *p)
 }
 }
 
-static int dpci_ioport_read(uint32_t mport, ioreq_t *p)
+static bool_t dpci_portio_accept(struct hvm_io_handler *handler,
+ uint64_t addr,
+ uint64_t size)
 {
-struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
-int rc = X86EMUL_OKAY, i, step = p->df ? -p->size : p->size;
-uint32_t data = 0;
+struct vcpu *curr = current;
+struct hvm_iommu *hd = domain_hvm_iommu(curr->domain);
+struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
+struct g2m_ioport *g2m_ioport;
+uint32_t start, end;
+uint32_t gport = addr, mport;
 
-for ( i = 0; i < p->count; i++ )
+list_for_each_entry( g2m_ioport, &hd->arch.g2m_ioport_list, list )
 {
-if ( vio->mmio_retrying )
-{
-if ( vio->mmio_large_read_bytes != p->size )
-return X86EMUL_UNHANDLEABLE;
-memcpy(&data, vio->mmio_large_read, p->size);
-vio->mmio_large_read_bytes = 0;
-vio->mmio_retrying = 0;
-}
-else switch ( 

[Xen-devel] [PATCH v4 00/17] x86/hvm: I/O emulation cleanup and fix

2015-06-24 Thread Paul Durrant
This patch series re-works much of the code involved in emulation of port
and memory mapped I/O for HVM guests.

The code has become very convoluted and, at least by inspection, certain
emulations will apparently malfunction.

The series is broken down into 17 patches (which are also available in
my xenbits repo: http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git
on the emulation27 branch) as follows:

0001-x86-hvm-simplify-hvmemul_do_io.patch
0002-x86-hvm-remove-hvm_io_pending-check-in-hvmemul_do_io.patch
0003-x86-hvm-remove-extraneous-parameter-from-hvmtrace_io.patch
0004-x86-hvm-remove-multiple-open-coded-chunking-loops.patch
0005-x86-hvm-re-name-struct-hvm_mmio_handler-to-hvm_mmio_.patch
0006-x86-hvm-unify-internal-portio-and-mmio-intercepts.patch
0007-x86-hvm-add-length-to-mmio-check-op.patch
0008-x86-hvm-unify-dpci-portio-intercept-with-standard-po.patch
0009-x86-hvm-unify-stdvga-mmio-intercept-with-standard-mm.patch
0010-x86-hvm-revert-82ed8716b-fix-direct-PCI-port-I-O-emu.patch
0011-x86-hvm-only-call-hvm_io_assist-from-hvm_wait_for_io.patch
0012-x86-hvm-split-I-O-completion-handling-from-state-mod.patch
0013-x86-hvm-remove-HVMIO_dispatched-I-O-state.patch
0014-x86-hvm-remove-hvm_io_state-enumeration.patch
0015-x86-hvm-use-ioreq_t-to-track-in-flight-state.patch
0016-x86-hvm-always-re-emulate-I-O-from-a-buffer.patch
0017-x86-hvm-track-large-memory-mapped-accesses-by-buffer.patch

v2:
 - Removed bogus assertion from patch 16
 - Re-worked patch #17 after basic testing of back-port onto XenServer

v3:
 - Addressed comments from Jan
 - Re-ordered series to bring a couple of more trivial patches to the
   front
 - Backport to XenServer (4.5) now passing automated tests
 - Tested on unstable with QEMU upstream and trad, with and without
   HAP (to force shadow emulation)

v4:
 - Removed previous patch #4 (make sure translated MMIO reads or
   writes fall within a page) and rebased rest of series.
 - Address Jan's comments on previous patch #5 (now patch #4)



[Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

2015-06-24 Thread Paul Durrant
When memory mapped I/O is range checked by internal handlers, the length
of the access should be taken into account.
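
A minimal stand-alone sketch of what taking the length into account
means for a range check follows. It is not one of the handlers changed
below: mmio_range_ok() and its parameters are hypothetical, the region
is assumed to be the half-open range [base, base + region_len), and the
comparison is arranged so that addr + len cannot wrap around:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Accept an access only if all of [addr, addr + len) lies inside
 * [base, base + region_len). Written with subtractions so that no
 * intermediate sum can overflow.
 */
static bool mmio_range_ok(uint64_t base, uint64_t region_len,
                          uint64_t addr, uint64_t len)
{
    if ( addr < base )
        return false;

    uint64_t offset = addr - base;

    return offset <= region_len && len <= region_len - offset;
}
```

A check on the start address alone would accept a 4-byte access that
begins 2 bytes before the end of the region; with the length included
that access is rejected and falls through to the next handler.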

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/hpet.c   |7 ---
 xen/arch/x86/hvm/intercept.c  |2 +-
 xen/arch/x86/hvm/vioapic.c|   17 ++---
 xen/arch/x86/hvm/vlapic.c |8 +---
 xen/arch/x86/hvm/vmsi.c   |   27 ---
 xen/drivers/passthrough/amd/iommu_guest.c |   18 +++---
 xen/include/asm-x86/hvm/io.h  |4 +++-
 7 files changed, 62 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index 8958873..1a1f239 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -498,10 +498,11 @@ static int hpet_write(
 return X86EMUL_OKAY;
 }
 
-static int hpet_range(struct vcpu *v, unsigned long addr)
+static int hpet_range(struct vcpu *v, unsigned long addr,
+  unsigned long length)
 {
-return ( (addr >= HPET_BASE_ADDRESS) &&
- (addr < (HPET_BASE_ADDRESS + HPET_MMAP_SIZE)) );
+return (addr >= HPET_BASE_ADDRESS) &&
+   ((addr + length) < (HPET_BASE_ADDRESS + HPET_MMAP_SIZE));
 }
 
 static const struct hvm_mmio_ops hpet_mmio_ops = {
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index 4db024e..5e8d8b2 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -38,7 +38,7 @@ static bool_t hvm_mmio_accept(struct hvm_io_handler *handler,
 {
 BUG_ON(handler->type != IOREQ_TYPE_COPY);
 
-return handler->u.mmio.ops->check(current, addr);
+return handler->u.mmio.ops->check(current, addr, size);
 }
 
 static int hvm_mmio_read(struct hvm_io_handler *handler,
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 9ad909b..4a9b33e 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -242,12 +242,13 @@ static int vioapic_write(
 return X86EMUL_OKAY;
 }
 
-static int vioapic_range(struct vcpu *v, unsigned long addr)
+static int vioapic_range(struct vcpu *v, unsigned long addr,
+unsigned long length)
 {
 struct hvm_hw_vioapic *vioapic = domain_vioapic(v->domain);
 
-return ((addr >= vioapic->base_address &&
- (addr < vioapic->base_address + VIOAPIC_MEM_LENGTH)));
+return (addr >= vioapic->base_address) &&
+   ((addr + length) <= (vioapic->base_address + VIOAPIC_MEM_LENGTH));
 }
 
 static const struct hvm_mmio_ops vioapic_mmio_ops = {
@@ -466,3 +467,13 @@ void vioapic_deinit(struct domain *d)
 xfree(d-arch.hvm_domain.vioapic);
 d-arch.hvm_domain.vioapic = NULL;
 }
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index f2052cf..7421fc5 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -986,14 +986,16 @@ int hvm_x2apic_msr_write(struct vcpu *v, unsigned int 
msr, uint64_t msr_content)
 return vlapic_reg_write(v, offset, (uint32_t)msr_content);
 }
 
-static int vlapic_range(struct vcpu *v, unsigned long addr)
+static int vlapic_range(struct vcpu *v, unsigned long address,
+unsigned long len)
 {
 struct vlapic *vlapic = vcpu_vlapic(v);
-unsigned long offset  = addr - vlapic_base_address(vlapic);
+unsigned long offset  = address - vlapic_base_address(vlapic);
 
 return !vlapic_hw_disabled(vlapic) &&
!vlapic_x2apic_mode(vlapic) &&
-   (offset < PAGE_SIZE);
+   (address >= vlapic_base_address(vlapic)) &&
+   ((offset + len) <= PAGE_SIZE);
 }
 
 static const struct hvm_mmio_ops vlapic_mmio_ops = {
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 09ea301..61fe391 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -168,14 +168,14 @@ struct msixtbl_entry
 static DEFINE_RCU_READ_LOCK(msixtbl_rcu_lock);
 
 static struct msixtbl_entry *msixtbl_find_entry(
-struct vcpu *v, unsigned long addr)
+struct vcpu *v, unsigned long address, unsigned long len)
 {
 struct msixtbl_entry *entry;
 struct domain *d = v->domain;
 
 list_for_each_entry( entry, &d->arch.hvm_domain.msixtbl_list, list )
-if ( addr >= entry->gtable &&
- addr < entry->gtable + entry->table_len )
+if ( (address >= entry->gtable) &&
+ ((address + len) <= (entry->gtable + entry->table_len)) )
 return entry;
 
 return NULL;
@@ -214,7 +214,7 @@ static int msixtbl_read(
 
 rcu_read_lock(&msixtbl_rcu_lock);
 
-entry = msixtbl_find_entry(v, address);
+entry = msixtbl_find_entry(v, address, len);
 if ( !entry )
 goto out;
 offset = address & (PCI_MSIX_ENTRY_SIZE - 1);
@@ -273,7 

Re: [Xen-devel] [PATCH 5/9] x86/pvh: Set PVH guest's mode in XEN_DOMCTL_set_address_size

2015-06-24 Thread Boris Ostrovsky

On 06/24/2015 03:57 AM, Jan Beulich wrote:

On 24.06.15 at 04:53, boris.ostrov...@oracle.com wrote:

On 06/23/2015 09:22 AM, Jan Beulich wrote:

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2320,12 +2320,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
   v->arch.hvm_vcpu.inject_trap.vector = -1;
   
   if ( is_pvh_domain(d) )

-{
-v->arch.hvm_vcpu.hcall_64bit = 1;/* PVH 32bitfixme. */
-/* This is for hvm_long_mode_enabled(v). */
-v->arch.hvm_vcpu.guest_efer = EFER_LMA | EFER_LME;
   return 0;
-}

With this removed, is there any guarantee that hvm_set_mode()
will be called for each vCPU?

IIUC, the toolstack is required to call XEN_DOMCTL_set_address_size, which
results in a call to switch_compat/native(), which loops over all VCPUs,
calling set_mode.

I don't recall this being a strict requirement. I think a PV 64-bit
guest would start fine without.


We do call it via libxl__build_pv() -> xc_dom_boot_mem_init() ->
arch_setup_mem_init() -> x86_compat().


-boris



Re: [Xen-devel] [PATCH v4 04/17] x86/hvm: remove multiple open coded 'chunking' loops

2015-06-24 Thread Jan Beulich
 On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
 +static int hvmemul_phys_mmio_access(
 +paddr_t gpa, unsigned int size, uint8_t dir, uint8_t **buffer)

As much as the earlier offset you returned via indirection to the
caller was unnecessary, the indirection here seems pointless too.
All callers know how (or don't care) to update the buffer pointer.

 +static int hvmemul_linear_mmio_access(
 +unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
 +uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t translate)
 +{
 +struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
 +unsigned long page_off = gla & (PAGE_SIZE - 1);
 +unsigned int chunk;
 +paddr_t gpa;
 +unsigned long one_rep = 1;
 +int rc;
 +
 +chunk = min_t(unsigned int, size, PAGE_SIZE - page_off);
 +
 +if ( translate )
 +gpa = pfn_to_paddr(vio->mmio_gpfn) | page_off;

"translate" as name for the parameter signaling that the translation
is known is kind of odd - "translated" or "known_gpfn" or some such?
Or invert the meaning?

 +else
 +{
 +rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
 +hvmemul_ctxt);
 +if ( rc != X86EMUL_OKAY )
 +return rc;
 +}
 +
 +for ( ;; )
 +{
 +rc = hvmemul_phys_mmio_access(gpa, chunk, dir, buffer);
 +if ( rc != X86EMUL_OKAY )
 +break;
 +
 +gla += chunk;
 +gpa += chunk;
 +size -= chunk;
 +
 +if ( size == 0 )
 +break;
 +
 +ASSERT((gla & (PAGE_SIZE - 1)) == 0);
 +chunk = min_t(unsigned int, size, PAGE_SIZE);

I think you could just assert that size is now less than PAGE_SIZE.

 +if ( !translate )
 +{
 +rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
 +hvmemul_ctxt);
 +if ( rc != X86EMUL_OKAY )
 +return rc;
 +}

This must be done unconditionally (and gpa doesn't need updating
above then), as the known translation is only for the first byte
(and whatever falls on the same page).
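
To illustrate the point about per-page translation, a chunking loop of
this shape could look roughly as follows. This is only a sketch under
simplifying assumptions: it translates on every iteration rather than
reusing a cached translation for the first chunk, and linear_access(),
the callback types and the demo_* helpers are all hypothetical names,
not functions from the patch:

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical callbacks: translate one linear address, access one chunk. */
typedef int (*translate_fn)(uint64_t gla, uint64_t *gpa);
typedef int (*access_fn)(uint64_t gpa, unsigned int size, uint8_t *buf);

/* Split [gla, gla + size) at page boundaries, translating each page anew. */
static int linear_access(uint64_t gla, unsigned int size, uint8_t *buf,
                         translate_fn translate, access_fn access)
{
    while ( size != 0 )
    {
        unsigned int page_off = (unsigned int)(gla & (PAGE_SIZE - 1));
        unsigned int chunk = PAGE_SIZE - page_off;
        uint64_t gpa;
        int rc;

        if ( chunk > size )
            chunk = size;

        /* A translation is only valid for the page it was obtained for. */
        rc = translate(gla, &gpa);
        if ( rc != 0 )
            return rc;

        rc = access(gpa, chunk, buf);
        if ( rc != 0 )
            return rc;

        gla += chunk;
        buf += chunk;
        size -= chunk;
    }

    return 0;
}

/* Demo helpers: a non-contiguous mapping and a log of the chunks issued. */
static unsigned int demo_calls;
static uint64_t demo_gpa[4];
static unsigned int demo_len[4];

static int demo_translate(uint64_t gla, uint64_t *gpa)
{
    uint64_t frame = (gla / PAGE_SIZE) * 2 + 5;

    *gpa = frame * PAGE_SIZE + (gla % PAGE_SIZE);
    return 0;
}

static int demo_access(uint64_t gpa, unsigned int size, uint8_t *buf)
{
    (void)buf;

    if ( demo_calls >= 4 )
        return -1;

    demo_gpa[demo_calls] = gpa;
    demo_len[demo_calls] = size;
    demo_calls++;
    return 0;
}
```

With the demo mapping, a 4-byte access at linear address 0x1ffe is
issued as two 2-byte chunks whose physical addresses are not adjacent,
which is exactly the case that simply adding chunk to gpa would get
wrong.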

Jan




Re: [Xen-devel] [PATCH v2 06/12] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.

2015-06-24 Thread Andrew Cooper
On 22/06/15 19:56, Ed White wrote:
 From: Ravi Sahita ravi.sah...@intel.com

 Signed-off-by: Ravi Sahita ravi.sah...@intel.com
 ---
  xen/arch/x86/hvm/emulate.c | 13 +++--
  xen/arch/x86/hvm/vmx/vmx.c | 30 ++
  xen/arch/x86/x86_emulate/x86_emulate.c |  8 
  xen/arch/x86/x86_emulate/x86_emulate.h |  4 
  xen/include/asm-x86/hvm/hvm.h  |  2 ++
  5 files changed, 55 insertions(+), 2 deletions(-)

 diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
 index ac9c9d6..e38a2fe 100644
 --- a/xen/arch/x86/hvm/emulate.c
 +++ b/xen/arch/x86/hvm/emulate.c
 @@ -1356,6 +1356,13 @@ static int hvmemul_invlpg(
  return rc;
  }
  
 +static int hvmemul_vmfunc(
 +struct x86_emulate_ctxt *ctxt)
 +{
 +hvm_funcs.ahvm_vcpu_emulate_vmfunc(ctxt->regs);
 +return X86EMUL_OKAY;
 +}

ahvm_vcpu_emulate_vmfunc() should return an X86EMUL code.

 +
  static const struct x86_emulate_ops hvm_emulate_ops = {
  .read  = hvmemul_read,
  .insn_fetch= hvmemul_insn_fetch,
 @@ -1379,7 +1386,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
  .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
  .get_fpu   = hvmemul_get_fpu,
  .put_fpu   = hvmemul_put_fpu,
 -.invlpg= hvmemul_invlpg
 +.invlpg= hvmemul_invlpg,
 +.vmfunc= hvmemul_vmfunc,
  };
  
  static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
 @@ -1405,7 +1413,8 @@ static const struct x86_emulate_ops 
 hvm_emulate_ops_no_write = {
  .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
  .get_fpu   = hvmemul_get_fpu,
  .put_fpu   = hvmemul_put_fpu,
 -.invlpg= hvmemul_invlpg
 +.invlpg= hvmemul_invlpg,
 +.vmfunc= hvmemul_vmfunc,
  };
  
  static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
 diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
 index e8d9c82..ad9e9e4 100644
 --- a/xen/arch/x86/hvm/vmx/vmx.c
 +++ b/xen/arch/x86/hvm/vmx/vmx.c
 @@ -82,6 +82,7 @@ static void vmx_fpu_dirty_intercept(void);
  static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content);
  static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
  static void vmx_invlpg_intercept(unsigned long vaddr);
 +static int vmx_vmfunc_intercept(struct cpu_user_regs* regs);

s/* / */

  
  uint8_t __read_mostly posted_intr_vector;
  
 @@ -1826,6 +1827,20 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
  vmx_vmcs_exit(v);
  }
  
 +static bool_t vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
 +{
 +bool_t rc = 0;
 +
 +if ( !cpu_has_vmx_vmfunc && altp2mhvm_active(current->domain) &&
 + regs->eax == 0 &&
 + p2m_switch_vcpu_altp2m_by_id(current, (uint16_t)regs->ecx) )

Please latch current at the top of the function.  It is inefficient to
access like this.

 +{
 +regs->eip += 3;
 +rc = 1;
 +}
 +return rc;
 +}
 +
  static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
  {
  bool_t rc = 0;
 @@ -1894,6 +1909,7 @@ static struct hvm_function_table __initdata 
 vmx_function_table = {
  .msr_read_intercept   = vmx_msr_read_intercept,
  .msr_write_intercept  = vmx_msr_write_intercept,
  .invlpg_intercept = vmx_invlpg_intercept,
 +.vmfunc_intercept = vmx_vmfunc_intercept,
  .handle_cd= vmx_handle_cd,
  .set_info_guest   = vmx_set_info_guest,
  .set_rdtsc_exiting= vmx_set_rdtsc_exiting,
 @@ -1920,6 +1936,7 @@ static struct hvm_function_table __initdata 
 vmx_function_table = {
  .ahvm_vcpu_update_eptp = vmx_vcpu_update_eptp,
  .ahvm_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
  .ahvm_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
 +.ahvm_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
  };
  
  const struct hvm_function_table * __init start_vmx(void)
 @@ -2091,6 +2108,13 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
  vpid_sync_vcpu_gva(curr, vaddr);
  }
  
 +static int vmx_vmfunc_intercept(struct cpu_user_regs *regs)
 +{
 +gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
 +domain_crash(current->domain);
 +return X86EMUL_OKAY;
 +}
 +
  static int vmx_cr_access(unsigned long exit_qualification)
  {
  struct vcpu *curr = current;
 @@ -2675,6 +2699,7 @@ void vmx_enter_realmode(struct cpu_user_regs *regs)
  regs->eflags |= (X86_EFLAGS_VM | X86_EFLAGS_IOPL);
  }
  
 +

Spurious whitespace change.

  static void vmx_vmexit_ud_intercept(struct cpu_user_regs *regs)
  {
  struct hvm_emulate_ctxt ctxt;
 @@ -3239,6 +3264,11 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
  update_guest_eip();
  break;
  
 +case EXIT_REASON_VMFUNC:
 +if ( vmx_vmfunc_intercept(regs) == X86EMUL_OKAY )

This is currently an unconditional failure, and I don't see subsequent
patches which alter vmx_vmfunc_intercept().  Shouldn't

[Xen-devel] Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions))

2015-06-24 Thread Ian Campbell
Adding Boris+Suravee+Aravind (AMD/SVM maintainers), Dario (NUMA) and Jim
+Anthony (libvirt) to the CC.

TL;DR osstest is exposing issues running on AMD Opteron(tm) Processor
6376 in at least a couple of test cases. It would be good if someone
from AMD could have a look.

The systems here == merlot[01], which seem to be having problems with win7
live migration tests as well as libvirt when starting PV guests. They each
contain AMD Opteron(tm) Processor 6376 processors with 32 threads in 4
nodes and seem to have a strange NUMA layout with no RAM on nodes 1 or
3.

The test history on these machines:
http://logs.test-lab.xenproject.org/osstest/results/host/merlot0.html
http://logs.test-lab.xenproject.org/osstest/results/host/merlot1.html

I just posted some analysis of the windows cases (including experiments
on the old Cambridge test infra with AMD Opteron(tm) Processor 6168
processors) in:
http://lists.xen.org/archives/html/xen-devel/2015-06/msg03713.html

I've also been investigating the libvirt guest-start failures. The
symptom is a 10s timeout starting qemu. Anthony is seeing this with
openstack too and did some analysis in
http://thread.gmane.org/gmane.comp.emulators.xen.devel/246473/focus=249172 
onwards, but it may be that this is unrelated to the osstest failures and that 
for Anthony's scenario the 10s timeout could be explained by the openstack 
tempest tests starting lots of VMs in parallel.

However for the osstests we are starting a single PV domain on an
otherwise idle host. There should be no reason for qemu to take as long
as 10s to come up in that case, even with pessimal NUMA layout (IMHO at
least). By comparison on other hosts starting qemu seems to take 2-4s,
so merlot is at least 2.5-5 times worse.

I tried running some adhoc tests on the old infra tied to the *-frog
machines (which are the Opteron 6168 ones):
http://xenbits.xen.org/people/ianc/tmp/adhoc/37623/
http://xenbits.xen.org/people/ianc/tmp/adhoc/37625/
The -xsm failures are because I botched the flight configuration, the
interesting information is that the other ones passed both times
(migrate-support is expected to fail at the moment).

Supposing that the NUMA oddities might be what is exposing this issue I
tried an adhoc run on the merlot machines where I specified
dom0_max_vcpus=8 dom0_nodes=0 on the hypervisor command line:
http://logs.test-lab.xenproject.org/osstest/logs/58853/

Again, I messed up the config for the -xsm case, so ignore.

The interesting thing is that the extra NUMA settings were
apparently _not_ helpful. From
http://logs.test-lab.xenproject.org/osstest/logs/58853/test-amd64-amd64-libvirt/serial-merlot0.log
I can see they were applied:
Jun 23 15:50:34.205057 (XEN) Command line: placeholder conswitch=x watchdog 
com1=115200,8n1 console=com1,vga gdb=com1 dom0_mem=512M,max:512M ucode=scan 
dom0_max_vcpus=8 dom0_nodes=0
[...]
Jun 23 15:50:38.309057 (XEN) Dom0 has maximum 8 VCPUs

The memory info
Jun 23 15:56:27.749008 (XEN) Memory location of each domain:
Jun 23 15:56:27.756965 (XEN) Domain 0 (total: 131072):
Jun 23 15:56:27.756983 (XEN) Node 0: 126905
Jun 23 15:56:27.756998 (XEN) Node 1: 0
Jun 23 15:56:27.764952 (XEN) Node 2: 4167
Jun 23 15:56:27.764969 (XEN) Node 3: 0
suggests at least a small amount of cross-node memory allocation (16M
out of dom0's 512M total). That's probably small enough to be OK.

And it seems as if the 8 dom0 vcpus are correctly pinned to the first 8
cpus (the ones in node 0):
Jun 23 15:56:43.797055 (XEN) VCPU information and callbacks for domain 0:
Jun 23 15:56:43.797110 (XEN) VCPU0: CPU4 [has=F] poll=0 upcall_pend=00 
upcall_mask=00 dirty_cpus={4}
Jun 23 15:56:43.805078 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.813121 (XEN) pause_count=0 pause_flags=1
Jun 23 15:56:43.813157 (XEN) No periodic timer
Jun 23 15:56:43.821050 (XEN) VCPU1: CPU3 [has=F] poll=0 upcall_pend=00 
upcall_mask=00 dirty_cpus={3}
Jun 23 15:56:43.829044 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.829082 (XEN) pause_count=0 pause_flags=1
Jun 23 15:56:43.837051 (XEN) No periodic timer
Jun 23 15:56:43.837084 (XEN) VCPU2: CPU5 [has=F] poll=0 upcall_pend=00 
upcall_mask=00 dirty_cpus={5}
Jun 23 15:56:43.845102 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.853035 (XEN) pause_count=0 pause_flags=1
Jun 23 15:56:43.853071 (XEN) No periodic timer
Jun 23 15:56:43.853099 (XEN) VCPU3: CPU7 [has=F] poll=0 upcall_pend=00 
upcall_mask=00 dirty_cpus={7}
Jun 23 15:56:43.861102 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.869110 (XEN) pause_count=0 pause_flags=1
Jun 23 15:56:43.869145 (XEN) No periodic timer
Jun 23 15:56:43.877014 (XEN) VCPU4: CPU0 [has=F] poll=0 upcall_pend=00 
upcall_mask=00 dirty_cpus={}
Jun 23 15:56:43.877038 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.885053 (XEN) pause_count=0 pause_flags=1
Jun 23 

Re: [Xen-devel] stable trees (was: [xen-4.2-testing test] 58584: regressions)

2015-06-24 Thread Jan Beulich
 On 24.06.15 at 11:06, ian.campb...@citrix.com wrote:
 After that baseline I ran a few tests of just the windows + qemuu stuff:
 http://xenbits.xen.org/people/ianc/tmp/adhoc/37619/ 
 
 was allowing free rein on the machines and was mostly successful, apart
 from the windows-install failure on lake-frog. Looking at the test
 history this seems to have always been a problem on the old infra.
 *-frog are AMD Opteron(tm) Processor 6168 which is as close as the old
 infra has to the new colo's merlot[01] which is AMD Opteron(tm)
 Processor 6376.
 
 With that in mind I reran with things limited to the two frog-* boxes
 and got http://xenbits.xen.org/people/ianc/tmp/adhoc/37624/.
 
 The windows-install of winxpsp3 persisted but there was no migration
 failure elsewhere.
 
 It's not a lot of data, but in comparison with the results in the colo:
 http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-xl-qemuu-win7-amd64/xen-4.5-testing.html
 it looks like it's the newer system which is exposing the issue.

Thanks for doing all of this! While not pointing towards a solution
on the side of the newer systems, it at least reassures us that we
didn't release regressing software with 4.5.1.

Jan




[Xen-devel] [linux-3.4 test] 58845: regressions - FAIL

2015-06-24 Thread osstest service user
flight 58845 linux-3.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58845/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-win7-amd64  6 xen-boot  fail REGR. vs. 30511

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemuu-win7-amd64 9 windows-install fail in 58831 pass in 58845
 test-amd64-amd64-pair 10 xen-boot/dst_host   fail pass in 58798
 test-amd64-amd64-pair 9 xen-boot/src_host   fail pass in 58798
 test-amd64-amd64-xl-sedf-pin  6 xen-boot fail pass in 58798
 test-amd64-i386-pair 10 xen-boot/dst_host   fail pass in 58831
 test-amd64-i386-pair  9 xen-boot/src_host   fail pass in 58831

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-i386-libvirt-xsm   6 xen-boot fail baseline untested
 test-amd64-amd64-xl-multivcpu  6 xen-boot   fail baseline untested
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-libvirt-xsm  6 xen-boot fail baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-sedf  6 xen-boot fail   like 30406
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 30511
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 30511
 test-amd64-amd64-xl-qemuu-ovmf-amd64  6 xen-boot fail like 53709-bisect
 test-amd64-i386-freebsd10-amd64  6 xen-boot fail like 58780-bisect
 test-amd64-i386-xl-qemuu-winxpsp3  6 xen-boot   fail like 58786-bisect
 test-amd64-i386-qemut-rhel6hvm-intel  6 xen-boot fail like 58788-bisect
 test-amd64-i386-rumpuserxen-i386  6 xen-boot fail like 58799-bisect
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  6 xen-boot fail like 58801-bisect
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  6 xen-boot   fail like 58803-bisect
 test-amd64-amd64-xl-qemut-winxpsp3  6 xen-boot  fail like 58804-bisect
 test-amd64-i386-freebsd10-i386  6 xen-boot  fail like 58805-bisect
 test-amd64-i386-xl-qemuu-ovmf-amd64  6 xen-boot fail like 58806-bisect
 test-amd64-amd64-xl-qemuu-winxpsp3  6 xen-boot  fail like 58807-bisect
 test-amd64-i386-xl-qemut-winxpsp3  6 xen-boot   fail like 58808-bisect
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  6 xen-boot fail like 58809-bisect
 test-amd64-amd64-rumpuserxen-amd64  6 xen-boot  fail like 58810-bisect
 test-amd64-i386-xl-qemuu-debianhvm-amd64  6 xen-boot fail like 58811-bisect
 test-amd64-amd64-xl-qemut-debianhvm-amd64  6 xen-boot   fail like 58813-bisect
 test-amd64-i386-qemuu-rhel6hvm-intel  6 xen-boot fail like 58814-bisect
 test-amd64-i386-xl-qemut-debianhvm-amd64  6 xen-boot fail like 58815-bisect

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail in 58831 never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass

version targeted for testing:
 linux cf1b3dad6c5699b977273276bada8597636ef3e2
baseline version:
 linux bb4a05a0400ed6d2f1e13d1f82f289ff74300a70


500 people touched revisions under test,
not listing them all


jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-amd64-i386-xl   pass
 

Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

2015-06-24 Thread Jan Beulich
 On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
 --- a/xen/arch/x86/hvm/hpet.c
 +++ b/xen/arch/x86/hvm/hpet.c
 @@ -498,10 +498,11 @@ static int hpet_write(
  return X86EMUL_OKAY;
  }
  
 -static int hpet_range(struct vcpu *v, unsigned long addr)
 +static int hpet_range(struct vcpu *v, unsigned long addr,
 +  unsigned long length)
  {
 -return ( (addr >= HPET_BASE_ADDRESS) &&
 - (addr < (HPET_BASE_ADDRESS + HPET_MMAP_SIZE)) );
 +return (addr >= HPET_BASE_ADDRESS) &&
 +   ((addr + length) < (HPET_BASE_ADDRESS + HPET_MMAP_SIZE));

<=
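
The boundary case at stake in this hunk can be illustrated with a small sketch. An access whose last byte is the last byte of the region has addr + length equal to the end of the region, so a strict less-than comparison would reject it. DEMO_BASE and DEMO_SIZE are made-up stand-ins for the HPET constants, and demo_range() is a hypothetical helper, not code from the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; stand-ins for the real HPET constants. */
#define DEMO_BASE 0xfed00000UL
#define DEMO_SIZE 1024UL

/*
 * Return true if [addr, addr + length) lies entirely inside
 * [DEMO_BASE, DEMO_BASE + DEMO_SIZE).  The upper comparison is
 * inclusive because addr + length is one past the last byte accessed.
 * This sketch ignores wrap-around of addr + length.
 */
static bool demo_range(unsigned long addr, unsigned long length)
{
    assert(length != 0);
    return (addr >= DEMO_BASE) &&
           ((addr + length) <= (DEMO_BASE + DEMO_SIZE));
}
```

With these values, a 4-byte access at DEMO_BASE + 1020 is accepted, while the same access one byte later is not.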

 --- a/xen/arch/x86/hvm/vlapic.c
 +++ b/xen/arch/x86/hvm/vlapic.c
 @@ -986,14 +986,16 @@ int hvm_x2apic_msr_write(struct vcpu *v, unsigned int 
 msr, uint64_t msr_content)
  return vlapic_reg_write(v, offset, (uint32_t)msr_content);
  }
  
 -static int vlapic_range(struct vcpu *v, unsigned long addr)
 +static int vlapic_range(struct vcpu *v, unsigned long address,
 +unsigned long len)
  {
  struct vlapic *vlapic = vcpu_vlapic(v);
 -unsigned long offset  = addr - vlapic_base_address(vlapic);
 +unsigned long offset  = address - vlapic_base_address(vlapic);
  
  return !vlapic_hw_disabled(vlapic) &&
 !vlapic_x2apic_mode(vlapic) &&
 -   (offset < PAGE_SIZE);
 +   (address >= vlapic_base_address(vlapic)) &&
 +   ((offset + len) <= PAGE_SIZE);

I'd prefer to stay with checking just offset here, unless you see
anything wrong with that.
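
For what it's worth, one way to keep checking only the offset while still taking the length into account is to rely on unsigned wrap-around: if the address is below the base, the subtraction yields a huge offset that fails the bound. This is only a sketch under that assumption; DEMO_PAGE_SIZE and demo_in_page() are hypothetical names, not Xen code:

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_PAGE_SIZE 4096UL

/*
 * Check that [addr, addr + len) lies within the page starting at base,
 * using only the offset.  If addr < base, the unsigned subtraction
 * wraps to a very large value, which can never be <= PAGE_SIZE - len.
 * Comparing against PAGE_SIZE - len (rather than adding len to the
 * offset) avoids a second wrap-around in the sum.
 */
static bool demo_in_page(unsigned long base, unsigned long addr,
                         unsigned long len)
{
    unsigned long offset = addr - base;

    assert(len != 0);
    return (len <= DEMO_PAGE_SIZE) && (offset <= DEMO_PAGE_SIZE - len);
}
```

Note that the naive form (offset + len) <= PAGE_SIZE would accept an address just below the base when the sum wraps, which is presumably why the patch added the explicit lower-bound test.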

 @@ -333,12 +333,15 @@ out:
  return r;
  }
  
 -static int msixtbl_range(struct vcpu *v, unsigned long addr)
 +static int msixtbl_range(struct vcpu *v, unsigned long address,
 + unsigned long len)
  {
 +struct msixtbl_entry *entry;
  const struct msi_desc *desc;
  
  rcu_read_lock(&msixtbl_rcu_lock);
 -desc = msixtbl_addr_to_desc(msixtbl_find_entry(v, addr), addr);
 +entry = msixtbl_find_entry(v, address, len);
 +desc = msixtbl_addr_to_desc(entry, address);

Again I don't see the need to do more adjustments here than
necessary for your purpose.

 --- a/xen/include/asm-x86/hvm/io.h
 +++ b/xen/include/asm-x86/hvm/io.h
 @@ -35,7 +35,9 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
  unsigned long addr,
  unsigned long length,
  unsigned long val);
 -typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
 +typedef int (*hvm_mmio_check_t)(struct vcpu *v,
 +unsigned long addr,
 +unsigned long length);

I don't think this really needs to be long?

Jan




Re: [Xen-devel] [PATCH v3 05/18] x86/hvm: remove multiple open coded 'chunking' loops

2015-06-24 Thread Paul Durrant
 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: 24 June 2015 10:38
 To: Paul Durrant
 Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: Re: [PATCH v3 05/18] x86/hvm: remove multiple open coded
 'chunking' loops
 
  On 23.06.15 at 12:39, paul.durr...@citrix.com wrote:
  --- a/xen/arch/x86/hvm/emulate.c
  +++ b/xen/arch/x86/hvm/emulate.c
  @@ -540,6 +540,115 @@ static int hvmemul_virtual_to_linear(
   return X86EMUL_EXCEPTION;
   }
 
  +static int hvmemul_phys_mmio_access(
  +paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer,
  +unsigned int *off)
 
 Why this (buffer, off) pair? The caller can easily adjust buffer as
 necessary, avoiding the other parameter altogether. And buffer
 itself can be void * just like it is in some of the callers (and the
 others should follow suit).
 

It actually becomes necessary in a later patch, but I'll make the change there 
instead. As for incrementing a void *, I know that MSVC disallows this. I 
believe it is a gcc-ism which I guess clang must tolerate, but I don't think it 
is standard C.
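
As a side note on the portability point: arithmetic on void * is a GNU extension, and the usual standard-C way to advance a generic buffer is to go through a byte pointer. A minimal sketch, with demo_advance() being a hypothetical helper rather than anything in the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Advance a generic buffer pointer by 'bytes' using only standard C:
 * convert to a byte pointer, add, and let the result convert back to
 * void * implicitly on return.
 */
static void *demo_advance(void *buffer, size_t bytes)
{
    assert(buffer != NULL);
    return (uint8_t *)buffer + bytes;
}
```

A caller holding a void * can therefore adjust it before the call without needing a separate offset parameter.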

  +{
  +unsigned long one_rep = 1;
  +unsigned int chunk;
  +int rc = 0;
  +
  +/* Accesses must fall within a page */
  +if ( (gpa & (PAGE_SIZE - 1)) + size > PAGE_SIZE )
  +return X86EMUL_UNHANDLEABLE;
 
 As for patch 4 - this imposes a restriction that real hardware doesn't
 have, and hence this needs to be replaced by adjusting the one caller
 not currently guaranteeing this such that it caps the size.
 

Ok.

  +/*
  + * hvmemul_do_io() cannot handle non-power-of-2 accesses or
  + * accesses larger than sizeof(long), so choose the highest power
  + * of 2 not exceeding sizeof(long) as the 'chunk' size.
  + */
  +chunk = 1 << (fls(size) - 1);
  +if ( chunk > sizeof (long) )
  +chunk = sizeof (long);
 
 I suppose you intentionally generalize this; if so this should be
 mentioned in the commit message. This is particularly because it
 results in changed behavior (which isn't to say that I'm sure the
 previous way was any better in the sense of being closer to what
 real hardware does): Right now, an 8 byte access at the last
 byte of a page would get carried out as 8 1-byte accesses. Your
 change makes it a 1-, 4-, 2-, and 1-byte access in that order.
 

...which is certainly going to be quicker since it's only 4 round-trips to an 
emulator rather than 8.

 Also, considering instruction characteristics (as explained in the
 original comment) I think the old way of determining the chunk
 size may have been cheaper than yours using fls().
 

I thought fls() was generally implemented using inline assembler and was pretty 
fast. I didn't actually check how Xen implements it; I just assumed it would be 
optimal.

  +
  +while ( size != 0 )
  +{
  +rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
  +&buffer[*off]);
  +if ( rc != X86EMUL_OKAY )
  +break;
  +
  +/* Advance to the next chunk */
  +gpa += chunk;
  +*off += chunk;
  +size -= chunk;
  +
  +/*
  + * If the chunk now exceeds the remaining size, choose the next
  + * lowest power of 2 that will fit.
  + */
  +while ( chunk > size )
  +chunk >>= 1;
 
 Please avoid this loop when size == 0. Since the function won't be
 called with size being zero, I think the loop should be a for ( ; ; )
 one with the loop exit condition put in the middle.

Sure.
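
To make the chunking discussion above concrete, here is a rough standalone sketch of the scheme: pick the largest power of 2 not exceeding sizeof(long), shrink it as the remaining size drops, and use a for ( ; ; ) loop with the exit test in the middle so the shrink loop never runs with size == 0. It assumes the caller caps the first piece at the page boundary, which is not shown in the hunk. All demo_* names are hypothetical and the helper replaces fls() with a portable loop:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u
#define DEMO_MAX_CHUNK ((unsigned int)sizeof(long))

/* Largest power of 2 that is <= x; x must be non-zero. */
static unsigned int demo_pow2_floor(unsigned int x)
{
    unsigned int p = 1;

    while ( (p << 1) != 0 && (p << 1) <= x )
        p <<= 1;
    return p;
}

/* Append the chunk sizes for one within-page piece to out[]. */
static size_t demo_split_piece(unsigned int size, unsigned int *out,
                               size_t n, size_t max)
{
    unsigned int chunk;

    assert(size != 0);
    chunk = demo_pow2_floor(size);
    if ( chunk > DEMO_MAX_CHUNK )
        chunk = DEMO_MAX_CHUNK;

    for ( ; ; )
    {
        assert(n < max);
        out[n++] = chunk;      /* stands in for one emulator round-trip */
        size -= chunk;
        if ( size == 0 )       /* exit in the middle of the loop */
            break;
        while ( chunk > size ) /* next lowest power of 2 that fits */
            chunk >>= 1;
    }
    return n;
}

/* Split an access, capping the first piece at the page boundary. */
static size_t demo_split(unsigned long addr, unsigned int size,
                         unsigned int *out, size_t max)
{
    unsigned int off = (unsigned int)(addr & (DEMO_PAGE_SIZE - 1));
    unsigned int first = size;
    size_t n;

    if ( off + size > DEMO_PAGE_SIZE )
        first = DEMO_PAGE_SIZE - off;
    n = demo_split_piece(first, out, 0, max);
    if ( size > first )
        n = demo_split_piece(size - first, out, n, max);
    return n;
}
```

Under these assumptions an 8-byte access starting at the last byte of a page is split into 1-, 4-, 2- and 1-byte chunks, i.e. four round-trips.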

 
  @@ -549,52 +658,26 @@ static int __hvmemul_read(
   struct hvm_emulate_ctxt *hvmemul_ctxt)
   {
   struct vcpu *curr = current;
  -unsigned long addr, reps = 1;
  -unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
  +unsigned long addr, one_rep = 1;
   uint32_t pfec = PFEC_page_present;
   struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
  -paddr_t gpa;
   int rc;
 
   rc = hvmemul_virtual_to_linear(
  -seg, offset, bytes, &reps, access_type, hvmemul_ctxt, &addr);
  +seg, offset, bytes, &one_rep, access_type, hvmemul_ctxt, &addr);
   if ( rc != X86EMUL_OKAY )
   return rc;
  -off = addr & (PAGE_SIZE - 1);
  -/*
  - * We only need to handle sizes actual instruction operands can have. 
  All
  - * such sizes are either powers of 2 or the sum of two powers of 2. 
  Thus
  - * picking as initial chunk size the largest power of 2 not greater 
  than
  - * the total size will always result in only power-of-2 size requests
  - * issued to hvmemul_do_mmio() (hvmemul_do_io() rejects non-
 powers-of-2).
  - */
  -while ( chunk & (chunk - 1) )
  -chunk &= chunk - 1;
  -if ( off + bytes > PAGE_SIZE )
  -while ( off & (chunk - 1) )
  -chunk >>= 1;
 
   if ( ((access_type != hvm_access_insn_fetch
  ? vio->mmio_access.read_access
  : 

Re: [Xen-devel] (xen 4.6 unstable) triple fault when execute fxsave during the procedure of guest iso install

2015-06-24 Thread Jan Beulich
 On 24.06.15 at 11:14, fanhengl...@huawei.com wrote:
 I want to debug the Windows OS install procedure with windbg.
 windbg executes an fxsave instruction after the blank VM is started and
 before the guest ISO starts to install.
 
 
 fxsave triggers the following code path:
 vmx_vmexit_handler(EXIT_REASON_EPT_VIOLATION)
 ->ept_handle_violation
 ->hvm_hap_nested_page_fault
 ->handle_mmio_with_translation
 ->handle_mmio
 ->hvm_emulate_one
 ->x86_emulate
 
 x86_emulate returns X86EMUL_UNHANDLEABLE
 
 The xl dmesg log:
 (d5) Writing SMBIOS tables ...
 (d5) Loading OVMF ...
 (XEN) d5v0 Over-allocation for domain 5: 2097409 > 2097408
 (XEN) memory.c:155:d5v0 Could not allocate order=0 extent: id=5 memflags=0 
 (0 of 1)
 (d5) Loading ACPI ...
 (d5) vm86 TSS at fc012d00
 (d5) BIOS map:
 (d5)  ffe0-: Main BIOS
 (d5) E820 table:
 (d5)  [00]: : - :000a: RAM
 (d5)  HOLE: :000a - :000f
 (d5)  [01]: :000f - :0010: RESERVED
 (d5)  [02]: :0010 - :f000: RAM
 (d5)  HOLE: :f000 - :fc00
 (d5)  [03]: :fc00 - 0001:: RESERVED
 (d5)  [04]: 0001: - 0002:0f6ed000: RAM
 (d5) Invoking OVMF ...
 (XEN) stdvga.c:147:d5v0 entering stdvga and caching modes
 (XEN) stdvga.c:151:d5v0 leaving stdvga
 (XEN) irq.c:276: Dom5 PCI link 0 changed 5 -> 11
 (XEN) irq.c:276: Dom5 PCI link 1 changed 10 -> 11
 (XEN) irq.c:276: Dom5 PCI link 2 changed 11 -> 10
 (XEN) irq.c:276: Dom5 PCI link 3 changed 5 -> 10
 (XEN) MMIO emulation failed: d5v0 64bit @ 0028:efe54dab -> 0f ae 07 fc ff 75 
 10 48 8b 4d 08 48 89 e2 48 83
 (XEN) MMIO emulation failed: d5v0 64bit @ 0028:efe54dab -> 0f ae 07 fc ff 75 
 10 48 8b 4d 08 48 89 e2 48 83
 (XEN) MMIO emulation failed: d5v0 64bit @ 0028:efe54dab -> 0f ae 07 fc ff 75 
 10 48 8b 4d 08 48 89 e2 48 83
 (XEN) d5v0 Triple fault - invoking HVM shutdown action 1

Considering the address (below 4Gb) I'd view it equally possible
that it's OVMF that is running into this (and Windows may not
have got control at all by that time). But as others have said -
unless you're using VM events, it first of all would need to be
understood why fxsave would be issued on MMIO space, which
as a very minimum requires register state to be made visible.
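
For reference, the first three bytes in the "MMIO emulation failed" lines (0f ae 07) decode as an FXSAVE with a memory operand: opcode 0F AE with the ModRM reg field equal to 0 and mod not equal to 3. With mod = 0 and rm = 7 in 64-bit mode the operand is addressed through a single base register. A small sketch of that decoding, where the demo_* names are hypothetical and only this one encoding is handled:

```c
#include <assert.h>
#include <stdint.h>

struct demo_modrm {
    unsigned int mod; /* bits 7:6 */
    unsigned int reg; /* bits 5:3 */
    unsigned int rm;  /* bits 2:0 */
};

/* Split a ModRM byte into its three fields. */
static struct demo_modrm demo_decode_modrm(uint8_t b)
{
    struct demo_modrm m;

    m.mod = (b >> 6) & 3;
    m.reg = (b >> 3) & 7;
    m.rm  = b & 7;
    return m;
}

/*
 * Return non-zero if insn[0..2] is 0F AE /0 with a memory operand,
 * i.e. FXSAVE.  Prefixes (including REX) are not handled here.
 */
static int demo_is_fxsave_mem(const uint8_t *insn)
{
    struct demo_modrm m;

    assert(insn != 0);
    m = demo_decode_modrm(insn[2]);
    return insn[0] == 0x0f && insn[1] == 0xae && m.mod != 3 && m.reg == 0;
}
```

This only confirms which instruction faulted; which register holds the address, and why it points at MMIO space, still needs the register state.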

Jan




Re: [Xen-devel] [PATCH v2 05/12] VMX/altp2m: add code to support EPTP switching and #VE.

2015-06-24 Thread Andrew Cooper
On 22/06/15 19:56, Ed White wrote:
 Implement and hook up the code to enable VMX support of VMFUNC and #VE.

 VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.

 Signed-off-by: Ed White edmund.h.wh...@intel.com
 ---
  xen/arch/x86/hvm/vmx/vmx.c | 132 
 +
  1 file changed, 132 insertions(+)

 diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
 index 2d3ad63..e8d9c82 100644
 --- a/xen/arch/x86/hvm/vmx/vmx.c
 +++ b/xen/arch/x86/hvm/vmx/vmx.c
 @@ -56,6 +56,7 @@
  #include <asm/debugger.h>
  #include <asm/apic.h>
  #include <asm/hvm/nestedhvm.h>
 +#include <asm/hvm/altp2mhvm.h>
  #include <asm/event.h>
  #include <asm/monitor.h>
  #include <public/arch-x86/cpuid.h>
 @@ -1763,6 +1764,100 @@ static void vmx_enable_msr_exit_interception(struct 
 domain *d)
   MSR_TYPE_W);
  }
  
 +static void vmx_vcpu_update_eptp(struct vcpu *v)
 +{
 +struct domain *d = v->domain;
 +struct p2m_domain *p2m = NULL;
 +struct ept_data *ept;
 +
 +if ( altp2mhvm_active(d) )
 +p2m = p2m_get_altp2m(v);
 +if ( !p2m )
 +p2m = p2m_get_hostp2m(d);
 +
 +ept = &p2m->ept;
 +ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
 +
 +vmx_vmcs_enter(v);
 +
 +__vmwrite(EPT_POINTER, ept_get_eptp(ept));
 +
 +if ( v->arch.hvm_vmx.secondary_exec_control &
 +SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
 +__vmwrite(EPTP_INDEX, vcpu_altp2mhvm(v).p2midx);
 +
 +vmx_vmcs_exit(v);
 +}
 +
 +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
 +{
 +struct domain *d = v->domain;
 +u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
 +
 +if ( !cpu_has_vmx_vmfunc )
 +return;
 +
 +if ( cpu_has_vmx_virt_exceptions )
 +mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
 +
 +vmx_vmcs_enter(v);
 +
 +if ( !d->is_dying && altp2mhvm_active(d) )
 +{
 +v->arch.hvm_vmx.secondary_exec_control |= mask;
 +__vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
 +__vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
 +
 +if ( cpu_has_vmx_virt_exceptions )
 +{
 +p2m_type_t t;
 +mfn_t mfn;
 +
 +mfn = get_gfn_query_unlocked(d, vcpu_altp2mhvm(v).veinfo_gfn, 
 &t);

get_gfn_query_unlocked() returns _mfn(INVALID_MFN) in the failure case,
which you must not blindly write back.

 +__vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);

pfn_to_paddr() please, rather than opencoding it.  (This is a helper
which needs cleaning up, name-wise).
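
Roughly what I have in mind for both of the points above, as a standalone sketch rather than real Xen code: reject the invalid sentinel first, then convert the frame number with a helper instead of an open-coded shift. The demo_* names, the sentinel value and the page shift are stand-ins chosen for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT  12
#define DEMO_INVALID_MFN (~0UL) /* stand-in for the failure sentinel */

/* Helper in the spirit of pfn_to_paddr(): frame number to byte address. */
static inline uint64_t demo_pfn_to_paddr(unsigned long pfn)
{
    return (uint64_t)pfn << DEMO_PAGE_SHIFT;
}

/*
 * Compute the address to program for the #VE info page.  Returns false
 * and leaves *paddr untouched if the lookup failed, so the caller never
 * writes a value derived from the sentinel.
 */
static bool demo_veinfo_paddr(unsigned long mfn, uint64_t *paddr)
{
    assert(paddr != 0);
    if ( mfn == DEMO_INVALID_MFN )
        return false;

    *paddr = demo_pfn_to_paddr(mfn);
    return true;
}
```

The caller would then skip the VMCS write (and presumably leave #VE disabled) when the helper reports failure.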

 +}
 +}
 +else
 +v->arch.hvm_vmx.secondary_exec_control &= ~mask;
 +
 +__vmwrite(SECONDARY_VM_EXEC_CONTROL,
 +v->arch.hvm_vmx.secondary_exec_control);
 +
 +vmx_vmcs_exit(v);
 +}
 +
 +static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
 +{
 +bool_t rc = 0;
 +ve_info_t *veinfo = vcpu_altp2mhvm(v).veinfo_gfn ?
 +hvm_map_guest_frame_rw(vcpu_altp2mhvm(v).veinfo_gfn, 0) : NULL;

gfn 0 is a valid (albeit unlikely) location to request the veinfo page. 
Use GFN_INVALID as the sentinel.
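
A sketch of the sentinel approach, with made-up names: an all-ones value marks "no veinfo page registered", which leaves gfn 0 usable as a real location. DEMO_GFN_INVALID and struct demo_vcpu are hypothetical stand-ins, not the actual Xen definitions:

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_GFN_INVALID (~0UL) /* stand-in sentinel: no page registered */

struct demo_vcpu {
    unsigned long veinfo_gfn;
};

/* Initialise with the sentinel rather than relying on zero-fill. */
static void demo_vcpu_init(struct demo_vcpu *v)
{
    assert(v != 0);
    v->veinfo_gfn = DEMO_GFN_INVALID;
}

/* True if the guest has registered a veinfo page, including at gfn 0. */
static bool demo_has_veinfo(const struct demo_vcpu *v)
{
    return v->veinfo_gfn != DEMO_GFN_INVALID;
}
```

The cost is that the field has to be initialised explicitly when the vcpu is set up.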

 +
 +if ( !veinfo )
 +return 0;
 +
 +if ( veinfo->semaphore != 0 )
 +goto out;

The semantics of this semaphore are not clearly spelled out in the
manual.  The only information I can locate concerning this field is in
the note in 25.5.6.1 which says:

Delivery of virtualization exceptions writes the value FFFFFFFFH to
offset 4 in the virtualization-exception information area (see Section
25.5.6.2). Thus, once a virtualization exception occurs, another can
occur only if software clears this field.

I presume this should be taken to mean software writes 0 to this
field, but some clarification would be nice.
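
Under that reading, the protocol would look something like the sketch below: delivery is refused while the 32-bit field at offset 4 is non-zero, delivery sets it to all ones, and the guest handler re-arms by writing 0. The struct layout beyond the first two fields and all demo_* names are assumptions for illustration only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* First two fields of a hypothetical #VE information area. */
struct demo_ve_info {
    uint32_t exit_reason; /* offset 0 */
    uint32_t semaphore;   /* offset 4: non-zero blocks further delivery */
};

/* Try to deliver a #VE; refuse if the previous one is still pending. */
static bool demo_try_deliver_ve(struct demo_ve_info *info, uint32_t reason)
{
    assert(info != 0);
    if ( info->semaphore != 0 )
        return false;

    info->exit_reason = reason;
    info->semaphore = UINT32_MAX; /* all ones, as in the quoted note */
    return true;
}

/* What the guest's handler is presumed to do when it is finished. */
static void demo_guest_rearm(struct demo_ve_info *info)
{
    info->semaphore = 0;
}
```

Using a fixed-width all-ones constant also sidesteps the question of which integer suffix to use when setting the field.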

 +
 +rc = 1;
 +
 +veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
 +veinfo->semaphore = ~0l;

semaphore is declared as an unsigned field, so should use ~0u.

 +veinfo->eptp_index = vcpu_altp2mhvm(v).p2midx;
 +
 +vmx_vmcs_enter(v);
 +__vmread(EXIT_QUALIFICATION, &veinfo->exit_qualification);
 +__vmread(GUEST_LINEAR_ADDRESS, &veinfo->gla);
 +__vmread(GUEST_PHYSICAL_ADDRESS, &veinfo->gpa);
 +vmx_vmcs_exit(v);
 +
 +hvm_inject_hw_exception(TRAP_virtualisation,
 +HVM_DELIVER_NO_ERROR_CODE);
 +
 +out:
 +hvm_unmap_guest_frame(veinfo, 0);
 +return rc;
 +}
 +
  static struct hvm_function_table __initdata vmx_function_table = {
  .name = "VMX",
  .cpu_up_prepare   = vmx_cpu_up_prepare,
 @@ -1822,6 +1917,9 @@ static struct hvm_function_table __initdata 
 vmx_function_table = {
  .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
  .hypervisor_cpuid_leaf = vmx_hypervisor_cpuid_leaf,
  .enable_msr_exit_interception = vmx_enable_msr_exit_interception,
 +.ahvm_vcpu_update_eptp = vmx_vcpu_update_eptp,
 +.ahvm_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
 +.ahvm_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
  };
  
 

Re: [Xen-devel] [PATCH v3 07/18] x86/hvm: unify internal portio and mmio intercepts

2015-06-24 Thread Jan Beulich
 On 23.06.15 at 12:39, paul.durr...@citrix.com wrote:
 The implementation of mmio and portio intercepts is unnecessarily different.
 This leads to much code duplication. This patch unifies much of the
 intercept handling, leaving only distinct handlers for stdvga mmio and dpci
 portio. Subsequent patches will unify those handlers.
 
 Signed-off-by: Paul Durrant paul.durr...@citrix.com
 Cc: Keir Fraser k...@xen.org
 Cc: Jan Beulich jbeul...@suse.com
 Cc: Andrew Cooper andrew.coop...@citrix.com
 ---

Please help reviewers of previous versions of your patches by
summarizing changes in the current version here.

 +static int hvm_portio_read(struct hvm_io_handler *handler,
 +   uint64_t addr,
 +   uint64_t size,
 +   uint64_t *data)
  {
 -struct vcpu *curr = current;
 -unsigned int i;
 +uint32_t val = 0;
 +int rc;
  
 -for ( i = 0; i < HVM_MMIO_HANDLER_NR; ++i )
 -if ( hvm_mmio_handlers[i]->check(curr, gpa) )
 -return 1;
 +BUG_ON(handler->type != IOREQ_TYPE_PIO);
  
 -return 0;
 +rc = handler->u.portio.action(IOREQ_READ, addr, size, &val);
 +if ( rc == X86EMUL_OKAY )
 +*data = val;

I think there would be no harm doing this unconditionally, and it
would eliminate the potential of the caller using uninitialized data.
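
In other words, something along these lines, shown as a standalone sketch with stand-in types: since val is initialised to 0, storing it unconditionally gives the caller a defined value even when the action fails. The demo_* names and return codes are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_OKAY          0
#define DEMO_UNHANDLEABLE  1

typedef int (*demo_portio_action_t)(uint32_t port, uint32_t bytes,
                                    uint32_t *val);

/* An action that fails without touching *val. */
static int demo_failing_action(uint32_t port, uint32_t bytes, uint32_t *val)
{
    (void)port;
    (void)bytes;
    (void)val;
    return DEMO_UNHANDLEABLE;
}

/* Read wrapper that always writes *data, whatever the action returns. */
static int demo_portio_read(demo_portio_action_t action, uint32_t port,
                            uint32_t bytes, uint64_t *data)
{
    uint32_t val = 0;
    int rc;

    assert(action != 0 && data != 0);
    rc = action(port, bytes, &val);
    *data = val; /* unconditional: never leaves *data uninitialised */
    return rc;
}
```

A caller that ignores the return code then sees 0 rather than whatever happened to be in its variable.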

 @@ -284,29 +185,37 @@ static int process_portio_intercept(portio_action_t 
 action, ioreq_t *p)
  {
  for ( i = 0; i < p->count; i++ )
  {
 -data = 0;
 -switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
 -  p->size) )
 +if ( p->data_is_ptr )
  {
 -case HVMCOPY_okay:
 -break;
 -case HVMCOPY_gfn_paged_out:
 -case HVMCOPY_gfn_shared:
 -rc = X86EMUL_RETRY;
 -break;
 -case HVMCOPY_bad_gfn_to_mfn:
 -data = ~0;
 -break;
 -case HVMCOPY_bad_gva_to_gfn:
 -ASSERT(0);
 -/* fall through */
 -default:
 -rc = X86EMUL_UNHANDLEABLE;
 -break;
 +switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
 +  p->size) )
 +{
 +case HVMCOPY_okay:
 +break;
 +case HVMCOPY_gfn_paged_out:
 +case HVMCOPY_gfn_shared:
 +rc = X86EMUL_RETRY;
 +break;
 +case HVMCOPY_bad_gfn_to_mfn:
 +data = ~0;
 +break;
 +case HVMCOPY_bad_gva_to_gfn:
 +ASSERT_UNREACHABLE();
 +/* fall through */
 +default:
 +rc = X86EMUL_UNHANDLEABLE;
 +break;
 +}
 +if ( rc != X86EMUL_OKAY )
 +break;
  }
 -if ( rc != X86EMUL_OKAY )
 -break;
 -rc = action(IOREQ_WRITE, p->addr, p->size, &data);
 +else
 +data = p->data;
 +
 +addr = (p->type == IOREQ_TYPE_COPY) ?
 +p->addr + step * i :
 +p->addr;

Indentation.

 @@ -324,78 +233,133 @@ static int process_portio_intercept(portio_action_t 
 action, ioreq_t *p)
  return rc;
  }
  
 -/*
 - * Check if the request is handled inside xen
 - * return value: 0 --not handled; 1 --handled
 - */
 -int hvm_io_intercept(ioreq_t *p, int type)
 +static struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
 +{
 +struct vcpu *curr = current;
 +struct domain *curr_d = curr->domain;

curr is used only once (here) and hence pointless as a local variable.

 +const struct hvm_io_ops *ops =
 +(p->type == IOREQ_TYPE_COPY) ?
 +&mmio_ops :
 +&portio_ops;
 +unsigned int i;
 +
 +for ( i = 0; i < curr_d->arch.hvm_domain.io_handler_count; i++ )
 +{
 +struct hvm_io_handler *handler =
 +&curr_d->arch.hvm_domain.io_handler[i];
 +uint64_t start, end, count = p->count, size = p->size;

I'm not really happy with all these 64-bit local variables, but I
guess they are the result of you not wanting to do things the
way they're currently done everywhere... From a logical pov,
start and end would better be paddr_t, count and size
unsigned long.

 +if ( handler->type != p->type )
 +continue;
 +
 +switch ( handler->type )
 +{
 +case IOREQ_TYPE_PIO:
 +start = p->addr;
 +end = p->addr + size;
 +break;
 +case IOREQ_TYPE_COPY:
 +if ( p->df )
 +{
 +start = (p->addr - (count - 1) * size);
 +end = p->addr + size;
 +}
 +else
 +{
 +start = p->addr;
 

Re: [Xen-devel] Hyper and Xen Project

2015-06-24 Thread Stefano Stabellini
On Wed, 24 Jun 2015, Dave Scott wrote:
  On 24 Jun 2015, at 12:48, Stefano Stabellini 
  stefano.stabell...@eu.citrix.com wrote:
 
  Hi Wang,
 
  I don't know the answer, so I CCed xen-devel (the Xen development list)
  and a few people that I think will be able to help.
 
  Cheers,
 
  Stefano
 
  On Wed, 24 Jun 2015, Wang Xu wrote:
  A question about channels: where do I find the channel name in the guest?
  The document says I can find it in sysfs, but it looks like there isn't a
  name property:
 
  | root@test-container-create-ubuntu:/sys/bus/xen/devices# udevadm  info 
  --attribute-walk  --path=/devices/console-1
  |
  [...]
  |
  |   looking at device '/devices/console-1':
  | KERNEL==console-1
  | SUBSYSTEM==xen
  | DRIVER==xenconsole
  | ATTR{devtype}==console
  | ATTR{nodename}==device/console/1
 

 I don’t think the frontend driver in Linux knows about the name key. In my 
 testing I wrote a udev script which looks up the ‘name’ key directly in 
 xenstore and created a named device node using that. For reference my script 
 is here:

 https://github.com/mirage/mirage-console/blob/master/udev/xenconsole-setup-tty

That's a great workaround. However I think it would be best to make the
Linux hvc_xen driver aware of the name key going forward.


  I also tested `/dev/hvc1` directly, and it could communicate with the
  outside socket. Is there some mistake in my channel name configuration?
 
  | static void hyper_config_channel(libxl_device_channel* ch, const char* 
  name, const char* sock, int devid) {
  | libxl_device_channel_init(ch);
  | ch->backend_domid = 0;
  | ch->name = strdup(name);
  | ch->devid = devid;
  | ch->connection = LIBXL_CHANNEL_CONNECTION_SOCKET;
  | ch->u.socket.path = strdup(sock);
  | }
 
  I tried to look at the oVirt code as it is mentioned in the doc, but I
  did not find a xen console in its guest agent code.
 
  So the issue is that the name you assign here to the channel, doesn't
  come up anywhere in the guest. Is that correct?


 
 
  Thank you!
 
 
  On Tue, Jun 23, 2015 at 7:30 PM, Stefano Stabellini 
  stefano.stabell...@eu.citrix.com wrote:
   On Tue, 23 Jun 2015, Wang Xu wrote:
  On Sat, Jun 20, 2015 at 1:10 AM Stefano Stabellini 
  stefano.stabell...@eu.citrix.com wrote:
 Integrating hyper with Xen using libxl was the right decision and 
  it
 looks like you did a good job. I think that you can go ahead with 
  the
 PR!
 
 
 But I did have a few issues building hyper. I am getting:
 
 hyperd.go:11:2: cannot find package hyper/daemon in any of:
 [...]
 
  I tried with a clean 0.2-dev branch
  ./autogen.sh
  ./configure
  make
 
  It looks OK. Are you working on the 0.2-dev branch? I did not write the
  branch name in the instructions in the Readme, sorry for that.
 
   No worries, the most important part at this stage is the code, and 
  that
   looks OK :-)
   Yes, I was using 0.2-dev and followed those steps. As I usually don't
   program in go, it is likely that my go working environment is missing
   something, or my go paths are wrong. This is the full error message:
 
   CGO_LDFLAGS=-Lhypervisor/xen -lxenlight -lxenctrl -lhyperxl godep 
  go build hyperd.go
   hyperd.go:11:2: cannot find package hyper/daemon in any of:
   /local/scratch/sstabellini/go/src/hyper/daemon (from $GOROOT)
   
  /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/daemon (from 
  $GOPATH)
   hyperd.go:10:2: cannot find package hyper/engine in any of:
   /local/scratch/sstabellini/go/src/hyper/engine (from $GOROOT)
   
  /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/engine (from 
  $GOPATH)
   hyperd.go:12:2: cannot find package hyper/lib/glog in any of:
   /local/scratch/sstabellini/go/src/hyper/lib/glog (from 
  $GOROOT)
   
  /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/lib/glog 
  (from $GOPATH)
   hyperd.go:13:2: cannot find package hyper/utils in any of:
   /local/scratch/sstabellini/go/src/hyper/utils (from $GOROOT)
   
  /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/utils (from 
  $GOPATH)
   godep: go exit status 1
 
 
 Looking through the code, it seems that you are adding a
 virtio-serial-pci device, why do you need it?  It is not used very much
 on Xen; the regular Xen uart is specified by setting
 b_info->u.hvm.serial to pty, and it looks like you are already doing
 that. If you need more than one console, you can have a list setting
 b_info->u.hvm.serial_list.
 
  What is the difference between u.hvm.serial_list and channels in
  domain_config? Channels look like they have more features.
 
   Actually I think that you are right: channels are better tested and 
  more
   flexible.
 
 
 virtio-9p-pci is also 

[Xen-devel] vTPM issues

2015-06-24 Thread Marcos Simó Picó
Hello everyone,


I would like to try the vTPM feature, but I'm having some issues. Basically, I 
followed the steps explained in 
https://mhsamsal.wordpress.com/2013/12/05/configuring-virtual-tpm-vtpm-for-xen-4-3-guest-virtual-machines/


I'm running Ubuntu 14.04 as Dom0 on a Dell optiplex-9020. I compiled Xen 4.5.0 
from source. After creating vtpmmgr and vtpm stubdoms, and DomU, I can invoke 
tpm_version from DomU:


root@DomU:/home/xen# tpm_version
  TPM 1.2 Version Info:
  Chip Version:1.2.0.7
  Spec Level:  2
  Errata Revision: 1
  TPM Vendor ID:   ETHZ
  TPM Version: 0101
  Manufacturer Info:   4554485a


I can also see the PCR status by invoking cat 
/sys/class/misc/tpm0/device/pcrs. However, most of the commands return an 
error. When I invoke tpm_takeownership I get the following error:


root@DomU:/home/xen# tpm_takeownership -y -z -l debug
Tspi_Context_Create success
Tspi_Context_Connect success
Tspi_Context_GetTpmObject success
Tspi_GetPolicyObject success
Tspi_Policy_SetSecret success
Tspi_Context_CreateObject success
Tspi_GetPolicyObject success
Tspi_Policy_SetSecret success
Tspi_TPM_TakeOwnership failed: 0x2004 - layer=tcs, code=0004 (4), Internal 
software error
Tspi_Context_CloseObject success
Tspi_Context_FreeMemory success
Tspi_Context_Close success


The same error is given when invoking tpm_getpubkey. I have already tried after 
clearing the TPM from the BIOS, after having taken ownership, and with ownership 
not taken, with the same result each time when using the vTPM. I have also 
installed Xen 4.3.4, again with the same result.


In the end, I would like to use the vTPM to generate and use RSA keys for TLS 
session establishment (using the API provided with GnuTLS). Since I cannot take 
ownership of the vTPM, GnuTLS's tpmtool complains that it cannot find any SRK.


I really appreciate any help you can provide.


Best regards,

Marcos
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 6/6] AMD-PVH: enable pvh if requirements met

2015-06-24 Thread Mukesh Rathor
On Wed, 24 Jun 2015 16:26:44 -0400
Elena Ufimtseva elena.ufimts...@oracle.com wrote:

 On Wed, Jun 24, 2015 at 07:24:18PM +0100, Andrew Cooper wrote:
  On 24/06/15 08:49, Jan Beulich wrote:
   On 24.06.15 at 04:34, boris.ostrov...@oracle.com wrote:
   On 06/23/2015 08:30 AM, Jan Beulich wrote:
   On 22.06.15 at 18:37, elena.ufimts...@oracle.com wrote:
   --- a/xen/arch/x86/hvm/svm/svm.c
   +++ b/xen/arch/x86/hvm/svm/svm.c
   @@ -1444,6 +1444,9 @@ const struct hvm_function_table * __init
   start_svm(void)
 svm_function_table.hap_capabilities =
   HVM_HAP_SUPERPAGE_2MB | ((cpuid_edx(0x80000001) &
   0x04000000) ? HVM_HAP_SUPERPAGE_1GB : 0); 
   +if ( cpu_has_svm_npt && cpu_has_svm_decode )
   +svm_function_table.pvh_supported = 1;
   If svm_decode indeed is a prereq, then the earlier patch dealing
   with the handle_mmio() invocations doesn't need to fiddle with
   VMEXIT_INVLPG other than to maybe add a documenting ASSERT().
  
   I am not sure we should require the decode feature for PVH
   support. I can't remember exactly, but I think this
   feature was first introduced in family 15h, so requiring it will
   leave at least family 10h processors without PVH support.
   The question was why the dependency was added in the first place.
   Indeed only fam 12, 15, and 16 have the field documented. Otoh
   PVH isn't being supported universally on all VMX variants
   either...
  
  Right, but this is a bug (feature?) of the current implementation
  and needs fixing.
  
  There are no technical reasons to prevent PVH guests running in any
  case where an HVM guest currently runs.
  
  The only technical restriction I can think of is that a PVH hardware
  domain needs IOMMU support, but that is it.
  
 
 CCing Mukesh, maybe he will reply as to why that restriction is here.

Hi Elena,

Basically, the restriction was to allow AMD to come on par with Intel and
get phase I working on it. Then, I could just focus on handle_mmio for
INS/OUTS for both Intel and AMD, and if supporting the !svm_decode family
of CPUs was important, then extend handle_mmio further...

http://xen-devel.narkive.com/liQjEoV2/rfh-amd-cr-intercept-for-lmsw-clts

[In the absence of svm_decode, mov cr would need to go thru handle_mmio..]
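
For illustration only, the two policies under discussion (strict: NPT plus
decode assists; relaxed: NPT alone) could be sketched as below. This is not
the patch itself: the helper names are hypothetical, and the bit positions
are assumed to be those of CPUID leaf 0x8000000A EDX (bit 0 for nested
paging, bit 7 for decode assists) and should be checked against the AMD
manual or the Xen headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in CPUID 0x8000000A EDX (SVM feature flags). */
#define SVM_FEATURE_NPT_BIT           0u  /* nested paging */
#define SVM_FEATURE_DECODEASSISTS_BIT 7u  /* decode assists */

/* Hypothetical helper: test one feature bit in the EDX value. */
static bool svm_has(uint32_t edx, unsigned int bit)
{
    assert(bit < 32);
    return (edx >> bit) & 1u;
}

/* Strict policy from the patch: PVH needs both NPT and decode assists. */
static bool pvh_supported_strict(uint32_t svm_edx)
{
    return svm_has(svm_edx, SVM_FEATURE_NPT_BIT) &&
           svm_has(svm_edx, SVM_FEATURE_DECODEASSISTS_BIT);
}

/* Relaxed policy: NPT only; CR accesses would then need software decoding. */
static bool pvh_supported_relaxed(uint32_t svm_edx)
{
    return svm_has(svm_edx, SVM_FEATURE_NPT_BIT);
}
```

With the relaxed variant, a CPU without decode assists would still be
offered PVH, at the cost of routing mov-to/from-CR intercepts through the
instruction emulator.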

thanks,
Mukesh






Re: [Xen-devel] Interested in taking up a project

2015-06-24 Thread Abhinav Gupta
Thanks for the help and support, guys. I'll need some time to get a proper
understanding of how it is incorporated in the Linux kernel and which
interfaces are built on top of it. Once I'm comfortable with that and Xen's
credit scheduler, I'll start by writing a design doc and share it
with you all. I'll keep reporting the progress of the work and ask related
questions in this thread.

Thanks,
Abhinav

On Mon, Jun 22, 2015, 3:15 PM George Dunlap george.dun...@eu.citrix.com
wrote:

 On 06/21/2015 07:37 AM, Abhinav Gupta wrote:
  Hi,
 I'm still waiting for the confirmation. I have started looking into the
  code, though.

 Hey Abhinav,

 Thanks for your interest!  As others have said, it's a free world, so of
 course you can work on and attempt to contribute whatever you want. :-)

 There's nobody else working on this yet, and it's probably still a good
 idea, so in that sense, the project is something that you should feel
 free to start working on.

 I don't have time at the moment to commit to the level of mentorship I
 would if you were a GSoC intern; but as a community, we're generally
 pretty good about helping people who try to get involved -- as you've
 already found out. :-)

 One heads-up: A thing we've started doing in our community, before
 submitting a large new feature, is to post a design document describing
 the purpose of the new feature, and a technical overview of the changes
 that you want to make and why.

 This is *not required*; you are free to just submit patches with your
 changes, and many people do.  However, it's not uncommon for maintainers
 to request significant changes to the architecture or approach on major
 features, which require a major re-write.  This can be frustrating both
 for you and for us.  Done properly, a design document can make things
 easier for all of us.

 Looking forward to seeing your work!

  -George




Re: [Xen-devel] [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants

2015-06-24 Thread Benjamin Herrenschmidt
On Wed, 2015-06-24 at 18:38 +0200, Luis R. Rodriguez wrote:
 On Wed, Jun 24, 2015 at 08:42:23AM +1000, Benjamin Herrenschmidt wrote:
  On Fri, 2015-06-19 at 15:08 -0700, Luis R. Rodriguez wrote:
   From: Luis R. Rodriguez mcg...@suse.com
   
   PCI BARs tell us whether prefetching is safe, but they don't say anything
   about write combining (WC).  WC changes ordering rules and allows writes 
   to
   be collapsed, so it's not safe in general to use it on a prefetchable
   region.
  
  Well, the PCIe spec at least specifies that a prefetchable BAR also
  tolerates write merging... 
 
 How can that be determined, and can that be used as a fully bulletproof hint
 to enable WC? And are you sure? :) 

Well, I'm sure the spec says that ;-) But it could be new to PCIe, I
haven't checked legacy PCI.

 Reason all this was stated was to be
 apologetic over why we can't automate this behind the scenes. Otherwise
 we could amend what you stated into the commit log to elaborate on our
 technical apology. Let me know!

At least on powerpc, for mmap of resource to userspace, we take off the
guarded bit in the PTE for prefetchable BARs. This has the effect
architecturally of enabling both prefetch and write combine (ie. side
effect) though afaik, the implementations probably don't actually
prefetch. We've done that for years.

In fact we don't have a way to split the notions, it's either G or no G,
which carries both meanings.
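
As a rough sketch of that policy (not the actual powerpc code): the BAR
layout constants follow the PCI spec (bit 0 selects I/O space, bit 3 marks a
memory BAR prefetchable), while PAGE_ATTR_GUARDED and both function names are
made-up stand-ins for the real PTE "G" bit and mapping helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low bits of a PCI BAR, as laid out in the PCI spec. */
#define BAR_SPACE_IO      0x01u  /* bit 0: 1 = I/O space, 0 = memory */
#define BAR_MEM_PREFETCH  0x08u  /* bit 3: prefetchable (memory BARs only) */

/* Hypothetical page attribute standing in for the powerpc "G" bit. */
#define PAGE_ATTR_GUARDED 0x1u

static bool bar_is_prefetchable_mem(uint32_t bar)
{
    return !(bar & BAR_SPACE_IO) && (bar & BAR_MEM_PREFETCH);
}

/*
 * Start from a guarded mapping and drop G for prefetchable memory BARs.
 * There is one bit only, so dropping it permits prefetch and write
 * combining together; the two cannot be selected separately.
 */
static uint32_t mmap_attrs_for_bar(uint32_t bar, uint32_t base_attrs)
{
    assert(base_attrs & PAGE_ATTR_GUARDED);
    if (bar_is_prefetchable_mem(bar))
        return base_attrs & ~PAGE_ATTR_GUARDED;
    return base_attrs;
}
```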

Do you have an example of a device having problems?

Cheers,
Ben.





[Xen-devel] [xen-unstable test] 58851: regressions - FAIL

2015-06-24 Thread osstest service user
flight 58851 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58851/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-rumpuserxen-amd64 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail REGR. vs. 58821
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. vs. 58821

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail like 58773
 test-amd64-i386-libvirt-xsm  11 guest-start  fail   like 58821
 test-amd64-i386-libvirt  11 guest-start  fail   like 58821
 test-amd64-amd64-libvirt-xsm 11 guest-start  fail   like 58821
 test-amd64-amd64-libvirt 11 guest-start  fail   like 58821
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58821
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58821

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd          11 guest-start           fail   never pass
 test-amd64-amd64-xl-pvh-intel        11 guest-start           fail   never pass
 test-armhf-armhf-xl-arndale          12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm              12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-sedf             12 migrate-support-check fail   never pass
 test-armhf-armhf-xl                  12 migrate-support-check fail   never pass
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop            fail   never pass
 test-armhf-armhf-xl-credit2          12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt             12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck       12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm         12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu        12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-sedf-pin         12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop            fail   never pass

version targeted for testing:
 xen  6b444c8e1c19fac08c82f18011ad00ca185e4e80
baseline version:
 xen  e76ff6c156906b515c2a4300a81c95886ece5d5f


People who touched revisions under test:
  Boris Ostrovsky boris.ostrov...@oracle.com
  David Vrabel david.vra...@citrix.com
  Don Slutz dsl...@verizon.com
  Ian Jackson ian.jack...@eu.citrix.com
  Jan Beulich jbeul...@suse.com
  Juergen Gross jgr...@suse.com


jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm fail
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm  pass
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm fail
 test-amd64-amd64-libvirt-xsm fail
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  fail
 test-amd64-amd64-xl-xsm  pass
 test-armhf-armhf-xl-xsm  pass
 test-amd64-i386-xl-xsm   pass
 test-amd64-amd64-xl-pvh-amd  

Re: [Xen-devel] [PATCH 6/6] AMD-PVH: enable pvh if requirements met

2015-06-24 Thread Elena Ufimtseva
On Wed, Jun 24, 2015 at 07:24:18PM +0100, Andrew Cooper wrote:
 On 24/06/15 08:49, Jan Beulich wrote:
  On 24.06.15 at 04:34, boris.ostrov...@oracle.com wrote:
  On 06/23/2015 08:30 AM, Jan Beulich wrote:
  On 22.06.15 at 18:37, elena.ufimts...@oracle.com wrote:
  --- a/xen/arch/x86/hvm/svm/svm.c
  +++ b/xen/arch/x86/hvm/svm/svm.c
  @@ -1444,6 +1444,9 @@ const struct hvm_function_table * __init
  start_svm(void)
 svm_function_table.hap_capabilities = HVM_HAP_SUPERPAGE_2MB |
 ((cpuid_edx(0x80000001) & 0x04000000) ? HVM_HAP_SUPERPAGE_1GB : 0);

  +if ( cpu_has_svm_npt && cpu_has_svm_decode )
  +svm_function_table.pvh_supported = 1;
  If svm_decode indeed is a prereq, then the earlier patch dealing
  with the handle_mmio() invocations doesn't need to fiddle with
  VMEXIT_INVLPG other than to maybe add a documenting ASSERT().
 
  I am not sure we should require the decode feature for PVH 
  support. I can't remember exactly, but I think this feature was first 
  introduced in family 15h, so requiring it will leave at least family 10h 
  processors without PVH support.
  The question was why the dependency was added in the first place.
  Indeed only fam 12, 15, and 16 have the field documented. Otoh
  PVH isn't being supported universally on all VMX variants either...
 
 Right, but this is a bug (feature?) of the current implementation and
 needs fixing.
 
 There are no technical reasons to prevent PVH guests running in any case
 where an HVM guest currently runs.
 
 The only technical restriction I can think of is that a PVH hardware
 domain needs IOMMU support, but that is it.
 

CCing Mukesh, maybe he will reply as to why that restriction is here.

 ~Andrew



Re: [Xen-devel] [PATCH v2 00/12] Alternate p2m: support multiple copies of host p2m

2015-06-24 Thread Lengyel, Tamas
On Wed, Jun 24, 2015 at 12:43 PM, Ed White edmund.h.wh...@intel.com wrote:

 On 06/24/2015 06:37 AM, Razvan Cojocaru wrote:
  On 06/24/2015 04:32 PM, Lengyel, Tamas wrote:
 
 
  On Wed, Jun 24, 2015 at 1:39 AM, Razvan Cojocaru
  rcojoc...@bitdefender.com wrote:
 
  On 06/24/2015 12:27 AM, Lengyel, Tamas wrote:
   I've extended xen-access to exercise this new feature taking into
   account some of the current limitations. Using the altp2m_write|exec
   options we create a duplicate view of the default hostp2m, and instead
   of relaxing the mem_access permissions when we encounter a violation, we
   swap the view on the violating vCPU while also enabling MTF
   singlestepping. When the singlestep event fires, we use the response to
   that event to swap the view back to the restricted altp2m view.
 
  That's certainly very interesting. I wonder what the benefits are in
  this case over emulating the fault-causing instruction (other than
  obviously not going through the emulator)? The altp2m method would
  certainly be slower, since you need more round-trips from userspace to
  the hypervisor (the EPT vm_event handling + the singlestep event,
  whereas with emulation you just reply to the original vm_event).
 
 
  Regards,
  Razvan
 
 
  Certainly, this is pretty slow right now, especially for the altp2m_exec
  case. However, sometimes you simply cannot emulate. For example if you
  write breakpoints into target locations, the original instruction has
  been overwritten with 0xCC. If you have a duplicate of the page without
  the breakpoint, this is an easy way to make the guest fetch the original
  instruction. Of course, if you extend the emulation routine where you
  can provide the instruction to emulate, instead of it being fetched from
  guest memory, that would be equally useful ;)
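
The view swap discussed in this exchange (switch to the unrestricted view
and arm singlestep on a violation, switch back when the singlestep event
fires) can be modelled as a tiny state machine. This is only a sketch: the
view ids, the struct, and the handler names are hypothetical and do not
correspond to the xen-access or libxc interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical view ids: 0 is the host p2m, 1 the restricted altp2m copy. */
enum { VIEW_HOST = 0, VIEW_RESTRICTED = 1 };

struct vcpu_state {
    int  view;        /* p2m view the vCPU currently runs on */
    bool singlestep;  /* MTF singlestepping armed? */
};

/* Access violation in the restricted view: let the instruction run once
 * on the unrestricted view, with singlestep armed. */
static void on_mem_access_violation(struct vcpu_state *v)
{
    assert(v->view == VIEW_RESTRICTED);
    v->view = VIEW_HOST;
    v->singlestep = true;
}

/* Singlestep event: the instruction has completed, restore restrictions. */
static void on_singlestep(struct vcpu_state *v)
{
    assert(v->singlestep);
    v->view = VIEW_RESTRICTED;
    v->singlestep = false;
}
```

Each violation therefore costs two events (the violation and the
singlestep), which is the extra round-trip mentioned above.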
 
  Makes sense, thanks for the explanation! Sure, sending back the
  instruction to emulate could be something to consider for the future.
 
 
  Thanks,
  Razvan
 

 One thing I'd add is that what Tamas has done provides a valuable test that
 the cross-domain functionality works, even if it might not be a recommended
 design pattern. Our primary use case at Intel is intra-domain, and there
 the advantages of avoiding many exits are clear.

 Also, even cross-domain usage allows for different views of, and levels of
 access to, memory concurrently on different vcpus.

 Ed



Hi Ed,
I tried the system using memsharing and I collected the following crash
log. In this test I ran memsharing on all pages of the domain before
activating altp2m and creating the view. Afterwards I used my updated
xen-access to create a copy of this p2m with only R/X permissions. The idea
would be that the altp2m view remains completely shared, while the hostp2m
would be able to do its CoW propagation as the domain is executing.

(XEN) mm locking order violation: 278 > 239
(XEN) Xen BUG at mm-locks.h:68
(XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
(XEN) CPU:2
(XEN) RIP:e008:[82d0801f8768]
p2m_altp2m_propagate_change+0x85/0x4a9
(XEN) RFLAGS: 00010282   CONTEXT: hypervisor (d6v0)
(XEN) rax:    rbx:    rcx: 
(XEN) rdx: 8302163a8000   rsi: 000a   rdi: 82d0802a069c
(XEN) rbp: 8302163afa68   rsp: 8302163af9e8   r8:  83021c00
(XEN) r9:  0003   r10: 00ef   r11: 0003
(XEN) r12: 83010cc51820   r13:    r14: 830158d9
(XEN) r15: 00025697   cr0: 80050033   cr4: 001526f0
(XEN) cr3: dbba3000   cr2: 778c9714
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=8302163af9e8:
(XEN)8302163af9f8 803180f8 000c 82d0801892ee
(XEN)82d0801fb4d1 83010cc51de0 0008ff49 82d08012f86a
(XEN)83010cc51820 83010cc51820  
(XEN)83010cc51820  8300dbb334b8 8302163afa00
(XEN)8302163afb18 82d0801fd549 00050009 83020001
(XEN)0001 830158d9 0002 0008ff49
(XEN)00025697 000c 8302163afae8 80c08ff49175
(XEN)80c0d0a97175 01ff83010cc51820 0097 8300dbb33000
(XEN)8302163afb78 0008ff49  0001
(XEN)00025697 83010cc51820 8302163afb38 82d0801fd644
(XEN) 000d0a97 8302163afb98 82d0801f23c5
(XEN)830158d9 0cc51820 830158d9 000c
(XEN)0008ff49 83010cc51820 00025697 000d0a97
(XEN)0008ff49 830158d9 8302163afbd8 82d0801f45c8
(XEN)83010cc51820 000c 83008fd41170 

Re: [Xen-devel] [PATCH v2 00/12] Alternate p2m: support multiple copies of host p2m

2015-06-24 Thread Ed White
On 06/24/2015 03:45 PM, Lengyel, Tamas wrote:
 On Wed, Jun 24, 2015 at 6:02 PM, Ed White edmund.h.wh...@intel.com wrote:
 
 On 06/24/2015 02:34 PM, Lengyel, Tamas wrote:
 Hi Ed,
 I tried the system using memsharing and I collected the following crash
 log. In this test I ran memsharing on all pages of the domain before
 activating altp2m and creating the view. Afterwards I used my updated
 xen-access to create a copy of this p2m with only R/X permissions. The idea
 would be that the altp2m view remains completely shared, while the hostp2m
 would be able to do its CoW propagation as the domain is executing.

 (XEN) mm locking order violation: 278 > 239
 (XEN) Xen BUG at mm-locks.h:68
 (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
 (XEN) CPU:2
 (XEN) RIP:e008:[82d0801f8768]
 p2m_altp2m_propagate_change+0x85/0x4a9
 (XEN) RFLAGS: 00010282   CONTEXT: hypervisor (d6v0)
 (XEN) rax:    rbx:    rcx: 
 (XEN) rdx: 8302163a8000   rsi: 000a   rdi: 82d0802a069c
 (XEN) rbp: 8302163afa68   rsp: 8302163af9e8   r8:  83021c00
 (XEN) r9:  0003   r10: 00ef   r11: 0003
 (XEN) r12: 83010cc51820   r13:    r14: 830158d9
 (XEN) r15: 00025697   cr0: 80050033   cr4: 001526f0
 (XEN) cr3: dbba3000   cr2: 778c9714
 (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
 (XEN) Xen stack trace from rsp=8302163af9e8:
 (XEN)8302163af9f8 803180f8 000c 82d0801892ee
 (XEN)82d0801fb4d1 83010cc51de0 0008ff49 82d08012f86a
 (XEN)83010cc51820 83010cc51820  
 (XEN)83010cc51820  8300dbb334b8 8302163afa00
 (XEN)8302163afb18 82d0801fd549 00050009 83020001
 (XEN)0001 830158d9 0002 0008ff49
 (XEN)00025697 000c 8302163afae8 80c08ff49175
 (XEN)80c0d0a97175 01ff83010cc51820 0097 8300dbb33000
 (XEN)8302163afb78 0008ff49  0001
 (XEN)00025697 83010cc51820 8302163afb38 82d0801fd644
 (XEN) 000d0a97 8302163afb98 82d0801f23c5
 (XEN)830158d9 0cc51820 830158d9 000c
 (XEN)0008ff49 83010cc51820 00025697 000d0a97
 (XEN)0008ff49 830158d9 8302163afbd8 82d0801f45c8
 (XEN)83010cc51820 000c 83008fd41170 0008ff49
 (XEN)00025697 82e001a152e0 8302163afc58 82d080205b51
 (XEN)0009 0008ff49 8300d0a97000 83008fd41160
 (XEN)82e001a152f0 82e0011fe920 83010cc51820 000c
 (XEN)00025697 0003 83010cc51820 8302163afd34
 (XEN)00025697  8302163afca8 82d0801f1f7d
 (XEN) Xen call trace:
 (XEN)[82d0801f8768] p2m_altp2m_propagate_change+0x85/0x4a9
 (XEN)[82d0801fd549] ept_set_entry_sve+0x5fa/0x6e6
 (XEN)[82d0801fd644] ept_set_entry+0xf/0x11
 (XEN)[82d0801f23c5] p2m_set_entry+0xd4/0x112
 (XEN)[82d0801f45c8] set_shared_p2m_entry+0x2d0/0x39b
 (XEN)[82d080205b51] __mem_sharing_unshare_page+0x83f/0xbd6
 (XEN)[82d0801f1f7d] __get_gfn_type_access+0x224/0x2b0
 (XEN)[82d0801c6df5] hvm_hap_nested_page_fault+0x21f/0x795
 (XEN)[82d0801e86ae] vmx_vmexit_handler+0x1764/0x1af3
 (XEN)[82d0801ee891] vmx_asm_vmexit_handler+0x41/0xc0

 The crash happens because I haven't successfully forced all the shared
 pages in the host p2m to become unshared before copying; unsharing
 them first is the intended behaviour.

 I think I know how that has happened and how to fix it, but what you're
 trying to do won't work by design. By the time a copy from host p2m to
 altp2m occurs, the sharing is supposed to be broken.

 
 Hm. If the sharing gets broken before the hostp2m-altp2m copy, maybe doing
 sharing after the view has been created is a better route? I guess the
 sharing code would need to be adapted to check if altp2m is enabled for
 that to work..
 
 

 You're coming up with some ways of attempting to use altp2m that we
 hadn't thought of. That's a good thing, and just what we want, but
 there are limits to what we can support without more far-reaching
 changes to existing parts of Xen. This isn't going to be do-able for
 4.6.

 
 My main concern is just getting it to work, hitting 4.6 is not a priority.
 I understand that my stuff is highly experimental ;) While the gfn
 remapping feature is intriguing, in my setup I already have a copy of the
 page I would want to present during a singlestep-altp2mswitch - in the
 origin domain's memory. AFAIU the 

Re: [Xen-devel] [PATCH 1/3] x86: drop is_pv_32on64_vcpu()

2015-06-24 Thread Boris Ostrovsky

On 06/23/2015 11:18 AM, Jan Beulich wrote:

... as being identical to is_pv_32bit_vcpu() after the x86-32 removal.

In a few cases this includes an additional is_pv_32bit_vcpu() ->
is_pv_32bit_domain() conversion.

Signed-off-by: Jan Beulich jbeul...@suse.com


We have
struct arch_domain
{
...
/* Is a 32-bit PV (non-HVM) guest? */
bool_t is_32bit_pv;
/* Is shared-info page in 32-bit format? */
bool_t has_32bit_shinfo;
   ...
}

and currently both of these fields are set/unset together (except for 
one HVM case --- hvm_latch_shinfo_size()). Why not have a single 'bool 
is_32bit' and then replace macros at the top of include/asm-x86/domain.h 
with is_32bit_vcpu/domain()?


I think in the majority of places where we test for is_pv_32bit_vcpu/domain() 
we already know that we are PV, so it shouldn't add any additional tests.
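
To make the suggestion concrete, here is a stripped-down sketch. The
structures are minimal stand-ins, not the real Xen definitions, and
guest_long_size() is a made-up consumer added only to show the macros in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the real structures (hypothetical layout). */
struct arch_domain {
    bool is_32bit;  /* would replace is_32bit_pv and has_32bit_shinfo */
};
struct domain { struct arch_domain arch; };
struct vcpu   { struct domain *domain; };

#define is_32bit_domain(d) ((d)->arch.is_32bit)
#define is_32bit_vcpu(v)   (is_32bit_domain((v)->domain))

/* Made-up consumer: size of a guest "long" for this vCPU. */
static size_t guest_long_size(const struct vcpu *v)
{
    assert(v && v->domain);
    return is_32bit_vcpu(v) ? 4 : 8;
}
```

The one HVM case mentioned above (hvm_latch_shinfo_size()) would still need
separate handling, since it sets only one of the two current fields.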


-boris




Re: [Xen-devel] [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants

2015-06-24 Thread Luis R. Rodriguez
On Wed, Jun 24, 2015 at 3:05 PM, Benjamin Herrenschmidt
b...@kernel.crashing.org wrote:
 On Wed, 2015-06-24 at 18:38 +0200, Luis R. Rodriguez wrote:
 On Wed, Jun 24, 2015 at 08:42:23AM +1000, Benjamin Herrenschmidt wrote:
  On Fri, 2015-06-19 at 15:08 -0700, Luis R. Rodriguez wrote:
   From: Luis R. Rodriguez mcg...@suse.com
  
   PCI BARs tell us whether prefetching is safe, but they don't say anything
   about write combining (WC).  WC changes ordering rules and allows writes 
   to
   be collapsed, so it's not safe in general to use it on a prefetchable
   region.
 
  Well, the PCIe spec at least specifies that a prefetchable BAR also
  tolerates write merging...

 How can that be determined, and can that be used as a fully bulletproof hint
 to enable WC? And are you sure? :)

 Well, I'm sure the spec says that ;-) But it could be new to PCIe, I
 haven't checked legacy PCI.

OK, cool. So to be clear, what I gather you are suggesting (or not
suggesting, and leaving me to suggest) is that we might be able to enforce
write-merging on prefetchable areas and, if we can *ensure* we do this,
then automatically enable write-combining behind the scenes?

 Reason all this was stated was to be
 apologetic over why we can't automate this behind the scenes. Otherwise
 we could amend what you stated into the commit log to elaborate on our
 technical apology. Let me know!

 At least on powerpc, for mmap of resource to userspace, we take off the
 guarded bit in the PTE for prefetchable BARs. This has the effect
 architecturally of enabling both prefetch and write combine (ie. side
 effect)

That's pretty darn sexy.

 though afaik, the implementations probably don't actually
 prefetch. We've done that for years.

Neat!

 In fact we don't have a way to split the notions, it's either G or no G,
 which carries both meanings.

Interesting.

 Do you have an example of a device having problems?

Nope, but what made me squint at this as a possible feature was that,
in practice, when reviewing all of the kernel's pending device drivers
using MTRR (potential write-combine candidates), I encountered a slew of
them that had the architecturally unfortunate practice of combining, in
a single PCI BAR, MMIO and their respective area where write-combining
is desirable (framebuffer for video, PIO buffers for infiniband, etc).
Now, to me that read more as a practice for old-school devices from
when such things were likely still being evaluated; more modern devices
seem to stick to dedicating a full PCI BAR to either write-combining or
not. Did you not encounter such mismatched splits on powerpc? Was that
possibility addressed?

If what you are implying here is applicable to the x86 world, I'm all
for enabling this, as we'd have less code to maintain. But I'll note
that just getting clarification that prefetchable != write-combining
was hard in and of itself, so I'd be surprised if we could get full
architectural buy-in to this as an immediate automatic feature. Because
of this, and because PAT did have some errata as well, I would not be
surprised if some PCI bridges / devices ended up hitting corner cases.
So even if we can really do what you're saying, unless we can get solid
certainty over it across the board, I'd be inclined to leave such
things as part of a new API. Maybe have some folks test using the new
API for all calls and, after some releases of sane testing, consider a
full switch.

That is, unless of course you're sure all this is sane and would wager
all-in on it from the get-go.

 Luis



Re: [Xen-devel] [PATCH v2 06/12] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.

2015-06-24 Thread Ed White
On 06/24/2015 05:47 AM, Andrew Cooper wrote:
 +case EXIT_REASON_VMFUNC:
 +if ( vmx_vmfunc_intercept(regs) == X86EMUL_OKAY )
 
 This is currently an unconditional failure, and I don't see subsequent
 patches which alter vmx_vmfunc_intercept().  Shouldn't
 vmx_vmfunc_intercept() switch on eax and optionally call
 p2m_switch_vcpu_altp2m_by_id()?

If the VMFUNC instruction was valid, the hardware would have executed it.
The only time a VMFUNC exit occurs is if the hardware supports VMFUNC
and the hypervisor has enabled it, but the VMFUNC instruction is
invalid in some way and can't be executed (because EAX != 0, for example).

There are only two choices: crash the domain or inject #UD (which is the
closest analogue to what happens in the absence of a hypervisor and will
probably crash the OS in the domain). I chose the latter in the code I
originally wrote; Ravi chose the former in his patch. I don't have a
strong opinion either way, but I think these are the only two choices.
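
Put as code, the exit handler has no emulation path, only a policy choice.
The sketch below uses made-up enum and function names and is not the code
from either version of the patch.

```c
#include <assert.h>

/* Hypothetical names for the two possible reactions to a VMFUNC exit. */
enum vmfunc_policy { VMFUNC_CRASH_DOMAIN, VMFUNC_INJECT_UD };
enum exit_action   { ACTION_CRASH_DOMAIN, ACTION_INJECT_UD };

/*
 * Called only after a VMFUNC VM exit. A valid VMFUNC would have been
 * executed by the hardware without exiting, so nothing is emulated here;
 * the handler just applies the chosen policy.
 */
static enum exit_action handle_vmfunc_exit(enum vmfunc_policy policy)
{
    assert(policy == VMFUNC_CRASH_DOMAIN || policy == VMFUNC_INJECT_UD);
    return policy == VMFUNC_INJECT_UD ? ACTION_INJECT_UD
                                      : ACTION_CRASH_DOMAIN;
}
```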

I hope this answers Jan's question in another email on the same subject.

Ed



Re: [Xen-devel] [PATCH v2 00/12] Alternate p2m: support multiple copies of host p2m

2015-06-24 Thread Ed White
On 06/24/2015 02:34 PM, Lengyel, Tamas wrote:
 Hi Ed,
 I tried the system using memsharing and I collected the following crash
 log. In this test I ran memsharing on all pages of the domain before
 activating altp2m and creating the view. Afterwards I used my updated
 xen-access to create a copy of this p2m with only R/X permissions. The idea
 would be that the altp2m view remains completely shared, while the hostp2m
 would be able to do its CoW propagation as the domain is executing.
 
 (XEN) mm locking order violation: 278 > 239
 (XEN) Xen BUG at mm-locks.h:68
 (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
 (XEN) CPU:2
 (XEN) RIP:e008:[82d0801f8768]
 p2m_altp2m_propagate_change+0x85/0x4a9
 (XEN) RFLAGS: 00010282   CONTEXT: hypervisor (d6v0)
 (XEN) rax:    rbx:    rcx: 
 (XEN) rdx: 8302163a8000   rsi: 000a   rdi: 82d0802a069c
 (XEN) rbp: 8302163afa68   rsp: 8302163af9e8   r8:  83021c00
 (XEN) r9:  0003   r10: 00ef   r11: 0003
 (XEN) r12: 83010cc51820   r13:    r14: 830158d9
 (XEN) r15: 00025697   cr0: 80050033   cr4: 001526f0
 (XEN) cr3: dbba3000   cr2: 778c9714
 (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
 (XEN) Xen stack trace from rsp=8302163af9e8:
 (XEN)8302163af9f8 803180f8 000c 82d0801892ee
 (XEN)82d0801fb4d1 83010cc51de0 0008ff49 82d08012f86a
 (XEN)83010cc51820 83010cc51820  
 (XEN)83010cc51820  8300dbb334b8 8302163afa00
 (XEN)8302163afb18 82d0801fd549 00050009 83020001
 (XEN)0001 830158d9 0002 0008ff49
 (XEN)00025697 000c 8302163afae8 80c08ff49175
 (XEN)80c0d0a97175 01ff83010cc51820 0097 8300dbb33000
 (XEN)8302163afb78 0008ff49  0001
 (XEN)00025697 83010cc51820 8302163afb38 82d0801fd644
 (XEN) 000d0a97 8302163afb98 82d0801f23c5
 (XEN)830158d9 0cc51820 830158d9 000c
 (XEN)0008ff49 83010cc51820 00025697 000d0a97
 (XEN)0008ff49 830158d9 8302163afbd8 82d0801f45c8
 (XEN)83010cc51820 000c 83008fd41170 0008ff49
 (XEN)00025697 82e001a152e0 8302163afc58 82d080205b51
 (XEN)0009 0008ff49 8300d0a97000 83008fd41160
 (XEN)82e001a152f0 82e0011fe920 83010cc51820 000c
 (XEN)00025697 0003 83010cc51820 8302163afd34
 (XEN)00025697  8302163afca8 82d0801f1f7d
 (XEN) Xen call trace:
 (XEN)[82d0801f8768] p2m_altp2m_propagate_change+0x85/0x4a9
 (XEN)[82d0801fd549] ept_set_entry_sve+0x5fa/0x6e6
 (XEN)[82d0801fd644] ept_set_entry+0xf/0x11
 (XEN)[82d0801f23c5] p2m_set_entry+0xd4/0x112
 (XEN)[82d0801f45c8] set_shared_p2m_entry+0x2d0/0x39b
 (XEN)[82d080205b51] __mem_sharing_unshare_page+0x83f/0xbd6
 (XEN)[82d0801f1f7d] __get_gfn_type_access+0x224/0x2b0
 (XEN)[82d0801c6df5] hvm_hap_nested_page_fault+0x21f/0x795
 (XEN)[82d0801e86ae] vmx_vmexit_handler+0x1764/0x1af3
 (XEN)[82d0801ee891] vmx_asm_vmexit_handler+0x41/0xc0

The crash happens because I haven't successfully forced all the shared
pages in the host p2m to become unshared before copying; unsharing
them first is the intended behaviour.

I think I know how that has happened and how to fix it, but what you're
trying to do won't work by design. By the time a copy from host p2m to
altp2m occurs, the sharing is supposed to be broken.
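
For anyone reading the log: the "mm locking order violation" line comes from
a lock-ordering check, where each mm lock has a numeric level and a lock may
only be taken if its level is not below the one already held. A simplified
model follows; the real check lives in mm-locks.h and BUG()s instead of
returning, the level is tracked per CPU there, and the names below are
invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Level of the most recently taken lock (per-CPU in a real hypervisor). */
static int current_lock_level;

/* Ordering rule: never take a lock whose level is below the held one. */
static bool lock_order_ok(int held_level, int new_level)
{
    return held_level <= new_level;
}

/* Returns false where the hypervisor would hit BUG(). */
static bool take_lock(int level)
{
    assert(level > 0);
    if (!lock_order_ok(current_lock_level, level)) {
        printf("mm locking order violation: %d > %d\n",
               current_lock_level, level);
        return false;
    }
    current_lock_level = level;
    return true;
}
```

In the trace, the unshare path already holds a level-278 lock when the
altp2m propagation tries to take a level-239 one, which is the case the
check rejects.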

You're coming up with some ways of attempting to use altp2m that we
hadn't thought of. That's a good thing, and just what we want, but
there are limits to what we can support without more far-reaching
changes to existing parts of Xen. This isn't going to be do-able for
4.6.

Ed





Re: [Xen-devel] [PATCH v2 00/12] Alternate p2m: support multiple copies of host p2m

2015-06-24 Thread Lengyel, Tamas
On Wed, Jun 24, 2015 at 6:02 PM, Ed White edmund.h.wh...@intel.com wrote:

 On 06/24/2015 02:34 PM, Lengyel, Tamas wrote:
  Hi Ed,
  I tried the system using memsharing and I collected the following crash
  log. In this test I ran memsharing on all pages of the domain before
  activating altp2m and creating the view. Afterwards I used my updated
  xen-access to create a copy of this p2m with only R/X permissions. The
 idea
  would be that the altp2m view remains completely shared, while the
 hostp2m
  would be able to do its CoW propagation as the domain is executing.
 
  (XEN) mm locking order violation: 278 > 239
  (XEN) Xen BUG at mm-locks.h:68
  (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
  (XEN) CPU:2
  (XEN) RIP:e008:[82d0801f8768]
  p2m_altp2m_propagate_change+0x85/0x4a9
  (XEN) RFLAGS: 00010282   CONTEXT: hypervisor (d6v0)
  (XEN) rax:    rbx:    rcx:
 
  (XEN) rdx: 8302163a8000   rsi: 000a   rdi:
 82d0802a069c
  (XEN) rbp: 8302163afa68   rsp: 8302163af9e8   r8:
 83021c00
  (XEN) r9:  0003   r10: 00ef   r11:
 0003
  (XEN) r12: 83010cc51820   r13:    r14:
 830158d9
  (XEN) r15: 00025697   cr0: 80050033   cr4:
 001526f0
  (XEN) cr3: dbba3000   cr2: 778c9714
  (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
  (XEN) Xen stack trace from rsp=8302163af9e8:
  (XEN)8302163af9f8 803180f8 000c
 82d0801892ee
  (XEN)82d0801fb4d1 83010cc51de0 0008ff49
 82d08012f86a
  (XEN)83010cc51820 83010cc51820 
 
  (XEN)83010cc51820  8300dbb334b8
 8302163afa00
  (XEN)8302163afb18 82d0801fd549 00050009
 83020001
  (XEN)0001 830158d9 0002
 0008ff49
  (XEN)00025697 000c 8302163afae8
 80c08ff49175
  (XEN)80c0d0a97175 01ff83010cc51820 0097
 8300dbb33000
  (XEN)8302163afb78 0008ff49 
 0001
  (XEN)00025697 83010cc51820 8302163afb38
 82d0801fd644
  (XEN) 000d0a97 8302163afb98
 82d0801f23c5
  (XEN)830158d9 0cc51820 830158d9
 000c
  (XEN)0008ff49 83010cc51820 00025697
 000d0a97
  (XEN)0008ff49 830158d9 8302163afbd8
 82d0801f45c8
  (XEN)83010cc51820 000c 83008fd41170
 0008ff49
  (XEN)00025697 82e001a152e0 8302163afc58
 82d080205b51
  (XEN)0009 0008ff49 8300d0a97000
 83008fd41160
  (XEN)82e001a152f0 82e0011fe920 83010cc51820
 000c
  (XEN)00025697 0003 83010cc51820
 8302163afd34
  (XEN)00025697  8302163afca8
 82d0801f1f7d
  (XEN) Xen call trace:
  (XEN)[82d0801f8768] p2m_altp2m_propagate_change+0x85/0x4a9
  (XEN)[82d0801fd549] ept_set_entry_sve+0x5fa/0x6e6
  (XEN)[82d0801fd644] ept_set_entry+0xf/0x11
  (XEN)[82d0801f23c5] p2m_set_entry+0xd4/0x112
  (XEN)[82d0801f45c8] set_shared_p2m_entry+0x2d0/0x39b
  (XEN)[82d080205b51] __mem_sharing_unshare_page+0x83f/0xbd6
  (XEN)[82d0801f1f7d] __get_gfn_type_access+0x224/0x2b0
  (XEN)[82d0801c6df5] hvm_hap_nested_page_fault+0x21f/0x795
  (XEN)[82d0801e86ae] vmx_vmexit_handler+0x1764/0x1af3
  (XEN)[82d0801ee891] vmx_asm_vmexit_handler+0x41/0xc0

 The crash here is because I haven't successfully forced all the shared
 pages in the host p2m to become unshared before copying,
 which is the intended behaviour.

 I think I know how that has happened and how to fix it, but what you're
 trying to do won't work by design. By the time a copy from host p2m to
 altp2m occurs, the sharing is supposed to be broken.


Hm. If the sharing gets broken before the hostp2m->altp2m copy, maybe doing
sharing after the view has been created is a better route? I guess the
sharing code would need to be adapted to check if altp2m is enabled for
that to work..



 You're coming up with some ways of attempting to use altp2m that we
 hadn't thought of. That's a good thing, and just what we want, but
 there are limits to what we can support without more far-reaching
 changes to existing parts of Xen. This isn't going to be do-able for
 4.6.


My main concern is just getting it to work, hitting 4.6 is not a priority.
I understand that my stuff is highly experimental ;) While the gfn
remapping feature is intriguing, in my setup I already have a copy of the
page I would want to present during a singlestep-altp2mswitch - in the
origin domain's memory. AFAIU the gfn 

Re: [Xen-devel] Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions))

2015-06-24 Thread Dario Faggioli
[Moving most people to Bcc, as this is indeed unrelated to the original
topic]

On Wed, 2015-06-24 at 13:41 +0100, Jan Beulich wrote:
  On 24.06.15 at 14:29, dario.faggi...@citrix.com wrote:
  On Wed, 2015-06-24 at 10:38 +0100, Ian Campbell wrote:
  The memory info
  Jun 23 15:56:27.749008 (XEN) Memory location of each domain:
  Jun 23 15:56:27.756965 (XEN) Domain 0 (total: 131072):
  Jun 23 15:56:27.756983 (XEN) Node 0: 126905
  Jun 23 15:56:27.756998 (XEN) Node 1: 0
  Jun 23 15:56:27.764952 (XEN) Node 2: 4167
  Jun 23 15:56:27.764969 (XEN) Node 3: 0
  suggests at least a small amount of cross-node memory allocation (16M
  out of dom0s 512M total). That's probably small enough to be OK.
  
  Yeah, that is in line with what you usually get with dom0_nodes. Most of
  the memory, as you noted, comes from the proper node. We're just not
  (yet?) at the point where _all_ of it can come from there.
 
 Actually as long as there is enough memory on the requested node
 (minus any amount set aside for the DMA pool), this shouldn't
 happen (and I had seen this to be clean in my own testing). 

ISTR some allocation not being 'converted'. Perhaps I'm misremembering.

 There
 being 8Gb per node, I see no immediate reason why memory from
 node 2 would be handed out. Still I wouldn't suspect this to matter
 here.
 
On my 2 nodes test box with the following configuration:
(XEN) SRAT: Node 1 PXM 1 0-dc00
(XEN) SRAT: Node 1 PXM 1 1-1a400
(XEN) SRAT: Node 0 PXM 0 1a400-32400

with 'dom0_nodes=0', I see this:
(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 131072):
(XEN) Node 0: 114664
(XEN) Node 1: 16408

while with 'dom0_nodes=1', this:
(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 131072):
(XEN) Node 0: 7749
(XEN) Node 1: 123323

Dario

-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

2015-06-24 Thread Jan Beulich
 On 24.06.15 at 15:34, paul.durr...@citrix.com wrote:
  -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: 24 June 2015 14:25
 To: Paul Durrant
 Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: RE: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio
 check op
 
  On 24.06.15 at 15:14, paul.durr...@citrix.com wrote:
  From: xen-devel-boun...@lists.xen.org [mailto:xen-devel-
  boun...@lists.xen.org] On Behalf Of Jan Beulich
  Sent: 24 June 2015 14:08
   On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
   --- a/xen/include/asm-x86/hvm/io.h
   +++ b/xen/include/asm-x86/hvm/io.h
   @@ -35,7 +35,9 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
unsigned long addr,
unsigned long length,
unsigned long val);
   -typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long
 addr);
   +typedef int (*hvm_mmio_check_t)(struct vcpu *v,
   +unsigned long addr,
   +unsigned long length);
 
  I don't think this really needs to be long?
 
 
  For consistency with the mmio read and write function types I went with
  'long'. Is there any harm in that?
 
 Generally generates worse code (due to the need for the REX64
 prefix on all involved instructions). Perhaps the other ones don't
 need sizes/lengths passed as longs either?
 
 
 I'm happy to do it that way round if you don't mind the extra diffs. I'll do 
 it as a separate patch just before this one, to ease review.

Thanks. There's actually a good example of why they shouldn't be
unsigned long in patch 9 I just looked at:

+static int stdvga_mem_write(struct vcpu *v, unsigned long addr,
+unsigned long size, unsigned long data)
 {
+ioreq_t p = { .type = IOREQ_TYPE_COPY,
+  .addr = addr,
+  .size = size,

This clearly would truncate size if it ever exceeded a uint32_t (and
hence an unsigned int on x86).
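
To illustrate the truncation, here is a minimal sketch, assuming a request structure with a 32-bit size field; `struct req` and `make_req` are hypothetical stand-ins, not the real ioreq_t or stdvga code:

```c
#include <stdint.h>

/* Hypothetical stand-in for a request structure with a 32-bit size field. */
struct req {
    uint64_t addr;
    uint32_t size;
};

/*
 * Build a request from long-typed arguments. Any bits of 'size' above
 * bit 31 are silently discarded when stored into the 32-bit field.
 */
static struct req make_req(unsigned long addr, unsigned long size)
{
    struct req r = { .addr = addr, .size = (uint32_t)size };

    return r;
}
```

Passing the length as an unsigned int in the first place makes the narrower range visible at the call site rather than hiding it in the initialiser.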

Jan



Re: [Xen-devel] [PATCH v2 00/12] Alternate p2m: support multiple copies of host p2m

2015-06-24 Thread Andrew Cooper
On 22/06/15 19:56, Ed White wrote:
 This set of patches adds support to hvm domains for EPTP switching by creating
 multiple copies of the host p2m (currently limited to 10 copies).

 The primary use of this capability is expected to be in scenarios where access
 to memory needs to be monitored and/or restricted below the level at which the
 guest OS page tables operate. Two examples that were discussed at the 2014 Xen
 developer summit are:

 VM introspection: 
 http://www.slideshare.net/xen_com_mgr/
 zero-footprint-guest-memory-introspection-from-xen

 Secure inter-VM communication:
 http://www.slideshare.net/xen_com_mgr/nakajima-nvf

 A more detailed design specification can be found at:
 http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01319.html

 Each p2m copy is populated lazily on EPT violations.
 Permissions for pages in alternate p2m's can be changed in a similar
 way to the existing memory access interface, and gfn-mfn mappings can be 
 changed.

 All this is done through extra HVMOP types.

 The cross-domain HVMOP code has been compile-tested only. Also, the 
 cross-domain
 code is hypervisor-only, the toolstack has not been modified.

 The intra-domain code has been tested. Violation notifications can only be 
 received
 for pages that have been modified (access permissions and/or gfn-mfn 
 mapping) 
 intra-domain, and only on VCPU's that have enabled notification.

 VMFUNC and #VE will both be emulated on hardware without native support.

 This code is not compatible with nested hvm functionality and will refuse to 
 work
 with nested hvm active. It is also not compatible with migration. It should be
 considered experimental.

Overall, this patch series is looking very good, and it would seem that
3rd party testing agrees!

~Andrew


Re: [Xen-devel] [PATCH v2 04/12] x86/altp2m: basic data structures and support routines.

2015-06-24 Thread Jan Beulich
 On 22.06.15 at 20:56, edmund.h.wh...@intel.com wrote:
 --- a/xen/include/asm-x86/hvm/hvm.h
 +++ b/xen/include/asm-x86/hvm/hvm.h
 @@ -210,6 +210,14 @@ struct hvm_function_table {
uint32_t *ecx, uint32_t *edx);
  
  void (*enable_msr_exit_interception)(struct domain *d);
 +
 +/* Alternate p2m */
 +int (*ahvm_vcpu_initialise)(struct vcpu *v);
 +void (*ahvm_vcpu_destroy)(struct vcpu *v);
 +int (*ahvm_vcpu_reset)(struct vcpu *v);
 +void (*ahvm_vcpu_update_eptp)(struct vcpu *v);
 +void (*ahvm_vcpu_update_vmfunc_ve)(struct vcpu *v);
 +bool_t (*ahvm_vcpu_emulate_ve)(struct vcpu *v);
  };

These ahvm_ prefixes are pretty strange - this isn't about
alternate HVM after all.

Jan



[Xen-devel] [PATCH RFC 8/9] libxl: introduce specific error codes in libxl_device_cdrom_insert

2015-06-24 Thread Rob Hoes
Signed-off-by: Rob Hoes rob.h...@citrix.com
---
 tools/libxl/libxl.c | 12 ++--
 tools/libxl/libxl_device.c  |  6 +++---
 tools/libxl/libxl_qmp.c |  4 +++-
 tools/libxl/libxl_types.idl | 22 +-
 4 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 2f56c6e..f41f291 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2848,25 +2848,25 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, 
libxl_device_disk *disk,
 
 libxl_domain_type type = libxl__domain_type(gc, domid);
 if (type == LIBXL_DOMAIN_TYPE_INVALID) {
-rc = ERROR_FAIL;
+rc = ERROR_INVAL_DOMAIN_TYPE;
 goto out;
 }
 if (type != LIBXL_DOMAIN_TYPE_HVM) {
 LOG(ERROR, "cdrom-insert requires an HVM domain");
-rc = ERROR_INVAL;
+rc = ERROR_NOHVM;
 goto out;
 }
 
 if (libxl_get_stubdom_id(ctx, domid) != 0) {
 LOG(ERROR, "cdrom-insert doesn't work for stub domains");
-rc = ERROR_INVAL;
+rc = ERROR_STUBDOM;
 goto out;
 }
 
 dm_ver = libxl__device_model_version_running(gc, domid);
 if (dm_ver == -1) {
 LOG(ERROR, "cannot determine device model version");
-rc = ERROR_FAIL;
+rc = ERROR_DM_VERSION_UNDETERMINED;
 goto out;
 }
 
@@ -2881,7 +2881,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, 
libxl_device_disk *disk,
 }
 if (i == num) {
 LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Virtual device not found");
-rc = ERROR_FAIL;
+rc = ERROR_DISK_VDEV_NOT_FOUND;
 goto out;
 }
 
@@ -2941,7 +2941,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, 
libxl_device_disk *disk,
 {
 LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Internal error: %s does not exist",
 libxl__sprintf(gc, "%s/frontend", path));
-rc = ERROR_FAIL;
+rc = ERROR_INTERNAL;
 goto out;
 }
 
diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 56c6e2e..1c5f659 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -271,7 +271,7 @@ int libxl__device_disk_set_backend(libxl__gc *gc, 
libxl_device_disk *disk) {
 if (disk->format == LIBXL_DISK_FORMAT_EMPTY) {
 if (!disk->is_cdrom) {
 LOG(ERROR, "Disk vdev=%s is empty but not cdrom", disk->vdev);
-return ERROR_INVAL;
+return ERROR_INVAL_DISK_FORMAT;
 }
 memset(&a.stab, 0, sizeof(a.stab));
 } else if ((disk->backend == LIBXL_DISK_BACKEND_UNKNOWN ||
@@ -281,7 +281,7 @@ int libxl__device_disk_set_backend(libxl__gc *gc, 
libxl_device_disk *disk) {
 if (stat(disk->pdev_path, &a.stab)) {
 LOGE(ERROR, "Disk vdev=%s failed to stat: %s",
 disk->vdev, disk->pdev_path);
-return ERROR_INVAL;
+return ERROR_DISK_PDEV_NOT_FOUND;
 }
 }
 
@@ -299,7 +299,7 @@ int libxl__device_disk_set_backend(libxl__gc *gc, 
libxl_device_disk *disk) {
 }
 if (!ok) {
 LOG(ERROR, "no suitable backend for disk %s", disk->vdev);
-return ERROR_INVAL;
+return ERROR_DISK_BACKEND_UNDETERMINED;
 }
 disk->backend = ok;
 return 0;
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 9aa7e2e..c687e86 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -817,9 +817,11 @@ static int qmp_run_command(libxl__gc *gc, int domid,
 
 qmp = libxl__qmp_initialize(gc, domid);
 if (!qmp)
-return ERROR_FAIL;
+return ERROR_QMP_INIT;
 
 rc = qmp_synchronous_send(qmp, cmd, args, callback, opaque, qmp->timeout);
+if (rc < 0)
+rc = ERROR_QMP_SEND;
 
 libxl__qmp_close(qmp);
 return rc;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 15e4af2..88262ca 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -107,6 +107,10 @@ libxl_error = Enumeration(error, [
 # Requested domain was not found
 (-21, DOMAIN_NOTFOUND),
 
+# Internal error; not actionable by the caller other than by doing 
something like
+# a retry/reboot (perhaps a libxl bug)
+(ENUM_PREV, INTERNAL),
+
 # Xenstore errors
 (ENUM_PREV, XS_CONNECT),
 (ENUM_PREV, XS_READ),
@@ -132,12 +136,28 @@ libxl_error = Enumeration(error, [
 # Disk parameters invalid
 (ENUM_PREV, INVAL_DISK_VDEV),
 (ENUM_PREV, INVAL_DISK_BACKEND),
+(ENUM_PREV, INVAL_DISK_FORMAT),
 
 # Disk parameters could not be determined
 (ENUM_PREV, DISK_VDEV_UNDETERMINED),
+(ENUM_PREV, DISK_BACKEND_UNDETERMINED),
 
-# Physical disk device could not be found
+# Physical/virtual disk device could not be found
 (ENUM_PREV, DISK_PDEV_NOT_FOUND),
+(ENUM_PREV, DISK_VDEV_NOT_FOUND),
+
+# Operation requires an HVM domain
+(ENUM_PREV, NOHVM),
+
+# Operation is not compatible with a stub 

[Xen-devel] [PATCH RFC 6/9] libxl: introduce specific error code for libxl__wait_device_connection

2015-06-24 Thread Rob Hoes
Signed-off-by: Rob Hoes rob.h...@citrix.com
---
 tools/libxl/libxl_device.c  | 1 +
 tools/libxl/libxl_types.idl | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 93bb41e..56c6e2e 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -768,6 +768,7 @@ void libxl__wait_device_connection(libxl__egc *egc, 
libxl__ao_device *aodev)
  LIBXL_INIT_TIMEOUT * 1000);
 if (rc) {
 LOG(ERROR, "unable to initialize device %s", be_path);
+rc = ERROR_DEVICE_WAIT_INIT;
 goto out;
 }
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index b905353..3c44b41 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -122,6 +122,9 @@ libxl_error = Enumeration(error, [
 (ENUM_PREV, JSON_GET_CONFIG),
 (ENUM_PREV, JSON_SET_CONFIG),
 (ENUM_PREV, JSON_PARSE_CONFIG),
+
+# Unable to initialise device connection watch
+(ENUM_PREV, DEVICE_WAIT_INIT),
 ], value_namespace = )
 
 libxl_domain_type = Enumeration(domain_type, [
-- 
2.4.1



[Xen-devel] [PATCH RFC 1/9] libxl idl: add comments to error enum

2015-06-24 Thread Rob Hoes
Signed-off-by: Rob Hoes rob.h...@citrix.com
---
 tools/libxl/libxl_types.idl | 41 +
 1 file changed, 41 insertions(+)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 65d479f..6dc18fa 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -44,26 +44,67 @@ MemKB = UInt(64, init_val = LIBXL_MEMKB_DEFAULT, 
json_gen_fn = libxl__uint64_
 #
 
 libxl_error = Enumeration(error, [
+# Generic failure; code should be avoided (often seen as rc = -1)
 (-1, NONSPECIFIC),
+
+# Libxl version mismatch
 (-2, VERSION),
+
+# General failure; code should be avoided
 (-3, FAIL),
+
+# Not implemented
 (-4, NI),
+
+# Out of memory (malloc or similar failed)
 (-5, NOMEM),
+
+# General failure; code should be avoided
 (-6, INVAL),
+
+# General failure; code should be avoided (used only in xl)
 (-7, BADFAIL),
+
+# Domain responded to suspend request
 (-8, GUEST_TIMEDOUT),
+
+# A xenstore watch has timed out
 (-9, TIMEDOUT),
+
+# The operation requires PV control, but the domain does not offer it
 (-10, NOPARAVIRT),
+
+# Event has not happened (libxl_event_check)
 (-11, NOT_READY),
+
+# osevent registration or modification hook failed
 (-12, OSEVENT_REG_FAIL),
+
+# fd buffer full (libxl_osevent_beforepoll)
 (-13, BUFFERFULL),
+
+# Process is not a child of the current libxl instance 
(libxl_childproc_reaped)
 (-14, UNKNOWN_CHILD),
+
+# Could not acquire lock
 (-15, LOCK_FAIL),
+
+# Unable to find JSON domain config
 (-16, JSON_CONFIG_EMPTY),
+
+# The requested device already exists
 (-17, DEVICE_EXISTS),
+
+# Remus ops do not match device
 (-18, REMUS_DEVOPS_DOES_NOT_MATCH),
+
+# Remus device not supported
 (-19, REMUS_DEVICE_NOT_SUPPORTED),
+
+# vNUMA config not valid
 (-20, VNUMA_CONFIG_INVALID),
+
+# Requested domain was not found
 (-21, DOMAIN_NOTFOUND),
 ], value_namespace = )
 
-- 
2.4.1



[Xen-devel] [PATCH RFC 3/9] libxl: introduce specific xenstore error codes

2015-06-24 Thread Rob Hoes
Signed-off-by: Rob Hoes rob.h...@citrix.com
---
 tools/libxl/libxl_types.idl |  8 
 tools/libxl/libxl_xshelp.c  | 14 +++---
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 6dc18fa..e9b3477 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -106,6 +106,14 @@ libxl_error = Enumeration(error, [
 
 # Requested domain was not found
 (-21, DOMAIN_NOTFOUND),
+
+# Xenstore errors
+(ENUM_PREV, XS_CONNECT),
+(ENUM_PREV, XS_READ),
+(ENUM_PREV, XS_WRITE),
+(ENUM_PREV, XS_TRANS_START),
+(ENUM_PREV, XS_TRANS_COMMIT),
+(ENUM_PREV, XS_REMOVE),
 ], value_namespace = )
 
 libxl_domain_type = Enumeration(domain_type, [
diff --git a/tools/libxl/libxl_xshelp.c b/tools/libxl/libxl_xshelp.c
index d7eaa66..e634ee5 100644
--- a/tools/libxl/libxl_xshelp.c
+++ b/tools/libxl/libxl_xshelp.c
@@ -174,7 +174,7 @@ int libxl__xs_read_checked(libxl__gc *gc, xs_transaction_t 
t,
 if (!result) {
 if (errno != ENOENT) {
 LOGE(ERROR, "xenstore read failed: `%s'", path);
-return ERROR_FAIL;
+return ERROR_XS_READ;
 }
 }
 *result_out = result;
@@ -187,7 +187,7 @@ int libxl__xs_write_checked(libxl__gc *gc, xs_transaction_t 
t,
 size_t length = strlen(string);
 if (!xs_write(CTX->xsh, t, path, string, length)) {
 LOGE(ERROR, "xenstore write failed: `%s' = `%s'", path, string);
-return ERROR_FAIL;
+return ERROR_XS_WRITE;
 }
 return 0;
 }
@@ -199,7 +199,7 @@ int libxl__xs_rm_checked(libxl__gc *gc, xs_transaction_t t, 
const char *path)
 return 0;
 
 LOGE(ERROR, "xenstore rm failed: `%s'", path);
-return ERROR_FAIL;
+return ERROR_XS_REMOVE;
 }
 return 0;
 }
@@ -210,7 +210,7 @@ int libxl__xs_transaction_start(libxl__gc *gc, 
xs_transaction_t *t)
 *t = xs_transaction_start(CTX->xsh);
 if (!*t) {
 LOGE(ERROR, "could not create xenstore transaction");
-return ERROR_FAIL;
+return ERROR_XS_TRANS_START;
 }
 return 0;
 }
@@ -225,7 +225,7 @@ int libxl__xs_transaction_commit(libxl__gc *gc, 
xs_transaction_t *t)
 return +1;
 
 LOGE(ERROR, "could not commit xenstore transaction");
-return ERROR_FAIL;
+return ERROR_XS_TRANS_COMMIT;
 }
 
 *t = 0;
@@ -257,7 +257,7 @@ int libxl__xs_path_cleanup(libxl__gc *gc, xs_transaction_t 
t,
 if (!xs_rm(CTX->xsh, t, path)) {
 if (errno != ENOENT)
 LOGE(DEBUG, "unable to remove path %s", path);
-rc = ERROR_FAIL;
+rc = ERROR_XS_REMOVE;
 goto out;
 }
 
@@ -274,7 +274,7 @@ int libxl__xs_path_cleanup(libxl__gc *gc, xs_transaction_t 
t,
 if (!xs_rm(CTX->xsh, t, path)) {
 if (errno != ENOENT)
 LOGE(DEBUG, "unable to remove path %s", path);
-rc = ERROR_FAIL;
+rc = ERROR_XS_REMOVE;
 goto out;
 }
 }
-- 
2.4.1



Re: [Xen-devel] [PATCH v2 10/12] x86/altp2m: define and implement alternate p2m HVMOP types.

2015-06-24 Thread Andrew Cooper
On 22/06/15 19:56, Ed White wrote:
 Signed-off-by: Ed White edmund.h.wh...@intel.com
 ---
  xen/arch/x86/hvm/hvm.c  | 216 
 
  xen/include/public/hvm/hvm_op.h |  69 +
  2 files changed, 285 insertions(+)

 diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
 index b758ee1..b3e74ce 100644
 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -6424,6 +6424,222 @@ long do_hvm_op(unsigned long op, 
 XEN_GUEST_HANDLE_PARAM(void) arg)
  break;
  }
  
 +case HVMOP_altp2m_get_domain_state:
 +{
 +struct xen_hvm_altp2m_domain_state a;
 +struct domain *d;
 +
 +if ( copy_from_guest(a, arg, 1) )
 +return -EFAULT;
 +
 +d = rcu_lock_domain_by_any_id(a.domid);
 +if ( d == NULL )
 +return -ESRCH;
 +
 +rc = -EINVAL;
 +if ( !is_hvm_domain(d) || !hvm_altp2m_supported() )
 +goto param_fail9;
 +
 +a.state = altp2mhvm_active(d);
 +rc = copy_to_guest(arg, a, 1) ? -EFAULT : 0;
 +
 +param_fail9:
 +rcu_unlock_domain(d);
 +break;
 +}
 +
 +case HVMOP_altp2m_set_domain_state:
 +{
 +struct xen_hvm_altp2m_domain_state a;
 +struct domain *d;
 +struct vcpu *v;
 +bool_t ostate;
 +
 +if ( copy_from_guest(a, arg, 1) )
 +return -EFAULT;
 +
 +d = rcu_lock_domain_by_any_id(a.domid);
 +if ( d == NULL )
 +return -ESRCH;
 +
 +rc = -EINVAL;
 +if ( !is_hvm_domain(d) || !hvm_altp2m_supported() ||
 + nestedhvm_enabled(d) )
 +goto param_fail10;
 +
 +ostate = d->arch.altp2m_active;
 +d->arch.altp2m_active = !!a.state;
 +
 +/* If the alternate p2m state has changed, handle appropriately */
 +if ( d->arch.altp2m_active != ostate )
 +{
 +if ( !ostate && !p2m_init_altp2m_by_id(d, 0) )
 +goto param_fail10;

Indentation.

 +
 +for_each_vcpu( d, v )
 +if (!ostate)
 +altp2mhvm_vcpu_initialise(v);
 +else
 +altp2mhvm_vcpu_destroy(v);

Although strictly speaking this is (almost) ok by the style guidelines,
it would probably be better to have braces for the for_each_vcpu()
loop.  Also, spaces for the brackets for !ostate.
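
A sketch of the layout being asked for, with braces around the loop body and spaces inside the if brackets. The types, the iterator macro and the init/destroy helpers below are hypothetical stand-ins so that the fragment compiles on its own; they are not the Xen definitions:

```c
#include <stdbool.h>

#define NR_VCPUS 4

/* Hypothetical stand-in for per-vcpu altp2m state. */
struct vcpu {
    bool altp2m_ready;
};

static void vcpu_init(struct vcpu *v)    { v->altp2m_ready = true; }
static void vcpu_destroy(struct vcpu *v) { v->altp2m_ready = false; }

/* Hypothetical iterator macro, standing in for for_each_vcpu(). */
#define for_each_sketch_vcpu(vcpus, v) \
    for ( (v) = (vcpus); (v) < (vcpus) + NR_VCPUS; (v)++ )

/* Initialise every vcpu if altp2m was previously off, else tear down. */
static void toggle_all(struct vcpu *vcpus, bool ostate)
{
    struct vcpu *v;

    for_each_sketch_vcpu( vcpus, v )
    {
        if ( !ostate )
            vcpu_init(v);
        else
            vcpu_destroy(v);
    }
}
```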

 +
 +if ( ostate )
 +p2m_flush_altp2m(d);
 +}
 +
 +rc = 0;
 +
 +param_fail10:
 +rcu_unlock_domain(d);
 +break;
 +}
 +
 +case HVMOP_altp2m_vcpu_enable_notify:
 +{
 +struct domain *curr_d = current->domain;
 +struct vcpu *curr = current;
 +struct xen_hvm_altp2m_vcpu_enable_notify a;
 +
 +if ( copy_from_guest(a, arg, 1) )
 +return -EFAULT;
 +
 +if ( !is_hvm_domain(curr_d) || !hvm_altp2m_supported() ||
 + !curr_d->arch.altp2m_active || vcpu_altp2mhvm(curr).veinfo_gfn )
 +return -EINVAL;
 +
 +vcpu_altp2mhvm(curr).veinfo_gfn = a.pfn;
 +ahvm_vcpu_update_vmfunc_ve(curr);

You need a gfn bounds check against the host p2m here.
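
Something along these lines, as a rough sketch only. `struct host_p2m`, its `max_mapped_gfn` field and both helpers are hypothetical names for the example, and treating "in bounds" as "not above the highest mapped gfn" is an assumption about what the check should compare against:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: the highest gfn currently mapped in the host p2m. */
struct host_p2m {
    uint64_t max_mapped_gfn;
};

/* True if gfn lies within the range the host p2m covers. */
static bool gfn_in_bounds(const struct host_p2m *p2m, uint64_t gfn)
{
    return gfn <= p2m->max_mapped_gfn;
}

/*
 * Record the guest-supplied gfn only after validating it.
 * Returns 0 on success, -EINVAL if the gfn is out of range.
 */
static int set_veinfo_gfn(const struct host_p2m *p2m,
                          uint64_t *veinfo_gfn, uint64_t gfn)
{
    if ( !gfn_in_bounds(p2m, gfn) )
        return -EINVAL;

    *veinfo_gfn = gfn;
    return 0;
}
```

The point is that the value is rejected before it is stored, so nothing downstream sees an unchecked gfn.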

 +rc = 0;
 +
 +break;
 +}
 +
 +case HVMOP_altp2m_create_p2m:
 +{
 +struct xen_hvm_altp2m_view a;
 +struct domain *d;
 +
 +if ( copy_from_guest(a, arg, 1) )
 +return -EFAULT;
 +
 +d = rcu_lock_domain_by_any_id(a.domid);
 +if ( d == NULL )
 +return -ESRCH;
 +
 +rc = -EINVAL;
 +if ( !is_hvm_domain(d) || !hvm_altp2m_supported() ||
 + !d-arch.altp2m_active )
 +goto param_fail11;
 +
 +if ( !p2m_init_next_altp2m(d, a.view) )
 +goto param_fail11;
 +
 +rc = copy_to_guest(arg, a, 1) ? -EFAULT : 0;
 +
 +param_fail11:
 +rcu_unlock_domain(d);
 +break;
 +}
 +
 +case HVMOP_altp2m_destroy_p2m:
 +{
 +struct xen_hvm_altp2m_view a;
 +struct domain *d;
 +
 +if ( copy_from_guest(a, arg, 1) )
 +return -EFAULT;
 +
 +d = rcu_lock_domain_by_any_id(a.domid);
 +if ( d == NULL )
 +return -ESRCH;
 +
 +rc = -EINVAL;
 +if ( !is_hvm_domain(d) || !hvm_altp2m_supported() ||
 + !d-arch.altp2m_active )
 +goto param_fail12;
 +
 +if ( p2m_destroy_altp2m_by_id(d, a.view) )
 +rc = 0;
 +
 +param_fail12:
 +rcu_unlock_domain(d);
 +break;
 +}
 +
 +case HVMOP_altp2m_switch_p2m:
 +{
 +struct xen_hvm_altp2m_view a;
 +struct domain *d;
 +
 +if ( copy_from_guest(a, arg, 1) )
 +return -EFAULT;
 +
 +d = rcu_lock_domain_by_any_id(a.domid);
 +if ( d == NULL )
 +return -ESRCH;
 +
 +rc = -EINVAL;
 +if ( !is_hvm_domain(d) || !hvm_altp2m_supported() ||
 + !d-arch.altp2m_active )
 +

Re: [Xen-devel] [PATCH v4 09/17] x86/hvm: unify stdvga mmio intercept with standard mmio intercept

2015-06-24 Thread Jan Beulich
 On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
 @@ -424,8 +427,17 @@ static void stdvga_mem_writeb(uint64_t addr, uint32_t 
 val)
  }
  }
  
 -static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
 +static int stdvga_mem_write(struct vcpu *v, unsigned long addr,
 +unsigned long size, unsigned long data)
  {
 +ioreq_t p = { .type = IOREQ_TYPE_COPY,
 +  .addr = addr,
 +  .size = size,
 +  .count = 1,
 +  .dir = IOREQ_WRITE,
 +  .data = data,

Indentation.

 -if ( s->stdvga && s->cache )
 -{
 -switch ( p->type )
 -{
 -case IOREQ_TYPE_COPY:
 -buf = mmio_move(s, p);
 -if ( !buf )
 -s->cache = 0;
 -break;
 -default:
 -gdprintk(XENLOG_WARNING, "unsupported mmio request type:%d "
 - "addr:0x%04x data:0x%04x size:%d count:%d state:%d "
 - "isptr:%d dir:%d df:%d\n",
 - p->type, (int)p->addr, (int)p->data, (int)p->size,
 - (int)p->count, p->state,
 - p->data_is_ptr, p->dir, p->df);
 -s->cache = 0;
 -}

I can't see where these cases of clearing s->cache move to.

 -}
 -else
 -{
 -buf = (p->dir == IOREQ_WRITE);
 -}
 -
 -rc = (buf && hvm_buffered_io_send(p));
 +rc = s->stdvga && s->cache &&
 +(addr >= VGA_MEM_BASE) &&
 +((addr + length) < (VGA_MEM_BASE + VGA_MEM_SIZE));

Note how the old code also calls hvm_buffered_io_send() when
!s->stdvga || !s->cache but p->dir == IOREQ_WRITE. Do you
really mean to drop that?

Jan



Re: [Xen-devel] [PATCH v4 09/17] x86/hvm: unify stdvga mmio intercept with standard mmio intercept

2015-06-24 Thread Paul Durrant
 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: 24 June 2015 14:59
 To: Paul Durrant
 Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: Re: [PATCH v4 09/17] x86/hvm: unify stdvga mmio intercept with
 standard mmio intercept
 
  On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
  @@ -424,8 +427,17 @@ static void stdvga_mem_writeb(uint64_t addr,
 uint32_t val)
   }
   }
 
  -static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
  +static int stdvga_mem_write(struct vcpu *v, unsigned long addr,
  +unsigned long size, unsigned long data)
   {
  +ioreq_t p = { .type = IOREQ_TYPE_COPY,
  +  .addr = addr,
  +  .size = size,
  +  .count = 1,
  +  .dir = IOREQ_WRITE,
  +  .data = data,
 
 Indentation.
 

Damn emacs.

  -if ( s->stdvga && s->cache )
  -{
  -switch ( p->type )
  -{
  -case IOREQ_TYPE_COPY:
  -buf = mmio_move(s, p);
  -if ( !buf )
  -s->cache = 0;
  -break;
  -default:
  -gdprintk(XENLOG_WARNING, "unsupported mmio request type:%d "
  - "addr:0x%04x data:0x%04x size:%d count:%d state:%d "
  - "isptr:%d dir:%d df:%d\n",
  - p->type, (int)p->addr, (int)p->data, (int)p->size,
  - (int)p->count, p->state,
  - p->data_is_ptr, p->dir, p->df);
  -s->cache = 0;
  -}
 
 I can't see where these cases of clearing s->cache move to.
 

There's only one case AFAICT, which is if the domain goes through a 
save/restore then s->cache is cleared.

  -}
  -else
  -{
  -buf = (p->dir == IOREQ_WRITE);
  -}
  -
  -rc = (buf && hvm_buffered_io_send(p));
  +rc = s->stdvga && s->cache &&
  +(addr >= VGA_MEM_BASE) &&
  +((addr + length) < (VGA_MEM_BASE + VGA_MEM_SIZE));
 
 Note how the old code also calls hvm_buffered_io_send() when
 !s->stdvga || !s->cache but p->dir == IOREQ_WRITE. Do you
 really mean to drop that?
 

Hmmm. I'm not sure why it would have done that. It seems wrong. I'll check.

  Paul

 Jan



Re: [Xen-devel] [linux-arm-xen test] 58849: regressions - FAIL

2015-06-24 Thread Stefano Stabellini
On Wed, 24 Jun 2015, Ian Campbell wrote:
 On Wed, 2015-06-24 at 06:03 +, osstest service user wrote:
  flight 58849 linux-arm-xen real [real]
  http://logs.test-lab.xenproject.org/osstest/logs/58849/
  
  Regressions :-(
  
  Tests which did not succeed and are blocking,
  including tests which could not be run:
   test-armhf-armhf-xl-cubietruck 11 guest-start fail REGR. vs. 
  58830
 
 This was:
 http://logs.test-lab.xenproject.org/osstest/logs/58849/test-armhf-armhf-xl-cubietruck/cubietruck-braque---var-log-kern.log
 
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.637687] [ cut 
 here ]
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.637756] kernel BUG at 
 drivers/xen/grant-table.c:923!
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.637784] Internal error: Oops 
 - BUG: 0 [#1] SMP ARM
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.637810] Modules linked in: 
 xen_gntalloc bridge stp ipv6 llc brcmfmac brcmutil cfg80211
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.637899] CPU: 0 PID: 16206 
 Comm: vif1.0-q0-guest Not tainted 3.16.7-ckt12+ #1
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.637936] task: c12fc480 ti: 
 d2d3c000 task.ti: d2d3c000
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.637977] PC is at 
 gnttab_batch_copy+0xd4/0xe0
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638004] LR is at 
 gnttab_batch_copy+0x1c/0xe0
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638030] pc : [c04abf7c]
 lr : [c04abec4]psr: a013
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638030] sp : d2d3deb0  ip : 
 deadbeef  fp : d2d3df3c
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638091] r10: 0001  r9 : 
   r8 : 0008
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638124] r7 : 0001  r6 : 
 0001  r5 :   r4 : e1e38d30
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638159] r3 : 0001  r2 : 
 deadbeef  r1 : deadbeef  r0 : fff2
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638193] Flags: NzCv  IRQs on 
  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638227] Control: 10c5387d  
 Table: 7b50406a  DAC: 0015
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638257] Process 
 vif1.0-q0-guest (pid: 16206, stack limit = 0xd2d3c248)
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638287] Stack: (0xd2d3deb0 
 to 0xd2d3e000)
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638316] dea0:
  0001   e1e3
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638353] dec0: 0001 
 c05d7c44 003e 0ec2 d2d3df3c   0001
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638391] dee0: dbbb7a80 
   0008  d2d3df20 e1e38cfc e1e38d30
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638430] df00: 0001 
  0001  e1e38d30 e1e63530 003e 0208
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638468] df20: d9f0f480 
 d9f0f480 0001  d2d3df2c d2d3df34 d2d3df34 
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638505] df40:  
 db34e380  e1e3 c05d776c   
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638543] df60:  
 c0264138    e1e3  
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638581] df80: d2d3df80 
 d2d3df80   d2d3df90 d2d3df90 d2d3dfac db34e380
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638638] dfa0: c026406c 
   c020f038    
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638686] dfc0:  
       
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638723] dfe0:  
    0013   
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638785] [c04abf7c] 
 (gnttab_batch_copy) from [c05d7c44] (xenvif_kthread_guest_rx+0x4d8/0xbc0)
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638841] [c05d7c44] 
 (xenvif_kthread_guest_rx) from [c0264138] (kthread+0xcc/0xe8)
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638887] [c0264138] 
 (kthread) from [c020f038] (ret_from_fork+0x14/0x3c)
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638929] Code: 0ae5 
 eaed e8bd80f8 e7f001f2 (e7f001f2) 
 Jun 24 04:09:13 cubietruck-braque kernel: [  807.638978] ---[ end trace 
 98c74482d9a5771d ]---
 
 This looks familiar, although I can't seem to find it. Does anyone
 remember it? Are we missing a backport perhaps?
 
 This is the 3.16.y based linux-arm-xen tree, which was recently updated
 from a baseline of v3.16.4-ckt7 to v3.16.7-ckt12 (flight 58830), in both
 cases plus xen_arch_need_swiotlb for swiotlb stuff.
 
 This here was the next flight, which only added "xen: netback: read
 hotplug script once at start of day.", which I 

[Xen-devel] [xen-4.1-testing test] 58847: regressions - FAIL

2015-06-24 Thread osstest service user
flight 58847 xen-4.1-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58847/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-pair   21 guest-migrate/src_host/dst_host fail REGR. vs. 27396

Regressions which are regarded as allowable (not blocking):
 test-i386-i386-pair 21 guest-migrate/src_host/dst_host fail like 27420
 test-amd64-amd64-pair   21 guest-migrate/src_host/dst_host fail like 27420

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64         1 build-check(1)      blocked n/a
 test-i386-i386-rumpuserxen-i386            1 build-check(1)      blocked n/a
 test-amd64-i386-rumpuserxen-i386           1 build-check(1)      blocked n/a
 build-i386-rumpuserxen                     5 rumpuserxen-build   fail    never pass
 build-amd64-rumpuserxen                    5 rumpuserxen-build   fail    never pass
 build-amd64-libvirt                        5 libvirt-build       fail    never pass
 test-i386-i386-libvirt                     5 xen-install         fail    never pass
 test-amd64-amd64-libvirt                   5 xen-install         fail    never pass
 test-amd64-i386-libvirt                    5 xen-install         fail    never pass
 test-i386-i386-xl-sedf-pin                15 guest-saverestore.2 fail    never pass
 build-i386-libvirt                         5 libvirt-build       fail    never pass
 test-amd64-amd64-xl-win7-amd64            16 guest-stop          fail    never pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64 20 leak-check/check    fail    never pass
 test-amd64-i386-xl-qemut-win7-amd64       16 guest-stop          fail    never pass
 test-amd64-amd64-xl-qemut-winxpsp3        16 guest-stop          fail    never pass
 test-amd64-amd64-xl-qemut-win7-amd64      16 guest-stop          fail    never pass
 test-amd64-i386-qemuu-freebsd10-i386      21 leak-check/check    fail    never pass
 test-amd64-i386-xl-win7-amd64             16 guest-stop          fail    never pass
 test-i386-i386-xl-qemut-winxpsp3          16 guest-stop          fail    never pass
 test-amd64-i386-xl-qemut-debianhvm-amd64  20 leak-check/check    fail    never pass
 test-amd64-i386-xend-qemut-winxpsp3       20 leak-check/check    fail    never pass
 test-amd64-i386-xl-winxpsp3-vcpus1        16 guest-stop          fail    never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  16 guest-stop          fail    never pass
 test-amd64-i386-xend-winxpsp3             20 leak-check/check    fail    never pass
 test-amd64-amd64-xl-winxpsp3              16 guest-stop          fail    never pass
 test-amd64-i386-qemuu-freebsd10-amd64     21 leak-check/check    fail    never pass
 test-i386-i386-xl-winxpsp3                16 guest-stop          fail    never pass

version targeted for testing:
 xen  40feff8733e2ac27561a27e7c009a61ba3b320fe
baseline version:
 xen  8995a94f8f88b174dabd1289d1d54c1dcfe7c78d


People who touched revisions under test:
  Gonglei arei.gong...@huawei.com
  Ian Jackson ian.jack...@eu.citrix.com
  Paolo Bonzini pbonz...@redhat.com
  Petr Matousek pmato...@redhat.com
  Stefan Hajnoczi stefa...@redhat.com


jobs:
 build-amd64                                pass
 build-i386                                 pass
 build-amd64-libvirt                        fail
 build-i386-libvirt                         fail
 build-amd64-pvops                          pass
 build-i386-pvops                           pass
 build-amd64-rumpuserxen                    fail
 build-i386-rumpuserxen                     fail
 test-amd64-amd64-xl                        pass
 test-amd64-i386-xl                         pass
 test-i386-i386-xl                          pass
 test-amd64-i386-rhel6hvm-amd               pass
 test-amd64-i386-qemut-rhel6hvm-amd         pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64  fail
 test-amd64-i386-xl-qemut-debianhvm-amd64   fail
 test-amd64-i386-qemuu-freebsd10-amd64      fail
 test-amd64-amd64-rumpuserxen-amd64         blocked
 test-amd64-amd64-xl-qemut-win7-amd64       fail
 test-amd64-i386-xl-qemut-win7-amd64        fail
 test-amd64-amd64-xl-win7-amd64             fail
 test-amd64-i386-xl-win7-amd64              fail
 test-amd64-amd64-xl-credit2                pass
 test-i386-i386-xl-credit2                  pass
 test-amd64-i386-qemuu-freebsd10-i386       fail
 test-amd64-i386-rumpuserxen-i386 

Re: [Xen-devel] [PATCH v2 06/12] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.

2015-06-24 Thread Jan Beulich
 On 22.06.15 at 20:56, edmund.h.wh...@intel.com wrote:
 @@ -1826,6 +1827,20 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
      vmx_vmcs_exit(v);
  }
  
 +static bool_t vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
 +{
 +    bool_t rc = 0;
 +
 +    if ( !cpu_has_vmx_vmfunc && altp2mhvm_active(current->domain) &&
 +         regs->eax == 0 &&
 +         p2m_switch_vcpu_altp2m_by_id(current, (uint16_t)regs->ecx) )
 +    {
 +        regs->eip += 3;

What if the instruction has some (bogus but not invalid) opcode
prefix?
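
To make the concern concrete, here is a minimal standalone sketch (plain C, not Xen code) of advancing by the decoded length instead of a fixed 3 bytes. The names is_ignorable_prefix() and vmfunc_insn_len() are made up for illustration, and the set of prefixes treated as ignorable (segment overrides and address size) is an assumption, not a statement about what hardware accepts in front of VMFUNC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: segment overrides and the address-size prefix are ignorable. */
static int is_ignorable_prefix(uint8_t b)
{
    switch ( b )
    {
    case 0x26: case 0x2e: case 0x36: case 0x3e: /* ES, CS, SS, DS overrides */
    case 0x64: case 0x65:                       /* FS, GS overrides */
    case 0x67:                                  /* address-size prefix */
        return 1;
    default:
        return 0;
    }
}

/*
 * Return the total length of a VMFUNC instruction (0f 01 d4), counting any
 * leading ignorable prefixes, or 0 if the bytes do not decode as VMFUNC.
 */
static size_t vmfunc_insn_len(const uint8_t *insn, size_t avail)
{
    size_t i = 0;

    assert(insn != NULL);

    /* Skip prefixes; i never exceeds avail, so avail - i cannot wrap. */
    while ( i < avail && is_ignorable_prefix(insn[i]) )
        i++;

    if ( avail - i < 3 )
        return 0;
    if ( insn[i] != 0x0f || insn[i + 1] != 0x01 || insn[i + 2] != 0xd4 )
        return 0;

    return i + 3;
}
```

With something along these lines the caller would add the returned length to the instruction pointer, and treat a return of 0 as "not handled" rather than skipping 3 bytes blindly.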

 @@ -2091,6 +2108,13 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
          vpid_sync_vcpu_gva(curr, vaddr);
  }
  
 +static int vmx_vmfunc_intercept(struct cpu_user_regs *regs)
 +{
 +    gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
 +    domain_crash(current->domain);
 +    return X86EMUL_OKAY;
 +}

What is this unconditional crashing of the guest good for?

 --- a/xen/arch/x86/x86_emulate/x86_emulate.c
 +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
 @@ -3837,6 +3837,14 @@ x86_emulate(
              goto rdtsc;
          }
  
 +        if (modrm == 0xd4) /* vmfunc */
 +        {
 +            fail_if(ops->vmfunc == NULL);
 +            if ( (rc = ops->vmfunc(ctxt) != 0) )
 +                goto done;
 +            break;
 +        }

Together with the two preceding if()-s this is now finally the point
where switch() should be used instead.
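
Roughly the shape meant here, as a standalone illustration rather than the actual x86_emulate() code: the 0xd4 value comes from the hunk above, while 0xdf (invlpga) and 0xf9 (rdtscp) are assumed to be what the two preceding if()-s test. The names grp7_op and classify_grp7_modrm() are made up for the sketch.

```c
#include <stdint.h>

/* Hypothetical labels for the special-cased ModRM bytes of opcode 0f 01. */
enum grp7_op { GRP7_OTHER, GRP7_VMFUNC, GRP7_INVLPGA, GRP7_RDTSCP };

/* One switch() replaces the chain of "if ( modrm == ... )" tests. */
static enum grp7_op classify_grp7_modrm(uint8_t modrm)
{
    switch ( modrm )
    {
    case 0xd4: return GRP7_VMFUNC;
    case 0xdf: return GRP7_INVLPGA;
    case 0xf9: return GRP7_RDTSCP;
    default:   return GRP7_OTHER; /* left to the generic reg-field decode */
    }
}
```

Adding a further special case then means adding one case label instead of another if() block.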

Jan



[Xen-devel] Reply: (xen 4.6 unstable) triple fault when execute fxsave during the procedure of guest iso install

2015-06-24 Thread Fanhenglong
The Windows 8.0 ISO can't be installed in UEFI mode, and
I have to use WinDbg to get more debug information,
so I am using Xen in this case.


From: Razvan Cojocaru [mailto:rcojoc...@bitdefender.com] 
Sent: June 24, 2015 17:26
To: Fanhenglong; xen-devel@lists.xen.org
Cc: Liuqiming (John); Yanqiangjun; Huangpeng (Peter); Hanweidong (Randy)
Subject: Re: [Xen-devel] (xen 4.6 unstable) triple fault when execute fxsave during 
the procedure of guest iso install

On 06/24/2015 12:14 PM, Fanhenglong wrote:
 I want to debug the procedure of windows os install with windbg,
 
 windbg executes instruction(fxsave) after the blank vm is started and 
 before guest iso start to install,
 
 fxsave trigger the following code path:
 vmx_vmexit_handler(EXIT_REASON_EPT_VIOLATION)
 ->ept_handle_violation
 ->hvm_hap_nested_page_fault
 ->handle_mmio_with_translation
 ->handle_mmio
 ->hvm_emulate_one
 ->x86_emulate
 
 *X86_emulate return X86EMUL_UNHANDLEABLE*

How are you using Xen in this case? Are you by any chance using the vm_event 
system in a way that sends back an emulate vm_event response from userspace?

You might want to look at x86_emulate() in 
xen/arch/x86/x86_emulate/x86_emulate.c and see if (and how) fxsave is being 
handled.


HTH,
Razvan

[Xen-devel] [PATCH v8 8/9] video: fbdev: s3fb: use arch_phys_wc_add() and pci_iomap_wc()

2015-06-24 Thread Luis R. Rodriguez
From: Luis R. Rodriguez mcg...@suse.com

This driver uses the same area for MTRR as for the ioremap().
Convert the driver from using the x86-specific MTRR code to
the architecture-agnostic arch_phys_wc_add(). arch_phys_wc_add()
will avoid MTRR if write-combining is available. In order to
take advantage of that, also ensure the ioremap'd area is requested
as write-combining.

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away; MTRR is architecture-specific and on
   x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle
SmPL patch. It additionally required manual intervention to
address all the #ifdery and to remove redundant things which
arch_phys_wc_add() already addresses, such as the verbose message
when MTRR fails and doing nothing when we didn't get
an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Jean-Christophe Plagniol-Villard plagn...@jcrosoft.com
Cc: Tomi Valkeinen tomi.valkei...@ti.com
Cc: Jingoo Han jg1@samsung.com
Cc: Geert Uytterhoeven ge...@linux-m68k.org
Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Lad, Prabhakar prabhakar.cse...@gmail.com
Cc: Rickard Strandqvist rickard_strandqv...@spectrumdigital.se
Cc: Suresh Siddha sbsid...@gmail.com
Cc: Ingo Molnar mi...@elte.hu
Cc: Thomas Gleixner t...@linutronix.de
Cc: Juergen Gross jgr...@suse.com
Cc: Andy Lutomirski l...@amacapital.net
Cc: Dave Airlie airl...@redhat.com
Cc: Antonino Daplas adap...@gmail.com
Cc: linux-fb...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Tomi Valkeinen tomi.valkei...@ti.com
Signed-off-by: Luis R. Rodriguez mcg...@suse.com
---
 drivers/video/fbdev/s3fb.c | 35 ++-
 1 file changed, 6 insertions(+), 29 deletions(-)

diff --git a/drivers/video/fbdev/s3fb.c b/drivers/video/fbdev/s3fb.c
index f0ae61a..13b1090 100644
--- a/drivers/video/fbdev/s3fb.c
+++ b/drivers/video/fbdev/s3fb.c
@@ -28,13 +28,9 @@
 #include <linux/i2c.h>
 #include <linux/i2c-algo-bit.h>
 
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif
-
 struct s3fb_info {
 	int chip, rev, mclk_freq;
-	int mtrr_reg;
+	int wc_cookie;
 	struct vgastate state;
 	struct mutex open_lock;
 	unsigned int ref_count;
@@ -154,11 +150,7 @@ static const struct svga_timing_regs s3_timing_regs = {
 
 
 static char *mode_option;
-
-#ifdef CONFIG_MTRR
 static int mtrr = 1;
-#endif
-
 static int fasttext = 1;
 
 
@@ -170,11 +162,8 @@ module_param(mode_option, charp, 0444);
 MODULE_PARM_DESC(mode_option, "Default video mode ('640x480-8@60', etc)");
 module_param_named(mode, mode_option, charp, 0444);
 MODULE_PARM_DESC(mode, "Default video mode ('640x480-8@60', etc) (deprecated)");
-
-#ifdef CONFIG_MTRR
 module_param(mtrr, int, 0444);
 MODULE_PARM_DESC(mtrr, "Enable write-combining with MTRR (1=enable, 0=disable, default=1)");
-#endif
 
 module_param(fasttext, int, 0644);
 MODULE_PARM_DESC(fasttext, "Enable S3 fast text mode (1=enable, 0=disable, default=1)");
@@ -1168,7 +1157,7 @@ static int s3_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	info->fix.smem_len = pci_resource_len(dev, 0);
 
 	/* Map physical IO memory address into kernel space */
-	info->screen_base = pci_iomap(dev, 0, 0);
+	info->screen_base = pci_iomap_wc(dev, 0, 0);
 	if (! info->screen_base) {
 		rc = -ENOMEM;
 		dev_err(info->device, "iomap for framebuffer failed\n");
@@ -1365,12 +1354,9 @@ static int s3_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	/* Record a reference to the driver data */
 	pci_set_drvdata(dev, info);
 
-#ifdef CONFIG_MTRR
-	if (mtrr) {
-		par->mtrr_reg = -1;
-		par->mtrr_reg = 

Re: [Xen-devel] [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants

2015-06-24 Thread Luis R. Rodriguez
On Thu, Jun 25, 2015 at 09:38:01AM +1000, Benjamin Herrenschmidt wrote:
 On Wed, 2015-06-24 at 15:29 -0700, Luis R. Rodriguez wrote:
 
  Nope but at least what made me squint at this being a possible
  feature was that in practice when reviewing all of the kernel's
  pending device drivers using MTRR (potential write-combine candidates)
  I encountered a slew of them which had the architecturally unfortunate
  practice of combining PCI bars for MMIO and their respective
  write-combined desirable area (framebuffer for video, PIO buffers for
  infiniband, etc). Now, to me that read more as a practice for old
  school devices when such things were likely still being evaluated,
  more modern devices seem to adhere to sticking a full PCI bar with
  write-combining or not. Did you not encounter such mismatch splits on
  powerpc ? Was such possibility addressed?
 
 Yes, I remember we dealt with some networking (or maybe IB) stuff back
 in the day. We dealt with it by using a WC mapping and explicit barriers
 to prevent combine when not wanted.
 
 It is to be noted that on powerpc at least, writel() and co will never
 combine due to the memory barriers in them. Only normal stores (or
 __raw_writel) will.
 
 On Intel things are different I assume...

And the people who really know seem to be eaten by volcanoes or not have time.

 The problem I see is that architectures can provide widely different
 mechanisms and semantics in those areas and it's hard to define a
 generic driver interface.

Provided asm generic helpers are defined this should work though. The question
is just whether there is enough motivation. It doesn't sound like it, or as you
note maybe for userspace there might be. My position is that if it was too late
for PCIE, or if this was too ambiguous for PCIE, perhaps the next generation bus
architecture or amendments (I have no clue if this would be possible) will
make this part of future device negotiation clear and fully expected, not a
wonderful side effect.

  If what you are implying here is applicable to the x86 world I'm all
  for enabling this as we'd have less code to maintain but I'll note
  that getting a clarification alone on that prefetchable !=
  write-combining was in and of itself hard, I'd be surprised if we
  could get full architectural buy-in to this as an immediate automatic
  feature.
 
 I'm happy not to make it automatic for kernel space.

OK thanks I'll proceed with these patches then.

 As for user mappings,

Which APIs were you considering in this regard BTW?

 maybe the right thing to do is to let us do what we do by
 default with a quirk that can set a flag in pci_dev to disable that
 behaviour (maybe on a per BAR basis ?).

That might mean it could restrict userspace WC to require devices
to have WC parts on a full PCI BAR. Although this is restrictive,
having reviewed most WC uses in the kernel I'd think this would be
a fair compromise to make. But again, if things are still murky,
perhaps it's best we kiss this idea goodbye for now and hope for it
to come in on future buses or amendments (if that's even possible?).

 I think the common case is that WC works.

If WC does not, I will note one hack which might be worth mentioning, just for
the record. It was devised to cope with a shortcoming of a device where they
failed to split things properly and *without* WC performance suffered quite a
bit, so they made one full PCI BAR WC and used this as a workaround:

http://lkml.kernel.org/r/20150416041837.GA5712@hykim-PC

That is for registers that needed it:

write; wmb;

Then if they wanted to wait till the NIC has seen the write, they did:

write; wmb; read;

  Luis


Re: [Xen-devel] [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants

2015-06-24 Thread Benjamin Herrenschmidt
On Wed, 2015-06-24 at 17:58 -0700, Luis R. Rodriguez wrote:
 On Wed, Jun 24, 2015 at 5:52 PM, Benjamin Herrenschmidt
 b...@kernel.crashing.org wrote:
  On Thu, 2015-06-25 at 02:08 +0200, Luis R. Rodriguez wrote:
 
  OK thanks I'll proceed with these patches then.
 
   As for user mappings,
 
  Which APIs were you considering in this regard BTW?
 
  mmap of the generic /sys/bus/pci/.../resource*
 
 Like? Got a demo patch in mind ? :)

Nope. I was just thinking out loud. Today I have yet to see a problem
with what we do so ...

Cheers,
Ben.




[Xen-devel] [PATCH v8 2/9] video: fbdev: i740fb: use arch_phys_wc_add() and pci_ioremap_wc_bar()

2015-06-24 Thread Luis R. Rodriguez
From: Luis R. Rodriguez mcg...@suse.com

Convert the driver from using the x86-specific MTRR code to
the architecture-agnostic arch_phys_wc_add(). arch_phys_wc_add()
will avoid MTRR if write-combining is available. In order to
take advantage of that, also ensure the ioremap'd area is requested
as write-combining.

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away; MTRR is architecture-specific and on
   x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle
SmPL patch. It additionally required manual intervention to
address all the #ifdery and to remove redundant things which
arch_phys_wc_add() already addresses, such as the verbose message
when MTRR fails and doing nothing when we didn't get
an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Jingoo Han jg1@samsung.com
Cc: Bjorn Helgaas bhelg...@google.com
Cc: Geert Uytterhoeven ge...@linux-m68k.org
Cc: Rob Clark robdcl...@gmail.com
Cc: Benoit Taine benoit.ta...@lip6.fr
Cc: Suresh Siddha sbsid...@gmail.com
Cc: Ingo Molnar mi...@elte.hu
Cc: Thomas Gleixner t...@linutronix.de
Cc: Juergen Gross jgr...@suse.com
Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Andy Lutomirski l...@amacapital.net
Cc: Dave Airlie airl...@redhat.com
Cc: Antonino Daplas adap...@gmail.com
Cc: Jean-Christophe Plagniol-Villard plagn...@jcrosoft.com
Cc: Tomi Valkeinen tomi.valkei...@ti.com
Cc: linux-fb...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Tomi Valkeinen tomi.valkei...@ti.com
Signed-off-by: Luis R. Rodriguez mcg...@suse.com
---
 drivers/video/fbdev/i740fb.c | 35 ++-
 1 file changed, 6 insertions(+), 29 deletions(-)

diff --git a/drivers/video/fbdev/i740fb.c b/drivers/video/fbdev/i740fb.c
index a2b4204..452e116 100644
--- a/drivers/video/fbdev/i740fb.c
+++ b/drivers/video/fbdev/i740fb.c
@@ -27,24 +27,15 @@
 #include <linux/console.h>
 #include <video/vga.h>
 
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif
-
 #include "i740_reg.h"
 
 static char *mode_option;
-
-#ifdef CONFIG_MTRR
 static int mtrr = 1;
-#endif
 
 struct i740fb_par {
 	unsigned char __iomem *regs;
 	bool has_sgram;
-#ifdef CONFIG_MTRR
-	int mtrr_reg;
-#endif
+	int wc_cookie;
 	bool ddc_registered;
 	struct i2c_adapter ddc_adapter;
 	struct i2c_algo_bit_data ddc_algo;
@@ -1040,7 +1031,7 @@ static int i740fb_probe(struct pci_dev *dev, const struct pci_device_id *ent)
 		goto err_request_regions;
 	}
 
-	info->screen_base = pci_ioremap_bar(dev, 0);
+	info->screen_base = pci_ioremap_wc_bar(dev, 0);
 	if (!info->screen_base) {
 		dev_err(info->device, "error remapping base\n");
 		ret = -ENOMEM;
@@ -1144,13 +1135,9 @@ static int i740fb_probe(struct pci_dev *dev, const struct pci_device_id *ent)
 
 	fb_info(info, "%s frame buffer device\n", info->fix.id);
 	pci_set_drvdata(dev, info);
-#ifdef CONFIG_MTRR
-	if (mtrr) {
-		par->mtrr_reg = -1;
-		par->mtrr_reg = mtrr_add(info->fix.smem_start,
-				info->fix.smem_len, MTRR_TYPE_WRCOMB, 1);
-	}
-#endif
+	if (mtrr)
+		par->wc_cookie = arch_phys_wc_add(info->fix.smem_start,
+						  info->fix.smem_len);
 	return 0;
 
 err_reg_framebuffer:
@@ -1177,13 +1164,7 @@ static void i740fb_remove(struct pci_dev *dev)
 
 	if (info) {
 		struct i740fb_par *par = info->par;
-
-#ifdef CONFIG_MTRR
-		if (par->mtrr_reg >= 0) {
-			mtrr_del(par->mtrr_reg, 0, 0);
-			par->mtrr_reg = -1;
-		}
-#endif
+		

[Xen-devel] [PATCH v8 4/9] video: fbdev: gxt4500: use pci_ioremap_wc_bar() for framebuffer

2015-06-24 Thread Luis R. Rodriguez
From: Luis R. Rodriguez mcg...@suse.com

The driver doesn't use mtrr_add() or arch_phys_wc_add() but
since we know the framebuffer is isolated already on an
ioremap() we can take advantage of write combining for
performance where possible.

In this case there are a few motivations for this:

a) Take advantage of PAT when available

b) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")

Cc: Laurent Pinchart laurent.pinch...@ideasonboard.com
Cc: Rob Clark robdcl...@gmail.com
Cc: Geert Uytterhoeven ge...@linux-m68k.org
Cc: Suresh Siddha sbsid...@gmail.com
Cc: Ingo Molnar mi...@elte.hu
Cc: Thomas Gleixner t...@linutronix.de
Cc: Juergen Gross jgr...@suse.com
Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Andy Lutomirski l...@amacapital.net
Cc: Dave Airlie airl...@redhat.com
Cc: Antonino Daplas adap...@gmail.com
Cc: Jean-Christophe Plagniol-Villard plagn...@jcrosoft.com
Cc: Tomi Valkeinen tomi.valkei...@ti.com
Cc: linux-fb...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Tomi Valkeinen tomi.valkei...@ti.com
Signed-off-by: Luis R. Rodriguez mcg...@suse.com
---
 drivers/video/fbdev/gxt4500.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/gxt4500.c b/drivers/video/fbdev/gxt4500.c
index 135d78a..f19133a 100644
--- a/drivers/video/fbdev/gxt4500.c
+++ b/drivers/video/fbdev/gxt4500.c
@@ -662,7 +662,7 @@ static int gxt4500_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	info->fix.smem_start = fb_phys;
 	info->fix.smem_len = pci_resource_len(pdev, 1);
-	info->screen_base = pci_ioremap_bar(pdev, 1);
+	info->screen_base = pci_ioremap_wc_bar(pdev, 1);
 	if (!info->screen_base) {
 		dev_err(&pdev->dev, "gxt4500: cannot map framebuffer\n");
 		goto err_unmap_regs;
-- 
2.3.2.209.gd67f9d5.dirty



[Xen-devel] [PATCH v8 9/9] video: fbdev: vt8623fb: use arch_phys_wc_add() and pci_iomap_wc()

2015-06-24 Thread Luis R. Rodriguez
From: Luis R. Rodriguez mcg...@suse.com

This driver uses the same area for MTRR as for the ioremap().
Convert the driver from using the x86-specific MTRR code to
the architecture-agnostic arch_phys_wc_add(). arch_phys_wc_add()
will avoid MTRR if write-combining is available. In order to
take advantage of that, also ensure the ioremap'd area is requested
as write-combining.

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away; MTRR is architecture-specific and on
   x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle
SmPL patch. It additionally required manual intervention to
address all the #ifdery and to remove redundant things which
arch_phys_wc_add() already addresses, such as the verbose message
when MTRR fails and doing nothing when we didn't get
an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Rob Clark robdcl...@gmail.com
Cc: Laurent Pinchart laurent.pinch...@ideasonboard.com
Cc: Jingoo Han jg1@samsung.com
Cc: Lad, Prabhakar prabhakar.cse...@gmail.com
Cc: Suresh Siddha sbsid...@gmail.com
Cc: Ingo Molnar mi...@elte.hu
Cc: Thomas Gleixner t...@linutronix.de
Cc: Juergen Gross jgr...@suse.com
Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Andy Lutomirski l...@amacapital.net
Cc: Dave Airlie airl...@redhat.com
Cc: Antonino Daplas adap...@gmail.com
Cc: Jean-Christophe Plagniol-Villard plagn...@jcrosoft.com
Cc: Tomi Valkeinen tomi.valkei...@ti.com
Cc: linux-fb...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Tomi Valkeinen tomi.valkei...@ti.com
Signed-off-by: Luis R. Rodriguez mcg...@suse.com
---
 drivers/video/fbdev/vt8623fb.c | 31 ++-
 1 file changed, 6 insertions(+), 25 deletions(-)

diff --git a/drivers/video/fbdev/vt8623fb.c b/drivers/video/fbdev/vt8623fb.c
index ea7f056..60f24828 100644
--- a/drivers/video/fbdev/vt8623fb.c
+++ b/drivers/video/fbdev/vt8623fb.c
@@ -26,13 +26,9 @@
 #include <linux/console.h> /* Why should fb driver call console functions? because console_lock() */
 #include <video/vga.h>
 
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif
-
 struct vt8623fb_info {
 	char __iomem *mmio_base;
-	int mtrr_reg;
+	int wc_cookie;
 	struct vgastate state;
 	struct mutex open_lock;
 	unsigned int ref_count;
@@ -99,10 +95,7 @@ static struct svga_timing_regs vt8623_timing_regs = {
 /* Module parameters */
 
 static char *mode_option = "640x480-8@60";
-
-#ifdef CONFIG_MTRR
 static int mtrr = 1;
-#endif
 
 MODULE_AUTHOR("(c) 2006 Ondrej Zajicek <santi...@crfreenet.org>");
 MODULE_LICENSE("GPL");
@@ -112,11 +105,8 @@ module_param(mode_option, charp, 0644);
 MODULE_PARM_DESC(mode_option, "Default video mode ('640x480-8@60', etc)");
 module_param_named(mode, mode_option, charp, 0);
 MODULE_PARM_DESC(mode, "Default video mode e.g. '648x480-8@60' (deprecated)");
-
-#ifdef CONFIG_MTRR
 module_param(mtrr, int, 0444);
 MODULE_PARM_DESC(mtrr, "Enable write-combining with MTRR (1=enable, 0=disable, default=1)");
-#endif
 
 
 /* - */
@@ -710,7 +700,7 @@ static int vt8623_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	info->fix.mmio_len = pci_resource_len(dev, 1);
 
 	/* Map physical IO memory address into kernel space */
-	info->screen_base = pci_iomap(dev, 0, 0);
+	info->screen_base = pci_iomap_wc(dev, 0, 0);
 	if (! info->screen_base) {
 		rc = -ENOMEM;
 		dev_err(info->device, "iomap for framebuffer failed\n");
@@ -781,12 +771,9 @@ static int vt8623_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	/* Record a reference to the driver data */
 	pci_set_drvdata(dev, 
