Re: [Xen-devel] Naming things (was XEN_SYSCTL_get_cpuid_policy)

2018-02-27 Thread Jan Beulich
>>> On 27.02.18 at 18:27,  wrote:
> Having finally got back to some CPUID work, I've come back to a naming
> problem I've been unable to resolve in the intervening time.
> 
> Originally, the plan was to introduce XEN_SYSCTL_get_cpuid_policy and
> XEN_DOMCTL_{get,set}_cpuid_policy, which was shortly followed by similar
> MSR policy calls.
> 
> However, to do the hypervisor side audit, the domain set cpuid and msr
> policy calls have to be done together, so all state can be cross-checked
> together.
> 
> Therefore, I was thinking of something more along the lines of
> XEN_DOMCTL_{get,set}_arch_config to be rather more generic, and prevent
> the need to add a new get/set hypercall pair for each class of
> information.  However, that name is confusing with struct
> xen_arch_domainconfig arch_config as part of the createdomain call.
> 
> So, XEN_DOMCTL_{get,set}_arch_settings ?

Taking it that even on the MSR side this is about feature visibility
(rather than actual state), XEN_DOMCTL_{get,set}_feature_policy
or XEN_DOMCTL_{get,set}_capabilities (in both cases perhaps
with "cpu" inserted somewhere for further clarification)?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [libvirt test] 120053: tolerable all pass - PUSHED

2018-02-27 Thread osstest service owner
flight 120053 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120053/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 120004
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 120004
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 120004
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-qcow2 12 migrate-support-checkfail never pass
 test-arm64-arm64-libvirt-qcow2 13 saverestore-support-checkfail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  666dcb1aa25fa8d7e061fc7493226c345bafc66e
baseline version:
 libvirt  ef71caeaa81ab75daf441fe702d267c9b722bafb

Last test of basis   120004  2018-02-25 07:34:16 Z2 days
Testing same since   120053  2018-02-27 04:22:05 Z0 days1 attempts


People who touched revisions under test:
  Michal Privoznik 
  Nikolay Shirokovskiy 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-arm64-arm64-libvirt-qcow2   pass
 test-armhf-armhf-libvirt-raw pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at

Re: [Xen-devel] [PATCH 5/7] public / x86: introduce __HYPERCALL_iommu_op

2018-02-27 Thread Tian, Kevin
> From: Paul Durrant [mailto:paul.durr...@citrix.com]
> Sent: Tuesday, February 27, 2018 5:33 PM
> 
> > -Original Message-
> [snip]
> > > I'll define some terms to try to avoid confusing...
> > >
> > > - where the IOMMU code in Xen maintains a map such that BFN == MFN,
> > > let’s call this an 'identitity MFN map'
> > > - where the IOMMU code in Xen *initially programmes* the IOMMU
> with
> > > an identity MFN map for the whole host, let's call this a 'host map'
> > > - where the IOMMU code in Xen maintains a map such that BFN == GFN,
> > > let's call this an 'identity GFN map'
> > > - where the IOMMU code in Xen *initially programmes* the IOMMU
> with
> > > an identity GFN map for the guest, let's call this a 'guest map'
> >
> > Can you introduce a name for such mapping? then when you describe
> > identity mapping in future version, people can immediately get the
> > actual meaning. At least to me I always think about the mapping on
> > actual IOMMU page table first, which is always about BFN->MFN
> > mapping (where the definition of BFN varies in different usages).
> >
> 
> My point is that there are two notional types of identity map: one where
> BFN == MFN and one where BFN == GFN. Then there is whether Xen
> maintains the map, or just programmes it at domain create and thereafter
> leaves it alone.
> 
> > >
> > > > 1) for dom0 (w/o pvIOMMU) in strict mode, it's MFN:MFN identity
> > > mapping
> > >
> > > Without strict mode, a host map is set up for dom0, otherwise it is an
> > > identity MFN map. In both cases the xen-swiotlb driver is use in Linux as
> > > there is no difference from its point of view.
> > >
> > > > 2) for dom0 (w/ pvIOMMU), it's BFN:MFN mapping
> > >
> > > With PV-IOMMU there is also a host map but since a host map is only
> > > initialized and not maintained (i.e. nothing happens when pages are
> > > removed from or added to dom0) then it is safe for dom0 to control the
> > > IOMMU mappings as it will not conflict with anything Xen is doing.
> >
> > what do you mean by not maintained?
> 
> By 'maintained' I mean that, when the P2M of the guest is modified, Xen
> will adjust the IOMMU mappings accordingly.
> 
> > host map will be programmed
> > to IOMMU page table before launching Dom0, since hypervisor doesn't
> > know whether there will be a pvIOMMU driver launched. Later
> > pvIOMMU driver is loaded and issues hypercall to control its own
> > mapping, hypervisor then switch IOMMU page table from host map
> > to the new one, which is the same logic regarding to virtual VTd for
> > HVM guest. that is how I call an address space switch.
> 
> But that is not what happens. If need_iommu() is false then Xen will have
> programmed a mapping (BFN == MFN in the case of dom0), but will not
> touch it after that. Whether the domain (dom0 in this case) chooses to
> modify those mapping after that is up to the domain but it is free to do
> so because Xen will not dynamically adjust the mapping should the P2M
> change.
> With PV-IOMMU there is no 'big switch'; Xen does nothing more than set
> up the initial mapping and then respond to the individual map/unmap
> hypercalls that the domain may or may not issue.

I prefer to Xen doing an ownership switch, i.e. clear all initial mappings
before serving pvIOMMU request. otherwise pvIOMMU driver needs to
unmap the whole address space itself before serving any map/unmap
requests from other drivers, which is counterintuitive to what a normal
IOMMU driver would do (just initialize an empty page table).

> 
> >
> > >
> > > > 3) for HVM (w/o virtual VTd) with passthrough device, it's GFN:MFN
> > >
> > > I have not been following virtual VTd closely but, yes, as it stands *when
> > > h/w is passed through* the guest gets an identity GFN map otherwise it
> > > gets no map at all.
> > >
> > > > 4) for HVM (w/ virtual VTd) with passthrough device, it's BFN:MFN
> > > >
> > >
> > > With virtual VTd I'd expect there would be a guest map and then the
> guest
> > > would get the same level of control over the IOMMU that PV-IOMMU
> > > allows for a PV domain but, of course, such control is as-yet unsafe for
> > > guests since an IOMMU fault can cause a host crash.
> >
> > I'm not sure why you call it unsafe. even today with any passthrough
> > device (w/o virtual VTd exposed), a bad guest driver can always cause
> > DMA access to invalid GPA address and thus cause IOMMU fault. adding
> > virtual VTd doesn't change any security aspect here.
> 
> That's not entirely true. Xen could easily fill the IOMMU with a BFN == GFN
> mapping for valid GFN and then program all the other BFN to point at a
> scratch page and thus avoid any possibility of an IOMMU fault caused by an
> in-guest driver mis-programming a device. As soon as Xen gives the domain
> control over its own mappings then it can no longer ensure all BFN map to
> something valid.

Please note Xen never gives the domain control on the actual IOMMU page
table. w/ either pvIOMMU or virtual VTd, the 

[Xen-devel] [linux-4.9 test] 120047: tolerable FAIL - PUSHED

2018-02-27 Thread osstest service owner
flight 120047 linux-4.9 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120047/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 119891
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 119891
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 119891
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 119891
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 119891
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 linux19c04ca5b239e6e2277a5b381d1e79482ab9bbc5
baseline version:
 linux80c1c8322c331586a86e58d3f95026a1265ab396

Last test of basis   119891  2018-02-22 15:01:38 Z5 days
Testing same since   120011  2018-02-25 10:47:07 Z2 days2 attempts


People who touched revisions under test:
  Adam Ford 
  Al Viro 
  Alexandru Ardelean 
  Aliaksei Karaliou 
  Anand Moon 
  Andre 

Re: [Xen-devel] Is there any way to read msr in hypervisor code?

2018-02-27 Thread Meng Xu
On Tue, Feb 27, 2018 at 8:25 PM, Minjun Hong  wrote:
>
> Ah! Now I understand what you said.
> Thank you so much, Andrew.
>
> BTW, isn't it possible to use PERFEVTSELx MSR in Xen directly? I eagerly want 
> to achieve the number of cache-misses.


If you just want to monitor the number of cache misses, you can use
perf in Xen: https://wiki.xenproject.org/wiki/Xen_Profiling:_oprofile_and_perf

You need to enable the vpmu at the boot command.

If you plan to add your own features, you can check how vpmu is
implemented, which may help you implement your own feature.


Best Regards,

Meng

---
Meng Xu
Ph.D. Candidate in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/


>
>
> On Tue, Feb 27, 2018 at 11:03 PM, Andrew Cooper  
> wrote:
>>
>> On 27/02/18 12:35, Minjun Hong wrote:
>>
>> On Tue, Feb 27, 2018 at 7:00 PM, Andrew Cooper  
>> wrote:
>>>
>>> On 27/02/2018 09:37, Minjun Hong wrote:
>>>
>>> Hello, I've tried to read msr in hypervisor code:
>>>
 uint64_t reg;

 rdmsr_safe(0x412e, ret);
>>>
>>>
>>> But, I all failed to read it because the hypervisor is running in ring-1 
>>> CPL as I did googling.
>>>
>>>
>>> Xen runs in ring 0, root mode.  Forget terms like "ring -1" because they 
>>> are not accurate, and not a helpful description of what is going on.
>>>
>>> When I accessed msr, the crash log is below:
>>>
>>>
> (XEN) [ Xen-4.9.0  x86_64  debug=n   Not tainted ]
>
> (XEN) CPU:7
>
> (XEN) RIP:e008:[] 
> mcsched.c#shscan_timer_fn+0xd/0x180
>
> (XEN) RFLAGS: 00010206   CONTEXT: hypervisor
>
> (XEN) rax: 83084abfe028   rbx: 83084abfe300   rcx: 
> 412e
>
> (XEN) rdx: 83084abb7fff   rsi: 82d080632f00   rdi: 
> 
>
> (XEN) rbp: 82d080241390   rsp: 83084abb7e30   r8:  
> 830868d4bdc0
>
> (XEN) r9:  0005   r10: 012da75b50a5   r11: 
> 8300782f3060
>
> (XEN) r12:    r13: 012d9f9d6e07   r14: 
> 83084abb7fff
>
> (XEN) r15: 82d08062ad80   cr0: 8005003b   cr4: 
> 003526e0
>
> (XEN) cr3: 0003426f3000   cr2: 7fc3c21d4000
>
> (XEN) ds:    es:    fs:    gs:    ss: e010   cs: e008
>
> (XEN) Xen code around  
> (mcsched.c#shscan_timer_fn+0xd/0x180):
>
> (XEN)  00 00 41 55 41 54 55 53 <0f> 32 4c 8d 05 0a 79 11 00 48 8d 0d 91 
> 45 14 00
>>>
>>>
>>> This is a rdmsr(), not rdmsr_safe(), which is why you are crashing.
>>>
>>> Irrespective of that, 0x412e isn't an MSR which exists on any real 
>>> hardware, so I'm not sure what other result you were expecting.
>>>
>>> ~Andrew
>>
>>
>> Thanks for your answer, Andrew.
>>
>> Actually, I intended to get the number of cache-misses from msr and  
>> according to "Figure 18-1 Layout of IA32_PERFEVTSELx MSRs in Intel® 64 and 
>> IA-32 Architectures Software Developer’s Manual Volume 3B: System 
>> Programming Guide",
>> it can be achieved by "0x412e":
>>
>> Bit Position
>>
>> CPUID.AH.EBX
>>
>> Event Name
>>
>> UMask
>>
>> Event Select
>>
>> 0
>>
>> UnHalted Core Cycles
>>
>> 00H
>>
>> 3CH
>>
>> 1
>>
>> Instruction Retired
>>
>> 00H
>>
>> C0H
>>
>> 2
>>
>> UnHalted Reference Cycles
>>
>> 01H
>>
>> 3CH
>>
>> 3
>>
>> LLC Reference
>>
>> 4FH
>>
>> 2EH
>>
>> 4
>>
>> LLC Misses
>>
>> 41H
>>
>> 2EH
>>
>> 5
>>
>> Branch Instruction Retired
>>
>> 00H
>>
>> C4H
>>
>> 6
>>
>> Branch Misses Retired
>>
>> 00H
>>
>> C5H
>>
>>
>> I also tried to do that w/ rdmsr_safe() but, I failed to read msr.
>> Through the macro that returns a non-zero value, it was easy to see that 
>> reading msr failed:
>>
>>> if ( rdmsr_safe(0x412e, ret) ) {
>>> printk("cannot read msr!!!\n");
>>> } else {
>>> printk("rdmsr: %lu\n", ret);
>>> }
>>
>>
>>  I cannot understand why you said "0x412e isn't an MSR which exists on any 
>> real hardware" so, is it impossible to get the number of cache-misses by 
>> using msr ??
>>
>>
>> The bit you highlight above are values you need to write into one of the 
>> PERFEVTSELx MSRs, to cause hardware to track that event.  It does not mean 
>> MSR by that index exists.
>>
>> ~Andrew
>
>
>
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Is there any way to read msr in hypervisor code?

2018-02-27 Thread Minjun Hong
Ah! Now I understand what you said.
Thank you so much, Andrew.

BTW, isn't it possible to use PERFEVTSELx MSR in Xen directly? I eagerly
want to achieve the number of cache-misses.

On Tue, Feb 27, 2018 at 11:03 PM, Andrew Cooper 
wrote:

> On 27/02/18 12:35, Minjun Hong wrote:
>
> On Tue, Feb 27, 2018 at 7:00 PM, Andrew Cooper 
> wrote:
>
>> On 27/02/2018 09:37, Minjun Hong wrote:
>>
>> Hello, I've tried to read msr in hypervisor code:
>>
>> uint64_t reg;
>>
>> rdmsr_safe(0x412e, ret);
>>
>>
>> But, I all failed to read it because the hypervisor is running in ring-1
>> CPL as I did googling.
>>
>>
>> Xen runs in ring 0, root mode.  Forget terms like "ring -1" because they
>> are not accurate, and not a helpful description of what is going on.
>>
>> When I accessed msr, the crash log is below:
>>
>>
>> (XEN) [ Xen-4.9.0  x86_64  debug=n   Not tainted ]
>>>
>>> (XEN) CPU:7
>>>
>>> (XEN) RIP:e008:[] mcsched.c#shscan_timer_fn+0xd/
 0x180
>>>
>>> (XEN) RFLAGS: 00010206   CONTEXT: hypervisor
>>>
>>> (XEN) rax: 83084abfe028   rbx: 83084abfe300   rcx:
 412e
>>>
>>> (XEN) rdx: 83084abb7fff   rsi: 82d080632f00   rdi:
 
>>>
>>> (XEN) rbp: 82d080241390   rsp: 83084abb7e30   r8:
 830868d4bdc0
>>>
>>> (XEN) r9:  0005   r10: 012da75b50a5   r11:
 8300782f3060
>>>
>>> (XEN) r12:    r13: 012d9f9d6e07   r14:
 83084abb7fff
>>>
>>> (XEN) r15: 82d08062ad80   cr0: 8005003b   cr4:
 003526e0
>>>
>>> (XEN) cr3: 0003426f3000   cr2: 7fc3c21d4000
>>>
>>> (XEN) ds:    es:    fs:    gs:    ss: e010   cs: e008
>>>
>>> (XEN) Xen code around  (mcsched.c#shscan_timer_fn+0xd
 /0x180):
>>>
>>> (XEN)  00 00 41 55 41 54 55 53 <0f> 32 4c 8d 05 0a 79 11 00 48 8d 0d 91
 45 14 00
>>>
>>>
>> This is a rdmsr(), not rdmsr_safe(), which is why you are crashing.
>>
>> Irrespective of that, 0x412e isn't an MSR which exists on any real
>> hardware, so I'm not sure what other result you were expecting.
>>
>> ~Andrew
>>
>
> Thanks for your answer, Andrew.
>
> Actually, I intended to get the number of cache-misses from msr and  according
> to "*Figure 18**-1 Layout of IA32_PERFEVTSELx MSRs in Intel® 64 and IA-32
> Architectures Software Developer’s Manual Volume 3B: System Programming
> Guide*",
> it can be achieved by "0x412e":
>
> Bit Position
>
> CPUID.AH.EBX
>
> Event Name
>
> UMask
>
> Event Select
>
> 0
>
> UnHalted Core Cycles
>
> 00H
>
> 3CH
>
> 1
>
> Instruction Retired
>
> 00H
>
> C0H
>
> 2
>
> UnHalted Reference Cycles
>
> 01H
>
> 3CH
>
> 3
>
> LLC Reference
>
> 4FH
>
> 2EH
>
> *4*
>
> *LLC Misses*
>
> *41H*
>
> *2EH*
>
> 5
>
> Branch Instruction Retired
>
> 00H
>
> C4H
>
> 6
>
> Branch Misses Retired
>
> 00H
>
> C5H
>
> I also tried to do that w/ rdmsr_safe() but, I failed to read msr.
> Through the macro that returns a non-zero value, it was easy to see that
> reading msr failed:
>
> if ( rdmsr_safe(0x412e, ret) ) {
>> printk("cannot read msr!!!\n");
>> } else {
>> printk("rdmsr: %lu\n", ret);
>> }
>
>
>  I cannot understand why you said "0x412e isn't an MSR which exists on
> any real hardware" so, is it impossible to get the number of cache-misses
> by using msr ??
>
>
> The bit you highlight above are values you need to write into one of the
> PERFEVTSELx MSRs, to cause hardware to track that event.  It does not mean
> MSR by that index exists.
>
> ~Andrew
>
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] pvh+vcpus startup issue

2018-02-27 Thread xen


Regards, Peter
http://ri.mu Startups start here.  Hosting.  DNS.  Offsite backups.  
Monitoring.  Email.

On 27/02/18 12:42 AM, Juergen Gross wrote:

On 22/02/18 21:38, x...@randomwebstuff.com wrote:

On 22/02/18 6:35 PM, Juergen Gross wrote:

On 22/02/18 05:37, x...@randomwebstuff.com wrote:

Hi.  I have a domU.  Its params file has: vcpus = 8.  It will start with
pv, but not type="pvh".  It will not start (on pvh) with vcpus = 7 or 6
or 5.  It does start with vcpus = 4.

I diffed the xl -v create logs, no difference there on either startup.

I grabbed the domU console output for a vcpus = 5 start (attached).  It
dies right after:

[    0.007110] cpu 3 spinlock event irq 23
[    0.007336] installing Xen timer for CPU 4

Can you please post the hypervisor log ("xl dmesg")?


Juergen

Attached.

Can you please try again with "loglvl=all guest_loglvl=all" in the
hypervisor's boot parameters and after the pvh guest failing?


Juergen

I added those options.  I am attaching a log of 1) xl create -v  2) domU 
console 3) xl dmesg 4) params file.


Reminder this issue seems to be only on certain CPUs.  e.g. this one is 
on a 2xE5420 host.
xl create -v log:

Parsing config from /home/users/ot.glenn/params
domainbuilder: detail: xc_dom_allocate: cmdline="root=/dev/xvda1 ro", 
features=""
domainbuilder: detail: xc_dom_kernel_file: filename="/home/users/ot.glenn/linux"
domainbuilder: detail: xc_dom_malloc_filemap: 9325 kB
domainbuilder: detail: xc_dom_malloc: 32 MB
domainbuilder: detail: xc_dom_do_gunzip: unzip ok, 0x91b5aa -> 0x20ed940
domainbuilder: detail: xc_dom_boot_xen_init: ver 4.10, caps xen-3.0-x86_64 
xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64 
domainbuilder: detail: xc_dom_parse_image: called
domainbuilder: detail: xc_dom_find_loader: trying ELF-generic loader ... 
domainbuilder: detail: loader probe OK
xc: detail: ELF: phdr: paddr=0x100 memsz=0x1522000
xc: detail: ELF: phdr: paddr=0x260 memsz=0x5c5000
xc: detail: ELF: phdr: paddr=0x2bc5000 memsz=0x21d18
xc: detail: ELF: phdr: paddr=0x2be7000 memsz=0x1cf000
xc: detail: ELF: memory: 0x100 -> 0x2db6000
xc: detail: ELF: note: GUEST_OS = "linux"
xc: detail: ELF: note: GUEST_VERSION = "2.6"
xc: detail: ELF: note: XEN_VERSION = "xen-3.0"
xc: detail: ELF: note: VIRT_BASE = 0x8000
xc: detail: ELF: note: INIT_P2M = 0x80
xc: detail: ELF: note: ENTRY = 0x82be7180
xc: detail: ELF: note: HYPERCALL_PAGE = 0x81001000
xc: detail: ELF: note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb"
xc: detail: ELF: note: SUPPORTED_FEATURES = 0x801
xc: detail: ELF: note: PAE_MODE = "yes"
xc: detail: ELF: note: LOADER = "generic"
xc: detail: ELF: note: unknown (0xd)
xc: detail: ELF: note: SUSPEND_CANCEL = 0x1
xc: detail: ELF: note: MOD_START_PFN = 0x1
xc: detail: ELF: note: HV_START_LOW = 0x8000
xc: detail: ELF: note: PADDR_OFFSET = 0
xc: detail: ELF: note: PHYS32_ENTRY = 0x1000360
xc: detail: ELF: addresses:
xc: detail: virt_base= 0x8000
xc: detail: elf_paddr_offset = 0x0
xc: detail: virt_offset  = 0x8000
xc: detail: virt_kstart  = 0x8100
xc: detail: virt_kend= 0x82db6000
xc: detail: virt_entry   = 0x82be7180
xc: detail: p2m_base = 0x80
domainbuilder: detail: xc_dom_parse_elf_kernel: hvm-3.0-x86_32: 
0x8100 -> 0x82db6000
domainbuilder: detail: xc_dom_mem_init: mem 2000 MB, pages 0x7d000 pages, 4k 
each
domainbuilder: detail: xc_dom_mem_init: 0x7d000 pages
domainbuilder: detail: xc_dom_boot_mem_init: called
domainbuilder: detail: xc_dom_malloc: 4000 kB
xc: detail: PHYSICAL MEMORY ALLOCATION:
xc: detail:   4KB PAGES: 0x
xc: detail:   2MB PAGES: 0x01e8
xc: detail:   1GB PAGES: 0x0001
S3 disabled
S4 disabled
CONV disabled
domainbuilder: detail: xc_dom_build_image: called
domainbuilder: detail: xc_dom_malloc: 178 kB
domainbuilder: detail: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 
0x1000+0x1db6 at 0x7fd8d1521000
domainbuilder: detail: xc_dom_alloc_segment:   kernel   : 
0x8100 -> 0x82db6000  (pfn 0x1000 + 0x1db6 pages)
xc: detail: ELF: phdr 0 at 0x7fd8d1521000 -> 0x7fd8d2a43000
xc: detail: ELF: phdr 1 at 0x7fd8d2b21000 -> 0x7fd8d30e6000
xc: detail: ELF: phdr 2 at 0x7fd8d30e6000 -> 0x7fd8d3107d18
xc: detail: ELF: phdr 3 at 0x7fd8d3108000 -> 0x7fd8d320e000
domainbuilder: detail: xc_dom_load_acpi: 64 bytes at address fffc0

domainbuilder: detail: xc_dom_load_acpi: 4096 bytes at address fc00

domainbuilder: detail: xc_dom_load_acpi: 28672 bytes at address fc001000

domainbuilder: detail: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x2db6+0x1 
at 0x7fd8d7c4f000
domainbuilder: detail: xc_dom_alloc_segment:   HVM start info : 
0x82db6000 -> 0x82db7000  (pfn 0x2db6 + 0x1 pages)
domainbuilder: detail: alloc_pgtables_hvm: doing nothing

[Xen-devel] [xen-unstable-smoke test] 120078: tolerable all pass - PUSHED

2018-02-27 Thread osstest service owner
flight 120078 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120078/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  1c473c42199a8f4d70533c202e1c57ecd1dad35b
baseline version:
 xen  d81925f7b37efd96711f31832e0b03fa3b304b48

Last test of basis   120071  2018-02-27 17:02:04 Z0 days
Testing same since   120078  2018-02-27 21:15:58 Z0 days1 attempts


People who touched revisions under test:
  Julien Grall 
  Stefano Stabellini 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   d81925f7b3..1c473c4219  1c473c42199a8f4d70533c202e1c57ecd1dad35b -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-3.18 test] 120043: tolerable FAIL - PUSHED

2018-02-27 Thread osstest service owner
flight 120043 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120043/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 119432
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 119432
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 119432
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 119432
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 119432
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 119432
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 119432
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 build-arm64-pvops 6 kernel-build fail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass

version targeted for testing:
 linuxf8f8e8c5bbed6c3941845a1b7956bd893818f29f
baseline version:
 linux0c946219398a3108a9fe8dbc5096586bdcc797d6

Last test of basis   119432  2018-02-16 20:38:28 Z   11 days
Testing same since   120010  2018-02-25 10:46:44 Z2 days2 attempts


People who touched revisions under test:
  Al Viro 
  Alex Deucher 
  Aliaksei Karaliou 
  Anand Moon 
  Arnaldo Carvalho de Melo 
  Arnd Bergmann 
  Balamuruhan S 
  Bartlomiej Zolnierkiewicz 
  Bjorn Andersson 
  Boris 

Re: [Xen-devel] [PATCH] pvcalls-front: 64-bit align flags

2018-02-27 Thread Stefano Stabellini
On Tue, 27 Feb 2018, Boris Ostrovsky wrote:
> On 02/27/2018 04:32 PM, Stefano Stabellini wrote:
> > On Tue, 27 Feb 2018, Boris Ostrovsky wrote:
> >> On 02/27/2018 02:54 PM, Stefano Stabellini wrote:
> >>> We are using test_and_* operations on the status and flag fields of
> >>> struct sock_mapping. However, these functions require the operand to be
> >>> 64-bit aligned on arm64. Currently, only status is 64-bit aligned.
> >>>
> >>> Make flags 64-bit aligned by introducing an explicit padding field.
> >>>
> >>> Signed-off-by: Stefano Stabellini 
> >>>
> >>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> >>> index ca5b773..aa07b2a 100644
> >>> --- a/drivers/xen/pvcalls-front.c
> >>> +++ b/drivers/xen/pvcalls-front.c
> >>> @@ -78,6 +78,7 @@ struct sock_mapping {
> >>>  #define PVCALLS_STATUS_BIND  1
> >>>  #define PVCALLS_STATUS_LISTEN2
> >>>   uint8_t status;
> >>> + uint8_t pad[7];
> >> Does this guarantee alignment (for either status or flag)?
> > Yes: status is part of a struct and a union. Unions and structs have the
> > alignment of their most aligned type. In this case they are 64-bit
> > aligned, as some of the fields are pointers.
> >
> > The padding makes sure that flags is 1+7 bytes from it.
> 
> OK.
> 
> What about adding  __attribute__((aligned(8))) to both (with a comment
> explaining reasoning)?

That's fine by me


> >
> >  
> >>>   /*
> >>>* Internal state-machine flags.
> >>>* Only one accept operation can be inflight for a socket.
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] pvcalls-front: 64-bit align flags

2018-02-27 Thread Boris Ostrovsky
On 02/27/2018 04:32 PM, Stefano Stabellini wrote:
> On Tue, 27 Feb 2018, Boris Ostrovsky wrote:
>> On 02/27/2018 02:54 PM, Stefano Stabellini wrote:
>>> We are using test_and_* operations on the status and flag fields of
>>> struct sock_mapping. However, these functions require the operand to be
>>> 64-bit aligned on arm64. Currently, only status is 64-bit aligned.
>>>
>>> Make flags 64-bit aligned by introducing an explicit padding field.
>>>
>>> Signed-off-by: Stefano Stabellini 
>>>
>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>> index ca5b773..aa07b2a 100644
>>> --- a/drivers/xen/pvcalls-front.c
>>> +++ b/drivers/xen/pvcalls-front.c
>>> @@ -78,6 +78,7 @@ struct sock_mapping {
>>>  #define PVCALLS_STATUS_BIND  1
>>>  #define PVCALLS_STATUS_LISTEN2
>>> uint8_t status;
>>> +   uint8_t pad[7];
>> Does this guarantee alignment (for either status or flag)?
> Yes: status is part of a struct and a union. Unions and structs have the
> alignment of their most aligned type. In this case they are 64-bit
> aligned, as some of the fields are pointers.
>
> The padding makes sure that flags is 1+7 bytes from it.

OK.

What about adding  __attribute__((aligned(8))) to both (with a comment
explaining reasoning)?

-boris


>
>  
>>> /*
>>>  * Internal state-machine flags.
>>>  * Only one accept operation can be inflight for a socket.


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] pvcalls-front: 64-bit align flags

2018-02-27 Thread Stefano Stabellini
On Tue, 27 Feb 2018, Boris Ostrovsky wrote:
> On 02/27/2018 02:54 PM, Stefano Stabellini wrote:
> > We are using test_and_* operations on the status and flag fields of
> > struct sock_mapping. However, these functions require the operand to be
> > 64-bit aligned on arm64. Currently, only status is 64-bit aligned.
> >
> > Make flags 64-bit aligned by introducing an explicit padding field.
> >
> > Signed-off-by: Stefano Stabellini 
> >
> > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > index ca5b773..aa07b2a 100644
> > --- a/drivers/xen/pvcalls-front.c
> > +++ b/drivers/xen/pvcalls-front.c
> > @@ -78,6 +78,7 @@ struct sock_mapping {
> >  #define PVCALLS_STATUS_BIND  1
> >  #define PVCALLS_STATUS_LISTEN2
> > uint8_t status;
> > +   uint8_t pad[7];
> 
> Does this guarantee alignment (for either status or flag)?

Yes: status is part of a struct and a union. Unions and structs have the
alignment of their most aligned type. In this case they are 64-bit
aligned, as some of the fields are pointers.

The padding makes sure that flags is 1+7 bytes from it.

 
> > /*
> >  * Internal state-machine flags.
> >  * Only one accept operation can be inflight for a socket.
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] xl/libxl: add pvcalls support

2018-02-27 Thread Stefano Stabellini
Add pvcalls support to libxl and xl. Create the appropriate pvcalls
entries in xenstore.

Signed-off-by: Stefano Stabellini 

diff --git a/docs/misc/xenstore-paths.markdown 
b/docs/misc/xenstore-paths.markdown
index 7be2592..77d1a36 100644
--- a/docs/misc/xenstore-paths.markdown
+++ b/docs/misc/xenstore-paths.markdown
@@ -299,6 +299,11 @@ A virtual scsi device frontend. Described by
 A virtual usb device frontend. Described by
 [xen/include/public/io/usbif.h][USBIF]
 
+ ~/device/pvcalls/$DEVID/* []
+
+Paravirtualized POSIX function calls frontend. Described by
+[docs/misc/pvcalls.markdown][PVCALLS]
+
  ~/console/* []
 
 The primary PV console device. Described in [console.txt](console.txt)
@@ -377,6 +382,10 @@ A PV SCSI backend.
 
 A PV USB backend. Described by
 [xen/include/public/io/usbif.h][USBIF]
+ 
+ ~/backend/pvcalls/$DOMID/$DEVID/* []
+
+A PVCalls backend. Described in [docs/misc/pvcalls.markdown][PVCALLS].
 
  ~/backend/console/$DOMID/$DEVID/* []
 
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 917ceb0..035e66e 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -140,7 +140,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o 
libxl_pci.o \
libxl_vtpm.o libxl_nic.o libxl_disk.o libxl_console.o \
libxl_cpupool.o libxl_mem.o libxl_sched.o libxl_tmem.o \
libxl_9pfs.o libxl_domain.o libxl_vdispl.o \
-$(LIBXL_OBJS-y)
+libxl_pvcalls.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
 
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index eca0ea2..76574d2 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -2006,6 +2006,16 @@ int libxl_device_p9_destroy(libxl_ctx *ctx, uint32_t 
domid,
 const libxl_asyncop_how *ao_how)
 LIBXL_EXTERNAL_CALLERS_ONLY;
 
+/* pvcalls */
+int libxl_device_pvcalls_remove(libxl_ctx *ctx, uint32_t domid,
+libxl_device_pvcalls *pvcalls,
+const libxl_asyncop_how *ao_how)
+LIBXL_EXTERNAL_CALLERS_ONLY;
+int libxl_device_pvcalls_destroy(libxl_ctx *ctx, uint32_t domid,
+ libxl_device_pvcalls *pvcalls,
+ const libxl_asyncop_how *ao_how)
+ LIBXL_EXTERNAL_CALLERS_ONLY;
+
 /* PCI Passthrough */
 int libxl_device_pci_add(libxl_ctx *ctx, uint32_t domid,
  libxl_device_pci *pcidev,
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index c498135..bbdeee5 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1374,6 +1374,9 @@ static void domcreate_launch_dm(libxl__egc *egc, 
libxl__multidev *multidev,
 for (i = 0; i < d_config->num_p9s; i++)
 libxl__device_add(gc, domid, __p9_devtype, _config->p9s[i]);
 
+for (i = 0; i < d_config->num_pvcallss; i++)
+libxl__device_add(gc, domid, __pvcalls_devtype, 
_config->pvcallss[i]);
+
 switch (d_config->c_info.type) {
 case LIBXL_DOMAIN_TYPE_HVM:
 {
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 506687f..e9edfac 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3648,6 +3648,7 @@ extern const struct libxl_device_type 
libxl__usbdev_devtype;
 extern const struct libxl_device_type libxl__pcidev_devtype;
 extern const struct libxl_device_type libxl__vdispl_devtype;
 extern const struct libxl_device_type libxl__p9_devtype;
+extern const struct libxl_device_type libxl__pvcalls_devtype;
 
 extern const struct libxl_device_type *device_type_tbl[];
 
diff --git a/tools/libxl/libxl_pvcalls.c b/tools/libxl/libxl_pvcalls.c
new file mode 100644
index 000..a285343
--- /dev/null
+++ b/tools/libxl/libxl_pvcalls.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright (C) 2018  Aporeto
+ * Author Stefano Stabellini 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h"
+
+#include "libxl_internal.h"
+
+static int libxl__device_pvcalls_setdefault(libxl__gc *gc, uint32_t domid,
+libxl_device_pvcalls *pvcalls,
+bool hotplug)
+{
+return libxl__resolve_domid(gc, pvcalls->backend_domname,
+

Re: [Xen-devel] [PATCH] pvcalls-front: 64-bit align flags

2018-02-27 Thread Boris Ostrovsky
On 02/27/2018 02:54 PM, Stefano Stabellini wrote:
> We are using test_and_* operations on the status and flag fields of
> struct sock_mapping. However, these functions require the operand to be
> 64-bit aligned on arm64. Currently, only status is 64-bit aligned.
>
> Make flags 64-bit aligned by introducing an explicit padding field.
>
> Signed-off-by: Stefano Stabellini 
>
> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> index ca5b773..aa07b2a 100644
> --- a/drivers/xen/pvcalls-front.c
> +++ b/drivers/xen/pvcalls-front.c
> @@ -78,6 +78,7 @@ struct sock_mapping {
>  #define PVCALLS_STATUS_BIND  1
>  #define PVCALLS_STATUS_LISTEN2
>   uint8_t status;
> + uint8_t pad[7];

Does this guarantee alignment (for either status or flag)?

-boris

>   /*
>* Internal state-machine flags.
>* Only one accept operation can be inflight for a socket.


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [seabios test] 120038: regressions - FAIL

2018-02-27 Thread osstest service owner
flight 120038 seabios real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120038/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail in 120002 REGR. vs. 
115539

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-ws16-amd64 15 guest-saverestore.2 fail pass in 120002

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 115539
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 115539
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 115539
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 seabios  af0daeb2687ad2595482b8a71b02a082a5672ceb
baseline version:
 seabios  0ca6d6277dfafc671a5b3718cbeb5c78e2a888ea

Last test of basis   115539  2017-11-03 20:48:58 Z  115 days
Failing since115733  2017-11-10 17:19:59 Z  109 days  138 attempts
Testing same since   119258  2018-02-15 09:12:54 Z   12 days   16 attempts


People who touched revisions under test:
  Kevin O'Connor 
  Marcel Apfelbaum 
  Michael S. Tsirkin 
  Nikolay Nikolov 
  Paul Menzel 
  Stefan Berger 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-qemuu-nested-amdfail
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-ws16-amd64 fail
 test-amd64-i386-xl-qemuu-ws16-amd64  fail
 test-amd64-amd64-xl-qemuu-win10-i386 fail
 test-amd64-i386-xl-qemuu-win10-i386  fail
 test-amd64-amd64-qemuu-nested-intel  pass
 test-amd64-i386-qemuu-rhel6hvm-intel pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit af0daeb2687ad2595482b8a71b02a082a5672ceb
Author: Nikolay Nikolov 
Date:   Sat Feb 10 13:52:17 2018 +0200

floppy: Send 4 sense interrupt commands during controller initialization

During initialization, real floppy controllers need 4 sense interrupt 
commands
to clear the interrupt status (this represents the transition from "not 
ready"
to "ready" for each of the four virtual floppy drives), instead of just one.

This is described in detail in section 7.4 - Drive Polling of the Intel 
82077AA
datasheet.

Signed-off-by: Nikolay Nikolov 

commit 2611db472c0f0bad4987c20990a45c175342fc22
Author: Nikolay Nikolov 
Date:  

[Xen-devel] [xen-unstable test] 120037: tolerable FAIL

2018-02-27 Thread osstest service owner
flight 120037 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120037/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemut-ws16-amd64 16 guest-localmigrate/x10 fail pass in 
120001

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail in 120001 blocked in 
120037
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 120001
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 120001
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 120001
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 120001
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 120001
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 120001
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 120001
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 120001
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 120001
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 xen  a823a5280f25ad19a751dd9a41044f556471e61a
baseline version:
 xen  a823a5280f25ad19a751dd9a41044f556471e61a

Last test of basis   120037  2018-02-26 13:16:29 Z1 days
Testing same since  (not found) 0 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-armhf-xsm  

Re: [Xen-devel] [PATCH] xen: use hvc console for dom0

2018-02-27 Thread Julien Grall

Hi,

On 27/02/2018 20:03, Stefano Stabellini wrote:

On Tue, 27 Feb 2018, Julien Grall wrote:

Hi,

On 26/02/18 12:32, Juergen Gross wrote:

On 26/02/18 13:06, Andrii Anisov wrote:

Hello Juergen,


On 26.02.18 13:08, Juergen Gross wrote:

Today the hvc console is added as a preferred console for pv domUs
only. As this requires a boot parameter for getting dom0 messages per
default add it for dom0, too.

Signed-off-by: Juergen Gross 
---
    arch/x86/xen/enlighten_pv.c | 4 +++-
    1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c

Is this something x86 specific? Could it be a generic approach?


In case ARM wants something similar I guess the test for
xen_initial_domain() should be dropped in xen_early_init().

I am pretty sure we discussed to remove !xen_initial_domain() for Arm in the
past. But I don't remember why the patch was not sent to remove it.

Anyway, I guess this should be fine to have hvc as a preferred console for the
initial domain as well.


Usually, Dom0 has access to several physical UARTs and/or VGA, making
this patch less obviously desirable. I think that for Dom0 the best
behavior would be to add "hvc0" as first console rather than last
console, so that if the user specified something else, this call won't
interfere with it.


Well, that's exactly the goal of add_preferred_console. It will use hvc0 
unless specified otherwise by the user on the command line.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen: use hvc console for dom0

2018-02-27 Thread Stefano Stabellini
On Tue, 27 Feb 2018, Julien Grall wrote:
> Hi,
> 
> On 26/02/18 12:32, Juergen Gross wrote:
> > On 26/02/18 13:06, Andrii Anisov wrote:
> > > Hello Juergen,
> > > 
> > > 
> > > On 26.02.18 13:08, Juergen Gross wrote:
> > > > Today the hvc console is added as a preferred console for pv domUs
> > > > only. As this requires a boot parameter for getting dom0 messages per
> > > > default add it for dom0, too.
> > > > 
> > > > Signed-off-by: Juergen Gross 
> > > > ---
> > > >    arch/x86/xen/enlighten_pv.c | 4 +++-
> > > >    1 file changed, 3 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
> > > Is this something x86 specific? Could it be a generic approach?
> > 
> > In case ARM wants something similar I guess the test for
> > xen_initial_domain() should be dropped in xen_early_init().
> I am pretty sure we discussed to remove !xen_initial_domain() for Arm in the
> past. But I don't remember why the patch was not sent to remove it.
> 
> Anyway, I guess this should be fine to have hvc as a preferred console for the
> initial domain as well.

Usually, Dom0 has access to several physical UARTs and/or VGA, making
this patch less obviously desirable. I think that for Dom0 the best
behavior would be to add "hvc0" as first console rather than last
console, so that if the user specified something else, this call won't
interfere with it.___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] rwlock and per-cpu rwlock recursive?

2018-02-27 Thread Andrew Cooper
On 27/02/18 19:18, Julien Grall wrote:
> Hi,
>
> On Arm, we made the (wrong?) assumption that the rwlock was recursive.
> We have a couple of places where the read lock can be nested (mostly
> the memaccess code).
>
> I found out today that it can actually deadlock in the following case:
> 1) CPU A -> Ask for the read lock
>     => Succeed
> 2) CPU B -> Ask for the write lock
>     => Already taken by CPU A so go to the slowpath
>     => Take the internal lock (lock->lock)
>     => Wait until the lock is released.
> 3) CPU A -> Ask for the read lock recursively
>     => A writer is waiting so go to the slowpath
>     => Try to take the internal lock (lock->lock)
>     => Blocking because CPU B already has it
>
> So we end up in a deadlock case. Can someone confirm whether whether
> rwlock was meant to be recursive?

rwlocks are not recursive, but probably will appear to be in the common
case.

When uncontended, an effectively-arbitrary number of read_lock()'s can
complete, but when contended, readers and writers use a regular spinlock
(which is a ticketlock under the hood) to ensure fairness.

> Similarly, I was thinking to move the p2m code to the per-cpu rwlock
> as x86 does. From what I gathered by reading the x86 code, this lock
> could be taken recursively. Am I right?

No - what gives you the impression that it can be taken recursively?  In
the case where we detect taking a second percpu_rwlock, we fall back to
the slowpath of a real read_lock() call.

> Lastly, the x86 p2m code seemed to use rwlock before hand. How did the
> p2m code was handling recursive locking?

Plain spinlocks (when using the recursive helpers), and x86
mm_{rw,}lock_t's can be taken recursively.  A call to spin_lock() will
deadlock if the lock is already taken and you meant to use
spin_lock_recursive().

You probably want to see about reading xen/arch/x86/mm/mm-locks.h

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] pvcalls-front: 64-bit align flags

2018-02-27 Thread Stefano Stabellini
We are using test_and_* operations on the status and flag fields of
struct sock_mapping. However, these functions require the operand to be
64-bit aligned on arm64. Currently, only status is 64-bit aligned.

Make flags 64-bit aligned by introducing an explicit padding field.

Signed-off-by: Stefano Stabellini 

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index ca5b773..aa07b2a 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -78,6 +78,7 @@ struct sock_mapping {
 #define PVCALLS_STATUS_BIND  1
 #define PVCALLS_STATUS_LISTEN2
uint8_t status;
+   uint8_t pad[7];
/*
 * Internal state-machine flags.
 * Only one accept operation can be inflight for a socket.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] rwlock and per-cpu rwlock recursive?

2018-02-27 Thread Julien Grall

Hi,

On Arm, we made the (wrong?) assumption that the rwlock was recursive. 
We have a couple of places where the read lock can be nested (mostly the 
memaccess code).


I found out today that it can actually deadlock in the following case:
1) CPU A -> Ask for the read lock
=> Succeed
2) CPU B -> Ask for the write lock
=> Already taken by CPU A so go to the slowpath
=> Take the internal lock (lock->lock)
=> Wait until the lock is released.
3) CPU A -> Ask for the read lock recursively
=> A writer is waiting so go to the slowpath
=> Try to take the internal lock (lock->lock)
=> Blocking because CPU B already has it

So we end up in a deadlock case. Can someone confirm whether whether 
rwlock was meant to be recursive?


Similarly, I was thinking to move the p2m code to the per-cpu rwlock as 
x86 does. From what I gathered by reading the x86 code, this lock could 
be taken recursively. Am I right?


Lastly, the x86 p2m code seemed to use rwlock before hand. How did the 
p2m code was handling recursive locking?


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke test] 120071: tolerable all pass - PUSHED

2018-02-27 Thread osstest service owner
flight 120071 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120071/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  d81925f7b37efd96711f31832e0b03fa3b304b48
baseline version:
 xen  40681735502648fdc45973382a440aa38f4ec800

Last test of basis   120066  2018-02-27 14:01:11 Z0 days
Testing same since   120071  2018-02-27 17:02:04 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Ian Jackson 
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   4068173550..d81925f7b3  d81925f7b37efd96711f31832e0b03fa3b304b48 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC XEN PATCH v4 33/41] tools/libacpi, hvmloader: detect QEMU fw_cfg interface

2018-02-27 Thread Anthony PERARD
On Thu, Dec 07, 2017 at 06:10:22PM +0800, Haozhong Zhang wrote:
> diff --git a/tools/libacpi/qemu_fw_cfg.c b/tools/libacpi/qemu_fw_cfg.c
> new file mode 100644
> index 00..254d2f575d
> --- /dev/null
> +++ b/tools/libacpi/qemu_fw_cfg.c
> @@ -0,0 +1,66 @@
> +/*
> + * libacpi/qemu_fw_cfg.c
> + *
> + * Driver of QEMU fw_cfg interface. The reference document can be found at
> + * https://github.com/qemu/qemu/blob/master/docs/specs/fw_cfg.txt.

On github.com, it's only a mirror. I think the URL should read:
https://git.qemu.org/?p=qemu.git;a=blob;f=docs/specs/fw_cfg.txt;hb=HEAD

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC XEN PATCH v4 34/41] tools/libacpi: probe QEMU ACPI ROMs via fw_cfg interface

2018-02-27 Thread Anthony PERARD
On Thu, Dec 07, 2017 at 06:10:23PM +0800, Haozhong Zhang wrote:
> Probe following QEMU ACPI ROMs:
>  * etc/acpi/rsdp:   QEMU RSDP, which is used to iterate other
> QEMU ACPI tables in etc/acpi/tables
> 
>  * etc/acpi/tables: other QEMU ACPI tables
> 
>  * etc/table-loader:QEMU BIOSLinkerLoader ROM, which can be
> executed to load QEMU ACPI tables
> 
>  * etc/acpi/nvdimm-mem: RAM which is used as NVDIMM ACPI DSM buffer,
> the exact location will be allocated during
> the execution of /etc/table-loader
> 
> Signed-off-by: Haozhong Zhang 
> ---

> diff --git a/tools/libacpi/qemu_loader.c b/tools/libacpi/qemu_loader.c
> new file mode 100644
> index 00..c0ed3b0ad0
> --- /dev/null
> +++ b/tools/libacpi/qemu_loader.c
> @@ -0,0 +1,82 @@
> +/*
> + * libacpi/qemu_loader.c
> + *
> + * Driver of QEMU BIOSLinkerLoader interface. The reference document
> + * can be found at
> + * https://github.com/qemu/qemu/blob/master/hw/acpi/bios-linker-loader.c.

That's only a mirror, the official QEMU tree is on git.qemu.org. So I
think the URL should read:
https://git.qemu.org/?p=qemu.git;a=blob;f=hw/acpi/bios-linker-loader.c;hb=HEAD

> + *
> + * Copyright (C) 2017,  Intel Corporation
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License, version 2.1, as published by the Free Software Foundation.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; If not, see 
> .
> + */
> +
> +#include LIBACPI_STDUTILS
> +#include "libacpi.h"
> +#include "qemu.h"
> +
> +struct rom {
> +struct fw_cfg_file file;
> +struct rom *next;
> +};
> +
> +static struct rom *roms = NULL;
> +static struct rom *bios_loader = NULL;
> +
> +static bool rom_needed(const char *file_name)
> +{
> +return
> +!strncmp(file_name, "etc/acpi/rsdp", FW_CFG_FILE_PATH_MAX_LENGTH) ||
> +!strncmp(file_name, "etc/acpi/tables", FW_CFG_FILE_PATH_MAX_LENGTH) 
> ||
> +!strncmp(file_name, "etc/table-loader", FW_CFG_FILE_PATH_MAX_LENGTH) 
> ||
> +!strncmp(file_name, "etc/acpi/nvdimm-mem", 
> FW_CFG_FILE_PATH_MAX_LENGTH);

Is it necessary to filter the "files" that are available via fw_cfg? Is
there enough memory for hvmloader to just alocate the "struct rom" for
every available files? Other solution might be to filter base on
"etc/acpi/*" + "etc/table-loader".


-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 1/5] xen: sched/credit: convert scheduling parameter to s_time_t when set

2018-02-27 Thread George Dunlap
On 02/23/2018 04:41 PM, Dario Faggioli wrote:
> Basically, instead of converting integers to s_time_t
> at usage time (hot paths), do the convertion when the
> values are set (cold paths).
> 
> This applies to the timeslice and the ratelimit
> parameters of Credit1.
> 
> Note that, when changing the type of the fields of
> struct csched_private (from unsigned to s_time_t),
> ncpus is moved up a bit, for better packing.
> 
> Signed-off-by: Dario Faggioli 

Acked-by: George Dunlap 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC XEN PATCH v4 33/41] tools/libacpi, hvmloader: detect QEMU fw_cfg interface

2018-02-27 Thread Anthony PERARD
On Thu, Dec 07, 2017 at 06:10:22PM +0800, Haozhong Zhang wrote:
> Add a function in libacpi to detect QEMU fw_cfg interface. Limit the
> usage of fw_cfg interface to hvmloader now, so use stub functions for
> others.

I think libacpi is not the right place for a driver. The fw_cfg driver
would be better in hvmloader.

As to copy the ACPI tables from fw_cfg to libacpi, maybe the passthrough
tables (or an improvement of it) could be use. (It is already to to add
extra tables from libxl (HVM_XS_ACPI_PT_ADDRESS).)

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/2] xen: events: free irqs in error condition

2018-02-27 Thread Shah, Amit

On Di, 2018-02-27 at 17:07 +, Roger Pau Monné wrote:
> On Tue, Feb 27, 2018 at 03:55:58PM +, Amit Shah wrote:
> > 
> > In case of errors in irq setup for MSI, free up the allocated irqs.
> > 
> > Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
> > Reported-by: Hooman Mirhadi 
> > CC: 
> > CC: Roger Pau Monné 
> > CC: Boris Ostrovsky 
> > CC: Eduardo Valentin 
> > CC: Juergen Gross 
> > CC: Thomas Gleixner 
> > CC: "K. Y. Srinivasan" 
> > CC: Liu Shuo 
> > CC: Anoob Soman 
> > Signed-off-by: Amit Shah 
> > ---
> >  drivers/xen/events/events_base.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/xen/events/events_base.c
> > b/drivers/xen/events/events_base.c
> > index c86d10e..a299586 100644
> > --- a/drivers/xen/events/events_base.c
> > +++ b/drivers/xen/events/events_base.c
> > @@ -750,11 +750,14 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev
> > *dev, struct msi_desc *msidesc,
> >  
> >     ret = irq_set_msi_desc(irq, msidesc);
> >     if (ret < 0)
> > -   goto error_irq;
> > +   goto error_desc;
> >  out:
> >     mutex_unlock(_mapping_update_lock);
> >     return irq;
> >  error_irq:
> > +   while (--nvec >= i)
> > +   xen_free_irq(irq + nvec);
> > +error_desc:
> >     while (i > 0) {
> >     i--;
> >     __unbind_from_irq(irq + i);
> It seems pointless to introduce another label and another loop to fix
> something that can be fixed with a single label and a single loop,
> this just makes the code more complex for no reason.

I disagree, just because there are two different cleanups to be made
for two different issues; it's not as if the if.. and else conditions
are going to be interleaved.

Anyway it's a matter of taste.

Since you've already proposed the patch, would you mind baking a proper
one and posting it?

Thanks!


> IMHO the way to solve this issue is:
> 
> while (nvec--) {
>   if (nvec >= i)
>   xen_free_irq(irq + nvec);
>   else
>   __unbind_from_irq(irq + nvec);
> }

Amit

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Naming things (was XEN_SYSCTL_get_cpuid_policy)

2018-02-27 Thread Andrew Cooper
All,

Having finally got back to some CPUID work, I've come back to a naming
problem I've been unable to resolve in the intervening time.

Originally, the plan was to introduce XEN_SYSCTL_get_cpuid_policy and
XEN_DOMCTL_{get,set}_cpuid_policy, which was shortly followed by similar
MSR policy calls.

However, to do the hypervisor side audit, the domain set cpuid and msr
policy calls have to be done together, so all state can be cross-checked
together.

Therefore, I was thinking of something more along the lines of
XEN_DOMCTL_{get,set}_arch_config to be rather more generic, and prevent
the need to add a new get/set hypercall pair for each class of
information.  However, that name is confusing with struct
xen_arch_domainconfig arch_config as part of the createdomain call.

So, XEN_DOMCTL_{get,set}_arch_settings ?

Also, is this the kind of thing which would be useful on ARM?  Its
intended for the things settings which are essentially variable for the
domain, based on the admins choice.  One usecase Jan has proposed is the
ability for a one-time action where the admin has done an update (e.g.
taken the Spectre Microcode) and wants to bump up the visible featureset
in the guest, knowing that the guest has a mechanism to detect and adapt
to the new features.  (I realise this description is a bit woolly).

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC QEMU PATCH v4 00/10] Implement vNVDIMM for Xen HVM guest

2018-02-27 Thread Anthony PERARD
On Thu, Dec 07, 2017 at 06:18:02PM +0800, Haozhong Zhang wrote:
> This is the QEMU part patches that works with the associated Xen
> patches to enable vNVDIMM support for Xen HVM domains. Xen relies on
> QEMU to build guest NFIT and NVDIMM namespace devices, and allocate
> guest address space for vNVDIMM devices.

I've got other question, and maybe possible improvements.

When QEMU build the ACPI tables, it also initialize some MemoryRegion,
which use more guest memory. Do you know if those regions are used with
your patch series on Xen? Otherwise, we could try to avoid their
creation with this:
In xenfv_machine_options()
m->rom_file_has_mr = false;
(setting this in xen_hvm_init() would probably be better, but I havn't
try)

If this is possible, libxl would not need to allocate more memory for
the guest (dm_acpi_size).

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/2] xen: events: free irqs in error condition

2018-02-27 Thread Roger Pau Monné
On Tue, Feb 27, 2018 at 03:55:58PM +, Amit Shah wrote:
> In case of errors in irq setup for MSI, free up the allocated irqs.
> 
> Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
> Reported-by: Hooman Mirhadi 
> CC: 
> CC: Roger Pau Monné 
> CC: Boris Ostrovsky 
> CC: Eduardo Valentin 
> CC: Juergen Gross 
> CC: Thomas Gleixner 
> CC: "K. Y. Srinivasan" 
> CC: Liu Shuo 
> CC: Anoob Soman 
> Signed-off-by: Amit Shah 
> ---
>  drivers/xen/events/events_base.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/xen/events/events_base.c 
> b/drivers/xen/events/events_base.c
> index c86d10e..a299586 100644
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -750,11 +750,14 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, 
> struct msi_desc *msidesc,
>  
>   ret = irq_set_msi_desc(irq, msidesc);
>   if (ret < 0)
> - goto error_irq;
> + goto error_desc;
>  out:
>   mutex_unlock(_mapping_update_lock);
>   return irq;
>  error_irq:
> + while (--nvec >= i)
> + xen_free_irq(irq + nvec);
> +error_desc:
>   while (i > 0) {
>   i--;
>   __unbind_from_irq(irq + i);

It seems pointless to introduce another label and another loop to fix
something that can be fixed with a single label and a single loop,
this just makes the code more complex for no reason.

IMHO the way to solve this issue is:

while (nvec--) {
if (nvec >= i)
xen_free_irq(irq + nvec);
else
__unbind_from_irq(irq + nvec);
}

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/xen: Calculate __max_logical_packages on PV domains

2018-02-27 Thread Simon Gaiser
Boris Ostrovsky:
> On 02/07/2018 06:49 PM, Prarit Bhargava wrote:
>> The kernel panics on PV domains because native_smp_cpus_done() is
>> only called for HVM domains.
>>
>> Calculate __max_logical_packages for PV domains.
>>
>> Fixes: b4c0a7326f5d ("x86/smpboot: Fix __max_logical_packages estimate")
>> Signed-off-by: Prarit Bhargava 
>> Tested-and-reported-by: Simon Gaiser 
>> Cc: Thomas Gleixner 
>> Cc: Ingo Molnar 
>> Cc: "H. Peter Anvin" 
>> Cc: x...@kernel.org
>> Cc: Boris Ostrovsky 
>> Cc: Juergen Gross 
>> Cc: Dou Liyang 
>> Cc: Prarit Bhargava 
>> Cc: Kate Stewart 
>> Cc: Greg Kroah-Hartman 
>> Cc: Andy Lutomirski 
>> Cc: Andi Kleen 
>> Cc: Vitaly Kuznetsov 
>> Cc: xen-devel@lists.xenproject.org
> 
> 
> Reviewed-by: Boris Ostrovsky 

The fixed bug is in 4.15 so it should get into stable. It's
63e708f826bb21470155d37b103a75d8a9e25b18 upstream.



signature.asc
Description: OpenPGP digital signature
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 43/49] ARM: new VGIC: Add preliminary stub implementations

2018-02-27 Thread Andre Przywara
Hi,

On 19/02/18 12:34, Julien Grall wrote:
> Hi,
> 
> On 09/02/18 14:39, Andre Przywara wrote:
>> The Xen core code requires an interrupt controller emulation to implement
>> arch_move_irqs(), to move the affinity of an hardware mapped virtual IRQ
>> to another core. In the moment we don't implement this
>> physical-follow-virtual regime in our new VGIC, so just provide an empty
>> stub implementation to make the linker happy.
> 
> physical-follow-virtual is a must feature for the new vGIC. This has
> shown better interrupt latency.
> 
>> Similarily vgic_clear_pending_irqs() is required by the ARM code,
>> although it is suspected that it is actually not necessary. Go with a
>> stub for now.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>   xen/arch/arm/vgic/vgic.c | 13 +
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
>> index d91028bd43..77fa756329 100644
>> --- a/xen/arch/arm/vgic/vgic.c
>> +++ b/xen/arch/arm/vgic/vgic.c
>> @@ -770,6 +770,19 @@ void gic_dump_vgic_info(struct vcpu *v)
>>  irq->active ? "" : "not ", irq->enabled ? "" : "not ");
>>   }
>>   +void vgic_clear_pending_irqs(struct vcpu *v)
>> +{
>> +    /*
>> + * TODO: It is unclear whether we really need this, so we might
>> instead
>> + * remove it on the caller site.
>> + */
> 
> I remember some issue with the current vGIC when not removing pending
> interrupts on PSCI CPU ON. But that was a while ago. I will have another
> try and see if we can drop it.

So that's what I came up with:
CPU_ON can be called from two states:
- The initial state of a core before it's being brought up for the first
time.
- After the OS has called CPU_OFF.
 The PSCI spec says that before calling CPU_OFF an OS all IRQs must have
been migrated away from that core. I take this as no IRQs are allowed,
hence we don't have to clear anything on CPU_ON.

In both cases the CPU is expected to enter in a defined state, which
includes all interrupts masked on the CPU level (SPSR.ELx.[DAIF] = 1).
The GIC defaults to ISPENDR being 0.

So I take we should not be held responsible for clearing the pending
state upon CPU_ON.

What is your opinion?

>> +}
>> +
>> +void arch_move_irqs(struct vcpu *v)
>> +{
>> +    /* TODO: implement this (?) */
> 
> Here you would need to go through the interrupt and modify the affinity
> of the physical IRQs routed to that vCPU.

Since this is apparently the next best thing after sliced bread, I
implemented this now. Coming to a theatre near you any time soon.

> 
>> +}
>> +
>>   struct irq_desc *vgic_get_hw_irq_desc(struct domain *d, struct vcpu *v,
>>     unsigned int virq)
>>   {
>>
> 
> Cheers,
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V5] x86/hvm: fix domain crash when CR3 has the noflush bit set

2018-02-27 Thread Razvan Cojocaru
On 02/27/2018 06:26 PM, George Dunlap wrote:
>>> As an aside -- are you sure clearing the NOFLUSH from reported CR3
>>> values during introspection is the right thing to do?  You don't think
>>> your introspection engine will ever want to know if the guest OS is
>>> setting this bit?
>>
>> We can't be sure this will never be useful to know, but at least for now
>> I've not seen any requests to be able to, and our introspection engine
>> is not interested in the information (in fact, one of the reasons why
>> we've even missed the problem until it's been reported is that we
>> haven't even been subscribing to CR3 write events for a while now).
>>
>> So as far as we're concerned, losing the information about the NOFLUSH
>> bit is no problem at all (and it's definitely preferable to a domain
>> crash). Since Tamas has acked the patch, it's safe to assume that they
>> have no problem with it either, and Bitweasil seemed happy with the
>> solution as well.
>>
>> I suppose we can always write a patch later if it turns out that this is
>> valuable information.
> 
> Well if you want to maintain backwards compatibility, you'd need to
> either have the bit opt-in, or pass the noflush bit back somewhere else
> (either with a flag or with a different part of the struct).
> 
> If everyone is happy with it I don't mind.

A bool-like .noflush field could be nice, except it would only apply to
CR3 but affect both the common CR structure:

243 struct vm_event_write_ctrlreg {
244 uint32_t index;
245 uint32_t _pad;
246 uint64_t new_value;
247 uint64_t old_value;
248 };

and the common CR monitor function:

 33 bool hvm_monitor_cr(unsigned int index, unsigned long value,
unsigned long old)
 34 {
 35 struct vcpu *curr = current;
 36 struct arch_domain *ad = >domain->arch;
 37 unsigned int ctrlreg_bitmask = monitor_ctrlreg_bitmask(index);
 38
 39 if ( index == VM_EVENT_X86_CR3 && hvm_pcid_enabled(curr) )
 40 value &= ~X86_CR3_NOFLUSH; /* Clear the noflush bit. */
 41
 42 if ( (ad->monitor.write_ctrlreg_enabled & ctrlreg_bitmask) &&
 43  (!(ad->monitor.write_ctrlreg_onchangeonly & ctrlreg_bitmask) ||
 44   value != old) &&
 45  ((value ^ old) & ~ad->monitor.write_ctrlreg_mask[index]) )
 46 {
 47 bool sync = ad->monitor.write_ctrlreg_sync & ctrlreg_bitmask;
 48
 49 vm_event_request_t req = {
 50 .reason = VM_EVENT_REASON_WRITE_CTRLREG,
 51 .u.write_ctrlreg.index = index,
 52 .u.write_ctrlreg.new_value = value,
 53 .u.write_ctrlreg.old_value = old
 54 };
 55
 56 if ( monitor_traps(curr, sync, ) >= 0 )
 57 return 1;
 58 }
 59
 60 return 0;
 61 }

There's the additional problem of old vs. new values, as compared in the
above code: the way things work, the previous (old) value will always be
NOFLUSH-free (since we clear the NOFLUSH bit before storing). That means
that a NOFLUSH-set new value, identical to the old one except for the
NOFLUSH bit will trigger an "onchangeonly" event - which we don't want,
since the actual values are really identical (it's just that the NOFLUSH
bit has been removed before storing previously).

For example: previously we had (0x1 | NOFLUSH). We didn't flush, and
we've cleared the NOFLUSH bit, and now old value == 0x1.

CR3 is now being set again to (0x1 | NOFLUSH). Compared to the old
value, it's different, but is it? Was the previous value simply 0x1? Or
was it (0x1 | NOFLUSH)?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] xen: fix out-of-bounds irq unbind for MSI message groups

2018-02-27 Thread Roger Pau Monné
On Tue, Feb 27, 2018 at 03:55:57PM +, Amit Shah wrote:
> When an MSI descriptor was not available, the error path would try to
> unbind an irq that was never acquired - potentially unbinding an
> unrelated irq.

Those IRQs have been allocated in the xen_allocate_irqs_dynamic call,
so I think the "potentially unbinding an unrelated irq" part is wrong.
The unbind call would be performed against an unbound IRQ, which is
harmless AFAICT.

> Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
> Reported-by: Hooman Mirhadi 
> CC: 
> CC: Roger Pau Monné 
> CC: Boris Ostrovsky 
> CC: Eduardo Valentin 
> CC: Juergen Gross 
> CC: Thomas Gleixner 
> CC: "K. Y. Srinivasan" 
> CC: Liu Shuo 
> CC: Anoob Soman 
> Signed-off-by: Amit Shah 
> ---
>  drivers/xen/events/events_base.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/xen/events/events_base.c 
> b/drivers/xen/events/events_base.c
> index 1ab4bd1..c86d10e 100644
> --- a/drivers/xen/events/events_base.c
> +++ b/drivers/xen/events/events_base.c
> @@ -755,8 +755,10 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct 
> msi_desc *msidesc,
>   mutex_unlock(_mapping_update_lock);
>   return irq;
>  error_irq:
> - for (; i >= 0; i--)
> + while (i > 0) {

while (i--)
__unbind_from_irq(irq + i);

Although please see reply to patch 2.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC QEMU PATCH v4 05/10] xen-hvm: initialize fw_cfg interface

2018-02-27 Thread Anthony PERARD
On Thu, Dec 07, 2017 at 06:18:07PM +0800, Haozhong Zhang wrote:
> Xen is going to reuse QEMU to build ACPI of some devices (e.g., NFIT
> and SSDT for NVDIMM) for HVM domains. The existing QEMU ACPI build
> code requires a fw_cfg interface which will also be used to pass QEMU
> built ACPI to Xen. Therefore, we need to initialize fw_cfg when any
> ACPI is going to be built by QEMU.
> 
> Signed-off-by: Haozhong Zhang 
> ---
> Cc: Stefano Stabellini 
> Cc: Anthony Perard 
> Cc: "Michael S. Tsirkin" 
> Cc: Paolo Bonzini 
> Cc: Richard Henderson 
> Cc: Eduardo Habkost 
> ---
>  hw/i386/xen/xen-hvm.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
> index fe01b7a025..4b29f4052b 100644
> --- a/hw/i386/xen/xen-hvm.c
> +++ b/hw/i386/xen/xen-hvm.c
> @@ -14,6 +14,7 @@
>  #include "hw/pci/pci.h"
>  #include "hw/i386/pc.h"
>  #include "hw/i386/apic-msidef.h"
> +#include "hw/loader.h"
>  #include "hw/xen/xen_common.h"
>  #include "hw/xen/xen_backend.h"
>  #include "qmp-commands.h"
> @@ -1234,6 +1235,14 @@ static void xen_wakeup_notifier(Notifier *notifier, 
> void *data)
>  xc_set_hvm_param(xen_xc, xen_domid, HVM_PARAM_ACPI_S_STATE, 0);
>  }
>  
> +static void xen_fw_cfg_init(PCMachineState *pcms)
> +{

The fw_cfg interface might already be initialized, it is used for
"direct kernel boot" on hvm. It is initialized in xen_load_linux().

> +FWCfgState *fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE);
> +
> +rom_set_fw(fw_cfg);
> +pcms->fw_cfg = fw_cfg;
> +}
> +
>  void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory)
>  {
>  int i, rc;
> @@ -1384,6 +1393,9 @@ void xen_hvm_init(PCMachineState *pcms, MemoryRegion 
> **ram_memory)
>  
>  /* Disable ACPI build because Xen handles it */
>  pcms->acpi_build_enabled = false;
> +if (pcms->acpi_build_enabled) {
> +xen_fw_cfg_init(pcms);
> +}
>  
>  return;
>  
> -- 
> 2.15.1
> 

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] tools/misc: Tweak reserved bit handling for xen-cpuid

2018-02-27 Thread Andrew Cooper
On 27/02/18 16:38, Jan Beulich wrote:
 On 27.02.18 at 17:28,  wrote:
>> Instead of printing REZ, use NULL pointers to indicate missing information,
>> and have dump_leaf() print out the bit which is unknown.
>>
>> E.g.
>>
>> 
>> Dynamic sets:
>> Raw   
>> 178bfbff:fed8320b:2fd3fbff:35c233ff:000f:209c01a9::6799:1007:
>>   [00] 0x0001.edx fpu vme de pse tsc msr pae mce cx8 apic sysenter 
>> mtrr pge  ...
>>   [01] 0x0001.ecx sse3 pclmulqdq monitor ssse3 fma cx16 sse41 sse42 
>> movebe   ...
>>   [02] 0x8001.edx fpu vme de pse tsc msr pae mce cx8 apic syscall 
>> mtrr pge   ...
>>   [03] 0x8001.ecx lahf_lm cmp svm extapic cr8d lzcnt sse4a msse 
>> 3dnowpf osvw ...
>>   [04] 0x000d:1.eax   xsaveopt xsavec xgetbv1 xsaves
>>   [05] 0x0007:0.ebx   fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap 
>> clflushopt sha
>>   [06] 0x0007:0.ecx
>>   [07] 0x8007.edx itsc  efro 
>>  
>>   [08] 0x8008.ebx clzero   ibpb
>>   [09] 0x0007:0.edx
> How about omitting the BIT part, making it just <0> etc, which still
> makes clear the bit is set but has no name? Other than that ...

Hmm

  [07] 0x8007.edx <0> <3> <4> <7> itsc <9> efro <13> <14>
  [08] 0x8008.ebx clzero <1> <2> ibpb

Yes - I think I prefer that.

>
>> which is the output on an AMD EPYC system, where Xen doesn't know about, and
>> has therefore masked, most of the more advanced thermal/perf features.
>>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Jan Beulich 

Thanks,

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC QEMU PATCH v4 03/10] hostmem-xen: add a host memory backend for Xen

2018-02-27 Thread Anthony PERARD
On Thu, Dec 07, 2017 at 06:18:05PM +0800, Haozhong Zhang wrote:
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index ee2c2d5bfd..ba13a52994 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -12,6 +12,7 @@
>  #include "qemu/osdep.h"
>  #include "sysemu/hostmem.h"
>  #include "hw/boards.h"
> +#include "hw/xen/xen.h"
>  #include "qapi/error.h"
>  #include "qapi/visitor.h"
>  #include "qapi-types.h"
> @@ -277,6 +278,14 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
> Error **errp)
>  goto out;
>  }
>  
> +/*
> + * The backend storage of MEMORY_BACKEND_XEN is managed by Xen,
> + * so no further work in this function is needed.
> + */
> +if (xen_enabled() && !backend->mr.ram_block) {
> +goto out;
> +}
> +
>  ptr = memory_region_get_ram_ptr(>mr);
>  sz = memory_region_size(>mr);
>  
> diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
> index 66eace5a5c..dcbfce33d5 100644
> --- a/hw/mem/pc-dimm.c
> +++ b/hw/mem/pc-dimm.c
> @@ -28,6 +28,7 @@
>  #include "sysemu/kvm.h"
>  #include "trace.h"
>  #include "hw/virtio/vhost.h"
> +#include "hw/xen/xen.h"
>  
>  typedef struct pc_dimms_capacity {
>   uint64_t size;
> @@ -108,7 +109,10 @@ void pc_dimm_memory_plug(DeviceState *dev, 
> MemoryHotplugState *hpms,
>  }
>  
>  memory_region_add_subregion(>mr, addr - hpms->base, mr);
> -vmstate_register_ram(vmstate_mr, dev);
> +/* memory-backend-xen is not backed by RAM. */
> +if (!xen_enabled()) {

Is it possible to have the same condition as the one used in
host_memory_backend_memory_complete? i.e. base on whether the memory
region is mapped or not (backend->mr.ram_block).

> +vmstate_register_ram(vmstate_mr, dev);
> +}
>  numa_set_mem_node_id(addr, memory_region_size(mr), dimm->node);
>  
>  out:
> -- 
> 2.15.1
> 

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] tools/misc: Tweak reserved bit handling for xen-cpuid

2018-02-27 Thread Jan Beulich
>>> On 27.02.18 at 17:28,  wrote:
> Instead of printing REZ, use NULL pointers to indicate missing information,
> and have dump_leaf() print out the bit which is unknown.
> 
> E.g.
> 
> 
> Dynamic sets:
> Raw   
> 178bfbff:fed8320b:2fd3fbff:35c233ff:000f:209c01a9::6799:1007:
>   [00] 0x0001.edx fpu vme de pse tsc msr pae mce cx8 apic sysenter 
> mtrr pge  ...
>   [01] 0x0001.ecx sse3 pclmulqdq monitor ssse3 fma cx16 sse41 sse42 
> movebe   ...
>   [02] 0x8001.edx fpu vme de pse tsc msr pae mce cx8 apic syscall 
> mtrr pge   ...
>   [03] 0x8001.ecx lahf_lm cmp svm extapic cr8d lzcnt sse4a msse 
> 3dnowpf osvw ...
>   [04] 0x000d:1.eax   xsaveopt xsavec xgetbv1 xsaves
>   [05] 0x0007:0.ebx   fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap 
> clflushopt sha
>   [06] 0x0007:0.ecx
>   [07] 0x8007.edx itsc  efro 
>  
>   [08] 0x8008.ebx clzero   ibpb
>   [09] 0x0007:0.edx

How about omitting the BIT part, making it just <0> etc, which still
makes clear the bit is set but has no name? Other than that ...

> which is the output on an AMD EPYC system, where Xen doesn't know about, and
> has therefore masked, most of the more advanced thermal/perf features.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC QEMU PATCH v4 02/10] xen-hvm: create the hotplug memory region on Xen

2018-02-27 Thread Anthony PERARD
On Thu, Dec 07, 2017 at 06:18:04PM +0800, Haozhong Zhang wrote:
> The guest physical address of vNVDIMM is allocated from the hotplug
> memory region, which is not created when QEMU is used as Xen device
> model. In order to use vNVDIMM for Xen HVM domains, this commit reuses
> the code for pc machine type to create the hotplug memory region for
> Xen HVM domains.
> 
> Signed-off-by: Haozhong Zhang 
> ---
> Cc: "Michael S. Tsirkin" 
> Cc: Paolo Bonzini 
> Cc: Richard Henderson 
> Cc: Eduardo Habkost 
> Cc: Stefano Stabellini 
> Cc: Anthony Perard 
> ---
>  hw/i386/pc.c  | 86 
> ---
>  hw/i386/xen/xen-hvm.c |  2 ++
>  include/hw/i386/pc.h  |  1 +
>  3 files changed, 51 insertions(+), 38 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 186545d2a4..9f46c8df79 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1315,6 +1315,53 @@ void xen_load_linux(PCMachineState *pcms)
>  pcms->fw_cfg = fw_cfg;
>  }
>  
> +void pc_memory_hotplug_init(PCMachineState *pcms, MemoryRegion 
> *system_memory)

It might be better to have a separate patch which move the code into a function.

> +{
> +MachineState *machine = MACHINE(pcms);
> +PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +ram_addr_t hotplug_mem_size = machine->maxram_size - machine->ram_size;
> +
> +if (!pcmc->has_reserved_memory || machine->ram_size >= 
> machine->maxram_size)
> +return;
> +
> +if (memory_region_size(>hotplug_memory.mr)) {

This new check looks like to catch programming error, rather than user
error. Would it be better to be an assert instead?

> +error_report("hotplug memory region has been initialized");
> +exit(EXIT_FAILURE);
> +}
> +

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke test] 120066: tolerable all pass - PUSHED

2018-02-27 Thread osstest service owner
flight 120066 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120066/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  40681735502648fdc45973382a440aa38f4ec800
baseline version:
 xen  5b0426207998cd1f381b46299907bbcdf8bd240d

Last test of basis   120058  2018-02-27 11:02:08 Z0 days
Testing same since   120066  2018-02-27 14:01:11 Z0 days1 attempts


People who touched revisions under test:
  Alexandru Isaila 
  Andrew Cooper 
  George Dunlap 
  Jan Beulich 
  Razvan Cojocaru 
  Roger Pau Monné 
  Tamas K Lengyel 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   5b04262079..4068173550  40681735502648fdc45973382a440aa38f4ec800 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] tools/misc: Tweak reserved bit handling for xen-cpuid

2018-02-27 Thread Andrew Cooper
Instead of printing REZ, use NULL pointers to indicate missing information,
and have dump_leaf() print out the bit which is unknown.

E.g.


Dynamic sets:
Raw   
178bfbff:fed8320b:2fd3fbff:35c233ff:000f:209c01a9::6799:1007:
  [00] 0x0001.edx fpu vme de pse tsc msr pae mce cx8 apic sysenter mtrr 
pge  ...
  [01] 0x0001.ecx sse3 pclmulqdq monitor ssse3 fma cx16 sse41 sse42 
movebe   ...
  [02] 0x8001.edx fpu vme de pse tsc msr pae mce cx8 apic syscall mtrr 
pge   ...
  [03] 0x8001.ecx lahf_lm cmp svm extapic cr8d lzcnt sse4a msse 3dnowpf 
osvw ...
  [04] 0x000d:1.eax   xsaveopt xsavec xgetbv1 xsaves
  [05] 0x0007:0.ebx   fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap 
clflushopt sha
  [06] 0x0007:0.ecx
  [07] 0x8007.edx itsc  efro  

  [08] 0x8008.ebx clzero   ibpb
  [09] 0x0007:0.edx
...

which is the output on an AMD EPYC system, where Xen doesn't know about, and
has therefore masked, most of the more advanced thermal/perf features.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Ian Jackson 
CC: Wei Liu 

Hey look - I've finally got back to some CPUID work...
---
 tools/misc/xen-cpuid.c | 55 +-
 1 file changed, 18 insertions(+), 37 deletions(-)

diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index 8c3dac0..d56a8c4 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -16,12 +16,12 @@ static const char *str_1d[32] =
 [ 4] = "tsc",  [ 5] = "msr",
 [ 6] = "pae",  [ 7] = "mce",
 [ 8] = "cx8",  [ 9] = "apic",
-[10] = "REZ",  [11] = "sysenter",
+/* [10] */ [11] = "sysenter",
 [12] = "mtrr", [13] = "pge",
 [14] = "mca",  [15] = "cmov",
 [16] = "pat",  [17] = "pse36",
 [18] = "psn",  [19] = "clflush",
-[20] = "REZ",  [21] = "ds",
+/* [20] */ [21] = "ds",
 [22] = "acpi", [23] = "mmx",
 [24] = "fxsr", [25] = "sse",
 [26] = "sse2", [27] = "ss",
@@ -39,7 +39,7 @@ static const char *str_1c[32] =
 [10] = "cntx-id", [11] = "sdgb",
 [12] = "fma", [13] = "cx16",
 [14] = "xtpr",[15] = "pdcm",
-[16] = "REZ", [17] = "pcid",
+/* [16] */[17] = "pcid",
 [18] = "dca", [19] = "sse41",
 [20] = "sse42",   [21] = "x2apic",
 [22] = "movebe",  [23] = "popcnt",
@@ -56,16 +56,16 @@ static const char *str_e1d[32] =
 [ 4] = "tsc",[ 5] = "msr",
 [ 6] = "pae",[ 7] = "mce",
 [ 8] = "cx8",[ 9] = "apic",
-[10] = "REZ",[11] = "syscall",
+/* [10] */   [11] = "syscall",
 [12] = "mtrr",   [13] = "pge",
 [14] = "mca",[15] = "cmov",
 [16] = "fcmov",  [17] = "pse36",
-[18] = "REZ",[19] = "mp",
-[20] = "nx", [21] = "REZ",
+/* [18] */   [19] = "mp",
+[20] = "nx", /* [21] */
 [22] = "mmx+",   [23] = "mmx",
 [24] = "fxsr",   [25] = "fxsr+",
 [26] = "pg1g",   [27] = "rdtscp",
-[28] = "REZ",[29] = "lm",
+/* [28] */   [29] = "lm",
 [30] = "3dnow+", [31] = "3dnow",
 };
 
@@ -78,16 +78,14 @@ static const char *str_e1c[32] =
 [ 8] = "3dnowpf",[ 9] = "osvw",
 [10] = "ibs",[11] = "xop",
 [12] = "skinit", [13] = "wdt",
-[14] = "REZ",[15] = "lwp",
+/* [14] */   [15] = "lwp",
 [16] = "fma4",   [17] = "tce",
-[18] = "REZ",[19] = "nodeid",
-[20] = "REZ",[21] = "tbm",
+/* [18] */   [19] = "nodeid",
+/* [20] */   [21] = "tbm",
 [22] = "topoext",[23] = "perfctr_core",
-[24] = "perfctr_nb", [25] = "REZ",
+[24] = "perfctr_nb", /* [25] */
 [26] = "dbx",[27] = "perftsc",
 [28] = "pcx_l2i",[29] = "monitorx",
-
-[30 ... 31] = "REZ",
 };
 
 static const char *str_7b0[32] =
@@ -114,8 +112,6 @@ static const char *str_Da1[32] =
 {
 [ 0] = "xsaveopt", [ 1] = "xsavec",
 [ 2] = "xgetbv1",  [ 3] = "xsaves",
-
-[4 ... 31] = "REZ",
 };
 
 static const char *str_7c0[32] =
@@ -124,49 +120,29 @@ static const char *str_7c0[32] =
 [ 2] = "umip", [ 3] = "pku",
 [ 4] = "ospke",
 
-[5 ... 13] = "REZ",
-
 [14] = "avx512_vpopcntdq",
 
-[15 ... 21] = "REZ",
-
 [22] = "rdpid",
-
-[23 ... 31] = "REZ",
 };
 
 static const char *str_e7d[32] =
 {
-[0 ... 7] = "REZ",
-
-[ 8] = "itsc", [ 9] = "REZ",
+[ 8] = "itsc",
 [10] = "efro",
-
-[11 ... 31] = "REZ",
 };
 
 static const char *str_e8b[32] =
 {
 [ 0] = "clzero",
 
-[1 ... 11] = "REZ",
-
 [12] = "ibpb",
-
-[13 ... 31] = "REZ",
 };
 
 static const char *str_7d0[32] =
 {
-[0 ... 1] = "REZ",
-
 [ 2] = "avx512_4vnniw", [ 3] = "avx512_4fmaps",
 
-[4 ... 25] = "REZ",
-
 [26] = "ibrsb", [27] = "stibp",
-
-[28 ... 31] = "REZ",
 };
 
 static struct {
@@ -213,7 +189,12 @@ static 

Re: [Xen-devel] [PATCH V5] x86/hvm: fix domain crash when CR3 has the noflush bit set

2018-02-27 Thread George Dunlap
On 02/27/2018 04:19 PM, Razvan Cojocaru wrote:
> On 02/27/2018 05:53 PM, George Dunlap wrote:
>> On 02/23/2018 07:29 AM, Razvan Cojocaru wrote:
>>> On 02/23/2018 06:53 AM, Tian, Kevin wrote:
> From: Razvan Cojocaru [mailto:rcojoc...@bitdefender.com]
> Sent: Friday, February 16, 2018 6:22 PM
>
> The emulation layers of Xen lack PCID support, and as we only offer
> PCID to HAP guests, all writes to CR3 are handled by hardware,
> except when introspection is involved. Consequently, trying to set
> CR3 when the noflush bit is set in hvm_set_cr3() leads to domain
> crashes. The workaround is to clear the noflush bit in
> hvm_set_cr3(). CR3 values in hvm_monitor_cr() are also sanitized.
> Additionally, a bool parameter now propagates to
> {svm,vmx}_update_guest_cr(), so that no flushes occur when
> the bit was set.

 Above message is not very clear for people who didn't follow
 previous discussions, e.g. why lacking PCID support in emulation 
 layer would lead to domain crash? and why noflush trick can 
 avoid the situation? Can you help elaborate it?
>>>
>>> Lacking PCID support in the emulation layer creates two different way of
>>> handling the NOFLUSH being set: one is in hardware, and this happens for
>>> everything except the introspection case, and one in the emulation layer
>>> (this happens when an introspection agent asks Xen to emulate an
>>> instruction when it replies to an EPT fault vm_event).
>>>
>>> The checks in place expected the guest state to be correct with regard
>>> to handling the bit being set "the hardware way", but the emulation
>>> layer was, previous to this patch, completely ignoring the NOFLUSH bit.
>>> Hence, there was a difference between the expected domain state and the
>>> actual domain state, which translated into a domain crash.
>>
>> Just getting up to speed on this -- is it the case that the hardware
>> will automatically clear this bit; so because it wasn't being cleared in
>> hvm_set_cr3() during emulation, one of the checks in Xen (on exiting the
>> emulator?) was failing due to the bit being set, and then crashing the
>> domain?
> 
> Yes, I believe that that is what happens. The check is done on VMENTRY,
> this is the original email reporting the issue showing it:
> 
> https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg02275.html
> 
>> In which case you're not so much fixing a domain crash by providing a
>> workaround, as actually implementing a bit of functionality.
>>
>> If so I think the commit message could use expanding.  What about
>> something like the following (assuming I haven't misunderstood what's
>> going on):
>>
>> ---
>> In hardware, when PCID support is enabled and the X86_CR3_NOFLUSH bit is
>> set when writing a CR3 value, the hardware will clear that that bit and
>> change the CR3 without flushing the TLB.  hvm_set_cr3(), however, was
>> ignoring this bit; the result was that post-emulation checks detected an
>> invalid CR3 value and crashed the domain.
>>
>> Handle X86_CR3_NOFLUSH in hvm_set_cr3() by:
>> 1. Clearing the bit
>> 2. Passing a "noflush" flag to lower-level cr3 setting functions to
>> indicate that a flush should not be performed.
>>
>> Also clear X86_CR3_NOFLUSH when reporting CR3 monitored CR3 writes.
>>
>> This allows introspection to be used on VMs whose operating system uses
>> the NOFLUSH bit.
>> ---
> 
> Fair enough, I'm happy to change the commit message if nobody else
> objects / has other changes in mind.
> 
>> As an aside -- are you sure clearing the NOFLUSH from reported CR3
>> values during introspection is the right thing to do?  You don't think
>> your introspection engine will ever want to know if the guest OS is
>> setting this bit?
> 
> We can't be sure this will never be useful to know, but at least for now
> I've not seen any requests to be able to, and our introspection engine
> is not interested in the information (in fact, one of the reasons why
> we've even missed the problem until it's been reported is that we
> haven't even been subscribing to CR3 write events for a while now).
> 
> So as far as we're concerned, losing the information about the NOFLUSH
> bit is no problem at all (and it's definitely preferable to a domain
> crash). Since Tamas has acked the patch, it's safe to assume that they
> have no problem with it either, and Bitweasil seemed happy with the
> solution as well.
> 
> I suppose we can always write a patch later if it turns out that this is
> valuable information.

Well if you want to maintain backwards compatibility, you'd need to
either have the bit opt-in, or pass the noflush bit back somewhere else
(either with a flag or with a different part of the struct).

If everyone is happy with it I don't mind.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [ovmf test] 120040: all pass - PUSHED

2018-02-27 Thread osstest service owner
flight 120040 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120040/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf f068aa038d09053c5dddea93c5f9576c51993546
baseline version:
 ovmf ebfca258f5d7ab59cd1b72ad56f1de0e7a138ba9

Last test of basis   119996  2018-02-24 17:57:18 Z2 days
Failing since120014  2018-02-25 14:21:05 Z2 days2 attempts
Testing same since   120040  2018-02-26 16:05:21 Z0 days1 attempts


People who touched revisions under test:
  Feng, YunhuaX 
  Hao Wu 
  Kinney, Michael D 
  Liming Gao 
  Michael D Kinney 
  Star Zeng 
  Yonghong Zhu 
  Yunhua Feng 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   ebfca258f5..f068aa038d  f068aa038d09053c5dddea93c5f9576c51993546 -> 
xen-tested-master

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V5] x86/hvm: fix domain crash when CR3 has the noflush bit set

2018-02-27 Thread Razvan Cojocaru
On 02/27/2018 05:53 PM, George Dunlap wrote:
> On 02/23/2018 07:29 AM, Razvan Cojocaru wrote:
>> On 02/23/2018 06:53 AM, Tian, Kevin wrote:
 From: Razvan Cojocaru [mailto:rcojoc...@bitdefender.com]
 Sent: Friday, February 16, 2018 6:22 PM

 The emulation layers of Xen lack PCID support, and as we only offer
 PCID to HAP guests, all writes to CR3 are handled by hardware,
 except when introspection is involved. Consequently, trying to set
 CR3 when the noflush bit is set in hvm_set_cr3() leads to domain
 crashes. The workaround is to clear the noflush bit in
 hvm_set_cr3(). CR3 values in hvm_monitor_cr() are also sanitized.
 Additionally, a bool parameter now propagates to
 {svm,vmx}_update_guest_cr(), so that no flushes occur when
 the bit was set.
>>>
>>> Above message is not very clear for people who didn't follow
>>> previous discussions, e.g. why lacking PCID support in emulation 
>>> layer would lead to domain crash? and why noflush trick can 
>>> avoid the situation? Can you help elaborate it?
>>
>> Lacking PCID support in the emulation layer creates two different way of
>> handling the NOFLUSH being set: one is in hardware, and this happens for
>> everything except the introspection case, and one in the emulation layer
>> (this happens when an introspection agent asks Xen to emulate an
>> instruction when it replies to an EPT fault vm_event).
>>
>> The checks in place expected the guest state to be correct with regard
>> to handling the bit being set "the hardware way", but the emulation
>> layer was, previous to this patch, completely ignoring the NOFLUSH bit.
>> Hence, there was a difference between the expected domain state and the
>> actual domain state, which translated into a domain crash.
> 
> Just getting up to speed on this -- is it the case that the hardware
> will automatically clear this bit; so because it wasn't being cleared in
> hvm_set_cr3() during emulation, one of the checks in Xen (on exiting the
> emulator?) was failing due to the bit being set, and then crashing the
> domain?

Yes, I believe that that is what happens. The check is done on VMENTRY,
this is the original email reporting the issue showing it:

https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg02275.html

> In which case you're not so much fixing a domain crash by providing a
> workaround, as actually implementing a bit of functionality.
> 
> If so I think the commit message could use expanding.  What about
> something like the following (assuming I haven't misunderstood what's
> going on):
> 
> ---
> In hardware, when PCID support is enabled and the X86_CR3_NOFLUSH bit is
> set when writing a CR3 value, the hardware will clear that that bit and
> change the CR3 without flushing the TLB.  hvm_set_cr3(), however, was
> ignoring this bit; the result was that post-emulation checks detected an
> invalid CR3 value and crashed the domain.
> 
> Handle X86_CR3_NOFLUSH in hvm_set_cr3() by:
> 1. Clearing the bit
> 2. Passing a "noflush" flag to lower-level cr3 setting functions to
> indicate that a flush should not be performed.
> 
> Also clear X86_CR3_NOFLUSH when reporting CR3 monitored CR3 writes.
> 
> This allows introspection to be used on VMs whose operating system uses
> the NOFLUSH bit.
> ---

Fair enough, I'm happy to change the commit message if nobody else
objects / has other changes in mind.

> As an aside -- are you sure clearing the NOFLUSH from reported CR3
> values during introspection is the right thing to do?  You don't think
> your introspection engine will ever want to know if the guest OS is
> setting this bit?

We can't be sure this will never be useful to know, but at least for now
I've not seen any requests to be able to, and our introspection engine
is not interested in the information (in fact, one of the reasons why
we've even missed the problem until it's been reported is that we
haven't even been subscribing to CR3 write events for a while now).

So as far as we're concerned, losing the information about the NOFLUSH
bit is no problem at all (and it's definitely preferable to a domain
crash). Since Tamas has acked the patch, it's safe to assume that they
have no problem with it either, and Bitweasil seemed happy with the
solution as well.

I suppose we can always write a patch later if it turns out that this is
valuable information.

> In any case, with the updated commit message:
> 
> Acked-by: George Dunlap 

Thank you very much for the review!


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] libxc backport request

2018-02-27 Thread Ian Jackson
Jan Beulich writes ("libxc backport request"):
> in the context of the migration related changes for Spectre v2 I think
> it becomes necessary to backport
> 
> 72efb1df62 tools/libxc: Avoid generating inappropriate zero-content records
> f1a0a8c3fe tools/libxc: Fix restoration of PV MSRs after migrate
> 
> to applicable trees. The lack of the former is, afaict, at least part of
> the reason why osstest's save/restore currently fails on 4.8.

Done.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 0/2] xen: fix bugs in error conditions

2018-02-27 Thread Amit Shah
Hello,

These bugs were found during code review.  Details in the commits.

Please review and apply.

v2:
 - fix up patch 2 properly (Roger Pau Monné)

CC: Roger Pau Monné 
CC: Boris Ostrovsky 
CC: Eduardo Valentin 
CC: Juergen Gross 
CC: Thomas Gleixner 
CC: "K. Y. Srinivasan" 
CC: Liu Shuo 
CC: Anoob Soman 

Amit Shah (2):
  xen: fix out-of-bounds irq unbind for MSI message groups
  xen: events: free irqs in error condition

 drivers/xen/events/events_base.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

-- 
2.7.3.AMZN

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 1/2] xen: fix out-of-bounds irq unbind for MSI message groups

2018-02-27 Thread Amit Shah
When an MSI descriptor was not available, the error path would try to
unbind an irq that was never acquired - potentially unbinding an
unrelated irq.

Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
Reported-by: Hooman Mirhadi 
CC: 
CC: Roger Pau Monné 
CC: Boris Ostrovsky 
CC: Eduardo Valentin 
CC: Juergen Gross 
CC: Thomas Gleixner 
CC: "K. Y. Srinivasan" 
CC: Liu Shuo 
CC: Anoob Soman 
Signed-off-by: Amit Shah 
---
 drivers/xen/events/events_base.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 1ab4bd1..c86d10e 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -755,8 +755,10 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct 
msi_desc *msidesc,
mutex_unlock(_mapping_update_lock);
return irq;
 error_irq:
-   for (; i >= 0; i--)
+   while (i > 0) {
+   i--;
__unbind_from_irq(irq + i);
+   }
mutex_unlock(_mapping_update_lock);
return ret;
 }
-- 
2.7.3.AMZN

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 2/2] xen: events: free irqs in error condition

2018-02-27 Thread Amit Shah
In case of errors in irq setup for MSI, free up the allocated irqs.

Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
Reported-by: Hooman Mirhadi 
CC: 
CC: Roger Pau Monné 
CC: Boris Ostrovsky 
CC: Eduardo Valentin 
CC: Juergen Gross 
CC: Thomas Gleixner 
CC: "K. Y. Srinivasan" 
CC: Liu Shuo 
CC: Anoob Soman 
Signed-off-by: Amit Shah 
---
 drivers/xen/events/events_base.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index c86d10e..a299586 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -750,11 +750,14 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct 
msi_desc *msidesc,
 
ret = irq_set_msi_desc(irq, msidesc);
if (ret < 0)
-   goto error_irq;
+   goto error_desc;
 out:
mutex_unlock(_mapping_update_lock);
return irq;
 error_irq:
+   while (--nvec >= i)
+   xen_free_irq(irq + nvec);
+error_desc:
while (i > 0) {
i--;
__unbind_from_irq(irq + i);
-- 
2.7.3.AMZN

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V5] x86/hvm: fix domain crash when CR3 has the noflush bit set

2018-02-27 Thread George Dunlap
On 02/23/2018 07:29 AM, Razvan Cojocaru wrote:
> On 02/23/2018 06:53 AM, Tian, Kevin wrote:
>>> From: Razvan Cojocaru [mailto:rcojoc...@bitdefender.com]
>>> Sent: Friday, February 16, 2018 6:22 PM
>>>
>>> The emulation layers of Xen lack PCID support, and as we only offer
>>> PCID to HAP guests, all writes to CR3 are handled by hardware,
>>> except when introspection is involved. Consequently, trying to set
>>> CR3 when the noflush bit is set in hvm_set_cr3() leads to domain
>>> crashes. The workaround is to clear the noflush bit in
>>> hvm_set_cr3(). CR3 values in hvm_monitor_cr() are also sanitized.
>>> Additionally, a bool parameter now propagates to
>>> {svm,vmx}_update_guest_cr(), so that no flushes occur when
>>> the bit was set.
>>
>> Above message is not very clear for people who didn't follow
>> previous discussions, e.g. why lacking PCID support in emulation 
>> layer would lead to domain crash? and why noflush trick can 
>> avoid the situation? Can you help elaborate it?
> 
> Lacking PCID support in the emulation layer creates two different way of
> handling the NOFLUSH being set: one is in hardware, and this happens for
> everything except the introspection case, and one in the emulation layer
> (this happens when an introspection agent asks Xen to emulate an
> instruction when it replies to an EPT fault vm_event).
> 
> The checks in place expected the guest state to be correct with regard
> to handling the bit being set "the hardware way", but the emulation
> layer was, previous to this patch, completely ignoring the NOFLUSH bit.
> Hence, there was a difference between the expected domain state and the
> actual domain state, which translated into a domain crash.

Just getting up to speed on this -- is it the case that the hardware
will automatically clear this bit; so because it wasn't being cleared in
hvm_set_cr3() during emulation, one of the checks in Xen (on exiting the
emulator?) was failing due to the bit being set, and then crashing the
domain?

In which case you're not so much fixing a domain crash by providing a
workaround, as actually implementing a bit of functionality.

If so I think the commit message could use expanding.  What about
something like the following (assuming I haven't misunderstood what's
going on):

---
In hardware, when PCID support is enabled and the X86_CR3_NOFLUSH bit is
set when writing a CR3 value, the hardware will clear that that bit and
change the CR3 without flushing the TLB.  hvm_set_cr3(), however, was
ignoring this bit; the result was that post-emulation checks detected an
invalid CR3 value and crashed the domain.

Handle X86_CR3_NOFLUSH in hvm_set_cr3() by:
1. Clearing the bit
2. Passing a "noflush" flag to lower-level cr3 setting functions to
indicate that a flush should not be performed.

Also clear X86_CR3_NOFLUSH when reporting CR3 monitored CR3 writes.

This allows introspection to be used on VMs whose operating system uses
the NOFLUSH bit.
---

As an aside -- are you sure clearing the NOFLUSH from reported CR3
values during introspection is the right thing to do?  You don't think
your introspection engine will ever want to know if the guest OS is
setting this bit?

In any case, with the updated commit message:

Acked-by: George Dunlap 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen: events: free irqs in error condition

2018-02-27 Thread Shah, Amit


On Di, 2018-02-27 at 08:14 +, Roger Pau Monné wrote:
> On Mon, Feb 26, 2018 at 06:57:03PM +, Shah, Amit wrote:
> > 
> > 
> > On Mo, 2018-02-26 at 18:14 +, Roger Pau Monné wrote:
> > > 
> > > On Mon, Feb 26, 2018 at 05:36:35PM +, Amit Shah wrote:
> > > > 
> > > > 
> > > > In case of errors in irq setup for MSI, free up the allocated
> > > > irqs.
> > > > 
> > > > Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message
> > > > groups")
> > > > Reported-by: Hooman Mirhadi 
> > > > CC: 
> > > > CC: Roger Pau Monné 
> > > > CC: David Vrabel 
> > > > CC: Boris Ostrovsky 
> > > > CC: Eduardo Valentin 
> > > > CC: Juergen Gross 
> > > > CC: Thomas Gleixner 
> > > > CC: "K. Y. Srinivasan" 
> > > > CC: Liu Shuo 
> > > > CC: Anoob Soman 
> > > > Signed-off-by: Amit Shah 
> > > > ---
> > > >  drivers/xen/events/events_base.c | 1 +
> > > >  1 file changed, 1 insertion(+)
> > > > 
> > > > diff --git a/drivers/xen/events/events_base.c
> > > > b/drivers/xen/events/events_base.c
> > > > index b6b8b29..96aa575 100644
> > > > --- a/drivers/xen/events/events_base.c
> > > > +++ b/drivers/xen/events/events_base.c
> > > > @@ -758,6 +758,7 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev
> > > > *dev, struct msi_desc *msidesc,
> > > >  error_irq:
> > > >     for (; i >= 0; i--)
> > > >     __unbind_from_irq(irq + i);
> > > > +   xen_free_irq(irq);
> > > Hm, xen_free_irq calls irq_free_desc, which is
> > > irq_free_descs(irq,
> > > 1),
> > Er...  right.
> > 
> > > 
> > > I think you will have to introduce a new free function:
> > > 
> > > xen_free_irqs(unsigned irq, unsigned int nr)
> > > 
> > > That calls irq_free_descs(irq, nr)
> > Actually, xen_free_irq() is already done in __unbind_from_irq(), so
> > this patch is actually wrong and not needed.
> You still need to free unbound IRQs, AFAICT you could fix the issue
> with a single patch, like:
> 
> while (nvec--) {
>   if (nvec >= i)
>   xen_free_irq(irq + i);
>   else
>   __unbind_from_irq(irq + i);
> }

Agreed.

However, since these are two different things, I'd still like to
separate out into two patches, and two paths so it's easier to see
what's being done.

Sending v2 in a bit.

    Amit


Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] libxc backport request

2018-02-27 Thread Jan Beulich
Ian,

in the context of the migration related changes for Spectre v2 I think
it becomes necessary to backport

72efb1df62 tools/libxc: Avoid generating inappropriate zero-content records
f1a0a8c3fe tools/libxc: Fix restoration of PV MSRs after migrate

to applicable trees. The lack of the former is, afaict, at least part of
the reason why osstest's save/restore currently fails on 4.8.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 2/3] xen/arm: domain_build: Rework the way to allocate the event channel interrupt

2018-02-27 Thread julien . grall
From: Julien Grall 

At the moment, a placeholder will be created in the device-tree for the
event channel information. Later in the domain construction, the
interrupt for the event channel upcall will be allocated the device-tree
fixed up.

Looking at the code, the current split is not necessary because all the
PPIs used by the hardware domain will by the time we create the node in
the device-tree.

From now, mandate that all interrupts are registered before
acpi_prepare() and dtb_prepare(). This allows us to rework the event
channel code and remove one placeholder.

Note, this will also help to fix the BUG(...) condition in set_interrupt_ppi
which is completely wrong. See in a follow-up patch.

Signed-off-by: Julien Grall 
---
 xen/arch/arm/domain_build.c | 74 +
 1 file changed, 35 insertions(+), 39 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index a5e5c82355..ed1a393bb5 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -576,7 +576,10 @@ static int make_memory_node(const struct domain *d,
 return res;
 }
 
-static int make_hypervisor_node(const struct kernel_info *kinfo,
+static void evtchn_allocate(struct domain *d);
+
+static int make_hypervisor_node(struct domain *d,
+const struct kernel_info *kinfo,
 const struct dt_device_node *parent)
 {
 const char compat[] =
@@ -620,10 +623,18 @@ static int make_hypervisor_node(const struct kernel_info 
*kinfo,
 return res;
 
 /*
- * Placeholder for the event channel interrupt.  The values will be
- * replaced later.
+ * It is safe to allocate the event channel here because all the
+ * PPIs used by the hardware domain have been registered.
+ */
+evtchn_allocate(d);
+
+/*
+ * Interrupt event channel upcall:
+ *  - Active-low level-sensitive
+ *  - All CPUs
+ *  TODO: Handle properly the cpumask;
  */
-set_interrupt_ppi(intr, ~0, 0xf, IRQ_TYPE_INVALID);
+set_interrupt_ppi(intr, d->arch.evtchn_irq, 0xf, IRQ_TYPE_LEVEL_LOW);
 res = fdt_property_interrupts(fdt, , 1);
 if ( res )
 return res;
@@ -1282,7 +1293,11 @@ static int handle_node(struct domain *d, struct 
kernel_info *kinfo,
 
 if ( node == dt_host )
 {
-res = make_hypervisor_node(kinfo, node);
+/*
+ * The hypervisor node should always be created after all nodes
+ * from the host DT have been parsed.
+ */
+res = make_hypervisor_node(d, kinfo, node);
 if ( res )
 return res;
 
@@ -1939,6 +1954,12 @@ static int prepare_acpi(struct domain *d, struct 
kernel_info *kinfo)
 if ( rc != 0 )
 return rc;
 
+/*
+ * All PPIs have been registered, allocate the event channel
+ * interrupts.
+ */
+evtchn_allocate(d);
+
 return 0;
 }
 #else
@@ -2014,16 +2035,18 @@ static void initrd_load(struct kernel_info *kinfo)
 panic("Unable to copy the initrd in the hwdom memory");
 }
 
-static void evtchn_fixup(struct domain *d, struct kernel_info *kinfo)
+/*
+ * Allocate the event channel PPIs and setup the HVM_PARAM_CALLBACK_IRQ.
+ * The allocated IRQ will be found in d->arch.evtchn_irq.
+ *
+ * Note that this should only be called once all PPIs used by the
+ * hardware domain have been registered.
+ */
+static void evtchn_allocate(struct domain *d)
 {
-int res, node;
+int res;
 u64 val;
-gic_interrupt_t intr;
 
-/*
- * The allocation of the event channel IRQ has been deferred until
- * now. At this time, all PPIs used by DOM0 have been registered.
- */
 res = vgic_allocate_ppi(d);
 if ( res < 0 )
 panic("Unable to allocate a PPI for the event channel interrupt\n");
@@ -2041,31 +2064,6 @@ static void evtchn_fixup(struct domain *d, struct 
kernel_info *kinfo)
  HVM_PARAM_CALLBACK_TYPE_PPI_FLAG_MASK);
 val |= d->arch.evtchn_irq;
 d->arch.hvm_domain.params[HVM_PARAM_CALLBACK_IRQ] = val;
-
-/*
- * When booting Dom0 using ACPI, Dom0 can only get the event channel
- * interrupt via hypercall.
- */
-if ( !acpi_disabled )
-return;
-
-/* Fix up "interrupts" in /hypervisor node */
-node = fdt_path_offset(kinfo->fdt, "/hypervisor");
-if ( node < 0 )
-panic("Cannot find the /hypervisor node");
-
-/* Interrupt event channel upcall:
- *  - Active-low level-sensitive
- *  - All CPUs
- *
- *  TODO: Handle properly the cpumask
- */
-set_interrupt_ppi(intr, d->arch.evtchn_irq, 0xf,
-  IRQ_TYPE_LEVEL_LOW);
-res = fdt_setprop_inplace(kinfo->fdt, node, "interrupts",
-  , sizeof(intr));
-if ( res )
-panic("Cannot fix up \"interrupts\" property of the hypervisor node");
 }
 
 static void __init find_gnttab_region(struct 

Re: [Xen-devel] [PATCH 6/6] x86/msr: Blacklist various MSRs which guests definitely shouldn't be using

2018-02-27 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 05:35:19PM +, Andrew Cooper wrote:
> The main purpose is to blacklist the Intel Resource Director Technology MSRs.
> We do not yet virtualise support for guests, but Linux has been observed to
> probe for these MSRs without checking CPUID first.
> 
> The architecturally inaccessable ranges don't need to fall back into the
> legacy ranges, because they are not going to eventually evaluate as
> accessible.
> 
> The Silicon Debug interface will probably never be virtualised for guests, but
> doesn't want to leak through from real hardware.  SGX isn't yet virtualised,
> but likely will be in the future.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Wei Liu 
> CC: Roger Pau Monné 
> CC: Sergey Dyasli 
> ---
>  xen/arch/x86/msr.c | 44 
>  1 file changed, 44 insertions(+)
> 
> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> index 3cb4158..0237637 100644
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -158,6 +158,10 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
> uint64_t *val)
>  *val = vp->spec_ctrl.raw;
>  break;
>  
> +case 0x8c ... 0x8f: /* SGX Launch Enclave Public Key Hash. */
> +/* Not implemented yet. */
> +goto gp_fault;
> +
>  case MSR_INTEL_PLATFORM_INFO:
>  if ( !dp->plaform_info.available )
>  goto gp_fault;
> @@ -183,6 +187,15 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
> uint64_t *val)
>  ret = guest_rdmsr_x2apic(v, msr, val);
>  goto out;
>  
> +case 0xc80:
> +/* Silicon Debug Inferface not advertised to guests. */
> +goto gp_fault;
> +
> +case 0xc81 ... 0xc8f: /* Misc RDT MSRs. */
> +case 0xc90 ... 0xd8f: /* CAT Mask registers. */
> +/* Not implemented yet. */
> +goto gp_fault;

You have sorted them by index, I would maybe group them together in
order to use a single goto.

I would however prefer to have defines for those values.

> +
>  case 0x4000 ... 0x41ff:
>  if ( is_viridian_domain(d) )
>  {
> @@ -196,6 +209,15 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
> uint64_t *val)
>  goto out;
>  
>  default:
> +/*
> + * Blacklist the architecturally inaccessable MSRs. No point 
> wandering
> + * the legacy handlers.
> + */
> +if ( msr > 0x1fff &&
> + (msr < 0xc000 || msr > 0xc0001fff) &&
> + (msr < 0xc001 || msr > 0xc0011fff) )

Maybe this could be a macro in order to avoid duplication with the
chunk in wrmsr? (MSR_INACCESSIBLE or some such).

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 1/3] xen/arm: domain_build: Prepare DTB/ACPI tables after specific mappings

2018-02-27 Thread julien . grall
From: Julien Grall 

A follow-up patch will require to have all interrupts routed to the
hardware registered before calling prepare_dtb/prepare_acpi.

At the moment, it is not necessary to call platform specific mappings
(gic and platform) after, so it is fine to move them.

Signed-off-by: Julien Grall 
---
 xen/arch/arm/domain_build.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 941688a2ce..a5e5c82355 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2145,14 +2145,6 @@ int construct_dom0(struct domain *d)
 allocate_memory(d, );
 find_gnttab_region(d, );
 
-if ( acpi_disabled )
-rc = prepare_dtb(d, );
-else
-rc = prepare_acpi(d, );
-
-if ( rc < 0 )
-return rc;
-
 /* Map extra GIC MMIO, irqs and other hw stuffs to dom0. */
 rc = gic_map_hwdom_extra_mappings(d);
 if ( rc < 0 )
@@ -2162,6 +2154,14 @@ int construct_dom0(struct domain *d)
 if ( rc < 0 )
 return rc;
 
+if ( acpi_disabled )
+rc = prepare_dtb(d, );
+else
+rc = prepare_acpi(d, );
+
+if ( rc < 0 )
+return rc;
+
 /*
  * The following loads use the domain's p2m and require current to
  * be a vcpu of the domain, temporarily switch
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 0/3] xen/arm: Rework the way to allocate event channel IRQ for hwdom

2018-02-27 Thread julien . grall
From: Julien Grall 

Hi all,

This patch series was triggered by commit 11e7dd958d "xen/arm: domain_builder:
irq sanity check logic fix" that is valid but break some assumption in the
current Xen. Rather than removing a useful BUG_ON(), this series rework
the event channel IRQ allocation for the hardware domain.

The last patch of the series will reinstance commit 11e7dd958d.

Cheers,

Julien Grall (2):
  xen/arm: domain_build: Prepare DTB/ACPI tables after specific mappings
  xen/arm: domain_build: Rework the way to allocate the event channel
interrupt

Stewart Hildebrand (1):
  xen/arm: domain_builder: irq sanity check logic fix

 xen/arch/arm/domain_build.c | 95 ++---
 1 file changed, 46 insertions(+), 49 deletions(-)

-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] libxc: really tolerate empty PV records

2018-02-27 Thread Andrew Cooper
On 27/02/18 15:00, Andrew Cooper wrote:
> On 27/02/18 14:51, Jan Beulich wrote:
>> ---
>> This is what is currently breaking save/restore on the 4.8 branch; I
>> haven't checked yet why the sending side produces an empty record there.
>> I wonder whether that's related to f1a0a8c3fe ("tools/libxc: Fix
>> restoration of PV MSRs after migrate") not having been backported yet.
>> Otoh the latter is missing on 4.7 too, yet there save/restore doesn't
>> fail.
> It will be 72efb1df6 "Avoid generating inappropriate zero-content
> records" which is missing.
>
> It triggers when there are a non-zero maximum number of MSRs, but zero
> MSRs to tend after Xen skips the empty MSRs.

send*

Sorry for the noise.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] x86/xen: add tty0 and hvc0 as preferred consoles for dom0

2018-02-27 Thread Boris Ostrovsky
On 02/27/2018 05:19 AM, Juergen Gross wrote:
> Today the tty0 and hvc0 consoles are added as a preferred consoles for
> pv domUs only. As this requires a boot parameter for getting dom0
> messages per default, add them for dom0, too.
>
> Signed-off-by: Juergen Gross 

Reviewed-by: Boris Ostrovsky 



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 5/6] x86/hvm: Handle x2apic MSRs the new guest_{rd, wr}msr() infrastructure

2018-02-27 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 05:35:18PM +, Andrew Cooper wrote:
> Dispatch from the guest_{rd,wr}msr() functions.  The read side should be safe
> outside of current context, but the write side is definitely not.  As the
> toolstack has plausible reason to access the APIC registers via this interface
> (not least because whether they are accessible at all depends on guest
> settings), unilaterally reject access attempts outside of current context.
> 
> Rename to guest_{rd,wr}msr_x2apic() for consistency, and alter the functions
> to use X86EMUL_EXCEPTION rather than X86EMUL_UNHANDLEABLE.  The previous
> callers turned UNHANDLEABLE into EXCEPTION, but using UNHANDLEABLE will now
> interfere with the fallback to legacy MSR handling.
> 
> While altering guest_rdmsr_x2apic() make a couple of minor improvements.
> Reformat the initialiser for readable[] so it indents in a more natural way,
> and alter high to be a 64bit integer to avoid shifting 0 by 32 in the common
> path.
> 
> Observant people might notice that we now don't let PV guests read the x2apic
> MSRs.  They should never have been able to in the first place.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Roger Pau Monné 

> @@ -1054,13 +1056,18 @@ int hvm_x2apic_msr_write(struct vcpu *v, unsigned int 
> msr, uint64_t msr_content)
>  case APIC_EOI:
>  case APIC_ESR:
>  if ( msr_content )
> +{
>  default:
> -return X86EMUL_UNHANDLEABLE;

Why not add a gp_fault label here and just return X86EMUL_EXCEPTION?
That seems better than adding a gp_fault label below.

> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -139,6 +139,7 @@ int init_vcpu_msr_policy(struct vcpu *v)
>  
>  int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
>  {
> +const struct vcpu *curr = current;
>  const struct domain *d = v->domain;
>  const struct cpuid_policy *cp = d->arch.cpuid;
>  const struct msr_domain_policy *dp = d->arch.msr;
> @@ -175,6 +176,13 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
> uint64_t *val)
> _MSR_MISC_FEATURES_CPUID_FAULTING;
>  break;
>  
> +case MSR_IA32_APICBASE_MSR ... MSR_IA32_APICBASE_MSR + 0x3ff:
> +if ( !is_hvm_domain(d) || v != curr )
> +goto gp_fault;
> +
> +ret = guest_rdmsr_x2apic(v, msr, val);
> +goto out;
> +
>  case 0x4000 ... 0x41ff:
>  if ( is_viridian_domain(d) )
>  {
> @@ -270,6 +278,13 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t 
> val)
>  break;
>  }
>  
> +case MSR_IA32_APICBASE_MSR ... MSR_IA32_APICBASE_MSR + 0x3ff:
> +if ( !is_hvm_domain(d) || v != curr )
> +goto gp_fault;
> +
> +ret = guest_wrmsr_x2apic(v, msr, val);
> +goto out;

AFAICT you could just use a break here?

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] libxc: really tolerate empty PV records

2018-02-27 Thread Andrew Cooper
On 27/02/18 14:51, Jan Beulich wrote:
> Commit 119ee4d773 ("tools/libxc: Tolerate specific zero-content records
> in migration v2 streams") meant tolerate those, but failed to set rc
> accordingly.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 

> ---
> This is what is currently breaking save/restore on the 4.8 branch; I
> haven't checked yet why the sending side produces an empty record there.
> I wonder whether that's related to f1a0a8c3fe ("tools/libxc: Fix
> restoration of PV MSRs after migrate") not having been backported yet.
> Otoh the latter is missing on 4.7 too, yet there save/restore doesn't
> fail.

It will be 72efb1df6 "Avoid generating inappropriate zero-content
records" which is missing.

It triggers when there are a non-zero maximum number of MSRs, but zero
MSRs to tend after Xen skips the empty MSRs.

~Andrew

>
> --- a/tools/libxc/xc_sr_restore_x86_pv.c
> +++ b/tools/libxc/xc_sr_restore_x86_pv.c
> @@ -770,6 +770,7 @@ static int handle_x86_pv_vcpu_blob(struc
>  {
>  DBGPRINTF("Skipping empty %s record for vcpu %u\n",
>rec_type_to_str(rec->type), vhdr->vcpu_id);
> +rc = 0;
>  goto out;
>  }
>  
>
>
>


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC Patch v4 8/8] x86/hvm: bump the maximum number of vcpus to 512

2018-02-27 Thread George Dunlap
On 02/26/2018 01:11 PM, Chao Gao wrote:
> On Mon, Feb 26, 2018 at 01:26:42AM -0700, Jan Beulich wrote:
> On 23.02.18 at 19:11,  wrote:
>>> On Wed, Dec 06, 2017 at 03:50:14PM +0800, Chao Gao wrote:
 Signed-off-by: Chao Gao 
 ---
  xen/include/public/hvm/hvm_info_table.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/xen/include/public/hvm/hvm_info_table.h 
>>> b/xen/include/public/hvm/hvm_info_table.h
 index 08c252e..6833a4c 100644
 --- a/xen/include/public/hvm/hvm_info_table.h
 +++ b/xen/include/public/hvm/hvm_info_table.h
 @@ -32,7 +32,7 @@
  #define HVM_INFO_PADDR   ((HVM_INFO_PFN << 12) + HVM_INFO_OFFSET)
  
  /* Maximum we can support with current vLAPIC ID mapping. */
 -#define HVM_MAX_VCPUS128
 +#define HVM_MAX_VCPUS512
>>>
>>> Wow, that looks like a pretty big jump. I certainly don't have access
>>> to any box with this number of vCPUs, so that's going to be quite hard
>>> to test. What the reasoning behind this bump? Is hardware with 512
>>> ways expected soon-ish?
>>>
>>> Also osstest is not even able to test the current limit, so I would
>>> maybe bump this to 256, but as I expressed in other occasions I don't
>>> feel comfortable with have a number of vCPUs that the current test
>>> system doesn't have hardware to test with.
>>
>> I think implementation limit and supported limit need to be clearly
>> distinguished here. Therefore I'd put the question the other way
>> around: What's causing the limit to be 512, rather than 1024,
>> 4096, or even 4G-1 (x2APIC IDs are 32 bits wide, after all)?
> 
> TBH, I have no idea. When I choose a value, what comes up to my mind is
> that the value should be 288, because Intel has Xeon-phi platform which
> has 288 physical threads, and some customers wants to use this new platform
> for HPC cloud. Furthermore, they requests to support a big VM in which
> almost computing and device resources are assigned to the VM. They just
> use virtulization technology to manage the machines. In this situation,
> I choose 512 is because I feel much better if the limit is a power of 2.
> 
> You are asking that as these patches remove limitations imposed by some
> components, which one is the next bottleneck and how many vcpus does it
> limit.  Maybe it would be the use-case. No one is requesting to support
> more than 288 at this moment. So what is the value you prefer? 288 or
> 512? or you think I should find the next bottleneck in Xen's
> implementation.

I understood Jan to be responding to Roger -- Roger said he didn't want
to increase the limit beyond what osstest could reasonably test; Jan
said that what osstest can test should be factored into the *supported*
limit, not the *implementation* limit.

I agree with Jan: People should be allowed to run systems with 288
vcpus, understanding that they are at that point running essentially an
experimental system which may not work; and that if it doesn't work, it
may be that nobody will be able to reproduce and thus fix their issue.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 1/5] x86/entry: Correct comparisons against boolean variables

2018-02-27 Thread Andrew Cooper
The correct way to check a boolean is `cmpb $0` or `testb $0xff`, whereas a
lot of our entry code uses `testb $1`.  This will work in principle for values
which are really C _Bool types, but won't work for other integer types which
are intended to have boolean properties.

cmp is the more logical way of thinking about the operation, so adjust all
outstanding uses of `testb $1` against boolean values.  Changing test to cmp
changes the logical mnemonic of the following condition from 'zero' to
'equal', but the actual encoding remains the same.

No functional change, as all uses are real C _Bool types, and confirmed by
diffing the disassembly.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 

New in v2
---
 xen/arch/x86/x86_64/compat/entry.S |  8 
 xen/arch/x86/x86_64/entry.S| 28 ++--
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/x86_64/compat/entry.S 
b/xen/arch/x86/x86_64/compat/entry.S
index 458d810..3e8b6c1 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -41,11 +41,11 @@ ENTRY(compat_test_all_events)
 leaq  irq_stat+IRQSTAT_softirq_pending(%rip),%rcx
 cmpl  $0,(%rcx,%rax,1)
 jne   compat_process_softirqs
-testb $1,VCPU_mce_pending(%rbx)
-jnz   compat_process_mce
+cmpb  $0, VCPU_mce_pending(%rbx)
+jne   compat_process_mce
 .Lcompat_test_guest_nmi:
-testb $1,VCPU_nmi_pending(%rbx)
-jnz   compat_process_nmi
+cmpb  $0, VCPU_nmi_pending(%rbx)
+jne   compat_process_nmi
 compat_test_guest_events:
 movq  VCPU_vcpu_info(%rbx),%rax
 movzwl COMPAT_VCPUINFO_upcall_pending(%rax),%eax
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 941f06f..6249efe 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -190,11 +190,11 @@ test_all_events:
 leaq  irq_stat+IRQSTAT_softirq_pending(%rip),%rcx
 cmpl  $0,(%rcx,%rax,1)
 jne   process_softirqs
-testb $1,VCPU_mce_pending(%rbx)
-jnz   process_mce
+cmpb  $0, VCPU_mce_pending(%rbx)
+jne   process_mce
 .Ltest_guest_nmi:
-testb $1,VCPU_nmi_pending(%rbx)
-jnz   process_nmi
+cmpb  $0, VCPU_nmi_pending(%rbx)
+jne   process_nmi
 test_guest_events:
 movq  VCPU_vcpu_info(%rbx),%rax
 movzwl VCPUINFO_upcall_pending(%rax),%eax
@@ -305,8 +305,8 @@ UNLIKELY_END(sysenter_gpf)
 movq  VCPU_domain(%rbx),%rdi
 movq  %rax,TRAPBOUNCE_eip(%rdx)
 movb  %cl,TRAPBOUNCE_flags(%rdx)
-testb $1,DOMAIN_is_32bit_pv(%rdi)
-jnz   compat_sysenter
+cmpb  $0, DOMAIN_is_32bit_pv(%rdi)
+jne   compat_sysenter
 jmp   .Lbounce_exception
 
 ENTRY(int80_direct_trap)
@@ -342,8 +342,8 @@ UNLIKELY_END(msi_check)
 jzint80_slow_path
 
 movq  VCPU_domain(%rbx),%rax
-testb $1,DOMAIN_is_32bit_pv(%rax)
-jnz   compat_int80_direct_trap
+cmpb  $0, DOMAIN_is_32bit_pv(%rax)
+jne   compat_int80_direct_trap
 
 call  create_bounce_frame
 jmp   test_all_events
@@ -484,8 +484,8 @@ ENTRY(dom_crash_sync_extable)
 # create_bounce_frame() temporarily clobbers CS.RPL. Fix up.
 movq  STACK_CPUINFO_FIELD(current_vcpu)(%rax), %rax
 movq  VCPU_domain(%rax),%rax
-testb $1,DOMAIN_is_32bit_pv(%rax)
-setz  %al
+cmpb  $0, DOMAIN_is_32bit_pv(%rax)
+sete  %al
 leal  (%rax,%rax,2),%eax
 orb   %al,UREGS_cs(%rsp)
 xorl  %edi,%edi
@@ -529,8 +529,8 @@ ENTRY(ret_from_intr)
 testb $3,UREGS_cs(%rsp)
 jzrestore_all_xen
 movq  VCPU_domain(%rbx),%rax
-testb $1,DOMAIN_is_32bit_pv(%rax)
-jztest_all_events
+cmpb  $0, DOMAIN_is_32bit_pv(%rax)
+jetest_all_events
 jmp   compat_test_all_events
 
 ENTRY(page_fault)
@@ -629,8 +629,8 @@ handle_exception_saved:
 jzrestore_all_xen
 leaq  VCPU_trap_bounce(%rbx),%rdx
 movq  VCPU_domain(%rbx),%rax
-testb $1,DOMAIN_is_32bit_pv(%rax)
-jnz   compat_post_handle_exception
+cmpb  $0, DOMAIN_is_32bit_pv(%rax)
+jne   compat_post_handle_exception
 testb $TBF_EXCEPTION,TRAPBOUNCE_flags(%rdx)
 jztest_all_events
 .Lbounce_exception:
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 4/5] x86/pv: Drop {compat_, }create_bounce_frame() and use the C version instead

2018-02-27 Thread Andrew Cooper
The clobbering of TRAPBOUNCE_flags in .L{compat_}bounce_exception is subsumed
by the logic at the end of pv_create_bounce_frame().

This cleanup removes all callers of asm_domain_crash_synchronous(), which is
therefore dropped as well.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 

v2:
 * Remove redundant lea's
---
 xen/arch/x86/traps.c   |  23 --
 xen/arch/x86/x86_64/compat/entry.S | 120 ++
 xen/arch/x86/x86_64/entry.S| 147 ++---
 xen/include/xen/sched.h|   7 --
 4 files changed, 10 insertions(+), 287 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 27190e0..a9b53da 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -2103,29 +2103,6 @@ long set_debugreg(struct vcpu *v, unsigned int reg, 
unsigned long value)
 return 0;
 }
 
-void asm_domain_crash_synchronous(unsigned long addr)
-{
-/*
- * We need clear AC bit here because in entry.S AC is set
- * by ASM_STAC to temporarily allow accesses to user pages
- * which is prevented by SMAP by default.
- *
- * For some code paths, where this function is called, clac()
- * is not needed, but adding clac() here instead of each place
- * asm_domain_crash_synchronous() is called can reduce the code
- * redundancy, and it is harmless as well.
- */
-clac();
-
-if ( addr == 0 )
-addr = this_cpu(last_extable_addr);
-
-printk("domain_crash_sync called from entry.S: fault at %p %pS\n",
-   _p(addr), _p(addr));
-
-__domain_crash_synchronous();
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/x86_64/compat/entry.S 
b/xen/arch/x86/x86_64/compat/entry.S
index 3e8b6c1..4f681bf 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -60,7 +60,7 @@ compat_test_guest_events:
 movl  VCPU_event_sel(%rbx),%eax
 movw  %ax,TRAPBOUNCE_cs(%rdx)
 movb  $TBF_INTERRUPT,TRAPBOUNCE_flags(%rdx)
-call  compat_create_bounce_frame
+call  pv_create_exception_frame
 jmp   compat_test_all_events
 
 ALIGN
@@ -102,8 +102,7 @@ compat_process_nmi:
 movb %dl,VCPU_async_exception_mask(%rbx)
 /* FALLTHROUGH */
 compat_process_trap:
-leaq  VCPU_trap_bounce(%rbx),%rdx
-call  compat_create_bounce_frame
+call  pv_create_exception_frame
 jmp   compat_test_all_events
 
 /* %rbx: struct vcpu, interrupts disabled */
@@ -196,8 +195,7 @@ ENTRY(compat_post_handle_exception)
 testb $TBF_EXCEPTION,TRAPBOUNCE_flags(%rdx)
 jzcompat_test_all_events
 .Lcompat_bounce_exception:
-call  compat_create_bounce_frame
-movb  $0,TRAPBOUNCE_flags(%rdx)
+call  pv_create_exception_frame
 jmp   compat_test_all_events
 
 /* See lstar_enter for entry register state. */
@@ -264,118 +262,10 @@ ENTRY(compat_sysenter)
 movl  $FLAT_COMPAT_USER_SS,UREGS_ss(%rsp)
 cmovzl %ecx,%eax
 movw  %ax,TRAPBOUNCE_cs(%rdx)
-call  compat_create_bounce_frame
+call  pv_create_exception_frame
 jmp   compat_test_all_events
 
 ENTRY(compat_int80_direct_trap)
 CR4_PV32_RESTORE
-call  compat_create_bounce_frame
+call  pv_create_exception_frame
 jmp   compat_test_all_events
-
-/* compat_create_bounce_frame & helpers don't need to be in 
.text.entry */
-.text
-
-/* CREATE A BASIC EXCEPTION FRAME ON GUEST OS (RING-1) STACK:*/
-/*   {[ERRCODE,] EIP, CS, EFLAGS, [ESP, SS]} */
-/* %rdx: trap_bounce, %rbx: struct vcpu  */
-/* On return only %rbx and %rdx are guaranteed non-clobbered.*/
-compat_create_bounce_frame:
-ASSERT_INTERRUPTS_ENABLED
-mov   %fs,%edi
-ASM_STAC
-testb $2,UREGS_cs+8(%rsp)
-jz1f
-/* Push new frame at registered guest-OS stack base. */
-movl  VCPU_kernel_sp(%rbx),%esi
-.Lft1:  mov   VCPU_kernel_ss(%rbx),%fs
-subl  $2*4,%esi
-movl  UREGS_rsp+8(%rsp),%eax
-.Lft2:  movl  %eax,%fs:(%rsi)
-movl  UREGS_ss+8(%rsp),%eax
-.Lft3:  movl  %eax,%fs:4(%rsi)
-jmp   2f
-1:  /* In kernel context already: push new frame at existing %rsp. */
-movl  UREGS_rsp+8(%rsp),%esi
-.Lft4:  mov   UREGS_ss+8(%rsp),%fs
-2:
-movq  VCPU_domain(%rbx),%r8
-subl  $3*4,%esi
-movq  VCPU_vcpu_info(%rbx),%rax
-pushq COMPAT_VCPUINFO_upcall_mask(%rax)
-testb $TBF_INTERRUPT,TRAPBOUNCE_flags(%rdx)
-setnz %ch   # TBF_INTERRUPT -> set upcall mask
-orb   %ch,COMPAT_VCPUINFO_upcall_mask(%rax)
-popq  %rax
-shll  $16,%eax  # Bits 16-23: saved_upcall_mask
-movw  

[Xen-devel] [PATCH] libxc: really tolerate empty PV records

2018-02-27 Thread Jan Beulich
Commit 119ee4d773 ("tools/libxc: Tolerate specific zero-content records
in migration v2 streams") meant tolerate those, but failed to set rc
accordingly.

Signed-off-by: Jan Beulich 
---
This is what is currently breaking save/restore on the 4.8 branch; I
haven't checked yet why the sending side produces an empty record there.
I wonder whether that's related to f1a0a8c3fe ("tools/libxc: Fix
restoration of PV MSRs after migrate") not having been backported yet.
Otoh the latter is missing on 4.7 too, yet there save/restore doesn't
fail.

--- a/tools/libxc/xc_sr_restore_x86_pv.c
+++ b/tools/libxc/xc_sr_restore_x86_pv.c
@@ -770,6 +770,7 @@ static int handle_x86_pv_vcpu_blob(struc
 {
 DBGPRINTF("Skipping empty %s record for vcpu %u\n",
   rec_type_to_str(rec->type), vhdr->vcpu_id);
+rc = 0;
 goto out;
 }
 




___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 0/5] x86: Lift create_exception_frame() up out of C

2018-02-27 Thread Andrew Cooper
This has finally undergone performance testing.  No measureable difference in
any lmbench test on either 32 or 64bit PV guests.

Andrew Cooper (5):
  x86/entry: Correct comparisons against boolean variables
  x86/pv: Drop int80_bounce from struct pv_vcpu
  x86/pv: Introduce pv_create_exception_frame()
  x86/pv: Drop {compat_,}create_bounce_frame() and use the C version instead
  x86/pv: Implement the failsafe callback using the general path

 xen/arch/x86/domain.c  | 102 ++-
 xen/arch/x86/pv/callback.c |   8 --
 xen/arch/x86/pv/traps.c| 176 +---
 xen/arch/x86/traps.c   |  23 -
 xen/arch/x86/x86_64/asm-offsets.c  |   1 -
 xen/arch/x86/x86_64/compat/entry.S | 128 ++-
 xen/arch/x86/x86_64/entry.S| 203 -
 xen/include/asm-x86/domain.h   |   1 -
 xen/include/asm-x86/processor.h|   3 +-
 xen/include/xen/sched.h|   7 --
 10 files changed, 224 insertions(+), 428 deletions(-)

-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 5/5] x86/pv: Implement the failsafe callback using the general path

2018-02-27 Thread Andrew Cooper
Reintroduce TBF_FAILSAFE and update pv_create_exception_frame() to cope with
the additional data segment registers.

load_segments() now fills in trap_bounce, and lets the general return-to-guest
path inject the exception.

Bloat-o-meter reports:
  add/remove: 0/0 grow/shrink: 1/1 up/down: 123/-2522 (-2399)
  function old new   delta
  pv_create_exception_frame   10881211+123
  context_switch  35651043   -2522

which I suspect is largely due to the quantity of code hidden behind
put_user().

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
CC: Wei Liu 
CC: Roger Pau Monné 

v2:
 * Use const uregs.
---
 xen/arch/x86/domain.c   | 100 +++-
 xen/arch/x86/pv/traps.c |  29 ++--
 xen/include/asm-x86/processor.h |   1 +
 3 files changed, 32 insertions(+), 98 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 69679a6..7bce7de 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1354,100 +1354,14 @@ static void load_segments(struct vcpu *n)
 
 if ( unlikely(!all_segs_okay) )
 {
-struct pv_vcpu *pv = >arch.pv_vcpu;
-struct cpu_user_regs *regs = guest_cpu_user_regs();
-unsigned long *rsp =
-(unsigned long *)(((n->arch.flags & TF_kernel_mode)
-   ? regs->rsp : pv->kernel_sp) & ~0xf);
-unsigned long cs_and_mask, rflags;
-
-/* Fold upcall mask and architectural IOPL into RFLAGS.IF. */
-rflags  = regs->rflags & ~(X86_EFLAGS_IF|X86_EFLAGS_IOPL);
-rflags |= !vcpu_info(n, evtchn_upcall_mask) << 9;
-if ( VM_ASSIST(n->domain, architectural_iopl) )
-rflags |= n->arch.pv_vcpu.iopl;
-
-if ( is_pv_32bit_vcpu(n) )
-{
-unsigned int *esp = ring_1(regs) ?
-(unsigned int *)regs->rsp :
-(unsigned int *)pv->kernel_sp;
-int ret = 0;
-
-/* CS longword also contains full evtchn_upcall_mask. */
-cs_and_mask = (unsigned short)regs->cs |
-((unsigned int)vcpu_info(n, evtchn_upcall_mask) << 16);
-
-if ( !ring_1(regs) )
-{
-ret  = put_user(regs->ss,   esp-1);
-ret |= put_user(regs->esp,  esp-2);
-esp -= 2;
-}
-
-if ( ret |
- put_user(rflags,  esp-1) |
- put_user(cs_and_mask, esp-2) |
- put_user(regs->eip,   esp-3) |
- put_user(uregs->gs,   esp-4) |
- put_user(uregs->fs,   esp-5) |
- put_user(uregs->es,   esp-6) |
- put_user(uregs->ds,   esp-7) )
-{
-gprintk(XENLOG_ERR,
-"error while creating compat failsafe callback 
frame\n");
-domain_crash(n->domain);
-}
+bool disable = n->arch.vgc_flags & VGCF_failsafe_disables_events;
 
-if ( n->arch.vgc_flags & VGCF_failsafe_disables_events )
-vcpu_info(n, evtchn_upcall_mask) = 1;
-
-regs->entry_vector |= TRAP_syscall;
-regs->eflags   &= ~(X86_EFLAGS_VM|X86_EFLAGS_RF|X86_EFLAGS_NT|
-X86_EFLAGS_IOPL|X86_EFLAGS_TF);
-regs->ss= FLAT_COMPAT_KERNEL_SS;
-regs->esp   = (unsigned long)(esp-7);
-regs->cs= FLAT_COMPAT_KERNEL_CS;
-regs->eip   = pv->failsafe_callback_eip;
-return;
-}
-
-if ( !(n->arch.flags & TF_kernel_mode) )
-toggle_guest_mode(n);
-else
-regs->cs &= ~3;
-
-/* CS longword also contains full evtchn_upcall_mask. */
-cs_and_mask = (unsigned long)regs->cs |
-((unsigned long)vcpu_info(n, evtchn_upcall_mask) << 32);
-
-if ( put_user(regs->ss,rsp- 1) |
- put_user(regs->rsp,   rsp- 2) |
- put_user(rflags,  rsp- 3) |
- put_user(cs_and_mask, rsp- 4) |
- put_user(regs->rip,   rsp- 5) |
- put_user(uregs->gs,   rsp- 6) |
- put_user(uregs->fs,   rsp- 7) |
- put_user(uregs->es,   rsp- 8) |
- put_user(uregs->ds,   rsp- 9) |
- put_user(regs->r11,   rsp-10) |
- put_user(regs->rcx,   rsp-11) )
-{
-gprintk(XENLOG_ERR,
-"error while creating failsafe callback frame\n");
-domain_crash(n->domain);
-}
-
-if ( n->arch.vgc_flags & VGCF_failsafe_disables_events )
-   

[Xen-devel] [PATCH v2 3/5] x86/pv: Introduce pv_create_exception_frame()

2018-02-27 Thread Andrew Cooper
This is a C implementation of {compat_,}create_bounce_frame(), based loosely
on the existing failsafe implementation in load_segments().  It picks up all
injection information from the trap_bounce structure.

One minor improvement is that at no point is regs->cs left with an rpl of 0 on
the root stack frame.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 

v2:
 * Use domain_crash() rather than domain_crash_sync().  All callers
   immediately continue to {compat_}test_all_events
 * Count the number of frame[] entries correctly
 * Consistently use 64bit operations when adjusting the root frame
 * Introduce a compat_addr_ok() check for the 32bit side.  The ASM version
   didn't have protection attempting to write into the compat p2m, other than
   hitting a #PF while trying.
---
 xen/arch/x86/pv/traps.c | 143 
 1 file changed, 143 insertions(+)

diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index 98549bc..b7d7d2b 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -245,6 +245,149 @@ int pv_raise_interrupt(struct vcpu *v, uint8_t vector)
 }
 
 /*
+ * This function emulates the behaviour of hardware when Xen needs to inject
+ * an event into into a guest.
+ *
+ * It may switch from user mode to kernel mode, will write an appropriate
+ * hardware exception frame (including Xen-specific extras), and alter the
+ * root stack frame to invoke the guest kernels correct entry point on exit
+ * from the hypervisor.
+ */
+void pv_create_exception_frame(void)
+{
+struct vcpu *curr = current;
+struct trap_bounce *tb = >arch.pv_vcpu.trap_bounce;
+struct cpu_user_regs *regs = guest_cpu_user_regs();
+const bool user_mode_frame = !guest_kernel_mode(curr, regs);
+uint8_t *evt_mask = _info(curr, evtchn_upcall_mask);
+unsigned int flags, bytes, missing;
+
+ASSERT_NOT_IN_ATOMIC();
+
+if ( unlikely(null_trap_bounce(curr, tb)) )
+{
+gprintk(XENLOG_ERR, "Fatal: Attempting to inject null trap bounce\n");
+domain_crash(curr->domain);
+return;
+}
+
+/* Fold the upcall mask and architectural IOPL into the guests rflags. */
+flags  = regs->rflags & ~(X86_EFLAGS_IF | X86_EFLAGS_IOPL);
+flags |= ((*evt_mask ? 0 : X86_EFLAGS_IF) |
+  (VM_ASSIST(curr->domain, architectural_iopl)
+   ? curr->arch.pv_vcpu.iopl : 0));
+
+if ( is_pv_32bit_vcpu(curr) )
+{
+/* { [ERRCODE,] EIP, CS/MASK , EFLAGS, [ESP, SS] } */
+unsigned int frame[6], *ptr = frame, ksp =
+(user_mode_frame ? curr->arch.pv_vcpu.kernel_sp : regs->esp);
+
+if ( tb->flags & TBF_EXCEPTION_ERRCODE )
+*ptr++ = tb->error_code;
+
+*ptr++ = regs->eip;
+*ptr++ = regs->cs | ((unsigned int)*evt_mask << 16);
+*ptr++ = flags;
+
+if ( user_mode_frame )
+{
+*ptr++ = regs->esp;
+*ptr++ = regs->ss;
+}
+
+/* Copy the constructed frame to the guest kernel stack. */
+bytes = _p(ptr) - _p(frame);
+ksp -= bytes;
+
+if ( unlikely(!__compat_access_ok(curr->domain, ksp, bytes)) )
+{
+gprintk(XENLOG_ERR, "Fatal: Bad guest kernel stack %p\n", _p(ksp));
+domain_crash(curr->domain);
+return;
+}
+
+if ( unlikely((missing = __copy_to_user(_p(ksp), frame, bytes)) != 0) )
+{
+gprintk(XENLOG_ERR, "Fatal: Fault while writing exception 
frame\n");
+show_page_walk(ksp + missing);
+domain_crash(curr->domain);
+return;
+}
+
+/* Rewrite our stack frame. */
+regs->rip   = (uint32_t)tb->eip;
+regs->cs= tb->cs;
+regs->rflags   &= ~(X86_EFLAGS_VM | X86_EFLAGS_RF |
+X86_EFLAGS_NT | X86_EFLAGS_TF);
+regs->rsp   = ksp;
+if ( user_mode_frame )
+regs->ss = curr->arch.pv_vcpu.kernel_ss;
+}
+else
+{
+/* { RCX, R11, [ERRCODE,] RIP, CS/MASK, RFLAGS, RSP, SS } */
+unsigned long frame[8], *ptr = frame, ksp =
+(user_mode_frame ? curr->arch.pv_vcpu.kernel_sp : regs->rsp) & 
~0xf;
+
+if ( user_mode_frame )
+toggle_guest_mode(curr);
+
+*ptr++ = regs->rcx;
+*ptr++ = regs->r11;
+
+if ( tb->flags & TBF_EXCEPTION_ERRCODE )
+*ptr++ = tb->error_code;
+
+*ptr++ = regs->rip;
+*ptr++ = ((user_mode_frame ? regs->cs : regs->cs & ~3) |
+  ((unsigned long)*evt_mask << 32));
+*ptr++ = flags;
+*ptr++ = regs->rsp;
+*ptr++ = regs->ss;
+
+/* Copy the constructed frame to the guest kernel stack. */
+bytes = _p(ptr) - _p(frame);
+ksp -= bytes;
+
+if ( unlikely(!__addr_ok(ksp)) 

[Xen-devel] [PATCH v2 2/5] x86/pv: Drop int80_bounce from struct pv_vcpu

2018-02-27 Thread Andrew Cooper
The int80_bounce field of struct pv_vcpu is a bit of an odd special case,
because it is a simple derivation of trap_ctxt[0x80], which is also stored.

It is also the only use of {compat_,}create_bounce_frame() which isn't
referencing the plain trap_bounce field of struct pv_vcpu.  (And altering this
property the purpose of this patch.)

Remove the int80_bounce field entirely, along with init_int80_direct_trap(),
which in turn requires that the int80_direct_trap() path gain logic previously
contained in init_int80_direct_trap().

This does admittedly make the int80 fastpath slightly longer, but these few
instructions are in the noise compared to the architectural context switch
overhead, and it now matches the syscall/sysenter paths (which have far less
architectural overhead already).

No behavioural change from the guests point of view.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v2:
 * Fix comparason against DOMAIN_is_32bit_pv
---
 xen/arch/x86/domain.c |  2 --
 xen/arch/x86/pv/callback.c|  8 
 xen/arch/x86/pv/traps.c   | 14 --
 xen/arch/x86/x86_64/asm-offsets.c |  1 -
 xen/arch/x86/x86_64/entry.S   | 32 
 xen/include/asm-x86/domain.h  |  1 -
 xen/include/asm-x86/processor.h   |  2 --
 7 files changed, 28 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 1f8b08e..69679a6 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -892,8 +892,6 @@ int arch_set_info_guest(
 goto out;
 }
 
-init_int80_direct_trap(v);
-
 /* IOPL privileges are virtualised. */
 v->arch.pv_vcpu.iopl = v->arch.user_regs.eflags & X86_EFLAGS_IOPL;
 v->arch.user_regs.eflags &= ~X86_EFLAGS_IOPL;
diff --git a/xen/arch/x86/pv/callback.c b/xen/arch/x86/pv/callback.c
index 97d8438..29ae692 100644
--- a/xen/arch/x86/pv/callback.c
+++ b/xen/arch/x86/pv/callback.c
@@ -371,7 +371,6 @@ long 
do_set_trap_table(XEN_GUEST_HANDLE_PARAM(const_trap_info_t) traps)
 if ( guest_handle_is_null(traps) )
 {
 memset(dst, 0, NR_VECTORS * sizeof(*dst));
-init_int80_direct_trap(curr);
 return 0;
 }
 
@@ -393,9 +392,6 @@ long 
do_set_trap_table(XEN_GUEST_HANDLE_PARAM(const_trap_info_t) traps)
 
 memcpy([cur.vector], , sizeof(cur));
 
-if ( cur.vector == 0x80 )
-init_int80_direct_trap(curr);
-
 guest_handle_add_offset(traps, 1);
 
 if ( hypercall_preempt_check() )
@@ -420,7 +416,6 @@ int 
compat_set_trap_table(XEN_GUEST_HANDLE(trap_info_compat_t) traps)
 if ( guest_handle_is_null(traps) )
 {
 memset(dst, 0, NR_VECTORS * sizeof(*dst));
-init_int80_direct_trap(curr);
 return 0;
 }
 
@@ -439,9 +434,6 @@ int 
compat_set_trap_table(XEN_GUEST_HANDLE(trap_info_compat_t) traps)
 
 XLAT_trap_info(dst + cur.vector, );
 
-if ( cur.vector == 0x80 )
-init_int80_direct_trap(curr);
-
 guest_handle_add_offset(traps, 1);
 
 if ( hypercall_preempt_check() )
diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index d122881..98549bc 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -142,20 +142,6 @@ bool set_guest_nmi_trapbounce(void)
 return !null_trap_bounce(curr, tb);
 }
 
-void init_int80_direct_trap(struct vcpu *v)
-{
-struct trap_info *ti = >arch.pv_vcpu.trap_ctxt[0x80];
-struct trap_bounce *tb = >arch.pv_vcpu.int80_bounce;
-
-tb->cs  = ti->cs;
-tb->eip = ti->address;
-
-if ( null_trap_bounce(v, tb) )
-tb->flags = 0;
-else
-tb->flags = TBF_EXCEPTION | (TI_GET_IF(ti) ? TBF_INTERRUPT : 0);
-}
-
 struct softirq_trap {
 struct domain *domain;   /* domain to inject trap */
 struct vcpu *vcpu;   /* vcpu to inject trap */
diff --git a/xen/arch/x86/x86_64/asm-offsets.c 
b/xen/arch/x86/x86_64/asm-offsets.c
index e6d4147..1a45428 100644
--- a/xen/arch/x86/x86_64/asm-offsets.c
+++ b/xen/arch/x86/x86_64/asm-offsets.c
@@ -61,7 +61,6 @@ void __dummy__(void)
 OFFSET(VCPU_domain, struct vcpu, domain);
 OFFSET(VCPU_vcpu_info, struct vcpu, vcpu_info);
 OFFSET(VCPU_trap_bounce, struct vcpu, arch.pv_vcpu.trap_bounce);
-OFFSET(VCPU_int80_bounce, struct vcpu, arch.pv_vcpu.int80_bounce);
 OFFSET(VCPU_thread_flags, struct vcpu, arch.flags);
 OFFSET(VCPU_event_addr, struct vcpu, arch.pv_vcpu.event_callback_eip);
 OFFSET(VCPU_event_sel, struct vcpu, arch.pv_vcpu.event_callback_cs);
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 6249efe..bf41563 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -336,12 +336,36 @@ UNLIKELY_END(msi_check)
 
 movq  STACK_CPUINFO_FIELD(current_vcpu)(%rbx), %rbx
 
-/* Check that the callback is non-null. */
-leaq  VCPU_int80_bounce(%rbx),%rdx
-cmpb  $0,TRAPBOUNCE_flags(%rdx)
+

Re: [Xen-devel] Xen Security Advisory 255 - grant table v2 -> v1 transition may crash Xen

2018-02-27 Thread Steven Haigh
On Wednesday, 28 February 2018 1:36:14 AM AEDT George Dunlap wrote:
> On 02/27/2018 02:22 PM, Jan Beulich wrote:
>  On 27.02.18 at 13:37,  wrote:
> >> On Tuesday, 27 February 2018 11:00:08 PM AEDT Xen. org security team 
wrote:
> >>> RESOLUTION
> >>> ==
> >>> 
> >>> Applying the appropriate attached patch resolves this issue.
> >>> 
> >>> xsa255-?.patch xen-unstable, Xen 4.10.x
> >>> xsa255-4.9-?.patch Xen 4.9.x, Xen 4.8.x
> >>> xsa255-4.7-?.patch Xen 4.7.x
> >>> xsa255-4.6-?.patch Xen 4.6.x
> >> 
> >> Is there a missing pre-requisite patch required for 4.6.6?
> >> 
> >> I'm currently getting a failure on these patches as follows:
> >> 
> >> Patch #55 (xsa255-4.6-1.patch):
> >> + echo 'Patch #55 (xsa255-4.6-1.patch):'
> >> + /bin/cat /builddir/build/SOURCES/xsa255-4.6-1.patch
> >> + /usr/bin/patch -p1 --fuzz=2
> >> patching file xen/arch/arm/domain.c
> >> patching file xen/arch/arm/mm.c
> >> Hunk #2 FAILED at 1075.
> >> Hunk #3 FAILED at 1090.
> >> 2 out of 3 hunks FAILED -- saving rejects to file xen/arch/arm/mm.c.rej
> > 
> > I've just applied the patches to all stable branches, and they all
> > applied fine, including the 4.6 ones. Are you perhaps missing the
> > XSA-235 fix there? In any event, as said a number of times in
> > the past, the patches we provide are against the staging branches
> > for the respective stable versions; we don't guarantee patches
> > apply to vanilla stable releases.
> 
> And as other people have said several times, most downstreams don't
> build from stable-XX, but take a tarball and add patches to it.  I
> expect Steven was asking if someone could point him to specific commits
> from stable-XX that might be required.

Hi George,

Yes, you are correct.

As XSA-235 was an ARM only issue (and I don't build anything for ARM), these 
usually get skipped in my packaging.

As XSA-255 is *both* ARM & x86, it needed that extra bit of TLC... This 
probably makes it a little unique in how XSAs are normally presented.

I did look at the two patches in XSA-255, but it looked like there is a 
combination of both ARM & x86 changes in specifically the -2 patch which lead 
me to the conclusion that I couldn't just remove one patch to take out the 
common and x86 parts.

I figured something was missing, but wasn't able to track it back to the patch 
from August last year.

Thanks to Jan for the pointers to the missing requirement - I've got packages 
built for 4.6 now to push shortly.

-- 
Steven Haigh

 net...@crc.id.au    http://www.crc.id.au
 +61 (3) 9001 6090 0412 935 897


signature.asc
Description: This is a digitally signed message part.
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-4.8-testing test] 120030: regressions - FAIL

2018-02-27 Thread osstest service owner
flight 120030 xen-4.8-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120030/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-i386broken in 119953
 test-amd64-i386-migrupgrade  broken  in 119998
 test-amd64-i386-xl-qemut-debianhvm-amd64  broken in 119998
 test-amd64-amd64-xl-multivcpu 15 guest-saverestore   fail REGR. vs. 119771
 test-amd64-amd64-xl-xsm  15 guest-saverestorefail REGR. vs. 119771
 test-amd64-i386-libvirt  15 guest-saverestorefail REGR. vs. 119771
 test-amd64-amd64-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 119771
 test-amd64-amd64-libvirt-xsm 15 guest-saverestorefail REGR. vs. 119771
 test-amd64-i386-xl-xsm   15 guest-saverestorefail REGR. vs. 119771
 test-amd64-amd64-xl-credit2  15 guest-saverestorefail REGR. vs. 119771
 test-amd64-i386-pair  22 guest-migrate/src_host/dst_host fail REGR. vs. 119771
 test-amd64-amd64-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. 
vs. 119771
 test-amd64-amd64-xl  15 guest-saverestorefail REGR. vs. 119771
 test-amd64-i386-libvirt-xsm  15 guest-saverestorefail REGR. vs. 119771
 test-amd64-i386-xl   15 guest-saverestorefail REGR. vs. 119771
 test-amd64-amd64-libvirt 15 guest-saverestorefail REGR. vs. 119771
 test-amd64-i386-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 
119771
 test-amd64-amd64-xl-qcow214 guest-saverestorefail REGR. vs. 119771
 test-amd64-amd64-amd64-pvgrub 14 guest-saverestore   fail REGR. vs. 119771
 test-amd64-amd64-i386-pvgrub 14 guest-saverestorefail REGR. vs. 119771
 test-amd64-amd64-libvirt-vhd 14 guest-saverestorefail REGR. vs. 119771
 test-amd64-i386-xl-raw   14 guest-saverestorefail REGR. vs. 119771

Tests which are failing intermittently (not blocking):
 test-amd64-i386-freebsd10-i386 4 host-install(4) broken in 119953 pass in 
120030
 test-amd64-i386-xl-qemut-debianhvm-amd64 4 host-install(4) broken in 119998 
pass in 120030
 test-amd64-i386-migrupgrade 4 host-install/src_host(4) broken in 119998 pass 
in 120030
 test-xtf-amd64-amd64-4 50 xtf/test-hvm64-lbr-tsx-vmentry fail in 119953 pass 
in 120030
 test-xtf-amd64-amd64-2 50 xtf/test-hvm64-lbr-tsx-vmentry fail in 119953 pass 
in 120030
 test-xtf-amd64-amd64-1 50 xtf/test-hvm64-lbr-tsx-vmentry fail in 119953 pass 
in 120030
 test-xtf-amd64-amd64-3 50 xtf/test-hvm64-lbr-tsx-vmentry fail in 119953 pass 
in 120030
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 16 
guest-localmigrate/x10 fail in 119998 pass in 120030
 test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat fail in 119998 pass 
in 120030
 test-armhf-armhf-xl-vhd  16 guest-start.2fail in 119998 pass in 120030
 test-xtf-amd64-amd64-5   50 xtf/test-hvm64-lbr-tsx-vmentry fail pass in 119953
 test-amd64-amd64-pygrub  14 guest-saverestore  fail pass in 119953
 test-armhf-armhf-xl-rtds 12 guest-startfail pass in 119953
 test-armhf-armhf-xl-xsm   5 host-ping-check-native fail pass in 119998

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 15 guest-saverestorefail REGR. vs. 119771

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-rtds13 migrate-support-check fail in 119953 never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail in 119953 never pass
 test-armhf-armhf-xl-xsm 13 migrate-support-check fail in 119953 never pass
 test-armhf-armhf-xl-xsm 14 saverestore-support-check fail in 119953 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 119771
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 119771
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 119771
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 119771
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 119771
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 119771
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 119771
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 119771
 test-xtf-amd64-amd64-4   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-1   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-2   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-5   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-3   52 xtf/test-hvm64-memop-seg fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 build-amd64-prev  7 xen-build/dist-test  

Re: [Xen-devel] [PATCH 2/6] x86/hvm: Handle viridian MSRs via the new guest_{rd, wr}msr() infrastructure

2018-02-27 Thread Paul Durrant
> -Original Message-
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: 26 February 2018 17:35
> To: Xen-devel 
> Cc: Andrew Cooper ; Jan Beulich
> ; Jun Nakajima ; Paul
> Durrant ; Kevin Tian ; Boris
> Ostrovsky ; Suravee Suthikulpanit
> ; Wei Liu ; Roger
> Pau Monne ; Sergey Dyasli
> 
> Subject: [PATCH 2/6] x86/hvm: Handle viridian MSRs via the new
> guest_{rd,wr}msr() infrastructure
> 
> Dispatch from the guest_{rd,wr}msr() functions, after confirming that the
> domain is configured to use viridian.  This allows for simplifiction of the
> viridian helpers as they don't need to cope with the "not a viridian MSR"
> case.  It also means that viridian MSRs which are unimplemented, or
> excluded
> because of features, don't fall back into default handling path.
> 
> Rename {rd,wr}msr_viridian_regs() to guest_{rd,wr}msr_viridian() for
> consistency, and because the _regs suffix isn't very appropriate.
> 
> Update them to take a vcpu pointer rather than presuming that they act on
> current, which is safe for all implemented operations.  Also update them to
> use X86EMUL_* return values.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Jun Nakajima 
> CC: Paul Durrant 
> CC: Kevin Tian 
> CC: Boris Ostrovsky 
> CC: Suravee Suthikulpanit 
> CC: Wei Liu 
> CC: Roger Pau Monné 
> CC: Sergey Dyasli 
> ---
>  xen/arch/x86/hvm/svm/svm.c |  6 +
>  xen/arch/x86/hvm/viridian.c| 49 ++---
> -
>  xen/arch/x86/hvm/vmx/vmx.c |  6 +
>  xen/arch/x86/msr.c | 41 +++
>  xen/include/asm-x86/hvm/viridian.h | 11 ++---
>  5 files changed, 64 insertions(+), 49 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 8b4cefd..6d8ed5c 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -1967,8 +1967,7 @@ static int svm_msr_read_intercept(unsigned int
> msr, uint64_t *msr_content)
>  else if ( ret )
>  break;
> 
> -if ( rdmsr_viridian_regs(msr, msr_content) ||
> - rdmsr_hypervisor_regs(msr, msr_content) )
> +if ( rdmsr_hypervisor_regs(msr, msr_content) )
>  break;
> 
>  if ( rdmsr_safe(msr, *msr_content) == 0 )
> @@ -2123,9 +2122,6 @@ static int svm_msr_write_intercept(unsigned int
> msr, uint64_t msr_content)
>  else if ( ret )
>  break;
> 
> -if ( wrmsr_viridian_regs(msr, msr_content) )
> -break;
> -
>  switch ( wrmsr_hypervisor_regs(msr, msr_content) )
>  {
>  case -ERESTART:
> diff --git a/xen/arch/x86/hvm/viridian.c b/xen/arch/x86/hvm/viridian.c
> index 70aab52..23de433 100644
> --- a/xen/arch/x86/hvm/viridian.c
> +++ b/xen/arch/x86/hvm/viridian.c
> @@ -554,13 +554,11 @@ static void update_reference_tsc(struct domain *d,
> bool_t initialize)
>  put_page_and_type(page);
>  }
> 
> -int wrmsr_viridian_regs(uint32_t idx, uint64_t val)
> +int guest_wrmsr_viridian(struct vcpu *v, uint32_t idx, uint64_t val)
>  {
> -struct vcpu *v = current;
>  struct domain *d = v->domain;
> 
> -if ( !is_viridian_domain(d) )
> -return 0;
> +ASSERT(is_viridian_domain(d));
> 
>  switch ( idx )
>  {
> @@ -615,7 +613,7 @@ int wrmsr_viridian_regs(uint32_t idx, uint64_t val)
> 
>  case HV_X64_MSR_REFERENCE_TSC:
>  if ( !(viridian_feature_mask(d) & HVMPV_reference_tsc) )
> -return 0;
> +goto gp_fault;
> 
>  perfc_incr(mshv_wrmsr_tsc_msr);
>  d->arch.hvm_domain.viridian.reference_tsc.raw = val;
> @@ -655,14 +653,15 @@ int wrmsr_viridian_regs(uint32_t idx, uint64_t val)
>  }
> 
>  default:
> -if ( idx >= VIRIDIAN_MSR_MIN && idx <= VIRIDIAN_MSR_MAX )
> -gprintk(XENLOG_WARNING, "write to unimplemented MSR %#x\n",
> -idx);
> -
> -return 0;
> +gdprintk(XENLOG_WARNING,
> + "Write %016"PRIx64" to unimplemented MSR %#x\n", val, idx);
> +goto gp_fault;
>  }
> 
> -return 1;
> +return X86EMUL_OKAY;
> +
> + gp_fault:
> +return X86EMUL_EXCEPTION;
>  }
> 
>  static int64_t raw_trc_val(struct domain *d)
> @@ -698,13 +697,11 @@ void viridian_time_ref_count_thaw(struct domain
> *d)
>  trc->off = (int64_t)trc->val - raw_trc_val(d);
>  }
> 
> -int rdmsr_viridian_regs(uint32_t idx, uint64_t *val)
> +int guest_rdmsr_viridian(const struct 

Re: [Xen-devel] Xen Security Advisory 255 - grant table v2 -> v1 transition may crash Xen

2018-02-27 Thread George Dunlap
On 02/27/2018 02:22 PM, Jan Beulich wrote:
 On 27.02.18 at 13:37,  wrote:
>> On Tuesday, 27 February 2018 11:00:08 PM AEDT Xen. org security team wrote:
>>> RESOLUTION
>>> ==
>>>
>>> Applying the appropriate attached patch resolves this issue.
>>>
>>> xsa255-?.patch xen-unstable, Xen 4.10.x
>>> xsa255-4.9-?.patch Xen 4.9.x, Xen 4.8.x
>>> xsa255-4.7-?.patch Xen 4.7.x
>>> xsa255-4.6-?.patch Xen 4.6.x
>>
>> Is there a missing pre-requisite patch required for 4.6.6?
>>
>> I'm currently getting a failure on these patches as follows:
>>
>> Patch #55 (xsa255-4.6-1.patch):
>> + echo 'Patch #55 (xsa255-4.6-1.patch):'
>> + /bin/cat /builddir/build/SOURCES/xsa255-4.6-1.patch
>> + /usr/bin/patch -p1 --fuzz=2
>> patching file xen/arch/arm/domain.c
>> patching file xen/arch/arm/mm.c
>> Hunk #2 FAILED at 1075.
>> Hunk #3 FAILED at 1090.
>> 2 out of 3 hunks FAILED -- saving rejects to file xen/arch/arm/mm.c.rej
> 
> I've just applied the patches to all stable branches, and they all
> applied fine, including the 4.6 ones. Are you perhaps missing the
> XSA-235 fix there? In any event, as said a number of times in
> the past, the patches we provide are against the staging branches
> for the respective stable versions; we don't guarantee patches
> apply to vanilla stable releases.

And as other people have said several times, most downstreams don't
build from stable-XX, but take a tarball and add patches to it.  I
expect Steven was asking if someone could point him to specific commits
from stable-XX that might be required.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 4/6] x86/hvm: Constify the read side of vlapic handling

2018-02-27 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 05:35:17PM +, Andrew Cooper wrote:
> This is in preparation to make hvm_x2apic_msr_read() take a const vcpu
> pointer.  One modification is to alter vlapic_get_tmcct() to not use current.
> 
> This in turn needs an alteration to hvm_get_guest_time_fixed(), which is safe
> because the only mutable action it makes is to take the domain plt lock.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Roger Pau Monné 

> index 181f4cb..862c715 100644
> --- a/xen/arch/x86/hvm/vpt.c
> +++ b/xen/arch/x86/hvm/vpt.c
> @@ -35,7 +35,7 @@ void hvm_init_guest_time(struct domain *d)
>  pl->last_guest_time = 0;
>  }
>  
> -u64 hvm_get_guest_time_fixed(struct vcpu *v, u64 at_tsc)
> +u64 hvm_get_guest_time_fixed(const struct vcpu *v, u64 at_tsc)

While there you could s/u64/uint64_t/.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 30/49] ARM: new VGIC: Add ENABLE registers handlers

2018-02-27 Thread Julien Grall



On 27/02/18 13:54, Andre Przywara wrote:

Hi,

On 19/02/18 14:13, Julien Grall wrote:



On 19/02/18 12:41, Andre Przywara wrote:

Hi,


Hi,


On 16/02/18 16:57, Julien Grall wrote:

On 09/02/18 14:39, Andre Przywara wrote:

+    spin_lock_irqsave(>lock, flags);
+    if ( enable )
+    {
+    gic_set_irq_type(desc, irq_type == VGIC_CONFIG_LEVEL ?
+ IRQ_TYPE_LEVEL_HIGH : IRQ_TYPE_EDGE_RISING);


Indentation and I would prefer a helper to convert between the vgic
value and the IRQ_TYPE. This would make the code easier to read.

Also, this code does not replicate correctly the current vGIC.
gic_set_irq_type is only allowed to be used when
irq_set_type_by_domain(d) returns true. If you consider this change
valid, then I would like to know why.


So what is/was the rationale for not allowing IRQ type changes for
non-privileged guests? If you allow to pass through an hardware IRQ to a
guest (which is the case this function handles), then I don't see why a
guest would not be allowed to change the configuration? It seems rather
odd, I guess it's up to the guest to know which type of IRQ this is?


If you can answer the question on top of irq_type_set_by_domain (i.e
"See whether it is possible to let any domain configure the type) then
we can remove it. We decided to only allow for the hardware domain
because we trust it.


But why would you mistrust a DomU in this respect?
The only point I see is that a guest has *some* influence on a hardware
access, but I fail to see how a single MMIO read-modify-write sequence
would actually impact the host. Especially since we do it only on
enabling an IRQ.
Looking more closely at the existing VGIC code we might want to check if
the hardware IRQ was already enabled before entering the
"if ( p->desc != NULL )" branch, btw.


That's not an issue here. You can only enter in vgic_enable_irqs if the 
virtual interrupt was previously disabled. Because the physical 
interrupt is routed to the guest, it will also be disabled at that time.




So is this the concern? Commit b0003bdd690 wasn't really enlightening in
this respect.


It was not really clear if it would be an issue when I wrote the patch. 
We trust the hardware domain so it is fine to let him configure the 
interrupt. For the guests, this will be taken from the DT (see 
gic_route_irq_to_guest). So this is likely to get configured correctly 
for the guest.


What I was worry about is whether we need to sanitize the ICFGR when the 
interrupt is routed to another domain. But if you can clear that, then I 
guess it should be ok.


However, I would prefer to do this in a separate patch and keep 
irq_type_set_by_domain around. This is to match the current vGIC and not 
changing too much Xen behavior.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/6] x86: Handle the Xen MSRs via the new guest_{rd, wr}msr() infrastructure

2018-02-27 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 05:35:16PM +, Andrew Cooper wrote:
> Dispatch from the guest_{rd,wr}msr() functions, after falling through from the
> !is_viridian_domain() case.
> 
> Rename {rd,wr}msr_hypervisor_regs() to guest_{rd,wr}msr_xen() for consistency,
> and because the _regs suffix isn't very appropriate.
> 
> Update them to take a vcpu pointer rather than presuming that they act on
> current, and switch to using X86EMUL_* return values.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Roger Pau Monné 

> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> index 2ff9361..9f20fd8 100644
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -183,6 +183,10 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
> uint64_t *val)
>  }
>  
>  /* Fallthrough. */
> +case 0x4200 ... 0x42ff:
> +ret = guest_rdmsr_xen(v, msr, val);
> +goto out;
> +
>  default:
>  return X86EMUL_UNHANDLEABLE;
>  }
> @@ -274,6 +278,10 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t 
> val)
>  }
>  
>  /* Fallthrough. */
> +case 0x4200 ... 0x42ff:

A define would be better in order to prevent this two values from
getting out of sync with the ones in rdmsr maybe.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] common/gnttab: Introduce command line feature controls

2018-02-27 Thread Jan Beulich
>>> On 27.02.18 at 12:38,  wrote:
> This patch was originally released as part of XSA-226.  It retains the same
> command line syntax (as various downstreams are mitigating XSA-226 using this
> mechanism) but the defaults have been updated due to the revised XSA-226
> patched, after which transitive grants are believed to functioning
> properly.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Xen Security Advisory 255 - grant table v2 -> v1 transition may crash Xen

2018-02-27 Thread Jan Beulich
>>> On 27.02.18 at 13:37,  wrote:
> On Tuesday, 27 February 2018 11:00:08 PM AEDT Xen. org security team wrote:
>> RESOLUTION
>> ==
>> 
>> Applying the appropriate attached patch resolves this issue.
>> 
>> xsa255-?.patch xen-unstable, Xen 4.10.x
>> xsa255-4.9-?.patch Xen 4.9.x, Xen 4.8.x
>> xsa255-4.7-?.patch Xen 4.7.x
>> xsa255-4.6-?.patch Xen 4.6.x
> 
> Is there a missing pre-requisite patch required for 4.6.6?
> 
> I'm currently getting a failure on these patches as follows:
> 
> Patch #55 (xsa255-4.6-1.patch):
> + echo 'Patch #55 (xsa255-4.6-1.patch):'
> + /bin/cat /builddir/build/SOURCES/xsa255-4.6-1.patch
> + /usr/bin/patch -p1 --fuzz=2
> patching file xen/arch/arm/domain.c
> patching file xen/arch/arm/mm.c
> Hunk #2 FAILED at 1075.
> Hunk #3 FAILED at 1090.
> 2 out of 3 hunks FAILED -- saving rejects to file xen/arch/arm/mm.c.rej

I've just applied the patches to all stable branches, and they all
applied fine, including the 4.6 ones. Are you perhaps missing the
XSA-235 fix there? In any event, as said a number of times in
the past, the patches we provide are against the staging branches
for the respective stable versions; we don't guarantee patches
apply to vanilla stable releases.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC Patch v4 7/8] x86/hvm: bump the number of pages of shadow memory

2018-02-27 Thread George Dunlap
On 12/06/2017 07:50 AM, Chao Gao wrote:
> Each vcpu of hvm guest consumes at least one shadow page. Currently, only 256
> (for hap case) pages are pre-allocated as shadow memory at beginning. It would
> run out if guest has more than 256 vcpus and guest creation fails. Bump the
> number of shadow pages to 2 * HVM_MAX_VCPUS for hap case and 8 * HVM_MAX_VCPUS
> for shadow case.
> 
> This patch won't lead to more memory consumption for the size of shadow memory
> will be adjusted via XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION according to the size
> of guest memory and the number of vcpus.
> 
> Signed-off-by: Chao Gao 

Acked-by: George Dunlap 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/6] x86/hvm: Handle viridian MSRs via the new guest_{rd, wr}msr() infrastructure

2018-02-27 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 05:35:15PM +, Andrew Cooper wrote:
> Dispatch from the guest_{rd,wr}msr() functions, after confirming that the
> domain is configured to use viridian.  This allows for simplifiction of the
> viridian helpers as they don't need to cope with the "not a viridian MSR"
> case.  It also means that viridian MSRs which are unimplemented, or excluded
> because of features, don't fall back into default handling path.
> 
> Rename {rd,wr}msr_viridian_regs() to guest_{rd,wr}msr_viridian() for
> consistency, and because the _regs suffix isn't very appropriate.
> 
> Update them to take a vcpu pointer rather than presuming that they act on
> current, which is safe for all implemented operations.  Also update them to
> use X86EMUL_* return values.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Roger Pau Monné 

Just two nits.

> ---
> CC: Jan Beulich 
> CC: Jun Nakajima 
> CC: Paul Durrant 
> CC: Kevin Tian 
> CC: Boris Ostrovsky 
> CC: Suravee Suthikulpanit 
> CC: Wei Liu 
> CC: Roger Pau Monné 
> CC: Sergey Dyasli 
> ---
>  xen/arch/x86/hvm/svm/svm.c |  6 +
>  xen/arch/x86/hvm/viridian.c| 49 
> ++
>  xen/arch/x86/hvm/vmx/vmx.c |  6 +
>  xen/arch/x86/msr.c | 41 +++
>  xen/include/asm-x86/hvm/viridian.h | 11 ++---
>  5 files changed, 64 insertions(+), 49 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 8b4cefd..6d8ed5c 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -1967,8 +1967,7 @@ static int svm_msr_read_intercept(unsigned int msr, 
> uint64_t *msr_content)
>  else if ( ret )
>  break;
>  
> -if ( rdmsr_viridian_regs(msr, msr_content) ||
> - rdmsr_hypervisor_regs(msr, msr_content) )
> +if ( rdmsr_hypervisor_regs(msr, msr_content) )
>  break;
>  
>  if ( rdmsr_safe(msr, *msr_content) == 0 )
> @@ -2123,9 +2122,6 @@ static int svm_msr_write_intercept(unsigned int msr, 
> uint64_t msr_content)
>  else if ( ret )
>  break;
>  
> -if ( wrmsr_viridian_regs(msr, msr_content) )
> -break;
> -
>  switch ( wrmsr_hypervisor_regs(msr, msr_content) )
>  {
>  case -ERESTART:
> diff --git a/xen/arch/x86/hvm/viridian.c b/xen/arch/x86/hvm/viridian.c
> index 70aab52..23de433 100644
> --- a/xen/arch/x86/hvm/viridian.c
> +++ b/xen/arch/x86/hvm/viridian.c
> @@ -554,13 +554,11 @@ static void update_reference_tsc(struct domain *d, 
> bool_t initialize)
>  put_page_and_type(page);
>  }
>  
> -int wrmsr_viridian_regs(uint32_t idx, uint64_t val)
> +int guest_wrmsr_viridian(struct vcpu *v, uint32_t idx, uint64_t val)

Since this now returns X86EMUL_* which doesn't have negative value,
should this be unsigned int?

> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
> index 7aaa2b0..2ff9361 100644
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -139,9 +139,11 @@ int init_vcpu_msr_policy(struct vcpu *v)
>  
>  int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
>  {
> -const struct cpuid_policy *cp = v->domain->arch.cpuid;
> -const struct msr_domain_policy *dp = v->domain->arch.msr;
> +const struct domain *d = v->domain;
> +const struct cpuid_policy *cp = d->arch.cpuid;
> +const struct msr_domain_policy *dp = d->arch.msr;
>  const struct msr_vcpu_policy *vp = v->arch.msr;
> +int ret = X86EMUL_OKAY;
>  
>  switch ( msr )
>  {
> @@ -173,11 +175,26 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
> uint64_t *val)
> _MSR_MISC_FEATURES_CPUID_FAULTING;
>  break;
>  
> +case 0x4000 ... 0x41ff:

I think it would be clearer to use VIRIDIAN_MSR_MIN ...
VIRIDIAN_MSR_MAX.

Or else the defines should be removed because you are removing all of
it's users.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v8 07/11] vpci/bars: add handlers to map the BARs

2018-02-27 Thread Jan Beulich
>>> On 27.02.18 at 12:57,  wrote:
> On Tue, Feb 27, 2018 at 03:01:44AM -0700, Jan Beulich wrote:
>> >>> On 27.02.18 at 10:21,  wrote:
>> > With the current approach in the unmap case there will be stale
>> > mappings left behind.
>> > 
>> > I guess it's better then to not modify the memory decoding bit at all
>> > until the operation finishes. That also rises the question of whether
>> > the memory decoding bit should be modified if p2m mapping/unmapping
>> > reports an error.
>> > 
>> > Should we also attempt to rollback failed map/unmap operations? What
>> > happens if the rollback also fails?
>> 
>> It is in particular this last question why I don't think rollback makes
>> sense. If there's any failure, I think the decode bit should be sticky
>> clear; we may want (need?) to invent some magical mechanism to
>> get a device back out of this state later on. But that way stale
>> mappings are not an immediate problem (I think).
> 
> The only problem I see with this approach is that if an error happens
> in the unmap case we will disable memory decoding and leave some stale
> p2m mappings. Then the guest might change the position of the BARs,
> and those stale mappings would be completely forgotten.

Well, the implication was that together with the decode enable bit
not being allowed to be turned off would be that some recording
would perhaps be needed of the stale mappings. Allowing the
decode enable bit to be set again would then depend on the stale
mappings first being dealt with. As long as the decode enable bit
can't be set, the owning domain playing with the BARs is of no
interest.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/HVM: don't give the wrong impression of WRMSR succeeding

2018-02-27 Thread Boris Ostrovsky
On 02/27/2018 03:39 AM, Jan Beulich wrote:
 On 23.02.18 at 08:55,  wrote:
> On 22.02.18 at 23:16,  wrote:
>>> On 02/22/2018 10:44 AM, Jan Beulich wrote:
>>> On 22.02.18 at 15:53,  wrote:
> On 22/02/18 13:44, Jan Beulich wrote:
>> ... for unknown MSRs: wrmsr_hypervisor_regs()'s comment clearly says
>> that the function returns 0 for unrecognized MSRs, so
>> {svm,vmx}_msr_write_intercept() should not convert this into success.
>>
>> At the time it went in, commit 013e34f5a6 ("x86: handle paged gfn in
>> wrmsr_hypervisor_regs") was probably okay, since prior to that the
>> return value wasn't checked at all. But that's not how we want things
>> to be handled nowadays.
>>
>> Signed-off-by: Jan Beulich 
> I agree in principle, but this does have a large potential risk for
> guests.  Any unknown MSR which guests don't check for #GP faults from
> will now cause the guests to crash.
>
> That said, it is the correct direction to go long-term, and we've got to
> throw the switch some time, but I expect this will cause problems in the
> short term, especially for migrated-in guests.
 Thinking about this again, the RDMSR side of things already raises
 #GP for inaccessible MSRs. We obviously can't do a probing WRMSR
 in {svm,vmx}_msr_write_intercept(), but couldn't we rdmsr_safe()
 in the "case 0:" block, treating the result as the verdict whether to
 raise #GP to the guest? As the read path does this anyway, we're
 not exposing ourselves to new risks.
>>> What about write-only MSRs?
>> Bad luck (I'm sorry to say so, but we have an actual bug to fix here).
>> If we find any such is used, we'll have to add individual case labels.
> Since it wasn't clear with your question above and you earlier
> given R-b, I had dropped the latter from v2. Could you clarify
> whether I may reinstate it?

Yes, please.

-boris

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 30/49] ARM: new VGIC: Add ENABLE registers handlers

2018-02-27 Thread Andre Przywara
Hi,

On 19/02/18 14:13, Julien Grall wrote:
> 
> 
> On 19/02/18 12:41, Andre Przywara wrote:
>> Hi,
> 
> Hi,
> 
>> On 16/02/18 16:57, Julien Grall wrote:
>>> On 09/02/18 14:39, Andre Przywara wrote:
 +    spin_lock_irqsave(>lock, flags);
 +    if ( enable )
 +    {
 +    gic_set_irq_type(desc, irq_type == VGIC_CONFIG_LEVEL ?
 + IRQ_TYPE_LEVEL_HIGH : IRQ_TYPE_EDGE_RISING);
>>>
>>> Indentation and I would prefer a helper to convert between the vgic
>>> value and the IRQ_TYPE. This would make the code easier to read.
>>>
>>> Also, this code does not replicate correctly the current vGIC.
>>> gic_set_irq_type is only allowed to be used when
>>> irq_set_type_by_domain(d) returns true. If you consider this change
>>> valid, then I would like to know why.
>>
>> So what is/was the rationale for not allowing IRQ type changes for
>> non-privileged guests? If you allow to pass through an hardware IRQ to a
>> guest (which is the case this function handles), then I don't see why a
>> guest would not be allowed to change the configuration? It seems rather
>> odd, I guess it's up to the guest to know which type of IRQ this is?
> 
> If you can answer the question on top of irq_type_set_by_domain (i.e
> "See whether it is possible to let any domain configure the type) then
> we can remove it. We decided to only allow for the hardware domain
> because we trust it.

But why would you mistrust a DomU in this respect?
The only point I see is that a guest has *some* influence on a hardware
access, but I fail to see how a single MMIO read-modify-write sequence
would actually impact the host. Especially since we do it only on
enabling an IRQ.
Looking more closely at the existing VGIC code we might want to check if
the hardware IRQ was already enabled before entering the
"if ( p->desc != NULL )" branch, btw.

So is this the concern? Commit b0003bdd690 wasn't really enlightening in
this respect.

Cheers,
Andre.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/6] x86/vmx: Simplfy the default cases in vmx_msr_{read, write}_intercept()

2018-02-27 Thread Roger Pau Monné
On Tue, Feb 27, 2018 at 12:43:21PM +, Roger Pau Monné wrote:
> On Mon, Feb 26, 2018 at 05:35:14PM +, Andrew Cooper wrote:
> > The default case of vmx_msr_write_intercept() in particular is very tangled.
> > 
> > First of all, fold long_mode_do_msr_{read,write}() into their callers.  
> > These
> > functions were split out in the past because of the 32bit build of Xen, but 
> > it
> > is unclear why the cases weren't simply #ifdef'd in place.
> > 
> > Next, invert the vmx_write_guest_msr()/is_last_branch_msr() logic to break 
> > if
> > the condition is satisfied, rather than nesting if it wasn't.  This allows 
> > the
> > wrmsr_hypervisor_regs() call to be un-nested with respect to the other 
> > default
> > logic.
> > 
> > No practical difference from a guests point of view.
> 
> I think there's a difference from guest PoV now, guest wrmsr of
> GS/FS/LSTAR/CSTAR with non-canonical addresses will get a #GP.

Forget about this, I've just realized this is completely wrong.

And I've also figured out my question to CSTAR, so:

Reviewed-by: Roger Pau Monné 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/6] x86/vmx: Simplfy the default cases in vmx_msr_{read, write}_intercept()

2018-02-27 Thread Andrew Cooper
On 27/02/18 12:43, Roger Pau Monné wrote:
> On Mon, Feb 26, 2018 at 05:35:14PM +, Andrew Cooper wrote:
>> The default case of vmx_msr_write_intercept() in particular is very tangled.
>>
>> First of all, fold long_mode_do_msr_{read,write}() into their callers.  These
>> functions were split out in the past because of the 32bit build of Xen, but 
>> it
>> is unclear why the cases weren't simply #ifdef'd in place.
>>
>> Next, invert the vmx_write_guest_msr()/is_last_branch_msr() logic to break if
>> the condition is satisfied, rather than nesting if it wasn't.  This allows 
>> the
>> wrmsr_hypervisor_regs() call to be un-nested with respect to the other 
>> default
>> logic.
>>
>> No practical difference from a guests point of view.
> I think there's a difference from guest PoV now, guest wrmsr of
> GS/FS/LSTAR/CSTAR with non-canonical addresses will get a #GP.
>
>> @@ -3090,6 +3015,45 @@ static int vmx_msr_write_intercept(unsigned int msr, 
>> uint64_t msr_content)
>>  goto gp_fault;
>>  __vmwrite(GUEST_SYSENTER_EIP, msr_content);
>>  break;
>> +
>> +case MSR_FS_BASE:
>> +case MSR_GS_BASE:
>> +case MSR_SHADOW_GS_BASE:
>> +if ( !is_canonical_address(msr_content) )
>> +goto gp_fault;
> This is AFAICT different from previous behaviour, previous code would
> just return X86EMUL_EXCEPTION without injecting a fault. From the SDM
> I see injecting a #GP is the correct behaviour.

The gp_fault label simply returns X86EMUL_EXCEPTION, but is more
descriptive when named like this.

It is up to the top level to convert EXCEPTION into an
hvm_inject_hw_exception() if it wants, because there is at least one
case in the emulator where we need to squash the exception rather than
propagating it.

Previously, non-canonical addresses on the wrmsr side did have their
EXCEPTION propagated up from long_mode_do_msr_write(), so after careful
re-review, I still think the behaviour is the same.

>
>> +
>> +if ( msr == MSR_FS_BASE )
>> +__vmwrite(GUEST_FS_BASE, msr_content);
>> +else if ( msr == MSR_GS_BASE )
>> +__vmwrite(GUEST_GS_BASE, msr_content);
>> +else
>> +wrmsrl(MSR_SHADOW_GS_BASE, msr_content);
>> +
>> +break;
>> +
>> +case MSR_STAR:
>> +v->arch.hvm_vmx.star = msr_content;
>> +wrmsrl(MSR_STAR, msr_content);
>> +break;
>> +
>> +case MSR_LSTAR:
>> +if ( !is_canonical_address(msr_content) )
>> +goto gp_fault;
>> +v->arch.hvm_vmx.lstar = msr_content;
>> +wrmsrl(MSR_LSTAR, msr_content);
>> +break;
>> +
>> +case MSR_CSTAR:
>> +if ( !is_canonical_address(msr_content) )
>> +goto gp_fault;
>> +v->arch.hvm_vmx.cstar = msr_content;
> Likely a stupid question, but why is CSTAR not written here? (like it's
> done for LSTAR)

CSTAR on Intel is a weird corner case.  The MSR exists and can be
interacted with via RDMSR/WRMSR, but SYSCALL fails with #UD outside of
64bit mode, there is nothing in the pipeline which will make use of the
value.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-4.6-testing test] 120035: regressions - trouble: broken/fail/pass

2018-02-27 Thread osstest service owner
flight 120035 xen-4.6-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/120035/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-libvirt broken
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 
119227

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-libvirt  4 host-install(4)  broken pass in 120009
 test-xtf-amd64-amd64-1 50 xtf/test-hvm64-lbr-tsx-vmentry fail in 119983 pass 
in 120035
 test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat fail in 119983 pass 
in 120035
 test-xtf-amd64-amd64-3   50 xtf/test-hvm64-lbr-tsx-vmentry fail pass in 119983
 test-amd64-i386-rumprun-i386 17 rumprun-demo-xenstorels/xenstorels.repeat fail 
pass in 120009

Tests which did not succeed, but are not blocking:
 test-xtf-amd64-amd64-2 50 xtf/test-hvm64-lbr-tsx-vmentry fail in 119983 like 
119187
 test-armhf-armhf-libvirt 14 saverestore-support-check fail in 119983 like 
119227
 test-armhf-armhf-xl-rtds 12 guest-start fail in 119983 like 119227
 test-armhf-armhf-libvirt13 migrate-support-check fail in 119983 never pass
 test-xtf-amd64-amd64-4 50 xtf/test-hvm64-lbr-tsx-vmentry fail in 120009 like 
119227
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail  like 119187
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 119227
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 119227
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 119227
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 119227
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 119227
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 119227
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 119227
 test-xtf-amd64-amd64-2   37 xtf/test-hvm32pae-memop-seg  fail   never pass
 test-xtf-amd64-amd64-4   37 xtf/test-hvm32pae-memop-seg  fail   never pass
 test-xtf-amd64-amd64-5   37 xtf/test-hvm32pae-memop-seg  fail   never pass
 test-xtf-amd64-amd64-2   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-4   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-5   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-4   76 xtf/test-pv32pae-xsa-194 fail   never pass
 test-xtf-amd64-amd64-1   37 xtf/test-hvm32pae-memop-seg  fail   never pass
 test-xtf-amd64-amd64-3   37 xtf/test-hvm32pae-memop-seg  fail   never pass
 test-xtf-amd64-amd64-1   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-2   76 xtf/test-pv32pae-xsa-194 fail   never pass
 test-xtf-amd64-amd64-5   76 xtf/test-pv32pae-xsa-194 fail   never pass
 test-xtf-amd64-amd64-3   52 xtf/test-hvm64-memop-seg fail   never pass
 test-xtf-amd64-amd64-1   76 xtf/test-pv32pae-xsa-194 fail   never pass
 test-xtf-amd64-amd64-3   76 xtf/test-pv32pae-xsa-194 fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail 

Re: [Xen-devel] Xen Security Advisory 255 - grant table v2 -> v1 transition may crash Xen

2018-02-27 Thread Steven Haigh
On Tuesday, 27 February 2018 11:00:08 PM AEDT Xen. org security team wrote:
> Xen Security Advisory XSA-255
>   version 3
> 
>  grant table v2 -> v1 transition may crash Xen
> 

> RESOLUTION
> ==
> 
> Applying the appropriate attached patch resolves this issue.
> 
> xsa255-?.patch xen-unstable, Xen 4.10.x
> xsa255-4.9-?.patch Xen 4.9.x, Xen 4.8.x
> xsa255-4.7-?.patch Xen 4.7.x
> xsa255-4.6-?.patch Xen 4.6.x

Is there a missing pre-requisite patch required for 4.6.6?

I'm currently getting a failure on these patches as follows:

Patch #55 (xsa255-4.6-1.patch):
+ echo 'Patch #55 (xsa255-4.6-1.patch):'
+ /bin/cat /builddir/build/SOURCES/xsa255-4.6-1.patch
+ /usr/bin/patch -p1 --fuzz=2
patching file xen/arch/arm/domain.c
patching file xen/arch/arm/mm.c
Hunk #2 FAILED at 1075.
Hunk #3 FAILED at 1090.
2 out of 3 hunks FAILED -- saving rejects to file xen/arch/arm/mm.c.rej

The patches for 4.7, 4.9 and 4.10 seem to apply successfully.

-- 
Steven Haigh

 net...@crc.id.au    http://www.crc.id.au
 +61 (3) 9001 6090 0412 935 897

signature.asc
Description: This is a digitally signed message part.
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/6] x86/vmx: Simplfy the default cases in vmx_msr_{read, write}_intercept()

2018-02-27 Thread Roger Pau Monné
On Mon, Feb 26, 2018 at 05:35:14PM +, Andrew Cooper wrote:
> The default case of vmx_msr_write_intercept() in particular is very tangled.
> 
> First of all, fold long_mode_do_msr_{read,write}() into their callers.  These
> functions were split out in the past because of the 32bit build of Xen, but it
> is unclear why the cases weren't simply #ifdef'd in place.
> 
> Next, invert the vmx_write_guest_msr()/is_last_branch_msr() logic to break if
> the condition is satisfied, rather than nesting if it wasn't.  This allows the
> wrmsr_hypervisor_regs() call to be un-nested with respect to the other default
> logic.
> 
> No practical difference from a guests point of view.

I think there's a difference from guest PoV now, guest wrmsr of
GS/FS/LSTAR/CSTAR with non-canonical addresses will get a #GP.

> @@ -3090,6 +3015,45 @@ static int vmx_msr_write_intercept(unsigned int msr, 
> uint64_t msr_content)
>  goto gp_fault;
>  __vmwrite(GUEST_SYSENTER_EIP, msr_content);
>  break;
> +
> +case MSR_FS_BASE:
> +case MSR_GS_BASE:
> +case MSR_SHADOW_GS_BASE:
> +if ( !is_canonical_address(msr_content) )
> +goto gp_fault;

This is AFAICT different from previous behaviour, previous code would
just return X86EMUL_EXCEPTION without injecting a fault. From the SDM
I see injecting a #GP is the correct behaviour.

> +
> +if ( msr == MSR_FS_BASE )
> +__vmwrite(GUEST_FS_BASE, msr_content);
> +else if ( msr == MSR_GS_BASE )
> +__vmwrite(GUEST_GS_BASE, msr_content);
> +else
> +wrmsrl(MSR_SHADOW_GS_BASE, msr_content);
> +
> +break;
> +
> +case MSR_STAR:
> +v->arch.hvm_vmx.star = msr_content;
> +wrmsrl(MSR_STAR, msr_content);
> +break;
> +
> +case MSR_LSTAR:
> +if ( !is_canonical_address(msr_content) )
> +goto gp_fault;
> +v->arch.hvm_vmx.lstar = msr_content;
> +wrmsrl(MSR_LSTAR, msr_content);
> +break;
> +
> +case MSR_CSTAR:
> +if ( !is_canonical_address(msr_content) )
> +goto gp_fault;
> +v->arch.hvm_vmx.cstar = msr_content;

Likely a stupid question, but why is CSTAR not written here? (like it's
done for LSTAR)

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 0/9] drm/xen-front: Add support for Xen PV display frontend

2018-02-27 Thread Oleksandr Andrushchenko

Please find some more clarifications on VirtIO use with Xen
(I would like to thank Xen community for helping with this)

1. Possible security issues - VirtIO devices are PCI bus masters, thus
allowing real device (running, for example, in untrusted driver domain)
to get control over guest's memory by writing to its memory

2. VirtIO currently uses GFNs written into the shared ring, without Xen
grants support. This will require generic grant-mapping/sharing layer
to be added to VirtIO.

3. VirtIO requires QEMU PCI emulation for setting up a device. Xen PV 
(and PVH)
domains don't use QEMU for platform emulation in order to reduce attack 
surface.
(PVH is in the process of gaining PCI config space emulation though, but 
it is

optional, not a requirement)

4. Most of the PV drivers a guest uses at the moment are Xen PV drivers, 
e.g. net,

block, console, so only virtio-gpu will require QEMU to run.
Although this use case would work on x86 it will require additional changes
to get this running on ARM, which is my target platform.

Thank you,
Oleksandr

On 02/26/2018 10:21 AM, Oleksandr Andrushchenko wrote:

**

*Hi, all!*

*

Last *Friday* some concerns on #dri-devel were raised wrt "yet

another driver" for Xen and why not virtio-gpu. Let me highlight

on why we need a new paravirtualized driver for Xen and why we

can't just use virtio. Hope this helps the communities (both Xen

and DRI) to have better understanding of this work and our motivation.


Disclaimer: some or all of the below may sound weak argument or

not 100% correct, so any help on clarifying the below is more

than welcome ;)


1. First of all, we are targeting ARM embedded use-cases and for

ARM we do not use QEMU [1]: "...Xen on ARM is not just a straight

1:1 port of x86 Xen... Xen on ARM does not need QEMU because it does

not do any emulation. It accomplishes the goal by exploiting

virtualization support in hardware as much as possible and using

paravirtualized interfaces for IO."


That being said it is still possible to run virtio-gpu and Xen+QEMU: [2]


In this case QEMU can be used for device virtualization, e.g. network,

block, console. But these already exist as Xen para-virtualized drivers

again eliminating the need for QEMU: typical ARM system runs 
para-virtualized


drivers for network, block, console etc.


2. virtio-gpu requires PCI/MMIO emulation

virtio-gpu (virtio-gpu-pci) require virtio-pci, but para-virtualized 
device


drivers do not need this.


3. No need for 3d/virgl.

There are use-cases which either do not use OpenGL at all or will use

custom virtualization solutions allowing sharing of a real GPU with 
guest,


e.g. vGPU approach.


4. More freedom for buffer allocation.

As of now virtio-gpu is only capable of allocating buffers via TTM, while

there are use-cases where we need to have more freedom:

for systems which do not provide IOMMU support, but having specific

requirements for display buffers, it is possible to allocate such buffers

at backend side and share those with the frontend driver.

For example, if host domain is 1:1 mapped and has DRM/GPU hardware 
expecting


physically contiguous memory (in PA, not IPA), this allows implementing

zero-copying use-cases.


5. Zero-copying support at backend side

Having native Xen implementation allows implementing zero-copying 
use-cases


on backend side with the help of supporting driver DRM driver [3] 
which we


hope to upstream as well (it is not yet ready in terms of code cleanup).


6. QEMU backends for virtio-gpu cannot be used as is, e.g. guest displays

could be just a part of the final user experience. Thus, a QEMU backend

must be modified to interact, for example, with Automotive Grade Linux

display manager. So, QEMU part needs modifications.

In our use-case we have a backend which supports multi-touch and guest

display(s) and running either as a weston client (which is not supported

by QEMU at the moment?) or KMS/DRM client. This allows us to enable much

more use-cases**without the need to run QEMU.

*

*Thank you,*

**Oleksandr Andrushchenko*
*

*
*

*[1] 
https://wiki.xen.org/wiki/Xen_ARM_with_Virtualization_Extensions_whitepaper*


*

[2] https://elinux.org/R-Car/Virtualization

[3] 
https://github.com/xen-troops/linux/blob/ces2018/drivers/gpu/drm/xen/xen_drm_zcopy_drv.c



*


On 02/21/2018 10:03 AM, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko 

Hello!

This patch series adds support for Xen [1] para-virtualized
frontend display driver. It implements the protocol from
include/xen/interface/io/displif.h [2].
Accompanying backend [3] is implemented as a user-space application
and its helper library [4], capable of running as a Weston client
or DRM master.
Configuration of both backend and frontend is done via
Xen guest domain configuration options [5].

*** 


* Driver limitations

Re: [Xen-devel] Is there any way to read msr in hypervisor code?

2018-02-27 Thread Minjun Hong
On Tue, Feb 27, 2018 at 7:00 PM, Andrew Cooper 
wrote:

> On 27/02/2018 09:37, Minjun Hong wrote:
>
> Hello, I've tried to read msr in hypervisor code:
>
> uint64_t reg;
>
> rdmsr_safe(0x412e, ret);
>
>
> But, I all failed to read it because the hypervisor is running in ring-1
> CPL as I did googling.
>
>
> Xen runs in ring 0, root mode.  Forget terms like "ring -1" because they
> are not accurate, and not a helpful description of what is going on.
>
> When I accessed msr, the crash log is below:
>
>
> (XEN) [ Xen-4.9.0  x86_64  debug=n   Not tainted ]
>>
>> (XEN) CPU:7
>>
>> (XEN) RIP:e008:[] mcsched.c#shscan_timer_fn+0xd/
>>> 0x180
>>
>> (XEN) RFLAGS: 00010206   CONTEXT: hypervisor
>>
>> (XEN) rax: 83084abfe028   rbx: 83084abfe300   rcx:
>>> 412e
>>
>> (XEN) rdx: 83084abb7fff   rsi: 82d080632f00   rdi:
>>> 
>>
>> (XEN) rbp: 82d080241390   rsp: 83084abb7e30   r8:
>>> 830868d4bdc0
>>
>> (XEN) r9:  0005   r10: 012da75b50a5   r11:
>>> 8300782f3060
>>
>> (XEN) r12:    r13: 012d9f9d6e07   r14:
>>> 83084abb7fff
>>
>> (XEN) r15: 82d08062ad80   cr0: 8005003b   cr4:
>>> 003526e0
>>
>> (XEN) cr3: 0003426f3000   cr2: 7fc3c21d4000
>>
>> (XEN) ds:    es:    fs:    gs:    ss: e010   cs: e008
>>
>> (XEN) Xen code around  (mcsched.c#shscan_timer_fn+
>>> 0xd/0x180):
>>
>> (XEN)  00 00 41 55 41 54 55 53 <0f> 32 4c 8d 05 0a 79 11 00 48 8d 0d 91
>>> 45 14 00
>>
>>
> This is a rdmsr(), not rdmsr_safe(), which is why you are crashing.
>
> Irrespective of that, 0x412e isn't an MSR which exists on any real
> hardware, so I'm not sure what other result you were expecting.
>
> ~Andrew
>

Thanks for your answer, Andrew.

Actually, I intended to get the number of cache-misses from msr and  according
to "*Figure 18**-1 Layout of IA32_PERFEVTSELx MSRs in Intel® 64 and IA-32
Architectures Software Developer’s Manual Volume 3B: System Programming
Guide*",
it can be achieved by "0x412e":

Bit Position

CPUID.AH.EBX

Event Name

UMask

Event Select

0

UnHalted Core Cycles

00H

3CH

1

Instruction Retired

00H

C0H

2

UnHalted Reference Cycles

01H

3CH

3

LLC Reference

4FH

2EH

*4*

*LLC Misses*

*41H*

*2EH*

5

Branch Instruction Retired

00H

C4H

6

Branch Misses Retired

00H

C5H

I also tried to do that w/ rdmsr_safe() but, I failed to read msr.
Through the macro that returns a non-zero value, it was easy to see that
reading msr failed:

if ( rdmsr_safe(0x412e, ret) ) {
> printk("cannot read msr!!!\n");
> } else {
> printk("rdmsr: %lu\n", ret);
> }


 I cannot understand why you said "0x412e isn't an MSR which exists on any
real hardware" so, is it impossible to get the number of cache-misses by
using msr ??
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Xen Security Advisory 255 - grant table v2 -> v1 transition may crash Xen

2018-02-27 Thread Xen . org security team
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Xen Security Advisory XSA-255
  version 3

 grant table v2 -> v1 transition may crash Xen

UPDATES IN VERSION 3


Public release.

ISSUE DESCRIPTION
=

Grant tables come in two flavors (versions), and domains are permitted
to freely change between them (subject to certain constraints).  For
the guest to use the facility, both the "normal" shared pages
(applicable to v1 and v2) and the "status" pages (applicable to v2
only) need to be mapped by the guest into its address space.

When transitioning from v2 to v1, the status pages become unnecessary
and are therefore freed by Xen.  That means Xen needs to check that
there are no mappings of those pages by the domain.  However, that
check was mistakenly implemented as a bug check, rather than returning
an error to the guest.

IMPACT
==

A malicious or buggy guest may cause a hypervisor crash, resulting in
a Denial of Service (DoS) affecting the entire host.  Privilege
escalation as well as information leaks cannot be ruled out for HVM,
PVH (both x86), and ARM guests.

The impact is more severe for Xen versions 4.0.x, 4.1.0 ... 4.1.3, and
4.2 in that the pages are freed without any checking, thus allowing
their re-use for another domain, or by Xen itself, while there still
are active mappings (see XSA-26).

VULNERABLE SYSTEMS
==

Xen versions 4.0 and newer are vulnerable.

Both x86 and ARM systems are vulnerable.

MITIGATION
==

Using the "gnttab=max_ver:1" hypervisor command line option, where
available, to disable use of v2 grant tables allows to avoid the
vulnerability.  Use of this option will, however, break any guests which
require to make use of v2 functionality.  The patch introducing this
option was not merged so far, but is available (in its current form) at
https://lists.xenproject.org/archives/html/xen-devel/2018-02/msg00059.html
("common/gnttab: Introduce command line feature controls").

There is no other known mitigation.

CREDITS
===

This issue was discovered by Jan Beulich of SUSE.

RESOLUTION
==

Applying the appropriate attached patch resolves this issue.

xsa255-?.patch xen-unstable, Xen 4.10.x
xsa255-4.9-?.patch Xen 4.9.x, Xen 4.8.x
xsa255-4.7-?.patch Xen 4.7.x
xsa255-4.6-?.patch Xen 4.6.x

$ sha256sum xsa255*
05a5570ecf4354f7aad35bb77a4c2f5f556bcabf3555829a98c94dcfb6dd4696  xsa255-1.patch
df43a147f1e1a2b7d59588bc91cdaac05d4e45bcfc4e2c8cb5e8de840d44b43d  xsa255-2.patch
be62d81583df10a6be275427d5cfa02084c8717473b3694cd2a9bbdc10cbadcb  
xsa255-4.6-1.patch
3dd58114c5ce68fd8dd43f8f92eaafdcec1fd9add37eb41faed1cf818058539a  
xsa255-4.6-2.patch
9bfc4a33a0faeb36aec8449ea940cef52d523cc3d13529b4eeaae64bf5a7b644  
xsa255-4.7-1.patch
6d95ceb54298de7863dc7133c0f3adf85f7da9b8d326146ff46e641194a47fc0  
xsa255-4.7-2.patch
0b4706f0d2d21d4f6414ae9c0205e553bfb792c23d44e129b3a0f90be557d13f  
xsa255-4.9-1.patch
9c6b2d2183ffa484182ca75e1a048d0713c4d150e750ccf58be5a24991a3e1de  
xsa255-4.9-2.patch
$

DEPLOYMENT DURING EMBARGO
=

Deployment of the patches described above (or others which are
substantially similar) is permitted during the embargo, even on
public-facing systems with untrusted guest users and administrators.

However, deployment of the mitigation is NOT permitted (except where
all the affected systems and VMs are administered and used only by
organisations which are members of the Xen Project Security Issues
Predisclosure List).  Specifically, deployment on public cloud systems
is NOT permitted.  This is because this produces a guest-visible
change which will indicate which component contains the vulnerability.

Additionally, distribution of updated software is prohibited (except to
other members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iQEcBAEBCAAGBQJalUeyAAoJEIP+FMlX6CvZrgoH/3bGXBS8dUA/59slmiyw7UpG
jShI/y+aCo5APtvPVrm+1cM/YOFVCHoZcnL0V+E9VBekCeVo84lfrDwrsU76RWNq
vIyPKBBeJbJFCCLatiWWDSEG6MukhhF0xiJy5RuMd/A1d4+6XLsD3y2bIkBb1P13
WzPJcgN1/wMM4A5Tp7MgyncOdm5yODu0A85L4J6fOOGm+LrNErvFpREcivoIKhbq
PKm04dSNb3jp7b7J9cVfSH/ZhpD1szwJ9yrddX3zgOF/1jlDi54Bri2potAZRZ0j
h+dN4Oh1HUR6NJeTuJqHx/VwFsx7V2zmWZZwsHaI7f/Oe15GRY/vPJNEm3VhNoM=
=atsR
-END PGP SIGNATURE-


xsa255-1.patch
Description: Binary data


xsa255-2.patch

[Xen-devel] Xen Security Advisory 252 - DoS via non-preemptable L3/L4 pagetable freeing

2018-02-27 Thread Xen . org security team
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Xen Security Advisory XSA-252
  version 2

 DoS via non-preemptable L3/L4 pagetable freeing

UPDATES IN VERSION 2


Public release.

ISSUE DESCRIPTION
=

Guests have the ability to request removal of memory from themselves.
This operation is intended to be requested for normal read/write pages,
but is also permitted to be used on other types of pages.  So far this
in particular included pages pinned to their current type, with the
necessary unpinning happening implicitly.  The unpinning of higher level
page tables can, however, take a significant amount of time, and hence
is generally expected to be carried out with intermediate preemption
checks.  Such checks were missing from the code path involved here.

IMPACT
==

A malicious guest administrator can cause a Denial of Service (DoS).
Specifically, prevent use of a physical CPU for a significant period of
time.

VULNERABLE SYSTEMS
==

All Xen versions are vulnerable.

Only x86 systems are affected.  ARM systems are not affected.

Only PV guests can leverage this vulnerability.  HVM guests cannot
leverage this vulnerability.

MITIGATION
==

Running only HVM guests will avoid this issue.

CREDITS
===

This issue was discovered by Jann Horn of Google Project Zero.

RESOLUTION
==

Applying the appropriate attached patch resolves this issue.

xsa252.patch   xen-unstable, Xen 4.10.0
xsa252-4.9.patch   Xen 4.9.x, Xen 4.8.x
xsa252-4.7.patch   Xen 4.7.x
xsa252-4.6.patch   Xen 4.6.x, Xen 4.5.x

$ sha256sum xsa252*
5bf651378b92520969cde49d11500bcaeffab15590d21c16736be408a85ab3fa  xsa252.meta
53174dfd05eb274431dc756c9c3a39b355d485d6c9d12a8797b350bab343d22e  xsa252.patch
b7ba005fa62ace07f4880cc79824968c24ead3182245e4ed3a6e22cf8d2d7c05  
xsa252-4.6.patch
14f37eb6b7a9fb19b258ca3c0e2da71dbc4240e6273137d5eb4003b122101aa6  
xsa252-4.7.patch
cb679f2145e76b1c754c4377b397d201007f50438ee18e451c4b0da3f510a293  
xsa252-4.9.patch
$

DEPLOYMENT DURING EMBARGO
=

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iQEcBAEBCAAGBQJalUevAAoJEIP+FMlX6CvZaDEH/0MrInFkPbVr0OFNs8KHuZNh
5fz3sXFbf/7O0aTdFT5JJpwZaOngSyjnnKJKZMtsEHz52Nzs6o4xnYzqzNlemPJf
FG5NKjWgQI762H8Co4z65eWwHevfDo9a1XAy2LRHlbaNkGXMwic3B2VbhW2A0Hkp
nAATx19TpS21Fk4dK5+P8HCy+YN5RwPKKADE1Jps0MsCcSZ9NHcKfedokqpaD2DQ
XEWlfhclzHGLdrBGFWtvBUGuxUIioB/ovVQK/6q7/Go2nLNvkrU63tdiCchzpVLA
qXskJeatqqH/QnLXxhgzAQWf4rmjCU21l3Lh75ZK0xrRKAPFMOiPLuQ3VtVhcYA=
=sq8W
-END PGP SIGNATURE-


xsa252.meta
Description: Binary data


xsa252.patch
Description: Binary data


xsa252-4.6.patch
Description: Binary data


xsa252-4.7.patch
Description: Binary data


xsa252-4.9.patch
Description: Binary data
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Xen Security Advisory 256 - x86 PVH guest without LAPIC may DoS the host

2018-02-27 Thread Xen . org security team
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Xen Security Advisory XSA-256
  version 2

 x86 PVH guest without LAPIC may DoS the host

UPDATES IN VERSION 2


Public release.

ISSUE DESCRIPTION
=

So far, x86 PVH guests can be configured with or without Local APICs.
Configurations with Local APICs are identical to x86 HVM guests, and
will use as much hardware acceleration support as possible.
Configurations without Local APICs try to turn off all hardware
acceleration, and disable all software emulation.

Multiple paths in Xen assume the presence of a Local APIC without
sufficient checks, and can fall over a NULL pointer.  On Intel hardware,
the logic to turn off hardware acceleration is incomplete and leaves the
guest with full control of the real Task Priority Register.

IMPACT
==

A malicious or buggy guest may cause a hypervisor crash, resulting in
a Denial of Service (DoS) affecting the entire host.

VULNERABLE SYSTEMS
==

Xen version 4.8 and onwards are vulnerable.

Only x86 systems are vulnerable.  ARM systems are not vulnerable.

Only x86 PVH guests can exploit the vulnerability.  x86 PV and HVM
guests cannot exploit the vulnerability.

MITIGATION
==

Running only PV or HVM guests avoids the vulnerability.

Running all PVH guests with "apic=1" in the guest configuration file
(or equivalent thereof) also avoids the vulnerability.

CREDITS
===

This issue was discovered by Ian Jackson of Citrix.

RESOLUTION
==

Applying the appropriate attached patch resolves this issue.

xsa256.patch   xen-unstable, Xen 4.10.x, Xen 4.9.x
xsa256-4.8.patch   Xen 4.8.x

$ sha256sum xsa256*
3e45cc3f2ea516e7470083592041e238c0dfe32324790b2fba0e47c9efe38865  xsa256.patch
c029fcb67ff7c3c9a2adcb8e6f5e245a0d347acc8a9b3530591a639cbf321349  
xsa256-4.8.patch
$

DEPLOYMENT DURING EMBARGO
=

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iQEcBAEBCAAGBQJalUe0AAoJEIP+FMlX6CvZsmIH/3B9QnpiL1+NRkGIE62xljEG
NfV/vL6gE2ytNMs8PRdhycovQum7qj+l9S53EswiwgiaUFw9VW5Jq9pg1UQlAQ/q
7aIIke33TgkVKwZnb+7ercGfLNWsJAIldGc5emc9lBSBkPOUhFtxmTytdudB6dy1
VMI+MVM1f4xgxEizNN7QstmlaMB34m0WH0nEdoCR8evXlAcmcBi+HwYDouUNnR5x
21DkEBxyslvheX6SI8sbocfrZpT/K2b8B3zdLmd3nO3TF5ypC1daowIk0vl8o4Yj
TSx4nsBlJ4V0G0gYa1UDBktUfDbVrpoEcdGb5zO3RhoMhcagzWVD6P6F25aYbiU=
=PLNS
-END PGP SIGNATURE-


xsa256.patch
Description: Binary data


xsa256-4.8.patch
Description: Binary data
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v8 07/11] vpci/bars: add handlers to map the BARs

2018-02-27 Thread Roger Pau Monné
On Tue, Feb 27, 2018 at 03:01:44AM -0700, Jan Beulich wrote:
> >>> On 27.02.18 at 10:21,  wrote:
> > On Tue, Feb 27, 2018 at 01:32:24AM -0700, Jan Beulich wrote:
> >> >>> On 26.02.18 at 19:00,  wrote:
> >> > On Mon, Feb 26, 2018 at 04:20:17AM -0700, Jan Beulich wrote:
> >> >> >>> On 23.01.18 at 16:07,  wrote:
> >> >> > +static void maybe_defer_map(struct domain *d, const struct pci_dev 
> >> >> > *pdev,
> >> >> > +struct rangeset *mem, bool map, bool rom)
> >> >> > +{
> >> >> > +struct vcpu *curr = current;
> >> >> > +
> >> >> > +if ( is_idle_vcpu(curr) )
> >> >> > +{
> >> >> > +struct map_data data = { .d = d, .map = true };
> >> >> > +
> >> >> > +/*
> >> >> > + * Only used for domain construction in order to map the BARs
> >> >> > + * of devices with memory decoding enabled.
> >> >> > + */
> >> >> > +ASSERT(map && !rom);
> >> >> > +rangeset_consume_ranges(mem, map_range, );
> >> >> 
> >> >> What if this produces -ENOMEM? And despite having looked at
> >> >> several revisions of this, I can't make the connection to why this
> >> >> is in an is_idle_vcpu() conditional (neither the direct caller nor the
> >> >> next level up make this obvious to me). There's clearly a need
> >> >> for extending the comment.
> >> > 
> >> > I thought the comment above that mentions domain construction would be
> >> > enough. I can try to expand this, maybe like:
> >> > 
> >> > "This function will only be called from the idle vCPU while building
> >> > the domain, in which case it's not possible to defer the operation
> >> > (like done in the else branch). Call rangeset_consume_ranges in order
> >> > to establish the mappings right away."
> >> 
> >> And what again is the connection between is_idle_domain() and
> >> domain construction? I think the comment belongs ahead of the
> >> if(), and it needs to make that connection.
> > 
> > Oh, domain constructions runs on the idle vCPU, that's the relation.
> > 
> > "This function will only be called from the idle vCPU while building
> > the domain (because Dom0 building runs on the idle vCPU), in which
> > case it's not possible to defer the operation (like done in the else
> > branch). Call rangeset_consume_ranges in order to establish the
> > mappings right away."
> > 
> > Does that seem clearer if placed ahead of the if?
> 
> Can be shorter imo - the thing I didn't pay attention to is that
> this is all about _dom0_ building, not general _domain_ building.
> Hence perhaps:
> 
> "Dom0 building runs on the idle vCPU, in which case it's not possible
>  to defer the operation (like done in the else branch). Call
>  rangeset_consume_ranges in order to establish the mappings right
>  away."

LGTM. Thanks for re-writing it.

> >> >> > +static void modify_bars(const struct pci_dev *pdev, bool map, bool 
> >> >> > rom)
> >> >> > +{
> >> >> > +struct vpci_header *header = >vpci->header;
> >> >> > +struct rangeset *mem = rangeset_new(NULL, NULL, 0);
> >> >> > +const struct pci_dev *tmp;
> >> >> > +unsigned int i;
> >> >> > +int rc;
> >> >> > +
> >> >> > +if ( !map )
> >> >> > +modify_decoding(pdev, false, rom);
> >> >> > +
> >> >> > +if ( !mem )
> >> >> > +return;
> >> >> 
> >> >> Similarly here - why is it okay (or what effect will it have) to return
> >> >> here when we're out of memory, but the caller won't know?
> >> > 
> >> > The behaviour here depends on the change to the memory decoding bit:
> >> > 
> >> >  - Clearing: memory decoding on device will be disabled, BARs won't be
> >> >unmapped.
> >> >  - Setting: no change to device memory decoding bit, BARs won't be
> >> >mapped.
> >> > 
> >> > Do you think this is suitable? IMO it's fine to disable the memory
> >> > decoding bit on the device and leave the memory regions mapped.
> >> 
> >> As long as subsequent changes to the decoding bit can't leave
> >> stale mappings. Plus there needs to be a comment to explain this.
> > 
> > With the current approach in the unmap case there will be stale
> > mappings left behind.
> > 
> > I guess it's better then to not modify the memory decoding bit at all
> > until the operation finishes. That also rises the question of whether
> > the memory decoding bit should be modified if p2m mapping/unmapping
> > reports an error.
> > 
> > Should we also attempt to rollback failed map/unmap operations? What
> > happens if the rollback also fails?
> 
> It is in particular this last question why I don't think rollback makes
> sense. If there's any failure, I think the decode bit should be sticky
> clear; we may want (need?) to invent some magical mechanism to
> get a device back out of this state later on. But that way stale
> mappings are not an immediate problem (I think).

The only problem I see with this approach is that if an error happens
in the unmap case we will disable memory decoding and 

[Xen-devel] [PATCH] common/gnttab: Introduce command line feature controls

2018-02-27 Thread Andrew Cooper
This patch was originally released as part of XSA-226.  It retains the same
command line syntax (as various downstreams are mitigating XSA-226 using this
mechanism) but the defaults have been updated due to the revised XSA-226
patched, after which transitive grants are believed to functioning
properly.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: George Dunlap 
CC: Konrad Rzeszutek Wilk 
CC: Stefano Stabellini 
CC: Tim Deegan 
CC: Wei Liu 

v2:
 * Rebase over command-line parsing changes.
v3:
 * Switch to 'max-ver', leaving max_ver as an undocumented alias.
 * Check the end pointer from simple_strtol() to check for trailing characters.
 * Tollerate set_version(1) even if failing set_version(2) with -ENOSYS.
---
 docs/misc/xen-command-line.markdown | 13 
 xen/common/grant_table.c| 42 -
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index 8317639..a95195f 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -920,6 +920,19 @@ Controls EPT related features.
 
 Specify which console gdbstub should use. See **console**.
 
+### gnttab
+> `= List of [ max-ver:, transitive= ]`
+
+> Default: `gnttab=max-ver:2,transitive`
+
+Control various aspects of the grant table behaviour available to guests.
+
+* `max-ver` Select the maximum grant table version to offer to guests.  Valid
+version are 1 and 2.
+* `transitive` Permit or disallow the use of transitive grants.  Note that the
+use of grant table v2 without transitive grants is an ABI breakage from the
+guests point of view.
+
 ### gnttab\_max\_frames
 > `= `
 
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 48c5479..36a182e 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -97,6 +97,41 @@ static unsigned int __read_mostly max_maptrack_frames =
DEFAULT_MAX_MAPTRACK_FRAMES;
 integer_runtime_param("gnttab_max_maptrack_frames", max_maptrack_frames);
 
+static unsigned int __read_mostly opt_gnttab_max_version = 2;
+static bool __read_mostly opt_transitive_grants = true;
+
+static int __init parse_gnttab(const char *s)
+{
+const char *ss, *e;
+int val, rc = 0;
+
+do {
+ss = strchr(s, ',');
+if ( !ss )
+ss = strchr(s, '\0');
+
+if ( !strncmp(s, "max-ver:", 8) ||
+ !strncmp(s, "max_ver:", 8) ) /* Alias for original XSA-226 patch 
*/
+{
+long ver = simple_strtol(s + 8, , 10);
+
+if ( e == ss && ver >= 1 && ver <= 2 )
+opt_gnttab_max_version = ver;
+else
+rc = -EINVAL;
+}
+else if ( (val = parse_boolean("transitive", s, ss)) >= 0 )
+opt_transitive_grants = val;
+else
+rc = -EINVAL;
+
+s = ss + 1;
+} while ( *ss );
+
+return rc;
+}
+custom_param("gnttab", parse_gnttab);
+
 /*
  * Note that the three values below are effectively part of the ABI, even if
  * we don't need to make them a formal part of it: A guest suspended for
@@ -2674,7 +2709,8 @@ static int gnttab_copy_claim_buf(const struct gnttab_copy 
*op,
 current->domain->domain_id,
 buf->read_only,
 >frame, >page,
->ptr.offset, >len, true);
+>ptr.offset, >len,
+opt_transitive_grants);
 if ( rc != GNTST_okay )
 goto out;
 buf->ptr.u.ref = ptr->u.ref;
@@ -2876,6 +2912,10 @@ 
gnttab_set_version(XEN_GUEST_HANDLE_PARAM(gnttab_set_version_t) uop)
 if ( op.version != 1 && op.version != 2 )
 goto out;
 
+res = -ENOSYS;
+if ( op.version == 2 && opt_gnttab_max_version == 1 )
+goto out; /* Behave as before set_version was introduced. */
+
 res = 0;
 if ( gt->gt_version == op.version )
 goto out;
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] xen/arm: Flush TLBs before turning on the MMU to avoid stale entries

2018-02-27 Thread Julien Grall
We don't know what is the state of the TLBs when booting Xen. To avoid
stale entries, it is necessary to flush the TLBs before turning on the
MMU.

Reported-by: Iain Hunter 
Signed-off-by: Julien Grall 
---
 xen/arch/arm/arm32/head.S | 7 +++
 xen/arch/arm/arm64/head.S | 7 +++
 2 files changed, 14 insertions(+)

diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
index 43374e77c6..612fc8fc3c 100644
--- a/xen/arch/arm/arm32/head.S
+++ b/xen/arch/arm/arm32/head.S
@@ -360,6 +360,13 @@ virtphys_clash:
 1:
 PRINT("- Turning on paging -\r\n")
 
+/*
+ * The state of the TLBs is unknown before turning on the MMU.
+ * Flush them to avoid stale one.
+ */
+mcr   CP32(r0, TLBIALLH) /* Flush hypervisor TLBs */
+dsb   nsh
+
 ldr   r1, =paging/* Explicit vaddr, not RIP-relative */
 mrc   CP32(r0, HSCTLR)
 orr   r0, r0, #(SCTLR_M|SCTLR_C) /* Enable MMU and D-cache */
diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 35cf8e5cc9..5ba4832cf3 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -498,6 +498,13 @@ virtphys_clash:
 1:
 PRINT("- Turning on paging -\r\n")
 
+/*
+ * The state of the TLBs is unknown before turning on the MMU.
+ * Flush them to avoid stale one.
+ */
+tlbi  alle2  /* Flush hypervisor TLBs */
+dsb   nsh
+
 ldr   x1, =paging/* Explicit vaddr, not RIP-relative */
 mrs   x0, SCTLR_EL2
 orr   x0, x0, #SCTLR_M   /* Enable MMU */
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen: use hvc console for dom0

2018-02-27 Thread Julien Grall

Hi,

On 26/02/18 12:32, Juergen Gross wrote:

On 26/02/18 13:06, Andrii Anisov wrote:

Hello Juergen,


On 26.02.18 13:08, Juergen Gross wrote:

Today the hvc console is added as a preferred console for pv domUs
only. As this requires a boot parameter for getting dom0 messages per
default add it for dom0, too.

Signed-off-by: Juergen Gross 
---
   arch/x86/xen/enlighten_pv.c | 4 +++-
   1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c

Is this something x86 specific? Could it be a generic approach?


In case ARM wants something similar I guess the test for
xen_initial_domain() should be dropped in xen_early_init().
I am pretty sure we discussed to remove !xen_initial_domain() for Arm in 
the past. But I don't remember why the patch was not sent to remove it.


Anyway, I guess this should be fine to have hvc as a preferred console 
for the initial domain as well.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/link: Don't merge .init.text and .init.data

2018-02-27 Thread Jan Beulich
>>> On 26.02.18 at 20:01,  wrote:
>> Putting all init-time code and data in a single section is a perfectly
>> valid thing to do imo.
> 
> Having our debug symbols borderline unusable, isn't valid.
> 
> Certainly not to work around what you yourself identify as a theoretical
> issue in the first place.

I wonder what theoretical issue you refer to here. I didn't say
anything like that on this thread afaics, nor in the description
of the patch that you want to partly undo. EFI implementations
write-protecting r/o sections is not just a theoretical issue afaict.

> Other option would be to #ifdef the section handing based on EFI, or
> attempt to merge the sections with objcopy before passing mkreloc?

Well, as much as I dislike such #ifdef-ary, I think I could (rather
hesitantly) accept that.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/pv: Rename pv/ro-page-fault.c to pv/emul-ro-page-fault.c

2018-02-27 Thread Jan Beulich
>>> On 26.02.18 at 19:15,  wrote:
> On 05/02/18 13:01, Jan Beulich wrote:
> On 05.02.18 at 13:22,  wrote:
>>> On 05/02/18 08:57, Jan Beulich wrote:
>>> On 02.02.18 at 17:58,  wrote:
> To match all our other emulation handling.
>
> No functional change.
>
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> ---
>  xen/arch/x86/pv/Makefile  | 2 +-
>  xen/arch/x86/pv/{ro-page-fault.c => emul-ro-page-fault.c} | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>  rename xen/arch/x86/pv/{ro-page-fault.c => emul-ro-page-fault.c} (99%)
 When this file was introduced, iirc I had specifically asked to drop
 the pointless emul- prefix. If you want to make things consistent
 again, please instead drop the emul- prefixes of the other files.
>>> No.
>>>
>>> First of all, this file is the most recent to come into existence,
>>> around 3 months after the others.
>> Right - it was too late for me to realize the needlessly long names
>> in those earlier code movement patches.
> 
> That is a very subjective point of view which I don't agree with.
> 
> Naming is all to do with conveying meaning, and shorter isn't
> necessarily better.
> 
>>
>>> The point of naming things in a consistent fashion is for the benefit of
>>> humans, and having the emulation related functionality logically grouped
>>> is a benefit, not a detriment.
>> They're all quite well grouped now already by being in pv/.
> 
> That is not the relevant grouping.  Most of our emulation based logic
> has an emul- prefix and this file is an odd one out.
> 
> Naming the files without their emul- prefix leaves them with no context
> as to what they are doing.  "gate-op.c" or "invl-op.c" are far less
> obvious to their purpose than "emul-gate-op.c" and "emul-invl-op.c".

A little less obvious, yes, but far? Furthermore, taking the file you
want to rename into account, "emul-gate-op" indeed stands for
"emulate gate operations" (likewise for the inv and priv infixes),
while "emul-ro-page-fault" does _not_ mean "emulate r/o page
faults", as that would mean we emulate something to _produce_
r/o page faults. Instead we _handle_ r/o page faults here in
order to emulate certain write operations the guest does. In that
sense, the name ought to be e.g. "emul-write-op". That, however,
would break the connection to e.g. the "ptwr" abbreviation we
continue to use, so please don't take this as a suggestion.

IOW - I can live with your objection to rename the existing
emul-*.c files (despite thinking that for name completion they're
badly named; the emul- prefix would better have been an -emul
suffix, and leaving aside mere name length), but I continue to
dislike the new name of the file here.

>> Otherwise do you mean to also change e.g. gpr_switch.S to emul-gpr_switch.S?
> 
> This is extra special, and as soon as I can figure out how it actually
> works, I plan to replace it with something comprehensive.

;-)

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] common/gnttab: Introduce command line feature controls

2018-02-27 Thread George Dunlap
On 02/26/2018 06:02 PM, Andrew Cooper wrote:
> On 05/02/18 13:14, George Dunlap wrote:
>> On 02/05/2018 12:56 PM, Jan Beulich wrote:
>> On 05.02.18 at 12:55,  wrote:
 On 02/02/18 08:51, Jan Beulich wrote:
 On 01.02.18 at 15:38,  wrote:
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -916,6 +916,19 @@ Controls EPT related features.
>>  
>>  Specify which console gdbstub should use. See **console**.
>>  
>> +### gnttab
>> +> `= List of [ max_ver:, transitive= ]`
> I realize you don't want to change this as people already use it, but
> I'd still like to give my usual comment: I'd prefer if we could avoid
> introducing further underscore-containing (sub)options. I really don't
> understand why everyone does this: Dashes are easier to type on
> all keyboards I'm aware of, and there's no need to mimic C identifier
> names for command line options.
 I can introduce a max-ver alias if you insist, but dropping max_ver here
 is going to break users who took this patch for XSA-226.
>>> Hence the way I've worded my reply - I don't mean to insist on
>>> changing what you have, or the introduction of an alias. I merely
>>> wanted to give the comment, in the hope that it helps to avoid
>>> future underscores in command line option names.
>> FWIW I often end up looking at other options and name things similarly;
>> so making the documentation say "max-ver", but accepting both "max-ver"
>> and "max_ver", would probably make it more likely that future options
>> would start out as having a dash rather than an underscore.
>>
>> But it's just a suggestion; I wouldn't push for it.
> 
> So how to unblock this?  There are no concrete suggestions, and no
> concrete objections to the patch in its current form.

From what I understand, both Jan and I are saying, "We won't block the
patch if you re-submit with max_ver but Jan's other comments addressed;
but we would prefer it if max-ver could be used instead."

Given that you want to keep things compatible with the securty patch, I
see two options forward for you:

1. Re-submit it using only max_ver

2. Make it accept both max-ver and max_ver, but only document  max-ver.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 22/49] ARM: new VGIC: Implement virtual IRQ injection

2018-02-27 Thread Julien Grall



On 27/02/18 10:17, Andre Przywara wrote:

Hi,


Hi Andre,


On 12/02/18 18:59, Julien Grall wrote:

On 09/02/18 14:39, Andre Przywara wrote:

   /*
    * Iterate over the VM's list of mapped LPIs to find the one with a
    * matching interrupt ID and return a reference to the IRQ structure.
@@ -97,6 +123,204 @@ void vgic_put_irq(struct domain *d, struct
vgic_irq *irq)
   xfree(irq);
   }
   +/**
+ * vgic_target_oracle - compute the target vcpu for an irq
+ *
+ * @irq:    The irq to route. Must be already locked.
+ *
+ * Based on the current state of the interrupt (enabled, pending,
+ * active, vcpu and target_vcpu), compute the next vcpu this should be
+ * given to. Return NULL if this shouldn't be injected at all.
+ *
+ * Requires the IRQ lock to be held.
+ */
+static struct vcpu *vgic_target_oracle(struct vgic_irq *irq)
+{
+    ASSERT(spin_is_locked(>irq_lock));
+
+    /* If the interrupt is active, it must stay on the current vcpu */
+    if ( irq->active )
+    return irq->vcpu ? : irq->target_vcpu;

I am not sure to understand why you check whether irq->vcpu is NULL. If
the interrupt is active, then irq->vcpu should be NULL. Did I miss

 ^
not   you mean?


Yes not NULL.


anything?


Not if it has been explicitly activated via ISACTIVER. This is not
implemented in Xen at the moment, but would be in the future. So I like
to keep this in.


Oh, I missed that case. Thank you for the explanation :).

[...]



The list is not sorted on insertion because this is not necessary most
of the time. In fact the hardware VGIC will do the sorting (kind of)
within the LRs. So as long as we don't have more than 
IRQs in the list, sorting is a waste of time. Experiments in the past
showed that the number of used LRs is less than 4 almost every time. And
since 4 is the mostly used number of implemented LRs, we virtually never
need the sorting. So we avoid doing that on every insertion, instead
doing that only if it's necessary.


I can foresee quite a few issues with this choice on Xen:
 1) You compute the size of ap list in vgic_flush_lr_state() and take
lock on every IRQ one by one. A guest could be nasty and make that list
quite big by make IRQs pending but never "active" them (i.e read IAR).


Yeah, we could try to shortcut a bit here.


 2) This might be an issue while checking whether you need to deliver
an interrupt (vgic_vcpu_pending_irq) because the list is not sorted.


Most of the time the list is very short, storing one, two or actually no
IRQs. In the function where we check for pending IRQs we bail out as
soon as we found the first eligible interrupt. So sorting does not help
in the majority of cases.


As you say "most of the time". Malicious guest are unusual but just 
enough to keep busy both the hypervisor and the security team.




If you are really concerned about that list growing too long, we could
think about mitigations:
1) Try to avoid iterating the whole list while checking whether it needs
to be sorted.
2) Store a flag that notes if the list has already been sorted. As long
as we don't change anything, we don't need to sort again. Would be good
to test whether this is actually helpful. But we would need to keep this
flag up-to-date, which sounds a bit fragile to get right.
3) Switch to sort-on-insertion once we reached a certain number of IRQs
on the list, to mitigate DOS attacks from the guest. This should avoid
list iterations in hot paths, with IRQs disabled.

But all of these sound a bit hackish to me and just would spoil the very
clean and robust code we have today. Also I am not sure we can avoid
list iterations every time (for instance in prune_ap_list()). There is
an upper limit today (number of SPIs), so we might be happy with that
for now.
To be honest I would very much dislike changing the code at this point.
I believe a patch series afterwards would be better, also to actually
have some numbers on the impact of this.


More robust than the current vGIC yes. However, clean code is not an 
excuse to dismiss valid (but unusual) use case. So even if I quite like 
the new vGIC, I really don't want to deal with yet another security.


Thankfully passthrough case is not currently security supported (see 
SUPPORT.md). So we probably can defer, although I would like to keep 
track of known pitfalls of the new vGIC.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] fuzz/x86_emulate: fix bounds for input size

2018-02-27 Thread George Dunlap
On 02/23/2018 10:48 PM, Paul Semel wrote:
> The maximum size for the input size was set to INPUT_SIZE, which is actually
> the size of the data array inside the fuzz_corpus structure and so was not
> abling user (or AFL) to fill in the whole structure. Changing to
> sizeof(struct fuzz_corpus) correct this problem.
> 
> Signed-off-by: Paul Semel 

Hey Paul,

Thanks for the patch.  Looking a bit more at the code over the weekend,
I figured out what that BUILD_BUG_ON() is for -- in afl_harness.c, we
statically allocate a buffer of size INPUT_SIZE to hold the fuzz data.
The BUILD_BUG_ON() is to make sure that this buffer is always big enough
to hold the minimum input size.  And increasing the size accepted by
LLVMFuzzerTestOneInput() won't have any effect for anybody using
afl-harness, as the size passed in will never be larger than INPUT_SIZE.

Are you running afl-harness, or are you using fuzz-emul directly some
other way (e.g., through Google's fuzzing service)?

 -George


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  1   2   >