[Xen-devel] [ovmf test] 141845: all pass - PUSHED

2019-09-26 Thread osstest service owner
flight 141845 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/141845/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf c70fef962e804eba483512b64ec24169871060be
baseline version:
 ovmf c671c9106c52f4bd000ab8857eecd19574c70dbd

Last test of basis   141798  2019-09-25 03:09:53 Z2 days
Testing same since   141845  2019-09-26 03:19:02 Z1 days1 attempts


People who touched revisions under test:
  Dong, Eric 
  Eric Dong 
  Gris87 
  Liming Gao 
  Zhichao Gao 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   c671c9106c..c70fef962e  c70fef962e804eba483512b64ec24169871060be -> 
xen-tested-master

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 35/47] xen/sched: make vcpu_wake() and vcpu_sleep() core scheduling aware

2019-09-26 Thread Jürgen Groß

On 25.09.19 15:07, Jürgen Groß wrote:

On 24.09.19 13:55, Jan Beulich wrote:

On 14.09.2019 10:52, Juergen Gross wrote:

@@ -765,16 +774,22 @@ void vcpu_wake(struct vcpu *v)
  {
  unsigned long flags;
  spinlock_t *lock;
+    struct sched_unit *unit = v->sched_unit;
  TRACE_2D(TRC_SCHED_WAKE, v->domain->domain_id, v->vcpu_id);
-    lock = unit_schedule_lock_irqsave(v->sched_unit, );
+    lock = unit_schedule_lock_irqsave(unit, );
  if ( likely(vcpu_runnable(v)) )
  {
  if ( v->runstate.state >= RUNSTATE_blocked )
  vcpu_runstate_change(v, RUNSTATE_runnable, NOW());
-    sched_wake(vcpu_scheduler(v), v->sched_unit);
+    sched_wake(vcpu_scheduler(v), unit);


Is this correct / necessary when the unit is not asleep as a whole?
After all the corresponding sched_sleep() further up is called
conditionally only.


Oh, indeed. Will change that.


It turned out this is not so easy at it seemed.

I encountered dom0 boot hangs with making the call conditional, even
when running in cpu scheduling mode. I guess the reason is that a vcpu
can call do_poll() which will try to put itself to sleep and in some
cases call vcpu_wake() in case the condition already changed. In that
case we need the sched_wake() call even if the unit is still running.



@@ -2067,9 +2160,29 @@ static void sched_slave(void)
  now = NOW();
+    v = unit2vcpu_cpu(prev, cpu);
+    if ( v && v->force_context_switch )
+    {
+    v = sched_force_context_switch(vprev, v, cpu, now);
+
+    if ( v )
+    {
+    pcpu_schedule_unlock_irq(lock, cpu);


I can't figure what it is that guarantees that this unlock isn't
going to be followed ...


+    sched_context_switch(vprev, v, false, now);
+    }
+
+    do_softirq = true;
+    }
+
  if ( !prev->rendezvous_in_cnt )
  {
  pcpu_schedule_unlock_irq(lock, cpu);


... by another unlock here. Or wait - is sched_context_switch()
(and perhaps other functions involved there) lacking a "noreturn"
annotation?


Indeed it is. Like context_switch() today. :-)

I'll annotate the functions.


And now I discovered that on ARM context_switch is _not_ "noreturn".
So thanks for noticing that problem. I have fixed it in order to
avoid a latent problem in case we want to support core scheduling on
ARM some day (and yes: that would only have been a problem in core
mode).


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 0/6] Code of Conduct + Extra Guides and Best Practices

2019-09-26 Thread Lars Kurth
From: Lars Kurth 

This series proposes a concrete version of the Xen Project
CoC based on v1.4 of the Contributor Covenant. See [1]

It contains *ALL* the portions I was still going to add.
I spent a bit of time on word-smithing, but I am not a native English speaker
So there is probably time for improvement

The series also reflects the discussion in [2] and some private
discussions on IRC to identify initial members of the Xen
Project’s CoC team.

For convenience of review and in line with other policy documents
I created a git repository at [3]. This series can be found at [5].

[1] https://www.contributor-covenant.org/version/1/4/code-of-conduct.md
[2] https://xen.markmail.org/thread/56ao2gyhpltqmrew 
[3] http://xenbits.xen.org/gitweb/?p=people/larsk/code-of-conduct.git;a=summary
[4] 
https://www.slideshare.net/xen_com_mgr/xpdds19-keynote-patch-review-for-nonmaintainers-george-dunlap-citrix-systems-uk-ltd
[5] 
http://xenbits.xen.org/gitweb/?p=people/larsk/code-of-conduct.git;a=shortlog;h=refs/heads/CoC-v2

Changes since v1
* Code of Conduct 
  Only whitespace changes

* Added Communication Guide
  Contains values and a process based on advice and mediation in case of issues
  This is the primary portal for 

* Added Code Review Guide
  Which is based on [4] with some additions for completeness
  It primarily sets expectations and anything communication related is removed

* Added guide on Communication Best Practice
  Takes the communication section from [4] and expands on it with more examples
  and cases. This is probably where we may need some discussion

* Added document on Resolving Disagreement
  A tiny bit of theory to set the scene
  It covers some common cases of disagreements and how we may approach them
  Again, this probably needs some discussion

Cc: minios-de...@lists.xenproject.org
Cc: xen-...@lists.xenproject.org
Cc: win-pv-de...@lists.xenproject.org
Cc: mirageos-de...@lists.xenproject.org
Cc: committ...@xenproject.org

Lars Kurth (6):
  Import v1.4 of Contributor Covenant CoC
  Xen Project Code of Conduct
  Add Communication Guide
  Add Code Review Guide
  Add guide on Communication Best Practice
  Added Resolving Disagreement

-- 
2.13.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 1/6] Import v1.4 of Contributor Covenant CoC

2019-09-26 Thread Lars Kurth
From: Lars Kurth 

Signed-off-by: Lars Kurth 
---
Cc: minios-de...@lists.xenproject.org
Cc: xen-...@lists.xenproject.org
Cc: win-pv-de...@lists.xenproject.org
Cc: mirageos-de...@lists.xenproject.org
Cc: committ...@xenproject.org
---
 code-of-conduct.md | 76 ++
 1 file changed, 76 insertions(+)
 create mode 100644 code-of-conduct.md

diff --git a/code-of-conduct.md b/code-of-conduct.md
new file mode 100644
index 000..81b217c
--- /dev/null
+++ b/code-of-conduct.md
@@ -0,0 +1,76 @@
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+In the interest of fostering an open and welcoming environment, we as
+contributors and maintainers pledge to make participation in our project and
+our community a harassment-free experience for everyone, regardless of age, 
body
+size, disability, ethnicity, sex characteristics, gender identity and 
expression,
+level of experience, education, socio-economic status, nationality, personal
+appearance, race, religion, or sexual identity and orientation.
+
+## Our Standards
+
+Examples of behavior that contributes to creating a positive environment
+include:
+
+* Using welcoming and inclusive language
+* Being respectful of differing viewpoints and experiences
+* Gracefully accepting constructive criticism
+* Focusing on what is best for the community
+* Showing empathy towards other community members
+
+Examples of unacceptable behavior by participants include:
+
+* The use of sexualized language or imagery and unwelcome sexual attention or
+  advances
+* Trolling, insulting/derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or electronic
+  address, without explicit permission
+* Other conduct which could reasonably be considered inappropriate in a
+  professional setting
+
+## Our Responsibilities
+
+Project maintainers are responsible for clarifying the standards of acceptable
+behavior and are expected to take appropriate and fair corrective action in
+response to any instances of unacceptable behavior.
+
+Project maintainers have the right and responsibility to remove, edit, or
+reject comments, commits, code, wiki edits, issues, and other contributions
+that are not aligned to this Code of Conduct, or to ban temporarily or
+permanently any contributor for other behaviors that they deem inappropriate,
+threatening, offensive, or harmful.
+
+## Scope
+
+This Code of Conduct applies within all project spaces, and it also applies 
when
+an individual is representing the project or its community in public spaces.
+Examples of representing a project or community include using an official
+project e-mail address, posting via an official social media account, or acting
+as an appointed representative at an online or offline event. Representation of
+a project may be further defined and clarified by project maintainers.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be
+reported by contacting the project team at [INSERT EMAIL ADDRESS]. All
+complaints will be reviewed and investigated and will result in a response that
+is deemed necessary and appropriate to the circumstances. The project team is
+obligated to maintain confidentiality with regard to the reporter of an 
incident.
+Further details of specific enforcement policies may be posted separately.
+
+Project maintainers who do not follow or enforce the Code of Conduct in good
+faith may face temporary or permanent repercussions as determined by other
+members of the project's leadership.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant][homepage], 
version 1.4,
+available at 
https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
+
+[homepage]: https://www.contributor-covenant.org
+
+For answers to common questions about this code of conduct, see
+https://www.contributor-covenant.org/faq
-- 
2.13.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC for-4.13 00/10] xen/arm: XSA-201 and XSA-263 fixes

2019-09-26 Thread Jürgen Groß

On 26.09.19 20:37, Julien Grall wrote:

Hi all,

This patch series aims to fix two bugs in the entry path from the guest:
 1) Make sure that SSBD workaround is enabled before executing any 
hypervisor code
 2) Avoid guest state corruption when an virtual SError is received

The full series is candidate for Xen 4.13. Without it, the hypervisor would
not be properly protected against SSB vulnerability and the guest state may
get corrupted if an SError is received.

This is in RFC state because the entry code is now quite different and
arm32 changes are not yet implemented. I will modify arm32 once we agreed
on the approach.

Cheers,

Cc: jgr...@suse.com


I think the explanation of the motivation qualifies the series to be
marked as a blocker for 4.13.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 2/6] Xen Project Code of Conduct

2019-09-26 Thread Lars Kurth
From: Lars Kurth 

Specific changes to the baseline:
* Replace list of positive behaviors with link to separate process
* Replace maintainers with project leadership
  (except in our pledge where maintainers is more appropriate)
* Add 'of all sub-projects' to clarify scope of CoC
* Rename Enforcement
* Replace "project team" with "Conduct Team members"
* Add e-mail alias
* Add section on contacting individual Conduct Team members
* Add section on Conduct Team members

Signed-off-by: Lars Kurth 
---
Chagges since v1:
* Addressed newline changes

Cc: minios-de...@lists.xenproject.org
Cc: xen-...@lists.xenproject.org
Cc: win-pv-de...@lists.xenproject.org
Cc: mirageos-de...@lists.xenproject.org
Cc: committ...@xenproject.org
---
 code-of-conduct.md | 45 -
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/code-of-conduct.md b/code-of-conduct.md
index 81b217c..5d6d1d5 100644
--- a/code-of-conduct.md
+++ b/code-of-conduct.md
@@ -1,4 +1,4 @@
-# Contributor Covenant Code of Conduct
+# Xen Project Code of Conduct
 
 ## Our Pledge
 
@@ -11,14 +11,10 @@ appearance, race, religion, or sexual identity and 
orientation.
 
 ## Our Standards
 
-Examples of behavior that contributes to creating a positive environment
-include:
-
-* Using welcoming and inclusive language
-* Being respectful of differing viewpoints and experiences
-* Gracefully accepting constructive criticism
-* Focusing on what is best for the community
-* Showing empathy towards other community members
+We believe that a Code of Conduct can help create a harassment-free 
environment,
+but is not sufficient to create a welcoming environment on its own: guidance on
+creating a welcoming environment, how to communicate in an effective and 
friendly
+way, etc. can be found [here](communication-guide.md).
 
 Examples of unacceptable behavior by participants include:
 
@@ -33,11 +29,11 @@ Examples of unacceptable behavior by participants include:
 
 ## Our Responsibilities
 
-Project maintainers are responsible for clarifying the standards of acceptable
+Project leadership team members are responsible for clarifying the standards 
of acceptable
 behavior and are expected to take appropriate and fair corrective action in
 response to any instances of unacceptable behavior.
 
-Project maintainers have the right and responsibility to remove, edit, or
+Project leadership team members have the right and responsibility to remove, 
edit, or
 reject comments, commits, code, wiki edits, issues, and other contributions
 that are not aligned to this Code of Conduct, or to ban temporarily or
 permanently any contributor for other behaviors that they deem inappropriate,
@@ -45,26 +41,41 @@ threatening, offensive, or harmful.
 
 ## Scope
 
-This Code of Conduct applies within all project spaces, and it also applies 
when
+This Code of Conduct applies within all project spaces of all sub-projects, 
and it also applies when
 an individual is representing the project or its community in public spaces.
 Examples of representing a project or community include using an official
 project e-mail address, posting via an official social media account, or acting
 as an appointed representative at an online or offline event. Representation of
-a project may be further defined and clarified by project maintainers.
+a project may be further defined and clarified by the project leadership.
 
-## Enforcement
+## What to do if you witness or are subject to unacceptable behavior
 
 Instances of abusive, harassing, or otherwise unacceptable behavior may be
-reported by contacting the project team at [INSERT EMAIL ADDRESS]. All
+reported by contacting Conduct Team members at cond...@xenproject.org. All
 complaints will be reviewed and investigated and will result in a response that
-is deemed necessary and appropriate to the circumstances. The project team is
+is deemed necessary and appropriate to the circumstances. Conduct Team members 
are
 obligated to maintain confidentiality with regard to the reporter of an 
incident.
 Further details of specific enforcement policies may be posted separately.
 
-Project maintainers who do not follow or enforce the Code of Conduct in good
+If you have concerns about any of the members of the conduct@ alias,
+you are welcome to contact precisely the Conduct Team member(s) of
+your choice.
+
+Project leadership team members who do not follow or enforce the Code of 
Conduct in good
 faith may face temporary or permanent repercussions as determined by other
 members of the project's leadership.
 
+## Conduct Team members
+Conduct Team members are project leadership team members from any
+sub-project. The current list of Conduct Team members is:
+* Lars Kurth 
+* George Dunlap 
+* Ian Jackson 
+
+Conduct Team members are changed by proposing a change to this document,
+posted on all sub-project lists, followed by a formal global vote as outlined
+[here]: 

[Xen-devel] [PATCH v2 5/6] Add guide on Communication Best Practice

2019-09-26 Thread Lars Kurth
From: Lars Kurth 

This guide covers the bulk on Best Practice related to code review
It primarily focusses on code review interactions
It also covers how to deal with Misunderstandings and Cultural
Differences

Signed-off-by: Lars Kurth 
---
Cc: minios-de...@lists.xenproject.org
Cc: xen-...@lists.xenproject.org
Cc: win-pv-de...@lists.xenproject.org
Cc: mirageos-de...@lists.xenproject.org
Cc: committ...@xenproject.org
---
 communication-practice.md | 410 ++
 1 file changed, 410 insertions(+)
 create mode 100644 communication-practice.md

diff --git a/communication-practice.md b/communication-practice.md
new file mode 100644
index 000..db9a5ef
--- /dev/null
+++ b/communication-practice.md
@@ -0,0 +1,410 @@
+# Communication Best Practice
+
+This guide provides communication Best Practice that helps you in
+* Using welcoming and inclusive language
+* Keeping discussions technical and actionable
+* Being respectful of differing viewpoints and experiences
+* Being aware of your own and counterpart’s communication style and culture
+* Show empathy towards other community members
+
+## Code reviews for **reviewers** and **patch authors**
+
+Before embarking on a code review, it is important to remember that
+* A poorly executed code review can hurt the contributors feeling, even when a 
reviewer
+  did not intend to do so. Feeling defensive is a normal reaction to a 
critique or feedback.
+  A reviewer should be aware of how the pitch, tone, or sentiment of their 
comments
+  could be interpreted by the contributor. The same applies to responses of an 
author
+  to the reviewer.
+* When reviewing someone's code, you are ultimately looking for issues. A good 
code
+  reviewer is able to mentally separate finding issues from articulating code 
review
+  comments in a constructive and positive manner: depending on your 
personality this
+  can be **difficult** and you may need to develop a technique that works for 
you.
+* As software engineers we like to be proud of the solutions we came up with. 
This can
+  make it easy to take another people’s criticism personally. Always remember 
that it is
+  the code that is being reviewed, not you as a person.
+* When you receive code review feedback, please be aware that we have reviewers
+  from different backgrounds, communication styles and cultures. Although we 
all trying
+  to create a productive, welcoming and agile environment, we do not always 
succeed.
+
+### Express appreciation
+As the nature of code review to find bugs and possible issues, it is very easy 
for
+reviewers to get into a mode of operation where the patch review ends up being 
a list
+of issues, not mentioning what is right and well done. This can lead to the 
code
+submitter interpreting your feedback in a negative way.
+
+The opening of a code review provides an opportunity to address this and also 
sets the
+tone for the rest of the code review. Starting **every** review on a positive 
note, helps
+set the tone for the rest of the review.
+
+For an initial patch, you can use phrases such as
+> Thanks for the patch
+> Thanks for doing this
+
+For further revisions within a review, phrases such as
+> Thank you for addressing the last set of changes
+
+If you believe the code was good, it is good practice to highlight this by 
using phrases
+such as
+> Looks good, just a few comments
+> The changes you have made since the last version look good
+
+If you think there were issues too many with the code to use one of the 
phrases,
+you can still start on a positive note, by for example saying
+> I think this is a good change
+> I think this is a good feature proposal
+
+It is also entirely fine to highlight specific changes as good. The best place 
to
+do this, is at top of a patch, as addressing code review comments typically 
requires
+a contributor to go through the list of things to address and an in-lined 
positive
+comment is likely to break that workflow.
+
+You should also consider, that if you review a patch of an experienced
+contributor phrases such as *Thanks for the patch* could come across as
+patronizing, while using *Thanks for doing this* is less likely to be 
interpreted
+as such.
+
+Appreciation should also be expressed by patch authors when asking for 
clarifications
+to a review or responding to questions. A simple
+> Thank you for your feedback
+> Thank you for your reply
+> Thank you XXX!
+
+is normally sufficient.
+
+### Avoid opinion: stick to the facts
+The way how a reviewer expresses feedback, has a big impact on how the author
+perceives the feedback. Key to this is what we call **stick to the facts**.  
The same is
+true when a patch author is responding to a comment from a reviewer.
+
+One of our maintainers has been studying Mandarin for several years and has 
come
+across the most strongly-worded dictionary entry
+[he has ever seen](https://youtu.be/ehZvBmrLRwg?t=834). This example
+illustrates the problem of using 

[Xen-devel] [PATCH v2 3/6] Add Communication Guide

2019-09-26 Thread Lars Kurth
From: Lars Kurth 

This document is a portal page that lays out our gold standard,
best practices for some common situations and mechanisms to help
resolve issues that can have a negative effect on our community.

Detail is covered in subsequent documents

Signed-off-by: Lars Kurth 
---
Cc: minios-de...@lists.xenproject.org
Cc: xen-...@lists.xenproject.org
Cc: win-pv-de...@lists.xenproject.org
Cc: mirageos-de...@lists.xenproject.org
Cc: committ...@xenproject.org
---
 communication-guide.md | 67 ++
 1 file changed, 67 insertions(+)
 create mode 100644 communication-guide.md

diff --git a/communication-guide.md b/communication-guide.md
new file mode 100644
index 000..4bcf440
--- /dev/null
+++ b/communication-guide.md
@@ -0,0 +1,67 @@
+# Communication Guide
+
+We believe that our [Code of Conduct] (code-of-conduct.md) can help create a
+harassment-free environment, but is not sufficient to create a welcoming
+environment on its own. We can all make mistakes: when we do, we take
+responsibility for them and try to improve.
+
+This document lays out our gold standard, best practices for some common
+situations and mechanisms to help resolve issues that can have a
+negative effect on our community.
+
+## Goal
+
+We want a productive, welcoming and agile community that can welcome new
+ideas in a complex technical field which is able to reflect on and improve how 
we
+work.
+
+## Communication & Handling Differences in Opinions
+
+Examples of behavior that contributes to creating a positive environment
+include:
+* Use welcoming and inclusive language
+* Keep discussions technical and actionable
+* Be respectful of differing viewpoints and experiences
+* Be aware of your own and counterpart’s communication style and culture
+* Gracefully accept constructive criticism
+* Focus on what is best for the community
+* Show empathy towards other community members
+* Resolve differences in opinion effectively
+
+## Getting Help
+
+When developing code collaboratively, technical discussion and disagreements
+are unavoidable. Our contributors come from different countries and cultures,
+are driven by different goals and take pride in their work and in their point
+of view. This invariably can lead to lengthy and unproductive debate,
+followed by indecision, sometimes this can impact working relationships
+or lead to other issues that can have a negative effect on our community.
+
+To minimize such issue, we provide a 3-stage process
+* Self-help as outlined in this document
+* Ability to ask for an independent opinion or help in private
+* Mediation between parties which disagree. In this case a neutral community
+  member assists the disputing parties resolve the issues or will work with the
+  parties such that they can improve future interactions.
+
+If you need and independent opinion or help, feel free to contact
+mediat...@xenproject.org. The team behind mediation@ is made up of the
+same community members as those listed in the Conduct Team: see
+[Code of Conduct](code-of-conduct.md). In addition, team members are obligated
+to maintain confidentiality with regard discussions that take place. If you
+have concerns about any of the members of the mediation@ alias, you are
+welcome to contact precisely the team member(s) of your choice. In this case,
+please make certain that you highlight the nature of a request by making sure 
that
+either help or mediation is mentioned in the e-mail subject or body.
+
+## Specific Topics and Best Practice
+
+* [Code Review Guide] (code-review-guide.md):
+  Essential reading for code reviewers and contributors
+* [Communication Best Practice] (communication-practice.md):
+  This guide covers communication guidelines for code reviewers and reviewees. 
It
+  should help you create self-awareness, anticipate, avoid  and help resolve
+  communication issues.
+* [Resolving Disagreement] (resolving-disagreement.md):
+  This guide lays out common situations that can lead to dead-lock and shows 
common
+  patterns on how to avoid and resolve issues.
-- 
2.13.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 4/6] Add Code Review Guide

2019-09-26 Thread Lars Kurth
From: Lars Kurth 

This document highlights what reviewers such as maintainers and committers look
for when reviewing code. It sets expectations for code authors and provides
a framework for code reviewers.

Signed-off-by: Lars Kurth 
---
Cc: minios-de...@lists.xenproject.org
Cc: xen-...@lists.xenproject.org
Cc: win-pv-de...@lists.xenproject.org
Cc: mirageos-de...@lists.xenproject.org
Cc: committ...@xenproject.org
---
 code-review-guide.md | 125 +++
 1 file changed, 125 insertions(+)
 create mode 100644 code-review-guide.md

diff --git a/code-review-guide.md b/code-review-guide.md
new file mode 100644
index 000..8639431
--- /dev/null
+++ b/code-review-guide.md
@@ -0,0 +1,125 @@
+# Code Review Guide
+
+This document highlights what reviewers such as maintainers and committers look
+for when reviewing your code. It sets expectations for code authors and 
provides
+a framework for code reviewers.
+
+This document does **not cover** the following topics:
+* [Communication Best Practice](communication-practice.md)
+* [Resolving Disagreement](resolving-disagreement.md)
+* [Patch Submission 
Workflow](https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches)
+* [Managing Patch Submission with 
Git](https://wiki.xenproject.org/wiki/Managing_Xen_Patches_with_Git)
+
+## What we look for in Code Reviews
+When performing a code review, reviewers typically look for the following 
things
+
+### Is the change necessary to accomplish the goals?
+* Is it clear what the goals are?
+* Do we need to make a change, or can the goals be met with existing
+  functionality?
+
+### Architecture / Interface
+* Is this the best way to solve the problem?
+* Is this the right part of the code to modify?
+* Is this the right level of abstraction?
+* Is the interface general enough? Too general? Forward compatible?
+
+### Functionality
+* Does it do what it’s trying to do?
+* Is it doing it in the most efficient way?
+* Does it handle all the corner / error cases correctly?
+
+### Maintainability / Robustness
+* Is the code clear? Appropriately commented?
+* Does it duplicate another piece of code?
+* Does the code make hidden assumptions?
+* Does it introduce sections which need to be kept **in sync** with other 
sections?
+* Are there other **traps** someone modifying this code might fall into?
+
+**Note:** Sometimes you will work in areas which have identified 
maintainability
+and/or robustness issues. In such cases, maintainers may ask you to make 
additional
+changes, such that your submitted code does not make things worse or point you
+to other patches are already being worked on.
+
+### System properties
+In some areas of the code, system properties such as
+* Code size
+* Performance
+* Scalability
+* Latency
+* Complexity
+* 
+are also important during code reviews.
+
+### Style
+* Comments, carriage returns, **snuggly braces**, 
+* See 
[CODING_STYLE](https://xenbits.xenproject.org/gitweb/?p=xen.git;a=blob;f=CODING_STYLE)
+  and 
[tools/libxl/CODING_STYLE](https://xenbits.xenproject.org/gitweb/?p=xen.git;a=blob;f=tools/libxl/CODING_STYLE)
+* No extraneous whitespace changes
+
+### Documentation and testing
+* If there is pre-existing documentation in the tree, such as man pages, design
+  documents, etc. a contributor may be asked to update the documentation 
alongside
+  the change. Documentation is typically present in the
+  [docs](https://xenbits.xen.org/gitweb/?p=xen.git;a=tree;f=docs) folder.
+* When adding new features that have an impact on the end-user,
+  a contributor should include an update to the
+  [SUPPORT.md](https://xenbits.xen.org/gitweb/?p=xen.git;a=tree;f=docs) file.
+  Typically, more complex features require several patch series before it is 
ready to be
+  advertised in SUPPORT.md
+* When adding new features, a contributor may be asked to provide tests or
+  ensure that existing tests pass
+
+ Testing for the Xen Project Hypervisor
+Tests are typically located in one of the following directories
+* **Unit tests**: 
[tools/tests](https://xenbits.xenproject.org/gitweb/?p=xen.git;a=tree;f=tools/tests)
+or 
[xen/test](https://xenbits.xenproject.org/gitweb/?p=xen.git;a=tree;f=xen/test)
+  Unit testing is hard for a system like Xen and typically requires building a 
subsystem of
+  your tree. If your change can be easily unit tested, you should consider 
submitting tests
+  with your patch.
+* **Build and smoke test**: see [Xen GitLab 
CI](https://gitlab.com/xen-project/xen/pipelines)
+  Runs build tests for a combination of various distros and compilers against 
changes
+  committed to staging. Developers can join as members and test their 
development
+  branches **before** submitting a patch.
+* **XTF tests** (microkernel-based tests): see 
[XTF](https://xenbits.xenproject.org/docs/xtf/)
+  XTF has been designed to test interactions between your software and 
hardware.
+  It is a very useful tool for testing low level functionality and is 

[Xen-devel] [PATCH v2 6/6] Added Resolving Disagreement

2019-09-26 Thread Lars Kurth
From: Lars Kurth 

This guide provides Best Practice on identifying and resolving
common classes of disagreement

Signed-off-by: Lars Kurth 
--
Cc: minios-de...@lists.xenproject.org
Cc: xen-...@lists.xenproject.org
Cc: win-pv-de...@lists.xenproject.org
Cc: mirageos-de...@lists.xenproject.org
Cc: committ...@xenproject.org
---
 resolving-disagreement.md | 146 ++
 1 file changed, 146 insertions(+)
 create mode 100644 resolving-disagreement.md

diff --git a/resolving-disagreement.md b/resolving-disagreement.md
new file mode 100644
index 000..19aedbe
--- /dev/null
+++ b/resolving-disagreement.md
@@ -0,0 +1,146 @@
+# Resolving Disagreement
+
+This guide provides Best Practice on resolving disagreement, such as
+* Gracefully accept constructive criticism
+* Focus on what is best for the community
+* Resolve differences in opinion effectively
+
+## Theory: Paul Graham's hierarchy of disagreement
+Paul Graham proposed a **disagreement hierarchy** in a 2008 essay 
+**[How to Disagree](http://www.paulgraham.com/disagree.html)**, putting types 
of
+arguments into a seven-point hierarchy and observing that *moving up the
+disagreement hierarchy makes people less mean, and will make most of them 
happier*.
+Graham also suggested that the hierarchy can be thought of as a pyramid, as 
the 
+highest forms of disagreement are rarer.
+
+| ![Graham's Hierarchy of 
Disagreemen](https://upload.wikimedia.org/wikipedia/commons/a/a3/Graham%27s_Hierarchy_of_Disagreement-en.svg)
 |
+| *A representation of Graham's hierarchy of disagreement from 
[Loudacris](http://www.createdebate.com/user/viewprofile/Loudacris) modified by 
[Rocket000](https://en.wikipedia.org/wiki/User:Rocket000)* |
+
+In the context of the Xen Project we strive to **only use the top half** of 
the hierarchy.
+**Name-calling** and **Ad hominem** arguments are not acceptable within the Xen
+Project.
+
+## Issue: Scope creep
+
+One thing which occasionally happens during code review is that a code reviewer
+asks or appears to ask the author of patch to implement additional 
functionality.
+
+This could take for example the form of
+> Do you think it would be useful for the code to do XXX? 
+> I can imagine a user wanting to do YYY (and XXX would enable this)
+
+That potentially adds additional work for the code author, which they may not 
have
+the time to perform. It is good practice for authors to consider such a 
request in terms of
+* Usefulness to the user
+* Code churn, complexity or impact on other system properties
+* Extra time to implement and report back to the reviewer
+
+If you believe that the impact/cost is too high, report back to the reviewer. 
To resolve
+this, it is advisable to
+* Report your findings
+* And then check whether this was merely an interesting suggestion, or 
something the
+reviewer feels more strongly about
+
+In the latter case, there are typically several common outcomes
+* The **author and reviewer agree** that the suggestion should be implemented
+* The **author and reviewer agree** that it may make sense to defer 
implementation
+* The **author and reviewer agree** that it makes no sense to implement the 
suggestion
+
+The author of a patch would typically suggest their preferred outcome, for 
example
+> I am not sure it is worth to implement XXX
+> Do you think this could be done as a separate patch in future?
+
+In cases, where no agreement can be found, the best approach would be to get an
+independent opinion from another maintainer or the project's leadership team.
+
+## Issue: [Bikeshedding](https://en.wiktionary.org/wiki/bikeshedding)
+
+Occasionally discussions about unimportant but easy-to-grasp issues can lead to
+prolonged and unproductive discussion. The best way to approach this is to
+try and **anticipate** bikeshedding and highlight it as such upfront. However, 
the
+format of a code review does not always lend itself well to this approach, 
except
+for highlighting it in the cover letter of a patch series.
+
+However, typically Bikeshedding issues are fairly easy to recognize in a code 
review,
+as you will very quickly get different reviewers providing differing opinions. 
In this case
+it is best for the author or a reviewer to call out the potential bikeshedding 
issue using
+something like
+
+> Looks we have a bikeshedding issue here
+> I think we should call a quick vote to settle the issue
+
+Our governance provides the mechanisms of [informal 
votes](https://xenproject.org/developers/governance/#informal-votes-or-surveys) 
or
+[lazy voting](https://xenproject.org/developers/governance/#lazyconsensus) 
which lend
+themselves well to resolve such issues.
+
+## Issue: Small functional issues
+
+The most common area of disagreements which happen in code reviews, are 
differing
+opinions on whether small functional issues in a patch series have to be 
resolved or
+not before the code is ready to be submitted. Such disagreements are typically 
caused
+by 

[Xen-devel] [qemu-mainline test] 141824: regressions - FAIL

2019-09-26 Thread osstest service owner
flight 141824 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/141824/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-i386 11 guest-startfail REGR. vs. 140282
 test-amd64-i386-freebsd10-amd64 11 guest-start   fail REGR. vs. 140282
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 debian-hvm-install fail 
REGR. vs. 140282
 test-amd64-i386-qemuu-rhel6hvm-amd 10 redhat-install fail REGR. vs. 140282
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 10 debian-hvm-install fail REGR. vs. 
140282
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail REGR. vs. 140282
 test-amd64-amd64-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 
140282
 test-amd64-i386-qemuu-rhel6hvm-intel 10 redhat-install   fail REGR. vs. 140282
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 debian-hvm-install fail 
REGR. vs. 140282
 test-amd64-i386-xl-qemuu-debianhvm-amd64 10 debian-hvm-install fail REGR. vs. 
140282
 test-amd64-amd64-qemuu-nested-amd 10 debian-hvm-install  fail REGR. vs. 140282
 test-amd64-amd64-xl-qemuu-win7-amd64 10 windows-install  fail REGR. vs. 140282
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 10 debian-hvm-install fail REGR. 
vs. 140282
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 10 debian-hvm-install fail REGR. 
vs. 140282
 test-amd64-amd64-qemuu-nested-intel 10 debian-hvm-install fail REGR. vs. 140282
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 10 debian-hvm-install fail 
REGR. vs. 140282
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail REGR. vs. 140282
 test-amd64-i386-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 140282
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 10 debian-hvm-install fail 
REGR. vs. 140282
 test-amd64-i386-xl-qemuu-win7-amd64 10 windows-install   fail REGR. vs. 140282
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-install  fail REGR. vs. 140282
 test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install   fail REGR. vs. 140282

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 140282
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 140282
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 

Re: [Xen-devel] Latest development (master) Xen fails to boot on HP ProLiant DL20 GEN10

2019-09-26 Thread Roman Shaposhnik
On Thu, Sep 26, 2019 at 12:44 AM Jan Beulich  wrote:
>
> On 26.09.2019 00:31, Roman Shaposhnik wrote:
> > Jan, Roger, thank you so much for the initial ideas. I tried a few of
> > those and here's where I am.
> >
> > First of all, it is definitely related to CPU bring up. Adding
> > cpuidle=0 to xen command line made Xen boot.
> >
> > Then, a good friend of mine (who you may know from ancient Xen days
> > ;-)) suggested that this could be related to this:
> >  https://wiki.xenproject.org/wiki/Xen_power_management
> > so I went to the BIOS settings and quite to my surprise all of them
> > were grayed out (not tweakable).
> >
> > The only one that wasn't was 2xAPIC support. So just for kicks -- I
> > disabled that.
> >
> > That, in turn, made Xen boot even without cpuidle=0. I'm attaching that log.
>
> Interesting, but unfortunately this particular log is of no real use
> for investigation of the issue (other than knowing the CPU model). I
> also notice it's a 4.12.0 log, when your original report was against
> latest master.

Understood. But this brings us back to: if I don't get Xen booting,
I don't think I can capture the logs. This is a rackable server without
anything like a serial port, etc.

I suppose the best I can do is to capture a video of Xen failing to boot?

Any other ideas?

> > So I guess at this point, you could say that I have a functional
> > system, but I'm curious whether you guys would be interested to look
> > into 2xAPIC situation.
>
> Of course we do. As a next step I'd suggest reverting the BIOS settings
> change you did, and instead using the "x2apic=0" Xen command line option.
>
> And then we of course need a complete boot log (as requested earlier) of
> a problem case.
>
> Further I'd suggest moving away from the black-and-white "cpuidle="
> option, and instead limiting use of deep C states ("max_cstate="). I
> wouldn't be surprised if this was the issue; we'd then have to first
> of all go through errata for the part your system is using.

Will do these experiments tomorrow and report back.

Still -- please let me know how can I capture the log without serial, etc.

Thanks,
Roman.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable test] 141822: tolerable FAIL - PUSHED

2019-09-26 Thread osstest service owner
flight 141822 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/141822/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 141750
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 141750
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 141750
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 141750
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 141750
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 141750
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 141750
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 141750
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 141750
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 xen  f93abf0315efef861270c25d83c8047fd6a54ec4
baseline version:
 xen  c4a5656b2ef3d29bb8acfb5342e786a5b9578018

Last test of basis   141750  2019-09-24 06:12:11 Z2 days
Testing same since   141822  2019-09-25 14:59:55 Z1 days1 attempts


People who 

[Xen-devel] [PATCH v6 0/8] dom0less device assignment

2019-09-26 Thread Stefano Stabellini
Hi all,

This small patch series adds device assignment support to Dom0less.
The last patch is the documentation.

Cheers,

Stefano


The following changes since commit f93abf0315efef861270c25d83c8047fd6a54ec4:

  xen: sched: Fix Arm build after commit f855dd9625 (2019-09-24 18:58:55 +0100)

are available in the Git repository at:

  http://xenbits.xenproject.org/git-http/people/sstabellini/xen-unstable.git 

for you to fetch changes up to 7852e6acc1b28e00fc4f7da0bc16315b6da81d27:

  xen/arm: add dom0-less device assignment info to docs (2019-09-25 19:03:10 
-0700)


Stefano Stabellini (8):
  xen/arm: introduce handle_device_interrupts
  xen/arm: export device_tree_get_reg and device_tree_get_u32
  xen/arm: introduce kinfo->phandle_gic
  xen/arm: copy dtb fragment to guest dtb
  xen/arm: assign devices to boot domains
  xen/arm: handle "multiboot,device-tree" compatible nodes
  xen/arm: introduce nr_spis
  xen/arm: add dom0-less device assignment info to docs

 docs/misc/arm/device-tree/booting.txt |  44 +++-
 docs/misc/arm/passthrough.txt | 101 +
 xen/arch/arm/bootfdt.c|  10 +-
 xen/arch/arm/domain_build.c   | 398 +-
 xen/arch/arm/kernel.c |  14 +-
 xen/arch/arm/setup.c  |   1 +
 xen/include/asm-arm/kernel.h  |   5 +-
 xen/include/asm-arm/setup.h   |   7 +
 8 files changed, 523 insertions(+), 57 deletions(-)

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 1/8] xen/arm: introduce handle_device_interrupts

2019-09-26 Thread Stefano Stabellini
Move the interrupt handling code out of handle_device to a new function
so that it can be reused for dom0less VMs (it will be used in later
patches).

Signed-off-by: Stefano Stabellini 
Acked-by: Julien Grall 
---
Changes in v4:
- rename handle_interrupts to handle_device_interrupts
- improve in-code comment
- remove return 1 if mapping is done
- use unsigned

Changes in v3:
- add patch

The diff is hard to read but I just moved the interrupts related code
from handle_devices to a new function handle_device_interrupts, and very
little else.
---
 xen/arch/arm/domain_build.c | 80 +++--
 1 file changed, 51 insertions(+), 29 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index a0fee1ef13..21985628f0 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1239,41 +1239,22 @@ static int __init map_device_children(struct domain *d,
 }
 
 /*
- * For a given device node:
- *  - Give permission to the guest to manage IRQ and MMIO range
- *  - Retrieve the IRQ configuration (i.e edge/level) from device tree
- * When the device is not marked for guest passthrough:
- *  - Assign the device to the guest if it's protected by an IOMMU
- *  - Map the IRQs and iomem regions to DOM0
+ * handle_device_interrupts retrieves the interrupts configuration from
+ * a device tree node and maps those interrupts to the target domain.
+ *
+ * Returns:
+ *   < 0 error
+ *   0   success
  */
-static int __init handle_device(struct domain *d, struct dt_device_node *dev,
-p2m_type_t p2mt)
+static int __init handle_device_interrupts(struct domain *d,
+   struct dt_device_node *dev,
+   bool need_mapping)
 {
-unsigned int nirq;
-unsigned int naddr;
-unsigned int i;
+unsigned int i, nirq;
 int res;
 struct dt_raw_irq rirq;
-u64 addr, size;
-bool need_mapping = !dt_device_for_passthrough(dev);
 
 nirq = dt_number_of_irq(dev);
-naddr = dt_number_of_address(dev);
-
-dt_dprintk("%s passthrough = %d nirq = %d naddr = %u\n",
-   dt_node_full_name(dev), need_mapping, nirq, naddr);
-
-if ( dt_device_is_protected(dev) && need_mapping )
-{
-dt_dprintk("%s setup iommu\n", dt_node_full_name(dev));
-res = iommu_assign_dt_device(d, dev);
-if ( res )
-{
-printk(XENLOG_ERR "Failed to setup the IOMMU for %s\n",
-   dt_node_full_name(dev));
-return res;
-}
-}
 
 /* Give permission and map IRQs */
 for ( i = 0; i < nirq; i++ )
@@ -1310,6 +1291,47 @@ static int __init handle_device(struct domain *d, struct 
dt_device_node *dev,
 return res;
 }
 
+return 0;
+}
+
+/*
+ * For a given device node:
+ *  - Give permission to the guest to manage IRQ and MMIO range
+ *  - Retrieve the IRQ configuration (i.e edge/level) from device tree
+ * When the device is not marked for guest passthrough:
+ *  - Assign the device to the guest if it's protected by an IOMMU
+ *  - Map the IRQs and iomem regions to DOM0
+ */
+static int __init handle_device(struct domain *d, struct dt_device_node *dev,
+p2m_type_t p2mt)
+{
+unsigned int naddr;
+unsigned int i;
+int res;
+u64 addr, size;
+bool need_mapping = !dt_device_for_passthrough(dev);
+
+naddr = dt_number_of_address(dev);
+
+dt_dprintk("%s passthrough = %d naddr = %u\n",
+   dt_node_full_name(dev), need_mapping, naddr);
+
+if ( dt_device_is_protected(dev) && need_mapping )
+{
+dt_dprintk("%s setup iommu\n", dt_node_full_name(dev));
+res = iommu_assign_dt_device(d, dev);
+if ( res )
+{
+printk(XENLOG_ERR "Failed to setup the IOMMU for %s\n",
+   dt_node_full_name(dev));
+return res;
+}
+}
+
+res = handle_device_interrupts(d, dev, need_mapping);
+if ( res < 0 )
+return res;
+
 /* Give permission and map MMIOs */
 for ( i = 0; i < naddr; i++ )
 {
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 2/8] xen/arm: export device_tree_get_reg and device_tree_get_u32

2019-09-26 Thread Stefano Stabellini
They'll be used in later patches.

Signed-off-by: Stefano Stabellini 
Acked-by: Julien Grall 

---
Changes in v5:
- move declarations to xen/include/asm-arm/setup.h

Changes in v4:
- new patch
---
 xen/arch/arm/bootfdt.c  | 8 
 xen/include/asm-arm/setup.h | 6 ++
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index 623173bc7f..a7810abb15 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -55,15 +55,15 @@ static bool __init device_tree_node_compatible(const void 
*fdt, int node,
 return false;
 }
 
-static void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
-   u32 size_cells, u64 *start, u64 *size)
+void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
+u32 size_cells, u64 *start, u64 *size)
 {
 *start = dt_next_cell(address_cells, cell);
 *size = dt_next_cell(size_cells, cell);
 }
 
-static u32 __init device_tree_get_u32(const void *fdt, int node,
-  const char *prop_name, u32 dflt)
+u32 __init device_tree_get_u32(const void *fdt, int node,
+   const char *prop_name, u32 dflt)
 {
 const struct fdt_property *prop;
 
diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
index efcba545c2..fa0a8721b2 100644
--- a/xen/include/asm-arm/setup.h
+++ b/xen/include/asm-arm/setup.h
@@ -115,6 +115,12 @@ const char *boot_module_kind_as_string(bootmodule_kind 
kind);
 extern uint32_t hyp_traps_vector[];
 void init_traps(void);
 
+void device_tree_get_reg(const __be32 **cell, u32 address_cells,
+ u32 size_cells, u64 *start, u64 *size);
+
+u32 device_tree_get_u32(const void *fdt, int node,
+const char *prop_name, u32 dflt);
+
 #endif
 /*
  * Local variables:
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 7/8] xen/arm: introduce nr_spis

2019-09-26 Thread Stefano Stabellini
We don't have a clear way to know how many virtual SPIs we need for the
dom0-less domains. Introduce a new option under xen,domain to specify
the number of SPIs to allocate for a domain.

The property is optional. When absent, we'll use the physical number of
GIC lines for dom0-less domains, or GUEST_VPL011_SPI+1 if vpl011 is
requested, whichever is greater.

Remove the old setting of nr_spis based on the presence of the vpl011.

The implication of this change is that without nr_spis dom0less domains
get the same amount of SPI allocated as dom0, regardless of how many
physical devices they have assigned, and regardless of whether they have
a virtual pl011 (which also needs an emulated SPI). This is done because
the SPIs allocation needs to be done before parsing any passthrough
information, so we have to account for any potential physical SPI
assigned to the domain.

When nr_spis is present, the domain gets exactly nr_spis allocated SPIs.
If the number is too low, it might not be enough for the devices
assigned it to it. If the number is less than GUEST_VPL011_SPI, the
virtual pl011 won't work.

Signed-off-by: Stefano Stabellini 
Reviewed-by: Volodymyr Babchuk 
---
Changes in v5:
- improve commit message
- allocate enough SPIs for vpl011

Changes in v4:
- improve commit message

Changes in v3:
- improve commit message
- introduce nr_spis
---
 xen/arch/arm/domain_build.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index a461816345..40c4a782c9 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2355,7 +2355,6 @@ void __init create_domUs(void)
 struct domain *d;
 struct xen_domctl_createdomain d_cfg = {
 .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
-.arch.nr_spis = 0,
 .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
 .max_evtchn_port = -1,
 .max_grant_frames = 64,
@@ -2365,9 +2364,6 @@ void __init create_domUs(void)
 if ( !dt_device_is_compatible(node, "xen,domain") )
 continue;
 
-if ( dt_property_read_bool(node, "vpl011") )
-d_cfg.arch.nr_spis = GUEST_VPL011_SPI - 32 + 1;
-
 if ( !dt_property_read_u32(node, "cpus", _cfg.max_vcpus) )
 panic("Missing property 'cpus' for domain %s\n",
   dt_node_name(node));
@@ -2375,6 +2371,19 @@ void __init create_domUs(void)
 if ( dt_find_compatible_node(node, NULL, "multiboot,device-tree") )
 d_cfg.flags |= XEN_DOMCTL_CDF_iommu;
 
+if ( !dt_property_read_u32(node, "nr_spis", _cfg.arch.nr_spis) )
+{
+d_cfg.arch.nr_spis = gic_number_lines() - 32;
+
+/*
+ * vpl011 uses one emulated SPI. If vpl011 is requested, make
+ * sure that we allocate enough SPIs for it.
+ */
+if ( dt_property_read_bool(node, "vpl011") )
+d_cfg.arch.nr_spis = MAX(d_cfg.arch.nr_spis,
+ GUEST_VPL011_SPI - 32 + 1);
+}
+
 d = domain_create(++max_init_domid, _cfg, false);
 if ( IS_ERR(d) )
 panic("Error creating domain %s\n", dt_node_name(node));
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 3/8] xen/arm: introduce kinfo->phandle_gic

2019-09-26 Thread Stefano Stabellini
Instead of always hard-coding the GIC phandle (GUEST_PHANDLE_GIC), store
it in a variable under kinfo. This way it can be dynamically chosen per
domain. Remove the fdt pointer argument to the make_*_domU_node
functions and oass a struct kernel_info * instead. The fdt pointer can
be accessed from kinfo->fdt. Remove the struct domain *d parameter to
the make_*_domU_node functions because it becomes unused.

Initialize phandle_gic to GUEST_PHANDLE_GIC at the beginning of
prepare_dtb_domU for DomUs. Later patches will change the value of
phandle_gic depending on user provided information.

For Dom0, initialize phandle_gic to dt_interrupt_controller->phandle
(current value) at the beginning of prepare_dtb.

Signed-off-by: Stefano Stabellini 

---
Changes in v6:
- rename guest_phandle_gic to phandle_gic
- use phandle_gic for dom0 too

Changes in v5:
- improve commit message

Changes in v4:
- new patch
---
 xen/arch/arm/domain_build.c  | 39 
 xen/include/asm-arm/kernel.h |  3 +++
 2 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 21985628f0..b25abe8d08 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -626,15 +626,14 @@ static int __init fdt_property_interrupts(const struct 
kernel_info *kinfo,
   unsigned num_irq)
 {
 int res;
-uint32_t phandle = is_hardware_domain(kinfo->d) ?
-   dt_interrupt_controller->phandle : GUEST_PHANDLE_GIC;
 
 res = fdt_property(kinfo->fdt, "interrupts",
intr, sizeof(intr[0]) * num_irq);
 if ( res )
 return res;
 
-res = fdt_property_cell(kinfo->fdt, "interrupt-parent", phandle);
+res = fdt_property_cell(kinfo->fdt, "interrupt-parent",
+kinfo->phandle_gic);
 
 return res;
 }
@@ -1537,8 +1536,9 @@ static int __init handle_node(struct domain *d, struct 
kernel_info *kinfo,
 return res;
 }
 
-static int __init make_gicv2_domU_node(const struct domain *d, void *fdt)
+static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
 {
+void *fdt = kinfo->fdt;
 int res = 0;
 __be32 reg[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) * 2];
 __be32 *cells;
@@ -1573,11 +1573,11 @@ static int __init make_gicv2_domU_node(const struct 
domain *d, void *fdt)
 if (res)
 return res;
 
-res = fdt_property_cell(fdt, "linux,phandle", GUEST_PHANDLE_GIC);
+res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_gic);
 if (res)
 return res;
 
-res = fdt_property_cell(fdt, "phandle", GUEST_PHANDLE_GIC);
+res = fdt_property_cell(fdt, "phandle", kinfo->phandle_gic);
 if (res)
 return res;
 
@@ -1586,8 +1586,9 @@ static int __init make_gicv2_domU_node(const struct 
domain *d, void *fdt)
 return res;
 }
 
-static int __init make_gicv3_domU_node(const struct domain *d, void *fdt)
+static int __init make_gicv3_domU_node(struct kernel_info *kinfo)
 {
+void *fdt = kinfo->fdt;
 int res = 0;
 __be32 reg[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) * 2];
 __be32 *cells;
@@ -1622,11 +1623,11 @@ static int __init make_gicv3_domU_node(const struct 
domain *d, void *fdt)
 if (res)
 return res;
 
-res = fdt_property_cell(fdt, "linux,phandle", GUEST_PHANDLE_GIC);
+res = fdt_property_cell(fdt, "linux,phandle", kinfo->phandle_gic);
 if (res)
 return res;
 
-res = fdt_property_cell(fdt, "phandle", GUEST_PHANDLE_GIC);
+res = fdt_property_cell(fdt, "phandle", kinfo->phandle_gic);
 if (res)
 return res;
 
@@ -1635,22 +1636,23 @@ static int __init make_gicv3_domU_node(const struct 
domain *d, void *fdt)
 return res;
 }
 
-static int __init make_gic_domU_node(const struct domain *d, void *fdt)
+static int __init make_gic_domU_node(struct kernel_info *kinfo)
 {
-switch ( d->arch.vgic.version )
+switch ( kinfo->d->arch.vgic.version )
 {
 case GIC_V3:
-return make_gicv3_domU_node(d, fdt);
+return make_gicv3_domU_node(kinfo);
 case GIC_V2:
-return make_gicv2_domU_node(d, fdt);
+return make_gicv2_domU_node(kinfo);
 default:
 panic("Unsupported GIC version\n");
 }
 }
 
 #ifdef CONFIG_SBSA_VUART_CONSOLE
-static int __init make_vpl011_uart_node(const struct domain *d, void *fdt)
+static int __init make_vpl011_uart_node(struct kernel_info *kinfo)
 {
+void *fdt = kinfo->fdt;
 int res;
 gic_interrupt_t intr;
 __be32 reg[GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS];
@@ -1681,7 +1683,7 @@ static int __init make_vpl011_uart_node(const struct 
domain *d, void *fdt)
 return res;
 
 res = fdt_property_cell(fdt, "interrupt-parent",
-GUEST_PHANDLE_GIC);
+kinfo->phandle_gic);
 if ( res )
 return res;
 
@@ -1706,6 +1708,8 @@ static int 

[Xen-devel] [PATCH v6 8/8] xen/arm: add dom0-less device assignment info to docs

2019-09-26 Thread Stefano Stabellini
Add info about the SPI used for the virtual pl011.

Signed-off-by: Stefano Stabellini 
Acked-by: Julien Grall 

---
Changes in v6:
- fix nr_spis description
- add ack

Changes in v5:
- improve wording

Changes in v4:
- fix spelling
- add "multiboot,module"
- improve commit message
- improve doc
- expand the nr_spis and vpl011 sections and include information about
  the vpl011 SPI
- move passthrough information to docs/misc/arm/passthrough.txt

Changes in v3:
- add nr_spis
- change description of interrupts and interrupt-parent

Changes in v2:
- device tree fragment loaded in cacheable memory
- rename multiboot,dtb to multiboot,device-tree
- rename "path" to "xen,path"
- add a note about device memory mapping
- introduce xen,reg
- specify only the GIC is supported
---
 docs/misc/arm/device-tree/booting.txt |  44 ++-
 docs/misc/arm/passthrough.txt | 101 ++
 2 files changed, 144 insertions(+), 1 deletion(-)

diff --git a/docs/misc/arm/device-tree/booting.txt 
b/docs/misc/arm/device-tree/booting.txt
index 317a9e962a..649e00d09f 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -146,7 +146,18 @@ with the following properties:
 
 - vpl011
 
-An empty property to enable/disable a virtual pl011 for the guest to use.
+An empty property to enable/disable a virtual pl011 for the guest to
+use. The virtual pl011 uses SPI number 0 (see GUEST_VPL011_SPI).
+Please note that the SPI used for the virtual pl011 could clash with the
+physical SPI of a physical device assigned to the guest.
+
+- nr_spis
+
+Optional. A 32-bit integer specifying the number of SPIs (Shared
+Peripheral Interrupts) to allocate for the domain. If nr_spis is
+missing, the max number of SPIs supported by the physical GIC is
+used, or GUEST_VPL011_SPI+1 if vpl011 is enabled, whichever is
+greater.
 
 - #address-cells and #size-cells
 
@@ -226,3 +237,34 @@ chosen {
 };
 };
 };
+
+
+Device Assignment
+=
+
+Device Assignment (Passthrough) is supported by adding another module,
+alongside the kernel and ramdisk, with the device tree fragment
+corresponding to the device node to assign to the guest.
+
+The dtb sub-node should have the following properties:
+
+- compatible
+
+"multiboot,device-tree" and "multiboot,module"
+
+- reg
+
+Specifies the physical address of the device tree binary fragment
+RAM and its length.
+
+As an example:
+
+module@0xc00 {
+compatible = "multiboot,device-tree", "multiboot,module";
+reg = <0x0 0xc00 0xff>;
+};
+
+The DTB fragment is loaded at 0xc00 in the example above. It should
+follow the convention explained in docs/misc/arm/passthrough.txt. The
+DTB fragment will be added to the guest device tree, so that the guest
+kernel will be able to discover the device.
diff --git a/docs/misc/arm/passthrough.txt b/docs/misc/arm/passthrough.txt
index 0efbd122de..a67ada8eb8 100644
--- a/docs/misc/arm/passthrough.txt
+++ b/docs/misc/arm/passthrough.txt
@@ -80,6 +80,107 @@ SPI numbers start from 32, in this example 80 + 32 = 112.
 See man [xl.cfg] for the iomem format. The reg property is just a pair
 of address, then size numbers, each of them can occupy 1 or 2 cells.
 
+
+Dom0-less Device Passthrough
+
+
+The partial device tree for dom0-less guests should have the following
+properties for each node corresponding to a physical device to assign to
+the guest:
+
+- xen,reg
+
+  The xen,reg property is an array of:
+
+
+
+  They specify the physical address and size of the device memory
+  ranges together with the corresponding guest address to map them to.
+  The size of `phys_addr' and `guest_addr' is determined by
+  #address-cells, the size of `size' is determined by #size-cells, of
+  the partial device tree.
+  The memory will be mapped as device memory in the guest (Device-nGnRE).
+
+- xen,path
+
+  A string property representing the path in the host device tree to the
+  corresponding device node.
+
+In addition, a special /gic node is expected as a placeholder for the
+full GIC node that will be added by Xen for the guest. /gic can be
+referenced by other properties in the device tree fragment. For
+instance, it can be referenced by interrupt-parent under a device node.
+Xen will take care of replacing the "gic" placeholder node for a
+complete GIC node while retaining all the references correctly. The new
+GIC node created by Xen is a regular interrupt-controller@ node.
+
+gic: gic {
+#interrupt-cells = <0x3>;
+interrupt-controller;
+};
+
+Note that the #interrupt-cells and interrupt-controller properties are
+not actually required, however, DTC expects them to be present if gic is
+referenced by interrupt-parent or similar.
+
+
+Example
+===
+
+The following is a real-world example of a device tree fragment to
+assign a network card to a 

[Xen-devel] [PATCH v6 4/8] xen/arm: copy dtb fragment to guest dtb

2019-09-26 Thread Stefano Stabellini
Read the dtb fragment corresponding to a passthrough device from memory
at the location referred to by the "multiboot,device-tree" compatible
node.

Add a new field named dtb_bootmodule to struct kernel_info to keep track
of the dtb fragment location.

Copy the fragment to the guest dtb (only /aliases and /passthrough).

Set kinfo->phandle_gic based on the phandle of the special "/gic"
node in the device tree fragment. "/gic" is a dummy node in the dtb
fragment that represents the gic interrupt controller. Other properties
in the dtb fragment might refer to it (for instance interrupt-parent of
a device node). We reuse the phandle of "/gic" from the dtb fragment as
the phandle of the full GIC node that will be created for the guest
device tree. That way, when we copy properties from the device tree
fragment to the domU device tree the links remain unbroken.

scan_passthrough_prop is introduced here and not used in this patch but
it will be used by later patches.

Some of the code below is taken from tools/libxl/libxl_arm.c. Note that
it is OK to take LGPL 2.1 code and including it into a GPLv2 code base.
The result is GPLv2 code.

Signed-off-by: Stefano Stabellini 


Changes in v6:
- code style
- in-code comment
- commit message improvements

Changes in v5:
- code style
- in-code comment
- remove depth parameter from scan_pfdt_node
- for instead of loop in domain_handle_dtb_bootmodule
- move "gic" check to domain_handle_dtb_bootmodule
- add check_partial_fdt
- use DT_ROOT_NODE_ADDR/SIZE_CELLS_DEFAULT
- add scan_passthrough_prop parameter, set it to false for "/aliases"

Changes in v4:
- use recursion in the implementation
- rename handle_properties to handle_prop_pfdt
- rename scan_pt_node to scan_pfdt_node
- pass kinfo to handle_properties
- use uint32_t instead of u32
- rename r to res
- add "passthrough" and "aliases" check
- add a name == NULL check
- code style
- move DTB fragment scanning earlier, before DomU GIC node creation
- set guest_phandle_gic based on "/gic"
- in-code comment

Changes in v3:
- switch to using device_tree_for_each_node for the copy

Changes in v2:
- add a note about the code coming from libxl in the commit message
- copy /aliases
- code style
---
 xen/arch/arm/domain_build.c  | 164 +++
 xen/include/asm-arm/kernel.h |   2 +-
 2 files changed, 165 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index b25abe8d08..08d6d238e3 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1698,6 +1699,157 @@ static int __init make_vpl011_uart_node(struct 
kernel_info *kinfo)
 }
 #endif
 
+static int __init handle_prop_pfdt(struct kernel_info *kinfo,
+   const void *pfdt, int nodeoff,
+   uint32_t address_cells, uint32_t size_cells,
+   bool scan_passthrough_prop)
+{
+void *fdt = kinfo->fdt;
+int propoff, nameoff, res;
+const struct fdt_property *prop;
+
+for ( propoff = fdt_first_property_offset(pfdt, nodeoff);
+  propoff >= 0;
+  propoff = fdt_next_property_offset(pfdt, propoff) )
+{
+if ( !(prop = fdt_get_property_by_offset(pfdt, propoff, NULL)) )
+return -FDT_ERR_INTERNAL;
+
+nameoff = fdt32_to_cpu(prop->nameoff);
+res = fdt_property(fdt, fdt_string(pfdt, nameoff),
+   prop->data, fdt32_to_cpu(prop->len));
+if ( res )
+return res;
+}
+
+/* FDT_ERR_NOTFOUND => There is no more properties for this node */
+return ( propoff != -FDT_ERR_NOTFOUND ) ? propoff : 0;
+}
+
+static int __init scan_pfdt_node(struct kernel_info *kinfo, const void *pfdt,
+ int nodeoff,
+ uint32_t address_cells, uint32_t size_cells,
+ bool scan_passthrough_prop)
+{
+int rc = 0;
+void *fdt = kinfo->fdt;
+int node_next;
+
+rc = fdt_begin_node(fdt, fdt_get_name(pfdt, nodeoff, NULL));
+if ( rc )
+return rc;
+
+rc = handle_prop_pfdt(kinfo, pfdt, nodeoff, address_cells, size_cells,
+  scan_passthrough_prop);
+if ( rc )
+return rc;
+
+address_cells = device_tree_get_u32(pfdt, nodeoff, "#address-cells",
+DT_ROOT_NODE_ADDR_CELLS_DEFAULT);
+size_cells = device_tree_get_u32(pfdt, nodeoff, "#size-cells",
+ DT_ROOT_NODE_SIZE_CELLS_DEFAULT);
+
+node_next = fdt_first_subnode(pfdt, nodeoff);
+while ( node_next > 0 )
+{
+scan_pfdt_node(kinfo, pfdt, node_next, address_cells, size_cells,
+   scan_passthrough_prop);
+node_next = fdt_next_subnode(pfdt, node_next);
+}
+
+return fdt_end_node(fdt);
+}
+
+static int 

[Xen-devel] [PATCH v6 5/8] xen/arm: assign devices to boot domains

2019-09-26 Thread Stefano Stabellini
Scan the user provided dtb fragment at boot. For each device node, map
memory to guests, and route interrupts and setup the iommu.

The memory region to remap is specified by the "xen,reg" property.

The iommu is setup by passing the node of the device to assign on the
host device tree. The path is specified in the device tree fragment as
the "xen,path" string property.

The interrupts are remapped based on the information from the
corresponding node on the host device tree. Call
handle_device_interrupts to remap interrupts. Interrupts related device
tree properties are copied from the device tree fragment, same as all
the other properties.

Also set add the new flag XEN_DOMCTL_CDF_iommu so that dom0less domU
can use the IOMMU if a partial dtb is specified.

Signed-off-by: Stefano Stabellini 
---
Changes in v6:
- turn dprintks into printks
- return error on page alignment check failure
- set XEN_DOMCTL_CDF_iommu if partial dtb is specified

Changes in v5:
- use local variable for name
- use map_regions_p2mt
- add warning for not page aligned addresses/sizes
- introduce handle_passthrough_prop

Changes in v4:
- use unsigned
- improve commit message
- code style
- use dt_prop_cmp
- use device_tree_get_reg
- don't copy over xen,reg and xen,path
- don't create special interrupt properties for domU: copy them from the
  fragment
- in-code comment

Changes in v3:
- improve commit message
- remove superfluous cast
- merge code with the copy code
- add interrup-parent
- demove depth > 2 check
- reuse code from handle_device_interrupts
- copy interrupts from host dt

Changes in v2:
- rename "path" to "xen,path"
- grammar fix
- use gaddr_to_gfn and maddr_to_mfn
- remove depth <= 2 limitation in scanning the dtb fragment
- introduce and parse xen,reg
- code style
- support more than one interrupt per device
- specify only the GIC is supported
---
 xen/arch/arm/domain_build.c | 104 ++--
 1 file changed, 101 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 08d6d238e3..a461816345 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1699,6 +1699,88 @@ static int __init make_vpl011_uart_node(struct 
kernel_info *kinfo)
 }
 #endif
 
+/*
+ * Scan device tree properties for passthrough specific information.
+ * Returns -ENOENT when no passthrough properties are found
+ * < 0 on error
+ * 0 on success
+ */
+static int __init handle_passthrough_prop(struct kernel_info *kinfo,
+  const struct fdt_property *prop,
+  const char *name,
+  uint32_t address_cells, uint32_t 
size_cells)
+{
+const __be32 *cell;
+unsigned int i, len;
+struct dt_device_node *node;
+int res;
+
+/* xen,reg specifies where to map the MMIO region */
+if ( dt_prop_cmp("xen,reg", name) == 0 )
+{
+paddr_t mstart, size, gstart;
+cell = (const __be32 *)prop->data;
+len = fdt32_to_cpu(prop->len) /
+((address_cells * 2 + size_cells) * sizeof(uint32_t));
+
+for ( i = 0; i < len; i++ )
+{
+device_tree_get_reg(, address_cells, size_cells,
+, );
+gstart = dt_next_cell(address_cells, );
+
+if ( gstart & ~PAGE_MASK || mstart & ~PAGE_MASK || size & 
~PAGE_MASK )
+{
+printk(XENLOG_ERR
+   "DomU passthrough config has not page aligned 
addresses/sizes\n");
+return -EINVAL;
+}
+
+res = map_regions_p2mt(kinfo->d,
+   gaddr_to_gfn(gstart),
+   PFN_DOWN(size),
+   maddr_to_mfn(mstart),
+   p2m_mmio_direct_dev);
+if ( res < 0 )
+{
+printk(XENLOG_ERR
+   "Failed to map %"PRIpaddr" to the guest 
at%"PRIpaddr"\n",
+   mstart, gstart);
+return -EFAULT;
+}
+}
+
+return 0;
+}
+/*
+ * xen,path specifies the corresponding node in the host DT.
+ * Both interrupt mappings and IOMMU settings are based on it,
+ * as they are done based on the corresponding host DT node.
+ */
+else if ( dt_prop_cmp("xen,path", name) == 0 )
+{
+node = dt_find_node_by_path(prop->data);
+if ( node == NULL )
+{
+printk(XENLOG_ERR "Couldn't find node %s in host_dt!\n",
+   (char *)prop->data);
+return -EINVAL;
+}
+
+res = iommu_assign_dt_device(kinfo->d, node);
+if ( res < 0 )
+return res;
+
+res = handle_device_interrupts(kinfo->d, node, true);
+if ( res < 0 )
+return res;
+
+return 0;
+}
+
+return -ENOENT;
+}
+
 static 

[Xen-devel] [PATCH v6 6/8] xen/arm: handle "multiboot, device-tree" compatible nodes

2019-09-26 Thread Stefano Stabellini
Detect "multiboot,device-tree" compatible nodes. Add them to the bootmod
array as BOOTMOD_GUEST_DTB.  In kernel_probe, find the right
BOOTMOD_GUEST_DTB and store a pointer to it in dtb_bootmodule.

Signed-off-by: Stefano Stabellini 
Acked-by: Julien Grall 
---
Changes in v4:
- use uint32_t
- remove useless 0 initialization
- add return value check

Changes in v2:
- rename BOOTMOD_DTB to BOOTMOD_GUEST_DTB
- rename multiboot,dtb to multiboot,device-tree
---
 xen/arch/arm/bootfdt.c  |  2 ++
 xen/arch/arm/kernel.c   | 14 +-
 xen/arch/arm/setup.c|  1 +
 xen/include/asm-arm/setup.h |  1 +
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index a7810abb15..08fb59f4e7 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -242,6 +242,8 @@ static void __init process_multiboot_node(const void *fdt, 
int node,
 kind = BOOTMOD_RAMDISK;
 else if ( fdt_node_check_compatible(fdt, node, "xen,xsm-policy") == 0 )
 kind = BOOTMOD_XSM;
+else if ( fdt_node_check_compatible(fdt, node, "multiboot,device-tree") == 
0 )
+kind = BOOTMOD_GUEST_DTB;
 else
 kind = BOOTMOD_UNKNOWN;
 
diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c
index 389bef2afa..8eff074836 100644
--- a/xen/arch/arm/kernel.c
+++ b/xen/arch/arm/kernel.c
@@ -425,7 +425,7 @@ int __init kernel_probe(struct kernel_info *info,
 struct bootmodule *mod = NULL;
 struct bootcmdline *cmd = NULL;
 struct dt_device_node *node;
-u64 kernel_addr, initrd_addr, size;
+u64 kernel_addr, initrd_addr, dtb_addr, size;
 int rc;
 
 /* domain is NULL only for the hardware domain */
@@ -469,6 +469,18 @@ int __init kernel_probe(struct kernel_info *info,
 info->initrd_bootmodule = boot_module_find_by_addr_and_kind(
 BOOTMOD_RAMDISK, initrd_addr);
 }
+else if ( dt_device_is_compatible(node, "multiboot,device-tree") )
+{
+uint32_t len;
+const __be32 *val;
+
+val = dt_get_property(node, "reg", );
+if ( val == NULL )
+continue;
+dt_get_range(, node, _addr, );
+info->dtb_bootmodule = boot_module_find_by_addr_and_kind(
+BOOTMOD_GUEST_DTB, dtb_addr);
+}
 else
 continue;
 }
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index fca7e68cba..d764a4e9b2 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -369,6 +369,7 @@ const char * __init 
boot_module_kind_as_string(bootmodule_kind kind)
 case BOOTMOD_KERNEL:  return "Kernel";
 case BOOTMOD_RAMDISK: return "Ramdisk";
 case BOOTMOD_XSM: return "XSM";
+case BOOTMOD_GUEST_DTB: return "DTB";
 case BOOTMOD_UNKNOWN: return "Unknown";
 default: BUG();
 }
diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
index fa0a8721b2..2f8f24e286 100644
--- a/xen/include/asm-arm/setup.h
+++ b/xen/include/asm-arm/setup.h
@@ -16,6 +16,7 @@ typedef enum {
 BOOTMOD_KERNEL,
 BOOTMOD_RAMDISK,
 BOOTMOD_XSM,
+BOOTMOD_GUEST_DTB,
 BOOTMOD_UNKNOWN
 }  bootmodule_kind;
 
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] VM_BUG_ON_PAGE(!PageOffline(page), page) in alloc_xenballooned_pages

2019-09-26 Thread Marek Marczykowski-Górecki
Hi,

I've hit VM_BUG_ON_PAGE(!PageOffline(page), page) in
alloc_xenballooned_pages, when trying to use gnttab from userspace
application. It happens on Xen PV, but not on Xen PVH or HVM with the
same kernel. This happens at least with 5.1.6, but also 5.2.15
(as seen below). Based on this, it looks related to 0266def91377
(xen/balloon: Fix mapping PG_offline pages to user space) and probably
77c4adf6a6df (xen/balloon: mark inflated pages PG_offline).

Any idea? Below is full message.


page:ea0003e7ffc0 refcount:1 mapcount:0 mapping: index:0x0
flags: 0xe1000(reserved)
raw: 000e1000 dead0100 dead0200 
raw:   0001 
page dumped because: VM_BUG_ON_PAGE(!PageOffline(page))
[ cut here ]
kernel BUG at include/linux/page-flags.h:744!
invalid opcode:  [#1] SMP NOPTI
CPU: 0 PID: 551 Comm: qubesdb-daemon Tainted: GW 
5.2.15-200.fc30.x86_64 #1
RIP: e030:alloc_xenballooned_pages+0xef/0x110
Code: c0 0c 10 00 e8 b2 fa ff ff 85 c0 0f 84 60 ff ff ff 41 89 dd b8 f4 ff ff 
ff eb b0 48 c7 c6 e8 af 14 82 48 89 c7 e8 31 32 ca ff <0f> 0b 48 c7 c7 40 f 0 
4d 82 e8 13 85 3f 00 31 c0 48 83 c4 08 5b 5d
RSP: e02b:c90001113d98 EFLAGS: 00010246
RAX: 0037 RBX:  RCX: 0149
RDX:  RSI:  RDI: 82143bbc
RBP: 0001 R08: 0181 R09: 0149
R10: 000a R11: c90001113c38 R12: 88800d670960
R13: 7fffdff236a0 R14: 7fffdff236a0 R15: 8880108bd000
FS:  7f30e205e7c0() GS:888013e0() knlGS:
CS:  e030 DS:  ES:  CR0: 80050033
CR2: 7f30e2082000 CR3: 0c92 CR4: 00040660
Call Trace:
 ? __kmalloc+0x16c/0x210
 gnttab_alloc_pages+0x11/0x40
 gntdev_alloc_map+0xe7/0x180 [xen_gntdev]
 gntdev_ioctl+0x203/0x530 [xen_gntdev]
 do_vfs_ioctl+0x405/0x660
 ksys_ioctl+0x5e/0x90
 __x64_sys_ioctl+0x16/0x20
 do_syscall_64+0x5f/0x1a0
 ? page_fault+0x8/0x30
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f30e239b3bb
Code: 0f 1e fa 48 8b 05 cd ca 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff 
c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 
c3 48 8b 0d 9d ca 0c 00 f7 d8 64 89 01 48
RSP: 002b:7fffdff23698 EFLAGS: 0202 ORIG_RAX: 0010
RAX: ffda RBX: 0003 RCX: 7f30e239b3bb
RDX: 7fffdff236a0 RSI: 00184700 RDI: 000b
RBP: 7fffdff23730 R08: 7fffdff2375c R09: 7fffdff23758
R10: fcc9 R11: 0202 R12: 7fffdff236a0
R13: 1000 R14: 000b R15: 0001
Modules linked in: xenfs ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel 
ghash_clmulni_intel xen_blkfront xen_scsiback target_core_mod xen_netback 
xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


signature.asc
Description: PGP signature
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke test] 141875: tolerable all pass - PUSHED

2019-09-26 Thread osstest service owner
flight 141875 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/141875/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  ddc5a85fbcfbacc34bbd9abcdb12923de2fc27b3
baseline version:
 xen  52633db2c5ed2df352004f97e2272edc90af3cb4

Last test of basis   141871  2019-09-26 14:00:34 Z0 days
Testing same since   141875  2019-09-26 18:00:52 Z0 days1 attempts


People who touched revisions under test:
  Jan Beulich 
  Juergen Gross 
  Julien Grall 
  Oleksandr Tyshchenko 
  Sergey Dyasli 
  Stefano Stabellini 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   52633db2c5..ddc5a85fbc  ddc5a85fbcfbacc34bbd9abcdb12923de2fc27b3 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-4.19 test] 141813: regressions - FAIL

2019-09-26 Thread osstest service owner
flight 141813 linux-4.19 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/141813/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-pvops 6 kernel-build fail REGR. vs. 129313

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-credit1   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-examine  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 129313
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 linuxd573e8a79f70404ba08623d1de7ea617d55092ac
baseline version:
 linux84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d

Last test of basis   129313  2018-11-02 05:39:08 Z  328 days
Failing since129412  2018-11-04 14:10:15 Z  326 days  244 attempts
Testing same since   141616  2019-09-22 03:05:43 Z4 days4 attempts


2592 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass   

Re: [Xen-devel] [PATCH RFC] pass-through: sync pir to irr after msix vector been updated

2019-09-26 Thread Joe Jin
On 9/24/19 8:42 AM, Roger Pau Monné wrote:
> On Fri, Sep 13, 2019 at 09:50:34AM -0700, Joe Jin wrote:
>> On 9/13/19 3:33 AM, Roger Pau Monné wrote:
>>> On Thu, Sep 12, 2019 at 11:03:14AM -0700, Joe Jin wrote:
 With below testcase, guest kernel reported "No irq handler for vector":
   1). Passthrough mlx ib VF to 2 pvhvm guests.
   2). Start rds-stress between 2 guests.
   3). Scale down 2 guests vcpu from 32 to 6 at the same time.

 Repeat above test several iteration, guest kernel reported "No irq handler
 for vector", and IB traffic downed to zero which caused by interrupt lost.

 When vcpu offline, kernel disabled local IRQ, migrate IRQ to other cpu,
 update MSI-X table, enable IRQ. If any new interrupt arrived after
 local IRQ disabled also before MSI-X table been updated, interrupt still 
 used old vector and dest cpu info, and when local IRQ enabled again, 
 interrupt been sent to wrong cpu and vector.
>>>
>>> Yes, but that's something Linux shoulkd be able to handle, according
>>> to your description there's a window where interrupts can be delivered
>>> to the old CPU, but that's something expected.
>>
>> Actually, kernel will check APIC IRR when do migration, if any pending
>> IRQ, will retrigger IRQ to new destination, but Xen does not set the
>> bit.
> 
> Right, because the interrupt is pending in the PIRR posted descriptor
> field, it has not yet reached the vlapic IRR.
> 
>>>

 Looks sync PIR to IRR after MSI-X been updated is help for this issue.
>>>
>>> AFAICT the sync that you do is still using the old vcpu id, as
>>> pirq_dpci->gmsi.dest_vcpu_id gets updated a little bit below. I'm
>>> unsure about why does this help, I would expect the sync between pir
>>> and irr to happen anyway, and hence I'm not sure why is this helping.
>>
>> As my above update, IRQ retriggered by old cpu, so Xen need to set IRR
>> for old cpu but not of new.
> 
> AFAICT you are draining any pending data from the posted interrupt
> PIRR field into the IRR vlapic field, so that no stale interrupts are
> left behind after the MSI fields have been updated by the guest. I
> think this is correct, I wonder however whether you also need to
> trigger a vcpu re-scheduling (pause/unpause the vpcu), so that pending
> interrupts in IRR are injected by vmx_intr_assist.
> 
> Also, I think you should do this syncing after the pi_update_irte
> call, or else there's still a window (albeit small) where you can
> still get posted interrupts delivered to the old vcpu.

I agree with you that we need to take care of this issue as well.

I created an additional patch but not tested yet for the test env was
broken, post here for your comment firstly, I'll update later of test
result when my test env back:

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 3f43b9d5a9..4596110a17 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static DEFINE_PER_CPU(struct list_head, dpci_list);
@@ -443,9 +444,21 @@ int pt_irq_create_bind(
 hvm_migrate_pirq(pirq_dpci, vcpu);
 
 /* Use interrupt posting if it is supported. */
-if ( iommu_intpost )
-pi_update_irte(vcpu ? >arch.hvm.vmx.pi_desc : NULL,
-   info, pirq_dpci->gmsi.gvec);
+if ( iommu_intpost ) {
+unsigned int ndst = APIC_INVALID_DEST;
+struct irq_desc *desc;
+
+desc = pirq_spin_lock_irq_desc(info, NULL);
+if ( irq_desc ) {
+ndst = irq_desc->msi_desc->pi_desc->ndst;
+spin_unlock_irq(>lock);
+}
+
+if ( pi_update_irte(vcpu ? >arch.hvm.vmx.pi_desc : NULL,
+info, pirq_dpci->gmsi.gvec) == 0
+&& ndst != APIC_INVALID_DEST )
+vlapic_sync_pir_to_irr(d->vcpu[ndst]);
+}
 
 if ( pt_irq_bind->u.msi.gflags & XEN_DOMCTL_VMSI_X86_UNMASKED )
 {
-- 
2.21.0 (Apple Git-122)

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [GIT PULL] xen: features for 5.4-rc1

2019-09-26 Thread pr-tracker-bot
The pull request you sent on Thu, 26 Sep 2019 16:17:43 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git 
> for-linus-5.4-rc1-tag

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/ec56103e18c7590303c69329dd4aaadf8a898c19

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 00/21] Refine memblock API

2019-09-26 Thread Adam Ford
On Thu, Sep 26, 2019 at 11:04 AM Mike Rapoport  wrote:
>
> Hi,
>
> On Thu, Sep 26, 2019 at 08:09:52AM -0500, Adam Ford wrote:
> > On Wed, Sep 25, 2019 at 10:17 AM Fabio Estevam  wrote:
> > >
> > > On Wed, Sep 25, 2019 at 9:17 AM Adam Ford  wrote:
> > >
> > > > I tried cma=256M and noticed the cma dump at the beginning didn't
> > > > change.  Do we need to setup a reserved-memory node like
> > > > imx6ul-ccimx6ulsom.dtsi did?
> > >
> > > I don't think so.
> > >
> > > Were you able to identify what was the exact commit that caused such 
> > > regression?
> >
> > I was able to narrow it down the 92d12f9544b7 ("memblock: refactor
> > internal allocation functions") that caused the regression with
> > Etnaviv.
>
>
> Can you please test with this change:
>

That appears to have fixed my issue.  I am not sure what the impact
is, but is this a safe option?


adam

> diff --git a/mm/memblock.c b/mm/memblock.c
> index 7d4f61a..1f5a0eb 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1356,9 +1356,6 @@ static phys_addr_t __init 
> memblock_alloc_range_nid(phys_addr_t size,
> align = SMP_CACHE_BYTES;
> }
>
> -   if (end > memblock.current_limit)
> -   end = memblock.current_limit;
> -
>  again:
> found = memblock_find_in_range_node(size, align, start, end, nid,
> flags);
>
> > I also noticed that if I create a reserved memory node as was done one
> > imx6ul-ccimx6ulsom.dtsi the 3D seems to work again, but without it, I
> > was getting errors regardless of the 'cma=256M' or not.
> > I don't have a problem using the reserved memory, but I guess I am not
> > sure what the amount should be.  I know for the video decoding 1080p,
> > I have historically used cma=128M, but with the 3D also needing some
> > memory allocation, is that enough or should I use 256M?
> >
> > adam
>
> --
> Sincerely yours,
> Mike.
>

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH] xen/gntdev: Stop abusing DT of_dma_configure API

2019-09-26 Thread Rob Herring
On Thu, Sep 26, 2019 at 6:16 AM Oleksandr Andrushchenko
 wrote:
>
> On 9/26/19 1:46 PM, Robin Murphy wrote:
> > On 2019-09-26 11:17 am, Oleksandr Andrushchenko wrote:
> >>
> >> On 9/26/19 12:49 PM, Julien Grall wrote:
> >>> Hi Rob,
> >>>
> >>>
> >>> On 9/25/19 10:50 PM, Rob Herring wrote:
>  As the comment says, this isn't a DT based device. of_dma_configure()
>  is going to stop allowing a NULL DT node, so this needs to be fixed.
> >>>
> >>> And this can't work on arch not selecting CONFIG_OF and can select
> >>> CONFIG_XEN_GRANT_DMA_ALLOC.
> >>>
> >>> We are lucky enough on x86 because, AFAICT, arch_setup_dma_ops is just
> >>> a nop.
> >>>
> >> No luck is needed as [1] does nothing for those platforms not using
> >> CONFIG_OF
> 
>  Not sure exactly what setup besides arch_setup_dma_ops is needed...
> >>>
> >>> We probably want to update dma_mask, coherent_dma_mask and
> >>> dma_pfn_offset.
> >>>
> >>> Also, while look at of_configure_dma, I noticed that we consider the
> >>> DMA will not be coherent for the grant-table. Oleksandr, do you know
> >>> why they can't be coherent?
> >> The main and the only reason to use of_configure_dma is that if we don't
> >> then we
> >> are about to stay with dma_dummy_ops [2]. It effectively means that
> >> operations on dma-bufs
> >> will end up returning errors, like [3], [4], thus not making it possible
> >> for Xen PV DRM and DMA
> >> part of gntdev driver to do what we need (dma-bufs in our use-cases
> >> allow zero-copying
> >> while using graphics buffers and many more).
> >>
> >> I didn't find any better way of achieving that, but of_configure_dma...
> >> If there is any better solution which will not break the existing
> >> functionality then
> >> I will definitely change the drivers so we do not abuse DT )
> >> Before that, please keep in mind that merging this RFC will break Xen PV
> >> DRM +
> >> DMA buf support in gntdev...
> >> Hope we can work out some acceptable solution, so everyone is happy
> >
> > As I mentioned elsewhere, the recent dma-direct rework means that
> > dma_dummy_ops are now only explicitly installed for the ACPI error
> > case, so - much as I may dislike it - you should get regular
> > (direct/SWIOTLB) ops by default again.
> Ah, my bad, I missed that change. So, if no dummy dma ops are to be used
> then
> I believe we can apply both changes, e.g. remove of_dma_configure from
> both of the drivers.

What about the dma masks? I think there's a default setup, but it is
considered a driver bug to not set its mask. xen_drm_front sets the
coherent_dma_mask (why only 32-bits though?), but not the dma_mask.
gntdev is doing neither. I could copy out what of_dma_configure does
but better for the Xen folks to decide what is needed or not and test
the change. I'm not setup to test any of this.

Rob

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RFC for-4.13 04/10] xen/arm: Ensure the SSBD workaround is re-enabled right after exiting a guest

2019-09-26 Thread Julien Grall
At the moment, SSBD workaround is re-enabled for Xen after interrupts
are unmasked. This means we may end up to execute some part of the
hypervisor if an interrupt is received before the workaround is
re-enabled.

As the rest of enter_hypervisor_from_guest() does not require to have
interrupts masked, the function is now split in two parts:
1) enter_hypervisor_from_guest_noirq() called with interrupts
   masked.
2) enter_hypervisor_from_guest() called with interrupts unmasked.

Note that while enter_hypervisor_from_guest_noirq() does not use the
on-stack context registers, it is still passed as parameter to match the
rest of the C functions called from the entry path.

Fixes: a7898e4c59 ("xen/arm: Add ARCH_WORKAROUND_2 support for guests")
Reported-by: Andrii Anisov 
Signed-off-by: Julien Grall 

---

Note the Arm32 code has not been changed yet. I am also open on turn
both enter_hypervisor_from_guest_noirq() and enter_hypervisor_from()
to functions not taking any parameters.
---
 xen/arch/arm/arm64/entry.S |  2 ++
 xen/arch/arm/traps.c   | 16 +---
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index 9eafae516b..458d12f188 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -173,6 +173,8 @@
 ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f",
 "nop; nop",
 SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
+mov x0, sp
+bl  enter_hypervisor_from_guest_noirq
 msr daifclr, \iflags
 mov x0, sp
 bl  enter_hypervisor_from_guest
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 20ba34ec91..5848dd8399 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2007,16 +2007,26 @@ static inline bool needs_ssbd_flip(struct vcpu *v)
 }
 
 /*
- * Actions that needs to be done after exiting the guest and before any
- * request from it is handled.
+ * Actions that needs to be done after exiting the guest and before the
+ * interrupts are unmasked.
  */
-void enter_hypervisor_from_guest(struct cpu_user_regs *regs)
+void enter_hypervisor_from_guest_noirq(struct cpu_user_regs *regs)
 {
 struct vcpu *v = current;
 
 /* If the guest has disabled the workaround, bring it back on. */
 if ( needs_ssbd_flip(v) )
 arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL);
+}
+
+/*
+ * Actions that needs to be done after exiting the guest and before any
+ * request from it is handled. Depending on the exception trap, this may
+ * be called with interrupts unmasked.
+ */
+void enter_hypervisor_from_guest(struct cpu_user_regs *regs)
+{
+struct vcpu *v = current;
 
 /*
  * If we pended a virtual abort, preserve it until it gets cleared.
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RFC for-4.13 06/10] xen/arm: Move ARCH_PATCH_INSN_SIZE out of the header livepatch.h

2019-09-26 Thread Julien Grall
At the moment, ARCH_PATCH_INSN_SIZE is defined in the header
livepatch.h. However, this is also used in the alternative code.

Rather than including livepatch.h just for using the define, move it in
the header insn.h which seems more suitable.

Signed-off-by: Julien Grall 
---
 xen/arch/arm/alternative.c  | 2 --
 xen/include/asm-arm/insn.h  | 3 +++
 xen/include/asm-arm/livepatch.h | 4 +---
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
index 52ed7edf69..237c4e5642 100644
--- a/xen/arch/arm/alternative.c
+++ b/xen/arch/arm/alternative.c
@@ -30,8 +30,6 @@
 #include 
 #include 
 #include 
-/* XXX: Move ARCH_PATCH_INSN_SIZE out of livepatch.h */
-#include 
 #include 
 
 /* Override macros from asm/page.h to make them work with mfn_t */
diff --git a/xen/include/asm-arm/insn.h b/xen/include/asm-arm/insn.h
index 3489179826..19277212e1 100644
--- a/xen/include/asm-arm/insn.h
+++ b/xen/include/asm-arm/insn.h
@@ -11,6 +11,9 @@
 # error "unknown ARM variant"
 #endif
 
+/* On ARM32,64 instructions are always 4 bytes long. */
+#define ARCH_PATCH_INSN_SIZE 4
+
 #endif /* !__ARCH_ARM_INSN */
 /*
  * Local variables:
diff --git a/xen/include/asm-arm/livepatch.h b/xen/include/asm-arm/livepatch.h
index 6bca79deb9..026af5e7dc 100644
--- a/xen/include/asm-arm/livepatch.h
+++ b/xen/include/asm-arm/livepatch.h
@@ -7,9 +7,7 @@
 #define __XEN_ARM_LIVEPATCH_H__
 
 #include  /* For SZ_* macros. */
-
-/* On ARM32,64 instructions are always 4 bytes long. */
-#define ARCH_PATCH_INSN_SIZE 4
+#include 
 
 /*
  * The va of the hypervisor .text region. We need this as the
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RFC for-4.13 00/10] xen/arm: XSA-201 and XSA-263 fixes

2019-09-26 Thread Julien Grall
Hi all,

This patch series aims to fix two bugs in the entry path from the guest:
1) Make sure that SSBD workaround is enabled before executing any 
hypervisor code
2) Avoid guest state corruption when an virtual SError is received

The full series is candidate for Xen 4.13. Without it, the hypervisor would
not be properly protected against SSB vulnerability and the guest state may
get corrupted if an SError is received.

This is in RFC state because the entry code is now quite different and
arm32 changes are not yet implemented. I will modify arm32 once we agreed
on the approach.

Cheers,

Cc: jgr...@suse.com

Julien Grall (9):
  xen/arm64: entry: Introduce a macro to generate guest vector and use
it
  xen/arm64: head: Check if an SError is pending when receiving a
vSError
  xen/arm: traps: Rework entry/exit from the guest path
  xen/arm: Ensure the SSBD workaround is re-enabled right after exiting
a guest
  xen/arm: alternative: Remove unused parameter for
alternative_if_not_cap
  xen/arm: Move ARCH_PATCH_INSN_SIZE out of the header livepatch.h
  xen/arm: Allow insn.h to be called from assembly
  xen/arm: asm: Replace use of ALTERNATIVE with alternative_if
  xen/arm64: entry: Ensure the guest state is synced when receiving a
vSError

Mark Rutland (1):
  xen/arm: alternative: add auto-nop infrastructure

 xen/arch/arm/alternative.c|   2 -
 xen/arch/arm/arm32/entry.S|   9 ++-
 xen/arch/arm/arm64/entry.S| 121 +++---
 xen/arch/arm/traps.c  |  81 +
 xen/include/asm-arm/alternative.h |  74 ---
 xen/include/asm-arm/insn.h|  11 
 xen/include/asm-arm/livepatch.h   |   4 +-
 xen/include/asm-arm/macros.h  |   7 +++
 8 files changed, 172 insertions(+), 137 deletions(-)

-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RFC for-4.13 07/10] xen/arm: Allow insn.h to be called from assembly

2019-09-26 Thread Julien Grall
A follow-up patch will require to include insn.h from assembly code. So
wee need to protect any C-specific definition to avoid compilation
error when used in assembly code.

Signed-off-by: Julien Grall 
---
 xen/include/asm-arm/insn.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/xen/include/asm-arm/insn.h b/xen/include/asm-arm/insn.h
index 19277212e1..00391f83f9 100644
--- a/xen/include/asm-arm/insn.h
+++ b/xen/include/asm-arm/insn.h
@@ -1,8 +1,14 @@
 #ifndef __ARCH_ARM_INSN
 #define __ARCH_ARM_INSN
 
+#ifndef __ASSEMBLY__
+
 #include 
 
+/*
+ * At the moment, arch-specific headers contain only definition for C
+ * code.
+ */
 #if defined(CONFIG_ARM_64)
 # include 
 #elif defined(CONFIG_ARM_32)
@@ -11,6 +17,8 @@
 # error "unknown ARM variant"
 #endif
 
+#endif /* __ASSEMBLY__ */
+
 /* On ARM32,64 instructions are always 4 bytes long. */
 #define ARCH_PATCH_INSN_SIZE 4
 
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RFC for-4.13 02/10] xen/arm64: head: Check if an SError is pending when receiving a vSError

2019-09-26 Thread Julien Grall
At the moment, when we receive an SError exception from the guest, we
don't check if there are any other pending. For hardening the code, we
should ensure any pending SError are accounted to the guest before
executing any code with SError unmasked.

The recently introduced macro 'guest_vector' could used to generate the
two vectors and therefore take advantage of any change required in the
future.

Signed-off-by: Julien Grall 
---
 xen/arch/arm/arm64/entry.S | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index 8665d2844a..40d9f3ec8c 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -324,11 +324,7 @@ guest_fiq_invalid:
 invalid BAD_FIQ
 
 guest_error:
-entry   hyp=0, compat=0
-msr daifclr, #6
-mov x0, sp
-bl  do_trap_guest_serror
-exithyp=0, compat=0
+guest_vector compat=0, iflags=6, trap=guest_serror
 
 guest_sync_compat:
 guest_vector compat=1, iflags=6, trap=guest_sync
@@ -341,11 +337,7 @@ guest_fiq_invalid_compat:
 invalid BAD_FIQ
 
 guest_error_compat:
-entry   hyp=0, compat=1
-msr daifclr, #6
-mov x0, sp
-bl  do_trap_guest_serror
-exithyp=0, compat=1
+guest_vector compat=1, iflags=6, trap=guest_serror
 
 ENTRY(return_to_new_vcpu32)
 exithyp=0, compat=1
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RFC for-4.13 08/10] xen/arm: alternative: add auto-nop infrastructure

2019-09-26 Thread Julien Grall
From: Mark Rutland 

In some cases, one side of an alternative sequence is simply a number of
NOPs used to balance the other side. Keeping track of this manually is
tedious, and the presence of large chains of NOPs makes the code more
painful to read than necessary.

To ameliorate matters, this patch adds a new alternative_else_nop_endif,
which automatically balances an alternative sequence with a trivial NOP
sled.

In many cases, we would like a NOP-sled in the default case, and
instructions patched in in the presence of a feature. To enable the NOPs
to be generated automatically for this case, this patch also adds a new
alternative_if, and updates alternative_else and alternative_endif to
work with either alternative_if or alternative_endif.

The alternative infrastructure was originally ported from Linux. So this
is pretty much a straight backport from commit 792d47379f4d "arm64:
alternative: add auto-nop infrastructure". The only difference is the
nops macro added as not yet existing in Xen.

Signed-off-by: Mark Rutland 
[will: use new nops macro to generate nop sequences]
Signed-off-by: Will Deacon 
[julien: Add nops and port to Xen]
Signed-off-by: Julien Grall 
---
 xen/include/asm-arm/alternative.h | 70 +--
 xen/include/asm-arm/macros.h  |  7 
 2 files changed, 60 insertions(+), 17 deletions(-)

diff --git a/xen/include/asm-arm/alternative.h 
b/xen/include/asm-arm/alternative.h
index 2830a6da2d..e8271ac04e 100644
--- a/xen/include/asm-arm/alternative.h
+++ b/xen/include/asm-arm/alternative.h
@@ -2,6 +2,7 @@
 #define __ASM_ALTERNATIVE_H
 
 #include 
+#include 
 
 #define ARM_CB_PATCH ARM_NCAPS
 
@@ -111,34 +112,55 @@ int apply_alternatives(const struct alt_instr *start, 
const struct alt_instr *en
 .endm
 
 /*
- * Begin an alternative code sequence.
+ * Alternative sequences
+ *
+ * The code for the case where the capability is not present will be
+ * assembled and linked as normal. There are no restrictions on this
+ * code.
+ *
+ * The code for the case where the capability is present will be
+ * assembled into a special section to be used for dynamic patching.
+ * Code for that case must:
+ *
+ * 1. Be exactly the same length (in bytes) as the default code
+ *sequence.
  *
- * The code that follows this macro will be assembled and linked as
- * normal. There are no restrictions on this code.
+ * 2. Not contain a branch target that is used outside of the
+ *alternative sequence it is defined in (branches into an
+ *alternative sequence are not fixed up).
+ */
+
+/*
+ * Begin an alternative code sequence.
  */
 .macro alternative_if_not cap
+   .set .Lasm_alt_mode, 0
.pushsection .altinstructions, "a"
altinstruction_entry 661f, 663f, \cap, 662f-661f, 664f-663f
.popsection
 661:
 .endm
 
+.macro alternative_if cap
+   .set .Lasm_alt_mode, 1
+   .pushsection .altinstructions, "a"
+   altinstruction_entry 663f, 661f, \cap, 664f-663f, 662f-661f
+   .popsection
+   .pushsection .altinstr_replacement, "ax"
+   .align 2/* So GAS knows label 661 is suitably aligned */
+661:
+.endm
+
 /*
- * Provide the alternative code sequence.
- *
- * The code that follows this macro is assembled into a special
- * section to be used for dynamic patching. Code that follows this
- * macro must:
- *
- * 1. Be exactly the same length (in bytes) as the default code
- *sequence.
- *
- * 2. Not contain a branch target that is used outside of the
- *alternative sequence it is defined in (branches into an
- *alternative sequence are not fixed up).
+ * Provide the other half of the alternative code sequence.
  */
 .macro alternative_else
-662:   .pushsection .altinstr_replacement, "ax"
+662:
+   .if .Lasm_alt_mode==0
+   .pushsection .altinstr_replacement, "ax"
+   .else
+   .popsection
+   .endif
 663:
 .endm
 
@@ -154,12 +176,26 @@ int apply_alternatives(const struct alt_instr *start, 
const struct alt_instr *en
  * Complete an alternative code sequence.
  */
 .macro alternative_endif
-664:   .popsection
+664:
+   .if .Lasm_alt_mode==0
+   .popsection
+   .endif
.org. - (664b-663b) + (662b-661b)
.org. - (662b-661b) + (664b-663b)
 .endm
 
 /*
+ * Provides a trivial alternative or default sequence consisting solely
+ * of NOPs. The number of NOPs is chosen automatically to match the
+ * previous case.
+ */
+.macro alternative_else_nop_endif
+alternative_else
+   nops(662b-661b) / ARCH_PATCH_INSN_SIZE
+alternative_endif
+.endm
+
+/*
  * Callback-based alternative epilogue
  */
 .macro alternative_cb_end
diff --git a/xen/include/asm-arm/macros.h b/xen/include/asm-arm/macros.h
index 1d4bb41d15..91ea3505e4 100644
--- a/xen/include/asm-arm/macros.h
+++ b/xen/include/asm-arm/macros.h
@@ -13,4 +13,11 @@
 # error "unknown ARM variant"
 #endif
 
+/* NOP sequence  */
+.macro nops, num
+.rept   \num
+nop
+.endr
+.endm
+
 #endif 

[Xen-devel] [PATCH RFC for-4.13 01/10] xen/arm64: entry: Introduce a macro to generate guest vector and use it

2019-09-26 Thread Julien Grall
Most of the guest vectors are using the same pattern. This makes fairly
tedious to alter the pattern and risk introducing mistakes when updating
each path.

A new macro is introduced to generate the guest vectors and now use it
in the one that use the open-code version.

Signed-off-by: Julien Grall 
---
 xen/arch/arm/arm64/entry.S | 84 --
 1 file changed, 28 insertions(+), 56 deletions(-)

diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index 2d9a2713a1..8665d2844a 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -157,6 +157,30 @@
 
 .endm
 
+/*
+ * Generate a guest vector.
+ *
+ * iflags: Correspond to the list of interrupts to unmask
+ * save_x0_x1: See the description on top of the macro 'entry'
+ */
+.macro  guest_vector compat, iflags, trap, save_x0_x1=1
+entry   hyp=0, compat=\compat, save_x0_x1=\save_x0_x1
+/*
+ * The vSError will be checked while SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
+ * is not set. If a vSError took place, the initial exception will be
+ * skipped. Exit ASAP
+ */
+ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f",
+"nop; nop",
+SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
+msr daifclr, \iflags
+mov x0, sp
+bl  do_trap_\trap
+1:
+exithyp=0, compat=\compat
+.endm
+
+
 /*
  * Bad Abort numbers
  *-
@@ -290,36 +314,10 @@ guest_sync_slowpath:
  * x0/x1 may have been scratch by the fast path above, so avoid
  * to save them.
  */
-entry   hyp=0, compat=0, save_x0_x1=0
-/*
- * The vSError will be checked while SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
- * is not set. If a vSError took place, the initial exception will be
- * skipped. Exit ASAP
- */
-ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f",
-"nop; nop",
-SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
-msr daifclr, #6
-mov x0, sp
-bl  do_trap_guest_sync
-1:
-exithyp=0, compat=0
+guest_vector compat=0, iflags=6, trap=guest_sync, save_x0_x1=0
 
 guest_irq:
-entry   hyp=0, compat=0
-/*
- * The vSError will be checked while SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
- * is not set. If a vSError took place, the initial exception will be
- * skipped. Exit ASAP
- */
-ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f",
-"nop; nop",
-SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
-msr daifclr, #4
-mov x0, sp
-bl  do_trap_irq
-1:
-exithyp=0, compat=0
+guest_vector compat=0, iflags=4, trap=irq
 
 guest_fiq_invalid:
 entry   hyp=0, compat=0
@@ -333,36 +331,10 @@ guest_error:
 exithyp=0, compat=0
 
 guest_sync_compat:
-entry   hyp=0, compat=1
-/*
- * The vSError will be checked while SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
- * is not set. If a vSError took place, the initial exception will be
- * skipped. Exit ASAP
- */
-ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f",
-"nop; nop",
-SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
-msr daifclr, #6
-mov x0, sp
-bl  do_trap_guest_sync
-1:
-exithyp=0, compat=1
+guest_vector compat=1, iflags=6, trap=guest_sync
 
 guest_irq_compat:
-entry   hyp=0, compat=1
-/*
- * The vSError will be checked while SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
- * is not set. If a vSError took place, the initial exception will be
- * skipped. Exit ASAP
- */
-ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f",
-"nop; nop",
-SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
-msr daifclr, #4
-mov x0, sp
-bl  do_trap_irq
-1:
-exithyp=0, compat=1
+guest_vector compat=1, iflags=4, trap=irq
 
 guest_fiq_invalid_compat:
 entry   hyp=0, compat=1
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RFC for-4.13 09/10] xen/arm: asm: Replace use of ALTERNATIVE with alternative_if

2019-09-26 Thread Julien Grall
Using alternative_if makes the code a bit more streamlined.

Take the opportunity to use the new auto-nop infrastructure to avoid
counting the number of nop in the else part for arch/arm/arm64/entry.S

Signed-off-by: Julien Grall 

---
This is pretty much a matter of taste, but at least for arm64 this
allows us to use the auto-nop infrastructure. So the arm32 is more
to keep inline with arm64.
---
 xen/arch/arm/arm32/entry.S | 9 ++---
 xen/arch/arm/arm64/entry.S | 8 +---
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/arm32/entry.S b/xen/arch/arm/arm32/entry.S
index 0b4cd19abd..1428cd3583 100644
--- a/xen/arch/arm/arm32/entry.S
+++ b/xen/arch/arm/arm32/entry.S
@@ -65,9 +65,12 @@ save_guest_regs:
  * If the SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT has been set in the cpu
  * feature, the checking of pending SErrors will be skipped.
  */
-ALTERNATIVE("nop",
-"b skip_check",
-SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
+alternative_if SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
+nop
+alternative_else
+b   skip_check
+alternative_endif
+
 /*
  * Start to check pending virtual abort in the gap of Guest -> HYP
  * world switch.
diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index 458d12f188..91cf6ee6f4 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -170,9 +170,11 @@
  * is not set. If a vSError took place, the initial exception will be
  * skipped. Exit ASAP
  */
-ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f",
-"nop; nop",
-SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
+alternative_if SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
+bl  check_pending_vserror
+cbnzx0, 1f
+alternative_else_nop_endif
+
 mov x0, sp
 bl  enter_hypervisor_from_guest_noirq
 msr daifclr, \iflags
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RFC for-4.13 05/10] xen/arm: alternative: Remove unused parameter for alternative_if_not_cap

2019-09-26 Thread Julien Grall
The macro alternative_if_not_cap is taking two parameters. The second
parameter is never used and it is hard to see how this can be used
correctly as it is only protecting the alternative section magic.

Signed-off-by: Julien Grall 
---
 xen/include/asm-arm/alternative.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/xen/include/asm-arm/alternative.h 
b/xen/include/asm-arm/alternative.h
index dedb6dd001..2830a6da2d 100644
--- a/xen/include/asm-arm/alternative.h
+++ b/xen/include/asm-arm/alternative.h
@@ -116,13 +116,11 @@ int apply_alternatives(const struct alt_instr *start, 
const struct alt_instr *en
  * The code that follows this macro will be assembled and linked as
  * normal. There are no restrictions on this code.
  */
-.macro alternative_if_not cap, enable = 1
-   .if \enable
+.macro alternative_if_not cap
.pushsection .altinstructions, "a"
altinstruction_entry 661f, 663f, \cap, 662f-661f, 664f-663f
.popsection
 661:
-   .endif
 .endm
 
 /*
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RFC for-4.13 10/10] xen/arm64: entry: Ensure the guest state is synced when receiving a vSError

2019-09-26 Thread Julien Grall
At the moment, when a SError is received while checking for a pending
one, we will skip the handling the initial exception.

This includes call to exit_from_guest{, _noirq} that is used to
synchronize part of the guest state with the internal representation.
However, we still call leave_hypervisor_tail() which is used for preempting
the guest and synchronizing back part of the guest state.

exit_from_guest{, _noirq} works in pair with leave_hypervisor_tail(), so
skipping if former may result to a loss of some part of  guest state.
An example is the new vGIC which will save the state of the LRS on exit
from the guest and rewrite all of them on entry to the guest.

For now, calling leave_hypervisor_tail() is not necessary when injecting
a vSError to the guest. But as the path is spread accross multiple file,
it is hard to enforce that for the future (someone we may want to crash the
domain). Therefore it is best to call exit_from_guest{, _noirq} in the
vSError path as well.

Note that the return value of check_pending_vserror is now set in x19
instead of x0. This is because we want to keep the value across call to
C-function and x0, unlike x19, will not be saved by the callee.

Signed-off-by: Julien Grall 

---

I am not aware of any issues other than with the new vGIC. But I
haven't looked hard enough so I think it would be worth to try to fix it
for Xen 4.13.
---
 xen/arch/arm/arm64/entry.S | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index 91cf6ee6f4..f5350247e1 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -168,11 +168,13 @@
 /*
  * The vSError will be checked while SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
  * is not set. If a vSError took place, the initial exception will be
- * skipped. Exit ASAP
+ * skipped.
+ *
+ * However, we still need to call exit_from_guest{,_noirq} as the
+ * return path to the guest may rely on state saved by them.
  */
 alternative_if SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
 bl  check_pending_vserror
-cbnzx0, 1f
 alternative_else_nop_endif
 
 mov x0, sp
@@ -180,6 +182,11 @@
 msr daifclr, \iflags
 mov x0, sp
 bl  enter_hypervisor_from_guest
+
+alternative_if SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT
+cbnzx19, 1f
+alternative_else_nop_endif
+
 mov x0, sp
 bl  do_trap_\trap
 1:
@@ -383,9 +390,9 @@ return_from_trap:
 /*
  * This function is used to check pending virtual SError in the gap of
  * EL1 -> EL2 world switch.
- * The x0 register will be used to indicate the results of detection.
- * x0 -- Non-zero indicates a pending virtual SError took place.
- * x0 -- Zero indicates no pending virtual SError took place.
+ * The register x19 will be used to indicate the results of detection.
+ * x19 -- Non-zero indicates a pending virtual SError took place.
+ * x19 -- Zero indicates no pending virtual SError took place.
  */
 check_pending_vserror:
 /*
@@ -432,9 +439,9 @@ abort_guest_exit_end:
 
 /*
  * Not equal, the pending SError exception took place, set
- * x0 to non-zero.
+ * x19 to non-zero.
  */
-csetx0, ne
+csetx19, ne
 
 ret
 
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RFC for-4.13 03/10] xen/arm: traps: Rework entry/exit from the guest path

2019-09-26 Thread Julien Grall
At the moment, enter_hypervisor_head() and leave_hypervisor_tail() are
used to deal with actions to be done before/after any guest request is
handled.

While they are meant to work in pair, the former is called for most of
the traps, including traps from the same exception level (i.e.
hypervisor) whilst the latter will only be called when returning to the
guest.

As pointed out, the enter_hypervisor_head() is not called from all the
traps, so this makes potentially difficult to extend it for the dealing
with same exception level.

Furthermore, some assembly only path will require to call
enter_hypervisor_tail(). So the function is now directly call by
assembly in for guest vector only. This means that the check whether we
are called in a guest trap can now be removed.

Take the opportunity to rename enter_hypervisor_tail() and
leave_hypervisor_tail() to something more meaningful and document them.
This should help everyone to understand the purpose of the two
functions.

Signed-off-by: Julien Grall 

---

I haven't done the 32-bits part yet. I wanted to gather feedback before
looking in details how to integrate that with Arm32.
---
 xen/arch/arm/arm64/entry.S |  4 ++-
 xen/arch/arm/traps.c   | 71 ++
 2 files changed, 37 insertions(+), 38 deletions(-)

diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index 40d9f3ec8c..9eafae516b 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -147,7 +147,7 @@
 
 .if \hyp == 0 /* Guest mode */
 
-bl  leave_hypervisor_tail /* Disables interrupts on return */
+bl  leave_hypervisor_to_guest /* Disables interrupts on return */
 
 exit_guest \compat
 
@@ -175,6 +175,8 @@
 SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
 msr daifclr, \iflags
 mov x0, sp
+bl  enter_hypervisor_from_guest
+mov x0, sp
 bl  do_trap_\trap
 1:
 exithyp=0, compat=\compat
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index a3b961bd06..20ba34ec91 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2006,47 +2006,46 @@ static inline bool needs_ssbd_flip(struct vcpu *v)
  cpu_require_ssbd_mitigation();
 }
 
-static void enter_hypervisor_head(struct cpu_user_regs *regs)
+/*
+ * Actions that needs to be done after exiting the guest and before any
+ * request from it is handled.
+ */
+void enter_hypervisor_from_guest(struct cpu_user_regs *regs)
 {
-if ( guest_mode(regs) )
-{
-struct vcpu *v = current;
+struct vcpu *v = current;
 
-/* If the guest has disabled the workaround, bring it back on. */
-if ( needs_ssbd_flip(v) )
-arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL);
+/* If the guest has disabled the workaround, bring it back on. */
+if ( needs_ssbd_flip(v) )
+arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL);
 
-/*
- * If we pended a virtual abort, preserve it until it gets cleared.
- * See ARM ARM DDI 0487A.j D1.14.3 (Virtual Interrupts) for details,
- * but the crucial bit is "On taking a vSError interrupt, HCR_EL2.VSE
- * (alias of HCR.VA) is cleared to 0."
- */
-if ( v->arch.hcr_el2 & HCR_VA )
-v->arch.hcr_el2 = READ_SYSREG(HCR_EL2);
+/*
+ * If we pended a virtual abort, preserve it until it gets cleared.
+ * See ARM ARM DDI 0487A.j D1.14.3 (Virtual Interrupts) for details,
+ * but the crucial bit is "On taking a vSError interrupt, HCR_EL2.VSE
+ * (alias of HCR.VA) is cleared to 0."
+ */
+if ( v->arch.hcr_el2 & HCR_VA )
+v->arch.hcr_el2 = READ_SYSREG(HCR_EL2);
 
 #ifdef CONFIG_NEW_VGIC
-/*
- * We need to update the state of our emulated devices using level
- * triggered interrupts before syncing back the VGIC state.
- *
- * TODO: Investigate whether this is necessary to do on every
- * trap and how it can be optimised.
- */
-vtimer_update_irqs(v);
-vcpu_update_evtchn_irq(v);
+/*
+ * We need to update the state of our emulated devices using level
+ * triggered interrupts before syncing back the VGIC state.
+ *
+ * TODO: Investigate whether this is necessary to do on every
+ * trap and how it can be optimised.
+ */
+vtimer_update_irqs(v);
+vcpu_update_evtchn_irq(v);
 #endif
 
-vgic_sync_from_lrs(v);
-}
+vgic_sync_from_lrs(v);
 }
 
 void do_trap_guest_sync(struct cpu_user_regs *regs)
 {
 const union hsr hsr = { .bits = regs->hsr };
 
-enter_hypervisor_head(regs);
-
 switch ( hsr.ec )
 {
 case HSR_EC_WFI_WFE:
@@ -2180,8 +2179,6 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs)
 {
 const union hsr hsr = { .bits = regs->hsr };
 
-enter_hypervisor_head(regs);
-
 switch ( hsr.ec )
 {
 #ifdef CONFIG_ARM_64
@@ 

[Xen-devel] [xen-unstable-smoke test] 141871: tolerable all pass - PUSHED

2019-09-26 Thread osstest service owner
flight 141871 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/141871/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  52633db2c5ed2df352004f97e2272edc90af3cb4
baseline version:
 xen  ff22a91b4c45f9310d0ec0d7ee070d84a373dd87

Last test of basis   141821  2019-09-25 14:00:59 Z1 days
Failing since141827  2019-09-25 18:00:48 Z0 days6 attempts
Testing same since   141871  2019-09-26 14:00:34 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Christian Lindig 
  Ian Jackson 
  Jan Beulich 
  Julien Grall 
  Paul Durrant 
  Razvan Cojocaru 
  Roger Pau Monné 
  Sergey Dyasli 
  Volodymyr Babchuk 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   ff22a91b4c..52633db2c5  52633db2c5ed2df352004f97e2272edc90af3cb4 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 30/47] xen/sched: add support for multiple vcpus per sched unit where missing

2019-09-26 Thread Dario Faggioli
On Thu, 2019-09-26 at 16:40 +0200, Jürgen Groß wrote:
> On 26.09.19 15:53, Dario Faggioli wrote:
> > However, I'm not sure I understand what it is the issue that Jan
> > thinks
> > that has, and in what sense the code/behavior is regarded as
> > "unexpected".
> > 
> > Can you help me see the problem? Maybe, if I realize it, I'd change
> > my
> > preference...
> 
> I have changed it meanwhile and I think the new solution removes a
> latent problem. 
>
Ok, I'll comment directly on the new shape of it then, I guess.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
---
<> (Raistlin Majere)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers

2019-09-26 Thread Roger Pau Monné
On Thu, Sep 26, 2019 at 01:14:04PM +0200, Roger Pau Monné wrote:
> On Fri, Sep 20, 2019 at 01:35:13PM +0200, Jan Beulich wrote:
> > On 03.09.2019 18:14, Roger Pau Monne wrote:
> > A possible consequence of the answers to this might be for
> > the hook's middle parameter to be constified (in patch 4).
> 
> Yes, I think it can be constified.

No, it can't be constified because the handler needs to fill
ioreq->data for read requests.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 47/47] xen/sched: add scheduling granularity enum

2019-09-26 Thread Dario Faggioli
On Thu, 2019-09-26 at 14:37 +0200, Jürgen Groß wrote:
> On 26.09.19 11:46, Dario Faggioli wrote:
> > On Sat, 2019-09-14 at 10:52 +0200, Juergen Gross wrote:
> > > Add a scheduling granularity enum ("cpu", "core", "socket") for
> > > specification of the scheduling granularity. Initially it is set
> > > to
> > > "cpu", this can be modified by the new boot parameter (x86 only)
> > > "sched-gran".
> > > 
> > > According to the selected granularity sched_granularity is set
> > > after
> > > all cpus are online.
> > > 
> > > A test is added for all sched resources holding the same number
> > > of
> > > cpus. Fall back to core- or cpu-scheduling in that case.
> > > 
> > > Signed-off-by: Juergen Gross 
> > > 
> > Reviewed-by: Dario Faggioli 
> 
> Does this still stand with moving all code of this patch to
> cpupool.c? That would avoid making some variables/functions globally
> visible.
>
Yep, I saw this being discussed during review of other patches, but
forgot to mention it here, sorry.

Yes, I'm fine with that, and I'm fine with the consequences of doing
that on this patch (i.e., moving to cpupool.c).

The Reviewed-by still holds.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
---
<> (Raistlin Majere)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 00/84] Remove direct map from Xen

2019-09-26 Thread hongyax

On 26/09/2019 13:02, Julien Grall wrote:


So I understand this correctly, Wei's series has no bug in it and could be 
committed as is without breaking Xen. Am I correct?


The reason I am asking that if you have a few patches that say fix leak/bug. If 
they are bug in Wei's series, then they should be squashed in patches 
introducing them.




My additional patches fixed several bugs found in Wei's series. If those fixes 
are folded in, Wei's series can be committed without breaking Xen.


So, it might make sense to commit Wei's series first with the fixes, then my 
extra series to actually remove the direct map.


Hongyan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 10/11] ioreq: split the code to detect PCI config space accesses

2019-09-26 Thread Roger Pau Monné
On Tue, Sep 10, 2019 at 04:06:20PM +0200, Paul Durrant wrote:
> > -Original Message-
> > From: Roger Pau Monne 
> > Sent: 03 September 2019 17:14
> > To: xen-devel@lists.xenproject.org
> > Cc: Roger Pau Monne ; Paul Durrant 
> > ; Jan Beulich
> > ; Andrew Cooper ; Wei Liu 
> > 
> > Subject: [PATCH v2 10/11] ioreq: split the code to detect PCI config space 
> > accesses
> > 
> > Place the code that converts a PIO/COPY ioreq into a PCI_CONFIG one
> > into a separate function, and adjust the code to make use of this
> > newly introduced function.
> > 
> > No functional change intended.
> > 
> > Signed-off-by: Roger Pau Monné 
> > ---
> > Changes since v1:
> >  - New in this version.
> > ---
> >  xen/arch/x86/hvm/ioreq.c | 111 +++
> >  1 file changed, 67 insertions(+), 44 deletions(-)
> > 
> > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> > index fecdc2786f..33c56b880c 100644
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -183,6 +183,54 @@ static bool hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, 
> > ioreq_t *p)
> >  return true;
> >  }
> > 
> > +static void convert_pci_ioreq(struct domain *d, ioreq_t *p)
> > +{
> > +const struct hvm_mmcfg *mmcfg;
> > +uint32_t cf8 = d->arch.hvm.pci_cf8;
> > +
> > +if ( p->type != IOREQ_TYPE_PIO && p->type != IOREQ_TYPE_COPY )
> > +{
> > +ASSERT_UNREACHABLE();
> > +return;
> > +}
> > +
> > +read_lock(>arch.hvm.mmcfg_lock);
> 
> Actually, looking at this... can you not restrict holding the mmcfg_lock...
> 
> > +if ( (p->type == IOREQ_TYPE_PIO &&
> > +  (p->addr & ~3) == 0xcfc &&
> > +  CF8_ENABLED(cf8)) ||
> > + (p->type == IOREQ_TYPE_COPY &&
> > +  (mmcfg = hvm_mmcfg_find(d, p->addr)) != NULL) )
> > +{
> > +uint32_t x86_fam;
> > +pci_sbdf_t sbdf;
> > +unsigned int reg;
> > +
> > +reg = p->type == IOREQ_TYPE_PIO ? hvm_pci_decode_addr(cf8, p->addr,
> > +  )
> > +: hvm_mmcfg_decode_addr(mmcfg, 
> > p->addr,
> > +);
> 
> ... to within hvm_mmcfg_decode_addr()?

Hm, not really. There's a call to hvm_mmcfg_find in the if condition
which needs the lock to be held, and then breaking this into two
different lock sections (pick lock, get mmcfg, unlock, pick lock,
decode, unlock) could lead to the mmcfg region being freed under our
feet.

I think the locking needs to stay as-is unless we switch to a
different locking model for the mmcfg regions. Note it's a read lock,
so it shouldn't have any contention since modifying the mmcfg region
list is very rare.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 00/21] Refine memblock API

2019-09-26 Thread Mike Rapoport
Hi,

On Thu, Sep 26, 2019 at 08:09:52AM -0500, Adam Ford wrote:
> On Wed, Sep 25, 2019 at 10:17 AM Fabio Estevam  wrote:
> >
> > On Wed, Sep 25, 2019 at 9:17 AM Adam Ford  wrote:
> >
> > > I tried cma=256M and noticed the cma dump at the beginning didn't
> > > change.  Do we need to setup a reserved-memory node like
> > > imx6ul-ccimx6ulsom.dtsi did?
> >
> > I don't think so.
> >
> > Were you able to identify what was the exact commit that caused such 
> > regression?
> 
> I was able to narrow it down the 92d12f9544b7 ("memblock: refactor
> internal allocation functions") that caused the regression with
> Etnaviv.


Can you please test with this change:

diff --git a/mm/memblock.c b/mm/memblock.c
index 7d4f61a..1f5a0eb 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1356,9 +1356,6 @@ static phys_addr_t __init 
memblock_alloc_range_nid(phys_addr_t size,
align = SMP_CACHE_BYTES;
}
 
-   if (end > memblock.current_limit)
-   end = memblock.current_limit;
-
 again:
found = memblock_find_in_range_node(size, align, start, end, nid,
flags);
 
> I also noticed that if I create a reserved memory node as was done one
> imx6ul-ccimx6ulsom.dtsi the 3D seems to work again, but without it, I
> was getting errors regardless of the 'cma=256M' or not.
> I don't have a problem using the reserved memory, but I guess I am not
> sure what the amount should be.  I know for the video decoding 1080p,
> I have historically used cma=128M, but with the 3D also needing some
> memory allocation, is that enough or should I use 256M?
> 
> adam

-- 
Sincerely yours,
Mike.


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers

2019-09-26 Thread Roger Pau Monné
On Thu, Sep 26, 2019 at 05:13:23PM +0200, Jan Beulich wrote:
> On 26.09.2019 15:46, Roger Pau Monné  wrote:
> > On Thu, Sep 26, 2019 at 03:17:15PM +0200, Jan Beulich wrote:
> >> On 26.09.2019 13:14, Roger Pau Monné  wrote:
> >>> On Fri, Sep 20, 2019 at 01:35:13PM +0200, Jan Beulich wrote:
>  Having said this, as a result of having looked at some of the
>  involved code, and with the cover letter not clarifying this,
>  what's the reason for going this seemingly more complicated
>  route, rather than putting vPCI behind the hvm_io_intercept()
>  machinery, just like is the case for other internal handling?
> >>>
> >>> If vPCI is handled at the hvm_io_intercept level (like its done ATM)
> >>> then it's not possible to have both (external) ioreq servers and vPCI
> >>> handling accesses to different devices in the PCI config space, since
> >>> vPCI would trap all accesses to the PCI IO ports and the MCFG regions
> >>> and those would never reach the ioreq processing.
> >>
> >> Why would vPCI (want to) do that? The accept() handler should
> >> sub-class the CF8-CFF port range; there would likely want to
> >> be another struct hvm_io_ops instance dealing with config
> >> space accesses (and perhaps with ones through port I/O and
> >> through MCFG at the same time).
> > 
> > Do you mean to expand hvm_io_handler to add something like a pciconf
> > sub-structure to the existing union of portio and mmio?
> 
> Yes, something along these lines.
> 
> > That's indeed feasible, but I'm not sure why it's better that the
> > approach proposed on this series. Long term I think we would like all
> > intercept handlers to use the ioreq infrastructure and remove the
> > usage of hvm_io_intercept.
> > 
> >> In the end this would likely
> >> more similar to how chipsets handle this on real hardware
> >> than your "internal server" solution (albeit I agree to a
> >> degree it's in implementation detail anyway).
> > 
> > I think the end goal should be to unify the internal and external
> > intercepts into a single point, and the only feasible way to do this
> > is to switch the internal intercepts to use the ioreq infrastructure.
> 
> Well, I recall this having been mentioned as an option; I don't
> recall this being a firm plan. There are certainly benefits to
> such a model, but there's also potentially more overhead (at the
> very least the ioreq_t will then need setting up / maintaining
> everywhere, when right now the interfaces are using more
> immediate parameters).

AFAICT from code in hvmemul_do_io which dispatches to both
hvm_io_intercept and ioreq servers the ioreq is already there, so I'm
not sure why more setup would be required in order to handle internal
intercepts as ioreq servers. For vPCI at least I've been able to get
away without having to modify hvmemul_do_io IIRC.

> But yes, if this _is_ the plan, then going that route right away
> for vPCI is desirable.

I think it would be desirable to have a single point where intercepts
are handled instead of having such different implementations for
internal vs external, and the only way I can devise to achieve this is
by moving intercepts to the ioreq model.

I'm not certainly planning to move all intercepts right now, but I
think it's a good first step having the code in place to allow this,
and at least vPCI using it.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 84/84] x86/pv: fix a couple of direct map assumptions in dom0 building.

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:47AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 
> Signed-off-by: Hongyan Xia 
> ---
>  xen/arch/x86/pv/dom0_build.c| 7 ---
>  xen/include/asm-x86/processor.h | 2 --
>  2 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
> index 202edcaa17..98dcc18d21 100644
> --- a/xen/arch/x86/pv/dom0_build.c
> +++ b/xen/arch/x86/pv/dom0_build.c
> @@ -626,9 +626,10 @@ int __init dom0_construct_pv(struct domain *d,
>  l4start = l4tab = map_xen_pagetable(maddr_to_mfn(mpt_alloc));
>  mpt_alloc += PAGE_SIZE;
>  clear_page(l4tab);
> -init_xen_l4_slots(l4tab, _mfn(virt_to_mfn(l4start)),
> -  d, INVALID_MFN, true);
> -v->arch.guest_table = pagetable_from_paddr(__pa(l4start));
> +init_xen_l4_slots(l4tab, _mfn(virt_to_mfn_walk(l4start)), d,
> +INVALID_MFN, true);
> +v->arch.guest_table =
> +pagetable_from_mfn(_mfn(virt_to_mfn_walk(l4start)));
>  }
>  else
>  {
> diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
> index f571191cdb..7e8d010d07 100644
> --- a/xen/include/asm-x86/processor.h
> +++ b/xen/include/asm-x86/processor.h
> @@ -465,8 +465,6 @@ static inline void disable_each_ist(idt_entry_t *idt)
>  extern idt_entry_t idt_table[];
>  extern idt_entry_t *idt_tables[];
>  
> -DECLARE_PER_CPU(struct tss_struct, init_tss);
> -

Why is this deleted?

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 83/84] x86/pmap: rewrite logic for locking.

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:46AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 
> Due to the limited PMAP entries, another pCPU is allowed to use PMAP
> only when the current pCPU has unmapped all mappings.
> 

Under what condition would two pCPUs try to use PMAP at the same time?

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 82/84] x86: deduplicate code a bit and fix an unmapping bug.

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:45AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
>  void unmap_domain_page(const void *ptr)
> diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
> index 1555a61b84..202edcaa17 100644
> --- a/xen/arch/x86/pv/dom0_build.c
> +++ b/xen/arch/x86/pv/dom0_build.c
> @@ -236,7 +236,7 @@ static __init void setup_pv_physmap(struct domain *d, 
> unsigned long pgtbl_pfn,
>  if ( pl3e )
>  unmap_domain_page(pl3e);
>  
> -//unmap_domain_page(l4start);
> +unmap_domain_page(l4start);

Please fix the bug where it was introduced.

Wei.

>  }
>  
>  static struct page_info * __init alloc_chunk(struct domain *d,
> -- 
> 2.17.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 80/84] x86/setup: Install dummy 1:1 mappings for all mem passed to allocators.

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:43AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 
> This means we no longer have an always-mapped direct map now.

But why a dummy mapping is this needed at all? That's the same question
that was asked in a previous patch.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 81/84] x86/mm: optimise and properly unmap pages in virt_to_mfn_walk().

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:44AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 
> This also resolves a mapcache overflow bug.


This should be squashed into the patch that touched virt_to_mfn_walk.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/CPUID: RSTR_FP_ERR_PTRS depends on FPU

2019-09-26 Thread Jürgen Groß

On 25.09.19 17:27, Jan Beulich wrote:

There's nothing to restore here if there's no FPU in the first place.

Signed-off-by: Jan Beulich 


Release-acked-by: Juergen Gross 


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/shim: fix ballooning down the guest

2019-09-26 Thread Jürgen Groß

On 26.09.19 15:51, Roger Pau Monné  wrote:

On Thu, Sep 26, 2019 at 02:36:18PM +0100, Sergey Dyasli wrote:

Currently ballooning down a pvshim guest causes the following errors
inside the shim:

 d3v0 failed to reserve 512 extents of order 512 for offlining

And the ballooned-out pages stay inside shim and don't reach L0 Xen.

Fix this by passing the correct arguments to pv_shim_offline_memory()
during a XENMEM_decrease_reservation request.



This is missing:

Fixes: b2245acc60c3 ('xen/pvshim: memory hotplug')


Signed-off-by: Sergey Dyasli 


Reviewed-by: Roger Pau Monné 

Also adding Juergen for a release Ack.

Thanks, Roger.



Release-acked-by: Juergen Gross 


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 12/12] livepatch: Add python bindings for livepatch operations

2019-09-26 Thread Wieczorkiewicz, Pawel


> On 25. Sep 2019, at 18:47, Ross Lagerwall  wrote:
> 
> On 9/16/19 12:40 PM, Pawel Wieczorkiewicz wrote:
>> Extend the XC python bindings library to support also all common
>> livepatch operations and actions.
>> Add the python bindings for the following operations:
>> - status (pyxc_livepatch_status):
>>   Requires a payload name as an input.
>>   Returns a status dict containing a state string and a return code
>>   integer.
>> - action (pyxc_livepatch_action):
>>   Requires a payload name and an action id as an input. Timeout and
>>   flags are optional parameters.
>>   Returns a return code integer.
>> - upload (pyxc_livepatch_upload):
>>   Requires a payload name and a module's filename as an input.
>>   Returns a return code integer.
>> - list (pyxc_livepatch_list):
>>   Takes no parameters.
>>   Returns a list of dicts containing each payload's:
>>   * name as a string
>>   * state as a string
>>   * return code as an integer
>>   * list of metadata key=value strings
>> Each functions throws an exception error based on the errno value
>> received from its corresponding libxc function call.
>> Signed-off-by: Pawel Wieczorkiewicz 
>> Reviewed-by: Martin Mazein 
>> Reviewed-by: Andra-Irina Paraschiv 
>> Reviewed-by: Leonard Foerster 
>> Reviewed-by: Norbert Manthey 
>> Acked-by: Marek Marczykowski-Górecki 
> 
> This will be very useful, thanks!
> 
>> ---
>> Changed since v1:
>>   * changed PyList_Append() with PyList_SetItem() as requested by
>> Marek
>>  tools/python/xen/lowlevel/xc/xc.c | 273 
>> ++
>>  1 file changed, 273 insertions(+)
> snip> +static PyObject *pyxc_livepatch_action(XcObject *self,
>> +   PyObject *args,
>> +   PyObject *kwds)
>> +{
>> +int (*action_func)(xc_interface *xch, char *name, uint32_t timeout, 
>> uint64_t flags);
>> +char *name;
>> +unsigned int action;
>> +uint32_t timeout;
>> +uint64_t flags;
>> +int rc;
>> +
>> +static char *kwd_list[] = { "name", "action", "timeout", "flags", NULL 
>> };
>> +
>> +if ( !PyArg_ParseTupleAndKeywords(args, kwds, "sI|Ik", kwd_list,
>> +  , , , ) )
>> +goto error;
>> +
>> +switch (action)
>> +{
>> +case LIVEPATCH_ACTION_UNLOAD:
>> +action_func = xc_livepatch_unload;
>> +break;
>> +case LIVEPATCH_ACTION_REVERT:
>> +action_func = xc_livepatch_revert;
>> +break;
>> +case LIVEPATCH_ACTION_APPLY:
>> +action_func = xc_livepatch_apply;
>> +break;
>> +case LIVEPATCH_ACTION_REPLACE:
>> +action_func = xc_livepatch_replace;
>> +break;
>> +default:
>> +goto error;
>> +}
>> +
>> +rc = action_func(self->xc_handle, name, timeout, flags);
>> +if ( rc )
>> +goto error;
>> +
>> +return Py_BuildValue("i", rc);
> 
> For this and all the other functions which return zero on success, IMO 
> returning None would be more Pythonic.
> 

OK, will change.

>> +error:
>> +return pyxc_error_to_exception(self->xc_handle);
>> +}
>> +
>> +static PyObject *pyxc_livepatch_upload(XcObject *self,
>> +   PyObject *args,
>> +   PyObject *kwds)
>> +{
>> +unsigned char *fbuf = MAP_FAILED;
>> +char *name, *filename;
>> +struct stat buf;
>> +int fd = 0, rc;
>> +ssize_t len;
>> +
>> +static char *kwd_list[] = { "name", "filename", NULL };
>> +
>> +if ( !PyArg_ParseTupleAndKeywords(args, kwds, "ss", kwd_list,
>> +  , ))
>> +goto error;
>> +
>> +fd = open(filename, O_RDONLY);
>> +if ( fd < 0 )
>> +goto error;
>> +
>> +if ( stat(filename, ) != 0 )
>> +goto error;
> 
> I think it would be better to use fstat() to avoid a second path lookup 
> potentially pointing to a different file.
> 

Ah, certainly! Will fix.

>> +
>> +len = buf.st_size;
>> +fbuf = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);
>> +if ( fbuf == MAP_FAILED )
>> +goto error;
>> +
>> +rc = xc_livepatch_upload(self->xc_handle, name, fbuf, len);
>> +if ( rc )
>> +goto error;
>> +
>> +if ( munmap(fbuf, len) )
>> +{
>> +fbuf = MAP_FAILED;
>> +goto error;
>> +}
>> +close(fd);
>> +
>> +return Py_BuildValue("i", rc);;
> 
> Stray semicolon

ACK

> 
>> +error:
>> +if ( fbuf != MAP_FAILED )
>> +munmap(fbuf, len);
>> +if ( fd >= 0 )
>> +close(fd);
> 
> You should probably save & restore errno so you can return the original error.
> 

Yes, that’s right. Will fix.

>> +return pyxc_error_to_exception(self->xc_handle);
> 
> Maybe you can have a conditional return to avoid duplicating the munmap() & 
> close()? E.g.
> 
> return rc ? pyxc_error_to_exception(self->xc_handle) : …
> 

Oh, this indeed can work. Let me apply that. Thanks.

>> +}
>> +
>> +static 

Re: [Xen-devel] [PATCH V6 0/8] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec

2019-09-26 Thread Oleksandr


On 26.09.19 17:56, Julien Grall wrote:

Hi,


Hi Julien




On 9/26/19 12:20 PM, Oleksandr Tyshchenko wrote:

Oleksandr Tyshchenko (8):
   iommu/arm: Add iommu_helpers.c file to keep common for IOMMUs stuff
   iommu/arm: Add ability to handle deferred probing request
   xen/common: Introduce _xrealloc function
   xen/common: Introduce xrealloc_flex_struct() helper macros
   iommu/arm: Add lightweight iommu_fwspec support
   iommu: Order the headers alphabetically in device_tree.c
   iommu/arm: Introduce iommu_add_dt_device API
   iommu/arm: Add Renesas IPMMU-VMSA support


This series is now merged.


Thank you!


Julien, what do you think regarding the following:

https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02304.html

this one is intended to address "the main TODO" in the IPMMU-VMSA driver 
and it would be really nice if


it could go in too... (I will be able to resolve any issues with it 
today/tomorrow if still present)



--
Regards,

Oleksandr Tyshchenko


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 06/11] ioreq: allow dispatching ioreqs to internal servers

2019-09-26 Thread Jan Beulich
On 26.09.2019 15:46, Roger Pau Monné  wrote:
> On Thu, Sep 26, 2019 at 03:17:15PM +0200, Jan Beulich wrote:
>> On 26.09.2019 13:14, Roger Pau Monné  wrote:
>>> On Fri, Sep 20, 2019 at 01:35:13PM +0200, Jan Beulich wrote:
 Having said this, as a result of having looked at some of the
 involved code, and with the cover letter not clarifying this,
 what's the reason for going this seemingly more complicated
 route, rather than putting vPCI behind the hvm_io_intercept()
 machinery, just like is the case for other internal handling?
>>>
>>> If vPCI is handled at the hvm_io_intercept level (like its done ATM)
>>> then it's not possible to have both (external) ioreq servers and vPCI
>>> handling accesses to different devices in the PCI config space, since
>>> vPCI would trap all accesses to the PCI IO ports and the MCFG regions
>>> and those would never reach the ioreq processing.
>>
>> Why would vPCI (want to) do that? The accept() handler should
>> sub-class the CF8-CFF port range; there would likely want to
>> be another struct hvm_io_ops instance dealing with config
>> space accesses (and perhaps with ones through port I/O and
>> through MCFG at the same time).
> 
> Do you mean to expand hvm_io_handler to add something like a pciconf
> sub-structure to the existing union of portio and mmio?

Yes, something along these lines.

> That's indeed feasible, but I'm not sure why it's better that the
> approach proposed on this series. Long term I think we would like all
> intercept handlers to use the ioreq infrastructure and remove the
> usage of hvm_io_intercept.
> 
>> In the end this would likely
>> more similar to how chipsets handle this on real hardware
>> than your "internal server" solution (albeit I agree to a
>> degree it's in implementation detail anyway).
> 
> I think the end goal should be to unify the internal and external
> intercepts into a single point, and the only feasible way to do this
> is to switch the internal intercepts to use the ioreq infrastructure.

Well, I recall this having been mentioned as an option; I don't
recall this being a firm plan. There are certainly benefits to
such a model, but there's also potentially more overhead (at the
very least the ioreq_t will then need setting up / maintaining
everywhere, when right now the interfaces are using more
immediate parameters).

But yes, if this _is_ the plan, then going that route right away
for vPCI is desirable.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3] xen/sched: don't let XEN_RUNSTATE_UPDATE leak into vcpu_runstate_get()

2019-09-26 Thread Julien Grall

Hi Juergen,

On 9/26/19 4:02 PM, Juergen Gross wrote:

vcpu_runstate_get() should never return a state entry time with
XEN_RUNSTATE_UPDATE set. To avoid this let update_runstate_area()
operate on a local runstate copy.

As it is required to first set the XEN_RUNSTATE_UPDATE indicator in
guest memory, then update all the runstate data, and then at last
clear the XEN_RUNSTATE_UPDATE again it is much less effort to have
a local copy of the runstate data instead of keeping only a copy of
state_entry_time.

This problem was introduced with commit 2529c850ea48f036 ("add update
indicator to vcpu_runstate_info").

Reported-by: Andrew Cooper 
Signed-off-by: Juergen Gross 


Acked-by: Julien Grall 

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 09/11] vpci: register as an internal ioreq server

2019-09-26 Thread Roger Pau Monné
On Tue, Sep 10, 2019 at 03:49:41PM +0200, Paul Durrant wrote:
> > -Original Message-
> > From: Roger Pau Monne 
> > Sent: 03 September 2019 17:14
> > To: xen-devel@lists.xenproject.org
> > Cc: Roger Pau Monne ; Ian Jackson 
> > ; Wei Liu
> > ; Andrew Cooper ; George Dunlap 
> > ; Jan
> > Beulich ; Julien Grall ; Konrad 
> > Rzeszutek Wilk
> > ; Stefano Stabellini ; Tim 
> > (Xen.org) ;
> > Paul Durrant 
> > Subject: [PATCH v2 09/11] vpci: register as an internal ioreq server
> > @@ -478,6 +480,67 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, 
> > unsigned int size,
> >  spin_unlock(>vpci->lock);
> >  }
> > 
> > +#ifdef __XEN__
> > +static int ioreq_handler(struct vcpu *v, ioreq_t *req, void *data)
> > +{
> > +pci_sbdf_t sbdf;
> > +
> > +if ( req->type == IOREQ_TYPE_INVALIDATE )
> > +/*
> > + * Ignore invalidate requests, those can be received even without
> > + * having any memory ranges registered, see send_invalidate_req.
> > + */
> > +return X86EMUL_OKAY;
> 
> In general, I wonder whether internal servers will ever need to deal with 
> invalidate? The code only exists to get QEMU to drop its map cache after a 
> decrease_reservation so that the page refs get dropped.

I think the best solution here is to rename hvm_broadcast_ioreq to
hvm_broadcast_ioreq_external and switch it's callers. Both
send_timeoffset_req and send_invalidate_req seem only relevant to
external ioreq servers.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3] xen/sched: don't let XEN_RUNSTATE_UPDATE leak into vcpu_runstate_get()

2019-09-26 Thread Jan Beulich
On 26.09.2019 17:02, Juergen Gross wrote:
> vcpu_runstate_get() should never return a state entry time with
> XEN_RUNSTATE_UPDATE set. To avoid this let update_runstate_area()
> operate on a local runstate copy.
> 
> As it is required to first set the XEN_RUNSTATE_UPDATE indicator in
> guest memory, then update all the runstate data, and then at last
> clear the XEN_RUNSTATE_UPDATE again it is much less effort to have
> a local copy of the runstate data instead of keeping only a copy of
> state_entry_time.
> 
> This problem was introduced with commit 2529c850ea48f036 ("add update
> indicator to vcpu_runstate_info").
> 
> Reported-by: Andrew Cooper 
> Signed-off-by: Juergen Gross 

And yet another time
Reviewed-by: Jan Beulich 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 8/8] xen/arm32: head: Use a page mapping for the 1:1 mapping in create_page_tables()

2019-09-26 Thread Julien Grall

Hi,

On 9/26/19 5:24 AM, Stefano Stabellini wrote:

@@ -530,16 +470,53 @@ create_page_tables:
  blo   1b
  
  /*

- * Defer fixmap and dtb mapping until after paging enabled, to
- * avoid them clashing with the 1:1 mapping.
+ * If Xen is loaded at exactly XEN_VIRT_START then we don't
+ * need an additional 1:1 mapping, the virtual mapping will
+ * suffice.
   */
+cmp   r9, #XEN_VIRT_START
+moveq pc, lr
  
-/* boot pagetable setup complete */

+1:


As far as I can tell, this 1 label is unused. If so, we should remove
it. With that gone:


Hmmm, yes it is.



Reviewed-by: Stefano Stabellini 


Thank you!

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3] xen/sched: don't let XEN_RUNSTATE_UPDATE leak into vcpu_runstate_get()

2019-09-26 Thread Juergen Gross
vcpu_runstate_get() should never return a state entry time with
XEN_RUNSTATE_UPDATE set. To avoid this let update_runstate_area()
operate on a local runstate copy.

As it is required to first set the XEN_RUNSTATE_UPDATE indicator in
guest memory, then update all the runstate data, and then at last
clear the XEN_RUNSTATE_UPDATE again it is much less effort to have
a local copy of the runstate data instead of keeping only a copy of
state_entry_time.

This problem was introduced with commit 2529c850ea48f036 ("add update
indicator to vcpu_runstate_info").

Reported-by: Andrew Cooper 
Signed-off-by: Juergen Gross 
---
V2: add handling on ARM, too (Jan Beulich)
---
 xen/arch/arm/domain.c | 13 -
 xen/arch/x86/domain.c | 17 ++---
 2 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 61d35cd120..f0ee5a2140 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -280,28 +280,31 @@ static void ctxt_switch_to(struct vcpu *n)
 static void update_runstate_area(struct vcpu *v)
 {
 void __user *guest_handle = NULL;
+struct vcpu_runstate_info runstate;
 
 if ( guest_handle_is_null(runstate_guest(v)) )
 return;
 
+memcpy(, >runstate, sizeof(runstate));
+
 if ( VM_ASSIST(v->domain, runstate_update_flag) )
 {
 guest_handle = >runstate_guest.p->state_entry_time + 1;
 guest_handle--;
-v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
+runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
 __raw_copy_to_guest(guest_handle,
-(void *)(>runstate.state_entry_time + 1) - 1, 
1);
+(void *)(_entry_time + 1) - 1, 1);
 smp_wmb();
 }
 
-__copy_to_guest(runstate_guest(v), >runstate, 1);
+__copy_to_guest(runstate_guest(v), , 1);
 
 if ( guest_handle )
 {
-v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
+runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
 smp_wmb();
 __raw_copy_to_guest(guest_handle,
-(void *)(>runstate.state_entry_time + 1) - 1, 
1);
+(void *)(_entry_time + 1) - 1, 1);
 }
 }
 
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index c0faf68852..c7fa224c89 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1600,21 +1600,24 @@ bool update_runstate_area(struct vcpu *v)
 bool rc;
 struct guest_memory_policy policy = { .nested_guest_mode = false };
 void __user *guest_handle = NULL;
+struct vcpu_runstate_info runstate;
 
 if ( guest_handle_is_null(runstate_guest(v)) )
 return true;
 
 update_guest_memory_policy(v, );
 
+memcpy(, >runstate, sizeof(runstate));
+
 if ( VM_ASSIST(v->domain, runstate_update_flag) )
 {
 guest_handle = has_32bit_shinfo(v->domain)
 ? >runstate_guest.compat.p->state_entry_time + 1
 : >runstate_guest.native.p->state_entry_time + 1;
 guest_handle--;
-v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
+runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
 __raw_copy_to_guest(guest_handle,
-(void *)(>runstate.state_entry_time + 1) - 1, 
1);
+(void *)(_entry_time + 1) - 1, 1);
 smp_wmb();
 }
 
@@ -1622,20 +1625,20 @@ bool update_runstate_area(struct vcpu *v)
 {
 struct compat_vcpu_runstate_info info;
 
-XLAT_vcpu_runstate_info(, >runstate);
+XLAT_vcpu_runstate_info(, );
 __copy_to_guest(v->runstate_guest.compat, , 1);
 rc = true;
 }
 else
-rc = __copy_to_guest(runstate_guest(v), >runstate, 1) !=
- sizeof(v->runstate);
+rc = __copy_to_guest(runstate_guest(v), , 1) !=
+ sizeof(runstate);
 
 if ( guest_handle )
 {
-v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
+runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
 smp_wmb();
 __raw_copy_to_guest(guest_handle,
-(void *)(>runstate.state_entry_time + 1) - 1, 
1);
+(void *)(_entry_time + 1) - 1, 1);
 }
 
 update_guest_memory_policy(v, );
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V6 0/8] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec

2019-09-26 Thread Julien Grall

Hi,

On 9/26/19 12:20 PM, Oleksandr Tyshchenko wrote:

Oleksandr Tyshchenko (8):
   iommu/arm: Add iommu_helpers.c file to keep common for IOMMUs stuff
   iommu/arm: Add ability to handle deferred probing request
   xen/common: Introduce _xrealloc function
   xen/common: Introduce xrealloc_flex_struct() helper macros
   iommu/arm: Add lightweight iommu_fwspec support
   iommu: Order the headers alphabetically in device_tree.c
   iommu/arm: Introduce iommu_add_dt_device API
   iommu/arm: Add Renesas IPMMU-VMSA support


This series is now merged.

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 78/84] Revert "x86/smpboot: use xenheap pages for rpts in smpboot."

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:41AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 
> We have properly handled (un)mapping of pages in restore_all_guests.
> This hack is no longer required.
> 
> Signed-off-by: Hongyan Xia 

If you rearrange this series  a bit you don't need this and the patch it
reverts in the first place, I think.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 77/84] x86: properly (un)map pages in restore_all_guests.

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:40AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 
> Before, it assumed both cr3 could be accessed via a direct map. This is
> no longer true. Also, this means we can remove a xenheap mapping hack
> we introduced earlier when building the cr3 of dom0.
> 
> Signed-off-by: Hongyan Xia 
> ---
>  xen/arch/x86/pv/dom0_build.c | 11 +--
>  xen/arch/x86/x86_64/entry.S  | 32 +---
>  2 files changed, 34 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
> index 0ec30988b8..202edcaa17 100644
> --- a/xen/arch/x86/pv/dom0_build.c
> +++ b/xen/arch/x86/pv/dom0_build.c
> @@ -623,9 +623,7 @@ int __init dom0_construct_pv(struct domain *d,
>  if ( !is_pv_32bit_domain(d) )
>  {
>  maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l4_page_table;
> -l4start = l4tab = __va(mpt_alloc);
> -map_pages_to_xen((unsigned long)l4start, maddr_to_mfn(mpt_alloc), 1,
> -PAGE_HYPERVISOR);
> +l4start = l4tab = map_xen_pagetable(maddr_to_mfn(mpt_alloc));
>  mpt_alloc += PAGE_SIZE;
>  clear_page(l4tab);
>  init_xen_l4_slots(l4tab, _mfn(virt_to_mfn(l4start)),
> @@ -635,9 +633,8 @@ int __init dom0_construct_pv(struct domain *d,
>  else
>  {
>  /* Monitor table already created by switch_compat(). */
> -l4start = l4tab = __va(pagetable_get_paddr(v->arch.guest_table));
> -map_pages_to_xen((unsigned long)l4start,
> -pagetable_get_mfn(v->arch.guest_table), 1, PAGE_HYPERVISOR);
> +l4start = l4tab =
> +map_xen_pagetable(pagetable_get_mfn(v->arch.guest_table));
>  /* See public/xen.h on why the following is needed. */
>  maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l3_page_table;
>  l3start = map_xen_pagetable(maddr_to_mfn(mpt_alloc));
> @@ -907,6 +904,8 @@ int __init dom0_construct_pv(struct domain *d,
>  pv_shim_setup_dom(d, l4start, v_start, vxenstore_start, 
> vconsole_start,
>vphysmap_start, si);
>  
> +UNMAP_XEN_PAGETABLE(l4start);
> +

These hunks should be part of a previous patch, right? The one you
changed PV Dom0 construction.

>  if ( is_pv_32bit_domain(d) )
>  xlat_start_info(si, pv_shim ? XLAT_start_info_console_domU
>  : XLAT_start_info_console_dom0);
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index 11385857fa..8ca9a8e0ea 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -150,11 +150,27 @@ restore_all_guest:
>  je.Lrag_copy_done
>  movb  $0, STACK_CPUINFO_FIELD(root_pgt_changed)(%rdx)
>  movabs $PADDR_MASK & PAGE_MASK, %rsi
> -movabs $DIRECTMAP_VIRT_START, %rcx
>  and   %rsi, %rdi
>  and   %r9, %rsi
> -add   %rcx, %rdi
> -add   %rcx, %rsi
> +
> +/* Without a direct map, we have to map pages first before copying. 
> */
> +/* FIXME: optimisations may be needed. */
> +pushq %r9
> +pushq %rdx
> +pushq %rax
> +pushq %rsi
> +shr   $PAGE_SHIFT, %rdi
> +callq map_xen_pagetable
> +popq  %rdi
> +pushq %rax
> +shr   $PAGE_SHIFT, %rdi
> +callq map_xen_pagetable
> +mov   %rax, %rsi
> +mov   0(%rsp), %rdi
> +
> +/* %rsi and %rdi are on top the stack for unmapping. */
> +pushq %rsi
> +
>  mov   $ROOT_PAGETABLE_FIRST_XEN_SLOT, %ecx
>  mov   root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rsi), %r8
>  mov   %r8, root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rdi)
> @@ -166,6 +182,16 @@ restore_all_guest:
>  sub   $(ROOT_PAGETABLE_FIRST_XEN_SLOT - \
>  ROOT_PAGETABLE_LAST_XEN_SLOT - 1) * 8, %rdi
>  rep movsq
> +
> +/* Unmap the two pages. */
> +popq  %rdi
> +callq unmap_xen_pagetable
> +popq  %rdi
> +callq unmap_xen_pagetable
> +popq  %rax
> +popq  %rdx
> +popq  %r9
> +

This section is for synchronising root page tables. Now that it has
become so long, it would be better if you write a C function for this
purpose.

Wei.

>  .Lrag_copy_done:
>  mov   %r9, STACK_CPUINFO_FIELD(xen_cr3)(%rdx)
>  movb  $1, STACK_CPUINFO_FIELD(use_pv_cr3)(%rdx)
> -- 
> 2.17.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] xen/sched: don't let XEN_RUNSTATE_UPDATE leak into vcpu_runstate_get()

2019-09-26 Thread Julien Grall

Hi,

On 9/26/19 3:45 PM, Jürgen Groß wrote:

On 26.09.19 16:27, Julien Grall wrote:

Hi,

On 9/25/19 5:29 AM, Juergen Gross wrote:

vcpu_runstate_get() should never return a state entry time with
XEN_RUNSTATE_UPDATE set. To avoid this let update_runstate_area()
operate on a local runstate copy.

This problem was introduced with commit 2529c850ea48f036 ("add update
indicator to vcpu_runstate_info").

Reported-by: Andrew Cooper 
Signed-off-by: Juergen Gross 
---
V2: add handling on ARM, too (Jan Beulich)
---
  xen/arch/arm/domain.c | 13 -
  xen/arch/x86/domain.c | 17 ++---
  2 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index ae13e47e86..d681ff5c6e 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -280,28 +280,31 @@ static void ctxt_switch_to(struct vcpu *n)
  static void update_runstate_area(struct vcpu *v)
  {
  void __user *guest_handle = NULL;
+    struct vcpu_runstate_info runstate;
  if ( guest_handle_is_null(runstate_guest(v)) )
  return;
+    memcpy(, >runstate, sizeof(runstate));


I am not really happy with this solution. AFAICT, you only copy the 
full structure here just for the benefits of updating state_entry_time.


I saw you discuss about it with Jan, so it would be nice to log at 
least in the commit message the reason why this is done like that.


Isn't the reference to commit 2529c850ea48f036 enough? The update
protocol is clearly described in that commit message.


I meant the reason to use the 'memcpy', which sounds like quite a waste, 
compare to only copy state_entry_time.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH for-next 04/18] x86/mem_sharing: cleanup code in various locations

2019-09-26 Thread Tamas K Lengyel
On Thu, Sep 26, 2019 at 8:20 AM Jan Beulich  wrote:
>
> On 26.09.2019 16:09, Tamas K Lengyel wrote:
> > On Wed, Sep 25, 2019 at 10:15 AM Jan Beulich  wrote:
> >> On 25.09.2019 17:48, Tamas K Lengyel wrote:
> >>> --- a/xen/arch/x86/hvm/hvm.c
> >>> +++ b/xen/arch/x86/hvm/hvm.c
> >>> @@ -1879,12 +1879,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, 
> >>> unsigned long gla,
> >>>  if ( npfec.write_access && (p2mt == p2m_ram_shared) )
> >>>  {
> >>>  ASSERT(p2m_is_hostp2m(p2m));
> >>> -sharing_enomem =
> >>> -(mem_sharing_unshare_page(currd, gfn, 0) < 0);
> >>> +sharing_enomem = mem_sharing_unshare_page(currd, gfn, 0);
> >>
> >> I guess the implication here is that the function can only return
> >> -ENOMEM? Not very forward compatible, but well. However, if you
> >> touch this already, shouldn't you at least make "sharing_enomem"
> >> bool?
> >
> > Correct, there is a BUG_ON for every other rc value but ENOMEM. We
> > could turn it into a bool but I don't see a reason for it, perhaps
> > there will be another rc value in the future that we want to handle
> > gracefully.
>
> At which point the variable's name will no longer be appropriate.
> Hence my request to make it bool; at such a future point the code
> would need touching again anyway if you (understandably) don't
> want to make more than purely cosmetic changes now.

By the way, it is made bool in patch 7 of the series because there is
no need to call this function directly here.

Tamas

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] xen/sched: don't let XEN_RUNSTATE_UPDATE leak into vcpu_runstate_get()

2019-09-26 Thread Jürgen Groß

On 26.09.19 16:27, Julien Grall wrote:

Hi,

On 9/25/19 5:29 AM, Juergen Gross wrote:

vcpu_runstate_get() should never return a state entry time with
XEN_RUNSTATE_UPDATE set. To avoid this let update_runstate_area()
operate on a local runstate copy.

This problem was introduced with commit 2529c850ea48f036 ("add update
indicator to vcpu_runstate_info").

Reported-by: Andrew Cooper 
Signed-off-by: Juergen Gross 
---
V2: add handling on ARM, too (Jan Beulich)
---
  xen/arch/arm/domain.c | 13 -
  xen/arch/x86/domain.c | 17 ++---
  2 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index ae13e47e86..d681ff5c6e 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -280,28 +280,31 @@ static void ctxt_switch_to(struct vcpu *n)
  static void update_runstate_area(struct vcpu *v)
  {
  void __user *guest_handle = NULL;
+    struct vcpu_runstate_info runstate;
  if ( guest_handle_is_null(runstate_guest(v)) )
  return;
+    memcpy(, >runstate, sizeof(runstate));


I am not really happy with this solution. AFAICT, you only copy the full 
structure here just for the benefits of updating state_entry_time.


I saw you discuss about it with Jan, so it would be nice to log at least 
in the commit message the reason why this is done like that.


Isn't the reference to commit 2529c850ea48f036 enough? The update
protocol is clearly described in that commit message.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-next test] 141809: regressions - FAIL

2019-09-26 Thread osstest service owner
flight 141809 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/141809/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-nested-amd 10 debian-hvm-install  fail REGR. vs. 141737

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-raw7 xen-boot fail  like 141737
 test-amd64-i386-libvirt   7 xen-boot fail  like 141737
 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail  like 141737
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail  like 141737
 test-amd64-i386-freebsd10-i386  7 xen-bootfail like 141737
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail like 
141737
 test-amd64-i386-examine   8 reboot   fail  like 141737
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot  fail like 141737
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-boot fail like 
141737
 test-amd64-i386-xl7 xen-boot fail  like 141737
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-bootfail like 141737
 test-amd64-i386-xl-shadow 7 xen-boot fail  like 141737
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-boot  fail like 141737
 test-amd64-i386-xl-qemuu-win10-i386  7 xen-boot   fail like 141737
 test-amd64-i386-xl-xsm7 xen-boot fail  like 141737
 test-amd64-i386-pair 10 xen-boot/src_hostfail  like 141737
 test-amd64-i386-pair 11 xen-boot/dst_hostfail  like 141737
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-bootfail like 141737
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-boot   fail like 141737
 test-amd64-i386-xl-qemut-win10-i386  7 xen-boot   fail like 141737
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot  fail like 141737
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  7 xen-boot   fail like 141737
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  7 xen-boot   fail like 141737
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-boot  fail like 141737
 test-amd64-i386-libvirt-xsm   7 xen-boot fail  like 141737
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  7 xen-boot fail like 141737
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm  7 xen-boot   fail like 141737
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-boot   fail like 141737
 test-amd64-i386-xl-pvshim 7 xen-boot fail  like 141737
 test-amd64-i386-freebsd10-amd64  7 xen-boot   fail like 141737
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-boot   fail like 141737
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-boot   fail like 141737
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-boot   fail like 141737
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 141737
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 141737
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 141737
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 141737
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 141737
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 141737
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 

Re: [Xen-devel] [RFC PATCH 76/84] x86/setup: also clear the permission bits in the dummy 1:1 mapping.

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:39AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 

Assuming we end up keeping those calls, this patch should be squashed.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 30/47] xen/sched: add support for multiple vcpus per sched unit where missing

2019-09-26 Thread Jürgen Groß

On 26.09.19 15:53, Dario Faggioli wrote:

On Wed, 2019-09-25 at 15:07 +0200, Jan Beulich wrote:

On 25.09.2019 14:40, Jürgen Groß  wrote:

On 24.09.19 17:22, Jan Beulich wrote:

On 24.09.2019 17:09, Jürgen Groß wrote:

On 24.09.19 17:00, Jan Beulich wrote:

On 24.09.2019 16:41, Jürgen Groß wrote:

for_each_sched_unit_vcpu() for an idle
unit might end premature when one of the vcpus is running
in another
unit (idle_vcpu->sched_unit != idle_unit).


Oh, that (v)->sched_unit == (i) in the construct is clearly
unexpected.
Is this really still needed by the end of the series? I
realize that
_some_ check is needed, but could this perhaps be arranged in
a way
that you'd still hit all vCPU-s when using it on an idle
unit, no
matter whether they're in use as a substitute in a "real"
unit?


I could do that by having another linked list in struct vcpu.
This way
I can avoid it.


Oh, no, not another list just for this purpose. I was rather
thinking
of e.g. a comparison of IDs.


That would result either in something like:

(v)->vcpu_id < (u)->unit_id + (u)->res->cpupool->granularity

requiring to make struct sched_resource public as keyhandler.c
needs
for_each_sched_unit_vcpu() plus being quite expensive,


I agree this is not a good option.


or:

!(u)->next_in_list || (v)->vcpu_id < (u)->next_in_list->unit_id

which seems to be more expensive as the current variant, too.


It's not this much more expensive, and it eliminates unexpected
(as I would call it) behavior, so I think I'd go this route.


So, I honestly like the way it's currently done in Juergen's pathes.

However, I'm not sure I understand what it is the issue that Jan thinks
that has, and in what sense the code/behavior is regarded as
"unexpected".

Can you help me see the problem? Maybe, if I realize it, I'd change my
preference...


I have changed it meanwhile and I think the new solution removes a
latent problem. Otherwise one would have to be very careful not to use
for_each_sched_unit_vcpu() for idle units, as this might result in
occasional wrong results.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 75/84] x86/mm: handle PSE early termination cases in virt_to_mfn_walk().

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:38AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 
> Signed-off-by: Hongyan Xia 
> ---
>  xen/arch/x86/mm.c | 36 ++--
>  1 file changed, 34 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index ab760ecc1f..39ba9f9bf4 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5058,8 +5058,40 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
>  
>  unsigned long virt_to_mfn_walk(void *va)
>  {
> -l1_pgentry_t *pl1e = virt_to_xen_l1e((unsigned long)(va));
> -unsigned long ret = l1e_get_pfn(*pl1e);
> +unsigned long ret;
> +l3_pgentry_t *pl3e;
> +l2_pgentry_t *pl2e;
> +l1_pgentry_t *pl1e;
> +
> +/*
> + * FIXME: This is rather unoptimised, because e.g. virt_to_xen_l2e
> + * recomputes virt_to_xen_l3e again. Clearly one can keep the result and
> + * carry on.
> + */
> +
> +pl3e = virt_to_xen_l3e((unsigned long)(va));
> +BUG_ON(!(l3e_get_flags(*pl3e) & _PAGE_PRESENT));
> +if ( l3e_get_flags(*pl3e) & _PAGE_PSE )
> +{
> +ret = l3e_get_pfn(*pl3e);
> +ret |= (((unsigned long)va & ((1UL << L3_PAGETABLE_SHIFT)-1)) >> 
> PAGE_SHIFT);
> +unmap_xen_pagetable(pl3e);
> +return ret;
> +}
> +
> +pl2e = virt_to_xen_l2e((unsigned long)(va));
> +BUG_ON(!(l2e_get_flags(*pl2e) & _PAGE_PRESENT));
> +if ( l2e_get_flags(*pl2e) & _PAGE_PSE )
> +{
> +ret = l2e_get_pfn(*pl2e);
> +ret |= (((unsigned long)va & ((1UL << L2_PAGETABLE_SHIFT)-1)) >> 
> PAGE_SHIFT);
> +unmap_xen_pagetable(pl2e);
> +return ret;
> +}
> +
> +pl1e = virt_to_xen_l1e((unsigned long)(va));
> +BUG_ON(!(l1e_get_flags(*pl1e) & _PAGE_PRESENT));
> +ret = l1e_get_pfn(*pl1e);

Don't you end up leaking pl3e and pl2e in the !PSE case?

Also, if you only want to walk page table that is already populated,
there may be a better way to do it than calling virt_to_xen_lXe.

Wei.

>  unmap_xen_pagetable(pl1e);
>  return ret;
>  }
> -- 
> 2.17.1
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V6 7/8] iommu/arm: Introduce iommu_add_dt_device API

2019-09-26 Thread Jan Beulich
On 26.09.2019 14:52, Julien Grall wrote:
> On 9/26/19 12:20 PM, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko 
>>
>> The main puprose of this patch is to add a way to register DT device
>> (which is behind the IOMMU) using the generic IOMMU DT bindings [1]
>> before assigning that device to a domain.
>>
>> So, this patch adds new "iommu_add_dt_device" API for adding DT device
>> to the IOMMU using generic IOMMU DT bindings and previously added
>> "iommu_fwspec" support. As devices can be assigned to the hardware domain
>> and other domains this function is called from two places: handle_device()
>> and iommu_do_dt_domctl().
>>
>> Besides that, this patch adds new "dt_xlate" callback (borrowed from
>> Linux "of_xlate") for providing the driver with DT IOMMU specifier
>> which describes the IOMMU master interfaces of that device (device IDs, etc).
>> According to the generic IOMMU DT bindings the context of required
>> properties for IOMMU device/master node (#iommu-cells, iommus) depends
>> on many factors and is really driver depended thing.
>>
>> Please note, all IOMMU drivers which support generic IOMMU DT bindings
>> should use "dt_xlate" and "add_device" callbacks.
>>
>> [1] 
>> https://www.kernel.org/doc/Documentation/devicetree/bindings/iommu/iommu.txt
>>
>> Signed-off-by: Oleksandr Tyshchenko 
> 
> Acked-by: Julien Grall 

Acked-by: Jan Beulich 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v7 3/3] AMD/IOMMU: pre-fill all DTEs right after table allocation

2019-09-26 Thread Jan Beulich
Make sure we don't leave any DTEs unexpected requests through which
would be passed through untranslated. Set V and IV right away (with
all other fields left as zero), relying on the V and/or IV bits
getting cleared only by amd_iommu_set_root_page_table() and
amd_iommu_set_intremap_table() under special pass-through circumstances.
Switch back to initial settings in amd_iommu_disable_domain_device().

Take the liberty and also make the latter function static, constifying
its first parameter at the same time, at this occasion.

Signed-off-by: Jan Beulich 
Reviewed-by: Paul Durrant 
---
v7: Avoid writing the DT twice during initial allocation.
v6: New.

--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1262,12 +1262,28 @@ static int __init amd_iommu_setup_device
 
 if ( !dt )
 {
+unsigned int size = dt_alloc_size();
+
 /* allocate 'device table' on a 4K boundary */
 dt = IVRS_MAPPINGS_DEVTAB(ivrs_mappings) =
-allocate_buffer(dt_alloc_size(), "Device Table", true);
+allocate_buffer(size, "Device Table", false);
+if ( !dt )
+return -ENOMEM;
+
+/*
+ * Prefill every DTE such that all kinds of requests will get aborted.
+ * Besides the two bits set to true below this builds upon
+ * IOMMU_DEV_TABLE_SYS_MGT_DMA_ABORTED,
+ * IOMMU_DEV_TABLE_IO_CONTROL_ABORTED, as well as
+ * IOMMU_DEV_TABLE_INT_CONTROL_ABORTED all being zero, and us also
+ * wanting at least TV, GV, I, and EX set to false.
+ */
+for ( bdf = 0, size /= sizeof(*dt); bdf < size; ++bdf )
+dt[bdf] = (struct amd_iommu_dte){
+  .v = true,
+  .iv = true,
+  };
 }
-if ( !dt )
-return -ENOMEM;
 
 /* Add device table entries */
 for ( bdf = 0; bdf < ivrs_bdf_entries; bdf++ )
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -267,9 +267,9 @@ static void __hwdom_init amd_iommu_hwdom
 setup_hwdom_pci_devices(d, amd_iommu_add_device);
 }
 
-void amd_iommu_disable_domain_device(struct domain *domain,
- struct amd_iommu *iommu,
- u8 devfn, struct pci_dev *pdev)
+static void amd_iommu_disable_domain_device(const struct domain *domain,
+struct amd_iommu *iommu,
+uint8_t devfn, struct pci_dev 
*pdev)
 {
 struct amd_iommu_dte *table, *dte;
 unsigned long flags;
@@ -284,9 +284,21 @@ void amd_iommu_disable_domain_device(str
 spin_lock_irqsave(>lock, flags);
 if ( dte->tv || dte->v )
 {
+/* See the comment in amd_iommu_setup_device_table(). */
+dte->int_ctl = IOMMU_DEV_TABLE_INT_CONTROL_ABORTED;
+smp_wmb();
+dte->iv = true;
 dte->tv = false;
-dte->v = false;
+dte->gv = false;
 dte->i = false;
+dte->ex = false;
+dte->sa = false;
+dte->se = false;
+dte->sd = false;
+dte->sys_mgt = IOMMU_DEV_TABLE_SYS_MGT_DMA_ABORTED;
+dte->ioctl = IOMMU_DEV_TABLE_IO_CONTROL_ABORTED;
+smp_wmb();
+dte->v = true;
 
 amd_iommu_flush_device(iommu, req_id);
 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v7 1/3] AMD/IOMMU: allocate one device table per PCI segment

2019-09-26 Thread Jan Beulich
Having a single device table for all segments can't possibly be right.
(Even worse, the symbol wasn't static despite being used in just one
source file.) Attach the device tables to their respective IVRS mapping
ones.

Signed-off-by: Jan Beulich 
Reviewed-by: Paul Durrant 
---
v6: New.

--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -39,7 +39,6 @@ unsigned int __read_mostly ivrs_bdf_entr
 u8 __read_mostly ivhd_type;
 static struct radix_tree_root ivrs_maps;
 LIST_HEAD_READ_MOSTLY(amd_iommu_head);
-struct table_struct device_table;
 bool_t iommuv2_enabled;
 
 static bool iommu_has_ht_flag(struct amd_iommu *iommu, u8 mask)
@@ -989,6 +988,12 @@ static void disable_iommu(struct amd_iom
 spin_unlock_irqrestore(>lock, flags);
 }
 
+static unsigned int __init dt_alloc_size(void)
+{
+return PAGE_SIZE << get_order_from_bytes(ivrs_bdf_entries *
+ IOMMU_DEV_TABLE_ENTRY_SIZE);
+}
+
 static void __init deallocate_buffer(void *buf, uint32_t sz)
 {
 int order = 0;
@@ -999,12 +1004,6 @@ static void __init deallocate_buffer(voi
 }
 }
 
-static void __init deallocate_device_table(struct table_struct *table)
-{
-deallocate_buffer(table->buffer, table->alloc_size);
-table->buffer = NULL;
-}
-
 static void __init deallocate_ring_buffer(struct ring_buffer *ring_buf)
 {
 deallocate_buffer(ring_buf->buffer, ring_buf->alloc_size);
@@ -1068,8 +1067,29 @@ static void * __init allocate_ppr_log(st
 IOMMU_PPR_LOG_DEFAULT_ENTRIES, "PPR Log");
 }
 
+/*
+ * Within ivrs_mappings[] we allocate an extra array element to store
+ * - segment number,
+ * - device table.
+ */
+#define IVRS_MAPPINGS_SEG(m) (m)[ivrs_bdf_entries].dte_requestor_id
+#define IVRS_MAPPINGS_DEVTAB(m) (m)[ivrs_bdf_entries].intremap_table
+
+static void __init free_ivrs_mapping(void *ptr)
+{
+const struct ivrs_mappings *ivrs_mappings = ptr;
+
+if ( IVRS_MAPPINGS_DEVTAB(ivrs_mappings) )
+deallocate_buffer(IVRS_MAPPINGS_DEVTAB(ivrs_mappings),
+  dt_alloc_size());
+
+xfree(ptr);
+}
+
 static int __init amd_iommu_init_one(struct amd_iommu *iommu, bool intr)
 {
+const struct ivrs_mappings *ivrs_mappings;
+
 if ( allocate_cmd_buffer(iommu) == NULL )
 goto error_out;
 
@@ -1082,13 +1102,15 @@ static int __init amd_iommu_init_one(str
 if ( intr && !set_iommu_interrupt_handler(iommu) )
 goto error_out;
 
-/* To make sure that device_table.buffer has been successfully allocated */
-if ( device_table.buffer == NULL )
+/* Make sure that the device table has been successfully allocated. */
+ivrs_mappings = get_ivrs_mappings(iommu->seg);
+if ( !IVRS_MAPPINGS_DEVTAB(ivrs_mappings) )
 goto error_out;
 
-iommu->dev_table.alloc_size = device_table.alloc_size;
-iommu->dev_table.entries = device_table.entries;
-iommu->dev_table.buffer = device_table.buffer;
+iommu->dev_table.alloc_size = dt_alloc_size();
+iommu->dev_table.entries = iommu->dev_table.alloc_size /
+   IOMMU_DEV_TABLE_ENTRY_SIZE;
+iommu->dev_table.buffer = IVRS_MAPPINGS_DEVTAB(ivrs_mappings);
 
 enable_iommu(iommu);
 printk("AMD-Vi: IOMMU %d Enabled.\n", nr_amd_iommus );
@@ -1135,11 +1157,8 @@ static void __init amd_iommu_init_cleanu
 xfree(iommu);
 }
 
-/* free device table */
-deallocate_device_table(_table);
-
-/* free ivrs_mappings[] */
-radix_tree_destroy(_maps, xfree);
+/* Free ivrs_mappings[] and their device tables. */
+radix_tree_destroy(_maps, free_ivrs_mapping);
 
 iommu_enabled = 0;
 iommu_hwdom_passthrough = false;
@@ -1147,12 +1166,6 @@ static void __init amd_iommu_init_cleanu
 iommuv2_enabled = 0;
 }
 
-/*
- * We allocate an extra array element to store the segment number
- * (and in the future perhaps other global information).
- */
-#define IVRS_MAPPINGS_SEG(m) m[ivrs_bdf_entries].dte_requestor_id
-
 struct ivrs_mappings *get_ivrs_mappings(u16 seg)
 {
 return radix_tree_lookup(_maps, seg);
@@ -1235,24 +1248,18 @@ static int __init alloc_ivrs_mappings(u1
 static int __init amd_iommu_setup_device_table(
 u16 seg, struct ivrs_mappings *ivrs_mappings)
 {
+struct amd_iommu_dte *dt = IVRS_MAPPINGS_DEVTAB(ivrs_mappings);
 unsigned int bdf;
 
 BUG_ON( (ivrs_bdf_entries == 0) );
 
-if ( !device_table.buffer )
+if ( !dt )
 {
 /* allocate 'device table' on a 4K boundary */
-device_table.alloc_size = PAGE_SIZE <<
-  get_order_from_bytes(
-  PAGE_ALIGN(ivrs_bdf_entries *
-  IOMMU_DEV_TABLE_ENTRY_SIZE));
-device_table.entries = device_table.alloc_size /
-   IOMMU_DEV_TABLE_ENTRY_SIZE;
-
-device_table.buffer = allocate_buffer(device_table.alloc_size,
-

[Xen-devel] [PATCH v7 2/3] AMD/IOMMU: allow callers to request allocate_buffer() to skip its memset()

2019-09-26 Thread Jan Beulich
The command ring buffer doesn't need clearing up front in any event.
Subsequently we'll also want to avoid clearing the device tables.

While playing with functions signatures replace undue use of fixed width
types at the same time, and extend this to deallocate_buffer() as well.

Signed-off-by: Jan Beulich 
---
v7: New.

--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -994,12 +994,12 @@ static unsigned int __init dt_alloc_size
  IOMMU_DEV_TABLE_ENTRY_SIZE);
 }
 
-static void __init deallocate_buffer(void *buf, uint32_t sz)
+static void __init deallocate_buffer(void *buf, unsigned long sz)
 {
-int order = 0;
 if ( buf )
 {
-order = get_order_from_bytes(sz);
+unsigned int order = get_order_from_bytes(sz);
+
 __free_amd_iommu_tables(buf, order);
 }
 }
@@ -1012,10 +1012,11 @@ static void __init deallocate_ring_buffe
 ring_buf->tail = 0;
 }
 
-static void * __init allocate_buffer(uint32_t alloc_size, const char *name)
+static void *__init allocate_buffer(unsigned long alloc_size,
+const char *name, bool clear)
 {
-void * buffer;
-int order = get_order_from_bytes(alloc_size);
+void *buffer;
+unsigned int order = get_order_from_bytes(alloc_size);
 
 buffer = __alloc_amd_iommu_tables(order);
 
@@ -1025,13 +1026,16 @@ static void * __init allocate_buffer(uin
 return NULL;
 }
 
-memset(buffer, 0, PAGE_SIZE * (1UL << order));
+if ( clear )
+memset(buffer, 0, PAGE_SIZE << order);
+
 return buffer;
 }
 
-static void * __init allocate_ring_buffer(struct ring_buffer *ring_buf,
-  uint32_t entry_size,
-  uint64_t entries, const char *name)
+static void *__init allocate_ring_buffer(struct ring_buffer *ring_buf,
+ unsigned int entry_size,
+ unsigned long entries,
+ const char *name, bool clear)
 {
 ring_buf->head = 0;
 ring_buf->tail = 0;
@@ -1041,7 +1045,8 @@ static void * __init allocate_ring_buffe
 ring_buf->alloc_size = PAGE_SIZE << get_order_from_bytes(entries *
  entry_size);
 ring_buf->entries = ring_buf->alloc_size / entry_size;
-ring_buf->buffer = allocate_buffer(ring_buf->alloc_size, name);
+ring_buf->buffer = allocate_buffer(ring_buf->alloc_size, name, clear);
+
 return ring_buf->buffer;
 }
 
@@ -1050,21 +1055,23 @@ static void * __init allocate_cmd_buffer
 /* allocate 'command buffer' in power of 2 increments of 4K */
 return allocate_ring_buffer(>cmd_buffer, sizeof(cmd_entry_t),
 IOMMU_CMD_BUFFER_DEFAULT_ENTRIES,
-"Command Buffer");
+"Command Buffer", false);
 }
 
 static void * __init allocate_event_log(struct amd_iommu *iommu)
 {
 /* allocate 'event log' in power of 2 increments of 4K */
 return allocate_ring_buffer(>event_log, sizeof(event_entry_t),
-IOMMU_EVENT_LOG_DEFAULT_ENTRIES, "Event Log");
+IOMMU_EVENT_LOG_DEFAULT_ENTRIES, "Event Log",
+true);
 }
 
 static void * __init allocate_ppr_log(struct amd_iommu *iommu)
 {
 /* allocate 'ppr log' in power of 2 increments of 4K */
 return allocate_ring_buffer(>ppr_log, sizeof(ppr_entry_t),
-IOMMU_PPR_LOG_DEFAULT_ENTRIES, "PPR Log");
+IOMMU_PPR_LOG_DEFAULT_ENTRIES, "PPR Log",
+true);
 }
 
 /*
@@ -1257,7 +1264,7 @@ static int __init amd_iommu_setup_device
 {
 /* allocate 'device table' on a 4K boundary */
 dt = IVRS_MAPPINGS_DEVTAB(ivrs_mappings) =
-allocate_buffer(dt_alloc_size(), "Device Table");
+allocate_buffer(dt_alloc_size(), "Device Table", true);
 }
 if ( !dt )
 return -ENOMEM;


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v7 0/3] AMD IOMMU: further improvements

2019-09-26 Thread Jan Beulich
01: allocate one device table per PCI segment
02: allow callers to request allocate_buffer() to skip its memset()
03: pre-fill all DTEs right after table allocation

Jan




___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] xen/sched: don't let XEN_RUNSTATE_UPDATE leak into vcpu_runstate_get()

2019-09-26 Thread Julien Grall

Hi,

On 9/25/19 5:29 AM, Juergen Gross wrote:

vcpu_runstate_get() should never return a state entry time with
XEN_RUNSTATE_UPDATE set. To avoid this let update_runstate_area()
operate on a local runstate copy.

This problem was introduced with commit 2529c850ea48f036 ("add update
indicator to vcpu_runstate_info").

Reported-by: Andrew Cooper 
Signed-off-by: Juergen Gross 
---
V2: add handling on ARM, too (Jan Beulich)
---
  xen/arch/arm/domain.c | 13 -
  xen/arch/x86/domain.c | 17 ++---
  2 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index ae13e47e86..d681ff5c6e 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -280,28 +280,31 @@ static void ctxt_switch_to(struct vcpu *n)
  static void update_runstate_area(struct vcpu *v)
  {
  void __user *guest_handle = NULL;
+struct vcpu_runstate_info runstate;
  
  if ( guest_handle_is_null(runstate_guest(v)) )

  return;
  
+memcpy(, >runstate, sizeof(runstate));


I am not really happy with this solution. AFAICT, you only copy the full 
structure here just for the benefits of updating state_entry_time.


I saw you discuss about it with Jan, so it would be nice to log at least 
in the commit message the reason why this is done like that.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 71/84] x86/setup: start tearing down the direct map.

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:34AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 
> Signed-off-by: Hongyan Xia 
> ---
>  xen/arch/x86/setup.c| 4 ++--
>  xen/common/page_alloc.c | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index e964c032f6..3dc2fad987 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1367,7 +1367,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  
>  if ( map_e < end )
>  {
> -map_pages_to_xen((unsigned long)__va(map_e), 
> maddr_to_mfn(map_e),
> +map_pages_to_xen((unsigned long)__va(map_e), INVALID_MFN,
>   PFN_DOWN(end - map_e), PAGE_HYPERVISOR);

Why don't you just remove the calls to map_pages_to_xen?

>  init_boot_pages(map_e, end);
>  map_e = end;
> @@ -1382,7 +1382,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  }
>  if ( s < map_s )
>  {
> -map_pages_to_xen((unsigned long)__va(s), maddr_to_mfn(s),
> +map_pages_to_xen((unsigned long)__va(s), INVALID_MFN,
>   PFN_DOWN(map_s - s), PAGE_HYPERVISOR);
>  init_boot_pages(s, map_s);
>  }
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index a00db4c0d9..deeeac065c 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -2157,7 +2157,7 @@ void *alloc_xenheap_pages(unsigned int order, unsigned 
> int memflags)
>  map_pages_to_xen((unsigned long)ret, page_to_mfn(pg),
>   1UL << order, PAGE_HYPERVISOR);
>  
> -return page_to_virt(pg);
> +return ret;

This hunk is a fix to a previous patch. It doesn't below here.

Wei.

>  }
>  
>  
> -- 
> 2.17.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 30/47] xen/sched: add support for multiple vcpus per sched unit where missing

2019-09-26 Thread Jan Beulich
On 26.09.2019 15:53, Dario Faggioli wrote:
> On Wed, 2019-09-25 at 15:07 +0200, Jan Beulich wrote:
>> On 25.09.2019 14:40, Jürgen Groß  wrote:
>>> On 24.09.19 17:22, Jan Beulich wrote:
 On 24.09.2019 17:09, Jürgen Groß wrote:
> On 24.09.19 17:00, Jan Beulich wrote:
>> On 24.09.2019 16:41, Jürgen Groß wrote:
>>> for_each_sched_unit_vcpu() for an idle
>>> unit might end premature when one of the vcpus is running
>>> in another
>>> unit (idle_vcpu->sched_unit != idle_unit).
>>
>> Oh, that (v)->sched_unit == (i) in the construct is clearly
>> unexpected.
>> Is this really still needed by the end of the series? I
>> realize that
>> _some_ check is needed, but could this perhaps be arranged in
>> a way
>> that you'd still hit all vCPU-s when using it on an idle
>> unit, no
>> matter whether they're in use as a substitute in a "real"
>> unit?
>
> I could do that by having another linked list in struct vcpu.
> This way
> I can avoid it.

 Oh, no, not another list just for this purpose. I was rather
 thinking
 of e.g. a comparison of IDs.
>>>
>>> That would result either in something like:
>>>
>>> (v)->vcpu_id < (u)->unit_id + (u)->res->cpupool->granularity
>>>
>>> requiring to make struct sched_resource public as keyhandler.c
>>> needs
>>> for_each_sched_unit_vcpu() plus being quite expensive,
>>
>> I agree this is not a good option.
>>
>>> or:
>>>
>>> !(u)->next_in_list || (v)->vcpu_id < (u)->next_in_list->unit_id
>>>
>>> which seems to be more expensive as the current variant, too.
>>
>> It's not this much more expensive, and it eliminates unexpected
>> (as I would call it) behavior, so I think I'd go this route. 
>>
> So, I honestly like the way it's currently done in Juergen's pathes.
> 
> However, I'm not sure I understand what it is the issue that Jan thinks
> that has, and in what sense the code/behavior is regarded as
> "unexpected".
> 
> Can you help me see the problem? Maybe, if I realize it, I'd change my
> preference...

The unexpected / surprising behavior is described at the top (i.e.
still visible in context), but I'll quote it again here:

"for_each_sched_unit_vcpu() for an idle unit might end premature
 when one of the vcpus is running in another unit
 (idle_vcpu->sched_unit != idle_unit)"

This started out with me asking about an apparently (but as
Jürgen has clarified not truly) unnecessary special casing of
idle vCPU-s/units/domain ahead of a use of this construct.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] SUPPORT.md: Describe Renesas IPMMU-VMSA support (Arm)

2019-09-26 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

Renesas IPMMU-VMSA support (Arm) can be considered
as Technological Preview feature.

Signed-off-by: Oleksandr Tyshchenko 
---
Please note, should only go in after:
https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg02707.html

CC: Andrew Cooper 
CC: George Dunlap 
CC: Ian Jackson 
CC: Jan Beulich 
CC: Julien Grall 
CC: Konrad Rzeszutek Wilk 
CC: Stefano Stabellini 
CC: Tim Deegan 
CC: Wei Liu 
---
 SUPPORT.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/SUPPORT.md b/SUPPORT.md
index 375473a..100a3b1 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -64,6 +64,7 @@ supported in this document.
 Status, Intel VT-d: Supported
 Status, ARM SMMUv1: Supported
 Status, ARM SMMUv2: Supported
+Status, Renesas IPMMU-VMSA: Tech Preview
 
 ### ARM/GICv3 ITS
 
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH for-next 04/18] x86/mem_sharing: cleanup code in various locations

2019-09-26 Thread Jan Beulich
On 26.09.2019 16:09, Tamas K Lengyel wrote:
> On Wed, Sep 25, 2019 at 10:15 AM Jan Beulich  wrote:
>> On 25.09.2019 17:48, Tamas K Lengyel wrote:
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -1879,12 +1879,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned 
>>> long gla,
>>>  if ( npfec.write_access && (p2mt == p2m_ram_shared) )
>>>  {
>>>  ASSERT(p2m_is_hostp2m(p2m));
>>> -sharing_enomem =
>>> -(mem_sharing_unshare_page(currd, gfn, 0) < 0);
>>> +sharing_enomem = mem_sharing_unshare_page(currd, gfn, 0);
>>
>> I guess the implication here is that the function can only return
>> -ENOMEM? Not very forward compatible, but well. However, if you
>> touch this already, shouldn't you at least make "sharing_enomem"
>> bool?
> 
> Correct, there is a BUG_ON for every other rc value but ENOMEM. We
> could turn it into a bool but I don't see a reason for it, perhaps
> there will be another rc value in the future that we want to handle
> gracefully.

At which point the variable's name will no longer be appropriate.
Hence my request to make it bool; at such a future point the code
would need touching again anyway if you (understandably) don't
want to make more than purely cosmetic changes now.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [GIT PULL] xen: features for 5.4-rc1

2019-09-26 Thread Juergen Gross
Linus,

Please git pull the following tag:

 git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git for-linus-5.4-rc1-tag

xen: features for 5.4-rc1

It contains only two small patches this time:

- a small cleanup for swiotlb-xen
- a fix for PCI initialization for some platforms

Thanks.

Juergen

 drivers/xen/pci.c | 21 +++--
 drivers/xen/swiotlb-xen.c |  5 ++---
 2 files changed, 17 insertions(+), 9 deletions(-)

Igor Druzhinin (1):
  xen/pci: reserve MCFG areas earlier

Souptick Joarder (1):
  swiotlb-xen: Convert to use macro

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 67/84] x86/domain_page: remove direct map code and initialise idle mapcache.

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:30AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 
> These are found in the NDEBUG build.
> 
> Signed-off-by: Hongyan Xia 
> ---
>  xen/arch/x86/domain_page.c | 10 --
>  1 file changed, 10 deletions(-)
> 
> diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
> index f4f53a2a33..f606677ae6 100644
> --- a/xen/arch/x86/domain_page.c
> +++ b/xen/arch/x86/domain_page.c
> @@ -78,11 +78,6 @@ void *map_domain_page(mfn_t mfn)
>  struct mapcache_vcpu *vcache;
>  struct vcpu_maphash_entry *hashent;
>  
> -#ifdef NDEBUG
> -if ( mfn_x(mfn) <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) )
> -return mfn_to_virt(mfn_x(mfn));
> -#endif
> -

Again, this is dropping a fast path. Where is the corresponding change
to unamp_domain_page?

>  v = mapcache_current_vcpu();
>  if ( !v )
>  {
> @@ -257,11 +252,6 @@ int mapcache_domain_init(struct domain *d)
>  struct mapcache_domain *dcache = >arch.mapcache;
>  unsigned int bitmap_pages;
>  
> -#ifdef NDEBUG
> -if ( !mem_hotplug && max_page <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) 
> )
> -return 0;
> -#endif
> -

This should be in its own commit with justification.

Wei.

>  BUILD_BUG_ON(MAPCACHE_VIRT_END + PAGE_SIZE * (3 +
>   2 * PFN_UP(BITS_TO_LONGS(MAPCACHE_ENTRIES) * sizeof(long))) 
> >
>   MAPCACHE_VIRT_START + (PERDOMAIN_SLOT_MBYTES << 20));
> -- 
> 2.17.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 7/8] xen/arm: introduce nr_spis

2019-09-26 Thread Julien Grall

Hi Stefano,

On 9/26/19 2:25 AM, Stefano Stabellini wrote:

On Wed, 25 Sep 2019, Julien Grall wrote:

Hi,

On 25/09/2019 19:49, Stefano Stabellini wrote:

We don't have a clear way to know how many virtual SPIs we need for the
dom0-less domains. Introduce a new option under xen,domain to specify
the number of SPIs to allocate for a domain.

The property is optional. When absent, we'll use the physical number of
GIC lines for dom0-less domains, just like for dom0.


Based on the code below, this is not correct when using vpl011.


I'll write:

The property is optional. When absent, we'll use the physical number of
GIC lines for dom0-less domains, or GUEST_VPL011_SPI+1 if vpl011 is
requested, whichever is greater.


Sounds good to me.






Remove the old setting of nr_spis based on the presence of the vpl011.

The implication of this change is that without nr_spis dom0less domains
get the same amount of SPI allocated as dom0, regardless of how many
physical devices they have assigned, and regardless of whether they have
a virtual pl011 (which also needs an emulated SPI). For instance, we
could end up exposing 256 SPIs for each dom0less domain without a
nr_spis property. If we have 4 dom0less domains without nr_spis, it
would result in 80K of additional memory being used.


I don't understand what you are trying to imply with your example. Ok,
this tell you how much memory you are going to waste... but this does
still not explain why the nr_spis are increased in the default case.


I misunderstood what you wanted me to add to the commit message.


Sorry for the confusion, my main point is you can't really say this is 
low footprint as this is very subjective. Personally, I feel it is a lot 
because if you take the example, this is roughly 8% of the current size 
of Xen (in default config).



I'll remove the example and instead write:

The implication of this change is that without nr_spis dom0less domains
get the same amount of SPI allocated as dom0, regardless of how many
physical devices they have assigned, and regardless of whether they have
a virtual pl011 (which also needs an emulated SPI). This is done because
the SPIs allocation needs to be done before parsing any passthrough
information, so we have to account for any potential physical SPI
assigned to the domain.


Is this better?


Yes, thank you.

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH for-next 04/18] x86/mem_sharing: cleanup code in various locations

2019-09-26 Thread Tamas K Lengyel
On Wed, Sep 25, 2019 at 10:15 AM Jan Beulich  wrote:
>
> On 25.09.2019 17:48, Tamas K Lengyel wrote:
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -1879,12 +1879,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned 
> > long gla,
> >  if ( npfec.write_access && (p2mt == p2m_ram_shared) )
> >  {
> >  ASSERT(p2m_is_hostp2m(p2m));
> > -sharing_enomem =
> > -(mem_sharing_unshare_page(currd, gfn, 0) < 0);
> > +sharing_enomem = mem_sharing_unshare_page(currd, gfn, 0);
>
> I guess the implication here is that the function can only return
> -ENOMEM? Not very forward compatible, but well. However, if you
> touch this already, shouldn't you at least make "sharing_enomem"
> bool?

Correct, there is a BUG_ON for every other rc value but ENOMEM. We
could turn it into a bool but I don't see a reason for it, perhaps
there will be another rc value in the future that we want to handle
gracefully.

>
> > @@ -1240,11 +1277,11 @@ int __mem_sharing_unshare_page(struct domain *d,
> >  mem_sharing_page_unlock(old_page);
> >  put_page_and_type(old_page);
> >
> > -private_page_found:
> > +private_page_found:
>
> Please also indent this label by (at least) one blank.

OK

>
> > @@ -57,34 +59,34 @@ struct page_sharing_info
> >  };
> >  };
> >
> > -#define sharing_supported(_d) \
> > -(is_hvm_domain(_d) && paging_mode_hap(_d))
> > -
> >  unsigned int mem_sharing_get_nr_saved_mfns(void);
> >  unsigned int mem_sharing_get_nr_shared_mfns(void);
> >
> >  #define MEM_SHARING_DESTROY_GFN   (1<<1)
> >  /* Only fails with -ENOMEM. Enforce it with a BUG_ON wrapper. */
> >  int __mem_sharing_unshare_page(struct domain *d,
> > - unsigned long gfn,
> > - uint16_t flags);
> > -static inline int mem_sharing_unshare_page(struct domain *d,
> > -   unsigned long gfn,
> > -   uint16_t flags)
> > +   unsigned long gfn,
> > +   uint16_t flags);
> > +
> > +static inline
> > +int mem_sharing_unshare_page(struct domain *d,
> > + unsigned long gfn,
> > + uint16_t flags)
> >  {
> >  int rc = __mem_sharing_unshare_page(d, gfn, flags);
> >  BUG_ON( rc && (rc != -ENOMEM) );
>
> Would you mind also dropping the stray blanks here?
>

Sure.

Thanks,
Tamas

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 65/84] x86: fix some wrong assumptions on direct map. Increase PMAP slots to 8.

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:28AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 
> Signed-off-by: Hongyan Xia 
> ---
>  xen/arch/x86/domain_page.c | 8 
>  xen/arch/x86/x86_64/mm.c   | 3 ++-
>  xen/include/asm-x86/pmap.h | 4 ++--
>  3 files changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
> index 4a3995ccef..f4f53a2a33 100644
> --- a/xen/arch/x86/domain_page.c
> +++ b/xen/arch/x86/domain_page.c
> @@ -328,11 +328,6 @@ void *map_domain_page_global(mfn_t mfn)
>   system_state < SYS_STATE_active) ||
>  local_irq_is_enabled()));
>  
> -#ifdef NDEBUG
> -if ( mfn_x(mfn) <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) )
> -return mfn_to_virt(mfn_x(mfn));
> -#endif
> -

I wouldn't call this a wrong assumption.

This path is a fast path. The problem is it is not longer applicable in
the new world.

I would change the commit message to something like:

   Drop fast paths in map_domain_page_global API pair, because Xen will
   no longer have a direct map.

>  return vmap(, 1);
>  }
>  
> @@ -340,9 +335,6 @@ void unmap_domain_page_global(const void *ptr)
>  {
>  unsigned long va = (unsigned long)ptr;
>  
> -if ( va >= DIRECTMAP_VIRT_START )
> -return;
> -
>  ASSERT(va >= VMAP_VIRT_START && va < VMAP_VIRT_END);
>  
>  vunmap(ptr);
> diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
> index 37e8d59e5d..40f29f8ddc 100644
> --- a/xen/arch/x86/x86_64/mm.c
> +++ b/xen/arch/x86/x86_64/mm.c
> @@ -712,7 +712,8 @@ void __init paging_init(void)
>  if ( mfn_eq(l2_ro_mpt_mfn, INVALID_MFN) )
>  goto nomem;
>  l2_ro_mpt = map_xen_pagetable(l2_ro_mpt_mfn);
> -compat_idle_pg_table_l2 = l2_ro_mpt;
> +/* compat_idle_pg_table_l2 is used globally. */
> +compat_idle_pg_table_l2 = map_domain_page_global(l2_ro_mpt_mfn);

Again, if this is a bug in a previous patch, it should be fixed there.

>  clear_page(l2_ro_mpt);
>  l3e_write(_ro_mpt[l3_table_offset(HIRO_COMPAT_MPT_VIRT_START)],
>l3e_from_mfn(l2_ro_mpt_mfn, __PAGE_HYPERVISOR_RO));
> diff --git a/xen/include/asm-x86/pmap.h b/xen/include/asm-x86/pmap.h
> index feab1e9170..34d4f2bb38 100644
> --- a/xen/include/asm-x86/pmap.h
> +++ b/xen/include/asm-x86/pmap.h
> @@ -1,8 +1,8 @@
>  #ifndef __X86_PMAP_H__
>  #define __X86_PMAP_H__
>  
> -/* Large enough for mapping 5 levels of page tables */
> -#define NUM_FIX_PMAP 5
> +/* Large enough for mapping 5 levels of page tables with some headroom */
> +#define NUM_FIX_PMAP 8
>  

This looks rather arbitrary to me. Can you specify why extra slots are
needed? The original justification for 5 is for page tables. Now
obviously it is used for more than just mapping page tables.  I'm
worried that even 8 may not be enough. 

Also, this change should either be in a separate patch or folded into
PMAP patch itself.

Wei.

>  void pmap_lock(void);
>  void pmap_unlock(void);
> -- 
> 2.17.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/shim: fix ballooning down the guest

2019-09-26 Thread Jan Beulich
On 26.09.2019 15:51, Roger Pau Monné  wrote:
> On Thu, Sep 26, 2019 at 02:36:18PM +0100, Sergey Dyasli wrote:
>> Currently ballooning down a pvshim guest causes the following errors
>> inside the shim:
>>
>> d3v0 failed to reserve 512 extents of order 512 for offlining
>>
>> And the ballooned-out pages stay inside shim and don't reach L0 Xen.
>>
>> Fix this by passing the correct arguments to pv_shim_offline_memory()
>> during a XENMEM_decrease_reservation request.
>>
> 
> This is missing:
> 
> Fixes: b2245acc60c3 ('xen/pvshim: memory hotplug')
> 
>> Signed-off-by: Sergey Dyasli 
> 
> Reviewed-by: Roger Pau Monné 

Acked-by: Jan Beulich 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 68/84] page_alloc: actually do the mapping and unmapping on xenheap.

2019-09-26 Thread Julien Grall

Hi,

On 9/26/19 2:03 PM, hong...@amazon.com wrote:

On 26/09/2019 13:24, Julien Grall wrote:

Hi,

On 9/26/19 12:18 PM, hong...@amazon.com wrote:

On 26/09/2019 11:39, Julien Grall wrote:

Hi,

NIT Title: Please remove full stop.

On 9/26/19 10:46 AM, hong...@amazon.com wrote:

From: Hongyan Xia 


Please provide a description of what/why you are doing this in the 
commit message.


Also, IIRC, x86 always have !CONFIG_SEPARATE_XENHEAP. So can you 
explain why the path with separate xenheap is also modified?




Signed-off-by: Hongyan Xia 
---
  xen/common/page_alloc.c | 18 --
  1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 7cb1bd368b..4ec6299ba8 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2143,6 +2143,7 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe)
  void *alloc_xenheap_pages(unsigned int order, unsigned int memflags)
  {
  struct page_info *pg;
+    void *ret;
  ASSERT(!in_irq());
@@ -2151,7 +2152,10 @@ void *alloc_xenheap_pages(unsigned int 
order, unsigned int memflags)

  if ( unlikely(pg == NULL) )
  return NULL;
-    memguard_unguard_range(page_to_virt(pg), 1 << (order + 
PAGE_SHIFT));

+    ret = page_to_virt(pg);
+    memguard_unguard_range(ret, 1 << (order + PAGE_SHIFT));
+    map_pages_to_xen((unsigned long)ret, page_to_mfn(pg),
+ 1UL << order, PAGE_HYPERVISOR);


As mentioned earlier on for Arm, xenheap will always be mapped. So 
unless you have plan to tackle the Arm side as well, we should make 
sure that the behavior is not changed for Arm.


I can add an #ifdef for x86. Although I think if the Arm code is 
correct, this should still not break things. It breaks if a xenheap 
access is made even before allocation or after free, which I think is 
a bug.


Correctness is a matter of perspective ;). xenheap is already map on 
Arm and therefore trying to map it again is considered as an error. I 
think this is a valid behavior because if you try to map twice then it 
likely means you may have to unmap later on.


Good point. Thanks. Will an ifdef do the job?


I would suggest to provide helpers so you can document in a single place 
why this is necessary and avoid to add #ifdefery everywhere.


Also, do we expect similar bits in other part of the common code? If 
yes, I would suggest to add those helpers in the header. If not, 
page_alloc.c should be enough.


Regarding the config name, I would rather not use CONFIG_X86 but use a 
new arch-agnostic one. Maybe CONFIG_DIRECTMAP_NOT_ALWAYS_MAPPED? 
(probably too verbose).






Furthermore, xenheap is using superpage (2MB, 1GB) mapping at the 
moment. We completely forbid shattering superpage because they are not 
trivial to deal with.


In short, if you wanted to unmap part it, then you would need to 
shatter the page. Shattering superpage required to follow a specific 
sequence (called break-before-make) that will go through an invalid 
mapping. We need to be careful as another processor may access the 
superpage at the same time.


It may be possible to use only 4KB mapping for the xenheap, but that's 
need to be investigated first.


The series is intended for x86 which then starts with no xenheap 
mappings. If one call to map_pages_to_xen() maps the first half of a 
superpage and a second one maps the remaining, will the second call 
merge them into an actual superpage mapping? Also, do we do a xenheap 
allocation and then only free part of it in some cases? If both answers 
are hopefully no, then shattering won't happen.


For avoidance of doubt, I was describing how Arm works. For any x86 
specific question, then Jan/Andrew are the best point of contact (I saw 
Jan already answered).


The main point is any common code should be able to work for any 
existing architecture (ATM x86, arm)


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 63/84] x86/domain_page: mapcache is no longer tied to pv.

2019-09-26 Thread Wei Liu
On Thu, Sep 26, 2019 at 10:46:26AM +0100, hong...@amazon.com wrote:
> From: Hongyan Xia 
> 
> Signed-off-by: Hongyan Xia 

AIUI there is where we want to end up. I do wonder if this is the
correct place for this patch though. The bottom line is we need to make
sure the HVM path works before we can allow HVM to use it.

I don't have an answer yet. I will have to look at later patches.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] iommu: avoid triggering ASSERT_UNREACHABLE() on ARM...

2019-09-26 Thread Julien Grall

Hi Jan,

On 9/26/19 1:12 PM, Jan Beulich wrote:

On 26.09.2019 12:03, Paul Durrant wrote:

...when the IOMMU is not enabled.

80ff3d338dc9 "iommu: tidy up iommu_use_hap_pt() and need_iommu_pt_sync()
macros" introduced CONFIG_IOMMU_FORCE_PT_SHARE, which causes the global
'iommu_hap_pt_share' option to be replaced with a #define-d value of true.
In this configuration, calling clear_iommu_hap_pt_share() will result
trigger the aforementioned ASSERT.

CONFIG_IOMMU_FORCE_PT_SHARE is always selected for ARM builds and,
because clear_iommu_hap_pt_share() is called by the common iommu_setup()
function if the IOMMU is not enabled, it is no longer safe to disable the
IOMMU on ARM systems.

However, running with the IOMMU disabled is a valid configuration for ARM
systems, so this patch rectifies the problem by removing the call to
clear_iommu_hap_pt_share() from common code. As a side effect of this,
however, it becomes possible on x86 systems for iommu_enabled to be false
but iommu_hap_pt_share to be true. Thus the code in sysctl.c
needs to be changed to make sure that XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share
is not erroneously advertised when the IOMMU has been disabled.

Signed-off-by: Paul Durrant 
Reported-by: Oleksandr 


Preferably with the adjustments mantioned elsewhere
Reviewed-by: Jan Beulich 


I have done it while committing the patch.

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 30/47] xen/sched: add support for multiple vcpus per sched unit where missing

2019-09-26 Thread Dario Faggioli
On Wed, 2019-09-25 at 15:07 +0200, Jan Beulich wrote:
> On 25.09.2019 14:40, Jürgen Groß  wrote:
> > On 24.09.19 17:22, Jan Beulich wrote:
> > > On 24.09.2019 17:09, Jürgen Groß wrote:
> > > > On 24.09.19 17:00, Jan Beulich wrote:
> > > > > On 24.09.2019 16:41, Jürgen Groß wrote:
> > > > > > for_each_sched_unit_vcpu() for an idle
> > > > > > unit might end premature when one of the vcpus is running
> > > > > > in another
> > > > > > unit (idle_vcpu->sched_unit != idle_unit).
> > > > > 
> > > > > Oh, that (v)->sched_unit == (i) in the construct is clearly
> > > > > unexpected.
> > > > > Is this really still needed by the end of the series? I
> > > > > realize that
> > > > > _some_ check is needed, but could this perhaps be arranged in
> > > > > a way
> > > > > that you'd still hit all vCPU-s when using it on an idle
> > > > > unit, no
> > > > > matter whether they're in use as a substitute in a "real"
> > > > > unit?
> > > > 
> > > > I could do that by having another linked list in struct vcpu.
> > > > This way
> > > > I can avoid it.
> > > 
> > > Oh, no, not another list just for this purpose. I was rather
> > > thinking
> > > of e.g. a comparison of IDs.
> > 
> > That would result either in something like:
> > 
> > (v)->vcpu_id < (u)->unit_id + (u)->res->cpupool->granularity
> > 
> > requiring to make struct sched_resource public as keyhandler.c
> > needs
> > for_each_sched_unit_vcpu() plus being quite expensive,
> 
> I agree this is not a good option.
> 
> > or:
> > 
> > !(u)->next_in_list || (v)->vcpu_id < (u)->next_in_list->unit_id
> > 
> > which seems to be more expensive as the current variant, too.
> 
> It's not this much more expensive, and it eliminates unexpected
> (as I would call it) behavior, so I think I'd go this route. 
>
So, I honestly like the way it's currently done in Juergen's pathes.

However, I'm not sure I understand what it is the issue that Jan thinks
that has, and in what sense the code/behavior is regarded as
"unexpected".

Can you help me see the problem? Maybe, if I realize it, I'd change my
preference...

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
---
<> (Raistlin Majere)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/shim: fix ballooning down the guest

2019-09-26 Thread Roger Pau Monné
On Thu, Sep 26, 2019 at 02:36:18PM +0100, Sergey Dyasli wrote:
> Currently ballooning down a pvshim guest causes the following errors
> inside the shim:
> 
> d3v0 failed to reserve 512 extents of order 512 for offlining
> 
> And the ballooned-out pages stay inside shim and don't reach L0 Xen.
> 
> Fix this by passing the correct arguments to pv_shim_offline_memory()
> during a XENMEM_decrease_reservation request.
> 

This is missing:

Fixes: b2245acc60c3 ('xen/pvshim: memory hotplug')

> Signed-off-by: Sergey Dyasli 

Reviewed-by: Roger Pau Monné 

Also adding Juergen for a release Ack.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v11 5/7] microcode: remove microcode_update_lock

2019-09-26 Thread Chao Gao
microcode_update_lock is to prevent logic threads of a same core from
updating microcode at the same time. But due to using a global lock, it
also prevented parallel microcode updating on different cores.

Remove this lock in order to update microcode in parallel. It is safe
because we have already ensured serialization of sibling threads at the
caller side.
1.For late microcode update, do_microcode_update() ensures that only one
  sibiling thread of a core can update microcode.
2.For microcode update during system startup or CPU-hotplug,
  microcode_mutex() guarantees update serialization of logical threads.
3.get/put_cpu_bitmaps() prevents the concurrency of CPU-hotplug and
  late microcode update.

Note that printk in apply_microcode() and svm_host_osvm_init() (for AMD
only) are still processed sequentially.

Signed-off-by: Chao Gao 
Reviewed-by: Jan Beulich 
---
Changes in v7:
 - reworked. Remove complex lock logics introduced in v5 and v6. The microcode
 patch to be applied is passed as an argument without any global variable. Thus
 no lock is added to serialize potential readers/writers. Callers of
 apply_microcode() will guarantee the correctness: the patch poninted by the
 arguments won't be changed by others.

Changes in v6:
 - introduce early_ucode_update_lock to serialize early ucode update.

Changes in v5:
 - newly add
---
 xen/arch/x86/microcode_amd.c   | 8 +---
 xen/arch/x86/microcode_intel.c | 8 +---
 2 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 9a8f179..1e52f7f 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -74,9 +74,6 @@ struct mpbhdr {
 uint8_t data[];
 };
 
-/* serialize access to the physical write */
-static DEFINE_SPINLOCK(microcode_update_lock);
-
 /* See comment in start_update() for cases when this routine fails */
 static int collect_cpu_info(struct cpu_signature *csig)
 {
@@ -232,7 +229,6 @@ static enum microcode_match_result compare_patch(
 
 static int apply_microcode(const struct microcode_patch *patch)
 {
-unsigned long flags;
 uint32_t rev;
 int hw_err;
 unsigned int cpu = smp_processor_id();
@@ -247,15 +243,13 @@ static int apply_microcode(const struct microcode_patch 
*patch)
 
 hdr = patch->mc_amd->mpb;
 
-spin_lock_irqsave(_update_lock, flags);
+BUG_ON(local_irq_is_enabled());
 
 hw_err = wrmsr_safe(MSR_AMD_PATCHLOADER, (unsigned long)hdr);
 
 /* get patch id after patching */
 rdmsrl(MSR_AMD_PATCHLEVEL, rev);
 
-spin_unlock_irqrestore(_update_lock, flags);
-
 /*
  * Some processors leave the ucode blob mapping as UC after the update.
  * Flush the mapping to regain normal cacheability.
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index c083e17..9ededcc 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -93,9 +93,6 @@ struct extended_sigtable {
 
 #define exttable_size(et) ((et)->count * EXT_SIGNATURE_SIZE + EXT_HEADER_SIZE)
 
-/* serialize access to the physical write to MSR 0x79 */
-static DEFINE_SPINLOCK(microcode_update_lock);
-
 static int collect_cpu_info(struct cpu_signature *csig)
 {
 unsigned int cpu_num = smp_processor_id();
@@ -287,7 +284,6 @@ static enum microcode_match_result compare_patch(
 
 static int apply_microcode(const struct microcode_patch *patch)
 {
-unsigned long flags;
 uint64_t msr_content;
 unsigned int val[2];
 unsigned int cpu_num = raw_smp_processor_id();
@@ -302,8 +298,7 @@ static int apply_microcode(const struct microcode_patch 
*patch)
 
 mc_intel = patch->mc_intel;
 
-/* serialize access to the physical write to MSR 0x79 */
-spin_lock_irqsave(_update_lock, flags);
+BUG_ON(local_irq_is_enabled());
 
 /* write microcode via MSR 0x79 */
 wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
@@ -316,7 +311,6 @@ static int apply_microcode(const struct microcode_patch 
*patch)
 rdmsrl(MSR_IA32_UCODE_REV, msr_content);
 val[1] = (uint32_t)(msr_content >> 32);
 
-spin_unlock_irqrestore(_update_lock, flags);
 if ( val[1] != mc_intel->hdr.rev )
 {
 printk(KERN_ERR "microcode: CPU%d update from revision "
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v11 7/7] microcode: reject late ucode loading if any core is parked

2019-09-26 Thread Chao Gao
If a core with all of its thread being parked, late ucode loading
which currently only loads ucode on online threads would lead to
differing ucode revisions in the system. In general, keeping ucode
revision consistent would be less error-prone. To this end, if there
is a parked thread doesn't have an online sibling thread, late ucode
loading is rejected.

Two threads are on the same core or computing unit iff they have
the same phys_proc_id and cpu_core_id/compute_unit_id. Based on
phys_proc_id and cpu_core_id/compute_unit_id, an unique core id
is generated for each thread. And use a bitmap to reduce the
number of comparison.

Signed-off-by: Chao Gao 
---
Alternatively, we can mask the thread id off apicid and use it
as the unique core id. It needs to introduce new field in cpuinfo_x86
to record the mask for thread id. So I don't take this way.
---
 xen/arch/x86/microcode.c| 75 +
 xen/include/asm-x86/processor.h |  1 +
 2 files changed, 76 insertions(+)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index b9fa8bb..b70eb16 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -573,6 +573,64 @@ static int do_microcode_update(void *patch)
 return ret;
 }
 
+static unsigned int unique_core_id(unsigned int cpu, unsigned int socket_shift)
+{
+unsigned int core_id = cpu_to_cu(cpu);
+
+if ( core_id == INVALID_CUID )
+core_id = cpu_to_core(cpu);
+
+return (cpu_to_socket(cpu) << socket_shift) + core_id;
+}
+
+static int has_parked_core(void)
+{
+int ret = 0;
+
+if ( park_offline_cpus )
+{
+unsigned int cpu, max_bits, core_width;
+unsigned int max_sockets = 1, max_cores = 1;
+struct cpuinfo_x86 *c = cpu_data;
+unsigned long *bitmap;
+
+for_each_present_cpu(cpu)
+{
+if ( x86_cpu_to_apicid[cpu] == BAD_APICID )
+continue;
+
+/* Note that cpu_to_socket() get an ID starting from 0. */
+if ( cpu_to_socket(cpu) + 1 > max_sockets )
+max_sockets = cpu_to_socket(cpu) + 1;
+
+if ( c[cpu].x86_max_cores > max_cores )
+max_cores = c[cpu].x86_max_cores;
+}
+
+core_width = fls(max_cores);
+max_bits = max_sockets << core_width;
+bitmap = xzalloc_array(unsigned long, BITS_TO_LONGS(max_bits));
+if ( !bitmap )
+return -ENOMEM;
+
+for_each_present_cpu(cpu)
+{
+if ( cpu_online(cpu) || x86_cpu_to_apicid[cpu] == BAD_APICID )
+continue;
+
+__set_bit(unique_core_id(cpu, core_width), bitmap);
+}
+
+for_each_online_cpu(cpu)
+__clear_bit(unique_core_id(cpu, core_width), bitmap);
+
+ret = (find_first_bit(bitmap, max_bits) < max_bits);
+xfree(bitmap);
+}
+
+return ret;
+}
+
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
 int ret;
@@ -611,6 +669,23 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) 
buf, unsigned long len)
  */
 ASSERT(cpumask_first(_online_map) == nmi_cpu);
 
+/*
+ * If there is a core with all of its threads parked, late loading may
+ * cause differing ucode revisions in the system. Refuse this operation.
+ */
+ret = has_parked_core();
+if ( ret )
+{
+if ( ret > 0 )
+{
+printk(XENLOG_WARNING
+   "Ucode loading aborted: found a parked core\n");
+ret = -EPERM;
+}
+xfree(buffer);
+goto put;
+}
+
 patch = parse_blob(buffer, len);
 xfree(buffer);
 if ( IS_ERR(patch) )
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index c92956f..753deec 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -171,6 +171,7 @@ extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 
*c);
 
 #define cpu_to_core(_cpu)   (cpu_data[_cpu].cpu_core_id)
 #define cpu_to_socket(_cpu) (cpu_data[_cpu].phys_proc_id)
+#define cpu_to_cu(_cpu) (cpu_data[_cpu].compute_unit_id)
 
 unsigned int apicid_to_socket(unsigned int);
 
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v11 6/7] microcode: rendezvous CPUs in NMI handler and load ucode

2019-09-26 Thread Chao Gao
When one core is loading ucode, handling NMI on sibling threads or
on other cores in the system might be problematic. By rendezvousing
all CPUs in NMI handler, it prevents NMI acceptance during ucode
loading.

Basically, some work previously done in stop_machine context is
moved to NMI handler. Primary threads call in and load ucode in
NMI handler. Secondary threads wait for the completion of ucode
loading on all CPU cores. An option is introduced to disable this
behavior.

Control thread doesn't rendezvous in NMI handler by calling self_nmi()
(in case of unknown_nmi_error() being triggered). The side effect is
control thread might be handling an NMI and interacting with the old
ucode not in a controlled way while other threads are loading ucode.
Update ucode on the control thread first to mitigate this issue.

Signed-off-by: Sergey Dyasli 
Signed-off-by: Chao Gao 
---
Changes in v11:
 - Extend existing 'nmi' option rather than use a new one.
 - use per-cpu variable to store error code of xxx_nmi_work()
 - rename secondary_thread_work to secondary_nmi_work.
 - intialize nmi_patch to ZERO_BLOCK_PTR and make it static.
 - constify nmi_cpu
 - explain why control thread loads ucode first in patch description

Changes in v10:
 - rewrite based on Sergey's idea and patch
 - add Sergey's SOB.
 - add an option to disable ucode loading in NMI handler
 - don't send IPI NMI to the control thread to avoid unknown_nmi_error()
 in do_nmi().
 - add an assertion to make sure the cpu chosen to handle platform NMI
 won't send self NMI. Otherwise, there is a risk that we encounter
 unknown_nmi_error() and system crashes.

Changes in v9:
 - control threads send NMI to all other threads. Slave threads will
 stay in the NMI handling to prevent NMI acceptance during ucode
 loading. Note that self-nmi is invalid according to SDM.
 - s/rep_nop/cpu_relax
 - remove debug message in microcode_nmi_callback(). Printing debug
 message would take long times and control thread may timeout.
 - rebase and fix conflicts

Changes in v8:
 - new
---
 docs/misc/xen-command-line.pandoc |   6 +-
 xen/arch/x86/microcode.c  | 156 ++
 xen/arch/x86/traps.c  |   6 +-
 xen/include/asm-x86/nmi.h |   3 +
 4 files changed, 138 insertions(+), 33 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc 
b/docs/misc/xen-command-line.pandoc
index 832797e..8beb285 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -2036,7 +2036,7 @@ pages) must also be specified via the tbuf_size parameter.
 > `= unstable | skewed | stable:socket`
 
 ### ucode (x86)
-> `= [ | scan]`
+> `= List of [  | scan, nmi= ]`
 
 Specify how and where to find CPU microcode update blob.
 
@@ -2057,6 +2057,10 @@ microcode in the cpio name space must be:
   - on Intel: kernel/x86/microcode/GenuineIntel.bin
   - on AMD  : kernel/x86/microcode/AuthenticAMD.bin
 
+'nmi' determines late loading is performed in NMI handler or just in
+stop_machine context. In NMI handler, even NMIs are blocked, which is
+considered safer. The default value is `true`.
+
 ### unrestricted_guest (Intel)
 > `= `
 
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 6c23879..b9fa8bb 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -36,8 +36,10 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -95,6 +97,9 @@ static struct ucode_mod_blob __initdata ucode_blob;
  */
 static bool_t __initdata ucode_scan;
 
+/* By default, ucode loading is done in NMI handler */
+static bool ucode_in_nmi = true;
+
 /* Protected by microcode_mutex */
 static struct microcode_patch *microcode_cache;
 
@@ -105,23 +110,42 @@ void __init microcode_set_module(unsigned int idx)
 }
 
 /*
- * The format is '[|scan]'. Both options are optional.
+ * The format is '[|scan, nmi=]'. Both options are optional.
  * If the EFI has forced which of the multiboot payloads is to be used,
- * no parsing will be attempted.
+ * only nmi= is parsed.
  */
 static int __init parse_ucode(const char *s)
 {
-const char *q = NULL;
+const char *ss;
+int val, rc = 0;
 
-if ( ucode_mod_forced ) /* Forced by EFI */
-   return 0;
+do {
+ss = strchr(s, ',');
+if ( !ss )
+ss = strchr(s, '\0');
 
-if ( !strncmp(s, "scan", 4) )
-ucode_scan = 1;
-else
-ucode_mod_idx = simple_strtol(s, , 0);
+if ( (val = parse_boolean("nmi", s, ss)) >= 0 )
+ucode_in_nmi = val;
+else if ( !ucode_mod_forced ) /* Not forced by EFI */
+{
+const char *q = NULL;
+
+if ( !strncmp(s, "scan", 4) )
+{
+ucode_scan = true;
+q = s + 4;
+}
+else
+ucode_mod_idx = simple_strtol(s, , 0);
+
+if ( q != ss )
+rc = -EINVAL;
+}
+
+s = ss + 1;
+} while ( 

[Xen-devel] [PATCH v11 2/7] microcode: unify ucode loading during system bootup and resuming

2019-09-26 Thread Chao Gao
During system bootup and resuming, CPUs just load the cached ucode.
So one unified function microcode_update_one() is introduced. It
takes a boolean to indicate whether ->start_update should be called.
Since early_microcode_update_cpu() is only called on BSP (APs call
the unified function), start_update is always true and so remove
this parameter.

There is a functional change: ->start_update is called on BSP and
->end_update_percpu is called during system resuming. They are not
invoked by previous microcode_resume_cpu().

Signed-off-by: Chao Gao 
Reviewed-by: Jan Beulich 
---
Changes in v10:
 - call ->start_update for system resume from suspension

Changes in v9:
 - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
   microcode_update_one()
 - rebase and fix conflicts.

Changes in v8:
 - split out from the previous patch
---
 xen/arch/x86/acpi/power.c   |  2 +-
 xen/arch/x86/microcode.c| 91 +++--
 xen/arch/x86/smpboot.c  |  5 +--
 xen/include/asm-x86/processor.h |  4 +-
 4 files changed, 45 insertions(+), 57 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index 269b140..01e6aec 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -278,7 +278,7 @@ static int enter_state(u32 state)
 
 console_end_sync();
 
-microcode_resume_cpu();
+microcode_update_one(true);
 
 if ( !recheck_cpu_features(0) )
 panic("Missing previously available feature(s)\n");
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 3ea2a6e..9c0e5c4 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char *buf, 
size_t len)
 return NULL;
 }
 
-int microcode_resume_cpu(void)
-{
-int err;
-struct cpu_signature *sig = _cpu(cpu_sig);
-
-if ( !microcode_ops )
-return 0;
-
-spin_lock(_mutex);
-
-err = microcode_ops->collect_cpu_info(sig);
-if ( likely(!err) )
-err = microcode_ops->apply_microcode(microcode_cache);
-spin_unlock(_mutex);
-
-return err;
-}
-
 void microcode_free_patch(struct microcode_patch *microcode_patch)
 {
 microcode_ops->free_patch(microcode_patch->mc);
@@ -391,11 +373,38 @@ static int __init microcode_init(void)
 }
 __initcall(microcode_init);
 
-int __init early_microcode_update_cpu(bool start_update)
+/* Load a cached update to current cpu */
+int microcode_update_one(bool start_update)
+{
+int err;
+
+if ( !microcode_ops )
+return -EOPNOTSUPP;
+
+microcode_ops->collect_cpu_info(_cpu(cpu_sig));
+
+if ( start_update && microcode_ops->start_update )
+{
+err = microcode_ops->start_update();
+if ( err )
+return err;
+}
+
+err = microcode_update_cpu(NULL);
+
+if ( microcode_ops->end_update_percpu )
+microcode_ops->end_update_percpu();
+
+return err;
+}
+
+/* BSP calls this function to parse ucode blob and then apply an update. */
+int __init early_microcode_update_cpu(void)
 {
 int rc = 0;
 void *data = NULL;
 size_t len;
+struct microcode_patch *patch;
 
 if ( !microcode_ops )
 return -ENOSYS;
@@ -411,44 +420,26 @@ int __init early_microcode_update_cpu(bool start_update)
 data = bootstrap_map(_mod);
 }
 
-microcode_ops->collect_cpu_info(_cpu(cpu_sig));
-
 if ( !data )
 return -ENOMEM;
 
-if ( start_update )
+patch = parse_blob(data, len);
+if ( IS_ERR(patch) )
 {
-struct microcode_patch *patch;
-
-patch = parse_blob(data, len);
-if ( IS_ERR(patch) )
-{
-printk(XENLOG_WARNING "Parsing microcode blob error %ld\n",
-   PTR_ERR(patch));
-return PTR_ERR(patch);
-}
-
-if ( !patch )
-return -ENOENT;
-
-spin_lock(_mutex);
-rc = microcode_update_cache(patch);
-spin_unlock(_mutex);
-ASSERT(rc);
-
-if ( microcode_ops->start_update )
-rc = microcode_ops->start_update();
-
-if ( rc )
-return rc;
+printk(XENLOG_WARNING "Parsing microcode blob error %ld\n",
+   PTR_ERR(patch));
+return PTR_ERR(patch);
 }
 
-rc = microcode_update_cpu(NULL);
+if ( !patch )
+return -ENOENT;
 
-if ( microcode_ops->end_update_percpu )
-microcode_ops->end_update_percpu();
+spin_lock(_mutex);
+rc = microcode_update_cache(patch);
+spin_unlock(_mutex);
+ASSERT(rc);
 
-return rc;
+return microcode_update_one(true);
 }
 
 int __init early_microcode_init(void)
@@ -468,7 +459,7 @@ int __init early_microcode_init(void)
 microcode_ops->collect_cpu_info(_cpu(cpu_sig));
 
 if ( ucode_mod.mod_end || ucode_blob.size )
-rc = early_microcode_update_cpu(true);
+rc = early_microcode_update_cpu();
 }
 
 return rc;
diff --git 

[Xen-devel] [PATCH v11 4/7] x86/microcode: Synchronize late microcode loading

2019-09-26 Thread Chao Gao
This patch ports microcode improvement patches from linux kernel.

Before you read any further: the early loading method is still the
preferred one and you should always do that. The following patch is
improving the late loading mechanism for long running jobs and cloud use
cases.

Gather all cores and serialize the microcode update on them by doing it
one-by-one to make the late update process as reliable as possible and
avoid potential issues caused by the microcode update.

Signed-off-by: Chao Gao 
Tested-by: Chao Gao 
[linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
[linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
Cc: Kevin Tian 
Cc: Jun Nakajima 
Cc: Ashok Raj 
Cc: Borislav Petkov 
Cc: Thomas Gleixner 
Cc: Andrew Cooper 
Cc: Jan Beulich 
---
Changes in v11:
 - Use the sample code of wait_for_state() provided by Jan
 - make wait_cpu_call{in,out} take unsigned int to avoid type casting
 - do assignment in while clause in control_thread_fn() to eliminate
 duplication.

Changes in v10:
 - introduce wait_for_state() and set_state() helper functions
 - make wait_for_condition() return bool and take const void *
 - disable/enable watchdog in control thread
 - rename "master" and "slave" thread to "primary" and "secondary"

Changes in v9:
 - log __buildin_return_address(0) when timeout
 - divide CPUs into three logical sets and they will call different
 functions during ucode loading. The 'control thread' is chosen to
 coordinate ucode loading on all CPUs. Since only control thread would
 set 'loading_state', we can get rid of 'cmpxchg' stuff in v8.
 - s/rep_nop/cpu_relax
 - each thread updates its revision number itself
 - add XENLOG_ERR prefix for each line of multi-line log messages

Changes in v8:
 - to support blocking #NMI handling during loading ucode
   * introduce a flag, 'loading_state', to mark the start or end of
 ucode loading.
   * use a bitmap for cpu callin since if cpu may stay in #NMI handling,
 there are two places for a cpu to call in. bitmap won't be counted
 twice.
   * don't wait for all CPUs callout, just wait for CPUs that perform the
 update. We have to do this because some threads may be stuck in NMI
 handling (where cannot reach the rendezvous).
 - emit a warning if the system stays in stop_machine context for more
 than 1s
 - comment that rdtsc is fine while loading an update
 - use cmpxchg() to avoid panic being called on multiple CPUs
 - Propagate revision number to other threads
 - refine comments and prompt messages

Changes in v7:
 - Check whether 'timeout' is 0 rather than "<=0" since it is unsigned int.
 - reword the comment above microcode_update_cpu() to clearly state that
 one thread per core should do the update.
---
 xen/arch/x86/microcode.c | 297 ++-
 1 file changed, 267 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 9c0e5c4..6c23879 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -30,18 +30,52 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
+#include 
 
+#include 
 #include 
 #include 
 #include 
 #include 
 
+/*
+ * Before performing a late microcode update on any thread, we
+ * rendezvous all cpus in stop_machine context. The timeout for
+ * waiting for cpu rendezvous is 30ms. It is the timeout used by
+ * live patching
+ */
+#define MICROCODE_CALLIN_TIMEOUT_US 3
+
+/*
+ * Timeout for each thread to complete update is set to 1s. It is a
+ * conservative choice considering all possible interference.
+ */
+#define MICROCODE_UPDATE_TIMEOUT_US 100
+
 static module_t __initdata ucode_mod;
 static signed int __initdata ucode_mod_idx;
 static bool_t __initdata ucode_mod_forced;
+static unsigned int nr_cores;
+
+/*
+ * These states help to coordinate CPUs during loading an update.
+ *
+ * The semantics of each state is as follow:
+ *  - LOADING_PREPARE: initial state of 'loading_state'.
+ *  - LOADING_CALLIN: CPUs are allowed to callin.
+ *  - LOADING_ENTER: all CPUs have called in. Initiate ucode loading.
+ *  - LOADING_EXIT: ucode loading is done or aborted.
+ */
+static enum {
+LOADING_PREPARE,
+LOADING_CALLIN,
+LOADING_ENTER,
+LOADING_EXIT,
+} loading_state;
 
 /*
  * If we scan the initramfs.cpio for the early microcode code
@@ -190,6 +224,16 @@ static DEFINE_SPINLOCK(microcode_mutex);
 DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
 
 /*
+ * Count the CPUs that have entered, exited the rendezvous and succeeded in
+ * microcode update during late microcode update respectively.
+ *
+ * Note that a bitmap is used for callin to allow cpu to set a bit multiple
+ * times. It is required to do busy-loop in #NMI handling.
+ */
+static cpumask_t cpu_callin_map;
+static atomic_t cpu_out, cpu_updated;
+
+/*
  * Return a patch that covers current CPU. If there are multiple patches,
  * return the one with the highest revision number. Return error If no
  * patch is 

[Xen-devel] [PATCH v11 0/7] improve late microcode loading

2019-09-26 Thread Chao Gao
Changes in v11:
 - reject late ucode loading if any core is parked
 - correct the usage of microcode_mutex in microcode_update_cpu()
 - extend 'ucode' boot option to enable/disable ucode loading in NMI
 - drop the last two patches of v10 (about BDF90 and wbinvd, I haven't
 get an answer to opens yet).
 - other minor changes are described in each patch's change log

Regarding changes to AMD side, I didn't do any test for them due to
lack of hardware.

The intention of this series is to make the late microcode loading
more reliable by rendezvousing all cpus in stop_machine context.
This idea comes from Ashok. I am porting his linux patch to Xen
(see patch 4 more details).

This series includes below changes:
 1. Patch 1-3: cleanup and preparation for synchronizing ucode loading
 2. Patch 4: synchronize late microcode loading
 3. Patch 5: support parallel microcodes update on different cores
 4. Patch 6: rendezvous CPUs in NMI handler and load ucode
 5. Patch 7: reject late ucode loading if any core is parked

Currently, late microcode loading does a lot of things including
parsing microcode blob, checking the signature/revision and performing
update. Putting all of them into stop_machine context is a bad idea
because of complexity (one issue I observed is memory allocation
triggered one assertion in stop_machine context). To simplify the
load process, parsing microcode is moved out of the load process.
Remaining parts of load process is put to stop_machine context.

Previous change log:
Major changes in version 10:
 - add back the patch to call wbinvd() conditionally
 - add a patch to disable late loading due to BDF90
 - rendezvous CPUs in NMI handler and load ucode. But provide an option
 to disable this behavior.
 - avoid the call of self_nmi() on the control thread because it may
 trigger the unknown_nmi_error() in do_nmi().
 - ensure ->start_update is called during system resuming from
 suspension

Changes in version 9:
 - add Jan's Reviewed-by
 - rendevzous threads in NMI handler to disable NMI. Note that NMI can
 be served as usual on threads that are chosen to initiate ucode loading
 on each core.
 - avoid unnecessary memory allocation or copy when creating a microcode
 patch (patch 12)
 - rework patch 1 to avoid microcode_update_match() being used to
 compare two arbitrary updates.
 - call .end_update in early loading path.

Changes in version 8:
 - block #NMI handling during microcode loading (Patch 16)
 - Don't assume that all CPUs in the system have loaded a same ucode.
 So when parsing a blob, we attempt to save a patch as long as it matches
 with current cpu signature regardless of the revision of the patch.
 And also for loading, we only require the patch to be loaded isn't old
 than the cached one.
 - store an update after the first successful loading on a CPU
 - remove the patch that calls wbinvd() unconditionally before microcode
 loading. It is under internal discussion.
 - divide two big patches into several patches to improve readability.

Changes in version 7:
 - cache one microcode update rather than a list of it. Assuming that all CPUs
 (including those will be plugged in later) in the system have the same
 signature, one update matches with one CPU should match with others. Thus, one
 update is enough for microcode updating during CPU hot-plug and resuming.
 - To handle load failure, microcode update is cached after it is applied to
 avoid a broken update overriding a validated one. Unvalidated microcode updates
 are passed by arguments rather than another global variable, where this series
 slightly differs from Roger's suggestion in:
 https://lists.xen.org/archives/html/xen-devel/2019-03/msg00776.html
 - incorporate Sergey's patch (patch 10) to fix a bug: we maintain a variable
 to reflect current microcode revision. But in some cases, this variable isn't
 initialized during system boot time, which results in falsely reporting that
 processor is susceptible to some known vulnerabilities.
 - fix issues reported by Sergey:
 https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg00901.html
 - Responses to Sergey/Roger/Wei/Ashok's other comments.

Major changes in version 6:
 - run wbinvd before updating microcode (patch 10)
 - add an userspace tool for late microcode update (patch 1)
 - scale time to wait by the number of remaining CPUs to respond 
 - remove 'cpu' parameters from some related callbacks and functins
 - save an ucode patch only if its supported CPU is allowed to mix with
   current cpu.

Changes in version 5:
 - support parallel microcode updates for all cores (see patch 8)
 - Address Roger's comments on the last version.

Chao Gao (7):
  microcode: split out apply_microcode() from cpu_request_microcode()
  microcode: unify ucode loading during system bootup and resuming
  microcode: reduce memory allocation and copy when creating a patch
  x86/microcode: Synchronize late microcode loading
  microcode: remove microcode_update_lock
  microcode: rendezvous CPUs 

  1   2   3   >