On 07/10/2013 03:32 AM, Alexander Graf wrote:
On 07/06/2013 05:07 PM, Alexey Kardashevskiy wrote:
This adds special support for huge pages (16MB). The reference
counting cannot be easily done for such pages in real mode (when
MMU is off) so we added a list of huge pages. It is populated in
On 11.07.2013, at 04:48, tiejun.chen wrote:
On 07/10/2013 05:49 PM, Alexander Graf wrote:
On 10.07.2013, at 08:02, Tiejun Chen wrote:
We should ensure that preemption cannot occur while calling get_paca()
inside hard_irq_disable(); otherwise the paca_struct may be the
wrong one just
On 11.07.2013, at 07:12, Alexey Kardashevskiy wrote:
On 07/10/2013 08:05 PM, Alexander Graf wrote:
On 10.07.2013, at 07:00, Alexey Kardashevskiy wrote:
On 07/10/2013 03:02 AM, Alexander Graf wrote:
On 07/06/2013 05:07 PM, Alexey Kardashevskiy wrote:
This adds real mode handlers for the
On 11.07.2013, at 12:51, Paul Mackerras wrote:
Currently HV-style KVM does not save and restore the SIAR and SDAR
registers in the PMU (performance monitor unit) on guest entry and
exit. The result is that performance monitoring tools in the guest
could get false information about where a
On 11.07.2013, at 12:54, Alexey Kardashevskiy wrote:
On 07/11/2013 08:11 PM, Alexander Graf wrote:
On 11.07.2013, at 07:12, Alexey Kardashevskiy wrote:
On 07/10/2013 08:05 PM, Alexander Graf wrote:
On 10.07.2013, at 07:00, Alexey Kardashevskiy wrote:
On 07/10/2013 03:02 AM,
Unlike the other general-purpose SPRs, SPRG3 can be read by usermode
code, and is used in recent kernels to store the CPU and NUMA node
numbers so that they can be read by VDSO functions. Thus we need to
load the guest's SPRG3 value into the real SPRG3 register when entering
the guest, and
64-bit POWER processors have a three-bit field for page protection in
the hashed page table entry (HPTE). Currently we only interpret the two
bits that were present in older versions of the architecture. The only
defined combination that has the new bit set is 110, meaning read-only
for
This series fixes some problems in PR KVM and adds support for using
64kB pages, both on the guest side, and also on the host side if the
host kernel is configured with a 64kB page size. Finally this makes
the HPT code SMP-safe using a mutex, which means that PR KVM can now
run SMP guests.
This
This reworks kvmppc_mmu_book3s_64_xlate() to make it check the large
page bit in the hashed page table entries (HPTEs) it looks at, and
to simplify and streamline the code. The checking of the first dword
of each HPTE is now done with a single mask and compare operation,
and all the code dealing
Currently, PR KVM uses 4k pages for the host-side mappings of guest
memory, regardless of the host page size. When the host page size is
64kB, we might as well use 64k host page mappings for guest mappings
of 64kB and larger pages and for guest real-mode mappings. However,
the magic page has to
Currently PR-style KVM keeps the volatile guest register values
(R0 - R13, CR, LR, CTR, XER, PC) in a shadow_vcpu struct rather than
the main kvm_vcpu struct. For 64-bit, the shadow_vcpu exists in two
places, a kmalloc'd struct and in the PACA, and it gets copied back
and forth in
This adds the code to interpret 64k HPTEs in the guest hashed page
table (HPT), 64k SLB entries, and to tell the guest about 64k pages
in kvm_vm_ioctl_get_smmu_info(). Guest 64k pages are still shadowed
by 4k pages.
This also adds another hash table to the four we have already in
The implementation of H_ENTER in PR KVM has some errors:
* With H_EXACT not set, if the HPTEG is full, we return H_PTEG_FULL
as the return value of kvmppc_h_pr_enter, but the caller is expecting
one of the EMULATE_* values. The H_PTEG_FULL needs to go in the
guest's R3 instead.
* With
This adds a per-VM mutex to provide mutual exclusion between vcpus
for accesses to and updates of the guest hashed page table (HPT).
This also makes the code use single-byte writes to the HPT entry
when updating of the reference (R) and change (C) bits. The reason
for doing this, rather than
On Thu, 2013-07-11 at 11:49 +0200, Alexander Graf wrote:
Ben, is soft_enabled == 0; hard_enabled == 1 a valid combination that
may ever occur?
Yes of course, that's what we call soft disabled :-) It's even the
whole point of doing lazy disable...
Ben.
On Thu, 2013-07-11 at 11:52 +0200, Alexander Graf wrote:
Where exactly (it is rather SPAPR_TCE_IOMMU but does not really
matter)?
Select it on KVM_BOOK3S_64? CONFIG_KVM_BOOK3S_64_HV?
CONFIG_KVM_BOOK3S_64_PR? PPC_BOOK3S_64?
I'd say the most logical choice would be to check the Makefile
On Thu, 2013-07-11 at 13:15 +0200, Alexander Graf wrote:
There are 2 ways of dealing with this:
1) Call the ENABLE_CAP on every vcpu. That way one CPU may handle
this hypercall in the kernel while another one may not. The same as we
handle PAPR today.
2) Create a new ENABLE_CAP for
On 11.07.2013, at 14:28, Benjamin Herrenschmidt wrote:
On Thu, 2013-07-11 at 11:49 +0200, Alexander Graf wrote:
Ben, is soft_enabled == 0; hard_enabled == 1 a valid combination that
may ever occur?
Yes of course, that's what we call soft disabled :-) It's even the
whole point of doing
On 11.07.2013, at 14:37, Benjamin Herrenschmidt wrote:
On Thu, 2013-07-11 at 11:52 +0200, Alexander Graf wrote:
Where exactly (it is rather SPAPR_TCE_IOMMU but does not really
matter)?
Select it on KVM_BOOK3S_64? CONFIG_KVM_BOOK3S_64_HV?
CONFIG_KVM_BOOK3S_64_PR? PPC_BOOK3S_64?
I'd say
On 11.07.2013, at 14:39, Benjamin Herrenschmidt wrote:
On Thu, 2013-07-11 at 13:15 +0200, Alexander Graf wrote:
And that's bad. Jeez, seriously. Don't argue this case. We enable new
features individually unless we're 100% sure we can keep everything
working. In this case an ENABLE_CAP
On Thu, 2013-07-11 at 14:47 +0200, Alexander Graf wrote:
Yes of course, that's what we call soft disabled :-) It's even the
whole point of doing lazy disable...
Meh. Of course it's soft_enabled = 1; hard_enabled = 0.
That doesn't happen in normal C code. It happens under very specific
On 07/11/2013 10:51 PM, Alexander Graf wrote:
On 11.07.2013, at 14:39, Benjamin Herrenschmidt wrote:
On Thu, 2013-07-11 at 13:15 +0200, Alexander Graf wrote:
And that's bad. Jeez, seriously. Don't argue this case. We enable new
features individually unless we're 100% sure we can keep
On Thu, 2013-07-11 at 14:50 +0200, Alexander Graf wrote:
Not really, no. But that would do. You could have given a more useful
answer in the first place, though, rather than stringing him along.
Sorry, I figured it was obvious.
It wasn't, no, because of the mess with modules and the nasty
On Thu, 2013-07-11 at 14:51 +0200, Alexander Graf wrote:
I don't like bloat usually. But Alexey even had an #ifdef DEBUG in
there to selectively disable in-kernel handling of multi-TCE. Not
calling ENABLE_CAP would give him exactly that without ugly #ifdefs in
the kernel.
I don't see much
On Thu, 2013-07-11 at 15:12 +1000, Alexey Kardashevskiy wrote:
Any debug code is prohibited? Ok, I'll remove.
Debug code that requires code changes is prohibited, yes.
Debug code that is runtime-switchable (pr_debug, trace points, etc.)
is allowed.
Bollox.
$ grep DBG\( arch/powerpc/
On 11.07.2013, at 14:33, Benjamin Herrenschmidt wrote:
On Thu, 2013-07-11 at 15:12 +1000, Alexey Kardashevskiy wrote:
Any debug code is prohibited? Ok, I'll remove.
Debug code that requires code changes is prohibited, yes.
Debug code that is runtime switchable (pr_debug, trace points, etc)
On 07/11/2013 10:58 PM, Benjamin Herrenschmidt wrote:
On Thu, 2013-07-11 at 14:51 +0200, Alexander Graf wrote:
I don't like bloat usually. But Alexey even had an #ifdef DEBUG in
there to selectively disable in-kernel handling of multi-TCE. Not
calling ENABLE_CAP would give him exactly that
On 11.07.2013, at 15:13, Alexey Kardashevskiy wrote:
On 07/11/2013 10:58 PM, Benjamin Herrenschmidt wrote:
On Thu, 2013-07-11 at 14:51 +0200, Alexander Graf wrote:
I don't like bloat usually. But Alexey even had an #ifdef DEBUG in
there to selectively disable in-kernel handling of multi-TCE.
Hi All,
I compiled the latest kernel 3.10.0+ pulled from git on top of
3.10.0-rc5+ with the new Virtualization features enabled. The compilation
was successful, but when I rebooted the machine it failed to boot with the
error: systemd [1] : Failed to mount /dev : no such device.
Is it a problem
On 11.07.2013, at 15:41, chandrashekar shastri wrote:
Hi All,
I compiled the latest kernel 3.10.0+ pulled from git on top of
3.10.0-rc5+ with the new Virtualization features enabled. The compilation was
successful, but when I rebooted the machine it failed to boot with the error:
systemd
On Thu, 2013-07-11 at 12:11 +0200, Alexander Graf wrote:
So I must add one more ioctl to enable in-kernel MULTITCE handling. Is that
what you are saying?
I can see KVM_CHECK_EXTENSION but I do not see KVM_ENABLE_EXTENSION or
anything like that.
KVM_ENABLE_CAP. It's how we enable sPAPR
On 07/11/2013 11:41 PM, chandrashekar shastri wrote:
Hi All,
I compiled the latest kernel 3.10.0+ pulled from git on top of
3.10.0-rc5+ with the new Virtualization features enabled. The compilation
was successful, but when I rebooted the machine it failed to boot with the error:
systemd [1] :
On Thu, 2013-07-11 at 15:07 +0200, Alexander Graf wrote:
Ok, let me quickly explain the problem.
We are leaving host context, switching slowly into guest context.
During that transition we call get_paca() indirectly (apparently by
another call to hard_disable() which sounds bogus, but that's
On Thu, 2013-07-11 at 11:18 -0500, Scott Wood wrote:
If we set IRQs as soft-disabled prior to calling hard_irq_disable(),
then hard_irq_disable() will fail to call trace_hardirqs_off().
Sure because setting them as soft-disabled will have done it.
However by doing so, you also create the
This adds a per-VM mutex to provide mutual exclusion between vcpus
for accesses to and updates of the guest hashed page table (HPT).
This also makes the code use single-byte writes to the HPT entry
when updating of the reference (R) and change (C) bits. The reason
for doing this, rather than
On 07/12/2013 08:19 AM, Benjamin Herrenschmidt wrote:
On Thu, 2013-07-11 at 15:07 +0200, Alexander Graf wrote:
Ok, let me quickly explain the problem.
We are leaving host context, switching slowly into guest context.
During that transition we call get_paca() indirectly (apparently by
another
On 07/12/2013 12:36 AM, Scott Wood wrote:
On 07/11/2013 11:30:41 AM, Alexander Graf wrote:
On 11.07.2013, at 18:18, Scott Wood wrote:
On 07/11/2013 08:07:30 AM, Alexander Graf wrote:
get_paca() warns when we're preemptible. We're only not preemptible when
either preempt is disabled or irqs
On Fri, 2013-07-12 at 10:13 +0800, tiejun.chen wrote:
#define hard_irq_disable()	do {	\
	u8 _was_enabled = get_paca()->soft_enabled;	\
The problem I am hitting is triggered by the line above.
	__hard_irq_disable();	\
-
On 07/12/2013 11:57 AM, Benjamin Herrenschmidt wrote:
On Fri, 2013-07-12 at 10:13 +0800, tiejun.chen wrote:
#define hard_irq_disable()	do {	\
	u8 _was_enabled = get_paca()->soft_enabled;	\
The problem I am hitting is triggered by the line above.