On 18.07.2013, at 12:08, “tiejun.chen” wrote:

> On 07/18/2013 05:48 PM, Alexander Graf wrote:
>> 
>> On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote:
>> 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Bhushan Bharat-R65777
>>>> Sent: Thursday, July 18, 2013 1:53 PM
>>>> To: '"�tiejun.chen�"'
>>>> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood 
>>>> Scott-
>>>> B07421
>>>> Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel
>>>> managed pages
>>>> 
>>>> 
>>>> 
>>>>> -----Original Message-----
>>>>> From: "�tiejun.chen�" [mailto:tiejun.c...@windriver.com]
>>>>> Sent: Thursday, July 18, 2013 1:52 PM
>>>>> To: Bhushan Bharat-R65777
>>>>> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood
>>>>> Scott-
>>>>> B07421
>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
>>>>> kernel managed pages
>>>>> 
>>>>> On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote:
>>>>>> 
>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: kvm-ppc-ow...@vger.kernel.org
>>>>>>> [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of "�tiejun.chen�"
>>>>>>> Sent: Thursday, July 18, 2013 1:01 PM
>>>>>>> To: Bhushan Bharat-R65777
>>>>>>> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
>>>>>>> Wood
>>>>>>> Scott-
>>>>>>> B07421
>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for
>>>>>>> kernel managed pages
>>>>>>> 
>>>>>>> On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: "�tiejun.chen�" [mailto:tiejun.c...@windriver.com]
>>>>>>>>> Sent: Thursday, July 18, 2013 11:56 AM
>>>>>>>>> To: Bhushan Bharat-R65777
>>>>>>>>> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de;
>>>>>>>>> Wood
>>>>>>>>> Scott- B07421; Bhushan Bharat-R65777
>>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only
>>>>>>>>> for kernel managed pages
>>>>>>>>> 
>>>>>>>>> On 07/18/2013 02:04 PM, Bharat Bhushan wrote:
>>>>>>>>>> If there is a struct page for the requested mapping then it's
>>>>>>>>>> normal DDR and the mapping sets "M" bit (coherent, cacheable)
>>>>>>>>>> else this is treated as I/O and we set  "I + G"  (cache
>>>>>>>>>> inhibited,
>>>>>>>>>> guarded)
>>>>>>>>>> 
>>>>>>>>>> This helps setting proper TLB mapping for direct assigned device
>>>>>>>>>> 
>>>>>>>>>> Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
>>>>>>>>>> ---
>>>>>>>>>>    arch/powerpc/kvm/e500_mmu_host.c |   17 ++++++++++++-----
>>>>>>>>>>    1 files changed, 12 insertions(+), 5 deletions(-)
>>>>>>>>>> 
>>>>>>>>>> diff --git a/arch/powerpc/kvm/e500_mmu_host.c
>>>>>>>>>> b/arch/powerpc/kvm/e500_mmu_host.c
>>>>>>>>>> index 1c6a9d7..089c227 100644
>>>>>>>>>> --- a/arch/powerpc/kvm/e500_mmu_host.c
>>>>>>>>>> +++ b/arch/powerpc/kvm/e500_mmu_host.c
>>>>>>>>>> @@ -64,13 +64,20 @@ static inline u32
>>>>>>>>>> e500_shadow_mas3_attrib(u32 mas3, int
>>>>>>>>> usermode)
>>>>>>>>>>      return mas3;
>>>>>>>>>>    }
>>>>>>>>>> 
>>>>>>>>>> -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int
>>>>>>>>>> usermode)
>>>>>>>>>> +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
>>>>>>>>>>    {
>>>>>>>>>> +    u32 mas2_attr;
>>>>>>>>>> +
>>>>>>>>>> +    mas2_attr = mas2 & MAS2_ATTRIB_MASK;
>>>>>>>>>> +
>>>>>>>>>> +    if (!pfn_valid(pfn)) {
>>>>>>>>> 
>>>>>>>>> Why not directly use kvm_is_mmio_pfn()?
>>>>>>>> 
>>>>>>>> What I understand from this function (someone can correct me) is
>>>>>>>> that it
>>>>>>> returns "false" when the page is managed by kernel and is not
>>>>>>> marked as RESERVED (for some reason). For us it does not matter
>>>>>>> whether the page is reserved or not, if it is kernel visible page then 
>>>>>>> it
>>>> is DDR.
>>>>>>>> 
>>>>>>> 
>>>>>>> I think you are setting I|G by addressing all mmio pages, right? If
>>>>>>> so,
>>>>>>> 
>>>>>>>      KVM: direct mmio pfn check
>>>>>>> 
>>>>>>>      Userspace may specify memory slots that are backed by mmio
>>>>>>> pages rather than
>>>>>>>      normal RAM.  In some cases it is not enough to identify these
>>>>>>> mmio
>>>>> pages
>>>>>>>      by pfn_valid().  This patch adds checking the PageReserved as well.
>>>>>> 
>>>>>> Do you know what are those "some cases" and how checking
>>>>>> PageReserved helps in
>>>>> those cases?
>>>>> 
>>>>> No, myself didn't see these actual cases in qemu,too. But this should
>>>>> be chronically persistent as I understand ;-)
>>>> 
>>>> Then I will wait till someone educate me :)
>>> 
>>> The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not 
>>> want to call this for all tlbwe operation unless it is necessary.
>> 
>> It certainly does more than we need and potentially slows down the fast path 
>> (RAM mapping). The only thing it does on top of "if (pfn_valid())" is to 
>> check for pages that are declared reserved on the host. This happens in 2 
>> cases:
>> 
>>   1) Non cache coherent DMA
>>   2) Memory hot remove
>> 
>> The non coherent DMA case would be interesting, as with the mechanism as it 
>> is in place in Linux today, we could potentially break normal guest 
>> operation if we don't take it into account. However, it's Kconfig guarded by:
>> 
>>         depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON
>>         default n if PPC_47x
>>         default y
>> 
>> so we never hit it with any core we care about ;).
>> 
>> Memory hot remove does not exist on e500 FWIW, so we don't have to worry 
>> about that one either.
> 
> Thanks for this good information :)
> 
> So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside 
> kvm_is_mmio_pfn() to make sure that check is only valid when that is really 
> needed? This can decrease those unnecessary performance loss.
> 
> If I'm wrong please correct me :)

You're perfectly right, but this is generic KVM code. So it gets run across all 
architectures. What if someone has the great idea to add a new case here for 
x86, but doesn't tell us? In that case we potentially break x86.

I'd rather not like to break x86 :).

However, it'd be very interesting to see a benchmark with this change. Do you 
think you could just rip out the whole reserved check and run a few benchmarks 
and show us the results?


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to