On Thu, Jun 22, 2017 at 02:37:27PM +0530, Anshuman Khandual wrote:
> On 06/17/2017 09:22 AM, Ram Pai wrote:
> > Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6
> > in the 4K backed hpte pages. These bits continue to be used
> > for 64K backed hpte pages in this patch, but will be freed
> > up in the next patch.
> >
> > The patch does the following change to the 64K PTE format
> >
> > H_PAGE_BUSY moves from bit 3 to bit 9
> > H_PAGE_F_SECOND which occupied bit 4 moves to the second part
> > 	of the pte.
> > H_PAGE_F_GIX which occupied bit 5, 6 and 7 also moves to the
> > 	second part of the pte.
> >
> > The four bits (H_PAGE_F_SECOND|H_PAGE_F_GIX) that represent a slot
> > are initialized to 0xF, indicating an invalid slot. If a hpte
> > gets cached in a 0xF slot (i.e. 7th slot of secondary), it is
> > released immediately. In other words, even though 0xF is a
> > valid slot we discard and consider it as an invalid
> > slot; i.e. hpte_soft_invalid(). This gives us an opportunity to not
> > depend on a bit in the primary PTE in order to determine the
> > validity of a slot.
> >
> > When we release a hpte in the 0xF slot we also release a
> > legitimate primary slot and unmap that entry. This is to
> > ensure that we do get a legitimate non-0xF slot the next time we
> > retry for a slot.
> >
> > Though treating 0xF slot as invalid reduces the number of available
> > slots and may have an effect on the performance, the probability
> > of hitting a 0xF is extremely low.
> >
> > Compared to the current scheme, the above described scheme reduces
> > the number of false hash table updates significantly and has the
> > added advantage of releasing four valuable PTE bits for other
> > purpose.
> >
> > This idea was jointly developed by Paul Mackerras, Aneesh, Michael
> > Ellerman and myself.
> >
> > 4K PTE format remains unchanged currently.
>
> Scanned through the PTE format again for hash 64K and 4K. It seems
> to me that there might be 5 free bits already present on the PTE
> format. I might have seriously mistaken something here :) Please
> correct me if that is not the case. _RPAGE_RPN* I think is applicable
> only for hash page table format and will not be available for radix
> later.
>
> +#define _PAGE_FREE_1	0x0000000000000040UL /* Not used */
> +#define _RPAGE_SW0	0x2000000000000000UL /* Not used */
> +#define _RPAGE_SW1	0x0000000000000800UL /* Not used */
> +#define _RPAGE_RPN42	0x0040000000000000UL /* Not used */
> +#define _RPAGE_RPN41	0x0020000000000000UL /* Not used */
>
The bits were chosen to future-proof the radix implementation. Using
_RPAGE_SW* would eat into the bits available to software in the
future, and the key bits will certainly be something that the radix
hardware reads in the future. The _RPAGE_RPN* bits cannot be relied
on for radix.

Finally, the bits that we chose (H_PAGE_F_SECOND|H_PAGE_F_GIX) had the
best potential for giving us the largest number of free bits with
relatively little effort.

RP
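
For anyone following the quoted description, here is a rough sketch of
the "0xF means invalid" slot encoding. It is not the code from the
patch. The names (SLOT_*, slot_encode(), slot_get(), and so on) are
made up for illustration, and the layout of sixteen 4-bit fields
packed into one 64-bit word is an assumption about how the second
part of the pte might be organized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; NOT the kernel's actual definitions. */
#define SLOT_SECONDARY 0x8u  /* stands in for H_PAGE_F_SECOND */
#define SLOT_GIX_MASK  0x7u  /* stands in for H_PAGE_F_GIX (index in group) */
#define SLOT_INVALID   0xFu  /* secondary group, index 7: treated as "no slot" */
#define SLOT_BITS      4u
#define SUBPAGES       16u   /* assumption: one 4-bit field per 4K subpage */

/* Pack a (secondary, group index) pair into a 4-bit slot value. */
static inline unsigned slot_encode(bool secondary, unsigned gix)
{
	assert(gix <= SLOT_GIX_MASK);
	return (secondary ? SLOT_SECONDARY : 0u) | gix;
}

/* A slot is usable unless it carries the reserved 0xF pattern. */
static inline bool slot_is_valid(unsigned slot)
{
	return (slot & 0xFu) != SLOT_INVALID;
}

/* Every field starts as 0xF, so no subpage has a cached slot yet. */
static inline uint64_t slots_init(void)
{
	return ~UINT64_C(0);
}

/* Read the 4-bit slot value recorded for one subpage. */
static inline unsigned slot_get(uint64_t word, unsigned subpage)
{
	assert(subpage < SUBPAGES);
	return (unsigned)((word >> (subpage * SLOT_BITS)) & 0xFu);
}

/* Return a copy of the word with one subpage's slot value replaced. */
static inline uint64_t slot_set(uint64_t word, unsigned subpage, unsigned slot)
{
	unsigned shift;

	assert(subpage < SUBPAGES);
	shift = subpage * SLOT_BITS;
	word &= ~(UINT64_C(0xF) << shift);
	return word | ((uint64_t)(slot & 0xFu) << shift);
}
```

With this encoding, validity is decided from the slot value alone, so
no separate bit in the primary PTE is needed. A caller that gets 0xF
back from an insertion would release that hpte and retry, as the
quoted description says.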