Re: Re-arrange contents of struct cpuinfo - kern/52919
On 16/07/2018 at 00:27, Paul Goyette wrote:
> Since maxv@ has already done some rearranging, but so far has not
> bumped the system version, I would like to do some more re-arrangement.
> This will group all the XEN stuff together, as well as move all of the
> conditional parts of struct cpuinfo to the end, following all of the
> non-conditional parts.

Yes, please proceed.
Re-arrange contents of struct cpuinfo - kern/52919
Since maxv@ has already done some rearranging, but so far has not bumped the system version, I would like to do some more re-arrangement. This will group all the XEN stuff together, as well as move all of the conditional parts of struct cpuinfo to the end, following all of the non-conditional parts.

Please review the attached patch and let me know if there are any serious objections. I'd like to commit within the next day or two, along with a kernel rev bump...

+------------------+--------------------------+----------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:          |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+

Index: cpu.h
===================================================================
RCS file: /cvsroot/src/sys/arch/x86/include/cpu.h,v
retrieving revision 1.95
diff -u -p -r1.95 cpu.h
--- cpu.h	15 Jul 2018 08:47:43 -	1.95
+++ cpu.h	15 Jul 2018 22:23:37 -
@@ -127,9 +127,6 @@ struct cpu_info {
 	uint64_t ci_scratch;
 	uintptr_t ci_pmap_data[128 / sizeof(uintptr_t)];
 
-#ifdef XEN
-	u_long ci_evtmask[NR_EVENT_CHANNELS]; /* events allowed on this CPU */
-#endif
 	struct intrsource *ci_isources[MAX_INTR_SOURCES];
 
 	volatile int	ci_mtx_count;	/* Negative count of spin mutexes */
@@ -174,6 +171,44 @@ struct cpu_info {
 	u_int ci_cflush_lsize;	/* CLFLUSH insn line size */
 	struct x86_cache_info ci_cinfo[CAI_COUNT];
 
+	device_t	ci_frequency;	/* Frequency scaling technology */
+	device_t	ci_padlock;	/* VIA PadLock private storage */
+	device_t	ci_temperature;	/* Intel coretemp(4) or equivalent */
+	device_t	ci_vm;		/* Virtual machine guest driver */
+
+	/*
+	 * Segmentation-related data.
+	 */
+	union descriptor *ci_gdt;
+	struct cpu_tss	*ci_tss;	/* Per-cpu TSSes; shared among LWPs */
+	int ci_tss_sel;			/* TSS selector of this cpu */
+
+	/*
+	 * The following two are actually region_descriptors,
+	 * but that would pollute the namespace.
+	 */
+	uintptr_t	ci_suspend_gdt;
+	uint16_t	ci_suspend_gdt_padding;
+	uintptr_t	ci_suspend_idt;
+	uint16_t	ci_suspend_idt_padding;
+
+	uint16_t	ci_suspend_tr;
+	uint16_t	ci_suspend_ldt;
+	uintptr_t	ci_suspend_fs;
+	uintptr_t	ci_suspend_gs;
+	uintptr_t	ci_suspend_kgs;
+	uintptr_t	ci_suspend_efer;
+	uintptr_t	ci_suspend_reg[12];
+	uintptr_t	ci_suspend_cr0;
+	uintptr_t	ci_suspend_cr2;
+	uintptr_t	ci_suspend_cr3;
+	uintptr_t	ci_suspend_cr4;
+	uintptr_t	ci_suspend_cr8;
+
+	/* The following must be in a single cache line. */
+	int		ci_want_resched __aligned(64);
+	int		ci_padout __aligned(64);
+
 #ifndef __HAVE_DIRECT_MAP
 #define VPAGE_SRC 0
 #define VPAGE_DST 1
@@ -201,42 +236,24 @@ struct cpu_info {
 	vaddr_t ci_svs_utls;
 #endif
 
-#if defined(XEN) && (defined(PAE) || defined(__x86_64__))
+#if defined(XEN)
+#if defined(PAE) || defined(__x86_64__)
 	/* Currently active user PGD (can't use rcr3() with Xen) */
 	pd_entry_t *	ci_kpm_pdir;	/* per-cpu PMD (va) */
 	paddr_t		ci_kpm_pdirpa;	/* per-cpu PMD (pa) */
 	kmutex_t	ci_kpm_mtx;
+#endif /* defined(PAE) || defined(__x86_64__) */
+
 #if defined(__x86_64__)
 	/* per-cpu version of normal_pdes */
 	pd_entry_t *	ci_normal_pdes[3]; /* Ok to hardcode. only for x86_64 && XEN */
 	paddr_t		ci_xen_current_user_pgd;
-#endif /* __x86_64__ */
-#endif /* XEN et.al */
-
-#ifdef XEN
-	size_t		ci_xpq_idx;
-#endif
+#endif /* defined(__x86_64__) */
 
-#ifndef XEN
-	struct evcnt ci_ipi_events[X86_NIPI];
-#else /* XEN */
+	u_long ci_evtmask[NR_EVENT_CHANNELS]; /* events allowed on this CPU */
 	struct evcnt ci_ipi_events[XEN_NIPIS];
 	evtchn_port_t ci_ipi_evtchn;
-#endif /* XEN */
-
-	device_t	ci_frequency;	/* Frequency scaling technology */
-	device_t	ci_padlock;	/* VIA PadLock private storage */
-	device_t	ci_temperature;	/* Intel coretemp(4) or equivalent */
-	device_t	ci_vm;		/* Virtual machine guest driver */
-
-	/*
-	 * Segmentation-related data.
-	 */
-	union descriptor *ci_gdt;
-	struct cpu_tss	*ci_tss;	/* Per-cpu TSSes; shared among LWPs */
-	int ci_tss_sel;			/* TSS selector of this cpu */
-
-#ifdef XEN
+	size_t		ci_xpq_idx;
 	/* Xen raw system time at which we last ran hardclock.
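
One likely effect of moving conditional members to the end of a structure is that the unconditional members keep the same offsets regardless of the build options. The sketch below illustrates this with three small structures. All names starting with demo_ are made up for the illustration and are not the real struct cpu_info layout; the "optional" member stands in for something like the XEN-only fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout as built without the optional (e.g. Xen-only) member. */
struct demo_ci_base {
	uint64_t	ci_scratch;
	int		ci_tss_sel;
};

/* Optional member in the middle: every later member shifts. */
struct demo_ci_middle {
	uint64_t	ci_scratch;
	unsigned long	ci_evtmask[8];	/* optional */
	int		ci_tss_sel;
};

/* Optional member at the end: earlier members keep their offsets. */
struct demo_ci_tail {
	uint64_t	ci_scratch;
	int		ci_tss_sel;
	unsigned long	ci_evtmask[8];	/* optional */
};

/* Returns 1 if ci_tss_sel sits at the same offset as in the base layout. */
int
demo_tail_keeps_offsets(void)
{
	return offsetof(struct demo_ci_tail, ci_tss_sel) ==
	    offsetof(struct demo_ci_base, ci_tss_sel);
}

/* Same check for the layout with the optional member in the middle. */
int
demo_middle_keeps_offsets(void)
{
	return offsetof(struct demo_ci_middle, ci_tss_sel) ==
	    offsetof(struct demo_ci_base, ci_tss_sel);
}
```

With the optional member last, demo_tail_keeps_offsets() reports matching offsets, while demo_middle_keeps_offsets() does not. Any layout change of this kind still alters what compiled code expects, which is consistent with the kernel rev bump mentioned in the message.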
Re: aarch64 gcc kernel compilation
On 15.07.2018 20:08, Christos Zoulas wrote:
> Hi,
>
> Gcc is now working on aarch64 but the kernel does not compile because of
> some idiomatic clang code that is not supported by gcc (at least gcc-6)
>
> To define constants, it uses:
>
> static const uintmax_t
>     FOO = __BIT(9),
>     BAR = FOO;
>
> While this is nice, especially for the debugger, it produces an error
> in gcc. While fixing these is easy, gcc also complains about using the
> constants as switch labels. Thus it is better to just nuke 'em all and
> rewrite them as:
>
> #define FOO __BIT(9)
> #define BAR FOO
>
> Should I go ahead and do it, or is there a smarter solution?
>
> christos
>

I used to have problems building the aarch64 rump kernel on Linux with GCC (some years ago) due to the use of __uint128_t in reg.h. Can we drop it?

The __uint128_t type is not used anywhere else in the aarch64 subdirs. It's used in assembly, in FPREG_Q0-FPREG_Q31 in cpuswitch.S. The same optimization can be done without __uint128_t; it probably just needs proper alignment of fp_reg (15).

There is also some mysterious fallout: General Purpose Registers in core files are shipped in 128-bit containers. That is not compatible with LLDB and requires needless generic work for no purpose.

I can try to prepare a patch blindly and share it with the aarch64 owners.
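
As a rough illustration of the idea above, here is a minimal sketch of a 128-bit register container that avoids __uint128_t and relies on explicit alignment instead. It assumes a C11 compiler (alignas/alignof from <stdalign.h>); the demo_ type and function names are hypothetical and are not the actual reg.h definitions.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit FP register container; not the real reg.h layout. */
struct demo_fpreg128 {
	alignas(16) uint64_t q[2];	/* low and high 64-bit halves */
};

/* Hypothetical register file holding 32 entries, as for Q0-Q31. */
struct demo_fpregs {
	struct demo_fpreg128 fp_reg[32];
};

/* Same size and alignment a 128-bit integer member would typically give. */
_Static_assert(sizeof(struct demo_fpreg128) == 16, "128-bit container");
_Static_assert(alignof(struct demo_fpreg128) == 16, "16-byte alignment");

/* Byte offset of register n, the kind of value assembly code would use. */
size_t
demo_fpreg_offset(unsigned n)
{
	return offsetof(struct demo_fpregs, fp_reg) +
	    (size_t)n * sizeof(struct demo_fpreg128);
}
```

Because the size and alignment are unchanged, offsets computed for assembly (16 bytes per register in this sketch) would stay the same as with a 128-bit integer member.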
Re: Too many PMC implementations
On Sun, 15 Jul 2018, Maxime Villard wrote:
> Now I want to move:
>     arch/x86/x86/tprof_pmi.c
>     arch/x86/x86/tprof_amdpmi.c
> into
>     dev/tprof/tprof_intel.c
>     dev/tprof/tprof_amd.c
> I guess people are fine? I think it is better to gather all the pieces
> in one dir.

I don't really have an opinion here, but I've just committed a new backend as dev/tprof/tprof_armv8.c. So I guess that's a vote for the latter :)

Cheers,
Jared
aarch64 gcc kernel compilation
Hi,

Gcc is now working on aarch64 but the kernel does not compile because of some idiomatic clang code that is not supported by gcc (at least gcc-6).

To define constants, it uses:

static const uintmax_t
    FOO = __BIT(9),
    BAR = FOO;

While this is nice, especially for the debugger, it produces an error in gcc. While fixing these is easy, gcc also complains about using the constants as switch labels. Thus it is better to just nuke 'em all and rewrite them as:

#define FOO __BIT(9)
#define BAR FOO

Should I go ahead and do it, or is there a smarter solution?

christos
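
The underlying issue is that in C (unlike C++) a const-qualified object is not an integer constant expression, so it is not guaranteed to work in a static initializer or as a case label. The sketch below shows the macro form from the message used in a switch, plus an enum as one possible alternative. DEMO_BIT is a stand-in for NetBSD's __BIT(), and BAZ, DEMO_QUX and demo_classify are hypothetical names added for the example.

```c
#include <stdint.h>

/* Stand-in for NetBSD's __BIT(); an assumption made for this sketch. */
#define DEMO_BIT(n)	((uintmax_t)1 << (n))

/*
 * The form described as rejected by gcc, shown for reference only:
 *
 *	static const uintmax_t FOO = DEMO_BIT(9), BAR = FOO;
 *
 * FOO would be a const-qualified object, not a constant expression.
 */

/* Macro form proposed in the message: valid as case labels. */
#define FOO	DEMO_BIT(9)
#define BAZ	DEMO_BIT(10)

/*
 * Possible alternative: enumerators are constant expressions too, but
 * before C23 they are limited to values that fit in an int.
 */
enum { DEMO_QUX = 1 << 11 };

/* Maps a flag value to a small code: 1 = FOO, 2 = BAZ, 3 = DEMO_QUX. */
int
demo_classify(uintmax_t v)
{
	switch (v) {
	case FOO:
		return 1;
	case BAZ:
		return 2;
	case DEMO_QUX:
		return 3;
	default:
		return 0;
	}
}
```

The enum variant typically keeps the names visible to a debugger, which the macro form does not, but it cannot express constants wider than int on older compilers, so it would only cover part of a uintmax_t-based set of flags.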
Re: Too many PMC implementations
On 11/07/2018 at 18:22, Maxime Villard wrote:
> Right now we have three (or more?) different implementations for
> Performance Monitoring Counters:
>
> * PMC: this one is MI. It is used only on one ARM model (xscale I
>   think). There used to be x86 code for it, but it was broken, and I
>   removed it. The implementation comes with libpmc, a library we
>   provide. The code hasn't moved these last 15 years. I don't like this
>   implementation, it is really invasive (see the numerous pmc.h files
>   that are all empty).
>
> * X86PMC: this one is MD, and only available for x86. I wrote it
>   myself. The code is small (x86/pmc.c), and functional. The PMCs are
>   system-wide, and retrieved on a per-cpu basis. But this
>   implementation does not support tracking, that is, we get numbers
>   (about the cache misses for example), but we don't know where they
>   happened.
>
> * TPROF: this one is MI, but only x86 support is present. TPROF
>   provides the backend needed to support tracking: via a device, that
>   userland can read from, in order to absorb the event samples produced
>   by the kernel. The backend is pretty good, but the frontend (where
>   the user chooses which PMC etc) is nonexistent - the CPU/event
>   detection is not there either. The backend is MI
>   (/dev/tprof/tprof.c), and can be used on other architectures. The
>   module already exists to dynamically modload.
>
> I think it would be good to:
>
> * Remove PMC entirely. Then remove libpmc too.
>
> * Merge X86PMC into the x86 part of TPROF. That is to say, into
>   x86/tprof_*. Then remove X86PMC.
>
> * Later, maybe, someone will want to add other architectures in TPROF,
>   like all the recent ARMs.
>
> Maxime

Now I want to move:
    arch/x86/x86/tprof_pmi.c
    arch/x86/x86/tprof_amdpmi.c
into
    dev/tprof/tprof_intel.c
    dev/tprof/tprof_amd.c

I guess people are fine? I think it is better to gather all the pieces in one dir.