[PATCH v3 0/3] cgroup: Introducing bypass mode
v2->v3:
 - Remove invalid cgroup subdirectory creation patch.
 - Add use cases for the bypass mode and remove statements about
   control files ownership in cgroup-v2.txt.
 - Restrict bypass mode to non-domain (threaded) controllers only.

v1->v2:
 - Remove relax no-internal-process constraint patch as this feature is
   in the thread mode v4 patch.
 - Remove subtree root mode patch.
 - Remove the skip dying css patch as I can no longer reproduce the
   problem.
 - Rework the bypass mode so that a write to "cgroup.controllers" to
   enable or disable controller interface files is only allowed if the
   parent grants bypass mode to children by writing the '#'-prefixed
   controller to "cgroup.subtree_control".
 - Add a patch to disable subdirectory creation on an invalid domain.

v1 patch - https://lkml.org/lkml/2017/6/14/551
v2 patch - https://lkml.org/lkml/2017/7/21/606

This patchset introduces a new capability to the cgroup v2 core to give
more freedom and flexibility to non-domain controllers so that they can
shape their own unique views of the virtual cgroup hierarchies that
best suit their own use cases. It also enables a cgroup parent to
selectively enable a non-domain controller in a subset of its child
cgroups instead of in either all or none of them.

The bypass mode cannot be used on domain controllers as it would
complicate the resource distribution model and rules.

One use case is an application that wants to use cpuset, for example,
to bind some worker threads to individual cpus. At the same time, the
application may also want to use the cpu controller to limit the amount
of cpu time consumed by some other threads. Right now, the only way to
do that with the current v2 control scheme is to create child cgroups
with both cpu and cpuset controllers enabled and put the desired
processes or threads into those child cgroups. The cost of enabling
cpuset on a task that needs the cpu controller is negligible. However,
the cost of enabling the cpu controller on tasks that only need cpuset
can be noticeable. The performance difference may become a concern for
users who are thinking of moving from cgroup v1 to v2.

Similarly, if we want to use perf_event, freezer or other non-domain
controllers instead of cpuset in a subset of tasks, we will also need
to enable the cpu controller along with the associated performance
cost.

With bypass mode, we will have the ability to enable just the
non-domain controllers the tasks need in their respective child
cgroups. It is just like what we can currently do with cgroup v1.

This patchset is layered on top of the "for-4.14" branch of Tejun's
cgroup git tree.

Patch 1 introduces a new bypass mode that allows a non-domain
controller to be disabled in a cgroup, but re-enabled in its children.
This is enabled by writing the controller name prefixed with '#' to
the "cgroup.subtree_control" file. Then all its children will have
this controller in bypass mode.

Patch 2 extends the bypass mode mechanism to allow those child cgroups
that are put into the bypass mode for a particular non-domain
controller by their parent to be re-enabled by writing the controller
name with the '+' prefix to the "cgroup.controllers" file.

Patch 3 extends the debug controller to expose additional controller
masks introduced by this patchset.

Waiman Long (3):
  cgroup: subtree_control bypass mode for non-domain controllers
  cgroup: Allow reenabling of controller in bypass mode
  cgroup: Make debug controller report new controller masks

 Documentation/cgroup-v2.txt | 58 +++---
 include/linux/cgroup-defs.h | 19 +++-
 kernel/cgroup/cgroup.c      | 250 +++-
 kernel/cgroup/debug.c       | 2 +
 4 files changed, 257 insertions(+), 72 deletions(-)

--
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
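The interaction between a parent granting bypass mode and a child
re-enabling a controller can be sketched as a small userspace model.
This is only an illustration under stated assumptions, not the code in
kernel/cgroup/cgroup.c: the toy_* names and controller bits are
hypothetical, the choice of which controllers count as non-domain is
assumed, and only the mask names subtree_bypass and enable_ss_mask are
taken from patch 3.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical controller bits, used by this toy model only. */
enum {
	CTRL_MEMORY     = 1u << 0,	/* assumed to be a domain controller */
	CTRL_CPUSET     = 1u << 1,	/* assumed to be non-domain */
	CTRL_PERF_EVENT = 1u << 2,	/* assumed to be non-domain */
};
#define NON_DOMAIN_CTRLS (CTRL_CPUSET | CTRL_PERF_EVENT)

struct toy_cgroup {
	const struct toy_cgroup *parent;
	unsigned int subtree_control;	/* enabled in all children */
	unsigned int subtree_bypass;	/* '#'-prefixed: bypass in children */
	unsigned int enable_ss_mask;	/* '+' written by the child itself */
};

/* Models writing "#<ctrl>" to the parent's cgroup.subtree_control. */
static bool toy_set_bypass(struct toy_cgroup *cg, unsigned int ctrl)
{
	if (ctrl & ~NON_DOMAIN_CTRLS)
		return false;		/* domain controllers are rejected */
	cg->subtree_control &= ~ctrl;
	cg->subtree_bypass |= ctrl;
	return true;
}

/* Models writing "+<ctrl>" to the child's cgroup.controllers. */
static bool toy_reenable(struct toy_cgroup *child, unsigned int ctrl)
{
	if (!child->parent || (ctrl & ~child->parent->subtree_bypass))
		return false;		/* parent did not grant bypass mode */
	child->enable_ss_mask |= ctrl;
	return true;
}

/* Controllers that are effectively active in the child. */
static unsigned int toy_effective(const struct toy_cgroup *child)
{
	if (!child->parent)
		return 0;
	return child->parent->subtree_control |
	       (child->parent->subtree_bypass & child->enable_ss_mask);
}
```

In this model a parent that puts cpuset into bypass mode ends up with
cpuset active only in the children that explicitly re-enabled it, which
is the "subset of child cgroups" behaviour the cover letter describes.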
Re: [PATCH v3] printk: Add boottime and real timestamps
On Mon, Aug 07, 2017 at 02:17:33PM -0400, Prarit Bhargava wrote:
> On 08/07/2017 01:14 PM, Luis R. Rodriguez wrote:
> >
> > Note printk_late_init() is a late_initcall(). This means if the
> > printk_time_setting was disabled it will take a while to enable it.
> > Enabling it is done at the device_initcall(), so if printk setting is
> > disabled but a user enables it with a toggle of the module param there
> > is a period of time during which time resolution would be different.
>
> I'm not sure I follow your comment.  Could you elaborate with an example of
> what you think is going wrong or might be confusing?

Sure, let's consider this:

+static u64 printk_get_ts(void)
+{
+	u64 mono, offset_real;
+
+	if (printk_time <= PRINTK_TIME_LOCAL)
+		return local_clock();
+
+	if (printk_time == PRINTK_TIME_BOOT)
+		return ktime_get_boot_log_ts();
+
+	mono = ktime_get_real_log_ts(&offset_real);
+
+	if (printk_time == PRINTK_TIME_MONO)
+		return mono;
+
+	return mono + offset_real;
+}

So even if printk_time was flipped, in the end the backend routines used
will be local_clock(), ktime_get_boot_log_ts() or ktime_get_real_log_ts().

This is used here:

@@ -1643,7 +1756,7 @@ static bool cont_add(int facility, int level, enum log_flags flags, const char *
 		cont.facility = facility;
 		cont.level = level;
 		cont.owner = current;
-		cont.ts_nsec = local_clock();
+		cont.ts_nsec = printk_get_ts();
 		cont.flags = flags;
 	}

But let's inspect these new calls:

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
@@ -477,6 +479,24 @@ u64 notrace ktime_get_boot_fast_ns(void)
 }
 EXPORT_SYMBOL_GPL(ktime_get_boot_fast_ns);
 
+u64 ktime_get_real_log_ts(u64 *offset_real)
+{
+	*offset_real = ktime_to_ns(tk_core.timekeeper.offs_real);
+
+	if (timekeeping_active)
+		return ktime_get_mono_fast_ns();
+	else
+		return local_clock();
+}
+
+u64 ktime_get_boot_log_ts(void)
+{
+	if (timekeeping_active)
+		return ktime_get_boot_fast_ns();
+	else
+		return local_clock();
+}
+

So they are really only effectively calling something other than
what local_clock() returns *iff* timekeeping_active is true. But
this is only set later at the respective device_initcall() in this
file:

@@ -1530,6 +1550,8 @@ void __init timekeeping_init(void)
 
 	write_seqcount_end(&tk_core.seq);
 	raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+
+	timekeeping_active = 1;
 }

So when the boot param is processed and prints out that it has
changed, someone inspecting any time setting after that print
may assume it is from then on using ktime_get_mono_fast_ns() or
ktime_get_boot_fast_ns(), but this is not accurate: it will use
local_clock() until *after* device_initcall().

So in between boot and this particular device_initcall() time
resolution can only be local_clock(). Seems worth documenting
that.

  Luis
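The fallback window described above can be reproduced with a small
userspace model. This is a sketch, not kernel code: the clock functions
are stubs returning fixed, made-up values (100, 200, 300 and an offset
of 1000), model_timekeeping_init() is a hypothetical stand-in for the
point where timekeeping_active is set, and only the control flow
mirrors the quoted patch.

```c
#include <stdbool.h>
#include <stdint.h>

enum printk_time_mode {
	PRINTK_TIME_DISABLE,
	PRINTK_TIME_LOCAL,
	PRINTK_TIME_BOOT,
	PRINTK_TIME_MONO,
	PRINTK_TIME_REAL,
};

static int printk_time = PRINTK_TIME_LOCAL;
static bool timekeeping_active;
static uint64_t offs_real;		/* zero until timekeeping is set up */

/* Stub clocks with made-up values so each source is easy to tell apart. */
static uint64_t local_clock(void)            { return 100; }
static uint64_t ktime_get_boot_fast_ns(void) { return 200; }
static uint64_t ktime_get_mono_fast_ns(void) { return 300; }

/* Hypothetical stand-in for the end of timekeeping_init(). */
static void model_timekeeping_init(void)
{
	offs_real = 1000;
	timekeeping_active = true;
}

static uint64_t ktime_get_real_log_ts(uint64_t *offset_real)
{
	*offset_real = offs_real;
	return timekeeping_active ? ktime_get_mono_fast_ns() : local_clock();
}

static uint64_t ktime_get_boot_log_ts(void)
{
	return timekeeping_active ? ktime_get_boot_fast_ns() : local_clock();
}

/* Same decision structure as printk_get_ts() in the quoted patch. */
static uint64_t printk_get_ts(void)
{
	uint64_t mono, offset_real;

	if (printk_time <= PRINTK_TIME_LOCAL)
		return local_clock();
	if (printk_time == PRINTK_TIME_BOOT)
		return ktime_get_boot_log_ts();

	mono = ktime_get_real_log_ts(&offset_real);
	if (printk_time == PRINTK_TIME_MONO)
		return mono;
	return mono + offset_real;
}
```

Before model_timekeeping_init() runs, every mode in this model yields
the local clock value, whatever printk_time says; only afterwards do
the boot, monotonic and realtime modes diverge.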
Re: [PATCH v2 0/4] ipmi: bt-i2c: added IPMI Block Transfer over I2C
On Wed, Aug 9, 2017 at 7:26 PM, Corey Minyard wrote:
> On 08/09/2017 08:04 PM, Brendan Higgins wrote:
>>>
>>> Perhaps that is some level of abuse, but it's pretty common.  I'm not
>>> against it.
>>>
>>> There is standard IPMI firmware NetFN (though no commands defined) that
>>> if you use the driver automatically goes into "Maintenance mode" and
>>> modified the timeouts and handling to some extent to help with this.
>>
>> That is a really good point, I missed that.
>>
...
>>>
>>> There are ways to accomplish this that aren't that complex.  You can
>>> create an OEM command that can query the maximum message size and the
>>> ability to do sequence numbers in the messages.
>>>
>>> If messages larger than 32-bytes are supported, and the host I2C/SMBus
>>> driver supports it, you could use the standard SSIF SMBus commands to do
>>> this, they have an 8-bit length field.
>>>
>>> If sequence numbers are supported, The SSIF could use different SMBus
>>> commands to do the write and read requests.  Since this is only if you
>>> get an OEM command, and if you put the sequence numbers at the end where
>>> they are easy to add on the send side, this is a small change to the
>>> driver.
>>
>> What if we just had an OEM command that changed the message structure from
>> that point on? We could abuse the "maintenance mode" NetFN to get back
>> into normal SSIF if necessary.
>
> Actually, I wouldn't have a separate "openbmc mode".  I would have OpenBMC
> always work with standard SSIF, and have separate SMBus commands for
> messages with the sequence number and messages larger than 32 bytes.
>
> I've attached a patch with what I would expect the changes to be to the host
> driver.  It doesn't handle multiple outstanding messages, but it shows what
> detection and a separate SMBus command would look like.

I took a look at the patch; it seems reasonable. If I were maintaining SSIF,
I probably would not want that kind of clutter for my admittedly weird use
case, but if you're okay with it, then so am I.

>
>>> So I think the changes would be small and contained.  I'm actually ok
>>> with a different driver, but I think it would be more valuable to the
>>> OpenBMC project to have a standardized interface that would work (in a
>>> not quite as efficient mode) with software that does not use the Linux
>>> IPMI driver.
>>
>> I guess I see all of my asks as hacky things which we can hopefully
>> remove at some point. Hopefully, most OpenBMC users won't want or need
>> these things.
>>
...

Regardless of what we do with the "BT-I2C" stuff, I am still interested in
what you think about this.

>>>
>>> I think you are right, it probably belongs some place else.  The way that
>>> makes the most sense to me would be to have an "ipmi" directory with a
>>> "host" and "slave" side, and since ipmi is not really a char driver, to
>>> move it to the main driver directory.  That might be fairly disruptive,
>>> though.
>>
>> That was my thinking exactly.
>>
>>> The other option that makes sense to me would be to add a
>>> drivers/char/ipmi_slave directory, or something like that, and put the
>>> slave code there.  That would be less disruptive.
>>
>> Right that is the approach I took, except I called it
>> drivers/char/ipmi_bmc.
>>
>> I originally thought doing the less disruptive thing is best; however, I
>> know there are also some OpenBMC people who are interested in implementing
>> IPMB. So maybe now is the time to bite the bullet and create an ipmi
>> directory under drivers/.
>
> I'm not sure IPMB would make much difference, there's no host side change as
> it's already supported.  I don't think there would be any significant code
> sharing between the two.

No, I don't expect much code sharing between them. I just thought it would
be a reasonable place to put IPMB, sort of like how we have a bunch of
"character" device drivers in drivers/char, but I suppose that might be
somewhat of an anti-pattern ;-)

>
> If there end up being a significant amount of common code, then it would
> definitely be worth the effort to move it.
>
>>> -corey
>>
>> In summary, I think I can live with making it a mangled form of SSIF, but
>> I would prefer to put it in its own driver.
>
> You can look at the patch and consider it, and consider that you would need
> to implement flag and event handling.  On an x86 host there would be SMBIOS
> and ACPI stuff to deal with somehow for discovery.  There's probably few
> other things to deal with.
>
>> In any case, I think I would rather focus on the BMC side IPMI framework
>> now, since it is a bigger change and would also reduce the work of
>> implementing a BMC side SSIF driver.
>>
>> Here is what I propose: we focus on the BMC side IPMI framework RFC that
>> I sent out the other day:
>>
Re: [Linux-ima-devel] [PATCH, RESEND 08/12] ima: added parser for RPM data type
On 8/2/2017 9:22 AM, James Morris wrote:
> On Tue, 1 Aug 2017, Roberto Sassu wrote:
>> On 8/1/2017 12:27 PM, Christoph Hellwig wrote:
>>> On Tue, Aug 01, 2017 at 12:20:36PM +0200, Roberto Sassu wrote:
>>>> This patch introduces a parser for RPM packages. It extracts the digests
>>>> from the RPMTAG_FILEDIGESTS header section and converts them to binary
>>>> data before adding them to the hash table.
>>>>
>>>> The advantage of this data type is that verifiers can determine who
>>>> produced that data, as headers are signed by Linux distribution vendors.
>>>> RPM header signatures can be provided as digest list metadata.
>>>
>>> Err, parsing arbitrary file formats has no business in the kernel.
>>
>> The benefit of this choice is that no actions are required for
>> Linux distribution vendors to support the solution I'm proposing,
>> because they already provide signed digest lists (RPM headers).
>>
>> Since the proof of loading a digest list is the digest of the
>> digest list (included in the list metadata), if RPM headers are
>> converted to a different format, remote attestation verifiers
>> cannot check the signature.
>>
>> If the concern is security, it would be possible to prevent unsigned
>> RPM headers from being parsed, if the PGP key type is upstreamed
>> (adding in CC keyri...@vger.kernel.org).
>
> It's a security concern and also a layering violation, there should be no
> need to parse package file formats in the kernel.

Parsing RPMs is not strictly necessary. Digests from the headers
can be extracted and written to a new file using the compact data
format (introduced with patch 7/12).

At boot time, IMA measures this file before digests are uploaded to
the kernel. At this point, only files with an unknown digest will be
added to the measurement list.

At verification time, verifiers recreate the measurement list by
merging together the digests uploaded to the kernel with the unknown
digests. Then, they verify the obtained list.

There are two ways to verify the digests: searching them in a reference
database, or checking a signature. With the 'ima-sig' measurement list
template, it is possible to verify signatures for each accessed file.
With this patch set, it is possible to verify the signature of the
file containing the digests uploaded to the kernel. If the data format
changes, the signature cannot be verified.

To avoid this limitation, the parsers could be moved to a userspace
tool which then uploads the parsed digests to the kernel. IMA would
measure the original files. But, if the tool is compromised, it could
load digests not included in the parsed files. With the current
solution this problem does not arise because no changes can be made
by userspace applications to the uploaded data while digests are
parsed by IMA.

I could remove the RPM parser from the patch set for now. Is the
remaining part of the patch set ok, and is the explanation of what
it does clear?

Thanks

Roberto

> I'm not really clear on exactly how this patch series works.  Can you
> provide a more concrete explanation of what steps would occur during boot
> and attestation?

--
HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063
Managing Director: Bo PENG, Qiuen PENG, Shengli WANG
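The boot-time decision described in this reply (measure only files
whose digest is not in an uploaded list) can be sketched as below. This
is a toy userspace model, not the IMA implementation: the toy_* names
are hypothetical, digests are shortened to 4 bytes, and a linear scan
stands in for the hash table the patch set mentions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_DIGEST_LEN 4	/* real digests are much longer */

/* Hypothetical stand-in for the digests uploaded to the kernel. */
struct toy_digest_list {
	const unsigned char (*digests)[TOY_DIGEST_LEN];
	size_t count;
};

/* Linear scan for clarity; a hash table would be used for real lists. */
static bool toy_digest_known(const struct toy_digest_list *list,
			     const unsigned char *digest)
{
	for (size_t i = 0; i < list->count; i++)
		if (memcmp(list->digests[i], digest, TOY_DIGEST_LEN) == 0)
			return true;
	return false;
}

/*
 * Returns true when the file has to be added to the measurement list
 * (and a PCR extended), i.e. when its digest is not in the loaded list.
 */
static bool toy_needs_measurement(const struct toy_digest_list *list,
				  const unsigned char *digest)
{
	return !toy_digest_known(list, digest);
}
```

A verifier would then have to combine the uploaded list with the
entries for which toy_needs_measurement() returned true, which is the
merge step described for verification time.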
Re: [Linux-ima-devel] [PATCH 11/12] ima: don't report measurements if digests are included in the loaded lists
On 7/25/2017 11:44 AM, Roberto Sassu wrote:
> Don't report measurements if the file digest has been included in
> an uploaded digest list.
>
> The advantage of this solution is that the boot time overhead, when
> a TPM is available, is very small because a PCR is extended only
> for unknown files. The disadvantage is that verifiers do not know
> anymore which and when files are accessed (they must assume that
> the worst case happened, i.e. all files have been accessed).

Am I reading this correctly that you want to measure certain files, but
not ones that have been included in a "digest list", which sounds like a
white list of sorts? If so, I have two concerns:

1 - How would the client get this digest list? Shouldn't it be up to
the relying party to decide what is trusted and not trusted, not the
client? What of the case with two different relying parties that have a
different list of trusted applications? E.g., one trusts any version of
program X, while the other trusts only version 3.1 and up?

2 - What about files on the digest list that were not run? The relying
party may want to know if a program wasn't run, e.g., antivirus or a
firewall. If the rule is "don't measure if it's on the digest list",
how does the relying party know if it was run?
Re: [PATCH v4] printk: Add monotonic, boottime, and realtime timestamps
On Tue, Aug 08, 2017 at 07:08:00PM -0400, Prarit Bhargava wrote:
> On 08/08/2017 04:28 AM, Peter Zijlstra wrote:
> > On Mon, Aug 07, 2017 at 01:36:39PM -0700, Paul E. McKenney wrote:
> >> On Mon, Aug 07, 2017 at 04:06:09PM -0400, Prarit Bhargava wrote:
> >
> >>> peterz?  Want to offer a suggestion?  The issue is that I'm changing a
> >>> bool config option to an int and that impacts all the arch's defconfigs.
> >>> John points out that this is a lot of churn and we're both wondering if
> >>> there's a better way to do the configs.
> >>
> >> The usual approach is to keep the old bool Kconfig option, and add another
> >> int Kconfig option that depends on the original one.  The tests for
> >> the int value get a bit more complex, but one way to handle this is to
> >> define a cpp macro something like the following:
> >>
> >>	#ifdef CONFIG_OLD_OPTION
> >>	#define CPP_NEW_OPTION 0
> >>	#else
> >>	#define CPP_NEW_OPTION CONFIG_NEW_OPTION
> >>	#endif
> >>
> >> Then use CPP_NEW_OPTION, where zero means disabled and other numbers
> >> select the available options.
> >>
> >> Adjust to suit depending on what values mean what.
> >>
> >> Another approach is to make the range of the new Kconfig option
> >> depend on the old option:
> >>
> >>	config NEW_OPTION
> >>		int "your description here"
> >>		range 1 5 if OLD_OPTION
> >>		range 0 0 if !OLD_OPTION
> >>		default 0
> >>		help
> >>		  your help here
> >>
> >> Again, adjust to suit depending on what values mean what.
> >
> > Right this. Except I don't see the !OLD_OPTION working as expected.
> > A 'new' config will not include the old one, so the !OLD_OPTION thing
> > will 'always' be true.
> >
> > So your:
> >
> >> @@ -1,8 +1,46 @@
> >>  menu "printk and dmesg options"
> >>
> >> +choice
> >> +	prompt "printk default clock"
> >> +	config PRINTK_TIME_DISABLE
> >> +		bool "Disabled"
> >> +		help
> >> +		 Selecting this option disables the time stamps of printk().
> >> +
> >> +	config PRINTK_TIME_LOCAL
> >> +		bool "Local Clock"
> >> +		help
> >> +		  Selecting this option causes the time stamps of printk() to be
> >> +		  stamped with the unadjusted hardware clock.
> >> +
> >> +	config PRINTK_TIME_BOOT
> >> +		bool "CLOCK_BOOTTIME"
> >> +		help
> >> +		  Selecting this option causes the time stamps of printk() to be
> >> +		  stamped with the adjusted boottime clock.
> >> +
> >> +	config PRINTK_TIME_MONO
> >> +		bool "CLOCK_MONOTONIC"
> >> +		help
> >> +		  Selecting this option causes the time stamps of printk() to be
> >> +		  stamped with the adjusted monotonic clock.
> >> +
> >> +	config PRINTK_TIME_REAL
> >> +		bool "CLOCK_REALTIME"
> >> +		help
> >> +		  Selecting this option causes the time stamps of printk() to be
> >> +		  stamped with the adjusted realtime clock.
> >> +
> >> +endchoice
> >> +
> >>  config PRINTK_TIME
> >
> > Change that into something like:
> >
> >  config PRINTK_CLOCK
> >
> >> -	bool "Show timing information on printks"
> >> +	int "Show time stamp information on printks"
> >>  	depends on PRINTK
> >> +	default 0 if PRINTK_TIME_DISABLE
> >> +	default 1 if PRINTK_TIME_LOCAL
> >
> > And that into:
> >
> >	default 1 if PRINTK_TIME_LOCAL || PRINTK_TIME
> >
> >> +	default 2 if PRINTK_TIME_BOOT
> >> +	default 3 if PRINTK_TIME_MONO
> >> +	default 4 if PRINTK_TIME_REAL
> >>  	help
> >>  	  Selecting this option causes time stamps of the printk()
> >
> > Then the old PRINTK_TIME symbol will auto-convert into the new
> > equivalent.
>
> I don't think there's an easy code way around this.  Essentially this Kconfig
> code boils down to properly evaluating
>
> config PRINTK_CLOCK
>	default 1 if PRINTK_TIME
>	default 0
>
> where there is no Kconfig entry for PRINTK_TIME.
>
> If undefined CONFIG_PRINTK_TIME is used in a config, it is immediately
> scrubbed by the kconfig script so it doesn't "exist" when CONFIG_PRINTK_CLOCK
> is evaluated.  The result of that is CONFIG_PRINTK_CLOCK=0.
>
> I tried
>
> config PRINTK_TIME
>	bool "old config option"
>
> then I end up with both a CONFIG_PRINTK_CLOCK=1 and a CONFIG_PRINTK_TIME=y in
> the resulting config which is confusing.
>
> I've debated using the other suggestion that Paul made but TBH (sorry
> Paul) it seems like I'm avoiding the real but noisy solution of
>
> s/PRINTK_TIME=y/PRINTK_TIME=1/g
>
> I'm obviously open to other suggestions...

It is someone else's turn to provide a suggestion.  ;-)

							Thanx, Paul
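For reference, the cpp-macro variant discussed in this thread can be
sketched in plain C. This is one possible reading, not code from any
posted patch: it assumes the old bool stays a real Kconfig symbol (so
CONFIG_PRINTK_TIME is still generated), it tests with #ifndef so that
an unset old option yields 0, and the names PRINTK_CLOCK_EFFECTIVE and
printk_clock_name() are hypothetical. The two CONFIG_ defines at the
top merely simulate what a generated config header would contain.

```c
/* Simulated generated config: old bool set, new int option set to 2. */
#define CONFIG_PRINTK_TIME 1
#define CONFIG_PRINTK_CLOCK 2

/*
 * Hypothetical shim: zero means time stamps are disabled, any other
 * value selects one of the clocks from the proposed choice block.
 */
#ifndef CONFIG_PRINTK_TIME
#define PRINTK_CLOCK_EFFECTIVE 0
#else
#define PRINTK_CLOCK_EFFECTIVE CONFIG_PRINTK_CLOCK
#endif

/* Maps the numeric values from the quoted Kconfig defaults to names. */
static const char *printk_clock_name(int mode)
{
	switch (mode) {
	case 0: return "disabled";
	case 1: return "local";
	case 2: return "boot";
	case 3: return "monotonic";
	case 4: return "realtime";
	default: return "unknown";
	}
}
```

The shim keeps existing defconfigs with PRINTK_TIME=y valid, at the
price of carrying two symbols, which is the duplication objected to
above.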
Re: [PATCH v3] printk: Add boottime and real timestamps
On 08/09/2017 02:24 PM, Luis R. Rodriguez wrote:
> On Mon, Aug 07, 2017 at 02:17:33PM -0400, Prarit Bhargava wrote:
>> On 08/07/2017 01:14 PM, Luis R. Rodriguez wrote:
>>>
>>> Note printk_late_init() is a late_initcall(). This means if the
>>> printk_time_setting was disabled it will take a while to enable it.
>>> Enabling it is done at the device_initcall(), so if printk setting is
>>> disabled but a user enables it with a toggle of the module param there
>>> is a period of time during which time resolution would be different.
>>
>> I'm not sure I follow your comment.  Could you elaborate with an example of
>> what you think is going wrong or might be confusing?
>
> Sure, let's consider this:
>
> +static u64 printk_get_ts(void)
> +{
> +	u64 mono, offset_real;
> +
> +	if (printk_time <= PRINTK_TIME_LOCAL)
> +		return local_clock();
> +
> +	if (printk_time == PRINTK_TIME_BOOT)
> +		return ktime_get_boot_log_ts();
> +
> +	mono = ktime_get_real_log_ts(&offset_real);
> +
> +	if (printk_time == PRINTK_TIME_MONO)
> +		return mono;
> +
> +	return mono + offset_real;
> +}
>
> So even if printk_time was flipped, in the end the backend routines used
> will be local_clock(), ktime_get_boot_log_ts() or ktime_get_real_log_ts().
>
> This is used here:
>
> @@ -1643,7 +1756,7 @@ static bool cont_add(int facility, int level, enum log_flags flags, const char *
>  		cont.facility = facility;
>  		cont.level = level;
>  		cont.owner = current;
> -		cont.ts_nsec = local_clock();
> +		cont.ts_nsec = printk_get_ts();
>  		cont.flags = flags;
>  	}
>
> But let's inspect these new calls:
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> @@ -477,6 +479,24 @@ u64 notrace ktime_get_boot_fast_ns(void)
>  }
>  EXPORT_SYMBOL_GPL(ktime_get_boot_fast_ns);
>
> +u64 ktime_get_real_log_ts(u64 *offset_real)
> +{
> +	*offset_real = ktime_to_ns(tk_core.timekeeper.offs_real);
> +
> +	if (timekeeping_active)
> +		return ktime_get_mono_fast_ns();
> +	else
> +		return local_clock();
> +}
> +
> +u64 ktime_get_boot_log_ts(void)
> +{
> +	if (timekeeping_active)
> +		return ktime_get_boot_fast_ns();
> +	else
> +		return local_clock();
> +}
> +
>
> So they are really only effectively calling something other than
> what local_clock() returns *iff* timekeeping_active is true. But
> this is only set later at the respective device_initcall() in this
> file:
>
> @@ -1530,6 +1550,8 @@ void __init timekeeping_init(void)
>
>  	write_seqcount_end(&tk_core.seq);
>  	raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
> +
> +	timekeeping_active = 1;
>  }
>
> So when the boot param is processed and prints out that it has
> changed, someone inspecting any time setting after that print
> may assume it is from then on using ktime_get_mono_fast_ns() or
> ktime_get_boot_fast_ns(), but this is not accurate: it will use
> local_clock() until *after* device_initcall().
>
> So in between boot and this particular device_initcall() time
> resolution can only be local_clock(). Seems worth documenting
> that.

I've moved to a different model of using a fn ptr for printk_get_ts() and
using peterz's suggestion of returning 0 until the timekeeping is
initialized, so this won't be a problem any more.

P.

>
>   Luis
>
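The function-pointer model mentioned in this reply could look roughly
like the following userspace sketch. It is not the code from any later
revision of the patch: the stub_* clocks, their return values and
printk_set_ts_mode() are hypothetical, and the only behaviour taken
from the thread is that the timekeeping-based clocks report 0 until
timekeeping is initialized. The local clock is assumed to be usable
from the start.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t (*printk_ts_fn)(void);

static bool timekeeping_active;

/* Stub clocks with made-up values; the local clock is always usable. */
static uint64_t stub_disabled(void)    { return 0; }
static uint64_t stub_local_clock(void) { return 100; }
static uint64_t stub_boot_ns(void)     { return timekeeping_active ? 200 : 0; }
static uint64_t stub_mono_ns(void)     { return timekeeping_active ? 300 : 0; }
static uint64_t stub_real_ns(void)     { return timekeeping_active ? 1300 : 0; }

/* The pointer is chosen once per mode change, not on every printk(). */
static printk_ts_fn printk_get_ts = stub_local_clock;

/* Mode numbers follow the 0..4 values used in the Kconfig proposal. */
static void printk_set_ts_mode(int mode)
{
	switch (mode) {
	case 0:  printk_get_ts = stub_disabled;    break;
	case 2:  printk_get_ts = stub_boot_ns;     break;
	case 3:  printk_get_ts = stub_mono_ns;     break;
	case 4:  printk_get_ts = stub_real_ns;     break;
	default: printk_get_ts = stub_local_clock; break;
	}
}
```

With this shape there is no silent fallback to the local clock: an
early message simply carries a zero timestamp until timekeeping is up,
so a reader cannot mistake one clock source for another.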
[PATCH v7 9/9] sparc64: Add support for ADI (Application Data Integrity)
ADI is a new feature supported on SPARC M7 and newer processors to allow
hardware to catch rogue accesses to memory. ADI is supported for data
fetches only and not instruction fetches. An app can enable ADI on its
data pages, set version tags on them and use versioned addresses to
access the data pages. Upper bits of the address contain the version
tag. On M7 processors, upper four bits (bits 63-60) contain the version
tag. If a rogue app attempts to access ADI enabled data pages, its
access is blocked and processor generates an exception. Please see
Documentation/sparc/adi.txt for further details.

This patch extends mprotect to enable ADI (TSTATE.mcde), enable/disable
MCD (Memory Corruption Detection) on selected memory ranges, enable
TTE.mcd in PTEs, return ADI parameters to userspace and save/restore ADI
version tags on page swap out/in or migration. ADI is not enabled by
default for any task. A task must explicitly enable ADI on a memory
range and set version tag for ADI to be effective for the task.

Signed-off-by: Khalid Aziz
Cc: Khalid Aziz
---
v7:
	- Enhanced arch_validate_prot() to enable ADI only on writable
	  addresses backed by physical RAM
	- Added support for saving/restoring ADI tags for each ADI
	  block size address range on a page on swap in/out
	- Added code to copy ADI tags on COW
	- Updated values for auxiliary vectors to not conflict with
	  values on other architectures to avoid conflict in glibc. glibc
	  consolidates all auxiliary vectors into its headers and
	  duplicate values in consolidated header are problematic
	- Disable same page merging on ADI enabled pages since ADI tags
	  may not match on pages with identical data
	- Broke the patch up further into smaller patches

v6:
	- Eliminated instructions to read and write PSTATE as well as
	  MCDPER and PMCDPER on every access to userspace addresses
	  by setting PSTATE and PMCDPER correctly upon entry into
	  kernel. PSTATE.mcde and PMCDPER are set upon entry into
	  kernel when running on an M7 processor. PSTATE.mcde being
	  set only affects memory accesses that have TTE.mcd set.
	  PMCDPER being set only affects writes to memory addresses
	  that have TTE.mcd set. This ensures any faults caused by
	  ADI tag mismatch on a write are exposed before kernel returns
	  to userspace.

v5:
	- Fixed indentation issues and instructions in assembly code
	- Removed CONFIG_SPARC64 from mdesc.c
	- Changed to maintain state of MCDPER register in thread info
	  flags as opposed to in mm context. MCDPER is a per-thread
	  state and belongs in thread info flag as opposed to mm context
	  which is shared across threads. Added comments to clarify this
	  is a lazily maintained state and must be updated on context
	  switch and copy_process()
	- Updated code to use the new arch_do_swap_page() and
	  arch_unmap_one() functions

v4:
	- Broke patch up into smaller patches

v3:
	- Removed CONFIG_SPARC_ADI
	- Replaced prctl commands with mprotect
	- Added auxiliary vectors for ADI parameters
	- Enabled ADI for swappable pages

v2:
	- Fixed a build error

 Documentation/sparc/adi.txt             | 272 +++
 arch/sparc/include/asm/mman.h           | 72 -
 arch/sparc/include/asm/mmu_64.h         | 17 ++
 arch/sparc/include/asm/mmu_context_64.h | 43 +
 arch/sparc/include/asm/page_64.h        | 4 +
 arch/sparc/include/asm/pgtable_64.h     | 46 ++
 arch/sparc/include/asm/thread_info_64.h | 2 +-
 arch/sparc/include/asm/trap_block.h     | 2 +
 arch/sparc/include/uapi/asm/mman.h      | 2 +
 arch/sparc/kernel/adi_64.c              | 277
 arch/sparc/kernel/etrap_64.S            | 28 +++-
 arch/sparc/kernel/process_64.c          | 25 +++
 arch/sparc/kernel/setup_64.c            | 11 +-
 arch/sparc/kernel/vmlinux.lds.S         | 5 +
 arch/sparc/mm/gup.c                     | 37 +
 arch/sparc/mm/hugetlbpage.c             | 14 +-
 arch/sparc/mm/init_64.c                 | 33
 arch/sparc/mm/tsb.c                     | 21 +++
 include/linux/mm.h                      | 3 +
 mm/ksm.c                                | 4 +
 20 files changed, 913 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/sparc/adi.txt

diff --git a/Documentation/sparc/adi.txt b/Documentation/sparc/adi.txt
new file mode 100644
index ..383bc65fec1e
--- /dev/null
+++ b/Documentation/sparc/adi.txt
@@ -0,0 +1,272 @@
+Application Data Integrity (ADI)
+
+
+SPARC M7 processor adds the Application Data Integrity (ADI) feature.
+ADI allows a task to set version tags on any subset of its address
+space. Once ADI is enabled and
[PATCH v7 0/9] Application Data Integrity feature introduced by SPARC M7
SPARC M7 processor adds additional metadata for memory address space that can be used to secure access to regions of memory. This additional metadata is implemented as a 4-bit tag attached to each cacheline size block of memory. A task can set a tag on any number of such blocks. Access to such a block is granted only if the virtual address used to access that block of memory has the tag encoded in the uppermost 4 bits of VA. Since sparc processor does not implement all 64 bits of VA, top 4 bits are available for ADI tags. Any mismatch between tag encoded in VA and tag set on the memory block results in a trap. Tags are verified in the VA presented to the MMU and tags are associated with the physical page VA maps on to. If a memory page is swapped out and page frame gets reused for another task, the tags are lost and hence must be saved when swapping or migrating the page.

A userspace task enables ADI through mprotect(). This patch series adds a page protection bit PROT_ADI and a corresponding VMA flag VM_SPARC_ADI. VM_SPARC_ADI is used to trigger setting TTE.mcd bit in the sparc pte that enables ADI checking on the corresponding page. MMU validates the tag embedded in VA for every page that has TTE.mcd bit set in its pte. After enabling ADI on a memory range, the userspace task can set ADI version tags using stxa instruction with ASI_MCD_PRIMARY or ASI_MCD_ST_BLKINIT_PRIMARY ASI.

Once userspace task calls mprotect() with PROT_ADI, kernel takes following overall steps:

1. Find the VMAs covering the address range passed in to mprotect and set VM_SPARC_ADI flag. If address range covers a subset of a VMA, the VMA will be split.

2. When a page is allocated for a VA and the VMA covering this VA has VM_SPARC_ADI flag set, set the TTE.mcd bit so MMU will check the version tag.

3. Userspace can now set version tags on the memory it has enabled ADI on. Userspace accesses ADI enabled memory using a virtual address that has the version tag embedded in the high bits. MMU validates this version tag against the actual tag set on the memory. If tag matches, MMU performs the VA->PA translation and access is granted. If there is a mismatch, hypervisor sends a data access exception or precise memory corruption detected exception depending upon whether precise exceptions are enabled or not (controlled by MCDPERR register). Kernel sends SIGSEGV to the task with appropriate si_code.

4. If a page is being swapped out or migrated, kernel must save any ADI tags set on the page. Kernel maintains a page worth of tag storage descriptors. Each descriptor points to a tag storage space and the address range it covers. If the page being swapped out or migrated has ADI enabled on it, kernel finds a tag storage descriptor that covers the address range for the page or allocates a new descriptor if none of the existing descriptors cover the address range. Kernel saves tags from the page into the tag storage space the descriptor points to.

5. When the page is swapped back in or reinstantiated after migration, kernel restores the version tags on the new physical page by retrieving the original tag from tag storage pointed to by a tag storage descriptor for the virtual address range for new page.

User task can disable ADI by calling mprotect() again on the memory range with PROT_ADI bit unset. Kernel clears the VM_SPARC_ADI flag in VMAs, merges adjacent VMAs if necessary, and clears TTE.mcd bit in the corresponding ptes.

IOMMU does not support ADI checking. Any version tags embedded in the top bits of VA meant for IOMMU are cleared and replaced with sign extension of the first non-version tag bit (bit 59 for SPARC M7) for IOMMU addresses.

This patch series adds support for this feature in 9 patches:

Patch 1/9
  Tag mismatch on access by a task results in a trap from hypervisor as data access exception or a precise memory corruption detected exception. As part of handling these exceptions, kernel sends a SIGSEGV to user process with special si_code to indicate which fault occurred. This patch adds three new si_codes to differentiate between various mismatch errors.

Patch 2/9
  When a page is swapped or migrated, metadata associated with the page must be saved so it can be restored later. This patch adds a new function that saves/restores this metadata when updating pte upon a swap/migration.

Patch 3/9
  SPARC M7 processor adds new fields to control registers to support ADI feature. It also adds a new exception for precise traps on tag mismatch. This patch adds definitions for the new control register fields, new ASIs for ADI and an exception handler for the precise trap on tag mismatch.

Patch 4/9
  New hypervisor fault types were added by sparc M7 processor to support ADI feature. This patch adds code to handle these fault types for data access exception handler.

Patch 5/9
  When ADI is in use for a page and a tag mismatch occurs, processor raises "Memory corruption Detected" trap. This patch adds a
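
As a rough illustration of the tag encoding described in the cover letter above, the following user-space C sketch shows how a 4-bit version tag could be placed in bits 63:60 of a 64-bit virtual address, and how the tag bits could be replaced with a sign extension of bit 59, as described for IOMMU addresses. The helper names (adi_set_tag, adi_get_tag, adi_strip_tag) are invented for this example and are not taken from the patch series. Setting the tag on the memory itself still needs the stxa instruction, which is not shown here.

```c
#include <stdint.h>

#define ADI_TAG_SHIFT 60                        /* tag lives in VA bits 63:60 */
#define ADI_TAG_MASK  (0xFULL << ADI_TAG_SHIFT)

/* Hypothetical helper: embed a 4-bit version tag in the top bits of a VA. */
static inline uint64_t adi_set_tag(uint64_t va, unsigned int tag)
{
	return (va & ~ADI_TAG_MASK) | ((uint64_t)(tag & 0xF) << ADI_TAG_SHIFT);
}

/* Hypothetical helper: read back the tag encoded in a VA. */
static inline unsigned int adi_get_tag(uint64_t va)
{
	return (unsigned int)(va >> ADI_TAG_SHIFT);
}

/*
 * Hypothetical helper: drop the tag and replace bits 63:60 with the
 * sign extension of bit 59, the first non-tag bit on SPARC M7.
 */
static inline uint64_t adi_strip_tag(uint64_t va)
{
	uint64_t untagged = va & ~ADI_TAG_MASK;

	if (va & (1ULL << (ADI_TAG_SHIFT - 1)))
		untagged |= ADI_TAG_MASK;
	return untagged;
}
```

With these helpers, a task that tagged a buffer with version 0xA would access it through adi_set_tag(addr, 0xA), while an address handed to a device would first go through adi_strip_tag().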
[PATCH v3 3/3] cgroup: Make debug controller report new controller masks
The newly added cgroup controller masks (subtree_bypass and enable_ss_mask) are now being reported in the debug.masks controller file.

Signed-off-by: Waiman Long
---
 kernel/cgroup/debug.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/cgroup/debug.c b/kernel/cgroup/debug.c
index f661b4c..5f35a76 100644
--- a/kernel/cgroup/debug.c
+++ b/kernel/cgroup/debug.c
@@ -262,6 +262,8 @@ static int cgroup_masks_read(struct seq_file *seq, void *v)
 	cgroup_masks_read_one(seq, "subtree_control", cgrp->subtree_control);
 	cgroup_masks_read_one(seq, "subtree_ss_mask", cgrp->subtree_ss_mask);
+	cgroup_masks_read_one(seq, "subtree_bypass", cgrp->subtree_bypass);
+	cgroup_masks_read_one(seq, "enable_ss_mask", cgrp->enable_ss_mask);
 	cgroup_kn_unlock(of->kn);
 	return 0;
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 1/3] cgroup: subtree_control bypass mode for non-domain controllers
The special prefix '#' attached to a non-domain controller name can now be written into the cgroup.subtree_control file to set that controller in bypass mode in all the child cgroups. The controller will show up in the children's cgroup.controllers file, but the corresponding control knobs will be absent. However, that controller can be enabled or bypassed in its children by writing to their respective subtree_control files.

This mode is useful to non-domain controllers where there are costs to each additional layer of hierarchy. This mode will also allow more freedom in how each controller can shape its effective hierarchy independently of the others.

Signed-off-by: Waiman Long
---
 include/linux/cgroup-defs.h | 12 ++--
 kernel/cgroup/cgroup.c | 143
 2 files changed, 100 insertions(+), 55 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 59e4ad9..15655e5 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -308,16 +308,18 @@ struct cgroup {
 	struct cgroup_file events_file; /* handle for "cgroup.events" */
 	/*
-	 * The bitmask of subsystems enabled on the child cgroups.
-	 * ->subtree_control is the one configured through
-	 * "cgroup.subtree_control" while ->child_ss_mask is the effective
-	 * one which may have more subsystems enabled. Controller knobs
-	 * are made available iff it's enabled in ->subtree_control.
+	 * The bitmask of subsystems enabled or bypassed on the child cgroups.
+	 * ->subtree_control and ->subtree_bypass are the one configured
+	 * through "cgroup.subtree_control" while ->subtree_ss_mask is the
+	 * effective one which may have more subsystems enabled. Controller
+	 * knobs are made available iff it's enabled in ->subtree_ss_mask.
 	 */
 	u16 subtree_control;
 	u16 subtree_ss_mask;
+	u16 subtree_bypass;
 	u16 old_subtree_control;
 	u16 old_subtree_ss_mask;
+	u16 old_subtree_bypass;
 	/* Private pointers for each registered subsystem */
 	struct cgroup_subsys_state __rcu *subsys[CGROUP_SUBSYS_COUNT];
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index f5ca55d..9e69f7f 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -365,7 +365,8 @@ static bool cgroup_can_be_thread_root(struct cgroup *cgrp)
 		return false;
 	/* and no domain controllers can be enabled */
-	if (cgrp->subtree_control & ~cgrp_dfl_threaded_ss_mask)
+	if ((cgrp->subtree_control|cgrp->subtree_bypass) &
+	    ~cgrp_dfl_threaded_ss_mask)
 		return false;
 	return true;
@@ -387,7 +388,8 @@ bool cgroup_is_thread_root(struct cgroup *cgrp)
 	 * enabled is a thread root.
 	 */
 	if (cgroup_has_tasks(cgrp) &&
-	    (cgrp->subtree_control & cgrp_dfl_threaded_ss_mask))
+	    ((cgrp->subtree_control|cgrp->subtree_bypass)
+	    & cgrp_dfl_threaded_ss_mask))
 		return true;
 	return false;
@@ -412,7 +414,7 @@ static bool cgroup_is_valid_domain(struct cgroup *cgrp)
 }
 /* subsystems visibly enabled on a cgroup */
-static u16 cgroup_control(struct cgroup *cgrp)
+static u16 cgroup_control(struct cgroup *cgrp, bool show_bypass)
 {
 	struct cgroup *parent = cgroup_parent(cgrp);
 	u16 root_ss_mask = cgrp->root->subsys_mask;
@@ -420,6 +422,9 @@ static u16 cgroup_control(struct cgroup *cgrp)
 	if (parent) {
 		u16 ss_mask = parent->subtree_control;
+		if (show_bypass)
+			ss_mask |= parent->subtree_bypass;
+
 		/* threaded cgroups can only have threaded controllers */
 		if (cgroup_is_threaded(cgrp))
 			ss_mask &= cgrp_dfl_threaded_ss_mask;
@@ -433,13 +438,17 @@ static u16 cgroup_control(struct cgroup *cgrp)
 }
 /* subsystems enabled on a cgroup */
-static u16 cgroup_ss_mask(struct cgroup *cgrp)
+static u16 cgroup_ss_mask(struct cgroup *cgrp, bool show_bypass)
 {
 	struct cgroup *parent = cgroup_parent(cgrp);
 	if (parent) {
 		u16 ss_mask = parent->subtree_ss_mask;
+
+		if (show_bypass)
+			ss_mask |= parent->subtree_bypass;
+
 		/* threaded cgroups can only have threaded controllers */
 		if (cgroup_is_threaded(cgrp))
 			ss_mask &= cgrp_dfl_threaded_ss_mask;
@@ -492,7 +501,7 @@ static struct cgroup_subsys_state *cgroup_e_css(struct cgroup *cgrp,
 	 * This function is used while updating css associations and thus
 	 * can't test the csses directly. Test ss_mask.
 	 */
-	while (!(cgroup_ss_mask(cgrp) & (1 << ss->id))) {
+	while (!(cgroup_ss_mask(cgrp, false) & (1 << ss->id))) {
 		cgrp = cgroup_parent(cgrp);
 		if (!cgrp)
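
To make the effect of the new show_bypass argument easier to see, here is a small user-space model of the mask computation in the hunks above. It is only a sketch: struct cg and cg_control() are simplified stand-ins invented for this example, the root and threaded masks are passed in as plain parameters, and none of the locking or root handling of the real kernel code is reproduced.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct cgroup (hypothetical, not the kernel type). */
struct cg {
	struct cg *parent;
	uint16_t subtree_control;	/* controllers enabled for the children */
	uint16_t subtree_bypass;	/* controllers bypassed for the children */
	bool threaded;
};

/*
 * Model of cgroup_control(): controllers visible in a cgroup.  With
 * show_bypass set, controllers the parent put in bypass mode are
 * reported as well.
 */
static uint16_t cg_control(const struct cg *cg, uint16_t root_mask,
			   uint16_t threaded_mask, bool show_bypass)
{
	const struct cg *parent = cg->parent;
	uint16_t mask;

	if (!parent)
		return root_mask;

	mask = parent->subtree_control;
	if (show_bypass)
		mask |= parent->subtree_bypass;

	/* threaded cgroups can only have threaded controllers */
	if (cg->threaded)
		mask &= threaded_mask;

	return mask;
}
```

In this model a child whose parent has bit 0 enabled and bit 1 bypassed sees only bit 0 when show_bypass is false and both bits when it is true, which matches the intent of listing bypassed controllers in "cgroup.controllers" without creating their knobs.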
[PATCH v3 2/3] cgroup: Allow reenabling of controller in bypass mode
Non-domain controllers set to bypass mode in the parent's "cgroup.subtree_control" can now be optionally enabled by writing the controller name with the '+' prefix to "cgroup.controllers". Using the '#' prefix will reset it back to the bypass state.

This capability allows a cgroup parent to individually enable non-domain controllers in a subset of its children instead of either all or none of them. This increases the flexibility each controller has in shaping the effective cgroup hierarchy to best suit its needs.

Signed-off-by: Waiman Long
---
 Documentation/cgroup-v2.txt | 58 +--
 include/linux/cgroup-defs.h | 7 +++
 kernel/cgroup/cgroup.c | 109 ++--
 3 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index dc44785..e76dc4cf 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -363,10 +363,16 @@ disabled by writing to the "cgroup.subtree_control" file::
   # echo "+cpu +memory -io" > cgroup.subtree_control
+The prefixes '+', '-' and '#' are used to enable, disable or put
+a controller in the bypass mode respectively. In the bypass mode,
+a controller is disabled in a cgroup, but it can be enabled again in
+its child cgroups as it will still be listed in "cgroup.controllers".
+Bypass mode can only be used on non-domain controllers.
+
 Only controllers which are listed in "cgroup.controllers" can be
-enabled. When multiple operations are specified as above, either they
-all succeed or fail. If multiple operations on the same controller
-are specified, the last one is effective.
+enabled or bypassed. When multiple operations are specified as above,
+either they all succeed or fail. If multiple operations on the same
+controller are specified, the last one is effective.
 Enabling a controller in a cgroup indicates that the distribution of
 the target resource across its immediate children will be controlled.
@@ -390,6 +396,20 @@ prefixed controller interface files from C and D. This means that the
 controller interface files - anything which doesn't start with
 "cgroup." are owned by the parent rather than the cgroup itself.
+Once a non-domain controller is put into bypass mode in
+"cgroup.subtree_control", that controller can optionally be enabled
+again in child cgroups by writing the controller name with the '+
+prefix into "cgroup.controllers". Writing the controller name with
+the '#' prefix into "cgroup.controllers" resets the state back to
+bypass mode. The state of a non-domain controller cannot be changed
+anymore if it is enabled or bypassed in its "cgroup.subtree_control".
+
+The use of bypass mode thus allows a cgroup parent to have the ability
+to selectively enable a non-domain controller in a subset of its
+child cgroups instead of in either all or none of them. In other words,
+a non-domain controller can be enabled only on the cgroup that actually
+needs it, if desired.
+
 Top-down Constraint
 ~~~~~~~~~~~~~~~~~~~
@@ -397,10 +417,11 @@ Top-down Constraint
 Resources are distributed top-down and a cgroup can further distribute
 a resource only if the resource has been distributed to it from the
 parent. This means that all non-root "cgroup.subtree_control" files
-can only contain controllers which are enabled in the parent's
-"cgroup.subtree_control" file. A controller can be enabled only if
-the parent has the controller enabled and a controller can't be
-disabled if one or more children have it enabled.
+can only contain controllers which are enabled or bypassed in the parent's
+"cgroup.subtree_control" file. A controller can be enabled or bypassed
+only if the parent has the controller enabled or bypassed and the
+state of a controller can't be changed if one or more children have
+it enabled or bypassed.
 No Internal Process Constraint
@@ -823,11 +844,18 @@ All cgroup core files are prefixed with "cgroup."
 	should be granted along with the containing directory.
   cgroup.controllers
-	A read-only space separated values file which exists on all
+	A read-write space separated values file which exists on all
 	cgroups.
 	It shows space separated list of all controllers available to
-	the cgroup. The controllers are not ordered.
+	the cgroup. Controller names with '#' prefix are in bypass
+	mode. The controllers are not ordered.
+
+	When a controller is set into bypass mode in its parent's
+	"cgroup.subtree_control", its name prefixed with '+' or '#'
+	can be written to enable it or reset it back to bypass mode
+	respectively. Controllers not in bypass mode are not allowed
+	to be written.
   cgroup.subtree_control
 	A read-write space separated values file which exists on all
@@ -837,12 +865,12 @@ All cgroup core files are prefixed with "cgroup."
 	which are enabled to control
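
The documentation changes above describe three prefixes, an all-or-nothing rule, and "last operation wins" for repeated controllers. The user-space C sketch below shows one way such a write could be parsed into three masks. It is not the kernel's parser: the controller table, the choice of "cpu" and "cpuset" as the non-domain controllers, and the names parse_ctrl_request() and struct ctrl_request are all assumptions made for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical controller table; bit i of a mask refers to ctrl_names[i]. */
static const char *const ctrl_names[] = { "cpu", "cpuset", "io", "memory" };
#define NCTRL (sizeof(ctrl_names) / sizeof(ctrl_names[0]))

/* Assumption for this sketch: "cpu" and "cpuset" are non-domain controllers. */
#define NON_DOMAIN_MASK 0x3u

struct ctrl_request {
	uint16_t enable;	/* controllers written with '+' */
	uint16_t disable;	/* controllers written with '-' */
	uint16_t bypass;	/* controllers written with '#' */
};

static int ctrl_index(const char *name, size_t len)
{
	for (size_t i = 0; i < NCTRL; i++)
		if (strlen(ctrl_names[i]) == len &&
		    memcmp(ctrl_names[i], name, len) == 0)
			return (int)i;
	return -1;
}

/*
 * Parse e.g. "+cpu #cpuset -io".  Returns 0 on success and -1 on any
 * malformed token; *out is only written when every token is valid, so
 * the operations either all succeed or all fail.
 */
static int parse_ctrl_request(const char *buf, struct ctrl_request *out)
{
	struct ctrl_request req = { 0, 0, 0 };
	const char *p = buf;

	for (;;) {
		size_t len;
		int idx;
		uint16_t bit;

		p += strspn(p, " \t\n");
		if (*p == '\0')
			break;
		len = strcspn(p, " \t\n");
		if (len < 2)
			return -1;
		idx = ctrl_index(p + 1, len - 1);
		if (idx < 0)
			return -1;
		bit = (uint16_t)(1u << idx);

		/* the last operation on a controller is the effective one */
		req.enable &= (uint16_t)~bit;
		req.disable &= (uint16_t)~bit;
		req.bypass &= (uint16_t)~bit;

		switch (*p) {
		case '+':
			req.enable |= bit;
			break;
		case '-':
			req.disable |= bit;
			break;
		case '#':
			/* bypass mode is limited to non-domain controllers */
			if (!(bit & NON_DOMAIN_MASK))
				return -1;
			req.bypass |= bit;
			break;
		default:
			return -1;
		}
		p += len;
	}

	*out = req;
	return 0;
}
```

Parsing into a local request and copying it out only at the end is what gives the all-or-nothing behaviour in this sketch; a rejected write such as "#memory" leaves the caller's structure untouched.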
Re: [PATCH v2 0/4] ipmi: bt-i2c: added IPMI Block Transfer over I2C
> Perhaps that is some level of abuse, but it's pretty common. I'm not against it.
>
> There is standard IPMI firmware NetFN (though no commands defined) that if you use the driver automatically goes into "Maintenance mode" and modified the timeouts and handling to some extent to help with this.

That is a really good point, I missed that.

...

> There are ways to accomplish this that aren't that complex. You can create an OEM command that can query the maximum message size and the ability to do sequence numbers in the messages.
>
> If messages larger than 32-bytes are supported, and the host I2C/SMBus driver supports it, you could use the standard SSIF SMBus commands to do this, they have an 8-bit length field.
>
> If sequence numbers are supported, The SSIF could use different SMBus commands to do the write and read requests. Since this is only if you get an OEM command, and if you put the sequence numbers at the end where they are easy to add on the send side, this is a small change to the driver.

What if we just had an OEM command that changed the message structure from that point on? We could abuse the "maintenance mode" NetFN to get back into normal SSIF if necessary.

> So I think the changes would be small and contained. I'm actually ok with a different driver, but I think it would be more valuable to the OpenBMC project to have a standardized interface that would work (in a not quite as efficient mode) with software that does not use the Linux IPMI driver.

I guess I see all of my asks as hacky things which we can hopefully remove at some point. Hopefully, most OpenBMC users won't want or need these things.

...

>> Regardless of what we do with the "BT-I2C" stuff, I am still interested in what you think about this.
>
> I think you are right, it probably belongs some place else. The way that makes the most sense to me would be to have an "ipmi" directory with a "host" and "slave" side, and since ipmi is not really a char driver, to move it to the main driver directory. That might be fairly disruptive, though.

That was my thinking exactly.

> The other option that makes sense to me would be to add a drivers/char/ipmi_slave directory, or something like that, and put the slave code there. That would be less disruptive.

Right, that is the approach I took, except I called it drivers/char/ipmi_bmc. I originally thought doing the less disruptive thing is best; however, I know there are also some OpenBMC people who are interested in implementing IPMB. So maybe now is the time to bite the bullet and create an ipmi directory under drivers/.

> -corey

In summary, I think I can live with making it a mangled form of SSIF, but I would prefer to put it in its own driver. In any case, I think I would rather focus on the BMC side IPMI framework now, since it is a bigger change and would also reduce the work of implementing a BMC side SSIF driver.

Here is what I propose: we focus on the BMC side IPMI framework RFC that I sent out the other day:

https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1463473.html

I will add a change to the BMC side IPMI framework patchset to move all the IPMI stuff to the new drivers/ipmi directory as discussed and then drop the patch in that patchset that depends on this patchset.

Let me know what you think
Re: [PATCH v1 0/3] ipmi: bt-i2c: added IPMI Block Transfer over I2C
On Wed, Aug 9, 2017 at 3:56 AM, Anton D. Kachalov wrote:
> Hello,
>
> I would like to mention one of the our related work for IPMI and I2C.
>
> We use OpenIPMI stack to connect to the computing nodes through the I2C using IPMB (BT is not supported by nodes):
>
> https://github.com/ya-mouse/meta-openbmc-yandex/blob/master/meta-yandex/meta-openrack/meta-shaosi/recipes-kernel/linux/linux-obmc/ipmi_i2c.c
>
> It lacks complete slave support (slave part is only for receiving known packets with query results due to OpenIPMI implementation in kernel) and use one local slave to communicate with a number of target systems on the same bus (currently supported only 1-to-1 schema).
>
> With this stuff we able to use ipmitool across different /dev/ipmiX devices to communicate with nodes.

Cool, I met someone else who had a similar use case, which is part of why I decided to share this (not sure if I should say who). So it sounds like we are probably not going to go with the approach I proposed; if you indeed find this useful, I would suggest that we put this in our OpenBMC repository and switch it out with the suggested method at some point.

Let me know what you think
Re: [RFC 3/5] dt-bindings: i3c: Document core bindings
On Mon, Jul 31, 2017 at 06:24:48PM +0200, Boris Brezillon wrote:
> A new I3C subsystem has been added and a generic description has been
> created to represent the I3C bus and the devices connected on it.
>
> Document this generic representation.
>
> Signed-off-by: Boris Brezillon
> ---
>  Documentation/devicetree/bindings/i3c/i3c.txt | 90 +++
>  1 file changed, 90 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/i3c/i3c.txt
>
> diff --git a/Documentation/devicetree/bindings/i3c/i3c.txt b/Documentation/devicetree/bindings/i3c/i3c.txt
> new file mode 100644
> index ..49261dec7b01
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/i3c/i3c.txt
> @@ -0,0 +1,90 @@
> +Generic device tree bindings for I3C busses
> +===========================================
> +
> +This document describes generic bindings that should be used to describe I3C
> +busses in a device tree.
> +
> +Required properties
> +-------------------
> +
> +- #address-cells - should be <1>. Read more about addresses below.
> +- #size-cells - should be <0>.
> +- compatible - name of I3C bus controller following generic names
> +  recommended practice.
> +
> +For other required properties e.g. to describe register sets,
> +clocks, etc. check the binding documentation of the specific driver.
> +
> +Optional properties
> +-------------------
> +
> +These properties may not be supported by all I3C master drivers. Each I3C
> +master bindings should specify which of them are supported.
> +
> +- i3c-scl-frequency: frequency (in Hz) of the SCL signal used for I3C
> +  transfers. When undefined the core set it to 12.5MHz.
> +
> +- i2c-scl-frequency: frequency (in Hz) of the SCL signal used for I2C
> +  transfers. When undefined, the core looks at LVR values
> +  of I2C devices described in the device tree to determine
> +  the maximum I2C frequency.
> +
> +I2C devices
> +===========
> +
> +Each I2C device connected to the bus should be described in a subnode with
> +the following properties:
> +
> +All properties described in Documentation/devicetree/bindings/i2c/i2c.txt are
> +valid here.
> +
> +New required properties:
> +
> +- i3c-lvr: 32 bits integer property (only the lowest 8 bits are meaningful)

What does lvr mean?

> +  describing device capabilities as described in the I3C
> +  specification.
> +
> +  bit[31:8]: unused
> +  bit[7:5]: I2C device index. Possible values

index? Seems more like flags

> +   * 0: I2C device has a 50 ns spike filter
> +   * 1: I2C device does not have a 50 ns spike filter but supports high
> +        frequency on SCL
> +   * 2: I2C device does not have a 50 ns spike filter and is not
> +        tolerant to high frequencies
> +   * 3-7: reserved
> +
> +  bit[4]: tell whether the device operates in FM or FM+ mode
> +   * 0: FM+ mode
> +   * 1: FM mode
> +
> +  bit[3:0]: device type
> +   * 0-15: reserved

That's useful...

> +
> +I3C devices
> +===========
> +
> +I3C are not described in the device tree yet. We could decide to represent them
> +at some point to assign a specific dynamic address to a device or to force an
> +I3C device to act as an I2C device if it has a static address.

I think we need to define this sooner rather than later if there's not a standard connector. That's the only thing that would enforce any sort of standard. Of course, that didn't help with SDIO.

> +
> +Example:
> +
> +	i3c-master@0d04 {

The node name should go into the DT spec. I tend to think "i3c" would be sufficient and aligned with i2c.

> +		compatible = "cdns,i3c-master";
> +		clocks = <>, <>;
> +		clock-names = "pclk", "sysclk";
> +		interrupts = <3 0>;
> +		reg = <0x0d04 0x1000>;
> +		#address-cells = <1>;
> +		#size-cells = <0>;
> +
> +		status = "okay";
> +		i2c-scl-frequency = <10>;
> +
> +		nunchuk: nunchuk@52 {
> +			compatible = "nintendo,nunchuk";
> +			reg = <0x52>;
> +			i3c-lvr = <0x10>;
> +		};
> +	};
> +
> --
> 2.7.4
>
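
Since the review asks what the fields of i3c-lvr mean, a small decoder may help when reading the example node. The C sketch below simply splits the low 8 bits according to the bit layout quoted above; struct i3c_lvr_info and i3c_lvr_decode() are names invented for this illustration and do not come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of the i3c-lvr property. */
struct i3c_lvr_info {
	unsigned int index;	/* bit[7:5]: I2C device index */
	bool fm_mode;		/* bit[4]: true = FM mode, false = FM+ mode */
	unsigned int type;	/* bit[3:0]: device type (reserved) */
};

static struct i3c_lvr_info i3c_lvr_decode(uint32_t lvr)
{
	struct i3c_lvr_info info;

	/* bit[31:8] are unused, so only the low byte is examined */
	info.index = (lvr >> 5) & 0x7;
	info.fm_mode = ((lvr >> 4) & 0x1) != 0;
	info.type = lvr & 0xF;
	return info;
}
```

Applied to the nunchuk node in the example, i3c-lvr = <0x10> decodes to index 0 (a device with a 50 ns spike filter) operating in FM mode.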
make pdfdocs problem with 4.13-rc4
On my Fedora 26 workstation, with the latest patches, running make pdfdocs stops with

[jim@krebstar ~]$ tail /tmp/make-pdfdocs.out
Underfull \hbox (badness 1) in paragraph at lines 3980--3983
[]\EU1/DejaVuSans(0)/m/n/10 Threshold below
[31]
! Missing \endgroup inserted.
\endgroup
l.4114 \begin{savenotes}\sphinxattablestart
?

Pressing the return key (many times) doesn't solve the problem, and eventually make fizzles out with

make[2]: *** [Makefile:33: media.pdf] Error 1
make[1]: *** [Documentation/Makefile:83: pdfdocs] Error 2
make: *** [Makefile:1473: pdfdocs] Error 2

Oh, and

[jim@krebstar ~]$ grep 'LaTeX Warning:' /tmp/make-pdfdocs.out | wc -l
5438

--
Jim
Re: [PATCH v2 0/4] ipmi: bt-i2c: added IPMI Block Transfer over I2C
On 08/09/2017 08:04 PM, Brendan Higgins wrote:
>> Perhaps that is some level of abuse, but it's pretty common. I'm not against it.
>>
>> There is standard IPMI firmware NetFN (though no commands defined) that if you use the driver automatically goes into "Maintenance mode" and modified the timeouts and handling to some extent to help with this.
>
> That is a really good point, I missed that.
>
> ...
>
>> There are ways to accomplish this that aren't that complex. You can create an OEM command that can query the maximum message size and the ability to do sequence numbers in the messages.
>>
>> If messages larger than 32-bytes are supported, and the host I2C/SMBus driver supports it, you could use the standard SSIF SMBus commands to do this, they have an 8-bit length field.
>>
>> If sequence numbers are supported, The SSIF could use different SMBus commands to do the write and read requests. Since this is only if you get an OEM command, and if you put the sequence numbers at the end where they are easy to add on the send side, this is a small change to the driver.
>
> What if we just had an OEM command that changed the message structure from that point on? We could abuse the "maintenance mode" NetFN to get back into normal SSIF if necessary.

Actually, I wouldn't have a separate "openbmc mode". I would have OpenBMC always work with standard SSIF, and have separate SMBus commands for messages with the sequence number and messages larger than 32 bytes.

I've attached a patch with what I would expect the changes to be to the host driver. It doesn't handle multiple outstanding messages, but it shows what detection and a separate SMBus command would look like.

>> So I think the changes would be small and contained. I'm actually ok with a different driver, but I think it would be more valuable to the OpenBMC project to have a standardized interface that would work (in a not quite as efficient mode) with software that does not use the Linux IPMI driver.
>
> I guess I see all of my asks as hacky things which we can hopefully remove at some point. Hopefully, most OpenBMC users won't want or need these things.
>
> ...
>
>>> Regardless of what we do with the "BT-I2C" stuff, I am still interested in what you think about this.
>>
>> I think you are right, it probably belongs some place else. The way that makes the most sense to me would be to have an "ipmi" directory with a "host" and "slave" side, and since ipmi is not really a char driver, to move it to the main driver directory. That might be fairly disruptive, though.
>
> That was my thinking exactly.
>
>> The other option that makes sense to me would be to add a drivers/char/ipmi_slave directory, or something like that, and put the slave code there. That would be less disruptive.
>
> Right, that is the approach I took, except I called it drivers/char/ipmi_bmc. I originally thought doing the less disruptive thing is best; however, I know there are also some OpenBMC people who are interested in implementing IPMB. So maybe now is the time to bite the bullet and create an ipmi directory under drivers/.

I'm not sure IPMB would make much difference, there's no host side change as it's already supported. I don't think there would be any significant code sharing between the two. If there ends up being a significant amount of common code, then it would definitely be worth the effort to move it.

>> -corey
>
> In summary, I think I can live with making it a mangled form of SSIF, but I would prefer to put it in its own driver.

You can look at the patch and consider it, and consider that you would need to implement flag and event handling. On an x86 host there would be SMBIOS and ACPI stuff to deal with somehow for discovery. There are probably a few other things to deal with.

> In any case, I think I would rather focus on the BMC side IPMI framework now, since it is a bigger change and would also reduce the work of implementing a BMC side SSIF driver.
>
> Here is what I propose: we focus on the BMC side IPMI framework RFC that I sent out the other day:
>
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1463473.html
>
> I will add a change to the BMC side IPMI framework patchset to move all the IPMI stuff to the new drivers/ipmi directory as discussed and then drop the patch in that patchset that depends on this patchset.
>
> Let me know what you think

Let's hold off on the new directory, there's probably some convincing of the "powers that be" for that. I'll look at the patch set tomorrow, unless something critical comes up.

Thanks,

-corey

diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
index 11237e8..d467b1a 100644
--- a/drivers/char/ipmi/ipmi_ssif.c
+++ b/drivers/char/ipmi/ipmi_ssif.c
@@ -60,6 +60,11 @@
 #define IPMI_GET_SYSTEM_INTERFACE_CAPABILITIES_CMD 0x57
+static u8 openbmc_iana[3] = { 0x10, 0x20, 0x30 };
+#define IPMI_OPENBMC_CAPABILITY_REQUEST_CMD 0x01
+#define SSIF_OPENBMC_REQUEST 0x80
+#define SSIF_OPENBMC_RESPONSE 0x81
+
 #define SSIF_IPMI_REQUEST 2
[PATCH] Document:add Chinese translation of rfkill.txt
From: "guohao.w"Signed-off-by: guohao.w --- Documentation/translations/zh_CN/rfkill.txt | 117 1 file changed, 117 insertions(+) create mode 100644 Documentation/translations/zh_CN/rfkill.txt diff --git a/Documentation/translations/zh_CN/rfkill.txt b/Documentation/translations/zh_CN/rfkill.txt new file mode 100644 index ..2c90f5733551 --- /dev/null +++ b/Documentation/translations/zh_CN/rfkill.txt @@ -0,0 +1,117 @@ +Chinese translated version of Documentation/rfkill.txt + +If you have any comment or update to the content, please post to LKML directly. +However, if you have problem communicating in English you can also ask the +Chinese maintainer for help. Contact the Chinese maintainer, if this +translation is outdated or there is problem with translation. + +Chinese maintainer: guohao.wang +- +Documentation/rfkill.txt的中文翻译 + +如果想评论或更新本文的内容,请直接发信到LKML。如果你使用英文交流有困难的话,也可 +以向中文版维护者求助。如果本翻译更新不及时或者翻译存在问题,请联系中文版维护者。 + +中文版维护者:guohao.wang +中文版翻译者:guohao.wang +中文版校译者:Lily <1517048...@qq.com> +以下为正文 +- + + +rfkill - RF kill 开关的支持 +=== + +1. 介绍 +2. 实现细节 +3. 内核API +4. 用户空间支持 + + +1. 介绍 + +rfkill子系统提供一个通用接口来禁用任意系统中的射频发射器。当发射器被锁定时,它 +将不再消耗任何电力。 + +这个子系统也具有响应按键操作来禁用特定种类发射器(或全部种类)的能力。这个是适 +用于关闭发射器的场合,比如说在飞行器上。 + +rfkill子系统提供了“硬”和“软”锁定的概念,它们意思几乎没有区别(断开 == 发射器关 +机), +而真正的区别在于它们状态是否能被改变: + - 硬锁定:只读射频设备锁定,它不能被软件修改。 + - 软锁定:可写的射频设备锁定(不需可读性),它可被系统软件设置。 + +rfkill子系统有两个被记录在kernel-parameters.txt的参数rfkill.default_state 与 +rfkill.master_switch_mode。 + +2. 实现细节 + +rfkill 子系统是由三个主要部分组成: + * rfkill核心, + * 已被弃用的rfkill-input模块(一个输入层的handler,被用户空间策略代码替换), + * rfkill驱动。 + +rfkill核心为驱动程序在内核中注册它们的射频发生器的打开和关闭的方法提供API,同时 +使系统知道硬件的禁用状态也许在设备中被实现。 + +rfkill核心代码还会提醒用户空间状态的改变,并提供为用户空间提供一个查询当前状态的 +方法。更多信息见下节“用户空间支持”。 + +当设备被硬锁定时(通过调用rfkill_set_hw_state()或query_hw_block)set_bloack()将 +会其他的软件锁定调用,但是驱动可以忽略这个调用因为它们可以使用 +rfkill_set_hw_state()的返回值来同步软件状态以此来替代set_block()调用的追踪。事 +实上,驱动应该使用rfkill_set_hw_state()的返回值除非硬件的确分开跟踪它的软锁定和 +硬锁定。 + +3. 
内核API + +设备发射器的驱动通常都实现一个rfkill的驱动。 + +如果rfkill仅仅是个按钮,平台驱动也许实现输入设备。如果这个按钮要影响硬件你需要 +实现一个rfkill驱动取代平台驱动。如果平台提供一个开/关发射器的方法,以上方法同 +样适用。 + +对部分平台,在挂起/休眠期间修改硬件状态是非常可能的,这样的情况下在恢复时候以 +当前状态更新rfkill核心的状态很有必要。 + +去创建一个rfkill驱动,驱动的Kconfig需要有 + + depends on RFKILL || !RFKILL + +来确保当rfkill是模块时驱动不被编译进内核。!RFKILL允许驱动在rfkill没有被配置的 +情况下编译,这种情况所有的rfkill API 仍然可以被使用但是几乎什么都没有被编译进去。 + +当状态改变正在发生时调用rfkill_set_hw_state()需要rfkill驱动可以进行硬锁定,除非它 +们也可以分配poll_hw_block()回调(然后rfkill核心将会轮训设备)。不要这样做除非你 +不能通过它方法获取事件。 + +RFKill 提供一个每开关LED触发器,可以根据开关的状态驱动LED(LED_FULL表示断开, +LED_OFF表示其他情况)。 + +5. 用户空间支持 + +被推荐使用的用户态接口是/dev/rfkill,它属于杂项字符设备,它允许用户态获取和设置 +rfkill的状态来设置硬件。它也通知用户态设备的添加和移除。API是一个简单的读/写 +API,它在linux/rfkill.h中定义,有个ioctl允许关闭在kernel过渡期间废弃的输入 +handler。 + +除了一个ioctl,与内核的通信是通过“struct rfkill_event”实例read()和write()来 +完成。 在这个结构体中,软锁定和硬锁定块被正确区分(不同于sysfs,如下),用户空间 +能够获得系统中所有rfkill设备的一致的快照。 此外,切换所有rfkill驱动(或指定类 +型的所有驱动程序)的状态可能更新所有热插拔设备的默认状态。 + +应用程序打开/dev/rfkill后,它可以读到所有设备的当前状态。可以通过轮询热插拔描述 +符,或状态更改事件,在或者侦听rfkill核心框架发出的uevent来获取修改。 + +此外,每个rfkill设备都注册在sysfs并发出uevents。 + +rfkill设备发出uevents(具有“更改”的操作),并设置以下环境变量: + +RFKILL_NAME +RFKILL_STATE +RFKILL_TYPE + +这些变量的内容对应于上面解释的“name”,“state”和“type” 的sysfs文件。 + +更多的细节查看 Documentation/ABI/stable/sysfs-class-rfkill. -- 2.13.4 -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Linux-ima-devel] [PATCH, RESEND 08/12] ima: added parser for RPM data type
On Wed, 2017-08-09 at 11:15 +0200, Roberto Sassu wrote:
> On 8/2/2017 9:22 AM, James Morris wrote:
> > On Tue, 1 Aug 2017, Roberto Sassu wrote:
> >
> >> On 8/1/2017 12:27 PM, Christoph Hellwig wrote:
> >>> On Tue, Aug 01, 2017 at 12:20:36PM +0200, Roberto Sassu wrote:
> >>>> This patch introduces a parser for RPM packages. It extracts the digests from the RPMTAG_FILEDIGESTS header section and converts them to binary data before adding them to the hash table.
> >>>>
> >>>> The advantage of this data type is that verifiers can determine who produced that data, as headers are signed by Linux distributions vendors. RPM headers signatures can be provided as digest list metadata.
> >>>
> >>> Err, parsing arbitrary file formats has no business in the kernel.
> >>
> >> The benefit of this choice is that no actions are required for Linux distribution vendors to support the solution I'm proposing, because they already provide signed digest lists (RPM headers).
> >>
> >> Since the proof of loading a digest list is the digest of the digest list (included in the list metadata), if RPM headers are converted to a different format, remote attestation verifiers cannot check the signature.
> >>
> >> If the concern is security, it would be possible to prevent unsigned RPM headers from being parsed, if the PGP key type is upstreamed (adding in CC keyri...@vger.kernel.org).
> >
> > It's a security concern and also a layering violation, there should be no need to parse package file formats in the kernel.
>
> Parsing RPMs is not strictly necessary. Digests from the headers can be extracted and written to a new file using the compact data format (introduced with patch 7/12).
>
> At boot time, IMA measures this file before digests are uploaded to the kernel. At this point, only files with unknown digest will be added to the measurement list. At verification time, verifiers recreate the measurement list by merging together the digests uploaded to the kernel with the unknown digests. Then, they verify the obtained list.
>
> There are two ways to verify the digests: searching them in a reference database, or checking a signature. With the 'ima-sig' measurement list template, it is possible to verify signatures for each accessed file. With this patch set, it is possible to verify the signature of the file containing the digests uploaded to the kernel. If the data format changes, the signature cannot be verified.
>
> To avoid this limitation, the parsers could be moved to a userspace tool which then uploads the parsed digests to the kernel. IMA would measure the original files. But, if the tool is compromised, it could load digests not included in the parsed files. With the current solution this problem does not arise because no changes can be done by userspace applications to the uploaded data while digests are parsed by IMA.
>
> I could remove the RPM parser from the patch set for now.
>
> Is the remaining part of the patch set ok, and is the explanation of what it does clear?

From a trusted boot perspective, file measurements are added to the measurement list, before access to the file is given. The measurement list contains ALL measurements, as defined by policy. This patch set changes that meaning to be all measurements, as defined by policy, with the exception of those in a white list. Changing the fundamental meaning of the measurement list is not acceptable.

You could define a new securityfs file to differentiate between the full measurement list and this abbreviated one. But before making this sort of change, I would prefer to address the underlying problem - TPM performance.

There are a couple of things that could be done to improve the TPM driver performance, itself. Once all of these options have been pursued, we could then consider batching the measurements to the TPM, meaning that the measurement list would still contain all the file measurements, but instead of extending the TPM for each measurement, a batched hash - a hash of a group of file measurements - would be extended into the TPM.

Mimi

> > I'm not really clear on exactly how this patch series works. Can you provide a more concrete explanation of what steps would occur during boot and attestation?
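
The batching idea in the reply above can be sketched in a few lines of C. This is a toy model only: toy_hash() is a 64-bit FNV-1a used purely as a stand-in for a real digest such as SHA-256 and is NOT cryptographically secure, and struct toy_tpm, toy_extend() and measure_batched() are names invented for this illustration rather than IMA or TPM driver interfaces. The point it shows is that every file measurement is still recorded, while the number of (slow) extend operations drops to one per batch.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a real digest (64-bit FNV-1a).  NOT cryptographic. */
static uint64_t toy_hash(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint64_t h = 14695981039346656037ULL;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 1099511628211ULL;
	}
	return h;
}

/* Hypothetical model of a single PCR plus a counter of extend operations. */
struct toy_tpm {
	uint64_t pcr;
	unsigned int extends;
};

/* Extend: PCR_new = H(PCR_old || digest). */
static void toy_extend(struct toy_tpm *tpm, uint64_t digest)
{
	uint64_t buf[2] = { tpm->pcr, digest };

	tpm->pcr = toy_hash(buf, sizeof(buf));
	tpm->extends++;
}

/*
 * Walk the full measurement list (all n digests stay in the list) but
 * extend the TPM only once per group of up to 'batch' measurements,
 * using a hash over the whole group.
 */
static void measure_batched(struct toy_tpm *tpm, const uint64_t *digests,
			    size_t n, size_t batch)
{
	if (batch == 0)
		batch = 1;

	for (size_t i = 0; i < n; i += batch) {
		size_t cnt = (n - i < batch) ? n - i : batch;

		toy_extend(tpm, toy_hash(&digests[i], cnt * sizeof(digests[0])));
	}
}
```

A verifier holding the complete measurement list and the batch size can replay measure_batched() and compare the resulting PCR value, so in this model the list keeps its "all measurements" meaning while the extend count falls from n to roughly n divided by the batch size.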