[PATCH net-next] net: avoid indirect calls in L4 checksum calculation
Commit 283c16a2dfd3 ("indirect call wrappers: helpers to speed-up
indirect calls of builtin") introduces some macros to avoid doing
indirect calls.

Use these helpers to remove two indirect calls in the L4 checksum
calculation for devices which don't have hardware support for it.

As a test I generate packets with pktgen out to a dummy interface
with HW checksumming disabled, to have the checksum calculated in
every sent packet.
The packet rate measured with an i7-6700K CPU and a single pktgen
thread rose from 6143 to 6608 Kpps, an increase of 7.5%.

Suggested-by: Davide Caratti
Signed-off-by: Matteo Croce
---
 net/core/skbuff.c | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index e89be6282693..a24a7ef55ce9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -69,6 +69,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -76,9 +77,22 @@
 #include
 #include
 #include
+#include
 
 #include "datagram.h"
 
+#if IS_ENABLED(CONFIG_IP_SCTP)
+#define CSUM_UPDATE(f, ...) \
+	INDIRECT_CALL_2(f, csum_partial_ext, sctp_csum_update, __VA_ARGS__)
+#define CSUM_COMBINE(f, ...) \
+	INDIRECT_CALL_2(f, csum_block_add_ext, sctp_csum_combine, __VA_ARGS__)
+#else
+#define CSUM_UPDATE(f, ...) \
+	INDIRECT_CALL_1(f, csum_partial_ext, __VA_ARGS__)
+#define CSUM_COMBINE(f, ...) \
+	INDIRECT_CALL_1(f, csum_block_add_ext, __VA_ARGS__)
+#endif
+
 struct kmem_cache *skbuff_head_cache __ro_after_init;
 static struct kmem_cache *skbuff_fclone_cache __ro_after_init;
 #ifdef CONFIG_SKB_EXTENSIONS
@@ -2507,7 +2521,7 @@ __wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
 	if (copy > 0) {
 		if (copy > len)
 			copy = len;
-		csum = ops->update(skb->data + offset, copy, csum);
+		csum = CSUM_UPDATE(ops->update, skb->data + offset, copy, csum);
 		if ((len -= copy) == 0)
 			return csum;
 		offset += copy;
@@ -2534,9 +2548,9 @@ __wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
 					      frag->page_offset + offset - start,
 					      copy, p, p_off, p_len, copied) {
 				vaddr = kmap_atomic(p);
-				csum2 = ops->update(vaddr + p_off, p_len, 0);
+				csum2 = CSUM_UPDATE(ops->update, vaddr + p_off, p_len, 0);
 				kunmap_atomic(vaddr);
-				csum = ops->combine(csum, csum2, pos, p_len);
+				csum = CSUM_COMBINE(ops->combine, csum, csum2, pos, p_len);
 				pos += p_len;
 			}
 
@@ -2559,7 +2573,7 @@ __wsum __skb_checksum(const struct sk_buff *skb, int offset, int len,
 				copy = len;
 			csum2 = __skb_checksum(frag_iter, offset - start,
 					       copy, 0, ops);
-			csum = ops->combine(csum, csum2, pos, copy);
+			csum = CSUM_COMBINE(ops->combine, csum, csum2, pos, copy);
 			if ((len -= copy) == 0)
 				return csum;
 			offset += copy;
-- 
2.21.0
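For readers unfamiliar with the wrappers used in this patch, here is a minimal userspace sketch of the idea as I understand it: compare the function pointer against the targets that are expected to be common and, on a match, call that function directly instead of through the pointer. The DEMO_* macros, the demo_* functions and the toy "checksums" are all hypothetical stand-ins written for illustration; they are not the kernel's INDIRECT_CALL_1/INDIRECT_CALL_2 definitions and they do not compute real Internet checksums.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel helpers: if f matches a known
 * target, call that target by name (a direct call the compiler may
 * inline); otherwise fall back to the indirect call through f.
 */
#define DEMO_INDIRECT_CALL_1(f, f1, ...) \
	((f) == (f1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))
#define DEMO_INDIRECT_CALL_2(f, f2, f1, ...) \
	((f) == (f2) ? f2(__VA_ARGS__) : \
		       DEMO_INDIRECT_CALL_1(f, f1, __VA_ARGS__))

typedef uint32_t (*demo_update_fn)(const uint8_t *data, size_t len,
				   uint32_t seed);

struct demo_csum_ops {
	demo_update_fn update;
};

/* Toy "checksum" 1: byte-wise sum added to the seed. */
uint32_t demo_sum_update(const uint8_t *data, size_t len, uint32_t seed)
{
	for (size_t i = 0; i < len; i++)
		seed += data[i];
	return seed;
}

/* Toy "checksum" 2: byte-wise XOR folded into the seed. */
uint32_t demo_xor_update(const uint8_t *data, size_t len, uint32_t seed)
{
	for (size_t i = 0; i < len; i++)
		seed ^= data[i];
	return seed;
}

/* A target the wrapper does not know about; it takes the fallback path. */
uint32_t demo_len_update(const uint8_t *data, size_t len, uint32_t seed)
{
	(void)data;
	return seed + (uint32_t)len;
}

/* The call site: same result as ops->update(...), but the two expected
 * targets are reached without an indirect branch.
 */
uint32_t demo_checksum(const struct demo_csum_ops *ops, const uint8_t *data,
		       size_t len, uint32_t seed)
{
	assert(ops != NULL && ops->update != NULL);
	return DEMO_INDIRECT_CALL_2(ops->update, demo_sum_update,
				    demo_xor_update, data, len, seed);
}
```

The result is the same whichever branch is taken; only the kind of call differs, which is why the patch can swap ops->update(...) for CSUM_UPDATE(ops->update, ...) without changing behaviour.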
Re: [RFC][PATCH 0/7] Mount, FS, Block and Keyrings notifications
On Tue, May 28, 2019 at 05:01:47PM +0100, David Howells wrote:
> Things I want to avoid:
>
>  (1) Introducing features that make the core VFS dependent on the network
>      stack or networking namespaces (ie. usage of netlink).
>
>  (2) Dumping all this stuff into dmesg and having a daemon that sits there
>      parsing the output and distributing it as this then puts the
>      responsibility for security into userspace and makes handling
>      namespaces tricky.  Further, dmesg might not exist or might be
>      inaccessible inside a container.
>
>  (3) Letting users see events they shouldn't be able to see.

How are you handling namespaces then?  Are they determined by the
namespace of the process that opened the original device handle, or the
namespace that made the new syscall for the events to "start flowing"?

Am I missing the logic that determines this in the patches, or is that
not implemented yet?

thanks,

greg k-h
[PATCH] ARM: xor-neon: Replace __GNUC__ checks with CONFIG_CC_IS_GCC
Currently, when compiling this code with clang, the following warning is
emitted:

  CC      arch/arm/lib/xor-neon.o
arch/arm/lib/xor-neon.c:33:2: warning: This code requires at least
version 4.6 of GCC [-W#warnings]

This is because clang poses as GCC 4.2.1 with its __GNUC__ conditionals
for glibc compatibility[1]:

$ echo | clang -dM -E -x c /dev/null | grep GNUC | awk '{print $2" "$3}'
__GNUC_MINOR__ 2
__GNUC_PATCHLEVEL__ 1
__GNUC_STDC_INLINE__ 1
__GNUC__ 4

As pointed out by Ard Biesheuvel and Arnd Bergmann in an earlier
thread[2], the oldest version of GCC that is currently supported is gcc
4.6 after commit cafa0010cd51 ("Raise the minimum required gcc version
to 4.6") so we do not need to check for anything older anymore.

However, just removing the version check is not enough to silence clang
because it does not recognize '#pragma GCC optimize':

arch/arm/lib/xor-neon.c:25:13: warning: unknown pragma ignored
[-Wunknown-pragmas]
#pragma GCC optimize "tree-vectorize"

Looking into it further, -ftree-vectorize (which '#pragma GCC optimize
"tree-vectorize"' enables) is an alias in clang for -fvectorize[3],
which according to the documentation is on by default[4] (at least at
-O2 or -Os).

Just add the pragma when compiling with GCC so that clang does not
unnecessarily warn.

[1]: https://reviews.llvm.org/D51011#1206981
[2]: https://lore.kernel.org/lkml/cak8p3a3njtcgfd2dq9kbhp8dpxf6s-ulfeu6acayc4sdi+2...@mail.gmail.com/
[3]: https://github.com/llvm/llvm-project/blob/eafe8ef6f2b44ba/clang/include/clang/Driver/Options.td#L1729
[4]: https://llvm.org/docs/Vectorizers.html#usage

Link: https://github.com/ClangBuiltLinux/linux/issues/496
Reported-by: Nick Desaulniers
Signed-off-by: Nathan Chancellor
---
 arch/arm/lib/xor-neon.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index c691b901092f..d532bc072ee4 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -22,15 +22,8 @@ MODULE_LICENSE("GPL");
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif
 
 #pragma GCC diagnostic ignored "-Wunused-variable"
-- 
2.22.0.rc1
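Outside the kernel build there is no CONFIG_CC_IS_GCC, so the same guard has to be derived from compiler macros. The following is a rough userspace sketch of that, assuming only that clang defines __clang__ in addition to the __GNUC__ macros shown above. DEMO_CC_IS_GCC and demo_xor_words() are names made up for this example, and other compilers that also define __GNUC__ are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* clang also defines __GNUC__, so __GNUC__ alone cannot tell the two
 * compilers apart. Excluding __clang__ approximates, for this sketch
 * only, what CONFIG_CC_IS_GCC provides in the kernel build.
 */
#if defined(__GNUC__) && !defined(__clang__)
#define DEMO_CC_IS_GCC 1
/* Only GCC sees the pragma, so clang has nothing to warn about. */
#pragma GCC optimize "tree-vectorize"
#else
#define DEMO_CC_IS_GCC 0
#endif

/* XOR src into dst one word at a time: a plain loop of the kind an
 * auto-vectorizer is expected to handle.
 */
void demo_xor_words(unsigned long *dst, const unsigned long *src, size_t n)
{
	assert(dst != NULL && src != NULL);
	for (size_t i = 0; i < n; i++)
		dst[i] ^= src[i];
}
```

The function computes the same result with either compiler; the guard only decides whether the vectorization request is spelled as a GCC pragma or left to the compiler's defaults.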
[PATCH net-next 3/5] net: dsa: mv88e6xxx: Let taggers specify a can_timestamp function
The newly introduced function is called on both the RX and TX paths. The boolean returned by port_txtstamp should only return false if the driver tried to timestamp the skb but failed. Currently there is some logic in the mv88e6xxx driver that determines whether it should timestamp frames or not. This is wasteful, because if the decision is to not timestamp them, then DSA will have cloned an skb and freed it immediately afterwards. Additionally other drivers (sja1105) may have other hardware criteria for timestamping frames on RX, and the default conditions for timestamping a frame are too restrictive. When RX timestamping is enabled, the sja1105 hardware emits a follow-up frame containing the timestamp for every trapped link-local frame. Then the link-local frame is queued up inside the port_rxtstamp callback where it waits for its follow-up meta frame to come. But only a subset of the link-local frames will pass through DSA's default filter for port_rxtstamp, so the rest of the link-local traffic would still receive a meta frame but would not get timestamped. Since the state machine of waiting for meta frames is implemented in the tagger rcv function for sja1105, it is difficult to know which frames will pass through DSA's later filter and which won't. And since timestamping more frames than just PTP does no harm, just implement a callback for sja1105 that will say that all link-local traffic will be timestamped on RX. PTP classification on the skb is still performed. But now it is saved to the DSA_SKB_CB, so drivers can reuse it without calling it again. The mv88e6xxx driver was also modified to use the new generic DSA_SKB_CB(skb)->ptp_type instead of its own, custom SKB_PTP_TYPE(skb). 
Signed-off-by: Vladimir Oltean
---
 drivers/net/dsa/mv88e6xxx/hwtstamp.c | 25 +
 drivers/net/dsa/mv88e6xxx/hwtstamp.h |  4 ++--
 include/net/dsa.h                    |  6 --
 net/dsa/dsa.c                        | 25 +++--
 net/dsa/slave.c                      | 20 ++--
 5 files changed, 44 insertions(+), 36 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.c b/drivers/net/dsa/mv88e6xxx/hwtstamp.c
index a17c16a2ab78..3295ad10818f 100644
--- a/drivers/net/dsa/mv88e6xxx/hwtstamp.c
+++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.c
@@ -20,8 +20,6 @@
 #include "ptp.h"
 #include
 
-#define SKB_PTP_TYPE(__skb) (*(unsigned int *)((__skb)->cb))
-
 static int mv88e6xxx_port_ptp_read(struct mv88e6xxx_chip *chip, int port,
 				   int addr, u16 *data, int len)
 {
@@ -216,8 +214,9 @@ int mv88e6xxx_port_hwtstamp_get(struct dsa_switch *ds, int port,
 }
 
 /* Get the start of the PTP header in this skb */
-static u8 *parse_ptp_header(struct sk_buff *skb, unsigned int type)
+static u8 *parse_ptp_header(struct sk_buff *skb)
 {
+	unsigned int type = DSA_SKB_CB(skb)->ptp_type;
 	u8 *data = skb_mac_header(skb);
 	unsigned int offset = 0;
 
@@ -249,7 +248,7 @@ static u8 *parse_ptp_header(struct sk_buff *skb, unsigned int type)
  * or NULL if the caller should not.
  */
 static u8 *mv88e6xxx_should_tstamp(struct mv88e6xxx_chip *chip, int port,
-				   struct sk_buff *skb, unsigned int type)
+				   struct sk_buff *skb)
 {
 	struct mv88e6xxx_port_hwtstamp *ps = &chip->port_hwtstamp[port];
 	u8 *hdr;
@@ -257,7 +256,7 @@ static u8 *mv88e6xxx_should_tstamp(struct mv88e6xxx_chip *chip, int port,
 	if (!chip->info->ptp_support)
 		return NULL;
 
-	hdr = parse_ptp_header(skb, type);
+	hdr = parse_ptp_header(skb);
 	if (!hdr)
 		return NULL;
 
@@ -278,8 +277,7 @@ static int mv88e6xxx_ts_valid(u16 status)
 
 static int seq_match(struct sk_buff *skb, u16 ts_seqid)
 {
-	unsigned int type = SKB_PTP_TYPE(skb);
-	u8 *hdr = parse_ptp_header(skb, type);
+	u8 *hdr = parse_ptp_header(skb);
 	__be16 *seqid;
 
 	seqid = (__be16 *)(hdr + OFF_PTP_SEQUENCE_ID);
@@ -367,7 +365,7 @@ static int is_pdelay_resp(u8 *msgtype)
 }
 
 bool mv88e6xxx_port_rxtstamp(struct dsa_switch *ds, int port,
-			     struct sk_buff *skb, unsigned int type)
+			     struct sk_buff *skb)
 {
 	struct mv88e6xxx_port_hwtstamp *ps;
 	struct mv88e6xxx_chip *chip;
@@ -379,12 +377,10 @@ bool mv88e6xxx_port_rxtstamp(struct dsa_switch *ds, int port,
 	if (ps->tstamp_config.rx_filter != HWTSTAMP_FILTER_PTP_V2_EVENT)
 		return false;
 
-	hdr = mv88e6xxx_should_tstamp(chip, port, skb, type);
+	hdr = mv88e6xxx_should_tstamp(chip, port, skb);
 	if (!hdr)
 		return false;
 
-	SKB_PTP_TYPE(skb) = type;
-
 	if (is_pdelay_resp(hdr))
 		skb_queue_tail(&ps->rx_queue2, skb);
 	else
@@ -503,17 +499,14 @@ long mv88e6xxx_hwtstamp_work(struct ptp_clock_info *ptp)
 }
 
 bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds,
[PATCH net-next 0/5] PTP support for the SJA1105 DSA driver
This patchset adds the following:

 - A timecounter/cyclecounter based PHC for the free-running
   timestamping clock of this switch.

 - A state machine implemented in the DSA tagger for SJA1105, which
   keeps track of metadata follow-up Ethernet frames (the switch's way
   of transmitting RX timestamps).

 - Some common-sense on whether or not frames should be timestamped was
   taken out of the mv88e6xxx driver (the only other DSA driver with
   PTP support) and moved to the generic framework. An option was also
   added for drivers to override these common-sense decisions, and
   timestamp some more frames. This was the path of least resistance
   after implementing the aforementioned state machine - metadata
   follow-up frames need to be tracked anyway even if only to discard
   them and not pass them up the network stack. And since the switch
   can't just be told to timestamp only what the kernel wants (PTP
   frames), simply use all the timestamps it provides.

 - A generic helper in the timecounter/cyclecounter code for
   reconstructing partial PTP timestamps, such as those generated by
   the SJA1105.

Not all is rosy, though.

PTP timestamping will only work when the ports are bridged. Otherwise,
the metadata follow-up frames holding RX timestamps won't be received
because they will be blocked by the master port's MAC filter. Linuxptp
tries to put the net device in ALLMULTI/PROMISC mode, but DSA doesn't
pass this on to the master port, which does the actual reception. The
master port is put in promiscuous mode when the slave ports are
enslaved to a bridge.

Also, even with software-corrected timestamps, one can observe a
negative path delay reported by linuxptp:

ptp4l[55.600]: master offset          8 s2 freq  +83677 path delay -2390
ptp4l[56.600]: master offset         17 s2 freq  +83688 path delay -2391
ptp4l[57.601]: master offset          6 s2 freq  +83682 path delay -2391
ptp4l[58.601]: master offset         -1 s2 freq  +83677 path delay -2391

Without investigating too deeply, this appears to be introduced by the
correction applied by linuxptp to t4 (t4c: corrected master rxtstamp)
during the path delay estimation process (removing the correction makes
the path delay positive). This does not appear to have an obvious
negative effect upon the synchronization.

Lastly, clock manipulations on the actual hardware PTP clock will have
to be implemented anyway, for the TTEthernet block and the time-based
ingress policer.

Vladimir Oltean (5):
  timecounter: Add helper for reconstructing partial timestamps
  net: dsa: sja1105: Add support for the PTP clock
  net: dsa: mv88e6xxx: Let taggers specify a can_timestamp function
  net: dsa: sja1105: Add support for PTP timestamping
  net: dsa: sja1105: Increase priority of CPU-trapped frames

 drivers/net/dsa/mv88e6xxx/hwtstamp.c          |  25 +-
 drivers/net/dsa/mv88e6xxx/hwtstamp.h          |   4 +-
 drivers/net/dsa/sja1105/Kconfig               |   7 +
 drivers/net/dsa/sja1105/Makefile              |   1 +
 drivers/net/dsa/sja1105/sja1105.h             |  30 ++
 .../net/dsa/sja1105/sja1105_dynamic_config.c  |   2 +
 drivers/net/dsa/sja1105/sja1105_main.c        | 272 -
 drivers/net/dsa/sja1105/sja1105_ptp.c         | 357 ++
 drivers/net/dsa/sja1105/sja1105_ptp.h         |  48 +++
 drivers/net/dsa/sja1105/sja1105_spi.c         |  28 ++
 .../net/dsa/sja1105/sja1105_static_config.c   |  59 +++
 .../net/dsa/sja1105/sja1105_static_config.h   |  10 +
 include/linux/dsa/sja1105.h                   |  15 +
 include/linux/timecounter.h                   |   7 +
 include/net/dsa.h                             |   6 +-
 kernel/time/timecounter.c                     |  33 ++
 net/dsa/dsa.c                                 |  25 +-
 net/dsa/slave.c                               |  20 +-
 net/dsa/tag_sja1105.c                         | 135 ++-
 19 files changed, 1043 insertions(+), 41 deletions(-)
 create mode 100644 drivers/net/dsa/sja1105/sja1105_ptp.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_ptp.h

-- 
2.17.1
[PATCH net-next 5/5] net: dsa: sja1105: Increase priority of CPU-trapped frames
Without noticing any particular issue, this patch ensures that management traffic is treated with the maximum priority on RX by the switch. This is generally desirable, as the driver keeps a state machine that waits for metadata follow-up frames as soon as a management frame is received. Increasing the priority helps expedite the reception (and further reconstruction) of the RX timestamp to the driver after the MAC has generated it. Signed-off-by: Vladimir Oltean --- drivers/net/dsa/sja1105/sja1105_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index ce516615536d..3bd250e4e070 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -380,7 +380,7 @@ static int sja1105_init_general_params(struct sja1105_private *priv) .mirr_ptacu = 0, .switchid = priv->ds->index, /* Priority queue for link-local frames trapped to CPU */ - .hostprio = 0, + .hostprio = 7, .mac_fltres1 = SJA1105_LINKLOCAL_FILTER_A, .mac_flt1= SJA1105_LINKLOCAL_FILTER_A_MASK, .incl_srcpt1 = true, -- 2.17.1
[PATCH net-next 1/5] timecounter: Add helper for reconstructing partial timestamps
Some PTP hardware offers a 64-bit free-running counter whose snapshots
are used for timestamping, but only makes part of that snapshot
available as timestamps (low-order bits).

In that case, timecounter/cyclecounter users must bring the cyclecounter
and timestamps to the same bit width, and they currently have two
options of doing so:

- Trim the higher bits of the timecounter itself to the number of bits
  of the timestamps.  This might work for some setups, but if the
  wraparound of the timecounter in this case becomes high (~10 times per
  second) then this causes additional strain on the system, which must
  read the clock that often just to avoid missing the wraparounds.

- Reconstruct the timestamp by racing to read the PTP time within one
  wraparound cycle since the timestamp was generated.  This is
  preferable when the wraparound time is small (do a time-critical
  readout once vs doing it periodically), and it has no drawback even
  when the wraparound is comfortably sized.

Signed-off-by: Vladimir Oltean
---
 include/linux/timecounter.h |  7 +++
 kernel/time/timecounter.c   | 33 +
 2 files changed, 40 insertions(+)

diff --git a/include/linux/timecounter.h b/include/linux/timecounter.h
index 2496ad4cfc99..03eab1f3bb9c 100644
--- a/include/linux/timecounter.h
+++ b/include/linux/timecounter.h
@@ -30,6 +30,9 @@
  *	by the implementor and user of specific instances of this API.
  *
  * @read:		returns the current cycle value
+ * @partial_tstamp_mask:bitmask in case the hardware emits timestamps
+ *			which only capture low-order bits of the full
+ *			counter, and should be reconstructed.
  * @mask:		bitmask for two's complement
  *			subtraction of non 64 bit counters,
  *			see CYCLECOUNTER_MASK() helper macro
@@ -38,6 +41,7 @@
  */
 struct cyclecounter {
 	u64 (*read)(const struct cyclecounter *cc);
+	u64 partial_tstamp_mask;
 	u64 mask;
 	u32 mult;
 	u32 shift;
@@ -136,4 +140,7 @@ extern u64 timecounter_read(struct timecounter *tc);
 extern u64 timecounter_cyc2time(struct timecounter *tc,
 				u64 cycle_tstamp);
 
+extern u64 cyclecounter_reconstruct(const struct cyclecounter *cc,
+				    u64 ts_partial);
+
 #endif
diff --git a/kernel/time/timecounter.c b/kernel/time/timecounter.c
index 85b98e727306..d4657d64e38d 100644
--- a/kernel/time/timecounter.c
+++ b/kernel/time/timecounter.c
@@ -97,3 +97,36 @@ u64 timecounter_cyc2time(struct timecounter *tc,
 	return nsec;
 }
 EXPORT_SYMBOL_GPL(timecounter_cyc2time);
+
+/**
+ * cyclecounter_reconstruct - reconstructs @ts_partial
+ * @cc:		Pointer to cycle counter.
+ * @ts_partial:	Typically RX or TX NIC timestamp, provided by hardware as
+ *		the lower @partial_tstamp_mask bits of the cycle counter,
+ *		sampled at the time the timestamp was collected.
+ *		To reconstruct into a full @mask bit-wide timestamp, the
+ *		cycle counter is read and the high-order bits (up to @mask) are
+ *		filled in.
+ *		Must be called within one wraparound of @partial_tstamp_mask
+ *		bits of the cycle counter.
+ */
+u64 cyclecounter_reconstruct(const struct cyclecounter *cc, u64 ts_partial)
+{
+	u64 ts_reconstructed;
+	u64 cycle_now;
+
+	cycle_now = cc->read(cc);
+
+	ts_reconstructed = (cycle_now & ~cc->partial_tstamp_mask) |
+			    ts_partial;
+
+	/* Check lower bits of current cycle counter against the timestamp.
+	 * If the current cycle counter is lower than the partial timestamp,
+	 * then wraparound surely occurred and must be accounted for.
+	 */
+	if ((cycle_now & cc->partial_tstamp_mask) <= ts_partial)
+		ts_reconstructed -= (cc->partial_tstamp_mask + 1);
+
+	return ts_reconstructed;
+}
+EXPORT_SYMBOL_GPL(cyclecounter_reconstruct);
-- 
2.17.1
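To make the reconstruction arithmetic from this patch concrete, here is a small userspace sketch with the counter readout passed in as a plain argument so that it can be tried with fixed numbers. demo_reconstruct() and its parameter names are hypothetical. Note one deliberate difference: the sketch compares with a strict "<", so equal low bits are read as "no time elapsed", whereas the patch compares with "<=", which reads equal low bits as exactly one wraparound. Which reading fits a given device is an assumption left open here.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuild a full counter value from its low-order bits.
 *
 * now:        full counter value, read after the timestamp was taken
 * ts_partial: low-order bits of the counter as captured by hardware
 * mask:       mask of the captured bits, e.g. 0xffffff for 24 bits
 *
 * Assumes less than one wraparound of the partial field has elapsed
 * between the capture and the readout of "now".
 */
uint64_t demo_reconstruct(uint64_t now, uint64_t ts_partial, uint64_t mask)
{
	uint64_t full;

	assert((ts_partial & ~mask) == 0);

	/* Take the high bits from "now" and the low bits from the stamp. */
	full = (now & ~mask) | ts_partial;

	/* If the low bits of "now" are below the timestamp, the partial
	 * field wrapped between capture and readout, so the timestamp
	 * belongs to the previous period.
	 */
	if ((now & mask) < ts_partial)
		full -= mask + 1;

	return full;
}
```

With an 8-bit partial field (mask 0xff), a readout of 0x1234 and a stamp of 0x30 gives 0x1230, while a readout of 0x1305 and a stamp of 0xf0 gives 0x12f0, because the low byte wrapped in between.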
Re: linux-next: Fixes tags need some work in the sound-asoc tree
Hi Pierre-Louis,

On Tue, 28 May 2019 17:22:40 -0500 Pierre-Louis Bossart wrote:
>
> On 5/28/19 4:56 PM, Stephen Rothwell wrote:
> > Hi all,
> >
> > In commit
> >
> >    be1b577d0178 ("ASoC: SOF: Intel: hda: fix the hda init chip")
> >
> > Fixes tag
> >
> >    Fixes: 8a300c8fb17 ("ASoC: SOF: Intel: Add HDA controller for Intel DSP")
>
> Sorry about that, not sure how I managed to add an off-by-one in all
> these tags. Checkpatch.pl --strict did not report any issues, something
> must be broken either in my setup or the script.
> Not sure how I can fix this now?

It's not worth the rebase necessary to fix them.  Just use it as a
learning experience.

-- 
Cheers,
Stephen Rothwell
Re: [PATCH net-next 0/2] net: stmmac: dwmac-meson: update with SPDX Licence identifier
From: Neil Armstrong Date: Mon, 27 May 2019 15:46:21 +0200 > Update the SPDX Licence identifier for the Amlogic Meson6 and Meson8 dwmac > glue drivers. Series applied.
Re: [PATCH net-next] net: mvpp2: cls: Remove unnessesary check in mvpp2_ethtool_cls_rule_ins
From: YueHaibing Date: Mon, 27 May 2019 21:46:46 +0800 > Fix smatch warning: > > drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c:1236 > mvpp2_ethtool_cls_rule_ins() warn: unsigned 'info->fs.location' is never > less than zero. > > 'info->fs.location' is u32 type, never less than zero. > > Signed-off-by: YueHaibing This doesn't apply to net-next.
Re: [PATCH v2 08/10] Input: elan_i2c - export true width/height
We do still use a maxed out major axis as a signal for a palm in the touchscreen logic, but I'm not too concerned because if that axis is maxed out, the contact should probably be treated as a palm anyway...

I'm more concerned with this affecting our gesture detection for touchpad. It looks like this change would cause all contacts to be reported as some percentage bigger than they are currently. Can you give me an idea of how big that percentage is?

On Tue, May 28, 2019 at 11:13 AM Harry Cutts wrote: > > On Mon, 27 May 2019 at 18:21, Dmitry Torokhov > wrote: > > > > Hi Benjamin, KT, > > > > On Mon, May 27, 2019 at 11:55:01AM +0800, 廖崇榮 wrote: > > > Hi > > > > > > -Original Message- > > > From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com] > > > Sent: Friday, May 24, 2019 5:37 PM > > > To: Dmitry Torokhov; KT Liao; Rob Herring; Aaron Ma; Hans de Goede > > > Cc: open list:HID CORE LAYER; lkml; devicet...@vger.kernel.org > > > Subject: Re: [PATCH v2 08/10] Input: elan_i2c - export true width/height > > > > > > On Tue, May 21, 2019 at 3:28 PM Benjamin Tissoires > > > wrote: > > > > > > > > The width/height is actually in the same unit than X and Y. So we > > > > should not tamper the data, but just set the proper resolution, so > > > > that userspace can correctly detect which touch is a palm or a finger. 
> > > > > > > > Signed-off-by: Benjamin Tissoires > > > > > > > > -- > > > > > > > > new in v2 > > > > --- > > > > drivers/input/mouse/elan_i2c_core.c | 11 --- > > > > 1 file changed, 4 insertions(+), 7 deletions(-) > > > > > > > > diff --git a/drivers/input/mouse/elan_i2c_core.c > > > > b/drivers/input/mouse/elan_i2c_core.c > > > > index 7ff044c6cd11..6f4feedb7765 100644 > > > > --- a/drivers/input/mouse/elan_i2c_core.c > > > > +++ b/drivers/input/mouse/elan_i2c_core.c > > > > @@ -45,7 +45,6 @@ > > > > #define DRIVER_NAME"elan_i2c" > > > > #define ELAN_VENDOR_ID 0x04f3 > > > > #define ETP_MAX_PRESSURE 255 > > > > -#define ETP_FWIDTH_REDUCE 90 > > > > #define ETP_FINGER_WIDTH 15 > > > > #define ETP_RETRY_COUNT3 > > > > > > > > @@ -915,12 +914,8 @@ static void elan_report_contact(struct > > > > elan_tp_data *data, > > > > return; > > > > } > > > > > > > > - /* > > > > -* To avoid treating large finger as palm, let's reduce > > > > the > > > > -* width x and y per trace. > > > > -*/ > > > > - area_x = mk_x * (data->width_x - ETP_FWIDTH_REDUCE); > > > > - area_y = mk_y * (data->width_y - ETP_FWIDTH_REDUCE); > > > > + area_x = mk_x * data->width_x; > > > > + area_y = mk_y * data->width_y; > > > > > > > > major = max(area_x, area_y); > > > > minor = min(area_x, area_y); @@ -1123,8 +1118,10 @@ > > > > static int elan_setup_input_device(struct elan_tp_data *data) > > > > ETP_MAX_PRESSURE, 0, 0); > > > > input_set_abs_params(input, ABS_MT_TOUCH_MAJOR, 0, > > > > ETP_FINGER_WIDTH * max_width, 0, 0); > > > > + input_abs_set_res(input, ABS_MT_TOUCH_MAJOR, data->x_res); > > > > input_set_abs_params(input, ABS_MT_TOUCH_MINOR, 0, > > > > ETP_FINGER_WIDTH * min_width, 0, 0); > > > > + input_abs_set_res(input, ABS_MT_TOUCH_MINOR, data->y_res); > > > > > > I had a chat with Peter on Wednesday, and he mentioned that this is > > > dangerous as Major/Minor are max/min of the width and height. 
And given > > > that we might have 2 different resolutions, we would need to do some > > > computation in the kernel to ensure the data is correct with respect to > > > the resolution. > > > > > > TL;DR: I don't think we should export the resolution there :( > > > > > > KT, should I drop the patch entirely, or is there a strong argument for > > > keeping the ETP_FWIDTH_REDUCE around? > > > I suggest you apply the patch, I have no idea why ETP_FWIDTH_REDUCE > > > existed. > > > Our FW team knows nothing about ETP_FWIDTH_REDUCE either. > > > > > > The only side effect will happen on Chromebook because such computation > > > has stayed in ChromeOS' kernel for four years. > > > Chrome's finger/palm threshold may be different from other Linux > > > distributions. > > > We will discuss it with Google once the patch is picked up by Chrome and causes > > > something wrong. > > > > Chrome has logic that contact with maximum major/minor is treated as a > > palm, so here the driver (which originally came from Chrome OS) > > artificially reduces the contact size to ensure that palm rejection > > logic does not trigger. > > > > I'm adding Harry to confirm whether we are still using this logic and to > > see if we can adjust it to be something else. > > I'm not very familiar with our touchpad code, so adding Sean O'Brien, who is.
Re: [PATCH] staging: rtl8723bs: Add missing blank lines
On Wed, May 22, 2019 at 06:41, Dan Carpenter wrote: > > On Tue, May 21, 2019 at 09:46:55PM -0300, Fabio Lima wrote: > > This patch resolves the following warning from checkpatch.pl > > WARNING: Missing a blank line after declarations > > > > Signed-off-by: Fabio Lima > > --- > > drivers/staging/rtl8723bs/core/rtw_debug.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/staging/rtl8723bs/core/rtw_debug.c > > b/drivers/staging/rtl8723bs/core/rtw_debug.c > > index 9f8446ccf..853362381 100644 > > --- a/drivers/staging/rtl8723bs/core/rtw_debug.c > > +++ b/drivers/staging/rtl8723bs/core/rtw_debug.c > > @@ -382,6 +382,7 @@ ssize_t proc_set_roam_tgt_addr(struct file *file, const > > char __user *buffer, siz > > if (buffer && !copy_from_user(tmp, buffer, sizeof(tmp))) { > > > > int num = sscanf(tmp, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx", addr, > > addr+1, addr+2, addr+3, addr+4, addr+5); > > + > > if (num == 6) > > memcpy(adapter->mlmepriv.roam_tgt_addr, addr, > > ETH_ALEN); > > > > I'm sorry but this function is really such nonsense. Can you send a > patch to re-write it instead? > > drivers/staging/rtl8723bs/core/rtw_debug.c >371 ssize_t proc_set_roam_tgt_addr(struct file *file, const char __user > *buffer, size_t count, loff_t *pos, void *data) >372 { >373 struct net_device *dev = data; >374 struct adapter *adapter = (struct adapter > *)rtw_netdev_priv(dev); >375 >376 char tmp[32]; >377 u8 addr[ETH_ALEN]; >378 >379 if (count < 1) > > This check is silly. I guess the safest thing is to change it to: > if (count < sizeof(tmp)) > >380 return -EFAULT; > > It should be return -EINVAL; > >381 >382 if (buffer && !copy_from_user(tmp, buffer, sizeof(tmp))) { > > Remove the check for if the user passes a NULL buffer, because that's > already handled in copy_from_user(). Return -EFAULT if copy_from_user() > fails. > > if (copy_from_user(tmp, buffer, sizeof(tmp))) > return -EFAULT; > > >383 > > Extra blank line. 
> >384 int num = sscanf(tmp, > "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx", addr, addr+1, addr+2, addr+3, addr+4, > addr+5); > > You will need to move the num declaration to the start of the function. > >385 if (num == 6) >386 memcpy(adapter->mlmepriv.roam_tgt_addr, addr, > ETH_ALEN); > > If num != 6 then return -EINVAL; > >387 >388 DBG_871X("set roam_tgt_addr to "MAC_FMT"\n", > MAC_ARG(adapter->mlmepriv.roam_tgt_addr)); >389 } >390 >391 return count; >392 } > > regards, > dan carpenter Thanks for your feedback. This is my first patch and I will send the second patch with modifications that you suggest. Fabio Lima
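The parsing half of the review comments above (declare num at the top of the function, fail with -EINVAL unless all six octets parse, and only then copy the address) can be sketched in userspace as follows. This is an illustration only: demo_parse_mac() and DEMO_ETH_ALEN are made-up names, and the copy_from_user() step and its -EFAULT return have no userspace equivalent, so they are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define DEMO_ETH_ALEN 6	/* stand-in for the kernel's ETH_ALEN */

/* Parse "aa:bb:cc:dd:ee:ff" into out[]. Returns 0 on success and
 * -EINVAL if fewer than six octets could be parsed; out[] is written
 * only on success.
 */
int demo_parse_mac(const char *text, unsigned char out[DEMO_ETH_ALEN])
{
	unsigned char addr[DEMO_ETH_ALEN];
	int num;

	assert(text != NULL && out != NULL);

	num = sscanf(text, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
		     &addr[0], &addr[1], &addr[2],
		     &addr[3], &addr[4], &addr[5]);
	if (num != 6)
		return -EINVAL;

	memcpy(out, addr, DEMO_ETH_ALEN);
	return 0;
}
```

Parsing into a local array first means a malformed string leaves the previously stored address untouched, which matches the intent of returning an error instead of silently continuing.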
Re: [PATCH v5 0/2] Fix issues with vmalloc flush flag
From: Rick Edgecombe
Date: Mon, 27 May 2019 14:10:56 -0700

> These two patches address issues with the recently added
> VM_FLUSH_RESET_PERMS vmalloc flag.
>
> Patch 1 addresses an issue that could cause a crash after other
> architectures besides x86 rely on this path.
>
> Patch 2 addresses an issue where in a rare case strange arguments
> could be provided to flush_tlb_kernel_range().

Another situation just occurred to me that would cause trouble on
sparc64: the address range of the main kernel image somehow ending up
being passed to flush_tlb_kernel_range().

That would flush the locked kernel mapping and crash the kernel
instantly in a completely non-recoverable way.
Re: [PATCH net-next] hinic: fix a bug in set rx mode
From: Xue Chaojing Date: Mon, 27 May 2019 22:10:05 + > in set_rx_mode, __dev_mc_sync and netdev_for_each_mc_addr will > repeatedly set the multicast mac address. so we delete this loop. > > Signed-off-by: Xue Chaojing Applied.
Re: [PATCH net] Documentation: net-sysfs: Remove duplicate PHY device documentation
From: Florian Fainelli Date: Mon, 27 May 2019 19:06:38 -0700 > Both sysfs-bus-mdio and sysfs-class-net-phydev contain the same > duplication information. There is not currently any MDIO bus specific > attribute, but there are PHY device (struct phy_device) specific > attributes. Use the more precise description from sysfs-bus-mdio and > carry that over to sysfs-class-net-phydev. > > Fixes: 86f22d04dfb5 ("net: sysfs: Document PHY device sysfs attributes") > Signed-off-by: Florian Fainelli Applied, thanks.
Re: [PATCH net-next 00/12] code optimizations & bugfixes for HNS3 driver
From: Huazhong Tan
Date: Tue, 28 May 2019 17:02:50 +0800

> This patch-set includes code optimizations and bugfixes for the HNS3
> ethernet controller driver.
>
> [patch 1/12] fixes a compile warning reported by kbuild test robot.
>
> [patch 2/12] fixes HNS3_RXD_GRO_SIZE_M macro definition error.
>
> [patch 3/12] adds a debugfs command to dump firmware information.
>
> [patch 4/12 - 10/12] adds some code optimizations and cleanups for
> reset and driver unloading.
>
> [patch 11/12 - 12/12] adds two bugfixes.

Series applied, thanks.
[PATCH v5 0/3] Qualcomm QCS404 PCIe support
This series adds support for the PCIe controller in the Qualcomm QCS404
platform.

Bjorn Andersson (3):
  PCI: qcom: Use clk_bulk API for 2.4.0 controllers
  dt-bindings: PCI: qcom: Add QCS404 to the binding
  PCI: qcom: Add QCS404 PCIe controller support

 .../devicetree/bindings/pci/qcom,pcie.txt |  25 +++-
 drivers/pci/controller/dwc/pcie-qcom.c    | 113 --
 2 files changed, 75 insertions(+), 63 deletions(-)

-- 
2.18.0
Re: [v7 PATCH 2/2] mm: vmscan: correct some vmscan counters for THP swapout
Yang Shi writes: > Since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after > swapped out"), THP can be swapped out in a whole. But, nr_reclaimed > and some other vm counters still get inc'ed by one even though a whole > THP (512 pages) gets swapped out. > > This doesn't make too much sense to memory reclaim. For example, direct > reclaim may just need reclaim SWAP_CLUSTER_MAX pages, reclaiming one THP > could fulfill it. But, if nr_reclaimed is not increased correctly, > direct reclaim may just waste time to reclaim more pages, > SWAP_CLUSTER_MAX * 512 pages in worst case. > > And, it may cause pgsteal_{kswapd|direct} is greater than > pgscan_{kswapd|direct}, like the below: > > pgsteal_kswapd 122933 > pgsteal_direct 26600225 > pgscan_kswapd 174153 > pgscan_direct 14678312 > > nr_reclaimed and nr_scanned must be fixed in parallel otherwise it would > break some page reclaim logic, e.g. > > vmpressure: this looks at the scanned/reclaimed ratio so it won't > change semantics as long as scanned & reclaimed are fixed in parallel. > > compaction/reclaim: compaction wants a certain number of physical pages > freed up before going back to compacting. > > kswapd priority raising: kswapd raises priority if we scan fewer pages > than the reclaim target (which itself is obviously expressed in order-0 > pages). As a result, kswapd can falsely raise its aggressiveness even > when it's making great progress. > > Other than nr_scanned and nr_reclaimed, some other counters, e.g. > pgactivate, nr_skipped, nr_ref_keep and nr_unmap_fail need to be fixed > too since they are user visible via cgroup, /proc/vmstat or trace > points, otherwise they would be underreported. > > When isolating pages from LRUs, nr_taken has been accounted in base > page, but nr_scanned and nr_skipped are still accounted in THP. It > doesn't make too much sense too since this may cause trace point > underreport the numbers as well. 
> > So accounting those counters in base page instead of accounting THP as > one page. > > nr_dirty, nr_unqueued_dirty, nr_congested and nr_writeback are used by > file cache, so they are not impacted by THP swap. > > This change may result in lower steal/scan ratio in some cases since > THP may get split during page reclaim, then a part of tail pages get > reclaimed instead of the whole 512 pages, but nr_scanned is accounted > by 512, particularly for direct reclaim. But, this should be not a > significant issue. > > Cc: "Huang, Ying" > Cc: Johannes Weiner > Cc: Michal Hocko > Cc: Mel Gorman > Cc: "Kirill A . Shutemov" > Cc: Hugh Dickins > Cc: Shakeel Butt > Cc: Hillf Danton > Signed-off-by: Yang Shi Looks good to me! Thanks for your effort! Reviewed-by: "Huang, Ying" Best Regards, Huang, Ying
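For illustration only, the over-reclaim described in the commit message can be modelled in userspace. This is a rough sketch, not the kernel's reclaim code: `struct fake_page`, `nr_base_pages()` and `reclaim_until()` are hypothetical names, and the 512 pages per THP assumes 2 MiB huge pages on 4 KiB base pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SWAP_CLUSTER_MAX 32UL   /* reclaim target used by direct reclaim */
#define HPAGE_NR_PAGES   512UL  /* assumed base pages per THP */

/* Hypothetical stand-in for a page on the LRU. */
struct fake_page {
	bool is_thp;
};

/* Number of base pages a (possibly huge) page represents. */
static unsigned long nr_base_pages(const struct fake_page *page)
{
	return page->is_thp ? HPAGE_NR_PAGES : 1UL;
}

/*
 * Walk the list and "reclaim" entries until the counter reaches the
 * target. Returns how many base pages were really freed. The counter
 * advances either by one per entry (old accounting) or by the number
 * of base pages (fixed accounting).
 */
static unsigned long reclaim_until(const struct fake_page *pages, size_t n,
				   unsigned long target, bool count_base_pages)
{
	unsigned long nr_reclaimed = 0; /* what the counter reports */
	unsigned long freed = 0;        /* what was actually freed */
	size_t i;

	for (i = 0; i < n && nr_reclaimed < target; i++) {
		unsigned long nr = nr_base_pages(&pages[i]);

		freed += nr;
		nr_reclaimed += count_base_pages ? nr : 1UL;
	}
	return freed;
}
```

With a list made only of THPs and a target of SWAP_CLUSTER_MAX, the old accounting in this model frees SWAP_CLUSTER_MAX * 512 base pages before it stops, while the fixed accounting stops after the first THP.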
[PATCH v5 1/3] PCI: qcom: Use clk_bulk API for 2.4.0 controllers
Before introducing the QCS404 platform, which uses the same PCIe controller as IPQ4019, migrate this to use the bulk clock API, in order to make the error paths slighly cleaner. Acked-by: Stanimir Varbanov Reviewed-by: Niklas Cassel Reviewed-by: Vinod Koul Signed-off-by: Bjorn Andersson --- Changes since v4: - Renamed "err_clks" label - Picked up Vinod's r-b and Stanimir's a-b drivers/pci/controller/dwc/pcie-qcom.c | 53 -- 1 file changed, 16 insertions(+), 37 deletions(-) diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c index 0ed235d560e3..23dc01212508 100644 --- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -112,10 +112,10 @@ struct qcom_pcie_resources_2_3_2 { struct regulator_bulk_data supplies[QCOM_PCIE_2_3_2_MAX_SUPPLY]; }; +#define QCOM_PCIE_2_4_0_MAX_CLOCKS 3 struct qcom_pcie_resources_2_4_0 { - struct clk *aux_clk; - struct clk *master_clk; - struct clk *slave_clk; + struct clk_bulk_data clks[QCOM_PCIE_2_4_0_MAX_CLOCKS]; + int num_clks; struct reset_control *axi_m_reset; struct reset_control *axi_s_reset; struct reset_control *pipe_reset; @@ -638,18 +638,17 @@ static int qcom_pcie_get_resources_2_4_0(struct qcom_pcie *pcie) struct qcom_pcie_resources_2_4_0 *res = >res.v2_4_0; struct dw_pcie *pci = pcie->pci; struct device *dev = pci->dev; + int ret; - res->aux_clk = devm_clk_get(dev, "aux"); - if (IS_ERR(res->aux_clk)) - return PTR_ERR(res->aux_clk); + res->clks[0].id = "aux"; + res->clks[1].id = "master_bus"; + res->clks[2].id = "slave_bus"; - res->master_clk = devm_clk_get(dev, "master_bus"); - if (IS_ERR(res->master_clk)) - return PTR_ERR(res->master_clk); + res->num_clks = 3; - res->slave_clk = devm_clk_get(dev, "slave_bus"); - if (IS_ERR(res->slave_clk)) - return PTR_ERR(res->slave_clk); + ret = devm_clk_bulk_get(dev, res->num_clks, res->clks); + if (ret < 0) + return ret; res->axi_m_reset = devm_reset_control_get_exclusive(dev, "axi_m"); if (IS_ERR(res->axi_m_reset)) 
@@ -719,9 +718,7 @@ static void qcom_pcie_deinit_2_4_0(struct qcom_pcie *pcie) reset_control_assert(res->axi_m_sticky_reset); reset_control_assert(res->pwr_reset); reset_control_assert(res->ahb_reset); - clk_disable_unprepare(res->aux_clk); - clk_disable_unprepare(res->master_clk); - clk_disable_unprepare(res->slave_clk); + clk_bulk_disable_unprepare(res->num_clks, res->clks); } static int qcom_pcie_init_2_4_0(struct qcom_pcie *pcie) @@ -850,23 +847,9 @@ static int qcom_pcie_init_2_4_0(struct qcom_pcie *pcie) usleep_range(1, 12000); - ret = clk_prepare_enable(res->aux_clk); - if (ret) { - dev_err(dev, "cannot prepare/enable iface clock\n"); - goto err_clk_aux; - } - - ret = clk_prepare_enable(res->master_clk); - if (ret) { - dev_err(dev, "cannot prepare/enable core clock\n"); - goto err_clk_axi_m; - } - - ret = clk_prepare_enable(res->slave_clk); - if (ret) { - dev_err(dev, "cannot prepare/enable phy clock\n"); - goto err_clk_axi_s; - } + ret = clk_bulk_prepare_enable(res->num_clks, res->clks); + if (ret) + goto err_clks; /* enable PCIe clocks and resets */ val = readl(pcie->parf + PCIE20_PARF_PHY_CTRL); @@ -891,11 +874,7 @@ static int qcom_pcie_init_2_4_0(struct qcom_pcie *pcie) return 0; -err_clk_axi_s: - clk_disable_unprepare(res->master_clk); -err_clk_axi_m: - clk_disable_unprepare(res->aux_clk); -err_clk_aux: +err_clks: reset_control_assert(res->ahb_reset); err_rst_ahb: reset_control_assert(res->pwr_reset); -- 2.18.0
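As a side note on why the error path shrinks to a single label: a bulk enable helper can undo its own partial work on failure, so the caller has nothing to unwind clock by clock. The userspace sketch below only models that idea. All `fake_*` names are hypothetical and this is not the clk framework's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one entry of a clock table. */
struct fake_clk {
	const char *id;
	bool enabled;
	bool broken;    /* simulate a clock that fails to enable */
};

static int fake_clk_enable(struct fake_clk *clk)
{
	if (clk->broken)
		return -1;
	clk->enabled = true;
	return 0;
}

static void fake_clk_disable(struct fake_clk *clk)
{
	clk->enabled = false;
}

/* Disable the first num_clks entries, in reverse order. */
static void fake_clk_bulk_disable(int num_clks, struct fake_clk *clks)
{
	while (--num_clks >= 0)
		fake_clk_disable(&clks[num_clks]);
}

/*
 * Enable all clocks in order. On failure, undo the ones already
 * enabled so the caller needs a single error label.
 */
static int fake_clk_bulk_enable(int num_clks, struct fake_clk *clks)
{
	int i, ret;

	for (i = 0; i < num_clks; i++) {
		ret = fake_clk_enable(&clks[i]);
		if (ret) {
			fake_clk_bulk_disable(i, clks);
			return ret;
		}
	}
	return 0;
}
```

In this model a failure on the third clock leaves the first two disabled again, which is what lets the three err_clk_* labels in the old code collapse into one.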
[PATCH v5 3/3] PCI: qcom: Add QCS404 PCIe controller support
The QCS404 platform contains a PCIe controller of version 2.4.0 and a Qualcomm PCIe2 PHY. The driver already supports version 2.4.0, for the IPQ4019, but this support touches clocks and resets related to the PHY as well, and there's no upstream driver for the PHY. On QCS404 we must initialize the PHY, so a separate PHY driver is implemented to take care of this and the controller driver is updated to not require the PHY related resources. This is done by relying on the fact that operations in both the clock and reset framework are nops when passed NULL, so we can isolate this change to only the get_resource function. For QCS404 we also need to enable the AHB (iface) clock, in order to access the register space of the controller, but as this is not part of the IPQ4019 DT binding this is only added for new users of the 2.4.0 controller. Acked-by: Stanimir Varbanov Reviewed-by: Niklas Cassel Reviewed-by: Vinod Koul Signed-off-by: Bjorn Andersson --- Changes since v4: - Picked up Vinod's r-b and Stanimir's a-b drivers/pci/controller/dwc/pcie-qcom.c | 64 +++--- 1 file changed, 38 insertions(+), 26 deletions(-) diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c index 23dc01212508..da5dd3639a49 100644 --- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -112,7 +112,7 @@ struct qcom_pcie_resources_2_3_2 { struct regulator_bulk_data supplies[QCOM_PCIE_2_3_2_MAX_SUPPLY]; }; -#define QCOM_PCIE_2_4_0_MAX_CLOCKS 3 +#define QCOM_PCIE_2_4_0_MAX_CLOCKS 4 struct qcom_pcie_resources_2_4_0 { struct clk_bulk_data clks[QCOM_PCIE_2_4_0_MAX_CLOCKS]; int num_clks; @@ -638,13 +638,16 @@ static int qcom_pcie_get_resources_2_4_0(struct qcom_pcie *pcie) struct qcom_pcie_resources_2_4_0 *res = >res.v2_4_0; struct dw_pcie *pci = pcie->pci; struct device *dev = pci->dev; + bool is_ipq = of_device_is_compatible(dev->of_node, "qcom,pcie-ipq4019"); int ret; res->clks[0].id = "aux"; res->clks[1].id = "master_bus"; 
res->clks[2].id = "slave_bus"; + res->clks[3].id = "iface"; - res->num_clks = 3; + /* qcom,pcie-ipq4019 is defined without "iface" */ + res->num_clks = is_ipq ? 3 : 4; ret = devm_clk_bulk_get(dev, res->num_clks, res->clks); if (ret < 0) @@ -658,27 +661,33 @@ static int qcom_pcie_get_resources_2_4_0(struct qcom_pcie *pcie) if (IS_ERR(res->axi_s_reset)) return PTR_ERR(res->axi_s_reset); - res->pipe_reset = devm_reset_control_get_exclusive(dev, "pipe"); - if (IS_ERR(res->pipe_reset)) - return PTR_ERR(res->pipe_reset); - - res->axi_m_vmid_reset = devm_reset_control_get_exclusive(dev, -"axi_m_vmid"); - if (IS_ERR(res->axi_m_vmid_reset)) - return PTR_ERR(res->axi_m_vmid_reset); - - res->axi_s_xpu_reset = devm_reset_control_get_exclusive(dev, - "axi_s_xpu"); - if (IS_ERR(res->axi_s_xpu_reset)) - return PTR_ERR(res->axi_s_xpu_reset); - - res->parf_reset = devm_reset_control_get_exclusive(dev, "parf"); - if (IS_ERR(res->parf_reset)) - return PTR_ERR(res->parf_reset); - - res->phy_reset = devm_reset_control_get_exclusive(dev, "phy"); - if (IS_ERR(res->phy_reset)) - return PTR_ERR(res->phy_reset); + if (is_ipq) { + /* +* These resources relates to the PHY or are secure clocks, but +* are controlled here for IPQ4019 +*/ + res->pipe_reset = devm_reset_control_get_exclusive(dev, "pipe"); + if (IS_ERR(res->pipe_reset)) + return PTR_ERR(res->pipe_reset); + + res->axi_m_vmid_reset = devm_reset_control_get_exclusive(dev, + "axi_m_vmid"); + if (IS_ERR(res->axi_m_vmid_reset)) + return PTR_ERR(res->axi_m_vmid_reset); + + res->axi_s_xpu_reset = devm_reset_control_get_exclusive(dev, + "axi_s_xpu"); + if (IS_ERR(res->axi_s_xpu_reset)) + return PTR_ERR(res->axi_s_xpu_reset); + + res->parf_reset = devm_reset_control_get_exclusive(dev, "parf"); + if (IS_ERR(res->parf_reset)) + return PTR_ERR(res->parf_reset); + + res->phy_reset = devm_reset_control_get_exclusive(dev, "phy"); + if (IS_ERR(res->phy_reset)) + return PTR_ERR(res->phy_reset); + } res->axi_m_sticky_reset = 
devm_reset_control_get_exclusive(dev,
[PATCH v5 2/3] dt-bindings: PCI: qcom: Add QCS404 to the binding
The Qualcomm QCS404 platform contains a PCIe controller, add this to the Qualcomm PCI binding document. The controller is the same version as the one used in IPQ4019, but the PHY part is described separately, hence the difference in clocks and resets. Reviewed-by: Rob Herring Reviewed-by: Vinod Koul Signed-off-by: Bjorn Andersson --- Changes since v4: - Picked up Vinod's r-b .../devicetree/bindings/pci/qcom,pcie.txt | 25 +-- 1 file changed, 23 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/pci/qcom,pcie.txt b/Documentation/devicetree/bindings/pci/qcom,pcie.txt index 1fd703bd73e0..ada80b01bf0c 100644 --- a/Documentation/devicetree/bindings/pci/qcom,pcie.txt +++ b/Documentation/devicetree/bindings/pci/qcom,pcie.txt @@ -10,6 +10,7 @@ - "qcom,pcie-msm8996" for msm8996 or apq8096 - "qcom,pcie-ipq4019" for ipq4019 - "qcom,pcie-ipq8074" for ipq8074 + - "qcom,pcie-qcs404" for qcs404 - reg: Usage: required @@ -116,6 +117,15 @@ - "ahb" AHB clock - "aux" Auxiliary clock +- clock-names: + Usage: required for qcs404 + Value type: + Definition: Should contain the following entries + - "iface" AHB clock + - "aux" Auxiliary clock + - "master_bus" AXI Master clock + - "slave_bus" AXI Slave clock + - resets: Usage: required Value type: @@ -167,6 +177,17 @@ - "ahb" AHB Reset - "axi_m_sticky"AXI Master Sticky reset +- reset-names: + Usage: required for qcs404 + Value type: + Definition: Should contain the following entries + - "axi_m" AXI Master reset + - "axi_s" AXI Slave reset + - "axi_m_sticky"AXI Master Sticky reset + - "pipe_sticky" PIPE sticky reset + - "pwr" PWR reset + - "ahb" AHB reset + - power-domains: Usage: required for apq8084 and msm8996/apq8096 Value type: @@ -195,12 +216,12 @@ Definition: A phandle to the PCIe endpoint power supply - phys: - Usage: required for apq8084 + Usage: required for apq8084 and qcs404 Value type: Definition: List of phandle(s) as listed in phy-names property - phy-names: - Usage: required for apq8084 + Usage: 
required for apq8084 and qcs404 Value type: Definition: Should contain "pciephy" -- 2.18.0
Re: [PATCH v3 1/3] PCI: qcom: Use clk_bulk API for 2.4.0 controllers
On Thu 16 May 02:14 PDT 2019, Stanimir Varbanov wrote: > Hi Bjorn, > > On 5/2/19 3:19 AM, Bjorn Andersson wrote: > > Before introducing the QCS404 platform, which uses the same PCIe > > controller as IPQ4019, migrate this to use the bulk clock API, in order > > to make the error paths slighly cleaner. > > > > Acked-by: Stanimir Varbanov > > Reviewed-by: Niklas Cassel > > Signed-off-by: Bjorn Andersson > > --- > > > > Changes since v2: > > - Defined QCOM_PCIE_2_4_0_MAX_CLOCKS > > > > drivers/pci/controller/dwc/pcie-qcom.c | 49 -- > > 1 file changed, 14 insertions(+), 35 deletions(-) > > > > diff --git a/drivers/pci/controller/dwc/pcie-qcom.c > > b/drivers/pci/controller/dwc/pcie-qcom.c > > index 0ed235d560e3..d740cbe0e56d 100644 > > --- a/drivers/pci/controller/dwc/pcie-qcom.c > > +++ b/drivers/pci/controller/dwc/pcie-qcom.c > > @@ -112,10 +112,10 @@ struct qcom_pcie_resources_2_3_2 { > > struct regulator_bulk_data supplies[QCOM_PCIE_2_3_2_MAX_SUPPLY]; > > }; > > > > +#define QCOM_PCIE_2_4_0_MAX_CLOCKS 3 > > struct qcom_pcie_resources_2_4_0 { > > - struct clk *aux_clk; > > - struct clk *master_clk; > > - struct clk *slave_clk; > > + struct clk_bulk_data clks[QCOM_PCIE_2_4_0_MAX_CLOCKS]; > > + int num_clks; > > struct reset_control *axi_m_reset; > > struct reset_control *axi_s_reset; > > struct reset_control *pipe_reset; > > @@ -638,18 +638,17 @@ static int qcom_pcie_get_resources_2_4_0(struct > > qcom_pcie *pcie) > > struct qcom_pcie_resources_2_4_0 *res = >res.v2_4_0; > > struct dw_pcie *pci = pcie->pci; > > struct device *dev = pci->dev; > > + int ret; > > > > - res->aux_clk = devm_clk_get(dev, "aux"); > > - if (IS_ERR(res->aux_clk)) > > - return PTR_ERR(res->aux_clk); > > + res->clks[0].id = "aux"; > > + res->clks[1].id = "master_bus"; > > + res->clks[2].id = "slave_bus"; > > > > - res->master_clk = devm_clk_get(dev, "master_bus"); > > - if (IS_ERR(res->master_clk)) > > - return PTR_ERR(res->master_clk); > > + res->num_clks = 3; > > Use the new fresh define 
QCOM_PCIE_2_4_0_MAX_CLOCKS? > As I replace it in patch 3/3 with a value different from "max clocks", I don't think it makes sense to use the define here. So I'm leaving this as is. > > > > - res->slave_clk = devm_clk_get(dev, "slave_bus"); > > - if (IS_ERR(res->slave_clk)) > > - return PTR_ERR(res->slave_clk); > > + ret = devm_clk_bulk_get(dev, res->num_clks, res->clks); > > + if (ret < 0) > > + return ret; > > > > res->axi_m_reset = devm_reset_control_get_exclusive(dev, "axi_m"); > > if (IS_ERR(res->axi_m_reset)) > > @@ -719,9 +718,7 @@ static void qcom_pcie_deinit_2_4_0(struct qcom_pcie > > *pcie) > > reset_control_assert(res->axi_m_sticky_reset); > > reset_control_assert(res->pwr_reset); > > reset_control_assert(res->ahb_reset); > > - clk_disable_unprepare(res->aux_clk); > > - clk_disable_unprepare(res->master_clk); > > - clk_disable_unprepare(res->slave_clk); > > + clk_bulk_disable_unprepare(res->num_clks, res->clks); > > } > > > > static int qcom_pcie_init_2_4_0(struct qcom_pcie *pcie) > > @@ -850,23 +847,9 @@ static int qcom_pcie_init_2_4_0(struct qcom_pcie *pcie) > > > > usleep_range(1, 12000); > > > > - ret = clk_prepare_enable(res->aux_clk); > > - if (ret) { > > - dev_err(dev, "cannot prepare/enable iface clock\n"); > > + ret = clk_bulk_prepare_enable(res->num_clks, res->clks); > > + if (ret) > > goto err_clk_aux; > > Maybe you have to change the name of the label too? > Updated this and posted v5. Should be good to be merged now. Thanks for your reviews! Regards, Bjorn
Re: [PATCH v2 1/8] vsock/virtio: limit the memory used per-socket
On 2019/5/29 上午12:45, Stefano Garzarella wrote: On Wed, May 15, 2019 at 10:48:44AM +0800, Jason Wang wrote: On 2019/5/15 上午12:35, Stefano Garzarella wrote: On Tue, May 14, 2019 at 11:25:34AM +0800, Jason Wang wrote: On 2019/5/14 上午1:23, Stefano Garzarella wrote: On Mon, May 13, 2019 at 05:58:53PM +0800, Jason Wang wrote: On 2019/5/10 下午8:58, Stefano Garzarella wrote: +static struct virtio_vsock_buf * +virtio_transport_alloc_buf(struct virtio_vsock_pkt *pkt, bool zero_copy) +{ + struct virtio_vsock_buf *buf; + + if (pkt->len == 0) + return NULL; + + buf = kzalloc(sizeof(*buf), GFP_KERNEL); + if (!buf) + return NULL; + + /* If the buffer in the virtio_vsock_pkt is full, we can move it to +* the new virtio_vsock_buf avoiding the copy, because we are sure that +* we are not use more memory than that counted by the credit mechanism. +*/ + if (zero_copy && pkt->len == pkt->buf_len) { + buf->addr = pkt->buf; + pkt->buf = NULL; + } else { Is the copy still needed if we're just few bytes less? We meet similar issue for virito-net, and virtio-net solve this by always copy first 128bytes for big packets. See receive_big() I'm seeing, It is more sophisticated. IIUC, virtio-net allocates a sk_buff with 128 bytes of buffer, then copies the first 128 bytes, then adds the buffer used to receive the packet as a frag to the skb. Yes and the point is if the packet is smaller than 128 bytes the pages will be recycled. So it's avoid the overhead of allocation of a large buffer. I got it. Just a curiosity, why the threshold is 128 bytes? From its name (GOOD_COPY_LEN), I think it just a value that won't lose much performance, e.g the size two cachelines. Jason, Stefan, since I'm removing the patches to increase the buffers to 64 KiB and I'm adding a threshold for small packets, I would simplify this patch, removing the new buffer allocation and copying small packets into the buffers already queued (if there is a space). In this way, I should solve the issue of 1 byte packets. 
Do you think that would be better? I think so. Thanks Thanks, Stefano
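The idea discussed above (copy small packets into a buffer that is already queued, if there is room) might look roughly like the userspace sketch below. It is only a model under stated assumptions: `struct rx_buf`, `try_coalesce()`, the 128-byte threshold (borrowed from the GOOD_COPY_LEN value mentioned in the thread) and the 4096-byte buffer size are all hypothetical here, not the code that was eventually posted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define COPY_THRESHOLD 128  /* assumed; mirrors virtio-net's GOOD_COPY_LEN */
#define RX_BUF_SIZE    4096 /* assumed size of one queued receive buffer */

/* Hypothetical stand-in for the last buffer on a socket's rx queue. */
struct rx_buf {
	unsigned char data[RX_BUF_SIZE];
	size_t len;     /* bytes already used */
};

/*
 * Try to append a small packet to the last queued buffer.
 * Returns true if the payload was copied, false if the caller
 * should queue the packet's own buffer instead.
 */
static bool try_coalesce(struct rx_buf *last, const void *payload, size_t len)
{
	if (!last || len > COPY_THRESHOLD)
		return false;

	assert(last->len <= sizeof(last->data));
	if (len > sizeof(last->data) - last->len)
		return false;   /* not enough room left */

	memcpy(last->data + last->len, payload, len);
	last->len += len;
	return true;
}
```

In this model a 1-byte packet costs one byte of an existing buffer instead of a whole new allocation, which is the case the thread is worried about.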
Re: [PATCH v3 0/2] Qualcomm PCIe2 PHY
On Wed 01 May 17:14 PDT 2019, Bjorn Andersson wrote: > The Qualcomm PCIe2 PHY is based on design from Synopsys and found in > several different platforms where the QMP PHY isn't used. > Kishon, any feedback on this or would you be willing to pick it up? Regards, Bjorn > Bjorn Andersson (2): > dt-bindings: phy: Add binding for Qualcomm PCIe2 PHY > phy: qcom: Add Qualcomm PCIe2 PHY driver > > .../bindings/phy/qcom-pcie2-phy.txt | 42 +++ > drivers/phy/qualcomm/Kconfig | 8 + > drivers/phy/qualcomm/Makefile | 1 + > drivers/phy/qualcomm/phy-qcom-pcie2.c | 331 ++ > 4 files changed, 382 insertions(+) > create mode 100644 Documentation/devicetree/bindings/phy/qcom-pcie2-phy.txt > create mode 100644 drivers/phy/qualcomm/phy-qcom-pcie2.c > > -- > 2.18.0 >
Re: [PATCH v2] qcom: apr: Make apr callbacks in non-atomic context
On Fri 08 Feb 09:55 PST 2019, Srinivas Kandagatla wrote: > APR communication with DSP is not atomic in nature. > Its request-response type. Trying to pretend that these are atomic > and invoking apr client callbacks directly under atomic/irq context has > endless issues with soundcard. It makes more sense to convert these > to nonatomic calls. This also coverts all the dais to be nonatomic. > > All the callbacks are now invoked as part of rx work queue. > > Signed-off-by: Srinivas Kandagatla > Reviewed-by: Bjorn Andersson Picked up Thanks, Bjorn > --- > Changes since v1: > - flush and destroy work queue after removing the device >to avoid active communication from device. suggested by Bjorn. > > drivers/soc/qcom/apr.c | 74 +++--- > 1 file changed, 69 insertions(+), 5 deletions(-) > > diff --git a/drivers/soc/qcom/apr.c b/drivers/soc/qcom/apr.c > index 74f8b9607daa..039e3aa6f5e0 100644 > --- a/drivers/soc/qcom/apr.c > +++ b/drivers/soc/qcom/apr.c > @@ -8,6 +8,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -17,8 +18,18 @@ struct apr { > struct rpmsg_endpoint *ch; > struct device *dev; > spinlock_t svcs_lock; > + spinlock_t rx_lock; > struct idr svcs_idr; > int dest_domain_id; > + struct workqueue_struct *rxwq; > + struct work_struct rx_work; > + struct list_head rx_list; > +}; > + > +struct apr_rx_buf { > + struct list_head node; > + int len; > + uint8_t buf[]; > }; > > /** > @@ -62,11 +73,7 @@ static int apr_callback(struct rpmsg_device *rpdev, void > *buf, > int len, void *priv, u32 addr) > { > struct apr *apr = dev_get_drvdata(>dev); > - uint16_t hdr_size, msg_type, ver, svc_id; > - struct apr_device *svc = NULL; > - struct apr_driver *adrv = NULL; > - struct apr_resp_pkt resp; > - struct apr_hdr *hdr; > + struct apr_rx_buf *abuf; > unsigned long flags; > > if (len <= APR_HDR_SIZE) { > @@ -75,6 +82,34 @@ static int apr_callback(struct rpmsg_device *rpdev, void > *buf, > return -EINVAL; > } > > + abuf = kzalloc(sizeof(*abuf) 
+ len, GFP_ATOMIC); > + if (!abuf) > + return -ENOMEM; > + > + abuf->len = len; > + memcpy(abuf->buf, buf, len); > + > + spin_lock_irqsave(>rx_lock, flags); > + list_add_tail(>node, >rx_list); > + spin_unlock_irqrestore(>rx_lock, flags); > + > + queue_work(apr->rxwq, >rx_work); > + > + return 0; > +} > + > + > +static int apr_do_rx_callback(struct apr *apr, struct apr_rx_buf *abuf) > +{ > + uint16_t hdr_size, msg_type, ver, svc_id; > + struct apr_device *svc = NULL; > + struct apr_driver *adrv = NULL; > + struct apr_resp_pkt resp; > + struct apr_hdr *hdr; > + unsigned long flags; > + void *buf = abuf->buf; > + int len = abuf->len; > + > hdr = buf; > ver = APR_HDR_FIELD_VER(hdr->hdr_field); > if (ver > APR_PKT_VER + 1) > @@ -132,6 +167,23 @@ static int apr_callback(struct rpmsg_device *rpdev, void > *buf, > return 0; > } > > +static void apr_rxwq(struct work_struct *work) > +{ > + struct apr *apr = container_of(work, struct apr, rx_work); > + struct apr_rx_buf *abuf, *b; > + unsigned long flags; > + > + if (!list_empty(>rx_list)) { > + list_for_each_entry_safe(abuf, b, >rx_list, node) { > + apr_do_rx_callback(apr, abuf); > + spin_lock_irqsave(>rx_lock, flags); > + list_del(>node); > + spin_unlock_irqrestore(>rx_lock, flags); > + kfree(abuf); > + } > + } > +} > + > static int apr_device_match(struct device *dev, struct device_driver *drv) > { > struct apr_device *adev = to_apr_device(dev); > @@ -285,6 +337,14 @@ static int apr_probe(struct rpmsg_device *rpdev) > dev_set_drvdata(dev, apr); > apr->ch = rpdev->ept; > apr->dev = dev; > + apr->rxwq = create_singlethread_workqueue("qcom_apr_rx"); > + if (!apr->rxwq) { > + dev_err(apr->dev, "Failed to start Rx WQ\n"); > + return -ENOMEM; > + } > + INIT_WORK(>rx_work, apr_rxwq); > + INIT_LIST_HEAD(>rx_list); > + spin_lock_init(>rx_lock); > spin_lock_init(>svcs_lock); > idr_init(>svcs_idr); > of_register_apr_devices(dev); > @@ -303,7 +363,11 @@ static int apr_remove_device(struct device *dev, void > *null) > > static void 
apr_remove(struct rpmsg_device *rpdev) > { > + struct apr *apr = dev_get_drvdata(>dev); > + > device_for_each_child(>dev, NULL, apr_remove_device); > + flush_workqueue(apr->rxwq); > + destroy_workqueue(apr->rxwq); > } > > /* > -- > 2.20.1 >
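For readers skimming the patch, the pattern is: the receive callback only copies the message and queues it, and a worker later drains the queue and runs the client callbacks. A single-threaded userspace sketch of that split follows. The names (`rx_enqueue()`, `rx_drain()`, `sum_bytes()`) are hypothetical, and the locking that the patch does with rx_lock is deliberately left out, so this is not a drop-in model of the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One queued message; the payload follows the header in one allocation. */
struct rx_buf {
	struct rx_buf *next;
	int len;
	uint8_t buf[];
};

/* Hypothetical stand-in for the per-device rx state. */
struct rx_queue {
	struct rx_buf *head;
	struct rx_buf **tail;
};

static void rx_queue_init(struct rx_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
}

/* "Atomic" side: copy the message and queue it, do no real work here. */
static int rx_enqueue(struct rx_queue *q, const void *data, int len)
{
	struct rx_buf *b;

	if (len <= 0)
		return -1;
	b = malloc(sizeof(*b) + (size_t)len);
	if (!b)
		return -1;

	b->next = NULL;
	b->len = len;
	memcpy(b->buf, data, (size_t)len);

	*q->tail = b;
	q->tail = &b->next;
	return 0;
}

/* Example handler: adds up all payload bytes into *(unsigned int *)ctx. */
static void sum_bytes(const uint8_t *buf, int len, void *ctx)
{
	unsigned int *total = ctx;
	int i;

	for (i = 0; i < len; i++)
		*total += buf[i];
}

/*
 * "Worker" side: detach each entry before handling it, then free it.
 * Returns the number of messages handled.
 */
static int rx_drain(struct rx_queue *q,
		    void (*handle)(const uint8_t *buf, int len, void *ctx),
		    void *ctx)
{
	int handled = 0;

	while (q->head) {
		struct rx_buf *b = q->head;

		q->head = b->next;
		if (!q->head)
			q->tail = &q->head;

		handle(b->buf, b->len, ctx);
		free(b);
		handled++;
	}
	return handled;
}
```

Note that the sketch unlinks an entry before calling the handler. With a real producer running concurrently, both the unlink and the enqueue would have to happen under the lock.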
[PATCH] signal/ptrace: Don't leak uninitialized kernel memory with PTRACE_PEEK_SIGINFO
Recently syzbot in conjunction with KMSAN reported that ptrace_peek_siginfo can copy an uninitialized siginfo to userspace. Inspecting ptrace_peek_siginfo confirms this.

The problem is that off, when initialized from arg.off, can be initialized to a negative value. At that point the "if (off >= 0)" test, which is meant to detect that off became negative, fails because off started off negative.

Prevent the core problem by adding a variable found that is only true if a siginfo is found and copied to a temporary in preparation for being copied to userspace.

Prevent arg.off from being truncated when being assigned to off by testing that off is <= the maximum possible value of off.

Convert off to an unsigned long so that we should not have to truncate arg.off, we have well defined overflow behavior so if we add another check we won't risk fighting undefined compiler behavior, and so that we have a type whose maximum value is easy to test for.

Cc: Andrei Vagin
Cc: sta...@vger.kernel.org
Reported-by: syzbot+0d602a1b0d8c95bdf...@syzkaller.appspotmail.com
Fixes: 84c751bd4aeb ("ptrace: add ability to retrieve signals without removing from a queue (v4)")
Signed-off-by: "Eric W. Biederman"
---
Comments? Concerns? Otherwise I will queue this up and send it to Linus.
kernel/ptrace.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/kernel/ptrace.c b/kernel/ptrace.c index 6f357f4fc859..4c2b24a885d3 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -704,6 +704,10 @@ static int ptrace_peek_siginfo(struct task_struct *child, if (arg.nr < 0) return -EINVAL; + /* Ensure arg.off fits in an unsigned */ + if (arg.off > ULONG_MAX) + return 0; + if (arg.flags & PTRACE_PEEKSIGINFO_SHARED) pending = >signal->shared_pending; else @@ -711,18 +715,20 @@ static int ptrace_peek_siginfo(struct task_struct *child, for (i = 0; i < arg.nr; ) { kernel_siginfo_t info; - s32 off = arg.off + i; + unsigned long off = arg.off + i; + bool found = false; spin_lock_irq(>sighand->siglock); list_for_each_entry(q, >list, list) { if (!off--) { + found = true; copy_siginfo(, >info); break; } } spin_unlock_irq(>sighand->siglock); - if (off >= 0) /* beyond the end of the list */ + if (!found) /* beyond the end of the list */ break; #ifdef CONFIG_COMPAT -- 2.21.0.dirty
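To make the failure mode concrete, here is a small userspace model of the loop before and after the change. It is a sketch only: `old_would_copy()` and `new_would_copy()` are hypothetical helpers, the signal queue is reduced to a length, and "copying to userspace" is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Old logic: walk a queue of queue_len entries looking for index "off".
 * Returns true if the code would go on to copy "info" to userspace;
 * *info_initialized says whether info was ever filled in.
 */
static bool old_would_copy(int32_t off, int queue_len, bool *info_initialized)
{
	int i;

	*info_initialized = false;
	for (i = 0; i < queue_len; i++) {
		if (!off--) {
			*info_initialized = true;
			break;
		}
	}
	if (off >= 0)   /* meant to mean "beyond the end of the list" */
		return false;
	return true;
}

/* New logic: an explicit flag records whether an entry was found. */
static bool new_would_copy(unsigned long off, int queue_len)
{
	bool found = false;
	int i;

	for (i = 0; i < queue_len; i++) {
		if (!off--) {
			found = true;
			break;
		}
	}
	return found;
}
```

In this model, a starting offset of -1 makes the old version report "copy" with info never initialized, while the new version with the same bit pattern as an unsigned long simply finds nothing.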
[UPSTREAM KERNEL] mm/zsmalloc.c: Add module parameter malloc_force_movable
zswap compresses swap pages into a dynamically allocated RAM-based memory pool. The memory pool should be zbud, z3fold or zsmalloc. All of them will allocate unmovable pages. It will increase the number of unmovable page blocks that will bad for anti-fragment. zsmalloc support page migration if request movable page: handle = zs_malloc(zram->mem_pool, comp_len, GFP_NOIO | __GFP_HIGHMEM | __GFP_MOVABLE); This commit adds module parameter malloc_force_movable to enable or disable zs_malloc force allocate block with gfp __GFP_HIGHMEM | __GFP_MOVABLE (disabled by default). Following part is test log in a pc that has 8G memory and 2G swap. When it disabled: ~# echo lz4 > /sys/module/zswap/parameters/compressor ~# echo zsmalloc > /sys/module/zswap/parameters/zpool ~# echo 1 > /sys/module/zswap/parameters/enabled ~# swapon /swapfile ~# cd /home/teawater/kernel/vm-scalability/ /home/teawater/kernel/vm-scalability# export unit_size=$((9 * 1024 * 1024 * 1024)) /home/teawater/kernel/vm-scalability# ./case-anon-w-seq 2717908992 bytes / 4410183 usecs = 601836 KB/s 2717908992 bytes / 4524375 usecs = 586646 KB/s 2717908992 bytes / 4558583 usecs = 582244 KB/s 2717908992 bytes / 4824261 usecs = 550179 KB/s 348046 usecs to free memory 401680 usecs to free memory 369660 usecs to free memory 180867 usecs to free memory /home/teawater/kernel/vm-scalability# cat /proc/pagetypeinfo Page block order: 9 Pages per block: 512 Free pages count per migrate type at order 0 1 2 3 4 5 6 7 8 9 10 Node0, zone DMA, typeUnmovable 1 1 1 0 2 1 1 0 1 0 0 Node0, zone DMA, type Movable 0 0 0 0 0 0 0 0 0 1 3 Node0, zone DMA, type Reclaimable 0 0 0 0 0 0 0 0 0 0 0 Node0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0 Node0, zone DMA, type CMA 0 0 0 0 0 0 0 0 0 0 0 Node0, zone DMA, type Isolate 0 0 0 0 0 0 0 0 0 0 0 Node0, zoneDMA32, typeUnmovable 13 11 10 11 10 6 7 3 1 0 0 Node0, zoneDMA32, type Movable 36 26 39 40 37 36 24 29 14 6767 Node0, zoneDMA32, type Reclaimable 0 0 0 0 0 0 0 0 0 0 1 Node0, 
zoneDMA32, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0 Node0, zoneDMA32, type CMA 0 0 0 0 0 0 0 0 0 0 0 Node0, zoneDMA32, type Isolate 0 0 0 0 0 0 0 0 0 0 0 Node0, zone Normal, typeUnmovable 7744 7519 6900 5964 4583 2878 1346448146 1 0 Node0, zone Normal, type Movable645 1930 1685 1339 1020 670363210106310399 Node0, zone Normal, type Reclaimable 53 70116 48 13 0 0 0 0 0 0 Node0, zone Normal, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0 Node0, zone Normal, type CMA 0 0 0 0 0 0 0 0 0 0 0 Node0, zone Normal, type Isolate 0 0 0 0 0 0 0 0 0 0 0 Number of blocks type Unmovable Movable Reclaimable HighAtomic CMA Isolate Node 0, zone DMA1700 00 Node 0, zoneDMA324 165020 00 Node 0, zone Normal 947 1469 150 00 When it enabled: ~# echo 1 > /sys/module/zsmalloc/parameters/malloc_force_movable ~# echo lz4 > /sys/module/zswap/parameters/compressor ~# echo zsmalloc > /sys/module/zswap/parameters/zpool ~# echo 1 > /sys/module/zswap/parameters/enabled ~# swapon /swapfile ~# cd /home/teawater/kernel/vm-scalability/ /home/teawater/kernel/vm-scalability# export unit_size=$((9 * 1024 * 1024 * 1024)) /home/teawater/kernel/vm-scalability# ./case-anon-w-seq 2717908992 bytes / 4779235 usecs = 555362 KB/s 2717908992 bytes / 4856673 usecs = 546507 KB/s 2717908992 bytes / 4920079 usecs = 539464 KB/s 2717908992 bytes / 4935505 usecs = 537778 KB/s 354839 usecs to free memory 368167 usecs to free memory 355460 usecs to free memory 385452 usecs to free memory /home/teawater/kernel/vm-scalability# cat
Re: [PATCH] ARM: dts: aspeed: g4: add video engine support
On Mon, 27 May 2019, at 20:58, Alexander Filippov wrote: > Add a node to describe the video engine and VGA scratch registers on > AST2400. > > These changes were copied from aspeed-g5.dtsi > > Signed-off-by: Alexander Filippov Ugh, I should really sort out the bmc-misc stuff, I don't like to see it propagate in its current form. That's not your problem though, and I hope to address it in the near future. For the OpenBMC kernel tree: Acked-by: Andrew Jeffery > --- > arch/arm/boot/dts/aspeed-g4.dtsi | 62 > 1 file changed, 62 insertions(+) > > diff --git a/arch/arm/boot/dts/aspeed-g4.dtsi > b/arch/arm/boot/dts/aspeed-g4.dtsi > index 6011692df15a..adc1804918df 100644 > --- a/arch/arm/boot/dts/aspeed-g4.dtsi > +++ b/arch/arm/boot/dts/aspeed-g4.dtsi > @@ -168,6 +168,10 @@ > compatible = "aspeed,g4-pinctrl"; > }; > > + vga_scratch: scratch { > + compatible = "aspeed,bmc-misc"; > + }; > + > p2a: p2a-control { > compatible = "aspeed,ast2400-p2a-ctrl"; > status = "disabled"; > @@ -195,6 +199,16 @@ > reg = <0x1e72 0x8000>; // 32K > }; > > + video: video@1e70 { > + compatible = "aspeed,ast2400-video-engine"; > + reg = <0x1e70 0x1000>; > + clocks = < ASPEED_CLK_GATE_VCLK>, > + < ASPEED_CLK_GATE_ECLK>; > + clock-names = "vclk", "eclk"; > + interrupts = <7>; > + status = "disabled"; > + }; > + > gpio: gpio@1e78 { > #gpio-cells = <2>; > gpio-controller; > @@ -1408,6 +1422,54 @@ > }; > }; > > +_scratch { > + dac_mux { > + offset = <0x2c>; > + bit-mask = <0x3>; > + bit-shift = <16>; > + }; > + vga0 { > + offset = <0x50>; > + bit-mask = <0x>; > + bit-shift = <0>; > + }; > + vga1 { > + offset = <0x54>; > + bit-mask = <0x>; > + bit-shift = <0>; > + }; > + vga2 { > + offset = <0x58>; > + bit-mask = <0x>; > + bit-shift = <0>; > + }; > + vga3 { > + offset = <0x5c>; > + bit-mask = <0x>; > + bit-shift = <0>; > + }; > + vga4 { > + offset = <0x60>; > + bit-mask = <0x>; > + bit-shift = <0>; > + }; > + vga5 { > + offset = <0x64>; > + bit-mask = <0x>; > + bit-shift = <0>; > + }; > + vga6 { > + 
offset = <0x68>; > + bit-mask = <0x>; > + bit-shift = <0>; > + }; > + vga7 { > + offset = <0x6c>; > + bit-mask = <0x>; > + bit-shift = <0>; > + }; > +}; > + > _regs { > sio_2b { > offset = <0xf0>; > -- > 2.20.1 > >
Re: [RFC PATCH 0/3] Make deferred split shrinker memcg aware
On Tue, 28 May 2019, Yang Shi wrote: > > I got some reports from our internal application team about memcg OOM. > Even though the application has been killed by oom killer, there are > still a lot THPs reside, page reclaim doesn't reclaim them at all. > > Some investigation shows they are on deferred split queue, memcg direct > reclaim can't shrink them since THP deferred split shrinker is not memcg > aware, this may cause premature OOM in memcg. The issue can be > reproduced easily by the below test: > Right, we've also encountered this. I talked to Kirill about it a week or so ago where the suggestion was to split all compound pages on the deferred split queues under the presence of even memory pressure. That breaks cgroup isolation and perhaps unfairly penalizes workloads that are running attached to other memcg hierarchies that are not under pressure because their compound pages are now split as a side effect. There is a benefit to keeping these compound pages around while not under memory pressure if all pages are subsequently mapped again. > $ cgcreate -g memory:thp > $ echo 4G > /sys/fs/cgroup/memory/thp/memory/limit_in_bytes > $ cgexec -g memory:thp ./transhuge-stress 4000 > > transhuge-stress comes from kernel selftest. > > It is easy to hit OOM, but there are still a lot THP on the deferred split > queue, memcg direct reclaim can't touch them since the deferred split > shrinker is not memcg aware. > Yes, we have seen this on at least 4.15 as well. > Convert deferred split shrinker memcg aware by introducing per memcg deferred > split queue. The THP should be on either per node or per memcg deferred > split queue if it belongs to a memcg. When the page is immigrated to the > other memcg, it will be immigrated to the target memcg's deferred split queue > too. > > And, move deleting THP from deferred split queue in page free before memcg > uncharge so that the page's memcg information is available. 
> > Reuse the second tail page's deferred_list for per memcg list since the same > THP can't be on multiple deferred split queues at the same time. > > Remove THP specific destructor since it is not used anymore with memcg aware > THP shrinker (Please see the commit log of patch 2/3 for the details). > > Make deferred split shrinker not depend on memcg kmem since it is not slab. > It doesn't make sense to not shrink THP even though memcg kmem is disabled. > > With the above change the test demonstrated above doesn't trigger OOM anymore > even though with cgroup.memory=nokmem. > I'm curious if your internal applications team is also asking for statistics on how much memory can be freed if the deferred split queues can be shrunk? We have applications that monitor their own memory usage through memcg stats or usage and proactively try to reduce that usage when it is growing too large. The deferred split queues have significantly increased both memcg usage and rss when they've upgraded kernels. How are your applications monitoring how much memory from deferred split queues can be freed on memory pressure? Any thoughts on providing it as a memcg stat? Thanks!
Re: [PATCH -next] EDAC: aspeed: Remove set but not used variable 'np'
On Sun, 26 May 2019, at 00:12, YueHaibing wrote:
> Fixes gcc '-Wunused-but-set-variable' warning:
>
> drivers/edac/aspeed_edac.c: In function aspeed_probe:
> drivers/edac/aspeed_edac.c:284:22: warning: variable np set but not
> used [-Wunused-but-set-variable]
>
> It is never used and can be removed.
>
> Signed-off-by: YueHaibing

Reviewed-by: Andrew Jeffery

> ---
> drivers/edac/aspeed_edac.c | 4
> 1 file changed, 4 deletions(-)
>
> diff --git a/drivers/edac/aspeed_edac.c b/drivers/edac/aspeed_edac.c
> index 11833c0a5d07..5634437bb39d 100644
> --- a/drivers/edac/aspeed_edac.c
> +++ b/drivers/edac/aspeed_edac.c
> @@ -281,15 +281,11 @@ static int aspeed_probe(struct platform_device *pdev)
>  	struct device *dev = &pdev->dev;
>  	struct edac_mc_layer layers[2];
>  	struct mem_ctl_info *mci;
> -	struct device_node *np;
>  	struct resource *res;
>  	void __iomem *regs;
>  	u32 reg04;
>  	int rc;
>
> -	/* setup regmap */
> -	np = dev->of_node;
> -
>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>  	if (!res)
>  		return -ENOENT;
> --
> 2.17.1
>
RE: [EXT] Re: Issue: regmap: use debugfs even when no device
From: Mark Brown Sent: Tuesday, May 28, 2019 9:27 PM
> On Tue, May 28, 2019 at 02:20:15AM +, Andy Duan wrote:
>
> > So on i.MX8MM/8QM/8QXP platforms, we catch the issue that user dump
> > regmap registers without power cause system hang.
> > Maybe revert the patch is more reasonable ?
>
> This is an issue with or without a device - you can have the same issue with
> devices that are powered off. Typically where power is dynamic the driver
> will use a register cache so the registers are always available.

Correct, regmap without a device has the same issue when power is off,
because regmap does not implement runtime PM for the device, although the
device driver may implement it.
So how should regmap manage the clock and power when registers are accessed
through debugfs?

Andy
[PATCH] dm-init: fix 2 incorrect uses of kstrndup()
In drivers/md/dm-init.c, kstrndup() is called twice with its second and
third arguments swapped. The prototype is:

char *kstrndup(const char *s, size_t max, gfp_t gfp);

Signed-off-by: Gen Zhang
---
diff --git a/drivers/md/dm-init.c b/drivers/md/dm-init.c
index 352e803..526e261 100644
--- a/drivers/md/dm-init.c
+++ b/drivers/md/dm-init.c
@@ -140,8 +140,8 @@ static char __init *dm_parse_table_entry(struct dm_device *dev, char *str)
 		return ERR_PTR(-EINVAL);
 	}
 	/* target_args */
-	dev->target_args_array[n] = kstrndup(field[3], GFP_KERNEL,
-					     DM_MAX_STR_SIZE);
+	dev->target_args_array[n] = kstrndup(field[3], DM_MAX_STR_SIZE,
+					     GFP_KERNEL);
 	if (!dev->target_args_array[n])
 		return ERR_PTR(-ENOMEM);

@@ -275,7 +275,7 @@ static int __init dm_init_init(void)
 		DMERR("Argument is too big. Limit is %d\n", DM_MAX_STR_SIZE);
 		return -EINVAL;
 	}
-	str = kstrndup(create, GFP_KERNEL, DM_MAX_STR_SIZE);
+	str = kstrndup(create, DM_MAX_STR_SIZE, GFP_KERNEL);
 	if (!str)
 		return -ENOMEM;

---
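As a rough illustration of why the swapped arguments in the patch above went unnoticed, the following userspace sketch mimics the kstrndup() parameter order. Everything in it is an assumption made for illustration: kstrndup_sketch(), copied_len(), the gfp_t typedef and the GFP_KERNEL_SKETCH value are hypothetical stand-ins, and malloc() replaces the kernel allocator. Both the length limit and the flags are plain integers, so a call with the two swapped still compiles and the flag value silently becomes the length limit.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins; the flag value is illustrative only. */
typedef unsigned int gfp_t;
#define GFP_KERNEL_SKETCH ((gfp_t)0xcc0u)

/* Same parameter order as kstrndup(): string, maximum length, then flags. */
char *kstrndup_sketch(const char *s, size_t max, gfp_t gfp)
{
	size_t len = 0;
	char *buf;

	(void)gfp;	/* malloc() takes no allocation flags */
	if (!s)
		return NULL;

	/* Bounded length: stop at the NUL terminator or at max bytes. */
	while (len < max && s[len] != '\0')
		len++;

	buf = malloc(len + 1);
	if (buf) {
		memcpy(buf, s, len);
		buf[len] = '\0';
	}
	return buf;
}

/* Length of the copy made for a given call, or (size_t)-1 on failure. */
size_t copied_len(const char *s, size_t max, gfp_t gfp)
{
	char *dup = kstrndup_sketch(s, max, gfp);
	size_t len;

	if (!dup)
		return (size_t)-1;
	len = strlen(dup);
	free(dup);
	return len;
}
```

With a limit of 4, copied_len("abcdefgh", 4, GFP_KERNEL_SKETCH) yields a 4-byte copy, while the swapped call copied_len("abcdefgh", GFP_KERNEL_SKETCH, 4) compiles just as cleanly and copies all 8 bytes, because the limit is now the flag value.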
[PATCH] wd719x: pass GFP_ATOMIC instead of GFP_KERNEL
wd719x_chip_init() is called with interrupts disabled (under
spin_lock_irqsave()), so the allocation needs to use GFP_ATOMIC instead of
GFP_KERNEL.

Issue identified by coccicheck

Signed-off-by: Hariprasad Kelam
---
 drivers/scsi/wd719x.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/wd719x.c b/drivers/scsi/wd719x.c
index c2f4006..f300fd7 100644
--- a/drivers/scsi/wd719x.c
+++ b/drivers/scsi/wd719x.c
@@ -319,7 +319,7 @@ static int wd719x_chip_init(struct wd719x *wd)
 	if (!wd->fw_virt)
 		wd->fw_virt = dma_alloc_coherent(&wd->pdev->dev, wd->fw_size,
-						 &wd->fw_phys, GFP_KERNEL);
+						 &wd->fw_phys, GFP_ATOMIC);
 	if (!wd->fw_virt) {
 		ret = -ENOMEM;
 		goto wd719x_init_end;
--
2.7.4
[v4, PATCH] net: stmmac: add support for hash table size 128/256 in dwmac4
1. get hash table size in hw feature reigster, and add support for taller hash table(128/256) in dwmac4. 2. only clear GMAC_PACKET_FILTER bits used in this function, to avoid side effect to functions of other bits. Signed-off-by: Biao Huang --- drivers/net/ethernet/stmicro/stmmac/common.h |7 +-- drivers/net/ethernet/stmicro/stmmac/dwmac4.h |4 +- drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 49 - drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c |1 + drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |4 ++ 5 files changed, 40 insertions(+), 25 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h index 1961fe9..26bbcd8 100644 --- a/drivers/net/ethernet/stmicro/stmmac/common.h +++ b/drivers/net/ethernet/stmicro/stmmac/common.h @@ -335,6 +335,7 @@ struct dma_features { /* 802.3az - Energy-Efficient Ethernet (EEE) */ unsigned int eee; unsigned int av; + unsigned int hash_tb_sz; unsigned int tsoen; /* TX and RX csum */ unsigned int tx_coe; @@ -428,9 +429,9 @@ struct mac_device_info { struct mii_regs mii;/* MII register Addresses */ struct mac_link link; void __iomem *pcsr; /* vpointer to device CSRs */ - int multicast_filter_bins; - int unicast_filter_entries; - int mcast_bits_log2; + unsigned int multicast_filter_bins; + unsigned int unicast_filter_entries; + unsigned int mcast_bits_log2; unsigned int rx_csum; unsigned int pcs; unsigned int pmt; diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4.h index 01c1089..a37e09b 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4.h +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4.h @@ -18,8 +18,7 @@ /* MAC registers */ #define GMAC_CONFIG0x #define GMAC_PACKET_FILTER 0x0008 -#define GMAC_HASH_TAB_0_31 0x0010 -#define GMAC_HASH_TAB_32_630x0014 +#define GMAC_HASH_TAB(x) (0x10 + x * 4) #define GMAC_RX_FLOW_CTRL 0x0090 #define GMAC_QX_TX_FLOW_CTRL(x)(0x70 + x * 4) #define GMAC_TXQ_PRTY_MAP0 0x98 @@ -184,6 
+183,7 @@ enum power_event { #define GMAC_HW_FEAT_MIISELBIT(0) /* MAC HW features1 bitmap */ +#define GMAC_HW_HASH_TB_SZ GENMASK(25, 24) #define GMAC_HW_FEAT_AVSEL BIT(20) #define GMAC_HW_TSOEN BIT(18) #define GMAC_HW_TXFIFOSIZE GENMASK(10, 6) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c index 5e98da4..2544cff 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c @@ -403,41 +403,50 @@ static void dwmac4_set_filter(struct mac_device_info *hw, struct net_device *dev) { void __iomem *ioaddr = (void __iomem *)dev->base_addr; - unsigned int value = 0; + int numhashregs = (hw->multicast_filter_bins >> 5); + int mcbitslog2 = hw->mcast_bits_log2; + unsigned int value; + int i; + value = readl(ioaddr + GMAC_PACKET_FILTER); + value &= ~GMAC_PACKET_FILTER_HMC; + value &= ~GMAC_PACKET_FILTER_HPF; + value &= ~GMAC_PACKET_FILTER_PCF; + value &= ~GMAC_PACKET_FILTER_PM; + value &= ~GMAC_PACKET_FILTER_PR; if (dev->flags & IFF_PROMISC) { value = GMAC_PACKET_FILTER_PR | GMAC_PACKET_FILTER_PCF; } else if ((dev->flags & IFF_ALLMULTI) || - (netdev_mc_count(dev) > HASH_TABLE_SIZE)) { + (netdev_mc_count(dev) > hw->multicast_filter_bins)) { /* Pass all multi */ - value = GMAC_PACKET_FILTER_PM; - /* Set the 64 bits of the HASH tab. 
To be updated if taller -* hash table is used -*/ - writel(0x, ioaddr + GMAC_HASH_TAB_0_31); - writel(0x, ioaddr + GMAC_HASH_TAB_32_63); + value |= GMAC_PACKET_FILTER_PM; + /* Set all the bits of the HASH tab */ + for (i = 0; i < numhashregs; i++) + writel(0x, ioaddr + GMAC_HASH_TAB(i)); } else if (!netdev_mc_empty(dev)) { - u32 mc_filter[2]; + u32 mc_filter[8]; struct netdev_hw_addr *ha; /* Hash filter for multicast */ - value = GMAC_PACKET_FILTER_HMC; + value |= GMAC_PACKET_FILTER_HMC; memset(mc_filter, 0, sizeof(mc_filter)); netdev_for_each_mc_addr(ha, dev) { - /* The upper 6 bits of the calculated CRC are used to -* index the content of the Hash Table Reg 0 and
[v4, PATCH] add some features in stmmac
Changes in v4:
	retain the reverse xmas tree ordering.

Changes in v3:
	rewrite the patch based on the series in
	https://patchwork.ozlabs.org/project/netdev/list/?series=109699

Changes in v2:
	1. reverse Christmas tree order in dwmac4_set_filter.
	2. remove clause 45 patch, waiting for cl45 patch from Boon Leong

v1:
	This series adds some features to the stmmac driver.
	1. add support for hash table size 128/256
	2. add mdio clause 45 access from mac device for dwmac4.

Biao Huang (1):
  net: stmmac: add support for hash table size 128/256 in dwmac4

 drivers/net/ethernet/stmicro/stmmac/common.h |7 +--
 drivers/net/ethernet/stmicro/stmmac/dwmac4.h |4 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 49 -
 drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c |1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |4 ++
 5 files changed, 40 insertions(+), 25 deletions(-)

--
1.7.9.5
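For readers following the 64/128/256-bin hash table change, here is a minimal userspace sketch of how a hash value could be mapped onto a variable-size filter. It is not taken from the driver: num_hash_regs(), hash_filter_set() and MAX_HASH_REGS are hypothetical names, the hash is passed in as a plain 32-bit value, and the sketch assumes (as the patch's mcbitslog2 and mc_filter[8] suggest) that the upper log2(bins) bits select the bin.

```c
#include <stdint.h>

/* 256 bins / 32 bits per register; mirrors the mc_filter[8] array size. */
#define MAX_HASH_REGS 8

/* Number of 32-bit hash registers needed for a table with `bins` entries. */
unsigned int num_hash_regs(unsigned int bins)
{
	return bins >> 5;
}

/*
 * Set the bin selected by the upper `bits_log2` bits of `hash`
 * (6 for 64 bins, 7 for 128, 8 for 256). Assumes 1 <= bits_log2 <= 8.
 */
void hash_filter_set(uint32_t filter[MAX_HASH_REGS], uint32_t hash,
		     unsigned int bits_log2)
{
	uint32_t bit_nr = hash >> (32 - bits_log2);

	/* Upper bits of bit_nr pick the register, the low 5 bits pick the bit. */
	filter[bit_nr >> 5] |= UINT32_C(1) << (bit_nr & 0x1f);
}
```

With a 256-bin table a hash of 0xffffffff lands in bin 255, which is bit 31 of the eighth register; with a 64-bin table only the first two registers are ever touched.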
Re: [PATCH] perf: Fix oops when kthread execs user process
Peter Zijlstra writes: > On Tue, May 28, 2019 at 08:31:29PM +0800, Young Xiao wrote: >> When a kthread calls call_usermodehelper() the steps are: >> 1. allocate current->mm >> 2. load_elf_binary() >> 3. populate current->thread.regs >> >> While doing this, interrupts are not disabled. If there is a perf >> interrupt in the middle of this process (i.e. step 1 has completed >> but not yet reached to step 3) and if perf tries to read userspace >> regs, kernel oops. >> >> Fix it by setting abi to PERF_SAMPLE_REGS_ABI_NONE when userspace >> pt_regs are not set. >> >> See commit bf05fc25f268 ("powerpc/perf: Fix oops when kthread execs >> user process") for details. > > Why the hell do we set current->mm before it is complete? Note that > normally exec() builds the new mm before attaching it, see exec_mmap() > in flush_old_exec(). > > Also, why did those PPC folks 'fix' this in isolation? And why didn't > you Cc them? We just assumed it was our bug, 'cause we have plenty of those :) cheers
Re: [PATCH RESEND 2/7] csky: entry: Remove unneeded need_resched() loop
Thx Valentin, You are right, Approved. Best Regards Guo Ren On Tue, May 28, 2019 at 11:48:43AM +0100, Valentin Schneider wrote: > Since the enabling and disabling of IRQs within preempt_schedule_irq() > is contained in a need_resched() loop, we don't need the outer arch > code loop. > > Signed-off-by: Valentin Schneider > Cc: Guo Ren > --- > arch/csky/kernel/entry.S | 4 > 1 file changed, 4 deletions(-) > > diff --git a/arch/csky/kernel/entry.S b/arch/csky/kernel/entry.S > index a7e84bd8..679afbcc2001 100644 > --- a/arch/csky/kernel/entry.S > +++ b/arch/csky/kernel/entry.S > @@ -292,11 +292,7 @@ ENTRY(csky_irq) > ldw r8, (r9, TINFO_FLAGS) > btsti r8, TIF_NEED_RESCHED > bf 2f > -1: > jbsrpreempt_schedule_irq/* irq en/disable is done inside */ > - ldw r7, (r9, TINFO_FLAGS) /* get new tasks TI_FLAGS */ > - btsti r7, TIF_NEED_RESCHED > - bt 1b /* go again */ > #endif > 2: > jmpiret_from_exception > -- > 2.20.1 >
[PATCH] wcd9335: fix an incorrect use of kstrndup()
In wcd9335_codec_enable_dec(), 'widget_name' is allocated by kstrndup().
However, according to the kstrndup() documentation: "Note: Use kmemdup_nul()
instead if the size is known exactly." So we should use kmemdup_nul() here
instead of kstrndup().

Signed-off-by: Gen Zhang
---
diff --git a/sound/soc/codecs/wcd9335.c b/sound/soc/codecs/wcd9335.c
index a04a7ce..85737fe 100644
--- a/sound/soc/codecs/wcd9335.c
+++ b/sound/soc/codecs/wcd9335.c
@@ -2734,7 +2734,7 @@ static int wcd9335_codec_enable_dec(struct snd_soc_dapm_widget *w,
 	char *dec;
 	u8 hpf_coff_freq;

-	widget_name = kstrndup(w->name, 15, GFP_KERNEL);
+	widget_name = kmemdup_nul(w->name, 15, GFP_KERNEL);
 	if (!widget_name)
 		return -ENOMEM;

---
[PATCH] intel_menlow: avoid null pointer dereference error
Fix a null pointer dereference by acpi_driver_data() if device is null
(dereference before check). We should only set cdev and check that it is
valid after we are sure device is not null.

Signed-off-by: Young Xiao <92siuy...@gmail.com>
---
 drivers/platform/x86/intel_menlow.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/x86/intel_menlow.c b/drivers/platform/x86/intel_menlow.c
index 77eb870..28feb5c 100644
--- a/drivers/platform/x86/intel_menlow.c
+++ b/drivers/platform/x86/intel_menlow.c
@@ -180,9 +180,13 @@ static int intel_menlow_memory_add(struct acpi_device *device)

 static int intel_menlow_memory_remove(struct acpi_device *device)
 {
-	struct thermal_cooling_device *cdev = acpi_driver_data(device);
+	struct thermal_cooling_device *cdev;
+
+	if (!device)
+		return -EINVAL;

-	if (!device || !cdev)
+	cdev = acpi_driver_data(device);
+	if (!cdev)
 		return -EINVAL;

 	sysfs_remove_link(&device->dev.kobj, "thermal_cooling");
--
2.7.4
Re: [PATCH v2 2/7] drivers/soc: Add Aspeed XDMA Engine Driver
On Sat, 25 May 2019, at 01:39, Eddie James wrote: > > On 5/21/19 7:02 AM, Arnd Bergmann wrote: > > On Mon, May 20, 2019 at 10:19 PM Eddie James wrote: > >> diff --git a/include/uapi/linux/aspeed-xdma.h > >> b/include/uapi/linux/aspeed-xdma.h > >> new file mode 100644 > >> index 000..2a4bd13 > >> --- /dev/null > >> +++ b/include/uapi/linux/aspeed-xdma.h > >> @@ -0,0 +1,26 @@ > >> +/* SPDX-License-Identifier: GPL-2.0+ */ > >> +/* Copyright IBM Corp 2019 */ > >> + > >> +#ifndef _UAPI_LINUX_ASPEED_XDMA_H_ > >> +#define _UAPI_LINUX_ASPEED_XDMA_H_ > >> + > >> +#include > >> + > >> +/* > >> + * aspeed_xdma_op > >> + * > >> + * upstream: boolean indicating the direction of the DMA operation; > >> upstream > >> + * means a transfer from the BMC to the host > >> + * > >> + * host_addr: the DMA address on the host side, typically configured by > >> PCI > >> + *subsystem > >> + * > >> + * len: the size of the transfer in bytes; it should be a multiple of 16 > >> bytes > >> + */ > >> +struct aspeed_xdma_op { > >> + __u32 upstream; > >> + __u64 host_addr; > >> + __u32 len; > >> +}; > >> + > >> +#endif /* _UAPI_LINUX_ASPEED_XDMA_H_ */ > > If this is a user space interface, please remove the holes in the > > data structure. > > > Surely it's 4-byte aligned and there won't be holes?? __u64 is 8-byte aligned, so you have a hole after upstream. Easiest just to put upstream after len? Andrew
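The alignment point in this exchange can be checked with a small userspace sketch. The struct names and helper functions below are hypothetical, and uint32_t/uint64_t stand in for __u32/__u64. On ABIs where a 64-bit integer is 8-byte aligned, the posted field order leaves 4 bytes of padding after upstream and another 4 at the end of the struct, while moving upstream after len leaves none. The amount of padding in the posted layout depends on the ABI, which is one reason holes are unwelcome in a user space interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Field order as posted in the quoted patch. */
struct xdma_op_posted {
	uint32_t upstream;
	uint64_t host_addr;
	uint32_t len;
};

/* Hypothetical reordering along the lines suggested in the reply. */
struct xdma_op_reordered {
	uint64_t host_addr;
	uint32_t len;
	uint32_t upstream;
};

/* Bytes of padding between upstream and host_addr in the posted layout. */
size_t posted_hole_bytes(void)
{
	return offsetof(struct xdma_op_posted, host_addr) - sizeof(uint32_t);
}

/* Total implicit padding: struct size minus the sum of the field sizes. */
size_t posted_padding_bytes(void)
{
	return sizeof(struct xdma_op_posted) -
	       (2 * sizeof(uint32_t) + sizeof(uint64_t));
}

size_t reordered_padding_bytes(void)
{
	return sizeof(struct xdma_op_reordered) -
	       (2 * sizeof(uint32_t) + sizeof(uint64_t));
}
```

Printing posted_hole_bytes() and sizeof(struct xdma_op_posted) on the target ABI shows the layout that user space would actually see; the reordered struct is 16 bytes with no padding on the common ABIs.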
Re: [PATCH 2/2] Revert "mm, thp: restore node-local hugepage allocations"
On Fri, 24 May 2019, Andrea Arcangeli wrote: > > > We are going in circles, *yes* there is a problem for potential swap > > > storms today because of the poor interaction between memory compaction > > > and > > > directed reclaim but this is a result of a poor API that does not allow > > > userspace to specify that its workload really will span multiple sockets > > > so faulting remotely is the best course of action. The fix is not to > > > cause regressions for others who have implemented a userspace stack that > > > is based on the past 3+ years of long standing behavior or for > > > specialized > > > workloads where it is known that it spans multiple sockets so we want > > > some > > > kind of different behavior. We need to provide a clear and stable API to > > > define these terms for the page allocator that is independent of any > > > global setting of thp enabled, defrag, zone_reclaim_mode, etc. It's > > > workload dependent. > > > > um, who is going to do this work? > > That's a good question. It's going to be a not simple patch to > backport to -stable: it'll be intrusive and it will affect > mm/page_alloc.c significantly so it'll reject heavy. I wouldn't > consider it -stable material at least in the short term, it will > require some testing. > Hi Andrea, I'm not sure what patch you're referring to, unfortunately. The above comment was referring to APIs that are made available to userspace to define when to fault locally vs remotely and what the preference should be for any form of compaction or reclaim to achieve that. Today we have global enabling options, global defrag settings, enabling prctls, and madvise options. The point it makes is that whether a specific workload fits into a single socket is workload dependant and thus we are left with prctls and madvise options. 
The prctl either enables thp or it doesn't, it is not interesting here; the madvise is overloaded in four different ways (enabling, stalling at fault, collapsability, defrag) so it's not surprising that continuing to overload it for existing users will cause undesired results. It makes an argument that we need a clear and stable means of defining the behavior, not changing the 4+ year behavior and giving those who regress no workaround. > This is why applying a simple fix that avoids the swap storms (and the > swap-less pathological THP regression for vfio device assignment GUP > pinning) is preferable before adding an alloc_pages_multi_order (or > equivalent) so that it'll be the allocator that will decide when > exactly to fallback from 2M to 4k depending on the NUMA distance and > memory availability during the zonelist walk. The basic idea is to > call alloc_pages just once (not first for 2M and then for 4k) and > alloc_pages will decide which page "order" to return. > The commit description doesn't mention the swap storms that you're trying to fix, it's probably better to describe that again and why it is not beneficial to swap unless an entire pageblock can become free or memory compaction has indicated that additional memory freeing would allow migration to make an entire pageblock free. I understand that's a invasive code change, but merging this patch changes the 4+ year behavior that started here: commit 077fcf116c8c2bd7ee9487b645aa3b50368db7e1 Author: Aneesh Kumar K.V Date: Wed Feb 11 15:27:12 2015 -0800 mm/thp: allocate transparent hugepages on local node And that commit's description describes quite well the regression that we encounter if we remove __GFP_THISNODE here. That's because the access latency regression is much more substantial than what was reported for Naples in your changelog. 
In the interest of making forward progress, can we agree that swapping from the local node *never* makes sense unless we can show that an entire pageblock can become free or that it enables memory compaction to migrate memory that can make an entire pageblock free? Are you reporting swap storms for the local node when one of these is true? > > Implementing a new API doesn't help existing userspace which is hurting > > from the problem which this patch addresses. > > Yes, we can't change all apps that may not fit in a single NUMA > node. Currently it's unsafe to turn "transparent_hugepages/defrag = > always" or the bad behavior can then materialize also outside of > MADV_HUGEPAGE. Those apps that use MADV_HUGEPAGE on their long lived > allocations (i.e. guest physical memory) like qemu are affected even > with the default "defrag = madvise". Those apps are using > MADV_HUGEPAGE for more than 3 years and they are widely used and open > source of course. > I continue to reiterate that the 4+ year long standing behavior of MADV_HUGEPAGE is overloaded; you are anticipating a specific behavior for workloads that do not fit in a single NUMA node whereas other users developed in the past four years are anticipating a different behavior. I'm trying to
Re: [PATCH v2 1/3] KVM: x86: add support for user wait instructions
On 29/05/2019 09:24, Paolo Bonzini wrote:

On 24/05/19 09:56, Tao Xu wrote:

+7.19 KVM_CAP_ENABLE_USR_WAIT_PAUSE
+
+Architectures: x86
+Parameters: args[0] whether feature should be enabled or not
+
+With this capability enabled, a VM can use UMONITOR, UMWAIT and TPAUSE
+instructions. If the instruction causes a delay, the amount of
+time delayed is called here the physical delay. The physical delay is
+first computed by determining the virtual delay (the time to delay
+relative to the VM’s timestamp counter). Otherwise, UMONITOR, UMWAIT
+and TPAUSE cause an invalid-opcode exception(#UD).
+

There is no need to make it a capability. You can just check the guest
CPUID and see if it includes X86_FEATURE_WAITPKG.

Paolo

Thank you Paolo, but I have another question. Is it appropriate to enable
X86_FEATURE_WAITPKG when QEMU uses "-overcommit cpu-pm=on"? Or should it be
enabled only when QEMU adds the feature with "-cpu host,+waitpkg"? The user
wait instructions are wait or pause instructions that may be executed at
any privilege level, but the maximum wait time can be limited through
IA32_UMWAIT_CONTROL.
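The CPUID-based check suggested above can be sketched as a plain bit test. This is only an illustration under stated assumptions: guest_has_waitpkg() and user_wait_raises_ud() are hypothetical helpers that take the guest's CPUID.(EAX=7,ECX=0):ECX value as an argument, whereas KVM itself would presumably use its own guest CPUID lookup rather than anything shown here. The bit position, ECX bit 5, is where WAITPKG is enumerated.

```c
#include <stdbool.h>
#include <stdint.h>

/* WAITPKG is enumerated in CPUID.(EAX=7,ECX=0):ECX bit 5. */
#define CPUID_7_0_ECX_WAITPKG (UINT32_C(1) << 5)

/* True if the guest-visible CPUID advertises UMONITOR/UMWAIT/TPAUSE. */
bool guest_has_waitpkg(uint32_t cpuid_7_0_ecx)
{
	return (cpuid_7_0_ecx & CPUID_7_0_ECX_WAITPKG) != 0;
}

/*
 * Without the feature bit, the quoted documentation says the instructions
 * cause an invalid-opcode exception (#UD).
 */
bool user_wait_raises_ud(uint32_t cpuid_7_0_ecx)
{
	return !guest_has_waitpkg(cpuid_7_0_ecx);
}
```

Keying the behaviour off the guest CPUID means the decision follows whatever CPU model user space configured, with no separate capability to enable.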
Re: [PATCH net-next 1/5] timecounter: Add helper for reconstructing partial timestamps
On Tue, May 28, 2019 at 4:58 PM Vladimir Oltean wrote: > > Some PTP hardware offers a 64-bit free-running counter whose snapshots > are used for timestamping, but only makes part of that snapshot > available as timestamps (low-order bits). > > In that case, timecounter/cyclecounter users must bring the cyclecounter > and timestamps to the same bit width, and they currently have two > options of doing so: > > - Trim the higher bits of the timecounter itself to the number of bits > of the timestamps. This might work for some setups, but if the > wraparound of the timecounter in this case becomes high (~10 times per > second) then this causes additional strain on the system, which must > read the clock that often just to avoid missing the wraparounds. > > - Reconstruct the timestamp by racing to read the PTP time within one > wraparound cycle since the timestamp was generated. This is > preferable when the wraparound time is small (do a time-critical > readout once vs doing it periodically), and it has no drawback even > when the wraparound is comfortably sized. > > Signed-off-by: Vladimir Oltean > --- > include/linux/timecounter.h | 7 +++ > kernel/time/timecounter.c | 33 + > 2 files changed, 40 insertions(+) > > diff --git a/include/linux/timecounter.h b/include/linux/timecounter.h > index 2496ad4cfc99..03eab1f3bb9c 100644 > --- a/include/linux/timecounter.h > +++ b/include/linux/timecounter.h > @@ -30,6 +30,9 @@ > * by the implementor and user of specific instances of this API. > * > * @read: returns the current cycle value > + * @partial_tstamp_mask:bitmask in case the hardware emits timestamps > + * which only capture low-order bits of the full > + * counter, and should be reconstructed. 
> * @mask: bitmask for two's complement > * subtraction of non 64 bit counters, > * see CYCLECOUNTER_MASK() helper macro > @@ -38,6 +41,7 @@ > */ > struct cyclecounter { > u64 (*read)(const struct cyclecounter *cc); > + u64 partial_tstamp_mask; > u64 mask; > u32 mult; > u32 shift; > @@ -136,4 +140,7 @@ extern u64 timecounter_read(struct timecounter *tc); > extern u64 timecounter_cyc2time(struct timecounter *tc, > u64 cycle_tstamp); > > +extern u64 cyclecounter_reconstruct(const struct cyclecounter *cc, > + u64 ts_partial); > + > #endif > diff --git a/kernel/time/timecounter.c b/kernel/time/timecounter.c > index 85b98e727306..d4657d64e38d 100644 > --- a/kernel/time/timecounter.c > +++ b/kernel/time/timecounter.c > @@ -97,3 +97,36 @@ u64 timecounter_cyc2time(struct timecounter *tc, > return nsec; > } > EXPORT_SYMBOL_GPL(timecounter_cyc2time); > + > +/** > + * cyclecounter_reconstruct - reconstructs @ts_partial > + * @cc:Pointer to cycle counter. > + * @ts_partial:Typically RX or TX NIC timestamp, provided by > hardware as > + * the lower @partial_tstamp_mask bits of the cycle counter, > + * sampled at the time the timestamp was collected. > + * To reconstruct into a full @mask bit-wide timestamp, the > + * cycle counter is read and the high-order bits (up to @mask) > are > + * filled in. > + * Must be called within one wraparound of @partial_tstamp_mask > + * bits of the cycle counter. > + */ > +u64 cyclecounter_reconstruct(const struct cyclecounter *cc, u64 ts_partial) > +{ > + u64 ts_reconstructed; > + u64 cycle_now; > + > + cycle_now = cc->read(cc); > + > + ts_reconstructed = (cycle_now & ~cc->partial_tstamp_mask) | > + ts_partial; > + > + /* Check lower bits of current cycle counter against the timestamp. > +* If the current cycle counter is lower than the partial timestamp, > +* then wraparound surely occurred and must be accounted for. 
> +*/ > + if ((cycle_now & cc->partial_tstamp_mask) <= ts_partial) > + ts_reconstructed -= (cc->partial_tstamp_mask + 1); > + > + return ts_reconstructed; > +} > +EXPORT_SYMBOL_GPL(cyclecounter_reconstruct); Hrm. Is this actually generic? Would it make more sense to have the specific implementations with this quirk implement this in their read() handler? If not, why? thanks -john
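To make the reconstruction step in the quoted patch easier to follow, here is a userspace sketch of the same arithmetic. reconstruct_tstamp() is a hypothetical helper, not the kernel function: it takes the current counter value as an argument instead of calling cc->read(), and it mirrors the `<=` comparison from the patch as posted.

```c
#include <stdint.h>

/*
 * Rebuild a full-width timestamp from its low-order bits.
 *
 * cycle_now:    current value of the free-running counter
 * partial_mask: mask of the bits the hardware reports, e.g. 0xffff
 * ts_partial:   low-order bits captured by the hardware
 *
 * Must be evaluated within one wraparound of the partial timestamp.
 */
uint64_t reconstruct_tstamp(uint64_t cycle_now, uint64_t partial_mask,
			    uint64_t ts_partial)
{
	/* Borrow the high-order bits from the current counter value. */
	uint64_t ts = (cycle_now & ~partial_mask) | ts_partial;

	/*
	 * If the counter's low bits are not above the timestamp, the low
	 * bits wrapped after the timestamp was taken, so step back one
	 * period. As in the patch, equal values count as a wrap.
	 */
	if ((cycle_now & partial_mask) <= ts_partial)
		ts -= partial_mask + 1;

	return ts;
}
```

For a 16-bit partial timestamp and a counter reading of 0x12345, a captured value of 0x2000 reconstructs to 0x12000, while 0xf000 is taken to predate the last wrap and reconstructs to 0xf000.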
[PATCH] falcon: pass valid pointer from ef4_enqueue_unwind.
The bytes_compl and pkts_compl pointers passed to ef4_dequeue_buffer()
cannot be NULL. Add a paranoid warning to check this condition and fix the
one case where they were NULL.

Signed-off-by: Young Xiao <92siuy...@gmail.com>
---
 drivers/net/ethernet/sfc/falcon/tx.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sfc/falcon/tx.c b/drivers/net/ethernet/sfc/falcon/tx.c
index c5059f4..ed89bc6 100644
--- a/drivers/net/ethernet/sfc/falcon/tx.c
+++ b/drivers/net/ethernet/sfc/falcon/tx.c
@@ -69,6 +69,7 @@ static void ef4_dequeue_buffer(struct ef4_tx_queue *tx_queue,
 	}

 	if (buffer->flags & EF4_TX_BUF_SKB) {
+		EF4_WARN_ON_PARANOID(!pkts_compl || !bytes_compl);
 		(*pkts_compl)++;
 		(*bytes_compl) += buffer->skb->len;
 		dev_consume_skb_any((struct sk_buff *)buffer->skb);
@@ -271,12 +272,14 @@ static int ef4_tx_map_data(struct ef4_tx_queue *tx_queue, struct sk_buff *skb)
 static void ef4_enqueue_unwind(struct ef4_tx_queue *tx_queue)
 {
 	struct ef4_tx_buffer *buffer;
+	unsigned int bytes_compl = 0;
+	unsigned int pkts_compl = 0;

 	/* Work backwards until we hit the original insert pointer value */
 	while (tx_queue->insert_count != tx_queue->write_count) {
 		--tx_queue->insert_count;
 		buffer = __ef4_tx_queue_get_insert_buffer(tx_queue);
-		ef4_dequeue_buffer(tx_queue, buffer, NULL, NULL);
+		ef4_dequeue_buffer(tx_queue, buffer, &pkts_compl, &bytes_compl);
 	}
 }
--
2.7.4
[PATCH net-next v3 4/5] net: stmmac: add xPCS functions for device with DWMACv5.1
From: Ong Boon Leong We introduce support for driver that has v5.10 IP and is also using xPCS as MMD. This can be easily enabled for other product that integrates xPCS that is not using v5.00 IP. Reviewed-by: Chuah Kim Tatt Reviewed-by: Voon Weifeng Reviewed-by: Kweh Hock Leong Reviewed-by: Baoli Zhang Signed-off-by: Ong Boon Leong Signed-off-by: Voon Weifeng --- drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 33 ++ drivers/net/ethernet/stmicro/stmmac/hwif.c| 41 ++- drivers/net/ethernet/stmicro/stmmac/hwif.h| 2 ++ 3 files changed, 75 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c index b4bb5629de38..34f05068142e 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c @@ -801,6 +801,39 @@ static void dwmac4_debug(void __iomem *ioaddr, struct stmmac_extra_stats *x, .flex_pps_config = dwmac5_flex_pps_config, }; +const struct stmmac_ops dwmac510_xpcs_ops = { + .core_init = dwmac4_core_init, + .set_mac = stmmac_dwmac4_set_mac, + .rx_ipc = dwmac4_rx_ipc_enable, + .rx_queue_enable = dwmac4_rx_queue_enable, + .rx_queue_prio = dwmac4_rx_queue_priority, + .tx_queue_prio = dwmac4_tx_queue_priority, + .rx_queue_routing = dwmac4_rx_queue_routing, + .prog_mtl_rx_algorithms = dwmac4_prog_mtl_rx_algorithms, + .prog_mtl_tx_algorithms = dwmac4_prog_mtl_tx_algorithms, + .set_mtl_tx_queue_weight = dwmac4_set_mtl_tx_queue_weight, + .map_mtl_to_dma = dwmac4_map_mtl_dma, + .config_cbs = dwmac4_config_cbs, + .dump_regs = dwmac4_dump_regs, + .host_irq_status = dwmac4_irq_status, + .host_mtl_irq_status = dwmac4_irq_mtl_status, + .flow_ctrl = dwmac4_flow_ctrl, + .pmt = dwmac4_pmt, + .set_umac_addr = dwmac4_set_umac_addr, + .get_umac_addr = dwmac4_get_umac_addr, + .set_eee_mode = dwmac4_set_eee_mode, + .reset_eee_mode = dwmac4_reset_eee_mode, + .set_eee_timer = dwmac4_set_eee_timer, + .set_eee_pls = dwmac4_set_eee_pls, + .debug = 
dwmac4_debug, + .set_filter = dwmac4_set_filter, + .safety_feat_config = dwmac5_safety_feat_config, + .safety_feat_irq_status = dwmac5_safety_feat_irq_status, + .safety_feat_dump = dwmac5_safety_feat_dump, + .rxp_config = dwmac5_rxp_config, + .flex_pps_config = dwmac5_flex_pps_config, +}; + int dwmac4_setup(struct stmmac_priv *priv) { struct mac_device_info *mac = priv->hw; diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.c b/drivers/net/ethernet/stmicro/stmmac/hwif.c index 81b966a8261b..f1cb3ce165e5 100644 --- a/drivers/net/ethernet/stmicro/stmmac/hwif.c +++ b/drivers/net/ethernet/stmicro/stmmac/hwif.c @@ -73,11 +73,13 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv) bool gmac; bool gmac4; bool xgmac; + bool has_xpcs; u32 min_id; const struct stmmac_regs_off regs; const void *desc; const void *dma; const void *mac; + const void *xpcs; const void *hwtimestamp; const void *mode; const void *tc; @@ -89,6 +91,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv) .gmac = false, .gmac4 = false, .xgmac = false, + .has_xpcs = false, .min_id = 0, .regs = { .ptp_off = PTP_GMAC3_X_OFFSET, @@ -97,6 +100,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv) .desc = NULL, .dma = _dma_ops, .mac = _ops, + .xpcs = NULL, .hwtimestamp = _ptp, .mode = NULL, .tc = NULL, @@ -106,6 +110,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv) .gmac = true, .gmac4 = false, .xgmac = false, + .has_xpcs = false, .min_id = 0, .regs = { .ptp_off = PTP_GMAC3_X_OFFSET, @@ -114,6 +119,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv) .desc = NULL, .dma = _dma_ops, .mac = _ops, + .xpcs = NULL, .hwtimestamp = _ptp, .mode = NULL, .tc = NULL, @@ -123,6 +129,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv) .gmac = false, .gmac4 = true, .xgmac = false, + .has_xpcs = false, .min_id = 0, .regs = { .ptp_off = PTP_GMAC4_OFFSET, @@ -130,6 +137,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv) }, .desc = _desc_ops, .dma = 
_dma_ops, +
[PATCH net-next v3 3/5] net: stmmac: add xpcs function hooks into main driver and ethtool
From: Ong Boon Leong With xPCS functions now ready, we add them into the main driver and ethtool logics. To differentiate from EQoS MAC PCS and DWC Ethernet xPCS, we introduce 'has_xpcs' in platform data as a mean to indicate whether GBE controller includes xPCS or not. To support platform-specific C37 AN PCS mode selection for MII MMD, we introduce 'pcs_mode' in platform data. The basic framework for xPCS interrupt handling is implemented too. Reviewed-by: Chuah Kim Tatt Reviewed-by: Voon Weifeng Reviewed-by: Kweh Hock Leong Reviewed-by: Baoli Zhang Signed-off-by: Ong Boon Leong Signed-off-by: Voon Weifeng --- drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 + .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 50 +-- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 152 - include/linux/stmmac.h | 2 + 4 files changed, 158 insertions(+), 48 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h index dd95d959c1ce..0b8460a4a220 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h @@ -36,6 +36,7 @@ struct stmmac_resources { const char *mac; int wol_irq; int lpi_irq; + int xpcs_irq; int irq; }; @@ -168,6 +169,7 @@ struct stmmac_priv { int clk_csr; struct timer_list eee_ctrl_timer; int lpi_irq; + int xpcs_irq; int eee_enabled; int eee_active; int tx_lpi_timer; diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c index e09522c5509a..f0815d196147 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c @@ -28,6 +28,7 @@ #include "stmmac.h" #include "dwmac_dma.h" +#include "dwxpcs.h" #define REG_SPACE_SIZE 0x1060 #define MAC100_ETHTOOL_NAME"st_mac100" @@ -277,7 +278,8 @@ static int stmmac_ethtool_get_link_ksettings(struct net_device *dev, struct phy_device *phy = dev->phydev; if (priv->hw->pcs & STMMAC_PCS_RGMII || - 
priv->hw->pcs & STMMAC_PCS_SGMII) { + priv->hw->pcs & STMMAC_PCS_SGMII || + priv->plat->pcs_mode == AN_CTRL_PCS_MD_C37_1000BASEX) { struct rgmii_adv adv; u32 supported, advertising, lp_advertising; @@ -294,6 +296,11 @@ static int stmmac_ethtool_get_link_ksettings(struct net_device *dev, if (stmmac_pcs_get_adv_lp(priv, priv->ioaddr, )) return -EOPNOTSUPP; /* should never happen indeed */ + /* Get ADV & LPA is only application for 1000BASE-X C37. +* For MAC side SGMII AN, get ADV & LPA from PHY. +*/ + stmmac_xpcs_get_adv_lp(priv, dev, , priv->plat->pcs_mode); + /* Encoding of PSE bits is defined in 802.3z, 37.2.1.4 */ ethtool_convert_link_mode_to_legacy_u32( @@ -376,22 +383,23 @@ static int stmmac_ethtool_get_link_ksettings(struct net_device *dev, int rc; if (priv->hw->pcs & STMMAC_PCS_RGMII || - priv->hw->pcs & STMMAC_PCS_SGMII) { - u32 mask = ADVERTISED_Autoneg | ADVERTISED_Pause; - + priv->hw->pcs & STMMAC_PCS_SGMII || + priv->plat->pcs_mode == AN_CTRL_PCS_MD_C37_1000BASEX) { /* Only support ANE */ if (cmd->base.autoneg != AUTONEG_ENABLE) return -EINVAL; - mask &= (ADVERTISED_1000baseT_Half | - ADVERTISED_1000baseT_Full | - ADVERTISED_100baseT_Half | - ADVERTISED_100baseT_Full | - ADVERTISED_10baseT_Half | - ADVERTISED_10baseT_Full); - mutex_lock(>lock); stmmac_pcs_ctrl_ane(priv, priv->ioaddr, 1, priv->hw->ps, 0); + + /* For 1000BASE-X C37 AN, it is always 1000Mbps. And, we only +* support FD which is set by default in SR_MII_AN_ADV +* during XPCS init. So, we don't need to set FD again. +* For SGMII C37 AN, we let user to change link settings +* through PHY since it is MAC side SGMII. 
+*/ + stmmac_xpcs_ctrl_ane(priv, dev, 1, 0); + mutex_unlock(>lock); return 0; @@ -457,6 +465,16 @@ static void stmmac_ethtool_gregs(struct net_device *dev, pause->autoneg = 1; if (!adv_lp.pause) return; + } else if (priv->plat->pcs_mode == AN_CTRL_PCS_MD_C37_1000BASEX && + !stmmac_xpcs_get_adv_lp(priv, netdev, _lp, + priv->plat->pcs_mode)) { + /* DW xPCS 1000BASE-X C37 AN mode only because for MAC side +
[PATCH net-next v3 1/5] net: stmmac: enable clause 45 mdio support
From: Kweh Hock Leong DWMAC4 is capable of supporting clause 45 MDIO communication. This patch enables the feature in stmmac_mdio_write() and stmmac_mdio_read() by following the mdiobus read/write format used by phy_write_mmd() and phy_read_mmd(). Reviewed-by: Li, Yifan Signed-off-by: Kweh Hock Leong Signed-off-by: Ong Boon Leong Signed-off-by: Weifeng Voon --- drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c | 40 ++- include/linux/phy.h | 2 ++ 2 files changed, 34 insertions(+), 8 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c index bdd351597b55..c3d8f1d145ec 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c @@ -34,11 +34,27 @@ #define MII_BUSY 0x0001 #define MII_WRITE 0x0002 +#define MII_DATA_MASK GENMASK(15, 0) /* GMAC4 defines */ #define MII_GMAC4_GOC_SHIFT 2 +#define MII_GMAC4_REG_ADDR_SHIFT 16 #define MII_GMAC4_WRITE (1 << MII_GMAC4_GOC_SHIFT) #define MII_GMAC4_READ (3 << MII_GMAC4_GOC_SHIFT) +#define MII_GMAC4_C45E BIT(1) + +static void stmmac_mdio_c45_setup(struct stmmac_priv *priv, int phyreg, + u32 *val, u32 *data) +{ + unsigned int reg_shift = priv->hw->mii.reg_shift; + unsigned int reg_mask = priv->hw->mii.reg_mask; + + *val |= MII_GMAC4_C45E; + *val &= ~reg_mask; + *val |= ((phyreg >> MII_DEVADDR_C45_SHIFT) << reg_shift) & reg_mask; + + *data |= (phyreg & MII_REGADDR_C45_MASK) << MII_GMAC4_REG_ADDR_SHIFT; +} /* XGMAC defines */ #define MII_XGMAC_SADDR BIT(18) @@ -165,22 +181,26 @@ static int stmmac_mdio_read(struct mii_bus *bus, int phyaddr, int phyreg) struct stmmac_priv *priv = netdev_priv(ndev); unsigned int mii_address = priv->hw->mii.addr; unsigned int mii_data = priv->hw->mii.data; - u32 v; - int data; u32 value = MII_BUSY; + int data = 0; + u32 v; value |= (phyaddr << priv->hw->mii.addr_shift) & priv->hw->mii.addr_mask; value |= (phyreg << priv->hw->mii.reg_shift) & priv->hw->mii.reg_mask; value |= 
(priv->clk_csr << priv->hw->mii.clk_csr_shift) & priv->hw->mii.clk_csr_mask; - if (priv->plat->has_gmac4) + if (priv->plat->has_gmac4) { value |= MII_GMAC4_READ; + if (phyreg & MII_ADDR_C45) + stmmac_mdio_c45_setup(priv, phyreg, &value, &data); + } if (readl_poll_timeout(priv->ioaddr + mii_address, v, !(v & MII_BUSY), 100, 1)) return -EBUSY; + writel(data, priv->ioaddr + mii_data); writel(value, priv->ioaddr + mii_address); if (readl_poll_timeout(priv->ioaddr + mii_address, v, !(v & MII_BUSY), @@ -188,7 +208,7 @@ static int stmmac_mdio_read(struct mii_bus *bus, int phyaddr, int phyreg) return -EBUSY; /* Read the data from the MII data register */ - data = (int)readl(priv->ioaddr + mii_data); + data = (int)readl(priv->ioaddr + mii_data) & MII_DATA_MASK; return data; } @@ -208,8 +228,9 @@ static int stmmac_mdio_write(struct mii_bus *bus, int phyaddr, int phyreg, struct stmmac_priv *priv = netdev_priv(ndev); unsigned int mii_address = priv->hw->mii.addr; unsigned int mii_data = priv->hw->mii.data; - u32 v; u32 value = MII_BUSY; + int data = phydata; + u32 v; value |= (phyaddr << priv->hw->mii.addr_shift) & priv->hw->mii.addr_mask; @@ -217,10 +238,13 @@ static int stmmac_mdio_write(struct mii_bus *bus, int phyaddr, int phyreg, value |= (priv->clk_csr << priv->hw->mii.clk_csr_shift) & priv->hw->mii.clk_csr_mask; - if (priv->plat->has_gmac4) + if (priv->plat->has_gmac4) { value |= MII_GMAC4_WRITE; - else + if (phyreg & MII_ADDR_C45) + stmmac_mdio_c45_setup(priv, phyreg, &value, &data); + } else { value |= MII_WRITE; + } /* Wait until any existing MII operation is complete */ if (readl_poll_timeout(priv->ioaddr + mii_address, v, !(v & MII_BUSY), @@ -228,7 +252,7 @@ static int stmmac_mdio_write(struct mii_bus *bus, int phyaddr, int phyreg, return -EBUSY; /* Set the MII address register to write */ - writel(phydata, priv->ioaddr + mii_data); + writel(data, priv->ioaddr + mii_data); writel(value, priv->ioaddr + mii_address); /* Wait until any existing MII operation is complete */ diff --git 
a/include/linux/phy.h b/include/linux/phy.h index 073fb151b5a9..d3daac8ec686 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -198,6 +198,8 @@
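
For readers following the clause 45 patch above: stmmac_mdio_c45_setup() splits the phyreg argument into a device (MMD) address and a 16-bit register address. The following is only a user-space sketch of that split, not the driver code. The constant values (bit 30 as the clause 45 flag, a shift of 16, a 5-bit device address) are assumptions restated locally because the phy.h hunk is cut off in this archive, and the helper names c45_pack(), c45_devad() and c45_regnum() are hypothetical.

```c
#include <stdint.h>

/* Assumed values, restated locally; not copied from the truncated phy.h hunk. */
#define SKETCH_MII_ADDR_C45          (1u << 30) /* flags a clause 45 access */
#define SKETCH_MII_DEVADDR_C45_SHIFT 16         /* position of the MMD number */
#define SKETCH_MII_REGADDR_C45_MASK  0xffffu    /* 16-bit register address */
#define SKETCH_MII_DEVADDR_MASK      0x1fu      /* MMD numbers are 5 bits wide */

/* Build the combined register number a caller would hand to the MDIO bus. */
static uint32_t c45_pack(uint32_t devad, uint32_t regnum)
{
	return SKETCH_MII_ADDR_C45 |
	       ((devad & SKETCH_MII_DEVADDR_MASK) << SKETCH_MII_DEVADDR_C45_SHIFT) |
	       (regnum & SKETCH_MII_REGADDR_C45_MASK);
}

/* Recover the MMD number; the mask also drops the clause 45 flag bit. */
static uint32_t c45_devad(uint32_t phyreg)
{
	return (phyreg >> SKETCH_MII_DEVADDR_C45_SHIFT) & SKETCH_MII_DEVADDR_MASK;
}

/* Recover the 16-bit register address within the MMD. */
static uint32_t c45_regnum(uint32_t phyreg)
{
	return phyreg & SKETCH_MII_REGADDR_C45_MASK;
}
```

In the patch itself the device address is masked with the per-core reg_mask rather than a fixed 5-bit mask, and the register address is shifted into the upper half of the MII data register.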
[PATCH net-next v3 5/5] net: stmmac: add EHL SGMII 1Gbps PCI info and PCI ID
Added EHL SGMII 1Gbps PCI ID. Different MII and speed will have different PCI ID. Signed-off-by: Voon Weifeng --- drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c | 111 +++ 1 file changed, 111 insertions(+) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c index 7cbc01f316fa..f2225c1eafc2 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c @@ -23,6 +23,7 @@ #include #include "stmmac.h" +#include "dwxpcs.h" /* * This struct is used to associate PCI Function of MAC controller on a board, @@ -118,6 +119,113 @@ static int stmmac_default_data(struct pci_dev *pdev, .setup = stmmac_default_data, }; +static int ehl_common_data(struct pci_dev *pdev, + struct plat_stmmacenet_data *plat) +{ + int i; + + plat->bus_id = 1; + plat->phy_addr = 0; + plat->clk_csr = 5; + plat->has_gmac = 0; + plat->has_gmac4 = 1; + plat->xpcs_phy_addr = 0x16; + plat->pcs_mode = AN_CTRL_PCS_MD_C37_SGMII; + plat->force_sf_dma_mode = 0; + plat->tso_en = 1; + + plat->rx_queues_to_use = 8; + plat->tx_queues_to_use = 8; + plat->rx_sched_algorithm = MTL_RX_ALGORITHM_SP; + + for (i = 0; i < plat->rx_queues_to_use; i++) { + plat->rx_queues_cfg[i].mode_to_use = MTL_QUEUE_DCB; + plat->rx_queues_cfg[i].chan = i; + + /* Disable Priority config by default */ + plat->rx_queues_cfg[i].use_prio = false; + + /* Disable RX queues routing by default */ + plat->rx_queues_cfg[i].pkt_route = 0x0; + } + + for (i = 0; i < plat->tx_queues_to_use; i++) { + plat->tx_queues_cfg[i].mode_to_use = MTL_QUEUE_DCB; + + /* Disable Priority config by default */ + plat->tx_queues_cfg[i].use_prio = false; + } + + plat->tx_sched_algorithm = MTL_TX_ALGORITHM_WRR; + plat->tx_queues_cfg[0].weight = 0x09; + plat->tx_queues_cfg[1].weight = 0x0A; + plat->tx_queues_cfg[2].weight = 0x0B; + plat->tx_queues_cfg[3].weight = 0x0C; + plat->tx_queues_cfg[4].weight = 0x0D; + plat->tx_queues_cfg[5].weight = 0x0E; + 
plat->tx_queues_cfg[6].weight = 0x0F; + plat->tx_queues_cfg[7].weight = 0x10; + + plat->mdio_bus_data->phy_reset = NULL; + plat->mdio_bus_data->phy_mask = 0; + + plat->dma_cfg->pbl = 32; + plat->dma_cfg->pblx8 = true; + plat->dma_cfg->fixed_burst = 0; + plat->dma_cfg->mixed_burst = 0; + plat->dma_cfg->aal = 0; + + plat->axi = devm_kzalloc(>dev, sizeof(*plat->axi), +GFP_KERNEL); + if (!plat->axi) + return -ENOMEM; + plat->axi->axi_lpi_en = 0; + plat->axi->axi_xit_frm = 0; + plat->axi->axi_wr_osr_lmt = 0; + plat->axi->axi_rd_osr_lmt = 2; + plat->axi->axi_blen[0] = 4; + plat->axi->axi_blen[1] = 8; + plat->axi->axi_blen[2] = 16; + + /* Set default value for multicast hash bins */ + plat->multicast_filter_bins = HASH_TABLE_SIZE; + + /* Set default value for unicast filter entries */ + plat->unicast_filter_entries = 1; + + /* Set the maxmtu to a default of JUMBO_LEN */ + plat->maxmtu = JUMBO_LEN; + + /* Set 32KB fifo size as the advertised fifo size in +* the HW features is not the same as the HW implementation +*/ + plat->tx_fifo_size = 32768; + plat->rx_fifo_size = 32768; + + return 0; +} + +static int ehl_sgmii1g_data(struct pci_dev *pdev, + struct plat_stmmacenet_data *plat) +{ + int ret; + + /* Set common default data first */ + ret = ehl_common_data(pdev, plat); + + if (ret) + return ret; + + plat->interface = PHY_INTERFACE_MODE_SGMII; + plat->has_xpcs = 1; + + return 0; +} + +static struct stmmac_pci_info ehl_sgmii1g_pci_info = { + .setup = ehl_sgmii1g_data, +}; + static const struct stmmac_pci_func_data galileo_stmmac_func_data[] = { { .func = 6, @@ -290,6 +398,7 @@ static int stmmac_pci_probe(struct pci_dev *pdev, res.addr = pcim_iomap_table(pdev)[i]; res.wol_irq = pdev->irq; res.irq = pdev->irq; + res.xpcs_irq = 0; return stmmac_dvr_probe(>dev, plat, ); } @@ -359,6 +468,7 @@ static int __maybe_unused stmmac_pci_resume(struct device *dev) #define STMMAC_QUARK_ID 0x0937 #define STMMAC_DEVICE_ID 0x1108 +#define STMMAC_EHL_SGMII1G_ID 0x4b31 #define 
STMMAC_DEVICE(vendor_id, dev_id, info) { \ PCI_VDEVICE(vendor_id, dev_id), \ @@ -369,6 +479,7 @@ static int __maybe_unused stmmac_pci_resume(struct device *dev) STMMAC_DEVICE(STMMAC, STMMAC_DEVICE_ID, stmmac_pci_info), STMMAC_DEVICE(STMICRO,
Re: [PATCH v5 3/7] iommu/vt-d: Introduce is_downstream_to_pci_bridge helper
Hi, On 5/28/19 7:50 PM, Eric Auger wrote: Several call sites are about to check whether a device belongs to the PCI sub-hierarchy of a candidate PCI-PCI bridge. Introduce an helper to perform that check. This looks good to me. Reviewed-by: Lu Baolu Best regards, Baolu Signed-off-by: Eric Auger --- drivers/iommu/intel-iommu.c | 37 + 1 file changed, 29 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 5ec8b5bd308f..879f11c82b05 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -736,12 +736,39 @@ static int iommu_dummy(struct device *dev) return dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO; } +/* is_downstream_to_pci_bridge - test if a device belongs to the + * PCI sub-hierarchy of a candidate PCI-PCI bridge + * + * @dev: candidate PCI device belonging to @bridge PCI sub-hierarchy + * @bridge: the candidate PCI-PCI bridge + * + * Return: true if @dev belongs to @bridge PCI sub-hierarchy + */ +static bool +is_downstream_to_pci_bridge(struct device *dev, struct device *bridge) +{ + struct pci_dev *pdev, *pbridge; + + if (!dev_is_pci(dev) || !dev_is_pci(bridge)) + return false; + + pdev = to_pci_dev(dev); + pbridge = to_pci_dev(bridge); + + if (pbridge->subordinate && + pbridge->subordinate->number <= pdev->bus->number && + pbridge->subordinate->busn_res.end >= pdev->bus->number) + return true; + + return false; +} + static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn) { struct dmar_drhd_unit *drhd = NULL; struct intel_iommu *iommu; struct device *tmp; - struct pci_dev *ptmp, *pdev = NULL; + struct pci_dev *pdev = NULL; u16 segment = 0; int i; @@ -787,13 +814,7 @@ static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devf goto out; } - if (!pdev || !dev_is_pci(tmp)) - continue; - - ptmp = to_pci_dev(tmp); - if (ptmp->subordinate && - ptmp->subordinate->number <= pdev->bus->number && - ptmp->subordinate->busn_res.end >= 
pdev->bus->number) + if (is_downstream_to_pci_bridge(dev, tmp)) goto got_pdev; }
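
The test performed by is_downstream_to_pci_bridge() boils down to a bus-number range check. The sketch below restates that check in stand-alone C under simplifying assumptions: struct bridge_range and bus_is_downstream() are hypothetical stand-ins for the pci_dev and pci_bus fields used in the patch, not kernel types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the bus numbers a PCI-PCI bridge forwards. */
struct bridge_range {
	bool has_subordinate; /* false when the device is not a bridge */
	uint8_t secondary;    /* first bus number behind the bridge */
	uint8_t subordinate;  /* last bus number behind the bridge */
};

/* True when a device on bus dev_bus sits somewhere below the bridge. */
static bool bus_is_downstream(uint8_t dev_bus, const struct bridge_range *br)
{
	return br->has_subordinate &&
	       br->secondary <= dev_bus &&
	       dev_bus <= br->subordinate;
}
```

A device on bus 3 is downstream of a bridge that forwards buses 2 to 5, while a device on bus 6 is not.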
[PATCH net-next v3 2/5] net: stmmac: introducing support for DWC xPCS logics
From: Ong Boon Leong xPCS is DWC Ethernet Physical Coding Sublayer that may be integrated into a GbE controller that uses DWC EQoS MAC controller. An example of HW configuration is shown below:- <-GBE Controller-->|<--External PHY chip--> +--+ +++---+ +--+ | EQoS | <-GMII->| DW |<-->|PHY| <-- SGMII --> | External GbE | | MAC| |xPCS||IF | | PHY Chip | +--+ +++---+ +--+ ^ ^ ^ | | | +-MDIO-+ xPCS is a Clause-45 MDIO Manageable Device (MMD) and we need a way to differentiate it from external PHY chip that is discovered over MDIO. Therefore, xpcs_phy_addr is introduced in stmmac platform data (plat_stmmacenet_data) for differentiating xPCS from 'phy_addr' that belongs to external PHY. Basic functionalities for initializing xPCS and configuring auto negotiation (AN), loopback, link status, AN advertisement and Link Partner ability are implemented. The implementation supports the C37 AN for 1000BASE-X and SGMII (MAC side SGMII only). Tested-by: Tan, Tee Min Reviewed-by: Voon Weifeng Reviewed-by: Kweh Hock Leong Signed-off-by: Ong Boon Leong Signed-off-by: Voon Weifeng --- drivers/net/ethernet/stmicro/stmmac/Makefile | 2 +- drivers/net/ethernet/stmicro/stmmac/common.h | 1 + drivers/net/ethernet/stmicro/stmmac/dwxpcs.c | 198 +++ drivers/net/ethernet/stmicro/stmmac/dwxpcs.h | 51 +++ drivers/net/ethernet/stmicro/stmmac/hwif.h | 19 +++ include/linux/stmmac.h | 1 + 6 files changed, 271 insertions(+), 1 deletion(-) create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwxpcs.c create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwxpcs.h diff --git a/drivers/net/ethernet/stmicro/stmmac/Makefile b/drivers/net/ethernet/stmicro/stmmac/Makefile index c529c21e9bdd..57ca648fae4e 100644 --- a/drivers/net/ethernet/stmicro/stmmac/Makefile +++ b/drivers/net/ethernet/stmicro/stmmac/Makefile @@ -6,7 +6,7 @@ stmmac-objs:= stmmac_main.o stmmac_ethtool.o stmmac_mdio.o ring_mode.o \ mmc_core.o stmmac_hwtstamp.o stmmac_ptp.o dwmac4_descs.o \ dwmac4_dma.o dwmac4_lib.o dwmac4_core.o dwmac5.o 
hwif.o \ stmmac_tc.o dwxgmac2_core.o dwxgmac2_dma.o dwxgmac2_descs.o \ - $(stmmac-y) + dwxpcs.o $(stmmac-y) # Ordering matters. Generic driver must be last. obj-$(CONFIG_STMMAC_PLATFORM) += stmmac-platform.o diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h index 272b9ca66314..67d03a5a21af 100644 --- a/drivers/net/ethernet/stmicro/stmmac/common.h +++ b/drivers/net/ethernet/stmicro/stmmac/common.h @@ -419,6 +419,7 @@ struct mii_regs { struct mac_device_info { const struct stmmac_ops *mac; + const struct stmmac_xpcs *xpcs; const struct stmmac_desc_ops *desc; const struct stmmac_dma_ops *dma; const struct stmmac_mode_ops *mode; diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxpcs.c b/drivers/net/ethernet/stmicro/stmmac/dwxpcs.c new file mode 100644 index ..081d3631afd2 --- /dev/null +++ b/drivers/net/ethernet/stmicro/stmmac/dwxpcs.c @@ -0,0 +1,198 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2019, Intel Corporation. + * DWC Ethernet Physical Coding Sublayer + */ +#include +#include +#include "dwxpcs.h" +#include "stmmac.h" + +/* DW xPCS mdiobus_read and mdiobus_write helper functions */ +#define xpcs_read(dev, reg) \ + mdiobus_read(priv->mii, xpcs_phy_addr, \ +MII_ADDR_C45 | (reg) | \ +((dev) << MII_DEVADDR_C45_SHIFT)) +#define xpcs_write(dev, reg, val) \ + mdiobus_write(priv->mii, xpcs_phy_addr, \ + MII_ADDR_C45 | (reg) | \ + ((dev) << MII_DEVADDR_C45_SHIFT), val) + +static void dw_xpcs_init(struct net_device *ndev, int pcs_mode) +{ + struct stmmac_priv *priv = netdev_priv(ndev); + int xpcs_phy_addr = priv->plat->xpcs_phy_addr; + int phydata; + + if (pcs_mode == AN_CTRL_PCS_MD_C37_SGMII) { + /* For AN for SGMII mode, the settings are :- +* 1) VR_MII_AN_CTRL Bit(2:1)[PCS_MODE] = 10b (SGMII AN) +* 2) VR_MII_AN_CTRL Bit(3) [TX_CONFIG] = 0b (MAC side SGMII) +*DW xPCS used with DW EQoS MAC is always MAC +*side SGMII. 
+* 3) VR_MII_AN_CTRL Bit(0) [AN_INTR_EN] = 1b (AN Interrupt +*enabled) +* 4) VR_MII_DIG_CTRL1 Bit(9) [MAC_AUTO_SW] = 1b (Automatic +*speed mode change after SGMII AN complete) +* Note: Since it is MAC side SGMII, there is no need to set +
[PATCH net-next v3 0/5] net: stmmac: enable EHL SGMII
This patch set enables the Ethernet controller (DW Ethernet QoS and DW
Ethernet PCS) with an SGMII interface in Elkhart Lake. The DW Ethernet PCS
is the Physical Coding Sublayer that sits between the Ethernet MAC and the
PHY and uses MDIO Clause 45 for communication.

Kweh Hock Leong (1):
  net: stmmac: enable clause 45 mdio support

Ong Boon Leong (3):
  net: stmmac: introducing support for DWC xPCS logics
  net: stmmac: add xpcs function hooks into main driver and ethtool
  net: stmmac: add xPCS functions for device with DWMACv5.1

Voon Weifeng (1):
  net: stmmac: add EHL SGMII 1Gbps PCI info and PCI ID

 drivers/net/ethernet/stmicro/stmmac/Makefile | 2 +-
 drivers/net/ethernet/stmicro/stmmac/common.h | 1 +
 drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 33
 drivers/net/ethernet/stmicro/stmmac/dwxpcs.c | 198 +
 drivers/net/ethernet/stmicro/stmmac/dwxpcs.h | 51 ++
 drivers/net/ethernet/stmicro/stmmac/hwif.c | 41 -
 drivers/net/ethernet/stmicro/stmmac/hwif.h | 21 +++
 drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 +
 .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 50 --
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 152
 drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c | 40 -
 drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c | 111
 include/linux/phy.h | 2 +
 include/linux/stmmac.h | 3 +
 14 files changed, 649 insertions(+), 58 deletions(-)
 create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwxpcs.c
 create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwxpcs.h

--
Changelog v2:
* Added support for the C37 AN for 1000BASE-X and SGMII (MAC side SGMII only)
* Removed the fix patch "net: stmmac: dma channel control register need to
  be init first" and submitted it to net
* Squashed the following 2 patches and moved the result to the end of the
  patch set:
  "net: stmmac: add EHL SGMII 1Gbps platform data and PCI ID"
  "net: stmmac: add xPCS platform data for EHL"

Changelog v3:
* Applied reverse Christmas tree ordering

1.9.1
[PATCH] sparc: perf: fix updated event period in response to PERF_EVENT_IOC_PERIOD
The PERF_EVENT_IOC_PERIOD ioctl command can be used to change the sample period of a running perf_event. Consequently, when calculating the next event period, the new period will only be considered after the previous one has overflowed. This patch changes the calculation of the remaining event ticks so that they are offset if the period has changed. See commit 3581fe0ef37c ("ARM: 7556/1: perf: fix updated event period in response to PERF_EVENT_IOC_PERIOD") for details. Signed-off-by: Young Xiao <92siuy...@gmail.com> --- arch/sparc/kernel/perf_event.c | 4 1 file changed, 4 insertions(+) diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c index 6de7c68..a58ae9c 100644 --- a/arch/sparc/kernel/perf_event.c +++ b/arch/sparc/kernel/perf_event.c @@ -891,6 +891,10 @@ static int sparc_perf_event_set_period(struct perf_event *event, s64 period = hwc->sample_period; int ret = 0; + /* The period may have been changed by PERF_EVENT_IOC_PERIOD */ + if (unlikely(period != hwc->last_period)) + left = period - (hwc->last_period - left); + if (unlikely(left <= -period)) { left = period; local64_set(&hwc->period_left, left); -- 2.7.4
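
To make the arithmetic in the hunk above concrete, here is a small stand-alone sketch of the adjustment. It assumes that `left` holds the ticks still outstanding against the old period; adjust_left() is a hypothetical helper, not a function in arch/sparc/kernel/perf_event.c.

```c
#include <stdint.h>

/*
 * Re-base the remaining tick count when the sample period has changed.
 * left:        ticks still outstanding against last_period
 * last_period: period that was in effect when counting started
 * period:      period currently requested (possibly changed via ioctl)
 */
static int64_t adjust_left(int64_t left, int64_t last_period, int64_t period)
{
	/* Ticks already counted against the old period. */
	int64_t elapsed = last_period - left;

	if (period != last_period)
		left = period - elapsed;

	return left;
}
```

With last_period = 1000 and left = 400, 600 ticks have already elapsed. Shrinking the period to 500 gives left = -100, meaning the new period has already been overshot by 100 ticks; growing it to 2000 gives left = 1400.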
[PATCH net-next v2] net: stmmac: Switch to devm_alloc_etherdev_mqs
Make use of devm_alloc_etherdev_mqs() to simplify the code. Signed-off-by: Jisheng Zhang --- Since V1: - fix the build error, sorry, my bad. drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index a87ec70b19f1..4defdcb4f237 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -4243,9 +4243,8 @@ int stmmac_dvr_probe(struct device *device, u32 queue, maxq; int ret = 0; - ndev = alloc_etherdev_mqs(sizeof(struct stmmac_priv), - MTL_MAX_TX_QUEUES, - MTL_MAX_RX_QUEUES); + ndev = devm_alloc_etherdev_mqs(device, sizeof(struct stmmac_priv), + MTL_MAX_TX_QUEUES, MTL_MAX_RX_QUEUES); if (!ndev) return -ENOMEM; @@ -4277,8 +4276,7 @@ int stmmac_dvr_probe(struct device *device, priv->wq = create_singlethread_workqueue("stmmac_wq"); if (!priv->wq) { dev_err(priv->device, "failed to create workqueue\n"); - ret = -ENOMEM; - goto error_wq; + return -ENOMEM; } INIT_WORK(&priv->service_task, stmmac_service_task); @@ -4434,8 +4432,6 @@ int stmmac_dvr_probe(struct device *device, } error_hw_init: destroy_workqueue(priv->wq); -error_wq: - free_netdev(ndev); return ret; } @@ -4472,7 +4468,6 @@ int stmmac_dvr_remove(struct device *dev) stmmac_mdio_unregister(ndev); destroy_workqueue(priv->wq); mutex_destroy(&priv->lock); - free_netdev(ndev); return 0; } -- 2.20.1
Re: [PATCH net-next] net: stmmac: Switch to devm_alloc_etherdev_mqs
On Tue, 28 May 2019 11:07:53 -0700 David Miller wrote:

> You never even tried to compiled this patch.

Oops, my bad. I tested the patch on another branch, but I made a mistake when I manually applied it to the net-next tree. Sorry.
Re: [RFC PATCH 0/3] Make deferred split shrinker memcg aware
On 5/29/19 9:22 AM, David Rientjes wrote: On Tue, 28 May 2019, Yang Shi wrote: I got some reports from our internal application team about memcg OOM. Even though the application has been killed by oom killer, there are still a lot THPs reside, page reclaim doesn't reclaim them at all. Some investigation shows they are on deferred split queue, memcg direct reclaim can't shrink them since THP deferred split shrinker is not memcg aware, this may cause premature OOM in memcg. The issue can be reproduced easily by the below test: Right, we've also encountered this. I talked to Kirill about it a week or so ago where the suggestion was to split all compound pages on the deferred split queues under the presence of even memory pressure. That breaks cgroup isolation and perhaps unfairly penalizes workloads that are running attached to other memcg hierarchies that are not under pressure because their compound pages are now split as a side effect. There is a benefit to keeping these compound pages around while not under memory pressure if all pages are subsequently mapped again. Yes, I do agree. I tried other approaches too, it sounds making deferred split queue per memcg is the optimal one. $ cgcreate -g memory:thp $ echo 4G > /sys/fs/cgroup/memory/thp/memory/limit_in_bytes $ cgexec -g memory:thp ./transhuge-stress 4000 transhuge-stress comes from kernel selftest. It is easy to hit OOM, but there are still a lot THP on the deferred split queue, memcg direct reclaim can't touch them since the deferred split shrinker is not memcg aware. Yes, we have seen this on at least 4.15 as well. Convert deferred split shrinker memcg aware by introducing per memcg deferred split queue. The THP should be on either per node or per memcg deferred split queue if it belongs to a memcg. When the page is immigrated to the other memcg, it will be immigrated to the target memcg's deferred split queue too. 
And, move deleting THP from deferred split queue in page free before memcg uncharge so that the page's memcg information is available. Reuse the second tail page's deferred_list for per memcg list since the same THP can't be on multiple deferred split queues at the same time. Remove THP specific destructor since it is not used anymore with memcg aware THP shrinker (Please see the commit log of patch 2/3 for the details). Make deferred split shrinker not depend on memcg kmem since it is not slab. It doesn't make sense to not shrink THP even though memcg kmem is disabled. With the above change the test demonstrated above doesn't trigger OOM anymore even though with cgroup.memory=nokmem. I'm curious if your internal applications team is also asking for statistics on how much memory can be freed if the deferred split queues can be shrunk? We have applications that monitor their own memory usage No, but this reminds me. The THPs on deferred split queue should be accounted into available memory too. through memcg stats or usage and proactively try to reduce that usage when it is growing too large. The deferred split queues have significantly increased both memcg usage and rss when they've upgraded kernels. How are your applications monitoring how much memory from deferred split queues can be freed on memory pressure? Any thoughts on providing it as a memcg stat? I don't think they have such monitor. I saw rss_huge is abormal in memcg stat even after the application is killed by oom, so I realized the deferred split queue may play a role here. The memcg stat doesn't have counters for available memory as global vmstat. It may be better to have such statistics, or extending reclaimable "slab" to shrinkable/reclaimable "memory". Thanks!
[PATCH 1/1] Revert "drivers: thermal: tsens: Add new operation to check if a sensor is enabled"
This reverts commit 3e6a8fb3308419129c7a52de6eb42feef5a919a0. Cc: Andy Gross Cc: David Brown Cc: Amit Kucheria Cc: Zhang Rui Cc: Daniel Lezcano Suggested-by: Amit Kucheria Reported-by: Andy Gross Signed-off-by: Eduardo Valentin --- Added this for next -rc, as per request. drivers/thermal/qcom/tsens-common.c | 14 -- drivers/thermal/qcom/tsens-v0_1.c | 1 - drivers/thermal/qcom/tsens-v2.c | 1 - drivers/thermal/qcom/tsens.c | 5 - drivers/thermal/qcom/tsens.h | 1 - 5 files changed, 22 deletions(-) diff --git a/drivers/thermal/qcom/tsens-common.c b/drivers/thermal/qcom/tsens-common.c index 928e8e8..528df88 100644 --- a/drivers/thermal/qcom/tsens-common.c +++ b/drivers/thermal/qcom/tsens-common.c @@ -64,20 +64,6 @@ void compute_intercept_slope(struct tsens_priv *priv, u32 *p1, } } -bool is_sensor_enabled(struct tsens_priv *priv, u32 hw_id) -{ - u32 val; - int ret; - - if ((hw_id > (priv->num_sensors - 1)) || (hw_id < 0)) - return -EINVAL; - ret = regmap_field_read(priv->rf[SENSOR_EN], &val); - if (ret) - return ret; - - return val & (1 << hw_id); -} - static inline int code_to_degc(u32 adc_code, const struct tsens_sensor *s) { int degc, num, den; diff --git a/drivers/thermal/qcom/tsens-v0_1.c b/drivers/thermal/qcom/tsens-v0_1.c index a319283..6f26fad 100644 --- a/drivers/thermal/qcom/tsens-v0_1.c +++ b/drivers/thermal/qcom/tsens-v0_1.c @@ -334,7 +334,6 @@ static const struct reg_field tsens_v0_1_regfields[MAX_REGFIELDS] = { /* CTRL_OFFSET */ [TSENS_EN] = REG_FIELD(SROT_CTRL_OFF, 0, 0), [TSENS_SW_RST] = REG_FIELD(SROT_CTRL_OFF, 1, 1), - [SENSOR_EN]= REG_FIELD(SROT_CTRL_OFF, 3, 13), /* - TM -- */ /* INTERRUPT ENABLE */ diff --git a/drivers/thermal/qcom/tsens-v2.c b/drivers/thermal/qcom/tsens-v2.c index 1099069..0a4f2b8 100644 --- a/drivers/thermal/qcom/tsens-v2.c +++ b/drivers/thermal/qcom/tsens-v2.c @@ -44,7 +44,6 @@ static const struct reg_field tsens_v2_regfields[MAX_REGFIELDS] = { /* CTRL_OFF */ [TSENS_EN] = REG_FIELD(SROT_CTRL_OFF,0, 0), [TSENS_SW_RST] = 
REG_FIELD(SROT_CTRL_OFF,1, 1), - [SENSOR_EN]= REG_FIELD(SROT_CTRL_OFF,3, 18), /* - TM -- */ /* INTERRUPT ENABLE */ diff --git a/drivers/thermal/qcom/tsens.c b/drivers/thermal/qcom/tsens.c index 36b0b52..0627d86 100644 --- a/drivers/thermal/qcom/tsens.c +++ b/drivers/thermal/qcom/tsens.c @@ -85,11 +85,6 @@ static int tsens_register(struct tsens_priv *priv) struct thermal_zone_device *tzd; for (i = 0; i < priv->num_sensors; i++) { - if (!is_sensor_enabled(priv, priv->sensor[i].hw_id)) { - dev_err(priv->dev, "sensor %d: disabled\n", - priv->sensor[i].hw_id); - continue; - } priv->sensor[i].priv = priv; priv->sensor[i].id = i; tzd = devm_thermal_zone_of_sensor_register(priv->dev, i, diff --git a/drivers/thermal/qcom/tsens.h b/drivers/thermal/qcom/tsens.h index eefe384..2fd9499 100644 --- a/drivers/thermal/qcom/tsens.h +++ b/drivers/thermal/qcom/tsens.h @@ -315,7 +315,6 @@ void compute_intercept_slope(struct tsens_priv *priv, u32 *pt1, u32 *pt2, u32 mo int init_common(struct tsens_priv *priv); int get_temp_tsens_valid(struct tsens_priv *priv, int i, int *temp); int get_temp_common(struct tsens_priv *priv, int i, int *temp); -bool is_sensor_enabled(struct tsens_priv *priv, u32 hw_id); /* TSENS target */ extern const struct tsens_plat_data data_8960; -- 2.1.4
Re: [PATCH] thermal: tsens: Remove unnecessary comparison of unsigned integer with < 0
Gustavo,

On Mon, May 27, 2019 at 11:08:25AM -0500, Gustavo A. R. Silva wrote:
> There is no need to compare hw_id with < 0 because such comparison
> of an unsigned value is always false.
>
> Fix this by removing such comparison.

Thanks for fixing this. But we had to revert the commit that introduced
this issue, so this patch is no longer applicable.

>
> Addresses-Coverity-ID: 1445440 ("Unsigned compared against 0")
> Fixes: 3e6a8fb33084 ("drivers: thermal: tsens: Add new operation to check if
> a sensor is enabled")
> Signed-off-by: Gustavo A. R. Silva
> ---
>  drivers/thermal/qcom/tsens-common.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/thermal/qcom/tsens-common.c
> b/drivers/thermal/qcom/tsens-common.c
> index 928e8e81ba69..94878ad35464 100644
> --- a/drivers/thermal/qcom/tsens-common.c
> +++ b/drivers/thermal/qcom/tsens-common.c
> @@ -69,7 +69,7 @@ bool is_sensor_enabled(struct tsens_priv *priv, u32 hw_id)
>  	u32 val;
>  	int ret;
>
> -	if ((hw_id > (priv->num_sensors - 1)) || (hw_id < 0))
> +	if (hw_id > priv->num_sensors - 1)
>  		return -EINVAL;
>  	ret = regmap_field_read(priv->rf[SENSOR_EN], &val);
>  	if (ret)
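
For anyone reading this thread later, the point about the unsigned comparison can be shown with a short stand-alone sketch. hw_id_in_range() is a hypothetical helper written for this note and is not part of the tsens driver; it assumes both values are 32-bit unsigned, as hw_id is in the quoted function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * hw_id is unsigned, so a "hw_id < 0" test can never be true: a caller
 * passing -1 ends up with 0xffffffff, which the upper-bound check
 * already rejects. Writing the bound as "hw_id < num_sensors" also
 * avoids computing "num_sensors - 1", which would wrap around if
 * num_sensors were an unsigned zero.
 */
static bool hw_id_in_range(uint32_t hw_id, uint32_t num_sensors)
{
	return hw_id < num_sensors;
}
```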
Re: [PATCH] kernel/sys.c: fix possible spectre-v1 in do_prlimit()
Hi,

Even though the CPU detects the misprediction and discards the speculatively
executed instructions, it cannot undo all of their effects, such as changes
to the cache state. During speculative execution, the code

	rlim = tsk->signal->rlim + resource;	// use resource as index
	...
	*old_rlim = *rlim;

may read some secret data into the cache, and the attacker can then use a
side-channel attack to find out what that secret data is.

Virtually any observable effect of speculatively executed code can be
leveraged to create the covert channel that leaks sensitive information[1].

A general form of Spectre v1 would be[1]:

	if (x < array1_size) {
		y = array1[x];
		// do something using y that is
		// observable when speculatively
		// executed
	}

[1] https://spectreattack.com/spectre.pdf

On Tue, May 28, 2019 at 3:10 PM, Cyrill Gorcunov wrote:
>
> On Tue, May 28, 2019 at 10:37:10AM +0800, Dianzhang Chen wrote:
> > Hi,
> > Because when i reply your email,i always get 'Message rejected' from
> > gmail(get this rejection from all the recipients). I still don't know
> > how to deal with it, so i reply your email here:
>
> Hi! This is weird. Next time simply reply to LKML (I CC'ed it back).
>
> > Because of speculative execution, the attacker can bypass the bound
> > check `if (resource >= RLIM_NLIMITS)`.
>
> And then misprediction get detected and execution is dropped. So I
> still don't see a problem here, since we don't leak info even in
> such case.
>
> That said I don't mind for this patch but rather in a sake of
> code clarity, not because of spectre issue since it has
> nothing to do here.
>
> > as for array_index_nospec(index, size), it will clamp the index within
> > the range of [0, size), and attacker can't exploit speculative
> > execution to make the index out of range [0, size).
> >
> >
> > For more detail, please check the link below:
> >
> > https://github.com/torvalds/linux/commit/f3804203306e098dae9ca51540fcd5eb700d7f40
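
The clamping that array_index_nospec() is described as doing in this thread can be illustrated with a branch-free mask. The sketch below is modelled on what the generic fallback in the kernel is understood to compute, but it is only a user-space illustration: sketch_index_mask() and sketch_index_nospec() are hypothetical names, the code omits the compiler barrier the kernel uses, and it relies on right-shifting a negative signed value being an arithmetic shift, which is implementation-defined in C. It is not a usable Spectre mitigation.

```c
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/*
 * All ones when index < size, zero otherwise, computed without a branch.
 * For an in-range index both "index" and "size - 1 - index" have a clear
 * top bit, so the inverted value is negative and the arithmetic shift
 * smears the sign bit across the whole word.
 */
static unsigned long sketch_index_mask(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (SKETCH_BITS_PER_LONG - 1));
}

/* In-range indexes pass through unchanged; out-of-range ones become 0. */
static unsigned long sketch_index_nospec(unsigned long index, unsigned long size)
{
	return index & sketch_index_mask(index, size);
}
```

Because the result depends on data flow rather than on a conditional branch, a mispredicted bounds check cannot steer the later load to an arbitrary out-of-range address.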
[PATCH] pinctrl: ns2: Fix potential NULL dereference
platform_get_resource() may fail and return NULL, so we should check its return value to avoid a NULL pointer dereference a bit later in the code. Signed-off-by: Young Xiao <92siuy...@gmail.com> --- drivers/pinctrl/bcm/pinctrl-ns2-mux.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/pinctrl/bcm/pinctrl-ns2-mux.c b/drivers/pinctrl/bcm/pinctrl-ns2-mux.c index 4b5cf0e..2bf6af7 100644 --- a/drivers/pinctrl/bcm/pinctrl-ns2-mux.c +++ b/drivers/pinctrl/bcm/pinctrl-ns2-mux.c @@ -1048,6 +1048,8 @@ static int ns2_pinmux_probe(struct platform_device *pdev) return PTR_ERR(pinctrl->base0); res = platform_get_resource(pdev, IORESOURCE_MEM, 1); + if (!res) + return -EINVAL; pinctrl->base1 = devm_ioremap_nocache(&pdev->dev, res->start, resource_size(res)); if (!pinctrl->base1) { -- 2.7.4
Re: [PATCH 1/3] mm: thp: make deferred split shrinker memcg aware
On 5/28/19 10:42 PM, Kirill Tkhai wrote: Hi, Yang, On 28.05.2019 15:44, Yang Shi wrote: Currently THP deferred split shrinker is not memcg aware, this may cause premature OOM with some configuration. For example the below test would run into premature OOM easily: $ cgcreate -g memory:thp $ echo 4G > /sys/fs/cgroup/memory/thp/memory/limit_in_bytes $ cgexec -g memory:thp transhuge-stress 4000 transhuge-stress comes from kernel selftest. It is easy to hit OOM, but there are still a lot THP on the deferred split queue, memcg direct reclaim can't touch them since the deferred split shrinker is not memcg aware. Convert deferred split shrinker memcg aware by introducing per memcg deferred split queue. The THP should be on either per node or per memcg deferred split queue if it belongs to a memcg. When the page is immigrated to the other memcg, it will be immigrated to the target memcg's deferred split queue too. And, move deleting THP from deferred split queue in page free before memcg uncharge so that the page's memcg information is available. Reuse the second tail page's deferred_list for per memcg list since the same THP can't be on multiple deferred split queues. Cc: Kirill Tkhai Cc: Johannes Weiner Cc: Michal Hocko Cc: "Kirill A . Shutemov" Cc: Hugh Dickins Cc: Shakeel Butt Signed-off-by: Yang Shi --- include/linux/huge_mm.h| 24 ++ include/linux/memcontrol.h | 6 ++ include/linux/mm_types.h | 7 +- mm/huge_memory.c | 182 + mm/memcontrol.c| 20 + mm/swap.c | 4 + 6 files changed, 194 insertions(+), 49 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 7cd5c15..f6d1cde 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -250,6 +250,26 @@ static inline bool thp_migration_supported(void) return IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION); } +static inline struct list_head *page_deferred_list(struct page *page) +{ + /* +* Global deferred list in the second tail pages is occupied by +* compound_head. 
+*/ + return &page[2].deferred_list; +} + +static inline struct list_head *page_memcg_deferred_list(struct page *page) +{ + /* +* Memcg deferred list in the second tail pages is occupied by +* compound_head. +*/ + return &page[2].memcg_deferred_list; +} + +extern void del_thp_from_deferred_split_queue(struct page *); + #else /* CONFIG_TRANSPARENT_HUGEPAGE */ #define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; }) #define HPAGE_PMD_MASK ({ BUILD_BUG(); 0; }) @@ -368,6 +388,10 @@ static inline bool thp_migration_supported(void) { return false; } + +static inline void del_thp_from_deferred_split_queue(struct page *page) +{ +} #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ #endif /* _LINUX_HUGE_MM_H */ diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index bc74d6a..9ff5fab 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -316,6 +316,12 @@ struct mem_cgroup { struct list_head event_list; spinlock_t event_list_lock; +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + struct list_head split_queue; + unsigned long split_queue_len; + spinlock_t split_queue_lock; +#endif + struct mem_cgroup_per_node *nodeinfo[0]; /* WARNING: nodeinfo must be the last member here */ }; diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 8ec38b1..405f5e6 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -139,7 +139,12 @@ struct page { struct {/* Second tail page of compound page */ unsigned long _compound_pad_1; /* compound_head */ unsigned long _compound_pad_2; - struct list_head deferred_list; + union { + /* Global THP deferred split list */ + struct list_head deferred_list; + /* Memcg THP deferred split list */ + struct list_head memcg_deferred_list; Why we need two namesakes for this list entry? For me it looks redundantly: it does not give additional information, but it leads to duplication (and we have two helpers page_deferred_list() and page_memcg_deferred_list() instead of one). Yes, kind of.
Actually, I was also wondering whether this is worthwhile. My point is that it may improve code readability: we can tell which split queue (per node or per memcg) is being manipulated just from the name of the list. If most people think this is unnecessary, I'm definitely OK with keeping just one name. + }; }; struct {/* Page table pages */ unsigned long _pt_pad_1;/* compound_head */ diff --git
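The naming question in this exchange can be illustrated with a small user-space sketch. It assumes simplified, hypothetical stand-ins (`struct tail_page` and a two-pointer `struct list_head`) rather than the kernel's real `struct page`, and it needs C11 for the anonymous union. It only shows that two names inside one union refer to the same storage, so the two helpers return the same address and differ only in what the name tells the reader.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical, simplified stand-in for the second-tail-page layout. */
struct tail_page {
	unsigned long pad_1;
	unsigned long pad_2;
	union {
		struct list_head deferred_list;       /* per-node queue name */
		struct list_head memcg_deferred_list; /* per-memcg queue name */
	};
};

static struct list_head *page_deferred_list(struct tail_page *page)
{
	return &page[2].deferred_list;
}

static struct list_head *page_memcg_deferred_list(struct tail_page *page)
{
	return &page[2].memcg_deferred_list;
}

/* Returns 1 when both helpers resolve to the same storage, 0 otherwise. */
static int helpers_alias(struct tail_page *pages)
{
	assert(pages != NULL);
	return page_deferred_list(pages) == page_memcg_deferred_list(pages);
}
```

Under these assumptions the second name adds no storage and no behaviour; whether the readability gain justifies a second helper is the judgment call the thread is debating.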
Re: [PATCH -next] drivers: thermal: tsens: Change hw_id type to int in is_sensor_enabled
YueHaibing, On Mon, May 27, 2019 at 09:41:24PM +0800, YueHaibing wrote: > Sensor hw_id is int type other u32, is_sensor_enabled > should use int to compare, this fix smatch warning: > > drivers/thermal/qcom/tsens-common.c:72 > is_sensor_enabled() warn: unsigned 'hw_id' is never less than zero. > > Fixes: 3e6a8fb33084 ("drivers: thermal: tsens: Add new operation to check if > a sensor is enabled") Thanks for the patch, but we had to revert this commit which was causing some issues. So, your patch is not applicable. > Signed-off-by: YueHaibing Thank you anyways. > --- > drivers/thermal/qcom/tsens-common.c | 2 +- > drivers/thermal/qcom/tsens.h| 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/thermal/qcom/tsens-common.c > b/drivers/thermal/qcom/tsens-common.c > index 928e8e81ba69..5df4eed84535 100644 > --- a/drivers/thermal/qcom/tsens-common.c > +++ b/drivers/thermal/qcom/tsens-common.c > @@ -64,7 +64,7 @@ void compute_intercept_slope(struct tsens_priv *priv, u32 > *p1, > } > } > > -bool is_sensor_enabled(struct tsens_priv *priv, u32 hw_id) > +bool is_sensor_enabled(struct tsens_priv *priv, int hw_id) > { > u32 val; > int ret; > diff --git a/drivers/thermal/qcom/tsens.h b/drivers/thermal/qcom/tsens.h > index eefe3844fb4e..15264806f6a8 100644 > --- a/drivers/thermal/qcom/tsens.h > +++ b/drivers/thermal/qcom/tsens.h > @@ -315,7 +315,7 @@ void compute_intercept_slope(struct tsens_priv *priv, u32 > *pt1, u32 *pt2, u32 mo > int init_common(struct tsens_priv *priv); > int get_temp_tsens_valid(struct tsens_priv *priv, int i, int *temp); > int get_temp_common(struct tsens_priv *priv, int i, int *temp); > -bool is_sensor_enabled(struct tsens_priv *priv, u32 hw_id); > +bool is_sensor_enabled(struct tsens_priv *priv, int hw_id); > > /* TSENS target */ > extern const struct tsens_plat_data data_8960;
Re: [PATCH] arm64: dts: ls1028a: Add Thermal Monitor Unit node
On Thu, Apr 25, 2019 at 04:26:40PM +0800, Yuantian Tang wrote: > The Thermal Monitoring Unit (TMU) monitors and reports the > temperature from 2 remote temperature measurement sites > located on ls1028a chip. > Add TMU dts node to enable this feature. > > Signed-off-by: Yuantian Tang I dont see anything wrong from a thermal standpoint. Acked-by: Eduardo Valentin Please get this via your arch tree maintainer to avoid merge conflicts. > --- > arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi | 114 > > 1 files changed, 114 insertions(+), 0 deletions(-) > > diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi > b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi > index b045812..a25f5fc 100644 > --- a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi > +++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi > @@ -29,6 +29,7 @@ > clocks = < 1 0>; > next-level-cache = <>; > cpu-idle-states = <_PH20>; > + #cooling-cells = <2>; > }; > > cpu1: cpu@1 { > @@ -39,6 +40,7 @@ > clocks = < 1 0>; > next-level-cache = <>; > cpu-idle-states = <_PH20>; > + #cooling-cells = <2>; > }; > > l2: l2-cache { > @@ -398,6 +400,118 @@ > status = "disabled"; > }; > > + tmu: tmu@1f0 { > + compatible = "fsl,qoriq-tmu"; > + reg = <0x0 0x1f8 0x0 0x1>; > + interrupts = <0 23 0x4>; > + fsl,tmu-range = <0xb 0xa0026 0x80048 0x70061>; > + fsl,tmu-calibration = <0x 0x0024 > +0x0001 0x002b > +0x0002 0x0031 > +0x0003 0x0038 > +0x0004 0x003f > +0x0005 0x0045 > +0x0006 0x004c > +0x0007 0x0053 > +0x0008 0x0059 > +0x0009 0x0060 > +0x000a 0x0066 > +0x000b 0x006d > + > +0x0001 0x001c > +0x00010001 0x0024 > +0x00010002 0x002c > +0x00010003 0x0035 > +0x00010004 0x003d > +0x00010005 0x0045 > +0x00010006 0x004d > +0x00010007 0x0045 > +0x00010008 0x005e > +0x00010009 0x0066 > +0x0001000a 0x006e > + > +0x0002 0x0018 > +0x00020001 0x0022 > +0x00020002 0x002d > +0x00020003 0x0038 > +0x00020004 0x0043 > +0x00020005 0x004d > +0x00020006 0x0058 > +0x00020007 0x0063 > +0x00020008 0x006e > + > +0x0003 0x0010 > +0x00030001 0x001c 
> +0x00030002 0x0029 > +0x00030003 0x0036 > +0x00030004 0x0042 > +0x00030005 0x004f > +0x00030006 0x005b > +0x00030007 0x0068>; > + little-endian; > + #thermal-sensor-cells = <1>; > + }; > + > + thermal-zones { > + core-cluster { > + polling-delay-passive = <1000>; > + polling-delay = <5000>; > + thermal-sensors = < 0>; > + > + trips { > + core_cluster_alert: core-cluster-alert { > +
[net-next:master 161/171] drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:924:6: sparse: sparse: symbol 'hclge_dbg_get_m7_stats_info' was not declared. Should it be static?
tree: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master head: 602e0f295a91813c9a15938f2a292b9c60a416d9 commit: 33a90e2f20e6c455889a0f41857692221172a5ae [161/171] net: hns3: add support for dump firmware statistics by debugfs reproduce: # apt-get install sparse # sparse version: v0.6.1-rc1-7-g2b96cd8-dirty git checkout 33a90e2f20e6c455889a0f41857692221172a5ae make ARCH=x86_64 allmodconfig make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' If you fix the issue, kindly add following tag Reported-by: kbuild test robot sparse warnings: (new ones prefixed by >>) drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:32:17: sparse: sparse: cast from restricted __le32 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:564:31: sparse: sparse: restricted __le16 degrades to integer drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:598:39: sparse: sparse: incorrect type in assignment (different base types) @@expected unsigned int @@got restricted __le32unsigned int @@ drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:598:39: sparse: expected unsigned int drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:598:39: sparse: got restricted __le32 [usertype] qs_bit_map drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:833:30: sparse: sparse: restricted __le16 degrades to integer drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:840:33: sparse: sparse: restricted __le16 degrades to integer drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:841:30: sparse: sparse: restricted __le16 degrades to integer drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:842:31: sparse: sparse: restricted __le16 degrades to integer drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:844:33: sparse: sparse: restricted __le16 degrades to integer >> drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:924:6: sparse: >> sparse: symbol 'hclge_dbg_get_m7_stats_info' was not declared. 
Should it be static? Please review and possibly fold the followup patch. --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
[RFC PATCH net-next] net: hns3: hclge_dbg_get_m7_stats_info() can be static
Fixes: 33a90e2f20e6 ("net: hns3: add support for dump firmware statistics by debugfs") Signed-off-by: kbuild test robot --- hclge_debugfs.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c index ed1f533..4fbed47a 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c @@ -921,7 +921,7 @@ static void hclge_dbg_dump_rst_info(struct hclge_dev *hdev) hdev->rst_stats.reset_cnt); } -void hclge_dbg_get_m7_stats_info(struct hclge_dev *hdev) +static void hclge_dbg_get_m7_stats_info(struct hclge_dev *hdev) { struct hclge_desc *desc_src, *desc_tmp; struct hclge_get_m7_bd_cmd *req;
[GIT PULL] tracing: Avoid memory leak in predicate_parse()
Linus, This fixes a memory leak from the error path in the event filter logic. Please pull the latest trace-v5.2-rc2 tree, which can be found at: git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git trace-v5.2-rc2 Tag SHA1: 0658b13d1bfd40bda1c2bd1ef3738857e1bf4000 Head SHA1: dfb4a6f2191a80c8b790117d0ff592fd712d3296 Tomas Bortoli (1): tracing: Avoid memory leak in predicate_parse() kernel/trace/trace_events_filter.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) --- commit dfb4a6f2191a80c8b790117d0ff592fd712d3296 Author: Tomas Bortoli Date: Tue May 28 17:43:38 2019 +0200 tracing: Avoid memory leak in predicate_parse() In case of errors, predicate_parse() goes to the out_free label to free memory and to return an error code. However, predicate_parse() does not free the predicates of the temporary prog_stack array, thence leaking them. Link: http://lkml.kernel.org/r/20190528154338.29976-1-tomasbort...@gmail.com Cc: sta...@vger.kernel.org Fixes: 80765597bc587 ("tracing: Rewrite filter logic to be simpler and faster") Reported-by: syzbot+6b8e0fb820e570c59...@syzkaller.appspotmail.com Signed-off-by: Tomas Bortoli [ Added protection around freeing prog_stack[i].pred ] Signed-off-by: Steven Rostedt (VMware) diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c index d3e59312ef40..5079d1db3754 100644 --- a/kernel/trace/trace_events_filter.c +++ b/kernel/trace/trace_events_filter.c @@ -428,7 +428,7 @@ predicate_parse(const char *str, int nr_parens, int nr_preds, op_stack = kmalloc_array(nr_parens, sizeof(*op_stack), GFP_KERNEL); if (!op_stack) return ERR_PTR(-ENOMEM); - prog_stack = kmalloc_array(nr_preds, sizeof(*prog_stack), GFP_KERNEL); + prog_stack = kcalloc(nr_preds, sizeof(*prog_stack), GFP_KERNEL); if (!prog_stack) { parse_error(pe, -ENOMEM, 0); goto out_free; @@ -579,7 +579,11 @@ predicate_parse(const char *str, int nr_parens, int nr_preds, out_free: kfree(op_stack); kfree(inverts); - kfree(prog_stack); + if 
(prog_stack) { + for (i = 0; prog_stack[i].pred; i++) + kfree(prog_stack[i].pred); + kfree(prog_stack); + } return ERR_PTR(ret); }
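The cleanup pattern in this fix can be sketched in user space. The snippet below is a minimal illustration, not the kernel code: `struct prog_entry` and `parse_and_fail()` are hypothetical names, `calloc()` stands in for `kcalloc()`, and the extra terminator slot is an assumption of the sketch. It shows why the array must be zeroed: the error path walks entries until it meets a NULL `pred`, which is only safe if unused slots are guaranteed to be NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry of the kernel's prog_stack array. */
struct prog_entry {
	int *pred;	/* owned allocation; NULL means "slot never filled" */
};

/*
 * Allocate a zeroed array, fill the first `filled` slots, then take the
 * error path. Returns the number of predicates released, or -1 if the
 * array itself could not be allocated.
 */
static int parse_and_fail(size_t nr_preds, size_t filled)
{
	struct prog_entry *prog_stack;
	int released = 0;
	size_t i;

	/* One extra zeroed slot acts as a terminator (sketch assumption). */
	prog_stack = calloc(nr_preds + 1, sizeof(*prog_stack));
	if (!prog_stack)
		return -1;

	for (i = 0; i < filled && i < nr_preds; i++) {
		prog_stack[i].pred = malloc(sizeof(*prog_stack[i].pred));
		if (!prog_stack[i].pred)
			break;
	}

	/* The terminator slot is never written, so the walk below stops. */
	assert(prog_stack[nr_preds].pred == NULL);

	/* Error path: free every predicate, then the array itself. */
	for (i = 0; prog_stack[i].pred; i++) {
		free(prog_stack[i].pred);
		released++;
	}
	free(prog_stack);

	return released;
}
```

With a non-zeroed allocation the same loop would read uninitialized pointers, which is presumably why the patch switches the allocation to `kcalloc()` before adding the loop.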
[PATCH v2 net-next] net: mvpp2: cls: Remove unnessesary check in mvpp2_ethtool_cls_rule_ins
Fix smatch warning: drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c:1236 mvpp2_ethtool_cls_rule_ins() warn: unsigned 'info->fs.location' is never less than zero. 'info->fs.location' is u32 type, never less than zero. Signed-off-by: YueHaibing --- v2: rework patch based net-next --- drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c index bd19a910dc90..e1c90adb2982 100644 --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c @@ -1300,8 +1300,7 @@ int mvpp2_ethtool_cls_rule_ins(struct mvpp2_port *port, struct mvpp2_ethtool_fs *efs, *old_efs; int ret = 0; - if (info->fs.location >= MVPP2_N_RFS_ENTRIES_PER_FLOW || - info->fs.location < 0) + if (info->fs.location >= MVPP2_N_RFS_ENTRIES_PER_FLOW) return -EINVAL; efs = kzalloc(sizeof(*efs), GFP_KERNEL); -- 2.20.1
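The reasoning behind the smatch warning can be checked with a short user-space sketch. The limit `N_ENTRIES_PER_FLOW` and both function names are hypothetical (the real value of `MVPP2_N_RFS_ENTRIES_PER_FLOW` is not shown in the patch), and a 32-bit `int` is assumed. The point is that a negative value converted to an unsigned 32-bit type becomes a very large number, so the single upper-bound test already rejects it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit standing in for MVPP2_N_RFS_ENTRIES_PER_FLOW. */
#define N_ENTRIES_PER_FLOW 4u

/* Upper-bound check only, as in the patched code. */
static int location_is_valid(uint32_t location)
{
	return location < N_ENTRIES_PER_FLOW;
}

/* Shows what happens when a caller starts from a signed value. */
static int signed_location_is_valid(int location)
{
	/* A negative int converts to a value above INT32_MAX. */
	assert(location >= 0 || (uint32_t)location > (uint32_t)INT32_MAX);
	return location_is_valid((uint32_t)location);
}
```

A separate `< 0` test on the unsigned field can therefore never be true, which matches the warning text quoted in the commit message.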
Re: [PATCH RESEND V13 2/5] thermal: of-thermal: add API for getting sensor ID from DT
On Tue, May 28, 2019 at 02:06:18PM +0800, anson.hu...@nxp.com wrote: > From: Anson Huang > > On some platforms like i.MX8QXP, the thermal driver needs a > real HW sensor ID from DT thermal zone, the HW sensor ID is > used to get temperature from SCU firmware, and the virtual > sensor ID starting from 0 to N is NOT used at all, this patch > adds new API thermal_zone_of_get_sensor_id() to provide the > feature of getting sensor ID from DT thermal zone's node. > > Signed-off-by: Anson Huang > --- > Changes since V12: > - adjust the second parameter of thermal_zone_of_get_sensor_id() API, > then caller no need > to pass the of_phandle_args structure and put the sensor_specs.np > manually, also putting > the sensor node device check inside this API to make it easy for > usage; What happened to using nxp,resource-id property in your driver? Why do we need this as an API in of-thermal? What other drivers may benefit of this? Regardless, this patch needs to document the new API under Documentation/ > --- > drivers/thermal/of-thermal.c | 66 > +--- > include/linux/thermal.h | 10 +++ > 2 files changed, 60 insertions(+), 16 deletions(-) > > diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c > index dc5093b..a53792b 100644 > --- a/drivers/thermal/of-thermal.c > +++ b/drivers/thermal/of-thermal.c > @@ -449,6 +449,54 @@ thermal_zone_of_add_sensor(struct device_node *zone, > } > > /** > + * thermal_zone_of_get_sensor_id - get sensor ID from a DT thermal zone > + * @tz_np: a valid thermal zone device node. > + * @sensor_np: a sensor node of a valid sensor device. > + * @id: a sensor ID pointer will be passed back. > + * > + * This function will get sensor ID from a given thermal zone node, use > + * "thermal-sensors" as list name, and get sensor ID from first phandle's > + * argument. > + * > + * Return: 0 on success, proper error code otherwise. 
> + */ > + > +int thermal_zone_of_get_sensor_id(struct device_node *tz_np, > + struct device_node *sensor_np, > + u32 *id) > +{ > + struct of_phandle_args sensor_specs; > + int ret; > + > + ret = of_parse_phandle_with_args(tz_np, > + "thermal-sensors", > + "#thermal-sensor-cells", > + 0, > + &sensor_specs); > + if (ret) > + return ret; > + > + if (sensor_specs.np != sensor_np) { > + of_node_put(sensor_specs.np); > + return -ENODEV; > + } > + > + if (sensor_specs.args_count >= 1) { > + *id = sensor_specs.args[0]; > + WARN(sensor_specs.args_count > 1, > + "%pOFn: too many cells in sensor specifier %d\n", > + sensor_specs.np, sensor_specs.args_count); > + } else { > + *id = 0; > + } > + > + of_node_put(sensor_specs.np); > + > + return 0; > +} > +EXPORT_SYMBOL_GPL(thermal_zone_of_get_sensor_id); > + > +/** > * thermal_zone_of_sensor_register - registers a sensor to a DT thermal zone > * @dev: a valid struct device pointer of a sensor device. Must contain > * a valid .of_node, for the sensor node. > @@ -499,36 +547,22 @@ thermal_zone_of_sensor_register(struct device *dev, int > sensor_id, void *data, > sensor_np = of_node_get(dev->of_node); > > for_each_available_child_of_node(np, child) { > - struct of_phandle_args sensor_specs; > int ret, id; > > /* For now, thermal framework supports only 1 sensor per zone */ > - ret = of_parse_phandle_with_args(child, "thermal-sensors", > - "#thermal-sensor-cells", > - 0, &sensor_specs); > + ret = thermal_zone_of_get_sensor_id(child, sensor_np, &id); > if (ret) > continue; > > - if (sensor_specs.args_count >= 1) { > - id = sensor_specs.args[0]; > - WARN(sensor_specs.args_count > 1, > - "%pOFn: too many cells in sensor specifier %d\n", > - sensor_specs.np, sensor_specs.args_count); > - } else { > - id = 0; > - } > - > - if (sensor_specs.np == sensor_np && id == sensor_id) { > + if (id == sensor_id) { > tzd = thermal_zone_of_add_sensor(child, sensor_np, >data, ops); > if (!IS_ERR(tzd)) > tzd->ops->set_mode(tzd, THERMAL_DEVICE_ENABLED); > > - 
of_node_put(sensor_specs.np); > of_node_put(child); > goto exit; > } > -
Re: [PATCH] lib: test_overflow: Avoid taining the kernel and fix wrap size
On Tue, May 28, 2019 at 04:40:06PM -0700, Joe Perches wrote: > On Tue, 2019-05-28 at 15:51 -0700, Kees Cook wrote: > > This adds __GFP_NOWARN to the kmalloc()-portions of the overflow test to > > avoid tainting the kernel. Additionally fixes up the math on wrap size > > to be architecture and page size agnostic. > [] > > diff --git a/lib/test_overflow.c b/lib/test_overflow.c > [] > > @@ -486,16 +486,17 @@ static int __init test_overflow_shift(void) > [] > > +#define alloc_GFP (GFP_KERNEL | __GFP_NOWARN) > [] > > +#define alloc110(alloc, arg, sz) alloc(arg, sz, alloc_GFP | __GFP_NOWARN) > > seems redundant. Whoops. Missed that one. Fixing... -- Kees Cook
Re: [PATCH] thermal/drivers/of: Add a get_temp_id callback function
On Thu, May 23, 2019 at 07:48:56PM -0700, Andrey Smirnov wrote: > On Mon, Apr 29, 2019 at 9:51 AM Daniel Lezcano > wrote: > > > > On 24/04/2019 01:08, Daniel Lezcano wrote: > > > On 23/04/2019 17:44, Eduardo Valentin wrote: > > >> Hello, > > >> > > >> On Tue, Apr 16, 2019 at 07:22:03PM +0200, Daniel Lezcano wrote: > > >>> Currently when we register a sensor, we specify the sensor id and a data > > >>> pointer to be passed when the get_temp function is called. However the > > >>> sensor_id is not passed to the get_temp callback forcing the driver to > > >>> do extra allocation and adding back pointer to find out from the sensor > > >>> information the driver data and then back to the sensor id. > > >>> > > >>> Add a new callback get_temp_id() which will be called if set. It will > > >>> call the get_temp_id() with the sensor id. > > >>> > > >>> That will be more consistent with the registering function. > > >> > > >> I still do not understand why we need to have a get_id callback. > > >> The use cases I have seen so far, which I have been intentionally > > >> rejecting, are > > >> mainly solvable by creating other compatible entries. And really, if you > > >> have, say a bandgap, chip that supports multiple sensors, but on > > >> SoC version A it has 5 sensors, and on SoC version B it has only 4, > > >> or on SoC version C, it has 5 but they are either logially located > > >> in different places (gpu vs iva regions), these are all cases in which > > >> you want a different compatible! > > >> > > >> Do you mind sharing why you need a get sensor id callback? > > > > > > It is not a get sensor id callback, it is a get_temp callback which pass > > > the sensor id. > > > > > > See in the different drivers, it is a common pattern there is a > > > structure for the driver, then a structure for the sensor. 
When the > > > get_temp is called, the callback needs info from the sensor structure > > > and from the driver structure, so a back pointer to the driver structure > > > is added in the sensor structure. > > Do you mind sending a patch showing how one could convert an existing driver to use this new API? > > Hi Eduardo, > > > > does the explanation clarify the purpose of this change? > > > > Eduardo, did you ever have a chance to revisit this thread? I would > really like to make some progress on this one to unblock my i.MX8MQ > hwmon series. The problem I have with this patch is that it is an API which resides only in of-thermal. Growing APIs on DT only diverges of-thermal from thermal core and platform drivers. Besides, this patch needs to document the API in Documentation/ > > Thanks, > Andrey Smirnov
Re: [PATCH v3 2/4] mtd: rawnand: Add Macronix MX25F0A NAND controller
Hi Miquel, > > > > > > +static void mxic_nand_select_chip(struct nand_chip *chip, int > > chipnr) > > > > > > > > > > _select_target() is preferred now > > > > > > > > Do you mean I implement mxic_nand_select_target() to control #CS ? > > > > > > > > If so, I need to call mxic_nand_select_target( ) to control #CS ON > > > > and then #CS OFF in _exec_op() due to nand_select_target() > nand_base,c> > > > > is still calling chip->legacy.select_chip ? > > > > > > You must forget about the ->select_chip() callback. Now it should be > > > handled directly from the controller driver. Please have a look at the > > > commit pointed against the marvell_nand.c driver. > > > > I have no Marvell NFC datasheet and have one question. > > > > In marvell_nand.c, there is no xxx_deselect_target() or > > something like that doing #CS OFF. > > marvell_nfc_select_target() seems always to make one of chip or die > > #CS keep low. > > > > Is it right ? > > Yes, AFAIR there is no "de-assert" mechanism in this controller. > > > > > How to make all #CS keep high for NAND to enter > > low-power standby mode if driver don't use "legacy.select_chip()" ? > > See commit 02b4a52604a4 ("mtd: rawnand: Make ->select_chip() optional > when ->exec_op() is implemented") which states: > > "When [->select_chip() is] not implemented, the core is assuming >the CS line is automatically asserted/deasserted by the driver >->exec_op() implementation." > > Of course, the above is right only when the controller driver supports > the ->exec_op() interface. Currently, it seems that we will get the incorrect data and error operation due to CS in error toggling if CS line is controlled in ->exec_op(). i.e,. 1) In nand_onfi_detect() to call nand_exec_op() twice by nand_read_param_page_op() and annd_read_data_op() 2) In nand_write_page_xxx to call nand_exec_op() many times by nand_prog_page_begin_op(), nand_write_data_op() and nand_prog_page_end_op(). 
Should we consider to add a CS line controller in struct nand_controller i.e,. struct nand_controller { struct mutex lock; const struct nand_controller_ops *ops; + void (*select_chip)(struct nand_chip *chip, int cs); }; to replace legacy.select_chip() ? To patch in nand_select_target() and nand_deselect_target() void nand_select_target(struct nand_chip *chip, unsigned int cs) { /* * cs should always lie between 0 and chip->numchips, when that's not * the case it's a bug and the caller should be fixed. */ if (WARN_ON(cs > chip->numchips)) return; chip->cur_cs = cs; + if (chip->controller->select_chip) + chip->controller->select_chip(chip, cs); + if (chip->legacy.select_chip) chip->legacy.select_chip(chip, cs); } void nand_deselect_target(struct nand_chip *chip) { + if (chip->controller->select_chip) + chip->controller->select_chip(chip, -1); + if (chip->legacy.select_chip) chip->legacy.select_chip(chip, -1); chip->cur_cs = -1; } > > So if you think it is not too time consuming and worth the trouble to > assert/deassert the CS at each operation, you may do it in your driver. > > > Thanks, > Miquèl thanks & best regards, Mason
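The proposal in this message, an optional controller-level chip-select hook called from the select and deselect helpers, can be sketched outside the kernel as follows. Every type and function name here is hypothetical (the `_sketch` suffix marks that); none of this is the rawnand API, and `last_hw_cs` exists only so the behaviour can be observed. It is a sketch of the dispatch idea under those assumptions, not an endorsement of the design.

```c
#include <assert.h>
#include <stddef.h>

struct nand_chip_sketch;

/* Hypothetical controller with an optional chip-select hook. */
struct nand_controller_sketch {
	void (*select_chip)(struct nand_chip_sketch *chip, int cs);
};

struct nand_chip_sketch {
	struct nand_controller_sketch *controller;
	int numchips;
	int cur_cs;	/* -1 when no target is selected */
	int last_hw_cs;	/* what the hook last drove; observation aid only */
};

/* Example hook: records the CS line it was asked to drive. */
static void record_cs(struct nand_chip_sketch *chip, int cs)
{
	chip->last_hw_cs = cs;
}

static void select_target(struct nand_chip_sketch *chip, int cs)
{
	assert(chip->controller != NULL);

	/*
	 * Same upper bound as the quoted code; the negative check is added
	 * because this sketch uses a signed cs.
	 */
	if (cs < 0 || cs > chip->numchips)
		return;

	chip->cur_cs = cs;
	if (chip->controller->select_chip)
		chip->controller->select_chip(chip, cs);
}

static void deselect_target(struct nand_chip_sketch *chip)
{
	assert(chip->controller != NULL);

	/* -1 asks the hook to de-assert every CS line. */
	if (chip->controller->select_chip)
		chip->controller->select_chip(chip, -1);
	chip->cur_cs = -1;
}
```

Because the hook is optional, controllers that toggle CS inside their own operation handler would leave it NULL and see no change in behaviour.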
[PATCH v2] lib: test_overflow: Avoid tainting the kernel and fix wrap size
This adds __GFP_NOWARN to the kmalloc()-portions of the overflow test to avoid tainting the kernel. Additionally fixes up the math on wrap size to be architecture and page size agnostic. Reported-by: Randy Dunlap Suggested-by: Rasmus Villemoes Fixes: ca90800a91ba ("test_overflow: Add memory allocation overflow tests") Signed-off-by: Kees Cook --- v2: fix leftover __GFP_NOWARN (joe) --- lib/test_overflow.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/lib/test_overflow.c b/lib/test_overflow.c index fc680562d8b6..7a4b6f6c5473 100644 --- a/lib/test_overflow.c +++ b/lib/test_overflow.c @@ -486,16 +486,17 @@ static int __init test_overflow_shift(void) * Deal with the various forms of allocator arguments. See comments above * the DEFINE_TEST_ALLOC() instances for mapping of the "bits". */ -#define alloc010(alloc, arg, sz) alloc(sz, GFP_KERNEL) -#define alloc011(alloc, arg, sz) alloc(sz, GFP_KERNEL, NUMA_NO_NODE) +#define alloc_GFP (GFP_KERNEL | __GFP_NOWARN) +#define alloc010(alloc, arg, sz) alloc(sz, alloc_GFP) +#define alloc011(alloc, arg, sz) alloc(sz, alloc_GFP, NUMA_NO_NODE) #define alloc000(alloc, arg, sz) alloc(sz) #define alloc001(alloc, arg, sz) alloc(sz, NUMA_NO_NODE) -#define alloc110(alloc, arg, sz) alloc(arg, sz, GFP_KERNEL) +#define alloc110(alloc, arg, sz) alloc(arg, sz, alloc_GFP) #define free0(free, arg, ptr) free(ptr) #define free1(free, arg, ptr) free(arg, ptr) -/* Wrap around to 8K */ -#define TEST_SIZE (9 << PAGE_SHIFT) +/* Wrap around to 16K */ +#define TEST_SIZE (5 * 4096) #define DEFINE_TEST_ALLOC(func, free_func, want_arg, want_gfp, want_node)\ static int __init test_ ## func (void *arg)\ -- 2.17.1 -- Kees Cook
Re: [PATCH -next] EDAC: aspeed: Remove set but not used variable 'np'
On Tuesday, May 28, 2019 at 6:27 PM, Andrew Jeffery wrote: > On Sun, 26 May 2019, at 00:12, YueHaibing wrote: > > Fixes gcc '-Wunused-but-set-variable' warning: > > > > drivers/edac/aspeed_edac.c: In function aspeed_probe: > > drivers/edac/aspeed_edac.c:284:22: warning: variable np set but not > > used [-Wunused-but-set-variable] > > > > It is never used and can be removed. > > > > Signed-off-by: YueHaibing > > Reviewed-by: Andrew Jeffery Reviewed-by: Stefan Schaeckeler > > --- > > drivers/edac/aspeed_edac.c | 4 > > 1 file changed, 4 deletions(-) > > > > diff --git a/drivers/edac/aspeed_edac.c b/drivers/edac/aspeed_edac.c > > index 11833c0a5d07..5634437bb39d 100644 > > --- a/drivers/edac/aspeed_edac.c > > +++ b/drivers/edac/aspeed_edac.c > > @@ -281,15 +281,11 @@ static int aspeed_probe(struct platform_device *pdev) > > struct device *dev = &pdev->dev; > > struct edac_mc_layer layers[2]; > > struct mem_ctl_info *mci; > > - struct device_node *np; > > struct resource *res; > > void __iomem *regs; > > u32 reg04; > > int rc; > > > > - /* setup regmap */ > > - np = dev->of_node; > > - > > res = platform_get_resource(pdev, IORESOURCE_MEM, 0); > > if (!res) > > return -ENOENT; > > -- > > 2.17.1
Re: [PATCH] cpumask: Remove error message and backtrace on out-of-memory condition
On Mon, 27 May 2019 14:29:58 +0200 Geert Uytterhoeven wrote: > There is no need to print an error message and backtrace if > kmalloc_node() fails, as the memory allocation core already takes care > of that. > > ... > > --- a/lib/cpumask.c > +++ b/lib/cpumask.c > @@ -114,13 +114,6 @@ bool alloc_cpumask_var_node(cpumask_var_t *mask, gfp_t > flags, int node) > { > *mask = kmalloc_node(cpumask_size(), flags, node); > > -#ifdef CONFIG_DEBUG_PER_CPU_MAPS > - if (!*mask) { > - printk(KERN_ERR "=> alloc_cpumask_var: failed!\n"); > - dump_stack(); > - } > -#endif > - > return *mask != NULL; > } > EXPORT_SYMBOL(alloc_cpumask_var_node); Well, not really - as it stands CONFIG_DEBUG_PER_CPU_MAPS=y can override a caller's __GFP_NOWARN. I wonder if anyone ever sets CONFIG_DEBUG_PER_CPU_MAPS any more...
Re: [PATCH 3/4] vsock/virtio: fix flush of works during the .remove()
On 2019/5/28 6:56 PM, Stefano Garzarella wrote: We flush all pending works before to call vdev->config->reset(vdev), but other works can be queued before the vdev->config->del_vqs(vdev), so we add another flush after it, to avoid use after free. Suggested-by: Michael S. Tsirkin Signed-off-by: Stefano Garzarella --- net/vmw_vsock/virtio_transport.c | 23 +-- 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index e694df10ab61..ad093ce96693 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -660,6 +660,15 @@ static int virtio_vsock_probe(struct virtio_device *vdev) return ret; } +static void virtio_vsock_flush_works(struct virtio_vsock *vsock) +{ + flush_work(&vsock->loopback_work); + flush_work(&vsock->rx_work); + flush_work(&vsock->tx_work); + flush_work(&vsock->event_work); + flush_work(&vsock->send_pkt_work); +} + static void virtio_vsock_remove(struct virtio_device *vdev) { struct virtio_vsock *vsock = vdev->priv; @@ -668,12 +677,6 @@ static void virtio_vsock_remove(struct virtio_device *vdev) mutex_lock(&the_virtio_vsock_mutex); the_virtio_vsock = NULL; - flush_work(&vsock->loopback_work); - flush_work(&vsock->rx_work); - flush_work(&vsock->tx_work); - flush_work(&vsock->event_work); - flush_work(&vsock->send_pkt_work); - /* Reset all connected sockets when the device disappear */ vsock_for_each_connected_socket(virtio_vsock_reset_sock); @@ -690,6 +693,9 @@ static void virtio_vsock_remove(struct virtio_device *vdev) vsock->event_run = false; mutex_unlock(&vsock->event_lock); + /* Flush all pending works */ + virtio_vsock_flush_works(vsock); + /* Flush all device writes and interrupts, device will not use any * more buffers.
*/ @@ -726,6 +732,11 @@ static void virtio_vsock_remove(struct virtio_device *vdev) /* Delete virtqueues and flush outstanding callbacks if any */ vdev->config->del_vqs(vdev); + /* Other works can be queued before 'config->del_vqs()', so we flush +* all works before to free the vsock object to avoid use after free. +*/ + virtio_vsock_flush_works(vsock); Some questions after a quick glance: 1) It looks to me that the work could be queued from the path of vsock_transport_cancel_pkt(). Is that synchronized here? 2) If we decide to flush after del_vqs(), is tx_run/rx_run/event_run still needed? It looks to me we've already done except that we need flush rx_work in the end since send_pkt_work can requeue rx_work. Thanks + kfree(vsock); mutex_unlock(&the_virtio_vsock_mutex); }
Re: linux-next: Fixes tag needs some work in the cifs tree
On Fri, May 24, 2019 at 10:14 PM Steve French wrote: > > fixed and repushed to cifs-2.6.git for-next Thanks! [resend including mail lists] > > On Thu, May 23, 2019 at 11:27 PM Stephen Rothwell > wrote: > > > > Hi all, > > > > In commit > > > > f875253b5fe6 ("fs/cifs/smb2pdu.c: fix buffer free in SMB2_ioctl_free") > > > > Fixes tag > > > > Fixes: 2c87d6a ("cifs: Allocate memory for all iovs in smb2_ioctl") > > > > has these problem(s): > > > > - SHA1 should be at least 12 digits long > > Can be fixed by setting core.abbrev to 12 (or more) or (for git v2.11 > > or later) just making sure it is not set (or set to "auto"). > > > > -- > > Cheers, > > Stephen Rothwell > > > > -- > Thanks, > > Steve
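The rule quoted in this report (the SHA1 in a Fixes tag should be at least 12 hex digits) is simple enough to check mechanically. The helper below is a hypothetical user-space sketch, not part of any kernel or linux-next tooling; the function names and the `MIN_SHA_LEN` constant are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Minimum abbreviated SHA1 length requested in the report above. */
#define MIN_SHA_LEN 12

/*
 * Returns the number of hex digits that follow "Fixes: " on the line,
 * or -1 if the line does not start with that prefix.
 */
static int fixes_sha_len(const char *line)
{
	static const char prefix[] = "Fixes: ";
	size_t plen = sizeof(prefix) - 1;
	int n = 0;

	assert(line != NULL);
	if (strncmp(line, prefix, plen) != 0)
		return -1;

	for (line += plen; isxdigit((unsigned char)*line); line++)
		n++;

	return n;
}

/* Returns 1 when the tag carries a long enough SHA1, 0 otherwise. */
static int fixes_tag_ok(const char *line)
{
	return fixes_sha_len(line) >= MIN_SHA_LEN;
}
```

Applied to the tag quoted in the report, `fixes_sha_len()` counts seven digits for `2c87d6a`, so the tag is rejected; a twelve-digit abbreviation such as `ca90800a91ba` passes.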
RE: [EXT] Re: [PATCH] arm64: dts: ls1028a: Add Thermal Monitor Unit node
> -Original Message- > From: Eduardo Valentin > Sent: 2019年5月29日 10:54 > To: Andy Tang > Cc: shawn...@kernel.org; Leo Li ; > robh...@kernel.org; mark.rutl...@arm.com; > linux-arm-ker...@lists.infradead.org; devicet...@vger.kernel.org; > linux-kernel@vger.kernel.org; linux...@vger.kernel.org; > daniel.lezc...@linaro.org; rui.zh...@intel.com > Subject: [EXT] Re: [PATCH] arm64: dts: ls1028a: Add Thermal Monitor Unit > node > > Caution: EXT Email > > On Thu, Apr 25, 2019 at 04:26:40PM +0800, Yuantian Tang wrote: > > The Thermal Monitoring Unit (TMU) monitors and reports the temperature > > from 2 remote temperature measurement sites located on ls1028a chip. > > Add TMU dts node to enable this feature. > > > > Signed-off-by: Yuantian Tang > > I dont see anything wrong from a thermal standpoint. > > Acked-by: Eduardo Valentin > > Please get this via your arch tree maintainer to avoid merge conflicts. Thanks for your review. The only concern for arch tree maintainer is that "cooling-maps" is a required property. So I have to add cooling-maps for each zone. Since there are two thermal zones but only one cooling device, which is cpufreq, I have to use CPUFREQ as cooling device twice which may cause cooling decision conflict. The case will get worse when we have 7 thermal zones. This makes me think "maybe we need to change cooling-maps to an optional property". In this way, we can put the cooling devices to specific thermal zones and leave the zones without Cooling devices to do the default action which is reset or poweroff soc. What's your opinion about this? 
BR, Andy > > > --- > > arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi | 114 > > > > 1 files changed, 114 insertions(+), 0 deletions(-) > > > > diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi > > b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi > > index b045812..a25f5fc 100644 > > --- a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi > > +++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi > > @@ -29,6 +29,7 @@ > > clocks = < 1 0>; > > next-level-cache = <>; > > cpu-idle-states = <_PH20>; > > + #cooling-cells = <2>; > > }; > > > > cpu1: cpu@1 { > > @@ -39,6 +40,7 @@ > > clocks = < 1 0>; > > next-level-cache = <>; > > cpu-idle-states = <_PH20>; > > + #cooling-cells = <2>; > > }; > > > > l2: l2-cache { > > @@ -398,6 +400,118 @@ > > status = "disabled"; > > }; > > > > + tmu: tmu@1f0 { > > + compatible = "fsl,qoriq-tmu"; > > + reg = <0x0 0x1f8 0x0 0x1>; > > + interrupts = <0 23 0x4>; > > + fsl,tmu-range = <0xb 0xa0026 0x80048 > 0x70061>; > > + fsl,tmu-calibration = <0x 0x0024 > > +0x0001 > 0x002b > > +0x0002 > 0x0031 > > +0x0003 > 0x0038 > > +0x0004 > 0x003f > > +0x0005 > 0x0045 > > +0x0006 > 0x004c > > +0x0007 > 0x0053 > > +0x0008 > 0x0059 > > +0x0009 > 0x0060 > > +0x000a > 0x0066 > > +0x000b > 0x006d > > + > > +0x0001 > 0x001c > > +0x00010001 > 0x0024 > > +0x00010002 > 0x002c > > +0x00010003 > 0x0035 > > +0x00010004 > 0x003d > > +0x00010005 > 0x0045 > > +0x00010006 > 0x004d > > +0x00010007 > 0x0045 > > +0x00010008 > 0x005e > > +0x00010009 > 0x0066 > > +0x0001000a > 0x006e > > + > > +0x0002 > 0x0018 > > +0x00020001 > 0x0022 > > +0x00020002 > 0x002d > > +0x00020003 > 0x0038 > > +
ebpf trace doesn't work during cpu hotplug
Hi,

It looks like eBPF tracing doesn't work during CPU hotplug. See the following steps:

1) trace two functions called during CPU unplug via bcc/trace:

/usr/share/bcc/tools/trace -T 'takedown_cpu "%d", arg1' 'take_cpu_down'

2) put cpu7 offline via:

echo 0 > /sys/devices/system/cpu/cpu7/online

3) only the trace on 'takedown_cpu' is dumped via bcc/trace:

TIME     PID  TID  COMM  FUNC          -
03:23:17 733  733  bash  takedown_cpu  7

The missing trace on 'take_cpu_down' is never shown, even after CPU7 is switched on again. take_cpu_down is called via stop_machine_cpuslocked.

Thanks,
Ming Lei
[PATCH] ASoC: cs42xx8: Fix build error when CONFIG_GPIOLIB is not set
From: Shengjiu Wang config: x86_64-randconfig-x000201921-201921 compiler: gcc-7 (Debian 7.3.0-1) 7.3.0 reproduce: make ARCH=x86_64 sound/soc/codecs/cs42xx8.c: In function ‘cs42xx8_probe’: sound/soc/codecs/cs42xx8.c:472:25: error: implicit declaration of function ‘devm_gpiod_get_optional’; did you mean ‘devm_clk_get_optional’? [-Werror=implicit-function-declaration] cs42xx8->gpiod_reset = devm_gpiod_get_optional(dev, "reset", ^~~ devm_clk_get_optional sound/soc/codecs/cs42xx8.c:473:8: error: ‘GPIOD_OUT_HIGH’ undeclared (first use in this function); did you mean ‘GPIOF_INIT_HIGH’? GPIOD_OUT_HIGH); ^~ GPIOF_INIT_HIGH sound/soc/codecs/cs42xx8.c:473:8: note: each undeclared identifier is reported only once for each function it appears in sound/soc/codecs/cs42xx8.c:477:2: error: implicit declaration of function ‘gpiod_set_value_cansleep’; did you mean ‘gpio_set_value_cansleep’? [-Werror=implicit-function-declaration] gpiod_set_value_cansleep(cs42xx8->gpiod_reset, 0); ^~~~ gpio_set_value_cansleep Fixes: bfe95dfa4dac ("ASoC: cs42xx8: Add reset gpio handling") Reported-by: kbuild test robot Signed-off-by: Shengjiu Wang --- sound/soc/codecs/cs42xx8.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/codecs/cs42xx8.c b/sound/soc/codecs/cs42xx8.c index 3e8dbf63adbe..3bbc62322dfe 100644 --- a/sound/soc/codecs/cs42xx8.c +++ b/sound/soc/codecs/cs42xx8.c @@ -14,7 +14,7 @@ #include #include #include -#include +#include #include #include #include -- 2.21.0
RE: [PATCH RESEND V13 2/5] thermal: of-thermal: add API for getting sensor ID from DT
Hi, Eduardo > -Original Message- > From: Eduardo Valentin > Sent: Wednesday, May 29, 2019 11:02 AM > To: Anson Huang > Cc: robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org; > s.ha...@pengutronix.de; ker...@pengutronix.de; feste...@gmail.com; > catalin.mari...@arm.com; will.dea...@arm.com; rui.zh...@intel.com; > daniel.lezc...@linaro.org; Aisheng Dong ; > ulf.hans...@linaro.org; Peng Fan ; Daniel Baluta > ; maxime.rip...@bootlin.com; o...@lixom.net; > ja...@amarulasolutions.com; horms+rene...@verge.net.au; Leonard Crestez > ; bjorn.anders...@linaro.org; > dingu...@kernel.org; enric.balle...@collabora.com; > devicet...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm- > ker...@lists.infradead.org; linux...@vger.kernel.org; dl-linux-imx i...@nxp.com> > Subject: Re: [PATCH RESEND V13 2/5] thermal: of-thermal: add API for getting > sensor ID from DT > > On Tue, May 28, 2019 at 02:06:18PM +0800, anson.hu...@nxp.com wrote: > > From: Anson Huang > > > > On some platforms like i.MX8QXP, the thermal driver needs a real HW > > sensor ID from DT thermal zone, the HW sensor ID is used to get > > temperature from SCU firmware, and the virtual sensor ID starting from > > 0 to N is NOT used at all, this patch adds new API > > thermal_zone_of_get_sensor_id() to provide the feature of getting > > sensor ID from DT thermal zone's node. > > > > Signed-off-by: Anson Huang > > --- > > Changes since V12: > > - adjust the second parameter of thermal_zone_of_get_sensor_id() API, > then caller no need > > to pass the of_phandle_args structure and put the sensor_specs.np > manually, also putting > > the sensor node device check inside this API to make it easy for > > usage; > > What happened to using nxp,resource-id property in your driver? > Why do we need this as an API in of-thermal? What other drivers may benefit > of this? 
> > Regardless, this patch needs to document the new API under Documentation/ As Rob has different opinion about this property, he thought it is unnecessary, see below discussion mail, that is why I need to add API to get the resource ID from phandle argument. I am totally confused now, which approach should we adopt? https://patchwork.kernel.org/patch/10831397/ Thanks, Anson > > > --- > > drivers/thermal/of-thermal.c | 66 +--- > > > include/linux/thermal.h | 10 +++ > > 2 files changed, 60 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/thermal/of-thermal.c > > b/drivers/thermal/of-thermal.c index dc5093b..a53792b 100644 > > --- a/drivers/thermal/of-thermal.c > > +++ b/drivers/thermal/of-thermal.c > > @@ -449,6 +449,54 @@ thermal_zone_of_add_sensor(struct device_node > > *zone, } > > > > /** > > + * thermal_zone_of_get_sensor_id - get sensor ID from a DT thermal > > + zone > > + * @tz_np: a valid thermal zone device node. > > + * @sensor_np: a sensor node of a valid sensor device. > > + * @id: a sensor ID pointer will be passed back. > > + * > > + * This function will get sensor ID from a given thermal zone node, > > + use > > + * "thermal-sensors" as list name, and get sensor ID from first > > + phandle's > > + * argument. > > + * > > + * Return: 0 on success, proper error code otherwise. 
> > + */ > > + > > +int thermal_zone_of_get_sensor_id(struct device_node *tz_np, > > + struct device_node *sensor_np, > > + u32 *id) > > +{ > > + struct of_phandle_args sensor_specs; > > + int ret; > > + > > + ret = of_parse_phandle_with_args(tz_np, > > +"thermal-sensors", > > +"#thermal-sensor-cells", > > +0, > > +_specs); > > + if (ret) > > + return ret; > > + > > + if (sensor_specs.np != sensor_np) { > > + of_node_put(sensor_specs.np); > > + return -ENODEV; > > + } > > + > > + if (sensor_specs.args_count >= 1) { > > + *id = sensor_specs.args[0]; > > + WARN(sensor_specs.args_count > 1, > > +"%pOFn: too many cells in sensor specifier %d\n", > > +sensor_specs.np, sensor_specs.args_count); > > + } else { > > + *id = 0; > > + } > > + > > + of_node_put(sensor_specs.np); > > + > > + return 0; > > +} > > +EXPORT_SYMBOL_GPL(thermal_zone_of_get_sensor_id); > > + > > +/** > > * thermal_zone_of_sensor_register - registers a sensor to a DT thermal > > zone > > * @dev: a valid struct device pointer of a sensor device. Must contain > > * a valid .of_node, for the sensor node. > > @@ -499,36 +547,22 @@ thermal_zone_of_sensor_register(struct device > *dev, int sensor_id, void *data, > > sensor_np = of_node_get(dev->of_node); > > > > for_each_available_child_of_node(np, child) { > > - struct of_phandle_args sensor_specs; > > int ret, id; > > > > /*
[PATCH] NFC: microread/pn544: Fix possible null pointer dereference error
The IRQ thread functions pn544_hci_i2c_irq_thread_fn and microread_i2c_irq_thread_fn access phy->hdev, but the IRQ is requested in pn544_hci_i2c_probe and microread_i2c_probe before phy->hdev has been initialized. Therefore, we change the order of the calls to xxx_probe and request_threaded_irq, and add a guard on phy->hdev in the xxx_i2c_irq_thread_fn functions. Signed-off-by: Young Xiao <92siuy...@gmail.com> --- drivers/nfc/microread/i2c.c | 19 +++ drivers/nfc/pn544/i2c.c | 16 2 files changed, 15 insertions(+), 20 deletions(-) diff --git a/drivers/nfc/microread/i2c.c b/drivers/nfc/microread/i2c.c index 1806d20..80fc6d5 100644 --- a/drivers/nfc/microread/i2c.c +++ b/drivers/nfc/microread/i2c.c @@ -212,7 +212,7 @@ static irqreturn_t microread_i2c_irq_thread_fn(int irq, void *phy_id) struct sk_buff *skb = NULL; int r; - if (!phy || irq != phy->i2c_dev->irq) { + if (!phy || !phy->hdev || irq != phy->i2c_dev->irq) { WARN_ON_ONCE(1); return IRQ_NONE; } @@ -257,6 +257,12 @@ static int microread_i2c_probe(struct i2c_client *client, i2c_set_clientdata(client, phy); phy->i2c_dev = client; + r = microread_probe(phy, _phy_ops, LLC_SHDLC_NAME, + MICROREAD_I2C_FRAME_HEADROOM, + MICROREAD_I2C_FRAME_TAILROOM, + MICROREAD_I2C_LLC_MAX_PAYLOAD, >hdev); + if (r < 0) + return r; r = request_threaded_irq(client->irq, NULL, microread_i2c_irq_thread_fn, IRQF_TRIGGER_RISING | IRQF_ONESHOT, @@ -266,21 +272,10 @@ static int microread_i2c_probe(struct i2c_client *client, return r; } - r = microread_probe(phy, _phy_ops, LLC_SHDLC_NAME, - MICROREAD_I2C_FRAME_HEADROOM, - MICROREAD_I2C_FRAME_TAILROOM, - MICROREAD_I2C_LLC_MAX_PAYLOAD, >hdev); - if (r < 0) - goto err_irq; nfc_info(>dev, "Probed\n"); return 0; - -err_irq: - free_irq(client->irq, phy); - - return r; } static int microread_i2c_remove(struct i2c_client *client) diff --git a/drivers/nfc/pn544/i2c.c b/drivers/nfc/pn544/i2c.c index d0207f8..c9694c8 100644 --- a/drivers/nfc/pn544/i2c.c +++ b/drivers/nfc/pn544/i2c.c @@ -496,7 +496,7 @@ static irqreturn_t pn544_hci_i2c_irq_thread_fn(int
irq, void *phy_id) struct sk_buff *skb = NULL; int r; - if (!phy || irq != phy->i2c_dev->irq) { + if (!phy || !phy->hdev || irq != phy->i2c_dev->irq) { WARN_ON_ONCE(1); return IRQ_NONE; } @@ -924,6 +924,13 @@ static int pn544_hci_i2c_probe(struct i2c_client *client, pn544_hci_i2c_platform_init(phy); + r = pn544_hci_probe(phy, _phy_ops, LLC_SHDLC_NAME, + PN544_I2C_FRAME_HEADROOM, PN544_I2C_FRAME_TAILROOM, + PN544_HCI_I2C_LLC_MAX_PAYLOAD, + pn544_hci_i2c_fw_download, >hdev); + if (r < 0) + return r; + r = devm_request_threaded_irq(>dev, client->irq, NULL, pn544_hci_i2c_irq_thread_fn, IRQF_TRIGGER_RISING | IRQF_ONESHOT, @@ -933,13 +940,6 @@ static int pn544_hci_i2c_probe(struct i2c_client *client, return r; } - r = pn544_hci_probe(phy, _phy_ops, LLC_SHDLC_NAME, - PN544_I2C_FRAME_HEADROOM, PN544_I2C_FRAME_TAILROOM, - PN544_HCI_I2C_LLC_MAX_PAYLOAD, - pn544_hci_i2c_fw_download, >hdev); - if (r < 0) - return r; - return 0; } -- 2.7.4
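The guard this patch adds to the IRQ thread functions can be modelled in a few lines of userspace C. This is a minimal sketch under stated assumptions, not the driver code: the struct and enum names ending in _sketch are hypothetical stand-ins for the kernel types, and the handler body is reduced to a counter.

```c
#include <stddef.h>

/* Hypothetical stand-ins for irqreturn_t, the HCI device and the phy. */
enum irq_result_sketch { IRQ_NONE_SKETCH, IRQ_HANDLED_SKETCH };

struct hci_dev_sketch {
	int frames_seen;		/* counts interrupts that were processed */
};

struct phy_sketch {
	int irq;			/* IRQ number the phy expects */
	struct hci_dev_sketch *hdev;	/* NULL until the xxx_probe step ran */
};

/*
 * Model of the guarded handler: bail out early when the phy is missing,
 * when hdev has not been set up yet, or when the IRQ number does not
 * match, so that hdev is never dereferenced while it is NULL.
 */
enum irq_result_sketch irq_thread_fn_sketch(int irq, struct phy_sketch *phy)
{
	if (!phy || !phy->hdev || irq != phy->irq)
		return IRQ_NONE_SKETCH;

	phy->hdev->frames_seen++;
	return IRQ_HANDLED_SKETCH;
}
```

In this model, an interrupt that arrives while hdev is still NULL is ignored instead of crashing, which corresponds to the window the patch closes by probing first and checking phy->hdev in the handler.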
[PATCH 1/1] Drivers: hv: vmbus: Break out ISA independent parts of mshyperv.h
Break out parts of mshyperv.h that are ISA independent into a separate file in include/asm-generic. This move facilitates ARM64 code reusing these definitions and avoids code duplication. No functionality or behavior is changed. Signed-off-by: Michael Kelley --- MAINTAINERS | 1 + arch/x86/include/asm/mshyperv.h | 147 +--- include/asm-generic/mshyperv.h | 182 3 files changed, 187 insertions(+), 143 deletions(-) create mode 100644 include/asm-generic/mshyperv.h diff --git a/MAINTAINERS b/MAINTAINERS index cf2a5b7..521192d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7308,6 +7308,7 @@ F:net/vmw_vsock/hyperv_transport.c F: include/clocksource/hyperv_timer.h F: include/linux/hyperv.h F: include/uapi/linux/hyperv.h +F: include/asm-generic/mshyperv.h F: tools/hv/ F: Documentation/ABI/stable/sysfs-bus-vmbus diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index f4fa8a9..2a793bf 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -3,84 +3,15 @@ #define _ASM_X86_MSHYPER_H #include -#include #include #include #include #include -#define VP_INVAL U32_MAX - -struct ms_hyperv_info { - u32 features; - u32 misc_features; - u32 hints; - u32 nested_features; - u32 max_vp_index; - u32 max_lp_index; -}; - -extern struct ms_hyperv_info ms_hyperv; - - typedef int (*hyperv_fill_flush_list_func)( struct hv_guest_mapping_flush_list *flush, void *data); -/* - * Generate the guest ID. 
- */ - -static inline __u64 generate_guest_id(__u64 d_info1, __u64 kernel_version, - __u64 d_info2) -{ - __u64 guest_id = 0; - - guest_id = (((__u64)HV_LINUX_VENDOR_ID) << 48); - guest_id |= (d_info1 << 48); - guest_id |= (kernel_version << 16); - guest_id |= d_info2; - - return guest_id; -} - - -/* Free the message slot and signal end-of-message if required */ -static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type) -{ - /* -* On crash we're reading some other CPU's message page and we need -* to be careful: this other CPU may already had cleared the header -* and the host may already had delivered some other message there. -* In case we blindly write msg->header.message_type we're going -* to lose it. We can still lose a message of the same type but -* we count on the fact that there can only be one -* CHANNELMSG_UNLOAD_RESPONSE and we don't care about other messages -* on crash. -*/ - if (cmpxchg(>header.message_type, old_msg_type, - HVMSG_NONE) != old_msg_type) - return; - - /* -* Make sure the write to MessageType (ie set to -* HVMSG_NONE) happens before we read the -* MessagePending and EOMing. 
Otherwise, the EOMing -* will not deliver any more messages since there is -* no empty slot -*/ - mb(); - - if (msg->header.message_flags.msg_pending) { - /* -* This will cause message queue rescan to -* possibly deliver another msg from the -* hypervisor -*/ - wrmsrl(HV_X64_MSR_EOM, 0); - } -} - #define hv_init_timer(timer, tick) \ wrmsrl(HV_X64_MSR_STIMER0_COUNT + (2*timer), tick) #define hv_init_timer_config(timer, val) \ @@ -97,6 +28,8 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type) #define hv_get_vp_index(index) rdmsrl(HV_X64_MSR_VP_INDEX, index) +#define hv_signal_eom() wrmsrl(HV_X64_MSR_EOM, 0) + #define hv_get_synint_state(int_num, val) \ rdmsrl(HV_X64_MSR_SINT0 + int_num, val) #define hv_set_synint_state(int_num, val) \ @@ -122,13 +55,6 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type) #define trace_hyperv_callback_vector hyperv_callback_vector #endif void hyperv_vector_handler(struct pt_regs *regs); -void hv_setup_vmbus_irq(void (*handler)(void)); -void hv_remove_vmbus_irq(void); - -void hv_setup_kexec_handler(void (*handler)(void)); -void hv_remove_kexec_handler(void); -void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs)); -void hv_remove_crash_handler(void); /* * Routines for stimer0 Direct Mode handling. @@ -136,8 +62,6 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type) */ void hv_stimer0_vector_handler(struct pt_regs *regs); void hv_stimer0_callback_vector(void); -int hv_setup_stimer0_irq(int *irq, int *vector, void (*handler)(void)); -void hv_remove_stimer0_irq(int irq); static inline void hv_enable_stimer0_percpu_irq(int irq) {} static inline void hv_disable_stimer0_percpu_irq(int irq) {} @@ -282,14 +206,6 @@ static inline u64
Re: [PATCH v2] mm/swap: Fix release_pages() when releasing devmap pages
On Mon, May 27, 2019 at 05:01:07PM +0200, Michal Hocko wrote: > On Fri 24-05-19 10:36:56, ira.we...@intel.com wrote: > > From: Ira Weiny > > > > Device pages can be more than type MEMORY_DEVICE_PUBLIC. > > > > Handle all device pages within release_pages() > > > > This was found via code inspection while determining if release_pages() > > and the new put_user_pages() could be interchangeable. > > Please expand more about who is such a user and why does it use > release_pages rather than put_*page API. Sorry for not being more clear. The error was discovered while discussing a proposal to change a use of release_pages() to put_user_pages()[1] [1] https://lore.kernel.org/lkml/20190523172852.ga27...@iweiny-desk2.sc.intel.com/ In that thread John was saying that release_pages() was functionally equivalent to a loop around put_page(). He also suggested implementing put_user_pages() by using release_pages(). On the surface they did not seem the same to me so I did a deep dive to make sure they were and found this error. > > The above changelog doesn't > really help understanding what is the actual problem. I also do not > understand the fix and a failure mode from release_pages is just scary. This is not failing release_pages(). The fix is that not all devmap pages are "public" type. So previous to this change devmap pages of other types would not correctly be accounted for. The discussion about put_devmap_managed_page() "failing" is not about it failing directly but rather in how these pages should be accounted for. Only devmap pages which require pagemap ops (specifically page_free()) require put_devmap_managed_page() processing. Because of the optimized locking in release_pages() the zone device check is required to release the lock even if put_devmap_managed_page() does not handle the put. > It is basically impossible to handle the error case. So what is going on > here? I think what has happened is the code in release_pages() and put_page() diverged at some point. 
I think it is worth a clean up in this area but I don't see way to do it at the moment which would be any cleaner than what is there. So I've refrained from doing so. Does this help? Would you like to roll a V3 with some of this in the commit message? Ira > > > > > > Cc: Jérôme Glisse > > Cc: Michal Hocko > > Reviewed-by: Dan Williams > > Reviewed-by: John Hubbard > > Signed-off-by: Ira Weiny > > > > --- > > Changes from V1: > > Add comment clarifying that put_devmap_managed_page() can still > > fail. > > Add Reviewed-by tags. > > > > mm/swap.c | 11 +++ > > 1 file changed, 7 insertions(+), 4 deletions(-) > > > > diff --git a/mm/swap.c b/mm/swap.c > > index 9d0432baddb0..f03b7b4bfb4f 100644 > > --- a/mm/swap.c > > +++ b/mm/swap.c > > @@ -740,15 +740,18 @@ void release_pages(struct page **pages, int nr) > > if (is_huge_zero_page(page)) > > continue; > > > > - /* Device public page can not be huge page */ > > - if (is_device_public_page(page)) { > > + if (is_zone_device_page(page)) { > > if (locked_pgdat) { > > spin_unlock_irqrestore(_pgdat->lru_lock, > >flags); > > locked_pgdat = NULL; > > } > > - put_devmap_managed_page(page); > > - continue; > > + /* > > +* zone-device-pages can still fail here and will > > +* therefore need put_page_testzero() > > +*/ > > + if (put_devmap_managed_page(page)) > > + continue; > > } > > > > page = compound_head(page); > > -- > > 2.20.1 > > > > -- > Michal Hocko > SUSE Labs
Re: [RFC PATCH v5 16/16] dcache: Add CONFIG_DCACHE_SMO
On Tue, May 21, 2019 at 02:05:38AM +, Roman Gushchin wrote: > On Tue, May 21, 2019 at 11:31:18AM +1000, Tobin C. Harding wrote: > > On Tue, May 21, 2019 at 12:57:47AM +, Roman Gushchin wrote: > > > On Mon, May 20, 2019 at 03:40:17PM +1000, Tobin C. Harding wrote: > > > > In an attempt to make the SMO patchset as non-invasive as possible add a > > > > config option CONFIG_DCACHE_SMO (under "Memory Management options") for > > > > enabling SMO for the DCACHE. Whithout this option dcache constructor is > > > > used but no other code is built in, with this option enabled slab > > > > mobility is enabled and the isolate/migrate functions are built in. > > > > > > > > Add CONFIG_DCACHE_SMO to guard the partial shrinking of the dcache via > > > > Slab Movable Objects infrastructure. > > > > > > Hm, isn't it better to make it a static branch? Or basically anything > > > that allows switching on the fly? > > > > If that is wanted, turning SMO on and off per cache, we can probably do > > this in the SMO code in SLUB. > > Not necessarily per cache, but without recompiling the kernel. > > > > > It seems that the cost of just building it in shouldn't be that high. > > > And the question if the defragmentation worth the trouble is so much > > > easier to answer if it's possible to turn it on and off without rebooting. > > > > If the question is 'is defragmentation worth the trouble for the > > dcache', I'm not sure having SMO turned off helps answer that question. > > If one doesn't shrink the dentry cache there should be very little > > overhead in having SMO enabled. So if one wants to explore this > > question then they can turn on the config option. Please correct me if > > I'm wrong. > > The problem with a config option is that it's hard to switch over. > > So just to test your changes in production a new kernel should be built, > tested and rolled out to a representative set of machines (which can be > measured in thousands of machines). 
Then if results are questionable, > it should be rolled back. > > What you're actually guarding is the kmem_cache_setup_mobility() call, > which can be perfectly avoided using a boot option, for example. Turning > it on and off completely dynamic isn't that hard too. Hi Roman, I've added a boot parameter to SLUB so that admins can enable/disable SMO at boot time system wide. Then for each object that implements SMO (currently XArray and dcache) I've also added a boot parameter to enable/disable SMO for that cache specifically (these depend on SMO being enabled system wide). All three boot parameters default to 'off', I've added a config option to default each to 'on'. I've got a little more testing to do on another part of the set then the PATCH version is coming at you :) This is more a courtesy email than a request for comment, but please feel free to shout if you don't like the method outlined above. Fully dynamic config is not currently possible because currently the SMO implementation does not support disabling mobility for a cache once it is turned on, a bit of extra logic would need to be added and some state stored - I'm not sure it warrants it ATM but that can be easily added later if wanted. Maybe Christoph will give his opinion on this. thanks, Tobin.
Re: [ext4] 079f9927c7: ltp.mmap16.fail
On Wed, May 29, 2019 at 10:52:56AM +0800, kernel test robot wrote: > FYI, we noticed the following commit (built with gcc-7): > > commit: 079f9927c7bfa026d963db1455197159ebe5b534 ("ext4: gracefully handle > ext4_break_layouts() failure during truncate") > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master Jan --- this is the old version of your patch, which I had dropped before sending a push request to Linus. However, I forgot to reset the dev branch so it still had the old patch on it, and so it got picked up in linux-next. Apologies for the confusion. I've reset the dev branch on ext4.git, and the new version of your patch will show up there shortly, as I start reviewing patches for the next merge window. Cheers, - Ted > <<>> > tag=mmap16 stime=1559078706 > cmdline="mmap16" > contacts="" > analysis=exit > <<>> > mke2fs 1.43.4 (31-Jan-2017) > mmap16 0 TINFO : Using test device LTP_DEV='/dev/loop0' > mmap16 0 TINFO : Formatting /dev/loop0 with ext4 opts='-b 1024' extra > opts='10240' > mmap16 1 TFAIL : mmap16.c:85: Bug is reproduced! > <<>> > initiation_status="ok" > duration=8 termination_type=exited termination_id=1 corefile=no > cutime=11 cstime=345 > <<>>
kernel BUG at mm/swap_state.c:170!
Hi folks.

I observed a kernel panic after updating to git tag 5.2-rc2. The crash happens under memory pressure, when swap is being used. Unfortunately, only this was saved in journalctl:

May 29 08:02:02 localhost.localdomain kernel: page:e9095823 refcount:1 mapcount:1 mapping:8f3ffeb36949 index:0x625002ab2
May 29 08:02:02 localhost.localdomain kernel: anon
May 29 08:02:02 localhost.localdomain kernel: flags: 0x17fffe00080034(uptodate|lru|active|swapbacked)
May 29 08:02:02 localhost.localdomain kernel: raw: 0017fffe00080034 e90944640888 e90956e208c8 8f3ffeb36949
May 29 08:02:02 localhost.localdomain kernel: raw: 000625002ab2 0001 8f41aeeff000
May 29 08:02:02 localhost.localdomain kernel: page dumped because: VM_BUG_ON_PAGE(entry != page)
May 29 08:02:02 localhost.localdomain kernel: page->mem_cgroup:8f41aeeff000
May 29 08:02:02 localhost.localdomain kernel: [ cut here ]
May 29 08:02:02 localhost.localdomain kernel: kernel BUG at mm/swap_state.c:170!

--
Best Regards,
Mike Gavrilov.
Re: [PATCH] x86/fpu: Use fault_in_pages_writeable() for pre-faulting
On Sun, 26 May 2019 19:33:25 +0200 Sebastian Andrzej Siewior wrote: > From: Hugh Dickins > > Since commit > >d9c9ce34ed5c8 ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() > fails") Please add this as a Fixes: d9c9ce34ed5c8 ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails") line so that anyone who backports d9c9ce34ed5c8 has a chance of finding this patch also.
Re: [PATCH] perf: Fix oops when kthread execs user process
Will Deacon writes: > On Tue, May 28, 2019 at 04:01:03PM +0200, Peter Zijlstra wrote: >> On Tue, May 28, 2019 at 08:31:29PM +0800, Young Xiao wrote: >> > When a kthread calls call_usermodehelper() the steps are: >> > 1. allocate current->mm >> > 2. load_elf_binary() >> > 3. populate current->thread.regs >> > >> > While doing this, interrupts are not disabled. If there is a perf >> > interrupt in the middle of this process (i.e. step 1 has completed >> > but not yet reached to step 3) and if perf tries to read userspace >> > regs, kernel oops. > > This seems to be because pt_regs(current) gives NULL for kthreads on Power. Right, we've done that since roughly forever in copy_thread(): int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long kthread_arg, struct task_struct *p) { ... /* Copy registers */ sp -= sizeof(struct pt_regs); childregs = (struct pt_regs *) sp; if (unlikely(p->flags & PF_KTHREAD)) { /* kernel thread */ memset(childregs, 0, sizeof(struct pt_regs)); childregs->gpr[1] = sp + sizeof(struct pt_regs); ... p->thread.regs = NULL; /* no user register state */ See commit from 2002: https://github.com/mpe/linux-fullhistory/commit/c0a96c0918d21d8a99270e94d9c4a4a322d04581#diff-edb76bfcc84905163f34d24d2aad3f3aR187 > From the initial report [1], it doesn't look like the mm isn't initialised, > but rather than we're dereferencing a NULL pt_regs pointer somehow for the > current task (see previous comment). I don't see how that can happen on > arm64, given that we put the pt_regs on the kernel stack which is allocated > during fork. We have the regs on the stack too (see above), but we're explicitly NULL'ing the link from task->thread. Looks like on arm64 and x86 there is no link from task->thread, instead you get from task to pt_regs via task_stack_page(). That actually seems potentially fishy given the comment on task_stack_page() about the stack going away for exiting tasks. 
We should probably be NULL'ing the regs pointer in free_thread_stack() or similar. Though that race mustn't be happening because other arches would see it. Or are we just wrong and kthreads should have non-NULL regs? I can't find another arch that does the same as us. cheers
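The window described in this thread (mm already allocated, but the thread's user register pointer still NULL) can be illustrated with a small userspace model in C. This is a sketch under stated assumptions and not the perf or powerpc code: the struct names and sample_user_ip() are hypothetical, and checking the regs pointer is only one possible guard, not necessarily the fix that was merged.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; all names are hypothetical. */
struct pt_regs_sketch {
	unsigned long nip;		/* user instruction pointer */
};

struct task_sketch {
	void *mm;			/* non-NULL once step 1 (mm allocation) is done */
	struct pt_regs_sketch *regs;	/* NULL for a kthread until step 3 */
};

/*
 * Read the user instruction pointer for a sample.
 * Returns false, leaving *ip untouched, when the task has no user
 * register state yet, instead of dereferencing a NULL regs pointer.
 * Note that checking t->mm alone would not be enough in the window
 * between step 1 and step 3.
 */
bool sample_user_ip(const struct task_sketch *t, unsigned long *ip)
{
	if (t->regs == NULL)
		return false;

	*ip = t->regs->nip;
	return true;
}
```

A task with mm set and regs still NULL models the kthread in the middle of call_usermodehelper(); the sampler simply skips user registers for it.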
[PATCH v8 3/3] i2c-ocores: sifive: add polling mode workaround for FU540-C000 SoC.
The i2c-ocores driver already has a polling mode interface, but it needs a workaround for the FU540 chipset on the HiFive Unleashed board (RevA00). There is an erratum in the FU540 chip that prevents interrupt-driven i2c transfers from working, and the I2C controller's interrupt bit cannot be cleared once it is set. Because of this, the existing i2c polling mode interface added in mainline earlier doesn't work, and the CPU stalls indefinitely whenever an i2c transfer is initiated. Ref: commit dd7dbf0eb090 ("i2c: ocores: refactor setup for polling") The workaround / fix under OCORES_FLAG_BROKEN_IRQ is specifically for the FU540-C000 SoC. The polling function identifies a SiFive device based on the device node and enables the workaround. Signed-off-by: Sagar Shrikant Kadam --- drivers/i2c/busses/i2c-ocores.c | 24 ++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/i2c/busses/i2c-ocores.c b/drivers/i2c/busses/i2c-ocores.c index b334fa2..4117f1a 100644 --- a/drivers/i2c/busses/i2c-ocores.c +++ b/drivers/i2c/busses/i2c-ocores.c @@ -35,6 +35,7 @@ struct ocores_i2c { int iobase; u32 reg_shift; u32 reg_io_width; + unsigned long flags; wait_queue_head_t wait; struct i2c_adapter adap; struct i2c_msg *msg; @@ -84,6 +85,8 @@ struct ocores_i2c { #define TYPE_GRLIB 1 #define TYPE_SIFIVE_REV0 2 +#define OCORES_FLAG_BROKEN_IRQ BIT(1) /* Broken IRQ for FU540-C000 SoC */ + static void oc_setreg_8(struct ocores_i2c *i2c, int reg, u8 value) { iowrite8(value, i2c->base + (reg << i2c->reg_shift)); @@ -236,9 +239,12 @@ static irqreturn_t ocores_isr(int irq, void *dev_id) struct ocores_i2c *i2c = dev_id; u8 stat = oc_getreg(i2c, OCI2C_STATUS); - if (!(stat & OCI2C_STAT_IF)) + if (i2c->flags & OCORES_FLAG_BROKEN_IRQ) { + if ((stat & OCI2C_STAT_IF) && !(stat & OCI2C_STAT_BUSY)) + return IRQ_NONE; + } else if (!(stat & OCI2C_STAT_IF)) { return IRQ_NONE; - + } ocores_process(i2c, stat); return IRQ_HANDLED; @@ -353,6 +359,11 @@ static void ocores_process_polling(struct ocores_i2c *i2c) ret =
ocores_isr(-1, i2c); if (ret == IRQ_NONE) break; /* all messages have been transferred */ + else { + if (i2c->flags & OCORES_FLAG_BROKEN_IRQ) + if (i2c->state == STATE_DONE) + break; + } } } @@ -595,6 +606,7 @@ static int ocores_i2c_probe(struct platform_device *pdev) { struct ocores_i2c *i2c; struct ocores_i2c_platform_data *pdata; + const struct of_device_id *match; struct resource *res; int irq; int ret; @@ -677,6 +689,14 @@ static int ocores_i2c_probe(struct platform_device *pdev) irq = platform_get_irq(pdev, 0); if (irq == -ENXIO) { ocores_algorithm.master_xfer = ocores_xfer_polling; + + /* +* Set in OCORES_FLAG_BROKEN_IRQ to enable workaround for +* FU540-C000 SoC in polling mode. +*/ + match = of_match_node(ocores_i2c_match, pdev->dev.of_node); + if (match && (long)match->data == TYPE_SIFIVE_REV0) + i2c->flags |= OCORES_FLAG_BROKEN_IRQ; } else { if (irq < 0) return irq; -- 1.9.1
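The decision the patched ocores_isr() makes before calling ocores_process() can be written as a pure function and exercised in userspace. This is a minimal sketch mirroring the hunk in this patch; the function name ocores_isr_has_work() is hypothetical, and the numeric values of the status bits are assumptions chosen for the sketch, not taken from this mail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status bits and flag; the numeric values are assumptions for this sketch. */
#define STAT_IF_SKETCH		0x01u		/* interrupt flag */
#define STAT_BUSY_SKETCH	0x40u		/* bus busy */
#define FLAG_BROKEN_IRQ_SKETCH	(1ul << 1)	/* IF bit cannot be cleared */

/*
 * Return true when the ISR should go on to process the status byte,
 * false when it should return IRQ_NONE.
 *
 * Normal hardware: no work unless the interrupt flag is set.
 * Broken-IRQ hardware: the interrupt flag stays set, so "flag set and
 * bus idle" is taken to mean there is nothing left to do.
 */
bool ocores_isr_has_work(unsigned long flags, uint8_t stat)
{
	if (flags & FLAG_BROKEN_IRQ_SKETCH) {
		if ((stat & STAT_IF_SKETCH) && !(stat & STAT_BUSY_SKETCH))
			return false;
	} else if (!(stat & STAT_IF_SKETCH)) {
		return false;
	}

	return true;
}
```

In polling mode the driver calls the ISR in a loop and stops when it reports no work, which is why the stuck interrupt flag needs the extra busy check on the affected SoC.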
[PATCH v8 1/3] dt-bindings: i2c: extend existing opencore bindings.
Reformat the compatibility strings with one valid combination on each line. Add FU540-C000 specific device tree bindings to the existing i2c-ocores file. This device is available on the HiFive Unleashed Rev A00 board. Move the interrupts property to the optional property list, since it can be omitted.

The FU540-C000 SoC from SiFive has a reimplementation of the OpenCores I2C block. The DT compatibility string for this IP is present in the HDL, available at:
https://github.com/sifive/sifive-blocks/blob/master/src/main/scala/devices/i2c/I2C.scala#L73

Signed-off-by: Sagar Shrikant Kadam
---
 Documentation/devicetree/bindings/i2c/i2c-ocores.txt | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/i2c/i2c-ocores.txt b/Documentation/devicetree/bindings/i2c/i2c-ocores.txt
index 17bef9a..6b25a80 100644
--- a/Documentation/devicetree/bindings/i2c/i2c-ocores.txt
+++ b/Documentation/devicetree/bindings/i2c/i2c-ocores.txt
@@ -1,9 +1,13 @@
 Device tree configuration for i2c-ocores
 
 Required properties:
-- compatible      : "opencores,i2c-ocores" or "aeroflexgaisler,i2cmst"
+- compatible      : "opencores,i2c-ocores"
+                    "aeroflexgaisler,i2cmst"
+                    "sifive,fu540-c000-i2c", "sifive,i2c0"
+                    For Opencore based I2C IP block reimplemented in
+                    FU540-C000 SoC. Please refer to sifive-blocks-ip-versioning.txt
+                    for additional details.
 - reg             : bus address start and address range size of device
-- interrupts      : interrupt number
 - clocks          : handle to the controller clock; see the note below.
                     Mutually exclusive with opencores,ip-clock-frequency
 - opencores,ip-clock-frequency: frequency of the controller clock in Hz;
@@ -12,6 +16,7 @@ Required properties:
 - #size-cells     : should be <0>
 
 Optional properties:
+- interrupts      : interrupt number.
 - clock-frequency : frequency of bus clock in Hz; see the note below.
                     Defaults to 100 KHz when the property is not specified
 - reg-shift       : device register offsets are shifted by this value
-- 
1.9.1
[PATCH v8 2/3] i2c-ocores: sifive: add support for i2c device on FU540-c000 SoC.
Update the device ID table for the OpenCores I2C master based reimplementation used in the FU540-C000 chipset on the HiFive Unleashed platform. The device IDs include a SiFive SoC-specific entry for chip-specific tweaks and a SiFive IP-block-specific entry for the generic programming model.

Signed-off-by: Sagar Shrikant Kadam
---
 drivers/i2c/busses/i2c-ocores.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/i2c/busses/i2c-ocores.c b/drivers/i2c/busses/i2c-ocores.c
index c3dabee..b334fa2 100644
--- a/drivers/i2c/busses/i2c-ocores.c
+++ b/drivers/i2c/busses/i2c-ocores.c
@@ -82,6 +82,7 @@ struct ocores_i2c {
 
 #define TYPE_OCORES 0
 #define TYPE_GRLIB 1
+#define TYPE_SIFIVE_REV0 2
 
 static void oc_setreg_8(struct ocores_i2c *i2c, int reg, u8 value)
 {
@@ -462,6 +463,14 @@ static u32 ocores_func(struct i2c_adapter *adap)
 		.compatible = "aeroflexgaisler,i2cmst",
 		.data = (void *)TYPE_GRLIB,
 	},
+	{
+		.compatible = "sifive,fu540-c000-i2c",
+		.data = (void *)TYPE_SIFIVE_REV0,
+	},
+	{
+		.compatible = "sifive,i2c0",
+		.data = (void *)TYPE_SIFIVE_REV0,
+	},
 	{},
 };
 MODULE_DEVICE_TABLE(of, ocores_i2c_match);
-- 
1.9.1
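How the two new table entries resolve to one device type can be illustrated outside the kernel. This is a minimal userspace sketch under stated assumptions: struct compat_entry and lookup_type() are hypothetical stand-ins for struct of_device_id and of_match_node(), and the first table row for "opencores,i2c-ocores" is assumed from the binding document rather than shown in this diff.

```c
#include <stddef.h>
#include <string.h>

enum { TYPE_OCORES = 0, TYPE_GRLIB = 1, TYPE_SIFIVE_REV0 = 2 };

/* Hypothetical stand-in for struct of_device_id. */
struct compat_entry {
	const char *compatible;
	int type;
};

static const struct compat_entry compat_table[] = {
	{ "opencores,i2c-ocores",   TYPE_OCORES },      /* assumed existing entry */
	{ "aeroflexgaisler,i2cmst", TYPE_GRLIB },
	{ "sifive,fu540-c000-i2c",  TYPE_SIFIVE_REV0 }, /* SoC-specific string */
	{ "sifive,i2c0",            TYPE_SIFIVE_REV0 }, /* IP-block string */
	{ NULL, 0 },                                    /* sentinel */
};

/* Returns the device type for a compatible string, or -1 if unknown. */
static int lookup_type(const char *compatible)
{
	const struct compat_entry *e;

	for (e = compat_table; e->compatible; e++)
		if (strcmp(e->compatible, compatible) == 0)
			return e->type;
	return -1;
}
```

Both SiFive strings map to the same type value, so a later check against TYPE_SIFIVE_REV0 covers a node that lists either one.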
[PATCH v8 0/3] Extend dt bindings to support I2C on sifive devices and a fix broken IRQ in polling mode.
The series is based on mainline v5.2-rc1 and extends the DT bindings for the OpenCores based I2C IP block reimplemented in the FU540 SoC, available on the HiFive Unleashed board (Rev A00). It also provides a workaround for the broken IRQ, which affects the I2C polling mode interface already available in mainline, for FU540-C000 chipsets.

The polling mode workaround patch fixes the CPU stall that occurs whenever an I2C transfer is initiated.

This workaround checks whether the device is an FU540 chipset based on device tree information, and checks the OpenCores IF (interrupt flag) and BUSY flags to break out of the polling loop upon completion of a transfer.

To test the patch, a PMOD-AD2 sensor is connected to the HiFive Unleashed board over the J1 connector, and an appropriate device node is added to the board-specific device tree as per the dt-bindings in Documentation/devicetree/bindings/i2c/i2c-ocores.txt. Without this workaround, the CPU stalls indefinitely.

Busybox i2c utilities used to verify the workaround: i2cdetect, i2cdump, i2cset, i2cget

Patch History:

V7<->V8:
-Incorporated review comments for cosmetic changes like: space, comma and period(.)

V6<->V7:
-Rectified space and tab issue in dt bindings strings.
-Implemented workaround based on i2c->flags, as per review comment on v6.

V5<->V6:
-Incorporated suggestions on v5 patch as follows:
-Reformatted compatibility strings in dt doc with one valid combination on each line.
-Removed interrupt-parents from optional property list.
-With rebase to v5.2-rc1, the v5 variant of polling workaround PATCH becomes incompatible. Till kernel v5.1 the polling mode was enabled based on i2c->flags, whereas in kernel v5.2-rc1 polling mode is set as master transfer algorithm at probe time itself, and i2c->flags checks are removed.
-Modified v5 to check for SiFive device type in polling function and include the workaround/fix for broken IRQ.

V4<->V5:
-Removed unnecessary checks of OCORES_FLAG_BROKEN_IRQ.

V3<->V4:
-Incorporated suggestions on v3 patch as follows:
-OCORES_FLAG_BROKEN_IRQ BIT position rectified.
-Updated BROKEN_IRQ flag checks such that if a SiFive device (FU540-C000) is identified, then polling mode is used as the IRQ is broken.

V2<->V3:
-Incorporated review comments on v2 patch as follows:
-Rectified compatibility string sequence with the most specific one at the first (dt bindings).
-Moved interrupts and interrupt-parent under optional property list (dt-bindings).
-Updated reference to sifive-blocks-ip-versioning.txt and URL to IP repository used (dt-bindings).
-Removed example for i2c0 device node from binding doc (dt-bindings).
-Included sifive,i2c0 device under compatibility table in i2c-ocores driver (i2c-ocores).
-Updated polling mode hooks for SoC specific fix to handle broken IRQ (i2c-ocores).

V1<->V2:
-Incorporate review comments from Andrew
-Extend dt bindings into i2c-ocores.txt instead of adding new file
-Rename SIFIVE_FLAG_POLL to OCORES_FLAG_BROKEN_IRQ

V1:
-Update dt bindings for sifive i2c devices
-Fix broken IRQ affecting i2c polling mode interface.

Sagar Shrikant Kadam (3):
  dt-bindings: i2c: extend existing opencore bindings.
  i2c-ocores: sifive: add support for i2c device on FU540-c000 SoC.
  i2c-ocores: sifive: add polling mode workaround for FU540-C000 SoC.

 .../devicetree/bindings/i2c/i2c-ocores.txt | 9 --
 drivers/i2c/busses/i2c-ocores.c| 33 --
 2 files changed, 38 insertions(+), 4 deletions(-)

-- 
1.9.1
Re: [PATCH net-next 1/5] timecounter: Add helper for reconstructing partial timestamps
On Tue, May 28, 2019 at 07:14:22PM -0700, John Stultz wrote:
> Hrm. Is this actually generic? Would it make more sense to have the
> specific implementations with this quirk implement this in their
> read() handler? If not, why?

Strongly agree that this workaround should stay in the driver. After all, we do not want to encourage HW designers to continue in this way.

Thanks,
Richard
Re: [PATCH net-next 3/5] net: dsa: mv88e6xxx: Let taggers specify a can_timestamp function
On Wed, May 29, 2019 at 02:56:25AM +0300, Vladimir Oltean wrote:
> The newly introduced function is called on both the RX and TX paths.

NAK on this patch.

> The boolean returned by port_txtstamp should only return false if the
> driver tried to timestamp the skb but failed.

So you say.

> Currently there is some logic in the mv88e6xxx driver that determines
> whether it should timestamp frames or not.
>
> This is wasteful, because if the decision is to not timestamp them, then
> DSA will have cloned an skb and freed it immediately afterwards.

No, it isn't wasteful. Look at the tests in that driver to see why.

> Additionally other drivers (sja1105) may have other hardware criteria
> for timestamping frames on RX, and the default conditions for
> timestamping a frame are too restrictive.

I'm sorry, but we won't change the frame just for one device that has design issues. Please put device specific workarounds into its driver.

Thanks,
Richard
[PATCH] amd64-agp: fix arbitrary kernel memory writes
pg_start is copied from userspace on the AGPIOC_BIND and AGPIOC_UNBIND ioctl cmds of agp_ioctl() and passed to agpioc_bind_wrap(). As said in the comment, (pg_start + mem->page_count) may wrap in case of AGPIOC_BIND, and it is not checked at all in case of AGPIOC_UNBIND. As a result, a user with sufficient privileges (usually the "video" group) may cause either a local DoS or a privilege escalation.

See commit 194b3da873fd ("agp: fix arbitrary kernel memory writes") for details.

Signed-off-by: Young Xiao <92siuy...@gmail.com>
---
 drivers/char/agp/amd64-agp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/char/agp/amd64-agp.c b/drivers/char/agp/amd64-agp.c
index c69e39f..5daa0e3 100644
--- a/drivers/char/agp/amd64-agp.c
+++ b/drivers/char/agp/amd64-agp.c
@@ -60,7 +60,8 @@ static int amd64_insert_memory(struct agp_memory *mem, off_t pg_start, int type)
 
 	/* Make sure we can fit the range in the gatt table. */
 	/* FIXME: could wrap */
-	if (((unsigned long)pg_start + mem->page_count) > num_entries)
+	if (((pg_start + mem->page_count) > num_entries) ||
+	    ((pg_start + mem->page_count) < pg_start))
 		return -EINVAL;
 
 	j = pg_start;
-- 
2.7.4
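The wraparound that the commit message describes is easy to reproduce in userspace. The sketch below is an illustration only: range_fits() is a hypothetical helper, and it deliberately uses the unsigned type size_t so that the wrap is well defined in C, whereas the patched driver function takes pg_start as an off_t.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: does [pg_start, pg_start + page_count) fit into a
 * table of num_entries slots? Unsigned addition wraps modulo 2^N, so a
 * sum smaller than pg_start means the addition overflowed.
 */
static bool range_fits(size_t pg_start, size_t page_count, size_t num_entries)
{
	size_t end = pg_start + page_count;

	if (end < pg_start)
		return false; /* wrapped around */

	return end <= num_entries;
}
```

A start value close to the maximum of the type passes a plain `end > num_entries` test after wrapping to a small number, which is the case the second comparison rejects.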
Re: [PATCH net-next 0/5] PTP support for the SJA1105 DSA driver
On Wed, May 29, 2019 at 02:56:22AM +0300, Vladimir Oltean wrote:
> Not all is rosy, though.

You can sure say that again!

> PTP timestamping will only work when the ports are bridged. Otherwise,
> the metadata follow-up frames holding RX timestamps won't be received
> because they will be blocked by the master port's MAC filter. Linuxptp
> tries to put the net device in ALLMULTI/PROMISC mode,

Untrue.

> but DSA doesn't
> pass this on to the master port, which does the actual reception.
> The master port is put in promiscous mode when the slave ports are
> enslaved to a bridge.
>
> Also, even with software-corrected timestamps, one can observe a
> negative path delay reported by linuxptp:
>
> ptp4l[55.600]: master offset 8 s2 freq +83677 path delay -2390
> ptp4l[56.600]: master offset 17 s2 freq +83688 path delay -2391
> ptp4l[57.601]: master offset 6 s2 freq +83682 path delay -2391
> ptp4l[58.601]: master offset -1 s2 freq +83677 path delay -2391
>
> Without investigating too deeply, this appears to be introduced by the
> correction applied by linuxptp to t4 (t4c: corrected master rxtstamp)
> during the path delay estimation process (removing the correction makes
> the path delay positive).

No. The root cause is the time stamps delivered by the hardware or your driver. That needs to be addressed before going forward.

Thanks,
Richard
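For readers unfamiliar with where a negative path delay can come from, the standard delay request-response estimate is ((t2 - t1) + (t4 - t3)) / 2. The sketch below is a generic illustration of that formula with made-up nanosecond values; mean_path_delay() is a hypothetical helper, not linuxptp code, and the numbers do not model the SJA1105 hardware.

```c
#include <stdint.h>

/*
 * t1: master sends Sync        (master clock)
 * t2: slave receives Sync      (slave clock)
 * t3: slave sends Delay_Req    (slave clock)
 * t4: master receives Delay_Req (master clock)
 * All values in nanoseconds.
 */
static int64_t mean_path_delay(int64_t t1, int64_t t2, int64_t t3, int64_t t4)
{
	return ((t2 - t1) + (t4 - t3)) / 2;
}
```

With a true one-way delay of 500 ns and a slave clock running 1000 ns ahead, t1 = 0, t2 = 1500, t3 = 10000 and t4 = 9500 give a delay of 500 ns. If any one timestamp is off by more than the round trip, for example t4 reported as 3500, the same formula yields -2500 ns, so a negative result points at the timestamps fed into it.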
[RFC PATCH v3] rtl8xxxu: Improve TX performance of RTL8723BU on rtl8xxxu driver
We have 3 laptops which connect to Wi-Fi through the same RTL8723BU chip. The PCI VID/PID of the wifi chip is 10EC:B720, which is supported. They all have the same problem with the in-kernel rtl8xxxu driver: iperf (as a client to an ethernet-connected server) gets ~1Mbps. Nevertheless, the signal strength is reported as around -40dBm, which is quite good. In the wireshark capture, the tx rate for each data and qos data packet is only 1Mbps.

Compared to that, with the driver from https://github.com/lwfinger/rtl8723bu the same iperf test gets ~12 Mbps or more. The signal strength is reported similarly, around -40dBm. That's why we want to improve the in-kernel driver.

After reading the source code of the rtl8xxxu driver and Larry's, the major difference is that Larry's driver has a watchdog which keeps monitoring the signal quality and updates the rate mask, just like rtl8xxxu_gen2_update_rate_mask() does, if the signal quality changes. This kind of watchdog also exists in the rtlwifi driver for some specific chips, e.g. rtl8192ee, rtl8188ee, rtl8723ae, rtl8821ae, etc. They have the same member function named dm_watchdog, which invokes the corresponding dm_refresh_rate_adaptive_mask to adjust the tx rate mask.

With this commit, the tx rate of each data and qos data packet will be 39Mbps (MCS4) with 0xF00000 as the tx rate mask. Bits 20 to 23 mean MCS4 to MCS7. It means that the firmware still picks the lowest rate from the rate mask, and it explains why the tx rate of data and qos data is always the lowest, 1Mbps: the default rate mask passed is always 0xFFF, which ranges from the basic CCK rate, OFDM rate, and MCS rate. However, with Larry's driver, the tx rate observed from wireshark under the same condition is almost 65Mbps or 72Mbps.

I believe the firmware of RTL8723BU may need a fix. And I think we can still bring in the dm_watchdog as rtlwifi does, to improve things from the driver side. Please leave precious comments for my commits and suggest what I can do better.
Or suggest if there's any better idea to fix this. Thanks.

Signed-off-by: Chris Chiu
---

Notes:
  v2:
   - Fix errors and warnings complained by checkpatch.pl
   - Replace data structure rate_adaptive by 2 member variables
   - Make rtl8xxxu_wireless_mode non-static
   - Runs refresh_rate_mask() only in station mode
  v3:
   - Remove ugly rtl8xxxu_watchdog data structure
   - Make sure only one vif exists

 .../net/wireless/realtek/rtl8xxxu/rtl8xxxu.h | 49 ++
 .../realtek/rtl8xxxu/rtl8xxxu_8723b.c | 145 ++
 .../wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 80 +-
 3 files changed, 273 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
index 8828baf26e7b..42e9227f4d19 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.h
@@ -1195,6 +1195,44 @@ struct rtl8723bu_c2h {
 
 struct rtl8xxxu_fileops;
 
+/*mlme related.*/
+enum wireless_mode {
+	WIRELESS_MODE_UNKNOWN = 0,
+	/* Sub-Element */
+	WIRELESS_MODE_B = BIT(0),
+	WIRELESS_MODE_G = BIT(1),
+	WIRELESS_MODE_A = BIT(2),
+	WIRELESS_MODE_N_24G = BIT(3),
+	WIRELESS_MODE_N_5G = BIT(4),
+	WIRELESS_AUTO = BIT(5),
+	WIRELESS_MODE_AC = BIT(6),
+	WIRELESS_MODE_MAX = 0x7F,
+};
+
+/* from rtlwifi/wifi.h */
+enum ratr_table_mode_new {
+	RATEID_IDX_BGN_40M_2SS = 0,
+	RATEID_IDX_BGN_40M_1SS = 1,
+	RATEID_IDX_BGN_20M_2SS_BN = 2,
+	RATEID_IDX_BGN_20M_1SS_BN = 3,
+	RATEID_IDX_GN_N2SS = 4,
+	RATEID_IDX_GN_N1SS = 5,
+	RATEID_IDX_BG = 6,
+	RATEID_IDX_G = 7,
+	RATEID_IDX_B = 8,
+	RATEID_IDX_VHT_2SS = 9,
+	RATEID_IDX_VHT_1SS = 10,
+	RATEID_IDX_MIX1 = 11,
+	RATEID_IDX_MIX2 = 12,
+	RATEID_IDX_VHT_3SS = 13,
+	RATEID_IDX_BGN_3SS = 14,
+};
+
+#define RTL8XXXU_RATR_STA_INIT 0
+#define RTL8XXXU_RATR_STA_HIGH 1
+#define RTL8XXXU_RATR_STA_MID 2
+#define RTL8XXXU_RATR_STA_LOW 3
+
 struct rtl8xxxu_priv {
 	struct ieee80211_hw *hw;
 	struct usb_device *udev;
@@ -1299,6 +1337,14 @@ struct rtl8xxxu_priv {
 	u8 pi_enabled:1;
 	u8 no_pape:1;
 	u8 int_buf[USB_INTR_CONTENT_LENGTH];
+	u8 ratr_index;
+	u8 rssi_level;
+	/*
+	 * Single virtual interface permitted since the driver supports STATION
+	 * mode only.
+	 */
+	struct ieee80211_vif *vif;
+	struct delayed_work ra_watchdog;
 };
 
 struct rtl8xxxu_rx_urb {
@@ -1335,6 +1381,8 @@ struct rtl8xxxu_fileops {
 			bool ht40);
 	void (*update_rate_mask) (struct rtl8xxxu_priv *priv,
 				  u32 ramask, int sgi);
+	void (*refresh_rate_mask) (struct rtl8xxxu_priv *priv, int signal,
+				   struct ieee80211_sta *sta);
 	void (*report_connect) (struct rtl8xxxu_priv *priv, u8 macid, bool
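The rate mask reasoning in the commit message (the firmware appears to use the lowest rate present in the mask, and bits 20 through 23 stand for MCS4 to MCS7) can be checked with a small standalone sketch. Both helpers below are hypothetical and exist only for illustration; the "lowest set bit wins" rule is the behaviour the message infers from captures, not a documented firmware interface.

```c
#include <stdint.h>

/* Build a mask with bits lo..hi (inclusive) set. */
static uint32_t mask_from_bits(unsigned int lo, unsigned int hi)
{
	uint32_t mask = 0;
	unsigned int b;

	for (b = lo; b <= hi && b < 32; b++)
		mask |= (uint32_t)1 << b;
	return mask;
}

/* Index of the lowest set bit, or -1 for an empty mask. */
static int lowest_rate_bit(uint32_t mask)
{
	int b;

	for (b = 0; b < 32; b++)
		if (mask & ((uint32_t)1 << b))
			return b;
	return -1;
}
```

Setting bits 20 through 23 gives 0xF00000, whose lowest set bit is 20 (MCS4 in the message's numbering). Any mask that also contains bit 0 has 0 as its lowest bit, which matches the observation that a broad default mask ends up at the lowest basic rate.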
Re: [RFC 1/7] mm: introduce MADV_COOL
On Wed 29-05-19 10:40:33, Hillf Danton wrote:
>
> On Wed, 29 May 2019 00:11:15 +0800 Michal Hocko wrote:
> > On Tue 28-05-19 23:38:11, Hillf Danton wrote:
> > >
> > > In short, I prefer to skip IO mapping since any kind of address range
> > > can be expected from userspace, and it may probably cover an IO mapping.
> > > And things can get out of control, if we reclaim some IO pages while
> > > underlying device is trying to fill data into any of them, for instance.
> >
> > What do you mean by IO pages why what is the actual problem?
> >
> Io pages are the backing-store pages of a mapping whose vm_flags has
> VM_IO set, and the comment in mm/memory.c says:
> /*
>  * Physically remapped pages are special. Tell the
>  * rest of the world about it:
>  * VM_IO tells people not to look at these pages
>  * (accesses can have side effects).
>

OK, thanks for the clarification of the first part of the question. Now to the second and more important one. What is the actual concern? AFAIK those pages shouldn't be on the LRU list. If they are, then they should be safe to reclaim; otherwise we would have a problem when reclaiming them under normal memory pressure. Why is this madvise any different?

-- 
Michal Hocko
SUSE Labs
Re: [PATCH v5 0/2] Fix issues with vmalloc flush flag
On Tue, 2019-05-28 at 17:23 -0700, David Miller wrote:
> From: Rick Edgecombe
> Date: Mon, 27 May 2019 14:10:56 -0700
>
> > These two patches address issues with the recently added
> > VM_FLUSH_RESET_PERMS vmalloc flag.
> >
> > Patch 1 addresses an issue that could cause a crash after other
> > architectures besides x86 rely on this path.
> >
> > Patch 2 addresses an issue where in a rare case strange arguments
> > could be provided to flush_tlb_kernel_range().
>
> It just occurred to me another situation that would cause trouble on
> sparc64, and that's if someone the address range of the main kernel
> image ended up being passed to flush_tlb_kernel_range().
>
> That would flush the locked kernel mapping and crash the kernel
> instantly in a completely non-recoverable way.

Hmm, I haven't received the logs from Meelis that will show the real ranges being passed into flush_tlb_kernel_range() on sparc, but it should be flushing a range spanning from the modules to the direct map. It looks like the kernel is at the very bottom of the address space, so it is not included. Or do you mean the pages that hold the kernel text on the direct map?

But regardless of this new code, DEBUG_PAGEALLOC hangs with the first vmalloc free/unmap. That should be just flushing a single allocation in the vmalloc range. If it is somehow catching a locked entry though... Are there any sparc flush mechanisms that could be used in vmalloc that won't touch locked entries? Peter Z was pointing out that flush_tlb_all() might be more appropriate for vmalloc anyway.
RE: [PATCH RESEND 2/5] ARM: dts: imx7d-sdb: Assign corresponding power supply for LDOs
Hi, Leonard

> -Original Message-
> From: Leonard Crestez
> Sent: Wednesday, May 29, 2019 3:24 AM
> To: Anson Huang
> Cc: robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org;
> s.ha...@pengutronix.de; ker...@pengutronix.de; feste...@gmail.com;
> devicet...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; linux-
> ker...@vger.kernel.org; dl-linux-imx
> Subject: Re: [PATCH RESEND 2/5] ARM: dts: imx7d-sdb: Assign corresponding
> power supply for LDOs
>
> On 12.05.2019 12:57, Anson Huang wrote:
> > On i.MX7D SDB board, sw2 supplies 1p0d/1p2 LDO, this patch assigns
> > corresponding power supply for 1p0d/1p2 LDO to avoid confusion by
> > below log:
> >
> > vdd1p0d: supplied by regulator-dummy
> > vdd1p2: supplied by regulator-dummy
> >
> > With this patch, the power supply is more accurate:
> >
> > vdd1p0d: supplied by SW2
> > vdd1p2: supplied by SW2
> >
> > diff --git a/arch/arm/boot/dts/imx7d-sdb.dts
> > b/arch/arm/boot/dts/imx7d-sdb.dts
> >
> > +_1p0d {
> > + vin-supply = <_reg>;
> > +};
> > +
> > +_1p2 {
> > + vin-supply = <_reg>;
> > +};
>
> It's not clear why but this patch breaks imx7d-sdb boot. Checked two
> boards: in a board farm and on my desk.

Thanks for reporting this issue, I can reproduce it now. A quick debug shows that with this patch, when reg_1p0d's voltage is set to 1.0V, SW2's voltage is changed to 1.5V. The expected voltage is 1.8V, so the 1.5V setting causes a board reset. The patch below fixes the issue, but I am still checking whether it is the best fix. Once I figure that out, I will send a fix patch for review:

+++ b/arch/arm/boot/dts/imx7d-sdb.dts
@@ -267,6 +267,7 @@
 regulator-max-microvolt = <185>;
 regulator-boot-on;
 regulator-always-on;
+ regulator-max-step-microvolt = <25000>;
 };

Thanks,
Anson

>
> --
> Regards,
> Leonard