Re: [PATCH 2/3] dt-bindings: remoteproc: add Versal-NET platform

2024-03-21 Thread Krzysztof Kozlowski
On 21/03/2024 16:13, Tanmay Shah wrote:
> 
> 
> On 3/21/24 2:39 AM, Krzysztof Kozlowski wrote:
>> On 20/03/2024 16:14, Tanmay Shah wrote:
>>>
>>>
>>> On 3/20/24 2:40 AM, Krzysztof Kozlowski wrote:
 On 19/03/2024 15:42, Tanmay Shah wrote:
>
>
> On 3/19/24 12:30 AM, Krzysztof Kozlowski wrote:
>> On 19/03/2024 01:51, Tanmay Shah wrote:
>>> Hello Krzysztof,
>>>
>>> Thanks for reviews. Please find my comments below.
>>>
>>> On 3/17/24 1:53 PM, Krzysztof Kozlowski wrote:
 On 15/03/2024 22:15, Tanmay Shah wrote:
> AMD-Xilinx Versal-NET platform is successor of Versal platform. It
> contains multiple clusters of cortex-R52 real-time processing units.
> Each cluster contains two cores of cortex-R52 processors. Each cluster
> can be configured in lockstep mode or split mode.
>
> Each R52 core is assigned 128KB of TCM memory. ATCM memory is 64KB, 
> BTCM
> and CTCM memoreis are 32KB each. Each TCM memory has its own dedicated
> power-domain that needs to be requested before using it.
>
> Signed-off-by: Tanmay Shah 
> ---
>  .../remoteproc/xlnx,zynqmp-r5fss.yaml | 220 
> +++---
>  1 file changed, 184 insertions(+), 36 deletions(-)
>
> diff --git 
> a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml 
> b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
> index 711da0272250..55654ee02eef 100644
> --- 
> a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
> +++ 
> b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
> @@ -18,7 +18,9 @@ description: |
>  
>  properties:
>compatible:
> -const: xlnx,zynqmp-r5fss
> +enum:
> +  - xlnx,zynqmp-r5fss
> +  - xlnx,versal-net-r52fss
>  
>"#address-cells":
>  const: 2
> @@ -64,7 +66,9 @@ patternProperties:
>  
>  properties:
>compatible:
> -const: xlnx,zynqmp-r5f
> +enum:
> +  - xlnx,zynqmp-r5f
> +  - xlnx,versal-net-r52f
>  
>reg:
>  minItems: 1
> @@ -135,9 +139,11 @@ required:
>  allOf:
>- if:
>properties:
> -xlnx,cluster-mode:
> -  enum:
> -- 1
> +compatible:
> +  contains:
> +enum:
> +  - xlnx,versal-net-r52fss

 Why do you touch these lines?

> +
>  then:
>patternProperties:
>  "^r5f@[0-9a-f]+$":
> @@ -149,16 +155,14 @@ allOf:
>items:
>  - description: ATCM internal memory
>  - description: BTCM internal memory
> -- description: extra ATCM memory in lockstep mode
> -- description: extra BTCM memory in lockstep mode
> +- description: CTCM internal memory
>  
>  reg-names:
>minItems: 1
>items:
> -- const: atcm0
> -- const: btcm0
> -- const: atcm1
> -- const: btcm1
> +- const: atcm
> +- const: btcm
> +- const: ctcm
>  
>  power-domains:
>minItems: 2
> @@ -166,33 +170,70 @@ allOf:
>  - description: RPU core power domain
>  - description: ATCM power domain
>  - description: BTCM power domain
> -- description: second ATCM power domain
> -- description: second BTCM power domain
> +- description: CTCM power domain
>  
>  else:
> -  patternProperties:
> -"^r5f@[0-9a-f]+$":
> -  type: object
> -
> -  properties:
> -reg:
> -  minItems: 1
> -  items:
> -- description: ATCM internal memory
> -- description: BTCM internal memory
> -
> -reg-names:
> -  minItems: 1
> -  items:
> -- const: atcm0
> -- const: btcm0
> -
> -power-domains:
> -  minItems: 2
> -  items:
> -- description: RPU 

Re: [PATCH] virtio_net: Do not send RSS key if it is not supported

2024-03-21 Thread Xuan Zhuo
On Thu, 21 Mar 2024 09:54:30 -0700, Breno Leitao  wrote:
> There is a bug when setting the RSS options in virtio_net that can break
> the whole machine, getting the kernel into an infinite loop.
>
> Running the following command in any QEMU virtual machine with virtionet
> will reproduce this problem:
>
>   # ethtool -X eth0  hfunc toeplitz
>
> This is how the problem happens:
>
> 1) ethtool_set_rxfh() calls virtnet_set_rxfh()
>
> 2) virtnet_set_rxfh() calls virtnet_commit_rss_command()
>
> 3) virtnet_commit_rss_command() populates 4 entries for the rss
>scatter-gather
>
> 4) Since the command above does not have a key, then the last
>scatter-gatter entry will be zeroed, since rss_key_size == 0.
> sg_buf_size = vi->rss_key_size;



if (vi->has_rss || vi->has_rss_hash_report) {
vi->rss_indir_table_size =
virtio_cread16(vdev, offsetof(struct virtio_net_config,
rss_max_indirection_table_length));
vi->rss_key_size =
virtio_cread8(vdev, offsetof(struct virtio_net_config, 
rss_max_key_size));

vi->rss_hash_types_supported =
virtio_cread32(vdev, offsetof(struct virtio_net_config, 
supported_hash_types));
vi->rss_hash_types_supported &=
~(VIRTIO_NET_RSS_HASH_TYPE_IP_EX |
  VIRTIO_NET_RSS_HASH_TYPE_TCP_EX |
  VIRTIO_NET_RSS_HASH_TYPE_UDP_EX);

dev->hw_features |= NETIF_F_RXHASH;
}


vi->rss_key_size is initiated here, I wonder if there is something wrong?

Thanks.


>
> 5) This buffer is passed to qemu, but qemu is not happy with a buffer
>with zero length, and do the following in virtqueue_map_desc() (QEMU
>function):
>
>   if (!sz) {
>   virtio_error(vdev, "virtio: zero sized buffers are not allowed");
>
> 6) virtio_error() (also QEMU function) set the device as broken
>
>   vdev->broken = true;
>
> 7) Qemu bails out, and do not repond this crazy kernel.
>
> 8) The kernel is waiting for the response to come back (function
>virtnet_send_command())
>
> 9) The kernel is waiting doing the following :
>
>   while (!virtqueue_get_buf(vi->cvq, ) &&
>  !virtqueue_is_broken(vi->cvq))
>   cpu_relax();
>
> 10) None of the following functions above is true, thus, the kernel
> loops here forever. Keeping in mind that virtqueue_is_broken() does
> not look at the qemu `vdev->broken`, so, it never realizes that the
> vitio is broken at QEMU side.
>
> Fix it by not sending the key scatter-gatter key if it is not set.
>
> Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
> Signed-off-by: Breno Leitao 
> Cc: sta...@vger.kernel.org
> Cc: qemu-de...@nongnu.org
> ---
>  drivers/net/virtio_net.c | 16 +---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index d7ce4a1011ea..5a7700b103f8 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3041,11 +3041,16 @@ static int virtnet_set_ringparam(struct net_device 
> *dev,
>  static bool virtnet_commit_rss_command(struct virtnet_info *vi)
>  {
>   struct net_device *dev = vi->dev;
> + int has_key = vi->rss_key_size;
>   struct scatterlist sgs[4];
>   unsigned int sg_buf_size;
> + int nents = 3;
> +
> + if (has_key)
> + nents += 1;
>
>   /* prepare sgs */
> - sg_init_table(sgs, 4);
> + sg_init_table(sgs, nents);
>
>   sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table);
>   sg_set_buf([0], >ctrl->rss, sg_buf_size);
> @@ -3057,8 +3062,13 @@ static bool virtnet_commit_rss_command(struct 
> virtnet_info *vi)
>   - offsetof(struct virtio_net_ctrl_rss, max_tx_vq);
>   sg_set_buf([2], >ctrl->rss.max_tx_vq, sg_buf_size);
>
> - sg_buf_size = vi->rss_key_size;
> - sg_set_buf([3], vi->ctrl->rss.key, sg_buf_size);
> + if (has_key) {
> + /* Only populate if key is available, otherwise
> +  * populating a buffer with zero size breaks virtio
> +  */
> + sg_buf_size = vi->rss_key_size;
> + sg_set_buf([3], vi->ctrl->rss.key, sg_buf_size);
> + }
>
>   if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ,
> vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG
> --
> 2.43.0
>



Re: [PATCHv2 2/2] usb: typec: anx7688: Add driver for ANX7688 USB-C HDMI bridge

2024-03-21 Thread Pavel Machek
Hi!

> I'm sorry to keep you waiting.

Thanks for comments.

> > +   struct gpio_desc *gpio_reset;
> > +   struct gpio_desc *gpio_cabledet;
> > +
> > +   uint32_t src_caps[8];
> 
> Use u32 instead of uint32_t.

Will replace globally.

> > +static int anx7688_reg_read(struct anx7688 *anx7688, u8 reg_addr)
> > +{
> > +   int ret;
> > +
> > +   ret = i2c_smbus_read_byte_data(anx7688->client, reg_addr);
> > +   if (ret < 0)
> > +   dev_err(anx7688->dev, "i2c read failed at 0x%x (%d)\n",
> > +   reg_addr, ret);
> 
> dev_dbg instead.

I'd prefer not to. i2c functions should not really fail, and if they
do, we want the error log. This is not debugging, this is i2c failing.

> > +static void anx7688_power_enable(struct anx7688 *anx7688)
> > +{
> > +   gpiod_set_value(anx7688->gpio_reset, 1);
> > +   gpiod_set_value(anx7688->gpio_enable, 1);
> > +
> > +   /* wait for power to stabilize and release reset */
> > +   msleep(10);
> 
> So is it okay that the sleep may take up to 20ms?

I don't see how that would be a problem.

> > +   gpiod_set_value(anx7688->gpio_reset, 0);
> > +   udelay(2);
> 
> Use usleep_range() instead.

Can do, but it makes no difference.

> > +   gpiod_set_value(anx7688->gpio_reset, 1);
> > +   msleep(5);
> 
> The same question here, is it a problem if the sleep ends up taking
> 20ms?

Again, I expect that to be ok.

> > +   ret = anx7688_tcpc_reg_read(anx7688, ANX7688_TCPC_REG_INTERFACE_SEND);
> > +   if (ret) {
> > +   dev_err(anx7688->dev,
> > +   "failed to send pd packet (tx buffer full)\n");
> 
> One line should be enought for that one.

That makes it go over 80 columns, but yes, can be one line.

> > +   // wait until the message is processed (30ms max)
> > +   for (i = 0; i < 300; i++) {
> > +   ret = anx7688_tcpc_reg_read(anx7688, 
> > ANX7688_TCPC_REG_INTERFACE_SEND);
> > +   if (ret <= 0)
> > +   return ret;
> > +
> > +   udelay(100);
> > +   }
> > +
> > +   dev_err(anx7688->dev, "timeout waiting for the message queue flush\n");
> 
> Maybe dev_dbg for this too.

Let's not hide these. If they happen, we can downgrade them, but they
should not.

> > +   /* wait till the firmware is loaded (typically ~30ms) */
> > +   for (i = 0; i < 100; i++) {
> > +   ret = anx7688_reg_read(anx7688, 
> > ANX7688_REG_EEPROM_LOAD_STATUS0);
> > +
> > +   if (ret >= 0 && (ret & ANX7688_EEPROM_FW_LOADED) == 
> > ANX7688_EEPROM_FW_LOADED) {
> > +   dev_dbg(anx7688->dev, "eeprom0 = 0x%02x\n", ret);
> > +   dev_info(anx7688->dev, "fw loaded after %d ms\n", i * 
> > 10);
> 
> Debugging information. Use dev_dbg.

Ok.

> > +   set_bit(ANX7688_F_FW_FAILED, anx7688->flags);
> > +   dev_err(anx7688->dev, "boot firmware load failed (you may need to flash 
> > FW to anx7688 first)\n");
> > +   ret = -ETIMEDOUT;
> > +   goto err_vconoff;
> > +
> > +fw_loaded:
> 
> This label looks a bit messy to me. You could also move that firmware
> loading wait to its own function.

Ok, let me try to refactor this.

> > +static int anx7688_handle_pd_message_response(struct anx7688 *anx7688,
> > + u8 to_cmd, u8 resp)
> > +{
...
> > +   return 0;
> > +}
> 
> Noise. Drop this whole function. If you need this kind of information,
> then please consider trace points, or just use some debugfs trick like
> what we have in drivers/usb/typec/tcpm/tcpm.c and few other drivers.

Ok.

> > +   switch (cmd) {
> > +   case ANX7688_OCM_MSG_PWR_SRC_CAP:
> > +   dev_info(anx7688->dev, "received SRC_CAP\n");
> 
> Noise.

Ok, let me convert these to dev_dbg.

> > +
> > +   if (len % 4 != 0) {
> > +   dev_warn(anx7688->dev, "received invalid sized PDO 
> > array\n");
> > +   break;
> > +   }
> > +
> > +   /* the partner is PD capable */
> > +   anx7688->pd_capable = true;
> > +
> > +   for (i = 0; i < len / 4; i++) {
> > +   pdo = le32_to_cpu(pdos[i]);
> > +
> > +   if (pdo_type(pdo) == PDO_TYPE_FIXED) {
> > +   unsigned int voltage = pdo_fixed_voltage(pdo);
> > +   unsigned int max_curr = pdo_max_current(pdo);
> > +
> > +   dev_info(anx7688->dev, "SRC_CAP PDO_FIXED (%umV 
> > %umA)\n", voltage, max_curr);
> 
> Noise.
> 
> > +   } else if (pdo_type(pdo) == PDO_TYPE_BATT) {
> > +   unsigned int min_volt = pdo_min_voltage(pdo);
> > +   unsigned int max_volt = pdo_max_voltage(pdo);
> > +   unsigned int max_pow = pdo_max_power(pdo);
> > +
> > +   dev_info(anx7688->dev, "SRC_CAP PDO_BATT 
> > (%umV-%umV %umW)\n", min_volt, max_volt, max_pow);
> 
> Noise. That line also really should be split in two.
> 
> I'm stopping my review here. This driver is too noisy. All dev_info
> calls need to be 

Re: [RFC PATCH] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS

2024-03-21 Thread Steven Rostedt
On Tue, 12 Mar 2024 13:42:28 +
Mark Rutland  wrote:

> There are ways around that, but they're complicated and/or expensive, e.g.
> 
> * Use a sequence of multiple patches, starting with replacing the JALR with an
>   exception-generating instruction with a fixup handler, which is sort-of what
>   x86 does with UD2. This may require multiple passes with
>   synchronize_rcu_tasks() to make sure all threads have seen the latest
>   instructions, and that cannot be done under stop_machine(), so if you need
>   stop_machine() for CMODx reasons, you may need to use that several times 
> with
>   intervening calls to synchronize_rcu_tasks().

Just for clarification. x86 doesn't use UD2 for updating the call sites. It
uses the breakpoint (0xcc) operation. This is because x86 instructions are
not a fixed size and can cross cache boundaries, making updates to text
sections dangerous as another CPU may get half the old instruction and half
the new one!

Thus, when we have:

0f 1f 44 00 00  nop

and want to convert it to:

e8 e7 57 07 00  call ftrace_caller

We have to first add a breakpoint:

cc 17 44 00 00

Send an IPI to all CPUs so that they see the breakpoint (IPI is equivalent
to a memory barrier).

We have a ftrace breakpoint handler that will look at the function that the
breakpoint happened on. If it was a nop, it will just skip over the rest of
the instruction, and return from the break point handler just after the
"cc 17 44 00 00". If it's supposed to be a function, it will push the
return to after the instruction onto the stack, and return from the break
point handler to ftrace_caller. (Although things changed a little since
this update is now handled by text_poke_bp_batch()).

Then it changes the rest of the instruction:

cc e7 57 07 00

Sends out another IPI to all CPUs and removes the break point with the new
instruction op.

e8 e7 57 07 00

and now all the callers of this function will call the ftrace_caller.

-- Steve



Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Mark Brown
On Thu, Mar 21, 2024 at 08:14:44PM +0100, Karel Balej wrote:
> Mark Brown, 2024-03-21T19:00:24+00:00:

> > I would expect that if you have two separate register maps they would
> > have separate configurations that describe the corresponding physical
> > register maps, as far as I can tell this driver is just making up a
> > maximum register number.

> Alright, so I should just use a separate config for each regmap and set
> the max_register value for each to whatever I can find is actually the
> highest used value in the downstream code -- correct?

That sounds plausible if you don't have any other information about the
register maps.


signature.asc
Description: PGP signature


Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Karel Balej
Mark Brown, 2024-03-21T19:00:24+00:00:
> On Thu, Mar 21, 2024 at 07:16:43PM +0100, Karel Balej wrote:
> > Mark Brown, 2024-03-21T17:48:28+00:00:
>
> > > > They do according to the downstream driver which is my only reference.
> > > > In fact, there the driver defines the configs separately for each regmap
> > > > but with the same values.
>
> > > This is a downstream driver - are you sure it's got the best code
> > > quality?
>
> > No, that is why I have rewritten it and tried to improve on this. But
> > like I said, it is my only reference. Is there some other way to verify
> > this value (besides perhaps the datasheet)?
>
> The maximum register is whatever the maximum register we know about for
> the device is, the datasheet is generally a good reference there.
>
> > > I'm not seeing any references to registers with numbers as high as the
> > > maximum register that's there in your driver for example.
>
> > Indeed, I have performed the same check with the same findings. But that
> > doesn't necessarily mean that the maximum should be lower, no?
>
> > Do you have some specific modifications of my code in mind regarding
> > this?
>
> I would expect that if you have two separate register maps they would
> have separate configurations that describe the corresponding physical
> register maps, as far as I can tell this driver is just making up a
> maximum register number.

Alright, so I should just use a separate config for each regmap and set
the max_register value for each to whatever I can find is actually the
highest used value in the downstream code -- correct?

Thank you,
K. B.



Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Mark Brown
On Thu, Mar 21, 2024 at 07:16:43PM +0100, Karel Balej wrote:
> Mark Brown, 2024-03-21T17:48:28+00:00:

> > > They do according to the downstream driver which is my only reference.
> > > In fact, there the driver defines the configs separately for each regmap
> > > but with the same values.

> > This is a downstream driver - are you sure it's got the best code
> > quality?

> No, that is why I have rewritten it and tried to improve on this. But
> like I said, it is my only reference. Is there some other way to verify
> this value (besides perhaps the datasheet)?

The maximum register is whatever the maximum register we know about for
the device is, the datasheet is generally a good reference there.

> > I'm not seeing any references to registers with numbers as high as the
> > maximum register that's there in your driver for example.

> Indeed, I have performed the same check with the same findings. But that
> doesn't necessarily mean that the maximum should be lower, no?

> Do you have some specific modifications of my code in mind regarding
> this?

I would expect that if you have two separate register maps they would
have separate configurations that describe the corresponding physical
register maps, as far as I can tell this driver is just making up a
maximum register number.


signature.asc
Description: PGP signature


Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Karel Balej
Mark Brown, 2024-03-21T17:48:28+00:00:
> On Thu, Mar 21, 2024 at 06:32:03PM +0100, Karel Balej wrote:
> > Mark Brown, 2024-03-21T17:17:40+00:00:
>
> > > Do they both genuinely have the same maximum register?
>
> > They do according to the downstream driver which is my only reference.
> > In fact, there the driver defines the configs separately for each regmap
> > but with the same values.
>
> This is a downstream driver - are you sure it's got the best code
> quality?

No, that is why I have rewritten it and tried to improve on this. But
like I said, it is my only reference. Is there some other way to verify
this value (besides perhaps the datasheet)?

> I'm not seeing any references to registers with numbers as high as the
> maximum register that's there in your driver for example.

Indeed, I have performed the same check with the same findings. But that
doesn't necessarily mean that the maximum should be lower, no?

Do you have some specific modifications of my code in mind regarding
this?

Thank you,
K. B.



Re: [RFC PATCH] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS

2024-03-21 Thread Björn Töpel
Andy Chiu  writes:

> On Thu, Mar 21, 2024 at 4:48 PM Björn Töpel  wrote:
>>
>> Andy,
>>
>> Pulling out the A option:
>>
>> >> > A) Use auipc/jalr, only patch jalr to take us to a common
>> >> >dispatcher/trampoline
>> >> >
>> >> >  |  # probably on a data cache-line != func 
>> >> > .text to avoid ping-pong
>> >> >  | ...
>> >> >  | func:
>> >> >  |   ...make sure ra isn't messed up...
>> >> >  |   aupic
>> >> >  |   nop <=> jalr # Text patch point -> common_dispatch
>> >> >  |   ACTUAL_FUNC
>> >> >  |
>> >> >  | common_dispatch:
>> >> >  |   load  based on ra
>> >> >  |   jalr
>> >> >  |   ...
>> >> >
>> >> > The auipc is never touched, and will be overhead. Also, we need a mv to
>> >> > store ra in a scratch register as well -- like Arm. We'll have two insn
>> >> > per-caller overhead for a disabled caller.
>> >
>> > My patch series takes a similar "in-function dispatch" approach. A
>> > difference is that the  is
>> > embedded within each function entry. I'd like to have it moved to a
>> > run-time allocated array to reduce total text size.
>>
>> This is what arm64 has as well. It's a 8B + 1-2 dirt cheap movish like
>> instructions (save ra, prepare jump with auipc). I think that's a
>> reasonable overhead.
>>
>> > Another difference is that my series changes the first instruction to
>> > "j ACTUAL_FUNC" for the "ftrace disable" case. As long as the
>> > architecture guarantees the atomicity of the first instruction, then
>> > we are safe. For example, we are safe if the first instruction could
>> > only be "mv tmp, ra" or "j ACTUAL_FUNC". And since the loaded address is
>> > always valid, we can fix "mv + jalr" down so we don't have to
>> > play with the exception handler trick. The guarantee from arch would
>> > require ziccif (in RVA22) though, but I think it is the same for us
>> > (unless with stop_machine). For ziccif, I would rather call that out
>> > during boot than blindly assume.
>>
>> I'm maybe biased, but I'd prefer the A) over your version with the
>> unconditional jump. A) has the overhead of two, I'd say, free
>> instructions (again "Meten is Weten!" ;-)).
>
> Yes, I'd also prefer A for less overall patch size. We can also
> optimize the overhead with a direct jump if that makes sense. Though,
> we need to sort out a way to map functions to corresponding
> trampolines. A direct way I could image is CALL_OPS'ish patching
> style, if the ftrace destination has to be patched in a per-function
> manner. For example:
>
> 
> func_symbol:
> auipc t0, common_dispatch:high <=> j actual_func:
> jalr t0, common_dispatch:low(t0)
>
> common_dispatch:
> load t1, index + dispatch-list
> ld t1, 0(t1)
> jr t1

Yup, exactly like that (but I'd put the acutal target ptr in there,
instead of an additional indirection. Exactly what Mark does for arm64).

When we enter the common_dispatch, the ptr 
would be -12(t0).

As for patching auipc or jalr, I guess we need to measure what's best.
My knee-jerk would be always auipc is better than jump -- but let's
measure. ;-)


Björn



Re: [GIT PULL] remoteproc updates for v6.9

2024-03-21 Thread Linus Torvalds
On Thu, 21 Mar 2024 at 11:03, Bjorn Andersson  wrote:
>
> I was further notified that this conflicts with your tree, Linus. Below
> is the resolution for this conflict.

Heh. This email came in after the pr-tracker-bot email notifying you
that it's already done..

I think I got it all right, it didn't seem at all controversial, but
maybe you should double-check.

  Linus



Re: [GIT PULL] remoteproc updates for v6.9

2024-03-21 Thread Bjorn Andersson
On Thu, Mar 21, 2024 at 05:55:13AM -0700, Bjorn Andersson wrote:
> I'm sorry for the late pull request, I apparently had managed to get git
> send-email to only deliver my mail to /dev/null on the machine where I 
> prepared
> this.
> 

I was further notified that this conflicts with your tree, Linus. Below
is the resolution for this conflict.

Please let me know if you'd like me to create some form of merge commit
for this.

Regards,
Bjorn

diff --cc drivers/remoteproc/imx_dsp_rproc.c
index d73727a5828a,56a799cb8b36..087506e21508
--- a/drivers/remoteproc/imx_dsp_rproc.c
+++ b/drivers/remoteproc/imx_dsp_rproc.c
@@@ -1090,9 -1154,7 +1090,7 @@@ static int imx_dsp_rproc_probe(struct p
return 0;
  
  err_detach_domains:
 -  imx_dsp_detach_pm_domains(priv);
 +  dev_pm_domain_detach_list(priv->pd_list);
- err_put_rproc:
-   rproc_free(rproc);
  
return ret;
  }
@@@ -1104,8 -1166,7 +1102,7 @@@ static void imx_dsp_rproc_remove(struc
  
pm_runtime_disable(>dev);
rproc_del(rproc);
 -  imx_dsp_detach_pm_domains(priv);
 +  dev_pm_domain_detach_list(priv->pd_list);
-   rproc_free(rproc);
  }
  
  /* pm runtime functions */
diff --cc drivers/remoteproc/qcom_q6v5_adsp.c
index 93f9a1537ec6,34ac996a93b2..1d24c9b656a8
--- a/drivers/remoteproc/qcom_q6v5_adsp.c
+++ b/drivers/remoteproc/qcom_q6v5_adsp.c
@@@ -704,13 -713,15 +704,13 @@@ static int adsp_probe(struct platform_d
  
ret = adsp_init_clock(adsp, desc->clk_ids);
if (ret)
-   goto free_rproc;
+   return ret;
  
 -  ret = qcom_rproc_pds_attach(adsp->dev, adsp,
 -  desc->proxy_pd_names);
 +  ret = qcom_rproc_pds_attach(adsp, desc->pd_names, desc->num_pds);
if (ret < 0) {
dev_err(>dev, "Failed to attach proxy power domains\n");
-   goto free_rproc;
+   return ret;
}
 -  adsp->proxy_pd_count = ret;
  
ret = adsp_init_reset(adsp);
if (ret)
@@@ -742,11 -753,8 +742,8 @@@
return 0;
  
  disable_pm:
 -  qcom_rproc_pds_detach(adsp, adsp->proxy_pds, adsp->proxy_pd_count);
 +  qcom_rproc_pds_detach(adsp);
  
- free_rproc:
-   rproc_free(rproc);
- 
return ret;
  }
  
@@@ -760,8 -768,7 +757,7 @@@ static void adsp_remove(struct platform
qcom_remove_glink_subdev(adsp->rproc, >glink_subdev);
qcom_remove_sysmon_subdev(adsp->sysmon);
qcom_remove_ssr_subdev(adsp->rproc, >ssr_subdev);
 -  qcom_rproc_pds_detach(adsp, adsp->proxy_pds, adsp->proxy_pd_count);
 +  qcom_rproc_pds_detach(adsp);
-   rproc_free(adsp->rproc);
  }
  
  static const struct adsp_pil_data adsp_resource_init = {



Re: [GIT PULL] rpmsg updates for v6.9

2024-03-21 Thread pr-tracker-bot
The pull request you sent on Thu, 21 Mar 2024 05:56:51 -0700:

> https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git 
> tags/rpmsg-v6.9

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/91f263dda66a2dd4bf0c5d8ad6f48ab9fd5d9eca

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html



Re: [GIT PULL] hwspinlock updates for v6.9

2024-03-21 Thread pr-tracker-bot
The pull request you sent on Thu, 21 Mar 2024 05:57:28 -0700:

> https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git 
> tags/hwlock-v6.9

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/2ac2b1665d3fbec6ca709dd6ef3ea05f4a51ee4c

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html



Re: [GIT PULL] remoteproc updates for v6.9

2024-03-21 Thread pr-tracker-bot
The pull request you sent on Thu, 21 Mar 2024 05:55:13 -0700:

> https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git 
> tags/rproc-v6.9

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/0e875ee5e897db13104faab93bb1ab2b95da9ab9

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html



Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Mark Brown
On Thu, Mar 21, 2024 at 06:32:03PM +0100, Karel Balej wrote:
> Mark Brown, 2024-03-21T17:17:40+00:00:

> > Do they both genuinely have the same maximum register?

> They do according to the downstream driver which is my only reference.
> In fact, there the driver defines the configs separately for each regmap
> but with the same values.

This is a downstream driver - are you sure it's got the best code
quality?  I'm not seeing any references to registers with numbers as
high as the maximum register that's there in your driver for example.


signature.asc
Description: PGP signature


Re: [RFC PATCH] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS

2024-03-21 Thread Andy Chiu
On Thu, Mar 21, 2024 at 4:48 PM Björn Töpel  wrote:
>
> Andy,
>
> Pulling out the A option:
>
> >> > A) Use auipc/jalr, only patch jalr to take us to a common
> >> >dispatcher/trampoline
> >> >
> >> >  |  # probably on a data cache-line != func 
> >> > .text to avoid ping-pong
> >> >  | ...
> >> >  | func:
> >> >  |   ...make sure ra isn't messed up...
> >> >  |   aupic
> >> >  |   nop <=> jalr # Text patch point -> common_dispatch
> >> >  |   ACTUAL_FUNC
> >> >  |
> >> >  | common_dispatch:
> >> >  |   load  based on ra
> >> >  |   jalr
> >> >  |   ...
> >> >
> >> > The auipc is never touched, and will be overhead. Also, we need a mv to
> >> > store ra in a scratch register as well -- like Arm. We'll have two insn
> >> > per-caller overhead for a disabled caller.
> >
> > My patch series takes a similar "in-function dispatch" approach. A
> > difference is that the  is
> > embedded within each function entry. I'd like to have it moved to a
> > run-time allocated array to reduce total text size.
>
> This is what arm64 has as well. It's a 8B + 1-2 dirt cheap movish like
> instructions (save ra, prepare jump with auipc). I think that's a
> reasonable overhead.
>
> > Another difference is that my series changes the first instruction to
> > "j ACTUAL_FUNC" for the "ftrace disable" case. As long as the
> > architecture guarantees the atomicity of the first instruction, then
> > we are safe. For example, we are safe if the first instruction could
> > only be "mv tmp, ra" or "j ACTUAL_FUNC". And since the loaded address is
> > always valid, we can fix "mv + jalr" down so we don't have to
> > play with the exception handler trick. The guarantee from arch would
> > require ziccif (in RVA22) though, but I think it is the same for us
> > (unless with stop_machine). For ziccif, I would rather call that out
> > during boot than blindly assume.
>
> I'm maybe biased, but I'd prefer the A) over your version with the
> unconditional jump. A) has the overhead of two, I'd say, free
> instructions (again "Meten is Weten!" ;-)).

Yes, I'd also prefer A for less overall patch size. We can also
optimize the overhead with a direct jump if that makes sense. Though,
we need to sort out a way to map functions to corresponding
trampolines. A direct way I could image is CALL_OPS'ish patching
style, if the ftrace destination has to be patched in a per-function
manner. For example:


func_symbol:
auipc t0, common_dispatch:high <=> j actual_func:
jalr t0, common_dispatch:low(t0)

common_dispatch:
load t1, index + dispatch-list
ld t1, 0(t1)
jr t1


>
> > However, one thing I am not very sure is: do we need a destination
> > address in a "per-function" manner? It seems like most of the time the
> > destination address can only be ftrace_call, or ftrace_regs_call. If
> > the number of destination addresses is very few, then we could
> > potentially reduce the size of
> > .
>
> Yes, we do need a per-function manner. BPF, e.g., uses
> dynamically/JIT:ed trampolines/targets.
>
>
>
> Björn

Cheers,
Andy



Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Karel Balej
Mark Brown, 2024-03-21T17:17:40+00:00:
> On Thu, Mar 21, 2024 at 06:08:16PM +0100, Karel Balej wrote:
> > Mark Brown, 2024-03-21T16:58:44+00:00:
>
> > > > > > > > +static const struct regmap_config pm886_i2c_regmap = {
> > > > > > > > +   .reg_bits = 8,
> > > > > > > > +   .val_bits = 8,
> > > > > > > > +   .max_register = PM886_REGMAP_CONF_MAX_REG,
> > > > > > > > +};
>
> ...
>
> > > You shouldn't be creating two regmaps for the same set of registers,
> > > that just opens the potential for confusion.
>
> > Just the regmap config is the same. Otherwise, each regmap lives at a
> > different I2C address.
>
> Do they both genuinely have the same maximum register?

They do according to the downstream driver which is my only reference.
In fact, there the driver defines the configs separately for each regmap
but with the same values.



Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Mark Brown
On Thu, Mar 21, 2024 at 06:08:16PM +0100, Karel Balej wrote:
> Mark Brown, 2024-03-21T16:58:44+00:00:

> > > > > > > +static const struct regmap_config pm886_i2c_regmap = {
> > > > > > > + .reg_bits = 8,
> > > > > > > + .val_bits = 8,
> > > > > > > + .max_register = PM886_REGMAP_CONF_MAX_REG,
> > > > > > > +};

...

> > You shouldn't be creating two regmaps for the same set of registers,
> > that just opens the potential for confusion.

> Just the regmap config is the same. Otherwise, each regmap lives at a
> different I2C address.

Do they both genuinely have the same maximum register?


signature.asc
Description: PGP signature


Re: [syzbot] [virtualization?] upstream boot error: WARNING: refcount bug in __free_pages_ok

2024-03-21 Thread Stefan Hajnoczi
On Thu, Mar 21, 2024 at 08:52:03AM -0700, syzbot wrote:
> Hello,
> 
> syzbot tried to test the proposed patch but the build/boot failed:
> 
> bcore: registered new interface driver viperboard
> [7.297712][T1] usbcore: registered new interface driver dln2
> [7.299149][T1] usbcore: registered new interface driver pn533_usb
> [7.304759][  T924] kworker/u4:1 (924) used greatest stack depth: 22768 
> bytes left
> [7.308971][T1] nfcsim 0.2 initialized
> [7.310068][T1] usbcore: registered new interface driver port100
> [7.311312][T1] usbcore: registered new interface driver nfcmrvl
> [7.318405][T1] Loading iSCSI transport class v2.0-870.
> [7.334687][T1] virtio_scsi virtio0: 1/0/0 default/read/poll queues
> [7.344927][T1] [ cut here ]
> [7.345739][T1] refcount_t: decrement hit 0; leaking memory.

This confirms that the following commit introduced this issue:

  commit 217b2119b9e260609958db413876f211038f00ee
  Author: Oscar Salvador 
  Date:   Thu Feb 15 22:59:04 2024 +0100

  mm,page_owner: implement the tracking of the stacks count

Mike: thanks for pointing out the fix that Oscar is working on!

Oscar: Please add the syzbot trailer to the next revision of your
"[PATCH v2 0/2] page_owner: Refcount fixups" series so this issue can be
closed.

> [7.346982][T1] WARNING: CPU: 0 PID: 1 at lib/refcount.c:31 
> refcount_warn_saturate+0xfa/0x1d0
> [7.348761][T1] Modules linked in:
> [7.349418][T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
> 6.8.0-rc5-syzkaller-00257-g217b2119b9e2 #0
> [7.351070][T1] Hardware name: Google Google Compute Engine/Google 
> Compute Engine, BIOS Google 02/29/2024
> [7.352824][T1] RIP: 0010:refcount_warn_saturate+0xfa/0x1d0
> [7.353979][T1] Code: b2 00 00 00 e8 97 2d fc fc 5b 5d c3 cc cc cc cc 
> e8 8b 2d fc fc c6 05 0d d9 d6 0a 01 90 48 c7 c7 a0 46 fd 8b e8 e7 2c c0 fc 90 
> <0f> 0b 90 90 eb d9 e8 6b 2d fc fc c6 05 ea d8 d6 0a 01 90 48 c7 c7
> [7.358181][T1] RSP: :c9066e10 EFLAGS: 00010246
> [7.360206][T1] RAX: 67b097fa09053300 RBX: 88814073377c RCX: 
> 8880166c
> [7.362234][T1] RDX:  RSI:  RDI: 
> 
> [7.363496][T1] RBP: 0004 R08: 81589d62 R09: 
> 1920cd14
> [7.365139][T1] R10: dc00 R11: f520cd15 R12: 
> ea000501edc0
> [7.366612][T1] R13: ea000501edc8 R14: 1d4000a03db9 R15: 
> 
> [7.368171][T1] FS:  () GS:8880b940() 
> knlGS:
> [7.370111][T1] CS:  0010 DS:  ES:  CR0: 80050033
> [7.371030][T1] CR2: 88823000 CR3: 0df34000 CR4: 
> 003506f0
> [7.372121][T1] DR0:  DR1:  DR2: 
> 
> [7.373506][T1] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [7.374889][T1] Call Trace:
> [7.375371][T1]  
> [7.375798][T1]  ? __warn+0x162/0x4b0
> [7.376442][T1]  ? refcount_warn_saturate+0xfa/0x1d0
> [7.377482][T1]  ? report_bug+0x2b3/0x500
> [7.378161][T1]  ? refcount_warn_saturate+0xfa/0x1d0
> [7.379268][T1]  ? handle_bug+0x3e/0x70
> [7.379887][T1]  ? exc_invalid_op+0x1a/0x50
> [7.380563][T1]  ? asm_exc_invalid_op+0x1a/0x20
> [7.381253][T1]  ? __warn_printk+0x292/0x360
> [7.381912][T1]  ? refcount_warn_saturate+0xfa/0x1d0
> [7.382752][T1]  __free_pages_ok+0xc36/0xd60
> [7.384180][T1]  make_alloc_exact+0xc4/0x140
> [7.385037][T1]  vring_alloc_queue_split+0x20a/0x600
> [7.386037][T1]  ? __pfx_vring_alloc_queue_split+0x10/0x10
> [7.387029][T1]  ? vp_find_vqs+0x4c/0x4e0
> [7.387719][T1]  ? virtscsi_probe+0x3ea/0xf60
> [7.388408][T1]  ? virtio_dev_probe+0x991/0xaf0
> [7.389665][T1]  ? really_probe+0x29e/0xc50
> [7.390429][T1]  ? driver_probe_device+0x50/0x430
> [7.391176][T1]  vring_create_virtqueue_split+0xc6/0x310
> [7.392014][T1]  ? ret_from_fork+0x4b/0x80
> [7.392800][T1]  ? __pfx_vring_create_virtqueue_split+0x10/0x10
> [7.394115][T1]  vring_create_virtqueue+0xca/0x110
> [7.395151][T1]  ? __pfx_vp_notify+0x10/0x10
> [7.395888][T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
> [7.396674][T1]  setup_vq+0xe9/0x2d0
> [7.397283][T1]  ? __pfx_vp_notify+0x10/0x10
> [7.397938][T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
> [7.398806][T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
> [7.399938][T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
> [7.400951][T1]  vp_setup_vq+0xbf/0x330
> [7.401889][T1]  ? __pfx_vp_config_changed+0x10/0x10
> [7.403092][T1]  ? ioread16+0x2f/0x90
> [7.403909][T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
> [ 

Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Karel Balej
Mark Brown, 2024-03-21T16:58:44+00:00:
> On Thu, Mar 21, 2024 at 05:55:17PM +0100, Karel Balej wrote:
> > Lee Jones, 2024-03-21T16:20:45+00:00:
> > > On Thu, 21 Mar 2024, Karel Balej wrote:
>
> > > > > > +static const struct regmap_config pm886_i2c_regmap = {
> > > > > > +   .reg_bits = 8,
> > > > > > +   .val_bits = 8,
> > > > > > +   .max_register = PM886_REGMAP_CONF_MAX_REG,
> > > > > > +};
>
> > > > > Why is this in here?
>
> > > > Because since I moved the regulators regmap initialization into the
> > > > regulators driver, I need to access it from there.
>
> > > So move it into the regulators driver?
>
> > It is used in the MFD driver too for the base regmap.
>
> You shouldn't be creating two regmaps for the same set of registers,
> that just opens the potential for confusion.

Just the regmap config is the same. Otherwise, each regmap lives at a
different I2C address.

Regards,
K. B.



Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Mark Brown
On Thu, Mar 21, 2024 at 05:55:17PM +0100, Karel Balej wrote:
> Lee Jones, 2024-03-21T16:20:45+00:00:
> > On Thu, 21 Mar 2024, Karel Balej wrote:

> > > > > +static const struct regmap_config pm886_i2c_regmap = {
> > > > > + .reg_bits = 8,
> > > > > + .val_bits = 8,
> > > > > + .max_register = PM886_REGMAP_CONF_MAX_REG,
> > > > > +};

> > > > Why is this in here?

> > > Because since I moved the regulators regmap initialization into the
> > > regulators driver, I need to access it from there.

> > So move it into the regulators driver?

> It is used in the MFD driver too for the base regmap.

You shouldn't be creating two regmaps for the same set of registers,
that just opens the potential for confusion.


signature.asc
Description: PGP signature


[PATCH] virtio_net: Do not send RSS key if it is not supported

2024-03-21 Thread Breno Leitao
There is a bug when setting the RSS options in virtio_net that can break
the whole machine, getting the kernel into an infinite loop.

Running the following command in any QEMU virtual machine with virtionet
will reproduce this problem:

# ethtool -X eth0  hfunc toeplitz

This is how the problem happens:

1) ethtool_set_rxfh() calls virtnet_set_rxfh()

2) virtnet_set_rxfh() calls virtnet_commit_rss_command()

3) virtnet_commit_rss_command() populates 4 entries for the rss
   scatter-gather

4) Since the command above does not have a key, then the last
   scatter-gatter entry will be zeroed, since rss_key_size == 0.
sg_buf_size = vi->rss_key_size;

5) This buffer is passed to qemu, but qemu is not happy with a buffer
   with zero length, and do the following in virtqueue_map_desc() (QEMU
   function):

  if (!sz) {
  virtio_error(vdev, "virtio: zero sized buffers are not allowed");

6) virtio_error() (also QEMU function) set the device as broken

vdev->broken = true;

7) Qemu bails out, and do not repond this crazy kernel.

8) The kernel is waiting for the response to come back (function
   virtnet_send_command())

9) The kernel is waiting doing the following :

  while (!virtqueue_get_buf(vi->cvq, ) &&
 !virtqueue_is_broken(vi->cvq))
  cpu_relax();

10) None of the following functions above is true, thus, the kernel
loops here forever. Keeping in mind that virtqueue_is_broken() does
not look at the qemu `vdev->broken`, so, it never realizes that the
vitio is broken at QEMU side.

Fix it by not sending the key scatter-gatter key if it is not set.

Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Signed-off-by: Breno Leitao 
Cc: sta...@vger.kernel.org
Cc: qemu-de...@nongnu.org
---
 drivers/net/virtio_net.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index d7ce4a1011ea..5a7700b103f8 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3041,11 +3041,16 @@ static int virtnet_set_ringparam(struct net_device *dev,
 static bool virtnet_commit_rss_command(struct virtnet_info *vi)
 {
struct net_device *dev = vi->dev;
+   int has_key = vi->rss_key_size;
struct scatterlist sgs[4];
unsigned int sg_buf_size;
+   int nents = 3;
+
+   if (has_key)
+   nents += 1;
 
/* prepare sgs */
-   sg_init_table(sgs, 4);
+   sg_init_table(sgs, nents);
 
sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table);
sg_set_buf([0], >ctrl->rss, sg_buf_size);
@@ -3057,8 +3062,13 @@ static bool virtnet_commit_rss_command(struct 
virtnet_info *vi)
- offsetof(struct virtio_net_ctrl_rss, max_tx_vq);
sg_set_buf([2], >ctrl->rss.max_tx_vq, sg_buf_size);
 
-   sg_buf_size = vi->rss_key_size;
-   sg_set_buf([3], vi->ctrl->rss.key, sg_buf_size);
+   if (has_key) {
+   /* Only populate if key is available, otherwise
+* populating a buffer with zero size breaks virtio
+*/
+   sg_buf_size = vi->rss_key_size;
+   sg_set_buf([3], vi->ctrl->rss.key, sg_buf_size);
+   }
 
if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ,
  vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG
-- 
2.43.0




Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Karel Balej
Lee Jones, 2024-03-21T16:20:45+00:00:
> On Thu, 21 Mar 2024, Karel Balej wrote:
>
> > Lee Jones, 2024-03-21T15:42:11+00:00:
> > > On Mon, 11 Mar 2024, Karel Balej wrote:
> > > > diff --git a/include/linux/mfd/88pm886.h b/include/linux/mfd/88pm886.h
> > > > new file mode 100644
> > > > index ..a5e6524bb19d
> > > > --- /dev/null
> > > > +++ b/include/linux/mfd/88pm886.h
> > > > @@ -0,0 +1,38 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > > +#ifndef __MFD_88PM886_H
> > > > +#define __MFD_88PM886_H
> > > > +
> > > > +#include 
> > > > +#include 
> > > > +
> > > > +#define PM886_A1_CHIP_ID   0xa1
> > > > +
> > > > +#define PM886_REGMAP_CONF_MAX_REG  0xfe
> > > > +
> > > > +#define PM886_REG_ID   0x00
> > > > +
> > > > +#define PM886_REG_STATUS1  0x01
> > > > +#define PM886_ONKEY_STS1   BIT(0)
> > > > +
> > > > +#define PM886_REG_MISC_CONFIG1 0x14
> > > > +#define PM886_SW_PDOWN BIT(5)
> > > > +
> > > > +#define PM886_REG_MISC_CONFIG2 0x15
> > > > +#define PM886_INT_INV  BIT(0)
> > > > +#define PM886_INT_CLEARBIT(1)
> > > > +#define PM886_INT_RC   0x00
> > > > +#define PM886_INT_WC   BIT(1)
> > > > +#define PM886_INT_MASK_MODEBIT(2)
> > > > +
> > > > +struct pm886_chip {
> > > > +   struct i2c_client *client;
> > > > +   unsigned int chip_id;
> > > > +   struct regmap *regmap;
> > > > +};
> > > > +
> > > > +static const struct regmap_config pm886_i2c_regmap = {
> > > > +   .reg_bits = 8,
> > > > +   .val_bits = 8,
> > > > +   .max_register = PM886_REGMAP_CONF_MAX_REG,
> > > > +};
> > >
> > > Why is this in here?
> > 
> > Because since I moved the regulators regmap initialization into the
> > regulators driver, I need to access it from there.
>
> So move it into the regulators driver?

It is used in the MFD driver too for the base regmap.

K. B.



Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Lee Jones
On Thu, 21 Mar 2024, Karel Balej wrote:

> Lee Jones, 2024-03-21T15:42:11+00:00:
> > On Mon, 11 Mar 2024, Karel Balej wrote:
> >
> > > From: Karel Balej 
> > > 
> > > Marvell 88PM886 is a PMIC which provides various functions such as
> > > onkey, battery, charger and regulators. It is found for instance in the
> > > samsung,coreprimevelte smartphone with which this was tested. Implement
> > > basic support to allow for the use of regulators and onkey.
> > > 
> > > Signed-off-by: Karel Balej 
> > > ---
> > > 
> > > Notes:
> > > RFC v4:
> > > - Use MFD_CELL_* macros.
> > > - Address Lee's feedback:
> > >   - Do not define regmap_config.val_bits and .reg_bits.
> > >   - Drop everything regulator related except mfd_cell (regmap
> > > initialization, IDs enum etc.). Drop pm886_initialize_subregmaps.
> > >   - Do not store regmap pointers as an array as there is now only one
> > > regmap. Also drop the corresponding enum.
> > >   - Move regmap_config to the header as it is needed in the regulators
> > > driver.
> > >   - pm886_chip.whoami -> chip_id
> > >   - Reword chip ID mismatch error message and print the ID as
> > > hexadecimal.
> > >   - Fix includes in include/linux/88pm886.h.
> > >   - Drop the pm886_irq_number enum and define the (for the moment) 
> > > only
> > > IRQ explicitly.
> > > - Have only one MFD cell for all regulators as they are now registered
> > >   all at once in the regulators driver.
> > > - Reword commit message.
> > > - Make device table static and remove comma after the sentinel to 
> > > signal
> > >   that nothing should come after it.
> > > RFC v3:
> > > - Drop onkey cell .of_compatible.
> > > - Rename LDO page offset and regmap to REGULATORS.
> > > RFC v2:
> > > - Remove some abstraction.
> > > - Sort includes alphabetically and add linux/of.h.
> > > - Depend on OF, remove of_match_ptr and add MODULE_DEVICE_TABLE.
> > > - Use more temporaries and break long lines.
> > > - Do not initialize ret in probe.
> > > - Use the wakeup-source DT property.
> > > - Rename ret to err.
> > > - Address Lee's comments:
> > >   - Drop patched in presets for base regmap and related defines.
> > >   - Use full sentences in comments.
> > >   - Remove IRQ comment.
> > >   - Define regmap_config member values.
> > >   - Rename data to sys_off_data.
> > >   - Add _PMIC suffix to Kconfig.
> > >   - Use dev_err_probe.
> > >   - Do not store irq_data.
> > >   - s/WHOAMI/CHIP_ID
> > >   - Drop LINUX part of include guard name.
> > >   - Merge in the regulator series modifications in order to have more
> > > devices and modify the commit message accordingly. Changes with
> > > respect to the original regulator series patches:
> > > - ret -> err
> > > - Add temporary for dev in pm88x_initialize_subregmaps.
> > > - Drop of_compatible for the regulators.
> > > - Do not duplicate LDO regmap for bucks.
> > > - Rewrite commit message.
> > > 
> > >  drivers/mfd/88pm886.c   | 149 
> > >  drivers/mfd/Kconfig |  12 +++
> > >  drivers/mfd/Makefile|   1 +
> > >  include/linux/mfd/88pm886.h |  38 +
> > >  4 files changed, 200 insertions(+)
> > >  create mode 100644 drivers/mfd/88pm886.c
> > >  create mode 100644 include/linux/mfd/88pm886.h
> >
> > Looks mostly okay.
> >
> > > diff --git a/include/linux/mfd/88pm886.h b/include/linux/mfd/88pm886.h
> > > new file mode 100644
> > > index ..a5e6524bb19d
> > > --- /dev/null
> > > +++ b/include/linux/mfd/88pm886.h
> > > @@ -0,0 +1,38 @@
> > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > +#ifndef __MFD_88PM886_H
> > > +#define __MFD_88PM886_H
> > > +
> > > +#include 
> > > +#include 
> > > +
> > > +#define PM886_A1_CHIP_ID 0xa1
> > > +
> > > +#define PM886_REGMAP_CONF_MAX_REG0xfe
> > > +
> > > +#define PM886_REG_ID 0x00
> > > +
> > > +#define PM886_REG_STATUS10x01
> > > +#define PM886_ONKEY_STS1 BIT(0)
> > > +
> > > +#define PM886_REG_MISC_CONFIG1   0x14
> > > +#define PM886_SW_PDOWN   BIT(5)
> > > +
> > > +#define PM886_REG_MISC_CONFIG2   0x15
> > > +#define PM886_INT_INVBIT(0)
> > > +#define PM886_INT_CLEAR  BIT(1)
> > > +#define PM886_INT_RC 0x00
> > > +#define PM886_INT_WC BIT(1)
> > > +#define PM886_INT_MASK_MODE  BIT(2)
> > > +
> > > +struct pm886_chip {
> > > + struct i2c_client *client;
> > > + unsigned int chip_id;
> > > + struct regmap *regmap;
> > > +};
> > > +
> > > +static const struct regmap_config pm886_i2c_regmap = {
> > > + .reg_bits = 8,
> > > + .val_bits = 8,
> > > + .max_register = PM886_REGMAP_CONF_MAX_REG,
> > > +};
> >
> > Why is this in here?
> 
> 

Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Karel Balej
Lee Jones, 2024-03-21T15:42:11+00:00:
> On Mon, 11 Mar 2024, Karel Balej wrote:
>
> > From: Karel Balej 
> > 
> > Marvell 88PM886 is a PMIC which provides various functions such as
> > onkey, battery, charger and regulators. It is found for instance in the
> > samsung,coreprimevelte smartphone with which this was tested. Implement
> > basic support to allow for the use of regulators and onkey.
> > 
> > Signed-off-by: Karel Balej 
> > ---
> > 
> > Notes:
> > RFC v4:
> > - Use MFD_CELL_* macros.
> > - Address Lee's feedback:
> >   - Do not define regmap_config.val_bits and .reg_bits.
> >   - Drop everything regulator related except mfd_cell (regmap
> > initialization, IDs enum etc.). Drop pm886_initialize_subregmaps.
> >   - Do not store regmap pointers as an array as there is now only one
> > regmap. Also drop the corresponding enum.
> >   - Move regmap_config to the header as it is needed in the regulators
> > driver.
> >   - pm886_chip.whoami -> chip_id
> >   - Reword chip ID mismatch error message and print the ID as
> > hexadecimal.
> >   - Fix includes in include/linux/88pm886.h.
> >   - Drop the pm886_irq_number enum and define the (for the moment) only
> > IRQ explicitly.
> > - Have only one MFD cell for all regulators as they are now registered
> >   all at once in the regulators driver.
> > - Reword commit message.
> > - Make device table static and remove comma after the sentinel to signal
> >   that nothing should come after it.
> > RFC v3:
> > - Drop onkey cell .of_compatible.
> > - Rename LDO page offset and regmap to REGULATORS.
> > RFC v2:
> > - Remove some abstraction.
> > - Sort includes alphabetically and add linux/of.h.
> > - Depend on OF, remove of_match_ptr and add MODULE_DEVICE_TABLE.
> > - Use more temporaries and break long lines.
> > - Do not initialize ret in probe.
> > - Use the wakeup-source DT property.
> > - Rename ret to err.
> > - Address Lee's comments:
> >   - Drop patched in presets for base regmap and related defines.
> >   - Use full sentences in comments.
> >   - Remove IRQ comment.
> >   - Define regmap_config member values.
> >   - Rename data to sys_off_data.
> >   - Add _PMIC suffix to Kconfig.
> >   - Use dev_err_probe.
> >   - Do not store irq_data.
> >   - s/WHOAMI/CHIP_ID
> >   - Drop LINUX part of include guard name.
> >   - Merge in the regulator series modifications in order to have more
> > devices and modify the commit message accordingly. Changes with
> > respect to the original regulator series patches:
> > - ret -> err
> > - Add temporary for dev in pm88x_initialize_subregmaps.
> > - Drop of_compatible for the regulators.
> > - Do not duplicate LDO regmap for bucks.
> > - Rewrite commit message.
> > 
> >  drivers/mfd/88pm886.c   | 149 
> >  drivers/mfd/Kconfig |  12 +++
> >  drivers/mfd/Makefile|   1 +
> >  include/linux/mfd/88pm886.h |  38 +
> >  4 files changed, 200 insertions(+)
> >  create mode 100644 drivers/mfd/88pm886.c
> >  create mode 100644 include/linux/mfd/88pm886.h
>
> Looks mostly okay.
>
> > diff --git a/include/linux/mfd/88pm886.h b/include/linux/mfd/88pm886.h
> > new file mode 100644
> > index ..a5e6524bb19d
> > --- /dev/null
> > +++ b/include/linux/mfd/88pm886.h
> > @@ -0,0 +1,38 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +#ifndef __MFD_88PM886_H
> > +#define __MFD_88PM886_H
> > +
> > +#include 
> > +#include 
> > +
> > +#define PM886_A1_CHIP_ID   0xa1
> > +
> > +#define PM886_REGMAP_CONF_MAX_REG  0xfe
> > +
> > +#define PM886_REG_ID   0x00
> > +
> > +#define PM886_REG_STATUS1  0x01
> > +#define PM886_ONKEY_STS1   BIT(0)
> > +
> > +#define PM886_REG_MISC_CONFIG1 0x14
> > +#define PM886_SW_PDOWN BIT(5)
> > +
> > +#define PM886_REG_MISC_CONFIG2 0x15
> > +#define PM886_INT_INV  BIT(0)
> > +#define PM886_INT_CLEARBIT(1)
> > +#define PM886_INT_RC   0x00
> > +#define PM886_INT_WC   BIT(1)
> > +#define PM886_INT_MASK_MODEBIT(2)
> > +
> > +struct pm886_chip {
> > +   struct i2c_client *client;
> > +   unsigned int chip_id;
> > +   struct regmap *regmap;
> > +};
> > +
> > +static const struct regmap_config pm886_i2c_regmap = {
> > +   .reg_bits = 8,
> > +   .val_bits = 8,
> > +   .max_register = PM886_REGMAP_CONF_MAX_REG,
> > +};
>
> Why is this in here?

Because since I moved the regulators regmap initialization into the
regulators driver, I need to access it from there.

> What would you like me to do with this RFC patch?

I was hoping that you would take this through the MFD tree (after the
regulator subsystem maintainers approve 

Re: [PATCH] uprobes: reduce contention on uprobes_tree access

2024-03-21 Thread Andrii Nakryiko
On Thu, Mar 21, 2024 at 7:57 AM Jonathan Haslam
 wrote:
>
> Active uprobes are stored in an RB tree and accesses to this tree are
> dominated by read operations. Currently these accesses are serialized by
> a spinlock but this leads to enormous contention when large numbers of
> threads are executing active probes.
>
> This patch converts the spinlock used to serialize access to the
> uprobes_tree RB tree into a reader-writer spinlock. This lock type
> aligns naturally with the overwhelmingly read-only nature of the tree
> usage here. Although the addition of reader-writer spinlocks are
> discouraged [0], this fix is proposed as an interim solution while an
> RCU based approach is implemented (that work is in a nascent form). This
> fix also has the benefit of being trivial, self contained and therefore
> simple to backport.

Yep, makes sense, I think we'll want to backport this ASAP to some of
the old kernels we have. Thanks!

Acked-by: Andrii Nakryiko 

>
> This change has been tested against production workloads that exhibit
> significant contention on the spinlock and an almost order of magnitude
> reduction for mean uprobe execution time is observed (28 -> 3.5 microsecs).
>
> [0] https://docs.kernel.org/locking/spinlocks.html
>
> Signed-off-by: Jonathan Haslam 
> ---
>  kernel/events/uprobes.c | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 929e98c62965..42bf9b6e8bc0 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -39,7 +39,7 @@ static struct rb_root uprobes_tree = RB_ROOT;
>   */
>  #define no_uprobe_events() RB_EMPTY_ROOT(_tree)
>
> -static DEFINE_SPINLOCK(uprobes_treelock);  /* serialize rbtree access */
> +static DEFINE_RWLOCK(uprobes_treelock);/* serialize rbtree access */
>
>  #define UPROBES_HASH_SZ13
>  /* serialize uprobe->pending_list */
> @@ -669,9 +669,9 @@ static struct uprobe *find_uprobe(struct inode *inode, 
> loff_t offset)
>  {
> struct uprobe *uprobe;
>
> -   spin_lock(_treelock);
> +   read_lock(_treelock);
> uprobe = __find_uprobe(inode, offset);
> -   spin_unlock(_treelock);
> +   read_unlock(_treelock);
>
> return uprobe;
>  }
> @@ -701,9 +701,9 @@ static struct uprobe *insert_uprobe(struct uprobe *uprobe)
>  {
> struct uprobe *u;
>
> -   spin_lock(_treelock);
> +   write_lock(_treelock);
> u = __insert_uprobe(uprobe);
> -   spin_unlock(_treelock);
> +   write_unlock(_treelock);
>
> return u;
>  }
> @@ -935,9 +935,9 @@ static void delete_uprobe(struct uprobe *uprobe)
> if (WARN_ON(!uprobe_is_active(uprobe)))
> return;
>
> -   spin_lock(_treelock);
> +   write_lock(_treelock);
> rb_erase(>rb_node, _tree);
> -   spin_unlock(_treelock);
> +   write_unlock(_treelock);
> RB_CLEAR_NODE(>rb_node); /* for uprobe_is_active() */
> put_uprobe(uprobe);
>  }
> @@ -1298,7 +1298,7 @@ static void build_probe_list(struct inode *inode,
> min = vaddr_to_offset(vma, start);
> max = min + (end - start) - 1;
>
> -   spin_lock(_treelock);
> +   read_lock(_treelock);
> n = find_node_in_range(inode, min, max);
> if (n) {
> for (t = n; t; t = rb_prev(t)) {
> @@ -1316,7 +1316,7 @@ static void build_probe_list(struct inode *inode,
> get_uprobe(u);
> }
> }
> -   spin_unlock(_treelock);
> +   read_unlock(_treelock);
>  }
>
>  /* @vma contains reference counter, not the probed instruction. */
> @@ -1407,9 +1407,9 @@ vma_has_uprobes(struct vm_area_struct *vma, unsigned 
> long start, unsigned long e
> min = vaddr_to_offset(vma, start);
> max = min + (end - start) - 1;
>
> -   spin_lock(_treelock);
> +   read_lock(_treelock);
> n = find_node_in_range(inode, min, max);
> -   spin_unlock(_treelock);
> +   read_unlock(_treelock);
>
> return !!n;
>  }
> --
> 2.43.0
>



Re: [syzbot] [virtualization?] upstream boot error: WARNING: refcount bug in __free_pages_ok

2024-03-21 Thread syzbot
Hello,

syzbot tried to test the proposed patch but the build/boot failed:

bcore: registered new interface driver viperboard
[7.297712][T1] usbcore: registered new interface driver dln2
[7.299149][T1] usbcore: registered new interface driver pn533_usb
[7.304759][  T924] kworker/u4:1 (924) used greatest stack depth: 22768 
bytes left
[7.308971][T1] nfcsim 0.2 initialized
[7.310068][T1] usbcore: registered new interface driver port100
[7.311312][T1] usbcore: registered new interface driver nfcmrvl
[7.318405][T1] Loading iSCSI transport class v2.0-870.
[7.334687][T1] virtio_scsi virtio0: 1/0/0 default/read/poll queues
[7.344927][T1] [ cut here ]
[7.345739][T1] refcount_t: decrement hit 0; leaking memory.
[7.346982][T1] WARNING: CPU: 0 PID: 1 at lib/refcount.c:31 
refcount_warn_saturate+0xfa/0x1d0
[7.348761][T1] Modules linked in:
[7.349418][T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
6.8.0-rc5-syzkaller-00257-g217b2119b9e2 #0
[7.351070][T1] Hardware name: Google Google Compute Engine/Google 
Compute Engine, BIOS Google 02/29/2024
[7.352824][T1] RIP: 0010:refcount_warn_saturate+0xfa/0x1d0
[7.353979][T1] Code: b2 00 00 00 e8 97 2d fc fc 5b 5d c3 cc cc cc cc e8 
8b 2d fc fc c6 05 0d d9 d6 0a 01 90 48 c7 c7 a0 46 fd 8b e8 e7 2c c0 fc 90 <0f> 
0b 90 90 eb d9 e8 6b 2d fc fc c6 05 ea d8 d6 0a 01 90 48 c7 c7
[7.358181][T1] RSP: :c9066e10 EFLAGS: 00010246
[7.360206][T1] RAX: 67b097fa09053300 RBX: 88814073377c RCX: 
8880166c
[7.362234][T1] RDX:  RSI:  RDI: 

[7.363496][T1] RBP: 0004 R08: 81589d62 R09: 
1920cd14
[7.365139][T1] R10: dc00 R11: f520cd15 R12: 
ea000501edc0
[7.366612][T1] R13: ea000501edc8 R14: 1d4000a03db9 R15: 

[7.368171][T1] FS:  () GS:8880b940() 
knlGS:
[7.370111][T1] CS:  0010 DS:  ES:  CR0: 80050033
[7.371030][T1] CR2: 88823000 CR3: 0df34000 CR4: 
003506f0
[7.372121][T1] DR0:  DR1:  DR2: 

[7.373506][T1] DR3:  DR6: fffe0ff0 DR7: 
0400
[7.374889][T1] Call Trace:
[7.375371][T1]  
[7.375798][T1]  ? __warn+0x162/0x4b0
[7.376442][T1]  ? refcount_warn_saturate+0xfa/0x1d0
[7.377482][T1]  ? report_bug+0x2b3/0x500
[7.378161][T1]  ? refcount_warn_saturate+0xfa/0x1d0
[7.379268][T1]  ? handle_bug+0x3e/0x70
[7.379887][T1]  ? exc_invalid_op+0x1a/0x50
[7.380563][T1]  ? asm_exc_invalid_op+0x1a/0x20
[7.381253][T1]  ? __warn_printk+0x292/0x360
[7.381912][T1]  ? refcount_warn_saturate+0xfa/0x1d0
[7.382752][T1]  __free_pages_ok+0xc36/0xd60
[7.384180][T1]  make_alloc_exact+0xc4/0x140
[7.385037][T1]  vring_alloc_queue_split+0x20a/0x600
[7.386037][T1]  ? __pfx_vring_alloc_queue_split+0x10/0x10
[7.387029][T1]  ? vp_find_vqs+0x4c/0x4e0
[7.387719][T1]  ? virtscsi_probe+0x3ea/0xf60
[7.388408][T1]  ? virtio_dev_probe+0x991/0xaf0
[7.389665][T1]  ? really_probe+0x29e/0xc50
[7.390429][T1]  ? driver_probe_device+0x50/0x430
[7.391176][T1]  vring_create_virtqueue_split+0xc6/0x310
[7.392014][T1]  ? ret_from_fork+0x4b/0x80
[7.392800][T1]  ? __pfx_vring_create_virtqueue_split+0x10/0x10
[7.394115][T1]  vring_create_virtqueue+0xca/0x110
[7.395151][T1]  ? __pfx_vp_notify+0x10/0x10
[7.395888][T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[7.396674][T1]  setup_vq+0xe9/0x2d0
[7.397283][T1]  ? __pfx_vp_notify+0x10/0x10
[7.397938][T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[7.398806][T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[7.399938][T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[7.400951][T1]  vp_setup_vq+0xbf/0x330
[7.401889][T1]  ? __pfx_vp_config_changed+0x10/0x10
[7.403092][T1]  ? ioread16+0x2f/0x90
[7.403909][T1]  ? __pfx_virtscsi_ctrl_done+0x10/0x10
[7.405136][T1]  vp_find_vqs_msix+0x8b2/0xc80
[7.405892][T1]  vp_find_vqs+0x4c/0x4e0
[7.406823][T1]  virtscsi_init+0x8db/0xd00
[7.407669][T1]  ? __pfx_virtscsi_init+0x10/0x10
[7.408413][T1]  ? __pfx_default_calc_sets+0x10/0x10
[7.409369][T1]  ? scsi_host_alloc+0xa57/0xea0
[7.410333][T1]  ? vp_get+0xfd/0x140
[7.410899][T1]  virtscsi_probe+0x3ea/0xf60
[7.411673][T1]  ? __pfx_virtscsi_probe+0x10/0x10
[7.412520][T1]  ? kernfs_add_one+0x159/0x8b0
[7.413222][T1]  ? virtio_no_restricted_mem_acc+0x9/0x10
[7.414081][T1]  ? virtio_features_ok+0x10c/0x270
[7.414875][T1]  

Re: [PATCH v7 1/5] tracing/probes: support '%pd' type for print struct dentry's name

2024-03-21 Thread Steven Rostedt
On Fri, 22 Mar 2024 00:28:05 +0900
Masami Hiramatsu (Google)  wrote:

> On Fri, 22 Mar 2024 00:07:59 +0900
> Masami Hiramatsu (Google)  wrote:
> 
> > > What would be really useful is if we had a way to expose BTF here. 
> > > Something like:
> > > 
> > >  "%pB::"
> > > 
> > > The "%pB" would mean to look up the struct/field offsets and types via 
> > > BTF,
> > > and create the appropriate command to find and print it.  
> > 
> > Would you mean casing the pointer to ""?  
> 
> BTW, for this BTF type casting, I would like to make it more naturally, like
> ( *)$arg1-> as same as other BTF args.
> 

Sure. I'm just interested in the functionality. I'll let others come up
with the syntax. ;-)

-- Steve



Re: [PATCH v7 1/5] tracing/probes: support '%pd' type for print struct dentry's name

2024-03-21 Thread Steven Rostedt
On Fri, 22 Mar 2024 00:07:59 +0900
Masami Hiramatsu (Google)  wrote:

> > What would be really useful is if we had a way to expose BTF here. 
> > Something like:
> > 
> >  "%pB::"
> > 
> > The "%pB" would mean to look up the struct/field offsets and types via BTF,
> > and create the appropriate command to find and print it.  
> 
> Would you mean casing the pointer to ""?

I mean, instead of having:

 ":%pd"

We could have:

 "+0(*:%pB:dentry:name):string"

Where the parsing could use BTF to see that this is a pointer to "struct dentry"
and the member field is "name".

This would also allow pretty much any other structure dereference.
That is if BTF gives structure member offsets?

-- Steve



Re: [RFC PATCH v4 2/5] mfd: add driver for Marvell 88PM886 PMIC

2024-03-21 Thread Lee Jones
On Mon, 11 Mar 2024, Karel Balej wrote:

> From: Karel Balej 
> 
> Marvell 88PM886 is a PMIC which provides various functions such as
> onkey, battery, charger and regulators. It is found for instance in the
> samsung,coreprimevelte smartphone with which this was tested. Implement
> basic support to allow for the use of regulators and onkey.
> 
> Signed-off-by: Karel Balej 
> ---
> 
> Notes:
> RFC v4:
> - Use MFD_CELL_* macros.
> - Address Lee's feedback:
>   - Do not define regmap_config.val_bits and .reg_bits.
>   - Drop everything regulator related except mfd_cell (regmap
> initialization, IDs enum etc.). Drop pm886_initialize_subregmaps.
>   - Do not store regmap pointers as an array as there is now only one
> regmap. Also drop the corresponding enum.
>   - Move regmap_config to the header as it is needed in the regulators
> driver.
>   - pm886_chip.whoami -> chip_id
>   - Reword chip ID mismatch error message and print the ID as
> hexadecimal.
>   - Fix includes in include/linux/88pm886.h.
>   - Drop the pm886_irq_number enum and define the (for the moment) only
> IRQ explicitly.
> - Have only one MFD cell for all regulators as they are now registered
>   all at once in the regulators driver.
> - Reword commit message.
> - Make device table static and remove comma after the sentinel to signal
>   that nothing should come after it.
> RFC v3:
> - Drop onkey cell .of_compatible.
> - Rename LDO page offset and regmap to REGULATORS.
> RFC v2:
> - Remove some abstraction.
> - Sort includes alphabetically and add linux/of.h.
> - Depend on OF, remove of_match_ptr and add MODULE_DEVICE_TABLE.
> - Use more temporaries and break long lines.
> - Do not initialize ret in probe.
> - Use the wakeup-source DT property.
> - Rename ret to err.
> - Address Lee's comments:
>   - Drop patched in presets for base regmap and related defines.
>   - Use full sentences in comments.
>   - Remove IRQ comment.
>   - Define regmap_config member values.
>   - Rename data to sys_off_data.
>   - Add _PMIC suffix to Kconfig.
>   - Use dev_err_probe.
>   - Do not store irq_data.
>   - s/WHOAMI/CHIP_ID
>   - Drop LINUX part of include guard name.
>   - Merge in the regulator series modifications in order to have more
> devices and modify the commit message accordingly. Changes with
> respect to the original regulator series patches:
> - ret -> err
> - Add temporary for dev in pm88x_initialize_subregmaps.
> - Drop of_compatible for the regulators.
> - Do not duplicate LDO regmap for bucks.
> - Rewrite commit message.
> 
>  drivers/mfd/88pm886.c   | 149 
>  drivers/mfd/Kconfig |  12 +++
>  drivers/mfd/Makefile|   1 +
>  include/linux/mfd/88pm886.h |  38 +
>  4 files changed, 200 insertions(+)
>  create mode 100644 drivers/mfd/88pm886.c
>  create mode 100644 include/linux/mfd/88pm886.h

Looks mostly okay.

> diff --git a/include/linux/mfd/88pm886.h b/include/linux/mfd/88pm886.h
> new file mode 100644
> index ..a5e6524bb19d
> --- /dev/null
> +++ b/include/linux/mfd/88pm886.h
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#ifndef __MFD_88PM886_H
> +#define __MFD_88PM886_H
> +
> +#include 
> +#include 
> +
> +#define PM886_A1_CHIP_ID 0xa1
> +
> +#define PM886_REGMAP_CONF_MAX_REG0xfe
> +
> +#define PM886_REG_ID 0x00
> +
> +#define PM886_REG_STATUS10x01
> +#define PM886_ONKEY_STS1 BIT(0)
> +
> +#define PM886_REG_MISC_CONFIG1   0x14
> +#define PM886_SW_PDOWN   BIT(5)
> +
> +#define PM886_REG_MISC_CONFIG2   0x15
> +#define PM886_INT_INVBIT(0)
> +#define PM886_INT_CLEAR  BIT(1)
> +#define PM886_INT_RC 0x00
> +#define PM886_INT_WC BIT(1)
> +#define PM886_INT_MASK_MODE  BIT(2)
> +
> +struct pm886_chip {
> + struct i2c_client *client;
> + unsigned int chip_id;
> + struct regmap *regmap;
> +};
> +
> +static const struct regmap_config pm886_i2c_regmap = {
> + .reg_bits = 8,
> + .val_bits = 8,
> + .max_register = PM886_REGMAP_CONF_MAX_REG,
> +};

Why is this in here?

> +#endif /* __MFD_88PM886_H */

What would you like me to do with this RFC patch?

-- 
Lee Jones [李琼斯]



Re: [PATCH v7 1/5] tracing/probes: support '%pd' type for print struct dentry's name

2024-03-21 Thread Google
On Fri, 22 Mar 2024 00:07:59 +0900
Masami Hiramatsu (Google)  wrote:

> > What would be really useful is if we had a way to expose BTF here. 
> > Something like:
> > 
> >  "%pB::"
> > 
> > The "%pB" would mean to look up the struct/field offsets and types via BTF,
> > and create the appropriate command to find and print it.
> 
> Would you mean casing the pointer to ""?

BTW, for this BTF type casting, I would like to make it more naturally, like
( *)$arg1-> as same as other BTF args.

Thank you,

-- 
Masami Hiramatsu (Google) 



Re: [PATCH 2/2] remoteproc: mediatek: Don't parse extraneous subnodes for multi-core

2024-03-21 Thread Mathieu Poirier
On Thu, Mar 21, 2024 at 09:46:14AM +0100, AngeloGioacchino Del Regno wrote:
> When probing multi-core SCP, this driver is parsing all sub-nodes of
> the scp-cluster node, but one of those could be not an actual SCP core
> and that would make the entire SCP cluster to fail probing for no good
> reason.
> 
> To fix that, in scp_add_multi_core() treat a subnode as a SCP Core by
> parsing only available subnodes having compatible "mediatek,scp-core".
> 
> Fixes: 1fdbf0cdde98 ("remoteproc: mediatek: Probe SCP cluster on multi-core 
> SCP")
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  drivers/remoteproc/mtk_scp.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
> index 67518291a8ad..fbe1c232dae7 100644
> --- a/drivers/remoteproc/mtk_scp.c
> +++ b/drivers/remoteproc/mtk_scp.c
> @@ -1096,6 +1096,9 @@ static int scp_add_multi_core(struct platform_device 
> *pdev,
>   cluster_of_data = (const struct mtk_scp_of_data 
> **)of_device_get_match_data(dev);
>  
>   for_each_available_child_of_node(np, child) {
> + if (!of_device_is_compatible(child, "mediatek,scp-core"))
> + continue;
> +

Interesting - what else gets stashed under the remote processor node?  I don't
see anything specified in the bindings.

Thanks,
Mathieu

>   if (!cluster_of_data[core_id]) {
>   ret = -EINVAL;
>   dev_err(dev, "Not support core %d\n", core_id);
> -- 
> 2.44.0
> 



Re: [PATCH 1/2] remoteproc: mediatek: Make sure IPI buffer fits in L2TCM

2024-03-21 Thread Mathieu Poirier
Good day,

On Thu, Mar 21, 2024 at 09:46:13AM +0100, AngeloGioacchino Del Regno wrote:
> The IPI buffer location is read from the firmware that we load to the
> System Companion Processor, and it's not granted that both the SRAM
> (L2TCM) size that is defined in the devicetree node is large enough
> for that, and while this is especially true for multi-core SCP, it's
> still useful to check on single-core variants as well.
> 
> Failing to perform this check may make this driver perform R/W
> oeprations out of the L2TCM boundary, resulting (at best) in a

s/oeprations/operations

I will fix that when I apply the patch.

> kernel panic.
> 
> To fix that, check that the IPI buffer fits, otherwise return a
> failure and refuse to boot the relevant SCP core (or the SCP at
> all, if this is single core).
> 
> Fixes: 3efa0ea743b7 ("remoteproc/mediatek: read IPI buffer offset from FW")
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> ---
>  drivers/remoteproc/mtk_scp.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
> index a35409eda0cf..67518291a8ad 100644
> --- a/drivers/remoteproc/mtk_scp.c
> +++ b/drivers/remoteproc/mtk_scp.c
> @@ -132,7 +132,7 @@ static int scp_elf_read_ipi_buf_addr(struct mtk_scp *scp,
>  static int scp_ipi_init(struct mtk_scp *scp, const struct firmware *fw)
>  {
>   int ret;
> - size_t offset;
> + size_t buf_sz, offset;
>  
>   /* read the ipi buf addr from FW itself first */
>   ret = scp_elf_read_ipi_buf_addr(scp, fw, );
> @@ -144,6 +144,14 @@ static int scp_ipi_init(struct mtk_scp *scp, const 
> struct firmware *fw)
>   }
>   dev_info(scp->dev, "IPI buf addr %#010zx\n", offset);
>  
> + /* Make sure IPI buffer fits in the L2TCM range assigned to this core */
> + buf_sz = sizeof(*scp->recv_buf) + sizeof(*scp->send_buf);
> +
> + if (scp->sram_size < buf_sz + offset) {
> + dev_err(scp->dev, "IPI buffer does not fit in SRAM.\n");
> + return -EOVERFLOW;
> + }
> +
>   scp->recv_buf = (struct mtk_share_obj __iomem *)
>   (scp->sram_base + offset);
>   scp->send_buf = (struct mtk_share_obj __iomem *)
> -- 
> 2.44.0
> 



Re: [PATCH 2/3] dt-bindings: remoteproc: add Versal-NET platform

2024-03-21 Thread Tanmay Shah



On 3/21/24 2:39 AM, Krzysztof Kozlowski wrote:
> On 20/03/2024 16:14, Tanmay Shah wrote:
>> 
>> 
>> On 3/20/24 2:40 AM, Krzysztof Kozlowski wrote:
>>> On 19/03/2024 15:42, Tanmay Shah wrote:


 On 3/19/24 12:30 AM, Krzysztof Kozlowski wrote:
> On 19/03/2024 01:51, Tanmay Shah wrote:
>> Hello Krzysztof,
>>
>> Thanks for reviews. Please find my comments below.
>>
>> On 3/17/24 1:53 PM, Krzysztof Kozlowski wrote:
>>> On 15/03/2024 22:15, Tanmay Shah wrote:
 AMD-Xilinx Versal-NET platform is successor of Versal platform. It
 contains multiple clusters of cortex-R52 real-time processing units.
 Each cluster contains two cores of cortex-R52 processors. Each cluster
 can be configured in lockstep mode or split mode.

 Each R52 core is assigned 128KB of TCM memory. ATCM memory is 64KB, 
 BTCM
 and CTCM memoreis are 32KB each. Each TCM memory has its own dedicated
 power-domain that needs to be requested before using it.

 Signed-off-by: Tanmay Shah 
 ---
  .../remoteproc/xlnx,zynqmp-r5fss.yaml | 220 +++---
  1 file changed, 184 insertions(+), 36 deletions(-)

 diff --git 
 a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml 
 b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
 index 711da0272250..55654ee02eef 100644
 --- 
 a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
 +++ 
 b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
 @@ -18,7 +18,9 @@ description: |
  
  properties:
compatible:
 -const: xlnx,zynqmp-r5fss
 +enum:
 +  - xlnx,zynqmp-r5fss
 +  - xlnx,versal-net-r52fss
  
"#address-cells":
  const: 2
 @@ -64,7 +66,9 @@ patternProperties:
  
  properties:
compatible:
 -const: xlnx,zynqmp-r5f
 +enum:
 +  - xlnx,zynqmp-r5f
 +  - xlnx,versal-net-r52f
  
reg:
  minItems: 1
 @@ -135,9 +139,11 @@ required:
  allOf:
- if:
properties:
 -xlnx,cluster-mode:
 -  enum:
 -- 1
 +compatible:
 +  contains:
 +enum:
 +  - xlnx,versal-net-r52fss
>>>
>>> Why do you touch these lines?
>>>
 +
  then:
patternProperties:
  "^r5f@[0-9a-f]+$":
 @@ -149,16 +155,14 @@ allOf:
items:
  - description: ATCM internal memory
  - description: BTCM internal memory
 -- description: extra ATCM memory in lockstep mode
 -- description: extra BTCM memory in lockstep mode
 +- description: CTCM internal memory
  
  reg-names:
minItems: 1
items:
 -- const: atcm0
 -- const: btcm0
 -- const: atcm1
 -- const: btcm1
 +- const: atcm
 +- const: btcm
 +- const: ctcm
  
  power-domains:
minItems: 2
 @@ -166,33 +170,70 @@ allOf:
  - description: RPU core power domain
  - description: ATCM power domain
  - description: BTCM power domain
 -- description: second ATCM power domain
 -- description: second BTCM power domain
 +- description: CTCM power domain
  
  else:
 -  patternProperties:
 -"^r5f@[0-9a-f]+$":
 -  type: object
 -
 -  properties:
 -reg:
 -  minItems: 1
 -  items:
 -- description: ATCM internal memory
 -- description: BTCM internal memory
 -
 -reg-names:
 -  minItems: 1
 -  items:
 -- const: atcm0
 -- const: btcm0
 -
 -power-domains:
 -  minItems: 2
 -  items:
 -- description: RPU core power domain
 -- description: ATCM power domain
 -- description: BTCM power domain
 +  allOf:
 +- if:

Re: [PATCH v7 1/5] tracing/probes: support '%pd' type for print struct dentry's name

2024-03-21 Thread Google
On Thu, 21 Mar 2024 10:15:47 -0400
Steven Rostedt  wrote:

> On Wed, 20 Mar 2024 21:29:20 +0800
> Ye Bin  wrote:
> 
> > Support print type '%pd' for print dentry's  name.
> > 
> 
> The above is not a very detailed change log. A change log should state not
> only what the change is doing, but also why.
> 
> Having examples of before and after would be useful in the change log.

Hm, OK. Ye, something like this.
-
Support print type '%pd' for print dentry's name. For example "name=$arg1:%pd"
casts the `$arg1` as (struct dentry *), dereferences the "d_name.name" field
and stores it to "name" argument as a kernel string. Here is an example;

echo 'p:testprobe dput name=$arg1:%pd' > kprobe_events 
echo 1 > events/kprobes/testprobe/enable 
cat trace
...
  sh-132 [004] .   333.333051: testprobe: (dput+0x4/0x20) 
name="enable"


Note that this expects the given argument (e.g. $arg1) is an address of struct
dentry. User must ensure it.
-

And add similar changelog on [2/5]. 

Thank you,

> 
> > Signed-off-by: Ye Bin 
> > ---
> >  kernel/trace/trace.c|  2 +-
> >  kernel/trace/trace_fprobe.c |  6 +
> >  kernel/trace/trace_kprobe.c |  6 +
> >  kernel/trace/trace_probe.c  | 50 +
> >  kernel/trace/trace_probe.h  |  2 ++
> >  5 files changed, 65 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> > index b12f8384a36a..ac26b8446337 100644
> > --- a/kernel/trace/trace.c
> > +++ b/kernel/trace/trace.c
> 
> 
> > +/* @buf: *buf must be equal to NULL. Caller must to free *buf */
> > +int traceprobe_expand_dentry_args(int argc, const char *argv[], char **buf)
> > +{
> > +   int i, used, ret;
> > +   const int bufsize = MAX_DENTRY_ARGS_LEN;
> > +   char *tmpbuf = NULL;
> > +
> > +   if (*buf)
> > +   return -EINVAL;
> > +
> > +   used = 0;
> > +   for (i = 0; i < argc; i++) {
> > +   if (glob_match("*:%pd", argv[i])) {
> > +   char *tmp;
> > +   char *equal;
> > +
> > +   if (!tmpbuf) {
> > +   tmpbuf = kmalloc(bufsize, GFP_KERNEL);
> > +   if (!tmpbuf)
> > +   return -ENOMEM;
> > +   }
> > +
> > +   tmp = kstrdup(argv[i], GFP_KERNEL);
> > +   if (!tmp)
> > +   goto nomem;
> > +
> > +   equal = strchr(tmp, '=');
> > +   if (equal)
> > +   *equal = '\0';
> > +   tmp[strlen(argv[i]) - 4] = '\0';
> > +   ret = snprintf(tmpbuf + used, bufsize - used,
> > +  "%s%s+0x0(+0x%zx(%s)):string",
> > +  equal ? tmp : "", equal ? "=" : "",
> > +  offsetof(struct dentry, d_name.name),
> > +  equal ? equal + 1 : tmp);
> 
> What would be really useful is if we had a way to expose BTF here. Something 
> like:
> 
>  "%pB::"
> 
> The "%pB" would mean to look up the struct/field offsets and types via BTF,
> and create the appropriate command to find and print it.

Would you mean casing the pointer to ""?


> 
> That would be much more flexible and useful as it would allow reading any
> field from any structure without having to use gdb.
> 
> -- Steve
> 
> 
> > +   kfree(tmp);
> > +   if (ret >= bufsize - used)
> > +   goto nomem;
> > +   argv[i] = tmpbuf + used;
> > +   used += ret + 1;
> > +   }
> > +   }
> > +
> > +   *buf = tmpbuf;
> > +   return 0;
> > +nomem:
> > +   kfree(tmpbuf);
> > +   return -ENOMEM;
> > +}
> 


-- 
Masami Hiramatsu (Google) 



[PATCH] uprobes: reduce contention on uprobes_tree access

2024-03-21 Thread Jonathan Haslam
Active uprobes are stored in an RB tree and accesses to this tree are
dominated by read operations. Currently these accesses are serialized by
a spinlock but this leads to enormous contention when large numbers of
threads are executing active probes.

This patch converts the spinlock used to serialize access to the
uprobes_tree RB tree into a reader-writer spinlock. This lock type
aligns naturally with the overwhelmingly read-only nature of the tree
usage here. Although the addition of reader-writer spinlocks are
discouraged [0], this fix is proposed as an interim solution while an
RCU based approach is implemented (that work is in a nascent form). This
fix also has the benefit of being trivial, self contained and therefore
simple to backport.

This change has been tested against production workloads that exhibit
significant contention on the spinlock and an almost order of magnitude
reduction for mean uprobe execution time is observed (28 -> 3.5 microsecs).

[0] https://docs.kernel.org/locking/spinlocks.html

Signed-off-by: Jonathan Haslam 
---
 kernel/events/uprobes.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 929e98c62965..42bf9b6e8bc0 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -39,7 +39,7 @@ static struct rb_root uprobes_tree = RB_ROOT;
  */
 #define no_uprobe_events() RB_EMPTY_ROOT(_tree)
 
-static DEFINE_SPINLOCK(uprobes_treelock);  /* serialize rbtree access */
+static DEFINE_RWLOCK(uprobes_treelock);/* serialize rbtree access */
 
 #define UPROBES_HASH_SZ13
 /* serialize uprobe->pending_list */
@@ -669,9 +669,9 @@ static struct uprobe *find_uprobe(struct inode *inode, 
loff_t offset)
 {
struct uprobe *uprobe;
 
-   spin_lock(_treelock);
+   read_lock(_treelock);
uprobe = __find_uprobe(inode, offset);
-   spin_unlock(_treelock);
+   read_unlock(_treelock);
 
return uprobe;
 }
@@ -701,9 +701,9 @@ static struct uprobe *insert_uprobe(struct uprobe *uprobe)
 {
struct uprobe *u;
 
-   spin_lock(_treelock);
+   write_lock(_treelock);
u = __insert_uprobe(uprobe);
-   spin_unlock(_treelock);
+   write_unlock(_treelock);
 
return u;
 }
@@ -935,9 +935,9 @@ static void delete_uprobe(struct uprobe *uprobe)
if (WARN_ON(!uprobe_is_active(uprobe)))
return;
 
-   spin_lock(_treelock);
+   write_lock(_treelock);
rb_erase(>rb_node, _tree);
-   spin_unlock(_treelock);
+   write_unlock(_treelock);
RB_CLEAR_NODE(>rb_node); /* for uprobe_is_active() */
put_uprobe(uprobe);
 }
@@ -1298,7 +1298,7 @@ static void build_probe_list(struct inode *inode,
min = vaddr_to_offset(vma, start);
max = min + (end - start) - 1;
 
-   spin_lock(_treelock);
+   read_lock(_treelock);
n = find_node_in_range(inode, min, max);
if (n) {
for (t = n; t; t = rb_prev(t)) {
@@ -1316,7 +1316,7 @@ static void build_probe_list(struct inode *inode,
get_uprobe(u);
}
}
-   spin_unlock(_treelock);
+   read_unlock(_treelock);
 }
 
 /* @vma contains reference counter, not the probed instruction. */
@@ -1407,9 +1407,9 @@ vma_has_uprobes(struct vm_area_struct *vma, unsigned long 
start, unsigned long e
min = vaddr_to_offset(vma, start);
max = min + (end - start) - 1;
 
-   spin_lock(_treelock);
+   read_lock(_treelock);
n = find_node_in_range(inode, min, max);
-   spin_unlock(_treelock);
+   read_unlock(_treelock);
 
return !!n;
 }
-- 
2.43.0




Re: [RFC PATCH] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS

2024-03-21 Thread Steven Rostedt
On Tue, 12 Mar 2024 13:42:28 +
Mark Rutland  wrote:

> > It would be interesting to see how the per-call performance would
> > improve on x86 with CALL_OPS! ;-)  
> 
> Heh. ;)

But this would require adding -fpatchable-function-entry on x86, which
would increase the size of text, which could possibly have a performance
regression when tracing is disabled.

I'd have no problem if someone were to implement it, but there's a strict
requirement that it does not slow down the system when tracing is disabled.
As tracing is a second class citizen compared to the rest of the kernel.

-- Steve



Re: [PATCH v7 1/5] tracing/probes: support '%pd' type for print struct dentry's name

2024-03-21 Thread Steven Rostedt
On Wed, 20 Mar 2024 21:29:20 +0800
Ye Bin  wrote:

> Support print type '%pd' for print dentry's  name.
> 

The above is not a very detailed change log. A change log should state not
only what the change is doing, but also why.

Having examples of before and after would be useful in the change log.

> Signed-off-by: Ye Bin 
> ---
>  kernel/trace/trace.c|  2 +-
>  kernel/trace/trace_fprobe.c |  6 +
>  kernel/trace/trace_kprobe.c |  6 +
>  kernel/trace/trace_probe.c  | 50 +
>  kernel/trace/trace_probe.h  |  2 ++
>  5 files changed, 65 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index b12f8384a36a..ac26b8446337 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c


> +/* @buf: *buf must be equal to NULL. Caller must to free *buf */
> +int traceprobe_expand_dentry_args(int argc, const char *argv[], char **buf)
> +{
> + int i, used, ret;
> + const int bufsize = MAX_DENTRY_ARGS_LEN;
> + char *tmpbuf = NULL;
> +
> + if (*buf)
> + return -EINVAL;
> +
> + used = 0;
> + for (i = 0; i < argc; i++) {
> + if (glob_match("*:%pd", argv[i])) {
> + char *tmp;
> + char *equal;
> +
> + if (!tmpbuf) {
> + tmpbuf = kmalloc(bufsize, GFP_KERNEL);
> + if (!tmpbuf)
> + return -ENOMEM;
> + }
> +
> + tmp = kstrdup(argv[i], GFP_KERNEL);
> + if (!tmp)
> + goto nomem;
> +
> + equal = strchr(tmp, '=');
> + if (equal)
> + *equal = '\0';
> + tmp[strlen(argv[i]) - 4] = '\0';
> + ret = snprintf(tmpbuf + used, bufsize - used,
> +"%s%s+0x0(+0x%zx(%s)):string",
> +equal ? tmp : "", equal ? "=" : "",
> +offsetof(struct dentry, d_name.name),
> +equal ? equal + 1 : tmp);

What would be really useful is if we had a way to expose BTF here. Something 
like:

 "%pB::"

The "%pB" would mean to look up the struct/field offsets and types via BTF,
and create the appropriate command to find and print it.

That would be much more flexible and useful as it would allow reading any
field from any structure without having to use gdb.

-- Steve


> + kfree(tmp);
> + if (ret >= bufsize - used)
> + goto nomem;
> + argv[i] = tmpbuf + used;
> + used += ret + 1;
> + }
> + }
> +
> + *buf = tmpbuf;
> + return 0;
> +nomem:
> + kfree(tmpbuf);
> + return -ENOMEM;
> +}



Re: [PATCH v7 0/5] support '%pd' and '%pD' for print file name

2024-03-21 Thread Google
Hi Ye,

On Wed, 20 Mar 2024 21:29:19 +0800
Ye Bin  wrote:

> During fault locating, the file name needs to be printed based on the
> dentry/file address. The offset needs to be calculated each time, which
> is troublesome. Similar to printk, kprobe supports printing file names
> for dentry/file addresses.

Thanks for update! This series looks good to me.

Acked-by: Masami Hiramatsu (Google) 

Let me pick this and test. 

Thank you!

> 
> Diff v7 vs v6:
> 1. Squash [1/8] to [3/8] patches into 1 patch;
> 2. Split readme_msg[] into each patch;
> 
> Diff v6 vs v5:
> 1. Add const for 'bufsize' in PATCH [1];
> 2. Move PATCH 'tracing/probes: support '%pd/%pD' type for fprobe' after
> PATCH "tracing/probes: support '%pd' type for print struct dentry's name".
> 3. Add requires '"%pd/%pD":README' for testcase.
> 
> Diff v5 vs v4:
> 1. Use glob_match() instead of str_has_suffix(), so remove the first patch;
> 2. Allocate buffer from heap for expand dentry;
> 3. Support "%pd/%pD" print type for fprobe;
> 4. Use $arg1 instead of origin register in test case;
> 5. Add test case for fprobe;
> 
> Diff v4 vs v3:
> 1. Use "argv[i][idx + 3] == 'd'" instead of "argv[i][strlen(argv[i]) - 1] == 
> 'd'"
> to judge print format in PATCH[4/7];
> 
> Diff v3 vs v2:
> 1. Return the index of where the suffix was found in str_has_suffix();
> 
> Diff v2 vs v1:
> 1. Use "%pd/%pD" print format instead of "pd/pD" print format;
> 2. Add "%pd/%pD" in README;
> 3. Expand "%pd/%pD" argument before parameter parsing;
> 4. Add more detail information in ftrace documentation;
> 5. Add test cases for new print format in selftests/ftrace;
> 
> 
> Ye Bin (5):
>   tracing/probes: support '%pd' type for print struct dentry's name
>   tracing/probes: support '%pD' type for print struct file's name
>   Documentation: tracing: add new type '%pd' and '%pD' for kprobe
>   selftests/ftrace: add kprobe test cases for VFS type "%pd" and "%pD"
>   selftests/ftrace: add fprobe test cases for VFS type "%pd" and "%pD"
> 
>  Documentation/trace/kprobetrace.rst   |  8 ++-
>  kernel/trace/trace.c  |  2 +-
>  kernel/trace/trace_fprobe.c   |  6 ++
>  kernel/trace/trace_kprobe.c   |  6 ++
>  kernel/trace/trace_probe.c| 63 +++
>  kernel/trace/trace_probe.h|  2 +
>  .../ftrace/test.d/dynevent/fprobe_args_vfs.tc | 40 
>  .../ftrace/test.d/kprobe/kprobe_args_vfs.tc   | 40 
>  8 files changed, 164 insertions(+), 3 deletions(-)
>  create mode 100644 
> tools/testing/selftests/ftrace/test.d/dynevent/fprobe_args_vfs.tc
>  create mode 100644 
> tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_vfs.tc
> 
> -- 
> 2.31.1
> 
> 


-- 
Masami Hiramatsu (Google) 



Re: [PATCH v3] net/ipv4: add tracepoint for icmp_send

2024-03-21 Thread Steven Rostedt
On Thu, 21 Mar 2024 10:45:00 +0800
Jason Xing  wrote:

> The format of the whole patch looks strange... Did you send this patch
> by using 'git send-email' instead of pasting the text and sending?

Yeah, it's uuencoded.

Subject: 
=?UTF-8?B?wqBbUEFUQ0ggdjNdIG5ldC9pcHY0OiBhZGQgdHJhY2Vwb2ludCBmb3IgaWNtcF9zZW5k?=
Content-Type: multipart/mixed;
boundary="=_001_next="
X-MAIL:mse-fl2.zte.com.cn 42L2Ahm2097008
X-Fangmail-Gw-Spam-Type: 0
X-Fangmail-Anti-Spam-Filtered: true
X-Fangmail-MID-QID: 65FB975E.000/4V0TV60kJlz8XrRb



--=_001_next=
Content-Type: multipart/related;
boundary="=_002_next="


--=_002_next=
Content-Type: multipart/alternative;
boundary="=_003_next="


--=_003_next=
Content-Type: text/plain;
charset="UTF-8"
Content-Transfer-Encoding: base64

-- Steve



[GIT PULL] hwspinlock updates for v6.9

2024-03-21 Thread Bjorn Andersson


The following changes since commit 6613476e225e090cc9aad49be7fa504e290dd33d:

  Linux 6.8-rc1 (2024-01-21 14:11:32 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git 
tags/hwlock-v6.9

for you to fetch changes up to cebaa386d5ee1a44a58c12f1d220f62cc567fdb0:

  hwspinlock: omap: Use index to get hwspinlock pointer (2024-03-05 20:01:14 
-0800)


hwspinlock updates for v6.9

This provides some code cleanup for the OMAP hwspinlock driver.


Andrew Davis (4):
  hwspinlock: omap: Remove unneeded check for OF node
  hwspinlock: omap: Use devm_pm_runtime_enable() helper
  hwspinlock: omap: Use devm_hwspin_lock_register() helper
  hwspinlock: omap: Use index to get hwspinlock pointer

 drivers/hwspinlock/omap_hwspinlock.c | 57 +++-
 1 file changed, 10 insertions(+), 47 deletions(-)



[GIT PULL] rpmsg updates for v6.9

2024-03-21 Thread Bjorn Andersson


The following changes since commit 6613476e225e090cc9aad49be7fa504e290dd33d:

  Linux 6.8-rc1 (2024-01-21 14:11:32 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git 
tags/rpmsg-v6.9

for you to fetch changes up to b03aa6d4e9a74c4289929b6cf3c6bcc80270682d:

  rpmsg: core: Make rpmsg_bus const (2024-02-05 13:43:22 -0700)


rpmsg updates for v6.9

This transitions rpmsg_ctrl and rpmsg_char drivers away from the
deprecated ida_simple_*() API. It also makes the rpmsg_bus const.


Christophe JAILLET (1):
  rpmsg: Remove usage of the deprecated ida_simple_xx() API

Ricardo B. Marliere (1):
  rpmsg: core: Make rpmsg_bus const

 drivers/rpmsg/rpmsg_char.c | 12 ++--
 drivers/rpmsg/rpmsg_core.c |  2 +-
 drivers/rpmsg/rpmsg_ctrl.c | 12 ++--
 3 files changed, 13 insertions(+), 13 deletions(-)



[GIT PULL] remoteproc updates for v6.9

2024-03-21 Thread Bjorn Andersson
I'm sorry for the late pull request, I apparently had managed to get git
send-email to only deliver my mail to /dev/null on the machine where I prepared
this.

Regards,
Bjorn

The following changes since commit 6613476e225e090cc9aad49be7fa504e290dd33d:

  Linux 6.8-rc1 (2024-01-21 14:11:32 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux.git 
tags/rproc-v6.9

for you to fetch changes up to 62210f7509e13a2caa7b080722a45229b8f17a0a:

  remoteproc: qcom_q6v5_pas: Unload lite firmware on ADSP (2024-03-05 20:02:07 
-0800)


remoteproc updates for v6.9

Qualcomm SM8650 audio, compute and modem remoteproc are added. Qualcomm
X1 Elite audio and compute remoteprocs are added, after support for
shutting down the bootloader-loaded firmware loaded into the audio DSP..

A dozen drivers in the subsystem are transitioned to use devres helpers
for remoteproc and memory allocations.

It makes it possible to acquire in-kernel handle to individual
remoteproc instances in a cluster.

The release of DMA memory for remoteproc virtio is corrected to ensure
that restarting due to a watchdog bite doesn't attempt to allocate the
memory again without first freeing it.

Last, but not least, a couple of DeviceTree binding cleanups.


Abel Vesa (1):
  dt-bindings: remoteproc: qcom,sm8550-pas: document the X1E80100 aDSP & 
cDSP

Andrew Davis (17):
  remoteproc: k3-dsp: Use devm_rproc_alloc() helper
  remoteproc: k3-dsp: Add devm action to release reserved memory
  remoteproc: k3-dsp: Use devm_kcalloc() helper
  remoteproc: imx_dsp_rproc: Use devm_rproc_alloc() helper
  remoteproc: imx_rproc: Use devm_rproc_alloc() helper
  remoteproc: st: Use devm_rproc_alloc() helper
  remoteproc: stm32: Use devm_rproc_alloc() helper
  remoteproc: k3-dsp: Use devm_ti_sci_get_by_phandle() helper
  remoteproc: k3-dsp: Use devm_kzalloc() helper
  remoteproc: k3-dsp: Add devm action to release tsp
  remoteproc: k3-dsp: Use devm_ioremap_wc() helper
  remoteproc: k3-dsp: Use devm_rproc_add() helper
  remoteproc: qcom_q6v5_adsp: Use devm_rproc_alloc() helper
  remoteproc: qcom_q6v5_mss: Use devm_rproc_alloc() helper
  remoteproc: qcom_q6v5_pas: Use devm_rproc_alloc() helper
  remoteproc: qcom_q6v5_wcss: Use devm_rproc_alloc() helper
  remoteproc: qcom_wcnss: Use devm_rproc_alloc() helper

Arnaud Pouliquen (2):
  remoteproc: stm32: Fix incorrect type in assignment for va
  remoteproc: stm32: Fix incorrect type assignment returned by 
stm32_rproc_get_loaded_rsc_tablef

Dmitry Baryshkov (1):
  remoteproc: qcom: pas: correct data indentation

Joakim Zhang (1):
  remoteproc: virtio: Fix wdg cannot recovery remote processor

Krzysztof Kozlowski (2):
  dt-bindings: remoteproc: qcom,glink-rpm-edge: drop redundant type from 
label
  dt-bindings: remoteproc: do not override firmware-name $ref

Mathieu Poirier (1):
  remoteproc: Make rproc_get_by_phandle() work for clusters

Neil Armstrong (3):
  dt-bindings: remoteproc: qcom,sm8550-pas: document the SM8650 PAS
  remoteproc: qcom: pas: make region assign more generic
  remoteproc: qcom: pas: Add SM8650 remoteproc support

Sibi Sankar (2):
  remoteproc: qcom_q6v5_pas: Add support for X1E80100 ADSP/CDSP
  remoteproc: qcom_q6v5_pas: Unload lite firmware on ADSP

 .../devicetree/bindings/remoteproc/mtk,scp.yaml|   4 +-
 .../bindings/remoteproc/qcom,glink-rpm-edge.yaml   |   1 -
 .../bindings/remoteproc/qcom,qcs404-pas.yaml   |   2 +-
 .../bindings/remoteproc/qcom,sc7180-pas.yaml   |   2 +-
 .../bindings/remoteproc/qcom,sc7280-wpss-pil.yaml  |   2 +-
 .../bindings/remoteproc/qcom,sc8180x-pas.yaml  |   2 +-
 .../bindings/remoteproc/qcom,sm6115-pas.yaml   |   2 +-
 .../bindings/remoteproc/qcom,sm6350-pas.yaml   |   2 +-
 .../bindings/remoteproc/qcom,sm6375-pas.yaml   |   2 +-
 .../bindings/remoteproc/qcom,sm8150-pas.yaml   |   2 +-
 .../bindings/remoteproc/qcom,sm8350-pas.yaml   |   2 +-
 .../bindings/remoteproc/qcom,sm8550-pas.yaml   |  51 +++-
 .../bindings/remoteproc/qcom,wcnss-pil.yaml|   2 +-
 drivers/remoteproc/imx_dsp_rproc.c |  11 +-
 drivers/remoteproc/imx_rproc.c |  16 +-
 drivers/remoteproc/qcom_q6v5_adsp.c|  14 +-
 drivers/remoteproc/qcom_q6v5_mss.c |  28 +-
 drivers/remoteproc/qcom_q6v5_pas.c | 326 ++---
 drivers/remoteproc/qcom_q6v5_wcss.c|  24 +-
 drivers/remoteproc/qcom_wcnss.c|  17 +-
 drivers/remoteproc/remoteproc_core.c   |  29 +-
 drivers/remoteproc/remoteproc_virtio.c |   6 +-
 drivers/remoteproc/st_remoteproc.c |  15 +-
 drivers/remoteproc/stm32_rproc.c   |  

Re: [syzbot] [virtualization?] upstream boot error: WARNING: refcount bug in __free_pages_ok

2024-03-21 Thread Stefan Hajnoczi
On Wed, Mar 20, 2024 at 01:08:02PM -0700, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch and the reproducer did not trigger any 
> issue:
> 
> Reported-and-tested-by: syzbot+70f57d8a3ae84934c...@syzkaller.appspotmail.com
> 
> Tested on:
> 
> commit: 4bedfb31 mm,page_owner: maintain own list of stack_rec..
> git tree:   
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> kernel config:  https://syzkaller.appspot.com/x/.config?x=527195e149aa3091
> dashboard link: https://syzkaller.appspot.com/bug?extid=70f57d8a3ae84934c003
> compiler:   Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 
> 2.40
> 
> Note: no patches were applied.
> Note: testing is done by a robot and is best-effort only.
> 

Good, that was the expected last working commit. Here is the next commit
(it should fail):

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
217b2119b9e260609958db413876f211038f00ee


signature.asc
Description: PGP signature


Re: [PATCH] virtio_ring: Fix the stale index in available ring

2024-03-21 Thread Gavin Shan

On 3/21/24 03:15, Keir Fraser wrote:

On Wed, Mar 20, 2024 at 03:24:16PM +1000, Gavin Shan wrote:


Before this patch was posted, I had debugging code to record last 16 
transactions
to the available and used queue from guest and host side. It did reveal the 
wrong
head was fetched from the available queue.

[   11.785745]  virtqueue_get_buf_ctx_split 
[   11.786238] virtio_net virtio0: output.0:id 74 is not a head!
[   11.786655] head to be released: 036 077
[   11.786952]
[   11.786952] avail_idx:
[   11.787234] 000  63985  <--
[   11.787237] 001  63986
[   11.787444] 002  63987
[   11.787632] 003  63988
[   11.787821] 004  63989
[   11.788006] 005  63990
[   11.788194] 006  63991
[   11.788381] 007  63992
[   11.788567] 008  63993
[   11.788772] 009  63994
[   11.788957] 010  63995
[   11.789141] 011  63996
[   11.789327] 012  63997
[   11.789515] 013  63998
[   11.789701] 014  63999
[   11.789886] 015  64000


Does the error always occur at such a round idx value?

Here, 64000 == 0xFA00. Maybe coincidence but it's improbable enough to be 
interesting.

This debug code seems rather useful!



Keir, Nope, it's just coincidence. We don't have such kind of pattern.

Thanks,
Gavin





[   11.790068]
[   11.790068] avail_head:
[   11.790529] 000  075  <--
[   11.790718] 001  036
[   11.790890] 002  077
[   11.791061] 003  129
[   11.791231] 004  072
[   11.791400] 005  130
[   11.791574] 006  015
[   11.791748] 007  074
[   11.791918] 008  130
[   11.792094] 009  130
[   11.792263] 010  074
[   11.792437] 011  015
[   11.792617] 012  072
[   11.792788] 013  129
[   11.792961] 014  077// The last two heads from guest to host: 077, 036
[   11.793134] 015  036

[root@nvidia-grace-hopper-05 qemu.main]# cat /proc/vhost

avail_idx
000  63998
001  64000
002  63954  <---
003  63955
004  63956
005  63974
006  63981
007  63984
008  63986
009  63987
010  63988
011  63989
012  63992
013  63993
014  63995
015  63997

avail_head
000  074
001  015
002  072
003  129
004  074// The last two heads seen by vhost is: 074, 036
005  036
006  075  <---
007  036
008  077
009  129
010  072
011  130
012  015
013  074
014  130
015  130

used_idx
000  64000
001  63882  <---
002  63889
003  63891
004  63898
005  63936
006  63942
007  63946
008  63949
009  63953
010  63957
011  63981
012  63990
013  63992
014  63993
015  63999

used_head
000  072
001  129
002  074  // The last two heads published to guest is: 074, 036
003  036
004  075  <---
005  036
006  077
007  129
008  072
009  130
010  015
011  074
012  130
013  130
014  074
015  015

Thanks,
Gavin











[syzbot] [trace?] [bpf?] KASAN: slab-use-after-free Read in bpf_trace_run4

2024-03-21 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:520fad2e3206 selftests/bpf: scale benchmark counting by us..
git tree:   bpf-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=121c067918
kernel config:  https://syzkaller.appspot.com/x/.config?x=6fb1be60a193d440
dashboard link: https://syzkaller.appspot.com/bug?extid=62d8b26793e8a2bd0516
compiler:   Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 
2.40
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=13dc423118
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1705d18518

Downloadable assets:
disk image: 
https://storage.googleapis.com/syzbot-assets/4eef3506c5ce/disk-520fad2e.raw.xz
vmlinux: 
https://storage.googleapis.com/syzbot-assets/24d60ebe76cc/vmlinux-520fad2e.xz
kernel image: 
https://storage.googleapis.com/syzbot-assets/8f883e706550/bzImage-520fad2e.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+62d8b26793e8a2bd0...@syzkaller.appspotmail.com

==
BUG: KASAN: slab-use-after-free in __bpf_trace_run 
kernel/trace/bpf_trace.c:2376 [inline]
BUG: KASAN: slab-use-after-free in bpf_trace_run4+0x143/0x580 
kernel/trace/bpf_trace.c:2433
Read of size 8 at addr 8880238ba918 by task sshd/5076

CPU: 1 PID: 5076 Comm: sshd Not tainted 6.8.0-syzkaller-05233-g520fad2e3206 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
02/29/2024
Call Trace:
 
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1e7/0x2e0 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 __bpf_trace_run kernel/trace/bpf_trace.c:2376 [inline]
 bpf_trace_run4+0x143/0x580 kernel/trace/bpf_trace.c:2433
 __traceiter_mm_page_alloc+0x3a/0x60 include/trace/events/kmem.h:177
 trace_mm_page_alloc include/trace/events/kmem.h:177 [inline]
 __alloc_pages+0x657/0x680 mm/page_alloc.c:4591
 __alloc_pages_node include/linux/gfp.h:238 [inline]
 alloc_pages_node include/linux/gfp.h:261 [inline]
 alloc_slab_page+0x5f/0x160 mm/slub.c:2190
 allocate_slab mm/slub.c:2354 [inline]
 new_slab+0x84/0x2f0 mm/slub.c:2407
 ___slab_alloc+0xd1b/0x13e0 mm/slub.c:3540
 __kmem_cache_alloc_bulk mm/slub.c:4574 [inline]
 kmem_cache_alloc_bulk+0x22e/0x790 mm/slub.c:4648
 napi_skb_cache_get+0x166/0x230 net/core/skbuff.c:348
 __napi_build_skb net/core/skbuff.c:527 [inline]
 __napi_alloc_skb+0x217/0x540 net/core/skbuff.c:846
 napi_alloc_skb include/linux/skbuff.h:3363 [inline]
 page_to_skb+0x275/0x9b0 drivers/net/virtio_net.c:569
 receive_mergeable drivers/net/virtio_net.c:1683 [inline]
 receive_buf+0x3b3/0x3890 drivers/net/virtio_net.c:1804
 virtnet_receive drivers/net/virtio_net.c:2110 [inline]
 virtnet_poll+0x720/0x18f0 drivers/net/virtio_net.c:2203
 __napi_poll+0xcb/0x490 net/core/dev.c:6632
 napi_poll net/core/dev.c:6701 [inline]
 net_rx_action+0x7bb/0x1090 net/core/dev.c:6813
 __do_softirq+0x2bc/0x943 kernel/softirq.c:554
 invoke_softirq kernel/softirq.c:428 [inline]
 __irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633
 irq_exit_rcu+0x9/0x30 kernel/softirq.c:645
 common_interrupt+0xaa/0xd0 arch/x86/kernel/irq.c:247
 
 
 asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
RIP: 0010:kasan_check_range+0x1b7/0x290 mm/kasan/generic.c:189
Code: f5 4d 01 fb 48 8d 5d 07 48 85 ed 48 0f 49 dd 48 83 e3 f8 48 29 dd 74 12 
41 80 3b 00 0f 85 a6 00 00 00 49 ff c3 48 ff cd 75 ee <5b> 41 5c 41 5e 41 5f 5d 
c3 cc cc cc cc 40 84 ed 75 5f f7 c5 00 ff
RSP: 0018:c900039ff950 EFLAGS: 0256
RAX: 845bbe01 RBX: 19200073ff40 RCX: 845bbe35
RDX: 0001 RSI: 0030 RDI: c900039ffa00
RBP:  R08: c900039ffa2f R09: 19200073ff45
R10: dc00 R11: f5200073ff46 R12: 19200073ff3c
R13: dc00 R14: dc01 R15: f5200073ff46
 __asan_memset+0x23/0x50 mm/kasan/shadow.c:84
 tomoyo_socket_sendmsg_permission+0x95/0x420 security/tomoyo/network.c:761
 security_socket_sendmsg+0x75/0xb0 security/security.c:4501
 __sock_sendmsg+0x49/0x270 net/socket.c:742
 sock_write_iter+0x2dd/0x400 net/socket.c:1160
 call_write_iter include/linux/fs.h:2108 [inline]
 new_sync_write fs/read_write.c:497 [inline]
 vfs_write+0xa84/0xcb0 fs/read_write.c:590
 ksys_write+0x1a0/0x2c0 fs/read_write.c:643
 do_syscall_64+0xfb/0x240
 entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7f2e91716bf2
Code: 89 c7 48 89 44 24 08 e8 7b 34 fa ff 48 8b 44 24 08 48 83 c4 28 c3 c3 64 
8b 04 25 18 00 00 00 85 c0 75 20 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 76 6f 
48 8b 15 07 a2 0d 00 f7 d8 64 89 02 48 83
RSP: 002b:7ffee57321e8 EFLAGS: 0246 ORIG_RAX: 0001
RAX: ffda RBX: 00f4 RCX: 7f2e91716bf2
RDX: 00f4 RSI: 55731dc092b0 RDI: 0004
RBP: 55731dbfe220 R08: 

[syzbot] [bpf?] [trace?] KASAN: slab-use-after-free Read in bpf_trace_run2

2024-03-21 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:520fad2e3206 selftests/bpf: scale benchmark counting by us..
git tree:   bpf-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=11b967b918
kernel config:  https://syzkaller.appspot.com/x/.config?x=6fb1be60a193d440
dashboard link: https://syzkaller.appspot.com/bug?extid=2cb5a6c573e98db598cc
compiler:   Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 
2.40
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1257dd8518
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14b55c6e18

Downloadable assets:
disk image: 
https://storage.googleapis.com/syzbot-assets/4eef3506c5ce/disk-520fad2e.raw.xz
vmlinux: 
https://storage.googleapis.com/syzbot-assets/24d60ebe76cc/vmlinux-520fad2e.xz
kernel image: 
https://storage.googleapis.com/syzbot-assets/8f883e706550/bzImage-520fad2e.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2cb5a6c573e98db59...@syzkaller.appspotmail.com

==
BUG: KASAN: slab-use-after-free in __bpf_trace_run 
kernel/trace/bpf_trace.c:2376 [inline]
BUG: KASAN: slab-use-after-free in bpf_trace_run2+0xfa/0x530 
kernel/trace/bpf_trace.c:2431
Read of size 8 at addr 88802aaea218 by task syz-executor147/10463

CPU: 0 PID: 10463 Comm: syz-executor147 Not tainted 
6.8.0-syzkaller-05233-g520fad2e3206 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
02/29/2024
Call Trace:
 
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1e7/0x2e0 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 __bpf_trace_run kernel/trace/bpf_trace.c:2376 [inline]
 bpf_trace_run2+0xfa/0x530 kernel/trace/bpf_trace.c:2431
 __traceiter_kfree+0x2b/0x50 include/trace/events/kmem.h:94
 trace_kfree include/trace/events/kmem.h:94 [inline]
 kfree+0x291/0x380 mm/slub.c:4396
 tomoyo_realpath_from_path+0xc2/0x5e0 security/tomoyo/realpath.c:250
 tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
 tomoyo_check_open_permission+0x255/0x500 security/tomoyo/file.c:771
 security_file_open+0x69/0x570 security/security.c:2933
 do_dentry_open+0x327/0x15a0 fs/open.c:943
 do_open fs/namei.c:3643 [inline]
 path_openat+0x2860/0x3240 fs/namei.c:3800
 do_filp_open+0x235/0x490 fs/namei.c:3827
 do_sys_openat2+0x13e/0x1d0 fs/open.c:1407
 do_sys_open fs/open.c:1422 [inline]
 __do_sys_openat fs/open.c:1438 [inline]
 __se_sys_openat fs/open.c:1433 [inline]
 __x64_sys_openat+0x247/0x2a0 fs/open.c:1433
 do_syscall_64+0xfb/0x240
 entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7f2308d06f51
Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d 3a 91 07 00 00 74 
6d 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 
93 00 00 00 48 8b 54 24 28 64 48 2b 14 25
RSP: 002b:7fffb8a33850 EFLAGS: 0202 ORIG_RAX: 0101
RAX: ffda RBX: 00080001 RCX: 7f2308d06f51
RDX: 00080001 RSI: 7f2308d51022 RDI: ff9c
RBP: 7f2308d51022 R08:  R09: 
R10:  R11: 0202 R12: 7fffb8a338f0
R13: 7fffb8a33dcc R14: 7fffb8a33de0 R15: 7fffb8a33dd0
 

Allocated by task 10462:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
 __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
 kasan_kmalloc include/linux/kasan.h:211 [inline]
 kmalloc_trace+0x1d9/0x360 mm/slub.c:4012
 kmalloc include/linux/slab.h:590 [inline]
 kzalloc include/linux/slab.h:711 [inline]
 bpf_raw_tp_link_attach+0x2a0/0x6e0 kernel/bpf/syscall.c:3816
 bpf_raw_tracepoint_open+0x1c2/0x240 kernel/bpf/syscall.c:3863
 __sys_bpf+0x3c0/0x810 kernel/bpf/syscall.c:5673
 __do_sys_bpf kernel/bpf/syscall.c:5738 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5736 [inline]
 __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5736
 do_syscall_64+0xfb/0x240
 entry_SYSCALL_64_after_hwframe+0x6d/0x75

Freed by task 10462:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:589
 poison_slab_object+0xa6/0xe0 mm/kasan/common.c:240
 __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
 kasan_slab_free include/linux/kasan.h:184 [inline]
 slab_free_hook mm/slub.c:2121 [inline]
 slab_free mm/slub.c:4299 [inline]
 kfree+0x14a/0x380 mm/slub.c:4409
 bpf_link_release+0x3b/0x50 kernel/bpf/syscall.c:3071
 __fput+0x429/0x8a0 fs/file_table.c:423
 task_work_run+0x24f/0x310 kernel/task_work.c:180
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0xa1b/0x27e0 kernel/exit.c:878
 do_group_exit+0x207/0x2c0 kernel/exit.c:1027
 __do_sys_exit_group kernel/exit.c:1038 [inline]
 __se_sys_exit_group kernel/exit.c:1036 

Re: [RFC PATCH] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS

2024-03-21 Thread Björn Töpel
Mark,

Mark Rutland  writes:

>> A) Use auipc/jalr, only patch jalr to take us to a common
>>dispatcher/trampoline
>>   
>>  |  # probably on a data cache-line != func .text 
>> to avoid ping-pong
>>  | ...
>>  | func:
>>  |   ...make sure ra isn't messed up...
>>  |   aupic
>>  |   nop <=> jalr # Text patch point -> common_dispatch
>>  |   ACTUAL_FUNC
>>  | 
>>  | common_dispatch:
>>  |   load  based on ra
>>  |   jalr
>>  |   ...
>> 
>> The auipc is never touched, and will be overhead. Also, we need a mv to
>> store ra in a scratch register as well -- like Arm. We'll have two insn
>> per-caller overhead for a disabled caller.
>
> Is the AUIPC a significant overhead? IIUC that's similar to Arm's ADRP, and 
> I'd
> have expected that to be pretty cheap.

No, reg-to-reg moves are dirt cheap in my book.

> IIUC your JALR can choose which destination register to store the return
> address in, and if so, you could leave the original ra untouched (and recover
> that in the common trampoline). Have I misunderstood that?
>
> Maybe that doesn't play nicely with something else?

No, you're right, we can link to another register, and shave off an
instruction. I can imagine that some implementation prefer x1/x5 for
branch prediction reasons, but that's something that we can measure on. 

So, 1-2 movs + nop are unconditionally executed on the disabled case.
(1-2 depending on the ra save/jalr reg strategy).

>> B) Use jal, which can only take us +/-1M, and requires multiple
>>dispatchers (and tracking which one to use, and properly distribute
>>them. Ick.)
>> 
>>  |  # probably on a data cache-line != func .text 
>> to avoid ping-pong
>>  | ...
>>  | func:
>>  |   ...make sure ra isn't messed up...
>>  |   nop <=> jal # Text patch point -> within_1M_to_func_dispatch
>>  |   ACTUAL_FUNC
>>  | 
>>  | within_1M_to_func_dispatch:
>>  |   load  based on ra
>>  |   jalr
>> 
>> C) Use jal, which can only take us +/-1M, and use a per-function
>>trampoline requires multiple dispatchers (and tracking which one to
>>use). Blows up text size A LOT.
>> 
>>  |  # somewhere, but probably on a different 
>> cacheline than the .text to avoid ping-ongs
>>  | ...
>>  | per_func_dispatch
>>  |   load  based on ra
>>  |   jalr
>>  | func:
>>  |   ...make sure ra isn't messed up...
>>  |   nop <=> jal # Text patch point -> per_func_dispatch
>>  |   ACTUAL_FUNC
>
> Beware that with option (C) you'll need to handle that in your unwinder for
> RELIABLE_STACKTRACE. If you don't have a symbol for per_func_dispatch (or
> func_trace_target_data_8B), PC values within per_func_dispatch would be
> symbolized as the prior function/data.

Good point (but I don't like C much...)!

>> It's a bit sad that we'll always have to have a dispatcher/trampoline,
>> but it's still better than stop_machine(). (And we'll need a fencei IPI
>> as well, but only one. ;-))
>> 
>> Today, I'm leaning towards A (which is what Mark suggested, and also
>> Robbin).. Any other options?
>
> Assuming my understanding of JALR above is correct, I reckon A is the nicest
> option out of A/B/C.

Yes! +1!


Björn



Re: [RFC PATCH] riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS

2024-03-21 Thread Björn Töpel
Andy,

Pulling out the A option:

>> > A) Use auipc/jalr, only patch jalr to take us to a common
>> >dispatcher/trampoline
>> >
>> >  |  # probably on a data cache-line != func 
>> > .text to avoid ping-pong
>> >  | ...
>> >  | func:
>> >  |   ...make sure ra isn't messed up...
>> >  |   aupic
>> >  |   nop <=> jalr # Text patch point -> common_dispatch
>> >  |   ACTUAL_FUNC
>> >  |
>> >  | common_dispatch:
>> >  |   load  based on ra
>> >  |   jalr
>> >  |   ...
>> >
>> > The auipc is never touched, and will be overhead. Also, we need a mv to
>> > store ra in a scratch register as well -- like Arm. We'll have two insn
>> > per-caller overhead for a disabled caller.
>
> My patch series takes a similar "in-function dispatch" approach. A
> difference is that the  is
> embedded within each function entry. I'd like to have it moved to a
> run-time allocated array to reduce total text size.

This is what arm64 has as well. It's a 8B + 1-2 dirt cheap movish like
instructions (save ra, prepare jump with auipc). I think that's a
reasonable overhead.

> Another difference is that my series changes the first instruction to
> "j ACTUAL_FUNC" for the "ftrace disable" case. As long as the
> architecture guarantees the atomicity of the first instruction, then
> we are safe. For example, we are safe if the first instruction could
> only be "mv tmp, ra" or "j ACTUAL_FUNC". And since the loaded address is
> always valid, we can fix "mv + jalr" down so we don't have to
> play with the exception handler trick. The guarantee from arch would
> require ziccif (in RVA22) though, but I think it is the same for us
> (unless with stop_machine). For ziccif, I would rather call that out
> during boot than blindly assume.

I'm maybe biased, but I'd prefer the A) over your version with the
unconditional jump. A) has the overhead of two, I'd say, free
instructions (again "Meten is Weten!" ;-)).

> However, one thing I am not very sure is: do we need a destination
> address in a "per-function" manner? It seems like most of the time the
> destination address can only be ftrace_call, or ftrace_regs_call. If
> the number of destination addresses is very few, then we could
> potentially reduce the size of
> .

Yes, we do need a per-function manner. BPF, e.g., uses
dynamically/JIT:ed trampolines/targets.



Björn



Re: [PATCH v3 resend] net/ipv4: add tracepoint for icmp_send

2024-03-21 Thread Eric Dumazet
On Thu, Mar 21, 2024 at 4:09 AM  wrote:
>
> From: he peilin 
>
> Introduce a tracepoint for icmp_send, which can help users to get more
> detail information conveniently when icmp abnormal events happen.
>
> 1. Giving an usecase example:
> =
> When an application experiences packet loss due to an unreachable UDP
> destination port, the kernel will send an exception message through the
> icmp_send function. By adding a trace point for icmp_send, developers or
> system administrators can obtain detailed information about the UDP
> packet loss, including the type, code, source address, destination address,
> source port, and destination port. This facilitates the trouble-shooting
> of UDP packet loss issues especially for those network-service
> applications.
>
> 2. Operation Instructions:
> ==
> Switch to the tracing directory.
> cd /sys/kernel/tracing
> Filter for destination port unreachable.
> echo "type==3 && code==3" > events/icmp/icmp_send/filter
> Enable trace event.
> echo 1 > events/icmp/icmp_send/enable
>
> 3. Result View:
> 
>  udp_client_erro-11370   [002] ...s.12   124.728002:
>  icmp_send: icmp_send: type=3, code=3.
>  From 127.0.0.1:41895 to 127.0.0.1: ulen=23
>  skbaddr=589b167a
>
> Changelog
> -
> v2->v3:
> Some fixes according to
> https://lore.kernel.org/all/20240319102549.7f7f6...@gandalf.local.home/
> 1. Change the tracking directory to/sys/kernel/tracking.
> 2. Adjust the layout of the TP-STRUCT_entry parameter structure.
>
> v1->v2:
> Some fixes according to
> https://lore.kernel.org/all/CANn89iL-y9e_VFpdw=sztrnkru_tnuwqhufqtjvjsv-nz1x...@mail.gmail.com/
> 1. adjust the trace_icmp_send() to more protocols than UDP.
> 2. move the calling of trace_icmp_send after sanity checks
> in __icmp_send().
>
> Signed-off-by: Peilin He
> Reviewed-by: xu xin 
> Reviewed-by: Yunkai Zhang 
> Cc: Yang Yang 
> Cc: Liu Chun 
> Cc: Xuexin Jiang 
> ---
>  include/trace/events/icmp.h | 64 +
>  net/ipv4/icmp.c |  4 +++
>  2 files changed, 68 insertions(+)
>  create mode 100644 include/trace/events/icmp.h
>
> diff --git a/include/trace/events/icmp.h b/include/trace/events/icmp.h
> new file mode 100644
> index ..2098d4b1b12e
> --- /dev/null
> +++ b/include/trace/events/icmp.h
> @@ -0,0 +1,64 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM icmp
> +
> +#if !defined(_TRACE_ICMP_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_ICMP_H
> +
> +#include 
> +#include 
> +
> +TRACE_EVENT(icmp_send,
> +
> +   TP_PROTO(const struct sk_buff *skb, int type, int code),
> +
> +   TP_ARGS(skb, type, code),
> +
> +   TP_STRUCT__entry(
> +   __field(const void *, skbaddr)
> +   __field(int, type)
> +   __field(int, code)
> +   __array(__u8, saddr, 4)
> +   __array(__u8, daddr, 4)
> +   __field(__u16, sport)
> +   __field(__u16, dport)
> +   __field(unsigned short, ulen)
> +   ),
> +
> +   TP_fast_assign(
> +   struct iphdr *iph = ip_hdr(skb);
> +   int proto_4 = iph->protocol;
> +   __be32 *p32;
> +
> +   __entry->skbaddr = skb;
> +   __entry->type = type;
> +   __entry->code = code;
> +
> +   if (proto_4 == IPPROTO_UDP) {
> +   struct udphdr *uh = udp_hdr(skb);
> +   __entry->sport = ntohs(uh->source);
> +   __entry->dport = ntohs(uh->dest);
> +   __entry->ulen = ntohs(uh->len);

This is completely bogus.

Adding tracepoints is ok if there are no side effects like bugs :/

At this point there is no guarantee the UDP header is complete/present
in skb->head

Look at the existing checks between lines 619 and 623

Then audit all icmp_send() callers, and ask yourself if UDP packets
can not be malicious (like with a truncated UDP header)



[PATCH 2/2] remoteproc: mediatek: Don't parse extraneous subnodes for multi-core

2024-03-21 Thread AngeloGioacchino Del Regno
When probing multi-core SCP, this driver is parsing all sub-nodes of
the scp-cluster node, but one of those could be not an actual SCP core
and that would make the entire SCP cluster to fail probing for no good
reason.

To fix that, in scp_add_multi_core() treat a subnode as a SCP Core by
parsing only available subnodes having compatible "mediatek,scp-core".

Fixes: 1fdbf0cdde98 ("remoteproc: mediatek: Probe SCP cluster on multi-core 
SCP")
Signed-off-by: AngeloGioacchino Del Regno 

---
 drivers/remoteproc/mtk_scp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
index 67518291a8ad..fbe1c232dae7 100644
--- a/drivers/remoteproc/mtk_scp.c
+++ b/drivers/remoteproc/mtk_scp.c
@@ -1096,6 +1096,9 @@ static int scp_add_multi_core(struct platform_device 
*pdev,
cluster_of_data = (const struct mtk_scp_of_data 
**)of_device_get_match_data(dev);
 
for_each_available_child_of_node(np, child) {
+   if (!of_device_is_compatible(child, "mediatek,scp-core"))
+   continue;
+
if (!cluster_of_data[core_id]) {
ret = -EINVAL;
dev_err(dev, "Not support core %d\n", core_id);
-- 
2.44.0




[PATCH 1/2] remoteproc: mediatek: Make sure IPI buffer fits in L2TCM

2024-03-21 Thread AngeloGioacchino Del Regno
The IPI buffer location is read from the firmware that we load to the
System Companion Processor, and it's not granted that both the SRAM
(L2TCM) size that is defined in the devicetree node is large enough
for that, and while this is especially true for multi-core SCP, it's
still useful to check on single-core variants as well.

Failing to perform this check may make this driver perform R/W
oeprations out of the L2TCM boundary, resulting (at best) in a
kernel panic.

To fix that, check that the IPI buffer fits, otherwise return a
failure and refuse to boot the relevant SCP core (or the SCP at
all, if this is single core).

Fixes: 3efa0ea743b7 ("remoteproc/mediatek: read IPI buffer offset from FW")
Signed-off-by: AngeloGioacchino Del Regno 

---
 drivers/remoteproc/mtk_scp.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
index a35409eda0cf..67518291a8ad 100644
--- a/drivers/remoteproc/mtk_scp.c
+++ b/drivers/remoteproc/mtk_scp.c
@@ -132,7 +132,7 @@ static int scp_elf_read_ipi_buf_addr(struct mtk_scp *scp,
 static int scp_ipi_init(struct mtk_scp *scp, const struct firmware *fw)
 {
int ret;
-   size_t offset;
+   size_t buf_sz, offset;
 
/* read the ipi buf addr from FW itself first */
ret = scp_elf_read_ipi_buf_addr(scp, fw, );
@@ -144,6 +144,14 @@ static int scp_ipi_init(struct mtk_scp *scp, const struct 
firmware *fw)
}
dev_info(scp->dev, "IPI buf addr %#010zx\n", offset);
 
+   /* Make sure IPI buffer fits in the L2TCM range assigned to this core */
+   buf_sz = sizeof(*scp->recv_buf) + sizeof(*scp->send_buf);
+
+   if (scp->sram_size < buf_sz + offset) {
+   dev_err(scp->dev, "IPI buffer does not fit in SRAM.\n");
+   return -EOVERFLOW;
+   }
+
scp->recv_buf = (struct mtk_share_obj __iomem *)
(scp->sram_base + offset);
scp->send_buf = (struct mtk_share_obj __iomem *)
-- 
2.44.0




[PATCH 0/2] MediaTek SCP: Urgent fixes for all MTK SoCs

2024-03-21 Thread AngeloGioacchino Del Regno
This series brings some missing validation for the IPI buffer size
that is read from the firmware retrieved from userspace: if the FW
declares IPI buffer offset starting at an out of range address, the
driver doesn't do any validation and naively goes on with IO R/W
operation.

That poses various risks which I believe I really don't need to
describe, leaving it to the reader's imagination :-)

Please note that the first fix is URGENT.

P.S.: Of course, this was tested OK on multiple MTK platforms.

AngeloGioacchino Del Regno (2):
  remoteproc: mediatek: Make sure IPI buffer fits in L2TCM
  remoteproc: mediatek: Don't parse extraneous subnodes for multi-core

 drivers/remoteproc/mtk_scp.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

-- 
2.44.0




Re: [PATCH 2/3] dt-bindings: remoteproc: add Versal-NET platform

2024-03-21 Thread Krzysztof Kozlowski
On 20/03/2024 16:14, Tanmay Shah wrote:
> 
> 
> On 3/20/24 2:40 AM, Krzysztof Kozlowski wrote:
>> On 19/03/2024 15:42, Tanmay Shah wrote:
>>>
>>>
>>> On 3/19/24 12:30 AM, Krzysztof Kozlowski wrote:
 On 19/03/2024 01:51, Tanmay Shah wrote:
> Hello Krzysztof,
>
> Thanks for reviews. Please find my comments below.
>
> On 3/17/24 1:53 PM, Krzysztof Kozlowski wrote:
>> On 15/03/2024 22:15, Tanmay Shah wrote:
>>> AMD-Xilinx Versal-NET platform is successor of Versal platform. It
>>> contains multiple clusters of cortex-R52 real-time processing units.
>>> Each cluster contains two cores of cortex-R52 processors. Each cluster
>>> can be configured in lockstep mode or split mode.
>>>
>>> Each R52 core is assigned 128KB of TCM memory. ATCM memory is 64KB, BTCM
>>> and CTCM memoreis are 32KB each. Each TCM memory has its own dedicated
>>> power-domain that needs to be requested before using it.
>>>
>>> Signed-off-by: Tanmay Shah 
>>> ---
>>>  .../remoteproc/xlnx,zynqmp-r5fss.yaml | 220 +++---
>>>  1 file changed, 184 insertions(+), 36 deletions(-)
>>>
>>> diff --git 
>>> a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml 
>>> b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
>>> index 711da0272250..55654ee02eef 100644
>>> --- 
>>> a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
>>> +++ 
>>> b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
>>> @@ -18,7 +18,9 @@ description: |
>>>  
>>>  properties:
>>>compatible:
>>> -const: xlnx,zynqmp-r5fss
>>> +enum:
>>> +  - xlnx,zynqmp-r5fss
>>> +  - xlnx,versal-net-r52fss
>>>  
>>>"#address-cells":
>>>  const: 2
>>> @@ -64,7 +66,9 @@ patternProperties:
>>>  
>>>  properties:
>>>compatible:
>>> -const: xlnx,zynqmp-r5f
>>> +enum:
>>> +  - xlnx,zynqmp-r5f
>>> +  - xlnx,versal-net-r52f
>>>  
>>>reg:
>>>  minItems: 1
>>> @@ -135,9 +139,11 @@ required:
>>>  allOf:
>>>- if:
>>>properties:
>>> -xlnx,cluster-mode:
>>> -  enum:
>>> -- 1
>>> +compatible:
>>> +  contains:
>>> +enum:
>>> +  - xlnx,versal-net-r52fss
>>
>> Why do you touch these lines?
>>
>>> +
>>>  then:
>>>patternProperties:
>>>  "^r5f@[0-9a-f]+$":
>>> @@ -149,16 +155,14 @@ allOf:
>>>items:
>>>  - description: ATCM internal memory
>>>  - description: BTCM internal memory
>>> -- description: extra ATCM memory in lockstep mode
>>> -- description: extra BTCM memory in lockstep mode
>>> +- description: CTCM internal memory
>>>  
>>>  reg-names:
>>>minItems: 1
>>>items:
>>> -- const: atcm0
>>> -- const: btcm0
>>> -- const: atcm1
>>> -- const: btcm1
>>> +- const: atcm
>>> +- const: btcm
>>> +- const: ctcm
>>>  
>>>  power-domains:
>>>minItems: 2
>>> @@ -166,33 +170,70 @@ allOf:
>>>  - description: RPU core power domain
>>>  - description: ATCM power domain
>>>  - description: BTCM power domain
>>> -- description: second ATCM power domain
>>> -- description: second BTCM power domain
>>> +- description: CTCM power domain
>>>  
>>>  else:
>>> -  patternProperties:
>>> -"^r5f@[0-9a-f]+$":
>>> -  type: object
>>> -
>>> -  properties:
>>> -reg:
>>> -  minItems: 1
>>> -  items:
>>> -- description: ATCM internal memory
>>> -- description: BTCM internal memory
>>> -
>>> -reg-names:
>>> -  minItems: 1
>>> -  items:
>>> -- const: atcm0
>>> -- const: btcm0
>>> -
>>> -power-domains:
>>> -  minItems: 2
>>> -  items:
>>> -- description: RPU core power domain
>>> -- description: ATCM power domain
>>> -- description: BTCM power domain
>>> +  allOf:
>>> +- if:
>>> +properties:
>>> +  xlnx,cluster-mode:
>>> +enum:
>>> +  - 1
>>
>> Whatever you did here, is not really 

Re: [PATCH v3] vhost/vdpa: Add MSI translation tables to iommu for software-managed MSI

2024-03-21 Thread Michael S. Tsirkin
On Wed, Mar 20, 2024 at 06:19:12PM +0800, Wang Rong wrote:
> From: Rong Wang 
> 
> Once enable iommu domain for one device, the MSI
> translation tables have to be there for software-managed MSI.
> Otherwise, platform with software-managed MSI without an
> irq bypass function, can not get a correct memory write event
> from pcie, will not get irqs.
> The solution is to obtain the MSI phy base address from
> iommu reserved region, and set it to iommu MSI cookie,
> then translation tables will be created while request irq.
> 
> Change log
> --
> 
> v1->v2:
> - add resv iotlb to avoid overlap mapping.
> v2->v3:
> - there is no need to export the iommu symbol anymore.
> 
> Signed-off-by: Rong Wang 

There's in interest to keep extending vhost iotlb -
we should just switch over to iommufd which supports
this already.

> ---
>  drivers/vhost/vdpa.c | 59 +---
>  1 file changed, 56 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index ba52d128aeb7..28b56b10372b 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -49,6 +49,7 @@ struct vhost_vdpa {
>   struct completion completion;
>   struct vdpa_device *vdpa;
>   struct hlist_head as[VHOST_VDPA_IOTLB_BUCKETS];
> + struct vhost_iotlb resv_iotlb;
>   struct device dev;
>   struct cdev cdev;
>   atomic_t opened;
> @@ -247,6 +248,7 @@ static int _compat_vdpa_reset(struct vhost_vdpa *v)
>  static int vhost_vdpa_reset(struct vhost_vdpa *v)
>  {
>   v->in_batch = 0;
> + vhost_iotlb_reset(>resv_iotlb);
>   return _compat_vdpa_reset(v);
>  }
>  
> @@ -1219,10 +1221,15 @@ static int vhost_vdpa_process_iotlb_update(struct 
> vhost_vdpa *v,
>   msg->iova + msg->size - 1 > v->range.last)
>   return -EINVAL;
>  
> + if (vhost_iotlb_itree_first(>resv_iotlb, msg->iova,
> + msg->iova + msg->size - 1))
> + return -EINVAL;
> +
>   if (vhost_iotlb_itree_first(iotlb, msg->iova,
>   msg->iova + msg->size - 1))
>   return -EEXIST;
>  
> +
>   if (vdpa->use_va)
>   return vhost_vdpa_va_map(v, iotlb, msg->iova, msg->size,
>msg->uaddr, msg->perm);
> @@ -1307,6 +1314,45 @@ static ssize_t vhost_vdpa_chr_write_iter(struct kiocb 
> *iocb,
>   return vhost_chr_write_iter(dev, from);
>  }
>  
> +static int vhost_vdpa_resv_iommu_region(struct iommu_domain *domain, struct 
> device *dma_dev,
> + struct vhost_iotlb *resv_iotlb)
> +{
> + struct list_head dev_resv_regions;
> + phys_addr_t resv_msi_base = 0;
> + struct iommu_resv_region *region;
> + int ret = 0;
> + bool with_sw_msi = false;
> + bool with_hw_msi = false;
> +
> + INIT_LIST_HEAD(_resv_regions);
> + iommu_get_resv_regions(dma_dev, _resv_regions);
> +
> + list_for_each_entry(region, _resv_regions, list) {
> + ret = vhost_iotlb_add_range_ctx(resv_iotlb, region->start,
> + region->start + region->length - 1,
> + 0, 0, NULL);
> + if (ret) {
> + vhost_iotlb_reset(resv_iotlb);
> + break;
> + }
> +
> + if (region->type == IOMMU_RESV_MSI)
> + with_hw_msi = true;
> +
> + if (region->type == IOMMU_RESV_SW_MSI) {
> + resv_msi_base = region->start;
> + with_sw_msi = true;
> + }
> + }
> +
> + if (!ret && !with_hw_msi && with_sw_msi)
> + ret = iommu_get_msi_cookie(domain, resv_msi_base);
> +
> + iommu_put_resv_regions(dma_dev, _resv_regions);
> +
> + return ret;
> +}
> +
>  static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
>  {
>   struct vdpa_device *vdpa = v->vdpa;
> @@ -1335,11 +1381,16 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa 
> *v)
>  
>   ret = iommu_attach_device(v->domain, dma_dev);
>   if (ret)
> - goto err_attach;
> + goto err_alloc_domain;
>  
> - return 0;
> + ret = vhost_vdpa_resv_iommu_region(v->domain, dma_dev, >resv_iotlb);
> + if (ret)
> + goto err_attach_device;
>  
> -err_attach:
> + return 0;
> +err_attach_device:
> + iommu_detach_device(v->domain, dma_dev);
> +err_alloc_domain:
>   iommu_domain_free(v->domain);
>   v->domain = NULL;
>   return ret;
> @@ -1595,6 +1646,8 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
>   goto err;
>   }
>  
> + vhost_iotlb_init(>resv_iotlb, 0, 0);
> +
>   r = dev_set_name(>dev, "vhost-vdpa-%u", minor);
>   if (r)
>   goto err;
> -- 
> 2.27.0
> 




Re: [PATCH v3] vhost/vdpa: Add MSI translation tables to iommu for software-managed MSI

2024-03-21 Thread Jason Wang
On Wed, Mar 20, 2024 at 6:20 PM Wang Rong  wrote:
>
> From: Rong Wang 
>
> Once enable iommu domain for one device, the MSI
> translation tables have to be there for software-managed MSI.
> Otherwise, platform with software-managed MSI without an
> irq bypass function, can not get a correct memory write event
> from pcie, will not get irqs.
> The solution is to obtain the MSI phy base address from
> iommu reserved region, and set it to iommu MSI cookie,
> then translation tables will be created while request irq.
>
> Change log
> --
>
> v1->v2:
> - add resv iotlb to avoid overlap mapping.
> v2->v3:
> - there is no need to export the iommu symbol anymore.
>
> Signed-off-by: Rong Wang 
> ---
>  drivers/vhost/vdpa.c | 59 +---
>  1 file changed, 56 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index ba52d128aeb7..28b56b10372b 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -49,6 +49,7 @@ struct vhost_vdpa {
> struct completion completion;
> struct vdpa_device *vdpa;
> struct hlist_head as[VHOST_VDPA_IOTLB_BUCKETS];
> +   struct vhost_iotlb resv_iotlb;

Is it better to introduce a reserved flag like VHOST_MAP_RESERVED,
which means it can't be modified by the userspace but the kernel.

So we don't need to have two IOTLB. But I guess the reason you have
this is because we may have multiple address spaces where the MSI
routing should work for all of them?

Another note, vhost-vDPA support virtual address mapping, so this
should only work for physicall address mapping. E.g in the case of
SVA, MSI iova is a valid IOVA for the driver/usrespace.

> struct device dev;
> struct cdev cdev;
> atomic_t opened;
> @@ -247,6 +248,7 @@ static int _compat_vdpa_reset(struct vhost_vdpa *v)
>  static int vhost_vdpa_reset(struct vhost_vdpa *v)
>  {
> v->in_batch = 0;
> +   vhost_iotlb_reset(>resv_iotlb);

We try hard to avoid this for performance, see this commit:

commit 4398776f7a6d532c466f9e41f601c9a291fac5ef
Author: Si-Wei Liu 
Date:   Sat Oct 21 02:25:15 2023 -0700

vhost-vdpa: introduce IOTLB_PERSIST backend feature bit

Any reason you need to do this?

> return _compat_vdpa_reset(v);
>  }
>
> @@ -1219,10 +1221,15 @@ static int vhost_vdpa_process_iotlb_update(struct 
> vhost_vdpa *v,
> msg->iova + msg->size - 1 > v->range.last)
> return -EINVAL;
>
> +   if (vhost_iotlb_itree_first(>resv_iotlb, msg->iova,
> +   msg->iova + msg->size - 1))
> +   return -EINVAL;
> +
> if (vhost_iotlb_itree_first(iotlb, msg->iova,
> msg->iova + msg->size - 1))
> return -EEXIST;
>
> +
> if (vdpa->use_va)
> return vhost_vdpa_va_map(v, iotlb, msg->iova, msg->size,
>  msg->uaddr, msg->perm);
> @@ -1307,6 +1314,45 @@ static ssize_t vhost_vdpa_chr_write_iter(struct kiocb 
> *iocb,
> return vhost_chr_write_iter(dev, from);
>  }
>
> +static int vhost_vdpa_resv_iommu_region(struct iommu_domain *domain, struct 
> device *dma_dev,
> +   struct vhost_iotlb *resv_iotlb)
> +{
> +   struct list_head dev_resv_regions;
> +   phys_addr_t resv_msi_base = 0;
> +   struct iommu_resv_region *region;
> +   int ret = 0;
> +   bool with_sw_msi = false;
> +   bool with_hw_msi = false;
> +
> +   INIT_LIST_HEAD(_resv_regions);
> +   iommu_get_resv_regions(dma_dev, _resv_regions);
> +
> +   list_for_each_entry(region, _resv_regions, list) {
> +   ret = vhost_iotlb_add_range_ctx(resv_iotlb, region->start,
> +   region->start + region->length - 1,
> +   0, 0, NULL);

I think MSI should be write-only?

> +   if (ret) {
> +   vhost_iotlb_reset(resv_iotlb);

Need to report an error here.

Thanks