Re: [PATCH v3 lora-next 5/5] net: lora: sx125x sx1301: allow radio to register as a clk provider

2019-01-03 Thread Mark Brown
On Wed, Jan 02, 2019 at 01:44:40AM +0100, Andreas Färber wrote:

> So, the third invocation of sun6i_transfer_one() calling clk_get_rate()
> hangs at the prepare_lock instead of reference-counting, because it runs
> from a separate kthread, unlike the two previous calls?

If there's any contestation for the bus we push all the I/O through a
separate thread to maintain ordering and improve performance.

> I did not find any *regmap_init_spi() based example in drivers/clk/, and
> all other "spi" mentions in drivers/clk/ appeared to be clock names.
> The closest was devm_regmap_init_i2c() based clk-cdce706.c, which uses
> the prepare/unprepare ops, as suggested by Mark, and does
> regmap_update_bits() from there.

You'll find a bunch of the MFDs have SPI options as well.


signature.asc
Description: PGP signature


Re: [PATCH v3 lora-next 5/5] net: lora: sx125x sx1301: allow radio to register as a clk provider

2019-01-01 Thread Andreas Färber
Am 31.12.18 um 23:56 schrieb Andreas Färber:
> Am 31.12.18 um 18:50 schrieb Mark Brown:
>> On Sun, Dec 30, 2018 at 11:55:46AM +0100, Andreas Färber wrote:
>>> Given that observed symptoms were CPU stalls, workqueue hangs and RCU
>>> problems, requiring a power-cycle to recover, I wonder whether we are
>>> running into some atomic/locking issue with clk_enable()? Is it valid at
>>> all to use SPI/regmap for clk_enable()? If it is, is there a known issue
>>> specific to spi-sun6i (A64) in 4.20.0?
>>> I already tried setting .disable_locking = true in both regmap_configs.
>>> Any suggestions how to further debug?
>>
>> You can't use SPI for clk_enable(), clk_enable() needs to be doable in
>> atomic context since we need to wait for the bus operations to complete
>> (you can start SPI transfers in atomic context but you still need to
>> wait for them to complete).  Any clocks that are only accessible via a
>> slow bus like I2C or SPI need to do the enable/disable in the
>> prepare/unprepare operations which aren't done in atomic context.
>>
>> regmap can be used in atomic contexts, though you need to configure it
>> to use spinlocks instead of mutexes and ensure that no register cache
>> allocations happen during I/O (eg, by providing defaults for all
>> registers or by not using a cache).
> 
> We have .cache_type = REGCACHE_NONE on both bus and spi regmap_configs.
> 
> I moved the regmap_field_write() from .enable to .prepare and set
> .fast_io = true on both regmap_configs to force using spinlocks, but
> same hang as in .enable before...
> 
> And same if I set .disable_locking = true on both.
> 
> Given that it works with one SPI driver and not with the other,
> independent of the locking options applied, I assume my symptoms are not
> a regmap-layer issue.
> 
> Is it allowed during a .prepare operation to call the mentioned
> clk_get_rate(), which ends up calling clk_prepare_lock()?
> 
> According to my debug output in spi-sun6i.c our hanging
> regmap_field_write() ends up calling sun6i_transfer_one() three times,
> the first two look okay, but the third one doesn't make it past the
> clk_get_rate() [...].

SysRq still works in that state! Attached is SysRq-w output.
(still with .disable_locking = true in both regmap_configs)

In the very bottom you see the "ip" task, at wait_for_completion() from
__spi_sync().
I trigger this issue with `ip link set lora2 up`, so that looks okay.

Then there's a "spi1" task at clk_prepare_lock()' mutex_lock() coming
from spi_pump_messages().
The reason for that will be that clk_prepare_lock()'s mutex_trylock()
failed (because we're holding the prepare_lock from clk_prepare_enable()
in the "ip" task) and that the prepare_owner == current check fails for
this separate task_struct, too.

So, the third invocation of sun6i_transfer_one() calling clk_get_rate()
hangs at the prepare_lock instead of reference-counting, because it runs
from a separate kthread, unlike the two previous calls?

Besides, there's also an mmc_rescan workqueue task at clk_prepare_lock()
coming from sunxi_mmc_enable() due to pm_generic_runtime_resume().
My rootfs is on microSD card.

I did not find any *regmap_init_spi() based example in drivers/clk/, and
all other "spi" mentions in drivers/clk/ appeared to be clock names.
The closest was devm_regmap_init_i2c() based clk-cdce706.c, which uses
the prepare/unprepare ops, as suggested by Mark, and does
regmap_update_bits() from there.

A quick grep in drivers/i2c/ does not find any mention of "kthread", so
probably that's the breaking difference?

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
[39292.911444] sysrq: SysRq : Show Blocked State
[39292.915819]   taskPC stack   pid father
[39292.921784] kworker/0:2 D0   170  2 0x0028
[39292.927282] Workqueue: ipv6_addrconf addrconf_verify_work
[39292.932676] Call trace:
[39292.935127]  __switch_to+0x9c/0xd8
[39292.938530]  __schedule+0x2a0/0x888
[39292.942016]  schedule+0x30/0x88
[39292.945157]  schedule_preempt_disabled+0x14/0x20
[39292.949770]  __mutex_lock.isra.1+0x2c0/0x4b8
[39292.954036]  __mutex_lock_slowpath+0x24/0x30
[39292.958302]  mutex_lock+0x48/0x50
[39292.961615]  rtnl_lock+0x1c/0x28
[39292.964841]  addrconf_verify_work+0x14/0x28
[39292.969022]  process_one_work+0x1f4/0x428
[39292.973028]  worker_thread+0x44/0x4a8
[39292.976686]  kthread+0x130/0x138
[39292.979912]  ret_from_fork+0x10/0x18
[39292.983518] btrfs-transacti D0   383  2 0x0028
[39292.988998] Call trace:
[39292.991444]  __switch_to+0x9c/0xd8
[39292.994845]  __schedule+0x2a0/0x888
[39292.998330]  schedule+0x30/0x88
[39293.001472]  io_schedule+0x20/0x40
[39293.004874]  wbt_wait+0x1bc/0x2f8
[39293.008188]  rq_qos_throttle+0x4c/0x68
[39293.011936]  blk_mq_make_request+0xc4/0x4f0
[39293.016118]  generic_make_request+0xf4/0x308
[39293.020384]  submit_bio+0x40/0x198
[39293.023977]  

Re: [PATCH v3 lora-next 5/5] net: lora: sx125x sx1301: allow radio to register as a clk provider

2018-12-31 Thread Andreas Färber
Am 31.12.18 um 18:50 schrieb Mark Brown:
> On Sun, Dec 30, 2018 at 11:55:46AM +0100, Andreas Färber wrote:
>> Given that observed symptoms were CPU stalls, workqueue hangs and RCU
>> problems, requiring a power-cycle to recover, I wonder whether we are
>> running into some atomic/locking issue with clk_enable()? Is it valid at
>> all to use SPI/regmap for clk_enable()? If it is, is there a known issue
>> specific to spi-sun6i (A64) in 4.20.0?
>> I already tried setting .disable_locking = true in both regmap_configs.
>> Any suggestions how to further debug?
> 
> You can't use SPI for clk_enable(), clk_enable() needs to be doable in
> atomic context since we need to wait for the bus operations to complete
> (you can start SPI transfers in atomic context but you still need to
> wait for them to complete).  Any clocks that are only accessible via a
> slow bus like I2C or SPI need to do the enable/disable in the
> prepare/unprepare operations which aren't done in atomic context.
> 
> regmap can be used in atomic contexts, though you need to configure it
> to use spinlocks instead of mutexes and ensure that no register cache
> allocations happen during I/O (eg, by providing defaults for all
> registers or by not using a cache).

We have .cache_type = REGCACHE_NONE on both bus and spi regmap_configs.

I moved the regmap_field_write() from .enable to .prepare and set
.fast_io = true on both regmap_configs to force using spinlocks, but
same hang as in .enable before...

And same if I set .disable_locking = true on both.

Given that it works with one SPI driver and not with the other,
independent of the locking options applied, I assume my symptoms are not
a regmap-layer issue.

Is it allowed during a .prepare operation to call the mentioned
clk_get_rate(), which ends up calling clk_prepare_lock()?

According to my debug output in spi-sun6i.c our hanging
regmap_field_write() ends up calling sun6i_transfer_one() three times,
the first two look okay, but the third one doesn't make it past the
clk_get_rate() or following if block. [But for now some fireworks...]

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH v3 lora-next 5/5] net: lora: sx125x sx1301: allow radio to register as a clk provider

2018-12-31 Thread Mark Brown
On Sun, Dec 30, 2018 at 11:55:46AM +0100, Andreas Färber wrote:
> + linux-spi, LAKML

> Given that observed symptoms were CPU stalls, workqueue hangs and RCU
> problems, requiring a power-cycle to recover, I wonder whether we are
> running into some atomic/locking issue with clk_enable()? Is it valid at
> all to use SPI/regmap for clk_enable()? If it is, is there a known issue
> specific to spi-sun6i (A64) in 4.20.0?
> I already tried setting .disable_locking = true in both regmap_configs.
> Any suggestions how to further debug?

You can't use SPI for clk_enable(), clk_enable() needs to be doable in
atomic context since we need to wait for the bus operations to complete
(you can start SPI transfers in atomic context but you still need to
wait for them to complete).  Any clocks that are only accessible via a
slow bus like I2C or SPI need to do the enable/disable in the
prepare/unprepare operations which aren't done in atomic context.

regmap can be used in atomic contexts, though you need to configure it
to use spinlocks instead of mutexes and ensure that no register cache
allocations happen during I/O (eg, by providing defaults for all
registers or by not using a cache).


signature.asc
Description: PGP signature


Re: [PATCH v3 lora-next 5/5] net: lora: sx125x sx1301: allow radio to register as a clk provider

2018-12-31 Thread Andreas Färber
Am 30.12.18 um 11:55 schrieb Andreas Färber:
> Am 29.12.18 um 21:16 schrieb Andreas Färber:
>> `sudo ip link set lora2 up` froze my system, with both
>> lora1 and lora2 being sx1301. [...]
>>
>> Investigating...
> 
> I've bisected and confirmed that it was indeed this patch that regresses
> for one of my SX1301 concentrators.
[...]
> We never return from the sx125x_clkout_enable() performing the
> regmap_field_write() on our regmap_bus, which in turn uses a SPI regmap
> in sx1301_regmap_bus_read().
> 
> A notable difference between my two concentrators is that the working
> one is using spi-gpio driver, the regressing one spi-sun6i.
> 
> Two things stood out in spi-sun6i: It uses a completion (I do not run
> into its timeout warning!), and it uses clk_{get,set}_rate().
> 
> Given that observed symptoms were CPU stalls, workqueue [freezes] and RCU
> problems, requiring a power-cycle to recover, I wonder whether we are
> running into some atomic/locking issue with clk_enable()? Is it valid at
> all to use SPI/regmap for clk_enable()? If it is, is there a known issue
> specific to spi-sun6i (A64) in 4.20.0?

I've now hacked together a test case: delaying the regmap operation to a
work queue (violating the .enable contract of a stable clk on return!)
and having our caller poll afterwards for the operation to finish. Guess
what, below gross hack makes it work again on both cards... :/

Is this hinting at an issue with spi-sun6i clk_get_rate()'s prepare_lock
vs. our clk_enable()'s enable_lock? I grep'ed around and spi-sun6i is
not the only SPI driver using clk_get_rate() in transfer_one...

Thanks for any hints,

Andreas


diff --git a/drivers/net/lora/sx125x.c b/drivers/net/lora/sx125x.c
index 597b882379ac..095ca40e5de7 100644
--- a/drivers/net/lora/sx125x.c
+++ b/drivers/net/lora/sx125x.c
@@ -11,6 +11,7 @@

 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -49,6 +50,9 @@ struct sx125x_priv {
struct device   *dev;
struct regmap   *regmap;
struct regmap_field
*regmap_fields[ARRAY_SIZE(sx125x_regmap_fields)];
+
+   struct workqueue_struct *clk_wq;
+   struct work_struct  clk_out_enable_work;
 };

 #define to_clkout(_hw) container_of(_hw, struct sx125x_priv, clkout_hw)
@@ -81,8 +85,17 @@ static int sx125x_clkout_enable(struct clk_hw *hw)
 {
struct sx125x_priv *priv = to_clkout(hw);

+   dev_info(priv->dev, "enabling clkout...\n");
+   queue_work(priv->clk_wq, >clk_out_enable_work);
+   return 0;
+}
+
+static void sx125x_clk_out_enable_work_handler(struct work_struct *ws)
+{
+   struct sx125x_priv *priv = container_of(ws, struct sx125x_priv,
clk_out_enable_work);
+
dev_info(priv->dev, "enabling clkout\n");
-   return sx125x_field_write(priv, F_CLK_OUT, 1);
+   sx125x_field_write(priv, F_CLK_OUT, 1);
 }

 static void sx125x_clkout_disable(struct clk_hw *hw)
@@ -230,6 +243,9 @@ static int __maybe_unused sx125x_regmap_probe(struct
device *dev, struct regmap
}
}

+   priv->clk_wq = alloc_workqueue("sx127x_wq", WQ_FREEZABLE |
WQ_MEM_RECLAIM, 0);
+   INIT_WORK(>clk_out_enable_work,
sx125x_clk_out_enable_work_handler);
+
dev_info(dev, "SX125x module probed\n");

return 0;
@@ -237,6 +253,10 @@ static int __maybe_unused
sx125x_regmap_probe(struct device *dev, struct regmap

 static int __maybe_unused sx125x_regmap_remove(struct device *dev)
 {
+   struct sx125x_priv *priv = dev_get_drvdata(dev);
+
+   destroy_workqueue(priv->clk_wq);
+
dev_info(dev, "SX125x module removed\n");

return 0;
diff --git a/drivers/net/lora/sx130x.c b/drivers/net/lora/sx130x.c
index 7ac7de9eda46..4ae6699d38ad 100644
--- a/drivers/net/lora/sx130x.c
+++ b/drivers/net/lora/sx130x.c
@@ -11,6 +11,7 @@

 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -391,6 +392,10 @@ static int sx130x_loradev_open(struct net_device
*netdev)
goto err_clk_enable;
}

+   do {
+   usleep_range(100, 1000);
+   } while (!__clk_is_enabled(priv->clk32m));
+
ret = sx130x_field_write(priv, F_GLOBAL_EN, 1);
if (ret) {
dev_err(priv->dev, "enable global clocks failed (%d)\n",
ret);


-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH v3 lora-next 5/5] net: lora: sx125x sx1301: allow radio to register as a clk provider

2018-12-30 Thread Andreas Färber
+ linux-spi, LAKML

Am 29.12.18 um 21:16 schrieb Andreas Färber:
> Am 29.12.18 um 20:25 schrieb Andreas Färber:
>> Am 12.10.18 um 18:26 schrieb Ben Whitten:
>>> +static int sx125x_register_clock_provider(struct sx125x_priv *priv)
>>> +{
>>> +   struct device *dev = priv->dev;
>>> +   struct clk_init_data init;
>>> +   const char *parent;
>>> +   int ret;
>>> +
>>> +   /* Disable CLKOUT */
>>> +   ret = sx125x_field_write(priv, F_CLK_OUT, 0);
>>> +   if (ret) {
>>> +   dev_err(dev, "unable to disable clkout\n");
>>> +   return ret;
>>> +   }
>>> +
>>> +   /* Register clock provider if expected in DTB */
>>> +   if (!of_find_property(dev->of_node, "#clock-cells", NULL))
>>> +   return 0;
>>> +
>>> +   dev_info(dev, "registering clkout\n");
>>> +
>>> +   parent = of_clk_get_parent_name(dev->of_node, 0);
>>> +   if (!parent) {
>>> +   dev_err(dev, "Unable to find parent clk\n");
>>> +   return -ENODEV;
[snip]
> `sudo ip link set lora2 up` froze my system, with both
> lora1 and lora2 being sx1301. [...]
> 
> Investigating...

I've bisected and confirmed that it was indeed this patch that regresses
for one of my SX1301 concentrators.

I noticed that this patch changed the behavior from writing the enable
bit only on one radio to also writing zero to the other. Default value
for the bit was enabled, so a functional change in theory. I therefore
tried enabling radio A clock output on the hanging one. However ...

I've posted a fix for our regmap_bus that lead to the sx125x writes not
taking effect. I.e., no clk_out was ever disabled. Otherwise only the
XOSC register writes were reordered compared to clock output enable, so
I spot no change to SX125x or SX130x registers that could cause this.

I've amended error handling and debug output (cf. below) and verified:

We never return from the sx125x_clkout_enable() performing the
regmap_field_write() on our regmap_bus, which in turn uses a SPI regmap
in sx1301_regmap_bus_read().

A notable difference between my two concentrators is that the working
one is using spi-gpio driver, the regressing one spi-sun6i.

Two things stood out in spi-sun6i: It uses a completion (I do not run
into its timeout warning!), and it uses clk_{get,set}_rate().

Given that observed symptoms were CPU stalls, workqueue hangs and RCU
problems, requiring a power-cycle to recover, I wonder whether we are
running into some atomic/locking issue with clk_enable()? Is it valid at
all to use SPI/regmap for clk_enable()? If it is, is there a known issue
specific to spi-sun6i (A64) in 4.20.0?
I already tried setting .disable_locking = true in both regmap_configs.
Any suggestions how to further debug?

Thanks,
Andreas

> [  473.605058] sx1301 spi0.0: SX1301 module probed
> [  474.394760] sx1301 spi1.0: SX1301 module probed
> [  485.637333] sx125x_con spi0.0-b: SX125x version: 21
> [  485.645801] sx125x_con spi0.0-b: registering clkout
> [  485.656684] sx125x_con spi0.0-b: SX125x module probed
> [  485.663789] sx125x_con spi0.0-a: SX125x version: 21
> [  485.677570] sx125x_con spi0.0-a: SX125x module probed
> [  485.685713] sx125x_con spi1.0-b: SX125x version: 21
> [  485.692210] sx125x_con spi1.0-b: registering clkout
> [  485.701712] sx125x_con spi1.0-b: SX125x module probed
> [  485.708067] sx125x_con spi1.0-a: SX125x version: 21
> [  485.718707] sx125x_con spi1.0-a: SX125x module probed
> [...]
> [ 4603.171814] sx125x_con spi0.0-b: enabling clkout
> [ 4603.697489] sx1301 spi0.0: AGC calibration firmware version 2
> [ 4603.703782] sx1301 spi0.0: starting calibration...
> [ 4606.030617] sx1301 spi0.0: AGC status: 87
> [ 4607.034555] sx1301 spi0.0: AGC firmware version 4
> [ 4607.039666] sx1301 spi0.0: ARB firmware version 1
> [...]
> [ 4628.196884] sx125x_con spi1.0-b: enabling clkout

[ 4096.183451] sx1301 spi1.0: sx1301_regmap_bus_read: radio B addr 0x10

There it does not return from its first SPI-backed regmap_write().

> [ 4968.763990] systemd[1]: systemd-udevd.service: State 'stop-sigterm'
> timed out. Killing.
> [ 4968.772167] systemd[1]: systemd-udevd.service: Killing process 455
> (systemd-udevd) with signal SIGKILL.
> [ 4968.782479] systemd[1]: systemd-logind.service: State 'stop-sigterm'
> timed out. Killing.
> [ 4968.790732] systemd[1]: systemd-logind.service: Killing process 642
> (systemd-logind) with signal SIGKILL.
> [ 5058.761523] systemd[1]: systemd-journald.service: State
> 'stop-sigabrt' timed out. Terminating.
> [ 5059.011507] systemd[1]: systemd-udevd.service: Processes still around
> after SIGKILL. Ignoring.
> [ 5059.021610] systemd[1]: systemd-logind.service: Processes still
> around after SIGKILL. Ignoring.
> [ 5149.009055] systemd[1]: systemd-journald.service: State
> 'stop-sigterm' timed out. Killing.
> [ 5149.017475] systemd[1]: systemd-journald.service: Killing process 440
> (systemd-journal) with signal SIGKILL.
> [ 5149.259029] systemd[1]: systemd-udevd.service: State
> 'stop-final-sigterm' timed out. Killing.
> [ 5259.871913] 

Re: [PATCH v3 lora-next 5/5] net: lora: sx125x sx1301: allow radio to register as a clk provider

2018-12-29 Thread Andreas Färber
Am 29.12.18 um 20:25 schrieb Andreas Färber:
> Am 12.10.18 um 18:26 schrieb Ben Whitten:
>> +static int sx125x_register_clock_provider(struct sx125x_priv *priv)
>> +{
>> +struct device *dev = priv->dev;
>> +struct clk_init_data init;
>> +const char *parent;
>> +int ret;
>> +
>> +/* Disable CLKOUT */
>> +ret = sx125x_field_write(priv, F_CLK_OUT, 0);
>> +if (ret) {
>> +dev_err(dev, "unable to disable clkout\n");
>> +return ret;
>> +}
>> +
>> +/* Register clock provider if expected in DTB */
>> +if (!of_find_property(dev->of_node, "#clock-cells", NULL))
>> +return 0;
>> +
>> +dev_info(dev, "registering clkout\n");
>> +
>> +parent = of_clk_get_parent_name(dev->of_node, 0);
>> +if (!parent) {
>> +dev_err(dev, "Unable to find parent clk\n");
>> +return -ENODEV;
> 
> I got stuck here testing:
> 
> [233875.731268] sx1301 spi0.0: SX1301 module probed
> [233876.520801] sx1301 spi1.0: SX1301 module probed
> [233876.543460] sx125x_con spi0.0-b: SX125x version: 21
> [233876.550866] sx125x_con spi0.0-b: registering clkout
> [233876.555852] sx125x_con spi0.0-b: Unable to find parent clk
> [233876.561491] sx125x_con spi0.0-b: failed to register clkout provider: -19
> [233876.569914] sx125x_con spi0.0-a: SX125x version: 21
> [233876.582915] sx125x_con spi0.0-a: SX125x module probed
> [233876.589100] sx125x_con spi1.0-b: SX125x version: 21
> [233876.595981] sx125x_con spi1.0-b: registering clkout
> [233876.600986] sx125x_con spi1.0-b: Unable to find parent clk
> [233876.606557] sx125x_con spi1.0-b: failed to register clkout provider: -19
> [233876.614559] sx125x_con spi1.0-a: SX125x version: 21
> [233876.625047] sx125x_con spi1.0-a: SX125x module probed
> 
> Your DT example above does not use any parent clock for the radios. It
> seems adding any clocks = <> reference helps resolve that error.
> 
> I don't spot any code dealing with enabling that parent either. With a
> fixed-clock we appear to get around that, except for reference counting:
> 

This was before `sudo ip link set lora1 up`...

> # cat /sys/kernel/debug/clk/clk_summary
> [...]
>  rak831-clk32m0003200
>0 0  5
> rak831_clk32m 0003200
>0 0  5
>  rg186-clk32m 0003200
>0 0  5
> rg186_clk32m  0003200
>0 0  5

Afterwards things seemed okay, but I didn't capture clk_summary again.

> 
> But it might just as well be a gpio-gate-clock or some PMIC providing
> it, needing prepare+enable ops.
> 
>> +}
>> +
>> +init.ops = _clkout_ops;
>> +init.flags = CLK_IS_BASIC;
> 
> I don't think that's really true.
> 
>> +init.parent_names = 
>> +init.num_parents = 1;
>> +priv->clkout_hw.init = 
>> +
>> +of_property_read_string_index(dev->of_node, "clock-output-names", 0,
>> +);
> 
> Error handling for this was in your style cleanup patch. I'd prefer to
> squash that hunk here.
> 
> However, clock-output-names is documented as an optional property, so we
> shouldn't rely on it here, please.
> 
>> +
>> +priv->clkout = devm_clk_register(dev, >clkout_hw);
>> +if (IS_ERR(priv->clkout)) {
>> +dev_err(dev, "failed to register clkout\n");
> 
> Would be nice to output the error code, but then again the caller does.
> 
>> +return PTR_ERR(priv->clkout);
>> +}
>> +ret = of_clk_add_hw_provider(dev->of_node, of_clk_hw_simple_get,
>> +>clkout_hw);
> 
> 
> Don't we need to unregister the provider on remove? Using the devm_
> variant would seem to handle that for us.

Possibly related, `sudo ip link set lora2 up` froze my system, with both
lora1 and lora2 being sx1301. (I should script bringing all of them up!)

Investigating...

And thanks go to the multiple vendors who donated SX130x hardware for
test coverage, helping catch non-standard bugs like these early!

Regards,
Andreas

>> +return ret;
>> +}
[snip]

[  473.605058] sx1301 spi0.0: SX1301 module probed
[  474.394760] sx1301 spi1.0: SX1301 module probed
[  485.637333] sx125x_con spi0.0-b: SX125x version: 21
[  485.645801] sx125x_con spi0.0-b: registering clkout
[  485.656684] sx125x_con spi0.0-b: SX125x module probed
[  485.663789] sx125x_con spi0.0-a: SX125x version: 21
[  485.677570] sx125x_con spi0.0-a: SX125x module probed
[  485.685713] sx125x_con spi1.0-b: SX125x version: 21
[  485.692210] sx125x_con spi1.0-b: registering clkout
[  485.701712] sx125x_con spi1.0-b: SX125x module probed
[  485.708067] sx125x_con spi1.0-a: SX125x version: 21
[  485.718707] sx125x_con spi1.0-a: SX125x module probed
[...]
[ 4603.171814] sx125x_con spi0.0-b: enabling clkout
[ 4603.697489] sx1301 spi0.0: AGC calibration firmware version 2
[ 4603.703782] sx1301 spi0.0: starting 

Re: [PATCH v3 lora-next 5/5] net: lora: sx125x sx1301: allow radio to register as a clk provider

2018-12-29 Thread Andreas Färber
Hi Ben,

+ linux-lpwan, linux-clk, devicetree

Am 12.10.18 um 18:26 schrieb Ben Whitten:
> From: Ben Whitten 
> 
> The 32M is run from the radio, before we just enabled it based on
> the radio number but now we can use the clk framework to request the
> clk is started when we need it.
> 
> The 32M clock produced from the radio is really a gated version of
> tcxo which is a fixed clock provided by hardware, and isn't captured
> in this patch.
> 
> The sx1301 brings the clock up prior to calibration once the radios
> have probed themselves.
> 
> A sample dts showing the clk link:
>   sx1301: sx1301@0 {

Nit: Node names should not duplicate model from compatible or label.
I've been using "lora" for both concentrator and radios.

>   ...
> clocks = < 0>;

Since you use #clock-cells of zero below, you are specifying two clocks
here, surely unintentional.

> clock-names = "clk32m";
> 
> radio-spi {
> radio0: radio-a@0 {

-a/-b in node names duplicates the unit address. I've been using "lora",
but we might also go for just "radio" or "transceiver"?

> ...
> };
> 
> radio1: radio-b@1 {

Should add "..." here, too, for clarity.

> #clock-cells = <0>;
> clock-output-names = "clk32m";

Personally I'd reorder those two lines to have #foo-cells last.

But more importantly, in my testing this is lacking, e.g.,
clocks = <>;
and an appropriate fixed-clock node in the root node. See inline.

> };
> };
>   };

This issue is making me think we should start properly documenting our
dt-bindings in a preceding patch, even if just for my linux-lora staging
tree. Be it in the old .txt format or Rob's newly proposed YAML.

The more cooks are working on SX130x, the better we need to explain what
changes people need to make to their DT to keep things working - if I
stumble as original author, then new testers are even more likely to!

> 
> Signed-off-by: Ben Whitten 
> ---
>  drivers/net/lora/sx125x.c | 112 
> ++
>  drivers/net/lora/sx1301.c |  13 ++
>  drivers/net/lora/sx1301.h |   2 +
>  3 files changed, 119 insertions(+), 8 deletions(-)

In general I'd appreciate if you would prepare, e.g., the clock output
in a preceding sx125x-only patch. Exposing (and consuming) clocks in the
radio driver should be independent of consuming them in the concentrator
driver. That'll make things easier to review for us and also helps avoid
the unusual "sx125x sx1301" in the subject, here and elsewhere.

> 
> diff --git a/drivers/net/lora/sx125x.c b/drivers/net/lora/sx125x.c
> index 36b61b1..b7ca782 100644
> --- a/drivers/net/lora/sx125x.c
> +++ b/drivers/net/lora/sx125x.c
> @@ -9,6 +9,8 @@
>   * Copyright (c) 2013 Semtech-Cycleo
>   */
>  
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -42,10 +44,16 @@ static const struct reg_field sx125x_regmap_fields[] = {
>  };
>  
>  struct sx125x_priv {
> + struct clk  *clkout;
> + struct clk_hw   clkout_hw;

The 0-day bots have reported this to break on sparc64, as you've seen.
We'll need to squash a Kconfig or .c based solution here.

> +
> + struct device   *dev;
>   struct regmap   *regmap;
>   struct regmap_field 
> *regmap_fields[ARRAY_SIZE(sx125x_regmap_fields)];
>  };
>  
> +#define to_clkout(_hw) container_of(_hw, struct sx125x_priv, clkout_hw)
> +
>  static struct regmap_config __maybe_unused sx125x_regmap_config = {
>   .reg_bits = 8,
>   .val_bits = 8,
> @@ -64,6 +72,96 @@ static int sx125x_field_write(struct sx125x_priv *priv,
>   return regmap_field_write(priv->regmap_fields[field_id], val);
>  }
>  
> +static int sx125x_field_read(struct sx125x_priv *priv,
> + enum sx125x_fields field_id, unsigned int *val)
> +{
> + return regmap_field_read(priv->regmap_fields[field_id], val);
> +}

Nit: Given how trivial this is, we might make it static inline?

> +
> +static int sx125x_clkout_enable(struct clk_hw *hw)
> +{
> + struct sx125x_priv *priv = to_clkout(hw);
> +
> + dev_info(priv->dev, "enabling clkout\n");
> + return sx125x_field_write(priv, F_CLK_OUT, 1);
> +}
> +
> +static void sx125x_clkout_disable(struct clk_hw *hw)
> +{
> + struct sx125x_priv *priv = to_clkout(hw);
> + int ret;
> +
> + dev_info(priv->dev, "disabling clkout\n");
> + ret = sx125x_field_write(priv, F_CLK_OUT, 0);
> + if (ret)
> + dev_err(priv->dev, "error disabling clkout\n");
> +}
> +
> +static int sx125x_clkout_is_enabled(struct clk_hw *hw)
> +{
> + struct sx125x_priv *priv = to_clkout(hw);
> + unsigned int enabled;
> + int ret;
> +
> + ret = sx125x_field_read(priv, F_CLK_OUT, );
> + if (ret) {
> + dev_err(priv->dev, "error