Re: [v3, 3/4] i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller

2018-03-08 Thread Karthik Ramasubramanian



On 3/7/2018 2:16 PM, Doug Anderson wrote:

Hi,

On Tue, Feb 27, 2018 at 5:38 PM, Karthikeyan Ramasubramanian wrote:

This bus driver supports the GENI based i2c hardware controller in the
Qualcomm SOCs. The Qualcomm Generic Interface (GENI) is a programmable
module supporting a wide range of serial interfaces including I2C. The
driver supports FIFO mode and DMA mode of transfer and switches modes
dynamically depending on the size of the transfer.

Signed-off-by: Karthikeyan Ramasubramanian 
Signed-off-by: Sagar Dharia 
Signed-off-by: Girish Mahadevan 
---
  drivers/i2c/busses/Kconfig |  11 +
  drivers/i2c/busses/Makefile|   1 +
  drivers/i2c/busses/i2c-qcom-geni.c | 626 +
  3 files changed, 638 insertions(+)


I'm not an expert on geni (and, to be honest, I haven't read the main
geni patch yet).  ...but I figured I could at least add my $0.02 since
I've stared at i2c bus drivers a lot in the past.  Feel free to tell
me if I'm full of crap...



  create mode 100644 drivers/i2c/busses/i2c-qcom-geni.c

diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index e2954fb..1ddf5cd 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -848,6 +848,17 @@ config I2C_PXA_SLAVE
   is necessary for systems where the PXA may be a target on the
   I2C bus.

+config I2C_QCOM_GENI
+   tristate "Qualcomm Technologies Inc.'s GENI based I2C controller"
+   depends on ARCH_QCOM
+   depends on QCOM_GENI_SE
+   help
+ If you say yes to this option, support will be included for the
+ built-in I2C interface on the Qualcomm Technologies Inc.'s SoCs.


Kind of a generic description and this driver is only for new SoCs,
right?  Maybe make it a little more specific?

Ok.
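For illustration, one way the help text could be made more specific (a sketch only; the exact wording eventually submitted may differ, and the SoC example is an assumption):

config I2C_QCOM_GENI
	tristate "Qualcomm Technologies Inc.'s GENI based I2C controller"
	depends on ARCH_QCOM && QCOM_GENI_SE
	help
	  This driver supports the GENI serial engine based I2C controller in
	  master mode on Qualcomm Technologies Inc.'s SoCs (e.g. SDM845). If
	  you say yes to this option, support will be included for the
	  built-in I2C interface on these SoCs.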




+
+ This driver can also be built as a module.  If so, the module
+ will be called i2c-qcom-geni.
+
  config I2C_QUP
 tristate "Qualcomm QUP based I2C controller"
 depends on ARCH_QCOM
diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
index 2ce8576..201fce1 100644
--- a/drivers/i2c/busses/Makefile
+++ b/drivers/i2c/busses/Makefile
@@ -84,6 +84,7 @@ obj-$(CONFIG_I2C_PNX) += i2c-pnx.o
  obj-$(CONFIG_I2C_PUV3) += i2c-puv3.o
  obj-$(CONFIG_I2C_PXA)  += i2c-pxa.o
  obj-$(CONFIG_I2C_PXA_PCI)  += i2c-pxa-pci.o
+obj-$(CONFIG_I2C_QCOM_GENI)+= i2c-qcom-geni.o
  obj-$(CONFIG_I2C_QUP)  += i2c-qup.o
  obj-$(CONFIG_I2C_RIIC) += i2c-riic.o
  obj-$(CONFIG_I2C_RK3X) += i2c-rk3x.o
diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
new file mode 100644
index 000..e1e4268
--- /dev/null
+++ b/drivers/i2c/busses/i2c-qcom-geni.c
@@ -0,0 +1,626 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2017-2018, The Linux Foundation. All rights reserved.
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define SE_I2C_TX_TRANS_LEN    0x26c
+#define SE_I2C_RX_TRANS_LEN    0x270
+#define SE_I2C_SCL_COUNTERS    0x278
+
+#define SE_I2C_ERR  (M_CMD_OVERRUN_EN | M_ILLEGAL_CMD_EN | M_CMD_FAILURE_EN |\
+   M_GP_IRQ_1_EN | M_GP_IRQ_3_EN | M_GP_IRQ_4_EN)
+#define SE_I2C_ABORT   BIT(1)
+
+/* M_CMD OP codes for I2C */
+#define I2C_WRITE  0x1
+#define I2C_READ   0x2
+#define I2C_WRITE_READ 0x3
+#define I2C_ADDR_ONLY  0x4
+#define I2C_BUS_CLEAR  0x6
+#define I2C_STOP_ON_BUS    0x7
+/* M_CMD params for I2C */
+#define PRE_CMD_DELAY  BIT(0)
+#define TIMESTAMP_BEFORE   BIT(1)
+#define STOP_STRETCH   BIT(2)
+#define TIMESTAMP_AFTER    BIT(3)
+#define POST_COMMAND_DELAY BIT(4)
+#define IGNORE_ADD_NACK    BIT(6)
+#define READ_FINISHED_WITH_ACK BIT(7)
+#define BYPASS_ADDR_PHASE  BIT(8)
+#define SLV_ADDR_MSK   GENMASK(15, 9)
+#define SLV_ADDR_SHFT  9
+/* I2C SCL COUNTER fields */
+#define HIGH_COUNTER_MSK   GENMASK(29, 20)
+#define HIGH_COUNTER_SHFT  20
+#define LOW_COUNTER_MSK    GENMASK(19, 10)
+#define LOW_COUNTER_SHFT   10
+#define CYCLE_COUNTER_MSK  GENMASK(9, 0)
+
+#define GP_IRQ0            0
+#define GP_IRQ1            1
+#define GP_IRQ2            2
+#define GP_IRQ3            3
+#define GP_IRQ4            4
+#define GP_IRQ5            5
+#define GENI_OVERRUN       6
+#define GENI_ILLEGAL_CMD   7
+#define GENI_ABORT_DONE    8
+#define GENI_TIMEOUT   9


Above should be an enum; then use the enum type as the parameter to
geni_i2c_err() so it's obvious that "err" is not a normal linux error
code.
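A minimal sketch of that suggestion, reusing the names from the defines above (the geni_i2c_err() signature is inferred from the review comment, not taken from the patch):

enum geni_i2c_err_code {
	GP_IRQ0,		/* = 0, matching the defines above */
	GP_IRQ1,
	GP_IRQ2,
	GP_IRQ3,
	GP_IRQ4,
	GP_IRQ5,
	GENI_OVERRUN,		/* = 6 */
	GENI_ILLEGAL_CMD,
	GENI_ABORT_DONE,
	GENI_TIMEOUT,		/* = 9 */
};

/* Taking the enum type makes it obvious "err" is not a -Exxx error code. */
static void geni_i2c_err(struct geni_i2c_dev *gi2c, enum geni_i2c_err_code err);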



+#define I2C_NACK   

Re: [v3, 3/4] i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller

2018-03-08 Thread Doug Anderson
Hi,

On Thu, Mar 8, 2018 at 5:06 PM, Sagar Dharia wrote:
> Hi Doug
>
>
> On 3/8/2018 2:12 PM, Doug Anderson wrote:
>>
>> Hi,
>>
>> On Wed, Mar 7, 2018 at 9:19 PM, Doug Anderson wrote:
>
> DMA is hard and i2c transfers > 32 bytes are rare.  Do we really gain
> a lot by transferring i2c commands over DMA compared to a FIFO?
> Enough to justify the code complexity and the set of bugs that will
> show up?  I'm sure it will be a controversial assertion given that the
> code's already written, but personally I'd be supportive of ripping
> DMA mode out to simplify the driver.  I'd be curious if anyone else
> agrees.  To me it seems like premature optimization.



 Yes, we have multiple clients (e.g. touch, NFC) using I2C for data
 transfers bigger than 32 bytes (some transfers are 100s of bytes). The
 fifo size is 32, so we can definitely avoid at least 1 interrupt when
 DMA mode is used with data size > 32.
>>>
>>>
>>> Do those 1-2 interrupts make any real difference, though?  In theory
>>> it really shouldn't affect the transfer rate.  We should be able to
>>> service the interrupt plenty fast and if we were concerned we would
>>> tweak the watermark code a little bit.  So I guess we're worried about
>>> the extra CPU cycles (and power cost) to service those extra couple
>>> interrupts?
>>>
>>> In theory when touch data is coming in or NFC data is coming in then
>>> we're probably not in a super low power state to begin with.  If it's
>>> touch data we likely want to have the CPU boosted a bunch to respond
>>> to the user quickly.  If we've got 8 cores available all of which can
>>> run at 1GHz or more a few interrupts won't kill us.  NFC data is
>>> probably not common enough that we need to optimize power/CPU
>>> utilization for that.
>>>
>>>
>>> So while I can believe that you do save an interrupt or two, I still
>>> am not convinced that those interrupts are worth a bunch of complex
>>> code (and a whole second code path) to save.
>>>
>>>
>>> ...also note that above you said that coming out of runtime suspend
>>> can take several msec.  That seems like it dwarfs any slight
>>> difference in timing between a FIFO-based operation and DMA.
>>
>>
>> One last note here (sorry for not thinking of this last night) is that
>> I would also be interested in considering _only_ supporting the DMA
>> path.  Is it somehow intrinsically bad to use the DMA flow for a
>> 1-byte transfer?  Is there a bunch of extra overhead or power draw?
>>
>> Specifically my main point is that maintaining two separate flows (the
>> FIFO flow vs the DMA flow) adds complexity, code size, and bugs.  If
>> there's a really good reason to maintain both flows then fine, but we
>> should really consider if this is something that's really giving us
>> value before we agree to it.
>>
>
> FIFO mode gives us 2 advantages:
> 1. Small transfers don't have to go through dma-map/unmap penalties.
> Some small buffers come from the stack of the client caller, and the
> dma-map/unmap may fail.
> 2. Bring-ups are 'less eventful' (with a temporary change to just not have
> DMA mode at all during bring-ups) since the SMMU translation/DMA path from
> QUP (master) to the memory slave may not always be available when critical
> I2C peripherals need to be brought up (e.g. PMIC). The CPU to QUP (slave)
> path is usually available.
>
> On the other side, DMA mode has its own advantages:
> 1. Multiple Android clients are still heavily using I2C in spite of
> faster peripheral buses being available in the industry.
> As an example, some multi-finger touch screens use I2C, and the data to
> be transferred per transaction over the bus grows well beyond 70-100
> bytes based on the number of fingers. These transactions are very frequent
> when touch is being used, and in an environment where other heavy system
> users are also running (MM/graphics).
> Another example: NFC uses I2C (as of now) to transfer as much as 700+
> bytes. This can save us 20+ interrupts per transfer.
>
> These transfers are mostly in bursts. So the RPMh penalty to resume the
> shared resources is only experienced for the very first transfer. The
> remaining transfers in the burst benefit from DMA if they are big enough.
>
> The goal here is to have a common driver for upstream targets and Android,
> and Android has seen proven advantages with both modes.
> If we end up keeping DMA only for downstream (or FIFO only for
> downstream), then we lose the advantage of having code upstream, since
> we would have to maintain a downstream patch with the other mode.

OK, fair enough.  Having DMA mode alone would be a pain at bringup if
nothing else.  You're right.

I would still argue that perhaps those extra interrupts for FIFO mode
aren't quite as big of a deal as you're making them out to be.  I've
been on systems that get massive numbers of interrupts almost
constantly and really it wasn't noticeable.

In any case, I didn't think I'd really 

Re: [v3, 3/4] i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller

2018-03-08 Thread Sagar Dharia

Hi Doug,

On 3/7/2018 10:19 PM, Doug Anderson wrote:

Hi,

On Wed, Mar 7, 2018 at 6:42 PM, Sagar Dharia wrote:

Hi Doug,
Thank you for reviewing the patch. I will take a stab at a few comments
below. We will address most of the other comments in the next version of
the I2C patch.



+
+static const struct geni_i2c_clk_fld geni_i2c_clk_map[] = {
+   {KHz(100), 7, 10, 11, 26},
+   {KHz(400), 2,  5, 12, 24},
+   {KHz(1000), 1, 3,  9, 18},



So I guess this is all relying on an input serial clock of 19.2MHz?
Maybe document that?

Assuming I'm understanding the math here, is it really OK for your
100kHz and 1MHz mode to be running slightly fast?

19200. / 2 / 24
400.0

19200. / 7 / 26
105.49450549450549

19200. / 1 / 18
1066.7



It seems like you'd want the fastest clock that you can make that's
_less than_ the spec.


It would also be interesting to know if it's expected that boards
might need to tweak the t_high / t_low depending on their electrical
characteristics.  In the past I've had lots of requests from board
makers to tweak things because they've got a long trace, or a stronger
or weaker pull, or ...  If so we might later need to add some dts
properties like "i2c-scl-rising-time-ns" and make the math more
dynamic here, unless your hardware somehow automatically adjusts for
this type of thing...
These values are derived by our HW team to comply with the t-high and
t-low specs of I2C. We have confirmed on a scope that the frequency of SCL
is indeed less than/equal to the spec. We have not come across slaves that
have needed to tweak these things. We are open to adding these properties
in dts if you have any such slaves not conforming due to board layout or
other reasons.


OK, I'm fine with leaving something like this till later if/when it
comes up.  Documenting a little bit more about how these timings work
seems like it would be nice, though, at least mentioning what the
source clock is...



Yes, we will document how t-high and t-low are derived from the source clock.
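To make the arithmetic above easy to re-check, here is a small standalone sanity check of the quoted table, assuming the 19.2 MHz source clock (ordinary userspace C, not driver code):

#include <stdio.h>

/* {target kHz, clk_div, t_high, t_low, t_cycle} rows from the table above */
static const unsigned int map[][5] = {
	{ 100, 7, 10, 11, 26},
	{ 400, 2,  5, 12, 24},
	{1000, 1,  3,  9, 18},
};

int main(void)
{
	for (int i = 0; i < 3; i++)
		/* effective SCL rate = 19200 kHz / clk_div / t_cycle */
		printf("target %4u kHz -> actual %.1f kHz\n",
		       map[i][0], 19200.0 / map[i][1] / map[i][4]);
	return 0;	/* prints 105.5, 400.0 and 1066.7 kHz */
}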


Wow, that's a cluster of arcane calls to handle a call that probably
will never fail (it just enables clocks and sets pinctrl).  Sigh.
...but as far as I can tell the whole sequence is right.  You
definitely need a "put" after a failed get and it looks like
pm_runtime_set_suspended() has a special exception where it can be
called if you got a runtime error...


We didn't have this in before either. But this condition is somewhat
frequent if I2C transactions are called on the cusp of exiting system
suspend (e.g. a PMIC slave getting a wakeup-IRQ and trying to read from the
PMIC through I2C to read its status as to what caused that wake-up). At
that time, get_sync doesn't really enable resources (kernel 4.9) since the
runtime-pm ref-count isn't decremented. We run the risk of unclocked access
if these arcane calls aren't present. You can go through the runtime-pm
documentation, chapter 6, for more details.


Yeah, I certainly agree that the calls are needed if
pm_runtime_get_sync() and I'm not suggesting removing them.  Hence the
"as far as I can tell the whole sequence is right".

...but I'm actually kinda worried if you're saying that you actually
ran into this case.  Hopefully that got fixed and code no longer tries
to read from the PMIC at a bad time anymore?  That code should be
fixed not to be running so late in suspend.



I have added Harry Y and Abhijeet D (developers for PMIC I2C client
team). They can comment if there is still a usecase of very late
transaction during suspend and/or very early transaction during resume.





+   /* Make sure no transactions are pending */
+   ret = i2c_trylock_bus(&gi2c->adap, I2C_LOCK_SEGMENT);
+   if (!ret) {
+   dev_err(gi2c->se.dev, "late I2C transaction request\n");
+   return -EBUSY;
+   }



Does this happen?  How?

Nothing about this code looks special for your hardware.  If this is
really needed, why is it not part of the i2c core since there's
nothing specific about your driver here?


There have been some clients that don't implement sys-suspend/resume
callbacks (so the i2c adapter has no clue they are done with their
transactions), and this allows us to be flexible when they call I2C
transactions extremely late.
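For context, a sketch of where a check like the one quoted above might live, in a system-suspend callback (the function name and surrounding details are assumptions, not from the patch):

static int geni_i2c_suspend_noirq(struct device *dev)
{
	struct geni_i2c_dev *gi2c = dev_get_drvdata(dev);

	/* Make sure no transactions are pending */
	if (!i2c_trylock_bus(&gi2c->adap, I2C_LOCK_SEGMENT)) {
		dev_err(gi2c->se.dev, "late I2C transaction request\n");
		return -EBUSY;
	}
	/* Nothing was pending; drop the lock again and allow suspend */
	i2c_unlock_bus(&gi2c->adap, I2C_LOCK_SEGMENT);
	return 0;
}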


Still feels like this belongs in the i2c core, not your driver.  Maybe
you should send a patch for the core and remove it from here?

...and also, it seems like any i2c clients that don't implement the
suspend/resume callbacks and try to do i2c transactions late in the
game need to be fixed.  It should be documented that this isn't a
valid thing for a driver to do and if we end up in this error case
then it's not an i2c issue but it's a bad driver somewhere.


You are right: this check is special for our HW due to usecases
mentioned above.
This check can go in i2c-core, but then it will not be necessary if
all adapters and clients that we work with are upstreamed (and all
implement system suspend/resume).
We will 

Re: [v3, 3/4] i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller

2018-03-08 Thread Sagar Dharia

Hi Doug

On 3/8/2018 2:12 PM, Doug Anderson wrote:

Hi,

On Wed, Mar 7, 2018 at 9:19 PM, Doug Anderson wrote:

DMA is hard and i2c transfers > 32 bytes are rare.  Do we really gain
a lot by transferring i2c commands over DMA compared to a FIFO?
Enough to justify the code complexity and the set of bugs that will
show up?  I'm sure it will be a controversial assertion given that the
code's already written, but personally I'd be supportive of ripping
DMA mode out to simplify the driver.  I'd be curious if anyone else
agrees.  To me it seems like premature optimization.



Yes, we have multiple clients (e.g. touch, NFC) using I2C for data transfers
bigger than 32 bytes (some transfers are 100s of bytes). The fifo size is
32, so we can definitely avoid at least 1 interrupt when DMA mode is used
with data size > 32.


Do those 1-2 interrupts make any real difference, though?  In theory
it really shouldn't affect the transfer rate.  We should be able to
service the interrupt plenty fast and if we were concerned we would
tweak the watermark code a little bit.  So I guess we're worried about
the extra CPU cycles (and power cost) to service those extra couple
interrupts?

In theory when touch data is coming in or NFC data is coming in then
we're probably not in a super low power state to begin with.  If it's
touch data we likely want to have the CPU boosted a bunch to respond
to the user quickly.  If we've got 8 cores available all of which can
run at 1GHz or more a few interrupts won't kill us.  NFC data is
probably not common enough that we need to optimize power/CPU
utilization for that.


So while I can believe that you do save an interrupt or two, I still
am not convinced that those interrupts are worth a bunch of complex
code (and a whole second code path) to save.


...also note that above you said that coming out of runtime suspend
can take several msec.  That seems like it dwarfs any slight
difference in timing between a FIFO-based operation and DMA.


One last note here (sorry for not thinking of this last night) is that
I would also be interested in considering _only_ supporting the DMA
path.  Is it somehow intrinsically bad to use the DMA flow for a
1-byte transfer?  Is there a bunch of extra overhead or power draw?

Specifically my main point is that maintaining two separate flows (the
FIFO flow vs the DMA flow) adds complexity, code size, and bugs.  If
there's a really good reason to maintain both flows then fine, but we
should really consider if this is something that's really giving us
value before we agree to it.



FIFO mode gives us 2 advantages:
1. Small transfers don't have to go through dma-map/unmap penalties.
Some small buffers come from the stack of the client caller, and the
dma-map/unmap may fail.
2. Bring-ups are 'less eventful' (with a temporary change to just not have
DMA mode at all during bring-ups) since the SMMU translation/DMA path from
QUP (master) to the memory slave may not always be available when critical
I2C peripherals need to be brought up (e.g. PMIC). The CPU to QUP (slave)
path is usually available.

On the other side, DMA mode has its own advantages:
1. Multiple Android clients are still heavily using I2C in spite of
faster peripheral buses being available in the industry.
As an example, some multi-finger touch screens use I2C, and the data to
be transferred per transaction over the bus grows well beyond 70-100
bytes based on the number of fingers. These transactions are very frequent
when touch is being used, and in an environment where other heavy system
users are also running (MM/graphics).
Another example: NFC uses I2C (as of now) to transfer as much as 700+
bytes. This can save us 20+ interrupts per transfer.

These transfers are mostly in bursts. So the RPMh penalty to resume the
shared resources is only experienced for the very first transfer. The
remaining transfers in the burst benefit from DMA if they are big enough.

The goal here is to have a common driver for upstream targets and Android,
and Android has seen proven advantages with both modes.
If we end up keeping DMA only for downstream (or FIFO only for
downstream), then we lose the advantage of having code upstream, since
we would have to maintain a downstream patch with the other mode.

Thanks
Sagar
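As a rough sketch of the interrupt arithmetic behind the 700-byte NFC example above (assuming the 32-byte FIFO and one watermark interrupt per FIFO fill; illustrative only, not driver code):

/* Interrupts needed to move len bytes through a fifo_depth-byte FIFO,
 * at one watermark interrupt per fill: ceil(len / fifo_depth). */
static unsigned int fifo_irqs(unsigned int len, unsigned int fifo_depth)
{
	return (len + fifo_depth - 1) / fifo_depth;	/* 700, 32 -> 22 */
}
/* DMA mode needs a single completion interrupt regardless of len. */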








Re: [v3, 3/4] i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller

2018-03-08 Thread Doug Anderson
Hi,

On Wed, Mar 7, 2018 at 9:19 PM, Doug Anderson wrote:
>>> DMA is hard and i2c transfers > 32 bytes are rare.  Do we really gain
>>> a lot by transferring i2c commands over DMA compared to a FIFO?
>>> Enough to justify the code complexity and the set of bugs that will
>>> show up?  I'm sure it will be a controversial assertion given that the
>>> code's already written, but personally I'd be supportive of ripping
>>> DMA mode out to simplify the driver.  I'd be curious if anyone else
>>> agrees.  To me it seems like premature optimization.
>>
>>
>> Yes, we have multiple clients (e.g. touch, NFC) using I2C for data transfers
>> bigger than 32 bytes (some transfers are 100s of bytes). The fifo size is
>> 32, so we can definitely avoid at least 1 interrupt when DMA mode is used
>> with data size > 32.
>
> Do those 1-2 interrupts make any real difference, though?  In theory
> it really shouldn't affect the transfer rate.  We should be able to
> service the interrupt plenty fast and if we were concerned we would
> tweak the watermark code a little bit.  So I guess we're worried about
> the extra CPU cycles (and power cost) to service those extra couple
> interrupts?
>
> In theory when touch data is coming in or NFC data is coming in then
> we're probably not in a super low power state to begin with.  If it's
> touch data we likely want to have the CPU boosted a bunch to respond
> to the user quickly.  If we've got 8 cores available all of which can
> run at 1GHz or more a few interrupts won't kill us.  NFC data is
> probably not common enough that we need to optimize power/CPU
> utilization for that.
>
>
> So while I can believe that you do save an interrupt or two, I still
> am not convinced that those interrupts are worth a bunch of complex
> code (and a whole second code path) to save.
>
>
> ...also note that above you said that coming out of runtime suspend
> can take several msec.  That seems like it dwarfs any slight
> difference in timing between a FIFO-based operation and DMA.

One last note here (sorry for not thinking of this last night) is that
I would also be interested in considering _only_ supporting the DMA
path.  Is it somehow intrinsically bad to use the DMA flow for a
1-byte transfer?  Is there a bunch of extra overhead or power draw?

Specifically my main point is that maintaining two separate flows (the
FIFO flow vs the DMA flow) adds complexity, code size, and bugs.  If
there's a really good reason to maintain both flows then fine, but we
should really consider if this is something that's really giving us
value before we agree to it.


-Doug


Re: [v3, 3/4] i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller

2018-03-07 Thread Doug Anderson
Hi,

On Wed, Mar 7, 2018 at 6:42 PM, Sagar Dharia wrote:
> Hi Doug,
> Thank you for reviewing the patch. I will take a stab at a few comments
> below. We will address most of the other comments in the next version
> of the I2C patch.
>>
>>
>>
>>> +
>>> +#define I2C_AUTO_SUSPEND_DELAY 250
>>
>>
>> Why 250 ms?  That seems like an eternity.  Is it really that expensive
>> to turn resources off and on?  I would sorta just expect clocks and
>> stuff to get turned off right after a transaction finished unless
>> another one was pending right behind it...
>>
> The response from RPMh to turn on/off shared resources also takes quite a
> few msecs. The QUP serial bus block sits quite a few shared NOCs away from
> the memory, and runtime-PM is used to place a bandwidth/NOC vote for these
> NOCs from QUP to memory. Also, the RPC between apps and RPMh can sometimes
> take longer depending on other tasks running on apps. This 250 msec avoids
> thrashing of these RPCs between apps and RPMh.
> If you plan to keep these NOCs on forever, then you are right: runtime-PM
> will only be used to turn on/off local clocks and we won't even need
> autosuspend. That's not true on products where this driver is currently
> deployed.

OK, fair enough.  I don't know how RPMh works well enough to argue.
It does seem odd that you'd want to design things such that it takes a
few msecs to pull it out of runtime suspend, especially for touch.


>>> +
>>> +static const struct geni_i2c_clk_fld geni_i2c_clk_map[] = {
>>> +   {KHz(100), 7, 10, 11, 26},
>>> +   {KHz(400), 2,  5, 12, 24},
>>> +   {KHz(1000), 1, 3,  9, 18},
>>
>>
>> So I guess this is all relying on an input serial clock of 19.2MHz?
>> Maybe document that?
>>
>> Assuming I'm understanding the math here, is it really OK for your
>> 100kHz and 1MHz mode to be running slightly fast?
>>
>> 19200. / 2 / 24
>
> 400.0
>>
>>
>> 19200. / 7 / 26
>
> 105.49450549450549
>>
>>
>> 19200. / 1 / 18
>
> 1066.7
>>
>>
>> It seems like you'd want the fastest clock that you can make that's
>> _less than_ the spec.
>>
>>
>> It would also be interesting to know if it's expected that boards
>> might need to tweak the t_high / t_low depending on their electrical
>> characteristics.  In the past I've had lots of requests from board
>> makers to tweak things because they've got a long trace, or a stronger
>> or weaker pull, or ...  If so we might later need to add some dts
>> properties like "i2c-scl-rising-time-ns" and make the math more
>> dynamic here, unless your hardware somehow automatically adjusts for
>> this type of thing...
> These values are derived by our HW team to comply with the t-high and
>
> t-low specs of I2C. We have confirmed on a scope that the frequency of SCL
> is indeed less than/equal to the spec. We have not come across slaves that
> have needed to tweak these things. We are open to adding these properties
> in dts if you have any such slaves not conforming due to board layout or
> other reasons.

OK, I'm fine with leaving something like this till later if/when it
comes up.  Documenting a little bit more about how these timings work
seems like it would be nice, though, at least mentioning what the
source clock is...


>>>
>>> +   mode = msg->len > 32 ? GENI_SE_DMA : GENI_SE_FIFO;
>>
>>
>> DMA is hard and i2c transfers > 32 bytes are rare.  Do we really gain
>> a lot by transferring i2c commands over DMA compared to a FIFO?
>> Enough to justify the code complexity and the set of bugs that will
>> show up?  I'm sure it will be a controversial assertion given that the
>> code's already written, but personally I'd be supportive of ripping
>> DMA mode out to simplify the driver.  I'd be curious if anyone else
>> agrees.  To me it seems like premature optimization.
>
>
> Yes, we have multiple clients (e.g. touch, NFC) using I2C for data transfers
> bigger than 32 bytes (some transfers are 100s of bytes). The fifo size is
> 32, so we can definitely avoid at least 1 interrupt when DMA mode is used
> with data size > 32.

Do those 1-2 interrupts make any real difference, though?  In theory
it really shouldn't affect the transfer rate.  We should be able to
service the interrupt plenty fast and if we were concerned we would
tweak the watermark code a little bit.  So I guess we're worried about
the extra CPU cycles (and power cost) to service those extra couple
interrupts?

In theory when touch data is coming in or NFC data is coming in then
we're probably not in a super low power state to begin with.  If it's
touch data we likely want to have the CPU boosted a bunch to respond
to the user quickly.  If we've got 8 cores available all of which can
run at 1GHz or more a few interrupts won't kill us.  NFC data is
probably not common enough that we need to optimize power/CPU
utilization for that.


So while I can believe that you do save an interrupt or two, I still
am not convinced that those interrupts are worth a bunch of complex

Re: [v3, 3/4] i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller

2018-03-07 Thread Sagar Dharia

Hi Doug,
Thank you for reviewing the patch. I will take a stab at a few comments 
below. We will address most of the other comments in the next version of
the I2C patch.




+
+#define I2C_AUTO_SUSPEND_DELAY 250


Why 250 ms?  That seems like an eternity.  Is it really that expensive
to turn resources off and on?  I would sorta just expect clocks and
stuff to get turned off right after a transaction finished unless
another one was pending right behind it...

The response from RPMh to turn on/off shared resources also takes quite a
few msecs. The QUP serial bus block sits quite a few shared NOCs away
from the memory, and runtime-PM is used to place a bandwidth/NOC vote for
these NOCs from QUP to memory. Also, the RPC between apps and RPMh can
sometimes take longer depending on other tasks running on apps. This 250
msec avoids thrashing of these RPCs between apps and RPMh.
If you plan to keep these NOCs on forever, then you are right: runtime-PM
will only be used to turn on/off local clocks and we won't even need
autosuspend. That's not true on products where this driver is currently
deployed.
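For reference, this is how such a delay is typically wired up with runtime-PM autosuspend (standard kernel APIs; whether this driver does exactly this in probe is an assumption):

/* in probe: */
pm_runtime_set_autosuspend_delay(gi2c->se.dev, I2C_AUTO_SUSPEND_DELAY);
pm_runtime_use_autosuspend(gi2c->se.dev);
pm_runtime_enable(gi2c->se.dev);

/* at the end of a transfer, instead of a plain put: */
pm_runtime_mark_last_busy(gi2c->se.dev);
pm_runtime_put_autosuspend(gi2c->se.dev);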

+
+static const struct geni_i2c_clk_fld geni_i2c_clk_map[] = {
+   {KHz(100), 7, 10, 11, 26},
+   {KHz(400), 2,  5, 12, 24},
+   {KHz(1000), 1, 3,  9, 18},


So I guess this is all relying on an input serial clock of 19.2MHz?
Maybe document that?

Assuming I'm understanding the math here, is it really OK for your
100kHz and 1MHz mode to be running slightly fast?

19200. / 2 / 24
400.0

19200. / 7 / 26
105.49450549450549

19200. / 1 / 18
1066.7


It seems like you'd want the fastest clock that you can make that's
_less than_ the spec.


It would also be interesting to know if it's expected that boards
might need to tweak the t_high / t_low depending on their electrical
characteristics.  In the past I've had lots of requests from board
makers to tweak things because they've got a long trace, or a stronger
or weaker pull, or ...  If so we might later need to add some dts
properties like "i2c-scl-rising-time-ns" and make the math more
dynamic here, unless your hardware somehow automatically adjusts for
this type of thing...
These values are derived by our HW team to comply with the t-high and
t-low specs of I2C. We have confirmed on a scope that the frequency of SCL
is indeed less than/equal to the spec. We have not come across slaves
that have needed to tweak these things. We are open to adding these
properties in dts if you have any such slaves not conforming due to
board layout or other reasons.

+   mode = msg->len > 32 ? GENI_SE_DMA : GENI_SE_FIFO;


DMA is hard and i2c transfers > 32 bytes are rare.  Do we really gain
a lot by transferring i2c commands over DMA compared to a FIFO?
Enough to justify the code complexity and the set of bugs that will
show up?  I'm sure it will be a controversial assertion given that the
code's already written, but personally I'd be supportive of ripping
DMA mode out to simplify the driver.  I'd be curious if anyone else
agrees.  To me it seems like premature optimization.


Yes, we have multiple clients (e.g. touch, NFC) using I2C for data 
transfers bigger than 32 bytes (some transfers are 100s of bytes). The 
fifo size is 32, so we can definitely avoid at least 1 interrupt when 
DMA mode is used with data size > 32.




+   geni_se_select_mode(&gi2c->se, mode);
+   writel_relaxed(msg->len, gi2c->se.base + SE_I2C_RX_TRANS_LEN);
+   geni_se_setup_m_cmd(&gi2c->se, I2C_READ, m_param);
+   if (mode == GENI_SE_DMA) {
+   rx_dma = geni_se_rx_dma_prep(&gi2c->se, msg->buf, msg->len);


Randomly I noticed a flag called "I2C_M_DMA_SAFE".  Do we need to
check this flag before using msg->buf for DMA?  ...or use
i2c_get_dma_safe_msg_buf()?

...btw: the relative lack of people doing this in the kernel is
further evidence of DMA not really being worth it for i2c busses.
I cannot comment on other drivers here using or not using DMA, since
they may not be exercised with slaves like NFC.
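For reference, a sketch of using the core helper Doug mentions (i2c_get_dma_safe_msg_buf() is in the i2c core as of v4.16; the copy-back/free below is written out by hand since no matching put helper existed at the time):

/* Returns msg->buf if the I2C_M_DMA_SAFE flag is set, a freshly
 * allocated bounce buffer otherwise, or NULL (message shorter than
 * the threshold, or allocation failure): */
u8 *buf = i2c_get_dma_safe_msg_buf(msg, 32);
if (!buf)
	return -ENOMEM;	/* or fall back to the FIFO path */
/* ... map buf, run the DMA transfer, unmap ... */
if (buf != msg->buf) {
	if (msg->flags & I2C_M_RD)
		memcpy(msg->buf, buf, msg->len);	/* copy read data back */
	kfree(buf);
}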

+   ret = pm_runtime_get_sync(gi2c->se.dev);
+   if (ret < 0) {
+   dev_err(gi2c->se.dev, "error turning SE resources:%d\n", ret);
+   pm_runtime_put_noidle(gi2c->se.dev);
+   /* Set device in suspended since resume failed */
+   pm_runtime_set_suspended(gi2c->se.dev);
+   return ret;


Wow, that's a cluster of arcane calls to handle a call that probably
will never fail (it just enables clocks and sets pinctrl).  Sigh.
...but as far as I can tell the whole sequence is right.  You
definitely need a "put" after a failed get and it looks like
pm_runtime_set_suspended() has a special exception where it can be
called if you got a runtime error...
We didn't have this in before either. But this condition is somewhat
frequent if I2C transactions are called on the cusp of exiting system
suspend (e.g. a PMIC slave getting a wakeup-IRQ and trying to read from
the PMIC through I2C to read its status as to what caused that

Re: [v3, 3/4] i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller

2018-03-07 Thread Doug Anderson
Hi,

On Tue, Feb 27, 2018 at 5:38 PM, Karthikeyan Ramasubramanian wrote:
> This bus driver supports the GENI based i2c hardware controller in the
> Qualcomm SOCs. The Qualcomm Generic Interface (GENI) is a programmable
> module supporting a wide range of serial interfaces including I2C. The
> driver supports FIFO mode and DMA mode of transfer and switches modes
> dynamically depending on the size of the transfer.
>
> Signed-off-by: Karthikeyan Ramasubramanian 
> Signed-off-by: Sagar Dharia 
> Signed-off-by: Girish Mahadevan 
> ---
>  drivers/i2c/busses/Kconfig |  11 +
>  drivers/i2c/busses/Makefile|   1 +
>  drivers/i2c/busses/i2c-qcom-geni.c | 626 +
>  3 files changed, 638 insertions(+)

I'm not an expert on geni (and, to be honest, I haven't read the main
geni patch yet).  ...but I figured I could at least add my $0.02 since
I've stared at i2c bus drivers a lot in the past.  Feel free to tell
me if I'm full of crap...


>  create mode 100644 drivers/i2c/busses/i2c-qcom-geni.c
>
> diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
> index e2954fb..1ddf5cd 100644
> --- a/drivers/i2c/busses/Kconfig
> +++ b/drivers/i2c/busses/Kconfig
> @@ -848,6 +848,17 @@ config I2C_PXA_SLAVE
>   is necessary for systems where the PXA may be a target on the
>   I2C bus.
>
> +config I2C_QCOM_GENI
> +   tristate "Qualcomm Technologies Inc.'s GENI based I2C controller"
> +   depends on ARCH_QCOM
> +   depends on QCOM_GENI_SE
> +   help
> + If you say yes to this option, support will be included for the
> + built-in I2C interface on the Qualcomm Technologies Inc.'s SoCs.

Kind of a generic description and this driver is only for new SoCs,
right?  Maybe make it a little more specific?


> +
> + This driver can also be built as a module.  If so, the module
> + will be called i2c-qcom-geni.
> +
>  config I2C_QUP
> tristate "Qualcomm QUP based I2C controller"
> depends on ARCH_QCOM
> diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
> index 2ce8576..201fce1 100644
> --- a/drivers/i2c/busses/Makefile
> +++ b/drivers/i2c/busses/Makefile
> @@ -84,6 +84,7 @@ obj-$(CONFIG_I2C_PNX) += i2c-pnx.o
>  obj-$(CONFIG_I2C_PUV3) += i2c-puv3.o
>  obj-$(CONFIG_I2C_PXA)  += i2c-pxa.o
>  obj-$(CONFIG_I2C_PXA_PCI)  += i2c-pxa-pci.o
> +obj-$(CONFIG_I2C_QCOM_GENI)+= i2c-qcom-geni.o
>  obj-$(CONFIG_I2C_QUP)  += i2c-qup.o
>  obj-$(CONFIG_I2C_RIIC) += i2c-riic.o
>  obj-$(CONFIG_I2C_RK3X) += i2c-rk3x.o
> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
> new file mode 100644
> index 000..e1e4268
> --- /dev/null
> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> @@ -0,0 +1,626 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2017-2018, The Linux Foundation. All rights reserved.
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define SE_I2C_TX_TRANS_LEN    0x26c
> +#define SE_I2C_RX_TRANS_LEN    0x270
> +#define SE_I2C_SCL_COUNTERS    0x278
> +
> +#define SE_I2C_ERR  (M_CMD_OVERRUN_EN | M_ILLEGAL_CMD_EN | M_CMD_FAILURE_EN |\
> +   M_GP_IRQ_1_EN | M_GP_IRQ_3_EN | M_GP_IRQ_4_EN)
> +#define SE_I2C_ABORT   BIT(1)
> +
> +/* M_CMD OP codes for I2C */
> +#define I2C_WRITE  0x1
> +#define I2C_READ   0x2
> +#define I2C_WRITE_READ 0x3
> +#define I2C_ADDR_ONLY  0x4
> +#define I2C_BUS_CLEAR  0x6
> +#define I2C_STOP_ON_BUS    0x7
> +/* M_CMD params for I2C */
> +#define PRE_CMD_DELAY  BIT(0)
> +#define TIMESTAMP_BEFORE   BIT(1)
> +#define STOP_STRETCH   BIT(2)
> +#define TIMESTAMP_AFTER    BIT(3)
> +#define POST_COMMAND_DELAY BIT(4)
> +#define IGNORE_ADD_NACK    BIT(6)
> +#define READ_FINISHED_WITH_ACK BIT(7)
> +#define BYPASS_ADDR_PHASE  BIT(8)
> +#define SLV_ADDR_MSK   GENMASK(15, 9)
> +#define SLV_ADDR_SHFT  9
> +/* I2C SCL COUNTER fields */
> +#define HIGH_COUNTER_MSK   GENMASK(29, 20)
> +#define HIGH_COUNTER_SHFT  20
> +#define LOW_COUNTER_MSK    GENMASK(19, 10)
> +#define LOW_COUNTER_SHFT   10
> +#define CYCLE_COUNTER_MSK  GENMASK(9, 0)
> +
> +#define GP_IRQ0            0
> +#define GP_IRQ1            1
> +#define GP_IRQ2            2
> +#define GP_IRQ3            3
> +#define GP_IRQ4            4
> +#define GP_IRQ5            5
> +#define GENI_OVERRUN       6
> +#define GENI_ILLEGAL_CMD   7
> +#define GENI_ABORT_DONE    8
> +#define GENI_TIMEOUT   9

Above should be an