Re: [PATCH v3 1/3] clk: meson-gxbb: expose clock CLKID_RNG0
Herbert Xu <herb...@gondor.apana.org.au> writes:

> On Thu, Mar 16, 2017 at 11:24:31AM -0700, Kevin Hilman wrote:
>> Hi Herbert,
>>
>> Herbert Xu <herb...@gondor.apana.org.au> writes:
>>
>> > On Wed, Feb 22, 2017 at 07:55:24AM +0100, Heiner Kallweit wrote:
>> >> Expose clock CLKID_RNG0 which is needed for the HW random number
>> >> generator.
>> >>
>> >> Signed-off-by: Heiner Kallweit <hkallwe...@gmail.com>
>> >
>> > All patches applied. Thanks.
>>
>> Actually, can you just apply [PATCH 4/4] to your tree?
>>
>> The clock and DT patches need to go through their respective trees or
>> will otherwise have conflicts with other things going in via those
>> trees.
>
> It's too late now. Please speak up sooner next time. These
> patches were posted a month ago.

Because this will cause conflicts with both the platform (amlogic) tree and the clk tree, could you provide an immutable branch where these are applied to help us handle these conflicts?

Thanks,

Kevin
Re: [PATCH v3 1/3] clk: meson-gxbb: expose clock CLKID_RNG0
Herbert Xu <herb...@gondor.apana.org.au> writes:

> On Thu, Mar 16, 2017 at 11:24:31AM -0700, Kevin Hilman wrote:
>> Hi Herbert,
>>
>> Herbert Xu <herb...@gondor.apana.org.au> writes:
>>
>> > On Wed, Feb 22, 2017 at 07:55:24AM +0100, Heiner Kallweit wrote:
>> >> Expose clock CLKID_RNG0 which is needed for the HW random number
>> >> generator.
>> >>
>> >> Signed-off-by: Heiner Kallweit <hkallwe...@gmail.com>
>> >
>> > All patches applied. Thanks.
>>
>> Actually, can you just apply [PATCH 4/4] to your tree?
>>
>> The clock and DT patches need to go through their respective trees or
>> will otherwise have conflicts with other things going in via those
>> trees.
>
> It's too late now. Please speak up sooner next time. These
> patches were posted a month ago.

Sorry, I didn't realize you would be applying everything. Also, I'm not the original author, just the platform maintainer that noticed it and now has to deal with the conflicts. :(

Most other driver maintainers are only applying patches that directly apply to their subsystem and leave patches to other drivers (e.g. clk) and platform-specific stuff (e.g. DT) to go in via their proper trees, so that's what I was expecting to happen here too.

Kevin
Re: [PATCH v3 1/3] clk: meson-gxbb: expose clock CLKID_RNG0
Hi Herbert,

Herbert Xu writes:

> On Wed, Feb 22, 2017 at 07:55:24AM +0100, Heiner Kallweit wrote:
>> Expose clock CLKID_RNG0 which is needed for the HW random number generator.
>>
>> Signed-off-by: Heiner Kallweit
>
> All patches applied. Thanks.

Actually, can you just apply [PATCH 4/4] to your tree?

The clock and DT patches need to go through their respective trees or will otherwise have conflicts with other things going in via those trees.

Thanks,

Kevin
Re: [PATCH] hwrng: meson: Remove unneeded platform MODULE_ALIAS
Javier Martinez Canillas <jav...@osg.samsung.com> writes:

> The Amlogic Meson is a DT-only platform, which means the devices are
> registered via OF and not using the legacy platform devices support.
>
> So there's no need to have a MODULE_ALIAS("platform:meson-rng") since
> the reported uevent MODALIAS to user-space will always be the OF one.
>
> Signed-off-by: Javier Martinez Canillas <jav...@osg.samsung.com>

Acked-by: Kevin Hilman <khil...@baylibre.com>

> ---
>
>  drivers/char/hw_random/meson-rng.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/char/hw_random/meson-rng.c
> b/drivers/char/hw_random/meson-rng.c
> index 51864a509be7..119d698439ae 100644
> --- a/drivers/char/hw_random/meson-rng.c
> +++ b/drivers/char/hw_random/meson-rng.c
> @@ -122,7 +122,6 @@ static struct platform_driver meson_rng_driver = {
>
>  module_platform_driver(meson_rng_driver);
>
> -MODULE_ALIAS("platform:meson-rng");
>  MODULE_DESCRIPTION("Meson H/W Random Number Generator driver");
>  MODULE_AUTHOR("Lawrence Mok <lawrence@amlogic.com>");
>  MODULE_AUTHOR("Neil Armstrong <narmstr...@baylibre.com>");

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RESEND PATCH 0/2] hw_random: Add Amlogic Meson SoCs Random Generator driver
Neil Armstrong <narmstr...@baylibre.com> writes:

> NOTE: This is a resend of the DT Bindings and DTSI patches based on the
> Amlogic DT 64bit GIT pull request from Kevin Hilman at [1].
>
> Changes since v2 at
> http://lkml.kernel.org/r/1465546915-24229-1-git-send-email-narmstr...@baylibre.com :
> - Move rng peripheral node into periphs simple-bus node

Thanks for the update. Applied to the dt64 branch of the amlogic tree.

Kevin
Re: [PATCH v2 0/3] hw_random: Add Amlogic Meson SoCs Random Generator driver
Hi Herbert,

Herbert Xu writes:

> On Fri, Jun 10, 2016 at 10:21:52AM +0200, Neil Armstrong wrote:
>> Add support for the Amlogic Meson SoCs Hardware Random generator as a
>> hw_random char driver.
>> The generator is a single 32bit wide register.
>> Also adds the Meson GXBB SoC DTSI node and corresponding DT bindings.
>>
>> Changes since v1 at
>> http://lkml.kernel.org/r/1464943621-18278-1-git-send-email-narmstr...@baylibre.com :
>> - change to depend on ARCH_MESON || COMPILE_TEST
>> - check buffer max size in read
>>
>> Neil Armstrong (3):
>>   char: hw_random: Add Amlogic Meson Hardware Random Generator
>>   dt-bindings: hwrng: Add Amlogic Meson Hardware Random Generator
>>     bindings
>>   ARM64: dts: meson-gxbb: Add Hardware Random Generator node
>
> All applied. Thanks.

Could you take just the driver please? Due to lots of other activity in the DT, I'd prefer to send the DT & bindings through arm-soc (via the amlogic tree).

Thanks,

Kevin
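An illustration of the "check buffer max size in read" item from the quoted v2 changelog: because the generator is a single 32-bit register, one read can only produce four bytes, so a read callback presumably has to refuse smaller buffers. What follows is a minimal user-space sketch of that check, not the driver's actual code. The name model_rng_read and the plain pointer standing in for the hardware register are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical user-space model, NOT the meson-rng driver code.
 * 'reg' stands in for the 32-bit data register; a real driver would
 * use an MMIO accessor rather than a plain dereference.
 * Returns the number of bytes written to buf.
 */
size_t model_rng_read(const volatile uint32_t *reg, void *buf, size_t max)
{
	uint32_t word;

	assert(reg != NULL && buf != NULL);

	/* One register read yields exactly 4 bytes; a smaller buffer gets nothing. */
	if (max < sizeof(word))
		return 0;

	word = *reg;
	memcpy(buf, &word, sizeof(word));
	return sizeof(word);
}
```

In this model a caller asking for fewer than four bytes gets 0 back and is expected to retry with a larger buffer; whether the real driver behaves exactly this way is not stated in the thread.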
Re: [PATCH v2] OMAP: AES: Don't idle/start AES device between Encrypt operations
Joel A Fernandes joelag...@ti.com writes:

> Calling runtime PM API for every block causes serious perf hit to
> crypto operations that are done on a long buffer. As crypto is performed
> on a page boundary, encrypting large buffers can cause a series of crypto
> operations divided by page. The runtime PM API is also called those many
> times.
>
> We call runtime_pm_get_sync only at beginning on the session (cra_init)
> and runtime_pm_put at the end. This result in upto a 50% speedup as below.
> This doesn't make the driver to keep the system awake as runtime get/put
> is only called during a crypto session which completes usually quickly.
>
> Before:
> root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
> Doing aes-128-cbc for 3s on 16 size blocks: 13310 aes-128-cbc's in 0.01s
> Doing aes-128-cbc for 3s on 64 size blocks: 13040 aes-128-cbc's in 0.04s
> Doing aes-128-cbc for 3s on 256 size blocks: 9134 aes-128-cbc's in 0.03s
> Doing aes-128-cbc for 3s on 1024 size blocks: 8939 aes-128-cbc's in 0.01s
> Doing aes-128-cbc for 3s on 8192 size blocks: 4299 aes-128-cbc's in 0.00s
>
> After:
> root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
> Doing aes-128-cbc for 3s on 16 size blocks: 18911 aes-128-cbc's in 0.02s
> Doing aes-128-cbc for 3s on 64 size blocks: 18878 aes-128-cbc's in 0.02s
> Doing aes-128-cbc for 3s on 256 size blocks: 11878 aes-128-cbc's in 0.10s
> Doing aes-128-cbc for 3s on 1024 size blocks: 11538 aes-128-cbc's in 0.05s
> Doing aes-128-cbc for 3s on 8192 size blocks: 4857 aes-128-cbc's in 0.03s
>
> While at it, also drop enter and exit pr_debugs, in related code. tracers
> can be used for that.
>
> Tested on a Beaglebone (AM335x SoC) board.
>
> v3 changes:
> Refreshed patch on kernel v3.10-rc3
>
> Signed-off-by: Joel A Fernandes joelag...@ti.com

This patch had at least 2 acks (myself and Mark Greer) which you should include when reposting.

Also, changelog says v3 but subject says v2.

Kevin
Re: [PATCH v2] OMAP: AES: Don't idle/start AES device between Encrypt operations
Joel A Fernandes joelag...@ti.com writes:

> Calling runtime PM API for every block causes serious perf hit to
> crypto operations that are done on a long buffer. As crypto is performed
> on a page boundary, encrypting large buffers can cause a series of crypto
> operations divided by page. The runtime PM API is also called those many
> times.
>
> We call runtime_pm_get_sync only at beginning on the session (cra_init)
> and runtime_pm_put at the end. This result in upto a 50% speedup as below.
> This doesn't make the driver to keep the system awake as runtime get/put
> is only called during a crypto session which completes usually quickly.
>
> Before:
> root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
> Doing aes-128-cbc for 3s on 16 size blocks: 13310 aes-128-cbc's in 0.01s
> Doing aes-128-cbc for 3s on 64 size blocks: 13040 aes-128-cbc's in 0.04s
> Doing aes-128-cbc for 3s on 256 size blocks: 9134 aes-128-cbc's in 0.03s
> Doing aes-128-cbc for 3s on 1024 size blocks: 8939 aes-128-cbc's in 0.01s
> Doing aes-128-cbc for 3s on 8192 size blocks: 4299 aes-128-cbc's in 0.00s
>
> After:
> root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
> Doing aes-128-cbc for 3s on 16 size blocks: 18911 aes-128-cbc's in 0.02s
> Doing aes-128-cbc for 3s on 64 size blocks: 18878 aes-128-cbc's in 0.02s
> Doing aes-128-cbc for 3s on 256 size blocks: 11878 aes-128-cbc's in 0.10s
> Doing aes-128-cbc for 3s on 1024 size blocks: 11538 aes-128-cbc's in 0.05s
> Doing aes-128-cbc for 3s on 8192 size blocks: 4857 aes-128-cbc's in 0.03s
>
> While at it, also drop enter and exit pr_debugs, in related code. tracers
> can be used for that.
>
> Tested on a Beaglebone (AM335x SoC) board.
>
> Signed-off-by: Joel A Fernandes joelag...@ti.com

Acked-by: Kevin Hilman khil...@linaro.org

Thanks for the updated changelog.

Kevin
Re: [PATCH] OMAP: AES: Don't idle/start AES device between Encrypt operations
Joel A Fernandes joelag...@ti.com writes:

> Calling runtime PM API for every block causes serious perf hit to
> crypto operations that are done on a long buffer. As crypto is performed
> on a page boundary, encrypting large buffers can cause a series of crypto
> operations divided by page. The runtime PM API is also called those many
> times.
>
> We call runtime_pm_get_sync only at beginning of the session (cra_init)
> and runtime_pm_put at the end. This result in upto a 50% speedup as below:
>
> Before:
> root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
> Doing aes-128-cbc for 3s on 16 size blocks: 13310 aes-128-cbc's in 0.01s
> Doing aes-128-cbc for 3s on 64 size blocks: 13040 aes-128-cbc's in 0.04s
> Doing aes-128-cbc for 3s on 256 size blocks: 9134 aes-128-cbc's in 0.03s
> Doing aes-128-cbc for 3s on 1024 size blocks: 8939 aes-128-cbc's in 0.01s
> Doing aes-128-cbc for 3s on 8192 size blocks: 4299 aes-128-cbc's in 0.00s
>
> After:
> root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
> Doing aes-128-cbc for 3s on 16 size blocks: 18911 aes-128-cbc's in 0.02s
> Doing aes-128-cbc for 3s on 64 size blocks: 18878 aes-128-cbc's in 0.02s
> Doing aes-128-cbc for 3s on 256 size blocks: 11878 aes-128-cbc's in 0.10s
> Doing aes-128-cbc for 3s on 1024 size blocks: 11538 aes-128-cbc's in 0.05s
> Doing aes-128-cbc for 3s on 8192 size blocks: 4857 aes-128-cbc's in 0.03s
>
> While at it, also drop enter and exit pr_debugs, in related code. tracers
> are exactly used for that.
>
> Tested on a Beaglebone (AM335x SoC) board.
>
> Signed-off-by: Joel A Fernandes joelag...@ti.com

Did you explore using runtime PM autosuspend timeouts for this instead? They are intended for exactly this kind of thing, and the timeouts can have sane defaults, but can be configured from userspace to allow a power/performance trade-off.

[...]

>  static void omap_aes_cra_exit(struct crypto_tfm *tfm)
>  {
> -	pr_debug("enter\n");
> +	struct omap_aes_dev *dd = NULL;
> +
> +	/* Find AES device, currently picks the first device */
> +	spin_lock_bh(&list_lock);
> +	list_for_each_entry(dd, &dev_list, list) {
> +		break;
> +	}
> +	spin_unlock_bh(&list_lock);
> +
> +	pm_runtime_put_sync(dd->dev);

nit: Why use the synchronous call here? The original was async.

Kevin
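To make the autosuspend suggestion above concrete, here is a small user-space model of the idea, offered as a sketch under stated assumptions rather than kernel code. In the kernel the relevant calls are pm_runtime_use_autosuspend(), pm_runtime_set_autosuspend_delay(), pm_runtime_mark_last_busy() and pm_runtime_put_autosuspend(), and the delay is exposed to userspace through the device's power/autosuspend_delay_ms sysfs attribute. Everything named model_* and count_resumes below is hypothetical and only mimics the usage counter plus an inactivity timer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a runtime-PM managed device. */
struct model_dev {
	int usage_count;            /* outstanding "get" references */
	bool active;                /* true while the device is powered */
	long last_busy_ms;          /* time of the most recent "put" */
	long autosuspend_delay_ms;  /* idle time required before suspending */
	int resume_count;           /* number of (expensive) power-ups */
};

/* Take a reference; power the device up if it was suspended. */
void model_get_sync(struct model_dev *d)
{
	if (d->usage_count++ == 0 && !d->active) {
		d->active = true;
		d->resume_count++;
	}
}

/* Drop a reference and remember when the device was last busy. */
void model_put_autosuspend(struct model_dev *d, long now_ms)
{
	assert(d->usage_count > 0);
	d->usage_count--;
	d->last_busy_ms = now_ms;
}

/* Timer check: suspend only if unused for at least the configured delay. */
void model_tick(struct model_dev *d, long now_ms)
{
	if (d->active && d->usage_count == 0 &&
	    now_ms - d->last_busy_ms >= d->autosuspend_delay_ms)
		d->active = false;
}

/* Run nr_ops back-to-back operations gap_ms apart; return power-up count. */
int count_resumes(int nr_ops, long gap_ms, long delay_ms)
{
	struct model_dev d = { .autosuspend_delay_ms = delay_ms };
	long now = 0;

	for (int i = 0; i < nr_ops; i++) {
		model_tick(&d, now);
		model_get_sync(&d);
		model_put_autosuspend(&d, now); /* operation takes no time in this model */
		now += gap_ms;
	}
	return d.resume_count;
}
```

With a delay longer than the gap between operations the model powers up once for the whole burst; with a delay of zero it powers up for every operation, which corresponds to the per-block overhead the patch is trying to avoid.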
Re: [PATCH] OMAP: AES: Don't idle/start AES device between Encrypt operations
Hi Joel,

Fernandes, Joel A joelag...@ti.com writes:

> Hi Kevin,
>
> Thanks for your review.
>
>> -Original Message-
>> From: Kevin Hilman [mailto:khil...@linaro.org]
>> Sent: Monday, May 13, 2013 11:36 AM
>> To: Fernandes, Joel A
>> Cc: linux-crypto@vger.kernel.org; linux-o...@vger.kernel.org; Mark A. Greer
>> Subject: Re: [PATCH] OMAP: AES: Don't idle/start AES device between Encrypt operations
>>
>> Joel A Fernandes joelag...@ti.com writes:
>>
>> > Calling runtime PM API for every block causes serious perf hit to
>> > crypto operations that are done on a long buffer. As crypto is performed
>> > on a page boundary, encrypting large buffers can cause a series of crypto
>> > operations divided by page. The runtime PM API is also called those many
>> > times.
>> >
>> > We call runtime_pm_get_sync only at beginning of the session (cra_init)
>> > and runtime_pm_put at the end. This result in upto a 50% speedup as below:
>> >
>> > Before:
>> > root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
>> > Doing aes-128-cbc for 3s on 16 size blocks: 13310 aes-128-cbc's in 0.01s
>> > Doing aes-128-cbc for 3s on 64 size blocks: 13040 aes-128-cbc's in 0.04s
>> > Doing aes-128-cbc for 3s on 256 size blocks: 9134 aes-128-cbc's in 0.03s
>> > Doing aes-128-cbc for 3s on 1024 size blocks: 8939 aes-128-cbc's in 0.01s
>> > Doing aes-128-cbc for 3s on 8192 size blocks: 4299 aes-128-cbc's in 0.00s
>> >
>> > After:
>> > root@beagleboard:~# time -v openssl speed -evp aes-128-cbc
>> > Doing aes-128-cbc for 3s on 16 size blocks: 18911 aes-128-cbc's in 0.02s
>> > Doing aes-128-cbc for 3s on 64 size blocks: 18878 aes-128-cbc's in 0.02s
>> > Doing aes-128-cbc for 3s on 256 size blocks: 11878 aes-128-cbc's in 0.10s
>> > Doing aes-128-cbc for 3s on 1024 size blocks: 11538 aes-128-cbc's in 0.05s
>> > Doing aes-128-cbc for 3s on 8192 size blocks: 4857 aes-128-cbc's in 0.03s
>> >
>> > While at it, also drop enter and exit pr_debugs, in related code. tracers
>> > are exactly used for that.
>> >
>> > Tested on a Beaglebone (AM335x SoC) board.
>> >
>> > Signed-off-by: Joel A Fernandes joelag...@ti.com
>>
>> Did you explore using runtime PM autosuspend timeouts for this instead?
>> They are intended for exactly this kind of thing, and the timeouts can
>> have sane defaults, but can be configured from userspace to allow a
>> power/performance trade-off.
>
> [Joel] Actually, I feel there is no real benefit in calling runtime PM api
> so many times in between crypto operations. The patch just moves the
> runtime pm usage to the beginning and end of a crypto session which will
> have to be created anyway. Imagine encrypting a 20M block- this means
> runtime PM API is called 20 * 1024 / 4 =~ 5000 times. The slow down in my
> opinion doesn't make it worth it. What is your opinion about this?

OK, I'm not terribly familiar with the crypto API, so I was assuming that the init/exit calls you're instrumenting were happening at driver probe/remove time. Based on your clarifications, that doesn't seem to be the case.

My main concern is that drivers don't simply use 'get' on driver probe and 'put' on driver remove and force the system awake as long as the driver is present. I've seen that plenty of times, and I was assuming that's what was going on here. Sorry for the confusion.

> I can explore runtime-pm timeouts and propose the numbers to describe what
> would the speedup w/ my patch and w/ timeouts.

Probably not needed. How about just adding a few more details to the changelog summarizing how/when the init/exit calls happen, to make it a bit more clear?

Thanks,

Kevin
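The "20 * 1024 / 4" estimate quoted above can be worked through explicitly. The sketch below is an illustration only: it assumes 4 KiB pages (implied by the "/ 4" in the estimate) and one runtime PM get/put pair per page-sized operation, and the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of page-sized crypto operations needed to cover buf_len bytes.
 * A partial final page still costs one full operation.
 */
size_t ops_per_buffer(size_t buf_len, size_t page_size)
{
	assert(page_size > 0);
	return (buf_len + page_size - 1) / page_size;
}

/* Per-block scheme: one get/put pair for every page-sized operation. */
size_t pm_call_pairs_per_block(size_t buf_len, size_t page_size)
{
	return ops_per_buffer(buf_len, page_size);
}

/* Per-session scheme: one get at session start, one put at session end. */
size_t pm_call_pairs_per_session(size_t buf_len, size_t page_size)
{
	(void)buf_len;
	(void)page_size;
	return 1;
}
```

For a 20 MiB buffer, 20 * 1024 KiB / 4 KiB gives 5120 operations, which is the "roughly 5000" runtime PM call pairs in the per-block scheme, against a single pair per session with the patch.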
Re: [PATCH v2 5/7] crypto: omap-sham: Convert to use pm_runtime API
Hi Mark,

Mark A. Greer mgr...@animalcreek.com writes:

> From: Mark A. Greer mgr...@animalcreek.com
>
> Convert the omap-sham crypto driver to use the pm_runtime API instead
> of the clk API.
>
> CC: Kevin Hilman khil...@deeprootsystems.com
> CC: Paul Walmsley p...@pwsan.com
> CC: Dmitry Kasatkin dmitry.kasat...@intel.com
> Signed-off-by: Mark A. Greer mgr...@animalcreek.com

I can't pretend to fully understand this driver, but it looks like the current code is doing a bit more fine-grained clock gating, and leaving the IP clocked only when needed.

The proposed version does a 'get' in probe and a 'put' in remove, which means the IP will always be enabled (thus preventing low-power states), even when it's not in use. If that's really needed, it should be thoroughly described in the changelog; otherwise I suggest doing the runtime PM 'get' and 'put' in roughly the same spots as the current clk enable/disable, which makes this a more straightforward conversion.

Kevin
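The difference between the two placements of 'get' and 'put' described above can be shown with a tiny user-space model of the usage counter. This is a sketch, not omap-sham code: the struct and every function name here are hypothetical, and the only assumption carried over from runtime PM is that a device may enter a low-power state only while its usage count is zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device's runtime PM usage counter. */
struct model_pm {
	int usage_count;
};

void model_get(struct model_pm *pm)
{
	pm->usage_count++;
}

void model_put(struct model_pm *pm)
{
	assert(pm->usage_count > 0);
	pm->usage_count--;
}

/* The device may drop to a low-power state only with no references held. */
bool model_can_idle(const struct model_pm *pm)
{
	return pm->usage_count == 0;
}

/* 'get' in probe, 'put' in remove: is the device ever idle while bound? */
bool idles_with_probe_remove_refs(int nr_requests)
{
	struct model_pm pm = { 0 };
	bool ever_idle = false;

	model_get(&pm);                         /* probe */
	for (int i = 0; i < nr_requests; i++)
		ever_idle |= model_can_idle(&pm);   /* between requests */
	model_put(&pm);                         /* remove */
	return ever_idle;
}

/* 'get'/'put' around each request, like the existing clk enable/disable. */
bool idles_with_per_request_refs(int nr_requests)
{
	struct model_pm pm = { 0 };
	bool ever_idle = false;

	for (int i = 0; i < nr_requests; i++) {
		model_get(&pm);                     /* request starts */
		model_put(&pm);                     /* request done */
		ever_idle |= model_can_idle(&pm);   /* between requests */
	}
	return ever_idle;
}
```

In the probe/remove variant the count never reaches zero while the driver is bound, so the model never reports an idle opportunity; in the per-request variant it does after every request.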