Re: [PATCH v5 00/17] ASoC: fsl_ssi: Clean up - program flow level

2018-01-17 Thread Maciej S. Szmigiero
On 17.01.2018 07:51, Nicolin Chen wrote:
> [ Maciej, could you please send your Tested-by/Reviewed-by for AC97
>   once you confirm this series?
>   
>   And Caleb, this version does not need a test for non-AC97 cases.
> 
>   Thanks both! ]
> 
> ==Change log==
> v5
>  * Reworked the series by taking suggestions from Maciej for AC97
>   + Fixed SSI lockup issue by changing cleanup sequence in PATCH-13
>   + Moved fsl_ssi_hw_clean() after unregistering the CODEC device
> in PATCH-13
>   + Set NULL as the parent of CODEC platform device to fix a NULL
> pointer dereference bug in PATCH-16
>  * Updated comments of three variables/pointers in struct fsl_ssi
>to describe them more accurately in PATCH-16
> v4
>  * Reworked the series by taking suggestions from Maciej
>   + Added TXBIT0 bit back to play safe in PATCH-14
>   + Made bool synchronous exclusive with AC97 mode in PATCH-16
> v3
>  * Reworked the series by taking suggestions from Maciej
>   + Added PATCH-01 to make RX and TX more clearly defined
>   + Replaced "bool dir" with "int dir" in PATCH-04
>   + Replaced "!dir" with "int adir" in PATCH-05
>   + Put CBM_CFS behind the baudclk check to keep the same
> program flow in PATCH-14
>   + Removed all cpu_dai_drv changes in PATCH-15
> v2
>  * Reworked the series by taking suggestions from Maciej
>   + Added PATCH-01 to keep all ssi->i2s_net updated
>   + Replaced bool tx with bool dir in PATCH-03 and PATCH-06
>   + Moved all initial register configurations from dai probe() to
> platform probe() so as to let AC97 CODEC successfully probe.
>  * Added Tested-by from Caleb for TDM test cases.
> 
> ==Background==
> The fsl_ssi driver was designed for PPC originally and then it has
> been updated to support different modes for i.MX Series, including
> SDMA, I2S Master mode, AC97 and older i.MXs with FIQ, by different
> contributors for different use cases in different coding styles.
> 
> Additionally, in order to fix/work-around hardware bugs and design
> flaws, the driver made a lot of compromise so now its program flow
> looks very complicated and it's getting hard to maintain or update.
> 
> So I am going to clean up the driver on both coding style level and
> program flow level.
> 
> ==Introduction==
> This series of patches is the second set to clean up fsl_ssi driver
> in the program flow level. Any patch here may impact a fundamental
> test case like playback or record.
> 
> ==Verification==
> This series of patches require fully tested. I have done such tests
> on i.MX6SoloX with WM8962 using imx_v6_v7_defconfig as:
>  - Playback via I2S Master and Slave mode
>  - Record via I2S Master and Slave mode
>  - Simultaneous playback and record via I2S Master and Slave mode
>  - Background playback with foreground record (starting at different
>time) via I2S Master and Slave mode
>  - Background record with foreground playback (starting at different
>time) via I2S Master and Slave mode
>  * All tests above by hacking offline_config to true in imx51.
> 
> Caleb has tested v1-v4 with TDM lookback tests on i.MX6.
> 
> Example of uncovered tests: AC97, PowerPC and FIQ.

For the whole series:
Tested-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
Reviewed-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>

However, I have a small nitpick regarding a comment newly added in
this version of patch 16:
+   /*
+* Do not set SSI dev as the parent of AC97 CODEC device since
+* it does not have a DT node. Otherwise ASoC core will assume
+* CODEC has the same DT node as the SSI, so it may return a
+* NULL pointer of CODEC when asked for SSI via the DT node

The second part of the last sentence isn't really true, the ASoC core
will return a (valid, non-NULL) CODEC object pointer when asked for
the SSI one if we set the SSI as the parent device of a AC'97 CODEC
platform device.

The NULL pointer dereference when starting a playback that I wrote
about in my previous message happens because in this situation the SSI
DAI probe callback won't ever get called and so won't setup DMA data
pointers (they will remain NULL).
And this in turn will cause the ASoC DMA code to dereference these
NULL pointers when starting a playback (the same will probably happen
also when starting a capture).

Sorry if I wasn't 100% clear about these details in my previous
message describing this issue.

Maciej


Re: [PATCH v3 16/17] ASoC: fsl_ssi: Move DT related code to a separate probe()

2018-01-15 Thread Maciej S. Szmigiero
On 15.01.2018 05:21, Nicolin Chen wrote:
> This patch cleans up probe() function by moving all Device Tree
> related code into a separate function. It allows the probe() to
> be Device Tree independent. This will be very useful for future
> integration of imx-ssi driver which has similar functionalities
> while exists only because it supports non-DT cases.
> 
> This patch also moves symmetric_channels of AC97 from the probe
> to the structure snd_soc_dai_driver for simplification.
> 
> Additionally, since PowerPC and AC97 use the same pdev pointer
> to register a platform device, this patch also unifies related
> code.
> 
> Signed-off-by: Nicolin Chen 
> Tested-by: Caleb Crome 
> ---
>  sound/soc/fsl/fsl_ssi.c | 202 
> +---
>  1 file changed, 107 insertions(+), 95 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index 20889d8..d2072de 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -1366,41 +1365,102 @@ static void fsl_ssi_imx_clean(struct platform_device 
> *pdev, struct fsl_ssi *ssi)
>   clk_disable_unprepare(ssi->clk);
>  }
>  
> -static int fsl_ssi_probe(struct platform_device *pdev)
> +static int fsl_ssi_probe_from_dt(struct fsl_ssi *ssi)
>  {
> - struct fsl_ssi *ssi;
> - int ret = 0;
> - struct device_node *np = pdev->dev.of_node;
> - struct device *dev = >dev;
> + struct device *dev = ssi->dev;
> + struct device_node *np = dev->of_node;
>   const struct of_device_id *of_id;
>   const char *p, *sprop;
>   const uint32_t *iprop;
> - struct resource *res;
> - void __iomem *iomem;
> - char name[64];
> - struct regmap_config regconfig = fsl_ssi_regconfig;
> + u32 dmas[4];
> + int ret;
>  
>   of_id = of_match_device(fsl_ssi_ids, dev);
>   if (!of_id || !of_id->data)
>   return -EINVAL;
>  
> - ssi = devm_kzalloc(dev, sizeof(*ssi), GFP_KERNEL);
> - if (!ssi)
> - return -ENOMEM;
> -
>   ssi->soc = of_id->data;
> - ssi->dev = dev;
> +
> + ret = of_property_match_string(np, "clock-names", "ipg");
> + /* Get error code if not found */
> + ssi->has_ipg_clk_name = ret >= 0;
>  
>   /* Check if being used in AC97 mode */
>   sprop = of_get_property(np, "fsl,mode", NULL);
> - if (sprop) {
> - if (!strcmp(sprop, "ac97-slave"))
> - ssi->dai_fmt = FSLSSI_AC97_DAIFMT;
> + if (sprop && !strcmp(sprop, "ac97-slave")) {
> + ssi->dai_fmt = FSLSSI_AC97_DAIFMT;
> +
> + ret = of_property_read_u32(np, "cell-index", >card_idx);
> + if (ret) {
> + dev_err(dev, "failed to get SSI index property\n");
> + return -EINVAL;
> + }
> + strcpy(ssi->card_name, "ac97-codec");
>   }
>  
>   /* Select DMA or FIQ */
>   ssi->use_dma = !of_property_read_bool(np, "fsl,fiq-stream-filter");
>  
> + /* In synchronous mode, STCK and STFS ports are used by RX as well */
> + if (!of_find_property(np, "fsl,ssi-asynchronous", NULL))
> + ssi->synchronous = true;

You are setting ssi->synchronous in the AC'97 mode here, the old code
didn't do that (see the next patch hunk below).

Since in the previous patch you have replaced cpu_dai_drv.symmetric_rates
with ssi->synchronous this will likely break asymmetric rate support in
the AC'97 mode, since the driver will use STCCR for programming of both
playback and capture.

The next patch in this series (17) also looks affected by this change.

> @@ -1444,24 +1500,13 @@ static int fsl_ssi_probe(struct platform_device *pdev)
>   return ssi->irq;
>   }
>  
> - /* Set software limitations for synchronous mode */
> - if (!of_find_property(np, "fsl,ssi-asynchronous", NULL)) {
> - if (!fsl_ssi_is_ac97(ssi)) {
> - ssi->cpu_dai_drv.symmetric_rates = 1;
> - ssi->cpu_dai_drv.symmetric_samplebits = 1;
> - ssi->synchronous = true;
> - }

You can see it here that the old code didn't set ssi->synchronous in the
AC'97 mode.

Maciej


Re: [PATCH v3 14/17] ASoC: fsl_ssi: Clean up _fsl_ssi_set_dai_fmt()

2018-01-15 Thread Maciej S. Szmigiero
On 15.01.2018 05:21, Nicolin Chen wrote:
> The _fsl_ssi_set_dai_fmt() is a helper function being called from
> fsl_ssi_set_dai_fmt() as an ASoC operation and fsl_ssi_hw_init()
> mainly for AC97 format initialization.
> 
> This patch cleans the _fsl_ssi_set_dai_fmt() in following ways:
> * Removing *dev pointer in the parameters as it's included in the
>   *ssi pointer of struct fsl_ssi.
> * Using regmap_update_bits() instead of regmap_read() with masking
>   the value manually.
> * Removing TXBIT0 configurations since this bit is set to 1 as its
>   reset value and there is no use case so far to unset it. And it
>   is safe to remove since regmap_update_bits() won't touch it.

I didn't get any response to a comment I've written about the point
above during the previous patch iteration:> The old code set this bit in any 
mode other than AC'97 (where the
> hardware always treats this bit as set regardless of the actual value).
> I would play safe here and not rely on this bit being set by a SSI
> reset on all SSI models.

Maciej

> * Moving baudclk check to the switch-case routine to skip the I2S
>   master check. And moving SxCCR.DC settings after baudclk check.
> * Adding format settings for SND_SOC_DAIFMT_AC97 like others.
> 
> Signed-off-by: Nicolin Chen 
> Tested-by: Caleb Crome 
> ---
> Changelog
> v3
>  * Put CBM_CFS behind the baudclk check to keep the same program
>flow as before
> 
>  sound/soc/fsl/fsl_ssi.c | 73 
> ++---
>  1 file changed, 33 insertions(+), 40 deletions(-)
> 


Re: [PATCH v2 06/16] ASoC: fsl_ssi: Clean up helper functions of trigger()

2018-01-14 Thread Maciej S. Szmigiero
On 11.01.2018 07:43, Nicolin Chen wrote:
> The trigger() calls fsl_ssi_tx_config() and fsl_ssi_rx_config(),
> and both of them jump to fsl_ssi_config(). And fsl_ssi_config()
> later calls another fsl_ssi_rxtx_config().
> 
> However, the whole routine, especially fsl_ssi_config() function,
> is too complicated because of the folowing reasons:
> 1) It has to handle the concern of the opposite stream.
> 2) It has to handle cases of offline configurations support.
> 3) It has to handle enable and disable operations while they're
>mostly different.
> 
> Since the enable and disable routines have more differences than
> TX and RX rountines, this patch simplifies these helper functions
> with the following changes:
> - Changing to two helper functions of enable and disable instead
>   of TX and RX.
> - Removing fsl_ssi_rxtx_config() by separately integrating it to
>   two newly introduced enable & disable functions.
> 
> Signed-off-by: Nicolin Chen 
> Tested-by: Caleb Crome 
> ---
>  sound/soc/fsl/fsl_ssi.c | 256 
> +++-
>  1 file changed, 122 insertions(+), 134 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index 263c067..09a571a 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -378,31 +378,83 @@ static irqreturn_t fsl_ssi_isr(int irq, void *dev_id)
>  }
>  
>  /**
> - * Enable or disable all rx/tx config flags at once
> + * Set SCR, SIER, STCR and SRCR registers with cached values in regvals
> + *
> + * Notes:
> + * 1) For offline_config SoCs, enable all necessary bits of both streams
> + *when 1st stream starts, even if the opposite stream will not start
> + * 2) It also clears FIFO before setting regvals; SOR is safe to set online
>   */
> -static void fsl_ssi_rxtx_config(struct fsl_ssi *ssi, bool enable)
> +static void fsl_ssi_config_enable(struct fsl_ssi *ssi, bool tx)
>  {
> - struct regmap *regs = ssi->regs;
>   struct fsl_ssi_regvals *vals = ssi->regvals;
> + bool dir = tx ? TX : RX;

A similar case as in patch 3 of a bool variable used for a bit index.

> + u32 sier, srcr, stcr;
>  
> - if (enable) {
> - regmap_update_bits(regs, REG_SSI_SIER,
> -vals[RX].sier | vals[TX].sier,
> -vals[RX].sier | vals[TX].sier);
> - regmap_update_bits(regs, REG_SSI_SRCR,
> -vals[RX].srcr | vals[TX].srcr,
> -vals[RX].srcr | vals[TX].srcr);
> - regmap_update_bits(regs, REG_SSI_STCR,
> -vals[RX].stcr | vals[TX].stcr,
> -vals[RX].stcr | vals[TX].stcr);
> + /* Clear dirty data in the FIFO; It also prevents channel slipping */
> + regmap_update_bits(ssi->regs, REG_SSI_SOR,
> +SSI_SOR_xX_CLR(tx), SSI_SOR_xX_CLR(tx));
> +
> + /*
> +  * On offline_config SoCs, SxCR and SIER are already configured when
> +  * the previous stream started. So skip all SxCR and SIER settings
> +  * to prevent online reconfigurations, then jump to set SCR directly
> +  */
> + if (ssi->soc->offline_config && ssi->streams)
> + goto enable_scr;
> +
> + if (ssi->soc->offline_config) {
> + /*
> +  * Online reconfiguration not supported, so enable all bits for
> +  * both streams at once to avoid necessity of reconfigurations
> +  */
> + srcr = vals[RX].srcr | vals[TX].srcr;
> + stcr = vals[RX].stcr | vals[TX].stcr;
> + sier = vals[RX].sier | vals[TX].sier;
>   } else {
> - regmap_update_bits(regs, REG_SSI_SRCR,
> -vals[RX].srcr | vals[TX].srcr, 0);
> - regmap_update_bits(regs, REG_SSI_STCR,
> -vals[RX].stcr | vals[TX].stcr, 0);
> - regmap_update_bits(regs, REG_SSI_SIER,
> -vals[RX].sier | vals[TX].sier, 0);
> + /* Otherwise, only set bits for the current stream */
> + srcr = vals[dir].srcr;
> + stcr = vals[dir].stcr;
> + sier = vals[dir].sier;
>   }
> +
> + /* Configure SRCR, STCR and SIER at once */
> + regmap_update_bits(ssi->regs, REG_SSI_SRCR, srcr, srcr);
> + regmap_update_bits(ssi->regs, REG_SSI_STCR, stcr, stcr);
> + regmap_update_bits(ssi->regs, REG_SSI_SIER, sier, sier);
> +
> +enable_scr:
> + /*
> +  * Start DMA before setting TE to avoid FIFO underrun
> +  * which may cause a channel slip or a channel swap
> +  *
> +  * TODO: FIQ cases might also need this upon testing
> +  */
> + if (ssi->use_dma && tx) {
> + int try = 100;
> + u32 sfcsr;
> +
> + /* Enable SSI first to send TX DMA request */
> + 

Re: [PATCH v2 13/16] ASoC: fsl_ssi: Clean up _fsl_ssi_set_dai_fmt()

2018-01-14 Thread Maciej S. Szmigiero
On 11.01.2018 07:43, Nicolin Chen wrote:
> The _fsl_ssi_set_dai_fmt() is a helper function being called from
> fsl_ssi_set_dai_fmt() as an ASoC operation and fsl_ssi_hw_init()
> mainly for AC97 format initialization.
> 
> This patch cleans the _fsl_ssi_set_dai_fmt() in following ways:
> * Removing *dev pointer in the parameters as it's included in the
>   *ssi pointer of struct fsl_ssi.
> * Using regmap_update_bits() instead of regmap_read() with masking
>   the value manually.
> * Removing TXBIT0 configurations since this bit is set to 1 as its
>   reset value and there is no use case so far to unset it. And it
>   is safe to remove since regmap_update_bits() won't touch it.

The old code set this bit in any mode other than AC'97 (where the
hardware always treats this bit as set regardless of the actual value).
I would play safe here and not rely on this bit being set by a SSI
reset on all SSI models.

> * Moving baudclk check to the switch-case routine to skip the I2S
>   master check. And moving SxCCR.DC settings after baudclk check.
> * Adding format settings for SND_SOC_DAIFMT_AC97 like others.
> 
> Signed-off-by: Nicolin Chen 
> Tested-by: Caleb Crome 
> ---
>  sound/soc/fsl/fsl_ssi.c | 70 
> ++---
>  1 file changed, 31 insertions(+), 39 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index 178c192..213962a 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -855,42 +855,27 @@ static int fsl_ssi_hw_free(struct snd_pcm_substream 
> *substream,
>   return 0;
>  }
>  
> -static int _fsl_ssi_set_dai_fmt(struct device *dev,
> - struct fsl_ssi *ssi, unsigned int fmt)
> +static int _fsl_ssi_set_dai_fmt(struct fsl_ssi *ssi, unsigned int fmt)
>  {
> - struct regmap *regs = ssi->regs;
> - u32 strcr = 0, stcr, srcr, scr, mask;
> + u32 strcr = 0, scr = 0, stcr, srcr, mask;
>  
>   ssi->dai_fmt = fmt;
>  
> - if (fsl_ssi_is_i2s_master(ssi) && IS_ERR(ssi->baudclk)) {
> - dev_err(dev, "missing baudclk for master mode\n");
> - return -EINVAL;
> - }
> -
> - regmap_read(regs, REG_SSI_SCR, );
> - scr &= ~(SSI_SCR_SYN | SSI_SCR_I2S_MODE_MASK);
>   /* Synchronize frame sync clock for TE to avoid data slipping */
>   scr |= SSI_SCR_SYNC_TX_FS;
>  
> - mask = SSI_STCR_TXBIT0 | SSI_STCR_TFDIR | SSI_STCR_TXDIR |
> -SSI_STCR_TSCKP | SSI_STCR_TFSI | SSI_STCR_TFSL | SSI_STCR_TEFS;
> - regmap_read(regs, REG_SSI_STCR, );
> - regmap_read(regs, REG_SSI_SRCR, );
> - stcr &= ~mask;
> - srcr &= ~mask;
> -
>   /* Use Network mode as default */
>   ssi->i2s_net = SSI_SCR_NET;
>   switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) {
>   case SND_SOC_DAIFMT_I2S:
> - regmap_update_bits(regs, REG_SSI_STCCR,
> -SSI_SxCCR_DC_MASK, SSI_SxCCR_DC(2));
> - regmap_update_bits(regs, REG_SSI_SRCCR,
> -SSI_SxCCR_DC_MASK, SSI_SxCCR_DC(2));
>   switch (fmt & SND_SOC_DAIFMT_MASTER_MASK) {
>   case SND_SOC_DAIFMT_CBM_CFS:
>   case SND_SOC_DAIFMT_CBS_CFS:
> + if (IS_ERR(ssi->baudclk)) {
> + dev_err(ssi->dev,
> + "missing baudclk for master mode\n");
> + return -EINVAL;
> + }

The original code did this check only for fsl_ssi_is_i2s_master(ssi),
that is, only for SND_SOC_DAIFMT_CBS_CFS while here you also do it for
SND_SOC_DAIFMT_CBM_CFS.
Was this changed on purpose?

Maciej


Re: [PATCH v2 14/16] ASoC: fsl_ssi: Remove cpu_dai_drv from fsl_ssi structure

2018-01-14 Thread Maciej S. Szmigiero
On 11.01.2018 07:43, Nicolin Chen wrote:
> The cpu_dai_drv is only used for symmetric_rates. So this patch replaces
> it with a synchronous boolean flag.

You make cpu_dai_drv common to all SSI instances instead of per-instance.

What if you have multiple SSIs in the system with different
symmetric_{rates,samplebits,channels} settings?

Maciej

> Signed-off-by: Nicolin Chen 
> Tested-by: Caleb Crome 
> ---
>  sound/soc/fsl/fsl_ssi.c | 32 
>  1 file changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index 213962a..716603c 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -208,11 +208,11 @@ struct fsl_ssi_soc_data {
>   *
>   * @regs: Pointer to the regmap registers
>   * @irq: IRQ of this SSI
> - * @cpu_dai_drv: CPU DAI driver for this device
>   *
>   * @dai_fmt: DAI configuration this device is currently used with
>   * @streams: Mask of current active streams: BIT(TX) and BIT(RX)
>   * @i2s_net: I2S and Network mode configurations of SCR register
> + * @synchronous: Use synchronous mode - both of TX and RX use STCK and SFCK
>   * @use_dma: DMA is used or FIQ with stream filter
>   * @use_dual_fifo: DMA with support for dual FIFO mode
>   * @has_ipg_clk_name: If "ipg" is in the clock name list of device tree
> @@ -253,11 +253,11 @@ struct fsl_ssi_soc_data {
>  struct fsl_ssi {
>   struct regmap *regs;
>   int irq;
> - struct snd_soc_dai_driver cpu_dai_drv;
>  
>   unsigned int dai_fmt;
>   u8 streams;
>   u8 i2s_net;
> + bool synchronous;
>   bool use_dma;
>   bool use_dual_fifo;
>   bool has_ipg_clk_name;
> @@ -668,7 +668,6 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> *substream,
>   bool tx2, tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
>   struct fsl_ssi *ssi = snd_soc_dai_get_drvdata(dai);
>   struct regmap *regs = ssi->regs;
> - int synchronous = ssi->cpu_dai_drv.symmetric_rates, ret;
>   u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i;
>   unsigned long clkrate, baudrate, tmprate;
>   unsigned int slots = params_channels(hw_params);
> @@ -676,6 +675,7 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> *substream,
>   u64 sub, savesub = 10;
>   unsigned int freq;
>   bool baudclk_is_used;
> + int ret;
>  
>   /* Override slots and slot_width if being specifically set... */
>   if (ssi->slots)
> @@ -754,7 +754,7 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> *substream,
>   mask = SSI_SxCCR_PM_MASK | SSI_SxCCR_DIV2 | SSI_SxCCR_PSR;
>  
>   /* STCCR is used for RX in synchronous mode */
> - tx2 = tx || synchronous;
> + tx2 = tx || ssi->synchronous;
>   regmap_update_bits(regs, REG_SSI_SxCCR(tx2), mask, stccr);
>  
>   if (!baudclk_is_used) {
> @@ -802,7 +802,7 @@ static int fsl_ssi_hw_params(struct snd_pcm_substream 
> *substream,
>* that should set separate configurations for STCCR and SRCCR
>* despite running in the synchronous mode.
>*/
> - if (enabled && ssi->cpu_dai_drv.symmetric_rates)
> + if (enabled && ssi->synchronous)
>   return 0;
>  
>   if (fsl_ssi_is_i2s_master(ssi)) {
> @@ -834,7 +834,7 @@ static int fsl_ssi_hw_params(struct snd_pcm_substream 
> *substream,
>   }
>  
>   /* In synchronous mode, the SSI uses STCCR for capture */
> - tx2 = tx || ssi->cpu_dai_drv.symmetric_rates;
> + tx2 = tx || ssi->synchronous;
>   regmap_update_bits(regs, REG_SSI_SxCCR(tx2), SSI_SxCCR_WL_MASK, wl);
>  
>   return 0;
> @@ -959,7 +959,7 @@ static int _fsl_ssi_set_dai_fmt(struct fsl_ssi *ssi, 
> unsigned int fmt)
>   srcr = strcr;
>  
>   /* Set SYN mode and clear RXDIR bit when using SYN or AC97 mode */
> - if (ssi->cpu_dai_drv.symmetric_rates || fsl_ssi_is_ac97(ssi)) {
> + if (ssi->synchronous || fsl_ssi_is_ac97(ssi)) {
>   srcr &= ~SSI_SRCR_RXDIR;
>   scr |= SSI_SCR_SYN;
>   }
> @@ -1360,6 +1360,7 @@ static void fsl_ssi_imx_clean(struct platform_device 
> *pdev, struct fsl_ssi *ssi)
>  
>  static int fsl_ssi_probe(struct platform_device *pdev)
>  {
> + struct snd_soc_dai_driver *cpu_dai_drv;
>   struct fsl_ssi *ssi;
>   int ret = 0;
>   struct device_node *np = pdev->dev.of_node;
> @@ -1394,14 +1395,12 @@ static int fsl_ssi_probe(struct platform_device *pdev)
>   ssi->use_dma = !of_property_read_bool(np, "fsl,fiq-stream-filter");
>  
>   if (fsl_ssi_is_ac97(ssi)) {
> - memcpy(>cpu_dai_drv, _ssi_ac97_dai,
> -sizeof(fsl_ssi_ac97_dai));
> + cpu_dai_drv = _ssi_ac97_dai;
>   fsl_ac97_data = ssi;
>   } else {
> - memcpy(>cpu_dai_drv, _ssi_dai_template,
> -sizeof(fsl_ssi_dai_template));
> + cpu_dai_drv = _ssi_dai_template;
>   }
> - 

Re: [PATCH v5 00/17] ASoC: fsl_ssi: Clean up - program flow level

2018-01-17 Thread Maciej S. Szmigiero
On 17.01.2018 21:02, Nicolin Chen wrote:
> On Wed, Jan 17, 2018 at 08:38:48PM +0100, Maciej S. Szmigiero wrote:
> 
>> However, I have a small nitpick regarding a comment newly added in
>> this version of patch 16:
>> +/*
>> + * Do not set SSI dev as the parent of AC97 CODEC device since
>> + * it does not have a DT node. Otherwise ASoC core will assume
>> + * CODEC has the same DT node as the SSI, so it may return a
>> + * NULL pointer of CODEC when asked for SSI via the DT node
>>
>> The second part of the last sentence isn't really true, the ASoC core
>> will return a (valid, non-NULL) CODEC object pointer when asked for
>> the SSI one if we set the SSI as the parent device of a AC'97 CODEC
>> platform device.
>>
>> The NULL pointer dereference when starting a playback that I wrote
>> about in my previous message happens because in this situation the SSI
>> DAI probe callback won't ever get called and so won't setup DMA data
>> pointers (they will remain NULL).
> 
> Well, somehow the DMA data pointer of CODEC could be described
> as "a NULL pointer of CODEC" reluctantly...it confuses people
> though.
> 
>> And this in turn will cause the ASoC DMA code to dereference these
>> NULL pointers when starting a playback (the same will probably happen
>> also when starting a capture).
>>
>> Sorry if I wasn't 100% clear about these details in my previous
>> message describing this issue.
> 
> I would prefer to send an incremental patch later to update it,
> if there are no new comments against this version; Otherwise, I
> will update it in a next version once there is a need to send a
> v6 anyway.

IMHO it is such a tiny thing that it isn't worth respinning 17
patch series just for it, it can be easily improved later via
a separate patch.

> Thanks
> 

Thanks,
Maciej


Re: [PATCH v2 03/16] ASoC: fsl_ssi: Maintain a mask of active streams

2018-01-14 Thread Maciej S. Szmigiero
On 11.01.2018 07:43, Nicolin Chen wrote:
> Checking TE and RE bits in SCR register doesn't work for AC97 mode
> which enables SSIEN, TE and RE in the fsl_ssi_setup_ac97() that's
> called during probe().
> 
> So when running into the trigger(), it will always get the result
> of both TE and RE being enabled already, even if actually there is
> no active stream.
> 
> This patch fixes this issue by adding a variable to log the active
> streams manually.
> 
> Signed-off-by: Nicolin Chen 
> Tested-by: Caleb Crome 
> ---
>  sound/soc/fsl/fsl_ssi.c | 15 +++
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index 491b660..aa14a5d 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -201,6 +201,7 @@ struct fsl_ssi_soc_data {
>   * @cpu_dai_drv: CPU DAI driver for this device
>   *
>   * @dai_fmt: DAI configuration this device is currently used with
> + * @streams: Mask of current active streams: BIT(TX) and BIT(RX)
>   * @i2s_net: I2S and Network mode configurations of SCR register
>   * @use_dma: DMA is used or FIQ with stream filter
>   * @use_dual_fifo: DMA with support for dual FIFO mode
> @@ -245,6 +246,7 @@ struct fsl_ssi {
>   struct snd_soc_dai_driver cpu_dai_drv;
>  
>   unsigned int dai_fmt;
> + u8 streams;
>   u8 i2s_net;
>   bool use_dma;
>   bool use_dual_fifo;
> @@ -440,15 +442,14 @@ static void fsl_ssi_fifo_clear(struct fsl_ssi *ssi, 
> bool is_rx)
>  static void fsl_ssi_config(struct fsl_ssi *ssi, bool enable,
>  struct fsl_ssi_regvals *vals)
>  {
> + bool dir = (>regvals[TX] == vals) ? TX : RX;
Using a bool variable for a bit index (and array index in other parts
of code) looks just wrong.

Even a simple int would look better IMHO here (and in patch 5 that
rewrites this line a bit).

Maciej


Re: [PATCH v2 04/16] ASoC: fsl_ssi: Rename fsl_ssi_disable_val macro

2018-01-14 Thread Maciej S. Szmigiero
On 11.01.2018 07:43, Nicolin Chen wrote:
> The define of fsl_ssi_disable_val is not so clear as it mixes two
> steps of calculations together. And those parameter names are also
> a bit long to read.
> 
> Since it just tries to exclude the shared bits from the regvals of
> current stream while the opposite stream is active, it's better to
> use something like ssi_excl_shared_bits.
> 
> This patch also bisects fsl_ssi_disable_val into two macros of two
> corresponding steps and then shortens its parameter names. It also
> updates callers in the fsl_ssi_config() accordingly.
> 
> Signed-off-by: Nicolin Chen 
> Tested-by: Caleb Crome 
> ---
>  sound/soc/fsl/fsl_ssi.c | 54 
> -
>  1 file changed, 22 insertions(+), 32 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index aa14a5d..f026386 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -445,16 +445,10 @@ static void fsl_ssi_config(struct fsl_ssi *ssi, bool 
> enable,
>   bool dir = (>regvals[TX] == vals) ? TX : RX;
>   struct regmap *regs = ssi->regs;
>   struct fsl_ssi_regvals *avals;
> - int nr_active_streams;
> - int keep_active;
> -
> - nr_active_streams = !!(ssi->streams & BIT(TX)) +
> - !!(ssi->streams & BIT(RX));
> + bool aactive;
>  
> - if (nr_active_streams - 1 > 0)
> - keep_active = 1;
> - else
> - keep_active = 0;
> + /* Check if the opposite stream is active */
> + aactive = ssi->streams & BIT(!dir);
 ^
Here an implicit assumption that either RX == 0, TX == 1 or
RX == 1, TX == 0 still remains.

Maciej


[PATCH v2 3/5] ALSA: emu10k1: add optional debug printouts with DMA addresses

2018-01-27 Thread Maciej S. Szmigiero
When we get a IOMMU page fault for a emu10k1 device it is very hard to
discover which of chip many DMA allocations triggered it (since on a IOMMU
system the DMA address space is often very different from the CPU one).
Let's add optional debug printouts providing this information.

These debug printouts are only enabled on an explicit request via the
kernel dynamic debug mechanism.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1: Fix a sparse warning by changing a variable type in
__{get,set}_ptb_entry from u32 to __le32, don't open code EMUPAGESIZE.

 sound/pci/emu10k1/emu10k1_main.c |  8 
 sound/pci/emu10k1/memory.c   | 15 +++
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index 11e868f569d6..8decd2a7a404 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -1898,6 +1898,9 @@ int snd_emu10k1_create(struct snd_card *card,
err = -ENOMEM;
goto error;
}
+   dev_dbg(card->dev, "page table address range is %.8lx:%.8lx\n",
+   (unsigned long)emu->ptb_pages.addr,
+   (unsigned long)(emu->ptb_pages.addr + emu->ptb_pages.bytes));
 
emu->page_ptr_table = vmalloc(emu->max_cache_pages * sizeof(void *));
emu->page_addr_table = vmalloc(emu->max_cache_pages *
@@ -1912,6 +1915,11 @@ int snd_emu10k1_create(struct snd_card *card,
err = -ENOMEM;
goto error;
}
+   dev_dbg(card->dev, "silent page range is %.8lx:%.8lx\n",
+   (unsigned long)emu->silent_page.addr,
+   (unsigned long)(emu->silent_page.addr +
+   emu->silent_page.bytes));
+
emu->memhdr = snd_util_memhdr_new(emu->max_cache_pages * PAGE_SIZE);
if (emu->memhdr == NULL) {
err = -ENOMEM;
diff --git a/sound/pci/emu10k1/memory.c b/sound/pci/emu10k1/memory.c
index eaa61024ac7f..fcb04cbbc9ab 100644
--- a/sound/pci/emu10k1/memory.c
+++ b/sound/pci/emu10k1/memory.c
@@ -34,7 +34,10 @@
  * aligned pages in others
  */
 #define __set_ptb_entry(emu,page,addr) \
-   (((u32 *)(emu)->ptb_pages.area)[page] = cpu_to_le32(((addr) << 
(emu->address_mode)) | (page)))
+   (((__le32 *)(emu)->ptb_pages.area)[page] = \
+cpu_to_le32(((addr) << (emu->address_mode)) | (page)))
+#define __get_ptb_entry(emu, page) \
+   (le32_to_cpu(((__le32 *)(emu)->ptb_pages.area)[page]))
 
 #define UNIT_PAGES (PAGE_SIZE / EMUPAGESIZE)
 #define MAX_ALIGN_PAGES0   (MAXPAGES0 / UNIT_PAGES)
@@ -44,8 +47,7 @@
 /* get offset address from aligned page */
 #define aligned_page_offset(page)  ((page) << PAGE_SHIFT)
 
-#if PAGE_SIZE == 4096
-/* page size == EMUPAGESIZE */
+#if PAGE_SIZE == EMUPAGESIZE && !IS_ENABLED(CONFIG_DYNAMIC_DEBUG)
 /* fill PTB entrie(s) corresponding to page with addr */
 #define set_ptb_entry(emu,page,addr)   __set_ptb_entry(emu,page,addr)
 /* fill PTB entrie(s) corresponding to page with silence pointer */
@@ -58,6 +60,8 @@ static inline void set_ptb_entry(struct snd_emu10k1 *emu, int 
page, dma_addr_t a
page *= UNIT_PAGES;
for (i = 0; i < UNIT_PAGES; i++, page++) {
__set_ptb_entry(emu, page, addr);
+   dev_dbg(emu->card->dev, "mapped page %d to entry %.8x\n", page,
+   (unsigned int)__get_ptb_entry(emu, page));
addr += EMUPAGESIZE;
}
 }
@@ -65,9 +69,12 @@ static inline void set_silent_ptb(struct snd_emu10k1 *emu, 
int page)
 {
int i;
page *= UNIT_PAGES;
-   for (i = 0; i < UNIT_PAGES; i++, page++)
+   for (i = 0; i < UNIT_PAGES; i++, page++) {
/* do not increment ptr */
__set_ptb_entry(emu, page, emu->silent_page.addr);
+   dev_dbg(emu->card->dev, "mapped silent page %d to entry %.8x\n",
+   page, (unsigned int)__get_ptb_entry(emu, page));
+   }
 }
 #endif /* PAGE_SIZE */
 


[PATCH v2 1/5] ALSA: emu10k1: remove reserved_page

2018-01-27 Thread Maciej S. Szmigiero
The emu10k1-family chips need the first page (index 0) reserved in their
page tables for some reason (every emu10k1 driver I've checked does this
without much of an explanation).
Using the first page for normal samples results in a broken playback.

However, we already have a dummy page allocated - so called "silent page"
and, in fact, had always been setting it as the first page in the chip page
table because an initialization of every entry of the page table to point
to a silent page happens after and overwrites the reserved_page allocation.

So the only thing remaining to remove the reserved_page allocation is a
trivial change to the page allocation logic to ignore the first page entry
and start its allocations from the second entry (index 1).

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1: None in this patch.

 include/sound/emu10k1.h  |  1 -
 sound/pci/emu10k1/emu10k1_main.c | 11 ---
 sound/pci/emu10k1/memory.c   |  8 ++--
 3 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/include/sound/emu10k1.h b/include/sound/emu10k1.h
index 4f42affe777c..db32b7de52e0 100644
--- a/include/sound/emu10k1.h
+++ b/include/sound/emu10k1.h
@@ -1718,7 +1718,6 @@ struct snd_emu10k1 {
struct snd_dma_buffer p16v_buffer;
 
struct snd_util_memhdr *memhdr; /* page allocation list */
-   struct snd_emu10k1_memblk *reserved_page;   /* reserved page */
 
struct list_head mapped_link_head;
struct list_head mapped_order_link_head;
diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index ccf4415a1c7b..a0a4d31ded53 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -1272,12 +1272,6 @@ static int snd_emu10k1_free(struct snd_emu10k1 *emu)
release_firmware(emu->dock_fw);
if (emu->irq >= 0)
free_irq(emu->irq, emu);
-   /* remove reserved page */
-   if (emu->reserved_page) {
-   snd_emu10k1_synth_free(emu,
-   (struct snd_util_memblk *)emu->reserved_page);
-   emu->reserved_page = NULL;
-   }
snd_util_memhdr_free(emu->memhdr);
if (emu->silent_page.area)
snd_dma_free_pages(>silent_page);
@@ -1993,11 +1987,6 @@ int snd_emu10k1_create(struct snd_card *card,
SPCS_GENERATIONSTATUS | 0x1200 |
0x | SPCS_EMPHASIS_NONE | SPCS_COPYRIGHT;
 
-   emu->reserved_page = (struct snd_emu10k1_memblk *)
-   snd_emu10k1_synth_alloc(emu, 4096);
-   if (emu->reserved_page)
-   emu->reserved_page->map_locked = 1;
-
/* Clear silent pages and set up pointers */
memset(emu->silent_page.area, 0, PAGE_SIZE);
silent_page = emu->silent_page.addr << emu->address_mode;
diff --git a/sound/pci/emu10k1/memory.c b/sound/pci/emu10k1/memory.c
index 4f1f69be1865..eaa61024ac7f 100644
--- a/sound/pci/emu10k1/memory.c
+++ b/sound/pci/emu10k1/memory.c
@@ -102,7 +102,7 @@ static void emu10k1_memblk_init(struct snd_emu10k1_memblk 
*blk)
  */
 static int search_empty_map_area(struct snd_emu10k1 *emu, int npages, struct 
list_head **nextp)
 {
-   int page = 0, found_page = -ENOMEM;
+   int page = 1, found_page = -ENOMEM;
int max_size = npages;
int size;
struct list_head *candidate = >mapped_link_head;
@@ -147,6 +147,10 @@ static int map_memblk(struct snd_emu10k1 *emu, struct 
snd_emu10k1_memblk *blk)
page = search_empty_map_area(emu, blk->pages, );
if (page < 0) /* not found */
return page;
+   if (page == 0) {
+   dev_err(emu->card->dev, "trying to map zero (reserved) page\n");
+   return -EINVAL;
+   }
/* insert this block in the proper position of mapped list */
list_add_tail(>mapped_link, next);
/* append this as a newest block in order list */
@@ -177,7 +181,7 @@ static int unmap_memblk(struct snd_emu10k1 *emu, struct 
snd_emu10k1_memblk *blk)
q = get_emu10k1_memblk(p, mapped_link);
start_page = q->mapped_page + q->pages;
} else
-   start_page = 0;
+   start_page = 1;
if ((p = blk->mapped_link.next) != >mapped_link_head) {
q = get_emu10k1_memblk(p, mapped_link);
end_page = q->mapped_page;


[PATCH v2 5/5] ALSA: emu10k1: add a IOMMU workaround

2018-01-27 Thread Maciej S. Szmigiero
The Audigy 2 CA0102 chip (but most likely others from the emu10k1 family,
too) has a problem that from time to time it likes to do few DMA reads a
bit beyond its normal allocation and gets very confused if these reads get
blocked by a IOMMU.

For the first (reserved) page this happens multiple times at every
playback, for various synth pages it happens randomly, rarely for PCM
playback buffers and the page table memory itself.
All these reads seem to follow a similar pattern, observed read offsets
beyond the allocation end were 0x00, 0x40, 0x80 and 0xc0 (PCI cache line
multiples), so it looks like the device tries to accesses up to 256 extra
bytes.

As a workaround let's widen these DMA allocations by an extra page if we
detect that the device is behind a non-passthrough IOMMU (the DMA memory
should be relatively plenty on IOMMU systems).

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1: Apply this workaround also to PCM playback buffers since
it seems they are affected, too.

 include/sound/emu10k1.h  |  1 +
 sound/pci/emu10k1/emu10k1_main.c | 50 +---
 sound/pci/emu10k1/emupcm.c   |  9 +++-
 sound/pci/emu10k1/memory.c   | 16 ++---
 4 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/include/sound/emu10k1.h b/include/sound/emu10k1.h
index db32b7de52e0..ba27abf65408 100644
--- a/include/sound/emu10k1.h
+++ b/include/sound/emu10k1.h
@@ -1710,6 +1710,7 @@ struct snd_emu10k1 {
unsigned int ecard_ctrl;/* ecard control bits */
unsigned int address_mode;  /* address mode */
unsigned long dma_mask; /* PCI DMA mask */
+   bool iommu_workaround;  /* IOMMU workaround needed */
unsigned int delay_pcm_irq; /* in samples */
int max_cache_pages;/* max memory size / PAGE_SIZE 
*/
struct snd_dma_buffer silent_page;  /* silent page */
diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index 8decd2a7a404..3638bff26d23 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1758,6 +1759,38 @@ static struct snd_emu_chip_details emu_chip_details[] = {
{ } /* terminator */
 };
 
+/*
+ * The chip (at least the Audigy 2 CA0102 chip, but most likely others, too)
+ * has a problem that from time to time it likes to do few DMA reads a bit
+ * beyond its normal allocation and gets very confused if these reads get
+ * blocked by a IOMMU.
+ *
+ * This behaviour has been observed for the first (reserved) page
+ * (for which it happens multiple times at every playback), often for various
+ * synth pages and sometimes for PCM playback buffers and the page table
+ * memory itself.
+ *
+ * As a workaround let's widen these DMA allocations by an extra page if we
+ * detect that the device is behind a non-passthrough IOMMU.
+ */
+static void snd_emu10k1_detect_iommu(struct snd_emu10k1 *emu)
+{
+   struct iommu_domain *domain;
+
+   emu->iommu_workaround = false;
+
+   if (!iommu_present(emu->card->dev->bus))
+   return;
+
+   domain = iommu_get_domain_for_dev(emu->card->dev);
+   if (domain && domain->type == IOMMU_DOMAIN_IDENTITY)
+   return;
+
+   dev_notice(emu->card->dev,
+  "non-passthrough IOMMU detected, widening DMA allocations");
+   emu->iommu_workaround = true;
+}
+
 int snd_emu10k1_create(struct snd_card *card,
   struct pci_dev *pci,
   unsigned short extin_mask,
@@ -1770,6 +1803,7 @@ int snd_emu10k1_create(struct snd_card *card,
struct snd_emu10k1 *emu;
int idx, err;
int is_audigy;
+   size_t page_table_size, silent_page_size;
unsigned int silent_page;
const struct snd_emu_chip_details *c;
static struct snd_device_ops ops = {
@@ -1867,6 +1901,8 @@ int snd_emu10k1_create(struct snd_card *card,
 
is_audigy = emu->audigy = c->emu10k2_chip;
 
+   snd_emu10k1_detect_iommu(emu);
+
/* set addressing mode */
emu->address_mode = is_audigy ? 0 : 1;
/* set the DMA transfer mask */
@@ -1893,8 +1929,13 @@ int snd_emu10k1_create(struct snd_card *card,
emu->port = pci_resource_start(pci, 0);
 
emu->max_cache_pages = max_cache_bytes >> PAGE_SHIFT;
+
+   page_table_size = sizeof(u32) * (emu->address_mode ? MAXPAGES1 :
+MAXPAGES0);
+   if (emu->iommu_workaround)
+   page_table_size += PAGE_SIZE;
if (snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV, snd_dma_pci_data(pci),
-   (emu->address_mode ? 32 : 16) * 1024, 
>ptb_pages) < 0) {
+  

[PATCH v2 2/5] ALSA: emu10k1: use dma_set_mask_and_coherent()

2018-01-27 Thread Maciej S. Szmigiero
We have been calling dma_set_mask() and then dma_set_coherent_mask() with
the same value, but there is a dma_set_mask_and_coherent() function that
does exactly that so let's use it instead.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1: None in this patch.

 sound/pci/emu10k1/emu10k1_main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index a0a4d31ded53..11e868f569d6 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -1871,8 +1871,7 @@ int snd_emu10k1_create(struct snd_card *card,
emu->address_mode = is_audigy ? 0 : 1;
/* set the DMA transfer mask */
emu->dma_mask = emu->address_mode ? EMU10K1_DMA_MASK : AUDIGY_DMA_MASK;
-   if (dma_set_mask(>dev, emu->dma_mask) < 0 ||
-   dma_set_coherent_mask(>dev, emu->dma_mask) < 0) {
+   if (dma_set_mask_and_coherent(>dev, emu->dma_mask) < 0) {
dev_err(card->dev,
"architecture does not support PCI busmaster DMA with 
mask 0x%lx\n",
emu->dma_mask);


[PATCH v2 4/5] ALSA: emu10k1: make sure synth DMA pages are allocated with DMA functions

2018-01-27 Thread Maciej S. Szmigiero
Commit a5003fc04113 ("[ALSA] emu10k1 - simplify page allocation for synth")
switched from using the DMA allocator for synth DMA pages to manually
calling alloc_page().
However, this usage has an implicit assumption that the DMA address space
for the emu10k1-family chip is the same as the CPU physical address space
which is not true for a system with a IOMMU.

Since this made the synth part of the driver non-functional on such systems
let's effectively revert that commit.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1: None in this patch.

 sound/pci/emu10k1/memory.c | 70 ++
 1 file changed, 39 insertions(+), 31 deletions(-)

diff --git a/sound/pci/emu10k1/memory.c b/sound/pci/emu10k1/memory.c
index fcb04cbbc9ab..5cdffe2d31e1 100644
--- a/sound/pci/emu10k1/memory.c
+++ b/sound/pci/emu10k1/memory.c
@@ -457,49 +457,44 @@ static void get_single_page_range(struct snd_util_memhdr 
*hdr,
*last_page_ret = last_page;
 }
 
-/* release allocated pages */
-static void __synth_free_pages(struct snd_emu10k1 *emu, int first_page,
-  int last_page)
-{
-   int page;
-
-   for (page = first_page; page <= last_page; page++) {
-   free_page((unsigned long)emu->page_ptr_table[page]);
-   emu->page_addr_table[page] = 0;
-   emu->page_ptr_table[page] = NULL;
-   }
-}
-
 /*
  * allocate kernel pages
  */
 static int synth_alloc_pages(struct snd_emu10k1 *emu, struct 
snd_emu10k1_memblk *blk)
 {
int page, first_page, last_page;
+   struct snd_dma_buffer dmab;
 
emu10k1_memblk_init(blk);
get_single_page_range(emu->memhdr, blk, _page, _page);
/* allocate kernel pages */
for (page = first_page; page <= last_page; page++) {
-   /* first try to allocate from <4GB zone */
-   struct page *p = alloc_page(GFP_KERNEL | GFP_DMA32 |
-   __GFP_NOWARN);
-   if (!p || (page_to_pfn(p) & ~(emu->dma_mask >> PAGE_SHIFT))) {
-   if (p)
-   __free_page(p);
-   /* try to allocate from <16MB zone */
-   p = alloc_page(GFP_ATOMIC | GFP_DMA |
-  __GFP_NORETRY | /* no OOM-killer */
-  __GFP_NOWARN);
-   }
-   if (!p) {
-   __synth_free_pages(emu, first_page, page - 1);
-   return -ENOMEM;
+   if (snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV,
+   snd_dma_pci_data(emu->pci),
+   PAGE_SIZE, ) < 0)
+   goto __fail;
+   if (!is_valid_page(emu, dmab.addr)) {
+   snd_dma_free_pages();
+   goto __fail;
}
-   emu->page_addr_table[page] = page_to_phys(p);
-   emu->page_ptr_table[page] = page_address(p);
+   emu->page_addr_table[page] = dmab.addr;
+   emu->page_ptr_table[page] = dmab.area;
}
return 0;
+
+__fail:
+   /* release allocated pages */
+   last_page = page - 1;
+   for (page = first_page; page <= last_page; page++) {
+   dmab.area = emu->page_ptr_table[page];
+   dmab.addr = emu->page_addr_table[page];
+   dmab.bytes = PAGE_SIZE;
+   snd_dma_free_pages();
+   emu->page_addr_table[page] = 0;
+   emu->page_ptr_table[page] = NULL;
+   }
+
+   return -ENOMEM;
 }
 
 /*
@@ -507,10 +502,23 @@ static int synth_alloc_pages(struct snd_emu10k1 *emu, 
struct snd_emu10k1_memblk
  */
 static int synth_free_pages(struct snd_emu10k1 *emu, struct snd_emu10k1_memblk 
*blk)
 {
-   int first_page, last_page;
+   int page, first_page, last_page;
+   struct snd_dma_buffer dmab;
 
get_single_page_range(emu->memhdr, blk, _page, _page);
-   __synth_free_pages(emu, first_page, last_page);
+   dmab.dev.type = SNDRV_DMA_TYPE_DEV;
+   dmab.dev.dev = snd_dma_pci_data(emu->pci);
+   for (page = first_page; page <= last_page; page++) {
+   if (emu->page_ptr_table[page] == NULL)
+   continue;
+   dmab.area = emu->page_ptr_table[page];
+   dmab.addr = emu->page_addr_table[page];
+   dmab.bytes = PAGE_SIZE;
+   snd_dma_free_pages();
+   emu->page_addr_table[page] = 0;
+   emu->page_ptr_table[page] = NULL;
+   }
+
return 0;
 }
 


Re: [PATCH v2 5/5] ALSA: emu10k1: add a IOMMU workaround

2018-02-12 Thread Maciej S. Szmigiero
On 12.02.2018 13:56, Takashi Iwai wrote:
> On Sat, 27 Jan 2018 15:42:59 +0100,
>  Maciej S. Szmigiero  wrote:
>>
>> The Audigy 2 CA0102 chip (but most likely others from the emu10k1 family,
>> too) has a problem that from time to time it likes to do few DMA reads a
>> bit beyond its normal allocation and gets very confused if these reads get
>> blocked by a IOMMU.
>>
>> For the first (reserved) page this happens multiple times at every
>> playback, for various synth pages it happens randomly, rarely for PCM
>> playback buffers and the page table memory itself.
>> All these reads seem to follow a similar pattern, observed read offsets
>> beyond the allocation end were 0x00, 0x40, 0x80 and 0xc0 (PCI cache line
>> multiples), so it looks like the device tries to accesses up to 256 extra
>> bytes.
>>
>> As a workaround let's widen these DMA allocations by an extra page if we
>> detect that the device is behind a non-passthrough IOMMU (the DMA memory
>> should be relatively plenty on IOMMU systems).
>>
>> Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
>> ---
>> Changes from v1: Apply this workaround also to PCM playback buffers since
>> it seems they are affected, too.
> 
> Instead of adjusting the allocation size in the caller side, how about
> adding a new helper to wrap around the call of snd_dma_alloc_pages()?
> 
> We may need a counterpart to free pages in synth, but it's a single
> place in __synth_free_pages(), so it can be open-coded with some
> proper comments, too.

I guess you mean adding a new wrapper to the ALSA core somewhere near
snd_dma_alloc_pages() (something named like
snd_dma_dev_alloc_pages_maybe_wider() ?).

Since snd_dma_alloc_pages() currently takes only a "struct device"
per-device parameter we wouldn't have a place to store a flag indicating
whether a device needs this workaround (or not) so we would need to
detect it every time this wrapper function gets called - in contrast,
in the current implementation this is done just once at the device
initialization time in snd_emu10k1_detect_iommu().

There are only 3 allocations that use snd_dma_alloc_pages() in this
driver that would use this new wrapper function, and each time the
overhead is just a two-line "if" block.
If one excludes synth, since it already uses a helper function to
compute these allocations lengths, that count lowers to only 2 places.

That's why I think a driver-local change here is enough.

Also, there is always a possibility to refactor the code into a common
helper if it turns out that there are other sound card with the same
problem.

> 
> thanks,
> 
> Takashi

Thanks,
Maciej


Re: [PATCH v2 4/5] ALSA: emu10k1: make sure synth DMA pages are allocated with DMA functions

2018-02-12 Thread Maciej S. Szmigiero
On 12.02.2018 13:44, Takashi Iwai wrote:
> On Sat, 27 Jan 2018 15:42:47 +0100,
>  Maciej S. Szmigiero  wrote:
>>
>> Commit a5003fc04113 ("[ALSA] emu10k1 - simplify page allocation for synth")
>> switched from using the DMA allocator for synth DMA pages to manually
>> calling alloc_page().
>> However, this usage has an implicit assumption that the DMA address space
>> for the emu10k1-family chip is the same as the CPU physical address space
>> which is not true for a system with a IOMMU.
>>
>> Since this made the synth part of the driver non-functional on such systems
>> let's effectively revert that commit.
> 
> Let's keep (or re-add after this revert) the simplification with
> __synth_free_pages().  Other than that I have no objection to this
> change.

Will keep the simplification via __synth_free_pages() function in a respin.

> thanks,
> 
> Takashi

Thanks,
Maciej


[PATCH v3 4/5] ALSA: emu10k1: make sure synth DMA pages are allocated with DMA functions

2018-02-13 Thread Maciej S. Szmigiero
Commit a5003fc04113 ("[ALSA] emu10k1 - simplify page allocation for synth")
switched from using the DMA allocator for synth DMA pages to manually
calling alloc_page().
However, this usage has an implicit assumption that the DMA address space
for the emu10k1-family chip is the same as the CPU physical address space
which is not true for a system with a IOMMU.

Since this made the synth part of the driver non-functional on such systems
let's effectively revert that commit (while keeping the
__synth_free_pages() simplification).

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1: None in this patch.

Changes from v2: Keep the __synth_free_pages() simplification.

 sound/pci/emu10k1/memory.c | 44 +++-
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/sound/pci/emu10k1/memory.c b/sound/pci/emu10k1/memory.c
index fcb04cbbc9ab..1d0ce7356bbd 100644
--- a/sound/pci/emu10k1/memory.c
+++ b/sound/pci/emu10k1/memory.c
@@ -461,10 +461,19 @@ static void get_single_page_range(struct snd_util_memhdr 
*hdr,
 static void __synth_free_pages(struct snd_emu10k1 *emu, int first_page,
   int last_page)
 {
+   struct snd_dma_buffer dmab;
int page;
 
+   dmab.dev.type = SNDRV_DMA_TYPE_DEV;
+   dmab.dev.dev = snd_dma_pci_data(emu->pci);
+
for (page = first_page; page <= last_page; page++) {
-   free_page((unsigned long)emu->page_ptr_table[page]);
+   if (emu->page_ptr_table[page] == NULL)
+   continue;
+   dmab.area = emu->page_ptr_table[page];
+   dmab.addr = emu->page_addr_table[page];
+   dmab.bytes = PAGE_SIZE;
+   snd_dma_free_pages();
emu->page_addr_table[page] = 0;
emu->page_ptr_table[page] = NULL;
}
@@ -476,30 +485,31 @@ static void __synth_free_pages(struct snd_emu10k1 *emu, 
int first_page,
 static int synth_alloc_pages(struct snd_emu10k1 *emu, struct 
snd_emu10k1_memblk *blk)
 {
int page, first_page, last_page;
+   struct snd_dma_buffer dmab;
 
emu10k1_memblk_init(blk);
get_single_page_range(emu->memhdr, blk, _page, _page);
/* allocate kernel pages */
for (page = first_page; page <= last_page; page++) {
-   /* first try to allocate from <4GB zone */
-   struct page *p = alloc_page(GFP_KERNEL | GFP_DMA32 |
-   __GFP_NOWARN);
-   if (!p || (page_to_pfn(p) & ~(emu->dma_mask >> PAGE_SHIFT))) {
-   if (p)
-   __free_page(p);
-   /* try to allocate from <16MB zone */
-   p = alloc_page(GFP_ATOMIC | GFP_DMA |
-  __GFP_NORETRY | /* no OOM-killer */
-  __GFP_NOWARN);
-   }
-   if (!p) {
-   __synth_free_pages(emu, first_page, page - 1);
-   return -ENOMEM;
+   if (snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV,
+   snd_dma_pci_data(emu->pci),
+   PAGE_SIZE, ) < 0)
+   goto __fail;
+   if (!is_valid_page(emu, dmab.addr)) {
+   snd_dma_free_pages();
+   goto __fail;
}
-   emu->page_addr_table[page] = page_to_phys(p);
-   emu->page_ptr_table[page] = page_address(p);
+   emu->page_addr_table[page] = dmab.addr;
+   emu->page_ptr_table[page] = dmab.area;
}
return 0;
+
+__fail:
+   /* release allocated pages */
+   last_page = page - 1;
+   __synth_free_pages(emu, first_page, last_page);
+
+   return -ENOMEM;
 }
 
 /*



Re: [PATCH v2 5/5] ALSA: emu10k1: add a IOMMU workaround

2018-02-13 Thread Maciej S. Szmigiero
On 13.02.2018 06:00, Takashi Iwai wrote:
> On Mon, 12 Feb 2018 23:13:13 +0100,
> Maciej S. Szmigiero wrote:
>>
>> On 12.02.2018 13:56, Takashi Iwai wrote:
>>> On Sat, 27 Jan 2018 15:42:59 +0100,
>>>  Maciej S. Szmigiero  wrote:
>>>>
>>>> The Audigy 2 CA0102 chip (but most likely others from the emu10k1 family,
>>>> too) has a problem that from time to time it likes to do few DMA reads a
>>>> bit beyond its normal allocation and gets very confused if these reads get
>>>> blocked by a IOMMU.
>>>>
>>>> For the first (reserved) page this happens multiple times at every
>>>> playback, for various synth pages it happens randomly, rarely for PCM
>>>> playback buffers and the page table memory itself.
>>>> All these reads seem to follow a similar pattern, observed read offsets
>>>> beyond the allocation end were 0x00, 0x40, 0x80 and 0xc0 (PCI cache line
>>>> multiples), so it looks like the device tries to accesses up to 256 extra
>>>> bytes.
>>>>
>>>> As a workaround let's widen these DMA allocations by an extra page if we
>>>> detect that the device is behind a non-passthrough IOMMU (the DMA memory
>>>> should be relatively plenty on IOMMU systems).
>>>>
>>>> Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
>>>> ---
>>>> Changes from v1: Apply this workaround also to PCM playback buffers since
>>>> it seems they are affected, too.
>>>
>>> Instead of adjusting the allocation size in the caller side, how about
>>> adding a new helper to wrap around the call of snd_dma_alloc_pages()?
>>>
>>> We may need a counterpart to free pages in synth, but it's a single
>>> place in __synth_free_pages(), so it can be open-coded with some
>>> proper comments, too.
>>
>> I guess you mean adding a new wrapper to the ALSA core somewhere near
>> snd_dma_alloc_pages() (something named like
>> snd_dma_dev_alloc_pages_maybe_wider() ?).
> 
> Well, not a common code but emu10k1 specific, something like
> 
> int snd_emu10k1_alloc_pages_maybe_wider(emu, size, res)
> {
>   if (emu->iommu_workaround)
>   size += PAGE_SIZE;
>   return snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV,
>   snd_dma_pci_data(emu->pci), size, res);
> }

Okay, I have added a function like the one above to the driver and
used it instead to make the relevant allocations.

> Also, I wonder what if PAGE_SIZE is over 4k.  In that case, we don't
> necessarily need to increase the size, if the allocated size is larger
> than the requested one?  But iommu_workaround is likely only about
> x86, so we don't need to care about it much, I guess.

For completeness, I have added a check to the allocation function
described above whether rounding the allocation size up to the nearest
page size already covers the necessary extra memory space.

> 
> Takashi
> 

Maciej


[PATCH v3 2/5] ALSA: emu10k1: use dma_set_mask_and_coherent()

2018-02-13 Thread Maciej S. Szmigiero
We have been calling dma_set_mask() and then dma_set_coherent_mask() with
the same value, but there is a dma_set_mask_and_coherent() function that
does exactly that so let's use it instead.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1, v2: None in this patch.

 sound/pci/emu10k1/emu10k1_main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index a0a4d31ded53..11e868f569d6 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -1871,8 +1871,7 @@ int snd_emu10k1_create(struct snd_card *card,
emu->address_mode = is_audigy ? 0 : 1;
/* set the DMA transfer mask */
emu->dma_mask = emu->address_mode ? EMU10K1_DMA_MASK : AUDIGY_DMA_MASK;
-   if (dma_set_mask(>dev, emu->dma_mask) < 0 ||
-   dma_set_coherent_mask(>dev, emu->dma_mask) < 0) {
+   if (dma_set_mask_and_coherent(>dev, emu->dma_mask) < 0) {
dev_err(card->dev,
"architecture does not support PCI busmaster DMA with 
mask 0x%lx\n",
emu->dma_mask);



[PATCH v3 1/5] ALSA: emu10k1: remove reserved_page

2018-02-13 Thread Maciej S. Szmigiero
The emu10k1-family chips need the first page (index 0) reserved in their
page tables for some reason (every emu10k1 driver I've checked does this
without much of an explanation).
Using the first page for normal samples results in a broken playback.

However, we already have a dummy page allocated - so called "silent page"
and, in fact, had always been setting it as the first page in the chip page
table because an initialization of every entry of the page table to point
to a silent page happens after and overwrites the reserved_page allocation.

So the only thing remaining to remove the reserved_page allocation is a
trivial change to the page allocation logic to ignore the first page entry
and start its allocations from the second entry (index 1).

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1, v2: None in this patch.

 include/sound/emu10k1.h  |  1 -
 sound/pci/emu10k1/emu10k1_main.c | 11 ---
 sound/pci/emu10k1/memory.c   |  8 ++--
 3 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/include/sound/emu10k1.h b/include/sound/emu10k1.h
index 4f42affe777c..db32b7de52e0 100644
--- a/include/sound/emu10k1.h
+++ b/include/sound/emu10k1.h
@@ -1718,7 +1718,6 @@ struct snd_emu10k1 {
struct snd_dma_buffer p16v_buffer;
 
struct snd_util_memhdr *memhdr; /* page allocation list */
-   struct snd_emu10k1_memblk *reserved_page;   /* reserved page */
 
struct list_head mapped_link_head;
struct list_head mapped_order_link_head;
diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index ccf4415a1c7b..a0a4d31ded53 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -1272,12 +1272,6 @@ static int snd_emu10k1_free(struct snd_emu10k1 *emu)
release_firmware(emu->dock_fw);
if (emu->irq >= 0)
free_irq(emu->irq, emu);
-   /* remove reserved page */
-   if (emu->reserved_page) {
-   snd_emu10k1_synth_free(emu,
-   (struct snd_util_memblk *)emu->reserved_page);
-   emu->reserved_page = NULL;
-   }
snd_util_memhdr_free(emu->memhdr);
if (emu->silent_page.area)
snd_dma_free_pages(>silent_page);
@@ -1993,11 +1987,6 @@ int snd_emu10k1_create(struct snd_card *card,
SPCS_GENERATIONSTATUS | 0x1200 |
0x | SPCS_EMPHASIS_NONE | SPCS_COPYRIGHT;
 
-   emu->reserved_page = (struct snd_emu10k1_memblk *)
-   snd_emu10k1_synth_alloc(emu, 4096);
-   if (emu->reserved_page)
-   emu->reserved_page->map_locked = 1;
-
/* Clear silent pages and set up pointers */
memset(emu->silent_page.area, 0, PAGE_SIZE);
silent_page = emu->silent_page.addr << emu->address_mode;
diff --git a/sound/pci/emu10k1/memory.c b/sound/pci/emu10k1/memory.c
index 4f1f69be1865..eaa61024ac7f 100644
--- a/sound/pci/emu10k1/memory.c
+++ b/sound/pci/emu10k1/memory.c
@@ -102,7 +102,7 @@ static void emu10k1_memblk_init(struct snd_emu10k1_memblk 
*blk)
  */
 static int search_empty_map_area(struct snd_emu10k1 *emu, int npages, struct 
list_head **nextp)
 {
-   int page = 0, found_page = -ENOMEM;
+   int page = 1, found_page = -ENOMEM;
int max_size = npages;
int size;
struct list_head *candidate = >mapped_link_head;
@@ -147,6 +147,10 @@ static int map_memblk(struct snd_emu10k1 *emu, struct 
snd_emu10k1_memblk *blk)
page = search_empty_map_area(emu, blk->pages, );
if (page < 0) /* not found */
return page;
+   if (page == 0) {
+   dev_err(emu->card->dev, "trying to map zero (reserved) page\n");
+   return -EINVAL;
+   }
/* insert this block in the proper position of mapped list */
list_add_tail(>mapped_link, next);
/* append this as a newest block in order list */
@@ -177,7 +181,7 @@ static int unmap_memblk(struct snd_emu10k1 *emu, struct 
snd_emu10k1_memblk *blk)
q = get_emu10k1_memblk(p, mapped_link);
start_page = q->mapped_page + q->pages;
} else
-   start_page = 0;
+   start_page = 1;
if ((p = blk->mapped_link.next) != >mapped_link_head) {
q = get_emu10k1_memblk(p, mapped_link);
end_page = q->mapped_page;



[PATCH v3 3/5] ALSA: emu10k1: add optional debug printouts with DMA addresses

2018-02-13 Thread Maciej S. Szmigiero
When we get a IOMMU page fault for a emu10k1 device it is very hard to
discover which of chip many DMA allocations triggered it (since on a IOMMU
system the DMA address space is often very different from the CPU one).
Let's add optional debug printouts providing this information.

These debug printouts are only enabled on an explicit request via the
kernel dynamic debug mechanism.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1: Fix a sparse warning by changing a variable type in
__{get,set}_ptb_entry from u32 to __le32, don't open code EMUPAGESIZE.

Changes from v2: None in this patch.

 sound/pci/emu10k1/emu10k1_main.c |  8 
 sound/pci/emu10k1/memory.c   | 15 +++
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index 11e868f569d6..8decd2a7a404 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -1898,6 +1898,9 @@ int snd_emu10k1_create(struct snd_card *card,
err = -ENOMEM;
goto error;
}
+   dev_dbg(card->dev, "page table address range is %.8lx:%.8lx\n",
+   (unsigned long)emu->ptb_pages.addr,
+   (unsigned long)(emu->ptb_pages.addr + emu->ptb_pages.bytes));
 
emu->page_ptr_table = vmalloc(emu->max_cache_pages * sizeof(void *));
emu->page_addr_table = vmalloc(emu->max_cache_pages *
@@ -1912,6 +1915,11 @@ int snd_emu10k1_create(struct snd_card *card,
err = -ENOMEM;
goto error;
}
+   dev_dbg(card->dev, "silent page range is %.8lx:%.8lx\n",
+   (unsigned long)emu->silent_page.addr,
+   (unsigned long)(emu->silent_page.addr +
+   emu->silent_page.bytes));
+
emu->memhdr = snd_util_memhdr_new(emu->max_cache_pages * PAGE_SIZE);
if (emu->memhdr == NULL) {
err = -ENOMEM;
diff --git a/sound/pci/emu10k1/memory.c b/sound/pci/emu10k1/memory.c
index eaa61024ac7f..fcb04cbbc9ab 100644
--- a/sound/pci/emu10k1/memory.c
+++ b/sound/pci/emu10k1/memory.c
@@ -34,7 +34,10 @@
  * aligned pages in others
  */
 #define __set_ptb_entry(emu,page,addr) \
-   (((u32 *)(emu)->ptb_pages.area)[page] = cpu_to_le32(((addr) << 
(emu->address_mode)) | (page)))
+   (((__le32 *)(emu)->ptb_pages.area)[page] = \
+cpu_to_le32(((addr) << (emu->address_mode)) | (page)))
+#define __get_ptb_entry(emu, page) \
+   (le32_to_cpu(((__le32 *)(emu)->ptb_pages.area)[page]))
 
 #define UNIT_PAGES (PAGE_SIZE / EMUPAGESIZE)
 #define MAX_ALIGN_PAGES0   (MAXPAGES0 / UNIT_PAGES)
@@ -44,8 +47,7 @@
 /* get offset address from aligned page */
 #define aligned_page_offset(page)  ((page) << PAGE_SHIFT)
 
-#if PAGE_SIZE == 4096
-/* page size == EMUPAGESIZE */
+#if PAGE_SIZE == EMUPAGESIZE && !IS_ENABLED(CONFIG_DYNAMIC_DEBUG)
 /* fill PTB entrie(s) corresponding to page with addr */
 #define set_ptb_entry(emu,page,addr)   __set_ptb_entry(emu,page,addr)
 /* fill PTB entrie(s) corresponding to page with silence pointer */
@@ -58,6 +60,8 @@ static inline void set_ptb_entry(struct snd_emu10k1 *emu, int 
page, dma_addr_t a
page *= UNIT_PAGES;
for (i = 0; i < UNIT_PAGES; i++, page++) {
__set_ptb_entry(emu, page, addr);
+   dev_dbg(emu->card->dev, "mapped page %d to entry %.8x\n", page,
+   (unsigned int)__get_ptb_entry(emu, page));
addr += EMUPAGESIZE;
}
 }
@@ -65,9 +69,12 @@ static inline void set_silent_ptb(struct snd_emu10k1 *emu, 
int page)
 {
int i;
page *= UNIT_PAGES;
-   for (i = 0; i < UNIT_PAGES; i++, page++)
+   for (i = 0; i < UNIT_PAGES; i++, page++) {
/* do not increment ptr */
__set_ptb_entry(emu, page, emu->silent_page.addr);
+   dev_dbg(emu->card->dev, "mapped silent page %d to entry %.8x\n",
+   page, (unsigned int)__get_ptb_entry(emu, page));
+   }
 }
 #endif /* PAGE_SIZE */
 



[PATCH v3 5/5] ALSA: emu10k1: add a IOMMU workaround

2018-02-13 Thread Maciej S. Szmigiero
The Audigy 2 CA0102 chip (but most likely others from the emu10k1 family,
too) has a problem that from time to time it likes to do few DMA reads a
bit beyond its normal allocation and gets very confused if these reads get
blocked by a IOMMU.

For the first (reserved) page this happens multiple times at every
playback, for various synth pages it happens randomly, rarely for PCM
playback buffers and the page table memory itself.
All these reads seem to follow a similar pattern, observed read offsets
beyond the allocation end were 0x00, 0x40, 0x80 and 0xc0 (PCI cache line
multiples), so it looks like the device tries to accesses up to 256 extra
bytes.

As a workaround let's widen these DMA allocations by an extra page if we
detect that the device is behind a non-passthrough IOMMU (the DMA memory
should be relatively plenty on IOMMU systems).

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1: Apply this workaround also to PCM playback buffers since
it seems they are affected, too.

Changes from v2: Use a common snd_emu10k1_alloc_pages_maybe_wider() helper
function, check if rounding the allocation size up to the nearest page
size already covers the necessary extra memory space.

 include/sound/emu10k1.h  |  3 +++
 sound/pci/emu10k1/emu10k1_main.c | 49 
 sound/pci/emu10k1/emupcm.c   | 10 +++-
 sound/pci/emu10k1/memory.c   | 40 +---
 4 files changed, 93 insertions(+), 9 deletions(-)

diff --git a/include/sound/emu10k1.h b/include/sound/emu10k1.h
index db32b7de52e0..5ebcc51c0a6a 100644
--- a/include/sound/emu10k1.h
+++ b/include/sound/emu10k1.h
@@ -1710,6 +1710,7 @@ struct snd_emu10k1 {
unsigned int ecard_ctrl;/* ecard control bits */
unsigned int address_mode;  /* address mode */
unsigned long dma_mask; /* PCI DMA mask */
+   bool iommu_workaround;  /* IOMMU workaround needed */
unsigned int delay_pcm_irq; /* in samples */
int max_cache_pages;/* max memory size / PAGE_SIZE 
*/
struct snd_dma_buffer silent_page;  /* silent page */
@@ -1877,6 +1878,8 @@ void snd_p16v_resume(struct snd_emu10k1 *emu);
 /* memory allocation */
 struct snd_util_memblk *snd_emu10k1_alloc_pages(struct snd_emu10k1 *emu, 
struct snd_pcm_substream *substream);
 int snd_emu10k1_free_pages(struct snd_emu10k1 *emu, struct snd_util_memblk 
*blk);
+int snd_emu10k1_alloc_pages_maybe_wider(struct snd_emu10k1 *emu, size_t size,
+   struct snd_dma_buffer *dmab);
 struct snd_util_memblk *snd_emu10k1_synth_alloc(struct snd_emu10k1 *emu, 
unsigned int size);
 int snd_emu10k1_synth_free(struct snd_emu10k1 *emu, struct snd_util_memblk 
*blk);
 int snd_emu10k1_synth_bzero(struct snd_emu10k1 *emu, struct snd_util_memblk 
*blk, int offset, int size);
diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index 8decd2a7a404..18267de3a269 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1758,6 +1759,38 @@ static struct snd_emu_chip_details emu_chip_details[] = {
{ } /* terminator */
 };
 
+/*
+ * The chip (at least the Audigy 2 CA0102 chip, but most likely others, too)
+ * has a problem that from time to time it likes to do few DMA reads a bit
+ * beyond its normal allocation and gets very confused if these reads get
+ * blocked by a IOMMU.
+ *
+ * This behaviour has been observed for the first (reserved) page
+ * (for which it happens multiple times at every playback), often for various
+ * synth pages and sometimes for PCM playback buffers and the page table
+ * memory itself.
+ *
+ * As a workaround let's widen these DMA allocations by an extra page if we
+ * detect that the device is behind a non-passthrough IOMMU.
+ */
+static void snd_emu10k1_detect_iommu(struct snd_emu10k1 *emu)
+{
+   struct iommu_domain *domain;
+
+   emu->iommu_workaround = false;
+
+   if (!iommu_present(emu->card->dev->bus))
+   return;
+
+   domain = iommu_get_domain_for_dev(emu->card->dev);
+   if (domain && domain->type == IOMMU_DOMAIN_IDENTITY)
+   return;
+
+   dev_notice(emu->card->dev,
+  "non-passthrough IOMMU detected, widening DMA allocations");
+   emu->iommu_workaround = true;
+}
+
 int snd_emu10k1_create(struct snd_card *card,
   struct pci_dev *pci,
   unsigned short extin_mask,
@@ -1770,6 +1803,7 @@ int snd_emu10k1_create(struct snd_card *card,
struct snd_emu10k1 *emu;
int idx, err;
int is_audigy;
+   size_t page_table_size;
unsigned int silent_page;
const struct snd_emu_chip_details *c;
 

RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Maciej S. Szmigiero
On 18.11.2017 06:14, Kees Cook wrote:
> On Fri, Nov 17, 2017 at 5:54 PM, Patrick McLean  wrote:
>> On 2017-11-17 04:55 PM, Linus Torvalds wrote:
>>> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean  wrote:

 I am still getting the crash at d9e12200852d, I figured I would
 double-check the "good" and "bad" kernels before starting a full bisect.
>>>
>>> .. but without GCC_PLUGIN_RANDSTRUCT it's solid?
>>
>> Yes, without GCC_PLUGIN_RANDSTRUCT it's solid.
> 
> That's strange. With d9e12200852d the shuffle_seed variables won't
> ever actually get used. (i.e. I wouldn't expect the seed to change any
> behavior.)
> 
> Can you confirm with something like this:
> 
> 
> for (i = 0; i < 4; i++) {
> seed[i] = shuffle_seed[i];
> 
> 
> You should see no reports of "Shuffling struct ..."
> 
> And if it reports nothing, and you're on d9e12200852d, can you confirm
> that switching to a "good" seed fixes it? (If it _does_, then I
> suspect a build artifact being left behind or something odd like
> that.)
> 
>>> Kees removed even the baseline "randomize pure function pointer
>>> structures", so at that commit, nothing should be randomized.
>>>
>>> But maybe the plugin code itself ends up confusing gcc somehow?
>>>
>>> Even when it doesn't actually do that "relayout_struct()" on the
>>> structure, it always does those TYPE_ATTRIBUTES() games.
> 
> FWIW, myself doing a build at d9e12200852d with and without
> GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output
> where I did spot-checks.
> 

I have also hit a GPF in nfsd4_encode_fattr() with RANDSTRUCT plugin
enabled.
This function is located in a fs/nfsd/nfs4xdr.c file.

The fault happened at "xdr_encode_hyper(p, 
exp->ex_path.mnt->mnt_sb->s_maxbytes)"
line, namely when accessing s_maxbytes.

exp->ex_path is of type struct path that has been annotated with
__randomize_layout.
It seems to me that this annotation isn't really taken into consideration
when compiling nfs4xdr.c.
This most likely results in dereferencing a value of exp->ex_path.dentry
instead of exp->ex_path.mnt. Then some member of struct dentry is
dereferenced as struct super_block to access its s_maxbytes member which
results in an oops if it happens to be an invalid pointer (which it was
in my case).

How to reproduce the problem statically (tested on current Linus's tree
and on 4.15.4, with gcc 7.3.0):
1) Enable RANDSTRUCT plugin,

2) Use a RANDSTRUCT seed that results in shuffling of struct path,
Example: "55e5fea7ff662b333a190209ab31f35b6f0f2470f7d0e3c64430936169571106".

3) make fs/nfsd/nfs4xdr.s and save the result,

4) Insert "#include " at the top of
fs/nfsd/nfs4xdr.c as the very first include directive.

5) make fs/nfsd/nfs4xdr.s and compare the result with the one from step 3.

One can see that offsets used to access various members of struct path are
different, and also that the original file from step 3 contains an object
named "__randomize_layout".

This is caused by a fact that the current version of nfs4xdr.c includes
linux/fs_struct.h as the very first included header which then includes
linux/path.h as the very first included header, which then defines
struct path, but without including any files on its own.

This results in __randomize_layout tag at the end of struct path
definition being treated as a variable name (since linux/compiler-gcc.h
that defines it as a type attribute has not been included yet).

It looks like to me that every header file that defines a randomized
struct also has to include linux/compiler_types.h or some other file
that ultimately results in that file inclusion in order to make
the RANDSTRUCT plugin work correctly.

Maciej


Re: [PATCH v2 0/5] Change ISA_BUS_API dependency to selection

2018-02-19 Thread Maciej S. Szmigiero
Hi all,

On 31.01.2018 03:22, William Breathitt Gray wrote:
> On Tue, Jan 02, 2018 at 10:29:30AM +0100, Linus Walleij wrote:
>> On Fri, Dec 29, 2017 at 9:13 PM, William Breathitt Gray
>>  wrote:
>>
>>> Linus, please pickup this entire patchset through your GPIO subsystem
>>> tree; a recursive dependency error is present if these patches are
>>> cherry-picked (see https://lkml.org/lkml/2017/12/26/235), so they should
>>> be merged together in the same tree.
>>
>> OK I just need a bunch of ACKs so giving subsystem maintainers some
>> days to come back on this.
>>
>> Yours,
>> Linus Walleij
> 
> Hi Linus,
> 
> Do you have any objections to picking up this patchset now; or would you
> prefer to wait for some more input from the other maintainers?
> 
> William Breathitt Gray
> 

Anything still blocking this patchset?
It is a pure Kconfig change, however it has been waiting nearly
2 months now...

Maciej


Re: [PATCH v2] kconfig.h: Include compiler types to avoid missed struct attributes

2018-02-22 Thread Maciej S. Szmigiero
On 22.02.2018 01:28, Kees Cook wrote:
> The header files for some structures could get included in such a way
> that struct attributes (specifically __randomize_layout from path.h) would
> be parsed as variable names instead of attributes. This could lead to
> some instances of a structure being unrandomized, causing nasty GPFs, etc.
> 
> This patch makes sure the compiler_types.h header is included in
> kconfig.h so that we've always got types and struct attributes defined,
> since kconfig.h is included from the compiler command line.
> 
> Reported-by: Patrick McLean <chutz...@gentoo.org>
> Root-caused-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
> Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization")
> Suggested-by: Linus Torvalds <torva...@linux-foundation.org>
> Signed-off-by: Kees Cook <keesc...@chromium.org>

I can confirm that this patch fixes the original nfsd GPF issue.
Also, struct path members offsets are consistent now between nfs4xdr.s
and other files.

> ---
> Updated Maciej's tag.

Thanks.

Maciej


Re: [PATCH v2 3/5] gpio: Change ISA_BUS_API dependency to selection

2018-02-22 Thread Maciej S. Szmigiero
Hi William,
Hi Linus,

On 22.02.2018 21:30, William Breathitt Gray wrote:
> On Thu, Feb 22, 2018 at 04:16:17PM +0100, Linus Walleij wrote:
>> On Fri, Dec 29, 2017 at 9:13 PM, William Breathitt Gray
>>  wrote:
>>
>>> The ISA_BUS_API Kconfig option enables the compilation of the ISA bus
>>> driver. The ISA bus driver does not perform any hardware interaction,
>>> and is instead just a thin layer of software abstraction to eliminate
>>> boilerplate code common to ISA-style device drivers. Since ISA_BUS_API
>>> has no dependencies and does not jeopardize the integrity of the system
>>> when enabled, drivers should select it when the ISA bus driver
>>> functionality is needed.
>>>
>>> Cc: Linus Walleij 
>>> Signed-off-by: William Breathitt Gray 
>>
>> Patch applied to the GPIO tree for v4.17.
>>
>> Can you confirm that we don't have any dangling ISA
>> drivers not using this?
>>
>> Yours,
>> Linus Walleij
> 
> Hi Linus,
> 
> This patchset should cover all current mainline drivers depending on
> ISA_BUS_API.
> 

Thanks for merging this series!

Note that gpio-winbond (CONFIG_GPIO_WINBOND) driver was originally
merged using ISA_BUS_API selection as done for other ISA bus gpio drivers
by this patch, then temporary switched to the previous ("depends on")
style by commit 92a8046c9d952a to fix a circular Kconfig dependency.
So now this commit (92...) should be reverted to make gpio-winbond
driver Kconfig dependency / selection consistent with the remaining ISA
bus gpio drivers.

Unfortunately, I can't test the reversion myself because it looks like
Linus didn't push his trees yet to git.kernel.org after applying this
series (or I am not looking at the right place - linusw/linux-gpio.git?).

Best regards,
Maciej


[PATCH 1/3] X.509: unpack RSA signatureValue field from BIT STRING

2018-02-24 Thread Maciej S. Szmigiero
The signatureValue field of a X.509 certificate is encoded as a BIT STRING.
For RSA signatures this BIT STRING is of so-called primitive subtype, which
contains a u8 prefix indicating a count of unused bits in the encoding.

We have to strip this prefix from signature data, just as we already do for
key data in x509_extract_key_data() function.

This wasn't noticed earlier because this prefix byte is zero for RSA key
sizes divisible by 8. Since BIT STRING is a big-endian encoding adding zero
prefixes has no bearing on its value.

The signature length, however was incorrect, which is a problem for RSA
implementations that need it to be exactly correct (like AMD CCP).

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
Fixes: c26fd69fa009 ("X.509: Add a crypto key parser for binary (DER) X.509 
certificates")
Cc: sta...@vger.kernel.org
---
 crypto/asymmetric_keys/x509_cert_parser.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c 
b/crypto/asymmetric_keys/x509_cert_parser.c
index ce2df8c9c583..88c26a4538ae 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -249,6 +249,15 @@ int x509_note_signature(void *context, size_t hdrlen,
return -EINVAL;
}
 
+   if (!strcmp(ctx->cert->sig->pkey_algo, "rsa")) {
+   /* Discard the BIT STRING metadata */
+   if (vlen < 1 || *(const u8 *)value != 0)
+   return -EBADMSG;
+
+   value++;
+   vlen--;
+   }
+
ctx->cert->raw_sig = value;
ctx->cert->raw_sig_size = vlen;
return 0;


[PATCH 2/3] crypto: ccp - return an actual key size from RSA max_size callback

2018-02-24 Thread Maciej S. Szmigiero
rsa-pkcs1pad uses a value returned from a RSA implementation max_size
callback as a size of an input buffer passed to the RSA implementation for
encrypt and sign operations.

CCP RSA implementation uses a hardware input buffer which size depends only
on the current RSA key length, so it should return this key length in
the max_size callback, too.
This also matches what the kernel software RSA implementation does.

Previously, the value returned from this callback was always the maximum
RSA key size the CCP hardware supports.
This resulted in this huge buffer being passed by rsa-pkcs1pad to CCP even
for smaller key sizes and then in a buffer overflow when ccp_run_rsa_cmd()
tried to copy this large input buffer into a RSA key length-sized hardware
input buffer.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
Fixes: ceeec0afd684 ("crypto: ccp - Add support for RSA on the CCP")
Cc: sta...@vger.kernel.org
---
 drivers/crypto/ccp/ccp-crypto-rsa.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-crypto-rsa.c 
b/drivers/crypto/ccp/ccp-crypto-rsa.c
index e6db8672d89c..05850dfd7940 100644
--- a/drivers/crypto/ccp/ccp-crypto-rsa.c
+++ b/drivers/crypto/ccp/ccp-crypto-rsa.c
@@ -60,10 +60,9 @@ static int ccp_rsa_complete(struct crypto_async_request 
*async_req, int ret)
 
 static unsigned int ccp_rsa_maxsize(struct crypto_akcipher *tfm)
 {
-   if (ccp_version() > CCP_VERSION(3, 0))
-   return CCP5_RSA_MAXMOD;
-   else
-   return CCP_RSA_MAXMOD;
+   struct ccp_ctx *ctx = akcipher_tfm_ctx(tfm);
+
+   return ctx->u.rsa.n_len;
 }
 
 static int ccp_rsa_crypt(struct akcipher_request *req, bool encrypt)


[PATCH 3/3] crypto: ccp - protect RSA implementation from too large input data

2018-02-24 Thread Maciej S. Szmigiero
CCP RSA implementation uses a hardware input buffer which size depends only
on the current RSA key length. Key modulus and a message to be processed
is then copied to this buffer based on their own lengths.

Since the price for providing too long input data is a buffer overflow and
there already has been a case when this has happened let's better reject
such oversized input data and log an error message in this case so we know
what is going on.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 drivers/crypto/ccp/ccp-ops.c | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-ops.c b/drivers/crypto/ccp/ccp-ops.c
index 406b95329b3d..517aeee30abf 100644
--- a/drivers/crypto/ccp/ccp-ops.c
+++ b/drivers/crypto/ccp/ccp-ops.c
@@ -1770,10 +1770,6 @@ static int ccp_run_rsa_cmd(struct ccp_cmd_queue *cmd_q, 
struct ccp_cmd *cmd)
if (!rsa->exp || !rsa->mod || !rsa->src || !rsa->dst)
return -EINVAL;
 
-   memset(, 0, sizeof(op));
-   op.cmd_q = cmd_q;
-   op.jobid = CCP_NEW_JOBID(cmd_q->ccp);
-
/* The RSA modulus must precede the message being acted upon, so
 * it must be copied to a DMA area where the message and the
 * modulus can be concatenated.  Therefore the input buffer
@@ -1785,6 +1781,26 @@ static int ccp_run_rsa_cmd(struct ccp_cmd_queue *cmd_q, 
struct ccp_cmd *cmd)
o_len = 32 * ((rsa->key_size + 255) / 256);
i_len = o_len * 2;
 
+   if (rsa->mod_len > o_len) {
+   dev_err(cmd_q->ccp->dev,
+   "RSA modulus of %u bytes too large for key size of %u 
bits\n",
+   (unsigned int)rsa->mod_len,
+   (unsigned int)rsa->key_size);
+   return -EINVAL;
+   }
+
+   if (rsa->src_len > o_len) {
+   dev_err(cmd_q->ccp->dev,
+   "RSA data of %u bytes too large for key size of %u 
bits\n",
+   (unsigned int)rsa->src_len,
+   (unsigned int)rsa->key_size);
+   return -EINVAL;
+   }
+
+   memset(, 0, sizeof(op));
+   op.cmd_q = cmd_q;
+   op.jobid = CCP_NEW_JOBID(cmd_q->ccp);
+
sb_count = 0;
if (cmd_q->ccp->vdata->version < CCP_VERSION(5, 0)) {
/* sb_count is the number of storage block slots required


Re: [PATCH v2 3/5] gpio: Change ISA_BUS_API dependency to selection

2018-02-23 Thread Maciej S. Szmigiero
On 22.02.2018 21:44, Maciej S. Szmigiero wrote:
> Hi William,
> Hi Linus,
> 
> On 22.02.2018 21:30, William Breathitt Gray wrote:
>> On Thu, Feb 22, 2018 at 04:16:17PM +0100, Linus Walleij wrote:
>>> On Fri, Dec 29, 2017 at 9:13 PM, William Breathitt Gray
>>> <vilhelm.g...@gmail.com> wrote:
>>>
>>>> The ISA_BUS_API Kconfig option enables the compilation of the ISA bus
>>>> driver. The ISA bus driver does not perform any hardware interaction,
>>>> and is instead just a thin layer of software abstraction to eliminate
>>>> boilerplate code common to ISA-style device drivers. Since ISA_BUS_API
>>>> has no dependencies and does not jeopardize the integrity of the system
>>>> when enabled, drivers should select it when the ISA bus driver
>>>> functionality is needed.
>>>>
>>>> Cc: Linus Walleij <linus.wall...@linaro.org>
>>>> Signed-off-by: William Breathitt Gray <vilhelm.g...@gmail.com>
>>>
>>> Patch applied to the GPIO tree for v4.17.
>>>
>>> Can you confirm that we don't have any dangling ISA
>>> drivers not using this?
>>>
>>> Yours,
>>> Linus Walleij
>>
>> Hi Linus,
>>
>> This patchset should cover all current mainline drivers depending on
>> ISA_BUS_API.
>>
> 
> Thanks for merging this series!
> 
> Note that gpio-winbond (CONFIG_GPIO_WINBOND) driver was originally
> merged using ISA_BUS_API selection as done for other ISA bus gpio drivers
> by this patch, then temporary switched to the previous ("depends on")
> style by commit 92a8046c9d952a to fix a circular Kconfig dependency.
> So now this commit (92...) should be reverted to make gpio-winbond
> driver Kconfig dependency / selection consistent with the remaining ISA
> bus gpio drivers.
> 
> Unfortunately, I can't test the reversion myself because it looks like
> Linus didn't push his trees yet to git.kernel.org after applying this
> series (or I am not looking at the right place - linusw/linux-gpio.git?).

It looks like the series has been pushed to git.kernel.org now.
I have tested the reversion of commit 92a8046c9d952a and can confirm
that the ISA_BUS_API dependency / selection for gpio-winbond works
then as intended (in a way that is consistent with other ISA bus gpio
drivers).

@Linus:
Will you revert that commit or should I submit a patch that does it?

Maciej


[PATCH] Revert "gpio: winbond: fix ISA_BUS_API dependency"

2018-02-23 Thread Maciej S. Szmigiero
This reverts commit 92a8046c9d952a2a7d21dfcd3afadc72a0bc0f72.

Now that the patch series changing ISA_BUS_API dependency to selection
was merged this reversion will do the same for gpio-winbond driver to
make it consistent with other ISA bus gpio drivers.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 drivers/gpio/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index dff99871bca8..6d481ef03623 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -722,7 +722,7 @@ config GPIO_TS5500
 
 config GPIO_WINBOND
tristate "Winbond Super I/O GPIO support"
-   depends on ISA_BUS_API
+   select ISA_BUS_API
help
  This option enables support for GPIOs found on Winbond Super I/O
  chips.


Re: [PATCH v2] gpio: winbond: add driver

2017-12-28 Thread Maciej S. Szmigiero
On 28.12.2017 05:58, William Breathitt Gray wrote:
> On Wed, Dec 27, 2017 at 07:42:21PM +0100, Maciej S. Szmigiero wrote:
>> On 27.12.2017 01:24, William Breathitt Gray wrote:
>>> On Mon, Dec 25, 2017 at 03:48:16PM +0100, Maciej S. Szmigiero wrote:
>> (..)
>>>> All the existing ISA bus drivers seem to depend on CONFIG_ISA_BUS_API
>>>> instead of selecting it but IMHO this is wrong because:
>>>> 1) This Kconfig option doesn't really enable or disable any bus support
>>>> but building of a library of some common boilerplate code.
>>>> Libraries are normally selected by drivers needing them and only provided
>>>> as an user-selectable option if there is a possibility that a out-of-tree
>>>> module would need it,
>>>>
>>>> 2) On x86_64 this option (or rather, its parent option CONFIG_ISA_BUS)
>>>> cannot be enabled without CONFIG_EXPERT,
>>>>
>>>> 3) This device isn't really a ISA bus device any more than, for example,
>>>> a 8250 serial port or a PC-style parallel port and these don't need
>>>> that an user explicitly enables "ISA bus support" in his kernel
>>>> configuration.
>>>
>>> I can see what you mean about selecting ISA_BUS_API rather than having
>>> it as a dependency for drivers. Part of the reason I added the
>>> CONFIG_EXPERT dependency for CONFIG_ISA_BUS -- as well as having
>>> CONFIG_ISA_BUS_API be a dependency for the drivers themselves -- was to
>>> hide the ISA-style drivers which blindly poke at I/O port addresses,
>>> lest a niave user enable all available drivers and unintentionally brick
>>> their system when the drivers execute.
>>>
>>> I think there is still merit in masking dangerous drivers such as this,
>>> since the expected behavior nowadays is for the driver to probe for the
>>> device before poking at memory; since ISA-style communication lacks a
>>> standard method of detecting devices in hardware, these devices
>>> generally pose a danger when loaded by niave users.
>>
>> This driver accesses the same Super I/O chip as w83627ehf hwmon and
>> w83627hf_wdt watchdog drivers.
>> In addition to this, there are loads of other hwmon and watchdog drivers
>> for x86 Super I/Os in the tree, most of them using the same probing and
>> communication style.
>> There are even existing GPIO drivers for some Super I/Os like gpio-it87
>> and gpio-f7188x.
>>
>> None of these drivers need CONFIG_EXPERT to be selected.
>>
>> Also, CONFIG_EXPERT is described as "Configure standard kernel features"
>> and that "[it] allows certain base kernel options and settings to be
>> disabled or tweaked" for "specialized environments".
>> Enabling this driver is not about changing "standard kernel feature" or
>> a "base kernel option [or] setting".
> 
> I'm sorry, I didn't make it quite clear in my previous reply. I agree
> with you that CONFIG_EXPERT shouldn't be necessary for this driver -- in
> the end, a select ISA_BUS_API line should be all that's needed to have
> ISA bus driver support for your driver.

My apologies for misunderstanding you.

> My reference to the CONFIG_EXPERT option is for masking options related
> to other ISA-style buses not commonly found in desktop systems. Devices
> like the Super I/O chip wouldn't fall into this category since LPC is
> pretty common and the relevant I/O port addresses are usually well
> known. Ultimately, the Winbond GPIO driver should not need CONFIG_EXPERT
> to be enabled in order to select ISA_BUS_API.

Thanks for the explanation, it is clear for me now what you had on mind.

> William Breathitt Gray
> 

Maciej Szmigiero


Re: [PATCH 0/3] Change ISA_BUS_API dependency to selection

2017-12-28 Thread Maciej S. Szmigiero
On 28.12.2017 17:01, William Breathitt Gray wrote:
(..)
> 
> Linus, please pickup this entire patchset through your GPIO subsystem
> tree; a recursive dependency error is present if these patches are
> cherry-picked (see https://lkml.org/lkml/2017/12/26/235), so they should
> be merged together in the same tree.
> 
> Maciej, this patchset resolves the recursive dependency issue you
> encountered, so now you should be able to submit your Winbond GPIO
> driver with the ISA_BUS_API selection as desired.

Thanks for these patches, will submit the driver as soon as I
integrate Andy's comments.

Maciej


Re: [PATCH v2] gpio: winbond: add driver

2017-12-28 Thread Maciej S. Szmigiero
On 28.12.2017 16:12, Andy Shevchenko wrote:
> On Fri, 2017-12-22 at 19:58 +0100, Maciej S. Szmigiero wrote:
>> This commit adds GPIO driver for Winbond Super I/Os.
>>
>> Currently, only W83627UHG model (also known as Nuvoton NCT6627UD) is
>> supported but in the future a support for other Winbond models, too,
>> can
>> be added to the driver.
>>
>> A module parameter "gpios" sets a bitmask of GPIO ports to enable (bit
>> 0 is
>> GPIO1, bit 1 is GPIO2, etc.).
>> One should be careful which ports one tinkers with since some might be
>> managed by the firmware (for functions like powering on and off,
>> sleeping,
>> BIOS recovery, etc.) and some of GPIO port pins are physically shared
>> with
>> other devices included in the Super I/O chip.
>  
>> +config GPIO_WINBOND
>> +tristate "Winbond Super I/O GPIO support"
>> +help
>> +  This option enables support for GPIOs found on Winbond
>> Super I/O
>> +  chips.
>> +  Currently, only W83627UHG (also known as Nuvoton NCT6627UD)
>> is
>> +  supported.
>> +
>> +  You will need to provide a module parameter "gpios", or a
>> +  boot-time parameter "gpio_winbond.gpios" with a bitmask of
>> GPIO
>> +  ports to enable (bit 0 is GPIO1, bit 1 is GPIO2, etc.).
> 
> 1. Why it's required?

It is required bacause "[o]ne should be careful which ports one tinkers
with since some might be managed by the firmware (for functions like
powering on and off, sleeping, BIOS recovery, etc.) and some of GPIO port
pins are physically shared with other devices included in the Super I/O
chip".

> 2. GPIO1 -> GPIO0, GPIO2 -> GPIO1, etc ?

The chip datasheet calls GPIO devices and their pins "GPIO1", "GPIO2",
etc., however the driver uses a more common zero-based indexing (so,
for example, we don't waste the zeroth bit).

> 
>> +
>> +  To compile this driver as a module, choose M here: the
>> module will
>> +  be called gpio-winbond.
>>
> 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
> 
> Perhaps, alphabetically ordered?

Isn't 'e' < 'g' < 'i' < 'm' < 'p' ?

BTW. The ISA bus version has slightly different includes.

> 
>> +
>> +#define WB_GPIO_DRIVER_NAME "gpio-winbond"
>> +
>> +#define WB_SIO_BASE 0x2e
>> +#define WB_SIO_BASE_HIGH 0x4e
> 
> Is it my mail client or you didn't use TAB(s) as separator?

Will use tabs as separators between name and value in this type of macro.

>> +#define WB_SIO_CHIP_ID_W83627UHG 0xa230
>> +#define WB_SIO_CHIP_ID_W83627UHG_MASK 0xfff0
> 
> GENMASK()

Ok.

>> +
>> +/* not an actual device number, just a value meaning 'no device' */
>> +#define WB_SIO_DEV_NONE 0xff
> 
> 
> 
>> +
>> +#define WB_SIO_DEV_UARTB 3
>> +#define WB_SIO_UARTB_REG_ENABLE 0x30
>> +#define WB_SIO_UARTB_ENABLE_ON 0
> 
> What does it mean?
> 
> 1. ???

Super I/O logical device number for UART B.

> 2. Register offset

(Device) enable register offset.

> 3. Bit to enable

(Device) enable register bit index.

> ?
> 
>> +
>> +#define WB_SIO_DEV_UARTC 6
>> +#define WB_SIO_UARTC_REG_ENABLE 0x30
>> +#define WB_SIO_UARTC_ENABLE_ON 0
>> +
>> +#define WB_SIO_DEV_GPIO34 7
>> +#define WB_SIO_GPIO34_REG_ENABLE 0x30
> 
>> +#define WB_SIO_GPIO34_ENABLE_4 1
>> +#define WB_SIO_GPIO34_ENABLE_3 0
> 
> Perhaps swap them?

Ok.
 
>> +#define WB_SIO_GPIO34_REG_IO3 0xe0
>> +#define WB_SIO_GPIO34_REG_DATA3 0xe1
>> +#define WB_SIO_GPIO34_REG_INV3 0xe2
>> +#define WB_SIO_GPIO34_REG_IO4 0xe4
>> +#define WB_SIO_GPIO34_REG_DATA4 0xe5
>> +#define WB_SIO_GPIO34_REG_INV4 0xe6
>> +
>> +#define WB_SIO_DEV_WDGPIO56 8
> 
>> +#define WB_SIO_WDGPIO56_REG_ENABLE 0x30
> 
> Why do we have duplication here?

Registers with offset >= 0x30 are
specific for a particular device.

That's a register in a different device
(which happen to have similar function as
register 0x30 in, for example, UARTC but
nothing in the datasheet guarantees that
such mapping will be universal).

>> +#define WB_SIO_WDGPIO56_ENABLE_6 2
>> +#define WB_SIO_WDGPIO56_ENABLE_5 1
> 
> Swap.

Ok.

> 
>> +#define WB_SIO_WDGPIO56_REG_IO5 0xe0
>> +#define WB_SIO_WDGPIO56_REG_DATA5 0xe1
>> +#define WB_SIO_WDGPIO56_REG_INV5 0xe2
>> +#define WB_SIO_WDGPIO56_REG_IO6 0xe4
>> +#define WB_SIO_WDGPIO56_REG_DATA6 0xe5
>> +#define WB_SIO_WDGPIO56_REG_INV6 0xe6
> 
> Duplication?

Again, it's a different device.

>> +
>> +#defin

[PATCH v2] gpio: winbond: add driver

2017-12-22 Thread Maciej S. Szmigiero
This commit adds GPIO driver for Winbond Super I/Os.

Currently, only W83627UHG model (also known as Nuvoton NCT6627UD) is
supported but in the future a support for other Winbond models, too, can
be added to the driver.

A module parameter "gpios" sets a bitmask of GPIO ports to enable (bit 0 is
GPIO1, bit 1 is GPIO2, etc.).
One should be careful which ports one tinkers with since some might be
managed by the firmware (for functions like powering on and off, sleeping,
BIOS recovery, etc.) and some of GPIO port pins are physically shared with
other devices included in the Super I/O chip.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1:
* Added SPDX license tag,

* Removed gpiobase parameter,

* Changed uint{8,16}_t types to u{8,16},

* Added kerneldoc descriptions of driver structures,

* Reformatted winbond_gpio_infos array fields so they are on separate
lines,

* Added few comments here and there as requested,

* Moved port configuration code from separate winbond_gpio_configure_X()
functions to one, common, parametrized winbond_gpio_configure_port()
function.

Didn't change "linux/errno.h" and "linux/gpio.h" includes to
"linux/driver.h" since there is no such file in the current linux-gpio
tree and so the driver would not compile with this change.
Other GPIO drivers are using these former two include files, too.

 drivers/gpio/Kconfig|  15 +
 drivers/gpio/Makefile   |   1 +
 drivers/gpio/gpio-winbond.c | 758 
 3 files changed, 774 insertions(+)
 create mode 100644 drivers/gpio/gpio-winbond.c

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index 395669bfcc26..3384a4675a0c 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -698,6 +698,21 @@ config GPIO_TS5500
  blocks of the TS-5500: DIO1, DIO2 and the LCD port, and the TS-5600
  LCD port.
 
+config GPIO_WINBOND
+   tristate "Winbond Super I/O GPIO support"
+   help
+ This option enables support for GPIOs found on Winbond Super I/O
+ chips.
+ Currently, only W83627UHG (also known as Nuvoton NCT6627UD) is
+ supported.
+
+ You will need to provide a module parameter "gpios", or a
+ boot-time parameter "gpio_winbond.gpios" with a bitmask of GPIO
+ ports to enable (bit 0 is GPIO1, bit 1 is GPIO2, etc.).
+
+ To compile this driver as a module, choose M here: the module will
+ be called gpio-winbond.
+
 config GPIO_WS16C48
tristate "WinSystems WS16C48 GPIO support"
depends on ISA_BUS_API
diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index bc5dd673fa11..ff3d36d0a443 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -139,6 +139,7 @@ obj-$(CONFIG_GPIO_VIPERBOARD)   += gpio-viperboard.o
 obj-$(CONFIG_GPIO_VR41XX)  += gpio-vr41xx.o
 obj-$(CONFIG_GPIO_VX855)   += gpio-vx855.o
 obj-$(CONFIG_GPIO_WHISKEY_COVE)+= gpio-wcove.o
+obj-$(CONFIG_GPIO_WINBOND) += gpio-winbond.o
 obj-$(CONFIG_GPIO_WM831X)  += gpio-wm831x.o
 obj-$(CONFIG_GPIO_WM8350)  += gpio-wm8350.o
 obj-$(CONFIG_GPIO_WM8994)  += gpio-wm8994.o
diff --git a/drivers/gpio/gpio-winbond.c b/drivers/gpio/gpio-winbond.c
new file mode 100644
index ..385855fb6c9e
--- /dev/null
+++ b/drivers/gpio/gpio-winbond.c
@@ -0,0 +1,758 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * GPIO interface for Winbond Super I/O chips
+ * Currently, only W83627UHG (Nuvoton NCT6627UD) is supported.
+ *
+ * Author: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define WB_GPIO_DRIVER_NAME "gpio-winbond"
+
+#define WB_SIO_BASE 0x2e
+#define WB_SIO_BASE_HIGH 0x4e
+
+#define WB_SIO_EXT_ENTER_KEY 0x87
+#define WB_SIO_EXT_EXIT_KEY 0xaa
+
+#define WB_SIO_CHIP_ID_W83627UHG 0xa230
+#define WB_SIO_CHIP_ID_W83627UHG_MASK 0xfff0
+
+/* not an actual device number, just a value meaning 'no device' */
+#define WB_SIO_DEV_NONE 0xff
+
+#define WB_SIO_DEV_UARTB 3
+#define WB_SIO_UARTB_REG_ENABLE 0x30
+#define WB_SIO_UARTB_ENABLE_ON 0
+
+#define WB_SIO_DEV_UARTC 6
+#define WB_SIO_UARTC_REG_ENABLE 0x30
+#define WB_SIO_UARTC_ENABLE_ON 0
+
+#define WB_SIO_DEV_GPIO34 7
+#define WB_SIO_GPIO34_REG_ENABLE 0x30
+#define WB_SIO_GPIO34_ENABLE_4 1
+#define WB_SIO_GPIO34_ENABLE_3 0
+#define WB_SIO_GPIO34_REG_IO3 0xe0
+#define WB_SIO_GPIO34_REG_DATA3 0xe1
+#define WB_SIO_GPIO34_REG_INV3 0xe2
+#define WB_SIO_GPIO34_REG_IO4 0xe4
+#define WB_SIO_GPIO34_REG_DATA4 0xe5
+#define WB_SIO_GPIO34_REG_INV4 0xe6
+
+#define WB_SIO_DEV_WDGPIO56 8
+#define WB_SIO_WDGPIO56_REG_ENABLE 0x30
+#define WB_SIO_WDGPIO56_ENABLE_6 2
+#define WB_SIO_WDGPIO56_ENABLE_5 1
+#define WB_SIO_WDGPIO56_REG_IO5 0xe0
+#define WB_SIO_WDGPIO56_REG_DATA5 0xe1
+#define WB_SIO_WDGPIO56_REG_INV5 0xe2
+#define WB_SIO_WDGPIO56_REG_IO6 0xe4
+#defin

Re: [PATCH v4 6/6] [media] cxusb: add analog mode support for Medion MD95700

2017-12-22 Thread Maciej S. Szmigiero
On 19.12.2017 13:53, Mauro Carvalho Chehab wrote:
> Em Sun, 17 Dec 2017 19:47:25 +0100
> "Maciej S. Szmigiero" <m...@maciej.szmigiero.name> escreveu:
> 
>> This patch adds support for analog part of Medion 95700 in the cxusb
>> driver.
>>
>> What works:
>> * Video capture at various sizes with sequential fields,
>> * Input switching (TV Tuner, Composite, S-Video),
>> * TV and radio tuning,
>> * Video standard switching and auto detection,
>> * Radio mode switching (stereo / mono),
>> * Unplugging while capturing,
>> * DVB / analog coexistence,
>> * Raw BT.656 stream support.
>>
>> What does not work yet:
>> * Audio,
>> * VBI,
>> * Picture controls.
> 
> Patches 1 to 5 look OK to me (although checkpatch do a few complains).
> 
> This one, however, require some adjustments.
> 
> I'd like to also have Hans eyes on it, as he's doing a lot more V4L2
> new driver reviews than me nowadays.
> 
(..)
>> +static int cxusb_medion_try_s_fmt_vid_cap(struct file *file,
>> +  struct v4l2_format *f,
>> +  bool isset)
>> +{
>> +struct dvb_usb_device *dvbdev = video_drvdata(file);
>> +struct cxusb_medion_dev *cxdev = dvbdev->priv;
>> +struct v4l2_subdev_format subfmt;
>> +int ret;
>> +
>> +if (isset && (cxusb_medion_stream_busy(cxdev) ||
>> +  vb2_is_busy(>videoqueue)))
>> +return -EBUSY;
>> +
>> +memset(, 0, sizeof(subfmt));
>> +subfmt.which = isset ? V4L2_SUBDEV_FORMAT_ACTIVE :
>> +V4L2_SUBDEV_FORMAT_TRY;
>> +subfmt.format.width = f->fmt.pix.width & ~1;
>> +subfmt.format.height = f->fmt.pix.height & ~1;
>> +subfmt.format.code = MEDIA_BUS_FMT_FIXED;
>> +subfmt.format.field = V4L2_FIELD_SEQ_TB;
>> +subfmt.format.colorspace = V4L2_COLORSPACE_SMPTE170M;
>> +
>> +ret = v4l2_subdev_call(cxdev->cx25840, pad, set_fmt, NULL, );
>> +if (ret != 0) {
>> +if (ret != -ERANGE)
>> +return ret;
>> +
>> +/* try some common formats */
>> +subfmt.format.width = 720;
>> +subfmt.format.height = 576;
>> +ret = v4l2_subdev_call(cxdev->cx25840, pad, set_fmt, NULL,
>> +   );
>> +if (ret != 0) {
>> +if (ret != -ERANGE)
>> +return ret;
>> +
>> +subfmt.format.width = 640;
>> +subfmt.format.height = 480;
>> +ret = v4l2_subdev_call(cxdev->cx25840, pad, set_fmt,
>> +   NULL, );
>> +if (ret != 0)
>> +return ret;
>> +}
>> +}
> 
> That looks weird... Why are you trying two different formats here,
> instead of just using the width/height that userspace passes?
> 
V4L2 docs say that VIDIOC_{S,TRY}_FMT ioctls "should not return an error
code unless the type field is invalid", that is, they should not return
an error for invalid or unsupported image widths or heights.
They should instead return something sensible for these image parameters.

However, cx25840 driver set_fmt callback simply returns -ERANGE if it
does not like the provided width or height.
In this case the code above simply tries first the bigger PAL capture
resolution then the smaller NTSC one.
Which one will be accepted by the cx25840 depends on the currently set
broadcast standard and parameters of the last signal that was received,
at least one of these resolutions seem to work even without any
signal being received since the chip was powered up.

This way the API guarantees should be kept by the driver.

(All other your comments were implemented in a respin).

> 
> Thanks,
> Mauro
> 

Thanks,
Maciej


[PATCH v5 4/6] tuner-simple: allow setting mono radio mode

2017-12-22 Thread Maciej S. Szmigiero
For some types of tuners (Philips FMD1216ME(X) MK3 currently) we know that
letting TDA9887 output port 1 remain high (inactive) will switch FM radio
to mono mode.
Let's make use of this functionality - nothing changes for the default
stereo radio mode.

Tested on a Medion 95700 board which has a FMD1216ME tuner.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 drivers/media/tuners/tuner-simple.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/media/tuners/tuner-simple.c 
b/drivers/media/tuners/tuner-simple.c
index cf44d3657f55..01ab94681d2d 100644
--- a/drivers/media/tuners/tuner-simple.c
+++ b/drivers/media/tuners/tuner-simple.c
@@ -670,6 +670,7 @@ static int simple_set_radio_freq(struct dvb_frontend *fe,
int rc, j;
struct tuner_params *t_params;
unsigned int freq = params->frequency;
+   bool mono = params->audmode == V4L2_TUNER_MODE_MONO;
 
tun = priv->tun;
 
@@ -736,8 +737,8 @@ static int simple_set_radio_freq(struct dvb_frontend *fe,
config |= TDA9887_PORT2_ACTIVE;
if (t_params->intercarrier_mode)
config |= TDA9887_INTERCARRIER;
-/* if (t_params->port1_set_for_fm_mono)
-   config &= ~TDA9887_PORT1_ACTIVE;*/
+   if (t_params->port1_set_for_fm_mono && mono)
+   config &= ~TDA9887_PORT1_ACTIVE;
if (t_params->fm_gain_normal)
config |= TDA9887_GAIN_NORMAL;
if (t_params->radio_if == 2)


[PATCH v5 0/6] [media] Add analog mode support for Medion MD95700

2017-12-22 Thread Maciej S. Szmigiero
This series adds support for analog part of Medion 95700 in the cxusb
driver.

What works:
* Video capture at various sizes with sequential fields,
* Input switching (TV Tuner, Composite, S-Video),
* TV and radio tuning,
* Video standard switching and auto detection,
* Radio mode switching (stereo / mono),
* Unplugging while capturing,
* DVB / analog coexistence,
* Raw BT.656 stream support.

What does not work yet:
* Audio,
* VBI,
* Picture controls.

This series (as a one patch) was submitted for inclusion few years ago,
then waited few months in a patch queue.
Unfortunately, by the time it was supposed to be merged there
were enough changes in media that it was no longer mergable.

I thought at that time that I will be able to rebase and retest it soon
but unfortunately up till now I was never able to find enough time to do
so.
Also, with the passing of time the implementation diverged more and
more from the current kernel code, necessitating even more reworking.

That last iteration can be found here:
https://patchwork.linuxtv.org/patch/8048/

Since that version there had been the following changes:
* Adaptation to changes in V4L2 / DVB core,

* Radio device was added, with a possibility to tune to a FM radio
station and switch between stereo and mono modes (tested by taping
audio signal directly at tuner output pin),

* DVB / analog coexistence was improved - resolved a few cases where
DVB core would switch off power or reset the tuner when the device
was still being used but in the analog mode,

* Fixed issues reported by v4l2-compliance,

* Switching to raw BT.656 mode is now done by a custom streaming
parameter set via VIDIOC_S_PARM ioctl instead of using a
V4L2_BUF_TYPE_PRIVATE buffer (which was removed from V4L2),

* General small code cleanups (like using BIT() or ARRAY_SIZE() macros
instead of open coding them, code formatting improvements, etc.).

Changes from v1:
* Only support configuration of cx25840 pins that the cxusb driver is
actually using so there is no need for an ugly CX25840_PIN() macro,

* Split cxusb changes into two patches: first one implementing
digital / analog coexistence in this driver, second one adding the
actual implementation of the analog mode,

* Fix a warning reported by kbuild test robot.

Changes from v2:
* Split out ivtv cx25840 platform data zero-initialization to a separate
commit,

* Add kernel-doc description of struct cx25840_state,

* Make sure that all variables used in CX25840_VCONFIG_OPTION() and
CX25840_VCONFIG_SET_BIT() macros are their explicit parameters,

* Split out some code from cxusb_medion_copy_field() and
cxusb_medion_v_complete_work() functions to separate ones to increase
their readability,

* Generate masks using GENMASK() and BIT() macros in cx25840.h and
cxusb.h.

Changes from v3:
Add SPDX tag to a newly added "cxusb-analog.c" file.

Changes from v4:
* Make analog support conditional on a new DVB_USB_CXUSB_ANALOG Kconfig
option,

* Use '//' comments in the header of a newly added "cxusb-analog.c"
file,

* Don't print errors on memory allocation failures,

* Get rid of the driver MODULE_VERSION(),

* Small formating fix of a one line.

Maciej S. Szmigiero (6):
  ivtv: zero-initialize cx25840 platform data
  cx25840: add kernel-doc description of struct cx25840_state
  cx25840: add pin to pad mapping and output format configuration
  tuner-simple: allow setting mono radio mode
  [media] cxusb: implement Medion MD95700 digital / analog coexistence
  [media] cxusb: add analog mode support for Medion MD95700

 drivers/media/i2c/cx25840/cx25840-core.c |  396 ++-
 drivers/media/i2c/cx25840/cx25840-core.h |   46 +-
 drivers/media/i2c/cx25840/cx25840-vbi.c  |3 +
 drivers/media/pci/ivtv/ivtv-i2c.c|1 +
 drivers/media/tuners/tuner-simple.c  |5 +-
 drivers/media/usb/dvb-usb/Kconfig|   16 +-
 drivers/media/usb/dvb-usb/Makefile   |3 +
 drivers/media/usb/dvb-usb/cxusb-analog.c | 1914 ++
 drivers/media/usb/dvb-usb/cxusb.c|  454 ++-
 drivers/media/usb/dvb-usb/cxusb.h|  154 +++
 drivers/media/usb/dvb-usb/dvb-usb-dvb.c  |   20 +-
 drivers/media/usb/dvb-usb/dvb-usb-init.c |   13 +
 drivers/media/usb/dvb-usb/dvb-usb.h  |8 +
 include/media/drv-intf/cx25840.h |   74 +-
 14 files changed, 3043 insertions(+), 64 deletions(-)
 create mode 100644 drivers/media/usb/dvb-usb/cxusb-analog.c


[PATCH v5 5/6] [media] cxusb: implement Medion MD95700 digital / analog coexistence

2017-12-22 Thread Maciej S. Szmigiero
This patch prepares cxusb driver for supporting the analog part of
Medion 95700 (previously only the digital - DVB - mode was supported).

Specifically, it adds support for:
* switching the device between analog and digital modes of operation,
* enforcing that only one mode is active at the same time due to hardware
limitations.

Actual implementation of the analog mode will be provided by the next
commit.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 drivers/media/usb/dvb-usb/cxusb.c| 452 +++
 drivers/media/usb/dvb-usb/cxusb.h|  48 
 drivers/media/usb/dvb-usb/dvb-usb-dvb.c  |  20 +-
 drivers/media/usb/dvb-usb/dvb-usb-init.c |  13 +
 drivers/media/usb/dvb-usb/dvb-usb.h  |   8 +
 5 files changed, 487 insertions(+), 54 deletions(-)

diff --git a/drivers/media/usb/dvb-usb/cxusb.c 
b/drivers/media/usb/dvb-usb/cxusb.c
index 7109fc7ab74d..ea55b31cf681 100644
--- a/drivers/media/usb/dvb-usb/cxusb.c
+++ b/drivers/media/usb/dvb-usb/cxusb.c
@@ -16,6 +16,7 @@
  * Copyright (C) 2005 Patrick Boettcher (patrick.boettc...@posteo.de)
  * Copyright (C) 2006 Michael Krufky (mkru...@linuxtv.org)
  * Copyright (C) 2006, 2007 Chris Pascoe (c.pas...@itee.uq.edu.au)
+ * Copyright (C) 2011, 2017 Maciej S. Szmigiero (m...@maciej.szmigiero.name)
  *
  *   This program is free software; you can redistribute it and/or modify it
  *   under the terms of the GNU General Public License as published by the Free
@@ -24,9 +25,12 @@
  * see Documentation/dvb/README.dvb-usb for more information
  */
 #include 
-#include 
-#include 
+#include 
+#include 
 #include 
+#include 
+#include 
+#include 
 
 #include "cxusb.h"
 
@@ -47,17 +51,46 @@
 #include "si2157.h"
 
 /* debug */
-static int dvb_usb_cxusb_debug;
+int dvb_usb_cxusb_debug;
 module_param_named(debug, dvb_usb_cxusb_debug, int, 0644);
-MODULE_PARM_DESC(debug, "set debugging level (1=rc (or-able))." 
DVB_USB_DEBUG_STATUS);
+MODULE_PARM_DESC(debug, "set debugging level (see cxusb.h)."
+DVB_USB_DEBUG_STATUS);
 
 DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr);
 
-#define deb_info(args...)   dprintk(dvb_usb_cxusb_debug, 0x03, args)
-#define deb_i2c(args...)dprintk(dvb_usb_cxusb_debug, 0x02, args)
+#define deb_info(args...)   dprintk(dvb_usb_cxusb_debug, CXUSB_DBG_MISC, args)
+#define deb_i2c(args...)dprintk(dvb_usb_cxusb_debug, CXUSB_DBG_I2C, args)
 
-static int cxusb_ctrl_msg(struct dvb_usb_device *d,
- u8 cmd, const u8 *wbuf, int wlen, u8 *rbuf, int rlen)
+enum cxusb_table_index {
+   MEDION_MD95700,
+   DVICO_BLUEBIRD_LG064F_COLD,
+   DVICO_BLUEBIRD_LG064F_WARM,
+   DVICO_BLUEBIRD_DUAL_1_COLD,
+   DVICO_BLUEBIRD_DUAL_1_WARM,
+   DVICO_BLUEBIRD_LGZ201_COLD,
+   DVICO_BLUEBIRD_LGZ201_WARM,
+   DVICO_BLUEBIRD_TH7579_COLD,
+   DVICO_BLUEBIRD_TH7579_WARM,
+   DIGITALNOW_BLUEBIRD_DUAL_1_COLD,
+   DIGITALNOW_BLUEBIRD_DUAL_1_WARM,
+   DVICO_BLUEBIRD_DUAL_2_COLD,
+   DVICO_BLUEBIRD_DUAL_2_WARM,
+   DVICO_BLUEBIRD_DUAL_4,
+   DVICO_BLUEBIRD_DVB_T_NANO_2,
+   DVICO_BLUEBIRD_DVB_T_NANO_2_NFW_WARM,
+   AVERMEDIA_VOLAR_A868R,
+   DVICO_BLUEBIRD_DUAL_4_REV_2,
+   CONEXANT_D680_DMB,
+   MYGICA_D689,
+   MYGICA_T230,
+   MYGICA_T230C,
+   NR__cxusb_table_index
+};
+
+static struct usb_device_id cxusb_table[];
+
+int cxusb_ctrl_msg(struct dvb_usb_device *d,
+  u8 cmd, const u8 *wbuf, int wlen, u8 *rbuf, int rlen)
 {
struct cxusb_state *st = d->priv;
int ret;
@@ -89,7 +122,8 @@ static void cxusb_gpio_tuner(struct dvb_usb_device *d, int 
onoff)
struct cxusb_state *st = d->priv;
u8 o[2], i;
 
-   if (st->gpio_write_state[GPIO_TUNER] == onoff)
+   if (st->gpio_write_state[GPIO_TUNER] == onoff &&
+   !st->gpio_write_refresh[GPIO_TUNER])
return;
 
o[0] = GPIO_TUNER;
@@ -100,6 +134,7 @@ static void cxusb_gpio_tuner(struct dvb_usb_device *d, int 
onoff)
deb_info("gpio_write failed.\n");
 
st->gpio_write_state[GPIO_TUNER] = onoff;
+   st->gpio_write_refresh[GPIO_TUNER] = false;
 }
 
 static int cxusb_bluebird_gpio_rw(struct dvb_usb_device *d, u8 changemask,
@@ -259,7 +294,7 @@ static int cxusb_i2c_xfer(struct i2c_adapter *adap, struct 
i2c_msg msg[],
 
 static u32 cxusb_i2c_func(struct i2c_adapter *adapter)
 {
-   return I2C_FUNC_I2C;
+   return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL;
 }
 
 static struct i2c_algorithm cxusb_i2c_algo = {
@@ -267,15 +302,48 @@ static struct i2c_algorithm cxusb_i2c_algo = {
.functionality = cxusb_i2c_func,
 };
 
-static int cxusb_power_ctrl(struct dvb_usb_device *d, int onoff)
+static int _cxusb_power_ctrl(struct dvb_usb_device *d, int onoff)
 {
u8 b = 0;
+
+   deb_info("setting power %s\n", onoff ? "ON" : "

[PATCH v5 2/6] cx25840: add kernel-doc description of struct cx25840_state

2017-12-22 Thread Maciej S. Szmigiero
This commit describes a device instance private data of the driver
(struct cx25840_state) in a kernel-doc style comment.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 drivers/media/i2c/cx25840/cx25840-core.h | 33 ++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/drivers/media/i2c/cx25840/cx25840-core.h 
b/drivers/media/i2c/cx25840/cx25840-core.h
index 55432ed42714..877b930e5b1f 100644
--- a/drivers/media/i2c/cx25840/cx25840-core.h
+++ b/drivers/media/i2c/cx25840/cx25840-core.h
@@ -45,6 +45,35 @@ enum cx25840_media_pads {
CX25840_NUM_PADS
 };
 
+/**
+ * struct cx25840_state - a device instance private data
+ * @c: i2c_client struct representing this device
+ * @sd:our V4L2 sub-device
+ * @hdl:   our V4L2 control handler
+ * @volume:audio volume V4L2 control (non-cx2583x devices only)
+ * @mute:  audio mute V4L2 control (non-cx2583x devices only)
+ * @pvr150_workaround: whether we enable workaround for Hauppauge PVR150
+ * hardware bug (audio dropping out)
+ * @radio: set if we are currently in the radio mode, otherwise
+ * the current mode is non-radio (that is, video)
+ * @std:   currently set video standard
+ * @vid_input: currently set video input
+ * @aud_input: currently set audio input
+ * @audclk_freq:   currently set audio sample rate
+ * @audmode:   currently set audio mode (when in non-radio mode)
+ * @vbi_line_offset:   vbi line number offset
+ * @id:exact device model
+ * @rev:   raw device id read from the chip
+ * @is_initialized:whether we have already loaded firmware into the chip
+ * and initialized it
+ * @vbi_regs_offset:   offset of vbi regs
+ * @fw_wait:   wait queue to wake an initalization function up when
+ * firmware loading (on a separate workqueue) finishes
+ * @fw_work:   a work that actually loads the firmware on a separate
+ * workqueue
+ * @ir_state:  a pointer to chip IR controller private data
+ * @pads:  array of supported chip pads (currently only a stub)
+ */
 struct cx25840_state {
struct i2c_client *c;
struct v4l2_subdev sd;
@@ -66,8 +95,8 @@ struct cx25840_state {
u32 rev;
int is_initialized;
unsigned vbi_regs_offset;
-   wait_queue_head_t fw_wait;/* wake up when the fw load is finished */
-   struct work_struct fw_work;   /* work entry for fw load */
+   wait_queue_head_t fw_wait;
+   struct work_struct fw_work;
struct cx25840_ir_state *ir_state;
 #if defined(CONFIG_MEDIA_CONTROLLER)
struct media_padpads[CX25840_NUM_PADS];


[PATCH v5 6/6] [media] cxusb: add analog mode support for Medion MD95700

2017-12-22 Thread Maciej S. Szmigiero
This patch adds support for analog part of Medion 95700 in the cxusb
driver.

What works:
* Video capture at various sizes with sequential fields,
* Input switching (TV Tuner, Composite, S-Video),
* TV and radio tuning,
* Video standard switching and auto detection,
* Radio mode switching (stereo / mono),
* Unplugging while capturing,
* DVB / analog coexistence,
* Raw BT.656 stream support.

What does not work yet:
* Audio,
* VBI,
* Picture controls.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 drivers/media/usb/dvb-usb/Kconfig|   16 +-
 drivers/media/usb/dvb-usb/Makefile   |3 +
 drivers/media/usb/dvb-usb/cxusb-analog.c | 1914 ++
 drivers/media/usb/dvb-usb/cxusb.c|2 -
 drivers/media/usb/dvb-usb/cxusb.h|  106 ++
 5 files changed, 2037 insertions(+), 4 deletions(-)
 create mode 100644 drivers/media/usb/dvb-usb/cxusb-analog.c

diff --git a/drivers/media/usb/dvb-usb/Kconfig 
b/drivers/media/usb/dvb-usb/Kconfig
index 2651ae277347..b4ef8f6eb470 100644
--- a/drivers/media/usb/dvb-usb/Kconfig
+++ b/drivers/media/usb/dvb-usb/Kconfig
@@ -138,12 +138,24 @@ config DVB_USB_CXUSB
select MEDIA_TUNER_SI2157 if MEDIA_SUBDRV_AUTOSELECT
help
  Say Y here to support the Conexant USB2.0 hybrid reference design.
- Currently, only DVB and ATSC modes are supported, analog mode
- shall be added in the future. Devices that require this module:
+ DVB and ATSC modes are supported, for a basic analog mode support
+ see the next option ("Analog support for the Conexant USB2.0 hybrid
+ reference design").
+ Devices that require this module:
 
  Medion MD95700 hybrid USB2.0 device.
  DViCO FusionHDTV (Bluebird) USB2.0 devices
 
+config DVB_USB_CXUSB_ANALOG
+   bool "Analog support for the Conexant USB2.0 hybrid reference design"
+   depends on DVB_USB_CXUSB && VIDEO_V4L2
+   select VIDEO_CX25840
+   select VIDEOBUF2_VMALLOC
+   help
+ Say Y here to enable basic analog mode support for the Conexant
+ USB2.0 hybrid reference design.
+ Currently this mode is supported only on a Medion MD95700 device.
+
 config DVB_USB_M920X
tristate "Uli m920x DVB-T USB2.0 support"
depends on DVB_USB
diff --git a/drivers/media/usb/dvb-usb/Makefile 
b/drivers/media/usb/dvb-usb/Makefile
index 16de1e4f36a4..264468783db7 100644
--- a/drivers/media/usb/dvb-usb/Makefile
+++ b/drivers/media/usb/dvb-usb/Makefile
@@ -42,6 +42,9 @@ dvb-usb-digitv-objs := digitv.o
 obj-$(CONFIG_DVB_USB_DIGITV) += dvb-usb-digitv.o
 
 dvb-usb-cxusb-objs := cxusb.o
+ifeq ($(CONFIG_DVB_USB_CXUSB_ANALOG),y)
+dvb-usb-cxusb-objs += cxusb-analog.o
+endif
 obj-$(CONFIG_DVB_USB_CXUSB) += dvb-usb-cxusb.o
 
 dvb-usb-ttusb2-objs := ttusb2.o
diff --git a/drivers/media/usb/dvb-usb/cxusb-analog.c 
b/drivers/media/usb/dvb-usb/cxusb-analog.c
new file mode 100644
index ..969d82b24f41
--- /dev/null
+++ b/drivers/media/usb/dvb-usb/cxusb-analog.c
@@ -0,0 +1,1914 @@
+// SPDX-License-Identifier: GPL-2.0+
+//
+// DVB USB compliant linux driver for Conexant USB reference design -
+// (analog part).
+//
+// Copyright (C) 2011, 2017 Maciej S. Szmigiero (m...@maciej.szmigiero.name)
+//
+// TODO:
+//  * audio support,
+//  * finish radio support (requires audio of course),
+//  * VBI support,
+//  * controls support
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "cxusb.h"
+
+static int cxusb_medion_v_queue_setup(struct vb2_queue *q,
+ unsigned int *num_buffers,
+ unsigned int *num_planes,
+ unsigned int sizes[],
+ struct device *alloc_devs[])
+{
+   struct dvb_usb_device *dvbdev = vb2_get_drv_priv(q);
+   struct cxusb_medion_dev *cxdev = dvbdev->priv;
+   unsigned int size = cxdev->raw_mode ?
+   CXUSB_VIDEO_MAX_FRAME_SIZE :
+   cxdev->width * cxdev->height * 2;
+
+   if (*num_planes > 0) {
+   if (*num_planes != 1)
+   return -EINVAL;
+
+   if (sizes[0] < size)
+   return -EINVAL;
+   } else {
+   *num_planes = 1;
+   sizes[0] = size;
+   }
+
+   if (q->num_buffers + *num_buffers < 6)
+   *num_buffers = 6 - q->num_buffers;
+
+   return 0;
+}
+
+static int cxusb_medion_v_buf_init(struct vb2_buffer *vb)
+{
+   struct dvb_usb_device *dvbdev = vb2_get_drv_priv(vb->vb2_queue);
+   struct cxusb_medion_dev *cxdev = dvbdev->priv;
+
+   cxusb_vprintk(dvbdev, OPS, "buffer init\n");
+
+   if (cxdev->raw_mode) {
+   if (vb2_pl

[PATCH v5 3/6] cx25840: add pin to pad mapping and output format configuration

2017-12-22 Thread Maciej S. Szmigiero
This commit adds pin to pad mapping and output format configuration support
in CX2584x-series chips to cx25840 driver.

This functionality is then used to allow disabling ivtv-specific hacks
(called a "generic mode"), so cx25840 driver can be used for other devices
not needing them without risking compatibility problems.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 drivers/media/i2c/cx25840/cx25840-core.c | 396 ++-
 drivers/media/i2c/cx25840/cx25840-core.h |  13 +
 drivers/media/i2c/cx25840/cx25840-vbi.c  |   3 +
 include/media/drv-intf/cx25840.h |  74 +-
 4 files changed, 484 insertions(+), 2 deletions(-)

diff --git a/drivers/media/i2c/cx25840/cx25840-core.c 
b/drivers/media/i2c/cx25840/cx25840-core.c
index 2189980a0f29..e2fd33a64550 100644
--- a/drivers/media/i2c/cx25840/cx25840-core.c
+++ b/drivers/media/i2c/cx25840/cx25840-core.c
@@ -21,6 +21,9 @@
  * CX23888 DIF support for the HVR1850
  * Copyright (C) 2011 Steven Toth <st...@kernellabs.com>
  *
+ * CX2584x pin to pad mapping and output format configuration support are
+ * Copyright (C) 2011 Maciej S. Szmigiero <m...@maciej.szmigiero.name>
+ *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License
  * as published by the Free Software Foundation; either version 2
@@ -316,6 +319,260 @@ static int cx23885_s_io_pin_config(struct v4l2_subdev 
*sd, size_t n,
return 0;
 }
 
+static u8 cx25840_function_to_pad(struct i2c_client *client, u8 function)
+{
+   switch (function) {
+   case CX25840_PAD_ACTIVE:
+   return 1;
+
+   case CX25840_PAD_VACTIVE:
+   return 2;
+
+   case CX25840_PAD_CBFLAG:
+   return 3;
+
+   case CX25840_PAD_VID_DATA_EXT0:
+   return 4;
+
+   case CX25840_PAD_VID_DATA_EXT1:
+   return 5;
+
+   case CX25840_PAD_GPO0:
+   return 6;
+
+   case CX25840_PAD_GPO1:
+   return 7;
+
+   case CX25840_PAD_GPO2:
+   return 8;
+
+   case CX25840_PAD_GPO3:
+   return 9;
+
+   case CX25840_PAD_IRQ_N:
+   return 10;
+
+   case CX25840_PAD_AC_SYNC:
+   return 11;
+
+   case CX25840_PAD_AC_SDOUT:
+   return 12;
+
+   case CX25840_PAD_PLL_CLK:
+   return 13;
+
+   case CX25840_PAD_VRESET:
+   return 14;
+
+   default:
+   if (function != CX25840_PAD_DEFAULT)
+   v4l_err(client,
+   "invalid function %u, assuming default\n",
+   (unsigned int)function);
+   return 0;
+   }
+}
+
+static void cx25840_set_invert(u8 *pinctrl3, u8 *voutctrl4, u8 function,
+  u8 pin, bool invert)
+{
+   switch (function) {
+   case CX25840_PAD_IRQ_N:
+   if (invert)
+   *pinctrl3 &= ~2;
+   else
+   *pinctrl3 |= 2;
+   break;
+
+   case CX25840_PAD_ACTIVE:
+   if (invert)
+   *voutctrl4 |= BIT(2);
+   else
+   *voutctrl4 &= ~BIT(2);
+   break;
+
+   case CX25840_PAD_VACTIVE:
+   if (invert)
+   *voutctrl4 |= BIT(5);
+   else
+   *voutctrl4 &= ~BIT(5);
+   break;
+
+   case CX25840_PAD_CBFLAG:
+   if (invert)
+   *voutctrl4 |= BIT(4);
+   else
+   *voutctrl4 &= ~BIT(4);
+   break;
+
+   case CX25840_PAD_VRESET:
+   if (invert)
+   *voutctrl4 |= BIT(0);
+   else
+   *voutctrl4 &= ~BIT(0);
+   break;
+   }
+
+   if (function != CX25840_PAD_DEFAULT)
+   return;
+
+   switch (pin) {
+   case CX25840_PIN_DVALID_PRGM0:
+   if (invert)
+   *voutctrl4 |= BIT(6);
+   else
+   *voutctrl4 &= ~BIT(6);
+   break;
+
+   case CX25840_PIN_HRESET_PRGM2:
+   if (invert)
+   *voutctrl4 |= BIT(1);
+   else
+   *voutctrl4 &= ~BIT(1);
+   break;
+   }
+}
+
+static int cx25840_s_io_pin_config(struct v4l2_subdev *sd, size_t n,
+  struct v4l2_subdev_io_pin_config *p)
+{
+   struct i2c_client *client = v4l2_get_subdevdata(sd);
+   unsigned int i;
+   u8 pinctrl[6], pinconf[10], voutctrl4;
+
+   for (i = 0; i < 6; i++)
+   pinctrl[i] = cx25840_read(client, 0x114 + i);
+
+   for (i = 0; i < 10; i++)
+   pinconf[i] = cx25840_read(client, 0x11c + i);
+
+  

[PATCH v5 1/6] ivtv: zero-initialize cx25840 platform data

2017-12-22 Thread Maciej S. Szmigiero
We need to zero-initialize cx25840 platform data structure to make sure
that its future members do not contain random stack garbage.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 drivers/media/pci/ivtv/ivtv-i2c.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/media/pci/ivtv/ivtv-i2c.c 
b/drivers/media/pci/ivtv/ivtv-i2c.c
index 66696e6ee587..b755337ec938 100644
--- a/drivers/media/pci/ivtv/ivtv-i2c.c
+++ b/drivers/media/pci/ivtv/ivtv-i2c.c
@@ -293,6 +293,7 @@ int ivtv_i2c_register(struct ivtv *itv, unsigned idx)
.platform_data = ,
};
 
+   memset(, 0, sizeof(pdata));
pdata.pvr150_workaround = itv->pvr150_workaround;
sd = v4l2_i2c_new_subdev_board(>v4l2_dev, adap,
_info, NULL);


Re: [PATCH] gpio: winbond: add driver

2017-12-20 Thread Maciej S. Szmigiero
On 20.12.2017 13:06, Linus Walleij wrote:
> On Tue, Dec 19, 2017 at 12:07 AM, Maciej S. Szmigiero
> <m...@maciej.szmigiero.name> wrote:
>> On 18.12.2017 22:22, Linus Walleij wrote:
> 
(..)
>>>> +static void winbond_gpio_warn_conflict(unsigned int idx, const char 
>>>> *otherdev)
>>>> +{
>>>> +   pr_warn(WB_GPIO_DRIVER_NAME
>>>> +   ": enabled GPIO%u share pins with active %s\n", idx + 1,
>>>> +   otherdev);
>>>> +}
>>>
>>> I don't understand this otherdev business, is it pin multiplexing?
>>
>> Yes, some GPIO pins are shared with some other devices and functions.
>>
>> But this is a x86 Super I/O chip so we don't really have a general
>> ability to detect which should be configured how (and a BIOS might not
>> have configured pins that it didn't care about correctly).
> 
> I see. That is unfortunate, users having to provide module parameters
> for this is just so 1990ies, not your fault.
> 
> I'm just wondering if there is something, anything you can grab onto
> to detect the type of system you're running on and configure accordingly.
> Like SMBIOS magic. Or subsystem type on the PCI devices. It's
> scary to have to insert by hand.
> 
> Am I guessing right when assuming this is not really a slot-in card
> but more a, say, internal bridge om some misc machines?
> 
> I'm just thinking that somewhere on this machine there must be some
> entity that knows what we're running on.

This driver was developed and tested on a Axiomtek CAPA800 board, which
has a GPIO header allowing easy access to these Super I/O pins.
A manual for this board specifically documents poking at hardcoded SIO
I/O space as the way to read / write them, so yes, it is very much a
1990's style of operation.
BTW: DMI data for this board is full of "To Be Filled By O.E.M." as far
as I can see.

See also my answer under the next paragraph below.

>> So the only practical way is to have an user tell us which GPIO devices
>> he wants to use and then enable only these (maybe stealing pins from
>> other devices).
> 
> That is fine, it's the system as a whole that counts.
> 
> Maybe I misunderstand the usecase: is this an off-the-shelf product
> support thing (like, to get LEDs on the case of a laptop or read magic
> switches on a workstation or so) or is it intended for random hacking and
> special-purpose machines?

Well, currently this driver has just a one "hard" user of CAPA800 board
(where the GPIO signals themselves are a part of the platform but it is
left unspecified what is their intended use).
But since SIO GPIO lines are often used internally to implement switching
of various motherboard functions (I know motherboards with other SIO
models use them for LED control, main / backup BIOS swapping, etc.) I
think random hacking and reverse engineering of motherboard signals is
also a sensible use case for this driver.

For things like LED control and input switches reading, where these
are a permanent part of some off-the-shelf platform, I think a specific
driver that registers with LED and input layer should be used instead.

>>>> +   else
>>>> +   pr_notice(WB_GPIO_DRIVER_NAME ": GPIO%u pins are %s\n", 
>>>> idx + 1,
>>>> + winbond_sio_reg_btest(base, output_reg,
>>>> +   output_pp_bit) ?
>>>> + "push-pull" :
>>>> + "open drain");
>>>
>>> If your hardware supports open drain then implement .set_config() for
>>> it in the gpio_chip so you can use the hardware open drain with the driver.
>>
>> The problem here is that the device doesn't support per-pin
>> output driver configuration, only per-GPIO device (8 pins)
>> configuration.
> 
> Hm. That is well supported by pin control while GPIO does not support
> it since it is line-oriented. Pin control on the other hand, supports
> group-wise multiplexing and config.
> 
> I don't know how much you looked into pin control. It might be a bit
> heavy for an ISA-type device as this appears to be.

Yes, I think pin control is more a thing for complicated SoCs than for
a few Super I/O chip signals (especially that nothing on any x86
platform that I have seems to make use of it).
Besides, it would need adding a support for requesting the proper pins
on boards with this chip to, for example, the 8250 serial driver.

>>> winbond_gpio_configure_uartb()?
>>
>> UARTB is a device that the first GPIO device conflicts with.
>> Should it give a name to this GPIO device configuration

Re: [PATCH v3] gpio: winbond: add driver

2018-01-03 Thread Maciej S. Szmigiero
On 03.01.2018 20:05, Andy Shevchenko wrote:
> On Sat, 2017-12-30 at 22:02 +0100, Maciej S. Szmigiero wrote:
>> This commit adds GPIO driver for Winbond Super I/Os.
>>
>> Currently, only W83627UHG model (also known as Nuvoton NCT6627UD) is
>> supported but in the future a support for other Winbond models, too,
>> can
>> be added to the driver.
>>
>> A module parameter "gpios" sets a bitmask of GPIO ports to enable (bit
>> 0 is
>> GPIO1, bit 1 is GPIO2, etc.).
>> One should be careful which ports one tinkers with since some might be
>> managed by the firmware (for functions like powering on and off,
>> sleeping,
>> BIOS recovery, etc.) and some of GPIO port pins are physically shared
>> with
>> other devices included in the Super I/O chip.
>>
> 
> Thanks for an update.
> My comments below.
> 
> First of all, looking more at this driver, why don't we create a
> gpiochip per real "port" during actual configuration?

Hmm.. there is only a one 'chip' here, so why would the driver want to
register multiple ones?

That would also create at least one additional point of failure if
one or more such gpiochip(s) register but one fails to do so.

> And I still have filing that this one suitable for MFD.

As I wrote previously, that would necessitate rewriting also w83627ehf
hwmon and w83627hf_wdt drivers, and would make the driver stand out
against other, similar Super I/O drivers.

> Anyone, does it make sense?
> 
>> +#define WB_SIO_REG_G1MF_G2PP7
>> +#define WB_SIO_REG_G1MF_G1PP6
> 
> Forgot to swap.

Good catch, will swap.

> 
>> +#define wb_sio_notice(...) pr_notice(WB_GPIO_DRIVER_NAME ": "
>> __VA_ARGS__)
>> +#define wb_sio_warn(...) pr_warn(WB_GPIO_DRIVER_NAME ": "
>> __VA_ARGS__)
>> +#define wb_sio_err(...) pr_err(WB_GPIO_DRIVER_NAME ": " __VA_ARGS__)
> 
> What I meant is to
> 
> #define pr_fmt(x) ...
> 
> Look at the kernel sources, there are a lot of examples.

I see now what you meant, will do it then.

>> +/* returns whether changing a pin is allowed */
>> +static bool winbond_gpio_get_info(unsigned int *gpio_num,
>> +  const struct winbond_gpio_info
>> **info)
>> +{
>> +bool allow_changing = true;
>> +unsigned long i;
>> +
>> +for_each_set_bit(i, , sizeof(gpios)) {
>> +if (*gpio_num < 8)
>> +break;
>> +
>> +*gpio_num -= 8;
>> +}
> 
> Why not hweight() here?
> 
> unsigned int shift = hweight_long(gpios) * 8;
> unsigned int index = fls_long(gpios); // AFAIU
> 
> *offset -= *offset >= shift ? shift : shift - 8;
> *info = _gpio_infos[index];
> 
> ...

Unfortunately, this code doesn't produce the same results as the code
above.

First, in this code "index" does not depend on "gpio_num" (or "offset")
passed to winbond_gpio_get_info() function, so gpio 0 (on the first GPIO
device or port) will access the same winbond_gpio_infos entry as gpio 18
(which is located on the third GPIO port).
In fact, the calculated "index" would always point to the last enabled
GPIO port (since that is the meaning of "gpios" MSB, assuming this
user-provided parameter was properly verified or sanitized).

Second, the calculated "offset" would end negative for anything but the
very last GPIO port (ok, not really negative since it is an unsigned type,
but still not correct either).
And that even not taking into account the special case of GPIO6 port
that has only 5 gpios.

What we want in this code is for "i" (or "index") to contain the GPIO
port number for the passed "gpio_num" (or "offset") and that this
last variable ends reduced modulo 8 from its original value.

>> +
>> +*info = _gpio_infos[i];
>> +
>> +/*
>> + * GPIO2 (the second port) shares some pins with a basic PC
>> + * functionality, which is very likely controlled by the
>> firmware.
>> + * Don't allow changing these pins by default.
>> + */
>> +if (i == 1) {
>> +if (*gpio_num == 0 && !pledgpio)
>> +allow_changing = false;
>> +else if (*gpio_num == 1 && !beepgpio)
>> +allow_changing = false;
>> +else if ((*gpio_num == 5 || *gpio_num == 6) &&
>> !i2cgpio)
>> +allow_changing = false;
>> +}
>> +
>> +return allow_changing;
>> +}
> 
>> +static int winbond_gpio_configure(unsigned long base)
>> +{
>> +   

Re: [PATCH v1 05/15] ASoC: fsl_ssi: Clean up helper functions of trigger()

2018-01-01 Thread Maciej S. Szmigiero
On 19.12.2017 18:00, Nicolin Chen wrote:
> The trigger() calls fsl_ssi_tx_config() and fsl_ssi_rx_config(),
> and both of them jump to fsl_ssi_config(). And fsl_ssi_config()
> later calls another fsl_ssi_rxtx_config().
> 
> However, the whole routine, especially fsl_ssi_config() function,
> is too complicated because of the folowing reasons:
> 1) It has to handle the concern of the opposite stream.
> 2) It has to handle cases of offline configurations support.
> 3) It has to handle enable and disable operations while they're
>mostly different.
> 
> Since the enable and disable routines have more differences than
> TX and RX rountines, this patch simplifies these helper functions
> with the following changes:
> - Changing to two helper functions of enable and disable instead
>   of TX and RX.
> - Removing fsl_ssi_rxtx_config() by separately integrating it to
>   two newly introduced enable & disable functions.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  sound/soc/fsl/fsl_ssi.c | 254 
> +++-
>  1 file changed, 119 insertions(+), 135 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index 0f09caf..d8c8656 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -378,31 +378,81 @@ static irqreturn_t fsl_ssi_isr(int irq, void *dev_id)
>  }
>  
>  /**
> - * Enable or disable all rx/tx config flags at once
> + * Set SCR, SIER, STCR and SRCR registers with cached values in regvals
> + *
> + * Notes:
> + * 1) For offline_config SoCs, enable all necessary bits of both streams
> + *when 1st stream starts, even if the opposite stream will not start
> + * 2) It also clears FIFO before setting regvals; SOR is safe to set online
>   */
> -static void fsl_ssi_rxtx_config(struct fsl_ssi *ssi, bool enable)
> +static void fsl_ssi_config_enable(struct fsl_ssi *ssi, bool tx)
>  {
> - struct regmap *regs = ssi->regs;
>   struct fsl_ssi_regvals *vals = ssi->regvals;
> + u32 sier, srcr, stcr;
>  
> - if (enable) {
> - regmap_update_bits(regs, REG_SSI_SIER,
> -vals[RX].sier | vals[TX].sier,
> -vals[RX].sier | vals[TX].sier);
> - regmap_update_bits(regs, REG_SSI_SRCR,
> -vals[RX].srcr | vals[TX].srcr,
> -vals[RX].srcr | vals[TX].srcr);
> - regmap_update_bits(regs, REG_SSI_STCR,
> -vals[RX].stcr | vals[TX].stcr,
> -vals[RX].stcr | vals[TX].stcr);
> + /* Clear dirty data in the FIFO; It also prevents channel slipping */
> + regmap_update_bits(ssi->regs, REG_SSI_SOR,
> +SSI_SOR_xX_CLR(tx), SSI_SOR_xX_CLR(tx));
> +
> + /*
> +  * On offline_config SoCs, SxCR and SIER are already configured when
> +  * the previous stream started. So skip all SxCR and SIER settings
> +  * to prevent online reconfigurations, then jump to set SCR directly
> +  */
> + if (ssi->soc->offline_config && ssi->streams)
> + goto enable_scr;
> +
> + if (ssi->soc->offline_config) {
> + /*
> +  * Online reconfiguration not supported, so enable all bits for
> +  * both streams at once to avoid necessity of reconfigurations
> +  */
> + srcr = vals[RX].srcr | vals[TX].srcr;
> + stcr = vals[RX].stcr | vals[TX].stcr;
> + sier = vals[RX].sier | vals[TX].sier;
>   } else {
> - regmap_update_bits(regs, REG_SSI_SRCR,
> -vals[RX].srcr | vals[TX].srcr, 0);
> - regmap_update_bits(regs, REG_SSI_STCR,
> -vals[RX].stcr | vals[TX].stcr, 0);
> - regmap_update_bits(regs, REG_SSI_SIER,
> -vals[RX].sier | vals[TX].sier, 0);
> + /* Otherwise, only set bits for the current stream */
> + srcr = vals[tx].srcr;
> + stcr = vals[tx].stcr;
> + sier = vals[tx].sier;

Implicit assumption here that RX == 0, TX == 1, as in the 03 patch.

>   }
> +
> + /* Configure SRCR, STCR and SIER at once */
> + regmap_update_bits(ssi->regs, REG_SSI_SRCR, srcr, srcr);
> + regmap_update_bits(ssi->regs, REG_SSI_STCR, stcr, stcr);
> + regmap_update_bits(ssi->regs, REG_SSI_SIER, sier, sier);
> +
> +enable_scr:
> + /*
> +  * Start DMA before setting TE to avoid FIFO underrun
> +  * which may cause a channel slip or a channel swap
> +  *
> +  * TODO: FIQ cases might also need this upon testing
> +  */
> + if (ssi->use_dma && tx) {
> + int try = 100;
> + u32 sfcsr;
> +
> + /* Enable SSI first to send TX DMA request */
> + regmap_update_bits(ssi->regs, REG_SSI_SCR,
> +SSI_SCR_SSIEN, 

Re: [PATCH v1 03/15] ASoC: fsl_ssi: Rename fsl_ssi_disable_val macro

2018-01-01 Thread Maciej S. Szmigiero
On 19.12.2017 18:00, Nicolin Chen wrote:
> The define of fsl_ssi_disable_val is not so clear as it mixes two
> steps of calculations together. And those parameter names are also
> a bit long to read.
> 
> Since it just tries to exclude the shared bits from the regvals of
> current stream while the opposite stream is active, it's better to
> use something like ssi_excl_shared_bits.
> 
> This patch also bisects fsl_ssi_disable_val into two macros of two
> corresponding steps and then shortens its parameter names. It also
> updates callers in the fsl_ssi_config() accordingly.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  sound/soc/fsl/fsl_ssi.c | 54 
> -
>  1 file changed, 22 insertions(+), 32 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index f05f78d..6dda2e0 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c

> @@ -445,16 +445,10 @@ static void fsl_ssi_config(struct fsl_ssi *ssi, bool 
> enable,
>   bool tx = >regvals[TX] == vals;
>   struct regmap *regs = ssi->regs;
>   struct fsl_ssi_regvals *avals;
> - int nr_active_streams;
> - int keep_active;
> -
> - nr_active_streams = !!(ssi->streams & BIT(TX)) +
> - !!(ssi->streams & BIT(RX));
> + bool aactive;
>  
> - if (nr_active_streams - 1 > 0)
> - keep_active = 1;
> - else
> - keep_active = 0;
> + /* Check if the opposite stream is active */
> + aactive = ssi->streams & BIT(!tx);

I don't think that hardcoding an implicit assumption here that RX == 0,
TX == 1 is a good thing.
If in the future, for any reason, somebody changes values of these macros
this code will silently break.

I would instead change this line into something like
"aactive = ssi->streams & (tx ? BIT(RX) : BIT(TX));" or similar.

Maciej


Re: [PATCH v1 01/15] ASoC: fsl_ssi: Clean up set_dai_tdm_slot()

2018-01-01 Thread Maciej S. Szmigiero
On 19.12.2017 18:00, Nicolin Chen wrote:
> This patch replaces the register read with ssi->i2s_net for
> simplification. It also removes masking SSIEN from scr value
> since it's handled later by regmap_update_bits() to set this
> scr value back.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  sound/soc/fsl/fsl_ssi.c | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index aecd00f..ed2712b 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -1051,9 +1051,7 @@ static int fsl_ssi_set_dai_tdm_slot(struct snd_soc_dai 
> *dai, u32 tx_mask,
>   }
>  
>   /* The slot number should be >= 2 if using Network mode or I2S mode */
> - regmap_read(regs, REG_SSI_SCR, );
> - val &= SSI_SCR_I2S_MODE_MASK | SSI_SCR_NET;
> - if (val && slots < 2) {
> + if (ssi->i2s_net && slots < 2) {
>   dev_err(dai->dev, "slot number should be >= 2 in I2S or NET\n");
>   return -EINVAL;
>   }

Are you sure that ssi->i2s_net SSI_SCR_I2S_MODE_MASK | SSI_SCR_NET bits
(also known as SSI_SCR_I2S_NET_MASK) zero or non-zero status is always
consistent with that in the SCR register?

I can see that in fsl_ssi_hw_params() these bits in SCR are zeroed in
a one special case and in the second special case they are hardcoded
to SSI_SCR_I2S_MODE_NORMAL | SSI_SCR_NET, in both cases regardless of
what is currently in ssi->i2s_net.

Maciej


Re: [PATCH v1 11/15] ASoC: fsl_ssi: Setup AC97 in dai_probe()

2018-01-01 Thread Maciej S. Szmigiero
On 19.12.2017 18:00, Nicolin Chen wrote:
> AC97 configures some registers earlier to start a communication
> with CODECs, so this patch moves those register settings to the
> dai_probe() as well, along with other register configurations.
> 
> It also applies _fsl_ssi_set_dai_fmt() to AC97 only since other
> formats would be configured via fsl_ssi_set_dai_fmt() directly.
> 
> Meanwhile, this patch adds fsl_ssi_dai_ac97_remove() to cleanup
> some control bits for AC97.
> 
> Signed-off-by: Nicolin Chen 

This patch breaks AC'97 CODEC probing.

Namely, the fsl_ssi DAI probe callback is only called after the AC'97
CODEC probe callback, so when you move SSI AC'97 startup to its DAI
probe callback it won't be done yet when the CODEC is probed (and this
requires a working AC'97 interface to successfully complete).

Maciej


[PATCH] pcmcia: Implement CLKRUN protocol disabling for Ricoh bridges

2018-09-08 Thread Maciej S. Szmigiero
Currently, "disable_clkrun" yenta_socket module parameter is only
implemented for TI CardBus bridges.
Add also an implementation for Ricoh bridges that have the necessary
setting documented in publicly available datasheets.

Tested on a RL5C476II with a Sunrich C-160 CardBus NIC that doesn't work
correctly unless the CLKRUN protocol is disabled.

Let's also make it clear in its description that the "disable_clkrun"
module parameter only works on these two previously mentioned brands of
CardBus bridges.

Signed-off-by: Maciej S. Szmigiero 
---
 drivers/pcmcia/ricoh.h| 35 +++
 drivers/pcmcia/yenta_socket.c |  3 ++-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/pcmcia/ricoh.h b/drivers/pcmcia/ricoh.h
index 01098c841f87..8ac7b138c094 100644
--- a/drivers/pcmcia/ricoh.h
+++ b/drivers/pcmcia/ricoh.h
@@ -119,6 +119,10 @@
 #define  RL5C4XX_MISC_CONTROL   0x2F /* 8 bit */
 #define  RL5C4XX_ZV_ENABLE  0x08
 
+/* Misc Control 3 Register */
+#define RL5C4XX_MISC3  0x00A2 /* 16 bit */
+#define  RL5C47X_MISC3_CB_CLKRUN_DIS   BIT(1)
+
 #ifdef __YENTA_H
 
 #define rl_misc(socket)((socket)->private[0])
@@ -156,6 +160,35 @@ static void ricoh_set_zv(struct yenta_socket *socket)
 }
 }
 
+static void ricoh_set_clkrun(struct yenta_socket *socket, bool quiet)
+{
+   u16 misc3;
+
+   /*
+* RL5C475II likely has this setting, too, however no datasheet
+* is publicly available for this chip
+*/
+   if (socket->dev->device != PCI_DEVICE_ID_RICOH_RL5C476 &&
+   socket->dev->device != PCI_DEVICE_ID_RICOH_RL5C478)
+   return;
+
+   if (socket->dev->revision < 0x80)
+   return;
+
+   misc3 = config_readw(socket, RL5C4XX_MISC3);
+   if (misc3 & RL5C47X_MISC3_CB_CLKRUN_DIS) {
+   if (!quiet)
+   dev_dbg(>dev->dev,
+   "CLKRUN feature already disabled\n");
+   } else if (disable_clkrun) {
+   if (!quiet)
+   dev_info(>dev->dev,
+"Disabling CLKRUN feature\n");
+   misc3 |= RL5C47X_MISC3_CB_CLKRUN_DIS;
+   config_writew(socket, RL5C4XX_MISC3, misc3);
+   }
+}
+
 static void ricoh_save_state(struct yenta_socket *socket)
 {
rl_misc(socket) = config_readw(socket, RL5C4XX_MISC);
@@ -172,6 +205,7 @@ static void ricoh_restore_state(struct yenta_socket *socket)
config_writew(socket, RL5C4XX_16BIT_IO_0, rl_io(socket));
config_writew(socket, RL5C4XX_16BIT_MEM_0, rl_mem(socket));
config_writew(socket, RL5C4XX_CONFIG, rl_config(socket));
+   ricoh_set_clkrun(socket, true);
 }
 
 
@@ -197,6 +231,7 @@ static int ricoh_override(struct yenta_socket *socket)
config_writew(socket, RL5C4XX_CONFIG, config);
 
ricoh_set_zv(socket);
+   ricoh_set_clkrun(socket, false);
 
return 0;
 }
diff --git a/drivers/pcmcia/yenta_socket.c b/drivers/pcmcia/yenta_socket.c
index ab3da2262f0f..ac6a3f46b1e6 100644
--- a/drivers/pcmcia/yenta_socket.c
+++ b/drivers/pcmcia/yenta_socket.c
@@ -26,7 +26,8 @@
 
 static bool disable_clkrun;
 module_param(disable_clkrun, bool, 0444);
-MODULE_PARM_DESC(disable_clkrun, "If PC card doesn't function properly, please 
try this option");
+MODULE_PARM_DESC(disable_clkrun,
+"If PC card doesn't function properly, please try this option 
(TI and Ricoh bridges only)");
 
 static bool isa_probe = 1;
 module_param(isa_probe, bool, 0444);
-- 
2.17.0



[PATCH 5/5] ALSA: emu10k1: add a IOMMU workaround

2018-01-21 Thread Maciej S. Szmigiero
The Audigy 2 CA0102 chip (but most likely others from the emu10k1 family,
too) has a problem that from time to time it likes to do few DMA reads a
bit beyond its normal allocation and gets very confused if these reads get
blocked by a IOMMU.

For the first (reserved) page this happens multiple times at every
playback, for various synth pages it happens randomly, rarely for the page
table memory itself.
All these reads seem to follow a similar pattern, observed read offsets
beyond the allocation end were 0x00, 0x40, 0x80 and 0xc0 (PCI cache line
multiples), so it looks like the device tries to accesses up to 256 extra
bytes.

As a workaround let's widen these DMA allocations by an extra page if we
detect that the device is behind a non-passthrough IOMMU (the DMA memory
should be relatively plenty on IOMMU systems).

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 include/sound/emu10k1.h  |  1 +
 sound/pci/emu10k1/emu10k1_main.c | 49 +---
 sound/pci/emu10k1/memory.c   | 16 ++---
 3 files changed, 60 insertions(+), 6 deletions(-)

diff --git a/include/sound/emu10k1.h b/include/sound/emu10k1.h
index db32b7de52e0..ba27abf65408 100644
--- a/include/sound/emu10k1.h
+++ b/include/sound/emu10k1.h
@@ -1710,6 +1710,7 @@ struct snd_emu10k1 {
unsigned int ecard_ctrl;/* ecard control bits */
unsigned int address_mode;  /* address mode */
unsigned long dma_mask; /* PCI DMA mask */
+   bool iommu_workaround;  /* IOMMU workaround needed */
unsigned int delay_pcm_irq; /* in samples */
int max_cache_pages;/* max memory size / PAGE_SIZE 
*/
struct snd_dma_buffer silent_page;  /* silent page */
diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index 8decd2a7a404..3d1b57546379 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1758,6 +1759,37 @@ static struct snd_emu_chip_details emu_chip_details[] = {
{ } /* terminator */
 };
 
+/*
+ * The chip (at least the Audigy 2 CA0102 chip, but most likely others, too)
+ * has a problem that from time to time it likes to do few DMA reads a bit
+ * beyond its normal allocation and gets very confused if these reads get
+ * blocked by a IOMMU.
+ *
+ * This behaviour has been observed for the first (reserved) page
+ * (for which it happens multiple times at every playback), often for various
+ * synth pages and sometimes for the page table memory itself.
+ *
+ * As a workaround let's widen these DMA allocations by an extra page if we
+ * detect that the device is behind a non-passthrough IOMMU.
+ */
+static void snd_emu10k1_detect_iommu(struct snd_emu10k1 *emu)
+{
+   struct iommu_domain *domain;
+
+   emu->iommu_workaround = false;
+
+   if (!iommu_present(emu->card->dev->bus))
+   return;
+
+   domain = iommu_get_domain_for_dev(emu->card->dev);
+   if (domain && domain->type == IOMMU_DOMAIN_IDENTITY)
+   return;
+
+   dev_notice(emu->card->dev,
+  "non-passthrough IOMMU detected, widening DMA allocations");
+   emu->iommu_workaround = true;
+}
+
 int snd_emu10k1_create(struct snd_card *card,
   struct pci_dev *pci,
   unsigned short extin_mask,
@@ -1770,6 +1802,7 @@ int snd_emu10k1_create(struct snd_card *card,
struct snd_emu10k1 *emu;
int idx, err;
int is_audigy;
+   size_t page_table_size, silent_page_size;
unsigned int silent_page;
const struct snd_emu_chip_details *c;
static struct snd_device_ops ops = {
@@ -1867,6 +1900,8 @@ int snd_emu10k1_create(struct snd_card *card,
 
is_audigy = emu->audigy = c->emu10k2_chip;
 
+   snd_emu10k1_detect_iommu(emu);
+
/* set addressing mode */
emu->address_mode = is_audigy ? 0 : 1;
/* set the DMA transfer mask */
@@ -1893,8 +1928,13 @@ int snd_emu10k1_create(struct snd_card *card,
emu->port = pci_resource_start(pci, 0);
 
emu->max_cache_pages = max_cache_bytes >> PAGE_SHIFT;
+
+   page_table_size = sizeof(u32) * (emu->address_mode ? MAXPAGES1 :
+MAXPAGES0);
+   if (emu->iommu_workaround)
+   page_table_size += PAGE_SIZE;
if (snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV, snd_dma_pci_data(pci),
-   (emu->address_mode ? 32 : 16) * 1024, 
>ptb_pages) < 0) {
+   page_table_size, >ptb_pages) < 0) {
err = -ENOMEM;
goto error;
}
@@ -1910,8 +1950,11 @@ int snd_emu10k1_create(struct snd_card *card,

[PATCH 1/5] ALSA: emu10k1: remove reserved_page

2018-01-21 Thread Maciej S. Szmigiero
The emu10k1-family chips need the first page (index 0) reserved in their
page tables for some reason (every emu10k1 driver I've checked does this
without much of an explanation).
Using the first page for normal samples results in a broken playback.

However, we already have a dummy page allocated - so called "silent page"
and, in fact, had always been setting it as the first page in the chip page
table because an initialization of every entry of the page table to point
to a silent page happens after and overwrites the reserved_page allocation.

So the only thing remaining to remove the reserved_page allocation is a
trivial change to the page allocation logic to ignore the first page entry
and start its allocations from the second entry (index 1).

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 include/sound/emu10k1.h  |  1 -
 sound/pci/emu10k1/emu10k1_main.c | 11 ---
 sound/pci/emu10k1/memory.c   |  8 ++--
 3 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/include/sound/emu10k1.h b/include/sound/emu10k1.h
index 4f42affe777c..db32b7de52e0 100644
--- a/include/sound/emu10k1.h
+++ b/include/sound/emu10k1.h
@@ -1718,7 +1718,6 @@ struct snd_emu10k1 {
struct snd_dma_buffer p16v_buffer;
 
struct snd_util_memhdr *memhdr; /* page allocation list */
-   struct snd_emu10k1_memblk *reserved_page;   /* reserved page */
 
struct list_head mapped_link_head;
struct list_head mapped_order_link_head;
diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index ccf4415a1c7b..a0a4d31ded53 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -1272,12 +1272,6 @@ static int snd_emu10k1_free(struct snd_emu10k1 *emu)
release_firmware(emu->dock_fw);
if (emu->irq >= 0)
free_irq(emu->irq, emu);
-   /* remove reserved page */
-   if (emu->reserved_page) {
-   snd_emu10k1_synth_free(emu,
-   (struct snd_util_memblk *)emu->reserved_page);
-   emu->reserved_page = NULL;
-   }
snd_util_memhdr_free(emu->memhdr);
if (emu->silent_page.area)
snd_dma_free_pages(>silent_page);
@@ -1993,11 +1987,6 @@ int snd_emu10k1_create(struct snd_card *card,
SPCS_GENERATIONSTATUS | 0x1200 |
0x | SPCS_EMPHASIS_NONE | SPCS_COPYRIGHT;
 
-   emu->reserved_page = (struct snd_emu10k1_memblk *)
-   snd_emu10k1_synth_alloc(emu, 4096);
-   if (emu->reserved_page)
-   emu->reserved_page->map_locked = 1;
-
/* Clear silent pages and set up pointers */
memset(emu->silent_page.area, 0, PAGE_SIZE);
silent_page = emu->silent_page.addr << emu->address_mode;
diff --git a/sound/pci/emu10k1/memory.c b/sound/pci/emu10k1/memory.c
index 4f1f69be1865..eaa61024ac7f 100644
--- a/sound/pci/emu10k1/memory.c
+++ b/sound/pci/emu10k1/memory.c
@@ -102,7 +102,7 @@ static void emu10k1_memblk_init(struct snd_emu10k1_memblk 
*blk)
  */
 static int search_empty_map_area(struct snd_emu10k1 *emu, int npages, struct 
list_head **nextp)
 {
-   int page = 0, found_page = -ENOMEM;
+   int page = 1, found_page = -ENOMEM;
int max_size = npages;
int size;
struct list_head *candidate = >mapped_link_head;
@@ -147,6 +147,10 @@ static int map_memblk(struct snd_emu10k1 *emu, struct 
snd_emu10k1_memblk *blk)
page = search_empty_map_area(emu, blk->pages, );
if (page < 0) /* not found */
return page;
+   if (page == 0) {
+   dev_err(emu->card->dev, "trying to map zero (reserved) page\n");
+   return -EINVAL;
+   }
/* insert this block in the proper position of mapped list */
list_add_tail(>mapped_link, next);
/* append this as a newest block in order list */
@@ -177,7 +181,7 @@ static int unmap_memblk(struct snd_emu10k1 *emu, struct 
snd_emu10k1_memblk *blk)
q = get_emu10k1_memblk(p, mapped_link);
start_page = q->mapped_page + q->pages;
} else
-   start_page = 0;
+   start_page = 1;
if ((p = blk->mapped_link.next) != >mapped_link_head) {
q = get_emu10k1_memblk(p, mapped_link);
end_page = q->mapped_page;


[PATCH 3/5] ALSA: emu10k1: add optional debug printouts with DMA addresses

2018-01-21 Thread Maciej S. Szmigiero
When we get a IOMMU page fault for a emu10k1 device it is very hard to
discover which of chip many DMA allocations triggered it (since on a IOMMU
system the DMA address space is often very different from the CPU one).
Let's add optional debug printouts providing this information.

These debug printouts are only enabled on an explicit request via the
kernel dynamic debug mechanism.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 sound/pci/emu10k1/emu10k1_main.c |  8 
 sound/pci/emu10k1/memory.c   | 11 +--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index 11e868f569d6..8decd2a7a404 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -1898,6 +1898,9 @@ int snd_emu10k1_create(struct snd_card *card,
err = -ENOMEM;
goto error;
}
+   dev_dbg(card->dev, "page table address range is %.8lx:%.8lx\n",
+   (unsigned long)emu->ptb_pages.addr,
+   (unsigned long)(emu->ptb_pages.addr + emu->ptb_pages.bytes));
 
emu->page_ptr_table = vmalloc(emu->max_cache_pages * sizeof(void *));
emu->page_addr_table = vmalloc(emu->max_cache_pages *
@@ -1912,6 +1915,11 @@ int snd_emu10k1_create(struct snd_card *card,
err = -ENOMEM;
goto error;
}
+   dev_dbg(card->dev, "silent page range is %.8lx:%.8lx\n",
+   (unsigned long)emu->silent_page.addr,
+   (unsigned long)(emu->silent_page.addr +
+   emu->silent_page.bytes));
+
emu->memhdr = snd_util_memhdr_new(emu->max_cache_pages * PAGE_SIZE);
if (emu->memhdr == NULL) {
err = -ENOMEM;
diff --git a/sound/pci/emu10k1/memory.c b/sound/pci/emu10k1/memory.c
index eaa61024ac7f..39afe199e1c7 100644
--- a/sound/pci/emu10k1/memory.c
+++ b/sound/pci/emu10k1/memory.c
@@ -35,6 +35,8 @@
  */
 #define __set_ptb_entry(emu,page,addr) \
(((u32 *)(emu)->ptb_pages.area)[page] = cpu_to_le32(((addr) << 
(emu->address_mode)) | (page)))
+#define __get_ptb_entry(emu, page) \
+   (le32_to_cpu(((u32 *)(emu)->ptb_pages.area)[page]))
 
 #define UNIT_PAGES (PAGE_SIZE / EMUPAGESIZE)
 #define MAX_ALIGN_PAGES0   (MAXPAGES0 / UNIT_PAGES)
@@ -44,7 +46,7 @@
 /* get offset address from aligned page */
 #define aligned_page_offset(page)  ((page) << PAGE_SHIFT)
 
-#if PAGE_SIZE == 4096
+#if PAGE_SIZE == 4096 && !IS_ENABLED(CONFIG_DYNAMIC_DEBUG)
 /* page size == EMUPAGESIZE */
 /* fill PTB entrie(s) corresponding to page with addr */
 #define set_ptb_entry(emu,page,addr)   __set_ptb_entry(emu,page,addr)
@@ -58,6 +60,8 @@ static inline void set_ptb_entry(struct snd_emu10k1 *emu, int 
page, dma_addr_t a
page *= UNIT_PAGES;
for (i = 0; i < UNIT_PAGES; i++, page++) {
__set_ptb_entry(emu, page, addr);
+   dev_dbg(emu->card->dev, "mapped page %d to entry %.8x\n", page,
+   (unsigned int)__get_ptb_entry(emu, page));
addr += EMUPAGESIZE;
}
 }
@@ -65,9 +69,12 @@ static inline void set_silent_ptb(struct snd_emu10k1 *emu, 
int page)
 {
int i;
page *= UNIT_PAGES;
-   for (i = 0; i < UNIT_PAGES; i++, page++)
+   for (i = 0; i < UNIT_PAGES; i++, page++) {
/* do not increment ptr */
__set_ptb_entry(emu, page, emu->silent_page.addr);
+   dev_dbg(emu->card->dev, "mapped silent page %d to entry %.8x\n",
+   page, (unsigned int)__get_ptb_entry(emu, page));
+   }
 }
 #endif /* PAGE_SIZE */
 


[PATCH 4/5] ALSA: emu10k1: make sure synth DMA pages are allocated with DMA functions

2018-01-21 Thread Maciej S. Szmigiero
Commit a5003fc04113 ("[ALSA] emu10k1 - simplify page allocation for synth")
switched from using the DMA allocator for synth DMA pages to manually
calling alloc_page().
However, this usage has an implicit assumption that the DMA address space
for the emu10k1-family chip is the same as the CPU physical address space
which is not true for a system with a IOMMU.

Since this made the synth part of the driver non-functional on such systems
let's effectively revert that commit.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 sound/pci/emu10k1/memory.c | 70 ++
 1 file changed, 39 insertions(+), 31 deletions(-)

diff --git a/sound/pci/emu10k1/memory.c b/sound/pci/emu10k1/memory.c
index 39afe199e1c7..a23eba540132 100644
--- a/sound/pci/emu10k1/memory.c
+++ b/sound/pci/emu10k1/memory.c
@@ -457,49 +457,44 @@ static void get_single_page_range(struct snd_util_memhdr 
*hdr,
*last_page_ret = last_page;
 }
 
-/* release allocated pages */
-static void __synth_free_pages(struct snd_emu10k1 *emu, int first_page,
-  int last_page)
-{
-   int page;
-
-   for (page = first_page; page <= last_page; page++) {
-   free_page((unsigned long)emu->page_ptr_table[page]);
-   emu->page_addr_table[page] = 0;
-   emu->page_ptr_table[page] = NULL;
-   }
-}
-
 /*
  * allocate kernel pages
  */
 static int synth_alloc_pages(struct snd_emu10k1 *emu, struct 
snd_emu10k1_memblk *blk)
 {
int page, first_page, last_page;
+   struct snd_dma_buffer dmab;
 
emu10k1_memblk_init(blk);
get_single_page_range(emu->memhdr, blk, _page, _page);
/* allocate kernel pages */
for (page = first_page; page <= last_page; page++) {
-   /* first try to allocate from <4GB zone */
-   struct page *p = alloc_page(GFP_KERNEL | GFP_DMA32 |
-   __GFP_NOWARN);
-   if (!p || (page_to_pfn(p) & ~(emu->dma_mask >> PAGE_SHIFT))) {
-   if (p)
-   __free_page(p);
-   /* try to allocate from <16MB zone */
-   p = alloc_page(GFP_ATOMIC | GFP_DMA |
-  __GFP_NORETRY | /* no OOM-killer */
-  __GFP_NOWARN);
-   }
-   if (!p) {
-   __synth_free_pages(emu, first_page, page - 1);
-   return -ENOMEM;
+   if (snd_dma_alloc_pages(SNDRV_DMA_TYPE_DEV,
+   snd_dma_pci_data(emu->pci),
+   PAGE_SIZE, ) < 0)
+   goto __fail;
+   if (!is_valid_page(emu, dmab.addr)) {
+   snd_dma_free_pages();
+   goto __fail;
}
-   emu->page_addr_table[page] = page_to_phys(p);
-   emu->page_ptr_table[page] = page_address(p);
+   emu->page_addr_table[page] = dmab.addr;
+   emu->page_ptr_table[page] = dmab.area;
}
return 0;
+
+__fail:
+   /* release allocated pages */
+   last_page = page - 1;
+   for (page = first_page; page <= last_page; page++) {
+   dmab.area = emu->page_ptr_table[page];
+   dmab.addr = emu->page_addr_table[page];
+   dmab.bytes = PAGE_SIZE;
+   snd_dma_free_pages();
+   emu->page_addr_table[page] = 0;
+   emu->page_ptr_table[page] = NULL;
+   }
+
+   return -ENOMEM;
 }
 
 /*
@@ -507,10 +502,23 @@ static int synth_alloc_pages(struct snd_emu10k1 *emu, 
struct snd_emu10k1_memblk
  */
 static int synth_free_pages(struct snd_emu10k1 *emu, struct snd_emu10k1_memblk 
*blk)
 {
-   int first_page, last_page;
+   int page, first_page, last_page;
+   struct snd_dma_buffer dmab;
 
get_single_page_range(emu->memhdr, blk, _page, _page);
-   __synth_free_pages(emu, first_page, last_page);
+   dmab.dev.type = SNDRV_DMA_TYPE_DEV;
+   dmab.dev.dev = snd_dma_pci_data(emu->pci);
+   for (page = first_page; page <= last_page; page++) {
+   if (emu->page_ptr_table[page] == NULL)
+   continue;
+   dmab.area = emu->page_ptr_table[page];
+   dmab.addr = emu->page_addr_table[page];
+   dmab.bytes = PAGE_SIZE;
+   snd_dma_free_pages();
+   emu->page_addr_table[page] = 0;
+   emu->page_ptr_table[page] = NULL;
+   }
+
return 0;
 }
 


[PATCH 2/5] ALSA: emu10k1: use dma_set_mask_and_coherent()

2018-01-21 Thread Maciej S. Szmigiero
We have been calling dma_set_mask() and then dma_set_coherent_mask() with
the same value, but there is a dma_set_mask_and_coherent() function that
does exactly that so let's use it instead.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 sound/pci/emu10k1/emu10k1_main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c
index a0a4d31ded53..11e868f569d6 100644
--- a/sound/pci/emu10k1/emu10k1_main.c
+++ b/sound/pci/emu10k1/emu10k1_main.c
@@ -1871,8 +1871,7 @@ int snd_emu10k1_create(struct snd_card *card,
emu->address_mode = is_audigy ? 0 : 1;
/* set the DMA transfer mask */
emu->dma_mask = emu->address_mode ? EMU10K1_DMA_MASK : AUDIGY_DMA_MASK;
-   if (dma_set_mask(>dev, emu->dma_mask) < 0 ||
-   dma_set_coherent_mask(>dev, emu->dma_mask) < 0) {
+   if (dma_set_mask_and_coherent(>dev, emu->dma_mask) < 0) {
dev_err(card->dev,
"architecture does not support PCI busmaster DMA with 
mask 0x%lx\n",
emu->dma_mask);


Re: [alsa-devel] [PATCH] sound: emu10k1: Replace GFP_ATOMIC with GFP_KERNEL in synth_alloc_pages

2018-01-23 Thread Maciej S. Szmigiero
On 23.01.2018 10:37, Jia-Ju Bai wrote:
> The function synth_alloc_pages is not called in atomic context.
> Thus GFP_ATOMIC is not necessary, and it can be replaced with GFP_KERNEL.
> 
> Signed-off-by: Jia-Ju Bai 
> ---
>  sound/pci/emu10k1/memory.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

I've submitted a patch last Sunday that removes this whole code anyway,
see: "[PATCH 4/5] ALSA: emu10k1: make sure synth DMA pages are allocated with 
DMA functions".

Maciej


Re: [PATCH] X.509: unpack RSA signatureValue field from BIT STRING

2018-03-07 Thread Maciej S. Szmigiero
On 07.03.2018 16:44, David Howells wrote:
> Maciej S. Szmigiero <m...@maciej.szmigiero.name> wrote:
> 
>> +if (!strcmp(ctx->cert->sig->pkey_algo, "rsa")) {
> 
> I'm going to change this to '== 0' rather than '!'.

No problem. 
> David
> 

Thanks,
Maciej


[PATCH] X.509: unpack RSA signatureValue field from BIT STRING

2018-03-06 Thread Maciej S. Szmigiero
The signatureValue field of a X.509 certificate is encoded as a BIT STRING.
For RSA signatures this BIT STRING is of so-called primitive subtype, which
contains a u8 prefix indicating a count of unused bits in the encoding.

We have to strip this prefix from signature data, just as we already do for
key data in x509_extract_key_data() function.

This wasn't noticed earlier because this prefix byte is zero for RSA key
sizes divisible by 8. Since BIT STRING is a big-endian encoding adding zero
prefixes has no bearing on its value.

The signature length, however was incorrect, which is a problem for RSA
implementations that need it to be exactly correct (like AMD CCP).

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
Fixes: c26fd69fa009 ("X.509: Add a crypto key parser for binary (DER) X.509 
certificates")
Cc: sta...@vger.kernel.org
---
This is a resend without changes of a patch that was previously
submitted in one series with CCP driver changes since this particular
patch should go through the security (rather than crypto) tree.

 crypto/asymmetric_keys/x509_cert_parser.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c 
b/crypto/asymmetric_keys/x509_cert_parser.c
index ce2df8c9c583..88c26a4538ae 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -249,6 +249,15 @@ int x509_note_signature(void *context, size_t hdrlen,
return -EINVAL;
}
 
+   if (!strcmp(ctx->cert->sig->pkey_algo, "rsa")) {
+   /* Discard the BIT STRING metadata */
+   if (vlen < 1 || *(const u8 *)value != 0)
+   return -EBADMSG;
+
+   value++;
+   vlen--;
+   }
+
ctx->cert->raw_sig = value;
ctx->cert->raw_sig_size = vlen;
return 0;


Re: [PATCH] x86/microcode/AMD: check microcode file sanity before loading it

2018-03-11 Thread Maciej S. Szmigiero
On 11.03.2018 10:59, Ingo Molnar wrote:
> 
> * Maciej S. Szmigiero <m...@maciej.szmigiero.name> wrote:
> 
>> Currently, it is very easy to make the AMD microcode update driver crash
>> or spin on a malformed microcode file since it does very little
>> consistency checking on data loaded from such file.
>>
>> This commit introduces various checks, mostly on length-type fields, so
>> all corrupted microcode files are (hopefully) correctly rejected instead.
>>
>> Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
>> ---
>> Test files are at https://pastebin.com/XkKUSmMp
>> One has to enable KASAN in the kernel config and rename a particular
>> test file to name appropriate to the running CPU family to test its
>> loading.
>>
>>  arch/x86/include/asm/microcode_amd.h |   6 ++
>>  arch/x86/kernel/cpu/microcode/amd.c  | 111 
>> ++-
>>  2 files changed, 89 insertions(+), 28 deletions(-)
> 
> Treating microcode update files as external input and sanity checking the 
> data 
> makes sense I suppose, but there's various small uglies in the patch:
> 
>> +/* if a patch is larger than this the microcode file is surely corrupted */
>> +#define PATCH_MAX_SIZE_ABSOLUTE (16 * 1024 * 1024)
> 
> Please capitalize comments.

Will do.

> 
>>   * Returns the amount of bytes consumed while scanning. @desc contains all 
>> the
>>   * data we're going to use in later stages of the application.
>>   */
>> -static ssize_t parse_container(u8 *ucode, ssize_t size, struct cont_desc 
>> *desc)
>> +static size_t parse_container(u8 *ucode, size_t size, struct cont_desc 
>> *desc)
>>  {
>>  struct equiv_cpu_entry *eq;
>> -ssize_t orig_size = size;
>> +size_t orig_size = size;
>>  u32 *hdr = (u32 *)ucode;
>> +u32 eqsize;
>>  u16 eq_id;
>>  u8 *buf;
> 
> So we have 'eq_id', but 'eqsize'? Why not 'eq_size' to have fewer random 
> variations of coding style?

Will change.
>>  
>> +if (size < CONTAINER_HDR_SZ)
>> +return 0;
> 
> The comment about CONTAINER_HDR_SZ better belongs here, where we use it.

Will move it.
 
>>  /* Am I looking at an equivalence table header? */
>>  if (hdr[0] != UCODE_MAGIC ||
>>  hdr[1] != UCODE_EQUIV_CPU_TABLE_TYPE ||
>>  hdr[2] == 0)
>>  return CONTAINER_HDR_SZ;
>>  
>> +eqsize = hdr[2];
>> +if (eqsize < sizeof(*eq) ||
>> +eqsize > UCODE_EQUIV_CPU_TABLE_MAX_SIZE)
>> +return 0;
>> +
>> +if (size < CONTAINER_HDR_SZ + eqsize)
>> +return 0;
>> +
>>  buf = ucode;
>>  
>>  eq = (struct equiv_cpu_entry *)(buf + CONTAINER_HDR_SZ);
>>  
>>  /* Find the equivalence ID of our CPU in this table: */
>> -eq_id = find_equiv_id(eq, desc->cpuid_1_eax);
>> +eq_id = find_equiv_id(eq, eqsize / sizeof(*eq), desc->cpuid_1_eax);
> 
> Does eq_size have to be a multiple of sizeof(*eq)? If yes then we should 
> probably 
> check that too.

In principle yes, but having garbage at the end of the equivalence
table in a microcode container file is a non-fatal problem.
I have mixed feelings whether we should be really strict here, especially
that the existing driver would accept such files without any error.

>> -static int install_equiv_cpu_table(const u8 *buf)
>> +static int install_equiv_cpu_table(const u8 *buf, size_t bufsize)
> 
> s/bufsize/buf_size

Will change.

>>  {
>>  unsigned int *ibuf = (unsigned int *)buf;
>> -unsigned int type = ibuf[1];
>> -unsigned int size = ibuf[2];
>> +unsigned int type, size;
>>  
>> -if (type != UCODE_EQUIV_CPU_TABLE_TYPE || !size) {
>> -pr_err("empty section/"
>> -   "invalid type field in container file section header\n");
>> +if (bufsize < CONTAINER_HDR_SZ) {
>> +pr_err("no container header\n");
>> +return -EINVAL;
>> +}
>> +
>> +type = ibuf[1];
>> +if (type != UCODE_EQUIV_CPU_TABLE_TYPE) {
>> +pr_err("invalid type field in container file section header\n");
>> +return -EINVAL;
>> +}
>> +
>> +size = ibuf[2];
>> +if (size < sizeof(struct equiv_cpu_entry) ||
>> +size > UCODE_EQUIV_CPU_TABLE_MAX_SIZE) {
>> +pr_err("equivalent CPU table size invalid\n");
>> +return -EINVAL;
>> +}
>> +
>> +if (bufsize < CONTAINER_HDR_SZ + size) {
>> +pr_err("equivalent CPU table truncated\n");
>>  return -EINVAL;
>>  }
>>  
>> @@ -559,6 +599,7 @@ static int install_equiv_cpu_table(const u8 *buf)
>>  }
>>  
>>  memcpy(equiv_cpu_table, buf + CONTAINER_HDR_SZ, size);
>> +equiv_cpu_table_size = size / sizeof(struct equiv_cpu_entry);
> 
> Btw., 'equiv_cpu_table_size' is an ambiguous name: often _size variables are 
> in 
> byte units - but this is number of entries. So the name should be 
> 'equiv_cpu_table_entries' or so.

Will rename.

> Thanks,
> 
>   Ingo
> 

Thanks,
Maciej


[PATCH v2] x86/microcode/AMD: check microcode file sanity before loading it

2018-03-11 Thread Maciej S. Szmigiero
Currently, it is very easy to make the AMD microcode update driver crash
or spin on a malformed microcode file since it does very little
consistency checking on data loaded from such file.

This commit introduces various checks, mostly on length-type fields, so
all corrupted microcode files are (hopefully) correctly rejected instead.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Changes from v1: Capitalize a comment, rename 'eqsize' and 'bufsize'
to 'eq_size' and 'buf_size', respectively, attach a comment about
checking the equivalence table header to its first size check, rename
'equiv{_cpu,}_table_size' to 'equiv{_cpu,}_table_entries'.

Test files are at https://pastebin.com/XkKUSmMp
One has to enable KASAN in the kernel config and rename a particular
test file to name appropriate to the running CPU family to test its
loading.

 arch/x86/include/asm/microcode_amd.h |   6 ++
 arch/x86/kernel/cpu/microcode/amd.c  | 112 ++-
 2 files changed, 90 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/microcode_amd.h 
b/arch/x86/include/asm/microcode_amd.h
index 209492849566..373a202ea569 100644
--- a/arch/x86/include/asm/microcode_amd.h
+++ b/arch/x86/include/asm/microcode_amd.h
@@ -19,6 +19,9 @@ struct equiv_cpu_entry {
u16 res;
 } __attribute__((packed));
 
+/* 4k */
+#define UCODE_EQUIV_CPU_TABLE_MAX_SIZE (256 * sizeof(struct equiv_cpu_entry))
+
 struct microcode_header_amd {
u32 data_code;
u32 patch_id;
@@ -41,6 +44,9 @@ struct microcode_amd {
unsigned intmpb[0];
 };
 
+/* If a patch is larger than this the microcode file is surely corrupted */
+#define PATCH_MAX_SIZE_ABSOLUTE (16 * 1024 * 1024)
+
 #define PATCH_MAX_SIZE PAGE_SIZE
 
 #ifdef CONFIG_MICROCODE_AMD
diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index a998e1a7d46f..1cbccf79ff68 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -39,6 +39,7 @@
 #include 
 
 static struct equiv_cpu_entry *equiv_cpu_table;
+static unsigned int equiv_cpu_table_entries;
 
 /*
  * This points to the current valid container of microcode patches which we 
will
@@ -63,12 +64,18 @@ static u8 amd_ucode_patch[PATCH_MAX_SIZE];
 static const char
 ucode_path[] __maybe_unused = "kernel/x86/microcode/AuthenticAMD.bin";
 
-static u16 find_equiv_id(struct equiv_cpu_entry *equiv_table, u32 sig)
+static u16 find_equiv_id(const struct equiv_cpu_entry *equiv_table,
+unsigned int equiv_table_entries, u32 sig)
 {
-   for (; equiv_table && equiv_table->installed_cpu; equiv_table++) {
-   if (sig == equiv_table->installed_cpu)
-   return equiv_table->equiv_cpu;
-   }
+   unsigned int i;
+
+   if (!equiv_table)
+   return 0;
+
+   for (i = 0; i < equiv_table_entries && equiv_table[i].installed_cpu;
+i++)
+   if (sig == equiv_table[i].installed_cpu)
+   return equiv_table[i].equiv_cpu;
 
return 0;
 }
@@ -80,29 +87,41 @@ static u16 find_equiv_id(struct equiv_cpu_entry 
*equiv_table, u32 sig)
  * Returns the amount of bytes consumed while scanning. @desc contains all the
  * data we're going to use in later stages of the application.
  */
-static ssize_t parse_container(u8 *ucode, ssize_t size, struct cont_desc *desc)
+static size_t parse_container(u8 *ucode, size_t size, struct cont_desc *desc)
 {
struct equiv_cpu_entry *eq;
-   ssize_t orig_size = size;
+   size_t orig_size = size;
u32 *hdr = (u32 *)ucode;
+   u32 eq_size;
u16 eq_id;
u8 *buf;
 
/* Am I looking at an equivalence table header? */
+   if (size < CONTAINER_HDR_SZ)
+   return 0;
+
if (hdr[0] != UCODE_MAGIC ||
hdr[1] != UCODE_EQUIV_CPU_TABLE_TYPE ||
hdr[2] == 0)
return CONTAINER_HDR_SZ;
 
+   eq_size = hdr[2];
+   if (eq_size < sizeof(*eq) ||
+   eq_size > UCODE_EQUIV_CPU_TABLE_MAX_SIZE)
+   return 0;
+
+   if (size < CONTAINER_HDR_SZ + eq_size)
+   return 0;
+
buf = ucode;
 
eq = (struct equiv_cpu_entry *)(buf + CONTAINER_HDR_SZ);
 
/* Find the equivalence ID of our CPU in this table: */
-   eq_id = find_equiv_id(eq, desc->cpuid_1_eax);
+   eq_id = find_equiv_id(eq, eq_size / sizeof(*eq), desc->cpuid_1_eax);
 
-   buf  += hdr[2] + CONTAINER_HDR_SZ;
-   size -= hdr[2] + CONTAINER_HDR_SZ;
+   buf  += eq_size + CONTAINER_HDR_SZ;
+   size -= eq_size + CONTAINER_HDR_SZ;
 
/*
 * Scan through the rest of the container to find where it ends. We do
@@ -112,6 +131,9 @@ static ssize_t parse_container(u8 *ucode, ssize_t size, 
struct cont_desc *desc)
struct microcode_amd *mc;
   

Re: [PATCH v2] x86/microcode/AMD: check microcode file sanity before loading it

2018-03-12 Thread Maciej S. Szmigiero
On 12.03.2018 10:53, Borislav Petkov wrote:
> On Sun, Mar 11, 2018 at 04:27:03PM +0100, Maciej S. Szmigiero wrote:
>> +/* 4k */
>> +#define UCODE_EQUIV_CPU_TABLE_MAX_SIZE (256 * sizeof(struct 
>> equiv_cpu_entry))
> 
> And you came up with that max size how exactly?
The equivalent CPU table is allocated using vmalloc() so it is nice
when the maximum size is an integer multiple of the page size.

Since the maximum entry count in current microcode files is 18 the
maximum size of 256 entries (or one page) gives us plenty of headroom.

Also, looking in the past, there probably won't be more than 256 AMD CPU
types in one CPU family.

> 
>> +/* If a patch is larger than this the microcode file is surely corrupted */
>> +#define PATCH_MAX_SIZE_ABSOLUTE (16 * 1024 * 1024)
> 
> Same question here.
> 

This limit is an absolute upper cap of a patch size.
This number is high so it won't have to be changed in the future, it
only serves as a plausibility check before we consider other data
contained in the particular patch.

Current patches are 4k max size, but since the size field is
32 bit-wide one can theoretically specify sizes up to 4GB.


Since these two maximum sizes are somewhat arbitrary if anybody wants
to propose other values the patch can be updated, naturally.

Maciej


Re: [PATCH v2] x86/microcode/AMD: check microcode file sanity before loading it

2018-03-12 Thread Maciej S. Szmigiero
On 12.03.2018 14:48, Borislav Petkov wrote:
> On Mon, Mar 12, 2018 at 02:32:30PM +0100, Maciej S. Szmigiero wrote:
>> "microcode_amd.bin" in linux-firmware.
> 
> That is the microcode container for all families < 0x15. And it
> *happens* to have 18 entries.
> 
> So purely arbitrary:
> 
> Equivalence table (magic: AMD, type: 0, length: 288 (0x120))
(the long table was cut)
> 
>> There is no problem raising this value in that (future) case.
>> As I wrote previously, currently the maximum used count is 18.
> 
> There is a problem because not everyone can upgrade their kernels
> like you. Distros and big deployments can't just up and update their
> kernels at a whim just because you imposed an arbitrary limit which you
> determined would be ok.

First, this limit is more than 14 times higher than the current maximum
count.

And this current maximum was reached by CPU types added in
families < 15h during last 10+ years (the oldest supported CPU family in
this container is 10h, which - according to Wikipedia - was released
September 10, 2007).

At that rate exceeding this limit will take 130+ additional years (and
this assumes that AMD will introduce new CPU types in these families
at the same rate as in the last 10 years - I sincerely doubt it).

If somebody does not update their kernel for 130 years (even to -stable
versions) then he or she has a much bigger problem than incompatibility
with the current microcode update file.

> 
>> Not really, since even in the existing code CONTAINER_HDR_SZ (12) gets
>> added to this size, then the sum is cast to a (signed) int.
>> If this value is negative then the file get rejected.
> 
> That is a bug in install_equiv_cpu_table() - it should return unsigned int.
> 
>> It can be changed to the current maximum across sizes for particular
> 
> What is the "current maximum across sizes"?
> 

This is the current maximum patch size across families:
#define F15H_MPB_MAX_SIZE 4096

Maciej


Re: [PATCH v2] x86/microcode/AMD: check microcode file sanity before loading it

2018-03-12 Thread Maciej S. Szmigiero
On 12.03.2018 14:06, Borislav Petkov wrote:
> On Mon, Mar 12, 2018 at 01:56:59PM +0100, Maciej S. Szmigiero wrote:
(..)
>> Since the maximum entry count in current microcode files is 18 the
> 
> Where did you dream up that 18?

"microcode_amd.bin" in linux-firmware.

>> Also, looking in the past, there probably won't be more than 256 AMD CPU
>> types in one CPU family.
> 
> Wrong.

There is no problem raising this value in that (future) case.
As I wrote previously, currently the maximum used count is 18.

> The only limitation on the equivalence table size we have is the 32-bit
> unsigned length field at offset 8 in the equivalence table header.

Not really, since even in the existing code CONTAINER_HDR_SZ (12) gets
added to this size, then the sum is cast to a (signed) int.
If this value is negative then the file get rejected.

>> This limit is an absolute upper cap of a patch size.
> 
> More dreamt up crap.
> 
> See verify_patch_size() for the actual patch sizes.
> 

It can be changed to the current maximum across sizes for particular
families, but then the limit will need to be raised when adding a new
family (if it uses a larger patch).

Maciej


[PATCH v3 4/9] x86/microcode/AMD: automatically compute the PATCH_MAX_SIZE macro

2018-03-13 Thread Maciej S. Szmigiero
The PATCH_MAX_SIZE macro should contain the maximum of all family patch
sizes, computed automatically so that future changes there don't cause an
inconsistency.

Unfortunately, we can't use a standard max{,3} macros for this since they
only work inside a function (they use a compound statement as an
expression) and we have a static array using this macro value as its
length.

This macro is specific to amd.c file so let's move it there (the max sizes
for families are in this file already).

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/include/asm/microcode_amd.h |  2 --
 arch/x86/kernel/cpu/microcode/amd.c  | 22 --
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/microcode_amd.h 
b/arch/x86/include/asm/microcode_amd.h
index 209492849566..365bfa85a06b 100644
--- a/arch/x86/include/asm/microcode_amd.h
+++ b/arch/x86/include/asm/microcode_amd.h
@@ -41,8 +41,6 @@ struct microcode_amd {
unsigned intmpb[0];
 };
 
-#define PATCH_MAX_SIZE PAGE_SIZE
-
 #ifdef CONFIG_MICROCODE_AMD
 extern void __init load_ucode_amd_bsp(unsigned int family);
 extern void load_ucode_amd_ap(unsigned int family);
diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index d20c327c960b..b9f6c06bdc16 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -40,6 +40,22 @@
 
 static struct equiv_cpu_entry *equiv_cpu_table;
 
+/* Maximum patch size for a particular family */
+#define F1XH_MPB_MAX_SIZE 2048
+#define F14H_MPB_MAX_SIZE 1824
+#define F15H_MPB_MAX_SIZE 4096
+#define F16H_MPB_MAX_SIZE 3458
+#define F17H_MPB_MAX_SIZE 3200
+
+/* Can't use a standard max{,3} since they only work inside a function */
+#define SIMPLE_MAX(x, y) ((x) > (y) ? (x) : (y))
+#define SIMPLE_MAX3(x, y, z) SIMPLE_MAX(SIMPLE_MAX(x, y), z)
+
+/* Maximum of all the above families */
+#define PATCH_MAX_SIZE SIMPLE_MAX3(F1XH_MPB_MAX_SIZE, F14H_MPB_MAX_SIZE, \
+  SIMPLE_MAX3(F15H_MPB_MAX_SIZE, F16H_MPB_MAX_SIZE, \
+  F17H_MPB_MAX_SIZE))
+
 /*
  * This points to the current valid container of microcode patches which we 
will
  * save from the initrd/builtin before jettisoning its contents. @mc is the
@@ -473,12 +489,6 @@ static unsigned int verify_patch_size(u8 family, u32 
patch_size,
 {
u32 max_size;
 
-#define F1XH_MPB_MAX_SIZE 2048
-#define F14H_MPB_MAX_SIZE 1824
-#define F15H_MPB_MAX_SIZE 4096
-#define F16H_MPB_MAX_SIZE 3458
-#define F17H_MPB_MAX_SIZE 3200
-
switch (family) {
case 0x14:
max_size = F14H_MPB_MAX_SIZE;


[PATCH v3 5/9] x86/microcode/AMD: check patch size in verify_and_add_patch()

2018-03-13 Thread Maciej S. Szmigiero
Now that we have the PATCH_MAX_SIZE macro correctly computed we can verify
properly the indicated size of a patch in a microcode container file and
whether this file is actually large enough to contain it in the late loader
verify_and_add_patch() function.

The early loader already does the PATCH_MAX_SIZE check in parse_container()
function.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index b9f6c06bdc16..0f78200f2f6c 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -485,7 +485,7 @@ static int collect_cpu_info_amd(int cpu, struct 
cpu_signature *csig)
 }
 
 static unsigned int verify_patch_size(u8 family, u32 patch_size,
- unsigned int size)
+ size_t size)
 {
u32 max_size;
 
@@ -507,7 +507,7 @@ static unsigned int verify_patch_size(u8 family, u32 
patch_size,
break;
}
 
-   if (patch_size > min_t(u32, size, max_size)) {
+   if (patch_size > min_t(size_t, size, max_size)) {
pr_err("patch size mismatch\n");
return 0;
}
@@ -616,7 +616,7 @@ static void cleanup(void)
  * driver cannot continue functioning normally. In such cases, we tear
  * down everything we've used up so far and exit.
  */
-static int verify_and_add_patch(u8 family, u8 *fw, unsigned int leftover)
+static int verify_and_add_patch(u8 family, u8 *fw, size_t leftover)
 {
struct microcode_header_amd *mc_hdr;
struct ucode_patch *patch;
@@ -624,7 +624,15 @@ static int verify_and_add_patch(u8 family, u8 *fw, 
unsigned int leftover)
u32 proc_fam;
u16 proc_id;
 
+   if (leftover < SECTION_HDR_SIZE + sizeof(*mc_hdr))
+   return leftover;
+
patch_size  = *(u32 *)(fw + 4);
+   if (patch_size > PATCH_MAX_SIZE) {
+   pr_err("patch size %u too large\n", patch_size);
+   return -EINVAL;
+   }
+
crnt_size   = patch_size + SECTION_HDR_SIZE;
mc_hdr  = (struct microcode_header_amd *)(fw + SECTION_HDR_SIZE);
proc_id = mc_hdr->processor_rev_id;


[PATCH v3 2/9] x86/microcode/AMD: check whether the equivalence table fits in the file

2018-03-13 Thread Maciej S. Szmigiero
Before loading a CPU equivalence table from a microcode container file we
need to verify whether this file is actually large enough to contain the
table of a size indicated in this file.
If it is not, there is no point of continuing with loading it since
microcode patches are located after the equivalence table anyway.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 54 ++---
 1 file changed, 38 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index ffe0d0ce57fc..ac06e2819f26 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -80,20 +80,29 @@ static u16 find_equiv_id(struct equiv_cpu_entry 
*equiv_table, u32 sig)
  * Returns the amount of bytes consumed while scanning. @desc contains all the
  * data we're going to use in later stages of the application.
  */
-static ssize_t parse_container(u8 *ucode, ssize_t size, struct cont_desc *desc)
+static size_t parse_container(u8 *ucode, size_t size, struct cont_desc *desc)
 {
struct equiv_cpu_entry *eq;
-   ssize_t orig_size = size;
+   size_t orig_size = size;
u32 *hdr = (u32 *)ucode;
+   size_t eq_size;
u16 eq_id;
u8 *buf;
 
/* Am I looking at an equivalence table header? */
+   if (size < CONTAINER_HDR_SZ)
+   return 0;
+
if (hdr[0] != UCODE_MAGIC ||
hdr[1] != UCODE_EQUIV_CPU_TABLE_TYPE ||
hdr[2] == 0)
return CONTAINER_HDR_SZ;
 
+   eq_size = hdr[2];
+   if (eq_size < sizeof(*eq) ||
+   size - CONTAINER_HDR_SZ < eq_size)
+   return 0;
+
buf = ucode;
 
eq = (struct equiv_cpu_entry *)(buf + CONTAINER_HDR_SZ);
@@ -101,8 +110,8 @@ static ssize_t parse_container(u8 *ucode, ssize_t size, 
struct cont_desc *desc)
/* Find the equivalence ID of our CPU in this table: */
eq_id = find_equiv_id(eq, desc->cpuid_1_eax);
 
-   buf  += hdr[2] + CONTAINER_HDR_SZ;
-   size -= hdr[2] + CONTAINER_HDR_SZ;
+   buf  += eq_size + CONTAINER_HDR_SZ;
+   size -= eq_size + CONTAINER_HDR_SZ;
 
/*
 * Scan through the rest of the container to find where it ends. We do
@@ -159,15 +168,13 @@ static ssize_t parse_container(u8 *ucode, ssize_t size, 
struct cont_desc *desc)
  */
 static void scan_containers(u8 *ucode, size_t size, struct cont_desc *desc)
 {
-   ssize_t rem = size;
-
-   while (rem >= 0) {
-   ssize_t s = parse_container(ucode, rem, desc);
+   while (size > 0) {
+   size_t s = parse_container(ucode, size, desc);
if (!s)
return;
 
ucode += s;
-   rem   -= s;
+   size  -= s;
}
 }
 
@@ -540,15 +547,30 @@ static enum ucode_state apply_microcode_amd(int cpu)
return UCODE_UPDATED;
 }
 
-static int install_equiv_cpu_table(const u8 *buf)
+static int install_equiv_cpu_table(const u8 *buf, size_t buf_size)
 {
unsigned int *ibuf = (unsigned int *)buf;
-   unsigned int type = ibuf[1];
-   unsigned int size = ibuf[2];
+   unsigned int type, size;
+
+   if (buf_size < CONTAINER_HDR_SZ) {
+   pr_err("no container header\n");
+   return -EINVAL;
+   }
+
+   type = ibuf[1];
+   if (type != UCODE_EQUIV_CPU_TABLE_TYPE) {
+   pr_err("invalid type field in container file section header\n");
+   return -EINVAL;
+   }
+
+   size = ibuf[2];
+   if (size < sizeof(struct equiv_cpu_entry)) {
+   pr_err("equivalent CPU table too short\n");
+   return -EINVAL;
+   }
 
-   if (type != UCODE_EQUIV_CPU_TABLE_TYPE || !size) {
-   pr_err("empty section/"
-  "invalid type field in container file section header\n");
+   if (buf_size - CONTAINER_HDR_SZ < size) {
+   pr_err("equivalent CPU table truncated\n");
return -EINVAL;
}
 
@@ -655,7 +677,7 @@ static enum ucode_state __load_microcode_amd(u8 family, 
const u8 *data,
int crnt_size = 0;
int offset;
 
-   offset = install_equiv_cpu_table(data);
+   offset = install_equiv_cpu_table(data, size);
if (offset < 0) {
pr_err("failed to create equivalent cpu table\n");
return ret;


[PATCH v3 0/9] x86/microcode/AMD: check microcode file sanity before loading it

2018-03-13 Thread Maciej S. Szmigiero
Currently, it is very easy to make the AMD microcode update driver crash
or spin on a malformed microcode container file since it does very little
consistency checking on data loaded from such file.

This series introduces various checks, mostly on length-type fields,
so all corrupted microcode container files are (hopefully) correctly
rejected instead.
This largely matches what the Intel microcode update driver already does.

It also tries to make the behavior consistent between the early and late
loaders.

Please note that this isn't about verifying the actual microcode update,
that is, the blob that gets sent to the CPU as the new microcode.
Such verification is (hopefully) done by the CPU itself.

It is about verifying a driver-specific container file that includes
many microcode updates for different CPUs of a particular CPU family,
along with metadata that helps the driver select the right microcode
update to actually send to the CPU.

There are purposely-corrupted test files available at
https://pastebin.com/XkKUSmMp
One has to enable KASAN in the kernel config and rename a particular
test file to name appropriate to the running CPU family to test its
loading.

Changes from v1: Capitalize a comment, rename 'eqsize' and 'bufsize'
to 'eq_size' and 'buf_size', respectively, attach a comment about
checking the equivalence table header to its first size check, rename
'equiv{_cpu,}_table_size' to 'equiv{_cpu,}_table_entries'.

Changes from v2: Split the patch into separate commits, remove explicit
CPU equivalence table size limit, make install_equiv_cpu_table() return
a size_t instead of a (signed) int so no overflow can occur there,
automatically compute the PATCH_MAX_SIZE macro and use it for checking
a patch size, make the late loader behavior with respect to late parse
failures consistent with what the early loader does. 

 arch/x86/include/asm/microcode_amd.h |   2 -
 arch/x86/kernel/cpu/microcode/amd.c  | 164 ---
 2 files changed, 114 insertions(+), 52 deletions(-)


[PATCH v3 7/9] x86/microcode/AMD: check microcode container file size before accessing it

2018-03-13 Thread Maciej S. Szmigiero
The early loader parse_container() function should check whether the
microcode container file is actually large enough to contain the patch of
an indicated size, just like the late loader does.

Also, the request_microcode_amd() function should check whether the
container file is actually large enough to contain the header magic value.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index 3ad23e72c2b0..63bd1a63f98a 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -137,6 +137,9 @@ static size_t parse_container(u8 *ucode, size_t size, 
struct cont_desc *desc)
struct microcode_amd *mc;
u32 patch_size;
 
+   if (size < SECTION_HDR_SIZE)
+   break;
+
hdr = (u32 *)buf;
 
if (hdr[0] != UCODE_UCODE_TYPE)
@@ -151,6 +154,10 @@ static size_t parse_container(u8 *ucode, size_t size, 
struct cont_desc *desc)
buf  += SECTION_HDR_SIZE;
size -= SECTION_HDR_SIZE;
 
+   if (size < sizeof(*mc) ||
+   size < patch_size)
+   break;
+
mc = (struct microcode_amd *)buf;
if (eq_id == mc->hdr.processor_rev_id) {
desc->psize = patch_size;
@@ -786,6 +793,10 @@ static enum ucode_state request_microcode_amd(int cpu, 
struct device *device,
}
 
ret = UCODE_ERROR;
+   if (fw->size < sizeof(u32)) {
+   pr_err("microcode container far too short\n");
+   goto fw_release;
+   }
if (*(u32 *)fw->data != UCODE_MAGIC) {
pr_err("invalid magic value (0x%08x)\n", *(u32 *)fw->data);
goto fw_release;


[PATCH v3 8/9] x86/microcode/AMD: check the equivalence table size when scanning it

2018-03-13 Thread Maciej S. Szmigiero
Currently, the code scanning the CPU equivalence table read from a
microcode container file assumes that it actually contains a terminating
zero entry.
Let's check also the size of this table to make sure that we don't read
past it in case it actually doesn't.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 32 +---
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index 63bd1a63f98a..78e698fcd3ce 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -39,6 +39,7 @@
 #include 
 
 static struct equiv_cpu_entry *equiv_cpu_table;
+static unsigned int equiv_cpu_table_entries;
 
 /* Maximum patch size for a particular family */
 #define F1XH_MPB_MAX_SIZE 2048
@@ -79,12 +80,18 @@ static u8 amd_ucode_patch[PATCH_MAX_SIZE];
 static const char
 ucode_path[] __maybe_unused = "kernel/x86/microcode/AuthenticAMD.bin";
 
-static u16 find_equiv_id(struct equiv_cpu_entry *equiv_table, u32 sig)
+static u16 find_equiv_id(const struct equiv_cpu_entry *equiv_table,
+unsigned int equiv_table_entries, u32 sig)
 {
-   for (; equiv_table && equiv_table->installed_cpu; equiv_table++) {
-   if (sig == equiv_table->installed_cpu)
-   return equiv_table->equiv_cpu;
-   }
+   unsigned int i;
+
+   if (!equiv_table)
+   return 0;
+
+   for (i = 0; i < equiv_table_entries && equiv_table[i].installed_cpu;
+i++)
+   if (sig == equiv_table[i].installed_cpu)
+   return equiv_table[i].equiv_cpu;
 
return 0;
 }
@@ -124,7 +131,7 @@ static size_t parse_container(u8 *ucode, size_t size, 
struct cont_desc *desc)
eq = (struct equiv_cpu_entry *)(buf + CONTAINER_HDR_SZ);
 
/* Find the equivalence ID of our CPU in this table: */
-   eq_id = find_equiv_id(eq, desc->cpuid_1_eax);
+   eq_id = find_equiv_id(eq, eq_size / sizeof(*eq), desc->cpuid_1_eax);
 
buf  += eq_size + CONTAINER_HDR_SZ;
size -= eq_size + CONTAINER_HDR_SZ;
@@ -394,20 +401,21 @@ void reload_ucode_amd(void)
 static u16 __find_equiv_id(unsigned int cpu)
 {
struct ucode_cpu_info *uci = ucode_cpu_info + cpu;
-   return find_equiv_id(equiv_cpu_table, uci->cpu_sig.sig);
+   return find_equiv_id(equiv_cpu_table, equiv_cpu_table_entries,
+uci->cpu_sig.sig);
 }
 
 static u32 find_cpu_family_by_equiv_cpu(u16 equiv_cpu)
 {
-   int i = 0;
+   unsigned int i;
 
BUG_ON(!equiv_cpu_table);
 
-   while (equiv_cpu_table[i].equiv_cpu != 0) {
+   for (i = 0; i < equiv_cpu_table_entries &&
+equiv_cpu_table[i].equiv_cpu != 0; i++)
if (equiv_cpu == equiv_cpu_table[i].equiv_cpu)
return equiv_cpu_table[i].installed_cpu;
-   i++;
-   }
+
return 0;
 }
 
@@ -599,6 +607,7 @@ static size_t install_equiv_cpu_table(const u8 *buf, size_t 
buf_size)
}
 
memcpy(equiv_cpu_table, buf + CONTAINER_HDR_SZ, size);
+   equiv_cpu_table_entries = size / sizeof(struct equiv_cpu_entry);
 
/* add header length */
return size + CONTAINER_HDR_SZ;
@@ -608,6 +617,7 @@ static void free_equiv_cpu_table(void)
 {
vfree(equiv_cpu_table);
equiv_cpu_table = NULL;
+   equiv_cpu_table_entries = 0;
 }
 
 static void cleanup(void)


[PATCH v3 6/9] x86/microcode/AMD: verify patch section type for every such section

2018-03-13 Thread Maciej S. Szmigiero
We should check whether the patch section currently being processed is
actually a patch section for each of them (not just the first one) in the
late loader verify_and_add_patch() function, just like the early loader
already does in parse_container() function.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index 0f78200f2f6c..3ad23e72c2b0 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -627,6 +627,11 @@ static int verify_and_add_patch(u8 family, u8 *fw, size_t 
leftover)
if (leftover < SECTION_HDR_SIZE + sizeof(*mc_hdr))
return leftover;
 
+   if (*(u32 *)fw != UCODE_UCODE_TYPE) {
+   pr_err("invalid type field in container file section header\n");
+   return -EINVAL;
+   }
+
patch_size  = *(u32 *)(fw + 4);
if (patch_size > PATCH_MAX_SIZE) {
pr_err("patch size %u too large\n", patch_size);
@@ -704,12 +709,6 @@ static enum ucode_state __load_microcode_amd(u8 family, 
const u8 *data,
fw += offset;
leftover = size - offset;
 
-   if (*(u32 *)fw != UCODE_UCODE_TYPE) {
-   pr_err("invalid type field in container file section header\n");
-   free_equiv_cpu_table();
-   return ret;
-   }
-
while (leftover) {
crnt_size = verify_and_add_patch(family, fw, leftover);
if (crnt_size < 0)


[PATCH v3 3/9] x86/microcode/AMD: install_equiv_cpu_table() should not return (signed) int

2018-03-13 Thread Maciej S. Szmigiero
The maximum possible value returned by install_equiv_cpu_table() is
UINT_MAX + CONTAINER_HDR_SZ (on a 64-bit kernel).
This is more than (signed) int type currently returned by this function can
hold so this function will need to return a size_t instead.

The individual (negative) error codes returned by this function are of no
use anyway, since they are all normalized to UCODE_ERROR by its caller
(__load_microcode_amd()).

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index ac06e2819f26..d20c327c960b 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -547,37 +547,38 @@ static enum ucode_state apply_microcode_amd(int cpu)
return UCODE_UPDATED;
 }
 
-static int install_equiv_cpu_table(const u8 *buf, size_t buf_size)
+static size_t install_equiv_cpu_table(const u8 *buf, size_t buf_size)
 {
unsigned int *ibuf = (unsigned int *)buf;
-   unsigned int type, size;
+   unsigned int type;
+   size_t size;
 
if (buf_size < CONTAINER_HDR_SZ) {
pr_err("no container header\n");
-   return -EINVAL;
+   return 0;
}
 
type = ibuf[1];
if (type != UCODE_EQUIV_CPU_TABLE_TYPE) {
pr_err("invalid type field in container file section header\n");
-   return -EINVAL;
+   return 0;
}
 
size = ibuf[2];
if (size < sizeof(struct equiv_cpu_entry)) {
pr_err("equivalent CPU table too short\n");
-   return -EINVAL;
+   return 0;
}
 
if (buf_size - CONTAINER_HDR_SZ < size) {
pr_err("equivalent CPU table truncated\n");
-   return -EINVAL;
+   return 0;
}
 
equiv_cpu_table = vmalloc(size);
if (!equiv_cpu_table) {
pr_err("failed to allocate equivalent CPU table\n");
-   return -ENOMEM;
+   return 0;
}
 
memcpy(equiv_cpu_table, buf + CONTAINER_HDR_SZ, size);
@@ -672,13 +673,13 @@ static enum ucode_state __load_microcode_amd(u8 family, 
const u8 *data,
 size_t size)
 {
enum ucode_state ret = UCODE_ERROR;
-   unsigned int leftover;
+   size_t leftover;
u8 *fw = (u8 *)data;
int crnt_size = 0;
-   int offset;
+   size_t offset;
 
offset = install_equiv_cpu_table(data, size);
-   if (offset < 0) {
+   if (!offset) {
pr_err("failed to create equivalent cpu table\n");
return ret;
}


[PATCH v3 1/9] x86/microcode/AMD: subtract SECTION_HDR_SIZE from file leftover length

2018-03-13 Thread Maciej S. Szmigiero
verify_patch_size() function verifies whether the microcode container file
remaining size is large enough to contain a patch of the indicated size.

However, the section header length is not included in this indicated size
but it is present in the leftover file length so it should be subtracted
from the leftover file length before passing this value to
verify_patch_size().

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index a998e1a7d46f..ffe0d0ce57fc 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -613,7 +613,8 @@ static int verify_and_add_patch(u8 family, u8 *fw, unsigned 
int leftover)
return crnt_size;
}
 
-   ret = verify_patch_size(family, patch_size, leftover);
+   ret = verify_patch_size(family, patch_size,
+   leftover - SECTION_HDR_SIZE);
if (!ret) {
pr_err("Patch-ID 0x%08x: size mismatch.\n", mc_hdr->patch_id);
return crnt_size;


[PATCH v3 9/9] x86/microcode/AMD: be more tolerant of late parse failures in late loader

2018-03-13 Thread Maciej S. Szmigiero
The early loader just ends its microcode container file processing when it
is unable to parse some patch section, but keeps the already read patches
from this file for their eventual application.

We can do the same in the late loader - we'll just return an error if we
are unable to parse any patches.
Note that we already do silently skip patches in the late loader for
smaller issues like lack of an equivalence table entry, family-size
mismatch or an unsupported chipset match type.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index 78e698fcd3ce..a098780b0847 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -729,13 +729,15 @@ static enum ucode_state __load_microcode_amd(u8 family, 
const u8 *data,
while (leftover) {
crnt_size = verify_and_add_patch(family, fw, leftover);
if (crnt_size < 0)
-   return ret;
+   break;
 
fw   += crnt_size;
leftover -= crnt_size;
+
+   ret = UCODE_OK;
}
 
-   return UCODE_OK;
+   return ret;
 }
 
 static enum ucode_state


[PATCH v4 07/10] x86/microcode/AMD: Verify patch section type for every such section

2018-03-15 Thread Maciej S. Szmigiero
We should check whether the patch section currently being processed is
actually a patch section for each of them (not just the first one) in the
late loader verify_and_add_patch() function, just like the early loader
already does in parse_container() function.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index 096cb58a563f..4d2116d08754 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -613,13 +613,19 @@ static int verify_and_add_patch(u8 family, u8 *fw, size_t 
leftover)
 {
struct microcode_header_amd *mc_hdr;
struct ucode_patch *patch;
-   unsigned int patch_size, crnt_size, ret;
+   unsigned int patch_type, patch_size, crnt_size, ret;
u32 proc_fam;
u16 proc_id;
 
if (leftover < SECTION_HDR_SIZE + sizeof(*mc_hdr))
return leftover;
 
+   patch_type = *(u32 *)fw;
+   if (patch_type != UCODE_UCODE_TYPE) {
+   pr_err("invalid type field in container file section header\n");
+   return -EINVAL;
+   }
+
patch_size  = *(u32 *)(fw + 4);
if (patch_size > PATCH_MAX_SIZE) {
pr_err("patch size %u too large\n", patch_size);
@@ -711,12 +717,6 @@ static enum ucode_state __load_microcode_amd(u8 family, 
const u8 *data,
fw += offset;
leftover = size - CONTAINER_HDR_SZ - offset;
 
-   if (*(u32 *)fw != UCODE_UCODE_TYPE) {
-   pr_err("invalid type field in container file section header\n");
-   free_equiv_cpu_table();
-   return ret;
-   }
-
while (leftover) {
crnt_size = verify_and_add_patch(family, fw, leftover);
if (crnt_size < 0)


[PATCH v4 05/10] x86/microcode/AMD: Add a reminder about PATCH_MAX_SIZE macro

2018-03-15 Thread Maciej S. Szmigiero
The PATCH_MAX_SIZE macro should contain the maximum of all family patch
sizes.
Since these sizes are defined in an other place than this macro, let's add
a reminder to them so people will remember to verify PATCH_MAX_SIZE
correctness when modifying a family patch size or adding a new family.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/include/asm/microcode_amd.h | 1 +
 arch/x86/kernel/cpu/microcode/amd.c  | 4 
 2 files changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/microcode_amd.h 
b/arch/x86/include/asm/microcode_amd.h
index 209492849566..8ea477fbc65f 100644
--- a/arch/x86/include/asm/microcode_amd.h
+++ b/arch/x86/include/asm/microcode_amd.h
@@ -41,6 +41,7 @@ struct microcode_amd {
unsigned intmpb[0];
 };
 
+/* Maximum patch size of all supported families */
 #define PATCH_MAX_SIZE PAGE_SIZE
 
 #ifdef CONFIG_MICROCODE_AMD
diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index 8e8df37f2f1b..eba9e3c8aa17 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -477,6 +477,10 @@ static unsigned int verify_patch_size(u8 family, u32 
patch_size,
 {
u32 max_size;
 
+/*
+ * If you modify these values or add a new one also check whether
+ * PATCH_MAX_SIZE in include/asm/microcode_amd.h needs updating, too.
+ */
 #define F1XH_MPB_MAX_SIZE 2048
 #define F14H_MPB_MAX_SIZE 1824
 #define F15H_MPB_MAX_SIZE 4096


Re: [PATCH v4 00/10] x86/microcode/AMD: Check microcode file sanity before loading it

2018-03-15 Thread Maciej S. Szmigiero
On 16.03.2018 00:23, Borislav Petkov wrote:
> On Fri, Mar 16, 2018 at 12:07:34AM +0100, Maciej S. Szmigiero wrote:
>> Currently, it is very easy to make the AMD microcode update driver crash
>> or spin on a malformed microcode container file since it does very little
>> consistency checking on data loaded from such file.
> 
> Ok, a couple of tips for the next time:
> 
> * please send your reworked patchset for review after review of your
> current patchset has been completed. We normally send patchsets once a
> week. You've sent yours two days after and that's kinda too fast. I'm
> sure you can imagine reviewers have other work to do too.
>
> So please be patient before resending next time.
> 
> If in doubt, the whole process is documented in Documentation/process/ -
> might wanna skim through it.
I thought the one week absolute minimum resend time is for the case that
there are no comments to a patch, or, more generally, when there is no
action and not for "normal process" respins.

submitting-patches.rst says:
> Once upon a time, patches used to disappear into the void without comment,
> but the development process works more smoothly than that now.  You should
> receive comments within a week or so; if that does not happen, make sure
> that you have sent your patches to the right place.  Wait for a minimum of
> one week before resubmitting or pinging reviewers - possibly longer during
> busy times like merge windows.

But I will wait a week now for respins if you say so.

> * when sending your patchset, make sure all mails are sent as a reply to
> the 0/N message.
> 
> Your 0/N has:
> 
> Message-ID: <9964a6cf-00be-78ea-cc1e-7f7062716...@maciej.szmigiero.name>
> 
> but your 1/N has
> 
> References: <cover.1521150415.git.m...@maciej.szmigiero.name>
> 
> so something went wrong there. Normally git send-email does this
> correctly.

I've copied these messages to my mailbox first and sent them from there,
so that must have overwritten the message IDs.
Will submit them from git-send-email the next time.

> Thx.

Thanks,
Maciej


[PATCH v4 02/10] x86/microcode/AMD: Check equivalence table length in the early loader

2018-03-15 Thread Maciej S. Szmigiero
Before loading a CPU equivalence table from a microcode container file we
need to verify whether this file is actually large enough to contain the
table of a size indicated in this file.
If it is not, there is no point of continuing with loading it since
microcode patches are located after the equivalence table anyway.

This patch adds these checks to the early loader.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 35 +++
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index 6a93be0f771c..138c9fb983f2 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -80,20 +80,33 @@ static u16 find_equiv_id(struct equiv_cpu_entry 
*equiv_table, u32 sig)
  * Returns the amount of bytes consumed while scanning. @desc contains all the
  * data we're going to use in later stages of the application.
  */
-static ssize_t parse_container(u8 *ucode, ssize_t size, struct cont_desc *desc)
+static size_t parse_container(u8 *ucode, size_t size, struct cont_desc *desc)
 {
struct equiv_cpu_entry *eq;
-   ssize_t orig_size = size;
+   size_t orig_size = size;
u32 *hdr = (u32 *)ucode;
+   unsigned int cont_magic, cont_type;
+   size_t equiv_tbl_len;
u16 eq_id;
u8 *buf;
 
+   if (size < CONTAINER_HDR_SZ)
+   return 0;
+
+   cont_magic  = hdr[0];
+   cont_type   = hdr[1];
+   equiv_tbl_len   = hdr[2];
+
/* Am I looking at an equivalence table header? */
-   if (hdr[0] != UCODE_MAGIC ||
-   hdr[1] != UCODE_EQUIV_CPU_TABLE_TYPE ||
-   hdr[2] == 0)
+   if (cont_magic != UCODE_MAGIC ||
+   cont_type != UCODE_EQUIV_CPU_TABLE_TYPE ||
+   equiv_tbl_len == 0)
return CONTAINER_HDR_SZ;
 
+   if (equiv_tbl_len < sizeof(*eq) ||
+   size - CONTAINER_HDR_SZ < equiv_tbl_len)
+   return size;
+
buf = ucode;
 
eq = (struct equiv_cpu_entry *)(buf + CONTAINER_HDR_SZ);
@@ -101,8 +114,8 @@ static ssize_t parse_container(u8 *ucode, ssize_t size, 
struct cont_desc *desc)
/* Find the equivalence ID of our CPU in this table: */
eq_id = find_equiv_id(eq, desc->cpuid_1_eax);
 
-   buf  += hdr[2] + CONTAINER_HDR_SZ;
-   size -= hdr[2] + CONTAINER_HDR_SZ;
+   buf  += equiv_tbl_len + CONTAINER_HDR_SZ;
+   size -= equiv_tbl_len + CONTAINER_HDR_SZ;
 
/*
 * Scan through the rest of the container to find where it ends. We do
@@ -159,15 +172,13 @@ static ssize_t parse_container(u8 *ucode, ssize_t size, 
struct cont_desc *desc)
  */
 static void scan_containers(u8 *ucode, size_t size, struct cont_desc *desc)
 {
-   ssize_t rem = size;
-
-   while (rem >= 0) {
-   ssize_t s = parse_container(ucode, rem, desc);
+   while (size > 0) {
+   size_t s = parse_container(ucode, size, desc);
if (!s)
return;
 
ucode += s;
-   rem   -= s;
+   size  -= s;
}
 }
 


[PATCH v4 04/10] x86/microcode/AMD: install_equiv_cpu_table() should not return a signed int

2018-03-15 Thread Maciej S. Szmigiero
The maximum possible value returned by install_equiv_cpu_table() is
UINT_MAX (on a 64-bit kernel).
This is more than (signed) int type currently returned by this function can
hold so this function will need to return an unsigned int instead.

In order to avoid an overflow of this type on a 64-bit kernel we'll need to
also modify this function to return only the CPU equivalence table length
(without the container header length), leaving its single caller
(__load_microcode_amd()) the job of adding the container header length to
skip to the fist patch section.

The individual (negative) error codes returned by install_equiv_cpu_table()
are of no use anyway, since they are all normalized to UCODE_ERROR by its
caller.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index ed24200cf936..8e8df37f2f1b 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -551,40 +551,39 @@ static enum ucode_state apply_microcode_amd(int cpu)
return UCODE_UPDATED;
 }
 
-static int install_equiv_cpu_table(const u8 *buf, size_t buf_size)
+static unsigned int install_equiv_cpu_table(const u8 *buf, size_t buf_size)
 {
unsigned int *ibuf = (unsigned int *)buf;
unsigned int type, equiv_tbl_len;
 
if (buf_size <= CONTAINER_HDR_SZ) {
pr_err("Truncated microcode container header.\n");
-   return -EINVAL;
+   return 0;
}
 
type = ibuf[1];
if (type != UCODE_EQUIV_CPU_TABLE_TYPE) {
pr_err("Wrong microcode container equivalence table type: 
%u.\n",
   type);
-   return -EINVAL;
+   return 0;
}
 
equiv_tbl_len = ibuf[2];
if (equiv_tbl_len < sizeof(struct equiv_cpu_entry) ||
buf_size - CONTAINER_HDR_SZ < equiv_tbl_len) {
pr_err("Truncated equivalence table.\n");
-   return -EINVAL;
+   return 0;
}
 
equiv_cpu_table = vmalloc(equiv_tbl_len);
if (!equiv_cpu_table) {
pr_err("failed to allocate equivalent CPU table\n");
-   return -ENOMEM;
+   return 0;
}
 
memcpy(equiv_cpu_table, buf + CONTAINER_HDR_SZ, equiv_tbl_len);
 
-   /* add header length */
-   return equiv_tbl_len + CONTAINER_HDR_SZ;
+   return equiv_tbl_len;
 }
 
 static void free_equiv_cpu_table(void)
@@ -681,18 +680,24 @@ static enum ucode_state __load_microcode_amd(u8 family, 
const u8 *data,
 size_t size)
 {
enum ucode_state ret = UCODE_ERROR;
-   unsigned int leftover;
+   size_t leftover;
u8 *fw = (u8 *)data;
int crnt_size = 0;
-   int offset;
+   unsigned int offset;
 
offset = install_equiv_cpu_table(data, size);
-   if (offset < 0) {
+   if (!offset) {
pr_err("failed to create equivalent cpu table\n");
return ret;
}
+
+   /*
+* Skip also the container header, since install_equiv_cpu_table()
+* returns just the raw equivalence table size without the header
+*/
+   fw += CONTAINER_HDR_SZ;
fw += offset;
-   leftover = size - offset;
+   leftover = size - CONTAINER_HDR_SZ - offset;
 
if (*(u32 *)fw != UCODE_UCODE_TYPE) {
pr_err("invalid type field in container file section header\n");


[PATCH v4 03/10] x86/microcode/AMD: Check equivalence table length in the late loader

2018-03-15 Thread Maciej S. Szmigiero
Before loading a CPU equivalence table from a microcode container file we
need to verify whether this file is actually large enough to contain the
table of a size indicated in this file.
If it is not, there is no point of continuing with loading it since
microcode patches are located after the equivalence table anyway.

This patch adds these checks to the late loader.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 32 ++--
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index 138c9fb983f2..ed24200cf936 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -551,28 +551,40 @@ static enum ucode_state apply_microcode_amd(int cpu)
return UCODE_UPDATED;
 }
 
-static int install_equiv_cpu_table(const u8 *buf)
+static int install_equiv_cpu_table(const u8 *buf, size_t buf_size)
 {
unsigned int *ibuf = (unsigned int *)buf;
-   unsigned int type = ibuf[1];
-   unsigned int size = ibuf[2];
+   unsigned int type, equiv_tbl_len;
 
-   if (type != UCODE_EQUIV_CPU_TABLE_TYPE || !size) {
-   pr_err("empty section/"
-  "invalid type field in container file section header\n");
+   if (buf_size <= CONTAINER_HDR_SZ) {
+   pr_err("Truncated microcode container header.\n");
return -EINVAL;
}
 
-   equiv_cpu_table = vmalloc(size);
+   type = ibuf[1];
+   if (type != UCODE_EQUIV_CPU_TABLE_TYPE) {
+   pr_err("Wrong microcode container equivalence table type: 
%u.\n",
+  type);
+   return -EINVAL;
+   }
+
+   equiv_tbl_len = ibuf[2];
+   if (equiv_tbl_len < sizeof(struct equiv_cpu_entry) ||
+   buf_size - CONTAINER_HDR_SZ < equiv_tbl_len) {
+   pr_err("Truncated equivalence table.\n");
+   return -EINVAL;
+   }
+
+   equiv_cpu_table = vmalloc(equiv_tbl_len);
if (!equiv_cpu_table) {
pr_err("failed to allocate equivalent CPU table\n");
return -ENOMEM;
}
 
-   memcpy(equiv_cpu_table, buf + CONTAINER_HDR_SZ, size);
+   memcpy(equiv_cpu_table, buf + CONTAINER_HDR_SZ, equiv_tbl_len);
 
/* add header length */
-   return size + CONTAINER_HDR_SZ;
+   return equiv_tbl_len + CONTAINER_HDR_SZ;
 }
 
 static void free_equiv_cpu_table(void)
@@ -674,7 +686,7 @@ static enum ucode_state __load_microcode_amd(u8 family, 
const u8 *data,
int crnt_size = 0;
int offset;
 
-   offset = install_equiv_cpu_table(data);
+   offset = install_equiv_cpu_table(data, size);
if (offset < 0) {
pr_err("failed to create equivalent cpu table\n");
return ret;


[PATCH v4 00/10] x86/microcode/AMD: Check microcode file sanity before loading it

2018-03-15 Thread Maciej S. Szmigiero
Currently, it is very easy to make the AMD microcode update driver crash
or spin on a malformed microcode container file since it does very little
consistency checking on data loaded from such file.

This series introduces various checks, mostly on length-type fields,
so all corrupted microcode container files are (hopefully) correctly
rejected instead.
This largely matches what the Intel microcode update driver already does.

It also tries to make the behavior consistent between the early and late
loaders.

Please note that this isn't about verifying the actual microcode update,
that is, the blob that gets sent to the CPU as the new microcode.
Such verification is (hopefully) done by the CPU itself.

It is about verifying a driver-specific container file that includes
many microcode updates for different CPUs of a particular CPU family,
along with metadata that helps the driver select the right microcode
update to actually send to the CPU.

There are purposely-corrupted test files available at
https://pastebin.com/XkKUSmMp
One has to enable KASAN in the kernel config and rename a particular
test file to name appropriate to the running CPU family to test its
loading.

Changes from v1:
* Capitalize a comment,

* Rename 'eqsize' and 'bufsize' to 'eq_size' and 'buf_size',
respectively,

* Attach a comment about checking the equivalence table header to its
first size check,

* Rename 'equiv{_cpu,}_table_size' to 'equiv{_cpu,}_table_entries'.

Changes from v2:
* Split the patch into separate commits,

* Remove explicit CPU equivalence table size limit,

* Make install_equiv_cpu_table() return a size_t instead of a (signed)
int so no overflow can occur there,

* Automatically compute the PATCH_MAX_SIZE macro and use it for checking
a patch size,

* Make the late loader behavior with respect to late parse failures
consistent with what the early loader does.

Changes from v3:
* Capitalize patch subject names,

* Add a comment describing why we subtract SECTION_HDR_SIZE from a file
leftover length before calling verify_patch_size(),

* Don't break a long line containing the above subtraction,

* Move a comment in parse_container() back to where it was in the original
patch version,

* Add special local vars for container header fields in parse_container(),

* Return the remaining blob size from parse_container() if the equivalence
table is truncated,

* Split the equivalence table size checks into two patches: one for the
early loader and one for the late loader,

* Rename an equivalence table length variable in install_equiv_cpu_table()
for consistency with a similar one in parse_container(),

* Rephrase the error messages in install_equiv_cpu_table() to be more
descriptive and merge two tests there so they print the same message,

* Change install_equiv_cpu_table() to return an unsigned int while moving
the job of adding the container header length to this value to its caller,

* Drop automatic computation of the PATCH_MAX_SIZE macro and add a comment
reminding to do it manually instead,

* Add a local variable patch_type for better code readability in
verify_and_add_patch() function.

 arch/x86/include/asm/microcode_amd.h |   1 +
 arch/x86/kernel/cpu/microcode/amd.c  | 179 ---
 2 files changed, 127 insertions(+), 53 deletions(-)


[PATCH v4 01/10] x86/microcode/AMD: Subtract SECTION_HDR_SIZE from file leftover length

2018-03-15 Thread Maciej S. Szmigiero
verify_patch_size() function verifies whether the microcode container file
remaining size is large enough to contain a patch of the indicated size.

However, the section header length is not included in this indicated size
but it is present in the leftover file length so it should be subtracted
from the leftover file length before passing this value to
verify_patch_size().

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index a998e1a7d46f..6a93be0f771c 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -613,7 +613,16 @@ static int verify_and_add_patch(u8 family, u8 *fw, 
unsigned int leftover)
return crnt_size;
}
 
-   ret = verify_patch_size(family, patch_size, leftover);
+   /*
+* verify_patch_size() checks whether the passed remaining file size
+* is large enough to contain a patch of the indicated size (and also
+* whether this size does not exceed the family maximum).
+*
+* The section header length is not included in this indicated size
+* but it is present in the leftover file length so we need to subtract
+* it before passing this value to that function.
+*/
+   ret = verify_patch_size(family, patch_size, leftover - 
SECTION_HDR_SIZE);
if (!ret) {
pr_err("Patch-ID 0x%08x: size mismatch.\n", mc_hdr->patch_id);
return crnt_size;


[PATCH v4 08/10] x86/microcode/AMD: Check microcode container file size before accessing it

2018-03-15 Thread Maciej S. Szmigiero
The early loader parse_container() function should check whether the
microcode container file is actually large enough to contain the patch of
an indicated size, just like the late loader does.

Also, the request_microcode_amd() function should check whether the
container file is actually large enough to contain the header magic value.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index 4d2116d08754..dc5ed4971879 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -125,6 +125,9 @@ static size_t parse_container(u8 *ucode, size_t size, 
struct cont_desc *desc)
struct microcode_amd *mc;
u32 patch_size;
 
+   if (size < SECTION_HDR_SIZE)
+   break;
+
hdr = (u32 *)buf;
 
if (hdr[0] != UCODE_UCODE_TYPE)
@@ -139,6 +142,10 @@ static size_t parse_container(u8 *ucode, size_t size, 
struct cont_desc *desc)
buf  += SECTION_HDR_SIZE;
size -= SECTION_HDR_SIZE;
 
+   if (size < sizeof(*mc) ||
+   size < patch_size)
+   break;
+
mc = (struct microcode_amd *)buf;
if (eq_id == mc->hdr.processor_rev_id) {
desc->psize = patch_size;
@@ -794,6 +801,10 @@ static enum ucode_state request_microcode_amd(int cpu, 
struct device *device,
}
 
ret = UCODE_ERROR;
+   if (fw->size < sizeof(u32)) {
+   pr_err("microcode container far too short\n");
+   goto fw_release;
+   }
if (*(u32 *)fw->data != UCODE_MAGIC) {
pr_err("invalid magic value (0x%08x)\n", *(u32 *)fw->data);
goto fw_release;


[PATCH v4 09/10] x86/microcode/AMD: Check the equivalence table size when scanning it

2018-03-15 Thread Maciej S. Szmigiero
Currently, the code scanning the CPU equivalence table read from a
microcode container file assumes that it actually contains a terminating
zero entry.
Let's check also the size of this table to make sure that we don't read
past it in case it actually doesn't.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 33 ++---
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index dc5ed4971879..6e25a63a0a3d 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -39,6 +39,7 @@
 #include 
 
 static struct equiv_cpu_entry *equiv_cpu_table;
+static unsigned int equiv_cpu_table_entries;
 
 /*
  * This points to the current valid container of microcode patches which we 
will
@@ -63,12 +64,18 @@ static u8 amd_ucode_patch[PATCH_MAX_SIZE];
 static const char
 ucode_path[] __maybe_unused = "kernel/x86/microcode/AuthenticAMD.bin";
 
-static u16 find_equiv_id(struct equiv_cpu_entry *equiv_table, u32 sig)
+static u16 find_equiv_id(const struct equiv_cpu_entry *equiv_table,
+unsigned int equiv_table_entries, u32 sig)
 {
-   for (; equiv_table && equiv_table->installed_cpu; equiv_table++) {
-   if (sig == equiv_table->installed_cpu)
-   return equiv_table->equiv_cpu;
-   }
+   unsigned int i;
+
+   if (!equiv_table)
+   return 0;
+
+   for (i = 0; i < equiv_table_entries && equiv_table[i].installed_cpu;
+i++)
+   if (sig == equiv_table[i].installed_cpu)
+   return equiv_table[i].equiv_cpu;
 
return 0;
 }
@@ -112,7 +119,8 @@ static size_t parse_container(u8 *ucode, size_t size, 
struct cont_desc *desc)
eq = (struct equiv_cpu_entry *)(buf + CONTAINER_HDR_SZ);
 
/* Find the equivalence ID of our CPU in this table: */
-   eq_id = find_equiv_id(eq, desc->cpuid_1_eax);
+   eq_id = find_equiv_id(eq, equiv_tbl_len / sizeof(*eq),
+ desc->cpuid_1_eax);
 
buf  += equiv_tbl_len + CONTAINER_HDR_SZ;
size -= equiv_tbl_len + CONTAINER_HDR_SZ;
@@ -382,20 +390,21 @@ void reload_ucode_amd(void)
 static u16 __find_equiv_id(unsigned int cpu)
 {
struct ucode_cpu_info *uci = ucode_cpu_info + cpu;
-   return find_equiv_id(equiv_cpu_table, uci->cpu_sig.sig);
+   return find_equiv_id(equiv_cpu_table, equiv_cpu_table_entries,
+uci->cpu_sig.sig);
 }
 
 static u32 find_cpu_family_by_equiv_cpu(u16 equiv_cpu)
 {
-   int i = 0;
+   unsigned int i;
 
BUG_ON(!equiv_cpu_table);
 
-   while (equiv_cpu_table[i].equiv_cpu != 0) {
+   for (i = 0; i < equiv_cpu_table_entries &&
+equiv_cpu_table[i].equiv_cpu != 0; i++)
if (equiv_cpu == equiv_cpu_table[i].equiv_cpu)
return equiv_cpu_table[i].installed_cpu;
-   i++;
-   }
+
return 0;
 }
 
@@ -593,6 +602,7 @@ static unsigned int install_equiv_cpu_table(const u8 *buf, 
size_t buf_size)
}
 
memcpy(equiv_cpu_table, buf + CONTAINER_HDR_SZ, equiv_tbl_len);
+   equiv_cpu_table_entries = equiv_tbl_len / sizeof(struct 
equiv_cpu_entry);
 
return equiv_tbl_len;
 }
@@ -601,6 +611,7 @@ static void free_equiv_cpu_table(void)
 {
vfree(equiv_cpu_table);
equiv_cpu_table = NULL;
+   equiv_cpu_table_entries = 0;
 }
 
 static void cleanup(void)


[PATCH v4 10/10] x86/microcode/AMD: Be more tolerant of late parse failures in late loader

2018-03-15 Thread Maciej S. Szmigiero
The early loader just ends its microcode container file processing when it
is unable to parse some patch section, but keeps the already read patches
from this file for their eventual application.

We can do the same in the late loader - we'll just return an error if we
are unable to parse any patches.
Note that we already do silently skip patches in the late loader for
smaller issues like lack of an equivalence table entry, family-size
mismatch or an unsupported chipset match type.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index 6e25a63a0a3d..f9278d9a2f9b 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -738,13 +738,15 @@ static enum ucode_state __load_microcode_amd(u8 family, 
const u8 *data,
while (leftover) {
crnt_size = verify_and_add_patch(family, fw, leftover);
if (crnt_size < 0)
-   return ret;
+   break;
 
fw   += crnt_size;
leftover -= crnt_size;
+
+   ret = UCODE_OK;
}
 
-   return UCODE_OK;
+   return ret;
 }
 
 static enum ucode_state


[PATCH v4 06/10] x86/microcode/AMD: Check patch size in verify_and_add_patch()

2018-03-15 Thread Maciej S. Szmigiero
We should verify the indicated size of a patch in a microcode container
file (whether it does not exceed the PATCH_MAX_SIZE value) and also whether
this file is actually large enough to contain it in the late loader
verify_and_add_patch() function.

The early loader already does the PATCH_MAX_SIZE check in parse_container()
function.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
 arch/x86/kernel/cpu/microcode/amd.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index eba9e3c8aa17..096cb58a563f 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -473,7 +473,7 @@ static int collect_cpu_info_amd(int cpu, struct 
cpu_signature *csig)
 }
 
 static unsigned int verify_patch_size(u8 family, u32 patch_size,
- unsigned int size)
+ size_t size)
 {
u32 max_size;
 
@@ -505,7 +505,7 @@ static unsigned int verify_patch_size(u8 family, u32 
patch_size,
break;
}
 
-   if (patch_size > min_t(u32, size, max_size)) {
+   if (patch_size > min_t(size_t, size, max_size)) {
pr_err("patch size mismatch\n");
return 0;
}
@@ -609,7 +609,7 @@ static void cleanup(void)
  * driver cannot continue functioning normally. In such cases, we tear
  * down everything we've used up so far and exit.
  */
-static int verify_and_add_patch(u8 family, u8 *fw, unsigned int leftover)
+static int verify_and_add_patch(u8 family, u8 *fw, size_t leftover)
 {
struct microcode_header_amd *mc_hdr;
struct ucode_patch *patch;
@@ -617,7 +617,15 @@ static int verify_and_add_patch(u8 family, u8 *fw, 
unsigned int leftover)
u32 proc_fam;
u16 proc_id;
 
+   if (leftover < SECTION_HDR_SIZE + sizeof(*mc_hdr))
+   return leftover;
+
patch_size  = *(u32 *)(fw + 4);
+   if (patch_size > PATCH_MAX_SIZE) {
+   pr_err("patch size %u too large\n", patch_size);
+   return -EINVAL;
+   }
+
crnt_size   = patch_size + SECTION_HDR_SIZE;
mc_hdr  = (struct microcode_header_amd *)(fw + SECTION_HDR_SIZE);
proc_id = mc_hdr->processor_rev_id;


[PATCH] x86/microcode/AMD: check microcode file sanity before loading it

2018-03-09 Thread Maciej S. Szmigiero
Currently, it is very easy to make the AMD microcode update driver crash
or spin on a malformed microcode file since it does very little
consistency checking on data loaded from such file.

This commit introduces various checks, mostly on length-type fields, so
all corrupted microcode files are (hopefully) correctly rejected instead.

Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
---
Test files are at https://pastebin.com/XkKUSmMp
One has to enable KASAN in the kernel config and rename a particular
test file to name appropriate to the running CPU family to test its
loading.

 arch/x86/include/asm/microcode_amd.h |   6 ++
 arch/x86/kernel/cpu/microcode/amd.c  | 111 ++-
 2 files changed, 89 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/microcode_amd.h 
b/arch/x86/include/asm/microcode_amd.h
index 209492849566..e797d5d939d7 100644
--- a/arch/x86/include/asm/microcode_amd.h
+++ b/arch/x86/include/asm/microcode_amd.h
@@ -19,6 +19,9 @@ struct equiv_cpu_entry {
u16 res;
 } __attribute__((packed));
 
+/* 4k */
+#define UCODE_EQUIV_CPU_TABLE_MAX_SIZE (256 * sizeof(struct equiv_cpu_entry))
+
 struct microcode_header_amd {
u32 data_code;
u32 patch_id;
@@ -41,6 +44,9 @@ struct microcode_amd {
unsigned intmpb[0];
 };
 
+/* if a patch is larger than this the microcode file is surely corrupted */
+#define PATCH_MAX_SIZE_ABSOLUTE (16 * 1024 * 1024)
+
 #define PATCH_MAX_SIZE PAGE_SIZE
 
 #ifdef CONFIG_MICROCODE_AMD
diff --git a/arch/x86/kernel/cpu/microcode/amd.c 
b/arch/x86/kernel/cpu/microcode/amd.c
index a998e1a7d46f..f798bd4004aa 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -39,6 +39,7 @@
 #include 
 
 static struct equiv_cpu_entry *equiv_cpu_table;
+static unsigned int equiv_cpu_table_size;
 
 /*
  * This points to the current valid container of microcode patches which we 
will
@@ -63,12 +64,17 @@ static u8 amd_ucode_patch[PATCH_MAX_SIZE];
 static const char
 ucode_path[] __maybe_unused = "kernel/x86/microcode/AuthenticAMD.bin";
 
-static u16 find_equiv_id(struct equiv_cpu_entry *equiv_table, u32 sig)
+static u16 find_equiv_id(const struct equiv_cpu_entry *equiv_table,
+unsigned int equiv_table_size, u32 sig)
 {
-   for (; equiv_table && equiv_table->installed_cpu; equiv_table++) {
-   if (sig == equiv_table->installed_cpu)
-   return equiv_table->equiv_cpu;
-   }
+   unsigned int i;
+
+   if (!equiv_table)
+   return 0;
+
+   for (i = 0; i < equiv_table_size && equiv_table[i].installed_cpu; i++)
+   if (sig == equiv_table[i].installed_cpu)
+   return equiv_table[i].equiv_cpu;
 
return 0;
 }
@@ -80,29 +86,41 @@ static u16 find_equiv_id(struct equiv_cpu_entry 
*equiv_table, u32 sig)
  * Returns the amount of bytes consumed while scanning. @desc contains all the
  * data we're going to use in later stages of the application.
  */
-static ssize_t parse_container(u8 *ucode, ssize_t size, struct cont_desc *desc)
+static size_t parse_container(u8 *ucode, size_t size, struct cont_desc *desc)
 {
struct equiv_cpu_entry *eq;
-   ssize_t orig_size = size;
+   size_t orig_size = size;
u32 *hdr = (u32 *)ucode;
+   u32 eqsize;
u16 eq_id;
u8 *buf;
 
+   if (size < CONTAINER_HDR_SZ)
+   return 0;
+
/* Am I looking at an equivalence table header? */
if (hdr[0] != UCODE_MAGIC ||
hdr[1] != UCODE_EQUIV_CPU_TABLE_TYPE ||
hdr[2] == 0)
return CONTAINER_HDR_SZ;
 
+   eqsize = hdr[2];
+   if (eqsize < sizeof(*eq) ||
+   eqsize > UCODE_EQUIV_CPU_TABLE_MAX_SIZE)
+   return 0;
+
+   if (size < CONTAINER_HDR_SZ + eqsize)
+   return 0;
+
buf = ucode;
 
eq = (struct equiv_cpu_entry *)(buf + CONTAINER_HDR_SZ);
 
/* Find the equivalence ID of our CPU in this table: */
-   eq_id = find_equiv_id(eq, desc->cpuid_1_eax);
+   eq_id = find_equiv_id(eq, eqsize / sizeof(*eq), desc->cpuid_1_eax);
 
-   buf  += hdr[2] + CONTAINER_HDR_SZ;
-   size -= hdr[2] + CONTAINER_HDR_SZ;
+   buf  += eqsize + CONTAINER_HDR_SZ;
+   size -= eqsize + CONTAINER_HDR_SZ;
 
/*
 * Scan through the rest of the container to find where it ends. We do
@@ -112,6 +130,9 @@ static ssize_t parse_container(u8 *ucode, ssize_t size, 
struct cont_desc *desc)
struct microcode_amd *mc;
u32 patch_size;
 
+   if (size < SECTION_HDR_SIZE)
+   break;
+
hdr = (u32 *)buf;
 
if (hdr[0] != UCODE_UCODE_TYPE)
@@ -126,6 +147,10 @@ static ssize_t parse_container(u8 *ucode, ssize_t size, 
struct cont_desc *des

Re: [PATCH] x86/microcode/AMD: check microcode file sanity before loading it

2018-03-10 Thread Maciej S. Szmigiero
On 10.03.2018 01:34, Maciej S. Szmigiero wrote:
> Currently, it is very easy to make the AMD microcode update driver crash
> or spin on a malformed microcode file since it does very little
> consistency checking on data loaded from such file.
> 
> This commit introduces various checks, mostly on length-type fields, so
> all corrupted microcode files are (hopefully) correctly rejected instead.
> 
> Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>

To make sure that it is clear what this patch is about:
It *isn't* about verifying the actual microcode update, that is, the
blob that gets sent to the CPU as the new microcode.
Such verification is (hopefully) done by the CPU itself.

It is about verifying *this driver-specific* container file that
includes many microcode updates for different CPUs of a particular CPU
family, along with metadata that helps the driver select the right
microcode update to actually sent to the CPU.

The microcode container files in linux-firmware don't contain the
newest microcode versions, even for older CPUs.
They are generally updated VERY rarely - I can see only three updates
for them: on 2016-03-18, 2014-11-30 and 2013-07-11.
There is no container file at all for family 17h (Zen) so
distributions like OpenSUSE that include this file must have gotten it
from some other source (or made from raw updates themselves).

That's why to get things like IBPB it is sometimes necessary to use
a newer microcode version than included in linux-firmware, sourced for
example from a BIOS update.
Since BIOS updates contain only actual (raw) microcode updates one
has to place it in a microcode container file so this driver can parse
it.

As far I know there is no tool to automate this work so one has to
manually tweak the container metadata.
And this is prone to mistakes this patch protects against.

Sorry for not making this 100% clear in the commit log.

Maciej


Re: [PATCH] x86/microcode/AMD: check microcode file sanity before loading it

2018-03-10 Thread Maciej S. Szmigiero
On 10.03.2018 14:12, Borislav Petkov wrote:
> On Sat, Mar 10, 2018 at 01:26:00PM +0100, Maciej S. Szmigiero wrote:
>> Without them, it is easy to crash the driver when just playing with
>> microcode files
> 
> You're not supposed to play with the microcode files. If you do and
> something breaks, you get to keep the pieces.
> 
> If the official microcode files have a problem, then I'm all ears.
> Anything else contrived which doesn't actually happen unless someone
> manipulates them is not an issue the microcode loader should protect
> against.
> 

So, shall we remove data consistency checks of various root-only
syscalls then? :)

Maciej


Re: [PATCH] x86/microcode/AMD: check microcode file sanity before loading it

2018-03-10 Thread Maciej S. Szmigiero
On 10.03.2018 10:18, Borislav Petkov wrote:
> On Sat, Mar 10, 2018 at 01:34:45AM +0100, Maciej S. Szmigiero wrote:
>> Currently, it is very easy to make the AMD microcode update driver crash
>> or spin on a malformed microcode file since it does very little
>> consistency checking on data loaded from such file.
> 
> Sorry but if one has enough permissions to install malformed microcode,
> crashing the loader is the least of your problems. IOW, I don't see the
> justification for the unnecessary complication with all those checks.

While I agree this is not a security problem, I cannot agree that these
checks are unnecessary driver complication.

First, these checks are really just very basic checks like "check
whether the loaded file is long enough to actually contain some
structure before accessing it" or "don't iterate an array in file
without checking if it actually has a terminating element" or "check
whether microcode patch length isn't something like 2GB before allocating
memory for it".

Without them, it is easy to crash the driver when just playing with
microcode files (and it turns out that AMD-released microcode files in
linux-firmware actually don't contain the newest microcode versions,
even for older CPUs).

Second, since these checks happen only on a microcode file load
(something that 99.9% of systems probably will do just once at boot
time) it is hardly a performance-critical path.

Third, we still do check consistency of data provided to various
root-only syscalls (and these might be much more performance-critical
than this code).
 
> Thx.
> 

Thanks,
Maciej


Re: [PATCH] x86/microcode/AMD: check microcode file sanity before loading it

2018-03-10 Thread Maciej S. Szmigiero
On 10.03.2018 17:46, Borislav Petkov wrote:
> On Sat, Mar 10, 2018 at 05:16:40PM +0100, Maciej S. Szmigiero wrote:
(..)
>> There is no container file at all for family 17h (Zen) so
>> distributions like OpenSUSE that include this file must have gotten it
>> from some other source
> 
> Or maybe they've gotten it from AMD directly. Don't you think that
> getting microcode from the CPU vendor directly is the logical thing?

"some other source" than linux-firmware includes the CPU vendor.

Also please note that while OpenSUSE can get the microcode directly
from the CPU vendor there seems to be no official AMD web site that
distributes microcode.
And it looks like other distros simply get it from OpenSUSE:
https://bugs.archlinux.org/task/56951

>> That's why to get things like IBPB it is sometimes necessary to use
>> a newer microcode version than included in linux-firmware, sourced for
>> example from a BIOS update.
> 
> linux-firmware will get F17h microcode soon.

Great!
Hope it will include latest production versions for the whole family
17h.

>> Since BIOS updates contain only actual (raw) microcode updates one
>> has to place it in a microcode container file so this driver can parse
>> it.
>>
>> As far I know there is no tool to automate this work so one has to
>> manually tweak the container metadata.
> 
> Let me get this straight: am I reading this correctly that you've tried
> to carve out the F17h microcode from a BIOS update blob and you're
> trying to load that?!?
>
> If so, you could've simply taken a distro microcode package and used
> F17h microcode from there - they are all the same.
> 

"microcode_amd_fam17h.bin" from both my distro (Gentoo) and OpenSUSE
only contains family 23 model 2 microcode while my Ryzen is model 1.

And my motherboard BIOS-loaded microcode is too old to contain IBPB
support.

Maciej


Re: [PATCH v3 3/9] x86/microcode/AMD: install_equiv_cpu_table() should not return (signed) int

2018-03-14 Thread Maciej S. Szmigiero
On 15.03.2018 00:58, Borislav Petkov wrote:
> On Thu, Mar 15, 2018 at 12:46:05AM +0100, Maciej S. Szmigiero wrote:
>> The maximum possible value returned by install_equiv_cpu_table() of
>> UINT_MAX + CONTAINER_HDR_SZ comes from the maximum value of this 'size'
>> variable (that is UINT_MAX) plus the header length of CONTAINER_HDR_SZ.
>> This won't fit in 'int' type, hence this patch.
> 
> So make it fit by returning an unsigned int.
> 

This can be done if this function is modified to return only the CPU
equivalence table length (without the container header length), leaving
its single caller the job of adding the container header length to skip
to the fist patch section.

Otherwise we introduce a equivalence table length limit of
UINT_MAX - CONTAINER_HDR_SZ, as anything more will overflow an
unsigned int variable on a 64-bit kernel (on 32-bit this will be caught
by the equivalence table truncation check).

Maciej


Re: [PATCH v3 4/9] x86/microcode/AMD: automatically compute the PATCH_MAX_SIZE macro

2018-03-14 Thread Maciej S. Szmigiero
On 14.03.2018 19:02, Borislav Petkov wrote:
> On Tue, Mar 13, 2018 at 10:06:48PM +0100, Maciej S. Szmigiero wrote:
>> +/* Maximum of all the above families */
>> +#define PATCH_MAX_SIZE SIMPLE_MAX3(F1XH_MPB_MAX_SIZE, F14H_MPB_MAX_SIZE, \
> 
> Nope, it should be
> 
> #define PATCH_MAX_SIZE (max_t(unsigned int, FXXH...

Unfortunately, this does not work:
> ./include/linux/kernel.h:806:41: error: braced-group within expression 
> allowed only inside a function
>  #define __max(t1, t2, max1, max2, x, y) ({  \

That's because we have a static array containing the chosen microcode
patch for the current CPU (amd_ucode_patch) using this macro value as its
length.

Comments in the code say we can't use vmalloc() during early microcode
load so we can't allocate this array dynamically.

And if we hardcode its length we don't get the benefits of automatically
computing this length as maximum of all family patch sizes (as it should
be).

Maciej


Re: [PATCH v3 2/9] x86/microcode/AMD: check whether the equivalence table fits in the file

2018-03-14 Thread Maciej S. Szmigiero
On 14.03.2018 18:04, Borislav Petkov wrote:
> On Tue, Mar 13, 2018 at 10:06:23PM +0100, Maciej S. Szmigiero wrote:
(..)
>> --- a/arch/x86/kernel/cpu/microcode/amd.c
>> +++ b/arch/x86/kernel/cpu/microcode/amd.c
>> @@ -80,20 +80,29 @@ static u16 find_equiv_id(struct equiv_cpu_entry 
>> *equiv_table, u32 sig)
>>   * Returns the amount of bytes consumed while scanning. @desc contains all 
>> the
>>   * data we're going to use in later stages of the application.
>>   */
>> -static ssize_t parse_container(u8 *ucode, ssize_t size, struct cont_desc 
>> *desc)
>> +static size_t parse_container(u8 *ucode, size_t size, struct cont_desc 
>> *desc)
>>  {
>>  struct equiv_cpu_entry *eq;
>> -ssize_t orig_size = size;
>> +size_t orig_size = size;
>>  u32 *hdr = (u32 *)ucode;
>> +size_t eq_size;
>>  u16 eq_id;
>>  u8 *buf;
>>  
>>  /* Am I looking at an equivalence table header? */
> 
> That comment becomes wrong when you add this check here.

It was moved there because on the first (monolithic) iteration of this
change there was a review comment that it better belongs here.

No problem to move it back, however.

>> +if (size < CONTAINER_HDR_SZ)
>> +return 0;
>> +
>>  if (hdr[0] != UCODE_MAGIC ||
>>  hdr[1] != UCODE_EQUIV_CPU_TABLE_TYPE ||
>>  hdr[2] == 0)
>>  return CONTAINER_HDR_SZ;
>>  
>> +eq_size = hdr[2];
> 
> If we're going to have special local vars for the container header, then
> do it right:
> 
>   cont_magic  = hdr[0];
>   cont_type   = hdr[1];
>   equiv_tbl_len   = hdr[2];
> 
> and then use those from now on.

Will do.

>> +if (eq_size < sizeof(*eq) ||
>> +size - CONTAINER_HDR_SZ < eq_size)
>> +return 0;
> 
> I think you want
> 
>   if (size < eqiv_tbl_len + CONTAINER_HDR_SZ)
>   return size;
> 
> here to skip over the next, hopefully not truncated container.

'size' here is the length of the whole CPIO blob containing all
containers combined (well, the remaining part of it).

If we skip over 'size' bytes we'll have nothing left to parse.

>> @@ -159,15 +168,13 @@ static ssize_t parse_container(u8 *ucode, ssize_t 
>> size, struct cont_desc *desc)
>>   */
>>  static void scan_containers(u8 *ucode, size_t size, struct cont_desc *desc)
>>  {
>> -ssize_t rem = size;
>> -
>> -while (rem >= 0) {
>> -ssize_t s = parse_container(ucode, rem, desc);
>> +while (size > 0) {
>> +size_t s = parse_container(ucode, size, desc);
>>  if (!s)
>>  return;
>>  
>>  ucode += s;
>> -rem   -= s;
>> +size  -= s;
>>  }
>>  }
>>  
> 
> All changes upto here need to be a separate patch.
> install_equiv_cpu_table() changes below are the second patch.

OK, will split this into two patches.

>> @@ -540,15 +547,30 @@ static enum ucode_state apply_microcode_amd(int cpu)
>>  return UCODE_UPDATED;
>>  }
>>  
>> -static int install_equiv_cpu_table(const u8 *buf)
>> +static int install_equiv_cpu_table(const u8 *buf, size_t buf_size)
>>  {
>>  unsigned int *ibuf = (unsigned int *)buf;
>> -unsigned int type = ibuf[1];
>> -unsigned int size = ibuf[2];
>> +unsigned int type, size;
> 
> unsigned int type, equiv_tbl_len;

Will do.

>> +
>> +if (buf_size < CONTAINER_HDR_SZ) {
> 
><= is ok too.

Will do.

>> +pr_err("no container header\n");
> 
> More descriptive error messages:
> 
>   "Truncated microcode container header.\n"

Will do.

>> +return -EINVAL;
>> +}
>> +
>> +type = ibuf[1];
>> +if (type != UCODE_EQUIV_CPU_TABLE_TYPE) {
>> +pr_err("invalid type field in container file section header\n");
> 
>   "Wrong microcode container equivalence table type: 
> %d.\n"

Will do.

>> +return -EINVAL;
>> +}
>> +
>> +size = ibuf[2];
>> +if (size < sizeof(struct equiv_cpu_entry)) {
>> +pr_err("equivalent CPU table too short\n");
> 
>   "Truncated equivalence table.\n"

Will do.

>> +return -EINVAL;
>> +}
>>  
>> -if (type != UCODE_EQUIV_CPU_TABLE_TYPE || !size) {
>> -pr_err("empty section/"
>> -   "invalid type field in container file section header\n");
>> +if (buf_size - CONTAINER_HDR_SZ < size) {
>> +pr_err("equivalent CPU table truncated\n");
> 
> Combine that test with the above one and use the same error message.

Will do.
 
> Thx.
> 

Thanks,
Maciej


Re: [PATCH v3 3/9] x86/microcode/AMD: install_equiv_cpu_table() should not return (signed) int

2018-03-14 Thread Maciej S. Szmigiero
On 15.03.2018 01:56, Borislav Petkov wrote:
> On Thu, Mar 15, 2018 at 01:13:07AM +0100, Maciej S. Szmigiero wrote:
>> This can be done if this function is modified to return only the CPU
>> equivalence table length (without the container header length), leaving
>> its single caller the job of adding the container header length to skip
>> to the fist patch section.
> 
> Sure, it leaves the function to deal with the equiv table length only
> and the caller then adds the header length. Which is actually cleaner.
> 

OK, will do then.

Maciej


Re: [PATCH v3 3/9] x86/microcode/AMD: install_equiv_cpu_table() should not return (signed) int

2018-03-14 Thread Maciej S. Szmigiero
On 14.03.2018 18:58, Borislav Petkov wrote:
> On Tue, Mar 13, 2018 at 10:06:34PM +0100, Maciej S. Szmigiero wrote:
>> The maximum possible value returned by install_equiv_cpu_table() is
>> UINT_MAX + CONTAINER_HDR_SZ (on a 64-bit kernel).
>> This is more than (signed) int type currently returned by this function can
>> hold so this function will need to return a size_t instead.
> 
> I'm trying to parse this but I'm not really sure.
> 
> All I know is:
> 
>   unsigned int size = ibuf[2];
> 
> and that is really a 4-byte unsigned quantity so anything less is an
> arbitrary limitation.

There is no limit on CPU equivalence table length in this patch series
like it was in the previous version.

The maximum possible value returned by install_equiv_cpu_table() of
UINT_MAX + CONTAINER_HDR_SZ comes from the maximum value of this 'size'
variable (that is UINT_MAX) plus the header length of CONTAINER_HDR_SZ.
This won't fit in 'int' type, hence this patch.

That's because this functions tells its caller how much bytes to skip
from the beginning of a microcode container file to the first patch
section contained in it.

Maciej


Re: x86/retpoline: Fill RSB on context switch for affected CPUs

2018-03-09 Thread Maciej S. Szmigiero
On 12.01.2018 18:49, Woodhouse, David wrote:
> When we context switch from a shallow call stack to a deeper one, as we
> 'ret' up the deeper side we may encounter RSB entries (predictions for
> where the 'ret' goes to) which were populated in userspace. This is
> problematic if we have neither SMEP nor KPTI (the latter of which marks
> userspace pages as NX for the kernel), as malicious code in userspace
> may then be executed speculatively. So overwrite the CPU's return
> prediction stack with calls which are predicted to return to an infinite
> loop, to "capture" speculation if this happens. This is required both
> for retpoline, and also in conjunction with IBRS for !SMEP && !KPTI.
> 
> On Skylake+ the problem is slightly different, and an *underflow* of the
> RSB may cause errant branch predictions to occur. So there it's not so
> much overwrite, as *filling* the RSB to attempt to prevent it getting
> empty. This is only a partial solution for Skylake+ since there are many
> other conditions which may result in the RSB becoming empty. The full
> solution on Skylake+ is to use IBRS, which will prevent the problem even
> when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
> required on context switch.
> 
> Signed-off-by: David Woodhouse 
> Acked-by: Arjan van de Ven 
> ---
(..)
> @@ -213,6 +230,23 @@ static void __init spectre_v2_select_mitigation(void)
>  
>   spectre_v2_enabled = mode;
>   pr_info("%s\n", spectre_v2_strings[mode]);
> +
> + /*
> +  * If we don't have SMEP or KPTI, then we run the risk of hitting
> +  * userspace addresses in the RSB after a context switch from a
> +  * shallow call stack to a deeper one. We must must fill the entire
> +  * RSB to avoid that, even when using IBRS.
> +  *
> +  * Skylake era CPUs have a separate issue with *underflow* of the
> +  * RSB, when they will predict 'ret' targets from the generic BTB.
> +  * IBRS makes that safe, but we need to fill the RSB on context
> +  * switch if we're using retpoline.
> +  */
> + if ((!boot_cpu_has(X86_FEATURE_PTI) &&
> +  !boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) {
> + setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
> + pr_info("Filling RSB on context switch\n");
> + }

Shouldn't the RSB filling on context switch also be done on non-IBPB
CPUs to protect (retpolined) user space tasks from other user space
tasks?

We already issue a IBPB when switching to high-value user space tasks
to protect them from other user space tasks.

Thanks,
Maciej


Re: x86/retpoline: Fill RSB on context switch for affected CPUs

2018-03-09 Thread Maciej S. Szmigiero
On 09.03.2018 16:14, Andi Kleen wrote:
>> Shouldn't the RSB filling on context switch also be done on non-IBPB
>> CPUs to protect (retpolined) user space tasks from other user space
>> tasks?
> 
> The comment is actually incorrect. There's no risk to hit user space
> addresses if we have KPTI and NX (which is fairly universal).
> 
> It's mainly needed on Skylake era CPUs.
> 
> Should fix the comment. I'll send a patch.

But what about userspace-to-userspace attacks? - the ones that IBPB on 
context switches currently protects against (at least for high-value, or
as implemented currently, non-dumpable, processes)?

If understand the issue correctly, high-value user space processes can
be protected from other user space processes even on CPUs that lack
IBPB as long as they are recompiled with retpolines and there is no
danger of RSB entries from one process being used by another one after
a context switch.
For Skyklake this would not be enough, but there we'll (hopefully) have
the IBPB instead.

> -Andi
> 

Maciej


Re: [PATCH v4 01/10] x86/microcode/AMD: Subtract SECTION_HDR_SIZE from file leftover length

2018-04-18 Thread Maciej S. Szmigiero
On 18.03.2018 17:12, Borislav Petkov wrote:
> On Fri, Mar 16, 2018 at 12:07:42AM +0100, Maciej S. Szmigiero wrote:
>> verify_patch_size() function verifies whether the microcode container file
>> remaining size is large enough to contain a patch of the indicated size.
>>
>> However, the section header length is not included in this indicated size
>> but it is present in the leftover file length so it should be subtracted
>> from the leftover file length before passing this value to
>> verify_patch_size().
>>
>> Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
>> ---
>>  arch/x86/kernel/cpu/microcode/amd.c | 11 ++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> I split the comment and applied this:
> 
> ---
> From: "Maciej S. Szmigiero" <m...@maciej.szmigiero.name>
> Date: Fri, 16 Mar 2018 00:07:42 +0100
> Subject: [PATCH] x86/microcode/AMD: Subtract SECTION_HDR_SIZE from file
>  leftover length

Can't really find this commit in any tree I have looked at (bp.git and
tip.git at kernel.org).
Was it pushed somewhere else?

Maciej


Re: [PATCH v2] X.509: unpack RSA signatureValue field from BIT STRING

2018-04-17 Thread Maciej S. Szmigiero
On 17.04.2018 17:07, Kamil Konieczny wrote:
> 
> 
> On 17.04.2018 15:39, Maciej S. Szmigiero wrote:
>> The signatureValue field of a X.509 certificate is encoded as a BIT STRING.
>> For RSA signatures this BIT STRING is of so-called primitive subtype, which
>> contains a u8 prefix indicating a count of unused bits in the encoding.
>>
>> We have to strip this prefix from signature data, just as we already do for
>> key data in x509_extract_key_data() function.
>>
>> This wasn't noticed earlier because this prefix byte is zero for RSA key
>> sizes divisible by 8. Since BIT STRING is a big-endian encoding adding zero
>> prefixes has no bearing on its value.
>>
>> The signature length, however was incorrect, which is a problem for RSA
>> implementations that need it to be exactly correct (like AMD CCP).
>>
>> Signed-off-by: Maciej S. Szmigiero <m...@maciej.szmigiero.name>
> 
> your e-mail address looks incorrect
> 
> [...]
> 

What's wrong with it?

Maciej


Re: [PATCH v4 01/10] x86/microcode/AMD: Subtract SECTION_HDR_SIZE from file leftover length

2018-04-18 Thread Maciej S. Szmigiero
On 18.04.2018 15:53, Borislav Petkov wrote:
> On Wed, Apr 18, 2018 at 02:39:27PM +0200, Maciej S. Szmigiero wrote:
>> Can't really find this commit in any tree I have looked at (bp.git and
>> tip.git at kernel.org).
>> Was it pushed somewhere else?
> 
> No. Still waiting for the rest.
> 

So, should this patch be included in a respin or not?

Maciej


<    1   2   3   4   5   6   7   8   9   10   >