Re: [linux-sunxi] [PATCH v2 14/14] ARM: dts: sun8i: Enable DVFS on Orange Pi One

2016-06-30 Thread Siarhei Siamashka
On Thu, 30 Jun 2016 13:13:48 +0200
Michal Suchanek  wrote:

> Hello,
> 
> On 25 June 2016 at 05:45,   wrote:
> > From: Ondrej Jirman 
> >
> > Use Xulong Orange Pi One GPIO based regulator for
> > passive cooling and thermal management.
> >
> > Signed-off-by: Ondrej Jirman 
> > ---
> >  arch/arm/boot/dts/sun8i-h3-orangepi-one.dts | 39 
> > +
> >  1 file changed, 39 insertions(+)
> >
> > diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts 
> > b/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
> > index b1bd6b0..a38d871 100644
> > --- a/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
> > +++ b/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
> > @@ -109,6 +109,45 @@
> > };
> >  };
> >
> > + {
> > +   operating-points = <
> > +   /* kHzuV */
> > +   1296000 130
> > +   120 130  
> 
> First problem is that the board boots at 1008000 which is not listed
> and the kernel complains.
> 
> Second problem is that the board locks up during boot with this enabled.
> 
> Do you have some suggestion for alternate configuration to test?

Maybe try the Allwinner's original DVFS table instead of these
undervolted values and see if it helps?

https://linux-sunxi.org/index.php?title=Xunlong_Orange_Pi_PC=17753#CPU_clock_speed_limit

While undervolting is tempting because it helps to decrease the SoC
temperature and avoid throttling, different units may have different
tolerances and one needs to be very careful when picking defaults
that are intended to work correctly on all boards. Some safety
headroom exists there for a reason.

If I remember correctly, some people pushed for undervolting experiments
at least twice in the past (on the Banana Pi and C.H.I.P.). In both
cases this did not end up well and had to be fixed later to solve
reliability problems.

In order to allow individual per-unit tuning, a concept of "speed
grading" may be probably introduced later. So that the board is tested
for reliability and then the speed grade rating is stored somewhere on
the non-removable storage (EEPROM, SPI flash, eFUSE, ...). Some SoC
manufacturers, such as Samsung, are already doing this with their chips.


Re: [linux-sunxi] [PATCH v2 14/14] ARM: dts: sun8i: Enable DVFS on Orange Pi One

2016-06-30 Thread Siarhei Siamashka
On Thu, 30 Jun 2016 13:13:48 +0200
Michal Suchanek  wrote:

> Hello,
> 
> On 25 June 2016 at 05:45,   wrote:
> > From: Ondrej Jirman 
> >
> > Use Xulong Orange Pi One GPIO based regulator for
> > passive cooling and thermal management.
> >
> > Signed-off-by: Ondrej Jirman 
> > ---
> >  arch/arm/boot/dts/sun8i-h3-orangepi-one.dts | 39 
> > +
> >  1 file changed, 39 insertions(+)
> >
> > diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts 
> > b/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
> > index b1bd6b0..a38d871 100644
> > --- a/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
> > +++ b/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
> > @@ -109,6 +109,45 @@
> > };
> >  };
> >
> > + {
> > +   operating-points = <
> > +   /* kHzuV */
> > +   1296000 130
> > +   120 130  
> 
> First problem is that the board boots at 1008000 which is not listed
> and the kernel complains.
> 
> Second problem is that the board locks up during boot with this enabled.
> 
> Do you have some suggestion for alternate configuration to test?

Maybe try the Allwinner's original DVFS table instead of these
undervolted values and see if it helps?

https://linux-sunxi.org/index.php?title=Xunlong_Orange_Pi_PC=17753#CPU_clock_speed_limit

While undervolting is tempting because it helps to decrease the SoC
temperature and avoid throttling, different units may have different
tolerances and one needs to be very careful when picking defaults
that are intended to work correctly on all boards. Some safety
headroom exists there for a reason.

If I remember correctly, some people pushed for undervolting experiments
at least twice in the past (on the Banana Pi and C.H.I.P.). In both
cases this did not end up well and had to be fixed later to solve
reliability problems.

In order to allow individual per-unit tuning, a concept of "speed
grading" may be probably introduced later. So that the board is tested
for reliability and then the speed grade rating is stored somewhere on
the non-removable storage (EEPROM, SPI flash, eFUSE, ...). Some SoC
manufacturers, such as Samsung, are already doing this with their chips.


Re: [linux-sunxi] Re: [PATCH 05/11] drivers: pinctrl: add driver for Allwinner A64 SoC

2016-02-01 Thread Siarhei Siamashka
On Mon, 1 Feb 2016 22:49:16 +
André Przywara  wrote:

> On 01/02/16 18:27, Karsten Merker wrote:
> 
> Hi Karsten,
> 
> thank you very much for your feedback!
> 
> > On Mon, Feb 01, 2016 at 05:39:24PM +, Andre Przywara wrote:  
> >> Based on the Allwinner A64 user manual and on the previous sunxi
> >> pinctrl drivers this introduces the pin multiplex assignments for
> >> the ARMv8 Allwinner A64 SoC.
> >> Port A is apparently used for the fixed function DRAM controller, so
> >> the ports start at B here (the manual mentions "n from 1 to 7", so
> >> not starting at 0).
> >>
> >> Signed-off-by: Andre Przywara 
> >> ---
> >>  .../bindings/pinctrl/allwinner,sunxi-pinctrl.txt   |   1 +
> >>  arch/arm64/Kconfig.platforms   |   1 +
> >>  drivers/pinctrl/sunxi/Kconfig  |   4 +
> >>  drivers/pinctrl/sunxi/Makefile |   1 +
> >>  drivers/pinctrl/sunxi/pinctrl-a64.c| 606 
> >> +
> >>  5 files changed, 613 insertions(+)
> >>  create mode 100644 drivers/pinctrl/sunxi/pinctrl-a64.c
> >>
> >> diff --git 
> >> a/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt 
> >> b/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt
> >> index 9213b27..9050002 100644
> >> --- a/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt
> >> +++ b/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt
> >> @@ -21,6 +21,7 @@ Required properties:
> >>"allwinner,sun9i-a80-r-pinctrl"
> >>"allwinner,sun8i-a83t-pinctrl"
> >>"allwinner,sun8i-h3-pinctrl"
> >> +  "allwinner,a64-pinctrl"  
> > 
> > Hello,
> > 
> > on all other Allwinner SoCs we use the SoC family as part of the
> > compatible, as well as in the names of the Kconfig options. To
> > keep things consistent, I would like to propose doing the same on
> > Arm64, i.e. using allwinner,sun50i-a64-pinctrl instead of
> > allwinner,a64-pinctrl.  
> 
> Yes, I have been told this already. However I don't like this idea so
> much, for the following reasons:
> a) It is mostly redundant. The actual SoC (marketing) name is unique,
> there is no sun6i-a20 or sun7i-a23.
> b) It is not even helpful. If I got Maxime correctly, then the newer
> sunxi generation numbers depend on the ARM _cores_ used in the SoC,
> which is frankly the least interesting part from a Linux support
> perspective. I would see some sense if it would reflect the generation
> of IP blocks used, but so it is even more confusing to see that
> sun7i-a20 and sun8i-a23 are related, but sun8i-h3 is a completely
> different beast. The Allwinner marketing name tells you that, but the
> sunxi one does not.
> c) It is very confusing for people not dealing with it everyday. Just
> because I own a BananaPi I know that the A20 is sun7i, but I am totally
> lost when it comes to all the other names. And even now it took me about
> a minute to find the appropriate Wiki page which explains part of that
> story.
> d) Most importantly ;-): It kills TAB completion, unless you know the
> sunxi number, which is mostly not true as pointed out in c)
> 
> So while I see that just a is not really very specific, I'd
> rather do away with current naming scheme for the future. In this
> particular case we have the vendor name as a name space identifier
> already, so there is no possible confusion with ARM Cortex naming, for
> instance.
> 
> Also as this is now moving into the arm64 world, I'd like to use the
> opportunity to fix things that are not really optimal, the naming is one
> of them.

One of the problems is that A64 name is not unique. We have reasons
to believe that there are also H64 and R18 out there using exactly
the same die, but possibly available in different packaging (a different
ball grid pitch? or maybe a different set of peripherals routed to the
outside?). Early prototypes of the Pine64 board were using Allwinner R18
and the Jide Remix Mini HTPC box is using Allwinner H64.

The bootloader sources from Allwinner are also referring to A64 as
AW1689, which makes some sense because it is the chip id number that
is accessible for runtime identification via reading the SRAM_VER_REG
hardware register:

http://linux-sunxi.org/SRAM_Controller_Register_Guide#SRAM_VER_REG

So would it be a good idea to use "aw1689" as a compatible property
in the DT instead of "a64"? Or maybe have "aw1689-a64" and
"aw1689-h64", which would be similar to the existing "sun5i-a13"
and "sun5i-a10s" naming convention?

-- 
Best regards,
Siarhei Siamashka


Re: [linux-sunxi] Re: [PATCH 05/11] drivers: pinctrl: add driver for Allwinner A64 SoC

2016-02-01 Thread Siarhei Siamashka
On Mon, 1 Feb 2016 22:49:16 +
André Przywara <andre.przyw...@arm.com> wrote:

> On 01/02/16 18:27, Karsten Merker wrote:
> 
> Hi Karsten,
> 
> thank you very much for your feedback!
> 
> > On Mon, Feb 01, 2016 at 05:39:24PM +, Andre Przywara wrote:  
> >> Based on the Allwinner A64 user manual and on the previous sunxi
> >> pinctrl drivers this introduces the pin multiplex assignments for
> >> the ARMv8 Allwinner A64 SoC.
> >> Port A is apparently used for the fixed function DRAM controller, so
> >> the ports start at B here (the manual mentions "n from 1 to 7", so
> >> not starting at 0).
> >>
> >> Signed-off-by: Andre Przywara <andre.przyw...@arm.com>
> >> ---
> >>  .../bindings/pinctrl/allwinner,sunxi-pinctrl.txt   |   1 +
> >>  arch/arm64/Kconfig.platforms   |   1 +
> >>  drivers/pinctrl/sunxi/Kconfig  |   4 +
> >>  drivers/pinctrl/sunxi/Makefile |   1 +
> >>  drivers/pinctrl/sunxi/pinctrl-a64.c| 606 
> >> +
> >>  5 files changed, 613 insertions(+)
> >>  create mode 100644 drivers/pinctrl/sunxi/pinctrl-a64.c
> >>
> >> diff --git 
> >> a/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt 
> >> b/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt
> >> index 9213b27..9050002 100644
> >> --- a/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt
> >> +++ b/Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt
> >> @@ -21,6 +21,7 @@ Required properties:
> >>"allwinner,sun9i-a80-r-pinctrl"
> >>"allwinner,sun8i-a83t-pinctrl"
> >>"allwinner,sun8i-h3-pinctrl"
> >> +  "allwinner,a64-pinctrl"  
> > 
> > Hello,
> > 
> > on all other Allwinner SoCs we use the SoC family as part of the
> > compatible, as well as in the names of the Kconfig options. To
> > keep things consistent, I would like to propose doing the same on
> > Arm64, i.e. using allwinner,sun50i-a64-pinctrl instead of
> > allwinner,a64-pinctrl.  
> 
> Yes, I have been told this already. However I don't like this idea so
> much, for the following reasons:
> a) It is mostly redundant. The actual SoC (marketing) name is unique,
> there is no sun6i-a20 or sun7i-a23.
> b) It is not even helpful. If I got Maxime correctly, then the newer
> sunxi generation numbers depend on the ARM _cores_ used in the SoC,
> which is frankly the least interesting part from a Linux support
> perspective. I would see some sense if it would reflect the generation
> of IP blocks used, but so it is even more confusing to see that
> sun7i-a20 and sun8i-a23 are related, but sun8i-h3 is a completely
> different beast. The Allwinner marketing name tells you that, but the
> sunxi one does not.
> c) It is very confusing for people not dealing with it everyday. Just
> because I own a BananaPi I know that the A20 is sun7i, but I am totally
> lost when it comes to all the other names. And even now it took me about
> a minute to find the appropriate Wiki page which explains part of that
> story.
> d) Most importantly ;-): It kills TAB completion, unless you know the
> sunxi number, which is mostly not true as pointed out in c)
> 
> So while I see that just a is not really very specific, I'd
> rather do away with current naming scheme for the future. In this
> particular case we have the vendor name as a name space identifier
> already, so there is no possible confusion with ARM Cortex naming, for
> instance.
> 
> Also as this is now moving into the arm64 world, I'd like to use the
> opportunity to fix things that are not really optimal, the naming is one
> of them.

One of the problems is that A64 name is not unique. We have reasons
to believe that there are also H64 and R18 out there using exactly
the same die, but possibly available in different packaging (a different
ball grid pitch? or maybe a different set of peripherals routed to the
outside?). Early prototypes of the Pine64 board were using Allwinner R18
and the Jide Remix Mini HTPC box is using Allwinner H64.

The bootloader sources from Allwinner are also referring to A64 as
AW1689, which makes some sense because it is the chip id number that
is accessible for runtime identification via reading the SRAM_VER_REG
hardware register:

http://linux-sunxi.org/SRAM_Controller_Register_Guide#SRAM_VER_REG

So would it be a good idea to use "aw1689" as a compatible property
in the DT instead of "a64"? Or maybe have "aw1689-a64" and
"aw1689-h64", which would be similar to the existing "sun5i-a13"
and "sun5i-a10s" naming convention?

-- 
Best regards,
Siarhei Siamashka


Re: [linux-sunxi] [PATCH] thermal: Add support for Sunxi THS on the Allwinner H3

2015-11-18 Thread Siarhei Siamashka
Hello,

On Wed, 18 Nov 2015 21:51:48 +0100
Josef Gajdusek  wrote:

> This patch adds support for the Sunxi thermal sensor on the Allwinner H3.
> Also adds declaration of the H3 THS clock to clk-sunxi.c ignoring the
> dividers as they are not continuous (clk-divider.c cannot be used as it
> does not support setting an enable bit).
> Should be easily extendable for the A33/A83T/... as they have similar but
> not completely identical sensors.
> 
> Signed-off-by: Josef Gajdusek 

Thanks for working on this. The thermal sensor is very useful on
H3 because this SoC tends to be running rather hot at high clock
frequencies.

Do you also have plans for making use of an irq handler?

> ---
>  Documentation/devicetree/bindings/clock/sunxi.txt  |   1 +
>  .../devicetree/bindings/thermal/sunxi-ths.txt  |  24 ++
>  arch/arm/boot/dts/sun8i-h3.dtsi|  27 +++
>  drivers/clk/sunxi/clk-sunxi.c  |  16 ++
>  drivers/thermal/Kconfig|   7 +
>  drivers/thermal/Makefile   |   1 +
>  drivers/thermal/sunxi_ths.c| 263 
> +
>  7 files changed, 339 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/thermal/sunxi-ths.txt
>  create mode 100644 drivers/thermal/sunxi_ths.c

[...]

> --- a/drivers/clk/sunxi/clk-sunxi.c
> +++ b/drivers/clk/sunxi/clk-sunxi.c
> @@ -604,6 +604,13 @@ static void sun7i_a20_get_out_factors(u32 *freq, u32 
> parent_rate,
>   *p = calcp;
>  }
>  
> +static void sun8i_h3_get_ths_factors(u32 *freq, u32 parent_rate,
> +   u8 *n, u8 *k, u8 *m, u8 *p)
> +{
> + /* Ignore the dividers as they are not continuous */
> + *freq = parent_rate;
> +}

Can you elaborate on this? What's wrong with the dividers? Which clock
frequency happens to be used for THS in the end?

> +static int sunxi_ths_h3_init(struct sunxi_ths_data *data)
> +{
> + if (data->calreg)
> + writel(readl(data->calreg) & 0xfff, data->regs + THS_H3_CDATA);
> + /* Magical constants mostly taken from Allwinner 3.4 kernel.
> +  * Seem to work fine, though this could be configurable in DT/sysft
> +  */
> + writel(0xff << THS_H3_CTRL0_SENSOR_ACQ0,
> + data->regs + THS_H3_CTRL0);
> + writel((0x3f << THS_H3_CTRL2_SENSOR_ACQ1) | BIT(THS_H3_CTRL2_SENSE_EN),
> + data->regs + THS_H3_CTRL2);
> + writel((0x390 << THS_H3_INT_CTRL_THERMAL_PER) | 
> BIT(THS_H3_INT_CTRL_DATA_IRQ_EN),
> + data->regs + THS_H3_INT_CTRL);
> + writel(BIT(THS_H3_FILTER_EN) | (0x2 << THS_H3_FILTER_TYPE),
> + data->regs + THS_H3_FILTER);
> + return 0;
> +}

The H3 manual has some nice description of these registers and explains
how these parameters should be calculated (they depend on the THS clock
frequency). No magic involved.

Currently the H3 manual is available from Orange Pi people and OLIMEX.
There is also an open request about releasing it "officially":
https://github.com/allwinner-zh/documents/issues/9

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-sunxi] [PATCH] thermal: Add support for Sunxi THS on the Allwinner H3

2015-11-18 Thread Siarhei Siamashka
Hello,

On Wed, 18 Nov 2015 21:51:48 +0100
Josef Gajdusek <a...@atx.name> wrote:

> This patch adds support for the Sunxi thermal sensor on the Allwinner H3.
> Also adds declaration of the H3 THS clock to clk-sunxi.c ignoring the
> dividers as they are not continuous (clk-divider.c cannot be used as it
> does not support setting an enable bit).
> Should be easily extendable for the A33/A83T/... as they have similar but
> not completely identical sensors.
> 
> Signed-off-by: Josef Gajdusek <a...@atx.name>

Thanks for working on this. The thermal sensor is very useful on
H3 because this SoC tends to be running rather hot at high clock
frequencies.

Do you also have plans for making use of an irq handler?

> ---
>  Documentation/devicetree/bindings/clock/sunxi.txt  |   1 +
>  .../devicetree/bindings/thermal/sunxi-ths.txt  |  24 ++
>  arch/arm/boot/dts/sun8i-h3.dtsi|  27 +++
>  drivers/clk/sunxi/clk-sunxi.c  |  16 ++
>  drivers/thermal/Kconfig|   7 +
>  drivers/thermal/Makefile   |   1 +
>  drivers/thermal/sunxi_ths.c| 263 
> +
>  7 files changed, 339 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/thermal/sunxi-ths.txt
>  create mode 100644 drivers/thermal/sunxi_ths.c

[...]

> --- a/drivers/clk/sunxi/clk-sunxi.c
> +++ b/drivers/clk/sunxi/clk-sunxi.c
> @@ -604,6 +604,13 @@ static void sun7i_a20_get_out_factors(u32 *freq, u32 
> parent_rate,
>   *p = calcp;
>  }
>  
> +static void sun8i_h3_get_ths_factors(u32 *freq, u32 parent_rate,
> +   u8 *n, u8 *k, u8 *m, u8 *p)
> +{
> + /* Ignore the dividers as they are not continuous */
> + *freq = parent_rate;
> +}

Can you elaborate on this? What's wrong with the dividers? Which clock
frequency happens to be used for THS in the end?

> +static int sunxi_ths_h3_init(struct sunxi_ths_data *data)
> +{
> + if (data->calreg)
> + writel(readl(data->calreg) & 0xfff, data->regs + THS_H3_CDATA);
> + /* Magical constants mostly taken from Allwinner 3.4 kernel.
> +  * Seem to work fine, though this could be configurable in DT/sysft
> +  */
> + writel(0xff << THS_H3_CTRL0_SENSOR_ACQ0,
> + data->regs + THS_H3_CTRL0);
> + writel((0x3f << THS_H3_CTRL2_SENSOR_ACQ1) | BIT(THS_H3_CTRL2_SENSE_EN),
> + data->regs + THS_H3_CTRL2);
> + writel((0x390 << THS_H3_INT_CTRL_THERMAL_PER) | 
> BIT(THS_H3_INT_CTRL_DATA_IRQ_EN),
> + data->regs + THS_H3_INT_CTRL);
> + writel(BIT(THS_H3_FILTER_EN) | (0x2 << THS_H3_FILTER_TYPE),
> + data->regs + THS_H3_FILTER);
> + return 0;
> +}

The H3 manual has some nice description of these registers and explains
how these parameters should be calculated (they depend on the THS clock
frequency). No magic involved.

Currently the H3 manual is available from Orange Pi people and OLIMEX.
There is also an open request about releasing it "officially":
https://github.com/allwinner-zh/documents/issues/9

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 4/4] ARM: dts: sun6i: Add dts file for MSI Primo81 tablet

2015-10-24 Thread Siarhei Siamashka
Hello,

On Fri, 23 Oct 2015 23:46:59 +0800
Chen-Yu Tsai  wrote:

> On Fri, Oct 23, 2015 at 10:53 PM, Karsten Merker  wrote:
> > On Fri, Oct 23, 2015 at 11:50:41AM +0800, Chen-Yu Tsai wrote:
> >
> >> From: Karsten Merker 
> >>
> >> The MSI Primo81 is an A31s based tablet, with 1G RAM, 16G NAND,
> >> 768x1024 IPS LCD display, mono speaker, 0.3 MP front camera, 2.0 MP
> >> rear camera, 3500 mAh battery, gt911 touchscreen, mma8452 accelerometer
> >> and rtl8188etv usb wifi. Has "power", "volume+" and "volume-" buttons
> >> (both volume buttons are also connected to the UBOOT_SEL pin). The
> >
> > Hello Chen-Yu,
> >
> > the volume button function is something that I wanted to confirm
> > again but forgot to ask previously: Siarhei had pointed out that
> > only the volume+ button triggers UBOOT_SEL, but for me actually
> > both volume buttons work as described above.  Could you
> > cross-check that on your Primo81?
> 
> IIRC both work. I can test next week.

Both buttons work this way because the Allwinner's firmware
additionally checks the buttons state via LRADC and switches
into the FEL mode in a software way. But if somebody flashes a
broken bootloader to NAND (something that is recognized by the
BROM, but dies instead of booting) then we may potentially have
a bricked device because booting from NAND has the highest
priority on A31s.

The UBOOT_SEL pin on the SoC can be used to change the default
boot order and allow to boot from the SD card or USB first
regardless of what is in NAND. So if at least one of the hardware
buttons is connected to the UBOOT_SEL pin, then we have an
unbrickable device.

Checking whether the hardware button is really connected to the
UBOOT_SEL pin can be done by reading the SRAM_VER_REG hardware
register and looking at the BOOT_SEL_PAD_STA bits:
http://linux-sunxi.org/SRAM_Controller_Register_Guide#SRAM_VER_REG

And this can be done, for example, via using the devmem2 tool:
"devmem2 0x01c00024"

Alternatively, it is also possible to use a modified variant of the
dialog tool, which is additionally polling the state of the FEL
button and interpreting long FEL button press as KEY_ENTER and
short press as KEY_DOWN:
https://github.com/ssvb/dialog-sunxi

This patched dialog tool is a part of the board type selection stub,
used for creating universal board-independent SD card based installers
for Allwinner devices, which has been available for Linux distribution
maintainers since a while ago:
http://lists.denx.de/pipermail/u-boot/2015-January/202306.html

Regarding the UBOOT_SEL pin in the MSI Primo81 tablet. After
buying this tablet, I was happy to confirm that at least the
"volume+" button is connected to the UBOOT_SEL pin, making the
tablet unbrickable. However appears that it was not just some sane
decision made by MSI engineers, but in fact connecting both LRADC
and UBOOT_SEL to tablet buttons is a part of the standard Allwinner's
reference schematics. One can search for "a20_pad_std_v1_1.pdf",
"a13-sch.pdf", "A31_PAD_STD_V1_90_130225.pdf" documents on the
Internet to find this information. Basically, we should expect
the majority of Allwinner A31(s) tablets to have a hardware FEL
button and be perfectly unbrickable :-)

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 4/4] ARM: dts: sun6i: Add dts file for MSI Primo81 tablet

2015-10-24 Thread Siarhei Siamashka
Hello,

On Fri, 23 Oct 2015 23:46:59 +0800
Chen-Yu Tsai <w...@csie.org> wrote:

> On Fri, Oct 23, 2015 at 10:53 PM, Karsten Merker <mer...@debian.org> wrote:
> > On Fri, Oct 23, 2015 at 11:50:41AM +0800, Chen-Yu Tsai wrote:
> >
> >> From: Karsten Merker <mer...@debian.org>
> >>
> >> The MSI Primo81 is an A31s based tablet, with 1G RAM, 16G NAND,
> >> 768x1024 IPS LCD display, mono speaker, 0.3 MP front camera, 2.0 MP
> >> rear camera, 3500 mAh battery, gt911 touchscreen, mma8452 accelerometer
> >> and rtl8188etv usb wifi. Has "power", "volume+" and "volume-" buttons
> >> (both volume buttons are also connected to the UBOOT_SEL pin). The
> >
> > Hello Chen-Yu,
> >
> > the volume button function is something that I wanted to confirm
> > again but forgot to ask previously: Siarhei had pointed out that
> > only the volume+ button triggers UBOOT_SEL, but for me actually
> > both volume buttons work as described above.  Could you
> > cross-check that on your Primo81?
> 
> IIRC both work. I can test next week.

Both buttons work this way because the Allwinner's firmware
additionally checks the buttons state via LRADC and switches
into the FEL mode in a software way. But if somebody flashes a
broken bootloader to NAND (something that is recognized by the
BROM, but dies instead of booting) then we may potentially have
a bricked device because booting from NAND has the highest
priority on A31s.

The UBOOT_SEL pin on the SoC can be used to change the default
boot order and allow to boot from the SD card or USB first
regardless of what is in NAND. So if at least one of the hardware
buttons is connected to the UBOOT_SEL pin, then we have an
unbrickable device.

Checking whether the hardware button is really connected to the
UBOOT_SEL pin can be done by reading the SRAM_VER_REG hardware
register and looking at the BOOT_SEL_PAD_STA bits:
http://linux-sunxi.org/SRAM_Controller_Register_Guide#SRAM_VER_REG

And this can be done, for example, via using the devmem2 tool:
"devmem2 0x01c00024"

Alternatively, it is also possible to use a modified variant of the
dialog tool, which is additionally polling the state of the FEL
button and interpreting long FEL button press as KEY_ENTER and
short press as KEY_DOWN:
https://github.com/ssvb/dialog-sunxi

This patched dialog tool is a part of the board type selection stub,
used for creating universal board-independent SD card based installers
for Allwinner devices, which has been available for Linux distribution
maintainers since a while ago:
http://lists.denx.de/pipermail/u-boot/2015-January/202306.html

Regarding the UBOOT_SEL pin in the MSI Primo81 tablet. After
buying this tablet, I was happy to confirm that at least the
"volume+" button is connected to the UBOOT_SEL pin, making the
tablet unbrickable. However appears that it was not just some sane
decision made by MSI engineers, but in fact connecting both LRADC
and UBOOT_SEL to tablet buttons is a part of the standard Allwinner's
reference schematics. One can search for "a20_pad_std_v1_1.pdf",
"a13-sch.pdf", "A31_PAD_STD_V1_90_130225.pdf" documents on the
Internet to find this information. Basically, we should expect
the majority of Allwinner A31(s) tablets to have a hardware FEL
button and be perfectly unbrickable :-)

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv2] ARM64: Add AT_ARM64_MIDR to the aux vector

2015-09-02 Thread Siarhei Siamashka
On Wed, 2 Sep 2015 00:28:28 +
"Pinski, Andrew"  wrote:

> > On Sep 2, 2015, at 3:13 AM, Siarhei Siamashka  
> > wrote:
> > 
> > On Wed, 2 Sep 2015 01:58:56 +0800
> > pins...@gmail.com wrote:
> > 
> >>> On Sep 2, 2015, at 1:30 AM, Mark Rutland  wrote:
> >>> 
> >>> [...]
> >>> 
> >>>>>>> On Sat, Aug 29, 2015 at 07:46:22PM +0100, Andrew Pinski wrote:
> >>>>>>> It is useful to pass down MIDR register down to userland if all of
> >>>>>>> the online cores are all the same type.  This adds AT_ARM64_MIDR
> >>>>>>> aux vector type and passes down the midr system register.
> >>>>>>> 
> >>>>>>> This is alternative to MIDR_EL1 part of
> >>>>>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358995.html.
> >>>>>>> It allows for faster access to midr_el1 than going through a trap and
> >>>>>>> does not exist if the set of cores are not the same.
> >>>>>> 
> >>>>>> I'm not sure I follow the rationale. If speed is important the
> >>>>>> application can cache the value the first time it reads it with a trap.
> >>>>> 
> >>>>> It is also about compatibility also. Exposing the register is not 
> >>>>> backwards compatible but using the aux vector is.
> >>>> 
> >>>> That would also break big.little too. So either break it with hot plug 
> >>>> or break it in userland, your choice.
> >>> 
> >>> The value wouldn't be representative of the system as a whole; that is
> >>> true. However, we never guaranteed that it was, while the aux vector
> >>> code implied that we did.
> >> 
> >> Yes but I guess you talk about caching the value in userspace but doing
> >> it via the aux vector is the same as your suggestion. Just one
> >> difference is you don't get the aux vector entry if there is a CPU
> >> that is online which is different. No difference from your suggestion
> >> of caching it. Without considering hot pug for a second (that is a
> >> huge different issue all together), if userland wants to know if all
> >> up CPUs have the same midr, they would either read /sys entries (lots
> >> of syscalls) or bind to each CPU and do the trap. That means at least
> >> three or two syscalls/traps for each CPU. My way is none and gets a
> >> value of midr if they are all the Same for free. 
> > 
> > Andrew, how do you propose to get the value of MIDR? Open the
> > "/proc/self/auxv", read it, do a linear search in the buffer to find
> > the required entry and then read the value? Or use the glibc specific
> > getauxval() function (https://lwn.net/Articles/519085) ?
> 
> This is inside glibc I am talking about so getauxval. 

OK. Point taken.

> > Regarding the caching implementation, one can open and parse the
> > "/proc/cpuinfo" file (with older kernels) or read the new sysfs
> > entries to get the MIDR value for each core. Then create a lookup
> > table. As an additional bonus, this lookup table can contain not
> > just the MIDR values, but any arbitrary data in any format (for
> > example, a function pointer to the memcpy function or anything else).
> 
> You don't want to do that early on in ld.so each time a program
> starts up. Too much overhead. 

Right.

> > After the lookup table is available, one can use the getcpu() syscall
> > for getting the CPU number and do the table lookup. And for getting
> > reasonable performance, implement the vdso variant of the getcpu()
> > syscall.
> > 
> > All of this internal ugliness would be best abstracted inside
> > of the GCC __builtin_cpu_init(), __builtin_cpu_is() and
> > __builtin_cpu_supports() builtins:
> >http://gcc.gnu.org/gcc-4.8/changes.html
> 
> Yes but this is about glibc support and not other userland
> support. Having glibc depend on that is even worse. 

Well, basically you are saying that you only care about your
individual use case (glibc only, 64-bit only, no support for
big.LITTLE) and just don't give rats about anything else. This
is not necessarily wrong, but we are not getting the problem
solved on a wider scale with such approach.

Regarding the reliance on the getauxval() function. Appears that
at least it seems to be also supported in recent versions of Android
and in musl too. So it already has a wide support in the Linux
different flavours and just needs a configure check

Re: [PATCHv2] ARM64: Add AT_ARM64_MIDR to the aux vector

2015-09-02 Thread Siarhei Siamashka
On Wed, 2 Sep 2015 00:28:28 +
"Pinski, Andrew" <andrew.pin...@caviumnetworks.com> wrote:

> > On Sep 2, 2015, at 3:13 AM, Siarhei Siamashka <siarhei.siamas...@gmail.com> 
> > wrote:
> > 
> > On Wed, 2 Sep 2015 01:58:56 +0800
> > pins...@gmail.com wrote:
> > 
> >>> On Sep 2, 2015, at 1:30 AM, Mark Rutland <mark.rutl...@arm.com> wrote:
> >>> 
> >>> [...]
> >>> 
> >>>>>>> On Sat, Aug 29, 2015 at 07:46:22PM +0100, Andrew Pinski wrote:
> >>>>>>> It is useful to pass down MIDR register down to userland if all of
> >>>>>>> the online cores are all the same type.  This adds AT_ARM64_MIDR
> >>>>>>> aux vector type and passes down the midr system register.
> >>>>>>> 
> >>>>>>> This is alternative to MIDR_EL1 part of
> >>>>>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358995.html.
> >>>>>>> It allows for faster access to midr_el1 than going through a trap and
> >>>>>>> does not exist if the set of cores are not the same.
> >>>>>> 
> >>>>>> I'm not sure I follow the rationale. If speed is important the
> >>>>>> application can cache the value the first time it reads it with a trap.
> >>>>> 
> >>>>> It is also about compatibility also. Exposing the register is not 
> >>>>> backwards compatible but using the aux vector is.
> >>>> 
> >>>> That would also break big.little too. So either break it with hot plug 
> >>>> or break it in userland, your choice.
> >>> 
> >>> The value wouldn't be representative of the system as a whole; that is
> >>> true. However, we never guaranteed that it was, while the aux vector
> >>> code implied that we did.
> >> 
> >> Yes but I guess you talk about caching the value in userspace but doing
> >> it via the aux vector is the same as your suggestion. Just one
> >> difference is you don't get the aux vector entry if there is a CPU
> >> that is online which is different. No difference from your suggestion
> >> of caching it. Without considering hot pug for a second (that is a
> >> huge different issue all together), if userland wants to know if all
> >> up CPUs have the same midr, they would either read /sys entries (lots
> >> of syscalls) or bind to each CPU and do the trap. That means at least
> >> three or two syscalls/traps for each CPU. My way is none and gets a
> >> value of midr if they are all the Same for free. 
> > 
> > Andrew, how do you propose to get the value of MIDR? Open the
> > "/proc/self/auxv", read it, do a linear search in the buffer to find
> > the required entry and then read the value? Or use the glibc specific
> > getauxval() function (https://lwn.net/Articles/519085) ?
> 
> This is inside glibc I am talking about so getauxval. 

OK. Point taken.

> > Regarding the caching implementation, one can open and parse the
> > "/proc/cpuinfo" file (with older kernels) or read the new sysfs
> > entries to get the MIDR value for each core. Then create a lookup
> > table. As an additional bonus, this lookup table can contain not
> > just the MIDR values, but any arbitrary data in any format (for
> > example, a function pointer to the memcpy function or anything else).
> 
> You don't want to do that early on in ld.so each time a program
> starts up. Too much overhead. 

Right.

> > After the lookup table is available, one can use the getcpu() syscall
> > for getting the CPU number and do the table lookup. And for getting
> > reasonable performance, implement the vdso variant of the getcpu()
> > syscall.
> > 
> > All of this internal ugliness would be best abstracted inside
> > of the GCC __builtin_cpu_init(), __builtin_cpu_is() and
> > __builtin_cpu_supports() builtins:
> >http://gcc.gnu.org/gcc-4.8/changes.html
> 
> Yes but this is about glibc support and not other userland
> support. Having glibc depend on that is even worse. 

Well, basically you are saying that you only care about your
individual use case (glibc only, 64-bit only, no support for
big.LITTLE) and just don't give rats about anything else. This
is not necessarily wrong, but we are not getting the problem
solved on a wider scale with such approach.

Regarding the reliance on the getauxval() function. Appears that
at least it seems to be also supported in recent versions of Android
and in musl to

Re: [PATCHv2] ARM64: Add AT_ARM64_MIDR to the aux vector

2015-09-01 Thread Siarhei Siamashka
  1. Implement __builtin_cpu_init(), __builtin_cpu_is() and
 __builtin_cpu_supports() builtins, which rely on reading sysfs
 entries (with a fallback to /proc/cpuinfo parsing on old kernels)
 and the getcpu() syscall for the reasonably accurate core type
 runtime identification on big.LITTLE systems.

On the applications/libraries side (including, but not limited to glibc)
it means:
  1. Rely on the GCC __builtin_cpu_init(), __builtin_cpu_is() and
 __builtin_cpu_supports() builtins.
  2. Maybe implement the replacement of these builtins to get all the
 same functionality even with the old versions of GCC.

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv2] ARM64: Add AT_ARM64_MIDR to the aux vector

2015-09-01 Thread Siarhei Siamashka
the GCC side it means:
  1. Implement __builtin_cpu_init(), __builtin_cpu_is() and
 __builtin_cpu_supports() builtins, which rely on reading sysfs
 entries (with a fallback to /proc/cpuinfo parsing on old kernels)
 and the getcpu() syscall for the reasonably accurate core type
     runtime identification on big.LITTLE systems.

On the applications/libraries side (including, but not limited to glibc)
it means:
  1. Rely on the GCC __builtin_cpu_init(), __builtin_cpu_is() and
 __builtin_cpu_supports() builtins.
  2. Maybe implement the replacement of these builtins to get all the
 same functionality even with the old versions of GCC.

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] drm/exynos/fbdev: don't set fix.smem/mmio_{start,len}

2014-06-19 Thread Siarhei Siamashka
On Fri,  4 Apr 2014 17:22:01 +0800
Daniel Kurtz  wrote:

> Kernel access to the eyxnos fbdev framebuffer is via its gem object's
> kernel mapping (kvaddr, stored in info->screen_base).
> 
> User space access is provided by mmap(), read() and write() of /dev/fb/fb0.
> These functions also only use screen_base/screen_size().
> 
> Therefore, it is not necessary to set fix->smem_{start,len} or
> fix->mmio_{start,len} fields.
> 
> This avoids leaking kernel, physical and dma mapped addresses to user
> space via the ioctls FBIOGET_VSCREENINFO and FBIOGET_FSCREENINFO.
> 
> Signed-off-by: Daniel Kurtz 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_fbdev.c | 7 ---
>  1 file changed, 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c 
> b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> index 5fa342e..2dcc589 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
> @@ -123,14 +123,7 @@ static int exynos_drm_fbdev_update(struct drm_fb_helper 
> *helper,
>  
>   dev->mode_config.fb_base = (resource_size_t)buffer->dma_addr;
>   fbi->screen_base = buffer->kvaddr + offset;
> - if (is_drm_iommu_supported(dev))
> - fbi->fix.smem_start = (unsigned long)
> - (page_to_phys(sg_page(buffer->sgt->sgl)) + offset);
> - else
> - fbi->fix.smem_start = (unsigned long)buffer->dma_addr;
> -
>   fbi->screen_size = size;
> - fbi->fix.smem_len = size;

Can we keep proper initialization of 'smem_len'? Some userland
applications use it for calculating the size for mmap:


http://cgit.freedesktop.org/xorg/xserver/tree/hw/xfree86/fbdevhw/fbdevhw.c?id=xorg-server-1.15.99.903#n571

>  
>   return 0;
>  }

Basically, this patch breaks the xf86-video-fbdev ddx and some users
are already unhappy.

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] drm/exynos/fbdev: don't set fix.smem/mmio_{start,len}

2014-06-19 Thread Siarhei Siamashka
On Fri,  4 Apr 2014 17:22:01 +0800
Daniel Kurtz djku...@chromium.org wrote:

 Kernel access to the eyxnos fbdev framebuffer is via its gem object's
 kernel mapping (kvaddr, stored in info-screen_base).
 
 User space access is provided by mmap(), read() and write() of /dev/fb/fb0.
 These functions also only use screen_base/screen_size().
 
 Therefore, it is not necessary to set fix-smem_{start,len} or
 fix-mmio_{start,len} fields.
 
 This avoids leaking kernel, physical and dma mapped addresses to user
 space via the ioctls FBIOGET_VSCREENINFO and FBIOGET_FSCREENINFO.
 
 Signed-off-by: Daniel Kurtz djku...@chromium.org
 ---
  drivers/gpu/drm/exynos/exynos_drm_fbdev.c | 7 ---
  1 file changed, 7 deletions(-)
 
 diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c 
 b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
 index 5fa342e..2dcc589 100644
 --- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
 +++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
 @@ -123,14 +123,7 @@ static int exynos_drm_fbdev_update(struct drm_fb_helper 
 *helper,
  
   dev-mode_config.fb_base = (resource_size_t)buffer-dma_addr;
   fbi-screen_base = buffer-kvaddr + offset;
 - if (is_drm_iommu_supported(dev))
 - fbi-fix.smem_start = (unsigned long)
 - (page_to_phys(sg_page(buffer-sgt-sgl)) + offset);
 - else
 - fbi-fix.smem_start = (unsigned long)buffer-dma_addr;
 -
   fbi-screen_size = size;
 - fbi-fix.smem_len = size;

Can we keep proper initialization of 'smem_len'? Some userland
applications use it for calculating the size for mmap:


http://cgit.freedesktop.org/xorg/xserver/tree/hw/xfree86/fbdevhw/fbdevhw.c?id=xorg-server-1.15.99.903#n571

  
   return 0;
  }

Basically, this patch breaks the xf86-video-fbdev ddx and some users
are already unhappy.

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-sunxi] [PATCH 0/8] clocksource: sunxi: Timer fixes and cleanup

2013-06-28 Thread Siarhei Siamashka
On Fri, 28 Jun 2013 12:26:10 +0200 (CEST)
Thomas Gleixner  wrote:

> On Fri, 28 Jun 2013, Siarhei Siamashka wrote:
> > On Fri, 28 Jun 2013 09:43:37 +0800
> > 张猛  wrote:
> > 
> > > > The A10 manual from http://free-electrons.com/~maxime/pub/datasheet/
> > > > does not seem to contain any details about what bad things may happen
> > > > if we try to read CNT64_LO_REG while latching is still in progress and
> > > > CNT64_RL_EN bit in CNT64_CTRL_REG has not changed to zero yet.
> > > > I can imagine the following possible scenarios:
> > > >  1. We read either the old stale CNT64_LO_REG value or the new
> > > > correct value.
> > > >  2. We read either the old stale CNT64_LO_REG value or the new
> > > > correct value, or some random garbage.
> > > >  3. The processor may deadlock, eat your dog, or do some other
> > > > nasty thing.
> > > >
> > > > In the case of 1, we probably can get away without using any spinlocks?
> > > 
> > > About the 64bits counter, the latch bit is needed because of the
> > > asynchronous circuit. The internal circuit of 64bits counter is
> > > working under 24Mhz clock, and CNT_LO/HI is read with APB clock.
> > > So the clock synchronize is needed. The function of the latch is
> > > synchronous the 64bits counter from 24Mhz clock domain to APB clock
> > > domain. So, if the latch is not completely, value of the CNT_LO/HI
> > > maybe a random value, because some bits are latched, but others are
> > > not. So, the CNT_LO/HI should be read after latch is completely.
> > > The latch just takes 3 cycles of 24Mhz clock, the time is nearly
> > > 0.125 micro-second.
> > 
> > Thanks for the clarification! It is very much appreciated.
> > 
> > So basically we get scenario 2, which still allows some optimizations
> > compared to scenario 3. On single-core systems (Allwinner A10), it
> > probably makes sense to avoid spinlocks overhead and just place
> 
> Spinlocks are NOPs on UP and just disable preemption, but they
> provide you the same ordering guarantees as spinlocks on SMP. So no
> special case optimization necessary.

Still I would not want some high priority and latency sensitive junk
(such as pulseaudio for example) not to get scheduled at the right
time because of some low priority gettimeofday spammer application.
The time spent latching and reading these counters is non negligible.

Though I agree Maxime that first we need an initial implementation,
which just works safely. And then we can optimize it by testing real
use cases and providing the relevant benchmark numbers. Knowing the
precise details about the hardware helps.

But enough bikeshedding, it's time for me to shut up :)

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-sunxi] [PATCH 0/8] clocksource: sunxi: Timer fixes and cleanup

2013-06-28 Thread Siarhei Siamashka
On Fri, 28 Jun 2013 12:26:10 +0200 (CEST)
Thomas Gleixner t...@linutronix.de wrote:

 On Fri, 28 Jun 2013, Siarhei Siamashka wrote:
  On Fri, 28 Jun 2013 09:43:37 +0800
  张猛 ke...@allwinnertech.com wrote:
  
The A10 manual from http://free-electrons.com/~maxime/pub/datasheet/
does not seem to contain any details about what bad things may happen
if we try to read CNT64_LO_REG while latching is still in progress and
CNT64_RL_EN bit in CNT64_CTRL_REG has not changed to zero yet.
I can imagine the following possible scenarios:
 1. We read either the old stale CNT64_LO_REG value or the new
correct value.
 2. We read either the old stale CNT64_LO_REG value or the new
correct value, or some random garbage.
 3. The processor may deadlock, eat your dog, or do some other
nasty thing.
   
In the case of 1, we probably can get away without using any spinlocks?
   
   About the 64bits counter, the latch bit is needed because of the
   asynchronous circuit. The internal circuit of 64bits counter is
   working under 24Mhz clock, and CNT_LO/HI is read with APB clock.
   So the clock synchronize is needed. The function of the latch is
   synchronous the 64bits counter from 24Mhz clock domain to APB clock
   domain. So, if the latch is not completely, value of the CNT_LO/HI
   maybe a random value, because some bits are latched, but others are
   not. So, the CNT_LO/HI should be read after latch is completely.
   The latch just takes 3 cycles of 24Mhz clock, the time is nearly
   0.125 micro-second.
  
  Thanks for the clarification! It is very much appreciated.
  
  So basically we get scenario 2, which still allows some optimizations
  compared to scenario 3. On single-core systems (Allwinner A10), it
  probably makes sense to avoid spinlocks overhead and just place
 
 Spinlocks are NOPs on UP and just disable preemption, but they
 provide you the same ordering guarantees as spinlocks on SMP. So no
 special case optimization necessary.

Still I would not want some high priority and latency sensitive junk
(such as pulseaudio for example) not to get scheduled at the right
time because of some low priority gettimeofday spammer application.
The time spent latching and reading these counters is non negligible.

Though I agree Maxime that first we need an initial implementation,
which just works safely. And then we can optimize it by testing real
use cases and providing the relevant benchmark numbers. Knowing the
precise details about the hardware helps.

But enough bikeshedding, it's time for me to shut up :)

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-sunxi] [PATCH 0/8] clocksource: sunxi: Timer fixes and cleanup

2013-06-27 Thread Siarhei Siamashka
On Thu, 27 Jun 2013 18:54:36 +0200
Maxime Ripard  wrote:

> On Thu, Jun 27, 2013 at 11:54:11AM +0200, Hans de Goede wrote:
> > I notice that unlike the sunxi-3.4 code you don't do any locking,
> > so how do you stop 2 clocksource calls from racing (and thus
> > getting a possible wrong value because of things not
> > being properly latched) ?
> 
> Hmm, right. I'll add a spinlock.

I think the best would be to ask the Allwinner people (it's good to
have them in CC, right?) whether anything wrong can happen because of
"things not being properly latched".

The A10 manual from http://free-electrons.com/~maxime/pub/datasheet/
does not seem to contain any details about what bad things may happen
if we try to read CNT64_LO_REG while latching is still in progress and
CNT64_RL_EN bit in CNT64_CTRL_REG has not changed to zero yet.
I can imagine the following possible scenarios:
  1. We read either the old stale CNT64_LO_REG value or the new
 correct value.
  2. We read either the old stale CNT64_LO_REG value or the new
 correct value, or some random garbage.
  3. The processor may deadlock, eat your dog, or do some other
 nasty thing.

In the case of 1, we probably can get away without using any spinlocks?

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-sunxi] [PATCH 2/8] clocksource: sun4i: Add clocksource and sched clock drivers

2013-06-27 Thread Siarhei Siamashka
On Thu, 27 Jun 2013 19:02:28 +0200
Maxime Ripard  wrote:

> Hi Siarhei,
> 
> On Thu, Jun 27, 2013 at 01:17:29PM +0300, Siarhei Siamashka wrote:
> > On Wed, 26 Jun 2013 23:16:55 +0200
> > Maxime Ripard  wrote:
> > 
> > > The A10 and the A13 has a 64 bits free running counter that we can use
> > > as a clocksource and a sched clock, that were both not used yet on these
> > > platforms.
> > > 
> > > Signed-off-by: Maxime Ripard 
> > > ---
> > >  drivers/clocksource/sun4i_timer.c | 27 +++
> > >  1 file changed, 27 insertions(+)
> > > 
> > > diff --git a/drivers/clocksource/sun4i_timer.c 
> > > b/drivers/clocksource/sun4i_timer.c
> > > index bdf34d9..1d2eaa0 100644
> > > --- a/drivers/clocksource/sun4i_timer.c
> > > +++ b/drivers/clocksource/sun4i_timer.c
> > > @@ -23,6 +23,8 @@
> > >  #include 
> > >  #include 
> > >  
> > > +#include 
> > > +
> > >  #define TIMER_IRQ_EN_REG 0x00
> > >  #define TIMER_IRQ_EN(val)BIT(val)
> > >  #define TIMER_IRQ_ST_REG 0x04
> > > @@ -34,6 +36,11 @@
> > >  #define TIMER_CNTVAL_REG(val)(0x10 * val + 0x18)
> > >  
> > >  #define TIMER_SCAL   16
> > > +#define TIMER_CNT64_CTL_REG  0xa0
> > > +#define TIMER_CNT64_CTL_CLR  BIT(0)
> > > +#define TIMER_CNT64_CTL_RL   BIT(1)
> > > +#define TIMER_CNT64_LOW_REG  0xa4
> > > +#define TIMER_CNT64_HIGH_REG 0xa8
> > >  
> > >  static void __iomem *timer_base;
> > >  
> > > @@ -96,6 +103,20 @@ static struct irqaction sun4i_timer_irq = {
> > >   .dev_id = _clockevent,
> > >  };
> > >  
> > > +static u32 sun4i_timer_sched_read(void)
> > > +{
> > > + u32 reg = readl(timer_base + TIMER_CNT64_CTL_REG);
> > 
> > If we can be absolutely sure that nothing else may ever change
> > the TIMER_CNT64_CTL_REG, then its default value can be probably
> > cached instead of doing expensive read from the hardware register
> > each time?
> 
> Since it's a free-running counter, its value will always change, so the
> caching will bring no additions at all, right?

Sorry, 'caching' was not a very good description for something that is
already a compile time constant. I mean just replace

u32 reg = readl(timer_base + TIMER_CNT64_CTL_REG);

with 

u32 reg = TIMER_CNT64_CTL_CLR;

Because we know that the TIMER_CNT64_CTL_REG is already supposed
to have the default TIMER_CNT64_CTL_CLR value (initialized in the
'sun4i_timer_init' function) between calls to 'sun4i_timer_sched_read'.
Inside of 'sun4i_timer_sched_read' we set an extra TIMER_CNT64_CTL_RL
bit in this register, but wait until it clears, effectively reverting
TIMER_CNT64_CTL_REG register back to the default TIMER_CNT64_CTL_CLR
value.

Removing this extra HW register read can save roughly a hundred of CPU
cycles here and provide a ~10% overall improvement for gettimeofday
(these estimates are based on the earlier benchmarks done with the
Allwinner 3.4 kernel).

Or maybe I'm overlooking something?

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-sunxi] [PATCH 2/8] clocksource: sun4i: Add clocksource and sched clock drivers

2013-06-27 Thread Siarhei Siamashka
On Wed, 26 Jun 2013 23:16:55 +0200
Maxime Ripard  wrote:

> The A10 and the A13 has a 64 bits free running counter that we can use
> as a clocksource and a sched clock, that were both not used yet on these
> platforms.
> 
> Signed-off-by: Maxime Ripard 
> ---
>  drivers/clocksource/sun4i_timer.c | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/clocksource/sun4i_timer.c 
> b/drivers/clocksource/sun4i_timer.c
> index bdf34d9..1d2eaa0 100644
> --- a/drivers/clocksource/sun4i_timer.c
> +++ b/drivers/clocksource/sun4i_timer.c
> @@ -23,6 +23,8 @@
>  #include 
>  #include 
>  
> +#include 
> +
>  #define TIMER_IRQ_EN_REG 0x00
>  #define TIMER_IRQ_EN(val)BIT(val)
>  #define TIMER_IRQ_ST_REG 0x04
> @@ -34,6 +36,11 @@
>  #define TIMER_CNTVAL_REG(val)(0x10 * val + 0x18)
>  
>  #define TIMER_SCAL   16
> +#define TIMER_CNT64_CTL_REG  0xa0
> +#define TIMER_CNT64_CTL_CLR  BIT(0)
> +#define TIMER_CNT64_CTL_RL   BIT(1)
> +#define TIMER_CNT64_LOW_REG  0xa4
> +#define TIMER_CNT64_HIGH_REG 0xa8
>  
>  static void __iomem *timer_base;
>  
> @@ -96,6 +103,20 @@ static struct irqaction sun4i_timer_irq = {
>   .dev_id = _clockevent,
>  };
>  
> +static u32 sun4i_timer_sched_read(void)
> +{
> + u32 reg = readl(timer_base + TIMER_CNT64_CTL_REG);

If we can be absolutely sure that nothing else may ever change
the TIMER_CNT64_CTL_REG, then its default value can be probably
cached instead of doing expensive read from the hardware register
each time?

The gettimeofday() abusers will feel a bit less pain. ARM does not
currently enjoy the VDSO optimized gettimeofday, so the software
which has been only tested on x86 may get a nasty surprise (an order
of magnitude higher gettimeofday overhead).

> + writel(reg | TIMER_CNT64_CTL_RL, timer_base + TIMER_CNT64_CTL_REG);
> + while (readl(timer_base + TIMER_CNT64_CTL_REG) & TIMER_CNT64_CTL_REG);

Some may think that this particular loop looks like a performance
bottleneck, but it is very rarely run for more than one iteration.
In fact, most of the time it just happens to be a single HW register
read.

> +
> + return readl(timer_base + TIMER_CNT64_LOW_REG);
> +}
> +
> +static cycle_t sun4i_timer_clksrc_read(struct clocksource *c)
> +{
> + return sun4i_timer_sched_read();
> +}
> +
>  static void __init sun4i_timer_init(struct device_node *node)
>  {
>   unsigned long rate = 0;
> @@ -117,6 +138,12 @@ static void __init sun4i_timer_init(struct device_node 
> *node)
>  
>   rate = clk_get_rate(clk);
>  
> + writel(TIMER_CNT64_CTL_CLR, timer_base + TIMER_CNT64_CTL_REG);
> + setup_sched_clock(sun4i_timer_sched_read, 32, clk_get_rate(clk));
> + clocksource_mmio_init(timer_base + TIMER_CNT64_LOW_REG, node->name,
> +   clk_get_rate(clk), 300, 32,
> +   sun4i_timer_clksrc_read);
> +
>   writel(rate / (TIMER_SCAL * HZ),
>  timer_base + TIMER_INTVAL_REG(0));
>  



-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-sunxi] [PATCH 2/8] clocksource: sun4i: Add clocksource and sched clock drivers

2013-06-27 Thread Siarhei Siamashka
On Wed, 26 Jun 2013 23:16:55 +0200
Maxime Ripard maxime.rip...@free-electrons.com wrote:

 The A10 and the A13 has a 64 bits free running counter that we can use
 as a clocksource and a sched clock, that were both not used yet on these
 platforms.
 
 Signed-off-by: Maxime Ripard maxime.rip...@free-electrons.com
 ---
  drivers/clocksource/sun4i_timer.c | 27 +++
  1 file changed, 27 insertions(+)
 
 diff --git a/drivers/clocksource/sun4i_timer.c 
 b/drivers/clocksource/sun4i_timer.c
 index bdf34d9..1d2eaa0 100644
 --- a/drivers/clocksource/sun4i_timer.c
 +++ b/drivers/clocksource/sun4i_timer.c
 @@ -23,6 +23,8 @@
  #include linux/of_address.h
  #include linux/of_irq.h
  
 +#include asm/sched_clock.h
 +
  #define TIMER_IRQ_EN_REG 0x00
  #define TIMER_IRQ_EN(val)BIT(val)
  #define TIMER_IRQ_ST_REG 0x04
 @@ -34,6 +36,11 @@
  #define TIMER_CNTVAL_REG(val)(0x10 * val + 0x18)
  
  #define TIMER_SCAL   16
 +#define TIMER_CNT64_CTL_REG  0xa0
 +#define TIMER_CNT64_CTL_CLR  BIT(0)
 +#define TIMER_CNT64_CTL_RL   BIT(1)
 +#define TIMER_CNT64_LOW_REG  0xa4
 +#define TIMER_CNT64_HIGH_REG 0xa8
  
  static void __iomem *timer_base;
  
 @@ -96,6 +103,20 @@ static struct irqaction sun4i_timer_irq = {
   .dev_id = sun4i_clockevent,
  };
  
 +static u32 sun4i_timer_sched_read(void)
 +{
 + u32 reg = readl(timer_base + TIMER_CNT64_CTL_REG);

If we can be absolutely sure that nothing else may ever change
the TIMER_CNT64_CTL_REG, then its default value can be probably
cached instead of doing expensive read from the hardware register
each time?

The gettimeofday() abusers will feel a bit less pain. ARM does not
currently enjoy the VDSO optimized gettimeofday, so the software
which has been only tested on x86 may get a nasty surprise (an order
of magnitude higher gettimeofday overhead).

 + writel(reg | TIMER_CNT64_CTL_RL, timer_base + TIMER_CNT64_CTL_REG);
 + while (readl(timer_base + TIMER_CNT64_CTL_REG)  TIMER_CNT64_CTL_REG);

Some may think that this particular loop looks like a performance
bottleneck, but it is very rarely run for more than one iteration.
In fact, most of the time it just happens to be a single HW register
read.

 +
 + return readl(timer_base + TIMER_CNT64_LOW_REG);
 +}
 +
 +static cycle_t sun4i_timer_clksrc_read(struct clocksource *c)
 +{
 + return sun4i_timer_sched_read();
 +}
 +
  static void __init sun4i_timer_init(struct device_node *node)
  {
   unsigned long rate = 0;
 @@ -117,6 +138,12 @@ static void __init sun4i_timer_init(struct device_node 
 *node)
  
   rate = clk_get_rate(clk);
  
 + writel(TIMER_CNT64_CTL_CLR, timer_base + TIMER_CNT64_CTL_REG);
 + setup_sched_clock(sun4i_timer_sched_read, 32, clk_get_rate(clk));
 + clocksource_mmio_init(timer_base + TIMER_CNT64_LOW_REG, node-name,
 +   clk_get_rate(clk), 300, 32,
 +   sun4i_timer_clksrc_read);
 +
   writel(rate / (TIMER_SCAL * HZ),
  timer_base + TIMER_INTVAL_REG(0));
  



-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-sunxi] [PATCH 2/8] clocksource: sun4i: Add clocksource and sched clock drivers

2013-06-27 Thread Siarhei Siamashka
On Thu, 27 Jun 2013 19:02:28 +0200
Maxime Ripard maxime.rip...@free-electrons.com wrote:

 Hi Siarhei,
 
 On Thu, Jun 27, 2013 at 01:17:29PM +0300, Siarhei Siamashka wrote:
  On Wed, 26 Jun 2013 23:16:55 +0200
  Maxime Ripard maxime.rip...@free-electrons.com wrote:
  
   The A10 and the A13 has a 64 bits free running counter that we can use
   as a clocksource and a sched clock, that were both not used yet on these
   platforms.
   
   Signed-off-by: Maxime Ripard maxime.rip...@free-electrons.com
   ---
drivers/clocksource/sun4i_timer.c | 27 +++
1 file changed, 27 insertions(+)
   
   diff --git a/drivers/clocksource/sun4i_timer.c 
   b/drivers/clocksource/sun4i_timer.c
   index bdf34d9..1d2eaa0 100644
   --- a/drivers/clocksource/sun4i_timer.c
   +++ b/drivers/clocksource/sun4i_timer.c
   @@ -23,6 +23,8 @@
#include linux/of_address.h
#include linux/of_irq.h

   +#include asm/sched_clock.h
   +
#define TIMER_IRQ_EN_REG 0x00
#define TIMER_IRQ_EN(val)BIT(val)
#define TIMER_IRQ_ST_REG 0x04
   @@ -34,6 +36,11 @@
#define TIMER_CNTVAL_REG(val)(0x10 * val + 0x18)

#define TIMER_SCAL   16
   +#define TIMER_CNT64_CTL_REG  0xa0
   +#define TIMER_CNT64_CTL_CLR  BIT(0)
   +#define TIMER_CNT64_CTL_RL   BIT(1)
   +#define TIMER_CNT64_LOW_REG  0xa4
   +#define TIMER_CNT64_HIGH_REG 0xa8

static void __iomem *timer_base;

   @@ -96,6 +103,20 @@ static struct irqaction sun4i_timer_irq = {
 .dev_id = sun4i_clockevent,
};

   +static u32 sun4i_timer_sched_read(void)
   +{
   + u32 reg = readl(timer_base + TIMER_CNT64_CTL_REG);
  
  If we can be absolutely sure that nothing else may ever change
  the TIMER_CNT64_CTL_REG, then its default value can be probably
  cached instead of doing expensive read from the hardware register
  each time?
 
 Since it's a free-running counter, its value will always change, so the
 caching will bring no additions at all, right?

Sorry, 'caching' was not a very good description for something that is
already a compile time constant. I mean just replace

u32 reg = readl(timer_base + TIMER_CNT64_CTL_REG);

with 

u32 reg = TIMER_CNT64_CTL_CLR;

Because we know that the TIMER_CNT64_CTL_REG is already supposed
to have the default TIMER_CNT64_CTL_CLR value (initialized in the
'sun4i_timer_init' function) between calls to 'sun4i_timer_sched_read'.
Inside of 'sun4i_timer_sched_read' we set an extra TIMER_CNT64_CTL_RL
bit in this register, but wait until it clears, effectively reverting
TIMER_CNT64_CTL_REG register back to the default TIMER_CNT64_CTL_CLR
value.

Removing this extra HW register read can save roughly a hundred of CPU
cycles here and provide a ~10% overall improvement for gettimeofday
(these estimates are based on the earlier benchmarks done with the
Allwinner 3.4 kernel).

Or maybe I'm overlooking something?

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-sunxi] [PATCH 0/8] clocksource: sunxi: Timer fixes and cleanup

2013-06-27 Thread Siarhei Siamashka
On Thu, 27 Jun 2013 18:54:36 +0200
Maxime Ripard maxime.rip...@free-electrons.com wrote:

 On Thu, Jun 27, 2013 at 11:54:11AM +0200, Hans de Goede wrote:
  I notice that unlike the sunxi-3.4 code you don't do any locking,
  so how do you stop 2 clocksource calls from racing (and thus
  getting a possible wrong value because of things not
  being properly latched) ?
 
 Hmm, right. I'll add a spinlock.

I think the best would be to ask the Allwinner people (it's good to
have them in CC, right?) whether anything wrong can happen because of
things not being properly latched.

The A10 manual from http://free-electrons.com/~maxime/pub/datasheet/
does not seem to contain any details about what bad things may happen
if we try to read CNT64_LO_REG while latching is still in progress and
CNT64_RL_EN bit in CNT64_CTRL_REG has not changed to zero yet.
I can imagine the following possible scenarios:
  1. We read either the old stale CNT64_LO_REG value or the new
 correct value.
  2. We read either the old stale CNT64_LO_REG value or the new
 correct value, or some random garbage.
  3. The processor may deadlock, eat your dog, or do some other
 nasty thing.

In the case of 1, we probably can get away without using any spinlocks?

-- 
Best regards,
Siarhei Siamashka
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/