Re: [OpenOCD-devel] STM32F2/4/7 Flash programming

2018-03-12 Thread Tomas Vanek via OpenOCD-devel

On 12.03.2018 21:53, Christopher Head wrote:
>On March 10, 2018 11:25:15 PM PST, Tomas Vanek via OpenOCD-devel wrote:
>>I wouldn't call this case an obscure one. The reason could be an
>>insufficient device clock rate or a not very high adapter_khz. Anyway,
>>all these cases could be solved by configuring the device properly.
>
>I don’t think this is related to device clock speed. You will get a WAIT if
>there is a bus stall, and Flash programming is self timed. So I think it
>depends on the 16 (typical) to 100 (maximum) microsecond word programming time.
>If you can deliver 35 bits (length of a DRSCAN) plus the TMS transitions within
>the word programming time, then you will get a WAIT reply.

You're right that the self-timed flash write takes most of the required time.
The bus transfer itself should take just one or two bus cycles (the second for
clock sync). It is quite different when using an algo.





>>One more concern: If programming by algo is usable on SWD only, JTAG
>>users should set WORKAREASIZE to zero. But algos are used for verify,
>>blank check and external memories as well.
>>This may impose a big penalty...
>
>Yes, this is unfortunate. The verify algorithm works fine for me, but of course
>it is a synchronous, rather than asynchronous, algorithm, so any silicon
>erratum exposed by bus arbitration or other weirdness would not apply there.
>
>In any case, 4463 makes this change. I get one DAP WAIT, but no more, with my
>FTDI at 2M, and programming works fine and verifies properly

Have you noticed the programming speed?

For testing I connected an STM32F722-nucleo and an STM32F413-nucleo to one
JTAG chain from an FT2232.
I configured a 128 MHz clock in the F7 reset-init
(http://openocd.zylin.com/4464) and lowered the max adapter_khz to 3000,
as my old FT2232C does not work well at 6000 kHz.
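
For reference, the 128 MHz figure can be reproduced from the RCC_PLLCFGR value written in change 4464 (0x08002008). This is a minimal sketch, assuming the STM32F4/F7 bit layout (PLLM in bits 5:0, PLLN in bits 14:6, PLLP in bits 17:16) and the 16 MHz HSI as PLL input. The function name is invented for illustration.

```c
#include <stdint.h>

/* Decode an STM32F4/F7-style RCC_PLLCFGR value and return the PLL
 * system clock output in Hz for the given PLL input clock.
 * Illustrative helper, not part of OpenOCD. */
static uint32_t pll_sysclk_hz(uint32_t pllcfgr, uint32_t input_hz)
{
	uint32_t pllm = pllcfgr & 0x3f;                     /* bits 5:0, input divider */
	uint32_t plln = (pllcfgr >> 6) & 0x1ff;             /* bits 14:6, multiplier */
	uint32_t pllp = (((pllcfgr >> 16) & 0x3) + 1) * 2;  /* bits 17:16 -> 2, 4, 6, 8 */

	if (pllm == 0)
		return 0; /* invalid setting, avoid dividing by zero */

	return input_hz / pllm * plln / pllp;
}
```

Calling `pll_sysclk_hz(0x08002008, 16000000)` gives 16 MHz / 8 * 128 / 2 = 128000000.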

With algo:

> reset init
...
> adapter_khz
adapter speed: 3000 kHz
> dap memaccess
memory bus access delay set to 8 tck
> targets f7.cpu
> flash write_image 64kib.bin 0x08020000
wrote 65536 bytes from file 64kib.bin in 0.743568s (86.071 KiB/s)
> targets f4.cpu
> flash write_image 64kib.bin 0x08020000
wrote 65536 bytes from file 64kib.bin in 0.763147s (83.863 KiB/s)

Now with your patch and WORKAREASIZE 0 (both devices perform very
similarly, so I list just the F722):


> reset init
...
> adapter_khz
adapter speed: 3000 kHz
> dap memaccess
memory bus access delay set to 8 tck
> flash write_image 64kib.bin 0x08040000
device id = 0x10006452
flash size = 512kbytes
not enough working area available(requested 76)
no working area available, can't do block memory writes
couldn't use block writes, falling back to single memory accesses
DAP transaction stalled (WAIT) - slowing down
wrote 65536 bytes from file 64kib.bin in 2.583704s (24.771 KiB/s)

However, if you set a longer memory access delay manually:

> dap memaccess 31
memory bus access delay set to 31 tck
> flash write_image 64kib.bin 0x08070000
not enough working area available(requested 76)
no working area available, can't do block memory writes
couldn't use block writes, falling back to single memory accesses
wrote 65536 bytes from file 64kib.bin in 0.962217s (66.513 KiB/s)

I got much worse results with DAP WAIT on a slow, old single-core Intel Atom
industrial PC: the speed dropped as low as 3.618 KiB/s (48.723 KiB/s without
WAIT).

FT2232 SWD transport (F7 only):

> reset init
...
> adapter_khz
adapter speed: 3000 kHz
> flash write_image 64kib.bin 0x08060000
device id = 0x10006452
flash size = 512kbytes
not enough working area available(requested 76)
no working area available, can't do block memory writes
couldn't use block writes, falling back to single memory accesses
SWD DPIDR 0x5ba02477
Failed to write memory at 0x0806006c
error writing to flash at address 0x08000000 at offset 0x00060000

Too fast: a DAP WAIT is treated as an error on FTDI/SWD.

> adapter_khz 1500
adapter speed: 1500 kHz
> flash write_image 64kib.bin 0x08070000
not enough working area available(requested 76)
no working area available, can't do block memory writes
couldn't use block writes, falling back to single memory accesses
wrote 65536 bytes from file 64kib.bin in 0.734222s (87.167 KiB/s)

Works. 46 SWCLK cycles / 1.5 MHz = 30.7 usec > t_flash
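
The same arithmetic can be written down as a small check. This is only a rough model: it assumes that a WAIT occurs whenever the next access arrives before the word programming time has elapsed, it uses the 46-cycle figure from above, and both helper names are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper: time in microseconds needed to clock `cycles`
 * TCK/SWCLK cycles at an adapter speed given in kHz. */
static double transfer_time_us(unsigned cycles, unsigned adapter_khz)
{
	return (double)cycles * 1000.0 / (double)adapter_khz;
}

/* Hypothetical helper: true if the next access would arrive before a
 * flash word write lasting t_prog_us has finished, so that a WAIT is
 * to be expected under this simplified model. */
static bool expect_wait(unsigned cycles, unsigned adapter_khz, double t_prog_us)
{
	return transfer_time_us(cycles, adapter_khz) < t_prog_us;
}
```

Under this model, 46 cycles at 3000 kHz take about 15.3 usec, which is less than the 16 usec typical programming time, so `expect_wait(46, 3000, 16.0)` is true. At 1500 kHz they take about 30.7 usec and `expect_wait(46, 1500, 16.0)` is false. With the 100 usec maximum programming time the model predicts a WAIT at either speed.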

And finally ST-Link with original adapter_khz values:

> reset init
...
adapter speed: 4000 kHz
> flash write_image 64kib.bin 0x08030000
not enough working area available(requested 76)
no working area available, can't do block memory writes
couldn't use block writes, falling back to single memory accesses
wrote 65536 bytes from file 64kib.bin in 9.900800s (6.464 KiB/s)

Really slow in comparison to 110.128 KiB/s with algo.

Your change really speeds up non-algo flashing. Unfortunately, WAIT handling
on dumb adapters is far from effective, and a manual setting of extra
memaccess cycles depends heavily on the flash timing, which may vary with
flash wear, temperature or whatever.



>(at least, it does once I work around the fact that my nasty multi target
>hacks have gone from necessary to counterproductive).

[OpenOCD-devel] [PATCH]: 22723f7 tcl/target/stm32f7x: configure faster system clock in reset-init

2018-03-12 Thread gerrit
This is an automated email from Gerrit.

Tomas Vanek (van...@fbl.cz) just uploaded a new patch set to Gerrit, which you 
can find at http://openocd.zylin.com/4464

-- gerrit

commit 22723f76148c57dc5655b85df829ac6af3fb61f8
Author: Tomas Vanek 
Date:   Mon Mar 12 23:42:23 2018 +0100

tcl/target/stm32f7x: configure faster system clock in reset-init

STM32F7xx devices need a faster clock for flash programming
over JTAG transport. Using the reset default 16 MHz clock
resulted in a lot of DAP WAITs and a substantial decrease
of flashing performance.

Change-Id: Ida6915331dd924c9c0d08822fd94c04ad408cdc5
Signed-off-by: Tomas Vanek 

diff --git a/tcl/target/stm32f7x.cfg b/tcl/target/stm32f7x.cfg
index 4065e2a..1f36bb4 100755
--- a/tcl/target/stm32f7x.cfg
+++ b/tcl/target/stm32f7x.cfg
@@ -81,3 +81,21 @@ $_TARGETNAME configure -event trace-config {
# assignment
mmw 0xE0042004 0x0020 0
 }
+
+$_TARGETNAME configure -event reset-init {
+   # Configure PLL to boost clock to HSI x 8 (128 MHz)
+   mww 0x40023804 0x08002008   ;# RCC_PLLCFGR 16 MHz /8 (M) * 128 (N) /2(P)
+   mww 0x40023C00 0x0106   ;# FLASH_ACR = PRFTBE | 6(Latency)
+   mmw 0x40023800 0x01000000 0 ;# RCC_CR |= PLLON
+   sleep 10;# Wait for PLL to lock
+   mmw 0x40023808 0x1000 0 ;# RCC_CFGR |= RCC_CFGR_PPRE1_DIV2
+   mmw 0x40023808 0x0002 0 ;# RCC_CFGR |= RCC_CFGR_SW_PLL
+
+   # Boost JTAG frequency
+   adapter_khz 8000
+}
+
+$_TARGETNAME configure -event reset-start {
+   # Reduce speed since CPU speed will slow down to 16MHz with the reset
+   adapter_khz 2000
+}

-- 

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
OpenOCD-devel mailing list
OpenOCD-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openocd-devel


Re: [OpenOCD-devel] STM32F2/4/7 Flash programming

2018-03-12 Thread Christopher Head
On March 10, 2018 11:25:15 PM PST, Tomas Vanek via OpenOCD-devel wrote:
>I wouldn't call this case an obscure one. The reason could be an
>insufficient device clock rate or a not very high adapter_khz. Anyway,
>all these cases could be solved by configuring the device properly.

I don’t think this is related to device clock speed. You will get a WAIT if 
there is a bus stall, and Flash programming is self timed. So I think it 
depends on the 16 (typical) to 100 (maximum) microsecond word programming time. 
If you can deliver 35 bits (length of a DRSCAN) plus the TMS transitions within 
the word programming time, then you will get a WAIT reply.

>One more concern: If programming by algo is usable on SWD only, JTAG
>users should set WORKAREASIZE to zero. But algos are used for verify,
>blank check and external memories as well.
>This may impose a big penalty...

Yes, this is unfortunate. The verify algorithm works fine for me, but of course 
it is a synchronous, rather than asynchronous, algorithm, so any silicon 
erratum exposed by bus arbitration or other weirdness would not apply there.

In any case, 4463 makes this change. I get one DAP WAIT, but no more, with my 
FTDI at 2M, and programming works fine and verifies properly (at least, it does 
once I work around the fact that my nasty multi target hacks have gone from 
necessary to counterproductive).

-- 
Christopher Head



[OpenOCD-devel] [PATCH]: 97d09e1 flash/nor/stm32f2x: check CR/SR outside loop

2018-03-12 Thread gerrit
This is an automated email from Gerrit.

Christopher Head (ch...@zaber.com) just uploaded a new patch set to Gerrit, 
which you can find at http://openocd.zylin.com/4463

-- gerrit

commit 97d09e15c355d4cf22f6e22bcb71d798f846c788
Author: Christopher Head 
Date:   Mon Mar 12 11:21:01 2018 -0700

flash/nor/stm32f2x: check CR/SR outside loop

When writing to Flash without using an algorithm, writing the
programming control bits to CR and reading the status from SR before and
after every word takes a lot of time. However, it is not necessary: one
can write to CR once before programming many words. The error bits in SR
are cumulative and can be checked after a large chunk is written. The
busy bit in SR is mostly redundant since subsequent bus accesses stall
(resulting in a DAP WAIT) rather than failing. Therefore, just write CR
once, bulk-write the memory, and then check SR at the end.

Change-Id: I845961eeb6435a1ee9482750411946f3c5dfb60c
Signed-off-by: Christopher Head 

diff --git a/src/flash/nor/stm32f2x.c b/src/flash/nor/stm32f2x.c
index b0992b4..5fdd7fe 100644
--- a/src/flash/nor/stm32f2x.c
+++ b/src/flash/nor/stm32f2x.c
@@ -760,16 +760,13 @@ static int stm32x_write(struct flash_bank *bank, const uint8_t *buffer,
Double word access in case of x64 parallelism
Wait for the BSY bit to be cleared
*/
-   while (words_remaining > 0) {
-   uint16_t value;
-   memcpy(&value, buffer + bytes_written, sizeof(uint16_t));
-
+   if (words_remaining > 0) {
retval = target_write_u32(target, stm32x_get_flash_reg(bank, STM32_FLASH_CR),
FLASH_PG | FLASH_PSIZE_16);
if (retval != ERROR_OK)
return retval;
 
-   retval = target_write_u16(target, address, value);
+   retval = target_write_memory(target, address, 2, words_remaining, buffer);
if (retval != ERROR_OK)
return retval;
 
@@ -777,9 +774,9 @@ static int stm32x_write(struct flash_bank *bank, const uint8_t *buffer,
if (retval != ERROR_OK)
return retval;
 
-   bytes_written += 2;
-   words_remaining--;
-   address += 2;
+   bytes_written += words_remaining * 2;
+   address += words_remaining * 2;
+   words_remaining = 0;
}
 
if (bytes_remaining) {
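
The approach described in the commit message can be sketched in isolation. The following is a mock, not OpenOCD code: `struct mock_flash`, `mock_write_halfwords` and the bit masks are invented for illustration, and a `memcpy` stands in for the bulk memory write to the target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mock target state, for illustration only (not OpenOCD code). */
struct mock_flash {
	uint32_t cr;         /* control register */
	uint32_t sr;         /* status register, error bits accumulate */
	unsigned cr_writes;  /* how many times CR was written */
	unsigned sr_reads;   /* how many times SR was read */
	uint8_t mem[64];     /* stand-in for the flash array */
};

#define MOCK_PG        (1u << 0)  /* hypothetical programming enable bit */
#define MOCK_ERR_MASK  0xf0u      /* hypothetical cumulative error bits */

/* Program *count half-words starting at byte offset *offset.
 * CR is written once, the data goes out in one bulk transfer and
 * SR is checked once at the end. Returns 0 on success, -1 on error. */
static int mock_write_halfwords(struct mock_flash *f, size_t *offset,
		const uint8_t *buf, size_t *count)
{
	size_t bytes = *count * 2;

	if (*offset + bytes > sizeof(f->mem))
		return -1;

	f->cr = MOCK_PG;                       /* once, not per word */
	f->cr_writes++;

	memcpy(f->mem + *offset, buf, bytes);  /* stands in for the bulk write */

	f->sr_reads++;                         /* once, after the whole chunk */
	if (f->sr & MOCK_ERR_MASK)
		return -1;

	*offset += bytes;  /* advance the address before clearing the count */
	*count = 0;
	return 0;
}
```

The address is advanced before the count is cleared. In the opposite order the increment would be computed from a count of zero and the address would not move.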



[OpenOCD-devel] [PATCH]: 8bb3c2a jtag: drivers: buspirate: fix abuse of "char" type

2018-03-12 Thread gerrit
This is an automated email from Gerrit.

Paul Fertser (fercer...@gmail.com) just uploaded a new patch set to Gerrit, 
which you can find at http://openocd.zylin.com/4462

-- gerrit

commit 8bb3c2a11ff1543103cba1be088a731993f94889
Author: Paul Fertser 
Date:   Mon Mar 12 20:09:50 2018 +0300

jtag: drivers: buspirate: fix abuse of "char" type

Change occurrences of char to uint8_t where appropriate as a binary
protocol is used to talk to this adapter.

This fixes a build issue with modern clang.

Change-Id: I21cc82c8cad148bd0977533c12c74a9d6ba2faff
Signed-off-by: Paul Fertser 

diff --git a/src/jtag/drivers/buspirate.c b/src/jtag/drivers/buspirate.c
index 76be103..35649c2 100644
--- a/src/jtag/drivers/buspirate.c
+++ b/src/jtag/drivers/buspirate.c
@@ -141,7 +141,7 @@ static void buspirate_set_speed(int, char);
 /* low level interface */
 static void buspirate_bbio_enable(int);
 static void buspirate_jtag_reset(int);
-static unsigned char buspirate_jtag_command(int, char *, int);
+static unsigned char buspirate_jtag_command(int, uint8_t *, int);
 static void buspirate_jtag_set_speed(int, char);
 static void buspirate_jtag_set_mode(int, char);
 static void buspirate_jtag_set_feature(int, char, char);
@@ -155,10 +155,10 @@ static void buspirate_swd_set_mode(int, char);
 /* low level HW communication interface */
 static int buspirate_serial_open(char *port);
 static int buspirate_serial_setspeed(int fd, char speed, cc_t timeout);
-static int buspirate_serial_write(int fd, char *buf, int size);
-static int buspirate_serial_read(int fd, char *buf, int size);
+static int buspirate_serial_write(int fd, uint8_t *buf, int size);
+static int buspirate_serial_read(int fd, uint8_t *buf, int size);
 static void buspirate_serial_close(int fd);
-static void buspirate_print_buffer(char *buf, int size);
+static void buspirate_print_buffer(uint8_t *buf, int size);
 
 static int buspirate_execute_queue(void)
 {
@@ -255,7 +255,7 @@ static bool read_and_discard_all_data(const int fd)
bool was_msg_already_printed = false;
 
for ( ; ; ) {
-   char buffer[1024];  /* Any size will do, it's a trade-off between stack size and performance. */
+   uint8_t buffer[1024];  /* Any size will do, it's a trade-off between stack size and performance. */
 
const ssize_t read_count = read(fd, buffer, sizeof(buffer));
 
@@ -690,8 +690,8 @@ static void buspirate_stableclocks(int num_cycles)
make it incompatible with the Bus Pirate firmware. */
 #define BUSPIRATE_MAX_PENDING_SCANS 128
 
-static char tms_chain[BUSPIRATE_BUFFER_SIZE]; /* send */
-static char tdi_chain[BUSPIRATE_BUFFER_SIZE]; /* send */
+static uint8_t tms_chain[BUSPIRATE_BUFFER_SIZE]; /* send */
+static uint8_t tdi_chain[BUSPIRATE_BUFFER_SIZE]; /* send */
 static int tap_chain_index;
 
 struct pending_scan_result /* this was stolen from arm-jtag-ew */
@@ -716,7 +716,7 @@ static int buspirate_tap_execute(void)
 {
static const int CMD_TAP_SHIFT_HEADER_LEN = 3;
 
-   char tmp[4096];
+   uint8_t tmp[4096];
uint8_t *in_buf;
int i;
int fill_index = 0;
@@ -732,8 +732,8 @@ static int buspirate_tap_execute(void)
bytes_to_send = DIV_ROUND_UP(tap_chain_index, 8);
 
tmp[0] = CMD_TAP_SHIFT; /* this command expects number of bits */
-   tmp[1] = (char)(tap_chain_index >> 8);  /* high */
-   tmp[2] = (char)(tap_chain_index);  /* low */
+   tmp[1] = tap_chain_index >> 8;  /* high */
+   tmp[2] = tap_chain_index;  /* low */
 
fill_index = CMD_TAP_SHIFT_HEADER_LEN;
for (i = 0; i < bytes_to_send; i++) {
@@ -904,7 +904,7 @@ static void buspirate_set_speed(int fd, char speed)
 static void buspirate_swd_set_speed(int fd, char speed)
 {
int  ret;
-   char tmp[1];
+   uint8_t tmp[1];
 
LOG_DEBUG("Buspirate speed setting in SWD mode defaults to 400 kHz");
 
@@ -925,7 +925,7 @@ static void buspirate_swd_set_speed(int fd, char speed)
 static void buspirate_swd_set_mode(int fd, char mode)
 {
int ret;
-   char tmp[1];
+   uint8_t tmp[1];
 
/* raw-wire mode configuration */
if (mode == MODE_HIZ)
@@ -948,7 +948,7 @@ static void buspirate_swd_set_mode(int fd, char mode)
 static void buspirate_swd_set_feature(int fd, char feat, char action)
 {
int  ret;
-   char tmp[1];
+   uint8_t tmp[1];
 
switch (feat) {
case FEATURE_TRST:
@@ -989,7 +989,7 @@ static void buspirate_bbio_enable(int fd)
char command;
const char *mode_answers[2] = { "OCD1", "RAW1" };
const char *correct_ans = NULL;
-   char tmp[21] = { [0 ... 20] = 0x00 };
+   uint8_t tmp[21] = { [0 ... 20] = 0x00 };
int done = 0;
int cmd_sent = 0;
 
@@ -1013,7 +1013,7 @@ static void buspirate_bbio_enable(int fd)
"/OpenOCD support enabled?");
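
The commit message does not say which diagnostic clang reported. As general background on why buffers for a binary protocol are usually declared `uint8_t`: plain `char` may be signed, so a byte with the top bit set can turn negative and stop comparing equal to the constant it was sent as. The sketch below illustrates this with helpers invented for the example; it uses `signed char` explicitly so that the behaviour does not depend on the platform's choice for plain `char`.

```c
#include <stdint.h>

/* Hypothetical helper: store a 16-bit value big-endian in a command
 * header, as unsigned bytes. */
static void put_be16(uint8_t *dst, unsigned value)
{
	dst[0] = (uint8_t)(value >> 8);  /* high byte */
	dst[1] = (uint8_t)value;         /* low byte */
}

/* Returns 1 if a byte held in a signed char still compares equal to
 * the protocol constant. On the usual two's complement platforms a
 * wire byte of 0x80 becomes -128 here and the comparison fails. */
static int signed_byte_matches(unsigned char wire_byte, int expected)
{
	signed char c = (signed char)wire_byte;
	return c == expected;
}

/* Same comparison with the byte kept in a uint8_t. */
static int unsigned_byte_matches(unsigned char wire_byte, int expected)
{
	uint8_t c = wire_byte;
	return c == expected;
}
```

For example, `unsigned_byte_matches(0x80, 0x80)` returns 1, while `signed_byte_matches(0x80, 0x80)` returns 0 on common platforms.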

[OpenOCD-devel] GDB connection via pipe problems

2018-03-12 Thread Augusto Fraga Giachero

Hello,

I've been experiencing problems with recent versions of arm-none-eabi-gdb
(8.0 and above) and openocd when connecting via a pipe, as suggested by the
OpenOCD manual:

(gdb) target remote | openocd -c "gdb_port pipe; log_output openocd.log" -f openocd.cfg

And I get:

Remote replied unexpectedly to 'vMustReplyEmpty':
PacketSize=3fff;qXfer:memory-map:read+;qXfer:features:read+;qXfer:threads:read+;QStartNoAckMode+

But if I try to connect through TCP everything works fine.

I'm debugging STM32Fx microcontrollers, and connecting over a pipe worked
great until 3 weeks ago. I've been searching for similar problems on
forums and mailing lists, but couldn't find any relevant info.

Is anyone else experiencing this, or does anyone have an idea what this gdb
error means?

Thanks,
Augusto Fraga Giachero.

OpenOCD: 0.10.0
arm-none-eabi-gdb: 8.1
Platform: Linux 4.15.7 x86-64 (Arch Linux)




[OpenOCD-devel] [PATCH]: 69e3562 aarch64: fix debug entry from EL0

2018-03-12 Thread gerrit
This is an automated email from Gerrit.

Matthias Welwarsky (matth...@welwarsky.de) just uploaded a new patch set to 
Gerrit, which you can find at http://openocd.zylin.com/4461

-- gerrit

commit 69e35620b2a5bdfca60ef55286d98aa5a238019c
Author: Matthias Welwarsky 
Date:   Mon Mar 12 16:56:05 2018 +0100

aarch64: fix debug entry from EL0

If we enter debug state from EL0, some registers are not accessible.
Temporarily move to EL1H and back to gain access. Also, fix
armv8_dpm_modeswitch() to not immediately restore the previous state
on elevating the privilege level.

Change-Id: Ic2a92109230ff4eb6834c00ef544397a5b7ad56a
Signed-off-by: Matthias Welwarsky 

diff --git a/src/target/aarch64.c b/src/target/aarch64.c
index 0630ffb..b586e24 100644
--- a/src/target/aarch64.c
+++ b/src/target/aarch64.c
@@ -1861,7 +1861,7 @@ static int aarch64_write_cpu_memory(struct target *target,
if (dscr & (DSCR_ERR | DSCR_SYS_ERROR_PEND)) {
/* Abort occurred - clear it and exit */
LOG_ERROR("abort occurred - dscr = 0x%08" PRIx32, dscr);
-   armv8_dpm_handle_exception(dpm);
+   armv8_dpm_handle_exception(dpm, true);
return ERROR_FAIL;
}
 
@@ -2080,7 +2080,7 @@ static int aarch64_read_cpu_memory(struct target *target,
if (dscr & (DSCR_ERR | DSCR_SYS_ERROR_PEND)) {
/* Abort occurred - clear it and exit */
LOG_ERROR("abort occurred - dscr = 0x%08" PRIx32, dscr);
-   armv8_dpm_handle_exception(dpm);
+   armv8_dpm_handle_exception(dpm, true);
return ERROR_FAIL;
}
 
diff --git a/src/target/armv8.c b/src/target/armv8.c
index a85fae2..b88f37d 100644
--- a/src/target/armv8.c
+++ b/src/target/armv8.c
@@ -620,6 +620,7 @@ void armv8_select_reg_access(struct armv8_common *armv8, bool is_aarch64)
 int armv8_read_mpidr(struct armv8_common *armv8)
 {
int retval = ERROR_FAIL;
+   struct arm *arm = &armv8->arm;
struct arm_dpm *dpm = armv8->arm.dpm;
uint32_t mpidr;
 
@@ -627,6 +628,13 @@ int armv8_read_mpidr(struct armv8_common *armv8)
if (retval != ERROR_OK)
goto done;
 
+   /* check if we're in an unprivileged mode */
+   if (armv8_curel_from_core_mode(arm->core_mode) < SYSTEM_CUREL_EL1) {
+   retval = armv8_dpm_modeswitch(dpm, ARMV8_64_EL1H);
+   if (retval != ERROR_OK)
+   return retval;
+   }
+
retval = dpm->instr_read_data_r0(dpm, armv8_opcode(armv8, READ_REG_MPIDR), &mpidr);
if (retval != ERROR_OK)
goto done;
@@ -642,6 +650,7 @@ int armv8_read_mpidr(struct armv8_common *armv8)
LOG_ERROR("mpidr not in multiprocessor format");
 
 done:
+   armv8_dpm_modeswitch(dpm, ARM_MODE_ANY);
dpm->finish(dpm);
return retval;
 }
diff --git a/src/target/armv8_cache.c b/src/target/armv8_cache.c
index 7f610c9..40965eb 100644
--- a/src/target/armv8_cache.c
+++ b/src/target/armv8_cache.c
@@ -310,6 +310,7 @@ int armv8_identify_cache(struct armv8_common *armv8)
 {
/*  read cache descriptor */
int retval = ERROR_FAIL;
+   struct arm *arm = &armv8->arm;
struct arm_dpm *dpm = armv8->arm.dpm;
uint32_t csselr, clidr, ctr;
uint32_t cache_reg;
@@ -320,6 +321,13 @@ int armv8_identify_cache(struct armv8_common *armv8)
if (retval != ERROR_OK)
goto done;
 
+   /* check if we're in an unprivileged mode */
+   if (armv8_curel_from_core_mode(arm->core_mode) < SYSTEM_CUREL_EL1) {
+   retval = armv8_dpm_modeswitch(dpm, ARMV8_64_EL1H);
+   if (retval != ERROR_OK)
+   return retval;
+   }
+
/* retrieve CTR */
retval = dpm->instr_read_data_r0(dpm,
armv8_opcode(armv8, READ_REG_CTR), &ctr);
@@ -417,6 +425,7 @@ int armv8_identify_cache(struct armv8_common *armv8)
}
 
 done:
+   armv8_dpm_modeswitch(dpm, ARM_MODE_ANY);
dpm->finish(dpm);
return retval;
 
diff --git a/src/target/armv8_dpm.c b/src/target/armv8_dpm.c
index 91b2f51..3c941fa 100644
--- a/src/target/armv8_dpm.c
+++ b/src/target/armv8_dpm.c
@@ -258,7 +258,7 @@ static int dpmv8_exec_opcode(struct arm_dpm *dpm,
 
if (dscr & DSCR_ERR) {
LOG_ERROR("Opcode 0x%08"PRIx32", DSCR.ERR=1, DSCR.EL=%i", opcode, dpm->last_el);
-   armv8_dpm_handle_exception(dpm);
+   armv8_dpm_handle_exception(dpm, true);
retval = ERROR_FAIL;
}
 
@@ -600,7 +600,7 @@ int armv8_dpm_modeswitch(struct arm_dpm *dpm, enum arm_mode mode)
armv8_opcode(armv8, ARMV8_OPC_DCPS) | target_el);
 
/* DCPS clobbers registers just like an exception taken */
-   armv8_dpm_handle_exception(dpm);
+
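
The elevate, access, restore pattern described in the commit message can be sketched with a mock core. Nothing below is OpenOCD code: the types, the exception-level constants and both functions are invented for illustration, and the "privileged register" is just a struct field.

```c
/* Mock debug state, for illustration only (not OpenOCD code). */
enum mock_el { MOCK_EL0 = 0, MOCK_EL1 = 1 };

struct mock_core {
	enum mock_el saved_el;    /* level the core was halted in */
	enum mock_el current_el;  /* level currently used for debug accesses */
	unsigned sysreg;          /* a register readable only at EL1 or above */
};

/* Read the privileged register; fails with -1 below EL1. */
static int mock_read_sysreg(const struct mock_core *c, unsigned *out)
{
	if (c->current_el < MOCK_EL1)
		return -1;
	*out = c->sysreg;
	return 0;
}

/* Elevate if needed, read, then always return to the saved level. */
static int mock_read_with_elevation(struct mock_core *c, unsigned *out)
{
	int retval;

	if (c->current_el < MOCK_EL1)
		c->current_el = MOCK_EL1;  /* temporary switch to gain access */

	retval = mock_read_sysreg(c, out);

	c->current_el = c->saved_el;       /* restore on every exit path */
	return retval;
}
```

A direct `mock_read_sysreg()` call on a core halted in EL0 fails, while `mock_read_with_elevation()` succeeds and leaves the core back at EL0.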