Re: [Qemu-devel] [PATCH V3 2/7] nios2: Add architecture emulation support

2016-10-20 Thread Marek Vasut
On 10/20/2016 07:05 AM, Richard Henderson wrote:
> On 10/19/2016 08:01 PM, Marek Vasut wrote:
>>> > You might like 0xfffc better, but that does require that you count
>>> > f's appropriately for the type.  That's why I like -4: it's obvious
>>> (or
>>> > should be) that it masks to a multiple of 4, and type promotion
>>> extends
>>> > bits to the left as needed for the type of the other AND argument.
>> But then you can also use ~3 , right ?
> 
> Yes.
> 
>> In which case, you'd avoid doing
>> bitwise operations with negative numbers, which is nice.
> 
> Eh.  Since ~3 == -4, ~3 is a negative number.  So it's a false argument.
> 
> (C triviallities aside, such as the fact that there are no negative
> constants, only positive constants to which unary negation may be applied.)
> 
> The reason I like -4 is that it aligns to 4, spelled the same way.

OK

-- 
Best regards,
Marek Vasut



Re: [Qemu-devel] [PATCH V3 2/7] nios2: Add architecture emulation support

2016-10-19 Thread Richard Henderson

On 10/19/2016 08:01 PM, Marek Vasut wrote:

> You might like 0xfffc better, but that does require that you count
> f's appropriately for the type.  That's why I like -4: it's obvious (or
> should be) that it masks to a multiple of 4, and type promotion extends
> bits to the left as needed for the type of the other AND argument.

But then you can also use ~3 , right ?


Yes.


In which case, you'd avoid doing
bitwise operations with negative numbers, which is nice.


Eh.  Since ~3 == -4, ~3 is a negative number.  So it's a false argument.

(C triviallities aside, such as the fact that there are no negative constants, 
only positive constants to which unary negation may be applied.)


The reason I like -4 is that it aligns to 4, spelled the same way.



r~



Re: [Qemu-devel] [PATCH V3 2/7] nios2: Add architecture emulation support

2016-10-19 Thread Marek Vasut
On 10/19/2016 05:50 PM, Richard Henderson wrote:
> On 10/18/2016 08:23 PM, Marek Vasut wrote:
>>> The documentation appears less than clear about whether or not loads
>>> into r0 recognize exceptions from the load, as opposed to simply not
>>> modifying r0.
>>
>> What about [1] page 10, quote:
>>
>> The Nios II architecture provides thirty-two 32-bit general-purpose
>> registers, r0 through r31. Some registers have names recognized by the
>> assembler. For example, the zero register (r0) always returns the
>> value zero, and writing to zero has no effect.
> 
> I read that, but is that "... writing to zero has no effect [on the zero
> register]" but loads still take mmu faults, or, as other architecture
> manuals usually call out explicitly, "loads to the zero register are
> prefetch instructions and do not fault".
> 
> Other architectures are split on this issue.  For instance HPPA, Sparc,
> MIPS all fault when loading to the zero register from a protected page. 
> Alpha *initially* reported such faults, but later changed the spec to be
> a prefetch (and therefore kernels supporting ev4 are supposed to
> recognize such loads and return without reporting the SIGSEGV).
> 
> Do you have hardware that you can test this against?  If a load to zero
> is a prefetch, the nios2 gcc port is missing the prefetch pattern.  ;-)

Ha, I see, thanks for explaining :) I have a real system, so I tried
doing the following:

int main()
{
 asm volatile("ldb r0, 0x10(r0)");
}

This will trigger a segfault:

test: unhandled page fault (11) at 0x, cause 14

CPU: 0 PID: 1133 Comm: test Not tainted
4.8.0-rc2-next-20160816-00017-g40615ac #2
task: cfcef160 task.stack: cfcbe000
r1:  r2:  r3: 23cc r4: 0001
r5: 7f8d7e64 r6: 7f8d7e6c r7: 991f2697 r8: 001263cc
r9: e6925b67 r10: 2aac8ca0 r11: 0003 r12: 
r13: 2aabcc9c r14: 2aac8428 r15: 
ra: 2aaeee94 fp:   sp: 7f8d7df0 gp: c000
ea: 23cc estatus: 0003
Segmentation fault

> Either way, a comment would clear this up for future maintainers.

Will add one.

 +/* Branch instructions */
 +static void br(DisasContext *dc, uint32_t code, uint32_t flags)
 +{
 +I_TYPE(instr, code);
 +
 +gen_goto_tb(dc, 0, dc->pc + 4 + (int16_t)(instr.imm16 & 0xFFFC));
>>>
>>> Probably clearer as pc + 4 + (instr.imm16s & -4)
>>
>> I disagree, the 0xfffc in my opinion clearly states it removes bottom
>> two bits. Doing bitops with signed number looks real iffy.
> 
> It also removes the top 16 bits, requiring you to sign-extend again.
> 
> You might like 0xfffc better, but that does require that you count
> f's appropriately for the type.  That's why I like -4: it's obvious (or
> should be) that it masks to a multiple of 4, and type promotion extends
> bits to the left as needed for the type of the other AND argument.

But then you can also use ~3 , right ? In which case, you'd avoid doing
bitwise operations with negative numbers, which is nice.

-- 
Best regards,
Marek Vasut



Re: [Qemu-devel] [PATCH V3 2/7] nios2: Add architecture emulation support

2016-10-19 Thread Richard Henderson

On 10/18/2016 08:23 PM, Marek Vasut wrote:

The documentation appears less than clear about whether or not loads
into r0 recognize exceptions from the load, as opposed to simply not
modifying r0.


What about [1] page 10, quote:

The Nios II architecture provides thirty-two 32-bit general-purpose
registers, r0 through r31. Some registers have names recognized by the
assembler. For example, the zero register (r0) always returns the
value zero, and writing to zero has no effect.


I read that, but is that "... writing to zero has no effect [on the zero 
register]" but loads still take mmu faults, or, as other architecture manuals 
usually call out explicitly, "loads to the zero register are prefetch 
instructions and do not fault".


Other architectures are split on this issue.  For instance HPPA, Sparc, MIPS 
all fault when loading to the zero register from a protected page.  Alpha 
*initially* reported such faults, but later changed the spec to be a prefetch 
(and therefore kernels supporting ev4 are supposed to recognize such loads and 
return without reporting the SIGSEGV).


Do you have hardware that you can test this against?  If a load to zero is a 
prefetch, the nios2 gcc port is missing the prefetch pattern.  ;-)


Either way, a comment would clear this up for future maintainers.


+/* Branch instructions */
+static void br(DisasContext *dc, uint32_t code, uint32_t flags)
+{
+I_TYPE(instr, code);
+
+gen_goto_tb(dc, 0, dc->pc + 4 + (int16_t)(instr.imm16 & 0xFFFC));


Probably clearer as pc + 4 + (instr.imm16s & -4)


I disagree, the 0xfffc in my opinion clearly states it removes bottom
two bits. Doing bitops with signed number looks real iffy.


It also removes the top 16 bits, requiring you to sign-extend again.

You might like 0xfffc better, but that does require that you count f's 
appropriately for the type.  That's why I like -4: it's obvious (or should be) 
that it masks to a multiple of 4, and type promotion extends bits to the left 
as needed for the type of the other AND argument.



r~



Re: [Qemu-devel] [PATCH V3 2/7] nios2: Add architecture emulation support

2016-10-18 Thread Marek Vasut
On 10/19/2016 01:04 AM, Richard Henderson wrote:
> On 10/18/2016 02:50 PM, Marek Vasut wrote:
>> +/* Special R-Type instruction opcode */
>> +#define INSN_R_TYPE 0x3A
>> +
>> +/* I-Type instruction parsing */
>> +#define I_TYPE(instr, code)  \
>> +struct { \
>> +uint8_t op;  \
>> +union {  \
>> +uint16_t imm16;  \
>> +int16_t imm16s;  \
>> +};   \
>> +uint8_t b;   \
>> +uint8_t a;   \
>> +} (instr) = {\
>> +.op = ((code) >> 0) & 0x3f,  \
>> +.imm16 = ((code) >> 6) & 0x, \
>> +.b = ((code) >> 22) & 0x1f,  \
>> +.a = ((code) >> 27) & 0x1f,  \
> 
> It would be preferable to use extract32.

Fixed

>> +}
>> +
>> +/* I-Type instruction parsing */
>> +#define R_TYPE(instr, code)  \
> 
> Typo in the comment.

Fixed, thanks

>> +static void gen_goto_tb(DisasContext *dc, int n, uint32_t dest)
>> +{
>> +TranslationBlock *tb = dc->tb;
>> +
>> +if ((tb->pc & TARGET_PAGE_MASK) == (dest & TARGET_PAGE_MASK)) {
> 
> Normally we also suppress goto_tb when single-stepping.

Fixed

>> +static void gen_check_supervisor(DisasContext *dc, TCGLabel *label)
>> +{
>> +if (dc->tb->flags & CR_STATUS_U) {/* CPU in user mode */
>> +t_gen_helper_raise_exception(dc, EXCP_SUPERI);
>> +tcg_gen_br(label);
>> +}
> 
> There's no point in the label or branch, because raise_exception does
> not return.

Fixed

>> +
>> +/* Update the PC to the next instruction */
>> +tcg_gen_movi_tl(dc->cpu_R[R_PC], dc->pc + 4);
>> +}
> 
> Why?

Hrm, I cannot find good explanation. Must be a remnant from the old
code, so removed.

>> +/*
>> + * I-Type instructions
>> + */
>> +/* Load instructions */
>> +static void gen_ldx(DisasContext *dc, uint32_t code, uint32_t flags)
>> +{
>> +I_TYPE(instr, code);
>> +
>> +/* Loads into R_ZERO are ignored */
>> +if (unlikely(instr.b == R_ZERO)) {
>> +return;
>> +} else if (instr.a == R_ZERO) {
>> +/* Loads from R_ZERO + offset are loaded directly from offset */
>> +tcg_gen_qemu_ld_tl(dc->cpu_R[instr.b],
>> tcg_const_i32(instr.imm16s),
>> +   dc->mem_idx, flags);
>> +} else { /* Regular loads */
>> +TCGv addr = tcg_temp_new();
>> +tcg_gen_addi_tl(addr, dc->cpu_R[instr.a], instr.imm16s);
> 
> This is where it becomes cleaner to just use load_gpr, and let the TCG
> optimizer fold 0 + C -> C.
> 
> The documentation appears less than clear about whether or not loads
> into r0 recognize exceptions from the load, as opposed to simply not
> modifying r0.

What about [1] page 10, quote:

The Nios II architecture provides thirty-two 32-bit general-purpose
registers, r0 through r31. Some registers have names recognized by the
assembler. For example, the zero register (r0) always returns the
value zero, and writing to zero has no effect.
[...]

[1]
https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/hb/nios2/n2cpu_nii5v1.pdf

> You should use the TCGMemOp type for flags.

Fixed

>> +/* Branch instructions */
>> +static void br(DisasContext *dc, uint32_t code, uint32_t flags)
>> +{
>> +I_TYPE(instr, code);
>> +
>> +gen_goto_tb(dc, 0, dc->pc + 4 + (int16_t)(instr.imm16 & 0xFFFC));
> 
> Probably clearer as pc + 4 + (instr.imm16s & -4)

I disagree, the 0xfffc in my opinion clearly states it removes bottom
two bits. Doing bitops with signed number looks real iffy.

>> +/* ctlN <- rA */
>> +static void wrctl(DisasContext *dc, uint32_t code, uint32_t flags)
>> +{
>> +R_TYPE(instr, code);
>> +
>> +TCGLabel *l1 = gen_new_label();

I can also drop this label.

>> +gen_check_supervisor(dc, l1);
>> +
>> +switch (instr.imm5 + 32) {
>> +case CR_PTEADDR:
>> +case CR_TLBACC:
>> +case CR_TLBMISC:
>> +{
>> +#if !defined(CONFIG_USER_ONLY)
>> +TCGv_i32 tmp = tcg_const_i32(instr.imm5 + 32);
>> +gen_helper_mmu_write(dc->cpu_env, tmp, dc->cpu_R[instr.a]);
> 
> load_gpr.

Fixed

>> +/* Comparison instructions */
>> +static void gen_cmpxx(DisasContext *dc, uint32_t code, uint32_t flags)
>> +{
>> +R_TYPE(instr, code);
>> +if (likely(instr.c != R_ZERO)) {
>> +tcg_gen_setcond_tl(flags, dc->cpu_R[instr.c],
>> dc->cpu_R[instr.a],
>> +   dc->cpu_R[instr.b]);
>> +} else {
>> +TCGv_i32 tmp = tcg_const_i32(0);
>> +tcg_gen_setcond_tl(flags, tmp, dc->cpu_R[instr.a],
>> +   dc->cpu_R[instr.b]);
>> +tcg_temp_free_i32(tmp);
> 
> Why compute this into a tmp?

H, seems meaningless. I don't need the whole branch, so I can remove it.

>> +#define gen_r_math_logic_zeropt(fname, insn,
>> op3)\
>> +static void 

Re: [Qemu-devel] [PATCH V3 2/7] nios2: Add architecture emulation support

2016-10-18 Thread Richard Henderson

On 10/18/2016 02:50 PM, Marek Vasut wrote:

+/* Special R-Type instruction opcode */
+#define INSN_R_TYPE 0x3A
+
+/* I-Type instruction parsing */
+#define I_TYPE(instr, code)  \
+struct { \
+uint8_t op;  \
+union {  \
+uint16_t imm16;  \
+int16_t imm16s;  \
+};   \
+uint8_t b;   \
+uint8_t a;   \
+} (instr) = {\
+.op = ((code) >> 0) & 0x3f,  \
+.imm16 = ((code) >> 6) & 0x, \
+.b = ((code) >> 22) & 0x1f,  \
+.a = ((code) >> 27) & 0x1f,  \


It would be preferable to use extract32.


+}
+
+/* I-Type instruction parsing */
+#define R_TYPE(instr, code)  \


Typo in the comment.


+static void gen_goto_tb(DisasContext *dc, int n, uint32_t dest)
+{
+TranslationBlock *tb = dc->tb;
+
+if ((tb->pc & TARGET_PAGE_MASK) == (dest & TARGET_PAGE_MASK)) {


Normally we also suppress goto_tb when single-stepping.


+static void gen_check_supervisor(DisasContext *dc, TCGLabel *label)
+{
+if (dc->tb->flags & CR_STATUS_U) {/* CPU in user mode */
+t_gen_helper_raise_exception(dc, EXCP_SUPERI);
+tcg_gen_br(label);
+}


There's no point in the label or branch, because raise_exception does not 
return.


+
+/* Update the PC to the next instruction */
+tcg_gen_movi_tl(dc->cpu_R[R_PC], dc->pc + 4);
+}


Why?


+/*
+ * I-Type instructions
+ */
+/* Load instructions */
+static void gen_ldx(DisasContext *dc, uint32_t code, uint32_t flags)
+{
+I_TYPE(instr, code);
+
+/* Loads into R_ZERO are ignored */
+if (unlikely(instr.b == R_ZERO)) {
+return;
+} else if (instr.a == R_ZERO) {
+/* Loads from R_ZERO + offset are loaded directly from offset */
+tcg_gen_qemu_ld_tl(dc->cpu_R[instr.b], tcg_const_i32(instr.imm16s),
+   dc->mem_idx, flags);
+} else { /* Regular loads */
+TCGv addr = tcg_temp_new();
+tcg_gen_addi_tl(addr, dc->cpu_R[instr.a], instr.imm16s);


This is where it becomes cleaner to just use load_gpr, and let the TCG 
optimizer fold 0 + C -> C.


The documentation appears less than clear about whether or not loads into r0 
recognize exceptions from the load, as opposed to simply not modifying r0.


You should use the TCGMemOp type for flags.


+/* Branch instructions */
+static void br(DisasContext *dc, uint32_t code, uint32_t flags)
+{
+I_TYPE(instr, code);
+
+gen_goto_tb(dc, 0, dc->pc + 4 + (int16_t)(instr.imm16 & 0xFFFC));


Probably clearer as pc + 4 + (instr.imm16s & -4)


+/* ctlN <- rA */
+static void wrctl(DisasContext *dc, uint32_t code, uint32_t flags)
+{
+R_TYPE(instr, code);
+
+TCGLabel *l1 = gen_new_label();
+gen_check_supervisor(dc, l1);
+
+switch (instr.imm5 + 32) {
+case CR_PTEADDR:
+case CR_TLBACC:
+case CR_TLBMISC:
+{
+#if !defined(CONFIG_USER_ONLY)
+TCGv_i32 tmp = tcg_const_i32(instr.imm5 + 32);
+gen_helper_mmu_write(dc->cpu_env, tmp, dc->cpu_R[instr.a]);


load_gpr.


+/* Comparison instructions */
+static void gen_cmpxx(DisasContext *dc, uint32_t code, uint32_t flags)
+{
+R_TYPE(instr, code);
+if (likely(instr.c != R_ZERO)) {
+tcg_gen_setcond_tl(flags, dc->cpu_R[instr.c], dc->cpu_R[instr.a],
+   dc->cpu_R[instr.b]);
+} else {
+TCGv_i32 tmp = tcg_const_i32(0);
+tcg_gen_setcond_tl(flags, tmp, dc->cpu_R[instr.a],
+   dc->cpu_R[instr.b]);
+tcg_temp_free_i32(tmp);


Why compute this into a tmp?


+#define gen_r_math_logic_zeropt(fname, insn, op3)\
+static void (fname)(DisasContext *dc, uint32_t code, uint32_t flags)   \
+{  \
+R_TYPE(instr, (code)); \
+/* NOP and store to R_ZERO optimization */ \
+if (instr.c == R_ZERO) {   \
+return;\
+} else if ((instr.a == R_ZERO) && (instr.b == R_ZERO)) {   \
+tcg_gen_movi_tl(dc->cpu_R[instr.c], 0);\
+} else if (instr.a == R_ZERO) {\
+tcg_gen_mov_tl(dc->cpu_R[instr.c], load_gpr(dc, instr.b)); \
+} else if (instr.b == R_ZERO) {\
+tcg_gen_mov_tl(dc->cpu_R[instr.c], load_gpr(dc, instr.a)); \


These two are in general wrong.  Use load_gpr.


+#define gen_r_div(fname, insn, op3)\
+static void (fname)(DisasContext *dc, uint32_t code, uint32_t flags)   \
+{   

[Qemu-devel] [PATCH V3 2/7] nios2: Add architecture emulation support

2016-10-18 Thread Marek Vasut
From: Chris Wulff 

Add support for emulating Altera NiosII R1 architecture into qemu.
This patch is based on previous work by Chris Wulff from 2012 and
updated to latest mainline QEMU.

Signed-off-by: Marek Vasut 
Cc: Chris Wulff 
Cc: Jeff Da Silva 
Cc: Ley Foon Tan 
Cc: Sandra Loosemore 
Cc: Yves Vandervennet 
---
V3: Thorough cleanup, deal with the review comments all over the place
---
 target-nios2/Makefile.objs |   4 +
 target-nios2/cpu.c | 233 
 target-nios2/cpu.h | 270 +
 target-nios2/helper.c  | 313 +++
 target-nios2/helper.h  |  27 ++
 target-nios2/mmu.c | 292 ++
 target-nios2/mmu.h |  54 +++
 target-nios2/monitor.c |  35 ++
 target-nios2/op_helper.c   |  47 +++
 target-nios2/translate.c   | 929 +
 10 files changed, 2204 insertions(+)
 create mode 100644 target-nios2/Makefile.objs
 create mode 100644 target-nios2/cpu.c
 create mode 100644 target-nios2/cpu.h
 create mode 100644 target-nios2/helper.c
 create mode 100644 target-nios2/helper.h
 create mode 100644 target-nios2/mmu.c
 create mode 100644 target-nios2/mmu.h
 create mode 100644 target-nios2/monitor.c
 create mode 100644 target-nios2/op_helper.c
 create mode 100644 target-nios2/translate.c

diff --git a/target-nios2/Makefile.objs b/target-nios2/Makefile.objs
new file mode 100644
index 000..2a11c5c
--- /dev/null
+++ b/target-nios2/Makefile.objs
@@ -0,0 +1,4 @@
+obj-y += translate.o op_helper.o helper.o cpu.o mmu.o
+obj-$(CONFIG_SOFTMMU) += monitor.o
+
+$(obj)/op_helper.o: QEMU_CFLAGS += $(HELPER_CFLAGS)
diff --git a/target-nios2/cpu.c b/target-nios2/cpu.c
new file mode 100644
index 000..076e4e2
--- /dev/null
+++ b/target-nios2/cpu.c
@@ -0,0 +1,233 @@
+/*
+ * QEMU Nios II CPU
+ *
+ * Copyright (c) 2012 Chris Wulff 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see
+ * 
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "qapi/error.h"
+#include "cpu.h"
+#include "exec/log.h"
+#include "exec/gdbstub.h"
+#include "hw/qdev-properties.h"
+
+static void nios2_cpu_set_pc(CPUState *cs, vaddr value)
+{
+Nios2CPU *cpu = NIOS2_CPU(cs);
+CPUNios2State *env = >env;
+
+env->regs[R_PC] = value;
+}
+
+static bool nios2_cpu_has_work(CPUState *cs)
+{
+return cs->interrupt_request & (CPU_INTERRUPT_HARD | CPU_INTERRUPT_NMI);
+}
+
+/* CPUClass::reset() */
+static void nios2_cpu_reset(CPUState *cs)
+{
+Nios2CPU *cpu = NIOS2_CPU(cs);
+Nios2CPUClass *ncc = NIOS2_CPU_GET_CLASS(cpu);
+CPUNios2State *env = >env;
+
+if (qemu_loglevel_mask(CPU_LOG_RESET)) {
+qemu_log("CPU Reset (CPU %d)\n", cs->cpu_index);
+log_cpu_state(cs, 0);
+}
+
+ncc->parent_reset(cs);
+
+tlb_flush(cs, 1);
+
+memset(env->regs, 0, sizeof(uint32_t) * NUM_CORE_REGS);
+env->regs[R_PC] = cpu->reset_addr;
+
+#if defined(CONFIG_USER_ONLY)
+/* Start in user mode with interrupts enabled. */
+env->regs[CR_STATUS] = CR_STATUS_U | CR_STATUS_PIE;
+#endif
+}
+
+static void nios2_cpu_initfn(Object *obj)
+{
+CPUState *cs = CPU(obj);
+Nios2CPU *cpu = NIOS2_CPU(obj);
+CPUNios2State *env = >env;
+static bool tcg_initialized;
+
+cpu->mmu_present = true;
+cs->env_ptr = env;
+cpu_exec_init(cs, _abort);
+
+#if !defined(CONFIG_USER_ONLY)
+mmu_init(>mmu);
+#endif
+
+if (tcg_enabled() && !tcg_initialized) {
+tcg_initialized = true;
+nios2_tcg_init();
+}
+}
+
+Nios2CPU *cpu_nios2_init(const char *cpu_model)
+{
+Nios2CPU *cpu = NIOS2_CPU(object_new(TYPE_NIOS2_CPU));
+
+object_property_set_bool(OBJECT(cpu), true, "realized", NULL);
+
+return cpu;
+}
+
+static void nios2_cpu_realizefn(DeviceState *dev, Error **errp)
+{
+CPUState *cs = CPU(dev);
+Nios2CPUClass *ncc = NIOS2_CPU_GET_CLASS(dev);
+
+qemu_init_vcpu(cs);
+cpu_reset(cs);
+
+ncc->parent_realize(dev, errp);
+}
+
+static bool nios2_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
+{
+Nios2CPU *cpu = NIOS2_CPU(cs);
+CPUNios2State *env = >env;
+
+if ((interrupt_request & CPU_INTERRUPT_HARD) &&
+(env->regs[CR_STATUS] & CR_STATUS_PIE)) {
+