Dear Hoa, all,

Yes, the problem has been solved!! :)

I am now able to run the HPCG benchmark without accuracy problems in RISC-V FS mode!

Thank you!!

Best regards,
Nikos

Quoting Hoa Nguyen via gem5-users <gem5-users@gem5.org>:

Hi Nikos,

The problem you ran into with the FS mode seems to be the same problem
described here [1] [2].
Can you try downloading the changes and let me know if they fix the problem?

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://gem5-review.googlesource.com/c/public/gem5/+/65273

Regards,
Hoa Nguyen

On Tue, Nov 1, 2022 at 2:15 PM Hoa Nguyen <hoangu...@ucdavis.edu> wrote:

Hi all,

I also ran into the same problem using another benchmark. I want to note
that this problem also appears when using the AtomicCPU.

Regards,
Hoa Nguyen

On Tue, Nov 1, 2022 at 3:02 AM Νικόλαος Ταμπουρατζής via gem5-users <
gem5-users@gem5.org> wrote:

Dear Bobby,

Thank you for the update! Please let me know when the accuracy issue
is resolved, because I cannot run any benchmark in RISC-V FS
mode (I wonder whether other users face the same problem).

Best regards,
Nikos


Quoting Bobby Bruce via gem5-users <gem5-users@gem5.org>:

> You mean this bug? Unfortunately not, I've been very busy with the upcoming
> gem5 release and haven't had time to investigate this further.
>
> --
> Dr. Bobby R. Bruce
> Room 3050,
> Kemper Hall, UC Davis
> Davis,
> CA, 95616
>
> web: https://www.bobbybruce.net
>
>
> On Mon, Oct 31, 2022 at 1:45 AM Νικόλαος Ταμπουρατζής via gem5-users <
> gem5-users@gem5.org> wrote:
>
>> Dear Bobby, Jason, all,
>>
>> Is there any update about the accuracy of RISC-V FS?
>>
>> Best regards,
>> Nikos
>>
>>
>> Quoting Bobby Bruce <bbr...@ucdavis.edu>:
>>
>> > Jason and I had a theory that this may be due to the "Rounding Mode" for
>> > floating point being set incorrectly in FS mode. That's set via a macro
>> > here:
>> >
>> > https://gem5.googlesource.com/public/gem5/+/refs/tags/v22.0.0.2/src/arch/riscv/fp_inst.hh#36
>> >
>> > I manually expanded the macro here:
>> >
>> > https://gem5.googlesource.com/public/gem5/+/refs/tags/v22.0.0.2/src/arch/riscv/isa/decoder.isa#1495
>> >
>> > inside the "fsqrt_d" definition, then compiled "build/ALL/gem5.debug". I then
>> > used gdb to add a breakpoint in the "Fsqrt_d::execute" function (in the
>> > generated "build/ALL/arch/riscv/generated/exec-ns.cc.inc" file).
>> >
>> > ```
>> > gdb build/ALL/gem5.opt
>> > break Fsqrt_d::execute
>> > run bug-recreation/se-mode-run.py # or `run bug-recreation/fs-mode-run.py`
>> > ```
>> >
>> > Stepping through with gdb, I see the rounding mode is `0` for SE mode and `0`
>> > for FS mode as well. So, no luck with that theory.
>> >
>> > My new theory is that this bug has something to do with thread context
>> > switching being implemented incorrectly in RISC-V somehow. I find it
>> > strange that the sqrt(1) works fine for a while (i.e. returns `1`) then
>> > suddenly starts returning zero after a certain point in the execution. In
>> > addition, it's odd that the loop is not returning the same value each time
>> > despite executing the same code. It'd make sense to me that the thread is
>> > being stored and then resumed with some corruption of the floating point
>> > data. This would also explain why this bug only occurs in FS mode.
>> >
>> > I'll try to find time to figure out a good test for this. If anyone has any
>> > other theories or ideas then let me know.
>> >
>> > --
>> > Dr. Bobby R. Bruce
>> > Room 3050,
>> > Kemper Hall, UC Davis
>> > Davis,
>> > CA, 95616
>> >
>> > web: https://www.bobbybruce.net
>> >
>> >
>> > On Fri, Oct 7, 2022 at 12:50 PM Νικόλαος Ταμπουρατζής <
>> > ntampourat...@ece.auth.gr> wrote:
>> >>
>> >> Dear Jason & Bobby,
>> >>
>> >> Unfortunately, I have tried my simple example without the sqrt
>> >> function and the problem remains. Specifically, I have the following
>> >> simple code:
>> >>
>> >>
>> >> #include <cmath>
>> >> #include <stdio.h>
>> >>
>> >> int main(){
>> >>
>> >>      int dim = 1024;
>> >>
>> >>      double result;
>> >>
>> >>      for (int iter = 0; iter < 2; iter++){
>> >>          result = 0;
>> >>          for (int i = 0; i < dim; i++){
>> >>              for (int j = 0; j < dim; j++){
>> >>                  result += i * j;
>> >>              }
>> >>          }
>> >>          printf("Final Result: %lf\n", result);
>> >>      }
>> >> }
>> >>
>> >>
>> >> In the above code, the correct result is 274341298176.000000 (from
>> >> RISCV-SE mode and x86), while in FS mode I sometimes get the correct
>> >> result and other times a different number.
>> >>
>> >> Best regards,
>> >> Nikos
>> >>
>> >>
>> >> Quoting Jason Lowe-Power <ja...@lowepower.com>:
>> >>
>> >> > I have an idea...
>> >> >
>> >> > Have you put a breakpoint in the implementation of the fsqrt_d function? I
>> >> > would like to know whether we are using the same rounding mode when running
>> >> > in SE mode and in FS mode. My hypothesis is that in FS mode the rounding
>> >> > mode is set differently.
>> >> >
>> >> > Cheers,
>> >> > Jason
>> >> >
>> >> > On Fri, Oct 7, 2022 at 12:15 AM Νικόλαος Ταμπουρατζής <
>> >> > ntampourat...@ece.auth.gr> wrote:
>> >> >
>> >> >> Dear Bobby,
>> >> >>
>> >> >> Thanks a lot for the effort! I looked in detail and I observe that the
>> >> >> problem occurs only with float and double variables (with int it works
>> >> >> properly in FS mode). Specifically, float variables are set to "nan",
>> >> >> while double variables are set to 0.000000 (at a random time, probably
>> >> >> by some instruction of the simulated OS?). You may use a simple C/C++
>> >> >> example in order to get some traces before going to HPCG...
>> >> >>
>> >> >> Thank you in advance!!
>> >> >> Best regards,
>> >> >> Nikos
>> >> >>
>> >> >>
>> >> >> Quoting Bobby Bruce <bbr...@ucdavis.edu>:
>> >> >>
>> >> >> > Hey Niko,
>> >> >> >
>> >> >> > Thanks for this analysis. I jumped a little into this today but didn't get
>> >> >> > as far as you did. I wanted to find a quick way to recreate the following:
>> >> >> > https://gem5-review.googlesource.com/c/public/gem5/+/64211. Please feel
>> >> >> > free to use this, if it helps any.
>> >> >> >
>> >> >> > It's very strange to me that this bug hasn't manifested itself before but
>> >> >> > it's undeniably there. I'll try to spend more time looking at this tomorrow
>> >> >> > with some traces and debug flags and see if I can narrow down the problem.
>> >> >> >
>> >> >> > --
>> >> >> > Dr. Bobby R. Bruce
>> >> >> > Room 3050,
>> >> >> > Kemper Hall, UC Davis
>> >> >> > Davis,
>> >> >> > CA, 95616
>> >> >> >
>> >> >> > web: https://www.bobbybruce.net
>> >> >> >
>> >> >> >
>> >> >> > On Wed, Oct 5, 2022 at 2:26 PM Νικόλαος Ταμπουρατζής <
>> >> >> > ntampourat...@ece.auth.gr> wrote:
>> >> >> >
>> >> >> >> In my previous results, I had used double (not float) for the
>> >> >> >> following variables: result, sq_i and sq_j. With float
>> >> >> >> instead of double I get "nan" and not 0.000000.
>> >> >> >>
>> >> >> >> Quoting Νικόλαος Ταμπουρατζής <ntampourat...@ece.auth.gr>:
>> >> >> >>
>> >> >> >> > Dear Jason, all,
>> >> >> >> >
>> >> >> >> > I am trying to find the accuracy problem with RISCV-FS, and I observe
>> >> >> >> > that the problem arises (at least in my dummy example) because
>> >> >> >> > the variables (double) are set to zero at a random simulated time (for
>> >> >> >> > this reason I get different results among executions of the same
>> >> >> >> > code). Specifically, for the following dummy code:
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > #include <cmath>
>> >> >> >> > #include <stdio.h>
>> >> >> >> >
>> >> >> >> > int main(){
>> >> >> >> >
>> >> >> >> >     int dim = 10;
>> >> >> >> >
>> >> >> >> >     float result;
>> >> >> >> >
>> >> >> >> >     for (int iter = 0; iter < 2; iter++){
>> >> >> >> >         result = 0;
>> >> >> >> >         for (int i = 0; i < dim; i++){
>> >> >> >> >             for (int j = 0; j < dim; j++){
>> >> >> >> >                 float sq_i = sqrt(i);
>> >> >> >> >                 float sq_j = sqrt(j);
>> >> >> >> >                 result += sq_i * sq_j;
>> >> >> >> >                 printf("ITER: %d | i: %d | j: %d Result(i: %f | j: %f | i*j: %f): %f\n",
>> >> >> >> >                        iter, i, j, sq_i, sq_j, sq_i * sq_j, result);
>> >> >> >> >             }
>> >> >> >> >         }
>> >> >> >> >         printf("Final Result: %lf\n", result);
>> >> >> >> >     }
>> >> >> >> > }
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > The correct Final Result in both iterations is 372.721656. However,
>> >> >> >> > I get the following results in FS:
>> >> >> >> >
>> >> >> >> > ITER: 0 | i: 0 | j: 0 Result(i: 0.000000 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 0 | j: 1 Result(i: 0.000000 | j: 1.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 0 | j: 2 Result(i: 0.000000 | j: 1.414214 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 0 | j: 3 Result(i: 0.000000 | j: 1.732051 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 0 | j: 4 Result(i: 0.000000 | j: 2.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 0 | j: 5 Result(i: 0.000000 | j: 2.236068 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 0 | j: 6 Result(i: 0.000000 | j: 2.449490 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 0 | j: 7 Result(i: 0.000000 | j: 2.645751 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 0 | j: 8 Result(i: 0.000000 | j: 2.828427 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 0 | j: 9 Result(i: 0.000000 | j: 3.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 0 Result(i: 1.000000 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 1 Result(i: 1.000000 | j: 1.000000 |
i*j:
>> >> >> >> > 1.000000): 1.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 2 Result(i: 1.000000 | j: 1.414214 |
i*j:
>> >> >> >> > 1.414214): 2.414214
>> >> >> >> > ITER: 0 | i: 1 | j: 3 Result(i: 1.000000 | j: 1.732051 |
i*j:
>> >> >> >> > 1.732051): 4.146264
>> >> >> >> > ITER: 0 | i: 1 | j: 4 Result(i: 0.000000 | j: 2.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 5 Result(i: 0.000000 | j: 2.236068 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 6 Result(i: 0.000000 | j: 2.449490 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 7 Result(i: 0.000000 | j: 2.645751 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 8 Result(i: 0.000000 | j: 2.828427 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 9 Result(i: 0.000000 | j: 3.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 2 | j: 0 Result(i: 1.414214 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 2 | j: 1 Result(i: 1.414214 | j: 1.000000 |
i*j:
>> >> >> >> > 1.414214): 1.414214
>> >> >> >> > ITER: 0 | i: 2 | j: 2 Result(i: 1.414214 | j: 1.414214 |
i*j:
>> >> >> >> > 2.000000): 3.414214
>> >> >> >> > ITER: 0 | i: 2 | j: 3 Result(i: 1.414214 | j: 1.732051 |
i*j:
>> >> >> >> > 2.449490): 5.863703
>> >> >> >> > ITER: 0 | i: 2 | j: 4 Result(i: 1.414214 | j: 2.000000 |
i*j:
>> >> >> >> > 2.828427): 8.692130
>> >> >> >> > ITER: 0 | i: 2 | j: 5 Result(i: 1.414214 | j: 2.236068 |
i*j:
>> >> >> >> > 3.162278): 11.854408
>> >> >> >> > ITER: 0 | i: 2 | j: 6 Result(i: 1.414214 | j: 2.449490 |
i*j:
>> >> >> >> > 3.464102): 15.318510
>> >> >> >> > ITER: 0 | i: 2 | j: 7 Result(i: 1.414214 | j: 2.645751 |
i*j:
>> >> >> >> > 3.741657): 19.060167
>> >> >> >> > ITER: 0 | i: 2 | j: 8 Result(i: 1.414214 | j: 2.828427 |
i*j:
>> >> >> >> > 4.000000): 23.060167
>> >> >> >> > ITER: 0 | i: 2 | j: 9 Result(i: 1.414214 | j: 3.000000 |
i*j:
>> >> >> >> > 4.242641): 27.302808
>> >> >> >> > ITER: 0 | i: 3 | j: 0 Result(i: 1.732051 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 27.302808
>> >> >> >> > ITER: 0 | i: 3 | j: 1 Result(i: 1.732051 | j: 1.000000 |
i*j:
>> >> >> >> > 1.732051): 29.034859
>> >> >> >> > ITER: 0 | i: 3 | j: 2 Result(i: 1.732051 | j: 1.414214 |
i*j:
>> >> >> >> > 2.449490): 31.484348
>> >> >> >> > ITER: 0 | i: 3 | j: 3 Result(i: 1.732051 | j: 1.732051 |
i*j:
>> >> >> >> > 3.000000): 34.484348
>> >> >> >> > ITER: 0 | i: 3 | j: 4 Result(i: 1.732051 | j: 2.000000 |
i*j:
>> >> >> >> > 3.464102): 37.948450
>> >> >> >> > ITER: 0 | i: 3 | j: 5 Result(i: 1.732051 | j: 2.236068 |
i*j:
>> >> >> >> > 3.872983): 41.821433
>> >> >> >> > ITER: 0 | i: 3 | j: 6 Result(i: 1.732051 | j: 2.449490 |
i*j:
>> >> >> >> > 4.242641): 46.064074
>> >> >> >> > ITER: 0 | i: 3 | j: 7 Result(i: 1.732051 | j: 2.645751 |
i*j:
>> >> >> >> > 4.582576): 50.646650
>> >> >> >> > ITER: 0 | i: 3 | j: 8 Result(i: 1.732051 | j: 2.828427 |
i*j:
>> >> >> >> > 4.898979): 55.545629
>> >> >> >> > ITER: 0 | i: 3 | j: 9 Result(i: 1.732051 | j: 3.000000 |
i*j:
>> >> >> >> > 5.196152): 60.741782
>> >> >> >> > ITER: 0 | i: 4 | j: 0 Result(i: 2.000000 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 60.741782
>> >> >> >> > ITER: 0 | i: 4 | j: 1 Result(i: 2.000000 | j: 1.000000 |
i*j:
>> >> >> >> > 2.000000): 62.741782
>> >> >> >> > ITER: 0 | i: 4 | j: 2 Result(i: 2.000000 | j: 1.414214 |
i*j:
>> >> >> >> > 2.828427): 65.570209
>> >> >> >> > ITER: 0 | i: 4 | j: 3 Result(i: 2.000000 | j: 1.732051 |
i*j:
>> >> >> >> > 3.464102): 69.034310
>> >> >> >> > ITER: 0 | i: 4 | j: 4 Result(i: 2.000000 | j: 2.000000 |
i*j:
>> >> >> >> > 4.000000): 73.034310
>> >> >> >> > ITER: 0 | i: 4 | j: 5 Result(i: 2.000000 | j: 2.236068 |
i*j:
>> >> >> >> > 4.472136): 77.506446
>> >> >> >> > ITER: 0 | i: 4 | j: 6 Result(i: 2.000000 | j: 2.449490 |
i*j:
>> >> >> >> > 4.898979): 82.405426
>> >> >> >> > ITER: 0 | i: 4 | j: 7 Result(i: 2.000000 | j: 2.645751 |
i*j:
>> >> >> >> > 5.291503): 87.696928
>> >> >> >> > ITER: 0 | i: 4 | j: 8 Result(i: 2.000000 | j: 2.828427 |
i*j:
>> >> >> >> > 5.656854): 93.353783
>> >> >> >> > ITER: 0 | i: 4 | j: 9 Result(i: 2.000000 | j: 3.000000 |
i*j:
>> >> >> >> > 6.000000): 99.353783
>> >> >> >> > ITER: 0 | i: 5 | j: 0 Result(i: 2.236068 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 99.353783
>> >> >> >> > ITER: 0 | i: 5 | j: 1 Result(i: 2.236068 | j: 1.000000 |
i*j:
>> >> >> >> > 2.236068): 101.589851
>> >> >> >> > ITER: 0 | i: 5 | j: 2 Result(i: 2.236068 | j: 1.414214 |
i*j:
>> >> >> >> > 3.162278): 104.752128
>> >> >> >> > ITER: 0 | i: 5 | j: 3 Result(i: 2.236068 | j: 1.732051 |
i*j:
>> >> >> >> > 3.872983): 108.625112
>> >> >> >> > ITER: 0 | i: 5 | j: 4 Result(i: 2.236068 | j: 2.000000 |
i*j:
>> >> >> >> > 4.472136): 113.097248
>> >> >> >> > ITER: 0 | i: 5 | j: 5 Result(i: 2.236068 | j: 2.236068 |
i*j:
>> >> >> >> > 5.000000): 118.097248
>> >> >> >> > ITER: 0 | i: 5 | j: 6 Result(i: 2.236068 | j: 2.449490 |
i*j:
>> >> >> >> > 5.477226): 123.574473
>> >> >> >> > ITER: 0 | i: 5 | j: 7 Result(i: 2.236068 | j: 2.645751 |
i*j:
>> >> >> >> > 5.916080): 129.490553
>> >> >> >> > ITER: 0 | i: 5 | j: 8 Result(i: 2.236068 | j: 2.828427 |
i*j:
>> >> >> >> > 6.324555): 135.815108
>> >> >> >> > ITER: 0 | i: 5 | j: 9 Result(i: 2.236068 | j: 3.000000 |
i*j:
>> >> >> >> > 6.708204): 142.523312
>> >> >> >> > ITER: 0 | i: 6 | j: 0 Result(i: 2.449490 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 142.523312
>> >> >> >> > ITER: 0 | i: 6 | j: 1 Result(i: 2.449490 | j: 1.000000 |
i*j:
>> >> >> >> > 2.449490): 144.972802
>> >> >> >> > ITER: 0 | i: 6 | j: 2 Result(i: 2.449490 | j: 1.414214 |
i*j:
>> >> >> >> > 3.464102): 148.436904
>> >> >> >> > ITER: 0 | i: 6 | j: 3 Result(i: 2.449490 | j: 1.732051 |
i*j:
>> >> >> >> > 4.242641): 152.679544
>> >> >> >> > ITER: 0 | i: 6 | j: 4 Result(i: 2.449490 | j: 2.000000 |
i*j:
>> >> >> >> > 4.898979): 157.578524
>> >> >> >> > ITER: 0 | i: 6 | j: 5 Result(i: 2.449490 | j: 2.236068 |
i*j:
>> >> >> >> > 5.477226): 163.055749
>> >> >> >> > ITER: 0 | i: 6 | j: 6 Result(i: 2.449490 | j: 2.449490 |
i*j:
>> >> >> >> > 6.000000): 169.055749
>> >> >> >> > ITER: 0 | i: 6 | j: 7 Result(i: 2.449490 | j: 2.645751 |
i*j:
>> >> >> >> > 6.480741): 175.536490
>> >> >> >> > ITER: 0 | i: 6 | j: 8 Result(i: 2.449490 | j: 2.828427 |
i*j:
>> >> >> >> > 6.928203): 182.464693
>> >> >> >> > ITER: 0 | i: 6 | j: 9 Result(i: 2.449490 | j: 3.000000 |
i*j:
>> >> >> >> > 7.348469): 189.813162
>> >> >> >> > ITER: 0 | i: 7 | j: 0 Result(i: 2.645751 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 189.813162
>> >> >> >> > ITER: 0 | i: 7 | j: 1 Result(i: 2.645751 | j: 1.000000 |
i*j:
>> >> >> >> > 2.645751): 192.458914
>> >> >> >> > ITER: 0 | i: 7 | j: 2 Result(i: 2.645751 | j: 1.414214 |
i*j:
>> >> >> >> > 3.741657): 196.200571
>> >> >> >> > ITER: 0 | i: 7 | j: 3 Result(i: 2.645751 | j: 1.732051 |
i*j:
>> >> >> >> > 4.582576): 200.783147
>> >> >> >> > ITER: 0 | i: 7 | j: 4 Result(i: 2.645751 | j: 2.000000 |
i*j:
>> >> >> >> > 5.291503): 206.074649
>> >> >> >> > ITER: 0 | i: 7 | j: 5 Result(i: 2.645751 | j: 2.236068 |
i*j:
>> >> >> >> > 5.916080): 211.990729
>> >> >> >> > ITER: 0 | i: 7 | j: 6 Result(i: 2.645751 | j: 2.449490 |
i*j:
>> >> >> >> > 6.480741): 218.471470
>> >> >> >> > ITER: 0 | i: 7 | j: 7 Result(i: 2.645751 | j: 2.645751 |
i*j:
>> >> >> >> > 7.000000): 225.471470
>> >> >> >> > ITER: 0 | i: 7 | j: 8 Result(i: 2.645751 | j: 2.828427 |
i*j:
>> >> >> >> > 7.483315): 232.954785
>> >> >> >> > ITER: 0 | i: 7 | j: 9 Result(i: 2.645751 | j: 3.000000 |
i*j:
>> >> >> >> > 7.937254): 240.892039
>> >> >> >> > ITER: 0 | i: 8 | j: 0 Result(i: 2.828427 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 240.892039
>> >> >> >> > ITER: 0 | i: 8 | j: 1 Result(i: 2.828427 | j: 1.000000 |
i*j:
>> >> >> >> > 2.828427): 243.720466
>> >> >> >> > ITER: 0 | i: 8 | j: 2 Result(i: 2.828427 | j: 1.414214 |
i*j:
>> >> >> >> > 4.000000): 247.720466
>> >> >> >> > ITER: 0 | i: 8 | j: 3 Result(i: 2.828427 | j: 1.732051 |
i*j:
>> >> >> >> > 4.898979): 252.619445
>> >> >> >> > ITER: 0 | i: 8 | j: 4 Result(i: 2.828427 | j: 2.000000 |
i*j:
>> >> >> >> > 5.656854): 258.276300
>> >> >> >> > ITER: 0 | i: 8 | j: 5 Result(i: 2.828427 | j: 2.236068 |
i*j:
>> >> >> >> > 6.324555): 264.600855
>> >> >> >> > ITER: 0 | i: 8 | j: 6 Result(i: 2.828427 | j: 2.449490 |
i*j:
>> >> >> >> > 6.928203): 271.529058
>> >> >> >> > ITER: 0 | i: 8 | j: 7 Result(i: 2.828427 | j: 2.645751 |
i*j:
>> >> >> >> > 7.483315): 279.012373
>> >> >> >> > ITER: 0 | i: 8 | j: 8 Result(i: 2.828427 | j: 2.828427 |
i*j:
>> >> >> >> > 8.000000): 287.012373
>> >> >> >> > ITER: 0 | i: 8 | j: 9 Result(i: 2.828427 | j: 3.000000 |
i*j:
>> >> >> >> > 8.485281): 295.497654
>> >> >> >> > ITER: 0 | i: 9 | j: 0 Result(i: 3.000000 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 295.497654
>> >> >> >> > ITER: 0 | i: 9 | j: 1 Result(i: 3.000000 | j: 1.000000 |
i*j:
>> >> >> >> > 3.000000): 298.497654
>> >> >> >> > ITER: 0 | i: 9 | j: 2 Result(i: 3.000000 | j: 1.414214 |
i*j:
>> >> >> >> > 4.242641): 302.740295
>> >> >> >> > ITER: 0 | i: 9 | j: 3 Result(i: 3.000000 | j: 1.732051 |
i*j:
>> >> >> >> > 5.196152): 307.936447
>> >> >> >> > ITER: 0 | i: 9 | j: 4 Result(i: 3.000000 | j: 2.000000 |
i*j:
>> >> >> >> > 6.000000): 313.936447
>> >> >> >> > ITER: 0 | i: 9 | j: 5 Result(i: 3.000000 | j: 2.236068 |
i*j:
>> >> >> >> > 6.708204): 320.644651
>> >> >> >> > ITER: 0 | i: 9 | j: 6 Result(i: 3.000000 | j: 2.449490 |
i*j:
>> >> >> >> > 7.348469): 327.993120
>> >> >> >> > ITER: 0 | i: 9 | j: 7 Result(i: 3.000000 | j: 2.645751 |
i*j:
>> >> >> >> > 7.937254): 335.930374
>> >> >> >> > ITER: 0 | i: 9 | j: 8 Result(i: 3.000000 | j: 2.828427 |
i*j:
>> >> >> >> > 8.485281): 344.415656
>> >> >> >> > ITER: 0 | i: 9 | j: 9 Result(i: 3.000000 | j: 3.000000 |
i*j:
>> >> >> >> > 9.000000): 353.415656
>> >> >> >> > Final Result: 353.415656
>> >> >> >> > ITER: 1 | i: 0 | j: 0 Result(i: 0.000000 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 1 | i: 0 | j: 1 Result(i: 0.000000 | j: 1.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 1 | i: 0 | j: 2 Result(i: 0.000000 | j: 1.414214 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 1 | i: 0 | j: 3 Result(i: 0.000000 | j: 1.732051 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 1 | i: 0 | j: 4 Result(i: 0.000000 | j: 2.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 1 | i: 0 | j: 5 Result(i: 0.000000 | j: 2.236068 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 1 | i: 0 | j: 6 Result(i: 0.000000 | j: 2.449490 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 1 | i: 0 | j: 7 Result(i: 0.000000 | j: 2.645751 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 1 | i: 0 | j: 8 Result(i: 0.000000 | j: 2.828427 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 1 | i: 0 | j: 9 Result(i: 0.000000 | j: 3.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 1 | i: 1 | j: 0 Result(i: 1.000000 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 1 | i: 1 | j: 1 Result(i: 1.000000 | j: 1.000000 |
i*j:
>> >> >> >> > 1.000000): 1.000000
>> >> >> >> > ITER: 1 | i: 1 | j: 2 Result(i: 1.000000 | j: 1.414214 |
i*j:
>> >> >> >> > 1.414214): 2.414214
>> >> >> >> > ITER: 1 | i: 1 | j: 3 Result(i: 1.000000 | j: 1.732051 |
i*j:
>> >> >> >> > 1.732051): 4.146264
>> >> >> >> > ITER: 1 | i: 1 | j: 4 Result(i: 1.000000 | j: 2.000000 |
i*j:
>> >> >> >> > 2.000000): 6.146264
>> >> >> >> > ITER: 1 | i: 1 | j: 5 Result(i: 1.000000 | j: 2.236068 |
i*j:
>> >> >> >> > 2.236068): 8.382332
>> >> >> >> > ITER: 1 | i: 1 | j: 6 Result(i: 1.000000 | j: 2.449490 |
i*j:
>> >> >> >> > 2.449490): 10.831822
>> >> >> >> > ITER: 1 | i: 1 | j: 7 Result(i: 1.000000 | j: 2.645751 |
i*j:
>> >> >> >> > 2.645751): 13.477573
>> >> >> >> > ITER: 1 | i: 1 | j: 8 Result(i: 1.000000 | j: 2.828427 |
i*j:
>> >> >> >> > 2.828427): 16.306001
>> >> >> >> > ITER: 1 | i: 1 | j: 9 Result(i: 1.000000 | j: 3.000000 |
i*j:
>> >> >> >> > 3.000000): 19.306001
>> >> >> >> > ITER: 1 | i: 2 | j: 0 Result(i: 1.414214 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 19.306001
>> >> >> >> > ITER: 1 | i: 2 | j: 1 Result(i: 1.414214 | j: 1.000000 |
i*j:
>> >> >> >> > 1.414214): 20.720214
>> >> >> >> > ITER: 1 | i: 2 | j: 2 Result(i: 1.414214 | j: 1.414214 |
i*j:
>> >> >> >> > 2.000000): 22.720214
>> >> >> >> > ITER: 1 | i: 2 | j: 3 Result(i: 1.414214 | j: 1.732051 |
i*j:
>> >> >> >> > 2.449490): 25.169704
>> >> >> >> > ITER: 1 | i: 2 | j: 4 Result(i: 1.414214 | j: 2.000000 |
i*j:
>> >> >> >> > 2.828427): 27.998131
>> >> >> >> > ITER: 1 | i: 2 | j: 5 Result(i: 1.414214 | j: 2.236068 |
i*j:
>> >> >> >> > 3.162278): 31.160409
>> >> >> >> > ITER: 1 | i: 2 | j: 6 Result(i: 1.414214 | j: 2.449490 |
i*j:
>> >> >> >> > 3.464102): 34.624510
>> >> >> >> > ITER: 1 | i: 2 | j: 7 Result(i: 1.414214 | j: 2.645751 |
i*j:
>> >> >> >> > 3.741657): 38.366168
>> >> >> >> > ITER: 1 | i: 2 | j: 8 Result(i: 1.414214 | j: 2.828427 |
i*j:
>> >> >> >> > 4.000000): 42.366168
>> >> >> >> > ITER: 1 | i: 2 | j: 9 Result(i: 1.414214 | j: 3.000000 |
i*j:
>> >> >> >> > 4.242641): 46.608808
>> >> >> >> > ITER: 1 | i: 3 | j: 0 Result(i: 1.732051 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 46.608808
>> >> >> >> > ITER: 1 | i: 3 | j: 1 Result(i: 1.732051 | j: 1.000000 |
i*j:
>> >> >> >> > 1.732051): 48.340859
>> >> >> >> > ITER: 1 | i: 3 | j: 2 Result(i: 1.732051 | j: 1.414214 |
i*j:
>> >> >> >> > 2.449490): 50.790349
>> >> >> >> > ITER: 1 | i: 3 | j: 3 Result(i: 1.732051 | j: 1.732051 |
i*j:
>> >> >> >> > 3.000000): 53.790349
>> >> >> >> > ITER: 1 | i: 3 | j: 4 Result(i: 1.732051 | j: 2.000000 |
i*j:
>> >> >> >> > 3.464102): 57.254450
>> >> >> >> > ITER: 1 | i: 3 | j: 5 Result(i: 1.732051 | j: 2.236068 |
i*j:
>> >> >> >> > 3.872983): 61.127434
>> >> >> >> > ITER: 1 | i: 3 | j: 6 Result(i: 1.732051 | j: 2.449490 |
i*j:
>> >> >> >> > 4.242641): 65.370075
>> >> >> >> > ITER: 1 | i: 3 | j: 7 Result(i: 1.732051 | j: 2.645751 |
i*j:
>> >> >> >> > 4.582576): 69.952650
>> >> >> >> > ITER: 1 | i: 3 | j: 8 Result(i: 1.732051 | j: 2.828427 |
i*j:
>> >> >> >> > 4.898979): 74.851630
>> >> >> >> > ITER: 1 | i: 3 | j: 9 Result(i: 1.732051 | j: 3.000000 |
i*j:
>> >> >> >> > 5.196152): 80.047782
>> >> >> >> > ITER: 1 | i: 4 | j: 0 Result(i: 2.000000 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 80.047782
>> >> >> >> > ITER: 1 | i: 4 | j: 1 Result(i: 2.000000 | j: 1.000000 |
i*j:
>> >> >> >> > 2.000000): 82.047782
>> >> >> >> > ITER: 1 | i: 4 | j: 2 Result(i: 2.000000 | j: 1.414214 |
i*j:
>> >> >> >> > 2.828427): 84.876209
>> >> >> >> > ITER: 1 | i: 4 | j: 3 Result(i: 2.000000 | j: 1.732051 |
i*j:
>> >> >> >> > 3.464102): 88.340311
>> >> >> >> > ITER: 1 | i: 4 | j: 4 Result(i: 2.000000 | j: 2.000000 |
i*j:
>> >> >> >> > 4.000000): 92.340311
>> >> >> >> > ITER: 1 | i: 4 | j: 5 Result(i: 2.000000 | j: 2.236068 |
i*j:
>> >> >> >> > 4.472136): 96.812447
>> >> >> >> > ITER: 1 | i: 4 | j: 6 Result(i: 2.000000 | j: 2.449490 |
i*j:
>> >> >> >> > 4.898979): 101.711426
>> >> >> >> > ITER: 1 | i: 4 | j: 7 Result(i: 2.000000 | j: 2.645751 |
i*j:
>> >> >> >> > 5.291503): 107.002929
>> >> >> >> > ITER: 1 | i: 4 | j: 8 Result(i: 2.000000 | j: 2.828427 |
i*j:
>> >> >> >> > 5.656854): 112.659783
>> >> >> >> > ITER: 1 | i: 4 | j: 9 Result(i: 2.000000 | j: 3.000000 |
i*j:
>> >> >> >> > 6.000000): 118.659783
>> >> >> >> > ITER: 1 | i: 5 | j: 0 Result(i: 2.236068 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 118.659783
>> >> >> >> > ITER: 1 | i: 5 | j: 1 Result(i: 2.236068 | j: 1.000000 |
i*j:
>> >> >> >> > 2.236068): 120.895851
>> >> >> >> > ITER: 1 | i: 5 | j: 2 Result(i: 2.236068 | j: 1.414214 |
i*j:
>> >> >> >> > 3.162278): 124.058129
>> >> >> >> > ITER: 1 | i: 5 | j: 3 Result(i: 2.236068 | j: 1.732051 |
i*j:
>> >> >> >> > 3.872983): 127.931112
>> >> >> >> > ITER: 1 | i: 5 | j: 4 Result(i: 2.236068 | j: 2.000000 |
i*j:
>> >> >> >> > 4.472136): 132.403248
>> >> >> >> > ITER: 1 | i: 5 | j: 5 Result(i: 2.236068 | j: 2.236068 |
i*j:
>> >> >> >> > 5.000000): 137.403248
>> >> >> >> > ITER: 1 | i: 5 | j: 6 Result(i: 2.236068 | j: 2.449490 |
i*j:
>> >> >> >> > 5.477226): 142.880474
>> >> >> >> > ITER: 1 | i: 5 | j: 7 Result(i: 2.236068 | j: 2.645751 |
i*j:
>> >> >> >> > 5.916080): 148.796553
>> >> >> >> > ITER: 1 | i: 5 | j: 8 Result(i: 2.236068 | j: 2.828427 |
i*j:
>> >> >> >> > 6.324555): 155.121109
>> >> >> >> > ITER: 1 | i: 5 | j: 9 Result(i: 2.236068 | j: 3.000000 |
i*j:
>> >> >> >> > 6.708204): 161.829313
>> >> >> >> > ITER: 1 | i: 6 | j: 0 Result(i: 2.449490 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 161.829313
>> >> >> >> > ITER: 1 | i: 6 | j: 1 Result(i: 2.449490 | j: 1.000000 |
i*j:
>> >> >> >> > 2.449490): 164.278802
>> >> >> >> > ITER: 1 | i: 6 | j: 2 Result(i: 2.449490 | j: 1.414214 |
i*j:
>> >> >> >> > 3.464102): 167.742904
>> >> >> >> > ITER: 1 | i: 6 | j: 3 Result(i: 2.449490 | j: 1.732051 |
i*j:
>> >> >> >> > 4.242641): 171.985545
>> >> >> >> > ITER: 1 | i: 6 | j: 4 Result(i: 2.449490 | j: 2.000000 |
i*j:
>> >> >> >> > 4.898979): 176.884524
>> >> >> >> > ITER: 1 | i: 6 | j: 5 Result(i: 2.449490 | j: 2.236068 |
i*j:
>> >> >> >> > 5.477226): 182.361750
>> >> >> >> > ITER: 1 | i: 6 | j: 6 Result(i: 2.449490 | j: 2.449490 |
i*j:
>> >> >> >> > 6.000000): 188.361750
>> >> >> >> > ITER: 1 | i: 6 | j: 7 Result(i: 2.449490 | j: 2.645751 |
i*j:
>> >> >> >> > 6.480741): 194.842491
>> >> >> >> > ITER: 1 | i: 6 | j: 8 Result(i: 2.449490 | j: 2.828427 |
i*j:
>> >> >> >> > 6.928203): 201.770694
>> >> >> >> > ITER: 1 | i: 6 | j: 9 Result(i: 2.449490 | j: 3.000000 |
i*j:
>> >> >> >> > 7.348469): 209.119163
>> >> >> >> > ITER: 1 | i: 7 | j: 0 Result(i: 2.645751 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 209.119163
>> >> >> >> > ITER: 1 | i: 7 | j: 1 Result(i: 2.645751 | j: 1.000000 |
i*j:
>> >> >> >> > 2.645751): 211.764914
>> >> >> >> > ITER: 1 | i: 7 | j: 2 Result(i: 2.645751 | j: 1.414214 |
i*j:
>> >> >> >> > 3.741657): 215.506572
>> >> >> >> > ITER: 1 | i: 7 | j: 3 Result(i: 2.645751 | j: 1.732051 |
i*j:
>> >> >> >> > 4.582576): 220.089147
>> >> >> >> > ITER: 1 | i: 7 | j: 4 Result(i: 2.645751 | j: 2.000000 |
i*j:
>> >> >> >> > 5.291503): 225.380650
>> >> >> >> > ITER: 1 | i: 7 | j: 5 Result(i: 2.645751 | j: 2.236068 |
i*j:
>> >> >> >> > 5.916080): 231.296730
>> >> >> >> > ITER: 1 | i: 7 | j: 6 Result(i: 2.645751 | j: 2.449490 |
i*j:
>> >> >> >> > 6.480741): 237.777470
>> >> >> >> > ITER: 1 | i: 7 | j: 7 Result(i: 2.645751 | j: 2.645751 |
i*j:
>> >> >> >> > 7.000000): 244.777470
>> >> >> >> > ITER: 1 | i: 7 | j: 8 Result(i: 2.645751 | j: 2.828427 |
i*j:
>> >> >> >> > 7.483315): 252.260785
>> >> >> >> > ITER: 1 | i: 7 | j: 9 Result(i: 2.645751 | j: 3.000000 |
i*j:
>> >> >> >> > 7.937254): 260.198039
>> >> >> >> > ITER: 1 | i: 8 | j: 0 Result(i: 2.828427 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 260.198039
>> >> >> >> > ITER: 1 | i: 8 | j: 1 Result(i: 2.828427 | j: 1.000000 |
i*j:
>> >> >> >> > 2.828427): 263.026466
>> >> >> >> > ITER: 1 | i: 8 | j: 2 Result(i: 2.828427 | j: 1.414214 |
i*j:
>> >> >> >> > 4.000000): 267.026466
>> >> >> >> > ITER: 1 | i: 8 | j: 3 Result(i: 2.828427 | j: 1.732051 |
i*j:
>> >> >> >> > 4.898979): 271.925446
>> >> >> >> > ITER: 1 | i: 8 | j: 4 Result(i: 2.828427 | j: 2.000000 |
i*j:
>> >> >> >> > 5.656854): 277.582300
>> >> >> >> > ITER: 1 | i: 8 | j: 5 Result(i: 2.828427 | j: 2.236068 |
i*j:
>> >> >> >> > 6.324555): 283.906855
>> >> >> >> > ITER: 1 | i: 8 | j: 6 Result(i: 2.828427 | j: 2.449490 |
i*j:
>> >> >> >> > 6.928203): 290.835059
>> >> >> >> > ITER: 1 | i: 8 | j: 7 Result(i: 2.828427 | j: 2.645751 |
i*j:
>> >> >> >> > 7.483315): 298.318373
>> >> >> >> > ITER: 1 | i: 8 | j: 8 Result(i: 2.828427 | j: 2.828427 |
i*j:
>> >> >> >> > 8.000000): 306.318373
>> >> >> >> > ITER: 1 | i: 8 | j: 9 Result(i: 2.828427 | j: 3.000000 |
i*j:
>> >> >> >> > 8.485281): 314.803655
>> >> >> >> > ITER: 1 | i: 9 | j: 0 Result(i: 3.000000 | j: 0.000000 |
i*j:
>> >> >> >> > 0.000000): 314.803655
>> >> >> >> > ITER: 1 | i: 9 | j: 1 Result(i: 3.000000 | j: 1.000000 |
i*j:
>> >> >> >> > 3.000000): 317.803655
>> >> >> >> > ITER: 1 | i: 9 | j: 2 Result(i: 3.000000 | j: 1.414214 |
i*j:
>> >> >> >> > 4.242641): 322.046295
>> >> >> >> > ITER: 1 | i: 9 | j: 3 Result(i: 3.000000 | j: 1.732051 |
i*j:
>> >> >> >> > 5.196152): 327.242448
>> >> >> >> > ITER: 1 | i: 9 | j: 4 Result(i: 3.000000 | j: 2.000000 |
i*j:
>> >> >> >> > 6.000000): 333.242448
>> >> >> >> > ITER: 1 | i: 9 | j: 5 Result(i: 3.000000 | j: 2.236068 |
i*j:
>> >> >> >> > 6.708204): 339.950652
>> >> >> >> > ITER: 1 | i: 9 | j: 6 Result(i: 3.000000 | j: 2.449490 |
i*j:
>> >> >> >> > 7.348469): 347.299121
>> >> >> >> > ITER: 1 | i: 9 | j: 7 Result(i: 3.000000 | j: 2.645751 |
i*j:
>> >> >> >> > 7.937254): 355.236375
>> >> >> >> > ITER: 1 | i: 9 | j: 8 Result(i: 3.000000 | j: 2.828427 |
i*j:
>> >> >> >> > 8.485281): 363.721656
>> >> >> >> > ITER: 1 | i: 9 | j: 9 Result(i: 3.000000 | j: 3.000000 |
i*j:
>> >> >> >> > 9.000000): 372.721656
>> >> >> >> > Final Result: 372.721656
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > As we can see in the following iterations, sqrt(1) as well as the
>> >> >> >> > result is set to zero for some reason.
>> >> >> >> >
>> >> >> >> > ITER: 0 | i: 1 | j: 4 Result(i: 0.000000 | j: 2.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 5 Result(i: 0.000000 | j: 2.236068 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 6 Result(i: 0.000000 | j: 2.449490 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 7 Result(i: 0.000000 | j: 2.645751 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 8 Result(i: 0.000000 | j: 2.828427 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> > ITER: 0 | i: 1 | j: 9 Result(i: 0.000000 | j: 3.000000 |
i*j:
>> >> >> >> > 0.000000): 0.000000
>> >> >> >> >
>> >> >> >> > Please help me to resolve the accuracy issue! I think that it will
>> >> >> >> > be very useful for the gem5 community.
>> >> >> >> >
>> >> >> >> > Note that I found the simulated tick at which the
>> >> >> >> > application starts in FS (using m5 dumpstats) and passed it to
>> >> >> >> > --debug-start, but the generated trace file is 10x larger
>> >> >> >> > than in SE mode for the same application. How can I compare them?
>> >> >> >> >
>> >> >> >> > Thank you in advance!
>> >> >> >> > Best regards,
>> >> >> >> > Nikos
>> >> >> >> >
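One way to approach the comparison asked about above, as a rough sketch rather than a gem5 feature: strip the leading fields that necessarily differ between the two runs, optionally filter out entries that appear in only one run, and report the first entry that differs. The sketch assumes each trace line starts with two ": "-separated fields (taken here to be the tick and the simulated object's name). That layout, the helper names strip_prefix and first_divergence, and the sample lines are illustrative assumptions, not real gem5 output.

```python
from itertools import zip_longest
from typing import Callable, Iterable, Optional, Tuple


def strip_prefix(line: str) -> str:
    """Drop the first two ': '-separated fields (assumed: tick, object name)."""
    text = line.rstrip("\n")
    parts = text.split(": ", 2)
    return parts[2] if len(parts) == 3 else text


def first_divergence(
    a: Iterable[str],
    b: Iterable[str],
    keep: Callable[[str], bool] = lambda s: True,
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (index, entry_a, entry_b) at the first mismatch, else None.

    Both inputs are consumed lazily, so very large files are never
    held in memory. A missing entry on one side is reported as None.
    """
    fa = (s for s in map(strip_prefix, a) if keep(s))
    fb = (s for s in map(strip_prefix, b) if keep(s))
    for idx, (x, y) in enumerate(zip_longest(fa, fb)):
        if x != y:
            return idx, x, y
    return None


# Synthetic stand-ins for two traces (NOT real gem5 output).
se_trace = ["100: cpuA: op1 D=0x1", "200: cpuA: op2 D=0x2", "300: cpuA: op3 D=0x3"]
fs_trace = ["900: cpuB: kernel-op", "950: cpuB: op1 D=0x1", "990: cpuB: op2 D=0x0"]

# Hypothetical filter: ignore entries that only exist in the FS run.
not_kernel = lambda s: not s.startswith("kernel")

print(first_divergence(se_trace, fs_trace, keep=not_kernel))
```

Open file objects can be passed directly in place of the lists, which avoids loading multi-gigabyte traces at once. The keep predicate is where an FS trace would be narrowed down, for example to the program's own address range, since the FS trace also contains everything the kernel executes.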
>> >> >> >> > Quoting Νικόλαος Ταμπουρατζής <ntampourat...@ece.auth.gr>:
>> >> >> >> >
>> >> >> >> >> Dear Jason,
>> >> >> >> >>
>> >> >> >> >> I am trying to use --debug-start, but in FS mode it is very
>> >> >> >> >> difficult to find the tick at which the application starts!
>> >> >> >> >>
>> >> >> >> >> However, I wrote the following very simple C++ program:
>> >> >> >> >>
>> >> >> >> >> #include <cmath>
>> >> >> >> >> #include <stdio.h>
>> >> >> >> >>
>> >> >> >> >> int main(){
>> >> >> >> >>
>> >> >> >> >>    int dim = 4096;
>> >> >> >> >>
>> >> >> >> >>    double result;
>> >> >> >> >>
>> >> >> >> >>    for (int iter = 0; iter < 2; iter++){
>> >> >> >> >>        result = 0;
>> >> >> >> >>        for (int i = 0; i < dim; i++){
>> >> >> >> >>            for (int j = 0; j < dim; j++){
>> >> >> >> >>                result += sqrt(i) * sqrt(j);
>> >> >> >> >>            }
>> >> >> >> >>        }
>> >> >> >> >>        printf("Result: %lf\n", result); //Result: 30530733453.127449
>> >> >> >> >>    }
>> >> >> >> >> }
>> >> >> >> >>
>> >> >> >> >> I cross-compile it using:
>> >> >> >> >> riscv64-linux-gnu-g++ -static -O3 -o test_riscv test_riscv.cpp
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> While on x86 (without cross-compilation, of course), QEMU-RISCV, and
>> >> >> >> >> gem5 SE mode the result is the same (30530733453.127449), in gem5 FS
>> >> >> >> >> mode the result is different! In addition, the result also differs
>> >> >> >> >> between the 2 iterations.
>> >> >> >> >>
>> >> >> >> >> Please reproduce the error if you want, in order to verify my result.
>> >> >> >> >> How can the issue be resolved?
>> >> >> >> >>
>> >> >> >> >> Thank you in advance!
>> >> >> >> >>
>> >> >> >> >> Best regards,
>> >> >> >> >> Nikos
>> >> >> >> >>
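For a host-side reference to compare against, the test program's sum can be reproduced in a few lines of Python. This is only a sketch: it assumes Python floats are IEEE 754 doubles (true for CPython on common platforms), the function names are made up for illustration, and the last digits need not match a compiled binary exactly if the compiler fuses the multiply and the add. As an independent cross-check it also uses the identity sum_i sum_j sqrt(i)*sqrt(j) = (sum_i sqrt(i))^2.

```python
import math


def sqrt_product_sum(dim: int) -> float:
    """Accumulate sqrt(i) * sqrt(j) in the same loop order as the C++ test."""
    result = 0.0
    for i in range(dim):
        sqrt_i = math.sqrt(i)
        for j in range(dim):
            result += sqrt_i * math.sqrt(j)
    return result


def closed_form(dim: int) -> float:
    """Same quantity via (sum of sqrt(i))**2, using an accurately rounded sum."""
    total = math.fsum(math.sqrt(i) for i in range(dim))
    return total * total


if __name__ == "__main__":
    # Small case: cheap enough to run with the explicit double loop.
    print(f"dim=10   loop:        {sqrt_product_sum(10):.6f}")
    print(f"dim=10   closed form: {closed_form(10):.6f}")
    # Large case: the closed form avoids 4096*4096 Python-level iterations.
    print(f"dim=4096 closed form: {closed_form(4096):.6f}")
```

With dim=10 both functions should print 372.721656, the "Final Result" of the 10x10 run quoted earlier in the thread. For dim=4096 the closed form should agree with 30530733453.127449 in its leading digits; small differences in the last digits come from accumulation order.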
>> >> >> >> >>
>> >> >> >> >> Quoting Jason Lowe-Power <ja...@lowepower.com>:
>> >> >> >> >>
>> >> >> >> >>> Hi Nikos,
>> >> >> >> >>>
>> >> >> >> >>> You can use --debug-start to start the debugging after some number of
>> >> >> >> >>> ticks. Also, I would expect that the difference should come up quickly,
>> >> >> >> >>> so no need to run the program to the end.
>> >> >> >> >>>
>> >> >> >> >>> For the FS mode one, you will want to just start the trace as the
>> >> >> >> >>> application starts. This could be a bit of a pain.
>> >> >> >> >>>
>> >> >> >> >>> I'm not really sure what fundamentally could be different. FS and SE
>> >> >> >> >>> mode use the exact same code for executing instructions, so I don't
>> >> >> >> >>> think that's the problem. Have you tried running for smaller inputs or
>> >> >> >> >>> just one iteration?
>> >> >> >> >>>
>> >> >> >> >>> Jason
>> >> >> >> >>>
>> >> >> >> >>>
>> >> >> >> >>>
>> >> >> >> >>> On Wed, Sep 21, 2022 at 9:04 AM Νικόλαος Ταμπουρατζής <
>> >> >> >> >>> ntampourat...@ece.auth.gr> wrote:
>> >> >> >> >>>
>> >> >> >> >>>> Dear Bobby,
>> >> >> >> >>>>
>> >> >> >> >>>> I am trying to add --debug-flags=Exec (building gem5.opt instead of
>> >> >> >> >>>> the gem5.fast I had), but the debug traces exceed 20GB (and the run
>> >> >> >> >>>> has not finished yet) for less than 1 simulated second. How can I
>> >> >> >> >>>> reduce the size of the trace (or set something more specific)?
>> >> >> >> >>>>
>> >> >> >> >>>> Instead, I built the HPCG benchmark with the DHPCG_DEBUG flag. If you
>> >> >> >> >>>> want, you can compare these two output files
>> >> >> >> >>>> (hpcg20010909T014640_SE_Mode & HPCG-Benchmark_3.1_FS_Mode). As you
>> >> >> >> >>>> can see, something goes wrong with the accuracy of calculations in FS
>> >> >> >> >>>> mode (the benchmark uses double precision). You can find the files
>> >> >> >> >>>> here:
>> >> >> >> >>>> http://kition.mhl.tuc.gr:8000/d/68d82f3533/
>> >> >> >> >>>>
>> >> >> >> >>>> Best regards,
>> >> >> >> >>>> Nikos
>> >> >> >> >>>>
>> >> >> >> >>>> Quoting Jason Lowe-Power <ja...@lowepower.com>:
>> >> >> >> >>>>
>> >> >> >> >>>>> That's quite odd that it works in SE mode but not FS mode!
>> >> >> >> >>>>>
>> >> >> >> >>>>> I would suggest running with --debug-flags=Exec for both and then
>> >> >> >> >>>>> perform a diff to see how they differ.
>> >> >> >> >>>>>
>> >> >> >> >>>>> Cheers,
>> >> >> >> >>>>> Jason
>> >> >> >> >>>>>
>> >> >> >> >>>>> On Tue, Sep 20, 2022 at 2:45 PM Νικόλαος Ταμπουρατζής <
>> >> >> >> >>>>> ntampourat...@ece.auth.gr> wrote:
>> >> >> >> >>>>>
>> >> >> >> >>>>>> Dear Bobby,
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> In QEMU I get the same (correct) results that I get in SE mode
>> >> >> >> >>>>>> simulation. I get invalid results in FS simulation (in both
>> >> >> >> >>>>>> riscv-fs.py and riscv-ubuntu-run.py). I cannot access real RISCV
>> >> >> >> >>>>>> hardware at this moment, however, if you want you may execute my
>> >> >> >> >>>>>> xhpcg binary (http://kition.mhl.tuc.gr:8000/f/4ca25fdd3c/) with the
>> >> >> >> >>>>>> following configuration:
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> ./xhpcg --nx=16 --ny=16 --nz=16 --npx=1 --npy=1 --npz=1 --rt=0.1
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> Please let me know if you have any updates!
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> Best regards,
>> >> >> >> >>>>>> Nikos
>> >> >> >> >>>>>>
>> >> >> >> >>>>>>
>> >> >> >> >>>>>> Quoting Jason Lowe-Power <ja...@lowepower.com>:
>> >> >> >> >>>>>>
>> >> >> >> >>>>>>> Hi Nikos,
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> I notice you said the following in your original email:
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>>> In addition, I used the RISCV Ubuntu image
>> >> >> >> >>>>>>>> (https://github.com/gem5/gem5-resources/tree/stable/src/riscv-ubuntu),
>> >> >> >> >>>>>>>> I installed the gcc compiler, compile it (through qemu) and I get
>> >> >> >> >>>>>>>> wrong results too.
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> Is this saying you get the wrong results in QEMU? If so, the bug is
>> >> >> >> >>>>>>> in GCC or the HPCG workload, not in gem5. If not, I would test in
>> >> >> >> >>>>>>> QEMU to make sure the binary works there. Another way you could test
>> >> >> >> >>>>>>> to see if the problem is your binary or gem5 would be to run it on
>> >> >> >> >>>>>>> real hardware. We have access to some RISC-V hardware here at UC
>> >> >> >> >>>>>>> Davis, if you don't have access to it.
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> Cheers,
>> >> >> >> >>>>>>> Jason
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>> On Tue, Sep 20, 2022 at 12:58 AM Νικόλαος Ταμπουρατζής <
>> >> >> >> >>>>>>> ntampourat...@ece.auth.gr> wrote:
>> >> >> >> >>>>>>>
>> >> >> >> >>>>>>>> Dear Bobby,
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> 1) I use the original riscv-fs.py which is provided in the latest
>> >> >> >> >>>>>>>> gem5 release.
>> >> >> >> >>>>>>>> I ran gem5 once (./build/RISCV/gem5.fast -d ./HPCG_FS_results
>> >> >> >> >>>>>>>> ./configs/example/gem5_library/riscv-fs.py) in order to download the
>> >> >> >> >>>>>>>> riscv-bootloader-vmlinux-5.10 and riscv-disk-img.
>> >> >> >> >>>>>>>> After this I mount the riscv-disk-img (sudo mount -o loop
>> >> >> >> >>>>>>>> riscv-disk-img /mnt), copy in the xhpcg executable, and make the
>> >> >> >> >>>>>>>> following changes in riscv-fs.py to boot the riscv-disk-img with the
>> >> >> >> >>>>>>>> executable:
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> image = CustomDiskImageResource(
>> >> >> >> >>>>>>>>     local_path="/home/cossim/.cache/gem5/riscv-disk-img",
>> >> >> >> >>>>>>>> )
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> # Set the Full System workload.
>> >> >> >> >>>>>>>> board.set_kernel_disk_workload(
>> >> >> >> >>>>>>>>     kernel=Resource("riscv-bootloader-vmlinux-5.10"),
>> >> >> >> >>>>>>>>     disk_image=image,
>> >> >> >> >>>>>>>> )
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> Finally, in gem5/src/python/gem5/components/boards/riscv_board.py
>> >> >> >> >>>>>>>> I change the last line to "return ["console=ttyS0",
>> >> >> >> >>>>>>>> "root={root_value}", "rw"]" in order to allow write permissions in
>> >> >> >> >>>>>>>> the image.
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> 2) After some iterations, the HPCG benchmark checks whether the
>> >> >> >> >>>>>>>> results are valid. In the case of FS it reports invalid results. As
>> >> >> >> >>>>>>>> I see from the results, one (at least) problem is that it produces
>> >> >> >> >>>>>>>> different results in each HPCG execution (with the same
>> >> >> >> >>>>>>>> configuration).
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> Here is the HPCG output and riscv-fs.py
>> >> >> >> >>>>>>>> (http://kition.mhl.tuc.gr:8000/d/68d82f3533/). You may reproduce the
>> >> >> >> >>>>>>>> results in the video if you use the xhpcg executable
>> >> >> >> >>>>>>>> (http://kition.mhl.tuc.gr:8000/f/4ca25fdd3c/)
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> Please help me solve it!
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> Finally, I get invalid results in the HPL benchmark in FS mode too.
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> Best regards,
>> >> >> >> >>>>>>>> Nikos
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> Quoting Bobby Bruce <bbr...@ucdavis.edu>:
>> >> >> >> >>>>>>>>
>> >> >> >> >>>>>>>> > I'm going to need a bit more information to help:
>> >> >> >> >>>>>>>> >
>> >> >> >> >>>>>>>> > 1. In what way have you modified
>> >> >> >> >>>>>>>> > ./configs/example/gem5_library/riscv-fs.py? Can you attach the
>> >> >> >> >>>>>>>> > script here?
>> >> >> >> >>>>>>>> > 2. What error are you getting or in what way are the results
>> >> >> >> >>>>>>>> > invalid?
>> >> >> >> >>>>>>>> >
>> >> >> >> >>>>>>>> > -
>> >> >> >> >>>>>>>> > Dr. Bobby R. Bruce
>> >> >> >> >>>>>>>> > Room 3050,
>> >> >> >> >>>>>>>> > Kemper Hall, UC Davis
>> >> >> >> >>>>>>>> > Davis,
>> >> >> >> >>>>>>>> > CA, 95616
>> >> >> >> >>>>>>>> >
>> >> >> >> >>>>>>>> > web: https://www.bobbybruce.net
>> >> >> >> >>>>>>>> >
>> >> >> >> >>>>>>>> >
>> >> >> >> >>>>>>>> > On Mon, Sep 19, 2022 at 1:43 PM Νικόλαος Ταμπουρατζής <
>> >> >> >> >>>>>>>> > ntampourat...@ece.auth.gr> wrote:
>> >> >> >> >>>>>>>> >
>> >> >> >> >>>>>>>> >>
>> >> >> >> >>>>>>>> >> Dear gem5 community,
>> >> >> >> >>>>>>>> >>
>> >> >> >> >>>>>>>> >> I have successfully cross-compiled the HPCG benchmark for RISCV
>> >> >> >> >>>>>>>> >> (serial version, without MPI and OpenMP). While it works properly
>> >> >> >> >>>>>>>> >> in gem5 SE mode (./build/RISCV/gem5.fast -d ./HPCG_SE_results
>> >> >> >> >>>>>>>> >> ./configs/example/se.py -c xhpcg --options '--nx=16 --ny=16
>> >> >> >> >>>>>>>> >> --nz=16 --npx=1 --npy=1 --npz=1 --rt=0.1'), I get invalid results
>> >> >> >> >>>>>>>> >> in FS simulation using "./build/RISCV/gem5.fast -d
>> >> >> >> >>>>>>>> >> ./HPCG_FS_results ./configs/example/gem5_library/riscv-fs.py" (I
>> >> >> >> >>>>>>>> >> mount the riscv image and put the executable in it).
>> >> >> >> >>>>>>>> >>
>> >> >> >> >>>>>>>> >> Can you help me please?
>> >> >> >> >>>>>>>> >>
>> >> >> >> >>>>>>>> >> In addition, I used the RISCV Ubuntu image
>> >> >> >> >>>>>>>> >> (https://github.com/gem5/gem5-resources/tree/stable/src/riscv-ubuntu),
>> >> >> >> >>>>>>>> >> installed the gcc compiler, compiled the benchmark (through qemu),
>> >> >> >> >>>>>>>> >> and I get wrong results too.
>> >> >> >> >>>>>>>> >>
>> >> >> >> >>>>>>>> >> Here is the Makefile which I use, the hpcg executable for RISCV
>> >> >> >> >>>>>>>> >> (xhpcg), and a video that shows the results
>> >> >> >> >>>>>>>> >> (http://kition.mhl.tuc.gr:8000/f/4ca25fdd3c/).
>> >> >> >> >>>>>>>> >>
>> >> >> >> >>>>>>>> >> P.S. I use the latest gem5 version.
>> >> >> >> >>>>>>>> >>
>> >> >> >> >>>>>>>> >> Thank you in advance! :)
>> >> >> >> >>>>>>>> >>
>> >> >> >> >>>>>>>> >> Best regards,
>> >> >> >> >>>>>>>> >> Nikos
>> >> >> >> >>>>>>>> >> _______________________________________________
>> >> >> >> >>>>>>>> >> gem5-users mailing list -- gem5-users@gem5.org
>> >> >> >> >>>>>>>> >> To unsubscribe send an email to gem5-users-le...@gem5.org
>> >> >> >> >>>>>>>> >>

--

---

Hoa Nguyen
PhD student
Department of Computer Science
University of California, Davis
2235 Kemper Hall
https://arch.cs.ucdavis.edu/people/hoa-nguyen




