Hi Polydoros,

There is a pending SVE O3 fix at:
https://gem5-review.googlesource.com/c/public/gem5/+/19173
Could you try with that fix and see if it works?

I believe the fix is not Ruby-specific; the problem may only show up 
with Ruby by coincidence.

We need to make some improvements to the patch before merging, but it 
could already help.

On 11/18/19 6:00 PM, Polydoros Petrakis wrote:
> Hi all,
> 
> I would like to report the following two findings in regard to ARM SVE + 
> Ruby + FS/SE:
> 
> 
> A) When trying to run C STREAM benchmark v5.10 
> (https://www.cs.virginia.edu/stream/FTP/Code/stream.c), compiled for ARM 
> SVE,
> 
> I get the following assert error:
> 
> gem5.opt: build/ARM/mem/packet.hh:1093: T* Packet::getPtr() [with T = 
> unsigned char]: Assertion `!isMaskedWrite()' failed.
> 
> 
> I have tried with both gcc/armclang:
> 
> a) aarch64-linux-gnu-gcc (GNU Toolchain for the A-profile Architecture 
> 8.2-2019.01 (arm-rel-8.28)) 8.2.1 20180802
> 
> e.g.: aarch64-linux-gnu-gcc -Ofast --static 
> -march=armv8.2-a+sve -DNTIMES=5 -DSTREAM_ARRAY_SIZE=200000 
> -I../gem5/include/ -I../gem5/include/gem5 stream.c -L./  -o 
> sve_stream.exe -lm5
> 
> b) armclang (arm-hpc-compiler/19.3)
> 
> 
> STREAM runs OK in the following three cases:
> 
> a) When I build without SVE flags.
> 
> b) When I comment out the assert (but this could lead to errors?).
> 
> c) Without Ruby/NoC (but we are currently interested in Ruby-NoC systems).
> 
> 
> The complete output is the following:
> 
> command line: ./build/ARM/gem5.opt --outdir=m5out/batch502_0001 
> configs/example/se.py --cpu-type=DerivO3CPU --caches --num-cpus=4 
> --num-dirs=4 --l2cache --mem-type=DDR4_2400_8x8 --num-l2caches=4 
> --mem-size=2GB --ruby --topology=Mesh_XY --mesh-rows=2 
> --network=garnet2.0 -c sve_stream.exe
> 
> Global frequency set at 1000000000000 ticks per second
> warn: DRAM device capacity (16384 Mbytes) does not match the address 
> range assigned (512 Mbytes)
> warn: DRAM device capacity (16384 Mbytes) does not match the address 
> range assigned (512 Mbytes)
> warn: DRAM device capacity (16384 Mbytes) does not match the address 
> range assigned (512 Mbytes)
> warn: DRAM device capacity (16384 Mbytes) does not match the address 
> range assigned (512 Mbytes)
> 0: system.remote_gdb: listening for remote gdb on port 7004
> 0: system.remote_gdb: listening for remote gdb on port 7005
> 0: system.remote_gdb: listening for remote gdb on port 7006
> 0: system.remote_gdb: listening for remote gdb on port 7007
> **** REAL SIMULATION ****
> info: Entering event queue @ 0.  Starting simulation...
> warn: Replacement policy updates recently became the responsibility of 
> SLICC state machines. Make sure to setMRU() near callbacks in .sm files!
> info: Increasing stack size by one page.
> gem5.opt: build/ARM/mem/packet.hh:1093: T* Packet::getPtr() [with T = 
> unsigned char]: Assertion `!isMaskedWrite()' failed.
> Program aborted at tick 24589000
> --- BEGIN LIBC BACKTRACE ---
> ./build/ARM/gem5.opt(_Z15print_backtracev+0x19)[0x55976cece499]
> ./build/ARM/gem5.opt(_Z12abortHandleri+0x3d)[0x55976cee013d]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7f12b2d18730]
> /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b)[0x7f12b1b0a7bb]
> /lib/x86_64-linux-gnu/libc.so.6(abort+0x121)[0x7f12b1af5535]
> /lib/x86_64-linux-gnu/libc.so.6(+0x2240f)[0x7f12b1af540f]
> /lib/x86_64-linux-gnu/libc.so.6(+0x30102)[0x7f12b1b03102]
> ./build/ARM/gem5.opt(_ZN9Sequencer12issueRequestEP6Packet15RubyRequestType+0x5e9)[0x55976c2c2119]
> ./build/ARM/gem5.opt(_ZN9Sequencer11makeRequestEP6Packet+0x203)[0x55976c2c2513]
> ./build/ARM/gem5.opt(_ZN8RubyPort12MemSlavePort13recvTimingReqEP6Packet+0x37f)[0x55976c2b299f]
> ./build/ARM/gem5.opt(_ZN7LSQUnitI9O3CPUImplE13trySendPacketEbP6Packet+0xd8)[0x55976d3e0c58]
> ./build/ARM/gem5.opt(_ZN3LSQI9O3CPUImplE16SplitDataRequest17sendPacketToCacheEv+0x6c)[0x55976d3d7f3c]
> ./build/ARM/gem5.opt(_ZN7LSQUnitI9O3CPUImplE15writebackStoresEv+0x862)[0x55976d3e7992]
> ./build/ARM/gem5.opt(_ZN3LSQI9O3CPUImplE15writebackStoresEv+0x84)[0x55976d3de564]
> ./build/ARM/gem5.opt(_ZN10DefaultIEWI9O3CPUImplE4tickEv+0x144)[0x55976d3ca994]
> ./build/ARM/gem5.opt(_ZN9FullO3CPUI9O3CPUImplE4tickEv+0x137)[0x55976d39f0b7]
> ./build/ARM/gem5.opt(_ZN10EventQueue10serviceOneEv+0xd9)[0x55976ced62e9]
> ./build/ARM/gem5.opt(_Z9doSimLoopP10EventQueue+0x77)[0x55976cef6157]
> ./build/ARM/gem5.opt(_Z8simulatem+0xccd)[0x55976cef718d]
> ./build/ARM/gem5.opt(+0x1e399ca)[0x55976cfdd9ca]
> ./build/ARM/gem5.opt(+0xcbd5f1)[0x55976be615f1]
> /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x60f2)[0x7f12b2e720a2]
> /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x732)[0x7f12b2e6b852]
> /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x62e9)[0x7f12b2e72299]
> /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x732)[0x7f12b2e6b852]
> /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x62e9)[0x7f12b2e72299]
> /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x732)[0x7f12b2e6b852]
> /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x62e9)[0x7f12b2e72299]
> /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x732)[0x7f12b2e6b852]
> /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCode+0x19)[0x7f12b2e6be69]
> /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6992)[0x7f12b2e72942]
> /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x732)[0x7f12b2e6b852]
> --- END LIBC BACKTRACE ---
> Aborted (core dumped)
> 
> 
> B) Additionally, when I disable the previous assert and run:
> 
> FS with multi-threaded SVE-Stream + OpenMP + DerivO3CPU + 2x2 Mesh + 
> garnet2.0 + MOESI_hammer/MOESI_CMP_directory + (4 cpu / 4 dirs / 4 
> L2-caches),
> 
> I observe:
> 
> a) Atomicity (+ OpenMP) seems problematic?
> 
> The following code (from stream.c) prints garbage values:
> 
> #pragma omp atomic
>             k++;
>         printf ("Number of Threads counted = %i\n",k);
> 
> b) The STREAM benchmark finishes, but the results contain errors 
> when OMP_NUM_THREADS > 1.
> 
> I guess (a) could explain (b). The results get verified for some array 
> sizes (e.g.: 200,000), for 1/2/4 threads.
> 
> 
> When running multi-threaded OpenMP STREAM without SVE, all results 
> get verified (e.g.: array sizes = [43690, 252134, 243431, 151690, 
> 200000] for 1/2/4 threads), although problem (a) persists.
> 
> STREAM + SVE (+ OpenMP) finishes with errors even with the classic memory 
> system (without Ruby). I could do more runs to verify this, if needed.
> 
> I would like to ask about the current status of atomicity in Ruby, as 
> well as atomicity in regard to SVE.
> 
> (I have seen the relevant status matrix at 
> http://www.gem5.org/Status_Matrix, but maybe it is outdated?)
> 
> Any information or help to resolve these issues is welcome.
> 
> 
> Thanks in advance,
> 
> Polydoros Petrakis
> 
> -------
> 
> Institute of Computer Science (ICS),
> 
> Foundation for Research & Technology - Hellas (FORTH),
> 
> Heraklion, Crete, Greece.
> 
> 
> _______________________________________________
> gem5-dev mailing list
> gem5-dev@gem5.org
> http://m5sim.org/mailman/listinfo/gem5-dev