Hi Islam, I like the definition of 95% hard real time; it suits my needs. Thanks for this good paper.
On Monday, June 6, 2016 at 18:45:35 UTC+2, Islam Badreldin wrote:
>
> Hi John,
>
> I am currently pursuing a similar effort. I got a GPIO pin on the BeagleBone
> Black embedded board toggling in hard real time and verified the jitter with
> an oscilloscope. For that, I used a vanilla Linux 4.4.11 kernel with the
> PREEMPT_RT patch applied. I also released an initial version of a Julia
> package that wraps the clock_nanosleep() and clock_gettime() functions from
> the POSIX real-time extensions. Please see this other thread:
> https://groups.google.com/forum/#!topic/julia-users/0Vr2rCRwJY4
>
> I tested that package both on an Intel-based laptop and on the BeagleBone
> Black. I am giving some of the relevant details below.
>
> On Monday, June 6, 2016 at 5:41:29 AM UTC-4, John leger wrote:
>>
>> Since it seems you have a good overview of this domain, I will give more
>> details:
>> We are working in signal processing, and especially in image processing.
>> The goal here is just the adaptive optics: we only want to stabilize the
>> image, not produce the final image.
>> The consequence is that we will not store anything on the hard drive: we
>> read an image, process it and discard it. We stay in RAM all the time.
>> The processing is done with our own algorithms, so for now there is no
>> need for any external library (and I don't see any reason for that to
>> change).
>>
>> First I would like to apologize: just after posting my answer I went to
>> Wikipedia to look up the difference between soft and hard real time.
>> I should have done it before, so that you didn't have to spend more time
>> explaining.
>>
>> In the end I still don't know whether I need hard or soft real time: the
>> timing is set by the camera speed, and the processing has to be done
>> between the acquisition of two images.
>> We don't want to miss an image or delay the processing; I still need to
>> clarify the consequences of a delay or of a missed image.
>> For now, let's just say that we can miss some images, so we want soft
>> real time.
>>
>
> The real-time performance you are after could be 95% hard real-time. See
> e.g. here: https://www.osadl.org/fileadmin/dam/rtlws/12/Brown.pdf
>
>> I'm making a benchmark that should match the system in terms of
>> complexity; these are my first remarks:
>>
>> When you say that one allocation is unacceptable, it is shockingly true:
>> in my case I had 2 allocations caused by
>> A += 1, where A is an array,
>> and in 7 seconds I had 600k allocations.
>> Moral: in a closed loop you cannot accept any allocation, so you have to
>> write all the loops explicitly.
>>
>
> Yes, try to completely avoid memory allocations while developing your own
> algorithms in Julia. Pre-allocation and in-place operations are your
> friends! The example script available in the POSIXClock package is one way
> to do this
> (https://github.com/ibadr/POSIXClock.jl/blob/master/examples/rt_histogram.jl).
> The real-time section of the code is marked by a ccall to mlockall() in
> order to cause immediate failure upon memory allocations in the real-time
> section. You can also use the --track-allocation option to hunt down memory
> allocations while developing your algorithm. See e.g.
> http://docs.julialang.org/en/release-0.4/manual/profile/#man-track-allocation

I discovered --track-allocation not so long ago and it is a good tool. For now I think I will rely on tracking allocations manually. I am a little afraid of using mlockall(): in soft or hard real time, crashing (failing) is not a good option for me...
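To make sure I have the pre-allocation / in-place style right, this is roughly how I am writing the kernels now (a minimal sketch of my own, not taken from the rt_histogram.jl example; the buffer names and the frame size are made up):

    # Pre-allocate every buffer once, outside the processing loop.
    const N     = 512 * 512                  # made-up frame size
    const frame = zeros(Float32, N)          # input image, refilled on each acquisition
    const work  = zeros(Float32, N)          # scratch buffer, reused for every frame

    # In-place kernel: it writes into an existing buffer, so the loop allocates nothing.
    function scale!(out::Vector{Float32}, a::Vector{Float32}, gain::Float32)
        @simd for i in 1:length(out)
            @inbounds out[i] = gain * a[i]
        end
        return nothing
    end

    # By contrast, `a += 1` builds a brand-new array on every call; the explicit
    # loop below updates the existing buffer instead.
    function add_one!(a::Vector{Float32})
        @simd for i in 1:length(a)
            @inbounds a[i] += 1.0f0
        end
        return nothing
    end

    scale!(work, frame, 2.0f0)
    add_one!(work)

Running such a script with julia --track-allocation=user should then report 0 in the .mem file next to those loop bodies, which is the manual check I had in mind.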
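On the clock_nanosleep() side, the acquisition loop I have in mind looks roughly like the sketch below. This is my own rough, untested illustration that ccalls libc directly rather than going through the POSIXClock API; process_frame!() is a hypothetical placeholder for the per-image work, and I am assuming a 64-bit Linux system (both timespec fields are 64-bit) with a glibc recent enough to provide clock_nanosleep in libc:

    # Periodic loop paced by absolute deadlines (CLOCK_MONOTONIC + TIMER_ABSTIME),
    # so the time spent processing one frame does not drift into the next deadline.
    immutable Timespec
        sec::Clong        # time_t
        nsec::Clong       # long
    end

    const CLOCK_MONOTONIC = Cint(1)   # Linux values from <time.h>
    const TIMER_ABSTIME   = Cint(1)

    function run_periodic(period_ns::Int, niter::Int)
        tnow = Ref(Timespec(0, 0))
        ccall(:clock_gettime, Cint, (Cint, Ref{Timespec}), CLOCK_MONOTONIC, tnow)
        sec, nsec = tnow[].sec, tnow[].nsec
        deadline = Ref(Timespec(0, 0))
        for k in 1:niter
            # advance the absolute deadline by one camera period
            nsec += period_ns
            sec  += div(nsec, 1_000_000_000)
            nsec  = rem(nsec, 1_000_000_000)
            deadline[] = Timespec(sec, nsec)
            # sleep until the deadline, then do the per-frame work
            ccall(:clock_nanosleep, Cint, (Cint, Cint, Ref{Timespec}, Ptr{Void}),
                  CLOCK_MONOTONIC, TIMER_ABSTIME, deadline, C_NULL)
            # process_frame!(...)   # hypothetical: read, stabilize, discard the image
        end
        return nothing
    end

    run_periodic(10_000_000, 1000)   # e.g. a 10 ms period (100 Hz camera), 1000 frames

Pacing on absolute deadlines means the time spent processing one image does not shift the deadline of the next one.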
Since you are talking about --track-allocation, I have a question. This is what I get in the .mem output file:

        - function deflat(v::globalVar)
        0     @simd for i in 1:v.len_sub
        0         @inbounds v.sub_imagef[i] = v.flat[i]*v.image[i]
        -     end
        -
        0     @simd for i in 1:v.len_ref
        0         @inbounds v.ref_imagef[i] = v.flat[i]*v.image[i]
        -     end
        0     return
        - end
        -
        - # get min max
        - # apply norm_coef
        - # MORE TO DO HERE
        - function normalization(v::globalVar)
        0     min::Float32 = Float32(4095)
        0     max::Float32 = Float32(0)
        0     tmp::Float32 = Float32(0)
        0     norm_fact::Float32 = Float32(0)
        0     norm_coef::Float32 = Float32(0)
        -     # find min max
        0     @simd for i in 1:v.nb_mat
        0         # Doing something with no allocs
        0     end
        0 end
        0
  1226415 # SAD[70] 16x16 de Ref_Image sur Sub_Image[60]
        - function correlation_SAD(v::globalVar)
        0
        - end
        -

At the end of normalization I have no allocations, yet in front of the SAD comment, right before the empty correlation_SAD function, there are 1226415 allocations. It would be logical for these allocations to have happened inside normalization, but why are they reported here, between two functions?

>> I have two problems now:
>>
>> 1/ Many times, the first run, which includes compilation, was the fastest,
>> and then every other run was slower by a factor of 2.
>> 2/ If I relaunch the main function (which lives in a module) many times,
>> some runs are very different (slower) from the previous ones.
>>
>> About 1/, although I find it strange, I don't really care.
>> 2/ is far more problematic: once the code is compiled I want it to behave
>> the same no matter how many times it is launched.
>> I have some ideas why, but no certainty. What bothers me the most is that
>> all the runs in the benchmark become slower; it is not a temporary
>> slowdown, the whole current benchmark stays slower.
>> If I launch it again, it is back to the best performance.
>>
>> Thank you for the links, they are very interesting and I will keep them
>> in mind.
>>
>> Note: I disabled hyperthreading and overclocking, so it should not be the
>> CPU doing funky things.
>>
>
> Regarding these two issues, I encountered similar ones. Are you running on
> an Intel-based computer? I had to do many tweaks to get to acceptable
> real-time performance with Intel processors. Many factors could be at
> play. As you said, you have to make sure hyper-threading is disabled and
> not to overclock the processor. Also, monitor the kernel dmesg log for any
> errors or warnings regarding RT throttling or local_softirq_pending.
>
> Additionally, I had to use the following options on the Linux command line
> (pass them from the bootloader):
>
> intel_idle.max_cstate=0 processor.max_cstate=0 idle=poll
>
> together with removing the intel_powerclamp kernel module (sudo rmmod
> intel_powerclamp). Caution: be extremely careful with such a configuration,
> as it disables many power-saving features in the processor and can
> potentially overheat it. Keep an eye on the kernel dmesg log and try to
> monitor the CPU temperature.
>
> I also found it useful to isolate one CPU core using the isolcpus=1 kernel
> command-line option and then set the affinity of the real-time Julia
> process so that it runs on that isolated CPU (using the taskset command).
> This way, you can almost guarantee that the Linux kernel and all other
> user-space processes will not run on that isolated CPU, so it becomes
> wholly dedicated to running the real-time Julia process. I am planning to
> post more details to the POSIXClock package in the near future.
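About the isolated CPU: if I understand correctly, the affinity could also be set from inside the Julia script instead of launching it through taskset -c 1. A rough, untested sketch of my own (it assumes Linux on a little-endian x86 machine and that core 1 was reserved with isolcpus=1):

    # Pin this process to the isolated core, doing from Julia what
    # `taskset -c 1 julia script.jl` does from the shell.
    const CPU_SET_BYTES = 128            # sizeof(cpu_set_t) in glibc

    function pin_to_core(core::Int)
        mask = zeros(UInt8, CPU_SET_BYTES)
        mask[div(core, 8) + 1] |= UInt8(1) << rem(core, 8)   # set bit `core`
        ret = ccall(:sched_setaffinity, Cint, (Cint, Csize_t, Ptr{UInt8}),
                    0, sizeof(mask), mask)                   # pid 0 = this process
        ret == 0 || error("sched_setaffinity failed")
        return nothing
    end

    pin_to_core(1)

Launching through taskset does the same thing and is probably simpler; this is just to keep everything in one Julia script.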
I am indeed on an Intel processor, and thanks for all the tips. I will first try to isolate a CPU, and then disable the Intel power-saving options.

> Best,
> Islam

Again, thanks a lot for all the help.
