You’ll need to show a log of some kind for anyone to really nail this down.
rjs

> On Jul 9, 2022, at 7:12 PM, Rob Sargent <robjsarg...@gmail.com> wrote:
>
>>> On Jul 9, 2022, at 4:55 PM, Nagle, Michael F
>>> <michael.na...@oregonstate.edu> wrote:
>>
>> Thanks for your response, Rob. I will do my best to answer your questions.
>> Please let me know if anything is unclear and more info would help. I
>> appreciate your attention to this!
>>
>> This is a rather powerful Dell workstation running Ubuntu 22.04 LTS, with a
>> 12-core Intel processor and 503GB RAM.
>>
>> I'm running as a user with admin privileges, but am not using sudo, so as I
>> understand these should not be root processes.
>>
>> In short, we're running some custom Python code to analyze ~1.3GB
>> hyperspectral images, do some linear algebra and output some plots and
>> arrays describing the biochemical composition in these images. This is
>> benchmarked to take 2-4GB of RAM per image. There is one image per job. By
>> default, parallel is running 24 jobs, dual-threading on each of 12 cores...
>> There should be plenty of RAM to run 24 4GB jobs at once. Since this is an
>> embarrassingly parallel computation and we already use bash scripting in
>> this workflow, I prefer to keep it simple and use GNU Parallel rather than
>> Python parallel frameworks... it always worked great in the past.
>>
>> Here is the script I'm calling from the command line, inside the jobs file
>> described further below: gmodetector_py/analyze_sample.py at master ·
>> naglemi/gmodetector_py (github.com)
>>
>> # This is what we run to execute the .jobs file
>> parallel -a $job_list_name
>>
>> # I have also tried limiting the number of jobs to 20, which also leads to
>> # the same crashing problem after a few runs.
>> parallel --jobs 20 -a $job_list_name
>>
>> # Here is how we prepare the .jobs file. We produce one job per image, each
>> # given its own line in a text file, with options set by a bunch of variables
>> # in a Jupyter notebook. Note, I have also confirmed it still crashes if we
>> # run outside of Jupyter.
>> for file in $data/*.hdr
>> do
>>   if [[ "$file" != *'hroma'* ]] && [[ "$file" != *'roadband'* ]]; then
>>     echo "python wrappers/analyze_sample.py \
>>     --file_path $file \
>>     --fluorophores ${fluorophores[*]} \
>>     --min_desired_wavelength ${desired_wavelength_range[0]} \
>>     --max_desired_wavelength ${desired_wavelength_range[1]} \
>>     --red_channel ${FalseColor_channels[0]} \
>>     --green_channel ${FalseColor_channels[1]} \
>>     --blue_channel ${FalseColor_channels[2]} \
>>     --red_cap ${FalseColor_caps[0]} \
>>     --green_cap ${FalseColor_caps[1]} \
>>     --blue_cap ${FalseColor_caps[2]} \
>>     --plot 1 \
>>     --spectral_library_path "$spectral_library_path" \
>>     --output_dir $output_directory_full \
>>     --threshold 38" >> $job_list_name
>>   fi
>> done
>>
>> Thanks again!
>
> Well, I’m shocked you’ve managed to crash the machine. Not at my desk just
> now but I think you’ll need sudo to look for ‘panic’ reports in the syslog.
>
> Have you kept an eye on physical memory? dmesg?
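
For the log and memory questions above, something along these lines might be
a starting point. It is only a rough sketch: the function names are made up
for illustration, and the grep patterns are an assumed, non-exhaustive list of
the strings the kernel tends to print on out-of-memory kills and panics. Keep
in mind that dmesg only covers the current boot, so after a crash you would
want a saved log from the previous boot instead.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of GNU Parallel or of the gmodetector_py repo.

# Print (with line numbers) any lines in a saved kernel log that look like
# out-of-memory kills or kernel panics. The pattern list is an assumption.
scan_kernel_log() {
    local log_file="$1"
    grep -inE 'out of memory|oom-killer|killed process|kernel panic' "$log_file"
}

# Print available physical memory in MiB (truncated to an integer).
# Reads /proc/meminfo by default (Linux only); a path can be passed instead.
mem_available_mib() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "$meminfo"
}
```

To use it, save the previous boot's kernel messages first, for example with
`sudo journalctl -k -b -1 > prev_boot.log` (this assumes the journal is kept
across reboots), then run `scan_kernel_log prev_boot.log`. Calling
`mem_available_mib` in a loop from a second terminal while the jobs run would
show whether free memory really stays as high as the 2-4GB-per-job benchmark
suggests. Adding `--joblog` to the parallel call would also record which jobs
had started and finished before the crash.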