> On Jul 9, 2022, at 4:55 PM, Nagle, Michael F <[email protected]> wrote:
>
> Thanks for your response, Rob. I will do my best to answer your questions.
> Please let me know if anything is unclear and more info would help. I
> appreciate your attention to this!
>
> This is a rather powerful Dell workstation running Ubuntu 22.04 LTS, with a
> 12-core Intel processor and 503GB RAM.
>
> I'm running as a user with admin privileges, but am not using sudo, so as I
> understand these should not be root processes.
>
> In short, we're running some custom Python code to analyze ~1.3GB
> hyperspectral images, do some linear algebra and output some plots and arrays
> describing the biochemical composition in these images. This is benchmarked
> to take 2-4GB of RAM per image. There is one image per job. By default,
> parallel is running 24 jobs, dual-threading on each of 12 cores... There
> should be plenty of RAM to run 24 4GB jobs at once. Since this is an
> embarrassingly parallel computation and we already use bash scripting in this
> workflow, I prefer to keep it simple and use GNU Parallel rather than Python
> parallel frameworks... it always worked great in the past.
>
> Here is the script I'm calling from the command line, inside the jobs file
> described further below: gmodetector_py/analyze_sample.py at master ·
> naglemi/gmodetector_py (github.com)
>
> # This is what we run to execute the .jobs file
> parallel -a $job_list_name
>
> # I have also tried limiting the number of jobs to 20, which also leads to
> the same crashing problem after a few runs.
> parallel --jobs 20 -a $job_list_name
>
> # Here is how we prepare the .jobs file. We produce one job per image, each
> given its own line in a text file, with options set by a bunch of variables
> in a Jupyter notebook. Note, I have also confirmed it still crashes if we run
> outside of Jupyter.
> for file in $data/*.hdr
> do
>     if [[ "$file" != *'hroma'* ]] && [[ "$file" != *'roadband'* ]]; then
>         echo "python wrappers/analyze_sample.py \
>         --file_path $file \
>         --fluorophores ${fluorophores[*]} \
>         --min_desired_wavelength ${desired_wavelength_range[0]} \
>         --max_desired_wavelength ${desired_wavelength_range[1]} \
>         --red_channel ${FalseColor_channels[0]} \
>         --green_channel ${FalseColor_channels[1]} \
>         --blue_channel ${FalseColor_channels[2]} \
>         --red_cap ${FalseColor_caps[0]} \
>         --green_cap ${FalseColor_caps[1]} \
>         --blue_cap ${FalseColor_caps[2]} \
>         --plot 1 \
>         --spectral_library_path "$spectral_library_path" \
>         --output_dir $output_directory_full \
>         --threshold 38" >> $job_list_name
>     fi
> done
>
> Thanks again!

Well, I’m shocked you’ve managed to crash the machine. I’m not at my desk just now, but I think you’ll need sudo to look for ‘panic’ reports in the syslog. Have you kept an eye on physical memory while the jobs run? And what does dmesg show?
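
In case it helps, here is a rough sketch of the checks I have in mind. It has not been run on your machine, so treat it as a starting point: `scan_log` and `mem_available_mib` are hypothetical helper names, the search patterns are guesses at what a panic or OOM kill might look like in the log, and the `/var/log/syslog` path assumes a stock Ubuntu install.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking kernel logs and free memory.

# scan_log [FILE]: print lines mentioning a kernel panic or the OOM killer.
# Reads standard input when no file is given.
scan_log() {
    grep -iE 'panic|out of memory|oom-killer|killed process' "${1:--}"
}

# mem_available_mib: MemAvailable from /proc/meminfo, converted to MiB.
mem_available_mib() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' /proc/meminfo
}

# Example usage, run by hand after a crash and reboot:
#   sudo cat /var/log/syslog | scan_log
#   sudo dmesg | scan_log
#   sudo journalctl -k -b -1 | scan_log   # previous boot; needs a persistent journal
#
# Log available memory every 5 seconds while the jobs run:
#   while sleep 5; do echo "$(date +%T) $(mem_available_mib) MiB"; done >> mem.log
```

If memory does turn out to be the culprit, GNU Parallel's `--memfree` option (for example `parallel --memfree 8G --joblog run.log -a $job_list_name`, where the 8G threshold and the log name are only illustrative) holds back new jobs until that much memory is free, and `--joblog` records which jobs finished before the crash.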
