Hi,

      I figured it was about time I gave pocl a try with my physics
simulation code.  I've been using Intel's OpenCL library for computing on
Cray systems with Xeon CPUs.
       Today I built pocl (today's git master) on a Cray XC40
using clang+llvm-7.0.0-x86_64-linux-sles12.3.
       I was able to run a simple Hello World kernel as well as clinfo.
When running my physics application at the necessary scale, I'm seeing about
0.2% of clBuildProgram calls fail with a segfault, all with a common stack
signature (pasted below).
       I'm not sure why this would be so intermittent.  I've tried reducing
to one process per compute node, so that only one clBuildProgram would be
executing on a given node at a time.  In this testing, that leaves 90
processes doing the same program compile simultaneously in the same working
directory.  Is pocl or clang trying to write anything to the working
directory?  In my restricted case, /tmp is private to each compute node and
thus to each process.
     Googling for similar stack frames, I found one report that may
well be the same bug:
https://www.mail-archive.com/[email protected]/msg28677.html
https://bugs.llvm.org/show_bug.cgi?id=39833

    "poclcc" succeeds with the same OpenCL kernel source.  I assume
I'd need to run it hundreds of times, perhaps in parallel, to have a chance
of triggering the same bug.
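
    For what it's worth, here is a rough sketch of the kind of harness I
have in mind for that: run one command many times, several at once, and
tally the exit statuses.  Nothing here is pocl-specific; the function names,
the run and worker counts, and the poclcc command line in the trailing
comment are placeholders of mine, not something I have verified against
this build.

```python
import subprocess
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_once(cmd, cwd=None):
    """Run one command and return its exit status.

    On POSIX, subprocess reports death by signal N as return code -N,
    so a segfault (SIGSEGV, signal 11) shows up as -11.
    """
    proc = subprocess.run(cmd, cwd=cwd,
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    return proc.returncode


def stress(cmd, runs=200, workers=16, cwd=None):
    """Run `cmd` `runs` times, up to `workers` at once; tally exit statuses."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = pool.map(lambda _: run_once(cmd, cwd), range(runs))
        return Counter(codes)


# Hypothetical usage (file names and flags are assumptions; adjust them
# to the poclcc build in use):
#
#   tally = stress(["poclcc", "-o", "kernel.bin", "kernel.cl"])
#   print(tally)   # a negative key would mean a run was killed by a signal
```

    Leaving cwd unset keeps every run in the same working directory, which
is the situation I have on the compute nodes.  Note that all runs would then
write the same output file, so any failures would still need a look at the
stack before blaming the same bug.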

      Any advice would be appreciated.  Now that I've thought through the
situation, I think I should probably create an account and add a me-too
to the discussion of LLVM bug 39833.

Cheers,

Noah Reddell


  WmResidentPatchProcessor::WmResidentPatchProcessor(WmComputeProgram*, boost::shared_ptr<WmComputeAssignment const>, std::vector<boost::shared_ptr<WmSubDomain const>, std::allocator<boost::shared_ptr<WmSubDomain const> > > const&, WmComputeMachine&)@wmresidentpatchprocessor.cc:358
  [email protected]:37
  compile_and_link_program@pocl_build.c:624
  pocl_llvm_build_program@pocl_llvm_build.cc:489
  clang::CompilerInstance::ExecuteAction(clang::FrontendAction&)@0x2aaaabebfd07
  clang::FrontendAction::Execute()@0x2aaaabf1c106
  clang::PrintPreprocessedAction::ExecuteAction()@0x2aaaabf22328
  clang::DoPrintPreprocessedInput(clang::Preprocessor&, llvm::raw_ostream*, clang::PreprocessorOutputOptions const&)@0x2aaaabf51226
  clang::Preprocessor::EnterMainSourceFile()@0x2aaaacc1cabc
  clang::Preprocessor::EnterSourceFile(clang::FileID, clang::DirectoryLookup const*, clang::SourceLocation)@0x2aaaacbf7407
  (anonymous namespace)::PrintPPOutputPPCallbacks::FileChanged(clang::SourceLocation, clang::PPCallbacks::FileChangeReason, clang::SrcMgr::CharacteristicKind, clang::FileID)@0x2aaaabf5212d
  clang::SourceManager::getPresumedLoc(clang::SourceLocation, bool) const@0x2aaaacc4e00e
  clang::SourceManager::getLineNumber(clang::FileID, unsigned int, bool*) const@0x2aaaacc4e43a
  *ComputeLineNumbers*(clang::DiagnosticsEngine&, clang::SrcMgr::ContentCache*, llvm::BumpPtrAllocatorImpl<llvm::MallocAllocator, 4096ul, 4096ul>&, clang::SourceManager const&, bool&)@0x2aaaacc4e683
_______________________________________________
pocl-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pocl-devel
