Keywords: runtime error, package check, 32 bit architectures, large files
This is the second of two reports on CRAN check problems that I found in my package and that affect only some particular architectures (in this case, x86_32).

Problem description:

When compiling a package with C++ source code using Rcpp on a Linux system (kernel 5.19.16-100, Fedora 35), the generated package passed the R CMD check --as-cran test with no compilation warnings and no execution errors. Nevertheless, the runtime tests on the CRAN server raised an error exclusively for the x86_32 architecture (found mostly in old PCs).

Let's suppose you have stored a variable of type unsigned long long at the end of a binary file. You think you can read it with:

    unsigned long long endofbindata;
    std::string fname="yourfilename";
    std::ifstream f(fname.c_str());
    f.seekg(-sizeof(unsigned long long),std::ios::end);
    f.read((char *)&endofbindata,sizeof(unsigned long long));

and indeed you can, but ONLY on 64-bit architectures. The function seekg does not work as expected on 32-bit architectures, since its first parameter (offset) is of type streamoff, and the conversion of the argument to that type does not seem to behave equally with g++ on 32- and 64-bit architectures. On 32 bits it provokes over/underflow and absurd results at execution time EVEN IF THE FILE is smaller than 2^32 bytes. At compile time, even on a 32-bit computer, no error or warning is raised, so you don't notice the problem.
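One possible explanation for the behaviour described above, offered as an assumption rather than a confirmed diagnosis: sizeof yields an unsigned size_t, so -sizeof(unsigned long long) wraps around to a large positive number before it is converted to the signed std::streamoff. With a 64-bit size_t the conversion typically gives -8 again; with a 32-bit size_t and a 64-bit streamoff it would give 4294967288, that is, a seek far past the end of the file. The sketch below avoids this by converting to std::streamoff before negating. ReadTrailingValue is a hypothetical name used only for illustration, not a function from the package.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not part of the original package): reads the
// unsigned long long stored in the last bytes of a binary file.
// Returns false if the file cannot be opened, positioned, or read.
bool ReadTrailingValue(const std::string& fname, unsigned long long& value)
{
    std::ifstream f(fname.c_str(), std::ios::binary);
    if (!f.is_open())
        return false;

    // Convert to the signed offset type BEFORE negating, so the offset
    // is -sizeof(value) whatever the width of size_t on this platform.
    const std::streamoff back = -static_cast<std::streamoff>(sizeof(value));
    f.seekg(back, std::ios::end);
    if (!f)
        return false;   // e.g. the file is shorter than sizeof(value)

    f.read(reinterpret_cast<char*>(&value), sizeof(value));
    return static_cast<bool>(f);   // true only if the read succeeded
}
```

Called as ReadTrailingValue("yourfilename", endofbindata), it returns true when the value was read. The signed cast is the only essential change with respect to the snippet above; opening with std::ios::binary and checking the stream state are additions of this sketch.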
My solution has been:

Make a more portable function to get the size of a file using the stat system call, like:

    // Needs #include <sys/stat.h> and #include <string>
    unsigned long long GetFileSize(std::string fname)
    {
        struct stat stat_buf;
        int rc = stat(fname.c_str(), &stat_buf);
        if (rc != 0)
        {
            // stat can also fail for other reasons (e.g. the file does not exist)
            std::string err="Cannot obtain information (with stat system call) of file "+fname+"\n";
            err += "This is probably because you are running this in a 32-bit architecture and the file is bigger than 2 GB.\n";
            err += "Unfortunately, we have not yet found a solution for that and, if you need to manage such big files,\n";
            err += "probably you should consider using a 64-bit architecture.\n";
            Rcpp::stop(err);
            // NOTE: maybe compiling with -D_FILE_OFFSET_BITS=64 (which makes glibc define
            // __USE_FILE_OFFSET64) could solve this, but it might provoke other problems...
        }
        else
            return ((unsigned long long)stat_buf.st_size);
    }

According to the stat manual, stat returns this error:

    EOVERFLOW
        pathname or fd refers to a file whose size, inode number, or number of blocks cannot be represented in, respectively, the types off_t, ino_t, or blkcnt_t. This error can occur when, for example, an application compiled on a 32-bit platform without -D_FILE_OFFSET_BITS=64 calls stat() on a file whose size exceeds (1<<31)-1 bytes.

Having done that, use the returned number (if the call succeeded) to go there (or there minus an offset) with an f.seekg call.

As you see, I have not found a real solution, but at least this warns the user about the problem of using large files on 32-bit architectures. This should now be infrequent in practice, since fewer 32-bit computers remain in use every day, but since CRAN still checks with them I have preferred to document it, in case anyone else may benefit from the information.

Juan

-- 
________________________________________________________________
Juan Domingo Esteve
Dept. of Informatics, School of Engineering
University of Valencia
Avda. de la Universidad, s/n.
46100-Burjasot (Valencia) SPAIN
Telephone: +34-963543572
Fax: +34-963543550
email: juan.domi...@uv.es
________________________________________________________________

_______________________________________________
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel