Hi Axel,

> marton,
>
> are you trying to run the postprocessing on your local
> machine or on the IBM machine?

On the IBM machine. I had bad experiences with postprocessing on a different machine: because of the iotk package, converting binary files to text files and back is quite time consuming (and I hate copying gigabytes of files over ssh).

> that depends on what is causing this. it could just be that you
> have an integer overflow, due to the size of your system, or it
> could be that you try to read unformatted data on a different
> endian machine. i would suggest you insert a print statement into
> the code that prints out the values of DIRECT_IO_FACTOR and recl
> as well as unf_recl and then get back to us with the information
> about the architectures and these numbers (ideally also for the
> smaller test, where it worked).

Unfortunately, the espresso I'm using on BASSI was not compiled by me, and I was scared of compiling my own because I'm not sure it will be able to read the binary files written by an espresso that was probably built with different compilers and/or compiler options. Yeah, I know... I should have compiled my own version of Quantum ESPRESSO before running serious calculations to avoid these situations.

So I made some changes to diropn.f90 in espresso4.0/PW and compiled my own version of espresso to print the values below for the big run (with this build I get the same error). Honestly, I do not know much about this cluster, but I'm sure I'm using the XL Fortran compiler version 11.1.0.3 and the ESSL library 4.2.0.3.

    recl:             415578000
    DIRECT_IO_FACTOR: 8
    unf_recl:         -970343296

On my home cluster I used a parallel espresso-4.0.3 on an "Intel Xeon E5410 @ 2.33GHz, 16 GB RAM" system with ifort 10.1.015, Intel MKL 10.0.1.014 and openmpi-1.2.6, on a smaller but similar system (same pseudos, same cutoff, only gamma point). As I said, there is no "wrong record length" error there, and I got the following values:

    recl:             97079200
    DIRECT_IO_FACTOR: 8
    unf_recl:         776633600

If I'm right, 415578000*8 = 3324624000, which is bigger than the largest value of a signed 32-bit integer (2147483647). Maybe that causes the problem?

Thanks for your help,
Marton
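
A quick way to check the arithmetic above is sketched below in Python. It assumes that unf_recl is computed as DIRECT_IO_FACTOR * recl in a default 32-bit signed integer that wraps around on overflow (two's complement); the message does not confirm this, and the helper name `wrap_int32` is made up for illustration.

```python
# Sketch: emulate a 32-bit signed integer product to see whether
# DIRECT_IO_FACTOR * recl would overflow.
# Assumption: unf_recl = DIRECT_IO_FACTOR * recl, stored in a 32-bit
# two's-complement integer that silently wraps on overflow.

INT32_MAX = 2**31 - 1  # 2147483647
DIRECT_IO_FACTOR = 8


def wrap_int32(value: int) -> int:
    """Reduce an exact integer to the signed 32-bit range (hypothetical helper)."""
    return (value + 2**31) % 2**32 - 2**31


# recl values reported in the message for the two runs
runs = {
    "big run (BASSI)": 415578000,
    "small run (home cluster)": 97079200,
}

for label, recl in runs.items():
    exact = DIRECT_IO_FACTOR * recl   # exact product, no overflow in Python
    wrapped = wrap_int32(exact)       # what a 32-bit integer would hold
    overflow = exact > INT32_MAX
    print(f"{label}: exact={exact} wrapped={wrapped} overflow={overflow}")
```

Under that assumption the big run wraps to -970343296, which is exactly the unf_recl printed on BASSI, while the small run stays at 776633600 and fits without overflow.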
