Re: [Numpy-discussion] example reading binary Fortran file

Neil Martinsen-Burrell Fri, 29 May 2009 19:56:45 -0700

On 2009-05-29 10:12 , David Froger wrote:

I think the FortranFile class is not intended to read arrays written
with the syntax 'write(11) array1, array2, array3'  (correct me if I'm
wrong).  This is the use in the laboratory where I'm currently
completing a phd.

You're half wrong. FortranFile can read arrays written as above, but it sees them as a single real array. So, with the attached Fortran program::


In [1]: from fortranfile import FortranFile

In [2]: f = FortranFile('uxuyp.bin', endian='<') # Original bug was incorrect byte order


In [3]: u = f.readReals()

In [4]: u.shape
Out[4]: (20,)

In [5]: u
Out[5]:
array([ 101.,  111.,  102.,  112.,  103.,  113.,  104.,  114.,  105.,
        115.,  201.,  211.,  202.,  212.,  203.,  213.,  204.,  214.,
        205.,  215.], dtype=float32)

In [6]: ux = u[:10].reshape(2,5); uy = u[10:].reshape(2,5)

In [7]: p = f.readReals().reshape(2,5)

In [8]: ux, uy, p
Out[8]:
(array([[ 101.,  111.,  102.,  112.,  103.],
       [ 113.,  104.,  114.,  105.,  115.]], dtype=float32),
 array([[ 201.,  211.,  202.,  212.,  203.],
       [ 213.,  204.,  214.,  205.,  215.]], dtype=float32),
 array([[ 301.,  311.,  302.,  312.,  303.],
       [ 313.,  304.,  314.,  305.,  315.]], dtype=float32))
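
For anyone reading without the attachment, the record layout that makes this work can be sketched with the standard library alone. This is an illustration, not code from fortranfile.py: it fakes in memory the bytes that `write(11) ux, uy` would produce (assuming 4-byte record markers, 4-byte reals and little-endian order), reads the record back, and splits the flat payload in two. The helper name `read_record` is made up for this example.

```python
import io
import struct

def read_record(stream, endian='<', header='i'):
    """Return the payload of one Fortran sequential unformatted record."""
    hdr_fmt = endian + header
    hdr_size = struct.calcsize(hdr_fmt)
    # Leading marker: length of the payload in bytes.
    (length,) = struct.unpack(hdr_fmt, stream.read(hdr_size))
    payload = stream.read(length)
    # Trailing marker: must repeat the same length.
    (check,) = struct.unpack(hdr_fmt, stream.read(hdr_size))
    if check != length:
        raise IOError('record markers disagree: %d != %d' % (length, check))
    return payload

# Fake the bytes of one record holding two 10-element real arrays.
ux = [float(100 + i) for i in range(10)]
uy = [float(200 + i) for i in range(10)]
payload = struct.pack('<20f', *(ux + uy))
marker = struct.pack('<i', len(payload))
stream = io.BytesIO(marker + payload + marker)

data = read_record(stream)
values = struct.unpack('<%df' % (len(data) // 4), data)
ux_back, uy_back = values[:10], values[10:]   # split the single flat array
```

One caveat on the reshapes in the session above: Fortran stores arrays column by column, so if the arrays were declared as `ux(2,5)` in the Fortran program, `u[:10].reshape((2, 5), order='F')` may match the Fortran indexing better than the default C-order reshape.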

What doesn't currently work is to have arrays of mixed types in the same write statement, e.g.


integer :: index(10)
real :: x(10,10)
...
write(13) x, index
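
Until FortranFile supports this directly, one possible workaround is to fetch the raw bytes of the record and unpack them with a single struct format. The sketch below assumes default 4-byte reals and integers and little-endian data. The payload is fabricated in memory here; in practice it would come from `readRecord()`. Because Fortran stores `x(10,10)` in column-major order, the 100 reals arrive column by column.

```python
import struct

# Stand-in for the payload of `write(13) x, index`:
# 100 reals (x, column-major) followed by 10 integers (index).
x_flat = [float(i) for i in range(100)]
index = list(range(1, 11))
payload = struct.pack('<100f10i', *(x_flat + index))

# One format string describes the whole mixed record.
fields = struct.unpack('<100f10i', payload)
x_back = fields[:100]
index_back = fields[100:]

def x_at(i, j):
    """Return x(i, j) using 1-based Fortran indices (column-major layout)."""
    return x_back[(i - 1) + 10 * (j - 1)]
```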

To address the original problem, I've changed the code to default to the native byte-ordering (f.ENDIAN='@') and to be more informative about what happened in the error. In the latest version (attached):


In [1]: from fortranfile import FortranFile

In [2]: f = FortranFile('uxuyp.bin', endian='>') # incorrect endian-ness

In [3]: u = f.readReals()

IOError: Could not read enough data.  Wanted 1342177280 bytes, got 132

and hopefully when people see crazy big numbers like 1.34e9 they will think of byte order problems.
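
The number in that error message can be reproduced by hand. The first record holds 20 four-byte reals, so its marker is 80; the same four bytes read in the wrong byte order give 1342177280. The sketch below shows this, plus a simple heuristic for guessing the byte order. `guess_endian` is a hypothetical helper, not something in fortranfile.py, and it assumes 4-byte record markers.

```python
import struct

# Record marker for 20 float32 values (80 bytes), written little-endian.
marker = struct.pack('<i', 80)

# Reading those same four bytes as big-endian gives the "crazy big number".
(wrong,) = struct.unpack('>i', marker)

def guess_endian(head, file_size):
    """Guess the byte order from the first four bytes of the file.

    Picks the first order whose record length could fit in the file
    (payload plus two 4-byte markers).  Returns '<', '>' or None.
    This is only a heuristic; a few lengths are plausible either way.
    """
    for endian in '<>':
        (length,) = struct.unpack(endian + 'i', head[:4])
        if 0 <= length <= file_size - 8:
            return endian
    return None
```

For the 136-byte file in this thread, `guess_endian(marker, 136)` picks `'<'`, since a 1.34e9-byte record cannot fit.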

I'm going to dive into struct, FortranFile, etc. to propose something
convenient for people who have to read unformatted binary Fortran files
very often.

Awesome! The thoughts banging around in my head right now are that some sort of mini-language that encapsulates the content of the declarations and the write statements should allow one to tease out exactly which struct call will unpack the right information. f2py has some Fortran parsing capabilities, so you might be able to use the Fortran itself as the mini-language. Something like


spec = fortranfile.OutputSpecification(\
"""real(4),dimension(2,5):: ux,uy
write(11) ux,uy""")
ux, uy = fortranfile.FortranFile('uxuyp.bin').readSpec(spec)
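
As a rough illustration of the idea, without any Fortran parsing, a hand-written specification could already drive the unpacking. Everything in this sketch is hypothetical: `unpack_spec` and the `(name, code, count)` tuples are invented for the example and are not part of fortranfile or f2py. It handles numeric struct codes only and assumes little-endian data by default.

```python
import struct

def unpack_spec(payload, spec, endian='<'):
    """Split one record payload into named fields.

    spec is a list of (name, struct_code, count) tuples, listed in the
    order the variables appear in the Fortran write statement.
    """
    fmt = endian + ''.join('%d%s' % (count, code) for name, code, count in spec)
    if struct.calcsize(fmt) != len(payload):
        raise ValueError('spec needs %d bytes, record has %d'
                         % (struct.calcsize(fmt), len(payload)))
    flat = struct.unpack(fmt, payload)
    fields, start = {}, 0
    for name, code, count in spec:
        fields[name] = flat[start:start + count]
        start += count
    return fields

# Hand translation of:
#   real(4), dimension(2,5) :: ux, uy
#   write(11) ux, uy
spec = [('ux', 'f', 10), ('uy', 'f', 10)]
payload = struct.pack('<20f', *[float(v) for v in range(20)])
fields = unpack_spec(payload, spec)
```

A parser built on f2py would then only need to produce the `spec` list from the declarations and the write statement.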

Best of luck.  Peace,

-Neil

# Copyright 2008 Neil Martinsen-Burrell
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.

"""Defines a file-derived class to read/write Fortran unformatted files.

The assumption is that a Fortran unformatted file is being written by
the Fortran runtime as a sequence of records.  Each record consists of
an integer (of the default size [usually 32 or 64 bits]) giving the
length of the following data in bytes, then the data itself, then the
same integer as before.

Examples
--------

To use the default endian and size settings, one can just do::
    >>> f = FortranFile('filename')
    >>> x = f.readReals()

One can read arrays with varying precisions::
    >>> f = FortranFile('filename')
    >>> x = f.readInts('h')
    >>> y = f.readInts('q')
    >>> z = f.readReals('f')
Where the format codes are those used by Python's struct module.

One can change the default endian-ness and header precision::
    >>> f = FortranFile('filename', endian='>', header_prec='l')
for a file with big-endian data whose record headers are long
integers.
"""

__docformat__ = "restructuredtext en"

import struct
import numpy

class FortranFile(file):

    """File with methods for dealing with fortran unformatted data files"""

    def _get_header_length(self):
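        # Include the byte-order character: with '<', '>' or '=' struct uses
        # standard sizes, which can differ from native ones (e.g. for 'l').
        return struct.calcsize(self.ENDIAN + self._header_prec)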
    _header_length = property(fget=_get_header_length)

    def _set_endian(self,c):
        """Set endian to big (c='>') or little (c='<') or native (c='@')

        :Parameters:
          `c` : string
            The endian-ness to use when reading from this file.
        """
        if c in '<>@=':
            self._endian = c
        else:
            raise ValueError('Cannot set endian-ness')
    def _get_endian(self):
        return self._endian
    ENDIAN = property(fset=_set_endian,
                      fget=_get_endian,
                      doc="Possible endian values are '<', '>', '@', '='"
                     )

    def _set_header_prec(self, prec):
        if prec in 'hilq':
            self._header_prec = prec
        else:
            raise ValueError('Cannot set header precision')
    def _get_header_prec(self):
        return self._header_prec
    HEADER_PREC = property(fset=_set_header_prec,
                           fget=_get_header_prec,
                           doc="Possible header precisions are 'h', 'i', 'l', 'q'"
                          )

    def __init__(self, fname, endian='@', header_prec='i', *args, **kwargs):
        """Open a Fortran unformatted file.

        Any extra arguments (such as the mode) are passed on to the
        built-in file constructor.
        
        Parameters
        ----------
        endian : character, optional
            Specify the endian-ness of the file.  Possible values are
            '>', '<', '@' and '='.  See the documentation of Python's
            struct module for their meanings.  The default is '@'
            (native byte order).
        header_prec : character, optional
            Specify the precision used for the record headers.  Possible
            values are 'h', 'i', 'l' and 'q' with their meanings from
            Python's struct module.  The default is 'i' (the system's
            default integer).

        """
        file.__init__(self, fname, *args, **kwargs)
        self.ENDIAN = endian
        self.HEADER_PREC = header_prec

    def _read_check(self):
        return struct.unpack(self.ENDIAN+self.HEADER_PREC,
                             self.read(self._header_length)
                            )[0]

    def _write_check(self, number_of_bytes):
        """Write the header for the given number of bytes"""
        self.write(struct.pack(self.ENDIAN+self.HEADER_PREC,
                               number_of_bytes))

    def readRecord(self):
        """Read a single fortran record"""
        l = self._read_check()
        data_str = self.read(l)
        if len(data_str) != l:
            raise IOError('Could not read enough data.  Wanted %d bytes, got %d'
                          % (l, len(data_str)))
        check_size = self._read_check()
        if check_size != l:
            raise IOError('Error reading record from data file')
        return data_str

    def writeRecord(self,s):
        """Write a record with the given bytes.

        Parameters
        ----------
        s : the string to write

        """
        length_bytes = len(s)
        self._write_check(length_bytes)
        self.write(s)
        self._write_check(length_bytes)

    def readString(self):
        """Read a string."""
        return self.readRecord()

    def writeString(self,s):
        """Write a string

        Parameters
        ----------
        s : the string to write
        
        """
        self.writeRecord(s)

    _real_precisions = 'df'

    def readReals(self, prec='f'):
        """Read in an array of real numbers.
        
        Parameters
        ----------
        prec : character, optional
            Specify the precision of the array using character codes from
            Python's struct module.  Possible values are 'd' and 'f'.
            
        """
        
        _numpy_precisions = {'d': numpy.float64,
                             'f': numpy.float32
                            }

        if prec not in self._real_precisions:
            raise ValueError('Not an appropriate precision')
            
        data_str = self.readRecord()
        # Floor division keeps num an integer for the format string
        num = len(data_str) // struct.calcsize(prec)
        numbers = struct.unpack(self.ENDIAN+str(num)+prec, data_str)
        return numpy.array(numbers, dtype=_numpy_precisions[prec])

    def writeReals(self, reals, prec='f'):
        """Write an array of floats in given precision

        Parameters
        ----------
        reals : array
            Data to write
        prec : string
            Character code for the precision to use in writing
        """
        if prec not in self._real_precisions:
            raise ValueError('Not an appropriate precision')
        
        # Don't use writeRecord to avoid having to form a
        # string as large as the array of numbers
        length_bytes = len(reals)*struct.calcsize(prec)
        self._write_check(length_bytes)
        _fmt = self.ENDIAN + prec
        for r in reals:
            self.write(struct.pack(_fmt,r))
        self._write_check(length_bytes)
    
    _int_precisions = 'hilq'

    def readInts(self, prec='i'):
        """Read an array of integers.
        
        Parameters
        ----------
        prec : character, optional
            Specify the precision of the data to be read using 
            character codes from Python's struct module.  Possible
            values are 'h', 'i', 'l' and 'q'
            
        """
        if prec not in self._int_precisions:
            raise ValueError('Not an appropriate precision')
            
        data_str = self.readRecord()
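        # Size the items with the byte-order character so that standard
        # sizes are used when an explicit endian-ness is set
        num = len(data_str) // struct.calcsize(self.ENDIAN+prec)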
        return numpy.array(struct.unpack(self.ENDIAN+str(num)+prec,data_str))

    def writeInts(self, ints, prec='i'):
        """Write an array of integers in given precision

        Parameters
        ----------
        ints : array
            Data to write
        prec : string
            Character code for the precision to use in writing
        """
        if prec not in self._int_precisions:
            raise ValueError('Not an appropriate precision')
        
        # Don't use writeRecord to avoid having to form a
        # string as large as the array of numbers
        # Use the same format (with byte order) for sizing as for packing
        length_bytes = len(ints)*struct.calcsize(self.ENDIAN + prec)
        self._write_check(length_bytes)
        _fmt = self.ENDIAN + prec
        for item in ints:
            self.write(struct.pack(_fmt,item))
        self._write_check(length_bytes)
