Thanks for the suggestion. I can't use the CSV driver to read the file, though, because it doesn't support a space as a field separator, only COMMA/SEMICOLON/TAB.
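
For anyone following the thread, here is a minimal sketch of the plain-Python route instead. As far as I can tell, the `while`/`readlines(100000)` loop in my original script does work through the whole file; the filter fails because `x` and `y` are still string slices when they are compared with the numeric bounds, so nothing is ever printed. The sketch below splits each line on whitespace (so the inconsistent spacing no longer matters), converts the fields to floats, and writes the matching records out comma-separated. It assumes Python 3 and three numeric fields per line; the function name `clip_xyz` and the second sample point are made up for illustration, and only the extent and the first sample line come from my original post.

```python
import io

# Study-area extent as reported by QGIS in the original post.
X_MIN, X_MAX = -66483.3, -33474.9
Y_MIN, Y_MAX = -3155672.31, -3122229.70


def clip_xyz(src, dst, x_min=X_MIN, x_max=X_MAX, y_min=Y_MIN, y_max=Y_MAX):
    """Copy the x y z records that fall inside the extent from src to dst.

    src and dst are open text file objects (hypothetical helper, not part
    of GDAL or GRASS). Returns the number of records written.
    """
    kept = 0
    for line in src:              # a file object yields one line at a time
        fields = line.split()     # splits on any run of spaces or tabs
        if len(fields) != 3:
            continue              # skip blank or malformed lines
        try:
            x, y = float(fields[0]), float(fields[1])
            float(fields[2])      # only checks that z is numeric
        except ValueError:
            continue              # skip header or non-numeric lines
        if x_min <= x <= x_max and y_min <= y <= y_max:
            dst.write(",".join(fields) + "\n")  # comma-separated output
            kept += 1
    return kept


# Small in-memory demonstration instead of the real 1 GB file.
sample = io.StringIO(
    "  -74289.694 -3182439.485  2092.029\n"  # first line quoted in the post
    "  -50000.000 -3140000.000  1700.000\n"  # made-up point inside the extent
)
out = io.StringIO()
count = clip_xyz(sample, out)
```

For the real data, the same function can be called with `open('bethlehem.xyz')` as `src` and a file opened for writing as `dst`. Because the file is read line by line, memory use should stay small, and since the output is comma-separated it should also be in a form the CSV driver accepts.
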
2011/3/29, Paolo Corti <[email protected]>:
>> I've received a 10m resolution DEM in xyz text format. The file is
>> about 1 GB in size. The file is too big to open in a text editor, such
>> as Notepad and I don't have Office 2007, so Excel cuts off the file
>> after 67 000 lines.
>>
>> So, I need to write a Python script to to read this file and extract
>> only the data that falls within my study area. According to QGIS, the
>> extents of my area is:
>> xMin,yMin -66483.3,-3155672.31 : xMax,yMax -33474.9,-3122229.70
>>
>> This is the first unprocessed line in the file, which I extracted using
>> Python:
>> -74289.694 -3182439.485 2092.029
>>
>> The spacing between the lines are not consistent, which is another
>> reason why I need to manipulate the data so that GRASS can import it.
>>
>> Reading the whole file at once causes a MemoryError in Python, so I've
>> written the following code to read it in chunks, with some help from
>> the web - <http://effbot.org/zone/readline-performance.htm>:
>>
>> [code]
>> readfile='bethlehem.xyz'
>>
>> file = open(readfile)
>>
>> while 1:
>>     # read a chunck of the file
>>     lines = file.readlines(100000)
>>     if not lines:
>>         break
>>     for line in lines:
>>         # extract x, y and z
>>         x = line[2:12]
>>         y = line[13:25]
>>         z = line[27:35]
>>         if x >= -66483.300 and x <= -33474.900:
>>             if y >= -3155672.310 and y <= -3122229.700:
>>                 print line
>> [/code]
>>
>> This code runs for a (relatively) short while and exits having printed no
>> lines.
>>
>> My questions are thus:
>> 1. Will this code iterate through the whole file, or does it read only
>> the first 100 000 bytes of text? If it reads only the first 100 000
>> bytes, how can I change it to read the while file in chunks?
>>
>> 2. Is the logic in my if statements correct to extract the values for
>> my study area? If not, how should I change it?
>
> Hi Hanlie
>
> don't reinvent the wheel: use GDAL ogr2ogr utility [0] with the -clipsrc
> option.
> Read the file by using the csv driver [1]
>
> best regards
> P
>
> [0] http://www.gdal.org/ogr2ogr.html
> [1] http://www.gdal.org/ogr/drv_csv.html
>
> --
> Paolo Corti
> Geospatial software developer
> web: http://www.paolocorti.net
> twitter: @paolo_corti
>
