Thanks for the suggestion. I can't use the CSV driver to read the
file, though, because it doesn't support a space as a field separator,
only COMMA/SEMICOLON/TAB.
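
In the meantime, here is a rough sketch of how I think the script could be
reworked. It assumes every data line holds exactly three whitespace-separated
numbers (x, y, z). The function names clip_xyz and clip_file and the output
filename are only placeholders of mine. Iterating over the file object reads
the whole file one line at a time, so memory should not be a problem, and
split() copes with the inconsistent spacing. The values are converted with
float() before comparing, because the slices in my original code were still
strings.

```python
def clip_xyz(lines, xmin, ymin, xmax, ymax):
    """Yield (x, y, z) tuples for the points that fall inside the bounding box."""
    for line in lines:
        fields = line.split()  # split on any run of spaces or tabs
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        try:
            x, y, z = [float(f) for f in fields]
        except ValueError:
            continue  # skip lines that do not hold three numbers
        if xmin <= x <= xmax and ymin <= y <= ymax:
            yield x, y, z


def clip_file(src_path, dst_path, xmin, ymin, xmax, ymax):
    """Stream src_path line by line and write the clipped points to dst_path."""
    with open(src_path) as src:
        with open(dst_path, "w") as dst:
            for x, y, z in clip_xyz(src, xmin, ymin, xmax, ymax):
                dst.write("%.3f %.3f %.3f\n" % (x, y, z))


# Hypothetical usage with the extent reported by QGIS:
# clip_file("bethlehem.xyz", "bethlehem_clip.xyz",
#           -66483.3, -3155672.31, -33474.9, -3122229.70)
```

The output is written with single spaces between the values, which should
also take care of the inconsistent spacing before importing into GRASS.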

2011/3/29, Paolo Corti <[email protected]>:
>> I've received a 10m resolution DEM in xyz text format. The file is
>> about 1 GB in size. The file is too big to open in a text editor such
>> as Notepad, and I don't have Office 2007, so Excel cuts off the file
>> after 67 000 lines.
>>
>> So, I need to write a Python script to read this file and extract
>> only the data that falls within my study area. According to QGIS, the
>> extent of my area is:
>> xMin,yMin -66483.3,-3155672.31 : xMax,yMax -33474.9,-3122229.70
>>
>> This is the first unprocessed line in the file, which I extracted using
>> Python:
>>  -74289.694 -3182439.485  2092.029
>>
>> The spacing between the lines is not consistent, which is another
>> reason why I need to manipulate the data so that GRASS can import it.
>>
>> Reading the whole file at once causes a MemoryError in Python, so I've
>> written the following code to read it in chunks, with some help from
>> the web - <http://effbot.org/zone/readline-performance.htm>:
>>
>> [code]
>> readfile='bethlehem.xyz'
>>
>> file = open(readfile)
>>
>> while 1:
>>    # read a chunk of the file
>>    lines = file.readlines(100000)
>>    if not lines:
>>        break
>>    for line in lines:
>>        # extract x, y and z
>>        x = line[2:12]
>>        y = line[13:25]
>>        z = line[27:35]
>>        if x >= -66483.300 and x <= -33474.900:
>>           if y >= -3155672.310 and y <= -3122229.700:
>>               print line
>> [/code]
>>
>> This code runs for a (relatively) short while and exits having printed no
>> lines.
>>
>> My questions are thus:
>> 1. Will this code iterate through the whole file, or does it read only
>> the first 100 000 bytes of text? If it reads only the first 100 000
>> bytes, how can I change it to read the whole file in chunks?
>>
>> 2. Is the logic in my if statements correct to extract the values for
>> my study area? If not, how should I change it?
>
> Hi Hanlie
>
> don't reinvent the wheel: use GDAL ogr2ogr utility [0] with the -clipsrc
> option.
> Read the file by using the csv driver [1]
>
> best regards
> P
>
> [0] http://www.gdal.org/ogr2ogr.html
> [1] http://www.gdal.org/ogr/drv_csv.html
>
> --
> Paolo Corti
> Geospatial software developer
> web: http://www.paolocorti.net
> twitter: @paolo_corti
>
