Splitting a file from specific column content

Yigit Turgut Sun, 22 Jan 2012 06:38:12 -0800

Hi all,

I have a text file, approximately 20 MB in size, that contains about one
million lines. I was doing some processing on the data, but then the
data rate increased and it now takes a very long time to process. I
import it using numpy.loadtxt; here is a fragment of the data:


0.000006         -0.0004
0.000071         0.0028
0.000079         0.0044
0.000086         0.0104
.
.
.

The first column is the timestamp in seconds and the second column is
the data. The file contains 8 seconds of measurement, and I would like
to be able to split it into 3 parts separated at specific time
locations. For example, I want to divide the file into 3 parts: the
first containing 3 seconds of data, the second containing 2 seconds and
the third containing 3 seconds. Splitting based on file size doesn't
work accurately for this data: the cut can land in the middle of a
line, so some columns go missing. I need to split depending on the
column content:

1 - read the file until the first character of column 1 is 3 (3 seconds)
2 - save this region to another file
3 - read the file where the first character of column 1 is between 3
and 5 (2 seconds)
4 - save this region to another file
5 - read the file where the first character of column 1 is between 5
and 8 (3 seconds)
6 - save this region to another file
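
For illustration, the steps above could be sketched roughly as follows.
This is only a minimal sketch under some assumptions: the columns are
whitespace-separated, the timestamp is compared as a number rather than
by its first character, and rows that do not hold exactly two numeric
values are simply skipped. The function name split_by_time and its
parameters are hypothetical, not part of numpy or any other library.

```python
import bisect
from contextlib import ExitStack


def split_by_time(src_path, cut_points, out_paths):
    """Copy each valid row of src_path into the part its timestamp falls in.

    cut_points: ascending split times in seconds, e.g. [3.0, 5.0]
    out_paths:  one output file per part, so len(cut_points) + 1 paths
    Returns the number of rows written to each part.
    """
    if len(out_paths) != len(cut_points) + 1:
        raise ValueError("need exactly one more output path than cut points")

    counts = [0] * len(out_paths)
    with ExitStack() as stack:
        src = stack.enter_context(open(src_path))
        outs = [stack.enter_context(open(p, "w")) for p in out_paths]
        for line in src:
            fields = line.split()
            if len(fields) != 2:
                continue  # skip rows with missing or extra columns
            try:
                t = float(fields[0])
                float(fields[1])  # only checks that the value is numeric
            except ValueError:
                continue  # skip rows that are not numeric
            # bisect_right sends a timestamp equal to a cut point
            # to the later part, so 3.0 lands in the second file
            part = bisect.bisect_right(cut_points, t)
            outs[part].write(line)
            counts[part] += 1
    return counts
```

With hypothetical file names, a call such as
split_by_time("data.txt", [3.0, 5.0], ["part1.txt", "part2.txt", "part3.txt"])
would produce the three regions described above. The file is read one
line at a time, so the whole data set never has to fit in memory, and
each output can then be loaded separately.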

I need to do exactly this because numpy.loadtxt and genfromtxt don't
cope well with missing columns / rows. I even tried the invalid_raise
parameter of genfromtxt, but no luck.

I am sure it's a few lines of code for experienced users and I would
appreciate some guidance.

-- 
http://mail.python.org/mailman/listinfo/python-list
