Re: skipping a repeated header

Kirk Wythers Wed, 27 May 2009 09:16:05 -0700

Thanks David. I didn't include my attempts because the PostgreSQL stuff took up so many lines. I have snipped out the data-read part (including your suggestion) and pasted it below. As you can see, I also set a number of variables in the file. Again, I was trying not to fill the list with too much text. The error I am getting is: Search pattern not terminated at ./untitled.pl line 7.


#!/usr/bin/perl -w
use strict;


while ( <> ) {

      #Part 1. Skip header lines
      next if ( ! m!^\d+/\! );

      #Part 2. Loop through the records and prepare SQL statement.
      my ( $TIMESTAMP, $RECORD, $Flag1, $Flag2, $Flag3, $Flag4,
      $Flag5, $Flag6, $Flag7, $year, $doy, $hour, $minute, $Batt_volt,
      $PTemp, $AM25TREF1, $AIRTEMP_Avg, $SCTemp_Avg1, $SCTemp_Avg2,
      $SCTemp_Avg3, $SCTemp_Avg4, $SCTemp_Avg5, $SCTemp_Avg6,
      $SCTemp_Avg7, $SCTemp_Avg8, $SCTemp_Avg9, $SCTemp_Avg10,
      $SCTemp_Avg11, $SCTemp_Avg12, $SCTemp_Avg13, $SCTemp_Avg14,
      $SCTemp_Avg15, $SCTemp_Avg16, $SCTemp_Avg17, $STemp_Avg1,
      $STemp_Avg2, $STemp_Avg3, $STemp_Avg4, $STemp_Avg5, $STemp_Avg6,
      $STemp_Avg7, $STemp_Avg8, $STemp_Avg9, $STemp_Avg10,
      $STemp_Avg11, $STemp_Avg12, $STemp_Avg13, $STemp_Avg14,
      $STemp_Avg15, $STemp_Avg16, $TsoilH1_Avg1, $TsoilH1_Avg2,
      $TsoilH2_Avg1, $TsoilH2_Avg2, $TsoilAc_Avg1, $TsoilAc_Avg2,
      $TsoilDs_Avg1, $TsoilDs_Avg2, $S_All_AvgT_Avg, $S_dif_Avg1,
      $S_dif_Avg2, $S_dif_Avg3, $S_dif_Avg4, $S_dif_Avg5, $S_dif_Avg6,
      $SAVG_dif_Avg1, $SAVG_dif_Avg2, $SBtemp_Avg1, $SBtemp_Avg2,
      $SBtemp_Avg3, $SBtemp_Avg4, $SBtemp_Avg5, $SBtemp_Avg6,
      $SBtemp_Avg7, $SBtemp_Avg8, $TmV_Avg1, $TmV_Avg2, $TmV_Avg3,
      $TmV_Avg4, $TmV_Avg5, $TmV_Avg6, $TmV_Avg7, $TmV_Avg8,
      $TargetTemp_Avg1, $TargetTemp_Avg2, $TargetTemp_Avg3,
      $TargetTemp_Avg4, $TargetTemp_Avg5, $TargetTemp_Avg6,
      $TargetTemp_Avg7, $TargetTemp_Avg8, $A_TargetTemp_Avg1,
      $A_TargetTemp_Avg2, $A_TargetTemp_Avg3, $A_TargetTemp_Avg4,
      $A_TargetTemp_Avg5, $A_TargetTemp_Avg6, $A_TargetTemp_Avg7,
      $A_TargetTemp_Avg8, $TargetTemp_ADJ_Avg1, $TargetTemp_ADJ_Avg2,
      $TargetTemp_ADJ_Avg3, $TargetTemp_ADJ_Avg4, $Temp_H1_Avg1,
      $Temp_H1_Avg2, $Temp_H2_Avg1, $Temp_H2_Avg2, $Temp_Ac_Avg1,
      $Temp_Ac_Avg2, $Temp_Ds_Avg1, $Temp_Ds_Avg2, $A_Dif_H1_Avg1,
      $A_Dif_H1_Avg2, $A_Dif_H2_Avg1, $A_Dif_H2_Avg2, $AAVG_Dif_Avg1,
      $AAVG_Dif_Avg2, $All_AvgT_Avg, $PID_out_Avg1, $PID_out_Avg2,
      $PID_out_Avg3, $PID_out_Avg4, $PID_lmt_Avg1, $PID_lmt_Avg2,
      $PID_lmt_Avg3, $PID_lmt_Avg4, $ScldOut_Avg1, $ScldOut_Avg2,
      $ScldOut_Avg3, $ScldOut_Avg4, $SDM_Out_Avg1, $SDM_Out_Avg2,
      $SDM_Out_Avg3, $SDM_Out_Avg4, $E_ScldOut_Avg1, $E_ScldOut_Avg2,
      $RXResponse, $ES_ScldOut_Avg1, $ES_ScldOut_Avg2, $S_RXResponse,
      $Wind_RXResponse, $I_WS_MS ) = split;

      print join( "\t", $TIMESTAMP, $RECORD, $Flag1, $Flag2, $Flag3,
      $Flag4, $Flag5, $Flag6, $Flag7, $year, $doy, $hour, $minute,
      $Batt_volt, $PTemp, $AM25TREF1, $AIRTEMP_Avg, $SCTemp_Avg1,
      $SCTemp_Avg2, $SCTemp_Avg3, $SCTemp_Avg4, $SCTemp_Avg5,
      $SCTemp_Avg6, $SCTemp_Avg7, $SCTemp_Avg8, $SCTemp_Avg9,
      $SCTemp_Avg10, $SCTemp_Avg11, $SCTemp_Avg12, $SCTemp_Avg13,
      $SCTemp_Avg14, $SCTemp_Avg15, $SCTemp_Avg16, $SCTemp_Avg17,
      $STemp_Avg1, $STemp_Avg2, $STemp_Avg3, $STemp_Avg4, $STemp_Avg5,
      $STemp_Avg6, $STemp_Avg7, $STemp_Avg8, $STemp_Avg9,
      $STemp_Avg10, $STemp_Avg11, $STemp_Avg12, $STemp_Avg13,
      $STemp_Avg14, $STemp_Avg15, $STemp_Avg16, $TsoilH1_Avg1,
      $TsoilH1_Avg2, $TsoilH2_Avg1, $TsoilH2_Avg2, $TsoilAc_Avg1,
      $TsoilAc_Avg2, $TsoilDs_Avg1, $TsoilDs_Avg2, $S_All_AvgT_Avg,
      $S_dif_Avg1, $S_dif_Avg2, $S_dif_Avg3, $S_dif_Avg4, $S_dif_Avg5,
      $S_dif_Avg6, $SAVG_dif_Avg1, $SAVG_dif_Avg2, $SBtemp_Avg1,
      $SBtemp_Avg2, $SBtemp_Avg3, $SBtemp_Avg4, $SBtemp_Avg5,
      $SBtemp_Avg6, $SBtemp_Avg7, $SBtemp_Avg8, $TmV_Avg1, $TmV_Avg2,
      $TmV_Avg3, $TmV_Avg4, $TmV_Avg5, $TmV_Avg6, $TmV_Avg7,
      $TmV_Avg8, $TargetTemp_Avg1, $TargetTemp_Avg2, $TargetTemp_Avg3,
      $TargetTemp_Avg4, $TargetTemp_Avg5, $TargetTemp_Avg6,
      $TargetTemp_Avg7, $TargetTemp_Avg8, $A_TargetTemp_Avg1,
      $A_TargetTemp_Avg2, $A_TargetTemp_Avg3, $A_TargetTemp_Avg4,
      $A_TargetTemp_Avg5, $A_TargetTemp_Avg6, $A_TargetTemp_Avg7,
      $A_TargetTemp_Avg8, $TargetTemp_ADJ_Avg1, $TargetTemp_ADJ_Avg2,
      $TargetTemp_ADJ_Avg3, $TargetTemp_ADJ_Avg4, $Temp_H1_Avg1,
      $Temp_H1_Avg2, $Temp_H2_Avg1, $Temp_H2_Avg2, $Temp_Ac_Avg1,
      $Temp_Ac_Avg2, $Temp_Ds_Avg1, $Temp_Ds_Avg2, $A_Dif_H1_Avg1,
      $A_Dif_H1_Avg2, $A_Dif_H2_Avg1, $A_Dif_H2_Avg2, $AAVG_Dif_Avg1,
      $AAVG_Dif_Avg2, $All_AvgT_Avg, $PID_out_Avg1, $PID_out_Avg2,
      $PID_out_Avg3, $PID_out_Avg4, $PID_lmt_Avg1, $PID_lmt_Avg2,
      $PID_lmt_Avg3, $PID_lmt_Avg4, $ScldOut_Avg1, $ScldOut_Avg2,
      $ScldOut_Avg3, $ScldOut_Avg4, $SDM_Out_Avg1, $SDM_Out_Avg2,
      $SDM_Out_Avg3, $SDM_Out_Avg4, $E_ScldOut_Avg1, $E_ScldOut_Avg2,
      $RXResponse, $ES_ScldOut_Avg1, $ES_ScldOut_Avg2, $S_RXResponse,
      $Wind_RXResponse, $I_WS_MS ), "\n";

}

 __END__
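
The "Search pattern not terminated" error comes from the `\!` at the end of the pattern in Part 1. With `!` as the `m!...!` delimiter, the backslash turns the final `!` into a literal character, so Perl keeps scanning for a closing delimiter and never finds one. A minimal sketch of one possible fix (dropping the backslash) follows. The `is_data_line` helper and the sample lines are made up for illustration and are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 when a line starts with digits followed by a slash,
# e.g. "4/29/09 15:10 ...". The closing "!" delimiter is NOT escaped.
sub is_data_line {
    my ($line) = @_;
    return $line =~ m!^\d+/! ? 1 : 0;
}

# Hypothetical sample lines modelled on the data shown in this thread.
my @lines = (
    "TOA5\tB4WARM_C\tCR1000\t16474\tCR1000.Std.15",
    "TIMESTAMP\tRECORD\tFlag(1)\tFlag(2)\tFlag(3)",
    "4/29/09 15:10\t0\t0\t0\t0",
    "4/29/09 15:11\t1\t0\t0\t0",
);

for my $line (@lines) {
    next if !is_data_line($line);    # skip header lines
    print "$line\n";
}
```

Using a bracketing delimiter such as `m{^\d+/}` would avoid the question of escaping `!` altogether.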

On May 27, 2009, at 10:42 AM, Wagner, David --- Senior Programmer Analyst --- CFS wrote:

-----Original Message-----
From: Kirk Wythers [mailto:kwyth...@umn.edu]
Sent: Wednesday, May 27, 2009 09:31
To: beginners@perl.org
Subject: skipping a repeated header

I have a large datafile that I am trying to read into a postgresql
database. I think I have the db_connect stuff down, but I'm fighting
with the part that reads the file to be processed. The file contains a
repeating structure of header lines like this:

TOA5    B4WARM_C        CR1000  16474   CR1000.Std.15
TIMESTAMP       RECORD  Flag(1) Flag(2) Flag(3)
TS      RN                      
Smp     Smp     Smp
4/29/09 15:10   0       0       0       0
4/29/09 15:11   1       0       0       0
4/29/09 15:12   2       0       0       0
4/29/09 15:13   3       0       0       0
4/29/09 15:14   4       0       0       0
4/29/09 15:15   5       0       0       0
4/29/09 15:16   6       0       0       0
4/29/09 15:17   7       0       0       0
4/29/09 15:18   8       -1      -1      -1
4/29/09 15:19   9       -1      -1      -1
4/29/09 15:20   10      -1      -1      -1
TOA5    B4WARM_C        CR1000  16474   CR1000.Std.15
TIMESTAMP       RECORD  Flag(1) Flag(2) Flag(3)
TS      RN                      
Smp     Smp     Smp
4/29/09 15:10   0       0       0       0
4/29/09 15:11   1       0       0       0
4/29/09 15:12   2       0       0       0
4/29/09 15:13   3       0       0       0
4/29/09 15:14   4       0       0       0
4/29/09 15:15   5       0       0       0
4/29/09 15:16   6       0       0       0
4/29/09 15:17   7       0       0       0
4/29/09 15:18   8       -1      -1      -1
4/29/09 15:19   9       -1      -1      -1
4/29/09 15:20   10      -1      -1      -1
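
One thing to watch when splitting records like these: a bare `split` breaks on any run of whitespace, so a timestamp such as `4/29/09 15:10` would land in two variables and shift every later column by one. The sketch below assumes the fields are tab-separated (the listing above does not confirm this) and uses a hash slice over a shortened, hypothetical column list instead of one scalar per column. `parse_record` and `@columns` are illustrative names only.

```perl
use strict;
use warnings;

# Hypothetical, shortened column list; the real file has many more columns.
my @columns = qw(TIMESTAMP RECORD Flag1 Flag2 Flag3);

# Turn one data line into a hash reference keyed by column name.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    my %record;
    # Split on tabs only, so "4/29/09 15:10" stays a single field.
    # The -1 limit keeps trailing empty fields instead of dropping them.
    @record{@columns} = split /\t/, $line, -1;
    return \%record;
}

my $rec = parse_record("4/29/09 15:18\t8\t-1\t-1\t-1\n");
print join( "\t", @{$rec}{@columns} ), "\n";
```

Keeping the names in one array means the same list can drive both the parsing and the later output or SQL statement.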


I want to read in the lines that begin with the date format, but skip
all the header stuff. Can anyone suggest a strategy for a, "if the


        next if ( ! /^\d/ );
        If you only care about the date lines, then as long as none of your
header lines start with a digit, the above will skip those headers.

        If a header line can start with a number, then something like:

        next if ( ! m!^\d+/\! );

        Also, the group is more willing to help if you can show the code you
have attempted. This is a very simplistic regex test which you should have
gotten from any doc on Perl regex processing. Just an FYI for the future.


        If you have any questions and/or problems, please let me know.
        Thanks.

Wags ;)
David R. Wagner
Senior Programmer Analyst
FedEx Freight
1.719.484.2097 TEL
1.719.484.2419 FAX
1.408.623.5963 Cell
http://fedex.com/us

line begins with XXXX, go ahead and read".

Thanks in advance.



