----- Original Message -----
From: Aditi Gupta <[EMAIL PROTECTED]>
Date: Monday, May 9, 2005 11:41 am
Subject: extracting coordinates

> Hi everyone,
Hello Aditi,

> 
> That code is working... 
> But my specific problem is as follows:
> 
> i have a file in which data is stored as
> 
> HELIX 4 4 VAL 74 LEU 84 1 11 
> CRYST1 33.020 33.750 75.670 90.00 90.00 90.00 P 21 21 21 4 
> ORIGX1 1.000000 0.000000 0.000000 0.00000 
> ORIGX2 0.000000 1.000000 0.000000 0.00000 
> ORIGX3 0.000000 0.000000 1.000000 0.00000 
> SCALE1 0.030285 0.000000 0.000000 0.00000 
> SCALE2 0.000000 0.029630 0.000000 0.00000 
> SCALE3 0.000000 0.000000 0.013215 0.00000 
> ATOM 1 N LEU 2 -10.586 -14.055 54.397 1.00 49.37 N 
> ATOM 2 CA LEU 2 -9.711 -13.341 53.419 1.00 48.40 C 
> ATOM 3 C LEU 2 -10.401 -12.068 52.928 1.00 46.56 C 
> ATOM 4 O LEU 2 -11.440 -12.138 52.267 1.00 47.05 O 
> ATOM 5 CB LEU 2 -9.417 -14.253 52.223 1.00 51.90 C 
> ATOM 6 CG LEU 2 -7.974 -14.441 51.748 1.00 54.45 C 
> ATOM 7 CD1 LEU 2 -7.365 -13.109 51.342 1.00 53.43 C 
> ATOM 8 CD2 LEU 2 -7.160 -15.095 52.852 1.00 55.22 C 
> ATOM 9 N THR 3 -9.833 -10.909 53.259 1.00 42.49 N 
> ATOM 10 CA THR 3 -10.405 -9.634 52.826 1.00 40.93 C 
> ATOM 11 C THR 3 -10.060 -9.403 51.362 1.00 41.24 C 
> 
> 
> 
> the fields of records having ATOM as 1st field are as follows:
> 
> COLUMNS   DATA TYPE      FIELD        DEFINITION
> ------------------------------------------------------------------------
>  1 -  6   Record name    "ATOM  "
>  7 - 11   Integer        serial       Atom serial number.
> 13 - 16   Atom           name         Atom name.
> 17        Character      altLoc       Alternate location indicator.
> 18 - 20   Residue name   resName      Residue name.
> 22        Character      chainID      Chain identifier.
> 23 - 26   Integer        resSeq       Residue sequence number.
> 27        AChar          iCode        Code for insertion of residues.
> 31 - 38   Real(8.3)      x            Orthogonal coordinates for X in Angstroms.
> 39 - 46   Real(8.3)      y            Orthogonal coordinates for Y in Angstroms.
> 47 - 54   Real(8.3)      z            Orthogonal coordinates for Z in Angstroms.
> 55 - 60   Real(6.2)      occupancy    Occupancy.
> 61 - 66   Real(6.2)      tempFactor   Temperature factor.
> 73 - 76   LString(4)     segID        Segment identifier, left-justified.
> 77 - 78   LString(2)     element      Element symbol, right-justified.
> 79 - 80   LString(2)     charge       Charge on the atom.
> 
> 
> 
> I have to get the x,y,z coordinates of records whose atom name is
> 'CA'(highlighted as blue).
> 
> I wrote a code but its giving many errors..
> 
> The code is:
> 
> 
> 
> #!usr/bin/perl
> use warnings;
> 
> $filename = "1a32.txt";
> chomp $filename;

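The above line is unnecessary: $filename is assigned from a string literal 
with no trailing newline, so chomp has nothing to remove. See perldoc -f chomp.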
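
A minimal sketch of what chomp does, using two made-up strings (the variable names are mine, not from your script): one without a trailing newline, like your literal, and one with a newline, like a line read from STDIN or a file.

```perl
use strict;
use warnings;

# chomp removes a trailing newline (strictly, the value of $/) and
# returns the number of characters it removed.
my $literal = "1a32.txt";                 # no trailing newline
my $removed_from_literal = chomp($literal);

my $typed = "1a32.txt\n";                 # what a line read from input looks like
my $removed_from_typed = chomp($typed);

print "literal: removed $removed_from_literal, now '$literal'\n";
print "typed:   removed $removed_from_typed, now '$typed'\n";
```

So chomp only earns its keep when the value came from input, not from a literal in the source.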

> open (FILEHANDLE, "$filename") or die "couldn't open $filename:$!";
> @file= <FILEHANDLE>;
> close (FILEHANDLE);
> 
> $a= "ATOM";
> $c= "CA";
> 
> foreach $line(@file)
> {
> if(my $line =~ /^/$a/\s*
>  (\s*\d+)
>  \s*/$c/\s*
>  \d*
>  \w+
>  \s
>  \w
>  (\s*\d+)
>  \w*
>  (\s*\d*)
>  (\s*\d*)
>  (\s*\d*)
>  (\s*\d*)
>  (\s*\d*)
>  (\w*\s*)
>  (\s*\w*)
>  (\s*\w*)/)

Ouch, that is way too long a regular expression [ at least for me ]. You could 
consider a shorter version that captures every field in one go, such as 
my @fields = $line =~ /(\S+)/g. In any case I think split would work best here 
[ my @fields = split ' ', $line ], since your fields stay in the same positions 
from line to line. In general you pad data until it is all uniform, and yours 
already is. The code at the end of this message should help you on your way; 
feel free to modify it at will.
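
On the errors themselves: as far as I can tell they come from three things. A bare / inside /.../ ends the pattern, so /^/$a/ is read as the pattern /^/ followed by a stray $a. A pattern spread over several lines needs the /x modifier if the line breaks and indentation are to be ignored. And "my $line =~" declares a brand-new, empty $line instead of testing the loop variable. Here is a minimal sketch of a working version, assuming the whitespace-separated layout of your sample lines (no chain ID); $record, $name and the hard-coded $line are names I picked for illustration.

```perl
use strict;
use warnings;

my $record = 'ATOM';   # called $record rather than $a, which sort() uses
my $name   = 'CA';

# One line copied from the sample data in the original post.
my $line = 'ATOM 2 CA LEU 2 -9.711 -13.341 53.419 1.00 48.40 C';

my @xyz;
if ($line =~ /^ \Q$record\E \s+    # record name
                \d+ \s+            # atom serial number
                \Q$name\E \s+      # atom name
                \w+ \s+            # residue name
                \d+ \s+            # residue sequence number
                (-?\d+\.\d+) \s+   # x
                (-?\d+\.\d+) \s+   # y
                (-?\d+\.\d+)       # z
              /x) {
    @xyz = ($1, $2, $3);
    print join("\t", @xyz), "\n";
}
```

The variables are interpolated directly, with no extra slashes around them, and \Q...\E keeps any special characters in them literal.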



>   
> {
> my $x= substr($line,30,8);
> my $y= substr($line,38,8);
> my $z= substr($line,46,8);
> 
> print "$x\t$y\t$z\n";
> }
> }
> 
> #-------------------------------------------------
> 
> 
> 
> The errors that i'm getting are:
> 
> Scalar found where operator expected at two.pl line 15, near "/^/$a"
>        (Missing operator before $a?)
> Unrecognized escape \d passed through at two.pl line 15.
> Unrecognized escape \s passed through at two.pl line 15.
> Unrecognized escape \w passed through at two.pl line 17.
> Unrecognized escape \s passed through at two.pl line 17.
> Unrecognized escape \w passed through at two.pl line 17.
> Unrecognized escape \s passed through at two.pl line 17.
> Backslash found where operator expected at two.pl line 22, near 
> "(\s*\"  (Might be a runaway multi-line ** string starting on line 17)
>        (Missing operator before \?)
> Unquoted string "d" may clash with future reserved word at two.pl 
> line 22.
> Backslash found where operator expected at two.pl line 23, near ")
>                \"
>        (Missing operator before \?)
> Unquoted string "w" may clash with future reserved word at two.pl 
> line 23.
> Unrecognized escape \s passed through at two.pl line 24.
> Backslash found where operator expected at two.pl line 25, near 
> "(\s*\"  (Might be a runaway multi-line ** string starting on line 24)
>        (Missing operator before \?)
> Unquoted string "d" may clash with future reserved word at two.pl 
> line 25.
> Unrecognized escape \s passed through at two.pl line 26.
> Backslash found where operator expected at two.pl line 27, near 
> "(\s*\"  (Might be a runaway multi-line ** string starting on line 26)
>        (Missing operator before \?)
> Unquoted string "d" may clash with future reserved word at two.pl 
> line 27.
> Unrecognized escape \w passed through at two.pl line 28.
> Backslash found where operator expected at two.pl line 29, near 
> "(\w*\"  (Might be a runaway multi-line ** string starting on line 28)
>        (Missing operator before \?)
> Unrecognized escape \w passed through at two.pl line 29.
> syntax error at two.pl line 15, near "/^/$a"
> Substitution replacement not terminated at two.pl line 31.
> 
> Please help me..
> 

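#!/usr/bin/perl
use warnings;
use strict;

my $filename = "1a32.txt";
my $Atom     = 'CA';

open(my $fh, '<', $filename) or die "couldn't open $filename: $!";
my @file = <$fh>;
close($fh);

foreach my $line (@file) {
    # split ' ' skips leading whitespace and splits on runs of whitespace
    my @fields = split ' ', $line;

    # only ATOM records with enough fields to reach the coordinates
    next unless @fields >= 8 and $fields[0] eq 'ATOM';

    # Assumes the layout of the sample lines (no chain ID column):
    # 0 record, 1 serial, 2 atom name, 3 residue, 4 resSeq, 5 x, 6 y, 7 z
    print "X: $fields[5] Y: $fields[6] Z: $fields[7]\n"
        if uc($fields[2]) eq $Atom;
}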
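
If the real file keeps the fixed column layout from the table you quoted, the fields can also be cut out by position, which still works when a chain ID is present or two numbers touch. This is only a sketch under that assumption: ca_coordinates is a name I made up, and the sample lines are rebuilt with sprintf because the spacing in the post above did not survive.

```perl
use strict;
use warnings;

# Return (x, y, z) for an ATOM record whose atom name is CA, or an empty
# list otherwise. Offsets follow the quoted column table (columns are
# 1-based, substr offsets are 0-based).
sub ca_coordinates {
    my ($line) = @_;
    return unless length($line) >= 54;               # must reach column 54
    return unless substr($line, 0, 6) eq 'ATOM  ';   # columns  1 -  6
    my $name = substr($line, 12, 4);                 # columns 13 - 16
    $name =~ s/\s+//g;
    return unless $name eq 'CA';
    my @xyz = (substr($line, 30, 8),                 # columns 31 - 38
               substr($line, 38, 8),                 # columns 39 - 46
               substr($line, 46, 8));                # columns 47 - 54
    s/^\s+|\s+$//g for @xyz;                         # trim the padding
    return @xyz;
}

# Rebuild two of the sample records in fixed-column form.
my $fmt = "%-6s%5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f";
my $n_line  = sprintf $fmt, 'ATOM', 1, ' N',  ' ', 'LEU', ' ', 2, ' ',
                      -10.586, -14.055, 54.397, 1.00, 49.37;
my $ca_line = sprintf $fmt, 'ATOM', 2, ' CA', ' ', 'LEU', ' ', 2, ' ',
                      -9.711, -13.341, 53.419, 1.00, 48.40;

for my $line ($n_line, $ca_line) {
    my @xyz = ca_coordinates($line) or next;
    print join("\t", @xyz), "\n";
}
```

Which approach to use depends on whether the file really is column-aligned; the split version is simpler when every field is separated by whitespace.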

HTH,
Mark G.



