Francesco Pietra wrote:
I corrected the awk script as indicated (attached here)

With input
ATOM      1  N   LEU     1     153.242  64.673  95.851  0.00  0.00           N
ATOM      2  CA  LEU     1     154.534  64.963  95.169  0.00  0.00           C
..........
the output
ATOM      2  BN0 LEU     1     154.534  64.963  95.169  0.00  0.00
ATOM      4  SC1 LEU     1     156.589  66.550  95.065  0.00  0.00
................

is correct and the cg pdb file opens correctly in viewers, but there are
contacts which gromacs was unable to relax at the relaxation stage
(nor could it at the all-atom stage of the input file). Therefore I
relaxed the all-atom input file with AMBER until there were no contacts
at 0.8A VDW, then reran the awk script on the relaxed pdb file. Input

ATOM      1  N   LEU A   1     153.242  64.673  95.851  0.00  0.00           N
ATOM      2  CA  LEU A   1     154.534  64.963  95.169  0.00  0.00           C
.................
the output
ATOM      2  BN0 LEU     0       1.000 154.534  64.963 95.17  0.00
ATOM      4  SC1 LEU     0       1.000 156.589  66.550 95.06  0.00
.................
is grossly incorrect. Notice that both input files above give correct
psf and cg pdb files with VMD, which shows that both files have a valid
pdb layout.

I was unable to understand why the awk script works one time and not
another, namely when I have a relaxed file.


Because now you have a chain identifier, a case that I believe I mentioned last time. If you look at the script, it expects a numeric field (the residue number) right after the residue name; with a chain identifier present, that field holds the chain letter instead, so the script prints a zero rather than the actual residue number. Note too that, as a result, every subsequent field in the output is shifted by exactly one place.
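
To make the field shift concrete, here is a minimal sketch in Python (not the awk script itself). It reads the same CA atom twice, once by whitespace-separated fields, the way awk does by default, and once by the fixed PDB columns. The function names and the `awk_int` helper are hypothetical, and `awk_int` is only a rough stand-in for how awk's `%i` treats a non-numeric field.

```python
# The same CA atom with and without a chain identifier, as in the examples above.
NO_CHAIN = "ATOM      2  CA  LEU     1     154.534  64.963  95.169  0.00  0.00           C"
WITH_CHAIN = "ATOM      2  CA  LEU A   1     154.534  64.963  95.169  0.00  0.00           C"


def awk_int(text):
    """Rough stand-in for awk's %i on a field: non-numeric text becomes 0."""
    try:
        return int(float(text))
    except ValueError:
        return 0


def residue_by_field(line):
    """Whitespace splitting, as awk does by default; fields[4] plays the role of $5."""
    fields = line.split()
    return awk_int(fields[4])


def residue_by_column(line):
    """Fixed PDB columns: chain ID in column 22, residue number in columns 23-26."""
    chain = line[21].strip()
    resseq = int(line[22:26])
    return chain, resseq


for label, line in (("no chain", NO_CHAIN), ("with chain", WITH_CHAIN)):
    print(label, residue_by_field(line), residue_by_column(line))
```

With the chain identifier present, the field-based reader sees the letter A where it expects the residue number and yields 0, while the column-based reader returns residue 1 for both lines.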

-Justin

I would very much appreciate it if a stable version of the awk script
were posted, in case my corrections were incorrect. Replacing the script
on the martini web page would also be appreciated, because one normally
trusts what is officially posted.

thanks
francesco pietra

On Mon, Nov 9, 2009 at 12:58 PM, Justin A. Lemkul <[email protected]> wrote:

Francesco Pietra wrote:
Does the atom2cg_v2.1.awk require the indication of the subunit (A, B,
C, etc) in the pdb file of a multimeric protein?

From

ATOM      1  N   LEU     1     153.242  64.673  95.851  0.00  0.00           N
ATOM      2  CA  LEU     1     154.534  64.963  95.169  0.00  0.00           C
ATOM      3  CB  LEU     1     155.257  66.191  95.767  0.00  0.00           C
ATOM      4  CG  LEU     1     156.589  66.550  95.065  0.00  0.00           C
ATOM      5  CD1 LEU     1     156.406  66.834  93.574  0.00  0.00           C
ATOM      6  CD2 LEU     1     157.222  67.770  95.727  0.00  0.00           C
ATOM      7  C   LEU     1     155.425  63.717  95.081  0.00  0.00           C
ATOM      8  O   LEU     1     155.371  63.026  94.063  0.00  0.00           O
ATOM      9  N   SER     2     156.233  63.409  96.105  0.00  0.00           N

I get

ATOM      2  BN0 LEU  154.534      64.963  95.169   0.000  0.00  0.00
ATOM      4  SC1 LEU  156.589      66.550  95.065   0.000  0.00  0.00
ATOM     10  BN0 SER  157.124      62.235  96.094   0.000  0.00  0.00

i.e., weird residue numbers.

The awk script simply copies the information from one line to the new file,
using the old atom numbers.  You can use genconf -renumber to fix this.  The
residue number isn't being written because of a problem with the atom2cg
script that I have posted about here a number of times.  You need to fix
each printf line of the script, for example:

OLD LINE
if($1=="ATOM" && $4=="ARG" && $3=="CA")
printf("%4s  %5i %4s %3s  %4s    %8.3f%8.3f%8.3f%6.2f%6.2f    \n",$1, $2,
"BN0", $4, $6, $7, $8, $9,$10,$11);

FIXED LINE
if($1=="ATOM" && $4=="ARG" && $3=="CA")
printf("%4s  %5i %4s %3s  %4i    %8.3f%8.3f%8.3f%6.2f%6.2f    \n",$1, $2,
"BN0", $4, $5, $6, $7, $8, $9,$10,$11);
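
The point of the fix is that every value written needs its own format specifier, in column order. As an illustration only, here is a Python sketch (not part of the awk script) that writes one bead record using the standard PDB column widths; `format_bead` and its parameter names are hypothetical.

```python
def format_bead(serial, bead, resname, chain, resseq, x, y, z, occ=0.0, temp=0.0):
    """Build one ATOM record with PDB column widths (hypothetical helper).

    An empty chain string is padded to a single blank, so the residue
    number and coordinates stay in the same columns either way.
    """
    return (
        f"ATOM  {serial:5d} {bead:>4s} {resname:3s} {chain:1s}{resseq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{temp:6.2f}"
    )


# One bead without a chain identifier and one with chain A.
print(format_bead(2, "BN0", "LEU", "", 1, 154.534, 64.963, 95.169))
print(format_bead(4, "SC1", "LEU", "A", 1, 156.589, 66.550, 95.065))
```

Because the chain identifier has its own fixed-width slot, the residue number is written in the same columns whether or not a chain letter is present.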


In another case (coming from AMBER, where the subunit indication is
omitted) with the subunit indicated, the residue numbers in the cg
file are correct. I don't see any other difference between the two
starting files. Or should I look for a different cause?

Then that's simply a matter of luck :)  The print statements in the original
awk script do not expect chain identifiers, so the printing worked due to
the extra field.

-Justin

thanks

francesco pietra
.......
--
========================================

Justin A. Lemkul
Ph.D. Candidate
ICTAS Doctoral Scholar
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================
--
gmx-users mailing list    [email protected]
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at http://www.gromacs.org/search before posting!
Please don't post (un)subscribe requests to the list. Use the www interface
or send it to [email protected].
Can't post? Read http://www.gromacs.org/mailing_lists/users.php


