Re: [PyMOL] Shell utilities for structural bioinformatics

2014-09-25 Thread James Starlight
The previous task is sorted out now, thanks.

A small question about PyMOL :)

How can I prevent PyMOL from dropping the TER record when it saves an
edited structure?

For example, I used the following command to load a protein-ligand complex
(with a TER record between protein and ligand), remove the hydrogens, and
save the result to a new PDB file. The saved file no longer contains any TER
records:

pymol $pdb -c -q -d "remove hydrogens; save ${temp}/${filenamenoextention}_noH.pdb, ${filenamenoextention}" > ${temp}/xz.log

And something about shell scripting:

How can I find the line in the PDB file that is the last line of the ligand
(the residue named RES) and add a TER record after it? The basic idea:

grep -v "ATOM.*\(RES\|MOL\)" $pdb | # something with sed > pdb_with_TER.pdb

Now I would only like to know the proper regular expression for grep and the
command for sed in my case.

James
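
One possible way to add the TER record, sketched with awk instead of grep plus sed. It assumes the ligand atoms are ATOM or HETATM records whose residue name sits in columns 18-20, as in the standard PDB layout, and that the residue is called RES. The function name add_ter_after_ligand is made up for this example.

```shell
# add_ter_after_ligand FILE [RESNAME]
# Print FILE with a TER line inserted after the last ATOM/HETATM record
# whose residue name (columns 18-20) equals RESNAME (default: RES).
add_ter_after_ligand() {
    awk -v res="${2:-RES}" '
        { line[NR] = $0 }                      # remember every line
        /^(ATOM|HETATM)/ && substr($0, 18, 3) == res { last = NR }
        END {
            for (i = 1; i <= NR; i++) {
                print line[i]
                if (i == last) print "TER"     # only after the final match
            }
        }
    ' "$1"
}
```

It could be called as add_ter_after_ligand "$pdb" > pdb_with_TER.pdb. If no line matches, the file is printed unchanged.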

PyMOL-users mailing list (PyMOL-users@lists.sourceforge.net)
Info Page: https://lists.sourceforge.net/lists/listinfo/pymol-users
Archives: http://www.mail-archive.com/pymol-users@lists.sourceforge.net

Re: [PyMOL] Shell utilities for structural bioinformatics

2014-09-24 Thread James Starlight
An additional question about shell scripting (copied from the Amber forum
because I'd like to find as many solutions to this problem as possible):


I wonder whether it is possible to define a disulphide bond between any pair
of SG atoms of CYX residues with Amber's tleap scripts in an automated
fashion.

In my case I use tleap as part of a larger script that processes many
model PDB files for subsequent MD simulation. Each model contains a pair of
CYX residues (assigned by pdb2pqr) at different positions in its sequence.
So the script first needs to find the position number of each CYX residue
in each model and then write those numbers into the tleap input file for
that model.

In bash, for one model, it would look like this:

# some command that scans the sequence of model.pdb and stores the positions
# of its two CYX residues in the variables k and i
printf "source leaprc.ff03.r1\nprotein = loadpdb model.pdb\nsetbox protein centers\nbond protein.${k}.SG protein.${i}.SG\nsaveamberparm protein protein.parm7 protein.inpcrd\nquit\n" > ./tleap.in


So my task is only to find a command that scans the model and finds the
positions of the CYX residues in its sequence, which can then be passed to
tleap as two numbers. Ideally it would take the PDB file as input and use a
Unix command such as sed or grep to find the positions.

I would be very thankful for any suggestions!

James
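
A minimal sketch of how the two CYX positions might be pulled out with awk and written into the tleap input. It reads the residue number from PDB columns 23-26 of each CYX SG atom record. This assumes that the residue numbers in the PDB file are the ones tleap will use; if tleap renumbers the residues on loading (for example because the file does not start at residue 1), the numbers would need adjusting. The function names cyx_positions and write_tleap_input are invented for this example; the tleap commands are the ones from the printf line above.

```shell
# cyx_positions FILE
# Print the residue number (PDB columns 23-26) of every CYX residue,
# one per line, taken from its SG atom record.
cyx_positions() {
    awk '/^ATOM/ && substr($0, 18, 3) == "CYX" && substr($0, 13, 4) ~ /SG/ {
             print substr($0, 23, 4) + 0       # "+ 0" strips the padding
         }' "$1"
}

# write_tleap_input PDB OUT
# Write a tleap input file OUT that bonds the two CYX SG atoms found in PDB.
write_tleap_input() {
    # Append the positions to the argument list: $3 and $4 become k and i.
    set -- "$1" "$2" $(cyx_positions "$1")
    if [ "$#" -ne 4 ]; then
        echo "expected exactly two CYX residues in $1" >&2
        return 1
    fi
    printf 'source leaprc.ff03.r1\nprotein = loadpdb %s\nsetbox protein centers\nbond protein.%s.SG protein.%s.SG\nsaveamberparm protein protein.parm7 protein.inpcrd\nquit\n' \
        "$1" "$3" "$4" > "$2"
}
```

In a loop over many models this could be used as write_tleap_input model.pdb tleap.in, with a model skipped whenever the function returns a non-zero status.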




Re: [PyMOL] Shell utilities for structural bioinformatics

2014-09-12 Thread James Starlight
Hi,

A new question.

I need a combination of shell utilities to split multi_model.pdb into
several PDB files, as well as a separate command to search multi_model.pdb
and save only one model as a separate model1.pdb. I've tried to do it using
grep:

grep '^MODEL 1' my_docking.pdb > model1.pdb

but the result was empty.

James
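
For context, grep prints only the lines that match the pattern, so even when '^MODEL 1' matches, the output is just the MODEL record and none of the atom lines that follow it. The pattern may also miss entirely if the model number is padded with more than one space. A small awk sketch that prints a whole MODEL ... ENDMDL block instead; the function name extract_model is an assumption made for this example.

```shell
# extract_model FILE N
# Print the N-th MODEL ... ENDMDL block of FILE, including both records.
extract_model() {
    awk -v n="$2" '
        /^MODEL/   { count++ }                 # entering the next model
        count == n { print }                   # inside the requested model
        /^ENDMDL/ && count == n { exit }       # stop once it is complete
    ' "$1"
}
```

For example, extract_model my_docking.pdb 1 > model1.pdb would write the first model only.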


Re: [PyMOL] Shell utilities for structural bioinformatics

2014-09-12 Thread Tsjerk Wassenaar
Hi James,

These are the sort of questions that will have been answered elsewhere, most
notably on Stack Overflow:
http://stackoverflow.com/questions/18364411/using-regex-to-tell-csplit-where-to-split-the-file

csplit -b %04d.pdb file.pdb "/^MODEL/" "{*}"

Cheers,

Tsjerk



-- 
Tsjerk A. Wassenaar, Ph.D.

Re: [PyMOL] Shell utilities for structural bioinformatics

2014-09-12 Thread James Starlight
Hi Tsjerk,

thank you very much for the help.

This is a small bioinformatics question, so it is probably better to ask it
here, where there are experts on the topic like you :)

In my case I need to process each split model further (e.g. delete some
lines or make changes) by piping it through some commands.

E.g. in my case each model after splitting consists of

MODEL 1
ROOT
ATOMS
ENDROOT
TORSDOF 0
ENDMDL

I'd like to remove the ROOT, ENDROOT and TORSDOF 0 lines and change
ENDMDL to TER.

I've tried to do it with

csplit -b %04d.pdb my_docking.pdb /^MODEL/ {*} | grep -v '^ENDROOT' |
grep -v '^TORSDOF 0' |  sed -e 's/^ENDMDL/TER/g'

but the resulting files still contain the unwanted lines.

BTW, can csplit be used to extract only ONE (e.g. the first) model from the
multi-model PDB file?

James
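
A note on why the pipeline above leaves the files untouched: csplit writes the pieces straight to files and prints only the size of each piece on standard output, so grep and sed never see any PDB lines. One way around this, as a sketch, is to split first and then clean each piece in a loop. This assumes GNU csplit (the -b and -z options and the {*} repeat count are GNU features); the output prefix model_ and the function name split_and_clean are made up for the example.

```shell
# split_and_clean FILE
# Split FILE at every MODEL record into model_NNNN.pdb files in the current
# directory, then drop the docking bookkeeping lines from each piece and
# turn ENDMDL into TER.
split_and_clean() {
    # -s: do not print piece sizes, -z: do not keep empty pieces
    csplit -s -z -f model_ -b '%04d.pdb' "$1" '/^MODEL/' '{*}' || return 1
    for piece in model_[0-9][0-9][0-9][0-9].pdb; do
        sed -e '/^ROOT/d' -e '/^ENDROOT/d' -e '/^TORSDOF/d' \
            -e 's/^ENDMDL.*/TER/' "$piece" > "$piece.tmp" &&
            mv "$piece.tmp" "$piece"
    done
}
```

Writing to a temporary file and renaming it keeps the sketch independent of sed's in-place option, which differs between sed implementations.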


Re: [PyMOL] Shell utilities for structural bioinformatics

2014-09-12 Thread Tsjerk Wassenaar
Hi James,

This is more text-file processing than it is bioinformatics. The trick is
to understand the problem, dissect it, and fit it to your toolbox on Linux.
That's actually much of bioinformatics :)

The first thing to understand is what data you have and what data you need
to have in the end. That will determine the tools and how to use them. To
extract the first part of a file, up to and including a tag (ENDMDL), you
could use sed:

sed /^ENDMDL/q models.pdb > firstmodel.pdb

While at it, you can also delete the lines you don't want:

sed -e /^ROOT/d -e /^ENDROOT/d -e /^TORSDOF/d -e /^ENDMDL/q models.pdb > firstmodel.pdb

For bioinformatics, it really pays off to read up on sed and awk.

As for the other question, yes, csplit can be used to extract one, or a
number of blocks. The {*} indicates that all blocks are to be written. {10}
indicates the first ten blocks are to be written. Check the help to see how
to use csplit to extract a specific block. I just read up on it now to be
able to answer your question. I didn't know this about csplit when I woke
up this morning.

Cheers,

Tsjerk


Re: [PyMOL] Shell utilities for structural bioinformatics

2014-09-12 Thread Marko Hyvonen
Hi James,

How about

egrep -v "MODEL 1|ROOT|ATOMS|ENDROOT|TORSDOF" myoriginalfile | sed 's/ENDMDL/TER/' > mynewfile

-v in egrep inverts the selection, so you get all lines except the ones
that match the expression. Without it you get _only_ the lines that match
those expressions.
(egrep might be grep -E, or something similar, depending on the OS or its
variant)

And sed (stream editor) then s(ubstitutes) ENDMDL with TER.

hth, Marko



Re: [PyMOL] Shell utilities for structural bioinformatics

2014-09-12 Thread James Starlight
Thank you very much!

James

2014-09-12 12:36 GMT+02:00 Marko Hyvonen mh...@cam.ac.uk:

 On 12/09/2014 11:26, James Starlight wrote:

 grep -v ^ROOT\|^ENDROOT\|^TORSDOF 0\^MODEL\^REMARK|


 I think you are missing a few | in there:

 grep -v "^ROOT\|^ENDROOT\|^TORSDOF 0\|^MODEL\|^REMARK"

 and you can get rid of the \ by switching to extended regular expressions
 (grep -E) and using single quotation marks:
 grep -Ev '^ROOT|^ENDROOT|^TORSDOF 0|^MODEL|^REMARK'

 Marko


 --

  Marko Hyvonen
  Department of Biochemistry, University of Cambridge
  mh...@cam.ac.uk
  http://www-cryst.bioc.cam.ac.uk/groups/hyvonen
  tel:+44-(0)1223-766 044



Re: [PyMOL] Shell utilities for structural bioinformatics

2014-09-12 Thread Tsjerk Wassenaar
 csplit -b %03d.pdb test.pdbqt /^MODEL/ {0} > somelog.log


man csplit:

csplit -f blabla -b %03d.pdb test.pdbqt /^MODEL/ {1}

But you want only the first frame anyway, so no real use for csplit...

sed /^ENDMDL/q my_docking.pdb | grep -v "^ROOT\|^ENDROOT\|^TORSDOF 0\|^MODEL\|^REMARK" | sed -e 's/^ENDMDL/TER/g' > firstmodel.pdb


sed -e '/^ENDMDL/{s/^.*/TER/;q;}' -e '/^\(ROOT\|ENDROOT\|TORSDOF 0\|MODEL\|REMARK\)/d' my_docking.pdb > firstmodel.pdb

... shorter, and one process running instead of 3.

Cheers,

Tsjerk

-- 
Tsjerk A. Wassenaar, Ph.D.

Re: [PyMOL] Shell utilities for structural bioinformatics

2014-09-08 Thread James Starlight
Thank you very much!

James

2014-09-05 20:18 GMT+02:00 Folmer Fredslund folm...@gmail.com:

 Hi

 Small correction to Gianluca's suggestion:

 >  will direct the output to a file, overwriting the contents
 >> will direct the output to a file, appending the contents

 Kind regards,
 Folmer Fredslund
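
The difference between the two redirections can be checked on a scratch file. This is only an illustrative sketch; the variable name demo and the three words written are arbitrary.

```shell
demo=$(mktemp)             # scratch file for the demonstration
echo "first"  > "$demo"    # > truncates the file, then writes
echo "second" > "$demo"    # the earlier line is gone
echo "third"  >> "$demo"   # >> keeps the contents and adds at the end
cat "$demo"                # prints "second" and "third"
```

So for adding lipid lines to an existing tar_i.pdb, >> is the one that keeps the original atoms.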

Re: [PyMOL] Shell utilities for structural bioinformatics

2014-09-05 Thread Gianluca Santoni
You don't even need cat,
simply do

grep PPC ref.pdb > tar_i.pdb

redirecting std out with > appends it directly to the file (after the
last line)

Cheers
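
A slightly stricter variant of the grep command above, as a sketch. A plain grep PPC matches the string anywhere on a line (including REMARK lines), whereas the version below tests only the residue-name field in columns 18-20 of ATOM and HETATM records. It uses >>, which appends to the target file, while > would replace its contents. The function name append_residue is an assumption made for this example.

```shell
# append_residue REF TARGET [RESNAME]
# Append to TARGET every ATOM/HETATM record of REF whose residue name
# (PDB columns 18-20) is RESNAME (default: PPC).
append_residue() {
    awk -v res="${3:-PPC}" \
        '/^(ATOM|HETATM)/ && substr($0, 18, 3) == res' "$1" >> "$2"
}
```

For the files in this thread the call would be append_residue ref.pdb tar_i.pdb. If tar_i.pdb ends with an END record, the appended lines land after it, so that record may need to be moved afterwards.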

On 9/5/14 6:48 PM, James Starlight wrote:
 Dear PyMOL users!

 I've decided to open a new topic on using common shell utilities like
 grep, awk and sed for structural bioinformatics tasks such as processing
 and editing large sets of PDB files.

 In my current task I need to copy all lipids from one PDB file (call it
 ref) to another (call it tar_i.pdb); both files have the same 3D shape and
 have been superimposed beforehand. I guess the lipids can be recognized by
 their residue name in the PDB file (PPC), which is in column #4 (which is
 what grep would match).  So the algorithm might be: select from ref.pdb
 all lines where column #4 is PPC and merge them (by means of cat, I guess)
 with tar_i.pdb. Please show me an example of a one-line way to do this.

 Thanks,

 James





-- 
Gianluca Santoni,
Dynamop Group
Institut de Biologie Structurale
6 rue Jules Horowitz
38027 Grenoble Cedex 1  
France  
_
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
