Re: [PyMOL] Shell utilities for structural bioinformatics

Marko Hyvonen Fri, 12 Sep 2014 03:17:56 -0700

Hi James,

How about


egrep -v "MODEL 1|ROOT|ATOMS|ENDROOT|TORSDOF" myoriginalfile |
  sed 's/ENDMDL/TER/' > mynewfile

-v in egrep inverts the selection, so you get all lines except the
ones matching the expression.  Without it you _only_ "grep" the lines
matching those expressions.
(egrep is equivalent to grep -E; depending on your OS or grep variant
you may need to use that form instead.)
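
To make the effect of -v concrete, here is a minimal sketch on a made-up file. The name sample.pdbqt and its contents are assumptions that only mimic the record types James listed. The patterns are anchored with ^ on the assumption that the keywords start the line; unanchored, ROOT would also match ENDROOT.

```shell
# Hypothetical sample with the record types from the question
printf '%s\n' 'MODEL 1' 'ROOT' 'ATOM      1  C   LIG     1' \
  'ENDROOT' 'TORSDOF 0' 'ENDMDL' > sample.pdbqt

# Without -v: only the matching lines are printed
grep -E '^(ROOT|ENDROOT|TORSDOF)' sample.pdbqt > matched.txt

# With -v: everything except the matching lines is printed
grep -E -v '^(ROOT|ENDROOT|TORSDOF)' sample.pdbqt > kept.txt
```

Here matched.txt holds the ROOT, ENDROOT and TORSDOF lines, while kept.txt holds the MODEL, ATOM and ENDMDL lines.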

And sed (stream editor) then s(ubstitutes) ENDMDL with TER.
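
On the csplit pipeline from the question: csplit writes the pieces to files and prints only their sizes on standard output, so piping it into grep and sed never touches the files themselves. One way round this is to split first and then filter each piece. The sketch below assumes GNU csplit (for -b, -z and {*}), its default output prefix xx, and a made-up two-model my_docking.pdb; the clean_ output names are hypothetical.

```shell
# Hypothetical two-model input with the record types from the question
printf '%s\n' \
  'MODEL 1' 'ROOT' 'ATOM      1  C   LIG     1' 'ENDROOT' 'TORSDOF 0' 'ENDMDL' \
  'MODEL 2' 'ROOT' 'ATOM      1  C   LIG     1' 'ENDROOT' 'TORSDOF 0' 'ENDMDL' \
  > my_docking.pdb

# Split at every line starting with MODEL (GNU csplit assumed).
# -s silences the size report, -z drops the empty piece before the first MODEL.
csplit -s -z -b '%04d.pdb' my_docking.pdb '/^MODEL/' '{*}'

# Filter each piece separately; the pieces are named xx0000.pdb, xx0001.pdb, ...
for f in xx[0-9][0-9][0-9][0-9].pdb; do
  grep -E -v '^(ROOT|ENDROOT|TORSDOF)' "$f" |
    sed 's/^ENDMDL/TER/' > "clean_${f#xx}"
done

# Only the first model: print up to the first ENDMDL line, then quit
sed '/^ENDMDL/q' my_docking.pdb > model1.pdb
```

The last command does not need csplit at all; it copies everything up to and including the first ENDMDL, so any header lines before MODEL 1 would be kept too.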

hth, Marko


On 12/09/2014 11:00, James Starlight wrote:
> Hi Tsjerk,
>
> thank you very much for help.
>
> This is a small bioinformatics question, so it is probably better to
> ask it here, where there are experts on the topic like you :)
>
> In my case I need to process each split model further (e.g. delete
> some lines or make changes) by piping it through some commands.
>
> E.g. in my case each model after splitting consists of
>
> MODEL 1
> ROOT
> ATOMS
> ENDROOT
> TORSDOF 0
> ENDMDL
>
> I'd like to remove the ROOT, ENDROOT and TORSDOF 0 lines and
> change ENDMDL to TER.
>
> I've tried to do it with
>
> csplit -b "%04d.pdb" my_docking.pdb /^MODEL/ {*} | grep -v '^ENDROOT' 
> | grep -v '^TORSDOF 0' |  sed -e 's/^ENDMDL/TER/g'
>
> but the resulting files still contain the unwanted lines.
>
> BTW, can csplit be used to extract only ONE (e.g. the first) model
> from the multi-PDB file?
>
> James
>
> 2014-09-12 11:39 GMT+02:00 Tsjerk Wassenaar <[email protected] 
> <mailto:[email protected]>>:
>
>     Hi James,
>
>     These are the sort of questions that'll be answered elsewhere.
>     Most notably on stackoverflow:
>     http://stackoverflow.com/questions/18364411/using-regex-to-tell-csplit-where-to-split-the-file
>
>     csplit -b "%04d.pdb" file.pdb /^MODEL/ {*}
>
>     Cheers,
>
>     Tsjerk
>
>
>     On Fri, Sep 12, 2014 at 11:25 AM, James Starlight
>     <[email protected] <mailto:[email protected]>> wrote:
>
>         Hi,
>
>         A new question.
>
>         I need some combination of shell utilities to split
>         multi_model.pdb into several PDB files, as well as a separate
>         command to search multi_model.pdb for one model and save only
>         that model as a separate model1.pdb. I've tried to do it using grep:
>         grep '^MODEL 1' my_docking.pdb > model1.pdb
>
>         but the result was empty.
>
>         James
>
>         2014-09-08 15:48 GMT+02:00 James Starlight
>         <[email protected] <mailto:[email protected]>>:
>
>             Thank you very much!
>
>             James
>
>             2014-09-05 20:18 GMT+02:00 Folmer Fredslund
>             <[email protected] <mailto:[email protected]>>:
>
>                 Hi
>
>                 Small correction to Gianluca's suggestion
>
>                 ">" will direct the output to a file, overwriting the
>                 contents
>                 ">>" will direct the output to a file, appending the
>                 contents
>
>                 Kind regards
>                 Folmer Fredslund
>
>                 On 05/09/2014 19.16, "Gianluca Santoni"
>                 <[email protected]
>                 <mailto:[email protected]>> wrote:
>
>                     Don't even need cat
>                     simply do
>
>                     grep PPC ref.pdb > tar_i.pdb
>
>                     redirecting std out with > appends it directly to
>                     the file (after the
>                     last line)
>
>                     Cheers
>
>                     On 9/5/14 6:48 PM, James Starlight wrote:
>                     > Dear PyMOL users!
>                     >
>                     > I've decided to open a new topic on using common
>                     > shell utilities like grep, awk and sed for
>                     > structural bioinformatics tasks such as processing
>                     > and editing large sets of PDB files.
>                     >
>                     > In my current task I need to copy all lipids from
>                     > one PDB file (call it ref.pdb) to another (call it
>                     > tar_i.pdb); both files have the same 3D shape and
>                     > have been superimposed beforehand. I guess the
>                     > lipids could be recognized by the residue name
>                     > (PPC) in column #4 of the PDB file, which is the
>                     > kind of thing grep does. So the algorithm might be:
>                     > select from ref.pdb all lines where column #4 is
>                     > PPC and merge them (by means of cat, I guess) with
>                     > tar_i.pdb. Please show me an example of a one-line
>                     > way to do this.
>                     >
>                     > Thanks,
>                     >
>                     > James
>                     >
>                     >
>                     >
>                     
>                     >
>                     >
>                     >
>                     > _______________________________________________
>                     > PyMOL-users mailing list
>                     ([email protected]
>                     <mailto:[email protected]>)
>                     > Info Page:
>                     https://lists.sourceforge.net/lists/listinfo/pymol-users
>                     > Archives:
>                     
> http://www.mail-archive.com/[email protected]
>                     >
>
>
>                     --
>                     Gianluca Santoni,
>                     Dynamop Group
>                     Institut de Biologie Structurale
>                     6 rue Jules Horowitz
>                     38027 Grenoble Cedex 1
>                     France
>                     _________________________________________________________
>                     Please avoid sending me Word or PowerPoint
>                     attachments.
>                     See
>                     http://www.gnu.org/philosophy/no-word-attachments.html
>
>                     
>
>
>                 
>
>
>
>
>         
>
>
>
>
>     -- 
>     Tsjerk A. Wassenaar, Ph.D.
>
>
>
>


-- 

  Marko Hyvonen
  Department of Biochemistry, University of Cambridge
  [email protected]
  http://www-cryst.bioc.cam.ac.uk/groups/hyvonen
  tel:    +44-(0)1223-766 044
  


_______________________________________________
PyMOL-users mailing list ([email protected])
Info Page: https://lists.sourceforge.net/lists/listinfo/pymol-users
Archives: http://www.mail-archive.com/[email protected]
