[Rdkit-discuss] smiview 1.2

Andrew Dalke Tue, 03 Apr 2018 15:30:05 -0700

About 10 days ago I posted a prototype program called 'smiview', which displays 
information about the structure of a SMILES string.


Thanks to feedback from a couple of users, and a deep urge to explore the idea, 
I've just released smiview 1.2, available from 
https://bitbucket.org/dalke/smiview/downloads/smiview-1.2.tar.gz .

For details about what it can do, see the README at 
https://bitbucket.org/dalke/smiview .

Some of the changes are:
 - lots of bug fixes
 - the SMILES tokenizer will now try to parse the contents of an atom in []s
 - the atom indicators now point to the first character of the element symbol(s)
     rather than the first character of the atom token (e.g., the "C" in "[35Cl]"
     and not the "[")
 - the 'closures' track now highlights the atoms involved in a minimal cycle
     for a closure, rather than the SMILES string between the two closure points
 - more control over some of the styles
 - there is now code to generate the molecular graph, which means smiview can
     also report errors like C11 (closure to itself) and C1C1 (two bonds
     between atoms)
 - new tracks, like "hcounts" to show the number of implicit hydrogens on each
     atom, and "symclasses" to show each atom's symmetry class
 - support for both RDKit and OEChem, or no toolkit, albeit with reduced
     functionality
 - options to modify the input SMILES so all atoms have explicit hydrogen
     counts, and to set the isotope and atom class fields based on the atom
     index, symmetry class, or element number
 - cleaned up and re-organized the internals. It now uses an experimental
     property calculation dependency system, and has a "track manager" to
     organize the tracks.
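
To illustrate the kind of graph-level check mentioned in the list above (C11 and C1C1), here is a minimal Python sketch. It is not smiview's code: the tokenizer, the function name closure_errors, and the error labels are all assumptions made for this example, and it handles only a small SMILES subset (bracket atoms, common organic atoms, branches, bond symbols, and single-digit closures; no "%nn" closures and no "." disconnections).

```python
import re

# Simplified tokenizer (an assumption for this sketch, not smiview's tokenizer).
TOKEN = re.compile(
    r"(?P<atom>\[[^\]]+\]|Cl|Br|[BCNOPSFIbcnops])"
    r"|(?P<closure>\d)"
    r"|(?P<open>\()"
    r"|(?P<close>\))"
    r"|(?P<bond>[-=#:/\\])"
)

def closure_errors(smiles):
    """Return a list of (message, atom_a, atom_b) tuples for bad closures."""
    errors = []
    bonds = set()         # frozenset({a, b}) for every bond seen so far
    pending = {}          # closure digit -> index of the atom that opened it
    stack = []            # atom to return to when a branch closes
    prev = None           # atom that the next atom will bond to
    natoms = 0
    pos = 0
    while pos < len(smiles):
        m = TOKEN.match(smiles, pos)
        if m is None:
            raise ValueError(f"unsupported character at offset {pos}")
        pos = m.end()
        kind = m.lastgroup
        if kind == "atom":
            idx = natoms
            natoms += 1
            if prev is not None:
                bonds.add(frozenset((prev, idx)))
            prev = idx
        elif kind == "closure":
            if prev is None:
                raise ValueError("closure before the first atom")
            digit = m.group()
            if digit not in pending:
                pending[digit] = prev
                continue
            other = pending.pop(digit)
            pair = frozenset((other, prev))
            if other == prev:
                errors.append(("closure to itself", other, prev))
            elif pair in bonds:
                errors.append(("two bonds between atoms", other, prev))
            else:
                bonds.add(pair)
        elif kind == "open":
            stack.append(prev)
        elif kind == "close":
            prev = stack.pop()
        # bond symbols are accepted but ignored in this sketch
    return errors

if __name__ == "__main__":
    for smi in ("C11", "C1C1", "C1CC1"):
        print(smi, closure_errors(smi))
```

The duplicate-bond check relies on recording every bond as an unordered pair of atom indices, so a closure that repeats an existing chain bond is caught by a simple set lookup.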

Here's what it looks like with most of the tracks enabled (which is rather 
overwhelming):

% smiview 'Cn1c(=O)c2c(ncn2C)n(C)c1=O' --fancy
            ┌                   1 1 1  1
       atoms│ 01 2  3 4 5 678 9 0 1 2  3
            └ || |  | | | ||| | | | |  |
byte offsets┌           1    1    2    2
            └ 0    5    0    5    0    5
 token types[ AA%A(BA)A%A(AAA%A)A(A)A%BA
      SMILES[ Cn1c(=O)c2c(ncn2C)n(C)c1=O
     hcounts[ 30 0  0 0 0 010 3 0 3 0  0
    branches┌    *(..)  *(.....)
            └                   *(.)
    closures┌  *1*    * *       *   *1
            └         *2*.***2 .
   fragments[ 00000000000000000000000000
  symclasses┌ 01 7  3 9 1 651 1 1 2 8  4
            └  1        0   2   3 

I'll focus on just the closures, and give more emphasis to the element symbols 
which make up either end of the closure (marked with a "*") while the other 
atoms in the closure ring are marked with an "x":

% smiview 'Cn1c(=O)c2c(ncn2C)n(C)c1=O' -b closures --closure-atom-style end-elements
        ┌                   1 1 1  1
   atoms│ 01 2  3 4 5 678 9 0 1 2  3
        └ || |  | | | ||| | | | |  |
  SMILES[ Cn1c(=O)c2c(ncn2C)n(C)c1=O
closures┌  *1x    x x       x   *1
        └         *2x.xx*2 .

With a bit of counting of *'s and x's you can see there's a ring of size 6 and 
another of size 5.
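
The counting can also be done programmatically. The following Python sketch finds, for each closure, a shortest cycle through the two closure atoms by running a breadth-first search that is not allowed to use the closure bond itself. This is only one way to get a "minimal cycle for a closure" and is not necessarily how smiview does it; the parser, the function names, and the supported SMILES subset (no "%nn" closures, no "." disconnections) are assumptions of this example.

```python
import re
from collections import deque

# Simplified tokenizer (an assumption for this sketch, not smiview's tokenizer).
TOKEN = re.compile(
    r"(?P<atom>\[[^\]]+\]|Cl|Br|[BCNOPSFIbcnops])"
    r"|(?P<closure>\d)|(?P<open>\()|(?P<close>\))|(?P<bond>[-=#:/\\])"
)

def parse(smiles):
    """Return (adjacency, closures) for a valid SMILES in the subset above."""
    adjacency = {}   # atom index -> set of neighbor atom indices
    closures = []    # (digit, opening atom, closing atom), in closing order
    pending = {}     # closure digit -> atom that opened it
    stack = []       # atom to return to when a branch closes
    prev = None
    pos = 0
    while pos < len(smiles):
        m = TOKEN.match(smiles, pos)
        if m is None:
            raise ValueError(f"unsupported character at offset {pos}")
        pos = m.end()
        kind = m.lastgroup
        if kind == "atom":
            idx = len(adjacency)
            adjacency[idx] = set()
            if prev is not None:
                adjacency[prev].add(idx)
                adjacency[idx].add(prev)
            prev = idx
        elif kind == "closure":
            digit = m.group()
            if digit in pending:
                start = pending.pop(digit)
                adjacency[start].add(prev)
                adjacency[prev].add(start)
                closures.append((digit, start, prev))
            else:
                pending[digit] = prev
        elif kind == "open":
            stack.append(prev)
        elif kind == "close":
            prev = stack.pop()
    return adjacency, closures

def shortest_ring(adjacency, a, b):
    """Shortest path from a to b that avoids the direct a-b bond."""
    parent = {a: None}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            break
        for nbr in sorted(adjacency[node]):
            if {node, nbr} == {a, b}:
                continue  # do not walk across the closure bond itself
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    if b not in parent:
        return None
    path = []
    node = b
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def closure_rings(smiles):
    adjacency, closures = parse(smiles)
    return [(digit, shortest_ring(adjacency, a, b)) for digit, a, b in closures]

if __name__ == "__main__":
    for digit, ring in closure_rings("Cn1c(=O)c2c(ncn2C)n(C)c1=O"):
        print(f"closure {digit}: atoms {ring} (ring size {len(ring)})")
```

The atom indices in the result use the same numbering as the "atoms" track, so the two lists can be compared directly against the "*" and "x" marks in the closures track.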

Here's an example of the input syntax processing; I'll convert all of the atoms 
to use the bracket form, by adding the correct hydrogen count to each 
non-bracket atom:

% smiview 'Cn1c(=O)c2c(ncn2C)n(C)c1=O' --use-brackets -a input-smiles -b none --width 80
input smiles[  C    n 1 c (= O ) c 2 c ( n  c   n 2 C   ) n ( C   ) c 1= O
      SMILES[ [CH3][n]1[c](=[O])[c]2[c]([n][cH][n]2[CH3])[n]([CH3])[c]1=[O]

If you want to use the modified SMILES string from another program, or to copy
and paste it, then turn off the legend and use a large enough width. Here I'll
also set the isotope to the atom index+1, which might be used as a way to tag
the atoms:

% smiview 'Cn1c(=O)c2c(ncn2C)n(C)c1=O' --set-isotope index+1 -a none -b none --width 100000 --legend off
[1CH3][2n]1[3c](=[4O])[5c]2[6c]([7n][8cH][9n]2[10CH3])[11n]([12CH3])[13c]1=[14O]
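
For comparison, the index+1 tagging can be imitated with a few lines of Python once every atom is already in bracket form. This sketch is plain string rewriting, not smiview's implementation; the function name tag_isotopes is made up for the example, and it assumes that every atom is a bracket atom with no existing isotope (as in the --use-brackets output shown earlier).

```python
import re
from itertools import count

def tag_isotopes(bracket_smiles, start=1):
    """Insert a running number as the isotope of each bracket atom.

    Assumes every atom is written in brackets and none has an isotope yet.
    """
    counter = count(start)
    # Each "[" opens one atom, so number the atoms in order of appearance.
    return re.sub(r"\[", lambda m: f"[{next(counter)}", bracket_smiles)

if __name__ == "__main__":
    smiles = "[CH3][n]1[c](=[O])[c]2[c]([n][cH][n]2[CH3])[n]([CH3])[c]1=[O]"
    print(tag_isotopes(smiles))
```

Because SMILES atoms appear in index order, counting opening brackets from left to right gives the atom index, and starting the counter at 1 gives index+1.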

Let me know what you think.


                                Andrew
                                da...@dalkescientific.com


_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
