Over the last few days I've developed a command-line tool that I call "smiview".

It's a SMILES viewer. It isn't a depiction tool that takes SMILES as input, 
but rather a tool that highlights different aspects of the SMILES string itself.

I'll put some examples at the end. If you want to try it out you can download 
it from

  https://bitbucket.org/dalke/smiview/downloads/smiview-1.1.tar.gz

or see the README from the project page at

  https://bitbucket.org/dalke/smiview

As the README says, this was mostly built for fun. I would like to know if you 
use it for serious work.

I have no long-term plans for this project.

I think it would be cool (perhaps for pedagogical reasons) to have a GUI 
version in an IPython notebook, tied to a graphical depiction so you can see 
the connection between the SMILES terms and the depiction. I don't have those 
skills, so if it's something you want to do, you might look to this code as a 
starting point.

It was developed under Python 3 and it seems to work under Python 2.7.

Enjoy!


                                Andrew
                                da...@dalkescientific.com


The default view highlights and numbers the atom locations, shows the 
branches and the atom each branch is attached to, shows the closures 
(highlighting the closure atom and the closure indicator on both sides of the 
closure), and shows which parts of the SMILES string correspond to which 
fragments.

% smiview 'C#CCC[N+](C)(C)CCCCCCCCCCCC[N+](C)(C)CCC#C.Cc1ccc(S(=O)(=O)[O-])cc1.Cc1ccc(S(=O)(=O)[O-])cc1'
         ┌                   1111111111    2  2 222 2 22 223 3  3   3 3
    atoms│ 0 1234    5  6 7890123456789    0  1 234 5 67 890 1  2   3 4
         └ | ||||    |  | |||||||||||||    |  | ||| | || ||| |  |   | |
   SMILES[ C#CCC[N+](C)(C)CCCCCCCCCCCC[N+](C)(C)CCC#C.Cc1ccc(S(=O)(=O)[O
 branches┌      *---(.)(.)            *---(.)(.)           *(...........
         └                                                   *(..)(..)
 closures┌                                            1*↑-------- 1 ----
         └
         ┌ 000000000000000000000000000000000000000000
fragments│                                            111111111111111111
         └

         ┌    33  33 344 4  4   4 4    44
    atoms│    56  78 901 2  3   4 5    67
         └    ||  || ||| |  |   | |    ||
   SMILES[ -])cc1.Cc1ccc(S(=O)(=O)[O-])cc1
 branches┌ ..)         *(.............)
         └               *(..)(..)
 closures┌ ----*↑1
         └        1*↑-------- 1 --------*↑1
         ┌
fragments│ 111111
         └        222222222222222222222222


(Thanks to Greg for feedback on the 'branches' visualization, and for 
suggesting the 'fragments' track.)
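
To make the 'atoms' and 'fragments' tracks a bit more concrete, here is a 
minimal Python sketch of how atom offsets and fragment numbers could be 
recovered from a SMILES string. This is not smiview's code: the ATOM pattern 
and the atom_table() function are hypothetical names, the pattern only covers 
bracket atoms, the organic subset, aromatic atoms and '*', and it assumes 
every '.' starts a new fragment (which ignores closures that reconnect 
dot-separated parts).

```python
import re

# Hypothetical, simplified atom pattern (an assumption, not smiview's grammar):
# bracket atoms, two-letter organic atoms, one-letter organic atoms,
# aromatic atoms, and the '*' wildcard.
ATOM = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|\*")

def atom_table(smiles):
    """Return a list of (atom index, string offset, fragment number)."""
    rows = []
    for index, match in enumerate(ATOM.finditer(smiles)):
        offset = match.start()
        # Assumption: each '.' to the left of the atom starts a new fragment.
        fragment = smiles.count(".", 0, offset)
        rows.append((index, offset, fragment))
    return rows

if __name__ == "__main__":
    smiles = "Cc1ccc(S(=O)(=O)[O-])cc1.Cc1ccc(S(=O)(=O)[O-])cc1"
    for index, offset, fragment in atom_table(smiles):
        print(index, offset, fragment)
```

Bracket atoms are listed first in the pattern so that the 'O' inside '[O-]' 
is counted once, as part of the bracket atom, and not as a separate atom.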


If you give it a SMARTS pattern it will show where the corresponding atoms are:

% smiview 'CC1CC2C3CCC4=CC(=O)C=CC4(C)C3(F)C(O)CC2(C)C1(O)C(=O)CO' --smarts '*=O'
       ┌                  1 1 11  1 1  1 1 1 12  2 2  2 2  2 22
  atoms│ 01 23 4 567  89  0 1 23  4 5  6 7 8 90  1 2  3 4  5 67
       └ || || | |||  ||  | | ||  | |  | | | ||  | |  | |  | ||
 SMILES[ CC1CC2C3CCC4=CC(=O)C=CC4(C)C3(F)C(O)CC2(C)C1(O)C(=O)CO
match 1[               *  *
match 2[                                                *  *


If you give it an atom index, it shows the neighbors around that index, along 
with a summary of how the atom is attached to those neighbors:


% smiview 'NC(=O)c1nonc1CNC(c1c[nH]cn1)C1CCNCC1' --atom-index 10
         ┌                1 1 11   11  1 11122
    atoms│ 01  2 3 4567 890 1 23   45  6 78901
         └ ||  | | |||| ||| | ||   ||  | |||||
   SMILES[ NC(=O)c1nonc1CNC(c1c[nH]cn1)C1CCNCC1
neighbors┌               ^X ^          ^
         └                C(-N9)(-c11)(-C16)
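
The neighbor summary depends on knowing which atoms are bonded to each other, 
which means following chain bonds, branches and closures. Here is a rough 
Python sketch of that bookkeeping, under the assumption of a simplified SMILES 
grammar. The TOKEN pattern and the neighbors() function are hypothetical names 
and not part of smiview. Implicit bonds are reported as an empty string, where 
smiview prints an explicit symbol.

```python
import re

# Hypothetical, simplified tokenizer (an assumption, not smiview's parser).
TOKEN = re.compile(
    r"(?P<atom>\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|\*)"
    r"|(?P<closure>%\d\d|\d)"
    r"|(?P<bond>[-=#$:/\\])"
    r"|(?P<other>[().])"
)

def neighbors(smiles):
    """Return {atom index: [(bond symbol, neighbor index), ...]}."""
    graph = {}
    stack = []        # atoms that have an open branch
    open_rings = {}   # closure label -> (atom index, bond symbol)
    prev = None       # atom that the next atom will bond to
    bond = ""         # pending explicit bond symbol, "" if implicit
    for m in TOKEN.finditer(smiles):
        kind, text = m.lastgroup, m.group()
        if kind == "atom":
            index = len(graph)
            graph[index] = []
            if prev is not None:
                graph[prev].append((bond, index))
                graph[index].append((bond, prev))
            prev, bond = index, ""
        elif kind == "bond":
            bond = text
        elif kind == "closure":
            if text in open_rings:
                # Second occurrence of the label: connect the two atoms.
                other, other_bond = open_rings.pop(text)
                symbol = bond or other_bond
                graph[other].append((symbol, prev))
                graph[prev].append((symbol, other))
            else:
                open_rings[text] = (prev, bond)
            bond = ""
        elif text == "(":
            stack.append(prev)   # remember where the branch is attached
        elif text == ")":
            prev = stack.pop()   # return to the branch atom
        else:
            prev = None          # "." means no bond to the next atom
    return graph

if __name__ == "__main__":
    graph = neighbors("NC(=O)c1nonc1CNC(c1c[nH]cn1)C1CCNCC1")
    print(graph[10])
```

For atom 10 of the SMILES above this reports atoms 9, 11 and 16 as 
neighbors, the same three atoms shown in the neighbors track.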


You can modify how things look. In the following I'll have it show the offset 
to each character in the SMILES string on the top, and the atom indices on the 
bottom.


% smiview 'NC(=O)c1nonc1CNC(c1c[nH]cn1)C1CCNCC1' -a offsets -b atoms -b closures
byte offsets┌           1    1    2    2    3    3
            └ 0    5    0    5    0    5    0    5
      SMILES[ NC(=O)c1nonc1CNC(c1c[nH]cn1)C1CCNCC1
            ┌ ||  | | |||| ||| | ||   ||  | |||||
       atoms│ 01  2 3 4567 891 1 11   11  1 11122
            └                0 1 23   45  6 78901


Here's the output for buckminsterfullerene:

% smiview 'c12c3c4c5c1c6c7c8c2c9c1c3c2c3c4c4c%10c5c5c6c6c7c7c%11c8c9c8c9c1c2c1c2c3c4c3c4c%10c5c5c6c6c7c7c%11c8c8c9c1c1c2c3c2c4c5c6c3c7c8c1c23' --legend once
         ┌                       1 1 1 1 1 1 1   1 1 1 2 2 2 2   2 2 2 2
    atoms│  0  1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6   7 8 9 0 1 2 3   4 5 6 7
         └  |  | | | | | | | | | | | | | | | |   | | | | | | |   | | | |
   SMILES[  c12c3c4c5c1c6c7c8c2c9c1c3c2c3c4c4c%10c5c5c6c6c7c7c%11c8c9c8c
 branches[
         ┌ 1*↑-------*↑1  8*↑------ 8 ------- 8 ------- 8 -------*↑8  9*
         │ 2*-↑----- 2 ------*↑2    2*↑------ 2 ------- 2 ------- 2 ----
         │    3*↑------- 3 --------*↑3    4*↑------- 4 ------- 4 -------
         │      4*↑--------- 4 ----------*↑4      5*↑------- 5 --------
         │        5*↑------- 5 ------- 5 --------*↑5  6*↑------- 6 -----
 closures│            6*↑------- 6 ------- 6 --------*↑6  7*↑------- 7 -
         │              7*↑------- 7 --------- 7 --------*↑7        8*↑-
         │                    9*↑------ 9 ------ 9 ------- 9 ------*↑9
         │                      1*↑------- 1 ------- 1 -------- 1 ------
         │                            3*↑------- 3 -------- 3 -------- 3
         │                                 10*↑------- 10 -------- 10 --
         └                                                 11*↑------- 1
fragments[  000000000000000000000000000000000000000000000000000000000000

 2 2 3 3 3 3 3 3 3   3 3 3 4 4 4 4   4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5
 8 9 0 1 2 3 4 5 6   7 8 9 0 1 2 3   4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
 | | | | | | | | |   | | | | | | |   | | | | | | | | | | | | | | | |
9c1c2c1c2c3c4c3c4c%10c5c5c6c6c7c7c%11c8c8c9c1c1c2c3c2c4c5c6c3c7c8c1c23

↑------- 9 -------- 9 -------- 9 --------*↑9      2*↑----- 2 ------*↑2
---*↑2      3*↑------ 3 ------ 3 ------- 3 ------*↑3      3*↑------*-↑3
- 4 -------*↑4        5*↑------- 5 --------- 5 --------*↑5
5 -------- 5 --------*↑5  6*↑------- 6 ------- 6 --------*↑6
--- 6 -------- 6 --------*↑6  7*↑------- 7 ------- 7 --------*↑7
------- 7 -------- 7 --------*↑7      8*↑--------- 8 ----------*↑8
------ 8 ------- 8 -------- 8 -------*↑8    1*↑------- 1 --------*↑1
    1*↑------ 1 ------- 1 ------- 1 -------*↑1
-*↑1  2*↑------- 2 ------- 2 -------- 2 -------*↑2
 --------*↑3  4*↑------ 4 ------- 4 ------- 4 -------*↑4
----- 10 --------*↑10
1 -------- 11 ------- 11 --------*↑11
0000000000000000000000000000000000000000000000000000000000000000000000
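
The closures track above pairs each closure label with its partner, and 
labels such as 1 and 2 get reused once they have been closed. Here is a small 
Python sketch of that pairing, assuming a simplified SMILES grammar. The TOKEN 
pattern and ring_closures() function are hypothetical names, not smiview's 
implementation.

```python
import re

# Hypothetical, simplified tokenizer (an assumption, not smiview's parser).
# Bracket atoms come first so digits inside them are not seen as closures.
TOKEN = re.compile(
    r"(?P<atom>\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|\*)"
    r"|(?P<closure>%\d\d|\d)"
)

def ring_closures(smiles):
    """Return (opening atom, closing atom, label) for each ring closure."""
    pairs = []
    open_labels = {}   # label -> index of the atom that opened it
    atom = -1          # index of the most recent atom
    for m in TOKEN.finditer(smiles):
        if m.lastgroup == "atom":
            atom += 1
            continue
        label = m.group().lstrip("%")
        if label in open_labels:
            # Second occurrence closes the ring and frees the label for reuse.
            pairs.append((open_labels.pop(label), atom, label))
        else:
            open_labels[label] = atom
    if open_labels:
        raise ValueError("unclosed ring labels: %s" % sorted(open_labels))
    return pairs

if __name__ == "__main__":
    print(ring_closures("Cc1ccc(S(=O)(=O)[O-])cc1"))
    print(ring_closures("c1ccccc1c1ccccc1"))
```

The pairs come back in the order in which the closures are completed, so a 
reused label shows up once for each ring it closes.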







_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
