Re: [Cdk-user] CDK support in chemfp 3.5b1

2021-01-20 Thread Egon Willighagen
On Wed, Jan 20, 2021 at 10:31 AM Andrew Dalke 
wrote:

> So I'm leaving the supported CDK formats in chemfp to SMILES, SDF, and
> InChI. And if no one asks for other formats then I don't need to do
> anything.
>

I understand that XYZ and PDB support here is a "when asked" item.

Regarding unused formats, it would help if we had a command line
conversion tool again. We used to have one, but in a decision that looks
questionable in retrospect, we stopped supporting it because we had
OpenBabel.

Egon

-- 
Have you heard about Wikidata already? "Use Scholia and Wikidata to find
scientific literature" is a new tutorial from my colleague Lauren Dupuis.
https://laurendupuis.github.io/Scholia_tutorial/

-
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
Blog: http://chem-bla-ics.blogspot.com/
PubList: https://www.zotero.org/egonw
ORCID: -0001-7542-0286 
ImpactStory: https://impactstory.org/u/egonwillighagen
___
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user


Re: [Cdk-user] CDK support in chemfp 3.5b1

2021-01-20 Thread Andrew Dalke
I figured that since I haven't seen XYZ being used for a couple of decades, I 
didn't need to worry about it that much. ;)

PDB input is more useful, but that's a specialized topic in its own right, with 
careful attention to PDB naming conventions, bond angles, ring flatness, etc.

I also had problems getting the mol2 reader to work. It doesn't look like it's 
had any TLC for over a decade, again, probably because no one uses it.

So I'm leaving the supported CDK formats in chemfp to SMILES, SDF, and InChI. 
And if no one asks for other formats then I don't need to do anything.


Andrew
da...@dalkescientific.com


 

> On Jan 20, 2021, at 08:29, Egon Willighagen  
> wrote:
> 
> 
> Yes, I understand that. I think the original workflow was first to establish 
> the bonds ("single" by default) and then to use valency information, hoping 
> that the structure makes sense (no hydrogens missing), etc. The book has a 
> follow-up section on that second step. But I share your doubt about how 
> well it works. I do not remember whether I did a validation like we would do 
> nowadays. It would be good to do something like this:
> 
> 1. take 1000 random 3D structures from PubChem (add zeros according to taste)
> 2. remove all bond info, and keep only the 3D locations
> 3. rebond, add missing bond orders
> 4. compare.
> 
> Now, this set would not really be a real world scenario. I could imagine one 
> would like to repeat this for structures from COD too. Actually, maybe check 
> out this paper: 
> https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0279-6
> 
> Egon
> 
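
For illustration, steps 2 to 4 of the quoted workflow could be sketched
roughly as below. This is only a minimal sketch under stated assumptions, not
CDK's rebonding code: it recovers connectivity only (no bond orders), it uses
a plain covalent-radius distance cutoff, and the names rebond, compare, and
COVALENT_RADII, the 0.4 angstrom tolerance, and the water geometry are all
made up for the example.

```python
import math
from itertools import combinations

# Approximate single-bond covalent radii in angstroms (illustrative subset).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def rebond(atoms, tolerance=0.4):
    """Guess connectivity from 3D coordinates alone (step 3, bonds only).

    atoms: list of (element, (x, y, z)) tuples.
    Returns a set of frozenset({i, j}) atom-index pairs.
    The tolerance is an assumed value, not taken from any toolkit.
    """
    bonds = set()
    for (i, (el_i, pos_i)), (j, (el_j, pos_j)) in combinations(enumerate(atoms), 2):
        cutoff = COVALENT_RADII[el_i] + COVALENT_RADII[el_j] + tolerance
        if math.dist(pos_i, pos_j) <= cutoff:
            bonds.add(frozenset((i, j)))
    return bonds


def compare(reference, perceived):
    """Step 4: return (missing, extra) bonds relative to the reference."""
    return reference - perceived, perceived - reference


# Step 2: a structure with its bond table set aside, keeping only 3D positions.
# Approximate water geometry, used here as a stand-in for a PubChem record.
water = [
    ("O", (0.0, 0.0, 0.0)),
    ("H", (0.9572, 0.0, 0.0)),
    ("H", (-0.2400, 0.9266, 0.0)),
]
reference = {frozenset((0, 1)), frozenset((0, 2))}

missing, extra = compare(reference, rebond(water))
print("missing bonds:", missing, "extra bonds:", extra)
```

A real validation would loop this over many structures read with a
cheminformatics toolkit and would also compare the assigned bond orders,
which this sketch does not attempt.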



