[Rdkit-discuss] Install RDKit on Portable Apps

2020-06-08 Thread Vasos Panagiotopoulos +1-718-939-8595 Bioengineer-Financier
I have RDKit Jython on Portable Apps, but it seems to have DLLs and
no exe. I finally realised the three lines were all one line,
and Java stopped bombing, but now all I get is silence.


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Scalability of Postgres cartridge

2020-06-08 Thread Finnerty, Jim via Rdkit-discuss
If you have a billion molecule data source and would like to try an at-scale 
test, I'd be willing to help out with provisioning the hardware, looking at the 
efficiency of the plans, etc., using rdkit with Aurora PostgreSQL.

If I understand how the rdkit GIST index filtering mechanism works for a given 
similarity metric, a parallel GIST index scan ought to be able to scale almost 
linearly with the number of cores, provided that the RDBMS is built on a 
scalable storage subsystem. 

If so, the largest instance size that's currently supported has 96 cores, so we 
can do a fairly high degree of parallelism.
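
To make the scaling argument concrete, here is a minimal sketch of the shard-and-merge pattern: split the fingerprints into shards, scan each shard independently, then merge the hits. It is a brute-force scan in plain Python, not the cartridge's GIST index, and the names (`search_shard`, `parallel_search`) and the 8-bit toy fingerprints are made up for illustration. Threads are used only to keep the example short; in a real deployment each shard would be a separate worker process or database instance.

```python
from concurrent.futures import ThreadPoolExecutor


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two bit-vector fingerprints stored as ints."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


def search_shard(shard, query, threshold):
    """Scan one shard (a list of (mol_id, fingerprint)) and return its hits."""
    hits = []
    for mol_id, fp in shard:
        sim = tanimoto(fp, query)
        if sim >= threshold:
            hits.append((mol_id, sim))
    return hits


def parallel_search(shards, query, threshold=0.5, workers=4):
    """Search every shard concurrently, then merge and rank the hits."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_shard = pool.map(lambda s: search_shard(s, query, threshold), shards)
        merged = [hit for hits in per_shard for hit in hits]
    return sorted(merged, key=lambda hit: hit[1], reverse=True)


# Toy data: hypothetical 8-bit fingerprints split over two shards.
shards = [
    [("mol_a", 0b11110000), ("mol_b", 0b00001111)],
    [("mol_c", 0b11111100), ("mol_d", 0b11000000)],
]
results = parallel_search(shards, query=0b11110000, threshold=0.5)
print(results)
```

Since the shards share no state, the scan itself divides cleanly across workers; the merge step and the storage layer are where the scaling would be expected to fall short of linear.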

On 6/5/20, 1:07 PM, "dmaziuk via Rdkit-discuss" wrote:

On 6/5/2020 4:45 AM, Greg Landrum wrote:

> Having said that, the team behind ZINC used to use the RDKit cartridge with
> PostgreSQL as the backend for ZINC. They had the database sharded
> across multiple instances and managed to get the fingerprint indices to
> work there. I don't remember the substructure search performance being
> terrible, but it wasn't great either. They have since switched to a
> specialized system (Arthor from NextMove software), which offers
> significantly better performance.

Generally speaking a database of a billion rows needs hardware capable
of running it. Buy a server with 1TB RAM and 64 cores and a couple of
U.2 NVME drives and see how Postgres runs on that.

Then you need to look at the database, e.g. a query on an indexed
billion-row table could be OK, but inserting the billion-and-first row will not be.

If you want to scale to these kinds of volumes, you need to do some work.

(And much of the point of no-sql hadoop "cloud" workflows is that if you
can parallelize what you're doing to multiple machines, at some data
size they will start outperforming a centralized fast search engine.)

Dima




[Rdkit-discuss] CIx position at NIBR Cambridge, US

2020-06-08 Thread Stiefl, Nikolaus
Dear all,
I wanted to bring to your attention that our position for a cheminformatics 
expert in the CADD group in our global chemistry community at NIBR is open 
again.

https://www.novartis.com/careers/career-search/job-details/288340BR

If you feel like you want to apply your skill set to real-world drug discovery 
problems within a group of molecular modellers, data scientists and other CIx 
experts please go ahead :).
Looking forward to your applications.
Best
Nik (Stiefl)



Re: [Rdkit-discuss] Removing solvent and ions from dataset

2020-06-08 Thread Nicolas Bosc
Hi Max,

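A third alternative: https://github.com/chembl/ChEMBL_Structure_Pipeline

from chembl_structure_pipeline import standardizer
parent_molblock, _ = standardizer.get_parent_molblock(o_molblock)

This will strip salts and solvents from the molecule and return the parent structure as a molblock.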
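
For plain SMILES input, the idea of stripping down to a parent can be sketched without any toolkit: split on `.`, drop fragments that appear on a solvent/ion list, and keep the largest remaining fragment. This is only a rough illustration. The `SOLVENTS_AND_IONS` set and the helper names are made up for the example, the heavy-atom count is a token-counting approximation, and no charges are neutralised. The ChEMBL pipeline, MolVS and RDKit's SaltRemover work on real molecule objects and should be preferred for actual data.

```python
import re

# Hypothetical, deliberately tiny list of fragments to discard; real tools
# ship much larger curated salt/solvent definitions.
SOLVENTS_AND_IONS = {"O", "[Na+]", "[K+]", "[Cl-]", "[Br-]", "Cl", "Br"}

# Matches one atom token: a bracket atom, a two-letter halogen, or an
# organic-subset element symbol (aromatic ones are lowercase).
ATOM_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")


def heavy_atom_count(fragment: str) -> int:
    """Approximate heavy-atom count of a SMILES fragment by counting atom tokens."""
    return len(ATOM_TOKEN.findall(fragment))


def strip_solvents_and_ions(smiles: str) -> str:
    """Drop listed solvent/ion fragments, then keep the largest remaining one."""
    fragments = smiles.split(".")
    kept = [f for f in fragments if f not in SOLVENTS_AND_IONS]
    if not kept:  # everything was solvent or ion: return the input unchanged
        return smiles
    return max(kept, key=heavy_atom_count)


examples = ["CC(=O)[O-].[Na+]", "c1ccccc1N.Cl.O", "O"]
cleaned = [strip_solvents_and_ions(s) for s in examples]
print(cleaned)
```

Matching fragments as literal strings only works if the input SMILES are written consistently; a toolkit-based approach compares structures instead and does not have that limitation.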

Nicolas

> On 8 Jun 2020, at 08:19, Pierre-Marie Allard wrote:
> 
> Hi Max,
> 
> You can also use MolVS https://molvs.readthedocs.io/en/latest/ 
> 
> This should suit most of your needs,
> 
> PM
> _
> 
> Pierre-Marie Allard
> Research Assistant - Natural Products Chemistry
> ISPSO - UniGe - Geneva
> pierre-marie.all...@unige.ch 
> 
>> On 8 Jun 2020, at 08:46, Francois Berenger wrote:
>> 
>> On 06/06/2020 17:33, Max Pinheiro Jr wrote:
>>> Hi RDkit team,
>>> I am working on a chemically diverse dataset of smiles strings and I
>>> need to do some preprocessing to clean a bit the data before starting
>>> the modeling part. So I was looking for some tools or built-in
>>> functions in RDkit to make such preprocessing by removing, for
>>> instance, solvent (water) molecules and ions. I found the
>>> "SaltRemover" module that may solve my problem with removing ions from
>>> the database, but I could not find an equivalent module for the case
>>> of solvent molecules. Does anyone know a specific tool in RDkit (or
>>> any other python program) to make such preprocessing in the smile
>>> strings? If so, could you please provide just a simple example of how
>>> to do it? I will be really thankful for any help you may provide.
>> 
>> I have used this program several times:
>> 
>> https://github.com/flatkinson/standardiser 
>> 
>> 
>> You can try this:
>> ```
>> pip3 install chemo-standardizer
>> standardiser -i input.smi -o output_std.smi
>> ```
>> 
>> I believe it uses rdkit under the hood.
>> 
>> Regards,
>> F.
>> 
>>> Max Pinheiro Jr
>>> -
>>> Université Aix-Marseille, France
>>> Institut de Chimie Radicalaire


Re: [Rdkit-discuss] Removing solvent and ions from dataset

2020-06-08 Thread Pierre-Marie Allard
Hi Max,

You can also use MolVS https://molvs.readthedocs.io/en/latest/
This should suit most of your needs,

PM
_

Pierre-Marie Allard
Research Assistant - Natural Products Chemistry
ISPSO - UniGe - Geneva
pierre-marie.all...@unige.ch



Re: [Rdkit-discuss] Removing solvent and ions from dataset

2020-06-08 Thread Francois Berenger

On 06/06/2020 17:33, Max Pinheiro Jr wrote:
> Hi RDkit team,
>
> I am working on a chemically diverse dataset of smiles strings and I
> need to do some preprocessing to clean a bit the data before starting
> the modeling part. So I was looking for some tools or built-in
> functions in RDkit to make such preprocessing by removing, for
> instance, solvent (water) molecules and ions. I found the
> "SaltRemover" module that may solve my problem with removing ions from
> the database, but I could not find an equivalent module for the case
> of solvent molecules. Does anyone know a specific tool in RDkit (or
> any other python program) to make such preprocessing in the smile
> strings? If so, could you please provide just a simple example of how
> to do it? I will be really thankful for any help you may provide.


I have used this program several times:

https://github.com/flatkinson/standardiser

You can try this:
```
pip3 install chemo-standardizer
standardiser -i input.smi -o output_std.smi
```

I believe it uses rdkit under the hood.

Regards,
F.


> Max Pinheiro Jr
> -
> Université Aix-Marseille, France
> Institut de Chimie Radicalaire