Dear Niko,
Thank you for your quick reply.
Your suggestion was right.
I tried PCA with 500 mols, and it took a very long time, so I could not
wait for it to finish.
Next, I tried PCA with 10 mols, and the calculation finished. Thanks!

I now understand that this script is not suitable for doing PCA on a large
number of molecules.

So I will change the script to do PCA using molecular descriptors instead
of molecular fingerprints.
Thanks.
Taka
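
Following up on the plan above to switch from fingerprints to descriptors, here is a minimal sketch of what that could look like. It is only an illustration under stated assumptions: the helper names (`descriptor_matrix`, `pca_correlation`, `rdkit_descriptors`) and the particular choice of five descriptors are hypothetical, and the PCA is done with plain NumPy rather than with `Stats.PrincipalComponents`. The toy tuples stand in for molecules so the block runs without RDKit.

```python
import numpy as np

def descriptor_matrix(mols, funcs):
    """Build an (n_mols x n_descriptors) matrix by applying each function to each molecule."""
    return np.array([[f(m) for f in funcs] for m in mols], dtype=float)

def pca_correlation(X, n_components=2):
    """PCA on standardized columns, i.e. on the correlation matrix."""
    std = X.std(axis=0, ddof=1)
    keep = std > 0                        # drop constant columns to avoid dividing by zero
    Z = (X[:, keep] - X[:, keep].mean(axis=0)) / std[keep]
    corr = (Z.T @ Z) / (Z.shape[0] - 1)   # small d x d matrix when d is a handful of descriptors
    evals, evecs = np.linalg.eigh(corr)   # eigh returns eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_components]
    return Z @ evecs[:, order], evals[order]

def rdkit_descriptors():
    """Assumed descriptor choice; needs RDKit installed. Not called in this sketch."""
    from rdkit.Chem import Descriptors
    return [Descriptors.MolWt, Descriptors.MolLogP, Descriptors.TPSA,
            Descriptors.NumHDonors, Descriptors.NumHAcceptors]

# Toy stand-in so the sketch runs without RDKit: each "molecule" is a tuple
toy_mols = [(1.0, 2.0, 5.0), (2.0, 1.0, 5.0), (3.0, 4.0, 5.0), (4.0, 3.0, 5.0)]
toy_funcs = [lambda m: m[0], lambda m: m[1], lambda m: m[2]]
scores, evals = pca_correlation(descriptor_matrix(toy_mols, toy_funcs))
print(scores.shape)  # (4, 2)
```

With RDKit installed, `descriptor_matrix(mols, rdkit_descriptors())` would build the real matrix, where `mols` is the list read from the SD file. Because there are only a few descriptor columns instead of thousands of fingerprint bits, the correlation matrix stays small.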




2013/1/19 Nikolas Fechner <m...@fechner.cc>

> Dear Takayuki,
> I tested in a fairly recent svn checkout as well as on 2012.03.1, which
> is the version I also used for this example (the Stats.py has not changed
> since 2009). I ran it on a set of public serine protease inhibitors (
> http://cheminformatics.org/datasets/bohm/bohm-test.3d.sdf, chosen because
> it is rather small and publicly available) in a newly installed Python
> notebook. The res[1] I got is:
>
> [[  0.00000000e+00   0.00000000e+00   7.67554438e-02 ...,   0.00000000e+00
>     0.00000000e+00   0.00000000e+00]
>  [  0.00000000e+00   0.00000000e+00  -1.60075292e-02 ...,   0.00000000e+00
>     0.00000000e+00   0.00000000e+00]
>  [  0.00000000e+00   0.00000000e+00  -8.20590478e-02 ...,   0.00000000e+00
>     0.00000000e+00   0.00000000e+00]
>  ...,
>  [  0.00000000e+00   0.00000000e+00  -2.85344502e-16 ...,   0.00000000e+00
>     0.00000000e+00   0.00000000e+00]
>  [  0.00000000e+00   0.00000000e+00  -2.12793777e-15 ...,   0.00000000e+00
>     0.00000000e+00   0.00000000e+00]
>  [  0.00000000e+00   0.00000000e+00  -2.12793777e-15 ...,   0.00000000e+00
>     0.00000000e+00   0.00000000e+00]]
>
>
> The error you posted looks a bit like the script was stopped manually
> (ctrl-c, for instance). Or did the program terminate with an error
> independent of any user interaction? The PCA can take a very long time for
> feature vectors of such high dimensionality as fingerprints (several
> minutes in the case of my very small example dataset). How many compounds
> are you using for your example? If it just ran forever in your case, could
> you try it with a tiny subset (e.g. 10 compounds) just to see whether it
> terminates?
>
> Best,
> Niko
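
To illustrate the dimensionality point made above, here is a minimal NumPy sketch of PCA computed through a singular value decomposition. It is not the `Stats.PrincipalComponents` implementation, and it works on the covariance rather than the correlation structure; the function name `pca_svd` and the random bit matrix standing in for fingerprints are assumptions for illustration only. When there are far fewer molecules than bits, the thin SVD works directly on the n x d data matrix and never forms the d x d matrix.

```python
import numpy as np

def pca_svd(X, n_components=2):
    """Project the rows of X onto the top principal axes using a thin SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # center every column (bit position)
    # Thin SVD: Xc = U @ diag(S) @ Vt; the rows of Vt are the principal axes
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # coordinates in PC space
    explained_var = (S ** 2) / (X.shape[0] - 1)       # variance along each axis
    return scores, Vt[:n_components], explained_var[:n_components]

# Hypothetical stand-in for fingerprints: 10 "molecules" x 2048 random bits
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(10, 2048))
scores, axes, var = pca_svd(bits, n_components=2)
print(scores.shape)  # (10, 2)
```

Running the same function on a 10-row subset first, as suggested in the message, is a cheap way to check that the pipeline terminates before scaling up.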
>
> On Jan 18, 2013, at 11:48 PM, Taka Seri <serit...@gmail.com> wrote:
>
> Dear Greg and Niko,
>
> Thank you for your quick reply.
>
> To Greg: thanks for your recommendation.
> I tried PCA with matplotlib and it worked with no problem. Thanks.
> But matplotlib returned a view that was different from the one from R.
>
> To Niko:
> I am using RDKit, version RDKit_2012_06_1.
> When I tried PCA with the code, no response was returned.
>
> After a KeyboardInterrupt, the following message was returned:
>
> Traceback (most recent call last):
>   File "mol_pca.py", line 33, in <module>
>     res=Stats.PrincipalComponents(matrix)
>   File "C:\RDKit_2012_06_1\rdkit\ML\Data\Stats.py", line 82, in PrincipalComponents
>     covMat = FormCorrelationMatrix(mat)
>   File "C:\RDKit_2012_06_1\rdkit\ML\Data\Stats.py", line 66, in FormCorrelationMatrix
>     sumY = sum(y)
>
> So, what version of RDKit are you using?
> And if you don't mind, could you show me some results?
>
> Thanks.
>  Takayuki
>
> 2013/1/18 Nikolas Fechner <m...@fechner.cc>
>
>>  Hi Takayuki,
>> I was able to run your code snippet without any errors (with different
>> example molecules, of course). Could you possibly explain in more detail
>> what is not working for you? What version of RDKit are you using (from
>> rdkit import rdBase;print rdBase.rdkitVersion)?
>>
>> Niko
>>
>> On Jan 17, 2013, at 11:08 AM, Taka Seri <serit...@gmail.com> wrote:
>>
>> Dear All,
>>
>> I want to do PCA with molecular fingerprints,
>> so I wrote the following code.
>> But this code did not work.
>> Does anyone have a suggestion?
>> Thanks.
>>
>> Takayuki
>>
>>
>> from rdkit import Chem
>> from rdkit.Chem import AllChem
>> from rdkit.ML.Data import Stats
>> import numpy
>> import sys
>>
>> # Read molecules from the SD file given on the command line, skipping
>> # entries that RDKit could not parse (the supplier yields None for those)
>> mols = [mol for mol in Chem.SDMolSupplier(sys.argv[1]) if mol is not None]
>> # Morgan fingerprints with radius 2, as bit vectors
>> fps = [AllChem.GetMorganFingerprintAsBitVect(mol, 2) for mol in mols]
>>
>> # Expand each fingerprint into a row of 0/1 integers
>> mat = []
>> for fp in fps:
>>     bits = fp.ToBitString()
>>     bitsvec = [int(bit) for bit in bits]
>>     mat.append(bitsvec)
>>
>> mat = numpy.array(mat)
>> res = Stats.PrincipalComponents(mat)
>> print(res[1])
>>
>>
>> ------------------------------------------------------------------------------
>> Master Visual Studio, SharePoint, SQL, ASP.NET <http://asp.net/>, C#
>> 2012, HTML5, CSS,
>> MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
>> with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
>> MVPs and experts. ON SALE this month only -- learn more at:
>>
>> http://p.sf.net/sfu/learnmore_122712
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss