Hi,
On Sun, Feb 8, 2015 at 1:11 AM, Ronnie Ghose <ronnie.gh...@gmail.com> wrote:
>
> lots of examples!
>
> http://scikit-learn.org/stable/auto_examples/index.html#decomposition
>
>
Yes, but I couldn't see an example there that labelled the plot in the way
that I described.
Sarah
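
For readers of the archive, here is a minimal sketch of one way to get that kind of labelling: keep the "Sample" column aside before dropping it, then annotate each point with matplotlib. The toy data frame, the column names m1 to m5 and the random values are assumptions made up for illustration; only the "Sample" column and the PCA calls come from the original message below.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Toy stand-in for the real data: a "Sample" label column plus numeric columns.
rng = np.random.RandomState(0)
df = pd.DataFrame(rng.rand(6, 5), columns=["m1", "m2", "m3", "m4", "m5"])
df.insert(0, "Sample", ["sample%d" % (i + 1) for i in range(len(df))])

# Save the labels before dropping the string column; PCA keeps the row order,
# so labels[i] still describes row i of the transformed array.
labels = df["Sample"].tolist()
df_no_strings = df.drop("Sample", axis=1)

pca = PCA(n_components=4)
reduced = pca.fit_transform(df_no_strings)  # array of shape (n_samples, 4)

# Scatter the first two components and write each sample name next to its point.
fig, ax = plt.subplots()
ax.scatter(reduced[:, 0], reduced[:, 1])
for name, (x, y) in zip(labels, reduced[:, :2]):
    ax.annotate(name, (x, y), xytext=(3, 3), textcoords="offset points")
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
```

Calling fig.savefig with a file name (or plt.show() with an interactive backend) then produces the figure. The point is that nothing is really lost by dropping the first column, as long as the labels are kept in a separate list in the same row order.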
> On Sat, Feb 7, 2015 at 7:56 PM, Sarah Mount <mount.sa...@gmail.com> wrote:
>
>> Hi there,
>>
>> This question relates to Pandas and visualisation as well as sklearn, so
>> apologies if I am asking on the wrong list.
>>
>> I have a dataset (imported as a Pandas data frame) that has a reasonably
>> large number of columns (~15) and I want to use PCA on the data.
>>
>> The first column of the data frame is a string that describes each row.
>> e.g.:
>>
>> Sample m1 m2 ...
>> -----------------------------------
>> sample1 0.1 0.2 ...
>> ...
>>
>> The "fit" function in sklearn.decomposition.PCA does not expect columns
>> to contain strings, so to perform PCA I'm removing the first column of the
>> data frame, like this:
>>
>> from sklearn.decomposition import PCA
>>
>> df_no_strings = df.drop("Sample", axis=1)
>> pca = PCA(n_components=4)
>> df_reduced = pca.fit_transform(df_no_strings)
>> print("Explained variance:", pca.explained_variance_)
>>
>>
>> I want to plot the results of performing PCA as a scatter plot, similarly
>> to Figure 3 on Page 10 of this document:
>>
>> http://www.dacapobench.org/dacapo-TR-CS-06-01.pdf
>>
>> Is there an easy way to do this, given that I have lost the first column
>> of the data frame?
>>
>> Many thanks,
>>
>> Sarah
>>
>> --
>> Dr. Sarah Mount, Senior Lecturer, University of Wolverhampton
>> website: http://www.snim2.org/
>> twitter: @snim2
>>
>>
>> ------------------------------------------------------------------------------
>> Dive into the World of Parallel Programming. The Go Parallel Website,
>> sponsored by Intel and developed in partnership with Slashdot Media, is
>> your
>> hub for all things parallel software development, from weekly thought
>> leadership blogs to news, videos, case studies, tutorials and more. Take a
>> look and join the conversation now. http://goparallel.sourceforge.net/
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
--
Dr. Sarah Mount, Senior Lecturer, University of Wolverhampton
website: http://www.snim2.org/
twitter: @snim2