Hi Tim.
Please keep all discussions on the mailing list, as individual 
contributors might not find the time to respond.
With fixed random seeds, results should be reproducible. If you provide 
the full code, it might be possible to say where the problem lies.
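
As a rough sanity check for reproducibility, something like the sketch below should give the same score on every run. It is only an illustration under some assumptions: the data is synthetic (make_classification), the helper name run_once is made up, the SVC settings are arbitrary, and the imports use the current sklearn.model_selection module (older releases kept these helpers in sklearn.cross_validation).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data; replace with your own X and y.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)


def run_once(seed):
    """Split, fit and score once, seeding every source of randomness."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    # probability is left at its default (False), as in the original question.
    clf = SVC(kernel="rbf", C=1.0, random_state=seed)
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)


first = run_once(42)
second = run_once(42)
print(first, second)  # the two values should be identical
```

If the two numbers differ in your own script, the usual suspect is a step (shuffling, splitting, subsampling) that does not receive the seed.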

Sorry, I meant ShuffleSplit, but you can also use any other CV object.
The cross_val_predict function is not super essential, just a 
convenience. You should start with computing the scores using 
cross_val_score.
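
A minimal sketch of what I mean, using the shuffle-split CV object (ShuffleSplit in scikit-learn) together with cross_val_score. The data here is synthetic and the number of splits and test size are arbitrary choices; the imports assume the current sklearn.model_selection module, which may differ from the version on your cluster.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in data; replace with your own X and y.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# A fixed random_state makes the CV object produce the same splits each time.
cv = ShuffleSplit(n_splits=10, test_size=0.25, random_state=0)

scores = cross_val_score(SVC(), X, y, cv=cv)
scores_again = cross_val_score(SVC(), X, y, cv=cv)

print(scores.mean(), scores.std())
print((scores == scores_again).all())  # same splits, so same scores
```

Looking at the per-split scores (not just the mean) also shows how much of the variation comes from the splits themselves.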

For clusters that you don't maintain yourself: installing a version locally
is super easy. Just check out the dev version and set PYTHONPATH
appropriately, or install into a virtual environment. Many people have custom
Python libraries (or whole environments) installed.

Cheers,
Andy


On 01/07/2015 05:42 AM, Timothy Vivian-Griffiths wrote:
> Dear Andy,
>
> Thank you for your reply on the scikit-learn problem I was having. Seeing as 
> I am new to this, I am writing to you directly; is this what I should do, or 
> should I reply to the response email that you gave me?
>
> As for the reproducibility, I have not set the probability to be True, so it 
> should be running on the default. I am also setting the random state 
> parameter, so I am puzzled as to what is happening. I haven't found a single 
> split that reproduces high-performing results. I understand that there 
> will be discrepancies in the data, but I don't understand why splits should 
> perform differently on different occasions.
>
> Just another thing, I have noticed that the cross_val_predict function that 
> you mention is in the latest version of sklearn, but I cannot find the 
> RandomizedSplitCV one. Also, seeing as I am running my code on a cluster 
> which I do not maintain, I think it's probably best if I wait until 0.16 
> becomes the stable version before I ask the admins to update. Do you have any 
> idea of when this might be?
>
> Thanks for your help and apologies if I am not supposed to contact your email 
> address directly,
>
> Tim
>
>
>


_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
