Re: [scikit-learn] Clustering 4 dimensional data

2017-02-28 Thread Dale T Smith
. Dale T. Smith | Macy's Systems and Technology | IFS eCom CSE Data Science 5985 State Bridge Road, Johns Creek, GA 30097 | dale.t.sm...@macys.com From: scikit-learn [mailto:scikit-learn-bounces+dale.t.smith=macys@python.org] On Behalf Of Rohan Koodli Sent: Monday, February 27, 2017 10:43 PM

[scikit-learn] Renaming subject lines if you get a digest

2016-12-14 Thread Dale T Smith
From: scikit-learn [mailto:scikit-learn-bounces+dale.t.smith=macys@python.org] On Behalf Of Graham Arthur Mackenzie Sent: Tuesday, December 13, 2016 5:02

Re: [scikit-learn] Scikit Learn Random Classifier - TPR and FPR plotted on matplotlib

2016-12-14 Thread Dale T Smith
I think you need to look at the examples.

Re: [scikit-learn] suggested classification algorithm

2016-11-17 Thread Dale T Smith
/pcr_part2_yaware/ http://www.win-vector.com/blog/2016/06/y-aware-scaling-in-context/

Re: [scikit-learn] suggested classification algorithm

2016-11-16 Thread Dale T Smith
of this on the mailing list.

Re: [scikit-learn] Recurrent Decision Tree

2016-11-07 Thread Dale T Smith
Searching the mailing list would be the best way to find out this information. It may be in the contrib packages on github – have you checked?

Re: [scikit-learn] Missing data and decision trees

2016-10-13 Thread Dale T Smith
Please define “sensibly”. I would be strongly opposed to modifying any models to incorporate “missingness”. No model handles missing data for you. That is for you to decide based on your individual problem domain. Take a look at a talk from last winter on missing data by Nina Zumel. Nina

Re: [scikit-learn] Random Forest with Bootstrapping

2016-10-04 Thread Dale T Smith
Search for Jackknife at Wikipedia. That will give you a quick overview. Then you will have the background to read the papers below. While you are at Wikipedia, you may want to read on the bootstrap and random forests as well.
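
As a quick companion to the jackknife background mentioned in this snippet, here is a minimal sketch of a leave-one-out jackknife standard error in plain Python. The function name `jackknife_se` and the sample data are assumptions made for illustration; they do not come from the thread or from scikit-learn.

```python
import math


def jackknife_se(values, statistic):
    """Estimate the standard error of `statistic` by leave-one-out resampling.

    `values` is a list of numbers; `statistic` maps a list to a single number.
    Both names are hypothetical helpers for this sketch.
    """
    n = len(values)
    if n < 2:
        raise ValueError("the jackknife needs at least two observations")
    # Recompute the statistic n times, each time leaving out one observation.
    leave_one_out = [statistic(values[:i] + values[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    # Jackknife variance: (n - 1) / n times the sum of squared deviations.
    variance = (n - 1) / n * sum((t - center) ** 2 for t in leave_one_out)
    return math.sqrt(variance)


def mean(xs):
    return sum(xs) / len(xs)


data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]  # made-up sample
se = jackknife_se(data, mean)
print(se)
```

For the sample mean, this estimate coincides with the familiar sample standard deviation divided by the square root of n, which makes it an easy sanity check before applying the same function to less tractable statistics.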

Re: [scikit-learn] Confidence Estimation for Regressor Predictions

2016-09-01 Thread Dale T Smith
definition of a confidence interval. -- Roman On 01/09/16 20:32, Dale T Smith wrote: > There is a scikit-learn-contrib project with confidence intervals for random > forests. > > https://github.com/scikit-learn-contrib/forest-confide

Re: [scikit-learn] Confidence Estimation for Regressor Predictions

2016-09-01 Thread Dale T Smith
There is a scikit-learn-contrib project with confidence intervals for random forests. https://github.com/scikit-learn-contrib/forest-confidence-interval

Re: [scikit-learn] Supervised anomaly detection in time series

2016-08-05 Thread Dale T Smith
]. And there are also other approaches for comparing time series in the frequency domain such as FFT and DWT [Ref: http://infolab.usc.edu/csci599/Fall2003/Time%20Series/Efficient%20Similarity%20Search%20In%20Sequence%20Databases.pdf]. I hope it helps. 2016-08-05 9:26 GMT-03:00 Dale T Smith <dale.t

Re: [scikit-learn] Supervised anomaly detection in time series

2016-08-05 Thread Dale T Smith
I don’t think you should treat this as an outlier detection problem. Why not try it as a classification problem? The dataset is highly unbalanced. Try http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html Use sample_weight to tell the fit method about the
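
The suggestion in this snippet (treat the task as classification and pass `sample_weight` to `fit`) might look roughly like the sketch below. The synthetic data, the 95/5 class split, and the choice of "balanced" weights via `compute_sample_weight` are all assumptions for illustration; the snippet is truncated, so the weighting scheme the author had in mind is not known.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.utils.class_weight import compute_sample_weight

rng = np.random.RandomState(0)

# Synthetic, highly unbalanced data (assumption): 950 normal vs 50 anomalous rows.
n_normal, n_anomaly = 950, 50
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_normal, 4)),
    rng.normal(2.0, 1.0, size=(n_anomaly, 4)),
])
y = np.concatenate([
    np.zeros(n_normal, dtype=int),
    np.ones(n_anomaly, dtype=int),
])

# "balanced" gives each class the same total weight, so the rare class
# is not drowned out by the majority class during tree construction.
sample_weight = compute_sample_weight("balanced", y)

clf = ExtraTreesClassifier(n_estimators=100, random_state=0)
clf.fit(X, y, sample_weight=sample_weight)

predictions = clf.predict(X)
```

In practice the fitted model would be evaluated on held-out data with a metric suited to unbalanced classes (precision, recall, or a precision-recall curve) rather than plain accuracy.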

Re: [scikit-learn] Model trained in 0.17 gives entirely different results in 0.15

2016-08-03 Thread Dale T Smith
Use conda or a virtualenv to handle compatibility issues. Then you can control when upgrades occur. I’ve used conda with good effect to handle version issues such as yours. Otherwise, use PMML. The Data Mining Group maintains a list of PMML producers and consumers. I think there is a Python

Re: [scikit-learn] Install sklearn into a specific folder to make some changes

2016-08-02 Thread Dale T Smith
I agree with everyone else – conda environments are specially designed for this situation. I’ve not used virtualenv myself (http://docs.python-guide.org/en/latest/dev/virtualenvs/). I’m an Anaconda user.

Re: [scikit-learn] [Scikit-learn-general] Estimator serialisability

2016-07-14 Thread Dale T Smith
months ago. http://www.slideshare.net/rgrossman/how-to-lower-the-cost-of-deploying-analytics-an-introduction-to-the-portable-format-for-analytics William On Thu, Jul 14, 2016 at 8:35 AM, Dale T Smith <dale.t.sm...@macys.com> wrote: Hello, I investigated