The problem is really how you do cross-validation.
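
As a rough sketch of the past-to-future validation described in the quoted messages below: later scikit-learn releases ship `TimeSeriesSplit` in `sklearn.model_selection`, which may or may not be the "time series CV object" mentioned further down the thread. The synthetic data, feature count, and forest settings here are assumptions for illustration only, not the poster's actual setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

# Synthetic stand-in for hourly weather data; rows are assumed to be
# in chronological order and must NOT be shuffled.
rng = np.random.RandomState(0)
n_samples = 500
X = rng.normal(size=(n_samples, 3))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n_samples)

# Each split trains on an initial segment of the series and tests on
# the block that immediately follows it.
tscv = TimeSeriesSplit(n_splits=4)
splits = list(tscv.split(X))
for train_idx, test_idx in splits:
    print("train up to", train_idx.max(), "test from", test_idx.min())

# The splitter can be passed as `cv` to the usual helpers.
model = RandomForestRegressor(n_estimators=20, random_state=0)
scores = cross_val_score(model, X, y, cv=tscv)
print(scores)
```

Because every test block lies strictly after its training block, the scores estimate how well the model predicts forward in time rather than how well it interpolates between neighbouring time points.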

On 02/09/2016 11:47 PM, muhammad waseem wrote:
Thanks Luca and Andreas. The idea behind this is to predict one weather parameter from some other parameters. Do you still think it will be difficult to solve with Random Forest, given that it is not really a time series problem? I get good training results (with a high max_depth) but not very good results on the testing dataset, meaning the regressor is unable to generalise.

What about a gradient boosting regressor? Would that be suitable?

Thanks
Kindest Regards
Waseem

On Tue, Feb 9, 2016 at 10:00 PM, Luca Puggini <lucapug...@gmail.com> wrote:

    Personally I think that random forest should not be used for time
    series data unless the data is supposed to have some sort of
    periodicity. This is because random forest is a sort of local
    estimator. It is not effective if new samples fall outside the
    hypercube defined by the training data, which is quite common in
    time series. If I were you I would try something like linear
    regression or an extreme learning machine. If you are interested in
    extreme learning machines, there should be a PR on scikit-learn. (I
    wrote a simple paper with a simple introduction to ELM: "Extreme
    learning machines for virtual metrology and etch rate prediction".)
    Maybe this can help you.
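
To illustrate the "hypercube" point above, here is a minimal sketch on made-up data (the linear trend, the query point, and the model settings are all assumptions chosen for the demonstration). A forest predicts by averaging training targets, so its output stays within the range of targets it has seen, while a linear model continues the trend.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Toy trend: the target grows linearly with a single feature.
X_train = np.arange(0, 100, dtype=float).reshape(-1, 1)
y_train = 2.0 * X_train.ravel()

# A query point well outside the training range [0, 99].
X_new = np.array([[200.0]])

forest = RandomForestRegressor(n_estimators=50, random_state=0)
forest.fit(X_train, y_train)
linear = LinearRegression().fit(X_train, y_train)

forest_pred = forest.predict(X_new)[0]
linear_pred = linear.predict(X_new)[0]

# The forest averages training targets stored in its leaves, so its
# prediction cannot exceed the largest training target.
print("largest training target:", y_train.max())
print("random forest prediction:", forest_pred)
# The linear model extrapolates the trend beyond the training range.
print("linear regression prediction:", linear_pred)
```

The same ceiling applies to any tree ensemble whose leaves store averages of training targets, which is why a drifting series can defeat it even when the fit on the training period looks good.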


    On Tue, Feb 9, 2016, 9:41 PM Andreas Mueller <t3k...@gmail.com> wrote:

        Yes. Exactly what Luca said and what I said earlier.

        There is temporal structure in your data. If you use k-fold
        cross-validation (or even just shuffle the data), that destroys
        the temporal structure.
        You want to make predictions for the future (the second file).
        You should use a cross-validation method that tries to predict
        from the past to the future, not one that tries to predict
        arbitrary time points. Otherwise, your results will be too
        optimistic, as you found.


        On 02/09/2016 04:23 PM, muhammad waseem wrote:
        I have it in a separate file (CSV). Actually, I have four years
        of weather data (hourly values in two files). I use three years'
        worth of data (first file) for training and one year's worth
        of data (second file) for testing.

        Am I doing it correctly? Any ideas?

        On Tue, Feb 9, 2016 at 9:01 PM, Andreas Mueller
        <t3k...@gmail.com> wrote:

            How did you create the hold-out test data? Before or
            after shuffling?


            On 02/09/2016 03:22 PM, muhammad waseem wrote:
            Hi Andreas,
            Thanks for your reply. I have already shuffled my data,
            so it is no longer ordered, but still no luck. Any other
            suggestions?


            On Tue, Feb 9, 2016 at 8:16 PM, Andreas Mueller
            <t3k...@gmail.com> wrote:

                You should probably use a different cross-validation
                strategy if your
                data is ordered. This will give you more realistic
                cross-validation results.
                There was a time series CV object somewhere, and by
                now I think we
                should include it (this is the third time this has come
                up in the last 3 days).

                
------------------------------------------------------------------------------
                Site24x7 APM Insight: Get Deep Visibility into
                Application Performance
                APM + Mobile APM + RUM: Monitor 3 App instances at
                just $35/Month
                Monitor end-to-end web transactions and take
                corrective actions now
                Troubleshoot faster and improve end-user experience.
                Signup Now!
                http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
                _______________________________________________
                Scikit-learn-general mailing list
                Scikit-learn-general@lists.sourceforge.net
                <mailto:Scikit-learn-general@lists.sourceforge.net>
                
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general




            



        

--
    Sent by mobile phone


    




