Hi,
The problem with including data munging in the tutorial is that it's not
really a machine learning question. Solutions are generally so
domain-specific that you can't present them in a way that would be broadly
useful to an interdisciplinary audience. This is why most (all?) short
machine learning tutorials ignore the data cleaning aspect and instead
focus on the machine learning algorithms and concepts. In my tutorials,
I always try to emphasize that I'm leaving this part up to the
user (and perhaps point them to the pandas tutorial, if one is being
offered).
   Jake

 Jake VanderPlas
 Senior Data Science Fellow
 Director of Research in Physical Sciences
 University of Washington eScience Institute
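
For readers following the thread, here is a minimal sketch of the step under discussion: reading your own tabular data, turning a categorical column into indicator columns, and ending up with numeric features and labels. It is not taken from any of the tutorials mentioned. The CSV contents, the column names (age, city, label) and the helper names load_rows and one_hot are all hypothetical, and only the Python standard library is used so the example runs anywhere.

```python
import csv
import io

# Hypothetical stand-in for a user's own CSV file; with a real file you would
# use open("my_data.csv", newline="") instead of io.StringIO.
RAW_CSV = """age,city,label
34,Seattle,1
51,Portland,0
29,Seattle,1
42,Tacoma,0
"""


def load_rows(handle):
    """Read CSV rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def one_hot(rows, column):
    """Return the sorted category list and one 0/1 indicator row per record."""
    categories = sorted({row[column] for row in rows})
    encoded = [
        [1.0 if row[column] == cat else 0.0 for cat in categories]
        for row in rows
    ]
    return categories, encoded


rows = load_rows(io.StringIO(RAW_CSV))
categories, city_columns = one_hot(rows, "city")

# Feature matrix: numeric age followed by the indicator columns for city.
X = [[float(row["age"])] + city for row, city in zip(rows, city_columns)]
y = [int(row["label"]) for row in rows]

print(categories)
print(X[0], y[0])
```

The nested lists X and y are plain array-likes, which is the form scikit-learn estimators accept in fit(X, y). In practice pandas (read_csv, get_dummies) or scikit-learn's OneHotEncoder would likely replace the hand-written helpers; they are spelled out here only to show what the encoding does.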

On Wed, Sep 30, 2015 at 4:41 PM, KAB <kha...@yahoo.com> wrote:

> Hello Jake and Andy,
>
> If you would not mind some advice, I would suggest including examples (or
> at least one) that use data that is not built in. I remember the first
> several tutorials (if not all of them) relied completely on built-in data
> sets and unapologetically ignored the elephant in the room: people
> will need to import their own data and get it into scikit-learn one way
> or another, typically through pandas or numpy, which then hand the data
> over to the appropriate scikit-learn routines.
>
> Ignoring this aspect (and likewise the issue of how to deal
> with categorical data in data sets) in such tutorials, in my humble
> opinion, presents an awkward hurdle to getting started with the
> scikit-learn tool set. I for one had to use R just to overcome these issues
> when I first started, even though I would have preferred to use
> Python and its data science stack, given my experience with and preference
> for Python over R.
>
> Best regards
>
>
>
> On 9/30/2015 8:22 PM, Andy wrote:
>
> Hi Jake.
> I think the tutorial Kyle and I did based on the previous tutorials was
> working quite well.
> I think it would make sense to work off our SciPy ones and improve them
> further.
> I'd be happy to work on it.
> We have some more exercises in a branch, and I have also improved versions
> of some of the notebooks that I have been using for teaching.
>
> Andy
>
>
> On 09/29/2015 06:48 PM, Jacob Vanderplas wrote:
>
> Hi All,
> The PyCon 2016 call for proposals
> <https://us.pycon.org/2016/speaking/tutorials/> just opened. For the last
> several years Olivier and I have been teaching a two-part scikit-learn
> tutorial at each PyCon, and I think they have gone over well.
>
> As the conference is just a few hours' train ride away for me this year, I'm
> certainly going to attend again. I'd also love to put together one or more
> scikit-learn tutorials again this year – if you're planning to attend PyCon
> and would like to work together on a proposal or two, let me know!
>    Jake
>
>  Jake VanderPlas
>  Senior Data Science Fellow
>  Director of Research in Physical Sciences
>  University of Washington eScience Institute
>
>
> ------------------------------------------------------------------------------
>
>
>
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general