Re: [scikit-learn] R user trying to learn Python

Sebastian Raschka Sun, 18 Jun 2017 13:38:43 -0700

Hi, C W,

Yeah, I'd say that Python is a general-purpose programming language with lots 
of packages for scientific computing, whereas R is more of a toolbox for stats. 
So Python may feel a bit weird at first for people who come from the R/stats 
field and are new to programming. I'm not sure it is necessary to learn 
programming and computer science basics if you are primarily interested in 
stats and ML, but since so many tools are Python-based and require some basic 
programming to fit the pieces together, it's probably not a bad idea :).


There's probably an overabundance of Python intro books out there. However, 
I'd recommend an introduction to computer science book that uses Python as 
a teaching language rather than a book that is just about the Python language.

Maybe check out 
https://www.udacity.com/course/intro-to-computer-science--cs101, which is a 
Python-based computer science course (and should be free).

Best,
Sebastian


> On Jun 18, 2017, at 4:18 PM, C W <tmrs...@gmail.com> wrote:
> 
> Hi Sebastian,
> 
> I looked through your book. I think it is great if you already know Python 
> and are looking to learn machine learning.
> 
> For me, I have some sense of machine learning, but none of Python.
> 
> Unlike R, which is specifically for statistical analysis, Python is broad!
> 
> Maybe some expert here with R can tell me how to go about this. :)
> 
> On Sun, Jun 18, 2017 at 12:53 PM, Sebastian Raschka <se.rasc...@gmail.com> 
> wrote:
> Hi,
> 
> > I am extremely frustrated using this thing. Everything comes after a dot! 
> > Why would you type the same thing at the beginning of every line? It's not 
> > efficient.
> >
> > code 1:
> > y_sin = np.sin(x)
> > y_cos = np.cos(x)
> >
> > I know you can import the entire package without the "as np", but I see 
> > np.something as the standard. Why?
> 
> Because it makes it clear where a function comes from. Sure, you could do
> 
> from numpy import *
> 
> but this is NOT recommended, because it clutters up your main namespace. 
> For instance, NumPy has its own sum function. If you do 
> from numpy import *, the name `sum` in your main namespace will refer to 
> NumPy's sum instead of Python's built-in `sum`. The built-in is shadowed, and 
> the two functions treat some arguments differently. This is confusing and 
> should be avoided.
> 
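To make the shadowing concrete, here is a minimal sketch, assuming NumPy is installed. The names `builtin_result`, `numpy_result`, and `scratch` are made up for this illustration, and the wildcard import is run inside a throwaway namespace so the script's own names stay untouched.

```python
import builtins

import numpy as np

# Python's built-in sum: the second argument is a start value added to the total.
builtin_result = builtins.sum(range(5), -1)   # (0+1+2+3+4) + (-1) = 9

# NumPy's sum: the second argument is the axis to sum over.
numpy_result = np.sum(range(5), -1)           # 0+1+2+3+4 = 10

# Simulate "from numpy import *" in a scratch namespace (a plain dict),
# then call the bare name `sum` there, exactly as a script would.
scratch = {}
exec("from numpy import *\nresult = sum(range(5), -1)", scratch)

print(builtin_result, numpy_result, scratch["result"])   # 9 10 10
print(scratch["sum"] is np.sum)                          # True
```

The same call, sum(range(5), -1), gives 9 or 10 depending on which sum the bare name points to. Writing np.sum removes the guesswork, which is the reason the np. prefix is the common convention.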
> > In the code above, sklearn > linear_model > Ridge, one lives inside the 
> > other. It feels like there are multiple layers; how deep do I have to dig in?
> >
> > Can someone explain the mentality behind this setup?
> 
> This is one way to organize the code in a package. Sklearn contains many 
> things, and organizing it into subpackages (linear_model, svm, ...) only 
> makes sense; otherwise, you would end up with code files of 100,000 lines or 
> more, which would make life really hard for package developers.
> 
> Here, scikit-learn tries to follow the core principles of good 
> object-oriented program design, for instance, abstraction, encapsulation, 
> modularity, hierarchy, ...
> 
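As a rough illustration of the package > subpackage > class layering, the sketch below assumes scikit-learn is installed and reuses the toy data from "Code 3" in the original question. The variable names `same_class` and `fitted` are made up for this example.

```python
from sklearn import linear_model          # package > subpackage
from sklearn.linear_model import Ridge    # package > subpackage > class

# Both spellings reach the very same class object.
same_class = linear_model.Ridge is Ridge

# Toy data from "Code 3" in the original question.
X = [[0, 0], [0, 0], [1, 1]]
y = [0, 0.1, 1]

reg = Ridge(alpha=0.5)
# fit() stores the learned parameters on `reg` and returns `reg` itself.
fitted = reg.fit(X, y)

print(same_class)      # True
print(fitted is reg)   # True
print(reg.coef_, reg.intercept_)
```

In everyday use, the estimators are usually reached two levels down (sklearn, then a subpackage such as linear_model or svm), so there is rarely a need to dig deeper than that. Which import spelling to use is a matter of taste; both refer to the same class.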
> > What are some good ways and resources to learn Python for data analysis?
> 
> I think, based on your questions, a good resource would be an introduction 
> to programming book or course. The sections on object-oriented programming 
> in particular should make the rationale, design, and API of scikit-learn, 
> and of Python classes as a whole, more accessible and address your concerns 
> and questions.
> 
> Best,
> Sebastian
> 
> > On Jun 18, 2017, at 12:02 PM, C W <tmrs...@gmail.com> wrote:
> >
> > Dear Scikit-learn,
> >
> > What are some good ways and resources to learn Python for data analysis?
> >
> > I am extremely frustrated using this thing. Everything comes after a dot! 
> > Why would you type the same thing at the beginning of every line? It's not 
> > efficient.
> >
> > code 1:
> > y_sin = np.sin(x)
> > y_cos = np.cos(x)
> >
> > I know you can import the entire package without the "as np", but I see 
> > np.something as the standard. Why?
> >
> > Code 2:
> > model = LogisticRegression()
> > model.fit(X_train, y_train)
> > model.score(X_test, y_test)
> >
> > In R, everything is saved to a variable. In the code above, what if I 
> > accidentally ran model.fit()? I would not know.
> >
> > Code 3:
> > from sklearn import linear_model
> > reg = linear_model.Ridge(alpha=.5)
> > reg.fit([[0, 0], [0, 0], [1, 1]], [0, .1, 1])
> >
> > In the code above, sklearn > linear_model > Ridge, one lives inside the 
> > other. It feels like there are multiple layers; how deep do I have to dig in?
> >
> > Can someone explain the mentality behind this setup?
> >
> > Thank you very much!
> >
> > M
> > _______________________________________________
> > scikit-learn mailing list
> > scikit-learn@python.org
> > https://mail.python.org/mailman/listinfo/scikit-learn
> 

