Hi everyone,

On Thu, Mar 31, 2016 at 7:20 AM, Steve Haddock <[email protected]> wrote:
> I would say I have intermediate to high experience with R, and have taught it a few times. Even so, although I feel I could come up with a working solution for most random questions, I am still not very confident that I will be giving the optimal Hadley-approved R-like solution (probably involving plyr!). It is going to take more than reading a book to reach that comfort level.
>
> I think the proper mindset for R is *totally* different than Python or most other languages, and I would not lump them together in any way. You can't think "how would I do this in Python?" and then try to translate that straight into R language. Your Matlab experience is going to be the best guide. In R if you are using a for loop you are almost always "doing it wrong".
> So using vector operations, Boolean subsetting, and taking advantage of the power of dataframes are the keys to getting off on the right foot.
> There are plenty of syntactical differences, but these are things you can learn pretty quickly. For example, to grab the 2nd up to the last item of a series, you wouldn't use x[2:end] but would use x[-1] (negative indices meaning leave those out).
> For teaching, I would also go the sacrilegious route of using = instead of <- as the assignment operator. A few laudable books actually use this too. The "proper" way is just visually confusing and extra keystrokes.
>
> The power of R and the situations where you would choose it over Python (pandas notwithstanding) are pretty clear when you get down to it.
>

I'm hoping not to create too much of a diversion, but is this true? I've tried to find some definitive source that actually lays out a good case in this regard and haven't been successful (happy to take tips). The classic answer as far as I can tell is that R is for statistics and nice-looking plots, whereas Python is for more varied projects, web apps, numerical work, etc. I find that there is a lot of bleed-over, however.
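
As a small aside on the indexing example quoted above, here is a rough Python sketch of why a literal translation between the two languages can mislead. The list `x` and the variable names are made up for illustration, and the R behaviour mentioned in the comments is simply what the quoted text describes (a negative index leaves that element out).

```python
# Illustrative data only; not taken from anyone's real dataset.
x = [10, 20, 30, 40]

# In R, x[-1] means "everything except the first element".
# In Python, x[-1] means "the last element", so copying the
# R expression over verbatim silently gives a different answer.
last_item = x[-1]

# The Python way to get "the 2nd item up to the last" is a slice
# starting at index 1 (Python counts from 0).
rest = x[1:]

print(last_item)
print(rest)
```

Same characters, different meaning, which is one concrete case where the syntax overlaps but the habits do not carry across.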
People tend to like their tools, so if they're used to R and want to build a complex analysis tool, they do so, even if it's painful and includes system calls, file manipulation, nontrivial computational tasks (they'll write a loop and let it run for a few hours...), etc. I'm still a diehard matplotlib (and associated derivative tools) user, even though I'm really impressed with how nice ggplot looks visually (I use the ggplot style -- still not as good) and how succinct it is. Finally, GIS work seems to be a tossup. R supports it, but there are some nice Python libraries too (I'm partial to the rasterio/fiona/shapely stack myself).

I'd be interested in hearing what this community thinks is the pretty clear line between the two.

Cheers,
Matt

> Data visualization is a great way to impress people that there are viable alternatives to spreadsheets. I had a fun experience of replacing my friend's GUI graph generation workflow with about 3 lines of R code to plot his dataset and colorize each point by another factor -- stuff he had to regenerate each time with a few dozen clicks. You also could start by focusing on getting their data into R as effectively as possible, using factors and dataframes, along the lines of Hadley's tidy data paper.
>
> -Steve
>
> ----- q•b -----
>
> On Mar 29, 2016, at 23:09, Jason Bell <[email protected]> wrote:
>
> > G’day Software Carpentry Instructors
> >
> > This is my first post to this list, as I recently became a Software Carpentry instructor (as of last week), and I hope this is the appropriate channel to ask a few questions in regards to learning, and then teaching, R and python to my local research colleagues.
> >
> > I am in the unusual position of providing eResearch support to all of the researchers at my University – distributed throughout 20 campuses.
> > I look after a number of systems, including our dedicated research storage infrastructure (https://my.cqu.edu.au/web/eresearch/data-tools) and also our High Performance Computing facility (www.cqu.edu.au/hpc), amongst many things. Recently a number of researchers have been approaching me requesting help in getting their research data completed more quickly. I have been surprised how many different research domains are now using R, in which the need for scientific computing skill is starting to explode. As an example, I have assisted researchers to run their code on our HPC system, taking results that would have needed months to complete on their local machine down to a full set of data results in just a few hours by running many programs on our HPC system at once.
> >
> > One of the reasons why I am keen to learn and teach R and python is so I can help even more of my colleagues to produce their research data more effectively and efficiently. Unfortunately at my local institution there isn’t any local training that my colleagues can attend – I hope Software Carpentry can help to fill this large gap in scientific computing training.
> >
> > Over the years I have learnt many programming languages (I have been quite interested in reading some of the recent emails to this list about programming languages), which started with “BASIC” at high school, to Pascal as the first language I learnt at University, to C/C++, ADA, Java, Visual Basic, Lego robotics programming, Perl, Bash scripts, Matlab, PHP and HTML (did someone mention TeX), using middleware libraries such as MPI, P4 and even did some python training quite a few years ago and contributed to the open source software project “Access Grid” Software.
> > I believe I have an acceptable understanding of programming principles in general and therefore would like to ask the following questions:
> >
> > · What is the best (the quickest) way to get up to speed in R (and python a little further down the track)? As you can appreciate, my time is extremely limited (like most of us these days) and thus I am chasing the most efficient method for learning R and python, so I can begin providing lessons in the very near future.
> >
> > · Do you think “instructors” should know more than just the teaching material for the “subjects” they plan on teaching? For example, I recently ran a “UNIX Shell” lesson locally and, given I have been using bash for over 15 years, I was extremely comfortable with the teaching material (even though I did pick up a few tips and tricks); there were no unexpected questions that I could not answer. I doubt this would be the case with R or python, as I don’t use them regularly enough to feel competent to answer left field questions. Now, I appreciate that you cannot know everything, but having a greater knowledge than just the 3-4 hour lesson material would be highly desirable – thus I would welcome any suggestions of resources or training material that could help me to get up to speed ASAP.
> >
> > · I see there are a few “R” lessons within software and data carpentry, so I wonder if there are any recommended lessons that are designed as an overview and not so much research domain specific?
> >
> > · I am also interested in some visualisation aspects of R, as a lot of my users are still trying to use “excel” to graph data.
> >
> > o I have taught myself how to pass command line arguments in R, as this allows you to write a script to submit hundreds or thousands of separate jobs to solve on a HPC system. Is this sort of thing covered anywhere?
> >
> > Some other “general” questions in regards to what our research colleagues should be learning:
> >
> > · Is there still a place for researchers to learn programming languages such as C/C++? In terms of program “execution” speed, C is pretty hard to compete against, especially when looking at HPC types of programs.
> >
> > · A colleague has suggested that the “go” programming language (https://golang.org/) is becoming quite popular these days; is anyone else seeing this?
> >
> > Anyway – I hope all of these questions are acceptable to ask here and would appreciate any advice and comments you might have.
> >
> > Many thanks for your time,
> >
> > Jason.
> >
> > *Jason Bell*
> > Senior Research Technologies Officer | Information and Technology Directorate
> > CQUniversity eResearch Analyst | Queensland Cyber Infrastructure Foundation (QCIF)
> > CQUniversity Australia, Building 19 Room 1.07, Bruce Highway, Rockhampton QLD 4702
> > *P* +61 7 4930 9229 *| X* 59229 *| M* 0409 630 897 |* E *[email protected]
> >
> > This communication may contain privileged or confidential information. If you have received this in error, please return to sender and delete. CRICOS: 00219C | RTO Code 40939
> >
> > _______________________________________________
> > Discuss mailing list
> > [email protected]
> > http://lists.software-carpentry.org/mailman/listinfo/discuss_lists.software-carpentry.org
