Hello,

With regard to your specific point about CSV I/O, there are several ways to read CSV files in Julia.
- DataFrames.jl <https://github.com/JuliaStats/DataFrames.jl/blob/712e3876507228552ec83a371a5d0e577c75c183/doc/source/io.rst>: df = readtable("data.csv")
- Base: readdlm(source, delim::Char, T::Type; options...)
- And the current state of the art with regard to speed, CSV.jl <https://github.com/JuliaDB/CSV.jl>, with its DataStreams integration.

Unless you are reading fairly large CSV files, I would stick to DataFrames.jl. I would caution you, though, that data I/O in Julia is still in its infancy, and some methods are either slower than their Python/R counterparts or missing altogether (xls etc.).

Zooming out a bit, I've found the Data Wookie Month of Julia <https://github.com/DataWookie/MonthOfJulia> blog series to be the best getting-started guide for practical data-science work in Julia.

On Thursday, February 11, 2016 at 4:06:45 PM UTC-5, ivo welch wrote:
>
>
> hi doug---and vice-versa. it's interesting that a core function (reading
> a .csv file) would not be in a native julia library. when are you
> switching your students to julia? regards, /iaw
>
>
> ----
> Ivo Welch ([email protected])
> http://www.ivo-welch.info/
> J. Fred Weston Distinguished Professor of Finance
> Anderson School at UCLA, C519
> Free Finance Textbook, http://book.ivo-welch.info/
> Exec Editor, Critical Finance Review,
> http://www.critical-finance-review.org/
> Editor and Publisher, FAMe, http://www.fame-jagazine.com/
>
> On Thu, Feb 11, 2016 at 12:37 PM, Douglas Bates <[email protected]> wrote:
>
>> Hi Ivo,
>>
>> Good to hear from you.
>>
>> On Wednesday, February 10, 2016 at 9:58:37 AM UTC-6, ivo welch wrote:
>>>
>>>
>>> ladies and gents---I am not (yet) a julia user.
>>>
>>> may I suggest adding more examples into two places where julia users
>>> will face starting hurdles?
>>>
>>> [1] the I/O docs of julia. like, reading and writing csv files that are
>>> compressed and decompressed on-the-fly, even if not in the ultimate
>>> efficient manner. a large fraction of the time and frustration of new
>>> users is consumed by the task of shoehorning data into and out of new
>>> computer languages. with all of R's problem, the ' d <- read.csv("f.csv")'
>>> and 'd<-read.csv(pipe(paste("gzcat ", fname)))' reduced this entry
>>> frustration greatly. perhaps xml file reading and writing. perhaps...
>>>
>>> [2] more 'standard task' programs would be great. read a csv file, run
>>> a regression according to variable names on the command line, print output,
>>> draw a graph. I know there are fragments throughout the docs, but some
>>> section with ready to run complete programs would be good, perhaps at the
>>> end of the manual.
>>>
>>> in a year, I hope to switch my students from R to julia.
>>>
>>
>> My main use of the RCall package is to import datasets from R into
>> Julia. If I have a dataset in an R package I use, e.g.
>>
>> julia> using RCall
>>
>> julia> ds = rcopy("lme4::Dyestuff")
>> 30x2 DataFrames.DataFrame
>> | Row | Batch | Yield  |
>> |-----|-------|--------|
>> | 1   | "A"   | 1545.0 |
>> | 2   | "A"   | 1440.0 |
>> | 3   | "A"   | 1440.0 |
>> | 4   | "A"   | 1520.0 |
>> | 5   | "A"   | 1580.0 |
>> | 6   | "B"   | 1540.0 |
>> | 7   | "B"   | 1555.0 |
>> | 8   | "B"   | 1490.0 |
>> | 9   | "B"   | 1560.0 |
>> | 10  | "B"   | 1495.0 |
>> | 11  | "C"   | 1595.0 |
>> | 12  | "C"   | 1550.0 |
>> | 13  | "C"   | 1605.0 |
>> | 14  | "C"   | 1510.0 |
>> | 15  | "C"   | 1560.0 |
>> | 16  | "D"   | 1445.0 |
>> | 17  | "D"   | 1440.0 |
>> | 18  | "D"   | 1595.0 |
>> | 19  | "D"   | 1465.0 |
>> | 20  | "D"   | 1545.0 |
>> | 21  | "E"   | 1595.0 |
>> | 22  | "E"   | 1630.0 |
>> | 23  | "E"   | 1515.0 |
>> | 24  | "E"   | 1635.0 |
>> | 25  | "E"   | 1625.0 |
>> | 26  | "F"   | 1520.0 |
>> | 27  | "F"   | 1455.0 |
>> | 28  | "F"   | 1450.0 |
>> | 29  | "F"   | 1480.0 |
>> | 30  | "F"   | 1445.0 |
>>
>> If I wanted to read a CSV file using the facilities in R I could use
>>
>> julia> rcopy("read.csv('/usr/share/distro-info/debian.csv')")
>> 17x6 DataFrames.DataFrame
>> | Row | version | codename       | series         | created      | release      | eol          |
>> |-----|---------|----------------|----------------|--------------|--------------|--------------|
>> | 1   | 1.1     | "Buzz"         | "buzz"         | "1993-08-16" | "1996-06-17" | "1997-06-05" |
>> | 2   | 1.2     | "Rex"          | "rex"          | "1996-06-17" | "1996-12-12" | "1998-06-05" |
>> | 3   | 1.3     | "Bo"           | "bo"           | "1996-12-12" | "1997-06-05" | "1999-03-09" |
>> | 4   | 2.0     | "Hamm"         | "hamm"         | "1997-06-05" | "1998-07-24" | "2000-03-09" |
>> | 5   | 2.1     | "Slink"        | "slink"        | "1998-07-24" | "1999-03-09" | "2000-10-30" |
>> | 6   | 2.2     | "Potato"       | "potato"       | "1999-03-09" | "2000-08-15" | "2003-07-30" |
>> | 7   | 3.0     | "Woody"        | "woody"        | "2000-08-15" | "2002-07-19" | "2006-06-30" |
>> | 8   | 3.1     | "Sarge"        | "sarge"        | "2002-07-19" | "2005-06-06" | "2008-03-30" |
>> | 9   | 4.0     | "Etch"         | "etch"         | "2005-06-06" | "2007-04-08" | "2010-02-15" |
>> | 10  | 5.0     | "Lenny"        | "lenny"        | "2007-04-08" | "2009-02-14" | "2012-02-06" |
>> | 11  | 6.0     | "Squeeze"      | "squeeze"      | "2009-02-14" | "2011-02-06" | "2014-05-31" |
>> | 12  | 7.0     | "Wheezy"       | "wheezy"       | "2011-02-06" | "2013-05-04" | ""           |
>> | 13  | 8.0     | "Jessie"       | "jessie"       | "2013-05-04" | "2015-04-25" | ""           |
>> | 14  | 9.0     | "Stretch"      | "stretch"      | "2015-04-25" | ""           | ""           |
>> | 15  | 10.0    | "Buster"       | "buster"       | "2018-07-01" | ""           | ""           |
>> | 16  | NA      | "Sid"          | "sid"          | "1993-08-16" | ""           | ""           |
>> | 17  | NA      | "Experimental" | "experimental" | "1993-08-16" | ""           | ""           |
>>
>>
>> (It turns out that R's allowing either ' or " for enclosing strings is an
>> advantage for quoting strings within strings.)
>>
>
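
P.S. To make the Base option listed at the top of this message a bit more concrete, here is a minimal sketch of reading a small CSV file with readdlm. It assumes a Julia 1.x installation, where readdlm lives in the DelimitedFiles standard library rather than in Base as it did when this thread was written. The file name and its contents are made up purely for illustration.

```julia
# Assumption: Julia 1.x, where readdlm is provided by the DelimitedFiles
# standard library (in 2016-era Julia it was available directly from Base).
using DelimitedFiles

# Write a tiny, hypothetical CSV file so the example is self-contained.
path = joinpath(mktempdir(), "data.csv")
write(path, "x,y\n1,2.5\n3,4.5\n")

# With header=true, readdlm returns a (data, header) tuple.
# Passing Float64 asks for a purely numeric matrix.
data, header = readdlm(path, ',', Float64; header=true)

println(vec(header))  # column names taken from the first line
println(size(data))   # dimensions of the numeric block
```

readdlm hands back a plain matrix rather than a DataFrame, so it suits homogeneous numeric files best; for mixed column types the DataFrames.jl or CSV.jl routes above are likely the better fit.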
