Hey everyone, I know it's been mentioned here and there, but now it's official: two new packages, DataStreams.jl and CSV.jl, have been released for Julia 0.4. SQLite.jl has also gone through a big overhaul to modernize the code and rework the data processing interface.
DataStreams.jl is a new package with a lofty goal and not a lot of code. It aims to put forth a data ingestion/processing framework that can be used by all kinds of data-reader/ingestion/source/sink/writer packages. The basic idea is that, for a given kind of data source, a package defines `Source` and `Sink` types and then implements whichever combinations of `Data.stream!(::Source, ::Sink)` methods make sense. For example, CSV.jl and SQLite.jl now both have `Source` and `Sink` types, and I've simply defined the following methods between the two packages:

    Data.stream!(source::CSV.Source, sink::SQLite.Sink) => parse a CSV file represented by `source` directly into the SQLite table represented by `sink`
    Data.stream!(source::SQLite.Source, sink::CSV.Sink) => fetch the SQLite table represented by `source` directly out to a CSV file represented by `sink`

The DataStreams.jl package also defines a `Data.Table` type, which is simply:

    type Table{T}
        schema::Data.Schema
        data::T
    end

This is meant as a "backend-agnostic" type that represents an in-memory Julia structure. Currently the default constructors put a `Vector{NullableVector}` in the `.data` field, but it could really be anything you want (e.g. DataFrame, Matrix, etc.). The aim of `Data.Table` certainly isn't to replace something like DataFrames, but rather to act as a default "pure julia type" within the DataStreams.jl framework. Indeed, a non-copying conversion of a `Data.Table` to a `DataFrame` is just `DataFrame(dt::Data.Table)`.

You can see more details in the blog post I wrote up here: http://julialang.org/blog/2015/10/datastreams/

A big thanks as well to the many people who have helped encourage and develop these packages with me. I truly love the community and caliber of people around here and just want to say thanks.

DataStreams.jl: https://github.com/JuliaDB/DataStreams.jl
CSV.jl: https://github.com/JuliaDB/CSV.jl
SQLite.jl: https://github.com/JuliaDB/SQLite.jl

-Jacob
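
To make the `Source`/`Sink` idea above a bit more concrete, here is a rough, standalone sketch of the dispatch pattern. It does not use DataStreams.jl, CSV.jl, or SQLite.jl at all: `TextSource`, `RowSink`, `RowSource`, `TextSink`, and the top-level `stream!` function are hypothetical names invented for illustration, and the "CSV parsing" is a naive split on commas. It is also written for current Julia (`struct`, `abstract type`) rather than the 0.4 `type` keyword shown above, so treat it as an illustration of the idea, not of the packages' actual API.

```julia
# Standalone sketch of the Source/Sink pattern. None of these types are the
# real DataStreams.jl, CSV.jl, or SQLite.jl types; all names are hypothetical.

abstract type Source end
abstract type Sink end

# Hypothetical source: comma-separated text held in memory, header on line 1.
struct TextSource <: Source
    text::String
end

# Hypothetical sink: collects the header and each row as vectors of strings.
struct RowSink <: Sink
    header::Vector{String}
    rows::Vector{Vector{String}}
end
RowSink() = RowSink(String[], Vector{String}[])

# Hypothetical source: rows that are already in memory.
struct RowSource <: Source
    header::Vector{String}
    rows::Vector{Vector{String}}
end

# Hypothetical sink: writes comma-separated text to any IO.
struct TextSink{T<:IO} <: Sink
    io::T
end

# One method per (source, sink) pairing that makes sense.
# Naive parsing: no quoting or escaping is handled.
function stream!(source::TextSource, sink::RowSink)
    lines = split(strip(source.text), '\n')
    append!(sink.header, String.(split(lines[1], ',')))
    for line in lines[2:end]
        push!(sink.rows, String.(split(line, ',')))
    end
    return sink
end

function stream!(source::RowSource, sink::TextSink)
    println(sink.io, join(source.header, ','))
    for row in source.rows
        println(sink.io, join(row, ','))
    end
    return sink
end

# Round trip: text -> in-memory rows -> text.
rows = stream!(TextSource("id,name\n1,ada\n2,grace\n"), RowSink())
out = stream!(RowSource(rows.header, rows.rows), TextSink(IOBuffer()))
roundtrip = String(take!(out.io))
```

The point of the sketch is only that each source/sink pairing gets its own method, so supporting a new format means adding types and methods rather than touching existing ones. Pairings that have no method defined simply fail with a `MethodError`.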