I'm working on https://github.com/joshday/SparseRegression.jl for penalized regression problems. I'm still optimizing the code, but a test set of that size is not a problem.

    julia> n, p = 1000, 262144; x = randn(n, p); y = x*randn(p) + randn(n);

    julia> @time o = SparseReg(x, y, ElasticNetPenalty(.1), Fista(tol = 1e-4, step = .1), lambda = [.5])
     22.356062 seconds (1.69 k allocations: 408.851 MB, 0.16% gc time)
    ■ SparseReg
      > Model: SparseRegression.LinearRegression()
      > Penalty: ElasticNetPenalty (α = 0.1)
      > Intercept: true
      > nλ: 1
      > Algorithm: Fista