Re: [julia-users] Are SubArrays intended to work with BLAS functions?

2016-10-13 Thread Josh Day
Sure thing:  https://github.com/JuliaLang/julia/issues/18908


[julia-users] Are SubArrays intended to work with BLAS functions?

2016-10-13 Thread Josh Day
After much confusion, I discovered that BLAS.syr! gives incorrect results 
when using a view.  Is this a bug, or is it not recommended to use views 
with BLAS functions?  


julia> a1 = zeros(2,2); a2 = zeros(2, 2); x = randn(5, 2);

julia> BLAS.syr!('U', 1.0, view(x, 1, :), a1)
2×2 Array{Float64,2}:
 0.483364  0.440104
 0.0       0.400716

julia> BLAS.syr!('U', 1.0, x[1, :], a2)
2×2 Array{Float64,2}:
 0.483364  0.458034
 0.0       0.434032

However, BLAS.syrk! appears to work:

julia> a1 = zeros(2,2); a2 = zeros(2, 2); x = randn(5, 2);

julia> BLAS.syrk!('U', 'T', 1.0, x, 0.0, a1)
2×2 Array{Float64,2}:
 4.16346  -0.618009
 0.0       4.75777

julia> BLAS.syrk!('U', 'T', 1.0, view(x, :, :), 0.0, a2)
2×2 Array{Float64,2}:
 4.16346  -0.618009
 0.0       4.75777
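
One plausible reading of the syr! discrepancy (an assumption here, not something confirmed in this message) is that the row view is not contiguous in memory: view(x, 1, :) steps through the parent with a stride of 5, whereas x[1, :] is a fresh contiguous copy. A minimal sketch in current Julia syntax that inspects the stride and sidesteps the question by passing a contiguous copy:

```julia
using LinearAlgebra

x = randn(5, 2)
v = view(x, 1, :)        # row view into a column-major 5×2 matrix

# Consecutive elements of v are 5 Float64s apart in the parent array.
@show strides(v)

# Workaround sketch: hand BLAS a contiguous copy of the row.
a = zeros(2, 2)
BLAS.syr!('U', 1.0, collect(v), a)

# Reference result: upper triangle of the outer product of the row with itself.
expected = triu(x[1, :] * x[1, :]')
@show a ≈ expected
```

Copying costs an allocation per call, so this is only a way to get a trustworthy result while the view behavior is being sorted out.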


Re: [julia-users] Trouble cloning private repos in OS X.

2016-09-26 Thread Josh Day
I completely forgot about 2FA.  That's the culprit.  Thanks for the pointer 
and sorry for the noise.


[julia-users] Trouble cloning private repos in OS X.

2016-09-26 Thread Josh Day
There might be some related issues already reported, but I didn't see one 
that was quite what I'm seeing.  I can clone my repos outside of Julia 
normally, but Pkg.clone in Julia results in repeatedly getting asked for 
my username/password.  I don't get an error if I provide an incorrect 
username/password.  I have github.user=joshday and 
credential.helper=osxkeychain in my git config (cached password), so I'm 
not sure why I'm being asked for my username/password to begin with.  Any 
ideas of where to start?

julia> versioninfo()
Julia Version 0.5.1-pre+2
Commit f0d40ec (2016-09-20 03:34 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin15.6.0)
  CPU: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, haswell)

julia> Pkg.clone("https://github.com/joshday/MyPrivateRepo.jl")
INFO: Cloning MyPrivateRepo from https://github.com/joshday/MyPrivateRepo.jl
Username for 'https://github.com':joshday
Password for 'https://josh...@github.com':
Username for 'https://github.com' [joshday]:
Password for 'https://josh...@github.com':
Username for 'https://github.com' [joshday]:fake_name
Password for 'https://fake_n...@github.com':
Username for 'https://github.com' [fake_name]:


[julia-users] Re: Architecture for solving largish LASSO/elasticnet problem

2016-06-24 Thread Josh Day
I'm working on https://github.com/joshday/SparseRegression.jl for penalized 
regression problems.  I'm still optimizing the code, but a test set of that 
size is not a problem.  

julia> n, p = 1000, 262144; x = randn(n, p); y = x*randn(p) + randn(n);

julia> @time o = SparseReg(x, y, ElasticNetPenalty(.1), Fista(tol = 1e-4, 
step = .1), lambda = [.5])
 22.356062 seconds (1.69 k allocations: 408.851 MB, 0.16% gc time)
■ SparseReg
  >      Model:  SparseRegression.LinearRegression()
  >    Penalty:  ElasticNetPenalty (α = 0.1)
  >  Intercept:  true
  >         nλ:  1
  >  Algorithm:  Fista
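
For readers unfamiliar with this family of solvers, here is a rough sketch of a plain proximal gradient loop (ISTA, the unaccelerated relative of FISTA) for an elastic net penalty. It is not SparseRegression.jl's code: the names soft, prox_enet, and ista are hypothetical, the penalty is assumed to be λ(α‖β‖₁ + (1-α)/2 ‖β‖²), and the loss is assumed to be (1/2n)‖y - xβ‖² with no intercept.

```julia
using LinearAlgebra

# Soft-thresholding: shrink x toward zero by t, clipping at zero.
soft(x, t) = sign(x) * max(abs(x) - t, 0)

# Proximal operator of s * λ * (α‖β‖₁ + (1-α)/2 ‖β‖²) (assumed parametrization).
prox_enet(β, s, λ, α) = soft.(β, s * λ * α) ./ (1 + s * λ * (1 - α))

function ista(x, y; λ = 0.5, α = 0.1, iters = 200)
    n, p = size(x)
    s = n / opnorm(x)^2                  # step 1/L for the loss (1/2n)‖y - xβ‖²
    β = zeros(p)
    for _ in 1:iters
        grad = -(x' * (y - x * β)) ./ n  # gradient of the smooth part
        β = prox_enet(β - s * grad, s, λ, α)
    end
    return β
end

n, p = 100, 20
x = randn(n, p)
y = x * randn(p) + randn(n)
β = ista(x, y)
```

FISTA adds a momentum step on top of this loop, which is not shown here.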


Re: [julia-users] JuliaCon birds of a feather

2016-06-22 Thread Josh Day
Sounds great, I'm in.  I'd be especially interested in talking about 
LearnBase.


[julia-users] Matrix multiplication performance with BitMatrix, Matrix{Bool}, Matrix{Int8}

2016-05-15 Thread Josh Day
Hi all.  Suppose I have a large matrix with entries {0, 1} and I'd like to 
keep storage small by using a BitMatrix.  Are there any tricks to squeeze 
better performance out of BitMatrix multiplication?  I'm also curious about 
the performance difference between Matrix{Bool} and Matrix{Int8}.  Thoughts 
or suggestions are appreciated.  Thanks.

n, p = 1000, 100_000
x1 = rand(n, p) .> .5
x2 = Matrix{Bool}(x1)
x3 = Matrix{Float64}(x1)
x4 = Matrix{Int8}(x1)
b = randn(p)

@time x1 * b
@time x2 * b
@time x3 * b
@time x4 * b
  0.559938 seconds (7 allocations: 8.078 KB)
  0.437336 seconds (7 allocations: 8.078 KB)
  0.062144 seconds (7 allocations: 8.078 KB)
  0.109573 seconds (7 allocations: 8.078 KB)
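
A likely explanation for the gap (stated as an assumption, since no reply is recorded here) is that only the Float64 product is dispatched to BLAS, while the other element types fall back to a generic Julia loop. The sketch below, on a smaller problem, only checks the storage trade-off and that all four representations give the same product; it makes no timing claims.

```julia
n, p = 100, 1_000
x1 = rand(n, p) .> 0.5        # BitMatrix: one bit per entry
x2 = Matrix{Bool}(x1)         # one byte per entry
x3 = Matrix{Float64}(x1)      # eight bytes per entry
x4 = Matrix{Int8}(x1)         # one byte per entry
b = randn(p)

# Report the memory footprint of each representation.
for x in (x1, x2, x3, x4)
    println(rpad(string(typeof(x)), 20), Base.summarysize(x), " bytes")
end

# All four representations should agree on the product up to rounding.
reference = x3 * b
for x in (x1, x2, x4)
    @assert x * b ≈ reference
end
```

If storage is the main constraint, one option is to keep the BitMatrix on disk or in memory and convert blocks of columns to Float64 just before multiplying.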


[julia-users] Re: Julia console with inline graphics?

2016-04-02 Thread Josh Day
I believe that's iTerm being used 
with https://github.com/Keno/TerminalExtensions.jl.  Depending on the 
complexity of your plots, https://github.com/Evizero/UnicodePlots.jl may be 
sufficient for you.

On Saturday, April 2, 2016 at 6:45:22 AM UTC-4, Oliver Schulz wrote:
>
> Hi,
>
> I'm looking for a Julia console with inline graphics (e.g. to display 
> Gadfly plots). There's Jupyter/IJulia, of course, but I saw a picture of 
> something more console-like in the AxisArrays readme (at the end of 
> https://github.com/mbauman/AxisArrays.jl#example-of-currently-implemented-behavior)
>  
> - does anyone know what's been used there?
>
> Cheers,
>
> Oliver
>
>

[julia-users] Re: Packaging Julia project for Open Reproducible Science

2016-02-11 Thread Josh Day
How about releasing it as a Julia package?  You can handle your (Julia) 
dependencies with the REQUIRE file.

I'm working on a reproducible PhD thesis, and that's the route I'm going.  
All my Julia code and TeX files will be in there and can be built from 
scratch.

On Thursday, February 11, 2016 at 6:56:00 AM UTC-5, Lyndon White wrote:
>
> I've been working in IJulia for all of my experimental needs.
> I am now submitting the paper and I wish to submit my code and data along 
> with it.
>
> I should probably also include a copy of Julia, since my code works in 
> Julia 0.5 but I am confident it doesn't work in any prior version (such as 
> the one in most package managers?). I guess I also need to include the 
> dependencies.
>
> Are there guidelines for packaging up a julia project for reproducible 
> science?
>
>

[julia-users] Re: documentation suggestions

2016-02-10 Thread Josh Day
I think a lot of what you're looking for already exists.  It's just that 
things like "run a regression according to variable names" wouldn't belong 
in base Julia.  If you haven't already, I'd take a look at StatsBase.jl, 
DataFrames.jl, and GLM.jl.

http://dataframesjl.readthedocs.org/en/latest/io.html#importing-data-from-tabular-data-files
https://github.com/JuliaStats/GLM.jl



On Wednesday, February 10, 2016 at 10:58:37 AM UTC-5, ivo welch wrote:
>
>
> ladies and gents---I am not (yet) a julia user.
>
> may I suggest adding more examples into two places where julia users will 
> face starting hurdles?
>
> [1] the I/O docs of julia.  like, reading and writing csv files that are 
> compressed and decompressed on-the-fly, even if not in the ultimate 
> efficient manner.  a large fraction of the time and frustration of new 
> users is consumed by the task of shoehorning data into and out of new 
> computer languages.  with all of R's problems, the 'd <- read.csv("f.csv")' 
> and 'd <- read.csv(pipe(paste("gzcat ", fname)))' reduced this entry 
> frustration greatly.  perhaps xml file reading and writing.  perhaps...
>
> [2] more 'standard task' programs would be great.  read a csv file, run a 
> regression according to variable names on the command line, print output, 
> draw a graph.  I know there are fragments throughout the docs, but some 
> section with ready to run complete programs would be good, perhaps at the 
> end of the manual.
>
> in a year, I hope to switch my students from R to julia.
>
> regards,
>
> /iaw
>
>

[julia-users] Re: Strangely formatted HTTP response (Pkg.publish())

2016-02-04 Thread Josh Day
Yes, I just saw it and searched about it to find this post.


Re: [julia-users] How to define methods of rearranged arguments

2016-01-07 Thread Josh Day
True, but this is a special case where arguments have unique types.  The 
reason I asked is for a type in OnlineStats.jl 
<https://github.com/joshday/OnlineStats.jl>, where more often than not, the 
user will probably change defaults.

StatLearn(x, y, L1Regression(), AdaGrad(), L2Penalty())

looks considerably cleaner than

StatLearn(x, y, model = L1Regression(), algorithm = AdaGrad(), penalty = 
L2Penalty())

On Thursday, January 7, 2016 at 6:15:04 PM UTC-5, Tim Holy wrote:
>
> This is what keyword arguments are for. 
> http://docs.julialang.org/en/stable/manual/functions/#keyword-arguments 
>
> --Tim 
>
> On Thursday, January 07, 2016 11:02:03 AM Josh Day wrote: 
> > Suppose I have a function that takes several arguments of different types 
> > and each has a default value.  What is the best way to specify all possible 
> > methods where a user can specify an argument without entering the defaults 
> > that come before it?  I don't want to force a user to remember the exact 
> > order of arguments.  The example below may explain this better. 
> > 
> > type A 
> > a::Int 
> > end 
> > type B 
> > b::Int 
> > end 
> > type C 
> > c::Int 
> > end 
> > f(a::A = A(1), b::B = B(1), c::C = C(1)) = ... 
> > 
> > I would like the user to be able to call f(C(3), B(2)) instead of f(A(1), 
> > B(2), C(3)).  I could just implement the factorial(3) methods myself, but if 
> > I want to do this for 5 types, it means I'm writing 120 methods. 
> > 
> > Is this just a terrible idea and I should use keyword arguments? 
>
>

Re: [julia-users] How to define methods of rearranged arguments

2016-01-07 Thread Josh Day
Thanks, Tom.  That's exactly what I'm looking for.

On Thursday, January 7, 2016 at 2:21:13 PM UTC-5, Tom Breloff wrote:
>
> Depending on how much performance you need, you could do something like:
>
> _f(a::A, b::B, c::C) = ...
>
> function f(args...)
>   a,b,c = A(1),B(1),C(1)
>   for arg in args
>     T = typeof(arg)
>     if T <: A
>       a = arg
>     elseif T <: B
>       b = arg
>     elseif T <: C
>       c = arg
>     end
>   end
>   _f(a,b,c)
> end
>
> And if you need to do this in multiple places, I'm sure you could turn 
> this into a macro fairly easily.
>
>
> On Thu, Jan 7, 2016 at 2:02 PM, Josh Day <emailj...@gmail.com> wrote:
>
>> Suppose I have a function that takes several arguments of different types 
>> and each has a default value.  What is the best way to specify all possible 
>> methods where a user can specify an argument without entering the defaults 
>> that come before it?  I don't want to force a user to remember the exact 
>> order of arguments.  The example below may explain this better.
>>
>> type A
>> a::Int
>> end
>> type B
>> b::Int
>> end
>> type C
>> c::Int
>> end
>> f(a::A = A(1), b::B = B(1), c::C = C(1)) = ...
>>
>> I would like the user to be able to call f(C(3), B(2)) instead of f(A(1), 
>> B(2), C(3)).  I could just implement the factorial(3) methods myself, but 
>> if I want to do this for 5 types, it means I'm writing 120 methods.  
>>
>> Is this just a terrible idea and I should use keyword arguments?  
>>
>
>
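
The type-sorting approach quoted above can be written as a self-contained sketch in current Julia syntax (struct replaces the older type keyword). The body of _f is a placeholder that just returns the three field values, and the error branch for unexpected arguments is an addition that is not in the quoted code.

```julia
struct A; a::Int; end
struct B; b::Int; end
struct C; c::Int; end

# Placeholder for the real work: return the fields so the result is easy to inspect.
_f(a::A, b::B, c::C) = (a.a, b.b, c.c)

# Accept the arguments in any order and sort them into slots by type.
function f(args...)
    a, b, c = A(1), B(1), C(1)     # defaults
    for arg in args
        if arg isa A
            a = arg
        elseif arg isa B
            b = arg
        elseif arg isa C
            c = arg
        else
            throw(ArgumentError("unexpected argument of type $(typeof(arg))"))
        end
    end
    return _f(a, b, c)
end

f(C(3), B(2))   # returns (1, 2, 3)
```

If two arguments of the same type are passed, the later one silently wins; a stricter version could track which slots have been filled.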

[julia-users] How to define methods of rearranged arguments

2016-01-07 Thread Josh Day
Suppose I have a function that takes several arguments of different types 
and each has a default value.  What is the best way to specify all possible 
methods where a user can specify an argument without entering the defaults 
that come before it?  I don't want to force a user to remember the exact 
order of arguments.  The example below may explain this better.

type A
a::Int
end
type B
b::Int
end
type C
c::Int
end
f(a::A = A(1), b::B = B(1), c::C = C(1)) = ...

I would like the user to be able to call f(C(3), B(2)) instead of f(A(1), 
B(2), C(3)).  I could just implement the factorial(3) methods myself, but if 
I want to do this for 5 types, it means I'm writing 120 methods.  

Is this just a terrible idea and I should use keyword arguments?  
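
For comparison, the keyword-argument alternative mentioned in the question might look like the following sketch in current Julia syntax (struct replaces the older type keyword). The function body is a placeholder that returns the field values.

```julia
struct A; a::Int; end
struct B; b::Int; end
struct C; c::Int; end

# Every argument is a keyword with a default, so call order does not matter.
f(; a::A = A(1), b::B = B(1), c::C = C(1)) = (a.a, b.b, c.c)

f(c = C(3), b = B(2))   # returns (1, 2, 3)
```

The cost is that the caller has to spell out each argument name, which is the verbosity the positional version is trying to avoid.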


[julia-users] Re: Online regression algorithms

2015-04-26 Thread Josh Day
I emailed John Myles White a few months back about merging.  One of his 
concerns was that OnlineStats looks more ambitious, but he wanted to work 
together.  I was focused on the implementation progress to show off for my 
oral prelim (I'm a PhD student in statistics), so nothing ever came of it.  
Maybe now is the time to pull the trigger on merging.

I have a few minor concerns:
1) I think there needs to be more flexibility in the abstract type 
structure of StreamStats.  I'm currently using the same types for 
OnlineStats, but I've been putting in some thought on how to improve it.  

2) The sufficient statistics in OnlineStats types are based on averages 
to avoid overflow (StreamStats based on sums)

3) I would like OnlineStats to allow both batch and singleton updates 
(StreamStats uses singletons)


I'm definitely open for collaboration.  What are the goals you're aiming 
for?
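
To make concerns 2) and 3) concrete, here is a minimal sketch of a mean that is stored as an average rather than a sum and accepts both singleton and batch updates. RunningMean and update! are hypothetical names chosen for illustration and are not taken from OnlineStats or StreamStats.

```julia
mutable struct RunningMean
    mean::Float64
    n::Int
end
RunningMean() = RunningMean(0.0, 0)

# Singleton update: move the stored average toward y by 1/n.
function update!(o::RunningMean, y::Real)
    o.n += 1
    o.mean += (y - o.mean) / o.n
    return o
end

# Batch update: fold in the batch average, weighted by the batch's share of n.
function update!(o::RunningMean, ys::AbstractVector{<:Real})
    m = length(ys)
    m == 0 && return o
    o.n += m
    o.mean += (m / o.n) * (sum(ys) / m - o.mean)
    return o
end

o = RunningMean()
for y in 1.0:4.0
    update!(o, y)                 # four singleton updates
end
update!(o, collect(5.0:10.0))     # one batch update
o.mean                            # approximately 5.5, the mean of 1:10
```

The stored state never grows with the number of observations beyond the counter n, which is the point of tracking the average instead of the running sum.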


On Friday, April 24, 2015 at 5:13:15 PM UTC-4, Tom Breloff wrote:

 I'm considering writing packages for the following online (i.e. updating 
 models on the fly as new data arrives) techniques, but this functionality 
 might exist already, or there might be a package that I should contribute 
 to instead of writing my own:

- Online PCA (such as Candid covariance-free incremental principal 
component analysis)
- Online flexible least squares (time-varying regression weights)
- Online support vector machines/regressions

 Are there any packages that might have this functionality, or even a good 
 framework that I could/should add to?  Does anyone else have a need for 
 these algorithms?



Re: [julia-users] Re: Online regression algorithms

2015-04-26 Thread Josh Day
1) Could the flexibility in weighting new data you're thinking about be fit 
into an optional argument to update!()?
3) Agreed.  Right now a few methods do the opposite of that with something 
like update!(obj, y::Float64) = update!(obj, [y])

I've at least started an issue for a redesign to get a discussion going: 
https://github.com/joshday/OnlineStats.jl/issues/2.  I'd be interested in 
hearing your thoughts, as I'm a statistician pretending to be a programmer.
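
As a sketch of point 1), weighting could enter update! as an optional trailing argument that defaults to equal weighting. WeightedMean and the weight argument are hypothetical names used only for illustration.

```julia
mutable struct WeightedMean
    mean::Float64
    n::Int
end
WeightedMean() = WeightedMean(0.0, 0)

# weight defaults to 1/(n + 1), which reproduces the ordinary running mean;
# a fixed weight such as 0.1 would give exponential down-weighting of old data.
function update!(o::WeightedMean, y::Real, weight::Real = 1 / (o.n + 1))
    o.n += 1
    o.mean += weight * (y - o.mean)
    return o
end

o = WeightedMean()
for y in 1.0:4.0
    update!(o, y)        # default weights: o.mean ends at 2.5
end
update!(o, 10.0, 0.5)    # explicit weight: mean moves halfway toward 10.0
```

A fixed weight applied to the very first observation starts from the initial value of 0.0, so a real implementation would need to decide how to initialize.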



On Sunday, April 26, 2015 at 1:38:02 PM UTC-4, Tom Breloff wrote:

 1) More flexibility is certainly good, as long as it doesn't impact 
 performance or readability. I would also like additional flexibility on 
 weighting new data. 
 2) Aside from possible overflow, updating averages instead of sums is 
 probably more performant when you call state() more often than update!()
 3) I also would probably like both batch and singleton updates, but with 
 well designed code you may be able to just wrap your singleton update 
 method in a loop. 

 My use case is in algorithmic trading, so potentially calling an update!() 
 method very frequently. As such, I'd always have an eye towards performance 
 and any implementation would reflect that. 

 I'd love to have a discussion about any redesign/merge. It sounds to me 
 like OnlineStats is the more natural destination for any merge... But maybe 
 John should weigh in?


 On Apr 26, 2015, at 8:51 AM, Josh Day emailj...@gmail.com wrote:

 I emailed John Myles White a few months back about merging.  One of his 
 concerns was that OnlineStats looks more ambitious, but he wanted to work 
 together.  I was focused on the implementation progress to show off for my 
 oral prelim (I'm a PhD student in statistics), so nothing ever came of it.  
 Maybe now is the time to pull the trigger on merging.

 I have a few minor concerns:
 1) I think there needs to be more flexibility in the abstract type 
 structure of StreamStats.  I'm currently using the same types for 
 OnlineStats, but I've been putting in some thought on how to improve it.  

 2) The sufficient statistics in OnlineStats types are based on averages 
 to avoid overflow (StreamStats based on sums)

 3) I would like OnlineStats to allow both batch and singleton updates 
 (StreamStats uses singletons)


 I'm definitely open for collaboration.  What are the goals you're aiming 
 for?


 On Friday, April 24, 2015 at 5:13:15 PM UTC-4, Tom Breloff wrote:

 I'm considering writing packages for the following online (i.e. updating 
 models on the fly as new data arrives) techniques, but this functionality 
 might exist already, or there might be a package that I should contribute 
 to instead of writing my own:

- Online PCA (such as Candid covariance-free incremental principal 
component analysis)
- Online flexible least squares (time-varying regression weights)
- Online support vector machines/regressions

 Are there any packages that might have this functionality, or even a good 
 framework that I could/should add to?  Does anyone else have a need for 
 these algorithms?



[julia-users] Re: Online regression algorithms

2015-04-25 Thread Josh Day
I've been working on https://github.com/joshday/OnlineStats.jl.  The 
src/README shows the implementation progress.  It's partially a playground 
for my research (on online algorithms for statistics).

Please take a look and let me know what you think, but my regression stuff 
is currently in break-everything mode and will be cleaned up in less than a 
week.