[julia-users] Re: How do I maximize a function ?

2016-11-18 Thread John Myles White
Pkg.add("Optim")


then

using Optim


and follow this section of the docs:
http://www.juliaopt.org/Optim.jl/stable/user/minimization/#minimizing-a-univariate-function

 --John
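
To sketch the round trip (assuming Optim.jl is installed; Optim minimizes, so a maximum is found by minimizing the negation):

```julia
using Optim

f(x) = x^2

# Optim minimizes, so maximize f on [0, 100] by minimizing -f.
res = optimize(x -> -f(x), 0.0, 100.0)   # Brent's method by default

xmax = Optim.minimizer(res)   # close to 100.0
fmax = -Optim.minimum(res)    # close to 10000.0
```

This mirrors R's `optimize(f, c(0, 100), maximum=TRUE)`: the minimizer of `-f` is the maximizer of `f`, and negating the minimum recovers the maximum value.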

On Friday, November 18, 2016 at 7:29:28 PM UTC-8, Pranav Bhat wrote:
>
> How do I obtain the maximum value of a function over an interval?
>
> In R I would do something like:
>
> f <- function(x) { x ^ 2}
> optimize(f, c(0, 100), maximum=TRUE)
>


Re: [julia-users] Recursive data structures with Julia

2016-10-30 Thread John Myles White
Working with non-concrete types is often a problem for performance, so this 
approach may not be very efficient compared with alternatives that are more 
careful about the use of concrete types.

 --John
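
For illustration, a sketch of a concretely-typed alternative in later (1.x) syntax, where the child fields are a small Union of concrete types instead of an abstract type, so the compiler can still generate efficient code; the names here (`Node`, `place!`) are made up for the example:

```julia
# Recursive node whose child fields are Union{Node, Nothing}: a small,
# concrete Union the compiler handles well, unlike an abstract field type.
mutable struct Node
    val::Int
    left::Union{Node, Nothing}
    right::Union{Node, Nothing}
end
Node(v::Int) = Node(v, nothing, nothing)

function place!(t::Node, v::Int)
    if v < t.val
        t.left === nothing ? (t.left = Node(v)) : place!(t.left, v)
    elseif v > t.val
        t.right === nothing ? (t.right = Node(v)) : place!(t.right, v)
    end
    return t
end
```

After the `=== nothing` check the compiler narrows the field to `Node`, so the recursive call dispatches on a concrete type.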

On Sunday, October 30, 2016 at 6:27:47 PM UTC-7, Ralph Smith wrote:
>
> Conversion is done by methods listed in base/nullable.jl
>
> I would like to know if there is any drawback to an alternative like
>
> abstract Bst
>
> immutable NullNode <: Bst end
>
> type BstNode <: Bst
> val::Int
> left::Bst
> right::Bst
> end
>
> isnull(t::Bst) = isa(t,NullNode)
>
> BstNode(key::Int) = BstNode(key, NullNode(), NullNode())
>
> which appears to be good for type-safety, and is (sometimes) slightly 
> faster and less cumbersome than Nullables.
>
> On Sunday, October 30, 2016 at 6:24:42 PM UTC-4, Ángel de Vicente wrote:
>>
>> Hi, 
>>
>> by searching in the web I found 
>> (
>> http://stackoverflow.com/questions/36383517/how-to-implement-bst-in-julia) 
>>
>> a way to make my BST code much cleaner (as posted below). Nevertheless, 
>> I don't find this very elegant, since the head node is of type Bst, 
>> while the children are of type Nullable{Bst} (is this the 'canonical' way 
>> of building recursive data structures with Julia?). 
>>
>> But when I first read the code in SO, I thought that it was probably 
>> wrong, since it does: 
>>
>> node.left = BST(key) 
>> where node.left is of type Nullable{BST}. 
>>
>> Then I realized that automatic conversion from BST to Nullable{BST} is 
>> done when assigning to node.left, so all is good. Coming from Fortran, 
>> this is a bit odd for me... what are the rules for automatic conversion? 
>>   
>>
>>
>> Thanks a lot, 
>> Ángel de Vicente 
>>
>>
>>
>>
>>
>> , 
>> | module b 
>> | 
>> | type Bst 
>> | val::Int 
>> | left::Nullable{Bst} 
>> | right::Nullable{Bst} 
>> | end 
>> | Bst(key::Int) = Bst(key, Nullable{Bst}(), Nullable{Bst}())   
>> | 
>> | "Given an array of Ints, it will create a BST tree, type: Bst" 
>> | function build_bst(list::Array{Int,1}) 
>> | head = list[1] 
>> | tree = Bst(head) 
>> | for e in list[2:end] 
>> | place_bst(tree,e) 
>> | end 
>> | return tree 
>> | end 
>> | 
>> | function place_bst(tree::Bst,e::Int) 
>> | if e == tree.val 
>> | println("Dropping $(e). No repeated values allowed") 
>> | elseif e < tree.val 
>> | if (isnull(tree.left)) 
>> | tree.left = Bst(e) 
>> | else 
>> | place_bst(tree.left.value,e) 
>> | end 
>> | else 
>> | if (isnull(tree.right)) 
>> | tree.right = Bst(e) 
>> | else 
>> | place_bst(tree.right.value,e) 
>> | end 
>> | end 
>> | end 
>> | 
>> | function print_bst(tree::Bst) 
>> | if !isnull(tree.left) print_bst(tree.left.value) end 
>> | println(tree.val) 
>> | if !isnull(tree.right) print_bst(tree.right.value) end 
>> | end 
>> | 
>> | end 
>> ` 
>>
>> , 
>> | julia> include("bst.jl") 
>> | 
>> | julia> b.print_bst( b.build_bst([4,5,10,3,20,-1,10])) 
>> | Dropping 10. No repeated values allowed 
>> | -1 
>> | 3 
>> | 4 
>> | 5 
>> | 10 
>> | 20 
>> | 
>> | julia> 
>> ` 
>>
>>
>> -- 
>> Ángel de Vicente 
>> http://www.iac.es/galeria/angelv/   
>>
>

[julia-users] Re: Julia and the Tower of Babel

2016-10-07 Thread John Myles White
I don't really see how you can solve this without a single dictator who 
controls the package ecosystem. I'm not enough of an expert in Python to 
say how well things work there, but the R ecosystem is vastly less 
organized than the Julia ecosystem. Insofar as it's getting better, it's 
because the community has agreed to make Hadley Wickham their benevolent 
dictator.

 --John

On Friday, October 7, 2016 at 8:35:46 AM UTC-7, Gabriel Gellner wrote:
>
> Something that I have been noticing, as I convert more of my research code 
> over to Julia, is how the super easy to use package manager (which I love), 
> coupled with the talent base of the Julia community seems to have a 
> detrimental effect on the API consistency of the many “micro” packages that 
> cover what I would consider the de-facto standard library.
>
> What I mean is that a commercial package like Matlab/Mathematica etc., 
> being written under one large umbrella, will largely (clearly not always) 
> choose consistent names for similar API keyword arguments, and have 
> similar calling conventions for master-function-style tools (`optimize` 
> versus `lbfgs`, etc.). I am starting to realize this is one of the great 
> selling points of these packages as an end user: I can usually guess what a 
> keyword will be in Mathematica, whereas even after a year of using Julia 
> almost exclusively I find I have to look at the documentation (or the 
> source code, depending on the documentation ...) to figure out the keyword 
> names in many common packages.
>
> Similarly, in my experience with open source tools, due to the complexity 
> of the package management, we get large “batteries included” distributions 
> that cover a lot of the standard stuff for doing science, like python’s 
> numpy + scipy combination. Whereas in Julia the equivalent of scipy is 
> split over many, separately developed packages (Base, Optim.jl, NLopt.jl, 
> Roots.jl, NLsolve.jl, ODE.jl/DifferentialEquations.jl). Many of these 
> packages are stupid awesome, but they can have dramatically different 
> naming conventions and calling behavior for essentially equivalent 
> functionality. Recently I noticed that tolerances, for example, are named 
> as `atol/rtol` versus `abstol/reltol` versus `abs_tol/rel_tol`, which means 
> it is extremely 
> easy to have a piece of scientific code that will need to use all three 
> conventions across different calls to seemingly similar libraries. 
>
> Having brought this up I find that the community is largely sympathetic 
> and, in general, would support a common convention, the issue I have slowly 
> realized is that it is rarely that straightforward. In the above example 
> the abstol/reltol versus abs_tol/rel_tol seems like an easy example of what 
> can be tidied up, but the latter underscored name is consistent with 
> similar naming conventions from Optim.jl for other tolerances, so that 
> community is reluctant to change the convention. Similarly, I think there 
> would be little interest in changing abstol/reltol to the underscored 
> version in packages like Base, ODE.jl etc as this feels consistent with 
> each of these code bases. Hence I have started to think that the problem is 
> the micro-packaging. It is much easier to look for consistency within a 
> package than across similar packages, and since Julia seems to distribute 
> so many of the essential tools in very narrow boundaries of functionality I 
> am not sure that this kind of naming convention will ever be able to reach 
> something like a Scipy, or the even higher standard of commercial packages 
> like Matlab/Mathematica. (I am sure there are many more examples like using 
> maxiter, versus iterations for describing stopping criteria in iterative 
> solvers ...)
>
> Even further I have noticed that even when packages try to find 
> consistency across packages, for example Optim.jl <-> Roots.jl <-> 
> NLsolve.jl, when one package changes how they do things (Optim.jl moving to 
> delegation on types for method choice) then again the consistency fractures 
> quickly, where we now have a common divide of using either Typed dispatch 
> keywords versus :method symbol names across the previous packages (not to 
> mention the whole inplace versus not-inplace for function arguments …)
>
> Do people, with more experience in scientific packages ecosystems, feel 
> this is solvable? Or do micro distributions just lead to many, many varying 
> degrees of API conventions that need to be learned by end users? Is this 
> common in communities that use C++ etc? I ask as I wonder how much this 
> kind of thing can be worried about when making small packages is so easy.
>


Re: [julia-users] Is there a way to use values in a DataFrame directly in computation?

2016-10-03 Thread John Myles White
I think the core problem is that the current API + Nullables is very 
cumbersome, but the switch to Nullables will hopefully occur nearly 
simultaneously with the introduction of new APIs that can make Nullables 
much easier to deal with. David Gold spent the summer working on one 
approach that is, I think, much better than the current API; David Anthoff 
also has another approach that is substantially more powerful than the 
current API. The time between 0.5 and 0.6 may be a little chaotic in this 
regard, but I think the eventual results will be unequivocally worth the 
wait.

 -- John

On Monday, October 3, 2016 at 3:45:42 PM UTC-7, Min-Woong Sohn wrote:
>
> Thank you. I fear that Nullables will make the DataFrame very difficult to 
> use and turn many people away from Julia. 
>
>
>
> On Monday, October 3, 2016 at 12:20:32 PM UTC-4, Milan Bouchet-Valat wrote:
>>
>> Le lundi 03 octobre 2016 à 08:21 -0700, Min-Woong Sohn a écrit : 
>> > 
>> > I am using DataFrames from master branch (with NullableArrays as the 
>> > default) and was wondering how the following should be done: 
>> > 
>> > df = DataFrame() 
>> > df[:A] = NullableArray([1,2,3]) 
>> > 
>> > The following are not allowed or return wrong values: 
>> > 
>> > df[1,:A] == 1   # false 
>> > df[1,:A] > 1 # MethodError: no method matching isless(::Int64, 
>> > ::Nullable{Int64}) 
>> > df[3,:A] + 1 # MethodError: no method matching 
>> > +(::Nullable{Int64}, ::Int64) 
>> > 
>> > How should I get around these issues? Does anybody know if there is a 
>> > plan to support these kinds of computations directly? 
>> These operations currently work (after loading NullableArrays) if you 
>> rewrite 1 as Nullable(1), e.g. df[1, :A] == Nullable(1). But the first 
>> two return a Nullable{Bool}, so you need to call get() on the result 
>> if you want to use them e.g. with an if. As an alternative, you can use 
>> isequal(). 
>>
>> There are discussions as regards whether mixing Nullable and scalars 
>> should be allowed, as well as whether these operations should be moved 
>> into Julia Base. See in particular 
>> https://github.com/JuliaStats/NullableArrays.jl/pull/85 
>> https://github.com/JuliaLang/julia/pull/16988 
>>
>> Anyway, the best approach to work with data frames is probably to use 
>> frameworks like AbstractQuery.jl and Query.jl, which are not yet 
>> completely ready to handle Nullable, but should make this easier. 
>>
>>
>> Regards 
>>
>

[julia-users] Re: No operator overloading in DataFrames.jl?

2016-09-25 Thread John Myles White
Yes, this absence is intentional. This operation is far too magical.

 -- John

On Sunday, September 25, 2016 at 7:49:27 PM UTC+2, nuffe wrote:
>
> The ability to add, subtract (etc) dataframes with automatic index 
> alignment is one of the great features with Pandas. Currently this is not 
> implemented in DataFrames.jl. I was just wondering if this is intentional?
>
> I was thinking about attempting to create a Pull Request, but the way 
> Pandas implements this (
> https://github.com/pydata/pandas/blob/master/pandas/core/ops.py#L37) 
> looks pretty intimidating, and I don't know what the corresponding design 
> idiom would be in DataFrames.jl
>
> Thanks
>


[julia-users] Re: whole arrays pass by reference while matrix columns by value?

2016-09-07 Thread John Myles White
Everything is pass by sharing.

But indexing an array with a slice creates a copy, so a mutating function 
sees only the copy, not the original column. 0.5 has better support for 
creating views.

--John
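
For example, using the `view` function that 0.5 makes first-class: a `SubArray` shares memory with its parent, so mutating functions do affect the original matrix.

```julia
a = rand(10, 2)

col = view(a, :, 1)   # a SubArray that shares memory with `a`
sort!(col)            # sorts the first column of `a` in place

@assert issorted(a[:, 1])   # the parent matrix was modified
```

With plain indexing, `sort!(a[:, 1])` would have sorted a throwaway copy and left `a` untouched.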

On Wednesday, September 7, 2016 at 5:00:00 PM UTC-7, Alexandros Fakos wrote:
>
> Hi,
>
> a=rand(10,2)
> b=rand(10)
> sort!(b) modifies b
> but sort!(a[:,1]) does not modify the first column of matrix a 
>
> why is that? Does this mean that i cannot write functions that modify 
> their arguments and apply it to columns of matrices?
>
> Is trying to modify columns of matrices bad programming style? Is there a 
> better alternative?
>  How much more memory am I allocating if my functions return output 
> instead of modifying their arguments (matlab background - I am completely 
> unaware of memory allocation issues)? 
>
> Thanks,
> Alex
>


Re: [julia-users] Is the master algorithm on the roadmap?

2016-09-02 Thread John Myles White

>
> May I also point out to the My settings button on your top right corner > 
> My topic email subscriptions > Unsubscribe from this thread, which would've 
> spared you the message.


I'm sorry, but this kind of attitude is totally unacceptable, Kevin. I've 
tolerated your misuse of the mailing list, but it is not acceptable for you 
to imply that others are behaving inappropriately when they complain about 
your unequivocal misuse of the mailing list.

 --John 

On Friday, September 2, 2016 at 7:23:27 AM UTC-7, Kevin Liu wrote:
>
> May I also point out to the My settings button on your top right corner > 
> My topic email subscriptions > Unsubscribe from this thread, which would've 
> spared you the message.
>
> On Friday, September 2, 2016 at 11:19:42 AM UTC-3, Kevin Liu wrote:
>>
>> Hello Chris. Have you been applying relational learning to your Neural 
>> Crest Migration Patterns in Craniofacial Development research project? It 
>> could enhance your insights. 
>>
>> On Friday, September 2, 2016 at 6:18:15 AM UTC-3, Chris Rackauckas wrote:
>>>
>>> This entire thread is a trip... a trip which is not really relevant to 
>>> julia-users. You may want to share these musings in the form of a blog 
>>> instead of posting them here.
>>>
>>> On Friday, September 2, 2016 at 1:41:03 AM UTC-7, Kevin Liu wrote:

 Princeton's post: 
 http://www.nytimes.com/2016/08/28/world/europe/france-burkini-bikini-ban.html?_r=1

 Only logic saves us from paradox. - Minsky

 On Thursday, August 25, 2016 at 10:18:27 PM UTC-3, Kevin Liu wrote:
>
> Tim Holy, I am watching your keynote speech at JuliaCon 2016 where you 
> mention the best optimization is not doing the computation at all. 
>
> Domingos talks about that in his book, where an efficient kind of 
> learning is by analogy, with no model at all, and how numerous scientific 
> discoveries have been made that way, e.g. Bohr's analogy of the solar 
> system to the atom. Analogizers learn by hypothesizing that entities with 
> similar known properties have similar unknown ones. 
>
> MLN can reproduce structure mapping, which is the more powerful type 
> of analogy, that can make inferences from one domain (solar system) to 
> another (atom). This can be done by learning formulas that don't refer to 
> any of the specific relations in the source domain (general formulas). 
>
> Seth and Tim have been helping me a lot with putting the pieces 
> together for MLN in the repo I created, and more help is always 
> welcome. I would like to write MLN in idiomatic Julia. My question at the 
> moment to you and the community is how to keep mappings of first-order 
> harmonic functions type-stable in Julia? I am just getting acquainted 
> with 
> the type field. 
>
> On Tuesday, August 9, 2016 at 9:02:25 AM UTC-3, Kevin Liu wrote:
>>
>> Helping me separate the process in parts and priorities would be a 
>> lot of help. 
>>
>> On Tuesday, August 9, 2016 at 8:41:03 AM UTC-3, Kevin Liu wrote:
>>>
>>> Tim Holy, what if I could tap into the well of knowledge that you 
>>> are to speed up things? Can you imagine if every learner had to start 
>>> without priors? 
>>>
>>> > On Aug 9, 2016, at 07:06, Tim Holy  wrote: 
>>> > 
>>> > I'd recommend starting by picking a very small project. For 
>>> example, fix a bug 
>>> > or implement a small improvement in a package that you already 
>>> find useful or 
>>> > interesting. That way you'll get some guidance while making a 
>>> positive 
>>> > contribution; once you know more about julia, it will be easier to 
>>> see your 
>>> > way forward. 
>>> > 
>>> > Best, 
>>> > --Tim 
>>> > 
>>> >> On Monday, August 8, 2016 8:22:01 PM CDT Kevin Liu wrote: 
>>> >> I have no idea where to start and where to finish. Founders' help 
>>> would be 
>>> >> wonderful. 
>>> >> 
>>> >>> On Tuesday, August 9, 2016 at 12:19:26 AM UTC-3, Kevin Liu 
>>> wrote: 
>>> >>> After which I have to code Felix into Julia, a relational 
>>> optimizer for 
>>> >>> statistical inference with Tuffy <
>>> http://i.stanford.edu/hazy/tuffy/> 
>>> >>> inside, for enterprise settings. 
>>> >>> 
>>>  On Tuesday, August 9, 2016 at 12:07:32 AM UTC-3, Kevin Liu 
>>> wrote: 
>>>  Can I get tips on bringing Alchemy's optimized Tuffy 
>>>   in Java to Julia while 
>>> showing the 
>>>  best of Julia? I am going for the most correct way, even if it 
>>> means 
>>>  coding 
>>>  Tuffy into C and Julia. 
>>>  
>>> > On Sunday, August 7, 2016 at 8:34:37 PM UTC-3, Kevin Liu 
>>> wrote: 
>>> > I'll try to build it, compare it, and show it 

[julia-users] Re: Distributions.jl : generating samples from stable distributions

2016-08-19 Thread John Myles White
Unfortunately the R package is GPL, so we can't use it as a template for a 
Julia implementation. But RCall will let you call that package from Julia 
to get draws from those distributions, so you should be able to do what you 
suggested pretty easily.

--John
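
For the Lévy special case the asker mentions (alpha = 1/2, beta = 1) there is also a pure-Julia shortcut that needs no R round trip, since Lévy(0, c) is the distribution of c / Z² for Z ~ N(0, 1); the general (alpha, beta) case is what needs the Chambers-Mallows-Stuck machinery that `rstable` wraps. A sketch (the `rand_levy` name is made up for the example):

```julia
# If Z ~ N(0, 1), then c / Z^2 ~ Levy(0, c), the alpha = 1/2, beta = 1
# stable distribution. This shortcut covers only that special case.
rand_levy(c::Real) = c / randn()^2
rand_levy(c::Real, n::Integer) = [rand_levy(c) for _ in 1:n]
```

Draws are strictly positive and very heavy-tailed, as expected for a distribution with no finite mean.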

On Friday, August 19, 2016 at 1:44:06 PM UTC-7, Rock Pereira wrote:
>
> It's available in R. Look up the package '*stabledist*'. It has function
>  rstable(n, alpha, beta, gamma, delta, pm)
>
> Stable distributions are difficult to generate because the pdf does not 
> have a closed form.
> For the Levy distribution, alpha=0.5, beta=1.
> I was trying to answer a question on StackOverflow 
> http://stackoverflow.com/questions/38774913/fbasicsfitstable-working-suspiciously
> I wanted to do the generation in R, and the MLE optimization in Julia, but 
> didn't get round to it.
>
>
>

[julia-users] Re: Distributions.jl : generating samples from stable distributions

2016-08-19 Thread John Myles White
This is extremely difficult to do right, which is why we don't support it 
yet.

--John

On Friday, August 19, 2016 at 9:26:19 AM UTC-7, Mirmu wrote:
>
> I am looking for generating samples from the stable Levy family with 
> Distributions.jl, but I cannot find it.
> https://en.wikipedia.org/wiki/Stable_distribution
>
> Did I miss it, or there is a reason why it is not yet implemented ?
>
> Best
>


[julia-users] Re: How to make a variable length tuple with inferred type

2016-08-02 Thread John Myles White
 

This example should clarify where you're confused:


julia> typeof(ntuple(x -> 1, 1))
Tuple{Int64}

julia> typeof(ntuple(x -> 1, 2))
Tuple{Int64,Int64}
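
If the length is known statically, lifting it into the type domain recovers inference; in later versions this is spelled with a `Val` instance (the 0.4/0.5-era spelling was `ntuple(f, Val{3})`):

```julia
# The tuple length is now part of the argument's type, so the result
# type Tuple{Int64,Int64,Int64} is inferrable.
t = ntuple(i -> 0, Val(3))
```

The point is the same one Kristoffer makes below: `ntuple(f, n::Int)` cannot be type stable because the result type depends on the runtime value of `n`, but `Val` moves that value into the type domain.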

On Tuesday, August 2, 2016 at 2:34:27 PM UTC-5, mmh wrote:
>
> Care to explain in more depth? If the function is type stable i.e. it 
> returns an Int for an Int input then why would ntuple(::Function,::Int) not 
> be type stable?  What do you mean by the return type depends on the "value" 
> of the integer (it's an integer!). Am I misunderstanding?
>
>
>
> On Monday, August 1, 2016 at 11:40:25 AM UTC-4, Kristoffer Carlsson wrote:
>>
>> Nope. ntuple(::Function, ::Int) is not type stable because the return type 
>> depends on the value of the integer.
>>
>> On Monday, August 1, 2016 at 5:17:10 PM UTC+2, mmh wrote:
>>>
>>> Is this a known bug/regression?
>>>
>>> On Sunday, July 31, 2016 at 10:53:11 PM UTC-4, Sheehan Olver wrote:

 It still doesn't infer the type in 0.5:

 *julia> **@code_warntype ntuple( x -> 0, 3)*
 Variables:
   #self#::Base.#ntuple
   f::##5#6
   n::Int64

 Body:
   begin
     unless (Base.sle_int)(n::Int64,0)::Bool goto 3
     return (Core.tuple)()::Tuple{}
     3:
     unless (n::Int64 === 1)::Bool goto 6
     return (Core.tuple)($(QuoteNode(0)))::Tuple{Int64}
     6:
     unless (n::Int64 === 2)::Bool goto 9
     return (Core.tuple)($(QuoteNode(0)),$(QuoteNode(0)))::Tuple{Int64,Int64}
     9:
     unless (n::Int64 === 3)::Bool goto 12
     return (Core.tuple)($(QuoteNode(0)),$(QuoteNode(0)),$(QuoteNode(0)))::Tuple{Int64,Int64,Int64}
     12:
     unless (n::Int64 === 4)::Bool goto 15
     return (Core.tuple)($(QuoteNode(0)),$(QuoteNode(0)),$(QuoteNode(0)),$(QuoteNode(0)))::Tuple{Int64,Int64,Int64,Int64}
     15:
     unless (n::Int64 === 5)::Bool goto 18
     return (Core.tuple)($(QuoteNode(0)),$(QuoteNode(0)),$(QuoteNode(0)),$(QuoteNode(0)),$(QuoteNode(0)))::Tuple{Int64,Int64,Int64,Int64,Int64}
     18:
     unless (Base.slt_int)(n::Int64,16)::Bool goto 21
     return (Core._apply)(Core.tuple,$(Expr(:invoke, LambdaInfo for ntuple(::##5#6, ::Int64), :(Base.ntuple), :(f), :((Base.box)(Int64,(Base.sub_int)(n,5),(Core.tuple)($(QuoteNode(0)),$(QuoteNode(0)),$(QuoteNode(0)),$(QuoteNode(0)),$(QuoteNode(0)))::Tuple{Int64,Int64,Int64,Int64,Int64})*::Tuple{Vararg{Any,N}}*
     21:
     return $(Expr(:invoke, LambdaInfo for _ntuple(::Function, ::Int64), :(Base._ntuple), :(f), :(n)))
   end*::Tuple*

 On Monday, August 1, 2016 at 10:34:30 AM UTC+10, David P. Sanders wrote:
>
>
>
> El domingo, 31 de julio de 2016, 20:16:04 (UTC-4), Sheehan Olver 
> escribió:
>>
>> I'm doing the following:
>>
>>
>> immutable FooIterator{d} end
>>
>> Base.start(::FooIterator{d}) = tuple(zeros(Int,d)...)::NTuple{d,Int}
>>
>
>
> You can use the `ntuple` function, which constructs a tuple from a 
> function:
>
> julia> ntuple( x -> 0, 3)
> (0,0,0)
>
> julia> typeof(ans)
> Tuple{Int64,Int64,Int64}
>  
>
>>
>>
>> But is there a more elegant way of getting the type inferred?  I 
>> suppose I can override low order d directly:
>>
>> Base.start(::FooIterator{2}) = (0,0)
>> Base.start(::FooIterator{3}) = (0,0,0)
>>
>

[julia-users] Re: A story on unchecked integer operations

2016-07-13 Thread John Myles White
This seems more like a use case for static analysis than for checked operations 
to me. The problem IIUC isn't about the usage of high-performance code that 
is unsafe, but rather that the system was nominally tested, but tested in 
an imperfect way that didn't cover the failure cases. If you were rewriting 
this in Rust, it's easy for me to imagine that you would use checked 
arithmetic at the start until 5 years have passed, then you would decide 
it's safe and turn off the checks -- all because you had never really 
tested the obscure cases that only a static analyzer is likely to catch.

 -- John
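
For completeness: even though normal `Int` arithmetic in Julia wraps silently, you can opt in to checking at individual call sites via the unexported checked operations in Base.

```julia
x = typemax(Int64)
@assert x + 1 == typemin(Int64)          # default arithmetic wraps around

threw = try
    Base.checked_add(x, 1)               # opt-in checked addition
    false
catch err
    err isa OverflowError                # overflow now raises an error
end
@assert threw
```

That gives per-call-site control, though as argued above it only helps if the overflowing path is actually exercised by a test.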

On Wednesday, July 13, 2016 at 1:07:59 PM UTC-7, Erik Schnetter wrote:
>
> We have this code  that simulates black 
> holes and other astrophysical systems. It's written in C++ (and a few other 
> languages). I obviously intend to rewrite it in Julia, but that's not the 
> point here.
>
> One of the core functions allows evaluating (interpolating) the value of a 
> function at any point in the domain. That code was originally written in 
> 2002, and has been used and optimized and tested extensively. So you'd 
> think it's reasonably bug-free...
>
> Today, a colleague ran this code on Blue Waters, using 32,000 nodes, and 
> with some other parameters set to higher resolutions than before. Given the 
> subject of the email, you can guess what happened.
>
> Luckily, a debugging routine was active, and caught an inconsistency (an 
> inconsistent domain decomposition), alerting us to the problem.
>
> Would Julia have prevented this? I know that everybody wants speed -- and 
> if you are using 32,000 nodes, you want a lot of speed -- but the idea of 
> bugs that only appear when you are pushing the limits makes me 
> uncomfortable. So, no -- Julia's unchecked integer arithmetic would not 
> have caught this bug either.
>
> Score: Julia vs. C++, both zero.
>
> -erik
>
> -- 
> Erik Schnetter  
> http://www.perimeterinstitute.ca/personal/eschnetter/
>


Re: [julia-users] When Julia v1.0 will be released?

2016-07-07 Thread John Myles White

>
> For industry, it probably means something similar.


I really hope people in industry won't act on this date, as it is not 
nearly firm enough to bet a business on. We already have people writing 
blog posts about how using Julia for their startup turned out to be a 
mistake; we really don't need to encourage a new group of people to bet on 
something that's not 100% guaranteed.

Or to use industry language: that date isn't an SLA.

 -- John

On Thursday, July 7, 2016 at 7:55:34 AM UTC-7, Chris Rackauckas wrote:
>
> This information is hugely beneficial in science/mathematics, especially 
> for a PhD. It means that if you start a project in Julia now, although 
> there will be some bumps for when versions change, the project will likely 
> end after v1.0 is released (say 2 years?) and so your code should be stable 
> when complete. It could have been 3-5 years for v1.0 (that's actually what 
> I thought before reading it), in which case you know your code will be 
> broken soon after publication, and so you should think about either not 
> publishing the code or putting it to a Github repo with tests and be ready 
> for the extra work of updating it.
>
> For industry, it probably means something similar.
>
> It's by no means a guarantee, but as a ballpark it's still extremely 
> useful just to know what they have in mind. Since it's so soon, it also 
> tells us that the "put the extra stuff in a package" instead of growing 
> base mentality is how they are continuing forward (it's the leaner version 
> of Julia that they have been pushing with at least v0.5 which gives them 
> more mobility), which I think is good and it means I should plan to really 
> plug into the package ecosystem, which may not be stable at the v1.0 
> release.
>
> On Thursday, July 7, 2016 at 7:47:28 AM UTC-7, Isaiah wrote:
>>
>> I knew that.
>>>
>>
>> The goal is 2017, if development community considers it to be ready.
>>
>> I don't mean to be too glib, but I fail to see how any answer is 
>> particularly actionable; it is certainly not binding.
>>  
>>
>> On Thursday, July 7, 2016 at 10:14:24 AM UTC-4, Isaiah wrote:

 When it is ready.

 On Thu, Jul 7, 2016 at 10:07 AM, Hisham Assi  
 wrote:

> I really like Julia (I am using it for my publications & thesis), but 
> I noticed that the versions are not really backward compatible. I am 
> still 
> ok with that, but  many other people are waiting for the mature, stable 
> version  (1.0) to start using Julia. So, when Julia v1.0 will be 
> released?
>


>>

[julia-users] Re: A naive benchmark

2016-07-01 Thread John Myles White
See the top of 
http://docs.julialang.org/en/release-0.4/manual/performance-tips/
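
The relevant tip is the first one on that page: a loop over untyped globals cannot be specialized by the compiler. Wrapping the same loop in a function (sketched here at a smaller size than the original 2^12) typically closes the gap with MATLAB:

```julia
# Same elementwise square, inside a function so the compiler knows the
# concrete types of m and Mr. The loops are also swapped so the first
# index varies fastest, matching Julia's column-major array layout.
function square_into!(Mr, m)
    for kk in 1:size(m, 2), k in 1:size(m, 1)
        Mr[k, kk] = m[k, kk] * m[k, kk]
    end
    return Mr
end

nbp = 2^10                 # smaller than the original, for brevity
m = rand(nbp, nbp)
Mr = zeros(nbp, nbp)
@time square_into!(Mr, m)  # fast after the first (compilation) call
</imports>
```

The global-scope version forces dynamic dispatch on every `m[k,kk]` access because the types of the globals can change at any time.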

On Friday, July 1, 2016 at 7:16:10 AM UTC-7, baillot maxime wrote:
>
> Hello everyone,
>
> I am working on a Julia code which is a brutal copy of a Matlab code and 
> found out that the Julia code is slower. 
>
> So I did some naive benchmark and found strange results... 
>
> I did look for a solution on The Internets but... I found nothing useful, 
> which is why I'm asking the question here.
>
> Maybe someone has an idea of why the Julia code is slower than MATLAB? 
> (because the official benchmarks say that it should be quicker)
>
> PS: I know that it's a bit of a recurrent question 
>
> The codes are 
>
> Julia code 
>
> nbp = 2^12;
>
> m = rand(nbp,nbp);
> a = 0.0;
> Mr = zeros(nbp,nbp);
>
> tic()
> for k = 1:nbp
> for kk = 1:nbp
>
> Mr[k,kk] = m[k,kk]*m[k,kk];
>
> end
> end
> toc()
>
>
> Elapsed time: 7.481011275 seconds
>
>
> MATLAB code
>
> nbp = 2^12;
>
> m = rand(nbp,nbp);
> a = 0.0;
> Mr = zeros(nbp,nbp);
>
>
> tic
> for k = 1:nbp
> for kk = 1:nbp
>
> Mr(k,kk) =m(k,kk)*m(k,kk);
> 
> end
> end
> toc
>
>
> Elapsed time is 0.618451 seconds.
>
>

[julia-users] Re: Is passing a function as an argument discouraged?

2016-06-17 Thread John Myles White
Specialization of higher-order functions should be much improved in Julia 
0.5.

On Friday, June 17, 2016 at 1:10:24 PM UTC-7, Douglas Bates wrote:
>
> I am writing a simulation function that loops over simulating a data set 
> and fitting multiple statistical models to the data.  The exact form of the 
> output will depend on which characteristics of the fitted models I wish to 
> preserve.  My inclination is to pass a callback function to take the set of 
> models after each iteration and extract and save the characteristics of 
> interest.
>
> However, I have a vague recollection that passing a function as an 
> argument to another function was discouraged.  I believe it made type 
> inference awkward.  Was that ever the case and, if so, is still the case?
>


[julia-users] Re: parse.(Int64, x)

2016-06-15 Thread John Myles White
I would be careful combining element-wise function application with partial 
function application. Why not use map instead?
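
I.e., something along these lines:

```julia
x = ["1", "2", "3"]

# map applies the two-argument parse with Int fixed, one string at a time.
ints = map(s -> parse(Int, s), x)
```

(In later releases the dotted form `parse.(Int, x)` does work as well, since broadcast treats the type argument as a scalar.)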

On Wednesday, June 15, 2016 at 3:47:05 PM UTC-7, David Anthoff wrote:
>
> I just tried to use the new dot syntax for vectorising function calls in 
> order to convert an array of strings into an array of Int64. For example, 
> if this would work, it would be very, very handy:
>
>  
>
> x = [“1”, “2”, “3”]
>
> parse.(Int64, x)
>
>  
>
> Right now I get an error, but I wonder whether this could be enabled 
> somehow in this new framework? If this would work for all sorts of parsing, 
> type conversions etc. it would just be fantastic. Especially when working 
> DataFrames and one is in the first phase of cleaning up data types of 
> columns etc. this would make for a very nice and short notation.
>
>  
>
> Thanks,
>
> David 
>
>  
>
> --
>
> David Anthoff
>
> University of California, Berkeley
>
>  
>
> http://www.david-anthoff.com
>
>  
>


[julia-users] Re: Dirichlet distribution estimation ported/written in Julia?

2016-04-30 Thread John Myles White
The Lightspeed license is a little weird if I remember correctly, but it 
would be good to get fitting procedures if we're missing them. Not sure we 
even have a Polya distribution right now.

On Saturday, April 30, 2016 at 7:32:16 AM UTC-7, Artem OBOTUROV wrote:
>
> Hi
>
> I would like to ask if somebody has already ported to Julia 
> http://research.microsoft.com/en-us/um/people/minka/software/fastfit/ 
> package? 
> Or knows a similar package [in Julia] to estimate Dirichlet and Polya 
> distributions?
>
> Best,
> Artem 
>


[julia-users] Re: enforcing homogeneity of vector elements in function signature

2016-04-04 Thread John Myles White
Vector{Foo{T}}?
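
Spelled out against a reduced `Foo`, in the later `where` syntax (the 0.4-era spelling is `bar{T}(x::Vector{Foo{T}})`):

```julia
struct Foo{T, N}
    a::NTuple{N, T}
end

# Matches only vectors whose elements all share the same first parameter T;
# a Vector{Foo} mixing Foo{Float64} and Foo{Int64} elements won't dispatch.
bar(x::Vector{<:Foo{T}}) where {T} = println("Hello, Foo{$T}!")
```

The key is that the element type must itself determine a single `T`, so a vector constructed as `Foo{Float64}[...]` dispatches fine while a heterogeneous `Foo[...]` raises a MethodError.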

On Monday, April 4, 2016 at 1:25:46 PM UTC-7, Davide Lasagna wrote:
>
> Hi all, 
>
> Consider the following example code
>
> type Foo{T, N}
> a::NTuple{N, T}
> end
>
> function make_Foos(M)
> fs = Foo{Float64}[]
> for i = 1:M
> N = rand(1:2)
> f = Foo{Float64, N}(ntuple(i->0.0, N))
> push!(fs, f)
> end
> fs
> end
>
> function bar{F<:Foo}(x::Vector{F})
> println("Hello, Foo!")
> end
>
> const fs = make_Foos(100)
>
> bar(fs)
>
> What would be the signature of `bar` to enforce that all the entries of 
> `x` have the same value for the first parameter T? As it is now, `x` could 
> contain an `Foo{Float64}` and a `Foo{Int64}`, whereas I would like to 
> enforce homogeneity of the vector elements in the first parameter.
>
> Thanks
>
>
>

[julia-users] Re: What to read to understand finishing v0.5?

2016-03-09 Thread John Myles White
I think it's fair to say that the reason your questions aren't already 
answered by GitHub is because there's no one who's made an executive 
decision about the answers to those questions.

 -- John

On Wednesday, March 9, 2016 at 4:44:28 AM UTC-8, Andreas Lobinger wrote:
>
> Hello colleagues,
>
> i need a bigger picture of the status of v0.5, dates, timelines, missing 
> features, missing testing, expected closing. Just go to github and select 
> the v0.5 milestone gives me a diverse picture.
>
> Wishing ahappy day,
> Andreas
>


[julia-users] Re: Array slices and functions that modify inputs.

2016-03-07 Thread John Myles White
Array indexing produces a brand new array that has literally no 
relationship with the source array.

 -- John
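An illustration of the difference (using `@view`, which was added to Julia after this thread): indexing with a range copies, so `sort!` on a slice mutates only the copy, while a view aliases the parent array.

```julia
vals = [6, 5, 4, 3]

sort!(vals[2:end])          # sorts a temporary copy; `vals` is untouched
@assert vals == [6, 5, 4, 3]

sort!(@view vals[2:end])    # a view shares memory with the parent array
@assert vals == [6, 3, 4, 5]
```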

On Monday, March 7, 2016 at 5:21:34 PM UTC-8, Daniel Carrera wrote:
>
> Hello,
>
> Some Julia functions act on their inputs. For example:
>
> julia> vals = [6,5,4,3]
> 4-element Array{Int64,1}:
>  6
>  5
>  4
>  3
>
> julia> sort!(vals);
>
> julia> vals
> 4-element Array{Int64,1}:
>  3
>  4
>  5
>  6
>
>
> However, it looks like these functions do not modify array slices:
>
> julia> vals = [6,5,4,3]
> 4-element Array{Int64,1}:
>  6
>  5
>  4
>  3
>
> julia> sort!(vals[2:end])
> 3-element Array{Int64,1}:
>  3
>  4
>  5
>
> julia> vals
> 4-element Array{Int64,1}:
>  6
>  5
>  4
>  3
>
>
> Can anyone explain to me why this happens? Is this a language feature? Is 
> it at all possible to make a destructive function that acts on slices?
>
> Cheers,
> Daniel.
>
>

[julia-users] Re: Incomplete parametric types

2016-03-04 Thread John Myles White
 

Is this what you want?


julia> abstract ABC


julia> type A <: ABC end


julia> type B <: ABC end


julia> 


julia> type TestType{T <:ABC}

   a::Float64

   b::T

   

   TestType(a::Float64) = new(a)

   end


julia> myT = TestType{A}(4.0)

TestType{A}(4.0,#undef)


julia> myT.b = A()

A()


julia> myT

TestType{A}(4.0,A())

On Friday, March 4, 2016 at 9:19:42 AM UTC-8, Christopher Alexander wrote:
>
> Hi all,
>
> Is there anyway to do something like the following?
>
> abstract ABC
>
> type A <: ABC end
>
> type B <: ABC end
>
> type TestType{T <:ABC}
> a::Float64
> b::T
> 
> TestType(a::Float64) = new(a)
> end
>
> myT = TestType(4.0)
> myT.b = A()
>
> I am wondering if you can incompletely initialize a parametric type, and 
> then set the actual value needed later.  The above code doesn't work, but 
> that is what I'm trying to do.  The alternative is I guess to have some 
> default Null Type, but then to get the performance gains I have to copy the 
> object when I want to actually set it with the value that I want.
>
> Thanks!
>
> Chris
>


[julia-users] Re: Precision problem in matrix inversion (for ridge regression)?

2016-03-01 Thread John Myles White
 

Compare and contrast the following:


inv(map(Float32, Z'*Z))*(Z'*ys_dataset)

inv(map(Float64, Z'*Z))*(Z'*ys_dataset)

 -- John
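The underlying issue is that `Z` is a Float32 Vandermonde-style matrix with an enormous condition number, and explicitly inverting `Z'Z` amplifies the rounding error. A hedged sketch of the usual remedies (placeholder data, not the poster's): work in Float64 and solve with `\` instead of forming `inv`.

```julia
xs = collect(range(-0.8, 3.2, length = 21))
ys = randn(21)                        # placeholder response vector

Z32 = Float32[x^i for x in xs, i in 0:20]
Z64 = Float64.(Z32)

# Solving the normal equations with `\` avoids forming an explicit inverse
beta32 = (Z32'Z32) \ (Z32'ys)
beta64 = (Z64'Z64) \ (Z64'ys)

# Better still: a least-squares solve on Z itself (QR factorization under the hood),
# which never squares the condition number
beta_ls = Z64 \ ys
```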

On Tuesday, March 1, 2016 at 4:19:19 PM UTC-8, Francisco C Pereira wrote:
>
> Hi, 
>
> I deeply apologize in advance if this is a repeated question...  the 
> question is relatively simple. How can this happen? I was estimating a 
> basic ridge regression model but ran into strange results:
>
> lambda=0
> beta_hat=inv(Z'Z+lambda*eye(size(Z,2)))*(Z'*ys_dataset)
> beta_hat2=inv(Z'*Z)*(Z'*ys_dataset)
> beta_hat-beta_hat2
>
> Output (it should be a vector with zeroes):
> 21-element Array{Float64,1}:
>
>   -245.183  
>   -436.175  
>   2633.22   
> 73.0047 
>  -3345.18   
>   1712.78   
>   -354.545  
>187.102  
> 49.8899 
>-49.4564 
> -2.91745
> -0.253636   
> -0.186349   
>  0.457511   
> -0.031112   
>  0.0789713  
> -0.00130604 
> -0.00448352 
> -0.00324767 
>  0.000248121
>  0.000152718
>
>
>
>
> Just for reference, here are the other variables:
>
>
> xs_dataset=[-0.8,-0.6,-0.4,-0.2,0.0,0.2,0.4,0.6,0.8,1.0,1.2,1.4,1.6,1.8,2.0,2.2,2.4,2.6,2.8,3.0,3.2]
>
> ys_dataset=[-194.74997124179467,-218.7737153673673,-262.705773686837,-151.42110150479215,-147.91115080458417,-208.4047152970189,-128.36970674967745,-114.53430386917461,-158.27662488829077,-142.21198729962245,-82.80260143610138,-27.621822591723756,-25.154050675997198,-6.99434298775955,-0.06912672996766567,-5.349114073043877,32.43503138931636,-37.228578783587075,-37.22806830522013,-113.28447052959032,-71.0209779039439]
>
> Z=Array{Float32}([z^i for z in xs_dataset, i in 0:20])
>
>
>
> I've been arguing that Julia is the best in the world, I need help to keep 
> preaching! ;-)
>
>
> Thanks!
>
>Francisco
>
>

[julia-users] Re: Loading a module without loading its submodules

2016-02-24 Thread John Myles White
I don't think Julia is really amenable to this kind of organization because 
Julia's modules have no logical relationship to filesystem layouts, whereas 
Python's system is all about filesystem layout and has nothing to do with 
textual inclusion.
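A sketch of what textual inclusion implies here (the file name `B.jl` is hypothetical): a Julia module evaluates everything it `include`s at load time, so the only way to defer a submodule is to not include it up front.

```julia
# Sketch: keep the submodule's code in its own file and skip it at load time.
module A
println("loaded A")
# `using .A` prints only "loaded A"; B.jl is parsed only when explicitly requested:
load_B() = include(joinpath(@__DIR__, "B.jl"))  # textual inclusion on demand
end
```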

On Wednesday, February 24, 2016 at 1:28:08 PM UTC-8, Cedric St-Jean wrote:
>
>
>
> On Wednesday, February 24, 2016 at 4:15:49 PM UTC-5, Jeffrey Sarnoff wrote:
>>
>> This should not be a problem.  What is your concern?
>>
>
> Loading time/RAM usage. I'm trying to wrap/port scikit-learn, and their 
> module arrangement makes a lot of sense. In Python, I don't get to load 
> code for support vector machines unless I actually need them.
>
> import sklearn.svm
>
> I could define separate modules like "sklearn_svm", "sklearn_cluster", but 
> it's awfully ugly.
>  
>
>> On Wednesday, February 24, 2016 at 3:45:50 PM UTC-5, Cedric St-Jean wrote:
>>>
>>> In Python, loading a module (i.e. importing a file) does not load 
>>> sub-modules, eg.:
>>>
>>> import sklearn
>>> import sklearn.linear_model
>>>
>>> Is there any way to achieve the same thing in Julia?
>>>
>>> module A
>>> println("loaded A")
>>>
>>> module B
>>> println("loaded B")
>>> end
>>>
>>> end
>>>
>>> Can I have "loaded A" without "loaded B"?
>>>
>>

[julia-users] Re: Speed, assigning arrays

2015-08-24 Thread John Myles White
You'll want to study http://julialang.org/blog/2013/09/fast-numeric/ to 
figure out how to make your code faster.
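Concretely, `get_gauss_points!` rebinds the local name `gp_xi` to a freshly allocated matrix, leaving the caller's array untouched; that allocation is what dominates the timing. A hedged sketch of an in-place version using broadcasted assignment (`.=`, available in current Julia):

```julia
function get_gauss_points!(gp_xi::Matrix)
    # `.=` writes into the existing array; plain `=` only rebinds the local name
    gp_xi .= [-1/sqrt(3)  1/sqrt(3)  1/sqrt(3) -1/sqrt(3);
              -1/sqrt(3) -1/sqrt(3)  1/sqrt(3)  1/sqrt(3)]
    return gp_xi
end

gp = zeros(2, 4)
get_gauss_points!(gp)
@assert gp[1, 2] ≈ 1/sqrt(3)
```

This still allocates the right-hand-side matrix on each call; the fully element-wise version in `get_gauss_points2!` avoids even that, which is why it benchmarks fastest.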

On Monday, August 24, 2015 at 10:31:56 AM UTC-7, Mohamed Moussa wrote:

 Hey. I'm new to Julia. I'm playing around with 0.4 on Windows. I'm 
 interested in writing finite element code for research purposes. 

 Consider the following code

 function get_gauss_points!(gp_xi::Matrix)
 gp_xi = [-1/sqrt(3)  1/sqrt(3)  1/sqrt(3) -1/sqrt(3);
  -1/sqrt(3) -1/sqrt(3)  1/sqrt(3)  1/sqrt(3)]
 end

 function get_gauss_points2!(gp_xi::Matrix)
 gp_xi[1,1] = -1/sqrt(3)
 gp_xi[2,1] = -1/sqrt(3)
 gp_xi[1,2] =  1/sqrt(3)
 gp_xi[2,2] = -1/sqrt(3)
 gp_xi[1,3] =  1/sqrt(3)
 gp_xi[2,3] =  1/sqrt(3)
 gp_xi[1,4] = -1/sqrt(3)
 gp_xi[2,4] =  1/sqrt(3)
 end

 function test()
 gp_xi = zeros(2,4)
 get_gauss_points!(gp_xi)
 get_gauss_points2!(gp_xi)

 println("get_gauss_points!:")
 @time for i=1:1e7 get_gauss_points!(gp_xi) end

 println("\nget_gauss_points2!")
 @time for i=1:1e7 get_gauss_points2!(gp_xi) end
 end

 test()

 The output is
 get_gauss_points!:
   1.129231 seconds (100.00 M allocations: 2.682 GB, 14.56% gc time)

 get_gauss_points2!
   0.067619 seconds

 Using @profile shows that get_gauss_points! spends a lot of time in 
 abstractarray. Is it possible to make get_gauss_points run as fast as 
 get_gauss_points2 ? 

 Cheers. 



[julia-users] Re: Splitting a multidimensional function

2015-08-19 Thread John Myles White
Since f1(x) requires a call to f(x), there's no way for your approach to 
work in Julia. You probably should define f1(x) as sqrt(x[1]) and f2(x) as 
2 * x[2].

 -- John
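A sketch of that suggestion: define each component directly, so evaluating one never evaluates the other.

```julia
f(x)  = [sqrt(x[1]), 2 * x[2]]   # original coupled definition

f1(x) = sqrt(x[1])               # component 1, independent of x[2]
f2(x) = 2 * x[2]                 # component 2, never calls sqrt

f2([-2, 3])                      # returns 6; no DomainError
# f([-2, 3]) would still throw, because it evaluates sqrt(-2) first
```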

On Wednesday, August 19, 2015 at 2:32:38 PM UTC-7, Nikolay Kryukov wrote:

 I have a problem when I try to separate the components of a 
 multidimensional function. Example:

 Given the 2D function of a 2D argument:
 f(x) = [sqrt(x[1]), 2*x[2]]

 I want to split it into two separate functions which are the components of 
 the original 2D function. I thought that the obvious solution was:

 f1(x) = f(x)[1]
 f2(x) = f(x)[2]

 The second function merely doubles the second component of its argument, 
 as it should:
 f2([2, 3])
 -- 6.0

 But the functions don't turn out to be completely decoupled: let's see 
 what happens when we do

 f2([-2, 3])
 -- ERROR: DomainError
  in f2 at none:1

 Even though the second function doesn't do sqrt and doesn't even depend on 
 the first component of the argument, the first component of the original 
 function is still checked and obviously returns an error. 

 How do I decouple a 2D function?



[julia-users] Re: Package description on pkg.julialang.org

2015-08-13 Thread John Myles White
Keyan,

Don't worry about pkg.julialang.org. It's only updated once-a-week because 
it's not the definitive package listing.

The listing that has day-to-day importance to user experience is the 
listing on METADATA.jl and the content of your package's README. Worry 
about those and just ignore anything wrong with pkg.julialang.org. It's not 
intended to reflect the package system's state in real time.

 -- John

On Thursday, August 13, 2015 at 7:38:56 AM UTC-7, Keyan wrote:

 Hi,

 the package description of Shannon.jl is not correct. I misused a word, 
 which I have replaced in the single-line repository description. What is 
 the best procedure to have the package description on pkg.julialang.org 
 changed 
 accordingly?

 Cheers,
 Keyan




[julia-users] Re: New variables overwrite old variables in certain cases.

2015-07-25 Thread John Myles White
http://www.johnmyleswhite.com/notebook/2014/09/06/values-vs-bindings-the-map-is-not-the-territory/
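The linked post's point in miniature: `y = x` binds a second name to the same array, while `copy(x)` (or the `x[:]` slice used in the question) creates an independent one. A small sketch:

```julia
x = collect(1:5)

y = x                          # one array, two names
y[3], y[4] = y[4], y[3]
@assert x == [1, 2, 4, 3, 5]   # x changed too

x = collect(1:5)
z = copy(x)                    # independent array (x[:] also copies)
z[3], z[4] = z[4], z[3]
@assert x == [1, 2, 3, 4, 5]   # x untouched
```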

On Saturday, July 25, 2015 at 8:13:33 AM UTC-7, Christopher Fisher wrote:

 I am writing a program (in .3.1) that takes a one dimensional array of 
 indices (x) and creates a new one dimensional array of indices  (x = y) and 
 switches a given pair of elements in the new array. For some reason, the 
 original array (x) changes when I change the new array (see below). 

 In [16]:

 x = [1:5]

 y = x

 y[[3;4]] = y[[4;3]]

 [x y]

 Out[16]:

 5x2 Array{Int64,2}:
  1  1
  2  2
  4  4
  3  3



 However, if I change y = x to y = x[:], the problem does not occur, even 
 though x and x[:] appear to be the same type.


 In [15]:

 x = [1:5]

 y = x[:]

 y[[3;4]] = y[[4;3]]

 [x y]

 Out[15]:

 5x2 Array{Int64,2}:
  1  1
  2  2
  3  4
  4  3
  5  5


 The code below shows that this does not seem to be a general property of the 
 language. As in the previous cases, I define y as x and change y. 


 In [13]:

 x = [1:5]

 y = x

 y = y*4

 [x y]

 Out[13]:

 5x2 Array{Int64,2}:
  1   4
  2   8
  3  12
  4  16
  5  20



 Certainly, I could use y = x[:] or simply change the original array. 
 Nonetheless, the inconsistency seems undesirable from my point of view. Is 
 this a bug? Any help would be appreciated. 




Re: [julia-users] Deducing probability density functions from model equations

2015-07-21 Thread John Myles White
There's tons of WIP-code that implements arithmetic on Distributions. No 
one has the time to finish that code or volunteer to maintain it, so it's 
just sitting in limbo.

OP: What you're asking for sounds like it's largely equivalent to 
probabilistic programming. There are a ton of ways you could implement it, 
but it's almost surely a mistake to implement this sort of thing on your 
own. I would encourage you to learn Stan (which you can call from Julia), 
which is a formal language for specifying joint probability distributions. 
A Stan encoding of your probability model will allow you to draw samples 
from arbitrary conditional distributions by conditioning on observed data.
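For the specific transition in the question, the density can also be written down by hand with Distributions.jl, since xnplus1 given xn is Normal(xn + mu, sigma); the general problem of deriving such densities automatically is what Stan-style probabilistic programming tools solve. A minimal sketch:

```julia
using Distributions

# The stochastic model from the question
transition(xn, mu, sigma) = xn + rand(Normal(mu, sigma))

# Its transition density p(xnplus1 | xn, theta), written manually:
transition_pdf(xnplus1, xn, mu, sigma) = pdf(Normal(xn + mu, sigma), xnplus1)

transition_pdf(0.5, 0.0, 0.0, 1.0)   # standard-normal density evaluated at 0.5
```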

On Tuesday, July 21, 2015 at 6:43:05 AM UTC-7, Tamas Papp wrote:

 In general, f(x,theta) does not necessarily belong to any 
 frequently used distribution family, even if theta does -- it is 
 easy to come up with examples. 

 Some distribution families are closed under certain operations (eg 
 addition, and multiplication by scalars for the normal), and some 
 distributions arise as transformations of other distributions (eg 
 chi square from normal, etc). AFAIK Distributions.jl does not 
 support arithmetic on distributions, but you can always try 
 programming the relevant generic functions, eg + for (Real, 
 Distributions.Normal) if the results of your transformations are 
 distributions in Distribution.jl. 

 Best, 

 Tamas 

 On Tue, Jul 21 2015, amik...@gmail.com wrote: 

  Dear all, 
  
  I don't know if that's the best place to ask such a question but 
  I'll give  it a try: I need to code stochastic models: 
  
  xnplus1 = f(xn, theta) 
  
  where xn is the state of my system at time n and theta is a set 
  of  parameters for this model, constant through time, and 
  possibly containing  noise. As a matter of example, let's 
  consider the following model: 
  
   function transition(n, xn, theta) 
   e = rand(Normal(theta.mu, theta.sigma)) 
   xnplus1 = xn + e 
   return xnplus1 
   end 
  
  My goal is to write such equations and to deduce automatically 
  the  transition probability density function: 
  
  p(xnplus1 | xn, theta). 
  
  I intended to parse the code of the model and look for the 
  rand keyword,  the name of the law used to generate this 
  random variable and the values  given to it. In the previous 
  case, I could deduce that: 
  
  p(xnplus1 | xn, theta) = normal_pdf(xnplus1, xn + theta.mu, 
  theta.sigma) 
  
  where 
  
  normal_pdf(x, mu, sigma) = 1 / (sigma * sqrt(2 * pi)) * exp( - 1 
  / 2 * (x -  mu) ^ 2 / sigma ^ 2) 
  
  (rather) easily I think by specifying that whenever a constant 
  is added to  a normal variable, the mean of the new variable 
  (left value in the  equation) is normal and of mean increased by 
  this constant and of the same  standard deviation. But that's 
  the simplest case of all and it requires me  to specify some 
  rules about the normal distribution. 
  
  I therefore have the following questions: - is what I'm trying 
  to do understandable?  - is it doable at all? and in Julia?  - 
  is specifying rules for each distribution (normal, uniform, 
  Poisson;  correlated, uncorrelated) the way to go or should I 
  think of something even  more generic?  - do you have any other 
  suggestions to solve this problem? 
  
  Thank you very much, 



[julia-users] Re: eig()/eigfact() performance: Julia vs. MATLAB

2015-07-12 Thread John Myles White
http://julia.readthedocs.org/en/release-0.3/manual/performance-tips/
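Beyond the general tips, `eig` timings mostly measure the underlying LAPACK/BLAS library (MATLAB ships Intel MKL; Julia defaults to OpenBLAS), so thread settings and compilation warm-up matter. A hedged sketch of a fairer benchmark in current Julia, where `eigfact` has become `eigen`:

```julia
using LinearAlgebra

n = 1000
M = rand(n, n)

BLAS.set_num_threads(Sys.CPU_THREADS)  # eigen-time is dominated by BLAS/LAPACK

eigen(M)                 # warm up: exclude JIT compilation from the timing
@time for i in 1:10
    eigen(M)
end
```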

On Sunday, July 12, 2015 at 8:33:56 PM UTC+2, Evgeni Bezus wrote:

 Hi all,

 I am a Julia novice and I am considering it as a potential alternative to 
 MATLAB.
 My field is computational nanophotonics and the main numerical technique 
 that I use involves multiple solution of the eigenvalue/eigenvector problem 
 for dense matrices with size of about 1000*1000 (more or less).
 I tried to run the following nearly equivalent code in Julia and in MATLAB:

 Julia code:

 n = 1000
 M = rand(n, n)
 F = eigfact(M)
 tic()
 for i = 1:10
 F = eigfact(M)
 end
 toc()


 MATLAB code:

 n = 1000;
 M = rand(n, n);
 [D, V] = eig(M);
 tic;
 for i = 1:10
 [D, V] = eig(M);
 end
 toc

 It turns out that MATLAB's eig() runs nearly 2.3 times faster than eig() 
 or eigfact() in Julia. On the machine available to me right now (relatively 
 old Core i5 laptop) the average time for MATLAB is of about 37 seconds, 
 while the mean Julia time is of about 85 seconds. I use MATLAB R2010b and 
 Julia 0.3.7 (i tried to run the code both in Juno and in a REPL session and 
 obtained nearly identical results).

 Is there anything that I'm doing wrong?

 Best regards,
 Evgeni



[julia-users] Re: Too many packages?

2015-07-11 Thread John Myles White
I think most of us have the opposite desire: we're trying to move more 
functionality out of the core language and into packages.

 -- John

On Sunday, July 12, 2015 at 3:03:31 AM UTC+2, Burak Budanur wrote:

 I heard a lot about Julia language over the last year and last week 
 had a conversation with a colleague, who attended Juliacon and was 
 quite impressed. We talked about possibly moving some of our fluid 
 dynamics projects to Julia, so that for a new student who is joining
 the project it would be much easier to start without going through 
 learning c++ and/or fortran. 


 I am a physicist and most of my day job is some form of scientific 
 computing. My current default working environment is python 
 (numpy, scipy, sympy, matplotlib) + fortran (f2py) when some part 
 of my code needs to speed up. Yesterday I decided to start a 
 new, relatively easy project as a simple example for an upcoming 
 paper. So I thought this might be a good occasion to start 
 learning Julia language to code a simple dynamical systems toolbox 
 in it, which might be useful for other people as well. 


 Basic functionality I need from the language are these:


 - Symbolic differentiation (for computation of Jacobians)
 - Numerical integration of ODEs (a general purpose integrator, such as
 lsoda from odepack, wrapped in scipy.integrate.odeint)
 - Linear algebra functions
 - Interpolation
 - Plotting in 2D and 3D


 After reading The Julia Express and parts of the documentation, I 
 thought that such a project is not a good investment, at least for 
 now. The reason is all the functionality I listed above are provided
 by external packages, partially excluding linear algebra functions.
 I'm aware that I can use specific packages for all the functionality
 I mentioned above, but each such package is maintained by different
 people, and they can change or become obsolete. I can also find some
 Fortran/C code, and include in Julia, and have all these 
 functionality, but then what is the advantage of using Julia, as 
 opposed to, say, python?


 In a more general sense, I am a little bit turned off by the 
 presence of an external package for almost every task I need to 
 do. I can understand this kind of structure in python as it is a 
 general purpose language. But since Julia is a language 
 specifically for scientific computation, I'd be happy to have 
 something like the basic functionality of MATLAB in the main 
 language. 


 I understand that Julia is under development and there is a lot to
 change and to be added, but I am wondering what is the Julia's future 
 directions regarding these issues? I did some search, but could not 
 find an answer to this question, so I apologize if this was already 
 answered elsewhere. 

Re: [julia-users] Re: Environment reification and lazy evaluation

2015-07-09 Thread John Myles White
 to immutable reifications (which could solve a bunch of problems as is). 
 However, it seems natural to match mutable symbol tables with mutable 
 reifications, and immutable symbol tables with immutable reifications. 
  
  
  On Wednesday, July 8, 2015 at 6:50:03 PM UTC-4, Brandon Taylor 
 wrote: 
  
  I'm not sure I understand... 
  
  On Wednesday, July 8, 2015 at 6:24:37 PM UTC-4, John Myles 
 White wrote: 
  
   Reified scope makes static analysis much too hard. Take any criticism 
   of mutable state: they all apply to globally mutable symbol tables. 
  
  On Wednesday, July 8, 2015 at 10:26:23 PM UTC+2, Milan 
 Bouchet-Valat 
  wrote: 
  
  Le mercredi 08 juillet 2015 à 13:20 -0700, Brandon Taylor a 
 écrit : 
   All functions. 
  Well, I don't know of any language which doesn't have scoping 
  rules... 
  
   Anyway, I didn't say scoping rules are necessarily confusing, I was 
   only referring to R formulas. But according to the examples you posted, 
   your question appears to be different. 
  
  



Re: [julia-users] Converting a string to a custom type results in a function that is not type stable

2015-06-24 Thread John Myles White
Excited you're working on dependent data bootstraps. I implemented one just 
the other day since it could be useful for analyzing benchmark data. Would 
be great to have other methods to try out.

 -- John

On Wednesday, June 24, 2015 at 5:31:52 AM UTC-4, Milan Bouchet-Valat wrote:

 Le mercredi 24 juin 2015 à 01:18 -0700, colintbow...@gmail.com a écrit 
  : 
  Hi all, 
  
  I've got an issue I don't really like in one of my modules, and I was 
  wondering the best thing (if anything) to do about it. 
  
  The module if for dependent bootstraps, but the problem is more of a 
  project design issue. I have a type for each bootstrap method, e.g. 
  `StationaryBootstrap`, `MovingBlockBootstrap` e.t.c. and they are all 
  sub-types of an abstract `BootstrapMethod`. Then I have functions 
  that can be called over these different bootstrap method types and 
  multiple dispatch will make sure the appropriate code is called, e.g 
  `bootindices(::StationaryBootstrap)` or 
  `bootindices(::MovingBlockBootstrap)`. This all works nicely. 
  
  I now want to define some keyword wrapper type functions in the 
  module for users who don't want to learn much about how the types 
  within the module work. For example, my wrapper might let the user 
  describe the bootstrap procedure they want with a string, eg 
  `bootindices(...; bootstrapmethod::ASCIIString="stationary")`. 
  
  The keyword wrapper is called, and I have a variable 
  `bootstrapMethod` which is a string. I need to convert it into the 
  appropriate bootstrap method type so I can then call the appropriate 
  method via multiple dispatch. Currently I have one function that does 
  this and looks something like this: 
  
  function boot_string_to_type(x::ASCIIString) 
  x == "stationary" && return(StationaryBootstrap()) 
  x == "movingBlock" && return(MovingBlock()) 
  ... 
  end 
  
  The problem is that this function is not type-stable. 
  
  Should I be worried? Does anyone have a better way of dealing with 
  this kind of issue? Maybe something involving symbols or expressions, 
  or anonymous functions etc? 
  
  Note, the situation can sometimes get quite a bit more complicated 
  than this, with multiple key-word arguments, all of which need to be 
  combined into the constructor for the relevant type. 
 I think the most Julian way to do this is to have users pass a type 
 instead of a string. They would write 
 bootindices{T<:BootstrapMethod}(...; method::Type{T}=StationaryBootstrap) 

 That's simpler for the user than passing a string (since autocompletion 
 will work), you don't need to define boot_string_to_type(), and it's 
 type-stable. This is the idiom used by fit() in StatsBase.jl (and 
 GLM.jl) to choose which type of model should be estimated. 


 Hope this helps 


 PS: in the cases where you still want to pass a string as an argument, 
 rather than a type, consider using symbols instead, as it is more 
 efficient. 
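Milan's fit()-style idiom in miniature, in post-1.0 syntax (the bootstrap type names come from the question; this is a sketch, not the poster's package):

```julia
abstract type BootstrapMethod end
struct StationaryBootstrap  <: BootstrapMethod end
struct MovingBlockBootstrap <: BootstrapMethod end

# The user passes a *type*; dispatch replaces boot_string_to_type entirely
function bootindices(data; method = StationaryBootstrap)
    return bootindices(method(), data)
end

bootindices(::StationaryBootstrap,  data) = "stationary indices"
bootindices(::MovingBlockBootstrap, data) = "moving-block indices"

bootindices(1:10)                                  # stationary by default
bootindices(1:10; method = MovingBlockBootstrap)   # moving-block
```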



[julia-users] Re: Using composite types with many fields

2015-06-20 Thread John Myles White
It sounds like you might be better off working with Dict's instead of types.

 -- John
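Two hedged alternatives to a 20-argument positional constructor, sketched in post-1.0 syntax (`mutable struct` rather than the thread's `type`): a `Dict`, as John suggests, or an incomplete inner constructor that lets fields be assigned as they are computed, Matlab-style.

```julia
# Option 1: a Dict keyed by field name
mesh = Dict{Symbol, Any}()
mesh[:coords]   = rand(4, 2)
mesh[:elements] = rand(2, 3)

# Option 2: a mutable type with an incomplete inner constructor
mutable struct Mesh
    coords::Array{Float64,2}
    elements::Array{Float64,2}
    Mesh() = new()            # fields start #undef and are filled in later
end

m = Mesh()
m.coords = rand(4, 2)         # assign as values become available
```

The `Dict` is the most flexible but gives up field type checking; the incomplete constructor keeps concrete field types at the cost of runtime errors if an unassigned field is read.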

On Saturday, June 20, 2015 at 12:43:03 PM UTC-7, Stef Kynaston wrote:

 I feel I am missing a simpler approach to replicating the behaviour of a 
 Matlab structure. I am doing FEM, and require structure like behaviour for 
 my model initialisation and mesh generation. Currently I am using composite 
 type definitions, such as:

 type Mesh
 coords   :: Array{Float64,2}  
 elements   :: Array{Float64,2}  
 end

 but in actuality I have many required fields (20 for Mesh, for example). 
 It seems to me very impractical to initialise an instance of Mesh via

 mesh = Mesh(field1, field2, field3, ..., field20),

 as this would require a lookup of the type definition every time to ensure 
 correct ordering. None of my fields have standard default values.

 Is there an easier way to do this that I have overlooked? In Matlab I can 
 just define the fields as I compute their values, using Mesh.coords = 
 ..., and this would work here except that I need to initialise Mesh before 
 the . field referencing will work.

 First post, so apologies if I have failed to observe etiquette rules. 



[julia-users] Re: Plans for Linear Algebra

2015-06-19 Thread John Myles White
Could you elaborate? What exactly is lacking?

On Friday, June 19, 2015 at 6:32:36 AM UTC-7, cuneyts...@gmail.com wrote:

 Dear all,

 Considering that Julia is designed to be a scientific programming 
 language, its built-in Linear Algebra capabilities seems to be limited 
 (based on the users manual of version 0.3).

 May I ask about the short-term and long-term plans about new Linear 
 Algebra features that Julia will have.

 Thank you.

 Cuneyt



[julia-users] Re: Distributions.jl: why use Float64 over Real when defining the functions

2015-06-19 Thread John Myles White
For many of the numerical methods in that package, people weren't sure if 
the code in question would generate reasonable results for other types. 
Some of it probably does, but it's not trivially true that evaluating those 
distribution on high-precision floats would produce correct results.

On Friday, June 19, 2015 at 8:59:41 AM UTC-7, Xiubo Zhang wrote:

 I have been reading the source code for Distributions.jl, and had noted 
 that functions such as pdf are defined over (d::Normal, x::Float64) (for 
 Normal distribution, for example) rather than (d::Normal, x::Real). The 
 manual of Julia recommended that more general types be used over specific 
 types when possible; also it seems one of the major features of the 
 language is the smart type inferencing which eliminates the needs to use 
 very specific types.

 So my question really is, what is the rationale behind the decision of 
 using specific types over abstract types? What are the advantages and 
 disadvantages?



[julia-users] Re: good time to start to learn julia?

2015-06-18 Thread John Myles White
My answer to these questions is always the same these days: if you're not 
sure that you have enough expertise to determine Julia's value for 
yourself, then you should be cautious and stick to playing around with 
Julia rather than trying to jump onboard wholesale. Julia is a wonderful 
language and it's very usable for many things, but you shouldn't expect 
that you can do all (or even most) of your work in Julia unless you're 
confident that you can do the development work required to implement any 
functionality that you find to be missing. Depending on your specific 
interests, you might find that Julia is missing nothing or you might find 
that Julia is missing everything.

 -- John

On Thursday, June 18, 2015 at 7:27:52 AM UTC-7, J.Z. wrote:

 Hi, 

 I have been following julia for some time and have seen lots of positive 
 comments. There are still lots of good work being put into its development. 
 I use R and Python to do lots of technical (statistical) computing and 
 would like to try julia for my work. My quick question to the current users 
 and developers is that whether it is a good time to learn julia now, or 
 should I wait until the language is more mature? Could it be the case that 
 things I learn now would be broken in future releases and I have to relearn 
 everything?

 Thanks!
 JZ



[julia-users] Re: good time to start to learn julia?

2015-06-18 Thread John Myles White
There will definitely still be some changes to the core language.

But, in my experience, the changes to the core language are seldom very 
burdensome. They're almost always large improvements to the language, so 
the code that you have to rewrite ends up being vastly easier to maintain. 
This, for example, was my experience when both default and keyword 
arguments were introduced into the language.

That said, I do spend a lot of time working on Julia code so I'm not so 
upset by the maintenance overhead.

Sounds like a lot of the newer arrivals to Julia are even more positive 
than I am, so it can't hurt to try.

 -- John

On Thursday, June 18, 2015 at 9:01:52 AM UTC-7, J.Z. wrote:

 I should have been more specific. I am just wondering if the core language 
 itself (syntax etc.) would change a lot in the future or not. I am not 
 expecting that Julia has a specific package that R provides. But then it's 
 good to know whether the fundamentals like basic visualization and 
 optimization functions are mature or not. 

 On Thursday, June 18, 2015 at 10:57:08 AM UTC-4, John Myles White wrote:

 My answer to these questions is always the same these days: if you're not 
 sure that you have enough expertise to determine Julia's value for 
 yourself, then you should be cautious and stick to playing around with 
 Julia rather than trying to jump onboard wholesale. Julia is a wonderful 
 language and it's very usable for many things, but you shouldn't expect 
 that you can do all (or even most) of your work in Julia unless you're 
 confident that you can do the development work required to implement any 
 functionality that you find to be missing. Depending on your specific 
 interests, you might find that Julia is missing nothing or you might find 
 that Julia is missing everything.

  -- John

 On Thursday, June 18, 2015 at 7:27:52 AM UTC-7, J.Z. wrote:

 Hi, 

 I have been following julia for some time and have seen lots of positive 
 comments. There are still lots of good work being put into its development. 
 I use R and Python to do lots of technical (statistical) computing and 
 would like to try julia for my work. My quick question to the current users 
 and developers is that whether it is a good time to learn julia now, or 
 should I wait until the language is more mature? Could it be the case that 
 things I learn now would be broken in future releases and I have to relearn 
 everything?

 Thanks!
 JZ



[julia-users] Re: Multivariate optimization with bounds

2015-06-18 Thread John Myles White
Try NLopt.

 -- John
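A hedged sketch of box-constrained minimization with NLopt.jl (algorithm choice and bounds here are illustrative; `max_objective!` exists analogously for maximization):

```julia
using NLopt

opt = Opt(:LN_NELDERMEAD, 2)           # derivative-free local algorithm, 2 variables
lower_bounds!(opt, [0.0, 0.0])         # per-variable lower bounds
upper_bounds!(opt, [1.0, 2.0])         # per-variable upper bounds

# Objective takes (x, grad); grad is unused by derivative-free algorithms
min_objective!(opt, (x, grad) -> (x[1] - 0.3)^2 + (x[2] - 0.7)^2)

(minf, minx, ret) = optimize(opt, [0.5, 0.5])
```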

On Thursday, June 18, 2015 at 8:35:20 AM UTC-7, Nils Gudat wrote:

 I'm trying to minimize a function of multiple variables using the Optim 
 package. In my original Matlab code, I'm supplying two arrays to fmincon to 
 set upper and lower bounds on each of the variables, but it seems the 
 optimize function in Optim only allows for bounds in the univariate case. 

 Is this implemented somewhere else in Julia?



Re: [julia-users] Re: Atom package

2015-06-12 Thread John Myles White
Should be set now. There's only an empty repo there right now 
at https://github.com/JuliaLang/atom-language-julia

On Friday, June 12, 2015 at 2:39:29 PM UTC-7, John Myles White wrote:

 I can create a repo.

 On Friday, June 12, 2015 at 12:04:09 PM UTC-7, Spencer Lyon wrote:

 John, 

 Do you have create permissions within the JuliaLang github  org? 

 If not, who should we contact to create the repo/set up permissions? 

 I guess I could just request that my repo be transferred to the 
 organization, but I'd want whoever gets that message to have a heads up. 



Re: [julia-users] Re: Atom package

2015-06-12 Thread John Myles White
I can create a repo.

On Friday, June 12, 2015 at 12:04:09 PM UTC-7, Spencer Lyon wrote:

 John, 

 Do you have create permissions within the JuliaLang github  org? 

 If not, who should we contact to create the repo/set up permissions? 

 I guess I could just request that my repo be transferred to the 
 organization, but I'd want whoever gets that message to have a heads up. 



[julia-users] Re: Atom package

2015-06-11 Thread John Myles White
Looking into this more, it looks like the repo can have any name you want. 
The important thing is that the package is named language-julia, not the 
repo.

On Thursday, June 11, 2015 at 8:05:20 PM UTC-7, Spencer Lyon wrote:

 Good to see something for atom-language-julia. That is what the package 
 was named originally, but it wasn't maintained so it was forked and renamed 
 language-julia. 

 I don't think there will really be any reusable components. I just don't 
 know another editor/tool that uses textmate style grammar in cson format or 
 atom's snippet syntax, for example. 



[julia-users] Re: Writing a mutable function (exclamation mark function)

2015-06-08 Thread John Myles White
http://julia.readthedocs.org/en/release-0.3/manual/faq/#i-passed-an-argument-x-to-a-function-modified-it-inside-that-function-but-on-the-outside-the-variable-x-is-still-unchanged-why

http://www.johnmyleswhite.com/notebook/2014/09/06/values-vs-bindings-the-map-is-not-the-territory/

On Monday, June 8, 2015 at 7:34:47 AM UTC-7, dwor...@gmail.com wrote:

 I'm currently trying to understand how functions with an exclamation mark 
 at the end work. I know the exclamation mark is just a notational point 
 however I'm currently confused at how to actually write a mutable function.
 If a = 1

 function add_one(a)
 return a + 1
 end

 running add_one(a) twice outputs:
 2
 2

 how would I create add_one!(a) to output:
 2
 3



[julia-users] Re: a simple function which changes the order of a vector (by modifying the argument)

2015-06-08 Thread John Myles White
http://julia.readthedocs.org/en/release-0.3/manual/faq/#i-passed-an-argument-x-to-a-function-modified-it-inside-that-function-but-on-the-outside-the-variable-x-is-still-unchanged-why

http://www.johnmyleswhite.com/notebook/2014/09/06/values-vs-bindings-the-map-is-not-the-territory/

On Monday, June 8, 2015 at 9:28:16 AM UTC-7, bernhard wrote:

 All

 I would like to sort a vector according to a given index (or permutation). 
 I understand that the code below produces a false (last evaluation below) 
 due to the fact that the function creates a local copy of x.
 How can I modify my function such that x is in sorted differently (i.e. 
 such that the argument is modified)?
 Thank you in advance.


 x=rand(5)
 anothervector=rand(5)

 mypermutation=sortperm(x)

 function change_order!(x,srt)
   x=x[srt]
   nothing
 end

 before=copy(anothervector)
 change_order!(anothervector,mypermutation)

 before==anothervector[mypermutation] #false

 before==anothervector #true, this should be false
 before[mypermutation]==anothervector #false, this should be true
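The fix for the function above: `x = x[srt]` rebinds the local name to a new array, while `x[:] = x[srt]` writes through to the caller's array (Base's `permute!` does the same in place). A sketch:

```julia
# `x = x[srt]` only rebinds the local variable; assigning into the
# existing array with `x[:] = x[srt]` mutates the caller's array.
function change_order!(x, srt)
    x[:] = x[srt]
    return nothing
end

v = [30, 10, 20]
p = sortperm(v)        # p == [2, 3, 1]
change_order!(v, p)
# v is now [10, 20, 30]; Base.permute!(v, p) would do the same in place
```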



[julia-users] Re: Anonymous Objects?

2015-06-04 Thread John Myles White
https://groups.google.com/forum/#!topic/julia-users/HYhm0A8KQXw

On Thursday, June 4, 2015 at 2:02:51 PM UTC-7, David Gold wrote:

 What is A*b? Is it just a formal multiplication?

 On Thursday, June 4, 2015 at 4:54:39 PM UTC-4, Gabriel Goh wrote:

 Do there exist anonymous objects, in the same way anonymous functions 
 exist?

 For example, I'd like to return a object A, without going through the 
 hoops of making an explicit type, which you can do (using my made up syntax)

 function createMatrix()
   # create an anonymous object A
   A = anonymous type 
 *(A,b) = A*b
 \(A,b) = A\b
   end
   return A
 end

 A = createMatrix()
 A*x
 A\x

 etc.

 Gabe



[julia-users] Re: [Slightly OT] Creating JuliaCon presentation slides as a Jupyter notebook

2015-06-04 Thread John Myles White
A long time ago, I did this with 
IJulia: https://github.com/johnmyleswhite/UCDavis.jl

Hopefully most of my approach isn't relevant anymore, since I hit a couple 
of bugs in the resulting slides that I had to fix with a Ruby script.

 -- John

On Thursday, June 4, 2015 at 7:49:15 AM UTC-7, Douglas Bates wrote:

 The JuliaCon2015 organizers have suggested preparing conference 
 proceedings in the form of Jupyter notebooks, which I think is a great 
 idea.  I have considered going further and preparing presentation slides 
 using Jupyter.  I know this can be done but many of the search engine hits 
 on the topic seem out of date.  Can anyone suggest a discussion or sample 
 notebook regarding this?



Re: [julia-users] Interop with C on an array of mutable structs

2015-06-02 Thread John Myles White
Indeed, that is exactly what I'm doing. I suspect Isaiah had the 
mysql_query() API in mind, which is much easier to work with, but less 
powerful.

Thanks everyone for your comments.

 -- John

On Tuesday, June 2, 2015 at 1:57:50 PM UTC-7, Jameson wrote:

 @Isaiah I suspect the array referred to is of type `MYSQL_BIND[]`, in 
 which case John is correct that declaring this as `Vector{MYSQL_BIND}` 
 (where `MYSQL_BIND` is an appropriated defined isbits type) should work 
 exactly as desired.

 On Tue, Jun 2, 2015 at 11:09 AM John Myles White johnmyleswh...@gmail.com 
 wrote:

 I've been fixing up the MySQL.jl package recently. To receive data on the 
 client-side from prepared statements, I need to pass around an array of 
 mutable structs, defined in MySQL C's API, so that C can populate those 
 structs with data from the server.

 If helpful, an example of how this works in pure C is at: 
 http://www.erickcantwell.com/2011/08/mysql-prepared-statements-in-c/

 I'm not sure how one is supposed to work with arrays of mutable structs. 
 Since the structs satisfy the isbits() requirement when written as Julia 
 immutables, I think I can get away with passing an array of Julia immutable 
 objects and letting C do all the mutation. Is that the best way to do this 
 sort of thing in Julia?

  -- John

  

[julia-users] Interop with C on an array of mutable structs

2015-06-02 Thread John Myles White
I've been fixing up the MySQL.jl package recently. To receive data on the 
client-side from prepared statements, I need to pass around an array of mutable 
structs, defined in MySQL C's API, so that C can populate those structs with 
data from the server.

If helpful, an example of how this works in pure C is at: 
http://www.erickcantwell.com/2011/08/mysql-prepared-statements-in-c/

I'm not sure how one is supposed to work with arrays of mutable structs. Since 
the structs satisfy the isbits() requirement when written as Julia immutables, 
I think I can get away with passing an array of Julia immutable objects and 
letting C do all the mutation. Is that the best way to do this sort of thing in 
Julia?

 -- John
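The same pattern — C mutating a Julia-owned array of isbits structs in place — can be exercised against a C function every platform has: libc's qsort. A sketch in post-1.0 syntax (the thread predates `struct` and `@cfunction`), not the MySQL.jl code itself:

```julia
# An isbits struct: C sees the array as a plain buffer of {key, val} pairs.
struct KV
    key::Cint
    val::Cint
end

# Comparator called back from C; it reads two KV elements through pointers.
function kv_cmp(pa::Ptr{KV}, pb::Ptr{KV})
    a, b = unsafe_load(pa), unsafe_load(pb)
    return Cint(sign(a.key - b.key))
end

pairs = [KV(3, 30), KV(1, 10), KV(2, 20)]
ccall(:qsort, Cvoid, (Ptr{KV}, Csize_t, Csize_t, Ptr{Cvoid}),
      pairs, length(pairs), sizeof(KV),
      @cfunction(kv_cmp, Cint, (Ptr{KV}, Ptr{KV})))
# C has now sorted the Julia-allocated array in place: keys 1, 2, 3
```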



[julia-users] Re: Does union() imply worse performance?

2015-05-30 Thread John Myles White
The NullableArrays work is very far behind schedule. I developed RSI right 
after announcing the work on NullableArrays and am still recovering, which 
means that I can spend very little time working on Julia code these days.

I'll give you more details offline.

 -- John

On Saturday, May 30, 2015 at 10:48:10 AM UTC-7, David Gold wrote:

 Thank you for the link and the explanation, John -- it's definitely 
 helpful. Is current work with Nullable and data structures available 
 anywhere in JuliaStats, or is it being developed elsewhere?

 On Saturday, May 30, 2015 at 12:23:09 PM UTC-4, John Myles White wrote:

 David,

 To clarify your understanding of what's wrong with DataArrays, check out 
 the DataArray code for something like getindex(): 
 https://github.com/JuliaStats/DataArrays.jl/blob/master/src/indexing.jl#L109 

 I don't have a full understanding of Julia's type inference system, but 
 here's my best attempt to explain my current understanding of the system 
 and how it affects Seth's original example.

 Consider two simple functions, f and g, and their application inside a 
 larger function, gf():

 # Given pre-existing definitions such that:
 #
 # f(input::R) = output::S
 # g(input::S) = output::T
 #
 # What can we infer about the following larger function?
 function gf(x::Any)
 return g(f(x))
 end

 The important questions to ask are about what we can infer at 
 method-compile-time for gf(). Specifically, ask:

 (1) Can we determine the type S given the type R, which is currently 
 bound to the type of the specific value of x on which we called gf()? (Note 
 that it was the act of calling gf(x) on a specific value that triggered the 
 entire method-compilation process.)

 (2) Can we determine that the type S is a specific concrete type? 
 Concreteness matters, because we're going to have to think about how the 
 output of f() affects the input of g(). In particular, we need to know 
 whether we need to perform run-time dispatch inside of gf() or whether all 
 dispatch inside of gf() can be determined statically given the type of 
 gf()'s argument x.

 (3) Assuming that we successfully determined a concrete type S given R, 
 can we repeat the process for g() to yield a concrete type for T? If so, 
 then we'll be able to infer, at least for one specific type R, the concrete 
 output type of gf(x). If not, we'll have to give looser bounds on the 
 concrete types that come out of gf() given an input of a specific value 
 like our current x. That would be important if we were going to call gf() 
 inside another function.

 Hope that helps.

  -- John

 On Saturday, May 30, 2015 at 4:51:09 AM UTC-7, David Gold wrote:

 @Steven,

 Would you help me to understand the difference between this case here 
 and the case of DataArray{T}s -- which, by my understanding, are basically 
 AbstractArray{Union{T, NaN}, 1}'s? My first thought was that taking a 
 Union{Bool, AbstractArray{Float, 2}} argument would potentially interfere 
 with the compiler's ability to perform type inference, similar to how 
 looping through a DataArray can experience a cost from the compiler having 
 to deal with possible NaNs. 

 But what you're saying is that this does not apply here, since 
 presumably the argument, whether it is a Bool or an AbstractArray, would be 
 type-stable throughout the functions operations -- unlike the values 
 contained in a DataArray. Would it be fair to say that dealing with Union{} 
 types tends to be dangerous to performance mostly when they are looped over 
 in some sort of container, since in that case it's not a matter of simply 
 dispatching a specially compiled method on one of the conjunct types or the 
 other?

 On Friday, May 29, 2015 at 9:49:45 PM UTC-4, Steven G. Johnson wrote:

 *No!*  This is one of the most common misconceptions about Julia 
 programming.

 The type declarations in function arguments have *no impact* on 
 performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at 
 all* in the function argument, and it *still* won't matter for 
 performance.

 The argument types are just a filter for when the function is 
 applicable.

 The first time a function is called, a specialized version is compiled 
 for the types of the arguments that you pass it.  Subsequently, when you 
 call it with arguments of the same type, the specialized version is called.

 Note also that a default argument foo(x, y=false) is exactly equivalent 
 to defining

 foo(x,y) = ...
 foo(x) = foo(x, false)

 So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) 
 specialized for an Array{Int} in the second argument.  The existence of a 
 version of foo specialized for a boolean y is irrelevant.



[julia-users] Re: Does union() imply worse performance?

2015-05-30 Thread John Myles White
David,

To clarify your understanding of what's wrong with DataArrays, check out 
the DataArray code for something like 
getindex(): 
https://github.com/JuliaStats/DataArrays.jl/blob/master/src/indexing.jl#L109

I don't have a full understanding of Julia's type inference system, but 
here's my best attempt to explain my current understanding of the system 
and how it affects Seth's original example.

Consider two simple functions, f and g, and their application inside a 
larger function, gf():

# Given pre-existing definitions such that:
#
# f(input::R) = output::S
# g(input::S) = output::T
#
# What can we infer about the following larger function?
function gf(x::Any)
return g(f(x))
end

The important questions to ask are about what we can infer at 
method-compile-time for gf(). Specifically, ask:

(1) Can we determine the type S given the type R, which is currently bound 
to the type of the specific value of x on which we called gf()? (Note that 
it was the act of calling gf(x) on a specific value that triggered the 
entire method-compilation process.)

(2) Can we determine that the type S is a specific concrete type? 
Concreteness matters, because we're going to have to think about how the 
output of f() affects the input of g(). In particular, we need to know 
whether we need to perform run-time dispatch inside of gf() or whether all 
dispatch inside of gf() can be determined statically given the type of 
gf()'s argument x.

(3) Assuming that we successfully determined a concrete type S given R, can 
we repeat the process for g() to yield a concrete type for T? If so, then 
we'll be able to infer, at least for one specific type R, the concrete 
output type of gf(x). If not, we'll have to give looser bounds on the 
concrete types that come out of gf() given an input of a specific value 
like our current x. That would be important if we were going to call gf() 
inside another function.

Hope that helps.

 -- John

On Saturday, May 30, 2015 at 4:51:09 AM UTC-7, David Gold wrote:

 @Steven,

 Would you help me to understand the difference between this case here and 
 the case of DataArray{T}s -- which, by my understanding, are basically 
 AbstractArray{Union{T, NaN}, 1}'s? My first thought was that taking a 
 Union{Bool, AbstractArray{Float, 2}} argument would potentially interfere 
 with the compiler's ability to perform type inference, similar to how 
 looping through a DataArray can experience a cost from the compiler having 
 to deal with possible NaNs. 

 But what you're saying is that this does not apply here, since presumably 
 the argument, whether it is a Bool or an AbstractArray, would be 
 type-stable throughout the functions operations -- unlike the values 
 contained in a DataArray. Would it be fair to say that dealing with Union{} 
 types tends to be dangerous to performance mostly when they are looped over 
 in some sort of container, since in that case it's not a matter of simply 
 dispatching a specially compiled method on one of the conjunct types or the 
 other?

 On Friday, May 29, 2015 at 9:49:45 PM UTC-4, Steven G. Johnson wrote:

 *No!*  This is one of the most common misconceptions about Julia 
 programming.

 The type declarations in function arguments have *no impact* on 
 performance.  Zero.  Nada.  Zip.  You *don't have to declare a type at 
 all* in the function argument, and it *still* won't matter for 
 performance.

 The argument types are just a filter for when the function is applicable.

 The first time a function is called, a specialized version is compiled 
 for the types of the arguments that you pass it.  Subsequently, when you 
 call it with arguments of the same type, the specialized version is called.

 Note also that a default argument foo(x, y=false) is exactly equivalent 
 to defining

 foo(x,y) = ...
 foo(x) = foo(x, false)

 So, if you call foo(x, [1,2,3]), it calls a version of foo(x,y) 
 specialized for an Array{Int} in the second argument.  The existence of a 
 version of foo specialized for a boolean y is irrelevant.
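The inference chain described above can be observed directly with `Base.return_types` (or interactively with `@code_warntype`). A sketch in current Julia syntax, with hypothetical functions standing in for f, g, and gf:

```julia
f_stable(x::Int) = x + 1             # S is concretely Int
g_of(x) = 2 * x
gf(x) = g_of(f_stable(x))

# Inference propagates a concrete type through the whole chain:
only(Base.return_types(gf, (Int,)))  # == Int

f_unstable(x::Int) = x > 0 ? 1 : 1.0 # S is Union{Int, Float64}
gf2(x) = g_of(f_unstable(x))

# The uncertainty propagates downstream instead of being resolved:
only(Base.return_types(gf2, (Int,)))  # == Union{Float64, Int}
```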



Re: [julia-users] Re: Maybe, Nullable, etc

2015-05-29 Thread John Myles White
Nullable is Julia's closest equivalent to Haskell's Maybe. But there are 
important differences:

* Presently run-time dispatch through even a very restricted subset of 
types is quite costly. So you want to make sure that all code allows type 
inference to assign a concrete type to every value. Nullable allows this, 
whereas any Union(S, T) with S != T will not. We dealt with the resulting 
performance problem for years in DataArrays; Nullable is the solution to 
that problem.

* Julia does literally nothing to ensure compile-time exhaustiveness (cf. 
https://blogs.janestreet.com/what-do-haskellers-have-against-exhaustiveness/), 
which means that Nullable's sole value is providing a concrete type that 
encodes (a) all values in a set S and (b) the lack of a value from S. Put 
another way: branching on nullness should always happen in run-time world 
via the isnull() predicate. Nullable is not meant to allow dispatching on 
nullness.

For anyone interested in having more background on this topic, you might 
check out my talk at last year's JuliaCon. We literally invented Nullable 
during that talk; our current implementation is very similar to the draft 
Stefan wrote while I was talking.

  -- John
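For readers revisiting this thread after Julia 1.0: `Nullable` was eventually removed from Base, and the compiler learned to handle small `Union`s efficiently, so "a value or no value" is now idiomatically spelled with a `Union` and a run-time check — branching on missingness via a predicate, exactly as described above, never via dispatch. A sketch of the modern idiom (the old Nullable API is not assumed):

```julia
# Post-1.0 idiom for "an Int or no value": a small, inference-friendly Union.
function tryparse_positive(s::AbstractString)::Union{Int, Nothing}
    n = tryparse(Int, s)
    return (n === nothing || n <= 0) ? nothing : n
end

v = tryparse_positive("42")
# Branch on missingness at run time, the analogue of isnull():
v === nothing ? 0 : v + 1    # 43
```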

On Friday, May 29, 2015 at 12:20:32 PM UTC-7, andrew cooke wrote:

 On Friday, 29 May 2015 15:58:12 UTC-3, Yichao Yu wrote:

 On Fri, May 29, 2015 at 2:12 PM, andrew cooke and...@acooke.org wrote: 
  then Julia needs a Maybe type as well? 

 Note that different from `Nullable{T}`, which is a type by itself, 
 `Maybe{T}` as proposed in the issue and you email, is just a type 
 alias. In this sense, it is already defined (just write 
 `Union(Nothing, ...)`) and it can already appear in type inference. 

 ``` 
 julia> f(a) = a > 0 ? nothing : 1 
 f (generic function with 1 method) 

 julia> @code_typed f(1) 
 1-element Array{Any,1}: 
  :($(Expr(:lambda, Any[:a], 
 Any[Any[],Any[Any[:a,Int64,0]],Any[],Any[]], :(begin  # none, line 1: 
 unless (top(slt_int))(0,a::Int64)::Bool goto 0 
 return nothing 
 0: 
 return 1 
 end::Union(Int64,Void) 
 ``` 

 It might make sense to add such a type alias if a lot of people need 
 to write it but I personally don't think that's the case. 

 First of all, if the function returns a certain concrete type or 
 nothing, the type inference can figure that out by itself (as shown 
 above) and you don't need to tell it anything about that. 

 Second, you probably don't want to write that as the type constraint 
 of the argument either. If the function doesn't have any other 
 methods, writing that is meaningless (except as sanity check probably) 
 since julia will specialize on the concrete type anyway. If the 
 function has other methods, you will introduce ambiguity. 

 ```julia 
 julia> f(::Union(Int, Void)) = 1 
 f (generic function with 1 method) 

 julia> f(::Union(Float64, Void)) = 2 
 Warning: New definition 
 f(Union(Float64, Void)) at none:1 
 is ambiguous with: 
 f(Union(Int64, Void)) at none:1. 
 To fix, define 
 f(Void) 
 before the new definition. 
 f (generic function with 2 methods) 
 ``` 

 For your usecase, according to what you've described, I don't think 
 you will need to explicitly writh `Union(Nothing, ...)` 

 1. If you want to dispatch on a single Nothing and if the input has 
 arbitrary types, just check for nothing explicitly or defining 
 `f(::Nothing)` 
 2. If you want to distinguish between different types of Nothing (i.e. 
 if the input is a missing Int or a missing Float64), you can probably 
 use `Nullable{T}()` as the returned missing value. Whether you want to 
 return Nullable or not for non-missing value depends on whether you 
 want the user function (that generates the input value) to be type 
 stable.


 1 doesn't work because the user may have meant to return the value 
 Nothing.  that's why you need a type.  because you need to go to a 
 metalanguage to describe having no value in the lower level language (aka 
 use / mention distinction).  i don't share your worry with type stability 
 yet because i have no idea if i can even do what i want - whether it is 
 fast or not comes much later.  but i do want the user to be able to return 
 anything or decide not to return anything at all, and since anything 
 can be literally Nothing, i cannot use a type union.

 so there seems to be a (3) which is unconnected with type stability, but 
 allows quoting.  this is the kind of Maybe that haskell has (and what i was 
 thinking of when i suggested that Julia needed one).  but since it is 
 structurally equivalent to Nullable then it seems like Nullable would do 
 just fine there.

  

  
  On Friday, 29 May 2015 10:16:40 UTC-3, andrew cooke wrote: 
  
  
  I'm not sure if I'm confused, or if there's a problem here, and I 
 don't 
  know what any fix would be anyway, so apologies for the open-ended 
 post 
  but... 
  
  I cannot find on option type in Julia that I can dispatch on, so 
 that I 
  have a method 

Re: [julia-users] Re: Maybe, Nullable, etc

2015-05-29 Thread John Myles White
To reinforce Yichao's point, using Nullables in the way you propose comes 
with substantial performance risks. The problem is that your strategy 
globally poisons type inference: type-uncertain functions don't just force 
run-time dispatch near the location of their function calls, they also 
force run-time dispatch in every downstream function call that depends upon 
the results of that function call.

Consider code like the following:

function slow()
    x = uncertain_type_generator()
    for i in 1:10_000_000
        do_something_with(x)
    end
end


Every call to do_something_with() is made type-uncertain because the type 
of x can't be inferred at method-compile-time. So your hot loop is poisoned 
by the use of a once-called type-uncertain function.

 -- John
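The standard remedy is a "function barrier": pass the type-uncertain value into an inner function, which gets its own specialized compilation for each concrete type it actually receives. A hedged sketch — `uncertain_type_generator` here is a made-up stand-in for any call whose return type can't be inferred:

```julia
# Hypothetical type-unstable call: inferred as Union{Int, Float64}.
uncertain_type_generator() = rand(Bool) ? 1 : 1.0

# Poisoned version: every operation on x inside the loop may need
# run-time dispatch, because x's type is unknown at compile time.
function slow()
    x = uncertain_type_generator()
    s = zero(x)
    for i in 1:1_000_000
        s += x
    end
    return s
end

# Function barrier: kernel() is compiled once per concrete type of x,
# so the hot loop inside it is fully type-stable.
function kernel(x)
    s = zero(x)
    for i in 1:1_000_000
        s += x
    end
    return s
end

fast() = kernel(uncertain_type_generator())
```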


[julia-users] Re: IDE Julia

2015-05-20 Thread John Myles White
In general, we write in English on this list.

Have you tried Juno?

 -- John

On Wednesday, May 20, 2015 at 3:33:08 PM UTC-7, perico garcia wrote:

 A Julia IDE like Anaconda or Spyder? It would be the deciding factor 
 for the wider adoption of the language.



Re: [julia-users] Packing and Unpacking several Matrixes into flat vectors

2015-05-14 Thread John Myles White
In the long-term, the best way to do this will be to use SubArray and 
ReshapeArray. You'll allocate enough space for all parameters, then unpack 
them into separate objects when that helps.

 -- John
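A sketch of that approach in current Julia syntax: allocate one flat parameter vector and alias matrix- and vector-shaped views into it, so the optimizer sees a single vector while the model code sees matrices, with no copying on pack or unpack:

```julia
# One flat parameter vector; W and b are views aliasing its memory,
# so writes through either name are visible in the flat vector.
θ = zeros(6)
W = reshape(view(θ, 1:4), 2, 2)   # 2x2 weight matrix, no copy
b = view(θ, 5:6)                  # length-2 bias vector, no copy

W[1, 1] = 3.0
b[2] = 7.0
# θ now reflects both writes: [3.0, 0, 0, 0, 0, 7.0]
```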

On Thursday, May 14, 2015 at 2:03:27 AM UTC-7, Tim Holy wrote:

 Some Optim algorithms, like cg, already allow you to optimize a matrix. 

 --Tim 

 On Wednesday, May 13, 2015 11:50:00 PM Lyndon White wrote: 
  Hi all, 
  I've been trinking about this for a while. 
  
  Numerical Optimistation Libraries, eg 
  NLopt(https://github.com/JuliaOpt/NLopt.jl) and Optim 
  (https://github.com/JuliaOpt/Optim.jl), 
  require the parameter to be optimised (x), to be a vector. 
  
  In Neural Networks, the paramer to be optimise are Weight Matrixes and 
 and 
  Bias Vectors. 
  
  The work around to train a Neural Network with such an optimistation 
  library is to Pack those matrixes and vectors down to single vector, 
 when 
  returning the gradient, 
  and to unpack it into the matrixes and vectors when acted to evaluate 
 the 
  gradient/loss. 
  
  Like follows: 
  
  type NN 
  
  W_e::Matrix{Float64} 
  b_e::Vector{Float64} 
  W_d::Matrix{Float64} 
  b_d::Vector{Float64} 
  
  end 
  
  
  function unpack!(nn::NN, θ::Vector) 
  W_e_len = length(nn.W_e) 
  b_e_len = length(nn.b_e) 
  W_d_len = length(nn.W_d) 
  b_d_len = length(nn.b_d) 
  W_e_shape = size(nn.W_e) 
  W_d_shape = size(nn.W_d) 
  
  nn.W_e = reshape(θ[1: W_e_len],W_e_shape) 
  nn.b_e = θ[W_e_len+1: W_e_len+b_e_len] 
  nn.W_d = reshape(θ[W_e_len+b_e_len+1: 
 W_e_len+b_e_len+W_d_len],W_d_shape 
  ) 
  nn.b_d = θ[W_e_len+b_e_len+W_d_len+1: end] 
  
  nn 
  end 
  
   function pack(nn::NN) 
       pack(nn.W_e, nn.b_e, nn.W_d, nn.b_d) 
   end 
   
   function pack(∇W_e::Matrix{Float64}, ∇b_e::Vector{Float64}, 
                 ∇W_d::Matrix{Float64}, ∇b_d::Vector{Float64}) 
       [∇W_e[:], ∇b_e, ∇W_d[:], ∇b_d] 
   end 
  
  
  
  
  Then use it like: 
  
  function loss_and_loss_grad!(θ::Vector, grad::Vector)   #NLOpt and Optim 
  both provide the grad matrix  to be overwritten in place 
  grad[:] = 0 
  unpack!(nn_outer, θ) #Keep a global nn to track size, (and handy if 
 the 
  algorithm crashes) 
  
  
  function loss_and_loss_grad(train_datum) 
  ∇W_e, ∇b_e, ∇W_d, ∇b_d, err = 
 loss_and_loss_grad_single(nn_outer, 
  train_datum) 
  [pack(∇W_e, ∇b_e, ∇W_d, ∇b_d), err] 
  end 
  
   ret = map(loss_and_loss_grad, training_data) |> sum 
  grad[:] = ret[1:end-1] 
  err=ret[end] 
  
  grad[:]/=length(training_data) 
  err/=length(training_data) 
  err 
  end 
  
  
  
  
  
  This works. 
  But in involved excessive array copies (I suspect). 
  
  The order in the packed vector does not matter, so long as it is 
 consistent. 
  
  Now, memory is already linear -- matrices are Vectors in memory with 
  special operations defined that say how to interpret them in 2D. 
  and the matrixes in the composite type, are adjacent in memory (i 
 assume, 
  since why not be like a C struct). 
  
  So it is logically simple to just reinterpret them them as a single 
 vector. 
  I don't think reinterpet functions on composite types though. 
  
  In C of PL/I, this could be solved by defining the Composite type as an 
  untagged union, of a Vector and a Structure. 
  I don't think Julia has this facility. (It is pretty niche, this is one 
 of 
  the only times i can think of it as being actually convenient). 
  
  
  Anyone have any suggestions? 
  
  Regards 



Re: [julia-users] Yet Another String Concatenation Thread (was: Re: Naming convention)

2015-04-28 Thread John Myles White
I think it's time for this thread to stop. People are already upset and 
things could easily get much worse.

Let's all go back to our core work: writing packages, building 
infrastructure and improving Base Julia's functionality. We can discuss 
naming conventions when we've got the functionality in place that Julia is 
sorely lacking right now.

 -- John

On Tuesday, April 28, 2015 at 4:47:51 PM UTC-7, François Fayard wrote:

 Ian. I am really sorry if I hurt people. I really respect what has been 
 done with Julia. I kind of like when people push me in the corner because 
 it helps me build better tools. That's why I might act this way, and I am 
 sorry if it hurts people. 

  I've expressed my ideas, which I would like to summarize: 
 - I think consistency in naming is really important for big languages like 
 Julia (as opposed to languages such as C) 
 - I wanted to follow a style guide, and the one I've found is not 
 respected at all. It's a fact. When I find LinAlg, sprandn and so many 
 other names whereas the style guide says clearly that one should avoid 
 abbreviation, I just don't get it. If a style guide is not enforced, it is 
 useless because it does not pass the reality test. 



[julia-users] Re: Confused about parametrization type

2015-03-05 Thread John Myles White
Here's my perspective:

* You should almost never define any type that has fields that aren't 
concrete types. You can achieve this in two ways: by hard-coding concrete 
types or by using parametric types.

* You should generally avoid allocating new arrays. This means that, if 
your type ultimately needs to have an Array{Float64} stored in a field, 
then it shouldn't even accept other kinds of arrays since those would need 
to be converted with an allocation along the way.

* You should try to make code general, but you should also make it correct 
first. This often encourages specializing on Float64 during a first pass 
since that's close to the lingua franca type for numeric computing.

 -- John
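The first two points combine naturally via a type parameter: every field is concrete for each instantiation, without hard-coding Float64. A sketch in post-1.0 syntax (the thread predates `struct`):

```julia
# Fields are concrete for any particular T, yet the type works with
# Float32, Float64, BigFloat, ... -- no abstract fields, no hard-coding.
struct Model{T<:AbstractFloat}
    weights::Matrix{T}
    bias::Vector{T}
end

m64 = Model(rand(2, 2), rand(2))                      # Model{Float64}
m32 = Model(rand(Float32, 2, 2), rand(Float32, 2))    # Model{Float32}
```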

On Thursday, March 5, 2015 at 8:55:40 AM UTC-8, Benjamin Deonovic wrote:

 Moving a post from julia issues to here since it is more appropriate: 
 https://github.com/JuliaLang/julia/issues/10408

 If I am making a function or composite type that involves floating point 
 numbers, should I enforce those numbers to be Float64 or FloatingPoint? I 
 thought 
 it should be FloatingPoint so that the function/type will work with any 
 kind of floating point number. However, several julia packages enforce 
 Float64 (e.g. Distributions package Multinomial distribution) and so I run 
 into problems and have to put in a lot of converts in my code to Float64. Am 
 I doing this wrong? I'm quite new to julia


 I don't have any intention to use non-Float64 floating-point numbers, I'm 
 just trying to write good code. I saw a lot of examples where people 
 recommended to to use Integer rather than Int64 or String rather than 
 ASCIIString, etc. I'm just trying to be consistent. I'm fine just using 
 Float64 if that is the appropriate approach here.



[julia-users] Re: Issues with optimize in Julia

2015-02-15 Thread John Myles White
Here's my two, not very thorough, cents:

(1) The odds of a bug in Optim.jl are very high (90%).
(2) The odds of a bug in your code are very high (90%).

It's pretty easy to make a decision about (2). Deciding on (1) is a lot 
harder, since you need a specific optimization that Optim should solve, but 
fails to solve.

For resolving (2), you have a couple of sub-problems:

(a) Is your gradient analytically correct? You can check this by comparing 
it with finite differencing. If it doesn't produce a close match, be 
suspicious.
(b) Is your log likelihood + gradient numerically correct? Your stress test 
is, in theory, an attempt to test this. But numerical instability implies 
that the problem only occurs when the problem is likely to be numerically 
unstable. So you'd want to measure the correlation between the difficulty 
of the problem and the probability of failure.
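Point (a) can be automated with central differences. A hedged sketch, assuming the `g!(w, storage)` calling convention used in the question (the helper name and tolerances are made up for illustration):

```julia
# Compare an analytic gradient g!(w, storage) against central differences.
function gradient_matches(f, g!, w; h=1e-6, tol=1e-4)
    analytic = similar(w)
    g!(w, analytic)
    for i in eachindex(w)
        wp = copy(w); wp[i] += h
        wm = copy(w); wm[i] -= h
        fd = (f(wp) - f(wm)) / (2h)   # central-difference estimate
        abs(fd - analytic[i]) > tol && return false
    end
    return true
end

# Example: f(w) = sum(w.^2) has gradient 2w.
f(w) = sum(abs2, w)
g!(w, storage) = (storage .= 2 .* w; nothing)
gradient_matches(f, g!, [1.0, -2.0, 3.0])     # true

gbad!(w, storage) = (storage .= w; nothing)   # deliberately wrong gradient
gradient_matches(f, gbad!, [1.0, -2.0, 3.0])  # false
```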

My experience is that the Optim error messages don't make it easy to 
realize when you've made a mistake in your gradients. This is being worked 
on at the moment, but I think someone would need to dedicate a week to 
working on this to get us to a point where the error messages are always 
clear.

 -- John

On Sunday, February 15, 2015 at 3:29:35 PM UTC-8, Ryan Carey wrote:

 Hi all,

 I've just discovered Julia this last month, and have been greatly enjoying 
 using it, especially because of its matlab-like linear algebra notation and 
 all-round concise and intuitive syntax.

 I've been playing with its optimisation functions, looking to implement 
 gradient descent for logistic regression but I hit a couple of stumbling 
 blocks, and was wondering how you've managed these.

 Using Optim, I implemented regularized logistic regression with l_bfgs, 
 and although it worked some times, when I stress-tested it with some k-fold 
 validation, I got Linesearch errors.

 I've got a dataset that's about 600 x 100 (m x n) with weights w and 
 classes y.

 my code:
  function J(w)
      m, n = size(X)
      return sum(-y'*log(logistic(X*w)) - (1-y')*log(1-logistic(X*w))) +
          reg/(2m) * sum(w.^2) # note normalizing bias weight
  end

  function g!(w, storage)
      storage[:] = X' * (logistic(X*w) - y) + reg / m * w
  end

 out = optimize(J, g!, w, method = :l_bfgs,show_trace=true)


 the error:

 Iter     Function value   Gradient norm
 ...
 19       -9.034225e+02    2.092807e+02
 20       -9.034225e+02    2.092807e+02
 21       -9.034225e+02    2.092807e+02
 22       -9.034225e+02    2.092807e+02
 23       -9.034225e+02    2.092807e+02

 Linesearch failed to converge
 while loading In[6], in expression starting on line 2

  in hz_linesearch! at 
 /home/ryan/.julia/v0.3/Optim/src/linesearch/hz_linesearch.jl:374
  in hz_linesearch! at 
 /home/ryan/.julia/v0.3/Optim/src/linesearch/hz_linesearch.jl:188
  in l_bfgs at /home/ryan/.julia/v0.3/Optim/src/l_bfgs.jl:165
  in optimize at /home/ryan/.julia/v0.3/Optim/src/optimize.jl:340


 Perhaps I should override its convergence criteria? Or there's a bug in my 
 code? Anyway, I thought I might have more luck with conjugate gradient 
 descent, so I included types.jl and cg.jl from the Optim package, and tried 
 to make it work too, defining a DifferentiableFunction type


 function rosenbrock(g, x::Vector)
     d1 = 1.0 - x[1]
     d2 = x[2] - x[1]^2
     if !(g === nothing)
         g[1] = -2.0*d1 - 400.0*d2*x[1]
         g[2] = 200.0*d2
     end
     val = d1^2 + 100.0 * d2^2
     return val
 end


 function rosenbrock_gradient!(x::Vector, storage::Vector)
     storage[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
     storage[2] = 200.0 * (x[2] - x[1]^2)
 end

  

  cg(rosenbrock,[0,0])


 d2 = DifferentiableFunction(rosenbrock,rosenbrock_gradient!)

 cg(d2,[0,0])

 ERROR: InexactError()


 I tried a few variations on the function 'cg' before coming here for help. 
 I notice that there are a couple of other optimization packages out there 
 but this one is by JMW and looks good.

 Obviously, if I just wanted to perform linear regression, I could just use 
 a built-in function, but to use more complex models, I would need to be 
 able to do gradient descent.

 How have others fared with Optim? Any thoughts on what's going wrong? 
 General tips for how to make gradient descent work with Julia?
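One likely culprit for the `ERROR: InexactError()` above — an educated guess on my part, not something confirmed in the thread — is the integer starting point `[0,0]`: in-place solvers store fractional iterates into the starting vector, and an `Int` array cannot hold them. A minimal reproduction of the failure mode:

```julia
# Storing a fractional value into a Vector{Int} throws InexactError,
# which is what an in-place optimizer does to an integer starting point.
x = [0, 0]                 # Vector{Int}
ok = try
    x[1] = 0.5             # cannot convert 0.5 to Int
    true
catch
    false
end
# ok == false

x0 = [0.0, 0.0]            # a Float64 starting point avoids the problem
x0[1] = 0.5                # fine
```

So trying `cg(d2, [0.0, 0.0])` is worth doing before digging deeper.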





[julia-users] Re: ANN: StreamStats.jl

2015-02-09 Thread John Myles White
I have seen t-digest. I'll try to implement it after getting q-digest 
working and then try comparing them.

 -- John

On Monday, February 9, 2015 at 4:48:42 AM UTC-8, Stephen Lien wrote:

 Have you seen t-digest https://github.com/tdunning/t-digest ? Per the 
 link: it handles doubles, uses less memory, and is more accurate. There is 
 Java code and a paper at the link.

 StreamStats.jl is a great addition to Julia!

 - Stephen



[julia-users] ANN: StreamStats.jl

2015-02-07 Thread John Myles White
I've been doing a lot of streaming data analysis in Julia lately, so I 
finally put together a package with some core functionality for working 
with data streams:

https://github.com/johnmyleswhite/StreamStats.jl

My hope is that the community can help refine the design over time. After 
it's settled down, I'll move the package over to JuliaStats so that we have 
a canonical place for storing streaming algorithms.

I'll be adding approximate quantiles via q-digest and some basic regression 
modeling functions via SGD in the next few days.
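For flavor, here is a minimal one-pass running mean/variance in the streaming spirit — a generic Welford-style sketch of my own, not the StreamStats API:

```julia
# State is a (count, mean, M2) tuple; M2 accumulates squared deviations.
function update(state, x)
    n, mu, m2 = state
    n += 1
    delta = x - mu
    mu += delta / n
    m2 += delta * (x - mu)    # Welford's update: numerically stable in one pass
    return (n, mu, m2)
end

function demo()
    state = (0, 0.0, 0.0)
    for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
        state = update(state, x)
    end
    return state
end

s = demo()
# s[2] is the running mean (5.0); s[3] / s[1] is the population variance (4.0)
```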

 -- John



Re: [julia-users] compilation problem on OS X for v0.4

2014-12-26 Thread John Myles White
You need to install cmake and make it available on your path: 
http://www.cmake.org/download/

 -- John

On Dec 26, 2014, at 2:07 PM, Ethan Anderes ethanande...@gmail.com wrote:

 Hi Everyone:
 
 I just decided to upgrade to Julia v0.4 and ran into the following error when 
 trying to compile from source (I’m running Yosemite 10.10.1). Anyone else run 
 into this problem and know how to fix it? —Cheers
 
 == 
 All 3 tests passed 
 == 
 Making check in cxx 
 == 
 All 0 tests passed 
 == 
 Making check in mpn 
 Making check in mpz 
 Making check in mpq 
 Making check in mpf 
 Making check in printf 
 Making check in scanf 
 Making check in rand 
 Making check in cxx 
 Making check in demos 
 Making check in calc 
 Making check in expr 
 Making check in tune 
 Making check in doc 
 /bin/sh: cmake: command not found 
 make[2]: *** [libgit2-0.21.3/build/Makefile] Error 127 
 make[1]: *** [julia-release] Error 2 
 make: *** [release] Error 2
 ​



Re: [julia-users] compilation problem on OS X for v0.4

2014-12-26 Thread John Myles White
No worries. Adding a cmake dependency is a big change for Julia.

 -- John

On Dec 26, 2014, at 2:13 PM, Ethan Anderes ethanande...@gmail.com wrote:

 Thanks John. I guess that was obvious from the error message. Sorry for the 
 noise.
 
 --Ethan
 
 On Friday, December 26, 2014 11:09:04 AM UTC-8, John Myles White wrote:
 You need to install cmake and make it available on your path: 
 http://www.cmake.org/download/
 
  -- John
 
 On Dec 26, 2014, at 2:07 PM, Ethan Anderes ethana...@gmail.com wrote:
 
 Hi Everyone:
 
 I just decided to upgrade to Julia v0.4 and ran into the following error 
 when trying to compile from source (I’m running Yosemite 10.10.1). Anyone 
 else run into this problem and know how to fix it? —Cheers
 
 == 
 All 3 tests passed 
 == 
 Making check in cxx 
 == 
 All 0 tests passed 
 == 
 Making check in mpn 
 Making check in mpz 
 Making check in mpq 
 Making check in mpf 
 Making check in printf 
 Making check in scanf 
 Making check in rand 
 Making check in cxx 
 Making check in demos 
 Making check in calc 
 Making check in expr 
 Making check in tune 
 Making check in doc 
 /bin/sh: cmake: command not found 
 make[2]: *** [libgit2-0.21.3/build/Makefile] Error 127 
 make[1]: *** [julia-release] Error 2 
 make: *** [release] Error 2
 ​
 



Re: [julia-users] Julia modifies variable outside of function when operating on different variable inside of function

2014-12-26 Thread John Myles White
This is aliasing. Almost all languages allow this.

 -- John

Sent from my iPhone
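Concretely: in the quoted code below, `inner_var = arg` binds a second name to the same array rather than copying it. If the function should leave its argument untouched, copy explicitly — a sketch of the fix:

```julia
data = [1, 2, 3]

function square(arg)
    inner_var = copy(arg)            # a new array, not another name for arg
    for i in 1:length(inner_var)
        inner_var[i] = inner_var[i]^2
    end
    return inner_var
end

output = square(data)
# output == [1, 4, 9] and data is still [1, 2, 3]
```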

 On Dec 26, 2014, at 2:49 PM, Bradley Setzler bradley.setz...@gmail.com 
 wrote:
 
 Hi,
 
 I cannot explain this behavior. I apply a function to a variable in the 
 workspace, the function initializes its local variable at the workspace 
 variable, then modifies the local variable and produces the desired output. 
 However, it turns out the Julia modifies both the local and workspace 
 variable with each operation on the local variable. Only the local variable 
 is supposed to be modified. 
 
 This is very dangerous behavior, as Julia is modifying the data itself 
 between performing operations on the data; the data itself is supposed to 
 remain fixed between operations on it.
 
 Minimal working example:
 
 data=[1,2,3]
 function square(arg)
 inner_var = arg
 for i=1:length(inner_var)
 inner_var[i] = inner_var[i]^2
 end
 return inner_var
 end
 output=square(data)
 
 julia> print(data)
 [1,4,9]
 
 The data has been squared due to the local variable, which was initialized at 
 the data values, being squared. Now, if i wish to apply a different function 
 to the data, the result will be incorrect because the data has been modified 
 unintentionally.
 
 How long has Julia been doing this? Was this behavior intentional?
 Bradley
 
 


Re: [julia-users] Re: list of unregistered (experimental) packages

2014-12-22 Thread John Myles White
It would be really easy to run a GitHub page that is literally a list of URLs 
for unofficial packages and which receives edits via GitHub pull requests.

 — John

 On Dec 22, 2014, at 12:03 PM, Stefan Karpinski ste...@karpinski.org wrote:
 
 To me the main benefit of having a list of non-packages would be that we 
 could have code there without having to go through the whole name 
 bikeshedding process that is traditional for registered Julia packages (and 
 which I think is very important, if imperfect).
 
 On Mon, Dec 22, 2014 at 11:44 AM, Tomas Lycken tomas.lyc...@gmail.com 
 mailto:tomas.lyc...@gmail.com wrote:
 I have always regarded the version tagging system as an indicator of package 
 ready-state, and not only as progress since the package was 
 conceived/first released. For example, I've registered the Interpolations.jl 
 package in Metadata, but I haven't tagged a version, so if I do Pkg.status() 
 it shows as version `0.0.0-` - to me, that works as an indicator that this 
 package isn't as ready as a package with a version of, say, 1.2.5, or even 
 0.3.1.
 
 It would probably be quite simple to add a filtering feature for package 
 versions on pkg.julialang.org http://pkg.julialang.org/ - nothing too 
 specific, of course, but one could for example choose to include all 
 packages, just packages with a tagged version, or even just packages version 
 1.0 or later. That would be a simple answer to most of the questions you 
 raise, albeit maybe not as specific as you might want. However, it would have 
 the benefit of curating itself.
 
 // T
 
 
 On Monday, December 22, 2014 5:23:43 PM UTC+1, Hans W Borchers wrote:
 There's a list (of such lists) at http://svaksha.github.io/Julia.jl/ 
 http://svaksha.github.io/Julia.jl/ .
 But you are right: something more complete and more up-to-date would be nice.
 I started an overview of Math packages with usage examples, but stopped
 when the Julia 0.4 version came about.
 
 
 On Monday, December 22, 2014 4:59:24 PM UTC+1, lapeyre@gmail.com  wrote:
 Does it make sense to have a list of unregistered packages ?  I'd like to 
 make my packages visible, for feedback or whatever, and also to see what 
 other packages are out there.
 
 Putting a new package that no one has used in the same list as a heavily 
 used/developed package doesn't seem right.
 My packages have interfaces that are too big, and need to be pruned/altered 
 after people use them. Still, it would be nice
 to be able to install them easily, so maybe a separate metadata repo, or a 
 tag 'experimental' would work. (It would not make sense to register them in 
 another list and then still call them 'unregistered')  I guess Julia will 
 have to deal with something like this sooner or later.
 
 github says there are about 2000 Julia repos. Surely not all are meant to be 
 packages. I have a Swap.jl repo on github just so I can install it easily 
 myself.  But I wonder how many of the 2000 are useable packages?
 
 This must have been discussed already somewhere, but I can't find it.
 
 



Re: [julia-users] Julia compatibility between versions

2014-12-20 Thread John Myles White
Julia is a _lot_ less mature than Fortran or Python. Between Julia 0.3 and 
Julia 1.0 I am pretty sure there will be changes as substantial as the Python 2 
to Python 3 changes. We've tried to provide deprecation periods for changes, 
but there are going to be more changes than in Fortran.

 -- John

On Dec 20, 2014, at 7:22 AM, Sergio Rojas srpyp...@gmail.com wrote:

 
 Hello,
 
  Starting to explore Julia, I am wondering about
 its compatibility between versions for long term
 projects. For instance, we can still compile very
 old  fortran codes without much pain, which in
 general does not go further than using a
 compiling option. This compatibility helps in
 developing via Fortran big projects that will last for years
 without worrying on wasting time unnecessarily on rewriting
 already tested piece of code.
 
  On the contrary, running Python code from previous 
  versions on a new setup (say, running Python 2 scripts 
  on Python 3) could be a pain. One needs to spend a lot of time 
  on it. 
  
  Is Julia evolving via incompatibility between versions as 
  Python is doing?
 
 Sergio
 



Re: [julia-users] any clarification of bytes allocated in @time?

2014-12-19 Thread John Myles White
Are you timing things in the global scope?

 -- John

On Dec 19, 2014, at 1:00 PM, John Drummond john...@gmail.com wrote:

 For the following code (julia 0.3.3 in windows 7 ) I don't understand what 
 the bytes allocated in @time means
 
 All I'm doing each time is adding ten 8-byte integers
 
 Thanks for any thoughts
 
 julia> @time c = [1,2]
 elapsed time: 3.32e-6 seconds (144 bytes allocated)
 2-element Array{Int64,1}:
  1
  2
 
 julia> sizeof(c)
 16
 
 
 julia> @time for x in 30:39 push!(c, x) end
 elapsed time: 2.717e-6 seconds (256 bytes allocated)
 
 julia> @time for x in 30:39 push!(c, x) end
 elapsed time: 2.717e-6 seconds (288 bytes allocated)
 
 julia> @time for x in 30:39 push!(c, x) end
 elapsed time: 2.112e-6 seconds (0 bytes allocated)
 
 julia> @time for x in 30:39 push!(c, x) end
 elapsed time: 3.321e-6 seconds (640 bytes allocated)
 
 julia> sizeof(c)
 336



Re: [julia-users] any clarification of bytes allocated in @time?

2014-12-19 Thread John Myles White
The problem is that your let block is not a proper function body. You need to 
time things inside of a function body:

julia> function foo()
           @time a1 = zeros(Int64,1000)
           @time resize!(a1, 1000)
           @time resize!(a1, 1000)
           @time resize!(a1, 1000)
           @time resize!(a1, 2000)
       end
foo (generic function with 1 method)

julia> foo()
elapsed time: 0.039935728 seconds (8048 bytes allocated)
elapsed time: 8.61e-6 seconds (0 bytes allocated)
elapsed time: 4.45e-7 seconds (0 bytes allocated)
elapsed time: 2.71e-7 seconds (0 bytes allocated)
elapsed time: 0.000365734 seconds (16000 bytes allocated)
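A side note of mine, not from the original reply: when the eventual length is known, reserving capacity up front avoids the occasional reallocate-and-copy behind `push!` (the function is spelled `sizehint` in 0.3-era Julia and `sizehint!` in later versions):

```julia
c = Int[]
sizehint!(c, 1000)      # reserve capacity for 1000 elements up front
for x in 1:1000
    push!(c, x)         # fills the reserved space; no doubling reallocations
end
```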

 -- John

On Dec 19, 2014, at 1:34 PM, John Drummond john...@gmail.com wrote:

 Sorry to be stupid - but this also helps me understand things for another 
 question:
 So bytes allocated would be the underlying usage of memory.
 
 in which case with
 julia> let
            @time a1 = zeros(Int64,1000)
            @time resize!(a1, 1000)
            @time resize!(a1, 1000)
            @time resize!(a1, 1000)
            @time resize!(a1, 2000)
        end
 elapsed time: 0.015554722 seconds (8152 bytes allocated)
 elapsed time: 5.735e-6 seconds (80 bytes allocated)
 elapsed time: 2.113e-6 seconds (80 bytes allocated)
 elapsed time: 2.113e-6 seconds (80 bytes allocated)
 elapsed time: 0.027380439 seconds (16080 bytes allocated)
 2000-element Array{Int64,1}:
 
 what's the 80 bytes allocated when I'm just changing the length I'm using?
 
 And will increasing the size beyond the maximum always copy the whole array?
 
 Many thanks for clarifying this.
 
 
 
 
 
 On Friday, December 19, 2014 6:12:42 PM UTC, Tim Holy wrote:
 Julia's arrays grow by doubling, see 
 http://en.wikipedia.org/wiki/Dynamic_array 
 
 Since you're appending elements to an array, julia has to have somewhere to 
 put them---and when there's no spare capacity, julia has to allocate a new 
 array and copy the entire thing. So some allocations are much bigger than 
 what 
 you're adding, but others (as you can see) are 0. 
 
 This is a marked improvement over Matlab, which makes a copy of your entire 
 array each time you add 1 element. 
 
 Best, 
 --Tim 
 
 On Friday, December 19, 2014 10:00:28 AM John Drummond wrote: 
  For the following code (julia 0.3.3 in windows 7 ) I don't understand what 
  the bytes allocated in @time means 
  
  All I'm doing each time is adding 10  8 byte integers 
  
  Thanks for any thoughts 
  
   julia> @time c = [1,2] 
   elapsed time: 3.32e-6 seconds (144 bytes allocated) 
   2-element Array{Int64,1}: 
    1 
    2 
   
   julia> sizeof(c) 
   16 
   
   
   julia> @time for x in 30:39 push!(c, x) end 
   elapsed time: 2.717e-6 seconds (256 bytes allocated) 
   
   julia> @time for x in 30:39 push!(c, x) end 
   elapsed time: 2.717e-6 seconds (288 bytes allocated) 
   
   julia> @time for x in 30:39 push!(c, x) end 
   elapsed time: 2.112e-6 seconds (0 bytes allocated) 
   
   julia> @time for x in 30:39 push!(c, x) end 
   elapsed time: 3.321e-6 seconds (640 bytes allocated) 
   
   julia> sizeof(c) 
   336 
 



Re: [julia-users] Re: Global variables

2014-12-18 Thread John Myles White
This should get you started: http://c2.com/cgi/wiki?GlobalVariablesAreBad

 -- John
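To make the linked advice concrete for the questions below, a small sketch (the names are illustrative, not from Greg's code):

```julia
scale = 2.0                # plain global: its type could change at any time
const SCALE = 2.0          # const global: type and value fixed, callers can specialize

f_global(x) = x * scale    # the compiler must check scale's type on every call
f_const(x)  = x * SCALE    # SCALE is known to be Float64, so this specializes fully
f_arg(x, s) = x * s        # passing parameters as arguments avoids globals entirely
```

In practice `f_const` and `f_arg` compile to equally tight code, while `f_global` pays a dynamic lookup on each call — consistent with the 3-4x speedup Greg reports below.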

On Dec 18, 2014, at 7:36 PM, Greg Plowman greg.plow...@gmail.com wrote:

 I realise my original question should have been more specific and not digress 
 about implementing fast globals.
  
 Please bear with me as I'm relatively new to programming let alone Julia 
 (from my little experience I quite like Julia)
 Any guidelines on best practice would be appreciated, but I'll also ask some 
 specific questions below.
  
 I guess conceptually I have a lot of global parameters that usually don't 
 change much.
 I also have tried to define short, one-purpose functions (which I believe is 
 encouraged?)
 These functions are called in many places and most have specific arguments 
 but also access the global parameters
   
 My current implementation uses Julia global variables for these parameters, which 
 has 2 benefits for me:
 Easy to setup and leads to simpler code.
 Global parameters are accessible from REPL (or in my case Juno LT), which 
 allows interactive work.
 However using global variables probably leads to poor performance. And 
 probably poor program design?
 I'm quite happy to not use globals, but I'm not sure about the best way to go 
 about this.
  
 In any case, my next incarnation was to define a composite structure 
 [GlobalParameters] and instantiate as a global variable [g = 
 GlobalParameters() # constructors initialises all fields].
 Within functions, I access global parameters from this global composite type 
 variable [g.field] - g is not passed as argument to function, but accessed as 
 global 
 This improved performance.
  
 However, I wondered whether there was any real type stability improvement, 
 since variable is still global.
 Then I declared the global variable as const. [const g = GlobalParameters()]
 This appeared to lead to a further speedup.
  
 Q1. Are globals slow solely because type cannot be determined or guaranted? 
 or is it also because globals are slower for other reasons as well?
 Q2. Can you verify that declaring global composite type as const should 
 theoretically lead to better performance?
 Q3. Would there be further benefit from passing composite type as argument to 
 all the functions? 
  
 If anyone wants to provide general tips or best practice (general programming 
 or Julia specific) I would welcome that as well.
  
 Thanks!
 Greg  
  
  
  
  
 On Tuesday, December 16, 2014 8:40:50 AM UTC+11, Greg Plowman wrote:
 Hi,
  
 I understand using global variables can lead to poor performance.
 This is because types cannot be guaranteed? Globals could be reassigned with 
 different types?
  
 For my purpose, I set up a lot of parameters as global variables.
 Then define lots of functions that have specific arguments but also use some 
 subset of the globals as well.
  
 I guess the response is going to be: do not to use globals. I get that.
 However, to me it seems sometimes a natural and easy way to think about 
 program a solution.
  
 What are the alternatives:
 Live with poor performance
 Create composite type containing the global variables, and pass around a 
 reference to single global variable of this composite type
 Would this boost performance?
 Would it boost performance if I didn't pass the composite-type variable as 
 argument, but instead access it inside functions as a global variable?
 What else?
  
  
 To compare performance, in some functions I assigned global variables to 
 local variables annotated with type. Then used local variable in function. 
 This produced a considerable speed-up, between 3-4x faster.
   
 function foo(arg1::arg1type, ...)
 l_var1::var1type = g_var1
 l_var2::var1type = g_var2
 ...
  
 x = l_var1 * ...
 end
  
  
  
 Can I do something like the following with equivalent speedup?
  
 function foo(arg1::arg1type, ...)
 g_var1::var1type
 g_var2::var2type
 ...
  
 x = g_var1 * ...
 end
  
 This would be sort of like declaring global variables as arguments to 
 function but with types.
 Compiler could optimise. Runtime error if type not correct.
  
  
  
 As an aside, it occurred to me that there might be 2 cases for global 
 variables.
 globals used in interactive REPL environment, where global can be reassigned
 globals used for bad or lazy or whatever programming
 Why can't we have static type globals? Not const but const type.
 So effectively, we have 3 levels of const/variable-ness
   
 global x = 5
 x = 6 # OK
 x = 9.1   # OK
  
 static x = 5
 x = 6 # OK
 x = 9.1   # ERROR, must release/clear/reset x first
  
 const x = 5
 x = 6 # ERROR
 x = 9.1   # ERROR
   
  
  
  
 Cheers, Greg 



Re: [julia-users] How do I turn a string into a variable

2014-12-17 Thread John Myles White
It's easy to write a macro that takes a static literal string and makes a 
variable out of it.

It's much harder (maybe impossible) to write a macro that takes in a variable 
that happens to be bound to a string value and to make a variable out of the 
value you happen to have stored in that variable.

Another way to put it: macros don't exist inside of the world of values -- they 
only live in the world of syntax.

Could you get a similar (and cleaner) effect by populating a Dict instead?

 -- John
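Both halves of the answer in a sketch — the literal-name macro is the easy case; the Dict handles names only known at run time (as with keys read from an HDF5 file):

```julia
# Easy case: the name is a literal, so the macro sees it as a symbol at parse time.
macro makevar(name, val)
    esc(:($name = $val))      # expands @makevar(life, 42) into `life = 42`
end

@makevar(life, 42)
# life == 42

# Runtime case: store values under string keys instead of inventing variables.
vars = Dict{String,Any}()
vars["life"] = 42
```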

On Dec 17, 2014, at 5:37 AM, Zeta Convex zeta.con...@gmail.com wrote:

 I want to be able to write:
 @makevar(life, 42)
 which will expand to
life = 42
 How do I do this?
 
 Why do I want to do it? Because it would be cool to have a feature like in 
 Octave where I could load an HDF5 file, and it automatically sets the 
 variables from the file.



Re: [julia-users] Re: overview questions about new doc changes (coming with v 0.4)

2014-12-17 Thread John Myles White
This debate seems a little premature to me since the definition of @doc is not 
totally finished yet and we need to finalize that before anyone should be 
adding documentation to 0.3 packages.

 -- John

On Dec 17, 2014, at 3:15 PM, Seth catch...@bromberger.com wrote:

 +1. Please reconsider making a @doc (at least a NOP) for 0.3.x - this way we 
 can start writing repl-printable docstrings that will be useful in 0.4 but 
 not have our code break in earlier versions.
 
 On Tuesday, December 16, 2014 4:50:56 PM UTC-8, ele...@gmail.com wrote:
 So if otherwise unchanged code is documented with @doc (which it will be, who 
 doesn't want it to show in the repl :) then it won't compile on 0.3?
 
 If it won't compile it makes maintaining backward compatibility harder, and 
 its hard enough between 0.4 and 0.3 already.
 
 On Wednesday, December 17, 2014 9:04:53 AM UTC+10, Mike Innes wrote:
 It is needed if you want the docs to show up in the repl etc. It's just that 
 the plain string won't break anything (it won't do anything, either, for now).
 
 On 16 December 2014 at 22:58, ele...@gmail.com wrote:
 
 
 On Wednesday, December 17, 2014 8:41:00 AM UTC+10, Mike Innes wrote:
 It's not really that worthwhile since (a) you can use Docile and (b) the 
 future syntax
 
 
 foo
 
 foo() ...
 
 is backwards-compatible already. I just use that.
 
 Oh, ok, I thought an @doc macro was needed in 0.4 
 https://github.com/JuliaLang/julia/blob/d0a951ccb3a7ebae7909665f4445a019f2ee54a1/base/basedocs.jl.
 
 Cheers
 Lex
  
 
 On 16 December 2014 at 22:37, ele...@gmail.com wrote:
 Since the @doc is 0.4, is it possible to backport a do nothing version that 
 will allow documented code to still compile in 0.3?
 
 Cheers
 Lex
 
 On Wednesday, December 17, 2014 8:04:06 AM UTC+10, Mike Innes wrote:
 Actually the @doc macro will still interpret plain strings as markdown by 
 default. There are some caveats with escaping that make it good practice to 
 write doc anyway, but those will go away once the parser changes are 
 implemented.
 
 I'm in the process of writing documentation documentation, so the manual 
 should be up to date reasonably soon.
 
 On 16 December 2014 at 21:55, Ivar Nesje iva...@gmail.com wrote:
  Hi,
 
 Hello.
 
  Looks like exciting doc changes are afoot with Julia! I'd like to get some 
  more understanding of what's coming. Had a look at some of the github 
  issues tagged doc, but I'm still missing some basics (note, I'm still 
  quite new to Julia). Questions:
 
   * Is code from Docile.jl, Lexicon.jl, and Markdown.jl being used / 
 incorporated into Julia proper?
 
 Yes.
 
   * Will the new syntax be `doc"..."`, `@doc "..." ->`, or something else?
  
  The `->` is probably going away, but final syntax is not yet set in stone (nor 
  in code).
 
   * What is `md"Some *text* here"`? Will Julia support and/or require that for 
  the new docstrings? If so, what is the benefit of `md"this"` over `"this"`?
  
  The benefit is that `md"this"` has an explicit format, so that we can have 
  more formats in the future. The value has been discussed and you can have 
  different formats by other means. I like the way it makes markdown optional, 
  but others want to save two characters to type.
 
   * Regarding the docs currently at 
 http://docs.julialang.org/en/release-0.3/, does all of that content 
 currently come only from the contents of julia/doc and below?
 
 Yes
 
   * Will the docstrings in 0.4 be online at, say, 
 http://docs.julialang.org/en/release-0.4/ , integrated with the rendered .rst 
 docs? Or are they intended to be strictly available via the repl? Hm... to 
 avoid duplication, are any files in julia/doc slated to be diced up, 
 reformatted into markdown, and inserted into source as docstrings?
 
 Maybe, but it's hard to predict the future. Many files in Base are too long 
 already, and detailed docs will not make them shorter. For huge codebases, I 
 think it makes sense to fit as much code as possible on a screen, and search 
 in separate docs if I need to know more about a function.
 
 Thanks,
 -- John



Re: [julia-users] Package development - best practices

2014-12-17 Thread John Myles White
I also develop in .julia, but it's possible to use any directory as your 
package directory. The manual should have some sections that describe how to 
configure an alternative to .julia.

 -- John

On Dec 17, 2014, at 8:15 PM, Keno Fischer kfisc...@college.harvard.edu wrote:

 Personally, I do develop my packages inside .julia. 
 If I need to sync across machines, I'll just use git, which I should be doing 
 more anyway (admittedly this can get annoying when developing on two machines 
 at the same time, in which case I tend to add the remote julia instance as a 
 worker and ship code that way).
 
 On Wed, Dec 17, 2014 at 5:21 PM, Seth catch...@bromberger.com wrote:
 I'm wondering whether folks are actually basing their repos in ~/.julia and 
 doing their development and commits from there, or whether there's some other 
 process that allows development to happen in a more standard location than a 
 hidden directory off of ~ while still allowing use of Pkg. I'm probably 
 overlooking something trivial.
 
 What's a recommended setup/process for creating packages using Pkg to manage 
 them?



Re: [julia-users] Re: How can I sort Dict efficiently?

2014-12-16 Thread John Myles White
If you want to retain N words, perhaps a priority queue would be useful?

http://julia.readthedocs.org/en/latest/stdlib/collections/#priorityqueue

I'd be cautious about drawing many coding lessons from the TextAnalysis 
package, which has never been optimized for performance.
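A dependency-free sketch of the size-capped idea (a real priority queue, as linked above, does the eviction in O(log N); this linear scan is fine for small N):

```julia
# Keep the N most frequent words without sorting the whole dictionary.
function topn(counts, N)
    best = Any[]                        # holds at most N (word, count) tuples
    for (word, c) in counts
        if length(best) < N
            push!(best, (word, c))
        else
            imin = 1                    # locate the weakest retained entry
            for i in 2:length(best)
                if best[i][2] < best[imin][2]
                    imin = i
                end
            end
            if c > best[imin][2]
                best[imin] = (word, c)  # evict it in favor of the newcomer
            end
        end
    end
    return best
end
```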

 -- John

On Dec 16, 2014, at 3:30 AM, Michiaki ARIGA che...@gmail.com wrote:

 Thanks for Pontus's kind explanation. He answered what I want to know.
 I want to know the standard way to create dictionary (which is a set of words 
 for ASR or NLP).
 
 To create dictionary for speech recognition or something NLP, we often 
 control size of vocabulary. There are two ways to limit size of vocabulary, 
 one is to cut words under a threshold frequency, as Pontus showed, and the other is 
 to pick the top N most frequent words (n-gram toolkits such as IRSTLM support this 
 situation, and it is a popular way to control the necessary memory size). If I want 
 to pick frequent words, I think I'll use DataFrame.
 
 On Tue Dec 16 2014 at 15:31:00 Todd Leo sliznmail...@gmail.com wrote:
 Could you provide any clue to guide me locate the issue? I'm willing to make 
 a PR but I am unable to find the related issue.
 
 
 On Tuesday, December 16, 2014 3:38:11 AM UTC+8, Stefan Karpinski wrote:
 There is not, but if I recall, there may be an open issue about this 
 functionality.
 
 On Sun, Dec 14, 2014 at 10:15 PM, Todd Leo sliznm...@gmail.com wrote:
 Is there a partial sort equivalent to sortperm! ? Supposingly selectperm! ?
 
 On Monday, December 8, 2014 8:21:33 PM UTC+8, Stefan Karpinski wrote:
 We have a select function as part of Base, which can do O(n) selection of the 
 top n:
 
 julia> v = randn(10^7);
 
 julia> let w = copy(v); @time sort!(w)[1:1000]; end;
 elapsed time: 0.882989281 seconds (8168 bytes allocated)
 
 julia> let w = copy(v); @time select!(w,1:1000); end;
 elapsed time: 0.054981192 seconds (8192 bytes allocated)
 
 So for large arrays, this is substantially faster.
 
 On Mon, Dec 8, 2014 at 3:50 AM, Jeff Waller trut...@gmail.com wrote:
 This can be done in O(N). Avoid sorting, as it will be O(N log N).
 
 Here's one of many Q on how 
 http://stackoverflow.com/questions/7272534/finding-the-first-n-largest-elements-in-an-array
 
 



Re: [julia-users] Value types

2014-12-16 Thread John Myles White
The trouble is that values aren't usually known at compile time, which is why 
dispatching on them doesn't work well.

When values of immutables can be known at compile time, something like your 
proposal does work. For example, you can do this:

immutable Sentinel{S}
end

foo(::Type{Sentinel{:Case1}}) = println(1)
foo(::Type{Sentinel{:Case2}}) = println(2)

foo(Sentinel{:Case1})
foo(Sentinel{:Case2})

 -- John

On Dec 16, 2014, at 10:10 AM, Luca Antiga luca.ant...@orobix.com wrote:

 Dear Julia-dev, 
  first of all, a big hat off for Julia. I've started experimenting with it 
 and I'm very very impressed.
 
 While I was redesigning some code I had that I was porting to Julia, I kept 
 reaching for dispatch based on a (constant) value of an argument, like
 
 foo(:a, arg1, arg2) - do something
 
 foo(:b, arg1, arg2) - do something else
 
 This is something one could achieve with Haskell's data constructors or 
 Clojure's multimethods and it's convenient. 
 
 I'm aware of PatternDispatch.jl, but maybe just allowing a type to be 
 parameterized with a particular value, like Type{:a} (or ValueType{:a}, for 
 instance), could fit well in Julia's design of the type system and would not 
 really have any overhead.
 
 This would allow to write things like:
 
 function foo(s::Type{:a}, arg1, arg2)
   do_something()
 end
 
 function foo(s::Type{:b}, arg1, arg2)
   do_something_else()
 end
 
 Of course it's totally possible I'm not getting Julia's design right, so I 
 apologize in advance.
 
 Thanks for such a great language, keep kicking!
 
 
 Luca
 



Re: [julia-users] Value types

2014-12-16 Thread John Myles White
I should probably point out that the type parameters being immutable is not 
quite right: the restriction is to bits types, which you can check using 
the isbits() function on a type. You can find more information here: 
https://github.com/JuliaLang/julia/issues/6081

 -- John

On Dec 16, 2014, at 12:55 PM, Luca Antiga luca.ant...@orobix.com wrote:

 Hi John,
 It didn't occur to me to try with immutable, it makes perfect sense now.
 Thanks a lot for the hint and the explanation
 
 Luca



Re: [julia-users] Value types

2014-12-16 Thread John Myles White
I think we're actually talking about different things. You're thinking about 
type Sentinel vs. immutable Sentinel, but I was talking about Sentinel{:a} vs. 
Sentinel{[1, 2, 3]}.

Sorry for not being more clear.

 -- John

On Dec 16, 2014, at 6:29 PM, Luca Antiga luca.ant...@orobix.com wrote:

 Ok, I see. So I checked and in fact it doesn't have to be immutable, only bit 
 type.
 
 I confirm that the following
 
 type Sentinel{S}
 end
 
 foo(::Type{Sentinel{:Case1}}) = println(1)
 foo(::Type{Sentinel{:Case2}}) = println(2)
 
 foo(Sentinel{:Case1}) 
 foo(Sentinel{:Case2}) 
 
 works as expected.
 
 Thanks again
 
 Luca
 
 On Tuesday, December 16, 2014 6:57:58 PM UTC+1, John Myles White wrote:
 I should probably point out that the type parameters being immutable is not 
 quite right: the restriction is any bits types, which you can assess using 
 the isbits() function on a type's name. You can find more information here: 
 https://github.com/JuliaLang/julia/issues/6081 
 
  -- John 
 
 On Dec 16, 2014, at 12:55 PM, Luca Antiga luca@orobix.com wrote: 
 
  Hi John, 
  It didn't occur to me to try with immutable, it makes perfect sense now. 
  Thanks a lot for the hint and the explanation 
  
  Luca 
 



Re: [julia-users] how to test NaN in an array?

2014-12-15 Thread John Myles White
You need to check isnan() per element. NaN == NaN is false, so in() fails on 
NaN right now.

 -- John

On Dec 15, 2014, at 3:33 PM, Evan Pu evanthebou...@gmail.com wrote:

 1 in [1,2,3] # returns true
 
 NaN in [NaN, 1.0, 2.0] # returns false
 
 how do I test if a float64 NaN is present in an array? I'm doing some 
 numerical computation and it can have some NaN error, I want to drop the 
 arrays that has NaN.
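Per-element isnan() is the reliable check. A short sketch in current Julia (the higher-order any(isnan, v) form postdates this thread, but it expresses exactly that per-element test):

```julia
v = [NaN, 1.0, 2.0]

NaN in v        # false: in() compares with ==, and NaN == NaN is false
any(isnan, v)   # true: applies isnan element by element

# dropping the arrays that contain any NaN:
arrays = [[1.0, 2.0], [NaN, 3.0]]
clean = filter(a -> !any(isnan, a), arrays)   # keeps only [[1.0, 2.0]]
```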



Re: [julia-users] Suspending Garbage Collection for Performance...good idea or bad idea?

2014-12-15 Thread John Myles White
This is taking the thread off-topic, but conceptually such things are possible. 
But Rust has a very different set of semantics for memory ownership than Julia 
has and is doing a lot more analysis at compile-time than Julia is doing. So 
Julia would need to change a lot to be more like Rust. I've come to really 
adore Rust, so I'd like to see us borrow some ideas, but my personal sense is 
that Julia and Rust simply serve different niches and shouldn't really move 
towards one another too much lest each language wind up forsaking what makes it 
useful.

 -- John

On Dec 15, 2014, at 8:43 PM, Eric Forgy eric.fo...@gmail.com wrote:

 Hi,
 
 I'm new to Julia and mentioned it to a friend who is more into systems than 
 mathematical models and he mentioned his current crush is Rust, which is 
 also built on LLVM. I may have totally missed the point, but IF I understand, 
 Rust does away with garbage collection by borrow blocking at compile time. 
 The question popped into my head whether we could turn off GC in Julia and 
 check for problems at compile time. A google later, brought me to this 
 thread. Is that a totally misguided idea?
 
 Best regards,
 Eric
 
 PS: You can tell I'm coming in with almost no background knowledge about 
 compilers (or real languages for that matter), but am having fun learning. 
 LLVM was developed at my alma mater (PhD in ECE - Computational 
 Electromagnetics - from UIUC 2002). Go Illini! :)
 
 On Friday, February 22, 2013 7:11:32 PM UTC+8, Tim Holy wrote:
 Have you played with SProfile in the Profile package? It's rather good at 
 highlighting which lines, in your code and in base/, trigger the gc. Note 
 that 
 in my experience the gc does not seem to be triggered necessarily on big 
 allocations; for example, even allocating an array as 
Array(Int, (3,5)) 
 rather than 
   Array(Int, 3, 5) 
 can trigger the gc (I see lots of gc() calls coming from our Lapack code for 
 this reason). 
 
 Because I don't really know how the gc works, I'm not certain that kind of 
 thing actually reflects a problem; perhaps it was just going to have to call 
 gc 
 on the next heap-allocation event, and (3,5) just happened to be the lucky 
 candidate. But there's an open issue about this: 
 https://github.com/JuliaLang/julia/issues/1976 
 
 Docs are here: https://github.com/timholy/Profile.jl 
 I think they're slightly out of date, but only in very minor ways. 
 
 --Tim 
 
 
 
 On Thursday, February 21, 2013 03:17:59 PM nathan hodas wrote: 
  Here's the code that benefits from @nogc: 
  
  Notice the iteration over a Dict and a Set. iscorrect() checks a field of 
  the Attempt type. I can tell by running this particular that the garbage 
  collection is running during the for loop. 
  function meantime(userdata::Dict{Int,Set{Attempt}}) 
  usertimes = Dict{Int,Float64}() 
  sizehint(usertimes,length(userdata)) 
  for (uid,attempts) in collect(userdata) 
  s = 0.0; 
  c = 0.0; 
  ic = 0.0; 
  for a in attempts 
  ic = iscorrect(a) 
  s += (a.tend - a.tstart)*ic; 
  c += ic; 
  end 
  usertimes[uid] = s/c; 
  end 
  usertimes 
  end 
  
  This code has no benefit from @nogc, regardless of the kernel function k1: 
  
  function dostuff(input1,input2) 
  output = similar(input1) 
  len = length(input1) 
  for i = 1:len 
  x = input1[i] 
  for j = 1:len 
  y = input2[j] 
  output[i] += k1(x,y) 
  end 
  end 
  output 
  end 
  
  On Thursday, February 21, 2013 10:37:09 AM UTC-8, Stefan Karpinski wrote: 
   Can you post some example code? Are you just iterating the Dict object 
   with a for loop? 
   
   
   On Thu, Feb 21, 2013 at 1:35 PM, Stefan Karpinski 
    ste...@karpinski.org wrote: 
   That's good information to have. I'll look into it. 
   
   
   On Thu, Feb 21, 2013 at 1:13 PM, nathan hodas 
    nho...@gmail.com wrote: 
   It's true that @nogc is not a panacea.  For my particular function, it 
   produces a robust 20x speed up, even after subsequent collection.  For 
   other seemingly similar functions, it has no effect at all. I don't use 
   any 
   temporary arrays *that I'm aware of*, but it seems the iterators of a 
   Dict 
   are doing something in the background. 
   
   On Wednesday, February 20, 2013 3:44:24 PM UTC-8, Tim Holy wrote: 
   The other thing you should check is whether you're allocating more 
   than 
   you 
   need to. I find that I can often reuse bits of memory, and that can 
   dramatically decrease the need for gc. In the long run that may help 
   you a 
   _lot_ more than temporarily disabling gc, because at some point you'll 
   need to 
   turn it on again. 
   
   There are examples of this kind of thing scattered all over the Julia 
   code 
   tree. Just because it was rather easy for me to find :-), here's the 
   patch I 
   pushed to Zip 

Re: [julia-users] Test approximate equality of Float16 arrays

2014-12-14 Thread John Myles White
That seems like a bug. Running something like,

x = rand(Float16, 10, 10)
y = rand(Float16, 10, 10)
all(abs(x - y) .> eps(max(maximum(x), maximum(y))))

gives me true more than 80% of the time.

 -- John

On Dec 14, 2014, at 3:26 AM, Steve Cordwell steve.cordw...@gmail.com wrote:

 Hi,
 
 I have been playing around with Float16 arrays and I haven't been able to 
 figure out what is going on in this case, and whether it is expected 
 behaviour. According to the Wikipedia article half precision floating point 
 numbers are used for data storage and not for computations.
 
 I was surprised to find that 
 
  using Base.Test
  @test_approx_eq rand(Float16, 10,10) rand(Float16, 10, 10)
 
 gave no error, or even
 
  @test_approx_eq 1000rand(Float16, 10,10) 1000rand(Float16, 10, 10)
 
 gave no errors. Using the testing modules method of working out the tolerance 
 for comparison of the arrays gives
 
 array_eps(a) = eps(float(maximum(x -> (isfinite(x) ? abs(x) : oftype(x, NaN)), a)))
  va=rand(10,10,3); vb=rand(Float16, 10,10,3); 
  1E4*length(va)*max(array_eps(va), array_eps(vb))
 1464.84375
 
 So a tolerance of over 1000. Is this to be expected with Float16, or not?
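For a meaningful comparison of Float16 arrays, an explicit elementwise tolerance avoids the huge array-wide tolerance computed above. A hedged sketch in current Julia syntax (the helper name approx_eq is ours):

```julia
# Float16 spacing at 1.0 is eps(Float16(1.0)) ≈ 0.000977, so a sensible
# elementwise tolerance is a small multiple of that
approx_eq(a, b; tol = 4 * eps(Float16(1.0))) = all(abs.(a .- b) .<= tol)

x = Float16.([0.1, 0.5, 0.9])
approx_eq(x, x .+ Float16(0.0005))   # true: differences are below tolerance
approx_eq(x, x .+ Float16(0.5))      # false: differences are far above it
```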



Re: [julia-users] ISLR (Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani) Examples in Julia

2014-12-14 Thread John Myles White
This would be a great first project for someone interested in learning Julia.

FWIW, the RDatasets.jl repo doesn't have anything to do with ISLR -- except 
insofar as ISLR decided to use common R datasets.

 -- John

On Dec 14, 2014, at 10:36 AM, webuser1...@gmail.com wrote:

 I'm going through ISLR and find the book very useful. I see that someone has 
 loaded the data from the book:
 
 https://github.com/johnmyleswhite/RDatasets.jl
 
 Someone has also taken the chapters in R and implemented in numpy:
 
 https://github.com/TomAugspurger/StatLearning/tree/master/python
 
 The book is great, and I would love to see the examples implemented in 
 Julia...
 
  



[julia-users] Re: [WIP] CSVReaders.jl

2014-12-14 Thread John Myles White
For those following along, I've done a lot of work on this library over the 
past week. I still have another week of work left to do before making an 
official release, but the performance is much better (about 25% slower than 
readtable(), but using about 1/3 as much memory) and the tests have been 
elaborated enough that I'm comfortable saying that most non-pathological files 
will be read correctly.

 -- John

On Dec 8, 2014, at 12:35 AM, John Myles White johnmyleswh...@gmail.com wrote:

 Over the last month or so, I've been slowly working on a new library that 
 defines an abstract toolkit for writing CSV parsers. The goal is to provide 
 an abstract interface that users can implement in order to provide functions 
 for reading data into their preferred data structures from CSV files. In 
 principle, this approach should allow us to unify the code behind Base's 
 readcsv and DataFrames's readtable functions.
 
 The library is still very much a work-in-progress, but I wanted to let others 
 see what I've done so that I can start getting feedback on the design.
 
 Because the library makes heavy use of Nullables, you can only try out the 
 library on Julia 0.4. If you're interested, it's available at 
 https://github.com/johnmyleswhite/CSVReaders.jl
 
 For now, I've intentionally given very sparse documentation to discourage 
 people from seriously using the library before it's officially released. But 
 there are some examples in the README that should make clear how the library 
 is intended to be used.
 
  -- John
 



Re: [julia-users] What is the use of J[:,:] = K*M please?

2014-12-14 Thread John Myles White
Assigning in-place and creating temporaries are actually totally orthogonal.

One is concerned with mutating J. This is contrasted with writing,

J = K * M

The other is concerned with the way that K * M gets computed before any 
assignment operation or mutation can occur. This is contrasted with something 
like A_mul_B!, which writes the product into preallocated storage.

 -- John

Sent from my iPhone

 On Dec 14, 2014, at 7:48 PM, Petr Krysl krysl.p...@gmail.com wrote:
 
 Hello everybody,
 
 I hope someone knows this:  What is the use of writing
 
 J[:,:] = K*M
 
 where all of these quantities are matrices? I thought I'd seen somewhere that 
 it was assigning to the matrix in-place  instead of creating a temporary.   
 Is that so?
 I couldn't find it in the documentation   for 0.3.
 
 Thanks,
 
 Petr
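To make the two behaviors concrete, here is a hedged sketch (current Julia, where the temporary-free in-place multiply is spelled mul! rather than A_mul_B!):

```julia
using LinearAlgebra

K = [1.0 0.0; 0.0 2.0]
M = [1.0 2.0; 3.0 4.0]

J = zeros(2, 2)
Jref = J            # a second binding to the same array

J[:, :] = K * M     # K*M still builds a temporary, but the result is
                    # copied into the existing array; Jref sees the change
J = K * M           # plain assignment rebinds the name J to a brand-new
                    # array; Jref still points at the old storage

mul!(Jref, K, M)    # writes the product directly into Jref, no temporary
```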


Re: [julia-users] Announcing RobotOS.jl

2014-12-12 Thread John Myles White
Very cool. Glad you've been enjoying Julia so much, Josh.

 -- John

On Dec 12, 2014, at 2:06 PM, Josh Langsfeld jdla...@gmail.com wrote:

 After about 6 weeks of initial part-time development, I'm announcing the 
 first release of the RobotOS.jl package, which enables essentially seamless 
 integration of Julia code with ROS (Robot Operating System). At the core, it 
 is mostly a wrapper for the rospy python library (many thanks to Steve 
 Johnson for PyCall), but on top of that I added an automatic Julia type 
 generation system so the back-end python details are completely hidden from 
 the end-user.
 
 https://github.com/phobon/RobotOS.jl
 
 I believe the robotics community is especially well suited to adopt Julia, as 
 speed and natural mathematical expressiveness are both highly desirable 
 features in whichever programming language is used. For about six months now, 
 I've been doing all my research in Julia and it has been vastly more 
 pleasurable than using either Matlab or Python.
 
 I am quite eager to continue development on the package with whatever 
 community feedback I can get but hopefully it can already prove useful to 
 anyone out there who has already thought of using the two systems together.
 
 Thanks,
 
 Josh Langsfeld
 Graduate Research Assistant
 Maryland Robotics Center - UMD



Re: [julia-users] Roadmap

2014-12-11 Thread John Myles White
This is a very good point. I'd label this as something like "core unsolved 
challenges". Julia #265 (https://github.com/JuliaLang/julia/issues/265) comes 
to mind.

In general, a list of the big issues would be much easier to maintain than a 
list of goals for the future. We could just use a tag like "core" on the issue 
tracker.

 -- John

On Dec 11, 2014, at 4:49 AM, Mike Innes mike.j.in...@gmail.com wrote:

 It seems to me that a lot of FAQs could be answered by a simple list of the 
 communities'/core developers' priorities. For example:
 
 We care about module load times and static compilation, so that's going to 
 happen eventually. We care about package documentation, which is basically 
 done. We don't care as much about deterministic memory management or TCO, so 
 neither of those things are happening any time soon.
 
 It doesn't have to be a commitment to releases or dates, or even be 
 particularly detailed, to give a good sense of where Julia is headed from a 
 user perspective. 
 
 Indeed, it's only the same things you end up posting on HN every time someone 
 complains that Gadfly is slow.
 
 On 11 December 2014 at 03:01, Tim Holy tim.h...@gmail.com wrote:
 Really nice summaries, John and Tony.
 
 On Thursday, December 11, 2014 02:08:54 AM Boylan, Ross wrote:
  BTW, is 0.4 still in a "you don't want to go there" state for users of
  julia?
 
 In short, yes---for most users I'd personally recommend sticking with 0.3.
 Unless you simply _must_ have some of its lovely new features. But be prepared
 to update your code basically every week or so to deal with changes.
 
 --Tim
 
 



Re: [julia-users] Changes to array are not visible from caller

2014-12-11 Thread John Myles White
Nope.

You'll find Julia much easier to program in if you always replace x += y with x 
= x + y before attempting to reason about performance. In this case, you'll
get

x[:, :] = x[:, :] + 1.0f-5 * dxdt

In other words, you literally make a copy of the entire matrix x before doing 
any useful work.

 -- John

On Dec 11, 2014, at 12:21 PM, Mark Stock mark.j.st...@gmail.com wrote:

 The line now reads
 
 x[:,:] += 1.0f-5*dxdt
 
 And the result is now correct, but memory usage increased. Shouldn't it go 
 down if we're re-assigning to the same variable?
 
 On Wednesday, December 10, 2014 11:57:28 PM UTC-7, Isaiah wrote:
 `x += ...` is equivalent to writing `x = x + ...` which rebinds the variable 
 within that function. Instead, do an explicit array assignment `x[:,:] = ...`
 
 This is discussed in the manual with a warning about type changes, but the 
 implication for arrays should probably be made clear as well: 
 http://julia.readthedocs.org/en/latest/manual/mathematical-operations/#updating-operators
 
 (there are some ongoing discussions about in-place array operators to improve 
 the situation)
 
 On Wed, Dec 10, 2014 at 7:44 PM, Mark Stock mark.j...@gmail.com wrote:
 Hello, n00b Julia user here. I have two functions that change the values of a 
 passed-in array. One works (dxFromX), but for the other one (eulerStep) the 
 caller does not see any changes to the array. Why is this?
 
 function dxFromX!(x,dxdt)
   fill!(dxdt,0.0)
 
   for itarg = 1:size(x,1)
 for isrc = 1:size(x,1)
   dx = x[isrc,1] - x[itarg,1]
   dy = x[isrc,2] - x[itarg,2]
   coeff = 1.0f0 / (dx^2 + dy^2 + 0.1f0)
   dxdt[itarg,1] -= coeff * dy
   dxdt[itarg,2] += coeff * dx
 end
   end
 end
 
 function eulerStep!(x)
   dxdt = zeros(x)
  print("\ndxdt before\n", dxdt[1:5,:], "\n")
   dxFromX!(x,dxdt)
  print("\ndxdt after has changed\n", dxdt[1:5,:], "\n")
   x += 1.0f-5*dxdt
  print("\nx inside\n", x[1:5,:], "\n")
 end
 
 x = float32(rand(1024,2))
 print("\nx before\n", x[1:5,:], "\n")
 @time eulerStep!(x)
 print("\nx after is unchanged!\n", x[1:5,:], "\n")
 
 I see the same behavior on 0.3.3 and 0.4.0, both release and debug binaries, 
 on OSX and Linux. 
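For reference, the fix sketched in current Julia syntax (the broadcasted .+= form postdates 0.3 but makes the in-place intent explicit; the name euler_step! is ours):

```julia
function euler_step!(x, dxdt)
    # .+= updates the elements of x in place, so the caller's array changes;
    # a bare x += ... would only rebind the local variable x
    x .+= 1.0f-5 .* dxdt     # 1.0f-5 is the Float32 literal 1.0e-5
    return x
end

x = ones(Float32, 4, 2)
euler_step!(x, ones(Float32, 4, 2))
x[1, 1]   # ≈ 1.00001, visible outside the function
```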
 



Re: [julia-users] Is there a function that performs (1:length(v))[v] for v a Vector{Bool} or a BitArray?

2014-12-11 Thread John Myles White
Does find() work?

 -- John

On Dec 11, 2014, at 4:19 PM, Douglas Bates dmba...@gmail.com wrote:

 I realize it would be a one-liner to write one but, in the interests of not 
 reinventing the wheel, I wanted to ask if I had missed a function that does 
 this.  In R there is such a function called which but that name is already 
 taken in Julia for something else.
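find() does exactly this in 0.3-era Julia; it was later renamed findall. A quick sketch in current syntax:

```julia
v = [true, false, true, true]
findall(v)          # [1, 3, 4] -- the indices where v is true
# equivalent to the one-liner from the question:
(1:length(v))[v]    # also [1, 3, 4], via logical indexing
```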



Re: [julia-users] LLVM3.2 and JULIA BUILD PROBLEM

2014-12-11 Thread John Myles White
My understanding is that different versions of LLVM are enormously different 
and that there's no safe way to make Julia work with any version of LLVM other 
than the intended one.

 -- John

On Dec 11, 2014, at 4:56 PM, Vehbi Eşref Bayraktar 
vehbi.esref.bayrak...@gmail.com wrote:

 Hi;
 
 I am using llvm 3.2 with libnvvm . However when i try to build julia using 
 those 2 flags :
 USE_SYSTEM_LLVM = 1
 USE_LLVM_SHLIB = 1
 
 I have a bunch of errors. starting as following:
 
 codegen.cpp: In function ‘void jl_init_codegen()’:
 codegen.cpp:4886:26: error: ‘getProcessTriple’ is not a member of ‘llvm::sys’
  Triple TheTriple(sys::getProcessTriple()); // llvm32 doesn't have 
 this one instead it has getDefaultTargetTriple()
   ^
 codegen.cpp:4919:5: error: ‘mbuilder’ was not declared in this scope
  mbuilder = new MDBuilder(getGlobalContext());  //  include 
 llvm/MDBuilder.h would fix this
  ^
 codegen.cpp:4919:20: error: expected type-specifier before ‘MDBuilder’
  mbuilder = new MDBuilder(getGlobalContext());
 
 Even you fix these errors, you keep hitting the following ones:
 In file included from codegen.cpp:976:0:
 intrinsics.cpp: In function ‘llvm::Value* emit_intrinsic(JL_I::intrinsic, 
 jl_value_t**, size_t, jl_codectx_t*)’:
 intrinsics.cpp:1158:72: error: ‘ceil’ is not a member of ‘llvm::Intrinsic’
  return builder.CreateCall(Intrinsic::getDeclaration(jl_Module, 
 Intrinsic::ceil,
 
 
 
 So is the master branch currently supporting llvm32? Or is there a patch 
 somewhere?
 
 Thanks



Re: [julia-users] Define composite types in a different file - constructor not defined error

2014-12-11 Thread John Myles White
You want include, not require.

 -- John

On Dec 11, 2014, at 7:25 PM, Test This curiousle...@gmail.com wrote:

 
 I have two files: dataTypes.jl and paramcombos.jl
 
 In dataTypes.jl I have 
 
 type Params
  .
  . // field names and types
  .
 end
 
 
 In paramcombos.jl I have 
 
 module paramcombos
 
 require(dataTypes.jl)
 
 function baseParams()
params = Params( field1 = blah1, field2 = blah2, ...)
 end
 
 end
  
 
 In the julia repl if I do 
 
 require(paramcombos.jl)
 
 and then, 
 
 basep = paramcombos.baseParams()
 
 I get an error saying:
 
 ERROR: Params not defined
  in baseParams at /Users/code/paramcombos.jl:33 (where 33 is the line shown 
 above from baseParams() function. 
 
 If I move type declaration to paramcombos.jl, things work fine. Is there a 
 way to keep type definitions in one file and use the constructor in another 
 file?
 
 Thank you
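The working layout, sketched as a single runnable file: include() textually evaluates a file inside the enclosing module, so the type and its users share one namespace. Here include_string stands in for include("dataTypes.jl") so the sketch is self-contained (field values are placeholders from the question):

```julia
module ParamCombos

# stand-in for include("dataTypes.jl"): the definitions are evaluated
# inside *this* module, so Params is visible to baseParams below
Base.include_string(@__MODULE__, """
struct Params
    field1
    field2
end
""")

baseParams() = Params("blah1", "blah2")

end

ParamCombos.baseParams()   # constructs a Params inside the module
```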



Re: [julia-users] Define composite types in a different file - constructor not defined error

2014-12-11 Thread John Myles White
http://julia.readthedocs.org/en/release-0.3/manual/modules/

 -- John

On Dec 11, 2014, at 8:55 PM, Test This curiousle...@gmail.com wrote:

 Thank you, John. That worked!
 
 Could you please direct me to a reference which explains when one should use 
 include/require/import/using?
 
 Thank you.
 
 On Thursday, December 11, 2014 7:39:23 PM UTC-5, John Myles White wrote:
 You want include, not require.
 
  -- John
 
 On Dec 11, 2014, at 7:25 PM, Test This curiou...@gmail.com wrote:
 
 
 I have two files: dataTypes.jl and paramcombos.jl
 
 In dataTypes.jl I have 
 
 type Params
  .
  . // field names and types
  .
 end
 
 
 In paramcombos.jl I have 
 
 module paramcombos
 
 require(dataTypes.jl)
 
 function baseParams()
params = Params( field1 = blah1, field2 = blah2, ...)
 end
 
 end
  
 
 In the julia repl if I do 
 
 require(paramcombos.jl)
 
 and then, 
 
 basep = paramcombos.baseParams()
 
 I get an error saying:
 
 ERROR: Params not defined
  in baseParams at /Users/code/paramcombos.jl:33 (where 33 is the line shown 
 above from baseParams() function. 
 
 If I move type declaration to paramcombos.jl, things work fine. Is there a 
 way to keep type definitions in one file and use the constructor in another 
 file?
 
 Thank you
 



Re: [julia-users] Weird timing issue

2014-12-11 Thread John Myles White
Are you creating a bunch of garbage? My understanding is that any garbage that 
gets created will be cleaned up at seemingly haphazard (but fully 
deterministic) points in times.

 -- John

On Dec 11, 2014, at 11:43 PM, Sean McBane seanmc...@gmail.com wrote:

 Hi all,
 
 So, I'm starting to define a couple of routines to be used in iterative 
 solvers, optimized for sparse matrices. I wrote a routine that returns me a 
 list of tuples containing the (i,j) coordinates for the non-zero values from 
 a sparse matrix and was testing it for timing, but observed a performance 
 issue I don't understand.
 
 Testing using a 100x100 sparse matrix from a sample finite difference 
 problem, containing about 500 non-zero values, @time shows a time of ~20s 
 the first time I load the module and execute the function, and about 3-5% of 
 that is gc. I'm sure I can speed that up later, but the really odd thing is 
 that the NEXT run, it takes ~40s and about 50% of that is garbage collection. 
 The next one after that is back to 20s, and it keeps going back and forth 
 every time. This seems really weird to me.
 
 Anyway, to confirm that I wasn't going crazy I wrote a loop and timed this a 
 hundred times and collected results, and it keeps following the same pattern. 
 See attached 'times.txt' with the numbers. Any ideas what could be causing 
 this behavior?
 
 Thanks,
 
 -- Sean



Re: [julia-users] Weird timing issue

2014-12-11 Thread John Myles White
Well, you're clearly allocating memory and discarding it since the list 
comprehension and sort both allocate memory. So that's garbage that the GC has 
to deal with. The GC means that your function's timing will be erratic and, 
generally, longer on a second pass than during the first.

 -- John

On Dec 11, 2014, at 11:54 PM, Sean McBane seanmc...@gmail.com wrote:

 function getIJValues(A::SparseMatrixCSC)
 m,n = size(A)
 rowcoords = rowvals(A)
 coordinates = []
 for j = 1:n
 append!(coordinates, [(rowcoords[i],j) for i in nzrange(A,j)])
 end
 return sort(coordinates)
 end
 
 
 I've never formally studied any computer science or programming so I don't 
 have a great grasp on what goes on underneath this, but it seems to me like 
 the only thing the garbage collector should need to do would be free the 
 memory taken up by the list comprehension inside the loop at the end of the 
 loop. And perhaps this could be redone in a more efficient manner, but 
 inlining it like that seemed most natural. But what's strange is that called 
 twice in a row, with exactly the same input, it takes twice as long the 
 second time as the first.
 
 Otherwise, I certainly wouldn't be surprised if this method is inherently 
 inefficient, since I only picked up the language yesterday.
 
 -- Sean
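A reworking with a concretely typed accumulator and no per-column temporary would cut most of that garbage. A hedged sketch in current Julia syntax (the name getijvalues is ours; SparseArrays is a stdlib in 1.x):

```julia
using SparseArrays

function getijvalues(A::SparseMatrixCSC)
    rows = rowvals(A)
    coords = Vector{Tuple{Int,Int}}()   # concrete element type, not Any
    sizehint!(coords, nnz(A))           # one allocation up front
    for j in 1:size(A, 2)
        for i in nzrange(A, j)
            push!(coords, (rows[i], j)) # no per-column comprehension
        end
    end
    return sort!(coords)                # sort in place, no extra copy
end

A = sparse([1, 3, 2], [1, 1, 2], [1.0, 2.0, 3.0])
getijvalues(A)   # [(1, 1), (2, 2), (3, 1)]
```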



Re: [julia-users] Weird timing issue

2014-12-11 Thread John Myles White
This is just how the GC works. Someone who's done more work on the GC can give 
you more context about why the GC runs for the length of time it runs for at 
each specific moment that it starts going.

As a favor to me, can you please make sure that you quote the entire e-mail 
thread you're responding to? I find responding to e-mails without context to be 
pretty jarring.

 -- John

On Dec 12, 2014, at 12:04 AM, Sean McBane seanmc...@gmail.com wrote:

 Right, I know I'm allocating it and discarding memory. However, if the GC 
 cleans up at deterministic points in time, as you point out in your first 
 reply, why is timing erratic? And why the regular pattern in timing? It's 
 always faster one call, slower one call, faster one call, slower one call...
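For completeness, the old @nogc / gc_disable machinery corresponds to GC.enable in current Julia. A hedged sketch of suspending collection around a hot section (reducing allocation remains the durable fix; the function name is ours):

```julia
function sum_with_gc_paused(xs)
    GC.enable(false)          # suspend collection for the hot section
    s = 0.0
    try
        for x in xs
            s += x
        end
    finally
        GC.enable(true)       # always re-enable, even if an error is thrown
    end
    return s
end

sum_with_gc_paused([1.0, 2.0, 3.0])   # 6.0
```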



Re: [julia-users] Re: Aren't loops supposed to be faster?

2014-12-11 Thread John Myles White
Petr,

You should be able to do something like the following:

function foo(n::Integer)
if iseven(n)
return 1.0
else
return 1
end
end

function bar1()
x = foo(1)
return x
end

function bar2()
x = foo(1)::Int
return x
end

julia> code_typed(bar1, ())
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[], 
Any[Any[:x],Any[Any[:x,Union(Float64,Int64),18]],Any[]], :(begin  # none, line 
2:
x = foo(1)::Union(Float64,Int64) # line 3:
return x::Union(Float64,Int64)
end::Union(Float64,Int64)

julia> code_typed(bar2, ())
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[], Any[Any[:x],Any[Any[:x,Int64,18]],Any[]], :(begin  # 
none, line 2:
x = (top(typeassert))(foo(1)::Union(Float64,Int64),Int)::Int64 # line 3:
return x::Int64
end::Int64

In the output of code_typed, note how the strategically placed type declaration 
at the point at which a type-unstable function is called resolves the type 
inference problem completely when you move downstream from the point of 
ambiguity.

 -- John

On Dec 12, 2014, at 12:20 AM, Petr Krysl krysl.p...@gmail.com wrote:

 John,
 
 I hear you.   I agree with you  that type instability is not very helpful,  
 and indicates  problems with program design. 
 However, I believe that  provided the program cannot resolve  the types 
 properly  (as it couldn't in the original design of my program, because I 
 haven't provided  declarations  of  variables where they were getting used,  
 only in their data structure  types that the compiler apparently couldn't 
 see), the optimization  for loop performance  cannot be successful. It 
 certainly wasn't successful in this case.
 
 How would you solve  the problem with  storing a function and at the same 
 time allowing the compiler to deduce what  values it returns?  In my case I 
 store a function that always returns a floating-point array.   However, it 
 may return a constant value supplied as input to the constructor,  or it may 
 return the value provided by another function (that the  user of the type 
 supplied).
 
 So, the type  of the return value is  stable, but I haven't found a way of 
 informing the compiler that it is so.
 
 Petr
 
 
 
 
 On Thursday, December 11, 2014 8:20:20 PM UTC-8, John Myles White wrote:
  The moral of this story is: If you can't or  won't  declare every single 
  variable, don't do loops. They are likely to be a losing proposition. 
 
 I don't think this lesson will serve most people well. It doesn't reflect my 
 experiences using Julia at all. 
 
 My experience is that code that requires variable type declarations usually 
 suffers from a deeper problem that the variable declarations suppress without 
 solving: either (a) there's some insoluble source of ambiguity in the program 
 (as occurs when calling a function that's passed around as a value and 
 therefore not amenable to static analysis) or (b) there's some subtle source 
 of type instability, as happens sometimes when mixing integers and floating 
 point numbers. 
 
  -- John 
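One way to answer Petr's question in current Julia is to parameterize the wrapper type on the stored function's own concrete type, optionally adding a typeassert on the call; a hedged sketch (the names Loader and load are ours):

```julia
struct Loader{F}    # F is the concrete type of the stored function,
    f::F            # so the compiler can specialize calls through it
end

# the typeassert documents and enforces the promised return type,
# so downstream code infers Vector{Float64} even for opaque callables
load(l::Loader, x) = l.f(x)::Vector{Float64}

l = Loader(x -> fill(float(x), 2))
load(l, 3)   # [3.0, 3.0], inferred as Vector{Float64}
```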
 



Re: [julia-users] Re: home page content

2014-12-10 Thread John Myles White
As always in Julia (and OSS in general), I think the problem is that there's no 
labor supply to do most nice things for the community. Everybody would love 
to see weekly updates. Not many people have both the time and desire to do the 
work.

 -- John

On Dec 10, 2014, at 10:41 AM, Tamas Papp tkp...@gmail.com wrote:

 
 On Wed, Dec 10 2014, Christian Peel sanpi...@gmail.com wrote:
 
 provide would be helpful. Also, I'd be happy for something like a weekly
 update; or a weekly blog post to help those who don't peruse this group in
 depth each day.
 
 there was
 
 http://thisweekinjulia.github.io/
 
 but it has not been updated since late October.
 
 best,
 
 Tamas



Re: [julia-users] Re: home page content

2014-12-10 Thread John Myles White
Stefan, I shared your moment of terror about the idea of posting plans 
(essentially all of which will be invalidated) to the home page.

Although it's a huge volume of e-mail, I do feel like people who want to keep up 
with new developments in Julia should try to subscribe to the issue tracker and 
watch decisions get made in real time. It's a large increase in workload to ask 
people to both do work on Julia and write up regular reports about the work.

 -- John

On Dec 10, 2014, at 1:48 PM, Stefan Karpinski ste...@karpinski.org wrote:

 I have to say the concept of putting plans up on the home page fills me with 
 dread. That means I have to update the home page while I'm planning things and 
 as that plan changes and then do the work and then document it. It's hard 
 enough to actually do the work.
 
 On Wed, Dec 10, 2014 at 4:44 PM, David Anthoff anth...@berkeley.edu wrote:
 +1 on that! Even vague plans that are subject to change would be great to 
 have.
 
  
 
 From: julia-users@googlegroups.com [mailto:julia-users@googlegroups.com] On 
 Behalf Of Christian Peel
 Sent: Wednesday, December 10, 2014 10:15 AM
 To: julia-users@googlegroups.com
 Subject: Re: [julia-users] Re: home page content
 
  
 
 One thing that I would very much appreciate is some kind of development 
 schedule.  For example
   - Some kind of general roadmap
   - a plan for when 0.4 and future releases will come
   - Any plans to switch to a regular schedule?  (yearly, six
 months, ...) 
   - What features remain before a 1.0 release?
   - When will following arrive?
  faster compilation
  pre-compiled modules
  Interactive debugging; line numbers for all errors
  Automatic reload on file modification.
  Solving P=NP
 
 I know that it's tough to make such a schedule, but anything that you can 
 provide would be helpful. Also, I'd be happy for something like a weekly 
 update; or a weekly blog post to help those who don't peruse this group in 
 depth each day.
 
 Thanks!
 
 Chris
 
 On Wednesday, December 10, 2014 5:41:35 AM UTC-8, Tamas Papp wrote:
 
 From the discussion, it looks like homepages for programming 
 languages (and related projects) serve two purposes: 
 
 A. provide resources for the existing users (links to mailing lists, 
 package directories, documentation, etc) 
 
 B. provide information for potential new users (showcasing features of 
 the language, links to tutorials). 
 
 Given that space on the very front page is constrained (in the soft 
 sense: no one wants pages that go on and on any more), I think that 
 deciding on a balance between A and B would be a good way to focus the 
 discussion. 
 
 Once we have decided that, we can shamelessly copy good practices. 
 
 For example, 
 
 1. the R website emphasizes content for existing users (in a non-flashy 
 way that I am OK with), with very little material for new users, 
 
 2. about 1/3 of the middle bar on 
 https://www.haskell.org/haskellwiki/Haskell is for new users 
 (explanations/tutorials/etc), another 1/3 is for existing users (specs, 
 libraries), and the final 1/3 is for both (forums, wiki, etc), 
 
 3. http://new-www.haskell.org/ mostly caters to potential new users 
 (see how great this language is), 
 
 4. the content of clojure.org is similarly for potential new users, 
 while the sidebar has links for existing users. 
 
 Best, 
 
 Tamas 
 
 On Wed, Dec 10 2014, Hans W Borchers hwbor...@gmail.com wrote: 
 
  Look at the R home page. R is one of the most popular languages, and esp. 
  so 
  for statistical and computational applications. A programming language does 
  not need bloated home pages. 
  
  I like the old Haskell home page much more than the new one. The new one 
  has 
  large, uninformative background pictures and not much information in a 
  small 
  and readable view. The HaskellWiki front page was much better in that. It 
  may 
  not even be decided which version will win. 
  
 [Clojure](http://clojure.org/) has a nice, simple and informative home 
  page, 
  while [Scala](http://www.scala-lang.org/) has overdone it like the new 
  Haskell. For other approaches see the [Nim](http://nimrod-lang.org/) - 
  formerly 'Nimrod' - and [Nemerle](http://nemerle.org/) home pages. 
  
  In the end I feel the condensed form of the Python home page will attract 
  more interest, for example with 'latest news' and 'upcoming events' on the 
   first page. This gives the impression of a lively and engaged community. 
  
  
  On Wednesday, December 10, 2014 11:23:37 AM UTC+1, Tim Holy wrote: 
  
  I like the Haskell one better than the Rust one. 
  
  --Tim 
  
 
 
 



Re: [julia-users] Roadmap

2014-12-10 Thread John Myles White
FWIW, my sense is that no one really knows what's going to happen between 0.4 
and 1.0. There are lots of projects that are seen as essential before 1.0, but 
many of those are tentatively on the 0.4 release targets (static compilation, 
array views, package documentation, etc.).

At JuliaCon, I realized that I was one of the longest standing users of Julia 
-- many people at JuliaCon had never tried Julia 0.1 and therefore don't 
remember how much the 0.2 release improved the language and redefined the way 
Julia code was written. I feel like 0.4 is going to be a similar release: a lot 
of the most egregious problems with the current version of Julia are going to 
be fixed. But once those problems are solved, it seems hard to believe that we 
won't start realizing that there are lots of parts of the language that could 
be cleaned up before 1.0. My sense is that Julia, like ggplot2, will start to 
be mature enough for almost all users well before 1.0 is released, but that 1.0 
will still thankfully have the freedom to make any changes that are necessary 
before something gets declared as the finished product.

 -- John

On Dec 10, 2014, at 4:45 PM, David Anthoff anth...@berkeley.edu wrote:

 I hear you, and I didn’t think much before sending my email. Couple of points:
  
 I totally agree this should certainly not be on the homepage. I also agree 
 that there is no need for a detailed schedule, deadlines or anything like 
 that. I think the only thing that would be immensely helpful at least for me 
 is just a very high level idea of what the core team is thinking about a 
 roadmap/timing. Do you expect a 1.0 more in 10 years, or more in 1 year? Do 
 you right now expect there to be a 0.5, 0.6, or many more releases before a 
 1.0? My gut guess is that the core team has an idea about those kinds of 
 questions, and it would be great if you could share that kind of stuff from 
 time to time. Maybe one idea here would be that the core team just sends out 
 a brief email after a major release what the current thinking is about the 
 next version and the road to 1.0? Such an email could be fuzzy and 
 non-committal if the plans are fuzzy, but that in itself would also be 
 valuable information for us users.
  
 I am following the issue tracker and am subscribed to the email lists, and I 
 don’t get any sense/picture about those kind of high level questions from 
 those sources.
  
 Cheers,
 David
  
 From: julia-users@googlegroups.com [mailto:julia-users@googlegroups.com] On 
 Behalf Of Tony Kelman
 Sent: Wednesday, December 10, 2014 4:31 PM
 To: julia-users@googlegroups.com
 Subject: Re: [julia-users] Re: home page content
  
 -1 on trying to put plans, schedule, roadmap on the website. This week in 
 Julia was a great contribution to the community but evidently took more 
 effort than Matt had time to keep up with.
  
 New features get developed as the PR's for them get worked on and finished. 
 You can subscribe to just the subset of issues/PR's for things you (along 
 with everyone else) are eagerly awaiting. Better yet, help with testing and 
 code review if you can.
  
 We have been doing a good job of monthly backport bugfix releases, we should 
 be able to continue doing that. But 0.4 is still unstable and has several 
 big-ticket items still open and being worked on (check the milestones on 
 github). It's too early to try to make time estimates, if people are 
 impatient and want a release sooner it's not going to be possible without 
 punting on a number of targeted features and pushing them back to 0.5 or 
 later.
 
 
 On Wednesday, December 10, 2014 1:58:52 PM UTC-8, Randy Zwitch wrote:
 I think it would please everyone if you moved to daily televised scrums.
  
 
 On Wednesday, December 10, 2014 4:53:50 PM UTC-5, John Myles White wrote:
 Stefan, I shared your moment of terror about the idea of posting plans 
 (essentially all of which will be invalidated) to the home page.
  
 Although it's huge volume of e-mail, I do feel like people who want to keep 
 up with new developments in Julia should try to subscribe to the issue 
 tracker and watch decisions get made in real time. It's a large increase in 
 workload to ask people to both do work on Julia and write up regular reports 
 about the work.
  
  -- John
  
 On Dec 10, 2014, at 1:48 PM, Stefan Karpinski ste...@karpinski.org wrote:
 
 
 I have to say the concept of putting plans up on the home page fills me with 
 dread. That means I have to update the home page while I'm planning things and 
 as that plan changes and then do the work and then document it. It's hard 
 enough to actually do the work.
  
 On Wed, Dec 10, 2014 at 4:44 PM, David Anthoff ant...@berkeley.edu wrote:
 +1 on that! Even vague plans that are subject to change would be great to 
 have.
  
 From: julia...@googlegroups.com [mailto:julia...@googlegroups.com] On Behalf 
 Of Christian Peel
 Sent: Wednesday, December 10, 2014 10:15 AM
 To: julia...@googlegroups.com
 Subject: Re: [julia

Re: [julia-users] Unexpected append! behavior

2014-12-10 Thread John Myles White
Hi Sean,

I'm really confused by the output you're showing.

  X = [1,2]; Y = [1,2];
  append!(X,Y)
 4-element Array{Int64,1}:
   1
   2
   3
   4

Do you really get this output? That seems like a huge bug if so. But I don't 
see that at all, which is what I'd expect.

 -- John
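
For reference, `append!` concatenates the second collection onto the first in 
place, so the expected result for the snippet above is `[1, 2, 1, 2]`. A quick 
check, written in current Julia syntax:

```julia
# append! mutates its first argument, concatenating the second onto the end.
X = [1, 2]
Y = [1, 2]
append!(X, Y)
# X is now [1, 2, 1, 2] -- not [1, 2, 3, 4]; Y is unchanged.
```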



Re: [julia-users] Re: home page content

2014-12-09 Thread John Myles White
+1 for emulating the Rust site

 -- John

On Dec 9, 2014, at 4:46 PM, Joey Huchette joehuche...@gmail.com wrote:

 I think the [Rust website](http://www.rust-lang.org/) is pretty fantastic, in 
 terms of both design and content. Having the code examples runnable and 
 editable (via JuliaBox) would be a killer feature, though I have no idea how 
 feasible that is.
 
 On Tuesday, December 9, 2014 6:54:33 PM UTC-5, Elliot Saba wrote:
 Perhaps not now, but as a long-term goal, having a live, editable widget of 
 code on the homepage is such an awesome draw-in, IMO.
 -E
 
 On Tue, Dec 9, 2014 at 3:43 PM, Leah Hanson astri...@gmail.com wrote:
 Seeing code examples of a type and a couple of functions that use it would 
 probably give a good idea of what the code looks like. The JuMP seems 
 exciting enough to highlight both as a package and a use of macros.
 
 I don't know if you want to encourage different styles, but seeing examples 
 of Python like, c like, and functional-ish ways of writing Julia would be a 
 way to show off the variety of things you can do.
 
 --Leah
 
 On Wed, Dec 10, 2014 at 8:50 AM Elliot Saba stati...@gmail.com wrote:
 We're having intermittent DNS issues.  http://julialang.org is now up for me 
 however, and I can dig it: (I couldn't, previously)
 
 $ dig julialang.org
 
 ; <<>> DiG 9.8.3-P1 <<>> julialang.org
 ;; global options: +cmd
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56740
 ;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 4, ADDITIONAL: 4
 
 ;; QUESTION SECTION:
 ;julialang.org. IN  A
 
 ;; ANSWER SECTION:
 julialang.org.  2202IN  CNAME   julialang.github.io.
 julialang.github.io.2202IN  CNAME   github.map.fastly.net.
 github.map.fastly.net.  15  IN  A   199.27.79.133
 ...
 -E
 
 
 
 On Tue, Dec 9, 2014 at 2:46 PM, ele...@gmail.com wrote:
 
 
 On Wednesday, December 10, 2014 8:23:26 AM UTC+10, Stefan Karpinski wrote:
 We're looking to redesign the JuliaLang.org home page and try to give it a 
 little more focus than it currently has. Which raises the question of what to 
 focus on. We could certainly have better code examples and maybe highlight 
 features of the language and its ecosystem better. What do people think we 
 should include?
 
 The whole site seems to be offline?  Is that because of this? 
 
 



Re: [julia-users] [WIP] CSVReaders.jl

2014-12-08 Thread John Myles White
Thanks, Tom. I wanted this to be my first package that uses the full 
functionality of the new documentation system.

 -- John

On Dec 8, 2014, at 5:08 AM, Tom Short tshort.rli...@gmail.com wrote:

 Exciting, John! Although your documentation may be very sparse, the code is 
 nicely documented.
 
 On Mon, Dec 8, 2014 at 12:35 AM, John Myles White johnmyleswh...@gmail.com 
 wrote:
 Over the last month or so, I've been slowly working on a new library that 
 defines an abstract toolkit for writing CSV parsers. The goal is to provide 
 an abstract interface that users can implement in order to provide functions 
 for reading data into their preferred data structures from CSV files. In 
 principle, this approach should allow us to unify the code behind Base's 
 readcsv and DataFrames's readtable functions.
 
 The library is still very much a work-in-progress, but I wanted to let others 
 see what I've done so that I can start getting feedback on the design.
 
 Because the library makes heavy use of Nullables, you can only try out the 
 library on Julia 0.4. If you're interested, it's available at 
 https://github.com/johnmyleswhite/CSVReaders.jl
 
 For now, I've intentionally given very sparse documentation to discourage 
 people from seriously using the library before it's officially released. But 
 there are some examples in the README that should make clear how the library 
 is intended to be used.
 
  -- John
 
 



Re: [julia-users] [WIP] CSVReaders.jl

2014-12-08 Thread John Myles White
I believe/hope the proposed solution will work for most cases, although there's 
still a bunch of performance work left to be done. I think the decoupling 
problem isn't as hard as it might seem since there are very clearly distinct 
stages in parsing a CSV file. But we'll find out if the indirection I've 
introduced causes performance problems when things can't be inlined.

While writing this package, I found the two most challenging problems to be:

(A) The disconnect between CSV files providing one row at a time and Julia's 
usage of column major arrays, which encourage reading one column at a time.
(B) The inability to easily resize! a matrix.

 -- John
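
A minimal sketch of one way around both problems, assuming purely numeric data 
and current Julia syntax (`read_matrix` is a hypothetical helper, not part of 
CSVReaders): each row is appended to a flat `Vector` (which, unlike a matrix, 
can grow), then the whole buffer is reshaped and transposed into a column-major 
matrix at the end.

```julia
# Hypothetical sketch: accumulate row-major CSV values in a growable vector,
# then reshape and transpose into a column-major matrix.
function read_matrix(lines)
    buf = Float64[]
    nrows = 0
    for line in lines                      # stand-in for eachline(io)
        append!(buf, parse.(Float64, split(line, ',')))
        nrows += 1
    end
    ncols = length(buf) ÷ nrows
    return permutedims(reshape(buf, ncols, nrows))  # nrows × ncols matrix
end

A = read_matrix(["1.0,2.0,3.0", "4.0,5.0,6.0"])     # 2×3 matrix
```

The transpose step temporarily doubles memory use; that is the cost of 
bridging row-major input and column-major storage.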

On Dec 8, 2014, at 5:16 AM, Stefan Karpinski ste...@karpinski.org wrote:

 Doh. Obfuscate the code quick, before anyone uses it! This is very nice and 
 something I've always felt like we need for data formats like CSV – a way of 
 decoupling the parsing of the format from the populating of a data structure 
 with that data. It's a tough problem.
 
 On Mon, Dec 8, 2014 at 8:08 AM, Tom Short tshort.rli...@gmail.com wrote:
 Exciting, John! Although your documentation may be very sparse, the code is 
 nicely documented.
 
 On Mon, Dec 8, 2014 at 12:35 AM, John Myles White johnmyleswh...@gmail.com 
 wrote:
 Over the last month or so, I've been slowly working on a new library that 
 defines an abstract toolkit for writing CSV parsers. The goal is to provide 
 an abstract interface that users can implement in order to provide functions 
 for reading data into their preferred data structures from CSV files. In 
 principle, this approach should allow us to unify the code behind Base's 
 readcsv and DataFrames's readtable functions.
 
 The library is still very much a work-in-progress, but I wanted to let others 
 see what I've done so that I can start getting feedback on the design.
 
 Because the library makes heavy use of Nullables, you can only try out the 
 library on Julia 0.4. If you're interested, it's available at 
 https://github.com/johnmyleswhite/CSVReaders.jl
 
 For now, I've intentionally given very sparse documentation to discourage 
 people from seriously using the library before it's officially released. But 
 there are some examples in the README that should make clear how the library 
 is intended to be used.
 
  -- John
 
 
 



Re: [julia-users] Re: [WIP] CSVReaders.jl

2014-12-08 Thread John Myles White
Thanks, Simon. In response to your comments:

* This package and the current DataFrames code both support streaming CSV files 
in minibatches. It's a little hard to do this with the current DataFrames 
reader, but it is possible. It is designed to be easier with CSVReaders.

* This package and the current DataFrames code both support specifying the 
types of all columns before parsing begins. There's no fast path in CSVReaders 
that uses this information to full-advantage because the functions were 
designed to never fail -- instead they always enlarge types to ensure 
successful parsing. It would be good to think about how the library needs to be 
restructured to support both use cases. I believe the DataFrames parser will 
fail if the hand-specified types are invalidated by the data.

* I'm hopeful that the String rewrite Stefan is involved with will make it 
easier to write parser functions that take in an Array{Uint8} and return values 
of type T. There's certainly no reason that CSVReaders couldn't be configured 
to use other parser functions, although it might be best not to pass parsing 
functions in as function arguments since the parsing functions might not get 
inlined. At the moment, I'd prefer to see new parsers be added to the default 
list and therefore available to everyone. This is particularly relevant to me, 
since I want to add support for reading in data from Hive tables -- which 
require parsing Array and Map objects from CSV-style files.

One thing that makes parsing tricky is that type inference requires that all 
parseable types be placed into a linear order: if parsing as Int fails, the 
parser falls back to Float64, then Bool, then UTF8String. Coming up with a 
design that handles arbitrary types in a non-linear tree, while still 
supporting automatic type inference, seems tricky.
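
That linear fallback order can be sketched as follows (current Julia syntax 
with `tryparse`; `infer` is a hypothetical illustration, not the actual 
CSVReaders implementation):

```julia
# Try each type in a fixed linear order; the first successful parse wins,
# and anything unparseable falls through to a plain string.
function infer(field::AbstractString)
    for T in (Int, Float64, Bool)
        v = tryparse(T, field)
        v !== nothing && return v
    end
    return string(field)   # final fallback: keep the raw string
end
```

Here `infer("3")` yields an `Int`, `infer("3.5")` a `Float64`, and `"abc"` 
stays a string.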

* Does the CSV standard have anything like END-OF-DATA? It's a very cool idea, 
but it seems that you'd need to introduce an arbitrary predicate that occurs 
per-row to make things work in the absence of existing conventions.

 -- John

On Dec 8, 2014, at 8:51 AM, Simon Byrne simonby...@gmail.com wrote:

 Very nice. I was thinking about this recently when I came across the rust csv 
 library:
 http://burntsushi.net/rustdoc/csv/
 
 It had a few neat features that I thought were useful:
 * the ability to iterate by row, without saving the entire table to an object 
 first (i.e. like awk)
 * the option to specify the type of each column (to improve performance)
 
 Some other things I've often wished for in CSV libraries:
 * be able to specify an arbitrary functions for mapping a string to data type 
 (e.g. strip out currency symbols, fix funny formatting, etc.)
 * be able to specify an "end of data" rule, other than end-of-file or number 
 of lines (e.g. stop on an empty line)
 
 s
 
 On Monday, 8 December 2014 05:35:02 UTC, John Myles White wrote:
 Over the last month or so, I've been slowly working on a new library that 
 defines an abstract toolkit for writing CSV parsers. The goal is to provide 
 an abstract interface that users can implement in order to provide functions 
 for reading data into their preferred data structures from CSV files. In 
 principle, this approach should allow us to unify the code behind Base's 
 readcsv and DataFrames's readtable functions.
 
 The library is still very much a work-in-progress, but I wanted to let others 
 see what I've done so that I can start getting feedback on the design.
 
 Because the library makes heavy use of Nullables, you can only try out the 
 library on Julia 0.4. If you're interested, it's available at 
 https://github.com/johnmyleswhite/CSVReaders.jl
 
 For now, I've intentionally given very sparse documentation to discourage 
 people from seriously using the library before it's officially released. But 
 there are some examples in the README that should make clear how the library 
 is intended to be used.
 
  -- John
 



Re: [julia-users] [WIP] CSVReaders.jl

2014-12-08 Thread John Myles White
Yes, this is how I've been doing things so far.

 -- John

On Dec 8, 2014, at 9:12 AM, Tim Holy tim.h...@gmail.com wrote:

 My suspicion is you should read into a 1d vector (and use `append!`), then at 
 the end do a reshape and finally a transpose. I bet that will be many times 
 faster than any other alternative, because we have a really fast transpose 
 now.
 
 The only disadvantage I see is taking twice as much memory as would be 
 minimally needed. (This can be fixed once we have row-major arrays.)
 
 --Tim
 
 On Monday, December 08, 2014 08:38:06 AM John Myles White wrote:
 I believe/hope the proposed solution will work for most cases, although
 there's still a bunch of performance work left to be done. I think the
 decoupling problem isn't as hard as it might seem since there are very
 clearly distinct stages in parsing a CSV file. But we'll find out if the
 indirection I've introduced causes performance problems when things can't
 be inlined.
 
 While writing this package, I found the two most challenging problems to be:
 
 (A) The disconnect between CSV files providing one row at a time and Julia's
 usage of column major arrays, which encourage reading one column at a time.
 (B) The inability to easily resize! a matrix.
 
 -- John
 
 On Dec 8, 2014, at 5:16 AM, Stefan Karpinski ste...@karpinski.org wrote:
 Doh. Obfuscate the code quick, before anyone uses it! This is very nice
 and something I've always felt like we need for data formats like CSV – a
 way of decoupling the parsing of the format from the populating of a data
 structure with that data. It's a tough problem.
 
 On Mon, Dec 8, 2014 at 8:08 AM, Tom Short tshort.rli...@gmail.com wrote:
 Exciting, John! Although your documentation may be very sparse, the code
 is nicely documented.
 
 On Mon, Dec 8, 2014 at 12:35 AM, John Myles White
 johnmyleswh...@gmail.com wrote: Over the last month or so, I've been
 slowly working on a new library that defines an abstract toolkit for
 writing CSV parsers. The goal is to provide an abstract interface that
 users can implement in order to provide functions for reading data into
 their preferred data structures from CSV files. In principle, this
 approach should allow us to unify the code behind Base's readcsv and
 DataFrames's readtable functions.
 
 The library is still very much a work-in-progress, but I wanted to let
 others see what I've done so that I can start getting feedback on the
 design.
 
 Because the library makes heavy use of Nullables, you can only try out the
 library on Julia 0.4. If you're interested, it's available at
 https://github.com/johnmyleswhite/CSVReaders.jl
 
 For now, I've intentionally given very sparse documentation to discourage
 people from seriously using the library before it's officially released.
 But there are some examples in the README that should make clear how the
 library is intended to be used. 
 -- John
 



Re: [julia-users] [WIP] CSVReaders.jl

2014-12-08 Thread John Myles White
Not really. It's mostly that the current interface doesn't make it easy to ask 
for a Matrix back when the intermediates are Vector objects. But I can change 
that.

 -- John

On Dec 8, 2014, at 9:25 AM, Tim Holy tim.h...@gmail.com wrote:

 Does the reshape/transpose really take any appreciable time (compared to the 
 I/O)?
 
 --Tim
 
 On Monday, December 08, 2014 09:14:35 AM John Myles White wrote:
 Yes, this is how I've been doing things so far.
 
 -- John
 
 On Dec 8, 2014, at 9:12 AM, Tim Holy tim.h...@gmail.com wrote:
 My suspicion is you should read into a 1d vector (and use `append!`), then
 at the end do a reshape and finally a transpose. I bet that will be many
 times faster than any other alternative, because we have a really fast
 transpose now.
 
 The only disadvantage I see is taking twice as much memory as would be
 minimally needed. (This can be fixed once we have row-major arrays.)
 
 --Tim
 
 On Monday, December 08, 2014 08:38:06 AM John Myles White wrote:
 I believe/hope the proposed solution will work for most cases, although
 there's still a bunch of performance work left to be done. I think the
 decoupling problem isn't as hard as it might seem since there are very
 clearly distinct stages in parsing a CSV file. But we'll find out if the
 indirection I've introduced causes performance problems when things can't
 be inlined.
 
 While writing this package, I found the two most challenging problems to
 be:
 
 (A) The disconnect between CSV files providing one row at a time and
 Julia's usage of column major arrays, which encourage reading one column
 at a time. (B) The inability to easily resize! a matrix.
 
 -- John
 
 On Dec 8, 2014, at 5:16 AM, Stefan Karpinski ste...@karpinski.org 
 wrote:
 Doh. Obfuscate the code quick, before anyone uses it! This is very nice
 and something I've always felt like we need for data formats like CSV –
 a
 way of decoupling the parsing of the format from the populating of a
 data
 structure with that data. It's a tough problem.
 
 On Mon, Dec 8, 2014 at 8:08 AM, Tom Short tshort.rli...@gmail.com
 wrote:
 Exciting, John! Although your documentation may be very sparse, the
 code
 is nicely documented.
 
 On Mon, Dec 8, 2014 at 12:35 AM, John Myles White
 johnmyleswh...@gmail.com wrote: Over the last month or so, I've been
 slowly working on a new library that defines an abstract toolkit for
 writing CSV parsers. The goal is to provide an abstract interface that
 users can implement in order to provide functions for reading data into
 their preferred data structures from CSV files. In principle, this
 approach should allow us to unify the code behind Base's readcsv and
 DataFrames's readtable functions.
 
 The library is still very much a work-in-progress, but I wanted to let
 others see what I've done so that I can start getting feedback on the
 design.
 
 Because the library makes heavy use of Nullables, you can only try out
 the
 library on Julia 0.4. If you're interested, it's available at
 https://github.com/johnmyleswhite/CSVReaders.jl
 
 For now, I've intentionally given very sparse documentation to
 discourage
 people from seriously using the library before it's officially released.
 But there are some examples in the README that should make clear how the
 library is intended to be used.
 -- John
 



Re: [julia-users] Re: [WIP] CSVReaders.jl

2014-12-08 Thread John Myles White
Ok, we can change things to fail on type-misspecification.

There's no real standard, but the rule "does Excel read this in a sane way?" is 
pretty effective for determining what you should try parsing and when you 
should tell people to reformat their data.

Given the current infrastructure, I think the easiest way to read that data 
would be to split it into separate files. There are other hacks that would 
work, but your problem is harder than just specifying end-of-data (which can be 
done by reading N rows) -- it's also specifying start-of-data (which can be 
done by skipping M rows at the start).

 -- John
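
A hand-rolled sketch of that approach (`read_block` is a hypothetical helper, 
not part of CSVReaders): skip the first M lines, then collect at most N data 
rows.

```julia
# Skip M header lines, then read up to N rows from the stream.
function read_block(io, M::Int, N::Int)
    rows = String[]
    for (i, line) in enumerate(eachline(io))
        i <= M && continue           # skip the first M lines
        push!(rows, line)
        length(rows) == N && break   # stop after N data rows
    end
    return rows
end

read_block(IOBuffer("header\na\nb\nc"), 1, 2)   # -> ["a", "b"]
```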

On Dec 8, 2014, at 9:24 AM, Simon Byrne simonby...@gmail.com wrote:

 
 On Monday, 8 December 2014 17:04:10 UTC, John Myles White wrote:
 * This package and the current DataFrames code both support specifying the 
 types of all columns before parsing begins. There's no fast path in 
 CSVReaders that uses this information to full-advantage because the functions 
 were designed to never fail -- instead they always enlarge types to ensure 
 successful parsing. It would be good to think about how the library needs to 
 be restructured to support both use cases. I believe the DataFrames parser 
 will fail if the hand-specified types are invalidated by the data.
 
 I agree that being permissive by default is probably a good idea, but 
 sometimes it is nice if the parser throws an error if it finds something 
 unexpected. This could also be useful for the end-of-data problem below.
 
 * Does the CSV standard have anything like END-OF-DATA? It's a very cool 
 idea, but it seems that you'd need to introduce an arbitrary predicate that 
 occurs per-row to make things work in the absence of existing conventions.
 
 Well, there isn't really a standard, just this RFC:
 http://www.ietf.org/rfc/rfc4180.txt
 which seems to assume end-of-data = end-of-file.
  
 When I hit this problem the files I was reading weren't actually CSV, but 
 this:
 http://lsbr.niams.nih.gov/bsoft/bsoft_param.html
 which have multiple tables per file, ended by a blank line. I think I ended 
 up devising a hack that would count the number of lines beforehand.


