Also note that other packages are free to define ~ (as a macro) for their
own purposes (although there will be a warning when they try to use
DataFrames).

    julia> macro ~(s1, s2)
               :(string($s1, $s2))
           end

    julia> "Hello" ~ " World!"
    "Hello World!"

    julia> using DataFrames
    Warning: using DataFrames.@~ in module Main conflicts with an existing
    identifier.

    julia> "Goodbye" ~ " World!"
    "Goodbye World!"

DataFrames (or ~, at least) did get special treatment here (see
https://github.com/JuliaLang/julia/issues/4882). I think this is quite
reasonable--even though Julia is a general purpose programming language, it
does target scientific computing users first!
And just because it's less readable to you doesn't make it so for
everyone. As you pointed out, for users coming from R or those who do
Bayesian modeling (I do!), y ~ x1 + x2 is very understandable. (And it's
quite reasonable that those without those backgrounds would find it
weird... we all come with our own baggage!)
So please, continue to say what you like and find useful and what you find
weird... but don't expect everyone to agree with you. ;-)
Cheers,
Kevin
On Mon, Apr 27, 2015 at 8:53 AM, Scott Jones <[email protected]>
wrote:
> 1) How readable is that though, unless you know R already?
>    `f = y ~ x1 + x2` really doesn't mean anything at all to me...
> 2) People say that we should just use `string(a,b,c)` and `repeat(a,n)`
>    for string concatenation and repetition... that makes no sense for
>    code where you find many string concatenations in the same line...
> 3) Even in the code you referenced, the operator is only used once every
>    few lines... so why not have it be: `formula(y,x1+x2)` or
>    `@formula(y,x1+x2)` instead? That seems MUCH more readable to me, and
>    doesn't seem like an undue burden on people using DataFrames, as it
>    doesn't seem to be used that frequently...
>
> Scott
>
> On Monday, April 27, 2015 at 11:42:10 AM UTC-4, Tamas Papp wrote:
>>
>> ~ is an implementation of R's formula interface, which many find useful,
>> even essential for statistics; calling it "pollution" is a somewhat
>> heterodox use of the term.
>>
>> See
>> https://github.com/JuliaStats/DataFrames.jl/blob/master/test/formula.jl
>> for nice examples.
>>
>> Best,
>>
>> Tamas
>>
>> On Mon, Apr 27 2015, François Fayard <[email protected]> wrote:
>>
>> > It comes back to the C++ coding guide from Google. One of the most
>> > important rules is: "don't pollute the global namespace".
>> >
>> > On Monday, April 27, 2015 at 5:07:15 PM UTC+2, Scott Jones wrote:
>> >>
>> >> I had actually suggested that... I still think it is a good idea, but I
>> >> think some other packages already pre-empted its use as a binary
>> >> operator (in a way that didn't really make much sense to me). I'd
>> >> suggest that DataFrames not use ~, and that ~ be used for string
>> >> concatenation.
>> >>
>>
>>