On Thu, Dec 24, 2015 at 8:48 PM, Ismael Venegas Castelló
<[email protected]> wrote:
> Yichao, why is it a bad idea to splat? Is it not performant?
Yes, it's terrible for performance:
```
julia> using Benchmarks
julia> f1(x) = Int8[x...]
f1 (generic function with 1 method)
julia> f2(x) = map(Int8, x)
f2 (generic function with 1 method)
julia> x = [0 for i in 1:10000];
julia> @benchmark f1(x)
================ Benchmark Results ========================
Time per evaluation: 1.15 ms [785.77 μs, 1.50 ms]
Proportion of time in GC: 2.20% [0.00%, 9.76%]
Memory allocated: 726.92 kb
Number of allocations: 19504 allocations
Number of samples: 100
Number of evaluations: 100
Time spent benchmarking: 0.20 s
julia> @benchmark f2(x)
================ Benchmark Results ========================
Time per evaluation: 9.78 μs [9.61 μs, 9.96 μs]
Proportion of time in GC: 1.13% [0.56%, 1.69%]
Memory allocated: 9.86 kb
Number of allocations: 1 allocations
Number of samples: 5601
Number of evaluations: 416801
R² of OLS model: 0.953
Time spent benchmarking: 4.48 s
```
To start with, when you write `x...`, you are essentially asking for a
tuple with the same length as the array. This is inherently type
unstable, since the length of the array becomes part of the tuple type
and is not known at compile time.
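A minimal sketch of that point, using small made-up arrays (the names
`t3` and `t5` are only for illustration): splatting two arrays of
different lengths into tuples gives two different tuple types, while
the arrays themselves share one type.

```julia
# Splat two arrays of different lengths into tuples.
t3 = ([1, 2, 3]...,)
t5 = ([1, 2, 3, 4, 5]...,)

# The tuple type records one entry per element, so the types differ.
println(typeof(t3))
println(typeof(t5))
println(typeof(t3) == typeof(t5))   # false

# The array type does not record the length, so these types match.
println(typeof([1, 2, 3]) == typeof([1, 2, 3, 4, 5]))   # true
```

Because the tuple type depends on a run-time length, the compiler
cannot know it ahead of time when the input is an ordinary array.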
It is also very inefficient to call a function with an unknown number of
arguments, since we don't really have a calling convention for passing
those arguments unboxed.
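For comparison, here is a rough sketch of a few ways to do the same
conversion without splatting. The variable names `a`, `b` and `c` are
just for illustration, and I have not benchmarked these against each
other here; only `map` is timed above.

```julia
x = [0 for i in 1:10000]

# Each of these converts element by element, with no argument tuple.
a = map(Int8, x)
b = convert(Vector{Int8}, x)
c = Int8[v for v in x]        # typed comprehension

println(a == b == c)          # all three hold the same values
println(eltype(a))            # Int8
println(sizeof(a))            # 10000 bytes, one per element
```

The same idea should carry over to a dataframe column, since a column
is just a vector; the exact syntax for assigning the converted column
back depends on the DataFrames version you have.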
>
> On Thursday, December 24, 2015 at 5:02:14 (UTC-6), Min-Woong Sohn
> wrote:
>>
>> I want to reduce the amount of memory used by a dataframe that has lots of
>> binary variables. What is the best way to achieve this? For example, how can
>> I convert a variable from Int64 to Int8 in a dataframe?
>>
>> Thanks