On Thu, Dec 24, 2015 at 8:48 PM, Ismael Venegas Castelló
<[email protected]> wrote:
> Yichao, why is it bad idea to splat? Is it not performant?

Yes, it's terrible for performance:

```
julia> using Benchmarks

julia> f1(x) = Int8[x...]
f1 (generic function with 1 method)

julia> f2(x) = map(Int8, x)
f2 (generic function with 1 method)

julia> x = [0 for i in 1:10000];

julia> @benchmark f1(x)
================ Benchmark Results ========================
     Time per evaluation: 1.15 ms [785.77 μs, 1.50 ms]
Proportion of time in GC: 2.20% [0.00%, 9.76%]
        Memory allocated: 726.92 kb
   Number of allocations: 19504 allocations
       Number of samples: 100
   Number of evaluations: 100
 Time spent benchmarking: 0.20 s


julia> @benchmark f2(x)
================ Benchmark Results ========================
     Time per evaluation: 9.78 μs [9.61 μs, 9.96 μs]
Proportion of time in GC: 1.13% [0.56%, 1.69%]
        Memory allocated: 9.86 kb
   Number of allocations: 1 allocations
       Number of samples: 5601
   Number of evaluations: 416801
         R² of OLS model: 0.953
 Time spent benchmarking: 4.48 s
```

To start with, when you write `x...`, you are essentially asking for a
tuple with the same length as the array. This is inherently type
unstable, since the length of the array becomes part of the tuple type
and is not known at compile time.
It is also very inefficient to call a function with an unknown number
of arguments, since we don't really have a calling convention to pass
those arguments unboxed.
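
To make the tuple-length point concrete, here is a minimal sketch. The
function names are made up for illustration and are not from the
benchmark above, and `convert(Vector{Int8}, x)` is just one more
non-splatting alternative I'm assuming is acceptable for your case.

```julia
# Illustrative sketch; the names below are hypothetical helpers.
x = [0 for i in 1:10000]

# Splatting passes every element as a separate argument, so the number of
# arguments depends on length(x), which is only known at run time.
splat_convert(x) = Int8[x...]

# These alternatives work on the array as a whole and never splat it.
map_convert(x) = map(Int8, x)
direct_convert(x) = convert(Vector{Int8}, x)

# A tuple's length is part of its type, so arrays of different lengths
# splat into tuples of different types.
two = (x[1:2]...,)
three = (x[1:3]...,)
println(typeof(two) == NTuple{2,Int})
println(typeof(three) == NTuple{3,Int})

# All three functions produce the same Vector{Int8}.
println(splat_convert(x) == map_convert(x) == direct_convert(x))
```

All three versions return the same result; the difference is only in
how much work the compiler and runtime have to do to get there.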

>
> On Thursday, December 24, 2015, 5:02:14 (UTC-6), Min-Woong Sohn
> wrote:
>>
>> I want to reduce the amount of memory used by a dataframe that has lots of
>> binary variables. What is the best way to achieve this? For example, how can
>> I convert a variable from Int64 to Int8 in a dataframe.
>>
>> Thanks
