Re: [julia-users] Re: Comprehension (generator) with statement IF and the number of true

2016-10-28 Thread Gregory Salvan
Yes, I'm using 0.4. Thank you for this information.

2016-10-27 21:29 GMT+02:00 Steven G. Johnson :

>
>
> On Thursday, October 27, 2016 at 1:23:47 PM UTC-4, DNF wrote:
>>
>> All higher-order functions? I thought it was mainly anonymous functions.
>> Either way, that's a seriously big slowdown.
>>
>
> All higher-order functions can benefit in Julia 0.5, not just anonymous
> functions.  Because each function now has its own type, calling a
> higher-order function like reduce(...) can now compile a specialized
> version for each function you pass it, which allows it to do things like
> inlining.
>
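For instance, here is a minimal sketch of what that specialization means in practice (my own illustration, not from the message above):

```julia
xs = rand(1000)

# In 0.5 each function, including every anonymous function, has its own type,
# so each of these calls gets its own specialized (and potentially inlined)
# compiled version of reduce:
reduce(+, 0.0, xs)
reduce((s, x) -> s + abs(x), 0.0, xs)
```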


Re: [julia-users] Re: Comprehension (generator) with statement IF and the number of true

2016-10-26 Thread Gregory Salvan
2016-10-26 13:09 GMT+02:00 DNF :

> I don't have the impression that reduce is slow. The reduce function that
> you're using is complicated and may have features that preclude
> optimizations, such as vectorization.
>

I don't know exactly why, but the difference is bigger than what I expected. For
example, with A and F = rand(100_000_000) I have:

mapeBase_v1
BenchmarkTools.Trial:
  samples:          26
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  32.00 bytes
  allocs estimate:  1
  minimum time:     194.51 ms (0.00% GC)
  median time:      197.77 ms (0.00% GC)
  mean time:        198.77 ms (0.00% GC)
  maximum time:     220.32 ms (0.00% GC)
mapeBase_v4
BenchmarkTools.Trial:
  samples:          1
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  37.25 gb
  allocs estimate:  178401
  minimum time:     58.77 s (3.04% GC)
  median time:      58.77 s (3.04% GC)
  mean time:        58.77 s (3.04% GC)
  maximum time:     58.77 s (3.04% GC)


If you have tips to help me understand this huge difference and possibly
optimize it, I'd appreciate them (I've tried to look at the LLVM IR, but it's
quite hard at my level; I've just noticed a lot of "store" instructions and
don't understand why it's done this way).



>
> But perhaps more importantly, the reduce version, while probably very
> clever, is almost completely impossible to understand. I know what it's
> supposed to do, and I still cannot decipher it, while the straight loop is
> crystal clear and easy to understand. And almost as concise!
>

"crystal clear"... depends on your background and habits.
V4 is more natural for me than V1, probably because when I need a single
result (sum, abs, ...) from a list of values my first option is always
reduce, and I associate a for loop with "repeat something for each of these
values".
I also avoid nested blocks and use guard clauses a lot, to the point that I
read them as part of the function signature (by default I would naturally
extract V1's loop body into a function with a guard clause), so my eyes
only catch the operation used to reduce the list into the expected result.

These habits are not well suited to Julia, and I have to change them, which is
not a problem, as Julia is such a pleasure to use.


Re: [julia-users] Re: Comprehension (generator) with statement IF and the number of true

2016-10-26 Thread Gregory Salvan
Sorry, reduce was a bad idea; even if the syntax is nice, it's really slow and
memory-hungry.

V1 can take advantage of @inbounds and @simd optimizations:
http://docs.julialang.org/en/release-0.5/manual/performance-tips/#performance-annotations
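For instance, a sketch of V1 with those annotations added (the branch inside the loop may still prevent actual SIMD vectorization, but @inbounds alone already removes the bounds checks):

```julia
function mapeBase_v1_annotated(A::Vector{Float64}, F::Vector{Float64})
  s = 0.0
  count = 0
  # @inbounds skips bounds checking; @simd gives the compiler permission to
  # reorder the loop for vectorization where possible.
  @inbounds @simd for i in 1:length(A)
    if A[i] != 0.0
      s += abs((A[i] - F[i]) / A[i])
      count += 1
    end
  end
  s, count
end
```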

I hope reduce will be optimized in the future, because here it expresses the
problem we are solving well and has a nice syntax:

mapeBase_v4(A::Vector{Float64}, F::Vector{Float64}) =
    reduce((0.0, 0), enumerate(A)) do result::Tuple{Float64, Int64}, current::Tuple{Int64, Float64}
        current[2] == 0.0 && return result # guard clause
        result[1] + abs(1 - F[current[1]] / current[2]), result[2] + 1
    end

unfortunately it's unusable ;)


2016-10-26 8:01 GMT+02:00 Martin Florek :

> Thank you, everyone. v1 is very nice, as it turned out. I was looking for
> the magic of the Julia language, especially for the generator.
>


Re: [julia-users] Re: Comprehension (generator) with statement IF and the number of true

2016-10-25 Thread Gregory Salvan
Maybe with reduce?

```
function mapeBase_v4(A::Vector{Float64}, F::Vector{Float64})
  function abscount(prev, current)
    current[2] == 0.0 && return prev
    index, item = current
    (prev[1] + abs(1 - F[index] / item), prev[2] + 1)
  end
  reduce(abscount, (0.0, 0), enumerate(A))
end
```
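A quick usage sketch (A and F being the actuals/forecasts vectors from the original post):

```julia
A, F = rand(1000), rand(1000)
s, count = mapeBase_v4(A, F)   # sum of |1 - F[i]/A[i]| over non-zero A[i], and their count
```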

2016-10-25 9:43 GMT+02:00 Jeffrey Sarnoff :

> This may do what you want.
>
> function mapeBase_v3(actuals::Vector{Float64}, forecasts::Vector{Float64})
> # actuals - actual target values
> # forecasts - forecasts (model estimations)
>
>   sum_reldiffs = sumabs((x - y) / x for (x, y) in zip(actuals, forecasts) if x != 0.0)  # Generator
>
>   count_zeros = sum( map(x->(x==0.0), actuals) )
>   count_nonzeros = length(actuals) - count_zeros
>   sum_reldiffs, count_nonzeros
> end
>
>
>
>
> On Tuesday, October 25, 2016 at 3:15:54 AM UTC-4, Martin Florek wrote:
>>
>> Hi all,
>> I'm new to Julia and I'm doing some refactoring. I have the following function:
>>
>> function mapeBase_v1(A::Vector{Float64}, F::Vector{Float64})
>>   s = 0.0
>>   count = 0
>>   for i in 1:length(A)
>>     if A[i] != 0.0
>>       s += abs((A[i] - F[i]) / A[i])
>>       count += 1
>>     end
>>   end
>>
>>   s, count
>>
>> end
>>
>> I'm looking for a simpler variant which is as follows:
>>
>> function mapeBase_v2(A::Vector{Float64}, F::Vector{Float64})
>> # A - actual target values
>> # F - forecasts (model estimations)
>>
>>   s = sumabs((x - y) / x for (x, y) in zip(A, F) if x != 0) # Generator
>>
>>   count = length(A) # ???
>>   s, count
>> end
>>
>>
>> However, with this variant I cannot determine the number of non-zero elements.
>> I found an option with length(A[A .!= 0.0]), but it causes a large allocation.
>> Does someone know a solution with a generator, or is variant v1 a good choice?
>>
>>
>> Thanks in advance,
>> Martin
>>
>>


Re: [julia-users] How to parallelize computation on HDF5 files?

2016-10-21 Thread Gregory Salvan
Hello,
I see no answer, so I'll try to help.
First, just in case: I don't think looking at performance first is a good
approach. A better one is to write clean code that expresses your intent
first, and then profile and optimize only if necessary.

You should profile, because I get different results than you: in this example,
multicore processing is 10% faster than single core on my laptop (with
hdf5-mpich on a latest-generation i7 under Linux).

Without more information I can only guess that you may do a little better by
releasing the file lock sooner, i.e. by doing the calculation on the data
after the file is closed.
Currently you're opening the file, computing "mean" on the data, then closing
the file; I'd prefer getting the data with h5read and then applying "mean" or
other calculations (except if the data are too big for memory, which is not
the case here).
I'm not sure it's better; I have little insight into how it's managed internally.
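Something like this sketch of calc_singlefile, reusing the names from your code (untested):

```julia
using HDF5

# Read the dataset into memory first (h5read opens and closes the file),
# then do the computation on the in-memory array, so the file handle is
# released as early as possible.
calc_singlefile(obj_id::Int) = mean(h5read(singlefn, "data_$(obj_id)"))
```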

hope it helps, good luck
Gregory

2016-10-20 11:55 GMT+02:00 Jon Alm Eriksen :

> Sorry if this gets reposted, but I think my last mail got lost in the ether.
>
> I am trying to parallelize computation on HDF5 files, but I get only a
> minor speedup from the parallelization. I thought that I could access the
> same file from multiple worker processes when I open the file in read mode,
> but it seems like it is faster to access data in different files. Is there
> anything I can do to improve the performance of my code?
>
> Consider the following two files:
>
> myparallel.jl
> ```julia
> module MyParallel
> using HDF5
>
> const N_objs = 20
> const N_floats = 100_000_000
> const singlefn = "junk.h5"
> const multiplefns = ["junk_$(i).h5" for i in 1:N_objs]
>
> function write_junk_singlefile()
> h5open(singlefn, "w") do f
> for i in 1:N_objs
> write(f, "data_$(i)", rand(N_floats))
> end
> end
> end
>
> function write_junk_multifiles()
> for i in 1:N_objs
> h5open(f->write(f, "data", rand(N_floats)), multiplefns[i], "w")
> end
> end
>
> calc_singlefile(obj_id::Int) = h5open(f->mean(f["data_$(obj_id)"][:]), singlefn, "r")
> calc_singlefile(obj_ids::AbstractArray{Int}) = [calc_singlefile(i) for i in obj_ids]
> calc_singlefile_parallel(obj_ids::AbstractArray{Int}) = pmap(calc_singlefile, obj_ids)
>
> calc_multifiles(file_id::Int) = h5open(f->mean(f["data"][:]), multiplefns[file_id], "r")
> calc_multifiles(file_ids::AbstractArray{Int}) = [calc_multifiles(i) for i in file_ids]
> calc_multifiles_parallel(file_ids::AbstractArray{Int}) = pmap(calc_singlefile, file_ids)
>
> export N_objs, write_junk_singlefile, write_junk_multifiles, calc_singlefile,
>        calc_singlefile_parallel, calc_multifiles, calc_multifiles_parallel
> end
> ```
>
> and
>
> run_paralleltest.jl
>
> ```julia
> addprocs(7)
> @everywhere include("myparallel.jl")
> @everywhere using MyParallel
>
> #write_junk_singlefile()
> #write_junk_multifiles()
>
> println("singlefile single core processing:")
> @time calc_singlefile(1:N_objs)
> println("multifiles single core processing:")
> @time calc_multifiles(1:N_objs)
> println("singlefile multi core processing:")
> @time calc_singlefile_parallel(1:N_objs)
> println("multifiles multi core processing:")
> @time calc_multifiles_parallel(1:N_objs)
> ```
>
> When I run `julia run_paralleltest.jl` on my MacBook Pro, I get the
> following results:
>
> ```
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> singlefile single core processing:
>  16.758472 seconds (762.67 k allocations: 14.933 GB, 7.55% gc time)
> multifiles single core processing:
>  15.962462 seconds (27.59 k allocations: 14.902 GB, 7.74% gc time)
> singlefile multi core processing:
>  19.293451 seconds (5.73 M allocations: 241.379 MB, 1.08% gc time)
> multifiles multi core processing:
>  13.152688 seconds (3.15 k allocations: 204.315 KB)
> ```
>
> Is there a way to get better performance when running the parallel
> calculations? Also, are the warnings expected?
>
> best,
> Jon Alm Eriksen
>


Re: [julia-users] UTF8, how to procesed text data

2016-10-21 Thread Gregory Salvan
Hi,
there is a library that lets you specify the encoding when opening files:

https://github.com/nalimilan/StringEncodings.jl
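For instance, a rough sketch using its decode/encode functions (the file name and encodings below are only examples):

```julia
using StringEncodings

# Read the raw bytes, then decode them from an explicit encoding into a String.
bytes = read("data.txt")          # on Julia 0.4, use readbytes("data.txt")
text  = decode(bytes, "UTF-8")    # or e.g. "ISO-8859-2" for Latin-2 data

# The other direction: encode a String into bytes in a chosen encoding.
raw = encode(text, "UTF-8")
```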

2016-10-20 19:32 GMT+02:00 :

> Julia 0.5 is OK, but there is a new problem with the space after ś, ć. More in
> a new post...
>
> Paul
>
> On Wednesday, October 19, 2016 at 15:04:13 UTC+2, Milan Bouchet-Valat wrote:
>>
>> On Wednesday, October 19, 2016 at 06:02 -0700, program...@gmail.com wrote:
>> > Version 0.3.12; update to 0.5?
>> Yes. 0.3.x versions have been unsupported for some time now.
>>
>>
>> Regards
>>
>> > > On Wednesday, October 19, 2016 at 04:46 -0700, program...@gmail.com wrote:
>> > > > The data file is encoded in UTF-8, but I can't process this data in Julia.
>> > > > What's wrong?
>> > > >
>> > > > io = open("data.txt")
>> > > >
>> > > > julia> temp=readline(io)
>> > > > "3699778,13,2,gdbiehz jablej gupując szybgi Injehnej dg 26
>> > > > paździehniga,1\n"
>> > > >
>> > > > julia> temp[61:65]
>> > > > "aźdz"
>> > > >
>> > > > julia> findin(temp[61:65],"d")
>> > > > ERROR: invalid UTF-8 character index
>> > > >  in next at utf8.jl:64
>> > > >  in findin at array.jl:1179
>> > > You didn't say what version of Julia you're using. The bug seems to
>> > > happen on 0.4.7, but not on 0.5.0, so I'd encourage you to upgrade.
>> > >
>> > > (Note that in general you shouldn't index into strings with
>> > > arbitrary
>> > > integers: only values referring to the beginning of a Unicode code
>> > > point are valid.)
>> > >
>> > >
>> > > Regards
>> >
>>
>


Re: [julia-users] Re: Function check without @assert

2016-04-22 Thread Gregory Salvan
I find assertions really useful for examples in documentation, and sometimes in
code as notes for developers (far better than comments).
http://c2.com/cgi/wiki?DoNotUseAssertions

2016-04-22 8:45 GMT+02:00 Lyndon White :

>
> *TL;DR: Assert is no slower than exception throwing*; the problem is that
> people expect it to be a `nop` if optimisation is turned on, and a `nop` is
> much faster than an assert (*I haven't actually tested this*).
>
>
> It's not that assert is slow.
> It is that traditionally (and we are talking about going back a very long
> time), `assert` statements are removed by the optimizer (e.g. it is one of
> the only things done by `cpython -O`), because asserts are error messages
> targeted at the developer. In theory they will never be triggered at run
> time in "shipped" code.
>
> This is different from exceptions.
> Exceptions may be thrown (and not caught) in shipped code.
> It's bad, but sometimes unavoidable, and it is better that the exception is
> thrown (and not caught) than to allow the program to run in an unknown state
> (potentially leading to data corruption, infinite looping, or all kinds of
> problems).
>
> Whereas if the exception were instead an assert, then in the shipped version
> it is not there, and so the program runs in an unknown state.
>
> Currently, running `julia --optimise` does not actually remove them.
> This is generally considered a bad thing: it breaks the expectation that
> they will be gone in the optimised version, which is very bad if you have an
> `assert` in a key inner loop.
>
>
>
> However, some people write code that uses asserts to check that a function
> is being used correctly, because they are lazy (aren't we all? I know I do
> this). If this code is shipped in optimized form (without asserts), then the
> users of their code (if it is a library) are going to be able to get into
> unintended states -- by accident.
>
> Now, what people should do is, as in your example, check conditions and
> throw meaningful exceptions. Your way works, as does using an if statement
> (I'd use an if statement, but that is just me). But as a hack for those of
> us who are lazy, a @check macro was proposed, which would not be removed by
> the optimizer.
>
>
>
>
>
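For reference, here is a minimal sketch of what such a @check macro could look like (my own illustration of the idea, not the actual proposal):

```julia
# Unlike @assert, this is intended to always run, even with optimizations on:
# it simply expands to a condition test that throws an ordinary exception.
macro check(cond, msg)
    quote
        $(esc(cond)) || throw(ArgumentError($(esc(msg))))
    end
end

# Usage:
function safe_sqrt(x)
    @check x >= 0 "x must be non-negative"
    sqrt(x)
end
```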


[julia-users] Re: Stuck with using and LOAD_PATH

2016-04-20 Thread Gregory Salvan
Thanks for the replies (sorry, I was not notified).
The issue was about running tests often (BDD), and Pkg doesn't seem to be the
solution.
I finally resolved it, but I still don't understand what the problem was. :)
I've defined all the modules in package.jl, with exports and includes, and then
used "using Package.ModuleA.TypeA" in ModuleB, for example.

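Roughly, the layout described above looks like this (module, type, and file names here are placeholders):

```julia
# src/package.jl
module Package

module ModuleA
  export TypeA
  include("fileA.jl")            # fileA.jl defines `immutable TypeA ... end`
end

module ModuleB
  using Package.ModuleA.TypeA    # bring TypeA into scope from the sibling module
  include("fileB.jl")            # fileB.jl can now refer to TypeA
end

end
```
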
On Monday, March 21, 2016 at 04:13:24 UTC+1, James Dang wrote:
>
> If MyDir is on the LOAD_PATH, this works for me:
>
> LOAD_PATH/
>   Foo/
> src/
>   Foo.jl
>
> Foo.jl:
>   module Foo
>   ...
>   end
>
> in separate code:
>import Foo
>
>
>
>
>
>
> On Wednesday, March 16, 2016 at 9:42:37 PM UTC+8, Gregory Salvan wrote:
>>
>> Hi,
>> I don't understand the way of importing/using modules, or including files.
>> I've looked at the documentation (FAQ and modules) and searched this list for
>> old threads, and tested all the solutions I found.
>>
>> with julia version 0.4.3 on gentoo
>>
>> directory structure looks like:
>>
>> package.jl (directory)
>>   |_ src
>>   |  |_ fileA.jl
>>   |  |_ fileB.jl
>>   |  |_ fileC.jl
>>   |  |_ fileD.jl
>>   |  |_ package.jl
>>   |_ test
>>  |_ testA.jl
>>
>> The *first issue* I had was with "using" in tests; for example, I've tried:
>>
>> push!(LOAD_PATH, string(dirname(@__FILE__), "/..src")) # just to try - 
>> ModuleA is in fileA.jl in ../src
>> using ModuleA
>> # alternatively:
>> # using ModuleA.TypeA1 or using ModuleA: TypeA1...
>> facts("Test ModuleA") do
>>   context("create") do
>>  type_a1 = ModuleA.TypeA1()
>>   end
>> end
>>
>>
>> I've tested launching this test with :
>> JULIA_LOAD_PATH=./src/ LOAD_PATH=./src julia --color=yes test/testA.jl
>>
>> (no differences with import too)
>>
>> The error message is:
>> ERROR: LoadError: ArgumentError: ModuleA not found in path
>>
>> Instead of import/using (but that's not what I wanted), It's OK if I use :
>> include("../src/fileA.jl")
>>
>>
>>
>> The *second issue* is when I wanted to define modules in package.jl this 
>> way:
>>
>> module Package
>>   module ModA
>> include("fileA.jl")
>> include("fileB.jl")
>>   end
>>   module ModC
>> include("fileC.jl")
>>   end
>> end
>>
>> And I have, for example, a fileD.jl included in both fileA.jl and fileB.jl (or
>> fileC.jl), with an immutable type. (I've included fileD.jl because I couldn't
>> use "using", but I would prefer "using".)
>> I get the error: ERROR: LoadError: LoadError: LoadError: LoadError:
>> LoadError: invalid redefinition of constant ImmutableType
>>
>> So, as I can't use "using" to avoid the reloading, what can I do?
>> What have I not understood about Julia's import/using and include?
>>
>> NOTE:
>> some things I've tried: fileD.jl with a module ModuleD and "using" with 
>> and without  __precompile__
>> setting LOAD_PATH and JULIA_LOAD_PATH globally with bashrc or passing it 
>> to shell
>> ...
>>
>>

[julia-users] Re: cross-module exports / extending modules

2016-03-20 Thread Gregory Salvan
Hi,
when you want to add methods to functions from the Base module (like getindex,
getfield, ...), you use "import Base.getindex" and then write a new method
taking your own types as arguments.
For example, to add methods in B.jl to functionA from A.jl:

import A.functionA

function functionA(...)  # new method, dispatching on B's types

end

Is that what you were looking for?
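A more complete, self-contained sketch (module, type, and function names here are placeholders, not taken from A.jl/B.jl themselves):

```julia
module A
  export functionA
  functionA(x::Int) = x + 1
end

module B
  import A.functionA      # import it so new methods attach to A's function

  immutable MyType        # a type owned by B
    v::Int
  end

  # New method on A.functionA for B's type; code that does `using A`
  # sees this method too, since it belongs to the same generic function.
  functionA(x::MyType) = functionA(x.v)
end

A.functionA(B.MyType(41))   # => 42
```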

On Sunday, March 20, 2016 at 11:25:10 UTC+1, Andreas Lobinger wrote:
>
> Hello colleagues,
>
> I remember a discussion about this, but maybe without a conclusion and maybe
> without the right keywords.
>
> Let's have module A (from package A.jl) with certain functionality and maybe
> some types. Now module/package/code B.jl somehow extends A with optional
> functions. How do I put these functions under the A API?
> I'm pretty sure exports across modules don't work, but is there some
> functionality for this somewhere?
>
> Wishing a happy day,
> Andreas
>


[julia-users] Stuck with using and LOAD_PATH

2016-03-19 Thread Gregory Salvan
Hi,
I don't understand the way of importing/using modules, or including files.
I've looked at the documentation (FAQ and modules) and searched this list for
old threads, and tested all the solutions I found.

with julia version 0.4.3 on gentoo

directory structure looks like:

package.jl (directory)
  |_ src
  |  |_ fileA.jl
  |  |_ fileB.jl
  |  |_ fileC.jl
  |  |_ fileD.jl
  |  |_ package.jl
  |_ test
 |_ testA.jl

The *first issue* I had was with "using" in tests; for example, I've tried:

push!(LOAD_PATH, string(dirname(@__FILE__), "/..src")) # just to try - ModuleA is in fileA.jl in ../src
using ModuleA
# alternatively:
# using ModuleA.TypeA1 or using ModuleA: TypeA1...
facts("Test ModuleA") do
  context("create") do
 type_a1 = ModuleA.TypeA1()
  end
end


I've tested launching this test with:
JULIA_LOAD_PATH=./src/ LOAD_PATH=./src julia --color=yes test/testA.jl

(no difference with import either)

The error message is:
ERROR: LoadError: ArgumentError: ModuleA not found in path

Instead of import/using (though that's not what I wanted), it's OK if I use:
include("../src/fileA.jl")



The *second issue* is when I wanted to define modules in package.jl this 
way:

module Package
  module ModA
include("fileA.jl")
include("fileB.jl")
  end
  module ModC
include("fileC.jl")
  end
end

And I have, for example, a fileD.jl included in both fileA.jl and fileB.jl (or
fileC.jl), with an immutable type. (I've included fileD.jl because I couldn't
use "using", but I would prefer "using".)
I get the error: ERROR: LoadError: LoadError: LoadError: LoadError:
LoadError: invalid redefinition of constant ImmutableType

So, as I can't use "using" to avoid the reloading, what can I do?
What have I not understood about Julia's import/using and include?

NOTE:
Some things I've tried: fileD.jl with a module ModuleD, and "using" with and
without __precompile__;
setting LOAD_PATH and JULIA_LOAD_PATH globally in bashrc, or passing them to
the shell;
...