Re: [julia-users] Re: Comprehension (generator) with statement IF and the number of true
Yes, I'm using 0.4. Thank you for this info.

2016-10-27 21:29 GMT+02:00 Steven G. Johnson:
> On Thursday, October 27, 2016 at 1:23:47 PM UTC-4, DNF wrote:
>> All higher-order functions? I thought it was mainly anonymous functions. Either way, that's a seriously big slowdown.
>
> All higher-order functions can benefit in Julia 0.5, not just anonymous functions. Because each function now has its own type, calling a higher-order function like reduce(...) can now compile a specialized version for each function you pass it, which allows it to do things like inlining.
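The point about function types can be seen directly in the REPL; a minimal sketch (works on 0.5 and later, variable names are illustrative):

```julia
# each function, named or anonymous, gets its own concrete type,
# so higher-order functions like map/reduce can specialize per function
square = x -> x^2
cube   = x -> x^3

typeof(square) === typeof(cube)   # false: two distinct singleton types
map(square, [1, 2, 3])            # compiled specifically for `square`
```

Because `square`'s type uniquely identifies it, the compiler can inline its body into the specialized `map` method instead of making an opaque dynamic call.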
Re: [julia-users] Re: Comprehension (generator) with statement IF and the number of true
2016-10-26 13:09 GMT+02:00 DNF:
> I don't have the impression that reduce is slow. The reduce function that you're using is complicated and may have features that preclude optimizations, such as vectorization.

I don't know exactly why, but the difference is bigger than what I expected. For example, with A and F = rand(100_000_000) I get:

```
mapeBase_v1
BenchmarkTools.Trial:
  samples:          26
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  32.00 bytes
  allocs estimate:  1
  minimum time:     194.51 ms (0.00% GC)
  median time:      197.77 ms (0.00% GC)
  mean time:        198.77 ms (0.00% GC)
  maximum time:     220.32 ms (0.00% GC)

mapeBase_v4
BenchmarkTools.Trial:
  samples:          1
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  37.25 gb
  allocs estimate:  178401
  minimum time:     58.77 s (3.04% GC)
  median time:      58.77 s (3.04% GC)
  mean time:        58.77 s (3.04% GC)
  maximum time:     58.77 s (3.04% GC)
```

If you have tips to help me understand this huge difference and eventually optimize it, I'd appreciate them. (I've tried to look at the LLVM IR but it's quite hard at my level; I've just noticed a lot of "store" instructions but don't understand why it's done this way.)

> But perhaps more importantly, the reduce version, while probably very clever, is almost completely impossible to understand. I know what it's supposed to do, and I still cannot decipher it, while the straight loop is crystal clear and easy to understand. And almost as concise!

"Crystal clear"... depends on your background and habits. V4 is more natural for me than V1, probably because when I need a single result (sum, abs...) from a list of values my first option is always reduce, and I associate a for loop with "repeat something for each of these values". I also avoid nested blocks and use a lot of guard clauses, to the point that I read them as part of the function signature (by default I would naturally extract the v1 loop body into a function with a guard clause), so my eyes only catch the operation used to reduce the list into the expected result. This is not well adapted to Julia, and I have to change these habits, which is not a problem as Julia is such a pleasure to use.
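For illustration, the refactor described above (loop body extracted into a helper with a guard clause) might look like the sketch below; `accumulate_term` and `mapeBase_guard` are hypothetical names, not from the thread:

```julia
# helper with a guard clause: skip zero actuals, otherwise fold the
# relative error and the counter into the running (sum, count) state
function accumulate_term(state, a, f)
    a == 0.0 && return state           # guard clause
    s, count = state
    (s + abs((a - f) / a), count + 1)
end

function mapeBase_guard(A::Vector{Float64}, F::Vector{Float64})
    state = (0.0, 0)
    for i in 1:length(A)
        state = accumulate_term(state, A[i], F[i])
    end
    state
end
```

This keeps v1's loop structure while reading like the reduce version: the fold operation lives in one small function with the guard up front.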
Re: [julia-users] Re: Comprehension (generator) with statement IF and the number of true
Sorry, reduce was a bad idea: even if the syntax is nice, it's really slow and memory-hungry. V1 can take advantage of the @inbounds and @simd optimizations: http://docs.julialang.org/en/release-0.5/manual/performance-tips/#performance-annotations

I hope reduce will be optimized in the future, because here it expresses the problem we solve directly and has a nice syntax:

```julia
mapeBase_v4(A::Vector{Float64}, F::Vector{Float64}) =
    reduce((0.0, 0), enumerate(A)) do result::Tuple{Float64, Int64}, current::Tuple{Int64, Float64}
        current[2] == 0.0 && return result  # guard clause
        result[1] + abs(1 - F[current[1]] / current[2]), result[2] + 1
    end
```

Unfortunately it's unusable ;)

2016-10-26 8:01 GMT+02:00 Martin Florek:
> Thank you everyone. v1 is very nice, as it turned out. I was looking for the magic of the Julia language, especially for the generator.
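As a sketch of the annotations mentioned above, here is a hypothetical annotated variant of v1 (not from the thread); `ifelse` evaluates both branches, which keeps the loop body branch-free and gives `@simd` a better chance to vectorize:

```julia
function mapeBase_v1_simd(A::Vector{Float64}, F::Vector{Float64})
    s = 0.0
    count = 0
    @inbounds @simd for i in 1:length(A)
        nz = A[i] != 0.0
        # ifelse computes the division even when A[i] == 0.0 (giving
        # Inf/NaN), but that result is simply discarded in that case
        s += ifelse(nz, abs((A[i] - F[i]) / A[i]), 0.0)
        count += ifelse(nz, 1, 0)
    end
    s, count
end
```

Note that `@simd` permits floating-point reassociation, so the sum may differ in the last bits from the plain loop.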
Re: [julia-users] Re: Comprehension (generator) with statement IF and the number of true
Maybe with reduce?

```julia
function mapeBase_v4(A::Vector{Float64}, F::Vector{Float64})
    function abscount(prev, current)
        current[2] == 0.0 && return prev
        index, item = current
        (prev[1] + abs(1 - F[index] / item), prev[2] + 1)
    end
    reduce(abscount, (0.0, 0), enumerate(A))
end
```

2016-10-25 9:43 GMT+02:00 Jeffrey Sarnoff:
> This may do what you want.
>
> ```julia
> function mapeBase_v3(actuals::Vector{Float64}, forecasts::Vector{Float64})
>     # actuals   - actual target values
>     # forecasts - forecasts (model estimations)
>     sum_reldiffs = sumabs((x - y) / x for (x, y) in zip(actuals, forecasts) if x != 0.0)  # generator
>     count_zeros = sum(map(x -> x == 0.0, actuals))
>     count_nonzeros = length(actuals) - count_zeros
>     sum_reldiffs, count_nonzeros
> end
> ```
>
> On Tuesday, October 25, 2016 at 3:15:54 AM UTC-4, Martin Florek wrote:
>>
>> Hi all,
>> I'm new to Julia and I'm doing refactoring. I have the following function:
>>
>> ```julia
>> function mapeBase_v1(A::Vector{Float64}, F::Vector{Float64})
>>     s = 0.0
>>     count = 0
>>     for i in 1:length(A)
>>         if A[i] != 0.0
>>             s += abs((A[i] - F[i]) / A[i])
>>             count += 1
>>         end
>>     end
>>     s, count
>> end
>> ```
>>
>> I'm looking for a simpler variant, which is as follows:
>>
>> ```julia
>> function mapeBase_v2(A::Vector{Float64}, F::Vector{Float64})
>>     # A - actual target values
>>     # F - forecasts (model estimations)
>>     s = sumabs((x - y) / x for (x, y) in zip(A, F) if x != 0)  # generator
>>     count = length(A)  # ???
>>     s, count
>> end
>> ```
>>
>> However, with this variant I cannot determine the number of non-zero elements. I found an option with length(A[A .!= 0.0]), but it has a large allocation. Please, does someone know a solution with a generator, or is variant v1 a very good choice?
>>
>> Thanks in advance,
>> Martin
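On the counting sub-question in the quoted post: summing a generator of Bools counts the non-zero elements without building the temporary array that `length(A[A .!= 0.0])` allocates. A minimal sketch (the sample vector is illustrative):

```julia
A = [1.0, 0.0, 3.0, 0.0, 2.5]

# a Bool generator inside sum counts the true values; no temporary array
count_nonzeros = sum(x != 0.0 for x in A)
```

This pairs naturally with the v2 generator for the sum, giving both results without extra allocation.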
Re: [julia-users] How to parallelize computation on HDF5 files?
Hello, I see no answer so I'll try to help.

First, just in case: I think it's not a good approach to look at performance first. A better one is to write clean code that expresses your intent first, and then, if necessary, profile and optimize.

You should profile, because I get different results than you: in this example, multicore processing is 10% faster on my laptop than single core (with hdf5-mpich on a last-gen i7 under Linux).

Without more information, I can only guess that you might do a little better by releasing the file lock sooner, i.e. doing the calculation on the data after the file is closed. Currently you're opening the file, computing the mean on the data, then closing the file; I'd prefer getting the data with h5read and then applying mean or other calculations (except if the data is too big for memory, which is not the case here). I'm not sure it's better, as I have little insight into how it's managed internally.

Hope it helps, good luck,
Gregory

2016-10-20 11:55 GMT+02:00 Jon Alm Eriksen:
> Sorry if this gets reposted, but I think my last mail got lost in the ether.
>
> I am trying to parallelize computation on HDF5 files, but I get only a minor speedup from the parallelization. I thought that I could access the same file from multiple threads when I open the file in read mode, but it seems like it is faster to access data in different files. Is there anything I can do to improve the performance of my code?
> Consider the following two files:
>
> myparallel.jl
> ```julia
> module MyParallel
> using HDF5
>
> const N_objs = 20
> const N_floats = 100_000_000
> const singlefn = "junk.h5"
> const multiplefns = ["junk_$(i).h5" for i in 1:N_objs]
>
> function write_junk_singlefile()
>     h5open(singlefn, "w") do f
>         for i in 1:N_objs
>             write(f, "data_$(i)", rand(N_floats))
>         end
>     end
> end
>
> function write_junk_multifiles()
>     for i in 1:N_objs
>         h5open(f -> write(f, "data", rand(N_floats)), multiplefns[i], "w")
>     end
> end
>
> calc_singlefile(obj_id::Int) = h5open(f -> mean(f["data_$(obj_id)"][:]), singlefn, "r")
> calc_singlefile(obj_ids::AbstractArray{Int}) = [calc_singlefile(i) for i in obj_ids]
> calc_singlefile_parallel(obj_ids::AbstractArray{Int}) = pmap(calc_singlefile, obj_ids)
>
> calc_multifiles(file_id::Int) = h5open(f -> mean(f["data"][:]), multiplefns[file_id], "r")
> calc_multifiles(file_ids::AbstractArray{Int}) = [calc_multifiles(i) for i in file_ids]
> calc_multifiles_parallel(file_ids::AbstractArray{Int}) = pmap(calc_multifiles, file_ids)
>
> export N_objs, write_junk_singlefile, write_junk_multifiles, calc_singlefile,
>     calc_singlefile_parallel, calc_multifiles, calc_multifiles_parallel
> end
> ```
>
> and
>
> run_paralleltest.jl
> ```julia
> addprocs(7)
> @everywhere include("myparallel.jl")
> @everywhere using MyParallel
>
> #write_junk_singlefile()
> #write_junk_multifiles()
>
> println("singlefile single core processing:")
> @time calc_singlefile(1:N_objs)
> println("multifiles single core processing:")
> @time calc_multifiles(1:N_objs)
> println("singlefile multi core processing:")
> @time calc_singlefile_parallel(1:N_objs)
> println("multifiles multi core processing:")
> @time calc_multifiles_parallel(1:N_objs)
> ```
>
> When I run `julia run_paralleltest.jl` on my MacBook Pro, I get the following results:
>
> ```
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> WARNING: replacing module HDF5
> singlefile single core processing:
>  16.758472 seconds (762.67 k allocations: 14.933 GB, 7.55% gc time)
> multifiles single core processing:
>  15.962462 seconds (27.59 k allocations: 14.902 GB, 7.74% gc time)
> singlefile multi core processing:
>  19.293451 seconds (5.73 M allocations: 241.379 MB, 1.08% gc time)
> multifiles multi core processing:
>  13.152688 seconds (3.15 k allocations: 204.315 KB)
> ```
>
> Is there a way to get better performance from the parallel calculations? Also, are the warnings expected?
>
> best,
> Jon Alm Eriksen
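The "compute after the close" suggestion from the reply can be sketched independently of HDF5 with plain I/O (the function name is hypothetical; with HDF5.jl, the same shape is `mean(h5read(path, name))`, since h5read closes the file before returning):

```julia
# read everything while the file is open, release the handle,
# then do the computation outside the open/close window
function mean_of_file(path::AbstractString)
    data = open(path, "r") do io
        [parse(Float64, line) for line in eachline(io)]
    end                        # file handle is released here
    sum(data) / length(data)   # computation happens after the close
end
```

The point is that the file is held open only for the raw read, so concurrent readers (or pmap workers) contend for it as briefly as possible.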
Re: [julia-users] UTF8, how to procesed text data
Hi, there is a library that lets you specify the encoding when opening files: https://github.com/nalimilan/StringEncodings.jl

2016-10-20 19:32 GMT+02:00 :
> Julia ver 5 is OK, but there is a new problem with the space after ś, ć. More in a new post...
>
> Paul
>
> On Wednesday, October 19, 2016 at 15:04:13 UTC+2, Milan Bouchet-Valat wrote:
>>
>> On Wednesday, October 19, 2016 at 06:02 -0700, program...@gmail.com wrote:
>> > Version 0.3.12, update to 5?
>> Yes. 0.3.x versions have been unsupported for some time now.
>>
>> Regards
>>
>> > > On Wednesday, October 19, 2016 at 04:46 -0700, program...@gmail.com wrote:
>> > > > The data file is encoded as UTF8, but I can't process this data in Julia. What's wrong?
>> > > >
>> > > > io = open("data.txt")
>> > > >
>> > > > julia> temp = readline(io)
>> > > > "3699778,13,2,gdbiehz jablej gupując szybgi Injehnej dg 26 paździehniga,1\n"
>> > > >
>> > > > julia> temp[61:65]
>> > > > "aźdz"
>> > > >
>> > > > julia> findin(temp[61:65], "d")
>> > > > ERROR: invalid UTF-8 character index
>> > > >  in next at utf8.jl:64
>> > > >  in findin at array.jl:1179
>> > > You didn't say what version of Julia you're using. The bug seems to happen on 0.4.7, but not on 0.5.0, so I'd encourage you to upgrade.
>> > >
>> > > (Note that in general you shouldn't index into strings with arbitrary integers: only values referring to the beginning of a Unicode code point are valid.)
>> > >
>> > > Regards
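On the string-indexing point in the quoted reply, a minimal sketch of safe indexing (the sample string is illustrative): `eachindex` and `nextind` yield only valid code-point boundaries, unlike arbitrary integers:

```julia
s = "paź"                    # 'ź' occupies two bytes in UTF-8

sizeof(s)                    # 4 bytes, but only 3 characters
collect(eachindex(s))        # [1, 2, 3]: the valid character indices
nextind(s, 3)                # 5: the index just past the two-byte 'ź'
# s[4] would throw: byte 4 is inside the encoding of 'ź'
```

Iterating with `eachindex` (or over the characters themselves) avoids the "invalid UTF-8 character index" error shown in the post.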
Re: [julia-users] Re: Function check without @assert
I find assertions really useful for examples in documentation and sometimes in code for devs (far better than comments). http://c2.com/cgi/wiki?DoNotUseAssertions

2016-04-22 8:45 GMT+02:00 Lyndon White:
> *TL;DR: Assert is no slower than exception throwing; the problem is that people expect it to be a `nop` if optimisation is turned on, and a `nop` is much faster than an assert* (*I haven't actually tested this).
>
> It's not that assert is slow.
> It's that traditionally (and we are talking going back a very long time), `assert` statements are removed by the optimizer. (E.g. it is one of the only things done by `cpython -o`.) That's because `asserts` are error messages targeted at the developer. In theory they will never be triggered at run time, in "shipped" code.
>
> This is different from exceptions. Exceptions may be thrown (and not caught) in shipped code. It's bad, but sometimes unavoidable, and it is better that the exception is thrown (and not caught) than to allow the program to run in an unknown state (potentially leading to data corruption, or infinite looping, or all kinds of problems).
>
> By contrast, if the exception was instead an assert, then in the shipped version it isn't there, and so the program runs in an unknown state.
>
> Currently, running `julia --optimise` does not actually remove asserts. This is generally considered a bad thing: it breaks the expectation that they will be gone in the optimised version, which is very bad if you have an `assert` in a key inner loop.
>
> However, some people are writing code that uses asserts to check that a function is being used correctly, because they are lazy (aren't we all? I know I do this). If this code is shipped in optimized form (without asserts), then the users of their code (if it is a library) are going to be able to get into unintended states by accident.
>
> Now what people should do, as in your example, is check conditions and throw meaningful exceptions. Your way works, as does using an if statement (I'd use an if statement, but that is just me). But as a hack for those of us who are lazy, a @check macro was proposed, which would not be removed by the optimizer.
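The distinction above can be sketched in Julia (the function is a hypothetical example, not from the thread): the thrown exception guards callers and must survive shipping, while `@assert` states a developer-facing invariant that an optimizer could legitimately strip:

```julia
function safe_sqrt(x::Float64)
    # caller-facing check: must survive in shipped/optimized code
    x < 0 && throw(ArgumentError("x must be non-negative"))
    r = sqrt(x)
    # developer-facing invariant: conceptually removable by an optimizer
    @assert r * r ≈ x
    r
end
```

If the `ArgumentError` were written as an `@assert` instead, an optimizer that strips asserts would let negative inputs through silently, which is exactly the failure mode described above.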
[julia-users] Re: Stuck with using and LOAD_PATH
Thanks for the replies (sorry, I was not notified). The matter was about running tests often (BDD), and Pkg doesn't seem to be the solution. Finally I resolved it, but I still don't understand what the matter was. :) I've defined all modules in package.jl with exports and includes, then used "using Package.ModuleA.TypeA" in ModuleB, for example.

On Monday, March 21, 2016 at 04:13:24 UTC+1, James Dang wrote:
>
> If MyDir is on the LOAD_PATH, this works for me:
>
> ```
> LOAD_PATH/
>     Foo/
>         src/
>             Foo.jl
> ```
>
> Foo.jl:
> ```julia
> module Foo
> ...
> end
> ```
>
> in separate code:
> ```julia
> import Foo
> ```
>
> On Wednesday, March 16, 2016 at 9:42:37 PM UTC+8, Gregory Salvan wrote:
>>
>> Hi,
>> I don't understand the way of importing/using modules, or including files. I've looked at the documentation (FAQ and modules) and in this list for old subjects, and tested all the solutions I found.
>>
>> With julia version 0.4.3 on Gentoo.
>>
>> The directory structure looks like:
>>
>> ```
>> package.jl (directory)
>> |_ src
>> |  |_ fileA.jl
>> |  |_ fileB.jl
>> |  |_ fileC.jl
>> |  |_ fileD.jl
>> |  |_ package.jl
>> |_ test
>>    |_ testA.jl
>> ```
>>
>> The *first issue* I had was with "using" in tests; for example I've tried:
>>
>> ```julia
>> push!(LOAD_PATH, string(dirname(@__FILE__), "/..src")) # just to try - ModuleA is in fileA.jl in ../src
>> using ModuleA
>> # alternatively:
>> # using ModuleA.TypeA1 or using ModuleA: TypeA1...
>>
>> facts("Test ModuleA") do
>>     context("create") do
>>         type_a1 = ModuleA.TypeA1()
>>     end
>> end
>> ```
>>
>> I've tested launching this test with:
>> JULIA_LOAD_PATH=./src/ LOAD_PATH=./src julia --color=yes test/testA.jl
>>
>> (no difference with import either)
>>
>> The error message is:
>> ERROR: LoadError: ArgumentError: ModuleA not found in path
>>
>> Instead of import/using (but that's not what I wanted), it's OK if I use:
>> include("../src/fileA.jl")
>>
>> The *second issue* is when I wanted to define modules in package.jl this way:
>>
>> ```julia
>> module Package
>>     module ModA
>>         include("fileA.jl")
>>         include("fileB.jl")
>>     end
>>     module ModC
>>         include("fileC.jl")
>>     end
>> end
>> ```
>>
>> And have, for example, a fileD.jl included in fileA.jl and fileB.jl (or fileC.jl) with an immutable type. (I've included fileD.jl because I couldn't use "using", but I would prefer "using".) I get an error:
>> ERROR: LoadError: LoadError: LoadError: LoadError: LoadError: invalid redefinition of constant ImmutableType
>>
>> So, as I can't use "using" to avoid the reloading, what can I do? What have I not understood about Julia's import/using and include?
>>
>> NOTE: some things I've tried:
>> - fileD.jl with a module ModuleD and "using", with and without __precompile__
>> - setting LOAD_PATH and JULIA_LOAD_PATH globally with bashrc or passing it to the shell
>> - ...
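For the second issue, a self-contained sketch of the layout the poster describes settling on: the shared type is defined once at the top level of Package, and each submodule reaches it through a relative import instead of re-including fileD.jl (module and type names follow the post; the include calls are replaced by inline definitions so the snippet runs on its own, and `struct` replaces the 0.4-era `immutable`):

```julia
module Package
    # contents of the post's fileD.jl, included exactly once
    struct ImmutableType      # `immutable ImmutableType ... end` on 0.4
        x::Int
    end

    module ModA
        import ..ImmutableType   # relative import from the parent module
        make_a() = ImmutableType(1)
    end

    module ModC
        import ..ImmutableType
        make_c() = ImmutableType(2)
    end
end
```

Because both submodules import the same binding, there is no "invalid redefinition of constant ImmutableType": the type is defined in exactly one place.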
[julia-users] Re: cross-module exports / extending modules
Hi, when you want to add methods to functions from the Base module (like getindex, getfield...), you use "import Base.getindex" and then write a new method with new argument types. For example, to add methods in B.jl to functionA from A.jl:

```julia
import A.functionA

function functionA(...)
end
```

Is this what you were looking for?

On Sunday, March 20, 2016 at 11:25:10 UTC+1, Andreas Lobinger wrote:
>
> Hello colleagues,
>
> I remember a discussion about this, but maybe without a conclusion and maybe without the right keywords.
>
> Let's have module A (from package A.jl) with certain functionality and maybe some types. Now module/package/code B.jl somehow extends A with optional functions. How to put these functions under the A. API? I'm pretty sure exports across modules don't work, but is there somewhere some functionality for this?
>
> Wishing a happy day,
> Andreas
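To make the pattern concrete, here is a small runnable sketch (module, function, and method names are hypothetical): B uses `import`, so its new method attaches to A's generic function rather than creating a separate one, which puts the extension under A's API as asked:

```julia
module A
    export functionA
    functionA(x::Int) = x + 1               # A's original method
end

module B
    import ..A: functionA                   # import, so methods added here extend A's function
    functionA(s::String) = string(s, "!")   # new method for a new argument type
end

# both methods now live on the same generic function, reachable via A
A.functionA(1)
A.functionA("hi")
```

With `using A` alone (no `import`), defining `functionA` in B would instead create a new, unrelated function in B.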
[julia-users] Stuck with using and LOAD_PATH
Hi,
I don't understand the way of importing/using modules, or including files. I've looked at the documentation (FAQ and modules) and in this list for old subjects, and tested all the solutions I found.

With julia version 0.4.3 on Gentoo.

The directory structure looks like:

```
package.jl (directory)
|_ src
|  |_ fileA.jl
|  |_ fileB.jl
|  |_ fileC.jl
|  |_ fileD.jl
|  |_ package.jl
|_ test
   |_ testA.jl
```

The *first issue* I had was with "using" in tests; for example I've tried:

```julia
push!(LOAD_PATH, string(dirname(@__FILE__), "/..src")) # just to try - ModuleA is in fileA.jl in ../src
using ModuleA
# alternatively:
# using ModuleA.TypeA1 or using ModuleA: TypeA1...

facts("Test ModuleA") do
    context("create") do
        type_a1 = ModuleA.TypeA1()
    end
end
```

I've tested launching this test with:

```
JULIA_LOAD_PATH=./src/ LOAD_PATH=./src julia --color=yes test/testA.jl
```

(no difference with import either)

The error message is:
ERROR: LoadError: ArgumentError: ModuleA not found in path

Instead of import/using (but that's not what I wanted), it's OK if I use:
include("../src/fileA.jl")

The *second issue* is when I wanted to define modules in package.jl this way:

```julia
module Package
    module ModA
        include("fileA.jl")
        include("fileB.jl")
    end
    module ModC
        include("fileC.jl")
    end
end
```

And have, for example, a fileD.jl included in fileA.jl and fileB.jl (or fileC.jl) with an immutable type. (I've included fileD.jl because I couldn't use "using", but I would prefer "using".) I get an error:
ERROR: LoadError: LoadError: LoadError: LoadError: LoadError: invalid redefinition of constant ImmutableType

So, as I can't use "using" to avoid the reloading, what can I do? What have I not understood about Julia's import/using and include?

NOTE: some things I've tried:
- fileD.jl with a module ModuleD and "using", with and without __precompile__
- setting LOAD_PATH and JULIA_LOAD_PATH globally with bashrc or passing it to the shell
- ...
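For later readers, two things in the first issue stand out: the LOAD_PATH lookup searches for a file named after the module (ModuleA.jl), so a module defined in fileA.jl is never found, and the posted path string lacks a separator ("/..src" instead of "/../src"). A self-contained sketch under those assumptions, building a working layout in a temporary directory (all names hypothetical):

```julia
# build src/ModuleA.jl in a temp dir, put that directory on LOAD_PATH,
# then load it: the file name must match the module name
dir = mktempdir()
src = joinpath(dir, "src")
mkdir(src)
open(joinpath(src, "ModuleA.jl"), "w") do io
    write(io, """
    module ModuleA
    export TypeA1
    struct TypeA1 end   # `immutable TypeA1 end` on 0.4
    end
    """)
end

push!(LOAD_PATH, src)   # joinpath above cannot drop a separator
@eval using ModuleA     # found as ModuleA.jl on the LOAD_PATH
```

Using `joinpath(dirname(@__FILE__), "..", "src")` in the real test script would avoid the hand-built path string entirely.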