Thanks for the report, Andrei. Would you mind filing this as an issue at https://github.com/JuliaWeb/Requests.jl?

On Mon, Aug 24, 2015 at 9:17 AM, Andrei Zh <[email protected]> wrote:

> Jonathan, thanks for your support. So far I noticed that DNS gives a pretty
> large delay. E.g. resolving IP addresses for 1000 URLs took 80 seconds in
> serial code and 26 seconds in multi-task code:
>
> Serial execution:
>
> julia> @time for url in urls
>            begin
>                Base.getaddrinfo(URI(url).host)
>            end
>        end
> elapsed time: 80.071810293 seconds (732400 bytes allocated)
>
> Multitask execution:
>
> julia> @time @sync for url in urls
>            @async begin
>                Base.getaddrinfo(URI(url).host)
>            end
>        end
> elapsed time: 26.241893516 seconds (4277968 bytes allocated)
>
> So I'll try to pre-resolve IPs and test again.
>
> On Monday, August 24, 2015 at 4:01:44 PM UTC+3, Jonathan Malmaud wrote:
>
>> As one of the maintainers of Requests.jl, I'm especially interested in
>> its use for high-performance applications, so don't hesitate to file an
>> issue if it gives you any performance problems.
>>
>> On Sunday, August 23, 2015 at 7:40:08 PM UTC-4, Andrei Zh wrote:
>>>
>>> Hi Steven,
>>>
>>> thanks for your answer! It turns out I misunderstood @async a long time
>>> ago, assuming it also makes a remote call to other processes and thus
>>> introduces true multi-tasking. So now I need to rethink my approach before
>>> going further.
>>>
>>> Just to clarify: my goal is to perform as many requests as possible at
>>> the same time, so I want to use both multiple processes (to start several
>>> requests on several cores in parallel) and tasks (to launch new requests
>>> while old ones are still waiting for IO to complete).
>>>
>>> So I will update my approach and come back with results or new
>>> questions.
>>>
>>> On Monday, August 24, 2015 at 2:13:23 AM UTC+3, Steven G. Johnson wrote:
>>>>
>>>> @parallel in Julia is for executing separate parallel processes (true
>>>> multi-tasking, with separate address spaces). @async is for "tasks", which
>>>> are "green threads" and represent cooperative multitasking (within the same
>>>> process and the same address space).
>>>>
>>>> I/O in Julia is asynchronous: while one task is blocked waiting on
>>>> I/O, another task will wake up and start running. (This is based on the
>>>> libuv library, which is designed for high-performance asynchronous I/O.)
>>>>
>>>> The first question is whether you want to fetch URLs in separate OS
>>>> processes, or you want to use green threads within the same process. It
>>>> sounds like you want the latter, in which case @async is the right thing.
>>>>
>>>> The second question is whether something about the Requests.jl package
>>>> is serializing things somehow; for that you might file an issue at
>>>> Requests.jl.
