I'm writing a kind of web scanner that should retrieve and analyze about 100k URLs as fast as possible. Of course, it will take time regardless, but I'm looking for a way to utilize my CPUs and network as fully as possible.
My initial approach was to add all available processors, pack the URLs into tasks, and run those tasks in parallel:

```julia
using Requests

urls = ...

@time @sync @parallel for url in urls
    resp = get(url)
    println("Status: $(resp.status)")
end
```

My assumption was that 100k tasks would be created, each task would execute its GET request and, since this is an IO operation, free the current thread for other tasks. From the logs, however, I see that each worker executes its tasks one by one, waiting every time for the GET request to finish.

So how do I start 100k requests in parallel? (100k here is just an example; I can easily split them into chunks of 10k, so system limits and an overloaded CPU/network are not the issue; the issue is their *underutilization*.)

Thanks
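For reference, here is roughly the behavior I was imagining, sketched as a single-process version where each request runs in its own green task via `@async` (I'm assuming `Requests.get` yields to the task scheduler while waiting on the socket, so all requests would be in flight concurrently; the `fetch_all` helper name is just for illustration):

```julia
using Requests

# Sketch of the concurrency I expected: @sync waits for all enclosed
# @async tasks, and each task should yield during network IO instead
# of blocking the whole worker.
function fetch_all(urls)
    @sync for url in urls
        @async begin
            resp = get(url)
            println("Status: $(resp.status)")
        end
    end
end
```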