Here is the latest attempt at it https://github.com/harikb/scratchpad1/blob/master/julia2/run4.txt
I did look into Requests.jl. The only probable lead I got is a use of on_body_cb by Requests.jl. Looking inside https://github.com/joyent/http-parser/blob/master/http_parser.c#L1883 , there is some logic where the content-length header may not be used and it may read byte-by-byte. I did confirm my HTTP response header does not have any 'chunked' header.

Thanks
--
Harry

On Saturday, April 25, 2015 at 11:28:01 AM UTC-7, Tim Holy wrote:
>
> Please do read the docs, they are quite thorough:
> http://docs.julialang.org/en/release-0.3/manual/profile/
>
> You should run your code once before you profile, so you don't get all
> those calls to inference. As I'm sure you noticed, it makes the output
> hard to read. Also check out ProfileView.jl.
>
> But yes, you identified some bottleneck lines. I don't know that part of
> julia at all, though, so I'll defer to others.
>
> --Tim
>
> On Saturday, April 25, 2015 10:50:43 AM Harry B wrote:
> > Here is the output of Profile.print()
> >
> > https://github.com/harikb/scratchpad1/blob/master/julia2/run3.txt
> >
> > I don't know how to interpret these results, but I would guess this is
> > where the most time is spent:
> >
> > 10769 stream.jl; stream_wait; line: 263
> > 10774 stream.jl; readavailable; line: 709
> > 10774 stream.jl; wait_readnb; line: 316
> >
> > Is the issue that stream.jl is reading byte by byte? If a Content-Length
> > is available in the response header (and I know it is), it should
> > probably read as one chunk. Again, I am throwing a dart in the dark, so
> > I should probably stop speculating.
> >
> > Any help is appreciated on the next steps.
> >
> > --
> > Harry
> >
> > On Thursday, April 23, 2015 at 5:52:09 PM UTC-7, Tim Holy wrote:
> > > I think it's fair to say that Profile.print() will be quite a lot more
> > > informative---all you're getting is the list of lines visited, not
> > > anything about how much time each one takes.
> > > --Tim
> > >
> > > On Thursday, April 23, 2015 04:19:08 PM Harry B wrote:
> > > > I am trying to profile this code, so here is what I have so far. I
> > > > added the following code to the path taken for the single-process
> > > > mode. I didn't bother with the multi-process one since I didn't know
> > > > how to deal with @profile and remotecall_wait:
> > > >
> > > > @profile processOneFile(3085, 35649, filename)
> > > > bt, lidict = Profile.retrieve()
> > > > println("Profiling done")
> > > > for (k,v) in lidict
> > > >     println(v)
> > > > end
> > > >
> > > > Output is here
> > > > https://github.com/harikb/scratchpad1/blob/master/julia2/run1.txt
> > > > (Ran with julia 0.3.7)
> > > > another run
> > > > https://github.com/harikb/scratchpad1/blob/master/julia2/run2.txt
> > > > (Ran with julia-debug 0.3.7) - in case it gave better results.
> > > >
> > > > However, there are quite a few lines marked without line or file
> > > > info.
> > > >
> > > > On Wednesday, April 22, 2015 at 2:44:13 AM UTC-7, Yuuki Soho wrote:
> > > > > If I understand correctly now you are doing only 5 requests at the
> > > > > same time? It seems to me you could do much more.
> > > >
> > > > But that hides the inefficiency, at whatever level it exists. The Go
> > > > program also uses only 5 parallel connections.
> > > >
> > > > On Wednesday, April 22, 2015 at 1:15:20 PM UTC-7, Stefan Karpinski
> > > > wrote:
> > > > > Honestly, I'm pretty pleased with that performance. This kind of
> > > > > thing is Go's bread and butter – being within a factor of 2 of Go
> > > > > at something like this is really good. That said, if you do figure
> > > > > out anything that's a bottleneck here, please file issues – there's
> > > > > no fundamental reason Julia can't be just as fast or faster than
> > > > > any other language at this.
> > > > Stefan, yes, it is about 2x if I subtract the 10 seconds or so
> > > > (whatever it appears to me) as the startup time. I am running Julia
> > > > 0.3.7 on a box with a deprecated GnuTLS (RHEL). The deprecation
> > > > warning msg comes about 8 seconds into the run and I wait another 2
> > > > seconds before I see the first print statement from my code
> > > > ("Started N processes" message). My calculations already exclude
> > > > these 10 seconds.
> > > > I wonder if I would get better startup time with 0.4, but
> > > > Requests.jl is not compatible with it (nor do I find any other
> > > > library for 0.4). I will try 0.4 again and see if I can fix
> > > > Requests.jl.
> > > >
> > > > Any help is appreciated on further analysis of the profile output.
> > > >
> > > > Thanks
> > > >
> > > > > The use of Requests.jl makes this very hard to benchmark
> > > > > accurately since it introduces (non-measurable) dependencies on
> > > > > network resources.
> > > > >
> > > > > If you @profile the function, can you tell where it's spending
> > > > > most of its time?
> > > > >
> > > > > On Tuesday, April 21, 2015 at 2:12:52 PM UTC-7, Harry B wrote:
> > > > > > Hello,
> > > > > >
> > > > > > I had the need to take a text file with several million lines,
> > > > > > construct a URL with parameters picked from the tab-delimited
> > > > > > file, and fire them one after the other. After I read about
> > > > > > Julia, I decided to try this in Julia. However my initial
> > > > > > implementation turned out to be slow and I was getting close to
> > > > > > my deadline. I then kept the Julia implementation aside and
> > > > > > wrote the same thing in Go, my other favorite language. The Go
> > > > > > version is at least twice as fast as the Julia version.
> > > > > > Now the task/deadline is over, I am coming back to the Julia
> > > > > > version to see what I did wrong.
> > > > > >
> > > > > > The Go and Julia versions are not written alike. In Go, I have
> > > > > > just one main thread reading a file and 5 go-routines waiting on
> > > > > > a channel; one of them will get the 'line/job', fire off the
> > > > > > url, wait for a response, parse the JSON, look for an id in a
> > > > > > specific place, and go back to wait for more items from the
> > > > > > channel.
> > > > > >
> > > > > > The Julia code is very similar to the one discussed in the
> > > > > > thread quoted below. I invoke Julia with -p 5 and then have
> > > > > > *each* process open the file and read all lines. However, each
> > > > > > process processes only 1/5th of the lines and skips the others.
> > > > > > It is a slight modification of what was discussed in this thread
> > > > > > https://groups.google.com/d/msg/julia-users/Kr8vGwdXcJA/8ynOghlYaGgJ
> > > > > >
> > > > > > Julia code (no server URL or source for that though):
> > > > > > https://github.com/harikb/scratchpad1/tree/master/julia2
> > > > > > The server could be anything that returns a static JSON.
> > > > > >
> > > > > > The files will entirely fit in the filesystem cache, and I am
> > > > > > running this on a fairly large system (procinfo says 24 cores,
> > > > > > 100G ram, 50G free even after subtracting cached), while the
> > > > > > input file is only 875K. This should ideally mean I can read
> > > > > > the files several times in any programming language and not
> > > > > > skip a beat. wc -l on the file takes only 0m0.002s. Any
> > > > > > log/output is written to a fusion-io based flash disk. All
> > > > > > fairly high end.
> > > > > >
> > > > > > https://github.com/harikb/scratchpad1/tree/master/julia2
> > > > > >
> > > > > > At this point, considering the machine is reasonably good, the
> > > > > > only bottleneck should be the time the URL firing takes (it is
> > > > > > a GET request, but the other side has some processing to do) or
> > > > > > the subsequent JSON parsing.
> > > > > >
> > > > > > Where do I go from here? How do I find out (a) whether HTTP
> > > > > > connections are being re-used by the underlying library? I am
> > > > > > using this library: https://github.com/JuliaWeb/Requests.jl
> > > > > > If not, that could explain this difference. And (b) how do I
> > > > > > profile this code? I am using julia 0.3.7 (since Requests.jl
> > > > > > does not work with 0.4 nightly).
> > > > > >
> > > > > > Any help is appreciated.
> > > > > > Thanks
