Please do read the docs, they are quite thorough:
http://docs.julialang.org/en/release-0.3/manual/profile/

You should run your code once before you profile, so you don't capture all 
those calls to type inference. As I'm sure you noticed, they make the output 
hard to read. Also check out ProfileView.jl.
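
A typical warm-up-then-profile session looks something like this (reusing 
your processOneFile call from below; a sketch, not tested against your code):

    # warm-up run so compilation/inference isn't in the profile
    processOneFile(3085, 35649, filename)
    Profile.clear()                 # discard any samples collected so far
    @profile processOneFile(3085, 35649, filename)
    Profile.print()                 # sample counts per line, by call depth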

But yes, you identified some bottleneck lines. I don't know that part of julia 
at all, though, so I'll defer to others.

--Tim

On Saturday, April 25, 2015 10:50:43 AM Harry B wrote:
> Here is the output of Profile.print()
> 
> https://github.com/harikb/scratchpad1/blob/master/julia2/run3.txt
> 
> I don't know how to interpret these results, but I would guess this is
> where the most time is spent
> 
>               10769 stream.jl; stream_wait; line: 263
>             10774 stream.jl; readavailable; line: 709
>              10774 stream.jl; wait_readnb; line: 316
> 
> Is the issue that stream.jl is reading byte by byte? If a Content-Length
> is available in the response header (and I know it is), it should probably
> read the body as one chunk.
> Again, I am throwing a dart in the dark, so I should probably stop
> speculating.
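> 
> If the Content-Length is known up front, the body could in principle be
> read in one call instead of byte by byte. In base Julia 0.3 that would
> look something like this (a sketch only; the headers dict is hypothetical
> and this is not what Requests.jl actually does internally):
> 
>     n = int(headers["Content-Length"])   # hypothetical header lookup
>     body = readbytes(io, n)              # one bulk read of n bytes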
> 
> Any help is appreciated on the next steps
> 
> --
> Harry
> 
> On Thursday, April 23, 2015 at 5:52:09 PM UTC-7, Tim Holy wrote:
> > I think it's fair to say that Profile.print() will be quite a lot more
> > informative---all you're getting now is the list of lines visited, not
> > anything about how much time each one takes.
> > 
> > --Tim
> > 
> > On Thursday, April 23, 2015 04:19:08 PM Harry B wrote:
> > > I am trying to profile this code, so here is what I have so far. I added
> > > the following code to the path taken for the single-process mode.
> > > I didn't bother with the multi-process one since I didn't know how to
> > > deal with @profile and remotecall_wait.
> > > 
> > >     @profile processOneFile(3085, 35649, filename)
> > >     bt, lidict = Profile.retrieve()
> > >     println("Profiling done")
> > >     for (k,v) in lidict
> > >         println(v)
> > >     end
> > > 
> > > Output is here:
> > > https://github.com/harikb/scratchpad1/blob/master/julia2/run1.txt
> > > (run with julia 0.3.7). Another run:
> > > https://github.com/harikb/scratchpad1/blob/master/julia2/run2.txt
> > > (run with julia-debug 0.3.7), in case it gave better results.
> > > 
> > > However, quite a few lines are marked without line or file info.
> > > 
> > > On Wednesday, April 22, 2015 at 2:44:13 AM UTC-7, Yuuki Soho wrote:
> > >     If I understand correctly, you are now doing only 5 requests at
> > >     the same time? It seems to me you could do many more.
> > > 
> > > But that hides the inefficiency, at whatever level it exists. The Go
> > > program also uses only 5 parallel connections.
> > > 
> > > On Wednesday, April 22, 2015 at 1:15:20 PM UTC-7, Stefan Karpinski wrote:
> > >     Honestly, I'm pretty pleased with that performance. This kind of
> > >     thing is Go's bread and butter – being within a factor of 2 of Go
> > >     at something like this is really good. That said, if you do figure
> > >     out anything that's a bottleneck here, please file issues – there's
> > >     no fundamental reason Julia can't be just as fast or faster than
> > >     any other language at this.
> > > 
> > > Stefan, yes, it is about 2x if I subtract the 10 seconds or so (as it
> > > appears to me) of startup time. I am running Julia 0.3.7 on a box with
> > > a deprecated GnuTLS (RHEL). The deprecation warning msg comes about 8
> > > seconds into the run, and I wait another 2 seconds before I see the
> > > first print statement from my code (the "Started N processes" message).
> > > My calculations already exclude these 10 seconds.
> > > I wonder if I would get better startup time with 0.4, but Requests.jl
> > > is not compatible with it (nor do I find any other HTTP library for
> > > 0.4). I will try 0.4 again and see if I can fix Requests.jl.
> > > 
> > > Any help is appreciated on further analysis of the profile output.
> > > 
> > > Thanks
> > > 
> > > > The use of Requests.jl makes this very hard to benchmark accurately,
> > > > since it introduces (non-measurable) dependencies on network
> > > > resources.
> > > > 
> > > > If you @profile the function, can you tell where it's spending most
> > > > of its time?
> > > > 
> > > > On Tuesday, April 21, 2015 at 2:12:52 PM UTC-7, Harry B wrote:
> > > >> Hello,
> > > >> 
> > > >> I had the need to take a text file with several million lines,
> > > >> construct a URL with parameters picked from the tab-delimited file,
> > > >> and fire the URLs one after the other. After I read about Julia, I
> > > >> decided to try this in Julia.
> > > >> However, my initial implementation turned out to be slow and I was
> > > >> getting close to my deadline. I then set the Julia implementation
> > > >> aside and wrote the same thing in Go, my other favorite language.
> > > >> The Go version is at least twice as fast as the Julia version. Now
> > > >> that the task/deadline is over, I am coming back to the Julia
> > > >> version to see what I did wrong.
> > > >> 
> > > >> The Go and Julia versions are not written alike. In Go, I have just
> > > >> one main thread reading the file and 5 goroutines waiting on a
> > > >> channel; one of them gets the line/job, fires off the URL, waits for
> > > >> a response, parses the JSON, looks for an id in a specific place,
> > > >> and goes back to waiting for more items from the channel.
> > > >> 
> > > >> The Julia code is very similar to the one discussed in the thread
> > > >> quoted below. I invoke Julia with -p 5 and then have *each* process
> > > >> open the file and read all lines; however, each process processes
> > > >> only 1/5th of the lines and skips the others. It is a slight
> > > >> modification of what was discussed in this thread:
> > > >> https://groups.google.com/d/msg/julia-users/Kr8vGwdXcJA/8ynOghlYaGgJ
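> > > >> 
> > > >> A minimal sketch of that splitting pattern (processLine and
> > > >> processMyShare are placeholder names, not the actual code):
> > > >> 
> > > >>     function processMyShare(me, nworkers, filename)
> > > >>         for (i, line) in enumerate(eachline(open(filename)))
> > > >>             # each worker keeps only every nworkers-th line
> > > >>             i % nworkers == me - 1 || continue
> > > >>             processLine(line)
> > > >>         end
> > > >>     end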
> > > >> 
> > > >> Julia code (no server URL or source for that, though):
> > > >> https://github.com/harikb/scratchpad1/tree/master/julia2
> > > >> The server could be anything that returns a static JSON.
> > > >> 
> > > >> The file will fit entirely in the filesystem cache, and I am running
> > > >> this on a fairly large system (/proc/cpuinfo says 24 cores; 100G
> > > >> RAM, 50G of it free even after subtracting cache). The input file is
> > > >> only 875K. This should ideally mean I can read the file several
> > > >> times in any programming language without skipping a beat; wc -l on
> > > >> the file takes only 0m0.002s. Any log/output is written to a
> > > >> fusion-io based flash disk. All fairly high end.
> > > >> 
> > > >> At this point, considering the machine is reasonably good, the only
> > > >> bottleneck should be the time the URL requests take (each is a GET
> > > >> request, but the other side has some processing to do) or the
> > > >> subsequent JSON parsing.
> > 
> > > >> Where do I go from here? How do I find out whether (a) HTTP
> > > >> connections are being re-used by the underlying library? I am using
> > > >> this library:
> > > >> https://github.com/JuliaWeb/Requests.jl
> > > >> If not, that could explain the difference. And (b) how do I profile
> > > >> this code? I am using julia 0.3.7 (since Requests.jl does not work
> > > >> with the 0.4 nightly).
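> > > >> 
> > > >> One crude connection-reuse check (a sketch; assumes Requests.jl
> > > >> exports get and that url holds one of the request URLs): time two
> > > >> consecutive requests to the same host. If the second is much faster,
> > > >> connection setup is likely being amortized:
> > > >> 
> > > >>     using Requests
> > > >>     @time get(url)   # first request: includes connection setup
> > > >>     @time get(url)   # much faster here suggests reuse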
> > 
> > > >> Any help is appreciated.
> > > >> Thanks
