Here is the latest attempt at it https://github.com/harikb/scratchpad1/blob/master/julia2/run4.txt
I did look into Requests.jl. The only probable lead I got is a use of on_body_cb by Requests.jl. Looking inside https://github.com/joyent/http-parser/blob/master/http_parser.c#L1883 , there is some logic where the content-length header may not be used and it may read byte-by-byte. I did confirm my HTTP response header does not have any 'chunked' header.

Thanks
--
Harry

On Saturday, April 25, 2015 at 11:28:01 AM UTC-7, Tim Holy wrote:
>
> Please do read the docs, they are quite thorough:
> http://docs.julialang.org/en/release-0.3/manual/profile/
>
> You should run your code once before you profile, so you don't get all
> those calls to inference. As I'm sure you noticed, it makes the output
> hard to read. Also check out ProfileView.jl.
>
> But yes, you identified some bottleneck lines. I don't know that part of
> julia at all, though, so I'll defer to others.
>
> --Tim
>
> On Saturday, April 25, 2015 10:50:43 AM Harry B wrote:
> > Here is the output of Profile.print()
> >
> > https://github.com/harikb/scratchpad1/blob/master/julia2/run3.txt
> >
> > I don't know how to interpret these results, but I would guess this is
> > where the most time is spent:
> >
> > 10769 stream.jl; stream_wait; line: 263
> > 10774 stream.jl; readavailable; line: 709
> > 10774 stream.jl; wait_readnb; line: 316
> >
> > Is the issue that stream.jl is reading byte by byte? If a Content-Length
> > is available in the response header (and I know it is), it should
> > probably read as one chunk. Again, I am throwing a dart in the dark, so
> > I should probably stop speculating.
> >
> > Any help is appreciated on the next steps.
> >
> > --
> > Harry
> >
> > On Thursday, April 23, 2015 at 5:52:09 PM UTC-7, Tim Holy wrote:
> > > I think it's fair to say that Profile.print() will be quite a lot more
> > > informative---all you're getting is the list of lines visited, not
> > > anything about how much time each one takes.
> > > --Tim
> > >
> > > On Thursday, April 23, 2015 04:19:08 PM Harry B wrote:
> > > > I am trying to profile this code, so here is what I have so far. I
> > > > added the following code to the path taken for the single-process
> > > > mode. I didn't bother with the multi-process one since I didn't know
> > > > how to deal with @profile and remotecall_wait:
> > > >
> > > > @profile processOneFile(3085, 35649, filename)
> > > > bt, lidict = Profile.retrieve()
> > > > println("Profiling done")
> > > > for (k,v) in lidict
> > > >     println(v)
> > > > end
> > > >
> > > > Output is here
> > > > https://github.com/harikb/scratchpad1/blob/master/julia2/run1.txt
> > > > (Ran with julia 0.3.7)
> > > > another run
> > > > https://github.com/harikb/scratchpad1/blob/master/julia2/run2.txt
> > > > (Ran with julia-debug 0.3.7) - in case it gave better results.
> > > >
> > > > However, there are quite a few lines marked without line or file
> > > > info.
> > > >
> > > > On Wednesday, April 22, 2015 at 2:44:13 AM UTC-7, Yuuki Soho wrote:
> > > > > If I understand correctly now you are doing only 5 requests at the
> > > > > same time? It seems to me you could do much more.
> > > >
> > > > But that hides the inefficiency, at whatever level it exists. The Go
> > > > program also uses only 5 parallel connections.
> > > >
> > > > On Wednesday, April 22, 2015 at 1:15:20 PM UTC-7, Stefan Karpinski
> > > > wrote:
> > > > > Honestly, I'm pretty pleased with that performance. This kind of
> > > > > thing is Go's bread and butter – being within a factor of 2 of Go
> > > > > at something like this is really good. That said, if you do figure
> > > > > out anything that's a bottleneck here, please file issues – there's
> > > > > no fundamental reason Julia can't be just as fast or faster than
> > > > > any other language at this.
> > > > Stefan, yes, it is about 2x if I subtract the 10 seconds or so
> > > > (whatever it appears to me) as the startup time. I am running Julia
> > > > 0.3.7 on a box with a deprecated GnuTLS (RHEL). The deprecation
> > > > warning msg comes about 8 seconds into the run and I wait another 2
> > > > seconds before I see the first print statement from my code
> > > > ("Started N processes" message). My calculations already exclude
> > > > these 10 seconds.
> > > > I wonder if I would get better startup time with 0.4, but
> > > > Requests.jl is not compatible with it (nor do I find any other
> > > > library for 0.4). I will try 0.4 again and see if I can fix
> > > > Requests.jl.
> > > >
> > > > Any help is appreciated on further analysis of the profile output.
> > > >
> > > > Thanks
> > > >
> > > > > The use of Requests.jl makes this very hard to benchmark
> > > > > accurately since it introduces (non-measurable) dependencies on
> > > > > network resources.
> > > > >
> > > > > If you @profile the function, can you tell where it's spending
> > > > > most of its time?
> > > > >
> > > > > On Tuesday, April 21, 2015 at 2:12:52 PM UTC-7, Harry B wrote:
> > > > > > Hello,
> > > > > >
> > > > > > I had the need to take a text file with several million lines,
> > > > > > construct a URL with parameters picked from the tab-delimited
> > > > > > file, and fire them one after the other. After I read about
> > > > > > Julia, I decided to try this in Julia. However my initial
> > > > > > implementation turned out to be slow and I was getting close to
> > > > > > my deadline. I then kept the Julia implementation aside and
> > > > > > wrote the same thing in Go, my other favorite language. The Go
> > > > > > version is at least twice as fast as the Julia version.
> > > > > > Now the task/deadline is over, I am coming back to the Julia
> > > > > > version to see what I did wrong.
> > > > > >
> > > > > > The Go and Julia versions are not written alike. In Go, I have
> > > > > > just one main thread reading a file and 5 go-routines waiting on
> > > > > > a channel; one of them will get the 'line/job', fire off the
> > > > > > url, wait for a response, parse the JSON, look for an id in a
> > > > > > specific place, and go back to wait for more items from the
> > > > > > channel.
> > > > > >
> > > > > > The Julia code is very similar to the one discussed in the
> > > > > > thread quoted below. I invoke Julia with -p 5 and then have
> > > > > > *each* process open the file and read all lines. However, each
> > > > > > process processes only 1/5th of the lines and skips the others.
> > > > > > It is a slight modification of what was discussed in this thread
> > > > > > https://groups.google.com/d/msg/julia-users/Kr8vGwdXcJA/8ynOghlYaGgJ
> > > > > >
> > > > > > Julia code (no server URL or source for that though):
> > > > > > https://github.com/harikb/scratchpad1/tree/master/julia2
> > > > > > The server could be anything that returns a static JSON.
> > > > > >
> > > > > > The files will entirely fit in the filesystem cache, and I am
> > > > > > running this on a fairly large system (procinfo says 24 cores,
> > > > > > 100G ram, 50G free even after subtracting cached), while the
> > > > > > input file is only 875K. This should ideally mean I can read
> > > > > > the files several times in any programming language and not
> > > > > > skip a beat. wc -l on the file takes only 0m0.002s. Any
> > > > > > log/output is written to a fusion-io based flash disk. All
> > > > > > fairly high end.
> > > > > >
> > > > > > https://github.com/harikb/scratchpad1/tree/master/julia2
> > > > > >
> > > > > > At this point, considering the machine is reasonably good, the
> > > > > > only bottleneck should be the time the URL firing takes (it is
> > > > > > a GET request, but the other side has some processing to do) or
> > > > > > the subsequent JSON parsing.
> > > > > >
> > > > > > Where do I go from here? How do I find out (a) whether HTTP
> > > > > > connections are being re-used by the underlying library? I am
> > > > > > using this library: https://github.com/JuliaWeb/Requests.jl
> > > > > > If not, that could explain this difference. And (b) how do I
> > > > > > profile this code? I am using julia 0.3.7 (since Requests.jl
> > > > > > does not work with 0.4 nightly).
> > > > > >
> > > > > > Any help is appreciated.
> > > > > > Thanks
