[ https://issues.apache.org/jira/browse/AIRAVATA-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16101724#comment-16101724 ]
Marcus Christie commented on AIRAVATA-2492:
-------------------------------------------

Some things to check:
* What is the network connection speed between gw153 and alamo, and between gw153 and uslims3?
** https://askubuntu.com/questions/7976/how-do-you-test-the-network-speed-betwen-two-boxes
* How fast is an scp from alamo->gw153 and from gw153->uslims3? Both hops matter because gfac does a "fake" third party transfer where the data streams through gfac. This should give us a theoretical upper bound on transfer speed.
* Historically, how fast are transfer speeds for gfac? Is there a way to query for them?
* From [~marpierc]: look into tweaking the buffer size in gfac for the transfer

> File transfer slow from alamo.uthscsa.edu to uslims3.uthscsa.edu
> ----------------------------------------------------------------
>
>                 Key: AIRAVATA-2492
>                 URL: https://issues.apache.org/jira/browse/AIRAVATA-2492
>             Project: Airavata
>          Issue Type: Bug
>            Reporter: Marcus Christie
>            Assignee: Marcus Christie
>
> Experiment ids in ultrascan.scigap.org:
> US3-AIRA_f200b78e-5d86-499c-b2e6-c3b6fbb3bfa3
> US3-AIRA_b69482ba-b34f-4809-9274-bcda71b8dccc
> Hipchat discussion:
> {quote}
> [11:27 PM] Gary Gorbet: On Alamo, 20 minutes or more elapse from end of job until Airavata reports a job status of COMPLETE. Can't *something* be done to speed that up!
> ---- Wednesday July 26, 2017 ----
> [9:31 AM] Eroma Abeysinghe: @gary sorry for this. Currently all experiments are completed, and two are running in alamo. But we found two which took an unusually long time to transfer the files. We are investigating. Will keep you posted.
> [9:51 AM] Marcus Christie: @gary here's what we've been able to confirm: gfac is getting the completed email from the scheduler and processing it immediately. Airavata doesn't mark an experiment as being completed until the output data staging completes. The output data staging for the jobs you gave us to look at took a long time to complete.
> [9:52 AM] Marlon Pierce: You can test network speed in various ways (like https://askubuntu.com/questions/7976/how-do-you-test-the-network-speed-betwen-two-boxes, which I googled).
> [9:53 AM] Marlon Pierce: It won't fix the problem, but it may help network admins diagnose the problem
> [9:53 AM] Marlon Pierce: I'm assuming (maybe incorrectly) that this is an issue between Alamo and our servers
> [9:54 AM] Marcus Christie: This is partly because transfers from alamo to lims are happening very slowly, by my rough calculations at about 200kb/s. But also the analysis-results.tar file is much larger in these experiments. One example we looked at was about 200 MB and another was about 1.13 GB. Looking through the logs back to April, this is historically very large for an analysis-results.tar file.
> [9:54 AM] Marlon Pierce: But we can confirm this by looking at other servers
> [9:54 AM] Marcus Christie: @marlon this is between alamo and lims. Do we have a login on lims?
> [9:56 AM] Marlon Pierce: Oh...where is the LIMS server? At UTHSCSA?
> [9:57 AM] Marlon Pierce: I think it is.
> [9:58 AM] Marcus Christie: @marlon uslims3.uthscsa.edu
> [9:59 AM] Gary Gorbet: That other job completed. But now there is a new, critical job that FINISHED 40 minutes ago. Airavata job status is EXECUTING.
> [9:59 AM] Marcus Christie: It would be nice if we could run a command line scp transfer test from alamo to uslims3 just to see if it is fundamentally limited or this is a slowness in GFac.
> [10:00 AM] Gary Gorbet: That is easy to do. Will do a test and report.
> [10:00 AM] Marcus Christie: Thanks @gary
> [10:01 AM] Gary Gorbet: Transferred a 1.6M file from uslims3 to alamo in under a second.
> [10:02 AM] Marcus Christie: Actually I realized that GFac will do the transfer from alamo to uslims3 via GFac. It's not a true third party transfer but rather the data is streamed through the GFac server.
> So @marlon was right: we need to test between alamo -> GFac and GFac -> uslims3.
> [10:03 AM] Gary Gorbet: This would be alamo/uslims3 to gw153; right?
> [10:03 AM] Marcus Christie: Thanks @gary. So that's our theoretical upper bound.
> [10:04 AM] Marcus Christie: @gary that's right
> [10:08 AM] Sudhakar Pamidighantam: I think if it is reasonably secure, GFac should do a true third party transfer to avoid this delay.
> [10:08 AM] Gary Gorbet: The LIMS work directory for the job has all the output files. So, it seems to me that gw153 to uslims3 transfers completed. So, why the EXECUTING gfac status?
> {quote}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
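
As a rough sanity check on the figures quoted in the discussion, the sketch below estimates how long the two analysis-results.tar files would take at the observed rate. It assumes "200kb/s" means 200 kilobytes per second and uses decimal units (1 MB = 10^6 bytes); if the figure was kilobits per second, the times would be 8x longer. The function names are hypothetical and are not part of gfac. The second helper only captures the general point that a transfer streamed through an intermediate host cannot run faster than its slower hop.

```python
def transfer_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Estimated transfer time at a constant rate."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_s


def relay_upper_bound(hop_rates_bytes_per_s: list[float]) -> float:
    """Upper bound on throughput when data is streamed through
    intermediate hosts: the slowest hop limits the whole transfer."""
    if not hop_rates_bytes_per_s:
        raise ValueError("need at least one hop")
    return min(hop_rates_bytes_per_s)


if __name__ == "__main__":
    observed_rate = 200e3  # ASSUMPTION: 200 kilobytes/s, decimal units
    for label, size in [("200 MB", 200e6), ("1.13 GB", 1.13e9)]:
        secs = transfer_seconds(size, observed_rate)
        print(f"{label}: {secs:.0f} s (~{secs / 60:.1f} min)")
```

Under that assumption the 200 MB file takes about 17 minutes and the 1.13 GB file about 94 minutes, which is the same order of magnitude as the 20 and 40 minute delays reported in the chat. Measured scp rates for alamo->gw153 and gw153->uslims3 can be passed to `relay_upper_bound` to get the best-case rate for the streamed transfer.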