Marcus Christie created AIRAVATA-2492:
-----------------------------------------
Summary: File transfer slow from alamo.uthscsa.edu to
uslims3.uthscsa.edu
Key: AIRAVATA-2492
URL: https://issues.apache.org/jira/browse/AIRAVATA-2492
Project: Airavata
Issue Type: Bug
Reporter: Marcus Christie
Assignee: Marcus Christie
Experiment ids in ultrascan.scigap.org:
US3-AIRA_f200b78e-5d86-499c-b2e6-c3b6fbb3bfa3
US3-AIRA_b69482ba-b34f-4809-9274-bcda71b8dccc
Hipchat discussion:
{quote}
[11:27 PM] Gary Gorbet: On Alamo, 20 minutes or more elapse from end of job
until Airavata reports a job status of COMPLETE. Can't *something* be done to
speed that up!
---- Wednesday July 26, 2017 ----
[9:31 AM] Eroma Abeysinghe: @gary sorry for this. Currently all experiments are
completed, and two are running on alamo. But we found two which took an
unusually long time to transfer the files. We are investigating and will keep
you posted.
[9:51 AM] Marcus Christie: @gary here's what we've been able to confirm: gfac
is getting the completed email from the scheduler and processing it
immediately. Airavata doesn't mark an experiment as being completed until the
output data staging completes. The output data staging for the jobs you gave us
to look at took a long time to complete.
[9:52 AM] Marlon Pierce: You can test network speed in various ways (like
https://askubuntu.com/questions/7976/how-do-you-test-the-network-speed-betwen-two-boxes,
which I googled).
[9:53 AM] Marlon Pierce: It won't fix the problem, but it may help network
admins diagnose the problem
[9:53 AM] Marlon Pierce: I'm assuming (maybe incorrectly) that this is an issue
between Alamo and our servers
[9:54 AM] Marcus Christie: This is partly because transfer times from alamo to
lims are happening very slowly, by my rough calculations at about 200kb/s. But
also the analysis-results.tar file is much larger in these experiments. One
example we looked at was about 200 MB and another was about 1.13 GB.
Looking through the logs back to April this is historically very large for an
analysis-results.tar file.
[9:54 AM] Marlon Pierce: But we can confirm this by looking at other servers
[9:54 AM] Marcus Christie: @marlon this is between alamo and lims. Do we have a
login on lims?
[9:56 AM] Marlon Pierce: Oh...where is the LIMS server? At UTHSCSA?
[9:57 AM] Marlon Pierce: I think it is.
[9:58 AM] Marcus Christie: @marlon uslims3.uthscsa.edu
[9:59 AM] Gary Gorbet: That other job completed. But now there is a new,
critical job that FINISHED 40 minutes ago. Airavata job status is EXECUTING.
[9:59 AM] Marcus Christie: It would be nice if we could run a command line scp
transfer test from alamo to uslims3 just to see if it is fundamentally limited
or this is a slowness in GFac.
[10:00 AM] Gary Gorbet: That is easy to do. Will do a test and report.
[10:00 AM] Marcus Christie: Thanks @gary
[10:01 AM] Gary Gorbet: Transferred a 1.6M file from uslims3 to alamo in under
a second.
[10:02 AM] Marcus Christie: Actually I realized that GFac does the transfer
from alamo to uslims3 itself. It's not a true third party transfer but rather
the data is streamed through the GFac server. So @marlon was right: we need to
test between alamo -> GFac and GFac -> uslims3.
[10:03 AM] Gary Gorbet: This would be alamo/uslims3 to gw153; right?
[10:03 AM] Marcus Christie: Thanks @gary. So that's our theoretical upper bound.
[10:04 AM] Marcus Christie: @gary that's right
[10:08 AM] Sudhakar Pamidighantam: I think if it is reasonably secure, GFac
should do a true third party transfer to avoid this delay.
[10:08 AM] Gary Gorbet: The LIMS work directory for the job has all the output
files. So, it seems to me that gw153 to uslims3 transfers completed. So, why
the EXECUTING gfac status?
{quote}
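The rough calculation mentioned in the discussion above can be checked with a few lines of Python. This is only a sketch under stated assumptions: it reads "200kb/s" as 200 kilobytes per second, uses decimal units (1 MB = 1,000,000 bytes), and assumes a steady rate for the whole transfer. The function name transfer_seconds is made up for illustration and is not part of Airavata or GFac.

```python
def transfer_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Return the time in seconds to move size_bytes at a steady rate."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_s


# Decimal units (assumption; the chat does not say which convention is meant).
KB = 1000
MB = 1000 ** 2
GB = 1000 ** 3

# Assumption: "200kb/s" in the chat means 200 kilobytes per second.
RATE = 200 * KB

# The two analysis-results.tar sizes mentioned in the discussion.
for label, size in (("200 MB", 200 * MB), ("1.13 GB", 1.13 * GB)):
    minutes = transfer_seconds(size, RATE) / 60
    print(f"{label}: about {minutes:.0f} minutes")
```

Under these assumptions the 200 MB file would take roughly 17 minutes and the 1.13 GB file roughly 94 minutes, which is the same order of magnitude as the delays reported in the chat. If "kb" meant kilobits instead, the estimates would be eight times longer.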