[jira] [Commented] (AIRAVATA-2621) SSH port provided in compute resource registration is not considered for cluster SSH communication

2018-01-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRAVATA-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332515#comment-16332515
 ] 

ASF subversion and git services commented on AIRAVATA-2621:
---

Commit f13c17fe41bfdd1a4c8e35a88ab16f534190b523 in airavata's branch 
refs/heads/AIRAVATA-2620 from dimuthu.upeks...@gmail.com
[ https://gitbox.apache.org/repos/asf?p=airavata.git;h=f13c17f ]

Fixing AIRAVATA-2621


> SSH port provided in compute resource registration is not considered for 
> cluster SSH communication
> --
>
> Key: AIRAVATA-2621
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2621
> Project: Airavata
> Issue Type: Bug
> Components: GFac
> Affects Versions: 0.18
> Environment: https://hpcgateway.gsu.edu/
>              https://scigap.org/
> Reporter: Eroma
> Assignee: Dimuthu Upeksha
> Priority: Major
> Fix For: 0.18
>
>
> 1. Added a specific port for job submissions (15022).
> 2. But when submitting jobs, during environment creation, GFac uses the 
> default port 22, not the port specified in scigap.org for hpclogin.gsu.edu.
> 3. Log messages in the Airavata log:
> 2017-12-19 11:13:18,996 [pool-7-thread-2] INFO  
> o.a.airavata.gfac.impl.Factory 
> process_id=PROCESS_3b471b3b-5b4e-4b6d-a66e-554652a390d2, 
> token_id=35da840b-63d5-4cbf-b9ce-3005cd94d961, 
> experiment_id=NWChem2_a38ac303-666f-4dea-9b4c-7bffe0f97dd7, 
> gateway_id=georgiastate - Initialize a new SSH session for 
> :airavata_hpclogin.gsu.edu_22_35da840b-63d5-4cbf-b9ce-3005cd94d961
> 2017-12-19 11:15:26,272 [pool-7-thread-2] ERROR o.a.a.gfac.core.GFacException 
> process_id=PROCESS_3b471b3b-5b4e-4b6d-a66e-554652a390d2, 
> token_id=35da840b-63d5-4cbf-b9ce-3005cd94d961, 
> experiment_id=NWChem2_a38ac303-666f-4dea-9b4c-7bffe0f97dd7, 
> gateway_id=georgiastate - JSch initialization error
> com.jcraft.jsch.JSchException: java.net.ConnectException: Connection timed 
> out (Connection timed out)
> at com.jcraft.jsch.Util.createSocket(Util.java:349)
> at com.jcraft.jsch.Session.connect(Session.java:215)
> at com.jcraft.jsch.Session.connect(Session.java:183)
> at 
> org.apache.airavata.gfac.impl.Factory.getSSHSession(Factory.java:542)
> at 
> org.apache.airavata.gfac.impl.HPCRemoteCluster.getSshSession(HPCRemoteCluster.java:138)
> at 
> org.apache.airavata.gfac.impl.HPCRemoteCluster.getSession(HPCRemoteCluster.java:315)
> at 
> org.apache.airavata.gfac.impl.HPCRemoteCluster.makeDirectory(HPCRemoteCluster.java:242)
> at 
> org.apache.airavata.gfac.impl.task.EnvironmentSetupTask.execute(EnvironmentSetupTask.java:51)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.executeTask(GFacEngineImpl.java:814)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.configureWorkspace(GFacEngineImpl.java:553)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.executeTaskListFrom(GFacEngineImpl.java:324)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.executeProcess(GFacEngineImpl.java:286)
> at 
> org.apache.airavata.gfac.impl.GFacWorker.executeProcess(GFacWorker.java:227)
> at org.apache.airavata.gfac.impl.GFacWorker.run(GFacWorker.java:86)
> at 
> org.apache.airavata.common.logging.MDCUtil.lambda$wrapWithMDC$0(MDCUtil.java:40)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.ConnectException: Connection timed out (Connection timed 
> out)
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at java.net.Socket.connect(Socket.java:538)
> at java.net.Socket.<init>(Socket.java:434)
> at java.net.Socket.<init>(Socket.java:211)
> at com.jcraft.jsch.Util.createSocket(Util.java:343)
> ... 17 common frames omitted
> NWChem2_a38ac303-666f-4dea-9b4c-7bffe0f97dd7
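
For illustration, the behaviour the report asks for can be sketched as below. This is a minimal Python sketch under stated assumptions, not Airavata's actual code: `resolve_ssh_port` and `session_key` are hypothetical names, the `user_host_port_token` key layout is only inferred from the session name in the log above (`airavata_hpclogin.gsu.edu_22_<token>`), and treating a missing or non-positive port as "not configured" is an assumption. The point is that the port registered for the compute resource should take precedence, with 22 used only as a fallback.

```python
from typing import Optional

# Standard SSH port, used only when the compute resource registers none.
DEFAULT_SSH_PORT = 22


def resolve_ssh_port(registered_port: Optional[int]) -> int:
    """Return the registered SSH port, falling back to 22 only when unset.

    Assumption: None or a non-positive value means "no port configured".
    """
    if registered_port is None or registered_port <= 0:
        return DEFAULT_SSH_PORT
    if registered_port > 65535:
        raise ValueError(f"invalid SSH port: {registered_port}")
    return registered_port


def session_key(user: str, host: str, registered_port: Optional[int], token: str) -> str:
    """Build a session cache key of the form user_host_port_token.

    The layout is assumed from the log line in the report (hypothetical helper).
    """
    port = resolve_ssh_port(registered_port)
    return f"{user}_{host}_{port}_{token}"


if __name__ == "__main__":
    token = "35da840b-63d5-4cbf-b9ce-3005cd94d961"
    # With the registered port, the key carries 15022 instead of 22.
    print(session_key("airavata", "hpclogin.gsu.edu", 15022, token))
    # Without a registered port, the default 22 is used.
    print(session_key("airavata", "hpclogin.gsu.edu", None, token))
```

Including the resolved port in the key also keeps sessions to the same host on different ports from being confused with one another.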



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRAVATA-2620) Force post processing functionality

2018-01-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRAVATA-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332517#comment-16332517
 ] 

ASF subversion and git services commented on AIRAVATA-2620:
---

Commit e3de5a05731c27b6eb640a7da4b97b177aad48fc in airavata's branch 
refs/heads/AIRAVATA-2620 from [~smarru]
[ https://gitbox.apache.org/repos/asf?p=airavata.git;h=e3de5a0 ]

Merge branch 'master' into AIRAVATA-2620


> Force post processing functionality 
> 
>
> Key: AIRAVATA-2620
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2620
> Project: Airavata
> Issue Type: Improvement
> Affects Versions: 0.16
> Reporter: Suresh Marru
> Assignee: Dimuthu Upeksha
> Priority: Major
> Fix For: 0.17
>
>
> Because job monitoring currently relies only on email, post-processing is 
> sometimes delayed. The Ultrascan science gateway would like Airavata to 
> support a request that forces post-processing. This will be used when 
> clients have out-of-band knowledge about job completion (for example, 
> through code-instrumented UDP messages) and would like Airavata to force 
> staging of output files.
> This improvement has to be added carefully so that the existing life cycle 
> of an experiment is not hampered.





[jira] [Commented] (AIRAVATA-2624) Stampede2 cluster SSH connectivity issue

2018-01-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRAVATA-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332516#comment-16332516
 ] 

ASF subversion and git services commented on AIRAVATA-2624:
---

Commit 02379098159a81612ed8cf4f73a5cef1e3eae9f9 in airavata's branch 
refs/heads/AIRAVATA-2620 from dimuthu.upeks...@gmail.com
[ https://gitbox.apache.org/repos/asf?p=airavata.git;h=0237909 ]

Fixing AIRAVATA-2624 Stampede2 cluster SSH connectivity issue


> Stampede2 cluster SSH connectivity issue
> 
>
> Key: AIRAVATA-2624
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2624
> Project: Airavata
> Issue Type: Bug
> Components: Airavata System, GFac
> Affects Versions: 0.18
> Environment: https://seagrid.org
> Reporter: Eroma
> Assignee: Marcus Christie
> Priority: Major
> Fix For: 0.18
>
>
> Job submission fails at env creation due to JSch initialization error.
> Error messages
> 2018-01-09 09:46:10,786 [pool-7-thread-15] ERROR 
> o.a.a.gfac.core.GFacException 
> process_id=PROCESS_650014f6-fcb6-4680-90ea-898bee373f37, 
> token_id=3d65bf6d-2c9f-4166-a51b-e76e0022bd3b, 
> experiment_id=Clone_of_st2molcastest_e2942a34-c9c7-4f04-8ccb-af6fe27e0990, 
> gateway_id=seagrid - JSch initialization error
> com.jcraft.jsch.JSchException: Auth fail
> at com.jcraft.jsch.Session.connect(Session.java:512)
> at com.jcraft.jsch.Session.connect(Session.java:183)
> at 
> org.apache.airavata.gfac.impl.Factory.getSSHSession(Factory.java:542)
> at 
> org.apache.airavata.gfac.impl.HPCRemoteCluster.getSshSession(HPCRemoteCluster.java:138)
> at 
> org.apache.airavata.gfac.impl.HPCRemoteCluster.getSession(HPCRemoteCluster.java:315)
> at 
> org.apache.airavata.gfac.impl.HPCRemoteCluster.makeDirectory(HPCRemoteCluster.java:242)
> at 
> org.apache.airavata.gfac.impl.task.EnvironmentSetupTask.execute(EnvironmentSetupTask.java:51)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.executeTask(GFacEngineImpl.java:814)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.configureWorkspace(GFacEngineImpl.java:553)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.executeTaskListFrom(GFacEngineImpl.java:324)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.executeProcess(GFacEngineImpl.java:286)
> at 
> org.apache.airavata.gfac.impl.GFacWorker.executeProcess(GFacWorker.java:227)
> at org.apache.airavata.gfac.impl.GFacWorker.run(GFacWorker.java:86)
> at 
> org.apache.airavata.common.logging.MDCUtil.lambda$wrapWithMDC$0(MDCUtil.java:40)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2018-01-09 09:46:10,786 [pool-7-thread-15] ERROR 
> o.a.a.g.i.t.EnvironmentSetupTask 
> process_id=PROCESS_650014f6-fcb6-4680-90ea-898bee373f37, 
> token_id=3d65bf6d-2c9f-4166-a51b-e76e0022bd3b, 
> experiment_id=Clone_of_st2molcastest_e2942a34-c9c7-4f04-8ccb-af6fe27e0990, 
> gateway_id=seagrid - Error while environment setup
> org.apache.airavata.gfac.core.GFacException: JSch initialization error
> at 
> org.apache.airavata.gfac.impl.Factory.getSSHSession(Factory.java:545)
> at 
> org.apache.airavata.gfac.impl.HPCRemoteCluster.getSshSession(HPCRemoteCluster.java:138)
> at 
> org.apache.airavata.gfac.impl.HPCRemoteCluster.getSession(HPCRemoteCluster.java:315)
> at 
> org.apache.airavata.gfac.impl.HPCRemoteCluster.makeDirectory(HPCRemoteCluster.java:242)
> at 
> org.apache.airavata.gfac.impl.task.EnvironmentSetupTask.execute(EnvironmentSetupTask.java:51)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.executeTask(GFacEngineImpl.java:814)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.configureWorkspace(GFacEngineImpl.java:553)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.executeTaskListFrom(GFacEngineImpl.java:324)
> at 
> org.apache.airavata.gfac.impl.GFacEngineImpl.executeProcess(GFacEngineImpl.java:286)
> at 
> org.apache.airavata.gfac.impl.GFacWorker.executeProcess(GFacWorker.java:227)
> at org.apache.airavata.gfac.impl.GFacWorker.run(GFacWorker.java:86)
> at 
> org.apache.airavata.common.logging.MDCUtil.lambda$wrapWithMDC$0(MDCUtil.java:40)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: com.jcraft.jsch.JSchException: Auth fail
> at com.jcraft.jsch.Session.connect(Session.java:512)
>   
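
When searching airavata.log for entries like the ones quoted above, the per-entry context fields (`process_id`, `token_id`, `experiment_id`, `gateway_id`) can be extracted with a small parser. The following is a rough Python sketch, assuming each entry has already been joined onto one line and that the fields always appear as comma-separated `key=value` pairs as in the quoted log; `parse_mdc_fields` is a hypothetical helper, not part of Airavata.

```python
import re
from typing import Dict

# Matches the four context fields seen in the quoted log entries.
# A value runs until the next comma or whitespace character.
MDC_FIELD = re.compile(r"\b(process_id|token_id|experiment_id|gateway_id)=([^,\s]+)")

# One log entry from the report above, joined onto a single line.
SAMPLE_ENTRY = (
    "2018-01-09 09:46:10,786 [pool-7-thread-15] ERROR "
    "o.a.a.gfac.core.GFacException "
    "process_id=PROCESS_650014f6-fcb6-4680-90ea-898bee373f37, "
    "token_id=3d65bf6d-2c9f-4166-a51b-e76e0022bd3b, "
    "experiment_id=Clone_of_st2molcastest_e2942a34-c9c7-4f04-8ccb-af6fe27e0990, "
    "gateway_id=seagrid - JSch initialization error"
)


def parse_mdc_fields(entry: str) -> Dict[str, str]:
    """Return the context fields found in a single (joined) log entry."""
    return {key: value for key, value in MDC_FIELD.findall(entry)}


if __name__ == "__main__":
    for key, value in parse_mdc_fields(SAMPLE_ENTRY).items():
        print(f"{key}: {value}")
```

Grouping entries by `process_id` or `experiment_id` then makes it easier to follow a single job submission through the log.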

[jira] [Created] (AIRAVATA-2646) Upgrade to Thrift 0.11.0

2018-01-19 Thread Suresh Marru (JIRA)
Suresh Marru created AIRAVATA-2646:
--

Summary: Upgrade to Thrift 0.11.0
Key: AIRAVATA-2646
URL: https://issues.apache.org/jira/browse/AIRAVATA-2646
Project: Airavata
Issue Type: Improvement
Reporter: Suresh Marru


The Thrift 0.11.0 release has some changes related to C++ stubs - 
[https://github.com/apache/thrift/blob/0.11.0/CHANGES]

To mitigate an unrelated issue, I upgraded to 0.11.0 and it's working OK. I 
think we should move the develop branch to the latest Thrift release.





[jira] [Commented] (AIRAVATA-2622) Search airavata.log and locate job IDs returned at subsequent job ID search after submitting the job

2018-01-19 Thread Eroma (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRAVATA-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332853#comment-16332853
 ] 

Eroma commented on AIRAVATA-2622:
-

This was tested using the SEAGrid Jetstream Slurm cluster 
js-169-144.jetstream-cloud.org. The subsequent steps are functioning. To test 
this:
 # We stopped the job ID from being returned at job submission.
 # The job ID was then returned on the first or second retry of the squeue 
command.

 

> Search airavata.log and locate job IDs returned at subsequent job ID search 
> after submitting the job
> 
>
> Key: AIRAVATA-2622
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2622
> Project: Airavata
> Issue Type: Bug
> Components: Airavata System, GFac
> Affects Versions: 0.18
> Reporter: Eroma
> Assignee: Eroma
> Priority: Major
> Fix For: 0.18
>
>
> Currently, the job ID is returned to Airavata GFac at job submission. For 
> some submissions the job ID is not returned, and GFac then tries to retrieve 
> it two more times. We need to confirm that these subsequent steps work. 
> One way is to locate instances in airavata.log where the job ID was returned 
> in a subsequent step.
> This confirms that these steps function as expected.
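
The retry behaviour described here can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not GFac's implementation: `get_job_id_with_retries`, `submit` and `lookup` are hypothetical names, and `lookup` stands in for whatever follow-up query is used (the test comment mentions the squeue command). The only detail taken from the issue is that up to two further attempts are made when submission returns no job ID.

```python
import time
from typing import Callable, Optional


def get_job_id_with_retries(
    submit: Callable[[], Optional[str]],
    lookup: Callable[[], Optional[str]],
    max_lookups: int = 2,
    delay_seconds: float = 0.0,
) -> Optional[str]:
    """Return the job ID from submission, or from up to max_lookups follow-up queries.

    submit and lookup are caller-supplied callables (hypothetical); each returns
    the job ID as a string, or None/empty when no ID was obtained.
    """
    job_id = submit()
    if job_id:
        return job_id
    for _ in range(max_lookups):
        if delay_seconds > 0:
            time.sleep(delay_seconds)  # give the scheduler time to register the job
        job_id = lookup()
        if job_id:
            return job_id
    return None  # no job ID after submission and all retries


if __name__ == "__main__":
    # Simulate the test scenario: submission returns no ID,
    # the first lookup fails, and the second lookup succeeds.
    answers = iter([None, "12345"])
    print(get_job_id_with_retries(lambda: None, lambda: next(answers)))
```

Passing the submission and lookup steps in as callables keeps the retry logic testable without a real cluster, which mirrors how the scenario was exercised by suppressing the job ID at submission.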





[jira] [Closed] (AIRAVATA-2622) Search airavata.log and locate job IDs returned at subsequent job ID search after submitting the job

2018-01-19 Thread Eroma (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRAVATA-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eroma closed AIRAVATA-2622.
---
Resolution: Fixed

> Search airavata.log and locate job IDs returned at subsequent job ID search 
> after submitting the job
> 
>
> Key: AIRAVATA-2622
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2622
> Project: Airavata
> Issue Type: Bug
> Components: Airavata System, GFac
> Affects Versions: 0.18
> Reporter: Eroma
> Assignee: Eroma
> Priority: Major
> Fix For: 0.18
>
>
> Currently, the job ID is returned to Airavata GFac at job submission. For 
> some submissions the job ID is not returned, and GFac then tries to retrieve 
> it two more times. We need to confirm that these subsequent steps work. 
> One way is to locate instances in airavata.log where the job ID was returned 
> in a subsequent step.
> This confirms that these steps function as expected.





[jira] [Comment Edited] (AIRAVATA-2590) Update UGE_groovy.template to apply different parallel environment (-pe) values

2018-01-19 Thread Eroma (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRAVATA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331178#comment-16331178
 ] 

Eroma edited comment on AIRAVATA-2590 at 1/19/18 8:02 PM:
--

Testing the fix for printing the processes per node and the job submission 
commands. The -pe related change was not applied to production because the 
requirement is no longer valid.

Tested using [https://seagrid.org|https://seagrid.org/], 
[https://sciencegateway.siu.edu|https://sciencegateway.siu.edu/] and 
[https://sciencegateway.usd.edu|https://sciencegateway.usd.edu/]

Test cases:
 # Submit a test job to the little dog cluster. - Cluster is currently 
non-responsive
 # Submit a test job to the big dog cluster. Test job completed successfully. - 
PASS
 # Cancel a test job in little dog. - Cluster is currently non-responsive
 # Cancel a test job in big dog. - PASS
 # Submit a test job to USD HPC. Test job completed successfully. - PASS
 # Cancel a job in USD HPC. - PASS
 # Submit a test job to a SLURM cluster in SEAGrid. Tested with Comet and 
Bridges - PASS
 # Submit a test job to a PBS cluster in SEAGrid. Tested with bigred2 - PASS
 # Cancel a SLURM job. Tested with Stampede2 - PASS
 # Cancel a PBS job. Tested with bigred2 - PASS


was (Author: eroma_a):
Testing the fix for printing the processes per node and the job submission 
commands. The -pe related change was not applied to production because the 
requirement is no longer valid.

Tested using [https://seagrid.org|https://seagrid.org/], 
[https://sciencegateway.siu.edu|https://sciencegateway.siu.edu/] and 
[https://sciencegateway.usd.edu|https://sciencegateway.usd.edu/]

Test cases:
 # Submit a test job to the little dog cluster.
 # Submit a test job to the big dog cluster. Test job completed successfully. - 
PASS
 # Cancel a test job in little dog.
 # Cancel a test job in big dog. - PASS
 # Submit a test job to USD HPC. Test job completed successfully. - PASS
 # Cancel a job in USD HPC.
 # Submit a test job to a SLURM cluster in SEAGrid. Tested with Comet and 
Bridges - PASS
 # Submit a test job to a PBS cluster in SEAGrid. Tested with bigred2 - PASS
 # Cancel a SLURM job.
 # Cancel a PBS job.

> Update UGE_groovy.template to apply different parallel environment (-pe) 
> values
> ---
>
> Key: AIRAVATA-2590
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2590
> Project: Airavata
> Issue Type: Bug
> Reporter: Marcus Christie
> Assignee: Marcus Christie
> Priority: Major
>






[jira] [Resolved] (AIRAVATA-2590) Update UGE_groovy.template to apply different parallel environment (-pe) values

2018-01-19 Thread Eroma (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRAVATA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eroma resolved AIRAVATA-2590.
-
Resolution: Fixed

> Update UGE_groovy.template to apply different parallel environment (-pe) 
> values
> ---
>
> Key: AIRAVATA-2590
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2590
> Project: Airavata
> Issue Type: Bug
> Reporter: Marcus Christie
> Assignee: Marcus Christie
> Priority: Major
>



