Hi Aaron,

Thanks for contacting us. We will look into this issue and get back to you.

Best,
Lahiru

On 2024/09/11 18:45:33 Aaron Householder wrote:
> Hi Airavata,
> 
> I’m working on connecting Ultrascan3 to Airavata. As the message below shows, 
> if the job fails the metascheduler might not retry the job. Is there a 
> resource available to take a look at this issue?
> 
> Regards,
> Aaron
> 
> From: Aaron Householder <[email protected]>
> Date: Tuesday, September 3, 2024 at 4:42 PM
> To: Airavata Users <[email protected]>
> Subject: Failed but metascheduler did not resubmit job
> 
> Hi Airavata Users,
> 
> I had an UltraScan job that seemed to fail without the metascheduler 
> resubmitting the job for completion by another cluster. I received the 
> following in an email:
> 
>    Your UltraScan job is complete:
> 
>    Submission Time : 2024-08-26 00:50:05
>    Job End Time    :
>    Mail Time       : 2024-08-25 19:54:41
>    LIMS Host       :
>    Analysis ID     : US3-AIRA_ea2b4a32-27a8-4df4-827c-5fd9367c5e1c
>    Request ID      : 182  ( uslims3_Demo )
>    RunID           : demo1_veloc1
>    EditID          : 21030600161
>    Data Type       : RA
>    Cell/Channel/Wl : 2 / A / 259
>    Status          : failed
>    Cluster         : metascheduler
>    Job Type        : 2DSA-MC
>    GFAC Status     : FAILED
>    GFAC Message    : org.apache.airavata.helix.impl.task.TaskOnFailException: 
> Error Code : 23857cb5-5431-43e7-a927-fedd7a929e34, Task 
> TASK_5b1ea99b-750c-49f6-a05b-df7175f141ed failed due to Couldn't find job id 
> in both submitted and verified steps. 
> expId:US3-AIRA_ea2b4a32-27a8-4df4-827c-5fd9367c5e1c Couldn't find remote 
> jobId for JobName:A1394806797, both submit and verify steps doesn't return a 
> valid JobId. Hence changing experiment state to Failed
>         at 
> org.apache.airavata.helix.impl.task.AiravataTask.onFail(AiravataTask.java:146)
>         at 
> org.apache.airavata.helix.impl.task.submission.DefaultJobSubmissionTask.onRun(DefaultJobSubmissionTask.java:192)
>         at 
> org.apache.airavata.helix.impl.task.AiravataTask.onRun(AiravataTask.java:437)
>         at 
> org.apache.airavata.helix.core.AbstractTask.run(AbstractTask.java:102)
>         at org.apache.helix.task.TaskRunner.run(TaskRunner.java:71)
>         at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>         at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:829)
> 
> 
> No reservation for this job
> --> Verifying valid submit host (login2)...OK
> --> Verifying valid jobname...OK
> --> Verifying valid ssh keys...OK
> --> Verifying access to desired queue (normal)...OK
> --> Checking available allocation FAILED
>    Airavata stderr : ERROR: You have no project in the projectuser.map file 
> (in accounting_check_prod.pl).
> 
