I see - I should have checked my mailbox before answering. I received the email and was able to log in.
On Thu, Aug 26, 2021 at 6:12 PM Matthias Pohl <matth...@ververica.com> wrote:

> The link doesn't work, i.e. I'm redirected to a login page. It would also
> be good to include the Flink logs and make them accessible for everyone.
> This way others could share their perspective as well...
>
> On Thu, Aug 26, 2021 at 5:40 PM Shah, Siddharth [Engineering] <siddharth.x.s...@gs.com> wrote:
>
>> Hi Matthias,
>>
>> Thank you for responding and taking the time to look at the issue.
>>
>> Uploaded the YARN logs here:
>> https://lockbox.gs.com/lockbox/folders/963b0f29-85ad-4580-b420-8c66d9c07a84/
>> and have also requested read permissions for you. Please let us know if
>> you're not able to see the files.
>>
>> *From:* Matthias Pohl <matth...@ververica.com>
>> *Sent:* Thursday, August 26, 2021 9:47 AM
>> *To:* Shah, Siddharth [Engineering] <siddharth.x.s...@ny.email.gs.com>
>> *Cc:* user@flink.apache.org; Hailu, Andreas [Engineering] <andreas.ha...@ny.email.gs.com>
>> *Subject:* Re: hdfs lease issues on flink retry
>>
>> Hi Siddharth,
>>
>> thanks for reaching out to the community. This might be a bug. Could you
>> share your Flink and YARN logs? This way we could get a better
>> understanding of what's going on.
>>
>> Best,
>> Matthias
>>
>> On Tue, Aug 24, 2021 at 10:19 PM Shah, Siddharth [Engineering] <siddharth.x.s...@gs.com> wrote:
>>
>> Hi Team,
>>
>> We are seeing transient failures in jobs that mostly require higher
>> resources and use Flink RestartStrategies [1].
>> Upon checking the YARN logs we have observed HDFS lease issues when the
>> Flink retry happens. The job originally fails on the first try with
>> PartitionNotFoundException or NoResourceAvailableException, but on retry
>> it seems from the YARN logs that the lease for the temp sink directory
>> has not yet been released by the node from the previous try.
>>
>> Initial failure log message:
>>
>> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
>> Could not allocate enough slots to run the job. Please make sure that the
>> cluster has enough resources.
>>     at org.apache.flink.runtime.executiongraph.Execution.lambda$scheduleForExecution$0(Execution.java:461)
>>     at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
>>     at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
>>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>>     at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>>     at org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl.lambda$internalAllocateSlot$0(SchedulerImpl.java:190)
>>     at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
>>     at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
>>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>>     at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>>
>> Retry failure log message:
>>
>> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.fs.FileAlreadyExistsException):
>> /user/p2epda/lake/delp_prod/PROD/APPROVED/data/TECHRISK_SENTINEL/INFORMATION_REPORT/4377/temp/data/_temporary/0/_temporary/attempt__0000_r_000003_0/partMapper-r-00003.snappy.parquet
>> for client 10.51.63.226 already exists
>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2815)
>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2702)
>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2586)
>>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:736)
>>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:409)
>>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
>>
>> I could verify that it's the same nodes from the previous try owning the
>> lease, and checked this for multiple jobs by matching IP addresses.
>> Ideally, we want an internal retry to happen, since there will be
>> thousands of jobs running at a time and it is hard to retry them manually.
>>
>> This is our current restart config:
>>
>> executionEnv.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, Time.of(10, TimeUnit.SECONDS)));
>>
>> Is it possible to resolve leases before a retry? Or is it possible to
>> have different sink directories (incrementing an attempt id somewhere) for
>> every retry, so that we have no lease issues? Or do you have any other
>> suggestion for resolving this?
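The second workaround proposed above (a distinct sink directory per retry) could be sketched as follows. This is a minimal illustration, not a Flink API: the class and method names below are hypothetical, and a real integration would need to obtain the attempt id from the execution environment and clean up directories of failed attempts.

```java
import java.nio.file.Paths;

public class AttemptScopedSink {

    // Hypothetical helper: derive a temp sink directory that embeds the
    // retry attempt id, so each retry writes to a fresh path and never
    // collides with an HDFS lease still held by a node from the
    // previous attempt.
    static String sinkDirForAttempt(String baseDir, int attemptId) {
        return Paths.get(baseDir, "temp-attempt-" + attemptId).toString();
    }

    public static void main(String[] args) {
        // Each retry gets its own directory under the job's sink root.
        System.out.println(sinkDirForAttempt("/tmp/job-sink", 0));
        System.out.println(sinkDirForAttempt("/tmp/job-sink", 1));
    }
}
```

With attempt-scoped paths, the FileAlreadyExistsException above cannot occur, at the cost of having to garbage-collect the abandoned temp directories of earlier attempts.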
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/execution/task_failure_recovery/#fixed-delay-restart-strategy
>>
>> Thanks,
>> Siddharth
>>
>> ------------------------------
>> Your Personal Data: We may collect and process information about you that
>> may be subject to data protection laws. For more information about how we
>> use and disclose your personal data, how we protect your information, our
>> legal basis to use your information, your rights and who you can contact,
>> please refer to: www.gs.com/privacy-notices

--
Matthias Pohl | Engineer
Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Yip Park Tung Jason, Jinwei (Kevin) Zhang, Karl Anton Wehner