Limess opened a new issue, #9814:
URL: https://github.com/apache/hudi/issues/9814

   **Describe the problem you faced**
   
   After upgrading Hudi from 0.12.1 to 0.13.1 via an EMR upgrade I’m seeing a 
lot of these:
   
   ```
   23/09/25 16:51:57 INFO RemoteHoodieTableFileSystemView: Sending request : (http://ip-10-0-107-14.eu-west-1.compute.internal:38427/v1/hoodie/view/datafiles/beforeoron/latest/?partition=story_published_partition_date%3D2023-08-26&maxinstant=20230925101228159&basepath=s3%3A%2F%2Fprod-signal-articles-store%2Farticles_hudi_copy_on_write&lastinstantts=20230925142837150&timelinehash=839a7f3760bd309b411eecb46f32635c0eb8d06daac3fba349cb7713a6a698c7)
   23/09/25 16:52:36 INFO RetryExec: I/O exception (org.apache.hudi.org.apache.http.NoHttpResponseException) caught when processing request to {}->http://ip-10-0-107-14.eu-west-1.compute.internal:38427/: The target server failed to respond
   23/09/25 16:52:36 INFO RetryExec: Retrying request to {}->http://ip-10-0-107-14.eu-west-1.compute.internal:38427/
   23/09/25 16:53:06 INFO RetryExec: I/O exception (org.apache.hudi.org.apache.http.NoHttpResponseException) caught when processing request to {}->http://ip-10-0-107-14.eu-west-1.compute.internal:38427/: The target server failed to respond
   23/09/25 16:53:06 INFO RetryExec: Retrying request to {}->http://ip-10-0-107-14.eu-west-1.compute.internal:38427/
   23/09/25 16:53:36 INFO RetryExec: I/O exception (org.apache.hudi.org.apache.http.NoHttpResponseException) caught when processing request to {}->http://ip-10-0-107-14.eu-west-1.compute.internal:38427/: The target server failed to respond
   23/09/25 16:53:36 INFO RetryExec: Retrying request to {}->http://ip-10-0-107-14.eu-west-1.compute.internal:38427/
   23/09/25 16:54:07 WARN RetryHelper: Catch Exception for Sending request, will retry after 100 ms.
   org.apache.hudi.org.apache.http.NoHttpResponseException: ip-10-0-107-14.eu-west-1.compute.internal:38427 failed to respond
   ```
   
   I’ve enabled retries, but this seems to be slowing down various write tasks a 
lot as they retry or fall back to secondary methods. Why would this be happening?
   Between these errors and seemingly slower bloom filter lookups, jobs are 
taking 2x longer or more.
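
For reference, a minimal sketch of the retry-related writer options this refers to. The key names are taken from the Hudi configuration reference for the remote file system view; the values shown are illustrative assumptions, not the actual settings of this job.

```python
# Illustrative Hudi writer options for remote file system view retries.
# Values are assumptions for the sake of example, not this job's real config.
hudi_retry_options = {
    # Retry failed requests from executors to the embedded timeline server.
    "hoodie.filesystem.view.remote.retry.enable": "true",
    # Maximum number of retries per request.
    "hoodie.filesystem.view.remote.retry.max_numbers": "3",
    # Initial wait between retries (the log above shows a 100 ms wait).
    "hoodie.filesystem.view.remote.retry.initial_interval_ms": "100",
    # Timeout for a single remote view request, in seconds.
    "hoodie.filesystem.view.remote.timeout.secs": "300",
}

for key, value in sorted(hudi_retry_options.items()):
    print(f"{key}={value}")
```

In a PySpark job these would typically be passed along with the other write options, for example via `df.write.format("hudi").options(**hudi_retry_options)`.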
   
   I'm unsure whether these correspond to the following warnings in the driver logs:
   
   ```
   WARN RequestHandler: Bad request response due to client view behind server 
view. Last known instant from client was 20230925142837150 but server has the 
following timeline [[20230405172930640__rollback__COMPLETED], 
[20230405220408317__rollback__COMPLETED], 
[20230405230726307__rollback__COMPLETED], 
[20230406004821619__rollback__COMPLETED], 
[20230406022626456__rollback__COMPLETED], 
[20230406040217179__rollback__COMPLETED], 
[20230406053604634__rollback__COMPLETED], 
[20230406071500195__rollback__COMPLETED], 
[20230406085932605__rollback__COMPLETED], 
[20230406091145473__rollback__COMPLETED], 
[20230904040946183__rollback__COMPLETED], 
[20230904200935082__rollback__COMPLETED], 
[20230905102904696__rollback__COMPLETED], 
[20230920120910043__commit__COMPLETED], [20230920161015352__commit__COMPLETED], 
[20230920200916636__commit__COMPLETED], [20230921000922099__commit__COMPLETED], 
[20230921040951133__commit__COMPLETED], [20230921081133533__commit__COMPLETED], 
[20230921081136531__clean__COMPLETED], [20230921120938905__commit__COMPLETED], 
[20230921120941970__clean__COMPLETED], [20230921161019209__commit__COMPLETED], 
[20230921161022485__clean__COMPLETED], [20230921200920596__commit__COMPLETED], 
[20230921200923858__clean__COMPLETED], [20230922001011936__commit__COMPLETED], 
[20230922001014953__clean__COMPLETED], [20230922040943645__commit__COMPLETED], 
[20230922040946795__clean__COMPLETED], [20230922080911829__commit__COMPLETED], 
[20230922080915209__clean__COMPLETED], [20230922120928185__commit__COMPLETED], 
[20230922120931568__clean__COMPLETED], [20230922161014635__commit__COMPLETED], 
[20230922161017634__clean__COMPLETED], [20230922200911764__commit__COMPLETED], 
[20230922200914501__clean__COMPLETED], [20230923000928118__commit__COMPLETED], 
[20230923000931194__clean__COMPLETED], [20230923040937860__commit__COMPLETED], 
[20230923040940748__clean__COMPLETED], [20230923080919659__commit__COMPLETED], 
[20230923080922740__clean__COMPLETED], [20230923120913393__commit__COMPLETED], 
[20230923120916656__clean__COMPLETED], [20230923160937358__commit__COMPLETED], 
[20230923160940858__clean__COMPLETED], [20230923200914761__commit__COMPLETED], 
[20230923200917719__clean__COMPLETED], [20230924000958223__commit__COMPLETED], 
[20230924001001271__clean__COMPLETED], [20230924040915658__commit__COMPLETED], 
[20230924040918676__clean__COMPLETED], [20230924080919687__commit__COMPLETED], 
[20230924080922913__clean__COMPLETED], [20230924120907571__commit__COMPLETED], 
[20230924120910946__clean__COMPLETED], [20230924160910339__commit__COMPLETED], 
[20230924160913410__clean__COMPLETED], [20230924200912759__commit__COMPLETED], 
[20230924200915964__clean__COMPLETED], [20230925000926377__commit__COMPLETED], 
[20230925000931547__clean__COMPLETED], [20230925041024449__commit__COMPLETED], 
[20230925041027798__clean__COMPLETED], [20230925080953746__commit__COMPLETED], 
[20230925080957003__clean__COMPLETED], [20230925101228159__commit__COMPLETED], 
[20230925101231993__clean__COMPLETED], [20230925114607821__clean__COMPLETED], 
[20230925142837150__rollback__COMPLETED], 
[20230925161210335__rollback__COMPLETED]]
   23/09/25 17:12:41 INFO HoodieActiveTimeline: Loaded instants upto : 
Option{val=[20230925161210335__rollback__COMPLETED]}
   ```
   
   I’m also seeing similar errors on writes:
   
   ```
   Caused by: org.apache.hudi.exception.HoodieRemoteException: Failed to create marker file story_published_partition_date=2023-01-06/47d20ede-bbbe-4cd9-91d1-41993c76752a-0_668-25-96261_20230925161205373.parquet.marker.MERGE
   ip-10-0-107-14.eu-west-1.compute.internal:38427 failed to respond
   ```
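
To make the "client view behind server view" warning easier to read, here is a small sketch that parses a logged timeline string and lists the instants newer than the client's last known instant. The helper names `parse_timeline` and `instants_after` are hypothetical and not part of Hudi; the sample string is a shortened excerpt of the timeline in the driver warning.

```python
import re

# Matches entries such as [20230925161210335__rollback__COMPLETED].
INSTANT_RE = re.compile(r"\[(\d{17})__([a-z]+)__([A-Z]+)\]")


def parse_timeline(text):
    """Return (timestamp, action, state) tuples found in a logged timeline."""
    return [m.groups() for m in INSTANT_RE.finditer(text)]


def instants_after(timeline, last_known):
    """Return instants whose timestamp is strictly greater than last_known.

    Timestamps are fixed-width digit strings, so string comparison
    orders them the same way as numeric comparison.
    """
    return [instant for instant in timeline if instant[0] > last_known]


# Shortened excerpt of the server timeline from the warning.
sample = (
    "[20230925101231993__clean__COMPLETED], "
    "[20230925142837150__rollback__COMPLETED], "
    "[20230925161210335__rollback__COMPLETED]"
)

timeline = parse_timeline(sample)
newer = instants_after(timeline, "20230925142837150")
print(newer)
```

On this excerpt the only instant after the client's last known instant `20230925142837150` is the rollback at `20230925161210335`, which matches the last instant reported by `HoodieActiveTimeline`.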
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.