[ https://issues.apache.org/jira/browse/HUDI-8621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17916659#comment-17916659 ]

Sagar Sumit commented on HUDI-8621:
-----------------------------------

Removing the optimization for the single file slice led to increased CI runtime. 
Some tests took a long time fetching from the remote fs view ([timeline server 
read timeout|https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_apis/build/builds/3055/logs/58],
 see the stacktrace below). Reverting the change in 
[https://github.com/apache/hudi/pull/12643] and reopening this ticket. We need 
to investigate why the timeline server read timed out.
{code:java}
2025-01-23T22:18:28.6326078Z 334526 
[ScalaTest-main-running-TestStreamSourceReadByStateTransitionTime] WARN  
org.apache.hudi.client.utils.ArchivalUtils [] - Error parsing instant time: 
000000002
2025-01-23T22:23:28.8470539Z 634738 [Executor task launch worker for task 1.0 
in stage 1335.0 (TID 2085)] ERROR 
org.apache.hudi.common.table.view.PriorityBasedFileSystemView [] - Got error 
running preferred function. Trying secondary
2025-01-23T22:23:28.8471638Z org.apache.hudi.exception.HoodieRemoteException: 
Read timed out
2025-01-23T22:23:28.8472396Z    at 
org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.getLatestBaseFilesFromParams(RemoteHoodieTableFileSystemView.java:241)
 ~[hudi-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
2025-01-23T22:23:28.8628636Z    at 
org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.getLatestBaseFilesBeforeOrOn(RemoteHoodieTableFileSystemView.java:260)
 ~[hudi-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
2025-01-23T22:23:28.8629289Z    at 
org.apache.hudi.common.table.view.PriorityBasedFileSystemView.execute(PriorityBasedFileSystemView.java:103)
 ~[hudi-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
2025-01-23T22:23:28.8629855Z    at 
org.apache.hudi.common.table.view.PriorityBasedFileSystemView.getLatestBaseFilesBeforeOrOn(PriorityBasedFileSystemView.java:148)
 ~[hudi-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
2025-01-23T22:23:28.8630433Z    at 
org.apache.hudi.table.action.commit.UpsertPartitioner.getSmallFiles(UpsertPartitioner.java:306)
 ~[hudi-spark-client-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
2025-01-23T22:23:28.8630990Z    at 
org.apache.hudi.table.action.commit.UpsertPartitioner.lambda$getSmallFilesForPartitions$6e0f90ba$1(UpsertPartitioner.java:288)
 ~[hudi-spark-client-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
2025-01-23T22:23:28.8631680Z    at 
org.apache.hudi.client.common.HoodieSparkEngineContext.lambda$mapToPair$786cea6a$1(HoodieSparkEngineContext.java:176)
 ~[hudi-spark-client-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
2025-01-23T22:23:28.8632166Z    at 
org.apache.spark.api.java.JavaPairRDD$.$anonfun$pairFunToScalaFun$1(JavaPairRDD.scala:1073)
 ~[spark-core_2.12-3.5.4.jar:3.5.4]
2025-01-23T22:23:28.8632553Z    at 
scala.collection.Iterator$$anon$10.next(Iterator.scala:461) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8632906Z    at 
scala.collection.Iterator.foreach(Iterator.scala:943) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8633238Z    at 
scala.collection.Iterator.foreach$(Iterator.scala:943) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8635828Z    at 
scala.collection.AbstractIterator.foreach(Iterator.scala:1431) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8636659Z    at 
scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8637044Z    at 
scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8637435Z    at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8637833Z    at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8638220Z    at 
scala.collection.TraversableOnce.to(TraversableOnce.scala:366) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8638600Z    at 
scala.collection.TraversableOnce.to$(TraversableOnce.scala:364) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8638967Z    at 
scala.collection.AbstractIterator.to(Iterator.scala:1431) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8639339Z    at 
scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:358) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8639730Z    at 
scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:358) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8640110Z    at 
scala.collection.AbstractIterator.toBuffer(Iterator.scala:1431) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8641176Z    at 
scala.collection.TraversableOnce.toArray(TraversableOnce.scala:345) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8641576Z    at 
scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:339) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8641952Z    at 
scala.collection.AbstractIterator.toArray(Iterator.scala:1431) 
~[scala-library-2.12.18.jar:?]
2025-01-23T22:23:28.8642516Z    at 
org.apache.spark.rdd.RDD.$anonfun$collect$2(RDD.scala:1049) 
~[spark-core_2.12-3.5.4.jar:3.5.4]
2025-01-23T22:23:28.8642922Z    at 
org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2433) 
~[spark-core_2.12-3.5.4.jar:3.5.4]
2025-01-23T22:23:28.8643348Z    at 
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) 
~[spark-core_2.12-3.5.4.jar:3.5.4]
2025-01-23T22:23:28.8643784Z    at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) 
~[spark-core_2.12-3.5.4.jar:3.5.4]
2025-01-23T22:23:28.8644180Z    at 
org.apache.spark.scheduler.Task.run(Task.scala:141) 
~[spark-core_2.12-3.5.4.jar:3.5.4]
2025-01-23T22:23:28.8644589Z    at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
 ~[spark-core_2.12-3.5.4.jar:3.5.4]
2025-01-23T22:23:28.8645048Z    at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
 [spark-common-utils_2.12-3.5.4.jar:3.5.4]
2025-01-23T22:23:28.8645519Z    at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
 [spark-common-utils_2.12-3.5.4.jar:3.5.4]
2025-01-23T22:23:28.8645950Z    at 
org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) 
[spark-core_2.12-3.5.4.jar:3.5.4]
2025-01-23T22:23:28.8646851Z    at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623) 
[spark-core_2.12-3.5.4.jar:3.5.4]
2025-01-23T22:23:28.8647264Z    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_432]
2025-01-23T22:23:28.8647643Z    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_432]
2025-01-23T22:23:28.8647978Z    at java.lang.Thread.run(Thread.java:750) 
[?:1.8.0_432]
2025-01-23T22:23:28.8648257Z Caused by: java.net.SocketTimeoutException: Read 
timed out
2025-01-23T22:23:28.8648540Z    at 
java.net.SocketInputStream.socketRead0(Native Method) ~[?:1.8.0_432]
2025-01-23T22:23:28.8648864Z    at 
java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[?:1.8.0_432]
2025-01-23T22:23:28.8649213Z    at 
java.net.SocketInputStream.read(SocketInputStream.java:171) ~[?:1.8.0_432]
2025-01-23T22:23:28.8649547Z    at 
java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_432]
2025-01-23T22:23:28.8649940Z    at 
org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
 ~[httpcore-4.4.16.jar:4.4.16]
2025-01-23T22:23:28.8650500Z    at 
org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
 ~[httpcore-4.4.16.jar:4.4.16]
2025-01-23T22:23:28.8650946Z    at 
org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
 ~[httpcore-4.4.16.jar:4.4.16]
2025-01-23T22:23:28.8651401Z    at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
 ~[httpclient-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8651883Z    at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
 ~[httpclient-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8652334Z    at 
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
 ~[httpcore-4.4.16.jar:4.4.16]
2025-01-23T22:23:28.8652800Z    at 
org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
 ~[httpcore-4.4.16.jar:4.4.16]
2025-01-23T22:23:28.8653375Z    at 
org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157) 
~[httpclient-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8653787Z    at 
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
 ~[httpcore-4.4.16.jar:4.4.16]
2025-01-23T22:23:28.8654193Z    at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
 ~[httpcore-4.4.16.jar:4.4.16]
2025-01-23T22:23:28.8654591Z    at 
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) 
~[httpclient-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8654980Z    at 
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) 
~[httpclient-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8655426Z    at 
org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) 
~[httpclient-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8655810Z    at 
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) 
~[httpclient-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8656207Z    at 
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
 ~[httpclient-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8656620Z    at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
 ~[httpclient-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8657026Z    at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
 ~[httpclient-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8657420Z    at 
org.apache.http.client.fluent.Request.internalExecute(Request.java:173) 
~[fluent-hc-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8657792Z    at 
org.apache.http.client.fluent.Request.execute(Request.java:177) 
~[fluent-hc-4.5.14.jar:4.5.14]
2025-01-23T22:23:28.8658231Z    at 
org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.get(RemoteHoodieTableFileSystemView.java:552)
 ~[hudi-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
2025-01-23T22:23:28.8658764Z    at 
org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.executeRequest(RemoteHoodieTableFileSystemView.java:190)
 ~[hudi-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
2025-01-23T22:23:28.8659311Z    at 
org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.getLatestBaseFilesFromParams(RemoteHoodieTableFileSystemView.java:237)
 ~[hudi-common-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
2025-01-23T22:23:28.8660076Z    ... 37 more {code}
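For context, the "Got error running preferred function. Trying secondary" line in the trace comes from PriorityBasedFileSystemView's failover behavior: it tries the preferred (remote/timeline-server) view first and falls back to the secondary view on error. A minimal sketch of that pattern (a hypothetical standalone class, not the actual Hudi implementation):

```java
import java.util.function.Supplier;

// Sketch of the preferred/secondary failover pattern seen in the log above.
// PriorityBasedFileSystemView.execute behaves roughly like this: call the
// preferred (remote) view; on failure, log and retry with the secondary view.
public class PriorityFallbackSketch {
    static <T> T execute(Supplier<T> preferred, Supplier<T> secondary) {
        try {
            return preferred.get();
        } catch (RuntimeException e) {
            // Corresponds to the WARN/ERROR line in the stack trace.
            System.out.println("Got error running preferred function. Trying secondary");
            return secondary.get();
        }
    }

    public static void main(String[] args) {
        // Simulate the remote view timing out, so the secondary view answers.
        String result = execute(
            () -> { throw new RuntimeException("Read timed out"); },
            () -> "base-files-from-secondary-view");
        System.out.println(result);
    }
}
```

Note that in the failing tests the preferred call does not fail fast; it blocks until the HTTP read timeout fires, which is why the fallback alone does not prevent the long CI runtimes.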
 

 

> Revert single file slice optimisation for getRecordsByKeys in MDT table
> -----------------------------------------------------------------------
>
>                 Key: HUDI-8621
>                 URL: https://issues.apache.org/jira/browse/HUDI-8621
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Sagar Sumit
>            Assignee: Lokesh Jain
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.0.1
>
>
> In [https://github.com/apache/hudi/pull/12376], we attempted to revert the 
> single file slice optimization and perform computations such as 
> getRecordsByKeys over executors even for a single file slice. This means that 
> when listing files using the metadata files index, even if the data partition 
> has only one file slice, the computation happens on the executor and the 
> request is sent to the timeline server (RemoteFileSystemView). However, we 
> noticed that the timeline server did not respond and the request timed out 
> when bootstrapping a MOR table with multiple partition fields.
> To reproduce locally, follow these steps:
>  # First, revert the single file slice optimization in 
> HoodieBackedTableMetadata. See this commit for reference: 
> [https://github.com/codope/hudi/commit/e9f58e007b8428e52f7d3d60e655108376950679]
>  # Then run `TestBootstrapRead.testBootstrapFunctional`. The COW case passes, 
> but the MOR case with 2 partition fields hangs while fetching from the fs 
> view.
>  
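When reproducing, it may help to distinguish a genuine hang from a merely slow response by raising the remote view's read timeout. Assuming the standard FileSystemViewStorageConfig key (default 300 seconds), a write-config fragment would look like:

```properties
# Raise the remote file system view read timeout (default 300s) so a slow
# timeline server response is not misreported as a hang during the repro.
hoodie.filesystem.view.remote.timeout.secs=600
```

If the MOR case still blocks with a generous timeout, that points to the timeline server never answering the request rather than answering slowly.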



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
