Hfal91 opened a new issue, #13571:
URL: https://github.com/apache/hudi/issues/13571
**Describe the problem you faced**

Using Hudi 0.15 on EMR, a streaming job fails after exactly 1 hour. The issue seems related to HiveSync; it did not happen on 0.14.1.
**with metadata disabled:**
```
25/07/17 13:05:25 ERROR MicroBatchExecution: Query [id = 81196902-5cc5-45b5-86ca-8adcbc5bc236, runId = 9d7e67e3-a38b-4423-a19d-40d47663f944] terminated with error
py4j.Py4JException: An exception was raised by the Python Proxy. Return Message: Traceback (most recent call last):
  File "/usr/lib/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/clientserver.py", line 617, in _call_proxy
    return_value = getattr(self.pool[obj_id], method)(*params)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 120, in call
    raise e
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in call
    self.func(DataFrame(jdf, wrapped_session_jdf), batch_id)
  File "/tmp/spark-a13bcd8a-97f4-493e-9632-f1992f945236/advance-curation-Medium.py", line 178, in foreach_batch_function
    avroDf.write \
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 1461, in save
    self._jwrite.save()
  File "/usr/lib/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in __call__
    return_value = get_return_value(
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py", line 179, in deco
    return f(*a, **kw)
  File "/usr/lib/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o6783.save.
: org.apache.hudi.exception.HoodieMetaSyncException: Could not sync using the meta sync class org.apache.hudi.hive.HiveSyncTool
```
**with metadata enabled:**
```
25/07/17 11:47:48 WARN HoodieSparkSqlWriterInternal: Closing write client
25/07/17 11:48:34 WARN HttpParser: URI is too large >8192
25/07/17 11:48:34 ERROR PriorityBasedFileSystemView: Got error running preferred function. Trying secondary
org.apache.hudi.exception.HoodieRemoteException: status code: 414, reason phrase: URI Too Long
```
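The 414 indicates the remote (timeline-server-backed) file system view issued an HTTP request whose URI exceeded Jetty's 8192-byte default, presumably because of a long partition list. A hedged mitigation sketch — these are real Hudi config keys, but whether they resolve this particular failure is an assumption, not something confirmed in this report:

```python
# Hedged workaround sketch (assumption, not from the report): avoid the
# remote timeline-server file system view whose GET request overflows
# Jetty's 8 KiB URI limit.
FSVIEW_WORKAROUND = {
    # Build the file system view in memory instead of querying the
    # embedded timeline server over HTTP.
    'hoodie.filesystem.view.type': 'MEMORY',
    # Alternatively, disable the embedded timeline server entirely.
    'hoodie.embed.timeline.server': 'false',
}
```

Either option trades the timeline server's caching for per-executor view construction, so it is a diagnostic step rather than a recommended permanent setting.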
```python
HUDI_OPTIONS = {
    # Writer config
    'hoodie.datasource.write.table.type': INSERT,
    'hoodie.table.name': STREAMING_TABLENAME,
    'hoodie.database.name': DATABASE_NAME,
    'hoodie.datasource.write.recordkey.field': PRIMARY_KEY,
    'hoodie.datasource.write.partitionpath.field': PART_KEYS,
    'hoodie.datasource.write.table.name': STREAMING_TABLENAME,
    'hoodie.datasource.write.operation': COPY_ON_WRITE,
    'hoodie.datasource.write.precombine.field': 'TS',
    'hoodie.datasource.write.hive_style_partitioning': 'true',
    'hoodie.write.set.null.for.missing.columns': 'true',
    'hoodie.parquet.compression.codec': 'snappy',
    'hoodie.merge.allow.duplicate.on.inserts': 'true',
    'hoodie.enable.data.skipping': 'true',
    'hoodie.clean.automatic': 'true',
    'hoodie.clean.async': 'false',
    'hoodie.cleaner.commits.retained': '5',
    'hoodie.schema.on.read.enable': 'true',
    'hoodie.parquet.small.file.limit': 64 * 1020 * 1024,
    'hoodie.parquet.max.file.size': 128 * 1020 * 1024,
    'hoodie.index.type': 'RECORD_INDEX',
    # v0.15
    'hoodie.parquet.bloom.filter.enabled': 'false',
    'hoodie.datasource.meta.sync.glue.partition_index_fields.enable': 'true',
    'hoodie.datasource.meta.sync.glue.all_partitions_read_parallelism': 10,
    'hoodie.datasource.hive_sync.ignore_exceptions': 'false',
    'hoodie.metadata.enable': 'false',
    'hoodie.metadata.log.compaction.enable': 'false',
    # Hive Sync config
    'hoodie.datasource.meta.sync.enable': 'true',
    'hoodie.datasource.hive_sync.database': DATABASE_NAME,
    'hoodie.datasource.hive_sync.table': STREAMING_TABLENAME,
    'hoodie.datasource.hive_sync.partition_fields': PART_KEYS,
    'hoodie.datasource.hive_sync.enable': 'true',
    'hoodie.datasource.hive_sync.mode': 'hms',
    'hoodie.datasource.hive_sync.use_jdbc': 'false',
    'hoodie.datasource.hive_sync.skip_ro_suffix': 'true',
    'hoodie.datasource.hive_sync.auto_create_database': 'false',
    'hoodie.datasource.hive_sync.create_managed_table': 'false',
    'hoodie.datasource.hive_sync.ignore_exceptions': 'false',
    'hoodie.datasource.hive_sync.omit_metadata_fields': 'false',
    'hoodie.datasource.hive_sync.support_timestamp': 'true'
}
```
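For context, the traceback above points at an `avroDf.write` call inside `foreach_batch_function` (advance-curation-Medium.py, line 178). A hypothetical reconstruction of that write path follows — the placeholder `HUDI_OPTIONS` stands in for the full dict above, and names like `TABLE_PATH` and the wiring are assumptions, not the reporter's actual code:

```python
# Hypothetical sketch of the micro-batch write path implied by the traceback;
# HUDI_OPTIONS here is a stand-in for the full options dict in the report,
# and TABLE_PATH is an assumed placeholder.
HUDI_OPTIONS = {'hoodie.table.name': 'streaming_table'}
TABLE_PATH = 's3://my-bucket/curated/streaming_table'

def foreach_batch_function(avroDf, batch_id):
    # Each micro-batch is written through the Hudi datasource; a HiveSync
    # failure inside .save() surfaces as the Py4JJavaError shown above.
    (avroDf.write
        .format('hudi')
        .options(**HUDI_OPTIONS)
        .mode('append')
        .save(TABLE_PATH))

# Wired into the stream roughly as:
# (sourceDf.writeStream
#     .foreachBatch(foreach_batch_function)
#     .option('checkpointLocation', CHECKPOINT_PATH)
#     .start())
```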