ad1happy2go commented on issue #11178: URL: https://github.com/apache/hudi/issues/11178#issuecomment-2107163324
@MrAladdin
1. Ideally this should not be the cause of that exception; it looks more like the parquet file got corrupted. Are you hitting this issue frequently?
2. Not sure about this one. Adding @xushiyan in case he knows.
3. If individual HFiles are too large, you can increase the file group count. It seems too many record keys are being assigned to each file group. Once you restart the writer (the Spark Streaming job), the new count takes effect for new writes. To fix the size of the already existing index files, you will need to recreate the record index.
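
For point 3, a minimal sketch of the relevant writer options (property names assumed from Hudi 0.14+ metadata-table/record-index configs; the counts here are illustrative, verify names and defaults against your Hudi version's docs):

```
# Enable the record-level index in the metadata table
hoodie.metadata.record.index.enable=true
# Raise the file group count so record keys spread across more,
# smaller HFiles (values below are examples, not recommendations)
hoodie.metadata.record.index.min.filegroup.count=10
hoodie.metadata.record.index.max.filegroup.count=100
```

These take effect for new writes after the streaming job is restarted; existing index file groups keep their old sizing until the record index is rebuilt.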
