cshuo commented on PR #18103: URL: https://github.com/apache/hudi/pull/18103#issuecomment-3871569955
> @cshuo thx for reviewing, all comments are resolved except the `transient` comment.
>
> If we remove `transient`, jobs will fail to recover because of state incompatibility when `read.splits.limit` is enabled.
>
> Exception as below:
>
> [image: screenshot 2026-02-09 11 49 00]

Got it. `inputSplitsState` in `StreamReadMonitoringFunction` uses the fallback Kryo serializer for state serialization, which doesn't support state schema evolution. But `StreamReadOperator` also has an `inputSplitsState`, a list state of `MergeOnReadInputSplit` that uses `JavaSerializer` as the state serializer, and that one does support state schema evolution. So it's somewhat ambiguous to add a new field to `MergeOnReadInputSplit` with `transient`. Maybe we can just extract the partition path from the file path directly, since it doesn't incur much extra cost.
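To make the last suggestion concrete, here is a rough sketch of deriving the partition path from a file path with plain string handling. It assumes the table path and the data file path are both available as strings with the same scheme and authority; the class `PartitionPathSketch` and the method `relativePartitionPath` are hypothetical names for illustration, not existing Hudi APIs.

```java
public class PartitionPathSketch {

  /**
   * Hypothetical helper: returns the partition path of a data file relative
   * to the table base path, or "" for a non-partitioned table.
   * Assumes both paths use the same scheme/authority prefix.
   */
  public static String relativePartitionPath(String tablePath, String filePath) {
    // Normalize the table path so "/warehouse/t1/" and "/warehouse/t1" behave the same.
    String base = tablePath;
    while (base.endsWith("/")) {
      base = base.substring(0, base.length() - 1);
    }
    String prefix = base + "/";
    if (!filePath.startsWith(prefix)) {
      throw new IllegalArgumentException(
          "File " + filePath + " is not under table path " + tablePath);
    }
    // Drop the table path, then drop the file name after the last '/'.
    String relative = filePath.substring(prefix.length());
    int lastSlash = relative.lastIndexOf('/');
    return lastSlash < 0 ? "" : relative.substring(0, lastSlash);
  }

  public static void main(String[] args) {
    // Example paths are made up for illustration.
    String tablePath = "hdfs://nn:8020/warehouse/t1";
    String filePath = tablePath + "/dt=2026-02-09/hh=11/some-file.parquet";
    System.out.println(relativePartitionPath(tablePath, filePath));
  }
}
```

Since this only does a prefix check and two substring calls per split, the cost is negligible, and `MergeOnReadInputSplit` would need no new field, so neither state serializer is affected.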
