geserdugarov commented on issue #12155: URL: https://github.com/apache/hudi/issues/12155#issuecomment-2435110468
As a result, the current hash filtering is broken, and the two examples above show that even ordinary jobs can end up with very badly distributed data. However, fixing this filtering could introduce duplicates in the resulting data if part of it was written using the current hash and writing continued with the fixed hash after an upgrade. Perhaps the only option is to attach this change to the 1.0 release, if some backward compatibility is already being broken there.
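To illustrate the duplication risk, here is a minimal sketch in Python. It is not Hudi code: `old_hash`, `new_hash`, `upsert`, `copies` and `simulate_upgrade` are hypothetical names, the two hash functions are arbitrary stand-ins, and the "table" is just a list of dicts where a record is routed to a bucket by `hash(key) % NUM_BUCKETS`. The only assumption that matters is that an upsert looks for an existing record only in the bucket the hash points to.

```python
import zlib

NUM_BUCKETS = 4


def old_hash(key):
    # Hypothetical stand-in for the current hash (not Hudi's actual code).
    return len(key)


def new_hash(key):
    # Hypothetical stand-in for a fixed hash (also not Hudi's actual code).
    return zlib.crc32(key.encode("utf-8"))


def upsert(buckets, key, value, hash_fn):
    # An upsert only touches the bucket that the hash routes the key to.
    buckets[hash_fn(key) % NUM_BUCKETS][key] = value


def copies(buckets, key):
    # Number of buckets that currently hold a record for this key.
    return sum(1 for bucket in buckets if key in bucket)


def simulate_upgrade(keys):
    buckets = [dict() for _ in range(NUM_BUCKETS)]
    for key in keys:
        upsert(buckets, key, "v1", old_hash)  # written before the upgrade
    for key in keys:
        upsert(buckets, key, "v2", new_hash)  # updated after the upgrade
    return buckets


keys = [f"key-{i}" for i in range(100)]
buckets = simulate_upgrade(keys)
duplicated = [k for k in keys if copies(buckets, k) > 1]
print(f"{len(duplicated)} of {len(keys)} keys now exist in two buckets")
```

Every key whose bucket differs between the two hash functions ends up with a stale `v1` copy in the old bucket and a `v2` copy in the new one, which is the kind of duplication I would expect if the hash is changed under already written data.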
