bsikander commented on issue #23179: URL: https://github.com/apache/beam/issues/23179#issuecomment-1252452868
@mosche I just noticed that there is one more difference between my old and new runs: a `DropFields.fields` call was added to drop some duplicates, and this results in a shuffle. Based on what you suggested, this could be the reason the file size exploded. What is generally a good way to fix this? Should I sort the data on some column after `DropFields`?
