EnricoMi commented on PR #1529: URL: https://github.com/apache/incubator-uniffle/pull/1529#issuecomment-1947826787
Re https://github.com/apache/incubator-uniffle/pull/1514#issuecomment-1947743242: > AttemptNo will waste some bits. If we increase the bits, the bitmap will occupy more memory. The current config is not optimal: public static final int PARTITION_ID_MAX_LENGTH = 24; public static final int TASK_ATTEMPT_ID_MAX_LENGTH = 21; public static final int ATOMIC_INT_MAX_LENGTH = 18; The `PARTITION_ID_MAX_LENGTH` supports 16,777,216 partitions, with an assumed optimal partition size of 200 MB this would easily support a dataset of 3 PB. I think that can be reduced a bit. Further, a `TASK_ATTEMPT_ID_MAX_LENGTH` that is smaller than `PARTITION_ID_MAX_LENGTH` does not make sense. A single stage with 2^`PARTITION_ID_MAX_LENGTH` partitions would create at least as many task attempt ids, which immediately exhausts `TASK_ATTEMPT_ID_MAX_LENGTH`. So there is room for improvement. With the improvement in #1529 you would set `TASK_ATTEMPT_ID_MAX_LENGTH` = `PARTITION_ID_MAX_LENGTH` + 2 (for the default max failures of 4). If you would like to support 2 m partitions and 4 max failures, then you would use: public static final int PARTITION_ID_MAX_LENGTH = 21; public static final int TASK_ATTEMPT_ID_MAX_LENGTH = 23; public static final int ATOMIC_INT_MAX_LENGTH = 19; I think 2 m partitions is quite a lot (supports 400 TB datasets) and `ATOMIC_INT_MAX_LENGTH` would even be increased with that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
