Re: [PR] [#134] improvement(spark3): Use taskId and attemptNo as taskAttemptId [incubator-uniffle]

via GitHub Thu, 15 Feb 2024 22:26:43 -0800


EnricoMi commented on PR #1529:
URL: 
https://github.com/apache/incubator-uniffle/pull/1529#issuecomment-1947826787


   Re 
https://github.com/apache/incubator-uniffle/pull/1514#issuecomment-1947743242:
   
   > AttemptNo will waste some bits. If we increase the bits, the bitmap will
   > occupy more memory.
   
   The current config is not optimal:
   
       public static final int PARTITION_ID_MAX_LENGTH = 24;
       public static final int TASK_ATTEMPT_ID_MAX_LENGTH = 21;
       public static final int ATOMIC_INT_MAX_LENGTH = 18;
   
   A `PARTITION_ID_MAX_LENGTH` of 24 supports 2^24 = 16,777,216 partitions. With
   an assumed optimal partition size of 200 MB, this would easily support a
   dataset of about 3 PB. I think that can be reduced a bit.
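
   To make that arithmetic easy to check, here is a minimal sketch. The class
   and method names are hypothetical, and the 200 MB figure is only the assumed
   partition size from above (taken as decimal megabytes), not a Uniffle setting.

```java
final class PartitionCapacity {
    // Assumption: roughly 200 MB per partition, in decimal megabytes.
    static final long ASSUMED_PARTITION_SIZE_BYTES = 200L * 1000 * 1000;

    /** Number of distinct partition ids representable with the given bit length. */
    static long maxPartitions(int partitionIdBits) {
        return 1L << partitionIdBits;
    }

    /** Total dataset size in bytes if every partition holds the assumed size. */
    static long maxDatasetBytes(int partitionIdBits) {
        return maxPartitions(partitionIdBits) * ASSUMED_PARTITION_SIZE_BYTES;
    }

    public static void main(String[] args) {
        int bits = 24; // current PARTITION_ID_MAX_LENGTH
        System.out.printf("%d bits: %d partitions, about %.2f PB%n",
            bits, maxPartitions(bits), maxDatasetBytes(bits) / 1e15);
    }
}
```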
   
   Further, a `TASK_ATTEMPT_ID_MAX_LENGTH` that is smaller than 
`PARTITION_ID_MAX_LENGTH` does not make sense. A single stage with 
2^`PARTITION_ID_MAX_LENGTH` partitions would create at least as many task 
attempt ids, which immediately exhausts `TASK_ATTEMPT_ID_MAX_LENGTH`. So there 
is room for improvement.
   
   With the improvement in #1529, you would set `TASK_ATTEMPT_ID_MAX_LENGTH` =
   `PARTITION_ID_MAX_LENGTH` + 2, where the 2 extra bits cover the attempt
   number for the default max failures of 4.
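
   A rough illustration of why 2 extra bits are enough. This is not the code
   from #1529: the class, the method names, and the choice to put the attempt
   number in the low bits are all assumptions made for this sketch.

```java
final class TaskAttemptIdSketch {
    /** Bits needed to represent attempt numbers 0 .. maxFailures - 1. */
    static int attemptBits(int maxFailures) {
        return 32 - Integer.numberOfLeadingZeros(maxFailures - 1);
    }

    /**
     * Hypothetical packing: task index in the high bits, attempt number in
     * the low bits. The real bit order in the PR may differ.
     */
    static long pack(int taskIndex, int attemptNo, int maxFailures) {
        int bits = attemptBits(maxFailures);
        if (attemptNo < 0 || attemptNo >= (1 << bits)) {
            throw new IllegalArgumentException("attemptNo out of range: " + attemptNo);
        }
        return ((long) taskIndex << bits) | attemptNo;
    }

    public static void main(String[] args) {
        int partitionIdBits = 21;
        int lastTaskIndex = (1 << partitionIdBits) - 1;
        // The largest id for 2^21 tasks and 4 attempts still fits in 21 + 2 = 23 bits.
        long largest = pack(lastTaskIndex, 3, 4);
        System.out.println(largest == (1L << 23) - 1);
    }
}
```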
   
   If you would like to support 2 million partitions and 4 max failures, then
   you would use:
   
       public static final int PARTITION_ID_MAX_LENGTH = 21;
       public static final int TASK_ATTEMPT_ID_MAX_LENGTH = 23;
       public static final int ATOMIC_INT_MAX_LENGTH = 19;
   
   I think 2 million partitions is quite a lot (it supports datasets of about
   400 TB), and `ATOMIC_INT_MAX_LENGTH` would even increase from 18 to 19 bits
   with that.
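
   The split can be derived mechanically. The sketch below assumes a fixed
   budget of 63 bits, which is simply what the three lengths add up to in both
   configurations quoted here; the class and method names are hypothetical.

```java
final class BlockIdLayoutSketch {
    // Assumption: 24 + 21 + 18 = 21 + 23 + 19 = 63, treated as a fixed budget.
    static final int TOTAL_BITS = 63;

    /** Smallest number of bits that can represent `count` distinct values (count >= 1). */
    static int bitsFor(long count) {
        return 64 - Long.numberOfLeadingZeros(count - 1);
    }

    /** Returns {partitionIdBits, taskAttemptIdBits, atomicIntBits}. */
    static int[] layout(long maxPartitions, int maxFailures) {
        int partitionIdBits = bitsFor(maxPartitions);
        // Task attempt id = partition-sized task index plus bits for the attempt number.
        int taskAttemptIdBits = partitionIdBits + bitsFor(maxFailures);
        // Whatever is left over goes to the atomic int part.
        int atomicIntBits = TOTAL_BITS - partitionIdBits - taskAttemptIdBits;
        if (atomicIntBits <= 0) {
            throw new IllegalArgumentException("no bits left for the atomic int");
        }
        return new int[] {partitionIdBits, taskAttemptIdBits, atomicIntBits};
    }

    public static void main(String[] args) {
        int[] bits = layout(2_000_000L, 4);
        System.out.println("PARTITION_ID_MAX_LENGTH = " + bits[0]);
        System.out.println("TASK_ATTEMPT_ID_MAX_LENGTH = " + bits[1]);
        System.out.println("ATOMIC_INT_MAX_LENGTH = " + bits[2]);
    }
}
```

   Keeping the current 24 partition bits under the same rule would leave only
   13 bits for `ATOMIC_INT_MAX_LENGTH`, which is why reducing
   `PARTITION_ID_MAX_LENGTH` matters.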


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


