jihoonson commented on issue #7338: Overwrite index task maxTotalRows with 
computed maxRowsPerSegments
URL: https://github.com/apache/incubator-druid/pull/7338#issuecomment-477780789
 
 
   Oh, `maxTotalRows` and `numShards` look similar, but they are different. An 
appenderator can append to multiple segments at the same time. For example, if 
the segment granularity is `DAY` and two rows from different days are added, the 
appenderator appends each row to a different segment. `maxTotalRows` limits the 
total number of rows across all segments that are actively being appended to, 
over all [time 
chunks](http://druid.io/docs/latest/ingestion/index.html).
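   
   To make the routing concrete, here is a minimal sketch of the idea, not 
Druid's actual appenderator code. The class `ToyAppenderator` and all of its 
methods are hypothetical names for illustration; it assumes `DAY` granularity 
in UTC and only counts rows, and it does not model what happens once the limit 
is reached.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only (hypothetical class, not Druid's Appenderator API). */
public class ToyAppenderator {
    // Rows currently held in the active segment of each DAY time chunk.
    private final Map<LocalDate, Long> activeRowsPerDay = new TreeMap<>();
    private final long maxTotalRows;

    public ToyAppenderator(long maxTotalRows) {
        this.maxTotalRows = maxTotalRows;
    }

    /** Routes a row to the active segment of its UTC day and returns that day. */
    public LocalDate add(Instant timestamp) {
        LocalDate day = timestamp.atZone(ZoneOffset.UTC).toLocalDate();
        activeRowsPerDay.merge(day, 1L, Long::sum);
        return day;
    }

    /** Number of segments being appended to at the same time. */
    public int activeSegmentCount() {
        return activeRowsPerDay.size();
    }

    /** Sum of rows over all active segments, across all time chunks. */
    public long totalActiveRows() {
        return activeRowsPerDay.values().stream().mapToLong(Long::longValue).sum();
    }

    /** The check that maxTotalRows stands for in this toy model. */
    public boolean reachedMaxTotalRows() {
        return totalActiveRows() >= maxTotalRows;
    }

    public static void main(String[] args) {
        ToyAppenderator appenderator = new ToyAppenderator(3);
        appenderator.add(Instant.parse("2019-03-28T10:00:00Z"));
        appenderator.add(Instant.parse("2019-03-29T10:00:00Z"));
        // Two rows from different days land in two different segments.
        System.out.println(appenderator.activeSegmentCount());  // prints 2
        System.out.println(appenderator.reachedMaxTotalRows()); // prints false
    }
}
```

   The limit is checked against the sum over all days, so a third row on 
either day would reach `maxTotalRows = 3` even though no single segment holds 
three rows.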
   
   `numShards` is quite different. `numShards` and `maxRowsPerSegment` are 
configurations that determine how many segments are created per time chunk. 
If `numShards` is set, the index task generates exactly that number of 
segments. If `maxRowsPerSegment` is set, the index task creates a new 
segment whenever the number of rows in the segment being generated reaches the 
threshold. Once a new segment is created, the old segment is no longer active 
and is not counted toward `maxTotalRows`.
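   
   As a rough illustration of the `maxRowsPerSegment` behavior, the sketch 
below models a single time chunk. `ToyTimeChunk` and its methods are 
hypothetical names, not Druid classes, and it assumes the new segment is 
opened lazily, when the next row arrives after the threshold was reached.

```java
/** Toy model of one time chunk (hypothetical class, not Druid code). */
public class ToyTimeChunk {
    private final long maxRowsPerSegment;
    private long activeSegmentRows = 0; // rows in the segment still being appended to
    private int closedSegments = 0;     // segments that already reached the threshold

    public ToyTimeChunk(long maxRowsPerSegment) {
        if (maxRowsPerSegment <= 0) {
            throw new IllegalArgumentException("maxRowsPerSegment must be positive");
        }
        this.maxRowsPerSegment = maxRowsPerSegment;
    }

    /** Adds one row, starting a new segment if the active one is full. */
    public void add() {
        if (activeSegmentRows == maxRowsPerSegment) {
            closedSegments++;      // the old segment is no longer active
            activeSegmentRows = 0; // a new segment starts empty
        }
        activeSegmentRows++;
    }

    /** Segments created so far in this time chunk. */
    public int segmentCount() {
        return closedSegments + (activeSegmentRows > 0 ? 1 : 0);
    }

    /** Only the active segment's rows count toward maxTotalRows. */
    public long rowsCountedTowardMaxTotalRows() {
        return activeSegmentRows;
    }

    public static void main(String[] args) {
        ToyTimeChunk chunk = new ToyTimeChunk(2);
        for (int i = 0; i < 5; i++) {
            chunk.add();
        }
        System.out.println(chunk.segmentCount());                  // prints 3
        System.out.println(chunk.rowsCountedTowardMaxTotalRows()); // prints 1
    }
}
```

   With five rows and a threshold of two, three segments exist, but only the 
single row in the last, still-active segment counts toward `maxTotalRows`.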
   
   > If there is multiple-segment situation, we need to set maxRowsPerSegments 
to a high value e.g. Long.MAX_VALUE?
   
   Yeah, I think it makes more sense to set it to `Long.MAX_VALUE` so that 
segments of the intended size are generated, unless it's explicitly set by users.

