jihoonson commented on issue #7338: Overwrite index task maxTotalRows with computed maxRowsPerSegments
URL: https://github.com/apache/incubator-druid/pull/7338#issuecomment-477780789

Oh, `maxTotalRows` and `numShards` look similar but are different.

An appenderator can append to multiple segments at the same time. For example, if the segment granularity is `DAY` and two rows from different days are added, the appenderator appends each row to a different segment. `maxTotalRows` limits the total number of rows across all segments that are actively being appended to, over all [time chunks](http://druid.io/docs/latest/ingestion/index.html).

`numShards` is quite different. `numShards` and `maxRowsPerSegment` are configurations that determine how many segments are created per time chunk. If `numShards` is set, the index task generates exactly that number of segments. If `maxRowsPerSegment` is set, the index task creates a new segment whenever the number of rows in the segment being generated reaches the threshold. Once a new segment is created, the old segment is no longer active and is not counted when checking `maxTotalRows`.

> If there is multiple-segment situation, we need to set maxRowsPerSegments to a high value e.g. Long.MAX_VALUE?

Yeah, I think it makes more sense to set it to `Long.MAX_VALUE` so that segments of the intended size are generated, unless it's explicitly set by users.
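To make the distinction concrete, here is a minimal toy model of the behavior described in this comment. It is not Druid code: `AppendSketch` and all of its methods are hypothetical names made up for illustration, and time chunks are represented as plain strings. It only shows that `maxRowsPerSegment` acts per segment within a time chunk, while `maxTotalRows` is checked against the rows in active segments across all time chunks.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; every name here is hypothetical and not part of Druid.
final class AppendSketch {
    private final int maxRowsPerSegment;
    // Rows in the currently active segment of each time chunk.
    private final Map<String, Integer> activeRows = new HashMap<>();
    // Number of segments created so far per time chunk (including the active one).
    private final Map<String, Integer> segmentsPerChunk = new HashMap<>();

    AppendSketch(int maxRowsPerSegment) {
        this.maxRowsPerSegment = maxRowsPerSegment;
    }

    // Append one row to the time chunk it falls into (e.g. one chunk per day).
    void add(String timeChunk) {
        int rows = activeRows.getOrDefault(timeChunk, 0);
        if (!segmentsPerChunk.containsKey(timeChunk)) {
            // First row for this time chunk: open its first segment.
            segmentsPerChunk.put(timeChunk, 1);
        } else if (rows >= maxRowsPerSegment) {
            // Threshold reached: the old segment becomes inactive, a new one starts.
            segmentsPerChunk.merge(timeChunk, 1, Integer::sum);
            rows = 0;
        }
        activeRows.put(timeChunk, rows + 1);
    }

    // Rows in active segments only, summed over all time chunks.
    int totalActiveRows() {
        int sum = 0;
        for (int r : activeRows.values()) {
            sum += r;
        }
        return sum;
    }

    int segmentCount(String timeChunk) {
        return segmentsPerChunk.getOrDefault(timeChunk, 0);
    }

    // The check that a maxTotalRows-style limit would be compared against.
    boolean reachedTotalLimit(long maxTotalRows) {
        return totalActiveRows() >= maxTotalRows;
    }
}
```

With `maxRowsPerSegment = 2`, adding one row for each of two days opens one active segment per day. Adding two more rows for the first day rolls that day over to a second segment, and the rows in the retired segment no longer count toward the total.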
