WingsGo commented on pull request #3694: URL: https://github.com/apache/incubator-doris/pull/3694#issuecomment-634546368
> > Hi, @morningman I have some questions about olap_scan_node, could you please help me figure them out?
> >
> > 1. What is the relationship between scanner and tablet? Can one tablet be scanned by many scanners? Is it n:1 or 1:1?
> > 2. What is the relationship between TPaloScanRange and ColumnValueRange? Does ColumnValueRange handle the predicate case, while TPaloScanRange indicates the range a scanner needs to scan from the StorageEngine's prefix index?
> > 3. What is the purpose of the function `extend_scan_key` and the config variable `doris_scanner_row_num`?
> >
> > I would appreciate your reply, thanks~
>
> 1. Scanner and tablet have an `n:m` relationship.
> 2. I am not familiar with `TPaloScanRange`, need a further look...
> 3. `extend_scan_key` is used to extend the scan keys. `scan keys` are used to determine the number of scanners.
>
>    For example, `WHERE a in (1,2,3) and b > 5`, where `a` and `b` are key columns of the table.
>
>    First, for `a in (1,2,3)`, the `scan keys` become: `(a=1),(a=2),(a=3)`
>    Second, for `b > 5`, the `scan keys` extend to `(a=1, b>5), (a=2, b>5), (a=3, b>5)`
>
>    `doris_scanner_row_num` is used for priority scheduling of scanners. Doris has a scanner thread pool that is shared by all scanners, so a scheduling strategy is needed to prevent one scanner from occupying a thread for a long time. `doris_scanner_row_num` controls the maximum number of rows a scanner can read in one scheduling round. For the finer details of the algorithm, you will need to read the code.
>
> @chaoyli may understand this better. If I said something wrong, he can help correct it.

Thanks for your reply. So if the `scan keys` extend to `(a=1, b>5), (a=2, b>5), (a=3, b>5)`, will Doris split the scan into 3 scanners, each scanning the RowBatches that satisfy the predicate? Do I understand correctly?
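For readers following the `scan keys` example in the thread, here is a minimal Python sketch of the extension step as described: each key-column predicate is appended to every scan key built so far. This is only an illustration, not Doris's actual `extend_scan_key` implementation; the function `extend_scan_keys`, the helper `format_key`, and the tuple representation are hypothetical. It also does not model how Doris maps scan keys to scanners, which is the open question above.

```python
def extend_scan_keys(scan_keys, column, conditions):
    """Append each condition on `column` to every existing scan key.

    A scan key is modeled (hypothetically) as a tuple of (column, condition)
    pairs. An empty input list is treated as one empty key, so the first
    call seeds the list.
    """
    base = scan_keys or [()]
    return [key + ((column, cond),) for key in base for cond in conditions]


def format_key(key):
    """Render a scan key in the notation used in the thread, e.g. (a=1, b>5)."""
    return "(" + ", ".join(col + cond for col, cond in key) + ")"


# WHERE a IN (1, 2, 3) AND b > 5, with a and b as key columns.
keys = extend_scan_keys([], "a", ["=1", "=2", "=3"])  # (a=1), (a=2), (a=3)
keys = extend_scan_keys(keys, "b", [">5"])            # each key gains b>5

formatted = [format_key(k) for k in keys]
print(formatted)
```

Under these assumptions the `IN` list fans out into three keys and the range predicate on `b` is attached to each of them, matching the `(a=1, b>5), (a=2, b>5), (a=3, b>5)` result quoted above.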
