jihoonson commented on issue #10210: URL: https://github.com/apache/druid/issues/10210#issuecomment-663657288
@vikramsinghchandel thanks for the details! There was a bug in the native batch task with rollup when `partitionDimensions` is not set (https://github.com/apache/druid/pull/9861), which was fixed in 0.19. Could you try again with explicit `partitionDimensions` (you can list all dimensions there)? Note that the `timestamp` column is not included in hash partitioning when `partitionDimensions` is explicit, whereas it is included when they are implicit. Could you then compare the rollup ratio of the Hadoop and native tasks with explicit `partitionDimensions`?

Regarding performance, if you still see a difference even with the same rollup ratio, I'm wondering how many tasks were actually running for Hadoop and native. Could you double-check that they used the same number of worker tasks in each of the map and reduce phases?

BTW, when I last tested the Indexer, it didn't perform as well as the middleManager because of its limits on global memory usage and concurrent segment persists/merges. You might see a different result with middleManagers.
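
For reference, a minimal sketch of what a native batch `tuningConfig` with explicit `partitionDimensions` might look like. The dimension names (`country`, `device`), the `numShards` value, and the `maxNumConcurrentSubTasks` value are placeholders chosen for illustration, not values from this issue; substitute the dimensions and parallelism of your own spec.

```json
{
  "tuningConfig": {
    "type": "index_parallel",
    "forceGuaranteedRollup": true,
    "maxNumConcurrentSubTasks": 4,
    "partitionsSpec": {
      "type": "hashed",
      "numShards": 10,
      "partitionDimensions": ["country", "device"]
    }
  }
}
```

Hash partitioning in the parallel task applies only with `forceGuaranteedRollup` enabled, and `maxNumConcurrentSubTasks` caps how many worker tasks run at once, which is the number to line up against the Hadoop job when comparing performance.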
