cheezman34 opened a new issue #12311:
URL: https://github.com/apache/druid/issues/12311
### Affected Version
21.1
### Description
```
SELECT "field1", "field2", "subquery_count", Count(*) as "count"
FROM datamodel
LEFT JOIN (
SELECT "field1" as "subquery_field", Count(*) as "subquery_count"
FROM datamodel
WHERE ("field1", "field2") IN (('a1, 'a2'), ('b1, 'b2'), {more_values})
AND {various_other_filters}
GROUP BY 1
) ON "subquery_field"="field1"
WHERE ("field1", "field2") IN (('a1, 'a2'), ('b1, 'b2'), {more_values})
AND {various_other_filters}
GROUP BY 1,2,3
```
The issue appears to crop up whenever we have a query with a LEFT JOIN and
the `("field1", "field2") IN (('a1, 'a2'), ('b1, 'b2'), {more_values})` has
more than 16 pairs of values on the right side. It does matter if the values on
the right side are unique, i.e. if a1=b1=c1 for every pair, then the crash does
not happen (likely runs a more optimal query plan in that case). The query
will take a long time and will eventually crash the historical, even sometimes
if you cancel the query well before the OOM is hit. It's a bit surprising that
it happens at all, because the total amount of data the subquery would return
should be pretty small.
cluster is 2 historicals, 2 routers, 1 broker
broker settings:
```
- name: DRUID_XMX
value: 4g
- name: DRUID_XMS
value: 4g
- name: DRUID_MAXDIRECTMEMORYSIZE
value: 3g
- name: JAVA_OPTS
value: -XX:+UseG1GC -XX:+ExitOnOutOfMemoryError
- name: druid_broker_http_numConnections
value: "25"
- name: druid_broker_http_maxQueuedBytes
value: "10000000"
- name: druid_server_http_numThreads
value: "30"
- name: druid_plaintextPort
value: "8082"
- name: druid_broker_cache_useCache
value: "false"
- name: druid_broker_cache_populateCache
value: "false"
- name: druid_processing_buffer_sizeBytes
value: "524288000"
- name: druid_processing_numThreads
value: "3"
- name: druid_processing_numMergeBuffers
value: "2"
- name: druid_sql_enable
value: "true"
```
historical settings:
```
- name: DRUID_XMX
value: 9g
- name: DRUID_XMS
value: 9g
- name: DRUID_MAXDIRECTMEMORYSIZE
value: 7g
- name: JAVA_OPTS
value: -XX:+UseG1GC -XX:+ExitOnOutOfMemoryError
- name: druid_server_http_numThreads
value: "60"
- name: druid_plaintextPort
value: "8083"
- name: druid_processing_numThreads
value: "8"
- name: druid_processing_numMergeBuffers
value: "2"
- name: druid_processing_buffer_sizeBytes
value: "524288000"
- name: druid_segmentCache_locations
value: '[{"maxSize":2199023255552,"path":"var/druid/segment-cache"}]'
- name: druid_segmentCache_numLoadingThreads
value: "12"
- name: druid_segmentCache_lazyLoadOnStart
value: "true"
- name: druid_server_maxSize
value: "2199023255552"
- name: druid_historical_cache_useCache
value: "true"
- name: druid_historical_cache_populateCache
value: "true"
- name: druid_cache_sizeInBytes
value: "1073741824"
- name: druid_monitoring_monitors
value:
'["org.apache.druid.java.util.metrics.JvmMonitor","org.apache.druid.server.metrics.QueryCountStatsMonitor","org.apache.druid.server.metrics.HistoricalMetricsMonitor","org.apache.druid.client.cache.CacheMonitor"]'
- name: druid_query_groupBy_maxOnDiskStorage
value: "10737418240"
```
plugins:
```
druid-aws-common
org.apache.druid.common.aws.AWSModule
0.21.1
druid-gcp-common
org.apache.druid.common.gcp.GcpModule
0.21.1
postgresql-metadata-storage
org.apache.druid.metadata.storage.postgresql.PostgreSQLMetadataStorageModule
0.21.1
druid-kafka-indexing-service
org.apache.druid.indexing.kafka.KafkaIndexTaskModule
0.21.1
druid-histogram
org.apache.druid.query.aggregation.histogram.ApproximateHistogramDruidModule
0.21.1
druid-datasketches
org.apache.druid.query.aggregation.datasketches.theta.SketchModule
0.21.1
druid-datasketches
org.apache.druid.query.aggregation.datasketches.theta.oldapi.OldApiSketchModule
0.21.1
druid-datasketches
org.apache.druid.query.aggregation.datasketches.quantiles.DoublesSketchModule
0.21.1
druid-datasketches
org.apache.druid.query.aggregation.datasketches.tuple.ArrayOfDoublesSketchModule
0.21.1
druid-datasketches
org.apache.druid.query.aggregation.datasketches.hll.HllSketchModule
0.21.1
druid-lookups-cached-global
org.apache.druid.server.lookup.namespace.NamespaceExtractionModule
0.21.1
druid-hdfs-storage
org.apache.druid.storage.hdfs.HdfsStorageDruidModule
0.21.1
druid-stats
org.apache.druid.query.aggregation.stats.DruidStatsModule
0.21.1
statsd-emitter
org.apache.druid.emitter.statsd.StatsDEmitterModule
0.21.1
druid-processing
org.apache.druid.guice.BloomFilterExtensionModule
0.21.1
```
Any more configuration / details worth sharing?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]