tmichaud314 opened a new issue #7190:
URL: https://github.com/apache/incubator-pinot/issues/7190


   Hello,
   
   I have been testing the same `ST_Contains(<complex WKT>, my_st_point)` 
transformation function on a single machine (i.e., 8 core laptop with 32GB 
memory and SSD) with varying table size on a basic cluster of 1 
broker/controller/server container. This "kick the tires" experiment shows 
these rough results with the PQL submitted using the Pinot "Query Console" UI.
   
   | Table Size (records) | Query Response Time |
   | ----------------------------| -------------------------------|
   |300    | 10ms |
   |1K      | 8.5s |
   |10K    | 85s |
   |100K  | 80s |
   |1M     | 82s |
   |10M   | 84s |
   
   The query with much of the WKT omitted:
   ```
   select * from "test" where st_contains(st_geomfromtext("MULTIPOLYGON 
(((-107.888838605 41.0180204540003,-107.888889102 41.014255698,-107.888937695 
41.0106329080003,-107.879372778 41.0106025780002,-107.879471012 
41.0033569920004,-107.879532005 41.0019229880002,-107.886612003 
41.0019229880002,-107.889095926 41.0019260450003,-107.898804012 
41.001937991,-107.898712006 41.0034180120001,-107.89853021 
41.010663326,-107.898439312 41.0142859800001,-107.893671822 
41.0142708630003,-107.893621903 41.018012684,-107.888838605 41.0180204540003))
   ,((-107.716352303276 41.0103825046501,-107.716384379463 
41.0178298208872,-107.716384385044 41.0178683989511,-107.716384903785 
41.0214527614842,-107.716384907739 41.0214800849727,-107.716434979282 
41.0214799794529,-107.71642 41.01785,-107.71644 41.01419,-107.71646 
41.01053,-107.72601 41.0105,-107.725951222677 
41.0141697813734,-107.726046524359 41.010343653198,-107.716352303276 
41.0103825046501,-107.716336739313 41.006759586392,-107.721119085977 
41.0067409198176,-107.721103342127 41.0031367730833,-107.721103260149 
41.0031180002171,-107.726227 41.003098,-107.726089 41.002134,-107.735016 
41.002106,-107.744965 41.00209,-107.75386 41.002014,-107.754567531742 
41.0020150243201,-107.754567531701 41.0031125077887,-107.764236 
41.003082,-107.764186305586 41.0067049235521,-107.764235740826 
41.006704971405,-107.773751023988 41.0067129977712,-107.773799937838 
41.0067130392442,-107.783203258294 41.0067209711681,-107.78323724287 
41.006720995628,-107.783271484865 41.0103438986861,-107.783271488706 
 41.0103438986892,-107.783271488852 41.0103438986893,-107.788054130892 
41.0103438986892,-107.788088635321 41.0139668446042,-107.792837035325 
41.0139668446042,-107.792871541576 41.0139668403091,-107.792906342492 
41.0175933869572,-107.792906346898 41.0176543356283,-107.78812317235 
41.0176236678071,-107.788123177873 41.0177000134088,-107.79049 
41.01771,-107.79286 41.01772,-107.792906356429 
41.0177201953907,-107.792906346898 41.0176543356283,-107.802368 
41.017715,-107.80233465258 41.0222567938367,-107.802325423139 
41.0249610203361,-107.802366952339 41.0249606317072,-107.811882536138 
41.0248699982772,-107.821499976649 41.0247783946893,-107.82158005355 
41.0247787066638,-107.831153965817 41.0248160060118,-107.83124001457 
41.0248163412523,-107.840913046047 41.024854862997,-107.841081266021 
41.0248555046931,-107.850648608141 41.0248927920441,-107.850723990499 
41.032028385696,-107.850720631233 41.0322115398222,-107.850419408319 
41.0394545118551,-107.850359645336 41.0394533651647,-107.85035 41.
 03971,-107.85021 41.04343,-107.85007 41.04716,-107.84046 
41.04716,-107.840465610547 41.0470281521572,-107.831057975786 
41.0470503830851,-107.826272489108 41.0470616915401,-107.821487 
41.047073,-107.816633174369 41.0470462167871,-107.811847776745 
41.0470198111573,-107.807062383101 41.0469934055503,-107.802277 
41.046967,-107.802200780865 41.046965525206,-107.8022 41.04708,-107.78329 
41.0467,-107.7833 41.04308,-107.76412 41.043,-107.74499 41.04314,-107.73548 
41.04324,-107.73549 41.03959,-107.726 41.03968,-107.726000233382 
41.0395945822185,-107.725937524623 41.0395951398342,-107.725967762302 
41.035972071063,-107.725997907355 41.0323601007092,-107.71647 
41.03239,-107.71169 41.0324,-107.71169 41.02876,-107.711680000035 
41.0251200186679,-107.711601716503 41.0251201831678,-107.711602238542 
41.032333099295,-107.706818 41.032333,-107.706818 
41.025113877756,-107.70693002276 41.0251134307264,-107.70694 
41.01785,-107.711601190522 41.01785,-107.711569771936 
41.0104199376802,-107.711569692699 41.0
 1040117122,-107.716352303276 41.0103825046501),(-107.73536 
41.01778,-107.735360710429 41.0177149957248,-107.733035499832 
41.0177338437896,-107.733035684596 41.0177532543502,-107.733036118699 
41.0177996939266,-107.73536 41.01778,-107.7354 41.02141,-107.735439124196 
41.0249605208045,-107.735496236022 41.0249600578882,-107.735427741691 
41.0177923919791,-107.735427619439 41.0177795720291,-107.73536 
41.01778),(-107.74483 41.01772,-107.754302936927 41.0176705587843,-107.75441 
41.01767,-107.754421873654 41.0193893050818,-107.754422319046 
41.0194537978134,-107.756711735459 41.0194535424082,-107.756710979 
41.019374837,-107.756693714 41.0175785620001,-107.756693568001 
41.0175634211329,-107.754302056033 41.0175785368961,-107.749519442183 
41.0176087658301,-107.744830447107 41.0176384030389,-107.74483 
41.01772,-107.744736769464 41.0177205913142,-107.744736767554 
41.0177205913264,-107.744804506 41.0249025060002,-107.744821633 
41.0267183825002,-107.74245804714 41.0267380463908,-107.742441762687 41
 .0249220620627,-107.742441598615 41.0249037652974,-107.740279754227 
41.024921287179,-107.74028012832 41.0249601532001,-107.740297194947 
41.0267326835592,-107.740314635771 41.028544077821,-107.740314905465 
41.0285720881768,-107.744838750085 41.0285342590831,-107.744838496824 
41.0285074041022,-107.744948974833 41.0285067098399,-107.74493 
41.02678,-107.74491 41.02497,-107.74483 41.01772),(-107.759137508 
41.022982551,-107.759154985 41.0247939660001,-107.761546757 
41.0247788510001,-107.761623346706 41.0247783669648,-107.761606138562 
41.0230507839221,-107.759203458625 41.0230559047191,-107.75918565325 
41.0212545865627,-107.761588220125 41.0212518921564,-107.763990787 
41.0212491977501,-107.764017368868 41.0238975301362,-107.764009752022 
41.0211402319999,-107.764009752 41.0211402320001,-107.761511673 
41.0211560200001,-107.759120032 41.0211711360001,-107.759137508 
41.022982551),(-107.75919012756 41.0284167910028,-107.759190938978 
41.028500434346,-107.759190939003 41.0285004343459,-107.759190
 939 41.0285004340001,-107.759190128 41.0284167910001,-107.75919012756 
41.0284167910028),(-107.77368 41.0176449819309,-107.777813703862 
41.0176599204581,-107.773604134696 41.0175359320092,-107.773603880341 
41.0175546648492,-107.768820943628 41.017535832438,-107.768821199286 
41.0175171012671,-107.768845088124 41.0157662762011,-107.768761516443 
41.0157662054667,-107.768737179347 41.0176271193885,-107.77368 
41.0176449819309),(-107.777813703877 41.017659920458,-107.783340005109 
41.0176798916034,-107.78334 41.017593,-107.783339971649 
41.0175900003588,-107.777813703877 41.017659920458),(-107.74485559241 
41.0303189018951,-107.744873013886 41.0321659999003,-107.744989394083 
41.0321651645489,-107.74497 41.03041,-107.744968985473 
41.0303181853234,-107.74485559241 41.0303189018951),(-107.81182935217 
41.032098939299,-107.811829586023 41.0322884053301,-107.81183829932 
41.039344712499,-107.821542874068 41.0391529814501,-107.821568441546 
41.03552885118,-107.821593612505 41.0319609265038,-107.82148 
 41.03196,-107.821480143951 41.0319082496773,-107.81182935217 
41.032098939299),(-107.740349931308 41.0322098527985,-107.744863289912 
41.032166094143,-107.740349877078 41.0322042204942,-107.740349931308 
41.0322098527985),(-107.744820022997 41.010389015655,-107.735611540695 
41.0103667791125,-107.735609217379 41.0104592829774,-107.74016 
41.01044,-107.744819546529 41.0104301280795,-107.744820022997 
41.010389015655),(-107.77370156287 41.0103573398623,-107.778452947844 
41.0103399327397,-107.773701853509 41.0103359253012,-107.77370156287 
41.0103573398623)), ... )))"),
   location_st_point) = 1
   ```
   The `MULTIPOLYGON WKT` contains 374 polygons of varying size. Some contain 
holes.
   
   The `test` table config:
   ```
   {
     "tableName": "test",
     "tableType": "OFFLINE",
     "isDimTable": false,
     "segmentsConfig": {
       "replication": 1,
       "timeColumnName": "CreateDateInEpoch",
       "timeType": "MILLISECONDS",
       "retentionTimeUnit": "DAYS",
       "retentionTimeValue": 365
     },
     "tenants": {
       "broker": "DefaultTenant",
       "server": "DefaultTenant"
     },
     "fieldConfigList": [{
       "name": "location_st_point",
       "encodingType":"RAW",
       "indexType":"H3",
       "properties": {
         "resolutions": "5"
       }
     }],
     "tableIndexConfig": {
       "loadMode": "MMAP",
       "noDictionaryColumns": [
         "location_st_point"
       ],
       "startTreeIndexConfigs": [{
         "dimensionsSplitOrder": [
        "ID"
         ],
         "skipStarNodeCreationForDimensions": [
         ],
         "functionColumnPairs": [
           "SUM__Area",
        "MIN__Area",
        "MAX__Area"
         ],
         "maxLeafRecords": 1
       }]
     },
     "ingestionConfig": {
       "batchIngestionConfig": {
         "segmentIngestionType": "APPEND",
         "segmentIngestionFrequency": "DAILY"
       },
       "transformConfigs": [{
         "columnName": "CreateDateInEpoch",
         "transformFunction": "fromEpochHours(CreateDate)"
       }]
     },
     "metadata": {}
   }
   ```
   
   The `test` schema:
   ```
   {
     "schemaName": "test",
     "dimensionFieldSpecs": [
       {
         "name": "ID",
         "dataType": "STRING"
       },
       {
         "name": "latitude",
         "dataType": "DOUBLE"
       },
       {
         "name": "longitude",
         "dataType": "DOUBLE"
       },
       {
         "name": "location_st_point",
         "dataType": "BYTES",
         "transformFunction": "stPoint(longitude,latitude)"
       }
     ],
     "metricFieldSpecs": [
       {
         "name": "Area",
         "dataType": "DOUBLE"
       }
     ],
     "dateTimeFieldSpecs": [{
       "name": "CreateDateInEpoch",
       "dataType": "LONG",
       "format" : "1:MILLISECONDS:EPOCH",
       "granularity": "1:MILLISECONDS"
     }]
   }
   ```
   
   The results were surprising in the following ways:
   1. The 1000 record table took 85 seconds to query where the slightly smaller 
300 record table took sub-second time.
   2. The query returned in 80-85 seconds irrespective of table size.
   
   Are these results consistent with the experience of others here?
   
   Do the developers or users here with large clusters supporting geospatial 
queries expect Pinot's (org.locationtech) `ST_Contains()` function to be able 
to handle a MULTIPOLYGON geometry consisting of 100s of polygons?
   
   Thanks in advance.
   
   Tom


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to