glasser commented on issue #5789: Add stringLast and stringFirst aggregators extension
URL: https://github.com/apache/incubator-druid/pull/5789#issuecomment-472128679
 
 
   Hmm, I'm not sure if that's exactly it. I've been trying the standard quickstart Kafka ingestion example with this supervisor:
   
   ```json
   {
     "type": "kafka",
     "dataSchema": {
       "dataSource": "wikipedia",
       "parser": {
         "type": "string",
         "parseSpec": {
           "format": "json",
           "timestampSpec": {
             "column": "time",
             "format": "auto"
           },
           "dimensionsSpec": {
             "dimensions": [
               "cityName",
               "comment",
               "countryIsoCode",
               "countryName",
               "isAnonymous",
               "isMinor",
               "isNew",
               "isRobot",
               "isUnpatrolled",
               "metroCode",
               "namespace",
               "page",
               "regionIsoCode",
               "regionName",
               "user",
               { "name": "added", "type": "long" },
               { "name": "deleted", "type": "long" },
               { "name": "delta", "type": "long" }
             ]
           }
         }
       },
       "metricsSpec": [{
         "name": "channel",
         "fieldName": "channel",
         "type": "stringFirst",
         "maxStringBytes": 100
       }],
       "granularitySpec": {
         "type": "uniform",
         "segmentGranularity": "DAY",
         "queryGranularity": "NONE",
         "rollup": false
       }
     },
     "tuningConfig": {
       "type": "kafka",
       "reportParseExceptions": false,
       "maxRowsInMemory": 3000
     },
     "ioConfig": {
       "topic": "wikipedia",
       "replicas": 2,
       "taskDuration": "PT2M",
       "completionTimeout": "PT20M",
       "consumerProperties": {
         "bootstrap.servers": "localhost:9092"
       }
     }
   }
   ```
   
   Note the maxRowsInMemory: 3000, which is less than the number of rows in wikiticker-2015-09-12-sampled.json. (I tried setting it to just 1, but that leads to OOMs.) This job runs successfully.
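   As a sanity check on the ingested result, a native timeseries query could read back the stringFirst metric. This is a hypothetical sketch, assuming the datasource above has been ingested and that `firstChannel` and the interval are placeholder choices of mine:

   ```json
   {
     "queryType": "timeseries",
     "dataSource": "wikipedia",
     "granularity": "all",
     "intervals": ["2015-09-12/2015-09-13"],
     "aggregations": [
       {
         "type": "stringFirst",
         "name": "firstChannel",
         "fieldName": "channel",
         "maxStringBytes": 100
       }
     ]
   }
   ```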
   
   I should probably try with just an index task instead of Kafka to simplify things, though.
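   A minimal native index task along those lines might look like the following sketch. It assumes the quickstart sample file lives at quickstart/wikiticker-2015-09-12-sampled.json; the dataSchema block is elided here and would be copied verbatim from the supervisor spec above:

   ```json
   {
     "type": "index",
     "spec": {
       "dataSchema": { "dataSource": "wikipedia" },
       "ioConfig": {
         "type": "index",
         "firehose": {
           "type": "local",
           "baseDir": "quickstart",
           "filter": "wikiticker-2015-09-12-sampled.json"
         }
       },
       "tuningConfig": {
         "type": "index",
         "maxRowsInMemory": 3000
       }
     }
   }
   ```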
