RestfulBlue opened a new issue #6189: Lucene indexing for free form text
URL: https://github.com/apache/incubator-druid/issues/6189

Currently Druid uses classic inverted indexes to index string columns, but these are not really useful for free-form text. It is already possible to disable indexing so that such columns carry no index overhead, but it would be very useful to be able to enable full-text search as well. For example, with a configuration like this:

```json
{
  "type": "string",
  "name": "additional_info",
  "indexType": "unindexed" // without bitmap
},
{
  "type": "string",
  "name": "hostname",
  "indexType": "default" // current inverted index
},
{
  "type": "string",
  "name": "log_record",
  "indexType": "lucene" // lucene indexing
}
```

With this option, Druid could be used to store almost everything related to monitoring and log data, making it possible to get fast results for queries like this:

```sql
select time_floor(__time, 'PT1H'), count(*)
from system_logs
where log_record satisfy '*something*'
  and hostname = 'node1'
group by time_floor(__time, 'PT1H')
order by time_floor(__time, 'PT1H')
```

where `satisfy` applies a Lucene filter such as `log_record:*errormessage*`.

Adding full-text search would make Druid a universal tool for monitoring and logging across different systems. (Currently, filtering by free-form text requires an almost full scan, which does not work well, so such data has to be stored in Solr or Elasticsearch.)
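To illustrate why a value-level inverted index does not help with free-form text, here is a minimal Python sketch. It is not Druid's or Lucene's actual implementation: the sample rows and all function names are hypothetical, and tokenization is simplified to lowercasing and whitespace splitting. It contrasts an index keyed on whole column values, where a substring filter has to test every distinct value, with an index keyed on tokens, where a term lookup goes straight to the matching rows.

```python
from collections import defaultdict

# Hypothetical log_record values; in free-form text nearly every row is distinct.
rows = [
    "connection timeout while calling node2",
    "disk quota exceeded on /var/log",
    "connection reset by peer",
    "user login succeeded",
]


def build_value_index(values):
    """Map each whole column value to the row ids that hold it."""
    index = defaultdict(set)
    for row_id, value in enumerate(values):
        index[value].add(row_id)
    return index


def build_token_index(values):
    """Map each lowercased, whitespace-separated token to its row ids."""
    index = defaultdict(set)
    for row_id, value in enumerate(values):
        for token in value.lower().split():
            index[token].add(row_id)
    return index


def contains_via_value_index(index, needle):
    """Substring filter: every distinct value in the index must be tested."""
    matches = set()
    checked = 0
    for value, row_ids in index.items():
        checked += 1
        if needle in value:
            matches |= row_ids
    return matches, checked


def term_via_token_index(index, term):
    """Term filter: a single dictionary lookup, no scan over values."""
    return set(index.get(term.lower(), set()))


value_index = build_value_index(rows)
token_index = build_token_index(rows)

matches, checked = contains_via_value_index(value_index, "connection")
print(sorted(matches), "values tested:", checked)
print(sorted(term_via_token_index(token_index, "connection")))
```

With high-cardinality text the number of values tested approaches the number of rows, which is the "almost full scan" described above, while the token lookup cost does not depend on how many distinct rows exist.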
