[
https://issues.apache.org/jira/browse/METRON-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16593086#comment-16593086
]
Ali Nazemian commented on METRON-1677:
--------------------------------------
Since we use ES/Solr for a time-series use case, incorporating a timestamp
into ID generation might be a good idea. We are implementing a Stellar
function that produces a more Lucene-friendly ID for this case. I will
share the results once it has been tested.
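
For illustration only, the timestamp-in-the-ID idea could be sketched in
plain Java roughly as follows. This is not the Stellar function mentioned
above and not Metron code; the class name TimePrefixedId, the method
generate, and the layout (48 bits of epoch milliseconds followed by 64
random bits, zero-padded hex) are all assumptions made for the example.
The intent is that IDs created close together in time share a leading
prefix and sort in roughly chronological order.

```java
import java.security.SecureRandom;

// Hypothetical helper for illustration; not part of Metron or Stellar.
final class TimePrefixedId {
    private static final SecureRandom RANDOM = new SecureRandom();

    private TimePrefixedId() {}

    /** Builds an ID for the current wall-clock time. */
    static String generate() {
        return generate(System.currentTimeMillis());
    }

    /**
     * Builds a 28-character lowercase hex ID: 12 hex digits of epoch
     * milliseconds followed by 16 hex digits of random data.
     * Zero padding keeps lexicographic order aligned with time order.
     */
    static String generate(long epochMillis) {
        long timePart = epochMillis & 0xFFFFFFFFFFFFL; // keep the low 48 bits
        long randomPart = RANDOM.nextLong();           // uniqueness within one millisecond
        return String.format("%012x%016x", timePart, randomPart);
    }
}
```

Calling TimePrefixedId.generate() yields a fixed-width string whose first
12 characters change slowly over time, unlike the output of
UUID.randomUUID(), where every character is effectively random.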
> UUIDv4 GUID is not Lucene friendly
> ----------------------------------
>
> Key: METRON-1677
> URL: https://issues.apache.org/jira/browse/METRON-1677
> Project: Metron
> Issue Type: Bug
> Reporter: Ali Nazemian
> Priority: Major
>
> Generating UUIDv4 IDs with UUID.randomUUID() in Java is not Lucene
> friendly: it degrades Elasticsearch and Solr indexing/search performance
> and can make that performance unpredictable.
> http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html
> Moreover, specifying the doc ID on the client side reduces indexing
> throughput, because it enables the Elasticsearch deduplication policy and
> turns each insert into an upsert. Hence, indexing throughput can be
> increased by providing the ability to disable client-side ID generation.
> Currently, ID generation can be overridden at the config level by
> replacing the Metron default guid via Stellar, but it cannot be disabled
> completely to let Elasticsearch choose the ID for each document.