Hi,

In our Solr collection (Solr 4.8), we have the following unique key
definition.
 <field name="id" type="string" indexed="true" stored="true"
required="true" multiValued="false" />

 <uniqueKey>id</uniqueKey>


In our external Java program, we first generate a UUID with
UUID.randomUUID().toString(). We then apply a cryptographic hash to it to
produce a 32-byte string, and finally use that string as the id.
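
For reference, the id generation looks roughly like the sketch below. This is a simplified illustration and not our exact code: the class and method names are made up for this mail, and MD5 is only an assumption, picked because its 16-byte digest prints as 32 hex characters.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical helper: hashes a random UUID into a 32-character id.
final class HashedIdSketch {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String newId() {
        // Step 1: random UUID in its 36-character text form.
        String uuid = UUID.randomUUID().toString();
        try {
            // Step 2: hash it. MD5 is an assumption made for this sketch.
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(uuid.getBytes(StandardCharsets.UTF_8));
            // Step 3: hex-encode the 16 digest bytes into 32 upper-case characters.
            char[] out = new char[digest.length * 2];
            for (int i = 0; i < digest.length; i++) {
                int v = digest[i] & 0xFF;
                out[2 * i] = HEX[v >>> 4];
                out[2 * i + 1] = HEX[v & 0x0F];
            }
            return new String(out);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(newId());
    }
}
```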

We might now need to post more than 20k Solr docs per second. At that rate,
UUID.randomUUID() or the cryptographic hash might take too much time. As a
simple workaround, we could share one cryptographic hash across many Solr
docs by appending a sequence number to it, such as
9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY000000,
9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY000001,
9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY000002, etc.
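
A minimal sketch of that workaround, assuming a six-digit zero-padded decimal sequence as in the examples above. The class name, method name, and the one-million limit per prefix are hypothetical and only follow from the six-digit width.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: derives many ids from one shared hash prefix.
final class SequencedIdSketch {
    // Six decimal digits give at most 1,000,000 distinct ids per prefix.
    private static final int MAX_PER_PREFIX = 1_000_000;

    static List<String> idsForBatch(String prefix, int count) {
        if (count < 0 || count > MAX_PER_PREFIX) {
            throw new IllegalArgumentException("count must be between 0 and " + MAX_PER_PREFIX);
        }
        List<String> ids = new ArrayList<>(count);
        for (int seq = 0; seq < count; seq++) {
            // Append the zero-padded sequence, e.g. prefix + "000042".
            ids.add(String.format(Locale.ROOT, "%s%06d", prefix, seq));
        }
        return ids;
    }

    public static void main(String[] args) {
        // Prints the three example ids from this mail.
        for (String id : idsForBatch("9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY", 3)) {
            System.out.println(id);
        }
    }
}
```

With a 32-byte prefix this yields the 38-byte ids shown above, and the hash only has to be computed once per batch.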


What we want to know is: if we use a 38-byte id, is there any
performance impact on Solr inserts or queries? Or would Solr's
default automatically generated id implementation be more
efficient?



Thanks,
Eternal
