Re: Are there any performance impact of using a non-standard length UUID as the unique key of Solr?

Mark Miller Thu, 24 Jul 2014 18:56:20 -0700

Some good info on unique IDs for Lucene/Solr can be found here:
http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html
-- 
Mark Miller
about.me/markrmiller


On July 24, 2014 at 9:51:28 PM, He haobo (haob...@gmail.com) wrote:

Hi,  

In our Solr collection (Solr 4.8), we have the following unique key  
definition.  
<field name="id" type="string" indexed="true" stored="true"  
required="true" multiValued="false" />  

<uniqueKey>id</uniqueKey>  


In our external Java program, we first generate a UUID with
UUID.randomUUID().toString(). We then apply a cryptographic hash to it to
produce a 32-byte string, which we use as the id.

We may need to post more than 20k Solr docs per second, at which point
UUID.randomUUID() or the cryptographic hash might take too much time. A
simple workaround would be to share one cryptographic hash across many
Solr docs by appending a sequence number to it, such as
9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY000000,
9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY000001,
9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY000002, etc.
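
A minimal sketch of the scheme described above, for illustration only. It
assumes MD5 as the hash (the algorithm is not named in this message; MD5 is
picked here only because its hex form is 32 characters), a 6-digit
zero-padded sequence, and a hypothetical class name BatchIdGenerator. It is
not thread-safe.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;
import java.util.UUID;

// Hypothetical helper: one hashed UUID is shared by up to 1,000,000 ids.
final class BatchIdGenerator {
    private final String prefix; // 32 upper-case hex characters
    private int sequence = 0;    // next value for the 6-digit suffix

    BatchIdGenerator() {
        // The expensive part (random UUID + hash) runs once per generator.
        this.prefix = hashPrefix(UUID.randomUUID().toString());
    }

    // Hashes the seed and returns the digest as 32 hex characters.
    // MD5 is an assumption made for this sketch, not the poster's choice.
    static String hashPrefix(String seed) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(seed.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(32);
            for (byte b : digest) {
                sb.append(String.format(Locale.ROOT, "%02X", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the shared prefix plus a zero-padded sequence (38 characters).
    String nextId() {
        if (sequence > 999_999) {
            throw new IllegalStateException("sequence exhausted; create a new generator");
        }
        String id = prefix + String.format(Locale.ROOT, "%06d", sequence);
        sequence++;
        return id;
    }
}
```

Each generator instance pays for one UUID and one hash, and every later
nextId() call is only a string concatenation. Ids from one instance share
the same 32-character prefix.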


What we want to know is: if we use a 38-byte id, is there any
performance impact on Solr inserts or queries? Or would Solr's default
automatically generated id implementation be more efficient?



Thanks,  
Eternal
