This is what we do if this "list of excluded foods" is immutable (constant) and 
can fit in memory:


1. Store the list in source control (manually curated) or in a database 
(automatically copied from some other source of truth)


2. Use the bolt's prepare() method to load it into memory:

2.1 In a singleton class (we use the initialization-on-demand holder idiom, 
see http://en.wikipedia.org/wiki/Singleton_pattern#Initialization-on-demand_holder_idiom )

2.2 We use Guava's ImmutableSet. An immutable (constant) set is thread safe, 
so it can be read from many threads without synchronization.

2.3 This means one copy per worker, which is not so bad if the data size is 
small compared to Xmx. If it is too big, then copy it to Redis (see 
http://redis.io/commands/sismember )


3. Access the immutable set from the bolt execute() method.
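
A minimal sketch of steps 2 and 3, not our actual code. The class name 
ExcludedFoods, the isExcluded() helper and the hard-coded entries are made up 
for illustration. To keep the example self-contained it uses the JDK's 
Collections.unmodifiableSet in place of Guava's ImmutableSet, and it leaves 
out the Storm bolt itself. The bolt's prepare() could call isExcluded() once 
to force the load, and execute() would call it for each tuple.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder class for the exclusion list (name is an assumption).
final class ExcludedFoods {

    private ExcludedFoods() {
        // No instances; all access goes through the static method below.
    }

    // Initialization-on-demand holder idiom: the JVM initializes Holder
    // lazily, exactly once, and in a thread-safe way on first access.
    private static final class Holder {
        static final Set<String> FOODS = load();
    }

    // Placeholder loader with made-up entries. A real loader would read the
    // curated file from source control or the database copy instead.
    private static Set<String> load() {
        Set<String> foods = new HashSet<>(Arrays.asList("peanuts", "shellfish", "gluten"));
        // JDK stand-in for Guava's ImmutableSet: no other reference to the
        // backing HashSet escapes, so the set never changes after this point.
        return Collections.unmodifiableSet(foods);
    }

    // Read-only lookup; needs no synchronization because the set is constant.
    static boolean isExcluded(String food) {
        return Holder.FOODS.contains(food);
    }
}
```

Because the set is built once and never modified, the lookup does not need 
the synchronized keyword, and each worker JVM ends up with its own copy.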


________________________________
From: Kushan Maskey <[email protected]>
Sent: Thursday, November 20, 2014 5:15 AM
To: [email protected]
Subject: Thread safe function


I have a scenario,

I have a common project where I have a synchronized function to validate 
whether a text file contains a string. For example, the text file contains a 
list of foods that need to be excluded. I have data coming through Kafka and 
Storm which contains a list of foods.

The function I created uses a synchronized Set containing all these foods. 
When I get some kind of food in my data, I look it up to see if it needs to be 
excluded from being inserted into the database. Everything worked in my local 
environment, but when I deploy this code in a clustered environment, exclusion 
is hit or miss. The data that gets loaded is not correct because food that is 
supposed to be excluded still exists. I am 100% sure it is because of a 
thread-safety issue in that function. How do I achieve this functionality in 
a clustered environment? Please advise. Thanks.

--
Kushan Maskey
817.403.7500
M. Miller & Associates<http://mmillerassociates.com/>
[email protected]<mailto:[email protected]>
