This is what we do if this "list of excluded foods" is immutable (constant) and can fit in memory:
1. Store the list in source control (manually curated) or in a database (automatically copied from some other source of truth).
2. Use the bolt's prepare() method to load it into memory:
   2.1 Load it in a singleton class (we use http://en.wikipedia.org/wiki/Singleton_pattern#Initialization-on-demand_holder_idiom ).
   2.2 We use Guava ImmutableSet. An immutable (constant) set is thread safe.
   2.3 This means one copy per worker, which is not so bad if the data size is small compared to Xmx. If it is too big, copy it to Redis instead (see http://redis.io/commands/sismember ).
3. Access the immutable set from the bolt's execute() method.

________________________________
From: Kushan Maskey <[email protected]>
Sent: Thursday, November 20, 2014 5:15 AM
To: [email protected]
Subject: Thread safe function

I have a scenario: I have a common project where I have a synchronized function to validate whether a text file contains a string. For example, that text file contains a list of foods that need to be excluded. I have data coming through Kafka and Storm which contains a list of foods. The function I created is a synchronized SET with all these foods. When I get some kind of food in my data, I look it up to see if it needs to be excluded from getting inserted into the database.

Everything worked in my local environment, but when I deploy this code in a clustered environment, exclusion is hit or miss. Now the data that gets loaded is not correct because food that is supposed to be excluded still exists. I am 100% sure it is because of a thread safety issue in that function.

How do I achieve this functionality in the clustered environment? Please advise.

Thanks.

--
Kushan Maskey
817.403.7500
M. Miller & Associates<http://mmillerassociates.com/>
[email protected]<mailto:[email protected]>
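
For reference, steps 2 and 3 of the reply at the top of this message could look roughly like the sketch below. It is only an illustration under several assumptions: the class name `ExcludedFoods`, the file name `excluded-foods.txt`, the `ensureLoaded()` helper, and the trim/lower-case normalization are all hypothetical, and `Collections.unmodifiableSet` stands in for Guava's `ImmutableSet` so the example needs nothing beyond the JDK.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical holder for the exclusion list; each worker JVM gets its own copy. */
final class ExcludedFoods {
    // Assumed location of the list; swap in your own file or a database query.
    private static final String LIST_PATH = "excluded-foods.txt";

    private ExcludedFoods() {}

    // Initialization-on-demand holder: the JVM runs this initializer once,
    // thread-safely, the first time Holder.FOODS is accessed.
    private static final class Holder {
        static final Set<String> FOODS = loadFromFile();
    }

    private static Set<String> loadFromFile() {
        try {
            return parse(Files.readAllLines(Paths.get(LIST_PATH), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot load " + LIST_PATH, e);
        }
    }

    /** Builds a read-only set: one food per line, normalized, blank lines skipped. */
    static Set<String> parse(List<String> lines) {
        Set<String> foods = new HashSet<>();
        for (String line : lines) {
            String food = normalize(line);
            if (!food.isEmpty()) {
                foods.add(food);
            }
        }
        // Stand-in for Guava's ImmutableSet.copyOf(foods).
        return Collections.unmodifiableSet(foods);
    }

    // Assumed normalization so "Peanuts " and "peanuts" match.
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    /** Forces the load; intended to be called from the bolt's prepare(). */
    static void ensureLoaded() {
        Holder.FOODS.size();
    }

    /** Lock-free lookup; intended to be called from the bolt's execute(). */
    static boolean isExcluded(String food) {
        return Holder.FOODS.contains(normalize(food));
    }
}
```

In a bolt, `prepare()` would call `ExcludedFoods.ensureLoaded()` and `execute()` would call `ExcludedFoods.isExcluded(food)` before writing to the database. Because the set is never modified after it is built, no `synchronized` is needed on the lookup. If the list is read from a file as assumed here, that file has to be available on every worker node.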
