Hi Tsuna, Thanks for the feedback. Please find some clarifications inline below.
On Tue, Oct 19, 2010 at 10:29 PM, tsuna <[email protected]> wrote: > Having implemented something ~similar in a library (auto-assigning > auto-increment unique IDs in a distributed application for HBase), I > took a look out of curiosity. > Thank you so very much! > Given the synchronization in your class, I gather that this singleton > is going to be used from multiple threads and thus needs to be > thread-safe. > > The first line of get() accesses tableCurrentMax without synchronization. > If you need a concurrent map, use > `java.util.concurrent.ConcurrentHashMap'. If you're not familiar with > it, I recommend you read > http://developers.sun.com/learning/javaoneonline/sessions/2009/pdf/TS-5217.pdf > (slides 29-32 especially) Thanks for the PDF, incorporating the fix. > getPool and getHbaseConfiguration aren't thread safe since they don't > do any synchronization. This can lead to your code crashing in > unexpected ways because you can observe a partially constructed object > and other weirdnesses like that. Fixing... :) I forgot to put it the first time, its becoming a common mistake for me :( to miss synchronization in getters. > If you want only a single row out of the scanner, you should call the > confusingly named setCaching method on the scan object like so: > scan.setCaching(1) – otherwise, depending on your Configuration loaded > by getHbaseConfiguration(), your scanner may end up fetching any > number of rows (possibly *many* rows) and returning just the first one > to you. Using it, thanks again for the suggestion, > I don't see why you use MutableLong instead of > `java.util.concurrent.atomic.AtomicLong' as I believe the latter would > allow you to remove a lot of the synchronization. > You acquire a RowLock but I don't see where you release it. > AtomicLong should have been the way to go, I have to study the new Concurrency API, I only used it for Mutex, Lock, ExecutorService and Semaphor, need learn the other stuffs of API as well. > I don't think you can run multiple instances of this web service at > the same time. If you think you do, I believe there's a race > condition. Well, I plan to use a single instance of the service. As its a lightweight service, it should be able to handle a lot of requests, so I do not plan to run multiple instances of it. > Say you have 2 instances, service A and service B. > A starts, receives a request, scans the table to discover that the > last ID assigned was 43. > In parallel, B starts, receives a request, scans the table and > discovers 43 as well. > Now A decides to assign the next ID, 42, and calls `exists' to verify > that there's no row for 42. `exists' returns false so A proceeds to > lock this row. > In parallel, B does the same thing and attempts to lock the row too. > So now both A and B are racing to lock the same row. Let's say that > the RegionServer gives the RowLock to A, so B gets stuck waiting until > the lock is released. > I don't see in your code where you would create the row for 42, but > let's assume A creates it and releases the RowLock. > B now gets unblocked, acquires the RowLock, and returns the ID 42 too. Bang. > > Am I missing something? > If I would have been running multiple instances this would have definitely popped up thus I decided earlier that I do not want use multiple instances. Thanks once again Tsuna. Regards, Imran > -- > Benoit "tsuna" Sigoure > Software Engineer @ www.StumbleUpon.com > -- Imran M Yousuf Twitter: @imyousuf - http://twitter.com/imyousuf Blog: http://imyousuf-tech.blogs.smartitengineering.com/ Mobile: +880-1711402557
