Ian Maxon has posted comments on this change. Change subject: Make LSM bulkload append-only and write-once. ......................................................................
Patch Set 17: (32 comments) https://asterix-gerrit.ics.uci.edu/#/c/255/17/hyracks/hyracks-storage-am-bloomfilter/src/main/java/edu/uci/ics/hyracks/storage/am/bloomfilter/impls/BloomFilter.java File hyracks/hyracks-storage-am-bloomfilter/src/main/java/edu/uci/ics/hyracks/storage/am/bloomfilter/impls/BloomFilter.java: Line 234: page.releaseWriteLatch(false); > Latch doesn't need to be acquired since bloomfilter is created by a single Done Line 279: page.acquireWriteLatch(); > All these latch/unlatch things are unnecessary since confiscated pages seem Done https://asterix-gerrit.ics.uci.edu/#/c/255/17/hyracks/hyracks-storage-am-btree/src/main/java/edu/uci/ics/hyracks/storage/am/btree/impls/BTree.java File hyracks/hyracks-storage-am-btree/src/main/java/edu/uci/ics/hyracks/storage/am/btree/impls/BTree.java: Line 993: leafFrontier.page.releaseWriteLatch(false); > latch can be removed if appendOnly is true Done Line 1064: frontier.page.releaseWriteLatch(false); > useless latch Done Line 1071: frontier.page = bufferCache.confiscatePage(-1); > static variable instead of -1 Done https://asterix-gerrit.ics.uci.edu/#/c/255/17/hyracks/hyracks-storage-am-common/src/main/java/edu/uci/ics/hyracks/storage/am/common/api/IFreePageManagerFactory.java File hyracks/hyracks-storage-am-common/src/main/java/edu/uci/ics/hyracks/storage/am/common/api/IFreePageManagerFactory.java: Line 19: public interface IFreePageManagerFactory { > Should it be named as MetadataManagerFactory? and the method name should be Done https://asterix-gerrit.ics.uci.edu/#/c/255/17/hyracks/hyracks-storage-am-common/src/main/java/edu/uci/ics/hyracks/storage/am/common/api/IMetaDataManager.java File hyracks/hyracks-storage-am-common/src/main/java/edu/uci/ics/hyracks/storage/am/common/api/IMetaDataManager.java: Line 20: public interface IMetaDataManager { > It will be better to add general description of this interface such as the Done https://asterix-gerrit.ics.uci.edu/#/c/255/17/hyracks/hyracks-storage-am-common/src/main/java/edu/uci/ics/hyracks/storage/am/common/freepage/LinkedMetaDataManager.java File hyracks/hyracks-storage-am-common/src/main/java/edu/uci/ics/hyracks/storage/am/common/freepage/LinkedMetaDataManager.java: Line 269: metaNode.acquireWriteLatch(); > useless latch Done https://asterix-gerrit.ics.uci.edu/#/c/255/17/hyracks/hyracks-storage-am-common/src/main/java/edu/uci/ics/hyracks/storage/am/common/impls/AbstractTreeIndex.java File hyracks/hyracks-storage-am-common/src/main/java/edu/uci/ics/hyracks/storage/am/common/impls/AbstractTreeIndex.java: Line 104: int numPages = bufferCache.getNumPagesOfFile(fileId); > It's good to have description of what the file looks like when this create( Done Line 185: wasActivated = true; > Why there are two flags? Good to have the description of why two flags are wasActivated is to account for the case where someone may call deactivate() before activate(), as part of exception handling. There's no way to tell if you ever did activate an instance of an index object by just having the object itself. Line 215: bufferCache.deleteFile(fileId, true); > why should the pages in deleted file flushed? Shouldn't be :) Done. Line 282: return freePageManager; > Maybe better to change the variable name to meta(data)PageManager? Done Line 365: frontierPage.releaseWriteLatch(false); > if (!appendOnly) then you should unpin(), but make sure that the unpin() is Done Line 366: bufferCache.returnPage(frontierPage); > what if returnpage throws exception? It doesn't :) Line 367: continue; > continue is unnecessary. Done Line 402: nodeFrontiers.get(i).page.releaseWriteLatch(false); > flag should be true Done Line 413: nodeFrontiers.get(i).page.releaseWriteLatch(false); > again, why latch should be acquired? We actually don't split the codepath at the point where the write latch might be acquired in this case. We can either acquire the latch (and certainly get it) or have a if/else, not sure which is more expensive. Line 424: frontier.page = bufferCache.confiscatePage(-1); > why -1? Done https://asterix-gerrit.ics.uci.edu/#/c/255/17/hyracks/hyracks-storage-am-lsm-btree/src/main/java/edu/uci/ics/hyracks/storage/am/lsm/btree/impls/ExternalBTreeWithBuddy.java File hyracks/hyracks-storage-am-lsm-btree/src/main/java/edu/uci/ics/hyracks/storage/am/lsm/btree/impls/ExternalBTreeWithBuddy.java: Line 473: break; > no diff for REPLICATE case Done https://asterix-gerrit.ics.uci.edu/#/c/255/17/hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/edu/uci/ics/hyracks/storage/am/lsm/invertedindex/ondisk/OnDiskInvertedIndex.java File hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/edu/uci/ics/hyracks/storage/am/lsm/invertedindex/ondisk/OnDiskInvertedIndex.java: Line 344: currentPage.releaseWriteLatch(false); > useless latch Done https://asterix-gerrit.ics.uci.edu/#/c/255/17/hyracks/hyracks-storage-common/src/main/java/edu/uci/ics/hyracks/storage/common/buffercache/AsyncFIFOPageQueueManager.java File hyracks/hyracks-storage-common/src/main/java/edu/uci/ics/hyracks/storage/common/buffercache/AsyncFIFOPageQueueManager.java: Line 100: protected LinkedBlockingQueue<QueueEntry> queue = new LinkedBlockingQueue<QueueEntry>(); > LinkedBlockingQueue is not thread-safe. I don't see queue is accessed in a It is indeed threadsafe, as we discussed. Line 101: volatile Thread writerThread; > why is writerThread declared as volatile? For the double checked lock in createQueue. See http://jeremymanson.blogspot.com/2008/05/double-checked-locking.html and http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html for the gory details. Line 116: return new PageQueue(bufferCache, writer); > Why should PageQueue be created more than once? Since there is only one wri This was kind of a hold over from when we made a pagequeue for each file. I'll factor it out. Line 124: queue.wait(100l); > Will it be better to use wait() and notify() pair instead of checking queue As we discussed, this method actually is dead code. The shutdown case for this whole class needs more careful consideration; the behavior is far too implicit. Line 154: lowWater.wait(100l); > Will it be better to use wait() and notify() pair instead of checking queue This was a hack :) wait() should work. Line 167: while (!haltWriter) { > What's gonna happen in the following case? when there are pages in the queu I think the question is whether or not we want the queue to be fully flushed, or if stop means stop now not later. Line 178: page.acquireReadLatch(); > why is readLatch acquired, not writeLatch? Also, no other threads will try Done https://asterix-gerrit.ics.uci.edu/#/c/255/17/hyracks/hyracks-storage-common/src/main/java/edu/uci/ics/hyracks/storage/common/buffercache/BufferCache.java File hyracks/hyracks-storage-common/src/main/java/edu/uci/ics/hyracks/storage/common/buffercache/BufferCache.java: Line 612: if (curPage >= pageReplacementStrategy.getNumPages()) { > pageReplacementStrategy.getNumPages() in master branch was modified by Ying Will do when I rebase this on top of that. Line 940: if (victim != null) { > Is it possible to factor out this if clause if this logic is similar to the Just relaying the discussion we had. In short it may be possible, but it's kind of tricky, because in findPage() case 2a/2b are different, but for confiscate, case 2 is the same. Unfortunately the linked list modification is intertwined with the search. Line 1016: ((CachedPage) returnPage).virtual.set(true); > what's the purpose of this set(true) call? The virtual flag is just to be sure that the cleaner thread and clock eviction don't ever consider the page, because while it doesn't exist in the page hash table, it still exists in the list of allocated cache pages. Line 1028: } > The following code also looks identical to what findPage does. This is also quite similar but I think should be easier to factor out between findPage and confiscatePage than the confiscation routine itself. Line 1056: public void returnPage(ICachedPage page, boolean reinsert) { > We need to carefully think about the default value of reinsert flag for mer Agreed. However as we discussed, we either have to relay the LSM operation to the FIFO writer, or expose the caching behavior up through queue.put(), to allow for both cases where the pages are reinserted and where they are just thrown back in but not reinserted to the hash table. -- To view, visit https://asterix-gerrit.ics.uci.edu/255 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I80fb891b5310252143854a336b591bf3f8cd4ba7 Gerrit-PatchSet: 17 Gerrit-Project: hyracks Gerrit-Branch: master Gerrit-Owner: Ian Maxon <[email protected]> Gerrit-Reviewer: Ian Maxon <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Murtadha Hubail <[email protected]> Gerrit-Reviewer: Young-Seok Kim <[email protected]> Gerrit-HasComments: Yes
