> On April 29, 2014, 7:32 p.m., Martin Kleppmann wrote:
> > samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStorageEngineFactory.scala, line 47
> > <https://reviews.apache.org/r/20811/diff/1/?file=569849#file569849line47>
> >
> >     Gut feeling (not backed by any data) is that a threshold of 1000 might
> >     be quite low for a default. If there is a lot of data in the store, the
> >     compaction itself may start taking a long time.
> >
> >     Rather than an absolute number, how about setting the threshold in
> >     terms of the proportion of keys in the keys? e.g. perform a compaction if
> >     more than (say) 20% of the keys in the store have been deleted? That way,
> >     if the store is big (=compaction is expensive), the threshold is
> >     automatically higher.
> >
> >     (For purposes of that calculation I think it would be fine to simply
> >     count number of put requests and number of delete requests -- no need to
> >     track unique keys.)
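
A minimal sketch of the proportional threshold described in the quoted comment, assuming that plain counts of put and delete requests are a good enough stand-in for store size. The class name, its parameters, and the way the put counter is adjusted after a compaction are all hypothetical and are not taken from the Samza code under review.

```scala
// Hypothetical helper, not part of Samza: decides when a compaction is
// worthwhile based on the ratio of delete requests to put requests.
class DeleteRatioCompactionTrigger(ratioThreshold: Double = 0.2, minDeletes: Long = 1000L) {
  require(ratioThreshold > 0.0 && ratioThreshold <= 1.0, "ratioThreshold must be in (0, 1]")

  // Raw request counts; unique keys are deliberately not tracked.
  private var puts = 0L
  private var deletes = 0L

  def recordPut(): Unit = { puts += 1 }

  def recordDelete(): Unit = { deletes += 1 }

  // Compact only when deletes exceed the given fraction of puts.
  // minDeletes is an assumed floor so tiny stores do not compact constantly.
  def shouldCompact: Boolean =
    deletes >= minDeletes && deletes.toDouble > ratioThreshold * puts

  // Call after a compaction. Assumption: the surviving store size is roughly
  // puts minus deletes, so that becomes the new baseline.
  def reset(): Unit = {
    puts = math.max(puts - deletes, 0L)
    deletes = 0L
  }
}
```

With the defaults above, a store that has seen 1,000,000 puts would compact only after more than 200,000 deletes, while a store with 10,000 puts would compact after more than 2,000. Overwrites of an existing key inflate the put count, so the ratio is an approximation.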

s/keys in the keys/keys in the store/


- Martin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20811/#review41764
-----------------------------------------------------------


On April 28, 2014, 10:52 p.m., Chris Riccomini wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20811/
> -----------------------------------------------------------
>
> (Updated April 28, 2014, 10:52 p.m.)
>
>
> Review request for samza.
>
>
> Repository: samza
>
>
> Description
> -------
>
> add javadocs, and reset deletion counter in compact
>
>
> make delete threshold configurable. add a performance test (takes 25s to run).
>
>
> make compaction lazy on read-side so we can take advantage of cached writes
>
>
> trigger compactions periodically to remove deleted keys from levels
>
>
> Diffs
> -----
>
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStorageEngineFactory.scala 81fe86165019f72a15be1ac9cfcfff0598b4b92b
>   samza-kv/src/main/scala/org/apache/samza/storage/kv/LevelDbKeyValueStore.scala 8602a328673e6fa7d435366abcd9a96a99d9cd88
>   samza-kv/src/test/scala/org/apache/samza/storage/kv/TestKeyValueStores.scala 85ba11a3362ad7cf4f84fbcbd944cd790e572cbe
>
> Diff: https://reviews.apache.org/r/20811/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Chris Riccomini
>
>
