I really think that putting update semantics into Katta would be much easier.
Building the write-ahead log for the Lucene case isn't all that hard. If you
follow the ZooKeeper model of having a WAL thread that writes batches of log
entries you can get pretty high speed as well. The basic idea is that update
requests are put into a queue of pending log writes, but are written to the
index immediately. When the WAL thread finishes the previous tranche of log
items, it comes back around and takes everything that is pending. When it
finishes a tranche of writes, it releases all of the pending updates in a
batch. If updates are not frequent, then you lose no latency. If your updates
are very high speed, then you transition seamlessly to a bandwidth-oriented
scheme of large writes while latency stays roughly bounded to 2-3x the
original case. If you put the write-ahead log on a reliable replicated file
system then, as you say, much of the complexity of write-ahead logging goes
away. But this verges off topic for HBase.

On Sat, Feb 12, 2011 at 1:01 PM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:

> So in giving this a day of breathing room, it looks like HBase loads
> values as it's scanning a column? I think that'd be a killer to some
> Lucene queries, eg, we'd be loading entire/part-of posting lists just
> for a linear scan of the terms dict? Or we'd probably instead want to
> place the posting list into its own column?
>
> Another approach would be to feed off the HLog, place updates into a
> dedicated RT Lucene index (eg, outside of HBase). With the latter
> system we'd get transactional consistency, and we wouldn't need to
> work so hard to force Lucene's index into HBase columns etc (which is
> extremely high risk). On being built, the indexes could be offloaded
> automatically into HDFS. This architecture would be more of a
> 'parallel to HBase' Lucene index.
> We'd still gain the removal of doc-stores, we wouldn't need to worry
> about tacking on new HBase-specific merge policies, and we'd gain
> [probably most importantly] a consistent transactional view of the
> data, while also being able to query that data using con/disjunction
> and phrase queries, amongst others. A delete or update in HBase'd
> cascade into a Lucene delete, and this'd be performed atomically, and
> vice versa.
>
> On Fri, Feb 11, 2011 at 7:00 PM, Ted Dunning <tdunn...@maprtech.com>
> wrote:
> > No. And I doubt there ever will be.
> >
> > That was one reason to split the larger posting vectors. That way
> > you can multi-thread the fetching and the scoring.
> >
> > On Fri, Feb 11, 2011 at 6:56 PM, Jason Rutherglen <
> > jason.rutherg...@gmail.com> wrote:
> >
> >> Thanks! In browsing the HBase code, I think it'd be optimal to stream
> >> the posting/binary data directly from the underlying storage (instead
> >> of loading the entire byte[]); it doesn't look like there's a way to
> >> do this (yet)?
> >>
> >> On Fri, Feb 11, 2011 at 6:20 PM, Ted Dunning <tdunn...@maprtech.com>
> >> wrote:
> >> > Go for it!
> >> >
> >> > On Fri, Feb 11, 2011 at 4:44 PM, Jason Rutherglen <
> >> > jason.rutherg...@gmail.com> wrote:
> >> >
> >> >> > Michi's stuff uses flexible indexing with a zero-lock
> >> >> > architecture. The speed *is* much higher.
> >> >>
> >> >> The speed's higher, and there isn't much Lucene left there either,
> >> >> as I believe it was built specifically for the 140-character use
> >> >> case (eg, not the general use case). I don't think most indexes can
> >> >> be compressed to only exist in RAM on a single server? The Twitter
> >> >> use case isn't one that the HBase RT search solution is useful for?
> >> >>
> >> >> > If you were to store entire posting vectors as values with terms
> >> >> > as keys, you might be OK. Very long posting vectors or add-ons
> >> >> > could be added using a key+serial number trick.
> >> >>
> >> >> This sounds like the right approach to try. Also, the Lucene terms
> >> >> dict is sorted anyway, so moving the terms into HBase's sorted keys
> >> >> probably makes sense.
> >> >>
> >> >> > For updates, speed would only be acceptable if you batch up a
> >> >> > lot of updates or possibly if you build in a value append
> >> >> > function as a co-processor.
> >> >>
> >> >> Hmm... I think the main issue would be the way Lucene implements
> >> >> deletes (eg, today as a BitVector). I think we'd keep that
> >> >> functionality. The new docs/updates would be added to the
> >> >> in-RAM buffer. I think there'd be a RAM-size-based flush as there
> >> >> is today. Where that'd be flushed to is an open question.
> >> >>
> >> >> I think the key advantage of the RT + HBase architecture is that
> >> >> the index would live alongside HBase columns, and so all other
> >> >> scaling problems (especially those related to scaling RT, such as
> >> >> synchronization of distributed data and updates) go away.
> >> >>
> >> >> A distributed query would remain the same, eg, it'd hit N servers?
> >> >>
> >> >> In addition, Lucene offers a wide variety of new query types which
> >> >> HBase'd get in realtime for free.
> >> >>
> >> >> On Fri, Feb 11, 2011 at 4:13 PM, Ted Dunning <tdunn...@maprtech.com>
> >> >> wrote:
> >> >> > On Fri, Feb 11, 2011 at 3:50 PM, Jason Rutherglen <
> >> >> > jason.rutherg...@gmail.com> wrote:
> >> >> >
> >> >> >> > I can't imagine that the speed achieved by using HBase would
> >> >> >> > be even within orders of magnitude of what you can do in
> >> >> >> > Lucene 4 (or even 3).
> >> >> >>
> >> >> >> The indexing speed in Lucene hasn't changed in quite a while;
> >> >> >> are you saying HBase would somehow be overloaded? That doesn't
> >> >> >> seem to jibe with the sequential writes HBase performs?
> >> >> >>
> >> >> >
> >> >> > Michi's stuff uses flexible indexing with a zero-lock
> >> >> > architecture. The speed *is* much higher.
> >> >> >
> >> >> > The real problem is that HBase repeats keys.
> >> >> >
> >> >> > If you were to store entire posting vectors as values with terms
> >> >> > as keys, you might be OK. Very long posting vectors or add-ons
> >> >> > could be added using a key+serial number trick.
> >> >> >
> >> >> > Short queries would involve reading and merging several posting
> >> >> > vectors. In that mode, query speeds might be OK, but there isn't
> >> >> > a lot of Lucene left at that point. For updates, speed would only
> >> >> > be acceptable if you batch up a lot of updates or possibly if you
> >> >> > build in a value append function as a co-processor.
> >> >> >
> >> >> >
> >> >> >> The speed of indexing is a function of creating segments; with
> >> >> >> flexible indexing, the underlying segment files (and postings)
> >> >> >> may be significantly altered from the default file structures,
> >> >> >> eg, placed into HBase in various ways. The posting lists could
> >> >> >> even be split along with HBase regions?
> >> >> >>
> >> >> >
> >> >> > Possibly. But if you use term + counter and post vectors of
> >> >> > limited length you might be OK.
> >> >> >
> >> >>
> >> >
> >> >
> >
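P.S. The group-commit WAL scheme described at the top of this message can be
sketched roughly as below. This is illustrative only, not code from Katta,
Lucene, or ZooKeeper; all class and method names are made up. Writers enqueue
entries and wait on a future, while a single WAL thread drains whatever
accumulated during the previous tranche, writes it as one batch, and releases
all the waiters together:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of a group-commit write-ahead log. Update threads call append()
// and block on the returned future; the WAL thread batches everything
// pending into one sequential write, then completes the futures in a batch.
public class GroupCommitWal {

    // A pending log entry paired with the future its writer waits on.
    static final class Pending {
        final byte[] entry;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Pending(byte[] entry) { this.entry = entry; }
    }

    private final BlockingQueue<Pending> queue = new LinkedBlockingQueue<>();
    private final List<byte[]> log = new ArrayList<>(); // stand-in for the durable log
    private volatile boolean running = true;

    // Called by update threads. The update itself would also be applied to
    // the in-RAM index immediately; the future completes once the entry is
    // durably logged.
    public CompletableFuture<Void> append(byte[] entry) {
        Pending p = new Pending(entry);
        queue.add(p);
        return p.done;
    }

    public void shutdown() { running = false; }

    // The WAL thread's loop: block for one entry, then drain everything
    // else that arrived while the previous tranche was being written.
    public void walLoop() throws InterruptedException {
        while (running) {
            List<Pending> tranche = new ArrayList<>();
            tranche.add(queue.take());
            queue.drainTo(tranche);
            for (Pending p : tranche) {
                log.add(p.entry); // one sequential write (plus fsync) per tranche
            }
            // Release the whole tranche in a batch.
            for (Pending p : tranche) {
                p.done.complete(null);
            }
        }
    }
}
```

Under light load each tranche holds one entry, so latency is just one write.
Under heavy load tranches grow, amortizing the write/fsync cost across many
updates, which is the seamless transition to bandwidth-oriented batching
described above.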