Hi Josh,

On Thu, Dec 3, 2015 at 10:50 AM, Josh Elser <josh.el...@gmail.com> wrote:
> Hi Thai,
>
> There is no out-of-the-box feature provided with Accumulo that does what
> you're asking for. Accumulo doesn't provide any functionality to push
> notifications to other systems. You could potentially maintain other
> tables/columns in which you maintain the last time a row was updated, but
> the onus is on your "other services" to read the table to find out when a
> change occurred (which is probably not scalable at "real time").
>

You're absolutely right here. Reading the table to find out when and where a change occurred is not a good way to go. Furthermore, introducing new state into our current system (which is stateless at the moment) and maintaining it is not a good idea either.

>
> There are other systems you could likely leverage to solve this, depending
> on the durability and scalability that your application needs.
>
> For a system "close" to Accumulo, you could take a look at Fluo [1] which
> is an implementation of Google's "Percolator" system. This is a system
> based on throughput rather than low-latency, so it may not be a good fit
> for your needs. There are probably other systems in the Apache ecosystem
> (Kafka, Storm, Flink or Spark Streaming maybe?) that may be helpful to your
> problem. I'm not an expert on these to recommend one (nor do I think I
> understand your entire architecture well enough).

Good to hear about Fluo; I will look at it and see how different it is from (HBase) Coprocessors. I do use Kafka in the current system, but I do not think I need Kafka for this purpose because i) it is probably overkill and ii) I do not want to move (changed) data back and forth.

BTW, my current approach is with ZooKeeper, and I am having fun with it.

Thanks for your time.

Best,
Thai

>
>
> Thai Ngo wrote:
>
>> Hi list,
>>
>> I have a use case where existing rows in a table will be updated by an
>> internal service.
>> Data in a row of this table is composed of 2 parts:
>> the 1st part is immutable and the 2nd one will be updated (filled in) a
>> little later.
>>
>> Currently, I need to know when and which rows are updated
>> in the table so that other services can wisely start consuming the
>> data. This matters most when I need to consume the data in near
>> real time. So developing a notification function or, more simply, a trigger
>> is what I really want to do now.
>>
>> I am curious to know if someone has done a similar job or whether there are
>> features, APIs or best practices available for Accumulo so far. I'm
>> thinking of letting the internal service which updates the data notify
>> us whenever it updates the data.
>>
>> What do you think?
>>
>> Thanks,
>> Thai
>>
>