On Thu, Feb 27, 2014 at 11:10 AM, Aaron McCurry <[email protected]> wrote:
> What if we provide an implementation of the QueueReader concept that does
> what you are discussing. That way in more extreme cases when the user is
> forced into implementing the lower level api (perhaps for performance) they
> can still do it, but for the normal case the partitioning (and other
> difficult issues) are handled by the controllers.
That may be good longer term - I'd be supportive of pulling it from the
shards for now and focusing some time on fully baking the simpler idea in
the controller. Or at least disclaiming there be dragons :)

> I could see adding an enqueueMutate call to the controllers that pushes the
> mutates to the correct buckets for the user. At the same time we could
> allow each of the controllers to pull from an external and push the mutates
> to the correct buckets for the shards. I could see a couple of different
> ways of handling this.

Not sure what you mean by enqueueMutate - I was thinking of just taking the
existing QueueReader and plugging it into the Controller (with some leader
election) - obviously calling mutateRow instead of the current behavior.
Any more than one controller and we have to either expose the partitioning
or protect against dupes, right?

> However I do agree that right now there is too much burden on the user for
> the 95% case. We should make this simpler.

Yeah, from a user perspective, I think I'd ask these questions of the Blur
API:

o) What "streams" are available? (e.g. twitter, jms, kafka)
o) Create an instance of a stream (e.g.
   client.createStreamTable(type:twitter, name:apache))
o) Add stream-specific arguments to the table (e.g. twitter search criteria)
o) Add one or more Filters to a stream table, either to drop a column (e.g.
   the default twitter stream might index 'mentions' but the user might add
   a Filter to drop that column) or to drop whole messages.
o) Start/Stop the stream.
o) Get metrics on the stream's activity.

Thanks,
--tim

> On Thu, Feb 27, 2014 at 10:07 AM, Tim Williams <[email protected]> wrote:
>
>> I've been playing around with the new QueueReader stuff and I'm
>> starting to believe it's at the wrong level of abstraction - in the
>> shard context - for a user.
>>
>> Between having to know about the BlurPartitioner and handling all the
>> failure nuances, I'm thinking a much friendlier approach would be to
>> have the client implement a single message pump that Blur takes from
>> and handles.
>>
>> Maybe on startup the Controllers compete for the lead QueueReader
>> position, create it from the TableContext and run with it? The user
>> would still need to deal with Controller failures but that seems
>> easier to reason about than shard failures.
>>
>> The way it's crafted right now, the user seems burdened with a lot of
>> the hard problems that Blur otherwise solves. Obviously, it trades
>> off a high burden for one of the controllers.
>>
>> Thoughts?
>> --tim
>>
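
For anyone following the thread, here is a rough, non-authoritative sketch
of the "controllers compete for the lead, the winner drains a single
message pump" idea from the original message. It is written in Python only
to keep it short. Everything in it is an assumption made for illustration:
LeaderElection, Controller, run_once and fail are hypothetical names, the
election is an in-process lock standing in for whatever coordination
service a real cluster would use, and mutate_row is just a callable
standing in for the real mutateRow call. None of this is Blur's API.

```python
import threading
from collections import deque


class LeaderElection:
    """In-process stand-in for a real election service (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._leader = None

    def try_acquire(self, controller_id):
        # Succeeds if nobody leads yet, or if the caller already leads.
        with self._lock:
            if self._leader is None:
                self._leader = controller_id
            return self._leader == controller_id

    def release(self, controller_id):
        # Only the current leader can give up the lead.
        with self._lock:
            if self._leader == controller_id:
                self._leader = None


class Controller:
    """Hypothetical controller that drains the pump only while it leads."""

    def __init__(self, controller_id, election, pump, mutate_row):
        self.controller_id = controller_id
        self.election = election
        self.pump = pump              # single client-supplied message source
        self.mutate_row = mutate_row  # stands in for the real mutateRow call

    def run_once(self, max_messages=None):
        """Apply pending messages if this controller leads; return the count."""
        if not self.election.try_acquire(self.controller_id):
            return 0  # another controller holds the lead, so do nothing
        applied = 0
        while self.pump and (max_messages is None or applied < max_messages):
            self.mutate_row(self.pump.popleft())
            applied += 1
        return applied

    def fail(self):
        # Simulate a controller failure by giving up the lead.
        self.election.release(self.controller_id)


if __name__ == "__main__":
    rows = []
    pump = deque(f"msg-{i}" for i in range(5))
    election = LeaderElection()
    a = Controller("controller-a", election, pump, rows.append)
    b = Controller("controller-b", election, pump, rows.append)

    a.run_once(max_messages=2)  # a wins the lead and applies two messages
    b.run_once()                # b is not the leader, so this is a no-op
    a.fail()                    # a drops out
    b.run_once()                # b takes over and applies the rest
    print(rows)
```

With a single leader pulling from the pump, the user never sees the
partitioning, and only one controller applies each message. Note that this
sketch removes a message from the pump before applying it, so a crash
between the two steps would lose that message; a real implementation would
need some form of acknowledgement or replay, which is where the "protect
against dupes" question comes back in.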
