I'm not sure arbitrary code would be all that bad in a controlled environment. I guess DoSing your own regionserver would be bad, but still.
As for real time map reduce, that was a thing on Jonathan's slides, and he mentioned it was a top secret fancy thing he was working on at Streamy. No other details are available, unless he chooses to share them.

-ryan

On Sun, Oct 4, 2009 at 12:06 PM, Andrew Purtell <[email protected]> wrote:
> On a related note HBASE-1002 talks about generic user filters. But as you
> point out there are risks with untrusted code execution which have to be
> considered even for that restricted case.
>
> One thing that can be done with some confidence that one user or job won't
> DoS everyone else is to allow a fixed set of additive/aggregate function to
> run in a scanner context on the regionservers. This would avoid the need to
> send any of the data back to the client if the goal is counting, summation,
> averaging, etc. And these functions can be stacked such that a list of
> operations on columns are fed into a list of operations on the row.
>
> Allowing arbitrary code however is the way to madness. There could be an
> option to allow this through bytecode shipping but I do not think anyone
> should fool themselves into thinking this is at all safe to do in production.
> There is a middle ground of restricted code (e.g. no backwards branches or
> cyclical calling dependencies allowed) which is interesting from both
> usability and code safety perspectives. There are some research bytecode
> rewriting systems which could serve as a starting point.
>
> - Andy
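
For what it's worth, here is a rough sketch of how I read the "fixed set of aggregate functions, stacked" idea from Andy's mail. This is plain Java with no HBase dependencies, and every name in it (ScannerAggregates, Op, aggregate) is made up for illustration; none of it is an existing HBase API. It also assumes cell values are already decoded to doubles, and it takes one operation per level rather than a list of them, to keep it short. The point is that the client only picks operations from a closed menu, so no client code ever runs on the regionserver.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not part of HBase. Models a closed menu of aggregates
// that a server could evaluate during a scan instead of shipping rows back.
final class ScannerAggregates {

    /** Fixed menu of operations. Clients choose by name; no client code is shipped. */
    enum Op {
        COUNT {
            double apply(List<Double> values) { return values.size(); }
        },
        SUM {
            double apply(List<Double> values) {
                double total = 0.0;
                for (double v : values) total += v;
                return total;
            }
        },
        AVG {
            double apply(List<Double> values) {
                // Assumption: an empty input averages to 0.0 rather than NaN.
                return values.isEmpty() ? 0.0 : SUM.apply(values) / values.size();
            }
        };

        abstract double apply(List<Double> values);
    }

    /**
     * Stacks three operations: columnOp reduces the cells of each column,
     * rowOp reduces the per-column results of a row, and scanOp reduces the
     * per-row results of the whole scan. Each row maps a column name to its
     * (already decoded) cell values.
     */
    static double aggregate(List<Map<String, List<Double>>> rows,
                            Op columnOp, Op rowOp, Op scanOp) {
        List<Double> perRow = new ArrayList<>();
        for (Map<String, List<Double>> row : rows) {
            List<Double> perColumn = new ArrayList<>();
            for (List<Double> cells : row.values()) {
                perColumn.add(columnOp.apply(cells));
            }
            perRow.add(rowOp.apply(perColumn));
        }
        return scanOp.apply(perRow);
    }
}
```

Something like ScannerAggregates.aggregate(rows, Op.SUM, Op.SUM, Op.AVG) would then give the average per-row total, and only that single number would need to go back to the client. Each operation is a single bounded pass over its input, which is roughly why a fixed menu is easier to reason about than arbitrary shipped code.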
