On a related note, HBASE-1002 talks about generic user filters. But as you
point out, there are risks with untrusted code execution which have to be
considered even for that restricted case.

One thing that can be done, with some confidence that one user or job won't
DoS everyone else, is to allow a fixed set of additive/aggregate functions to
run in a scanner context on the regionservers. This would avoid the need to
send any of the data back to the client if the goal is counting, summation,
averaging, etc. And these functions can be stacked such that a list of
operations on columns is fed into a list of operations on the row.
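
To make the stacking idea concrete, here is a rough sketch in Java. Nothing in
it is an existing HBase API: ScanAggregateSketch, Fold, ColumnOp, ColumnSpec
and aggregate() are all made-up names, cells are simplified to longs, and a row
is just a map from column name to value. It also assumes one reading of
"stacked": each column operation turns a cell into a number, a row-level fold
combines those numbers for the row, and a scan-level fold combines the per-row
results. The point is only that the client picks operations from a fixed menu
by name and never ships code.

```java
import java.util.List;
import java.util.Map;
import java.util.function.LongBinaryOperator;

// Hypothetical sketch: none of these types are part of HBase.
final class ScanAggregateSketch {

    /** Fixed menu of folds. A client names one; it never ships code. */
    enum Fold {
        SUM(Long::sum), MIN(Math::min), MAX(Math::max);

        final LongBinaryOperator op;
        Fold(LongBinaryOperator op) { this.op = op; }
    }

    /** Column stage: how a single cell is turned into a number. */
    enum ColumnOp {
        VALUE, // use the cell's value
        ONE;   // count the cell as 1, ignoring its value

        long apply(long cell) { return this == VALUE ? cell : 1L; }
    }

    /** Pairs a column name with the operation applied to its cell. */
    static final class ColumnSpec {
        final String column;
        final ColumnOp op;
        ColumnSpec(String column, ColumnOp op) { this.column = column; this.op = op; }
    }

    /**
     * Applies the column stage to each row, folds the outputs per row with
     * rowFold, then folds the per-row results across the scan with scanFold.
     * Rows holding none of the requested columns are skipped. Returns null
     * if no row contributed, so only one number (or nothing) leaves the server.
     */
    static Long aggregate(Iterable<Map<String, Long>> rows, List<ColumnSpec> columns,
                          Fold rowFold, Fold scanFold) {
        Long total = null;
        for (Map<String, Long> row : rows) {
            Long rowResult = null;
            for (ColumnSpec spec : columns) {
                Long cell = row.get(spec.column);
                if (cell == null) continue; // column absent in this row
                long v = spec.op.apply(cell);
                rowResult = (rowResult == null) ? v : rowFold.op.applyAsLong(rowResult, v);
            }
            if (rowResult == null) continue;
            total = (total == null) ? rowResult : scanFold.op.applyAsLong(total, rowResult);
        }
        return total;
    }
}
```

With this shape, a cell count is ColumnOp.ONE with SUM at both levels, a sum is
ColumnOp.VALUE with SUM at both levels, and an average could be derived by the
client from those two small results.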

Allowing arbitrary code, however, is the way to madness. There could be an
option to allow this through bytecode shipping, but I do not think anyone
should fool themselves into thinking this is at all safe to do in production.
There is a middle ground of restricted code (e.g. no backward branches or
cyclical calling dependencies allowed) which is interesting from both
usability and code safety perspectives. There are some research bytecode
rewriting systems which could serve as a starting point.
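
As a toy illustration of the restricted-code idea, the two checks could look
roughly like the sketch below. It does not parse real JVM bytecode and is not
taken from any of the research systems mentioned: RestrictedCodeCheck, Insn and
both method names are invented here, an instruction is reduced to a name plus
an optional branch target, and the call graph is handed in as a plain map. If
every branch goes strictly forward and no function can reach itself through
calls, each instruction runs at most once per invocation, so running time is
bounded by code size.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy verifier: it does not read real class files.
final class RestrictedCodeCheck {

    /** Toy instruction: target is the index jumped to, or -1 for no branch. */
    static final class Insn {
        final String name;
        final int target;
        Insn(String name, int target) { this.name = name; this.target = target; }
    }

    /** True if every branch jumps strictly forward and stays within the code. */
    static boolean hasOnlyForwardBranches(List<Insn> code) {
        for (int i = 0; i < code.size(); i++) {
            int t = code.get(i).target;
            if (t == -1) continue; // not a branch
            // t == code.size() is allowed and means "jump to the end".
            if (t <= i || t > code.size()) return false;
        }
        return true;
    }

    /** True if the call graph (function name to callees) contains no cycle. */
    static boolean isAcyclic(Map<String, List<String>> calls) {
        // Absent = unvisited, 1 = on the current DFS path, 2 = fully explored.
        Map<String, Integer> state = new HashMap<>();
        for (String f : calls.keySet()) {
            if (!visit(f, calls, state)) return false;
        }
        return true;
    }

    private static boolean visit(String f, Map<String, List<String>> calls,
                                 Map<String, Integer> state) {
        Integer s = state.get(f);
        if (s != null) return s == 2; // state 1 means we came back to the path: a cycle
        state.put(f, 1);
        for (String callee : calls.getOrDefault(f, Collections.<String>emptyList())) {
            if (!visit(callee, calls, state)) return false;
        }
        state.put(f, 2);
        return true;
    }
}
```

A real checker would also have to deal with exception handlers, calls into
library code, and memory use, none of which this sketch attempts.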

   - Andy