Hello guys,

       My company uses Cloudera Impala as the core infrastructure for our 
online data analysis. The hardest problems we have run into are resource 
isolation and instability.
In our experience with Impala, a big query that consumes a vast amount of 
memory can crash the impalad process (the one acting as a worker rather than 
as the coordinator, right?).
In our simplest scenario, user A is a very important customer whose queries 
are relatively small, and user B is an unimportant user who may issue very 
large SQL queries to Impala. It is unacceptable for a big query from user B to 
crash the impalad process and degrade the experience of user A, so resource 
isolation is the key issue.
But according to the Impala documentation 
(http://www.cloudera.com/documentation/enterprise/5-6-x/topics/impala_admission.html), 
Impala's resource isolation is only a soft limit and cannot strictly prevent 
queries from user B from affecting user A.
As far as I know, Llama (running Impala with YARN) is not recommended. We 
actually tried it but were disappointed with its performance and accuracy.
       Is there any best practice for per-user resource isolation, so that 
different users do not affect each other?
       Thanks.

Best Regards,
Songbo
