I recently built a cube that contains an ultra-high-cardinality dimension, and queries against it fail with an HBase coprocessor timeout. The SQL is:

select sum(filesize)/1024/1024 fs, path6
from impala_monitor.V_MONITOR_HDFS_INFO
where par_dt = '2020-03-06'
group by path6
order by fs desc
limit 100

The path6 dimension has a cardinality of over 1 billion. What optimizations do people use when handling ultra-high-cardinality dimensions like this?


Error:
org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Coprocessor passed deadline! Maybe server is overloaded
    at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService.checkDeadline(CubeVisitService.java:226)
    at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService.visitCube(CubeVisitService.java:261)
    at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitService.callMethod(CubeVisitProtos.java:5555)
    at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7996)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1986)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1968)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33652)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2191)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:183)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:163)
while executing SQL: "select sum(filesize)/1024/1024 fs ,path6 from impala_monitor.V_MONITOR_HDFS_INFO where par_dt = '2020-03-06' group by path6 order by fs desc limit 100"
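
In case it helps frame the question, this is the workaround I am looking at first: a minimal kylin.properties sketch. I believe these are the Kylin 2.x property names, but I have not verified them, so please correct me if they are wrong:

# Sketch only -- property names as I understand them for Kylin 2.x, not verified.
# Give the HBase coprocessor more time before it hits the deadline checked in
# CubeVisitService.checkDeadline (the timeout shown in the stack trace above).
kylin.storage.hbase.coprocessor-timeout-seconds=300

# Raise the overall query timeout to match, otherwise the query is still cut off
# on the Kylin side.
kylin.query.timeout-seconds=300

Beyond that, I assume the real fix is in the cube design itself, e.g. not dictionary-encoding path6 (something like fixed_length encoding instead) and marking it as the shard-by column in the rowkey, but I would appreciate confirmation from anyone who has dealt with a 1B+ cardinality dimension.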
