Hi,

Yesterday I tried to check whether query pushdown works in the Kylin 5 Docker image, and all of my queries returned proper responses. After checking the logs you sent to Shaofeng, I found these error messages repeated many times:

1. 'java.io.IOException: All datanodes DatanodeInfoWithStorage[127.0.0.1:9866,DS-5093899b-06c7-4386-95d5-6fc271d92b52,DISK] are bad. Aborting...'

2. 'curator.ConnectionState : Connection timed out for connection string (localhost:2181) and timeout (15000) / elapsed (41794) org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss'
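A quick way to confirm whether the two services from the errors above are even reachable is to probe their ports. This is only a sketch: it assumes `nc` is available inside the container, and uses the default ports that appear in the error messages (2181 for ZooKeeper, 9866 for the HDFS DataNode); `check_port` is just a hypothetical helper name.

```shell
#!/bin/sh
# Probe a host:port and report whether anything is listening.
check_port() {
  # -z: scan only, don't send data; -w 5: give up after 5 seconds.
  nc -z -w 5 "$1" "$2"
}

# Ports taken from the error messages above (adjust if you changed them).
check_port localhost 2181 && echo "ZooKeeper port open" || echo "ZooKeeper port closed"
check_port localhost 9866 && echo "DataNode port open"  || echo "DataNode port closed"
```

If either port is closed, the corresponding daemon has likely crashed or failed to start, which matches both the Curator connection loss and the "All datanodes ... are bad" errors.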
I guess the root cause is that the container did not have enough resources. I noticed you queried a table called 'XXX_hive_dwh_400million_rows'; it looks like you ran a complex query on a table containing 400 million rows. Since I am the uploader of the Kylin 5 Docker image, I want to give some explanation. The Kylin 5 Docker image is not intended for performance benchmarks; it is only for demonstration. It is allocated very limited resources (8 GB of memory) if you use the default command from the Docker Hub page. Before I uploaded the image, I only tested it with the SSB dataset, whose biggest table contains about 60k rows. If you are using a larger dataset and more complex queries, you have to scale the resources accordingly. By default, try querying tables that contain no more than 100k rows.

Here are some tips that may help you check whether the daemon services are healthy and the resources (particularly disk space) are configured properly:

1. Check HDFS's web UI (http://localhost:9870/dfshealth.html#tab-datanode) to confirm the DataNode is in the 'In Service' state.

2. Check the DataNode's log in `/opt/hadoop-3.2.1/logs/hadoop-root-datanode-Kylin5-Machine.log` for error messages, e.g.: cat /opt/hadoop-3.2.1/logs/hadoop-root-datanode-Kylin5-Machine.log | grep ERROR | wc -l

3. Check that your Docker engine is configured with enough disk space. If you are using Docker Desktop like me, go to "Settings" - "Resources" - "Advanced" and make sure you have allocated 40 GB+ of disk space to Docker.

4. Check the available disk space inside your container with `df -h`; make sure the 'Use%' of 'overlay' is less than 60%.

5. Check the load average, CPU usage, and JVM GC. Make sure these metrics are not very high when you send a query.
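Tips 2 and 4 above can be combined into one small script. This is a sketch, not part of the image: the log path is the default from the tips above (adjust it for your container), and the 60% threshold is the rule of thumb from tip 4.

```shell
#!/bin/sh
# Default DataNode log path from the Kylin 5 image; change if yours differs.
LOG=/opt/hadoop-3.2.1/logs/hadoop-root-datanode-Kylin5-Machine.log

# Tip 2: count ERROR lines in the DataNode log.
if [ -f "$LOG" ]; then
  grep -c ERROR "$LOG"          # prints the number of ERROR lines (0 if none)
else
  echo "log not found: $LOG"
fi

# Tip 4: flag any filesystem (e.g. 'overlay') whose Use% exceeds 60.
df -h | awk 'NR > 1 { use = $5; gsub("%", "", use); if (use + 0 > 60) print $1, "is", $5, "used" }'
```

If the ERROR count is large or 'overlay' shows up in the second check, that points to the disk-space problem described above (a nearly full DataNode volume is a common cause of the "All datanodes ... are bad" error).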
------------------------
With warm regards,
Xiaoxiang Yu

On Tue, Oct 31, 2023 at 5:13 PM Nam Đỗ Duy <na...@vnpay.vn.invalid> wrote:

> Hi ShaoFeng,
>
> Thank you very much for your valuable feedback.
>
> I saw the application to be there (if I see it right) as in the attached
> photo. Kindly advise so that I can run this query on OLAP.
>
> PS. I sent you the log file in private.
>
> [image: image.png]
>
> On Tue, Oct 31, 2023 at 3:11 PM ShaoFeng Shi <shaofeng...@apache.org> wrote:
>
>> Can you provide the messages in logs/kylin.log when executing the SQL?
>> You can also check the Spark UI from the YARN resource manager (there
>> should be one running application called Sparder, which is Kylin's backend
>> Spark application). If the application is not there, it may indicate that
>> YARN doesn't have the resources to start it.
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC,
>> Apache Incubator PMC,
>> Email: shaofeng...@apache.org
>>
>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>>
>>
>> Nam Đỗ Duy <na...@vnpay.vn> wrote on Tue, Oct 31, 2023 at 10:35:
>>
>>> Dear Sir/Madam,
>>>
>>> I have a fact table with 500 million rows; I then built the model and
>>> indexes according to the website help.
>>>
>>> I chose full incremental because this is the first time I load data.
>>>
>>> I created both index types, aggregate group index and table index, as in
>>> the attached photo.
>>>
>>> But the query always fails after the timeout of 300 seconds (I run in
>>> Docker). I don't want to increase the 300-second value because I wish
>>> the OLAP query could run within 1 minute (is that possible?)
>>>
>>> It seems that the OLAP indexing function is not working to speed up the
>>> query via the precomputed cube.
>>>
>>> Can you advise how to check whether the index really worked?
>>>
>>> It is quite an urgent task for me, so a prompt response is highly
>>> appreciated.
>>>
>>> Thank you very much
>>