It would be nice to document the solution somewhere so that if someone else has this issue there is some reference. Best, -- C
> On Mar 26, 2021, at 11:23 AM, luoc <[email protected]> wrote:
>
> Hello,
> Glad to see that you have resolved the issue. BTW, are you interested in
> contributing the idea (a description of the solution) to our JIRA? We have
> a JIRA about testing the compatibility of Drill on CDH.
> You are also welcome to join our Slack channel:
>
> https://bit.ly/3t4rozO
>
>> On Mar 26, 2021, at 16:16, Mehmet - <[email protected]> wrote:
>>
>> Hi,
>> We have just solved this problem. I would like to describe our solution
>> as a contribution to the community, since others may face the same
>> problem.
>> We found that the cause was trying to connect to CDH's HDFS without a
>> client deployment. So we first deployed our Hadoop client files into the
>> /opt/apache-drill/conf folder.
>> After that we were able to query data without the block error, and to
>> connect to the HDFS NameNode using the nameservice name (high
>> availability).
>>
>> F.y.i.
>> Have a nice day.
>>
>> On Thu, 4 Mar 2021 at 18:34, Apache <[email protected]> wrote:
>>
>>> Hi,
>>> I think the solution to this problem depends on your initiative. I had
>>> planned to download CDH 6.3.3, but it seems that Cloudera has required a
>>> license since February 1. Orz...
>>> So, is it possible for you to use a mini cluster (with only the HDFS
>>> role services installed) to confirm that Drill can run on 6.3.3? Thanks.
>>>
>>>> On Mar 3, 2021, at 15:30, Mehmet - <[email protected]> wrote:
>>>> Hi,
>>>> I don't have any plan yet, but I was trying it on our Dev cluster, and
>>>> I'm thinking of trying it on the prod cluster as well.
>>>> Btw, you're right: as far as I can see in drillbit.log and in
>>>> "jars/3rdparty/hadoop-client-3.2.1.jar", I have been using
>>>> hadoop-client-3.2.1.jar (my Hadoop version is 3.0-cdh6.3.3). Drill's
>>>> client version is actually higher than mine, so I think there shouldn't
>>>> be a problem. What do you think?
>>>> Thank you,
>>>> BR.
>>>> On Mon, 1 Mar 2021 at 13:18, luoc <[email protected]> wrote:
>>>>> Hi,
>>>>> I am so sorry to hear that. Do you have any plans? Also, have you used
>>>>> hadoop-client-3.2.1.jar to connect to the cluster directly? I ask
>>>>> because the Hadoop version bundled with Drill is 3.2.1.
>>>>>
>>>>>> On Feb 28, 2021, at 4:28 AM, Mehmet - <[email protected]> wrote:
>>>>>> Hi,
>>>>>> Yes, you're right and I agree with you. Regarding your last comment:
>>>>>> I can get and read any file from HDFS via the terminal. I have
>>>>>> attached the screenshot.
>>>>>> Thank you.
>>>>>>
>>>>>> On Sat, 27 Feb 2021 at 05:58, luoc <[email protected]> wrote:
>>>>>> Hi,
>>>>>> Drill 1.18 works well on CDH 5.13. The difference is that I did not
>>>>>> enable Kerberos, and Drill supports Hadoop 3.x. (CDH 5.x is based on
>>>>>> Hadoop 2.6, 6.x on Hadoop 3.x.)
>>>>>> The `list` commands may only need to connect to the NameNode to get
>>>>>> metadata (without contacting a DataNode), so I recommend that you use
>>>>>> a DFS client to connect to the cluster and test whether the file can
>>>>>> be read, because the log points out that the DFS client has a problem
>>>>>> reading the file.
>>>>>> Please don't paste images into Apache emails; this is not supported.
>>>>>> Attaching a file is a simple alternative.
>>>>>>
>>>>>>> On Feb 27, 2021, at 12:10 AM, Mehmet - <[email protected]> wrote:
>>>>>>> Hi,
>>>>>>> Stack Trace: https://paste.ubuntu.com/p/nFvygSpcjy/
>>>>>>> Yes, the DataNode port is open and accessible, and a firewall
>>>>>>> problem is impossible because Drill was installed on the same nodes
>>>>>>> as the Cloudera cluster.
>>>>>>> I believe there is no problem with authentication, because I can
>>>>>>> list the HDFS folder without issue, as below:
>>>>>>> Via terminal (note: I used the same Kerberos user that I set in
>>>>>>> Drill's jaas.conf file);
>>>>>>> Thank you.
>>>>>>> BR.
>>>>>>>
>>>>>>> On Fri, 26 Feb 2021 at 17:58, luoc <[email protected]> wrote:
>>>>>>> Hi,
>>>>>>> The storage config is correct. Since Kerberos security is enabled,
>>>>>>> please check the Java stack trace to make sure this is not an
>>>>>>> authentication problem.
>>>>>>> Is it possible to use a DFS client to connect to HDFS and read the
>>>>>>> CSV file?
>>>>>>> Is the DataNode port open and accessible?
>>>>>>>
>>>>>>>> On Feb 26, 2021, at 9:52 PM, Mehmet - <[email protected]> wrote:
>>>>>>>> Hi,
>>>>>>>> 1. Drill version: 1.18.0
>>>>>>>> 2. HDFS version: Hadoop 3.0-cdh6.3.3
>>>>>>>> 3. Storage config: https://paste.ubuntu.com/p/5Dk9jVCxYr/
>>>>>>>> 4. drill-env.sh file: https://paste.ubuntu.com/p/MGNG4zhbrk/
>>>>>>>> Thank you.
>>>>>>>> BR.
>>>>>>>>
>>>>>>>> On Fri, 26 Feb 2021 at 16:14, luoc <[email protected]> wrote:
>>>>>>>>> Hi,
>>>>>>>>> That does not seem like an issue with Drill.
>>>>>>>>> Would you please provide more information:
>>>>>>>>> 1. Drill version
>>>>>>>>> 2. HDFS version
>>>>>>>>> 3. Storage config
>>>>>>>>>
>>>>>>>>>> On Feb 26, 2021, at 3:32 PM, Mehmet - <[email protected]> wrote:
>>>>>>>>>> Hi Team,
>>>>>>>>>> I have a problem with an HDFS query on Drill. When I run
>>>>>>>>>> "SHOW FILES in root.`tmp/`", I can list the files correctly.
>>>>>>>>>> But when I run a select query like "Select * from root.`tmp/`",
>>>>>>>>>> it throws the error below.
>>>>>>>>>> Notes:
>>>>>>>>>> - I have already checked the state of HDFS health (via dfsadmin
>>>>>>>>>>   and the HDFS UI) and there is no corruption or block error.
>>>>>>>>>> - The Drillbits are on the same cluster as Hadoop, so I think a
>>>>>>>>>>   network problem is impossible.
>>>>>>>>>> - I have also set dfs.client.use.datanode.hostname to true
>>>>>>>>>>   ( https://stackoverflow.com/a/55290406/7894534 )
>>>>>>>>>>
>>>>>>>>>> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR:
>>>>>>>>>> Could not obtain block: BP-2026912985-<namenode_ip>-1569935018133:blk_1073842201_101390 file=/tmp/2015-summary.csv
>>>>>>>>>> File Path: hdfs://<drillbit_ip>:8020/tmp/2015-summary.csv
>>>>>>>>>> Fragment: 0:0 [Error Id: 466835bd-6512-4854-b231-eaa439eba6f2 on <drillbit_ip>:31010]
>>>>>>>>>>
>>>>>>>>>> Thank you.
>>>>>>>>>> --
>>>>>>>>>> Mehmet ERSOY
>>>>>>>> --
>>>>>>>> Mehmet ERSOY
>>>>>>> --
>>>>>>> Mehmet ERSOY
>>>> --
>>>> Mehmet ERSOY
>>
>> --
>> Mehmet ERSOY
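
For reference, the fix reported in this thread (deploying the Hadoop client files into Drill's conf folder) could be scripted roughly as below. This is only a sketch under assumptions: the thread does not say which files were copied, so the core-site.xml / hdfs-site.xml pair is a guess, and the helper name deploy_hadoop_client_conf is made up for illustration.

```python
import shutil
from pathlib import Path

# Assumption: the thread only says "Hadoop client files"; these two names
# are a guess at the minimum a client needs to resolve an HA nameservice.
CLIENT_FILES = ("core-site.xml", "hdfs-site.xml")


def deploy_hadoop_client_conf(hadoop_conf_dir, drill_conf_dir, names=CLIENT_FILES):
    """Copy Hadoop client config files into Drill's conf directory.

    Hypothetical helper, not part of Drill. Returns the destination paths
    written, and raises FileNotFoundError before copying anything if a
    source file is missing.
    """
    src = Path(hadoop_conf_dir)
    dst = Path(drill_conf_dir)

    # Check everything up front so a partial deployment is never left behind.
    missing = [name for name in names if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {src}: {', '.join(missing)}")

    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        target = dst / name
        shutil.copy2(src / name, target)  # copy2 also keeps file timestamps
        copied.append(target)
    return copied
```

A call might look like deploy_hadoop_client_conf("/etc/hadoop/conf", "/opt/apache-drill/conf"), where /opt/apache-drill/conf is the folder named in the thread and /etc/hadoop/conf is an assumed location for the cluster's client configuration. The Drillbits would presumably need a restart afterwards to pick up the new files.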
