Hi Charles,

Okay, I can prepare a document about that. By the way, the HDFS storage plugin section of the official Drill documentation does not cover this detail (client files deployment). If you want, I can also open a PR for that section:
https://github.com/apache/drill/blob/gh-pages/_docs/connect-a-data-source/plugins/040-file-system-storage-plugin.md
@luoc,

Sure. I am currently working on connecting to Drill on CDH and using it from BI tools. It would be a pleasure to contribute my experience. Thank you for your invitation.

BR,
Mehmet.

On Tue, 30 Mar 2021 at 06:58, Charles Givre <[email protected]> wrote:

> It would be nice to document the solution somewhere so that if someone
> else has this issue there is some reference.
> Best,
> -- C
>
> > On Mar 26, 2021, at 11:23 AM, luoc <[email protected]> wrote:
> >
> > Hello,
> > Glad to see that you have resolved the issue. BTW, are you interested
> > in contributing the ideas (a description of the solution) to our JIRA?
> > We have a JIRA about testing the compatibility of Drill on CDH.
> > Also, you are welcome to join our Slack channel:
> >
> > https://bit.ly/3t4rozO
> >
> >> On 26 Mar 2021, at 16:16, Mehmet - <[email protected]> wrote:
> >>
> >> Hi,
> >> We have just solved this problem. I would like to describe our solution
> >> as a contribution to the community, since others may face the same
> >> problem.
> >> We noticed that the problem was caused by trying to connect to CDH's HDFS
> >> without a client deployment. So we first deployed our Hadoop client files
> >> into the /opt/apache-drill/conf folder.
> >> After that we were able to query data without the block error and to
> >> connect to the HDFS NameNode by its name service name (high availability).
> >>
> >> FYI.
> >> Have a nice day.
> >>
> >> On Thu, 4 Mar 2021 at 18:34, Apache <[email protected]> wrote:
> >>
> >>> Hi,
> >>> I think the solution to this problem depends on your initiative, because I
> >>> planned to download CDH 6.3.3, but it seems that Cloudera has required a
> >>> license since February 1. Orz...
> >>> So, is it possible for you to use a mini cluster (with only the HDFS role
> >>> services installed) to confirm that Drill can run on 6.3.3?
> >>> Thanks
> >>>>> On 3 Mar 2021, at 15:30, Mehmet - <[email protected]> wrote:
> >>>> Hi,
> >>>> I don't have any plan yet, but I have been trying it on our dev cluster,
> >>>> and I'm thinking of trying it on the prod cluster as well.
> >>>> By the way, you're right: as far as I can see in drillbit.log and the
> >>>> "jars/3rdparty/hadoop-client-3.2.1.jar" path, I have been using
> >>>> hadoop-client-3.2.1.jar (my Hadoop version is 3.0-cdh6.3.3). So
> >>>> Drill's client is newer than my cluster's. I think there shouldn't be a
> >>>> problem. What do you think?
> >>>> Thank you,
> >>>> BR.
> >>>> On Mon, 1 Mar 2021 at 13:18, luoc <[email protected]> wrote:
> >>>>> Hi,
> >>>>> I am sorry to hear that. Do you have any plans? And have you used
> >>>>> hadoop-client-3.2.1.jar to connect to the cluster directly? I ask
> >>>>> because the Hadoop version bundled with Drill is 3.2.1.
> >>>>>> On 28 Feb 2021, at 4:28 AM, Mehmet - <[email protected]> wrote:
> >>>>>> Hi,
> >>>>>> Yes, you're right and I agree with you. Regarding your last comment: I
> >>>>>> can get and read any file from HDFS via the terminal. I have attached
> >>>>>> the screenshot.
> >>>>>> Thank you.
> >>>>>> On Sat, 27 Feb 2021 at 05:58, luoc <[email protected]> wrote:
> >>>>>> Hi,
> >>>>>> Drill 1.18 works well on CDH 5.13. The difference is that I did not
> >>>>>> enable Kerberos. Also, Drill supports Hadoop 3.x (CDH 5.x is based on
> >>>>>> Hadoop 2.6, 6.x on Hadoop 3.x).
> >>>>>> The `list` commands may only need to connect to the NameNode to get
> >>>>>> metadata (without the DataNode), so I recommend that you use the DFS
> >>>>>> client to connect to the cluster and test whether the file can be read,
> >>>>>> because the log points out that the DFS client has a problem reading
> >>>>>> the file.
> >>>>>> Don’t paste images into Apache emails; this is not supported. Sending
> >>>>>> the image as an attachment is a simple alternative.
> >>>>>>> On 27 Feb 2021, at 12:10 AM, Mehmet - <[email protected]> wrote:
> >>>>>>> Hi,
> >>>>>>> Stack trace: https://paste.ubuntu.com/p/nFvygSpcjy/
> >>>>>>> Yes, the DataNode port is open and accessible, and a firewall problem
> >>>>>>> is impossible because Drill is installed on the same nodes as the
> >>>>>>> Cloudera cluster.
> >>>>>>> I believe there is no problem with authentication, because I can list
> >>>>>>> the HDFS folder without errors, as below:
> >>>>>>> Via terminal (note: I used the same Kerberos user that I set
> >>>>>>> in Drill's jaas.conf);
> >>>>>>> Thank you.
> >>>>>>> BR.
> >>>>>>> On Fri, 26 Feb 2021 at 17:58, luoc <[email protected]> wrote:
> >>>>>>> Hi,
> >>>>>>> The storage config is correct. Since Kerberos security is enabled,
> >>>>>>> please check the Java stack trace to make sure this is not an
> >>>>>>> authentication problem.
> >>>>>>> Is it possible to use the DFS client to connect to HDFS and read the
> >>>>>>> CSV file?
> >>>>>>> Is the DataNode port open and accessible?
> >>>>>>>> On 26 Feb 2021, at 9:52 PM, Mehmet - <[email protected]> wrote:
> >>>>>>>> Hi,
> >>>>>>>> 1. Drill version: 1.18.0
> >>>>>>>> 2. HDFS version: Hadoop 3.0-cdh6.3.3
> >>>>>>>> 3. Storage config: https://paste.ubuntu.com/p/5Dk9jVCxYr/
> >>>>>>>> 4. drill-env.sh file: https://paste.ubuntu.com/p/MGNG4zhbrk/
> >>>>>>>> Thank you.
> >>>>>>>> BR.
> >>>>>>>> On Fri, 26 Feb 2021 at 16:14, luoc <[email protected]> wrote:
> >>>>>>>>> Hi,
> >>>>>>>>> That does not seem like an issue with Drill.
> >>>>>>>>> Would you please provide more information:
> >>>>>>>>> 1. Drill version
> >>>>>>>>> 2. HDFS version
> >>>>>>>>> 3. Storage config
> >>>>>>>>>> On 26 Feb 2021, at 3:32 PM, Mehmet - <[email protected]> wrote:
> >>>>>>>>>> Hi Team,
> >>>>>>>>>> I have a problem with HDFS queries on Drill. When I run "SHOW FILES in
> >>>>>>>>>> root.`tmp/`", I can list the files correctly.
> >>>>>>>>>> But when I run a select query like "Select * from root.`tmp/`", it
> >>>>>>>>>> throws the error below.
> >>>>>>>>>> Notes:
> >>>>>>>>>> - I have already checked the health of HDFS (via dfsadmin and the
> >>>>>>>>>> HDFS UI), and there is no corruption or block error.
> >>>>>>>>>> - The Drillbits are on the same cluster as Hadoop, so I think a
> >>>>>>>>>> network problem is impossible.
> >>>>>>>>>> - I have also set dfs.client.use.datanode.hostname as true (
> >>>>>>>>>> https://stackoverflow.com/a/55290406/7894534 )
> >>>>>>>>>> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR:
> >>>>>>>>>> Could not obtain block: BP-2026912985-<namenode_ip>-1569935018133:blk_1073842201_101390 file=/tmp/2015-summary.csv
> >>>>>>>>>> File Path: hdfs://<drillbit_ip>:8020/tmp/2015-summary.csv
> >>>>>>>>>> Fragment: 0:0 [Error Id: 466835bd-6512-4854-b231-eaa439eba6f2 on <drillbit_ip>:31010]
> >>>>>>>>>> Thank you.
> >>>>>>>>>> --
> >>>>>>>>>> Mehmet ERSOY
> >>>>>>>> --
> >>>>>>>> Mehmet ERSOY
> >>>>>>> --
> >>>>>>> Mehmet ERSOY
> >>>> --
> >>>> Mehmet ERSOY
> >>
> >> --
> >> Mehmet ERSOY
>
--
Mehmet ERSOY
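
For reference, the fix reported in this thread (deploying the Hadoop client files into /opt/apache-drill/conf) could be scripted roughly as follows. This is only a minimal sketch under stated assumptions: the thread does not name the client files, so `core-site.xml` and `hdfs-site.xml` are assumed here, and the helper `deploy_client_files` is a hypothetical name, not part of Drill or Hadoop.

```python
import shutil
from pathlib import Path

# Assumption: the thread only says "Hadoop client files"; these two names
# are the usual HDFS client configs and are not confirmed by the thread.
CLIENT_FILES = ("core-site.xml", "hdfs-site.xml")


def deploy_client_files(src_dir, dest_dir, names=CLIENT_FILES):
    """Copy the named client config files from src_dir into dest_dir.

    Returns the destination paths that were written. Raises
    FileNotFoundError if an expected file is missing from src_dir.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    # Create the destination directory if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        source_file = src / name
        if not source_file.is_file():
            raise FileNotFoundError(f"missing client file: {source_file}")
        # copy2 keeps timestamps and overwrites an existing copy.
        copied.append(Path(shutil.copy2(source_file, dest / name)))
    return copied
```

A call such as `deploy_client_files("/etc/hadoop/conf", "/opt/apache-drill/conf")` would match the setup described above, where the source directory is an assumption about where the cluster's client configuration lives and the destination is the folder named in the thread.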
