Hello,

Thanks. Please take the following steps:

1. Create a JIRA about the issue (skip this step if no code needs to be changed).
2. Update the docs (on the `gh-pages` branch).
3. Create a PR on our GitHub (add `[DOC UPDATE]` at the front of the title).
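For the doc update, the fix reported in the quoted thread (deploying the Hadoop client files into Drill's conf folder) could be illustrated with something like the sketch below. It is only a rough illustration under assumptions: the thread does not say which client files were deployed, so `core-site.xml` and `hdfs-site.xml` are a guess based on the standard Hadoop client configuration file names, and the helper name `deploy_client_files` is hypothetical. Only the destination `/opt/apache-drill/conf` comes from the thread.

```python
import shutil
from pathlib import Path

# Assumption: these standard Hadoop client config files are the ones needed.
# The thread only says "Hadoop client files" without naming them.
CLIENT_FILES = ("core-site.xml", "hdfs-site.xml")


def deploy_client_files(hadoop_conf_dir, drill_conf_dir, names=CLIENT_FILES):
    """Copy Hadoop client config files into Drill's conf folder.

    Returns the list of destination paths that were written.
    Raises FileNotFoundError if one of the expected files is missing.
    """
    src = Path(hadoop_conf_dir)
    dst = Path(drill_conf_dir)
    copied = []
    for name in names:
        source_file = src / name
        if not source_file.is_file():
            raise FileNotFoundError(f"missing client file: {source_file}")
        target = dst / name
        # copy2 also preserves timestamps and permission bits where possible.
        shutil.copy2(source_file, target)
        copied.append(target)
    return copied
```

A call might look like `deploy_client_files("/etc/hadoop/conf", "/opt/apache-drill/conf")`, where the source directory is an assumed location and should be replaced with wherever the cluster's client configuration actually lives. This would be run on every node that hosts a Drillbit.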
> On Mar 30, 2021, at 5:10 PM, Mehmet - <[email protected]> wrote:
>
> Hi Charles,
>
> Okay, I can prepare a document about that. By the way, the HDFS storage
> plugin section of the official Drill documentation does not cover this
> detail (client file deployment). If you want, I can also open a PR for
> that section:
> https://github.com/apache/drill/blob/gh-pages/_docs/connect-a-data-source/plugins/040-file-system-storage-plugin.md
>
> @luoc,
> Sure. These days I am working on connecting to Drill on CDH and using it
> with BI tools. It's a pleasure for me to contribute my experience.
> Thank you for your invitation.
>
> BR,
> Mehmet.
>
> On Tue, Mar 30, 2021, at 06:58, Charles Givre <[email protected]> wrote:
>
>> It would be nice to document the solution somewhere so that if someone
>> else has this issue there is some reference.
>> Best,
>> -- C
>>
>>> On Mar 26, 2021, at 11:23 AM, luoc <[email protected]> wrote:
>>>
>>> Hello,
>>> Glad to see that you have resolved the issue. BTW, are you interested
>>> in contributing your ideas (a description of the solution) to our JIRA?
>>> We have a JIRA about testing the compatibility of Drill on CDH.
>>> You are also welcome to join our Slack channel:
>>>
>>> https://bit.ly/3t4rozO
>>>
>>>> On Mar 26, 2021, at 16:16, Mehmet - <[email protected]> wrote:
>>>>
>>>> Hi,
>>>> We have just solved this problem. I would like to describe our
>>>> solution as a contribution to the community, since others may face
>>>> the same problem.
>>>> We found that the cause was trying to connect to CDH's HDFS without
>>>> deploying the Hadoop client files. So we first deployed our Hadoop
>>>> client files into the /opt/apache-drill/conf folder.
>>>> After that we were able to query data without the block error and to
>>>> connect to the HDFS NameNode by its name service name (high
>>>> availability).
>>>>
>>>> F.y.i.
>>>> Have a nice day.
>>>>
>>>> On Thu, Mar 4, 2021, at 18:34, Apache <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>> I think the solution to this problem depends on your initiative,
>>>>> because I planned to download CDH 6.3.3, but it seems that Cloudera
>>>>> has required a license since February 1. Orz...
>>>>> So, would it be possible for you to use a mini cluster (with only the
>>>>> HDFS role services installed) to confirm that Drill can run on 6.3.3?
>>>>> Thanks.
>>>>>
>>>>>> On Mar 3, 2021, at 15:30, Mehmet - <[email protected]> wrote:
>>>>>> Hi,
>>>>>> I don't have a plan yet. I have been trying it on our dev cluster,
>>>>>> and I'm thinking of trying it on the prod cluster as well.
>>>>>> By the way, you're right: as far as I can see from drillbit.log and
>>>>>> "jars/3rdparty/hadoop-client-3.2.1.jar", I have been using
>>>>>> hadoop-client-3.2.1.jar (my Hadoop version is 3.0-cdh6.3.3). So
>>>>>> Drill's client version is higher than my cluster's. I think there
>>>>>> shouldn't be a problem. What do you think?
>>>>>> Thank you,
>>>>>> BR.
>>>>>> On Mon, Mar 1, 2021, at 13:18, luoc <[email protected]> wrote:
>>>>>>> Hi,
>>>>>>> I am so sorry to hear that. Do you have any plans? Also, have you
>>>>>>> used hadoop-client-3.2.1.jar to connect to the cluster directly? I
>>>>>>> ask because the Hadoop version bundled with Drill is 3.2.1.
>>>>>>>> On Feb 28, 2021, at 4:28 AM, Mehmet - <[email protected]> wrote:
>>>>>>>> Hi,
>>>>>>>> Yes, you're right and I agree with you. Regarding your last
>>>>>>>> comment: I can get and read any file from HDFS via the terminal.
>>>>>>>> I have attached the screenshot.
>>>>>>>> Thank you.
>>>>>>>> On Sat, Feb 27, 2021, at 05:58, luoc <[email protected]> wrote:
>>>>>>>> Hi,
>>>>>>>> Drill 1.18 works well on CDH 5.13. The difference is that I did
>>>>>>>> not enable Kerberos, and Drill does support Hadoop 3.x (CDH 5.x is
>>>>>>>> based on Hadoop 2.6, 6.x on Hadoop 3.x).
>>>>>>>> The `list` commands may only need to connect to the NameNode to
>>>>>>>> get metadata (without the DataNode), so I recommend that you use a
>>>>>>>> DFS client to connect to the cluster and test whether the file can
>>>>>>>> be read, because the log indicates that the DFS client has a
>>>>>>>> problem reading the file.
>>>>>>>> Don't paste images into Apache emails; this is not supported.
>>>>>>>> Sending them as an attachment is a simple alternative.
>>>>>>>>> On Feb 27, 2021, at 12:10 AM, Mehmet - <[email protected]> wrote:
>>>>>>>>> Hi,
>>>>>>>>> Stack Trace: https://paste.ubuntu.com/p/nFvygSpcjy/
>>>>>>>>> Yes, the DataNode port is open and accessible, and a firewall
>>>>>>>>> problem is impossible because Drill was installed on the same
>>>>>>>>> nodes as the Cloudera cluster.
>>>>>>>>> I don't think there is a problem with authentication, because I
>>>>>>>>> can list the HDFS folder without issue, as below:
>>>>>>>>> Via terminal (note: I used the same Kerberos user that I set in
>>>>>>>>> Drill's jaas.conf);
>>>>>>>>> Thank you.
>>>>>>>>> BR.
>>>>>>>>> On Fri, Feb 26, 2021, at 17:58, luoc <[email protected]> wrote:
>>>>>>>>> Hi,
>>>>>>>>> The storage config is correct. Since Kerberos security is
>>>>>>>>> enabled, please check the Java stack trace to make sure this is
>>>>>>>>> not an authentication problem.
>>>>>>>>> Is it possible to use a DFS client to connect to HDFS and read
>>>>>>>>> the CSV file?
>>>>>>>>> Is the DataNode port open and accessible?
>>>>>>>>>> On Feb 26, 2021, at 9:52 PM, Mehmet - <[email protected]> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> 1. Drill version: 1.18.0
>>>>>>>>>> 2. HDFS version: Hadoop 3.0-cdh6.3.3
>>>>>>>>>> 3. Storage config: https://paste.ubuntu.com/p/5Dk9jVCxYr/
>>>>>>>>>> 4. drill-env.sh file: https://paste.ubuntu.com/p/MGNG4zhbrk/
>>>>>>>>>> Thank you.
>>>>>>>>>> BR.
>>>>>>>>>> On Fri, Feb 26, 2021, at 16:14, luoc <[email protected]> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> That does not seem like an issue with Drill.
>>>>>>>>>>> Would you please provide more information:
>>>>>>>>>>> 1. Drill version
>>>>>>>>>>> 2. HDFS version
>>>>>>>>>>> 3. Storage config
>>>>>>>>>>>> On Feb 26, 2021, at 3:32 PM, Mehmet - <[email protected]> wrote:
>>>>>>>>>>>> Hi Team,
>>>>>>>>>>>> I have a problem with HDFS queries on Drill. When I run
>>>>>>>>>>>> "SHOW FILES in root.`tmp/`", I can list the files correctly.
>>>>>>>>>>>> But when I run a select query like
>>>>>>>>>>>> "Select * from root.`tmp/`", it throws the error below.
>>>>>>>>>>>> Notes:
>>>>>>>>>>>> - I have already checked the state of HDFS health (via
>>>>>>>>>>>>   dfsadmin and the HDFS UI) and there is no corruption or
>>>>>>>>>>>>   block error.
>>>>>>>>>>>> - The Drillbits are on the same cluster as Hadoop, so I think
>>>>>>>>>>>>   a network problem is impossible.
>>>>>>>>>>>> - I have also set dfs.client.use.datanode.hostname to true
>>>>>>>>>>>>   ( https://stackoverflow.com/a/55290406/7894534 )
>>>>>>>>>>>> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR:
>>>>>>>>>>>> Could not obtain block: BP-2026912985-<namenode_ip>-1569935018133:blk_1073842201_101390 file=/tmp/2015-summary.csv
>>>>>>>>>>>> File Path: hdfs://<drillbit_ip>:8020/tmp/2015-summary.csv
>>>>>>>>>>>> Fragment: 0:0 [Error Id: 466835bd-6512-4854-b231-eaa439eba6f2 on
>>>>>>>>>>>> <drillbit_ip>:31010]
>>>>>>>>>>>> Thank you.
>>>>>>>>>>>> --
>>>>>>>>>>>> Mehmet ERSOY
>>>>>>>>>> --
>>>>>>>>>> Mehmet ERSOY
>>>>>>>>> --
>>>>>>>>> Mehmet ERSOY
>>>>>> --
>>>>>> Mehmet ERSOY
>>>>
>>>> --
>>>> Mehmet ERSOY
>>
>>
>
> --
> Mehmet ERSOY
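
As a possible companion to the documentation update, the thread's end state (connecting to HDFS by name service name rather than by a single NameNode address such as the `<drillbit_ip>:8020` seen in the error) might be sketched as below. This is an assumption-laden illustration, not the configuration actually used in the thread, which is only available behind the paste link. The name service ID `nameservice1` and the helper `hdfs_plugin_config` are hypothetical, and the `formats` section of a real plugin configuration is omitted for brevity.

```python
import json


def hdfs_plugin_config(nameservice, workspace_location="/"):
    """Build a minimal file-system storage plugin config for an HA name service."""
    return {
        "type": "file",
        "enabled": True,
        # Point at the name service ID instead of one NameNode host:port.
        # Resolving it relies on the Hadoop client files being visible to Drill.
        "connection": f"hdfs://{nameservice}",
        "workspaces": {
            "root": {
                "location": workspace_location,
                "writable": False,
                "defaultInputFormat": None,
            }
        },
    }


# "nameservice1" is a placeholder; use the ID defined in your hdfs-site.xml.
config_json = json.dumps(hdfs_plugin_config("nameservice1"), indent=2)
print(config_json)
```

The printed JSON could then be pasted into the storage plugin editor of the Drill Web UI after adding the format definitions the workspace needs.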
