Hi Latha,

Unfortunately, the mailing list does not support attachments. Could you upload the file to a file-sharing service and share a link? If the file is below 20 MB and you don't have another host available, you should be able to file a JIRA issue and upload it there as an attachment.

-Jason

On Wed, Apr 8, 2015 at 2:06 PM, Sivasubramaniam, Latha < [email protected]> wrote:

> Ramana,
>
> Please find attached dservices.tar file.
>
> Thanks for your help.
>
> -Latha
>
> *From:* Sivasubramaniam, Latha
> *Sent:* Wednesday, April 08, 2015 1:33 PM
> *To:* '[email protected]'
> *Subject:* RE: Unable to query data from hdfs
>
> Thanks for all the responses.
>
> Once I renamed files within directories to have extensions .csv, then it
> worked. So looks like for csv format, having extension is a must. It would
> be nice, if it does not allow “null” in the extension description.
>
> Now in the next step of my proof of concept, I am trying to access parquet
> files. I have parquet files(tables) created for the tables using impala, I
> am assuming that I should be able to access those files via drill as well.
>
> My parquet tables are placed under /user/hive/warehouse, like listed below
> here
>
> [root@rtr-poc-imp1 sample-data]# hdfs dfs -ls /user/hive/warehouse
> Found 19 items
> drwxrwxrwt - impala hive 0 2015-03-31 16:00 /user/hive/warehouse/dim_agent_status_parq
> drwxrwxrwt - impala hive 0 2015-03-31 16:00 /user/hive/warehouse/dim_agent_status_reasons_parq
> drwxrwxrwt - impala hive 0 2015-03-27 12:27 /user/hive/warehouse/dim_agents_parquet
> drwxrwxrwt - impala hive 0 2015-03-31 16:00 /user/hive/warehouse/dim_call_action_reasons_parq
> drwxrwxrwt - impala hive 0 2015-03-31 14:09 /user/hive/warehouse/dim_call_actions_parq
> drwxrwxrwt - impala hive 0 2015-03-31 13:54 /user/hive/warehouse/dim_call_types_parq
> drwxrwxrwt - impala hive 0 2015-03-31 15:59 /user/hive/warehouse/dim_dispositions_parq
> drwxrwxrwt - impala hive 0 2015-03-31 15:20 /user/hive/warehouse/dim_resource_groups_parq
> drwxrwxrwt - impala hive 0 2015-03-31 13:33 /user/hive/warehouse/dim_services_parq
> drwxrwxrwt - impala hive 0 2015-03-31 14:00 /user/hive/warehouse/dim_sites_parq
> drwxrwxrwt - impala hive 0 2015-03-31 15:25 /user/hive/warehouse/dim_workgroups_parq
> drwxrwxrwx - root hive 0 2015-04-08 14:36 /user/hive/warehouse/dservices
> drwxrwxrwt - impala hive 0 2015-03-27 11:48 /user/hive/warehouse/edwpoc.db
> drwxrwxrwt - impala hive 0 2015-03-31 12:47 /user/hive/warehouse/fact_agent_activity_detail_12m_partparq
> drwxrwxrwt - impala hive 0 2015-03-30 13:03 /user/hive/warehouse/fact_contact_detail_12m_partparq
> drwxrwxrwt - impala hive 0 2015-03-27 13:36 /user/hive/warehouse/fact_contact_detail_partparq
> -rw-r--r-- 3 root hive 455 2015-04-08 14:55 /user/hive/warehouse/region.parq
> drwxrwxrwt - impala hive 0 2015-03-25 22:29 /user/hive/warehouse/sample_07
> drwxrwxrwt - impala hive 0 2015-03-25 22:29 /user/hive/warehouse/sample_08
>
> example listing from one of the directory
>
> hdfs dfs -ls /user/hive/warehouse/dim_services_parq
> Found 3 items
> -rw-r--r-- 3 impala hive 55121 2015-03-31 13:33 /user/hive/warehouse/dim_services_parq/4645c4221dafa337-250888c6ac1de29b_1376355963_data.0.parq
> -rw-r--r-- 3 impala hive 71075 2015-03-31 13:33 /user/hive/warehouse/dim_services_parq/4645c4221dafa337-250888c6ac1de29c_2123191845_data.0.parq
> drwxrwxrwt - impala hive 0 2015-03-31 13:33 /user/hive/warehouse/dim_services_parq/_impala_insert_staging
> [root@rtr-poc-imp1 sample-data]#
>
> There is nothing under impala staging directory, this is primarily used
> when insert operation is performed.
>
> I copied dim_services_parq directory to dservices and below is the listing
> of dservices directory.
>
> [root@rtr-poc-imp1 sample-data]# hdfs dfs -ls /user/hive/warehouse/dservices
> Found 2 items
> -rwxrwxrwx 3 root hive 55121 2015-04-08 14:12 /user/hive/warehouse/dservices/service0.parquet
> -rwxrwxrwx 3 root hive 71075 2015-04-08 14:12 /user/hive/warehouse/dservices/service1.parquet
>
> Now when I try, I get the below error
>
> select * from hdfs.drillpoc.`/dservices`;
> Query failed: RemoteRpcException: Failure while running fragment.,
> java.lang.UnsupportedOperationException [
> cfca83ec-986a-43c0-a967-5aee102401dd on rtr-poc-imp2.labs.aspect.com:31010 ]
> [ cfca83ec-986a-43c0-a967-5aee102401dd on rtr-poc-imp2.labs.aspect.com:31010 ]
>
> I also copied the drill sample parquet file region.parquet to the same
> location and that works fine like below.
>
> select * from hdfs.drillpoc.`region.parq`;
> +-------------+-------------+----------------------+
> | R_REGIONKEY | R_NAME      | R_COMMENT            |
> +-------------+-------------+----------------------+
> | 0           | AFRICA      | lar deposits. blithe |
> | 1           | AMERICA     | hs use ironic, even  |
> | 2           | ASIA        | ges. thinly even pin |
> | 3           | EUROPE      | ly final courts cajo |
> | 4           | MIDDLE EAST | uickly special accou |
> +-------------+-------------+----------------------+
> 5 rows selected (0.122 seconds)
>
> So far what I have read, impala created parquet file should be like any
> other parquet file, there should not be a problem. If this does not work, I
> need to convert all my tables in text format to parquet format and access
> it with drill. Is there any utility to do that.
>
> Thanks for all the help.
>
> Latha
>
> *From:* Sivasubramaniam, Latha
> *Sent:* Wednesday, April 08, 2015 8:00 AM
> *To:* '[email protected]'
> *Subject:* RE: Unable to query data from hdfs
>
> Hi,
>
> Thanks for your responses. Even though I had done use hdfs, only when I
> fully qualified the file name it worked. But I am not able to access files
> without .csv extension.
>
> I modified
>
> "csv": {
>   "type": "text",
>   "extensions": [
>     "csv"
>   ],
>   "delimiter": ","
>
> To
>
> "csv": {
>   "type": "text",
>   "extensions": null,
>   "delimiter": ","
>
> And tried to access hdfs file ‘DIM_Agents’ and I get the same error. With
> null extensions, I can’t access ‘test.csv’ also, once I reverted back csv
> format description then I could access test.csv again, but I cannot access
> other files with either of the format descriptions.
>
> Below are what I tried. Is ‘_’ (underscore) a problem in the file name.
> All my hdfs files are in text format.
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
> +----------------+------+
> | columns        | dir0 |
> +----------------+------+
> | ["1","Latha"]  | root |
> | ["2","Roshan"] | root |
> +----------------+------+
> 2 rows selected (0.276 seconds)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/DIM_Agents`;
> Query failed: SqlValidatorException: Table 'hdfs.root./DIM_Agents' not found
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/DIM_Agents`;
> Query failed: SqlValidatorException: Table 'hdfs.root./DIM_Agents' not found
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
> Query failed: SqlValidatorException: Table 'hdfs.root./test.csv' not found
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/DIM_Agents`;
> Query failed: SqlValidatorException: Table 'hdfs.root./DIM_Agents' not found
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
> Query failed: SqlValidatorException: Table 'hdfs.root./test.csv' not found
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
> +----------------+------+
> | columns        | dir0 |
> +----------------+------+
> | ["1","Latha"]  | root |
> | ["2","Roshan"] | root |
> +----------------+------+
> 2 rows selected (0.112 seconds)
>
> Appreciate your help.
>
> Thanks,
> Latha
>
> This email (including any attachments) is proprietary to Aspect Software,
> Inc. and may contain information that is confidential. If you have received
> this message in error, please do not read, copy or forward this message.
> Please notify the sender immediately, delete it from your system and
> destroy any copies. You may not further disclose or distribute this email
> or its attachments.
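
For anyone following the thread, the extension behaviour Latha describes in the quoted messages can be modelled roughly as below. This is only a sketch of what was observed (a file is recognised when its extension appears in the format's "extensions" list, and a null list matches nothing). It is not Drill's actual implementation, and the function name `matching_format` is made up for illustration.

```python
import posixpath


def matching_format(path, formats):
    """Return the name of the first format whose extension list contains
    the file's extension, or None if no format matches.

    Hypothetical helper: it models the behaviour reported in the thread,
    not Drill's real format-selection code.
    """
    _, ext = posixpath.splitext(path)
    ext = ext.lstrip(".").lower()
    for name, cfg in formats.items():
        # Assumption: a null "extensions" entry behaves like an empty list.
        extensions = cfg.get("extensions") or []
        if ext and ext in extensions:
            return name
    return None


# The "csv" format entry as quoted in the thread.
formats = {"csv": {"type": "text", "extensions": ["csv"], "delimiter": ","}}

# The same entry after "extensions" was changed to null.
null_formats = {"csv": {"type": "text", "extensions": None, "delimiter": ","}}

print(matching_format("/test.csv", formats))
print(matching_format("/DIM_Agents", formats))
print(matching_format("/test.csv", null_formats))
```

Under this model only `/test.csv` with the original config resolves to a format, which lines up with the "Table ... not found" errors in the transcript and with renaming the files to `.csv` making them queryable.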
