Hi,

When I tried doing this MySQL-into-Hive import in 3 separate steps, it worked!
Here is what I ran:

1) Import the MySQL table into HDFS with this command:

ssk01:~/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0 # ./bin/sqoop import \
    --connect jdbc:mysql://localhost/ClassicModels -table Customers -m 1 \
    --verbose --fields-terminated-by '\t' --lines-terminated-by '\n'

2) Create a table in Hive with this query (the same query found in the
--verbose log of the failing Sqoop run):

hive> CREATE TABLE IF NOT EXISTS `Customers` ( `customerNumber` INT,
    `customerName` STRING, `contactLastName` STRING, `contactFirstName` STRING,
    `phone` STRING, `addressLine1` STRING, `addressLine2` STRING,
    `city` STRING, `state` STRING, `postalCode` STRING, `country` STRING,
    `salesRepEmployeeNumber` INT, `creditLimit` DOUBLE)
    COMMENT 'Imported by sqoop on 2013/07/04 00:41:14'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
    STORED AS TEXTFILE;
OK
Time taken: 0.214 seconds

3) Load the data from HDFS with this query (again, the same query found in
the --verbose log of the failing Sqoop run):

hive> LOAD DATA INPATH 'hdfs://localhost:9000/user/root/Customers' INTO TABLE `Customers`;
Loading data to table default.customers
Table default.customers stats: [num_partitions: 0, num_files: 2, num_rows: 0,
total_size: 15556, raw_data_size: 0]
OK
Time taken: 0.317 seconds

hive> select count(*) from Customers;
.......
Total MapReduce CPU Time Spent: 2 seconds 460 msec
OK
122
Time taken: 26.873 seconds, Fetched: 1 row(s)

So the table got created and populated properly. I now wonder why it is not
working when I use --hive-import. Any pointers to debug this issue?
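One difference I notice between the two runs: the failing --hive-import
command used --mysql-delimiters, while the working 3-step flow used tab and
newline. So the next thing I will try is a one-shot import that matches the
working delimiters explicitly. Just a sketch, I have not verified it on my
setup yet:

ssk01:~/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0 # ./bin/sqoop import \
    --connect jdbc:mysql://localhost/ClassicModels --table Customers -m 1 \
    --hive-home /root/siddharth/tools/hive-0.11.0-bin --hive-import \
    --fields-terminated-by '\t' --lines-terminated-by '\n' --verbose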
Thanks,
Siddharth

On Thu, Jul 4, 2013 at 12:21 PM, Siddharth Karandikar
<[email protected]> wrote:
> Hi Jarek,
>
> I have not re-configured Hive; I am using the default settings/locations.
> I am using --hive-home to tell Sqoop where to find Hive.
>
> Here are the locations of my Sqoop, Hive and Hadoop instances:
> Hadoop: /root/siddharth/tools/hadoop-1.1.2
> Hive:   /root/siddharth/tools/hive-0.11.0-bin
> Sqoop:  /root/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0
>
> And here are a few more details after running it with --verbose.
>
> I am using the following command to import into Hive:
>
> ssk01:~/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0 # ./bin/sqoop import \
>     --connect jdbc:mysql://localhost/ClassicModels -table Customers -m 1 \
>     --hive-home /root/siddharth/tools/hive-0.11.0-bin --hive-import \
>     --verbose --mysql-delimiters
>
> Verbose output of the above command: http://pastebin.com/TcYG8vkr
>
> After running this command, here is what I see in Hive and HDFS.
>
> HDFS
> ====
> ssk01:~/siddharth/tools/hadoop-1.1.2 # bin/hadoop fs -ls \
>     hdfs://localhost:9000/user/hive/warehouse/*
> Found 2 items
> -rw-r--r--   1 root supergroup      0 2013-07-04 00:41 /user/hive/warehouse/customers/_SUCCESS
> -rw-r--r--   1 root supergroup  15569 2013-07-04 00:41 /user/hive/warehouse/customers/part-m-00000
>
> Hive (I am running Hive from its own directory, so the metadata should be
> accessible)
> ===========================================================
> ssk01:~/siddharth/tools/hive-0.11.0-bin # ./bin/hive
>
> Logging initialized using configuration in
> jar:file:/root/siddharth/tools/hive-0.11.0-bin/lib/hive-common-0.11.0.jar!/hive-log4j.properties
> Hive history
> file=/tmp/root/[email protected]_201307040044_383137545.txt
>
> hive> show databases;
> OK
> default
> Time taken: 8.035 seconds, Fetched: 1 row(s)
>
> hive> use default;
> OK
> Time taken: 0.018 seconds
>
> hive> show tables;
> OK
> Time taken: 4.175 seconds
> hive>
>
> The strange thing is that the table default.customers doesn't exist in
> Hive, even though the Sqoop output mentioned it.
>
> Thanks,
> Siddharth
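A side note to myself, assuming the default embedded Derby metastore on both
sides: if javax.jdo.option.ConnectionURL is still the stock
jdbc:derby:;databaseName=metastore_db;create=true, then metastore_db is
created relative to the working directory of whichever process starts Hive,
and the Sqoop-driven import and my interactive Hive shell could be using two
different metastores. A quick check for more than one metastore_db
directory:

ssk01:~ # find /root/siddharth/tools -maxdepth 2 -type d -name metastore_db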
>
> On Wed, Jul 3, 2013 at 9:36 PM, Jarek Jarcec Cecho <[email protected]> wrote:
> > Hi Siddharth,
> > using a directory in the LOAD DATA command is completely valid. You can
> > find more information about the command in the Hive documentation [1].
> > I would estimate that your issue is more with parsing the data than with
> > accessing it, since you are able to see the rows, just with incorrect
> > values.
> >
> > Jarcec
> >
> > Links:
> > 1: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
> >
> > On Wed, Jul 03, 2013 at 05:11:47PM +0530, Siddharth Karandikar wrote:
> >> Hi,
> >>
> >> While looking into the Hive history file, I found this query:
> >>
> >> LOAD DATA INPATH 'hdfs://localhost:9000/user/root/Customers' INTO
> >> TABLE `Customers`"
> >> QUERY_ID="root_20130703050909_882c2484-e1c8-43a3-9eff-dd0f296fc560"
> >> .....
> >>
> >> The HDFS location mentioned in this query is a directory, not a CSV
> >> file. The directory contains the part-* file(s) which hold the actual
> >> data. I don't know whether Sqoop understands this directory structure
> >> and knows how to read those multiple part-* files, or whether this is
> >> an issue.
> >>
> >> I was hit by a similar thing while creating an external table in Hive,
> >> where the location specified was such an HDFS directory (generated by
> >> sqoop import) containing multiple part-* files. The Hive table got
> >> created, but all the rows were NULL. That is why I started looking into
> >> the --hive-import option available in Sqoop. But it looks like that is
> >> not working for me either.
> >>
> >> Am I missing something?
> >>
> >> Thanks,
> >> Siddharth
> >>
> >> On Wed, Jul 3, 2013 at 4:55 PM, Siddharth Karandikar
> >> <[email protected]> wrote:
> >> > Hi,
> >> >
> >> > I am facing some problems while importing a sample database from
> >> > MySQL to Hive using Sqoop 1.4.3, Hive 0.11.0 and Hadoop 1.1.2 on a
> >> > single-node setup.
> >> >
> >> > While doing this, I always see the following message in the job logs:
> >> > Table default.customers stats: [num_partitions: 0, num_files: 2,
> >> > num_rows: 0, total_size: 15556, raw_data_size: 0]
> >> >
> >> > The job ends with a success message:
> >> > 13/07/03 05:09:30 INFO hive.HiveImport: Time taken: 0.74 seconds
> >> > 13/07/03 05:09:30 INFO hive.HiveImport: Hive import complete.
> >> > 13/07/03 05:09:30 INFO hive.HiveImport: Export directory is empty,
> >> > removing it.
> >> >
> >> > The full command and log can be found at http://pastebin.com/03f6Wdga
> >> >
> >> > I am using Sqoop for the first time and I could be missing a few
> >> > things. Any pointers to solve this problem would really help.
> >> >
> >> > MySQL to HDFS is working fine, though.
> >> >
> >> > Thanks,
> >> > Siddharth
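P.S. Regarding the external-table experiment above where every row came back
NULL: that symptom fits Jarcec's parsing theory, i.e. the table definition
not matching the delimiters in the files Sqoop wrote. A sketch of what I
would try next, reusing the column list from the CREATE TABLE above and the
tab/newline delimiters from the working import (Customers_ext is just an
illustrative name, and this assumes the part-* files are still under that
directory, since LOAD DATA INPATH moves them away):

hive> CREATE EXTERNAL TABLE Customers_ext (
          customerNumber INT, customerName STRING, contactLastName STRING,
          contactFirstName STRING, phone STRING, addressLine1 STRING,
          addressLine2 STRING, city STRING, state STRING, postalCode STRING,
          country STRING, salesRepEmployeeNumber INT, creditLimit DOUBLE)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
      STORED AS TEXTFILE
      LOCATION 'hdfs://localhost:9000/user/root/Customers';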
