Hi,

When I tried doing this MySQL-to-Hive import in 3 separate steps, it
worked!

Here is what I ran -

1) Import the MySQL table into HDFS with this command:
ssk01:~/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0 # ./bin/sqoop import
--connect jdbc:mysql://localhost/ClassicModels -table Customers -m 1
--verbose --fields-terminated-by '\t' --lines-terminated-by '\n'
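
As a sanity check before step 2, listing the import directory (the
default target dir, which also appears in the LOAD DATA query of step 3)
should show the generated part-* files:
ssk01:~/siddharth/tools/hadoop-1.1.2 # bin/hadoop fs -ls hdfs://localhost:9000/user/root/Customers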

2) Create a table in Hive with this query (the same query found in the
--verbose log of the failing sqoop run):
hive> CREATE TABLE IF NOT EXISTS `Customers` (
          `customerNumber` INT, `customerName` STRING, `contactLastName` STRING,
          `contactFirstName` STRING, `phone` STRING, `addressLine1` STRING,
          `addressLine2` STRING, `city` STRING, `state` STRING,
          `postalCode` STRING, `country` STRING,
          `salesRepEmployeeNumber` INT, `creditLimit` DOUBLE)
      COMMENT 'Imported by sqoop on 2013/07/04 00:41:14'
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
      LINES TERMINATED BY '\n'
      STORED AS TEXTFILE;
OK
Time taken: 0.214 seconds
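
To double-check that the schema was registered as expected, a quick
DESCRIBE works as a sanity check:
hive> DESCRIBE Customers;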

3) Load data from HDFS with this query (the same query found in the
--verbose log of the failing sqoop run):
hive> LOAD DATA INPATH 'hdfs://localhost:9000/user/root/Customers' INTO
TABLE `Customers`;
Loading data to table default.customers
Table default.customers stats: [num_partitions: 0, num_files: 2, num_rows:
0, total_size: 15556, raw_data_size: 0]
OK
Time taken: 0.317 seconds
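
Besides the count below, spot-checking a few rows confirms the values
parsed correctly (and are not all NULL, which was my earlier symptom):
hive> SELECT * FROM Customers LIMIT 5;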

hive> select count(*) from Customers;
.......
Total MapReduce CPU Time Spent: 2 seconds 460 msec
OK
122
Time taken: 26.873 seconds, Fetched: 1 row(s)


So the table got created and populated properly when I ran the steps
manually. I now wonder why the same sequence fails when I use
--hive-import.


Any pointers to debug this issue?


Thanks,
Siddharth




On Thu, Jul 4, 2013 at 12:21 PM, Siddharth Karandikar <
[email protected]> wrote:

> Hi Jarek,
>
> I have not re-configured Hive; I am using the default
> settings/locations. I am using --hive-home to tell sqoop where to find
> Hive.
>
> Here are the locations of my sqoop, Hive and Hadoop instances.
> Hadoop:    /root/siddharth/tools/hadoop-1.1.2
> Hive:    /root/siddharth/tools/hive-0.11.0-bin
> Sqoop:    /root/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0
>
>
> And here are few more details after running it with verbose.
>
> I am using following command to import into hive:
> ssk01:~/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0 # ./bin/sqoop
> import --connect jdbc:mysql://localhost/ClassicModels -table Customers
> -m 1 --hive-home /root/siddharth/tools/hive-0.11.0-bin --hive-import
> --verbose --mysql-delimiters
>
> Verbose output of above command:
> http://pastebin.com/TcYG8vkr
>
> After running this command here is what I see in Hive and HDFS
>
> HDFS
> =====
> ssk01:~/siddharth/tools/hadoop-1.1.2 # bin/hadoop fs -ls
> hdfs://localhost:9000/user/hive/warehouse/*
> Found 2 items
> -rw-r--r--   1 root supergroup          0 2013-07-04 00:41
> /user/hive/warehouse/customers/_SUCCESS
> -rw-r--r--   1 root supergroup      15569 2013-07-04 00:41
> /user/hive/warehouse/customers/part-m-00000
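>
> To see which delimiters actually ended up in the file, the first few
> lines can be dumped directly, e.g.:
> ssk01:~/siddharth/tools/hadoop-1.1.2 # bin/hadoop fs -cat /user/hive/warehouse/customers/part-m-00000 | head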
>
>
> Hive (I am running Hive from its own directory so metadata should be
> accessible)
> ===========================================================
> ssk01:~/siddharth/tools/hive-0.11.0-bin # ./bin/hive
>
> Logging initialized using configuration in
>
> jar:file:/root/siddharth/tools/hive-0.11.0-bin/lib/hive-common-0.11.0.jar!/hive-log4j.properties
> Hive history
> file=/tmp/root/[email protected]_201307040044_383137545.txt
> hive> show databases;
> OK
> default
> Time taken: 8.035 seconds, Fetched: 1 row(s)
>
> hive> use default;
> OK
> Time taken: 0.018 seconds
>
> hive> show tables;
> OK
> Time taken: 4.175 seconds
> hive>
>
> The strange thing is that the table default.customers doesn't exist in
> Hive, even though the sqoop output mentioned it.
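>
> One thing I still need to rule out (an assumption on my side): Hive
> 0.11 defaults to an embedded Derby metastore, and Derby creates its
> metastore_db directory relative to the current working directory. If
> sqoop ran Hive from a different working directory, the table would have
> been registered in a different metastore_db than the one my interactive
> Hive session reads. Checking for stray metastore_db directories should
> tell:
>
> ls -d /root/siddharth/tools/sqoop-1.4.3.bin__hadoop-1.0.0/metastore_db
> ls -d /root/siddharth/tools/hive-0.11.0-bin/metastore_db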
>
>
> Thanks,
> Siddharth
>
>
>
> On Wed, Jul 3, 2013 at 9:36 PM, Jarek Jarcec Cecho <[email protected]>
> wrote:
> > Hi Siddharth,
> > using a directory in the LOAD DATA command is completely valid. You
> > can find more information about the command in the Hive documentation
> > [1]. Since you are able to see the rows, just with incorrect values, I
> > would guess that your issue is more with parsing the data than with
> > accessing it.
> >
> > Jarcec
> >
> > Links:
> > 1: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
> >
> > On Wed, Jul 03, 2013 at 05:11:47PM +0530, Siddharth Karandikar wrote:
> >> Hi,
> >>
> >> While looking into Hive history file, I found this query.
> >>
> >> LOAD DATA INPATH 'hdfs://localhost:9000/user/root/Customers' INTO
> >> TABLE `Customers`"
> >> QUERY_ID="root_20130703050909_882c2484-e1c8-43a3-9eff-dd0f296fc560"
> >> .....
> >>
> >> The HDFS location mentioned in this query is a directory, not a CSV
> >> file. This directory contains the part-* file(s) which hold the
> >> actual data. I don't know whether this directory structure is
> >> understood and those multiple part-* files get read correctly. Or is
> >> this an issue?
> >>
> >> I was hit by a similar thing while creating an external table in
> >> Hive, where the location specified was such an HDFS directory
> >> (generated by sqoop import) containing multiple part-* files. The
> >> Hive table got created, but all the rows were NULL. That's why I
> >> started looking into the --hive-import option available in sqoop. But
> >> it looks like that is also not working for me.
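> >>
> >> My guess for the NULLs (just an assumption on my part) is a delimiter
> >> mismatch: Hive's default text format expects ^A-separated fields,
> >> while the sqoop output files use whatever delimiters sqoop wrote
> >> (comma by default), so the whole line lands in the first column and
> >> everything parses as NULL. Declaring the delimiters explicitly when
> >> creating the external table, e.g. (assuming comma-separated data):
> >>
> >> CREATE EXTERNAL TABLE Customers (customerNumber INT,
> >>   customerName STRING, contactLastName STRING, contactFirstName STRING,
> >>   phone STRING, addressLine1 STRING, addressLine2 STRING, city STRING,
> >>   state STRING, postalCode STRING, country STRING,
> >>   salesRepEmployeeNumber INT, creditLimit DOUBLE)
> >> ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
> >> LINES TERMINATED BY '\n'
> >> STORED AS TEXTFILE
> >> LOCATION 'hdfs://localhost:9000/user/root/Customers';
> >>
> >> should presumably avoid that.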
> >>
> >> Am I missing something?
> >>
> >>
> >> Thanks,
> >> Siddharth
> >>
> >> On Wed, Jul 3, 2013 at 4:55 PM, Siddharth Karandikar
> >> <[email protected]> wrote:
> >> > Hi,
> >> >
> >> > I am facing some problems while importing a sample database from MySQL
> >> > to Hive using Sqoop 1.4.3, Hive 0.11.0 and Hadoop 1.1.2 on a single
> >> > node setup.
> >> >
> >> > While doing this, I am always seeing the following message in the
> >> > job logs -
> >> > Table default.customers stats: [num_partitions: 0, num_files: 2,
> >> > num_rows: 0, total_size: 15556, raw_data_size: 0]
> >> >
> >> > Job ends with success message -
> >> > 13/07/03 05:09:30 INFO hive.HiveImport: Time taken: 0.74 seconds
> >> > 13/07/03 05:09:30 INFO hive.HiveImport: Hive import complete.
> >> > 13/07/03 05:09:30 INFO hive.HiveImport: Export directory is empty,
> removing it.
> >> >
> >> > Full command and log can be found at - http://pastebin.com/03f6Wdga
> >> >
> >> > I am using Sqoop for the first time, so I could be missing a few
> >> > things. Any pointers to solve this problem would really help.
> >> >
> >> >
> >> > MySQL to HDFS is working fine though.
> >> >
> >> >
> >> > Thanks,
> >> > Siddharth
>
