[bcc:sqoop-u...@cloudera.org, to:sqoop-user@incubator.apache.org]

Kevin,

It looks like the transfer of data was successful, but there was a problem invoking Hive. What version of Hive are you using? Hive by default generates a session log file in /tmp. Search for hive.log in /tmp and let us know what you find in there. From the looks of it, this appears to be a classpath issue.
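For example, something like this (the hive.log location is the default session log path mentioned above; the lib path is a guess based on the --hive-home you passed):

    # locate the Hive session log and look at the tail of it
    find /tmp -name 'hive.log*' 2>/dev/null
    tail -n 50 /tmp/hive.log

    # check whether a jline jar is actually present on Hive's classpath
    ls /usr/share/brisk/hive/lib | grep -i jline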
Thanks,
Arvind

Note: Please subscribe to sqoop-user@incubator.apache.org and direct further responses there.

On Thu, Aug 4, 2011 at 5:38 PM, Kevin <kevinfift...@gmail.com> wrote:
> I appreciate your response Arvind. I tried the --direct route you
> mentioned and it seems to have fixed the problems I mentioned earlier.
> Unfortunately, I haven't been successful with Sqoop yet. I'm running
> into this problem:
>
> After executing:
>
> sqoop import --direct --connect jdbc:postgresql://query-4.redfintest.com:5432/stingray_6_5_d --username redfin_oltp -P --table brokerages --hive-import --hive-home /usr/share/brisk/hive/ --target-dir /data/qa-metrics/
>
> I get:
>
> 11/08/04 17:26:46 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
> 11/08/04 17:26:46 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
> 11/08/04 17:26:46 INFO manager.SqlManager: Using default fetchSize of 1000
> 11/08/04 17:26:46 INFO tool.CodeGenTool: Beginning code generation
> 11/08/04 17:26:47 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "brokerages" AS t LIMIT 1
> 11/08/04 17:26:47 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "brokerages" AS t LIMIT 1
> 11/08/04 17:26:47 INFO orm.CompilationManager: HADOOP_HOME is /usr/lib/hadoop
> 11/08/04 17:26:47 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop/hadoop-0.20.2-cdh3u1-core.jar
> 11/08/04 17:26:48 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/8d97f121b1707576d1574cb5ba4653b0/brokerages.jar
> 11/08/04 17:26:48 INFO manager.DirectPostgresqlManager: Beginning psql fast path import
> 11/08/04 17:26:48 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "brokerages" AS t LIMIT 1
> 11/08/04 17:26:48 INFO manager.DirectPostgresqlManager: Performing import of table brokerages from database stingray_6_5_d
> 11/08/04 17:26:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 11/08/04 17:26:48 INFO manager.DirectPostgresqlManager: Transfer loop complete.
> 11/08/04 17:26:48 INFO manager.DirectPostgresqlManager: Transferred 78.8574 KB in 0.0396 seconds (1.9445 MB/sec)
> 11/08/04 17:26:48 INFO hive.HiveImport: Loading uploaded data into Hive
> 11/08/04 17:26:48 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "brokerages" AS t LIMIT 1
> 11/08/04 17:26:48 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "brokerages" AS t LIMIT 1
> 11/08/04 17:26:48 WARN hive.TableDefWriter: Column created_date had to be cast to a less precise type in Hive
> 11/08/04 17:26:48 INFO hive.HiveImport: Exception in thread "main" java.lang.NoClassDefFoundError: jline/ArgumentCompletor$ArgumentDelimiter
> 11/08/04 17:26:48 INFO hive.HiveImport:   at java.lang.Class.forName0(Native Method)
> 11/08/04 17:26:48 INFO hive.HiveImport:   at java.lang.Class.forName(Class.java:247)
> 11/08/04 17:26:48 INFO hive.HiveImport:   at org.apache.hadoop.util.RunJar.main(RunJar.java:179)
> 11/08/04 17:26:48 INFO hive.HiveImport: Caused by: java.lang.ClassNotFoundException: jline.ArgumentCompletor$ArgumentDelimiter
> 11/08/04 17:26:48 INFO hive.HiveImport:   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> 11/08/04 17:26:48 INFO hive.HiveImport:   at java.security.AccessController.doPrivileged(Native Method)
> 11/08/04 17:26:48 INFO hive.HiveImport:   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> 11/08/04 17:26:48 INFO hive.HiveImport:   at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> 11/08/04 17:26:48 INFO hive.HiveImport:   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> 11/08/04 17:26:48 INFO hive.HiveImport:   at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> 11/08/04 17:26:48 INFO hive.HiveImport:   ... 3 more
> 11/08/04 17:26:48 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive exited with status 1
>   at com.cloudera.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:326)
>   at com.cloudera.sqoop.hive.HiveImport.executeScript(HiveImport.java:276)
>   at com.cloudera.sqoop.hive.HiveImport.importTable(HiveImport.java:218)
>   at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:362)
>   at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:423)
>   at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>   at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
>   at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:219)
>   at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:228)
>   at com.cloudera.sqoop.Sqoop.main(Sqoop.java:237)
>
> It seems to be a Hive issue. I haven't had any luck figuring out a
> solution. Is it possible that my Hive installation is corrupt?
>
>
> On Aug 2, 6:18 pm, "arv...@cloudera.com" <arv...@cloudera.com> wrote:
>> [bcc:sqoop-u...@cloudera.org, to:sqoop-u...@incubator.apache.org.
>> Please move the conversation over to the Apache mailing list.]
>>
>> Kevin,
>>
>> The OOM error you pointed out is raised when the proportion of VM
>> time spent in GC crosses a high threshold that should normally not be
>> reached. This can happen if the heap space for the map task is small
>> enough to be comparable to the size of the records you are dealing
>> with. You could try increasing your heap space by setting the property
>> mapred.child.java.opts to something like -Xmx4096m, assuming your
>> nodes have that much memory to spare.
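>> For example (assuming you are launching the import from the sqoop
>> command line, which passes generic Hadoop options through to the job),
>> something like:
>>
>>   sqoop import -D mapred.child.java.opts=-Xmx4096m --connect ...
>>
>> Note that generic -D options must come before the tool-specific
>> arguments.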
>> You can also add another switch to this property, -XX:-UseGCOverheadLimit,
>> which disables the VM policy that produces OOM errors like the one you
>> are seeing; however, doing so may not be of any help.
>>
>> Alternatively, you could try using the direct mode of import from the
>> PostgreSQL server by specifying the --direct option during the import.
>> This option requires that the PostgreSQL client (psql) be installed on
>> the nodes where the map tasks will be executed.
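>> A quick way to verify that on a given node might be something like
>> this (a sketch; the exact package providing psql varies by
>> distribution):
>>
>>   which psql || echo "psql not found on $(hostname)"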
>>
>> Thanks,
>> Arvind
>>
>> On Tue, Aug 2, 2011 at 5:09 PM, Kevin <kevinfift...@gmail.com> wrote:
>> > Hi all,
>> >
>> > I am trying to use Sqoop alongside Brisk. For those who don't know,
>> > Brisk is DataStax's Hadoop/Hive distribution powered by Cassandra:
>> > http://www.datastax.com/products/brisk
>> >
>> > I'm attempting to use Sqoop to transfer data from a PostgreSQL db to
>> > Hive. I used this command:
>> >
>> > sqoop import --connect jdbc:postgresql://idb.corp.redfin.com:5432/metrics --username redfin_readonly -P --table metrics --target-dir /data/qatest --hive-import
>> >
>> > The end of the output is:
>> >
>> > 11/08/02 16:50:15 INFO mapred.LocalJobRunner:
>> > 11/08/02 16:50:16 INFO mapred.JobClient:  map 100% reduce 0%
>> > 11/08/02 16:50:18 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
>> > 11/08/02 16:50:18 WARN mapred.LocalJobRunner: job_local_0001
>> > java.lang.OutOfMemoryError: GC overhead limit exceeded
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/6026311161398729600_27875872_1046092330/file/usr/lib/sqoop/lib/ant-contrib-1.0b3.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/-2957435182115348485_1795484550_1046091330/file/usr/lib/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/8769703156041621920_1132758636_1046087330/file/usr/lib/sqoop/lib/avro-1.5.1.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/-9111846482501201198_-633786885_1046093330/file/usr/lib/sqoop/lib/avro-ipc-1.5.1.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/-7340432634222599452_756368084_1046091330/file/usr/lib/sqoop/lib/avro-mapred-1.5.1.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/-5046079240639376542_-1808425119_1046090330/file/usr/lib/sqoop/lib/commons-io-1.4.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/8537290295187062884_-810674145_1046086330/file/usr/lib/sqoop/lib/ivy-2.0.0-rc2.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/-3739620623688167588_-1832479804_1046082330/file/usr/lib/sqoop/lib/jackson-core-asl-1.7.3.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/3083352659231038596_-1724007002_1046089330/file/usr/lib/sqoop/lib/jackson-mapper-asl-1.7.3.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/2334745090627744860_-1029425194_1046082330/file/usr/lib/sqoop/lib/jopt-simple-3.2.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/4321476485305182066_-92574265_1046090330/file/usr/lib/sqoop/lib/paranamer-2.3.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/-5164030306491852882_252469521_1046081330/file/usr/lib/sqoop/lib/snappy-java-1.0.3-rc2.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/7943398653543290704_-1938786533_204956683/file/usr/lib/sqoop/postgresql-9.0-801.jdbc4.jar
>> > 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager: Deleted path /tmp/hadoop-root/mapred/local/archive/3916205498081349063_799987770_1046094330/file/usr/lib/sqoop/sqoop-1.3.0-cdh3u1.jar
>> > 11/08/02 16:50:19 INFO mapred.JobClient: Job complete: job_local_0001
>> > 11/08/02 16:50:19 INFO mapred.JobClient: Counters: 6
>> > 11/08/02 16:50:19 INFO mapred.JobClient:   FileSystemCounters
>> > 11/08/02 16:50:19 INFO mapred.JobClient:     FILE_BYTES_READ=4628309
>> > 11/08/02 16:50:19 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=32313964098
>> > 11/08/02 16:50:19 INFO mapred.JobClient:   Map-Reduce Framework
>> > 11/08/02 16:50:19 INFO mapred.JobClient:     Map input records=128000
>> > 11/08/02 16:50:19 INFO mapred.JobClient:     Spilled Records=0
>> > 11/08/02 16:50:19 INFO mapred.JobClient:     SPLIT_RAW_BYTES=87
>> > 11/08/02 16:50:19 INFO mapred.JobClient:     Map output records=128000
>> > 11/08/02 16:50:19 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 865.1743 seconds (0 bytes/sec)
>> > 11/08/02 16:50:19 INFO mapreduce.ImportJobBase: Retrieved 128000 records.
>> > 11/08/02 16:50:19 ERROR tool.ImportTool: Error during import: Import job failed!
>> >
>> > The output indicates that 128000 records were retrieved, but that the
>> > import job failed with 0 bytes transferred. One source of the problem
>> > I can see is "java.lang.OutOfMemoryError: GC overhead limit exceeded".
>> > This problem has stumped me for a while. Any input would be greatly
>> > appreciated, thanks!
>
> --
> NOTE: The mailing list sqoop-u...@cloudera.org is deprecated in favor of the
> Apache Sqoop mailing list sqoop-user@incubator.apache.org. Please subscribe
> to it by sending an email to incubator-sqoop-user-subscr...@apache.org.