Awesomeness and thank you for helping with the other forums.
Markus Kemper
Customer Operations Engineer
www.cloudera.com

On Tue, Oct 25, 2016 at 2:56 PM, Douglas Spadotto <[email protected]> wrote:

> Hi Markus,
>
> It worked fine end to end. Here goes the edited output:
>
> *SOURCE:*
>
> hive> describe extended teste1;
> OK
> id int
> nome string
>
> Detailed Table Information Table(tableName:teste1, dbName:default,
> owner:cloudera, createTime:1477408342, lastAccessTime:0, retention:0,
> sd:StorageDescriptor(cols:[FieldSchema(name:id, type:int, comment:null),
> FieldSchema(name:nome, type:string, comment:null)],
> location:hdfs://quickstart.cloudera:8020/user/hive/warehouse/teste1,
> *inputFormat:org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat,
> outputFormat:org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat,*
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
> serializationLib:org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe,
> parameters:{serialization.format=1}), bucketCols:[], sortCols:[],
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[],
> skewedColValues:[], skewedColValueLocationMaps:{}),
> storedAsSubDirectories:false), partitionKeys:[], parameters:{numFiles=2,
> transient_lastDdlTime=1477408508, COLUMN_STATS_ACCURATE=true,
> totalSize=645, numRows=3, rawDataSize=6}, viewOriginalText:null,
> viewExpandedText:null, tableType:MANAGED_TABLE)
> Time taken: 0.08 seconds, Fetched: 4 row(s)
>
> hive> select * from teste1;
> OK
> ...
> 1 Douglas
> 2 Spadotto
> 3 Doug
> Time taken: 0.323 seconds, Fetched: 3 row(s)
>
> *SQOOP:*
>
> [cloudera@quickstart ~]$ sqoop export --connect
> jdbc:postgresql://localhost/postgres --table test1 --hcatalog-table
> teste1
> Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will
> fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> 16/10/25 11:42:19 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.8.0
> 16/10/25 11:42:19 INFO manager.SqlManager: Using default fetchSize of 1000
> 16/10/25 11:42:19 INFO tool.CodeGenTool: Beginning code generation
> 16/10/25 11:42:19 INFO manager.SqlManager: Executing SQL statement: SELECT
> t.* FROM "test1" AS t LIMIT 1
> 16/10/25 11:42:19 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
> /usr/lib/hadoop-mapreduce
> Note: /tmp/sqoop-cloudera/compile/50329ee88534a5c3b6df8b37aadf2131/test1.java
> uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 16/10/25 11:42:20 INFO orm.CompilationManager: Writing jar file:
> /tmp/sqoop-cloudera/compile/50329ee88534a5c3b6df8b37aadf2131/test1.jar
> 16/10/25 11:42:20 INFO mapreduce.ExportJobBase: Beginning export of test1
> 16/10/25 11:42:21 INFO Configuration.deprecation: mapred.jar is
> deprecated. Instead, use mapreduce.job.jar
> 16/10/25 11:42:21 INFO Configuration.deprecation: mapred.map.max.attempts
> is deprecated. Instead, use mapreduce.map.maxattempts
> 16/10/25 11:42:21 INFO mapreduce.ExportJobBase: Configuring HCatalog for
> export job
> 16/10/25 11:42:21 INFO hcat.SqoopHCatUtilities: Configuring HCatalog
> specific details for job
> 16/10/25 11:42:21 INFO manager.SqlManager: Executing SQL statement: SELECT
> t.* FROM "test1" AS t LIMIT 1
> 16/10/25 11:42:21 INFO hcat.SqoopHCatUtilities: Database column names
> projected : [id, nome]
> 16/10/25 11:42:21 INFO hcat.SqoopHCatUtilities: Database column name -
> info map :
> id : [Type : 4,Precision : 10,Scale : 0]
> nome : [Type : 12,Precision : 50,Scale : 0] *<-- Got the structure here!*
>
> 16/10/25 11:42:21 INFO hive.metastore: Trying to connect to metastore with
> URI thrift://quickstart.cloudera:9083
> 16/10/25 11:42:21 INFO hive.metastore: Opened a connection to metastore,
> current connections: 1
> 16/10/25 11:42:21 INFO hive.metastore: Connected to metastore.
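[Editor's note: the "Type : 4" and "Type : 12" codes in the column-info map above are standard JDBC type constants from java.sql.Types (INTEGER = 4, VARCHAR = 12). A minimal sketch of how those codes decode for this table's two columns; the dictionary and variable names here are illustrative, not anything Sqoop exposes:]

```python
# JDBC type codes, as defined by the java.sql.Types constants.
JDBC_TYPES = {4: "INTEGER", 12: "VARCHAR"}

# Column-info map as logged by SqoopHCatUtilities for table test1.
columns = {
    "id":   {"type": 4,  "precision": 10, "scale": 0},
    "nome": {"type": 12, "precision": 50, "scale": 0},
}

for name, info in columns.items():
    # id -> INTEGER(10), nome -> VARCHAR(50)
    print(f"{name}: {JDBC_TYPES[info['type']]}({info['precision']})")
```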
> 16/10/25 11:42:21 INFO hcat.SqoopHCatUtilities: HCatalog full table schema
> fields = [id, nome]
> ...
> 16/10/25 11:42:22 WARN hcat.SqoopHCatUtilities: No files under
> /usr/lib/hive-hcatalog/share/hcatalog/storage-handlers to add to
> distributed cache for hcatalog job
> 16/10/25 11:42:22 INFO hcat.SqoopHCatUtilities: Configuring HCatalog for
> export job
> 16/10/25 11:42:22 INFO hcat.SqoopHCatUtilities: Ignoring configuration
> request for HCatalog info
> 16/10/25 11:42:22 INFO Configuration.deprecation:
> mapred.reduce.tasks.speculative.execution is deprecated. Instead, use
> mapreduce.reduce.speculative
> 16/10/25 11:42:22 INFO Configuration.deprecation:
> mapred.map.tasks.speculative.execution is deprecated. Instead, use
> mapreduce.map.speculative
> 16/10/25 11:42:22 INFO Configuration.deprecation: mapred.map.tasks is
> deprecated. Instead, use mapreduce.job.maps
> 16/10/25 11:42:22 INFO client.RMProxy: Connecting to ResourceManager at
> quickstart.cloudera/192.168.26.129:8032
> 16/10/25 11:42:49 INFO Configuration.deprecation: mapred.input.dir is
> deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
> 16/10/25 11:42:49 INFO mapred.FileInputFormat: Total input paths to
> process : 2
> 16/10/25 11:42:50 INFO mapreduce.JobSubmitter: number of splits:2
> ...
> 16/10/25 11:43:07 INFO mapreduce.ExportJobBase: Transferred 15.9258 KB in
> 45.1568 seconds (361.1415 bytes/sec)
> 16/10/25 11:43:07 INFO mapreduce.ExportJobBase: Exported 3 records.
> 16/10/25 11:43:07 INFO hive.metastore: Closed a connection to metastore,
> current connections: 0
>
> *DESTINATION:*
>
> [cloudera@quickstart ~]$ psql -d postgres -c "select * from test1"
> id | nome
> ----+----------
> 1 | Douglas
> 2 | Spadotto
> 3 | Doug
> (3 rows)
>
> Now I'll go off into the Internet and answer everyone who posted this same
> question on other forums. :)
>
> Regards,
>
> Douglas
>
>
> On Tue, Oct 25, 2016 at 4:00 PM, Markus Kemper <[email protected]> wrote:
>
>> Glad to hear it. Let us know how it goes.
>>
>>
>> Markus Kemper
>> Customer Operations Engineer
>> www.cloudera.com
>>
>>
>> On Tue, Oct 25, 2016 at 1:55 PM, Douglas Spadotto <[email protected]> wrote:
>>
>>> Hi Markus,
>>>
>>> Thank you!
>>>
>>> I tried this (the hcatalog options) myself a few minutes after I hit
>>> "send" on the e-mail. It worked fine; Sqoop was able to read the Parquet
>>> structure. Only my MapReduce job crashed, but that was due to my unstable
>>> environment.
>>>
>>> It looks like I'm on the way to the solution.
>>>
>>> Cheers,
>>>
>>> Douglas
>>>
>>> On Tue, Oct 25, 2016 at 3:32 PM, Markus Kemper <[email protected]> wrote:
>>>
>>>> Hello Douglas,
>>>>
>>>> The only workaround that I am aware of is to use the Sqoop --hcatalog
>>>> options, for example:
>>>>
>>>> sqoop export --connect <jdbc_connection_string> --table <rdbms_table>
>>>> --hcatalog-database <hive_database> --hcatalog-table <hive_table>
>>>>
>>>>
>>>>
>>>> Markus Kemper
>>>> Customer Operations Engineer
>>>> www.cloudera.com
>>>>
>>>>
>>>> On Tue, Oct 25, 2016 at 1:21 PM, Douglas Spadotto <[email protected]> wrote:
>>>>
>>>>> Hello everyone,
>>>>>
>>>>> Over the past few months I have seen quite a few messages about
>>>>> Parquet support in Sqoop, all about importing. Some of those worked
>>>>> well.
>>>>>
>>>>> For exporting, however, I'm receiving this error when trying to export
>>>>> from a Hive table stored as Parquet to PostgreSQL:
>>>>>
>>>>> [cloudera@quickstart ~]$ sqoop export --connect
>>>>> jdbc:postgresql://localhost/postgres --table test1 --export-dir
>>>>> /user/hive/warehouse/teste1
>>>>> Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports
>>>>> will fail.
>>>>> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
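[Editor's note: the --hcatalog workaround above, with the concrete values used in this thread, can be sketched as a small helper that assembles the command line. The helper name is ours, not part of Sqoop; it only builds the argv list:]

```python
# Assemble the Sqoop export invocation that worked in this thread:
# export through HCatalog (--hcatalog-*) instead of --export-dir,
# so Sqoop reads the Parquet table's schema from the Hive metastore.
def build_hcat_export_cmd(connect, rdbms_table, hive_db, hive_table):
    return [
        "sqoop", "export",
        "--connect", connect,            # e.g. jdbc:postgresql://localhost/postgres
        "--table", rdbms_table,          # destination RDBMS table
        "--hcatalog-database", hive_db,  # Hive database holding the source
        "--hcatalog-table", hive_table,  # Parquet-backed Hive table
    ]

cmd = build_hcat_export_cmd(
    "jdbc:postgresql://localhost/postgres", "test1", "default", "teste1")
print(" ".join(cmd))
```

[The key design point is what is absent: no --export-dir, so the Kite SDK path-based dataset loader is never invoked.]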
>>>>> 16/10/25 09:19:09 INFO sqoop.Sqoop: Running Sqoop version:
>>>>> 1.4.6-cdh5.8.0
>>>>> 16/10/25 09:19:09 INFO manager.SqlManager: Using default fetchSize of
>>>>> 1000
>>>>> 16/10/25 09:19:09 INFO tool.CodeGenTool: Beginning code generation
>>>>> 16/10/25 09:19:10 INFO manager.SqlManager: Executing SQL statement:
>>>>> SELECT t.* FROM "test1" AS t LIMIT 1
>>>>> 16/10/25 09:19:10 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
>>>>> /usr/lib/hadoop-mapreduce
>>>>> Note:
>>>>> /tmp/sqoop-cloudera/compile/019c3435216213411e2de14c483af692/test1.java
>>>>> uses or overrides a deprecated API.
>>>>> Note: Recompile with -Xlint:deprecation for details.
>>>>> 16/10/25 09:19:11 INFO orm.CompilationManager: Writing jar file:
>>>>> /tmp/sqoop-cloudera/compile/019c3435216213411e2de14c483af692/test1.jar
>>>>> 16/10/25 09:19:11 INFO mapreduce.ExportJobBase: Beginning export of
>>>>> test1
>>>>> 16/10/25 09:19:12 INFO Configuration.deprecation: mapred.jar is
>>>>> deprecated. Instead, use mapreduce.job.jar
>>>>> 16/10/25 09:19:12 INFO Configuration.deprecation:
>>>>> mapred.map.max.attempts is deprecated. Instead, use
>>>>> mapreduce.map.maxattempts
>>>>> 16/10/25 09:19:13 INFO manager.SqlManager: Executing SQL statement:
>>>>> SELECT t.* FROM "test1" AS t LIMIT 1
>>>>> 16/10/25 09:19:13 ERROR sqoop.Sqoop: Got exception running Sqoop:
>>>>> org.kitesdk.data.DatasetNotFoundException: Descriptor location does
>>>>> not exist: hdfs://quickstart.cloudera:8020/user/hive/warehouse/teste1/.metadata
>>>>> org.kitesdk.data.DatasetNotFoundException: Descriptor location does
>>>>> not exist: hdfs://quickstart.cloudera:8020/user/hive/warehouse/teste1/.metadata
>>>>> at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.checkExists(FileSystemMetadataProvider.java:562)
>>>>> at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.find(FileSystemMetadataProvider.java:605)
>>>>> at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.load(FileSystemMetadataProvider.java:114)
>>>>> at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:197)
>>>>> at org.kitesdk.data.Datasets.load(Datasets.java:108)
>>>>> at org.kitesdk.data.Datasets.load(Datasets.java:140)
>>>>> at org.kitesdk.data.mapreduce.DatasetKeyInputFormat$ConfigBuilder.readFrom(DatasetKeyInputFormat.java:92)
>>>>> at org.kitesdk.data.mapreduce.DatasetKeyInputFormat$ConfigBuilder.readFrom(DatasetKeyInputFormat.java:139)
>>>>> at org.apache.sqoop.mapreduce.JdbcExportJob.configureInputFormat(JdbcExportJob.java:84)
>>>>> at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:432)
>>>>> at org.apache.sqoop.manager.SqlManager.exportTable(SqlManager.java:931)
>>>>> at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:81)
>>>>> at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
>>>>> at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
>>>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>>>> at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
>>>>> at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
>>>>> at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
>>>>> at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
>>>>>
>>>>> I saw a recent JIRA opened about this,
>>>>> https://issues.apache.org/jira/browse/SQOOP-2907, and I'm wondering
>>>>> whether there is any workaround.
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> Douglas
>>>>>
>>>>> --
>>>>> Visite: http://canseidesercowboy.wordpress.com/
>>>>> Siga: @dougspadotto ou @excowboys
>>>>> -----
>>>>> Frodo: "I wish none of this had happened."
>>>>> Gandalf: "So do all who live to see such times, but that is not for
>>>>> them to decide. All we have to decide is what to do with the time
>>>>> that is given to us."
>>>>> -- Lord of the Rings: The Fellowship of the Ring (2001)
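[Editor's note: the stack trace above shows the root cause: with --export-dir, Sqoop hands the path to the Kite SDK, whose FileSystemMetadataProvider.checkExists requires a ".metadata" descriptor directory inside the dataset path. Hive-created Parquet tables never write one, hence the DatasetNotFoundException. A minimal local sketch of that precondition; the function name is ours, and a plain local directory stands in for HDFS:]

```python
import os
import tempfile

# Mimic the check Kite performs before loading a filesystem dataset:
# the table directory must contain a ".metadata" descriptor.
def has_kite_descriptor(table_dir):
    return os.path.isdir(os.path.join(table_dir, ".metadata"))

# A Hive-style warehouse directory holds only data files, no descriptor,
# so the check fails for Hive-managed Parquet tables.
with tempfile.TemporaryDirectory() as d:
    warehouse = os.path.join(d, "teste1")
    os.makedirs(warehouse)
    open(os.path.join(warehouse, "000000_0.parquet"), "w").close()
    print(has_kite_descriptor(warehouse))  # False -> use --hcatalog-* instead
```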
