[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files
[ https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186551#comment-17186551 ]

Ram edited comment on SQOOP-2907 at 8/28/20, 2:04 PM:
--

[~yuan_zac] [~sanysand...@gmail.com] [~vasas] [~stanleyxu2005]

We are using *sqoop 1.4.7* to upload parquet data that is stored in HDFS - *Plain parquet files and NOT a Hive table*

We're still facing the same issue -
{code:java}
20/08/28 13:37:02 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetIOException: Cannot access descriptor location: hdfs:part-0-f9f92493-36a1-4714-bcc6-291c118cf599-c000/snappy/parquet/.metadata
org.kitesdk.data.DatasetIOException: Cannot access descriptor location: hdfs:part-0-f9f92493-36a1-4714-bcc6-291c118cf599-c000/snappy/parquet/.metadata
{code}
The command we're running -
{code:java}
/sqoop-1.4.7.bin__hadoop-2.6.0/bin/sqoop export --connect jdbc:postgresql:// --username --password --table --export-dir hdfs:part-0-f9f92493-36a1-4714-bcc6-291c118cf599-c000.parquet
{code}
Postgres JAR - postgresql-42.2.11.jar

Please do suggest a solution ASAP.
was (Author: ramkrishnan):
[~yuan_zac] [~sanysand...@gmail.com] [~vasas]

We are using *sqoop 1.4.7* to upload parquet data that is stored in HDFS - *Plain parquet files and NOT a Hive table*

We're still facing the same issue -
{code:java}
20/08/28 13:37:02 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetIOException: Cannot access descriptor location: hdfs:part-0-f9f92493-36a1-4714-bcc6-291c118cf599-c000/snappy/parquet/.metadata
org.kitesdk.data.DatasetIOException: Cannot access descriptor location: hdfs:part-0-f9f92493-36a1-4714-bcc6-291c118cf599-c000/snappy/parquet/.metadata
{code}
The command we're running -
{code:java}
/sqoop-1.4.7.bin__hadoop-2.6.0/bin/sqoop export --connect jdbc:postgresql:// --username --password --table --export-dir hdfs:part-0-f9f92493-36a1-4714-bcc6-291c118cf599-c000.parquet
{code}
Postgres JAR - postgresql-42.2.11.jar

Please do suggest a solution ASAP.

> Export parquet files to RDBMS: don't require .metadata for parquet files
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
> Issue Type: Improvement
> Components: metastore
> Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
>              export parquet files to Oracle
> Reporter: Ruslan Dautkhanov
> Assignee: Sandish Kumar HN
> Priority: Major
> Labels: sqoop
> Attachments: SQOOP-2907-3.patch, SQOOP-2907.patch, SQOOP-2907.patch1, SQOOP-2907.patch2
>
> Kite currently requires .metadata.
> Parquet files have their own metadata stored along data files.
> It would be great for the Export operation on parquet files to RDBMS not to
> require .metadata.
> Most of our files are created by Spark and Hive, and they don't create
> .metadata; it is only Kite that does.
> This severely limits the usability of sqoop export for parquet files.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
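The issue description points out that Parquet files carry their own metadata alongside the data. As a rough illustration only (this is not how Sqoop or Kite reads Parquet), the sketch below locates that footer using just the file layout: a Parquet file starts with the 4-byte magic `PAR1` and ends with the serialized file metadata, a 4-byte little-endian metadata length, and `PAR1` again. The function name `footer_span` is made up for this example, and it works on a local file rather than on HDFS.

```python
import os
import struct

PARQUET_MAGIC = b"PAR1"


def footer_span(path):
    """Return (offset, length) of the footer metadata in a local Parquet file.

    Hypothetical helper for illustration. It only locates the footer bytes;
    decoding the Thrift-encoded schema inside them needs a real Parquet library.
    """
    size = os.path.getsize(path)
    # Smallest possible layout: leading magic + 4-byte length + trailing magic.
    if size < 12:
        raise ValueError("file is too small to be a Parquet file")
    with open(path, "rb") as f:
        if f.read(4) != PARQUET_MAGIC:
            raise ValueError("missing leading PAR1 magic")
        f.seek(-8, os.SEEK_END)
        tail = f.read(8)
    if tail[4:] != PARQUET_MAGIC:
        raise ValueError("missing trailing PAR1 magic")
    (length,) = struct.unpack("<I", tail[:4])
    offset = size - 8 - length
    if offset < 4:
        raise ValueError("footer length does not fit inside the file")
    return offset, length
```

The point of the sketch is that the schema lives inside the data file itself, so nothing in the format requires a separate `.metadata` directory next to it.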
[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files
[ https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118505#comment-16118505 ]

Anna Szonyi edited comment on SQOOP-2907 at 8/8/17 3:53 PM:

Hi [~sanysand...@gmail.com],

Thanks for picking this up again!
Please make sure that it applies to the current trunk (or, if needed, please rebase on it). Also, we should ping [~yuan_zac] and [~514793...@qq.com] in case the original author wants to reclaim their patch. If not (or we receive no reply), then please submit it for review!

Thanks,
Anna

was (Author: anna.szonyi):
Hi [~sanysand...@gmail.com],

Thanks for picking this up again!
Please make sure that it applies to the current trunk (or if needed please rebase on it), also we should ping [~yuan_zac] and [~chenkai.dr] in case the original author wants to reclaim their patch. If not (or we receive no reply), then please submit it for review!

Thanks,
Anna

> Export parquet files to RDBMS: don't require .metadata for parquet files
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
> Issue Type: Improvement
> Components: metastore
> Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
>              export parquet files to Oracle
> Reporter: Ruslan Dautkhanov
> Assignee: Sandish Kumar HN
> Attachments: SQOOP-2907.patch, SQOOP-2907.patch1
[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files
[ https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118505#comment-16118505 ]

Anna Szonyi edited comment on SQOOP-2907 at 8/8/17 3:52 PM:

Hi [~sanysand...@gmail.com],

Thanks for picking this up again!
Please make sure that it applies to the current trunk (or if needed please rebase on it), also we should ping [~yuan_zac] and [~chenkai.dr] in case the original author wants to reclaim their patch. If not (or we receive no reply), then please submit it for review!

Thanks,
Anna

was (Author: anna.szonyi):
Hi [~sanysand...@gmail.com],

Thanks for picking this up again!
Please make sure that it applies to the current trunk (or if needed please rebase on it), also we should ping [~yuan_zac] in case the original author wants to reclaim their patch. If not (or we receive no reply), then please submit it for review!

Thanks,
Anna

> Export parquet files to RDBMS: don't require .metadata for parquet files
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
> Issue Type: Improvement
> Components: metastore
> Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
>              export parquet files to Oracle
> Reporter: Ruslan Dautkhanov
> Assignee: Sandish Kumar HN
> Attachments: SQOOP-2907.patch, SQOOP-2907.patch1
[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files
[ https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118234#comment-16118234 ]

Sandish Kumar HN edited comment on SQOOP-2907 at 8/8/17 11:52 AM:
--

Hi [~anna.szonyi]

It seems this issue has been open for a long time. The attached SQOOP-2907.patch1 works fine with a small inline change in AvroUtil. Can I submit the patch to Review Board?

was (Author: sanysand...@gmail.com):
[~anna.szonyi]

It seems issue there from a long time. The attached SQOOP-2907.patch1 works fine with small inline change at AvroUtil. can I submit the patch at review board??

> Export parquet files to RDBMS: don't require .metadata for parquet files
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
> Issue Type: Improvement
> Components: metastore
> Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
>              export parquet files to Oracle
> Reporter: Ruslan Dautkhanov
> Assignee: Sandish Kumar HN
> Attachments: SQOOP-2907.patch, SQOOP-2907.patch1
[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files
[ https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784701#comment-15784701 ]

Zac Zhou edited comment on SQOOP-2907 at 12/29/16 7:07 AM:
---

SQOOP-2907.patch1 keeps using Kite to handle parquet files, but it gets the schema from the parquet file directly if there is no .metastore folder.
This approach works for parquet files generated by Hive and Spark.
Some unit tests are added as well.

was (Author: yuan_zac):
SQOOP-2907.patch1 keep using kite to handle parquet file. But it would get the schema from parquet file directly if there is no .metastore folder.
this way works for parquet files generated by hive and spark.
some unit tests are added as well

> Export parquet files to RDBMS: don't require .metadata for parquet files
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
> Issue Type: Improvement
> Components: metastore
> Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
>              export parquet files to Oracle
> Reporter: Ruslan Dautkhanov
> Attachments: SQOOP-2907.patch, SQOOP-2907.patch1
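The fallback described in the comment above can be pictured roughly as follows. This is only a sketch of the decision, not the code in SQOOP-2907.patch1: the function name `schema_source`, the returned labels, and the use of a local directory in place of HDFS are all assumptions made for illustration. It also assumes the marker directory is named `.metadata` (as in the issue title), and that data files are those whose names do not start with `.` or `_`.

```python
from pathlib import Path


def schema_source(dataset_dir):
    """Decide where an export would take its schema from (hypothetical helper).

    Returns ("kite-descriptor", path) when a .metadata directory exists,
    otherwise ("parquet-footer", path) for the first data file found.
    """
    root = Path(dataset_dir)
    descriptor = root / ".metadata"
    if descriptor.is_dir():
        # Dataset was written through Kite: keep using its descriptor.
        return ("kite-descriptor", descriptor)
    # No descriptor (e.g. files written by Hive or Spark): fall back to the
    # schema embedded in a data file. Names starting with "." or "_" are
    # treated as hidden/bookkeeping files and skipped (assumption).
    data_files = sorted(
        p for p in root.iterdir()
        if p.is_file() and not p.name.startswith((".", "_"))
    )
    if not data_files:
        raise FileNotFoundError(f"no data files found in {root}")
    return ("parquet-footer", data_files[0])
```

Taking the schema from a single file assumes every file in the directory shares that schema; a real implementation would have to decide how to handle directories where that does not hold.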
[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files
[ https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784701#comment-15784701 ]

Zac Zhou edited comment on SQOOP-2907 at 12/29/16 7:06 AM:
---

SQOOP-2907.patch1 keep using kite to handle parquet file. But it would get the schema from parquet file directly if there is no .metastore folder.
this way works for parquet files generated by hive and spark.
some unit tests are added as well

was (Author: yuan_zac):
get schema data from parquet file directly

> Export parquet files to RDBMS: don't require .metadata for parquet files
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
> Issue Type: Improvement
> Components: metastore
> Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
>              export parquet files to Oracle
> Reporter: Ruslan Dautkhanov
> Attachments: SQOOP-2907.patch, SQOOP-2907.patch1
[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files
[ https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766630#comment-15766630 ]

chen kai edited comment on SQOOP-2907 at 12/21/16 3:48 PM:
---

We are using Hive tables stored as parquet files, and exporting them with sqoop failed. This patch resolves it; the changes are based on `sqoop-1.4.6-cdh5.7.1`.
If you want to export recursively, see [SQOOP-951], and then change the code at [ExportJobBase.java line:159] to fix the bug in finding child files recursively.

was (Author: 514793...@qq.com):
We are using hive stored as parquet file and exporting by sqoop failed. This patch will resolve it.
If you want to export recursively, see [SQOOP-951], And then change [ExportJobBase.java line:159] code, finding child file recursively bug.

> Export parquet files to RDBMS: don't require .metadata for parquet files
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
> Issue Type: Improvement
> Components: metastore
> Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
>              export parquet files to Oracle
> Reporter: Ruslan Dautkhanov
> Attachments: SQOOP-2907.patch
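To make "finding child files recursively" concrete, here is a small sketch of the idea. It is not the code from SQOOP-951 or from ExportJobBase.java; the function name `list_data_files` is invented, and a local directory walked with `os.walk` stands in for an HDFS export directory. Names starting with `.` or `_` (for example `.metadata` or `_SUCCESS`) are skipped, following the usual Hadoop convention for hidden files.

```python
import os


def list_data_files(export_dir):
    """Collect data files under export_dir, descending into subdirectories.

    Hypothetical helper for illustration; works on the local filesystem only.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(export_dir):
        # Prune hidden directories in place so os.walk does not descend into
        # them, and sort the rest for a deterministic traversal order.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith((".", "_")))
        for name in sorted(filenames):
            if not name.startswith((".", "_")):
                found.append(os.path.join(dirpath, name))
    return found
```

With a layout such as `a.parquet`, `_SUCCESS`, `sub/b.parquet` and `.metadata/descriptor.properties`, only the two `.parquet` files are returned, the top-level one first.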
[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files
[ https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766630#comment-15766630 ]

chen kai edited comment on SQOOP-2907 at 12/21/16 9:54 AM:
---

We are using hive stored as parquet file and exporting by sqoop failed. This patch will resolve it.
If you want to export recursively, see [SQOOP-951], And then change [ExportJobBase.java line:159] code, finding child file recursively bug.

was (Author: 514793...@qq.com):
We are using hive stored as parquet file and exporting by sqoop failed. This patch will resolve it.
if you want to export recursively, see [SQOOP-951], And then change [ExportJobBase.java line:159] code, finding child file.

> Export parquet files to RDBMS: don't require .metadata for parquet files
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
> Issue Type: Improvement
> Components: metastore
> Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
>              export parquet files to Oracle
> Reporter: Ruslan Dautkhanov
> Attachments: SQOOP-2907.patch