[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2020-08-28 Thread Ram (Jira)


[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186551#comment-17186551
 ] 

Ram edited comment on SQOOP-2907 at 8/28/20, 2:04 PM:
--

[~yuan_zac] [~sanysand...@gmail.com] [~vasas]  [~stanleyxu2005] 

We are using *sqoop 1.4.7* to export parquet data stored in HDFS to Postgres. 
These are *plain parquet files, NOT a Hive table*.

We're still facing the same issue:

 
{code}
20/08/28 13:37:02 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetIOException: Cannot access descriptor location: hdfs:part-0-f9f92493-36a1-4714-bcc6-291c118cf599-c000/snappy/parquet/.metadata
org.kitesdk.data.DatasetIOException: Cannot access descriptor location: hdfs:part-0-f9f92493-36a1-4714-bcc6-291c118cf599-c000/snappy/parquet/.metadata
{code}
The command we're running:

 
{code:java}
/sqoop-1.4.7.bin__hadoop-2.6.0/bin/sqoop export --connect 
jdbc:postgresql:// --username  --password 
 --table  --export-dir 
hdfs:part-0-f9f92493-36a1-4714-bcc6-291c118cf599-c000.parquet
{code}
Postgres JAR - postgresql-42.2.11.jar

Please do suggest a solution ASAP.

 

 



> Export parquet files to RDBMS: don't require .metadata for parquet files
> 
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
> Issue Type: Improvement
> Components: metastore
> Affects Versions: 1.4.6
> Environment: sqoop 1.4.6; export parquet files to Oracle
> Reporter: Ruslan Dautkhanov
> Assignee: Sandish Kumar HN
> Priority: Major
> Labels: sqoop
> Attachments: SQOOP-2907-3.patch, SQOOP-2907.patch, SQOOP-2907.patch1, SQOOP-2907.patch2
>
>
> Kite currently requires .metadata.
> Parquet files have their own metadata stored alongside the data.
> It would be great if the Export operation on parquet files to RDBMS did not 
> require .metadata.
> Most of our files are created by Spark and Hive, and they don't create 
> .metadata; only Kite does.
> This makes sqoop export of parquet files of very limited use.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2017-08-08 Thread Anna Szonyi (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118505#comment-16118505
 ] 

Anna Szonyi edited comment on SQOOP-2907 at 8/8/17 3:53 PM:


Hi [~sanysand...@gmail.com],

Thanks for picking this up again! Please make sure that it applies to the 
current trunk (or, if needed, rebase it on trunk). We should also ping 
[~yuan_zac] and [~514793...@qq.com] in case the original author wants to 
reclaim their patch. If not (or if we receive no reply), please submit it for 
review!

Thanks,
Anna


was (Author: anna.szonyi):
Hi [~sanysand...@gmail.com],

Thanks for picking this up again! Please make sure that it applies to the 
current trunk (or if needed please rebase on it), also we should ping 
[~yuan_zac] and [~chenkai.dr] in case the original author wants to reclaim 
their patch. If not (or we receive no reply), then please submit it for review!

Thanks,
Anna



[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2017-08-08 Thread Anna Szonyi (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118505#comment-16118505
 ] 

Anna Szonyi edited comment on SQOOP-2907 at 8/8/17 3:52 PM:


Hi [~sanysand...@gmail.com],

Thanks for picking this up again! Please make sure that it applies to the 
current trunk (or, if needed, rebase it on trunk). We should also ping 
[~yuan_zac] and [~chenkai.dr] in case the original author wants to reclaim 
their patch. If not (or if we receive no reply), please submit it for review!

Thanks,
Anna


was (Author: anna.szonyi):
Hi [~sanysand...@gmail.com],

Thanks for picking this up again! Please make sure that it applies to the 
current trunk (or if needed please rebase on it), also we should ping 
[~yuan_zac] in case the original author wants to reclaim their patch. If not 
(or we receive no reply), then please submit it for review!

Thanks,
Anna



[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2017-08-08 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118234#comment-16118234
 ] 

Sandish Kumar HN edited comment on SQOOP-2907 at 8/8/17 11:52 AM:
--

Hi [~anna.szonyi] 

It seems this issue has been open for a long time.
The attached SQOOP-2907.patch1 works fine with a small inline change in AvroUtil. 
Can I submit the patch to Review Board?





[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2016-12-28 Thread Zac Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784701#comment-15784701
 ] 

Zac Zhou edited comment on SQOOP-2907 at 12/29/16 7:07 AM:
---

SQOOP-2907.patch1 keeps using Kite to handle parquet files, but it gets the 
schema from the parquet file directly if there is no .metadata folder. This 
approach works for parquet files generated by Hive and Spark. Some unit tests 
are added as well.
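
The fallback described above can be sketched roughly as follows. This is only an illustration, not the code in SQOOP-2907.patch1: it assumes the Kite descriptor directory is named `.metadata` (as in the issue description), it uses local `java.nio` paths instead of HDFS, and the class, enum, and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SchemaSourceSketch {
    /** Where an export would read its schema from (hypothetical names). */
    enum SchemaSource { KITE_METADATA, PARQUET_FOOTER }

    /**
     * Prefer the Kite descriptor when a .metadata directory exists under the
     * dataset directory; otherwise fall back to the schema that the Parquet
     * file carries in its own footer.
     */
    static SchemaSource chooseSchemaSource(Path datasetDir) {
        Path descriptorDir = datasetDir.resolve(".metadata");
        return Files.isDirectory(descriptorDir)
                ? SchemaSource.KITE_METADATA
                : SchemaSource.PARQUET_FOOTER;
    }

    public static void main(String[] args) throws IOException {
        // A directory as Spark or Hive would leave it: no .metadata folder.
        Path dir = Files.createTempDirectory("export-dir");
        System.out.println(chooseSchemaSource(dir));

        // The same directory after a Kite-style descriptor folder appears.
        Files.createDirectory(dir.resolve(".metadata"));
        System.out.println(chooseSchemaSource(dir));
    }
}
```

Only the decision is shown here; actually reading the schema from the Parquet footer would need the Parquet libraries and is left out.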


was (Author: yuan_zac):
SQOOP-2907.patch1 keep using kite to handle parquet file. But it would get the 
schema from parquet file directly if there is no .metastore folder. this way 
works for parquet files generated by hive and spark. some unit tests are added 
as well



[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2016-12-28 Thread Zac Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784701#comment-15784701
 ] 

Zac Zhou edited comment on SQOOP-2907 at 12/29/16 7:06 AM:
---

SQOOP-2907.patch1 keeps using Kite to handle parquet files, but it gets the 
schema from the parquet file directly if there is no .metadata folder. This 
approach works for parquet files generated by Hive and Spark. Some unit tests 
are added as well.


was (Author: yuan_zac):
get schema data from parquet file directly



[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2016-12-21 Thread chen kai (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766630#comment-15766630
 ] 

chen kai edited comment on SQOOP-2907 at 12/21/16 3:48 PM:
---

We are using Hive tables stored as parquet files, and exporting them with 
sqoop failed. This patch resolves that; the changes are based on 
`sqoop-1.4.6-cdh5.7.1`. If you want to export recursively, see [SQOOP-951], 
and then change the code at [ExportJobBase.java line:159], which has a bug in 
finding child files recursively.
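
Purely as an illustration of the recursive lookup mentioned above, here is a minimal sketch. It is not the code in ExportJobBase.java or in the SQOOP-951 patch: it walks a local directory with `java.nio` rather than the Hadoop `FileSystem` API, the class and method names are hypothetical, and skipping entries whose names start with `_` or `.` is an assumption modelled on the usual Hadoop hidden-file convention.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RecursiveExportFilesSketch {
    /** True if any component of the relative path starts with '_' or '.'. */
    static boolean isHidden(Path relative) {
        for (Path part : relative) {
            String name = part.toString();
            if (name.startsWith("_") || name.startsWith(".")) {
                return true;
            }
        }
        return false;
    }

    /** Lists regular, non-hidden files under exportDir at any depth, sorted. */
    static List<Path> listDataFiles(Path exportDir) throws IOException {
        try (Stream<Path> paths = Files.walk(exportDir)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> !isHidden(exportDir.relativize(p)))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("export-dir");
        Files.createFile(dir.resolve("a.parquet"));
        Files.createDirectory(dir.resolve("sub"));
        Files.createFile(dir.resolve("sub").resolve("b.parquet"));
        Files.createFile(dir.resolve("_SUCCESS"));
        // Prints the two .parquet files; _SUCCESS is skipped.
        listDataFiles(dir).forEach(System.out::println);
    }
}
```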


was (Author: 514793...@qq.com):
We are using hive stored as parquet file and exporting by sqoop failed. This 
patch will resolve it. If you want to export recursively, see [SQOOP-951], And 
then change [ExportJobBase.java line:159] code, finding child file recursively 
bug.



[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2016-12-21 Thread chen kai (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766630#comment-15766630
 ] 

chen kai edited comment on SQOOP-2907 at 12/21/16 9:54 AM:
---

We are using Hive tables stored as parquet files, and exporting them with 
sqoop failed. This patch resolves that. If you want to export recursively, see 
[SQOOP-951], and then change the code at [ExportJobBase.java line:159], which 
has a bug in finding child files recursively.


was (Author: 514793...@qq.com):
We are using hive stored as parquet file and exporting by sqoop failed. This 
patch will resolve it. if you want to export recursively, see [SQOOP-951], And 
then change [ExportJobBase.java line:159] code, finding child file.
