[jira] [Commented] (SQOOP-2903) Add Kudu connector for Sqoop

2019-10-01 Thread Sandish Kumar HN (Jira)


[ 
https://issues.apache.org/jira/browse/SQOOP-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942337#comment-16942337
 ] 

Sandish Kumar HN commented on SQOOP-2903:
-

[~sabhyankar] Are you still working on this? If not, I can take the existing 
patch and make it work with the current version of Sqoop. [~vasas] What is the 
process for picking up an existing patch?

> Add Kudu connector for Sqoop
> 
>
> Key: SQOOP-2903
> URL: https://issues.apache.org/jira/browse/SQOOP-2903
> Project: Sqoop
>  Issue Type: Improvement
>  Components: connectors
>Reporter: Sameer Abhyankar
>Assignee: Sameer Abhyankar
>Priority: Major
> Attachments: SQOOP-2903.1.patch, SQOOP-2903.2.patch, SQOOP-2903.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Sqoop currently does not have a connector for Kudu. We should add the 
> functionality to allow Sqoop to ingest data directly into Kudu.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (SQOOP-3148) Sqoop Integration for Apache Kudu

2018-06-18 Thread Sandish Kumar HN (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN resolved SQOOP-3148.
-
Resolution: Duplicate

> Sqoop Integration for Apache Kudu
> -
>
> Key: SQOOP-3148
> URL: https://issues.apache.org/jira/browse/SQOOP-3148
> Project: Sqoop
>  Issue Type: New Feature
>  Components: sqoop2-framework
>Affects Versions: no-release
> Environment: Hadoop
>Reporter: Mohan kumar
>Priority: Major
>  Labels: features
> Fix For: 2.0.0
>
>
> Hi Team,
> As you are aware, we have started using Apache Kudu for analytic solutions.
> Can the Sqoop framework perform loads and unloads between Apache Kudu and our 
> current source systems (Teradata, Oracle, SQL Server)?
> We are requesting a framework where we can directly read and load data into 
> Apache Kudu tables.
> Presently, we need to load data into HDFS first and then create the 
> Impala/Kudu tables. It would be nice to have an option to load data directly 
> into Kudu.
> Thanks,
> Mohan



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-2903) Add Kudu connector for Sqoop

2018-05-25 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491104#comment-16491104
 ] 

Sandish Kumar HN commented on SQOOP-2903:
-

[~vasas] The Kudu Java client has MiniKuduCluster support now: 
[https://github.com/cloudera/kudu/blob/master/java/kudu-client/src/test/java/org/apache/kudu/client/MiniKuduCluster.java]
[~sabhyankar]
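
For anyone picking this up, a minimal sketch (not taken from the attached 
patches) of how a connector could write rows through the Kudu Java client; the 
master address, table name, and column names are hypothetical, and a 
kudu-client dependency on the classpath is assumed:

{code}
import org.apache.kudu.client.Insert;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduException;
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.KuduTable;
import org.apache.kudu.client.PartialRow;

public class KuduIngestSketch {
  public static void main(String[] args) throws KuduException {
    // Hypothetical master address and table/column names, for illustration only.
    KuduClient client = new KuduClient.KuduClientBuilder("kudu-master:7051").build();
    try {
      KuduTable table = client.openTable("example_table");
      KuduSession session = client.newSession();
      Insert insert = table.newInsert();
      PartialRow row = insert.getRow();
      row.addInt("id", 1);
      row.addString("name", "sample");
      session.apply(insert);   // buffers/sends the insert
      session.close();         // flushes any pending operations
    } finally {
      client.close();
    }
  }
}
{code}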

> Add Kudu connector for Sqoop
> 
>
> Key: SQOOP-2903
> URL: https://issues.apache.org/jira/browse/SQOOP-2903
> Project: Sqoop
>  Issue Type: Improvement
>  Components: connectors
>Reporter: Sameer Abhyankar
>Assignee: Sameer Abhyankar
>Priority: Major
> Attachments: SQOOP-2903.1.patch, SQOOP-2903.2.patch, SQOOP-2903.patch
>
>
> Sqoop currently does not have a connector for Kudu. We should add the 
> functionality to allow Sqoop to ingest data directly into Kudu.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3171) Import as parquet jobs failed randomly while multiple jobs concurrently importing into targets with same parent

2018-03-27 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415172#comment-16415172
 ] 

Sandish Kumar HN commented on SQOOP-3171:
-

https://issues.cloudera.org/browse/KITE-1176

> Import as parquet jobs failed randomly while multiple jobs concurrently 
> importing into targets with same parent
> ---
>
> Key: SQOOP-3171
> URL: https://issues.apache.org/jira/browse/SQOOP-3171
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Xiaomin Zhang
>Assignee: Sandish Kumar HN
>Priority: Major
>
> Running multiple Parquet import jobs concurrently into the target 
> directories below:
> hdfs://ns/path/dataset1
> hdfs://ns/path/dataset2
> In some cases, one of the Sqoop jobs fails with the following error:
> 17/03/19 08:21:21 INFO mapreduce.Job: Job job_1488289274600_188649 failed 
> with state FAILED due to: Job commit failed: 
> org.kitesdk.data.DatasetIOException: Could not cleanly delete 
> path:hdfs://ns/path/.temp/job_1488289274600_188649
> at 
> org.kitesdk.data.spi.filesystem.FileSystemUtil.cleanlyDelete(FileSystemUtil.java:239)
> at 
> org.kitesdk.data.spi.filesystem.TemporaryFileSystemDatasetRepository.delete(TemporaryFileSystemDatasetRepository.java:61)
> at 
> org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$MergeOutputCommitter.commitJob(DatasetKeyOutputFormat.java:395)
> at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:274)
> at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.FileNotFoundException: File hdfs://ns/path/.temp does not 
> exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:705)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:106)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:763)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:759)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:759)
> at 
> org.kitesdk.data.spi.filesystem.FileSystemUtil.cleanlyDelete(FileSystemUtil.java:226)
> This is due to:
> https://issues.cloudera.org/browse/KITE-1155



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3171) Import as parquet jobs failed randomly while multiple jobs concurrently importing into targets with same parent

2018-03-27 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415157#comment-16415157
 ] 

Sandish Kumar HN commented on SQOOP-3171:
-

[~ximz] Yes, I'm planning to get a new Kite release that includes the current 
fixes. There are two other Kite issues blocking me in Sqoop development as 
well; let me ask the Kite project.

> Import as parquet jobs failed randomly while multiple jobs concurrently 
> importing into targets with same parent
> ---
>
> Key: SQOOP-3171
> URL: https://issues.apache.org/jira/browse/SQOOP-3171
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Xiaomin Zhang
>Assignee: Sandish Kumar HN
>Priority: Major
>
> Running multiple Parquet import jobs concurrently into the target 
> directories below:
> hdfs://ns/path/dataset1
> hdfs://ns/path/dataset2
> In some cases, one of the Sqoop jobs fails with the following error:
> 17/03/19 08:21:21 INFO mapreduce.Job: Job job_1488289274600_188649 failed 
> with state FAILED due to: Job commit failed: 
> org.kitesdk.data.DatasetIOException: Could not cleanly delete 
> path:hdfs://ns/path/.temp/job_1488289274600_188649
> at 
> org.kitesdk.data.spi.filesystem.FileSystemUtil.cleanlyDelete(FileSystemUtil.java:239)
> at 
> org.kitesdk.data.spi.filesystem.TemporaryFileSystemDatasetRepository.delete(TemporaryFileSystemDatasetRepository.java:61)
> at 
> org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$MergeOutputCommitter.commitJob(DatasetKeyOutputFormat.java:395)
> at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:274)
> at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.FileNotFoundException: File hdfs://ns/path/.temp does not 
> exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:705)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:106)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:763)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:759)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:759)
> at 
> org.kitesdk.data.spi.filesystem.FileSystemUtil.cleanlyDelete(FileSystemUtil.java:226)
> This is due to:
> https://issues.cloudera.org/browse/KITE-1155



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (SQOOP-3299) Implement HiveServer2 support

2018-03-21 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407727#comment-16407727
 ] 

Sandish Kumar HN edited comment on SQOOP-3299 at 3/21/18 10:29 AM:
---

Sure [~vasas], let me know about the subtasks and the review request once 
you're done.


was (Author: sanysand...@gmail.com):
Sure [~vasas] let me know about subtasks and review request. 

> Implement HiveServer2 support
> -
>
> Key: SQOOP-3299
> URL: https://issues.apache.org/jira/browse/SQOOP-3299
> Project: Sqoop
>  Issue Type: New Feature
>Affects Versions: 1.5.0
>Reporter: Szabolcs Vasas
>Assignee: Szabolcs Vasas
>Priority: Major
>
> Sqoop currently uses HiveCLI to load data into Hive. HiveCLI is a very old, 
> deprecated tool that does not support proper authorization or auditing, so it 
> would be beneficial to use HiveServer2 instead.
> This is an umbrella JIRA for the initiative.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3299) Implement HiveServer2 support

2018-03-21 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407727#comment-16407727
 ] 

Sandish Kumar HN commented on SQOOP-3299:
-

Sure [~vasas], let me know about the subtasks and the review request.

> Implement HiveServer2 support
> -
>
> Key: SQOOP-3299
> URL: https://issues.apache.org/jira/browse/SQOOP-3299
> Project: Sqoop
>  Issue Type: New Feature
>Affects Versions: 1.5.0
>Reporter: Szabolcs Vasas
>Assignee: Sandish Kumar HN
>Priority: Major
>
> Sqoop currently uses HiveCLI to load data into Hive. HiveCLI is a very old, 
> deprecated tool that does not support proper authorization or auditing, so it 
> would be beneficial to use HiveServer2 instead.
> This is an umbrella JIRA for the initiative.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (SQOOP-3299) Implement HiveServer2 support

2018-03-21 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3299:
---

Assignee: Szabolcs Vasas  (was: Sandish Kumar HN)

> Implement HiveServer2 support
> -
>
> Key: SQOOP-3299
> URL: https://issues.apache.org/jira/browse/SQOOP-3299
> Project: Sqoop
>  Issue Type: New Feature
>Affects Versions: 1.5.0
>Reporter: Szabolcs Vasas
>Assignee: Szabolcs Vasas
>Priority: Major
>
> Sqoop currently uses HiveCLI to load data into Hive. HiveCLI is a very old, 
> deprecated tool that does not support proper authorization or auditing, so it 
> would be beneficial to use HiveServer2 instead.
> This is an umbrella JIRA for the initiative.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3299) Implement HiveServer2 support

2018-03-21 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407709#comment-16407709
 ] 

Sandish Kumar HN commented on SQOOP-3299:
-

Does anyone have ideas on how to implement HiveServer2 support? Any thoughts 
on the design?
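
For discussion, a minimal sketch of one possible direction (plain Hive JDBC, 
not Sqoop code): instead of shelling out to HiveCLI, the generated CREATE 
TABLE / LOAD DATA statements could be executed over a HiveServer2 JDBC 
connection. The URL, credentials, table, and path below are hypothetical, and 
the hive-jdbc driver is assumed to be on the classpath:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveServer2LoadSketch {
  public static void main(String[] args) throws SQLException {
    // Hypothetical HiveServer2 URL, user, table, and HDFS path.
    String url = "jdbc:hive2://hs2-host:10000/default";
    try (Connection conn = DriverManager.getConnection(url, "sqoop", "");
         Statement stmt = conn.createStatement()) {
      // The same statements HiveCLI runs today, but through a server-side
      // session that can be properly authorized and audited.
      stmt.execute("CREATE TABLE IF NOT EXISTS t1 (c1 INT, c3 STRING)");
      stmt.execute("LOAD DATA INPATH '/user/root/t1' INTO TABLE t1");
    }
  }
}
{code}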

> Implement HiveServer2 support
> -
>
> Key: SQOOP-3299
> URL: https://issues.apache.org/jira/browse/SQOOP-3299
> Project: Sqoop
>  Issue Type: New Feature
>Affects Versions: 1.5.0
>Reporter: Szabolcs Vasas
>Assignee: Sandish Kumar HN
>Priority: Major
>
> Sqoop currently uses HiveCLI to load data into Hive. HiveCLI is a very old, 
> deprecated tool that does not support proper authorization or auditing, so it 
> would be beneficial to use HiveServer2 instead.
> This is an umbrella JIRA for the initiative.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (SQOOP-3299) Implement HiveServer2 support

2018-03-21 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3299:
---

Assignee: Sandish Kumar HN

> Implement HiveServer2 support
> -
>
> Key: SQOOP-3299
> URL: https://issues.apache.org/jira/browse/SQOOP-3299
> Project: Sqoop
>  Issue Type: New Feature
>Affects Versions: 1.5.0
>Reporter: Szabolcs Vasas
>Assignee: Sandish Kumar HN
>Priority: Major
>
> Sqoop currently uses HiveCLI to load data into Hive. HiveCLI is a very old, 
> deprecated tool that does not support proper authorization or auditing, so it 
> would be beneficial to use HiveServer2 instead.
> This is an umbrella JIRA for the initiative.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (SQOOP-3171) Import as parquet jobs failed randomly while multiple jobs concurrently importing into targets with same parent

2018-03-15 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3171:
---

Assignee: Sandish Kumar HN

> Import as parquet jobs failed randomly while multiple jobs concurrently 
> importing into targets with same parent
> ---
>
> Key: SQOOP-3171
> URL: https://issues.apache.org/jira/browse/SQOOP-3171
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Xiaomin Zhang
>Assignee: Sandish Kumar HN
>Priority: Major
>
> Running multiple Parquet import jobs concurrently into the target 
> directories below:
> hdfs://ns/path/dataset1
> hdfs://ns/path/dataset2
> In some cases, one of the Sqoop jobs fails with the following error:
> 17/03/19 08:21:21 INFO mapreduce.Job: Job job_1488289274600_188649 failed 
> with state FAILED due to: Job commit failed: 
> org.kitesdk.data.DatasetIOException: Could not cleanly delete 
> path:hdfs://ns/path/.temp/job_1488289274600_188649
> at 
> org.kitesdk.data.spi.filesystem.FileSystemUtil.cleanlyDelete(FileSystemUtil.java:239)
> at 
> org.kitesdk.data.spi.filesystem.TemporaryFileSystemDatasetRepository.delete(TemporaryFileSystemDatasetRepository.java:61)
> at 
> org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$MergeOutputCommitter.commitJob(DatasetKeyOutputFormat.java:395)
> at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:274)
> at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.FileNotFoundException: File hdfs://ns/path/.temp does not 
> exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:705)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:106)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:763)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:759)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:759)
> at 
> org.kitesdk.data.spi.filesystem.FileSystemUtil.cleanlyDelete(FileSystemUtil.java:226)
> This is due to:
> https://issues.cloudera.org/browse/KITE-1155



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3216) Expanded Metastore support for MySql, Oracle, Postgresql, MSSql, and DB2

2017-10-03 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189699#comment-16189699
 ] 

Sandish Kumar HN commented on SQOOP-3216:
-

Hi [~BoglarkaEgyed] & [~zach.berkowitz],
Did we remove this option?

<property>
  <name>sqoop.metastore.client.enable.autoconnect</name>
  <value>false</value>
  <description>If true, Sqoop will connect to a local metastore
    for job management when no other metastore arguments are
    provided.
  </description>
</property>

Now Sqoop no longer reads anything from sqoop-site.xml; it falls back to the 
default HSQLDB metastore regardless of what is specified in sqoop-site.xml.


> Expanded Metastore support for MySql, Oracle, Postgresql, MSSql, and DB2
> 
>
> Key: SQOOP-3216
> URL: https://issues.apache.org/jira/browse/SQOOP-3216
> Project: Sqoop
>  Issue Type: New Feature
>  Components: metastore
>Reporter: Zach Berkowitz
>Assignee: Zach Berkowitz
>Priority: Minor
> Fix For: 1.5.0
>
> Attachments: SQOOP-3216-2.patch, SQOOP-3216-3.patch, 
> SQOOP-3216-4.patch, SQOOP-3216.patch
>
>
> Reconfigured the HsqldbJobStorage class to support MySql, Oracle, Postgresql, 
> MSSql, and DB2 databases in addition to Hsqldb, and renamed HsqldbJobStorage 
> to GenericJobStorage. This new class also serves the function of 
> AutoHsqldbStorage, which has been removed.
> Two new options, --meta-username and --meta-password, have been added to 
> connect to metastore databases that require a username and password. 
> Added an enum class JdbcDrivers that holds JDBC connection information.
> Added two test classes, MetaConnectIncrementalImportTest and JobToolTest, and 
> modified TestSavedJobs (now SavedJobsTest) to test with all supported 
> database services.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-2851) Could not change the sqoop metastore username in the sqoop-site.xml ?

2017-10-03 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189646#comment-16189646
 ] 

Sandish Kumar HN commented on SQOOP-2851:
-


<property>
  <name>sqoop.metastore.client.enable.autoconnect</name>
  <value>false</value>
  <description>If true, Sqoop will connect to a local metastore
    for job management when no other metastore arguments are
    provided.
  </description>
</property>

> Could not change the  sqoop metastore username in the sqoop-site.xml ?
> --
>
> Key: SQOOP-2851
> URL: https://issues.apache.org/jira/browse/SQOOP-2851
> Project: Sqoop
>  Issue Type: Bug
>  Components: metastore
>Affects Versions: 1.4.6
> Environment: sqoop-1.4.6
> hadoop-2.6.0
>Reporter: alexBai
>
> sqoop job --list
> Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will 
> fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> 16/02/26 12:00:47 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
> 16/02/26 12:00:47 ERROR tool.JobTool: I/O error performing job operation: 
> java.io.IOException: Exception creating SQL connection
>   at 
> org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:216)
>   at 
> org.apache.sqoop.metastore.hsqldb.AutoHsqldbStorage.open(AutoHsqldbStorage.java:112)
>   at org.apache.sqoop.tool.JobTool.run(JobTool.java:274)
>   at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
>   at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
>   at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
>   at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
> Caused by: java.sql.SQLException: User not found: SQOOP
>   at org.hsqldb.jdbc.Util.sqlException(Unknown Source)
>   at org.hsqldb.jdbc.jdbcConnection.(Unknown Source)
>   at org.hsqldb.jdbcDriver.getConnection(Unknown Source)
>   at org.hsqldb.jdbcDriver.connect(Unknown Source)
>   at java.sql.DriverManager.getConnection(DriverManager.java:571)
>   at java.sql.DriverManager.getConnection(DriverManager.java:215)
>   at 
> org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:176)
>   ... 8 more



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2017-09-05 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16153697#comment-16153697
 ] 

Sandish Kumar HN commented on SQOOP-2907:
-

I have already submitted this for review; it should be available in Sqoop 1.4.7.

> Export parquet files to RDBMS: don't require .metadata for parquet files
> 
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
>  Issue Type: Improvement
>  Components: metastore
>Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
> export parquet files to Oracle
>Reporter: Ruslan Dautkhanov
>Assignee: Sandish Kumar HN
>  Labels: sqoop
> Attachments: SQOOP-2907-3.patch, SQOOP-2907.patch, SQOOP-2907.patch1, 
> SQOOP-2907.patch2
>
>
> Kite currently requires a .metadata directory.
> Parquet files have their own metadata stored alongside the data files.
> It would be great if exporting Parquet files to an RDBMS did not require 
> .metadata.
> Most of our files are created by Spark and Hive, which don't create 
> .metadata; only Kite does.
> This makes the usability of sqoop export for Parquet files very limited.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (SQOOP-3146) Sqoop (import + --as-parquetfile) with Oracle (CLOB vs. BLOB) is inconsistent

2017-08-30 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3146:
---

Assignee: Sandish Kumar HN

> Sqoop (import + --as-parquetfile) with Oracle (CLOB vs. BLOB) is inconsistent
> -
>
> Key: SQOOP-3146
> URL: https://issues.apache.org/jira/browse/SQOOP-3146
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Anna Szonyi
>Assignee: Sandish Kumar HN
>
> 
> # Owner: Sqoopinators
> # Component: Sqoop1
> # Purpose: Escalation Test Case
> # SFDC Case ID:127558
> # SFDC EscalationID: CDH-50699
> # File: SupportTest_Case_127558_JIRA_CDH-50699.txt
> #
> # Description
> # 1. Sqoop import + --as-parquetfile + CLOB Data Types (Gives Error)
> # 2. Sqoop import + --as-parquetfile + BLOB Data Types (Works Good)
> 
> ##
>  USE CASE [1] . Sqoop import + --as-parquetfile + CLOB Data Types (Gives Error)
> ##
> ###
> # STEP 01 - CREATE DATA
> ###
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "create table t1_clob (c1 int,c2 clob)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1_clob values(1,'qwqewewqrerew121212121212’)”
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1_clob"
> #
> OUTPUT
> #
> ---
> | C1   | C2   | 
> ---
> | 1| qwqewewqrerew121212121212 | 
> ---
> 
> STEP 02 - IMPORT AS PARQUET FILE (Without --map-column-java) [REPRODUCING 
> THE ERROR]
> 
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD 
> --as-parquetfile --table T1_CLOB --delete-target-dir --target-dir 
> '/projects/t1_clob' -m 1
> OUTPUT
> ------
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> 17/02/21 10:07:08 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.8.3
> 17/02/21 10:07:08 WARN tool.BaseSqoopTool: Setting your password on the 
> command-line is insecure. Consider using -P instead.
> 17/02/21 10:07:08 INFO oracle.OraOopManagerFactory: Data Connector for Oracle 
> and Hadoop is disabled.
> 17/02/21 10:07:08 INFO manager.SqlManager: Using default fetchSize of 1000
> 17/02/21 10:07:08 INFO tool.CodeGenTool: Beginning code generation
> 17/02/21 10:07:08 INFO tool.CodeGenTool: Will generate java class as 
> codegen_T1_CLOB
> 17/02/21 10:07:09 INFO manager.OracleManager: Time zone has been set to GMT
> 17/02/21 10:07:09 INFO manager.SqlManager: Executing SQL statement: SELECT 
> t.* FROM "T1_CLOB" t WHERE 1=0
> 17/02/21 10:07:09 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is 
> /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
> Note: 
> /tmp/sqoop-root/compile/cbaf5013e6bc9dad7283090f9d761289/codegen_T1_CLOB.java 
> uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 17/02/21 10:07:11 INFO orm.CompilationManager: Writing jar file: 
> /tmp/sqoop-root/compile/cbaf5013e6bc9dad7283090f9d761289/codegen_T1_CLOB.jar
> 17/02/21 10:07:13 INFO tool.ImportTool: Destination directory 
> /projects/t1_clob is not present, hence not deleting.
> 17/02/21 10:07:13 INFO manager.OracleManager: Time zone has been set to GMT
> 17/02/21 10:07:13 INFO manager.OracleManager: Time zone has been set to GMT
> 17/02/21 10:07:13 INFO mapreduce.ImportJobBase: Beginning import of T1_CLOB
> 17/02/21 10:07:13 INFO Configuration.deprecation: mapred.jar is deprecated. 
> Instead, use mapreduce.job.jar
> 17/02/21 10:07:13 INFO manager.OracleManager: Time zone has been set to GMT
> 17/02/21 10:07:13 INFO manager.OracleManager: Time zone has been set to GMT
> 17/02/21 10:07:13 INFO manager.SqlManager: Executing SQL statement: SELECT 
> t.* FROM "T1_CLOB" t WHERE 1=0
> 17/02/21 10:07:13 INFO manager.SqlManager: Executing SQL statement: SELECT 
> t.* FROM "T1_CLOB" t WHERE 1=0
> 17/02/21 10:07:13 ERROR tool.ImportTool: Imported Failed: Cannot convert SQL 
> type 2005
> #
> STEP 02.1 - IMPORT AS PARQUET FILE + --map-column-java (For CLOB data type)
> #
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD 
> --as-parquetfile --table T1_CLOB 

[jira] [Assigned] (SQOOP-3151) Sqoop export HDFS file type auto detection can pick wrong type

2017-08-30 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3151:
---

Assignee: Sandish Kumar HN

> Sqoop export HDFS file type auto detection can pick wrong type
> --
>
> Key: SQOOP-3151
> URL: https://issues.apache.org/jira/browse/SQOOP-3151
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
>Reporter: Boglarka Egyed
>Assignee: Sandish Kumar HN
>
> It appears that Sqoop export tries to detect the file format by reading the 
> first 3 characters of a file. Based on that header, the appropriate file 
> reader is used. However, if the result set happens to contain the header 
> sequence, the wrong reader is chosen resulting in a misleading error.
> For example, suppose someone is exporting a table in which one of the field 
> values is "PART". Since Sqoop sees the letters "PAR", it invokes the Kite SDK, 
> assuming the file is in Parquet format. This leads to a misleading error:
> ERROR sqoop.Sqoop: Got exception running Sqoop: 
> org.kitesdk.data.DatasetNotFoundException: Descriptor location does not 
> exist: hdfs://.metadata 
> org.kitesdk.data.DatasetNotFoundException: Descriptor location does not 
> exist: hdfs://.metadata
> This can be reproduced easily, using Hive as a real world example:
> > create table test2 (val string);
> > insert into test1 values ('PAR');
> Then run a sqoop export against the table data:
> $ sqoop export --connect $MYCONN --username $MYUSER --password $MYPWD -m 1 
> --export-dir /user/hive/warehouse/test --table $MYTABLE
> Sqoop will fail with the following:
> ERROR sqoop.Sqoop: Got exception running Sqoop: 
> org.kitesdk.data.DatasetNotFoundException: Descriptor location does not 
> exist: hdfs://.metadata
> org.kitesdk.data.DatasetNotFoundException: Descriptor location does not 
> exist: hdfs://.metadata
> Changing value from "PAR" to something else, like 'Obj' (Avro) or 'SEQ' 
> (sequencefile), which will result in similar errors.
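
To make the failure mode concrete, here is a small standalone sketch of the 
kind of three-byte header sniffing described above (illustrative only, not 
Sqoop's actual code); it shows why a delimited-text record whose first field 
starts with "PAR", "Obj", or "SEQ" gets misclassified:

{code}
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class HeaderSniffSketch {

  // Guess the format from the first three bytes only, mirroring the behaviour
  // described above. A text record that happens to start with one of these
  // sequences is misclassified, which is exactly the reported bug.
  static String guessFormat(InputStream in) throws IOException {
    byte[] header = new byte[3];
    if (in.read(header) < 3) {
      return "text";
    }
    String magic = new String(header, StandardCharsets.US_ASCII);
    if ("PAR".equals(magic)) return "parquet";      // full Parquet magic is "PAR1"
    if ("Obj".equals(magic)) return "avro";         // full Avro magic is "Obj" + 0x01
    if ("SEQ".equals(magic)) return "sequencefile"; // SequenceFile header starts with "SEQ"
    return "text";
  }

  public static void main(String[] args) throws IOException {
    try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
      System.out.println(guessFormat(in));
    }
  }
}
{code}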



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (SQOOP-3166) Sqoop1 (import + --query + aggregate function + --split-by -m >=2) fails with error (unknown column)

2017-08-30 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3166:
---

Assignee: Sandish Kumar HN

> Sqoop1 (import + --query + aggregate function + --split-by -m >=2) fails with 
> error (unknown column)
> 
>
> Key: SQOOP-3166
> URL: https://issues.apache.org/jira/browse/SQOOP-3166
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Markus Kemper
>Assignee: Sandish Kumar HN
>
> This issue appears to be RDBMS generic.
> *Test Case*
> {noformat}
> 
> # Description:
> # 
> # 1. Sqoop import fails with Unknown Column error with the following 
> conditions
> # 1.1. Using --query + sql aggregate function() + --split-by + --num-mappers 
> >1 fails
> # 2. The Sqoop documentation does not seem to clarify requirements for 
> "select list" and "--split-by" 
> #
> # Documentation
> # 
> http://archive.cloudera.com/cdh5/cdh/5/sqoop-1.4.6-cdh5.10.0/SqoopUserGuide.html#_controlling_parallelism
> # 7.2.4. Controlling Parallelism
> 
> #
> # STEP 01 - [ORACLE] Create Data
> #
> export MYCONN=jdbc:oracle:thin:@oracle1.cloudera.com:1521/db11g;
> export MYUSER=sqoop
> export MYPSWD=cloudera
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "drop table t1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "drop view v1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "create table t1 (c1 int, c2 date, c3 varchar(10))"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1 values (1, current_date, 'some data')"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "create view v1 as select c1 as \"ID\", c2, c3 from t1" 
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select count(c1) from t1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select count(id) from v1"
> 
> | COUNT(C1)| 
> 
> | 1| 
> 
> ~
> 
> | COUNT(ID)| 
> 
> | 1| 
> 
> #
> # STEP 02 - [ORACLE] Import from Table/View with --num-mappers 2 (reproduction)
> #
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select count(c1) from t1 where \$CONDITIONS" --target-dir /user/root/t1 
> --delete-target-dir --num-mappers 2 --split-by C1 --verbose
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select count(id) from v1 where \$CONDITIONS" --target-dir /user/root/t1 
> --delete-target-dir --num-mappers 2 --split-by ID --verbose
> Output:
> 17/04/03 09:09:11 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT 
> MIN(C1), MAX(C1) FROM (select count(c1) from t1 where  (1 = 1) ) t1
> 17/04/03 09:09:11 INFO mapreduce.JobSubmitter: Cleaning up the staging area 
> /user/root/.staging/job_1490976836761_0069
> 17/04/03 09:09:11 WARN security.UserGroupInformation: 
> PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.IOException: 
> java.sql.SQLSyntaxErrorException: ORA-00904: "C1": invalid identifier
> ~
> 17/04/03 09:10:01 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT 
> MIN(ID), MAX(ID) FROM (select count(id) from v1 where  (1 = 1) ) t1
> 17/04/03 09:10:01 INFO mapreduce.JobSubmitter: Cleaning up the staging area 
> /user/root/.staging/job_1490976836761_0070
> 17/04/03 09:10:01 WARN security.UserGroupInformation: 
> PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.IOException: 
> java.sql.SQLSyntaxErrorException: ORA-00904: "ID": invalid identifier
> #
> # STEP 03 - [ORACLE] Import from Table/View with --num-mappers 1 (workaround)
> #
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select count(c1) from t1 where \$CONDITIONS" --target-dir /user/root/t1 
> --delete-target-dir --num-mappers 1 --split-by C1 --verbose
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select count(id) from v1 where \$CONDITIONS" --target-dir /user/root/t1 
> --delete-target-dir --num-mappers 1 --split-by ID --verbose
> Output:
> 17/04/03 09:07:11 INFO mapreduce.ImportJobBase: Transferred 2 bytes in 
> 21.5799 seconds (0.0927 bytes/sec)
> 17/04/03 09:07:11 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> ~
> 17/04/03 09:08:13 INFO 

[jira] [Assigned] (SQOOP-3227) Sqoop Avro import with decimal mapping issue

2017-08-30 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3227:
---

Assignee: Sandish Kumar HN

> Sqoop Avro import with decimal mapping issue
> 
>
> Key: SQOOP-3227
> URL: https://issues.apache.org/jira/browse/SQOOP-3227
> Project: Sqoop
>  Issue Type: Bug
>Reporter: liviu
>Assignee: Sandish Kumar HN
>Priority: Blocker
>
> Hi,
> We are using Sqoop version 1.4.6-cdh5.8.5 for importing numeric data types 
> from Oracle to Avro Decimal Logical type (Bytes as Decimal with precision and 
> scale)
> The import works fine except for cases in which the values stored in Oracle 
> have a lower scale than the one defined in the table DDL.
> For example, the column is defined as NUMERIC(20,2) and the value stored is 
> "3.2" (only one digit after the decimal point); in this case we receive the 
> error message:
> *_Error: org.apache.avro.file.DataFileWriter$AppendWriteException: 
> org.apache.avro.AvroTypeException: Cannot encode decimal with scale 1 as 
> scale 2_*
> This error was discussed in [https://issues.apache.org/jira/browse/AVRO-1864] 
> with the resolution that this sqoop behavior is correct (it cannot add by 
> default an extra info for transforming "3.2" to "3.20")
> We tried below methods for conversion to AVRO Decimal(20,2) Logical datatype:
> 1). use   *--map-column-java "COLNAME=java.math.BigDecimal"* in sqoop import 
> command
> error received:
> _*Error: org.apache.avro.file.DataFileWriter$AppendWriteException: 
> org.apache.avro.AvroTypeException: Cannot encode decimal with scale 1 as 
> scale 2*_
> 2). use _*--map-column-java "COL1=Decimal(20%2C2)"*_ in sqoop command
> error received:
>  *_ERROR tool.ImportTool: Import failed: No ResultSet method for Java type 
> Decimal(20,2)_*
> 3). made the column as varchar2(100) in Oracle database, stored the value as 
> "3.20" and use _*--map-column-java "COLNAME=java.math.BigDecimal"* _in sqoop 
> command
> error received: 
> *_ERROR tool.ImportTool: Import failed: Cannot convert to AVRO type 
> java.math.BigDecimal_*
> Is there any way we can instruct Sqoop to map the column from the Oracle 
> table to the Avro decimal logical type (precision=20, scale=2) and import the 
> "3.2" value from the Oracle database as "3.20" (Decimal(20,2)) in the Avro file?
> Thanks,
> Liviu
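
Not an answer from the Sqoop side, but to illustrate the conversion being 
asked for: the Avro decimal logical type requires the value's scale to match 
the declared scale exactly, so a value read from Oracle would need to be 
rescaled before encoding. A minimal sketch with a hypothetical helper, not 
existing Sqoop behaviour:

{code}
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalScaleSketch {

  // Pad the scale of a database value up to the scale declared in the Avro
  // schema, e.g. 3.2 -> 3.20 for decimal(20,2). RoundingMode.UNNECESSARY makes
  // this throw instead of silently rounding if the value carries more decimal
  // digits than the declared scale allows.
  static BigDecimal toDeclaredScale(BigDecimal value, int declaredScale) {
    return value.setScale(declaredScale, RoundingMode.UNNECESSARY);
  }

  public static void main(String[] args) {
    System.out.println(toDeclaredScale(new BigDecimal("3.2"), 2)); // prints 3.20
  }
}
{code}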



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-869) Sqoop can not append data to hive in SequenceFiles format

2017-08-30 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147979#comment-16147979
 ] 

Sandish Kumar HN commented on SQOOP-869:


[~zhangguancheng] sqoop import --as-sequencefile works fine, but during the 
import Sqoop writes the generated table-name.class into the sequence file, so 
whenever you query the table from Hive it expects that class (you need to run 
"add jar tablename.jar" in Hive). Please try that.

Coming to your question about why step 2) failed while step 1) succeeded:

1) succeeded, but you can't query the table from Hive without table-name.jar.
2) sqoop import --as-sequencefile --append: the Hive import runs a 
"load data inpath 'path' into table tablename" statement, which again expects 
table-name.jar ("add jar tablename.jar" in Hive), and this is not done by the 
Sqoop Hive import for --as-sequencefile.

In any case, we currently throw an exception for --hive-import with 
--as-sequencefile.
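
To illustrate the point about the generated class: the SequenceFile header 
records the value class name, and deserializing the records requires that 
generated class on the classpath, which is exactly what Hive is missing. A 
minimal sketch with a hypothetical path, assuming hadoop-common on the 
classpath:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

public class SequenceFileClassCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical path to a Sqoop-generated sequence file.
    Path path = new Path("/user/root/t1/part-m-00000");
    try (SequenceFile.Reader reader =
             new SequenceFile.Reader(conf, SequenceFile.Reader.file(path))) {
      // Prints the Sqoop-generated record class name; actually reading the
      // values fails with "WritableName can't load class" unless the generated
      // jar is on the classpath (hence "add jar" in Hive).
      System.out.println(reader.getValueClassName());
    }
  }
}
{code}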


> Sqoop can not append data to hive in SequenceFiles format
> -
>
> Key: SQOOP-869
> URL: https://issues.apache.org/jira/browse/SQOOP-869
> Project: Sqoop
>  Issue Type: Bug
>  Components: hive-integration
>Affects Versions: 1.4.2
>Reporter: zhangguancheng
>Assignee: Sandish Kumar HN
>
> To reproduce it, do the following:
> 1) sqoop import  --hive-import  --connect 
> jdbc:oracle:thin:@###.###.###.###:###:orcl --username ### --password ### 
> --target-dir /###/### --hive-home /###/### --hive-table ### --as-sequencefile 
>  --query "select # from "###"."###" where \$CONDITIONS" --create-hive-table  
> --class-name ###   --outdir /###/###  --bindir /###/### --map-column-hive 
> ###=STRING,###=BIGINT,###=BIGINT
> 2) sqoop import  --hive-import  --connect 
> jdbc:oracle:thin:@###.###.###.###:###:orcl --username ### --password ### 
> --target-dir /###/### --hive-home /###/### --hive-table ### --as-sequencefile 
>  --query "select # from "TEST1"."BAI" where \$CONDITIONS"   -append
> --class-name ### --outdir /###/###   --bindir /###/### --map-column-hive 
> ###=STRING,###=BIGINT,###=BIGINT 
> And the output of step 2) will be something like:
> {noformat} 
> 12/05/04 23:47:07 INFO hive.HiveImport: OK
> 12/05/04 23:47:07 INFO hive.HiveImport: Time taken: 3.996 seconds
> 12/05/04 23:47:08 INFO hive.HiveImport: Loading data to table default.###
> 12/05/04 23:47:09 INFO hive.HiveImport: Failed with exception 
> java.lang.RuntimeException: java.io.IOException: WritableName can't load 
> class: ***
> 12/05/04 23:47:09 INFO hive.HiveImport: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.MoveTask
> 12/05/04 23:47:09 ERROR tool.ImportTool: Encountered IOException running 
> import job: java.io.IOException: Hive exited with status 9
> at 
> org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:375)
> at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:315)
> at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:227)
> at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:415)
> at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
> at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
> at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
> at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
> at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-3043) Sqoop HiveImport fails with Wrong FS while removing the _logs

2017-08-30 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147962#comment-16147962
 ] 

Sandish Kumar HN commented on SQOOP-3043:
-

Removing the temp logs has already been taken care of here: 
https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/hive/HiveImport.java#L117
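
For reference, a minimal sketch of the pattern that avoids the "Wrong FS" 
error (this is the idea behind the linked code, not a copy of it): resolve the 
FileSystem from the target path's own URI instead of the default HDFS 
filesystem. The method and path names here are illustrative:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RemoveLogsSketch {

  // Resolve the FileSystem from the path itself (s3a://, hdfs://, ...) rather
  // than from fs.defaultFS; using the default HDFS filesystem with an s3a://
  // path is what produces "Wrong FS: ..., expected: hdfs://nn1".
  static void removeLogsDir(Configuration conf, String targetDir) throws Exception {
    Path logsPath = new Path(targetDir, "_logs");
    FileSystem fs = logsPath.getFileSystem(conf);
    if (fs.exists(logsPath)) {
      fs.delete(logsPath, true); // recursive delete of the _logs directory
    }
  }
}
{code}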

> Sqoop HiveImport fails with Wrong FS while removing the _logs 
> --
>
> Key: SQOOP-3043
> URL: https://issues.apache.org/jira/browse/SQOOP-3043
> Project: Sqoop
>  Issue Type: Bug
>  Components: hive-integration
>Reporter: Ramesh B
>Assignee: Sandish Kumar HN
>
> With s3:// as --target-dir and --hive-import provided sqoop fails with 
> {code}ERROR tool.ImportTool: Imported Failed: Wrong FS: 
> s3a://dataplatform/sqoop/target/user/_logs, expected: hdfs://nn1
> {code}
> This is due to removeTempLogs method in HiveImport.java which is expecting 
> hdfs as the path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (SQOOP-3182) Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with (Can't parse input data: 'PAR1')

2017-08-30 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN resolved SQOOP-3182.
-
   Resolution: Fixed
Fix Version/s: 1.4.7

This issue is solved through https://issues.apache.org/jira/browse/SQOOP-3178: 
the class name and jar name for the Parquet format will now be created as 
codegen_.java and codegen_.jar.

> Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with 
> (Can't parse input data: 'PAR1')
> 
>
> Key: SQOOP-3182
> URL: https://issues.apache.org/jira/browse/SQOOP-3182
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Markus Kemper
>Assignee: Sandish Kumar HN
> Fix For: 1.4.7
>
>
> Sqoop1 (import + --incremental + --merge-key + --as-parquet) fails with 
> (Can't parse input data: 'PAR1').  See test case below.
> *Test Case*
> {noformat}
> #
> # STEP 01 - Create Table and Data
> #
> export MYCONN=jdbc:oracle:thin:@oracle.sqoop.com:1521/db11g;
> export MYUSER=sqoop
> export MYPSWD=sqoop
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "drop table t1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "create table t1 (c1 int, c2 date, c3 varchar(10), c4 timestamp)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1 values (1, sysdate, 'NEW ROW 1', sysdate)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1"
> Output:
> -
> | C1   | C2  | C3 | C4  | 
> -
> | 1| 2017-05-06 06:59:02.0 | NEW ROW 1  | 2017-05-06 
> 06:59:02 | 
> -
> #
> # STEP 02 - Import Data into HDFS 
> #
> hdfs dfs -rm -r /user/root/t1
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 
> --merge-key C1 --last-value '2017-01-01 00:00:00.0' --as-parquetfile 
> --map-column-java C2=String,C4=String --num-mappers 1 --verbose 
> hdfs dfs -ls /user/root/t1/*.parquet
> parquet-tools cat --json 
> 'hdfs://namenode/user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet'
> Output:
> 17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Transferred 2.627 KB in 
> 23.6174 seconds (113.8988 bytes/sec)
> 17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> 17/05/06 07:01:34 INFO tool.ImportTool:   --last-value 2017-05-06 07:01:09.0
> ~
> -rw-r--r--   3 root root   1144 2017-05-06 07:01 
> /user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet
> ~
> {"C1":"1","C2":"2017-05-06 06:59:02.0","C3":"NEW ROW 1","C4":"2017-05-06 
> 06:59:02"}
> #
> # STEP 03 - Insert New Row and Update Existing Row
> #
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1 values (2, sysdate, 'NEW ROW 2', sysdate)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "update t1 set c3 = 'UPDATE 1', c4 = sysdate where c1 = 1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1 order by c1"
> Output:
> -
> | C1   | C2  | C3 | C4  | 
> -
> | 1| 2017-05-06 06:59:02.0 | UPDATE 1   | 2017-05-06 
> 07:04:40 | 
> | 2| 2017-05-06 07:04:38.0 | NEW ROW 2  | 2017-05-06 
> 07:04:38 | 
> -
> #
> # STEP 04 - Import Data into HDFS and Merge changes 
> #
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 
> --merge-key C1 --last-value '2017-05-06 07:01:09.0' --as-parquetfile 
> --map-column-java C2=String,C4=String --num-mappers 1 --verbose 
> Output:
> 17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Transferred 2.6611 KB in 
> 27.4934 seconds (99.1148 bytes/sec)
> 17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Retrieved 2 records.
> 17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Restoring classloader: 
> java.net.FactoryURLClassLoader@121fdcee
> 17/05/06 07:06:43 INFO tool.ImportTool: Final destination exists, will run 
> merge job.
> 17/05/06 07:06:43 DEBUG tool.ImportTool: Using temporary folder: 
> 4bc6b65cd0194b81938f4660974ee392_T1
> 17/05/06 

[jira] [Assigned] (SQOOP-3182) Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with (Can't parse input data: 'PAR1')

2017-08-24 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3182:
---

Assignee: Sandish Kumar HN

> Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with 
> (Can't parse input data: 'PAR1')
> 
>
> Key: SQOOP-3182
> URL: https://issues.apache.org/jira/browse/SQOOP-3182
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Markus Kemper
>Assignee: Sandish Kumar HN
>
> Sqoop1 (import + --incremental + --merge-key + --as-parquet) fails with 
> (Can't parse input data: 'PAR1').  See test case below.
> *Test Case*
> {noformat}
> #
> # STEP 01 - Create Table and Data
> #
> export MYCONN=jdbc:oracle:thin:@oracle.sqoop.com:1521/db11g;
> export MYUSER=sqoop
> export MYPSWD=sqoop
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "drop table t1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "create table t1 (c1 int, c2 date, c3 varchar(10), c4 timestamp)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1 values (1, sysdate, 'NEW ROW 1', sysdate)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1"
> Output:
> -
> | C1   | C2  | C3 | C4  | 
> -
> | 1| 2017-05-06 06:59:02.0 | NEW ROW 1  | 2017-05-06 
> 06:59:02 | 
> -
> #
> # STEP 02 - Import Data into HDFS 
> #
> hdfs dfs -rm -r /user/root/t1
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 
> --merge-key C1 --last-value '2017-01-01 00:00:00.0' --as-parquetfile 
> --map-column-java C2=String,C4=String --num-mappers 1 --verbose 
> hdfs dfs -ls /user/root/t1/*.parquet
> parquet-tools cat --json 
> 'hdfs://namenode/user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet'
> Output:
> 17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Transferred 2.627 KB in 
> 23.6174 seconds (113.8988 bytes/sec)
> 17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> 17/05/06 07:01:34 INFO tool.ImportTool:   --last-value 2017-05-06 07:01:09.0
> ~
> -rw-r--r--   3 root root   1144 2017-05-06 07:01 
> /user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet
> ~
> {"C1":"1","C2":"2017-05-06 06:59:02.0","C3":"NEW ROW 1","C4":"2017-05-06 
> 06:59:02"}
> #
> # STEP 03 - Insert New Row and Update Existing Row
> #
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1 values (2, sysdate, 'NEW ROW 2', sysdate)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "update t1 set c3 = 'UPDATE 1', c4 = sysdate where c1 = 1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1 order by c1"
> Output:
> -
> | C1   | C2  | C3 | C4  | 
> -
> | 1| 2017-05-06 06:59:02.0 | UPDATE 1   | 2017-05-06 
> 07:04:40 | 
> | 2| 2017-05-06 07:04:38.0 | NEW ROW 2  | 2017-05-06 
> 07:04:38 | 
> -
> #
> # STEP 04 - Import Data into HDFS and Merge changes 
> #
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 
> --merge-key C1 --last-value '2017-05-06 07:01:09.0' --as-parquetfile 
> --map-column-java C2=String,C4=String --num-mappers 1 --verbose 
> Output:
> 17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Transferred 2.6611 KB in 
> 27.4934 seconds (99.1148 bytes/sec)
> 17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Retrieved 2 records.
> 17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Restoring classloader: 
> java.net.FactoryURLClassLoader@121fdcee
> 17/05/06 07:06:43 INFO tool.ImportTool: Final destination exists, will run 
> merge job.
> 17/05/06 07:06:43 DEBUG tool.ImportTool: Using temporary folder: 
> 4bc6b65cd0194b81938f4660974ee392_T1
> 17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Checking for existing class: T1
> 17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Attempting to load jar through 
> URL: 
> 

[jira] [Assigned] (SQOOP-3043) Sqoop HiveImport fails with Wrong FS while removing the _logs

2017-08-24 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3043:
---

Assignee: Sandish Kumar HN

> Sqoop HiveImport fails with Wrong FS while removing the _logs 
> --
>
> Key: SQOOP-3043
> URL: https://issues.apache.org/jira/browse/SQOOP-3043
> Project: Sqoop
>  Issue Type: Bug
>  Components: hive-integration
>Reporter: Ramesh B
>Assignee: Sandish Kumar HN
>
> With s3:// as --target-dir and --hive-import provided sqoop fails with 
> {code}ERROR tool.ImportTool: Imported Failed: Wrong FS: 
> s3a://dataplatform/sqoop/target/user/_logs, expected: hdfs://nn1
> {code}
> This is due to removeTempLogs method in HiveImport.java which is expecting 
> hdfs as the path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (SQOOP-3147) Import data to Hive Table in S3 in Parquet format

2017-08-24 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3147:
---

Assignee: Sandish Kumar HN

> Import data to Hive Table in S3 in Parquet format
> -
>
> Key: SQOOP-3147
> URL: https://issues.apache.org/jira/browse/SQOOP-3147
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
>Reporter: Ahmed Kamal
>Assignee: Sandish Kumar HN
>
> Using this command (shown below) succeeds only if the Hive table's location 
> is in HDFS. If the table is backed by S3, it throws an exception while trying 
> to move the data from the HDFS temp directory to S3:
> Job job_1486539699686_3090 failed with state FAILED due to: Job commit 
> failed: org.kitesdk.data.DatasetIOException: Dataset merge failed
>   at 
> org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:333)
>   at 
> org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:56)
>   at 
> org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$MergeOutputCommitter.commitJob(DatasetKeyOutputFormat.java:370)
>   at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:285)
>   at 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Dataset merge failed during rename of 
> hdfs://hdfs-path/tmp/dev_kamal/.temp/job_1486539699686_3090/mr/job_1486539699686_3090/0192f987-bd4c-4cb7-836f-562ac483e008.parquet
>  to 
> s3://bucket_name/dev_kamal/address/0192f987-bd4c-4cb7-836f-562ac483e008.parquet
>   at 
> org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:329)
>   ... 7 more
> sqoop import  --connect "jdbc:mysql://connectionUrl"   --table "tableName" 
> --as-parquetfile --verbose  --username=uname --password=pass --hive-import  
> --delete-target-dir --hive-database dev_kamal --hive-table  tableName 
> --hive-overwrite -m 150
> Another issue I noticed is that Sqoop stores the Avro schema in 
> TBLPROPERTIES under the avro.schema.literal attribute; if the table has a lot 
> of columns, the schema gets truncated, which causes a confusing exception 
> like this one:
> *Exception :*
> 17/03/07 12:13:13 INFO hive.metastore: Trying to connect to metastore with 
> URI thrift://url:9083
> 17/03/07 12:13:13 INFO hive.metastore: Opened a connection to metastore, 
> current connections: 1
> 17/03/07 12:13:13 INFO hive.metastore: Connected to metastore.
> 17/03/07 12:13:17 DEBUG util.ClassLoaderStack: Restoring classloader: 
> sun.misc.Launcher$AppClassLoader@3e9b1010
> 17/03/07 12:13:17 ERROR sqoop.Sqoop: Got exception running Sqoop: 
> org.apache.avro.SchemaParseException: 
> org.codehaus.jackson.JsonParseException: Unexpected end-of-input: was 
> expecting closing quote for a string value
>  at [Source: java.io.StringReader@3fb42ec7; line: 1, column: 6001]
> org.apache.avro.SchemaParseException: 
> org.codehaus.jackson.JsonParseException: Unexpected end-of-input: was 
> expecting closing quote for a string value
>  at [Source: java.io.StringReader@3fb42ec7; line: 1, column: 6001]
>   at org.apache.avro.Schema$Parser.parse(Schema.java:929)
>   at org.apache.avro.Schema$Parser.parse(Schema.java:917)
>   at 
> org.kitesdk.data.DatasetDescriptor$Builder.schemaLiteral(DatasetDescriptor.java:475)
>   at 
> org.kitesdk.data.spi.hive.HiveUtils.descriptorForTable(HiveUtils.java:154)
>   at 
> org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.load(HiveAbstractMetadataProvider.java:104)
>   at 
> org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:192)
>   at org.kitesdk.data.Datasets.load(Datasets.java:108)
>   at org.kitesdk.data.Datasets.load(Datasets.java:165)
>   at org.kitesdk.data.Datasets.load(Datasets.java:187)
>   at 
> org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:78)
>   at 
> org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:108)
>   at 
> org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
>   at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
>   at 
> org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
>   at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
>   at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
>   at 

[jira] [Assigned] (SQOOP-3209) joda-time.jar missing for Sqoop for S3 Object Store

2017-08-24 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3209:
---

Assignee: Sandish Kumar HN

> joda-time.jar missing for Sqoop for S3 Object Store
> ---
>
> Key: SQOOP-3209
> URL: https://issues.apache.org/jira/browse/SQOOP-3209
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Sowmya Ramesh
>Assignee: Sandish Kumar HN
>
> joda-time.jar missing for Sqoop for S3 Object Store



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-3132) sqoop export from Hive table stored in Parquet format to Oracle CLOB column results in (null)

2017-08-22 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136819#comment-16136819
 ] 

Sandish Kumar HN commented on SQOOP-3132:
-

sqoop export --connect jdbc:oracle:thin:@:1521:ORCL --username 
admin --password admin123  --table sqoop_exported_oracle --columns 
SAMPLE_ID,VERYLARGESTRING --hcatalog-table "sqoop_oracle_clob_test"  
--hcatalog-database "default" --map-column-hive "VERYLARGESTRING=clob"

> sqoop export from Hive table stored in Parquet format to Oracle CLOB column 
> results in (null)
> -
>
> Key: SQOOP-3132
> URL: https://issues.apache.org/jira/browse/SQOOP-3132
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors/oracle, hive-integration
>Affects Versions: 1.4.6
> Environment: sandbox
>Reporter: Ramprasad
>Assignee: Sandish Kumar HN
>Priority: Critical
>  Labels: beginner
>
> I am trying to export a String column from a Hive table (stored in Parquet 
> format) to an Oracle CLOB column using sqoop export. Below are the commands I 
> run to create the tables in Oracle and Hive, and the sqoop command I use to 
> export the data.
> Table creation & insert into Hive: 
> create table default.sqoop_oracle_clob_test (sample_id int, verylargestring 
> String) stored as PARQUET; 
> [SUCCESS] 
> insert into default.sqoop_oracle_clob_test (sample_id, verylargestring) 
> values (123, "Really a very large String"); 
> insert into default.sqoop_oracle_clob_test (sample_id, verylargestring) 
> values (456, "Another very large String"); 
> [SUCCESS]
> Table creation in Oracle 
> create table sqoop_exported_oracle (sample_id NUMBER, verylargestring CLOB); 
> [success] 
> Sqoop export command:
> sqoop \
> export \
> --connect jdbc:oracle:thin:@//host:port/database_name \
> --username ** \
> --password ** \
> --table sqoop_exported_oracle \
> --columns SAMPLE_ID,VERYLARGESTRING \
> --map-column-java "VERYLARGESTRING=String" \
> --hcatalog-table "sqoop_oracle_clob_test" \
> --hcatalog-database "default"
> sqoop job executes fine without any error messages and displays the message 
> Exported 2 records.
> The result in Oracle table is as below,
> select * from sqoop_exported_oracle;
> sample_id | verylargestring
> 123 | (null)
> 456 | (null) 
> I tried using --staging-table as well but, resulted in same. I suspect this 
> is a bug while exporting to oracle CLOB columns when retrieving from Hive 
> which is stored in parquet format.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (SQOOP-869) Sqoop can not append data to hive in SequenceFiles format

2017-08-16 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-869:
--

Assignee: Sandish Kumar HN

> Sqoop can not append data to hive in SequenceFiles format
> -
>
> Key: SQOOP-869
> URL: https://issues.apache.org/jira/browse/SQOOP-869
> Project: Sqoop
>  Issue Type: Bug
>  Components: hive-integration
>Affects Versions: 1.4.2
>Reporter: zhangguancheng
>Assignee: Sandish Kumar HN
>
> To reproduce it, do the following:
> 1) sqoop import  --hive-import  --connect 
> jdbc:oracle:thin:@###.###.###.###:###:orcl --username ### --password ### 
> --target-dir /###/### --hive-home /###/### --hive-table ### --as-sequencefile 
>  --query "select # from "###"."###" where \$CONDITIONS" --create-hive-table  
> --class-name ###   --outdir /###/###  --bindir /###/### --map-column-hive 
> ###=STRING,###=BIGINT,###=BIGINT
> 2) sqoop import  --hive-import  --connect 
> jdbc:oracle:thin:@###.###.###.###:###:orcl --username ### --password ### 
> --target-dir /###/### --hive-home /###/### --hive-table ### --as-sequencefile 
>  --query "select # from "TEST1"."BAI" where \$CONDITIONS"   -append
> --class-name ### --outdir /###/###   --bindir /###/### --map-column-hive 
> ###=STRING,###=BIGINT,###=BIGINT 
> And the output of step 2) will be something like:
> {noformat} 
> 12/05/04 23:47:07 INFO hive.HiveImport: OK
> 12/05/04 23:47:07 INFO hive.HiveImport: Time taken: 3.996 seconds
> 12/05/04 23:47:08 INFO hive.HiveImport: Loading data to table default.###
> 12/05/04 23:47:09 INFO hive.HiveImport: Failed with exception 
> java.lang.RuntimeException: java.io.IOException: WritableName can't load 
> class: ***
> 12/05/04 23:47:09 INFO hive.HiveImport: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.MoveTask
> 12/05/04 23:47:09 ERROR tool.ImportTool: Encountered IOException running 
> import job: java.io.IOException: Hive exited with status 9
> at 
> org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:375)
> at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:315)
> at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:227)
> at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:415)
> at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
> at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
> at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
> at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
> at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-2411) Sqoop using '--direct' option fails with mysqldump exit code 2 and 3

2017-08-15 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127254#comment-16127254
 ] 

Sandish Kumar HN commented on SQOOP-2411:
-

Hi [~anna.szonyi],

Yes, we can't solve this issue from the Sqoop side. If you look 
[here|https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/MySQLDumpMapper.java#L437], 
p.waitFor() simply waits for the mysqldump process and returns the exit status it 
reports, so the status code comes from the MySQL side. The failure shows up with 
larger datasets: the dump aborts whenever the transfer runs longer than the 
configured net-read-timeout. I think we can close this issue; correct me if I'm wrong.
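For example, a minimal sketch (values illustrative only) of raising the server-side 
MySQL timeouts before re-running the direct import; net_read_timeout and 
net_write_timeout are the standard MySQL server variables behind the hyphenated 
names used above:

{noformat}
# Minimal sketch: raise the MySQL server timeouts that bound how long the
# mysqldump connection may stay busy, then re-run the Sqoop --direct import.
# The 3600-second values are illustrative; SET GLOBAL needs sufficient
# privileges and only affects sessions opened after the change.
mysql -u root -p -e "SET GLOBAL net_read_timeout = 3600; SET GLOBAL net_write_timeout = 3600;"

# Verify the new settings
mysql -u root -p -e "SHOW VARIABLES LIKE 'net%timeout';"
{noformat}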

> Sqoop using '--direct' option fails with mysqldump exit code 2 and 3
> 
>
> Key: SQOOP-2411
> URL: https://issues.apache.org/jira/browse/SQOOP-2411
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors/mysql
>Affects Versions: 1.4.6
> Environment: Amazon EMR
>Reporter: Karthick H
>Assignee: Sandish Kumar HN
>Priority: Critical
>
> I am running Sqoop in AWS EMR. I am trying to copy a table ~10 GB from MySQL 
> into HDFS.
> I get the following exception
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : 
> attempt_1435664372091_0048_m_00_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 3
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : 
> attempt_1435664372091_0048_m_05_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 2
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:08 INFO mapreduce.Job:  map 0% reduce 0%
> 15/07/06 12:19:20 INFO mapreduce.Job:  map 25% reduce 0%
> 15/07/06 12:19:22 INFO mapreduce.Job:  map 38% reduce 0%
> 15/07/06 12:19:23 INFO mapreduce.Job:  map 50% reduce 0%
> 15/07/06 12:19:24 INFO mapreduce.Job:  map 75% reduce 0%
> 15/07/06 12:19:25 INFO mapreduce.Job:  map 100% reduce 0%
> 15/07/06 12:23:11 INFO mapreduce.Job: Job job_1435664372091_0048 failed with 
> state FAILED due to: Task failed task_1435664372091_0048_m_00
> Job failed as tasks failed. failedMaps:1 failedReduces:0
> 15/07/06 12:23:11 INFO mapreduce.Job: Counters: 8
> Job Counters 
> Failed map tasks=28
> Launched map tasks=28
> Other local map tasks=28
> Total time spent by all maps in occupied slots (ms)=34760760
> Total time spent by all reduces in occupied slots (ms)=0
> Total time spent by all map tasks (ms)=5793460
> Total vcore-seconds taken by all map tasks=5793460
> Total megabyte-seconds taken by all map tasks=8342582400
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group FileSystemCounters is 
> deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
> 15/07/06 12:23:11 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 
> 829.8697 seconds (0 bytes/sec)
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group   
> org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
> org.apache.hadoop.mapreduce.TaskCounter instead
> 15/07/06 12:23:11 INFO mapreduce.ImportJobBase: Retrieved 0 records.
> 15/07/06 12:23:11 ERROR tool.ImportTool: Error during import: Import job 
> failed!
> If I run with out '--direct' option, I get the communication exception as in 
> https://issues.cloudera.org/browse/SQOOP-186
> I have set 'net-write-timeout' and 'net-read-timeout' values in MySQL to 6000.
> My Sqoop command looks like this
> sqoop import -D mapred.task.timeout=0 

[jira] [Commented] (SQOOP-2411) Sqoop using '--direct' option fails with mysqldump exit code 2 and 3

2017-08-11 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16124140#comment-16124140
 ] 

Sandish Kumar HN commented on SQOOP-2411:
-

[~anna.szonyi] This is not a Sqoop error; it comes from MySQL.

The 'net-read-timeout' value in MySQL needs to be increased. Otherwise mysqldump is 
cut off, and Sqoop fails at the line below with: throw new IOException("mysqldump 
terminated with status " + Integer.toString(result));

https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/MySQLDumpMapper.java#L437-L485

Should we close this?

> Sqoop using '--direct' option fails with mysqldump exit code 2 and 3
> 
>
> Key: SQOOP-2411
> URL: https://issues.apache.org/jira/browse/SQOOP-2411
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors/mysql
>Affects Versions: 1.4.6
> Environment: Amazon EMR
>Reporter: Karthick H
>Assignee: Sandish Kumar HN
>Priority: Critical
>
> I am running Sqoop in AWS EMR. I am trying to copy a table ~10 GB from MySQL 
> into HDFS.
> I get the following exception
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : 
> attempt_1435664372091_0048_m_00_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 3
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : 
> attempt_1435664372091_0048_m_05_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 2
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:08 INFO mapreduce.Job:  map 0% reduce 0%
> 15/07/06 12:19:20 INFO mapreduce.Job:  map 25% reduce 0%
> 15/07/06 12:19:22 INFO mapreduce.Job:  map 38% reduce 0%
> 15/07/06 12:19:23 INFO mapreduce.Job:  map 50% reduce 0%
> 15/07/06 12:19:24 INFO mapreduce.Job:  map 75% reduce 0%
> 15/07/06 12:19:25 INFO mapreduce.Job:  map 100% reduce 0%
> 15/07/06 12:23:11 INFO mapreduce.Job: Job job_1435664372091_0048 failed with 
> state FAILED due to: Task failed task_1435664372091_0048_m_00
> Job failed as tasks failed. failedMaps:1 failedReduces:0
> 15/07/06 12:23:11 INFO mapreduce.Job: Counters: 8
> Job Counters 
> Failed map tasks=28
> Launched map tasks=28
> Other local map tasks=28
> Total time spent by all maps in occupied slots (ms)=34760760
> Total time spent by all reduces in occupied slots (ms)=0
> Total time spent by all map tasks (ms)=5793460
> Total vcore-seconds taken by all map tasks=5793460
> Total megabyte-seconds taken by all map tasks=8342582400
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group FileSystemCounters is 
> deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
> 15/07/06 12:23:11 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 
> 829.8697 seconds (0 bytes/sec)
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group   
> org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
> org.apache.hadoop.mapreduce.TaskCounter instead
> 15/07/06 12:23:11 INFO mapreduce.ImportJobBase: Retrieved 0 records.
> 15/07/06 12:23:11 ERROR tool.ImportTool: Error during import: Import job 
> failed!
> If I run with out '--direct' option, I get the communication exception as in 
> https://issues.cloudera.org/browse/SQOOP-186
> I have set 'net-write-timeout' and 'net-read-timeout' values in MySQL to 6000.
> My Sqoop command looks like this
> sqoop import -D 

[jira] [Comment Edited] (SQOOP-2411) Sqoop using '--direct' option fails with mysqldump exit code 2 and 3

2017-08-11 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16124140#comment-16124140
 ] 

Sandish Kumar HN edited comment on SQOOP-2411 at 8/11/17 9:46 PM:
--

[~anna.szonyi] This is not a Sqoop error; it comes from MySQL.

The 'net-read-timeout' value in MySQL needs to be increased. Otherwise mysqldump is 
cut off, and Sqoop fails at the link below with: throw new IOException("mysqldump 
terminated with status " + Integer.toString(result));

https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/MySQLDumpMapper.java#L437-L485

Should we close this?


was (Author: sanysand...@gmail.com):
[~anna.szonyi] This is not SQOOP error. it's from MySQL.

need to increase  'net-read-timeout' values in MySQL. otherwise, it breaks at 
[https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/MySQLDumpMapper.java#L437-L485
] throws this error "throw new IOException("mysqldump terminated with status "+ 
Integer.toString(result));"

https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/MySQLDumpMapper.java#L437-L485

Should we close this?

> Sqoop using '--direct' option fails with mysqldump exit code 2 and 3
> 
>
> Key: SQOOP-2411
> URL: https://issues.apache.org/jira/browse/SQOOP-2411
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors/mysql
>Affects Versions: 1.4.6
> Environment: Amazon EMR
>Reporter: Karthick H
>Assignee: Sandish Kumar HN
>Priority: Critical
>
> I am running Sqoop in AWS EMR. I am trying to copy a table ~10 GB from MySQL 
> into HDFS.
> I get the following exception
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : 
> attempt_1435664372091_0048_m_00_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 3
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : 
> attempt_1435664372091_0048_m_05_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 2
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:08 INFO mapreduce.Job:  map 0% reduce 0%
> 15/07/06 12:19:20 INFO mapreduce.Job:  map 25% reduce 0%
> 15/07/06 12:19:22 INFO mapreduce.Job:  map 38% reduce 0%
> 15/07/06 12:19:23 INFO mapreduce.Job:  map 50% reduce 0%
> 15/07/06 12:19:24 INFO mapreduce.Job:  map 75% reduce 0%
> 15/07/06 12:19:25 INFO mapreduce.Job:  map 100% reduce 0%
> 15/07/06 12:23:11 INFO mapreduce.Job: Job job_1435664372091_0048 failed with 
> state FAILED due to: Task failed task_1435664372091_0048_m_00
> Job failed as tasks failed. failedMaps:1 failedReduces:0
> 15/07/06 12:23:11 INFO mapreduce.Job: Counters: 8
> Job Counters 
> Failed map tasks=28
> Launched map tasks=28
> Other local map tasks=28
> Total time spent by all maps in occupied slots (ms)=34760760
> Total time spent by all reduces in occupied slots (ms)=0
> Total time spent by all map tasks (ms)=5793460
> Total vcore-seconds taken by all map tasks=5793460
> Total megabyte-seconds taken by all map tasks=8342582400
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group FileSystemCounters is 
> deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
> 15/07/06 12:23:11 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 
> 829.8697 seconds (0 bytes/sec)
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group   
> org.apache.hadoop.mapred.Task$Counter is 

[jira] [Commented] (SQOOP-3132) sqoop export from Hive table stored in Parquet format to Oracle CLOB column results in (null)

2017-08-11 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16123970#comment-16123970
 ] 

Sandish Kumar HN commented on SQOOP-3132:
-

This is the underlying error:
ERROR [main] org.apache.sqoop.mapreduce.hcat.SqoopHCatExportHelper: Cannot convert 
HCatalog object of type string to java object type com.cloudera.sqoop.lib.ClobRef

> sqoop export from Hive table stored in Parquet format to Oracle CLOB column 
> results in (null)
> -
>
> Key: SQOOP-3132
> URL: https://issues.apache.org/jira/browse/SQOOP-3132
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors/oracle, hive-integration
>Affects Versions: 1.4.6
> Environment: sandbox
>Reporter: Ramprasad
>Assignee: Sandish Kumar HN
>Priority: Critical
>  Labels: beginner
>
> I am trying to export a String column from Hive table (stored in Parquet 
> format) to Oracle CLOB data type column using sqoop export. Below are the 
> commands I run for creation of tables in Oracle & Hive and, the sqoop command 
> I use to to export the data.
> Table creation & insert into Hive: 
> create table default.sqoop_oracle_clob_test (sample_id int, verylargestring 
> String) stored as PARQUET; 
> [SUCCESS] 
> insert into default.sqoop_oracle_clob_test (sample_id, verylargestring) 
> values (123, "Really a very large String"); 
> insert into default.sqoop_oracle_clob_test (sample_id, verylargestring) 
> values (456, "Another very large String"); 
> [SUCCESS]
> Table creation in Oracle 
> create table sqoop_exported_oracle (sample_id NUMBER, verylargestring CLOB); 
> [success] 
> Sqoop export command:
> sqoop \
> export \
> --connect jdbc:oracle:thin:@//host:port/database_name \
> --username ** \
> --password ** \
> --table sqoop_exported_oracle \
> --columns SAMPLE_ID,VERYLARGESTRING \
> --map-column-java "VERYLARGESTRING=String" \
> --hcatalog-table "sqoop_oracle_clob_test" \
> --hcatalog-database "default"
> sqoop job executes fine without any error messages and displays the message 
> Exported 2 records.
> The result in Oracle table is as below,
> select * from sqoop_exported_oracle;
> sample_id | verylargestring
> 123 | (null)
> 456 | (null) 
> I tried using --staging-table as well but, resulted in same. I suspect this 
> is a bug while exporting to oracle CLOB columns when retrieving from Hive 
> which is stored in parquet format.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (SQOOP-2411) Sqoop using '--direct' option fails with mysqldump exit code 2 and 3

2017-08-11 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-2411:
---

Assignee: Sandish Kumar HN

> Sqoop using '--direct' option fails with mysqldump exit code 2 and 3
> 
>
> Key: SQOOP-2411
> URL: https://issues.apache.org/jira/browse/SQOOP-2411
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors/mysql
>Affects Versions: 1.4.6
> Environment: Amazon EMR
>Reporter: Karthick H
>Assignee: Sandish Kumar HN
>Priority: Critical
>
> I am running Sqoop in AWS EMR. I am trying to copy a table ~10 GB from MySQL 
> into HDFS.
> I get the following exception
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : 
> attempt_1435664372091_0048_m_00_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 3
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : 
> attempt_1435664372091_0048_m_05_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 2
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:08 INFO mapreduce.Job:  map 0% reduce 0%
> 15/07/06 12:19:20 INFO mapreduce.Job:  map 25% reduce 0%
> 15/07/06 12:19:22 INFO mapreduce.Job:  map 38% reduce 0%
> 15/07/06 12:19:23 INFO mapreduce.Job:  map 50% reduce 0%
> 15/07/06 12:19:24 INFO mapreduce.Job:  map 75% reduce 0%
> 15/07/06 12:19:25 INFO mapreduce.Job:  map 100% reduce 0%
> 15/07/06 12:23:11 INFO mapreduce.Job: Job job_1435664372091_0048 failed with 
> state FAILED due to: Task failed task_1435664372091_0048_m_00
> Job failed as tasks failed. failedMaps:1 failedReduces:0
> 15/07/06 12:23:11 INFO mapreduce.Job: Counters: 8
> Job Counters 
> Failed map tasks=28
> Launched map tasks=28
> Other local map tasks=28
> Total time spent by all maps in occupied slots (ms)=34760760
> Total time spent by all reduces in occupied slots (ms)=0
> Total time spent by all map tasks (ms)=5793460
> Total vcore-seconds taken by all map tasks=5793460
> Total megabyte-seconds taken by all map tasks=8342582400
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group FileSystemCounters is 
> deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
> 15/07/06 12:23:11 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 
> 829.8697 seconds (0 bytes/sec)
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group   
> org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
> org.apache.hadoop.mapreduce.TaskCounter instead
> 15/07/06 12:23:11 INFO mapreduce.ImportJobBase: Retrieved 0 records.
> 15/07/06 12:23:11 ERROR tool.ImportTool: Error during import: Import job 
> failed!
> If I run with out '--direct' option, I get the communication exception as in 
> https://issues.cloudera.org/browse/SQOOP-186
> I have set 'net-write-timeout' and 'net-read-timeout' values in MySQL to 6000.
> My Sqoop command looks like this
> sqoop import -D mapred.task.timeout=0 --fields-terminated-by '\t' 
> --escaped-by '\\' --optionally-enclosed-by '\"' --bindir ./ --connect 
> jdbc:mysql:/// --username tuser --password tuser --table 
> table1 --target-dir=/base/table1 --split-by id -m 8 --direct



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (SQOOP-3215) Sqoop create hive table to support other formats(avro,parquet)

2017-08-11 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3215:
---

Assignee: Sandish Kumar HN  (was: Eric Lin)

> Sqoop create hive table to support other formats(avro,parquet)
> --
>
> Key: SQOOP-3215
> URL: https://issues.apache.org/jira/browse/SQOOP-3215
> Project: Sqoop
>  Issue Type: Improvement
>Affects Versions: 1.4.6
>Reporter: Nitish Khanna
>Assignee: Sandish Kumar HN
>
> Hi Team,
> Sqoop doesn't support any other format apart from text format when we make 
> use of "create-hive-table".
> It would be great if sqoop could create avro,parquet etc format table(schema 
> only).
> I tried below command to create avro format table in hive.
> [root@host-10-17-81-13 ~]# sqoop create-hive-table --connect $MYCONN 
> --username $MYUSER --password $MYPSWD --table test_table --hive-table 
> test_table_avro --as-avrodatafile
> Warning: 
> /opt/cloudera/parcels/CDH-5.8.3-1.cdh5.8.3.p0.2/bin/../lib/sqoop/../accumulo 
> does not exist! Accumulo imports will fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> 17/07/26 21:23:38 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.8.3
> 17/07/26 21:23:38 WARN tool.BaseSqoopTool: Setting your password on the 
> command-line is insecure. Consider using -P instead.
> 17/07/26 21:23:38 ERROR tool.BaseSqoopTool: Error parsing arguments for 
> create-hive-table:
> 17/07/26 21:23:38 ERROR tool.BaseSqoopTool: Unrecognized argument: 
> --as-avrodatafile
> Please correct me if i missed anything.
> Regards
> Nitish Khanna



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Issue Comment Deleted] (SQOOP-3215) Sqoop create hive table to support other formats(avro,parquet)

2017-08-11 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated SQOOP-3215:

Comment: was deleted

(was: [~ericlin] I'm interested to work on this? can I take this??)

> Sqoop create hive table to support other formats(avro,parquet)
> --
>
> Key: SQOOP-3215
> URL: https://issues.apache.org/jira/browse/SQOOP-3215
> Project: Sqoop
>  Issue Type: Improvement
>Affects Versions: 1.4.6
>Reporter: Nitish Khanna
>Assignee: Sandish Kumar HN
>
> Hi Team,
> Sqoop doesn't support any other format apart from text format when we make 
> use of "create-hive-table".
> It would be great if sqoop could create avro,parquet etc format table(schema 
> only).
> I tried below command to create avro format table in hive.
> [root@host-10-17-81-13 ~]# sqoop create-hive-table --connect $MYCONN 
> --username $MYUSER --password $MYPSWD --table test_table --hive-table 
> test_table_avro --as-avrodatafile
> Warning: 
> /opt/cloudera/parcels/CDH-5.8.3-1.cdh5.8.3.p0.2/bin/../lib/sqoop/../accumulo 
> does not exist! Accumulo imports will fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> 17/07/26 21:23:38 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.8.3
> 17/07/26 21:23:38 WARN tool.BaseSqoopTool: Setting your password on the 
> command-line is insecure. Consider using -P instead.
> 17/07/26 21:23:38 ERROR tool.BaseSqoopTool: Error parsing arguments for 
> create-hive-table:
> 17/07/26 21:23:38 ERROR tool.BaseSqoopTool: Unrecognized argument: 
> --as-avrodatafile
> Please correct me if i missed anything.
> Regards
> Nitish Khanna



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (SQOOP-3175) Sqoop1 (import + --as-parquetfile) writes data to wrong Hive table if same table name exists in Hive default database

2017-08-11 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3175:
---

Assignee: Sandish Kumar HN

> Sqoop1 (import + --as-parquetfile) writes data to wrong Hive table if same 
> table name exists in Hive default database
> -
>
> Key: SQOOP-3175
> URL: https://issues.apache.org/jira/browse/SQOOP-3175
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Markus Kemper
>Assignee: Sandish Kumar HN
>
> Sqoop1 (import + --as-parquetfile) writes data to wrong Hive table if same 
> table name exists in Hive default database.  The test case below demonstrates 
> this issue.
> *Test Case*
> {noformat}
> 
> # Issue: Data files written to the wrong table with Parquet
> 
> #
> # STEP 01 - Create Table and Data
> #
> export MYCONN=jdbc:mysql://mysql.cloudera.com:3306/sqoop
> export MYUSER=sqoop
> export MYPSWD=cloudera
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "drop table t1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "create table t1 (c1 int, c2 date, c3 varchar(10))"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1 values (1, current_date, 'new row 1')"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1"
> -
> | c1  | c2 | c3 | 
> -
> | 1   | 2017-04-22 | new row 1  | 
> -
> #
> # STEP 02 - Create HDFS Structures
> #
> beeline -u jdbc:hive2:// -e "drop database db1 cascade; drop database db2 
> cascade; drop database db3 cascade;"
> sudo -u hdfs hdfs dfs -rm -r /data/tmp
> sudo -u hdfs hdfs dfs -rm -r /data/dbs
> sudo -u hdfs hdfs dfs -mkdir -p /data/tmp
> sudo -u hdfs hdfs dfs -chmod 777 /data/tmp
> sudo -u hdfs hdfs dfs -mkdir -p /data/dbs
> sudo -u hdfs hdfs dfs -chmod 777 /data/dbs
> #
> # STEP 03 - Create Hive Databases
> #
> beeline -u jdbc:hive2:// -e "create database db1 location '/data/dbs/db1'; 
> create database db2 location '/data/dbs/db2';"
> beeline -u jdbc:hive2:// -e "show databases; describe database default; 
> describe database db1; describe database db2;"
> beeline -u jdbc:hive2:// -e "use default; show tables; use db1; show tables; 
> use db2; show tables;"
> hdfs dfs -ls -R /user/hive/warehouse /data
> +--++--+-+-+-+--+
> | db_name  |comment | location
>  | owner_name  | owner_type  | parameters  |
> +--++--+-+-+-+--+
> | default  | Default Hive database  | hdfs://nameservice1/user/hive/warehouse 
>  | public  | ROLE| |
> +--++--+-+-+-+--+
> +--+--+---+-+-+-+--+
> | db_name  | comment  | location  | owner_name  | 
> owner_type  | parameters  |
> +--+--+---+-+-+-+--+
> | db1  |  | hdfs://nameservice1/data/dbs/db1  | root| 
> USER| |
> +--+--+---+-+-+-+--+
> +--+--+---+-+-+-+--+
> | db_name  | comment  | location  | owner_name  | 
> owner_type  | parameters  |
> +--+--+---+-+-+-+--+
> | db2  |  | hdfs://nameservice1/data/dbs/db2  | root| 
> USER| |
> +--+--+---+-+-+-+--+
> ~
> +---+--+
> | tab_name  |
> +---+--+
> +---+--+
> +---+--+
> | tab_name  |
> +---+--+
> +---+--+
> +---+--+
> | tab_name  |
> +---+--+
> +---+--+
> ~
> drwxrwxrwx   - root supergroup  0 2017-04-22 16:22 /data/dbs/db1
> drwxrwxrwx   - root supergroup  0 

[jira] [Assigned] (SQOOP-3132) sqoop export from Hive table stored in Parquet format to Oracle CLOB column results in (null)

2017-08-10 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3132:
---

Assignee: Sandish Kumar HN

> sqoop export from Hive table stored in Parquet format to Oracle CLOB column 
> results in (null)
> -
>
> Key: SQOOP-3132
> URL: https://issues.apache.org/jira/browse/SQOOP-3132
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors/oracle, hive-integration
>Affects Versions: 1.4.6
> Environment: sandbox
>Reporter: Ramprasad
>Assignee: Sandish Kumar HN
>Priority: Critical
>  Labels: beginner
>
> I am trying to export a String column from Hive table (stored in Parquet 
> format) to Oracle CLOB data type column using sqoop export. Below are the 
> commands I run for creation of tables in Oracle & Hive and, the sqoop command 
> I use to to export the data.
> Table creation & insert into Hive: 
> create table default.sqoop_oracle_clob_test (sample_id int, verylargestring 
> String) stored as PARQUET; 
> [SUCCESS] 
> insert into default.sqoop_oracle_clob_test (sample_id, verylargestring) 
> values (123, "Really a very large String"); 
> insert into default.sqoop_oracle_clob_test (sample_id, verylargestring) 
> values (456, "Another very large String"); 
> [SUCCESS]
> Table creation in Oracle 
> create table sqoop_exported_oracle (sample_id NUMBER, verylargestring CLOB); 
> [success] 
> Sqoop export command:
> sqoop \
> export \
> --connect jdbc:oracle:thin:@//host:port/database_name \
> --username ** \
> --password ** \
> --table sqoop_exported_oracle \
> --columns SAMPLE_ID,VERYLARGESTRING \
> --map-column-java "VERYLARGESTRING=String" \
> --hcatalog-table "sqoop_oracle_clob_test" \
> --hcatalog-database "default"
> sqoop job executes fine without any error messages and displays the message 
> Exported 2 records.
> The result in Oracle table is as below,
> select * from sqoop_exported_oracle;
> sample_id | verylargestring
> 123 | (null)
> 456 | (null) 
> I tried using --staging-table as well but, resulted in same. I suspect this 
> is a bug while exporting to oracle CLOB columns when retrieving from Hive 
> which is stored in parquet format.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (SQOOP-3187) Sqoop import as PARQUET to S3 failed

2017-08-10 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN resolved SQOOP-3187.
-
   Resolution: Fixed
Fix Version/s: 1.4.7

Fixed by the Kite SDK upgrade done in SQOOP-3192.
Example run:
sqoop import -D fs.s3n.awsAccessKeyId="" -D fs.s3n.awsSecretAccessKey="" 
--connect jdbc:mysql://localhost:3306/db1 --username  --password 
 --query "select * from t1 where \$CONDITIONS" --num-mappers 1 
--target-dir s3n://bucket/dataset/outfolder --as-parquetfile

> Sqoop import as PARQUET to S3 failed
> 
>
> Key: SQOOP-3187
> URL: https://issues.apache.org/jira/browse/SQOOP-3187
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
>Reporter: Surendra Nichenametla
>Assignee: Sandish Kumar HN
> Fix For: 1.4.7
>
>
> Sqoop import as parquet file to S3 fails. Command and error are give below.
> However, import to a HDFS location works though.
> sqoop import --connect "jdbc:oracle:thin:@:1521/ORCL" --table 
> mytable --username myuser --password mypass --target-dir s3://bucket/foo/bar/ 
> --columns col1,col2 -m1 --as-parquetfile
> 17/05/09 21:00:18 ERROR tool.ImportTool: Imported Failed: Wrong FS: 
> s3://bucket/foo/bar, expected: hdfs://master-ip:8020
> P.S. I tried this from Amazon EMR cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-3215) Sqoop create hive table to support other formats(avro,parquet)

2017-08-10 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121426#comment-16121426
 ] 

Sandish Kumar HN commented on SQOOP-3215:
-

[~ericlin] I'm interested in working on this. Can I take it?

> Sqoop create hive table to support other formats(avro,parquet)
> --
>
> Key: SQOOP-3215
> URL: https://issues.apache.org/jira/browse/SQOOP-3215
> Project: Sqoop
>  Issue Type: Improvement
>Affects Versions: 1.4.6
>Reporter: Nitish Khanna
>Assignee: Eric Lin
>
> Hi Team,
> Sqoop doesn't support any other format apart from text format when we make 
> use of "create-hive-table".
> It would be great if sqoop could create avro,parquet etc format table(schema 
> only).
> I tried below command to create avro format table in hive.
> [root@host-10-17-81-13 ~]# sqoop create-hive-table --connect $MYCONN 
> --username $MYUSER --password $MYPSWD --table test_table --hive-table 
> test_table_avro --as-avrodatafile
> Warning: 
> /opt/cloudera/parcels/CDH-5.8.3-1.cdh5.8.3.p0.2/bin/../lib/sqoop/../accumulo 
> does not exist! Accumulo imports will fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> 17/07/26 21:23:38 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.8.3
> 17/07/26 21:23:38 WARN tool.BaseSqoopTool: Setting your password on the 
> command-line is insecure. Consider using -P instead.
> 17/07/26 21:23:38 ERROR tool.BaseSqoopTool: Error parsing arguments for 
> create-hive-table:
> 17/07/26 21:23:38 ERROR tool.BaseSqoopTool: Unrecognized argument: 
> --as-avrodatafile
> Please correct me if i missed anything.
> Regards
> Nitish Khanna



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (SQOOP-3187) Sqoop import as PARQUET to S3 failed

2017-08-10 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3187:
---

Assignee: Sandish Kumar HN  (was: Eric Lin)

> Sqoop import as PARQUET to S3 failed
> 
>
> Key: SQOOP-3187
> URL: https://issues.apache.org/jira/browse/SQOOP-3187
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
>Reporter: Surendra Nichenametla
>Assignee: Sandish Kumar HN
>
> Sqoop import as parquet file to S3 fails. Command and error are give below.
> However, import to a HDFS location works though.
> sqoop import --connect "jdbc:oracle:thin:@:1521/ORCL" --table 
> mytable --username myuser --password mypass --target-dir s3://bucket/foo/bar/ 
> --columns col1,col2 -m1 --as-parquetfile
> 17/05/09 21:00:18 ERROR tool.ImportTool: Imported Failed: Wrong FS: 
> s3://bucket/foo/bar, expected: hdfs://master-ip:8020
> P.S. I tried this from Amazon EMR cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2017-08-09 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119793#comment-16119793
 ] 

Sandish Kumar HN commented on SQOOP-2907:
-

Hi [~yuan_zac] & [~514793...@qq.com]

I see multiple submissions for this issue. Since you both gave me permission to 
submit to the review board, I have submitted one as well, but only one of us can own 
this. Could you both ( [~yuan_zac] & [~514793...@qq.com] ) tell me who the right 
person to own this should be?

> Export parquet files to RDBMS: don't require .metadata for parquet files
> 
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
>  Issue Type: Improvement
>  Components: metastore
>Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
> export parquet files to Oracle
>Reporter: Ruslan Dautkhanov
>Assignee: Sandish Kumar HN
>  Labels: sqoop
> Attachments: SQOOP-2907-3.patch, SQOOP-2907.patch, SQOOP-2907.patch1, 
> SQOOP-2907.patch2
>
>
> Kite currently requires .metadata.
> Parquet files have their own metadata stored along data files.
> It would be great for Export operation on parquet files to RDBMS not to 
> require .metadata.
> We have most of the files created by Spark and Hive, and they don't create 
> .metadata, it only Kite that does.
> It makes sqoop export of parquet files usability very limited.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2017-08-09 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119472#comment-16119472
 ] 

Sandish Kumar HN commented on SQOOP-2907:
-

Hi [~yuan_zac]
To get your change committed please do the following:
Create a review request at Apache's review board for project Sqoop and link it 
to this JIRA ticket: https://reviews.apache.org/
Please consider the guidelines below:
Review board:
- Summary: generate your summary from the issue's jira key + jira title
- Groups: add the relevant group (Sqoop) so everyone on the project will know about 
your patch
- Bugs: add the issue's jira key so it's easy to navigate to the jira side
- Repository: sqoop-trunk for Sqoop1 or sqoop-sqoop2 for Sqoop2
As soon as the patch gets committed, it is very useful for the community if you 
close the review and mark it as "Submitted" on the Review board. The button to do 
this is at the top right of your own tickets, right next to the Download Diff button.
Jira:
- Link: add the link of the review as an external/web link so it's easy to navigate 
to the review side
- Status: mark it as "patch available"
The Sqoop community will receive emails about your new ticket and review request 
and will review your change.

> Export parquet files to RDBMS: don't require .metadata for parquet files
> 
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
>  Issue Type: Improvement
>  Components: metastore
>Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
> export parquet files to Oracle
>Reporter: Ruslan Dautkhanov
>Assignee: Sandish Kumar HN
> Attachments: SQOOP-2907.patch, SQOOP-2907.patch1
>
>
> Kite currently requires .metadata.
> Parquet files have their own metadata stored along data files.
> It would be great for Export operation on parquet files to RDBMS not to 
> require .metadata.
> We have most of the files created by Spark and Hive, and they don't create 
> .metadata, it only Kite that does.
> It makes sqoop export of parquet files usability very limited.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2017-08-08 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118921#comment-16118921
 ] 

Sandish Kumar HN commented on SQOOP-2907:
-

[~514793...@qq.com] gave me permission to submit. 

> Export parquet files to RDBMS: don't require .metadata for parquet files
> 
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
>  Issue Type: Improvement
>  Components: metastore
>Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
> export parquet files to Oracle
>Reporter: Ruslan Dautkhanov
>Assignee: Sandish Kumar HN
> Attachments: SQOOP-2907.patch, SQOOP-2907.patch1
>
>
> Kite currently requires .metadata.
> Parquet files have their own metadata stored along data files.
> It would be great for Export operation on parquet files to RDBMS not to 
> require .metadata.
> We have most of the files created by Spark and Hive, and they don't create 
> .metadata, it only Kite that does.
> It makes sqoop export of parquet files usability very limited.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2017-08-08 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118234#comment-16118234
 ] 

Sandish Kumar HN edited comment on SQOOP-2907 at 8/8/17 11:52 AM:
--

Hi [~anna.szonyi] 

It seems this issue has been around for a long time.
The attached SQOOP-2907.patch1 works fine with a small inline change in AvroUtil. 
May I submit the patch to the review board?


was (Author: sanysand...@gmail.com):
[~anna.szonyi] 

It seems issue there from a long time.
The attached SQOOP-2907.patch1 works fine with small inline change at AvroUtil. 
 can I submit the patch at review board?? 

> Export parquet files to RDBMS: don't require .metadata for parquet files
> 
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
>  Issue Type: Improvement
>  Components: metastore
>Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
> export parquet files to Oracle
>Reporter: Ruslan Dautkhanov
>Assignee: Sandish Kumar HN
> Attachments: SQOOP-2907.patch, SQOOP-2907.patch1
>
>
> Kite currently requires .metadata.
> Parquet files have their own metadata stored along data files.
> It would be great for Export operation on parquet files to RDBMS not to 
> require .metadata.
> We have most of the files created by Spark and Hive, and they don't create 
> .metadata, it only Kite that does.
> It makes sqoop export of parquet files usability very limited.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2017-08-08 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118234#comment-16118234
 ] 

Sandish Kumar HN commented on SQOOP-2907:
-

[~anna.szonyi] 

It seems this issue has been around for a long time.
The attached SQOOP-2907.patch1 works fine with a small inline change in AvroUtil. 
May I submit the patch to the review board?

> Export parquet files to RDBMS: don't require .metadata for parquet files
> 
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
>  Issue Type: Improvement
>  Components: metastore
>Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
> export parquet files to Oracle
>Reporter: Ruslan Dautkhanov
>Assignee: Sandish Kumar HN
> Attachments: SQOOP-2907.patch, SQOOP-2907.patch1
>
>
> Kite currently requires .metadata.
> Parquet files have their own metadata stored along data files.
> It would be great for Export operation on parquet files to RDBMS not to 
> require .metadata.
> We have most of the files created by Spark and Hive, and they don't create 
> .metadata, it only Kite that does.
> It makes sqoop export of parquet files usability very limited.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (SQOOP-2907) Export parquet files to RDBMS: don't require .metadata for parquet files

2017-08-03 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-2907:
---

Assignee: Sandish Kumar HN

> Export parquet files to RDBMS: don't require .metadata for parquet files
> 
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
>  Issue Type: Improvement
>  Components: metastore
>Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
> export parquet files to Oracle
>Reporter: Ruslan Dautkhanov
>Assignee: Sandish Kumar HN
> Attachments: SQOOP-2907.patch, SQOOP-2907.patch1
>
>
> Kite currently requires .metadata.
> Parquet files have their own metadata stored along data files.
> It would be great for Export operation on parquet files to RDBMS not to 
> require .metadata.
> We have most of the files created by Spark and Hive, and they don't create 
> .metadata, it only Kite that does.
> It makes sqoop export of parquet files usability very limited.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SQOOP-3178) SQOOP PARQUET INCREMENTAL MERGE

2017-07-21 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16097117#comment-16097117
 ] 

Sandish Kumar HN commented on SQOOP-3178:
-

Thanks for accepting my patch, [~anna.szonyi]. I'm looking forward to contributing 
more patches in the future.

> SQOOP PARQUET INCREMENTAL MERGE 
> 
>
> Key: SQOOP-3178
> URL: https://issues.apache.org/jira/browse/SQOOP-3178
> Project: Sqoop
>  Issue Type: Improvement
>  Components: build, codegen, connectors
> Environment: None
>Reporter: Sandish Kumar HN
>Assignee: Sandish Kumar HN
>Priority: Blocker
>  Labels: features, newbie, sqoop
>
> Currently, sqoop-1 only supports merging of two Parquet format data sets but 
> it doesn't support to do incremental merge, so I have written a Sqoop 
> Incremental Merge MR for Parquet File Format and I have tested with million 
> records of data with N number of iterations.
> blocked by issue https://issues.apache.org/jira/browse/SQOOP-3192



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (SQOOP-3181) Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with (Could not find class .)

2017-06-16 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3181:
---

Assignee: Sandish Kumar HN

> Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with 
> (Could not find class .)
> ---
>
> Key: SQOOP-3181
> URL: https://issues.apache.org/jira/browse/SQOOP-3181
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Markus Kemper
>Assignee: Sandish Kumar HN
>
> Sqoop1 (import + --incremental + --merge-key + --as-parquetfile) fails with 
> (Could not find class .).  See test case below
> *Test Case*
> {noformat}
> #
> # STEP 01 - Create Table and Data
> #
> export MYCONN=jdbc:oracle:thin:@oracle.sqoop.com:1521/db11g;
> export MYUSER=sqoop
> export MYPSWD=sqoop
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "drop table t1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "create table t1 (c1 int, c2 date, c3 varchar(10), c4 timestamp)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1 values (1, sysdate, 'NEW ROW 1', sysdate)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1"
> Output:
> -
> | C1   | C2  | C3 | C4  | 
> -
> | 1| 2017-05-06 06:59:02.0 | NEW ROW 1  | 2017-05-06 
> 06:59:02 | 
> -
> #
> # STEP 02 - Import Data into HDFS 
> #
> hdfs dfs -rm -r /user/root/t1
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 
> --merge-key C1 --last-value '2017-01-01 00:00:00.0' --as-parquetfile 
> --map-column-java C2=String,C4=String --num-mappers 1 --verbose 
> hdfs dfs -ls /user/root/t1/*.parquet
> parquet-tools cat --json 
> 'hdfs://namenode/user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet'
> Output:
> 17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Transferred 2.627 KB in 
> 23.6174 seconds (113.8988 bytes/sec)
> 17/05/06 07:01:34 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> 17/05/06 07:01:34 INFO tool.ImportTool:   --last-value 2017-05-06 07:01:09.0
> ~
> -rw-r--r--   3 root root   1144 2017-05-06 07:01 
> /user/root/t1/b65c1ca5-c8f0-44c6-8c60-8ee83161347f.parquet
> ~
> {"C1":"1","C2":"2017-05-06 06:59:02.0","C3":"NEW ROW 1","C4":"2017-05-06 
> 06:59:02"}
> #
> # STEP 03 - Insert New Row and Update Existing Row
> #
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "insert into t1 values (2, sysdate, 'NEW ROW 2', sysdate)"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "update t1 set c3 = 'UPDATE 1', c4 = sysdate where c1 = 1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query 
> "select * from t1 order by c1"
> Output:
> -
> | C1   | C2  | C3 | C4  | 
> -
> | 1| 2017-05-06 06:59:02.0 | UPDATE 1   | 2017-05-06 
> 07:04:40 | 
> | 2| 2017-05-06 07:04:38.0 | NEW ROW 2  | 2017-05-06 
> 07:04:38 | 
> -
> #
> # STEP 04 - Import Data into HDFS and Merge changes 
> #
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table 
> T1 --target-dir /user/root/t1 --incremental lastmodified --check-column C4 
> --merge-key C1 --last-value '2017-05-06 07:01:09.0' --as-parquetfile 
> --map-column-java C2=String,C4=String --num-mappers 1 --verbose 
> Output:
> 17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Transferred 2.6611 KB in 
> 27.4934 seconds (99.1148 bytes/sec)
> 17/05/06 07:06:43 INFO mapreduce.ImportJobBase: Retrieved 2 records.
> 17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Restoring classloader: 
> java.net.FactoryURLClassLoader@121fdcee
> 17/05/06 07:06:43 INFO tool.ImportTool: Final destination exists, will run 
> merge job.
> 17/05/06 07:06:43 DEBUG tool.ImportTool: Using temporary folder: 
> 4bc6b65cd0194b81938f4660974ee392_T1
> 17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Checking for existing class: T1
> 17/05/06 07:06:43 DEBUG util.ClassLoaderStack: Attempting to load jar through 
> URL: 
> 

[jira] [Resolved] (SQOOP-3192) upgrade parquet

2017-06-08 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN resolved SQOOP-3192.
-
Resolution: Fixed

Fixed

> upgrade parquet
> ---
>
> Key: SQOOP-3192
> URL: https://issues.apache.org/jira/browse/SQOOP-3192
> Project: Sqoop
>  Issue Type: Improvement
>  Components: codegen
>Reporter: Sandish Kumar HN
>Assignee: Sandish Kumar HN
>  Labels: features, newbie, test
>
> Found the problem with the parquet incremental merge tests with PARQUET-MR 
> 1.4.1 which is the current version used by SQOOP. these issues have been 
> fixed with a new version of PARQUET-MR.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (SQOOP-3192) upgrade parquet

2017-06-08 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3192:
---

Assignee: Sandish Kumar HN

> upgrade parquet
> ---
>
> Key: SQOOP-3192
> URL: https://issues.apache.org/jira/browse/SQOOP-3192
> Project: Sqoop
>  Issue Type: Improvement
>  Components: codegen
>Reporter: Sandish Kumar HN
>Assignee: Sandish Kumar HN
>  Labels: features, newbie, test
>
> Found the problem with the parquet incremental merge tests with PARQUET-MR 
> 1.4.1 which is the current version used by SQOOP. these issues have been 
> fixed with a new version of PARQUET-MR.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (SQOOP-3178) SQOOP PARQUET INCREMENTAL MERGE

2017-06-08 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN reassigned SQOOP-3178:
---

Assignee: Sandish Kumar HN

> SQOOP PARQUET INCREMENTAL MERGE 
> 
>
> Key: SQOOP-3178
> URL: https://issues.apache.org/jira/browse/SQOOP-3178
> Project: Sqoop
>  Issue Type: Improvement
>  Components: build, codegen, connectors
> Environment: None
>Reporter: Sandish Kumar HN
>Assignee: Sandish Kumar HN
>Priority: Blocker
>  Labels: features, newbie
>
> Currently, Sqoop 1 only supports merging two Parquet-format data sets; it 
> does not support incremental merge. I have written a Sqoop incremental 
> merge MR job for the Parquet file format and tested it with millions of 
> records over many iterations.
> blocked by issue https://issues.apache.org/jira/browse/SQOOP-3192
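
A minimal, self-contained sketch of the merge-by-key idea described above.
This is not the attached patch: plain Java collections stand in for Parquet
records, and in the real job the same logic would run in a MapReduce reducer
keyed on the --merge-key column.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of an incremental merge: keep one record per merge key,
// with records from the newer data set overriding the older ones.
public class IncrementalMergeSketch {

  static List<Map<String, String>> merge(List<Map<String, String>> oldRows,
                                         List<Map<String, String>> newRows,
                                         String mergeKey) {
    Map<String, Map<String, String>> byKey = new LinkedHashMap<>();
    for (Map<String, String> row : oldRows) {
      byKey.put(row.get(mergeKey), row);   // seed with the existing data set
    }
    for (Map<String, String> row : newRows) {
      byKey.put(row.get(mergeKey), row);   // updates replace, new keys append
    }
    return new ArrayList<>(byKey.values());
  }

  public static void main(String[] args) {
    List<Map<String, String>> oldRows = List.of(
        Map.of("C1", "1", "C3", "OLD ROW 1"),
        Map.of("C1", "2", "C3", "OLD ROW 2"));
    List<Map<String, String>> newRows = List.of(
        Map.of("C1", "1", "C3", "UPDATE 1"),    // update of key 1
        Map.of("C1", "3", "C3", "NEW ROW 3"));  // newly inserted key 3
    // Prints three rows: key 1 updated, key 2 unchanged, key 3 added.
    merge(oldRows, newRows, "C1").forEach(System.out::println);
  }
}

A MapReduce version of this idea would read both the existing target
directory and the newly imported delta, group records by the merge key, and
emit the newer record for each group.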



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SQOOP-3192) upgrade parquet

2017-05-31 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16031874#comment-16031874
 ] 

Sandish Kumar HN commented on SQOOP-3192:
-

Hi [~brocknoland]
I have created review request with the upgrade change patch here 
https://reviews.apache.org/r/59690

> upgrade parquet
> ---
>
> Key: SQOOP-3192
> URL: https://issues.apache.org/jira/browse/SQOOP-3192
> Project: Sqoop
>  Issue Type: Improvement
>  Components: codegen
>Reporter: Sandish Kumar HN
>  Labels: features, newbie, test
>
> Found problems with the Parquet incremental merge tests under PARQUET-MR 
> 1.4.1, which is the version currently used by Sqoop. These issues have 
> been fixed in a newer version of PARQUET-MR.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (SQOOP-3192) upgrade parquet

2017-05-31 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16031725#comment-16031725
 ] 

Sandish Kumar HN edited comment on SQOOP-3192 at 5/31/17 7:01 PM:
--

I see that Sqoop uses Parquet through the Kite SDK. Upgrading 
kite-data.version from 1.0.0 to 1.1.0 should solve the issue. I will run a 
few tests and create a patch on RB. 


was (Author: sanysand...@gmail.com):
I see that Sqoop uses Parquet through the Kite SDK. Upgrading 
kite-data.version from 1.0.0 to 1.1.0 should solve the issue. 
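
A rough illustration of that bump, assuming the Kite version is pinned in a 
build properties file such as ivy/libraries.properties (the exact file and 
key may differ in a given checkout):

# before the upgrade
kite-data.version=1.0.0
# after the upgrade
kite-data.version=1.1.0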

> upgrade parquet
> ---
>
> Key: SQOOP-3192
> URL: https://issues.apache.org/jira/browse/SQOOP-3192
> Project: Sqoop
>  Issue Type: Improvement
>  Components: codegen
>Reporter: Sandish Kumar HN
>  Labels: features, newbie, test
>
> Found problems with the Parquet incremental merge tests under PARQUET-MR 
> 1.4.1, which is the version currently used by Sqoop. These issues have 
> been fixed in a newer version of PARQUET-MR.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SQOOP-3192) upgrade parquet

2017-05-31 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16031725#comment-16031725
 ] 

Sandish Kumar HN commented on SQOOP-3192:
-

I see that Sqoop uses Parquet through the Kite SDK. Upgrading 
kite-data.version from 1.0.0 to 1.1.0 should solve the issue. 

> upgrade parquet
> ---
>
> Key: SQOOP-3192
> URL: https://issues.apache.org/jira/browse/SQOOP-3192
> Project: Sqoop
>  Issue Type: Improvement
>  Components: codegen
>Reporter: Sandish Kumar HN
>  Labels: features, newbie, test
>
> Found problems with the Parquet incremental merge tests under PARQUET-MR 
> 1.4.1, which is the version currently used by Sqoop. These issues have 
> been fixed in a newer version of PARQUET-MR.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SQOOP-3178) SQOOP PARQUET INCREMENTAL MERGE

2017-05-31 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16031688#comment-16031688
 ] 

Sandish Kumar HN commented on SQOOP-3178:
-

Hi [~brocknoland] 

1) I have sent another mail to dev@sqoop.apache.org and am waiting for a 
reply. 
2) Here is the RB item link: https://reviews.apache.org/r/59346/
3) I have also created an issue called "upgrade parquet": 
https://issues.apache.org/jira/browse/SQOOP-3192.

> SQOOP PARQUET INCREMENTAL MERGE 
> 
>
> Key: SQOOP-3178
> URL: https://issues.apache.org/jira/browse/SQOOP-3178
> Project: Sqoop
>  Issue Type: Improvement
>  Components: build, codegen, connectors
> Environment: None
>Reporter: Sandish Kumar HN
>Priority: Blocker
>  Labels: features, newbie
>
> Currently, Sqoop 1 only supports merging two Parquet-format data sets; it 
> does not support incremental merge. I have written a Sqoop incremental 
> merge MR job for the Parquet file format and tested it with millions of 
> records over many iterations.
> blocked by issue https://issues.apache.org/jira/browse/SQOOP-3192



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SQOOP-3178) SQOOP PARQUET INCREMENTAL MERGE

2017-05-31 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated SQOOP-3178:

 Labels: features newbie  (was: )
   Priority: Blocker  (was: Critical)
Description: 
Currently, Sqoop 1 only supports merging two Parquet-format data sets; it does 
not support incremental merge. I have written a Sqoop incremental merge MR job 
for the Parquet file format and tested it with millions of records over many 
iterations.

blocked by issue https://issues.apache.org/jira/browse/SQOOP-3192

  was:Currently, Sqoop 1 only supports merging two Parquet-format data sets; 
it does not support incremental merge. I have written a Sqoop incremental 
merge MR job for the Parquet file format and tested it with millions of 
records over many iterations.


> SQOOP PARQUET INCREMENTAL MERGE 
> 
>
> Key: SQOOP-3178
> URL: https://issues.apache.org/jira/browse/SQOOP-3178
> Project: Sqoop
>  Issue Type: Improvement
>  Components: build, codegen, connectors
> Environment: None
>Reporter: Sandish Kumar HN
>Priority: Blocker
>  Labels: features, newbie
>
> Currently, Sqoop 1 only supports merging two Parquet-format data sets; it 
> does not support incremental merge. I have written a Sqoop incremental 
> merge MR job for the Parquet file format and tested it with millions of 
> records over many iterations.
> blocked by issue https://issues.apache.org/jira/browse/SQOOP-3192



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SQOOP-3192) upgrade parquet

2017-05-31 Thread Sandish Kumar HN (JIRA)
Sandish Kumar HN created SQOOP-3192:
---

 Summary: upgrade parquet
 Key: SQOOP-3192
 URL: https://issues.apache.org/jira/browse/SQOOP-3192
 Project: Sqoop
  Issue Type: Improvement
  Components: codegen
Reporter: Sandish Kumar HN


Found problems with the Parquet incremental merge tests under PARQUET-MR 
1.4.1, which is the version currently used by Sqoop. These issues have been 
fixed in a newer version of PARQUET-MR.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SQOOP-3178) SQOOP PARQUET INCREMENTAL MERGE

2017-05-29 Thread Sandish Kumar HN (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16028085#comment-16028085
 ] 

Sandish Kumar HN commented on SQOOP-3178:
-

[~brocknoland] Thanks, I have just sent a mail to the dev team. 

> SQOOP PARQUET INCREMENTAL MERGE 
> 
>
> Key: SQOOP-3178
> URL: https://issues.apache.org/jira/browse/SQOOP-3178
> Project: Sqoop
>  Issue Type: Improvement
>  Components: build, codegen, connectors
> Environment: None
>Reporter: Sandish Kumar HN
>Priority: Critical
>
> Currently, Sqoop 1 only supports merging two Parquet-format data sets; it 
> does not support incremental merge. I have written a Sqoop incremental 
> merge MR job for the Parquet file format and tested it with millions of 
> records over many iterations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SQOOP-3178) SQOOP PARQUET INCREMENTAL MERGE

2017-05-17 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated SQOOP-3178:

Description: Currently, Sqoop 1 only supports merging two Parquet-format 
data sets; it does not support incremental merge. I have written a Sqoop 
incremental merge MR job for the Parquet file format and tested it with 
millions of records over many iterations.  (was: Hi. 
I see that there is no Parquet incremental merge. I took the Sqoop 1.4.6 
source code and wrote an MR job for Parquet incremental merge. Can anyone 
give me specific instructions for getting the Parquet incremental merge 
code into the latest version?)

> SQOOP PARQUET INCREMENTAL MERGE 
> 
>
> Key: SQOOP-3178
> URL: https://issues.apache.org/jira/browse/SQOOP-3178
> Project: Sqoop
>  Issue Type: Improvement
>  Components: build, codegen, connectors
> Environment: None
>Reporter: Sandish Kumar HN
>Priority: Critical
>
> Currently, Sqoop 1 only supports merging two Parquet-format data sets; it 
> does not support incremental merge. I have written a Sqoop incremental 
> merge MR job for the Parquet file format and tested it with millions of 
> records over many iterations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SQOOP-3178) SQOOP PARQUET INCREMENTAL MERGE

2017-05-03 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated SQOOP-3178:

Environment: None  (was: Hi. 
I see that there is no Parquet incremental merge. I took the Sqoop 1.4.6 
source code and wrote an MR job for Parquet incremental merge. Can anyone 
give me specific instructions for getting the Parquet incremental merge 
code into the latest version?)

> SQOOP PARQUET INCREMENTAL MERGE 
> 
>
> Key: SQOOP-3178
> URL: https://issues.apache.org/jira/browse/SQOOP-3178
> Project: Sqoop
>  Issue Type: Improvement
>  Components: build, codegen, connectors
> Environment: None
>Reporter: Sandish Kumar HN
>Priority: Critical
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SQOOP-3178) SQOOP PARQUET INCREMENTAL MERGE

2017-05-03 Thread Sandish Kumar HN (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandish Kumar HN updated SQOOP-3178:

Description: 
Hi. 
I see that there is no Parquet incremental merge. I took the Sqoop 1.4.6 
source code and wrote an MR job for Parquet incremental merge. Can anyone 
give me specific instructions for getting the Parquet incremental merge 
code into the latest version?

> SQOOP PARQUET INCREMENTAL MERGE 
> 
>
> Key: SQOOP-3178
> URL: https://issues.apache.org/jira/browse/SQOOP-3178
> Project: Sqoop
>  Issue Type: Improvement
>  Components: build, codegen, connectors
> Environment: None
>Reporter: Sandish Kumar HN
>Priority: Critical
>
> Hi. 
> I see that there is no Parquet incremental merge. I took the Sqoop 1.4.6 
> source code and wrote an MR job for Parquet incremental merge. Can anyone 
> give me specific instructions for getting the Parquet incremental merge 
> code into the latest version?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SQOOP-3178) SQOOP PARQUET INCREMENTAL MERGE

2017-05-03 Thread Sandish Kumar HN (JIRA)
Sandish Kumar HN created SQOOP-3178:
---

 Summary: SQOOP PARQUET INCREMENTAL MERGE 
 Key: SQOOP-3178
 URL: https://issues.apache.org/jira/browse/SQOOP-3178
 Project: Sqoop
  Issue Type: Improvement
  Components: build, codegen, connectors
 Environment: Hi. 
I see that there is no Parquet incremental merge. I took the Sqoop 1.4.6 
source code and wrote an MR job for Parquet incremental merge. Can anyone 
give me specific instructions for getting the Parquet incremental merge 
code into the latest version?
Reporter: Sandish Kumar HN
Priority: Critical






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)