[jira] [Commented] (SQOOP-2922) Add --hive-database documentation to user guide

2017-05-26 Thread Boglarka Egyed (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026313#comment-16026313
 ] 

Boglarka Egyed commented on SQOOP-2922:
---

Hi [~astadtler],

To get your change committed please do the following:
* Create a review request at Apache's review board for project Sqoop and link 
it to this JIRA ticket: https://reviews.apache.org/

Please consider the guidelines below:

Review board
* Summary: generate your summary using the issue's jira key + jira title
* Groups: add the relevant group so everyone on the project will know about 
your patch (Sqoop)
* Bugs: add the issue's jira key so it's easy to navigate to the jira side
* Repository: sqoop-trunk for Sqoop1 or sqoop-sqoop2 for Sqoop2
* As soon as the patch gets committed, it's very useful for the community 
if you close the review and mark it as "Submitted" on the Review Board. The 
button for this is at the top right of your own tickets, right next to the 
Download Diff button.

The Sqoop community will receive emails about your new ticket and review 
request and will review your change.

Thanks,
Bogi

> Add --hive-database documentation to user guide
> ---
>
> Key: SQOOP-2922
> URL: https://issues.apache.org/jira/browse/SQOOP-2922
> Project: Sqoop
>  Issue Type: Improvement
>  Components: docs
> Affects Versions: 1.4.4, 1.4.5, 1.4.6
> Reporter: Tyler Seader
> Priority: Minor
>  Labels: documentation
> Fix For: 1.4.4
>
> Attachments: sqoop-2922.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> Requesting that the --hive-database option be included in the Sqoop User Guide.  
> This option was added in version 1.4.4 and there is no trace of it in the 
> 1.4.6 documentation.  
> Related JIRAs: 
> https://issues.apache.org/jira/browse/SQOOP-912 - code fix



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (SQOOP-3187) Sqoop import as PARQUET to S3 failed

2017-05-26 Thread Eric Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Lin reassigned SQOOP-3187:
---

Assignee: Eric Lin

> Sqoop import as PARQUET to S3 failed
> 
>
> Key: SQOOP-3187
> URL: https://issues.apache.org/jira/browse/SQOOP-3187
> Project: Sqoop
>  Issue Type: Bug
> Affects Versions: 1.4.6
> Reporter: Surendra Nichenametla
> Assignee: Eric Lin
>
> Sqoop import as a Parquet file to S3 fails. The command and error are given below.
> However, import to an HDFS location works.
> sqoop import --connect "jdbc:oracle:thin:@:1521/ORCL" --table 
> mytable --username myuser --password mypass --target-dir s3://bucket/foo/bar/ 
> --columns col1,col2 -m1 --as-parquetfile
> 17/05/09 21:00:18 ERROR tool.ImportTool: Imported Failed: Wrong FS: 
> s3://bucket/foo/bar, expected: hdfs://master-ip:8020
> P.S. I tried this from Amazon EMR cluster.
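
The "Wrong FS" text in the log above is the message Hadoop's FileSystem path check raises when a path's scheme or authority does not match the filesystem it is handed to. The sketch below is a simplified stand-in for that kind of check, using only java.net.URI. The names WrongFsSketch and checkPath are hypothetical, this is not Hadoop's implementation, and it does not pinpoint where in Sqoop the mismatch arises.

```java
import java.net.URI;

public class WrongFsSketch {
    /**
     * Simplified stand-in for a filesystem path check (assumption: not Hadoop's code).
     * A path with no scheme is treated as relative to the filesystem and accepted.
     */
    public static void checkPath(URI fsUri, URI path) {
        if (path.getScheme() == null) {
            return; // e.g. "/user/foo" resolves against the filesystem itself
        }
        boolean sameScheme = path.getScheme().equalsIgnoreCase(fsUri.getScheme());
        String authority = path.getAuthority();
        boolean sameAuthority =
            authority == null || authority.equalsIgnoreCase(fsUri.getAuthority());
        if (!sameScheme || !sameAuthority) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://master-ip:8020");
        checkPath(defaultFs, URI.create("hdfs://master-ip:8020/user/foo")); // accepted
        try {
            // An S3 target checked against the HDFS filesystem is rejected.
            checkPath(defaultFs, URI.create("s3://bucket/foo/bar"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this is the mechanism at play, the usual remedy is to resolve the filesystem from the target path itself rather than from the cluster default, but that is an assumption to verify against the Sqoop code.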





Re: Review Request 58466: SQOOP-3158 - Columns added to Mysql after initial sqoop import, export back to table with same schema fails

2017-05-26 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58466/#review176188
---



Hi Eric!

Thank you for improving the patch. However, it turned out that there is one 
more edge case which needs to be covered.
I have realized that there are 2 options for specifying the representation of 
null: org.apache.sqoop.SqoopOptions#getInNullStringValue (used for 
columns with String data type) and 
org.apache.sqoop.SqoopOptions#getInNullNonStringValue (used for all 
other data types). At this point your patch uses only the first option, so the 
first lines of the __loadFromFields0 method of the generated class look like 
this:

if (__it.hasNext()) {
    __cur_str = __it.next();
} else {
    __cur_str = "NNUULL";
}
if (__cur_str.equals("null") || __cur_str.length() == 0) {
    this.ID = null;
} else {
    this.ID = Integer.valueOf(__cur_str);
}

Since the ID column is of type Integer, Sqoop should use 
org.apache.sqoop.SqoopOptions#getInNullNonStringValue to generate the first if 
statement.
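
To illustrate the point, here is a minimal sketch of choosing the null token by column type. It is not the ClassWriter code: the class NullTokenSketch, its methods, and the two constants are hypothetical stand-ins, with the constants playing the role of the values returned by getInNullStringValue and getInNullNonStringValue ("NNUULL" and "null" in the snippet above).

```java
import java.util.Arrays;
import java.util.Iterator;

public class NullTokenSketch {
    // Hypothetical stand-ins for SqoopOptions#getInNullStringValue
    // and SqoopOptions#getInNullNonStringValue.
    public static final String IN_NULL_STRING = "NNUULL";
    public static final String IN_NULL_NON_STRING = "null";

    /** Picks the token that represents NULL for a column of the given Java type. */
    public static String nullTokenFor(Class<?> columnType) {
        return String.class.equals(columnType) ? IN_NULL_STRING : IN_NULL_NON_STRING;
    }

    /** Parses an Integer column; a missing field is padded with the non-string token. */
    public static Integer parseIntColumn(Iterator<String> it) {
        String cur = it.hasNext() ? it.next() : nullTokenFor(Integer.class);
        if (cur.equals(IN_NULL_NON_STRING) || cur.length() == 0) {
            return null;
        }
        return Integer.valueOf(cur);
    }

    public static void main(String[] args) {
        // Field present: parsed normally.
        System.out.println(parseIntColumn(Arrays.asList("7").iterator()));
        // Field missing: padded with the non-string token, so the result is null
        // instead of a NumberFormatException from Integer.valueOf("NNUULL").
        System.out.println(parseIntColumn(Arrays.<String>asList().iterator()));
    }
}
```

With the padding token and the comparison token taken from the same option, a missing Integer field is read as null rather than passed to Integer.valueOf.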

Sorry for the late notice but I have just realized that there is this other 
option too.

Apart from this, I have run all the unit tests and it seems that a couple of 
test cases are failing in org.apache.sqoop.TestExportUsingProcedure. 
TestExportUsingProcedure is a subclass of TestExport and thus inherits the new 
test cases you added, but they fail for some reason in that class. Can you 
please take a look at it too?


Thank you for your efforts!

Regards,
Szabolcs

- Szabolcs Vasas


On May 21, 2017, 11:12 a.m., Eric Lin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58466/
> ---
> 
> (Updated May 21, 2017, 11:12 a.m.)
> 
> 
> Review request for Sqoop, Attila Szabo and Szabolcs Vasas.
> 
> 
> Bugs: SQOOP-3158
> https://issues.apache.org/jira/browse/SQOOP-3158
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> I had a table in MySQL with 2 columns until yesterday. The columns are id and 
> name.
> 1,Raj
> 2,Jack
> 
> I imported this data into HDFS as a file yesterday. Today we 
> added a new column to the table in MySQL called salary. The table looks like 
> below.
> 
> 1,Raj
> 2,Jack
> 3,Jill,2000
> 4,Nick,3000
> 
> Now I have done an incremental import of this table as a file.
> 
> Part-m-0 file contains
> 1,Raj
> 2,Jack
> 
> Part-m-1 file contains
> 3,Jill,2000
> 4,Nick,3000
> 
> Now I created a new table in MySQL with the same schema as the original MySQL 
> table, with columns id, name and salary.
> 
> Sqoop export fails with the error below:
> 
> java.lang.RuntimeException: Can't parse input data: 'Raj'
> at SQOOP_3158.__loadFromFields(SQOOP_3158.java:316)
> at SQOOP_3158.parse(SQOOP_3158.java:254)
> at 
> org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:89)
> at 
> org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.util.NoSuchElementException
> at java.util.ArrayList$Itr.next(ArrayList.java:854)
> at SQOOP_3158.__loadFromFields(SQOOP_3158.java:311)
> ... 12 more
> 
> 
> Diffs
> ---
> 
>   src/java/org/apache/sqoop/orm/ClassWriter.java eaa9123 
>   src/test/com/cloudera/sqoop/TestExport.java b2edc53 
> 
> 
> Diff: https://reviews.apache.org/r/58466/diff/4/
> 
> 
> Testing
> ---
> 
> There is no existing test class to cover this path and I am not sure of the 
> best way to add a test case for it. If you have any suggestions, please let me know.
> 
> I have done manual testing to replicate the issue and confirmed that the patch 
> fixes it. I have also tried different data types, and all of them work.
> 
> However, if a column in MySQL is defined as NOT NULL, the export will 
> still fail with an error; this is expected.
> 
> 
> Thanks,
> 
> Eric Lin
> 
>
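
For reference, the NoSuchElementException in the quoted stack trace is what an iterator throws when a record has fewer fields than the table has columns, as with the older "1,Raj" rows. The sketch below reproduces that with plain Java and shows the hasNext() guard discussed in this review. MissingFieldSketch and its methods are hypothetical names for illustration, not Sqoop's generated code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class MissingFieldSketch {
    /** Reads columnCount fields unconditionally; throws if the record is too short. */
    public static String[] readUnchecked(String line, int columnCount) {
        Iterator<String> it = Arrays.asList(line.split(",")).iterator();
        String[] fields = new String[columnCount];
        for (int i = 0; i < columnCount; i++) {
            fields[i] = it.next(); // NoSuchElementException once the fields run out
        }
        return fields;
    }

    /** Reads columnCount fields, padding missing trailing fields with nullToken. */
    public static String[] readPadded(String line, int columnCount, String nullToken) {
        Iterator<String> it = Arrays.asList(line.split(",")).iterator();
        String[] fields = new String[columnCount];
        for (int i = 0; i < columnCount; i++) {
            fields[i] = it.hasNext() ? it.next() : nullToken;
        }
        return fields;
    }

    public static void main(String[] args) {
        try {
            readUnchecked("1,Raj", 3); // old row, exported into a 3-column table
        } catch (NoSuchElementException e) {
            System.out.println("unchecked read failed: " + e);
        }
        System.out.println(Arrays.toString(readPadded("1,Raj", 3, "null")));
        System.out.println(Arrays.toString(readPadded("3,Jill,2000", 3, "null")));
    }
}
```

Padding only helps when the target column accepts NULL, which matches the NOT NULL caveat in the testing notes.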



Re: Review Request 59346: SQOOP-3178 : Incremental Merging for Parquet File Format

2017-05-26 Thread Szabolcs Vasas


> On May 19, 2017, 10:25 a.m., Szabolcs Vasas wrote:
> > Hi Sandish,
> > 
> > Thank you for your patch! I wanted to review it in my IDE but I got the 
> > following error when I tried to apply it:
> > 
> > /Users/szabolcsvasas/Downloads/MergeIncrementalParquetFormat.diff:140: 
> > trailing whitespace.
> >   
> > /Users/szabolcsvasas/Downloads/MergeIncrementalParquetFormat.diff:224: 
> > trailing whitespace.
> >   
> > /Users/szabolcsvasas/Downloads/MergeIncrementalParquetFormat.diff:254: 
> > trailing whitespace.
> >   
> > /Users/szabolcsvasas/Downloads/MergeIncrementalParquetFormat.diff:326: 
> > trailing whitespace.
> > 
> > /Users/szabolcsvasas/Downloads/MergeIncrementalParquetFormat.diff:330: 
> > trailing whitespace.
> >   
> > error: patch failed: src/java/org/apache/sqoop/mapreduce/MergeJob.java:19
> > error: src/java/org/apache/sqoop/mapreduce/MergeJob.java: patch does not 
> > apply
> > error: patch failed: src/java/org/apache/sqoop/tool/ImportTool.java:54
> > error: src/java/org/apache/sqoop/tool/ImportTool.java: patch does not apply
> > 
> > Can you please check your patch?
> > 
> > It would also be very good if you could add test coverage. As far as I see, 
> > there are some incremental import related test cases in 
> > com.cloudera.sqoop.TestIncrementalImport that could help.
> > 
> > Regards,
> > Szabolcs
> 
> Sandish Kumar HN wrote:
> Thanks for the reply, Szabolcs Vasas. Sorry for the errors in the patch. This 
> is my first time at Apache. 
> Sure, I will write a few test cases for Parquet incremental merge. Which 
> branch should I be using? Currently I'm using "branch-1.4.6".

Hi Sandish,

Sorry for the late reply. New development should go into the trunk branch.

Regards,
Szabolcs


- Szabolcs


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59346/#review175485
---


On May 18, 2017, 5:50 p.m., Sandish Kumar HN wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59346/
> ---
> 
> (Updated May 18, 2017, 5:50 p.m.)
> 
> 
> Review request for Sqoop, Boglarka Egyed, Attila Szabo, and Szabolcs Vasas.
> 
> 
> Bugs: SQOOP-3178
> https://issues.apache.org/jira/browse/SQOOP-3178
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> New feature for sqoop-1: Sqoop Incremental Merge for Parquet File Format
> 
> 
> Diffs
> ---
> 
>   src/java/org/apache/sqoop/mapreduce/MergeGenericRecordExportMapper.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/MergeJob.java 4e2a916 
>   src/java/org/apache/sqoop/mapreduce/MergeParquetMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/MergeParquetReducer.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/ImportTool.java c79e044 
> 
> 
> Diff: https://reviews.apache.org/r/59346/diff/1/
> 
> 
> Testing
> ---
> 
> Hi,
> 
> Currently, sqoop-1 only supports merging two Parquet data sets 
> but does not support incremental merge, so I have written a Sqoop 
> Incremental Merge MR job for the Parquet file format and tested it with a 
> million records of data over N iterations. Please review my patch.
> 
> Please let me know if there are any mistakes in my patch.
> 
> 
> Thanks,
> 
> Sandish Kumar HN
> 
>