> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > build.xml, line 54
> > <https://reviews.apache.org/r/10688/diff/5/?file=288026#file288026line54>
> >
> >     I'm not feeling entirely comfortable about depending on SNAPSHOTS. Is 
> > there a particular feature that we're taking advantage of in 0.6.0 that is 
> > not in 0.5.0?

No, the functionality (from the contract point of view) is even compatible with 
0.4.0 I think.   I could not successfully resolve the maven repos for the 
earlier versions and hence I had to switch to it.   I think now I tried to 
build and found that only 0.11.0 is available readily at repos.maven.org.   
That was the reason.  I will update and switch to 0.5.0 if that version is 
available in the repos.   But given that we want to have readily available 
Hadoop 2 and Hadoop 1 artifacts, we may have to set to 0.11.0 assuming that is 
the version the HCatalog team decides to publish the repositories for.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/SqoopOptions.java, line 160
> > <https://reviews.apache.org/r/10688/diff/5/?file=288029#file288029line160>
> >
> >     Out of curiosity what the "stanza" stands for?

Stanza means paragraph :)   We used this a lot earlier in my work to describe 
the SQL snippets when ]we write essays to describe what we want from the 
database.   May be clause is a more general DB term.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/config/ConfigurationConstants.java, lines 66-67
> > <https://reviews.apache.org/r/10688/diff/5/?file=288030#file288030line66>
> >
> >     Does the new property make sense when it's valid only on Hadoop2 that 
> > actually do not have any JobTracker address at all? We already had issues 
> > with that on Sqoop2 side in SQOOP-1002.

I just wanted to consolidate all constants.  If we can remove it, that would be 
OK


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/config/ConfigurationConstants.java, lines 90-91
> > <https://reviews.apache.org/r/10688/diff/5/?file=288030#file288030line90>
> >
> >     Nit: Please put extra empty line between the property name and private 
> > constructor.

Thanks Will fix


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java, lines 377-379
> > <https://reviews.apache.org/r/10688/diff/5/?file=288052#file288052line377>
> >
> >     Nit: Incorrect indentation.

Thanks will fix


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java, lines 368-370
> > <https://reviews.apache.org/r/10688/diff/5/?file=288052#file288052line368>
> >
> >     Nit: Incorrect indentation.

Thanks  will fix


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java, line 849
> > <https://reviews.apache.org/r/10688/diff/5/?file=288043#file288043line849>
> >
> >     I think that we also want to skip invoking the output committer in case 
> > of hadoop 2.

Good catch.  I will modify the version check to just say if it is Hadoop2

Thanks


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java, lines 
> > 76-77
> > <https://reviews.apache.org/r/10688/diff/5/?file=288039#file288039line76>
> >
> >     Nit: Those extra lines seems unnecessary here because the next row also 
> > have type constant.

Will remove the extra lines - thanks


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/ExportJobBase.java, line 389
> > <https://reviews.apache.org/r/10688/diff/5/?file=288034#file288034line389>
> >
> >     This comment seems to be artifact from development. I would suggest to 
> > improve the message and move it into "debug" state in case that we would 
> > like to have it around.

Will remove


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java, lines 68-74
> > <https://reviews.apache.org/r/10688/diff/5/?file=288033#file288033line68>
> >
> >     I'm thinking if having subclass of the DataDrivenImportJob for HCat 
> > specific things that would override this and couple of other methods would 
> > be cleaner than having multiple if-else statements. What do you think 
> > Venkat?

Yes,   Good point.   It makes sense (and that is what I started with), but 
since I had to change the ExportJob in place, I used a similar scheme for 
consistency and followed what we did for other storage formats like Avro.   
Also, this will make sure that irrespective of the changes to import job (on 
what it gets its feed from), we will be able to send it to HCat.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/mapreduce/ExportJobBase.java, lines 202-204
> > <https://reviews.apache.org/r/10688/diff/5/?file=288034#file288034line202>
> >
> >     Similarly as in the import. Would having dedicated classes for HCatalog 
> > make sense/would be cleaner that having one class for everything and having 
> > multiple if-else statements?

Good point Jarek.  Actually I had that implementation first - but then we will 
not be able to support update/upsert and call by procedure would  need to be 
modified to handle the HCat format.   Since we were using HCat more as storage 
format like Avro, I decided to implement in place.  And followed similar logic 
for Imports as well


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > src/java/org/apache/sqoop/manager/ConnManager.java, line 197
> > <https://reviews.apache.org/r/10688/diff/5/?file=288032#file288032line197>
> >
> >     Is the timestamp mapped to String from similar reason as mentioned 
> > above with SMALLINT?

Timestamp is currently not a supported datatype in HCat (even though Hive 
supports it).   I will create a JIRA issue on HCat to support that now that 
HCatalog is a sub project of Hive.


> On May 20, 2013, 1:02 p.m., Jarek Cecho wrote:
> > ivy.xml, lines 185-193
> > <https://reviews.apache.org/r/10688/diff/5/?file=288027#file288027line185>
> >
> >     Shouldn't those two dependencies be transitively propagated from 
> > HCatalog/Hive?

I had an issue building without the explicit dependency listed - may be because 
the repos were not having all the artifacts and the data nucleus was only 
available from datanucleus repository.   I will try to remove the dependency 
and retry.   
Thanks


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review20756
-----------------------------------------------------------


On May 4, 2013, 11:46 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated May 4, 2013, 11:46 p.m.)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop.   
> With this feature, it is possible to import and export data between Sqoop and 
> HCatalog tables.   The document attached to SQOOP-931 JIRA issue discusses 
> the high level appraches.  
> 
> With this integration, more fidelity can be brought to the process of moving 
> data between enterprise data stores and hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   build.xml 1c33fee 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 
> PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 9417d57 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 10f0cb9 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogExport.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogImport.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites with more than 20 tests in total have been 
> added to test various aspects of the integration.  A unit test to test the 
> option management is also added.   All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>

Reply via email to