-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9543/#review17076
-----------------------------------------------------------
Hi Venkat,
thank you very much for working on this. I really appreciate your time and
effort. I've started diving into your patch. I'm not quite done yet, but let me
share my thoughts to unblock you.
1) Would you mind running "ant checkstyle" and cleaning up the errors? Some of
the errors are already present in trunk, so please feel free to leave them out.
Just it would be great if would not include any new errors.
2) Tests do not seem passing for me against Netezza Simulator 6.0. I'm not yet
sure why, so I'll investigate that on my end.
3) It seems that you prefer creating chained exceptions in form
NewException("String").initCause(oldException). The rest of the code seems to
be following different approach and is creating exceptions in form
NewException("String", oldException). Do you feel comfortable with change that
to have unified code style?
src/java/org/apache/sqoop/manager/DirectNetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36230>
It seems that I've confused you, so please accept my apologies for that.
Would you mind putting the mapreduce specific classes into package
"org.apache.sqoop.mapreduce.netezza"? (e.g leaving out the "db" package). You
can notice that there are already packages for MySQL and Microsoft SQL Server:
org.apache.sqoop.mapreduce.mysql
org.apache.sqoop.mapreduce.sqlserver
src/java/org/apache/sqoop/manager/DirectNetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36161>
I can see similar code fragment in both exportTable() and importTable()
methods (just returning different exception class). I would suggest to put it
into constructor instead. The other connectors seems to be doing that already.
src/java/org/apache/sqoop/manager/DirectNetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36211>
I've noticed that this parameter is being used by both Direct and JDBC
manager. What about creating method getNetezzaExtraOpts() in JDBC Manager that
will create this single argument and in this method simply add direct specific
arguments. Something like:
RelatedOptions netezzaOpts = super.getNetezzaExtraOpts();
// Add the direct mode specific arguments
src/java/org/apache/sqoop/manager/DirectNetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36193>
Shouldn't this condition be something like:
...extraArgs.length != 0 ...
src/java/org/apache/sqoop/manager/DirectNetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36196>
I'm not quite sure what is the reasoning for skipping extra parameter
checking when there is only one mapper?
src/java/org/apache/sqoop/manager/DirectNetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36200>
Would you mind simplifying this by doing something like:
throw new SqoopOptions.InvalidOptionsException(pe.getMessage(), pe);
src/java/org/apache/sqoop/manager/DirectNetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36195>
It seems to me that this property is set in both branches of the if-else
statement. Would it make sense to set it at one place then?
src/java/org/apache/sqoop/manager/NetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36202>
I didn't see any reference to this constructor, is it used?
src/java/org/apache/sqoop/manager/NetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36205>
Would you mind sharing more details about why is the parallel JDBC update
dangerous and how it can lead to inconsistencies?
src/java/org/apache/sqoop/manager/NetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36206>
Shouldn't this condition be something like:
... extraArgs.length != 0 ...
src/java/org/apache/sqoop/manager/NetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36207>
I'm not quite sure what is the reasoning for skipping extra parameter
checking when there is only one mapper?
src/java/org/apache/sqoop/manager/NetezzaManager.java
<https://reviews.apache.org/r/9543/#comment36210>
Can we put this into standalone method getNetezzaExtraOpts() similarly as
we've done in NetezzaDirectManager?
src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportJob.java
<https://reviews.apache.org/r/9543/#comment36213>
That comment seems little bit off :-)
src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportMapper.java
<https://reviews.apache.org/r/9543/#comment36220>
Shouldn't this be abstract class? It seems to me that it do not make sense
when it would be used directly.
src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportMapper.java
<https://reviews.apache.org/r/9543/#comment36219>
Can we inherit from SqoopMapper?
src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportMapper.java
<https://reviews.apache.org/r/9543/#comment36214>
Can we make similar check on the Sqoop client side? I'm afraid that this
warning will get lost as it's printed in the mapper task and I'm not expecting
users to read it up there.
src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableInputSplitter.java
<https://reviews.apache.org/r/9543/#comment36223>
I would suggest to rename this into NetezzaExternalTableInputSplit instead
of "..Splitter".
src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaJDBCStatementRunner.java
<https://reviews.apache.org/r/9543/#comment36255>
This chaining of exceptions seems problematic to me. The initCause() might
not be allowed to call as the exception cause might be already initialized.
It seems that there might be maximally two exceptions. If that is correct
then I suggest to return only one instead of the chain. I would personally
prioritized the one catched on line 88 over the one on line 95.
src/test/com/cloudera/sqoop/manager/DirectNetezzaExportManualTest.java
<https://reviews.apache.org/r/9543/#comment36224>
I would vote for not doing this. The file:// should be default value and by
explicitly specifying this, we will loose the ability to run the tests against
real Hadoop cluster.
src/test/com/cloudera/sqoop/manager/NetezzaImportManualTest.java
<https://reviews.apache.org/r/9543/#comment36225>
It seems to me that both tests are doing exactly the same, is it really
necessary?
src/test/com/cloudera/sqoop/manager/NetezzaImportManualTest.java
<https://reviews.apache.org/r/9543/#comment36231>
Would you mind fixing this to the proper call:
return args.toArray(new String[args.size()]);
src/test/com/cloudera/sqoop/manager/NetezzaTestUtils.java
<https://reviews.apache.org/r/9543/#comment36229>
I believe that the second semicolon is not needed there :-)
src/test/com/cloudera/sqoop/manager/NetezzaTestUtils.java
<https://reviews.apache.org/r/9543/#comment36228>
Can we print out used Netezza URL, something like:
LOG.info("Using Netezza URL: " + url.toString());
My motivation here is to help users running the test to see what exactly is
Sqoop trying to do. It helped me, so I can imagine that it might help others as
well.
Jarcec
- Jarek Cecho
On Feb. 25, 2013, 10:02 p.m., Venkat Ranganathan wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9543/
> -----------------------------------------------------------
>
> (Updated Feb. 25, 2013, 10:02 p.m.)
>
>
> Review request for Sqoop and Jarek Cecho.
>
>
> Description
> -------
>
> This addresses SQOOP-846 (provide a Netezza connector)
>
>
> This addresses bug SQOOP-846.
> https://issues.apache.org/jira/browse/SQOOP-846
>
>
> Diffs
> -----
>
> src/java/org/apache/sqoop/lib/DelimiterSet.java 4e9bcab
> src/java/org/apache/sqoop/manager/DefaultManagerFactory.java 54eb258
> src/java/org/apache/sqoop/manager/DirectNetezzaManager.java PRE-CREATION
> src/java/org/apache/sqoop/manager/MySQLUtils.java ef18818
> src/java/org/apache/sqoop/manager/NetezzaManager.java PRE-CREATION
>
> src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaDBDataSliceSplitter.java
> PRE-CREATION
>
> src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaDataDrivenDBInputFormat.java
> PRE-CREATION
>
> src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportJob.java
> PRE-CREATION
>
> src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportMapper.java
> PRE-CREATION
>
> src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableImportJob.java
> PRE-CREATION
>
> src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableImportMapper.java
> PRE-CREATION
>
> src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableInputFormat.java
> PRE-CREATION
>
> src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableInputSplitter.java
> PRE-CREATION
>
> src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableRecordExportMapper.java
> PRE-CREATION
>
> src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableTextExportMapper.java
> PRE-CREATION
>
> src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaJDBCStatementRunner.java
> PRE-CREATION
> src/test/com/cloudera/sqoop/manager/DirectNetezzaExportManualTest.java
> PRE-CREATION
> src/test/com/cloudera/sqoop/manager/NetezzaImportManualTest.java
> PRE-CREATION
> src/test/com/cloudera/sqoop/manager/NetezzaTestUtils.java PRE-CREATION
>
> Diff: https://reviews.apache.org/r/9543/diff/
>
>
> Testing
> -------
>
> Ran all sqoop tests. Ran Netezza manual tests against Netezza VMs version 6
> and 7
>
>
> Thanks,
>
> Venkat Ranganathan
>
>