Re: Sqoop CI changes

2017-04-07 Thread Attila Szabó
Hey Anna,

Thanks for your quick reply!

I'm not sure I follow you on the Kudu team topic, but if it means either
that we'd like to include some of their working solutions in our CI system,
or that we'd like to do a working Kudu integration, for which we need better
CI, I would say +1 to both of them, and would be more than happy to help
your efforts on that front. :-)

Briefly, my original idea for the 3rd party automation would be:
- As step 0, fire up Docker containers with the related DB dependencies
before the whole cycle, or before a well organized group of tests (e.g. one
group each for MySQL, Oracle, etc.)
- After the tests have executed, terminate the containers.
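
The two steps above could be sketched roughly as follows. This is only an illustration, assuming the `docker` CLI is available on the CI host; the class name, container name, and image tag are hypothetical placeholders and not part of Sqoop's build. Real DB images usually need extra configuration (e.g. `-e` environment variables for passwords), which is left out here.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: wraps one group of third party tests in a DB container. */
public class ThirdPartyDbContainers {

    /** Step 0: command that starts a detached, auto-removed DB container. */
    public static List<String> startCommand(String name, String image,
                                            int hostPort, int containerPort) {
        return Arrays.asList("docker", "run", "-d", "--rm",
                "--name", name,
                "-p", hostPort + ":" + containerPort,
                image);
    }

    /** Final step: command that stops the container (--rm then removes it). */
    public static List<String> stopCommand(String name) {
        return Arrays.asList("docker", "stop", name);
    }

    /** Runs a command, forwarding its output, and returns the exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    /** Starts the container, runs the test group, and always stops the container. */
    public static void withContainer(String name, String image, int hostPort,
                                     int containerPort, Runnable testGroup)
            throws IOException, InterruptedException {
        if (run(startCommand(name, image, hostPort, containerPort)) != 0) {
            throw new IllegalStateException("Could not start container " + name);
        }
        try {
            testGroup.run();
        } finally {
            run(stopCommand(name));
        }
    }

    public static void main(String[] args) {
        // Only prints the commands; nothing is started from here.
        System.out.println(String.join(" ",
                startCommand("sqoop-ci-mysql", "mysql:5.7", 3306, 3306)));
        System.out.println(String.join(" ", stopCommand("sqoop-ci-mysql")));
    }
}
```

The `finally` block is the important part: the container is terminated even when a test group fails, so a broken run does not leave DB instances behind on the CI host.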

Other solutions would also be possible, like:
- Somehow getting access and resources from the Apache community to have
dedicated/global CI DB servers for Sqoop
- Asking any of the Hadoop vendors or the Apache sponsors to provide us with
pre-installed server infrastructure
- Acquiring server instances and deploying them with Terraform+Ansible, etc.

The problem with this "second approach" is that it would involve external
resources, and we would need someone to sponsor our HW resources. On the
other hand, the benefit would be that on real bare metal resources we would
be able to perform meaningful performance tests in the future as well.

The original plan:
I have no concerns, but I would still like to highlight that IMHO only the
pre-commit hook is urgent, and the rest of your energy would be better
focused on finishing the new build system, or rather on solving the 3rd
party resource problem we've discussed above.

And as always: if you need my help with any of your efforts, please do
not hesitate to ping me here, or on the related JIRA task.

My 2cents,
Attila

On Fri, Apr 7, 2017 at 8:34 PM, Anna Szonyi  wrote:

> Hey Attila,
>
> Thanks for your input! I agree that adding the 3rd party tests to the CI
> would be really beneficial. As there are some resourcing problems that need
> to be solved for that to happen (I started a discussion with the Kudu team,
> however if you have any specific help to add there, I would be very happy
> to accept any assistance with that) - I'll start a separate thread where we
> can discuss this issue with the community, see if anyone else has any
> inputs on it :).
>
> However if no one has any objections about the original proposal, I will
> follow through with that in the meantime.
>
> Thanks and Regards,
> Anna
>
>
> On Fri, Apr 7, 2017 at 1:04 PM, Attila Szabó  wrote:
>
> > Hello everyone,
> >
> > First of all I'd like to thank that Anna is willing to invest some
> efforts
> > on making our CI system better, and sorry for my delayed answer (
> however I
> > do still hope it helps regardless the stage of the ongoing efforts )
> >
> > I'd like to share the following thoughts here:
> > - Although it would make sense to eliminate the four ( right now totally
> > equal ) CI cycles and create only one, but regardless some "static noise"
> > it doesn't cause any serious issues for our current commit flow.
> > - Creating a precommit hook sounds like a great idea, I would encourage
> the
> > community to move on that path.
> > - However: According to my humblest opinion the biggest problem with the
> > current CI is that it doesn't execute the so called " 3rd party tests" (
> > which is generally our DB integration test layer), and thus it provides
> > only a limited safety belt for us ( and we've seen quite a few regression
> > on this front in the past one year ). Although we do have solution for
> > running those tests manually from command line, it's quite difficult to
> > setup/test those things from a single desktop, thus cause serious
> > difficulties in validating some changeset before commit.
> > - Still an issue we have on this front is our build
> > system/scripts/mechanism, which again could slow down the Dev+commit
> flow.
> > Although we've started efforts on this front, the final solution was not
> > fully delivered yet.
> >
> > As a conclusion of the elements above:
> > Anna! Would you mind first focusing on the build scripts and the "3rd
> > party" CI automation instead of eliminating the obsolete stuff? IMHO that
> > would be a much better usage of your efforts and would provide a much
> > bigger impact for the community.
> >
> > With my kindest regards,
> > Attila
> >
> > On Mar 28, 2017 2:44 PM, "Erzsebet Szilagyi" 
> > wrote:
> >
> > > Great ideas!
> > >
> > > I agree with Bogi and Szabolcs on the redundant test jobs.
> > >
> > > Would this pre-commit hook launch the same process as the current
> > > post-commit hook, or would this do something different?
> > > I think in the first case we could rework the post-commit check into
> the
> > > pre-commit hook, in the latter I'm curious about what exactly this
> check
> > > would add.
> > > In general I support the idea: we have seen a number of problems that
> > could
> > > have been 

Re: Sqoop CI changes

2017-04-07 Thread Anna Szonyi
Hey Attila,

Thanks for your input! I agree that adding the 3rd party tests to the CI
would be really beneficial. There are some resourcing problems that need to
be solved for that to happen (I started a discussion with the Kudu team; if
you have any specific help to add there, I would be very happy to accept
it), so I'll start a separate thread where we can discuss this issue with
the community and see if anyone else has any input on it :).

However if no one has any objections about the original proposal, I will
follow through with that in the meantime.

Thanks and Regards,
Anna


On Fri, Apr 7, 2017 at 1:04 PM, Attila Szabó  wrote:

> Hello everyone,
>
> First of all I'd like to thank that Anna is willing to invest some efforts
> on making our CI system better, and sorry for my delayed answer ( however I
> do still hope it helps regardless the stage of the ongoing efforts )
>
> I'd like to share the following thoughts here:
> - Although it would make sense to eliminate the four ( right now totally
> equal ) CI cycles and create only one, but regardless some "static noise"
> it doesn't cause any serious issues for our current commit flow.
> - Creating a precommit hook sounds like a great idea, I would encourage the
> community to move on that path.
> - However: According to my humblest opinion the biggest problem with the
> current CI is that it doesn't execute the so called " 3rd party tests" (
> which is generally our DB integration test layer), and thus it provides
> only a limited safety belt for us ( and we've seen quite a few regression
> on this front in the past one year ). Although we do have solution for
> running those tests manually from command line, it's quite difficult to
> setup/test those things from a single desktop, thus cause serious
> difficulties in validating some changeset before commit.
> - Still an issue we have on this front is our build
> system/scripts/mechanism, which again could slow down the Dev+commit flow.
> Although we've started efforts on this front, the final solution was not
> fully delivered yet.
>
> As a conclusion of the elements above:
> Anna! Would you mind first focusing on the build scripts and the "3rd
> party" CI automation instead of eliminating the obsolete stuff? IMHO that
> would be a much better usage of your efforts and would provide a much
> bigger impact for the community.
>
> With my kindest regards,
> Attila
>
> On Mar 28, 2017 2:44 PM, "Erzsebet Szilagyi" 
> wrote:
>
> > Great ideas!
> >
> > I agree with Bogi and Szabolcs on the redundant test jobs.
> >
> > Would this pre-commit hook launch the same process as the current
> > post-commit hook, or would this do something different?
> > I think in the first case we could rework the post-commit check into the
> > pre-commit hook, in the latter I'm curious about what exactly this check
> > would add.
> > In general I support the idea: we have seen a number of problems that
> could
> > have been avoided, so this shall be a very useful change!
> >
> > Thank you,
> > Liz
> >
> > On Fri, Mar 24, 2017 at 9:39 AM, Szabolcs Vasas 
> > wrote:
> >
> > > Hi Anna,
> > >
> > > Removing the redundant test execution jobs sounds great, I think you
> can
> > go
> > > ahead with that.
> > >
> > > Regarding the pre-commit hook: what would be the purpose of it exactly?
> > > Would it execute the unit tests before the patch is committed?
> > >
> > > Regards,
> > > Szabolcs
> > >
> > > On Thu, Mar 23, 2017 at 4:03 PM, Anna Szonyi 
> > wrote:
> > >
> > > > Hi All,
> > > >
> > > > I would like to make the following changes to the Sqoop CI system:
> > > > Disable the SCM polling for the Sqoop-hadoop23 Sqoop-hadoop20 and
> > > > Sqoop-hadoop100 jobs (and later delete the jobs themselves),
> > > > as the current trunk version of sqoop no longer contains these
> > profiles,
> > > so
> > > > these runs are redundant.
> > > >
> > > > I would also like to propose the creation of a pre-commit hook for
> > Sqoop
> > > > (like the existing one for Sqoop2).
> > > >
> > > > Please let me know if you have any objections.
> > > >
> > > > Thanks,
> > > > Anna
> > > >
> > >
> > >
> > >
> > > --
> > > Szabolcs Vasas
> > > Software Engineer
> > > 
> > >
> >
>


Re: Review Request 58233: SQOOP-3167: Evalutation and automation of SQLServer Manual tests

2017-04-07 Thread Attila Szabo


> On April 7, 2017, 5:25 p.m., Attila Szabo wrote:
> > Hi Bogi, Liz,
> > 
> > Most probably I did not phrase my thoughts precise enough in the previous 
> > round, so please let me express myself in a bit more direct and clear way:
> > My problem is not with the size of the patch file (although IMHO ~1000+ 
> > changed lines is quite a few), but with the fact that this change tries to 
> > achieve multiple things, which I think would make more sense as part of 
> > multiple issues/commits. Here are the things  I've identified as the 
> > effects/consequences of this changeset:
> > - It delivers improvements/fixes around the way connections are handled in 
> > the MSSqlServer related tests, which is a great thing, makes the usage much 
> > more flexible.
> > -- Although around the deletes (were those codes really dead?)/fixes I'm 
> > not sure if those things would be in sync with the design goals of the 
> > SqlServer connector, but I'm not used to consider myself as a MSSqlServer 
> > expert.
> > - Pushes the SqlServer related manual tests into the third party ant test 
> > cycle. IMHO this is not the best decision in the current test + CI 
> > architecture, as from that moment this patch would have been committed, it 
> > would force every contributor to have a working MSSqlServer instance (on 
> > the top of the currently existing ones including MySQL, PostgreSQL, Oracle, 
> > Cubrid, etc.) on their dev infrastructure, or facing with the fact that 
> > some of the third party tests will fail continuously on their side (which 
> > does not sound like a best practice). Most probably on your side it didn't 
> > appear as a problem as you do have standalone instances for yourself, but 
> > we cannot depend on that this is true in case of every contributor. 
> > Although the test files themselves contains some very basic instructions 
> > how to make the tests work, but still in the current version it would need 
> > manual interactions+installs, thus won't work out of the box (needless to 
> > say that the instructions are very much not detailed enough to someone 
> > who is not expert in installing MSSqlServer).
> > -- As a subpart of this section, I have to highlight that this way of 
> > changes alternates the intention of the original devs (and I'm pretty sure 
> > they had good reason why these tests were not automated but manual).
> > -- Introducing just another external dependency won't help to onboard new 
> > contributors (just another new -D param for a process which is not quite 
> > good documented already)
> > -- It also won't make the CI integration easier in the future
> > - The title+description JIRA is missleading as it does not solve the true 
> > automation (neither on CI level, nor on the side of the devs).
> > 
> > So if you trust in my experience and wisdom as a committer you would 
> > consider splitting up this changeset into 2-3 parts, and only after pushing 
> > the tests into the 3rd party cycle, when the SqlServer deployment is solved 
> > out of the box (e.g. with docker or global apache installation or ansible 
> > or so).
> > 
> > Please also consider fixing that part what I've identified as an issue
> > 
> > My 2 cents,
> > Attila

I suggest discussing this on dev@sqoop list.
If the PMC agrees on adding SqlServer integration tests to the current test 
suite, I'd be happy to merge your patch.


- Attila


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58233/#review171363
---


On April 6, 2017, 5:27 p.m., Boglarka Egyed wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58233/
> ---
> 
> (Updated April 6, 2017, 5:27 p.m.)
> 
> 
> Review request for Sqoop, Anna Szonyi, Szabolcs Vasas, and Liz Szilagyi.
> 
> 
> Bugs: SQOOP-3167
> https://issues.apache.org/jira/browse/SQOOP-3167
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Automated SQLServer Manual tests including test case correction and minor 
> rework too:
> - modified connection string setup to make them able to run automatically
> - fixed failing test cases
> - excluded invalid test cases
> - added database cleanup logic in tearDown part
> - updated java docs
> - removed unused imports
> - changing names to add them to the 3rd party test suite
> 
> Note: A more extensive refactor of the test classes in 
> org.apache.sqoop.manager.sqlserver could be made, it will be addressed in 
> another JIRA as that is a different scope.
> 
> 
> Diffs
> -
> 
>   build.xml 73db28b272c50b4f76fef8421e6b9dfe5fed40f4 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 
> 33e0cc41f6f379bac2085431e0f1adc60bce6bce 
>   

Re: Review Request 58233: SQOOP-3167: Evalutation and automation of SQLServer Manual tests

2017-04-07 Thread Attila Szabo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58233/#review171363
---



Hi Bogi, Liz,

Most probably I did not phrase my thoughts precisely enough in the previous 
round, so please let me express myself in a more direct and clear way:
My problem is not with the size of the patch file (although IMHO ~1000+ changed 
lines is quite a lot), but with the fact that this change tries to achieve 
multiple things, which I think would make more sense as part of multiple 
issues/commits. Here are the things I've identified as the 
effects/consequences of this changeset:
- It delivers improvements/fixes around the way connections are handled in the 
MSSqlServer related tests, which is a great thing and makes the usage much more 
flexible.
-- However, regarding the deletes (was that code really dead?) and fixes, I'm 
not sure whether they are in sync with the design goals of the SqlServer 
connector, though I don't consider myself an MSSqlServer expert.
- It pushes the SqlServer related manual tests into the third party ant test 
cycle. IMHO this is not the best decision in the current test + CI 
architecture: from the moment this patch is committed, it would force every 
contributor either to have a working MSSqlServer instance (on top of the 
currently existing ones including MySQL, PostgreSQL, Oracle, Cubrid, etc.) on 
their dev infrastructure, or to face the fact that some of the third party 
tests will fail continuously on their side (which does not sound like a best 
practice). Most probably it didn't appear as a problem on your side as you have 
standalone instances for yourselves, but we cannot depend on this being true 
for every contributor. Although the test files themselves contain some very 
basic instructions on how to make the tests work, the current version would 
still need manual interaction and installs, and thus won't work out of the box 
(needless to say, the instructions are not nearly detailed enough for someone 
who is not an expert in installing MSSqlServer).
-- As a subpart of this section, I have to highlight that this kind of change 
alters the intention of the original devs (and I'm pretty sure they had a good 
reason why these tests were manual rather than automated).
-- Introducing yet another external dependency won't help to onboard new 
contributors (just another new -D param for a process which is not well 
documented already)
-- It also won't make the CI integration easier in the future
- The title+description of the JIRA is misleading, as it does not deliver true 
automation (neither on the CI level, nor on the side of the devs).

So if you trust in my experience and wisdom as a committer, you would consider 
splitting up this changeset into 2-3 parts, and pushing the tests into the 3rd 
party cycle only after the SqlServer deployment is solved out of the box (e.g. 
with docker or global apache installation or ansible or so).

Please also consider fixing the part that I've identified as an issue.

My 2 cents,
Attila


src/test/org/apache/sqoop/manager/sqlserver/SQLServerDatatypeImportDelimitedFileManualTest.java
Lines 224-242 (original), 249-299 (patched)


Hi,

Could you please give me some explanation of these test cases?

It looks strange to me that we are trying to leverage some base/super test 
class, but intentionally nullifying some of its test cases with an empty 
method body.

If these things are not necessary I would suggest deleting them, as the 
current solution is very misleading, b/c these cases won't look 'invalid' as 
you marked them, but like passed, green test cases, giving the false 
impression that these cases are supported in the delimited case as well.

If you need a common ancestor I would advise using the 'extract superclass' 
refactoring.
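
To illustrate what is meant by 'extract superclass', here is a minimal, framework-free sketch. All class and method names are hypothetical and are not the actual Sqoop test classes; the assumption that the delimited format does not support a binary column is made purely for the example. The shared pieces move into an abstract base, and each format-specific class declares only the cases it really supports, so nothing has to be overridden with an empty body that would be reported as a passed test.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical common ancestor: holds only what both file formats share. */
abstract class DatatypeImportTestBase {

    /** Shared helper: builds the import arguments for one column type. */
    public String importArgsFor(String columnType) {
        return "--table T_" + columnType.toUpperCase() + " " + formatArgs();
    }

    /** Each subclass supplies its own file-format specific argument. */
    protected abstract String formatArgs();

    /** Column types this format supports; nothing else is run or reported. */
    public abstract List<String> supportedTypes();
}

class SequenceFileImportTests extends DatatypeImportTestBase {
    @Override
    protected String formatArgs() {
        return "--as-sequencefile";
    }

    @Override
    public List<String> supportedTypes() {
        return Arrays.asList("int", "varchar", "binary");
    }
}

class DelimitedFileImportTests extends DatatypeImportTestBase {
    @Override
    protected String formatArgs() {
        return "--as-textfile";
    }

    @Override
    public List<String> supportedTypes() {
        // "binary" is simply absent here instead of being an empty, green test.
        return Arrays.asList("int", "varchar");
    }
}

public class ExtractSuperclassSketch {
    public static void main(String[] args) {
        List<DatatypeImportTestBase> suites = Arrays.asList(
                new SequenceFileImportTests(), new DelimitedFileImportTests());
        for (DatatypeImportTestBase suite : suites) {
            for (String type : suite.supportedTypes()) {
                System.out.println(suite.getClass().getSimpleName()
                        + ": " + suite.importArgsFor(type));
            }
        }
    }
}
```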


- Attila Szabo


On April 6, 2017, 5:27 p.m., Boglarka Egyed wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58233/
> ---
> 
> (Updated April 6, 2017, 5:27 p.m.)
> 
> 
> Review request for Sqoop, Anna Szonyi, Szabolcs Vasas, and Liz Szilagyi.
> 
> 
> Bugs: SQOOP-3167
> https://issues.apache.org/jira/browse/SQOOP-3167
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Automated SQLServer Manual tests including test case correction and minor 
> rework too:
> - modified connection string setup to make them able to run automatically
> - fixed failing test cases
> - excluded invalid test cases
> - added database cleanup logic in tearDown part
> - updated java docs
> - removed unused imports
> - changing names to add them to the 

[jira] [Commented] (SQOOP-3125) NULL updates are not pulled into Hbase whenever incremental load happens in sqoop

2017-04-07 Thread Boglarka Egyed (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960966#comment-15960966
 ] 

Boglarka Egyed commented on SQOOP-3125:
---

Hi [~selvakumar7],

A patch has been created for a similar ticket (SQOOP-3149) and is currently 
under review, please see https://reviews.apache.org/r/57499/

Cheers,
Bogi

> NULL updates are not pulled into Hbase whenever incremental load happens in 
> sqoop
> -
>
> Key: SQOOP-3125
> URL: https://issues.apache.org/jira/browse/SQOOP-3125
> Project: Sqoop
> Issue Type: Bug
> Components: connectors/oracle
> Affects Versions: 1.4.6
> Environment: MapR Hadoop
> Reporter: Selvakumar
>
> Ingesting data from Oracle to HBase using Sqoop, and fetching data from HBase 
> through Hive tables using HBaseStorageHandler.
> The problem is that whenever an incremental load happens from Sqoop and a 
> column has been updated to "null", HBase doesn't get updated with the null 
> value for that cell, and when the data is fetched from HBase using the Hive 
> table, it fetches the record with the last updated timestamp.
> So the entire record becomes inconsistent: for all the other columns it shows 
> the updated value, and for the "null" column it shows the old value, as its 
> timestamp is not updated.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Review Request 58233: SQOOP-3167: Evalutation and automation of SQLServer Manual tests

2017-04-07 Thread Liz Szilagyi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58233/#review171350
---


Ship it!




Bogi,
Thank you for working diligently on our tests and thus improving quality!

While reading through the changes, I found that nothing really stood out as 
something that could be separated into another ticket, so personally I support 
keeping these changes together as one.
I also found the included instructions on how to run these tests very helpful.

Thank you,
Liz

- Liz Szilagyi


On April 6, 2017, 7:27 p.m., Boglarka Egyed wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58233/
> ---
> 
> (Updated April 6, 2017, 7:27 p.m.)
> 
> 
> Review request for Sqoop, Anna Szonyi, Szabolcs Vasas, and Liz Szilagyi.
> 
> 
> Bugs: SQOOP-3167
> https://issues.apache.org/jira/browse/SQOOP-3167
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Automated SQLServer Manual tests including test case correction and minor 
> rework too:
> - modified connection string setup to make them able to run automatically
> - fixed failing test cases
> - excluded invalid test cases
> - added database cleanup logic in tearDown part
> - updated java docs
> - removed unused imports
> - changing names to add them to the 3rd party test suite
> 
> Note: A more extensive refactor of the test classes in 
> org.apache.sqoop.manager.sqlserver could be made, it will be addressed in 
> another JIRA as that is a different scope.
> 
> 
> Diffs
> -
> 
>   build.xml 73db28b272c50b4f76fef8421e6b9dfe5fed40f4 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 
> 33e0cc41f6f379bac2085431e0f1adc60bce6bce 
>   src/test/com/cloudera/sqoop/manager/SQLServerManagerExportManualTest.java 
> 9a92479245fa35c210d8e49f847292ee53d6f9b1 
>   src/test/com/cloudera/sqoop/manager/SQLServerManagerImportManualTest.java 
> 1f69725da8408853ac55b1f316ce1b9ef015e674 
>   src/test/org/apache/sqoop/manager/sqlserver/MSSQLTestUtils.java 
> 851bf49614e829d07de252b83f4ad550d0cb043b 
>   src/test/org/apache/sqoop/manager/sqlserver/ManagerCompatExport.java 
> 8c5176ad61aae61b96c7458d3b4b83dc11960268 
>   
> src/test/org/apache/sqoop/manager/sqlserver/SQLServerDatatypeExportDelimitedFileManualTest.java
>  099d7344beb428c58b32d926af5ea079211da490 
>   
> src/test/org/apache/sqoop/manager/sqlserver/SQLServerDatatypeExportSequenceFileManualTest.java
>  21676f02510693dcdd856a1d9dfba7d05eace023 
>   
> src/test/org/apache/sqoop/manager/sqlserver/SQLServerDatatypeImportDelimitedFileManualTest.java
>  519fb525bdbb167520368d404667036669925041 
>   
> src/test/org/apache/sqoop/manager/sqlserver/SQLServerDatatypeImportSequenceFileManualTest.java
>  a0dad8a60b99d522ad3691e15b8b16c56e4b5858 
>   
> src/test/org/apache/sqoop/manager/sqlserver/SQLServerHiveImportManualTest.java
>  1999272181421a539318ed195ea4257f52b2ed08 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerManualTest.java 
> 1178e3c79de4d0b5c7a96c6ad7eb316ed15e47c4 
>   
> src/test/org/apache/sqoop/manager/sqlserver/SQLServerMultiColsManualTest.java 
> 6a8ab51967237f471044b868615fdb3e057b1d92 
>   
> src/test/org/apache/sqoop/manager/sqlserver/SQLServerMultiMapsManualTest.java 
> c9a5b5ef596cfc1b28948c2a071935dfb9500cde 
>   
> src/test/org/apache/sqoop/manager/sqlserver/SQLServerParseMethodsManualTest.java
>  cd05aecf1ae5bd79fb485325d58b33a73e9df290 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerQueryManualTest.java 
> 0057ac9df562c8e92cf7b9014c5e4239886a8104 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerSplitByManualTest.java 
> f85245ab8cdd66da983ac9017d356f251f22e7db 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerWhereManualTest.java 
> 10ae03b324b15f5ea0cc3cbbc04d3a5041233dd9 
> 
> 
> Diff: https://reviews.apache.org/r/58233/diff/2/
> 
> 
> Testing
> ---
> 
> ant clean test, ant test
> 
> ant clean test -Dthirdparty=true -Dsqoop.thirdparty.lib.dir=3rdpartylib 
> -Dsqoop.test.sqlserver.connectstring.host_url=sqlserver 
> -Dsqoop.test.sqlserver.database=databasename -Dms.sqlserver.username=username 
> -Dms.sqlserver.password=password -Dtestcase=SQLServer*
> 
> ant clean test -Dthirdparty=true -Dsqoop.thirdparty.lib.dir=3rdpartylib 
> -Dsqoop.test.mysql.connectstring.host_url=mysql 
> -Dsqoop.test.mysql.databasename=databasename 
> -Dsqoop.test.mysql.password=password -Dsqoop.test.mysql.username=username 
> -Dsqoop.test.oracle.connectstring=oracle 
> -Dsqoop.test.postgresql.connectstring.host_url=postgre 
> -Dsqoop.test.cubrid.connectstring.host_url=cubrid 
> -Dsqoop.test.cubrid.connectstring.username=username 
> -Dsqoop.test.cubrid.connectstring.database=database 
> -Dsqoop.test.cubrid.connectstring.password=password 
> 

[jira] [Commented] (SQOOP-3168) Sqoop Saved Job feature of overwriting job argument at execution time is not working in Sqoop1.4.6CDH 5.8.0

2017-04-07 Thread Anna Szonyi (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960653#comment-15960653
 ] 

Anna Szonyi commented on SQOOP-3168:


Hi,

Unfortunately this is a bug that was introduced with SQOOP-2779 and later fixed 
by SQOOP-2896. There is no workaround besides patching with the fix from 
SQOOP-2896. As for CDH 5.8.0, only creating new jobs or upgrading to 5.8.5+ 
would solve it, as that is where the fix was first backported.
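
For clarity, the behaviour the reporter expects (arguments given at execution time taking precedence over the saved ones) can be sketched as a simple map merge. This is an illustration with hypothetical names only; it is not the actual code of org.apache.sqoop.tool.JobTool, nor of either of the patches mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of "execution-time arguments win over saved ones". */
public class SavedJobOverrideSketch {

    /** Returns the saved options with the execution-time options layered on top. */
    public static Map<String, String> effectiveOptions(Map<String, String> saved,
                                                       Map<String, String> atExec) {
        Map<String, String> merged = new LinkedHashMap<>(saved);
        merged.putAll(atExec); // later values replace saved values for the same key
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> saved = new LinkedHashMap<>();
        saved.put("--target-dir", "location1");
        saved.put("-m", "1");

        Map<String, String> atExec = new LinkedHashMap<>();
        atExec.put("--target-dir", "location_new");

        // Prints the merged view: the target dir is overridden, -m is kept.
        System.out.println(effectiveOptions(saved, atExec));
    }
}
```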

Apologies and Regards,
Anna

> Sqoop Saved Job feature of overwriting job argument at execution time is not 
> working in Sqoop1.4.6CDH 5.8.0
> ---
>
> Key: SQOOP-3168
> URL: https://issues.apache.org/jira/browse/SQOOP-3168
> Project: Sqoop
> Issue Type: Bug
> Reporter: Hemendra Yadav
>
> Hi,
> I have created one sqoop saved job using below command ::
> sqoop job --create  -- import --connect  --driver 
> com.mysql.jdbc.Driver --username  --password  --table 
>  --target-dir location1 --fields-terminated-by , --escaped-by \\ -m 1
>  
> I need to put the data in a different location, so at execution time I 
> provided the new location in the command below:
> sqoop job --exec jobname1 -- --target-dir location_new
> But it still picks up the target directory from the saved job, i.e. location1.
> I am currently using Sqoop 1.4.6 CDH 5.8.0. When I checked the Sqoop code, it 
> seems that in Sqoop CDH 5.8.0 the code for overwriting job arguments was 
> removed from the method "private int execJob(SqoopOptions opts)" in the class 
> "org.apache.sqoop.tool.JobTool" due to issue 
> "https://issues.apache.org/jira/browse/SQOOP-2779".
> When I executed the same command using Sqoop 1.4.6 CDH 5.7.3, it was able to 
> overwrite the target directory and worked as expected.
> Can you please suggest a workaround for importing data to a different 
> target directory while executing a Sqoop saved job?





Re: Sqoop CI changes

2017-04-07 Thread Attila Szabó
Hello everyone,

First of all I'd like to thank Anna for being willing to invest some effort
in making our CI system better, and sorry for my delayed answer (I do still
hope it helps regardless of the stage of the ongoing efforts).

I'd like to share the following thoughts here:
- Although it would make sense to eliminate the four (right now totally
identical) CI cycles and create only one, apart from some "static noise"
they don't cause any serious issues for our current commit flow.
- Creating a precommit hook sounds like a great idea; I would encourage the
community to move on that path.
- However: in my humble opinion the biggest problem with the current CI is
that it doesn't execute the so-called "3rd party tests" (which are generally
our DB integration test layer), and thus it provides only a limited safety
belt for us (and we've seen quite a few regressions on this front in the
past year). Although we do have a solution for running those tests manually
from the command line, it's quite difficult to set up and test those things
from a single desktop, which causes serious difficulties in validating a
changeset before commit.
- Another issue we have on this front is our build
system/scripts/mechanism, which again can slow down the dev+commit flow.
Although we've started efforts on this front, the final solution has not
been fully delivered yet.

As a conclusion of the points above:
Anna! Would you mind first focusing on the build scripts and the "3rd
party" CI automation instead of eliminating the obsolete stuff? IMHO that
would be a much better use of your efforts and would have a much bigger
impact for the community.

With my kindest regards,
Attila

On Mar 28, 2017 2:44 PM, "Erzsebet Szilagyi" 
wrote:

> Great ideas!
>
> I agree with Bogi and Szabolcs on the redundant test jobs.
>
> Would this pre-commit hook launch the same process as the current
> post-commit hook, or would this do something different?
> I think in the first case we could rework the post-commit check into the
> pre-commit hook, in the latter I'm curious about what exactly this check
> would add.
> In general I support the idea: we have seen a number of problems that could
> have been avoided, so this shall be a very useful change!
>
> Thank you,
> Liz
>
> On Fri, Mar 24, 2017 at 9:39 AM, Szabolcs Vasas 
> wrote:
>
> > Hi Anna,
> >
> > Removing the redundant test execution jobs sounds great, I think you can
> go
> > ahead with that.
> >
> > Regarding the pre-commit hook: what would be the purpose of it exactly?
> > Would it execute the unit tests before the patch is committed?
> >
> > Regards,
> > Szabolcs
> >
> > On Thu, Mar 23, 2017 at 4:03 PM, Anna Szonyi 
> wrote:
> >
> > > Hi All,
> > >
> > > I would like to make the following changes to the Sqoop CI system:
> > > Disable the SCM polling for the Sqoop-hadoop23 Sqoop-hadoop20 and
> > > Sqoop-hadoop100 jobs (and later delete the jobs themselves),
> > > as the current trunk version of sqoop no longer contains these
> profiles,
> > so
> > > these runs are redundant.
> > >
> > > I would also like to propose the creation of a pre-commit hook for
> Sqoop
> > > (like the existing one for Sqoop2).
> > >
> > > Please let me know if you have any objections.
> > >
> > > Thanks,
> > > Anna
> > >
> >
> >
> >
> > --
> > Szabolcs Vasas
> > Software Engineer
> > 
> >
>


Re: Review Request 58233: SQOOP-3167: Evalutation and automation of SQLServer Manual tests

2017-04-07 Thread Boglarka Egyed


> On April 7, 2017, 8:11 a.m., Attila Szabo wrote:
> > Hey Bogi,
> > 
> > I've got two concerns related to this change:
> > - The change set is quite huge, and thus not easy to review. Could we 
> > please split up to smaller pieces?
> > - Besides the fact that this modification will provide some way to execute 
> > these tests through Junit and some ant tasks, I can't really spot how this 
> > change will make the test cases more "automated". Will these tests be 
> > included in the CI cycle? Do we already have a SqlServer instance connected 
> > to the Apache CI system we can test against? Which CI cycle would include 
> > the execution of this test suite?
> > 
> > Thanks for your answers and clarification,
> > 
> > Attila

Hi Attila,

Thanks for your inputs.

Although it is a bigger change set, I would not split it into smaller pieces 
as I think this is a coherent whole representing one logical change. Every 
file contains a similar, limited amount of modifications, which makes it 
easier to review IMHO.

Automation means several things here. First, these were all manual tests, 
meaning they were possibly never executed, or only a really long time ago, as 
numerous tests were failing. Also, they could only be executed through a 
manual ant task, which was quite difficult because of the hard-coded DB-related 
variables, so they were probably avoided instead. From now on they can be 
executed as part of the 3rd party test suite by setting the DB connect string 
and credentials through system properties. This definitely makes it easier to 
test changes that might affect integration with DBs, as we already follow the 
same practice for MySQL, PostgreSQL, etc.
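
To illustrate the system property approach, here is a minimal sketch of how a 
test helper might pick up these settings. The property names are the ones 
passed to ant in the Testing section of the review request; the class name 
SqlServerTestConfig, the default values and the ";databaseName=" URL layout 
are assumptions made for illustration and are not taken from the patch.

```java
/** Hypothetical helper for illustration; not part of the Sqoop code base. */
final class SqlServerTestConfig {
    // Property names as they appear on the ant command line in the review request.
    static final String HOST_URL_PROP = "sqoop.test.sqlserver.connectstring.host_url";
    static final String DATABASE_PROP = "sqoop.test.sqlserver.database";
    static final String USER_PROP = "ms.sqlserver.username";
    static final String PASSWORD_PROP = "ms.sqlserver.password";

    private SqlServerTestConfig() {
    }

    /** Builds the JDBC URL from system properties; the defaults are placeholders. */
    static String connectString() {
        String hostUrl = System.getProperty(HOST_URL_PROP, "jdbc:sqlserver://localhost:1433");
        String database = System.getProperty(DATABASE_PROP, "sqooptest");
        return hostUrl + ";databaseName=" + database;
    }

    /** Returns the DB user, falling back to a placeholder when the property is unset. */
    static String username() {
        return System.getProperty(USER_PROP, "sqoopuser");
    }

    /** Returns the DB password, falling back to a placeholder when the property is unset. */
    static String password() {
        return System.getProperty(PASSWORD_PROP, "password");
    }

    public static void main(String[] args) {
        // Prints the effective settings; the password is deliberately not printed.
        System.out.println("Connect string: " + connectString());
        System.out.println("User: " + username());
    }
}
```

With a helper of this shape, the same test class can run against any SQL 
Server instance by passing different -D values, provided the build forwards 
them to the test JVM, and nothing has to be hard coded in the test sources.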

Unfortunately, the 3rd party test suite is not yet part of the CI cycle, 
although it totally makes sense to add it, and it should be. Fortunately, 
AFAIR there are plans to improve the Sqoop CI cycles; the related discussion 
has already started on dev@ and is an ongoing effort, so I don't think it 
should be in the scope of this change. This change is a good first step 
toward it, however.

Many thanks,
Bogi


- Boglarka


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58233/#review171319
---



[jira] [Updated] (SQOOP-3167) Evalutation and automation of SQLServer Manual tests

2017-04-07 Thread Boglarka Egyed (JIRA)

 [ https://issues.apache.org/jira/browse/SQOOP-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Boglarka Egyed updated SQOOP-3167:
--
Attachment: SQOOP-3167.patch

> Evalutation and automation of SQLServer Manual tests
> 
>
>              Key: SQOOP-3167
>              URL: https://issues.apache.org/jira/browse/SQOOP-3167
>          Project: Sqoop
>       Issue Type: Improvement
>       Components: test
> Affects Versions: 1.4.6
>         Reporter: Boglarka Egyed
>         Assignee: Boglarka Egyed
>      Attachments: SQOOP-3167.patch, SQOOP-3167.patch
>
>
> Evaluating and automating Sqoop manual tests would bring a huge improvement 
> to quality, as these tests are probably executed very rarely at present and 
> thus do not bring much value. Adding them to the 3rd party test suite 
> would be ideal and would help to keep Sqoop more robust.
> This ticket targets the SQLServer-specific manual tests and covers checking 
> that these tests:
> * run and still make sense
> * are fixed as needed
> * are automated and added to the 3rd party test suite
> The following 14 test classes are affected by this ticket:
> * com.cloudera.sqoop.manager
> ** SQLServerManagerExportManualTest
> ** SQLServerManagerImportManualTest
> * org.apache.sqoop.manager.sqlserver
> ** SQLServerDatatypeExportDelimitedFileManualTest
> ** SQLServerDatatypeExportSequenceFileManualTest
> ** SQLServerDatatypeImportDelimitedFileManualTest
> ** SQLServerDatatypeImportSequenceFileManualTest
> ** SQLServerHiveImportManualTest
> ** SQLServerManagerManualTest
> ** SQLServerMultiColsManualTest
> ** SQLServerMultiMapsManualTest
> ** SQLServerParseMethodsManualTest
> ** SQLServerQueryManualTest
> ** SQLServerSplitByManualTest
> ** SQLServerWhereManualTest



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Review Request 58233: SQOOP-3167: Evalutation and automation of SQLServer Manual tests

2017-04-07 Thread Attila Szabo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58233/#review171319
---



Hey Bogi,

I've got two concerns related to this change:
- The change set is quite huge, and thus not easy to review. Could we please 
split it up into smaller pieces?
- Besides the fact that this modification will provide some way to execute 
these tests through JUnit and some ant tasks, I can't really spot how this 
change will make the test cases more "automated". Will these tests be included 
in the CI cycle? Do we already have a SQL Server instance connected to the 
Apache CI system we can test against? Which CI cycle would include the 
execution of this test suite?

Thanks for your answers and clarification,

Attila

- Attila Szabo


On April 6, 2017, 5:27 p.m., Boglarka Egyed wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58233/
> ---
> 
> (Updated April 6, 2017, 5:27 p.m.)
> 
> 
> Review request for Sqoop, Anna Szonyi, Szabolcs Vasas, and Liz Szilagyi.
> 
> 
> Bugs: SQOOP-3167
> https://issues.apache.org/jira/browse/SQOOP-3167
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Automated SQLServer Manual tests including test case correction and minor 
> rework too:
> - modified connection string setup to make them able to run automatically
> - fixed failing test cases
> - excluded invalid test cases
> - added database cleanup logic in tearDown part
> - updated java docs
> - removed unused imports
> - changing names to add them to the 3rd party test suite
> 
> Note: A more extensive refactor of the test classes in 
> org.apache.sqoop.manager.sqlserver could be made, it will be addressed in 
> another JIRA as that is a different scope.
> 
> 
> Diffs
> -
> 
>   build.xml 73db28b272c50b4f76fef8421e6b9dfe5fed40f4 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 33e0cc41f6f379bac2085431e0f1adc60bce6bce 
>   src/test/com/cloudera/sqoop/manager/SQLServerManagerExportManualTest.java 9a92479245fa35c210d8e49f847292ee53d6f9b1 
>   src/test/com/cloudera/sqoop/manager/SQLServerManagerImportManualTest.java 1f69725da8408853ac55b1f316ce1b9ef015e674 
>   src/test/org/apache/sqoop/manager/sqlserver/MSSQLTestUtils.java 851bf49614e829d07de252b83f4ad550d0cb043b 
>   src/test/org/apache/sqoop/manager/sqlserver/ManagerCompatExport.java 8c5176ad61aae61b96c7458d3b4b83dc11960268 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerDatatypeExportDelimitedFileManualTest.java 099d7344beb428c58b32d926af5ea079211da490 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerDatatypeExportSequenceFileManualTest.java 21676f02510693dcdd856a1d9dfba7d05eace023 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerDatatypeImportDelimitedFileManualTest.java 519fb525bdbb167520368d404667036669925041 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerDatatypeImportSequenceFileManualTest.java a0dad8a60b99d522ad3691e15b8b16c56e4b5858 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerHiveImportManualTest.java 1999272181421a539318ed195ea4257f52b2ed08 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerManualTest.java 1178e3c79de4d0b5c7a96c6ad7eb316ed15e47c4 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerMultiColsManualTest.java 6a8ab51967237f471044b868615fdb3e057b1d92 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerMultiMapsManualTest.java c9a5b5ef596cfc1b28948c2a071935dfb9500cde 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerParseMethodsManualTest.java cd05aecf1ae5bd79fb485325d58b33a73e9df290 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerQueryManualTest.java 0057ac9df562c8e92cf7b9014c5e4239886a8104 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerSplitByManualTest.java f85245ab8cdd66da983ac9017d356f251f22e7db 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerWhereManualTest.java 10ae03b324b15f5ea0cc3cbbc04d3a5041233dd9 
> 
> 
> Diff: https://reviews.apache.org/r/58233/diff/2/
> 
> 
> Testing
> ---
> 
> ant clean test, ant test
> 
> ant clean test -Dthirdparty=true -Dsqoop.thirdparty.lib.dir=3rdpartylib 
> -Dsqoop.test.sqlserver.connectstring.host_url=sqlserver 
> -Dsqoop.test.sqlserver.database=databasename -Dms.sqlserver.username=username 
> -Dms.sqlserver.password=password -Dtestcase=SQLServer*
> 
> ant clean test -Dthirdparty=true -Dsqoop.thirdparty.lib.dir=3rdpartylib 
> -Dsqoop.test.mysql.connectstring.host_url=mysql 
> -Dsqoop.test.mysql.databasename=databasename 
> -Dsqoop.test.mysql.password=password -Dsqoop.test.mysql.username=username 
> -Dsqoop.test.oracle.connectstring=oracle 
>