[jira] [Commented] (SQOOP-3428) Fix the CI

2019-10-16 Thread Fero Szabo (Jira)


[ 
https://issues.apache.org/jira/browse/SQOOP-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952792#comment-16952792
 ] 

Fero Szabo commented on SQOOP-3428:
---

[~fokko], I've added this Jira to the commit description. If you had a 
different one in mind, we can amend the commit! Otherwise, we should close this 
one, as the CI appears to be fixed. :) 

Thanks again!

> Fix the CI
> --
>
> Key: SQOOP-3428
> URL: https://issues.apache.org/jira/browse/SQOOP-3428
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, the CI is broken because the Oracle 11 XE Docker image isn't 
> available anymore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (SQOOP-3451) Importing FLOAT from Oracle to Hive results in INTEGER

2019-09-25 Thread Fero Szabo (Jira)


[ 
https://issues.apache.org/jira/browse/SQOOP-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937767#comment-16937767
 ] 

Fero Szabo edited comment on SQOOP-3451 at 9/25/19 2:17 PM:


Hi [~dionusos],

Yeah, I think you are right: Oracle is a pain to work with in this regard.

I've had the same issue when developing the fixed point decimal support for 
Avro and Parquet, namely that a column defined as NUMBER (without precision 
and scale) comes back with invalid metadata from the database (I believe 
something like -127 as scale, though please double-check this). And under the 
hood, I suspect Oracle is again using NUMBER to store the Float type.

In my case, the only missing piece was a proper scale to be able to pad a 
BigDecimal within Sqoop, so in SQOOP-2976 I created a flag that lets the user 
supply one. I'm not sure what to do in your case, as it's neither of those 
file formats (it's ORC, if I'm seeing this correctly). In any case, I believe 
you'll need to watch out for these "special" scale and precision values 
returned by Oracle and implement logic that maps them to proper values.

I used user input for this, via properties; it seemed the best option at the 
time for that particular case. I'm not sure my approach is the right one for 
you as well, though it's certainly an option.

So, TL;DR:

Track down where the Hive schema gets created and debug whether you can 
identify a Float coming from Oracle based on the precision and scale. You might 
want to check other number types, too.

Hope this helps!

 

(edited a mistake)


was (Author: fero):
Hi [~dionusos],

Yeah, I think you are right: Oracle is a pain to work with in this regard.

I've had the same issue when developing the fixed point number support for 
Avro and Parquet, namely that a column defined as NUMBER (without precision 
and scale) comes back with invalid metadata from the database (I believe 
something like -127 as scale, though please double-check this). And under the 
hood, I suspect Oracle is again using NUMBER to store the Float type.

In my case, the only missing piece was a proper scale to be able to pad a 
BigDecimal within Sqoop, so in SQOOP-2976 I created a flag that lets the user 
supply one. I'm not sure what to do in your case, as it's neither of those 
file formats (it's ORC, if I'm seeing this correctly). In any case, I believe 
you'll need to watch out for these "special" scale and precision values 
returned by Oracle and implement logic that maps them to proper values.

I used user input for this, via properties; it seemed the best option at the 
time for that particular case. I'm not sure my approach is the right one for 
you as well, though it's certainly an option.

So, TL;DR:

Track down where the Hive schema gets created and debug whether you can 
identify a Float coming from Oracle based on the precision and scale. You might 
want to check other number types, too.

Hope this helps!

 

(edited a mistake)

> Importing FLOAT from Oracle to Hive results in INTEGER
> --
>
> Key: SQOOP-3451
> URL: https://issues.apache.org/jira/browse/SQOOP-3451
> Project: Sqoop
>  Issue Type: Bug
>  Components: codegen, connectors/oracle, hive-integration
>Affects Versions: 1.4.7
>Reporter: Denes Bodo
>Priority: Major
>
> We ran into an issue where there is a table created in Oracle 11g:
> {noformat}
> create table floattest (column1 float(30), column2 number(30,-127), column3 
> number(30));
> {noformat}
> We want to import data from Oracle to Hive:
> {noformat}
> sqoop import -D 
> mapred.child.java.opts='-Djava.security.egd=file:/dev/../dev/urandom' 
> -Dmapreduce.job.queuename=default --connect 
> "jdbc:oracle:thin:@DBHOST:1521/xe" --username sqoop --password sqoop --table 
> floattest --hcatalog-database default --hcatalog-table floattest 
> --create-hcatalog-table --hcatalog-external-table --hcatalog-storage-stanza 
> "stored as orc" -m 1 --columns COLUMN1,COLUMN2,COLUMN3 --verbose
> {noformat}
> In Sqoop logs we see the following:
> {noformat}
> 19/09/24 13:51:45 INFO manager.SqlManager: Executing SQL statement: SELECT 
> t.* FROM floattest t WHERE 1=0
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN1 of type [2, 
> 30, -127]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN2 of type [2, 
> 30, -84]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN3 of type [2, 
> 30, 0]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column names 
> projected : [COLUMN1, COLUMN2, COLUMN3]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column name - info 
> map :
> COLUMN3 : [Type : 2,Precision : 30,Scale : 0]
> COLUMN2 : [Type : 2,Precision : 30,Scale : -84]
> COLUMN1 : [Type : 2,Precision : 30,Scale : -127]
> 19/09/24 13:51:45 

[jira] [Comment Edited] (SQOOP-3451) Importing FLOAT from Oracle to Hive results in INTEGER

2019-09-25 Thread Fero Szabo (Jira)


[ 
https://issues.apache.org/jira/browse/SQOOP-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937767#comment-16937767
 ] 

Fero Szabo edited comment on SQOOP-3451 at 9/25/19 2:17 PM:


Hi [~dionusos],

Yeah, I think you are right: Oracle is a pain to work with in this regard.

I've had the same issue when developing the fixed point number support for 
Avro and Parquet, namely that a column defined as NUMBER (without precision 
and scale) comes back with invalid metadata from the database (I believe 
something like -127 as scale, though please double-check this). And under the 
hood, I suspect Oracle is again using NUMBER to store the Float type.

In my case, the only missing piece was a proper scale to be able to pad a 
BigDecimal within Sqoop, so in SQOOP-2976 I created a flag that lets the user 
supply one. I'm not sure what to do in your case, as it's neither of those 
file formats (it's ORC, if I'm seeing this correctly). In any case, I believe 
you'll need to watch out for these "special" scale and precision values 
returned by Oracle and implement logic that maps them to proper values.

I used user input for this, via properties; it seemed the best option at the 
time for that particular case. I'm not sure my approach is the right one for 
you as well, though it's certainly an option.

So, TL;DR:

Track down where the Hive schema gets created and debug whether you can 
identify a Float coming from Oracle based on the precision and scale. You might 
want to check other number types, too.

Hope this helps!

 

(edited a mistake)


was (Author: fero):
Hi [~dionusos],

Yeah, I think you are right: Oracle is a pain to work with in this regard.

I've had the same issue when developing the floating point number support for 
Avro and Parquet, namely that a column defined as NUMBER (without precision 
and scale) comes back with invalid metadata from the database (I believe 
something like -127 as scale, though please double-check this). And under the 
hood, I suspect Oracle is again using NUMBER to store the Float type.

In my case, the only missing piece was a proper scale to be able to pad a 
BigDecimal within Sqoop, so in SQOOP-2976 I created a flag that lets the user 
supply one. I'm not sure what to do in your case, as it's neither of those 
file formats (it's ORC, if I'm seeing this correctly). In any case, I believe 
you'll need to watch out for these "special" scale and precision values 
returned by Oracle and implement logic that maps them to proper values.

I used user input for this, via properties; it seemed the best option at the 
time for that particular case. I'm not sure my approach is the right one for 
you as well, though it's certainly an option.

So, TL;DR:

Track down where the Hive schema gets created and debug whether you can 
identify a Float coming from Oracle based on the precision and scale. You might 
want to check other number types, too.

Hope this helps!

> Importing FLOAT from Oracle to Hive results in INTEGER
> --
>
> Key: SQOOP-3451
> URL: https://issues.apache.org/jira/browse/SQOOP-3451
> Project: Sqoop
>  Issue Type: Bug
>  Components: codegen, connectors/oracle, hive-integration
>Affects Versions: 1.4.7
>Reporter: Denes Bodo
>Priority: Major
>
> We ran into an issue where there is a table created in Oracle 11g:
> {noformat}
> create table floattest (column1 float(30), column2 number(30,-127), column3 
> number(30));
> {noformat}
> We want to import data from Oracle to Hive:
> {noformat}
> sqoop import -D 
> mapred.child.java.opts='-Djava.security.egd=file:/dev/../dev/urandom' 
> -Dmapreduce.job.queuename=default --connect 
> "jdbc:oracle:thin:@DBHOST:1521/xe" --username sqoop --password sqoop --table 
> floattest --hcatalog-database default --hcatalog-table floattest 
> --create-hcatalog-table --hcatalog-external-table --hcatalog-storage-stanza 
> "stored as orc" -m 1 --columns COLUMN1,COLUMN2,COLUMN3 --verbose
> {noformat}
> In Sqoop logs we see the following:
> {noformat}
> 19/09/24 13:51:45 INFO manager.SqlManager: Executing SQL statement: SELECT 
> t.* FROM floattest t WHERE 1=0
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN1 of type [2, 
> 30, -127]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN2 of type [2, 
> 30, -84]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN3 of type [2, 
> 30, 0]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column names 
> projected : [COLUMN1, COLUMN2, COLUMN3]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column name - info 
> map :
> COLUMN3 : [Type : 2,Precision : 30,Scale : 0]
> COLUMN2 : [Type : 2,Precision : 30,Scale : -84]
> COLUMN1 : [Type : 2,Precision : 30,Scale : -127]
> 19/09/24 13:51:45 INFO 

[jira] [Commented] (SQOOP-3451) Importing FLOAT from Oracle to Hive results in INTEGER

2019-09-25 Thread Fero Szabo (Jira)


[ 
https://issues.apache.org/jira/browse/SQOOP-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937767#comment-16937767
 ] 

Fero Szabo commented on SQOOP-3451:
---

Hi [~dionusos],

Yeah, I think you are right: Oracle is a pain to work with in this regard.

I've had the same issue when developing the floating point number support for 
Avro and Parquet, namely that a column defined as NUMBER (without precision 
and scale) comes back with invalid metadata from the database (I believe 
something like -127 as scale, though please double-check this). And under the 
hood, I suspect Oracle is again using NUMBER to store the Float type.

In my case, the only missing piece was a proper scale to be able to pad a 
BigDecimal within Sqoop, so in SQOOP-2976 I created a flag that lets the user 
supply one. I'm not sure what to do in your case, as it's neither of those 
file formats (it's ORC, if I'm seeing this correctly). In any case, I believe 
you'll need to watch out for these "special" scale and precision values 
returned by Oracle and implement logic that maps them to proper values.

I used user input for this, via properties; it seemed the best option at the 
time for that particular case. I'm not sure my approach is the right one for 
you as well, though it's certainly an option.

So, TL;DR:

Track down where the Hive schema gets created and debug whether you can 
identify a Float coming from Oracle based on the precision and scale. You might 
want to check other number types, too.

Hope this helps!
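To make that concrete, here is a minimal sketch of the kind of mapping logic I 
mean. The class, method, and fallback values are made up for illustration (this 
is not Sqoop's actual API); only the special scale values come from the log 
output quoted below.

{noformat}
// Hypothetical helper, for illustration only: normalize the precision/scale
// pair that Oracle reports through JDBC before generating the Hive schema.
public final class OracleNumericNormalizer {

  // Oracle reports a FLOAT column as NUMERIC (Type=2) with scale -127.
  private static final int ORACLE_FLOAT_SCALE = -127;

  public static int[] normalize(int precision, int scale) {
    if (scale == ORACLE_FLOAT_SCALE) {
      // No exact scale is available, so fall back to defaults; 38 is Hive's
      // maximum precision, and the scale of 10 is an arbitrary assumption
      // that a user-supplied property could override.
      return new int[] {38, 10};
    }
    if (scale < 0) {
      // A negative scale (e.g. NUMBER(30,-84)) rounds to the left of the
      // decimal point; a plain scale of 0 is the closest sensible mapping.
      return new int[] {precision, 0};
    }
    return new int[] {precision, scale};
  }
}
{noformat}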

> Importing FLOAT from Oracle to Hive results in INTEGER
> --
>
> Key: SQOOP-3451
> URL: https://issues.apache.org/jira/browse/SQOOP-3451
> Project: Sqoop
>  Issue Type: Bug
>  Components: codegen, connectors/oracle, hive-integration
>Affects Versions: 1.4.7
>Reporter: Denes Bodo
>Priority: Major
>
> We ran into an issue where there is a table created in Oracle 11g:
> {noformat}
> create table floattest (column1 float(30), column2 number(30,-127), column3 
> number(30));
> {noformat}
> We want to import data from Oracle to Hive:
> {noformat}
> sqoop import -D 
> mapred.child.java.opts='-Djava.security.egd=file:/dev/../dev/urandom' 
> -Dmapreduce.job.queuename=default --connect 
> "jdbc:oracle:thin:@DBHOST:1521/xe" --username sqoop --password sqoop --table 
> floattest --hcatalog-database default --hcatalog-table floattest 
> --create-hcatalog-table --hcatalog-external-table --hcatalog-storage-stanza 
> "stored as orc" -m 1 --columns COLUMN1,COLUMN2,COLUMN3 --verbose
> {noformat}
> In Sqoop logs we see the following:
> {noformat}
> 19/09/24 13:51:45 INFO manager.SqlManager: Executing SQL statement: SELECT 
> t.* FROM floattest t WHERE 1=0
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN1 of type [2, 
> 30, -127]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN2 of type [2, 
> 30, -84]
> 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN3 of type [2, 
> 30, 0]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column names 
> projected : [COLUMN1, COLUMN2, COLUMN3]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column name - info 
> map :
> COLUMN3 : [Type : 2,Precision : 30,Scale : 0]
> COLUMN2 : [Type : 2,Precision : 30,Scale : -84]
> COLUMN1 : [Type : 2,Precision : 30,Scale : -127]
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Creating HCatalog table 
> default.floattest for import
> 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: HCatalog Create table 
> statement:
> create external table `default`.`floattest` (
> `column1` decimal(30),
> `column2` decimal(30),
> `column3` decimal(30))
> stored as orc
> {noformat}
> From this output we can see that Oracle reports column1 with Type=2, which 
> is NUMERIC (according to 
> https://docs.oracle.com/javase/7/docs/api/constant-values.html#java.sql.Types.FLOAT).
>  Sqoop translates NUMERIC to DECIMAL 
> (https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java#L1050L1107).
>  Because Oracle uses {{scale=-127}} to signal that a NUMERIC is actually a 
> FLOAT, instead of reporting {{Type=6}}, Sqoop creates integers (decimals with 
> scale 0) from NUMBER.
> I think this is Oracle's fault, as it does not use Java Type=6 to signal the 
> type of a float. What do you think?
> 
> Thanks to [~mbalakrishnan] and Andrew Miller for the details and 
> investigation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (SQOOP-3442) Sqoop Java 11 support

2019-06-12 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3442:
-

 Summary: Sqoop Java 11 support
 Key: SQOOP-3442
 URL: https://issues.apache.org/jira/browse/SQOOP-3442
 Project: Sqoop
  Issue Type: Improvement
Reporter: Fero Szabo


In order for Sqoop to support Java 11, we'll have to bump the dependencies for 
the following:
 * Hadoop
 * Hive
 * HBase

This will be a major undertaking.

SQOOP-3441 took care of the necessary code changes.

We (the community) should also consider dropping ant support entirely, as 
hammering out the dependency upgrade in two build systems just doesn't make 
sense.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3441) Prepare Sqoop for Java 11 support

2019-06-05 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3441:
-

 Summary: Prepare Sqoop for Java 11 support
 Key: SQOOP-3441
 URL: https://issues.apache.org/jira/browse/SQOOP-3441
 Project: Sqoop
  Issue Type: Improvement
Reporter: Fero Szabo
Assignee: Fero Szabo


A couple of code changes will be required in order for Sqoop to work with 
Java 11, and we'll also have to bump a couple of dependencies and the gradle 
version.

I'm not sure what's required for ant; that is to be figured out in a separate 
Jira, if we keep the ant build.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3430) Fix broken CI

2019-03-11 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3430:
--
Description: I think both ant and gradle are enough  
!/jira/images/icons/emoticons/smile.png|width=16,height=16!  (was: Currently, 
the CI is broken because the Oracle 11 XE Docker image isn't available anymore.)

> Fix broken CI
> -
>
> Key: SQOOP-3430
> URL: https://issues.apache.org/jira/browse/SQOOP-3430
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>
> I think both ant and gradle are enough  
> !/jira/images/icons/emoticons/smile.png|width=16,height=16!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3430) Remove the old maven pom

2019-03-11 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789439#comment-16789439
 ] 

Fero Szabo commented on SQOOP-3430:
---

[~Fokko],

Could you please link your PR to this jira? (not sure if a new one needs to be 
opened)



I see no reason to keep the old maven pom.

> Remove the old maven pom
> 
>
> Key: SQOOP-3430
> URL: https://issues.apache.org/jira/browse/SQOOP-3430
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>
> I think both ant and gradle are enough :)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3428) Fix the CI

2019-03-11 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3428:
--
Summary: Fix the CI  (was: Remove the old Maven pom)

> Fix the CI
> --
>
> Key: SQOOP-3428
> URL: https://issues.apache.org/jira/browse/SQOOP-3428
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I think both ant and gradle are enough :-)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3428) Fix the CI

2019-03-11 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3428:
--
Description: Currently, the CI is broken because the Oracle 11 XE 
Docker image isn't available anymore.  (was: I think both ant and gradle are 
enough :-))

> Fix the CI
> --
>
> Key: SQOOP-3428
> URL: https://issues.apache.org/jira/browse/SQOOP-3428
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, the CI is broken because the Oracle 11 XE Docker image isn't 
> available anymore.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3430) Remove the old maven pom

2019-03-11 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3430:
--
Description: I think both ant and gradle are enough :)  (was: I think both 
ant and gradle are enough  
!/jira/images/icons/emoticons/smile.png|width=16,height=16!)

> Remove the old maven pom
> 
>
> Key: SQOOP-3430
> URL: https://issues.apache.org/jira/browse/SQOOP-3430
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>
> I think both ant and gradle are enough :)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3428) Fix the CI

2019-03-11 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789433#comment-16789433
 ] 

Fero Szabo commented on SQOOP-3428:
---

Hi [~Fokko],

I changed the Summary and removed the description to match the pull request. 
Apparently there was a mismatch I didn't spot in time (the Jira referenced in 
the pull request was this one, instead of SQOOP-3430, which you opened for 
fixing the CI). 

Anyway, this one is resolved and I am repurposing SQOOP-3430 for removing the 
old pom.

> Fix the CI
> --
>
> Key: SQOOP-3428
> URL: https://issues.apache.org/jira/browse/SQOOP-3428
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I think both ant and gradle are enough :-)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (SQOOP-3428) Fix the CI

2019-03-11 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo reassigned SQOOP-3428:
-

Assignee: Fokko Driesprong

> Fix the CI
> --
>
> Key: SQOOP-3428
> URL: https://issues.apache.org/jira/browse/SQOOP-3428
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I think both ant and gradle are enough :-)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3418) Document decimal support in Hive external import into parquet files

2018-12-05 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3418:
-

 Summary: Document decimal support in Hive external import into 
parquet files
 Key: SQOOP-3418
 URL: https://issues.apache.org/jira/browse/SQOOP-3418
 Project: Sqoop
  Issue Type: Task
Reporter: Fero Szabo
Assignee: Fero Szabo


Remember to note the limitations in Hive, i.e. that the maximum scale and 
precision is 38, and how it behaves in edge cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (SQOOP-3416) Give the default value of /var/lib/sqoop to the sqoopThirdPartyLib variable in the gradle build

2018-12-05 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo resolved SQOOP-3416.
---
Resolution: Won't Fix

This became obsolete, as the variable was deleted.

> Give the default value of /var/lib/sqoop to the sqoopThirdPartyLib variable 
> in the gradle build
> ---
>
> Key: SQOOP-3416
> URL: https://issues.apache.org/jira/browse/SQOOP-3416
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Minor
>
> Since sqoopThirdPartyLib doesn't have a default value, if one runs the 
> Oracle tests, one always has to specify the sqoop.thirdparty.lib.dir system 
> property.
> With this change, we just have to move the downloaded Oracle driver to 
> /var/lib/sqoop and avoid some typing. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (SQOOP-3417) Execute Oracle XE tests on Travis CI

2018-12-04 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo resolved SQOOP-3417.
---
Resolution: Fixed

> Execute Oracle XE tests on Travis CI
> 
>
> Key: SQOOP-3417
> URL: https://issues.apache.org/jira/browse/SQOOP-3417
> Project: Sqoop
>  Issue Type: Test
>Affects Versions: 1.4.7
>Reporter: Szabolcs Vasas
>Assignee: Szabolcs Vasas
>Priority: Major
>
> The task is to enable the Travis CI to execute Oracle XE tests too 
> automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3417) Execute Oracle XE tests on Travis CI

2018-12-04 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708772#comment-16708772
 ] 

Fero Szabo commented on SQOOP-3417:
---

Hi [~vasas],

Your change is now committed. Thank you for your contribution, good catch!

> Execute Oracle XE tests on Travis CI
> 
>
> Key: SQOOP-3417
> URL: https://issues.apache.org/jira/browse/SQOOP-3417
> Project: Sqoop
>  Issue Type: Test
>Affects Versions: 1.4.7
>Reporter: Szabolcs Vasas
>Assignee: Szabolcs Vasas
>Priority: Major
>
> The task is to enable the Travis CI to execute Oracle XE tests too 
> automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3416) Give the default value of /var/lib/sqoop to the sqoopThirdPartyLib variable in the gradle build

2018-11-30 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3416:
-

 Summary: Give the default value of /var/lib/sqoop to the 
sqoopThirdPartyLib variable in the gradle build
 Key: SQOOP-3416
 URL: https://issues.apache.org/jira/browse/SQOOP-3416
 Project: Sqoop
  Issue Type: Improvement
Reporter: Fero Szabo
Assignee: Fero Szabo


Since sqoopThirdPartyLib doesn't have a default value, if one runs the 
Oracle tests, one always has to specify the sqoop.thirdparty.lib.dir system 
property.

With this change, we just have to move the downloaded Oracle driver to 
/var/lib/sqoop and avoid some typing. 
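The gradle change itself is just a default value; in Java terms, the behaviour 
the tests would see amounts to something like the following (a sketch assuming 
the property name stays sqoop.thirdparty.lib.dir; the wrapper class is 
hypothetical):

{noformat}
// Hypothetical accessor, for illustration: resolve the third-party lib
// directory from the system property, defaulting to /var/lib/sqoop.
public final class ThirdPartyLibDir {
  public static String get() {
    return System.getProperty("sqoop.thirdparty.lib.dir", "/var/lib/sqoop");
  }
}
{noformat}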



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (SQOOP-3407) Introduce methods instead of TEMP_BASE_DIR and LOCAL_WAREHOUSE_DIR static fields

2018-11-21 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694648#comment-16694648
 ] 

Fero Szabo edited comment on SQOOP-3407 at 11/21/18 12:22 PM:
--

Hi [~vasas],

I've committed your patch, thank you for your contribution!

You can close the related review request.


was (Author: fero):
Hi [~vasas],

Thank you for your contribution!

You can close the related review request.

> Introduce methods instead of TEMP_BASE_DIR and LOCAL_WAREHOUSE_DIR static 
> fields
> 
>
> Key: SQOOP-3407
> URL: https://issues.apache.org/jira/browse/SQOOP-3407
> Project: Sqoop
>  Issue Type: Test
>Reporter: Szabolcs Vasas
>Assignee: Szabolcs Vasas
>Priority: Major
> Attachments: SQOOP-3407.patch
>
>
> BaseSqoopTestCase.TEMP_BASE_DIR and BaseSqoopTestCase.LOCAL_WAREHOUSE_DIR are 
> public static fields which get initialized once at the JVM startup and store 
> the paths for the test temp and warehouse directories.
> The problem is that HBase test cases change the value of the test.build.data 
> system property which can cause tests using these static fields to fail.
> Since we do not own the code in HBase that changes the system property, we 
> need to turn these static fields into methods that evaluate the 
> test.build.data system property every time they are invoked, which will make 
> sure the invoking tests succeed.
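A minimal before/after sketch of the refactoring described above (the field and 
method names follow the description; the exact path construction is an 
assumption, not Sqoop's actual code):

{noformat}
// Before: evaluated once at class-load time, so a later change to the
// test.build.data system property (e.g. by HBase tests) is never seen.
// public static final String TEMP_BASE_DIR =
//     System.getProperty("test.build.data", "/tmp/");

// After: the property is re-read on every invocation.
public static String getTempBaseDir() {
  return System.getProperty("test.build.data", "/tmp/");
}

public static String getLocalWarehouseDir() {
  return getTempBaseDir() + "/sqoop/warehouse";
}
{noformat}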



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3405) Refactor: break up Parameterized tests on a per database basis

2018-11-20 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3405:
--
Description: 
Follow the example of the abstract class SavedJobsTestBase and its subclasses!

We need this to be able to add test categories (for the Travis integration) as 
well.

  was:Follow the example of the abstract class SavedJobsTestBase and its 
subclasses!


> Refactor: break up Parameterized tests on a per database basis
> --
>
> Key: SQOOP-3405
> URL: https://issues.apache.org/jira/browse/SQOOP-3405
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>
> Follow the example of the abstract class SavedJobsTestBase and its 
> subclasses!
> We need this to be able to add test categories (for the Travis integration) 
> as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3405) Refactor: break up Parameterized tests on a per database basis

2018-11-20 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3405:
--
Summary: Refactor: break up Parameterized tests on a per database basis  
(was: Refactor: break up NumericTypesImportTest to be executable on a per 
database basis)

> Refactor: break up Parameterized tests on a per database basis
> --
>
> Key: SQOOP-3405
> URL: https://issues.apache.org/jira/browse/SQOOP-3405
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>
> Follow the example of the abstract class SavedJobsTestBase and its 
> subclasses!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3405) Refactor: break up NumericTypesImportTest to be executable on a per database basis

2018-11-20 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3405:
-

 Summary: Refactor: break up NumericTypesImportTest to be 
executable on a per database basis
 Key: SQOOP-3405
 URL: https://issues.apache.org/jira/browse/SQOOP-3405
 Project: Sqoop
  Issue Type: Sub-task
Reporter: Fero Szabo
Assignee: Fero Szabo


Follow the example of the abstract class SavedJobsTestBase and its subclasses!
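For reference, a sketch of the shape of that pattern. Only SavedJobsTestBase is 
a real class; the test and marker names below are illustrative, and JUnit's 
@Category mechanism is what the per-database CI filtering would build on:

{noformat}
import org.junit.Test;
import org.junit.experimental.categories.Category;

// Hypothetical marker interface used to select Oracle-only test runs.
interface OracleTest {}

// The abstract base holds the shared test logic; each database gets a
// small subclass that supplies its connection details.
abstract class NumericTypesImportTestBase {
  protected abstract String connectString();

  @Test
  public void testDecimalImport() {
    // shared assertions, executed against connectString() ...
  }
}

@Category(OracleTest.class)
class OracleNumericTypesImportTest extends NumericTypesImportTestBase {
  @Override
  protected String connectString() {
    return "jdbc:oracle:thin:@//localhost:1521/xe";
  }
}
{noformat}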



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (SQOOP-3403) Sqoop2: Add Fero Szabo to committer list in our pom file

2018-11-09 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo resolved SQOOP-3403.
---
Resolution: Fixed

> Sqoop2: Add Fero Szabo to committer list in our pom file
> 
>
> Key: SQOOP-3403
> URL: https://issues.apache.org/jira/browse/SQOOP-3403
> Project: Sqoop
>  Issue Type: Task
>Affects Versions: 1.99.8
>Reporter: Boglarka Egyed
>Assignee: Fero Szabo
>Priority: Major
>
> Now that [~fero] is committer we should update our committer list in the root 
> pom.xml file:



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-2949) SQL Syntax error when split-by column is of character type and min or max value has single quote inside it

2018-10-29 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667424#comment-16667424
 ] 

Fero Szabo commented on SQOOP-2949:
---

Hi [~gireeshp],

As we agreed offline, I've developed tests for this fix in SQOOP-3400. I've 
also posted your change on review board, because it's required for the tests. 
Hope you don't mind! (I mentioned that you developed it, wouldn't want to steal 
the credit!)

In any case, please feel free to review the tests if you can find the time!

Bests,

Fero

> SQL Syntax error when split-by column is of character type and min or max 
> value has single quote inside it
> --
>
> Key: SQOOP-2949
> URL: https://issues.apache.org/jira/browse/SQOOP-2949
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: Sqoop 1.4.6
> Run on Hadoop 2.6.0
> On Ubuntu
>Reporter: Gireesh Puthumana
>Assignee: Gireesh Puthumana
>Priority: Major
>
> Did a sqoop import from mysql table "emp", with split-by column "ename", 
> which is a varchar(100) type.
> +Used below command:+
> sqoop import --connect jdbc:mysql://localhost/testdb --username root 
> --password * --table emp --m 2 --target-dir /sqoopTest/5 --split-by ename;
> +Ename has following records:+
> | ename   |
> | gireesh |
> | aavesh  |
> | shiva'  |
> | jamir   |
> | balu|
> | santosh |
> | sameer  |
> Min value is "aavesh" and max value is "shiva'" (please note the single quote 
> inside max value).
> When run, it tried to execute below query in mapper 2 and failed:
> SELECT `ename`, `eid`, `deptid` FROM `emp` AS `emp` WHERE ( `ename` >= 
> 'jd聯聭聪G耀' ) AND ( `ename` <= 'shiva'' )
> +Stack trace:+
> {quote}
> 2016-06-05 16:54:06,749 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near ''shiva'' )' at line 1
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
>   at com.mysql.jdbc.Util.getInstance(Util.java:387)
>   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:942)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3966)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3902)
>   at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2526)
>   at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
>   at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
>   at 
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
>   at 
> com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {quote}
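For what it's worth, the core of the fix is doubling single quotes before the 
boundary value is inlined into the generated WHERE clause. A minimal sketch 
(the helper name is made up; binding the boundaries as JDBC parameters would be 
the more robust alternative):

{noformat}
// Hypothetical helper, for illustration: escape a split-by boundary value
// before embedding it in the generated WHERE clause. A value like shiva'
// becomes 'shiva''' instead of the syntactically invalid 'shiva''.
static String quoteSplitValue(String value) {
  return "'" + value.replace("'", "''") + "'";
}
{noformat}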



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3400) Create tests for SQOOP-2949, quote escaping in split-by

2018-10-24 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3400:
-

 Summary: Create tests for SQOOP-2949, quote escaping in split-by
 Key: SQOOP-3400
 URL: https://issues.apache.org/jira/browse/SQOOP-3400
 Project: Sqoop
  Issue Type: Sub-task
Reporter: Fero Szabo
Assignee: Fero Szabo






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-10-18 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3396:
-

 Summary: Add parquet numeric support for Parquet in Hive import
 Key: SQOOP-3396
 URL: https://issues.apache.org/jira/browse/SQOOP-3396
 Project: Sqoop
  Issue Type: Sub-task
Reporter: Fero Szabo






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (SQOOP-3396) Add parquet numeric support for Parquet in Hive import

2018-10-18 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo reassigned SQOOP-3396:
-

Assignee: Fero Szabo

> Add parquet numeric support for Parquet in Hive import
> --
>
> Key: SQOOP-3396
> URL: https://issues.apache.org/jira/browse/SQOOP-3396
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3382) Add parquet numeric support for Parquet in hdfs import

2018-10-18 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3382:
--
Summary: Add parquet numeric support for Parquet in hdfs import  (was: Add 
parquet numeric support and reuse existing Avro numeric tests Parquet)

> Add parquet numeric support for Parquet in hdfs import
> --
>
> Key: SQOOP-3382
> URL: https://issues.apache.org/jira/browse/SQOOP-3382
> Project: Sqoop
>  Issue Type: Sub-task
>Affects Versions: 1.4.7
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 3.0.0
>
>
> The current Avro numeric tests are suitable to be used as Parquet tests with 
> very minor modifications, as Parquet can be written with the same input and 
> nearly the same args. Since we are writing Parquet with its Avro support, it 
> would be good to cover this code with the same, or similar, tests (including 
> the edge cases related to padding and the missing scale and precision cases).
> Differences are:
>  * the expected output, since data stored in a Parquet file is different
>  * the input arguments



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3382) Add parquet numeric support and reuse existing Avro numeric tests Parquet

2018-10-17 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3382:
--
Summary: Add parquet numeric support and reuse existing Avro numeric tests 
Parquet  (was: Add parquet numeric support and refactor existing Avro numeric 
tests for reusability (with Parquet))

> Add parquet numeric support and reuse existing Avro numeric tests Parquet
> -
>
> Key: SQOOP-3382
> URL: https://issues.apache.org/jira/browse/SQOOP-3382
> Project: Sqoop
>  Issue Type: Sub-task
>Affects Versions: 1.4.7
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 3.0.0
>
>
> The current Avro numeric tests are suitable to be used as Parquet tests with 
> very minor modifications, as Parquet can be written with the same input and 
> nearly the same args. Since we are writing Parquet with its Avro support, it 
> would be good to cover this code with the same, or similar, tests (including 
> the edge cases related to padding and the missing scale and precision cases).
> Differences are:
>  * the expected output, since data stored in a Parquet file is different
>  * the input arguments



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3382) Add parquet numeric support and refactor existing Avro numeric tests for reusability (with Parquet)

2018-10-17 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3382:
--
Summary: Add parquet numeric support and refactor existing Avro numeric 
tests for reusability (with Parquet)  (was: Refactor existing Avro numeric 
tests for reusability (with Parquet))

> Add parquet numeric support and refactor existing Avro numeric tests for 
> reusability (with Parquet)
> ---
>
> Key: SQOOP-3382
> URL: https://issues.apache.org/jira/browse/SQOOP-3382
> Project: Sqoop
>  Issue Type: Sub-task
>Affects Versions: 1.4.7
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 3.0.0
>
>
> The current Avro numeric tests are suitable to be used as Parquet tests with 
> very minor modifications, as Parquet can be written with the same input and 
> nearly the same args. Since we are writing Parquet with its Avro support, it 
> would be good to cover this code with the same, or similar, tests (including 
> the edge cases related to padding and the missing scale and precision cases).
> Differences are:
>  * the expected output, since data stored in a Parquet file is different
>  * the input arguments



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3381) Upgrade the Parquet library from 1.6.0 to 1.9.0

2018-10-15 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650311#comment-16650311
 ] 

Fero Szabo commented on SQOOP-3381:
---

Hi [~dvoros],

Thanks for letting me know.

Anyway, I've just updated my patch on Reviewboard. I encountered the same 
security policy related issue as you did with the Hadoop upgrade in SQOOP-3305, 
so I've incorporated the DerbyPolicy and the related code changes in mine. I 
hope you approve, and can have a look at it. :)

I've decided to go for an older version of Hive, 2.1.1, since that suffices for 
this parquet upgrade as well.

 

> Upgrade the Parquet library from 1.6.0 to 1.9.0
> ---
>
> Key: SQOOP-3381
> URL: https://issues.apache.org/jira/browse/SQOOP-3381
> Project: Sqoop
>  Issue Type: Sub-task
>Affects Versions: 1.4.7
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 3.0.0
>
>
> As we will need to register a data supplier in the fix for parquet decimal 
> support, we will need a version that contains PARQUET-243.
> We need to upgrade the Parquet library to a version that contains this fix 
> and is compatible with Hadoop. Most probably, the newest version will be 
> adequate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-2331) Snappy Compression Support in Sqoop-HCatalog

2018-10-11 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16646753#comment-16646753
 ] 

Fero Szabo commented on SQOOP-2331:
---

Hi [~standon],

I wonder if you've managed to find the time to work on this? Or can you share 
any details on when you might be able to?

Thanks,

Fero

> Snappy Compression Support in Sqoop-HCatalog
> 
>
> Key: SQOOP-2331
> URL: https://issues.apache.org/jira/browse/SQOOP-2331
> Project: Sqoop
>  Issue Type: New Feature
>Affects Versions: 1.4.7
>Reporter: Atul Gupta
>Assignee: Shashank
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: SQOOP-2331_0.patch, SQOOP-2331_1.patch, 
> SQOOP-2331_2.patch, SQOOP-2331_2.patch, SQOOP-2331_3.patch
>
>
> Current Apache Sqoop 1.4.7 does not compress in gzip format with the 
> --compress option when used with the --hcatalog-table option. It also does 
> not support the --compression-codec snappy option with --hcatalog-table. It 
> would be nice to add both options in future Sqoop releases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3381) Upgrade the Parquet library from 1.6.0 to 1.9.0

2018-10-05 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639554#comment-16639554
 ] 

Fero Szabo commented on SQOOP-3381:
---

Hi [~dvoros],

Thanks for your comment and sorry for the late answer!

I've been pretty busy in the last few weeks with other issues, but am now 
ready to continue working on this one. Thanks for pointing me to the shaded 
parquet-hadoop-bundle. A few Hive tests are failing because of it... It makes 
me wonder if you've made progress with SQOOP-3305 in the meantime, and whether 
upgrading Hive to 3.1.0 would solve this problem?

Because of the failing Hive tests, I haven't tested on a cluster yet, but will 
certainly do so. 

> Upgrade the Parquet library from 1.6.0 to 1.9.0
> ---
>
> Key: SQOOP-3381
> URL: https://issues.apache.org/jira/browse/SQOOP-3381
> Project: Sqoop
>  Issue Type: Sub-task
>Affects Versions: 1.4.7
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 3.0.0
>
>
> As we will need to register a data supplier in the fix for parquet decimal 
> support, we will need a version that contains PARQUET-243.
> We need to upgrade the Parquet library to a version that contains this fix 
> and is compatible with Hadoop. Most probably, the newest version will be 
> adequate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3386) Add DB2 support to upstream documentation.

2018-09-25 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3386:
-

 Summary: Add DB2 support to upstream documentation.
 Key: SQOOP-3386
 URL: https://issues.apache.org/jira/browse/SQOOP-3386
 Project: Sqoop
  Issue Type: Task
Reporter: Fero Szabo


DB2 is actually supported by Sqoop, but it is not in the documentation's list 
of supported databases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-2639) Unable to export utf-8 data to MySQL using --direct mode

2018-09-17 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618300#comment-16618300
 ] 

Fero Szabo commented on SQOOP-2639:
---

Hi [~charsyam],

I presume you want to contribute to Sqoop? ;)

I cannot grant you any privileges, since I'm just a contributor myself, but in 
this case, you could ask to be added as a contributor on the sqoop-dev mailing 
list.

Best Regards,

Fero

> Unable to export utf-8 data to MySQL using --direct mode
> 
>
> Key: SQOOP-2639
> URL: https://issues.apache.org/jira/browse/SQOOP-2639
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors/mysql
>Affects Versions: 1.4.6
>Reporter: Ranjan Bagchi
>Priority: Major
> Attachments: sqoop-2639.patch
>
>
> I am able to import utf-8 (non-latin1) data successfully into HDFS via:
> sqoop import --connect jdbc:mysql://host/db --username XX --password YY \
> --mysql-delimiters \
> --table MYSQL_SRC_TABLE --target-dir ${SQOOP_DIR_PREFIX}/mysql_table 
> --direct 
> However, using 
> sqoop export --connect  jdbc:mysql://host/db --username XX --password YY \
> --mysql-delimiters \
> --table MYSQL_DEST_TABLE --export-dir ${SQOOP_DIR_PREFIX}/mysql_table 
> \
> --direct 
> Cuts off the fields after the first non-latin1 character (e.g. a letter with 
> an umlaut).
> I tried other options like  -- --default-character-set=utf8, without success.
> I was able to fix the problem with the following change:
> Change 
> https://svn.apache.org/repos/asf/sqoop/trunk/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java,
>  line 322 from 
> this.mysqlCharSet = MySQLUtils.MYSQL_DEFAULT_CHARSET;
> to
> this.mysqlCharSet = "utf-8"; 
> Hope this helps



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3381) Upgrade the Parquet library from 1.6.0 to 1.9.0

2018-09-12 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612271#comment-16612271
 ] 

Fero Szabo commented on SQOOP-3381:
---

Hi [~dvoros],

I think this change might affect the Hadoop 3.0 upgrade...

Can you perhaps comment on this? 

Thanks!

Fero

> Upgrade the Parquet library from 1.6.0 to 1.9.0
> ---
>
> Key: SQOOP-3381
> URL: https://issues.apache.org/jira/browse/SQOOP-3381
> Project: Sqoop
>  Issue Type: Sub-task
>Affects Versions: 1.4.7
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 3.0.0
>
>
> As we will need to register a data supplier in the fix for parquet decimal 
> support, we will need a version that contains PARQUET-243.
> We need to upgrade the Parquet library to a version that contains this fix 
> and is compatible with Hadoop. Most probably, the newest version will be 
> adequate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3381) Upgrade the Parquet library from 1.6.0 to 1.9.0

2018-09-12 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3381:
--
Summary: Upgrade the Parquet library from 1.6.0 to 1.9.0  (was: Upgrade the 
Parquet library)

> Upgrade the Parquet library from 1.6.0 to 1.9.0
> ---
>
> Key: SQOOP-3381
> URL: https://issues.apache.org/jira/browse/SQOOP-3381
> Project: Sqoop
>  Issue Type: Sub-task
>Affects Versions: 1.4.7
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 3.0.0
>
>
> As we will need to register a data supplier in the fix for parquet decimal 
> support, we will need a version that contains PARQUET-243.
> We need to upgrade the Parquet library to a version that contains this fix 
> and is compatible with Hadoop. Most probably, the newest version will be 
> adequate. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3382) Refactor existing Avro numeric tests for reusability (with Parquet)

2018-09-10 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3382:
-

 Summary: Refactor existing Avro numeric tests for reusability 
(with Parquet)
 Key: SQOOP-3382
 URL: https://issues.apache.org/jira/browse/SQOOP-3382
 Project: Sqoop
  Issue Type: Sub-task
Affects Versions: 1.4.7
Reporter: Fero Szabo
Assignee: Fero Szabo
 Fix For: 3.0.0


The current Avro numeric tests are suitable to be used as Parquet tests with 
very minor modifications, as Parquet can be written with the same input and 
nearly the same args. Since we are writing Parquet with its Avro support, it 
would be good to cover this code with the same, or similar, tests (including 
the edge cases related to padding and the missing scale and precision cases).

Differences are (see the sketch below):
 * the expected output, since data stored in a Parquet file is different
 * the input arguments
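A sketch of the reuse idea (the class and method names are illustrative, not 
the actual test code): the shared base drives the import, and format-specific 
subclasses override only the two differences listed above.

{noformat}
// Illustrative only: the base class owns the flow, subclasses supply the
// format-specific arguments and the expected on-disk representation.
abstract class NumericTypesTestBase {
  protected abstract String[] fileFormatArgs();  // e.g. --as-parquetfile
  protected abstract String expectedRecord();    // format-specific output

  public void runImportAndVerify() throws Exception {
    // run sqoop import with fileFormatArgs(), read the result back and
    // compare it against expectedRecord() ...
  }
}
{noformat}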



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3381) Upgrade the Parquet library

2018-09-10 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3381:
-

 Summary: Upgrade the Parquet library
 Key: SQOOP-3381
 URL: https://issues.apache.org/jira/browse/SQOOP-3381
 Project: Sqoop
  Issue Type: Sub-task
Affects Versions: 1.4.7
Reporter: Fero Szabo
Assignee: Fero Szabo
 Fix For: 3.0.0


As we will need to register a data supplier in the fix for parquet decimal 
support, we will need a version that contains PARQUET-243.

We need to upgrade the Parquet library to a version that contains this fix and 
is compatible with Hadoop. Most probably, the newest version will be adequate. 
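For context, PARQUET-243 made the GenericData instance used by parquet-avro 
pluggable through an AvroDataSupplier. A sketch of how decimal support could 
hook into that (the supplier class below is hypothetical, not Sqoop's eventual 
code):

{noformat}
import org.apache.avro.Conversions;
import org.apache.avro.generic.GenericData;
import org.apache.parquet.avro.AvroDataSupplier;

// Hypothetical supplier: return a GenericData that knows how to convert
// Avro's decimal logical type, so decimals round-trip as BigDecimal.
public class DecimalDataSupplier implements AvroDataSupplier {
  @Override
  public GenericData get() {
    GenericData data = new GenericData();
    data.addLogicalTypeConversion(new Conversions.DecimalConversion());
    return data;
  }
}
{noformat}

Registration would then go through parquet-avro's read support, e.g. 
AvroReadSupport.setAvroDataSupplier(conf, DecimalDataSupplier.class).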



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3380) parquet-configurator-implementation is not recognized as an option

2018-09-04 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3380:
-

 Summary: parquet-configurator-implementation is not recognized as 
an option
 Key: SQOOP-3380
 URL: https://issues.apache.org/jira/browse/SQOOP-3380
 Project: Sqoop
  Issue Type: Bug
Reporter: Fero Szabo
Assignee: Szabolcs Vasas


The parquet-configurator-implementation option was added to Sqoop with 
SQOOP-3329: Remove Kite dependency from the Sqoop project, but the command line 
parser doesn't recognize it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-2949) SQL Syntax error when split-by column is of character type and min or max value has single quote inside it

2018-09-03 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16601955#comment-16601955
 ] 

Fero Szabo commented on SQOOP-2949:
---

Hi [~gireeshp],

My email is [f...@cloudera.com|mailto:f...@cloudera.com] 

The release process doesn't have a defined schedule, yet, so there is no 
timeline. There is only 1 item left from the discussed items that is still 
pending (Hadoop 3 / Hive 3 / Hbase 2 support), i.e. just a library upgrade on 
the Sqoop side.

> SQL Syntax error when split-by column is of character type and min or max 
> value has single quote inside it
> --
>
> Key: SQOOP-2949
> URL: https://issues.apache.org/jira/browse/SQOOP-2949
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: Sqoop 1.4.6
> Run on Hadoop 2.6.0
> On Ubuntu
>Reporter: Gireesh Puthumana
>Assignee: Gireesh Puthumana
>Priority: Major
>
> Did a sqoop import from mysql table "emp", with split-by column "ename", 
> which is a varchar(100) type.
> +Used below command:+
> sqoop import --connect jdbc:mysql://localhost/testdb --username root 
> --password * --table emp --m 2 --target-dir /sqoopTest/5 --split-by ename;
> +Ename has following records:+
> | ename   |
> | gireesh |
> | aavesh  |
> | shiva'  |
> | jamir   |
> | balu|
> | santosh |
> | sameer  |
> Min value is "aavesh" and max value is "shiva'" (please note the single quote 
> inside max value).
> When run, it tried to execute below query in mapper 2 and failed:
> SELECT `ename`, `eid`, `deptid` FROM `emp` AS `emp` WHERE ( `ename` >= 
> 'jd聯聭聪G耀' ) AND ( `ename` <= 'shiva'' )
> +Stack trace:+
> {quote}
> 2016-06-05 16:54:06,749 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near ''shiva'' )' at line 1
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
>   at com.mysql.jdbc.Util.getInstance(Util.java:387)
>   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:942)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3966)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3902)
>   at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2526)
>   at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
>   at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
>   at 
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
>   at 
> com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3377) True Parquet Decimal Support

2018-08-31 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3377:
--
Description: 
Currently, fixed point numbers (decimal, number) are stored as a String when 
imported to parquet. This Jira is about adding the capability to store them as 
logical types (as we do in avro).

The parquet library might have to be upgraded.

  was:Currently, fixed point numbers (decimal, number) are stored as a String 
when imported to parquet. This Jira is about adding the capability to store 
them as logical types (as we do in avro).


> True Parquet Decimal Support
> 
>
> Key: SQOOP-3377
> URL: https://issues.apache.org/jira/browse/SQOOP-3377
> Project: Sqoop
>  Issue Type: Improvement
>Affects Versions: 1.4.7
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently, fixed point numbers (decimal, number) are stored as Strings when 
> imported to Parquet. This Jira is about adding the capability to store them 
> as logical types (as we do in Avro).
> The Parquet library might have to be upgraded.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3377) True Parquet Decimal Support

2018-08-31 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3377:
-

 Summary: True Parquet Decimal Support
 Key: SQOOP-3377
 URL: https://issues.apache.org/jira/browse/SQOOP-3377
 Project: Sqoop
  Issue Type: Improvement
Affects Versions: 1.4.7
Reporter: Fero Szabo
Assignee: Fero Szabo
 Fix For: 3.0.0


Currently, fixed point numbers (decimal, number) are stored as Strings when 
imported to Parquet. This Jira is about adding the capability to store them as 
logical types (as we do in Avro).
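On the Avro side this works by annotating a bytes schema with a decimal logical type; a minimal sketch of the same idea (precision and scale are example values, and this is not the eventual Sqoop code):
{code}
import org.apache.avro.LogicalTypes;
import org.apache.avro.Schema;

public class DecimalSchemaSketch {
  // Sketch: a fixed point column declared as a decimal logical type
  // instead of a String; 10 and 2 mirror a DECIMAL(10, 2) column.
  static Schema decimalSchema() {
    return LogicalTypes.decimal(10, 2)
        .addToSchema(Schema.create(Schema.Type.BYTES));
  }
}
{code}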



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-2331) Snappy Compression Support in Sqoop-HCatalog

2018-08-31 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598603#comment-16598603
 ] 

Fero Szabo commented on SQOOP-2331:
---

[~standon],

Also, [~BoglarkaEgyed] mentioned to me that Sqoop has a test that you might 
be able to reuse:

org.apache.sqoop.TestCompression

 

> Snappy Compression Support in Sqoop-HCatalog
> 
>
> Key: SQOOP-2331
> URL: https://issues.apache.org/jira/browse/SQOOP-2331
> Project: Sqoop
>  Issue Type: New Feature
>Affects Versions: 1.4.7
>Reporter: Atul Gupta
>Assignee: Shashank
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: SQOOP-2331_0.patch, SQOOP-2331_1.patch, 
> SQOOP-2331_2.patch, SQOOP-2331_2.patch, SQOOP-2331_3.patch
>
>
> Current Apache Sqoop 1.4.7 does not compress in gzip format with the 
> --compress option when used with the --hcatalog-table option. It also does 
> not support the --compression-codec snappy option with --hcatalog-table. It 
> would be nice to add both options in future Sqoop releases.
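Under the hood this boils down to the standard MapReduce output compression settings; a hedged sketch of what the HCatalog export path would need to apply (plain MapReduce API shown, the HCatalog integration may need extra wiring):
{code}
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SnappyOutputSketch {
  // Sketch: enabling Snappy output compression on a MapReduce job.
  static void enableSnappy(Job job) {
    FileOutputFormat.setCompressOutput(job, true);
    FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);
  }
}
{code}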



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-2949) SQL Syntax error when split-by column is of character type and min or max value has single quote inside it

2018-08-30 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597566#comment-16597566
 ] 

Fero Szabo commented on SQOOP-2949:
---

Hi [~gireeshp],

Do you have an update on this issue? We'd like to get this one into the next 
release, if possible... :)

> SQL Syntax error when split-by column is of character type and min or max 
> value has single quote inside it
> --
>
> Key: SQOOP-2949
> URL: https://issues.apache.org/jira/browse/SQOOP-2949
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: Sqoop 1.4.6
> Run on Hadoop 2.6.0
> On Ubuntu
>Reporter: Gireesh Puthumana
>Assignee: Gireesh Puthumana
>Priority: Major
>
> Did a sqoop import from mysql table "emp", with split-by column "ename", 
> which is a varchar(100) type.
> +Used below command:+
> sqoop import --connect jdbc:mysql://localhost/testdb --username root 
> --password * --table emp --m 2 --target-dir /sqoopTest/5 --split-by ename;
> +Ename has following records:+
> | ename   |
> | gireesh |
> | aavesh  |
> | shiva'  |
> | jamir   |
> | balu|
> | santosh |
> | sameer  |
> Min value is "aavesh" and max value is "shiva'" (please note the single quote 
> inside max value).
> When run, it tried to execute below query in mapper 2 and failed:
> SELECT `ename`, `eid`, `deptid` FROM `emp` AS `emp` WHERE ( `ename` >= 
> 'jd聯聭聪G耀' ) AND ( `ename` <= 'shiva'' )
> +Stack trace:+
> {quote}
> 2016-06-05 16:54:06,749 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near ''shiva'' )' at line 1
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
>   at com.mysql.jdbc.Util.getInstance(Util.java:387)
>   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:942)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3966)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3902)
>   at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2526)
>   at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
>   at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
>   at 
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
>   at 
> com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-2331) Snappy Compression Support in Sqoop-HCatalog

2018-08-28 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594979#comment-16594979
 ] 

Fero Szabo commented on SQOOP-2331:
---

Hi [~standon],

I've reviewed the modified code and made a few suggestions on ReviewBoard. Can 
you please have a look? 

Also, I wonder if you saw the update there when I posted it, as it was two 
weeks ago... 

Thanks,

Fero

> Snappy Compression Support in Sqoop-HCatalog
> 
>
> Key: SQOOP-2331
> URL: https://issues.apache.org/jira/browse/SQOOP-2331
> Project: Sqoop
>  Issue Type: New Feature
>Affects Versions: 1.4.7
>Reporter: Atul Gupta
>Assignee: Shashank
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: SQOOP-2331_0.patch, SQOOP-2331_1.patch, 
> SQOOP-2331_2.patch, SQOOP-2331_2.patch, SQOOP-2331_3.patch
>
>
> Current Apache Sqoop 1.4.7 does not compress in gzip format with the 
> --compress option when used with the --hcatalog-table option. It also does 
> not support the --compression-codec snappy option with --hcatalog-table. It 
> would be nice to add both options in future Sqoop releases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (SQOOP-2331) Snappy Compression Support in Sqoop-HCatalog

2018-08-09 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574718#comment-16574718
 ] 

Fero Szabo edited comment on SQOOP-2331 at 8/9/18 11:36 AM:


Hi [~standon],

Sure, can do. I've got compile errors, though.
{noformat}
    [javac] 
/Users/ferencszabo/workspace/sqoop/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java:85:
 error: package com.cloudera.sqoop.io does not exist{noformat}
This package got changed to _org.apache.sqoop.io_, so you'll have to modify the 
import to the new package.

Also, can you please update the review on ReviewBoard (upload the latest 
diff)? We can continue the code-related discussion there.

Thanks,

Fero


was (Author: fero):
Hi [~standon],

Sure can do. I've compile errors, though.
{noformat}
    [javac] 
/Users/ferencszabo/workspace/sqoop/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java:85:
 error: package com.cloudera.sqoop.io does not exist{noformat}
This package got change to _org.apache.sqoop.io_, so you'll have to modify the 
import to the new package.


Also, can you please update the review on ReviewBoard please? We can continue 
the code related discussion there.

Thanks,

Fero

> Snappy Compression Support in Sqoop-HCatalog
> 
>
> Key: SQOOP-2331
> URL: https://issues.apache.org/jira/browse/SQOOP-2331
> Project: Sqoop
>  Issue Type: New Feature
>Affects Versions: 1.4.7
>Reporter: Atul Gupta
>Assignee: Shashank
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: SQOOP-2331_0.patch, SQOOP-2331_1.patch, 
> SQOOP-2331_2.patch, SQOOP-2331_2.patch
>
>
> Current Apache Sqoop 1.4.7 does not compress in gzip format with the 
> --compress option when used with the --hcatalog-table option. It also does 
> not support the --compression-codec snappy option with --hcatalog-table. It 
> would be nice to add both options in future Sqoop releases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-2331) Snappy Compression Support in Sqoop-HCatalog

2018-08-09 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574718#comment-16574718
 ] 

Fero Szabo commented on SQOOP-2331:
---

Hi [~standon],

Sure, can do. I've got compile errors, though.
{noformat}
    [javac] 
/Users/ferencszabo/workspace/sqoop/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java:85:
 error: package com.cloudera.sqoop.io does not exist{noformat}
This package got changed to _org.apache.sqoop.io_, so you'll have to modify the 
import to the new package.


Also, can you please update the review on ReviewBoard? We can continue 
the code-related discussion there.

Thanks,

Fero
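
For clarity, the fix on your side is a one-line import change; the class name below is illustrative, whichever org.apache.sqoop.io class that line actually imports:
{code}
// import com.cloudera.sqoop.io.CodecMap;  // old package, removed
import org.apache.sqoop.io.CodecMap;       // relocated package
{code}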

> Snappy Compression Support in Sqoop-HCatalog
> 
>
> Key: SQOOP-2331
> URL: https://issues.apache.org/jira/browse/SQOOP-2331
> Project: Sqoop
>  Issue Type: New Feature
>Affects Versions: 1.4.7
>Reporter: Atul Gupta
>Assignee: Shashank
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: SQOOP-2331_0.patch, SQOOP-2331_1.patch, 
> SQOOP-2331_2.patch, SQOOP-2331_2.patch
>
>
> Current Apache Sqoop 1.4.7 does not compress in gzip format with the 
> --compress option when used with the --hcatalog-table option. It also does 
> not support the --compression-codec snappy option with --hcatalog-table. It 
> would be nice to add both options in future Sqoop releases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (SQOOP-2949) SQL Syntax error when split-by column is of character type and min or max value has single quote inside it

2018-08-07 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo reassigned SQOOP-2949:
-

Assignee: Gireesh Puthumana

> SQL Syntax error when split-by column is of character type and min or max 
> value has single quote inside it
> --
>
> Key: SQOOP-2949
> URL: https://issues.apache.org/jira/browse/SQOOP-2949
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: Sqoop 1.4.6
> Run on Hadoop 2.6.0
> On Ubuntu
>Reporter: Gireesh Puthumana
>Assignee: Gireesh Puthumana
>Priority: Major
>
> Did a sqoop import from mysql table "emp", with split-by column "ename", 
> which is a varchar(100) type.
> +Used below command:+
> sqoop import --connect jdbc:mysql://localhost/testdb --username root 
> --password * --table emp --m 2 --target-dir /sqoopTest/5 --split-by ename;
> +Ename has following records:+
> | ename   |
> | gireesh |
> | aavesh  |
> | shiva'  |
> | jamir   |
> | balu|
> | santosh |
> | sameer  |
> Min value is "aavesh" and max value is "shiva'" (please note the single quote 
> inside max value).
> When run, it tried to execute below query in mapper 2 and failed:
> SELECT `ename`, `eid`, `deptid` FROM `emp` AS `emp` WHERE ( `ename` >= 
> 'jd聯聭聪G耀' ) AND ( `ename` <= 'shiva'' )
> +Stack trace:+
> {quote}
> 2016-06-05 16:54:06,749 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near ''shiva'' )' at line 1
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
>   at com.mysql.jdbc.Util.getInstance(Util.java:387)
>   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:942)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3966)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3902)
>   at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2526)
>   at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
>   at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
>   at 
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
>   at 
> com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-2949) SQL Syntax error when split-by column is of character type and min or max value has single quote inside it

2018-08-07 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571394#comment-16571394
 ] 

Fero Szabo commented on SQOOP-2949:
---

Hi [~gireeshp],

Great to hear!

We are trying to improve the test coverage of Sqoop so that we can ensure that 
a new change won't break existing use cases. So the first step is to create a 
few basic test cases for the change.

For this, you will need to install the Docker images for the third-party tests 
if you haven't done so yet. Please see COMPILING.txt in the root directory of the 
project for the details. (The relevant section starts at line 221: "Setting up 
and executing third-party tests...")

*1. Creating tests*

As a start, I'd suggest modifying SQLServerSplitByTest.java. A test for SQL 
Server is a good first step; since this is a fundamental change, though, tests 
for the following RDBMS implementations would also be nice:
 * Oracle
 * MySQL
 * PostgreSQL

The committers / reviewers might ask for these anyway, as Oracle is the most 
popular database implementation used with Sqoop.

*2. Creating a review*

Please go to Review Board at [https://reviews.apache.org/account/login/] and 
register if you haven't done so already. Then, create a patch by invoking _git 
diff > SQOOP-2949-1.patch_ on the command line. Finally, create a review using 
the _sqoop-trunk_ repository and your patch. Fill in the necessary fields, as, 
for example, in this review: [https://reviews.apache.org/r/65607/] (no need for 
a description this long, nobody likes to read this much :) ).

Once this is done, we'll continue the technical discussion on ReviewBoard!

Please let me know if you have any questions here, or via email (you should be 
able to see my email address under my name).

Best Regards,

Fero

> SQL Syntax error when split-by column is of character type and min or max 
> value has single quote inside it
> --
>
> Key: SQOOP-2949
> URL: https://issues.apache.org/jira/browse/SQOOP-2949
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: Sqoop 1.4.6
> Run on Hadoop 2.6.0
> On Ubuntu
>Reporter: Gireesh Puthumana
>Priority: Major
>
> Did a sqoop import from mysql table "emp", with split-by column "ename", 
> which is a varchar(100) type.
> +Used below command:+
> sqoop import --connect jdbc:mysql://localhost/testdb --username root 
> --password * --table emp --m 2 --target-dir /sqoopTest/5 --split-by ename;
> +Ename has following records:+
> | ename   |
> | gireesh |
> | aavesh  |
> | shiva'  |
> | jamir   |
> | balu|
> | santosh |
> | sameer  |
> Min value is "aavesh" and max value is "shiva'" (please note the single quote 
> inside max value).
> When run, it tried to execute below query in mapper 2 and failed:
> SELECT `ename`, `eid`, `deptid` FROM `emp` AS `emp` WHERE ( `ename` >= 
> 'jd聯聭聪G耀' ) AND ( `ename` <= 'shiva'' )
> +Stack trace:+
> {quote}
> 2016-06-05 16:54:06,749 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near ''shiva'' )' at line 1
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
>   at com.mysql.jdbc.Util.getInstance(Util.java:387)
>   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:942)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3966)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3902)
>   at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2526)
>   at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
>   at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
>   at 
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
>   at 
> com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> 

[jira] [Assigned] (SQOOP-2949) SQL Syntax error when split-by column is of character type and min or max value has single quote inside it

2018-08-07 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo reassigned SQOOP-2949:
-

Assignee: (was: Fero Szabo)

> SQL Syntax error when split-by column is of character type and min or max 
> value has single quote inside it
> --
>
> Key: SQOOP-2949
> URL: https://issues.apache.org/jira/browse/SQOOP-2949
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: Sqoop 1.4.6
> Run on Hadoop 2.6.0
> On Ubuntu
>Reporter: Gireesh Puthumana
>Priority: Major
>
> Did a sqoop import from mysql table "emp", with split-by column "ename", 
> which is a varchar(100) type.
> +Used below command:+
> sqoop import --connect jdbc:mysql://localhost/testdb --username root 
> --password * --table emp --m 2 --target-dir /sqoopTest/5 --split-by ename;
> +Ename has following records:+
> | ename   |
> | gireesh |
> | aavesh  |
> | shiva'  |
> | jamir   |
> | balu|
> | santosh |
> | sameer  |
> Min value is "aavesh" and max value is "shiva'" (please note the single quote 
> inside max value).
> When run, it tried to execute below query in mapper 2 and failed:
> SELECT `ename`, `eid`, `deptid` FROM `emp` AS `emp` WHERE ( `ename` >= 
> 'jd聯聭聪G耀' ) AND ( `ename` <= 'shiva'' )
> +Stack trace:+
> {quote}
> 2016-06-05 16:54:06,749 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near ''shiva'' )' at line 1
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
>   at com.mysql.jdbc.Util.getInstance(Util.java:387)
>   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:942)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3966)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3902)
>   at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2526)
>   at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
>   at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
>   at 
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
>   at 
> com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (SQOOP-2949) SQL Syntax error when split-by column is of character type and min or max value has single quote inside it

2018-08-07 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo reassigned SQOOP-2949:
-

Assignee: Fero Szabo

> SQL Syntax error when split-by column is of character type and min or max 
> value has single quote inside it
> --
>
> Key: SQOOP-2949
> URL: https://issues.apache.org/jira/browse/SQOOP-2949
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: Sqoop 1.4.6
> Run on Hadoop 2.6.0
> On Ubuntu
>Reporter: Gireesh Puthumana
>Assignee: Fero Szabo
>Priority: Major
>
> Did a sqoop import from mysql table "emp", with split-by column "ename", 
> which is a varchar(100) type.
> +Used below command:+
> sqoop import --connect jdbc:mysql://localhost/testdb --username root 
> --password * --table emp --m 2 --target-dir /sqoopTest/5 --split-by ename;
> +Ename has following records:+
> | ename   |
> | gireesh |
> | aavesh  |
> | shiva'  |
> | jamir   |
> | balu|
> | santosh |
> | sameer  |
> Min value is "aavesh" and max value is "shiva'" (please note the single quote 
> inside max value).
> When run, it tried to execute below query in mapper 2 and failed:
> SELECT `ename`, `eid`, `deptid` FROM `emp` AS `emp` WHERE ( `ename` >= 
> 'jd聯聭聪G耀' ) AND ( `ename` <= 'shiva'' )
> +Stack trace:+
> {quote}
> 2016-06-05 16:54:06,749 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near ''shiva'' )' at line 1
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
>   at com.mysql.jdbc.Util.getInstance(Util.java:387)
>   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:942)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3966)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3902)
>   at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2526)
>   at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
>   at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
>   at 
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
>   at 
> com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-2949) SQL Syntax error when split-by column is of character type and min or max value has single quote inside it

2018-07-31 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563657#comment-16563657
 ] 

Fero Szabo commented on SQOOP-2949:
---

Hi [~gireeshp],

I'm wondering if you would still like to contribute this change to Sqoop.

If so, I would be happy to guide you through the process. I believe that a 
couple of tests and a new review request on ReviewBoard are all that is needed. 
(And then, hopefully, we would be able to get the attention of one of the 
committers or PMCs and get it committed.)

If not, then I'd be happy to take this one over from you, as this fix would 
provide value to our customers and also to the Sqoop community.

Kind Regards,

Fero

> SQL Syntax error when split-by column is of character type and min or max 
> value has single quote inside it
> --
>
> Key: SQOOP-2949
> URL: https://issues.apache.org/jira/browse/SQOOP-2949
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: Sqoop 1.4.6
> Run on Hadoop 2.6.0
> On Ubuntu
>Reporter: Gireesh Puthumana
>Priority: Major
>
> Did a sqoop import from mysql table "emp", with split-by column "ename", 
> which is a varchar(100) type.
> +Used below command:+
> sqoop import --connect jdbc:mysql://localhost/testdb --username root 
> --password * --table emp --m 2 --target-dir /sqoopTest/5 --split-by ename;
> +Ename has following records:+
> | ename   |
> | gireesh |
> | aavesh  |
> | shiva'  |
> | jamir   |
> | balu|
> | santosh |
> | sameer  |
> Min value is "aavesh" and max value is "shiva'" (please note the single quote 
> inside max value).
> When run, it tried to execute below query in mapper 2 and failed:
> SELECT `ename`, `eid`, `deptid` FROM `emp` AS `emp` WHERE ( `ename` >= 
> 'jd聯聭聪G耀' ) AND ( `ename` <= 'shiva'' )
> +Stack trace:+
> {quote}
> 2016-06-05 16:54:06,749 ERROR [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception: 
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near ''shiva'' )' at line 1
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
>   at com.mysql.jdbc.Util.getInstance(Util.java:387)
>   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:942)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3966)
>   at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3902)
>   at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2526)
>   at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
>   at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2549)
>   at 
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
>   at 
> com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
>   at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>   at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>   at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3355) Document SQOOP-1905 DB2 --schema option

2018-07-26 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3355:
--
Summary: Document SQOOP-1905 DB2 --schema option  (was: Document SQOOP-1905)

> Document SQOOP-1905 DB2 --schema option
> ---
>
> Key: SQOOP-3355
> URL: https://issues.apache.org/jira/browse/SQOOP-3355
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3355) Document SQOOP-1905

2018-07-26 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3355:
-

 Summary: Document SQOOP-1905
 Key: SQOOP-3355
 URL: https://issues.apache.org/jira/browse/SQOOP-3355
 Project: Sqoop
  Issue Type: Sub-task
Reporter: Fero Szabo
Assignee: Fero Szabo






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3332) Extend Documentation of --resilient flag and add warning message when detected

2018-06-26 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3332:
--
Description: 
The resilient flag can be used to trigger the retry mechanism in the SQL Server 
connector. The documentation only states that it can be used in export; 
however, it can be used in import as well.

Also, the feature itself relies on the implicit assumption that the split-by 
column is unique and sorted in ascending order. The users have to be warned 
about this limitation, at the very least.

  was:
The non-resilient flag can be used to avoid the retry mechanism in the SQL 
Server connector. The documentation only tells that it can be used in export, 
however it can be used in import as well.

Also, the feature itself relies on the implicit assumption that the split-by 
column is unique and sorted in ascending order. The users have to be warned 
about this limitation, at the very least.


> Extend Documentation of --resilient flag and add warning message when detected
> --
>
> Key: SQOOP-3332
> URL: https://issues.apache.org/jira/browse/SQOOP-3332
> Project: Sqoop
>  Issue Type: Task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>
> The resilient flag can be used to trigger the retry mechanism in the SQL 
> Server connector. The documentation only states that it can be used in 
> export; however, it can be used in import as well.
> Also, the feature itself relies on the implicit assumption that the split-by 
> column is unique and sorted in ascending order. The users have to be warned 
> about this limitation, at the very least.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3337) Invalid Argument arrays in SQLServerManagerImportTest

2018-06-22 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520445#comment-16520445
 ] 

Fero Szabo commented on SQOOP-3337:
---

The fix in SQOOP-3334 will help solve this issue easily.

> Invalid Argument arrays in SQLServerManagerImportTest 
> --
>
> Key: SQOOP-3337
> URL: https://issues.apache.org/jira/browse/SQOOP-3337
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>
> The argument array builder is initialized only once per test configuration, 
> so the 5 tests reuse the same one. Each test case adds its own tool 
> option, meaning that starting from the second case, an invalid array is 
> generated. For example, the last case contains the extra tool options from 
> all of the test cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3337) Invalid Argument arrays in SQLServerManagerImportTest

2018-06-22 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3337:
-

 Summary: Invalid Argument arrays in SQLServerManagerImportTest 
 Key: SQOOP-3337
 URL: https://issues.apache.org/jira/browse/SQOOP-3337
 Project: Sqoop
  Issue Type: Bug
Reporter: Fero Szabo
Assignee: Fero Szabo


The argument array builder is initialized only once per test configuration, so 
the 5 tests reuse the same one. Each test case adds its own tool option, 
meaning that starting from the second case, an invalid array is generated. For 
example, the last case contains the extra tool options from all of the test 
cases.
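
A sketch of the shape of the fix (names illustrative): build each case's argument array from the shared base instead of mutating a single builder.
{code}
import java.util.ArrayList;
import java.util.List;

public class PerTestArgsSketch {
  // Sketch: each test case copies the shared base arguments and appends
  // only its own tool option, so nothing leaks into later cases.
  static String[] argsWithToolOption(List<String> baseArgs, String toolOption) {
    List<String> args = new ArrayList<>(baseArgs);
    args.add(toolOption);
    return args.toArray(new String[0]);
  }
}
{code}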



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3333) Change default behavior of the MS SQL connector to non-resilient.

2018-06-20 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3333:
--
Attachment: SQOOP-3333-11.patch

> Change default behavior of the MS SQL connector to non-resilient.
> -
>
> Key: SQOOP-3333
> URL: https://issues.apache.org/jira/browse/SQOOP-3333
> Project: Sqoop
>  Issue Type: Task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: SQOOP-3333-11.patch
>
>
> The default behavior of Sqoop is to use a "resilient" retry mechanism in the 
> SQL Server connector. However, this relies on the split-by column being 
> unique and ordered ascending. This can lead to obscure errors (duplicate or 
> missing records in imports / exports) and should be used only when users 
> specifically request it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3334) Improve ArgumentArrayBuilder, so arguments are replaceable

2018-06-18 Thread Fero Szabo (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3334:
--
Summary: Improve ArgumentArrayBuilder, so arguments are replaceable  (was: 
Improve ArgumentArrayBuilder, so it arguments are replaceable)

> Improve ArgumentArrayBuilder, so arguments are replaceable
> --
>
> Key: SQOOP-3334
> URL: https://issues.apache.org/jira/browse/SQOOP-3334
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>
> The current implementation of the ArgumentArrayBuilder allows duplicate 
> options. Instead, we should be able to override options that were already 
> specified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3334) Improve ArgumentArrayBuilder, so it arguments are replaceable

2018-06-15 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3334:
-

 Summary: Improve ArgumentArrayBuilder, so it arguments are 
replaceable
 Key: SQOOP-3334
 URL: https://issues.apache.org/jira/browse/SQOOP-3334
 Project: Sqoop
  Issue Type: Improvement
Reporter: Fero Szabo
Assignee: Fero Szabo


The current implementation of the ArgumentArrayBuilder allows duplicate 
options. Instead, we should be able to override options that were already 
specified.
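
A minimal sketch of the proposed behavior (class and method names are illustrative, not the actual test utility):
{code}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReplaceableArgsSketch {
  // Keying options by name means a later withOption(...) call replaces an
  // earlier one instead of appending a duplicate.
  private final Map<String, String> options = new LinkedHashMap<>();

  public ReplaceableArgsSketch withOption(String name, String value) {
    options.put(name, value); // put() overwrites any earlier value
    return this;
  }

  public String[] build() {
    List<String> args = new ArrayList<>();
    for (Map.Entry<String, String> e : options.entrySet()) {
      args.add("--" + e.getKey());
      args.add(e.getValue());
    }
    return args.toArray(new String[0]);
  }
}
{code}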



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3333) Change default behavior of the MS SQL connector to non-resilient.

2018-06-06 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3333:
-

 Summary: Change default behavior of the MS SQL connector to 
non-resilient.
 Key: SQOOP-3333
 URL: https://issues.apache.org/jira/browse/SQOOP-3333
 Project: Sqoop
  Issue Type: Task
Reporter: Fero Szabo
Assignee: Fero Szabo


The default behavior of Sqoop is to use a "resilient" retry mechanism in the 
SQL Server connector. However, this relies on the split-by column being unique 
and ordered ascending. This can lead to obscure errors (duplicate or missing 
records in imports / exports) and should be used only when users specifically 
request it.
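
To spell out why uniqueness matters, a sketch of the kind of resume predicate the resilient mode rebuilds after a reconnect (names illustrative):
{code}
public class ResumePredicateSketch {
  // Sketch: the import resumes past the last value seen. With a non-unique
  // split-by column, unread rows equal to lastSeen are skipped; resuming
  // with ">=" would instead re-import rows that were already written.
  static String resumePredicate(String splitCol, String lastSeen) {
    return "( " + splitCol + " > '" + lastSeen + "' )";
  }
}
{code}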



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3332) Extend Documentation of --non-resilient flag and add warning message when detected

2018-06-06 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3332:
-

 Summary: Extend Documentation of --non-resilient flag and add 
warning message when detected
 Key: SQOOP-3332
 URL: https://issues.apache.org/jira/browse/SQOOP-3332
 Project: Sqoop
  Issue Type: Task
Reporter: Fero Szabo
Assignee: Fero Szabo


The non-resilient flag can be used to avoid the retry mechanism in the SQL 
Server connector. The documentation only states that it can be used in export; 
however, it can be used in import as well.

Also, the feature itself relies on the implicit assumption that the split-by 
column is unique and sorted in ascending order. The users have to be warned 
about this limitation, at the very least.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (SQOOP-3082) Sqoop import fails after TCP connection reset if split by datetime column

2018-05-17 Thread Fero Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479374#comment-16479374
 ] 

Fero Szabo edited comment on SQOOP-3082 at 5/17/18 5:07 PM:


Hi [~vaifer],

This came up recently again, so I had a look at your patch. I had to rebase it 
to the current version; please find the rebased version attached.
I've tested it manually with an Integer and a Date column in the split-by 
option.

The former ensures that it doesn't alter current behavior, the latter checks 
that the fix actually works. [^SQOOP-3082-1.patch]

*I can confirm that the current behavior of Sqoop is not altered and the patch 
fixes the issue.*

I also checked the relevant parts of the documentation of SQL Server (1, 2) and 
found that the data type precedence will ensure the correct behavior of Sqoop. 
For example, if the lastRecordValue field contains a number, it will be 
"encoded" as a String because of the apostrophes in the resulting statement; 
however, since the column's type is still INT, the INT will take precedence and 
the criteria will be evaluated correctly.
{quote}When an operator combines two expressions of different data types, the 
rules for data type precedence specify that the data type with the lower 
precedence is converted to the data type with the higher precedence. If the 
conversion is not a supported implicit conversion, an error is returned. When 
both operand expressions have the same data type, the result of the operation 
has that data type.
{quote}
(1) SQL Server 2000: 
[https://www.microsoft.com/en-us/download/details.aspx?id=51958], 
 (2) current documentation: 
[https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-2017]

I believe we should get this committed now, since it adds real value for 
Sqoop users, even without tests.

Since testing a connection reset is not a trivial issue, I've opened 
SQOOP-3325 to track the implementation of the tests.

 

 


was (Author: fero):
Hi [~vaifer],

This came up recently again, so I had a look at your patch. I had to rebase it 
to the current version, please find this version attached.
 I've tested it manually with an Integer and a Date column in the split-by 
option.

The former to ensure that it doesn't alter current behavior, the latter to 
check if the fix actually works. [^SQOOP-3082-1.patch]

*I can confirm that the current behavior of Sqoop is not altered and the patch 
fixes the issue.*

I also checked the relevant parts of the documentation of SQL Server (1, 2) and 
found that the data type precedence will ensure the correct behavior of Sqoop. 
For example, if the lastRecordValue field contains a number, it will be 
"encoded" as a String because of the apostrophes in the resulting statement, 
however, since the column's type is still INT, the INT will take precedence and 
the criteria will be evaluated correctly.
{quote}When an operator combines two expressions of different data types, the 
rules for data type precedence specify that the data type with the lower 
precedence is converted to the data type with the higher precedence. If the 
conversion is not a supported implicit conversion, an error is returned. When 
both operand expressions have the same data type, the result of the operation 
has that data type.
{quote}
(1) SQL Server 2000: 
[https://www.microsoft.com/en-us/download/details.aspx?id=51958], 
 (2) current documentation: 
[https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-2017)

I believe we should get this committed now, since it adds a real value for 
sqoop users, even without tests.

Since testing a connection reset is not a trivial issue, I've opened 
SQOOP-3325, to track the implementation of the tests.

 

 

> Sqoop import fails after TCP connection reset if split by datetime column
> -
>
> Key: SQOOP-3082
> URL: https://issues.apache.org/jira/browse/SQOOP-3082
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
>Reporter: Sergey Svynarchuk
>Priority: Major
> Attachments: SQOOP-3082-1.patch, SQOOP-3082.patch
>
>
> If the sqoop-to-mssqlserver connection resets, the whole command fails with 
> "Connection reset with com.microsoft.sqlserver.jdbc.SQLServerException: 
> Incorrect syntax near '00'". On reestablishing the connection, Sqoop tries to 
> resume the import from the last record that was successfully read:
> {code}
> 2016-12-10 15:18:54,523 INFO [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: select * from 
> test.dbo.test1 WITH (nolock) where Date >= '2015-01-10' and Date <= 
> '2016-11-24' and ( Date > 2015-09-18 00:00:00.0 ) AND ( Date < '2015-09-23 
> 11:48:00.0' ) 
> {code}
> Not quoted 2015-09-18 00:00:00.0 
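
The patch essentially amounts to quoting the boundary value when this resume predicate is rebuilt; a sketch of the idea (illustrative names, not the patch itself):
{code}
public class ResumeQuoteSketch {
  // Sketch: render the last-record value as a SQL string literal, so a
  // datetime boundary becomes ( Date > '2015-09-18 00:00:00.0' ) instead
  // of the unquoted form the parser rejects. Per the data type precedence
  // rules quoted in the comment above, quoting is safe for INT columns too.
  static String resumeCondition(String splitColumn, Object lastRecordValue) {
    return "( " + splitColumn + " > '" + lastRecordValue + "' )";
  }
}
{code}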

[jira] [Comment Edited] (SQOOP-3082) Sqoop import fails after TCP connection reset if split by datetime column

2018-05-17 Thread Fero Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479374#comment-16479374
 ] 

Fero Szabo edited comment on SQOOP-3082 at 5/17/18 5:06 PM:


Hi [~vaifer],

This came up recently again, so I had a look at your patch. I had to rebase it 
to the current version, please find this version attached.
 I've tested it manually with an Integer and a Date column in the split-by 
option.

The former to ensure that it doesn't alter current behavior, the latter to 
check if the fix actually works. [^SQOOP-3082-1.patch]

*I can confirm that the current behavior of Sqoop is not altered and the patch 
fixes the issue.*

I also checked the relevant parts of the documentation of SQL Server (1, 2) and 
found that the data type precedence will ensure the correct behavior of Sqoop. 
For example, if the lastRecordValue field contains a number, it will be 
"encoded" as a String because of the apostrophes in the resulting statement, 
however, since the column's type is still INT, the INT will take precedence and 
the criteria will be evaluated correctly.
{quote}When an operator combines two expressions of different data types, the 
rules for data type precedence specify that the data type with the lower 
precedence is converted to the data type with the higher precedence. If the 
conversion is not a supported implicit conversion, an error is returned. When 
both operand expressions have the same data type, the result of the operation 
has that data type.
{quote}
(1) SQL Server 2000: 
[https://www.microsoft.com/en-us/download/details.aspx?id=51958], 
 (2) current documentation: 
[https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-2017)

I believe we should get this committed now, since it adds real value for 
Sqoop users, even without tests.

Since testing a connection reset is not a trivial issue, I've opened 
SQOOP-3325 to track the implementation of the tests.

 

 


was (Author: fero):
Hi [~vaifer],

This came up recently again, so I had a look at your patch. I had to rebase it 
onto the current version; please find the rebased patch attached.
I've tested it manually with an Integer and a Date column in the split-by 
option.

The former ensures that it doesn't alter current behavior; the latter checks 
that the fix actually works. [^SQOOP-3082-1.patch]

*I can confirm that the current behavior of Sqoop is not altered and the patch 
fixes the issue.*

I also checked the relevant parts of the SQL Server documentation (1, 2) and 
found that data type precedence will ensure the correct behavior of Sqoop. For 
example, if the lastRecordValue field contains a number, it will be "encoded" 
as a String because of the apostrophes in the resulting statement; however, 
since the column's type is still INT, the INT will take precedence and the 
criteria will be evaluated correctly.
{quote}When an operator combines two expressions of different data types, the 
rules for data type precedence specify that the data type with the lower 
precedence is converted to the data type with the higher precedence. If the 
conversion is not a supported implicit conversion, an error is returned. When 
both operand expressions have the same data type, the result of the operation 
has that data type.
{quote}
(1) SQL Server 2000: 
[https://www.microsoft.com/en-us/download/details.aspx?id=51958], 
(2) current documentation: 
[https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-2017]

I believe we should get this committed now, since it adds real value for 
Sqoop users, even without tests.

Since testing a connection reset is not a trivial issue, I've opened 
SQOOP-3325 to track the implementation of the tests.

 

 

> Sqoop import fails after TCP connection reset if split by datetime column
> -
>
> Key: SQOOP-3082
> URL: https://issues.apache.org/jira/browse/SQOOP-3082
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
>Reporter: Sergey Svynarchuk
>Priority: Major
> Attachments: SQOOP-3082-1.patch, SQOOP-3082.patch
>
>
> If the sqoop-to-mssqlserver connection is reset, the whole command fails with 
> "Connection reset with com.microsoft.sqlserver.jdbc.SQLServerException: 
> Incorrect syntax near '00'". On reestablishing the connection, Sqoop tries to 
> resume the import from the last record that was successfully read by:
> {code}
> 2016-12-10 15:18:54,523 INFO [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: select * from 
> test.dbo.test1 WITH (nolock) where Date >= '2015-01-10' and Date <= 
> '2016-11-24' and ( Date > 2015-09-18 00:00:00.0 ) AND ( Date < '2015-09-23 
> 11:48:00.0' ) 
> {code}
> Not quoted 2015-09-18 00:00:00.0 in SQL.

[jira] [Comment Edited] (SQOOP-3082) Sqoop import fails after TCP connection reset if split by datetime column

2018-05-17 Thread Fero Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479374#comment-16479374
 ] 

Fero Szabo edited comment on SQOOP-3082 at 5/17/18 5:05 PM:


Hi [~vaifer],

This came up recently again, so I had a look at your patch. I had to rebase it 
onto the current version; please find the rebased patch attached.
I've tested it manually with an Integer and a Date column in the split-by 
option.

The former ensures that it doesn't alter current behavior; the latter checks 
that the fix actually works. [^SQOOP-3082-1.patch]

*I can confirm that the current behavior of Sqoop is not altered and the patch 
fixes the issue.*

I also checked the relevant parts of the SQL Server documentation (1, 2) and 
found that data type precedence will ensure the correct behavior of Sqoop. For 
example, if the lastRecordValue field contains a number, it will be "encoded" 
as a String because of the apostrophes in the resulting statement; however, 
since the column's type is still INT, the INT will take precedence and the 
criteria will be evaluated correctly.
{quote}When an operator combines two expressions of different data types, the 
rules for data type precedence specify that the data type with the lower 
precedence is converted to the data type with the higher precedence. If the 
conversion is not a supported implicit conversion, an error is returned. When 
both operand expressions have the same data type, the result of the operation 
has that data type.
{quote}
(1) SQL Server 2000: 
[https://www.microsoft.com/en-us/download/details.aspx?id=51958], 
(2) current documentation: 
[https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-2017]

I believe we should get this committed now, since it adds real value for 
Sqoop users, even without tests.

Since testing a connection reset is not a trivial issue, I've opened 
SQOOP-3325 to track the implementation of the tests.

 

 


was (Author: fero):
Hi [~vaifer],

This came up recently again, so I had a look at your patch. I had to rebase it 
onto the current version; please find the rebased patch attached.
I've tested it manually with an Integer and a Date column in the split-by 
option.

The former ensures that it doesn't alter current behavior; the latter checks 
that the fix actually works. [^SQOOP-3082-1.patch]

*I can confirm that the current behavior of Sqoop is not altered and the patch 
fixes the issue.*

I also checked the relevant parts of the SQL Server documentation (1, 2) and 
found that data type precedence will ensure the correct behavior of Sqoop. For 
example, if the lastRecordValue field contains a number, it will be "encoded" 
as a String because of the apostrophes in the resulting statement; however, 
since the column's type is still INT, the INT will take precedence and the 
criteria will be evaluated correctly.
{quote}When an operator combines two expressions of different data types, the 
rules for data type precedence specify that the data type with the lower 
precedence is converted to the data type with the higher precedence. If the 
conversion is not a supported implicit conversion, an error is returned. When 
both operand expressions have the same data type, the result of the operation 
has that data type.
{quote}
(1) SQL Server 2000: 
[https://www.microsoft.com/en-us/download/details.aspx?id=51958], 
(2) current documentation: 
[https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-2017]

I believe we should get this committed now, since it adds real value for 
Sqoop users, even without tests.

Since testing a connection reset is not a trivial issue, I've opened 
SQOOP-3325 to track the implementation of the tests.






 

> Sqoop import fails after TCP connection reset if split by datetime column
> -
>
> Key: SQOOP-3082
> URL: https://issues.apache.org/jira/browse/SQOOP-3082
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
>Reporter: Sergey Svynarchuk
>Priority: Major
> Attachments: SQOOP-3082-1.patch, SQOOP-3082.patch
>
>
> If the sqoop-to-mssqlserver connection is reset, the whole command fails with 
> "Connection reset with com.microsoft.sqlserver.jdbc.SQLServerException: 
> Incorrect syntax near '00'". On reestablishing the connection, Sqoop tries to 
> resume the import from the last record that was successfully read by:
> {code}
> 2016-12-10 15:18:54,523 INFO [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: select * from 
> test.dbo.test1 WITH (nolock) where Date >= '2015-01-10' and Date <= 
> '2016-11-24' and ( Date > 2015-09-18 00:00:00.0 ) AND ( Date < '2015-09-23 
> 11:48:00.0' ) 
> {code}
> Not quoted 2015-09-18 00:00:00.0 in SQL.

[jira] [Updated] (SQOOP-3325) Create automated tests for TCP connection reset error handling logic.

2018-05-17 Thread Fero Szabo (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3325:
--
Affects Version/s: 1.4.7

> Create automated tests for TCP connection reset error handling logic.
> -
>
> Key: SQOOP-3325
> URL: https://issues.apache.org/jira/browse/SQOOP-3325
> Project: Sqoop
>  Issue Type: Task
>Affects Versions: 1.4.7
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>
> SQOOP-3082 is not covered by any automated test; we address this gap in our 
> coverage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3082) Sqoop import fails after TCP connection reset if split by datetime column

2018-05-17 Thread Fero Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479374#comment-16479374
 ] 

Fero Szabo commented on SQOOP-3082:
---

Hi [~vaifer],

This came up recently again, so I had a look at your patch. I had to rebase it 
onto the current version; please find the rebased patch attached.
I've tested it manually with an Integer and a Date column in the split-by 
option.

The former ensures that it doesn't alter current behavior; the latter checks 
that the fix actually works. [^SQOOP-3082-1.patch]

*I can confirm that the current behavior of Sqoop is not altered and the patch 
fixes the issue.*

I also checked the relevant parts of the SQL Server documentation (1, 2) and 
found that data type precedence will ensure the correct behavior of Sqoop. For 
example, if the lastRecordValue field contains a number, it will be "encoded" 
as a String because of the apostrophes in the resulting statement; however, 
since the column's type is still INT, the INT will take precedence and the 
criteria will be evaluated correctly.
{quote}When an operator combines two expressions of different data types, the 
rules for data type precedence specify that the data type with the lower 
precedence is converted to the data type with the higher precedence. If the 
conversion is not a supported implicit conversion, an error is returned. When 
both operand expressions have the same data type, the result of the operation 
has that data type.
{quote}
(1) SQL Server 2000: 
[https://www.microsoft.com/en-us/download/details.aspx?id=51958], 
(2) current documentation: 
[https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-2017]

I believe we should get this committed now, since it adds real value for 
Sqoop users, even without tests.

Since testing a connection reset is not a trivial issue, I've opened 
SQOOP-3325 to track the implementation of the tests.






 

> Sqoop import fails after TCP connection reset if split by datetime column
> -
>
> Key: SQOOP-3082
> URL: https://issues.apache.org/jira/browse/SQOOP-3082
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
>Reporter: Sergey Svynarchuk
>Priority: Major
> Attachments: SQOOP-3082-1.patch, SQOOP-3082.patch
>
>
> If the sqoop-to-mssqlserver connection is reset, the whole command fails with 
> "Connection reset with com.microsoft.sqlserver.jdbc.SQLServerException: 
> Incorrect syntax near '00'". On reestablishing the connection, Sqoop tries to 
> resume the import from the last record that was successfully read by:
> {code}
> 2016-12-10 15:18:54,523 INFO [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: select * from 
> test.dbo.test1 WITH (nolock) where Date >= '2015-01-10' and Date <= 
> '2016-11-24' and ( Date > 2015-09-18 00:00:00.0 ) AND ( Date < '2015-09-23 
> 11:48:00.0' ) 
> {code}
> Not quoted 2015-09-18 00:00:00.0 in SQL.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3325) Create automated tests for TCP connection reset error handling logic.

2018-05-17 Thread Fero Szabo (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3325:
--
Description: SQOOP-3082 is not covered by any automated test; we address 
this gap in our coverage.

> Create automated tests for TCP connection reset error handling logic.
> -
>
> Key: SQOOP-3325
> URL: https://issues.apache.org/jira/browse/SQOOP-3325
> Project: Sqoop
>  Issue Type: Task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>
> SQOOP-3082 is not covered by any automated test; we address this gap in our 
> coverage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3325) Create automated tests for TCP connection reset error handling logic.

2018-05-17 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3325:
-

 Summary: Create automated tests for TCP connection reset error 
handling logic.
 Key: SQOOP-3325
 URL: https://issues.apache.org/jira/browse/SQOOP-3325
 Project: Sqoop
  Issue Type: Task
Reporter: Fero Szabo
Assignee: Fero Szabo






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3082) Sqoop import fails after TCP connection reset if split by datetime column

2018-05-17 Thread Fero Szabo (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3082:
--
Attachment: SQOOP-3082-1.patch

> Sqoop import fails after TCP connection reset if split by datetime column
> -
>
> Key: SQOOP-3082
> URL: https://issues.apache.org/jira/browse/SQOOP-3082
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
>Reporter: Sergey Svynarchuk
>Priority: Major
> Attachments: SQOOP-3082-1.patch, SQOOP-3082.patch
>
>
> If the sqoop-to-mssqlserver connection is reset, the whole command fails with 
> "Connection reset with com.microsoft.sqlserver.jdbc.SQLServerException: 
> Incorrect syntax near '00'". On reestablishing the connection, Sqoop tries to 
> resume the import from the last record that was successfully read by:
> {code}
> 2016-12-10 15:18:54,523 INFO [main] 
> org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: select * from 
> test.dbo.test1 WITH (nolock) where Date >= '2015-01-10' and Date <= 
> '2016-11-24' and ( Date > 2015-09-18 00:00:00.0 ) AND ( Date < '2015-09-23 
> 11:48:00.0' ) 
> {code}
> Not quoted 2015-09-18 00:00:00.0 in SQL.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3324) Document SQOOP-816: Sqoop add support for external Hive tables

2018-05-11 Thread Fero Szabo (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3324:
--
Issue Type: Sub-task  (was: Task)
Parent: SQOOP-3292

> Document SQOOP-816: Sqoop add support for external Hive tables
> --
>
> Key: SQOOP-3324
> URL: https://issues.apache.org/jira/browse/SQOOP-3324
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3324) Document SQOOP-816: Sqoop add support for external Hive tables

2018-05-11 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3324:
-

 Summary: Document SQOOP-816: Sqoop add support for external Hive 
tables
 Key: SQOOP-3324
 URL: https://issues.apache.org/jira/browse/SQOOP-3324
 Project: Sqoop
  Issue Type: Task
Reporter: Fero Szabo
Assignee: Fero Szabo






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3321) TestHiveImport is failing on Jenkins

2018-05-10 Thread Fero Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16470298#comment-16470298
 ] 

Fero Szabo commented on SQOOP-3321:
---

Hi [~dvoros],

Szabolcs is unavailable at the moment, but having discussed this with 
[~BoglarkaEgyed], we think we should go ahead with your suggestion. Would you 
like to implement a patch for it?

If you don't have the time, I'm happy to pick this up.

> TestHiveImport is failing on Jenkins
> 
>
> Key: SQOOP-3321
> URL: https://issues.apache.org/jira/browse/SQOOP-3321
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.7
>Reporter: Boglarka Egyed
>Priority: Major
> Attachments: TEST-org.apache.sqoop.hive.TestHiveImport.txt
>
>
> org.apache.sqoop.hive.TestHiveImport is failing since 
> [SQOOP-3318|https://reviews.apache.org/r/66761/bugs/SQOOP-3318/] has been 
> committed. This test seems to fail only in the Jenkins environment, as it 
> passes on several local machines. There may be some difference in the 
> filesystem causing this issue; it shall be investigated. I am attaching the 
> log from a failed run.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3301) Document SQOOP-3216 - metastore related change

2018-03-21 Thread Fero Szabo (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3301:
--
Summary: Document SQOOP-3216 - metastore related change  (was: Document 
SQOOP-3216)

> Document SQOOP-3216 - metastore related change
> --
>
> Key: SQOOP-3301
> URL: https://issues.apache.org/jira/browse/SQOOP-3301
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3301) Document SQOOP-3216

2018-03-21 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3301:
-

 Summary: Document SQOOP-3216
 Key: SQOOP-3301
 URL: https://issues.apache.org/jira/browse/SQOOP-3301
 Project: Sqoop
  Issue Type: Task
Reporter: Fero Szabo
Assignee: Fero Szabo






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3301) Document SQOOP-3216

2018-03-21 Thread Fero Szabo (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3301:
--
Issue Type: Sub-task  (was: Task)
Parent: SQOOP-3292

> Document SQOOP-3216
> ---
>
> Key: SQOOP-3301
> URL: https://issues.apache.org/jira/browse/SQOOP-3301
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3297) Unrecognized argument: --schema for sqoop job execution

2018-03-19 Thread Fero Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16404859#comment-16404859
 ] 

Fero Szabo commented on SQOOP-3297:
---

[~bbusi],

I don't exactly know what went south here, but recreating the job might help.
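
When recreating it, note that connector-specific arguments such as --schema 
normally have to come after a separate "--" at the end of the command, 
otherwise the argument parser rejects them. A sketch (the job name, connect 
string and schema are placeholders, not taken from your setup):
{noformat}
sqoop job --delete X
sqoop job --create X -- import \
  --connect jdbc:postgresql://dbhost/testdb \
  --username sqoop -P \
  --table test1 \
  -- --schema public
{noformat}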

If that fails, or isn't an option, folks at Cloudera support should be able to 
help you out (I noticed you are using CDH).

> Unrecognized argument: --schema for sqoop job execution
> ---
>
> Key: SQOOP-3297
> URL: https://issues.apache.org/jira/browse/SQOOP-3297
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Bhanu Busi
>Priority: Minor
>
> Hi,
>  
> Previously I used CDH-5.10 with Sqoop 1.4.6; for this combination the 
> --schema argument worked fine.
> Now I have upgraded Cloudera Manager to CDH-5.13.1 with Sqoop 1.4.6, and the 
> --schema argument shows the error below.
> While executing the sqoop job, I faced the following issue.
> {noformat}
> sqoop job --exec X
> (256, 'Warning: 
> /cloudera/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo
>  does not exist! Accumulo imports will fail.\nPlease set $ACCUMULO_HOME to 
> the root of your Accumulo installation.\n18/03/19 05:05:40 INFO sqoop.Sqoop: 
> Running Sqoop version: 1.4.6-cdh5.13.1\n18/03/19 05:05:41 ERROR 
> tool.BaseSqoopTool: Error parsing arguments for import:\n18/03/19 05:05:41 
> ERROR tool.BaseSqoopTool: Unrecognized argument: --schema\n18/03/19 05:05:41 
> ERROR tool.BaseSqoopTool: Unrecognized argument: public\n\nTry --help for 
> usage instructions.'){noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SQOOP-3297) Unrecognized argument: --schema for sqoop job execution

2018-03-19 Thread Fero Szabo (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16404758#comment-16404758
 ] 

Fero Szabo commented on SQOOP-3297:
---

Hi [~bbusi],

Could you paste the output to the following command?
{noformat}
sqoop job --show myjob{noformat}
Thanks!

Fero

> Unrecognized argument: --schema for sqoop job execution
> ---
>
> Key: SQOOP-3297
> URL: https://issues.apache.org/jira/browse/SQOOP-3297
> Project: Sqoop
>  Issue Type: Bug
>Reporter: Bhanu Busi
>Priority: Minor
>
> Hi,
>  
> Previously I used CDH-5.10 with Sqoop 1.4.6; for this combination the 
> --schema argument worked fine.
> Now I have upgraded Cloudera Manager to CDH-5.13.1 with Sqoop 1.4.6, and the 
> --schema argument shows the error below.
> While executing the sqoop job, I faced the following issue.
> {noformat}
> sqoop job --exec X
> (256, 'Warning: 
> /cloudera/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo
>  does not exist! Accumulo imports will fail.\nPlease set $ACCUMULO_HOME to 
> the root of your Accumulo installation.\n18/03/19 05:05:40 INFO sqoop.Sqoop: 
> Running Sqoop version: 1.4.6-cdh5.13.1\n18/03/19 05:05:41 ERROR 
> tool.BaseSqoopTool: Error parsing arguments for import:\n18/03/19 05:05:41 
> ERROR tool.BaseSqoopTool: Unrecognized argument: --schema\n18/03/19 05:05:41 
> ERROR tool.BaseSqoopTool: Unrecognized argument: public\n\nTry --help for 
> usage instructions.'){noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (SQOOP-3293) Document SQOOP-2976

2018-03-14 Thread Fero Szabo (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo updated SQOOP-3293:
--
Attachment: SQOOP-3292.1.patch

> Document SQOOP-2976
> ---
>
> Key: SQOOP-3293
> URL: https://issues.apache.org/jira/browse/SQOOP-3293
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
> Attachments: SQOOP-3292.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3293) Document SQOOP-2976

2018-03-14 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3293:
-

 Summary: Document SQOOP-2976
 Key: SQOOP-3293
 URL: https://issues.apache.org/jira/browse/SQOOP-3293
 Project: Sqoop
  Issue Type: Sub-task
Reporter: Fero Szabo






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (SQOOP-3293) Document SQOOP-2976

2018-03-14 Thread Fero Szabo (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo reassigned SQOOP-3293:
-

Assignee: Fero Szabo

> Document SQOOP-2976
> ---
>
> Key: SQOOP-3293
> URL: https://issues.apache.org/jira/browse/SQOOP-3293
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Fero Szabo
>Assignee: Fero Szabo
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (SQOOP-3292) Document new features in Sqoop 1.4.7

2018-03-14 Thread Fero Szabo (JIRA)
Fero Szabo created SQOOP-3292:
-

 Summary: Document new features in Sqoop 1.4.7
 Key: SQOOP-3292
 URL: https://issues.apache.org/jira/browse/SQOOP-3292
 Project: Sqoop
  Issue Type: Task
Affects Versions: 1.4.7
Reporter: Fero Szabo
Assignee: Fero Szabo


Documentation is missing for at least some of the new features in the recent 
release. This Jira is about adding it to the project.

Every feature will have its own sub-task, so we'll be able to track their 
documentation separately.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (SQOOP-2567) SQOOP import for Oracle fails with invalid precision/scale for decimal

2018-03-09 Thread Fero Szabo (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fero Szabo reassigned SQOOP-2567:
-

Assignee: Fero Szabo

> SQOOP import for Oracle fails with invalid precision/scale for decimal
> --
>
> Key: SQOOP-2567
> URL: https://issues.apache.org/jira/browse/SQOOP-2567
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors
>Affects Versions: 1.4.5
> Environment: CDH5.3
>Reporter: Suresh Deoda
>Assignee: Fero Szabo
>Priority: Major
>  Labels: AVRO, ORACLE
>
> Sqoop import fails when creating an Avro data file from an Oracle source with 
> decimal data. If a column in the Oracle table is defined as, say,
> Col1 Decimal(12,11), but some of the data has fewer digits in the scale, the 
> import fails with the following error:
> Error: org.apache.avro.file.DataFileWriter$AppendWriteException: 
> org.apache.avro.AvroTypeException: Cannot encode decimal with scale 10 as 
> scale 11
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:296)
> at 
> org.apache.sqoop.mapreduce.AvroOutputFormat$1.write(AvroOutputFormat.java:112)
> at 
> org.apache.sqoop.mapreduce.AvroOutputFormat$1.write(AvroOutputFormat.java:108)
> at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:655)
> at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
> at 
> org.apache.sqoop.mapreduce.AvroImportMapper.map(AvroImportMapper.java:73)
> at 
> org.apache.sqoop.mapreduce.AvroImportMapper.map(AvroImportMapper.java:39)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: org.apache.avro.AvroTypeException: Cannot encode decimal with 
> scale 10 as scale 11
> at 
> org.apache.avro.Conversions$DecimalConversion.toBytes(Conversions.java:68)
> at 
> org.apache.avro.Conversions$DecimalConversion.toBytes(Conversions.java:39)
> at 
> org.apache.avro.generic.GenericDatumWriter.convert(GenericDatumWriter.java:90)
> at 
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:70)
> at 
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:143)
> at 
> org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:112)
> at 
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
> at 
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:143)
> at 
> org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:153)
> at 
> org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:175)
> Also, when no precision is defined in Oracle (which defaults to (38,0), I 
> guess), it gives this error:
> ERROR tool.ImportTool: Imported Failed: Invalid decimal precision: 0 (must be 
> positive)
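
As a footnote, SQOOP-2976 adds an opt-in padding flag for exactly this kind of 
scale mismatch. A sketch of how it might be enabled on the command line; the 
property name is quoted from memory and should be verified against the 
SQOOP-2976 patch:
{noformat}
sqoop import -Dsqoop.avro.decimal_padding.enable=true \
  --connect jdbc:oracle:thin:@dbhost:1521/ORCL \
  --username sqoop -P \
  --table DECIMAL_TABLE \
  --as-avrodatafile \
  --target-dir /tmp/decimal_table
{noformat}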



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)