[ 
https://issues.apache.org/jira/browse/SQOOP-474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kathleen Ting updated SQOOP-474:
--------------------------------

    Description: 
To reproduce this, run an import using a query with the number of mappers set 
to 1 and a split-by specification. For example:
{code}
$ sqoop import --connect jdbc:mysql://localhost/hadoopguide --query 'SELECT 
A.*, B.* FROM A JOIN B ON (A.AID = B.BID) WHERE $CONDITIONS' --split-by AID 
--target-dir /user/kateting/test1 --m=1
{code}

This import will output the following:
{code}
12/04/02 13:29:59 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT 
MIN(AID), MAX(AID) FROM (SELECT A.*, B.* FROM A JOIN B ON (A.AID = B.BID) WHERE 
 (1 = 1) ) AS t1
{code}

This embedded query fails in DB2 when the 'with ur' syntax is used. It also 
fails in Informix if the Informix version does not support embedded queries. 
The issue is the 'with ur' syntax; without it, the boundary query is harmless. 
The boundary query is triggered by the split-by specification. However, 
specifying split-by is redundant when the number of mappers is 1.

  was:
To reproduce this, run an import using a query with number of mappers set to 1 
and a split-by specification. For example:
{code}
$ sqoop import --connect jdbc:mysql://localhost/hadoopguide --query 'SELECT 
A.*, B.* FROM A JOIN B ON (A.AID = B.BID) WHERE $CONDITIONS' --split-by AID 
--target-dir /user/kateting/test1 --m=1
{code}

This import will output the following:
{code}
12/04/02 13:29:59 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT 
MIN(AID), MAX(AID) FROM (SELECT A.*, B.* FROM A JOIN B ON (A.AID = B.BID) WHERE 
 (1 = 1) ) AS t1
{code}

The problem is that the bounding value query construction is being triggered 
because of the --split-by specification. However specifying split-by is 
redundant given that the number of mappers is 1.

    
> Split-by specification incorrectly triggers bounding value query
> ----------------------------------------------------------------
>
>                 Key: SQOOP-474
>                 URL: https://issues.apache.org/jira/browse/SQOOP-474
>             Project: Sqoop
>          Issue Type: Bug
>          Components: build, connectors/generic
>    Affects Versions: 1.4.2-incubating
>            Reporter: Kathleen Ting
>            Assignee: Kathleen Ting
>         Attachments: SQOOP-474-1.patch, SQOOP-474.patch
>
>
> To reproduce this, run an import using a query with the number of mappers 
> set to 1 and a split-by specification. For example:
> {code}
> $ sqoop import --connect jdbc:mysql://localhost/hadoopguide --query 'SELECT 
> A.*, B.* FROM A JOIN B ON (A.AID = B.BID) WHERE $CONDITIONS' --split-by AID 
> --target-dir /user/kateting/test1 --m=1
> {code}
> This import will output the following:
> {code}
> 12/04/02 13:29:59 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT 
> MIN(AID), MAX(AID) FROM (SELECT A.*, B.* FROM A JOIN B ON (A.AID = B.BID) 
> WHERE  (1 = 1) ) AS t1
> {code}
> This embedded query fails in DB2 when the 'with ur' syntax is used. It also 
> fails in Informix if the Informix version does not support embedded 
> queries. The issue is the 'with ur' syntax; without it, the boundary query 
> is harmless. The boundary query is triggered by the split-by specification. 
> However, specifying split-by is redundant when the number of mappers is 1.
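
The behaviour described above can be illustrated with a small Java sketch. 
This is not Sqoop's actual code and not the attached patch: the class and 
method names (BoundingQuerySketch, needsBoundingQuery, boundingValsQuery) are 
hypothetical, and the query template is only inferred from the 
BoundingValsQuery log line in the report. It shows one possible guard: skip 
the bounding value query when there is a single mapper, since a single split 
needs no MIN/MAX bounds.

```java
final class BoundingQuerySketch {

    // Hypothetical guard (an assumption, not Sqoop's API): a bounding value
    // query is only useful when the work is divided among several mappers.
    static boolean needsBoundingQuery(int numMappers, String splitByColumn) {
        if (numMappers <= 1) {
            // One mapper means one split, so the split-by column is unused.
            return false;
        }
        return splitByColumn != null && !splitByColumn.isEmpty();
    }

    // Rebuilds a query shaped like the one in the reported log line:
    // $CONDITIONS is replaced literally by " (1 = 1) " and the free-form
    // query is wrapped as a subquery aliased t1.
    static String boundingValsQuery(String freeFormQuery, String splitByColumn) {
        String inner = freeFormQuery.replace("$CONDITIONS", " (1 = 1) ");
        return "SELECT MIN(" + splitByColumn + "), MAX(" + splitByColumn + ")"
                + " FROM (" + inner + ") AS t1";
    }

    public static void main(String[] args) {
        String query =
                "SELECT A.*, B.* FROM A JOIN B ON (A.AID = B.BID) WHERE $CONDITIONS";
        int numMappers = 1;

        if (needsBoundingQuery(numMappers, "AID")) {
            System.out.println(boundingValsQuery(query, "AID"));
        } else {
            System.out.println("Single mapper: no bounding value query issued");
        }
    }
}
```

With numMappers set to 1 the sketch takes the second branch, so no embedded 
query would be sent to the database; with a larger value it would produce a 
query of the same shape as the one logged in the report.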
