[
https://issues.apache.org/jira/browse/SQOOP-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958168#comment-13958168
]
Venkat Ranganathan edited comment on SQOOP-1302 at 4/2/14 9:28 PM:
-------------------------------------------------------------------
I will add a warning to explain to the users that this is not the right use of
ROWNUM in Oracle.
> Doesn't run the mapper for remaining splits when split-by is ROWNUM
> -------------------------------------------------------------------
>
> Key: SQOOP-1302
> URL: https://issues.apache.org/jira/browse/SQOOP-1302
> Project: Sqoop
> Issue Type: Bug
> Components: build
> Affects Versions: 1.4.3
> Environment: CDH 4.6.0-1
> Reporter: praveen m
> Assignee: Venkat Ranganathan
> Priority: Minor
> Labels: sqoop
> Fix For: 1.4.5
>
> Original Estimate: 396h
> Remaining Estimate: 396h
>
> When importing a table from an Oracle database into HDFS with Sqoop, I am
> using ROWNUM as the split-by column, since the table doesn't have a primary
> key. With the default 4 mappers I get exactly 1/4 of the data; the remaining
> 3/4 is not imported. The table is split correctly into 4 pieces, but only
> one mapper produces output.
> E.g.: the table has 28 records in total, but only 8 records are ingested
> into HDFS.
> HDFS OUTPUT:
> part-m-00000 544 bytes
> part-m-00001 0 bytes
> part-m-00002 0 bytes
> part-m-00003 0 bytes
> LOG:
> Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
> Please set $HCAT_HOME to the root of your HCatalog installation.
> Warning:
> /hdp2014poc/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/bin/../lib/sqoop/../accumulo
> does not exist! Accumulo imports will fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> 14/04/02 12:38:22 INFO sqoop.Sqoop: Running Sqoop version: 1.4.3-cdh4.6.0
> 14/04/02 12:38:22 WARN tool.BaseSqoopTool: Setting your password on the
> command-line is insecure. Consider using -P instead.
> 14/04/02 12:38:22 INFO manager.SqlManager: Using default fetchSize of 1000
> 14/04/02 12:38:22 INFO tool.CodeGenTool: Beginning code generation
> 14/04/02 12:38:23 INFO manager.OracleManager: Time zone has been set to GMT
> 14/04/02 12:38:23 INFO manager.SqlManager: Executing SQL statement: SELECT
> t.* FROM BILL_FEE_CDE_DIM t WHERE 1=0
> 14/04/02 12:38:23 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
> /hdp2014poc/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/bin/../lib/hadoop-0.20-mapreduce
> Note:
> /tmp/sqoop-dmadmin/compile/6a8ac204650278b4235d199bb6059358/BILL_FEE_CDE_DIM.java
> uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 14/04/02 12:38:24 INFO orm.CompilationManager: Writing jar file:
> /tmp/sqoop-dmadmin/compile/6a8ac204650278b4235d199bb6059358/BILL_FEE_CDE_DIM.jar
> 14/04/02 12:38:24 INFO mapreduce.ImportJobBase: Beginning import of
> BILL_FEE_CDE_DIM
> 14/04/02 12:38:25 INFO manager.OracleManager: Time zone has been set to GMT
> 14/04/02 12:38:25 WARN mapred.JobClient: Use GenericOptionsParser for parsing
> the arguments. Applications should implement Tool for the same.
> 14/04/02 12:38:26 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT
> MIN(ROWNUM), MAX(ROWNUM) FROM BILL_FEE_CDE_DIM
> 14/04/02 12:38:26 INFO mapred.JobClient: Running job: job_201403281357_0126
> 14/04/02 12:38:27 INFO mapred.JobClient: map 0% reduce 0%
> 14/04/02 12:38:38 INFO mapred.JobClient: map 50% reduce 0%
> 14/04/02 12:38:39 INFO mapred.JobClient: map 75% reduce 0%
> 14/04/02 12:38:47 INFO mapred.JobClient: map 100% reduce 0%
> 14/04/02 12:38:49 INFO mapred.JobClient: Job complete: job_201403281357_0126
> 14/04/02 12:38:49 INFO mapred.JobClient: Counters: 23
> 14/04/02 12:38:49 INFO mapred.JobClient: File System Counters
> 14/04/02 12:38:49 INFO mapred.JobClient: FILE: Number of bytes read=0
> 14/04/02 12:38:49 INFO mapred.JobClient: FILE: Number of bytes
> written=765188
> 14/04/02 12:38:49 INFO mapred.JobClient: FILE: Number of read operations=0
> 14/04/02 12:38:49 INFO mapred.JobClient: FILE: Number of large read
> operations=0
> 14/04/02 12:38:49 INFO mapred.JobClient: FILE: Number of write
> operations=0
> 14/04/02 12:38:49 INFO mapred.JobClient: HDFS: Number of bytes read=435
> 14/04/02 12:38:49 INFO mapred.JobClient: HDFS: Number of bytes written=544
> 14/04/02 12:38:49 INFO mapred.JobClient: HDFS: Number of read operations=4
> 14/04/02 12:38:49 INFO mapred.JobClient: HDFS: Number of large read
> operations=0
> 14/04/02 12:38:49 INFO mapred.JobClient: HDFS: Number of write
> operations=4
> 14/04/02 12:38:49 INFO mapred.JobClient: Job Counters
> 14/04/02 12:38:49 INFO mapred.JobClient: Launched map tasks=4
> 14/04/02 12:38:49 INFO mapred.JobClient: Total time spent by all maps in
> occupied slots (ms)=34570
> 14/04/02 12:38:49 INFO mapred.JobClient: Total time spent by all reduces
> in occupied slots (ms)=0
> 14/04/02 12:38:49 INFO mapred.JobClient: Total time spent by all maps
> waiting after reserving slots (ms)=0
> 14/04/02 12:38:49 INFO mapred.JobClient: Total time spent by all reduces
> waiting after reserving slots (ms)=0
> 14/04/02 12:38:49 INFO mapred.JobClient: Map-Reduce Framework
> 14/04/02 12:38:49 INFO mapred.JobClient: Map input records=8
> 14/04/02 12:38:49 INFO mapred.JobClient: Map output records=8
> 14/04/02 12:38:49 INFO mapred.JobClient: Input split bytes=435
> 14/04/02 12:38:49 INFO mapred.JobClient: Spilled Records=0
> 14/04/02 12:38:49 INFO mapred.JobClient: CPU time spent (ms)=8030
> 14/04/02 12:38:49 INFO mapred.JobClient: Physical memory (bytes)
> snapshot=1160265728
> 14/04/02 12:38:49 INFO mapred.JobClient: Virtual memory (bytes)
> snapshot=6454796288
> 14/04/02 12:38:49 INFO mapred.JobClient: Total committed heap usage
> (bytes)=3032481792
> 14/04/02 12:38:49 INFO mapreduce.ImportJobBase: Transferred 544 bytes in
> 24.8174 seconds (21.9201 bytes/sec)
> 14/04/02 12:38:49 INFO mapreduce.ImportJobBase: Retrieved 8 records.
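The empty part files above follow from how Oracle evaluates ROWNUM: it is
assigned to a row only as the row passes the WHERE clause, starting at 1 for
every query, so a split predicate such as ROWNUM >= 8 AND ROWNUM <= 14 can
never match -- the first candidate row always receives ROWNUM = 1, fails the
predicate, and the counter never advances. A minimal sketch of this behavior
(the split boundaries below are illustrative, not the exact values Sqoop
generated in this job):

```python
def oracle_rownum_filter(rows, lo, hi):
    """Simulate Oracle's ROWNUM semantics: the counter is assigned as
    rows pass the WHERE clause, so it advances only on matching rows."""
    out, rownum = [], 1
    for row in rows:
        if lo <= rownum <= hi:   # a Sqoop-style split predicate on ROWNUM
            out.append(row)
            rownum += 1          # ROWNUM increments only for kept rows
    return out

rows = list(range(28))           # 28 records, as in the report
# four splits over BoundingValsQuery results MIN(ROWNUM)=1, MAX(ROWNUM)=28
splits = [(1, 7), (8, 14), (15, 21), (22, 28)]
counts = [len(oracle_rownum_filter(rows, lo, hi)) for lo, hi in splits]
print(counts)  # [7, 0, 0, 0] -- only the first split's predicate ever matches
```

This matches the observed behavior: four map tasks launch, but only the split
whose lower bound is 1 retrieves any rows, so part-m-00001 through
part-m-00003 are empty.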
--
This message was sent by Atlassian JIRA
(v6.2#6252)