Boris Tyukin created SQOOP-3185:
-----------------------------------

             Summary: Oracle clustered tables not supported by oraoop (direct 
Oracle mode)
                 Key: SQOOP-3185
                 URL: https://issues.apache.org/jira/browse/SQOOP-3185
             Project: Sqoop
          Issue Type: Bug
          Components: connectors/oracle
    Affects Versions: 1.4.6
         Environment: CDH 5.9
            Reporter: Boris Tyukin


Oracle clustered tables not supported by oraoop (direct Oracle mode)
------------------------------------------------------------------------

More about clustered tables here 
https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_5001.htm)

While ingesting data using sqoop's direct mode, we noticed that it does not 
support these tables and would error out with error, stating that there are no 
records in the source table.

The issue is happening because sqoop is unable to determine database extents 
for such tables and hence chunks of data for splits in 
src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java

This code would generate SQL similar to below and dba_extents table would not 
have segment_name named as object name. 

For example in our case, ENCOUNTER table is clustered and dba_extents 
segment_name would differ from object_name. 

This query below would not return any data:

SELECT * FROM dba_objects o, dba_extents e
WHERE                o.owner               = 'V500'
                              AND o.object_name     = 'ENCOUNTER'
                              AND e.owner = 'V500'
                              AND e.segment_name = 'ENCOUNTER'
                              AND (o.subobject_name = e.partition_name
                              OR (o.subobject_name IS NULL
                              AND e.partition_name IS NULL))
;                             

but this query will (note how Oracle DB named object_name and segment_name):

Here is a modified query that returns data:
SELECT * FROM dba_objects o, dba_extents e
WHERE                o.owner               = 'V500'
                              AND o.object_name     = 'CLU_ENCOUNTER0077'
                              AND e.owner = 'V500'
                              AND e.segment_name = 'CLU_ENCOUNTER0077'
                              AND (o.subobject_name = e.partition_name
                              OR (o.subobject_name IS NULL
                              AND e.partition_name IS NULL))
;                             



sqoop import --direct --connect ....

--- cut from log ---

17/05/01 09:35:09 DEBUG oracle.OraOopOracleQueries: getOracleDataChunksExtent() 
SQL Query =
SELECT data_object_id, file_id, relative_fno, file_batch, MIN (start_block_id) 
start_block_id, MAX (end_block_id) end_block_id, SUM (blocks) blocks FROM 
(SELECT o.data_object_id, e.file_id, e.relative_fno, e.block_id start_block_id, 
e.block_id + e.blocks - 1 end_block_id, e.blocks, CEIL (    SUM (       
e.blocks)    OVER (PARTITION BY o.data_object_id, e.file_id          ORDER BY 
e.block_id ASC)    / (SUM (e.blocks)          OVER (PARTITION BY 
o.data_object_id, e.file_id)       / 17))    file_batch FROM dba_extents e, 
dba_objects o, dba_tab_subpartitions tsp WHERE     o.owner = V500 AND 
o.object_name = ENCOUNTER AND e.owner = V500 AND e.segment_name = ENCOUNTER AND 
o.owner = e.owner AND o.object_name = e.segment_name AND (o.subobject_name = 
e.partition_name      OR (o.subobject_name IS NULL AND e.partition_name IS 
NULL)) AND o.owner = tsp.table_owner(+) AND o.object_name = tsp.table_name(+) 
AND o.subobject_name = tsp.subpartition_name(+) ) GROUP BY data_object_id,      
    file_id,          relative_fno,          file_batch ORDER BY 
data_object_id,          file_id,          relative_fno,          file_batch
17/05/01 09:35:09 FATAL oracle.OraOopDataDrivenDBInputFormat: The table 
ENCOUNTER does not contain any data.
17/05/01 09:35:09 INFO mapreduce.JobSubmitter: Cleaning up the staging area 
/user/oracle/.staging/job_1493213922849_0970
17/05/01 09:35:09 DEBUG util.ClassLoaderStack: Restoring classloader: 
sun.misc.Launcher$AppClassLoader@2f7a2457
17/05/01 09:35:09 ERROR sqoop.Sqoop: Got exception running Sqoop: 
java.lang.RuntimeException: The table ENCOUNTER does not contain any data.
java.lang.RuntimeException: The table ENCOUNTER does not contain any data.
        at 
org.apache.sqoop.manager.oracle.OraOopDataDrivenDBInputFormat.getSplits(OraOopDataDrivenDBInputFormat.java:108)
        at 
org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:305)
        at 
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:322)
        at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1325)
        at 
org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:203)
        at 
org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:176)
        at 
org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:273)
        at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
        at 
org.apache.sqoop.manager.oracle.OraOopConnManager.importTable(OraOopConnManager.java:284)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:507)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to