[ 
https://issues.apache.org/jira/browse/MAPREDUCE-716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728411#action_12728411
 ] 

Aaron Kimball commented on MAPREDUCE-716:
-----------------------------------------

Enis,

The DBRecordReader API already contains a {{getSelectQuery()}} method that was 
designed to be overridden by subclasses. So after looking at things more closely, 
I think it makes more sense to extend DBRR than to extend DBIF. DBRecordReader 
was not static, though, so any subclasses would have had to live in 
DBInputFormat.java. I've made DBRecordReader a static class and added 
protected accessor methods for private fields where they make sense.
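
As a rough illustration of that design (a sketch only: {{SketchInputFormat}}, {{SketchRecordReader}} and {{StandardSqlRecordReader}} are hypothetical stand-ins, not the classes in the patch), a static nested reader with protected accessors lets a subclass in another file override the query:

```java
// Simplified stand-ins for illustration; these are not the real Hadoop classes.
class SketchInputFormat {

    /** Static nested reader, so subclasses need no enclosing SketchInputFormat instance. */
    static class SketchRecordReader {
        private final String tableName;
        private final long splitStart;
        private final long splitLength;

        SketchRecordReader(String tableName, long splitStart, long splitLength) {
            this.tableName = tableName;
            this.splitStart = splitStart;
            this.splitLength = splitLength;
        }

        // Protected accessors expose the private fields to subclasses.
        protected String getTableName() { return tableName; }
        protected long getSplitStart() { return splitStart; }
        protected long getSplitLength() { return splitLength; }

        /** Default query: LIMIT/OFFSET, as accepted by MySQL and HSQLDB. */
        protected String getSelectQuery() {
            return "SELECT * FROM " + getTableName()
                + " LIMIT " + getSplitLength()
                + " OFFSET " + getSplitStart();
        }
    }
}

/** A dialect-specific reader that could live in its own source file. */
class StandardSqlRecordReader extends SketchInputFormat.SketchRecordReader {
    StandardSqlRecordReader(String tableName, long splitStart, long splitLength) {
        super(tableName, splitStart, splitLength);
    }

    /** Same split, expressed with the SQL:2008 OFFSET ... FETCH syntax instead. */
    @Override
    protected String getSelectQuery() {
        return "SELECT * FROM " + getTableName()
            + " OFFSET " + getSplitStart() + " ROWS"
            + " FETCH NEXT " + getSplitLength() + " ROWS ONLY";
    }
}
```

Only {{getSelectQuery()}} changes between the two readers; the split bounds and table name are reached through the accessors rather than through the private fields.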

I added 
{{src/java/org/apache/hadoop/mapreduce/lib/db/OracleDBRecordReader.java}} to 
hold the Oracle-specific logic.
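
For readers unfamiliar with the Oracle side, one common way to emulate LIMIT/OFFSET there is to wrap the query and filter on {{ROWNUM}}. The helper below is a minimal sketch of that idiom under my own assumptions ({{OraclePaging}}, {{wrap}} and the {{rn}} alias are hypothetical names); it is not necessarily the exact query the patch generates:

```java
/** Hypothetical helper: wraps a query so Oracle returns one split's worth of rows. */
class OraclePaging {

    /**
     * @param innerQuery the full SELECT, ideally with a deterministic ORDER BY
     * @param start      number of leading rows to skip (0-based offset)
     * @param length     maximum number of rows to return
     */
    static String wrap(String innerQuery, long start, long length) {
        // The inner filter caps the row count; the outer filter drops the skipped rows.
        return "SELECT * FROM (SELECT a.*, ROWNUM rn FROM (" + innerQuery + ") a"
            + " WHERE ROWNUM <= " + (start + length) + ")"
            + " WHERE rn > " + start;
    }
}
```

Note that the wrapped query returns an extra {{rn}} column, which the code reading the result set has to ignore.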

I also added 
{{src/java/org/apache/hadoop/mapreduce/lib/db/MySQLDBRecordReader.java}} to 
hold the MySQL-specific logic, which forces the driver to use unbuffered mode 
for queries and so prevents out-of-memory errors (see MAPREDUCE-685 for a 
related problem in other queries run by Sqoop).
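
For context, MySQL Connector/J documents a convention for unbuffered (row-at-a-time) reads: a forward-only, read-only statement with a fetch size of {{Integer.MIN_VALUE}}. A minimal sketch of that convention, assuming Connector/J is the driver in use ({{MySqlStreaming}} and {{prepareStreaming}} are hypothetical names):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper for opening a row-at-a-time statement on MySQL. */
class MySqlStreaming {

    static PreparedStatement prepareStreaming(Connection conn, String sql)
            throws SQLException {
        // Connector/J only streams forward-only, read-only result sets.
        PreparedStatement stmt = conn.prepareStatement(
            sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        // Integer.MIN_VALUE is the driver's signal to stream rows one at a time
        // instead of buffering the whole result set in memory.
        stmt.setFetchSize(Integer.MIN_VALUE);
        return stmt;
    }
}
```

With a streaming result set, the rows have to be read to the end (or the result set closed) before another query is issued on the same connection.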

DBInputFormat itself now includes logic in {{getRecordReader()}} to decide 
which RR implementation to instantiate. This uses the same database metadata 
that was originally pushed down into {{getSelectQuery()}}.
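
A minimal sketch of that kind of dispatch, assuming the decision is keyed on the product name reported by JDBC's {{DatabaseMetaData.getDatabaseProductName()}} ({{ReaderSelector}}, {{ReaderKind}} and {{forProduct}} are hypothetical names, and the prefix matching is my assumption):

```java
import java.util.Locale;

/** Hypothetical dispatcher: maps a JDBC product name to a reader flavour. */
class ReaderSelector {

    enum ReaderKind { GENERIC, ORACLE, MYSQL }

    /**
     * @param productName e.g. the value of
     *        connection.getMetaData().getDatabaseProductName(); may be null
     */
    static ReaderKind forProduct(String productName) {
        if (productName == null) {
            return ReaderKind.GENERIC;
        }
        // Compare case-insensitively, independent of the default locale.
        String name = productName.toUpperCase(Locale.ROOT);
        if (name.startsWith("ORACLE")) {
            return ReaderKind.ORACLE;
        }
        if (name.startsWith("MYSQL")) {
            return ReaderKind.MYSQL;
        }
        // Anything else falls back to the generic LIMIT/OFFSET reader.
        return ReaderKind.GENERIC;
    }
}
```

Keeping the vendor check in one place means each reader subclass only has to know its own SQL dialect.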

I've again tested this locally against Oracle and MySQL databases I've 
installed; both work.


> org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-716
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-716
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>         Environment: Java 1.6, Hadoop 0.19.0, Linux, Oracle
>            Reporter: evanand
>            Assignee: Aaron Kimball
>         Attachments: HADOOP-5482.20-branch.patch, HADOOP-5482.patch, 
> HADOOP-5482.trunk.patch, MAPREDUCE-716.2.branch20.patch, 
> MAPREDUCE-716.2.trunk.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle.
> The out-of-the-box Hadoop implementation works properly with 
> mysql/hsqldb, but NOT with oracle.
> The reason is that DBInputformat is implemented with mysql/hsqldb-specific 
> query constructs like "LIMIT" and "OFFSET".
> FIX:
> Build database-provider-specific logic based on the database 
> provider name (which we can get from the connection).
> I HAVE ALREADY IMPLEMENTED IT FOR ORACLE...READY TO CHECK_IN CODE

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
