[jira] Commented: (MAPREDUCE-716) org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle

Aaron Kimball (JIRA) Wed, 08 Jul 2009 09:25:46 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728763#action_12728763
 ]


Aaron Kimball commented on MAPREDUCE-716:
-----------------------------------------

* Calling {{setFetchSize(Integer.MIN_VALUE)}} is the signal that tells the 
MySQL driver to send the data row-at-a-time instead of buffering the entire 
result set in the client. No other values for the setFetchSize argument are 
supported (see [http://forums.mysql.com/read.php?39,137457]). Given the volume 
of data encountered, buffering all results could well cause OutOfMemory 
exceptions, as were seen in Sqoop. There are enough bottlenecks elsewhere in 
HDFS that row-at-a-time fetching is unlikely to be the slowest point. 
Consequently, this is the "correct" setting for result sets that are expected 
to be large.
* So are you saying that DBRR should be a top-level class? I don't have a 
strong opinion about this. I can pull it up to top level easily enough. I will 
make this change only in the trunk patch, not the 0.20 branch patch.
* A reference to the statement object is no longer kept. I'll go back to 
explicitly tracking the reference to the statement object and closing it in 
the close() method.
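
As a rough illustration of the first and third points, a reader along the 
following lines would stream rows from MySQL and release the statement in 
close(). This is only a sketch, not the code in the attached patches: the 
class name StreamingRowReader and its methods are hypothetical, and it assumes 
MySQL Connector/J, which streams only for forward-only, read-only statements. 
Other JDBC drivers may reject a negative fetch size, so a real implementation 
would presumably apply this setting only when talking to MySQL.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical reader used for illustration; not the DBRR from the patch. */
final class StreamingRowReader implements AutoCloseable {
  // The statement reference is kept so that close() can release it.
  private final PreparedStatement statement;
  private final ResultSet results;

  StreamingRowReader(Connection connection, String query) throws SQLException {
    PreparedStatement stmt = connection.prepareStatement(
        query, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
    ResultSet rs;
    try {
      // Assumption: MySQL Connector/J, where Integer.MIN_VALUE requests
      // row-at-a-time streaming instead of client-side buffering.
      stmt.setFetchSize(Integer.MIN_VALUE);
      rs = stmt.executeQuery();
    } catch (SQLException e) {
      stmt.close(); // do not leak the statement if setup fails
      throw e;
    }
    this.statement = stmt;
    this.results = rs;
  }

  /** Advances to the next row; returns false when the rows are exhausted. */
  boolean next() throws SQLException {
    return results.next();
  }

  /** Reads a column of the current row as a string (1-based index). */
  String getString(int column) throws SQLException {
    return results.getString(column);
  }

  /** Closes the result set first, then the tracked statement. */
  @Override
  public void close() throws SQLException {
    try {
      results.close();
    } finally {
      statement.close();
    }
  }
}
```

A caller would typically create the reader in a try-with-resources block and 
loop on next(), so that the statement is closed even if row processing fails 
partway through.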

> org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-716
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-716
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>         Environment: Java 1.6, Hadoop 0.19.0, Linux, Oracle
>            Reporter: evanand
>            Assignee: Aaron Kimball
>         Attachments: HADOOP-5482.20-branch.patch, HADOOP-5482.patch, 
> HADOOP-5482.trunk.patch, MAPREDUCE-716.2.branch20.patch, 
> MAPREDUCE-716.2.trunk.patch, MAPREDUCE-716.3.trunk.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> org.apache.hadoop.mapred.lib.db.DBInputFormat does not work with Oracle.
> The out-of-the-box Hadoop implementation works properly with 
> MySQL/HSQLDB, but NOT with Oracle.
> The reason is that DBInputFormat is implemented with MySQL/HSQLDB-specific 
> query constructs such as "LIMIT" and "OFFSET".
> FIX:
> Build database-provider-specific logic based on the database 
> provider name (which can be obtained from the connection).
> I HAVE ALREADY IMPLEMENTED IT FOR ORACLE...READY TO CHECK_IN CODE

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
