[
https://issues.apache.org/jira/browse/MAPREDUCE-716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728763#action_12728763
]
Aaron Kimball commented on MAPREDUCE-716:
-----------------------------------------
* Calling {{setFetchSize(Integer.MIN_VALUE)}} signals the MySQL driver to
stream the data one row at a time instead of buffering the entire result set
in the client. No other values for the setFetchSize argument are supported (see
[http://forums.mysql.com/read.php?39,137457]). Given the volume of data
involved, buffering all results is likely to cause OutOfMemory exceptions like
those seen in Sqoop. There are enough bottlenecks elsewhere in HDFS that
row-at-a-time fetching is unlikely to be the slowest point. Consequently, this
is the "correct" setting for result sets that are expected to be large.
* So are you saying that DBRR should be a top-level class? I don't have strong
opinions about this. I can pull it up to top level easily enough. I will only
do this on the trunk branch, not the 0.20 branch patch.
* A reference to the statement object is no longer being held. I'll go back to
explicitly tracking the statement reference and closing the statement in the
close() method.
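
To make the first and third points concrete, here is a minimal sketch of a
reader that requests row-at-a-time streaming and keeps the statement reference
so that close() can release it. The class name StreamingRowReader and its
methods are hypothetical and are not the DBRR code from the patch. The
forward-only, read-only statement type is what the MySQL driver expects for
streaming; other drivers may reject a negative fetch size, since the JDBC
Javadoc requires rows >= 0.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical illustration only; not the record reader from the patch. */
class StreamingRowReader implements AutoCloseable {
    // Held explicitly so that close() can release the statement.
    private final Statement statement;
    private final ResultSet results;

    StreamingRowReader(Connection connection, String query) throws SQLException {
        // The MySQL driver streams only forward-only, read-only result sets.
        Statement st = connection.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        ResultSet rs;
        try {
            // MySQL-specific signal: send rows one at a time, do not buffer.
            st.setFetchSize(Integer.MIN_VALUE);
            rs = st.executeQuery(query);
        } catch (SQLException e) {
            st.close(); // do not leak the statement if setup fails
            throw e;
        }
        this.statement = st;
        this.results = rs;
    }

    /** Advances to the next row; returns false when the rows are exhausted. */
    boolean next() throws SQLException {
        return results.next();
    }

    /** Reads a column of the current row as a string (1-based index). */
    String getString(int column) throws SQLException {
        return results.getString(column);
    }

    @Override
    public void close() throws SQLException {
        try {
            results.close();
        } finally {
            statement.close(); // closed even if closing the result set throws
        }
    }
}
```

Used in a try-with-resources block, the reader closes both the result set and
the statement once the rows have been consumed.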
> org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle
> ---------------------------------------------------------------------
>
> Key: MAPREDUCE-716
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-716
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Environment: Java 1.6, Hadoop 0.19.0, Linux, Oracle
> Reporter: evanand
> Assignee: Aaron Kimball
> Attachments: HADOOP-5482.20-branch.patch, HADOOP-5482.patch,
> HADOOP-5482.trunk.patch, MAPREDUCE-716.2.branch20.patch,
> MAPREDUCE-716.2.trunk.patch, MAPREDUCE-716.3.trunk.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> org.apache.hadoop.mapred.lib.db.DBInputFormat is not working with Oracle.
> The out-of-the-box Hadoop implementation works properly with
> mysql/hsqldb, but NOT with Oracle.
> The reason is that DBInputFormat is implemented with mysql/hsqldb-specific
> query constructs such as "LIMIT" and "OFFSET".
> FIX:
> Build database-provider-specific logic based on the database
> provider name (which we can get from the connection).
> I HAVE ALREADY IMPLEMENTED IT FOR ORACLE...READY TO CHECK_IN CODE
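
As a rough illustration of the proposed fix, the sketch below picks a
pagination clause based on the database product name reported by the
connection. It is not the reporter's implementation: the class
SplitQueryBuilder, its method names, and the dbif_rno column alias are all
hypothetical, and the Oracle branch uses the common nested ROWNUM idiom as an
assumed stand-in for LIMIT/OFFSET.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Locale;

/** Hypothetical helper; names are illustrative, not from any patch. */
class SplitQueryBuilder {

    /** Reads the provider name from the connection metadata. */
    static String productName(Connection connection) throws SQLException {
        return connection.getMetaData().getDatabaseProductName();
    }

    /**
     * Wraps baseQuery so that it returns `length` rows starting after
     * `offset`. baseQuery should carry an ORDER BY so that splits are stable.
     */
    static String buildSplitQuery(String productName, String baseQuery,
                                  long offset, long length) {
        boolean isOracle = productName != null
                && productName.toUpperCase(Locale.ROOT).startsWith("ORACLE");
        if (isOracle) {
            // Oracle has no LIMIT/OFFSET here, so filter on ROWNUM instead.
            // Note the result carries one extra column, dbif_rno.
            long lastRow = offset + length;
            return "SELECT * FROM (SELECT a.*, ROWNUM dbif_rno FROM ( "
                    + baseQuery
                    + " ) a WHERE ROWNUM <= " + lastRow
                    + " ) WHERE dbif_rno > " + offset;
        }
        // Default: the mysql/hsqldb style constructs mentioned in the report.
        return baseQuery + " LIMIT " + length + " OFFSET " + offset;
    }
}
```

A caller would pass productName(connection) as the first argument, so the
same split logic can serve mysql/hsqldb and Oracle without further branching
at the call site.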