[ 
https://issues.apache.org/jira/browse/HADOOP-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HADOOP-2536:
----------------------------------

    Attachment: mapred_jdbc_v3.patch

Since Fredrik said that he cannot continue working on the patch, I have 
updated it. 
The changes include:
# package and class names now use the prefix DB instead of database. 
# DBInputSplit is now an inner class of DBInputFormat
# instead of a type mapping in the library to convert the data types, a new 
DBWritable interface is introduced. Classes implement DBWritable to convert 
themselves from/to db tuples. 
# DBRecordReader emits <LongWritable, T> pairs, where the record number is the 
key and T is of type DBWritable. 
# DBRecordWriter accepts <K, V> where K implements DBWritable (and is hence 
written to the db) and V is discarded. 
# writes to the db use JDBC batch updates. 
# introduced two ways of setting the input query. 
# improved documentation.
# added a sample mapred program that reads data from the db and writes the 
results back to the db. The program calculates the number of pageviews in a 
synthetically generated access log. The example program uses HSQLDB as an 
embedded database. 
# added a test case running the example job in the MiniCluster. 
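
To illustrate the DBWritable idea from the list above, here is a minimal 
sketch of a class that converts itself from/to a db tuple. The interface below 
is a local stand-in written for this example; its method names and signatures, 
and the PageviewRecord class with its two columns, are assumptions and may not 
match what is in the patch.

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical stand-in for the DBWritable interface described above.
// The method names and signatures are assumptions, not taken from the patch.
interface DBWritable {
    // Populate this object's fields from the current row of the result set.
    void readFields(ResultSet resultSet) throws SQLException;

    // Bind this object's fields to the parameters of an insert/update statement.
    void write(PreparedStatement statement) throws SQLException;
}

// Hypothetical record type: one (url, pageviews) tuple.
class PageviewRecord implements DBWritable {
    String url;
    long pageviews;

    @Override
    public void readFields(ResultSet resultSet) throws SQLException {
        // JDBC column indices are 1-based.
        url = resultSet.getString(1);
        pageviews = resultSet.getLong(2);
    }

    @Override
    public void write(PreparedStatement statement) throws SQLException {
        // JDBC parameter indices are 1-based as well.
        statement.setString(1, url);
        statement.setLong(2, pageviews);
    }
}
```

With this shape, a record reader would call readFields() once per row and a 
record writer would call write() once per key before the statement is added 
to a batch; the class itself decides how columns map to fields, so no type 
mapping is needed in the library.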



> MapReduce for MySQL
> -------------------
>
>                 Key: HADOOP-2536
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2536
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Fredrik Hedberg
>            Assignee: Fredrik Hedberg
>            Priority: Minor
>         Attachments: database-2.diff, database.diff, mapred_jdbc_v3.patch
>
>
> Add support for running MapReduce jobs over data residing in a MySQL table.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
