[
https://issues.apache.org/jira/browse/HADOOP-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508710
]
Vuk Ercegovac commented on HADOOP-1519:
---------------------------------------
Thanks for the feedback and apologies for the initial tgz file. Let me know if
the patch works for you.
I included a sample driver, org.apache.hadoop.mapred.TableJobExample, that
scans an input table's columns and writes to an output table. The input/output
tables, along with the columns to scan, are user-specified. Filtering can be
done by extending TableMap (a rough sketch follows below). Specifying a row
range and column versions, as suggested, would be good additions.
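
For illustration, a filtering map might look roughly like the sketch below.
This is only a sketch against the old org.apache.hadoop.mapred Mapper style;
the class name PrefixFilterMap is made up, and the key/value types that
TableMap hands to map() (assumed here to be HStoreKey and a MapWritable of
column values) are assumptions about the patch's API, not its actual
signature.

    import java.io.IOException;

    import org.apache.hadoop.hbase.HStoreKey;
    import org.apache.hadoop.io.MapWritable;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    // Hypothetical filter: emit only rows whose key starts with "row_".
    // TableMap's real signature is whatever the attached patch defines;
    // the types used here are assumptions for illustration only.
    public class PrefixFilterMap extends TableMap {
      private static final String PREFIX = "row_";

      public void map(HStoreKey key, MapWritable columns,
          OutputCollector output, Reporter reporter) throws IOException {
        // Rows that do not match the prefix are simply not collected,
        // i.e. they are filtered out of the job's output.
        if (key.getRow().toString().startsWith(PREFIX)) {
          output.collect(key, columns);
        }
      }
    }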
I tried ArrayWritable but ran into the following problem at line 444 of
MapTask.java: the value object is instantiated from the given class, say
ArrayWritable, via its empty constructor. Then at line 459, value.readFields
is called. At that point, ArrayWritable's valueClass is still null, since the
empty constructor never sets it, and ArrayWritable.readFields assumes the
class is already set rather than reading it off the stream. My workaround is
to go through RecordWritable, but I am certainly open to better suggestions.
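
For what it's worth, a common way to dodge that readFields problem,
independent of this patch, is to subclass ArrayWritable so the element class
is fixed by a no-argument constructor; reflective instantiation then leaves
valueClass set before readFields runs. A minimal sketch (TextArrayWritable is
a made-up name, and this is not necessarily what RecordWritable does):

    import org.apache.hadoop.io.ArrayWritable;
    import org.apache.hadoop.io.Text;

    // Subclassing pins the element type, so an instance created via the
    // no-argument constructor (as the framework does by reflection) has
    // valueClass set before readFields() is ever called.
    public class TextArrayWritable extends ArrayWritable {
      public TextArrayWritable() {
        super(Text.class);
      }
    }

A job would then declare TextArrayWritable, rather than ArrayWritable itself,
as its value class.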
I have tested this code both with MiniHBaseCluster and on a distributed
cluster (thanks for the new start/stop scripts!) for the simple case of
copying tables.
> mapreduce input and output formats to go against hbase
> ------------------------------------------------------
>
> Key: HADOOP-1519
> URL: https://issues.apache.org/jira/browse/HADOOP-1519
> Project: Hadoop
> Issue Type: New Feature
> Components: contrib/hbase
> Reporter: stack
> Assignee: Jim Kellerman
> Attachments: hbaseMR.tgz, patch.txt
>
>
> Inputs should allow specification of row range, columns and column versions.
> Outputs should allow specification of where to put the mapreduce result in
> hbase.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.