[ https://issues.apache.org/jira/browse/HBASE-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035707#comment-13035707 ]
Bill Graham commented on HBASE-3880:
------------------------------------
Sounds good. I'm working on the following changes to {{TsvImporter}}:
* Make {{TsvImporter}} an outer class and rename it to {{TsvImporterMapper}}.
* Change the {{setup}} method from protected to public.
* Expose getters for {{ts}}, {{skipBadLines}} and {{badLineCount}}.
* Add {{incrementBadLineCount(int count)}} method.
I should have a patch ready soon unless there are other suggestions/comments.
For now I was going to leave the {{TsvParser}} as an inner class, unless anyone
thinks making it an outer class would be useful as well.
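
As a rough illustration only (not the actual patch), the changes listed above could give the class roughly the shape below. This sketch assumes plain Java stand-ins so it runs without Hadoop: a {{Map}} replaces the job configuration, a {{long}} replaces the bad-line counter, and the class names, the two config keys as used here, and the {{map(String)}} helper on the subclass are all hypothetical.

```java
import java.util.Map;

// Hypothetical stand-in for the proposed outer class TsvImporterMapper.
// A real version would extend Hadoop's Mapper; this one does not.
class TsvImporterMapperSketch {
    private long ts;
    private boolean skipBadLines;
    private long badLineCount; // assumption: a plain long instead of a job counter

    // Public rather than protected, so code outside the class can drive setup.
    // The Map parameter and the key names are illustrative assumptions.
    public void setup(Map<String, String> conf) {
        ts = Long.parseLong(conf.getOrDefault("importtsv.timestamp", "0"));
        skipBadLines = Boolean.parseBoolean(
                conf.getOrDefault("importtsv.skip.bad.lines", "true"));
        badLineCount = 0;
    }

    public long getTs() { return ts; }

    public boolean getSkipBadLines() { return skipBadLines; }

    public long getBadLineCount() { return badLineCount; }

    public void incrementBadLineCount(int count) { badLineCount += count; }
}

// Hypothetical user-defined mapper that reuses the exposed state.
class TrimmingMapper extends TsvImporterMapperSketch {
    // Returns the trimmed fields, or null when a bad line was skipped.
    String[] map(String line) {
        String[] fields = line.split("\t", -1);
        if (fields.length < 2) {
            if (getSkipBadLines()) {
                incrementBadLineCount(1);
                return null;
            }
            throw new IllegalArgumentException("Bad line: " + line);
        }
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return fields;
    }
}
```

The point of the getters and {{incrementBadLineCount}} is that a subclass can apply its own transformation while still honouring the skip-bad-lines setting and reporting bad lines the same way the base mapper does.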
> Make mapper function in ImportTSV plug-able
> -------------------------------------------
>
> Key: HBASE-3880
> URL: https://issues.apache.org/jira/browse/HBASE-3880
> Project: HBase
> Issue Type: New Feature
> Reporter: Bill Graham
> Assignee: Bill Graham
> Attachments: HBASE-3880_1.patch
>
>
> It would be really useful to allow the ability to specify a different Mapper
> for the {{ImportTsv}} class to use than the current {{TsvImporter}}. This
> would allow transformations to be made on the input data before being added
> to HBase. One suggestion is to add a new command line option to specify a
> user defined mapper (UDM?). Or maybe instead we just refactor it to be
> extended where a subclass can specify a new mapper.
> The mapper is statically defined and bound to the job though, so I'm not sure
> of the best way to make it dynamically plug-able. Suggestions welcome.