David Alves wrote:
Hi All
I'm currently rewriting my own TableOutputFormat classes to
comply with the new APIs introduced in the latest version and I was
wondering if it would be valuable to rewrite them as buffered writers,
meaning keeping a predetermined set of records (set by size to avoid OOME)
before committing them to HBase.
Commits are by row. Are you talking about batching up rows before
forwarding them to HBase?
What are your thoughts about this?
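To make the buffering idea concrete, here is a minimal sketch, not the actual TableOutputFormat code. RowSink and BufferedRowWriter are hypothetical names, RowSink stands in for whatever really commits a row to HBase, and the size accounting is only a rough estimate. Rows are still committed one at a time; they are just held back until the buffer reaches a byte limit.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sink: stands in for whatever actually commits one row to HBase. */
interface RowSink {
    void commit(String row, byte[] value);
}

/** Holds rows in memory until their estimated size reaches a limit, then flushes. */
class BufferedRowWriter {
    private final RowSink sink;
    private final long maxBufferedBytes;
    private final List<String> rows = new ArrayList<>();
    private final List<byte[]> values = new ArrayList<>();
    private long bufferedBytes = 0;

    BufferedRowWriter(RowSink sink, long maxBufferedBytes) {
        this.sink = sink;
        this.maxBufferedBytes = maxBufferedBytes;
    }

    /** Buffers one row and flushes once the estimated size reaches the limit. */
    void write(String row, byte[] value) {
        rows.add(row);
        values.add(value);
        // Rough estimate only: key length in chars plus value length in bytes.
        bufferedBytes += row.length() + value.length;
        if (bufferedBytes >= maxBufferedBytes) {
            flush();
        }
    }

    /** Commits every buffered row, one row at a time, in arrival order. */
    void flush() {
        for (int i = 0; i < rows.size(); i++) {
            sink.commit(rows.get(i), values.get(i));
        }
        rows.clear();
        values.clear();
        bufferedBytes = 0;
    }

    /** Flushes whatever is left; call this when the task finishes. */
    void close() {
        flush();
    }
}
```

Bounding the buffer by size rather than by record count is what keeps memory use predictable when row sizes vary a lot.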
On another note, I think it would be valuable to rewrite the
TableInputFormat class to be extensible. For example, in my case I needed a
filtered (RegExpRowFilter) TableInputFormat and could not extend the
original because its HTable instance is package-private.
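One way the subclassing problem could be avoided is a protected hook instead of a package-private field. The sketch below is only an illustration under that assumption: SimpleTableInput and RegexFilteredTableInput are hypothetical names, a plain list of row keys stands in for the table, and none of it uses the HBase API or RegExpRowFilter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in for an input format; a list of row keys plays the table. */
class SimpleTableInput {
    // Kept private: subclasses use the protected hook below, not this field.
    private final List<String> rowKeys;

    SimpleTableInput(List<String> rowKeys) {
        this.rowKeys = new ArrayList<>(rowKeys);
    }

    /** Extension point: subclasses override this to decide which rows are read. */
    protected boolean accept(String rowKey) {
        return true;
    }

    /** Returns the row keys that pass accept(), in their original order. */
    final List<String> readRows() {
        List<String> out = new ArrayList<>();
        for (String key : rowKeys) {
            if (accept(key)) {
                out.add(key);
            }
        }
        return out;
    }
}

/** Filtered variant: keeps only row keys that fully match a regular expression. */
class RegexFilteredTableInput extends SimpleTableInput {
    private final Pattern pattern;

    RegexFilteredTableInput(List<String> rowKeys, String regex) {
        super(rowKeys);
        this.pattern = Pattern.compile(regex);
    }

    @Override
    protected boolean accept(String rowKey) {
        return pattern.matcher(rowKey).matches();
    }
}
```

The filtered subclass never needs to touch the base class's internals, which is the property the current class lacks.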
This needs to be done before the 0.2.0 release. It's been on my mind. I
just made a JIRA for it. Dump any thoughts you have on how it might
work into HBASE-581. At a minimum, add a note on what currently prevents
you from being able to subclass.
If you are currently working on this, I could do the HBase end for you.
Just say.
St.Ack