[
https://issues.apache.org/jira/browse/HBASE-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529683#comment-13529683
]
Matt Corgan commented on HBASE-7233:
------------------------------------
Good stuff Stack. Some thoughts:
* Move the codec package up out of the io package? Partly for readability, but
also because we may be doing some encoding/decoding that's purely in memory at
some point (memstore).
* Do we need both Encoder and CellOutputStream interfaces?
* You think CodecException should extend IOException? I was thinking they're
separate concepts that just happen to be used together a lot. For example, if
we encode the memstore it would have to throw IOExceptions even though no I/O
is involved. The dependency looks to come from the relationship between
CellOutputStream and Encoder, which I'm not clear on.
* I saw you grabbed the CellSearcher interface from prefix-tree as well. I'm
not confident that the methods in that one are best for all of hbase, but we
can change them later when we figure out what should be there. Same with
ReversibleCellScanner.
* I saw you changed CellScanner.next() (an ambiguous word) to read(), which is
fine. I'd throw advance() in as a candidate - I guess you're picturing RPC
decoding and I'm picturing block decoding. Not important.
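To make the naming question concrete, here is a rough sketch of what an advance()-style scanner could look like. Everything in it is hypothetical: SketchCellScanner, current() and ListCellScanner are illustration names, not what's in the patch, and a String stands in for a real cell.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical scanner shape; a String stands in for a real cell. */
interface SketchCellScanner {
  /** Moves to the next cell. Returns false once there are no more cells. */
  boolean advance();

  /** Returns the cell at the current position; valid only after advance() returned true. */
  String current();
}

/** Hypothetical in-memory implementation backed by a list. */
final class ListCellScanner implements SketchCellScanner {
  private final List<String> cells;
  private int index = -1; // starts positioned before the first entry

  ListCellScanner(List<String> cells) {
    this.cells = cells;
  }

  @Override
  public boolean advance() {
    if (index + 1 < cells.size()) {
      index++;
      return true;
    }
    return false;
  }

  @Override
  public String current() {
    if (index < 0) {
      throw new IllegalStateException("advance() has not been called yet");
    }
    return cells.get(index);
  }

  public static void main(String[] args) {
    SketchCellScanner scanner = new ListCellScanner(Arrays.asList("row1", "row2"));
    while (scanner.advance()) {
      System.out.println(scanner.current());
    }
  }
}
```

Splitting "move" from "get" this way reads the same whether the cells come off an RPC stream or out of a decoded block, which is the only reason I'd lean toward advance().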
{quote}resetToBeforeFirstEntryMethod on something that implemented
CellSearcher?{quote}
yep, CellSearcher
{quote}Wondering in particular if Interface will work for Encoders that
compress; i.e. PrefixTree.{quote}
I think it will work great on the underlying DataBlockEncoders. Tricky part is
figuring out how to modify the HFileDataBlockEncoderImpl to allow the
streaming. Might be able to simplify that thing in the process. I wonder if
it's time to ditch the separate disk/memory encoding feature as I have a
feeling people don't use it.
{quote}Do you know if your vint stuff is faster than what is in hadoop in
WritableUtils.vint?{quote}
Speed difference is probably negligible. I made that one because it encodes
only positive numbers, so you can get 255 in 1 byte rather than only 127. It
can actually matter when writing a lot of vint indexes into a token dictionary
type thing. You're using it to write array lengths, which are always positive,
so it's probably a good fit, but I originally intended it to be hidden in the
prefix-tree's black box implementation.
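For anyone who hasn't looked at an unsigned vint before, a minimal sketch of the general idea follows. It is a generic 7-payload-bits-per-byte scheme and not the prefix-tree code; the UnsignedVarInt class and its method names are made up for illustration, and its byte-count boundaries need not match the ones quoted above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper: unsigned varint with 7 payload bits per byte. */
final class UnsignedVarInt {
  private UnsignedVarInt() {}

  /** Writes a non-negative int, low 7-bit group first; a set high bit means more bytes follow. */
  static void write(OutputStream out, int value) throws IOException {
    if (value < 0) {
      throw new IllegalArgumentException("negative value: " + value);
    }
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.write(value);
  }

  /** Reads back a value produced by write(). */
  static int read(InputStream in) throws IOException {
    int result = 0;
    for (int shift = 0; shift < 35; shift += 7) {
      int b = in.read();
      if (b < 0) {
        throw new EOFException("stream ended inside a varint");
      }
      result |= (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        if (result < 0) {
          throw new IOException("varint does not fit in a non-negative int");
        }
        return result;
      }
    }
    throw new IOException("varint longer than 5 bytes");
  }

  /** Convenience wrapper: encodes to a fresh byte array. */
  static byte[] encode(int value) {
    ByteArrayOutputStream out = new ByteArrayOutputStream(5);
    try {
      write(out, value);
    } catch (IOException e) {
      throw new UncheckedIOException(e); // not expected for an in-memory stream
    }
    return out.toByteArray();
  }

  /** Convenience wrapper: decodes from the start of a byte array. */
  static int decode(byte[] bytes) {
    try {
      return read(new ByteArrayInputStream(bytes));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    int arrayLength = 300; // e.g. a length prefix, which is never negative
    byte[] encoded = encode(arrayLength);
    System.out.println(encoded.length + " bytes, decodes to " + decode(encoded));
  }
}
```

Since no bit pattern is spent on a sign, small lengths and dictionary indexes stay short, which is where the savings come from when there are a lot of them.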
> Serializing KeyValues
> ---------------------
>
> Key: HBASE-7233
> URL: https://issues.apache.org/jira/browse/HBASE-7233
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: stack
> Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7233sketch.txt, 7233.txt, 7233-v2.txt,
> 7233v3_encoders.txt, 7233v4_encoders.txt
>
>
> Undo KeyValue being a Writable.