[ 
https://issues.apache.org/jira/browse/HBASE-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529683#comment-13529683
 ] 

Matt Corgan commented on HBASE-7233:
------------------------------------

Good stuff Stack.  Some thoughts:

* Move the codec package up out of the io package?  Partly for readability, but 
also because we may be doing some encoding/decoding that's purely in memory at 
some point (memstore).
* Do we need both Encoder and CellOutputStream interfaces?
* You think CodecException should extend IOException?  I was thinking they're 
separate concepts that just happen to be used together a lot.  For example, if 
we encode the memstore, it would throw IOExceptions.  That looks to come from 
the relationship between CellOutputStream and Encoder, which I'm not clear on.
* I saw you grabbed the CellSearcher interface from prefix-tree as well.  I'm 
not confident that the methods in that one are best for all of hbase, but we 
can change them later when we figure out what should be there.  Same with 
ReversibleCellScanner.
* I saw you changed CellScanner.next() (an ambiguous word) to read(), which is 
fine.  I'd throw advance() in as a candidate - I guess you're picturing RPC 
decoding and I'm picturing block decoding.  Not important.
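
To make the CodecException question concrete, here is a minimal sketch of 
keeping the two failure types apart.  Every name in it (SketchCodecException, 
SketchEncoder, LengthPrefixEncoder, SketchCellOutputStream) is a hypothetical 
stand-in, not an interface from the patch, and making the codec exception 
unchecked is just one assumption about how the split could look.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical: codec failure as its own concept, unrelated to IOException.
class SketchCodecException extends RuntimeException {
    SketchCodecException(String message) {
        super(message);
    }
}

// Hypothetical: pure in-memory encoding, so no IOException in the signature.
interface SketchEncoder {
    byte[] encode(byte[] cell);
}

// Toy encoder: prefixes the cell bytes with a one-byte length.
final class LengthPrefixEncoder implements SketchEncoder {
    @Override
    public byte[] encode(byte[] cell) {
        if (cell.length > 255) {
            throw new SketchCodecException(
                "cell too long for a 1-byte length: " + cell.length);
        }
        byte[] result = new byte[cell.length + 1];
        result[0] = (byte) cell.length;
        System.arraycopy(cell, 0, result, 1, cell.length);
        return result;
    }
}

// Hypothetical stream wrapper: the only layer that can fail with IOException.
final class SketchCellOutputStream {
    private final OutputStream out;
    private final SketchEncoder encoder;

    SketchCellOutputStream(OutputStream out, SketchEncoder encoder) {
        this.out = out;
        this.encoder = encoder;
    }

    void write(byte[] cell) throws IOException {
        out.write(encoder.encode(cell));
    }
}
```

With this shape, a purely in-memory caller (something like the memstore case) 
uses the encoder directly and never has to declare or catch IOException; only 
code that goes through the stream wrapper does.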

{quote}resetToBeforeFirstEntryMethod on something that implemented 
CellSearcher?{quote}
yep, CellSearcher

{quote}Wondering in particular if Interface will work for Encoders that 
compress; i.e. PrefixTree.{quote}
I think it will work great on the underlying DataBlockEncoders.  Tricky part is 
figuring out how to modify the HFileDataBlockEncoderImpl to allow the 
streaming.  Might be able to simplify that thing in the process.  I wonder if 
it's time to ditch the separate disk/memory encoding feature as I have a 
feeling people don't use it.

{quote}Do you know if your vint stuff is faster than what is in hadoop in 
WritableUtils.vint?{quote}
Speed difference is probably negligible.  I made that one because it encodes 
only positive numbers, so you can get 255 in 1 byte rather than only 127.  It 
can actually matter when writing a lot of vint indexes into a token dictionary 
type thing.  You're using it to write array lengths, which are always positive, 
so it's probably a good fit, but I originally intended for it to be hidden in 
the prefix-tree's black box implementation.
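
One way to picture the "255 in 1 byte" point: if no bit is spent on a sign, and 
the width is known from context, all 8 bits of a byte carry magnitude.  The 
sketch below is a fixed-width unsigned big-endian encoding under exactly those 
assumptions.  It is not the prefix-tree code, and the class and method names 
are hypothetical.

```java
// Hypothetical sketch: unsigned, fixed-width, big-endian integer encoding.
// Assumes the width is stored elsewhere (for example, once per block).
final class UnsignedIntSketch {

    // Smallest number of bytes whose bits can hold a non-negative value.
    static int numBytes(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative value: " + value);
        }
        if (value == 0) {
            return 1;
        }
        int bits = 64 - Long.numberOfLeadingZeros(value);
        return (bits + 7) / 8;
    }

    // Writes value big-endian into exactly `width` bytes.
    static byte[] encode(long value, int width) {
        if (width < numBytes(value)) {
            throw new IllegalArgumentException(
                value + " does not fit in " + width + " byte(s)");
        }
        byte[] out = new byte[width];
        for (int i = width - 1; i >= 0; i--) {
            out[i] = (byte) (value & 0xFF);
            value >>>= 8;
        }
        return out;
    }

    // Reads the bytes back, treating each one as unsigned.
    static long decode(byte[] bytes) {
        long value = 0;
        for (byte b : bytes) {
            value = (value << 8) | (b & 0xFF);
        }
        return value;
    }
}
```

Here numBytes(255) is 1 and numBytes(256) is 2, whereas a signed one-byte value 
tops out at 127.  For something like a token dictionary with many small 
indexes, that extra headroom per byte is where the savings would come from.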

                
> Serializing KeyValues
> ---------------------
>
>                 Key: HBASE-7233
>                 URL: https://issues.apache.org/jira/browse/HBASE-7233
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.96.0
>
>         Attachments: 7233sketch.txt, 7233.txt, 7233-v2.txt, 
> 7233v3_encoders.txt, 7233v4_encoders.txt
>
>
> Undo KeyValue being a Writable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
