[ 
https://issues.apache.org/jira/browse/HBASE-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511611#comment-13511611
 ] 

Lars Hofhansl edited comment on HBASE-7233 at 12/6/12 7:02 PM:
---------------------------------------------------------------

{quote}So, set a pb header and then write out <length><bytearray> as we have 
now after we send the pb.{quote}That's what I was thinking, except now we send 
the Cells through an official DataBlockEncoder to generate the <bytearray> 
rather than through the custom KeyValue serializer used today.  We can make 
a new DataBlockEncoder that mimics the byte[] output of the current RPC format 
so that it performs roughly the same as the current client.

{quote}It won't be evolvable, right?  Unless we put a 'version' in the pb 
header or client{quote}We could put a version in the PB header.  It is probably 
safe to do so even if the version never gets used.  I also have a version in 
the internal PrefixTree encoder, and an extra version byte here or there 
doesn't hurt anything.
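
To make the framing under discussion concrete, here is a minimal sketch in Java. It is not HBase code: the class name FramedCellsSketch and its methods are hypothetical, and a single raw version byte stands in for the version field that would really live in the PB header. It only illustrates writing a version followed by repeated <length><bytearray> sections, and a reader that rejects a version it does not understand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of HBase: a version byte followed by
// repeated <length><bytearray> sections.
final class FramedCellsSketch {

    // Writes one version byte (standing in for the PB header), then each
    // section as a 4-byte big-endian length followed by the raw bytes.
    static byte[] encode(int version, List<byte[]> sections) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(version);
        for (byte[] section : sections) {
            out.writeInt(section.length);
            out.write(section);
        }
        out.flush();
        return bytes.toByteArray();
    }

    // Reads the version byte first so an unknown format fails fast, then
    // reads <length><bytearray> sections until the input is exhausted.
    static List<byte[]> decode(int expectedVersion, byte[] wire) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
        int version = in.readUnsignedByte();
        if (version != expectedVersion) {
            throw new IOException("unsupported version " + version);
        }
        List<byte[]> sections = new ArrayList<>();
        while (in.available() > 0) {
            int length = in.readInt();
            byte[] section = new byte[length];
            in.readFully(section);
            sections.add(section);
        }
        return sections;
    }
}
```

In this sketch the cost of the version is one byte per message, which is the sense in which an extra version byte is cheap; the contents of each <bytearray> are left opaque, as they would be if a DataBlockEncoder produced them.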

{quote}It'd write <length><bytearray><length><bytearray> and the byte array 
would be the backing array of a KV?{quote}Regarding the multiple 
<length><bytearray> sections here: is each section a separate RPC message, or 
is there one section per region from a single regionserver?

{quote}Rewriting all hfiles? Pretty controversial I'd say.{quote}Is the idea to 
use Protocol Buffers to write the data blocks in the HFiles?  That seems like a 
performance problem.  Or would it be used only for metadata such as 
FixedFileTrailer?

{quote}I would really prefer not to double the number of kV types just to say 
"foo with tags". And then double again for "foo with tags and bar".{quote}That 
would be ugly, but at the same time it's difficult and maybe wasteful to 
future-proof it from every angle.  Tags are already sort of a flexible 
future-proofing mechanism.  Maybe tags can be added in a backwards compatible 
way to the existing encoders.  I'd have to think about it for PrefixTree; I 
would probably punt tags to a PREFIX_TREE2 encoder along with some other 
additions/improvements.
                
      was (Author: mcorgan):
    {quote}So, set a pb header and then write out <length><bytearray> as we 
have now after we send the pb.{quote}That's what I was thinking, except now we 
send the Cells through an official DataBlockEncoder to generate the <bytearray> 
rather than using the custom KeyValue serializer in use right now.  We can make 
a new DataBlockEncoder that mimics the byte[] output of the current RPC format 
so it has roughly the same performance as the current client.

{quote}It won't be evolvable, right?  Unless we put a 'version' in the pb 
header or client{quote}We could put a version in the PB header.{quote}Probably 
safe to put a version in the header even if it never gets used.  I also have a 
version in the internal PrefixTree encoder, but an extra version byte here or 
there doesn't hurt anything.

{quote}It'd write <length><bytearray><length><bytearray> and the byte array 
would be the backing array of a KV?{quote}Regarding the multiple 
<length><bytearray> here - is each section a separate RPC message, or there is 
a section per region from a single regionserver?

{quote}Rewriting all hfiles? Pretty controversial I'd say.{quote}Is the idea to 
use Protocol Buffers to write the data blocks in the HFiles?  That seems like a 
performance problem.  Or just the metadata like FixedFileTrailer?

{quote}I would really prefer not to double the number of kV types just to say 
"foo with tags". And then double again for "foo with tags and bar".{quote}That 
would be ugly, but at the same time it's difficult and maybe wasteful to 
future-proof it from every angle.  Tags are already sort of a flexible 
future-proofing mechanism.  Maybe tags can be added in a backwards compatible 
way to the existing encoders.  I'd have to think about it for PrefixTree, 
probably punting them to a PREFIX_TREE2 encoder with some other 
additions/improvements.
                  
> Serializing KeyValues
> ---------------------
>
>                 Key: HBASE-7233
>                 URL: https://issues.apache.org/jira/browse/HBASE-7233
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>         Attachments: 7233.txt, 7233-v2.txt
>
>
> Undo KeyValue being a Writable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
