For large binary objects, you could consider Google Protocol Buffers. The 
encoding is very compact for things like large lists of numbers, where Java 
serialization adds a lot of overhead (for example: a single BigInteger 
object of value 0 takes 50 bytes in serialized form).
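
If you want to measure that overhead yourself, something along these lines 
should do. It is only a minimal sketch: the class and method names are made up 
for illustration, and the exact byte count you see may differ per JDK, so I 
have not hard-coded one.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.math.BigInteger;

// Hypothetical helper class, not part of any library.
class SerializedSize {
    /** Returns the number of bytes Java serialization produces for the given object. */
    static int javaSerializedSize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.size();
    }

    public static void main(String[] args) throws IOException {
        BigInteger zero = BigInteger.ZERO;
        // Full Java serialization: stream header, class descriptor and fields.
        System.out.println("Java serialization: " + javaSerializedSize(zero) + " bytes");
        // The bare two's-complement value, which is all a compact encoding has to carry.
        System.out.println("Raw value: " + zero.toByteArray().length + " bytes");
    }
}
```

The difference between the two numbers is the per-object cost you pay with 
Java serialization, and it adds up quickly over a large list.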

If you anticipate using a volume of data that justifies the use of HBase and 
Hadoop, I would not want to fix any data corruption manually, so you should 
probably automate this with some kind of sanity checks. I don't think you 
have to worry about HBase / HDFS corrupting your data; they have proven to be 
very stable in that area.
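
To give an idea of what I mean by a sanity check, here is a rough sketch. The 
Account class, its fields and the rules being checked are all assumptions on 
my part (I don't know your actual model); the point is only that each decoded 
row gets tested against a few invariants and the violations are reported 
rather than fixed by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example, not taken from any existing library or schema.
class AccountSanityCheck {
    /** Stand-in for whatever entity is stored per row. */
    static final class Account {
        final String id;
        final List<String> activities;

        Account(String id, List<String> activities) {
            this.id = id;
            this.activities = activities;
        }
    }

    /** Returns one message per violated invariant; an empty list means the row looks sane. */
    static List<String> check(String rowKey, Account account) {
        List<String> problems = new ArrayList<>();
        if (account.id == null || account.id.isEmpty()) {
            problems.add("missing account id");
        } else if (!account.id.equals(rowKey)) {
            problems.add("account id does not match row key");
        }
        if (account.activities == null) {
            problems.add("activities list is null");
        } else {
            for (String activity : account.activities) {
                if (activity == null || activity.isEmpty()) {
                    problems.add("empty activity entry");
                    break; // one report per row is enough
                }
            }
        }
        return problems;
    }
}
```

A check like this could run inside a periodic job over the table, so that bad 
rows are flagged automatically.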


Friso




On 7 dec 2010, at 00:45, Buttler, David wrote:

> A couple of thoughts here:
> 1) For some types of objects, you want your fields to be column qualifiers in 
> HBase.  So in effect, you are serializing to the HBase format.
> 2) Some objects you might want to serialize with JSON -- it is a very 
> lightweight serialization protocol -- and you can use Gson to do most of the 
> work for you.
> 3) Some objects you might want to invent your own human-readable format for, 
> for legacy or convenience reasons.
> 
> I do all three in a single table and find it very flexible
> 
> Dave
> 
> 
> -----Original Message-----
> From: Hiller, Dean (Contractor) [mailto:[email protected]] 
> Sent: Monday, December 06, 2010 1:40 PM
> To: [email protected]
> Subject: serialized objects as strings or as object? & data corruption?
> 
> Is there a good tool out there for serialization to hbase for a java
> entity?  If I have an Account, and then have a List<Activities> in the
> account, I preferably want to serialize that as all strings so data
> corruption issues can be fixed easier independent of the objects.....or
> do I just create MapReduce short lived jobs that fix data corruption?
> How do people deal with data corruption and serializing objects to HBase
> storage?
> 
> 
> 
> I also like the ability to query command line and actually be able to
> read the storage(but maybe I just build something that knows about my
> objects?)....how do people deal with this today?  Just looking for
> thoughts on this subject.
> 
> 
> 
> Thanks,
> 
> Dean
> 
> 
