[
https://issues.apache.org/jira/browse/HADOOP-3414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598126#action_12598126
]
Doug Cutting commented on HADOOP-3414:
--------------------------------------
So we'd:
- make Serializer an abstract class;
- add a method:
{code}
public int getSize() { return -1; }
{code}
- override this in some simple classes, like Text and BytesWritable;
- add a utility somewhere like:
{code}
public class LengthPrefixedSerializer<T> extends Serializer<T> {
  private DataOutputStream out;
  private DataOutputBuffer buffer = new DataOutputBuffer();
  private Serializer<T> serializer;        // writes straight to out
  private Serializer<T> bufferSerializer;  // writes to the scratch buffer

  public LengthPrefixedSerializer(Class<T> c, DataOutputStream out) throws IOException {
    this.out = out;
    serializer = SerializationFactory.getSerializer(c);
    serializer.open(out);
    bufferSerializer = SerializationFactory.getSerializer(c);
    bufferSerializer.open(buffer);
  }

  public void serialize(T o) throws IOException {
    int size = o.getSize();
    if (size >= 0) {
      // can serialize directly w/o buffering
      WritableUtils.writeVInt(out, size);
      serializer.serialize(o);
    } else {
      // have to buffer before we can serialize
      buffer.reset();
      bufferSerializer.serialize(o);
      WritableUtils.writeVInt(out, buffer.getLength());
      out.write(buffer.getData(), 0, buffer.getLength());
    }
  }
}
{code}
Is that something like what you have in mind?
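For illustration only, here is a minimal self-contained sketch of the same two paths using nothing but java.io. The names LengthPrefixDemo, Sized, KnownSize, UnknownSize, writeLengthPrefixed and encode are all hypothetical and not part of Hadoop, and it writes a fixed 4-byte length with writeInt where the utility above would use a vint.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class LengthPrefixDemo {

    /** Hypothetical stand-in for a type that may know its serialized size. */
    interface Sized {
        int getSize();  // -1 means "unknown, must buffer"
        void write(DataOutputStream out) throws IOException;
    }

    /** Knows its size up front, like a raw byte container. */
    static final class KnownSize implements Sized {
        private final byte[] bytes;
        KnownSize(byte[] bytes) { this.bytes = bytes; }
        public int getSize() { return bytes.length; }
        public void write(DataOutputStream out) throws IOException { out.write(bytes); }
    }

    /** Keeps the default of -1, so the writer has to buffer it. */
    static final class UnknownSize implements Sized {
        private final String text;
        UnknownSize(String text) { this.text = text; }
        public int getSize() { return -1; }
        public void write(DataOutputStream out) throws IOException {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** Writes a 4-byte length followed by the payload. */
    static void writeLengthPrefixed(Sized o, DataOutputStream out) throws IOException {
        int size = o.getSize();
        if (size >= 0) {
            // size is known: no intermediate copy
            out.writeInt(size);
            o.write(out);
        } else {
            // size is unknown: serialize into a scratch buffer first
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            o.write(new DataOutputStream(buffer));
            out.writeInt(buffer.size());
            buffer.writeTo(out);
        }
    }

    /** Convenience helper: returns the length-prefixed bytes for one object. */
    static byte[] encode(Sized o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            writeLengthPrefixed(o, out);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] direct = encode(new KnownSize("abc".getBytes(StandardCharsets.UTF_8)));
        byte[] buffered = encode(new UnknownSize("abc"));
        // Both paths should produce the same bytes on the wire.
        System.out.println(Arrays.equals(direct, buffered));  // prints "true"
    }
}
```

The point of the sketch is that readers see an identical record either way; only the writer's cost differs, since the known-size path skips the extra copy.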
> Facility to query serializable types such as Writables for 'raw length'
> -----------------------------------------------------------------------
>
> Key: HADOOP-3414
> URL: https://issues.apache.org/jira/browse/HADOOP-3414
> Project: Hadoop Core
> Issue Type: Improvement
> Components: io
> Reporter: Arun C Murthy
>
> Currently we need to jump through hoops to get the 'raw length' of
> serializable types; e.g., SequenceFile.Writer.append needs to copy the
> key/value into a buffer and then check the buffer's size to figure out the
> record/key/value lengths. Obviously this can be improved to do away with the
> extra copy if we had types that could be queried for their raw length.
> Thoughts?