[GitHub] [beam] nielm commented on issue #6407: [BEAM-5404] Use Java serialization for MutationGroup objects.

GitHub Fri, 21 Sep 2018 08:24:33 -0700

This PR is irrelevant and will be withdrawn: a bug in my testing made it look 
as though Java serialization was more efficient.


My mutations had 10 columns of 10K-character strings... but every value was 
_the same_ 10K String object, i.e.:

String stringValue = new String( /* 10K char array */);

Mutation m = Mutation.newInsertOrUpdateBuilder("table1")
    .set("key").to(UUID.randomUUID().toString())
    .set("value0").to(stringValue)
    .set("value1").to(stringValue)
    .set("value2").to(stringValue)
    // etc. (remaining columns set to the same stringValue)
    .build();

So when the custom serializer encoded this, it produced a ~100K byte array. 
Java serialization was being clever: it saw only one String object to 
serialize, wrote it once, and produced a ~10K byte array...
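
The effect can be reproduced without Spanner or Beam. The following is a minimal sketch, not the test from the PR: the class name SharedReferenceDemo and its helper methods are hypothetical, and a plain ArrayList of strings stands in for the mutation's column values. It relies only on the standard behaviour of ObjectOutputStream, which tracks objects by identity and writes a short back-reference when the same object appears again in a stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: compares Java-serialized sizes of a list holding
// one shared String object ten times vs. ten distinct but equal Strings.
class SharedReferenceDemo {

    // Serializes the list with plain Java serialization and returns the byte count.
    static int serializedSize(List<String> values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ArrayList<>(values));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Builds a fresh 10,000-character String object on every call.
    static String tenKString() {
        char[] chars = new char[10_000];
        Arrays.fill(chars, 'x');
        return new String(chars);
    }

    public static void main(String[] args) {
        String shared = tenKString();
        List<String> sameObject = new ArrayList<>();
        List<String> distinctObjects = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            sameObject.add(shared);            // one object, referenced ten times
            distinctObjects.add(tenKString()); // ten separate objects, equal content
        }
        // The first size is a little over 10K bytes, the second a little over 100K.
        System.out.println("same object:      " + serializedSize(sameObject));
        System.out.println("distinct objects: " + serializedSize(distinctObjects));
    }
}
```

With distinct String objects per column, which is closer to real data, Java serialization loses this advantage, so a benchmark that reuses one value object overstates its efficiency.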


[ Full content available at: https://github.com/apache/beam/pull/6407 ]
This message was relayed via gitbox.apache.org for [email protected]
