This PR is irrelevant and will be withdrawn: a bug in my testing made it look
as if Java serialization was more efficient.
My mutations were 10 columns of 10K strings... but every value was _the same_
10K String object.
i.e.:
String stringValue = new String( /* 10K char array */);
Mutation m = Mutation.newInsertOrUpdateBuilder("table1")
    .set("key").to(UUID.randomUUID().toString())
    .set("value0").to(stringValue)
    .set("value1").to(stringValue)
    .set("value2").to(stringValue)
    // etc
So when the custom serializer encoded this, it produced a ~100K byte array.
Java serialization was being clever: it saw only one String object to
serialize, wrote it once, and produced a ~10K byte array.
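
For anyone who wants to reproduce the effect without Spanner in the picture, here is a minimal sketch using only the JDK. It is not the benchmark from this PR: the class name `SharedStringSerializationDemo` and both helper methods are made up for illustration, and a plain `List<String>` stands in for the mutation's column values. It compares serializing ten references to one String instance against ten distinct String instances with identical contents.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of Beam or of this PR.
final class SharedStringSerializationDemo {

  /** Serializes the list with standard Java serialization and returns the byte count. */
  static int serializedSize(List<String> values) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(values);
    }
    return bytes.size();
  }

  /** Builds a new 10K-character String instance filled with one ASCII character. */
  static String tenKString(char fill) {
    char[] chars = new char[10_000];
    Arrays.fill(chars, fill);
    return new String(chars);
  }

  public static void main(String[] args) throws IOException {
    String shared = tenKString('a');
    List<String> sameInstance = new ArrayList<>();
    List<String> distinctInstances = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      // Same object ten times: written once, then referenced by handle.
      sameInstance.add(shared);
      // Equal contents but a fresh object each time: written in full each time.
      distinctInstances.add(tenKString('a'));
    }
    System.out.println("same instance x10:      " + serializedSize(sameInstance) + " bytes");
    System.out.println("distinct instances x10: " + serializedSize(distinctInstances) + " bytes");
  }
}
```

The first figure should come out at roughly 10K bytes and the second at roughly 100K, because `ObjectOutputStream` tracks already-written objects by identity and emits a short back-reference for repeats. A fair comparison against the custom serializer therefore needs a distinct value per column.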
[ Full content available at: https://github.com/apache/beam/pull/6407 ]
This message was relayed via gitbox.apache.org for [email protected]