clesaec commented on code in PR #2521:
URL: https://github.com/apache/avro/pull/2521#discussion_r1361760317

##########
lang/java/avro/src/main/java/org/apache/avro/io/BlockingDirectBinaryEncoder.java:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.avro.io;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.ByteBuffer;
+
+/**
+ * An {@link Encoder} for Avro's binary encoding that does not buffer output.
+ * <p/>
+ * This encoder does not buffer writes, and as a result is slower than
+ * {@link BufferedBinaryEncoder}. However, it is lighter-weight and useful when
+ * the buffering in BufferedBinaryEncoder is not desired and/or the Encoder is
+ * very short-lived.
+ * <p/>
+ * To construct, use
+ * {@link EncoderFactory#blockingDirectBinaryEncoder(OutputStream, BinaryEncoder)}
+ * <p/>
+ * BlockingDirectBinaryEncoder is not thread-safe
+ *
+ * @see BinaryEncoder
+ * @see EncoderFactory
+ * @see Encoder
+ * @see Decoder
+ */
+public class BlockingDirectBinaryEncoder extends DirectBinaryEncoder {
+  private static final ThreadLocal<BufferOutputStream> BUFFER = ThreadLocal.withInitial(BufferOutputStream::new);

Review Comment:
   Why use a "static ThreadLocal" here rather than a plain instance field?
   BinaryEncoder instances are not thread-safe, but you can have two encoders on one thread that write data alternately, and they would then share this buffer.



##########
lang/java/avro/src/main/java/org/apache/avro/io/BlockingDirectBinaryEncoder.java:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.avro.io;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.ByteBuffer;
+
+/**
+ * An {@link Encoder} for Avro's binary encoding that does not buffer output.
+ * <p/>
+ * This encoder does not buffer writes, and as a result is slower than
+ * {@link BufferedBinaryEncoder}. However, it is lighter-weight and useful when
+ * the buffering in BufferedBinaryEncoder is not desired and/or the Encoder is
+ * very short-lived.
+ * <p/>
+ * To construct, use
+ * {@link EncoderFactory#blockingDirectBinaryEncoder(OutputStream, BinaryEncoder)}
+ * <p/>
+ * BlockingDirectBinaryEncoder is not thread-safe
+ *
+ * @see BinaryEncoder
+ * @see EncoderFactory
+ * @see Encoder
+ * @see Decoder
+ */
+public class BlockingDirectBinaryEncoder extends DirectBinaryEncoder {
+  private static final ThreadLocal<BufferOutputStream> BUFFER = ThreadLocal.withInitial(BufferOutputStream::new);
+
+  private OutputStream originalStream;
+
+  private boolean inBlock = false;
+
+  private long blockItemCount;
+
+  /**
+   * Create a writer that sends its output to the underlying stream
+   * <code>out</code>.
+   *
+   * @param out The Outputstream to write to
+   */
+  public BlockingDirectBinaryEncoder(OutputStream out) {
+    super(out);
+  }
+
+  private void startBlock() {
+    if (inBlock) {
+      throw new RuntimeException("Nested Maps/Arrays are not supported by the BlockingDirectBinaryEncoder");

Review Comment:
   Turning BUFFER into a stack of output streams would make nested maps/arrays possible, though that is not mandatory :).

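
To make the two buffer suggestions above concrete (an instance field instead of a static ThreadLocal, and a stack of buffers so blocks can nest), here is a minimal sketch using only the JDK. `NestedBlockSketch` and all of its methods are hypothetical names invented for illustration; this is not the PR's implementation and not an Avro API. It assumes the standard Avro block layout: a negative item count, the block size in bytes, the payload, then a zero count to end the array.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical helper for illustration only; not part of Avro or of this PR.
final class NestedBlockSketch {
  private final OutputStream out;
  // Instance field, so two encoders used alternately on one thread never share a buffer.
  // One buffer per open block, innermost block on top of the stack.
  private final Deque<ByteArrayOutputStream> openBlocks = new ArrayDeque<>();

  NestedBlockSketch(OutputStream out) {
    this.out = out;
  }

  // Writes go to the innermost open block, or straight to the stream if none is open.
  private OutputStream current() {
    return openBlocks.isEmpty() ? out : openBlocks.peek();
  }

  void startBlock() {
    openBlocks.push(new ByteArrayOutputStream());
  }

  void writeLong(long n) {
    putLong(current(), n);
  }

  // Closes the innermost block and copies it, size-prefixed, into its parent.
  void endBlock(long itemCount) {
    ByteArrayOutputStream finished = openBlocks.pop();
    OutputStream target = current();
    try {
      if (itemCount > 0) {
        putLong(target, -itemCount);      // negative count: a byte size follows
        putLong(target, finished.size()); // block size in bytes
        finished.writeTo(target);
      }
      target.write(0);                    // zero count terminates the array
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Zigzag + variable-length encoding, as used by Avro for int and long.
  private static void putLong(OutputStream o, long n) {
    try {
      long z = (n << 1) ^ (n >> 63);
      while ((z & ~0x7FL) != 0) {
        o.write((int) ((z & 0x7F) | 0x80));
        z >>>= 7;
      }
      o.write((int) z);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    NestedBlockSketch enc = new NestedBlockSketch(sink);
    enc.startBlock();  // outer array
    enc.startBlock();  // inner array, one item of the outer array
    enc.writeLong(1);  // the int 1
    enc.endBlock(1);
    enc.endBlock(1);
    System.out.println(Arrays.toString(sink.toByteArray())); // [1, 8, 1, 2, 2, 0, 0]
  }
}
```

The inner array serializes to the same `{ 1, 2, 2, 0 }` the test in this PR expects for a single-int array; the outer block then wraps those four bytes with its own count and size. Whether the extra allocation per nesting level is acceptable for this encoder is a separate question.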

##########
lang/java/avro/src/test/java/org/apache/avro/io/TestBinaryEncoderFidelity.java:
##########
@@ -181,6 +181,50 @@ void directBinaryEncoder() throws IOException {
     assertArrayEquals(complexdata, result2);
   }
+  @Test
+  void blockingDirectBinaryEncoder() throws IOException {
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();
+    BinaryEncoder e = factory.blockingDirectBinaryEncoder(baos, null);
+    generateData(e, true);
+
+    byte[] result = baos.toByteArray();
+    assertEquals(legacydata.length, result.length);
+    assertArrayEquals(legacydata, result);
+    baos.reset();
+
+    generateComplexData(e);
+    byte[] result2 = baos.toByteArray();
+    // blocking will cause different length, should be two bytes larger
+    assertEquals(complexdata.length + 2, result2.length);
+    // the first byte is the array start, with the count of items negative
+    assertEquals(complexdata[0] >>> 1, result2[0]);
+    baos.reset();
+
+    e.writeArrayStart();
+    e.setItemCount(1);
+    e.startItem();
+    e.writeInt(1);
+    e.writeArrayEnd();
+
+    // 1: 1 element in the array
+    // 2: 1 byte for the int
+    // 3: zigzag encoded int
+    // 4: 0 elements in the next block
+    assertArrayEquals(baos.toByteArray(), new byte[] { 1, 2, 2, 0 });
+    baos.reset();
+
+    e.writeArrayStart();
+    e.setItemCount(0);
+    e.writeArrayEnd();

Review Comment:
   Could you test these last two byte arrays with a binary decoder, to make sure it can read what this new encoder produces?
   (If it can't, create a specific decoder class.)



##########
lang/java/avro/src/test/java/org/apache/avro/io/TestBinaryEncoderFidelity.java:
##########
@@ -181,6 +181,50 @@ void directBinaryEncoder() throws IOException {
     assertArrayEquals(complexdata, result2);
   }
+  @Test
+  void blockingDirectBinaryEncoder() throws IOException {
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();
+    BinaryEncoder e = factory.blockingDirectBinaryEncoder(baos, null);
+    generateData(e, true);
+
+    byte[] result = baos.toByteArray();
+    assertEquals(legacydata.length, result.length);
+    assertArrayEquals(legacydata, result);
+    baos.reset();
+
+    generateComplexData(e);
+    byte[] result2 = baos.toByteArray();
+    // blocking will cause different length, should be two bytes larger
+    assertEquals(complexdata.length + 2, result2.length);
+    // the first byte is the array start, with the count of items negative
+    assertEquals(complexdata[0] >>> 1, result2[0]);
+    baos.reset();
+
+    e.writeArrayStart();
+    e.setItemCount(1);
+    e.startItem();
+    e.writeInt(1);
+    e.writeArrayEnd();
+
+    // 1: 1 element in the array
+    // 2: 1 byte for the int
+    // 3: zigzag encoded int
+    // 4: 0 elements in the next block
+    assertArrayEquals(baos.toByteArray(), new byte[] { 1, 2, 2, 0 });
+    baos.reset();
+
+    e.writeArrayStart();
+    e.setItemCount(0);
+    e.writeArrayEnd();
+
+    // This is correct
+    // 0: 0 elements in the block
+    assertArrayEquals(baos.toByteArray(), new byte[] { 0 });
+    baos.reset();

Review Comment:
   Could you add a test where an array (or map) is skipped? If I understood correctly, that is done by calling setItemCount(0) before ending the array (or did I miss the purpose of this?):
   
   ```java
       e.writeArrayStart();
       e.setItemCount(1);
       e.startItem();
       e.writeInt(1);
       e.setItemCount(0);  // here, to skip the array ??
       e.writeArrayEnd();
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
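
As a reference point for the decoder round-trip and skipping tests requested in the review comments above, the following JDK-only sketch reads the two byte arrays asserted in the test, `{ 1, 2, 2, 0 }` and `{ 0 }`. `BlockedIntArrayReader` is a hypothetical class written for illustration, not Avro's `BinaryDecoder`; it assumes the block layout from the Avro specification, where a negative count is followed by the block's size in bytes and that size lets a reader skip the block without decoding its items. A real test in the PR would presumably use Avro's own decoder instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for illustration only; not Avro's BinaryDecoder.
final class BlockedIntArrayReader {
  private final byte[] buf;
  private int pos;

  BlockedIntArrayReader(byte[] buf) {
    this.buf = buf;
  }

  int position() {
    return pos;
  }

  // Variable-length zigzag decoding, as used by Avro for int and long.
  long readLong() {
    long z = 0;
    int shift = 0;
    int b;
    do {
      b = buf[pos++] & 0xFF;
      z |= (long) (b & 0x7F) << shift;
      shift += 7;
    } while ((b & 0x80) != 0);
    return (z >>> 1) ^ -(z & 1);
  }

  // Reads an array of ints written as one or more blocks, ending with a zero count.
  List<Integer> readIntArray() {
    List<Integer> items = new ArrayList<>();
    for (long count = readLong(); count != 0; count = readLong()) {
      if (count < 0) {
        count = -count;
        readLong(); // block size in bytes; not needed when decoding every item
      }
      for (long i = 0; i < count; i++) {
        items.add((int) readLong());
      }
    }
    return items;
  }

  // Skips the array; sized blocks are jumped over without decoding their items.
  void skipIntArray() {
    for (long count = readLong(); count != 0; count = readLong()) {
      if (count < 0) {
        pos += (int) readLong(); // jump by the declared block size
      } else {
        for (long i = 0; i < count; i++) {
          readLong();            // no size available: decode and discard
        }
      }
    }
  }

  public static void main(String[] args) {
    System.out.println(new BlockedIntArrayReader(new byte[] { 1, 2, 2, 0 }).readIntArray()); // [1]
    System.out.println(new BlockedIntArrayReader(new byte[] { 0 }).readIntArray());          // []
  }
}
```

Under this reading, skipping is something the reader does using the size prefix, so a skip test would exercise the decoder side rather than an extra `setItemCount(0)` call on the encoder.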
