huxinqiu commented on PR #5151: URL: https://github.com/apache/hadoop/pull/5151#issuecomment-1321736995
> @huxinqiu Thank you very much for your contribution!
>
> We need to discuss something:
>
> 1. It seems that the benefit is avoiding the declaration of the `ResponseBuffer` variable, which carries an initial 1024-byte array; the original internal calculation code is then moved directly outside.
>
> Modified code:
>
> ```java
> int computedSize = connectionContextHeader.getSerializedSize();
> computedSize += CodedOutputStream.computeUInt32SizeNoTag(computedSize);
> int messageSize = message.getSerializedSize();
> computedSize += messageSize;
> computedSize += CodedOutputStream.computeUInt32SizeNoTag(messageSize);
> byte[] dataLengthBuffer = new byte[4];
> dataLengthBuffer[0] = (byte) ((computedSize >>> 24) & 0xFF);
> dataLengthBuffer[1] = (byte) ((computedSize >>> 16) & 0xFF);
> dataLengthBuffer[2] = (byte) ((computedSize >>> 8) & 0xFF);
> dataLengthBuffer[3] = (byte) (computedSize & 0xFF);
> ```
>
> The original calculation code, in `connectionContextHeader.writeDelimitedTo(buf)`, looks like this:
>
> ```java
> int serialized = this.getSerializedSize();
> int bufferSize = CodedOutputStream.computePreferredBufferSize(
>     CodedOutputStream.computeRawVarint32Size(serialized) + serialized);
> CodedOutputStream codedOutput = CodedOutputStream.newInstance(output, bufferSize);
> codedOutput.writeRawVarint32(serialized);
> this.writeTo(codedOutput);
> codedOutput.flush();
> ```
>
> `ResponseBuffer#setSize`:
>
> ```java
> @Override
> public int size() {
>   return count - FRAMING_BYTES;
> }
>
> void setSize(int size) {
>   buf[0] = (byte) ((size >>> 24) & 0xFF);
>   buf[1] = (byte) ((size >>> 16) & 0xFF);
>   buf[2] = (byte) ((size >>> 8) & 0xFF);
>   buf[3] = (byte) ((size >>> 0) & 0xFF);
> }
> ```
>
> 2. Code duplication: the following calculation logic appears 3 times, in the `RpcProtobufRequestWithHeader` constructor, `SaslRpcClient#sendSaslMessage`, and `Client#writeConnectionContext`:
>
> ```java
> this.dataLengthBuffer = new byte[4];
> dataLengthBuffer[0] = (byte) ((computedSize >>> 24) & 0xFF);
> dataLengthBuffer[1] = (byte) ((computedSize >>> 16) & 0xFF);
> dataLengthBuffer[2] = (byte) ((computedSize >>> 8) & 0xFF);
> dataLengthBuffer[3] = (byte) (computedSize & 0xFF);
> this.header = header;
> this.rpcRequest = rpcRequest;
> ```

@slfan1989

1. Yes. `IpcStreams#out` is a `BufferedOutputStream`, which already holds an internal byte array, and protobuf's `CodedOutputStream` likewise keeps an internal byte-array cache to optimize writing, so we don't need to aggregate the data length, header, and rpcRequest into a `ResponseBuffer`, which is really just a byte array. The only extra cost in the `RpcRequestSender` thread is the protobuf serialization; since a request is usually only a few hundred bytes, serialization takes just tens of microseconds. So I think it is better to first calculate the request size and then write the data length, header, and rpcRequest to the `BufferedOutputStream` one by one, avoiding the allocation of a 1024-byte array for each request.

2. I'll fix this code duplication afterwards.
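To make the size arithmetic above concrete, here is a minimal, self-contained sketch in plain Java. `computeUInt32SizeNoTag` reimplements the varint-size calculation that protobuf's `CodedOutputStream` performs, and `lengthPrefix` reproduces the big-endian framing done by `ResponseBuffer#setSize`; the class name `FramingSketch` and the method grouping are illustrative only, not existing Hadoop or protobuf APIs.

```java
public class FramingSketch {
    // Number of bytes needed to encode a non-negative int as an unsigned
    // varint, mirroring protobuf's size calculation: 7 payload bits per byte.
    public static int computeUInt32SizeNoTag(int value) {
        if ((value & (~0 << 7)) == 0)  return 1;
        if ((value & (~0 << 14)) == 0) return 2;
        if ((value & (~0 << 21)) == 0) return 3;
        if ((value & (~0 << 28)) == 0) return 4;
        return 5;
    }

    // Total payload size for two delimited messages: each message is
    // preceded by a varint encoding of its own serialized size.
    public static int framedSize(int headerSize, int messageSize) {
        return computeUInt32SizeNoTag(headerSize) + headerSize
             + computeUInt32SizeNoTag(messageSize) + messageSize;
    }

    // 4-byte big-endian length prefix, as built by ResponseBuffer#setSize.
    public static byte[] lengthPrefix(int size) {
        return new byte[] {
            (byte) (size >>> 24),
            (byte) (size >>> 16),
            (byte) (size >>> 8),
            (byte) size
        };
    }

    public static void main(String[] args) {
        // e.g. a 100-byte header (1-byte varint) + 300-byte request (2-byte varint)
        int total = framedSize(100, 300); // 1 + 100 + 2 + 300 = 403
        System.out.println(total);
        System.out.println(lengthPrefix(total).length);
    }
}
```

With this framing, the sender can write the 4-byte prefix, then the two delimited messages, straight into the buffered stream, which is exactly the allocation-free path argued for above.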
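As for the duplication in point 2, one possible shape for a shared helper is sketched below. The names `LengthFramer`, `writeFrameLength`, and `frameLength` are hypothetical, not existing Hadoop code; the point is that `java.io.DataOutputStream#writeInt` already emits exactly the 4 big-endian bytes that the three call sites currently build by hand with shifts.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class LengthFramer {
    // Hypothetical shared helper: write the 4-byte big-endian frame length.
    // DataOutputStream#writeInt is specified to write big-endian, matching
    // the manual (size >>> 24), (size >>> 16), (size >>> 8), size shifts.
    public static void writeFrameLength(OutputStream out, int size)
            throws IOException {
        new DataOutputStream(out).writeInt(size);
    }

    // Convenience form returning the prefix as a byte array.
    public static byte[] frameLength(int size) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try {
            writeFrameLength(bos, size);
        } catch (IOException e) {
            throw new AssertionError(e); // cannot happen for in-memory stream
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] b = frameLength(0x01020304);
        // bytes come out big-endian: 1, 2, 3, 4
        System.out.println(b[0] + " " + b[1] + " " + b[2] + " " + b[3]);
    }
}
```

The three duplicated sites listed above could then call the one helper before streaming the header and request.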