[ 
https://issues.apache.org/jira/browse/IGNITE-28847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-28847:
--------------------------------------
    Fix Version/s: 2.19

> REST (memcached): eliminate intermediate buffer copies when encoding responses
> ------------------------------------------------------------------------------
>
>                 Key: IGNITE-28847
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28847
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Anton Vinogradov
>            Assignee: Anton Vinogradov
>            Priority: Major
>             Fix For: 2.19
>
>
> h3. Problem
> {{GridTcpRestParser#encodeMemcache}} copies each response key/value payload 
> up to 4 times before it reaches the wire:
> # {{encodeObj}} writes the encoded bytes into an intermediate 
> {{ByteArrayOutputStream}} (a copy into the BAOS internal buffer, with growth 
> reallocations along the way);
> # {{ByteArrayOutputStream#toByteArray()}} produces another full copy;
> # that array is appended to a {{GridByteArrayList}} created with a capacity of 
> only {{HDR_LEN}} (24 bytes), so the list keeps doubling and re-copying its 
> internal array while the payload is appended;
> # {{GridByteArrayList#entireArray()}} makes a final trimming copy because 
> the internal array is larger than the actual packet.
> As a side effect, {{encodeMemcache}} also mutates the message being encoded 
> ({{msg.key(...)}} / {{msg.value(...)}} are overwritten with the serialized 
> {{byte[]}}).
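The copy chain listed above can be sketched in plain Java. This is a hypothetical stand-in, not the Ignite code: a hand-rolled doubling buffer takes the place of GridByteArrayList, its growth policy is an assumption, and the copies counter exists only to make the steps visible.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Hypothetical illustration of the legacy copy chain; not the Ignite implementation. */
final class LegacyCopyChainSketch {
    /** Header length quoted in the issue. */
    static final int HDR_LEN = 24;

    /** Full payload copies made by the last encode() call (bookkeeping only). */
    static int copies;

    static byte[] encode(byte[] payload) {
        copies = 0;

        // Copy 1: the payload lands in the stream's internal buffer.
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        baos.write(payload, 0, payload.length);
        copies++;

        // Copy 2: toByteArray() returns a newly allocated array.
        byte[] encoded = baos.toByteArray();
        copies++;

        // Copy 3: append to a buffer that starts at header size and doubles
        // until the payload fits (assumed growth policy).
        int size = HDR_LEN + encoded.length;
        int cap = HDR_LEN;
        while (cap < size)
            cap *= 2;
        byte[] buf = Arrays.copyOf(new byte[HDR_LEN], cap);
        System.arraycopy(encoded, 0, buf, HDR_LEN, encoded.length);
        copies++;

        // Copy 4: trim when the internal array is larger than the packet.
        if (buf.length != size) {
            buf = Arrays.copyOf(buf, size);
            copies++;
        }

        return buf;
    }

    public static void main(String[] args) {
        byte[] packet = encode(new byte[64]);
        System.out.println("packet=" + packet.length + " bytes, payload copies=" + copies);
    }
}
```

With a 64-byte payload the buffer in this sketch grows 24 → 48 → 96 bytes for an 88-byte packet, so all four copies occur. The trimming copy is skipped only when the doubled capacity happens to match the packet size exactly, which is why the issue says "up to 4".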
> h3. Change
> * {{encodeObj}} returns {{T2<byte[], Integer>}} (encoded bytes + type 
> flags) instead of writing into a caller-provided {{ByteArrayOutputStream}}. 
> For all fixed-width and {{String}}/{{byte[]}} payloads the encoded array is 
> produced directly; for JDK-serialized objects {{U.marshal(marsh, obj)}} is 
> used instead of marshalling into a BAOS.
> * {{encodeMemcache}} computes the exact packet size up front and allocates 
> {{GridByteArrayList(HDR_LEN + flagsLen + keyLen + dataLen)}}. The list never 
> grows, so {{entireArray()}} returns its internal array without copying. The 
> only remaining copy is the single append of key/value bytes into the packet 
> buffer.
> * The side-effecting mutation of {{msg.key()}} / {{msg.value()}} during 
> encoding is removed.
> Net effect for {{String}}/{{byte[]}} payloads: 4 full payload copies → 1.
> Wire format is *unchanged*: packets are byte-for-byte identical. This is 
> verified in the benchmark setup, where the old and new encoders are 
> replicated verbatim and their outputs are compared with {{Arrays.equals}} 
> for every payload; the benchmark aborts on any mismatch.
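A minimal sketch of the exact-size idea and the byte-for-byte check, assuming a simplified packet layout (zeroed 24-byte header, 4-byte big-endian flags, then key and value). The class and method names are hypothetical, a plain byte[] stands in for GridByteArrayList, and the fixed flags length is an assumption; the real encoder fills in the header fields and derives flagsLen itself.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch of exact-size packet assembly; not the Ignite implementation. */
final class ExactSizePacketSketch {
    static final int HDR_LEN = 24;  // header length quoted in the issue
    static final int FLAGS_LEN = 4; // assumed fixed here for simplicity

    /** Allocates the packet at its final size, so each payload byte is copied once. */
    static byte[] encodeExact(int flags, byte[] key, byte[] val) {
        byte[] packet = new byte[HDR_LEN + FLAGS_LEN + key.length + val.length];
        ByteBuffer buf = ByteBuffer.wrap(packet); // big-endian by default
        buf.position(HDR_LEN);                    // header left zeroed in this sketch
        buf.putInt(flags);
        buf.put(key);
        buf.put(val);
        return packet; // already the right length, no trimming copy
    }

    /** Reference encoder using a growable stream, used only for comparison. */
    static byte[] encodeViaStream(int flags, byte[] key, byte[] val) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(new byte[HDR_LEN], 0, HDR_LEN);
        out.write(flags >>> 24); // write(int) keeps the low 8 bits
        out.write(flags >>> 16);
        out.write(flags >>> 8);
        out.write(flags);
        out.write(key, 0, key.length);
        out.write(val, 0, val.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] key = "k1".getBytes(StandardCharsets.UTF_8);
        byte[][] payloads = {new byte[64], new byte[1024], new byte[8192]};
        for (byte[] val : payloads) {
            byte[] a = encodeExact(0x0100, key, val);
            byte[] b = encodeViaStream(0x0100, key, val);
            // Abort on any mismatch, mirroring the check described in the issue.
            if (!Arrays.equals(a, b))
                throw new IllegalStateException("Encoders disagree for payload of " + val.length + " bytes");
        }
        System.out.println("all packets identical");
    }
}
```

Comparing against a deliberately naive reference encoder keeps the equivalence check independent of the optimization being tested.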
> h3. Benchmark
> JMH 1.37, {{Mode.AverageTime}}, 1 fork, 3×1s warmup, 5×1s measurement, 
> {{-prof gc}}; Apple Silicon, JDK 17 (Amazon Corretto 17.0.11); key = short 
> {{String}}, payloads: 64-byte {{String}}, 1 KiB {{String}}, 8 KiB 
> {{byte[]}}, {{HashMap}} of 10 entries (JDK serialization).
> ||Payload||old, ns/op||new, ns/op||Time change||old, B/op||new, B/op||Alloc change||
> |STR_64|50.6 ± 0.8|21.3 ± 1.9|−58%|872|240|−72%|
> |STR_1K|208.9 ± 20.2|73.8 ± 9.4|−65%|6,632|2,160|−67%|
> |BYTES_8K|1,596.0 ± 110.4|433.0 ± 9.1|−73%|41,432|8,352|−80%|
> |OBJ_MAP|962.3 ± 138.8|1,094.3 ± 194.1|parity ^1^|5,528|4,184|−24%|
> ^1^ JDK-serialized payloads are dominated by serialization cost itself; the 
> time difference is within the error bars, while per-op allocations still drop.
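The percentage columns and the parity note can be rechecked from the table values. The helper below is a hypothetical sketch: it assumes the percentages are (new − old) / old rounded to the nearest integer, and it treats "within the error bars" as overlap of the mean ± error intervals, which is a rough reading rather than a formal significance test.

```java
/** Hypothetical helper for rechecking the benchmark table; not part of the patch. */
final class BenchTableCheck {
    /** Relative change from oldV to newV, in percent, rounded to the nearest integer. */
    static long pctChange(double oldV, double newV) {
        return Math.round((newV - oldV) / oldV * 100.0);
    }

    /** True when the intervals [m1 - e1, m1 + e1] and [m2 - e2, m2 + e2] overlap. */
    static boolean overlaps(double m1, double e1, double m2, double e2) {
        return m1 - e1 <= m2 + e2 && m2 - e2 <= m1 + e1;
    }

    public static void main(String[] args) {
        System.out.println("STR_64 time:    " + pctChange(50.6, 21.3) + "%");
        System.out.println("STR_1K time:    " + pctChange(208.9, 73.8) + "%");
        System.out.println("BYTES_8K time:  " + pctChange(1596.0, 433.0) + "%");
        System.out.println("BYTES_8K alloc: " + pctChange(41432, 8352) + "%");
        System.out.println("OBJ_MAP alloc:  " + pctChange(5528, 4184) + "%");
        System.out.println("OBJ_MAP time intervals overlap: "
            + overlaps(962.3, 138.8, 1094.3, 194.1));
    }
}
```

Under these assumptions the rounded values match the table, and the OBJ_MAP intervals (823.5 to 1101.1 ns versus 900.2 to 1288.4 ns) overlap, which is consistent with reporting parity for that row.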
> h3. Testing
> {{TcpRestParserSelfTest}}, {{RestMemcacheProtocolSelfTest}}, 
> {{ClientMemcachedProtocolSelfTest}}: 38/38 pass.



