[jira] [Updated] (IGNITE-17043) Performance degradation in Marshaller

2022-05-26 Thread Sergey Kosarev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Kosarev updated IGNITE-17043:

Description: 
There is a problem in the ignite-core code in GridHandleTable, which is used 
inside OptimizedMarshaller: its internal buffers grow in size and do not shrink 
back.
What is problematic in GridHandleTable? It is the reset() method, which fills 
the arrays in memory. Done once, it's not a big deal. Done a million times for a 
long buffer, it becomes really slow and CPU-consuming.
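
To make the cost of reset() concrete, here is a minimal sketch. It is not the actual GridHandleTable code: the class HandleTableSketch, its fields, and its methods are hypothetical names chosen for illustration. It only assumes what is stated above, namely that the arrays grow, never shrink, and are filled completely on every reset.

```java
import java.util.Arrays;

/** Hypothetical, simplified stand-in for a handle table; NOT Ignite's GridHandleTable. */
final class HandleTableSketch {
    /** Backing array; -1 marks an unused slot. It grows but never shrinks. */
    private int[] spine;

    /** Number of handles currently stored. */
    private int size;

    /** Total number of array slots written by reset() so far (illustration only). */
    private long slotsCleared;

    HandleTableSketch(int initialCapacity) {
        spine = new int[initialCapacity];
        Arrays.fill(spine, -1);
    }

    /** Registers one handle and doubles the array when it is full. */
    void add() {
        if (size == spine.length) {
            int oldLen = spine.length;
            spine = Arrays.copyOf(spine, oldLen * 2);
            Arrays.fill(spine, oldLen, spine.length, -1);
        }
        spine[size] = size;
        size++;
    }

    /** Clears the table. The cost depends on the capacity, not on the current size. */
    void reset() {
        Arrays.fill(spine, -1);
        slotsCleared += spine.length;
        size = 0;
    }

    int capacity() {
        return spine.length;
    }

    long slotsCleared() {
        return slotsCleared;
    }

    public static void main(String[] args) {
        HandleTableSketch tbl = new HandleTableSketch(16);

        // Small payload: one reset touches 16 slots.
        tbl.add();
        tbl.reset();
        System.out.println("capacity=" + tbl.capacity() + ", cleared=" + tbl.slotsCleared());
        // prints "capacity=16, cleared=16"

        // One large payload grows the array to 1048576 slots.
        for (int i = 0; i < 1_000_000; i++)
            tbl.add();
        tbl.reset();

        // The same small payload again: a single reset now touches the whole grown array.
        long before = tbl.slotsCleared();
        tbl.add();
        tbl.reset();
        System.out.println("capacity=" + tbl.capacity() + ", cleared by one reset="
            + (tbl.slotsCleared() - before));
        // prints "capacity=1048576, cleared by one reset=1048576"
    }
}
```

In this sketch a reset after a one-element payload writes as many slots as a reset after the million-element payload, which is the kind of per-call overhead described above.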

Marshalling the same object takes about 50 ms at first and, after the 
degradation, more than 100 seconds.

Here is a simple reproducer (imports omitted for brevity):

{code:title=DegradationReproducer.java|borderStyle=solid}
public class DegradationReproducer extends BinaryMarshallerSelfTest {
    @Test
    public void reproduce() throws Exception {
        // Small payload: ten singleton lists.
        List<List<Integer>> obj = IntStream.range(0, 10)
            .mapToObj(Collections::singletonList)
            .collect(Collectors.toList());

        // Baseline: marshalling the small payload is fast.
        for (int i = 0; i < 50; i++) {
            Assert.assertThat(measureMarshal(obj), Matchers.lessThan(1000L));
        }

        // Marshal one large payload: a singleton list wrapping a million strings.
        binaryMarshaller().marshal(
            Collections.singletonList(
                IntStream.range(0, 1000_000).mapToObj(String::valueOf).collect(Collectors.toList()))
        );

        // The same small payload again: this assertion fails after the large marshal.
        Assert.assertThat(measureMarshal(obj), Matchers.lessThan(1000L));
    }

    /** Marshals the object once and returns the elapsed time in milliseconds. */
    private long measureMarshal(List<List<Integer>> obj) throws IgniteCheckedException {
        info("marshalling started ");
        long millis = System.currentTimeMillis();

        binaryMarshaller().marshal(obj);

        millis = System.currentTimeMillis() - millis;

        info("marshalling finished in " + millis + " ms");

        return millis;
    }
}

{code}

On my machine the result is:
{quote}
.
[2022-05-26 20:58:27,178][INFO ][test-runner-#1%binary.DegradationReproducer%][root] marshalling finished in 39 ms
[2022-05-26 20:58:27,769][INFO ][test-runner-#1%binary.DegradationReproducer%][root] marshalling started 
[2022-05-26 21:02:03,588][INFO ][test-runner-#1%binary.DegradationReproducer%][root] marshalling finished in 215819 ms
[2022-05-26 21:02:03,593][ERROR][main][root] Test failed [test=DegradationReproducer#reproduce[useBinaryArrays = true], duration=218641]
java.lang.AssertionError: 
Expected: a value less than <1000L>
 but: <*215819L*> was greater than <1000L>

at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:956)
at org.junit.Assert.assertThat(Assert.java:923)
at org.apache.ignite.internal.binary.DegradationReproducer.reproduce(DegradationReproducer.java:27)
{quote}

  was:
There is a problem in the ignite-core code in GridHandleTable, which is used 
inside OptimizedMarshaller: its internal buffers grow in size and do not shrink 
back.
SingletonList is serialized with OptimizedMarshaller by default in Ignite. In 
contrast, for ArrayList serialization, BinaryMarshallerExImpl is used.
The difference between OptimizedMarshaller and BinaryMarshallerExImpl is that 
when OptimizedMarshaller starts to serialize an object node, all the descendant 
nodes continue to be serialized in OptimizedMarshaller using the same 
GridHandleTable associated with the current thread. GridHandleTable is static 
for a thread and never shrinks in size; its buffer only becomes larger over time.
BinaryMarshallerExImpl, though, can divert serialization to OptimizedMarshaller 
down the road.
What is problematic in GridHandleTable? It is the reset() method, which fills 
the arrays in memory. Done once, it's not a big deal. Done a million times for a 
long buffer, it becomes really slow and CPU-consuming.
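
The per-thread retention described above can be illustrated with a small sketch. It is not Ignite code: ThreadLocalTableSketch, Table, and marshal() are hypothetical names, and using a ThreadLocal is only an assumed way to model a table that is "static for a thread". The sketch shows that one large call leaves a large buffer behind for every later call on the same thread, while another thread still starts with a small one.

```java
import java.util.Arrays;

/** Hypothetical model of a per-thread table that grows and is never shrunk. */
final class ThreadLocalTableSketch {
    /** Growable buffer standing in for the handle table (illustration only). */
    static final class Table {
        int[] slots = new int[16];
        int size;

        void add(int v) {
            if (size == slots.length)
                slots = Arrays.copyOf(slots, slots.length * 2);
            slots[size++] = v;
        }

        /** Clears the whole backing array, whatever its current capacity is. */
        void reset() {
            Arrays.fill(slots, 0);
            size = 0;
        }
    }

    /** One table per thread, created lazily and reused by every call on that thread. */
    private static final ThreadLocal<Table> TABLE = ThreadLocal.withInitial(Table::new);

    /** Pretends to marshal n nested objects; returns the capacity left behind on this thread. */
    static int marshal(int n) {
        Table t = TABLE.get();
        t.reset();
        for (int i = 0; i < n; i++)
            t.add(i);
        return t.slots.length;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(marshal(10));        // prints 16
        System.out.println(marshal(1_000_000)); // prints 1048576
        System.out.println(marshal(10));        // prints 1048576: the buffer stays large

        // A different thread gets its own, still small, table.
        Thread other = new Thread(() -> System.out.println(marshal(10))); // prints 16
        other.start();
        other.join();
    }
}
```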


[jira] [Updated] (IGNITE-17043) Performance degradation in Marshaller

2022-05-26 Thread Sergey Kosarev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Kosarev updated IGNITE-17043:

Description: 
There is a problem in the ignite-core code in GridHandleTable, which is used 
inside OptimizedMarshaller: its internal buffers grow in size and do not shrink 
back.
SingletonList is serialized with OptimizedMarshaller by default in Ignite. In 
contrast, for ArrayList serialization, BinaryMarshallerExImpl is used.
The difference between OptimizedMarshaller and BinaryMarshallerExImpl is that 
when OptimizedMarshaller starts to serialize an object node, all the descendant 
nodes continue to be serialized in OptimizedMarshaller using the same 
GridHandleTable associated with the current thread. GridHandleTable is static 
for a thread and never shrinks in size; its buffer only becomes larger over time.
BinaryMarshallerExImpl, though, can divert serialization to OptimizedMarshaller 
down the road.
What is problematic in GridHandleTable? It is the reset() method, which fills 
the arrays in memory. Done once, it's not a big deal. Done a million times for a 
long buffer, it becomes really slow and CPU-consuming.

Here is a simple reproducer (imports omitted for brevity):
{code:title=DegradationReproducer.java|borderStyle=solid}
public class DegradationReproducer extends BinaryMarshallerSelfTest {
    @Test
    public void reproduce() throws Exception {
        // Small payload: ten singleton lists.
        List<List<Integer>> obj = IntStream.range(0, 10)
            .mapToObj(Collections::singletonList)
            .collect(Collectors.toList());

        // Baseline: marshalling the small payload is fast.
        for (int i = 0; i < 50; i++) {
            Assert.assertThat(measureMarshal(obj), Matchers.lessThan(1000L));
        }

        // Marshal one large payload: a singleton list wrapping a million strings.
        binaryMarshaller().marshal(
            Collections.singletonList(
                IntStream.range(0, 1000_000).mapToObj(String::valueOf).collect(Collectors.toList()))
        );

        // The same small payload again: this assertion fails after the large marshal.
        Assert.assertThat(measureMarshal(obj), Matchers.lessThan(1000L));
    }

    /** Marshals the object once and returns the elapsed time in milliseconds. */
    private long measureMarshal(List<List<Integer>> obj) throws IgniteCheckedException {
        info("marshalling started ");
        long millis = System.currentTimeMillis();

        binaryMarshaller().marshal(obj);

        millis = System.currentTimeMillis() - millis;

        info("marshalling finished in " + millis + " ms");

        return millis;
    }
}

{code}

On my machine the result is:
{quote}
.
[2022-05-26 20:58:27,178][INFO ][test-runner-#1%binary.DegradationReproducer%][root] marshalling finished in 39 ms
[2022-05-26 20:58:27,769][INFO ][test-runner-#1%binary.DegradationReproducer%][root] marshalling started 
[2022-05-26 21:02:03,588][INFO ][test-runner-#1%binary.DegradationReproducer%][root] marshalling finished in 215819 ms
[2022-05-26 21:02:03,593][ERROR][main][root] Test failed [test=DegradationReproducer#reproduce[useBinaryArrays = true], duration=218641]
java.lang.AssertionError: 
Expected: a value less than <1000L>
 but: <215819L> was greater than <1000L>

at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:956)
at org.junit.Assert.assertThat(Assert.java:923)
at org.apache.ignite.internal.binary.DegradationReproducer.reproduce(DegradationReproducer.java:27)
{quote}

  was:
There is a problem in the ignite-core code in GridHandleTable, which is used 
inside OptimizedMarshaller: its internal buffers grow in size and do not shrink 
back.
SingletonList is serialized with OptimizedMarshaller by default in Ignite. In 
contrast, for ArrayList serialization, BinaryMarshallerExImpl is used.
The difference between OptimizedMarshaller and BinaryMarshallerExImpl is that 
when OptimizedMarshaller starts to serialize an object node, all the descendant 
nodes continue to be serialized in OptimizedMarshaller using the same 
GridHandleTable associated with the current thread. GridHandleTable is static 
for a thread and never shrinks in size; its buffer only becomes larger over time.
BinaryMarshallerExImpl, though, can divert serialization to OptimizedMarshaller 
down the road.
What is problematic in GridHandleTable? It is the reset() method, which fills 
the arrays in memory. Done once, it's not a big deal. Done a million times for a 
long buffer, it becomes really slow and CPU-consuming.




> Performance degradation in Marshaller
> -------------------------------------
>
> Key: IGNITE-17043
> URL: https://issues.apache.org/jira/browse/IGNITE-17043
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 2.13, 2.14
> Reporter: Sergey Kosarev
> Priority: Major
>
> There is a problem in the ignite-core code in GridHandleTable, which is used 
> inside OptimizedMarshaller: its internal buffers grow in size and do not 
> shrink back.
> SingletonList is serialized with OptimizedMarshaller by default in Ignite. In 
> contrast, for ArrayList serialization, BinaryMarshallerExImpl is used.
> The difference