[jira] [Updated] (IGNITE-12543) When put List<List<SomeObject>>, the data was increased much larger.

LEE PYUNG BEOM (Jira) Wed, 15 Jan 2020 17:50:29 -0800


     [ 
https://issues.apache.org/jira/browse/IGNITE-12543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


LEE PYUNG BEOM updated IGNITE-12543:
------------------------------------
    Description: 
I use the Java Thin Client from Ignite 2.6.

When I put data of the form List<List<SomeObject>>, the original 200KB of data 
had grown to 50MB when queried on the Ignite servers.

In the heap dump, the list elements were accumulated repeatedly, which 
increased the data size.

I checked the doWriteBinaryObject() method in 
org.apache.ignite.internal.binary.BinaryWriterExImpl.java:
{code:java}
// org.apache.ignite.internal.binary.BinaryWriterExImpl.java

    public void doWriteBinaryObject(@Nullable BinaryObjectImpl po) {
        if (po == null)
            out.writeByte(GridBinaryMarshaller.NULL);
        else {
            // po.array() is the whole backing array, not only this object's bytes.
            byte[] poArr = po.array();
            // Reserve: 1 type byte + 4-byte length + the array + 4-byte start offset.
            out.unsafeEnsure(1 + 4 + poArr.length + 4);
            out.unsafeWriteByte(GridBinaryMarshaller.BINARY_OBJ);
            out.unsafeWriteInt(poArr.length);
            out.writeByteArray(poArr);
            // Offset of this object within the array.
            out.unsafeWriteInt(po.start());
        }
    }
{code}
 

The current Ignite implementation handles data of the form 
List<List<Some_Object>> as follows.

Take as an example, in the marshalling stage, data shaped like List(5 
members)<List(10 members)<Some_Object(size: 200KB)>>:

As many as 10*5 of the list's elements are duplicated.

If the data consists of five lists, each holding ten objects of 200KB, 
50 copies are stored and 200K*10**5 = 100MB of data is used for cache and 
transfer.
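
As a rough illustration of the arithmetic above, the following sketch models 
the byte count only; it does not use any Ignite classes. It assumes (this is 
not stated in the report) that the ten objects of each inner list share one 
backing array, so that each doWriteBinaryObject() call writes that whole 
array. The class and method names are hypothetical.

```java
// Illustrative model only; no Ignite classes are involved.
final class BackingArrayAmplification {
    // Per-object framing in doWriteBinaryObject():
    // 1 type byte + 4-byte array length + 4-byte start offset.
    static final int OVERHEAD = 1 + 4 + 4;

    /**
     * Bytes written when every element emits the whole backing array.
     * Assumption: one backing array per inner list, shared by its elements.
     */
    static long wholeArrayBytes(int outer, int inner, int objSize) {
        long backingArray = (long) inner * objSize;
        return (long) outer * inner * (OVERHEAD + backingArray);
    }

    /** Bytes written if every element emitted only its own slice. */
    static long sliceOnlyBytes(int outer, int inner, int objSize) {
        return (long) outer * inner * (OVERHEAD + (long) objSize);
    }

    public static void main(String[] args) {
        int outer = 5;          // List(5 members)
        int inner = 10;         // List(10 members)
        int objSize = 200_000;  // 200KB, counted as 200,000 bytes

        System.out.println("whole array: " + wholeArrayBytes(outer, inner, objSize));
        System.out.println("slice only:  " + sliceOnlyBytes(outer, inner, objSize));
    }
}
```

Under that assumption the model gives about 100MB (100,000,450 bytes) for the 
whole-array layout, against about 10MB (10,000,450 bytes) if each object wrote 
only its own 200KB.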

This growth in data size occupies heap memory, and we have confirmed that it 
causes failures such as OOM and GC problems.

Redundant data is stored in the cache and sent over the network unnecessarily.

When cache data is looked up, only part of the stored data is read, located by 
its position information within the whole, so the correct data is 
retrieved.
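
To illustrate why reads can still return correct data despite the redundancy, 
here is a minimal model of the layout written by doWriteBinaryObject(): the 
whole array followed by the start offset. The reader jumps to the offset and 
takes only the object's bytes. The class name, the type marker value, and 
passing the object length as a parameter are assumptions made for this sketch; 
it is not Ignite's reader code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative model only; no Ignite classes are involved.
final class SliceReadBack {
    /** Writes [type byte][array length][whole array][start offset]. */
    static byte[] write(byte[] backing, int start) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + backing.length + 4);
        buf.put((byte) 1);           // placeholder type marker (hypothetical value)
        buf.putInt(backing.length);
        buf.put(backing);            // the entire array, not just one object
        buf.putInt(start);           // where this object begins in the array
        return buf.array();
    }

    /** Reads back only the bytes [start, start + len) of the embedded array. */
    static byte[] readSlice(byte[] written, int len) {
        ByteBuffer buf = ByteBuffer.wrap(written);
        buf.get();                   // skip the type marker
        byte[] backing = new byte[buf.getInt()];
        buf.get(backing);
        int start = buf.getInt();
        return Arrays.copyOfRange(backing, start, start + len);
    }

    public static void main(String[] args) {
        // Three 4-byte "objects" sharing one backing array.
        byte[] backing = "AAAABBBBCCCC".getBytes(StandardCharsets.US_ASCII);
        byte[] written = write(backing, 4);
        byte[] slice = readSlice(written, 4);

        System.out.println(new String(slice, StandardCharsets.US_ASCII));
        System.out.println(written.length + " bytes written for " + slice.length + " bytes of payload");
    }
}
```

In this model the second object is read back intact, although 21 bytes were 
written to carry 4 bytes of payload.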

The current implementation is therefore functionally correct, but it wastes 
memory and network bandwidth through an inefficient algorithm.

This can have very serious consequences. Please check.

 

 


> When put List<List<SomeObject>>, the data was increased much larger.
> --------------------------------------------------------------------
>
>                 Key: IGNITE-12543
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12543
>             Project: Ignite
>          Issue Type: Bug
>          Components: thin client
>    Affects Versions: 2.6
>            Reporter: LEE PYUNG BEOM
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
