Re: IGNITE-13

Вадим Опольский Wed, 01 Mar 2017 06:18:54 -0800

Hi Valentin!

Thank you for the comments.


There is a new method which writes directly to BinaryOutputStream instead
of an intermediate array:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
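
For illustration, here is a minimal sketch of the direct-write idea. It is
not the code from BinaryUtilsNew.java: java.io.ByteArrayOutputStream stands in
for Ignite's BinaryOutputStream, and the names DirectUtf8Sketch and
writeUtf8Direct are hypothetical. It uses plain standard UTF-8, which may
differ from what BinaryUtils#strToUtf8Bytes does.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class DirectUtf8Sketch {
    /**
     * Encodes val as UTF-8 and writes every byte straight to out,
     * so no intermediate byte[] holding the whole string is allocated.
     * Assumption: val contains no unpaired surrogates (those would be
     * written as 3 bytes here, unlike String.getBytes).
     */
    static void writeUtf8Direct(String val, ByteArrayOutputStream out) {
        for (int i = 0; i < val.length(); ) {
            int cp = val.codePointAt(i);
            i += Character.charCount(cp);

            if (cp < 0x80) {
                // 1 byte: 0xxxxxxx
                out.write(cp);
            } else if (cp < 0x800) {
                // 2 bytes: 110xxxxx 10xxxxxx
                out.write(0xC0 | (cp >> 6));
                out.write(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {
                // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
                out.write(0xE0 | (cp >> 12));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {
                // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            }
        }
    }

    public static void main(String[] args) {
        String val = "Test";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeUtf8Direct(val, out);

        // Compare against the two-step path (encode to array, then write).
        boolean same = Arrays.equals(out.toByteArray(),
            val.getBytes(StandardCharsets.UTF_8));
        System.out.println("matches String.getBytes(UTF_8): " + same);
    }
}
```

If the wire format prefixes the string with its byte length, a direct writer
also needs either a counting pass first or a reserved slot that is patched
afterwards; the sketch leaves that part out.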

Here is the benchmark:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/MyBenchmark.java

Unit test:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryOutputStreamTest.java

Statistics
https://github.com/javaller/MyBenchmark/blob/master/out_01_03_17.txt

Benchmark                                   Mode  Cnt    Score    Error  Units
MyBenchmark.binaryHeapOutputInDirect        avgt   50  111,337 ±  0,742  ns/op
MyBenchmark.binaryHeapOutputStreamDirect    avgt   50   23,847 ±  0,303  ns/op


Vadim


2017-02-28 4:29 GMT+03:00 Valentin Kulichenko <[email protected]>:

> Hi Vadim,
>
> Looks like you accidentally removed dev list from the thread, adding it
> back.
>
> I think there is still a misunderstanding. What I propose is to modify
> BinaryUtils#strToUtf8Bytes so that it writes directly to BinaryOutputStream
> instead of an intermediate array. This should decrease memory consumption and
> can also increase performance, as we will avoid the 'writeByteArray' step at
> the end.
>
> Does it make sense to you?
>
> -Val
>
> On Mon, Feb 27, 2017 at 6:55 AM, Вадим Опольский <[email protected]>
> wrote:
>
>> Hi, Valentin!
>>
>> What do you think about using the methods of BinaryOutputStream:
>>
>> 1) writeByteArray(byte[] val)
>> 2) writeCharArray(char[] val)
>> 3) write(byte[] arr, int off, int len)
>>
>> // 1) Encode the whole string, then write the resulting array.
>> String val = "Test";
>> out.writeByteArray(val.getBytes(StandardCharsets.UTF_8));
>>
>> // 2) Write the UTF-16 chars without encoding them.
>> String val = "Test";
>> out.writeCharArray(val.toCharArray());
>>
>> // 3) Copy the encoded bytes through a fixed-size buffer.
>> String val = "Test";
>> InputStream stream = new ByteArrayInputStream(
>>     val.getBytes(StandardCharsets.UTF_8));
>> byte[] buffer = new byte[1024];
>> int len;
>> while ((len = stream.read(buffer)) != -1) {
>>     out.write(buffer, 0, len);
>> }
>>
>> What else can we use?
>>
>> Vadim
>>
>>
>> 2017-02-25 2:21 GMT+03:00 Valentin Kulichenko <
>> [email protected]>:
>>
>>> Hi Vadim,
>>>
>>> Which method implements the approach described in the ticket? From what
>>> I see, all writeToStringX versions still encode into an intermediate
>>> array and then call out.writeByteArray. What we need to test is the
>>> approach where bytes are written directly into the stream during encoding.
>>> The encoding algorithm itself should stay the same for now, otherwise we
>>> will not know how to interpret the result.
>>>
>>> It looks like there is some misunderstanding here, so please let me know
>>> if anything is still unclear. I will be happy to answer your questions.
>>>
>>> -Val
>>>
>>> On Wed, Feb 22, 2017 at 7:22 PM, Valentin Kulichenko <
>>> [email protected]> wrote:
>>>
>>>> Hi Vadim,
>>>>
>>>> Thanks, I will review this week.
>>>>
>>>> -Val
>>>>
>>>> On Wed, Feb 22, 2017 at 2:28 AM, Вадим Опольский <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Valentin!
>>>>>
>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>
>>>>> I created BinaryWriterExImplNew (which extends BinaryWriterExImpl) and
>>>>> added new methods with the changes described in the ticket:
>>>>>
>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>
>>>>> I created a benchmark for BinaryWriterExImplNew:
>>>>>
>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>
>>>>> I ran the benchmark and compared the results:
>>>>>
>>>>> https://github.com/javaller/MyBenchmark/blob/master/totalstat.txt
>>>>>
>>>>> # Run complete. Total time: 00:10:24
>>>>> Benchmark                                    Mode  Cnt        Score       Error  Units
>>>>> ExampleTest.binaryHeapOutputStream1          avgt   50  1114999,207 ± 16756,776  ns/op
>>>>> ExampleTest.binaryHeapOutputStream2          avgt   50  1118149,320 ± 17515,961  ns/op
>>>>> ExampleTest.binaryHeapOutputStream3          avgt   50  1113678,657 ± 17652,314  ns/op
>>>>> ExampleTest.binaryHeapOutputStream4          avgt   50  1112415,051 ± 18273,874  ns/op
>>>>> ExampleTest.binaryHeapOutputStream5          avgt   50  1111366,583 ± 18282,829  ns/op
>>>>> ExampleTest.binaryHeapOutputStreamACSII      avgt   50  1112079,667 ± 16659,532  ns/op
>>>>> ExampleTest.binaryHeapOutputStreamUTFCustom  avgt   50  1114949,759 ± 16809,669  ns/op
>>>>> ExampleTest.binaryHeapOutputStreamUTFNIO     avgt   50  1121462,325 ± 19836,466  ns/op
>>>>>
>>>>> Is it OK? What's the next step? Do I have to move this JMH benchmark to
>>>>> the Ignite project?
>>>>>
>>>>> Vadim Opolski
>>>>>
>>>>> 2017-02-21 1:06 GMT+03:00 Valentin Kulichenko <
>>>>> [email protected]>:
>>>>>
>>>>>> Hi Vadim,
>>>>>>
>>>>>> I'm not sure I understand your benchmarks and how they verify the
>>>>>> optimization discussed here. Basically, here is what needs to be done:
>>>>>>
>>>>>> 1. Create a benchmark for BinaryWriterExImpl#doWriteString method.
>>>>>> 2. Run the benchmark with current implementation.
>>>>>> 3. Make the change described in the ticket.
>>>>>> 4. Run the benchmark with these changes.
>>>>>> 5. Compare results.
>>>>>>
>>>>>> Makes sense? Let me know if anything is unclear.
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Mon, Feb 20, 2017 at 8:51 AM, Вадим Опольский <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hello everybody!
>>>>>>>
>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>
>>>>>>> Valentin, I have just finished a benchmark (with JMH):
>>>>>>> https://github.com/javaller/MyBenchmark.git
>>>>>>>
>>>>>>> It collects data on how long serialization takes.
>>>>>>>
>>>>>>> For instance: https://github.com/javaller/MyBenchmark/blob/master/out200217.txt
>>>>>>>
>>>>>>> To run it, do the following:
>>>>>>>
>>>>>>> 1) clone it - git clone https://github.com/javaller/MyBenchmark.git
>>>>>>>
>>>>>>> 2) install it - mvn install
>>>>>>>
>>>>>>> 3) run benchmarks - java -Xms1024m -Xmx4096m -jar target\benchmarks.jar
>>>>>>>
>>>>>>> Vadim Opolski
>>>>>>>
>>>>>>> 2017-02-15 0:52 GMT+03:00 Valentin Kulichenko <
>>>>>>> [email protected]>:
>>>>>>>
>>>>>>>> Vladimir,
>>>>>>>>
>>>>>>>> I think we misunderstood each other. My understanding of this
>>>>>>>> optimization is the following.
>>>>>>>>
>>>>>>>> Currently string serialization is done in two steps (see
>>>>>>>> BinaryWriterExImpl#doWriteString):
>>>>>>>>
>>>>>>>> strArr = BinaryUtils.strToUtf8Bytes(val); // Encode string into byte array.
>>>>>>>> out.writeByteArray(strArr);               // Write byte array into stream.
>>>>>>>>
>>>>>>>> What this ticket suggests is to write directly into the stream while
>>>>>>>> the string is being encoded, without an intermediate array. This both
>>>>>>>> reduces memory consumption and eliminates the array copy step.
>>>>>>>>
>>>>>>>> I updated the ticket and added this explanation there.
>>>>>>>>
>>>>>>>> Vadim, can you create a micro benchmark and check if it gives any
>>>>>>>> improvement?
>>>>>>>>
>>>>>>>> -Val
>>>>>>>>
>>>>>>>> On Sun, Feb 12, 2017 at 10:38 PM, Vladimir Ozerov <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> It is hard to say whether it makes sense or not. No doubt, it
>>>>>>>>> could speed up the marshalling process at the cost of 2x memory
>>>>>>>>> required for strings. From my previous experience with marshalling
>>>>>>>>> micro-optimizations, we will hardly ever notice a speedup in a
>>>>>>>>> distributed environment.
>>>>>>>>>
>>>>>>>>> But there is another side: it could speed up our queries, because
>>>>>>>>> we will not have to unmarshal the string on every field access. So I
>>>>>>>>> would try to make this optimization optional and then measure query
>>>>>>>>> performance with classes having lots of strings. It could give us
>>>>>>>>> interesting results.
>>>>>>>>>
>>>>>>>>> On Mon, Feb 13, 2017 at 5:37 AM, Valentin Kulichenko <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Vladimir,
>>>>>>>>>>
>>>>>>>>>> Can you please take a look and provide your thoughts? Can this be
>>>>>>>>>> applied to the binary marshaller? From what I recall, it serializes
>>>>>>>>>> strings a bit differently from the optimized marshaller, so I'm not
>>>>>>>>>> sure.
>>>>>>>>>>
>>>>>>>>>> -Val
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 10, 2017 at 5:16 PM, Dmitriy Setrakyan <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 9, 2017 at 11:26 PM, Valentin Kulichenko <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> > Hi Vadim,
>>>>>>>>>>> >
>>>>>>>>>>> > I don't think it makes much sense to invest into OptimizedMarshaller.
>>>>>>>>>>> > However, I would check if this optimization is applicable to
>>>>>>>>>>> > BinaryMarshaller, and if yes, implement it.
>>>>>>>>>>> >
>>>>>>>>>>>
>>>>>>>>>>> Val, in this case can you please update the ticket?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> >
>>>>>>>>>>> > -Val
>>>>>>>>>>> >
>>>>>>>>>>> > On Thu, Feb 9, 2017 at 11:05 PM, Вадим Опольский <[email protected]> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > > Dear sirs!
>>>>>>>>>>> > >
>>>>>>>>>>> > > I want to resolve issue IGNITE-13 -
>>>>>>>>>>> > > https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>> > >
>>>>>>>>>>> > > Is it still relevant?
>>>>>>>>>>> > >
>>>>>>>>>>> > > Vadim Opolski
>>>>>>>>>>> > >
>>>>>>>>>>> >