Hi Valentin! Thank you for the comments.

There is a new method which writes directly to BinaryOutputStream instead of
an intermediate array.
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java

Here is the benchmark:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/MyBenchmark.java

Unit test:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryOutputStreamTest.java

Statistics:
https://github.com/javaller/MyBenchmark/blob/master/out_01_03_17.txt

Benchmark                                 Mode  Cnt    Score   Error  Units
MyBenchmark.binaryHeapOutputInDirect      avgt   50  111,337 ± 0,742  ns/op
MyBenchmark.binaryHeapOutputStreamDirect  avgt   50   23,847 ± 0,303  ns/op

Vadim

2017-02-28 4:29 GMT+03:00 Valentin Kulichenko <valentin.kuliche...@gmail.com>:

> Hi Vadim,
>
> Looks like you accidentally removed the dev list from the thread, adding it
> back.
>
> I think there is still a misunderstanding. What I propose is to modify
> BinaryUtils#strToUtf8Bytes so that it writes directly to
> BinaryOutputStream instead of an intermediate array. This should decrease
> memory consumption and can also increase performance, as we will avoid the
> 'writeByteArray' step at the end.
>
> Does it make sense to you?
>
> -Val
>
> On Mon, Feb 27, 2017 at 6:55 AM, Вадим Опольский <vaopols...@gmail.com>
> wrote:
>
>> Hi, Valentin!
>>
>> What do you think about using the methods of BinaryOutputStream:
>>
>> 1) writeByteArray(byte[] val)
>> 2) writeCharArray(char[] val)
>> 3) write(byte[] arr, int off, int len)
>>
>> String val = "Test";
>> out.writeByteArray(val.getBytes(UTF_8));
>>
>> String val = "Test";
>> out.writeCharArray(val.toCharArray());
>>
>> String val = "Test";
>> InputStream stream = new ByteArrayInputStream(
>>     val.getBytes(StandardCharsets.UTF_8));
>> byte[] buffer = new byte[1024];
>> int len;
>> // Copy only the bytes actually read on each pass.
>> while ((len = stream.read(buffer)) != -1) {
>>     out.write(buffer, 0, len);
>> }
>>
>> What else can we use?
>>
>> Vadim
>>
>>
>> 2017-02-25 2:21 GMT+03:00 Valentin Kulichenko <
>> valentin.kuliche...@gmail.com>:
>>
>>> Hi Vadim,
>>>
>>> Which method implements the approach described in the ticket? From what
>>> I see, all writeToStringX versions still encode into an intermediate
>>> array and then call out.writeByteArray. What we need to test is the
>>> approach where bytes are written directly into the stream during encoding.
>>> The encoding algorithm itself should stay the same for now, otherwise we
>>> will not know how to interpret the result.
>>>
>>> It looks like there is some misunderstanding here, so please let me know
>>> if anything is still unclear. I will be happy to answer your questions.
>>>
>>> -Val
>>>
>>> On Wed, Feb 22, 2017 at 7:22 PM, Valentin Kulichenko <
>>> valentin.kuliche...@gmail.com> wrote:
>>>
>>>> Hi Vadim,
>>>>
>>>> Thanks, I will review this week.
>>>>
>>>> -Val
>>>>
>>>> On Wed, Feb 22, 2017 at 2:28 AM, Вадим Опольский <vaopols...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Valentin!
>>>>>
>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>
>>>>> I created BinaryWriterExImplNew (which extends BinaryWriterExImpl) and
>>>>> added new methods with the changes described in the ticket:
>>>>>
>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>
>>>>> I created a benchmark for BinaryWriterExImplNew:
>>>>>
>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>
>>>>> I ran the benchmark and compared the results:
>>>>>
>>>>> https://github.com/javaller/MyBenchmark/blob/master/totalstat.txt
>>>>>
>>>>> # Run complete.
>>>>> Total time: 00:10:24
>>>>>
>>>>> Benchmark                                    Mode  Cnt        Score       Error  Units
>>>>> ExampleTest.binaryHeapOutputStream1          avgt   50  1114999,207 ± 16756,776  ns/op
>>>>> ExampleTest.binaryHeapOutputStream2          avgt   50  1118149,320 ± 17515,961  ns/op
>>>>> ExampleTest.binaryHeapOutputStream3          avgt   50  1113678,657 ± 17652,314  ns/op
>>>>> ExampleTest.binaryHeapOutputStream4          avgt   50  1112415,051 ± 18273,874  ns/op
>>>>> ExampleTest.binaryHeapOutputStream5          avgt   50  1111366,583 ± 18282,829  ns/op
>>>>> ExampleTest.binaryHeapOutputStreamACSII      avgt   50  1112079,667 ± 16659,532  ns/op
>>>>> ExampleTest.binaryHeapOutputStreamUTFCustom  avgt   50  1114949,759 ± 16809,669  ns/op
>>>>> ExampleTest.binaryHeapOutputStreamUTFNIO     avgt   50  1121462,325 ± 19836,466  ns/op
>>>>>
>>>>> Is it OK? What's the next step? Do I have to move this JMH benchmark to
>>>>> the Ignite project?
>>>>>
>>>>> Vadim Opolski
>>>>>
>>>>>
>>>>> 2017-02-21 1:06 GMT+03:00 Valentin Kulichenko <
>>>>> valentin.kuliche...@gmail.com>:
>>>>>
>>>>>> Hi Vadim,
>>>>>>
>>>>>> I'm not sure I understand your benchmarks and how they verify the
>>>>>> optimization discussed here. Basically, here is what needs to be done:
>>>>>>
>>>>>> 1. Create a benchmark for BinaryWriterExImpl#doWriteString method.
>>>>>> 2. Run the benchmark with current implementation.
>>>>>> 3. Make the change described in the ticket.
>>>>>> 4. Run the benchmark with these changes.
>>>>>> 5. Compare results.
>>>>>>
>>>>>> Makes sense? Let me know if anything is unclear.
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Mon, Feb 20, 2017 at 8:51 AM, Вадим Опольский <
>>>>>> vaopols...@gmail.com> wrote:
>>>>>>
>>>>>>> Hello everybody!
>>>>>>>
>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>
>>>>>>> Valentin, I have just finished the benchmark (with JMH) -
>>>>>>> https://github.com/javaller/MyBenchmark.git
>>>>>>>
>>>>>>> It collects data on how long serialization takes.
>>>>>>>
>>>>>>> For instance - https://github.com/javaller/MyBenchmark/blob/master/out200217.txt
>>>>>>>
>>>>>>> To run it, do the following:
>>>>>>>
>>>>>>> 1) clone it - git clone https://github.com/javaller/MyBenchmark.git
>>>>>>>
>>>>>>> 2) install it - mvn install
>>>>>>>
>>>>>>> 3) run benchmarks - java -Xms1024m -Xmx4096m -jar
>>>>>>> target\benchmarks.jar
>>>>>>>
>>>>>>> Vadim Opolski
>>>>>>>
>>>>>>>
>>>>>>> 2017-02-15 0:52 GMT+03:00 Valentin Kulichenko <
>>>>>>> valentin.kuliche...@gmail.com>:
>>>>>>>
>>>>>>>> Vladimir,
>>>>>>>>
>>>>>>>> I think we misunderstood each other. My understanding of this
>>>>>>>> optimization is the following.
>>>>>>>>
>>>>>>>> Currently string serialization is done in two steps (see
>>>>>>>> BinaryWriterExImpl#doWriteString):
>>>>>>>>
>>>>>>>> strArr = BinaryUtils.strToUtf8Bytes(val); // Encode string into byte array.
>>>>>>>> out.writeByteArray(strArr);               // Write byte array into stream.
>>>>>>>>
>>>>>>>> What this ticket suggests is to write directly into the stream while
>>>>>>>> the string is encoded, without an intermediate array. This both reduces
>>>>>>>> memory consumption and eliminates the array copy step.
>>>>>>>>
>>>>>>>> I updated the ticket and added this explanation there.
>>>>>>>>
>>>>>>>> Vadim, can you create a micro benchmark and check if it gives any
>>>>>>>> improvement?
>>>>>>>>
>>>>>>>> -Val
>>>>>>>>
>>>>>>>> On Sun, Feb 12, 2017 at 10:38 PM, Vladimir Ozerov <
>>>>>>>> voze...@gridgain.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> It is hard to say whether it makes sense or not. No doubt, it
>>>>>>>>> could speed up the marshalling process at the cost of 2x memory
>>>>>>>>> required for strings. From my previous experience with marshalling
>>>>>>>>> micro-optimizations, we will hardly ever notice a speedup in a
>>>>>>>>> distributed environment.
>>>>>>>>>
>>>>>>>>> But there is another side: it could speed up our queries, because
>>>>>>>>> we will not have to unmarshal the string on every field access. So I
>>>>>>>>> would try to make this optimization optional and then measure query
>>>>>>>>> performance with classes having lots of strings. It could give us
>>>>>>>>> interesting results.
>>>>>>>>>
>>>>>>>>> On Mon, Feb 13, 2017 at 5:37 AM, Valentin Kulichenko <
>>>>>>>>> valentin.kuliche...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Vladimir,
>>>>>>>>>>
>>>>>>>>>> Can you please take a look and provide your thoughts? Can this be
>>>>>>>>>> applied to the binary marshaller? From what I recall, it serializes
>>>>>>>>>> strings a bit differently from the optimized marshaller, so I'm not
>>>>>>>>>> sure.
>>>>>>>>>>
>>>>>>>>>> -Val
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 10, 2017 at 5:16 PM, Dmitriy Setrakyan <
>>>>>>>>>> dsetrak...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 9, 2017 at 11:26 PM, Valentin Kulichenko <
>>>>>>>>>>> valentin.kuliche...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> > Hi Vadim,
>>>>>>>>>>> >
>>>>>>>>>>> > I don't think it makes much sense to invest into
>>>>>>>>>>> > OptimizedMarshaller. However, I would check if this
>>>>>>>>>>> > optimization is applicable to BinaryMarshaller, and if yes,
>>>>>>>>>>> > implement it.
>>>>>>>>>>> >
>>>>>>>>>>>
>>>>>>>>>>> Val, in this case can you please update the ticket?
>>>>>>>>>>>
>>>>>>>>>>> >
>>>>>>>>>>> > -Val
>>>>>>>>>>> >
>>>>>>>>>>> > On Thu, Feb 9, 2017 at 11:05 PM, Вадим Опольский <
>>>>>>>>>>> > vaopols...@gmail.com> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > > Dear sirs!
>>>>>>>>>>> > >
>>>>>>>>>>> > > I want to resolve issue IGNITE-13 -
>>>>>>>>>>> > > https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>> > >
>>>>>>>>>>> > > Is it still relevant?
>>>>>>>>>>> > >
>>>>>>>>>>> > > Vadim Opolski
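
For reference, the two approaches compared in this thread (encode into a temporary byte[] and then copy it, versus writing each encoded byte straight into the stream) could be sketched roughly as below. This is only an illustration under stated assumptions: ByteSink is a hypothetical stand-in and not Ignite's BinaryOutputStream, the encoder is plain standard UTF-8 written for this sketch and not the code of BinaryUtils#strToUtf8Bytes, and any length prefix or type header that the real writer may emit is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class DirectUtf8Sketch {
    /** Hypothetical growable byte sink; a stand-in for illustration, NOT Ignite's BinaryOutputStream. */
    static final class ByteSink {
        private byte[] buf = new byte[16];
        private int pos;

        /** Appends the low 8 bits of b, doubling the buffer when it is full. */
        void writeByte(int b) {
            if (pos == buf.length)
                buf = Arrays.copyOf(buf, buf.length * 2);
            buf[pos++] = (byte) b;
        }

        /** Appends a whole array; this is the extra copy that the two-step path pays for. */
        void writeByteArray(byte[] arr) {
            for (byte b : arr)
                writeByte(b);
        }

        /** Returns a copy of the bytes written so far. */
        byte[] toByteArray() {
            return Arrays.copyOf(buf, pos);
        }
    }

    /** Two-step path: encode into a temporary array, then copy that array into the sink. */
    static void writeViaArray(ByteSink out, String val) {
        byte[] tmp = val.getBytes(StandardCharsets.UTF_8); // intermediate allocation
        out.writeByteArray(tmp);
    }

    /** One-step path: emit every encoded byte directly into the sink, no temporary array. */
    static void writeDirect(ByteSink out, String val) {
        for (int i = 0; i < val.length(); ) {
            int cp = val.codePointAt(i);      // combines a valid surrogate pair into one code point
            i += Character.charCount(cp);

            if (cp < 0x80) {                  // 1 byte
                out.writeByte(cp);
            } else if (cp < 0x800) {          // 2 bytes
                out.writeByte(0xC0 | (cp >> 6));
                out.writeByte(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {        // 3 bytes
                out.writeByte(0xE0 | (cp >> 12));
                out.writeByte(0x80 | ((cp >> 6) & 0x3F));
                out.writeByte(0x80 | (cp & 0x3F));
            } else {                          // 4 bytes
                out.writeByte(0xF0 | (cp >> 18));
                out.writeByte(0x80 | ((cp >> 12) & 0x3F));
                out.writeByte(0x80 | ((cp >> 6) & 0x3F));
                out.writeByte(0x80 | (cp & 0x3F));
            }
        }
    }

    public static void main(String[] args) {
        String s = "Test";
        ByteSink a = new ByteSink();
        ByteSink b = new ByteSink();
        writeViaArray(a, s);
        writeDirect(b, s);
        // Both paths should produce identical bytes for well-formed input.
        System.out.println(Arrays.equals(a.toByteArray(), b.toByteArray()));
    }
}
```

Both paths are meant to produce the same bytes for well-formed strings, so a benchmark comparing them would measure only the cost of the temporary array and the copy. Unpaired surrogates are not handled the same way as String.getBytes, so the sketch should not be used on such input without adjustment.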