Hi,

You can just point me to your repo, and I can open a proper PR with that code.

On Mon, Mar 6, 2023, 9:43 AM Fikavec F <fika...@yandex.ru> wrote:

> Thanks. In the coming days I will run tests and measurements on real
> hardware.
>
> Unfortunately, my code is not ready to become part of the project
> directly. This is a very sensitive place for changes, I am not a Java
> developer, and I am not deeply familiar with Solr's internal mechanisms.
> The code has no tests, it does not support the modes and parameters of
> the original wt=json, and I have a number of questions about it myself.
> It would be great if a knowledgeable professional could check my code and
> prepare a high-quality patch, as Mikhail Khludnev previously helped me
> here to get a patch with a modified buffer. As before, I am happy to take
> part in testing such a patch if it appears. All I did was replace
> SmileResponseWriter with JsonFactory in the source code, as I wrote
> earlier. I am not sure that reading my low-quality code will help
> professionals more than simply knowing which part of the code is 4x+
> slower than it could be, so that it can be revised and improved.
>
> I have prepared a repository and shared the code with my changes:
> https://github.com/Fikavec/NewAndModifiedSolrResponseWriters
> The first commit contains the code of the original SmileResponseWriter,
> so it is easy to see what small changes I made. I placed all the jars
> from the bin folders in ...
> /solr-8.11.2/server/solr-webapp/webapp/WEB-INF/lib/* and registered them
> in the collection's solrconfig.xml:
>
> <queryResponseWriter name="myfastjson" class="my.MyJacksonJsonResponseWriter"></queryResponseWriter>
> <queryResponseWriter name="myfastcbor" class="my.MyJacksonCBORResponseWriter"></queryResponseWriter>
>
> Then I created a collection and used them via the wt=myfastjson and
> wt=myfastcbor query parameters.
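
For anyone who wants to try the two writers from a client, here is a minimal Python sketch of how the select URLs could be built. The host, port, and collection name are placeholders I made up; only the wt values come from the solrconfig.xml snippet above.

```python
from urllib.parse import urlencode

# Hypothetical values: adjust host, port and collection to your own setup.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "mycollection"


def select_url(query, writer, rows=10):
    """Build a /select URL that asks Solr to use the named response writer."""
    params = {"q": query, "rows": rows, "wt": writer}
    return f"{SOLR_BASE}/{COLLECTION}/select?{urlencode(params)}"


# The writer names match the queryResponseWriter entries quoted above.
json_url = select_url("*:*", "myfastjson")
cbor_url = select_url("*:*", "myfastcbor")
print(json_url)
print(cbor_url)
```

The response body for wt=myfastcbor would be binary CBOR, so the client needs a CBOR decoder rather than a JSON parser.
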
>
> Please let me know if there are problems in my code. The UTF-8 handling
> in particular raises a question, since I do not know in which encoding
> Solr passes data to writers; Michael Gibney mentioned that it goes
> utf-16 -> utf-8 --> writer. In addition, Jackson has both writeString and
> writeRawUTF8String methods (
> https://fasterxml.github.io/jackson-core/javadoc/2.13/com/fasterxml/jackson/core/JsonGenerator.html).
> Which one is needed after Solr passes the data to the writer? From the
> writeRawUTF8String Javadoc:
>
> Method similar to writeString(String)
> <https://fasterxml.github.io/jackson-core/javadoc/2.13/com/fasterxml/jackson/core/JsonGenerator.html#writeString-java.lang.String->
> but that takes as its input a UTF-8 encoded String that is to be output
> as-is, without additional escaping (type of which depends on data format;
> backslashes for JSON). However, quoting that data format requires (like
> double-quotes for JSON) will be added around the value if and as necessary.
>
> Note that some backends may choose not to support this method: for
> example, if underlying destination is a Writer
> <https://docs.oracle.com/javase/8/docs/api/java/io/Writer.html?is-external=true>
> using this method would require UTF-8 decoding. If so, implementation may
> instead choose to throw a UnsupportedOperationException
> <https://docs.oracle.com/javase/8/docs/api/java/lang/UnsupportedOperationException.html?is-external=true>
> due to ineffectiveness of having to decode input.
>
> I checked my code on various UTF-8 data and did not find any problems, but
> what if I used the wrong function (writeString) and there are cases where
> the data will be corrupted...
>
> Speeding up the JSON output would be useful to many people, but I am not
> sure about CBOR. It turned out that CBOR is easy to add as a
> ResponseWriter, like the other data formats from the FasterXML Jackson
> library (https://github.com/FasterXML/jackson#data-format-modules); it is
> possible that csv, xml... will work faster with this library than the
> current implementation. Python supports CBOR well (cbor2 is fast), and
> full data fetching with cursors works 10%-20% faster than fetching data
> from Solr to Python via JSON (*this means faster in comparison with the
> modified JSON serializer on Jackson; **in Python I use the orjson library,
> which is faster than the regular json library). I did not find any very
> fast Smile deserializer for Python, but this does not mean that many
> people need CBOR.
>
> At the moment, everything works for me on my collections and their data
> structures, and it works very fast. It surprised me that the speed of a
> regular JSON select with gzip has almost doubled. This could potentially
> lead to higher RPS, since at full load individual server responses will
> return and finish faster. I will try to check this too on real hardware
> using the wrk benchmarking tool.
>
> Best Regards,
>
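
On the writeString versus writeRawUTF8String question, the quoted Javadoc already hints at the risk: the raw variant emits its input as-is, without escaping. Below is a small Python analogue of that difference. It is not Jackson code, the function names are mine, and it assumes the value reaching the writer is ordinary unescaped text.

```python
import json


def write_escaped(value: str) -> bytes:
    """Analogue of an escaping writer: escape the text, then encode as UTF-8."""
    return json.dumps(value, ensure_ascii=False).encode("utf-8")


def write_raw_utf8(value_utf8: bytes) -> bytes:
    """Analogue of an as-is writer: add quotes, emit the bytes untouched."""
    return b'"' + value_utf8 + b'"'


def is_valid_json(payload: bytes) -> bool:
    """Return True if the payload parses as JSON."""
    try:
        json.loads(payload.decode("utf-8"))
        return True
    except ValueError:
        return False


plain = "Привет, Solr"        # non-ASCII text, nothing that needs escaping
tricky = 'say "hi" \\ bye'    # contains a double quote and a backslash

for text in (plain, tricky):
    escaped = write_escaped(text)
    raw = write_raw_utf8(text.encode("utf-8"))
    print(repr(text), is_valid_json(escaped), is_valid_json(raw))
```

Both paths give the same bytes for text that needs no escaping, which may be why tests on ordinary UTF-8 data show no difference. The as-is path only breaks once a value contains a quote, a backslash, or a control character, so under the assumption above the escaping method is the safer default.
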