[
https://issues.apache.org/jira/browse/SOLR-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17729423#comment-17729423
]
Noble Paul commented on SOLR-16812:
-----------------------------------
Let's be clear about the objectives of this ticket.
We use JSON to index and query Solr because we do not use Java. So we need a more
efficient way to interact with Solr (especially for indexing, because we are
write-heavy). I wanted to pick a format that has libraries in as many
languages as possible (Go, Python, C#, etc.).
{quote}What does the CBOR performance look like generally?
{quote}
Here is a [benchmark|https://ugorji.net/blog/benchmarking-serialization-in-go]
from the wild. The point is that most of these binary formats perform much
better than JSON.
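To illustrate where the payload-size difference comes from, here is a minimal sketch of a CBOR encoder for JSON-like values, written in Python with only the standard library. It covers just a subset of RFC 8949 (integers that fit in 64 bits, 64-bit floats, text strings, arrays, maps, booleans, null). It is not the code used in Solr or in the benchmark above; the names `cbor_encode` and `_head` and the sample document are hypothetical.

```python
import json
import struct


def _head(major: int, n: int) -> bytes:
    """Encode a CBOR head: 3-bit major type plus an unsigned argument n."""
    if n < 24:
        return bytes([(major << 5) | n])          # argument fits in the head byte
    if n < 1 << 8:
        return bytes([(major << 5) | 24, n])      # 1-byte argument follows
    if n < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", n)


def cbor_encode(value) -> bytes:
    """Encode JSON-like Python values into CBOR (subset of RFC 8949)."""
    if value is None:
        return b"\xf6"
    if value is True:          # checked before int, since bool subclasses int
        return b"\xf5"
    if value is False:
        return b"\xf4"
    if isinstance(value, int):
        # Major type 0 holds n >= 0; major type 1 holds -1 - n for n < 0.
        return _head(0, value) if value >= 0 else _head(1, -1 - value)
    if isinstance(value, float):
        return b"\xfb" + struct.pack(">d", value)  # always 64-bit here
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return _head(3, len(raw)) + raw            # length prefix, no quotes
    if isinstance(value, (list, tuple)):
        return _head(4, len(value)) + b"".join(cbor_encode(v) for v in value)
    if isinstance(value, dict):
        out = _head(5, len(value))
        for k, v in value.items():
            out += cbor_encode(k) + cbor_encode(v)
        return out
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Hypothetical document, repeated 1100 times to mimic a small batch.
doc = {
    "id": "film-1",
    "name": "Example Film",
    "directed_by": ["Jane Doe"],
    "genre": ["Drama", "Thriller"],
    "year": 2006,
}
docs = [doc] * 1100
json_bytes = json.dumps(docs).encode("utf-8")
cbor_bytes = cbor_encode(docs)
print("json bytes:", len(json_bytes))
print("cbor bytes:", len(cbor_bytes))
```

The saving in this sketch comes from length-prefixed strings (no quotes or escaping), compact integers, and the absence of separators. Field names are still repeated in every document, so the exact ratio depends on the data.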
{quote}"films.json" feels a little small to be testing this.
{quote}
It has 1100 docs. How often do we index/fetch more than 1100 docs?
{quote}Can you elaborate at all on why you chose CBOR over other alternatives?
{quote}
I have done benchmarks, and they concur with the numbers we see in the wild. Avro
was not considered because there is no Jackson support, and I couldn't find good
Go support either.
{quote}if we introduce a new binary format, then it should come with a plan to
deprecate or replace javabin.
{quote}
I would like to see that happen. javabin must go (if possible). We need to do a
lot of refactoring in our Solr/SolrJ code before that is possible. It's a
non-trivial task.
> Support CBOR format for update/query
> ------------------------------------
>
> Key: SOLR-16812
> URL: https://issues.apache.org/jira/browse/SOLR-16812
> Project: Solr
> Issue Type: Task
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Noble Paul
> Assignee: Noble Paul
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Javabin is quite efficient and fast, but non-Java users have to use JSON
> exclusively.
>
> [CBOR|http://example.com/] is a widely used format that is supported by most
> languages.
>
> Here is a benchmark of CBOR vs. JSON (and javabin) using our films.json:
> {code:java}
> Payload size (bytes)
> ====================
>
> json    : 633600
> cbor    : 290672
> javabin : 234520
>
> Time taken to index
> ===================
>
> json    : 583 ms
> cbor    : 509 ms
> javabin : 549 ms
>
> Time taken to query *:* (1100 docs)
> ===================================
>
> json    : 92 ms
> javabin : 70 ms
> cbor    : 63 ms{code}