[ 
https://issues.apache.org/jira/browse/SOLR-8582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112643#comment-15112643
 ] 

Shalin Shekhar Mangar commented on SOLR-8582:
---------------------------------------------

I think there was an underlying bug in JsonRecordReader affecting JSON Lines 
files (one JSON document per line) that your patch also fixes. Without your 
patch, I was not able to index a 549MB JSON Lines file even with a 2g heap; I 
had to bump the heap up to 4g to succeed. With your patch, I can index the 
same file with a 512m heap. Too bad we missed the 5.3.2 and 5.4.1 releases.

+1 to commit

> /update/json/docs is 4x slower than /update for indexing a list of json docs
> ----------------------------------------------------------------------------
>
>                 Key: SOLR-8582
>                 URL: https://issues.apache.org/jira/browse/SOLR-8582
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>            Reporter: Shalin Shekhar Mangar
>             Fix For: 5.5, Trunk
>
>         Attachments: SOLR-8582.patch, SOLR-8582.patch
>
>
> Indexing a ~650 MB json file containing a list of 2.2 million json documents, 
> I found that bin/post had become 4x slower after SOLR-7042. Memory 
> consumption has also gone up and I can no longer index this file with a 512mb 
> heap.
> The difference is because we now default to /update/json/docs instead of 
> /update. This can be verified on trunk:
> {code}
> time curl 'http://localhost:8983/solr/gettingstarted/update' --data-binary @/hdd/solr-data/imdb.json
> {"responseHeader":{"status":0,"QTime":161869}}
>
> real  2m42.044s
> user  0m0.292s
> sys   0m0.493s
>
> time curl 'http://localhost:8983/solr/gettingstarted/update/json/docs' --data-binary @/hdd/solr-data/imdb.json
> {"responseHeader":{"status":0,"QTime":686264}}
>
> real  11m26.478s
> user  0m0.324s
> sys   0m0.552s
> {code}
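
As a quick check on the "4x" figure, the two QTime values reported in the quoted curl responses can be compared directly. This is only a minimal sketch: the function names `slowdown` and `format_duration` are made up for illustration, and the only inputs are the two QTime numbers (in milliseconds) copied from the output above.

```python
# QTime values in milliseconds, copied from the two curl responses quoted above.
QTIME_UPDATE_MS = 161869      # POST to /update
QTIME_JSON_DOCS_MS = 686264   # POST to /update/json/docs


def slowdown(baseline_ms: int, candidate_ms: int) -> float:
    """Return how many times slower the candidate run is than the baseline."""
    return candidate_ms / baseline_ms


def format_duration(ms: int) -> str:
    """Render milliseconds in the same XmY.YYYs style that `time` prints."""
    minutes, remainder_ms = divmod(ms, 60_000)
    return f"{minutes}m{remainder_ms / 1000:.3f}s"


if __name__ == "__main__":
    print("/update:          ", format_duration(QTIME_UPDATE_MS))
    print("/update/json/docs:", format_duration(QTIME_JSON_DOCS_MS))
    ratio = slowdown(QTIME_UPDATE_MS, QTIME_JSON_DOCS_MS)
    print(f"slowdown: {ratio:.2f}x")
```

The server-side QTime values give a ratio of roughly 4.24, which agrees with the wall-clock times reported by `time`, since the client's user and sys times are negligible in both runs.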



