[jira] [Commented] (SOLR-14013) javabin performance regressions

2020-04-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17086434#comment-17086434
 ] 

ASF subversion and git services commented on SOLR-14013:


Commit c2cd10b923cf2dca0030f2b1c304038bd8267b4e in lucene-solr's branch 
refs/heads/branch_7_7 from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c2cd10b ]

SOLR-14259: Back port javabin performance regression fixes from SOLR-14013


> javabin performance regressions
> ---
>
> Key: SOLR-14013
> URL: https://issues.apache.org/jira/browse/SOLR-14013
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 7.7
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
>Priority: Blocker
> Fix For: 8.4
>
> Attachments: SOLR-14013.patch, SOLR-14013.patch, TestQuerySpeed.java, 
> test.json
>
>
> As noted by [~rrockenbaugh] in SOLR-13963, javabin also recently became 
> orders of magnitude slower in certain cases since v7.7.  The cases identified 
> so far include large numbers of values in a field.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14013) javabin performance regressions

2020-04-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17086432#comment-17086432
 ] 

ASF subversion and git services commented on SOLR-14013:


Commit 6b1263a035cf1ff01c868dac5b32b2421aa74f1f in lucene-solr's branch 
refs/heads/branch_7_7 from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6b1263a ]

SOLR-14259: Back port javabin performance regression fixes from SOLR-14013





[jira] [Commented] (SOLR-14013) javabin performance regressions

2020-04-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083861#comment-17083861
 ] 

ASF subversion and git services commented on SOLR-14013:


Commit 5d3dfbd0ce8a2ad990635e71144615f1c4815d22 in lucene-solr's branch 
refs/heads/branch_7_7 from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5d3dfbd ]

SOLR-14013: trying to port to SOlr 7.7 (#1254)






[jira] [Commented] (SOLR-14013) javabin performance regressions

2020-02-12 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035589#comment-17035589
 ] 

Noble Paul commented on SOLR-14013:
---

I've opened SOLR-14259

 




[jira] [Commented] (SOLR-14013) javabin performance regressions

2020-02-12 Thread Houston Putman (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035499#comment-17035499
 ] 

Houston Putman commented on SOLR-14013:
---

[~noble.paul], at the very least I think we should backport this to 7_7. If we 
leave the latest release of 7 in a state with a significant regression/bug in 
it, then we are basically asking people to either:
 * Know that 7.6 is the last stable release of Solr for people wanting to use 
multiValued fields in a sharded collection
 * Upgrade to Solr 8.4

In my opinion, neither of those is a good option, because users are always 
going to go with the most up-to-date version of Solr that works for their 
index, and upgrading to a new major version is a very tough process for a lot 
of people.

This isn't a bug that existed throughout the entirety of Solr 7; it was 
introduced in the last minor release. A lot of people are very comfortable with 
Solr 7, and trust it. People also trust that the last minor/patch version of 
something is going to be the most stable version. We should make sure that the 
latest release of our second-to-last major version (7) is stable and maintains 
the trust that users have in it and in Solr in general.

It is very little work to backport this, and probably not much more work to do 
another patch or minor release (7.8 or 7.7.3). With that work we will be 
providing a significantly better user experience for our community.




[jira] [Commented] (SOLR-14013) javabin performance regressions

2020-01-29 Thread Karl Stoney (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026170#comment-17026170
 ] 

Karl Stoney commented on SOLR-14013:


Please could this be backported to 7_7?  We build that branch from source 
anyway so I'd really appreciate it! 




[jira] [Commented] (SOLR-14013) javabin performance regressions

2020-01-29 Thread Houston Putman (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026046#comment-17026046
 ] 

Houston Putman commented on SOLR-14013:
---

I think backporting would be a good idea, even if a release isn't planned yet.




[jira] [Commented] (SOLR-14013) javabin performance regressions

2020-01-29 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026032#comment-17026032
 ] 

Noble Paul commented on SOLR-14013:
---

I can port it to 7.x, but no release is planned.




[jira] [Commented] (SOLR-14013) javabin performance regressions

2020-01-29 Thread Florent Sithi (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025919#comment-17025919
 ] 

Florent Sithi commented on SOLR-14013:
--

Do you plan to fix it in the 7.x series as well, or do we have to migrate to 8.4.0?




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995402#comment-16995402
 ] 

ASF subversion and git services commented on SOLR-14013:


Commit 422de99acf7cfab004e1e976c1ab47870dc6cfba in lucene-solr's branch 
refs/heads/branch_8_4 from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=422de99 ]

SOLR-14013: javabin performance regressions


> javabin performance regressions
> ---
>
> Key: SOLR-14013
> URL: https://issues.apache.org/jira/browse/SOLR-14013
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.7
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
>Priority: Blocker
> Fix For: 8.4
>
> Attachments: SOLR-14013.patch, SOLR-14013.patch, TestQuerySpeed.java, 
> test.json
>
>
> As noted by [~rrockenbaugh] in SOLR-13963, javabin also recently became 
> orders of magnitude slower in certain cases since v7.7.  The cases identified 
> so far include large numbers of values in a field.






[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995400#comment-16995400
 ] 

ASF subversion and git services commented on SOLR-14013:


Commit 9717540b8ecb6f5e142aaef8e27464690684a0f9 in lucene-solr's branch 
refs/heads/branch_8x from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9717540 ]

SOLR-14013: javabin performance regressions





[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-12 Thread Houston Putman (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994888#comment-16994888
 ] 

Houston Putman commented on SOLR-14013:
---

[~noble.paul], my speed tests weren't completely scientific, but I tried to 
make the scenarios as similar between the setups as possible.

I think the main takeaways were that the queries were significantly faster (30 
seconds -> .1 seconds). The smaller differences between the ingest speeds were 
less of a concern to me. I can redo the tests and try to make them more 
scientific & accurate if these numbers give you pause.

I've reviewed the patch and run the test on the updated master, and everything 
looks good to me.
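
For anyone who wants to redo the timings more rigorously, here is a minimal sketch of a harness that warms up first and then reports the median of several runs. It is not the method used for the numbers in this issue. The class name `QueryTimer` and the placeholder workload are hypothetical; in a real comparison the `Runnable` would issue the same query against each setup.

```java
import java.util.Arrays;

public class QueryTimer {
    /**
     * Runs the task `warmups` times untimed, then `runs` times timed,
     * and returns the median elapsed time in milliseconds.
     */
    static double medianMillis(Runnable task, int warmups, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmups; i++) {
            task.run(); // untimed: lets the JIT and caches settle
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        long median = (runs % 2 == 1)
                ? nanos[runs / 2]
                : (nanos[runs / 2 - 1] + nanos[runs / 2]) / 2;
        return median / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): stands in for a real query call.
        Runnable placeholder = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append(i);
            }
            if (sb.length() == 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        System.out.printf("median: %.3f ms%n", medianMillis(placeholder, 5, 11));
    }
}
```

The median is less sensitive than the mean to a single slow run, which matters when one side of the comparison is sub-second.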




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994651#comment-16994651
 ] 

ASF subversion and git services commented on SOLR-14013:


Commit 4d5df0e20ac3f2ac0a050241b3e124667ea1f812 in lucene-solr's branch 
refs/heads/gradle-master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4d5df0e ]

SOLR-14013: FIX: javabin performance regressions





[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994650#comment-16994650
 ] 

ASF subversion and git services commented on SOLR-14013:


Commit b35f1debe33e69dcfb94d295324ca7fa85a6b5d7 in lucene-solr's branch 
refs/heads/gradle-master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b35f1de ]

SOLR-14013: javabin performance regressions





[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-12 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994621#comment-16994621
 ] 

Noble Paul commented on SOLR-14013:
---

[~jpountz] It's not yet pushed to branch_8x. I'll do it once it is reviewed. 
This has to go in {{8.4}}




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994614#comment-16994614
 ] 

ASF subversion and git services commented on SOLR-14013:


Commit 4d5df0e20ac3f2ac0a050241b3e124667ea1f812 in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4d5df0e ]

SOLR-14013: FIX: javabin performance regressions





[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-12 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994610#comment-16994610
 ] 

Noble Paul commented on SOLR-14013:
---

I accidentally pushed the fix to master instead of pushing to a branch and raising a PR.

 

[~ysee...@gmail.com] [~houston] please review

The changes are
 # Perf optimizations are eliminated from HttpShardHandler & JavabinLoader
 # The bug is fixed




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994603#comment-16994603
 ] 

ASF subversion and git services commented on SOLR-14013:


Commit b35f1debe33e69dcfb94d295324ca7fa85a6b5d7 in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b35f1de ]

SOLR-14013: javabin performance regressions





[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-12 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994367#comment-16994367
 ] 

Noble Paul commented on SOLR-14013:
---

I shall merge a fix soon. This fix is important.




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-11 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994350#comment-16994350
 ] 

Adrien Grand commented on SOLR-14013:
-

I'm deferring to you as to whether this patch is safe to get in so close to the 
release, but if you think it's better to get it in than not, then to me the 
question is about how long you think we need to get it merged. If it's a matter 
of one or two additional days it's fine. If it's weeks, I'll have a preference 
for targeting it for the next release.




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-11 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994233#comment-16994233
 ] 

Ishan Chattopadhyaya commented on SOLR-14013:
-

> Probably too close of a call to get it into 8.4

This is a blocker for 8.4. If [~jpountz] feels we shouldn't wait for this one, 
then we can have an 8.4.1 with this.




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-11 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993963#comment-16993963
 ] 

Noble Paul commented on SOLR-14013:
---

Thanks [~houston]

I don't understand the difference between these two sets of numbers. The perf 
(with the patch) should be the same on both 8.x and master, correct?
{code:java}
patch (8.x)

Ingest - 1.2 seconds
Sharded Query - 0.4 seconds
Non-Distrib Javabin Query - 0.17 seconds
Non-Distrib JSON Query - 0.13 seconds

patch (master)

Ingest - 0.87 seconds
Sharded Query - 0.3 seconds
Non-Distrib Javabin Query - 0.06 seconds
Non-Distrib JSON Query - 0.08 seconds
{code}




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-11 Thread Houston Putman (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993844#comment-16993844
 ] 

Houston Putman commented on SOLR-14013:
---

I've created a patch that adds in [~noble.paul]'s test and fix (without the 
large file), and reverts the conversion for indexing and deserializing 
responses. I think I incorporated all of the places that you mentioned, 
[~ysee...@gmail.com]. The only issue in the tests was in the langid contrib 
module, which was reverting a later-made fix.

Probably too close of a call to get it into 8.4. What do y'all think of 
backporting this to 7.7.x, since it is such a serious regression?

The solr/lucene tests pass on 8x and master. I've used [~rrockenbaugh]'s 
testing method mentioned above on every branch I thought would be relevant. 
The results are below:



*patch (8.x)*

Ingest - 1.2 seconds
Sharded Query - 0.4 seconds
Non-Distrib Javabin Query - 0.17 seconds
Non-Distrib JSON Query - 0.13 seconds

*patch (master)*

Ingest - 0.87 seconds
Sharded Query - 0.3 seconds
Non-Distrib Javabin Query - 0.06 seconds
Non-Distrib JSON Query - 0.08 seconds

*7.6*

Ingest - 1.6 seconds
Sharded Query - 0.3 seconds
Non-Distrib Javabin Query - 0.12 seconds
Non-Distrib JSON Query - 0.15 seconds

*7.x*

Ingest - 1.3 seconds
Sharded Query - 36 seconds
Non-Distrib Javabin Query - 30 seconds
Non-Distrib JSON Query - 0.3 seconds

*8.x*

Ingest - 1.18 seconds
Sharded Query - 21 seconds
Non-Distrib Javabin Query - 20 seconds
Non-Distrib JSON Query - 0.07 seconds

*master*

Ingest - 2.6 seconds
Sharded Query - 35 seconds
Non-Distrib Javabin Query - 35 seconds
Non-Distrib JSON Query - 0.16 seconds
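
To put the results above in terms of slowdown factors, here is a small worked calculation using only the non-distributed javabin timings listed in this comment. The class and method names are made up for illustration.

```java
public class RegressionFactors {
    /** Ratio of the slower timing to the faster one. */
    static double factor(double slowSeconds, double fastSeconds) {
        return slowSeconds / fastSeconds;
    }

    public static void main(String[] args) {
        // Non-Distrib Javabin Query times in seconds, copied from the results above.
        double v76 = 0.12;
        double v7x = 30;
        double v8x = 20;
        double patched8x = 0.17;

        System.out.printf("7.x vs 7.6: about %.0fx slower%n", factor(v7x, v76));
        System.out.printf("8.x vs patched 8.x: about %.0fx slower%n", factor(v8x, patched8x));
    }
}
```

Both ratios come out above 100, which is consistent with the "orders of magnitude" wording in the issue description.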




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-08 Thread Yonik Seeley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991041#comment-16991041
 ] 

Yonik Seeley commented on SOLR-14013:
-

Please don't commit that huge JSON file... a doc matching that can be created 
with a few lines of java in the test.
I'm not sure the test belongs as a unit test anyway as it's more of a 
performance benchmark, but I don't care much either way as long as it's quick 
to run.
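
As a rough illustration of generating such a document in code rather than committing it, here is a minimal sketch using only the JDK. The field name {{vals_ss}}, the value pattern and the class name are assumptions made for illustration; they are not taken from test.json or from the actual test.

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class LargeDocGenerator {
  // Builds a JSON document with one multi-valued string field holding n values.
  // "vals_ss" and the "valueN" pattern are hypothetical, chosen for illustration.
  static String buildDoc(String id, int n) {
    String values = IntStream.range(0, n)
        .mapToObj(i -> "\"value" + i + "\"")
        .collect(Collectors.joining(","));
    return "{\"id\":\"" + id + "\",\"vals_ss\":[" + values + "]}";
  }

  public static void main(String[] args) {
    // A small n keeps the output readable; a benchmark would use thousands.
    System.out.println(buildDoc("1", 3));
  }
}
```

The same few lines scale to any number of values, which keeps the repository free of large fixtures.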

In general, what I think should be done is:
- the auto-convert changes should be removed (in SolrDocument, SolrInputField, 
MaskCharSeqSolrDocument)
- if there are parts of the code base that can't handle CharSequence, then 
disable reading Strings as CharSequence and look at whether those other pieces 
of code can be fixed to handle CharSequence.






[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-08 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991031#comment-16991031
 ] 

Noble Paul commented on SOLR-14013:
---

I have submitted a bug fix for the perf degradation

There are 3 places where the optimizations are done
 # Writing out responses
 # Indexing
 # deserializing responses during inter-node communications

The changes for #1 are minimal; those for #2 and #3 are complex.

I would recommend reverting #2 and #3 and letting #1 remain, with the 
bug fix (and more auditing) I have just submitted.

 

I'll go with the decision of the community and do the necessary work




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-08 Thread Yonik Seeley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991014#comment-16991014
 ] 

Yonik Seeley commented on SOLR-14013:
-

I worked up a quick-n-dirty patch to disable the charseq optimization stuff to 
test my hypothesis on slower indexing speed:
{code}
git diff
diff --git a/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandler.java b/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandler.java
index 69da3948fe9..620fffb1303 100644
--- a/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandler.java
+++ b/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandler.java
@@ -146,7 +146,7 @@ public class HttpShardHandler extends ShardHandler {
   private static final BinaryResponseParser READ_STR_AS_CHARSEQ_PARSER = new BinaryResponseParser() {
     @Override
     protected JavaBinCodec createCodec() {
-      return new JavaBinCodec(null, stringCache).setReadStringAsCharSeq(true);
+      return new JavaBinCodec(null, stringCache).setReadStringAsCharSeq(false);
     }
   };
 
diff --git a/solr/core/src/java/org/apache/solr/response/DocsStreamer.java b/solr/core/src/java/org/apache/solr/response/DocsStreamer.java
index 3d1976e143c..056dc08d963 100644
--- a/solr/core/src/java/org/apache/solr/response/DocsStreamer.java
+++ b/solr/core/src/java/org/apache/solr/response/DocsStreamer.java
@@ -148,9 +148,7 @@ public class DocsStreamer implements Iterator<SolrDocument> {
     // because that doesn't include extra fields needed by transformers
     final Set<String> fieldNamesNeeded = fields.getLuceneFieldNames();
 
-    final SolrDocument out = ResultContext.READASBYTES.get() == null ?
-        new SolrDocument() :
-        new BinaryResponseWriter.MaskCharSeqSolrDocument();
+    final SolrDocument out = new SolrDocument();
 
     // NOTE: it would be tempting to try and optimize this to loop over fieldNamesNeeded
     // when it's smaller then the IndexableField[] in the Document -- but that's actually *less* effecient
diff --git a/solr/solrj/src/java/org/apache/solr/common/util/ByteArrayUtf8CharSequence.java b/solr/solrj/src/java/org/apache/solr/common/util/ByteArrayUtf8CharSequence.java
index 7a4abe2c303..53cfbee320f 100644
--- a/solr/solrj/src/java/org/apache/solr/common/util/ByteArrayUtf8CharSequence.java
+++ b/solr/solrj/src/java/org/apache/solr/common/util/ByteArrayUtf8CharSequence.java
@@ -209,8 +209,11 @@ public class ByteArrayUtf8CharSequence implements Utf8CharSequence {
     }
     return vals;
   }
-
   public static Object convertCharSeq(Object o) {
+    return o; // nocommit
+  }
+
+  public static Object _convertCharSeq(Object o) {
     if (o == null) return null;
     if (o instanceof Utf8CharSequence) return ((Utf8CharSequence) o).toString();
     if (o instanceof Collection) return convertCharSeq((Collection) o);
{code}

I also hacked up the unit test I used to find the N^2 issue...
it's obviously not good for benchmarking (being a unit test, etc.), but good 
enough to detect anything major.
I tested with a single value per string field (and many fields per doc); it 
would be worse for multiple values per field.

Results:
= master, single valued string fields
 [junit4] 2> INDEX TIME=10293
 [junit4] 2> QUERY TIME=891 xml
 [junit4] 2> QUERY TIME=415 javabin
 [junit4] 2> QUERY TIME=600 json

 [junit4] 2> INDEX TIME=10313
 [junit4] 2> QUERY TIME=872 xml
 [junit4] 2> QUERY TIME=389 javabin
 [junit4] 2> QUERY TIME=579 json

 [junit4] 2> INDEX TIME=10307
 [junit4] 2> QUERY TIME=858 xml
 [junit4] 2> QUERY TIME=410 javabin
 [junit4] 2> QUERY TIME=570 json

 [junit4] 2> INDEX TIME=10318
 [junit4] 2> QUERY TIME=915 xml
 [junit4] 2> QUERY TIME=382 javabin
 [junit4] 2> QUERY TIME=600 json

 [junit4] 2> INDEX TIME=10579
 [junit4] 2> QUERY TIME=843 xml
 [junit4] 2> QUERY TIME=386 javabin
 [junit4] 2> QUERY TIME=570 json

= patch disabling charseq stuff, single valued string fields
   [junit4]   2> INDEX TIME=8547
   [junit4]   2> QUERY TIME=881 xml
   [junit4]   2> QUERY TIME=396 javabin
   [junit4]   2> QUERY TIME=576 json

   [junit4]   2> INDEX TIME=9428
   [junit4]   2> QUERY TIME=821 xml
   [junit4]   2> QUERY TIME=374 javabin
   [junit4]   2> QUERY TIME=543 json

   [junit4]   2> INDEX TIME=9181
   [junit4]   2> QUERY TIME=812 xml
   [junit4]   2> QUERY TIME=382 javabin
   [junit4]   2> QUERY TIME=533 json

   [junit4]   2> INDEX TIME=9455
   [junit4]   2> QUERY TIME=863 xml
   [junit4]   2> QUERY TIME=395 javabin
   [junit4]   2> QUERY TIME=613 json

   [junit4]   2> INDEX TIME=9530
   [junit4]   2> QUERY TIME=863 xml
   [junit4]   2> QUERY TIME=385 javabin
   [junit4]   2> QUERY TIME=559 json

So the charseq stuff (or rather probably the extra work to 
auto-convert-to-string) did cause slower indexing speed.
There is enough noise that I don't think one can draw any conclusions about 
query speed.
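
To make the comparison easier to read, the five runs per configuration can be averaged. This is a minimal sketch that only does arithmetic on the numbers listed above (index times and javabin query times); the class and method names are made up for illustration, and five runs from a unit test are still far too few to treat the result as a proper benchmark.

```java
import java.util.Arrays;

class BenchmarkMeans {
  // Arithmetic mean of the reported timings.
  static double mean(int... xs) {
    return Arrays.stream(xs).average().orElse(Double.NaN);
  }

  public static void main(String[] args) {
    // Values copied from the junit4 output above.
    double indexMaster   = mean(10293, 10313, 10307, 10318, 10579);
    double indexPatch    = mean(8547, 9428, 9181, 9455, 9530);
    double javabinMaster = mean(415, 389, 410, 382, 386);
    double javabinPatch  = mean(396, 374, 382, 395, 385);

    System.out.println("INDEX TIME mean, master: " + indexMaster);
    System.out.println("INDEX TIME mean, patch:  " + indexPatch);
    System.out.println("javabin QUERY TIME mean, master: " + javabinMaster);
    System.out.println("javabin QUERY TIME mean, patch:  " + javabinPatch);
    System.out.println("index time reduction: "
        + (indexMaster - indexPatch) / indexMaster);
  }
}
```

The index means differ by roughly 11%, while the javabin query means differ by only about 10 out of roughly 390, which is within the spread of the individual runs.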






[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-08 Thread Yonik Seeley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990926#comment-16990926
 ] 

Yonik Seeley commented on SOLR-14013:
-

Those benchmarks look like they are testing different settings, not a 
before-vs-after patch scenario?




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-08 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990919#comment-16990919
 ] 

Ishan Chattopadhyaya commented on SOLR-14013:
-

bq.  Perf improvements should be backed by benchmarks.
FYI, 
https://issues.apache.org/jira/browse/SOLR-12885?focusedCommentId=16709641&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16709641




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-08 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990868#comment-16990868
 ] 

Adrien Grand commented on SOLR-14013:
-

I'm also a bit disappointed that SOLR-12885 changed Field's constructor from 
String to CharSequence silently.




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-08 Thread Jan Høydahl (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990802#comment-16990802
 ] 

Jan Høydahl commented on SOLR-14013:


+1
Perf improvements should be backed by benchmarks.




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-07 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990752#comment-16990752
 ] 

Ishan Chattopadhyaya commented on SOLR-14013:
-

bq. At this point I think the best thing to do is roll it back.
+1




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-07 Thread Yonik Seeley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990579#comment-16990579
 ] 

Yonik Seeley commented on SOLR-14013:
-

Even without the O(N^2) bug, which wouldn't be that hard to work around, this 
auto-check-and-convert on access is quite a trap (as seen above) that would 
keep biting devs forever.  It's also almost assuredly the case that after 
just handling the N^2 bug, things will be slower overall (often with more 
memory usage) than before this attempt to save utf-8 conversion.

At this point I think the best thing to do is roll it back.
I support the idea of trying to use more CharSequence... but it's hard in 
practice and we need to be careful.
The original fault lies with Java of course, which introduced CharSequence long 
after String and never fully converted to or adopted it ;-)

In the future, we should certainly benchmark any changes that are meant to 
improve performance.





[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-07 Thread Yonik Seeley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990572#comment-16990572
 ] 

Yonik Seeley commented on SOLR-14013:
-

OK, found the primary issue: it's an N^2 bug.
h3. Background: 
SOLR-12885 changed SolrDocument (among other things) to make it so
 that when used through certain interfaces, gets would auto-check-and-convert
 keys & values of type Utf8CharSequence to String. This happened in methods
 getFieldValueMap.get(), getFieldValueMap.keySet(), remove() and getValues()

h3. Issues: 
 SolrDocument does not auto-convert through many of its other methods,
 such as .get(), .put(), .keySet(), so depending on this anywhere is extremely
 fragile and will break if you change how you access SolrDocument.

SolrInputField has some of the same issues as SolrDocument. The mere act of
 doing a .get() on a multi-valued field (which should be O(1)) scans the entire
 list for CharSequence and, if it finds one, creates a new list and iterates over
 the whole thing again to convert each element.
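
The access pattern described above can be sketched with plain JDK types. This is not the SolrInputField code: {{ConvertOnAccessDemo}}, {{getValues}} and {{needsConversion}} are hypothetical names, and StringBuilder merely stands in for Utf8CharSequence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ConvertOnAccessDemo {
  // Hypothetical check: any CharSequence that is not already a String.
  static boolean needsConversion(Object o) {
    return o instanceof CharSequence && !(o instanceof String);
  }

  // Sketch of an accessor that checks and converts on every call.
  static List<Object> getValues(List<Object> stored) {
    boolean found = false;
    for (Object o : stored) {               // pass 1: scan for a non-String CharSequence
      if (needsConversion(o)) { found = true; break; }
    }
    if (!found) return stored;              // a full scan happened even though nothing changed
    List<Object> copy = new ArrayList<>(stored.size());
    for (Object o : stored) {               // pass 2: build a converted copy
      copy.add(needsConversion(o) ? o.toString() : o);
    }
    return copy;
  }

  public static void main(String[] args) {
    List<Object> stored = Arrays.asList("a", new StringBuilder("b"), "c");
    List<Object> first = getValues(stored);
    List<Object> second = getValues(stored);
    // The stored list is never updated, so every access repeats the work
    // and hands back a different list instance.
    System.out.println(first == second);
  }
}
```

Because the converted copy is never written back, a caller that reads the same field k times pays for k scans and up to k copies.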

h3. Client side indexing:
 And it's worse, because it looks like this auto-check-and-convert logic is even
 triggered when SolrJ is using JavaBinCodec to send documents... so even if
 some field values were Utf8CharSequence to begin with, they would still be
 converted to String before being converted back to utf8 by JavaBinCodec!
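
The wasted round trip can be shown with the JDK alone. A minimal sketch, assuming a hypothetical helper {{viaString}} that mimics "decode to String, then encode again"; it does not use JavaBinCodec or Utf8CharSequence.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8RoundTrip {
  // Decodes UTF-8 bytes to a String and immediately encodes them again.
  static byte[] viaString(byte[] utf8) {
    String s = new String(utf8, StandardCharsets.UTF_8); // allocates a String
    return s.getBytes(StandardCharsets.UTF_8);           // allocates a new byte[]
  }

  public static void main(String[] args) {
    byte[] original = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
    byte[] again = viaString(original);
    // Same content, but two extra allocations and two passes over the data.
    System.out.println(Arrays.equals(original, again));
    System.out.println(original == again);
  }
}
```

For valid UTF-8 input the bytes that come out are identical to the bytes that went in, so the conversion buys nothing when the value is only going to be serialized again.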

h3. Server side indexing:
 Then on the server side, JavaBinCodec parses String values as Utf8CharSequence,
 and we start going through the update processor chain.
 FieldMutatingUpdateProcessor (used in our _default config to remove blank
 values) asks each SolrInputField for its value, which again triggers iteration
 over the complete list. Also, for *any* string values (single valued too),
 FieldMutatingUpdateProcessor replaces those Utf8CharSequence objects with
 String objects (destroying any attempted re-serializing optimization).

Then comes NestedUpdateProcessorFactory, which triggers the
 auto-check-and-convert *twice*. Previously getValue() just returned a pointer,
 so the call would have been optimized away. Both lines below iterate
 over all values, *before* the actual iteration by the explicit "for" loop:
{code:java}
  boolean isSingleVal = !(field.getValue() instanceof Collection); 
  for (Object val : field) {
{code}
Then isAtomicUpdate(), and finally writeSolrInputDocument() to convert to
 JavaBin for the transaction log, both trigger the extra
 iterate-over-all-values with each inspection. If FieldMutatingUpdateProcessor
 hadn't overwritten Utf8CharSequence already, all of these accesses would have
 also triggered a new collection creation each time (and an additional
 iteration to create the new collection) for every multi-valued string field.

h3. Server side query:
 On the query side, we get a Lucene Document, and then convert it into a
 SolrDocument. BinaryResponseWriter uses MaskCharSeqSolrDocument which
 inherits from SolrDocument to do the auto-convert-on-access stuff more
 thoroughly.
{code:java}
 
for (IndexableField f : doc.getFields()) {
  final String fname = f.name();
  if (null == fieldNamesNeeded || fieldNamesNeeded.contains(fname)) {
    // Make sure multivalued fields are represented as lists
    Object existing = out.get(fname);
{code}
For multi-valued fields, what we get back from lucene is actually a flat list
 of all the values in the whole document. We need to collect all values
 with the same field into a list. So if there are 1000 values in a
 field, the loop executes over 1000 times. On each iteration we
 retrieve any existing value for the field by calling "out.get(fname)", which
 triggers the auto-convert-on-access, which scans all the values so far (on
 average 1000/2), and hence we have the O(N^2) behavior that the original
 poster reported.
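
The quadratic cost can be reproduced with a small model that counts how many values get inspected. This is a sketch under stated assumptions: {{ScanningDoc}} is a hypothetical stand-in whose {{get()}} touches every value stored so far, which imitates the auto-convert-on-access check but is not the MaskCharSeqSolrDocument implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class QuadraticScanDemo {
  // Same shape as the loop above: one get() per incoming value, then append.
  static long collect(int n) {
    ScanningDoc out = new ScanningDoc();
    for (int i = 0; i < n; i++) {
      out.get("f");            // scans everything collected so far
      out.add("f", "v" + i);
    }
    return out.inspections;
  }

  public static void main(String[] args) {
    for (int n : new int[] {10, 100, 1000}) {
      System.out.println(n + " values -> " + collect(n) + " inspections");
    }
  }
}

// Hypothetical document whose get() examines each stored value on every call.
class ScanningDoc {
  private final Map<String, List<Object>> fields = new HashMap<>();
  long inspections = 0;        // individual values examined by get()

  List<Object> get(String name) {
    List<Object> vals = fields.get(name);
    if (vals != null) {
      for (Object v : vals) {
        inspections++;         // a real implementation would type-check v here
      }
    }
    return vals;
  }

  void add(String name, Object value) {
    fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
  }
}
```

For n values the count is 0 + 1 + ... + (n-1) = n(n-1)/2, so multiplying the number of values by ten multiplies the work by roughly a hundred.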

h3. Other:
It took a really long time to review some of this code (and I've only reviewed
 some), often because of a lack of comments around non-obvious things. I thought
 there might be lifetime/sharing bugs with BytesBlock for example, until I
 realized that strings are appended in the block rather than placed at the
 start.

Same issue for FastInputStream.readDirectUtf8... since it looked like it
 was sharing the internal buffer, I thought there was a possible lifetime issue
 there. A single line comment in both of those cases would have saved me quite
 a bit.

Actually... looking at it again, there still may be a subtle sharing bug in
 this new FastInputStream.readDirectUtf8. I can't say I quite understand the
 logic for when you can't share the internal buffer.
{code:java}
 
  if (in != null || end < pos + len) return false;
{code}
You can only share the buffer when the bytes you want are right up at the end 
of the buffer?
 I'm not sure I understand the logic around that, but ChannelFastInputStream 
(used by
 TransactionLog) derives from FastInputStream 

[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-06 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990124#comment-16990124
 ] 

Noble Paul commented on SOLR-14013:
---

[~ysee...@gmail.com]
You can just avoid the call to
{{JavaBinCodec#setReadStringAsCharSeq()}} and get the old behavior





[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-06 Thread Yonik Seeley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989860#comment-16989860
 ] 

Yonik Seeley commented on SOLR-14013:
-

Just an update... I tried speeding things up by skipping most of the above, and 
it did get faster, but it's still much slower.  Still digging...




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-05 Thread Yonik Seeley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989121#comment-16989121
 ] 

Yonik Seeley commented on SOLR-14013:
-

I did some digging... the current code is certainly more complex than it used 
to be.

So for a multi-valued field, the values internally are now IndexableField (one for each value).
For each of those values, we call JavaBinCodec.writeVal
  which tries writeKnownTypes
  which tests if it's a member of ~10 primitive types via instanceof
  failing, then tests against 19 other types
  failing, falls back to object resolver which tries against 4 other types, finally matching it up to IndexableField
  the schema is used to look up the SchemaField based on name (2 more hash lookups)
  then we call DocsStreamer.getValue(), which does another hash lookup based on the .getClass()
  and then calls FieldType.toObject() which calls toExternal() which finally calls stringValue() on the IndexableField
  and now that we have our object, JavaBinCodec can try writeKnownTypes() again

And this is now the common case!
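
The shape of that dispatch can be modelled in a few lines. This is a simplified sketch, not JavaBinCodec: {{WriteValSketch}}, {{writeKnownType}} and {{StoredValue}} are hypothetical names, and the chain has three type tests instead of the roughly thirty described above.

```java
import java.util.function.Function;

class WriteValSketch {
  int typeChecks = 0;                        // instanceof tests performed so far
  final StringBuilder out = new StringBuilder();

  // Tries the directly supported types in order; false means "unknown type".
  boolean writeKnownType(Object v) {
    typeChecks++;
    if (v instanceof String) { out.append("str:").append(v).append(';'); return true; }
    typeChecks++;
    if (v instanceof Integer) { out.append("int:").append(v).append(';'); return true; }
    typeChecks++;
    if (v instanceof Long) { out.append("long:").append(v).append(';'); return true; }
    return false;
  }

  // On a miss, ask the resolver for a writable object and run the chain again.
  void writeVal(Object v, Function<Object, Object> resolver) {
    if (writeKnownType(v)) return;
    Object resolved = resolver.apply(v);
    if (!writeKnownType(resolved)) {
      throw new IllegalArgumentException("unsupported type: " + v.getClass());
    }
  }

  public static void main(String[] args) {
    Function<Object, Object> resolver =
        o -> (o instanceof StoredValue) ? ((StoredValue) o).text : o;

    WriteValSketch direct = new WriteValSketch();
    direct.writeVal("abc", resolver);
    System.out.println("plain String:  " + direct.typeChecks + " type check(s)");

    WriteValSketch wrapped = new WriteValSketch();
    wrapped.writeVal(new StoredValue("abc"), resolver);
    System.out.println("wrapped value: " + wrapped.typeChecks + " type check(s)");
  }
}

// Hypothetical stand-in for a stored field value such as IndexableField.
class StoredValue {
  final String text;
  StoredValue(String text) { this.text = text; }
}
```

A value that arrives wrapped has to fail every test in the chain before the resolver is consulted, and only then gets written on the second pass; when wrapped values are the common case, that miss path is paid for every value.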




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-05 Thread Ryan Rockenbaugh (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988961#comment-16988961
 ] 

Ryan Rockenbaugh commented on SOLR-14013:
-

Here are my initial steps to reproduce:

extract solr 7.6.0

start solr: 
{noformat}
bin/solr start -e cloud -noprompt{noformat}
index record: 
{noformat}
curl -X POST "http://localhost:8983/solr/gettingstarted/update/json/docs?commit=true" -T test.json{noformat}
query record:
{noformat}
curl "http://localhost:8983/solr/gettingstarted/select?q=id:1"{noformat}
Results are returned in milliseconds (10-20 ms for me)

Then do the same for solr 7.7.2:

extract solr 7.7.2

start solr: 
{noformat}
bin/solr start -e cloud -noprompt{noformat}
index record: 
{noformat}
curl -X POST "http://localhost:8983/solr/gettingstarted/update/json/docs?commit=true" -T test.json{noformat}
query record:
{noformat}
curl "http://localhost:8983/solr/gettingstarted/select?q=id:1"{noformat}
Results are returned in seconds (20-30 seconds for me)

I had the same behavior in 8.0.0, 8.1.0, 8.2.0, 8.3.0

 

Note: If I query a specific shard and set distrib=false, the javabin format is 
not used and return times are in milliseconds.
{noformat}
curl "http://localhost:8983/solr/gettingstarted_shard1_replica_n1/select?q=id:1&distrib=false"{noformat}
If I add wt=javabin:
{noformat}
curl "http://localhost:8983/solr/gettingstarted_shard1_replica_n1/select?q=id:1&distrib=false&wt=javabin"{noformat}
Results are returned in seconds (20-30 seconds for me)

 

 

 




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-04 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988249#comment-16988249
 ] 

Noble Paul commented on SOLR-14013:
---

[~rrockenbaugh] can you add relevant details here as well please




[jira] [Commented] (SOLR-14013) javabin performance regressions

2019-12-04 Thread Yonik Seeley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987986#comment-16987986
 ] 

Yonik Seeley commented on SOLR-14013:
-

The original reporter suspects that this may be caused by SOLR-12983, so I'll 
link it as such for now.
