date:20200809



[ 
https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174075#comment-17174075
 ] 

Jason Baik commented on SOLR-14630:
---

Also, [~idjurasevic] mentioned that indexing and querying still worked 
"correctly" for his system because the forwarding, although undesired, still 
lets the request reach the correct replica eventually. In our case, however, 
we explicitly disable distributed processing because we depend on the _route_ 
param to pinpoint the replica, so our system lost correctness as well.

If possible, we hope the fix for this issue can also be back-ported to Solr 7.

> CloudSolrClient doesn't pick correct core when server contains more shards
> --
>
> Key: SOLR-14630
> URL: https://issues.apache.org/jira/browse/SOLR-14630
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud, SolrJ
> Affects Versions: 8.5.1, 8.5.2
> Reporter: Ivan Djurasevic
> Priority: Major
> Attachments: 0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch
>
>
> Precondition: create a collection with 4 shards on one server.
> During search and update, the Solr cloud client picks the wrong core even 
> when _route_ is present in the query params. In the BaseCloudSolrClient 
> class, method sendRequest:
>  
> {code:java}
> sortedReplicas.forEach(replica -> {
>   if (seenNodes.add(replica.getNodeName())) {
>     theUrlList.add(ZkCoreNodeProps.getCoreUrl(replica.getBaseUrl(), joinedInputCollections));
>   }
> });
> {code}
>  
> This code adds the collection URL (localhost:8983/solr/collection_name) to 
> theUrlList; it does not build the core address 
> (localhost:8983/solr/core_name). If we change the code to:
> {quote}
> {code:java}
> sortedReplicas.forEach(replica -> {
>   if (seenNodes.add(replica.getNodeName())) {
>     theUrlList.add(replica.getCoreUrl());
>   }
> });
> {code}
> {quote}
> the Solr cloud client picks the core selected by the _route_ parameter.
>  
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Comment Edited] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module



[ 
https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173933#comment-17173933
 ] 

Tomoko Uchida edited comment on LUCENE-9448 at 8/10/20, 4:52 AM:
-

I attached a PoC patch [^LUCENE-9448.patch].
The main class and classpath are specified in the manifest, so Luke can be 
launched with the java command for testing.
{code:java}
$ ./gradlew lucene:luke:testAssemble
$ java -jar lucene/luke/build/libs/lucene-luke-9.0.0-SNAPSHOT.jar
{code}
One TODO remains: how can we set the correct Class-Path for the distribution 
package, or should we omit it from the distro for now? [~dweiss] what do 
you think?

We could emulate the directory structure of the final distribution package for 
all dependent jars when testing (so that the same Class-Path attribute can be 
used for UI testing and packaging), but that would clutter {{luke/build/}} ...
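
For readers unfamiliar with the manifest attributes discussed above, here is a 
minimal sketch of how Main-Class and Class-Path end up in a JAR manifest, using 
only the JDK's java.util.jar API. It is not taken from the attached patch: the 
class name LukeManifestSketch, the jar paths, and the main class name 
org.apache.lucene.luke.app.desktop.LukeMain are assumptions for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Illustrative only; the actual patch configures this through Gradle. */
class LukeManifestSketch {

  /** Builds a manifest with Main-Class and a space-separated Class-Path. */
  static Manifest build(String mainClass, String... classPathEntries) {
    Manifest manifest = new Manifest();
    Attributes attrs = manifest.getMainAttributes();
    // Manifest.write() requires Manifest-Version to be set on the main attributes.
    attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
    attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
    // Class-Path entries are resolved relative to the location of the JAR itself.
    attrs.put(Attributes.Name.CLASS_PATH, String.join(" ", classPathEntries));
    return manifest;
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical main class and dependency paths.
    Manifest manifest = build(
        "org.apache.lucene.luke.app.desktop.LukeMain",
        "lib/lucene-core.jar",
        "lib/lucene-queryparser.jar");
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    manifest.write(out);
    System.out.print(out.toString("UTF-8"));
  }
}
```

Because Class-Path entries are relative to the JAR, a single value can only 
serve both the test build and the distribution if both use the same directory 
layout, which is the trade-off raised in the comment.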



> Make an equivalent to Ant's "run" target for Luke module
> 
>
> Key: LUCENE-9448
> URL: https://issues.apache.org/jira/browse/LUCENE-9448
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: LUCENE-9448.patch
>
>
> With Ant build, Luke Swing app can be launched by "ant run" after checking 
> out the source code. "ant run" allows developers to immediately see the 
> effects of UI changes without creating the whole zip/tgz package (originally, 
> it was suggested when integrating Luke to Lucene).
> In Gradle, {{:lucene:luke:run}} task would be easily implemented with 
> {{JavaExec}}, I think.




[jira] [Comment Edited] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module



[ 
https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173929#comment-17173929
 ] 

Tomoko Uchida edited comment on LUCENE-9448 at 8/10/20, 4:46 AM:
-

Hi [~erickerickson],
as for SOLR-13412, I don't think anything special is needed to ship Luke with 
Solr: it's just an ordinary JAR file (like other Lucene modules), and all of 
its dependent jars may already be included in Solr. Once the correct classpath 
is set, Luke runs anywhere. (Please see the launch shell/bat scripts: 
[https://github.com/apache/lucene-solr/tree/master/lucene/luke/bin])

I'm not sure how useful it would be for Solr users, though...



[jira] [Comment Edited] (SOLR-14630) CloudSolrClient doesn't pick correct core when server contains more shards



[ 
https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174072#comment-17174072
 ] 

Jason Baik edited comment on SOLR-14630 at 8/10/20, 4:44 AM:
-

Attached is a test case demonstrating the problem: 
[^0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch]. 

We're hitting the same problem as [~idjurasevic] while upgrading from Solr 6 
to 7. The _route_ param no longer works as expected. The regression seems to 
have been introduced around L1061 in 
[https://github.com/apache/lucene-solr/commit/e001f352895c83652c3cf31e3c724d29a46bb721#diff-c8d54eacd46180b332c86c7ae448abaeR1065].
Since that change, requests are routed only to the node level rather than to a 
specific replica. 

This:
{code:java}
ZkCoreNodeProps.getCoreUrl(nodeProps.getStr(ZkStateReader.BASE_URL_PROP), joinedInputCollections)
{code}
should not have replaced:
{code:java}
url = coreNodeProps.getCoreUrl()
{code}
Although the two calls sound equivalent, they are not: only the latter 
produces a replica-specific URL. The name getCoreUrl() is a little misleading.
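
To make the difference concrete, here is a self-contained sketch that models 
the two URL forms without depending on SolrJ. ReplicaInfo, collectionUrl, 
coreUrl and urls are hypothetical stand-ins, not the real Replica or 
ZkCoreNodeProps classes, and the string building is simplified (the real 
helper may, for instance, add a trailing slash).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative model only; these are not the SolrJ classes. */
class RouteUrlSketch {

  /** Hypothetical stand-in for one replica entry in the cluster state. */
  static final class ReplicaInfo {
    final String nodeName;
    final String baseUrl;
    final String coreName;

    ReplicaInfo(String nodeName, String baseUrl, String coreName) {
      this.nodeName = nodeName;
      this.baseUrl = baseUrl;
      this.coreName = coreName;
    }
  }

  /** Node-level form: base URL plus the collection name. */
  static String collectionUrl(String baseUrl, String collection) {
    return baseUrl + "/" + collection;
  }

  /** Replica-level form: base URL plus the core name. */
  static String coreUrl(ReplicaInfo replica) {
    return replica.baseUrl + "/" + replica.coreName;
  }

  /** Adds one URL per node, mirroring the seenNodes check in the reported loop. */
  static List<String> urls(List<ReplicaInfo> sortedReplicas, String collection,
                           boolean replicaSpecific) {
    Set<String> seenNodes = new LinkedHashSet<>();
    List<String> theUrlList = new ArrayList<>();
    for (ReplicaInfo replica : sortedReplicas) {
      if (seenNodes.add(replica.nodeName)) {
        theUrlList.add(replicaSpecific
            ? coreUrl(replica)
            : collectionUrl(replica.baseUrl, collection));
      }
    }
    return theUrlList;
  }

  public static void main(String[] args) {
    // Assume _route_ has already narrowed the candidates to one shard's replica.
    List<ReplicaInfo> routed = Collections.singletonList(new ReplicaInfo(
        "localhost:8983_solr", "http://localhost:8983/solr",
        "collection_name_shard3_replica_n1"));
    System.out.println(urls(routed, "collection_name", false));
    System.out.println(urls(routed, "collection_name", true));
  }
}
```

In this model the collection-level URL drops the shard selection that _route_ 
made, so the receiving node has to choose a core again; the core-level URL 
keeps the request pinned to the selected replica.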

 




[jira] [Updated] (SOLR-14630) CloudSolrClient doesn't pick correct core when server contains more shards



 [ 
https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Baik updated SOLR-14630:
--
Attachment: 0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch


[jira] [Updated] (SOLR-14630) CloudSolrClient doesn't pick correct core when server contains more shards



 [ 
https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Baik updated SOLR-14630:
--
Attachment: (was: 
0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch)


[jira] [Updated] (SOLR-14630) CloudSolrClient doesn't pick correct core when server contains more shards



 [ 
https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Baik updated SOLR-14630:
--
Attachment: 0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch


[GitHub] [lucene-solr] noblepaul commented on pull request #1730: SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes



noblepaul commented on pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671136587


   Planning to commit this soon



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [lucene-solr] noblepaul opened a new pull request #1730: SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes



noblepaul opened a new pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730


   new PR for #1694 




[GitHub] [lucene-solr] noblepaul closed pull request #1694: SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes



noblepaul closed pull request #1694:
URL: https://github.com/apache/lucene-solr/pull/1694


   




[GitHub] [lucene-solr] noblepaul commented on pull request #1694: SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes



noblepaul commented on pull request #1694:
URL: https://github.com/apache/lucene-solr/pull/1694#issuecomment-671136445


   Opening another PR




[jira] [Updated] (LUCENE-9450) Taxonomy index should use DocValues not StoredFields

2020-08-09 Thread Gautam Worah (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gautam Worah updated LUCENE-9450:
-
Description: 
The taxonomy index that maps binning labels to ordinals was created before 
Lucene added BinaryDocValues.

I've attached a WIP patch (does not pass tests currently)

Issue suggested by [~mikemccand]

  was:
The taxonomy index that maps binning labels to ordinals was created before 
Lucene added BinaryDocValues.

I've attached a WIP patch (does not pass tests currently)


> Taxonomy index should use DocValues not StoredFields
> 
>
> Key: LUCENE-9450
> URL: https://issues.apache.org/jira/browse/LUCENE-9450
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.5.2
>Reporter: Gautam Worah
>Priority: Minor
>  Labels: performance
> Attachments: wip_taxonomy_patch
>
>
> The taxonomy index that maps binning labels to ordinals was created before 
> Lucene added BinaryDocValues.
> I've attached a WIP patch (does not pass tests currently)
> Issue suggested by [~mikemccand]




[jira] [Created] (LUCENE-9450) Taxonomy index should use DocValues not StoredFields

2020-08-09 Thread Gautam Worah (Jira)

Gautam Worah created LUCENE-9450:


 Summary: Taxonomy index should use DocValues not StoredFields
 Key: LUCENE-9450
 URL: https://issues.apache.org/jira/browse/LUCENE-9450
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Affects Versions: 8.5.2
Reporter: Gautam Worah
 Attachments: wip_taxonomy_patch

The taxonomy index that maps binning labels to ordinals was created before 
Lucene added BinaryDocValues.

I've attached a WIP patch (does not pass tests currently)




[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-09 Thread Alexandre Rafalovitch (Jira)

[
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173986#comment-17173986
]

Alexandre Rafalovitch commented on SOLR-14726:
--

There are so many points here that it is hard to answer them all together.
But, since my opinion was asked, I feel that the proposed steps go in the
opposite direction from the title. Of course, I do not have as much exposure to
real users as other participants, so below are my strong opinions, to be taken
with a very large sack of salt. I am also going to comment in somewhat random
order.

(tl;dr of the part I disagree with) I think we should have a new coherent
(example? production?) configset with a matching large and interesting example
dataset that we use to demonstrate both classic and new features, and we should
keep the post tool and focus on explaining it better.
# Removing the examples. We currently ship with 10-ish, and we are about to lose
5 of them (the DIH ones). I agree that what we have is confusing. However,
removing them all is not the right answer. I feel that we should have one complex
example dataset that we use in multiple ways to demonstrate lots of Solr
features. Techproducts used to be that, but it is rather out of date and not
really internally consistent. Films was aiming to become that, but the source
service has disappeared and it had its own little issues. I have been looking
for a potential example for a while, and the one that appeals to me most is
[https://www.fakenamegenerator.com/] (which allows bulk generation). This
would give us multiple field types to demonstrate, advanced searches/analysis,
and multilingual aspects. Maybe we can split the dataset into chunks, with
each chunk using a different format Solr supports (similar to the films example).
# Using curl - I am with Erick that the post tool is better than curl, and we
worked to make it more explicit about what it is actually doing (with
base URL vs destination logging). I think we should explain its output better
so people know what to look for.
# Postman/Insomnia are good in theory, but I heard Postman's company strategy
made it less and less reliable as a tool to promote. I don't know about
Insomnia. It would have been nice to have commands in some consistent way.
# Google Colab and output.serve_kernel_port_as_window trick looks really
interesting and potentially promising. Could that be used instead of
Postman/Insomnia/curl?
# V2 API - yes, totally
# Docker? Maybe, no opinion; I use docker for other projects, it is nice. But
I don't know if it is an official path for Solr distribution (just honestly,
out of the loop on that)
# Auth - good idea, I guess. As a first step, I don't know. But somewhere in
the process.
# First example using cloud - I was never super comfortable with that. To me,
it feels like an ES-competition move, similar to the schemaless issue, with
semi-expected negative consequences. I think the first example should be a super
simple single Solr/collection start. Then a further example should introduce
cloud and the related differences in the schema evolution process. So, for
example, the cloud example would take the same fake names dataset and then do
graph analysis on it, or machine learning, or some other advanced feature we
have only in the cloud configuration. I am aware that there is a discussion
about making everything cloud under the hood in a future Solr, but I don't think
that was actually decided, partially because for a lot of people a single Solr
instance is more than sufficient.
# Make the tutorial shorter? Part of the length is the cloud instructions,
part of it is the screenshots, which is very useful. I don't think it is the
length that matters, but the fact that the current text is a bit all over the
place and not super coherent, including switching between different datasets
and schemas without properly indicating it.
# Configset - suggested to be removed as part of point 10. I think
we need a new configset to go together with the new example (back to my point 1)
that is coherent with the new Solr features. That's a big discussion on its own
(e.g. do we need to demonstrate requestHandlers and initParams and overrides
and ... all in one file?). We should also recognize that the documentation
should no longer live in the configset but in the reference guide,
especially for the managed-schema files, where all comments get blown away on
the first API change. Recognizing this would allow us to move commented-out
defaults out of those files as well, making them shorter and easier to read.
# I do not recognize anything in the original suggestion as specifically
addressing "that should also be followed in production". That, to me, is a huge
question, as none of the current configsets are 'production ready' and I don't
see specific suggestions to strengthen it. Nor do I, myself, truly know what
productio

[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-09 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173971#comment-17173971
 ] 

Erick Erickson commented on SOLR-14726:
---

{quote}We have no example of a JSON document sent to the /update or 
/update/json/docs endpoint, even though this is what the main usecase is for 
most people
{quote}
Setting aside the quibble whether "most people" really do this or if you've 
seen a biased sample ;)...

Isn't that case served by "bin/solr do_the_right_thing some_file.csv" or 
"bin/solr do_the_right_thing somefile.json" where bin/solr, well, does the 
right thing based on the extension?

I admit I haven't thought this through very carefully, but if we're going for 
"as easy as possible", we shouldn't have to build handling of random file 
formats into Solr. I can curl _anything_ to Solr. Are we going to send 
anything we don't recognize to ExtractingRequestHandler, or intercept it on 
the client side and, say, send it to a Tika server and then send the results 
to Solr?
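
As a rough illustration of the "do the right thing based on the extension" 
idea, here is a small sketch that maps a file name to an update endpoint. 
PostDispatchSketch and endpointFor are hypothetical names, and falling back to 
/update/extract for unknown types is only one of the options raised above, 
assumed here for the sake of the example; it is not how bin/solr behaves today.

```java
import java.util.Locale;

/** Illustrative only; not part of bin/solr or the post tool. */
class PostDispatchSketch {

  /** Picks an update endpoint from the file extension (case-insensitive). */
  static String endpointFor(String fileName) {
    int dot = fileName.lastIndexOf('.');
    String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    switch (ext) {
      case "json":
        return "/update/json/docs"; // arbitrary JSON documents
      case "csv":
        return "/update/csv";
      case "xml":
        return "/update";
      default:
        // Assumption: unrecognized formats go to the extracting handler.
        return "/update/extract";
    }
  }

  public static void main(String[] args) {
    System.out.println(endpointFor("some_file.csv"));
    System.out.println(endpointFor("somefile.json"));
    System.out.println(endpointFor("report.pdf"));
  }
}
```

The default branch is exactly where the open question sits: whether the 
fallback lives in Solr, in the client, or in a separate Tika server.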

> Streamline getting started experience
> -
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: newdev
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and made easy to follow. Also, it should reflect our 
> best practices, that should also be followed in production. I have the 
> following suggestions:
> # Make it less verbose. It is too long. On my laptop, it took 35 Page Down 
> presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should 
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation 
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that 
> it can be easily executed from Google Colaboratory. Here's an example: 
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial 
> can be executed from those tools. Except for starting Solr, all other steps 
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.), 
> configsets from Solr's distribution (instead let the examples refer to them 
> from github)
> # Remove the post tool from Solr, curl should suffice.




[jira] [Commented] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets



[ 
https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173970#comment-17173970
 ] 

Ishan Chattopadhyaya commented on SOLR-13438:
-

Sure, but please don't wait too long. I'll find more issues for new devs, and 
this better be fixed soon :-)

> DELETE collection should remove AUTOCREATED configsets
> --
>
> Key: SOLR-13438
> URL: https://issues.apache.org/jira/browse/SOLR-13438
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: newdev
>
> Current user experience:
> # User creates a collection (without specifying configset), and makes some 
> schema/config changes.
> # They are not happy with how the changes turned out, so they delete and 
> re-create the collection.
> # They observe that the previously made settings changes persist. If they are 
> only aware of the Schema and Config APIs and not explicitly aware of the 
> concept of configsets, this will be unintuitive for them.
> Proposed:
> DELETE collection should delete the configset if its name has the suffix 
> ".AUTOCREATED" and that configset isn't being shared by any other collection.
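
The proposed rule can be sketched as a small predicate. This is only a minimal
sketch, not Solr's implementation: the class name, the method, and the
collection-to-configset map are hypothetical, and it assumes auto-created
configsets are named <collection>.AUTOCREATED, so it checks the end of the name.

```java
import java.util.Map;

final class AutoCreatedConfigSetCleanup {
    // Marker taken from the issue text.
    static final String AUTOCREATED_MARKER = ".AUTOCREATED";

    /**
     * Decides whether deleting {@code collection} should also delete its configset.
     * {@code collectionToConfigSet} is a hypothetical snapshot of
     * collection name -> configset name, taken before the collection is removed.
     */
    static boolean shouldDeleteConfigSet(String collection,
                                         Map<String, String> collectionToConfigSet) {
        String configSet = collectionToConfigSet.get(collection);
        // Only auto-created configsets are candidates for removal.
        if (configSet == null || !configSet.endsWith(AUTOCREATED_MARKER)) {
            return false;
        }
        // Keep the configset if any other collection still uses it.
        for (Map.Entry<String, String> entry : collectionToConfigSet.entrySet()) {
            if (!entry.getKey().equals(collection) && configSet.equals(entry.getValue())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> state = Map.of(
            "films", "films.AUTOCREATED",
            "books", "shared_conf",
            "magazines", "shared_conf");
        System.out.println(shouldDeleteConfigSet("films", state)); // prints "true"
        System.out.println(shouldDeleteConfigSet("books", state)); // prints "false"
    }
}
```

Checking for sharing before deleting is what keeps an explicitly reused
configset from disappearing along with one of its collections.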




[jira] [Commented] (SOLR-14726) Streamline getting started experience



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173969#comment-17173969
 ] 

Ishan Chattopadhyaya commented on SOLR-14726:
-

bq. I also think a video from someone in the community should be made. Some 
people learn differently.

Absolutely +1 to an introductory beginner's video, possibly embedded in the ref 
guide!

> Streamline getting started experience
> -
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: newdev
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and made easy to follow. Also, it should reflect our 
> best practices, which should also be followed in production. I have the 
> following suggestions:
> # Make it less verbose. It is too long. On my laptop, it required 35 page-down 
> button presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should 
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation 
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that 
> it can be easily executed from Google Colaboratory. Here's an example: 
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial 
> can be executed from those tools. Except for starting Solr, all other steps 
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.), 
> configsets from Solr's distribution (instead let the examples refer to them 
> from github)
> # Remove the post tool from Solr, curl should suffice.




[jira] [Commented] (SOLR-14726) Streamline getting started experience



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173968#comment-17173968
 ] 

Ishan Chattopadhyaya commented on SOLR-14726:
-

bq. If we abstract this a bit, it becomes "let's make it super-simple to index 
any data whatsoever".
Absolutely, +1

bq. I'm not wild about replacing bin/solr with curl. WDYT about "bin/solr 
index_this_thing something"? Where "something" is a directory, a file, 
whatever. That would give us more control over what/how we send things to Solr.

For an example that indexes documents residing in a directory, I agree that is 
better than a complex curl request. But in most cases we want to show the user 
how to index regular documents such as JSON or CSV. We have no example of a 
JSON document sent to the /update or /update/json/docs endpoint, even though 
this is the main use case for most people. For those, I strongly favour using 
curl. It helps build familiarity with indexing documents into Solr, even in 
production environments where a developer doesn't have bin/solr access.
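
To make the missing example concrete, here is a minimal sketch of the HTTP
request such a step would send to /update/json/docs. It is an illustration
under assumptions, not tutorial text: the base URL, the "films" collection
name, the document fields, and the class name are all hypothetical. The
request is only built, not sent, so the sketch runs without a Solr node.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class JsonDocsUpdateRequest {
    /**
     * Builds a POST to <baseUrl>/<collection>/update/json/docs with commit=true.
     * Roughly what a curl call with -H 'Content-Type: application/json' and
     * --data-binary would send; all argument values are supplied by the caller.
     */
    static HttpRequest build(String baseUrl, String collection, String jsonDoc) {
        URI uri = URI.create(baseUrl + "/" + collection + "/update/json/docs?commit=true");
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonDoc, StandardCharsets.UTF_8))
            .build();
    }

    public static void main(String[] args) {
        // Hypothetical node address, collection, and document.
        HttpRequest request = build(
            "http://localhost:8983/solr", "films", "{\"id\":\"1\",\"title\":\"Example\"}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would be a matter of passing the request to
java.net.http.HttpClient#send against a running node.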

By the way, expert users and committers are sometimes not aware of something 
that we need every regular user to be aware of! 
https://twitter.com/dep4b/status/1292191202624897025. No better place than the 
solr tutorial, IMHO.


[jira] [Commented] (SOLR-14726) Streamline getting started experience



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173965#comment-17173965
 ] 

Marcus Eagan commented on SOLR-14726:
-

I also think a video from someone in the community should be made. Some people 
learn differently. 


[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-09 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173964#comment-17173964
 ] 

Erick Erickson commented on SOLR-14726:
---

Hmmm. In the past, we provided sample data and configsets to give people a 
place to start. Let's back up a bit. That approach was based on the model of 
Solr where there was a learning curve (to put it politely) to get over first 
before being able to do anything. We provided canned examples that we knew 
would work.

If we abstract this a bit, it becomes "let's make it super-simple to index any 
data whatsoever".

I'm not wild about replacing bin/solr with curl. WDYT about "bin/solr 
index_this_thing something"? Where "something" is a directory, a file, 
whatever. That would give us more control over what/how we send things to Solr.

I suppose it comes down to a question of where we want to put the smarts. We 
either put it in Solr or put it in bin/solr (or something). A curl command that 
took a directory seems "fraught".
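
One way to picture putting the smarts on the client side is a planner that
accepts a file or a directory and decides where each file would be sent. This
is a rough sketch only: the class, both methods, and the extension-to-handler
mapping are hypothetical choices made for illustration, not an existing
bin/solr feature.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

final class IndexThisThingPlanner {
    /** Picks an update handler path from the file extension (assumed mapping). */
    static String handlerFor(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        if (name.endsWith(".json")) {
            return "/update/json/docs";
        }
        if (name.endsWith(".csv")) {
            return "/update/csv";
        }
        // Everything else is treated as a rich document for extraction.
        return "/update/extract";
    }

    /** Walks "something" (a file or a directory) and plans one upload per regular file. */
    static Map<Path, String> plan(Path something) throws IOException {
        Map<Path, String> plan = new TreeMap<>();
        try (Stream<Path> paths = Files.walk(something)) {
            paths.filter(Files::isRegularFile)
                 .forEach(path -> plan.put(path, handlerFor(path)));
        }
        return plan;
    }
}
```

Because the plan is computed before anything is sent, the tool keeps control
over what is sent and how, which is the point being argued here.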

I'm not even sure bin/solr is the right place, but you see where this is 
heading. We could even use the Tika server idea to process docs on the client 
side and avoid ExtractingRequestHandler altogether.

Anyway, random thoughts for discussion


[jira] [Updated] (SOLR-14726) Streamline getting started experience



 [ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eagan updated SOLR-14726:

Labels: newdev  (was: )


[jira] [Commented] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets



[ 
https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173962#comment-17173962
 ] 

Marcus Eagan commented on SOLR-13438:
-

[~ichattopadhyaya] I'll leave this issue open for a new dev to step in and get 
involved. If it is not fixed after a while, I will wrap it up. I personally 
have seen this issue cause many problems.


[jira] [Comment Edited] (SOLR-14726) Streamline getting started experience



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173958#comment-17173958
 ] 

Ishan Chattopadhyaya edited comment on SOLR-14726 at 8/9/20, 7:41 PM:
--

Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for 
broader consensus and discussion.
Would appreciate all thoughts and inputs, esp. [~ctargett], [~erikhatcher], 
[~arafalov], [~noble.paul], [~atris], [~erickerickson], [~rcmuir], 
[~marcussorealheis], [~dsmiley], [~epugh].


was (Author: ichattopadhyaya):
Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for 
broader consensus and discussion.
Would appreciate all thoughts and inputs, esp. [~ctargett], [~erikhatcher], 
[~arafalov], [~noble.paul], [~atris], [~erickerickson], [~rcmuir], 
[~marcussorealheis].


[jira] [Updated] (SOLR-14726) Streamline getting started experience



 [ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-14726:

Description: 
The reference guide Solr tutorial is here:
https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html

It needs to be simplified and made easy to follow. Also, it should reflect our 
best practices, which should also be followed in production. I have the 
following suggestions:
# Make it less verbose. It is too long. On my laptop, it required 35 page-down 
button presses to get to the bottom of the page!
# First step of the tutorial should be to enable security (basic auth should 
suffice).
# {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
# All references of {{bin/solr post}} to be replaced with {{curl}}
# Convert all {{bin/solr create}} references to curl of collection creation 
commands
# Add docker based startup instructions.
# Create a Jupyter Notebook version of the entire tutorial, make it so that it 
can be easily executed from Google Colaboratory. Here's an example: 
https://twitter.com/TheSearchStack/status/1289703715981496320
# Provide downloadable Postman and Insomnia files so that the same tutorial can 
be executed from those tools. Except for starting Solr, all other steps should 
be possible to be carried out from those tools.
# Use V2 APIs everywhere in the tutorial
# Remove all example modes, sample data (films, tech products etc.), configsets 
from Solr's distribution (instead let the examples refer to them from github)
# Remove the post tool from Solr, curl should suffice.


  was:
The reference guide Solr tutorial is here:
https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html

It needs to be simplified and made easy to follow. Also, it should reflect our 
best practices, which should also be followed in production. I have the 
following suggestions:
# Make it less verbose. It is too long.
# First step of the tutorial should be to enable security (basic auth should 
suffice).
# {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
# All references of {{bin/solr post}} to be replaced with {{curl}}
# Convert all {{bin/solr create}} references to curl of collection creation 
commands
# Add docker based startup instructions.
# Create a Jupyter Notebook version of the entire tutorial, make it so that it 
can be easily executed from Google Colaboratory. Here's an example: 
https://twitter.com/TheSearchStack/status/1289703715981496320
# Provide downloadable Postman and Insomnia files so that the same tutorial can 
be executed from those tools. Except for starting Solr, all other steps should 
be possible to be carried out from those tools.
# Use V2 APIs everywhere in the tutorial
# Remove all example modes, sample data (films, tech products etc.), configsets 
from Solr's distribution (instead let the examples refer to them from github)
# Remove the post tool from Solr, curl should suffice.




[jira] [Commented] (SOLR-14726) Streamline getting started experience



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173960#comment-17173960
 ] 

Marcus Eagan commented on SOLR-14726:
-

Looks like a great list to start. 

> Streamline getting started experience
> -
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and made easy to follow. Also, it should reflect our 
> best practices, which should also be followed in production. I have the 
> following suggestions:
> # Make it less verbose. It is too long.
> # First step of the tutorial should be to enable security (basic auth should 
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation 
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that 
> it can be easily executed from Google Colaboratory. Here's an example: 
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial 
> can be executed from those tools. Except for starting Solr, all other steps 
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.), 
> configsets from Solr's distribution (instead let the examples refer to them 
> from github)
> # Remove the post tool from Solr, curl should suffice.




[jira] [Comment Edited] (SOLR-14726) Streamline getting started experience



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173958#comment-17173958
 ] 

Ishan Chattopadhyaya edited comment on SOLR-14726 at 8/9/20, 7:30 PM:
--

Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for 
broader consensus and discussion.
Would appreciate all thoughts and inputs, esp. [~ctargett], [~erikhatcher], 
[~arafalov], [~noble.paul], [~atris], [~erickerickson], [~rcmuir], 
[~marcussorealheis].


was (Author: ichattopadhyaya):
Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for 
broader consensus and discussion.
Would appreciate all thoughts and inputs, esp. [~cassandra], [~erikhatcher], 
[~arafalov], [~noble.paul], [~atris], [~erickerickson], [~rcmuir], 
[~marcussorealheis].


[jira] [Comment Edited] (SOLR-14726) Streamline getting started experience



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173958#comment-17173958
 ] 

Ishan Chattopadhyaya edited comment on SOLR-14726 at 8/9/20, 7:28 PM:
--

Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for 
broader consensus and discussion.
Would appreciate all thoughts and inputs, esp. [~cassandra], [~erikhatcher], 
[~arafalov], [~noble.paul], [~atris], [~erickerickson], [~rcmuir], 
[~marcussorealheis].


was (Author: ichattopadhyaya):
Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for 
broader consensus and discussion.
Would appreciate all thoughts and inputs, esp. [~cassandra], [~erikhatcher], 
[~arafalov], [~noble.paul].


[jira] [Commented] (SOLR-14726) Streamline getting started experience



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173958#comment-17173958
 ] 

Ishan Chattopadhyaya commented on SOLR-14726:
-

Some or most of these can be sub-tasks of this JIRA. We can use this JIRA for 
broader consensus and discussion.
I would appreciate all thoughts and input, especially from [~cassandra], 
[~erikhatcher], [~arafalov], and [~noble.paul].


[jira] [Created] (SOLR-14726) Streamline getting started experience

Ishan Chattopadhyaya created SOLR-14726:
---

 Summary: Streamline getting started experience
 Key: SOLR-14726
 URL: https://issues.apache.org/jira/browse/SOLR-14726
 Project: Solr
  Issue Type: Task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Ishan Chattopadhyaya


The reference guide Solr tutorial is here:
https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html

It needs to be simplified and made easy to follow. It should also reflect our 
best practices, which should be followed in production too. I have the 
following suggestions:
# Make it less verbose. It is too long.
# The first step of the tutorial should be to enable security (basic auth 
should suffice).
# {{./bin/solr start -e cloud}} <-- All references to -e should be removed.
# Replace all references to {{bin/solr post}} with {{curl}}.
# Convert all {{bin/solr create}} references to curl-based collection creation 
commands.
# Add Docker-based startup instructions.
# Create a Jupyter Notebook version of the entire tutorial that can be easily 
executed from Google Colaboratory. Here's an example: 
https://twitter.com/TheSearchStack/status/1289703715981496320
# Provide downloadable Postman and Insomnia files so that the same tutorial can 
be executed from those tools. Except for starting Solr, every step should be 
executable from those tools.
# Use V2 APIs everywhere in the tutorial.
# Remove all example modes, sample data (films, tech products, etc.), and 
configsets from Solr's distribution (instead, let the examples refer to them on 
GitHub).
# Remove the post tool from Solr; curl should suffice.





[jira] [Commented] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module



[ 
https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173933#comment-17173933
 ] 

Tomoko Uchida commented on LUCENE-9448:
---

I attached a PoC patch [^LUCENE-9448.patch].
The main class and class path are specified in the JAR manifest, so Luke can be 
launched with the plain java command for testing:
{code:bash}
$ ./gradlew lucene:luke:testAssemble
$ java -jar lucene/luke/build/libs/lucene-luke-9.0.0-SNAPSHOT.jar
{code}
One TODO remains: how can we set the correct Class-Path for the distribution 
package? Or should we omit it from the distribution for now? [~dweiss], what do 
you think?
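
For readers less familiar with how {{java -jar}} finds its entry point: the launcher reads the Main-Class and Class-Path attributes from the JAR's manifest. The following is a minimal sketch using the standard java.util.jar API. The class name ManifestSketch, the main class example.luke.Main, and the lib/*.jar entries are hypothetical placeholders, not the values used in the patch.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestSketch {

    /** Builds an in-memory manifest carrying the two attributes that "java -jar" relies on. */
    public static Manifest buildManifest(String mainClass, String classPath) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version has to be present, otherwise the main attributes are not written out.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        // Entry point that "java -jar" will invoke (hypothetical name).
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        // Space-separated list of dependency JARs, relative to the JAR itself (hypothetical entries).
        attrs.put(Attributes.Name.CLASS_PATH, classPath);
        return manifest;
    }

    public static void main(String[] args) throws IOException {
        Manifest manifest = buildManifest("example.luke.Main", "lib/dep-a.jar lib/dep-b.jar");
        // Prints the manifest in the same text format that is stored in META-INF/MANIFEST.MF.
        manifest.write(System.out);
    }
}
```

Class-Path entries are resolved relative to the location of the JAR that declares them, which is presumably why the layout of the distribution package matters for the open TODO.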

> Make an equivalent to Ant's "run" target for Luke module
> 
>
> Key: LUCENE-9448
> URL: https://issues.apache.org/jira/browse/LUCENE-9448
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: LUCENE-9448.patch
>
>
> With the Ant build, the Luke Swing app can be launched with "ant run" after 
> checking out the source code. "ant run" lets developers immediately see the 
> effects of UI changes without creating the whole zip/tgz package (this was 
> originally suggested when Luke was integrated into Lucene).
> In Gradle, a {{:lucene:luke:run}} task could easily be implemented with 
> {{JavaExec}}, I think.




[jira] [Commented] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module



[ 
https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173929#comment-17173929
 ] 

Tomoko Uchida commented on LUCENE-9448:
---

Hi [~erickerickson],
 as for SOLR-13412, I think there would be nothing special about shipping Luke 
with Solr: it's just an ordinary JAR file (like other Lucene modules), and all of 
its dependent JARs may already be included in Solr. Once the correct class paths 
are set, Luke runs anywhere. (Please see the launch shell/bat scripts: 
[https://github.com/apache/lucene-solr/tree/master/lucene/luke/bin])

I'm not sure whether it would be helpful for Solr users, though...


[jira] [Updated] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module



 [ 
https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomoko Uchida updated LUCENE-9448:
--
Attachment: LUCENE-9448.patch


[GitHub] [lucene-solr] epugh opened a new pull request #1729: SOLR-14725 update batchSize parameter docs for update() and delete() stream expressions

2020-08-09 Thread ASF subversion and git services (Jira)



epugh opened a new pull request #1729:
URL: https://github.com/apache/lucene-solr/pull/1729


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (SOLR-14725) Fix the update() stream docs

2020-08-09 Thread David Eric Pugh (Jira)

David Eric Pugh created SOLR-14725:
--

 Summary: Fix the update() stream docs
 Key: SOLR-14725
 URL: https://issues.apache.org/jira/browse/SOLR-14725
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: documentation
Affects Versions: 8.6
Reporter: David Eric Pugh
Assignee: David Eric Pugh


The Ref Guide says that batchSize is mandatory, but it is now optional with a 
"sane default" of 250. Also check the other batchSize parameters.




[jira] [Commented] (SOLR-14581) Document the way auto commits work in SolrCloud



[ 
https://issues.apache.org/jira/browse/SOLR-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173841#comment-17173841
 ] 

ASF subversion and git services commented on SOLR-14581:


Commit f9c6737bcc24e75a47cb8b924f97260ba6d7b499 in lucene-solr's branch 
refs/heads/branch_8x from Eric Pugh
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f9c6737 ]

SOLR-14581 Document the way auto commits work in SolrCloud (#1692)

* provide some detail on eventually consistent code

* small tweak to language

* respond to comments and word smithing

> Document the way auto commits work in SolrCloud
> ---
>
> Key: SOLR-14581
> URL: https://issues.apache.org/jira/browse/SOLR-14581
> Project: Solr
>  Issue Type: Bug
>  Components: documentation, SolrCloud
>Affects Versions: master (9.0)
>Reporter: Bram Van Dam
>Assignee: David Eric Pugh
>Priority: Minor
> Attachments: SOLR-14581.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The documentation is unclear about how auto commits actually work in 
> SolrCloud. A mailing list reply by Erick Erickson proved to be enlightening. 
> Erick's reply verbatim:
> {quote}Each node has its own timer that starts when it receives an update.
> So in your situation, 60 seconds after any give replica gets it’s first
> update, all documents that have been received in the interval will
> be committed.
> But note several things:
> 1> commits will tend to cluster for a given shard. By that I mean
>    they’ll tend to happen within a few milliseconds of each other
>    ‘cause it doesn’t take that long for an update to get from the
>    leader to all the followers.
> 2> this is per replica. So if you host replicas from multiple collections
>    on some node, their commits have no relation to each other. And
>    say for some reason you transmit exactly one document that lands
>    on shard1. Further, say nodeA contains replicas for shard1 and shard2.
>    Only the replica for shard1 would commit.
> 3> Solr promises eventual consistency. In this case, due to all the
>    timing variables it is not guaranteed that every replica of a single
>    shard has the same document available for search at any given time.
>    Say doc1 hits the leader at time T and a follower at time T+10ms.
>    Say doc2 hits the leader and gets indexed 5ms before the
>    commit is triggered, but for some reason it takes 15ms for it to get
>    to the follower. The leader will be able to search doc2, but the
>    follower won’t until 60 seconds later.{quote}
> Perhaps the subject deserves a section of its own, but I'll attach a patch 
> which includes the gist of Erick's reply as a Tip in the "indexing in 
> SolrCloud"-section.
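
The timing in point 3 of the quoted reply can be checked with a small simulation. This is a simplified model, not Solr code: it assumes each replica starts a timer on its first uncommitted update and commits everything it has received by the time the timer fires. The class and method names are hypothetical, and doc2's delay to the follower is set to 20 ms rather than the 15 ms in the reply, so that it clearly misses the follower's first commit instead of tying with it.

```java
import java.util.ArrayList;
import java.util.List;

public class AutoCommitSketch {

    /**
     * For one replica, returns the time at which each update becomes searchable.
     * arrivalsMs must be sorted ascending; maxTimeMs plays the role of the auto commit interval.
     */
    public static List<Long> visibleAt(long[] arrivalsMs, long maxTimeMs) {
        List<Long> visible = new ArrayList<>();
        int i = 0;
        while (i < arrivalsMs.length) {
            // The timer starts when the first not-yet-committed update arrives.
            long commitAt = arrivalsMs[i] + maxTimeMs;
            // Every update received before the timer fires is part of this commit.
            while (i < arrivalsMs.length && arrivalsMs[i] <= commitAt) {
                visible.add(commitAt);
                i++;
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        long maxTimeMs = 60_000;
        // Leader: doc1 at T=0, doc2 arrives 5 ms before the commit at T+60000.
        long[] leader = {0, 59_995};
        // Follower: doc1 at T+10, doc2 arrives 20 ms after it reached the leader (assumed delay).
        long[] follower = {10, 60_015};

        System.out.println("leader:   " + visibleAt(leader, maxTimeMs));   // leader:   [60000, 60000]
        System.out.println("follower: " + visibleAt(follower, maxTimeMs)); // follower: [60010, 120015]
    }
}
```

In this model doc2 is searchable on the leader at T+60000 ms but on the follower only at T+120015 ms, which matches the "60 seconds later" gap described in the reply.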




[jira] [Updated] (SOLR-14581) Document the way auto commits work in SolrCloud

2020-08-09 Thread David Eric Pugh (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Eric Pugh updated SOLR-14581:
---
Fix Version/s: 8.7
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Fixed via commit 35771c3cfe955c8631755b52cbb5de480285ded9


[jira] [Commented] (SOLR-14581) Document the way auto commits work in SolrCloud

2020-08-09 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173839#comment-17173839
 ] 

ASF subversion and git services commented on SOLR-14581:


Commit 35771c3cfe955c8631755b52cbb5de480285ded9 in lucene-solr's branch 
refs/heads/master from Eric Pugh
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=35771c3 ]

SOLR-14581 Document the way auto commits work in SolrCloud (#1692)

* provide some detail on eventually consistent code

* small tweak to language

* respond to comments and word smithing


[GitHub] [lucene-solr] epugh merged pull request #1692: SOLR-14581 Document the way auto commits work in SolrCloud