[jira] [Commented] (SOLR-13381) Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet
[ https://issues.apache.org/jira/browse/SOLR-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174115#comment-17174115 ]

Tobias Ibounig commented on SOLR-13381:
---------------------------------------

Thanks [~erickerickson] for your expertise.

I had not heard before that there are cases where simply indexing all documents again is not enough. If this is a common issue, would it make sense to add some warnings to the documentation? For example here: [https://lucene.apache.org/solr/guide/8_6/reindexing.html]

Just one thing to add: this error appeared right after changing the Trie field to a Point field. The additional change to multi-valued was made while trying to find a workaround.

> Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet
> --------------------------------------------------------------------------------------
>
>                 Key: SOLR-13381
>                 URL: https://issues.apache.org/jira/browse/SOLR-13381
>             Project: Solr
>          Issue Type: Bug
>          Components: faceting
>    Affects Versions: 7.0, 7.6, 7.7, 7.7.1
>         Environment: solr, solrcloud
>            Reporter: Zhu JiaJun
>            Priority: Major
>         Attachments: SOLR-13381.patch
>
>
> Hey,
> I got an "Unexpected docvalues type SORTED_NUMERIC" exception when I perform
> a group facet on an IntPointField. Debugging into the source code, the cause is
> that internally the docvalue type for PointField is "NUMERIC" (single value)
> or "SORTED_NUMERIC" (multi value), while the TermGroupFacetCollector class
> requires that the facet field have a "SORTED" or "SORTED_SET" docvalue type:
> [https://github.com/apache/lucene-solr/blob/2480b74887eff01f729d62a57b415d772f947c91/lucene/grouping/src/java/org/apache/lucene/search/grouping/TermGroupFacetCollector.java#L313]
>
> When I change the schema for all int fields to TrieIntField, the group facet
> works, since internally the docvalue type for TrieField is SORTED (single
> value) or SORTED_SET (multi value).
> Given that "TrieField" is deprecated in Solr 7, please help with this
> grouping facet issue for PointField. I also commented on this issue in SOLR-7495.
>
> In addition, all occurrences of "${solr.tests.IntegerFieldType}" in the unit test
> files seem to be using "TrieIntField"; if changed to "IntPointField",
> some unit tests will fail, for example:
> [https://github.com/apache/lucene-solr/blob/3de0b3671998cc9bc723d10f1b31ce48cbd4fa64/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L417]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
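The mismatch described in the SOLR-13381 report can be sketched without Lucene on the classpath. Everything below is a hypothetical stand-in (the `DvType` enum and both helper methods are invented for illustration and are not Lucene or Solr classes); it only encodes what the report states: Point fields use NUMERIC or SORTED_NUMERIC docvalues, Trie fields use SORTED or SORTED_SET, and the group facet collector accepts only the latter two.

```java
import java.util.EnumSet;
import java.util.Set;

public class DocValuesMismatchSketch {

    // Hypothetical stand-in for the docvalues kinds named in the report.
    public enum DvType { NUMERIC, SORTED_NUMERIC, SORTED, SORTED_SET }

    // Types the report says the group facet collector requires.
    public static final Set<DvType> ACCEPTED = EnumSet.of(DvType.SORTED, DvType.SORTED_SET);

    // Mapping from field kind to docvalues type, as described in the report.
    public static DvType docValuesTypeFor(boolean pointField, boolean multiValued) {
        if (pointField) {
            return multiValued ? DvType.SORTED_NUMERIC : DvType.NUMERIC;
        }
        return multiValued ? DvType.SORTED_SET : DvType.SORTED;
    }

    // True when the field's docvalues type is one the collector accepts.
    public static boolean groupFacetSupported(boolean pointField, boolean multiValued) {
        return ACCEPTED.contains(docValuesTypeFor(pointField, multiValued));
    }

    public static void main(String[] args) {
        System.out.println("Point, single-valued: " + groupFacetSupported(true, false));
        System.out.println("Point, multi-valued:  " + groupFacetSupported(true, true));
        System.out.println("Trie, single-valued:  " + groupFacetSupported(false, false));
        System.out.println("Trie, multi-valued:   " + groupFacetSupported(false, true));
    }
}
```

Under these assumptions, switching a Point field between single-valued and multi-valued never yields an accepted type, which would be consistent with the multi-valued workaround attempt mentioned in the comment not helping.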
[jira] [Commented] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module
[ https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174114#comment-17174114 ]

Dawid Weiss commented on LUCENE-9448:
-------------------------------------

Looks good to me, Tomoko. Regarding Erick's question: I always thought Luke was still a "stand-alone" tool, so I suggested dependency assembly for a stand-alone tool (relative to its jar). If you want to avoid JAR duplication and put it in the distribution, then things become a bit more convoluted. Give me some time and I'll try to provide a patch for this; I have to think about how to do it myself.

> Make an equivalent to Ant's "run" target for Luke module
> --------------------------------------------------------
>
>                 Key: LUCENE-9448
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9448
>             Project: Lucene - Core
>          Issue Type: Sub-task
>            Reporter: Tomoko Uchida
>            Priority: Minor
>         Attachments: LUCENE-9448.patch
>
>
> With the Ant build, the Luke Swing app can be launched by "ant run" after checking
> out the source code. "ant run" allows developers to immediately see the
> effects of UI changes without creating the whole zip/tgz package (originally,
> it was suggested when integrating Luke into Lucene).
> In Gradle, a {{:lucene:luke:run}} task could be easily implemented with
> {{JavaExec}}, I think.
[jira] [Commented] (SOLR-14630) CloudSolrClient doesn't pick correct core when server contains more shards
[ https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174075#comment-17174075 ]

Jason Baik commented on SOLR-14630:
-----------------------------------

Also, [~idjurasevic] mentioned that indexing and querying still worked "correctly" for his system because the forwarding, although not desired, still allows the request to reach the correct replica eventually. However, in our case, we explicitly disable distributed processing because we depend on the _route_ param being able to pinpoint the replica, so our system lost correctness, too.

We're hoping that the fix for this issue can also be back-ported to Solr 7, if possible.

> CloudSolrClient doesn't pick correct core when server contains more shards
> --------------------------------------------------------------------------
>
>                 Key: SOLR-14630
>                 URL: https://issues.apache.org/jira/browse/SOLR-14630
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud, SolrJ
>    Affects Versions: 8.5.1, 8.5.2
>            Reporter: Ivan Djurasevic
>            Priority: Major
>         Attachments: 0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch
>
>
> Precondition: create a collection with 4 shards on one server.
> During search and update, the Solr cloud client picks the wrong core even when
> _route_ exists in the query params. In the BaseSolrClient class, method sendRequest:
>
> {code:java}
> sortedReplicas.forEach( replica -> {
>   if (seenNodes.add(replica.getNodeName())) {
>     theUrlList.add(ZkCoreNodeProps.getCoreUrl(replica.getBaseUrl(), joinedInputCollections));
>   }
> });
> {code}
>
> The code above adds the base URL (localhost:8983/solr/collection_name) to
> theUrlList; it doesn't create the core address (localhost:8983/solr/core_name). If
> we change the code to:
> {quote}
> {code:java}
> sortedReplicas.forEach(replica -> {
>   if (seenNodes.add(replica.getNodeName())) {
>     theUrlList.add(replica.getCoreUrl());
>   }
> });
> {code}
> {quote}
> the Solr cloud client picks the core which is defined by the _route_ parameter.
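To make the difference between the two URL forms in the SOLR-14630 description concrete, here is a minimal sketch. `ReplicaInfo`, `collectionUrl`, and `coreUrl` are hypothetical names invented for this illustration (they are not SolrJ's `Replica` or `ZkCoreNodeProps`), and the collection and core names are made up; the sketch only mirrors the two address shapes the report mentions, `.../solr/collection_name` versus `.../solr/core_name`.

```java
import java.util.List;

public class RouteUrlSketch {

    // Hypothetical, simplified replica descriptor; not SolrJ's Replica class.
    public static final class ReplicaInfo {
        final String baseUrl;   // e.g. http://localhost:8983/solr
        final String coreName;  // e.g. products_shard2_replica_n3

        public ReplicaInfo(String baseUrl, String coreName) {
            this.baseUrl = baseUrl;
            this.coreName = coreName;
        }
    }

    // Collection-level address: identical for every replica hosted on the same node.
    public static String collectionUrl(ReplicaInfo replica, String collection) {
        return replica.baseUrl + "/" + collection;
    }

    // Core-level address: unique per replica, so it can pinpoint one shard's core.
    public static String coreUrl(ReplicaInfo replica) {
        return replica.baseUrl + "/" + replica.coreName;
    }

    public static void main(String[] args) {
        // Two shards of a made-up collection hosted on a single node.
        List<ReplicaInfo> replicas = List.of(
            new ReplicaInfo("http://localhost:8983/solr", "products_shard1_replica_n1"),
            new ReplicaInfo("http://localhost:8983/solr", "products_shard2_replica_n3"));

        for (ReplicaInfo replica : replicas) {
            System.out.println(collectionUrl(replica, "products") + "  vs  " + coreUrl(replica));
        }
    }
}
```

With the collection-level form, both replicas produce the same address, so the receiving node has to decide which core serves the request; only the core-level form carries the shard choice in the URL itself.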
[jira] [Comment Edited] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module
[ https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173933#comment-17173933 ]

Tomoko Uchida edited comment on LUCENE-9448 at 8/10/20, 4:52 AM:
-----------------------------------------------------------------

I attached a PoC patch [^LUCENE-9448.patch]. The main class and classpaths are specified in the Manifest, so that Luke can be launched with the java command for testing.
{code:java}
$ ./gradlew lucene:luke:testAssemble
$ java -jar lucene/luke/build/libs/lucene-luke-9.0.0-SNAPSHOT.jar
{code}
There remains one TODO: how can we set the correct Class-Path for the distribution package, or maybe we should omit it for the distro for now? [~dweiss] what do you think?

We could emulate the directory structure of the final distribution package for all dependent jars when testing (so that the same Class-Path attribute can be used for UI testing and packaging), but it would mess up {{luke/build/}} ...


was (Author: tomoko uchida):
I attached a poc patch [^LUCENE-9448.patch]. Main class and classpaths are specified in the Manifest, so that Luke is launched by java command for testing.
{code:java}
$ ./gradlew lucene:luke:testAssemble
$ java -jar lucene/luke/build/libs/lucene-luke-9.0.0-SNAPSHOT.jar
{code}
There remains one TODO; how can we set correct Class-Path for distribution package, or maybe we should omit it for the distro for now? [~dweiss] what do you think?
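For readers unfamiliar with how a manifest makes a jar launchable via `java -jar`, the following is a small sketch using the JDK's `java.util.jar` API. The main class name and jar paths are placeholders chosen for illustration, not the values used in the attached patch, and the `build` helper is hypothetical.

```java
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class LukeManifestSketch {

    // Builds a manifest carrying Main-Class and Class-Path entries.
    public static Manifest build(String mainClass, String classPath) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version has to be present, otherwise nothing is written out.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.MAIN_CLASS, mainClass);
        // Class-Path entries are space-separated and resolved relative to the jar's location.
        main.put(Attributes.Name.CLASS_PATH, classPath);
        return manifest;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder values; the real ones would come from the build.
        Manifest manifest = build("org.example.luke.Main",
                                  "lib/lucene-core.jar lib/lucene-queries.jar");
        manifest.write(System.out);
    }
}
```

Because Class-Path entries are resolved relative to the jar, the same attribute value only works for both testing and the distribution if the dependent jars sit in the same relative locations in both layouts, which is the TODO raised in the comment.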
[jira] [Comment Edited] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module
[ https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173929#comment-17173929 ]

Tomoko Uchida edited comment on LUCENE-9448 at 8/10/20, 4:46 AM:
-----------------------------------------------------------------

Hi [~erickerickson]
as for SOLR-13412, I think nothing special would be needed to ship Luke with Solr: it's just an ordinary JAR file (like other Lucene modules) and all of its dependent jars may already be included in Solr. Once the correct class paths are set, Luke runs anywhere. (Please see these launch shell/bat scripts: [https://github.com/apache/lucene-solr/tree/master/lucene/luke/bin])

I'm not fully sure whether it would be helpful/useful for Solr users though...


was (Author: tomoko uchida):
Hi [~erickerickson]
as for SOLR-13412, I think there wouldn't be nothing special to ship Luke with Solr - it's just an ordinary JAR file (like other lucene modules) and its all dependent jars may be already included in Solr. Once correct class paths are set, Luke runs on everywhere else. (Please see this launch shell/bat: [https://github.com/apache/lucene-solr/tree/master/lucene/luke/bin])

I'm not fully sure if it'd be somewhat helpful/useful for Solr users though...
[jira] [Comment Edited] (SOLR-14630) CloudSolrClient doesn't pick correct core when server contains more shards
[ https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174072#comment-17174072 ]

Jason Baik edited comment on SOLR-14630 at 8/10/20, 4:44 AM:
-------------------------------------------------------------

Attached is a test case demonstrating the problem: [^0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch].

We're finding the same problem as [~idjurasevic] as we're upgrading from Solr 6 to 7. The _route_ param no longer works as expected.

The regression seems to have happened around L1061 in [https://github.com/apache/lucene-solr/commit/e001f352895c83652c3cf31e3c724d29a46bb721#diff-c8d54eacd46180b332c86c7ae448abaeR1065]. It made requests no longer be routed to a specific replica, but only to the node level.

This:
{code:java}
ZkCoreNodeProps.getCoreUrl(nodeProps.getStr(ZkStateReader.BASE_URL_PROP), joinedInputCollections)
{code}
Shouldn't have replaced:
{code:java}
url = coreNodeProps.getCoreUrl()
{code}
Although they sound like they do the same thing, they actually don't. Only the latter produces a replica-specific URL. I guess the name of the method getCoreUrl() is a little tricky.
[jira] [Updated] (SOLR-14630) CloudSolrClient doesn't pick correct core when server contains more shards
[ https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Baik updated SOLR-14630:
------------------------------
    Attachment: 0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch
[jira] [Updated] (SOLR-14630) CloudSolrClient doesn't pick correct core when server contains more shards
[ https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Baik updated SOLR-14630:
------------------------------
    Attachment:     (was: 0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch)
[jira] [Updated] (SOLR-14630) CloudSolrClient doesn't pick correct core when server contains more shards
[ https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Baik updated SOLR-14630:
------------------------------
    Attachment: 0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch
[GitHub] [lucene-solr] noblepaul commented on pull request #1730: SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes
noblepaul commented on pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671136587

   Planning to commit this soon

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[GitHub] [lucene-solr] noblepaul opened a new pull request #1730: SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes
noblepaul opened a new pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730

   new PR for #1694
[GitHub] [lucene-solr] noblepaul closed pull request #1694: SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes
noblepaul closed pull request #1694:
URL: https://github.com/apache/lucene-solr/pull/1694
[GitHub] [lucene-solr] noblepaul commented on pull request #1694: SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes
noblepaul commented on pull request #1694:
URL: https://github.com/apache/lucene-solr/pull/1694#issuecomment-671136445

   Opening another PR
[jira] [Updated] (LUCENE-9450) Taxonomy index should use DocValues not StoredFields
[ https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gautam Worah updated LUCENE-9450:
---------------------------------
    Description:
The taxonomy index that maps binning labels to ordinals was created before Lucene added BinaryDocValues.

I've attached a WIP patch (does not pass tests currently)

Issue suggested by [~mikemccand]

  was:
The taxonomy index that maps binning labels to ordinals was created before Lucene added BinaryDocValues.

I've attached a WIP patch (does not pass tests currently)


> Taxonomy index should use DocValues not StoredFields
> ----------------------------------------------------
>
>                 Key: LUCENE-9450
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9450
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: 8.5.2
>            Reporter: Gautam Worah
>            Priority: Minor
>              Labels: performance
>         Attachments: wip_taxonomy_patch
>
>
> The taxonomy index that maps binning labels to ordinals was created before
> Lucene added BinaryDocValues.
> I've attached a WIP patch (does not pass tests currently)
> Issue suggested by [~mikemccand]
[jira] [Created] (LUCENE-9450) Taxonomy index should use DocValues not StoredFields
Gautam Worah created LUCENE-9450:
------------------------------------

             Summary: Taxonomy index should use DocValues not StoredFields
                 Key: LUCENE-9450
                 URL: https://issues.apache.org/jira/browse/LUCENE-9450
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/facet
    Affects Versions: 8.5.2
            Reporter: Gautam Worah
         Attachments: wip_taxonomy_patch


The taxonomy index that maps binning labels to ordinals was created before Lucene added BinaryDocValues.

I've attached a WIP patch (does not pass tests currently)
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173986#comment-17173986 ]

Alexandre Rafalovitch commented on SOLR-14726:
----------------------------------------------

There are so many points in here that it is hard to answer them all together. But, since my opinion was asked, I feel that the steps proposed go in the opposite direction from the issue's title. Of course, I do not have as much exposure to the real users as other participants, so below are my strong opinions, but with a very large sack of salt. I am also going to comment in random order a bit.

(tl;dr of the part I disagree with) I think we should have a new coherent (example? production?) configset with a matching large and interesting example dataset that we use to demonstrate both classic and new features, and we should keep post and focus on explaining it better.

 # Removing the examples. We currently ship with 10-ish, and we are about to lose 5 of them (the DIH ones). I agree that what we have is confusing. However, removing them all is not the right answer. I feel that we should have one complex example dataset that we use in multiple ways to demonstrate lots of Solr features. Techproducts used to be that, but it is rather out of date and is not really internally consistent. Films was aiming to become that, but the source service has disappeared and it had its own little issues. I have been looking for a potential example for a while and the one that appeals to me most is [https://www.fakenamegenerator.com/] (which allows for bulk generation). This would give us multiple field types to demonstrate, advanced searches/analysis and multilingual aspects. Maybe we can have the dataset split into chunks, with each chunk using a different format Solr supports (similar to the films example).
 # Using curl - I am with Erick that the post tool is better than curl, and we worked for it to be more explicit in explaining what it is actually doing (with base URL vs destination logging). I think we should explain its output better so people know what to look for.
 # Postman/Insomnia are good in theory, but I heard Postman's company strategy made it less and less reliable as a tool to promote. I don't know about Insomnia. It would have been nice to have commands in some consistent way.
 # Google Colab and the output.serve_kernel_port_as_window trick look really interesting and potentially promising. Could that be used instead of Postman/Insomnia/curl?
 # V2 API - yes, totally
 # Docker? Maybe, no opinion; I use docker for other projects, it is nice. But I don't know if it is an official path for Solr distribution (just honestly, out of the loop on that)
 # Auth - good idea, I guess. As a first step, I don't know. But somewhere in the process.
 # First example using cloud - I was never super comfortable with that. To me, it feels like an ES-competition move, similar to the schemaless issue, with semi-expected negative consequences. I think the first example should be a super simple single Solr/Collection start. Then a further example should introduce cloud and the related schema evolution process differences. So, for example, the cloud example would take the same fake names dataset and then do graph analysis on it, or machine learning, or some other advanced features we have only in the cloud configuration. I am aware that there is a discussion to make everything cloud under the hood in future Solr, but I don't think that was actually decided, partially because for a lot of people a single Solr instance is more than sufficient.
 # Make the tutorial shorter? Part of the length is the cloud instructions, part of it is the screenshots, which are very useful. I don't think it is the length that matters, but the fact that the current text is a bit all over the place and not super coherent, including switching between different datasets and schemas without properly indicating it.
 # Configset - suggested to be removed as kind of a part of point 10. I think we need a new configset to go together with a new example (back to my point 1) that is coherent with the new Solr features. That's a big discussion on its own (e.g. do we need to demonstrate requestHandlers and initParams and overrides and ... all in one file?). We should also recognize that the documentation should no longer live in the configset, but be in the reference guide, especially for the managed-schema files where all comments get blown away on the first API change. Recognizing this would allow us to move commented-out defaults out of those files as well, making them shorter and easier to read.
 # I do not recognize anything in the original suggestion as specifically addressing "that should also be followed in production". That, to me, is a huge question, as none of the current configsets are 'production ready' and I don't see specific suggestions to strengthen it. Nor do I, myself, truly know what productio
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173971#comment-17173971 ]

Erick Erickson commented on SOLR-14726:
---------------------------------------

{quote}We have no example of a JSON document sent to the /update or /update/json/docs endpoint, even though this is what the main usecase is for most people
{quote}
Setting aside the quibble whether "most people" really do this or if you've seen a biased sample ;)...

Isn't that case served by "bin/solr do_the_right_thing some_file.csv" or "bin/solr do_the_right_thing somefile.json" where bin/solr, well, does the right thing based on the extension?

I admit I haven't thought this through very carefully, but if we're going for "as easy as possible", we shouldn't have to build into Solr dealing with a random file format. I can curl _anything_ to Solr. Are we going to send anything we don't recognize to ExtractingRequestHandler or intercept that on the client side and, say, send anything we don't recognize to a Tika server and send the results to Solr?

> Streamline getting started experience
> -------------------------------------
>
>                 Key: SOLR-14726
>                 URL: https://issues.apache.org/jira/browse/SOLR-14726
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Ishan Chattopadhyaya
>            Priority: Major
>              Labels: newdev
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and easy to follow. Also, it should reflect our
> best practices, that should also be followed in production. I have following
> suggestions:
> # Make it less verbose. It is too long. On my laptop, it required 35 page
> downs button presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that
> it can be easily executed from Google Colaboratory. Here's an example:
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial
> can be executed from those tools. Except for starting Solr, all other steps
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.),
> configsets from Solr's distribution (instead let the examples refer to them
> from github)
> # Remove the post tool from Solr, curl should suffice.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
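The "does the right thing based on the extension" idea in the comment above could be sketched roughly as follows. This is only an illustration in Python, not how bin/solr behaves today: the function name `pick_update_target`, the extension table, and the choice to fall back to `/update/extract` for unrecognized files are all assumptions made for the example. The handler paths and content types are the usual Solr ones, and `/update/extract` only works where the extraction handler is configured.

```python
from pathlib import Path
from typing import Tuple

# Hypothetical table: file extension -> (handler path, Content-Type).
# The mapping is an assumption for illustration, not bin/solr's actual logic.
KNOWN_FORMATS = {
    ".json": ("/update", "application/json"),
    ".csv": ("/update", "text/csv"),
    ".xml": ("/update", "application/xml"),
}

# Assumed fallback: hand anything unrecognized to the extracting handler.
FALLBACK = ("/update/extract", "application/octet-stream")


def pick_update_target(filename: str) -> Tuple[str, str]:
    """Return (handler path, content type) chosen only from the file extension."""
    suffix = Path(filename).suffix.lower()
    return KNOWN_FORMATS.get(suffix, FALLBACK)


if __name__ == "__main__":
    for name in ("some_file.csv", "somefile.json", "report.pdf"):
        print(name, "->", pick_update_target(name))
```

Keeping the decision in a small client-side table like this is one way to put "the smarts" outside Solr itself; whether the fallback should be ExtractingRequestHandler or a separate Tika server is exactly the open question raised in the comment.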
[jira] [Commented] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets
[ https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173970#comment-17173970 ] Ishan Chattopadhyaya commented on SOLR-13438: - Sure, but please don't wait too long. I'll find more issues for new devs, and this better be fixed soon :-) > DELETE collection should remove AUTOCREATED configsets > -- > > Key: SOLR-13438 > URL: https://issues.apache.org/jira/browse/SOLR-13438 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Major > Labels: newdev > > Current user experience: > # User creates a collection (without specifying configset), and makes some > schema/config changes. > # He's/She's not happy with how the changes turned out, so he/she deletes and > re-creates the collection. > # He/she observes that the previously made settings changes persist. If > he/she is only aware of Schema and Config APIs and not explicitly aware of > the concept of configsets, this will be un-intuitive for him/her. > Proposed: > DELETE collection should delete the configset if it has the prefix > ".AUTOCREATED" and that configset isn't being shared by any other collection. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
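The proposed rule can be written down as a small predicate. The sketch below is a hypothetical illustration in Python, not Solr code: the function name, the `collection_configsets` dictionary, and the example collection names are invented for the example. The issue text calls ".AUTOCREATED" a prefix; this sketch assumes the marker is appended to the collection name (for example `films.AUTOCREATED`), so it checks the end of the configset name.

```python
from typing import Dict

AUTOCREATED_MARKER = ".AUTOCREATED"  # marker named in the issue description


def should_delete_configset(
    configset: str,
    deleted_collection: str,
    collection_configsets: Dict[str, str],
) -> bool:
    """Decide whether deleting `deleted_collection` should also remove `configset`.

    `collection_configsets` maps each existing collection name to the
    configset it uses (a hypothetical stand-in for cluster state).
    """
    # Assumption: the marker sits at the end of the configset name.
    if not configset.endswith(AUTOCREATED_MARKER):
        return False
    # Keep the configset if any *other* collection still shares it.
    return not any(
        used == configset
        for name, used in collection_configsets.items()
        if name != deleted_collection
    )


if __name__ == "__main__":
    state = {"films": "films.AUTOCREATED", "books": "_default"}
    print(should_delete_configset("films.AUTOCREATED", "films", state))
```

Both conditions from the proposal appear as separate checks, so a configset that was created explicitly, or one that another collection still depends on, is left alone.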
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173969#comment-17173969 ]

Ishan Chattopadhyaya commented on SOLR-14726:
---------------------------------------------

bq. I also think a video from someone in the community should be made. Some people learn differently.

Absolutely +1 to an introductory beginner's video, possibly embedded in the ref guide!
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173968#comment-17173968 ]

Ishan Chattopadhyaya commented on SOLR-14726:
---------------------------------------------

bq. If we abstract this a bit, it becomes "let's make it super-simple to index any data whatsoever".

Absolutely, +1

bq. I'm not wild about replacing bin/solr with curl. WDYT about "bin/solr index_this_thing something"? Where "something" is a directory, a file, whatever. That would give us more control over what/how we send things to Solr.

For an example showing indexing of some documents residing in a directory, I agree that is better than a complex curl request. But, in most cases, we want to show the user how to index regular documents like JSON or CSV etc. We have no example of a JSON document sent to the /update or /update/json/docs endpoint, even though this is the main use case for most people. For those, I strongly favour using curl. It helps develop familiarity with indexing documents into Solr, even for production environments where a developer doesn't have bin/solr access.

By the way, expert users and committers are sometimes not aware of something that we need every regular user to be aware of! https://twitter.com/dep4b/status/1292191202624897025. No better place than the Solr tutorial, IMHO.
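For readers who have not seen the kind of request the comment above has in mind, here is a minimal sketch of one JSON document being posted to the /update/json/docs endpoint. It is written in Python rather than curl and only builds the request without sending it, so it runs without a Solr instance. The base URL `http://localhost:8983/solr`, the collection name `films`, the sample document, and the helper name `build_index_request` are assumptions for illustration.

```python
import json
import urllib.request


def build_index_request(base_url, collection, doc, commit=True):
    """Build (but do not send) a POST of one JSON document to /update/json/docs."""
    url = f"{base_url}/{collection}/update/json/docs"
    if commit:
        # Ask Solr to commit right away so the document becomes searchable.
        url += "?commit=true"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Assumed values: a local Solr on its default port and a collection named "films".
    req = build_index_request(
        "http://localhost:8983/solr", "films", {"id": "1", "name": "Example"}
    )
    print(req.get_method(), req.full_url)
    # To send it to a running Solr: urllib.request.urlopen(req)
```

The equivalent curl invocation would carry the same three pieces: the endpoint URL, a `Content-Type: application/json` header, and the document as the request body.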
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173965#comment-17173965 ]

Marcus Eagan commented on SOLR-14726:
-------------------------------------

I also think a video from someone in the community should be made. Some people learn differently.
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173964#comment-17173964 ]

Erick Erickson commented on SOLR-14726:
---------------------------------------

Hmmm. In the past, we provided sample data and configsets to give people a place to start.

Let's back up a bit. That approach was based on the model of Solr where there was a learning curve (to put it politely) to get over first before being able to do anything. We provided canned examples that we knew would work.

If we abstract this a bit, it becomes "let's make it super-simple to index any data whatsoever".

I'm not wild about replacing bin/solr with curl. WDYT about "bin/solr index_this_thing something"? Where "something" is a directory, a file, whatever. That would give us more control over what/how we send things to Solr.

I suppose it comes down to a question of where we want to put the smarts. We either put it in Solr or put it in bin/solr (or something). A curl command that took a directory seems "fraught". I'm not even sure bin/solr is the right place, but you see where this is heading. We could even use the Tika server idea to process docs on the client side and avoid ExtractingRequestHandler altogether.

Anyway, random thoughts for discussion
[jira] [Updated] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eagan updated SOLR-14726:
--------------------------------
    Labels: newdev  (was: )
[jira] [Commented] (SOLR-13438) DELETE collection should remove AUTOCREATED configsets
[ https://issues.apache.org/jira/browse/SOLR-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173962#comment-17173962 ]

Marcus Eagan commented on SOLR-13438:
-------------------------------------

[~ichattopadhyaya] I'll leave this issue open for a new dev to step in and get involved. If it is not fixed after a while, I will wrap it up. I personally have seen this issue cause many problems.
[jira] [Comment Edited] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173958#comment-17173958 ]

Ishan Chattopadhyaya edited comment on SOLR-14726 at 8/9/20, 7:41 PM:
----------------------------------------------------------------------

Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for broader consensus and discussion. Would appreciate all thoughts and inputs, esp. [~ctargett], [~erikhatcher], [~arafalov], [~noble.paul], [~atris], [~erickerickson], [~rcmuir], [~marcussorealheis], [~dsmiley], [~epugh].

was (Author: ichattopadhyaya):
Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for broader consensus and discussion. Would appreciate all thoughts and inputs, esp. [~ctargett], [~erikhatcher], [~arafalov], [~noble.paul], [~atris], [~erickerickson], [~rcmuir], [~marcussorealheis].
[jira] [Updated] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ishan Chattopadhyaya updated SOLR-14726:
----------------------------------------
    Description:
The reference guide Solr tutorial is here: https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html

It needs to be simplified and easy to follow. Also, it should reflect our best practices, that should also be followed in production. I have following suggestions:
# Make it less verbose. It is too long. On my laptop, it required 35 page downs button presses to get to the bottom of the page!
# First step of the tutorial should be to enable security (basic auth should suffice).
# {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
# All references of {{bin/solr post}} to be replaced with {{curl}}
# Convert all {{bin/solr create}} references to curl of collection creation commands
# Add docker based startup instructions.
# Create a Jupyter Notebook version of the entire tutorial, make it so that it can be easily executed from Google Colaboratory. Here's an example: https://twitter.com/TheSearchStack/status/1289703715981496320
# Provide downloadable Postman and Insomnia files so that the same tutorial can be executed from those tools. Except for starting Solr, all other steps should be possible to be carried out from those tools.
# Use V2 APIs everywhere in the tutorial
# Remove all example modes, sample data (films, tech products etc.), configsets from Solr's distribution (instead let the examples refer to them from github)
# Remove the post tool from Solr, curl should suffice.

  was:
The reference guide Solr tutorial is here: https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html

It needs to be simplified and easy to follow. Also, it should reflect our best practices, that should also be followed in production. I have following suggestions:
# Make it less verbose. It is too long.
# First step of the tutorial should be to enable security (basic auth should suffice).
# {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
# All references of {{bin/solr post}} to be replaced with {{curl}}
# Convert all {{bin/solr create}} references to curl of collection creation commands
# Add docker based startup instructions.
# Create a Jupyter Notebook version of the entire tutorial, make it so that it can be easily executed from Google Colaboratory. Here's an example: https://twitter.com/TheSearchStack/status/1289703715981496320
# Provide downloadable Postman and Insomnia files so that the same tutorial can be executed from those tools. Except for starting Solr, all other steps should be possible to be carried out from those tools.
# Use V2 APIs everywhere in the tutorial
# Remove all example modes, sample data (films, tech products etc.), configsets from Solr's distribution (instead let the examples refer to them from github)
# Remove the post tool from Solr, curl should suffice.
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173960#comment-17173960 ]

Marcus Eagan commented on SOLR-14726:
-------------------------------------

Looks like a great list to start.
[jira] [Comment Edited] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173958#comment-17173958 ]

Ishan Chattopadhyaya edited comment on SOLR-14726 at 8/9/20, 7:30 PM:
----------------------------------------------------------------------

Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for broader consensus and discussion. Would appreciate all thoughts and inputs, esp. [~ctargett], [~erikhatcher], [~arafalov], [~noble.paul], [~atris], [~erickerickson], [~rcmuir], [~marcussorealheis].

was (Author: ichattopadhyaya):
Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for broader consensus and discussion. Would appreciate all thoughts and inputs, esp. [~cassandra], [~erikhatcher], [~arafalov], [~noble.paul], [~atris], [~erickerickson], [~rcmuir], [~marcussorealheis].
[jira] [Comment Edited] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173958#comment-17173958 ] Ishan Chattopadhyaya edited comment on SOLR-14726 at 8/9/20, 7:28 PM: -- Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for broader consensus and discussion. Would appreciate all thoughts and inputs, esp. [~cassandra], [~erikhatcher], [~arafalov], [~noble.paul], [~atris], [~erickerickson], [~rcmuir], [~marcussorealheis]. was (Author: ichattopadhyaya): Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for broader consensus and discussion. Would appreciate all thoughts and inputs, esp. [~cassandra], [~erikhatcher], [~arafalov], [~noble.paul]. > Streamline getting started experience > - > > Key: SOLR-14726 > URL: https://issues.apache.org/jira/browse/SOLR-14726 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Priority: Major > > The reference guide Solr tutorial is here: > https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html > It needs to be simplified and easy to follow. Also, it should reflect our > best practices, that should also be followed in production. I have following > suggestions: > # Make it less verbose. It is too long. > # First step of the tutorial should be to enable security (basic auth should > suffice). > # {{./bin/solr start -e cloud}} <-- All references of -e should be removed. > # All references of {{bin/solr post}} to be replaced with {{curl}} > # Convert all {{bin/solr create}} references to curl of collection creation > commands > # Add docker based startup instructions. > # Create a Jupyter Notebook version of the entire tutorial, make it so that > it can be easily executed from Google Colaboratory. 
Here's an example: > https://twitter.com/TheSearchStack/status/1289703715981496320 > # Provide downloadable Postman and Insomnia files so that the same tutorial > can be executed from those tools. Except for starting Solr, all other steps > should be possible to be carried out from those tools. > # Use V2 APIs everywhere in the tutorial > # Remove all example modes, sample data (films, tech products etc.), > configsets from Solr's distribution (instead let the examples refer to them > from github) > # Remove the post tool from Solr, curl should suffice.
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173958#comment-17173958 ] Ishan Chattopadhyaya commented on SOLR-14726: - Some or most of these can be sub-tasks for this JIRA. We can use this JIRA for broader consensus and discussion. Would appreciate all thoughts and inputs, esp. [~cassandra], [~erikhatcher], [~arafalov], [~noble.paul]. > Streamline getting started experience > - > > Key: SOLR-14726 > URL: https://issues.apache.org/jira/browse/SOLR-14726 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Priority: Major > > The reference guide Solr tutorial is here: > https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html > It needs to be simplified and easy to follow. Also, it should reflect our > best practices, that should also be followed in production. I have following > suggestions: > # Make it less verbose. It is too long. > # First step of the tutorial should be to enable security (basic auth should > suffice). > # {{./bin/solr start -e cloud}} <-- All references of -e should be removed. > # All references of {{bin/solr post}} to be replaced with {{curl}} > # Convert all {{bin/solr create}} references to curl of collection creation > commands > # Add docker based startup instructions. > # Create a Jupyter Notebook version of the entire tutorial, make it so that > it can be easily executed from Google Colaboratory. Here's an example: > https://twitter.com/TheSearchStack/status/1289703715981496320 > # Provide downloadable Postman and Insomnia files so that the same tutorial > can be executed from those tools. Except for starting Solr, all other steps > should be possible to be carried out from those tools. 
> # Use V2 APIs everywhere in the tutorial > # Remove all example modes, sample data (films, tech products etc.), > configsets from Solr's distribution (instead let the examples refer to them > from github) > # Remove the post tool from Solr, curl should suffice.
[jira] [Created] (SOLR-14726) Streamline getting started experience
Ishan Chattopadhyaya created SOLR-14726:
---
Summary: Streamline getting started experience
Key: SOLR-14726
URL: https://issues.apache.org/jira/browse/SOLR-14726
Project: Solr
Issue Type: Task
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Ishan Chattopadhyaya

The reference guide Solr tutorial is here: https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
It needs to be simplified and made easy to follow. It should also reflect our best practices, which should be followed in production as well. I have the following suggestions:
# Make it less verbose. It is too long.
# First step of the tutorial should be to enable security (basic auth should suffice).
# {{./bin/solr start -e cloud}} <-- All references to -e should be removed.
# All references to {{bin/solr post}} should be replaced with {{curl}}.
# Convert all {{bin/solr create}} references to curl calls of the collection creation commands.
# Add Docker-based startup instructions.
# Create a Jupyter Notebook version of the entire tutorial, and make it easy to execute from Google Colaboratory. Here's an example: https://twitter.com/TheSearchStack/status/1289703715981496320
# Provide downloadable Postman and Insomnia files so that the same tutorial can be executed from those tools. Except for starting Solr, it should be possible to carry out all other steps from those tools.
# Use V2 APIs everywhere in the tutorial.
# Remove all example modes, sample data (films, tech products etc.), and configsets from Solr's distribution (instead let the examples refer to them from GitHub).
# Remove the post tool from Solr; curl should suffice.
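As a rough illustration of suggestions 2 and 5 (basic auth first, and an HTTP call in place of {{bin/solr create}}), the sketch below builds the request that an equivalent curl command would send. It is written in Python only to keep it self-contained. The host, port, collection name and credentials are placeholder assumptions, the helper function is hypothetical, and it uses the long-standing v1 Collections API CREATE action rather than the V2 API that the issue proposes.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_create_collection_request(base_url, name, num_shards, replication_factor,
                                    user, password):
    """Build (but do not send) a Collections API CREATE request with basic auth."""
    query = urlencode({
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    })
    # HTTP basic auth: base64 of "user:password" in the Authorization header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        f"{base_url}/solr/admin/collections?{query}",
        headers={"Authorization": f"Basic {token}"},
    )


# Placeholder values; none of them come from the issue itself.
request = build_create_collection_request(
    "http://localhost:8983", "films", 1, 1, "example_user", "example_password"
)
print(request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it to a running Solr node; the sketch stops short of that so it can be read without a server.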
[jira] [Commented] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module
[ https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173933#comment-17173933 ]
Tomoko Uchida commented on LUCENE-9448:
---
I attached a PoC patch [^LUCENE-9448.patch]. The main class and classpath are specified in the manifest, so Luke can be launched with the java command for testing.
{code:java}
$ ./gradlew lucene:luke:testAssemble
$ java -jar lucene/luke/build/libs/lucene-luke-9.0.0-SNAPSHOT.jar
{code}
One TODO remains: how can we set the correct Class-Path for the distribution package? Or should we omit it from the distro for now? [~dweiss] what do you think?
> Make an equivalent to Ant's "run" target for Luke module > > > Key: LUCENE-9448 > URL: https://issues.apache.org/jira/browse/LUCENE-9448 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Tomoko Uchida >Priority: Minor > Attachments: LUCENE-9448.patch > > > With Ant build, Luke Swing app can be launched by "ant run" after checking > out the source code. "ant run" allows developers to immediately see the > effects of UI changes without creating the whole zip/tgz package (originally, > it was suggested when integrating Luke to Lucene). > In Gradle, {{:lucene:luke:run}} task would be easily implemented with > {{JavaExec}}, I think.
[jira] [Commented] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module
[ https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173929#comment-17173929 ] Tomoko Uchida commented on LUCENE-9448: --- Hi [~erickerickson], as for SOLR-13412, I think nothing special would be needed to ship Luke with Solr - it's just an ordinary JAR file (like other Lucene modules), and all of its dependent jars may already be included in Solr. Once the correct class paths are set, Luke runs anywhere. (Please see these launch shell/bat scripts: [https://github.com/apache/lucene-solr/tree/master/lucene/luke/bin]) I'm not fully sure whether it would be helpful or useful for Solr users, though... > Make an equivalent to Ant's "run" target for Luke module > > > Key: LUCENE-9448 > URL: https://issues.apache.org/jira/browse/LUCENE-9448 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Tomoko Uchida >Priority: Minor > Attachments: LUCENE-9448.patch > > > With Ant build, Luke Swing app can be launched by "ant run" after checking > out the source code. "ant run" allows developers to immediately see the > effects of UI changes without creating the whole zip/tgz package (originally, > it was suggested when integrating Luke to Lucene). > In Gradle, {{:lucene:luke:run}} task would be easily implemented with > {{JavaExec}}, I think.
[jira] [Updated] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module
[ https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomoko Uchida updated LUCENE-9448: -- Attachment: LUCENE-9448.patch > Make an equivalent to Ant's "run" target for Luke module > > > Key: LUCENE-9448 > URL: https://issues.apache.org/jira/browse/LUCENE-9448 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Tomoko Uchida >Priority: Minor > Attachments: LUCENE-9448.patch > > > With Ant build, Luke Swing app can be launched by "ant run" after checking > out the source code. "ant run" allows developers to immediately see the > effects of UI changes without creating the whole zip/tgz package (originally, > it was suggested when integrating Luke to Lucene). > In Gradle, {{:lucene:luke:run}} task would be easily implemented with > {{JavaExec}}, I think.
[GitHub] [lucene-solr] epugh opened a new pull request #1729: SOLR-14725 update batchSize parameter docs for update() and delete() stream expressions
epugh opened a new pull request #1729: URL: https://github.com/apache/lucene-solr/pull/1729 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (SOLR-14725) Fix the update() stream docs
David Eric Pugh created SOLR-14725: -- Summary: Fix the update() stream docs Key: SOLR-14725 URL: https://issues.apache.org/jira/browse/SOLR-14725 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: documentation Affects Versions: 8.6 Reporter: David Eric Pugh Assignee: David Eric Pugh The Ref Guide says that batchSize is mandatory, but it is now optional with a "sane default" of 250. The other batchSize parameters should be checked as well.
[jira] [Commented] (SOLR-14581) Document the way auto commits work in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173841#comment-17173841 ] ASF subversion and git services commented on SOLR-14581: Commit f9c6737bcc24e75a47cb8b924f97260ba6d7b499 in lucene-solr's branch refs/heads/branch_8x from Eric Pugh [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f9c6737 ] SOLR-14581 Document the way auto commits work in SolrCloud (#1692) * provide some detail on eventually consistent code * small tweak to language * respond to comments and word smithing > Document the way auto commits work in SolrCloud > --- > > Key: SOLR-14581 > URL: https://issues.apache.org/jira/browse/SOLR-14581 > Project: Solr > Issue Type: Bug > Components: documentation, SolrCloud >Affects Versions: master (9.0) >Reporter: Bram Van Dam >Assignee: David Eric Pugh >Priority: Minor > Attachments: SOLR-14581.patch > > Time Spent: 1h > Remaining Estimate: 0h > > The documentation is unclear about how auto commits actually work in > SolrCloud. A mailing list reply by Erick Erickson proved to be enlightening. > Erick's reply verbatim: > {quote}Each node has its own timer that starts when it receives an update. > So in your situation, 60 seconds after any give replica gets it’s first > update, all documents that have been received in the interval will > be committed. > But note several things: > 1> commits will tend to cluster for a given shard. By that I mean > they’ll tend to happen within a few milliseconds of each other >‘cause it doesn’t take that long for an update to get from the >leader to all the followers. > 2> this is per replica. So if you host replicas from multiple collections >on some node, their commits have no relation to each other. And >say for some reason you transmit exactly one document that lands >on shard1. Further, say nodeA contains replicas for shard1 and shard2. >Only the replica for shard1 would commit. > 3> Solr promises eventual consistency. 
In this case, due to all the >timing variables it is not guaranteed that every replica of a single >shard has the same document available for search at any given time. >Say doc1 hits the leader at time T and a follower at time T+10ms. >Say doc2 hits the leader and gets indexed 5ms before the >commit is triggered, but for some reason it takes 15ms for it to get >to the follower. The leader will be able to search doc2, but the > follower won’t until 60 seconds later.{quote} > Perhaps the subject deserves a section of its own, but I'll attach a patch > which includes the gist of Erick's reply as a Tip in the "indexing in > SolrCloud"-section.
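The timing scenario in Erick's reply can be traced with a small simulation. The sketch below is a toy model in Python, not Solr code: the Replica class, its method names and the 60000 ms interval are assumptions made for illustration, and a commit that falls due in the same millisecond a document arrives is assumed to fire before that document is accepted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Replica:
    """Toy replica with its own auto-commit timer (illustrative, not Solr code)."""
    name: str
    max_time_ms: int = 60_000  # assumed auto-commit interval from the example
    pending: List[str] = field(default_factory=list)
    searchable: List[str] = field(default_factory=list)
    timer_started_ms: Optional[int] = None

    def advance(self, now_ms: int) -> None:
        """Commit pending docs if this replica's timer has expired by now_ms."""
        if self.timer_started_ms is None:
            return
        if now_ms >= self.timer_started_ms + self.max_time_ms:
            self.searchable.extend(self.pending)
            self.pending.clear()
            self.timer_started_ms = None

    def receive(self, doc_id: str, now_ms: int) -> None:
        """Accept an update; the first uncommitted update starts the timer."""
        self.advance(now_ms)  # assumption: a commit due now fires before the new doc
        if self.timer_started_ms is None:
            self.timer_started_ms = now_ms
        self.pending.append(doc_id)


def run_scenario():
    """Replay the doc1/doc2 timeline and record what each replica can search."""
    leader, follower = Replica("leader"), Replica("follower")
    leader.receive("doc1", 0)         # T
    follower.receive("doc1", 10)      # T+10ms
    leader.receive("doc2", 59_995)    # 5ms before the leader's commit
    follower.receive("doc2", 60_010)  # 15ms later, as the follower's commit fires
    snapshots = {}
    for now_ms in (60_020, 120_010):
        leader.advance(now_ms)
        follower.advance(now_ms)
        snapshots[now_ms] = (list(leader.searchable), list(follower.searchable))
    return snapshots


for now_ms, (on_leader, on_follower) in run_scenario().items():
    print(now_ms, "leader:", on_leader, "follower:", on_follower)
```

In this model doc2 is searchable on the leader shortly after the 60 second mark, while the follower only exposes it after its next commit, roughly 60 seconds later, which matches the eventual-consistency point in the quoted reply.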
[jira] [Updated] (SOLR-14581) Document the way auto commits work in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Eric Pugh updated SOLR-14581: --- Fix Version/s: 8.7 Resolution: Fixed Status: Resolved (was: Patch Available) Fixed via commit 35771c3cfe955c8631755b52cbb5de480285ded9 > Document the way auto commits work in SolrCloud > --- > > Key: SOLR-14581 > URL: https://issues.apache.org/jira/browse/SOLR-14581 > Project: Solr > Issue Type: Bug > Components: documentation, SolrCloud >Affects Versions: master (9.0) >Reporter: Bram Van Dam >Assignee: David Eric Pugh >Priority: Minor > Fix For: 8.7 > > Attachments: SOLR-14581.patch > > Time Spent: 1h > Remaining Estimate: 0h > > The documentation is unclear about how auto commits actually work in > SolrCloud. A mailing list reply by Erick Erickson proved to be enlightening. > Erick's reply verbatim: > {quote}Each node has its own timer that starts when it receives an update. > So in your situation, 60 seconds after any give replica gets it’s first > update, all documents that have been received in the interval will > be committed. > But note several things: > 1> commits will tend to cluster for a given shard. By that I mean > they’ll tend to happen within a few milliseconds of each other >‘cause it doesn’t take that long for an update to get from the >leader to all the followers. > 2> this is per replica. So if you host replicas from multiple collections >on some node, their commits have no relation to each other. And >say for some reason you transmit exactly one document that lands >on shard1. Further, say nodeA contains replicas for shard1 and shard2. >Only the replica for shard1 would commit. > 3> Solr promises eventual consistency. In this case, due to all the >timing variables it is not guaranteed that every replica of a single >shard has the same document available for search at any given time. >Say doc1 hits the leader at time T and a follower at time T+10ms. 
>Say doc2 hits the leader and gets indexed 5ms before the >commit is triggered, but for some reason it takes 15ms for it to get >to the follower. The leader will be able to search doc2, but the > follower won’t until 60 seconds later.{quote} > Perhaps the subject deserves a section of its own, but I'll attach a patch > which includes the gist of Erick's reply as a Tip in the "indexing in > SolrCloud"-section.
[jira] [Commented] (SOLR-14581) Document the way auto commits work in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173839#comment-17173839 ] ASF subversion and git services commented on SOLR-14581: Commit 35771c3cfe955c8631755b52cbb5de480285ded9 in lucene-solr's branch refs/heads/master from Eric Pugh [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=35771c3 ] SOLR-14581 Document the way auto commits work in SolrCloud (#1692) * provide some detail on eventually consistent code * small tweak to language * respond to comments and word smithing > Document the way auto commits work in SolrCloud > --- > > Key: SOLR-14581 > URL: https://issues.apache.org/jira/browse/SOLR-14581 > Project: Solr > Issue Type: Bug > Components: documentation, SolrCloud >Affects Versions: master (9.0) >Reporter: Bram Van Dam >Assignee: David Eric Pugh >Priority: Minor > Attachments: SOLR-14581.patch > > Time Spent: 1h > Remaining Estimate: 0h > > The documentation is unclear about how auto commits actually work in > SolrCloud. A mailing list reply by Erick Erickson proved to be enlightening. > Erick's reply verbatim: > {quote}Each node has its own timer that starts when it receives an update. > So in your situation, 60 seconds after any give replica gets it’s first > update, all documents that have been received in the interval will > be committed. > But note several things: > 1> commits will tend to cluster for a given shard. By that I mean > they’ll tend to happen within a few milliseconds of each other >‘cause it doesn’t take that long for an update to get from the >leader to all the followers. > 2> this is per replica. So if you host replicas from multiple collections >on some node, their commits have no relation to each other. And >say for some reason you transmit exactly one document that lands >on shard1. Further, say nodeA contains replicas for shard1 and shard2. >Only the replica for shard1 would commit. > 3> Solr promises eventual consistency. 
In this case, due to all the >timing variables it is not guaranteed that every replica of a single >shard has the same document available for search at any given time. >Say doc1 hits the leader at time T and a follower at time T+10ms. >Say doc2 hits the leader and gets indexed 5ms before the >commit is triggered, but for some reason it takes 15ms for it to get >to the follower. The leader will be able to search doc2, but the > follower won’t until 60 seconds later.{quote} > Perhaps the subject deserves a section of its own, but I'll attach a patch > which includes the gist of Erick's reply as a Tip in the "indexing in > SolrCloud"-section.
[GitHub] [lucene-solr] epugh merged pull request #1692: SOLR-14581 Document the way auto commits work in SolrCloud
epugh merged pull request #1692: URL: https://github.com/apache/lucene-solr/pull/1692