[GitHub] [lucene-solr] dsmiley commented on pull request #1592: SOLR-14579 First pass at dismantling Utils

2020-06-20 Thread GitBox


dsmiley commented on pull request #1592:
URL: https://github.com/apache/lucene-solr/pull/1592#issuecomment-647082412


   +1
   Can you see who added these and get their attention here for their opinion?






[jira] [Commented] (SOLR-14581) Document the way auto commits work in SolrCloud

2020-06-20 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141299#comment-17141299
 ] 

David Smiley commented on SOLR-14581:
-

Thanks for improving Solr's documentation!

For reference, your patch is simply the following:
bq. +TIP: Each node has its own auto commit timer which starts upon receipt of 
an update. While Solr promises eventual consistency, leaders will generally 
receive updates *before* replicas; it is therefore possible for replicas to lag 
behind somewhat.

> TIP:
If this is a tip, then... well, what is the advice you are offering? Perhaps 
"NOTE:" is better (see `about-this-guide.adoc`).

> Each node has its own auto commit timer
No, each *core* (replica) has one.  Nodes can host many cores which act 
independently.

I'd like to propose the following new language.  I thought about your approach 
of including some rationale but I think it's way more important to point out 
the consequences than the causes.

bq. +NOTE: Using auto soft commit or commitWithin requires the client app to 
embrace the realities of "eventual consistency". Solr will make documents 
searchable at _roughly_ the same time across NRT replicas of a collection, but 
there are no hard guarantees. Consequently, in rare cases, it's possible for a 
document to show up in one search and yet not appear in a subsequent search 
issued immediately afterward, when the second search is routed to a different 
replica. Also, documents added in a particular order (even in the same batch) 
might become searchable out of submission order when the collection is sharded.
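
For illustration only, a minimal SolrJ sketch of an update that relies on commitWithin; the collection name, field values, and the 10-second window are made up, and it assumes a SolrCloud collection with NRT replicas:
{code:java}
// Minimal SolrJ sketch (illustrative): index one document with commitWithin.
// The collection name "techproducts" and the field values are hypothetical.
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class CommitWithinSketch {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client =
             new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-1");
      new UpdateRequest()
          .add(doc)
          .setCommitWithin(10_000)          // ask for visibility within ~10 seconds
          .process(client, "techproducts");
      // Replicas open new searchers at roughly, not exactly, the same time, so an
      // immediate query may or may not see doc-1 depending on which replica serves it.
    }
  }
}
{code}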

CC [~erickerickson]

> Document the way auto commits work in SolrCloud
> ---
>
> Key: SOLR-14581
> URL: https://issues.apache.org/jira/browse/SOLR-14581
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation, SolrCloud
>Affects Versions: master (9.0)
>Reporter: Bram Van Dam
>Priority: Minor
> Attachments: SOLR-14581.patch
>
>
> The documentation is unclear about how auto commits actually work in 
> SolrCloud. A mailing list reply by Erick Erickson proved to be enlightening. 
> Erick's reply verbatim:
> {quote}Each node has its own timer that starts when it receives an update.
> So in your situation, 60 seconds after any give replica gets it’s first
> update, all documents that have been received in the interval will
> be committed.
> But note several things:
> 1> commits will tend to cluster for a given shard. By that I mean
> they’ll tend to happen within a few milliseconds of each other
>‘cause it doesn’t take that long for an update to get from the
>leader to all the followers.
> 2> this is per replica. So if you host replicas from multiple collections
>on some node, their commits have no relation to each other. And
>say for some reason you transmit exactly one document that lands
>on shard1. Further, say nodeA contains replicas for shard1 and shard2.
>Only the replica for shard1 would commit.
> 3> Solr promises eventual consistency. In this case, due to all the
>timing variables it is not guaranteed that every replica of a single
>shard has the same document available for search at any given time.
>Say doc1 hits the leader at time T and a follower at time T+10ms.
>Say doc2 hits the leader and gets indexed 5ms before the 
>commit is triggered, but for some reason it takes 15ms for it to get
>to the follower. The leader will be able to search doc2, but the
>   follower won’t until 60 seconds later.{quote}
> Perhaps the subject deserves a section of its own, but I'll attach a patch 
> which includes the gist of Erick's reply as a Tip in the "indexing in 
> SolrCloud"-section.






[jira] [Commented] (LUCENE-9286) FST arc.copyOf clones BitTables and this can lead to excessive memory use

2020-06-20 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141249#comment-17141249
 ] 

Tomoko Uchida commented on LUCENE-9286:
---

Thanks Robert and Mike for your comments,

bq.  To get the benchmark to cover JapaneseAnalyzer (and the other CJK 
analyzers too, maybe?) we'd need to incorporate some documents that include 
text in ideographic scripts.

I can work on preparing the corpus, but I'm unusually busy for a while here; 
maybe I can start on it next month... 

> FST arc.copyOf clones BitTables and this can lead to excessive memory use
> -
>
> Key: LUCENE-9286
> URL: https://issues.apache.org/jira/browse/LUCENE-9286
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 8.5
>Reporter: Dawid Weiss
>Assignee: Bruno Roustant
>Priority: Major
> Fix For: 8.6
>
> Attachments: screen-[1].png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> I see a dramatic increase in the amount of memory required for construction 
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed 
> for bit tables. I am pretty sure this didn't require so much memory before 
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?






[jira] [Updated] (SOLR-14404) CoreContainer level custom requesthandlers

2020-06-20 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14404:
--
Description: 
caveats:
 * The class should be annotated with {{org.apache.solr.api.EndPoint}}, which 
means only V2 APIs are supported
 * The path should have the prefix {{/api/plugin}}

add a plugin
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "add": {
  "name":"myplugin", "class": "full.ClassName"
  }
}' http://localhost:8983/api/cluster/plugins
{code}
add a plugin from a package
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "add": {
  "name":"myplugin", "class": "pkgName:full.ClassName" ,
  "version: "1.0"   
  }
}' http://localhost:8983/api/cluster/plugins
{code}
remove a plugin
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "remove": "myplugin"
}' http://localhost:8983/api/cluster/plugins
{code}
The configuration will be stored in {{clusterprops.json}} as
{code:java}
{
"plugins" : {
"myplugin" : {"class": "full.ClassName" }
}
}
{code}
example plugin
{code:java}
public class MyPlugin {

  private final CoreContainer coreContainer;

  public MyPlugin(CoreContainer coreContainer) {
this.coreContainer = coreContainer;
  }


  @EndPoint(path = "/myplugin/path1",
method = METHOD.GET,
permission = READ)
  public void call(SolrQueryRequest req, SolrQueryResponse rsp){
rsp.add("myplugin.version", "2.0");
  }
}
{code}
This plugin will be accessible on all nodes at {{/api/myplugin/path1}}. It's 
possible to add more methods at different paths. Ensure that all paths start 
with {{myplugin}}, because that is the name with which the plugin is registered; 
so {{/myplugin/path2}} and {{/myplugin/my/deeply/nested/path}} are all valid 
paths.

It's possible that the user chooses to register the plugin with a different 
name. In that case, use a template variable in the paths, e.g. 
{{$plugin-name/path1}}.

  was:
caveats:
 * The class should be annotated with  {{org.apache.solr.api.EndPoint}}. Which 
means only V2 APIs are supported
 * The path should have prefix {{/api/plugin}}

add a plugin
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "add": {
  "name":"myplugin", "class": "full.ClassName"
  }
}' http://localhost:8983/api/cluster/plugins
{code}

add a plugin from a package
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "add": {
  "name":"myplugin", "class": "pkgName:full.ClassName" ,
  "version: "1.0"   
  }
}' http://localhost:8983/api/cluster/plugins
{code}


remove a plugin
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "remove": "myplugin"
}' http://localhost:8983/api/cluster/plugins
{code}
The configuration will be stored in the {{clusterprops.json}}
 as
{code:java}
{
"plugins" : {
"myplugin" : {"class": "full.ClassName" }
}
}
{code}

example plugin

{code:java}
@EndPoint(path = "/plugin/my/path",
method = METHOD.GET,
permission = READ)
public class MyPlugin {

  private final CoreContainer coreContainer;

  public MyPlugin(CoreContainer coreContainer) {
this.coreContainer = coreContainer;
  }

  @Command
  public void call(SolrQueryRequest req, SolrQueryResponse rsp){
rsp.add("myplugin.version", "2.0");
  }
}
{code}

This  plugin will be accessible on all nodes at {{/api/plugin/my/path}}


> CoreContainer level custom requesthandlers
> --
>
> Key: SOLR-14404
> URL: https://issues.apache.org/jira/browse/SOLR-14404
> Project: Solr
>  Issue Type: New Feature
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> caveats:
>  * The class should be annotated with  {{org.apache.solr.api.EndPoint}}. 
> Which means only V2 APIs are supported
>  * The path should have prefix {{/api/plugin}}
> add a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin", "class": "full.ClassName"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> add a plugin from a package
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin", "class": "pkgName:full.ClassName" ,
>   "version: "1.0"   
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> remove a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "remove": "myplugin"
> }' http://localhost:8983/api/cluster/plugins
> {code}
> The configuration will be stored in the {{clusterprops.json}}
>  as
> {code:java}
> {
> "plugins" : {
> "myplugin" : {"class": "full.ClassName" }
> }
> }
> {code}
> example plugin
> {code:java}
> public class MyPlugin {
>   private final 

[jira] [Updated] (SOLR-14404) CoreContainer level custom requesthandlers

2020-06-20 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14404:
--
Description: 
caveats:
 * The class should be annotated with {{org.apache.solr.api.EndPoint}}, which 
means only V2 APIs are supported
 * The path should have the prefix {{/api/plugin}}

add a plugin
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "add": {
  "name":"myplugin", "class": "full.ClassName"
  }
}' http://localhost:8983/api/cluster/plugins
{code}
add a plugin from a package
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "add": {
  "name":"myplugin", "class": "pkgName:full.ClassName" ,
  "version: "1.0"   
  }
}' http://localhost:8983/api/cluster/plugins
{code}
remove a plugin
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "remove": "myplugin"
}' http://localhost:8983/api/cluster/plugins
{code}
The configuration will be stored in {{clusterprops.json}} as
{code:java}
{
"plugins" : {
"myplugin" : {"class": "full.ClassName" }
}
}
{code}
example plugin
{code:java}
public class MyPlugin {

  private final CoreContainer coreContainer;

  public MyPlugin(CoreContainer coreContainer) {
this.coreContainer = coreContainer;
  }


  @EndPoint(path = "/myplugin/path1",
method = METHOD.GET,
permission = READ)
  public void call(SolrQueryRequest req, SolrQueryResponse rsp){
rsp.add("myplugin.version", "2.0");
  }
}
{code}
This plugin will be accessible on all nodes at {{/api/myplugin/path1}}. It's 
possible to add more methods at different paths. Ensure that all paths start 
with {{myplugin}}, because that is the name with which the plugin is registered; 
so {{/myplugin/path2}} and {{/myplugin/my/deeply/nested/path}} are all valid 
paths.

It's possible that the user chooses to register the plugin with a different 
name. In that case, use a template variable in the paths, e.g. 
{{$plugin-name/path1}}.
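
For illustration, a hypothetical sketch of the same class with a second {{@EndPoint}} method added (the extra path and response key are made up); imports and the METHOD/READ constants are assumed to be the same as in the example above:
{code:java}
public class MyPlugin {

  private final CoreContainer coreContainer;

  public MyPlugin(CoreContainer coreContainer) {
    this.coreContainer = coreContainer;
  }

  @EndPoint(path = "/myplugin/path1",
      method = METHOD.GET,
      permission = READ)
  public void call(SolrQueryRequest req, SolrQueryResponse rsp) {
    rsp.add("myplugin.version", "2.0");
  }

  // Hypothetical second method; the path again starts with "myplugin",
  // the name under which the plugin is registered, so it is served at
  // /api/myplugin/path2 on every node.
  @EndPoint(path = "/myplugin/path2",
      method = METHOD.GET,
      permission = READ)
  public void path2(SolrQueryRequest req, SolrQueryResponse rsp) {
    rsp.add("myplugin.path", "path2");
  }
}
{code}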

  was:
caveats:
 * The class should be annotated with  {{org.apache.solr.api.EndPoint}}. Which 
means only V2 APIs are supported
 * The path should have prefix {{/api/plugin}}

add a plugin
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "add": {
  "name":"myplugin", "class": "full.ClassName"
  }
}' http://localhost:8983/api/cluster/plugins
{code}
add a plugin from a package
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "add": {
  "name":"myplugin", "class": "pkgName:full.ClassName" ,
  "version: "1.0"   
  }
}' http://localhost:8983/api/cluster/plugins
{code}
remove a plugin
{code:java}
curl -X POST -H 'Content-type:application/json' --data-binary '
{
  "remove": "myplugin"
}' http://localhost:8983/api/cluster/plugins
{code}
The configuration will be stored in the {{clusterprops.json}}
 as
{code:java}
{
"plugins" : {
"myplugin" : {"class": "full.ClassName" }
}
}
{code}
example plugin
{code:java}
public class MyPlugin {

  private final CoreContainer coreContainer;

  public MyPlugin(CoreContainer coreContainer) {
this.coreContainer = coreContainer;
  }


  @EndPoint(path = "/myplugin/path1",
method = METHOD.GET,
permission = READ)
  public void call(SolrQueryRequest req, SolrQueryResponse rsp){
rsp.add("myplugin.version", "2.0");
  }
}
{code}
This plugin will be accessible on all nodes at {{/api/myplugin/path1}}. It's 
possible to add more methods at different paths. Ensure that all paths start 
with {{myplugin}} because that is the name in which the plugin is registered 
with. So {{/myplugin/path2}} , {{/myplugin/my/deeply/nested/path}} are all 
valid paths. 

It's possible that the suer chooses to register the plugin with a different 
name. In that case , use a template variable as follows in paths 
{{$plugin-name/path1}}


> CoreContainer level custom requesthandlers
> --
>
> Key: SOLR-14404
> URL: https://issues.apache.org/jira/browse/SOLR-14404
> Project: Solr
>  Issue Type: New Feature
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> caveats:
>  * The class should be annotated with  {{org.apache.solr.api.EndPoint}}. 
> Which means only V2 APIs are supported
>  * The path should have prefix {{/api/plugin}}
> add a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin", "class": "full.ClassName"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> add a plugin from a package
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin", "class": "pkgName:full.ClassName" ,
>   "version: "1.0"   
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> remove a plugin
> 

[jira] [Comment Edited] (LUCENE-9394) Fix or suppress compile-time warnings

2020-06-20 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141165#comment-17141165
 ] 

Erick Erickson edited comment on LUCENE-9394 at 6/20/20, 9:20 PM:
--

Thanks, I propose that I just add the SuppressWarnings to the 8x code line. My 
reasoning is that, despite the effort I've been putting in to get clean 
compiles, what I've done hasn't actually _fixed_ anything; it has only laid the 
groundwork for things not getting worse (8,000 warnings in Solr, sheesh!).

Given that, it's hard for me to justify any changes affecting back-compat for a 
minor release, even if it's not that much of an inconvenience. Add to that that 
I imagine we'll be cutting 9.0 in the not too distant future and there'll be a 
limited amount of back-port pain.

I could be persuaded otherwise, but that's my starting position...


was (Author: erickerickson):
Thanks, I propose that I just add the SuppressWarnings to the 8x code line. My 
reasoning is that, despite the effort I've been putting in to get clean 
compiles, what I've done hasn't actually _fixed_ anything. It has laid the 
groundwork for not getting worse is all (8,000 warnings in Solr sheesh!).

Given that, it's hard for me to justify any changes affecting back-compat for a 
minor release, even if it's not that much of an inconvanience. Add to that that 
I imagine we'll be cutting 9.0 in the not too distant future and there'll be a 
limited amount of back-port pain.

I could be persuaded otherwise, but that's my starting position...

> Fix or suppress compile-time warnings
> -
>
> Key: LUCENE-9394
> URL: https://issues.apache.org/jira/browse/LUCENE-9394
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Assignee: Michael Sokolov
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This is a spinoff from [~erickerickson]'s efforts over in  SOLR-10778 
> The goal is a warning-free compilation, followed by enforcement of build 
> failure on warnings, with the idea of suppressing innocuous warnings to the 
> extent that the remaining warnings be treated as build failure.






[jira] [Updated] (SOLR-14584) solr.in.cmd and solr.in.sh still reference obsolete jks files

2020-06-20 Thread Aren Cambre (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aren Cambre updated SOLR-14584:
---
Summary: solr.in.cmd and solr.in.sh still reference obsolete jks files  
(was: solr.in.cmd and solr.in.sh still reference jks files)

> solr.in.cmd and solr.in.sh still reference obsolete jks files
> -
>
> Key: SOLR-14584
> URL: https://issues.apache.org/jira/browse/SOLR-14584
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Server
>Affects Versions: 8.5.2
>Reporter: Aren Cambre
>Priority: Major
>  Labels: easyfix
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When following the Enabling SSL documentation 
> ([https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html]), the end 
> result is an error if you miss a critical detail: that you need to change the 
> *.jks* file extension in two lines to *.p12*.
> Please update the default *bin/solr.in.cmd* and *bin/solr.in.sh* files to 
> reference *p12* files. It appears that the JKS format has been left behind, 
> so there's no reason to reference those by default.






[jira] [Updated] (SOLR-14584) solr.in.cmd and solr.in.sh still reference obsolete jks files

2020-06-20 Thread Aren Cambre (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aren Cambre updated SOLR-14584:
---
Description: 
When following the Enabling SSL documentation 
([https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html]), the end result 
is an error if you miss a critical detail: that you need to change the *.jks* 
file extension in two lines to *.p12*.

Please update the default *bin/solr.in.cmd* and *bin/solr.in.sh* files to 
reference *p12* files. It appears that the JKS format is obsolete, so there's 
no reason to reference those by default.

  was:
When following the Enabling SSL documentation 
([https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html]), the end result 
is an error if you miss a critical detail: that you need to change the *.jks* 
file extension in two lines to *.p12*.

Please update the default *bin/solr.in.cmd* and *bin/solr.in.sh* files to 
reference *p12* files. It appears that the JKS format has been left behind, so 
there's no reason to reference those by default.


> solr.in.cmd and solr.in.sh still reference obsolete jks files
> -
>
> Key: SOLR-14584
> URL: https://issues.apache.org/jira/browse/SOLR-14584
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Server
>Affects Versions: 8.5.2
>Reporter: Aren Cambre
>Priority: Major
>  Labels: easyfix
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When following the Enabling SSL documentation 
> ([https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html]), the end 
> result is an error if you miss a critical detail: that you need to change the 
> *.jks* file extension in two lines to *.p12*.
> Please update the default *bin/solr.in.cmd* and *bin/solr.in.sh* files to 
> reference *p12* files. It appears that the JKS format is obsolete, so there's 
> no reason to reference those by default.






[jira] [Updated] (SOLR-14584) solr.in.cmd and solr.in.sh still reference jks files

2020-06-20 Thread Aren Cambre (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aren Cambre updated SOLR-14584:
---
Description: 
When following the Enabling SSL documentation 
([https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html]), the end result 
is an error if you miss a critical detail: that you need to change the *.jks* 
file extension in two lines to *.p12*.

Please update the default *bin/solr.in.cmd* and *bin/solr.in.sh* files to 
reference *p12* files. It appears that the JKS format has been left behind, so 
there's no reason to reference those by default.

  was:
When following the [Enabling 
SSL|[https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html]] documentation 
exactly, the end result is an error if you miss a critical detail: that you 
need to change the *.jks* file extension in two lines to *.p12*.

Please update the default *bin/solr.in.cmd* file to reference *p12* files. It 
appears that the JKS format has been left behind, so there's no reason to 
reference those by default.


> solr.in.cmd and solr.in.sh still reference jks files
> 
>
> Key: SOLR-14584
> URL: https://issues.apache.org/jira/browse/SOLR-14584
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Server
>Affects Versions: 8.5.2
>Reporter: Aren Cambre
>Priority: Major
>  Labels: easyfix
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When following the Enabling SSL documentation 
> ([https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html]), the end 
> result is an error if you miss a critical detail: that you need to change the 
> *.jks* file extension in two lines to *.p12*.
> Please update the default *bin/solr.in.cmd* and *bin/solr.in.sh* files to 
> reference *p12* files. It appears that the JKS format has been left behind, 
> so there's no reason to reference those by default.






[GitHub] [lucene-solr] arencambre opened a new pull request #1597: fixes SOLR-14584

2020-06-20 Thread GitBox


arencambre opened a new pull request #1597:
URL: https://github.com/apache/lucene-solr/pull/1597


   
   
   
   # Description
   
   Please provide a short description of the changes you're making with this 
pull request.
   
   # Solution
   
   Please provide a short description of the approach taken to implement your 
solution.
   
   # Tests
   
   Please describe the tests you've developed or run to confirm this patch 
implements the feature or solves the problem.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [ ] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [ ] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [ ] I have developed this patch against the `master` branch.
   - [ ] I have run `ant precommit` and the appropriate test suite.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   






[jira] [Created] (SOLR-14584) solr.in.cmd and solr.in.sh still reference jks files

2020-06-20 Thread Aren Cambre (Jira)
Aren Cambre created SOLR-14584:
--

 Summary: solr.in.cmd and solr.in.sh still reference jks files
 Key: SOLR-14584
 URL: https://issues.apache.org/jira/browse/SOLR-14584
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Server
Affects Versions: 8.5.2
Reporter: Aren Cambre


When following the [Enabling 
SSL|https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html] documentation 
exactly, the end result is an error if you miss a critical detail: that you 
need to change the *.jks* file extension in two lines to *.p12*.

Please update the default *bin/solr.in.cmd* file to reference *p12* files. It 
appears that the JKS format has been left behind, so there's no reason to 
reference those by default.






[jira] [Commented] (LUCENE-9394) Fix or suppress compile-time warnings

2020-06-20 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141165#comment-17141165
 ] 

Erick Erickson commented on LUCENE-9394:


Thanks, I propose that I just add the SuppressWarnings to the 8x code line. My 
reasoning is that, despite the effort I've been putting in to get clean 
compiles, what I've done hasn't actually _fixed_ anything; it has only laid the 
groundwork for things not getting worse (8,000 warnings in Solr, sheesh!).

Given that, it's hard for me to justify any changes affecting back-compat for a 
minor release, even if it's not that much of an inconvenience. Add to that that 
I imagine we'll be cutting 9.0 in the not too distant future and there'll be a 
limited amount of back-port pain.

I could be persuaded otherwise, but that's my starting position...

> Fix or suppress compile-time warnings
> -
>
> Key: LUCENE-9394
> URL: https://issues.apache.org/jira/browse/LUCENE-9394
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Assignee: Michael Sokolov
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This is a spinoff from [~erickerickson]'s efforts over in  SOLR-10778 
> The goal is a warning-free compilation, followed by enforcement of build 
> failure on warnings, with the idea of suppressing innocuous warnings to the 
> extent that the remaining warnings be treated as build failure.






[jira] [Updated] (SOLR-14583) Spell suggestion is returned even if hits are non-zero when spellcheck.maxResultsForSuggest=0

2020-06-20 Thread Munendra S N (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Munendra S N updated SOLR-14583:

Summary: Spell suggestion is returned even if hits are non-zero when 
spellcheck.maxResultsForSuggest=0  (was: Spell suggestions is returned even if 
hits are non-zero when spellcheck.maxResultsForSuggest=0)

> Spell suggestion is returned even if hits are non-zero when 
> spellcheck.maxResultsForSuggest=0
> -
>
> Key: SOLR-14583
> URL: https://issues.apache.org/jira/browse/SOLR-14583
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Munendra S N
>Assignee: Munendra S N
>Priority: Major
>
> SOLR-4280 added fractional support for spellcheck.maxResultsForSuggest. After 
> SOLR-4280, {{spellcheck.maxResultsForSuggest=0}} is treated the same as not 
> specifying the {{spellcheck.maxResultsForSuggest}} parameter at all. This can 
> cause spell suggestions to be returned even when hits are non-zero and greater 
> than {{spellcheck.maxResultsForSuggest}} (i.e., greater than 0)






[jira] [Created] (SOLR-14583) Spell suggestions is returned even if hits are non-zero when spellcheck.maxResultsForSuggest=0

2020-06-20 Thread Munendra S N (Jira)
Munendra S N created SOLR-14583:
---

 Summary: Spell suggestions is returned even if hits are non-zero 
when spellcheck.maxResultsForSuggest=0
 Key: SOLR-14583
 URL: https://issues.apache.org/jira/browse/SOLR-14583
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Munendra S N
Assignee: Munendra S N


SOLR-4280 added fractional support for spellcheck.maxResultsForSuggest. After 
SOLR-4280, {{spellcheck.maxResultsForSuggest=0}} is treated the same as not 
specifying the {{spellcheck.maxResultsForSuggest}} parameter at all. This can 
cause spell suggestions to be returned even when hits are non-zero and greater 
than {{spellcheck.maxResultsForSuggest}} (i.e., greater than 0)
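
For illustration, a minimal SolrJ sketch that sets the parameter in question; the collection, the handler configuration (a spellcheck component must be wired into the handler), and the misspelled query are made up:
{code:java}
// Minimal sketch (illustrative): ask for suggestions only when there are no hits
// by setting spellcheck.maxResultsForSuggest=0. Per this issue, suggestions may
// still come back even when hits > 0.
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MaxResultsForSuggestSketch {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client =
             new HttpSolrClient.Builder("http://localhost:8983/solr/techproducts").build()) {
      SolrQuery q = new SolrQuery("delll ultrasharp");   // deliberately misspelled
      q.set("spellcheck", "true");
      q.set("spellcheck.maxResultsForSuggest", "0");
      QueryResponse rsp = client.query(q);
      long hits = rsp.getResults().getNumFound();
      boolean hasSuggestions = rsp.getSpellCheckResponse() != null
          && !rsp.getSpellCheckResponse().getSuggestions().isEmpty();
      System.out.println("hits=" + hits + " suggestions=" + hasSuggestions);
    }
  }
}
{code}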






[jira] [Updated] (SOLR-14582) Expose IWC.setMaxCommitMergeWaitSeconds as an expert feature in Solr's index config

2020-06-20 Thread Tomas Eduardo Fernandez Lobbe (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomas Eduardo Fernandez Lobbe updated SOLR-14582:
-
Summary: Expose IWC.setMaxCommitMergeWaitSeconds as an expert feature in 
Solr's index config  (was: Exponse IWC.setMaxCommitMergeWaitSeconds as an 
expert feature in Solr's index config)

> Expose IWC.setMaxCommitMergeWaitSeconds as an expert feature in Solr's index 
> config
> ---
>
> Key: SOLR-14582
> URL: https://issues.apache.org/jira/browse/SOLR-14582
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tomas Eduardo Fernandez Lobbe
>Priority: Trivial
>
> LUCENE-8962 added the ability to merge segments synchronously on commit. This 
> isn't done by default and the default {{MergePolicy}} won't do it, but custom 
> merge policies can take advantage of this. Solr allows plugging in custom 
> merge policies, so if someone wants to make use of this feature they could, 
> however, they need to set {{IndexWriterConfig.maxCommitMergeWaitSeconds}} to 
> something greater than 0.
> Since this is an expert feature, I plan to document it only in javadoc and 
> not the ref guide.






[jira] [Commented] (LUCENE-9286) FST arc.copyOf clones BitTables and this can lead to excessive memory use

2020-06-20 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141107#comment-17141107
 ] 

Michael Sokolov commented on LUCENE-9286:
-

> We could improve the analyzers nightly benchmark

That makes sense. There is also the commented out 
{{TestJapaneseTokenizer.testWikipedia}} that tests performance of Kuromoji 
specifically, but one has to remember to run it.  To get the benchmark to cover 
JapaneseAnalyzer (and the other CJK analyzers too, maybe?) we'd need to 
incorporate some documents that include text in ideographic scripts. It looks 
as if the benchmarks use English Wikipedia docs exclusively right now. 
luceneutil data seems to be kept in [~mikemccand]'s Apache homedir. Simplest 
first step would be to add a Japanese Wikipedia dump to that, but we could also 
source the data from somewhere else if need be ...
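
For context, a minimal sketch (not from the thread) of the kind of work such a benchmark would measure: running JapaneseAnalyzer over an arbitrary Japanese sentence and iterating the tokens.
{code:java}
// Minimal sketch (illustrative): tokenize one Japanese sentence with
// JapaneseAnalyzer; the field name and sample text are arbitrary.
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.ja.JapaneseAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class AnalyzeJapaneseSketch {
  public static void main(String[] args) throws Exception {
    try (Analyzer analyzer = new JapaneseAnalyzer();
         TokenStream ts = analyzer.tokenStream("body", "関西国際空港は大阪にあります。")) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        System.out.println(term.toString());
      }
      ts.end();
    }
  }
}
{code}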

> FST arc.copyOf clones BitTables and this can lead to excessive memory use
> -
>
> Key: LUCENE-9286
> URL: https://issues.apache.org/jira/browse/LUCENE-9286
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 8.5
>Reporter: Dawid Weiss
>Assignee: Bruno Roustant
>Priority: Major
> Fix For: 8.6
>
> Attachments: screen-[1].png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> I see a dramatic increase in the amount of memory required for construction 
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed 
> for bit tables. I am pretty sure this didn't require so much memory before 
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?






[jira] [Commented] (LUCENE-9394) Fix or suppress compile-time warnings

2020-06-20 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141102#comment-17141102
 ] 

Michael Sokolov commented on LUCENE-9394:
-

> Do you (or anyone else) want to weigh in on whether to backport this fix or 
> just SuppressWarnings in 8x for Lucene?

I think it's down to what our back compat policy is. If we're OK with 
introducing breaking API changes in a minor release, then we should fix rather 
than suppress, but I was under the impression that we only made such changes on 
major releases. I personally feel this would be OK - it's a compilation 
failure, not a behavior change, so there's no risk someone gets a surprise; 
they just have to fix up their  Map types or add SuppressWarnings.
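
For illustration, a made-up snippet showing the two options mentioned (parameterize the raw Map type, or suppress the warning locally):
{code:java}
import java.util.HashMap;
import java.util.Map;

public class WarningFixSketch {

  // Option 1: fix the raw type by parameterizing it (a compile-time change
  // for callers if the field or method is part of a public API).
  Map<String, Object> typed = new HashMap<>();

  // Option 2: keep the raw type as-is and suppress the warning locally.
  @SuppressWarnings("rawtypes")
  Map raw = new HashMap();
}
{code}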

> Fix or suppress compile-time warnings
> -
>
> Key: LUCENE-9394
> URL: https://issues.apache.org/jira/browse/LUCENE-9394
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Assignee: Michael Sokolov
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This is a spinoff from [~erickerickson]'s efforts over in  SOLR-10778 
> The goal is a warning-free compilation, followed by enforcement of build 
> failure on warnings, with the idea of suppressing innocuous warnings to the 
> extent that the remaining warnings be treated as build failure.






[jira] [Commented] (LUCENE-9411) Fail compilation on warnings

2020-06-20 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141101#comment-17141101
 ] 

Michael Sokolov commented on LUCENE-9411:
-

> I was tending to the fail-early (i.e. not just on precommit) for the same 
> reason. After a bit of annoyance, people should be able to write the code 
> right the first time

OK, I agree - I think this is appropriate for things like compiler warnings. I 
just want to make sure that for more stringent checks like style checks, 
javadoc, etc. we don't move them up to compile phase. We want to be able to 
make some speculative changes without worrying about all the fine points. Once 
we have some code that seems worth committing, then we can polish up the 
imports, the lines with trailing whitespace and so on. I think that's how it 
works now - precommit handles these fussier checks, right?

> Fail compilation on warnings
> ---
>
> Key: LUCENE-9411
> URL: https://issues.apache.org/jira/browse/LUCENE-9411
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>  Labels: build
> Attachments: LUCENE-9411.patch, LUCENE-9411.patch, LUCENE-9411.patch, 
> annotations-warnings.patch
>
>
> Moving this over here from SOLR-11973 since it's part of the build system and 
> affects Lucene as well as Solr. You might want to see the discussion there.
> We have a clean compile for both Solr and Lucene, no rawtypes, unchecked, 
> try, etc. warnings. There are some peculiar warnings (things like 
> SuppressFBWarnings, i.e. FindBugs) that I'm not sure about at all, but let's 
> assume those are not a problem. Now I'd like to start failing the compilation 
> if people write new code that generates warnings.
> From what I can tell, just adding the flag is easy in both the Gradle and Ant 
> builds. I still have to prove out that adding -Werrors does what I expect, 
> i.e. succeeds now and fails when I introduce warnings.
> But let's assume that works. Are there objections to this idea generally? I 
> hope to have some data by next Monday.
> FWIW, the Lucene code base had far fewer issues than Solr, but 
> common-build.xml is in Lucene.






[jira] [Commented] (LUCENE-9411) Fail compilation on warnings

2020-06-20 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141096#comment-17141096
 ] 

Dawid Weiss commented on LUCENE-9411:
-

A single one is fine, I think, but I'd rather have it in a separate file, so we 
know it's a workaround for an odd behavior of javac, than have it scattered 
around various build files.

> Fail compilation on warnings
> ---
>
> Key: LUCENE-9411
> URL: https://issues.apache.org/jira/browse/LUCENE-9411
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>  Labels: build
> Attachments: LUCENE-9411.patch, LUCENE-9411.patch, LUCENE-9411.patch, 
> annotations-warnings.patch
>
>
> Moving this over here from SOLR-11973 since it's part of the build system and 
> affects Lucene as well as Solr. You might want to see the discussion there.
> We have a clean compile for both Solr and Lucene, no rawtypes, unchecked, 
> try, etc. warnings. There are some peculiar warnings (things like 
> SuppressFBWarnings, i.e. FindBugs) that I'm not sure about at all, but let's 
> assume those are not a problem. Now I'd like to start failing the compilation 
> if people write new code that generates warnings.
> From what I can tell, just adding the flag is easy in both the Gradle and Ant 
> builds. I still have to prove out that adding -Werrors does what I expect, 
> i.e. succeeds now and fails when I introduce warnings.
> But let's assume that works. Are there objections to this idea generally? I 
> hope to have some data by next Monday.
> FWIW, the Lucene code base had far fewer issues than Solr, but 
> common-build.xml is in Lucene.






[jira] [Commented] (LUCENE-9411) Fail compilation on warnings

2020-06-20 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141062#comment-17141062
 ] 

Erick Erickson commented on LUCENE-9411:


Sure. Are you thinking two different files, one for findbugs and one for 
error_prone? Or just a single file, something like 
gradle/hacks/annotations.gradle?

> Fail compilation on warnings
> ---
>
> Key: LUCENE-9411
> URL: https://issues.apache.org/jira/browse/LUCENE-9411
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>  Labels: build
> Attachments: LUCENE-9411.patch, LUCENE-9411.patch, LUCENE-9411.patch, 
> annotations-warnings.patch
>
>
> Moving this over here from SOLR-11973 since it's part of the build system and 
> affects Lucene as well as Solr. You might want to see the discussion there.
> We have a clean compile for both Solr and Lucene, no rawtypes, unchecked, 
> try, etc. warnings. There are some peculiar warnings (things like 
> SuppressFBWarnings, i.e. FindBugs) that I'm not sure about at all, but let's 
> assume those are not a problem. Now I'd like to start failing the compilation 
> if people write new code that generates warnings.
> From what I can tell, just adding the flag is easy in both the Gradle and Ant 
> builds. I still have to prove out that adding -Werrors does what I expect, 
> i.e. succeeds now and fails when I introduce warnings.
> But let's assume that works. Are there objections to this idea generally? I 
> hope to have some data by next Monday.
> FWIW, the Lucene code base had far fewer issues than Solr, but 
> common-build.xml is in Lucene.






[jira] [Commented] (LUCENE-9286) FST arc.copyOf clones BitTables and this can lead to excessive memory use

2020-06-20 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141058#comment-17141058
 ] 

Robert Muir commented on LUCENE-9286:
-

We could improve the analyzers nightly benchmark: 
https://people.apache.org/~mikemccand/lucenebench/analyzers.html

> FST arc.copyOf clones BitTables and this can lead to excessive memory use
> -
>
> Key: LUCENE-9286
> URL: https://issues.apache.org/jira/browse/LUCENE-9286
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 8.5
>Reporter: Dawid Weiss
>Assignee: Bruno Roustant
>Priority: Major
> Fix For: 8.6
>
> Attachments: screen-[1].png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> I see a dramatic increase in the amount of memory required for construction 
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed 
> for bit tables. I am pretty sure this didn't require so much memory before 
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?






[jira] [Resolved] (LUCENE-9413) Add a char filter corresponding to CJKWidthFilter

2020-06-20 Thread Tomoko Uchida (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomoko Uchida resolved LUCENE-9413.
---
Resolution: Won't Fix

> Add a char filter corresponding to CJKWidthFilter
> -
>
> Key: LUCENE-9413
> URL: https://issues.apache.org/jira/browse/LUCENE-9413
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Tomoko Uchida
>Priority: Minor
>
> In association with issues in Elasticsearch 
> ([https://github.com/elastic/elasticsearch/issues/58384] and 
> [https://github.com/elastic/elasticsearch/issues/58385]), it might be useful 
> for Japanese default analyzer.
> Although I don't think it's a bug to not normalize FULL and HALF width 
> characters before tokenization, the behaviour sometimes confuses beginners or 
> users who have limited knowledge about Japanese analysis (and Unicode).
> If we have a FULL and HALF width character normalization filter in 
> {{analyzers-common}}, we can include it into JapaneseAnalyzer (currently, 
> JapaneseAnalyzer contains CJKWidthFilter but it is applied after tokenization 
> so some of FULL width numbers or latin alphabets are separated by the 
> tokenizer).






[jira] [Commented] (LUCENE-9413) Add a char filter corresponding to CJKWidthFilter

2020-06-20 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17141009#comment-17141009
 ] 

Tomoko Uchida commented on LUCENE-9413:
---

The mecab-ipadic dictionary has entries which include FULL width characters, 
so this naive approach (FULL / HALF width character normalization before 
tokenizing) can break tokenization. :/

Maybe we could concatenate "unknown" word sequences which consist of only 
numbers or Latin alphabets, after tokenization?

{code}
$ cut -d',' -f1 mecab-ipadic-all-utf8.csv | grep 1
12月
1番
11月
1月
10月
G7プラス1
小1
高1
1つ
F1
中1
110番
G1
1
ファスニング21
G10
インパクト21
アルゴテクノス21
セルヴィ21
モクネット21
U19
どさんこワイド212
西15線北
北13線
西14線北
北14線
西10号南
南1条
東11号北
東12線北
西11号北
駒場北1条通
東1線南
第1安井牧場
西10号北
東11線北
美旗町中1番
南21線西
南17線西
西10線北
岩内町第1基線
北15線
南12線西
東13線南
西13線北
西1線北
南16線西
西10線南
西16線北
西11線北
西12号北
西11線南
東10線北
北1線
東1線北
南13号
南14線西
南1線
北11線
西12線南
西14線南
南13線西
浦臼第1
西13線南
東10号北
南19線西
北1条
南11線西
平泉外12入会
東10線南
東10号南
南18線西
南15線西
東11号南
東12号北
北10線
駒場南1条通
南1番通
南10線西
北12線
西1線南
太田1の通り
東11線南
西12線北
東12線南
大泉1区南部
M40A1
F15戦闘機
DF31
F15
G1
辞林21
R12
O157
DF41
スーパー301
GP125
北13条東
M1A2
アポロ11号
{code}

> Add a char filter corresponding to CJKWidthFilter
> -
>
> Key: LUCENE-9413
> URL: https://issues.apache.org/jira/browse/LUCENE-9413
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Tomoko Uchida
>Priority: Minor
>
> In association with issues in Elasticsearch 
> ([https://github.com/elastic/elasticsearch/issues/58384] and 
> [https://github.com/elastic/elasticsearch/issues/58385]), it might be useful 
> for Japanese default analyzer.
> Although I don't think it's a bug to not normalize FULL and HALF width 
> characters before tokenization, the behaviour sometimes confuses beginners or 
> users who have limited knowledge about Japanese analysis (and Unicode).
> If we have a FULL and HALF width character normalization filter in 
> {{analyzers-common}}, we can include it into JapaneseAnalyzer (currently, 
> JapaneseAnalyzer contains CJKWidthFilter but it is applied after tokenization 
> so some of FULL width numbers or latin alphabets are separated by the 
> tokenizer).






[jira] [Commented] (LUCENE-9411) Fail compilation on warnings

2020-06-20 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140997#comment-17140997
 ] 

Dawid Weiss commented on LUCENE-9411:
-

Can you move all those blocks into a single separate file (and apply it to the 
projects that need it), Erick?
{code:java}
+  // Prometheus exporter classes reference this although it's not part of the 
exported classpath
+  // which causes odd warnings during compilation. Shut it up with an 
explicit-version
+  // compile-only dependency (!).
+  compileOnly 'com.google.errorprone:error_prone_annotations:2.1.3'{code}
It should be included from top-level (gradle/hacks/findbugs-annotations.gradle) 
and look something like this:
{code:java}
configure([project(":solr:foo"), project(":solr:bar")]) { 
  plugins.withType(JavaPlugin) {
// blah blah
dependencies {
  compileOnly 'com.google.errorprone:error_prone_annotations:2.1.3'
}
  }
}{code}
The "withType" bit is needed just in case the file is included before the java 
plugin is applied - then dependencies configuration wouldn't be resolved 
properly.

> Fail compilation on warnings
> ---
>
> Key: LUCENE-9411
> URL: https://issues.apache.org/jira/browse/LUCENE-9411
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>  Labels: build
> Attachments: LUCENE-9411.patch, LUCENE-9411.patch, LUCENE-9411.patch, 
> annotations-warnings.patch
>
>
> Moving this over here from SOLR-11973 since it's part of the build system and 
> affects Lucene as well as Solr. You might want to see the discussion there.
> We have a clean compile for both Solr and Lucene, no rawtypes, unchecked, 
> try, etc. warnings. There are some peculiar warnings (things like 
> SuppressFBWarnings, i.e. FindBugs) that I'm not sure about at all, but let's 
> assume those are not a problem. Now I'd like to start failing the compilation 
> if people write new code that generates warnings.
> From what I can tell, just adding the flag is easy in both the Gradle and Ant 
> builds. I still have to prove out that adding -Werrors does what I expect, 
> i.e. succeeds now and fails when I introduce warnings.
> But let's assume that works. Are there objections to this idea generally? I 
> hope to have some data by next Monday.
> FWIW, the Lucene code base had far fewer issues than Solr, but 
> common-build.xml is in Lucene.






[jira] [Updated] (LUCENE-9413) Add a char filter corresponding to CJKWidthFilter

2020-06-20 Thread Tomoko Uchida (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomoko Uchida updated LUCENE-9413:
--
Description: 
In association with issues in Elasticsearch 
([https://github.com/elastic/elasticsearch/issues/58384] and 
[https://github.com/elastic/elasticsearch/issues/58385]), it might be useful 
for the Japanese default analyzer.

Although I don't think it's a bug to not normalize FULL and HALF width 
characters before tokenization, the behaviour sometimes confuses beginners or 
users who have limited knowledge about Japanese analysis (and Unicode).

If we have a FULL and HALF width character normalization filter in 
{{analyzers-common}}, we can include it in JapaneseAnalyzer (currently, 
JapaneseAnalyzer contains CJKWidthFilter, but it is applied after tokenization, 
so some FULL width numbers or Latin alphabets are split by the tokenizer).

  was:
In association with issues in Elasticsearch 
([https://github.com/elastic/elasticsearch/issues/58384] and 
[https://github.com/elastic/elasticsearch/issues/58385]), it might be useful 
for Japanese default analyzer.

Although I don't think it's a bug to not normalize FULL and HALF width 
characters before tokenization, the behaviour sometimes confuses beginners or 
users who have limited knowledge about Japanese analysis (and Unicode).

If we have a FULL and HALF width character normalization filter in 
{{analyzers-common}}, we can include it into JapaneseAnalyzer (currently, 
JapaneseAnalyzer contains CJKWidthFilter but it is applied after tokenization 
so some of FULL width numbers or alphabets are separated by the tokenizer).


> Add a char filter corresponding to CJKWidthFilter
> -
>
> Key: LUCENE-9413
> URL: https://issues.apache.org/jira/browse/LUCENE-9413
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Tomoko Uchida
>Priority: Minor
>
> In association with issues in Elasticsearch 
> ([https://github.com/elastic/elasticsearch/issues/58384] and 
> [https://github.com/elastic/elasticsearch/issues/58385]), it might be useful 
> for Japanese default analyzer.
> Although I don't think it's a bug to not normalize FULL and HALF width 
> characters before tokenization, the behaviour sometimes confuses beginners or 
> users who have limited knowledge about Japanese analysis (and Unicode).
> If we have a FULL and HALF width character normalization filter in 
> {{analyzers-common}}, we can include it into JapaneseAnalyzer (currently, 
> JapaneseAnalyzer contains CJKWidthFilter but it is applied after tokenization 
> so some of FULL width numbers or latin alphabets are separated by the 
> tokenizer).






[jira] [Commented] (LUCENE-9413) Add a char filter corresponding to CJKWidthFilter

2020-06-20 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140976#comment-17140976
 ] 

Tomoko Uchida commented on LUCENE-9413:
---

I cannot take time to work on this soon, but wanted to record it as an 
issue... comments and thoughts are welcome.

> Add a char filter corresponding to CJKWidthFilter
> -
>
> Key: LUCENE-9413
> URL: https://issues.apache.org/jira/browse/LUCENE-9413
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Tomoko Uchida
>Priority: Minor
>
> In association with issues in Elasticsearch 
> ([https://github.com/elastic/elasticsearch/issues/58384] and 
> [https://github.com/elastic/elasticsearch/issues/58385]), it might be useful 
> for Japanese default analyzer.
> Although I don't think it's a bug to not normalize FULL and HALF width 
> characters before tokenization, the behaviour sometimes confuses beginners or 
> users who have limited knowledge about Japanese analysis (and Unicode).
> If we have a FULL and HALF width character normalization filter in 
> {{analyzers-common}}, we can include it into JapaneseAnalyzer (currently, 
> JapaneseAnalyzer contains CJKWidthFilter but it is applied after tokenization 
> so some of FULL width numbers or alphabets are separated by the tokenizer).


