Should ChildDocTransformerFactory's limit be local or global for deep-nested documents?

2020-10-01 Thread Alexandre Rafalovitch
I am indexing a deeply nested structure and am trying to return it
with fl=*,[child].

It is supposed to have 5 children under the top element, but only 4
are returned. Two hours of debugging later, I realized that the
"limit" parameter defaults to 10 and that this 10 seems to count
children at ANY level, depth-first. So it was far from obvious why
the children suddenly stopped showing up.

The documentation says:
> The maximum number of child documents to be returned per parent document.
> The default is `10`.

So, is that (all nested children included in limit) what we actually
mean? Or did we mean maximum number of "immediate children" for any
specific document/level and the code is wrong?

I can update the doc to clarify the results, but I don't know whether
I am looking at a bug or a feature.
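
For anyone following along, here is a toy model of the two possible readings of "limit". It is only a sketch in plain Java: the Doc class and both methods are made up for illustration and are not Solr's ChildDocTransformer code, and the pre-order walk is an assumption about what "depth-first" means here. The sample tree has 5 immediate children, and the first four plus their grandchildren add up to 10 documents.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are Solr classes.
class ChildLimitSketch {

    static final class Doc {
        final String name;
        final List<Doc> children = new ArrayList<>();

        Doc(String name) { this.name = name; }

        Doc add(Doc child) { children.add(child); return this; }
    }

    // Reading 1: one budget shared by all descendants, spent depth-first.
    static List<String> globalLimit(Doc root, int limit) {
        List<String> out = new ArrayList<>();
        collectGlobal(root, limit, out);
        return out;
    }

    private static void collectGlobal(Doc doc, int limit, List<String> out) {
        for (Doc child : doc.children) {
            if (out.size() >= limit) return; // budget exhausted, at any level
            out.add(child.name);
            collectGlobal(child, limit, out);
        }
    }

    // Reading 2: every document may return up to `limit` immediate children.
    static List<String> localLimit(Doc root, int limit) {
        List<String> out = new ArrayList<>();
        collectLocal(root, limit, out);
        return out;
    }

    private static void collectLocal(Doc doc, int limit, List<String> out) {
        int taken = 0;
        for (Doc child : doc.children) {
            if (taken >= limit) break; // budget is per parent
            taken++;
            out.add(child.name);
            collectLocal(child, limit, out);
        }
    }

    // 5 immediate children; c1..c4 plus their grandchildren total 10 documents.
    static Doc sample() {
        return new Doc("root")
            .add(new Doc("c1").add(new Doc("c1.1")).add(new Doc("c1.2")))
            .add(new Doc("c2").add(new Doc("c2.1")).add(new Doc("c2.2")))
            .add(new Doc("c3").add(new Doc("c3.1")))
            .add(new Doc("c4").add(new Doc("c4.1")))
            .add(new Doc("c5"));
    }

    public static void main(String[] args) {
        System.out.println(globalLimit(sample(), 10));
        System.out.println(localLimit(sample(), 10));
    }
}
```

With a limit of 10, the global reading spends the whole budget on the first four children and their descendants, so the fifth child is never reached; the per-level reading returns all five.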

Regards,
   Alex.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Backward compatability handling across major versions

2020-10-01 Thread Noble Paul
Yes, we break backward compatibility all the time. Solr is a web
service. Most users are only concerned about

* REST end points
* configurations in ZK
* index format

Java API changes are not that critical.  But, we still need to call them out.

The case in point was a configuration file stored in ZK. With that
change, it is IMPOSSIBLE to do a rolling upgrade. The only choice is
to

* Bring down the entire cluster
* Run the scripts to do an upgrade
* Pray everything comes back up

We should minimize any change that will prevent people from doing
rolling upgrades. If possible, our changes should be friendly to a
rolling upgrade.

All such changes MUST HAVE an associated ticket just to discuss the
backward compatibility break. We should weigh the impact of such
changes on our users.



On Thu, Oct 1, 2020 at 10:18 PM David Smiley  wrote:
>
> Agreed that back-compat matters should not "sneak" into an issue that is not 
> about that.  There are of course gray areas -- much of Solr core Java APIs 
> are public yet we don't need to treat everything with such burdensome care.  
> It takes experience and some subjectivity to know.  The PR you point to is 
> very clear to me, as it's a web service API endpoint.
>
> We *can* break back-compat on a major release :-).  But such discussion 
> deserves its own issue about breaking that compatibility.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Thu, Oct 1, 2020 at 4:25 AM Ishan Chattopadhyaya 
>  wrote:
>>
>> Hi Devs,
>> As per earlier discussions, we want to do a better job of handling major 
>> version upgrades, possibly support rolling upgrades wherever possible. This 
>> implies that we don't break backward compatibility without a strong reason 
>> and adequate discussion around it.
>>
>> Recently, there was a PR that attempted to sneak in a backward incompatible 
>> change to an endpoint for plugins (package management). This change was 
>> totally unrelated to the JIRA/PR and there was absolutely no discussion or 
>> even an attempt to address the upgrade strategy with that change. The 
>> attitude was a careless one, on the lines of we can break backward 
>> compatibility in a major release. 
>> https://github.com/apache/lucene-solr/pull/1758#discussion_r494134314
>>
>> Do we have any consensus on whether we need a separate JIRA or broader 
>> discussion on any backward compatibility breaks? Or shall we let these 
>> changes be sneaked in, unless someone notices very carefully a few lines of 
>> changes in a 25 class PR?
>>
>> Looking for some suggestions here.
>> Thanks and regards,
>> Ishan



-- 
-
Noble Paul




Re: [JENKINS] Lucene » Lucene-Solr-Check-master - Build # 485 - Still Failing!

2020-10-01 Thread Chris Hostetter


https://issues.apache.org/jira/browse/SOLR-12987




: Date: Thu, 1 Oct 2020 17:27:53 + (UTC)
: From: Apache Jenkins Server 
: Reply-To: dev@lucene.apache.org
: To: bui...@lucene.apache.org
: Subject: [JENKINS] Lucene » Lucene-Solr-Check-master - Build # 485 - Still
: Failing!
: 
: Build: 
https://ci-builds.apache.org/job/Lucene/job/Lucene-Solr-Check-master/485/
: 
: All tests passed
: 
: Build Log:
: [...truncated 1544 lines...]
: BUILD FAILED in 1h 6m 21s
: 834 actionable tasks: 834 executed
: Build step 'Invoke Gradle script' changed build result to FAILURE
: Build step 'Invoke Gradle script' marked build as failure
: Archiving artifacts
: Recording test results
: Email was triggered for: Failure - Any
: Sending email for trigger: Failure - Any
: 

-Hoss
http://www.lucidworks.com/


Re: Highlight with Proximity search throws an exception

2020-10-01 Thread Michael Sokolov
I traced this to this block in FuzzyTermsEnum:

if (ed == 0) { // exact match
  boostAtt.setBoost(1.0F);
} else {
  final int codePointCount = UnicodeUtil.codePointCount(term);
  int minTermLength = Math.min(codePointCount, termLength);

  float similarity = 1.0f - (float) ed / (float) minTermLength;
  boostAtt.setBoost(similarity);
}

where in your test ed (edit distance) was 2 and minTermLength was 1,
leading to a negative boost.

I don't really understand this code at all, but I wonder if it should
divide by maxTermLength instead of minTermLength?
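
To make the arithmetic concrete, here is a small standalone sketch. It only mirrors the formula in the block above and does not call Lucene; the class and method names are made up. The lengths are assumptions read off this thread: a two-letter query term ("se") compared with a one-character indexed term at edit distance 2.

```java
// Standalone arithmetic only; does not use or reproduce FuzzyTermsEnum itself.
class FuzzyBoostSketch {

    // Same formula as the quoted block: 1 - ed / length.
    static float similarity(int editDistance, int length) {
        return 1.0f - (float) editDistance / (float) length;
    }

    public static void main(String[] args) {
        int queryLength = 2;     // "se"
        int candidateLength = 1; // assumed: the single character after the apostrophe
        int ed = 2;

        float withMin = similarity(ed, Math.min(candidateLength, queryLength));
        float withMax = similarity(ed, Math.max(candidateLength, queryLength));

        System.out.println("divide by min length: " + withMin); // -1.0
        System.out.println("divide by max length: " + withMax); // 0.0
    }
}
```

Dividing by the shorter length lets the edit distance exceed the divisor, which is how the value drops below zero; dividing by the longer length keeps it at zero for this input. Whether that is the right fix for the real code is a separate question.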

On Thu, Oct 1, 2020 at 9:54 AM Juraj Jurčo  wrote:
>
> Hi guys,
> we are trying to implement search and we have experienced a strange 
> situation. When our text contains an apostrophe followed by a single 
> character AND our search query is composed of exactly two letters followed 
> by proximity search AND we use highlighting, we get an exception:
>
>> java.lang.IllegalArgumentException: boost must be a positive float, got -1.0
>
>
> It seems there is a problem at FuzzyTermsEnum.java:271 (float similarity = 
> 1.0f - (float) ed / (float) minTermLength): when it is reached with ed=2, 
> it sets a negative boost.
>
> I was able to reproduce the error with the following code:
>
> import java.io.IOException;
> import java.nio.file.Path;
>
> import org.apache.commons.io.FileUtils;
> import org.apache.lucene.analysis.Analyzer;
> import org.apache.lucene.analysis.TokenStream;
> import org.apache.lucene.analysis.core.SimpleAnalyzer;
> import org.apache.lucene.document.Document;
> import org.apache.lucene.document.Field;
> import org.apache.lucene.document.TextField;
> import org.apache.lucene.index.IndexWriter;
> import org.apache.lucene.index.IndexWriterConfig;
> import org.apache.lucene.queryparser.classic.ParseException;
> import org.apache.lucene.queryparser.classic.QueryParser;
> import org.apache.lucene.search.Query;
> import org.apache.lucene.search.highlight.Highlighter;
> import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
> import org.apache.lucene.search.highlight.QueryScorer;
> import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
> import org.apache.lucene.search.highlight.TokenSources;
> import org.apache.lucene.store.Directory;
> import org.apache.lucene.store.FSDirectory;
> import org.junit.jupiter.api.Test;
>
> class FindSqlHighlightTest {
>
>@Test
>void reproduceHighlightProblem() throws IOException, ParseException, 
> InvalidTokenOffsetsException {
>   String text = "doesn't";
>   String field = "text";
>   //NOK: se~, se~2 and any higher number
>   //OK: sel~, s~, se~1
>   String uQuery = "se~";
>   int maxStartOffset = -1;
>   Analyzer analyzer = new SimpleAnalyzer();
>
>   Path indexLocation = Path.of("temp", 
> "reproduceHighlightProblem").toAbsolutePath();
>   if (indexLocation.toFile().exists()) {
>  FileUtils.deleteDirectory(indexLocation.toFile());
>   }
>   Directory indexDir = FSDirectory.open(indexLocation);
>
>   //Create index
>   IndexWriterConfig dimsIndexWriterConfig = new 
> IndexWriterConfig(analyzer);
>   dimsIndexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
>   IndexWriter idxWriter = new IndexWriter(indexDir, 
> dimsIndexWriterConfig);
>   //add doc
>   Document doc = new Document();
>   doc.add(new TextField(field, text, Field.Store.NO));
>   idxWriter.addDocument(doc);
>   //commit
>   idxWriter.commit();
>   idxWriter.close();
>
>   //search & highlight
>   Query query = new QueryParser(field, analyzer).parse(uQuery);
>   Highlighter highlighter = new Highlighter(new SimpleHTMLFormatter(), 
> new QueryScorer(query));
>   TokenStream tokenStream = TokenSources.getTokenStream(field, null, 
> text, analyzer, maxStartOffset);
>   String highlighted = highlighter.getBestFragment(tokenStream, text);
>   System.out.println(highlighted);
>}
> }
>
>
> Could you please confirm whether it's a bug in Lucene or whether we do 
> something that is not allowed?
>
> Thanks a lot!
> Best,
> Juraj+




Re: restlet dependencies

2020-10-01 Thread Timothy Potter
Awesome guys, thanks for the pointers ... am cooking up a PR (for master)
for this today

On Thu, Oct 1, 2020 at 2:22 AM Noble Paul  wrote:

> The annotation (
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/api/EndPoint.java
> )
> supports wild cards and templated paths
>
> On Thu, Oct 1, 2020 at 5:28 PM Ishan Chattopadhyaya
>  wrote:
> >
> > > But when I suggested porting the code that uses restlet to JAX-RS /
> Jersey, Ishan said
> > > that wasn't necessary and is already supported with some Annotations
> ... I have no idea
> > > what that means and need more info about what is already in place.
> >
> > I was mainly referring to the @EndPoint annotation. It is available for
> V2 APIs today (and I think it should be fine to use for anything we build
> from now on, including managed resources V2).
> > It is possible to make it work with V1, but that will require some work.
> >
> > On Thu, Oct 1, 2020 at 12:50 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
> >>
> >> @Tim
> >>
> >> Please check ClusterAPI or ZookeeperReadAPI etc.
> >> Recently used it in Yasa:
> https://github.com/yasa-org/yasa/blob/master/yasa-solr-plugin/src/main/java/io/github/kezhenxu94/YasaHandler.java
> >>
> >> On Thu, Oct 1, 2020 at 10:46 AM Noble Paul 
> wrote:
> >>>
> >>> @Tim Potter
> >>>
> >>> I tried several times to get rid of the restlet dependency & keep the
> >>> functionality as is. I failed miserably. I'm not saying this to
> >>> discourage anyone who wants to give a try. Just letting you know that
> >>> it is not as easy as it may sound
> >>>
> >>> On Thu, Oct 1, 2020 at 2:42 AM Houston Putman 
> wrote:
> >>> >
> >>> > +1 to Tomas' proposal. Created SOLR-14907 to track the effort.
> >>> >
> >>> >  - Houston
> >>> >
> >>> > On Wed, Sep 30, 2020 at 12:26 PM Tomás Fernández Löbbe <
> tomasflo...@gmail.com> wrote:
> >>> >>
> >>> >> > Let's support the single file upload feature
> >>> >> +1, but let this behave exactly as a zip file with a single file in
> it (regarding trusted/untrusted). We just need to change the configset
> handler to be able to handle non-zip files, and have a way to "locate" that
> file inside the configset (in case it needs to go somewhere other than the
> root).
> >>> >>
> >>> >> On Wed, Sep 30, 2020 at 8:45 AM Eric Pugh <
> ep...@opensourceconnections.com> wrote:
> >>> >>>
> >>> >>> I think that I'm in “violent agreement” with you.   Let’s
> understand the Annotations approach that we have, or pick something that is
> commonly used like JAX-RS / Jersey.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> On Sep 30, 2020, at 11:41 AM, Timothy Potter 
> wrote:
> >>> >>>
> >>> >>> I'm sorry, I don't understand what you mean by "make it a single
> pattern (the annotations?)" Eric?
> >>> >>>
> >>> >>> To me, the pattern is well established in the Java world: JAX-RS
> (with Jersey as the underlying impl. which has nice integration with
> Jetty). But when I suggested porting the code that uses restlet to JAX-RS /
> Jersey, Ishan said that wasn't necessary and is already supported with some
> Annotations ... I have no idea what that means and need more info about
> what is already in place. Short of that, replacing restlet with JAX-RS /
> Jersey looks like a trivial amount of work to me (and I'm happy to take it
> on).
> >>> >>>
> >>> >>> Tim
> >>> >>>
> >>> >>> On Wed, Sep 30, 2020 at 9:36 AM Eric Pugh <
> ep...@opensourceconnections.com> wrote:
> >>> 
> >>>  The use case of “I want to update something via a API” is I think
> pretty common, and it would be nice to make it a single pattern (the
> annotations?) with lots of examples/developer docs for the next person.
> >>> 
> >>> 
> >>> 
> >>>  On Sep 30, 2020, at 11:04 AM, Timothy Potter <
> thelabd...@gmail.com> wrote:
> >>> 
> >>>  I started looking into removing Managed Resources in master and
> wanted to mention that the LTR contrib also relies on this framework
> (ManagedModelStore and ManagedFeatureStore, see:
> https://lucene.apache.org/solr/guide/8_6/learning-to-rank.html#uploading-a-model).
> I only mention this b/c it's been said several times in this thread that
> nobody uses this feature and it's only for editing config/schema like
> synonyms. Afaik, LTR is a broadly used feature of Solr so now I'm not so
> bullish on removing the ability to manage dynamic resources using a REST
> like API. I agree that changing resources like the synonym set could be
> replaced with configSet updates but I don't see how to replace the RESTful
> model / feature store API w/o something like Managed Resources?
> >>> 
> >>>  From where I sit, I think we should just remove the use of
> restlet in the implementation but keep the API for Solr 9 (master).
> >>> 
> >>>  @Ishan ~ you mentioned there is a way to get REST API like
> behavior w/o using JAX-RS / Jersey ... something about annotations? Can you
> point me to some example code of how that is done please?
> >>> 
> >>> 

Re: 8.6.3 Release

2020-10-01 Thread Jason Gerlowski
I've put together draft release notes for 8.6.3 here [1] [2].  Can
someone please sanity check the summaries there when they get a
chance?  I would appreciate the review.

8.6.3 is a bit interesting in that Lucene has no changes in this
bugfix release.  As a result I had to omit the standard phrase in the
Solr release notes about there being additional changes at the Lucene
level, and change some of the wording in the Lucene announcement to
indicate the lack of changes.  So that's something to pay particular
attention to, if someone can check my wording there.

[1] https://cwiki.apache.org/confluence/display/SOLR/DRAFT-ReleaseNote863
[2] https://cwiki.apache.org/confluence/display/LUCENE/DRAFT-ReleaseNote863

On Wed, Sep 30, 2020 at 10:57 AM Jason Gerlowski  wrote:
>
> The only one that was previously mentioned as a blocker was
> SOLR-14835, but from the comments on the ticket it looks like it ended
> up being purely a cosmetic issue.  Andrzej left a comment there
> suggesting that we "address" this with documentation for 8.6.3 but
> otherwise leave it as-is.
>
> So it looks like we're unblocked on starting the release process.
> Will begin the preliminary steps this afternoon.
>
> On Tue, Sep 29, 2020 at 3:40 PM Cassandra Targett  
> wrote:
> >
> > It looks to me like everything for 8.6.3 is resolved now 
> > (https://issues.apache.org/jira/projects/SOLR/versions/12348713), and it 
> > seems from comments in SOLR-14897 and SOLR-14898 that those fixes make a 
> > Jetty upgrade less compelling to try.
> >
> > Are there any other issues not currently marked for 8.6.3 we’re waiting for 
> > before starting the RC?
> > On Sep 29, 2020, 12:04 PM -0500, Jason Gerlowski , 
> > wrote:
> >
> > That said, if someone can use 8.6.3, what’s stopping them from going to 8.7 
> > when it’s released?
> >
> >
> > The same things that always stop users from going directly to the
> > latest-and-greatest: fear of instability from new minor-release
> > features, reliance on behavior changed across minor versions, breaking
> > changes on Lucene elements that don't guarantee backcompat (e.g.
> > SOLR-14254), security issues in later versions (new libraries pulled
> > in with vulns), etc. There's lots of reasons a given user might want
> > to stick on 8.6.x rather than 8.7 (in the short/medium term).
> >
> > I'm ambivalent about whether we upgrade Jetty in 8.6.3 - as I said above
> > the worst of the Jetty issue should be mitigated by work on our end -
> > but I think there's a lot of reasons users might not upgrade as far as
> > we'd expect/like.
> >
> >
> > On Mon, Sep 28, 2020 at 2:05 PM Erick Erickson  
> > wrote:
> >
> >
> > For me, there’s a sharp distinction between changing a dependency in a 
> > point release just because there’s a new version, and changing the 
> > dependency because there’s a bug in it. That said, if someone can use 
> > 8.6.3, what’s stopping them from going to 8.7 when it’s released? Would it 
> > make more sense to do the upgrades for 8.7 and get that out the door rather 
> > than backport?
> >
> > FWIW,
> > Erick
> >
> > On Sep 28, 2020, at 1:45 PM, Jason Gerlowski  wrote:
> >
> > Hey all,
> >
> > I wanted to add 2 more blocker tickets to the list: SOLR-14897 and
> > SOLR-14898. These tickets (while bad bugs in their own right) are
> > especially necessary because they work around a Jetty buffer-reuse bug
> > (see SOLR-14896) that causes sporadic request failures once triggered.
> >
> > So that brings the list of 8.6.3 blockers up to: SOLR-14850,
> > SOLR-14835, SOLR-14897, and SOLR-14898. (Thanks David for the quick
> > work on SOLR-14768!)
> >
> > Additionally, should we also consider a Jetty upgrade for 8.6.3 in
> > light of the issue mentioned above? I know it's atypical for bug-fix
> > releases to change deps, but here the bug is serious and tied directly
> > to the dep. SOLR-14897 and SOLR-14898 help greatly here, but the
> > Jetty bug is likely still a problem for users making requests that
> > match a specific (albeit rare) profile. Anyone have thoughts?
> >
> > Best,
> >
> > Jason
> >
> > On Fri, Sep 25, 2020 at 12:28 AM Houston Putman  
> > wrote:
> >
> >
> > If I recall correctly, that's a step in the release wizard.
> >
> > After checking, I think this fits the bill:
> > https://github.com/apache/lucene-solr/blob/master/dev-tools/scripts/releaseWizard.yaml#L1435
> >
> > - Houston
> >
> > On Fri, Sep 25, 2020 at 12:06 AM David Smiley  wrote:
> >
> >
> > When moving changes from 8.7 to 8.6.3, must we (the mover of an individual 
> > change) move the CHANGES.txt entry on all branches -- master, branch_8x, 
> > branch_8_6? I expect the release branch but am unsure of the other two. In 
> > the past I have but it's annoying. Does the RM sync CHANGES.txt on the 
> > other branches in one go? If not, I think it'd make sense for that to 
> > happen.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
> > On Thu, Sep 24, 2020 at 6:22 AM 

Highlight with Proximity search throws an exception

2020-10-01 Thread Juraj Jurčo
Hi guys,
we are trying to implement search and we have experienced a strange
situation. When our text contains an apostrophe followed by a single
character AND our search query is composed of exactly two letters
followed by proximity search AND we use highlighting, we get an exception:

> java.lang.IllegalArgumentException: boost must be a positive float, got -1.0


It seems there is a problem at FuzzyTermsEnum.java:271 (float similarity =
1.0f - (float) ed / (float) minTermLength): when it is reached with ed=2,
it sets a negative boost.

I was able to reproduce the error with the following code:

import java.io.IOException;
import java.nio.file.Path;

import org.apache.commons.io.FileUtils;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.search.highlight.TokenSources;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.junit.jupiter.api.Test;

class FindSqlHighlightTest {

   @Test
   void reproduceHighlightProblem() throws IOException, ParseException, InvalidTokenOffsetsException {
      String text = "doesn't";
      String field = "text";
      //NOK: se~, se~2 and any higher number
      //OK: sel~, s~, se~1
      String uQuery = "se~";
      int maxStartOffset = -1;
      Analyzer analyzer = new SimpleAnalyzer();

      Path indexLocation = Path.of("temp", "reproduceHighlightProblem").toAbsolutePath();
      if (indexLocation.toFile().exists()) {
         FileUtils.deleteDirectory(indexLocation.toFile());
      }
      Directory indexDir = FSDirectory.open(indexLocation);

      //Create index
      IndexWriterConfig dimsIndexWriterConfig = new IndexWriterConfig(analyzer);
      dimsIndexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
      IndexWriter idxWriter = new IndexWriter(indexDir, dimsIndexWriterConfig);
      //add doc
      Document doc = new Document();
      doc.add(new TextField(field, text, Field.Store.NO));
      idxWriter.addDocument(doc);
      //commit
      idxWriter.commit();
      idxWriter.close();

      //search & highlight
      Query query = new QueryParser(field, analyzer).parse(uQuery);
      Highlighter highlighter = new Highlighter(new SimpleHTMLFormatter(), new QueryScorer(query));
      TokenStream tokenStream = TokenSources.getTokenStream(field, null, text, analyzer, maxStartOffset);
      String highlighted = highlighter.getBestFragment(tokenStream, text);
      System.out.println(highlighted);
   }
}


Could you please confirm whether it's a bug in Lucene or whether we are
doing something that is not allowed?

Thanks a lot!
Best,
Juraj+


Re: Backward compatability handling across major versions

2020-10-01 Thread David Smiley
Agreed that back-compat matters should not "sneak" into an issue that is
not about that.  There are of course gray areas -- much of Solr core Java
APIs are public yet we don't need to treat everything with such burdensome
care.  It takes experience and some subjectivity to know.  The PR you point
to is very clear to me, as it's a web service API endpoint.

We *can* break back-compat on a major release :-).  But such discussion
deserves its own issue about breaking that compatibility.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Oct 1, 2020 at 4:25 AM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> Hi Devs,
> As per earlier discussions, we want to do a better job of handling major
> version upgrades, possibly support rolling upgrades wherever possible. This
> implies that we don't break backward compatibility without a strong reason
> and adequate discussion around it.
>
> Recently, there was a PR that attempted to sneak in a backward
> incompatible change to an endpoint for plugins (package management). This
> change was totally unrelated to the JIRA/PR and there was absolutely no
> discussion or even an attempt to address the upgrade strategy with that
> change. The attitude was a careless one, on the lines of we can break
> backward compatibility in a major release.
> https://github.com/apache/lucene-solr/pull/1758#discussion_r494134314
>
> Do we have any consensus on whether we need a separate JIRA or broader
> discussion on any backward compatibility breaks? Or shall we let these
> changes be sneaked in, unless someone notices very carefully a few lines of
> changes in a 25 class PR?
>
> Looking for some suggestions here.
> Thanks and regards,
> Ishan
>


Re: Backward compatability handling across major versions

2020-10-01 Thread Ilan Ginzburg
In my opinion, when we really need to break backward compatibility (be
it a change of API or of how features are made available, for example
Autoscaling), the friendly way to do it is to introduce the new
implementation first (co-existing with the old one!), deprecate but
keep the old way, and remove the old way from the code after a few
releases.

This gives users time and freedom of how to manage their transition
(and does not force a transition when upgrading to a specific
version), with both the old and new ways of doing things available for
a while.
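
As a rough illustration of the introduce, deprecate, then remove sequence, here is a minimal Java sketch. The class and method names are hypothetical and not taken from Solr; the point is only that the old entry point keeps working by forwarding to the new one until it is finally deleted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class for illustration; not part of Solr.
class SettingsSketch {
    private final Map<String, String> values = new LinkedHashMap<>();

    // New way: explicit key and value.
    void setProperty(String key, String value) {
        values.put(key, value);
    }

    /**
     * Old way, kept working for a few releases and forwarding to the new one.
     *
     * @deprecated use {@link #setProperty(String, String)}; slated for removal.
     */
    @Deprecated
    void set(String keyEqualsValue) {
        int eq = keyEqualsValue.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("expected key=value, got " + keyEqualsValue);
        }
        setProperty(keyEqualsValue.substring(0, eq), keyEqualsValue.substring(eq + 1));
    }

    String get(String key) {
        return values.get(key);
    }

    public static void main(String[] args) {
        SettingsSketch s = new SettingsSketch();
        s.set("replicas=3");          // old callers keep working
        s.setProperty("shards", "2"); // new callers use the new entry point
        System.out.println(s.get("replicas") + " " + s.get("shards"));
    }
}
```

Because the deprecated method delegates rather than duplicating logic, both paths stay consistent during the transition, and removing it later is a one-method deletion.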

Sometimes this is not possible and we have to be less friendly to our
users, but we should really try to limit these cases as much as
possible (which implies discussions and exploring the available options).

Ilan

On Thu, Oct 1, 2020 at 10:30 AM Noble Paul  wrote:
>
> In fact I was shocked at the cavalier attitude with which backward
> compatibility is broken. If we are going to make a backward
> incompatible change, there should be a JIRA with the proper discussions:
>
> * What is the change?
> * Why is the change important?
> * What is the strategy for someone who does a rolling upgrade?
> * Is it possible to avoid it?
> * Can the change be done in a backward compatible way so that the
> users are not inconvenienced?
>
> On Thu, Oct 1, 2020 at 6:25 PM Ishan Chattopadhyaya
>  wrote:
> >
> > Hi Devs,
> > As per earlier discussions, we want to do a better job of handling major 
> > version upgrades, possibly support rolling upgrades wherever possible. This 
> > implies that we don't break backward compatibility without a strong reason 
> > and adequate discussion around it.
> >
> > Recently, there was a PR that attempted to sneak in a backward incompatible 
> > change to an endpoint for plugins (package management). This change was 
> > totally unrelated to the JIRA/PR and there was absolutely no discussion or 
> > even an attempt to address the upgrade strategy with that change. The 
> > attitude was a careless one, on the lines of we can break backward 
> > compatibility in a major release. 
> > https://github.com/apache/lucene-solr/pull/1758#discussion_r494134314
> >
> > Do we have any consensus on whether we need a separate JIRA or broader 
> > discussion on any backward compatibility breaks? Or shall we let these 
> > changes be sneaked in, unless someone notices very carefully a few lines of 
> > changes in a 25 class PR?
> >
> > Looking for some suggestions here.
> > Thanks and regards,
> > Ishan
>
>
>
> --
> -
> Noble Paul
>
>




Re: What is "Solr core"?

2020-10-01 Thread Ishan Chattopadhyaya
Yes, solr-core usually means the /solr/core module in the repository. It also
refers to the generated artifact solr-core-.jar.

> Or is it a dependency graph where "core"
> depends on nothing outside of core, but anything outside of core can
> depend on core?

solr-core depends on solrj module, but nothing else outside. Other modules
can/should depend on solr-core or solrj.

> In other words, what's the cost of moving "outside of core" something
> that's in core, and what's the value of doing so?

Moving something outside the core means solr-core will have no trace of it.
The value is in reducing the clutter in solr-core. That way, only essential and
important functionality stays in solr-core, and reviews are simpler:
any PR that touches solr-core should get urgent attention, because it can
potentially disrupt the stability of Solr's essential and default
functionality. Today, almost all PRs touch solr-core, so someone can
inadvertently make changes that mess up important parts without others
realizing how much attention that PR needs.

On Thu, Oct 1, 2020 at 2:14 PM Ilan Ginzburg  wrote:

> In code review/design discussions I have a few times seen comments
> about a feature or piece of code: "it doesn't belong in [Solr] core".
>
> What's the definition of Solr "core" other than it being an IntelliJ
> module? Does core have access to things that can't be accessed from
> elsewhere (like an OS kernel that can do processor tricks that user
> code is not allowed to do)? Or is it a dependency graph where "core"
> depends on nothing outside of core, but anything outside of core can
> depend on core?
>
> In other words, what's the cost of moving "outside of core" something
> that's in core, and what's the value of doing so?
>
> Thanks,
> Ilan
>
>
>


What is "Solr core"?

2020-10-01 Thread Ilan Ginzburg
In code review/design discussions I have a few times seen comments
about a feature or piece of code: "it doesn't belong in [Solr] core".

What's the definition of Solr "core" other than it being an IntelliJ
module? Does core have access to things that can't be accessed from
elsewhere (like an OS kernel that can do processor tricks that user
code is not allowed to do)? Or is it a dependency graph where "core"
depends on nothing outside of core, but anything outside of core can
depend on core?

In other words, what's the cost of moving "outside of core" something
that's in core, and what's the value of doing so?

Thanks,
Ilan




Re: Backward compatability handling across major versions

2020-10-01 Thread Noble Paul
In fact I was shocked at the cavalier attitude with which backward
compatibility is broken. If we are going to make a backward
incompatible change, there should be a JIRA with the proper discussions:

* What is the change?
* Why is the change important?
* What is the strategy for someone who does a rolling upgrade?
* Is it possible to avoid it?
* Can the change be done in a backward compatible way so that the
users are not inconvenienced?

On Thu, Oct 1, 2020 at 6:25 PM Ishan Chattopadhyaya
 wrote:
>
> Hi Devs,
> As per earlier discussions, we want to do a better job of handling major 
> version upgrades, possibly support rolling upgrades wherever possible. This 
> implies that we don't break backward compatibility without a strong reason 
> and adequate discussion around it.
>
> Recently, there was a PR that attempted to sneak in a backward incompatible 
> change to an endpoint for plugins (package management). This change was 
> totally unrelated to the JIRA/PR and there was absolutely no discussion or 
> even an attempt to address the upgrade strategy with that change. The 
> attitude was a careless one, on the lines of we can break backward 
> compatibility in a major release. 
> https://github.com/apache/lucene-solr/pull/1758#discussion_r494134314
>
> Do we have any consensus on whether we need a separate JIRA or broader 
> discussion on any backward compatibility breaks? Or shall we let these 
> changes be sneaked in, unless someone notices very carefully a few lines of 
> changes in a 25 class PR?
>
> Looking for some suggestions here.
> Thanks and regards,
> Ishan



-- 
-
Noble Paul




Backward compatability handling across major versions

2020-10-01 Thread Ishan Chattopadhyaya
Hi Devs,
As per earlier discussions, we want to do a better job of handling major
version upgrades, possibly support rolling upgrades wherever possible. This
implies that we don't break backward compatibility without a strong reason
and adequate discussion around it.

Recently, there was a PR that attempted to sneak in a backward incompatible
change to an endpoint for plugins (package management). This change was
totally unrelated to the JIRA/PR and there was absolutely no discussion or
even an attempt to address the upgrade strategy with that change. The
attitude was a careless one, along the lines of "we can break backward
compatibility in a major release."
https://github.com/apache/lucene-solr/pull/1758#discussion_r494134314

Do we have any consensus on whether we need a separate JIRA or broader
discussion for any backward compatibility break? Or shall we let these
changes be sneaked in unless someone very carefully notices a few changed
lines in a 25-class PR?

Looking for some suggestions here.
Thanks and regards,
Ishan


Re: restlet dependencies

2020-10-01 Thread Noble Paul
The annotation
(https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/api/EndPoint.java)
supports wildcards and templated paths.
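
As a rough illustration of what wildcard and templated paths mean for
request routing, here is a small, self-contained Python sketch. It is not
Solr's implementation and does not use the EndPoint annotation. The function
names (compile_template, match_path), the way {name} and a trailing * are
handled, and the example paths are all assumptions made up for this sketch.

```python
import re


def compile_template(template):
    """Turn a path template into a compiled regex (hypothetical helper).

    A '{name}' segment captures exactly one path segment under that name.
    A '*' segment captures everything that follows, slashes included.
    Any other segment must match literally.
    """
    parts = []
    for segment in template.strip("/").split("/"):
        if segment == "*":
            parts.append(r"(?P<rest>.*)")
        elif segment.startswith("{") and segment.endswith("}"):
            # Assumes the placeholder name is a valid identifier.
            parts.append(r"(?P<%s>[^/]+)" % segment[1:-1])
        else:
            parts.append(re.escape(segment))
    return re.compile("^/" + "/".join(parts) + "$")


def match_path(template, path):
    """Return the captured values as a dict, or None if the path does not match."""
    m = compile_template(template).match(path)
    return m.groupdict() if m else None


if __name__ == "__main__":
    # Example paths are invented for illustration only.
    print(match_path("/c/{collection}/config", "/c/films/config"))
    print(match_path("/cluster/files/*", "/cluster/files/a/b.txt"))
    print(match_path("/c/{collection}/config", "/c/films/schema"))
```

The point of the sketch is only that one registered path can serve many
concrete URLs, with the variable parts handed to the handler as named values.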

On Thu, Oct 1, 2020 at 5:28 PM Ishan Chattopadhyaya wrote:
>
> > But when I suggested porting the code that uses restlet to JAX-RS / Jersey,
> > Ishan said that wasn't necessary and is already supported with some
> > Annotations ... I have no idea what that means and need more info about
> > what is already in place.
>
> I was mainly referring to the @EndPoint annotation. It is available for V2
> APIs today (and I think it should be fine to use for anything we build from
> now on, including managed resources V2).
> It is possible to make it work with V1, but that will require some work.
>
> On Thu, Oct 1, 2020 at 12:50 PM Ishan Chattopadhyaya wrote:
>>
>> @Tim
>>
>> Please check ClusterAPI or ZookeeperReadAPI etc.
>> Recently used it in Yasa: 
>> https://github.com/yasa-org/yasa/blob/master/yasa-solr-plugin/src/main/java/io/github/kezhenxu94/YasaHandler.java
>>
>> On Thu, Oct 1, 2020 at 10:46 AM Noble Paul  wrote:
>>>
>>> @Tim Potter
>>>
>>> I tried several times to get rid of the restlet dependency & keep the
>>> functionality as is. I failed miserably. I'm not saying this to
>>> discourage anyone who wants to give it a try. Just letting you know that
>>> it is not as easy as it may sound.
>>>
>>> On Thu, Oct 1, 2020 at 2:42 AM Houston Putman wrote:
>>> >
>>> > +1 to Tomas' proposal. Created SOLR-14907 to track the effort.
>>> >
>>> >  - Houston
>>> >
>>> > On Wed, Sep 30, 2020 at 12:26 PM Tomás Fernández Löbbe wrote:
>>> >>
>>> >> > Let's support the single file upload feature
>>> >> +1, but let this behave exactly as a zip file with a single file in it 
>>> >> (regarding trusted/untrusted). We just need to change the configset 
>>> >> handler to be able to handle non-zip files, and have a way to "locate" 
>>> >> that file inside the configset (in case it needs to go somewhere other 
>>> >> than the root).
>>> >>
>>> >> On Wed, Sep 30, 2020 at 8:45 AM Eric Pugh wrote:
>>> >>>
>>> >>> I think I am in “violent agreement” with you. Let’s understand the
>>> >>> Annotations approach that we have, or pick something that is commonly
>>> >>> used like JAX-RS / Jersey.
>>> >>>
>>> >>> On Sep 30, 2020, at 11:41 AM, Timothy Potter wrote:
>>> >>>
>>> >>> I'm sorry, I don't understand what you mean by "make it a single 
>>> >>> pattern (the annotations?)" Eric?
>>> >>>
>>> >>> To me, the pattern is well established in the Java world: JAX-RS (with 
>>> >>> Jersey as the underlying impl. which has nice integration with Jetty). 
>>> >>> But when I suggested porting the code that uses restlet to JAX-RS / 
>>> >>> Jersey, Ishan said that wasn't necessary and is already supported with 
>>> >>> some Annotations ... I have no idea what that means and need more info 
>>> >>> about what is already in place. Short of that, replacing restlet with 
>>> >>> JAX-RS / Jersey looks like a trivial amount of work to me (and I'm 
>>> >>> happy to take it on).
>>> >>>
>>> >>> Tim
>>> >>>
>>> >>> On Wed, Sep 30, 2020 at 9:36 AM Eric Pugh wrote:
>>> >>>>
>>> >>>> The use case of “I want to update something via an API” is I think
>>> >>>> pretty common, and it would be nice to make it a single pattern (the
>>> >>>> annotations?) with lots of examples/developer docs for the next person.
>>> >>>>
>>> >>>> On Sep 30, 2020, at 11:04 AM, Timothy Potter wrote:
>>> >>>>
>>> >>>> I started looking into removing Managed Resources in master and wanted
>>> >>>> to mention that the LTR contrib also relies on this framework
>>> >>>> (ManagedModelStore and ManagedFeatureStore, see:
>>> >>>> https://lucene.apache.org/solr/guide/8_6/learning-to-rank.html#uploading-a-model).
>>> >>>> I only mention this b/c it's been said several times in this thread
>>> >>>> that nobody uses this feature and it's only for editing config/schema
>>> >>>> like synonyms. Afaik, LTR is a broadly used feature of Solr so now I'm
>>> >>>> not so bullish on removing the ability to manage dynamic resources
>>> >>>> using a REST-like API. I agree that changing resources like the
>>> >>>> synonym set could be replaced with configSet updates but I don't see
>>> >>>> how to replace the RESTful model / feature store API w/o something
>>> >>>> like Managed Resources?
>>> >>>>
>>> >>>> From where I sit, I think we should just remove the use of restlet in
>>> >>>> the implementation but keep the API for Solr 9 (master).
>>> >>>>
>>> >>>> @Ishan ~ you mentioned there is a way to get REST API like behavior
>>> >>>> w/o using JAX-RS / Jersey ... something about annotations? Can you
>>> >>>> point me to some example code of how that is done please?
>>> >>>>
>>> >>>> Cheers,
>>> >>>> Tim
>>> >>>>
>>> >>>> On Wed, Sep 30, 2020 at 8:29 AM David Smiley wrote:
>>> >>>> >
>>> >>>> > These resources are

Re: restlet dependencies

2020-10-01 Thread Ishan Chattopadhyaya
> But when I suggested porting the code that uses restlet to JAX-RS / Jersey,
> Ishan said that wasn't necessary and is already supported with some
> Annotations ... I have no idea what that means and need more info about
> what is already in place.

I was mainly referring to the @EndPoint annotation. It is available for V2
APIs today (and I think it should be fine to use for anything we build from
now on, including managed resources V2).
It is possible to make it work with V1, but that will require some work.

On Thu, Oct 1, 2020 at 12:50 PM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> @Tim
>
> Please check ClusterAPI or ZookeeperReadAPI etc.
> Recently used it in Yasa:
> https://github.com/yasa-org/yasa/blob/master/yasa-solr-plugin/src/main/java/io/github/kezhenxu94/YasaHandler.java
>
> On Thu, Oct 1, 2020 at 10:46 AM Noble Paul  wrote:
>
>> @Tim Potter
>>
>> I tried several times to get rid of the restlet dependency & keep the
>> functionality as is. I failed miserably. I'm not saying this to
>> discourage anyone who wants to give it a try. Just letting you know that
>> it is not as easy as it may sound.
>>
>> On Thu, Oct 1, 2020 at 2:42 AM Houston Putman wrote:
>> >
>> > +1 to Tomas' proposal. Created SOLR-14907 to track the effort.
>> >
>> >  - Houston
>> >
>> > On Wed, Sep 30, 2020 at 12:26 PM Tomás Fernández Löbbe <tomasflo...@gmail.com> wrote:
>> >>
>> >> > Let's support the single file upload feature
>> >> +1, but let this behave exactly as a zip file with a single file in it
>> >> (regarding trusted/untrusted). We just need to change the configset
>> >> handler to be able to handle non-zip files, and have a way to "locate"
>> >> that file inside the configset (in case it needs to go somewhere other
>> >> than the root).
>> >>
>> >> On Wed, Sep 30, 2020 at 8:45 AM Eric Pugh <ep...@opensourceconnections.com> wrote:
>> >>>
>> >>> I think I am in “violent agreement” with you. Let’s understand the
>> >>> Annotations approach that we have, or pick something that is commonly
>> >>> used like JAX-RS / Jersey.
>> >>>
>> >>> On Sep 30, 2020, at 11:41 AM, Timothy Potter wrote:
>> >>>
>> >>> I'm sorry, I don't understand what you mean by "make it a single
>> >>> pattern (the annotations?)" Eric?
>> >>>
>> >>> To me, the pattern is well established in the Java world: JAX-RS (with
>> >>> Jersey as the underlying impl. which has nice integration with Jetty).
>> >>> But when I suggested porting the code that uses restlet to JAX-RS /
>> >>> Jersey, Ishan said that wasn't necessary and is already supported with
>> >>> some Annotations ... I have no idea what that means and need more info
>> >>> about what is already in place. Short of that, replacing restlet with
>> >>> JAX-RS / Jersey looks like a trivial amount of work to me (and I'm
>> >>> happy to take it on).
>> >>>
>> >>> Tim
>> >>>
>> >>> On Wed, Sep 30, 2020 at 9:36 AM Eric Pugh <ep...@opensourceconnections.com> wrote:
>> >>>>
>> >>>> The use case of “I want to update something via an API” is I think
>> >>>> pretty common, and it would be nice to make it a single pattern (the
>> >>>> annotations?) with lots of examples/developer docs for the next person.
>> >>>>
>> >>>> On Sep 30, 2020, at 11:04 AM, Timothy Potter wrote:
>> >>>>
>> >>>> I started looking into removing Managed Resources in master and wanted
>> >>>> to mention that the LTR contrib also relies on this framework
>> >>>> (ManagedModelStore and ManagedFeatureStore, see:
>> >>>> https://lucene.apache.org/solr/guide/8_6/learning-to-rank.html#uploading-a-model).
>> >>>> I only mention this b/c it's been said several times in this thread
>> >>>> that nobody uses this feature and it's only for editing config/schema
>> >>>> like synonyms. Afaik, LTR is a broadly used feature of Solr so now I'm
>> >>>> not so bullish on removing the ability to manage dynamic resources
>> >>>> using a REST-like API. I agree that changing resources like the
>> >>>> synonym set could be replaced with configSet updates but I don't see
>> >>>> how to replace the RESTful model / feature store API w/o something
>> >>>> like Managed Resources?
>> >>>>
>> >>>> From where I sit, I think we should just remove the use of restlet in
>> >>>> the implementation but keep the API for Solr 9 (master).
>> >>>>
>> >>>> @Ishan ~ you mentioned there is a way to get REST API like behavior
>> >>>> w/o using JAX-RS / Jersey ... something about annotations? Can you
>> >>>> point me to some example code of how that is done please?
>> >>>>
>> >>>> Cheers,
>> >>>> Tim
>> >>>>
>> >>>> On Wed, Sep 30, 2020 at 8:29 AM David Smiley wrote:
>> >>>> >
>> >>>> > These resources are fundamentally a part of the configSet and can
>> >>>> > (in general) affect query results and thus flushing caches (via a
>> >>>> > reload) is appropriate.
>> >>>> >
>> >>>> > ~ David Smiley
>> >>>> > Apache Lucene/Solr Search Developer
>> >>>> > http://www.linkedin.com/in/davidwsmiley
>> >>>> >
>> >>>> >
>> >>>> > On Wed, Sep 30, 2020 at 9:06 AM Noble Paul wrote:
>> >>>> >>
>> >>>> >> Well, I believe we should have a mechanism to upload a single file
>> >>>> >> to

Re: restlet dependencies

2020-10-01 Thread Ishan Chattopadhyaya
@Tim

Please check ClusterAPI or ZookeeperReadAPI etc.
Recently used it in Yasa:
https://github.com/yasa-org/yasa/blob/master/yasa-solr-plugin/src/main/java/io/github/kezhenxu94/YasaHandler.java

On Thu, Oct 1, 2020 at 10:46 AM Noble Paul  wrote:

> @Tim Potter
>
> I tried several times to get rid of the restlet dependency & keep the
> functionality as is. I failed miserably. I'm not saying this to
> discourage anyone who wants to give it a try. Just letting you know that
> it is not as easy as it may sound.
>
> On Thu, Oct 1, 2020 at 2:42 AM Houston Putman wrote:
> >
> > +1 to Tomas' proposal. Created SOLR-14907 to track the effort.
> >
> >  - Houston
> >
> > On Wed, Sep 30, 2020 at 12:26 PM Tomás Fernández Löbbe <
> tomasflo...@gmail.com> wrote:
> >>
> >> > Let's support the single file upload feature
> >> +1, but let this behave exactly as a zip file with a single file in it
> (regarding trusted/untrusted). We just need to change the configset handler
> to be able to handle non-zip files, and have a way to "locate" that file
> inside the configset (in case it needs to go somewhere other than the root).
> >>
> >> On Wed, Sep 30, 2020 at 8:45 AM Eric Pugh <ep...@opensourceconnections.com> wrote:
> >>>
> >>> I think I am in “violent agreement” with you. Let’s understand the
> >>> Annotations approach that we have, or pick something that is commonly
> >>> used like JAX-RS / Jersey.
> >>>
> >>>
> >>>
> >>> On Sep 30, 2020, at 11:41 AM, Timothy Potter 
> wrote:
> >>>
> >>> I'm sorry, I don't understand what you mean by "make it a single
> pattern (the annotations?)" Eric?
> >>>
> >>> To me, the pattern is well established in the Java world: JAX-RS (with
> Jersey as the underlying impl. which has nice integration with Jetty). But
> when I suggested porting the code that uses restlet to JAX-RS / Jersey,
> Ishan said that wasn't necessary and is already supported with some
> Annotations ... I have no idea what that means and need more info about
> what is already in place. Short of that, replacing restlet with JAX-RS /
> Jersey looks like a trivial amount of work to me (and I'm happy to take it
> on).
> >>>
> >>> Tim
> >>>
> >>> On Wed, Sep 30, 2020 at 9:36 AM Eric Pugh <
> ep...@opensourceconnections.com> wrote:
> 
>  The use case of “I want to update something via a API” is I think
> pretty common, and it would be nice to make it a single pattern (the
> annotations?) with lots of examples/developer docs for the next person.
> 
> 
> 
>  On Sep 30, 2020, at 11:04 AM, Timothy Potter 
> wrote:
> 
>  I started looking into removing Managed Resources in master and
> wanted to mention that the LTR contrib also relies on this framework
> (ManagedModelStore and ManagedFeatureStore, see:
> https://lucene.apache.org/solr/guide/8_6/learning-to-rank.html#uploading-a-model).
> I only mention this b/c it's been said several times in this thread that
> nobody uses this feature and it's only for editing config/schema like
> synonyms. Afaik, LTR is a broadly used feature of Solr so now I'm not so
> bullish on removing the ability to manage dynamic resources using a REST
> like API. I agree that changing resources like the synonym set could be
> replaced with configSet updates but I don't see how to replace the RESTful
> model / feature store API w/o something like Managed Resources?
> 
>  From where I sit, I think we should just remove the use of restlet in
> the implementation but keep the API for Solr 9 (master).
> 
>  @Ishan ~ you mentioned there is a way to get REST API like behavior
> w/o using JAX-RS / Jersey ... something about annotations? Can you point me
> to some example code of how that is done please?
> 
>  Cheers,
>  Tim
> 
>  On Wed, Sep 30, 2020 at 8:29 AM David Smiley 
> wrote:
> >
> > These resources are fundamentally a part of the configSet and can
> (in general) affect query results and thus flushing caches (via a reload)
> is appropriate.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
> > On Wed, Sep 30, 2020 at 9:06 AM Noble Paul 
> wrote:
> >>
> >> Well, I believe we should have a mechanism to upload a single file
> to
> >> a configset.
> >>
> >> >  A single file configset upload would require the user to reload
> the collection, so it isn't better than managed resources.
> >>
> >> This is not true
> >>
> >> Only config/schema file changes result in core reload.
> >>
> >> On Wed, Sep 30, 2020 at 10:23 PM David Smiley 
> wrote:
> >> >
> >> > Definitely don't remove in 8.x!
> >> >
> >> > >  A single file configset upload would require the user to
> reload the collection, so it isn't better than managed resources.
> >> >
> >> > Do you view that as a substantial point in favor of
> managed-resources?  I view that as a trivial matter, and one I prefer to
> automagic and