Re: Solr configuration options

2020-09-09 Thread David Smiley
I think it's important to be able to roll out changes to nodes in a way the
user controls (e.g. one node at a time), instead of only having an
all-at-once option.  I really liked Tomas's explanation of the need.  The
same need exists for collections, and Solr satisfies that today via
configSets.  Create my-configset-v2 with some new but maybe buggy stuff,
then roll it out slowly to collections as you wish (by using
MODIFYCOLLECTION) -- needn't be all-at-once.  The package manager should
work with that fine because the package is tied at the configSet level in
params.json.  I don't know how node level handlers are registered in the
package manager, though.  If hypothetically there was an option to tie it
via some file on disk (be it solr.xml or something else), then a user
wanting to do this would be empowered to.
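
To make the configSet rollout above concrete, here is a minimal sketch that only builds the Collections API URLs for moving collections to a new configSet one at a time. The host, the collection names, and the helper names are assumptions for illustration; it sends nothing over the network, so the caller decides when (and whether) to issue each request.

```python
from urllib.parse import urlencode


def modify_collection_url(solr_base_url, collection, config_set):
    """Build a MODIFYCOLLECTION URL that points one collection at a configSet."""
    params = {
        "action": "MODIFYCOLLECTION",
        "collection": collection,
        "collection.configName": config_set,
    }
    return f"{solr_base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


def rollout_plan(solr_base_url, collections, config_set):
    """Return one URL per collection, in the order the caller wants to roll out."""
    return [modify_collection_url(solr_base_url, c, config_set) for c in collections]


if __name__ == "__main__":
    # Hypothetical host and collection names, for illustration only.
    plan = rollout_plan(
        "http://localhost:8983/solr", ["canary", "products"], "my-configset-v2"
    )
    for url in plan:
        print(url)
```

Issuing the first URL, watching the canary collection, and only then continuing with the rest is what keeps this from being an all-at-once change.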

Anyway, it appears SIP-11 (Uniform cluster-level configuration API) /
SOLR-14843 is where this discussion has gone.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Sep 3, 2020 at 3:03 PM Tomás Fernández Löbbe wrote:

> Thanks Ishan,
> I still don't think it covers the cases very well. The possibilities of
> how that handler could be screwing up things are infinite (it could be
> corrupting local cores, causing OOMs, it could be spawning infinite loops,
> you name it). If the new handler requires initialization that reaches out
> to an external system, having a large enough cluster means this can hit
> throttling or even take something down if you start them all atomically.
> I'm fine with Solr supporting atomic deployments with packages and such,
> but I'm not fine with that being the only way to deploy Solr, it may not be
> suitable for all use cases.
>
> Also, your workaround requires a ton of knowledge of Solr APIs and
> internals, vs a simpler and more standard approach where there are two
> versions (Docker images, AMIs, tars, whatever you use): old and new.  Add
> "new" and remove "old" in your preferred way. This is exactly what
> you'll do when you need to upgrade Solr, BTW, so it needs to be handled
> anyway.
>
> On Thu, Sep 3, 2020 at 11:35 AM Erick Erickson wrote:
>
>> Hmmm, interesting point about deliberately changing one solr.xml for
>> testing purposes. To emulate that on a per-node basis you’d have to have
>> something like a “node props” associated with each node, which my instant
>> reaction to is “y”.
>>
>> As far as API only, I’d assumed changes to clusterprops could be either
>> way. If we allow Solr to start with no clusterprops, then the API route
>> would create one. Pros can go ahead and hand-edit one and push it up if
>> they want.
>>
>> In your nightmare scenario, where are the ZK’s located? Are they still
>> running somewhere? Could you hand-edit clusterprops and push it to ZK?
>>
>> I wish everyone would just use Solr the way I think about it ;)
>>
>> > On Sep 3, 2020, at 2:11 PM, Tomás Fernández Löbbe <
>> tomasflo...@gmail.com> wrote:
>> >
>> > I can see that some of these configurations should be moved to
>> clusterprops.json, but I don’t believe this is the case for all of them. Some
>> are configurations that target the local node (i.e. sharedLib path),
>> some are needed before connecting to ZooKeeper (zk config). As for
>> configuration of global handlers and components: while in general you do
>> want to see the same conf across all nodes, you may not want the changes
>> to take effect atomically and instead rely on a phased upgrade (rolling,
>> blue/green, etc.), where the conf goes together with the binaries that are
>> being deployed. I also fear that making the configuration of some of these
>> components dynamic means we have to make the code handle them dynamically
>> (i.e. recreate the CollectionsHandler based on a callback from ZooKeeper).
>> This would rarely be used in reality, but all our code would need to be
>> restructured to handle it. I fear this will complicate the code needlessly
>> and may introduce leaks and races of all kinds. If those components can
>> have configuration that should be dynamic (some toggle, threshold, etc.),
>> I’d love to see those as clusterprops, key-value mostly.
>> >
>> > If we were to put this configuration in clusterprops, would that mean
>> that I’m only able to do config changes via API? On a new cluster, do I
>> need to start Solr, make a collections API call to change the collections
>> handler? Or am I supposed to manually change the clusterprops file before
>> starting Solr and push it to ZooKeeper (having a file intended for manual
>> edits and API edits is bad IMO)? Maybe via the cli, but still, I’d need to
>> do this for every cluster I create (vs have the solr.xml in my source
>> repository and Docker image, for example). Also I lose the ability to have
>> this configuration in my git repo?
>> >
>> > I'm +1 to keep a node configuration local to the 

[ANNOUNCE] Apache PyLucene 8.6.1

2020-09-09 Thread Andi Vajda



I am pleased to announce the availability of Apache PyLucene 8.6.1.

Apache PyLucene, a subproject of Apache Lucene, is a Python extension for
accessing Apache Lucene Core. Its goal is to allow you to use Lucene's text
indexing and searching capabilities from Python. It is API compatible with
Lucene 8.x Core, version 8.6.1.

For changes in this release, please review:
http://svn.apache.org/repos/asf/lucene/pylucene/tags/pylucene_8_6_1/CHANGES
http://svn.apache.org/repos/asf/lucene/pylucene/tags/pylucene_8_6_1/jcc/CHANGES
http://lucene.apache.org/core/8_6_1/changes/Changes.html

Apache PyLucene is available from the following download page:
http://www.apache.org/dyn/closer.cgi/lucene/pylucene/pylucene-8.6.1-src.tar.gz

When downloading from a mirror site, please remember to verify the downloads
using signatures found on the Apache site:
https://dist.apache.org/repos/dist/release/lucene/pylucene/KEYS

For more information on Apache PyLucene, visit the project home page:
  http://lucene.apache.org/pylucene

Andi..


Re: [VOTE] Release PyLucene 8.6.1

2020-09-09 Thread Andi Vajda



On Wed, 9 Sep 2020, Dawid Weiss wrote:

> +1 to release, thanks Andi.

This vote has passed.
Thank you all who voted !

Andi..

> Dawid
>
> On Tue, Aug 25, 2020 at 1:56 AM Andi Vajda  wrote:
>>
>> The PyLucene 8.6.1 (rc1) release tracking the recent release of
>> Apache Lucene 8.6.1 is ready.
>>
>> A release candidate is available from:
>> https://dist.apache.org/repos/dist/dev/lucene/pylucene/8.6.1-rc1/
>>
>> PyLucene 8.6.1 is built with JCC 3.8, included in these release artifacts.
>>
>> JCC 3.8 supports Python 3.3 up to Python 3.8 (in addition to Python 2.3+).
>> PyLucene may be built with Python 2 or Python 3.
>>
>> Please vote to release these artifacts as PyLucene 8.6.1.
>> Anyone interested in this release can and should vote !
>>
>> Thanks !
>>
>> Andi..
>>
>> ps: the KEYS file for PyLucene release signing is at:
>> https://dist.apache.org/repos/dist/release/lucene/pylucene/KEYS
>> https://dist.apache.org/repos/dist/dev/lucene/pylucene/KEYS
>>
>> pps: here is my +1




Re: Tests that use bin/solr?

2020-09-09 Thread Houston Putman
I would agree that the docker-solr tests (soon to be the solr/docker tests)
do a fairly good job of testing bin/solr and general runtime logic. Once
that gets merged in, I think it would be great to use that test suite for
all of the additional "bin/solr" tests that we want to add. Once we get the
docker stuff in and it's matured a bit, we could get it running in our CI
and have nightly images built and tested.

Running the tests in a container would be great, and would solve some of
the cross-platform issues that exist there. At that point it might be good
to add an option to run the tests via Kubernetes instead of Docker. That
way we can fully containerize the tests without having to mount the docker
socket within a container, which would be required if we tried to run Solr
containers from within the tests container.

That'd be a fairly big overhaul, though I do think it would be a good win.
Ideally we'd have the option to run the tests with docker or Kube, but I'm
not sure how feasible that would be.

- Houston

On Tue, Sep 8, 2020 at 4:39 PM David Smiley  wrote:

> I never really looked at the smoketester before today.
>  smokeTestRelease.py does very very little with a running Solr instance --
> just run techproducts and do a query.  testSolrExample() is the function
> where this happens.  Unless I'm missing something big, I can
> confidently say that the docker-solr project's tests actually test the
> final Solr instance substantially more than our project does.
>
> https://github.com/docker-solr/docker-solr/tree/master/tests
>
> I'd like to see the Solr project incorporating the docker-solr tests into
> nightly CI builds somehow.  We'll be much closer to making this happen once
> docker-solr is absorbed into Solr --
> https://github.com/apache/lucene-solr/pull/1769  because we'll be able to
> have nightly builds of images.  The docker-solr separate project is limited
> to releases, which poses a distinct chicken-and-egg problem.
>
> Maybe those tests could themselves run from a Docker image, thus
> insulating platform issues further.
>
> Separately, someday I do want to work on running SolrExampleTests against
> the docker image, which will be more possible once the projects merge.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Tue, Sep 8, 2020 at 8:02 AM Jason Gerlowski 
> wrote:
>
>> I created a few JIRAs to spike "bin/solr" tests several years back
>> (SOLR-11749).  I didn't have quite enough time to get it across the finish
>> line previously but I think it's a great idea to have tests of this sort.
>> At the time I was writing tests in bash - but realized that wasn't a
>> particularly scalable approach.
>>
>> Would be happy to help where I can if the effort gets restarted.
>>
>> Best,
>>
>> Jason
>>
>> On Mon, Sep 7, 2020 at 3:14 AM Uwe Schindler  wrote:
>>
>>> It should be part of an integration test suite, only on module "package"
>>> after assembly. That's something to setup. Please make sure that it works
>>> with any operating system, because we are leaving Java world here (and
>>> because of this, don't mix that into default unit tests).
>>>
>>> Currently we have a test for bin/solr: Smoketester 
>>>
>>> Uwe
>>>
>>> -
>>> Uwe Schindler
>>> Achterdiek 19, D-28357 Bremen
>>> https://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>>
>>> > -----Original Message-----
>>> > From: Dawid Weiss 
>>> > Sent: Monday, September 7, 2020 9:00 AM
>>> > To: Lucene Dev 
>>> > Subject: Re: Tests that use bin/solr?
>>> >
>>> > Just a note - such integration tests should depend on (and consume)
>>> > the output of solr/packaging (a ZIP file with fully assembled
>>> > package). Then you're really sure you're testing the final artifact.
>>> >
>>> > Dawid
>>> >
>>> > On Mon, Sep 7, 2020 at 7:51 AM David Smiley 
>>> wrote:
>>> > >
>>> > > Do we have any tests that operate on a "real" Solr instance running
>>> from
>>> > "bin/solr"?  Such tests could find problems with bin/solr and any
>>> classpath
>>> > matters in how Jetty operates.  Solr does have JettySolrRunner which
>>> is great
>>> > but doesn't cover the aforementioned matters.
>>> > >
>>> > > We've got some really nice tests in SolrExampleTests which is a base
>>> class and
>>> > many implementations that create SolrJ clients in different ways.  I
>>> could
>>> > imagine modifying this such that if a magic system property is
>>> specified to a
>>> > URL of an existing Solr instance, then the test would not create a
>>> > JettySolrRunner but instead use the configured one.  This would then be
>>> > executed by the smoke tester and maybe a future Docker release
>>> process.  I
>>> > have test infrastructure I wrote where I work that does this sort of
>>> thing for our
>>> > Solr plugins, and it works great.
>>> > >
>>> > > ~ David Smiley
>>> > > Apache Lucene/Solr Search Developer
>>> > > http://www.linkedin.com/in/davidwsmiley
>>> >
>>> > 
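
The idea that opened this thread (let a test use an already-running Solr when a "magic" property names its URL, and otherwise start its own JettySolrRunner) can be sketched roughly as below. This is a Python analogue rather than the Java test code: the variable name TEST_SOLR_URL, the function name, and the start_embedded callback are all hypothetical, and an environment mapping stands in for a Java system property.

```python
import os


def resolve_solr_url(start_embedded, env=None, var_name="TEST_SOLR_URL"):
    """Return (base_url, owns_server).

    If the (hypothetical) variable names an existing Solr instance, use it and
    report that the test does not own the server. Otherwise call
    start_embedded(), which stands in for starting a JettySolrRunner and
    returns its base URL.
    """
    env = os.environ if env is None else env
    external = env.get(var_name, "").strip()
    if external:
        return external.rstrip("/"), False
    return start_embedded().rstrip("/"), True


if __name__ == "__main__":
    def fake_embedded():
        # Placeholder for starting an in-process server.
        return "http://127.0.0.1:8983/solr"

    url, owns_server = resolve_solr_url(fake_embedded)
    print(url, "(started by test)" if owns_server else "(external instance)")
```

Returning owns_server lets teardown skip shutting down an instance the test did not start, which matters when the smoke tester or a Docker-based run supplies the server.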

Re: Code Analysis during CI?

2020-09-09 Thread Bruno Roustant
+1 for analysis within the PR workflow.

Le ven. 4 sept. 2020 à 06:38, David Smiley  a écrit :

> Sounds great to me!  I'm really glad to hear it works with the PR
> workflow, and only on the files touched in the PR.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Thu, Sep 3, 2020 at 8:03 PM Tom DuBuisson  wrote:
>
>> Tomás,
>> Oof, thanks for the note on TOS.  I fixed the link.  The tool can be
>> configured and I'm happy to make things work better for your use case.
>> Muse is free for public repos and will remain free for open source
>> indefinitely.  You can try it and remove it any time - github is in charge
>> of access control and provides you as the repository owner with control via
>> the website.
>>
>> On Thu, Sep 3, 2020 at 4:37 PM Tomás Fernández Löbbe <
>> tomasflo...@gmail.com> wrote:
>>
>>> Thanks Tom. I think this could be very useful as long as it is
>>> configurable. (The "terms of use" link here [1] points to "google.com", so I
>>> couldn't check that, but they claim it's free for public repos, so...). We
>>> could always try it and remove it if we don't like it? What do others think?
>>>
>>>
>>> [1] https://github.com/apps/muse-dev
>>>
>>> On Thu, Sep 3, 2020 at 3:06 PM Tom DuBuisson  wrote:
>>>
>>>> Hello Lucene/Solr folks,
>>>>
>>>> During Lucene development CI is used for build and unit tests to gate
>>>> merges.  The CI doesn't yet include any analysis tools though, but their
>>>> use has been discussed [1].  I fixed some issues flagged by Facebook's
>>>> Infer and was prompted to bring up the topic here [2].
>>>>
>>>> The recent PR fixed some low-hanging fruit that was reported when I ran
>>>> Muse [3] - a github app that is a platform for static analysis tools.
>>>> Muse's platform bundles the most useful analysis tools, all open source
>>>> with many of them developed by FANG, and triggers analysis on PRs
>>>> then delivers results as comments.
>>>>
>>>> Because of the PR-centric workflow you only see issues related to the
>>>> changes in the pull request.  This means that even a project where tools
>>>> give a daunting list of issues can still have quiet day-to-day operation.
>>>> Muse also has options to configure individual tools and turn tools or
>>>> warnings off entirely.  If there are concerns in addition to noise and
>>>> added mental tax on development then I'd really like to hear those
>>>> thoughts.
>>>>
>>>> Would you be up for running Muse on the lucene-solr repo?  Let me know,
>>>> and I hope to hear your thoughts on analysis tools either way.
>>>>
>>>> -Tom
>>>>
>>>> [1] https://issues.apache.org/jira/projects/LUCENE/issues/LUCENE-8847
>>>> [2] https://issues.apache.org/jira/projects/SOLR/issues/SOLR-14819
>>>> [3] Muse result on Lucene:
>>>> https://console.muse.dev/result/TomMD/lucene-solr/01EH5WXS6C1RH1NFYHP6ATXTZ9?tab=results
>>>> Muse app link: https://github.com/apps/muse-dev
>>>> [4] https://github.com/TomMD/lucene-solr/pulls
>>>> [5] Example of muse commenting on an issue
>>>> https://github.com/TomMD/shiro/pull/2
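
As a rough illustration of why a PR-centric workflow stays quiet even on a repository with a long backlog of findings, the sketch below keeps only the findings located in files touched by the pull request. It is not how Muse is implemented; the finding layout (a dict with "file" and "message" keys) and the function name are assumptions made for this example.

```python
def findings_for_pr(findings, changed_files):
    """Keep only the findings whose file is touched by the pull request."""
    changed = set(changed_files)
    return [f for f in findings if f["file"] in changed]


if __name__ == "__main__":
    # Hypothetical analysis output and PR file list, for illustration only.
    all_findings = [
        {"file": "core/A.java", "message": "possible null dereference"},
        {"file": "core/B.java", "message": "resource leak"},
        {"file": "util/C.java", "message": "unused variable"},
    ]
    for finding in findings_for_pr(all_findings, ["core/B.java"]):
        print(finding["file"], "-", finding["message"])
```

Findings in untouched files are still there, but they never show up as review comments, which is the "quiet day-to-day operation" described above.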




Re: [VOTE] Release PyLucene 8.6.1

2020-09-09 Thread Dawid Weiss
+1 to release, thanks Andi.

Dawid

On Tue, Aug 25, 2020 at 1:56 AM Andi Vajda  wrote:
>
>
> The PyLucene 8.6.1 (rc1) release tracking the recent release of
> Apache Lucene 8.6.1 is ready.
>
> A release candidate is available from:
> https://dist.apache.org/repos/dist/dev/lucene/pylucene/8.6.1-rc1/
>
> PyLucene 8.6.1 is built with JCC 3.8, included in these release artifacts.
>
> JCC 3.8 supports Python 3.3 up to Python 3.8 (in addition to Python 2.3+).
> PyLucene may be built with Python 2 or Python 3.
>
> Please vote to release these artifacts as PyLucene 8.6.1.
> Anyone interested in this release can and should vote !
>
> Thanks !
>
> Andi..
>
> ps: the KEYS file for PyLucene release signing is at:
> https://dist.apache.org/repos/dist/release/lucene/pylucene/KEYS
> https://dist.apache.org/repos/dist/dev/lucene/pylucene/KEYS
>
> pps: here is my +1