Re: Lucene v9.9.1: org.apache.lucene.search.ScoreMode

2024-01-06 Thread Marcus Eagan
It’s there for sure, but that doesn’t mean there is no problem. Could you
share what you are seeing in more detail given the class certainly exists?

Marcus Eagan



On Sat, Jan 6, 2024 at 14:05 Chris Hegarty
 wrote:

> Hi,
>
> I see no issue. ScoreMode is present in lucene-core-9.9.1.jar
>
> $ curl https://dlcdn.apache.org/lucene/java/9.9.1/lucene-9.9.1.tgz > lucene-9.9.1.tgz
> ...
> $ tar -xzf lucene-9.9.1.tgz
> $ jar -tvf lucene-9.9.1/modules/lucene-core-9.9.1.jar | grep ScoreMode
>   1618 Wed Dec 13 11:06:00 GMT 2023 org/apache/lucene/search/ScoreMode.class
>
> Or from maven
>
> $ curl https://repo1.maven.org/maven2/org/apache/lucene/lucene-core/9.9.1/lucene-core-9.9.1.jar > lucene-core-9.9.1.jar
> ...
> $ jar -tvf lucene-core-9.9.1.jar | grep ScoreMode
> -rw-r--r--  0 0  01618 13 Dec 11:06 org/apache/lucene/search/ScoreMode.class
>
> -Chris.
>
> > On 6 Jan 2024, at 12:42, Nazerke S  wrote:
> >
> > Hi,
> >
> > While trying to upgrade Solr to use Lucene v9.9.1, I encountered a
> > class-resolution error: 'org.apache.lucene.search.ScoreMode' not found.
> > I took a quick look at the ScoreMode class in the Lucene codebase, and
> > there is no change.
> > Maybe it is related to lucene-core-9.9.1.jar issue where ScoreMode class
> is  ?
> > Could anyone help with this?
> >
> > Thanks,
> >
> > --Nazerke
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>
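
When a class that is known to ship in a jar still fails to resolve, a common cause is that a different (or no) copy of the jar ends up on the runtime classpath. The following is a minimal sketch, not taken from the thread, that reports where a class is actually loaded from. `ClassLocatorSketch` and `locate` are hypothetical names; only standard JDK calls are used.

```java
import java.security.CodeSource;
import java.util.Optional;

final class ClassLocatorSketch {

  /** Returns where the named class was loaded from, or empty if it cannot be resolved. */
  static Optional<String> locate(String className) {
    try {
      // 'false' avoids running static initializers; we only want to resolve the class.
      Class<?> clazz =
          Class.forName(className, false, ClassLocatorSketch.class.getClassLoader());
      CodeSource source = clazz.getProtectionDomain().getCodeSource();
      // JDK bootstrap classes (e.g. java.lang.String) have no code source.
      return Optional.of(source == null ? "<bootstrap>" : String.valueOf(source.getLocation()));
    } catch (ClassNotFoundException | LinkageError e) {
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    // Default to the class discussed in this thread; pass another name as an argument.
    String name = args.length > 0 ? args[0] : "org.apache.lucene.search.ScoreMode";
    System.out.println(name + " -> " + locate(name).orElse("NOT FOUND on this classpath"));
  }
}
```

Run with the same classpath as the failing build. If the class is reported as not found, or the location is not the expected lucene-core-9.9.1.jar, the problem is likely in dependency resolution rather than in the jar itself.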


Re: Welcome Luca Cavanna to the Lucene PMC

2023-10-20 Thread Marcus Eagan
Congratulations Luca, well deserved!

On Thu, Oct 19, 2023 at 11:02 PM Lu Xugang  wrote:

> Congratulations Luca !
>
>
> Xugang
> https://www.amazingkoala.com.cn
>
>
> Adrien Grand wrote on Fri, Oct 20, 2023 at 13:51:
>
>> I'm pleased to announce that Luca Cavanna has accepted an invitation to
>> join the Lucene PMC!
>>
>> Congratulations Luca, and welcome aboard!
>>
>>
>> --
>> Adrien
>>
>

-- 
Marcus Eagan


Re: Patch to change murmurhash implementation slightly

2023-08-25 Thread Marcus Eagan
Thomas,

Also, is it possible to open this patch as a pull request in GitHub?

I guess it does not matter for a lot of the people here. It would make it
easier for more people to collaborate in that medium given the shift to
GitHub recently.

- Marcus



On Fri, Aug 25, 2023 at 7:03 PM Marcus Eagan  wrote:

> Hi Thomas,
>
> Thank you for the hard work thus far. I'm excited to see if the community
> can benefit from the work. The best way to use the lucene bench is to run
> the baseline and candidate branches as described here
> <https://github.com/mikemccand/luceneutil#preparing-the-benchmark-candidates>
> .
>
> I can help you with it and even submit an update to the benchmark repo as
> needed if we find that we can improve some of the steps there to make it
> easier for onlookers. Have you already tried setting up lucene_util?
>
> - Marcus
>
>
>
> On Fri, Aug 25, 2023 at 6:34 PM Thomas Dullien
>  wrote:
>
>> Hey all,
>>
>> apologies if the chart is incorrect. Anyhow, I think the more important
>> questions are:
>>
>> 1) Which benchmarks does the Lucene community have that y'all would like
>> to see an improvement on before accepting this (or any other future)
>> performance patches?
>>
>> I'm guessing the reason why the patch improves http_log performance is
>> because that benchmark indexes many IP addresses, which tend to be 9-15
>> bytes in length. That does not strike me as an atypical workload.
>>
>> I've also done some quick experiments to estimate the average UTF-8 word
>> size of languages in non-ASCII scripts (for example Hindi), and it seems to
>> me the average word size is 2-3 times larger than English because most
>> Indic characters will encode to 2-3 bytes. The following excerpt from Hindi
>> Wikipedia is 2242 bytes, but just 146 words:
>>  begin hindi-example.txt 
>> १० वीं शताब्दी के बाद घुमन्तु मुस्लिम वंशों ने जातियता तथा धर्म द्वारा
>> संघठित तेज घोड़ों से युक्त बड़ी सेना के द्वारा उत्तर-पश्चिमी मैदानों पर बार
>> बार आकर्मण किया, अंततः १२०६ [[दिल्ली सल्तनत|इस्लामीक दिल्ली सल्तनत]] की
>> स्थापना हुई।{{sfn|Ludden|2002|p = 68}} उन्हें उतर भारत  को
>>  धिक नियंत्रित करना था तथा दक्षिण भारत पर आकर्मण करना था। भारतीय कुलीन
>> वर्ग के विघटनकारी सल्तनत ने बड़े पैमाने पर गैर-मुस्लिमों को स्वयं के
>> रीतिरिवाजों पर छोड़ दिया।{{sfn|Asher|Talbot|2008|p =
>> 47}}{{sfn|Metcalf|Metcalf|2006|p = 6}} १३ वीं शताब्दी में [[मंगोल साम्राज्
>> य|मंगोलों]] द्वारा किये के विनाशकारी आकर्मण से भारत की रक्षा की। सल्तनत
>> के पतन के कारण स्वशासित [[विजयनगर साम्राज्य|विजयनगर साम्राज्य]] का मार्ग
>> प्रशस्त हुआ।{{sfn|Asher|Talbot|2008|p = 53}} एक मजबूत [[शैव|शैव परंपरा]] और
>> सल्तनत की सैन्य तकनीक पर निर्माण करते हुए साम्रा
>> ज्य ने भारत के विशाल भाग पर शासन किया और इसके बाद लंबे समय तक दक्षिण
>> भारतीय समाज को प्रभावित किया।{{sfn|Metcalf|Metcalf|2006|p =
>> 12}}{{sfn|Asher|Talbot|2008|p = 53}}
>>  end hindi-example.txt 
>>
>> cat hindi-example.txt | wc -w
>> 146
>> 2242 divided by 146 yields a word length of ~15 bytes, so I'd be
>> surprised if average word length of Hindi wikipedia was below 12 bytes.
>>
>> Do y'all wish for me to create another benchmark for indexing indic
>> scripts and large corpora of IPv4 and IPv6 addresses (both things that seem
>> not very niche), and if the patch shows improvement there, will y'all
>> accept it?
>>
>> 2) It'd be great if there was some public documentation about "what we
>> want to see from a performance improvement before we accept it". I
>> personally find the discussion around this patch to be bewilderingly long,
>> and I am happy to help work on such a guideline with others -- IMO having a
>> clear set of hurdles is preferable to the back-and-forth we've had so far?
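
The bytes-per-word estimate in point 1 above (2242 bytes / 146 words, about 15 bytes) can be reproduced on any sample text. The sketch below is an illustration under stated assumptions, not code from the thread: `WordLengthSketch` is a hypothetical helper, words are split on whitespace as `wc -w` does, and only the bytes of the words themselves are counted, so the result is slightly lower than dividing the whole file size by the word count.

```java
import java.nio.charset.StandardCharsets;

final class WordLengthSketch {

  /** Average number of UTF-8 bytes per whitespace-separated word; 0 for blank input. */
  static double averageUtf8BytesPerWord(String text) {
    String trimmed = text.trim();
    if (trimmed.isEmpty()) {
      return 0.0;
    }
    String[] words = trimmed.split("\\s+");
    long totalBytes = 0;
    for (String word : words) {
      // Devanagari code points encode to 3 bytes each in UTF-8, ASCII letters to 1.
      totalBytes += word.getBytes(StandardCharsets.UTF_8).length;
    }
    return (double) totalBytes / words.length;
  }

  public static void main(String[] args) {
    System.out.println(averageUtf8BytesPerWord("hello world"));
    System.out.println(averageUtf8BytesPerWord("\u092d\u093e\u0930\u0924")); // one Hindi word
  }
}
```

A four-code-point Devanagari word already occupies 12 bytes, which is why terms in such scripts can exceed an 8-byte block even when they are short in characters.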
>>
>> Cheers,
>> Thomas
>>
>>
>>
>>
>>
>> On Fri, Aug 25, 2023 at 3:38 PM Robert Muir  wrote:
>>
>>> chart is wrong, average word length for english is like 5.
>>>
>>> On Fri, Aug 25, 2023 at 9:35 AM Thomas Dullien
>>>  wrote:
>>> >
>>> > Hey all,
>>> >
>>> > another data point: There's a diagram with the relevant distributions
>>> of word lengths in various languages here:
>>> >
>>> >
>>> https://www.reddit.com/r/languagelearning/comments/h9eao2/average_word_length_of_languages_in_europe_except/
>>> >
>>> > While English is close to the 8-byte limit, average word length in
>>> German is 11+ bytes, and Mongolian and Finnish will likewise be 11+ bytes.
>>> I'll gather some averages over the va

Re: Patch to change murmurhash implementation slightly

2023-08-25 Thread Marcus Eagan
> > Let me know what data you'd like to see to decide whether
>> this patch is a good idea, and if there is consensus among the Lucene
>> committers that those are reasonable criteria, I'll work on producing that
>> data.
>> >>>>> >> >> >
>> >>>>> >> >> > Cheers,
>> >>>>> >> >> > Thomas
>> >>>>> >> >> >
>> >>>>> >> >> >
>> >>>>> >> >> >
>> >>>>> >> >> > On Tue, Apr 25, 2023 at 4:02 PM Robert Muir <
>> rcm...@gmail.com> wrote:
>> >>>>> >> >> >>
>> >>>>> >> >> >> well there is some cost, as it must add additional checks
>> to see if
>> >>>>> >> >> >> its longer than 8. in your patch, additional loops. it
>> increases the
>> >>>>> >> >> >> method size and may impact inlining and other things. also
>> we can't
>> >>>>> >> >> >> forget about correctness, if the hash function does the
>> wrong thing it
>> >>>>> >> >> >> could slow everything to a crawl.
>> >>>>> >> >> >>
>> >>>>> >> >> >> On Tue, Apr 25, 2023 at 9:56 AM Thomas Dullien
>> >>>>> >> >> >>  wrote:
>> >>>>> >> >> >> >
>> >>>>> >> >> >> > Ah, I see what you mean.
>> >>>>> >> >> >> >
>> >>>>> >> >> >> > You are correct -- the change will not speed up a 5-byte
>> word, but it *will* speed up all 8+-byte words, at no cost to the shorter
>> words.
>> >>>>> >> >> >> >
>> >>>>> >> >> >> > On Tue, Apr 25, 2023 at 3:20 PM Robert Muir <
>> rcm...@gmail.com> wrote:
>> >>>>> >> >> >> >>
>> >>>>> >> >> >> >> if a word is of length 5, processing 8 bytes at a time
>> isn't going to
>> >>>>> >> >> >> >> speed anything up. there aren't 8 bytes to process.
>> >>>>> >> >> >> >>
>> >>>>> >> >> >> >> On Tue, Apr 25, 2023 at 9:17 AM Thomas Dullien
>> >>>>> >> >> >> >>  wrote:
>> >>>>> >> >> >> >> >
>> >>>>> >> >> >> >> > Is average word length <= 4 realistic though? I mean,
>> even the english wiki corpus has ~5, which would require two calls to the
>> lucene layer instead of one; e.g. multiple layers of virtual dispatch that
>> are unnecessary?
>> >>>>> >> >> >> >> >
>> >>>>> >> >> >> >> > You're not going to pay any cycles for reading 8
>> bytes instead of 4 bytes, so the cost of doing so will be the same - while
>> speeding up in cases where 4 isn't quite enough?
>> >>>>> >> >> >> >> >
>> >>>>> >> >> >> >> > Cheers,
>> >>>>> >> >> >> >> > Thomas
>> >>>>> >> >> >> >> >
>> >>>>> >> >> >> >> > On Tue, Apr 25, 2023 at 3:07 PM Robert Muir <
>> rcm...@gmail.com> wrote:
>> >>>>> >> >> >> >> >>
>> >>>>> >> >> >> >> >> i think from my perspective it has nothing to do
>> with cpus being
>> >>>>> >> >> >> >> >> 32-bit or 64-bit and more to do with the average
>> length of terms in
>> >>>>> >> >> >> >> >> most languages being smaller than 8. for the
>> languages with longer
>> >>>>> >> >> >> >> >> word length, its usually because of complex
>> morphology that most users
>> >>>>> >> >> >> >> >> would stem away. so doing 4 bytes at a time seems
>> optimal IMO.
>> >>>>> >> >> >> >> >> languages from nature don't care about your cpu.
>> >>>>> >> >> >> >> >>
>> >>>>> >> >> >> >> >> On Tue, Apr 25, 2023 at 8:52 AM Michael McCandless
>> >>>>> >> >> >> >> >>  wrote:
>> >>>>> >> >> >> >> >> >
>> >>>>> >> >> >> >> >> > For a truly "pure" indexing test I usually use a
>> single thread for indexing, and SerialMergeScheduler (using that single
>> thread to also do single-threaded merging).  It makes the indexing take
>> forever lol but it produces "comparable" results.
>> >>>>> >> >> >> >> >> >
>> >>>>> >> >> >> >> >> > But ... this sounds like a great change anyway?
>> Do we really need to gate it on benchmark results?  Do we think there could
>> be a downside e.g. slower indexing on (the dwindling) 32 bit CPUs?
>> >>>>> >> >> >> >> >> >
>> >>>>> >> >> >> >> >> > Mike McCandless
>> >>>>> >> >> >> >> >> >
>> >>>>> >> >> >> >> >> > http://blog.mikemccandless.com
>> >>>>> >> >> >> >> >> >
>> >>>>> >> >> >> >> >> >
>> >>>>> >> >> >> >> >> > On Tue, Apr 25, 2023 at 7:39 AM Robert Muir <
>> rcm...@gmail.com> wrote:
>> >>>>> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >> >> I think the results of the benchmark will depend
>> on the properties of
>> >>>>> >> >> >> >> >> >> the indexed terms. For english wikipedia
>> (luceneutil) the average word
>> >>>>> >> >> >> >> >> >> length is around 5 bytes so this optimization may
>> not do much.
>> >>>>> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >> >> On Tue, Apr 25, 2023 at 1:58 AM Patrick Zhai <
>> zhai7...@gmail.com> wrote:
>> >>>>> >> >> >> >> >> >> >
>> >>>>> >> >> >> >> >> >> > I did a quick run with your patch, but since I
>> turned on the CMS as well as TieredMergePolicy I'm not sure how fair the
>> comparison is. Here's the result:
>> >>>>> >> >> >> >> >> >> > Candidate:
>> >>>>> >> >> >> >> >> >> > Indexer: indexing done (890209 msec); total
>> 2620 docs
>> >>>>> >> >> >> >> >> >> > Indexer: waitForMerges done (71622 msec)
>> >>>>> >> >> >> >> >> >> > Indexer: finished (961877 msec)
>> >>>>> >> >> >> >> >> >> > Baseline:
>> >>>>> >> >> >> >> >> >> > Indexer: indexing done (909706 msec); total
>> 2620 docs
>> >>>>> >> >> >> >> >> >> > Indexer: waitForMerges done (54775 msec)
>> >>>>> >> >> >> >> >> >> > Indexer: finished (964528 msec)
>> >>>>> >> >> >> >> >> >> >
>> >>>>> >> >> >> >> >> >> > For more accurate comparison I guess it's
>> better to use LogxxMergePolicy and turn off CMS? If you want to run it
>> yourself you can find the lines I quoted from the log file.
>> >>>>> >> >> >> >> >> >> >
>> >>>>> >> >> >> >> >> >> > Patrick
>> >>>>> >> >> >> >> >> >> >
>> >>>>> >> >> >> >> >> >> > On Mon, Apr 24, 2023 at 12:34 PM Thomas Dullien
>>  wrote:
>> >>>>> >> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >> >> >> Hey all,
>> >>>>> >> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >> >> >> I've been experimenting with fixing some
>> low-hanging performance fruit in the ElasticSearch codebase, and came
>> across the fact that the MurmurHash implementation that is used by
>> BytesRef.hashCode() is reading 4 bytes per loop iteration (which is likely
>> an artifact from 32-bit architectures, which are ever-less-important). I
>> made a small fix to change the implementation to read 8 bytes per loop
>> iteration; I expected a very small impact (2-3% CPU or so over an indexing
>> run in ElasticSearch), but got a pretty nontrivial throughput improvement
>> over a few indexing benchmarks.
>> >>>>> >> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >> >> >> I tried running Lucene-only benchmarks, and
>> succeeded in running the example from
>> https://github.com/mikemccand/luceneutil - but I couldn't figure out how
>> to run indexing benchmarks and how to interpret the results.
>> >>>>> >> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >> >> >> Could someone help me in running the
>> benchmarks for the attached patch?
>> >>>>> >> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >> >> >> Cheers,
>> >>>>> >> >> >> >> >> >> >> Thomas
>> >>>>> >> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >> >>
>> >>>>> >> >> >> >> >>
>> >>>>> >> >> >> >> >>
>> >>>>> >> >> >> >> >>
>> >>>
>> >>> 
>> >>>
>> >>>
>>
>>
>>

-- 
Marcus Eagan
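
For readers following the 4-byte versus 8-byte discussion in this thread, here is a rough sketch of the idea. It is not the attached patch (which is not reproduced here) and not Lucene's implementation: `Murmur3Sketch`, `hash4` and `hash8` are hypothetical names, and the choice to keep the hash value identical by splitting each 8-byte load into two 32-bit blocks is an assumption made for illustration.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

final class Murmur3Sketch {
  private static final VarHandle INT_LE =
      MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);
  private static final VarHandle LONG_LE =
      MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

  private static int mixK(int k1) {
    k1 *= 0xcc9e2d51;
    k1 = Integer.rotateLeft(k1, 15);
    return k1 * 0x1b873593;
  }

  /** Folds one 32-bit block into the running hash. */
  private static int mixH(int h1, int k1) {
    h1 ^= mixK(k1);
    h1 = Integer.rotateLeft(h1, 13);
    return h1 * 5 + 0xe6546b64;
  }

  /** Handles the 0-3 trailing bytes and applies the final avalanche. */
  private static int finish(byte[] data, int tailStart, int end, int len, int h1) {
    int k1 = 0;
    switch (end - tailStart) {
      case 3: k1 ^= (data[tailStart + 2] & 0xff) << 16; // fall through
      case 2: k1 ^= (data[tailStart + 1] & 0xff) << 8;  // fall through
      case 1: k1 ^= (data[tailStart] & 0xff);
              h1 ^= mixK(k1);
              break;
      default: break;
    }
    h1 ^= len;
    h1 ^= h1 >>> 16;
    h1 *= 0x85ebca6b;
    h1 ^= h1 >>> 13;
    h1 *= 0xc2b2ae35;
    h1 ^= h1 >>> 16;
    return h1;
  }

  /** Classic loop: one 4-byte load per iteration. */
  static int hash4(byte[] data, int offset, int len, int seed) {
    int h1 = seed;
    int end = offset + len;
    int i = offset;
    for (; i + 4 <= end; i += 4) {
      h1 = mixH(h1, (int) INT_LE.get(data, i));
    }
    return finish(data, i, end, len, h1);
  }

  /** Variant: one 8-byte load per iteration, mixed as two 32-bit blocks. */
  static int hash8(byte[] data, int offset, int len, int seed) {
    int h1 = seed;
    int end = offset + len;
    int i = offset;
    for (; i + 8 <= end; i += 8) {
      long v = (long) LONG_LE.get(data, i);
      h1 = mixH(h1, (int) v);          // low 4 bytes first (little-endian order)
      h1 = mixH(h1, (int) (v >>> 32)); // then high 4 bytes
    }
    if (i + 4 <= end) {                // at most one leftover 4-byte block
      h1 = mixH(h1, (int) INT_LE.get(data, i));
      i += 4;
    }
    return finish(data, i, end, len, h1);
  }
}
```

As Robert points out, inputs shorter than 8 bytes never enter the 8-byte loop, and the extra branch and larger method body are the costs to measure. Whether the wider load helps in practice depends on the term-length distribution, which is exactly what the benchmarks in this thread are trying to establish.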


Re: Welcome Chris Hegarty to the Lucene PMC

2023-06-19 Thread Marcus Eagan
Congratulations Chris!

On Mon, Jun 19, 2023 at 12:57 PM Nhat Nguyen 
wrote:

> Congratulations, Chris!
>
> On Mon, Jun 19, 2023 at 9:55 AM Mayya Sharipova
>  wrote:
>
>> Congratulations Chris!
>>
>> On Mon, Jun 19, 2023 at 10:50 AM Greg Miller  wrote:
>>
>>> Congrats! Welcome Chris!
>>>
>>> On Mon, Jun 19, 2023 at 5:02 AM Michael Sokolov 
>>> wrote:
>>>
>>>> Welcome Chris!
>>>>
>>>> On Mon, Jun 19, 2023, 7:31 AM Michael McCandless <
>>>> luc...@mikemccandless.com> wrote:
>>>>
>>>>> Welcome aboard Chris!
>>>>>
>>>>> Mike McCandless
>>>>>
>>>>> http://blog.mikemccandless.com
>>>>>
>>>>>
>>>>> On Mon, Jun 19, 2023 at 7:16 AM Ishan Chattopadhyaya <
>>>>> ichattopadhy...@gmail.com> wrote:
>>>>>
>>>>>> Congratulations Chris!
>>>>>>
>>>>>> On Mon, 19 Jun, 2023, 3:23 pm Adrien Grand, 
>>>>>> wrote:
>>>>>>
>>>>>>> I'm pleased to announce that Chris Hegarty has accepted an
>>>>>>> invitation to join the Lucene PMC!
>>>>>>>
>>>>>>> Congratulations Chris, and welcome aboard!
>>>>>>>
>>>>>>> --
>>>>>>> Adrien
>>>>>>>
>>>>>>

-- 
Marcus Eagan


Re: [VOTE] Dimension Limit for KNN Vectors

2023-05-16 Thread Marcus Eagan
e it configurable and move it to an appropriate place.
>>> In particular, a simple Integer.getInteger("lucene.hnsw.maxDimensions",
>>> 1024) should be enough.
>>> *Motivation*:
>>> Both are good and not mutually exclusive and could happen in any order.
>>> Someone suggested to perfect what the _default_ limit should be, but
>>> I've not seen an argument _against_ configurability.  Especially in this
>>> way -- a toggle that doesn't bind Lucene's APIs in any way.
>>>
>>> I'll keep this [VOTE] open for a week and then proceed to the
>>> implementation.
>>> --
>>> *Alessandro Benedetti*
>>> Director @ Sease Ltd.
>>> *Apache Lucene/Solr Committer*
>>> *Apache Solr PMC Member*
>>>
>>> e-mail: a.benede...@sease.io
>>>
>>>
>>> *Sease* - Information Retrieval Applied
>>> Consulting | Training | Open Source
>>>
>>> Website: Sease.io <http://sease.io/>
>>> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter
>>> <https://twitter.com/seaseltd> | Youtube
>>> <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github
>>> <https://github.com/seaseltd>
>>>
>> --
>> Uwe Schindler
>> Achterdiek 19, D-28357 Bremen
>> https://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>> --
Marcus Eagan
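
The configurable limit quoted in this vote, `Integer.getInteger("lucene.hnsw.maxDimensions", 1024)`, can be illustrated with plain JDK code. This is a minimal sketch under assumptions: the property name and default are the ones mentioned in the thread, while `MaxDimensionsSketch` and `checkDimension` are hypothetical and do not correspond to any confirmed Lucene API.

```java
final class MaxDimensionsSketch {
  // Property name and default as quoted in the thread; not a confirmed Lucene setting.
  static final String PROPERTY = "lucene.hnsw.maxDimensions";
  static final int DEFAULT_MAX = 1024;

  /** Reads the limit from a JVM system property, falling back to the default. */
  static int maxDimensions() {
    // Integer.getInteger returns the default when the property is unset or not a number.
    return Integer.getInteger(PROPERTY, DEFAULT_MAX);
  }

  /** Rejects vector dimensions outside (0, maxDimensions]. */
  static void checkDimension(int dimension) {
    int max = maxDimensions();
    if (dimension <= 0 || dimension > max) {
      throw new IllegalArgumentException(
          "vector dimension must be in (0, " + max + "], got " + dimension);
    }
  }

  public static void main(String[] args) {
    System.out.println("effective limit: " + maxDimensions());
    checkDimension(768);
  }
}
```

Starting the JVM with `-Dlucene.hnsw.maxDimensions=2048` would raise the limit in this sketch without any API change, which is the "toggle that doesn't bind Lucene's APIs" property argued for above.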


Re: Dimensions Limit for KNN vectors - Next Steps

2023-05-09 Thread Marcus Eagan
Over the past month, after a lot of work with Lucene, I've moved to Robert
Muir's camp.

*Proposed option*: We focus our efforts on improving the testing
infrastructure, stability, and performance of the feature as is before
introducing more complexity. It would benefit the community if someone took
the lead in cataloging all of these efforts in a common place where they can
be easily referenced and analyzed. If we go with this option, I or someone more
talented than me could lead that effort. After we have sufficient evidence,
we could reconsider bumping the limit with strong consensus.

*Motivation*: We are close to improving on many fronts. Given the
criticality of Lucene in computing infrastructure and the concerns raised
by one of the most active stewards of the project, I think we should keep
working toward improving the feature as is and move to up the limit after
we can demonstrate improvement unambiguously.

-1 (non-binding)

Marcus Eagan

On Tue, May 9, 2023 at 2:49 PM Michael Wechner 
wrote:

> +1
>
> Michael Wechner
>
> On 09.05.23 at 14:08, Alessandro Benedetti wrote:
>
>
> *Proposed option*: make the limit configurable
> *Motivation*:
> The system administrator can enforce a limit that users must respect, in
> line with whatever the admin has decided is acceptable for them.
> The default can stay the current one.
> This should open the doors for Apache Solr, Elasticsearch, OpenSearch, and
> any sort of plugin development.
> --
> *Alessandro Benedetti*
> Director @ Sease Ltd.
> *Apache Lucene/Solr Committer*
> *Apache Solr PMC Member*
>
> e-mail: a.benede...@sease.io
>
>
> *Sease* - Information Retrieval Applied
> Consulting | Training | Open Source
>
> Website: Sease.io <http://sease.io/>
> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter
> <https://twitter.com/seaseltd> | Youtube
> <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github
> <https://github.com/seaseltd>
>
>
> On Tue, 9 May 2023 at 13:07, Alessandro Benedetti 
> wrote:
>
>> We had a very long-running (and heated) thread about this (*[Proposal]
>> Remove max number of dimensions for KNN vectors*).
>> Without repeating any of it, I recommend we move this forward in this way:
>> *We stop any discussion* and everyone interested proposes an option with
>> a motivation, then we aggregate the options and create a Vote.
>>
>> *Please, DO NOT use this thread for anything other than your proposed
>> option.*
>> All e-mails in this thread should be structured:
>> *Proposed Option:*
>> *Motivation:*
>>
>> Let's keep this open for 1 week and then I'll aggregate the options and
>> set up the VOTE thread.
>> If you have anything else to add, please use the old thread.
>>
>> Cheers
>>
>> --
>> *Alessandro Benedetti*
>> Director @ Sease Ltd.
>> *Apache Lucene/Solr Committer*
>> *Apache Solr PMC Member*
>>
>> e-mail: a.benede...@sease.io
>>
>>
>> *Sease* - Information Retrieval Applied
>> Consulting | Training | Open Source
>>
>> Website: Sease.io <http://sease.io/>
>> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter
>> <https://twitter.com/seaseltd> | Youtube
>> <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github
>> <https://github.com/seaseltd>
>>
>
>

-- 
Marcus Eagan


Re: [Proposal] Remove max number of dimensions for KNN vectors

2023-04-08 Thread Marcus Eagan
hub.com/apache/lucene/pull/11905
> >>
> >> Attacking me isn't helping the situation.
> >>
> >> PS: when i said the "one guy who wrote the code" I didn't mean it in
> >> any kind of demeaning fashion really. I meant to describe the current
> >> state of usability with respect to indexing a few million docs with
> >> high dimensions. You can scroll up the thread and see that at least
> >> one other committer on the project experienced similar pain as me.
> >> Then, think about users who aren't committers trying to use the
> >> functionality!
> >>
> >> On Sat, Apr 8, 2023 at 12:51 PM Michael Sokolov 
> wrote:
> >> >
> >> > What you said about increasing dimensions requiring a bigger ram
> buffer on merge is wrong. That's the point I was trying to make. Your
> concerns about merge costs are not wrong, but your conclusion that we need
> to limit dimensions is not justified.
> >> >
> >> > You complain that hnsw sucks it doesn't scale, but when I show it
> scales linearly with dimension you just ignore that and complain about
> something entirely different.
> >> >
> >> > You demand that people run all kinds of tests to prove you wrong but
> when they do, you don't listen and you won't put in the work yourself or
> complain that it's too hard.
> >> >
> >> > Then you complain about people not meeting you half way. Wow
> >> >
> >> > On Sat, Apr 8, 2023, 12:40 PM Robert Muir  wrote:
> >> >>
> >> >> On Sat, Apr 8, 2023 at 8:33 AM Michael Wechner
> >> >>  wrote:
> >> >> >
> >> >> > What exactly do you consider reasonable?
> >> >>
> >> >> Let's begin a real discussion by being HONEST about the current
> >> >> status. Please put politically correct or your own company's wishes
> >> >> aside, we know it's not in a good state.
> >> >>
> >> >> Current status is the one guy who wrote the code can set a
> >> >> multi-gigabyte ram buffer and index a small dataset with 1024
> >> >> dimensions in HOURS (i didn't ask what hardware).
> >> >>
> >> >> My concerns are everyone else except the one guy, I want it to be
> >> >> usable. Increasing dimensions just means even bigger multi-gigabyte
> >> >> ram buffer and bigger heap to avoid OOM on merge.
> >> >> It is also a permanent backwards compatibility decision, we have to
> >> >> support it once we do this and we can't just say "oops" and flip it
> >> >> back.
> >> >>
> >> >> It is unclear to me, if the multi-gigabyte ram buffer is really to
> >> >> avoid merges because they are so slow and it would be DAYS otherwise,
> >> >> or if its to avoid merges so it doesn't hit OOM.
> >> >> Also from personal experience, it takes trial and error (means
> >> >> experiencing OOM on merge!!!) before you get those heap values
> correct
> >> >> for your dataset. This usually means starting over which is
> >> >> frustrating and wastes more time.
> >> >>
> >> >> Jim mentioned some ideas about the memory usage in IndexWriter, seems
> >> >> to me like its a good idea. maybe the multigigabyte ram buffer can be
> >> >> avoided in this way and performance improved by writing bigger
> >> >> segments with lucene's defaults. But this doesn't mean we can simply
> >> >> ignore the horrors of what happens on merge. merging needs to scale
> so
> >> >> that indexing really scales.
> >> >>
> >> >> At least it shouldnt spike RAM on trivial data amounts and cause OOM,
> >> >> and definitely it shouldnt burn hours and hours of CPU in O(n^2)
> >> >> fashion when indexing.
> >> >>
> >> >>
> >>
> >>
>
>
> --
> Adrien
>
>
>

-- 
Marcus Eagan


Re: [Proposal] Remove max number of dimensions for KNN vectors

2023-04-07 Thread Marcus Eagan
That is an important data point, and it seems neither especially bad nor
especially good. Shouldn't the user decide what performance is acceptable?
What do you all think?

On Fri, Apr 7, 2023 at 8:20 AM Michael Sokolov  wrote:

> one more data point:
>
> 32M 100dim (fp32) vectors indexed in 1h20m (M=16, IW cache=1994, heap=4GB)
>
> On Fri, Apr 7, 2023 at 8:52 AM Michael Sokolov  wrote:
> >
> > I also want to add that we do impose some other limits on graph
> > construction to help ensure that HNSW-based vector fields remain
> > manageable; M is limited to <= 512, and maximum segment size also
> > helps limit merge costs
> >
> > On Fri, Apr 7, 2023 at 7:45 AM Michael Sokolov 
> wrote:
> > >
> > > Thanks Kent - I tried something similar to what you did I think. Took
> > > a set of 256d vectors I had and concatenated them to make bigger ones,
> > > then shifted the dimensions to make more of them. Here are a few
> > > single-threaded indexing test runs. I ran all tests with M=16.
> > >
> > >
> > > 8M 100d float vectors indexed in 20 minutes (16G heap, IndexWriter
> > > buffer size=1994)
> > > 8M 1024d float vectors indexed in 1h48m (16G heap, IW buffer size=1994)
> > > 4M 2048d float vectors indexed in 1h44m (w/ 4G heap, IW buffer
> size=1994)
> > >
> > > increasing the vector dimension makes things take longer (scaling
> > > *linearly*) but doesn't lead to RAM issues. I think we could get to
> > > OOM while merging with a small heap and a large number of vectors, or
> > > by increasing M, but none of this has anything to do with vector
> > > dimensions. Also, if merge RAM usage is a problem I think we could
> > > address it by adding accounting to the merge process and simply not
> > > merging graphs when they exceed the buffer size (as we do with
> > > flushing).
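
The three timings above can be normalized to indexing time per vector component, which is one way to check the "scaling linearly" claim. This is a back-of-the-envelope sketch, not a benchmark: `IndexingCostSketch` is a hypothetical helper, "8M" and "4M" are assumed to mean 8,000,000 and 4,000,000 vectors, and the runs used different heap sizes, so the comparison is only indicative.

```java
final class IndexingCostSketch {

  /** Microseconds of wall-clock indexing time per vector component. */
  static double microsPerComponent(long vectors, int dims, long minutes) {
    double micros = minutes * 60.0 * 1_000_000.0;
    return micros / ((double) vectors * dims);
  }

  public static void main(String[] args) {
    // Figures quoted in the message above (1h48m = 108 min, 1h44m = 104 min).
    System.out.printf("8M x 100d : %.3f us/component%n", microsPerComponent(8_000_000L, 100, 20));
    System.out.printf("8M x 1024d: %.3f us/component%n", microsPerComponent(8_000_000L, 1024, 108));
    System.out.printf("4M x 2048d: %.3f us/component%n", microsPerComponent(4_000_000L, 2048, 104));
  }
}
```

Under these assumptions the 100d run costs about 1.5 microseconds per component and the 1024d and 2048d runs about 0.8 each, so per-component cost does not grow with dimension in these runs, consistent with total time growing at most linearly.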
> > >
> > > Robert, since you're the only on-the-record veto here, does this
> > > change your thinking at all, or if not could you share some test
> > > results that didn't go the way you expected? Maybe we can find some
> > > mitigation if we focus on a specific issue.
> > >
> > > On Fri, Apr 7, 2023 at 5:18 AM Kent Fitch 
> wrote:
> > > >
> > > > Hi,
> > > > I have been testing Lucene with a custom vector similarity and
> loaded 192m vectors of dim 512 bytes. (Yes, segment merges use a lot of
> java memory..).
> > > >
> > > > As this was a performance test, the 192m vectors were derived by
> dithering 47k original vectors in such a way to allow realistic ANN
> evaluation of HNSW.  The original 47k vectors were generated by ada-002 on
> source newspaper article text.  After dithering, I used PQ to reduce their
> dimensionality from 1536 floats to 512 bytes - 3 source dimensions to a
> 1-byte code, 512 code tables, each learnt to reduce total encoding error
> using Lloyd's algorithm (hence the need for the custom similarity). BTW,
> HNSW retrieval was accurate and fast enough for the use case I was
> investigating as long as a machine with 128gb memory was available as the
> graph needs to be cached in memory for reasonable query rates.
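
The encoding step described above (1536 floats split into 512 groups of 3 dimensions, each group replaced by a 1-byte index into its own code table) is product quantization. The sketch below shows only the encoding half and is an illustration under assumptions, not Kent's code: `ProductQuantizerSketch` is a hypothetical name, the code tables are taken as given (in the message they were learnt with Lloyd's algorithm), and nearest-centroid search uses squared Euclidean distance.

```java
final class ProductQuantizerSketch {

  /**
   * Encodes a vector as one byte per subspace.
   * codebooks[s][c] is centroid c of subspace s (at most 256 centroids, each subDim floats).
   */
  static byte[] encode(float[] vector, float[][][] codebooks) {
    int subspaces = codebooks.length;
    if (subspaces == 0 || vector.length % subspaces != 0) {
      throw new IllegalArgumentException("vector length must be a multiple of the subspace count");
    }
    int subDim = vector.length / subspaces;
    byte[] codes = new byte[subspaces];
    for (int s = 0; s < subspaces; s++) {
      int best = 0;
      float bestDist = Float.POSITIVE_INFINITY;
      for (int c = 0; c < codebooks[s].length; c++) {
        float dist = 0f;
        for (int d = 0; d < subDim; d++) {
          float diff = vector[s * subDim + d] - codebooks[s][c][d];
          dist += diff * diff;
        }
        if (dist < bestDist) {
          bestDist = dist;
          best = c;
        }
      }
      // Stored as a signed byte; read back with (codes[s] & 0xff) to get 0..255.
      codes[s] = (byte) best;
    }
    return codes;
  }
}
```

With 512 subspaces of 3 dimensions, a 1536-float vector (6144 bytes as fp32) becomes 512 bytes. Distances between codes then have to be computed through the code tables, which is why a custom similarity was needed.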
> > > >
> > > > Anyway, if you want them, you are welcome to those 47k vectors of
> 1536 floats which can be readily dithered to generate very large and
> realistic test vector sets.
> > > >
> > > > Best regards,
> > > >
> > > > Kent Fitch
> > > >
> > > >
> > > > On Fri, 7 Apr 2023, 6:53 pm Michael Wechner, <
> michael.wech...@wyona.com> wrote:
> > > >>
> > > >> you might want to use SentenceBERT to generate vectors
> > > >>
> > > >> https://sbert.net
> > > >>
> > > >> whereas for example the model "all-mpnet-base-v2" generates vectors
> with dimension 768
> > > >>
> > > >> We have SentenceBERT running as a web service, which we could open
> for these tests, but because of network latency it should be faster running
> locally.
> > > >>
> > > >> HTH
> > > >>
> > > >> Michael
> > > >>
> > > >>
> > > > >> On 07.04.23 at 10:11, Marcus Eagan wrote:
> > > >>
> > > >> I've started to look on the internet, and surely someone will come,
> but the challenge I suspect is that these vectors are expensive to generate
> so people have not gone all in on generating such large vectors for large
> datasets. They certainly have not made them easy to find. Here i

Re: [Proposal] Remove max number of dimensions for KNN vectors

2023-04-07 Thread Marcus Eagan
ve we need to remove the limit or size it in a way it's not a problem
> for both users and internal data structure optimizations, if any.
> >>>>
> >>>>
> >>>> On Wed, 5 Apr 2023, 18:54 Robert Muir,  wrote:
> >>>>> I'd ask anyone voting +1 to raise this limit to at least try to index
> >>>>> a few million vectors with 756 or 1024, which is allowed today.
> >>>>>
> >>>>> IMO based on how painful it is, it seems the limit is already too
> >>>>> high, I realize that will sound controversial but please at least try
> >>>>> it out!
> >>>>>
> >>>>> voting +1 without at least doing this is really the
> >>>>> "weak/unscientifically minded" approach.
> >>>>>
> >>>>> On Wed, Apr 5, 2023 at 12:52 PM Michael Wechner
> >>>>>  wrote:
> >>>>>> Thanks for your feedback!
> >>>>>>
> >>>>>> I agree, that it should not crash.
> >>>>>>
> >>>>>> So far we did not experience crashes ourselves, but we did not index
> >>>>>> millions of vectors.
> >>>>>>
> >>>>>> I will try to reproduce the crash, maybe this will help us to move
> forward.
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>> Michael
> >>>>>>
> >>>>>> On 05.04.23 at 18:30, Dawid Weiss wrote:
> >>>>>>>> Can you describe your crash in more detail?
> >>>>>>> I can't. That experiment was a while ago and a quick test to see
> if I
> >>>>>>> could index rather large-ish USPTO (patent office) data as vectors.
> >>>>>>> Couldn't do it then.
> >>>>>>>
> >>>>>>>> How much RAM?
> >>>>>>> My indexing jobs run with rather smallish heaps to give space for
> I/O
> >>>>>>> buffers. Think 4-8GB at most. So yes, it could have been the
> problem.
> >>>>>>> I recall segment merging grew slower and slower and then simply
> >>>>>>> crashed. Lucene should work with low heap requirements, even if it
> >>>>>>> slows down. Throwing ram at the indexing/ segment merging problem
> >>>>>>> is... I don't know - not elegant?
> >>>>>>>
> >>>>>>> Anyway. My main point was to remind folks about how Apache works -
> >>>>>>> code is merged in when there are no vetoes. If Rob (or anybody
> else)
> >>>>>>> remains unconvinced, he or she can block the change. (I didn't
> invent
> >>>>>>> those rules).
> >>>>>>>
> >>>>>>> D.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> >>
> >
>
>
>
>

-- 
Marcus Eagan


Re: [Proposal] Remove max number of dimensions for KNN vectors

2023-04-06 Thread Marcus Eagan
>>> > As I said earlier, a max limit limits usability.
>>> > It's not forcing users with small vectors to pay the performance
>>> penalty of big vectors, it's literally preventing some users to use
>>> Lucene/Solr/Elasticsearch at all.
>>> > As far as I know, the max limit is used to raise an exception, it's
>>> not used to initialise or optimise data structures (please correct me if
>>> I'm wrong).
>>> >
>>> > Improving the algorithm performance is a separate discussion.
>>> > I don't see a correlation with the fact that indexing billions of
>>> whatever dimensioned vector is slow with a usability parameter.
>>> >
>>> > What about potential users that need few high dimensional vectors?
>>> >
>>> > As I said before, I am a big +1 for NOT just raise it blindly, but I
>>> believe we need to remove the limit or size it in a way it's not a problem
>>> for both users and internal data structure optimizations, if any.
>>> >
>>> >
>>> > On Wed, 5 Apr 2023, 18:54 Robert Muir,  wrote:
>>> >>
>>> >> I'd ask anyone voting +1 to raise this limit to at least try to index
>>> >> a few million vectors with 756 or 1024, which is allowed today.
>>> >>
>>> >> IMO based on how painful it is, it seems the limit is already too
>>> >> high, I realize that will sound controversial but please at least try
>>> >> it out!
>>> >>
>>> >> voting +1 without at least doing this is really the
>>> >> "weak/unscientifically minded" approach.
>>> >>
>>> >> On Wed, Apr 5, 2023 at 12:52 PM Michael Wechner
>>> >>  wrote:
>>> >> >
>>> >> > Thanks for your feedback!
>>> >> >
>>> >> > I agree, that it should not crash.
>>> >> >
>>> >> > So far we did not experience crashes ourselves, but we did not index
>>> >> > millions of vectors.
>>> >> >
>>> >> > I will try to reproduce the crash, maybe this will help us to move
>>> forward.
>>> >> >
>>> >> > Thanks
>>> >> >
>>> >> > Michael
>>> >> >
>>> >> > Am 05.04.23 um 18:30 schrieb Dawid Weiss:
>>> >> > >> Can you describe your crash in more detail?
>>> >> > > I can't. That experiment was a while ago and a quick test to see
>>> if I
>>> >> > > could index rather large-ish USPTO (patent office) data as
>>> vectors.
>>> >> > > Couldn't do it then.
>>> >> > >
>>> >> > >> How much RAM?
>>> >> > > My indexing jobs run with rather smallish heaps to give space for
>>> I/O
>>> >> > > buffers. Think 4-8GB at most. So yes, it could have been the
>>> problem.
>>> >> > > I recall segment merging grew slower and slower and then simply
>>> >> > > crashed. Lucene should work with low heap requirements, even if it
>>> >> > > slows down. Throwing ram at the indexing/ segment merging problem
>>> >> > > is... I don't know - not elegant?
>>> >> > >
>>> >> > > Anyway. My main point was to remind folks about how Apache works -
>>> >> > > code is merged in when there are no vetoes. If Rob (or anybody
>>> else)
>>> >> > > remains unconvinced, he or she can block the change. (I didn't
>>> invent
>>> >> > > those rules).
>>> >> > >
>>> >> > > D.
>>> >> > >
>>> >> > >
>>> -
>>> >> > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> >> > > For additional commands, e-mail: dev-h...@lucene.apache.org
>>> >> > >

-- 
Marcus Eagan


Raising the Value of MAX_DIMENSIONS of Vector Values

2022-08-07 Thread Marcus Eagan
Hi Lucene Team,

In general, I have advised very strongly against our team at MongoDB
modifying the Lucene source, except in scenarios where we have strong needs
for a particular customization. Ultimately, people can do what they would
like to do.

That being said, we have a number of customers preparing to use Lucene for
dense vector search. There are many language models that are optimized for
more than 1024 dimensions. I remember Michael Wechner's email
<https://www.mail-archive.com/dev@lucene.apache.org/msg314281.html> about
one instance with OpenAI:

> I just tried to test the OpenAI model
> "text-similarity-davinci-001" with 12288 dimensions


It seems that customers who attempt to use these models should not be
turned away. It could be sufficient to explain the issues. The only ones I
have identified are the expected ones: very slow indexing throughput,
high CPU usage, and a perhaps less well-defined risk of more numerical errors.

I opened an issue <https://github.com/apache/lucene/issues/1060> and PR
<https://github.com/apache/lucene/pull/1061> for the discussion as well. I
would appreciate guidance on where we think the warning should go. I feel
like burying it in a Javadoc is a less than ideal experience. It would be
better to have a warning on startup. In the PR, I increased the max limit by
a factor of twenty. We should let users use the system based on their needs
even if it was not designed or optimized for the models they bring, because we
need the feedback and the data from the world.

Is there something I'm overlooking from a risk standpoint?
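
To make the discussion concrete, here is a minimal sketch of a limit that is
only used to reject oversized vectors, which is how the limit is described
elsewhere in this thread. It is not Lucene's code: the class name, method
names, and constants are hypothetical. The value 1024 is the limit discussed
here, and 20 * 1024 reflects the "factor of twenty" in the PR.

```java
import java.util.Objects;

// Hypothetical, self-contained sketch; none of these names come from Lucene.
final class VectorDimensionCheck {
    // Assumption: 1024 is the limit referred to in this thread.
    static final int CURRENT_MAX_DIMENSIONS = 1024;
    // Assumption: the PR raises the limit by a factor of twenty.
    static final int PROPOSED_MAX_DIMENSIONS = 20 * CURRENT_MAX_DIMENSIONS;

    /** Throws if the vector is empty or longer than maxDimensions; the limit has no other use. */
    static void checkDimensions(float[] vector, int maxDimensions) {
        Objects.requireNonNull(vector, "vector");
        if (vector.length == 0) {
            throw new IllegalArgumentException("vector must have at least one dimension");
        }
        if (vector.length > maxDimensions) {
            throw new IllegalArgumentException(
                "vector has " + vector.length + " dimensions; the limit is " + maxDimensions);
        }
    }

    /** Convenience wrapper that reports acceptance as a boolean instead of throwing. */
    static boolean isAccepted(int dimensions, int maxDimensions) {
        try {
            checkDimensions(new float[dimensions], maxDimensions);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAccepted(768, CURRENT_MAX_DIMENSIONS));     // true
        System.out.println(isAccepted(12288, CURRENT_MAX_DIMENSIONS));   // false
        System.out.println(isAccepted(12288, PROPOSED_MAX_DIMENSIONS));  // true
    }
}
```

Under these assumptions a 12288-dimension vector, like the OpenAI example
above, is rejected at the current limit and accepted at the proposed one. The
sketch says nothing about indexing cost, which is the separate concern raised
in the thread.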

Best,
-- 
Marcus Eagan


Re: [HELP] Link your Apache Lucene Jira and GitHub account ids before Thursday August 4 midnight (in your local time)

2022-07-31 Thread Marcus Eagan
marcussorealheis, marcussorealheis, Marcus Eagan
On Sun, Jul 31, 2022 at 7:39 AM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Thanks, added here:
>
> https://github.com/apache/lucene-jira-archive/commit/d91534e67b35004f212100d73008283327f2f1e7
>
> Please confirm it's right ;)
>
> Mike
>
> On Sun, Jul 31, 2022 at 7:27 AM 翁剑平  wrote:
>
>> Hi, could you help to add my info, thanks a lot
>> jira full name: jianping weng
>> github id: wjp719
>>
>> the jira issue I create before:
>> https://issues.apache.org/jira/browse/LUCENE-10425
>> the github pr I submit before: https://github.com/apache/lucene/pull/780
>>
>>
>> Best Regards,
>> jianping weng
>>
>>
>>
>> Michael McCandless  于2022年7月31日周日 18:08写道:
>>
>>> Hello Lucene users, contributors and developers,
>>>
>>> If you have used Lucene's Jira and you have a GitHub account as well,
>>> please check whether your user id mapping is in this file:
>>> https://github.com/apache/lucene-jira-archive/blob/main/migration/mappings-data/account-map.csv.20220722.verified
>>>
>>> If not, please reply to this email and we will try to add you.
>>>
>>> Please forward this email to anyone you know might be impacted and who
>>> might not be tracking the Lucene lists.
>>>
>>>
>>> Full details:
>>>
>>> The Lucene project will soon migrate from Jira to GitHub for issue
>>> tracking.
>>>
>>> There have been discussions, votes, a migration tool created / iterated
>>> (thanks to Tomoko Uchida's incredibly hard work), all iterating on Lucene's
>>> dev list.
>>>
>>> When we run the migration, we would like to map Jira users to the right
>>> GitHub users to properly @-mention the right person and make it easier for
>>> you to find issues you have engaged with.
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>> --
> Mike McCandless
>
> http://blog.mikemccandless.com
>
-- 
Marcus Eagan


Re: [DISCUSS] A proposal for migration to GitHub issue (LUCENE-10557)

2022-05-10 Thread Marcus Eagan
I recommend people take a look at the now deprecated helm project. It was
very difficult to land PRs because they had so much governance and
automation. For a data store as mature as SOLR, I would suggest it is
needed.

Many issues are worth a read: https://github.com/helm/helm

On Tue, May 10, 2022 at 10:16 AM Gus Heck  wrote:

>
>
> On Tue, May 10, 2022 at 10:40 AM Houston Putman 
> wrote:
>
>>
>>>
>> Most modern open source projects use Github Issues for their issue
>> tracking, so it's definitely doable, and really what new
>> users/contributors will be expecting. Also I see that much discussion is
>> already done on PRs, and JIRAs are mainly there just for
>> bureaucratic purposes. So I think it would be a wonderful direction to go
>> in.
>>
>>
> On that note, in many such projects I find it more difficult to get clarity
> on whether or not I'm affected by the issue, or in what version it was
> resolved. Usually it can be achieved by clicking on the referenced commit,
> and then inspecting what tags are on that commit, but it's several clicks
> and a minute or two vs just looking at the field in Jira...
>
> This can be made easier by using milestones as seen here (random example,
> used gradle because it's a very large, healthy project):
> https://github.com/gradle/gradle/issues/20182
>
> But I've seen a lot of projects that don't do that... which probably
> colors my view a bit.
>
> -Gus
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


-- 
Marcus Eagan


Re: [DISCUSS] A proposal for migration to GitHub issue (LUCENE-10557)

2022-05-08 Thread Marcus Eagan
fected version and the fixed
>>>version.
>>>- While one can create arbitrary labels, they are not segregated
>>>into fields so we would have to put up with what is effectively a single
>>>field for priority, component, and resolution
>>>- No way to enforce that a resolution label is applied to the issue.
>>>
>>> I feel that Github issues are simply lacking in depth and riding along
>>> on the virtue of their integrations. I feel like their issue tracking
>>> implementation is a lower priority sideline to their code repository (so
>>> they can say they have it).  On the flip side Jira has become hugely
>>> entrenched in the industry and its profits are no-longer tied to innovation
>>> or even usability... and it shows. So I am basically dissatisfied with both
>>> (for most of the last 20 years I've been known to mutter about writing my
>>> own issue tracker... I've not met one that didn't irritate me somehow,
>>> though I haven't actually tried yet, because that's a LOT of work ;) ).
>>>
>>> Given how many things I've listed, it's likely something's wrong or
>>> misapprehended, so certainly speak up if I'm just unaware of something in
>>> github.
>>>
>>> In the end I don't feel like the benefits of github are worth giving up
>>> the data quality and features that we do actually use in Jira, so I'm -0.9
>>> on moving to github.
>>>
>>> -Gus
>>>
>>>
>>> On Fri, May 6, 2022 at 11:11 AM Michael McCandless <
>>> luc...@mikemccandless.com> wrote:
>>>
>>>> On Thu, May 5, 2022 at 7:56 AM Robert Muir  wrote:
>>>>
>>>> As far as replies, in github I highlight the part of the thing i want
>>>>> to reply to, and press 'r' key on my keyboard. it quotes it and
>>>>> everything. Really convenient to me.
>>>>>
>>>>
>>>> Whoa, thank you!!  I had no idea GitHub has such extensive keyboard
>>>> shortcuts (just type ? to see them all).
>>>>
>>>> Mike McCandless
>>>>
>>>> http://blog.mikemccandless.com
>>>>
>>>
>>>
>>> --
>>> http://www.needhamsoftware.com (work)
>>> http://www.the111shift.com (play)
>>>
>>

-- 
Marcus Eagan


Re: Welcome Guo Feng as Lucene committer

2022-01-25 Thread Marcus Eagan
Congratulations Feng!

On Tue, Jan 25, 2022 at 10:51 AM Anshum Gupta 
wrote:

> Congratulations and welcome, Feng!
>
> On Tue, Jan 25, 2022 at 1:09 AM Adrien Grand  wrote:
>
>> I'm pleased to announce that Guo Feng has accepted the PMC's
>> invitation to become a committer.
>>
>> Feng, the tradition is that new committers introduce themselves with a
>> brief bio.
>>
>> Congratulations and welcome!
>>
>> --
>> Adrien
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>
> --
> Anshum Gupta
>
-- 
Marcus Eagan


Re: The Retired Open Relevance Project

2022-01-21 Thread Marcus Eagan
Makes a lot of sense. I have only seen Anserini in passing. It looks great.

I will dig in. Thanks Robert!

On Thu, Jan 20, 2022 at 9:18 PM Robert Muir  wrote:

> On Thu, Jan 20, 2022 at 9:57 PM Marcus Eagan 
> wrote:
> >
> > Hi everyone,
> >
> > Is there anyone here that has any information on what happened with the
> Open Relevance Project? I saw that there was a vote and it was discontinued
> here. I could not track down the vote email thread, perhaps because it was
> a PMC email.
> >
> > I feel something like it could bring a few different developer personas
> to both Apache Lucene and Apache SOLR, and probably some interest from some
> past and future corporate sponsors. I'd love to better understand what went
> wrong and what it would take to reboot it.
> >
> > Thank you all for your contributions,
> >
> > Marcus Eagan
> >
>
> Hi Marcus, I think at the time, not many people in academia were using
> lucene for these kinds of experiments. Also, not many people working
> on lucene had basic necessary things such as access to the relevant
> datasets that you need to run these experiments. I don't think we had
> the adequate time to invest in it anyway, probably because we are
> developers and not researchers. So the project didn't really make much
> progress.
>
> I think these days, the anserini toolkit fills the missing gaps
> perfectly: https://github.com/castorini/anserini
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-- 
Marcus Eagan


The Retired Open Relevance Project

2022-01-20 Thread Marcus Eagan
Hi everyone,

Is there anyone here that has any information on what happened with the
Open Relevance Project? I saw that there was a vote and it was discontinued
here.
<https://lucene.apache.org/openrelevance/#:~:text=The%20Open%20Relevance%20Project%20(ORP,(NLP)%20into%20open%20source>
I could not track down the vote email thread, perhaps because it was a PMC
email.

I feel something like it could bring a few different developer personas to
both Apache Lucene and Apache SOLR, and probably some interest from some
past and future corporate sponsors. I'd love to better understand what went
wrong and what it would take to reboot it.

Thank you all for your contributions,

Marcus Eagan


Re: Revisiting Standardized Test Names in Solr

2021-06-02 Thread Marcus Eagan
Gerlowski 
>> wrote:
>>
>>
>> I'm fine with standardization, whichever convention we choose.  I have
>> a slight preference for FooTest, for the same reason Gus mentioned,
>> but any standard is better than none here IMO.
>>
>> prefer that we not make a sweeping change like this until after Mark's
>> "ref branch" is reconciled
>>
>>
>> Personally I disagree about the need to wait.  It'd be one thing if
>> there was an agreed-upon plan or a timeframe for merging "ref-branch".
>> But since that's not the case today, I don't think it makes sense to
>> ignore concrete/mergeable improvements.  It seems like a "bird in the
>> hand vs two in the bush" situation.  Especially when there are
>> strategies for handling the conflicts that might arise with Mark's
>> "ref-branch" (e.g. do the test renames on both master and ref_impl).
>>
>> Jason
>>
>> On Sun, Feb 21, 2021 at 12:44 PM David Smiley  wrote:
>>
>>
>> I look forward to a standardization on *something* but would prefer that
>> we not make a sweeping change like this until after Mark's "ref branch" is
>> reconciled.  I don't want that to hang over the project indefinitely, but
>> we can wait; we've not had this standardization yet for many years, after
>> all.
>>
>> That said, it would be good to choose the standard name now so that there
>> is less to change later.  Can someone dig up the statistics on Solr's name
>> choice to see if there is a clear winner (e.g. >60%)?  I don't have a
>> strong opinion on whatever the standard should be so long as there is a
>> standard :-)
>>
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Sun, Feb 21, 2021 at 12:18 PM Gus Heck  wrote:
>>
>>
>> FWIW, I'm not really in favor of the convention Lucene adopted. I
>> probably lost track of the debate and failed to object which is on me, but
>> I guess it was because that was the lower number of changes there? It's
>> certainly much less legible in the IDE to have a wall of classes all
>> starting with T. Maybe given that the projects are splitting Solr can Stick
>> with FooTest not TestFoo? I think *Test suffix is more common in Solr...
>> (though I haven't attempted to quantify it)
>>
>> On Sun, Feb 21, 2021 at 12:05 PM Eric Pugh <
>> ep...@opensourceconnections.com> wrote:
>>
>>
>> Makes sense to me.
>>
>>
>> On Feb 20, 2021, at 2:42 PM, Marcus Eagan  wrote:
>>
>> Hi all,
>>
>> Now that Lucene’s standardization is complete and I believe enforced,
>> should we discuss if we could bring the same consistency to Solr?
>>
>> Best,
>>
>> Marcus
>> --
>> Marcus Eagan
>>
>>
>> ___
>> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 |
>> http://www.opensourceconnections.com | My Free/Busy
>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>> This e-mail and all contents, including attachments, is considered to be
>> Company Confidential unless explicitly stated otherwise, regardless of
>> whether attachments are marked as such.
>>
>>
>>
>> --
>> http://www.needhamsoftware.com (work)
>> http://www.the111shift.com (play)
>>
>>
>> -----
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> 
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> 
>>
>> --
>> - Mark
>>
>> http://about.me/markrmiller
>>
>>
>>
>> ___
>> *Eric Pugh **| *Founder & CEO | OpenSource Connections, LLC | 434.466.1467
>> | http://www.opensourceconnections.com | My Free/Busy
>> <http://tinyurl.com/eric-cal>
>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>> This e-mail and all contents, including attachments, is considered to be
>> Company Confidential unless explicitly stated otherwise, regardless
>> of whether attachments are marked as such.
>>
>>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


-- 
Marcus Eagan


Re: A proposition: Make Luke a standalone package (again)

2021-05-29 Thread Marcus Eagan
> > > >>>>>  running it
> > > >>>>>
> > > >>>>>  Cons:
> > > >>>>>  - Duplication of many jars (analyzers, queries, codec, etc.)
> > > >>>>>
> > > >>>>>  I am sure it makes sense for long-term Luke users who used to
> just
> > > >>>>>  download Luke from the original or forked sites - but let me
> know if
> > > >>>>>  there is anyone who has thoughts (e.g. from the artifact
> maintainers'
> > > >>>>>  perspective) on it.
> > > >>>>>  If there is no objection/concern, I will explore what changes
> are
> > > >>>>>  required to do so on LUCENE-9978.
> > > >>>>>
> > > >>>>>  Final note: It doesn't affect ongoing 9.0 release. With the
> assemble
> > > >>>>>  task, Luke works just fine as before.
> > > >>>>>
> > > >>>>>  Thanks,
> > > >>>>>  Tomoko
> > > >>>>> 
> > > >>>>>  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > > >>>>>  For additional commands, e-mail: dev-h...@lucene.apache.org
> > > >>>>>
> > > >
> > > > --
> > > > Uwe Schindler
> > > > Achterdiek 19, 28357 Bremen
> > > > https://www.thetaphi.de
> --
Marcus Eagan


Re: Welcome Peter Gromov as Lucene committer

2021-04-09 Thread Marcus Eagan
Congrats Peter. Much deserved!

On Fri, Apr 9, 2021 at 2:59 PM Nhat Nguyen 
wrote:

> Welcome Peter!
>
> On Fri, Apr 9, 2021 at 1:41 PM Christine Poerschke (BLOOMBERG/ LONDON) <
> cpoersc...@bloomberg.net> wrote:
>
>> Welcome Peter!
>>
>> From: dev@lucene.apache.org At: 04/07/21 08:12:00
>> To: dev@lucene.apache.org
>> Subject: Re: Welcome Peter Gromov as Lucene committer
>>
>> Thanks for the honor!
>>
>> (BTW I'm still not recognized by Github as having write access, and can't
>> merge my pull requests :))
>>
>> > Peter, the tradition is that new committers introduce themselves with a
>> brief bio.
>>
>> Okay, time for some bragging :) I've been working at JetBrains for some
>> 17 years, most of them on IntelliJ platform
>> <https://www.jetbrains.com/idea/>, mainly supporting various languages
>> and their infrastructure, analyzing snapshots and improving performance.
>> Aiming to catch more bugs before they hit production, I've introduced
>> property-based testing to IntelliJ by creating a small library called
>> jetCheck <https://github.com/JetBrains/jetCheck>. Recently I've switched
>> to the Grazie <https://plugins.jetbrains.com/plugin/12175-grazie>
>> project and now I do some rule-based computational linguistics there and
>> enhance the IDE support for English. As Grazie needs LanguageTool
>> <https://languagetool.org/> and Hunspell, I've also spent some time
>> rewriting the latter in Java (here in Lucene), and optimizing them both. In
>> my free time, I like mountain hiking (Munich/Germany is a great location
>> for that!), and some amateur piano/harmonica playing/singing
>> <https://www.youtube.com/user/gromopetr/videos>.
>>
>>>
>>

-- 
Marcus Eagan


Re: Congratulations to the new Lucene PMC Chair, Michael Sokolov!

2021-02-20 Thread Marcus Eagan
Awesome. Congratulations Mike!

You truly have pushed open source search into a new era with LUCENE-9004
and so many of your efforts to steward our community. You have inspired me
and others, I’m sure, to think about more innovative ways to contribute for
the future.

- Marcus

On Sat, Feb 20, 2021 at 16:18 Namgyu Kim  wrote:

> Congratulations, Mike! :D
>
> On Thu, Feb 18, 2021 at 6:32 AM Anshum Gupta 
> wrote:
>
>> Every year, the Lucene PMC rotates the Lucene PMC chair and Apache Vice
>> President position.
>>
>> This year we nominated and elected Michael Sokolov as the Chair, a
>> decision that the board approved in its February 2021 meeting.
>>
>> Congratulations, Mike!
>>
>> --
>> Anshum Gupta
>>
> --
Marcus Eagan


Revisiting Standardized Test Names in Solr

2021-02-20 Thread Marcus Eagan
Hi all,

Now that Lucene’s standardization is complete and I believe enforced,
should we discuss if we could bring the same consistency to Solr?

Best,

Marcus
-- 
Marcus Eagan


Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!

2021-02-20 Thread Marcus Eagan
Awesome, way to go Jan!

- Marcus

On Sat, Feb 20, 2021 at 10:53 Lucky Sharma  wrote:

> Congratulations Jan
>
> Regards,
> Lucky Sharma
>
> On Sat, 20 Feb 2021 at 8:07 PM, Karl Wright  wrote:
>
>> Congratulations!
>> Karl
>>
>> On Sat, Feb 20, 2021 at 6:28 AM Uwe Schindler  wrote:
>>
>>> Congrats Jan!
>>>
>>>
>>>
>>> Uwe
>>>
>>>
>>>
>>> -
>>>
>>> Uwe Schindler
>>>
>>> Achterdiek 19, D-28357 Bremen
>>>
>>> https://www.thetaphi.de
>>>
>>> eMail: u...@thetaphi.de
>>>
>>>
>>>
>>> *From:* Anshum Gupta 
>>> *Sent:* Thursday, February 18, 2021 7:55 PM
>>> *To:* Lucene Dev ; solr-u...@lucene.apache.org
>>> *Subject:* Congratulations to the new Apache Solr PMC Chair, Jan
>>> Høydahl!
>>>
>>>
>>>
>>> Hi everyone,
>>>
>>>
>>>
>>> I’d like to inform everyone that the newly formed Apache Solr PMC
>>> nominated and elected Jan Høydahl for the position of the Solr PMC Chair
>>> and Vice President. This decision was approved by the board in its February
>>> 2021 meeting.
>>>
>>>
>>>
>>> Congratulations Jan!
>>>
>>>
>>>
>>> --
>>>
>>> Anshum Gupta
>>>
>> --
> Warm Regards,
>
> Lucky Sharma
> Contact No :+91 9821559918
>
-- 
Marcus Eagan


MongoDB Hosting an Apache Lucene/Solr Meetup

2021-02-09 Thread Marcus Eagan
Hello All (is there a better list?),

There's a cohort of Lucene hackers at MongoDB (and a couple ES and Solr
ones, too). In an effort to foster more collaboration with everyone here,
we at MongoDB would like to sponsor a meetup for the community and mail out
some MongoDB paraphernalia to a few presenters.

We don't wish to control the topic by any means. If anyone would like to
present something they are working on or have recently landed in the
project, we'd love to hear about it. We are also happy to present. There
have been several really impressive projects to land recently in Apache
Lucene and Solr, so there are a lot of options. The MongoDB team wants to
help with the maintenance and innovation happening in the project. It would
be good for them to know who they will be working with.

Anyway, if people would be up for a virtual get-together, let me know. If
so, I'll put out a few dates and meeting invites, and a few optional plans
for the broader community to connect with our team and vice versa.

Best,

-- 
Marcus Eagan


Re: Consider Removing the `@` Special Character from RegExp

2021-01-25 Thread Marcus Eagan
That's right. It's optional. I think we should remove it unless we have a
good reason to keep it. I just think that it's maddening and unnecessary.
Perhaps, I am the only one?
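
For anyone following along, here is a minimal sketch of the flag arithmetic
Gus describes below. Only the ANYSTRING value (0x0008) is taken from the
source quoted in this thread. The class name, the helper names, and the
ALL_OPTIONAL mask are hypothetical stand-ins so the example runs without
Lucene on the classpath; the real constants live in
org.apache.lucene.util.automaton.RegExp and may differ by version.

```java
// Hypothetical, self-contained sketch of masking out one optional syntax flag.
final class RegExpFlagsSketch {
    // Value quoted from the Lucene source in this thread.
    static final int ANYSTRING = 0x0008;
    // Assumption: a stand-in for "every optional syntax feature enabled".
    static final int ALL_OPTIONAL = 0xffff;

    /** Returns the given flags with the anystring (@) bit cleared. */
    static int withoutAnyString(int flags) {
        return flags & ~ANYSTRING;
    }

    /** Reports whether the anystring (@) bit is set. */
    static boolean anyStringEnabled(int flags) {
        return (flags & ANYSTRING) != 0;
    }

    public static void main(String[] args) {
        int flags = withoutAnyString(ALL_OPTIONAL);
        System.out.println(anyStringEnabled(ALL_OPTIONAL)); // true
        System.out.println(anyStringEnabled(flags));        // false
        // With Lucene available, the resulting int would be passed as the
        // syntax flags argument, e.g. new RegExp("user@example", flags).
    }
}
```

With the bit cleared, `@` should be treated as an ordinary character, as the
Javadoc suggests, so anyone who finds it surprising can already opt out
without a change to the default syntax.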

On Fri, Jan 22, 2021 at 7:54 AM Gus Heck  wrote:

> I think it's already an optional feature; if you construct the regexp with
> explicit syntax flags you can get an instance that won't consider '@'
> special. Haven't actually had a need to do that so I'm assuming it works as
> documented.
>
> /** Syntax flag, enables anystring (@). */
> public static final int ANYSTRING = 0x0008;
>
>
>
> On Thu, Jan 21, 2021 at 9:21 PM Marcus Eagan 
> wrote:
>
>> Hi All,
>>
>> In looking at the Javadocs, our Lucene team noticed that the `@` symbol
>> is a reserved character in the Lucene regular expression syntax.
>>
>> In revisiting the page out of curiosity, I found that the symbol was
>> [Optional] for "any string." This came as a surprise because there's a very
>> common way to achieve "any string" in `.*`. Is there any compelling reason
>> to preserve this tiny vector of complexity? I suspect there may be some
>> differences in the constructions of the finite automata produced by `.*`
>> and `@` but I am not sure.
>>
>> If insignificant or non-existent, I suggest we remove `@` from the
>> regular expression syntax.
>>
>> --
>> Marcus Eagan
>>
>>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


-- 
Marcus Eagan


Is it Time to Deprecate the Legacy Facets API

2021-01-21 Thread Marcus Eagan
Hi all,

Sorry to spam the list. I am querying the list in such quick succession
because of a realization I came to while on Twitter. Is it time to
deprecate the Legacy Facet API?

I understood in the past that they behaved slightly differently. Now, I'm
wondering if it makes sense to keep the legacy facets package as it adds a
burden of maintenance to the project. If some activists really want it, I
will abandon the effort. If the interest is very light, I suppose they can
package it up in a plugin. In fact, I would help if they run into trouble
and I am able to help.

Anyway, let me know what you think. If it's a good idea, I will head over
to the chopping block.

-- 
Marcus Eagan


Consider Removing the `@` Special Character from RegExp

2021-01-21 Thread Marcus Eagan
Hi All,

In looking at the Javadocs, our Lucene team noticed that the `@` symbol is
a reserved character in the Lucene regular expression syntax.

In revisiting the page out of curiosity, I found that the symbol was
[Optional] for "any string." This came as a surprise because there's a very
common way to achieve "any string" in `.*`. Is there any compelling reason
to preserve this tiny vector of complexity? I suspect there may be some
differences in the constructions of the finite automata produced by `.*`
and `@` but I am not sure.

If insignificant or non-existent, I suggest we remove `@` from the regular
expression syntax.

-- 
Marcus Eagan


Re: Add maxFields Option to IndexWriter

2021-01-14 Thread Marcus Eagan
I like Oren's idea and Simon's proposal of unlimited by default but
configurable.
Marcus
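
A minimal sketch of what "unlimited by default but configurable" could look
like, purely for discussion. This is not an IndexWriter API: FieldLimitGuard
and its methods are hypothetical names, and the sketch tracks field names
itself instead of hooking into Lucene.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical soft limit on distinct field names; not part of Lucene.
final class FieldLimitGuard {
    // Assumption: "unlimited" is modelled as the largest int value.
    static final int UNLIMITED = Integer.MAX_VALUE;

    private final int maxFields;
    private final Set<String> seen = new HashSet<>();

    FieldLimitGuard(int maxFields) {
        if (maxFields < 1) {
            throw new IllegalArgumentException("maxFields must be >= 1, got " + maxFields);
        }
        this.maxFields = maxFields;
    }

    /** Records a field name; throws if a previously unseen name would exceed the limit. */
    void register(String fieldName) {
        if (seen.contains(fieldName)) {
            return; // already counted, so it never trips the limit
        }
        if (seen.size() >= maxFields) {
            throw new IllegalStateException(
                "field limit of " + maxFields + " reached; rejecting '" + fieldName + "'");
        }
        seen.add(fieldName);
    }

    int fieldCount() {
        return seen.size();
    }

    public static void main(String[] args) {
        FieldLimitGuard guard = new FieldLimitGuard(2);
        guard.register("title");
        guard.register("body");
        guard.register("title"); // repeat of a known field is fine
        System.out.println(guard.fieldCount()); // 2
    }
}
```

Constructing the guard with UNLIMITED keeps today's behaviour, while a
multi-tenant deployment could pass a small number to stop a mapping explosion
before it reaches the index.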

On Thu, Jan 14, 2021 at 12:16 AM Simon Willnauer 
wrote:

> I personally have pretty positive experience with what I call soft limits.
> At elastic we use them all over the place to catch issues when a user
> likely misconfigures something or if there is likely an issue on the user's
> end.
> I think having an option on the IW that allows limiting the number of
> fields makes sense. We can even extract a general limits object with total
> num docs etc. if we want. We can still set stuff to unlimited by default.
>
> WDYT
>
> Sent from a mobile device
>
> On 14. Jan 2021, at 06:36, David Smiley  wrote:
>
> 
> I don't like the idea of IndexWriter limiting field names, but I do like
> the idea of un-deprecating that method, which appeared to have a trivial
> implementation.  Try commenting on the issue for its deprecation, which
> has various watchers, to get their attention.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Wed, Jan 13, 2021 at 5:02 PM Oren Ovadia
>  wrote:
>
>> Hi All,
>>
>> I work on Lucene at MongoDB.
>>
>> I would like to limit the amount of fields in an index to prevent tenants
>> from causing a mapping explosion.
>>
>> Since IndexWriter.getFieldNames has been deprecated
>> <https://issues.apache.org/jira/browse/LUCENE-8909>, there is no way to
>> do this without using a reader (which comes with a set of problems
>> regarding flush/commit rates).
>>
>> Would love to add to Lucene the ability to have IndexWriters limiting the
>> number of fields. Curious to hear your thoughts.
>>
>> Thanks,
>> Oren
>>
>>

-- 
Marcus Eagan


Re: [DISCUSS] Cross Data-Center Replication in Apache Solr

2020-12-06 Thread Marcus Eagan
Looking forward to the doc. Thanks

Marcus

On Sun, Dec 6, 2020 at 05:47 Erick Erickson  wrote:

> Anshum:
>
> I know I’ve been recommending something like this to clients for a while,
> do you think a call to the community for people who’ve already put
> something in the middle might net us some good info on the lurking
> gremlins? Mind you “recommend” hasn’t actually involved me _doing_ it
> so I don’t have any actual experience there…
>
> But yeah, absolutely +1 for something making this easier for clients...
>
> Erick
>
> > On Dec 5, 2020, at 11:43 AM, Ilan Ginzburg  wrote:
> >
> > That's an interesting initiative Anshum!
> >
> > I can see at least two different approaches here, your mention of SolrJ
> seems to hint at the first one:
> > 1. Get the data as it comes from the client and fork it to local and
> remote data centers,
> > 2. Create (an asynchronous) stream replicating local data center data to
> remote.
> >
> > Option 1 is strongly consistent but adds latency and potentially
> blocking on the critical path.
> > Option 2 could look like remote PULL replicas, might have lower impact
> on the local data center but has to deal with the remote data center always
> being somewhat behind. If the client application can handle that, the
> performance and efficiency gain (as well as simpler implementation? It
> doesn't require another persistence layer) might be worth it...
> >
> > Ilan
> >
> > On Fri, Dec 4, 2020 at 5:24 PM Anshum Gupta 
> wrote:
> > Hi everyone,
> >
> > Large-scale Solr installations often require cross data-center
> replication, both to reduce access latency and to support disaster
> recovery. In the past, users have either designed their own solutions to
> deal with this or have tried to rely on the now-deprecated CDCR.
> >
> > It would be really good to have support for cross data-center
> replication within Solr, that is offered and supported by the community.
> This would allow the effort around this shared problem to converge.
> >
> > I’d like to propose a new solution based on my experiences at my day
> job. The key points about this approach:
> >   • Uses an external, configurable, messaging system in the middle
> for actual replication/mirroring.
> >   • We offer an abstraction and some default implementations based
> on what we can support and what users really want. An example here would be
> Kafka.
> >   • This would be a separate repository allowing it to have its own
> release cadence. We shouldn’t have to release this with every Solr release
> as the overlap is just limited to SolrJ interactions.
> >
> > I’ll share a more detailed and evolving document soon with the design
> for everyone else to contribute to but wanted to share this as I’m starting
> to work on this and wanted to avoid parallel efforts towards the same
> end-goal.
> >
> > --
> > Anshum Gupta
>
>
> --
Marcus Eagan


Re: [Apache Solr] Twitter Account

2020-12-05 Thread Marcus Eagan
One of the committers should pick it up, make release announcements, and
share important insights with the community periodically.

On Sat, Dec 5, 2020 at 14:06 Anshum Gupta  wrote:

> Yes, I agree that it should be more active, but that's not really an
> official 'Apache' Solr account :)
>
> I know at some point Shalin tried to find someone who would be up for it
> and manage it, but considering we are all volunteering, it's tough to keep
> up and he didn't get any volunteers.
>
> As of now it's just a dormant account w.r.t. activity.
>
> On Sat, Dec 5, 2020 at 2:40 AM Alessandro Benedetti 
> wrote:
>
>> Hi,
>> I noticed that the Apache Solr Twitter account is not that active anymore.
>> There is not even a one-to-one match between tweets and releases.
>> Not to mention the countless interesting blog posts Solr related, that
>> could benefit the community if better shared.
>>
>> In my opinion, that's a shame, given the good number of followers the
>> account has.
>> Who's managing it?
>>
>> I understand that the management of that page must be unbiased, sharing
>> interesting posts, without direct commercial purpose, given the fact many
>> companies (including mine) make a living out of Apache Solr (but also give
>> back to the community in form of blogs and contributions).
>>
>>
>>
>> Regards
>>
>
>
> --
> Anshum Gupta
>
-- 
Marcus Eagan


Re: Maintaining Unnecessary Test Files

2020-11-03 Thread Marcus Eagan
@David Smiley  It would seem to me that
TestSpatialFilter would be fine with no mention of the port in the name.
It's a confusing identifier.

As for TestGeo3dShapeWGS84ModelRectRelation, there are lots of comments.

I'm not sure what the value of this comment in the class is:

 /*
 [junit4]   1> S-R Rel: {}, Shape {}, Rectangle {}lap# {}
[CONTAINS, Geo3dShape{planetmodel=PlanetModel:
{xyScaling=1.0011188180710464, zScaling=0.9977622539852008},
shape=GeoPath: {planetmodel=PlanetModel:
{xyScaling=1.0011188180710464, zScaling=0.9977622539852008},
width=1.53588974175501(87.99),
  points={[[X=0.12097657665150223, Y=-0.6754177666095532,
Z=0.7265376136709238], [X=-0.3837892785614207, Y=0.4258049113530899,
Z=0.8180007850434892]]}}},
  Rect(minX=4.0,maxX=36.0,minY=16.0,maxY=16.0), 6981](no slf4j subst; sorry)
 [junit4] FAILURE 0.59s | Geo3dWGS84ShapeRectRelationTest.testGeoPathRect <<<
 [junit4]> Throwable #1: java.lang.AssertionError:
Geo3dShape{planetmodel=PlanetModel: {xyScaling=1.0011188180710464,
zScaling=0.9977622539852008}, shape=GeoPath: {planetmodel=PlanetModel:
{xyScaling=1.0011188180710464, zScaling=0.9977622539852008},
width=1.53588974175501(87.99),
  points={[[X=0.12097657665150223, Y=-0.6754177666095532,
Z=0.7265376136709238], [X=-0.3837892785614207, Y=0.4258049113530899,
Z=0.8180007850434892]]}}} intersect Pt(x=23.81626064835212,y=16.0)
 [junit4]>  at
__randomizedtesting.SeedInfo.seed([2595268DA3F13FEA:6CC30D8C83453E5D]:0)
 [junit4]>  at
org.apache.lucene.spatial.spatial4j.RandomizedShapeTestCase._assertIntersect(RandomizedShapeTestCase.java:168)
 [junit4]>  at
org.apache.lucene.spatial.spatial4j.RandomizedShapeTestCase.assertRelation(RandomizedShapeTestCase.java:153)
 [junit4]>  at
org.apache.lucene.spatial.spatial4j.RectIntersectionTestHelper.testRelateWithRectangle(RectIntersectionTestHelper.java:128)
 [junit4]>  at
org.apache.lucene.spatial.spatial4j.Geo3dWGS84ShapeRectRelationTest.testGeoPathRect(Geo3dWGS84ShapeRectRelationTest.java:265)
*/

Nor am I sure about this comment. These just seem like stream-of-consciousness
notes, and maybe they should be captured in the relevant tickets, which
could be referenced. Do you think it should be in the actual source
code?

// Rectangle contains point
//assertTrue(rect.isWithin(pt));
// Path contains point (THIS FAILS)
//assertTrue(path.isWithin(pt));
// What happens: (1) The center point of the horizontal line is within
the path, in fact within a radius of one of the endpoints.
// (2) The point mentioned is NOT inside either SegmentEndpoint.
// (3) The point mentioned is NOT inside the path segment, either.  (I
think it should be...)




On Mon, Nov 2, 2020 at 7:00 AM David Smiley  wrote:

> Hi Marcus,
>
> PortSolr3Test is documented "Based off of Solr 3's SpatialFilterTest".
> Why do you propose removing it?  A quick gloss over it suggests to me this
> test is a rather straight-forward test to understand and maintain, and
> should be fast.  Perhaps the class name should have been something else,
> and its heritage considered an implementation detail that is only worthy
> of a comment?  But then name it what?  IMO it's fine.
>
> TestGeo3dShapeWGS84ModelRectRelation: What about it?  This test class is
> 99% implemented by its superclass -- ShapeRectRelationTestCase.  The
> subclass provides the Geo3d context to its superclass, and the rest is
> handled from there.  The tests explicitly on this subclass are for some
> regressions.
>
> > Is there anyone keeping a list of test cases that we can get rid of  or
> significantly refactor today?
>
> For Solr -- yes (sorta): SOLR-11872
> <https://issues.apache.org/jira/browse/SOLR-11872>.  That would be a
> major make-over, but wouldn't really change the number of tests; it'd make
> them easier to maintain and more flexible.  There is another issue,
> SOLR-10229, about how Solr tests ought to configure Solr themselves and
> rely less on static test config files, which are a mess to maintain.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Sun, Nov 1, 2020 at 5:21 PM Marcus Eagan  wrote:
>
>> Hi Community,
>>
>> Lately I have been reading a lot of test files in an attempt to
>> understand what they seek to accomplish. Specifically, what stability and
>> reliability assurance does a given test class provide. In short, I have
>> found some test files that I am unsure are required to provide any of the
>> expected guarantees of the project today.
>>
>> It is quite possible that I am misreading or don’t know all the history to
>> opine, and I don’t want to waste anyone’s time with a ticket without first
>> raising a discussion here. Below, I’ll include a few examples from Lucene.
>> As o

Maintaining Unnecessary Test Files

2020-11-01 Thread Marcus Eagan
Hi Community,

Lately I have been reading a lot of test files in an attempt to understand
what they seek to accomplish. Specifically, what stability and reliability
assurance does a given test class provide. In short, I have found some test
files that I am unsure are required to provide any of the expected
guarantees of the project today.

It is quite possible that I am misreading or don’t know all the history to
opine, and I don’t want to waste anyone’s time with a ticket without first
raising a discussion here. Below, I’ll include a few examples from Lucene.
As of today, I fully intend to step through many of the test files from
Solr as well for a related effort, but I started with Lucene because I have
~800 more classes in Solr to review/modify/flag for review and because
there is a fast-changing reference impl out there.

The first example is the PortSolr3Test class. It seems relevant because it
tests some currently relevant cases, but the name suggests that it might
not be.

Another class I read that has lots of cruft, and that I could really use
some guidance on, is the TestGeo3dShapeWGS84ModelRectRelation class. Is it
possible there are lots of test classes that are no longer necessary given
changes over the versions?

Is there anyone keeping a list of test cases that we can get rid of  or
significantly refactor today?

Please advise,

Marcus


Re: Communicating the future of DIH?

2020-10-15 Thread Marcus Eagan
There are always issues opened in every product that aren’t being closed.
Everyone who knows it or cares about it should be pitching in.

Marcus

On Thu, Oct 15, 2020 at 12:21 Eric Pugh 
wrote:

> I noticed that we’re getting tickets like SOLR-14938 opened that are all
> about the future of DIH.  I know some of my own clients are asking about it
> as well.   I suspect we will get more and more of these!
>
> I wonder if there are any ideas/suggestions on how to better communicate
> that DIH isn’t going away, and indeed, it’s moving to a better place (I
> hope!).   Do we want to add to the UI a message about “join the new
> community at https://github.com/rohitbemax/dataimporthandler”?
>
> Having said that, I see issues opening at
> https://github.com/rohitbemax/dataimporthandler and not being closed, so
> I do have some concerns that a supportive community may not actually be
> forming.
>
>
> Eric
>
> ___
> *Eric Pugh **| *Founder & CEO | OpenSource Connections, LLC | 434.466.1467
> | http://www.opensourceconnections.com | My Free/Busy
> <http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
>
> --
Marcus Eagan


Re: [VOTE] Lucene logo contest, third time's a charm

2020-09-07 Thread Marcus Eagan
(non-binding)
B5d, A2

On Mon, Sep 7, 2020 at 8:45 PM Ryan Ernst  wrote:

> I forgot to vote myself. :)
>
> (binding)
> A1, A2, D
>
> On Tue, Sep 1, 2020 at 1:21 PM Ryan Ernst  wrote:
>
>> Dear Lucene and Solr developers!
>>
>> Sorry for the multiple threads. This should be the last one.
>>
>> In February a contest was started to design a new logo for Lucene
>> [jira-issue]. The initial attempt [first-vote] to call a vote resulted in
>> some confusion on the rules, as well as the request for one additional
>> submission. The second attempt [second-vote] yesterday had incorrect links
>> for one of the submissions. I would like to call a new vote, now with more
>> explicit instructions on how to vote, and corrected links.
>>
>> *Please read the following rules carefully* before submitting your vote.
>>
>> *Who can vote?*
>>
>> Anyone is welcome to cast a vote in support of their favorite
>> submission(s). Note that only PMC members' votes are binding. If you are a
>> PMC member, please indicate with your vote that the vote is binding, to
>> ease collection of votes. In tallying the votes, I will attempt to verify
>> only those marked as binding.
>>
>>
>> *How do I vote?*
>> Votes can be cast simply by replying to this email. It is a ranked-choice
>> vote [rank-choice-voting]. Multiple selections may be made, where the order
>> of preference must be specified. If an entry gets more than half the votes,
>> it is the winner. Otherwise, the entry with the lowest number of votes is
>> removed, and the votes are retallied, taking into account the next
>> preferred entry for those whose first entry was removed. This process
>> repeats until there is a winner.
>>
>> The entries are broken up by variants, since some entries have multiple
>> color or style variations. The entry identifiers are first a capital
>> letter, followed by a variation id (described with each entry below), if
>> applicable. As an example, if you prefer variant 1 of entry A, followed by
>> variant 2 of entry A, variant 3 of entry C, entry D, and lastly variant 4e
>> of entry B, the following should be in your reply:
>>
>> (binding)
>> vote: A1, A2, C3, D, B4e
>>
>> *Entries*
>>
>> The entries are as follows:
>>
>> A*.* Submitted by Dustin Haver. This entry has two variants, A1 and A2.
>>
>> [A1]
>> https://issues.apache.org/jira/secure/attachment/12999548/Screen%20Shot%202020-04-10%20at%208.29.32%20AM.png
>> [A2]
>> https://issues.apache.org/jira/secure/attachment/12997172/LuceneLogo.png
>>
>> B. Submitted by Stamatis Zampetakis. This has several variants. Within
>> the linked entry there are 7 patterns and 7 color palettes. Any vote for B
>> should contain the pattern number followed by the lowercase letter of the
>> color palette. For example, B3e or B1a.
>>
>> [B]
>> https://issues.apache.org/jira/secure/attachment/12997768/zabetak-1-7.pdf
>>
>> C. Submitted by Baris Kazar. This entry has 8 variants.
>>
>> [C1]
>> https://issues.apache.org/jira/secure/attachment/13006392/lucene_logo1_full.pdf
>> [C2]
>> https://issues.apache.org/jira/secure/attachment/13006393/lucene_logo2_full.pdf
>> [C3]
>> https://issues.apache.org/jira/secure/attachment/13006394/lucene_logo3_full.pdf
>> [C4]
>> https://issues.apache.org/jira/secure/attachment/13006395/lucene_logo4_full.pdf
>> [C5]
>> https://issues.apache.org/jira/secure/attachment/13006396/lucene_logo5_full.pdf
>> [C6]
>> https://issues.apache.org/jira/secure/attachment/13006397/lucene_logo6_full.pdf
>> [C7]
>> https://issues.apache.org/jira/secure/attachment/13006398/lucene_logo7_full.pdf
>> [C8]
>> https://issues.apache.org/jira/secure/attachment/13006399/lucene_logo8_full.pdf
>>
>> D. The current Lucene logo.
>>
>> [D]
>> https://lucene.apache.org/theme/images/lucene/lucene_logo_green_300.png
>>
>> Please vote for one of the above choices. This vote will close about one
>> week from today, Mon, Sept 7, 2020 at 11:59PM.
>>
>> Thanks!
>>
>> [jira-issue] https://issues.apache.org/jira/browse/LUCENE-9221
>> [first-vote]
>> http://mail-archives.apache.org/mod_mbox/lucene-dev/202006.mbox/%3cCA+DiXd74Mz4H6o9SmUNLUuHQc6Q1-9mzUR7xfxR03ntGwo=d...@mail.gmail.com%3e
>> [second-vote]
>> http://mail-archives.apache.org/mod_mbox/lucene-dev/202009.mbox/%3cCA+DiXd7eBrQu5+aJQ3jKaUtUTJUqaG2U6o+kUZfNe-m=smn...@mail.gmail.com%3e
>> [rank-choice-voting] https://en.wikipedia.org/wiki/Instant-runoff_voting
>>
>

-- 
Marcus Eagan
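
For anyone who wants to check a tally by hand, the ranked-choice procedure described in the quoted vote instructions can be sketched roughly as below. This is only an illustration, not the script used to count the actual vote. The class name InstantRunoff and the method tally are made up for this example, and the tie-break rule (eliminate the alphabetically first of the lowest entries) is an assumption, since the instructions do not say how ties are resolved.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene or Solr.
final class InstantRunoff {

    /**
     * Each ballot lists entry ids in order of preference, e.g. ["A1", "A2", "D"].
     * Returns the winning entry id, or null if there are no ballots.
     */
    static String tally(List<List<String>> ballots) {
        // Every entry that appears on at least one ballot is in the running.
        Set<String> remaining = new TreeSet<>();
        for (List<String> ballot : ballots) {
            remaining.addAll(ballot);
        }

        while (!remaining.isEmpty()) {
            Map<String, Integer> counts = new TreeMap<>();
            for (String entry : remaining) {
                counts.put(entry, 0);
            }

            // Count each ballot for its most preferred entry that is still in.
            int active = 0;
            for (List<String> ballot : ballots) {
                for (String choice : ballot) {
                    if (remaining.contains(choice)) {
                        counts.merge(choice, 1, Integer::sum);
                        active++;
                        break;
                    }
                }
            }

            String lowest = null;
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                // More than half of the ballots still counted: winner.
                if (e.getValue() * 2 > active) {
                    return e.getKey();
                }
                // Assumed tie-break: the alphabetically first lowest entry goes.
                if (lowest == null || e.getValue() < counts.get(lowest)) {
                    lowest = e.getKey();
                }
            }

            // No majority yet: remove the lowest entry and retally.
            remaining.remove(lowest);
        }
        return null;
    }
}
```

Calling InstantRunoff.tally with one list per voter, in the order given in each reply, returns the entry that first collects more than half of the ballots still being counted.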


Re: Codebase Janitorial Services: the SolrSingleThreaded PR

2020-08-12 Thread Marcus Eagan
That is a great thought and I agree with it.

On Wed, Aug 12, 2020 at 5:55 AM Mike Drob  wrote:

> I don’t have time to write a more comprehensive email with links and
> references but the basic outline here is that Mark did some (read: a ton)
> work on a private branch that he later shared with his coworkers. Pieces of
> that branch filtered out to Apache.
>
> The intent behind these annotations was to start labeling things so that we
> can reason about our concurrency better, since right now we likely have
> bugs relating to misuse of concurrent classes and shared state. Solr is a
> complex beast that is very hard to understand, so the thought was that we
> could start chipping away at it.
>
>
> Mike
>
> On Wed, Aug 12, 2020 at 1:18 AM Marcus Eagan 
> wrote:
>
>> Hi Community,
>>
>> As I have taken it upon myself to slowly document opportunities for
>> improving the overall quality and maintainability of the code base, I was
>> looking into my notes on test failures.
>>
>> As you all know many test failures in distributed systems repos sometimes
>> can be traced back to race conditions or trying to simulate thread safety
>> with waits. So I went digging into any work in this area.
>>
>> I stumbled across a peculiar contribution: SOLR-13998: Add thread safety
>> annotations to classes (#1053)
>> <https://github.com/apache/lucene-solr/pull/1057>. In the title of the
>> PR, @Anshum Gupta  writes that he adds thread
>> safety annotation to classes.
>>
>> This is misleading and confusing to me for multiple reasons:
>>
>> 1. I was thrown off a little bit by the title. While I tend to defer to
>> Anshum the expert,  and I think it might be helpful to have thread safety
>> annotations for the compiler in many cases, there were literally no
>> annotations added to any classes in this PR. The change notes from the
>> PR say:
>>
>>> SOLR-13998: Add thread safety annotations to Solr. This only introduces
>>> the annotations and doesn't add these to existing classes. (Anshum
>>> Gupta, Mark Miller)
>>
>>
>> Since this is true, I think the title should change. What was Mark's
>> involvement? Is there a commit/comment missing?
>>
>> 2. What is the value in adding this SingleThreaded annotation 8 months
>> ago, yet not implementing it once since? In fact, it has only been used
>> once, in one file by Dat, which is great. The only place I found it was
>> here: https://github.com/apache/lucene-solr/pull/1470 from last month's
>> shard handler improvements from Dat. I'm mostly concerned about how the
>> project operates with regard to this question.
>>
>> 3. As a general engagement model that could help the project, I think
>> that if someone is going to seemingly add something to Solr, they should
>> implement it in a few places, or point the rest of the community at places
>> where they think it could help the project. Getting in the habit of adding
>> a class, and then not using it anywhere should be something that the
>> committers and contributors on the project reconsider. Otherwise, the
>> codebase will be sprawling. It is sprawling in some areas. More on that
>> later. There are other much older manifestations of this behavior, so I am
>> asking the community, in particular the contributor of this recently added
>> but scantly implemented class, to take a more active role in spearheading
>> the addition of the SolrSingleThreaded annotations where/if they can add
>> value.
>>
>> 4. Ideas I have around this:
>> a) maybe you could share why you added this code in the first place.
>> b) where and why you or the community see we should be annotating more in
>> this manner.
>> c) re-examine your own contribution and ask yourself "should I have
>> contributed this code without more discussion about how it would be
>> implemented in the project?" I stumbled upon this looking for what the PR
>> says it is, but in reality there's not much there. I understand that, and I
>> really like micro-PRs. They're easier to review. But I think we should take
>> this opportunity to build on that work, or get rid of it and other PRs like
>> it.
>>
>> 5. Anshum, did you have some other objectives or aims back in December
>> when you added the SolrSingleThreaded annotation? Does anyone else have any
>> objectives around supporting this effort? Can I and others be of assistance
>> in helping you get to where you intended to go with the PR? I really want
>> to be helpful.
>>
>> 6. Most of all, maybe Anshum or someone here can actually explain to me
>> what  SolrSingleThreaded actually does that is needed? I have thoughts and
>> have used annotations like this before with Spring, but in my very limited
>> knowledge about what this PR was intended to accomplish, this
>> implementation does nothing.
>>
>> Please advise,
>>
>> Marcus
>>
>>
>>
>>
>> --
>> Marcus Eagan
>>
>>

-- 
Marcus Eagan


Re: Remove/Replace Nashorn, Remove Eval

2020-08-12 Thread Marcus Eagan
I hadn't opened a ticket or PR but would as soon as I receive some support
from the community.


On Wed, Aug 12, 2020 at 1:29 AM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> +1 to removing it.
> Does the build pass if we remove that line?
>
> On Wed, Aug 12, 2020 at 12:48 PM Marcus Eagan 
> wrote:
>
>> Not trying to spam the list, just looking to get feedback about the
>> goings on in the project and on some of my items before I share my Google
>> Doc, which is damning, even of my own work and efforts.
>>
>> This line and subsequent lines concern me:
>>
>>
>> https://github.com/apache/lucene-solr/blob/1d2749295b5378db9f54d603b581d1d9a1e3cc93/lucene/tools/javadoc/java11/package-list#L265
>>
>> We should remove Nashorn and eval from our code base.
>>
>> One could argue that eval should've been removed eight years ago.
>> Nashorn should have been removed in 2018, when Oracle announced it was
>> <https://blogs.oracle.com/javamagazine/jep-335-deprecate-the-nashorn-javascript-engine>
>> shifting all efforts to GraalVM. Adopting GraalVM, if we feel we need it,
>> gives the platform many capabilities and much more security than what is
>> offered by Nashorn. Nashorn is not actively maintained anymore to my
>> knowledge.
>>
>> Are there any objections to me removing Nashorn, revisiting adding
>> GraalVM if we feel we need it, and totally removing eval from the code
>> base? It is already mostly removed thanks to work from Kevin and Jan, I
>> believe. I wanted to remove it back in March of 2019, but that's another
>> story for a different email thread.
>>
>> Anyway, please advise.
>>
>>  Best,
>>
>> Marcus Eagan
>>
>>

-- 
Marcus Eagan


Remove/Replace Nashorn, Remove Eval

2020-08-12 Thread Marcus Eagan
Not trying to spam the list, just looking to get feedback about the goings
on in the project and on some of my items before I share my Google Doc,
which is damning, even of my own work and efforts.

This line and subsequent lines concern me:

https://github.com/apache/lucene-solr/blob/1d2749295b5378db9f54d603b581d1d9a1e3cc93/lucene/tools/javadoc/java11/package-list#L265

We should remove Nashorn and eval from our code base.

One could argue that eval should've been removed eight years ago. Nashorn
should have been removed in 2018, when Oracle announced it was
<https://blogs.oracle.com/javamagazine/jep-335-deprecate-the-nashorn-javascript-engine>
shifting all efforts to GraalVM. Adopting GraalVM, if we feel we need it,
gives the platform many capabilities and much more security than what is
offered by Nashorn. Nashorn is not actively maintained anymore to my
knowledge.

Are there any objections to me removing Nashorn, revisiting adding GraalVM
if we feel we need it, and totally removing eval from the code base? It is
already mostly removed thanks to work from Kevin and Jan, I believe. I
wanted to remove it back in March of 2019, but that's another story for a
different email thread.

Anyway, please advise.

 Best,

Marcus Eagan


Re: First Issue Label

2020-08-12 Thread Marcus Eagan
Awesome Tomoko. Thanks to everyone who is helping here.

On Tue, Aug 11, 2020 at 10:29 PM Tomoko Uchida 
wrote:

> JFYI: I opened an issue with the "newdev" label. It's mainly about
> documentation and requires a bit of knowledge about our build system
> (gradle).
> https://issues.apache.org/jira/browse/LUCENE-9459
>
> Thanks,
> Tomoko
>
>
> 2020年8月9日(日) 5:00 Eric Pugh :
>
>> I’d be interested in shepherding “newdev” style contributions to being
>> commits.   I’m not comfortable making any deeper changes in Solr, but if
>> it’s a “newdev” labeled feature, well then it’s probably “newcommitter”
>> friendly as well ;-).
>>
>> Feel free to tag me on any issues that have patches etc….
>>
>> On Aug 8, 2020, at 8:19 AM, Ishan Chattopadhyaya <
>> ichattopadhy...@gmail.com> wrote:
>>
>> Thanks for the reminder, Marcus. I just added a "newdev" label to this:
>> https://issues.apache.org/jira/browse/SOLR-13438.
>>
>> On Fri, Aug 7, 2020 at 4:55 AM Jan Høydahl  wrote:
>>
>>> I have tagged some of the issues I have filed but not had bandwidth to
>>> tackle immediately as ’newdev’, but could probably have done it far more
>>> often.
>>> If all of us browse through the issues we have created and tag those we
>>> think are simple and important, then there would suddenly be a bunch!
>>> Great reminder, Marcus! Having a clear focus on new devs is important!
>>>
>>> Jan
>>>
>>> 6. aug. 2020 kl. 06:12 skrev Anshum Gupta :
>>>
>>> There used to be a 'newdev' label in the past that fell through the
>>> cracks.
>>> https://cwiki.apache.org/confluence/display/LUCENE/HowToContribute mentions
>>> the label, but of course, not a ton of JIRAs exist with that label for new
>>> devs to pick up and run with.
>>>
>>> We could start using the label. I personally tagged a bunch of JIRAs
>>> once upon a time with that label and also remember that as something we did
>>> at one of the committer meetings, but then the lower hanging JIRAs were
>>> really created and resolved without much delay, not leaving much on the
>>> table for new developers.
>>> We can certainly get back to using the label again.
>>>
>>> -Anshum
>>>
>>> On Wed, Aug 5, 2020 at 8:04 PM Marcus Eagan 
>>> wrote:
>>>
>>>> Community,
>>>>
>>>> In the vein of being more developer friendly, I think we should create a
>>>> first issue label. In my experience, that label has been a great way to get
>>>> newcomers involved in projects new to them.
>>>>
>>>> I've seen it in a number of Apache projects that I have contributed to,
>>>> proprietary projects, and in CNCF projects.
>>>>
>>>> Please let me know what you think about a first issue label to make it
>>>> easier for people not necessarily in the community looking to join to do so
>>>> in the future.
>>>>
>>>> Thanks,
>>>> --
>>>> Marcus Eagan
>>>>
>>>>
>>>
>>> --
>>> Anshum Gupta
>>>
>>>
>>>
>> ___
>> *Eric Pugh **| *Founder & CEO | OpenSource Connections, LLC | 434.466.1467
>> | http://www.opensourceconnections.com | My Free/Busy
>> <http://tinyurl.com/eric-cal>
>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>> This e-mail and all contents, including attachments, is considered to be
>> Company Confidential unless explicitly stated otherwise, regardless
>> of whether attachments are marked as such.
>>
>>

-- 
Marcus Eagan


Re: Codebase Janitorial Services: the SolrSingleThreaded PR

2020-08-12 Thread Marcus Eagan
To be clear, because I don't want to come off as antagonizing: the author
of the PR could be anyone. I think if we forget who and focus on what, we
can all row together.

I'm writing the group as a whole because I do want to contest this sort of
way of doing things in the absence of good reason. I've modified my initial
email below to be a bit more focused.

Thanks all,

Marcus

On Tue, Aug 11, 2020 at 11:17 PM Marcus Eagan  wrote:

> Hi Community,
>
> As I have taken it upon myself to slowly document opportunities for
> improving the overall quality and maintainability of the code base, I was
> looking into my notes on test failures.
>
> As you all know many test failures in distributed systems repos sometimes
> can be traced back to race conditions or trying to simulate thread safety
> with waits. So I went digging into any work in this area.
>
> I stumbled across a peculiar contribution: SOLR-13998: Add thread safety
> annotations to classes (#1053)
> <https://github.com/apache/lucene-solr/pull/1057>. In the title of the
> PR, someone writes that they add thread safety annotations to classes.
>
> This is misleading and confusing to me for multiple reasons:
>
> 1. I was thrown off a little bit by the title. While I tend to defer to
> the experts,  and I think it might be helpful to have thread safety
> annotations for the compiler in many cases, there were literally no
> annotations added to any classes in this PR. The change notes from the PR
> say:
>
>> SOLR-13998: Add thread safety annotations to Solr. This only introduces
>> the annotations and doesn't add these to existing classes. (Anshum
>> Gupta, Mark Miller)
>
>
> Since this is true, I think the title should change. What was Mark's
> involvement? Is there a commit/comment missing?
>
> 2. What is the value in adding this SingleThreaded annotation 8 months
> ago, yet not implementing it once since? In fact, it has only been used
> once, in one file by Dat, which is great. The only place I found it was
> here: https://github.com/apache/lucene-solr/pull/1470 from last month's
> shard handler improvements from Dat. I'm mostly concerned about how the
> project operates with regard to this question.
>
> 3. As a general engagement model that could help the project, I think that
> if someone is going to seemingly add something to Solr, they should
> implement it in a few places, or point the rest of the community at places
> where they think it could help the project. Getting in the habit of adding
> a class, and then not using it anywhere should be something that the
> committers and contributors on the project reconsider. Otherwise, the
> codebase will be sprawling. It is sprawling in some areas. More on that
> later. There are other much older manifestations of this behavior, so I am
> asking the community, in particular the contributor of this recently added
> but scantly implemented class, to take a more active role in spearheading
> the addition of the SolrSingleThreaded annotations where/if they can add
> value.
>
> 4. Ideas I have around this:
> a) maybe you could share why you added this code in the first place.
> b) where and why you or the community see we should be annotating more in
> this manner.
> c) re-examine your own contribution and ask yourself "should I have
> contributed this code without more discussion about how it would be
> implemented in the project?" I stumbled upon this looking for what the PR
> says it is, but in reality there's not much there. I understand that, and I
> really like micro-PRs. They're easier to review. But I think we should take
> this opportunity to build on that work, or get rid of it and other PRs like
> it.
>
> 5. Did the author have some other objectives or aims back in December
> when they added the SolrSingleThreaded annotation? Does anyone else have any
> objectives around supporting this effort? Can I and others be of assistance
> in helping them get to where they intended to go with the PR? I really want
> to be helpful.
> 6. Most of all, maybe someone here can actually explain to me what
> SolrSingleThreaded actually does that is needed? I have thoughts and have
> used annotations like this before with Spring, but in my very limited
> knowledge about what this PR was intended to accomplish, this
> implementation does nothing.
>
> Please advise,
>
> Marcus
>
>
>
>
> --
> Marcus Eagan
>
>

-- 
Marcus Eagan
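
On point 6 in the quoted message, here is a rough sketch of how a marker annotation of this general kind behaves in plain Java. It is not the Solr source: the names SingleThreadedSketch, RequestScopedCounter and AnnotationAudit are hypothetical, and the RUNTIME retention is an assumption chosen so the marker can be read reflectively. The actual Solr annotation may be declared differently.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical marker annotation. It declares intent only; nothing in the
// JVM or the compiler enforces single-threaded use because of it.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface SingleThreadedSketch {}

// A class whose author states that instances are confined to one thread.
@SingleThreadedSketch
final class RequestScopedCounter {
    private int count;

    int increment() {
        // Not synchronized: the annotation documents why that is acceptable,
        // but it does not make the class safe under concurrent access.
        return ++count;
    }
}

// The marker only gains value when some tool or test reads it.
final class AnnotationAudit {
    static boolean isMarkedSingleThreaded(Class<?> type) {
        return type.isAnnotationPresent(SingleThreadedSketch.class);
    }
}
```

In other words, an annotation like this is documentation that tooling can read. Until classes are labeled and something checks the labels, such as a test, a static analysis rule, or a reviewer, it changes nothing at runtime.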


Codebase Janitorial Services: the SolrSingleThreaded PR

2020-08-12 Thread Marcus Eagan
Hi Community,

As I have taken it upon myself to slowly document opportunities for
improving the overall quality and maintainability of the code base, I was
looking into my notes on test failures.

As you all know many test failures in distributed systems repos sometimes
can be traced back to race conditions or trying to simulate thread safety
with waits. So I went digging into any work in this area.

I stumbled across a peculiar contribution: SOLR-13998: Add thread safety
annotations to classes (#1053)
<https://github.com/apache/lucene-solr/pull/1057>. In the title of the
PR, @Anshum
Gupta  writes that he adds thread safety annotation
to classes.

This is misleading and confusing to me for multiple reasons:

1. I was thrown off a little bit by the title. While I tend to defer to
Anshum, the expert, and I think it might be helpful to have thread safety
annotations for the compiler in many cases, there were literally no
annotations added to any classes in this PR. The change notes from the
PR say:

> SOLR-13998: Add thread safety annotations to Solr. This only introduces
> the annotations and doesn't add these to existing classes. (Anshum Gupta,
> Mark Miller)


Since this is true, I think the title should change. What was Mark's
involvement? Is there a commit/comment missing?

2. What is the value in adding this SingleThreaded annotation 8 months ago,
yet barely applying it since? In fact, it has only been used once, in
one file by Dat, which is great. The only place I found it was here:
https://github.com/apache/lucene-solr/pull/1470 from last month's shard
handler improvements from Dat. I'm mostly concerned about how the project
operates with regard to this question.

3. As a general engagement model that could help the project, I think that
if someone is going to seemingly add something to Solr, they should
implement it in a few places, or point the rest of the community at places
where they think it could help the project. Getting in the habit of adding
a class, and then not using it anywhere should be something that the
committers and contributors on the project reconsider. Otherwise, the
codebase will be sprawling. It is sprawling in some areas. More on that
later. There are other much older manifestations of this behavior, so I am
asking the community, in particular the contributor of this recently added
but scantly implemented class, to take a more active role in spearheading
the addition of the SolrSingleThreaded annotations where/if they can add
value.

4. Ideas I have around this:
a) maybe you could share why you added this code in the first place.
b) where and why you or the community see we should be annotating more in
this manner.
c) re-examine your own contribution and ask yourself "should I have
contributed this code without more discussion about how it would be
implemented in the project?" I stumbled upon this looking for what the PR
says it is, but in reality there's not much there. I understand that, and I
really like micro-PRs. They're easier to review. But I think we should take
this opportunity to build on that work, or get rid of it and other PRs like
it.

5. Anshum, did you have some other objectives or aims back in December when
you added the SolrSingleThreaded annotation? Does anyone else have any
objectives around supporting this effort? Can I and others be of assistance
in helping you get to where you intended to go with the PR? I really want
to be helpful.

6. Most of all, maybe Anshum or someone here can actually explain to me
what  SolrSingleThreaded actually does that is needed? I have thoughts and
have used annotations like this before with Spring, but in my very limited
knowledge about what this PR was intended to accomplish, this
implementation does nothing.
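
To make point 6 concrete, here is a minimal sketch of what I mean, assuming
the annotation is a plain marker with no annotation processor or static
analysis rule behind it. The names SingleThreadedMarker, Counter and
MarkerDemo are hypothetical and are not taken from the Solr codebase.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class MarkerDemo {
    // The only thing a bare marker annotation offers at runtime is this lookup.
    static boolean isMarkedSingleThreaded(Class<?> type) {
        return type.isAnnotationPresent(SingleThreadedMarker.class);
    }

    public static void main(String[] args) throws InterruptedException {
        Counter counter = new Counter();
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter.increment();
            }
        };
        // Two threads share the annotated class; nothing prevents this.
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("marked: " + isMarkedSingleThreaded(Counter.class));
        System.out.println("count (may be below 200000): " + counter.get());
    }
}

// Hypothetical marker annotation (assumption: no processor enforces it).
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface SingleThreadedMarker {}

// Annotated as single-threaded, but the unsynchronized increment can still
// be called from several threads at once.
@SingleThreadedMarker
final class Counter {
    private int count;

    void increment() {
        count++;
    }

    int get() {
        return count;
    }
}
```

Unless a checker, a test rule, or at least a review convention reads the
marker, it is documentation only. That may be a perfectly fine goal, but
then the PR title and change notes should say so.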

Please advise,

Marcus




-- 
Marcus Eagan


Re: RoadMap?

2020-08-11 Thread Marcus Eagan
>>>>> > Autoscaling, if required, should not be a part of Solr
>>>>> >
>>>>> > On Fri, Jul 3, 2020 at 5:48 PM Jan Høydahl wrote:
>>>>> >>
>>>>> >> +1
>>>>> >>
>>>>> >> Why don’t we make a Roadmap wiki page as Cassandra suggests, and
>>>>> >> indicate what major things need to happen when.
>>>>> >> Perhaps if we can get the Solr TLP and git-split ball rolling as a
>>>>> >> pre-9.0 task, then perhaps 8.8 could be the last joint release (6.6, 7.7,
>>>>> >> 8.8 hehe)?
>>>>> >> That would enable Lucene to ship 9.0 without waiting for a ton of
>>>>> >> alpha-quality Solr features, and Solr could have its own Roadmap wiki.
>>>>> >>
>>>>> >> Jan
>>>>> >>
>>>>> >> 3. jul. 2020 kl. 09:19 skrev Dawid Weiss:
>>>>> >>
>>>>> >>> I totally expect some things to bubble up when we try to release
>>>>> >>> with Gradle, the tarball being one. I don’t think that’s a very big issue,
>>>>> >>> but if you have lots of “not very big” issues they do add up.
>>>>> >>
>>>>> >> Adding a tarball is literally 3-5 lines of code (you add a task
>>>>> >> that builds a tarball or a zip file from the outputs of solr/packaging
>>>>> >> toDir task)... The bigger issue with gradle is that somebody has to step up
>>>>> >> and try to identify any other issues and/or missing bits when trying to do
>>>>> >> a full release cycle.
>>>>> >>
>>>>> >> D.
>>>>> >>
>>>>> >
>>>>> > --
>>>>> > -
>>>>> > Noble Paul
>>>>> >
>>>>> > -
>>>>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>>>>> >
>>>>>
>>>>> -
>>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>>>
>>>>
>>>>
>>>
>>> --
>>> http://www.needhamsoftware.com (work)
>>> http://www.the111shift.com (play)
>>>
>>>
>>>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>
-- 
Marcus Eagan


Performance in Solr 9 / Java 11

2020-08-10 Thread Marcus Eagan
In my IDE, I have a few profiling tools that I bounce between. I started
using them in my work at Lucidworks and continue to use them in my current
work today. I suspect there may be some performance improvements in Java 11
that we can exploit further. I'm curious whether there has been any
investigation, possibly by Mark Miller or @u...@thetaphi.de, into
performance improvements specific to the newer version of Java in master.
There are some obvious ones that we get for free, like a better GC, but I
am curious about prior work in this area before publishing anything that
might be redundant or irrelevant.

Best,

-- 
Marcus Eagan


Re: Standardize Leading Test or Trailing Test

2020-08-07 Thread Marcus Eagan
Any -1's? I just learned more about them from David Smiley earlier today
and I recognize they are for the rare case, but I want to ensure consensus
before I spend time on this tomorrow evening. I hadn't remembered seeing
them in real life except for on the Lucene/Solr separation vote.

Of course, if someone else does this before I get to it today or tomorrow
morning, great. :)

Marcus

On Thu, Aug 6, 2020 at 8:05 AM Gus Heck  wrote:

> Yeah  +1 for standardization +1.01 if it lands on *Test :) but that's just
> my personal preference.
>
> On Thu, Aug 6, 2020 at 9:17 AM Adrien Grand  wrote:
>
>> +1
>>
>> On Thu, Aug 6, 2020 at 1:54 PM Erick Erickson 
>> wrote:
>>
>>> This has amused/annoyed me for a long time. But did I ever have the
>>> energy to tackle it? N.
>>>
>>> +1
>>>
>>> > On Aug 6, 2020, at 1:50 AM, Tomás Fernández Löbbe <
>>> tomasflo...@gmail.com> wrote:
>>> >
>>> > +1
>>> >
>>> > On Wed, Aug 5, 2020 at 10:37 PM David Smiley 
>>> wrote:
>>> > +1 to standardize on something.
>>> > This has been brought up before: LUCENE-8626 -- credit to Christine
>>> who started the work.  I recommend resuming the discussion there.
>>> >
>>> > ~ David
>>> >
>>> >
>>> > On Thu, Aug 6, 2020 at 12:08 AM Anshum Gupta 
>>> wrote:
>>> > +1
>>> >
>>> > Thanks for bringing this up, Marcus. Standardizing this is really
>>> great.
>>> >
>>> > On Wed, Aug 5, 2020 at 8:01 PM Marcus Eagan 
>>> wrote:
>>> > Hi community, what do you think of a small effort to standardize on
>>> leading with the word "Test" or trailing with the word "Test" in the repo?
>>> Most projects do one or the other and it has an impact on developer
>>> productivity. I'll explain my use case:
>>> >
>>> > I'm working on a class and I want to modify the test to evaluate my
>>> changes. If the class is named in a standard way, I can find it easily. If
>>> it is not, it's fine. There are typically two options. I consider it
>>> distracting and sloppy. Distraction is expensive for developers. I have
>>> some more important efforts that I'm working on, but if the community
>>> agrees on this one, I can open a ticket and submit a PR. Let me know what
>>> you think.
>>> >
>>> > Hoping to make the project more developer friendly.
>>> >
>>> > --
>>> > Marcus Eagan
>>> >
>>> >
>>> >
>>> > --
>>> > Anshum Gupta
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>>
>> --
>> Adrien
>>
>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


-- 
Marcus Eagan


First Issue Label

2020-08-05 Thread Marcus Eagan
Community,

In the vein of making the project more developer friendly, I think we should
create a first issue label. In my experience, that label has been a great way
to get newcomers involved in projects new to them.

I've seen it in a number of Apache projects that I have contributed to,
proprietary projects, and in CNCF projects.

Please let me know what you think about a first issue label to make it
easier for people outside the community who want to join to do so.

Thanks,
-- 
Marcus Eagan


Standardize Leading Test or Trailing Test

2020-08-05 Thread Marcus Eagan
Hi community, what do you think of a small effort to standardize on leading
with the word "Test" or trailing with the word "Test" in the repo? Most
projects do one or the other and it has an impact on developer
productivity. I'll explain my use case:

I'm working on a class and I want to modify the test to evaluate my
changes. If the class is named in a standard way, I can find it easily. If
it is not, it's fine. There are typically two options. I consider it
distracting and sloppy. Distraction is expensive for developers. I have
some more important efforts that I'm working on, but if the community
agrees on this one, I can open a ticket and submit a PR. Let me know what
you think.

Hoping to make the project more developer friendly.

-- 
Marcus Eagan


Re: SolrClient and making requests asynchronously

2020-08-03 Thread Marcus Eagan
+1  I and a few of my former Lucidworks colleagues know the pain of the
synchronous client very well.

Thanks for the first step toward an improved design Dat and  David Smiley!
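
For anyone skimming the quoted thread below, the "default approach that
simply uses an Executor" could look roughly like this. It is only a sketch
under my own assumptions: BlockingClient and requestAsync are hypothetical
names and are not the SolrJ API, and the String request/response types stand
in for the real ones.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class AsyncSketch {
    // Hypothetical stand-in for a blocking client; not the SolrJ API.
    interface BlockingClient {
        String request(String query) throws Exception;

        // Default async variant: run the blocking call on a caller-supplied
        // Executor and report the outcome through a CompletableFuture.
        default CompletableFuture<String> requestAsync(String query, Executor executor) {
            CompletableFuture<String> future = new CompletableFuture<>();
            executor.execute(() -> {
                try {
                    future.complete(request(query));
                } catch (Exception e) {
                    future.completeExceptionally(e);
                }
            });
            return future;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            BlockingClient client = query -> "response for " + query;
            String result = client.requestAsync("q=*:*", pool)
                    .orTimeout(5, TimeUnit.SECONDS) // requires Java 9 or later
                    .thenApply(String::toUpperCase) // caller-side callback
                    .get();
            System.out.println(result);
        } finally {
            pool.shutdown();
        }
    }
}
```

On Atri's timeout question: orTimeout only fails the returned future. It
does not cancel the blocking call still running on the executor, so a real
implementation would also need to enforce the timeout on the request itself.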

Marcus
On Sun, Aug 2, 2020 at 07:47 Atri Sharma  wrote:

> I am +1 to this approach. Some thoughts inline.
>
> How would query timeout be respected in this approach?
>
>>
>
>  The default approach might be configured to throw
>> UnsupportedOperationException, or perhaps might simply use an Executor to
>> get it done in an obvious way (assuming we can get ahold of an Executor
>> somewhere?).
>>
>
> Would that mean that we use an Executor to execute a single thread?
>
>
> >>
> CompletableFuture, and which merely takes the SolrRequest parameter and
> nothing else.  Alternatively the client could supply a CompletableFuture
> parameter that Solr will call complete() on etc. but that seems a bit less
> natural to the notion of a method that returns it's results, albeit with a
> wrapper.
>
> I would think that we allow users to specify their callback. One of the
> advantages of AsyncListener is that a custom implementation can allow
> users to handle the behaviour of timeout and other events. We should retain
> that behaviour.
> --
> Regards,
>
> Atri
> Apache Concerted
>
-- 
Marcus Eagan


Re: Deprecate Schemaless Mode?

2020-08-03 Thread Marcus Eagan
Typo*, I meant deprecate vs. remove, which we obviously cannot do.

On Mon, Aug 3, 2020 at 12:05 Marcus Eagan  wrote:

> Furthermore, just to be clear, I opened a discussion about deprecating and
> not replacing schemaless mode for two reasons:
>
> (1) the pain it has inflicted on Solr users and reputation of Solr —
> deprecation logs speak volumes.
> (2) to get a better understanding of what engineers and others in the
> community use Schemaless for to inform the design of its replacement.
>
> At no point would I argue that a feature like Schemaless is unnecessary.
> It was the first way I used Solr (the second time around, the first time I
> tried it I built my company using Elasticsearch because of other issues). I
> am of the opinion that "Schemaless Mode" has done more harm to Solr than
> good in my limited experience with the feature. Heck, *I've only been
> consulting for a week and it has already come up*. I acknowledge a very
> small sample size.
>
> I am curious as to your thoughts on these points. There are not lots of
> people getting started with Solr today relative to the other solutions on
> the market regardless of what you might assume. I am here to see if I can
> change that through a shift in how we approach user experience and the
> knowledge requisite to operate a production cluster. I hope no one takes
> offense to me challenging how some community members think about what is a
> good feature vs what is a bad one.
>
> Marcus
>
>
>
>
> On Mon, Aug 3, 2020 at 11:44 AM Marcus Eagan 
> wrote:
>
>> I know a person using it in production today. It's causing problems. They
>> could abandon Solr altogether. It seems like a schema creation wizard is
>> the right getting started motion if we know that schemaless doesn't do what
>> people think it does. It's misleading. It's also a false representation of
>> how easy it is to get started when compared to other solutions on the
>> market. If schemaless is about supporting new use/adoption, it should actually
>> help that more than hurt it.
>>
>> That's why I raised it. Re-branding this feature is like pig-lipsticking
>> in my mind, but you all have more experience than me and are committers. I
>> will defer to you for now. I am in favor of renaming the feature as the
>> minimum change that should happen.
>>
>> Schemaless mode makes sense in a world where schemas are largely opaque
>> like IoT-telemetry or server logs. When you are searching data primarily
>> for human consumption, I think it is just a headache in a bottle. In the
>> cases of CSV and TSV, customers know the schema. I like to approach
>> designing software such that no one ever needs to talk to me. No
>> firefighting consulting is necessary, and you can skim the docs and proceed
>> safely. I understand others may not feel that way, but it is the future of
>> software.
>>
>> I encourage everyone here to try the newer search systems that have been
>> released and are growing rapidly to inform your opinions on this topic. I
>> am doing that because it is the concrete poured to build the common ground
>> of the future.
>>
>> On Mon, Aug 3, 2020 at 11:40 AM Anshum Gupta 
>> wrote:
>>
>>> +1 Jason.
>>>
>>> Here's some context on how this came into being.
>>>
>>> Users find it difficult to understand and create a basic schema when
>>> just trying out Solr. This mode was supposed to help them bootstrap, and
>>> once they had a better understanding of how things worked, they'd tune it
>>> before using the schema in production.
>>> This did improve the OTB experience for new users, but a lot of people
>>> abused this convenience and used this in production causing issues.
>>>
>>> As Jason mentioned, we'd better serve our users if we left this feature
>>> for the getting started experience and add warnings (in UI and responses?)
>>> so users would know what they are doing when they take this to production.
>>>
>>> This feature isn't trappy unless people use it in ways it was not
>>> intended to be used in. We just need to warn and educate people better.
>>>
>>> On Mon, Aug 3, 2020 at 10:41 AM Jason Gerlowski 
>>> wrote:
>>>
>>>> > Is anyone on this list using schemaless mode in production or have
>>>> you tried to?
>>>>
>>>> Schemaless mode is one of a group of Solr features present for
>>>> convenience but not intended for production usage.  It's in the same
>>>> boat as "bin/post", and SolrCell, and others.  These features do cause
>

Re: Deprecate Schemaless Mode?

2020-08-03 Thread Marcus Eagan
Furthermore, just to be clear, I opened a discussion about deprecating and
not replacing schemaless mode for two reasons:

(1) the pain it has inflicted on Solr users and reputation of Solr —
deprecation logs speak volumes.
(2) to get a better understanding of what engineers and others in the
community use Schemaless for to inform the design of its replacement.

At no point would I argue that a feature like Schemaless is unnecessary. It
was the first way I used Solr (the second time around, the first time I
tried it I built my company using Elasticsearch because of other issues). I
am of the opinion that "Schemaless Mode" has done more harm to Solr than
good in my limited experience with the feature. Heck, *I've only been
consulting for a week and it has already come up*. I acknowledge a very
small sample size.

I am curious as to your thoughts on these points. There are not lots of
people getting started with Solr today relative to the other solutions on
the market regardless of what you might assume. I am here to see if I can
change that through a shift in how we approach user experience and the
knowledge requisite to operate a production cluster. I hope no one takes
offense to me challenging how some community members think about what is a
good feature vs what is a bad one.

Marcus




On Mon, Aug 3, 2020 at 11:44 AM Marcus Eagan  wrote:

> I know a person using it in production today. It's causing problems. They
> could abandon Solr altogether. It seems like a schema creation wizard is
> the right getting started motion if we know that schemaless doesn't do what
> people think it does. It's misleading. It's also a false representation of
> how easy it is to get started when compared to other solutions on the
> market. If schemaless is about supporting new use/adoption, it should actually
> help that more than hurt it.
>
> That's why I raised it. Re-branding this feature is like pig-lipsticking
> in my mind, but you all have more experience than me and are committers. I
> will defer to you for now. I am in favor of renaming the feature as the
> minimum change that should happen.
>
> Schemaless mode makes sense in a world where schemas are largely opaque
> like IoT-telemetry or server logs. When you are searching data primarily
> for human consumption, I think it is just a headache in a bottle. In the
> cases of CSV and TSV, customers know the schema. I like to approach
> designing software such that no one ever needs to talk to me. No
> firefighting consulting is necessary, and you can skim the docs and proceed
> safely. I understand others may not feel that way, but it is the future of
> software.
>
> I encourage everyone here to try the newer search systems that have been
> released and are growing rapidly to inform your opinions on this topic. I
> am doing that because it is the concrete poured to build the common ground
> of the future.
>
> On Mon, Aug 3, 2020 at 11:40 AM Anshum Gupta 
> wrote:
>
>> +1 Jason.
>>
>> Here's some context on how this came into being.
>>
>> Users find it difficult to understand and create a basic schema when just
>> trying out Solr. This mode was supposed to help them bootstrap, and once
>> they had a better understanding of how things worked, they'd tune it before
>> using the schema in production.
>> This did improve the OTB experience for new users, but a lot of people
>> abused this convenience and used this in production causing issues.
>>
>> As Jason mentioned, we'd better serve our users if we left this feature
>> for the getting started experience and add warnings (in UI and responses?)
>> so users would know what they are doing when they take this to production.
>>
>> This feature isn't trappy unless people use it in ways it was not
>> intended to be used in. We just need to warn and educate people better.
>>
>> On Mon, Aug 3, 2020 at 10:41 AM Jason Gerlowski 
>> wrote:
>>
>>> > Is anyone on this list using schemaless mode in production or have you
>>> tried to?
>>>
>>> Schemaless mode is one of a group of Solr features present for
>>> convenience but not intended for production usage.  It's in the same
>>> boat as "bin/post", and SolrCell, and others.  These features do cause
>>> headaches when users ignore the documented restrictions and use them
>>> for more than prototyping.  But at the same time they're super
>>> valuable for these sort of demo-ing or getting-started use cases.  An
>>> easy getting-started experience is important, and schemaless et al
>>> serve a mostly positive role in that.
>>>
>>> I think we'd better serve our users if we left schemaless
>>> in/undeprecated, and instead focused 

Re: Deprecate Schemaless Mode?

2020-08-03 Thread Marcus Eagan
I know a person using it in production today. It's causing problems. They
could abandon Solr altogether. It seems like a schema creation wizard is
the right getting started motion if we know that schemaless doesn't do what
people think it does. It's misleading. It's also a false representation of
how easy it is to get started when compared to other solutions on the
market. If schemaless is about supporting new use/adoption, it should actually
help that more than hurt it.

That's why I raised it. Re-branding this feature is like pig-lipsticking in
my mind, but you all have more experience than me and are committers. I
will defer to you for now. I am in favor of renaming the feature as the
minimum change that should happen.

Schemaless mode makes sense in a world where schemas are largely opaque
like IoT-telemetry or server logs. When you are searching data primarily
for human consumption, I think it is just a headache in a bottle. In the
cases of CSV and TSV, customers know the schema. I like to approach
designing software such that no one ever needs to talk to me. No
firefighting consulting is necessary, and you can skim the docs and proceed
safely. I understand others may not feel that way, but it is the future of
software.

I encourage everyone here to try the newer search systems that have been
released and are growing rapidly to inform your opinions on this topic. I
am doing that because it is the concrete poured to build the common ground
of the future.

On Mon, Aug 3, 2020 at 11:40 AM Anshum Gupta  wrote:

> +1 Jason.
>
> Here's some context on how this came into being.
>
> Users find it difficult to understand and create a basic schema when just
> trying out Solr. This mode was supposed to help them bootstrap, and once
> they had a better understanding of how things worked, they'd tune it before
> using the schema in production.
> This did improve the OTB experience for new users, but a lot of people
> abused this convenience and used this in production causing issues.
>
> As Jason mentioned, we'd better serve our users if we left this feature
> for the getting started experience and add warnings (in UI and responses?)
> so users would know what they are doing when they take this to production.
>
> This feature isn't trappy unless people use it in ways it was not intended
> to be used in. We just need to warn and educate people better.
>
> On Mon, Aug 3, 2020 at 10:41 AM Jason Gerlowski 
> wrote:
>
>> > Is anyone on this list using schemaless mode in production or have you
>> tried to?
>>
>> Schemaless mode is one of a group of Solr features present for
>> convenience but not intended for production usage.  It's in the same
>> boat as "bin/post", and SolrCell, and others.  These features do cause
>> headaches when users ignore the documented restrictions and use them
>> for more than prototyping.  But at the same time they're super
>> valuable for these sort of demo-ing or getting-started use cases.  An
>> easy getting-started experience is important, and schemaless et al
>> serve a mostly positive role in that.
>>
>> I think we'd better serve our users if we left schemaless
>> in/undeprecated, and instead focused on making it harder to
>> (unknowingly) use them in ways contrary to community recommendations.
>> Add louder warnings in the documentation (where not already present).
>> Add warnings to the Solr logs the first time these features are used.
>> Disable them by default (where that makes sense).  Taken to the
>> extreme, we could even add a section into Solr's response that lists
>> non-production features used in serving a given request.
>>
>> There are lots of ways to address the "feature X is trappy" problem
>> without removing X altogether.
>>
>> On Mon, Aug 3, 2020 at 11:33 AM Marcus Eagan 
>> wrote:
>> >
>> > Community,
>> >
>> > There are many of us that have had to deal with the pain of managing
>> the schemaless mode of operation in Solr. I'm curious to get others'
>> thoughts about how well it is working for them and if they would like to
>> continue to use it.
>> >
>> > I for one don't think Schemaless works as intended and favor
>> deprecating it and replacing it with something more usable, but I am sure others
>> have thoughts here.
>> >
>> > Is anyone on this list using schemaless mode in production or have you
>> tried to?
>> >
>> > A preliminary discussion has occurred in this Jira ticket:
>> https://issues.apache.org/jira/browse/SOLR-14701
>> >
>> > Thank you all,
>> >
>> > Marcus Eagan
>> >
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>
> --
> Anshum Gupta
>


-- 
Marcus Eagan


Re: [VOTE] Release Lucene/Solr 8.6.1 RC1

2020-08-03 Thread Marcus Eagan
Revising my vote to 0 (non-binding) because of collection creation failures
cited by others. Until I can confirm or deny that things function properly,
I will abstain. I will wait to run my test until after the ticket is
addressed.

Marcus

On Mon, Aug 3, 2020 at 6:14 AM Atri Sharma  wrote:

> Can confirm the failure seen by Gus. Changing my vote to 0.
>
> On Mon, 3 Aug 2020 at 17:50, Jan Høydahl  wrote:
>
>> I keep getting HDFS related test failures and timeouts, so I cannot vote.
>> (macOS)
>>
>> Jan
>>
>> > 3. aug. 2020 kl. 09:43 skrev Atri Sharma :
>> >
>> > +1
>> >
>> > SUCCESS! [1:27:33.14892]
>> >
>> > On Mon, Aug 3, 2020 at 1:11 PM Marcus Eagan 
>> wrote:
>> >>
>> >> Community,
>> >>
>> >> Results from my local smoke test (Mac OS 10.15.5 | 1.8.0_265, x86_64:
>> "Amazon Corretto 8"):
>> >>
>> >> SUCCESS! [1:33:51.132902]
>> >>
>> >> I'm still going through and checking a few aforementioned issues, but
>> non-binding +1 from me. Wanted to share with the community because most
>> probably are not running Corretto.
>> >>
>> >> Hope this helps.
>> >>
>> >> marcus
>> >>
>> >>
>> >>
>> >> On Sun, Aug 2, 2020 at 9:36 PM Gus Heck  wrote:
>> >>>
>> >>> Digging a little further, I notice that the deployment that had the
>> error has this autoscaling (whereas the working deployment does not).
>> >>>
>> >>> "cluster-preferences":[
>> >>>   {"minimize":"cores", "precision":1},
>> >>>   {"maximize":"freedisk"}],
>> >>> "cluster-policy":[
>> >>>   {"replica":"<2", "shard":"#EACH", "node":"#ANY", "strict":"false"},
>> >>>   {"replica":"#EQUAL", "node":"#ANY", "strict":"false"},
>> >>>   {"cores":"#EQUAL", "node":"#ANY", "strict":"false"}],
>> >>>
>> >>> So this may raise the question of whether or not we have an issue
>> upgrading an 8.6.0 version to 8.6.1... also, not very familiar with
>> autoscaling's error messages, but it kinda looks dodgy too since "one extra
>> tag in cores" appears to be referring to a cores attribute that has only
>> one value, but no idea yet if I'm reading that error message right.  ... As
>> to how I got that, I'm pretty sure it was one of the times when my edits to
>> cloud.sh errored and  tried to deploy an existing branch_8x build. Zk
>> probably was not clean, and retained the old config.
>> >>>
>> >>> Tomorrow I'll try to deploy 8_6_0 and then upgrade it to 8_6_1 (late
>> here now) and see if I get a similar result.
>> >>>
>> >>>
>> >>> On Sun, Aug 2, 2020 at 11:59 PM Gus Heck  wrote:
>> >>>>
>> >>>> I Got:
>> >>>>
>> >>>> Ubuntu 18.04.4 LTS:
>> >>>> SUCCESS! [0:53:02.203047]
>> >>>> Mac OS 10.13:
>> >>>>
>> >>>> SUCCESS! [1:00:57.938586]
>> >>>>
>> >>>>
>> >>>> BUT... when I deployed the tarball locally and tried to create a
>> collection (single shard, _default config, via the solr UI), I got:
>> >>>>
>> >>>>
>> >>>> 2020-08-03 02:55:15.585 INFO  (zkCallback-14-thread-1) [   ]
>> o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (1) -> (2)
>> >>>>
>> >>>> 2020-08-03 02:55:21.288 INFO  (zkCallback-14-thread-1) [   ]
>> o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (2) -> (3)
>> >>>>
>> >>>> 2020-08-03 02:55:26.705 INFO  (zkCallback-14-thread-1) [   ]
>> o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (3) -> (4)
>> >>>>
>> >>>> 2020-08-03 03:00:07.521 INFO
>> (OverseerThreadFactory-22-thread-1-processing-n:192.168.2.106:8981_solr)
>> [   ] o.a.s.c.a.c.CreateCollectionCmd Create collection test
>

Deprecate Schemaless Mode?

2020-08-03 Thread Marcus Eagan
Community,

There are many of us that have had to deal with the pain of managing the
schemaless mode of operation in Solr. I'm curious to get others' thoughts
about how well it is working for them and if they would like to continue to
use it.

I for one don't think Schemaless works as intended and favor deprecating it
and replacing it with something more usable, but I am sure others have
thoughts here.

Is anyone on this list using schemaless mode in production or have you
tried to?

A preliminary discussion has occurred in this Jira ticket:
https://issues.apache.org/jira/browse/SOLR-14701

Thank you all,

Marcus Eagan


Re: [VOTE] Release Lucene/Solr 8.6.1 RC1

2020-08-03 Thread Marcus Eagan
va.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>>
>> at
>> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>>
>> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>>
>> at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>>
>> at
>> org.apache.solr.client.solrj.cloud.autoscaling.Policy.<init>(Policy.java:144)
>>
>> at
>> org.apache.solr.client.solrj.cloud.autoscaling.AutoScalingConfig.getPolicy(AutoScalingConfig.java:372)
>>
>> at
>> org.apache.solr.cloud.api.collections.Assign.usePolicyFramework(Assign.java:300)
>>
>> at
>> org.apache.solr.cloud.api.collections.Assign.usePolicyFramework(Assign.java:277)
>>
>> at
>> org.apache.solr.cloud.api.collections.Assign$AssignStrategyFactory.create(Assign.java:661)
>>
>> at
>> org.apache.solr.cloud.api.collections.CreateCollectionCmd.buildReplicaPositions(CreateCollectionCmd.java:415)
>>
>> at
>> org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:192)
>>
>> ... 6 more
>>
>>
>> However, when I re-did everything a second time to double check creating
>> a collection worked just fine and now I can't seem to reproduce this.
>>
>>
>> If nobody else gets this I'll figure I just managed to mangle something
>> while working on https://issues.apache.org/jira/browse/SOLR-14704
>>
>>
>> But others should perhaps give it a spin to look for this, So I'll give
>> it +0
>>
>>
>>
>> On Fri, Jul 31, 2020 at 8:14 AM Noble Paul  wrote:
>>
>>> success SUCCESS! [1:03:21.786536]
>>> Ubuntu 20.04 LTS
>>>
>>>
>>> On Fri, Jul 31, 2020 at 7:34 AM Houston Putman 
>>> wrote:
>>> >
>>> > Due to the weekend the vote will be open until 2020-08-03 22:00 UTC.
>>> That's 96 hours, and two business days.
>>> >
>>> > I can leave the vote open for longer if people want an additional
>>> business day, but will end it on Monday otherwise.
>>> >
>>> > - Houston
>>> >
>>> >
>>> >
>>> > On Thu, Jul 30, 2020 at 5:07 PM Houston Putman <
>>> houstonput...@gmail.com> wrote:
>>> >>
>>> >> Please vote for release candidate 1 for Lucene/Solr 8.6.1
>>> >>
>>> >> The artifacts can be downloaded from:
>>> >>
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.6.1-RC1-reva32a3ac4e43f629df71e5ae30a3330be94b095f2
>>> >>
>>> >> You can run the smoke tester directly with this command:
>>> >>
>>> >> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>> >>
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.6.1-RC1-reva32a3ac4e43f629df71e5ae30a3330be94b095f2
>>> >>
>>> >> The vote will be open for at least 72 hours i.e. until 2020-08-02
>>> 22:00 UTC.
>>> >>
>>> >> [ ] +1  approve
>>> >> [ ] +0  no opinion
>>> >> [ ] -1  disapprove (and reason why)
>>> >>
>>> >> Here is my +1
>>>
>>>
>>>
>>> --
>>> -
>>> Noble Paul
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>>
>> --
>> http://www.needhamsoftware.com (work)
>> http://www.the111shift.com (play)
>>
>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


-- 
Marcus Eagan


Re: 8.6.1 Release

2020-08-01 Thread Marcus Eagan
>> >>>>>>>>>>>> I realized after I went looking for it in the new docs
>>>>>> that I didn't actually push the doc changes for MOVEREPLICA to 8x (had
>>>>>> intended to verify that nothing differed in 8x before pushing). Doing 
>>>>>> that
>>>>>> now, and suspect that we probably want to include it for 8.6.1
>>>>>> >>>>>>>>>>>>
>>>>>> >>>>>>>>>>>> On Sat, Jul 25, 2020 at 10:20 PM Varun Thacker <
>>>>>> va...@vthacker.in> wrote:
>>>>>> >>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>> > does the default autoscaling policy stay once they
>>>>>> have upgraded?
>>>>>> >>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>> Looking at
>>>>>> https://github.com/apache/lucene-solr/commit/8e0eae2/#diff-de88ca16848af57d2474e04e26ea462cR90
>>>>>> , it seems like just upgrading to Solr 8.6.1 will be enough.
>>>>>> >>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>> On Fri, Jul 24, 2020 at 1:51 PM Houston Putman <
>>>>>> houstonput...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> So it looks like we all agree that 8.6.1 should be cut
>>>>>> to fix this issue.
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> Unless there's any alternatives proposed, early next
>>>>>> week I'm going to push my branch_8_6 that has the offending commits
>>>>>> reverted and some additional documentation on reverting the defaulted
>>>>>> autoscaling policy from 8.6.0.
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> Then I'll start the release process for 8.6.1.
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> Please speak up if there is any disagreement.
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> - Houston
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> On Wed, Jul 22, 2020 at 4:40 PM Ishan Chattopadhyaya <
>>>>>> ichattopadhy...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>> Absolutely, Ilan! Good idea. I initially hesitated in
>>>>>> doing so because Andrzej had a workaround in mind for them, so I thought 
>>>>>> it
>>>>>> would be better if he did this. But, it makes sense to inform them of the
>>>>>> issue right away anyway.
>>>>>> >>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>> On Wed, 22 Jul, 2020, 11:42 pm Ilan Ginzburg, <
>>>>>> ilans...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>> Shouldn't we add a note right away to 8.6 notifying
>>>>>> of the issue?
>>>>>> >>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>> Le mer. 22 juil. 2020 à 20:08, Atri Sharma <
>>>>>> a...@apache.org> a écrit :
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>> +1, thanks Houston.
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>> On Wed, Jul 22, 2020 at 10:51 PM Houston Putman <
>>>>>> houstonput...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>>>> >
>>>>>> >>>>>>>>>>>>>>>>> > If we agree that this warrants a patch release, I
>>>>>> volunteer to do the release.
>>>>>> >>>>>>>>>>>>>>>>> >
>>>>>> >>>>>>>>>>>>>>>>> > I do think a patch release is reasonable even if
>>>>>> users have to take an action when upgrading from 8.6.0. I imagine most
>>>>>> users haven't upgraded to 8.6.0 yet, so if we make the patch now we will
>>>>>> make life easier for everyone that upgrades between now and when 8.7 is
>>>>>> released.
>>>>>> >>>>>>>>>>>>>>>>> >
>>>>>> >>>>>>>>>>>>>>>>> > On Wed, Jul 22, 2020 at 12:50 PM Atri Sharma <
>>>>>> a...@apache.org> wrote:
>>>>>> >>>>>>>>>>>>>>>>> >>
>>>>>> >>>>>>>>>>>>>>>>> >> Ignore this, I misread your email.
>>>>>> >>>>>>>>>>>>>>>>> >>
>>>>>> >>>>>>>>>>>>>>>>> >> On Wed, Jul 22, 2020 at 9:11 PM Atri Sharma <
>>>>>> a...@apache.org> wrote:
>>>>>> >>>>>>>>>>>>>>>>> >> >
>>>>>> >>>>>>>>>>>>>>>>> >> > Should we not revert the change so that users
>>>>>> upgrading from 8.6 to
>>>>>> >>>>>>>>>>>>>>>>> >> > 8.6.1 get the earlier default policy?
>>>>>> >>>>>>>>>>>>>>>>> >> >
>>>>>> >>>>>>>>>>>>>>>>> >> > On Wed, Jul 22, 2020 at 9:09 PM Houston Putman
>>>>>>  wrote:
>>>>>> >>>>>>>>>>>>>>>>> >> > >
>>>>>> >>>>>>>>>>>>>>>>> >> > > +1
>>>>>> >>>>>>>>>>>>>>>>> >> > >
>>>>>> >>>>>>>>>>>>>>>>> >> > > Question about the change. Since this patch
>>>>>> added a default autoscaling policy, if users upgrade to 8.6 and then 
>>>>>> 8.6.1,
>>>>>> does the default autoscaling policy stay once they have upgraded? If so 
>>>>>> we
>>>>>> probably want to include instructions in the release notes on how to fix
>>>>>> this issue once upgrading.
>>>>>> >>>>>>>>>>>>>>>>> >> > >
>>>>>> >>>>>>>>>>>>>>>>> >> > > - Houston
>>>>>> >>>>>>>>>>>>>>>>> >> > >
>>>>>> >>>>>>>>>>>>>>>>> >> > > On Wed, Jul 22, 2020 at 1:53 AM Ishan
>>>>>> Chattopadhyaya  wrote:
>>>>>> >>>>>>>>>>>>>>>>> >> > >>
>>>>>> >>>>>>>>>>>>>>>>> >> > >> Hi,
>>>>>> >>>>>>>>>>>>>>>>> >> > >> There was a performance regression
>>>>>> identified in 8.6.0 release due to SOLR-12845. I think it is serious 
>>>>>> enough
>>>>>> to warrant an immediate bug fix release.
>>>>>> >>>>>>>>>>>>>>>>> >> > >>
>>>>>> >>>>>>>>>>>>>>>>> >> > >> I propose an 8.6.1 release. Unfortunately,
>>>>>> I'll be unable to volunteer for this release owing to some other
>>>>>> commitments, however Andrzej mentioned in Slack that he might be able to
>>>>>> volunteer for this post 27th.
>>>>>> >>>>>>>>>>>>>>>>> >> > >>
>>>>>> >>>>>>>>>>>>>>>>> >> > >> Are there any thoughts/concerns regarding
>>>>>> this?
>>>>>> >>>>>>>>>>>>>>>>> >> > >> Regards,
>>>>>> >>>>>>>>>>>>>>>>> >> > >> Ishan
>>>>>> >>>>>>>>>>>>>>>>> >> >
>>>>>> >>>>>>>>>>>>>>>>> >> > --
>>>>>> >>>>>>>>>>>>>>>>> >> > Regards,
>>>>>> >>>>>>>>>>>>>>>>> >> >
>>>>>> >>>>>>>>>>>>>>>>> >> > Atri
>>>>>> >>>>>>>>>>>>>>>>> >> > Apache Concerted
>>>>>> >>>>>>>>>>>>>>>>> >>
>>>>>> >>>>>>>>>>>>>>>>> >>
>>>>>> >>>>>>>>>>>>>>>>> >>
>>>>>> >>>>>>>>>>>>>>>>> >> --
>>>>>> >>>>>>>>>>>>>>>>> >> Regards,
>>>>>> >>>>>>>>>>>>>>>>> >>
>>>>>> >>>>>>>>>>>>>>>>> >> Atri
>>>>>> >>>>>>>>>>>>>>>>> >> Apache Concerted
>>>>>> >>>>>>>>>>>>>>>>> >>
>>>>>> >>>>>>>>>>>>>>>>> >>
>>>>>> -
>>>>>> >>>>>>>>>>>>>>>>> >> To unsubscribe, e-mail:
>>>>>> dev-unsubscr...@lucene.apache.org
>>>>>> >>>>>>>>>>>>>>>>> >> For additional commands, e-mail:
>>>>>> dev-h...@lucene.apache.org
>>>>>> >>>>>>>>>>>>>>>>> >>
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>> --
>>>>>> >>>>>>>>>>>>>>>>> Regards,
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>> Atri
>>>>>> >>>>>>>>>>>>>>>>> Apache Concerted
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> -
>>>>>> >>>>>>>>>>>>>>>>> To unsubscribe, e-mail:
>>>>>> dev-unsubscr...@lucene.apache.org
>>>>>> >>>>>>>>>>>>>>>>> For additional commands, e-mail:
>>>>>> dev-h...@lucene.apache.org
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>
>>>>>> >>>>>>>>>>>> --
>>>>>> >>>>>>>>>>>> http://www.needhamsoftware.com (work)
>>>>>> >>>>>>>>>>>> http://www.the111shift.com (play)
>>>>>> >>>>>>>>>>
>>>>>> >>>>>>>>>>
>>>>>> >>>>>>>>>>
>>>>>> >>>>>>>>>> --
>>>>>> >>>>>>>>>> http://www.needhamsoftware.com (work)
>>>>>> >>>>>>>>>> http://www.the111shift.com (play)
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>> --
>>>>>> >>>>>>>>> http://www.needhamsoftware.com (work)
>>>>>> >>>>>>>>> http://www.the111shift.com (play)
>>>>>> >>>>>>>>
>>>>>> >>>>>>>>
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> --
>>>>>> >>>>> http://www.needhamsoftware.com (work)
>>>>>> >>>>> http://www.the111shift.com (play)
>>>>>> >>>>
>>>>>> >>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -
>>>>>> Noble Paul
>>>>>>
>>>>>> -
>>>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>>>>
>>>>>> --
>>>> Regards,
>>>>
>>>> Atri
>>>> Apache Concerted
>>>>
>>> --
Marcus Eagan


Re: [VOTE] Solr to become a top-level Apache project (TLP)

2020-07-24 Thread Marcus Eagan
mitter base, git repositories and other managerial aspects can be
> > worked out during the process if the decision passes.
> >
> > Please indicate one of the following (see [1] for guidelines):
> >
> > [ ] +1 - yes, I vote for the proposal
> > [ ] -1 - no, I vote against the proposal
> >
> > Please note that anyone in the Lucene+Solr community is invited to
> > express their opinion, though only Lucene+Solr committers cast binding
> > votes (indicate non-binding votes in your reply, please).
> >
> > The vote will be active for a week to give everyone a chance to read
> > and cast a vote.
> >
> > Dawid
> >
> > [1] https://www.apache.org/foundation/voting.html
> > [2]
> https://lists.apache.org/thread.html/rfae2440264f6f874e91545b2030c98e7b7e3854ddf090f7747d338df%40%3Cdev.lucene.apache.org%3E
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-- 
Marcus Eagan


Re: Solr Admin UI Refresh 2020

2020-04-29 Thread Marcus Eagan
Gus, sorry for the delay. I did not see this email, probably because of a
draft. There is a module bundler known as Webpack that has a feature known as
Hot Module Replacement that enables hot reloads and preserves state when most
changes are made. It can be used with any framework. See here:
https://webpack.js.org/concepts/hot-module-replacement/
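
As a rough sketch of what that looks like in practice (the file name, entry
path, and port below are assumptions for illustration, not taken from any of
the projects discussed in this thread), a development config that turns the
feature on through webpack-dev-server might be:

```typescript
// webpack.config.ts -- hypothetical file; the entry path and port are placeholders.
const config = {
  mode: "development",
  entry: "./src/main.js", // assumed entry point of the UI
  devServer: {
    hot: true, // ask webpack-dev-server to apply Hot Module Replacement
    port: 8080, // assumed port
  },
};

export default config;
```

Whether state actually survives a particular edit still depends on the
framework integration sitting on top of this.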

For Vue, there is the vue-cli, the equivalent of the Angular CLI that provides
ng serve.
https://vue-loader.vuejs.org/guide/hot-reload.html#state-preservation-rules

For React, there are many options for enabling hot reload.

Thanks Gus,

Marcus

On Mon, Apr 27, 2020 at 6:26 AM Gus Heck  wrote:

> From this one article (probably should read several) angular has a higher
> learning curve, but a better tooling system, and is better for a one-page
> app style - from the descriptions I tend to suspect there's also a greater
> chance of react/vue turning into mush in the hands of less experienced UI
> developers... perhaps more rope allowed? Just an impression though. So
> perhaps one of the first things to consider is do we expect to build it as
> a single page or multi-page app?
>
> One Q for anyone who has used them: does react or vue have the equivalent
> of ng serve? It's nice to be able to iterate the ui quickly. That's one
> thing I do like about it.
>
>
> https://www.udemy.com/blog/react-js-vs-angular-vs-vue-js-which-is-the-best-javascript-framework/
>
> -Gus
>
> On Thu, Apr 23, 2020 at 3:27 PM Marcus Eagan 
> wrote:
>
>> Jan, I think this is a great option. Angular's future is probably < 5
>> years. More and more people move to React and Vue every day it seems. Very
>> occasionally, people move to StencilJS (which we should avoid like the
>> plague). At least, these insights are what the market and my friends tell me.
>>
>> thanks,
>>
>> Marcus
>>
>> On Thu, Apr 23, 2020 at 1:06 AM Jan Høydahl 
>> wrote:
>>
>>> The old UI is not perfect, and it cannot do everything, i.e. you cannot
>>> choose replica types when creating a collection, it has no support for
>>> authoring Autoscaling rules, no support for doing backup/restore etc. The
>>> dream would of course be that the new UI is so sweet to work with that
>>> almost any new feature added to the backend also gets a simple UI
>>> component, if not in the same release or by the same developer, then at
>>> least a release or two further down the road.
>>>
>>> Why I put the three github links in this thread was not because it is a
>>> thorough survey of everything that is out there. It was 30min search on
>>> github just to find prior art. I expect whoever drives the SIP work to do
>>> even more thorough investigation as to what our options are. Not being a
>>> full-time frontend engineer, I cannot speak for Angular’s future. It would
>>> of course be a pity if we choose the framework that happens to be the next
>>> one to go extinct :) so Vue feels more secure than Angular but what do I
>>> know?
>>>
>>> Finally, I hope the SIP, in addition to surveying existing code and
>>> possible frameworks, should also list the deployment options we have (jetty
>>> inside solr app, jetty separate webapp, jetty, different port, standalone
>>> node, package system…), and recommend one. Perhaps lay out a roadmap
>>> starting with drop-in replacement as first step?
>>>
>>>
>>> When it comes to the question of governance and where the code lives, I
>>> came to think of another option we have not yet discussed. It goes like
>>> this:
>>>
>>> Solr Admin UI could be another sub project of Lucene/Solr, as PyLucene
>>> is, with its own git repo and its own releases.
>>> The release would be a ZIP with the compiled UI and instructions on how
>>> to deploy as part of Solr or standalone.
>>> Then Solr core build would pull in the zip as a dependency and serve it
>>> up as today or in a new way.
>>> Benefits of this solution include:
>>> + The core codebase remains Java only (with versioned UI as dependency)
>>> + The UI is still governed and released by the project
>>> + The UI gets its own repo, and its own GitHub home, which can maybe
>>> easier attract PRs from newcomers?
>>> + UI can do releases out-of-band. Solr can stay on same UI-release for
>>> several versions if it wants, and we can publish UI-only bugfixes for XSS
>>> or whatever, with easy patch instructions
>>> + End users may choose to disable UI in solr.in.sh and deploy the UI
>>> release standalone, i.e. more choice
>>> The downsides are of course

Re: Solr Admin UI Refresh 2020

2020-04-23 Thread Marcus Eagan
Jan, I think this is a great option. Angular's future is probably < 5
years. More and more people move to React and Vue every day it seems. Very
occasionally, people move to StencilJS (which we should avoid like the
plague). At least, these insights are what the market and my friends tell me.

thanks,

Marcus

On Thu, Apr 23, 2020 at 1:06 AM Jan Høydahl  wrote:

> The old UI is not perfect, and it cannot do everything, i.e. you cannot
> choose replica types when creating a collection, it has no support for
> authoring Autoscaling rules, no support for doing backup/restore etc. The
> dream would of course be that the new UI is so sweet to work with that
> almost any new feature added to the backend also gets a simple UI
> component, if not in the same release or by the same developer, then at
> least a release or two further down the road.
>
> Why I put the three github links in this thread was not because it is a
> thorough survey of everything that is out there. It was 30min search on
> github just to find prior art. I expect whoever drives the SIP work to do
> even more thorough investigation as to what our options are. Not being a
> full-time frontend engineer, I cannot speak for Angular’s future. It would
> of course be a pity if we choose the framework that happens to be the next
> one to go extinct :) so Vue feels more secure than Angular but what do I
> know?
>
> Finally, I hope the SIP, in addition to surveying existing code and
> possible frameworks, should also list the deployment options we have (jetty
> inside solr app, jetty separate webapp, jetty, different port, standalone
> node, package system…), and recommend one. Perhaps lay out a roadmap
> starting with drop-in replacement as first step?
>
>
> When it comes to the question of governance and where the code lives, I
> came to think of another option we have not yet discussed. It goes like
> this:
>
> Solr Admin UI could be another sub project of Lucene/Solr, as PyLucene is,
> with its own git repo and its own releases.
> The release would be a ZIP with the compiled UI and instructions on how to
> deploy as part of Solr or standalone.
> Then Solr core build would pull in the zip as a dependency and serve it up
> as today or in a new way.
> Benefits of this solution include:
> + The core codebase remains Java only (with versioned UI as dependency)
> + The UI is still governed and released by the project
> + The UI gets its own repo, and its own GitHub home, which can maybe
> easier attract PRs from newcomers?
> + UI can do releases out-of-band. Solr can stay on same UI-release for
> several versions if it wants, and we can publish UI-only bugfixes for XSS
> or whatever, with easy patch instructions
> + End users may choose to disable UI in solr.in.sh and deploy the UI
> release standalone, i.e. more choice
> The downsides are of course
> - More moving parts, separate releases, more bureaucracy
> - May be a problem to obtain enough PMC members vote (ask Andi), need a
> separate smoketester?
> - UI is not tested as part of Solr, breaking changes may be merged
> unnoticed. (This can be mitigated with e2e Jenkins tests)
>
> There may be downsides I have not thought of, but interested in your
> thoughts
>
> Jan
>
> 22. apr. 2020 kl. 18:54 skrev Marcus Eagan :
>
> Nothing is ever finished.
>
> Yet I agree with you one hundred percent. I don't like the idea that you
> can delete a collection from the UI, but that's just me. I didn't want to
> get into the discussion until I was further along.
>
> Marcus
>
> On Wed, Apr 22, 2020 at 9:50 AM Gus Heck  wrote:
>
>> Re Parity: If we are going to drop a feature from the UI it should be an
>> explicit decision to do so. I think until we have either
>>
>> 1) a decision to drop (expressed in the SIP or Jira)
>> 2) a working re-implementation
>>
>> For each feature the new UI should not be considered finished.
>>
>>
>> On Wed, Apr 22, 2020 at 12:14 PM Marcus Eagan 
>> wrote:
>>
>>> Parity is not necessarily a good thing. Maintaining most of the existing
>>> functionality is good. I would recommend some of it is removed because it’s
>>> dangerous.
>>>
>>> My choice to pick up the project I did was because it was the most
>>> updated of them all and I can change one variable rather than multiple
>>> because the Angular name was the same.
>>>
>>> Thanks Jan and Houston. I need to try the project out later tonight.
>>>
>>> I’m happy to contribute to any of them.
>>> As for the test, the scaffolds are the first step to adding tests. They
>>> can be added relatively quickly but even the scaffolds ensure the
>>> c

Re: Solr Admin UI Refresh 2020

2020-04-22 Thread Marcus Eagan
Nothing is ever finished.

Yet I agree with you one hundred percent. I don't like the idea that you
can delete a collection from the UI, but that's just me. I didn't want to
get into the discussion until I was further along.

Marcus

On Wed, Apr 22, 2020 at 9:50 AM Gus Heck  wrote:

> Re Parity: If we are going to drop a feature from the UI it should be an
> explicit decision to do so. I think until we have either
>
> 1) a decision to drop (expressed in the SIP or Jira)
> 2) a working re-implementation
>
> For each feature the new UI should not be considered finished.
>
>
> On Wed, Apr 22, 2020 at 12:14 PM Marcus Eagan 
> wrote:
>
>> Parity is not necessarily a good thing. Maintaining most of the existing
>> functionality is good. I would recommend some of it is removed because it’s
>> dangerous.
>>
>> My choice to pick up the project I did was because it was the most
>> updated of them all and I can change one variable rather than multiple
>> because the Angular name was the same.
>>
>> Thanks Jan and Houston. I need to try the project out later tonight.
>>
>> I’m happy to contribute to any of them.
>> As for the test, the scaffolds are the first step to adding tests. They
>> can be added relatively quickly but even the scaffolds ensure the
>> components compile, which is a step forward for the project where it is
>> today. There are a couple months of work on it. I did not intend for
>> anything to be merged yet, but for code to exist for people to test it out.
>>
>> The Angular project is a pain.  However, I will keep the project up and
>> work to support whatever the community needs.
>>
>> My goal is to get an updated Solr Admin UI in the project to help
>> developers get started, and improve security. Whichever one the community
>> decides on, I will work with everyone to help get it done.
>>
>> Thank you,
>>
>> Marcus
>>
>>
>> On Wed, Apr 22, 2020 at 08:24 Houston Putman 
>> wrote:
>>
>>> I agree with Jan, I think we need some discussions on alternatives and
>>> the pros/cons of each before we invest in implementing a solution.
>>>
>>> I personally have the most experience with React and don't know much
>>> about other frameworks, but I'd love to understand why Angular or Vue.JS
>>> might be a better option.
>>> (Having an implementation to start with is definitely a plus, and it
>>> doesn't look like there is one for React)
>>>
>>> Yasa looks more complete than savantly-net/solr-admin to me, and
>>> definitely warrants at least a look.
>>>
>>> - Houston
>>>
>>> On Wed, Apr 22, 2020 at 7:27 AM Jan Høydahl 
>>> wrote:
>>>
>>>> I spun up the proposed app for the first time today, clicked around and
>>>> browsed the code.
>>>> It appears to me that the app is far less developed than I thought,
>>>> which agrees well with only 12 commits.
>>>> The collections component only knows how to list collections, the
>>>> «create collection» button is dead etc.
>>>> It will be a HUGE effort to bring this to feature parity with current
>>>> AdminUI.
>>>> I cannot find any substantial tests other than scaffold tests verifying
>>>> that ng components are created ok. Could be because there is not that much
>>>> functionality to really test yet?
>>>>
>>>> Which makes me question again the perhaps premature decision on using
>>>> this repo as a basis.
>>>>
>>>>
>>>> So I did a quick test with the VueJS based YASA app (
>>>> https://github.com/kezhenxu94/yasa) and got up and running in a few
>>>> minutes, with a much more feature complete UI.
>>>> It is also a complete drop-in replacement for the old UI, once
>>>> compiled. Downside is that it is older and needs upgrade and to play well
>>>> with CPF.
>>>> So let’s step back for a while and not make hasty choices too early. I
>>>> worked with VueJS in a project and really like it. Vue is the 2nd coolest
>>>> kid on the block after React
>>>> according to https://2019.stateofjs.com/front-end-frameworks/ and
>>>> Wikipedia just chose it over React for their UI makeover.
>>>>
>>>> Anyway, if you want to test YASA locally, here is a 3-minute recipe
>>>> for doing so:
>>>>
>>>>https://gist.github.com/janhoy/0f7cddc0d92f9e53db7522fe93ff7003
>>>>
>>>> To me, this looks like a much better st

Re: Solr Admin UI Refresh 2020

2020-04-22 Thread Marcus Eagan
Also, as an aside, if the community decides to go with Vue, let me know. I
like it a lot more and have had more experience with it than with Angular in
its new incarnation. I would be happy to help update YASA. If it is more
feature complete, I won't get into all the UX considerations as much as I
would update the code.

Happy to support, looking for input. Singular goal in my mind - replace
current UI for safety of users, especially new ones.

Thanks,

On Wed, Apr 22, 2020 at 9:14 AM Marcus Eagan  wrote:

> Parity is not necessarily a good thing. Maintaining most of the existing
> functionality is good. I would recommend some of it is removed because it’s
> dangerous.
>
> My choice to pick up the project I did was because it was the most updated
> of them all and I can change one variable rather than multiple because the
> Angular name was the same.
>
> Thanks Jan and Houston. I need to try the project out later tonight.
>
> I’m happy to contribute to any of them.
> As for the test, the scaffolds are the first step to adding tests. They
> can be added relatively quickly but even the scaffolds ensure the
> components compile, which is a step forward for the project where it is
> today. There are a couple months of work on it. I did not intend for
> anything to be merged yet, but for code to exist for people to test it out.
>
> The Angular project is a pain.  However, I will keep the project up and
> work to support whatever the community needs.
>
> My goal is to get an updated Solr Admin UI in the project to help
> developers get started, and improve security. Whichever one the community
> decides on, I will work with everyone to help get it done.
>
> Thank you,
>
> Marcus
>
>
> On Wed, Apr 22, 2020 at 08:24 Houston Putman 
> wrote:
>
>> I agree with Jan, I think we need some discussions on alternatives and
>> the pros/cons of each before we invest in implementing a solution.
>>
>> I personally have the most experience with React and don't know much
>> about other frameworks, but I'd love to understand why Angular or Vue.JS
>> might be a better option.
>> (Having an implementation to start with is definitely a plus, and it
>> doesn't look like there is one for React)
>>
>> Yasa looks more complete than savantly-net/solr-admin to me, and
>> definitely warrants at least a look.
>>
>> - Houston
>>
>> On Wed, Apr 22, 2020 at 7:27 AM Jan Høydahl 
>> wrote:
>>
>>> I spun up the proposed app for the first time today, clicked around and
>>> browsed the code.
>>> It appears to me that the app is far less developed than I thought,
>>> which agrees well with only 12 commits.
>>> The collections component only knows how to list collections, the
>>> «create collection» button is dead etc.
>>> It will be a HUGE effort to bring this to feature parity with current
>>> AdminUI.
>>> I cannot find any substantial tests other than scaffold tests verifying
>>> that ng components are created ok. Could be because there is not that much
>>> functionality to really test yet?
>>>
>>> Which makes me question again the perhaps premature decision on using
>>> this repo as a basis.
>>>
>>>
>>> So I did a quick test with the VueJS based YASA app (
>>> https://github.com/kezhenxu94/yasa) and got up and running in a few
>>> minutes, with a much more feature complete UI.
>>> It is also a complete drop-in replacement for the old UI, once compiled.
>>> Downside is that it is older and needs upgrade and to play well with CPF.
>>> So let’s step back for a while and not make hasty choices too early. I
>>> worked with VueJS in a project and really like it. Vue is the 2nd coolest
>>> kid on the block after React
>>> according to https://2019.stateofjs.com/front-end-frameworks/ and
>>> Wikipedia just chose it over React for their UI makeover.
>>>
>>> Anyway, if you want to test YASA locally, here is a 3-minute recipe for
>>> doing so:
>>>
>>>https://gist.github.com/janhoy/0f7cddc0d92f9e53db7522fe93ff7003
>>>
>>> To me, this looks like a much better starting point, and the project has
>>> 2x the contributors, 3x the commits and an MIT license :-)
>>>
>>> Another reason to spend more time in SIP mode, iterating on what is best
>>> for the project, what alternatives were considered and why certain
>>> frameworks were selected/rejected etc etc, before spending much more time
>>> coding.
>>>
>>> Jan
>>>
>>> > 22. apr. 2020 kl. 11:30 skrev Noble Paul :
>>> >
>>>

Re: Solr Admin UI Refresh 2020

2020-04-22 Thread Marcus Eagan
Parity is not necessarily a good thing. Maintaining most of the existing
functionality is good. I would recommend some of it is removed because it’s
dangerous.

My choice to pick up the project I did was because it was the most updated
of them all and I can change one variable rather than multiple because the
Angular name was the same.

Thanks Jan and Houston. I need to try the project out later tonight.

I’m happy to contribute to any of them.
As for the test, the scaffolds are the first step to adding tests. They can
be added relatively quickly but even the scaffolds ensure the components
compile, which is a step forward for the project where it is today. There are
a couple months of work on it. I did not intend for anything to be merged
yet, but for code to exist for people to test it out.

The Angular project is a pain.  However, I will keep the project up and
work to support whatever the community needs.

My goal is to get an updated Solr Admin UI in the project to help
developers get started, and improve security. Whichever one the community
decides on, I will work with everyone to help get it done.

Thank you,

Marcus


On Wed, Apr 22, 2020 at 08:24 Houston Putman 
wrote:

> I agree with Jan, I think we need some discussions on alternatives and the
> pros/cons of each before we invest in implementing a solution.
>
> I personally have the most experience with React and don't know much about
> other frameworks, but I'd love to understand why Angular or Vue.JS might be
> a better option.
> (Having an implementation to start with is definitely a plus, and it
> doesn't look like there is one for React)
>
> Yasa looks more complete than savantly-net/solr-admin to me, and
> definitely warrants at least a look.
>
> - Houston
>
> On Wed, Apr 22, 2020 at 7:27 AM Jan Høydahl  wrote:
>
>> I spun up the proposed app for the first time today, clicked around and
>> browsed the code.
>> It appears to me that the app is far less developed than I thought, which
>> agrees well with only 12 commits.
>> The collections component only knows how to list collections, the «create
>> collection» button is dead etc.
>> It will be a HUGE effort to bring this to feature parity with current
>> AdminUI.
>> I cannot find any substantial tests other than scaffold tests verifying
>> that ng components are created ok. Could be because there is not that much
>> functionality to really test yet?
>>
>> Which makes me question again the perhaps premature decision on using
>> this repo as a basis.
>>
>>
>> So I did a quick test with the VueJS based YASA app (
>> https://github.com/kezhenxu94/yasa) and got up and running in a few
>> minutes, with a much more feature complete UI.
>> It is also a complete drop-in replacement for the old UI, once compiled.
>> Downside is that it is older and needs upgrade and to play well with CPF.
>> So let’s step back for a while and not make hasty choices too early. I
>> worked with VueJS in a project and really like it. Vue is the 2nd coolest
>> kid on the block after React
>> according to https://2019.stateofjs.com/front-end-frameworks/ and
>> Wikipedia just chose it over React for their UI makeover.
>>
>> Anyway, if you want to test YASA locally, here is a 3-minute recipe for
>> doing so:
>>
>>https://gist.github.com/janhoy/0f7cddc0d92f9e53db7522fe93ff7003
>>
>> To me, this looks like a much better starting point, and the project has
>> 2x the contributors, 3x the commits and an MIT license :-)
>>
>> Another reason to spend more time in SIP mode, iterating on what is best
>> for the project, what alternatives were considered and why certain
>> frameworks were selected/rejected etc etc, before spending much more time
>> coding.
>>
>> Jan
>>
>> > 22. apr. 2020 kl. 11:30 skrev Noble Paul :
>> >
>> > As I see it all the 12 commits to that project is made by Jeremy
>> Branham.
>> >
>> > Kudos to Jan Høydahl to save Solr from potential lawsuit &
>> > embarrassment in the future. Awesome, I guess you are a part time
>> > private detective
>> >
>> > On Wed, Apr 22, 2020 at 7:25 PM Ishan Chattopadhyaya
>> >  wrote:
>> >>
>> >>> The shoulders of the homie that put that scaffold together are broad!
>> Props to him.
>> >> Marcus, are you working with Jeremy Branham on this?
>> >>
>> >> On Wed, 22 Apr, 2020, 2:25 pm Jan Høydahl, 
>> wrote:
>> >>>
>> >>> WRT legal aspect, the original git repo
>> https://github.com/savantly-net/solr-admin does not say anything about
>> copyright or license. I en

Re: Solr Admin UI Refresh 2020

2020-04-20 Thread Marcus Eagan
SIP here:
https://cwiki.apache.org/confluence/display/SOLR/Updated+Solr+Admin+UI

On Mon, Apr 20, 2020 at 9:32 AM Gus Heck  wrote:

> If Marcus has ability to edit existing pages, why don't we create the
> empty page for him and sort out access granting issues later. I'd hate for
> this much needed SIP to bog down on a technical issue.
>
> -Gus
>
> On Mon, Apr 20, 2020 at 7:10 AM Jan Høydahl  wrote:
>
>> Please retry. I gave edit access to confluence user id ‘marcussorealheis’.
>>
>> Jan
>>
>> 20. apr. 2020 kl. 01:30 skrev Marcus Eagan :
>>
>> I do need help. I am not allowed to create a SIP. Or, I have been unable
>> to create a SIP in three previous attempts.
>>
>> Marcus
>>
>> On Sun, Apr 19, 2020 at 3:45 AM Jan Høydahl 
>> wrote:
>>
>>> Thanks. The PR is useful for people to try out the UI. But for overall
>>> replacement plan I really think we need that SIP, do you still need help
>>> with Confluence?
>>>
>>> Jan Høydahl
>>>
>>> 19. apr. 2020 kl. 06:30 skrev Marcus Eagan :
>>>
>>> 
>>> I hope everybody is enjoying their weekend and is in good health.
>>>
>>> Filed a Jira, made a PR:
>>> https://issues.apache.org/jira/browse/SOLR-14414
>>>
>>> Still, quite a bit more work to do. I need to spend some time on the
>>> query screen, improving the cluster view, and adding alias, and more tests.
>>> The last three should be pretty easy. Would probably spend a couple weeks
>>> working on style as well, but that can be an ongoing effort, just as making
>>> it package manager compatible and using v2 commands. There are also many
>>> areas where the use of TypeScript or the Angular framework will improve.
>>> That will come with time, some involvement from a few Angular wizards, and
>>> a bit of research.
>>>
>>> Thank you everyone,
>>>
>>> Marcus
>>>
>>> On Tue, Apr 14, 2020 at 2:01 PM Marcus Eagan 
>>> wrote:
>>>
>>>>
>>>> Gus, At first it looked like it let me, but today it seemed that it did
>>>> not allow me to create a SIP.
>>>>
>>>>
>>>>
>>>> On Tue, Apr 14, 2020 at 8:57 AM Gus Heck  wrote:
>>>>
>>>>> First, sorry you’re having problems with Confluence. I suspect the
>>>>>> issue is permissions. There are only two groups allowed to add pages to 
>>>>>> the
>>>>>> SOLR space, “lucene” and “lucene-pmc”. I believe these correspond to ASF
>>>>>> LDAP groups, which would mean they include committers and PMC members 
>>>>>> only.
>>>>>> We can grant you individual permission to add/edit pages, however; we’ve
>>>>>> done this for a handful of others. I could do this for you, just ping me
>>>>>> off-thread so I can confirm your username.
>>>>>>
>>>>>
>>>>> If  that is the issue, then we should advertise clearly on the SIP
>>>>> page that non-committers wishing to create a SIP should request access on
>>>>> this list. That's probably a good mechanic because it ensures that contact
>>>>> with this list is established first. And it sounds like confluence is
>>>>> allowing him to start editing and then throwing away all his work on
>>>>> submission which is VERY bad behavior... Possibly an INFRA ticket if that
>>>>> is indeed the case...
>>>>>
>>>>> @Marcus can you confirm that you tried to create a page, it appeared
>>>>> to let you and then threw out your work on submission? (or am I reading
>>>>> what you wrote wrong?)
>>>>>
>>>>
>>>>
>>>> --
>>>> Marcus Eagan
>>>>
>>>>
>>>
>>> --
>>> Marcus Eagan
>>>
>>>
>>
>> --
>> Marcus Eagan
>>
>>
>>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


-- 
Marcus Eagan


Re: Solr Admin UI Refresh 2020

2020-04-19 Thread Marcus Eagan
I do need help. I am not allowed to create a SIP. Or, I have been unable to
create a SIP in three previous attempts.

Marcus

On Sun, Apr 19, 2020 at 3:45 AM Jan Høydahl  wrote:

> Thanks. The PR is useful for people to try out the UI. But for overall
> replacement plan I really think we need that SIP, do you still need help
> with Confluence?
>
> Jan Høydahl
>
> 19. apr. 2020 kl. 06:30 skrev Marcus Eagan :
>
> 
> I hope everybody is enjoying their weekend and is in good health.
>
> Filed a Jira, made a PR: https://issues.apache.org/jira/browse/SOLR-14414
>
> Still, quite a bit more work to do. I need to spend some time on the query
> screen, improving the cluster view, and adding alias, and more tests. The
> last three should be pretty easy. Would probably spend a couple weeks
> working on style as well, but that can be an ongoing effort, just as making
> it package manager compatible and using v2 commands. There are also many
> areas where the use of TypeScript or the Angular framework will improve.
> That will come with time, some involvement from a few Angular wizards, and
> a bit of research.
>
> Thank you everyone,
>
> Marcus
>
> On Tue, Apr 14, 2020 at 2:01 PM Marcus Eagan 
> wrote:
>
>>
>> Gus, At first it looked like it let me, but today it seemed that it did
>> not allow me to create a SIP.
>>
>>
>>
>> On Tue, Apr 14, 2020 at 8:57 AM Gus Heck  wrote:
>>
>>> First, sorry you’re having problems with Confluence. I suspect the issue
>>>> is permissions. There are only two groups allowed to add pages to the SOLR
>>>> space, “lucene” and “lucene-pmc”. I believe these correspond to ASF LDAP
>>>> groups, which would mean they include committers and PMC members only. We
>>>> can grant you individual permission to add/edit pages, however; we’ve done
>>>> this for a handful of others. I could do this for you, just ping me
>>>> off-thread so I can confirm your username.
>>>>
>>>
>>> If  that is the issue, then we should advertise clearly on the SIP page
>>> that non-committers wishing to create a SIP should request access on this
>>> list. That's probably a good mechanic because it ensures that contact with
>>> this list is established first. And it sounds like confluence is allowing
>>> him to start editing and then throwing away all his work on submission
>>> which is VERY bad behavior... Possibly an INFRA ticket if that is
>>> indeed the case...
>>>
>>> @Marcus can you confirm that you tried to create a page, it appeared to
>>> let you and then threw out your work on submission? (or am I reading what
>>> you wrote wrong?)
>>>
>>
>>
>> --
>> Marcus Eagan
>>
>>
>
> --
> Marcus Eagan
>
>

-- 
Marcus Eagan


Re: Solr Admin UI Refresh 2020

2020-04-18 Thread Marcus Eagan
I hope everybody is enjoying their weekend and is in good health.

Filed a Jira, made a PR: https://issues.apache.org/jira/browse/SOLR-14414

Still, quite a bit more work to do. I need to spend some time on the query
screen, improving the cluster view, and adding alias, and more tests. The
last three should be pretty easy. Would probably spend a couple weeks
working on style as well, but that can be an ongoing effort, just as making
it package manager compatible and using v2 commands. There are also many
areas where the use of TypeScript or the Angular framework will improve.
That will come with time, some involvement from a few Angular wizards, and
a bit of research.

Thank you everyone,

Marcus

On Tue, Apr 14, 2020 at 2:01 PM Marcus Eagan  wrote:

>
> Gus, At first it looked like it let me, but today it seemed that it did
> not allow me to create a SIP.
>
>
>
> On Tue, Apr 14, 2020 at 8:57 AM Gus Heck  wrote:
>
>> First, sorry you’re having problems with Confluence. I suspect the issue
>>> is permissions. There are only two groups allowed to add pages to the SOLR
>>> space, “lucene” and “lucene-pmc”. I believe these correspond to ASF LDAP
>>> groups, which would mean they include committers and PMC members only. We
>>> can grant you individual permission to add/edit pages, however; we’ve done
>>> this for a handful of others. I could do this for you, just ping me
>>> off-thread so I can confirm your username.
>>>
>>
>> If  that is the issue, then we should advertise clearly on the SIP page
>> that non-committers wishing to create a SIP should request access on this
>> list. That's probably a good mechanic because it ensures that contact with
>> this list is established first. And it sounds like confluence is allowing
>> him to start editing and then throwing away all his work on submission
>> which is VERY bad behavior... Possibly an INFRA ticket if that is
>> indeed the case...
>>
>> @Marcus can you confirm that you tried to create a page, it appeared to
>> let you and then threw out your work on submission? (or am I reading what
>> you wrote wrong?)
>>
>
>
> --
> Marcus Eagan
>
>

-- 
Marcus Eagan


Re: Solr Admin UI Refresh 2020

2020-04-14 Thread Marcus Eagan
Gus, At first it looked like it let me, but today it seemed that it did not
allow me to create a SIP.



On Tue, Apr 14, 2020 at 8:57 AM Gus Heck  wrote:

> First, sorry you’re having problems with Confluence. I suspect the issue
>> is permissions. There are only two groups allowed to add pages to the SOLR
>> space, “lucene” and “lucene-pmc”. I believe these correspond to ASF LDAP
>> groups, which would mean they include committers and PMC members only. We
>> can grant you individual permission to add/edit pages, however; we’ve done
>> this for a handful of others. I could do this for you, just ping me
>> off-thread so I can confirm your username.
>>
>
> If  that is the issue, then we should advertise clearly on the SIP page
> that non-committers wishing to create a SIP should request access on this
> list. That's probably a good mechanic because it ensures that contact with
> this list is established first. And it sounds like confluence is allowing
> him to start editing and then throwing away all his work on submission
> which is VERY bad behavior... Possibly an INFRA ticket if that is
> indeed the case...
>
> @Marcus can you confirm that you tried to create a page, it appeared to
> let you and then threw out your work on submission? (or am I reading what
> you wrote wrong?)
>


-- 
Marcus Eagan


Re: Solr Admin UI Refresh 2020

2020-04-14 Thread Marcus Eagan
Cassandra, Jan, et al.,

Looks like I cannot create a SIP. There is a Jira already, but perhaps I
will create a new Jira that answers these two very questions:

My wish for the SIP is to be very clear on exactly how the UI is proposed
to be shipped and whether any manual steps are needed to enable/install. Also try
to clarify whether you propose a big-bang replace or whether the old and
new UI will need to co-exist.

@ Yeikel - The UI will not introduce changes to the APIs. One learning I have
garnered from this work is that we ought to consider, as a community,
modifying some APIs. The UI will be maintained, for sure.

Thanks, Marcus

On Tue, Apr 14, 2020 at 7:35 AM yeikel valdes  wrote:

> And whether they can coexist at all. From the previous emails it seemed
> to me this change will introduce changes to the existing APIs and that
> means that compatibility will be broken. I am also not sure if we will
> maintain the existing UI after that.
>
>
>  On Tue, 14 Apr 2020 09:51:23 -0400 * jan@cominvent.com
>  * wrote 
>
> Yes, please go through SIP before inviting us to review the actual PR.
> And once you re-work the SIP page based on feedback and converge towards
> consensus, I’d recommend starting a formal VOTE thread for the SIP, as
> outlined in the SIP page. That way people can have a chance to speak up on
> the overall design so that does not become a topic once again in the PR.
>
> My wish for the SIP is to be very clear on exactly how the UI is proposed
> to be shipped and whether any manual steps are needed to enable/install. Also try
> to clarify whether you propose a big-bang replace or whether the old and
> new UI will need to co-exist.
>
> Jan
>
> 14. apr. 2020 kl. 15:29 skrev Cassandra Targett :
>
> Marcus,
>
> A couple thoughts…
>
> First, sorry you’re having problems with Confluence. I suspect the issue
> is permissions. There are only two groups allowed to add pages to the SOLR
> space, “lucene” and “lucene-pmc”. I believe these correspond to ASF LDAP
> groups, which would mean they include committers and PMC members only. We
> can grant you individual permission to add/edit pages, however; we’ve done
> this for a handful of others. I could do this for you, just ping me
> off-thread so I can confirm your username.
>
> I don’t know what happened to the pages you tried to create. You can try
> to see if they are hidden somehow, maybe in your profile’s “Recently worked
> on” section (
> https://cwiki.apache.org/confluence/dashboard.action#recently-worked).
> When you’re on that page, the menu on the left also shows a “Saved for
> later” page that maybe has them.
>
> At any rate, I wouldn’t suggest putting all the design
> decisions/discussion into a PR. That’s what we traditionally used Jira for.
> We decided to use SIPs to make navigating design discussions in Jira easier
> - decisions would get lost in comments - and to forestall someone doing
> possibly wasted work until they have some degree of community agreement for
> what they hope to do. If you put it in a PR, by its very nature those
> decisions would already be made and that could lead to significant amounts
> of rework.
>
> The SIP process is to write up the SIP and then file a Jira issue, and you
> can’t file a PR without a Jira, so you’ll need a Jira issue for this work
> anyway. If you really can’t write a SIP, then the fallback is Jira, not a
> PR.
>
> Re: the API issues - those should also go into Jira. I can’t see your list
> so far since it requires permission, but I’ll just say I would not
> recommend dropping your whole doc into a single Jira. They’ll need to be
> broken out into separate ones (and it’s pretty likely issues already exist
> for at least some of the things). I know that sounds like a PITA, but we
> don’t track issues in personal Google docs, we use Jira. And if some of the
> items are possibly controversial, then individual items for each one will
> allow us to work through the controversies and not stall progress on things
> that are not problematic (if anyone is so inclined to work on those).
>
> Hopefully you don’t mind the unsolicited advice on this, just trying to
> help you understand some of our ways of doing things.
>
> Cassandra
> On Apr 14, 2020, 3:18 AM -0500, Marcus Eagan ,
> wrote:
>
> Mike and Gus,
>
> I tried to share my SIP after writing it, and then it disappeared. I also
> tried to write some and save it, but then it disappeared. I also tried to
> write it outside the wiki and paste it again.
>
> I'm not sure what's going on, but I will try again tomorrow I suppose. If
> I fail again, my SIP will probably come in the form of a Pull Request, with
> videos, screenshots, a list of todos, and helpful documentation so that
> people ca

Re: Solr Admin UI Refresh 2020

2020-04-14 Thread Marcus Eagan
Mike and Gus,

I tried to share my SIP after writing it, and then it disappeared. I also
tried to write some and save it, but then it disappeared. I also tried to
write it outside the wiki and paste it again.

I'm not sure what's going on, but I will try again tomorrow I suppose. If I
fail again, my SIP will probably come in the form of a Pull Request, with
videos, screenshots, a list of todos, and helpful documentation so that
people can get started easily.

From this project, I have also uncovered what I would consider some serious
issues with the Solr API and some other issues that I dig through Jira for
before I bother the community. I will do my best to document them all and
open tickets. Sometimes I rage quit, walk away, come back and forget. I
will try to fix some of them, but some of them that I have looked into look
like worm cans. It might be good if a few people here are prepared to
explore fixing the API in a few areas, but I'm not sure about the best way
to go about doing that without angering lots of people. *Seeking advice on how
to approach the API challenges.* I don't just want to start complaining
about things, nor do I want to take on everything. A small group from a
variety of different organizations to discuss some of these challenges
might be most helpful. On the flip side, I suspect things will improve in
other areas as a result.

You can view progress here:
https://drive.google.com/file/d/1NgO34DRp1llMp3EwJQcKBn2LM4r5PD39/view?usp=sharing

I've mostly managed to get all the requests sorted, and rendering of the
data (mostly). I didn't get much time today as I was very busy at work. And
tomorrow I get my quarantine checkup and will probably be sleepy early. But
by the end of next weekend, I should have finished everything but the query
editor and all the other associated collections screens which I will bucket
under collections. That current dropdown menu is less intuitive than
optimal but I understand.

Soliciting feedback now because once I open the PR I am hoping we have
agreed on as many things as possible. If no one suggests query view designs,
I will mock some up and share them. Again, I repeat, not a designer. But I
did take a couple of classes, and have built products from start to finish
mostly on an island so I will be able to manage. Looking for feedback that
the community might be getting from its clients. Obviously, I cannot
accommodate all the requests, but if something is recurring, I want to
support the users to support the adoption of this new UI.

As if I needed to say it, and many of you probably suspected this was
coming: Erick this is a lot of work. Holy shit. And not just because I'm a
product manager. It would be challenging for anyone. This project, like all
projects, has some skeletons. From my perspective, that's great because
there are lots of things to improve. Also, the project is still pretty
amazing.

Thanks again for your help everyone,

Marcus


On Mon, Apr 13, 2020 at 9:55 AM Marcus Eagan  wrote:

> Gus,
>
> SIP sounds good. I will share.
>
> Marcus
>
>
>
> On Mon, Apr 13, 2020 at 09:13 Gus Heck  wrote:
>
>> Maybe start collecting Design and Design choices in a SIP? This
>> discussion has been good and there seems to be consensus that we want a new
>> UI, we want it to be a package and we want the package to be available by
>> default and well tested. "Package" seems to imply that it can be added or
>> removed or replaced or an alternative UI installed along side of it. If we
>> got all of those things done this would be amazingly awesome :)
>>
>> Another thing that would be valuable is a good doc that explains  "how to
>> edit and maintain the UI", written for an audience that is experienced in
>> SW dev but not UI development (probably including some basics around
>> framework chosen). This could be in a README or in the "dev docs" that has
>> been mentioned elsewhere.
>>
>> The SIP would be a great place to elaborate on technology choices &
>> supply a link to things like the video :)
>>
>> On Mon, Apr 13, 2020 at 10:35 AM Mike Drob  wrote:
>>
>>> Hi Marcus,
>>>
>>> The mailing list strips attachments for some folks, can you upload the
>>> video somewhere else and link to it for us poor unfortunate souls? Thanks
>>> for your work! Excited to see the progress as it happens.
>>>
>>> Mike
>>>
>>> On Mon, Apr 13, 2020 at 5:30 AM Marcus Eagan 
>>> wrote:
>>>
>>>> In general, I asked for some degree of trouble when I volunteered for
>>>> this work. Don't beat me too hard. My primary goal is to achieve three
>>>> things:
>>>>
>>>> 1) Improve security when using Solr Admin UI by removing EOL,
>>>> unsupported code.
>

Re: Solr Admin UI Refresh 2020

2020-04-13 Thread Marcus Eagan
Gus,

SIP sounds good. I will share.

Marcus



On Mon, Apr 13, 2020 at 09:13 Gus Heck  wrote:

> Maybe start collecting Design and Design choices in a SIP? This discussion
> has been good and there seems to be consensus that we want a new UI, we
> want it to be a package and we want the package to be available by default
> and well tested. "Package" seems to imply that it can be added or removed
> or replaced or an alternative UI installed along side of it. If we got all
> of those things done this would be amazingly awesome :)
>
> Another thing that would be valuable is a good doc that explains  "how to
> edit and maintain the UI", written for an audience that is experienced in
> SW dev but not UI development (probably including some basics around
> framework chosen). This could be in a README or in the "dev docs" that has
> been mentioned elsewhere.
>
> The SIP would be a great place to elaborate on technology choices & supply
> a link to things like the video :)
>
> On Mon, Apr 13, 2020 at 10:35 AM Mike Drob  wrote:
>
>> Hi Marcus,
>>
>> The mailing list strips attachments for some folks, can you upload the
>> video somewhere else and link to it for us poor unfortunate souls? Thanks
>> for your work! Excited to see the progress as it happens.
>>
>> Mike
>>
>> On Mon, Apr 13, 2020 at 5:30 AM Marcus Eagan 
>> wrote:
>>
>>> In general, I asked for some degree of trouble when I volunteered for
>>> this work. Don't beat me too hard. My primary goal is to achieve three
>>> things:
>>>
>>> 1) Improve security when using Solr Admin UI by removing EOL,
>>> unsupported code.
>>> 2) Make it easier or more welcoming for new developers to try the
>>> project and even become contributors in all areas of the project because
>>> the UI looks and functions as slightly more contemporary.
>>> 3) Give back more substantially to a community from which I have
>>> received so much with a testable and pretty UI.
>>>
>>> I've added another (4) which is contribute to help make the package
>>> manager a first-class citizen in the minds of many Solr users around the
>>> world via the UI package. I will need some help from someone in this list
>>> on deploying this UI with a jar in Gradle if we want to offer an
>>> alternative option to install the UI in Solr 9. I've attached where I'm at.
>>> It's a nights and weekends project, but I will always be available for
>>> bug fixes or discussions, unless I'm in a meeting or reading a book
>>>
>>> I won't solicit a ton of feedback prior to the first PR, which I
>>> will leave open for a few weeks or even a couple months while I put some
>>> lipstick on it and improve performance of the application.
>>>
>>> *<<<<<<< Over the weekend read lots of documentation, wrote a bunch of
>>> code when it wasn't holidays, built the services, and stumbled through the
>>> logic of rendering all these data points so you can watch the attached
>>> video if you want to check it out.  >>>>>>>>*
>>>
>>> There are definitely some areas where I didn't do the TypeScript thing
>>> because I'm still trying to grok it a bit. The two areas that I am
>>> looking to somewhat overhaul in potentially controversial ways are the
>>> *queries page* and the actual flow of the collections experience, which at
>>> the moment are sort of linked, yet disconnected at the same time. The
>>> Collections page tab today is really an Alias page. The plan is that it
>>> won't have its own tab in the new application, unless someone can give me
>>> a good reason. Almost everything else will stay the same.
>>>
>>> For that reason, I'd like to solicit feedback if anyone has any examples
>>> or ideas they'd like to share, I would greatly appreciate it. I'm somewhat
>>> far along with the Admin UI as of now. Short weekend because of holiday
>>> activities and general quarantine craziness, but I'm maybe 25-35% of the
>>> way to completion, depending on how much care I devote to the query
>>> screen.
>>>
>>> Even though there are some big problems with Angular — some major ones —
>>> I think this was the right way to go for many reasons. I'm about a quarter
>>> as fast in Angular as I am in React, but this is the right decision for the
>>> long haul. I can elaborate if anyone really cares. Most importantly, this
>>> app will be a lot easier to maintain.
>>>
>

Re: Solr Admin UI Refresh 2020

2020-04-10 Thread Marcus Eagan
I don't see the admin UI as non-core. I think that an application UI for
end-users of an application that consumes Solr is non-core. I have to resign
from arguing, though.

I don't consider myself a UI expert. I can do the work.


On Fri, Apr 10, 2020 at 11:42 AM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> David, you capture my thoughts well.
>
> Having a UI as a package gives users more choice and gives our users more
> flexibility.
> 1. Users would be able to use a latter version of the UI with an older
> version of Solr, or vice versa.
> 2. Users should be able to install multiple types of UI, from different
> publishers, at once.
> 3. Contributors should be able to contribute to the UI more easily, since
> collaboration can be less bureaucratic. Experts like Marcus won't need to
> depend on preoccupied committers like us.
> 4. A UI (not the default one) can use libraries that aren't even Apache
> 2.0 compatible.
> 5. We can setup and use UI test frameworks for test automation (selenium
> etc), that would be challenging to setup and maintain with ASF Jenkins.
>
> List goes on..
>
> Whether the package is a first party or third party can be a separate
> discussion. There should be an extremely easy and well defined way (support
> in the script itself) to start Solr with the packaged UI enabled.
>
> In any case, I don't think it is conducive to let UI code be part of the
> Solr's core codebase, where it currently is. The reason is, we can't fix
> bugs if we break something. We don't have automated testing either to know
> whether or not we broke anything.
>
> Every healthy project has a rich plugin ecosystem, and such non core
> improvements should be delivered via packages.
>
> On Fri, 10 Apr, 2020, 8:33 pm Marcus Eagan,  wrote:
>
>> I agree with you. 
>>
>> On Fri, Apr 10, 2020 at 06:22 David Smiley 
>> wrote:
>>
>>> I disagree that "package management frameworks are for loading
>>> non-essential features or features not enabled by default".
>>>
>>> I don't think the proposal of the UI being a "package" (in the new
>>> package system) implies that the UI (or _any_ package) is not a
>>> highly valuable package that is so highly valuable that we want it
>>> installed by default.  Noble and I were brainstorming on some ideas where
>>> even much of Solr's internal instances of plugin interfaces (e.g. query
>>> parsers, etc.) might even be a new "core" package or some-such.  The value
>>> in putting much of Solr in a package, 1st party, is separation of concerns
>>> and better classpath management.
>>>
>>> I think it's "essential" that a UI ship with Solr by default -- meaning,
>>> without the user having to take any additional steps whatsoever.  As Jan
>>> said it's been this way a long time.
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>>
>>> On Fri, Apr 10, 2020 at 1:35 AM Marcus Eagan 
>>> wrote:
>>>
>>>> I think package management frameworks are for loading non-essential
>>>> features or features not enabled by default. On essentialness,  experts
>>>> should not decide what is essential based on how they use a system. They
>>>> should consider the community of users. Regarding UI, it is and should be
>>>> enabled by default. Only a few use cases prefer it to be disabled and some
>>>> of those are because of its current state. They would like to use it in an
>>>> updated form.
>>>>
>>>> What is the technical rationale that outweighs the needs and behaviors
>>>> of our users to strip the user interface out of Solr?
>>>>
>>>> Thank you Noble and everyone else,
>>>> Marcus
>>>>
>>>>
>>>> On Thu, Apr 9, 2020 at 19:06 Noble Paul  wrote:
>>>>
>>>>> My 2 cents (again)
>>>>>
>>>>> if packages are disabled by default , how will UI work?
>>>>>
>>>>> We can make an exception for this one and enable only this by default
>>>>>
>>>>> Do we test the UI and certify it?
>>>>>
>>>>> The UI package can be shipped along with Solr distro ,like the million
>>>>> other jars that we ship with Solr today and every version of Solr can
>>>>> be certified for a certain version of the UI package. We should have
>>>>> sanity tests to ensure that the given version of UI works well with
>>>

Re: Solr Admin UI Refresh 2020

2020-04-10 Thread Marcus Eagan
I agree with you. 

On Fri, Apr 10, 2020 at 06:22 David Smiley  wrote:

> I disagree that "package management frameworks are for loading
> non-essential features or features not enabled by default".
>
> I don't think the proposal of the UI being a "package" (in the new package
> system) implies that the UI (or _any_ package) is not a highly valuable
> package that is so highly valuable that we want it installed by default.
> Noble and I were brainstorming on some ideas where even much of Solr's
> internal instances of plugin interfaces (e.g. query parsers, etc.) might
> even be a new "core" package or some-such.  The value in putting much of
> Solr in a package, 1st party, is separation of concerns and better
> classpath management.
>
> I think it's "essential" that a UI ship with Solr by default -- meaning,
> without the user having to take any additional steps whatsoever.  As Jan
> said it's been this way a long time.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Fri, Apr 10, 2020 at 1:35 AM Marcus Eagan 
> wrote:
>
>> I think package management frameworks are for loading non-essential
>> features or features not enabled by default. On essentialness,  experts
>> should not decide what is essential based on how they use a system. They
>> should consider the community of users. Regarding UI, it is and should be
>> enabled by default. Only a few use cases prefer it to be disabled and some
>> of those are because of its current state. They would like to use it in an
>> updated form.
>>
>> What is the technical rationale that outweighs the needs and behaviors of
>> our users to strip the user interface out of Solr?
>>
>> Thank you Noble and everyone else,
>> Marcus
>>
>>
>> On Thu, Apr 9, 2020 at 19:06 Noble Paul  wrote:
>>
>>> My 2 cents (again)
>>>
>>> if packages are disabled by default , how will UI work?
>>>
>>> We can make an exception for this one and enable only this by default
>>>
>>> Do we test the UI and certify it?
>>>
>>> The UI package can be shipped along with Solr distro ,like the million
>>> other jars that we ship with Solr today and every version of Solr can
>>> be certified for a certain version of the UI package. We should have
>>> sanity tests to ensure that the given version of UI works well with
>>> Solr. "My commit has broken the UI and it's not my problem" should not
>>> be a valid excuse. The UI sanity tests should pass as a part of the
>>> tests.
>>>
>>> Is the UI important?
>>>
>>> Yes, the admin UI is the face of Solr for many users. People always
>>> assumed it existed & they depend on it.
>>>
>>> The current admin UI has fallen behind. If the new UI effort delivers
>>> on the promise, this is a great opportunity to get rid of that old
>>> baggage & make Solr codebase even slimmer
>>>
>>> On Fri, Apr 10, 2020 at 6:06 AM Jan Høydahl 
>>> wrote:
>>> >
>>> > Solr has always had an admin UI and if anyone wants to propose it
>>> should not, please start another thread or vote about that, and do not
>>> divert this thread which is about how to improve and future proof the Admin
>>> UI.
>>> >
>>> > I believe the Admin UI should be strengthened and enhanced, not
>>> removed. It can perfectly well be an official and even default on part of
>>> every release, perfectly in sync. Whether it is in core as today or a
>>> package or a stand-alone process or a new webapp, are then really what we
>>> discuss here.
>>> >
>>> > Perhaps after people have voiced their opinions in this thread, a SIP
>>> can be crafted with a concrete plan. We can then have a vote on the SIP.
>>> >
>>> > Jan Høydahl
>>> >
>>> > > 9. apr. 2020 kl. 20:06 skrev Cassandra Targett <
>>> casstarg...@gmail.com>:
>>> > >
>>> > > 
>>> > > Thanks for your message, Gus. You touched on things I was thinking
>>> this morning as I caught up to the thread, and had started to draft a
>>> message about.
>>> > >
>>> > > I feel like there is an assumption underlying some of our discussion
>>> about packages that says a feature or whatever has to either be part of our
>>> core codebase or 100% maintained by someone “outside” the community (by
>>> which I mean som

Re: Solr Admin UI Refresh 2020

2020-04-09 Thread Marcus Eagan
repository server
> indefinitely, but again that's surely been discussed WRT packages
> already... Using Github in such a way is subject to being broken
> arbitrarily when Github decides to restrict things for cost reasons (ask
> Bower about that one WRT rate limiting...) or the "repository" has to be
> something local and therefore must be included part of the distribution...
> at which point it's still a thing we distribute and since we're
> distributing it and we probably don't mean to distribute broken stuff we
> still need UI developers...
> > >>>
> > >>> Also, I thought the package loading stuff was supposed to be
> disabled by default for security, that seems to conflict with or at least
> complicate the notion of easily installing as a package.
> > >>>
> > >>> So "package" is a good for modularizing code, or for 3rd party
> (possibly paid) plugins (I have a client that might find that interesting
> in the future) but we have to ensure that it doesn't lead to a lack of
> maintenance for things that are critical.
> > >>>
> > >>> Incidentally though I've said I favor Angular CLI, (significantly
> because I've got some start on learning it) it also occurs to me that
> perhaps anything "modern" is a difficulty because those things all have a
> learning curve, and maximizing accessibility and ease of modifications for
> folks not steeped in UI development might be our priority (different from
> the priorities a commercial site would have). The flip side argument is
> that with a popular framework, it would be easier for UI focused folks to
> contribute... but will they? and does that leave us perennially rewriting
> the UI in whatever is popular? (maybe that's ok?) I think in all our
> decisions here we need to be very careful to distinguish how our needs may
> differ in unusual ways from the needs of commercial web development.
> > >>>
> > >>> -Gus
> > >>>
> > >>> On Thu, Apr 9, 2020 at 8:14 AM Erick Erickson <
> erickerick...@gmail.com> wrote:
> > >>> Marcus:
> > >>>
> > >>> re-reading the thread, it looks to me like the consensus from Noble
> and Ishan and Jan is that as long as the new, nifty UI is a separate
> package, go ahead and knock yourself out ;). The objection is to making it
> part of the Solr code base… We’ll all be thrilled if we can rip the
> current admin UI out ;)
> > >>>
> > >>> That said, I suspect it’ll be one of the tighter packages. It’d be
> super-cool if we could run the UI tests on Jenkins say once a day just to
> keep it up to date.
> > >>>
> > >>> The admin UI has always been somewhat awkwardly bolted on the side
> of Solr, it’d be great to have it have a more elegant architecture.
> > >>>
> > >>> The other exciting thing would be that clients could then use the
> package code as something they can incorporate/fork/whatever. Practically
> every client I’ve worked with at large installations has rolled their own
> dashboard. If they could use a package as a starting point, it’d be welcome.
> > >>>
> > >>> Best,
> > >>> Erick
> > >>>
> > >>>> On Apr 9, 2020, at 3:07 AM, Marcus Eagan 
> wrote:
> > >>>>
> > >>>> Hey Noble,
> > >>>>
> > >>>> -1 is a definitive, so I want to clarify that you are saying you do
> not wish to remove the EOL front end and replace it with another one in the
> longer term?
> > >>>>
> > >>>> I hear you! As a product manager in my day job, my primary goal is
> to find features to cut! I spend a lot of time thinking about non-essential
> vs used heavily vs causes more problems than it's worth. I can tell you
> from watching the many people in the field at Lucidworks, there are a lot
> of people who know quite a bit about Solr, but rely on the Admin UI heavily
> because they feel comfortable there. Those people in effect help us stay
> employed despite never contributing or being capable of contributing to
> Solr. So hear me out. I've got a proposal:
> > >>>>
> > >>>> To start, I can work on this app as an optional package for your
> awesome new package manager. It will be the second one I've worked on in my
> evenings and weekends btw. The first was a package validator that I hope to
> eventually open source, but its complexity and lack of popularity because
> it is security ;( will likely make it the second one I open source/finish.
> I'm also collaborating with a couple mem

Re: Solr Admin UI Refresh 2020

2020-04-08 Thread Marcus Eagan
Thanks again Gus.

Lots of people indeed misuse REST so we could go on and on about whether
requests are stateless or not in another thread. Let's spare the group.

I think most everyone on this channel would be in agreement with you on
separate app. I'll be opening a new ticket and a PR that will document a
few things to make it easy for UI devs who know little to no Java to
get started.

Ishan, there's some significant UI expertise in the team. Erickson finds
his way to open every cookie jar. Erik Hatcher wrote the first version of
Blacklight. I've seen Pugh do lots of work on Quepid's UI. Jan and Kevin
have done a lot of work, and so have many others. The list goes on, and
*likes to work on UI* is a different discussion.

Beyond committers because I'm not a committer, I have UI expertise that I
can polish off and improve for the sake of my interest and commitment to
the community and I like to do it. I've also led UI teams. I can help to
steward the effort overall and keep things up to date up to the point where
I need to ask one of the committers to help me get changes merged. I'll
probably even hire a developer to work on it once we are to that point. ;-)

Expertise is not something that should block us but motivate us to expand
this community and/or our own skillsets long term.

 Thank you both and everyone else,

Marcus

On Wed, Apr 8, 2020, 10:21 AM Gus Heck  wrote:

> While running it in an external node does ensure separability, I don't
> think it does a good job of addressing my other point of not needing to
> manage a 3rd server. It's still a server if it's started by java, and one
> still has to ensure it exists, and it will be extra hard to figure out how
> to configure it if started by Solr.
>
> I'm strongly in favor of us having a UI. From my perspective as a
> consultant, it makes discovery of things like their startup parameters and
> directories and such very easy (just go to front page of the admin screen),
> and it's so much easier to get a customer with security concerns and strict
> controls on who can access what (think banks, military, etc) to share a web
> session where they drive the UI than to get direct access to machines.
> It'll be a lot slower and much lower service to be making people wait while
> I craft curl statements to paste into the web session (and then fix
> the inevitable typos, or detect when they missed the last char of what I
> pasted, etc...).
>
> I'm definitely against Solr spawning some other server (node or otherwise)
> on its own and thereby requiring additional system dependencies, or
> creating a second process that needs to be configured and properly secured.
> To me that's even worse than requiring the UI to run outside of Solr. We
> have a perfectly good web container already, and furthermore there's a much
> greater likelihood that maintainers will be facile with java/j2ee than
> anything else (IMHO). It's great if the framework we choose uses little or
> no JSP/Servlet and is modernized with a 100% javascript, templated etc.
> front end, but the back end should be java/jetty because we've got lots of
> java folks.
>
> If the back end matters deeply then you're not really programming to
> MVC/REST style...
>
> So there's another $0.02 :) and if you're not careful I'll give you an
> entire nickle's worth of ways people misuse/misunderstand the term REST :)
>
> -Gus
>
> On Tue, Apr 7, 2020 at 9:06 PM Marcus Eagan  wrote:
>
>> Gus,
>>
>> Your $.02 are worth a lot more than $.02 USD, so thank you.
>>
>> By separate app, I think I mean to endorse managed by a Node.js process
>> started by NPM. I don’t think that conflicts with what you have proposed.
>> The NPM command should be issued by Java || or Bash but I don’t think it
>> would add significant overhead. Also, seems like on CI and or precommit
>> hooks front end could be sizzled in parallel without adding much overhead.
>>
>> As for the front end framework, the most important things to consider in
>> my view are simplicity and maintainability. We need to do a thorough
>> analysis on the ecosystem and issues like the size of a React project vs
>> Angular project vs Vue project, but React and Vue certainly have the
>> velocity and the hearts of the front end community more than Angular. React
>> is MIT licensed now and for the foreseeable
>> future thanks to the power and reach of its developers.
>>
>>  wrote:
>>
>>> +1 for Angular CLI / Typescript since I've fiddled with this in a minor
>>> way recently, Also MIT license is super friendly.
>>>
>>>
>> As a disenfranchised volunteer to the project, I also assume voters on
>> specific choices like frameworks will be helping build in some respect at
>> some point now or i

Re: Solr Admin UI Refresh 2020

2020-04-08 Thread Marcus Eagan
 wrote:

>
> At the risk of displaying my ignorance for the current state of the art in
> front-end dev/tech:
> Why would we need a Node.js backend or any backend for that matter if this
> is purely a browser front-end based UI that will be deployed?  I'm aware
> there needs to be a webserver of course, but jeesh, Jetty is competent at
> that!
>

WRT separate: the Node.js backend I suggest is a backend that is used
for transpiling code that isn’t JS into JS, and for packaging apps or
running the tests. Almost all new apps use some permutation of this pattern
afaik. Could and probably should still use Jetty or keep things as simple
as possible. The Jetty dependency still may put us at risk for boxing out
some UI devs. So, perhaps there is an option for running the app as a
node-only app locally for development purposes. These are implementation
details, though and I don’t have answers.

As a first step, I’ll test the existing work Jan pointed to and see how the
apps behave with Solr to identify gaps and share my findings. Feedback
is always welcome.

Marcus

On Tue, Apr 7, 2020 at 21:05 David Smiley  wrote:

> I sympathize with what Gus wrote 100%.  For "small" users, I even say run
> ZK on those Solr nodes if you like, but that still leaves you with 3
> machines.
>
> At the risk of displaying my ignorance for the current state of the art in
> front-end dev/tech:
> Why would we need a Node.js backend or any backend for that matter if this
> is purely a browser front-end based UI that will be deployed?  I'm aware
> there needs to be a webserver of course, but jeesh, Jetty is competent at
> that!
>
> > As a disenfranchised volunteer to the project, I also assume voters on
> specific choices like frameworks will be helping build in some respect at
> some point now or in the future. Is that a fair or misguided assumption?
>
> Eh... are you saying either we vote (e.g. express opinions) + (actively)
> help or neither?   LOL.  You'll get votes from any/everyone because they
> are cheap to give.  Maybe you'll get coding help or maybe not, but I think
> you can count on sufficient attention to get good code that works
> committed, especially since you are also discussing design/architecture now
> to get buy-in.  You will not waste your time.  If there are sacred cows to
> butcher then NOW is the time to be up front about what some of the most
> opinionated amongst us can accept.
>
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Tue, Apr 7, 2020 at 9:06 PM Marcus Eagan  wrote:
> >
> > Gus,
> >
> > Your $.02 are worth a lot more than $.02 USD, so thank you.
> >
> > By separate app, I think I mean to endorse managed by a Node.js process
> started by NPM. I don’t think that conflicts with what you have proposed.
> The NPM command should be issued by Java || or Bash but I don’t think it
> would add significant overhead. Also, seems like on CI and or precommit
> hooks front end could be sizzled in parallel without adding much overhead.
> >
> > As for the front end framework, the most important things to consider in
> my view are simplicity and maintainability. We need to do a thorough
> analysis on the ecosystem and issues like the size of a React project vs
> Angular project vs Vue project, but React and Vue certainly have the
> velocity and the hearts of the front end community more than Angular. React
> is MIT licensed now and for the foreseeable
> > future thanks to the power and reach of its developers.
> >
> >  wrote:
> >>
> >> +1 for Angular CLI / Typescript since I've fiddled with this in a minor
> way recently, Also MIT license is super friendly.
> >>
> >
> > As a disenfranchised volunteer to the project, I also assume voters on
> specific choices like frameworks will be helping build in some respect at
> some point now or in the future. Is that a fair or misguided assumption?
> >
> > Marcus
> >
> > On Tue, Apr 7, 2020 at 17:15 Gus Heck  wrote:
> >>
> >> +1 for Angular CLI / Typescript since I've fiddled with this in a minor
> way recently, Also MIT license is super friendly.
> >>
> >> Separate App - hmm... that's got some attraction, but also gives my
> stomach some churning when I think about solr now requiring management of 3
> different servers (solr, something to serve UI and zookeeper). Adding more
> infrastructure gives me pause with respect to all the smaller
> installations. I've had several small self funded startup clients and a few
> clients with existing initial installs that they are outgrowing in places
> where procuring new machines and new software is a 6-12 mo endeavor and
> both types

Re: Solr Admin UI Refresh 2020

2020-04-07 Thread Marcus Eagan
Gus,

Your $.02 are worth a lot more than $.02 USD, so thank you.

By separate app, I think I mean to endorse managed by a Node.js process
started by NPM. I don’t think that conflicts with what you have proposed.
The NPM command should be issued by Java || or Bash but I don’t think it
would add significant overhead. Also, seems like on CI and or precommit
hooks front end could be sizzled in parallel without adding much overhead.

As for the front end framework, the most important things to consider in my
view are simplicity and maintainability. We need to do a thorough analysis
on the ecosystem and issues like the size of a React project vs Angular
project vs Vue project, but React and Vue certainly have the velocity and
the hearts of the front end community more than Angular. React is MIT
licensed now and for the foreseeable
future thanks to the power and reach of its developers.

 wrote:

> +1 for Angular CLI / Typescript since I've fiddled with this in a minor
> way recently, Also MIT license is super friendly.
>
>
As a disenfranchised volunteer to the project, I also assume voters on
specific choices like frameworks will be helping build in some respect at
some point now or in the future. Is that a fair or misguided assumption?

Marcus

On Tue, Apr 7, 2020 at 17:15 Gus Heck  wrote:

> +1 for Angular CLI / Typescript since I've fiddled with this in a minor
> way recently, Also MIT license is super friendly.
>
> Separate App - hmm... that's got some attraction, but also gives my
> stomach some churning when I think about solr now requiring management of 3
> different servers (solr, something to serve UI and zookeeper). Adding more
> infrastructure gives me pause with respect to all the smaller
> installations. I've had several small self funded startup clients and a few
> clients with existing initial installs that they are outgrowing in places
> where procuring new machines and new software is a 6-12 mo endeavor and
> both types seem to squirm when I make suggestions such as running zookeeper
> separately, (let alone 3 of them). I think separate looks good for medium
> to large folks or very large companies that **already have** a solr expert
> on hand, but hurts the small clients and the departments in large orgs that
> got started with insufficient advice/expertise, so maybe
>
> - The UI should be installed by default
> - it should be easy to remove it, or start with it disabled
> - it should be self contained and separately downloadable.
>
> My recent fiddling included figuring out how to make angular CLI play nice
> in a J2ee war file structure seen here: https://github.com/nsoft/ns-login
>
> By play nice I mean,
> - build creates a war file that "just works" when installed
> - Angular CLI commands work
> - Angular serve command works (for auto-reloading ui changes, running on
> port 4200; note the use of proxy to allow it to talk to an already running
> web container)
>
> My $0.02,
>
> -Gus
>
> On Mon, Apr 6, 2020 at 11:03 AM Jörn Franke  wrote:
>
>> I think standalone would be very useful.
>> I propose Angular with Typescript - it fits to a more data centric
>> approach with data types etc.
>> Maybe even two types of UIs - Admin UI and a simple Search UI.
>>
>>
>> Am 06.04.2020 um 16:53 schrieb Jan Høydahl :
>>
>> Thanks for kickstarting this and bringing some fresh blood and
>> enthusiasm :)
>>
>> Looks like others have had similar wish for a standalone Solr Admin App,
>> here’s a quick GitHub search for inspiration:
>>
>>   https://github.com/savantly-net/solr-admin (Angular, nice screenshots,
>> 1y old)
>>   https://github.com/kezhenxu94/yasa (vuejs, impressive screenshots, 2y
>> old)
>>   https://github.com/thereactleague/galaxy (React, no screenshots, 4y
>> old)
>>
>> They all seem abandoned but perhaps a new official effort could bring
>> their developers in as contributors again?
>>
>>  the people who work on the Admin UI do not need to be expected to know
>> the Java workflow, necessarily. This reality widens the net for who can
>> contribute.
>>
>>
>> Agree. Frontend devs have been a shortage in this project, and if we can
>> make it easier to attract UI committers who feel at home and productive
>> with the UI code, that would be a win. On the other hand, if we expect that
>> the UI will be maintained by regular Java committers, then anything that
>> makes it easier for them/us to contribute is also a win, like perhaps
>> strongly-typed.
>>
>> Again, thanks Marcus for reviving this topic. Let us all try not to be
>> overly ambitious here or shoot the initiative down with bikeshedding. It is
>> far more important to fuel the

Re: Solr Admin UI Refresh 2020

2020-04-07 Thread Marcus Eagan
Erick—it will be a lot of work. That’s good for me, er, I’m used to it.
Blame Ann Arbor and Solr.

Thanks Jan. I will do my best to move this effort along in a collaborative
yet productive  manner.  Thanks for the links. I’ve bookmarked them.

Jörn and Alex, I appreciate the input. I think the scope must be very
limited to Solr Admin moving off of deprecated tools for phase 1 (maybe
with some visual improvements baked in).

Specifically, to each of you:

Jörn - an open source search UI is something that I hear is in the works
right now. More on that later.

Alex, the Language Server Protocol is also awesome but probably not fit for
this discussion’s focus at the moment. If you want to talk about it in
a separate thread I’m happy to chat through it and figure out how to reduce
friction for when it’s time to consider implementing it or something like
it.

Marcus

On Mon, Apr 6, 2020 at 12:55 Alexandre Rafalovitch 
wrote:

> I always wondered if Solr could benefit from Language Server Protocol:
> https://microsoft.github.io/language-server-protocol/ , at least for
> the Query screen. That would have allowed us to integrate with a bunch
> of tools automatically rather than having a great query implementation
> ourselves.
>
> But I don't know how feasible or relevant this is, so mostly just
> throwing it out there in case others also thought of it and/or if it
> will seem promising as a line of thought.
>
> Regards,
>Alex.
>
> On Mon, 6 Apr 2020 at 10:53, Jan Høydahl  wrote:
> >
> > Thanks for kickstarting this and bringing some fresh blood and
> enthusiasm :)
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
> --
Marcus Eagan


Solr Admin UI Refresh 2020

2020-04-06 Thread Marcus Eagan
Coming back to these existential questions from my phone:

*Jan Høydahl*
Added 1 hour ago

There are many opinions around admin UI. So I think the best place to start
would be a new mail-thread in dev@ to discuss the way forward. Before we
start a major re-work, we should probably ask ourselves a few existential
questions:

   - Should we turn Admin UI into a standalone app instead of embedded in
   Solr?


I think it should be a standalone app. There are many advantages gained
from such a separation of concerns. For one, the people
who work on the Admin UI do not need to be expected to know the Java
workflow, necessarily. This reality widens the net for who can contribute.

Testing becomes a lot easier because JS developers are accustomed to
building tests for static assets and self-contained node apps. They
generally know less about testing a bit of JS within a massive Java
project.  The test could also run independently for changes that only
affect the front end. Adding test coverage without adding time to tests
sounds awesome.
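
To make the "tests run independently for front-end-only changes" idea concrete, here is a minimal sketch of the check a CI job could apply to a change set. Everything in it is an assumption for illustration: the class name, the method name, and the `UI_ROOT` path are hypothetical, not something the project has agreed on.

```java
import java.util.List;

/** Sketch: decide whether a change set touches only the front end. */
final class FrontEndOnlyChange {
    // ASSUMPTION: hypothetical location of the UI sources within the repository.
    static final String UI_ROOT = "solr/webapp/web/";

    private FrontEndOnlyChange() {}

    /** Returns true only when every changed path lives under UI_ROOT. */
    static boolean touchesOnlyUi(List<String> changedPaths) {
        if (changedPaths.isEmpty()) {
            // Nothing changed: make no claim, so the default suite still runs.
            return false;
        }
        return changedPaths.stream().allMatch(path -> path.startsWith(UI_ROOT));
    }
}
```

A build could call `touchesOnlyUi` with the list of files in a commit and, when it returns true, run only the UI tests; any other result falls back to the full suite.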

There are quite a few tickets over the years that have seemed to suggest
that people want more fine-grained control over the Solr admin UI overall.
Two recent tickets discussed topics like running a Solr Admin app on only
one node and disabling it altogether for whatever reason. See:
https://issues.apache.org/jira/browse/SOLR-14014.


   - What UI framework? Guess anything is better than current EOL, but will
   largely depend on who is willing to do the job!

I’m happy to take this on (and willing to follow through on completing in
my nights and weekends), but I am mostly framework agnostic. My strongest
preference would be React, provided the license is kosher. There was one
blip of “practically unusable for most orgs” a couple years back, but
Facebook made it right really soon after.  However, I’m flexible. Angular
(not JS) and Vue are also great.  I would recommend we consider Typescript
also because of the size of project and number of strongly-typed devs on
this mailing list. My only reservation with TypeScript, though it may not
apply in this case, is that the supersets of JS have changed a lot more
than the frameworks. While CoffeeScript was an unnecessary layer of
abstraction from my limited perspective, TypeScript might make JS more
embraceable to a list of Java hackers.


   - Current UI has no test coverage, can we do better with the new UI?


It’s imperative. React, Angular, and Vue each make it easy to include tests.



https://issues.apache.org/jira/browse/SOLR-12276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076204#comment-17076204


Re: Migrate Solr helm chart into apache/lucene-solr repo

2020-02-01 Thread Marcus Eagan
David, I think he is referring to the note here:


https://github.com/helm/charts/blob/master/README.md#deprecation-timeline


All the best,

Marcus

On Sat, Feb 1, 2020 at 14:37 David Smiley  wrote:

> Hi Lee,
>
> We're currently in-process of adopting Solr's Docker image.  I confess
> I've never used Helm so I have no clue how it's maintained, tested, etc and
> so I'll leave this sort of decision to others here.
>
> BTW I went to that link about Helm deprecation and AFAICT it seems about
> 2.x version of Helm and not about 3.x.  See
> https://helm.sh/blog/2019-10-22-helm-2150-released/#helm-2-support-plan
> too.  Am I missing something?
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Fri, Jan 31, 2020 at 1:09 AM LEE Ween Jiann <
> wjlee.2...@phdcs.smu.edu.sg> wrote:
>
>> Hi,
>>
>>
>>
>> Since helm stable repo is deprecated, it’s taking a lot of time to get
>> the pull requests approved.
>>
>> Also the helm chart for Solr has to go somewhere soon, refer to
>> https://github.com/helm/charts#deprecation-timeline.
>>
>>
>>
>> Would you guys be adopting it?
>>
> --
Marcus Eagan


Re: Maintenance of Solr's official Dockerfile

2020-01-05 Thread Marcus Eagan
Hi Jan,

Thanks for the update, and thanks to Martijn for the donation! :)

I think that regardless of what the community decides to do with the
docker-solr repo, a good first step would be to add a Docker folder to the
Apache repository that contains a base Dockerfile and a README. In that
README, users can be directed to the location of the docker-solr repo,
wherever that may be, or leverage the Dockerfile in the  Apache repo as a
starting point for building their own image.

Two cents,

Marcus




On Sun, Jan 5, 2020 at 3:52 PM Jan Høydahl  wrote:

> Hi,
>
> The Lucene project is asked to take over maintenance of the official Solr
> Dockerfile that ends up on Docker hub (located in
> https://github.com/docker-solr/docker-solr). We have received a Software
> Grant from current maintainer Martijn Koster who has done a fantastic job
> together with a few committers maintaining it.
>
> I think it makes a lot of sense for the project to more tightly support
> Docker and ensure a good experience running Solr on Docker.
>
> This email thread is to discuss what that may look like and how we should
> transition the current code into the project.
>
> As a first step we invite all committers and contributors who use Docker
> to get involved, checkout the current docker-solr git repo, try building
> the images, submitting PRs etc. I have started doing this myself and have
> submitted a few PRs.
>
> Next step would be to agree on how we bring the current code into our
> project and ASF repos in the best possible way. Questions that arise are:
>
> 1. Are we allowed to maintain ASF code in a non-ASF repo? If not, how do
> we transition to an ASF git repo?
> * Can it be a sub folder in our main repo or does it need to be a
> separate repo?
> 2. How will the current build/test/publish process need to change?
> * Can we continue using travis for CI?
> * Do we need to talk to Docker folks to change repo location?
> * Should publishing of new Docker be a RM responsibility, or something
> that happens right after each release like the ref-guide?
> 3. Legal stuff - when we as a project file a PR to update the official
> solr docker images, are we then legally releasing a binary version of Solr?
> Technically it is Docker CI that build and publish the images, we just
> initiate it…
> Do we know any other ASF project that maintain their own official
> docker image?
> 4. Practical things - change README, NOTICE, header files, wording etc
>
> I have opened https://issues.apache.org/jira/browse/SOLR-14168 as an
> umbrella issue for tasks that spin out from this email thread discussion.
>
> Jan Høydahl
>


-- 
Marcus Eagan


Re: Change solr/lucene Readme file format

2019-11-10 Thread Marcus Eagan
Most README files in contemporary open source projects are Markdown because
of the formatting features. I personally favor convention over ease of use
in this case.

Marcus Eagan

On Sun, Nov 10, 2019, 8:58 AM Erick Erickson 
wrote:

> Personally I’d make them text files. The last thing I want to do is make
> reading/updating these have a barrier to entry. We should save formatting
> for the ref guide and/or Wiki.
>
> Best,
> Erick
>
> > On Nov 10, 2019, at 1:01 AM, Man with No Name 
> wrote:
> >
> > Hey folks,
> > I have been looking into the solr/lucene source code, and the first
> thing caught my eye was the different Readme files. All the files had
> different file and text format. What do you guys think about making all the
> readmes to markdown file rather than text files, and a standard template?
> >
> >
> > --
> > Regards:
> > Pinkesh Sharma
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[jira] [Updated] (SOLR-13737) Lead with SolrCloud

2019-09-03 Thread Marcus Eagan (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eagan updated SOLR-13737:

Status: Patch Available  (was: Open)

> Lead with SolrCloud
> ---
>
> Key: SOLR-13737
> URL: https://issues.apache.org/jira/browse/SOLR-13737
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Affects Versions: master (9.0)
>    Reporter: Marcus Eagan
>Priority: Trivial
> Fix For: master (9.0)
>
>
> Based on some of the unnecessary and non-constructive criticism I have heard 
> that SolrCloud is an afterthought in 2019, which is totally not true, I 
> decided it might be better if we moved it up ahead of standalone Solr in the 
> README.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13737) Lead with SolrCloud

2019-09-03 Thread Marcus Eagan (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eagan updated SOLR-13737:

Status: Open  (was: Patch Available)

> Lead with SolrCloud
> ---
>
> Key: SOLR-13737
> URL: https://issues.apache.org/jira/browse/SOLR-13737
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Affects Versions: master (9.0)
>    Reporter: Marcus Eagan
>Priority: Trivial
> Fix For: master (9.0)
>
>
> Based on some of the unnecessary and non-constructive criticism I have heard 
> that SolrCloud is an afterthought in 2019, which is totally not true, I 
> decided it might be better if we moved it up ahead of standalone Solr in the 
> README.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-13737) Lead with SolrCloud

2019-09-03 Thread Marcus Eagan (Jira)
Marcus Eagan created SOLR-13737:
---

 Summary: Lead with SolrCloud
 Key: SOLR-13737
 URL: https://issues.apache.org/jira/browse/SOLR-13737
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: documentation
Affects Versions: master (9.0)
Reporter: Marcus Eagan
 Fix For: master (9.0)


Based on some of the unnecessary and non-constructive criticism I have heard 
that SolrCloud is an afterthought in 2019, which is totally not true, I 
decided it might be better if we moved it up ahead of standalone Solr in the 
README.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13649) BasicAuth's 'blockUnknown' param should default to true

2019-08-23 Thread Marcus Eagan (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16914342#comment-16914342
 ] 

Marcus Eagan edited comment on SOLR-13649 at 8/23/19 3:22 PM:
--

bq. What I was hoping for wrt smooth upgrade was a way to make the default 
depend on config version. We could have used luceneMatchVersion if this was a 
per-core config but it is a cluster-wide config so we cannot. I'm not aware of 
any cluster-wide config version parameter we could use instead. Perhaps a new 
clusterProperty solrMatchVersion could be of benefit for this and other cluster 
wide breaking changes. Then if solrMatchVersion is not set you'll assume 
Version.LATEST, but if it is set to e.g. 8.2 then blockUnknown could default to 
true as before. Or perhaps better is to introduce a "version" property in 
security.json that would work much like our schema version property, and we 
could start on version=2 from Solr9. This is how e.g. docker versions their 
docker-compose configs. This could be useful in the future if we need to change 
the very format of security.json to e.g. support multiple auth schemes and 
backends in the same cluster.

I think that would need to be addressed in another issue or PR that is linked 
to this one. I can write it, but would prefer the scope not creep on this 
change.

Great suggestion, though I feel that containers already address much of this 
version checking.


was (Author: marcussorealheis):
bq. What I was hoping for wrt smooth upgrade was a way to make the default 
depend on config version. We could have used luceneMatchVersion if this was a 
per-core config but it is a cluster-wide config so we cannot. I'm not aware of 
any cluster-wide config version parameter we could use instead. Perhaps a new 
clusterProperty solrMatchVersion could be of benefit for this and other cluster 
wide breaking changes. Then if solrMatchVersion is not set you'll assume 
Version.LATEST, but if it is set to e.g. 8.2 then blockUnknown could default to 
true as before. Or perhaps better is to introduce a "version" property in 
security.json that would work much like our schema version property, and we 
could start on version=2 from Solr9. This is how e.g. docker versions their 
docker-compose configs. This could be useful in the future if we need to change 
the very format of security.json to e.g. support multiple auth schemes and 
backends in the same cluster.

I think that would need to be addressed in another issue or PR that is linked 
to this one. I can write it, but would prefer the scope not creep on this 
change.

> BasicAuth's 'blockUnknown' param should default to true
> ---
>
> Key: SOLR-13649
> URL: https://issues.apache.org/jira/browse/SOLR-13649
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI, Authentication, security
>Affects Versions: 7.7.2, 8.1.1
>     Environment: All
>Reporter: Marcus Eagan
>Priority: Major
>  Labels: Authentication
> Fix For: master (9.0)
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> If someone seeks to enable basic authentication but they do not specify the 
> {{blockUnknown}} parameter, the default value is {{false}}. That default 
> behavior is a bit counterintuitive because if someone wishes to enable basic 
> authentication, you would expect that they would want all unknown users to 
> need to authenticate by default. I can imagine cases where you would not, but 
> those cases would be less frequent.






[jira] [Commented] (SOLR-13649) BasicAuth's 'blockUnknown' param should default to true

2019-08-23 Thread Marcus Eagan (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16914342#comment-16914342
 ] 

Marcus Eagan commented on SOLR-13649:
-

bq. What I was hoping for wrt smooth upgrade was a way to make the default 
depend on config version. We could have used luceneMatchVersion if this was a 
per-core config but it is a cluster-wide config so we cannot. I'm not aware of 
any cluster-wide config version parameter we could use instead. Perhaps a new 
clusterProperty solrMatchVersion could be of benefit for this and other cluster 
wide breaking changes. Then if solrMatchVersion is not set you'll assume 
Version.LATEST, but if it is set to e.g. 8.2 then blockUnknown could default to 
true as before. Or perhaps better is to introduce a "version" property in 
security.json that would work much like our schema version property, and we 
could start on version=2 from Solr9. This is how e.g. docker versions their 
docker-compose configs. This could be useful in the future if we need to change 
the very format of security.json to e.g. support multiple auth schemes and 
backends in the same cluster.

I think that would need to be addressed in another issue or PR that is linked 
to this one. I can write it, but would prefer the scope not creep on this 
change.
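
A minimal sketch of the versioned-default idea quoted above, written in Python 
rather than Solr's Java. The "version" key in security.json, the threshold of 
2, and the helper name resolve_block_unknown are all hypothetical; none of them 
exist in Solr. It assumes, as the issue description states, that the legacy 
default for blockUnknown is false.

```python
import json

# Hypothetical convention: a security.json without a "version" key is treated
# as version 1 (legacy); version 2 and later get the stricter default.
LEGACY_VERSION = 1
STRICT_DEFAULT_FROM_VERSION = 2


def resolve_block_unknown(security_json: str) -> bool:
    """Return the effective blockUnknown value for a security.json document."""
    config = json.loads(security_json)
    auth = config.get("authentication", {})
    # An explicit setting always wins over any version-based default.
    if "blockUnknown" in auth:
        return bool(auth["blockUnknown"])
    version = config.get("version", LEGACY_VERSION)
    return version >= STRICT_DEFAULT_FROM_VERSION


legacy = '{"authentication": {"class": "solr.BasicAuthPlugin"}}'
versioned = '{"version": 2, "authentication": {"class": "solr.BasicAuthPlugin"}}'
print(resolve_block_unknown(legacy))     # legacy file keeps the old default
print(resolve_block_unknown(versioned))  # versioned file gets the new default
```

With this shape, existing clusters keep their current behavior until someone 
opts in by bumping the version, which is the smooth-upgrade property the quote 
asks for.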

> BasicAuth's 'blockUnknown' param should default to true
> ---
>
> Key: SOLR-13649
> URL: https://issues.apache.org/jira/browse/SOLR-13649
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI, Authentication, security
>Affects Versions: 7.7.2, 8.1.1
> Environment: All
>Reporter: Marcus Eagan
>Priority: Major
>  Labels: Authentication
> Fix For: master (9.0)
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> If someone seeks to enable basic authentication but they do not specify the 
> {{blockUnknown}} parameter, the default value is {{false}}. That default 
> behavior is a bit counterintuitive because if someone wishes to enable basic 
> authentication, you would expect that they would want all unknown users to 
> need to authenticate by default. I can imagine cases where you would not, but 
> those cases would be less frequent.






[jira] [Commented] (SOLR-13649) BasicAuth's 'blockUnknown' param should default to true

2019-08-20 Thread Marcus Eagan (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911042#comment-16911042
 ] 

Marcus Eagan commented on SOLR-13649:
-

For people watching this issue: I have added the appropriate tests, and the 
code now throws an exception if a user attempts to delete the final user or 
enable the basic auth plugin without at least one user. 
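
The two checks described above can be sketched as follows. This is an 
illustration in Python, not the code from the patch; AuthConfigError, 
validate_enable, and delete_user are hypothetical names, and credentials is 
assumed to be a plain mapping of user name to stored password hash.

```python
class AuthConfigError(ValueError):
    """Raised when an edit would leave basic auth without any user."""


def validate_enable(credentials: dict) -> None:
    # Enabling the plugin with no users would lock everyone out.
    if not credentials:
        raise AuthConfigError("basic auth requires at least one user")


def delete_user(credentials: dict, user: str) -> dict:
    # Return a new mapping without `user`, refusing to remove the last one.
    remaining = {name: pw for name, pw in credentials.items() if name != user}
    if user in credentials and not remaining:
        raise AuthConfigError("cannot delete the final user")
    return remaining


users = {"solr": "<hash>", "admin": "<hash>"}
users = delete_user(users, "solr")  # fine: "admin" is still present
```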

> BasicAuth's 'blockUnknown' param should default to true
> ---
>
> Key: SOLR-13649
> URL: https://issues.apache.org/jira/browse/SOLR-13649
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI, Authentication, security
>Affects Versions: 7.7.2, 8.1.1
> Environment: All
>    Reporter: Marcus Eagan
>Priority: Major
>  Labels: Authentication
> Fix For: master (9.0)
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> If someone seeks to enable basic authentication but they do not specify the 
> {{blockUnknown}} parameter, the default value is {{false}}. That default 
> behavior is a bit counterintuitive because if someone wishes to enable basic 
> authentication, you would expect that they would want all unknown users to 
> need to authenticate by default. I can imagine cases where you would not, but 
> those cases would be less frequent.






[jira] [Commented] (SOLR-13649) BasicAuth's 'blockUnknown' param should default to true

2019-08-13 Thread Marcus Eagan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905973#comment-16905973
 ] 

Marcus Eagan commented on SOLR-13649:
-

My apologies. I thought that an error was what you were requesting. I will 
revert that change in the morning. 

> BasicAuth's 'blockUnknown' param should default to true
> ---
>
> Key: SOLR-13649
> URL: https://issues.apache.org/jira/browse/SOLR-13649
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI, Authentication, security
>Affects Versions: 7.7.2, 8.1.1
> Environment: All
>    Reporter: Marcus Eagan
>Priority: Major
>  Labels: Authentication
> Fix For: master (9.0)
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> If someone seeks to enable basic authentication but they do not specify the 
> {{blockUnknown}} parameter, the default value is {{false}}. That default 
> behavior is a bit counterintuitive because if someone wishes to enable basic 
> authentication, you would expect that they would want all unknown users to 
> need to authenticate by default. I can imagine cases where you would not, but 
> those cases would be less frequent.






[jira] [Commented] (SOLR-13649) BasicAuth's 'blockUnknown' param should default to true

2019-08-12 Thread Marcus Eagan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905822#comment-16905822
 ] 

Marcus Eagan commented on SOLR-13649:
-

I've added an error that is raised when the blockUnknown parameter is not set, 
to make it easier for the community to adopt this change upon upgrading.

> BasicAuth's 'blockUnknown' param should default to true
> ---
>
> Key: SOLR-13649
> URL: https://issues.apache.org/jira/browse/SOLR-13649
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI, Authentication, security
>Affects Versions: 7.7.2, 8.1.1
> Environment: All
>    Reporter: Marcus Eagan
>Priority: Major
>  Labels: Authentication
> Fix For: master (9.0)
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> If someone seeks to enable basic authentication but they do not specify the 
> {{blockUnknown}} parameter, the default value is {{false}}. That default 
> behavior is a bit counterintuitive because if someone wishes to enable basic 
> authentication, you would expect that they would want all unknown users to 
> need to authenticate by default. I can imagine cases where you would not, but 
> those cases would be less frequent.






[jira] [Commented] (SOLR-13649) BasicAuth's 'blockUnknown' param should default to true

2019-07-31 Thread Marcus Eagan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897572#comment-16897572
 ] 

Marcus Eagan commented on SOLR-13649:
-

[~noble.paul] That makes sense. It will only be added to 9.0 (the master 
branch, I believe).

> BasicAuth's 'blockUnknown' param should default to true
> ---
>
> Key: SOLR-13649
> URL: https://issues.apache.org/jira/browse/SOLR-13649
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI, Authentication, security
>Affects Versions: 7.7.2, 8.1.1
> Environment: All
>    Reporter: Marcus Eagan
>Priority: Major
>  Labels: Authentication
> Fix For: master (9.0)
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If someone seeks to enable basic authentication but they do not specify the 
> {{blockUnknown}} parameter, the default value is {{false}}. That default 
> behavior is a bit counterintuitive because if someone wishes to enable basic 
> authentication, you would expect that they would want all unknown users to 
> need to authenticate by default. I can imagine cases where you would not, but 
> those cases would be less frequent.






[jira] [Commented] (SOLR-13649) BasicAuth's 'blockUnknown' param should default to true

2019-07-31 Thread Marcus Eagan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897554#comment-16897554
 ] 

Marcus Eagan commented on SOLR-13649:
-

[~noble.paul] Can you explain what's backward incompatible about it so that the 
community has the details?

I've explained above why we need to change it. First, our documentation 
misstates the behavior, starting with the documentation you wrote. Second, the 
default behavior is not intuitive, and a default should not require consulting 
the documentation.

> BasicAuth's 'blockUnknown' param should default to true
> ---
>
> Key: SOLR-13649
> URL: https://issues.apache.org/jira/browse/SOLR-13649
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI, Authentication, security
>Affects Versions: 7.7.2, 8.1.1
> Environment: All
>    Reporter: Marcus Eagan
>Priority: Major
>  Labels: Authentication
> Fix For: master (9.0)
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If someone seeks to enable basic authentication but they do not specify the 
> {{blockUnknown}} parameter, the default value is {{false}}. That default 
> behavior is a bit counterintuitive because if someone wishes to enable basic 
> authentication, you would expect that they would want all unknown users to 
> need to authenticate by default. I can imagine cases where you would not, but 
> those cases would be less frequent.






[jira] [Updated] (SOLR-13649) When Using Basic Authentication, the blockUnknown Value should be True

2019-07-24 Thread Marcus Eagan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eagan updated SOLR-13649:

Fix Version/s: master (9.0)

> When Using Basic Authentication, the blockUnknown Value should be True
> --
>
> Key: SOLR-13649
> URL: https://issues.apache.org/jira/browse/SOLR-13649
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI, Authentication
>Affects Versions: 7.7.2, 8.1.1
> Environment: All
>Reporter: Marcus Eagan
>Priority: Major
>  Labels: Authentication
> Fix For: master (9.0)
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If someone seeks to enable basic authentication but they do not specify the 
> {{blockUnknown}} parameter, the default value is {{false}}. That default 
> behavior is a bit counterintuitive because if someone wishes to enable basic 
> authentication, you would expect that they would want all unknown users to 
> need to authenticate by default. I can imagine cases where you would not, but 
> those cases would be less frequent.






[jira] [Created] (SOLR-13649) When Using Basic Authentication, the blockUnknown Value should be True

2019-07-23 Thread Marcus Eagan (JIRA)
Marcus Eagan created SOLR-13649:
---

 Summary: When Using Basic Authentication, the blockUnknown Value 
should be True
 Key: SOLR-13649
 URL: https://issues.apache.org/jira/browse/SOLR-13649
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Admin UI, Authentication
Affects Versions: 8.1.1, 7.7.2
 Environment: All
Reporter: Marcus Eagan


If someone seeks to enable basic authentication but they do not specify the 
{{blockUnknown}} parameter, the default value is {{false}}. That default 
behavior is a bit counterintuitive because if someone wishes to enable basic 
authentication, you would expect that they would want all unknown users to need 
to authenticate by default. I can imagine cases where you would not, but those 
cases would be less frequent.
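
To make the gap concrete, here is a small Python sketch that flags a 
security.json in which BasicAuth is configured but blockUnknown is left to the 
default. The "authentication" section, the solr.BasicAuthPlugin class name, and 
the blockUnknown and credentials keys follow the documented security.json 
layout; the helper name missing_block_unknown and the credential string are 
placeholders for illustration only.

```python
import json


def missing_block_unknown(security_json: str) -> bool:
    """True if BasicAuth is configured but blockUnknown is left to the default."""
    auth = json.loads(security_json).get("authentication", {})
    return auth.get("class") == "solr.BasicAuthPlugin" and "blockUnknown" not in auth


explicit = json.dumps({
    "authentication": {
        "class": "solr.BasicAuthPlugin",
        "blockUnknown": True,  # set explicitly rather than relying on the default
        "credentials": {"admin": "<hashed-password> <salt>"},  # placeholder value
    }
}, indent=2)
implicit = json.dumps({"authentication": {"class": "solr.BasicAuthPlugin"}})

print(missing_block_unknown(implicit))  # relies on the default
print(missing_block_unknown(explicit))  # states its intent
```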






[jira] [Comment Edited] (SOLR-13537) Build Status Badge in git README

2019-06-23 Thread Marcus Eagan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870715#comment-16870715
 ] 

Marcus Eagan edited comment on SOLR-13537 at 6/24/19 1:02 AM:
--

Build badges won’t be dynamic in a forked repo because they point to builds on 
the Apache master branch.

For pull requests, we only need to configure web hooks in Jenkins to kick off a 
build whenever a PR is opened. We cannot do that until we clean up the tests, 
unfortunately. Otherwise, such an addition would be a nuisance. It’s pretty 
easy but will need to be a bit different due to the nature of this project and 
its test suite.

Still, adding a PR appropriate CI pipeline is an effort for another PR/JIRA 
Issue.


was (Author: marcussorealheis):
Build badges won’t be dynamic in a forked repo because they point to builds on 
the Apache master branch. 

For pull requests, we only need to configure web hooks in Jenkins to kick off a 
build whenever a PR is opened. We cannot do that until we clean up the tests, 
unfortunately. Otherwise, such an addition would be a nuisance. It’s pretty 
easy but will need to be a bit different due to the type of this project and 
its test suite. 

Still,  adding a PR appropriate CI pipeline is an effort for another PR/JIRA 
Issue. 

> Build Status Badge in git README
> 
>
> Key: SOLR-13537
> URL: https://issues.apache.org/jira/browse/SOLR-13537
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build, documentation
>Affects Versions: master (9.0), 8.2
>    Reporter: Marcus Eagan
>Priority: Trivial
> Attachments: Markdown Preview Of Build Status README.png, Simple 
> Artifact Build Badge.png, Simple Artifact Build Badges.png, Single Line 
> Badges.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In order to aid developers and DevOps engineers who are working in a 
> git-driven ecosystem, it would be helpful to see the build status in the 
> README. This is a standard for many open source projects. I think one could 
> debate whether we should have a multi-line build badge visual in the README 
> because people need to know about the builds for various versions and 
> platforms in the case of Lucene/Solr because it is such a large and widely 
> used project, in a variety of environments. The badges not only celebrate 
> that fact, they support its persistence in the future with new developers who 
> look for such information instinctively.
> I would recommend the active build pipelines (currently 8.x and 9.x) for each 
> platform, Linux, Windows, MacOSX, and Solaris.






[jira] [Commented] (SOLR-13537) Build Status Badge in git README

2019-06-23 Thread Marcus Eagan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870715#comment-16870715
 ] 

Marcus Eagan commented on SOLR-13537:
-

Build badges won’t be dynamic in a forked repo because they point to builds on 
the Apache master branch. 

For pull requests, we only need to configure web hooks in Jenkins to kick off a 
build whenever a PR is opened. We cannot do that until we clean up the tests, 
unfortunately. Otherwise, such an addition would be a nuisance. It’s pretty 
easy but will need to be a bit different due to the type of this project and 
its test suite. 

Still,  adding a PR appropriate CI pipeline is an effort for another PR/JIRA 
Issue. 
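
As an illustration of why the badges are static in a fork, here is a Python 
sketch that generates the README badge lines. It assumes the icon URL form used 
by the Jenkins Embeddable Build Status plugin (buildStatus/icon?job=...); the 
host ci.example.org, the job-name pattern, and the helper badge_markdown are 
placeholders, not the project's real Jenkins jobs.

```python
from urllib.parse import quote

JENKINS_URL = "https://ci.example.org"  # placeholder host, not the real server


def badge_markdown(job: str, jenkins_url: str = JENKINS_URL) -> str:
    """Build one Markdown badge that links to the named Jenkins job."""
    encoded = quote(job, safe="")
    icon = f"{jenkins_url}/buildStatus/icon?job={encoded}"
    link = f"{jenkins_url}/job/{encoded}/"
    return f"[![{job}]({icon})]({link})"


# Hypothetical job names: one per active branch and platform.
for branch in ("8.x", "master"):
    for platform in ("Linux", "Windows", "MacOSX", "Solaris"):
        print(badge_markdown(f"Lucene-Solr-{branch}-{platform}"))
```

Because the job name and host are written into the README itself, a fork that 
copies the file keeps showing the upstream build result.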

> Build Status Badge in git README
> 
>
> Key: SOLR-13537
> URL: https://issues.apache.org/jira/browse/SOLR-13537
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build, documentation
>Affects Versions: master (9.0), 8.2
>    Reporter: Marcus Eagan
>Priority: Trivial
> Attachments: Markdown Preview Of Build Status README.png, Simple 
> Artifact Build Badge.png, Simple Artifact Build Badges.png, Single Line 
> Badges.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In order to aid developers and DevOps engineers who are working in a 
> git-driven ecosystem, it would be helpful to see the build status in the 
> README. This is a standard for many open source projects. I think one could 
> debate whether we should have a multi-line build badge visual in the README 
> because people need to know about the builds for various versions and 
> platforms in the case of Lucene/Solr because it is such a large and widely 
> used project, in a variety of environments. The badges not only celebrate 
> that fact, they support its persistence in the future with new developers who 
> look for such information instinctively.
> I would recommend the active build pipelines (currently 8.x and 9.x) for each 
> platform, Linux, Windows, MacOSX, and Solaris.






Re: [jira] [Commented] (SOLR-13537) Build Status Badge in git README

2019-06-23 Thread Marcus Eagan
Build badges won’t be dynamic in a forked repo because they point to builds
on the Apache master branch.

For pull requests, we only need to configure web hooks in Jenkins to kick
off a build whenever a PR is opened. We cannot do that until we clean up
the tests, unfortunately. Otherwise, such an addition would be a nuisance.
It’s pretty easy but will need to be a bit different due to the type of
this project and its test suite.

Still, that’s an effort for another PR.



On Mon, Jun 24, 2019 at 1:18 AM Jan Høydahl (JIRA)  wrote:

>
> [
> https://issues.apache.org/jira/browse/SOLR-13537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870695#comment-16870695
> ]
>
> Jan Høydahl commented on SOLR-13537:
> 
>
> {quote}somebody forks the repo, commits changes to their fork, and the
> badge actually shows whether or not their repository compiles
> {quote}
> I have seen this done in Pull Requests. You can configure various
> validation rules including running tests, that will be required before the
> merge button will work. I.e. if they fork repo, commit some changes and
> file a PR, we should be able to configure something that is run after every
> commit to the PR. Don't know if tools like circleCI will be free for open
> source, but they are capable of running our build and precommit. Yetus does
> this for Jira patch attachments but I have not seen it doing it for PRs?
>
> > Build Status Badge in git README
> > 
> >
> > Key: SOLR-13537
> > URL: https://issues.apache.org/jira/browse/SOLR-13537
> > Project: Solr
> >  Issue Type: Wish
> >  Security Level: Public(Default Security Level. Issues are Public)
> >      Components: Build, documentation
> >Affects Versions: master (9.0), 8.2
> >Reporter: Marcus Eagan
> >Priority: Trivial
> > Attachments: Markdown Preview Of Build Status README.png, Simple
> Artifact Build Badge.png, Simple Artifact Build Badges.png, Single Line
> Badges.png
> >
> >  Time Spent: 0.5h
> >  Remaining Estimate: 0h
> >
> > In order to aid developers and DevOps engineers who are working in a
> git-driven ecosystem, it would be helpful to see the build status in the
> README. This is a standard for many open source projects. I think one could
> debate whether we should have a multi-line build badge visual in the README
> because people need to know about the builds for various versions and
> platforms in the case of Lucene/Solr because it is such a large and widely
> used project, in a variety of environments. The badges not only celebrate
> that fact, they support its persistence in the future with new developers
> who look for such information instinctively.
> > I would recommend the active build pipelines (currently 8.x and 9.x) for
> each platform, Linux, Windows, MacOSX, and Solaris.
>
>
>
>
> --
Marcus Eagan


[jira] [Commented] (SOLR-13537) Build Status Badge in git README

2019-06-21 Thread Marcus Eagan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869929#comment-16869929
 ] 

Marcus Eagan commented on SOLR-13537:
-

[~janhoy] Good idea Jan. I've updated the PR and added screenshot to the JIRA 
issue. 

> Build Status Badge in git README
> 
>
> Key: SOLR-13537
> URL: https://issues.apache.org/jira/browse/SOLR-13537
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build, documentation
>Affects Versions: master (9.0), 8.2
>    Reporter: Marcus Eagan
>Priority: Trivial
> Attachments: Markdown Preview Of Build Status README.png, Simple 
> Artifact Build Badge.png, Simple Artifact Build Badges.png, Single Line 
> Badges.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In order to aid developers and DevOps engineers who are working in a 
> git-driven ecosystem, it would be helpful to see the build status in the 
> README. This is a standard for many open source projects. I think one could 
> debate whether we should have a multi-line build badge visual in the README 
> because people need to know about the builds for various versions and 
> platforms in the case of Lucene/Solr because it is such a large and widely 
> used project, in a variety of environments. The badges not only celebrate 
> that fact, they support its persistence in the future with new developers who 
> look for such information instinctively.
> I would recommend the active build pipelines (currently 8.x and 9.x) for each 
> platform, Linux, Windows, MacOSX, and Solaris.






[jira] [Updated] (SOLR-13537) Build Status Badge in git README

2019-06-21 Thread Marcus Eagan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eagan updated SOLR-13537:

Attachment: Single Line Badges.png

> Build Status Badge in git README
> 
>
> Key: SOLR-13537
> URL: https://issues.apache.org/jira/browse/SOLR-13537
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build, documentation
>Affects Versions: master (9.0), 8.2
>    Reporter: Marcus Eagan
>Priority: Trivial
> Attachments: Markdown Preview Of Build Status README.png, Simple 
> Artifact Build Badge.png, Simple Artifact Build Badges.png, Single Line 
> Badges.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In order to aid developers and DevOps engineers who are working in a 
> git-driven ecosystem, it would be helpful to see the build status in the 
> README. This is a standard for many open source projects. I think one could 
> debate whether we should have a multi-line build badge visual in the README 
> because people need to know about the builds for various versions and 
> platforms in the case of Lucene/Solr because it is such a large and widely 
> used project, in a variety of environments. The badges not only celebrate 
> that fact, they support its persistence in the future with new developers who 
> look for such information instinctively.
> I would recommend the active build pipelines (currently 8.x and 9.x) for each 
> platform, Linux, Windows, MacOSX, and Solaris.






[jira] [Comment Edited] (SOLR-13537) Build Status Badge in git README

2019-06-16 Thread Marcus Eagan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865229#comment-16865229
 ] 

Marcus Eagan edited comment on SOLR-13537 at 6/17/19 1:37 AM:
--

[~elyograg] I was using the Solr artifact/Solr builds so I put it in the Solr 
Jira. I've now added the Lucene artifact. So would it be appropriate for 
either, or default to Lucene?

More committers in the future could be using GitHub a lot more. That is a 
reasonable assumption that I made but it could turn out to be false. 

One of my initial motivations was to inspire others to dig into why things were 
red. I do not think there's any reputation problem at risk here. Many projects 
are often red. 

There have been instances, though infrequently, where Lucene did not compile. 
Those builds would have been red, but usually Solr is green. It's almost always 
green for that task. 

In a perfect world, the tests pass when they should and fail when they should 
not pass.

 


was (Author: marcussorealheis):
[~elyograg] I was using the Solr artifact so I put it in the Solr Jira. I've 
now added the Lucene artifact. So would it be appropriate for either, or 
default to Lucene?

More committers in the future could be using GitHub a lot more. That is a 
reasonable assumption that I made. 

One of my initial motivations was to inspire others to dig into why things were 
red. I do not think there's any reputation problem at risk here. Many projects 
are often red. 

There have been instances, though infrequently, where Lucene did not compile. 
Those builds would have been red, but usually Solr is green. It's almost always 
green for that task.  

Perfect world the tests pass when they should and fail when they should not.

 

> Build Status Badge in git README
> 
>
> Key: SOLR-13537
> URL: https://issues.apache.org/jira/browse/SOLR-13537
> Project: Solr
>  Issue Type: Wish
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build, documentation
>Affects Versions: master (9.0), 8.2
>    Reporter: Marcus Eagan
>Priority: Trivial
> Attachments: Markdown Preview Of Build Status README.png, Simple 
> Artifact Build Badge.png, Simple Artifact Build Badges.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In order to aid developers and DevOps engineers who are working in a 
> git-driven ecosystem, it would be helpful to see the build status in the 
> README. This is a standard for many open source projects. I think one could 
> debate whether we should have a multi-line build badge visual in the README 
> because people need to know about the builds for various versions and 
> platforms in the case of Lucene/Solr because it is such a large and widely 
> used project, in a variety of environments. The badges not only celebrate 
> that fact, they support its persistence in the future with new developers who 
> look for such information instinctively.
> I would recommend the active build pipelines (currently 8.x and 9.x) for each 
> platform, Linux, Windows, MacOSX, and Solaris.






[jira] [Commented] (SOLR-13537) Build Status Badge in git README

2019-06-16 Thread Marcus Eagan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16865229#comment-16865229
 ] 

Marcus Eagan commented on SOLR-13537:
-

[~elyograg] I was using the Solr artifact, so I put it in the Solr Jira. I've 
now added the Lucene artifact. Would it be appropriate for either project, or 
should it default to Lucene?

Committers could be using GitHub a lot more in the future. I think that is a 
reasonable assumption. 

One of my initial motivations was to inspire others to dig into why things were 
red. I do not think there is any reputational risk here. Many projects are 
often red. 

There have been instances, though infrequent, where Lucene did not compile. 
Those builds would have been red, but Solr is usually green. It's almost always 
green for that task.  

In a perfect world, the tests pass when they should and fail when they should not.

 

> Build Status Badge in git README
> 
>
> Key: SOLR-13537
> URL: https://issues.apache.org/jira/browse/SOLR-13537
> Project: Solr
> Issue Type: Wish
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Build, documentation
> Affects Versions: master (9.0), 8.2
> Reporter: Marcus Eagan
> Priority: Trivial
> Attachments: Markdown Preview Of Build Status README.png, Simple 
> Artifact Build Badge.png, Simple Artifact Build Badges.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> To aid developers and DevOps engineers who work in a git-driven ecosystem, 
> it would be helpful to see the build status in the README. This is standard 
> for many open source projects. One could debate whether we should have a 
> multi-line build badge visual in the README: because Lucene/Solr is such a 
> large and widely used project, running in a variety of environments, people 
> need to know about the builds for various versions and platforms. The badges 
> not only celebrate that fact, they also support its persistence in the future 
> among new developers who look for such information instinctively.
> I would recommend the active build pipelines (currently 8.x and 9.x) for each 
> platform: Linux, Windows, MacOSX, and Solaris.






[jira] [Updated] (SOLR-13537) Build Status Badge in git README

2019-06-16 Thread Marcus Eagan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eagan updated SOLR-13537:

Attachment: Simple Artifact Build Badges.png

> Build Status Badge in git README
> 
>
> Key: SOLR-13537
> URL: https://issues.apache.org/jira/browse/SOLR-13537
> Project: Solr
> Issue Type: Wish
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Build, documentation
> Affects Versions: master (9.0), 8.2
> Reporter: Marcus Eagan
> Priority: Trivial
> Attachments: Markdown Preview Of Build Status README.png, Simple 
> Artifact Build Badge.png, Simple Artifact Build Badges.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> To aid developers and DevOps engineers who work in a git-driven ecosystem, 
> it would be helpful to see the build status in the README. This is standard 
> for many open source projects. One could debate whether we should have a 
> multi-line build badge visual in the README: because Lucene/Solr is such a 
> large and widely used project, running in a variety of environments, people 
> need to know about the builds for various versions and platforms. The badges 
> not only celebrate that fact, they also support its persistence in the future 
> among new developers who look for such information instinctively.
> I would recommend the active build pipelines (currently 8.x and 9.x) for each 
> platform: Linux, Windows, MacOSX, and Solaris.






[jira] [Commented] (SOLR-13537) Build Status Badge in git README

2019-06-15 Thread Marcus Eagan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864642#comment-16864642
 ] 

Marcus Eagan commented on SOLR-13537:
-

[~janhoy] I don't know if you saw, but I added a new badge and screenshot using 
a different build rather than the flaky ones.

> Build Status Badge in git README
> 
>
> Key: SOLR-13537
> URL: https://issues.apache.org/jira/browse/SOLR-13537
> Project: Solr
> Issue Type: Wish
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Build, documentation
> Affects Versions: master (9.0), 8.2
> Reporter: Marcus Eagan
> Priority: Trivial
> Attachments: Markdown Preview Of Build Status README.png, Simple 
> Artifact Build Badge.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> To aid developers and DevOps engineers who work in a git-driven ecosystem, 
> it would be helpful to see the build status in the README. This is standard 
> for many open source projects. One could debate whether we should have a 
> multi-line build badge visual in the README: because Lucene/Solr is such a 
> large and widely used project, running in a variety of environments, people 
> need to know about the builds for various versions and platforms. The badges 
> not only celebrate that fact, they also support its persistence in the future 
> among new developers who look for such information instinctively.
> I would recommend the active build pipelines (currently 8.x and 9.x) for each 
> platform: Linux, Windows, MacOSX, and Solaris.






[jira] [Comment Edited] (SOLR-13537) Build Status Badge in git README

2019-06-14 Thread Marcus Eagan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864185#comment-16864185
 ] 

Marcus Eagan edited comment on SOLR-13537 at 6/14/19 3:25 PM:
--

I added a new screenshot to reflect how it would look in the future. I changed 
the build badge based on comments from some committers. 


was (Author: marcussorealheis):
I added a new screenshot to reflect how it would look in the future. I changed 
the build badge based on comments from some members of the Solr team.

> Build Status Badge in git README
> 
>
> Key: SOLR-13537
> URL: https://issues.apache.org/jira/browse/SOLR-13537
> Project: Solr
> Issue Type: Wish
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Build, documentation
> Affects Versions: master (9.0), 8.2
> Reporter: Marcus Eagan
> Priority: Trivial
> Attachments: Markdown Preview Of Build Status README.png, Simple 
> Artifact Build Badge.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> To aid developers and DevOps engineers who work in a git-driven ecosystem, 
> it would be helpful to see the build status in the README. This is standard 
> for many open source projects. One could debate whether we should have a 
> multi-line build badge visual in the README: because Lucene/Solr is such a 
> large and widely used project, running in a variety of environments, people 
> need to know about the builds for various versions and platforms. The badges 
> not only celebrate that fact, they also support its persistence in the future 
> among new developers who look for such information instinctively.
> I would recommend the active build pipelines (currently 8.x and 9.x) for each 
> platform: Linux, Windows, MacOSX, and Solaris.





