Re: gradlew check failure

2022-01-24 Thread Joel Bernstein
I stopped the gradle daemon (./gradlew --stop) and deleted the
gradle.properties. I'm still getting the same error though. My new
gradle.properties has the following:



org.gradle.jvmargs=-Xmx3g \
 --add-exports jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED \
 --add-exports jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED \
 --add-exports jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED \
 --add-exports jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED \
 --add-exports jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED


Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Jan 24, 2022 at 8:57 AM Alan Woodward  wrote:

> AIUI no, it’s a problem when loading gradle’s JVM, but Dawid might have a
> better idea?
>
> On 24 Jan 2022, at 13:50, Mike Drob  wrote:
>
> Is there a way to check for these missing module exports early and fail
> with a more informative message?
>
> On Mon, Jan 24, 2022 at 7:42 AM Alan Woodward 
> wrote:
>
>> Hey Joel,
>>
>> The fix for this is to delete the gradle.properties file in the root
>> directory and stop any daemons before running gradle check again.  The
>> build will regenerate the gradle.properties file with some module exports
>> that work around this problem in the formatter.
>>
>> - A
>>
>> On 24 Jan 2022, at 13:33, Joel Bernstein  wrote:
>>
>> Hi.
>>
>> I'm getting the following gradlew check failure with Java 17 on the
>> lucene main branch:
>>
>> Caused by: java.lang.IllegalAccessError: class
>> com.google.googlejavaformat.java.JavaInput (in unnamed module @0x3d6a6107)
>> cannot access class com.sun.tools.javac.parser.Tokens$TokenKind (in module
>> jdk.compiler) because module jdk.compiler does not export
>> com.sun.tools.javac.parser to unnamed module @0x3d6a6107
>> at
>> com.google.googlejavaformat.java.JavaInput.buildToks(JavaInput.java:349)
>> at
>> com.google.googlejavaformat.java.JavaInput.buildToks(JavaInput.java:334)
>> at
>> com.google.googlejavaformat.java.JavaInput.<init>(JavaInput.java:276)
>> at
>> com.google.googlejavaformat.java.Formatter.getFormatReplacements(Formatter.java:280)
>> at
>> com.google.googlejavaformat.java.Formatter.formatSource(Formatter.java:267)
>> at
>> com.google.googlejavaformat.java.Formatter.formatSource(Formatter.java:233)
>> ... 142 more
>>
>> Is there a step I'm missing in the setup process?
>>
>> Thanks,
>> Joel
>>
>>
>>
>
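
A minimal sketch of the early check Mike asks about, assuming it would run in the same JVM as the formatter; the class and method names below are illustrative, not existing build code:

import java.util.List;

final class ModuleExportCheck {
  private static final List<String> REQUIRED_PACKAGES = List.of(
      "com.sun.tools.javac.api",
      "com.sun.tools.javac.file",
      "com.sun.tools.javac.parser",
      "com.sun.tools.javac.tree",
      "com.sun.tools.javac.util");

  // Fails fast with a pointer to the fix instead of an IllegalAccessError
  // thrown from deep inside google-java-format.
  static void requireJdkCompilerExports() {
    Module unnamed = ModuleExportCheck.class.getClassLoader().getUnnamedModule();
    Module jdkCompiler = ModuleLayer.boot().findModule("jdk.compiler")
        .orElseThrow(() -> new IllegalStateException("jdk.compiler module not found"));
    for (String pkg : REQUIRED_PACKAGES) {
      if (!jdkCompiler.isExported(pkg, unnamed)) {
        throw new IllegalStateException(
            "Missing --add-exports jdk.compiler/" + pkg + "=ALL-UNNAMED; "
                + "delete gradle.properties and stop the gradle daemon so the "
                + "build can regenerate it.");
      }
    }
  }
}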


gradlew check failure

2022-01-24 Thread Joel Bernstein
Hi.

I'm getting the following gradlew check failure with Java 17 on the lucene
main branch:

Caused by: java.lang.IllegalAccessError: class
com.google.googlejavaformat.java.JavaInput (in unnamed module @0x3d6a6107)
cannot access class com.sun.tools.javac.parser.Tokens$TokenKind (in module
jdk.compiler) because module jdk.compiler does not export
com.sun.tools.javac.parser to unnamed module @0x3d6a6107

at com.google.googlejavaformat.java.JavaInput.buildToks(JavaInput.java:349)
at com.google.googlejavaformat.java.JavaInput.buildToks(JavaInput.java:334)
at com.google.googlejavaformat.java.JavaInput.<init>(JavaInput.java:276)
at com.google.googlejavaformat.java.Formatter.getFormatReplacements(Formatter.java:280)
at com.google.googlejavaformat.java.Formatter.formatSource(Formatter.java:267)
at com.google.googlejavaformat.java.Formatter.formatSource(Formatter.java:233)
... 142 more

Is there a step I'm missing in the setup process?

Thanks,
Joel


Re: Filtering before a vector search.

2022-01-19 Thread Joel Bernstein
https://issues.apache.org/jira/browse/LUCENE-10382


Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Jan 19, 2022 at 2:59 PM Joel Bernstein  wrote:

> Ok, I can create the jira.
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Wed, Jan 19, 2022 at 2:49 PM Michael Sokolov 
> wrote:
>
>> +1 we should extend the functionality to support any Bits, not just
>> liveDocs; we need to propose an API. The implementation should not be
>> too hard - we need to intersect the user-supplied Bits with liveDocs
>> and use that to filter.
>>
>> On Wed, Jan 19, 2022 at 1:42 PM Joel Bernstein 
>> wrote:
>> >
>> > Hi,
>> >
>> > Thanks for all the work on the vector search!
>> >
>> > I was wondering if there was a way using KnnVectorQuery to filter the
>> docs this query looks at. Right now the searchLeaf method passes in the
>> liveDocs to LeafReader.searchNearestVectors, but there appears to be no way
>> to have the KnnVectorQuery operate on a subset of liveDocs.
>> >
>> > Thanks,
>> >
>> > Joel Bernstein
>> > http://joelsolr.blogspot.com/
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
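
A minimal sketch of the intersection Michael describes, assuming the caller already has a per-segment Bits filter to combine with the segment's liveDocs; the class name is illustrative, not an existing Lucene API:

import org.apache.lucene.util.Bits;

final class FilteredLiveDocs implements Bits {
  private final Bits liveDocs;  // may be null when the segment has no deletions
  private final Bits filter;    // user-supplied acceptance bits for this segment

  FilteredLiveDocs(Bits liveDocs, Bits filter) {
    this.liveDocs = liveDocs;
    this.filter = filter;
  }

  @Override
  public boolean get(int docId) {
    // A doc is accepted only if it is live AND matches the user filter.
    return (liveDocs == null || liveDocs.get(docId)) && filter.get(docId);
  }

  @Override
  public int length() {
    return filter.length();
  }
}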


Re: Filtering before a vector search.

2022-01-19 Thread Joel Bernstein
Ok, I can create the jira.



Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Jan 19, 2022 at 2:49 PM Michael Sokolov  wrote:

> +1 we should extend the functionality to support any Bits, not just
> liveDocs; we need to propose an API. The implementation should not be
> too hard - we need to intersect the user-supplied Bits with liveDocs
> and use that to filter.
>
> On Wed, Jan 19, 2022 at 1:42 PM Joel Bernstein  wrote:
> >
> > Hi,
> >
> > Thanks for all the work on the vector search!
> >
> > I was wondering if there was a way using KnnVectorQuery to filter the
> docs this query looks at. Right now the searchLeaf method passes in the
> liveDocs to LeafReader.searchNearestVectors, but there appears to be no way
> to have the KnnVectorQuery operate on a subset of liveDocs.
> >
> > Thanks,
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Filtering before a vector search.

2022-01-19 Thread Joel Bernstein
Hi,

Thanks for all the work on the vector search!

I was wondering if there was a way using KnnVectorQuery to filter the docs
this query looks at. Right now the searchLeaf method passes in the liveDocs
to LeafReader.searchNearestVectors, but there appears to be no way to have
the KnnVectorQuery operate on a subset of liveDocs.

Thanks,

Joel Bernstein
http://joelsolr.blogspot.com/


Re: Slow DV equivalent of TermInSetQuery

2021-10-26 Thread Joel Bernstein
There are times, particularly in ecommerce and access control, where speed
really matters. So, you build stuff that's really fast at query time, with
a tradeoff at commit time.


Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Oct 26, 2021 at 5:31 PM Robert Muir  wrote:

> Sorry, I don't think there is a need to use any top-level ordinals.
> none of these docvalues-based query implementations need it.
>
> As far as query intersecting an input-stream, that is a big no-go.
> Lucene Queries need to have correct hashcode/equals/etc.
>
> That's why current stuff around this such as TermInSetQuery encode
> everything into a PrefixCodedTerms.
>
> On Tue, Oct 26, 2021 at 4:57 PM Joel Bernstein  wrote:
> >
> > One more wrinkle for extremely large lists, is pass the list in as an
> InputStream which is a presorted binary representation of the ASIN's and
> slide a BytesRef across the stream and merge it with the SortedDocValues.
> This saves on all the object creation and String overhead for really long
> lists of id's.
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> >
> > On Tue, Oct 26, 2021 at 4:50 PM Joel Bernstein 
> wrote:
> >>
> >> If the list of ASIN's is presorted you can quickly merge it with the
> SortedDocValues and produce a FixedBitSet of the top level ordinals, which
> can be used as the post filter. This is a nice approach for things like
> passing in a long list of access control predicates.
> >>
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >>
> >> On Tue, Oct 26, 2021 at 3:52 PM Adrien Grand  wrote:
> >>>
> >>> I opened https://issues.apache.org/jira/browse/LUCENE-10207 about
> these ideas.
> >>>
> >>> On Tue, Oct 26, 2021 at 7:52 PM Robert Muir  wrote:
> >>>>
> >>>> On Tue, Oct 26, 2021 at 1:37 PM Adrien Grand 
> wrote:
> >>>> >
> >>>> > > And then we could make an IndexOrDocValuesQuery with both the
> TermInSetQuery and this SDV.newSlowInSetQuery?
> >>>> >
> >>>> > Unfortunately IndexOrDocValuesQuery relies on the fact that the
> "index" query can evaluate its cost (ScorerSupplier#cost) without doing
> anything costly, which isn't the case for TermInSetQuery.
> >>>> >
> >>>> > So we'd need to make some changes. Estimating the cost of a
> TermInSetQuery in general without seeking the terms is a hard problem, but
> maybe we could specialize the unique key case to return the number of terms
> as the cost?
> >>>>
> >>>> Yes we know each term in terms dict only has a single document, when
> >>>> terms.size() == terms.getSumDocFreq(): there's only one posting for
> >>>> each term.
> >>>> But we can probably generalize a cost estimation a bit more, just
> >>>> based on these two stats?
> >>>>
> >>>> -
> >>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >>>> For additional commands, e-mail: dev-h...@lucene.apache.org
> >>>>
> >>>
> >>>
> >>> --
> >>> Adrien
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>
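
A rough sketch of the cost estimate discussed above, assuming only the two per-segment stats Robert mentions (terms.size() and terms.getSumDocFreq()) plus the number of query terms; the class and method names are illustrative:

import java.io.IOException;
import org.apache.lucene.index.Terms;

final class TermInSetCostEstimate {
  // Average postings per term times the number of query terms, computed
  // without seeking the terms dictionary.
  static long estimate(Terms terms, long numQueryTerms) throws IOException {
    long termCount = terms.size();           // -1 if the codec cannot report it
    long sumDocFreq = terms.getSumDocFreq();
    if (termCount <= 0) {
      return sumDocFreq;                     // fall back to the worst case
    }
    // Exactly 1.0 for a unique-key field, where termCount == sumDocFreq.
    double avgPostingsPerTerm = (double) sumDocFreq / termCount;
    return (long) Math.ceil(avgPostingsPerTerm * numQueryTerms);
  }
}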


Re: Slow DV equivalent of TermInSetQuery

2021-10-26 Thread Joel Bernstein
One more wrinkle for extremely large lists, is pass the list in as an
InputStream which is a presorted binary representation of the ASIN's and
slide a BytesRef across the stream and merge it with the SortedDocValues.
This saves on all the object creation and String overhead for really long
lists of id's.

Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Oct 26, 2021 at 4:50 PM Joel Bernstein  wrote:

> If the list of ASIN's is presorted you can quickly merge it with the
> SortedDocValues and produce a FixedBitSet of the top level ordinals, which
> can be used as the post filter. This is a nice approach for things like
> passing in a long list of access control predicates.
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Tue, Oct 26, 2021 at 3:52 PM Adrien Grand  wrote:
>
>> I opened https://issues.apache.org/jira/browse/LUCENE-10207 about these
>> ideas.
>>
>> On Tue, Oct 26, 2021 at 7:52 PM Robert Muir  wrote:
>>
>>> On Tue, Oct 26, 2021 at 1:37 PM Adrien Grand  wrote:
>>> >
>>> > > And then we could make an IndexOrDocValuesQuery with both the
>>> TermInSetQuery and this SDV.newSlowInSetQuery?
>>> >
>>> > Unfortunately IndexOrDocValuesQuery relies on the fact that the
>>> "index" query can evaluate its cost (ScorerSupplier#cost) without doing
>>> anything costly, which isn't the case for TermInSetQuery.
>>> >
>>> > So we'd need to make some changes. Estimating the cost of a
>>> TermInSetQuery in general without seeking the terms is a hard problem, but
>>> maybe we could specialize the unique key case to return the number of terms
>>> as the cost?
>>>
>>> Yes we know each term in terms dict only has a single document, when
>>> terms.size() == terms.getSumDocFreq(): there's only one posting for
>>> each term.
>>> But we can probably generalize a cost estimation a bit more, just
>>> based on these two stats?
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>>
>> --
>> Adrien
>>
>


Re: Slow DV equivalent of TermInSetQuery

2021-10-26 Thread Joel Bernstein
If the list of ASIN's is presorted you can quickly merge it with the
SortedDocValues and produce a FixedBitSet of the top level ordinals, which
can be used as the post filter. This is a nice approach for things like
passing in a long list of access control predicates.


Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Oct 26, 2021 at 3:52 PM Adrien Grand  wrote:

> I opened https://issues.apache.org/jira/browse/LUCENE-10207 about these
> ideas.
>
> On Tue, Oct 26, 2021 at 7:52 PM Robert Muir  wrote:
>
>> On Tue, Oct 26, 2021 at 1:37 PM Adrien Grand  wrote:
>> >
>> > > And then we could make an IndexOrDocValuesQuery with both the
>> TermInSetQuery and this SDV.newSlowInSetQuery?
>> >
>> > Unfortunately IndexOrDocValuesQuery relies on the fact that the "index"
>> query can evaluate its cost (ScorerSupplier#cost) without doing anything
>> costly, which isn't the case for TermInSetQuery.
>> >
>> > So we'd need to make some changes. Estimating the cost of a
>> TermInSetQuery in general without seeking the terms is a hard problem, but
>> maybe we could specialize the unique key case to return the number of terms
>> as the cost?
>>
>> Yes we know each term in terms dict only has a single document, when
>> terms.size() == terms.getSumDocFreq(): there's only one posting for
>> each term.
>> But we can probably generalize a cost estimation a bit more, just
>> based on these two stats?
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>
> --
> Adrien
>
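
A minimal sketch of the ordinal-marking step Joel describes, shown per segment for simplicity (Robert's point above is that top-level ordinals are not required). A real implementation would walk the TermsEnum and the presorted key list in lock-step rather than doing a per-key binary search; the names here are illustrative:

import java.io.IOException;
import java.util.List;
import org.apache.lucene.index.SortedDocValues;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.FixedBitSet;

final class PresortedKeyFilter {
  // Marks the ordinal of every key that exists in the doc values dictionary;
  // the resulting bit set can then back a post filter.
  static FixedBitSet matchingOrds(SortedDocValues dv, List<BytesRef> sortedKeys)
      throws IOException {
    FixedBitSet ords = new FixedBitSet(dv.getValueCount());
    for (BytesRef key : sortedKeys) {
      int ord = dv.lookupTerm(key);  // binary search in the dictionary
      if (ord >= 0) {
        ords.set(ord);
      }
    }
    return ords;
  }
}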


Re: Welcome Michael Gibney as Lucene committer

2021-10-12 Thread Joel Bernstein
Welcome, Michael!

Joel Bernstein
http://joelsolr.blogspot.com/


On Sun, Oct 10, 2021 at 2:41 AM Atri Sharma  wrote:

> Welcome, Michael!
>
> On Thu, 7 Oct 2021, 23:29 Michael Sokolov,  wrote:
>
>> Welcome, Michael!
>>
>> On Wed, Oct 6, 2021 at 9:34 AM Dawid Weiss  wrote:
>> >
>> > Hello everyone!
>> >
>> > Please welcome Michael Gibney as the latest Lucene committer. Michael
>> > - it's a tradition for you to introduce yourself, even if we've been
>> > seeing you for quite a while! :)
>> >
>> > Dawid
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>> >
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>


Re: Lucene and Solr repositories mirrored, main branch ready

2021-03-10 Thread Joel Bernstein
Just tested out the main branch of the new repo, packaged, started, loaded
data, searched from the UI. All looks great. Very exciting! Thanks Dawid
for all your work on this!


Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Mar 10, 2021 at 10:14 AM Atri Sharma  wrote:

> Totally agreed. They have really driven this to completion with as minimal
> disruption as possible.
>
> Special mention to Uwe, as always!
>
> On Wed, 10 Mar 2021, 20:32 David Smiley,  wrote:
>
>> Thank *you* Dawid!  You and Jan have been big heroes of this transition!
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Wed, Mar 10, 2021 at 9:36 AM Dawid Weiss 
>> wrote:
>>
>>> Thank you everyone for the collective effort to clean up stale project
>>> references, templates, etc.
>>>
>>> D.
>>>
>>> On Wed, Mar 10, 2021 at 1:04 PM Dawid Weiss 
>>> wrote:
>>> >
>>> > First of all, apologies for the e-mail commit bomb... Things like that
>>> > can happen, hard to tell in advance. Thanks to infra for helping out.
>>> >
>>> > Solr and Lucene repositories have been cloned at commit 7ada403218.
>>> >
>>> > Master branch is wiped out of content on all repositories, branch_8x
>>> > is wiped on lucene and solr repositories to avoid confusion (8x
>>> > development takes place at the joint repository).
>>> >
>>> > I've removed lucene/solr from each other. Things should work out of
>>> > the box but if something does not, please file an issue (or better -
>>> > try to fix it).
>>> >
>>> > There is going to be a lot of mundane cleanup work to remove cross
>>> > references and get the documentation going but it's all a follow-up.
>>> >
>>> > Here is a short help guide to port existing PRs:
>>> > https://github.com/apache/lucene-solr/blob/master/PRs.md
>>> >
>>> > Github actions should work too, as shown here:
>>> > https://github.com/apache/lucene/pull/2
>>> >
>>> > Builds can be enabled (perhaps slowly, at first? :).
>>> >
>>> > Solr developers: Lucene can be built and installed in your local maven
>>> > repositories with:
>>> > gradlew mavenToLocalRepo
>>> >
>>> > Dawid
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>


Re: Some small questions on streaming expressions

2021-03-01 Thread Joel Bernstein
Your first example looks like a bug to me. This may be a workaround for you:

select(echo("Hello"),
  echo as blah,
  lower(echo) as blah1)

Returns:

{ "result-set": { "docs": [ { "blah": "Hello", "blah1": "hello" }, { "EOF":
true, "RESPONSE_TIME": 0 } ] } }

The string manipulation function works properly but the straight
mapping does not.

Your second question: can we split a stream's output into two streams?
Currently only the let expression does this, I believe.

But, to your third question, the let expression does not stream, it's all
in memory. The let expression is designed for vector math over samples or
aggregations (time series).

So, right now I don't think we have a way to split a stream and operate
over it with a different set of streams.

Joel Bernstein
http://joelsolr.blogspot.com/


On Sat, Feb 27, 2021 at 4:42 PM ufuk yılmaz 
wrote:

> Hello all,
>
>
>
> I’m trying to reindex from a collection to a new collection with a
> different schema, using streaming expressions. I can’t use
> REINDEXCOLLECTION directly, because I need to process documents a bit.
>
>
>
> I couldn’t figure out 3 simple, related things for hours so forgive me if
> I just ask.
>
>
>
>1. Is there a way to duplicate the value of a field of an incoming
>tuple into two fields?
>
> I tried the select expression:
>
> select(
>
>   echo("Hello"),
>
>   echo as echy, echo as echee
>
> )
>
>
>
> But when I use the same field twice, only the last “as” takes effect, it
> doesn’t copy the value to two fields:
>
> {
>
>   "result-set": {
>
> "docs": [
>
>   {
>
> "echee": "Hello"
>
>   },
>
>   {
>
> "EOF": true,
>
> "RESPONSE_TIME": 0
>
>   }
>
> ]
>
>   }
>
> }
>
>
>
> I accomplished this by using leftOuterJoin, with the same exact stream on the left
> and right, joining on itself with different field names. But this has the
> penalty of executing the same stream twice. It’s no problem for small
> streams, but in my case there will be a couple hundred million tuples coming
> from the stream.
>
>
>
>
>
>    2. Is there a way to “feed” one stream’s output to two different
>    streams? Like feeding the output of a stream source to two different stream
>    decorators without executing the same stream twice?
>    3. Does the “let” stream hold its entire content in memory when a
>    stream is assigned to a variable, or does it stream continuously too? If
>    not, I imagine it can be used for my question 2.
>
>
>
>
>
> I’m glad that Solr has streaming expressions.
>
>
>
> --ufuk yilmaz
>
>
>
> Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for
> Windows 10
>
>
>


Re: Select streaming expression, add a field to every tuple, replaceor raw not working

2021-02-27 Thread Joel Bernstein
Yeah, this is an error in the docs which needs to be corrected as this is a
common use case. The val function is the one to use. I will make the change
in the docs.



Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Feb 26, 2021 at 12:28 PM ufuk yılmaz 
wrote:

> I tried to debug this to the best of my ability, and it seems the correct
> name for the “raw” evaluator is “val”.
>
>
>
> Copied from StreamContext: val=class
> org.apache.solr.client.solrj.io.eval.RawValueEvaluator
>
>
>
> I think there’s a small error in stream evaluator documentation of 8.4
>
>
>
> https://lucene.apache.org/solr/guide/8_4/stream-evaluator-reference.html
>
>
>
> When I used “val” instead of “raw”, I got the expected response:
>
>
>
> select(
>
> search(
>
> myCollection,
>
> q="*:*",
>
> qt="/export",
>
> sort="id_str asc",
>
> fl="id_str"
>
> ),
>
> id_str,
>
> val(abc) as text
>
> )
>
>
>
> {
>
>   "result-set": {
>
> "docs": [
>
>   {
>
> "id_str": "deneme123",
>
> "text": "abc"
>
>   },
>
>   {
>
> "EOF": true,
>
> "RESPONSE_TIME": 70
>
>   }
>
> ]
>
>   }
>
> }
>
>
>
> --ufuk yilmaz
>
>
>
>
>
> Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for
> Windows 10
>
>
>
> *From: *ufuk yılmaz 
> *Sent: *26 February 2021 16:38
> *To: *solr-u...@lucene.apache.org
> *Subject: *Select streaming expression, add a field to every tuple,
> replaceor raw not working
>
>
>
> Hello all,
>
>
>
> Solr version 8.4
>
>
>
> I have a very simple select expression here. What I’m trying to do is to
> add a constant value to incoming tuples.
>
>
>
> My collection has only 1 document. Id_str is of type String. Other fields
> are Solr generated.
>
>
>
> {
>
> "_version_":1692761378187640832,
>
> "id_str":"experiment123",
>
> "id":"18d658b13b6b072f"}]
>
>   }
>
>
>
> My streaming expression:
>
>
>
> select(
>
> search(
>
> myCollection,
>
> q="*:*",
>
> qt="/export",
>
> sort="id_str asc",
>
> fl="id_str"
>
> ),
>
> id_str,
>
> raw(ttt) as text // Docs state that select
> works with any evaluator. “raw” here is a stream evaluator.
>
> )
>
>
>
> I also tried:
>
>
>
> select(
>
> search(
>
> myCollection,
>
> q="*:*",
>
> qt="/export",
>
> sort="id_str asc",
>
> fl="id_str"
>
> ),
>
> id_str,
>
> replace(text, null, withValue=raw(ttt)) as
> text //replace is described in select expression documentation. I also
> tried withValue=ttt directly
>
> )
>
>
>
> No matter what I do, response only includes id_str field, without any
> error:
>
>
>
> {
>
>   "result-set":{
>
> "docs":[{
>
> "id_str":" experiment123"}
>
>   ,{
>
> "EOF":true,
>
> "RESPONSE_TIME":45}]}}
>
>
>
> I also tried wrapping the text value in quotes; that didn’t work either.
>
>
>
> What am I doing wrong?
>
>
>
> --ufuk yilmaz
>
>
>
> Sent from Mail for Windows 10
>
>
>
>
>


Re: Congratulations to the new Lucene PMC Chair, Michael Sokolov!

2021-02-18 Thread Joel Bernstein
Congratulations Michael!



Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Feb 18, 2021 at 2:46 PM Houston Putman 
wrote:

> Congrats Michael!
>
> On Thu, Feb 18, 2021 at 11:52 AM Martin Gainty 
> wrote:
>
>> Congratulations!
>>
>>
>> --
>> *From:* Julie Tibshirani 
>> *Sent:* Wednesday, February 17, 2021 9:13 PM
>> *To:* Lucene Dev 
>> *Subject:* Re: Congratulations to the new Lucene PMC Chair, Michael
>> Sokolov!
>>
>> Congratulations Mike!!
>>
>> On Wed, Feb 17, 2021 at 3:12 PM Gus Heck  wrote:
>>
>> Congratulations :)
>>
>> On Wed, Feb 17, 2021 at 5:42 PM Tomás Fernández Löbbe <
>> tomasflo...@gmail.com> wrote:
>>
>> Congratulations Mike!
>>
>> On Wed, Feb 17, 2021 at 2:42 PM Steve Rowe  wrote:
>>
>> Congrats Mike!
>>
>> --
>> Steve
>>
>> > On Feb 17, 2021, at 4:31 PM, Anshum Gupta 
>> wrote:
>> >
>> > Every year, the Lucene PMC rotates the Lucene PMC chair and Apache Vice
>> President position.
>> >
>> > This year we nominated and elected Michael Sokolov as the Chair, a
>> decision that the board approved in its February 2021 meeting.
>> >
>> > Congratulations, Mike!
>> >
>> > --
>> > Anshum Gupta
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>>
>> --
>> http://www.needhamsoftware.com (work)
>> http://www.the111shift.com (play)
>>
>>


Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!

2021-02-18 Thread Joel Bernstein
Congratulations Jan!


Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Feb 18, 2021 at 2:37 PM Atri Sharma  wrote:

> Congratulations!!
>
> On Fri, 19 Feb 2021, 00:26 Anshum Gupta,  wrote:
>
>> Hi everyone,
>>
>> I’d like to inform everyone that the newly formed Apache Solr PMC
>> nominated and elected Jan Høydahl for the position of the Solr PMC Chair
>> and Vice President. This decision was approved by the board in its February
>> 2021 meeting.
>>
>> Congratulations Jan!
>>
>> --
>> Anshum Gupta
>>
>


PR without jira ticket

2021-02-07 Thread Joel Bernstein
This PR https://github.com/apache/lucene-solr/pull/1600 doesn't have a jira
attached to it, and it was not a small change. While the PR does have
lots of information about the decisions made, we still need jira tickets and
commits to point to jira tickets. Was there a jira ticket created for this
PR?



Joel Bernstein
http://joelsolr.blogspot.com/


Re: Is it Time to Deprecate the Legacy Facets API

2021-01-27 Thread Joel Bernstein
It's worth investigating deprecating the stats component also. I believe
JSON facets covers that functionality as well. It will be painful for users
to switch over, though, unfortunately.


Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Jan 22, 2021 at 1:14 PM Jason Gerlowski 
wrote:

> Personally I'd love to see us stop maintaining the duplicated code of
> the underlying implementations.  I wouldn't mind losing the legacy
> syntax as well - I'll take a clear, verbose API over a less-clear,
> concise one any day.  But I'm probably a minority there.
>
> Either way I agree with Michael when he said above that the first step
> would have to be a parity investigation for features and performance.
>
> Best,
>
> Jason
>
> On Fri, Jan 22, 2021 at 10:05 AM Michael Gibney
>  wrote:
> >
> > I agree it would make long-term sense to consolidate the backend
> implementation. I think leaving the "classic" user-facing facet API (with
> JSON Facet module as a backend) would be a good idea. Either way, I think a
> first step would be checking for parity between existing backend
> implementations -- possibly in terms of features [1], but certainly in
> terms of performance for common use cases [2].
> >
> > I think removal of the "classic" user-facing API would cause a lot of
> consternation in the user community. I can even see a
> non-backward-compatibility argument for preserving the "classic"
> user-facing API: it's simpler for simple use cases. _If_ the ultimate goal
> is removal of the "classic" user-facing API (not presuming that it is),
> that approach could be facilitated in the short term by enticing users
> towards "JSON Facet" API ... basically with a "feature freeze" on the
> legacy implementation. No new features [3], no new optimizations [4] for
> "classic"; concentrate such efforts on JSON Facet. This seems to already be
> the de facto case, but it could be a more intentional decision -- e.g. in
> [3] it's straightforward to extend the proposed "facet cache" to the
> "classic" impl ... but I could see an argument for intentionally not doing
> so.
> >
> > Robert, I think your concerns about UninvertedField could be addressed
> by the `uninvertible="false"` property (currently defaults to "true" for
> backward compatibility iiuc; but could default to "false", or at least
> provide the ability to set the default for all fields to "false" at node
> level solr.xml? -- I know I've wished for the latter!). Also fwiw I'm not
> aware of any JSON Facet processors that work with string values in RAM ...
> I do think all JSON Facet processors use OrdinalMap now, where relevant.
> >
> > [1] https://issues.apache.org/jira/browse/SOLR-14921
> > [2] https://issues.apache.org/jira/browse/SOLR-14764
> > [3] https://issues.apache.org/jira/browse/SOLR-13807
> > [4] https://issues.apache.org/jira/browse/SOLR-10732
> >
> > On Fri, Jan 22, 2021 at 12:46 AM Robert Muir  wrote:
> >>
> >> Do these two options conflate concerns of input format vs. actual
> >> algorithm? That was always my disappointment.
> >>
> >> I feel like the java apis are off here at the lower level, and it
> >> hurts the user.
> >> I don't talk about the input format from the user, instead I mean the
> >> execution of the faceting query.
> >>
> >> IMO: building top-level caches (e.g. uninvertedfield) or
> >> on-the-fly-caches (e.g. fieldcache) is totally trappy already.
> >> But with the uninvertedfield of json facets it does its own thing,
> >> even if you went thru the trouble to enable docvalues at index time:
> >> that's sad.
> >>
> >> the code by default should not give the user jvm
> >> heap/garbage-collector hell. If you want to do that to yourself, for a
> >> totally static index, IMO that should be opt-in.
> >>
> >> But for the record, it is no longer just two shitty choices like
> >> "top-level vs per-segment". There are different field types, e.g.
> >> numeric types where the per-segment approach works efficiently.
> >> Then you have the strings, but there is a newish middle ground for
> >> Strings: OrdinalMap (lucene Multi* interfaces do it) which builds
> >> top-level integers structures to speed up string-faceting, but doesnt
> >> need *string values* in ram.
> >> It is just integers and mostly compresses as deltas. Adrien compresses
> >> the shit out of it.
> >>
> >> So I'd hate for the user to lose the option here of using docvalues to
> >>

Re: Failing gradle precommits

2021-01-08 Thread Joel Bernstein
It turned out to be this while I merged branches:

warning: inexact rename detection was skipped due to too many files.

warning: you may want to set your merge.renamelimit variable to at least
1639 and retry the command.

Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Jan 8, 2021 at 11:16 AM Joel Bernstein  wrote:

> Thanks Eric, I'll do a fresh clone, something must be out of whack with my
> local repo.
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Fri, Jan 8, 2021 at 10:55 AM Eric Pugh 
> wrote:
>
>> It ran for me just fine.   I *think* you may not be up to date, as
>> dataimporthandler/ is no longer in master!
>>
>>
>> On Jan 8, 2021, at 10:08 AM, Joel Bernstein  wrote:
>>
>> I'm getting failing gradle precommits in master:
>>
>> *> Task :solr:contrib:validateSourcePatterns* FAILED
>> tabs instead spaces:
>> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestDocBuilder.xml
>> tabs instead spaces:
>> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestSolrEntityProcessorEndToEnd.xml
>> tabs instead spaces:
>> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestErrorHandling.xml
>> tabs instead spaces:
>> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestScriptTransformer.xml
>> tabs instead spaces:
>> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestSqlEntityProcessor.xml
>> tabs instead spaces:
>> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestDocBuilder2.xml
>> tabs instead spaces:
>> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestZKPropertiesWriter.xml
>> tabs instead spaces:
>> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler-extras/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestTikaEntityProcessor.xml
>>
>> FAILURE: Build failed with an exception.
>>
>> * Where:
>> Script
>> '/Users/joelbernstein/committer/lucene-solr/gradle/validation/validate-source-patterns.gradle'
>> line: 324
>>
>> * What went wrong:
>> Execution failed for task ':solr:contrib:validateSourcePatterns'.
>> > Found 8 violations in source files (tabs instead spaces).
>>
>>
>> Are others seeing this as well? I'm not seeing Jenkins emails about this.
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>> ___
>> *Eric Pugh **| *Founder & CEO | OpenSource Connections, LLC | 434.466.1467
>> | http://www.opensourceconnections.com | My Free/Busy
>> <http://tinyurl.com/eric-cal>
>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>> This e-mail and all contents, including attachments, is considered to be
>> Company Confidential unless explicitly stated otherwise, regardless
>> of whether attachments are marked as such.
>>
>>


Re: Failing gradle precommits

2021-01-08 Thread Joel Bernstein
Thanks Eric, I'll do a fresh clone, something must be out of whack with my
local repo.


Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Jan 8, 2021 at 10:55 AM Eric Pugh 
wrote:

> It ran for me just fine.   I *think* you may not be up to date, as
> dataimporthandler/ is no longer in master!
>
>
> On Jan 8, 2021, at 10:08 AM, Joel Bernstein  wrote:
>
> I'm getting failing gradle precommits in master:
>
> *> Task :solr:contrib:validateSourcePatterns* FAILED
> tabs instead spaces:
> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestDocBuilder.xml
> tabs instead spaces:
> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestSolrEntityProcessorEndToEnd.xml
> tabs instead spaces:
> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestErrorHandling.xml
> tabs instead spaces:
> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestScriptTransformer.xml
> tabs instead spaces:
> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestSqlEntityProcessor.xml
> tabs instead spaces:
> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestDocBuilder2.xml
> tabs instead spaces:
> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestZKPropertiesWriter.xml
> tabs instead spaces:
> /Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler-extras/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestTikaEntityProcessor.xml
>
> FAILURE: Build failed with an exception.
>
> * Where:
> Script
> '/Users/joelbernstein/committer/lucene-solr/gradle/validation/validate-source-patterns.gradle'
> line: 324
>
> * What went wrong:
> Execution failed for task ':solr:contrib:validateSourcePatterns'.
> > Found 8 violations in source files (tabs instead spaces).
>
>
> Are others seeing this as well? I'm not seeing Jenkins emails about this.
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> ___
> *Eric Pugh **| *Founder & CEO | OpenSource Connections, LLC | 434.466.1467
> | http://www.opensourceconnections.com | My Free/Busy
> <http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
>
>


Failing gradle precommits

2021-01-08 Thread Joel Bernstein
I'm getting failing gradle precommits in master:

*> Task :solr:contrib:validateSourcePatterns* FAILED

tabs instead spaces:
/Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestDocBuilder.xml

tabs instead spaces:
/Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestSolrEntityProcessorEndToEnd.xml

tabs instead spaces:
/Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestErrorHandling.xml

tabs instead spaces:
/Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestScriptTransformer.xml

tabs instead spaces:
/Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestSqlEntityProcessor.xml

tabs instead spaces:
/Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestDocBuilder2.xml

tabs instead spaces:
/Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestZKPropertiesWriter.xml

tabs instead spaces:
/Users/joelbernstein/committer/lucene-solr/solr/contrib/dataimporthandler-extras/build/test-results/test/TEST-org.apache.solr.handler.dataimport.TestTikaEntityProcessor.xml


FAILURE: Build failed with an exception.


* Where:

Script
'/Users/joelbernstein/committer/lucene-solr/gradle/validation/validate-source-patterns.gradle'
line: 324


* What went wrong:

Execution failed for task ':solr:contrib:validateSourcePatterns'.

> Found 8 violations in source files (tabs instead spaces).


Are others seeing this as well? I'm not seeing Jenkins emails about this.


Joel Bernstein
http://joelsolr.blogspot.com/


Re: LeafReaderContext ord is unexpectedly 0

2020-12-28 Thread Joel Bernstein
That's exactly how this problem occurred. I'll make sure this is fixed
before merging into the codebase.


Joel Bernstein
http://joelsolr.blogspot.com/


On Sun, Dec 27, 2020 at 6:12 PM Uwe Schindler  wrote:

> Hi,
>
>
>
> just to add: Any public query API (weight, query, DocIdSetIterators,…)
> should always take LeafReaderContext as parameter. If you have some solr
> plugin that maybe implements some method only taking LeafReader, this one
> lost context and it’s impossible to restore from that. So if sending
> IndexReader instances around (no matter what type), always use
> ReaderContexts, especially in public APIs.
>
>
>
> Uwe
>
>
>
> -
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> https://www.thetaphi.de
>
> eMail: u...@thetaphi.de
>
>
>
> *From:* Joel Bernstein 
> *Sent:* Sunday, December 27, 2020 7:36 PM
> *To:* lucene dev 
> *Subject:* Re: LeafReaderContext ord is unexpectedly 0
>
>
>
> Ok this makes sense. I suspect I never ran across this before because I
> always accessed the ord through the context before getting the reader.
>
>
>
>
>
> Joel Bernstein
>
> http://joelsolr.blogspot.com/
>
>
>
>
>
> On Sun, Dec 27, 2020 at 1:10 PM Uwe Schindler  wrote:
>
> Hi,
>
>
>
> that behaviour is fully correct and was always like that. Just for info (I
> had some slides on berlinbuzzwords like 8.5 years ago):
>
> https://youtu.be/iZZ1AbJ6dik?t=1975
>
>
>
> The problem is a classical “wrong point of view” problem!
>
>
>
> IndexReaders and their subclasses have no idea about their neighbours or
> parents, they can always be used on their own. They can also be in multiple
> contexts (), like a LeafReader (in that talk we used AtomicReader) is
> part of a DirectoryReader but at same time somebody else has constructed
> another composite reader  with LeafReaders from totally different
> directories (e.g., when merging different indexes together). So in short: A
> reader does not know anything about its own “where I am”.
>
>
>
> The method getContext() is only there as a helper method (it’s a bit
> misnomed), to create a **new** context that describes this reader as the
> only one in it, so inside this new context it has an ord of 0.
>
>
>
> The problem in your code is: you dive down through the correct context
> from top-level (the top context is from the point of view of the
> SolrSearcher), but then you leave this hierarchy by calling reader(). At
> that point you lost context information. After that you get a new context
> and this one returns 0, because its no longer form SolrIndexSearcher’s
> point of view, but its own PoV.
>
>
>
> Replace: leaves.get(5).reader().getContext().ord
>
> By: leaves.get(5).ord
>
>
>
> And you’re fine. The red part leaves the top level context and then
> creates a new one – and then you’re lost!
>
>
>
> Uwe
>
>
>
> -
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> https://www.thetaphi.de
>
> eMail: u...@thetaphi.de
>
>
>
> *From:* Joel Bernstein 
> *Sent:* Sunday, December 27, 2020 5:59 PM
> *To:* lucene dev 
> *Subject:* LeafReaderContext ord is unexpectedly 0
>
>
>
> I ran into this while writing some Solr code today.
>
>
>
> List leaves =
> req.getSearcher().getTopReaderContext().leaves();
>
>
>
> The req is a SolrQueryRequest object.
>
>
>
> Now if I do this:
>
>
>
> leaves.get(5).reader().getContext().ord
>
>
>
> I would expect *ord* in this scenario to be *5*.
>
>
>
> But in my testing in master it's returning 0.
>
>
>
> It seems like this is a bug. Not sure yet if this is a bug in Solr or
> Lucene. Am I missing anything here that anyone can see?
>
>
>
>
> Joel Bernstein
>
> http://joelsolr.blogspot.com/
>
>
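
A minimal sketch of the point-of-view issue Uwe describes, assuming a top-level IndexReader; the class and method names are illustrative:

import java.util.List;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.LeafReaderContext;

final class OrdExample {
  // Keep the ord from the top-level context; reader().getContext() creates a
  // brand-new context whose ord is always 0.
  static void printOrds(IndexReader topReader) {
    List<LeafReaderContext> leaves = topReader.getContext().leaves();
    for (LeafReaderContext leaf : leaves) {
      int ordInTopContext = leaf.ord;                          // 0, 1, 2, ... as expected
      int ordInFreshContext = leaf.reader().getContext().ord;  // always 0
      System.out.println(ordInTopContext + " vs " + ordInFreshContext);
    }
  }
}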


Re: LeafReaderContext ord is unexpectedly 0

2020-12-27 Thread Joel Bernstein
Ok this makes sense. I suspect I never ran across this before because I
always accessed the ord through the context before getting the reader.


Joel Bernstein
http://joelsolr.blogspot.com/


On Sun, Dec 27, 2020 at 1:10 PM Uwe Schindler  wrote:

> Hi,
>
>
>
> that behaviour is fully correct and was always like that. Just for info (I
> had some slides on berlinbuzzwords like 8.5 years ago):
>
> https://youtu.be/iZZ1AbJ6dik?t=1975
>
>
>
> The problem is a classical “wrong point of view” problem!
>
>
>
> IndexReaders and their subclasses have no idea about their neighbours or
> parents, they can always be used on their own. They can also be in multiple
> contexts (), like a LeafReader (in that talk we used AtomicReader) is
> part of a DirectoryReader but at same time somebody else has constructed
> another composite reader  with LeafReaders from totally different
> directories (e.g., when merging different indexes together). So in short: A
> reader does not know anything about its own “where I am”.
>
>
>
> The method getContext() is only there as a helper method (it’s a bit
> misnomed), to create a **new** context that describes this reader as the
> only one in it, so inside this new context it has an ord of 0.
>
>
>
> The problem in your code is: you dive down through the correct context
> from top-level (the top context is from the point of view of the
> SolrSearcher), but then you leave this hierarchy by calling reader(). At
> that point you lost context information. After that you get a new context
> and this one returns 0, because its no longer form SolrIndexSearcher’s
> point of view, but its own PoV.
>
>
>
> Replace: leaves.get(5).reader().getContext().ord
>
> By: leaves.get(5).ord
>
>
>
> And you’re fine. The red part leaves the top level context and then
> creates a new one – and then you’re lost!
>
>
>
> Uwe
>
>
>
> -
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> https://www.thetaphi.de
>
> eMail: u...@thetaphi.de
>
>
>
> *From:* Joel Bernstein 
> *Sent:* Sunday, December 27, 2020 5:59 PM
> *To:* lucene dev 
> *Subject:* LeafReaderContext ord is unexpectedly 0
>
>
>
> I ran into this while writing some Solr code today.
>
>
>
> List leaves =
> req.getSearcher().getTopReaderContext().leaves();
>
>
>
> The req is a SolrQueryRequest object.
>
>
>
> Now if I do this:
>
>
>
> leaves.get(5).reader().getContext().ord
>
>
>
> I would expect *ord* in this scenario to be *5*.
>
>
>
> But in my testing in master it's returning 0.
>
>
>
> It seems like this is a bug. Not sure yet if this is a bug in Solr or
> Lucene. Am I missing anything here that anyone can see?
>
>
>
>
> Joel Bernstein
>
> http://joelsolr.blogspot.com/
>


Re: LeafReaderContext ord is unexpectedly 0

2020-12-27 Thread Joel Bernstein
I'll have to dig around in some collector code. I could swear that you
could track the ord of the leaf this way at collection time. But there may
be different code paths used than the one I showed above.

Joel Bernstein
http://joelsolr.blogspot.com/


On Sun, Dec 27, 2020 at 12:25 PM Haoyu Zhai  wrote:

> Hi Joel,
> LeafReader.getContext() is expected to return "the root IndexReaderContext
> <https://lucene.apache.org/core/5_2_0/core/org/apache/lucene/index/IndexReaderContext.html>
>  for
> this IndexReader
> <https://lucene.apache.org/core/5_2_0/core/org/apache/lucene/index/IndexReader.html>'s
> sub-reader tree." (
> https://lucene.apache.org/core/5_2_0/core/org/apache/lucene/index/LeafReader.html#getContext()
> )
> Which means it will return a context with ord 0 (a newly constructed one, not
> the previous one [1]) if it is already a leaf. So I think this is expected?
>
> [1]:
> https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/LeafReader.java#L43
>
> Best
> Patrick
>
> Joel Bernstein  于2020年12月27日周日 上午8:59写道:
>
>> I ran into this while writing some Solr code today.
>>
>> List leaves =
>> req.getSearcher().getTopReaderContext().leaves();
>>
>> The req is a SolrQueryRequest object.
>>
>> Now if I do this:
>>
>> leaves.get(5).reader().getContext().ord
>>
>> I would expect *ord* in this scenario to be *5*.
>>
>> But in my testing in master it's returning 0.
>>
>> It seems like this is a bug. Not sure yet if this is a bug in Solr or
>> Lucene. Am I missing anything here that anyone can see?
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>


LeafReaderContext ord is unexpectedly 0

2020-12-27 Thread Joel Bernstein
I ran into this while writing some Solr code today.

List leaves =
req.getSearcher().getTopReaderContext().leaves();

The req is a SolrQueryRequest object.

Now if I do this:

leaves.get(5).reader().getContext().ord

I would expect *ord* in this scenario to be *5*.

But in my testing in master it's returning 0.

It seems like this is a bug. Not sure yet if this is a bug in Solr or
Lucene. Am I missing anything here that anyone can see?


Joel Bernstein
http://joelsolr.blogspot.com/


Problems creating collections on branch_8x due to SSL errors

2020-12-14 Thread Joel Bernstein
I did a pull this morning and checked out branch_8x and then did the
following:

ant server
bin/solr start -c
bin/solr create -c test -s 1 -d _default

I get the following error in the logs. Jason Gerlowski confirmed he is
seeing it as well. Anyone know the cause of this? If not I'll create a
ticket.

2020-12-14 16:33:15.463 INFO  (OverseerStateUpdate-72065849883426816-10.0.0.238:8983_solr-n_00) [   ] o.a.s.c.o.SliceMutator createReplica() {
  "operation":"ADDREPLICA",
  "collection":"test",
  "shard":"shard1",
  "core":"test_shard1_replica_n1",
  "state":"down",
  "node_name":"10.0.0.238:8983_solr",
  "type":"NRT",
  "waitForFinalState":"false"}

2020-12-14 16:33:15.736 ERROR (OverseerThreadFactory-18-thread-1-processing-n:10.0.0.238:8983_solr) [   ] o.a.s.c.a.c.OverseerCollectionMessageHandler Error from shard: https://10.0.0.238:8983/solr => org.apache.solr.client.solrj.SolrServerException: IOException occurred when talking to server at: https://10.0.0.238:8983/solr
  at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:695)
org.apache.solr.client.solrj.SolrServerException: IOException occurred when talking to server at: https://10.0.0.238:8983/solr
  at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:695) ~[?:?]
  at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266) ~[?:?]
  at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248) ~[?:?]
  at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290) ~[?:?]
  at org.apache.solr.handler.component.HttpShardHandlerFactory$1.request(HttpShardHandlerFactory.java:169) ~[?:?]
  at org.apache.solr.handler.component.ShardRequestor.call(ShardRequestor.java:130) ~[?:?]
  at org.apache.solr.handler.component.ShardRequestor.call(ShardRequestor.java:41) ~[?:?]
  at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_271]
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_271]
  at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_271]
  at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180) ~[metrics-core-4.1.5.jar:4.1.5]
  at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218) ~[?:?]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_271]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_271]
  at java.lang.Thread.run(Thread.java:748) [?:1.8.0_271]
Caused by: javax.net.ssl.SSLException: Unsupported or unrecognized SSL message
  at sun.security.ssl.SSLSocketInputRecord.handleUnknownRecord(SSLSocketInputRecord.java:448) ~[?:1.8.0_271]
  at sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:174) ~[?:1.8.0_271]
  at sun.security.ssl.SSLTransport.decode(SSLTransport.java:110) ~[?:1.8.0_271]
  at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1279) ~[?:1.8.0_271]
  at sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1188) ~[?:1.8.0_271]
  at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:401) ~[?:1.8.0_271]
  at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:373) ~[?:1.8.0_271]
  at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:436) ~[?:?]
  at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:384) ~[?:?]
  at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142) ~[?:?]
  at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376) ~[?:?]
  at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393) ~[?:?]
  at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236) ~[?:?]
  at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[?:?]
  at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[?:?]
  at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[?:?]
  at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[?:?]
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[?:?]
  at org.apache.http.impl.client.CloseableHttpClient.execute(Closeab

Re: 8.8 Release

2020-12-10 Thread Joel Bernstein
+1

Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Dec 10, 2020 at 11:23 AM David Smiley  wrote:

> Thanks for volunteering!
>
> On Thu, Dec 10, 2020 at 11:11 AM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> Hi Devs,
>> There are lots of changes accumulated and some underway. I wish to
>> volunteer for a 8.8 release, if there are no objections. I'm planning to
>> build the RC in three weeks, i.e. 31 December (and cut the branch about 3-4
>> days before that). Please let me know if someone has any concerns.
>> Thanks and regards,
>> Ishan
>>
>> --
> Sent from Gmail Mobile
>


Re: Welcome Houston Putman to the PMC

2020-12-02 Thread Joel Bernstein
Welcome Houston!


Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Dec 2, 2020 at 11:25 AM Namgyu Kim  wrote:

> Congratulations and welcome, Houston! :D
>
> On Thu, Dec 3, 2020 at 12:24 AM Steve Rowe  wrote:
>
>> Welcome Houston!
>>
>> --
>> Steve
>>
>> > On Dec 1, 2020, at 4:19 PM, Mike Drob  wrote:
>> >
>> > I am pleased to announce that Houston Putman has accepted the PMC's
>> invitation to join.
>> >
>> > Congratulations and welcome, Houston!
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>> >
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>


Re: DIH replacement

2020-11-30 Thread Joel Bernstein
Check out this ticket:

https://issues.apache.org/jira/browse/SOLR-14673

There are lots of different ways that this could be applied as a
replacement for DIH.


Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Nov 30, 2020 at 9:56 AM Erick Erickson 
wrote:

> For what I suggested, there’s no code to write, these streams exist
> already.
>
> As far as supporting the more complex cases… I’m -1 for adding special
> code to streaming. DIH has many moving parts. Each of those parts was put
> there for a reason, and needed to be supported through successive Solr
> releases. What I specifically do _not_ want to do is to start down the path
> of reproducing those parts with special-purpose streaming code that tries
> to replace DIH with equivalent streaming functionality.
>
> I think it’s kinder to end users to set expectations that they need to be
> responsible for the ETL process. If there are streaming capabilities that do
> the needful, they can certainly use them rather than write something
> themselves. Otherwise they need to create an independent ETL process.
>
> The origin of this thought was the realization that streaming can import
> from a DB as-is, one of the base use-cases for DIH. On a quick look, I
> don’t see any other streams that work with other data sources, say a
> TikaStream, a FileStream, etc...
>
> FWIW,
> Erick
>
>
> > On Nov 29, 2020, at 11:52 AM, Atri Sharma  wrote:
> >
> > FWIW i am interested in this -- happy to collaborate
> >
> > On Sun, 29 Nov 2020, 22:07 Erick Erickson, 
> wrote:
> > How far can we get in replacing DIH with streams? I can write a simple
> DIH implementation by wrapping a jdbc stream in an update stream for
> instance (I think).
> >
> > It falls down with some of the more complex DIH constructs, but the
> simple “pull data from the DB and insert it into Solr” case seems covered...
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: Welcome Julie Tibshirani as Lucene/Solr committer

2020-11-18 Thread Joel Bernstein
Welcome Julie!

On Wed, Nov 18, 2020 at 10:14 AM Adrien Grand  wrote:

> Welcome Julie!
>
> On Wed, Nov 18, 2020 at 4:09 PM Alan Woodward 
> wrote:
>
>> Congratulations and welcome Julie!
>>
>> > On 18 Nov 2020, at 15:06, Michael Sokolov  wrote:
>> >
>> > I'm pleased to announce that Julie Tibshirani has accepted the PMC's
>> > invitation to become a committer.
>> >
>> > Julie, the tradition is that new committers introduce themselves with
>> > a brief bio.
>> >
>> > I think we may still be sorting out the details of your Apache account
>> > (julie@ may have been taken?), but as soon as that has been sorted out
>> > and karma has been granted, you can use your new powers to add
>> > yourself to the committers section of the Who We Are page on the
>> > website: 
>> >
>> > Congratulations and welcome!
>> >
>> > Mike Sokolov
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>> >
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>
> --
> Adrien
>


Re: Index documents in async way

2020-10-08 Thread Joel Bernstein
I think this model has a lot of potential.

I'd like to add another wrinkle to this, which is to store the information
about each batch as a record in the index. Each batch record would contain
a fingerprint for the batch. This solves lots of problems, and allows us to
confirm the integrity of the batch. It also means that we can compare
indexes by comparing the batch fingerprints rather than building a
fingerprint from the entire index.


Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Oct 8, 2020 at 11:31 AM Erick Erickson 
wrote:

> I suppose failures would be returned to the client on the async response?
>
> How would one keep the tlog from growing forever if the actual indexing
> took a long time?
>
> I'm guessing that this would be optional..
>
> On Thu, Oct 8, 2020, 11:14 Ishan Chattopadhyaya 
> wrote:
>
>> Can there be a situation where the index writer fails after the document
>> was added to tlog and a success is sent to the user? I think we want to
>> avoid such a situation, don't we?
>>
>> On Thu, 8 Oct, 2020, 8:25 pm Cao Mạnh Đạt,  wrote:
>>
>>> > Can you explain a little more on how this would impact durability of
>>> updates?
>>> Since we persist updates into tlog, I do not think this will be an issue
>>>
>>> > What does a failure look like, and how does that information get
>>> propagated back to the client app?
>>> I was not able to do much research, but I think this is going to be the
>>> same as the current way of our asyncId. In this case the asyncId will be the
>>> version of an update (in the case of a distributed queue it will be the
>>> offset). Failed updates will be put into a time-to-live map so users can
>>> query the failure; for successes we can skip that by leveraging the max
>>> succeeded version so far.
>>>
>>> On Thu, Oct 8, 2020 at 9:31 PM Mike Drob  wrote:
>>>
>>>> Interesting idea! Can you explain a little more on how this would
>>>> impact durability of updates? What does a failure look like, and how does
>>>> that information get propagated back to the client app?
>>>>
>>>> Mike
>>>>
>>>> On Thu, Oct 8, 2020 at 9:21 AM Cao Mạnh Đạt  wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> First of all it seems that I used the term async a lot recently :D.
>>>>> Recently I have been thinking a lot about changing the current
>>>>> indexing model of Solr from the sync way it works currently (a user submits
>>>>> an update request and waits for the response). What about changing it to an
>>>>> async model, where nodes only persist the update into the tlog and then
>>>>> return immediately, much like what the tlog is doing now. Then we have a
>>>>> dedicated executor which reads from the tlog to do the indexing (a
>>>>> producer-consumer model with the tlog acting as the queue).
>>>>>
>>>>> I do see several big benefits of this approach
>>>>>
>>>>>- We can batch updates in a single call; right now we do not use the
>>>>>writer.add(documents) API from Lucene, and batching updates is going to
>>>>>boost the performance of indexing.
>>>>>- One common problem with Solr now is that we have a lot of threads
>>>>>doing indexing, so we can end up with many small segments. Using this
>>>>>model we can have bigger segments and so less merge cost.
>>>>>- Another huge reason here is that after switching to this model, we
>>>>>can remove the tlog and use a distributed queue like Kafka or Pulsar. Since
>>>>>the purpose of the leader in SolrCloud now is ordering updates, the
>>>>>distributed queue is already ordering updates for us, so there is no need
>>>>>to have a dedicated leader. That is just the beginning of things that we
>>>>>can do after using a distributed queue.
>>>>>
>>>>> What do you guys think about this? Just want to hear from you guys
>>>>> before going deep into this rabbit hole.
>>>>>
>>>>> Thanks!
>>>>>
>>>>>


Re: restlet dependencies

2020-09-21 Thread Joel Bernstein
Restlet again!!!



Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Sep 21, 2020 at 7:18 AM Eric Pugh 
wrote:

> Do we have a community blessed alternative to restlet already?
>
> On Sep 20, 2020, at 9:40 AM, Noble Paul  wrote:
>
> Haha.
>
> In fact schema APIs don't use restlet. Only the managed resources use it
>
> On Sat, Sep 19, 2020, 3:35 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> If I were talend, I'd immediately start publishing to maven central. If I
>> were the developer who built the schema APIs, I would never have used
>> restlet to begin with.
>>
>> On Sat, 19 Sep, 2020, 1:13 am Uwe Schindler,  wrote:
>>
>>> I was thinking the same. Because GitHub does not cache the downloaded
>>> artifacts like our jenkins servers.
>>>
>>> It seems to run it in a new VM or container every time, so it downloads
>>> all artifacts. If I were Talend, I'd also block this.
>>>
>>> Uwe
>>>
>>> Am September 18, 2020 7:32:47 PM UTC schrieb Dawid Weiss <
>>> dawid.we...@gmail.com>:
>>>>
>>>> I don't think it's http/https - I believe restlet repository simply
>>>> bans github servers because of excessive traffic? These URLs work for
>>>> me locally...
>>>>
>>>> Dawid
>>>>
>>>> On Fri, Sep 18, 2020 at 6:35 PM Christine Poerschke (BLOOMBERG/
>>>> LONDON)  wrote:
>>>>
>>>>>
>>>>>  This sounds vaguely familiar. "http works, https does not work" and 
>>>>> https://issues.apache.org/jira/browse/SOLR-13756 possibly related?
>>>>>
>>>>>  From: dev@lucene.apache.org At: 09/18/20 10:01:29
>>>>>  To: dev@lucene.apache.org
>>>>>  Subject: Re: restlet dependencies
>>>>>
>>>>>  I don't think it is, sadly.
>>>>>  https://repo1.maven.org/maven2/org/restlet
>>>>>
>>>>>  The link you provided (mvnrepository) aggregates from several maven
>>>>>  repositories.
>>>>>
>>>>>
>>>>>  D.
>>>>>
>>>>>  On Fri, Sep 18, 2020 at 10:46 AM Ishan Chattopadhyaya
>>>>>   wrote:
>>>>>
>>>>>>
>>>>>>  Sorry, afk, but I heard (*hearsay*) that restlet is also on maven 
>>>>>> central
>>>>>>
>>>>> these days. Can we confirm and switch to that? Sorry, if that's not the 
>>>>> case.
>>>>>
>>>>>>
>>>>>>  On Fri, 18 Sep, 2020, 1:15 pm Dawid Weiss,  
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>  Just FYI: can't get PR builds on github to work recently because of 
>>>>>>> this:
>>>>>>>
>>>>>>> Could not resolve all files for configuration
>>>>>>>>
>>>>>>> ':solr:core:compileClasspath'.
>>>>>
>>>>>>  350 > Could not download org.restlet.ext.servlet-2.4.3.jar
>>>>>>>  (org.restlet.jee:org.restlet.ext.servlet:2.4.3)
>>>>>>>  351 > Could not get resource
>>>>>>>
>>>>>>> 'https://maven.restlet.com/org/restlet/jee/org.restlet.ext.servlet/2.4.3/org.res
>>>>> tlet.ext.servlet-2.4.3.jar'.
>>>>>
>>>>>>  352 > Could not GET
>>>>>>>
>>>>>>> 'https://maven.restlet.com/org/restlet/jee/org.restlet.ext.servlet/2.4.3/org.res
>>>>> tlet.ext.servlet-2.4.3.jar'.
>>>>>
>>>>>>  353 > Connection reset
>>>>>>>  354 > Could not download org.restlet-2.4.3.jar
>>>>>>>  (org.restlet.jee:org.restlet:2.4.3)
>>>>>>>  355 > Could not get resource
>>>>>>>
>>>>>>> 'https://maven.restlet.com/org/restlet/jee/org.restlet/2.4.3/org.restlet-2.4.3.j
>>>>> ar'.
>>>>>
>>>>>>  356 > Could not GET
>>>>>>>
>>>>>>> 'https://maven.restlet.talend.com/org/restlet/jee/org.restlet/2.4.3/org.restlet-
>>>>> 2.4.3.jar'.
>>>>>
>>>>>>  357 > Connection reset
>>>>>>>
>>>>>>>  D.
>>>>>>> --
>>>>>>>  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>>>>>  For additional commands, e-mail: dev-h...@lucene.apache.org
>>>>>>>
>>>>>>> --
>>>>>  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>>>  For additional commands, e-mail: dev-h...@lucene.apache.org
>>>>>
>>>>>
>>>>> --
>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>>
>>>>
>>> --
>>> Uwe Schindler
>>> Achterdiek 19, 28357 Bremen
>>> https://www.thetaphi.de
>>>
>>
> ___
> *Eric Pugh **| *Founder & CEO | OpenSource Connections, LLC | 434.466.1467
> | http://www.opensourceconnections.com | My Free/Busy
> <http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
>
>


Re: hybrid document routing

2020-08-11 Thread Joel Bernstein
SOLR-14728 supports sub-second performance on joins with more than 1
million values from the "from" index. Nice for access control.



Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Aug 11, 2020 at 9:49 AM Joel Bernstein  wrote:

> This ticket will shed some light:
>
> https://issues.apache.org/jira/browse/SOLR-14728
>
>
> I think I'm planning on using a different approach to distribute the ACLs to
> all shards.
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Tue, Aug 11, 2020 at 1:18 AM Gus Heck  wrote:
>
>> Sounds like complex ACLs based on group memberships that use graph
>> queries ? that would require local ACL's...
>>
>> On Mon, Aug 10, 2020 at 5:56 PM Ishan Chattopadhyaya <
>> ichattopadhy...@gmail.com> wrote:
>>
>>> This seems like an XY problem. Would it be possible to describe the
>>> original problem that led you to this solution (in the prototype)? Also, do
>>> you think folks at solr-users@ list would have more ideas related to
>>> this usecase and cross posting there would help?
>>>
>>> On Tue, 11 Aug, 2020, 1:43 am David Smiley,  wrote:
>>>
>>>> Are you sure you need the docs in the same shard when maybe you could
>>>> assume a core exists on each node and then do a query-time join?
>>>>
>>>> ~ David Smiley
>>>> Apache Lucene/Solr Search Developer
>>>> http://www.linkedin.com/in/davidwsmiley
>>>>
>>>>
>>>> On Mon, Aug 10, 2020 at 2:34 PM Joel Bernstein 
>>>> wrote:
>>>>
>>>>> I have a situation where I'd like to have the standard compositeId
>>>>> router in place for a collection. But, I'd like certain documents (ACL
>>>>> documents) to be duplicated on each shard in the collection. To achieve 
>>>>> the
>>>>> level of access control performance and scalability I'm looking for I need
>>>>> the ACL records to be in the same core as the main documents.
>>>>>
>>>>> I put together a prototype where the compositeId router accepted
>>>>> implicit routing parameters and it worked in my testing. Before I open a
>>>>> ticket suggesting this approach I wonder what other people thought the 
>>>>> best
>>>>> approach would be to accomplish this goal.
>>>>>
>>>>>
>>>>>
>>
>> --
>> http://www.needhamsoftware.com (work)
>> http://www.the111shift.com (play)
>>
>


Re: hybrid document routing

2020-08-11 Thread Joel Bernstein
This ticket will shed some light:

https://issues.apache.org/jira/browse/SOLR-14728


I think I'm planning on using a different approach to distribute the ACLs to
all shards.




Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Aug 11, 2020 at 1:18 AM Gus Heck  wrote:

> Sounds like complex ACLs based on group memberships that use graph queries
> ? that would require local ACL's...
>
> On Mon, Aug 10, 2020 at 5:56 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> This seems like an XY problem. Would it be possible to describe the
>> original problem that led you to this solution (in the prototype)? Also, do
>> you think folks at solr-users@ list would have more ideas related to
>> this usecase and cross posting there would help?
>>
>> On Tue, 11 Aug, 2020, 1:43 am David Smiley,  wrote:
>>
>>> Are you sure you need the docs in the same shard when maybe you could
>>> assume a core exists on each node and then do a query-time join?
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>>
>>> On Mon, Aug 10, 2020 at 2:34 PM Joel Bernstein 
>>> wrote:
>>>
>>>> I have a situation where I'd like to have the standard compositeId
>>>> router in place for a collection. But, I'd like certain documents (ACL
>>>> documents) to be duplicated on each shard in the collection. To achieve the
>>>> level of access control performance and scalability I'm looking for I need
>>>> the ACL records to be in the same core as the main documents.
>>>>
>>>> I put together a prototype where the compositeId router accepted
>>>> implicit routing parameters and it worked in my testing. Before I open a
>>>> ticket suggesting this approach I wonder what other people thought the best
>>>> approach would be to accomplish this goal.
>>>>
>>>>
>>>>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


hybrid document routing

2020-08-10 Thread Joel Bernstein
I have a situation where I'd like to have the standard compositeId router
in place for a collection. But, I'd like certain documents (ACL documents)
to be duplicated on each shard in the collection. To achieve the level of
access control performance and scalability I'm looking for I need the ACL
records to be in the same core as the main documents.

I put together a prototype where the compositeId router accepted implicit
routing parameters and it worked in my testing. Before I open a ticket
suggesting this approach I wonder what other people thought the best
approach would be to accomplish this goal.


Re: Parallel SQL join on multivalued fields

2020-07-22 Thread Joel Bernstein
I think the first step would be comprehensive unit tests for joins in
Parallel SQL, coupled with performance tests so we understand how distributed
joins perform at scale through the Calcite framework. Then we know whether we
can actually say joins are really supported. Then we can add the
documentation.

If join support becomes part of parallel SQL then we can actively look at
improving them.

If you want to add the unit tests I can find the time to help commit and I
can help with the performance tests.
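
As a starting point, here is a rough sketch of the kind of call a join unit
test could be built around, reusing the collection and field names from the
example quoted below. The harness, parameters, and assertions are assumptions,
not a finished test.

// Sketch only: assumes a running SolrCloud collection URL such as
// http://localhost:8983/solr/census_defence_system; a real test would add
// assertions on the returned tuples.
import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.SolrStream;
import org.apache.solr.common.params.ModifiableSolrParams;

class ParallelSqlJoinSketch {

  static List<Tuple> runJoin(String collectionUrl) throws Exception {
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("stmt",
        "SELECT a.id, b.id FROM census_defence_system a "
            + "JOIN census_components b ON a.supplier = b.supplier");
    params.set("qt", "/sql");

    SolrStream stream = new SolrStream(collectionUrl, params);
    List<Tuple> tuples = new ArrayList<>();
    try {
      stream.open();
      for (Tuple t = stream.read(); !t.EOF; t = stream.read()) {
        tuples.add(t);
      }
    } finally {
      stream.close();
    }
    return tuples;
  }
}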




Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Jul 21, 2020 at 5:02 AM Piero Scrima  wrote:

> any suggestion on this?
> Thanks
>
> Il giorno mer 1 lug 2020 alle ore 11:22 Piero Scrima 
> ha scritto:
>
>> Hi,
>>
>> I don't know if this is the right place for my question, anyway I'll try
>> to explain the issue here and understand together with you if it's worth
>> working on it.
>> I'm working with the Parallel SQL feature of Solr. Even though, looking
>> at the documentation, joins appear to be unsupported, the join works. I took
>> a look at the code and I understood that it works thanks to the Calcite
>> features (the framework the Parallel SQL feature is built on top of).
>> My project doesn't need to work with big amounts of data, and I think that
>> the Calcite join feature can work well for my use case.
>> The problems arise when I need to join two multivalued fields.
>> In Parallel SQL, the JOIN operation on two multivalued fields seems to
>> work by matching the two fields as a single string, so that, for example, a
>> document with a join field like this: ["a","b"] will match a document with
>> a join field that is exactly equal, like this: ["a","b"]; otherwise, if even
>> one element is different, they do not match. The right way to do this should
>> be to first explode the document into as many documents as there are elements
>> in the multivalued field (cross product on the field) and then perform the
>> join. I managed to solve the problem using a streaming expression:
>>
>> innerJoin(
>> sort(
>> cartesianProduct(
>>
>> search(census_defence_system,q="*:*",fl="id,defence_system,description,supplier",sort="id
>> asc",qt="/select",rows="1000"),
>>   supplier
>> ),
>> by="supplier asc"
>> ),
>> sort(
>>   cartesianProduct(
>>
>> search(census_components,q="*:*",fl="id,compoenent_name,supplier",sort="id
>> asc",qt="/select",rows="1"),
>> supplier
>> ),
>> by="supplier asc"
>> ),
>>   on="supplier"
>> )
>> with supplier as a multivalued field. It works very well.
>> Anyway, it would be great if the JOIN of multivalued fields showed this
>> cartesian-product behavior in Parallel SQL as well.
>> I think this could be a very powerful improvement: applying SQL to a
>> non-normalized collection/table. Would it be possible to implement this
>> feature? I would be very glad to work on it.
>>
>> Thank you,
>>
>> Piero
>>
>


Re: 8.6 release

2020-06-29 Thread Joel Bernstein
Hi Bruno,

Andrzej and I decided that SOLR-14537 is headed to master to bake for a
while and won't make it into the 8.6 release. So please feel free to cut
the branch when ready.


Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Jun 29, 2020 at 6:13 AM Andrzej Białecki  wrote:

> I wold like to include SOLR-14537 in 8.6 (it’s already tagged), the patch
> is ready and I’m just waiting for Joel to finish performance testing.
>
> On 27 Jun 2020, at 04:59, Tomás Fernández Löbbe 
> wrote:
>
> I tagged SOLR-14590 for 8.6, The PR is ready for review and I plan to
> merge it soon
>
> On Fri, Jun 26, 2020 at 12:54 PM Andrzej Białecki  wrote:
>
>> Jan,
>>
>> I just removed SOLR-14182 from 8.6, this needs proper back-compat shims
>> and testing, and I don’t have enough time to get it done properly for 8.6.
>>
>> On 26 Jun 2020, at 13:37, Jan Høydahl  wrote:
>>
>> Unresolved Solr issues tagged with 8.6:
>>
>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20SOLR%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%208.6
>>
>>
>> SOLR-14593   Package store API to disable file upload over HTTP
>>Blocker
>> SOLR-14580   CloudSolrClient cannot be initialized using 'zkHosts'
>> builder   Blocker
>> SOLR-14516   NPE during Realtime GET
>> Major
>> SOLR-14502   increase bin/solr's post kill sleep
>> Minor
>> SOLR-14398   package store PUT should be idempotent
>>Trivial
>> SOLR-14311   Shared schema should not have access to core level classes
>>Major
>> SOLR-14182   Move metric reporters config from solr.xml to ZK cluster
>> properties Major
>> SOLR-14066   Deprecate DIH
>> Blocker
>> SOLR-14022   Deprecate CDCR from Solr in 8.x
>> Blocker
>>
>> Plus two private JIRA issues.
>>
>> Jan
>>
>> 26. jun. 2020 kl. 12:06 skrev Bruno Roustant :
>>
>> So the plan is to cut the release branch on next Tuesday June 30th. If
>> you anticipate a problem with the date, please reply.
>>
>> Is there any JIRA issue that must be committed before the release is made
>> and that has not already the appropriate "Fix Version"?
>>
>> Currently there 3 unresolved issues flagged as Fix Version = 8.6:
>> Add tests for corruptions caused by byte flips LUCENE-9356
>> <https://issues.apache.org/jira/browse/LUCENE-9356>
>> Fix linefiledocs compression or replace in tests LUCENE-9191
>> <https://issues.apache.org/jira/browse/LUCENE-9191>
>> Can we merge small segments during refresh, for faster searching?
>> LUCENE-8962 <https://issues.apache.org/jira/browse/LUCENE-8962>
>>
>>
>> Le mer. 24 juin 2020 à 21:05, David Smiley  a
>> écrit :
>>
>>> Thanks starting this discussion, Cassandra.
>>>
>>> I reviewed the issues I was involved with and I don't quite see
>>> something worth noting.
>>>
>>> I plan to add a note about a change in defaults within
>>> UnifiedHighlighter that could be a significant perf regression.  This
>>> wasn't introduced in 8.6 but introduced in 8.5 and it's significant enough
>>> to bring attention to.  I could add it in 8.5's section but then add a
>>> short pointer to it in 8.6.
>>>
>>> ~ David
>>>
>>>
>>> On Wed, Jun 24, 2020 at 2:52 PM Cassandra Targett 
>>> wrote:
>>>
>>>> I started looking at the Ref Guide for 8.6 to get it ready, and notice
>>>> there are no Upgrade Notes in `solr-upgrade-notes.adoc` for 8.6. Is it
>>>> really true that none are needed at all?
>>>>
>>>> I’ll add what I usually do about new features/changes that maybe
>>>> wouldn’t normally make the old Upgrade Notes section, I just find it
>>>> surprising that there weren’t any devs who thought any of the 100 or so
>>>> Solr changes warrant any user caveats.
>>>> On Jun 17, 2020, 12:27 PM -0500, Tomás Fernández Löbbe <
>>>> tomasflo...@gmail.com>, wrote:
>>>>
>>>> +1. Thanks Bruno
>>>>
>>>> On Wed, Jun 17, 2020 at 6:22 AM Mike Drob  wrote:
>>>>
>>>>> +1
>>>>>
>>>>> The release wizard python script should be sufficient for everything.
>>>>> If you run into any issues with it, let me know, I used it for 8.5.2 and
>>>>> think I understand it pretty well.
>>>>>
>>>>> On Tue, Jun 16, 2020 at 8:31 AM Bruno Roustant <
>>>>> bruno.roust...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> It’s been a while since we released Lucene/Solr 8.5.
>>>>>> I’d like to volunteer to be a release manager for an 8.6 release. If
>>>>>> there's agreement, then I plan to cut the release branch two weeks today,
>>>>>> on June 30th, and then to build the first RC two days later.
>>>>>>
>>>>>> This will be my first time as release manager so I'll probably need
>>>>>> some guidance. Currently I have two resource links on this subject:
>>>>>> https://cwiki.apache.org/confluence/display/LUCENE/ReleaseTodo
>>>>>>
>>>>>> https://github.com/apache/lucene-solr/tree/master/dev-tools/scripts#releasewizardpy
>>>>>> If you have more, please share with me.
>>>>>>
>>>>>> Bruno
>>>>>>
>>>>>
>>
>>
>


Re: StreamExpressionTest failures

2020-06-25 Thread Joel Bernstein
This just started failing out of the blue today. I wonder what changed? Is
it straightforward to detect Windows in the test cases?
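
For what it's worth, the Lucene test framework exposes an OS flag, so a
minimal sketch of the check might look like the following; whether to expect
'/' from the tuples on Windows is an assumption based on the failure quoted
below.

// Sketch only: shows the Windows check; a real fix might instead normalize
// the separator in the value read from the tuple before comparing.
import java.io.File;

import org.apache.lucene.util.Constants;

class SeparatorSketch {
  static String expectedSecondLevel1Path() {
    if (Constants.WINDOWS) {
      // The failing run shows the tuples using '/' even on Windows.
      return "directory1/secondLevel1.txt";
    }
    return "directory1" + File.separator + "secondLevel1.txt";
  }
}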


Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Jun 25, 2020 at 7:46 PM Erick Erickson 
wrote:

> This test fails on line 3500, here’s the code block. Notice that the test
> carefully constructs
> the test string with the system file separator. Even so, the tuple has
> a *nix style separator
> and the test string a Windows separator.
>
>
> final String expectedSecondLevel1Path = "directory1" + File.separator +
> "secondLevel1.txt";
> for (int i = 0; i < 4; i++) {
>   Tuple t = tuples.get(i);
>   assertEquals("secondLevel1.txt line " + String.valueOf(i+1),
> t.get("line"));
>   assertEquals(expectedSecondLevel1Path, t.get("file"));
> }
>
> FAILED:
> org.apache.solr.client.solrj.io.stream.StreamExpressionTest.testCatStreamDirectoryCrawl
>
> Error Message:
> expected: but
> was:
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: Welcome Ilan Ginzburg as Lucene/Solr committer

2020-06-22 Thread Joel Bernstein
Welcome Ilan!


Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Jun 22, 2020 at 9:11 AM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Welcome Ilan!
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Sun, Jun 21, 2020 at 5:44 AM Noble Paul  wrote:
>
>> Hi all,
>>
>> Please join me in welcoming Ilan Ginzburg as the latest Lucene/Solr
>> committer.
>> Ilan, it's tradition for you to introduce yourself with a brief bio.
>>
>> Congratulations and Welcome!
>> Noble
>>
>


Re: Welcome Mayya Sharipova as Lucene/Solr committer

2020-06-11 Thread Joel Bernstein
Welcome Mayya!

Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Jun 10, 2020 at 3:40 PM Yonik Seeley  wrote:

> Congrats Mayya!
> -Yonik
>
>
> On Mon, Jun 8, 2020 at 12:58 PM jim ferenczi  wrote:
>
>> Hi all,
>>
>> Please join me in welcoming Mayya Sharipova as the latest Lucene/Solr
>> committer.
>> Mayya, it's tradition for you to introduce yourself with a brief bio.
>>
>> Congratulations and Welcome!
>>
>> Jim
>>
>


Re: SortedDocValues.lookupOrd and BytesRef reuse

2020-05-14 Thread Joel Bernstein
Ok thanks, this makes sense. It's safe to reuse the returned BytesRef for the
same SortedDocValues instance until the method is called again. I think
changing the javadoc to the following will help clear up the confusion:

/** Retrieves the value for the specified ordinal. The returned
 * {@link BytesRef} may be re-used across calls to {@link #lookupOrd(int)}
 * on the same instance, so make sure to {@link BytesRef#deepCopyOf(BytesRef) copy it}
 * if you want to keep it around.
 * @param ord ordinal to lookup (must be >= 0 and < {@link #getValueCount()})
 * @see #ordValue()
 */

I can make this change if others agree.
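
For anyone following along, here is a small sketch of the two usage patterns
that wording tries to distinguish; the method and variable names are
illustrative only.

// Sketch only: assumes a SortedDocValues instance and valid ordinals.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.SortedDocValues;
import org.apache.lucene.util.BytesRef;

class LookupOrdUsageSketch {

  // Safe without copying: the BytesRef is fully consumed before lookupOrd
  // is called again on the same instance.
  static int totalLength(SortedDocValues dv, int[] ords) throws IOException {
    int total = 0;
    for (int ord : ords) {
      BytesRef term = dv.lookupOrd(ord); // may be reused by the next call
      total += term.length;
    }
    return total;
  }

  // Keeping values around: deep copy, because a later lookupOrd call on the
  // same instance may overwrite the returned bytes.
  static List<BytesRef> collect(SortedDocValues dv, int[] ords) throws IOException {
    List<BytesRef> out = new ArrayList<>();
    for (int ord : ords) {
      out.add(BytesRef.deepCopyOf(dv.lookupOrd(ord)));
    }
    return out;
  }
}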


Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, May 14, 2020 at 4:37 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Hi Joel,
>
> You should trust the javadocs.
>
> Looking at our default Codec on master (Lucene84Codec), and at its default
> doc values implementation (Lucene80DocValuesProducer), it is clearly
> reusing the private "BytesRef term" instance.
>
> If your code is fully consuming this BytesRef before calling any other
> methods on the same SortedDocValues instance, you can safely reuse it for
> that duration.
>
> But if you want to call methods on that same SortedDocValues and continue
> using the previous BytesRef, you'll need to make a copy.
>
> Maybe improve the javadocs?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Thu, May 14, 2020 at 4:12 PM Joel Bernstein  wrote:
>
>> In the SortedDocValues.lookupOrd documentation it says that a deep copy is
>> needed for the returned BytesRef. I wanted to verify that this was actually
>> true. I'm trying to see a way that this BytesRef could be safely reused by
>> the API but I don't see one. Is there actually an implementation of
>> lookupOrd that somehow reuses the same BytesRef between invocations? The
>> javadoc is copied below:
>>
>> Thanks!
>>
>> /** Retrieves the value for the specified ordinal. The returned
>>  * {@link BytesRef} may be re-used across calls to {@link #lookupOrd(int)}
>>  * so make sure to {@link BytesRef#deepCopyOf(BytesRef) copy it} if you want
>>  * to keep it around.
>>  * @param ord ordinal to lookup (must be >= 0 and < {@link #getValueCount()})
>>  * @see #ordValue()
>>  */
>> public abstract BytesRef lookupOrd(int ord) throws IOException;
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>


SortedDocValues.lookupOrd and BytesRef reuse

2020-05-14 Thread Joel Bernstein
In the SortedDocValues.lookupOrd documentation it says that a deep copy is
needed for the returned BytesRef. I wanted to verify that this was actually
true. I'm trying to see a way that this BytesRef could be safely reused by
the API but I don't see one. Is there actually an implementation of
lookupOrd that somehow reuses the same BytesRef between invocations? The
javadoc is copied below:

Thanks!

/** Retrieves the value for the specified ordinal. The returned
 * {@link BytesRef} may be re-used across calls to {@link #lookupOrd(int)}
 * so make sure to {@link BytesRef#deepCopyOf(BytesRef) copy it} if you want
 * to keep it around.
 * @param ord ordinal to lookup (must be >= 0 and < {@link #getValueCount()})
 * @see #ordValue()
 */
public abstract BytesRef lookupOrd(int ord) throws IOException;



Joel Bernstein
http://joelsolr.blogspot.com/


Re: pre-commit failing

2020-05-13 Thread Joel Bernstein
Thanks Erick,

I'll give that a try.


Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, May 13, 2020 at 3:22 PM Erick Erickson 
wrote:

> Joel:
>
> 'git clean -dxf’ is your friend ;). Beware if you’re adding new files,
> ‘cause that removes everything that’s not already in git. It won’t touch
> anything you’ve added locally even if it’s not part of the remote repo.
>
> Sometimes I’ve also gotten weird stuff that’s cured with “gradlew —stop”
> if you’ve used Gradle…
>
> Erick
>
> > On May 13, 2020, at 2:33 PM, Joel Bernstein  wrote:
> >
> > A fresh clone doesn't produce this error so it looks like it is specific
> to my local repo.
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> >
> > On Wed, May 13, 2020 at 2:17 PM Joel Bernstein 
> wrote:
> > Is anybody seeing the following error on pre-commit:
> > JAR resource does not exist: core/lib/commons-beanutils-1.9.3.jar
> >
> > Thanks,
> > Joel
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: pre-commit failing

2020-05-13 Thread Joel Bernstein
A fresh clone doesn't produce this error so it looks like it is specific to
my local repo.

Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, May 13, 2020 at 2:17 PM Joel Bernstein  wrote:

> Is anybody seeing the following error on pre-commit:
>
> JAR resource does not exist: core/lib/commons-beanutils-1.9.3.jar
>
>
> Thanks,
>
> Joel
>
>


pre-commit failing

2020-05-13 Thread Joel Bernstein
Is anybody seeing the following error on pre-commit:

JAR resource does not exist: core/lib/commons-beanutils-1.9.3.jar


Thanks,

Joel


Re: [VOTE] Solr to become a top-level Apache project (TLP)

2020-05-13 Thread Joel Bernstein
-1 (binding)


Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, May 13, 2020 at 4:39 AM Tomoko Uchida 
wrote:

> Personally, I am not particularly interested in "promoting" Solr to TLP
> though, agree with the idea Lucene and Solr should have separate code base,
> CI infra, contribution procedure, release cycles, etc.
> If this proposal is the best way to do so, +1 (non binding).
>
> Tomoko
>
>
> 2020年5月13日(水) 16:21 Dawid Weiss :
>
>> > I wanted to clarify that only PMC members cast binding votes and not
>> committers.
>>
>> I thought it's up to the decision of the one who sends the vote thread
>> but I stand corrected, thank you Anshum.
>>
>> Dawid
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>


Re: Welcome Eric Pugh as a Lucene/Solr committer

2020-04-06 Thread Joel Bernstein
Welcome Eric!


Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Apr 6, 2020 at 3:14 PM Yonik Seeley  wrote:

> Congrats Eric!
> -Yonik
>
>
> On Mon, Apr 6, 2020 at 8:21 AM Jan Høydahl  wrote:
>
>> Hi all,
>>
>> Please join me in welcoming Eric Pugh as the latest Lucene/Solr committer!
>>
>> Eric has been part of the Solr community for over a decade, as a code
>> contributor, book author, company founder, blogger and mailing list
>> contributor! We look forward to his future contributions!
>>
>> Congratulations and welcome! It is a tradition to introduce yourself with
>> a brief bio, Eric.
>>
>> Jan Høydahl
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>


Re: Welcome Alessandro Benedetti as a Lucene/Solr committer

2020-03-19 Thread Joel Bernstein
Welcome Alessandro!

Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Mar 19, 2020 at 11:25 AM Erik Hatcher 
wrote:

> I’m glad you’ve been here, Alessandro!   Congrats!
>
> Erik
>
> On Mar 18, 2020, at 09:01, David Smiley  wrote:
>
> 
> Hi all,
>
> Please join me in welcoming Alessandro Benedetti as the latest Lucene/Solr
> committer!
>
> Alessandro has been contributing to Lucene and Solr in areas such as More
> Like This, Synonym boosting, and Suggesters, and other areas for years.
> Furthermore he's been a help to many users on the solr-user mailing list
> and has helped others through his blog posts and presentations about
> search.  We look forward to his future contributions.
>
> Congratulations and welcome!  It is a tradition to introduce yourself with
> a brief bio, Alessandro.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>


Re: 8.5 release

2020-03-10 Thread Joel Bernstein
I just updated solr/CHANGES.txt as I missed something. If you've already
created the RC, the entry will be picked up in case of a respin.



Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Mar 10, 2020 at 5:45 AM Ignacio Vera  wrote:

> done. Thank you!
>
> On Tue, Mar 10, 2020 at 10:43 AM Alan Woodward 
> wrote:
>
>> Go ahead, I’ll start the release build once it’s in.
>>
>> On 10 Mar 2020, at 07:26, Ignacio Vera  wrote:
>>
>> Hi Alanm
>>
>> Is it  possible to backport
>> https://issues.apache.org/jira/browse/LUCENE-9263 for the 8.5 release, I
>> push it tester day and CI is happy.
>>
>> Thanks,
>>
>> On Tue, Mar 10, 2020 at 2:35 AM Joel Bernstein 
>> wrote:
>>
>>>
>>> Finished the backport for
>>> https://issues.apache.org/jira/browse/SOLR-14073.
>>>
>>> Thanks!
>>>
>>>
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>>
>>>
>>> On Mon, Mar 9, 2020 at 8:44 AM Joel Bernstein 
>>> wrote:
>>>
>>>> Ok, I'll do the backport today. Thanks!
>>>>
>>>> Joel Bernstein
>>>> http://joelsolr.blogspot.com/
>>>>
>>>>
>>>> On Mon, Mar 9, 2020 at 6:21 AM Alan Woodward 
>>>> wrote:
>>>>
>>>>> Thanks Uwe!
>>>>>
>>>>> On 7 Mar 2020, at 10:06, Uwe Schindler  wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> FYI, I cleaned, renamed, and changed the Jenkins Jobs, so the 8.5
>>>>> branch is in the loop on ASF Jenkins and Policeman Jenkins.
>>>>>
>>>>> Uwe
>>>>>
>>>>> -
>>>>> Uwe Schindler
>>>>> Achterdiek 19, D-28357 Bremen
>>>>> https://www.thetaphi.de
>>>>> eMail: u...@thetaphi.de
>>>>>
>>>>> *From:* Alan Woodward 
>>>>> *Sent:* Wednesday, March 4, 2020 5:35 PM
>>>>> *To:* dev@lucene.apache.org
>>>>> *Subject:* Re: 8.5 release
>>>>>
>>>>> I’ve created a branch for the 8.5 release (`branch_8_5`) and pushed it
>>>>> to the apache repository.  We’re now at feature freeze, so only bug fixes
>>>>> should be pushed to the branch.
>>>>>
>>>>> I can see from
>>>>> https://issues.apache.org/jira/issues/?jql=project%20in%20(SOLR%2C%20LUCENE)%20AND%20status%20in%20(Open%2C%20Reopened)%20AND%20priority%20%3D%20Blocker%20AND%20fixVersion%20%3D%208.5%20ORDER%20BY%20priority%20DESC
>>>>>  that
>>>>> we have 4 tickets marked as Blockers for this release.  I plan to build a
>>>>> first release candidate next Monday, which gives us a few days to resolve
>>>>> these.  If that’s not going to be long enough, please let me know.
>>>>>
>>>>> Uwe, Steve, can one of you start the Jenkins tasks for the new branch?
>>>>>
>>>>> Thanks, Alan
>>>>>
>>>>>
>>>>> On 3 Mar 2020, at 14:50, Alan Woodward  wrote:
>>>>>
>>>>> PSA: I’ve had to generate a new GPG key for this release, and it takes
>>>>> a while for it to get mirrored to the lucene KEYS file.  I’ll hold off
>>>>> cutting the branch until everything is ready, so it will probably now be
>>>>> tomorrow UK time before I start the release proper.
>>>>>
>>>>>
>>>>> On 25 Feb 2020, at 07:49, Noble Paul  wrote:
>>>>>
>>>>> +1
>>>>>
>>>>> On Wed, Feb 19, 2020 at 9:35 PM Ignacio Vera 
>>>>> wrote:
>>>>>
>>>>>
>>>>> +1
>>>>>
>>>>> On Tue, Feb 18, 2020 at 7:26 PM Jan Høydahl 
>>>>> wrote:
>>>>>
>>>>>
>>>>> +1
>>>>>
>>>>> That should give us time to update release docs for the new website
>>>>> too.
>>>>>
>>>>> Jan Høydahl
>>>>>
>>>>> 18. feb. 2020 kl. 18:28 skrev Adrien Grand :
>>>>>
>>>>> 
>>>>> +1
>>>>>
>>>>> On Tue, Feb 18, 2020 at 4:58 PM Alan Woodward 
>>>>> wrote:
>>>>>
>>>>>
>>>>> Hi all,
>>>>>
>>>>> It’s been a while since we released lucene-solr 8.4, and we’ve
>>>>> accumulated quite a few nice new features since then.  I’d like to
>>>>> volunteer to be a release manager for an 8.5 release.  If there's
>>>>> agreement, then I plan to cut the release branch two weeks today, on
>>>>> Tuesday 3rd March.
>>>>>
>>>>> - Alan
>>>>> -
>>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>>> 
>>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>>> 
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Adrien
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -
>>>>> Noble Paul
>>>>>
>>>>> -
>>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>>> 
>>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>>> 
>>>>>
>>>>>
>>>>>
>> - To
>> unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
>> commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: 8.5 release

2020-03-09 Thread Joel Bernstein
Finished the backport for https://issues.apache.org/jira/browse/SOLR-14073.

Thanks!


Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Mar 9, 2020 at 8:44 AM Joel Bernstein  wrote:

> Ok, I'll do the backport today. Thanks!
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Mon, Mar 9, 2020 at 6:21 AM Alan Woodward  wrote:
>
>> Thanks Uwe!
>>
>> On 7 Mar 2020, at 10:06, Uwe Schindler  wrote:
>>
>> Hi,
>>
>> FYI, I cleaned, renamed, and changed the Jenkins Jobs, so the 8.5 branch
>> is in the loop on ASF Jenkins and Policeman Jenkins.
>>
>> Uwe
>>
>> -
>> Uwe Schindler
>> Achterdiek 19, D-28357 Bremen
>> https://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>> *From:* Alan Woodward 
>> *Sent:* Wednesday, March 4, 2020 5:35 PM
>> *To:* dev@lucene.apache.org
>> *Subject:* Re: 8.5 release
>>
>> I’ve created a branch for the 8.5 release (`branch_8_5`) and pushed it to
>> the apache repository.  We’re now at feature freeze, so only bug fixes
>> should be pushed to the branch.
>>
>> I can see from
>> https://issues.apache.org/jira/issues/?jql=project%20in%20(SOLR%2C%20LUCENE)%20AND%20status%20in%20(Open%2C%20Reopened)%20AND%20priority%20%3D%20Blocker%20AND%20fixVersion%20%3D%208.5%20ORDER%20BY%20priority%20DESC
>>  that
>> we have 4 tickets marked as Blockers for this release.  I plan to build a
>> first release candidate next Monday, which gives us a few days to resolve
>> these.  If that’s not going to be long enough, please let me know.
>>
>> Uwe, Steve, can one of you start the Jenkins tasks for the new branch?
>>
>> Thanks, Alan
>>
>>
>> On 3 Mar 2020, at 14:50, Alan Woodward  wrote:
>>
>> PSA: I’ve had to generate a new GPG key for this release, and it takes a
>> while for it to get mirrored to the lucene KEYS file.  I’ll hold off
>> cutting the branch until everything is ready, so it will probably now be
>> tomorrow UK time before I start the release proper.
>>
>>
>> On 25 Feb 2020, at 07:49, Noble Paul  wrote:
>>
>> +1
>>
>> On Wed, Feb 19, 2020 at 9:35 PM Ignacio Vera  wrote:
>>
>>
>> +1
>>
>> On Tue, Feb 18, 2020 at 7:26 PM Jan Høydahl 
>> wrote:
>>
>>
>> +1
>>
>> That should give us time to update release docs for the new website too.
>>
>> Jan Høydahl
>>
>> 18. feb. 2020 kl. 18:28 skrev Adrien Grand :
>>
>> 
>> +1
>>
>> On Tue, Feb 18, 2020 at 4:58 PM Alan Woodward 
>> wrote:
>>
>>
>> Hi all,
>>
>> It’s been a while since we released lucene-solr 8.4, and we’ve
>> accumulated quite a few nice new features since then.  I’d like to
>> volunteer to be a release manager for an 8.5 release.  If there's
>> agreement, then I plan to cut the release branch two weeks today, on
>> Tuesday 3rd March.
>>
>> - Alan
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> 
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> 
>>
>>
>>
>> --
>> Adrien
>>
>>
>>
>>
>> --
>> -
>> Noble Paul
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> 
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> 
>>
>>
>>


Re: 8.5 release

2020-03-09 Thread Joel Bernstein
Ok, I'll do the backport today. Thanks!

Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Mar 9, 2020 at 6:21 AM Alan Woodward  wrote:

> Thanks Uwe!
>
> On 7 Mar 2020, at 10:06, Uwe Schindler  wrote:
>
> Hi,
>
> FYI, I cleaned, renamed, and changed the Jenkins Jobs, so the 8.5 branch
> is in the loop on ASF Jenkins and Policeman Jenkins.
>
> Uwe
>
> -
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> *From:* Alan Woodward 
> *Sent:* Wednesday, March 4, 2020 5:35 PM
> *To:* dev@lucene.apache.org
> *Subject:* Re: 8.5 release
>
> I’ve created a branch for the 8.5 release (`branch_8_5`) and pushed it to
> the apache repository.  We’re now at feature freeze, so only bug fixes
> should be pushed to the branch.
>
> I can see from
> https://issues.apache.org/jira/issues/?jql=project%20in%20(SOLR%2C%20LUCENE)%20AND%20status%20in%20(Open%2C%20Reopened)%20AND%20priority%20%3D%20Blocker%20AND%20fixVersion%20%3D%208.5%20ORDER%20BY%20priority%20DESC
>  that
> we have 4 tickets marked as Blockers for this release.  I plan to build a
> first release candidate next Monday, which gives us a few days to resolve
> these.  If that’s not going to be long enough, please let me know.
>
> Uwe, Steve, can one of you start the Jenkins tasks for the new branch?
>
> Thanks, Alan
>
>
> On 3 Mar 2020, at 14:50, Alan Woodward  wrote:
>
> PSA: I’ve had to generate a new GPG key for this release, and it takes a
> while for it to get mirrored to the lucene KEYS file.  I’ll hold off
> cutting the branch until everything is ready, so it will probably now be
> tomorrow UK time before I start the release proper.
>
>
> On 25 Feb 2020, at 07:49, Noble Paul  wrote:
>
> +1
>
> On Wed, Feb 19, 2020 at 9:35 PM Ignacio Vera  wrote:
>
>
> +1
>
> On Tue, Feb 18, 2020 at 7:26 PM Jan Høydahl  wrote:
>
>
> +1
>
> That should give us time to update release docs for the new website too.
>
> Jan Høydahl
>
> 18. feb. 2020 kl. 18:28 skrev Adrien Grand :
>
> 
> +1
>
> On Tue, Feb 18, 2020 at 4:58 PM Alan Woodward 
> wrote:
>
>
> Hi all,
>
> It’s been a while since we released lucene-solr 8.4, and we’ve accumulated
> quite a few nice new features since then.  I’d like to volunteer to be a
> release manager for an 8.5 release.  If there's agreement, then I plan to
> cut the release branch two weeks today, on Tuesday 3rd March.
>
> - Alan
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> 
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 
>
>
>
> --
> Adrien
>
>
>
>
> --
> -
> Noble Paul
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> 
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 
>
>
>


Re: 8.5 release

2020-03-08 Thread Joel Bernstein
If possible I'd like to backport
https://issues.apache.org/jira/browse/SOLR-14073 for the 8.5 release. The
work is already committed; I'm just waiting on the go-ahead for the backport,
and then I'll update CHANGES.txt and close out the ticket.

Joel Bernstein
http://joelsolr.blogspot.com/


On Sat, Mar 7, 2020 at 5:06 AM Uwe Schindler  wrote:

> Hi,
>
>
>
> FYI, I cleaned, renamed, and changed the Jenkins Jobs, so the 8.5 branch
> is in the loop on ASF Jenkins and Policeman Jenkins.
>
>
>
> Uwe
>
>
>
> -
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> https://www.thetaphi.de
>
> eMail: u...@thetaphi.de
>
>
>
> *From:* Alan Woodward 
> *Sent:* Wednesday, March 4, 2020 5:35 PM
> *To:* dev@lucene.apache.org
> *Subject:* Re: 8.5 release
>
>
>
> I’ve created a branch for the 8.5 release (`branch_8_5`) and pushed it to
> the apache repository.  We’re now at feature freeze, so only bug fixes
> should be pushed to the branch.
>
>
>
> I can see from
> https://issues.apache.org/jira/issues/?jql=project%20in%20(SOLR%2C%20LUCENE)%20AND%20status%20in%20(Open%2C%20Reopened)%20AND%20priority%20%3D%20Blocker%20AND%20fixVersion%20%3D%208.5%20ORDER%20BY%20priority%20DESC
>  that
> we have 4 tickets marked as Blockers for this release.  I plan to build a
> first release candidate next Monday, which gives us a few days to resolve
> these.  If that’s not going to be long enough, please let me know.
>
>
>
> Uwe, Steve, can one of you start the Jenkins tasks for the new branch?
>
>
>
> Thanks, Alan
>
>
>
> On 3 Mar 2020, at 14:50, Alan Woodward  wrote:
>
>
>
> PSA: I’ve had to generate a new GPG key for this release, and it takes a
> while for it to get mirrored to the lucene KEYS file.  I’ll hold off
> cutting the branch until everything is ready, so it will probably now be
> tomorrow UK time before I start the release proper.
>
>
> On 25 Feb 2020, at 07:49, Noble Paul  wrote:
>
> +1
>
> On Wed, Feb 19, 2020 at 9:35 PM Ignacio Vera  wrote:
>
>
> +1
>
> On Tue, Feb 18, 2020 at 7:26 PM Jan Høydahl  wrote:
>
>
> +1
>
> That should give us time to update release docs for the new website too.
>
> Jan Høydahl
>
> 18. feb. 2020 kl. 18:28 skrev Adrien Grand :
>
> 
> +1
>
> On Tue, Feb 18, 2020 at 4:58 PM Alan Woodward 
> wrote:
>
>
> Hi all,
>
> It’s been a while since we released lucene-solr 8.4, and we’ve accumulated
> quite a few nice new features since then.  I’d like to volunteer to be a
> release manager for an 8.5 release.  If there's agreement, then I plan to
> cut the release branch two weeks today, on Tuesday 3rd March.
>
> - Alan
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> 
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 
>
>
>
> --
> Adrien
>
>
>
>
> --
> -
> Noble Paul
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> 
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 
>
>
>
>
>


Re: Welcome Nhat Nguyen to the PMC

2020-03-06 Thread Joel Bernstein
Welcome Nhat!

Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Mar 5, 2020 at 2:03 PM Nhat Nguyen 
wrote:

> Thank you very much for the warm welcome!
>
> On Wed, Mar 4, 2020 at 10:59 AM Jan Høydahl  wrote:
>
>> Welcome Nhat!
>>
>> Jan
>>
>> > 3. mar. 2020 kl. 17:34 skrev Adrien Grand :
>> >
>> > I am pleased to announce that Nhat Nguyen has accepted the PMC's
>> invitation to join.
>> >
>> > Welcome Nhat!
>> >
>> > --
>> > Adrien
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>


Re: Congratulations to the new Lucene/Solr PMC Chair, Anshum Gupta!

2020-01-16 Thread Joel Bernstein
Congrats Anshum!

Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Jan 16, 2020 at 1:05 PM Christine Poerschke (BLOOMBERG/ LONDON) <
cpoersc...@bloomberg.net> wrote:

> Congrats Anshum!
>
> Christine
>
> From: dev@lucene.apache.org At: 01/15/20 21:15:28
> To: dev@lucene.apache.org
> Subject: Congratulations to the new Lucene/Solr PMC Chair, Anshum Gupta!
>
> Every year, the Lucene PMC rotates the Lucene PMC chair and Apache Vice
> President position.
>
> This year we have nominated and elected Anshum Gupta as the Chair, a
> decision that the board approved in its January 2020 meeting.
>
> Congratulations, Anshum!
>
> Cassandra
>
>
>


Re: maven issues with org.restlet.jee:org.restlet

2019-12-28 Thread Joel Bernstein
Let's move the discussion to this ticket:

https://issues.apache.org/jira/browse/SOLR-13756

Joel Bernstein
http://joelsolr.blogspot.com/


On Sat, Dec 28, 2019 at 1:31 PM Joel Bernstein  wrote:

> Here is the ticket on the reslet project:
>
> https://github.com/restlet/restlet-framework-java/issues/1366
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Sat, Dec 28, 2019 at 1:17 PM Joel Bernstein  wrote:
>
>> Ok Uwe,
>>
>> I think I've got all the details. I'll open a ticket with the restlet
>> project explaining the http->https redirect problem. Hopefully they will
>> fix this and put this problem to rest (pun intended).
>>
>> I'll also open a Solr ticket so we can discuss what can be done to
>> possibly mitigate this issue without the help of the restlet project and to
>> discuss the removal of this dependency.
>>
>> Thanks!
>>
>>
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>> On Fri, Dec 27, 2019 at 6:11 PM Uwe Schindler  wrote:
>>
>>> Sorry,
>>>
>>>
>>>
>>> the Ivy build was fixed in
>>> https://issues.apache.org/jira/browse/LUCENE-8807 (Lucene/Solr 8.2),
>>> the Maven POMs were fixed:
>>> https://issues.apache.org/jira/browse/LUCENE-8993 (Lucene/Solr 8.3)
>>>
>>>
>>>
>>> Sorry both links pointed to same diff. The history is above.
>>>
>>>
>>>
>>> So in short: to build Solr from source you need 8.2, otherwise Ivy won’t
>>> find any Restlet artifacts. To use the Maven POMs in 3rd party
>>> projects, you need 8.3.
>>>
>>>
>>>
>>> Uwe
>>>
>>>
>>>
>>> -
>>>
>>> Uwe Schindler
>>>
>>> Achterdiek 19, D-28357 Bremen
>>>
>>> https://www.thetaphi.de
>>>
>>> eMail: u...@thetaphi.de
>>>
>>>
>>>
>>> *From:* Uwe Schindler 
>>> *Sent:* Saturday, December 28, 2019 12:07 AM
>>> *To:* 'Joel Bernstein' ; 'lucene dev' <
>>> dev@lucene.apache.org>
>>> *Subject:* RE: maven issues with org.restlet.jee:org.restlet
>>>
>>>
>>>
>>> Hi,
>>>
>>>
>>>
>>> there are few issues:
>>>
>>>- Java does not support redirects from HTTP -> HTTPS. It simply
>>>won’t follow those. This is a known issue and well-known. This was the
>>>reason why I changed all URLs to HTTPS in recently, as any redirect won’t
>>>work.  We can’t change that for old Solr releases, they keep broken. I
>>>changed this here (possible since 8.3.0):
>>>
>>> https://github.com/apache/lucene-solr/commit/4a015e224dcd4b1c5f3db92c01d8bf80be3c244a.
>>>The Maven POMs were changed a bit later:
>>>
>>> https://github.com/apache/lucene-solr/commit/4a015e224dcd4b1c5f3db92c01d8bf80be3c244a.
>>>So basically everything after 8.3.0 should work correct, older versions
>>>cannot be fixed anymore. The change to talend is not the issue, it’s the
>>>    HTTP->HTTPS one which breaks Ivy.
>>>- This is no longer an issue with pure Maven (as they have a
>>>workaround), but Ivy can’t handle that (as it relies on Java’s own URL
>>>handling). Newer Maven has its own one.
>>>- The HTTPS stuff redirects to the talend URL and finally it’s
>>>internally handled by Cloudfront. And it looks like it breaks there. With
>>>Lucene/Solr Master on Java 11 I get no error. I think Java 8 does not
>>>support TLS 1.3 and cloudfront wants this. No idea at all. But it works
>>>here.
>>>
>>>
>>>
>>> Uwe
>>>
>>>
>>>
>>> -
>>>
>>> Uwe Schindler
>>>
>>> Achterdiek 19, D-28357 Bremen
>>>
>>> https://www.thetaphi.de
>>>
>>> eMail: u...@thetaphi.de
>>>
>>>
>>>
>>> *From:* Joel Bernstein 
>>> *Sent:* Friday, December 27, 2019 9:17 PM
>>> *To:* lucene dev 
>>> *Cc:* Uwe Schindler 
>>> *Subject:* Re: maven issues with org.restlet.jee:org.restlet
>>>
>>>
>>>
>>> Agreed, if they don't fix this it needs to be removed, this is a mess.
>>>
>>>
>>>
>>> I did some more digging and the files are present when you point a
>>> browser at:
>>>
>>>
>>>
>>>
>>> https://maven.

Re: maven issues with org.restlet.jee:org.restlet

2019-12-28 Thread Joel Bernstein
Here is the ticket on the reslet project:

https://github.com/restlet/restlet-framework-java/issues/1366


Joel Bernstein
http://joelsolr.blogspot.com/


On Sat, Dec 28, 2019 at 1:17 PM Joel Bernstein  wrote:

> Ok Uwe,
>
> I think I've got all the details. I'll open a ticket with the restlet
> project explaining the http->https redirect problem. Hopefully they will
> fix this and put this problem to rest (pun intended).
>
> I'll also open a Solr ticket so we can discuss what can be done to
> possibly mitigate this issue without the help of the restlet project and to
> discuss the removal of this dependency.
>
> Thanks!
>
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Fri, Dec 27, 2019 at 6:11 PM Uwe Schindler  wrote:
>
>> Sorry,
>>
>>
>>
>> the Ivy build was fixed in
>> https://issues.apache.org/jira/browse/LUCENE-8807 (Lucene/Solr 8.2), the
>> Maven POMs were fixed: https://issues.apache.org/jira/browse/LUCENE-8993 
>> (Lucene/Solr
>> 8.3)
>>
>>
>>
>> Sorry both links pointed to same diff. The history is above.
>>
>>
>>
>> So in short: to build Solr from source you need 8.2, otherwise Ivy won’t
>> find any Restlet artifacts. To use the Maven POMs in 3rd party projects,
>> you need 8.3.
>>
>>
>>
>> Uwe
>>
>>
>>
>> -
>>
>> Uwe Schindler
>>
>> Achterdiek 19, D-28357 Bremen
>>
>> https://www.thetaphi.de
>>
>> eMail: u...@thetaphi.de
>>
>>
>>
>> *From:* Uwe Schindler 
>> *Sent:* Saturday, December 28, 2019 12:07 AM
>> *To:* 'Joel Bernstein' ; 'lucene dev' <
>> dev@lucene.apache.org>
>> *Subject:* RE: maven issues with org.restlet.jee:org.restlet
>>
>>
>>
>> Hi,
>>
>>
>>
>> there are few issues:
>>
>>- Java does not support redirects from HTTP -> HTTPS. It simply won’t
>>follow those. This is a known issue and well-known. This was the reason 
>> why
>>I changed all URLs to HTTPS in recently, as any redirect won’t work.  We
>>can’t change that for old Solr releases, they keep broken. I changed this
>>here (possible since 8.3.0):
>>
>> https://github.com/apache/lucene-solr/commit/4a015e224dcd4b1c5f3db92c01d8bf80be3c244a.
>>The Maven POMs were changed a bit later:
>>
>> https://github.com/apache/lucene-solr/commit/4a015e224dcd4b1c5f3db92c01d8bf80be3c244a.
>>So basically everything after 8.3.0 should work correct, older versions
>>cannot be fixed anymore. The change to talend is not the issue, it’s the
>>HTTP->HTTPS one which breaks Ivy.
>>- This is no longer an issue with pure Maven (as they have a
>>workaround), but Ivy can’t handle that (as it relies on Java’s own URL
>>handling). Newer Maven has its own one.
>>- The HTTPS stuff redirects to the talend URL and finally it’s
>>internally handled by Cloudfront. And it looks like it breaks there. With
>>Lucene/Solr Master on Java 11 I get no error. I think Java 8 does not
>>support TLS 1.3 and cloudfront wants this. No idea at all. But it works
>>here.
>>
>>
>>
>> Uwe
>>
>>
>>
>> -
>>
>> Uwe Schindler
>>
>> Achterdiek 19, D-28357 Bremen
>>
>> https://www.thetaphi.de
>>
>> eMail: u...@thetaphi.de
>>
>>
>>
>> *From:* Joel Bernstein 
>> *Sent:* Friday, December 27, 2019 9:17 PM
>> *To:* lucene dev 
>> *Cc:* Uwe Schindler 
>> *Subject:* Re: maven issues with org.restlet.jee:org.restlet
>>
>>
>>
>> Agreed, if they don't fix this it needs to be removed, this is a mess.
>>
>>
>>
>> I did some more digging and the files are present when you point a
>> browser at:
>>
>>
>>
>>
>> https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>>
>>
>> https://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>>
>>
>> http://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>>
>>
>> http://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>>
>>
>>
>> The error I get is a ha

Re: maven issues with org.restlet.jee:org.restlet

2019-12-28 Thread Joel Bernstein
Ok Uwe,

I think I've got all the details. I'll open a ticket with the restlet
project explaining the http->https redirect problem. Hopefully they will
fix this and put this problem to rest (pun intended).

I'll also open a Solr ticket so we can discuss what can be done to possibly
mitigate this issue without the help of the restlet project and to discuss
the removal of this dependency.

Thanks!





Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Dec 27, 2019 at 6:11 PM Uwe Schindler  wrote:

> Sorry,
>
>
>
> the Ivy build was fixed in
> https://issues.apache.org/jira/browse/LUCENE-8807 (Lucene/Solr 8.2), the
> Maven POMs were fixed: https://issues.apache.org/jira/browse/LUCENE-8993 
> (Lucene/Solr
> 8.3)
>
>
>
> Sorry both links pointed to same diff. The history is above.
>
>
>
> So in short: to build Solr from source you need 8.2, otherwise Ivy won’t
> find any Restlet artifacts. To use the Maven POMs in 3rd party projects,
> you need 8.3.
>
>
>
> Uwe
>
>
>
> -
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> https://www.thetaphi.de
>
> eMail: u...@thetaphi.de
>
>
>
> *From:* Uwe Schindler 
> *Sent:* Saturday, December 28, 2019 12:07 AM
> *To:* 'Joel Bernstein' ; 'lucene dev' <
> dev@lucene.apache.org>
> *Subject:* RE: maven issues with org.restlet.jee:org.restlet
>
>
>
> Hi,
>
>
>
> there are few issues:
>
>- Java does not support redirects from HTTP -> HTTPS. It simply won’t
>follow those. This is a known issue and well-known. This was the reason why
>I changed all URLs to HTTPS in recently, as any redirect won’t work.  We
>can’t change that for old Solr releases, they keep broken. I changed this
>here (possible since 8.3.0):
>
> https://github.com/apache/lucene-solr/commit/4a015e224dcd4b1c5f3db92c01d8bf80be3c244a.
>The Maven POMs were changed a bit later:
>
> https://github.com/apache/lucene-solr/commit/4a015e224dcd4b1c5f3db92c01d8bf80be3c244a.
>So basically everything after 8.3.0 should work correct, older versions
>cannot be fixed anymore. The change to talend is not the issue, it’s the
>HTTP->HTTPS one which breaks Ivy.
>- This is no longer an issue with pure Maven (as they have a
>workaround), but Ivy can’t handle that (as it relies on Java’s own URL
>handling). Newer Maven has its own one.
>- The HTTPS stuff redirects to the talend URL and finally it’s
>internally handled by Cloudfront. And it looks like it breaks there. With
>Lucene/Solr Master on Java 11 I get no error. I think Java 8 does not
>support TLS 1.3 and cloudfront wants this. No idea at all. But it works
>here.
>
>
>
> Uwe
>
>
>
> -
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> https://www.thetaphi.de
>
> eMail: u...@thetaphi.de
>
>
>
> *From:* Joel Bernstein 
> *Sent:* Friday, December 27, 2019 9:17 PM
> *To:* lucene dev 
> *Cc:* Uwe Schindler 
> *Subject:* Re: maven issues with org.restlet.jee:org.restlet
>
>
>
> Agreed, if they don't fix this it needs to be removed, this is a mess.
>
>
>
> I did some more digging and the files are present when you point a browser
> at:
>
>
>
>
> https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>
>
> https://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>
>
> http://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>
>
> http://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
> <https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
>
>
>
> The error I get is a handshake failure which is a failure to connect
> through the Maven java libraries. So, something about how they're hosting
> these files seems to be problematic.
>
>
>
> Joel Bernstein
>
> http://joelsolr.blogspot.com/
>
>
>
>
>
> On Fri, Dec 27, 2019 at 2:10 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
> Let us find out and eliminate all uses of restlet from Solr. I don't think
> we should be relying on any dependency that is not published to Maven
> Central.
>
>
>
> On Sat, 28 Dec, 2019, 12:32 AM Joel Bernstein,  wrote:
>
> Ok, thanks.
>
>
>
> I'll dig around some more and see if I find a solution. And I'll complain
> to them for sure.
>
>
>
>
> Joel Bernstein
>
> http:/

Re: maven issues with org.restlet.jee:org.restlet

2019-12-27 Thread Joel Bernstein
I updated this ticket:

https://github.com/restlet/restlet-framework-java/issues/481

If I don't hear back soon, I'll create a new ticket specific to the Solr
issues.


Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Dec 27, 2019 at 4:07 PM Joel Bernstein  wrote:

> Older versions of Solr can also not be built from from ivy. This is from a
> 7.1 build:
>
> [ivy:retrieve]  maven.restlet.org: tried
>
> [ivy:retrieve]
> http://maven.restlet.org/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.pom
>
> [ivy:retrieve]   -- artifact
> org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar:
>
> [ivy:retrieve]
> http://maven.restlet.org/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar
>
> [ivy:retrieve]  sonatype-releases: tried
>
> [ivy:retrieve]
> https://oss.sonatype.org/content/repositories/releases/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.pom
>
> [ivy:retrieve]   -- artifact
> org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar:
>
> [ivy:retrieve]
> https://oss.sonatype.org/content/repositories/releases/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar
>
> [ivy:retrieve]  releases.cloudera.com: tried
>
> [ivy:retrieve]
> http://repository.cloudera.com/content/repositories/releases/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.pom
>
> [ivy:retrieve]   -- artifact
> org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar:
>
> [ivy:retrieve]
> http://repository.cloudera.com/content/repositories/releases/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar
>
> [ivy:retrieve]  working-chinese-mirror: tried
>
> [ivy:retrieve]
> http://uk.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.pom
>
> [ivy:retrieve]   -- artifact
> org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar:
>
> [ivy:retrieve]
> http://uk.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar
>
> [ivy:retrieve] ::
>
> [ivy:retrieve] ::  UNRESOLVED DEPENDENCIES ::
>
> [ivy:retrieve] ::
>
> [ivy:retrieve] :: org.restlet.jee#org.restlet;2.3.0: not found
>
> [ivy:retrieve] :: org.restlet.jee#org.restlet.ext.servlet;2.3.0: not found
>
> [ivy:retrieve] ::
>
> [ivy:retrieve]
>
> [ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Fri, Dec 27, 2019 at 3:18 PM Joel Bernstein  wrote:
>
>> But if you go to the directory rather then file you see the redirection
>> to:
>>
>> https://maven.restlet.talend.com/org/restlet/jee/org.restlet/2.3.0/
>>
>> This redirection is likely the problem, as Uwe mentioned.
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>> On Fri, Dec 27, 2019 at 3:16 PM Joel Bernstein 
>> wrote:
>>
>>> Agreed, if they don't fix this it needs to be removed, this is a mess.
>>>
>>> I did some more digging and the files are present when you point a
>>> browser at:
>>>
>>>
>>> https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>>>
>>> https://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>>> <https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
>>>
>>> http://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>>> <https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
>>>
>>> http://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>>> <https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
>>>
>>> The error I get is a handshake failure which is a failure to connect
>>> through the Maven java libraries. So, something about how they're hosting
>>> these files seems to be problematic.
>>>
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>>
>>>
>>> On Fri, Dec 27, 2019 at 2:10 PM Ishan Chattopadhyaya <
>>> ichattopadhy...@gmail.com> wrote:
>>>
>>>> Let us find out and eliminate all uses of restlet from Solr. I don't
>>>> think we should be relyin

Re: maven issues with org.restlet.jee:org.restlet

2019-12-27 Thread Joel Bernstein
Older versions of Solr can also not be built from Ivy. This is from a
7.1 build:

[ivy:retrieve]  maven.restlet.org: tried

[ivy:retrieve]
http://maven.restlet.org/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.pom

[ivy:retrieve]   -- artifact
org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar:

[ivy:retrieve]
http://maven.restlet.org/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar

[ivy:retrieve]  sonatype-releases: tried

[ivy:retrieve]
https://oss.sonatype.org/content/repositories/releases/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.pom

[ivy:retrieve]   -- artifact
org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar:

[ivy:retrieve]
https://oss.sonatype.org/content/repositories/releases/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar

[ivy:retrieve]  releases.cloudera.com: tried

[ivy:retrieve]
http://repository.cloudera.com/content/repositories/releases/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.pom

[ivy:retrieve]   -- artifact
org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar:

[ivy:retrieve]
http://repository.cloudera.com/content/repositories/releases/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar

[ivy:retrieve]  working-chinese-mirror: tried

[ivy:retrieve]
http://uk.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.pom

[ivy:retrieve]   -- artifact
org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar:

[ivy:retrieve]
http://uk.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar

[ivy:retrieve] ::

[ivy:retrieve] ::  UNRESOLVED DEPENDENCIES ::

[ivy:retrieve] ::

[ivy:retrieve] :: org.restlet.jee#org.restlet;2.3.0: not found

[ivy:retrieve] :: org.restlet.jee#org.restlet.ext.servlet;2.3.0: not found

[ivy:retrieve] ::

[ivy:retrieve]

[ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS


Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Dec 27, 2019 at 3:18 PM Joel Bernstein  wrote:

> But if you go to the directory rather then file you see the redirection to:
>
> https://maven.restlet.talend.com/org/restlet/jee/org.restlet/2.3.0/
>
> This redirection is likely the problem, as Uwe mentioned.
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Fri, Dec 27, 2019 at 3:16 PM Joel Bernstein  wrote:
>
>> Agreed, if they don't fix this it needs to be removed, this is a mess.
>>
>> I did some more digging and the files are present when you point a
>> browser at:
>>
>>
>> https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>>
>> https://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>> <https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
>>
>> http://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>> <https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
>>
>> http://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>> <https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
>>
>> The error I get is a handshake failure which is a failure to connect
>> through the Maven java libraries. So, something about how they're hosting
>> these files seems to be problematic.
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>> On Fri, Dec 27, 2019 at 2:10 PM Ishan Chattopadhyaya <
>> ichattopadhy...@gmail.com> wrote:
>>
>>> Let us find out and eliminate all uses of restlet from Solr. I don't
>>> think we should be relying on any dependency that is not published to Maven
>>> Central.
>>>
>>> On Sat, 28 Dec, 2019, 12:32 AM Joel Bernstein, 
>>> wrote:
>>>
>>>> Ok, thanks.
>>>>
>>>> I'll dig around some more and see if I find a solution. And I'll
>>>> complain to them for sure.
>>>>
>>>>
>>>> Joel Bernstein
>>>> http://joelsolr.blogspot.com/
>>>>
>>>>
>>>> On Fri, Dec 27, 2019 at 1:57 PM Uwe Schindler  wrote:
>>>>
>>>>> No idea. Complaint at them for breaking millions of builds.
>>>>>
>>>>> They should really post their stuff to Maven Central. No idea why they
>>>

Re: maven issues with org.restlet.jee:org.restlet

2019-12-27 Thread Joel Bernstein
But if you go to the directory rather than the file you see the redirection to:

https://maven.restlet.talend.com/org/restlet/jee/org.restlet/2.3.0/

This redirection is likely the problem, as Uwe mentioned.



Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Dec 27, 2019 at 3:16 PM Joel Bernstein  wrote:

> Agreed, if they don't fix this it needs to be removed, this is a mess.
>
> I did some more digging and the files are present when you point a browser
> at:
>
>
> https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
>
> https://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
> <https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
>
> http://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
> <https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
>
> http://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
> <https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
>
> The error I get is a handshake failure which is a failure to connect
> through the Maven java libraries. So, something about how they're hosting
> these files seems to be problematic.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Fri, Dec 27, 2019 at 2:10 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> Let us find out and eliminate all uses of restlet from Solr. I don't
>> think we should be relying on any dependency that is not published to Maven
>> Central.
>>
>> On Sat, 28 Dec, 2019, 12:32 AM Joel Bernstein, 
>> wrote:
>>
>>> Ok, thanks.
>>>
>>> I'll dig around some more and see if I find a solution. And I'll
>>> complain to them for sure.
>>>
>>>
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>>
>>>
>>> On Fri, Dec 27, 2019 at 1:57 PM Uwe Schindler  wrote:
>>>
>>>> No idea. Complaint at them for breaking millions of builds.
>>>>
>>>> They should really post their stuff to Maven Central. No idea why they
>>>> don't do this.
>>>>
>>>> Uwe
>>>>
>>>> Am December 27, 2019 6:54:04 PM UTC schrieb Joel Bernstein <
>>>> joels...@gmail.com>:
>>>>>
>>>>> Yeah this a crazy way for them to manage dependencies.
>>>>>
>>>>> I see the old URL now redirects to https://maven.restlet.talend.com/.
>>>>>
>>>>> I tried adding the repo to my POM as follows:
>>>>>
>>>>> <repositories>
>>>>>   <repository>
>>>>>     <id>maven-restlet</id>
>>>>>     <name>Restlet repository</name>
>>>>>     <url>https://maven.restlet.talend.com</url>
>>>>>   </repository>
>>>>> </repositories>
>>>>>
>>>>> And still get the handshake error. I tried http and still get the same
>>>>> handshake error.
>>>>>
>>>>> Any thoughts on what to try next?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Joel Bernstein
>>>>> http://joelsolr.blogspot.com/
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Dec 27, 2019 at 1:46 PM Uwe Schindler  wrote:
>>>>>
>>>>>> I figured out they again changed urls. No to talend.
>>>>>>
>>>>>> This is a big issue and should reported that this horrible company,
>>>>>> sorry! This is a no go for maven dependencies. The reason is that Java
>>>>>> handles redirection in a bad way. So never ever change urls for branding
>>>>>> purposes! Sorry Talked: bad idea, revert this…!
>>>>>>
>>>>>> Uwe
>>>>>>
>>>>>> Uwe
>>>>>>
>>>>>> Am December 27, 2019 6:42:49 PM UTC schrieb Uwe Schindler <
>>>>>> u...@thetaphi.de>:
>>>>>>>
>>>>>>> This should be fixed with newer versions of Solr. The reason is
>>>>>>> missing https and this causes some redirection problems.
>>>>>>>
>>>>>>> Maybe you are using a Solr version with a POM that still refers to
>>>>>>> non encrypted artifact repos.
>>>>>>>
>>>>>>> This was driving me crazy when I changed the remote repositories a
>>>>>>> whole ago, too.
>>>>>>>
>>>>>>> Uwe
>>>>>>>
>>>>>>> Am December 27, 2019 6:33:32 PM UTC schrieb Joel Bernstein <
>>>>>>> joels...@gmail.com>:
>>>>>>>>
>>>>>>>> I'm currently building an outside project that uses the solrj and
>>>>>>>> solr-core dependencies. I'm getting the following errors when 
>>>>>>>> attempting
>>>>>>>> build the project on a jenkins server:
>>>>>>>>
>>>>>>>> Failed to read artifact descriptor for 
>>>>>>>> org.restlet.jee:org.restlet:jar:2.3.0: Could not transfer artifact 
>>>>>>>> org.restlet.jee:org.restlet:pom:2.3.0 from/to maven-restlet 
>>>>>>>> (http://maven.restlet.org): Received fatal alert: handshake_failure
>>>>>>>>
>>>>>>>>
>>>>>>>> Has anyone ran into the restlet resolution issues when resolving
>>>>>>>> Solr dependencies before and found the fix?
>>>>>>>>
>>>>>>>>
>>>>>>>> Joel Bernstein
>>>>>>>> http://joelsolr.blogspot.com/
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Uwe Schindler
>>>>>>> Achterdiek 19, 28357 Bremen
>>>>>>> https://www.thetaphi.de
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Uwe Schindler
>>>>>> Achterdiek 19, 28357 Bremen
>>>>>> https://www.thetaphi.de
>>>>>>
>>>>>
>>>> --
>>>> Uwe Schindler
>>>> Achterdiek 19, 28357 Bremen
>>>> https://www.thetaphi.de
>>>>
>>>


Re: maven issues with org.restlet.jee:org.restlet

2019-12-27 Thread Joel Bernstein
Agreed, if they don't fix this it needs to be removed; this is a mess.

I did some more digging and the files are present when you point a browser
at:

https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
https://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
<https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
http://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
<https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>
http://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
<https://maven.restlet.com/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar>

The error I get is a handshake failure which is a failure to connect
through the Maven java libraries. So, something about how they're hosting
these files seems to be problematic.
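
One thing that might help narrow it down is to check which TLS versions the
local JVM actually offers, since a handshake_failure can simply mean the
server and the JVM have no protocol or cipher in common. A rough sketch
(not something from this thread, just plain JDK calls):

// Hedged sketch: lists the TLS protocols this JVM enables by default.
// If the repository host insists on something newer, the handshake fails
// before Maven/Ivy ever sees the artifact.
import javax.net.ssl.SSLContext;

public class TlsDefaults {
  public static void main(String[] args) throws Exception {
    SSLContext ctx = SSLContext.getDefault();
    System.out.println("Default protocols: "
        + String.join(", ", ctx.getDefaultSSLParameters().getProtocols()));
    System.out.println("Supported protocols: "
        + String.join(", ", ctx.getSupportedSSLParameters().getProtocols()));
  }
}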

Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Dec 27, 2019 at 2:10 PM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> Let us find out and eliminate all uses of restlet from Solr. I don't think
> we should be relying on any dependency that is not published to Maven
> Central.
>
> On Sat, 28 Dec, 2019, 12:32 AM Joel Bernstein,  wrote:
>
>> Ok, thanks.
>>
>> I'll dig around some more and see if I find a solution. And I'll complain
>> to them for sure.
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>> On Fri, Dec 27, 2019 at 1:57 PM Uwe Schindler  wrote:
>>
>>> No idea. Complaint at them for breaking millions of builds.
>>>
>>> They should really post their stuff to Maven Central. No idea why they
>>> don't do this.
>>>
>>> Uwe
>>>
>>> Am December 27, 2019 6:54:04 PM UTC schrieb Joel Bernstein <
>>> joels...@gmail.com>:
>>>>
>>>> Yeah this a crazy way for them to manage dependencies.
>>>>
>>>> I see the old URL now redirects to https://maven.restlet.talend.com/.
>>>>
>>>> I tried adding the repo to my POM as follows:
>>>>
>>>> <repositories>
>>>>   <repository>
>>>>     <id>maven-restlet</id>
>>>>     <name>Restlet repository</name>
>>>>     <url>https://maven.restlet.talend.com</url>
>>>>   </repository>
>>>> </repositories>
>>>>
>>>> And still get the handshake error. I tried http and still get the same
>>>> handshake error.
>>>>
>>>> Any thoughts on what to try next?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Joel Bernstein
>>>> http://joelsolr.blogspot.com/
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Dec 27, 2019 at 1:46 PM Uwe Schindler  wrote:
>>>>
>>>>> I figured out they again changed urls. No to talend.
>>>>>
>>>>> This is a big issue and should reported that this horrible company,
>>>>> sorry! This is a no go for maven dependencies. The reason is that Java
>>>>> handles redirection in a bad way. So never ever change urls for branding
>>>>> purposes! Sorry Talked: bad idea, revert this…!
>>>>>
>>>>> Uwe
>>>>>
>>>>> Uwe
>>>>>
>>>>> Am December 27, 2019 6:42:49 PM UTC schrieb Uwe Schindler <
>>>>> u...@thetaphi.de>:
>>>>>>
>>>>>> This should be fixed with newer versions of Solr. The reason is
>>>>>> missing https and this causes some redirection problems.
>>>>>>
>>>>>> Maybe you are using a Solr version with a POM that still refers to
>>>>>> non encrypted artifact repos.
>>>>>>
>>>>>> This was driving me crazy when I changed the remote repositories a
>>>>>> whole ago, too.
>>>>>>
>>>>>> Uwe
>>>>>>
>>>>>> Am December 27, 2019 6:33:32 PM UTC schrieb Joel Bernstein <
>>>>>> joels...@gmail.com>:
>>>>>>>
>>>>>>> I'm currently building an outside project that uses the solrj and
>>>>>>> solr-core dependencies. I'm getting the following errors when attempting
>>>>>>> build the project on a jenkins server:
>>>>>>>
>>>>>>> Failed to read artifact descriptor for 
>>>>>>> org.restlet.jee:org.restlet:jar:2.3.0: Could not transfer artifact 
>>>>>>> org.restlet.jee:org.restlet:pom:2.3.0 from/to maven-restlet 
>>>>>>> (http://maven.restlet.org): Received fatal alert: handshake_failure
>>>>>>>
>>>>>>>
>>>>>>> Has anyone ran into the restlet resolution issues when resolving
>>>>>>> Solr dependencies before and found the fix?
>>>>>>>
>>>>>>>
>>>>>>> Joel Bernstein
>>>>>>> http://joelsolr.blogspot.com/
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Uwe Schindler
>>>>>> Achterdiek 19, 28357 Bremen
>>>>>> https://www.thetaphi.de
>>>>>
>>>>>
>>>>> --
>>>>> Uwe Schindler
>>>>> Achterdiek 19, 28357 Bremen
>>>>> https://www.thetaphi.de
>>>>>
>>>>
>>> --
>>> Uwe Schindler
>>> Achterdiek 19, 28357 Bremen
>>> https://www.thetaphi.de
>>>
>>


Re: maven issues with org.restlet.jee:org.restlet

2019-12-27 Thread Joel Bernstein
Ok, thanks.

I'll dig around some more and see if I find a solution. And I'll complain
to them for sure.


Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Dec 27, 2019 at 1:57 PM Uwe Schindler  wrote:

> No idea. Complaint at them for breaking millions of builds.
>
> They should really post their stuff to Maven Central. No idea why they
> don't do this.
>
> Uwe
>
> Am December 27, 2019 6:54:04 PM UTC schrieb Joel Bernstein <
> joels...@gmail.com>:
>>
>> Yeah this a crazy way for them to manage dependencies.
>>
>> I see the old URL now redirects to https://maven.restlet.talend.com/.
>>
>> I tried adding the repo to my POM as follows:
>>
>> <repositories>
>>   <repository>
>>     <id>maven-restlet</id>
>>     <name>Restlet repository</name>
>>     <url>https://maven.restlet.talend.com</url>
>>   </repository>
>> </repositories>
>>
>> And still get the handshake error. I tried http and still get the same
>> handshake error.
>>
>> Any thoughts on what to try next?
>>
>>
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>>
>>
>>
>> On Fri, Dec 27, 2019 at 1:46 PM Uwe Schindler  wrote:
>>
>>> I figured out they again changed urls. No to talend.
>>>
>>> This is a big issue and should reported that this horrible company,
>>> sorry! This is a no go for maven dependencies. The reason is that Java
>>> handles redirection in a bad way. So never ever change urls for branding
>>> purposes! Sorry Talked: bad idea, revert this…!
>>>
>>> Uwe
>>>
>>> Uwe
>>>
>>> Am December 27, 2019 6:42:49 PM UTC schrieb Uwe Schindler <
>>> u...@thetaphi.de>:
>>>>
>>>> This should be fixed with newer versions of Solr. The reason is missing
>>>> https and this causes some redirection problems.
>>>>
>>>> Maybe you are using a Solr version with a POM that still refers to non
>>>> encrypted artifact repos.
>>>>
>>>> This was driving me crazy when I changed the remote repositories a
>>>> whole ago, too.
>>>>
>>>> Uwe
>>>>
>>>> Am December 27, 2019 6:33:32 PM UTC schrieb Joel Bernstein <
>>>> joels...@gmail.com>:
>>>>>
>>>>> I'm currently building an outside project that uses the solrj and
>>>>> solr-core dependencies. I'm getting the following errors when attempting
>>>>> build the project on a jenkins server:
>>>>>
>>>>> Failed to read artifact descriptor for 
>>>>> org.restlet.jee:org.restlet:jar:2.3.0: Could not transfer artifact 
>>>>> org.restlet.jee:org.restlet:pom:2.3.0 from/to maven-restlet 
>>>>> (http://maven.restlet.org): Received fatal alert: handshake_failure
>>>>>
>>>>>
>>>>> Has anyone ran into the restlet resolution issues when resolving Solr
>>>>> dependencies before and found the fix?
>>>>>
>>>>>
>>>>> Joel Bernstein
>>>>> http://joelsolr.blogspot.com/
>>>>>
>>>>
>>>> --
>>>> Uwe Schindler
>>>> Achterdiek 19, 28357 Bremen
>>>> https://www.thetaphi.de
>>>
>>>
>>> --
>>> Uwe Schindler
>>> Achterdiek 19, 28357 Bremen
>>> https://www.thetaphi.de
>>>
>>
> --
> Uwe Schindler
> Achterdiek 19, 28357 Bremen
> https://www.thetaphi.de
>


Re: maven issues with org.restlet.jee:org.restlet

2019-12-27 Thread Joel Bernstein
Yeah, this is a crazy way for them to manage dependencies.

I see the old URL now redirects to https://maven.restlet.talend.com/.

I tried adding the repo to my POM as follows:



<repositories>
  <repository>
    <id>maven-restlet</id>
    <name>Restlet repository</name>
    <url>https://maven.restlet.talend.com</url>
  </repository>
</repositories>

And still get the handshake error. I tried http and still get the same
handshake error.
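
In case it is a pure TLS problem rather than a Maven problem, a bare
handshake test straight from the JVM might tell us more. A hedged sketch
(plain JDK, host is the one the repo now redirects to):

import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class HandshakeCheck {
  public static void main(String[] args) throws Exception {
    SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
    try (SSLSocket socket =
        (SSLSocket) factory.createSocket("maven.restlet.talend.com", 443)) {
      socket.startHandshake(); // throws SSLHandshakeException on failure
      System.out.println("Protocol: " + socket.getSession().getProtocol());
      System.out.println("Cipher: " + socket.getSession().getCipherSuite());
    }
  }
}

If this fails with the same handshake error, the problem is between the JVM
and their TLS setup and has nothing to do with the Maven repo metadata.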

Any thoughts on what to try next?





Joel Bernstein
http://joelsolr.blogspot.com/





On Fri, Dec 27, 2019 at 1:46 PM Uwe Schindler  wrote:

> I figured out they again changed urls. No to talend.
>
> This is a big issue and should reported that this horrible company, sorry!
> This is a no go for maven dependencies. The reason is that Java handles
> redirection in a bad way. So never ever change urls for branding purposes!
> Sorry Talked: bad idea, revert this…!
>
> Uwe
>
> Uwe
>
> Am December 27, 2019 6:42:49 PM UTC schrieb Uwe Schindler  >:
>>
>> This should be fixed with newer versions of Solr. The reason is missing
>> https and this causes some redirection problems.
>>
>> Maybe you are using a Solr version with a POM that still refers to non
>> encrypted artifact repos.
>>
>> This was driving me crazy when I changed the remote repositories a whole
>> ago, too.
>>
>> Uwe
>>
>> Am December 27, 2019 6:33:32 PM UTC schrieb Joel Bernstein <
>> joels...@gmail.com>:
>>>
>>> I'm currently building an outside project that uses the solrj and
>>> solr-core dependencies. I'm getting the following errors when attempting
>>> build the project on a jenkins server:
>>>
>>> Failed to read artifact descriptor for 
>>> org.restlet.jee:org.restlet:jar:2.3.0: Could not transfer artifact 
>>> org.restlet.jee:org.restlet:pom:2.3.0 from/to maven-restlet 
>>> (http://maven.restlet.org): Received fatal alert: handshake_failure
>>>
>>>
>>> Has anyone ran into the restlet resolution issues when resolving Solr
>>> dependencies before and found the fix?
>>>
>>>
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>>
>>
>> --
>> Uwe Schindler
>> Achterdiek 19, 28357 Bremen
>> https://www.thetaphi.de
>
>
> --
> Uwe Schindler
> Achterdiek 19, 28357 Bremen
> https://www.thetaphi.de
>


maven issues with org.restlet.jee:org.restlet

2019-12-27 Thread Joel Bernstein
I'm currently building an outside project that uses the solrj and solr-core
dependencies. I'm getting the following errors when attempting to build the
project on a Jenkins server:

Failed to read artifact descriptor for
org.restlet.jee:org.restlet:jar:2.3.0: Could not transfer artifact
org.restlet.jee:org.restlet:pom:2.3.0 from/to maven-restlet
(http://maven.restlet.org): Received fatal alert: handshake_failure


Has anyone run into these restlet resolution issues when resolving Solr
dependencies before and found a fix?


Joel Bernstein
http://joelsolr.blogspot.com/


Re: Disturbing and steady decrease in boosting by date performance (and maybe others).

2019-12-18 Thread Joel Bernstein
It would be interesting to analyze the QTimes for individual queries from
the logs for these runs. If you ship me the log files I can take a look.
I'll also be posting a branch with a new command line tool for posting logs
to be indexed in Solr tomorrow, and you can take a look at that.
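
If it helps, this is roughly the kind of thing I mean; a rough sketch that
assumes the standard request log format with a QTime= field on each query
line (not a polished tool):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class QTimeSummary {
  public static void main(String[] args) throws Exception {
    Pattern qtime = Pattern.compile("QTime=(\\d+)");
    try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
      List<Integer> times = lines
          .map(qtime::matcher)
          .filter(Matcher::find)
          .map(m -> Integer.parseInt(m.group(1)))
          .sorted()
          .collect(Collectors.toList());
      int n = times.size();
      System.out.println("queries=" + n
          + " median=" + times.get(n / 2)
          + " p95=" + times.get((int) (n * 0.95))
          + " max=" + times.get(n - 1));
    }
  }
}

Comparing the medians and tails per run is usually more telling than the
overall throughput number.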

And the profiler is probably the only way to know for sure what's happening
here.





Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Dec 18, 2019 at 7:37 PM Erick Erickson 
wrote:

> The very short form is that from Solr 6.6.1 to Solr 8.3.1, the throughput
> for date boosting in my tests dropped by 40+%
>
> I’ve been hearing about slowdowns in successive Solr releases with boost
> functions, so I dug into it a bit. The test setup is just a boost-by-date
> with an additional big OR clause of 100 random words so I’d be sure to hit
> a bunch of docs. I figured that if there were few hits, the signal would be
> lost in the noise, but I didn’t look at the actual hit counts.
>
> I saw several Solr JIRAs about this subject, but they were slightly
> different, although quite possibly the same underlying issue. So I tried to
> get this down to a very specific form of a query.
>
> I’ve also seen some cases in the wild where the response was proportional
> to the number of segments, thus my optimize experiments.
>
> Here are the results, explanation below. O stands for optimized to one
> segment. I spot checked pdate against 7x and 8x and they weren’t
> significantly different performance wise from tdate. All have docValues
> enabled. I ran these against a multiValued=“false” field. All the tests
> pegged all my CPUs. Jmeter is being run on a different machine than Solr.
> Only one Solr was running for any test.
>
> Solr version   queries/min
> 6.6.1  3,400
> 6.6.1 O   4,800
>
> 7.1 2,800
> 7.1 O 4,200
>
> 7.7.1  2,400
> 7.7.1 O  3,500
>
> 8.3.1 2,000
> 8.3.1 O  2,600
>
>
> The tests I’ve been running just index 20M docs into a single core, then
> run the exact same 10,000 queries against them from jmeter with 24 threads.
> Spot checks showed no hits on the queryResultCache.
>
> A query looks like this:
> rows=0&{!boost b=recip(ms(NOW,
> INSERT_FIELD_HERE),3.16e-11,1,1)}text_txt:(campaigners OR adjourned OR
> anyplace…97 more random words)
>
> There is no faceting. No grouping. No sorting.
>
> I fill in INSERT_FIELD_HERE through jmeter magic. I’m running the exact
> same queries for every test.
>
> One wildcard is that I did regenerate the index for each major revision,
> and the chose random words from the same list of words, as well as random
> times (bounded in the same range though) so the docs are not completely
> identical. The index was in the native format for that major version even
> if slightly different between versions. I ran the test once, then ran it
> again after optimizing the index.
>
> I haven’t dug any farther, if anyone’s interested I can throw a profiler
> at, say, 8.3 and see what I can see, although I’m not going to have time to
> dive into this any time soon. I’d be glad to run some tests though. I saved
> the queries and the indexes so running a test would  only take a few
> minutes.
>
> While I concentrated on date fields, the docs have date, int, and long
> fields, both docValues=true and docValues=false, each variant with
> multiValued=true and multiValued=false and both Trie and Point (where
> possible) variants as well as a pretty simple text field.
>
> Erick
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Feedback requested on: SOLR-14014: Allow Solr to start with Admin UI disabled

2019-12-05 Thread Joel Bernstein
Please respond on the ticket:

https://issues.apache.org/jira/browse/SOLR-14014

This is in response to the discussions on SOLR-13987 (fix admin UI to not
rely on javascript eval(), which is a security issue).

https://issues.apache.org/jira/browse/SOLR-13987

SOLR-14014 is not the final solution. It mitigates the issues discussed in
SOLR-13987 in the short term until a final solution is determined.

Joel Bernstein
http://joelsolr.blogspot.com/


Re: Commit / Code Review Policy

2019-11-26 Thread Joel Bernstein
I think more is needed going forward.

What I would like to see is an explicit "risks" section of the jira that
shows the committer has thought about the different risks to the system
before committing code that affects the core. The committer should take
time to understand what parts of the system might be affected. This will do
two things:

1) Force the committer to think more about how their change affects the
rest of the system.
2) Help other committers understand the risks so that there is more
incentive to get involved and test.





Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Nov 26, 2019 at 4:26 AM Atri Sharma  wrote:

> +1
>
> I am generally wary of such proposals since they tend to impose hard
> processes in the places where trust should be dominant instead.
>
> Apart from that, LGTM
>
> On Tue, 26 Nov 2019 at 14:46, Adrien Grand  wrote:
>
>> This document looks reasonable to me and a good description of the way
>> changes get merged today. Something it says between the lines and that
>> is the most important bit to me is that this isn't really a policy but
>> rather a set of guidelines, and that we trust each other to do the
>> right thing. Maybe we could better reflect this in the name, e.g.
>> "Commit/Merging guidelines".
>>
>> On Tue, Nov 26, 2019 at 6:34 AM David Smiley 
>> wrote:
>> >
>> > Last Wednesday at a Solr committers meeting, there was general
>> agreement in attendance to raise the bar for commit permission to require
>> another's consent, which might not have entailed a review of the code.  I
>> volunteered to draft a proposal.  Other things distracted me but I'm
>> finally thinking of this task now.  *This email is NOT the proposal*.
>> >
>> > I was about to write something from scratch when I discovered we
>> already have some internal documentation on a commit policy that is both
>> reasonably well written/composed and the actual policy/information is
>> pretty good -- kudos to the mystery author!
>> >
>> > https://cwiki.apache.org/confluence/display/SOLR/CommitPolicy
>> >
>> > I'd prefer we have one "Commit Policy" document for Lucene/Solr and
>> only call out Solr specifics when applicable.  This is easier to maintain
>> and is in line with the joint-ness of Lucene TLP.  So I think it should
>> move to the Lucene cwiki.  Granted there is a possibility this kind of
>> content might move into our source control somewhere but that possibility
>> is a subject for another day.
>> >
>> > I plan to copy this to Lucene, mark as PROPOSAL and then make some
>> large edits.  The diff will probably be kinda unrecognizable despite it
>> being in nice shape now.  A "Commit Policy" is more broad that a "Code
>> Review Policy"; it could cover a variety of things.  For example when to
>> commit without even filing a JIRA issue, which I think is worth
>> mentioning.  It should probably also cover Git considerations like merge vs
>> rebase, and multiple commits vs squashing.  Maybe we should also cover when
>> to bother adding to CHANGES.txt and "via"?  Probably commit message
>> requirements.  Snowballing scope :-). Probably not JIRA metadata as it's
>> not part of the commit to be part of a commit policy, but _somewhere_
>> that's needed.  I'm not sure I want to  sign up for all that now but at
>> least for the code review subject.
>> >
>> > ~ David Smiley
>> > Apache Lucene/Solr Search Developer
>> > http://www.linkedin.com/in/davidwsmiley
>>
>>
>>
>> --
>> Adrien
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>> --
> Regards,
>
> Atri
> Apache Concerted
>


Re: Lucene/Solr 8.4

2019-11-22 Thread Joel Bernstein
+1

Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Nov 22, 2019 at 10:08 AM Michael Sokolov  wrote:

> +1 from me - does this mean you (Adrien) are volunteering to be RM?
>
> On Fri, Nov 22, 2019 at 9:01 AM Erick Erickson 
> wrote:
> >
> > +1
> >
> > > On Nov 22, 2019, at 5:10 AM, Ignacio Vera  wrote:
> > >
> > > +1
> > >
> > > On Fri, Nov 22, 2019 at 10:56 AM jim ferenczi 
> wrote:
> > > +1
> > >
> > > Le ven. 22 nov. 2019 à 10:08, Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> a écrit :
> > > +1
> > >
> > > On Fri, Nov 22, 2019 at 2:16 PM Atri Sharma  wrote:
> > > >
> > > > +1
> > > >
> > > > On Fri, Nov 22, 2019 at 2:08 PM Adrien Grand 
> wrote:
> > > > >
> > > > > Hello all,
> > > > >
> > > > > With Thanksgiving and then Christmas coming up, this is going to
> be a
> > > > > busy time for most of us. I'd like to get a new release before the
> end
> > > > > of the year, so I'm proposing the following schedule for
> Lucene/Solr
> > > > > 8.4:
> > > > >  - cutting the branch on December 12th
> > > > >  - building the first RC on December 14th
> > > > > and hopefully we'll have a release in the following week.
> > > > >
> > > > > --
> > > > > Adrien
> > > > >
> > > > >
> -
> > > > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > > > > For additional commands, e-mail: dev-h...@lucene.apache.org
> > > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > >
> > > > Atri
> > > > Apache Concerted
> > > >
> > > > -
> > > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > > > For additional commands, e-mail: dev-h...@lucene.apache.org
> > > >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > > For additional commands, e-mail: dev-h...@lucene.apache.org
> > >
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: Solr 8.3 Solrj streaming expressions do not return all field values

2019-11-20 Thread Joel Bernstein
Ah, I should have noticed that. Yes, there was a change in how string
literals are handled. Quotes are now required for string literals in Math
Expressions. I suspect this change is going to cause problems for other
people as well, but it needed to be done to clarify aspects of the
language. Sorry for the confusion. I'll make a point of updating the
documentation to make sure all examples in the documentation are correct.


Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Nov 19, 2019 at 5:56 PM Jörn Franke  wrote:

> Ok, solved.  Solr 8.2 accepted this statement:
> select(search(testcollection,q="test",df="Default",defType="edismax",fl=
> "id", qt="/export", sort="id asc"),id,if(eq(1,1),Y,N) as found)
>
> and return to me the expected results. Note that around Y and N there is
> no ". Solr 8.3 requires the following statement
> select(search(testcollection,q="test",df="Default",defType="edismax",fl=
> "id", qt="/export", sort="id asc"),id,if(eq(1,1),"Y","N") as found)
> to return the exact results. I do not know why I did originally not
> include the quotation marks around Y and N, but it seems that 8.2 accepted
> this and 8.3 not. I will update the Jira.
>
>
>
> On Tue, Nov 19, 2019 at 11:25 PM Jörn Franke  wrote:
>
>> Could it be that Solr 8.3 is more strict on the if statement?
>> the statement if(eq(1,1),Y,N)
>> is supposed to return the character Y (not the field). In Solr 8.2 it
>> returns the character Y, but in Solr 8.3 not.
>>
>> On Tue, Nov 19, 2019 at 11:21 PM Jörn Franke 
>> wrote:
>>
>>> thanks i will find another installation of Solr, but last time the
>>> underlying queries to the export handler were correct. What was wrong was
>>> the generated field "found" based on the if statement based NOT on any data
>>> in the collectioj.  As said, in Solr 8.3 the id fields are returned, but
>>> not the "found" field, which is generated by the selected statement:
>>> select(search(testcollection,q="test",df="Default",defType="edismax",fl=
>>> "id", qt="/export", sort="id asc"),id,if(eq(1,1),Y,N) as found)
>>>
>>> returns a tuple
>>> id: "blabla"
>>>
>>> but not the part generated by the if(eq(1,1),Y,N) as found. In Solr 8.2
>>> the found field is returned:
>>> id: "blabla"
>>> found: "Y"
>>> To my best knowledge this is a fully generated field => it does not
>>> depend on the underlying collection data and the export handler. This was
>>> done to exclude any issues with this one.
>>> it states simply if 1=1 return Y else N
>>>
>>>
>>> On Fri, Nov 8, 2019 at 3:20 PM Joel Bernstein 
>>> wrote:
>>>
>>>> Before moving to a jira let's take a look at the underlying Solr
>>>> queries in the log. The Streaming Expressions just creates solr queries, in
>>>> this case queries to the /export handler. So when something is not working
>>>> as expected we want to strip away the streaming expression and debug the
>>>> actual queries that are being run.
>>>>
>>>> You can find the Solr queries that appear in the log after running the
>>>> expressions and then try running them outside of the expression as plain
>>>> Solr queries.
>>>>
>>>> You can also post the Solr queries to this thread and we discuss what
>>>> the logs say.
>>>>
>>>> In these cases the logs always are the way to debug whats going on.
>>>>
>>>>
>>>> Joel Bernstein
>>>> http://joelsolr.blogspot.com/
>>>>
>>>>
>>>> On Wed, Nov 6, 2019 at 4:14 PM Jörn Franke 
>>>> wrote:
>>>>
>>>>> I created a JIRA for this:
>>>>> https://issues.apache.org/jira/browse/SOLR-13894
>>>>>
>>>>> On Wed, Nov 6, 2019 at 10:45 AM Jörn Franke 
>>>>> wrote:
>>>>>
>>>>>> I have checked now Solr 8.3 server in admin UI. Same issue.
>>>>>>
>>>>>> Reproduction:
>>>>>> select(search(testcollection,q=“test”,df=“Default”,defType=“edismax”,fl=“id”,
>>>>>> qt=“/export”, sort=“id asc”),id,if(eq(1,1),Y,N) as found)
>>>>>>
>>>>>> In 8.3 it returns only the id field.
>>>>>> In 8.2 it returns id,found field.
>>>

Re: Welcome Houston Putman as Lucene/Solr committer

2019-11-14 Thread Joel Bernstein
Welcome Houston!



On Thu, Nov 14, 2019 at 3:20 PM Yonik Seeley  wrote:

> Congrats Houston!
> -Yonik
>
> On Thu, Nov 14, 2019 at 3:58 AM Anshum Gupta 
> wrote:
>
>> Hi all,
>>
>> Please join me in welcoming Houston Putman as the latest Lucene/Solr
>> committer!
>>
>> Houston has been involved with the community since 2013, when he first
>> contributed the Analytics contrib module. Since then he has been involved
>> with the community, participated in conferences and spoken about his work
>> with Lucene/Solr. In the recent past, he has been involved with getting
>> Solr to scale on Kubernetes.
>>
>> Looking forward to your commits to the Apache Lucene/Solr project :)
>>
>> Congratulations and welcome, Houston! It's a tradition to introduce
>> yourself with a brief bio.
>>
>> --
>> Anshum Gupta
>>
>


Re: Solr 8.3 Solrj streaming expressions do not return all field values

2019-11-08 Thread Joel Bernstein
Before moving to a jira, let's take a look at the underlying Solr queries
in the log. Streaming Expressions just create Solr queries, in this case
queries to the /export handler. So when something is not working as
expected we want to strip away the streaming expression and debug the
actual queries that are being run.

You can find the Solr queries that appear in the log after running the
expressions and then try running them outside of the expression as plain
Solr queries.
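
To make that concrete, for the expression quoted below the log should show
an /export request along these lines, which can be replayed directly; a
hedged sketch (plain JDK, collection and field names taken from the
expression, the exact handler path and wt may differ slightly by version):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ExportReplay {
  public static void main(String[] args) throws Exception {
    String base = "http://localhost:8983/solr/testcollection/export";
    String query = "?q=" + URLEncoder.encode("test", StandardCharsets.UTF_8.name())
        + "&df=Default&defType=edismax&fl=id"
        + "&sort=" + URLEncoder.encode("id asc", StandardCharsets.UTF_8.name())
        + "&wt=json";
    try (BufferedReader in = new BufferedReader(new InputStreamReader(
        new URL(base + query).openStream(), StandardCharsets.UTF_8))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}

If the raw /export response already looks wrong, the problem is below the
expression layer; if it looks right, the select()/if() handling is the
place to dig.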

You can also post the Solr queries to this thread and we can discuss what
the logs say.

In these cases the logs are always the way to debug what's going on.


Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Nov 6, 2019 at 4:14 PM Jörn Franke  wrote:

> I created a JIRA for this:
> https://issues.apache.org/jira/browse/SOLR-13894
>
> On Wed, Nov 6, 2019 at 10:45 AM Jörn Franke  wrote:
>
>> I have checked now Solr 8.3 server in admin UI. Same issue.
>>
>> Reproduction:
>> select(search(testcollection,q=“test”,df=“Default”,defType=“edismax”,fl=“id”,
>> qt=“/export”, sort=“id asc”),id,if(eq(1,1),Y,N) as found)
>>
>> In 8.3 it returns only the id field.
>> In 8.2 it returns id,found field.
>>
>> Since found is generated by select (and not coming from the collection)
>> there must be an issue with select.
>>
>> Any idea why this is happening.
>>
>> Debug logs do not show any error and the expression is correctly received
>> by Solr.
>>
>> Thank you.
>>
>> Best regards
>>
>> > Am 05.11.2019 um 14:59 schrieb Jörn Franke :
>> >
>> > Thanks I will check and come back to you. As far as I remember (but
>> have to check) the queries generated by Solr were correct
>> >
>> > Just to be clear the same thing works with Solr 8.2 server and Solr 8.2
>> client.
>> >
>> > It show the odd behaviour with Solr 8.2 server and Solr 8.3 client.
>> >
>> >> Am 05.11.2019 um 14:49 schrieb Joel Bernstein :
>> >>
>> >> I'll probably need some more details. One thing that's useful is to
>> look at
>> >> the logs and see the underlying Solr queries that are generated. Then
>> try
>> >> those underlying queries against the Solr index and see what comes
>> back. If
>> >> you're not seeing the fields with the plain Solr queries then we know
>> it's
>> >> something going on below streaming expressions. If you are seeing the
>> >> fields then it's the expressions themselves that are not handling the
>> data
>> >> as expected.
>> >>
>> >>
>> >> Joel Bernstein
>> >> http://joelsolr.blogspot.com/
>> >>
>> >>
>> >>>> On Mon, Nov 4, 2019 at 9:09 AM Jörn Franke 
>> wrote:
>> >>>
>> >>> Most likely this issue can bei also reproduced in the admin UI for the
>> >>> streaming handler of a collection.
>> >>>
>> >>>>> Am 04.11.2019 um 13:32 schrieb Jörn Franke :
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> I use streaming expressions, e.g.
>> >>>> Sort(Select(search(...),id,if(eq(1,1),Y,N) as found), by=“field A
>> asc”)
>> >>>> (Using export handler, sort is not really mandatory , I will remove
>> it
>> >>> later anyway)
>> >>>>
>> >>>> This works perfectly fine if I use Solr 8.2.0 (server + client). It
>> >>> returns Tuples in the form { “id”,”12345”, “found”:”Y”}
>> >>>>
>> >>>> However, if I use Solr 8.2.0 as server and Solr 8.3.0 as client then
>> the
>> >>> above statement only returns the id field, but not the found field.
>> >>>>
>> >>>> Questions:
>> >>>> 1) is this expected behavior, ie Solr client 8.3.0 is in this case
>> not
>> >>> compatible with Solr 8.2.0 and server upgrade to Solr 8.3.0 will fix
>> this?
>> >>>> 2) has the syntax for the above expression changed? If so how?
>> >>>> 3) is this not expected behavior and I should create a Jira for it?
>> >>>>
>> >>>> Thank you.
>> >>>> Best regards
>> >>>
>>
>


Re: Capturing URL params for use within Streaming Expressions

2019-10-18 Thread Joel Bernstein
I think it's fine to pass through parameters to the various stream sources.
Perhaps we should limit it to a set list of parameters to pass through,
just to limit the scope.
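
To make the "set list" idea concrete, a hedged sketch (illustrative only,
not Solr's actual code, and the names are made up): the /stream handler
could copy only whitelisted parameters from the outer request before the
stream sources see them, for example:

import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class ParamWhitelist {
  // Hypothetical whitelist; the real list would be whatever we agree on.
  private static final Set<String> ALLOWED =
      new HashSet<>(Arrays.asList("shards.preference", "shards.tolerant"));

  static Map<String, String> filter(Map<String, String> requestParams) {
    return requestParams.entrySet().stream()
        .filter(e -> ALLOWED.contains(e.getKey()))
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
  }
}

That keeps the scope small and makes it obvious to a proxy which parameters
can influence the expression.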


Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Oct 16, 2019 at 4:47 PM Houston Putman 
wrote:

> Streaming expressions allow for users to pass in any arbitrary URL params
> in the search streaming source. I'm looking to add the ability for certain
> streaming functions, maybe just "search()" but possibly more, to extract
> the extra URL params passed along in the streaming request.
>
> For example sending a request:
> http://localhost:8983/solr/example/stream?expr=search(collection1,
> q="*:*", fl="id", sort="id")&
> *shards.preference=shards.preference=replica.location:local*
>
> would be equivalent to:
> http://localhost:8983/solr/example/stream?expr=search(collection1,
> q="*:*", fl="id", sort="id",
> *shards.preference="shards.preference=replica.location:local"*)
>
> The beauty of URL params is that they can easily be overridden and
> checked, for example in a proxy between the user and solr. It is harder to
> do this with streaming expressions as the proxy would need to parse the
> expression and know the logic of the functions and sources.
>
> I'm open to discussion on whether the params able to be captured by the
> streaming function would need to be white-listed or black-listed. My idea
> is that this would be generically implemented through something like the
> StreamContext, so that any streaming function that wants to add this
> functionality is able to do so.
>
> Another option is to add a URL parameter such as
> *=replica.location:local* (
> *expr.override..=*). That way it's explicit
> that the user is trying to send options to the streaming expression, and
> extraneous URL params aren't accidentally captured when they were included
> for a different purpose.
>
> Anyways this would really help us for some uses cases, especially the
> replica routing options used in the example above. Really interested to see
> opinions on either of these options.
>
> - Houston Putman
>


Re: The Visual Guide to Streaming Expressions and Math Expressions

2019-10-16 Thread Joel Bernstein
Hi Pratik,

The visualizations are all done using Apache Zeppelin and the Zeppelin-Solr
interpreter. The getting started part of the user guide provides links for
Zeppelin-Solr. The install process is pretty quick. This is all open
source, freely available software. It's possible that Zeppelin-Solr can be
incorporated into the Solr code eventually, but the test frameworks are
quite different. I think some simple scripts can be included with Solr
to automate the downloads for Zeppelin and Zeppelin-Solr.

Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Oct 16, 2019 at 11:27 AM Pratik Patel  wrote:

> Hi Joel,
>
> Looks like this is going to be very helpful, thank you! I am wondering
> whether the visualizations are generated through third party library or is
> it something which would be part of solr distribution?
>
> https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/visualization.adoc#visualization
>
>
> Thanks,
> Pratik
>
>
> On Wed, Oct 16, 2019 at 10:54 AM Joel Bernstein 
> wrote:
>
> > Hi,
> >
> > The Visual Guide to Streaming Expressions and Math Expressions is now
> > complete. It's been published to Github at the following location:
> >
> >
> >
> https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/math-expressions.adoc#streaming-expressions-and-math-expressions
> >
> > The guide will eventually be part of Solr's release when the RefGuide is
> > ready to accommodate it. In the meantime its been designed to be easily
> > read directly from Github.
> >
> > The guide contains close to 200 visualizations and examples showing how
> to
> > use Streaming Expressions and Math Expressions for data analysis and
> > visualization. The visual guide is also designed to guide users that are
> > not experts in math in how to apply the functions to analysis and
> visualize
> > data.
> >
> > The new visual data loading feature in Solr 8.3 is also covered in the
> > guide. This feature should cut down on the time it takes to load CSV
> files
> > so that more time can be spent on analysis and visualization.
> >
> >
> >
> https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/loading.adoc#loading-data
> >
> > Joel Bernstein
> >
>


The Visual Guide to Streaming Expressions and Math Expressions

2019-10-16 Thread Joel Bernstein
Hi,

The Visual Guide to Streaming Expressions and Math Expressions is now
complete. It's been published to Github at the following location:

https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/math-expressions.adoc#streaming-expressions-and-math-expressions

The guide will eventually be part of Solr's release when the RefGuide is
ready to accommodate it. In the meantime it's been designed to be easily
read directly from Github.

The guide contains close to 200 visualizations and examples showing how to
use Streaming Expressions and Math Expressions for data analysis and
visualization. The visual guide is also designed to show users who are not
experts in math how to apply the functions to analyze and visualize data.

The new visual data loading feature in Solr 8.3 is also covered in the
guide. This feature should cut down on the time it takes to load CSV files
so that more time can be spent on analysis and visualization.

https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/loading.adoc#loading-data

Joel Bernstein


Re: 8.3 release

2019-10-10 Thread Joel Bernstein
I plan on merging this bug fix as well:
https://issues.apache.org/jira/browse/SOLR-13829

Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Oct 10, 2019 at 6:21 PM Gus Heck  wrote:

> I'll be merging SOLR-13760 tonight as well
>
>
> On Thu, Oct 10, 2019 at 5:57 PM Andrzej Białecki  wrote:
>
>> Hi Ishan,
>>
>> Me too, me too :) I’d like to merge a bug fix for SOLR-13828.
>>
>> On 9 Oct 2019, at 15:38, Ishan Chattopadhyaya 
>> wrote:
>>
>> +1, Dat! Please go ahead.
>>
>> On Wed, 9 Oct, 2019, 2:33 PM Andrzej Białecki,  wrote:
>>
>>> It’s marked as Minor in Jira, but after reading the description it
>>> sounds scary - IMHO it should be at least well investigated before 8.3 in
>>> order to determine whether it causes real damage (apart from looking scary
>>> and filling the logs).
>>>
>>> +1 from me.
>>>
>>> On 9 Oct 2019, at 10:48, Đạt Cao Mạnh  wrote:
>>>
>>> Hi Ishan, guys
>>>
>>> I want to include the fix for
>>> https://issues.apache.org/jira/browse/SOLR-13293 in this release.
>>> Hoping that is ok!
>>>
>>> Thanks!
>>>
>>> On Tue, Oct 8, 2019 at 4:02 PM Uwe Schindler  wrote:
>>>
>>>> ASF Jenkins Jobs also reconfigured.
>>>>
>>>> I left the 8.2 Refguide job in the queue (I cloned it), not sure if
>>>> that one is still needed. All other jobs are 8.3 now.
>>>>
>>>> Uwe
>>>>
>>>> -
>>>> Uwe Schindler
>>>> Achterdiek 19, D-28357 Bremen
>>>> https://www.thetaphi.de
>>>> eMail: u...@thetaphi.de
>>>>
>>>> > -Original Message-
>>>> > From: Uwe Schindler 
>>>> > Sent: Tuesday, October 8, 2019 4:49 PM
>>>> > To: dev@lucene.apache.org
>>>> > Subject: RE: 8.3 release
>>>> >
>>>> > Policeman Jenkins builds and tests 8.3 for Windows and Linux.
>>>> >
>>>> > Uwe
>>>> >
>>>> > -
>>>> > Uwe Schindler
>>>> > Achterdiek 19, D-28357 Bremen
>>>> > https://www.thetaphi.de
>>>> > eMail: u...@thetaphi.de
>>>> >
>>>> > > -Original Message-
>>>> > > From: Ishan Chattopadhyaya 
>>>> > > Sent: Tuesday, October 8, 2019 4:32 PM
>>>> > > To: Lucene Dev ; Uwe Schindler
>>>> > > ; Steve Rowe 
>>>> > > Subject: Re: 8.3 release
>>>> > >
>>>> > > I've cut the 8.3 branch. Please feel free to push in any bug fix.
>>>> For
>>>> > > any feature, please let me know to see how we can accommodate it
>>>> > > safely.
>>>> > > I'm planning to get myself familiarized with what I need to do for
>>>> the
>>>> > > ref guide release (simultaneously with the binary release). So, most
>>>> > > likely, I'll be able to build artifacts in another week's time.
>>>> > > @Uwe Schindler / @Steve Rowe  would it be possible to please create
>>>> a
>>>> > > Jenkins branch for 8.3?
>>>> > > Thanks,
>>>> > > Ishan
>>>> > >
>>>> > > On Mon, Oct 7, 2019 at 4:17 PM Ishan Chattopadhyaya
>>>> > >  wrote:
>>>> > > >
>>>> > > > I'll cut the branch in 12-24 hours from now. If someone has
>>>> anything
>>>> > > > to put into branch_8x now, please feel free.
>>>> > > > If someone has a non-bugfix issue that they want to push in to 8.3
>>>> > > > after the branch cut, but you're sure it will not disrupt the
>>>> > > > stability of the release, please let me know and we can discuss
>>>> on a
>>>> > > > case-by-case basis.
>>>> > > >
>>>> > > > On Wed, Oct 2, 2019 at 8:50 PM Mikhail Khludnev 
>>>> > > wrote:
>>>> > > > >
>>>> > > > > Excuse me. I have to recall this message regarding SOLR-13764.
>>>> > > > >
>>>> > > > > On Mon, Sep 30, 2019 at 10:56 PM Mikhail Khludnev
>>>> > >  wrote:
>>>> > > > >>
>>>> > > > >> Ishan, thanks for update.
>>>> > > > >> May I propose to hold it for this week, beside of the s

Re: Welcome Atri Sharma as Lucene/Solr committer

2019-09-18 Thread Joel Bernstein
Welcome Atri!


Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Sep 18, 2019 at 6:32 AM Jason Gerlowski 
wrote:

> Congratulations and welcome Atri!
>
> On Wed, Sep 18, 2019 at 6:24 AM Shalin Shekhar Mangar
>  wrote:
> >
> > Congratulations and welcome Atri!
> >
> > On Wed, Sep 18, 2019 at 3:12 AM Adrien Grand  wrote:
> >>
> >> Hi all,
> >>
> >> Please join me in welcoming Atri Sharma as Lucene/ Solr committer!
> >>
> >> If you are following activity on Lucene, this name will likely sound
> >> familiar to you: Atri has been very busy trying to improve Lucene over
> >> the past months. In particular, Atri recently started improving our
> >> top-hits optimizations like early termination on sorted indexes and
> >> WAND, when indexes are searched using multiple threads.
> >>
> >> Congratulations and welcome! It is a tradition to introduce yourself
> >> with a brief bio.
> >>
> >> --
> >> Adrien
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: dev-h...@lucene.apache.org
> >>
> >
> >
> > --
> > Regards,
> > Shalin Shekhar Mangar.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: 8.3 release

2019-09-18 Thread Joel Bernstein
That timeframe sounds good. A major focus for me will be completing and
merging SOLR-13105 <https://issues.apache.org/jira/browse/SOLR-13105> which
is part of the ref guide.


Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Sep 17, 2019 at 9:19 PM Anshum Gupta  wrote:

> I think it makes sense to just bundle the code and ref guide release into
> one. Right now, it's been Cassandra and Hoss who have taken care of the ref
> guide, but that shouldn't be the case.
>
> It might mean that the ref guide is a little behind when the code
> releases, but then we should just be better at committing the documentation
> when we commit the code. Hopefully, we'll get better at that and the ref
> guide releases would contain all new features but overall, having a
> different release/voting process for the ref guide is a lot of overhead
> that isn't really needed.
>
>
> On Tue, Sep 17, 2019 at 4:31 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> Hi Adrien,
>> Indeed, meant to write about starting a vote.
>>
>> @Gus, I'll have to let Cassandra weigh in on this one as I'm not very
>> familiar with the ref guide release process.
>>
>> Regards,
>> Ishan
>>
>> On Mon, 16 Sep, 2019, 7:28 PM Adrien Grand,  wrote:
>>
>>> +1 to start working on 8.3
>>>
>>> Did you mean "start a vote" when you wrote "release the artifacts"? It
>>> got me wondering because I don't think we frequently managed to go
>>> from cutting a branch to releasing artifacts in so little time in the
>>> past.
>>>
>>> On Mon, Sep 16, 2019 at 5:52 PM Ishan Chattopadhyaya
>>>  wrote:
>>> >
>>> > Hi all,
>>> > We have a lot of unreleased features and fixes. I propose that we cut
>>> > a 8.3 branch in two weeks (in order to have sufficient time to bake in
>>> > all in-progress features). If there are no objections to doing so, I
>>> > can volunteer for the release as an RM and plan for cutting a release
>>> > branch on 30 September (and release the artifacts about 3-4 days after
>>> > that).
>>> >
>>> > WDYT?
>>> > Regards,
>>> > Ishan
>>> >
>>> > -
>>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>>> >
>>>
>>>
>>> --
>>> Adrien
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>
> --
> Anshum Gupta
>


[jira] [Comment Edited] (SOLR-13622) Add FileStream Streaming Expression

2019-08-04 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899653#comment-16899653
 ] 

Joel Bernstein edited comment on SOLR-13622 at 8/4/19 4:46 PM:
---

This function works great! As I've been testing it out I've found the "files" 
function name to be a bit confusing. I was thinking that a better name might be 
"cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design.  


was (Author: joel.bernstein):
This function works great! As I've been testing it out I've found the "files" 
function name to be a bit confusing (I had originally requested the name 
"files"). I was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design.  

> Add FileStream Streaming Expression
> ---
>
> Key: SOLR-13622
> URL: https://issues.apache.org/jira/browse/SOLR-13622
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>Reporter: Joel Bernstein
>Assignee: Jason Gerlowski
>Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-13622.patch, SOLR-13622.patch
>
>
> The FileStream will read files from a local filesystem and Stream back each 
> line of the file as a tuple.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
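
A note on the syntax under discussion: as a rough, hedged sketch (not confirmed
output of the attached patch), the function would slot into an expression
pipeline like any other stream source. The file name and the line/file tuple
field names below are assumptions for illustration only.

{code:java}
select(cat("evdata/example.csv"),
       line as raw_line,
       file as source_file)
{code}

Under those assumptions, cat (or files) emits one tuple per line of the file
and select simply renames the fields, which makes it easy to eyeball a file
before wiring it into a larger loading expression.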



[jira] [Comment Edited] (SOLR-13622) Add FileStream Streaming Expression

2019-08-04 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899653#comment-16899653
 ] 

Joel Bernstein edited comment on SOLR-13622 at 8/4/19 4:30 PM:
---

This function works great! As I've been testing it out I've found the "files" 
function name to be a bit confusing (I had originally requested the name 
"files"). I was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design.  


was (Author: joel.bernstein):
This function work great! As I've been testing it out I've found the "files" 
function name to be a bit confusing (I had originally requested the name 
"files"). I was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design.  

> Add FileStream Streaming Expression
> ---
>
> Key: SOLR-13622
> URL: https://issues.apache.org/jira/browse/SOLR-13622
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>Reporter: Joel Bernstein
>Assignee: Jason Gerlowski
>Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-13622.patch, SOLR-13622.patch
>
>
> The FileStream will read files from a local filesystem and Stream back each 
> line of the file as a tuple.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13622) Add FileStream Streaming Expression

2019-08-04 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899653#comment-16899653
 ] 

Joel Bernstein edited comment on SOLR-13622 at 8/4/19 4:29 PM:
---

This function work great! As I've been testing it out I've found the "files" 
function name to be a bit confusing (I had originally requested the name 
"files"). I was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design.  


was (Author: joel.bernstein):
This function work great! As I've been testing it out I've found the "files" 
function name to be a bit confusing (I had originally requested that name). I 
was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design.  

> Add FileStream Streaming Expression
> ---
>
> Key: SOLR-13622
> URL: https://issues.apache.org/jira/browse/SOLR-13622
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>Reporter: Joel Bernstein
>Assignee: Jason Gerlowski
>Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-13622.patch, SOLR-13622.patch
>
>
> The FileStream will read files from a local filesystem and Stream back each 
> line of the file as a tuple.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13622) Add FileStream Streaming Expression

2019-08-04 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899653#comment-16899653
 ] 

Joel Bernstein edited comment on SOLR-13622 at 8/4/19 4:29 PM:
---

This functions work great! As I've been testing it out I've found the "files" 
function name to be a bit confusing (I had originally requested that name). I 
was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design.  


was (Author: joel.bernstein):
The more that I test out this feature the less I like the function name 
"files". I was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design.  

> Add FileStream Streaming Expression
> ---
>
> Key: SOLR-13622
> URL: https://issues.apache.org/jira/browse/SOLR-13622
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>Reporter: Joel Bernstein
>Assignee: Jason Gerlowski
>Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-13622.patch, SOLR-13622.patch
>
>
> The FileStream will read files from a local filesystem and Stream back each 
> line of the file as a tuple.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13622) Add FileStream Streaming Expression

2019-08-04 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899653#comment-16899653
 ] 

Joel Bernstein edited comment on SOLR-13622 at 8/4/19 4:29 PM:
---

This function work great! As I've been testing it out I've found the "files" 
function name to be a bit confusing (I had originally requested that name). I 
was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design.  


was (Author: joel.bernstein):
This functions work great! As I've been testing it out I've found the "files" 
function name to be a bit confusing (I had originally requested that name). I 
was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design.  

> Add FileStream Streaming Expression
> ---
>
> Key: SOLR-13622
> URL: https://issues.apache.org/jira/browse/SOLR-13622
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>Reporter: Joel Bernstein
>Assignee: Jason Gerlowski
>Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-13622.patch, SOLR-13622.patch
>
>
> The FileStream will read files from a local filesystem and Stream back each 
> line of the file as a tuple.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13622) Add FileStream Streaming Expression

2019-08-04 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899653#comment-16899653
 ] 

Joel Bernstein edited comment on SOLR-13622 at 8/4/19 4:17 PM:
---

The more that I test out this feature the less I like the function name 
"files". I was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design.  


was (Author: joel.bernstein):
The more that I test out this feature the less I like the function name 
"files". I was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design. 

 

 

> Add FileStream Streaming Expression
> ---
>
> Key: SOLR-13622
> URL: https://issues.apache.org/jira/browse/SOLR-13622
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>Reporter: Joel Bernstein
>Assignee: Jason Gerlowski
>Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-13622.patch, SOLR-13622.patch
>
>
> The FileStream will read files from a local filesystem and Stream back each 
> line of the file as a tuple.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-13622) Add FileStream Streaming Expression

2019-08-04 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899653#comment-16899653
 ] 

Joel Bernstein edited comment on SOLR-13622 at 8/4/19 4:16 PM:
---

The more that I test out this feature the less I like the function name 
"files". I was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}
We can keep the underlying FileStream class and just map a different function 
name.

The reason I like "cat" is that it behaves very much like the unix cat command 
and the Streaming Expression design is very similar to the unix pipes design. 

 

 


was (Author: joel.bernstein):
The more that I test out this feature the less I like the function name 
"files". I was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}

> Add FileStream Streaming Expression
> ---
>
> Key: SOLR-13622
> URL: https://issues.apache.org/jira/browse/SOLR-13622
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>Reporter: Joel Bernstein
>Assignee: Jason Gerlowski
>Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-13622.patch, SOLR-13622.patch
>
>
> The FileStream will read files from a local filesystem and Stream back each 
> line of the file as a tuple.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13622) Add FileStream Streaming Expression

2019-08-04 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899653#comment-16899653
 ] 

Joel Bernstein commented on SOLR-13622:
---

The more that I test out this feature the less I like the function name 
"files". I was thinking that a better name might be "cat".

The sample syntax would be:
{code:java}
cat("file.csv"){code}

> Add FileStream Streaming Expression
> ---
>
> Key: SOLR-13622
> URL: https://issues.apache.org/jira/browse/SOLR-13622
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>Reporter: Joel Bernstein
>Assignee: Jason Gerlowski
>Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-13622.patch, SOLR-13622.patch
>
>
> The FileStream will read files from a local filesystem and Stream back each 
> line of the file as a tuple.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13667) Add upper, lower, trim and split Stream Evaluators

2019-08-02 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13667:
--
Attachment: SOLR-13667.patch

> Add upper, lower, trim and split Stream Evaluators
> --
>
> Key: SOLR-13667
> URL: https://issues.apache.org/jira/browse/SOLR-13667
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>Priority: Major
> Attachments: SOLR-13667.patch, SOLR-13667.patch
>
>
> The upper and lower Stream Evaluators will convert strings to upper and lower 
> case. The trim Stream Evaluator will trim whitespace from strings and the 
> split Stream Evaluator will split a string by a delimiter regex.
> These functions will operate on both strings and lists of strings. These are 
> useful functions for cleaning data during the loading process.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
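
As a hedged sketch of how these evaluators might be strung together during
loading (the cat stream source, the line field name, and the exact split
signature are assumptions for illustration, not confirmed syntax from the
patch):

{code:java}
select(cat("users/names.csv"),
       trim(line) as clean_line,
       upper(line) as upper_line,
       lower(line) as lower_line,
       split(line, ",") as fields)
{code}

Because select applies evaluators to every tuple as it streams by, the same
functions can be used either for cleanup on the way into an update stream or
for quick inspection of a file before indexing anything.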



[jira] [Updated] (SOLR-13675) Allow zplot to visualize 2D cluster centroids

2019-08-02 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13675:
--
Summary: Allow zplot to visualize 2D cluster centroids  (was: All zplot to 
visualize 2D cluster centroids)

> Allow zplot to visualize 2D cluster centroids
> -
>
> Key: SOLR-13675
> URL: https://issues.apache.org/jira/browse/SOLR-13675
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>    Assignee: Joel Bernstein
>Priority: Major
>
> Currently zplot can visualize 2D clusters in Apache Zeppelin. This ticket 
> will allow zplot to plot 2D cluster centroids as well. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
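
For context, a rough sketch of the kind of expression this change targets:
sample 2D points, cluster them, and hand the result to zplot. The collection
and field names, and the exact kmeans/zplot parameters, are assumptions based
on the existing math expression functions rather than confirmed syntax.

{code:java}
let(a=random(collection1, q="*:*", fl="x_d,y_d", rows="500"),
    x=col(a, x_d),
    y=col(a, y_d),
    obs=transpose(matrix(x, y)),
    c=kmeans(obs, 5),
    zplot(clusters=c))
{code}

Today zplot can color the points by cluster; with this ticket it would also be
able to overlay each cluster's centroid on the same scatter plot.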



[jira] [Created] (SOLR-13675) All zplot to visualize 2D cluster centroids

2019-08-02 Thread Joel Bernstein (JIRA)
Joel Bernstein created SOLR-13675:
-

 Summary: All zplot to visualize 2D cluster centroids
 Key: SOLR-13675
 URL: https://issues.apache.org/jira/browse/SOLR-13675
 Project: Solr
  Issue Type: New Feature
  Security Level: Public (Default Security Level. Issues are Public)
  Components: streaming expressions
Reporter: Joel Bernstein


Currently zplot can visualize 2D clusters in Apache Zeppelin. This ticket will 
allow zplot to plot 2D cluster centroids as well. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-13675) All zplot to visualize 2D cluster centroids

2019-08-02 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein reassigned SOLR-13675:
-

Assignee: Joel Bernstein

> All zplot to visualize 2D cluster centroids
> ---
>
> Key: SOLR-13675
> URL: https://issues.apache.org/jira/browse/SOLR-13675
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>    Assignee: Joel Bernstein
>Priority: Major
>
> Currently zplot can visualize 2D clusters in Apache Zeppelin. This ticket 
> will allow zplot to plot 2D cluster centroids as well. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13667) Add upper, lower, trim and split Stream Evaluators

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13667:
--
Attachment: SOLR-13667.patch

> Add upper, lower, trim and split Stream Evaluators
> --
>
> Key: SOLR-13667
> URL: https://issues.apache.org/jira/browse/SOLR-13667
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>Priority: Major
> Attachments: SOLR-13667.patch
>
>
> The upper and lower Stream Evaluators will convert strings to upper and lower 
> case. The trim Stream Evaluator will trim whitespace from strings and the 
> split Stream Evaluator will split a string by a delimiter regex.
> These functions will operate on both strings and lists of strings. These are 
> useful functions for cleaning data during the loading process.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13667) Add upper, lower, trim and split Stream Evaluators

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13667:
--
Description: 
The upper and lower Stream Evaluators will convert strings to upper and lower 
case. The trim Stream Evaluator will trim whitespace from strings and the split 
Stream Evaluator will split a string by a delimiter regex.

These functions will operate on both strings and lists of strings. These are 
useful functions for cleaning data during the loading process.

  was:
The upper and lower Stream Evaluators will convert strings to upper and lower 
case. The trim Stream Evaluator will trim whitespace from strings and the split 
Stream Evaluator will split a string by a delimiter regex.

These functions will operate on both strings and lists of strings. These are 
useful functions for cleaning data for during the loading process.


> Add upper, lower, trim and split Stream Evaluators
> --
>
> Key: SOLR-13667
> URL: https://issues.apache.org/jira/browse/SOLR-13667
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>Priority: Major
>
> The upper and lower Stream Evaluators will convert strings to upper and lower 
> case. The trim Stream Evaluator will trim whitespace from strings and the 
> split Stream Evaluator will split a string by a delimiter regex.
> These functions will operate on both strings and lists of strings. These are 
> useful functions for cleaning data during the loading process.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13667) Add upper, lower, trim and split Stream Evaluators

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13667:
--
Description: 
The upper and lower Stream Evaluators will convert strings to upper and lower 
case. The trim Stream Evaluator will trim whitespace from strings and the split 
Stream Evaluator will split a string by a delimiter regex.

These functions will operate on both strings and lists of strings. These are 
useful functions for cleaning data for during the loading process.

  was:
The upper and lower Stream Evaluators will convert strings to upper and lower 
case. The trim Stream Evaluator will trim whitespace from strings and the split 
Stream Evaluator will split a string by a delimiter regex.

These functions will operate on both strings and lists of strings. These are 
useful functions for cleaning data for string fields.


> Add upper, lower, trim and split Stream Evaluators
> --
>
> Key: SOLR-13667
> URL: https://issues.apache.org/jira/browse/SOLR-13667
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>Priority: Major
>
> The upper and lower Stream Evaluators will convert strings to upper and lower 
> case. The trim Stream Evaluator will trim whitespace from strings and the 
> split Stream Evaluator will split a string by a delimiter regex.
> These functions will operate on both strings and lists of strings. These are 
> useful functions for cleaning data for during the loading process.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13667) Add upper, lower, trim and split Stream Evaluators

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13667:
--
Description: 
The upper and lower Stream Evaluators will convert strings to upper and lower 
case. The trim Stream Evaluator will trim whitespace from strings and the split 
Stream Evaluator will split a string by a delimiter regex.

These functions will operate on both strings and lists of strings. These are 
useful functions for cleaning data for string fields.

  was:The upper and lower Stream Evaluators will convert strings to upper and 
lower case. They will operate on both strings and lists of strings. These are 
useful functions for cleaning data for string fields.


> Add upper, lower, trim and split Stream Evaluators
> --
>
> Key: SOLR-13667
> URL: https://issues.apache.org/jira/browse/SOLR-13667
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>Priority: Major
>
> The upper and lower Stream Evaluators will convert strings to upper and lower 
> case. The trim Stream Evaluator will trim whitespace from strings and the 
> split Stream Evaluator will split a string by a delimiter regex.
> These functions will operate on both strings and lists of strings. These are 
> useful functions for cleaning data for string fields.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13667) Add upper, lower, trim and split Stream Evaluators

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13667:
--
Summary: Add upper, lower, trim and split Stream Evaluators  (was: Add 
upper and lower Stream Evaluators)

> Add upper, lower, trim and split Stream Evaluators
> --
>
> Key: SOLR-13667
> URL: https://issues.apache.org/jira/browse/SOLR-13667
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>Priority: Major
>
> The upper and lower Stream Evaluators will convert strings to upper and lower 
> case. They will operate on both strings and lists of strings. These are 
> useful functions for cleaning data for string fields.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13621) Visual log parsing and data loading with Streaming Expressions

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13621:
--
Summary: Visual log parsing and data loading with Streaming Expressions  
(was: Visual log parsing and data loading with for Streaming Expressions)

> Visual log parsing and data loading with Streaming Expressions
> --
>
> Key: SOLR-13621
> URL: https://issues.apache.org/jira/browse/SOLR-13621
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>    Assignee: Joel Bernstein
>Priority: Major
>
> Streaming Expressions and Math Expressions are now mature on the query side. 
> This includes the ability to query, transform, analyze and visualize data. 
> It's now time to build the data loading and log parsing capabilities to apply 
> the full suite of mathematics and visualizations over log data and CSV files.
> The design is to have stream sources that read from a file system and stream 
> decorators that parse different file and log formats. The data can be further 
> transformed and joined with other data by stream decorators and sent to any 
> Solr Cloud collection with the update Stream.
> This design also allows Streaming Expressions to perform regex filtering, 
> aggregations, statistical analysis and visualization directly over CSV files 
> and log files before the data is loaded to Solr. Because of Streaming 
> Expressions' built-in parallelization capabilities, this allows Solr Cloud to 
> behave like a massively parallel *grep* engine. 
> It also allows users to visualize data using Apache Zeppelin as part of the 
> loading process, to make it easier to understand the data before it's loaded 
> into an index.
> This ticket will track the sub-tickets for the different log formats that 
> will be supported.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
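
To make the design above concrete, a hedged sketch of an end-to-end loading
expression in the spirit of this framework: a stream source reads a file, a
decorator parses it, and the update stream sends the tuples to a collection.
The parseCSV decorator, file path, batch size, and collection name are
assumptions used for illustration rather than the final function set.

{code:java}
update(logs, batchSize=500,
       parseCSV(
         cat("logs/app_requests.csv")))
{code}

The same source and parser could equally be wrapped in rollup, having, or the
visualization functions, which is what allows a file to be filtered,
aggregated, and plotted before a decision is made to index it.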



[jira] [Updated] (SOLR-13621) Visual log parsing and data loading with for Streaming Expressions

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13621:
--
Summary: Visual log parsing and data loading with for Streaming Expressions 
 (was: Visual log parsing and data with for Streaming Expressions)

> Visual log parsing and data loading with for Streaming Expressions
> --
>
> Key: SOLR-13621
> URL: https://issues.apache.org/jira/browse/SOLR-13621
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>    Assignee: Joel Bernstein
>Priority: Major
>
> Streaming Expressions and Math Expressions are now mature on the query side. 
> This includes the ability to query, transform, analyze and visualize data. 
> It's now time to build the data loading and log parsing capabilities to apply 
> the full suite of mathematics and visualizations over log data and CSV files.
> The design is to have stream sources that read from a file system and stream 
> decorators that parse different file and log formats. The data can be further 
> transformed and joined with other data by stream decorators and sent to any 
> Solr Cloud collection with the update Stream.
> This design also allows Streaming Expressions to perform regex filtering, 
> aggregations, statistical analysis and visualization directly over CSV files 
> and log files before the data is loaded to Solr. Because of Streaming 
> Expressions' built-in parallelization capabilities, this allows Solr Cloud to 
> behave like a massively parallel *grep* engine. 
> It also allows users to visualize data using Apache Zeppelin as part of the 
> loading process, to make it easier to understand the data before it's loaded 
> into an index.
> This ticket will track the sub-tickets for the different log formats that 
> will be supported.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13621) Visual log parsing and data with for Streaming Expressions

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13621:
--
Summary: Visual log parsing and data with for Streaming Expressions  (was: 
Visual log parsing and data loading framework for Streaming Expressions)

> Visual log parsing and data with for Streaming Expressions
> --
>
> Key: SOLR-13621
> URL: https://issues.apache.org/jira/browse/SOLR-13621
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>    Assignee: Joel Bernstein
>Priority: Major
>
> Streaming Expressions and Math Expressions are now mature on the query side. 
> This includes the ability to query, transform, analyze and visualize data. 
> It's now time to build the data loading and log parsing capabilities to apply 
> the full suite of mathematics and visualizations over log data and CSV files.
> The design is to have stream sources that read from a file system and stream 
> decorators that parse different file and log formats. The data can be further 
> transformed and joined with other data by stream decorators and sent to any 
> Solr Cloud collection with the update Stream.
> This design also allows Streaming Expressions to perform regex filtering, 
> aggregations, statistical analysis and visualization directly over CSV files 
> and log files before the data is loaded to Solr. Because of Streaming 
> Expressions' built-in parallelization capabilities, this allows Solr Cloud to 
> behave like a massively parallel *grep* engine. 
> It also allows users to visualize data using Apache Zeppelin as part of the 
> loading process, to make it easier to understand the data before it's loaded 
> into an index.
> This ticket will track the sub-tickets for the different log formats that 
> will be supported.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13621) Visual log parsing and data loading framework for Streaming Expressions

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13621:
--
Description: 
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, statistical analysis and visualization directly over CSV files 
and log files before the data is loaded to Solr. Because of Streaming 
Expressions' built-in parallelization capabilities, this allows Solr Cloud to behave 
like a massively parallel *grep* engine. 

It also allows users to visualize data using Apache Zeppelin as part of the loading 
process, to make it easier to understand the data before it's loaded into an 
index.

This ticket will track the sub-tickets for the different log formats that will 
be supported.

  was:
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, statistical analysis and visualization directly over CSV files 
and log files before the data is loaded to Solr. Because of Streaming 
Expressions' built-in parallelization capabilities, this allows Solr Cloud to behave 
like a massively parallel *grep* engine. 

It also allows users to visualize data using Apache Zeppelin as part of loading 
process, to make it easier understand the data before it's loaded into an index.

This ticket will track the sub-tickets for the different log formats that will 
be supported.


> Visual log parsing and data loading framework for Streaming Expressions
> ---
>
> Key: SOLR-13621
> URL: https://issues.apache.org/jira/browse/SOLR-13621
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>    Assignee: Joel Bernstein
>Priority: Major
>
> Streaming Expressions and Math Expressions are now mature on the query side. 
> This includes the ability to query, transform, analyze and visualize data. 
> It's now time to build the data loading and log parsing capabilities to apply 
> the full suite of mathematics and visualizations over log data and CSV files.
> The design is to have stream sources that read from a file system and stream 
> decorators that parse different file and log formats. The data can be further 
> transformed and joined with other data by stream decorators and sent to any 
> Solr Cloud collection with the update Stream.
> This design also allows Streaming Expressions to perform regex filtering, 
> aggregations, statistical analysis and visualization directly over CSV files 
> and log files before the data is loaded to Solr. Because of Streaming 
> Expressions' built-in parallelization capabilities, this allows Solr Cloud to 
> behave like a massively parallel *grep* engine. 
> It also allows users to visualize data using Apache Zeppelin as part of the 
> loading process, to make it easier to understand the data before it's loaded 
> into an index.
> This ticket will track the sub-tickets for the different log formats that 
> will be supported.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13621) Visual log parsing and data loading framework for Streaming Expressions

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13621:
--
Description: 
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, statistical analysis and visualization directly over CSV files 
and log files before the data is loaded to Solr. Because of Streaming 
Expressions' built-in parallelization capabilities, this allows Solr Cloud to behave 
like a massively parallel *grep* engine. 

It also allows users to visualize data using Apache Zeppelin as part of loading 
process, to make it easier understand the data before it's loaded into an index.

This ticket will track the sub-tickets for the different log formats that will 
be supported.

  was:
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, statistical analysis and visualizations directly over CSV files 
and log files before the data is loaded to Solr. Because of Streaming 
Expressions' built-in parallelization capabilities, this allows Solr Cloud to behave 
like a massively parallel *grep* engine. 

It also allows users to visualize data using Apache Zeppelin as part of loading 
process, to make it easier understand the data before it's loaded into an index.

This ticket will track the sub-tickets for the different log formats that will 
be supported.


> Visual log parsing and data loading framework for Streaming Expressions
> ---
>
> Key: SOLR-13621
> URL: https://issues.apache.org/jira/browse/SOLR-13621
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>    Assignee: Joel Bernstein
>Priority: Major
>
> Streaming Expressions and Math Expressions are now mature on the query side. 
> This includes the ability to query, transform, analyze and visualize data. 
> It's now time to build the data loading and log parsing capabilities to apply 
> the full suite of mathematics and visualizations over log data and CSV files.
> The design is to have stream sources that read from a file system and stream 
> decorators that parse different file and log formats. The data can be further 
> transformed and joined with other data by stream decorators and sent to any 
> Solr Cloud collection with the update Stream.
> This design also allows Streaming Expressions to perform regex filtering, 
> aggregations, statistical analysis and visualization directly over CSV files 
> and log files before the data is loaded to Solr. Because of Streaming 
> Expressions' built-in parallelization capabilities, this allows Solr Cloud to 
> behave like a massively parallel *grep* engine. 
> It also allows users to visualize data using Apache Zeppelin as part of 
> loading process, to make it easier understand the data before it's loaded 
> into an index.
> This ticket will track the sub-tickets for the different log formats that 
> will be supported.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13621) Visual log parsing and data loading framework for Streaming Expressions

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13621:
--
Description: 
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, statistical analysis and visualizations directly over CSV files 
and log files before the data is loaded to Solr. Because of Streaming 
Expressions' built-in parallelization capabilities, this allows Solr Cloud to behave 
like a massively parallel *grep* engine. 

It also allows users to visualize data using Apache Zeppelin as part of loading 
process, to make it easier understand the data before it's loaded into an index.

This ticket will track the sub-tickets for the different log formats that will 
be supported.

  was:
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, analysis and visualizations directly over CSV files and log files 
before the data is loaded to Solr. Because of Streaming Expressions' built-in 
parallelization capabilities, this allows Solr Cloud to behave like a massively 
parallel *grep* engine. 

It also allows users to visualize data using Apache Zeppelin as part of loading 
process, to make it easier understand the data before it's loaded into an index.

This ticket will track the sub-tickets for the different log formats that will 
be supported.


> Visual log parsing and data loading framework for Streaming Expressions
> ---
>
> Key: SOLR-13621
> URL: https://issues.apache.org/jira/browse/SOLR-13621
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>    Assignee: Joel Bernstein
>Priority: Major
>
> Streaming Expressions and Math Expressions are now mature on the query side. 
> This includes the ability to query, transform, analyze and visualize data. 
> It's now time to build the data loading and log parsing capabilities to apply 
> the full suite of mathematics and visualizations over log data and CSV files.
> The design is to have stream sources that read from a file system and stream 
> decorators that parse different file and log formats. The data can be further 
> transformed and joined with other data by stream decorators and sent to any 
> Solr Cloud collection with the update Stream.
> This design also allows Streaming Expressions to perform regex filtering, 
> aggregations, statistical analysis and visualizations directly over CSV files 
> and log files before the data is loaded to Solr. Because of Streaming 
> Expressions' built-in parallelization capabilities, this allows Solr Cloud to 
> behave like a massively parallel *grep* engine. 
> It also allows users to visualize data using Apache Zeppelin as part of 
> loading process, to make it easier understand the data before it's loaded 
> into an index.
> This ticket will track the sub-tickets for the different log formats that 
> will be supported.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13621) Visual log parsing and data loading framework for Streaming Expressions

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13621:
--
Description: 
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, analysis and visualizations directly over CSV files and log files 
before the data is loaded to Solr. Because of Streaming Expressions' built-in 
parallelization capabilities, this allows Solr Cloud to behave like a massively 
parallel *grep* engine. 

It also allows users to visualize data using Apache Zeppelin as part of loading 
process, to make it easier understand the data before it's loaded into an index.

This ticket will track the sub-tickets for the different log formats that will 
be supported.

  was:
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, analysis and visualizations over CSV files and log files. Because 
of Streaming Expressions' built-in parallelization capabilities, this allows Solr 
Cloud to behave like a massively parallel *grep* engine. 

It also allows users to visualize data using Apache Zeppelin as part of loading 
process, to make it easier understand the data before it's loaded into an index.

This ticket will track the sub-tickets for the different log formats that will 
be supported.


> Visual log parsing and data loading framework for Streaming Expressions
> ---
>
> Key: SOLR-13621
> URL: https://issues.apache.org/jira/browse/SOLR-13621
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>    Assignee: Joel Bernstein
>Priority: Major
>
> Streaming Expressions and Math Expressions are now mature on the query side. 
> This includes the ability to query, transform, analyze and visualize data. 
> It's now time to build the data loading and log parsing capabilities to apply 
> the full suite of mathematics and visualizations over log data and CSV files.
> The design is to have stream sources that read from a file system and stream 
> decorators that parse different file and log formats. The data can be further 
> transformed and joined with other data by stream decorators and sent to any 
> Solr Cloud collection with the update Stream.
> This design also allows Streaming Expressions to perform regex filtering, 
> aggregations, analysis and visualizations directly over CSV files and log 
> files before the data is loaded to Solr. Because of Streaming Expressions' 
> built-in parallelization capabilities, this allows Solr Cloud to behave like a 
> massively parallel *grep* engine. 
> It also allows users to visualize data using Apache Zeppelin as part of 
> loading process, to make it easier understand the data before it's loaded 
> into an index.
> This ticket will track the sub-tickets for the different log formats that 
> will be supported.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13621) Visual log parsing and data loading framework for Streaming Expressions

2019-08-01 Thread Joel Bernstein (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-13621:
--
Description: 
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, analysis and visualizations over CSV files and log files. Because 
of Streaming Expressions' built-in parallelization capabilities, this allows Solr 
Cloud to behave like a massively parallel *grep* engine. 

It also allows users to visualize data using Apache Zeppelin as part of loading 
process, to make it easier understand the data before it's loaded into an index.

This ticket will track the sub-tickets for the different log formats that will 
be supported.

  was:
Streaming Expressions and Math Expressions are now mature on the query side. 
This includes the ability to query, transform, analyze and visualize data. 

It's now time to build the data loading and log parsing capabilities to apply 
the full suite of mathematics and visualizations over log data and CSV files.

The design is to have stream sources that read from a file system and stream 
decorators that parse different file and log formats. The data can be further 
transformed and joined with other data by stream decorators and sent to any 
Solr Cloud collection with the update Stream.

This design also allows Streaming Expressions to perform regex filtering, 
aggregations, analysis and visualizations over CSV files and log files. Because 
of Streaming Expressions' built-in parallelization capabilities, this allows Solr 
Cloud to behave like a massively parallel *grep* engine. 

This ticket will track the sub-tickets for the different log formats that will 
be supported.


> Visual log parsing and data loading framework for Streaming Expressions
> ---
>
> Key: SOLR-13621
> URL: https://issues.apache.org/jira/browse/SOLR-13621
> Project: Solr
>  Issue Type: New Feature
>  Components: streaming expressions
>    Reporter: Joel Bernstein
>    Assignee: Joel Bernstein
>Priority: Major
>
> Streaming Expressions and Math Expressions are now mature on the query side. 
> This includes the ability to query, transform, analyze and visualize data. 
> It's now time to build the data loading and log parsing capabilities to apply 
> the full suite of mathematics and visualizations over log data and CSV files.
> The design is to have stream sources that read from a file system and stream 
> decorators that parse different file and log formats. The data can be further 
> transformed and joined with other data by stream decorators and sent to any 
> Solr Cloud collection with the update Stream.
> This design also allows Streaming Expressions to perform regex filtering, 
> aggregations, analysis and visualizations over CSV files and log files. 
> Because of Streaming Expressions' built-in parallelization capabilities, this 
> allows Solr Cloud to behave like a massively parallel *grep* engine. 
> It also allows users to visualize data using Apache Zeppelin as part of 
> loading process, to make it easier understand the data before it's loaded 
> into an index.
> This ticket will track the sub-tickets for the different log formats that 
> will be supported.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


