Re: [VOTE] Lucene logo contest

2020-06-15 Thread Dennis Gove
+1 for "A. Submitted by Dustin Haver [2]"

On Tue, Jun 16, 2020 at 12:27 AM David Smiley  wrote:

> C. The current Lucene logo [4]
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Mon, Jun 15, 2020 at 6:08 PM Ryan Ernst  wrote:
>
>> Dear Lucene and Solr developers!
>>
>> In February a contest was started to design a new logo for Lucene [1].
>> That contest concluded, and I am now (admittedly a little late!) calling a
>> vote.
>>
>> The entries are labeled as follows:
>>
>> A. Submitted by Dustin Haver [2]
>>
>> B. Submitted by Stamatis Zampetakis [3]. Note that this has several
>> variants. Within the linked entry there are 7 patterns and 7 color
>> palettes. Any vote for B should contain the pattern number, like B1 or B3.
>> If a B variant wins, we will have a followup vote on the color palette.
>>
>> C. The current Lucene logo [4]
>>
>> Please vote for one of the three (or nine depending on your perspective!)
>> above choices. Note that anyone in the Lucene+Solr community is invited to
>> express their opinion, though only Lucene+Solr PMC members cast binding
>> votes (indicate non-binding votes in your reply, please). This vote will
>> close one week from today, Mon, June 22, 2020.
>>
>> Thanks!
>>
>> [1] https://issues.apache.org/jira/browse/LUCENE-9221
>> [2]
>> https://issues.apache.org/jira/secure/attachment/12999548/Screen%20Shot%202020-04-10%20at%208.29.32%20AM.png
>> [3]
>> https://issues.apache.org/jira/secure/attachment/12997768/zabetak-1-7.pdf
>> [4]
>> https://lucene.apache.org/theme/images/lucene/lucene_logo_green_300.png
>>
>


Re: [VOTE] Solr to become a top-level Apache project (TLP)

2020-05-18 Thread Dennis Gove
Sure, I'd considered timezones before responding. The archive permalink for
the original message shows a date of "Tue, 12 May 2020 07:36:57 GMT".

Permalink:
https://mail-archives.apache.org/mod_mbox/lucene-dev/202005.mbox/%3CCAM21Rt-PjpVYKsThzAKJbbPT3OeuVdgnVSZ3E_QEoWfpLNd0%3DQ%40mail.gmail.com%3E

On Mon, May 18, 2020 at 11:48 AM Dawid Weiss  wrote:

> > The vote began on May 12th and the initial message said it will "be
> > active for a week to give everyone a chance to read
> > and cast a vote".
>
> I thought I'd sent it on Monday, which would make a week today. Time
> zones will also kick in because I live in Europe... I can wait a day
> more - not a problem.
>
> D.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: [VOTE] Solr to become a top-level Apache project (TLP)

2020-05-18 Thread Dennis Gove
Dawid,

The vote began on May 12th and the initial message said it would "be active
for a week to give everyone a chance to read and cast a vote".

The vote should remain open until at least May 19th.

- Dennis

On Mon, May 18, 2020 at 4:55 AM Dawid Weiss  wrote:

> Correction - Jim is a PMC member (apologies, Jim; you may want to
> update [1]).  This means the results are updated to:
>
> Vote distribution among PMC members (binding votes):
>
>  [+1]: 32
>  [-1]: 4
>
> D.
>
> [1] https://lucene.apache.org/whoweare.html
>
>
> On Mon, May 18, 2020 at 10:46 AM Dawid Weiss 
> wrote:
> >
> > I am closing this vote, thank you for participating.
> >
> > 46 votes have been posted: 35 from PMC members, 8 from committers and
> > 3 from user community.
> >
> > From this total 8 people voted -1 and 38 people voted +1. Vote
> > distribution among PMC members (binding votes):
> >
> > [+1]: 31
> > [-1]: 4
> >
> > The vote has passed. I will follow up on the process from here on in a
> > separate email.
> >
> > Dawid
> >
> > All collected votes in a spreadsheet:
> > https://docs.google.com/spreadsheets/d/1ZmR3C2EgA57QIeJJ3ejKCTUkHdG1lqPOU29heqybkKs/edit?usp=sharing
> >
> > And in plain text:
> >
> > +1
> >
> > PMC Ishan Chattopadhyaya (ichattopadhy...@gmail.com)
> > PMC Doron Cohen (cdor...@gmail.com)
> > PMC Shai Erera (ser...@gmail.com)
> > PMC Ryan Ernst (r...@iernst.net)
> > COMMITTER Jim Ferenczi (jim.feren...@gmail.com)
> > PMC Otis Gospodnetic (otis.gospodne...@gmail.com)
> > PMC Dennis Gove (dpg...@gmail.com)
> > PMC Adrien Grand (jpou...@gmail.com)
> > PMC Martijn v Groningen (martijn.v.gronin...@gmail.com)
> > PMC Mark Harwood (mharw...@apache.org)
> > PMC Erik Hatcher (erik.hatc...@gmail.com)
> > PMC Shawn Heisey (apa...@elyograg.org)
> > PMC Jan Høydahl (jan@cominvent.com)
> > COMMITTER Namgyu Kim (kng0...@gmail.com)
> > PMC Nicholas Knize (nkn...@gmail.com)
> > PMC Shalin Shekhar Mangar (shalinman...@gmail.com)
> > PMC Michael McCandless (luc...@mikemccandless.com)
> > PMC Christian Moen (c...@atilika.com)
> > -- Gora Mohanty (g...@mimirtech.com)
> > PMC Robert Muir (rcm...@gmail.com)
> > PMC Nhat Nguyen (nhat.ngu...@elastic.co.invalid)
> > PMC Kevin Risden (kris...@apache.org)
> > PMC Steven A Rowe (sar...@gmail.com)
> > PMC Uwe Schindler (u...@thetaphi.de)
> > PMC Koji Sekiguchi (koji.sekigu...@rondhuit.com)
> > COMMITTER Atri Sharma (a...@apache.org)
> > -- Lucky Sharma (goku0...@gmail.com)
> > PMC David Smiley (david.w.smi...@gmail.com)
> > COMMITTER Michael Sokolov (msoko...@gmail.com)
> > PMC Tommaso Teofili (tommaso.teof...@gmail.com)
> > PMC Varun Thacker (va...@vthacker.in)
> > COMMITTER Tomoko Uchida (tomoko.uchida.1...@gmail.com)
> > PMC Andi Vajda (o...@ovaltofu.org)
> > PMC Ignacio Vera (iver...@gmail.com)
> > PMC Dawid Weiss (dawid.we...@gmail.com)
> > PMC Simon Willnauer (simon.willna...@gmail.com)
> > PMC Alan Woodward (romseyg...@gmail.com)
> > PMC Karl Wright (daddy...@gmail.com)
> >
> > -1
> >
> > PMC Joel Bernstein (joels...@gmail.com)
> > COMMITTER Mike Drob (md...@apache.org)
> > PMC Jason Gerlowski (gerlowsk...@gmail.com)
> > PMC Anshum Gupta (ans...@anshumgupta.net)
> > COMMITTER Gus Heck (gus.h...@gmail.com)
> > COMMITTER Chris Hostetter (hossman_luc...@fucit.org)
> > PMC Tomás Fernández Löbbe (tomasflo...@gmail.com)
> > -- Kevin Watters (kwatt...@kmwllc.com)
>


Re: [VOTE] Solr to become a top-level Apache project (TLP)

2020-05-16 Thread Dennis Gove
For the reasons I included in my message on the discussion thread (
https://mail-archives.apache.org/mod_mbox/lucene-dev/202005.mbox/%3CCAPsCRSWds%3Dctvf837BFRXqyPxGLJ-5a%2B3hCa6HwrA8FFR5MQkA%40mail.gmail.com%3E)
I'm voting in favor of this. I hope the actual execution of it will follow
the path of a code/release split for a few versions before a more permanent
project split, but if not I think the benefits of a split will still be
seen, although perhaps down a bumpier road.

+1 (binding)


On Sat, May 16, 2020 at 9:44 PM Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> +1 (binding)
>
> On Tue, May 12, 2020 at 1:07 PM Dawid Weiss  wrote:
>
>> Dear Lucene and Solr developers!
>>
>> According to an earlier [DISCUSS] thread on the dev list [2], I am
>> calling for a vote on the proposal to make Solr a top-level Apache
>> project (TLP) and separate Lucene and Solr development into two
>> independent entities.
>>
>> To quickly recap the reasons and consequences of such a move: it seems
>> like the reasons for the initial merge of Lucene and Solr, around 10
>> years ago, have been achieved. Both projects are in good shape and
>> exhibit signs of independence already (mailing lists, committers,
>> patch flow). There are many technical considerations that would make
>> development much easier if we move Solr out into its own TLP.
>>
>> We discussed this issue [2] and both PMC members and committers had a
>> chance to review all the pros and cons and express their views. The
>> discussion showed that there are clearly different opinions on the
>> matter - some people are in favor, some are neutral, others are
>> against or not seeing the point of additional labor. Realistically, I
>> don't think reaching 100% consensus is going to be possible --
>> we are a diverse bunch with different opinions and personalities. I
>> firmly believe this is the right direction hence the decision to put
>> it under the voting process. Should something take a wrong turn in the
>> future (as some folks worry it may), all blame is on me.
>>
>> Therefore, the proposal is to separate Solr from under Lucene TLP, and
>> make it a TLP on its own. The initial structure of the new PMC,
>> committer base, git repositories and other managerial aspects can be
>> worked out during the process if the decision passes.
>>
>> Please indicate one of the following (see [1] for guidelines):
>>
>> [ ] +1 - yes, I vote for the proposal
>> [ ] -1 - no, I vote against the proposal
>>
>> Please note that anyone in the Lucene+Solr community is invited to
>> express their opinion, though only Lucene+Solr committers cast binding
>> votes (indicate non-binding votes in your reply, please).
>>
>> The vote will be active for a week to give everyone a chance to read
>> and cast a vote.
>>
>> Dawid
>>
>> [1] https://www.apache.org/foundation/voting.html
>> [2]
>> https://lists.apache.org/thread.html/rfae2440264f6f874e91545b2030c98e7b7e3854ddf090f7747d338df%40%3Cdev.lucene.apache.org%3E
>>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>


Re: [DISCUSS] Lucene-Solr split (Solr promoted to TLP)

2020-05-16 Thread Dennis Gove
There’s something enticing when thinking of Lucene and Solr as independent
codebases. I’ve always thought of Lucene as core search (indexing,
analysis, tokenization, etc…) and Solr as a search experience. Lucene is
more a library (or set of libraries) used by applications providing search
experiences. Solr is just one of those applications - it provides the
experience of search as a service, and feels focused on making search
approachable and palatable to search novices.

The work I put into streaming expressions was born out of a desire to more
widely expose search functionality. The streaming API drew me in because it
exposed a new way to interact with core search functionality, and
expressions came out of wanting to make it all easy to use for end users. I
didn’t, and don’t, give a whole lot of thought to the internals of Lucene.
I like a good user experience and I see Solr as an application trying to
provide that.

I do, however, have concerns about the long-term impact of a split. Lucene
is able to set a very explicit N-1 backward compatibility policy because it
can have less immediate concern for the downstream user. And this is not to
denigrate Lucene at all - in fact I agree with that policy for core search
functionality. If and when incompatible changes lead to significant gains
they can be, and are, made. Inefficient older ways are not brought forward
further than necessary. Solr has to be concerned with its end users, who
may be relative search novices, when considering backward-incompatible
changes. More thought is given to the experience and impact of upgrades.
How are those issues dealt with across replicas and shards? What will
happen in a cloud made up of Lucene indexes of varying versions? My concern
revolves around what happens if (when) Solr falls behind Lucene. Will it
ever be able to catch up? There’s an argument to be made that Solr being a
consistent N versions behind Lucene has some value to the Solr project.
But what happens if Solr gets a slower release cadence? Will it fall
further and further behind? Will its inability to use the latest and
greatest in Lucene be the impetus for a community-splitting fork? Will a
new search application come along without the legacy concerns of Solr and
become a more enticing option? Perhaps, to all of that. I can’t really say.

What I can say is I don’t think it’s appropriate to stifle the growth, or
in this case the change, of a community because of fear of the unknown.
Yes, I am worried that a project split will lead to trouble and issues for
Solr, and some of those fears are born out of how I know my company uses
Solr. But I also think a lot of good could come out of a split. It’d be
exciting to see how a Lucene community advances the state of the art of
core search, and how a Solr community provides a clean and easily
digestible search experience to end users. Will Lucene become more
embeddable? Will Solr become more plug-n-play?

I’m a fan of Christine’s suggestion of first executing a code and release
split and later, after seeing the impact of such a split, deciding on a
project split. Full disclosure, Christine and I work at the same company. I
think independent codebases will in the end benefit both, though I do agree
there is more inherent and immediate risk to Solr.

- Dennis

On Fri, May 15, 2020 at 4:03 AM Dawid Weiss  wrote:

> Hi Christine!
>
> > * After a while (perhaps with Lucene 10.0 or perhaps at some other
> > natural point) we re-arrive at the "together or separate" question. If
> > splitting worked well then Solr promotion to TLP could be a natural next
> > step
>
> My whole point is that I think the split is by and large already there:
> the mailing lists, the issues, the codebase (git constitutes common
> storage but the build system and nearly everything else are pretty much
> independent, with Solr consuming Lucene artifacts). I also believe the
> will to separate the projects has been with (some of) us for a long
> time and postponing this decision won't change anything.
>
> Dawid
>


Re: Welcome Houston Putman as Lucene/Solr committer

2019-11-16 Thread Dennis Gove
Congrats Houston!

On Fri, Nov 15, 2019 at 7:57 PM Mike Drob  wrote:

> Welcome, Houston!
>
> On Fri, Nov 15, 2019 at 6:00 PM Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
>
>> Congratulations and welcome Houston!
>>
>> On Thu, Nov 14, 2019 at 12:39 PM Houston Putman 
>> wrote:
>>
>>> Thanks everyone!
>>>
>>> As requested, a brief history of me:
>>>
>>> A native Austinite, I went to The University of Texas at Austin. Back in
>>> 2013 I lucked into an internship with Bloomberg working on a new Search
>>> Infrastructure team. There I had my first exposure to Solr and built the
>>> first iteration of the Analytics Component. Since graduating in 2016,
>>> moving up to NYC and starting at Bloomberg full time, I have been working
>>> on Solr in various ways, from rewriting the Analytics Component to adding
>>> some features to various parts of SolrJ and fixing some weirdness in pivot
>>> facets.
>>>
>>> Lately I’ve been working (and presenting) on running Solr on Kubernetes.
>>> We’ve open sourced a Solr Kubernetes operator (
>>> https://github.com/bloomberg/solr-operator), which is currently being
>>> developed with help from across the community. Our goal is to make this a
>>> standard and flexible way of running Solr in a cloud environment, which
>>> includes making Solr itself run better in the cloud.
>>>
>>> I can’t wait to continue working with y’all and making Solr as great as
>>> it can be!
>>>
>>>
>>> - Houston Putman
>>>
>>> On Thu, Nov 14, 2019 at 2:24 PM Varun Thacker  wrote:
>>>
>>>> Congratulations and welcome Houston!
>>>>
>>>> On Thu, Nov 14, 2019 at 9:32 AM Tomás Fernández Löbbe <
>>>> tomasflo...@gmail.com> wrote:
>>>>
>>>>> Welcome Houston!
>>>>>
>>>>> On Thu, Nov 14, 2019 at 9:09 AM Kevin Risden
>>>>> wrote:
>>>>>
>>>>>> Congrats and welcome!
>>>>>>
>>>>>> Kevin Risden
>>>>>>
>>>>>> On Thu, Nov 14, 2019, 12:05 Jason Gerlowski
>>>>>> wrote:
>>>>>>
>>>>>>> Congratulations!
>>>>>>>
>>>>>>> On Thu, Nov 14, 2019 at 11:58 AM Gus Heck
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > Congratulations and welcome :)
>>>>>>> >
>>>>>>> > On Thu, Nov 14, 2019 at 11:52 AM Namgyu Kim
>>>>>>> > wrote:
>>>>>>> >>
>>>>>>> >> Congratulations and welcome, Houston! :D
>>>>>>> >>
>>>>>>> >> On Fri, Nov 15, 2019 at 1:18 AM Ken LaPorte
>>>>>>> >> wrote:
>>>>>>> >>>
>>>>>>> >>> Congratulations Houston! Well deserved honor.
>>>>>>> >>>
>>>>>>> >>> --
>>>>>>> >>> Sent from:
>>>>>>> >>> https://lucene.472066.n3.nabble.com/Lucene-Java-Developer-f564358.html
>>>>>>> >>>
>>>>>>> >
>>>>>>> > --
>>>>>>> > http://www.needhamsoftware.com (work)
>>>>>>> > http://www.the111shift.com (play)
>>>>>>>
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>>
>


[jira] [Resolved] (SOLR-12271) Analytics Component reads negative float and double field values incorrectly

2018-05-30 Thread Dennis Gove (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-12271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove resolved SOLR-12271.

Resolution: Fixed

> Analytics Component reads negative float and double field values incorrectly
> 
>
> Key: SOLR-12271
> URL: https://issues.apache.org/jira/browse/SOLR-12271
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.4, master (8.0)
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Major
> Fix For: 7.4, master (8.0)
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the Analytics Component uses an incorrect conversion when turning 
> numeric doc values longs into doubles and floats.
> The fix is easy and the tests now cover this use case.
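
The issue text does not say which conversion was wrong. As an illustration
only, here is one way a bug of this shape can arise: if a double is stored
with a "sortable" long encoding (similar in spirit to Lucene's
NumericUtils.doubleToSortableLong / sortableLongToDouble) but is read back
with a plain Double.longBitsToDouble, positive values round-trip while
negative values do not. The class and method names below are hypothetical and
are not taken from the Analytics Component or the patch.

```java
// Hypothetical illustration only; not the Analytics Component code.
final class SortableDoubleSketch {

    // Flips the lower 63 bits of negative values so that the encoded longs
    // sort in the same order as the doubles. Applying it twice is a no-op.
    static long sortableDoubleBits(long bits) {
        return bits ^ ((bits >> 63) & 0x7fffffffffffffffL);
    }

    // Assumed storage format: sortable encoding of the raw IEEE 754 bits.
    static long encode(double value) {
        return sortableDoubleBits(Double.doubleToLongBits(value));
    }

    // Matching decoder: undoes the sortable transform before reinterpreting.
    static double decode(long stored) {
        return Double.longBitsToDouble(sortableDoubleBits(stored));
    }

    // Mismatched decoder: reinterprets the stored long as raw bits.
    static double decodeRawBits(long stored) {
        return Double.longBitsToDouble(stored);
    }

    public static void main(String[] args) {
        for (double value : new double[] {2.5, -2.5}) {
            long stored = encode(value);
            System.out.println(value + " -> matching: " + decode(stored)
                    + ", mismatched: " + decodeRawBits(stored));
        }
    }
}
```

Under these assumptions the mismatched decoder agrees with the matching one
for 2.5 and disagrees for -2.5, which is the symptom named in the issue title.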



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Assigned] (SOLR-12271) Analytics Component reads negative float and double field values incorrectly

2018-05-25 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reassigned SOLR-12271:
--

Assignee: Dennis Gove

> Analytics Component reads negative float and double field values incorrectly
> 
>
> Key: SOLR-12271
> URL: https://issues.apache.org/jira/browse/SOLR-12271
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.3.1, 7.4, master (8.0)
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Major
> Fix For: 7.4, master (8.0)
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the Analytics Component uses an incorrect conversion when turning 
> numeric doc values longs into doubles and floats.
> The fix is easy and the tests now cover this use case.






[jira] [Resolved] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-18 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove resolved SOLR-12355.

   Resolution: Fixed
Fix Version/s: 7.4

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-12355.patch, SOLR-12355.patch
>
>
> The following strings have colliding hashCode values, so HashJoinStream can 
> treat two tuples whose fields hold these values as a match.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP",
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1",
>   "bar":"Mtge"
> }{code}






[jira] [Updated] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-18 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-12355:
---
Attachment: SOLR-12355.patch

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
> Attachments: SOLR-12355.patch, SOLR-12355.patch
>
>
> The following strings have colliding hashCode values, so HashJoinStream can 
> treat two tuples whose fields hold these values as a match.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP",
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1",
>   "bar":"Mtge"
> }{code}






[jira] [Commented] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-15 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475973#comment-16475973
 ] 

Dennis Gove commented on SOLR-12355:


Initial patch attached. I have not yet run the full suite of tests against this.

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
> Attachments: SOLR-12355.patch
>
>
> The following strings have colliding hashCode values, so HashJoinStream can 
> treat two tuples whose fields hold these values as a match.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP",
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1",
>   "bar":"Mtge"
> }{code}






[jira] [Updated] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-15 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-12355:
---
Attachment: SOLR-12355.patch

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
> Attachments: SOLR-12355.patch
>
>
> The following strings have colliding hashCode values, so HashJoinStream can 
> treat two tuples whose fields hold these values as a match.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP",
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1",
>   "bar":"Mtge"
> }{code}






[jira] [Commented] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-14 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475060#comment-16475060
 ] 

Dennis Gove commented on SOLR-12355:


This also impacts OuterHashJoinStream.

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
>
> The following strings have colliding hashCode values, so HashJoinStream can 
> treat two tuples whose fields hold these values as a match.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP",
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1",
>   "bar":"Mtge"
> }{code}






[jira] [Commented] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-14 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475055#comment-16475055
 ] 

Dennis Gove commented on SOLR-12355:


I have a fix for this where instead of calculating the string value's hashCode 
we just use the string value as the key in the hashed set of tuples. I'm 
creating a few test cases to verify this gives us what we want.
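
A minimal sketch of why keying on the hash code misfires while keying on the
string itself does not. The names below (HashKeySketch, joinKey,
groupByHashCode, groupByValue) are hypothetical and are not taken from
HashJoinStream or the patch; the "::" separator is an assumption based on the
example strings in the issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the HashJoinStream implementation.
final class HashKeySketch {

    // Joins the compared field values into one string, e.g. "MG!!00TNGP::Mtge::".
    static String joinKey(String... fieldValues) {
        StringBuilder key = new StringBuilder();
        for (String value : fieldValues) {
            key.append(value).append("::");
        }
        return key.toString();
    }

    // Problematic variant: groups by the int hash code, so distinct strings
    // with equal hash codes land in the same group.
    static Map<Integer, List<String>> groupByHashCode(List<String> keys) {
        Map<Integer, List<String>> groups = new HashMap<>();
        for (String key : keys) {
            groups.computeIfAbsent(key.hashCode(), k -> new ArrayList<>()).add(key);
        }
        return groups;
    }

    // Safer variant: groups by the string itself. HashMap still hashes
    // internally, but it falls back to equals() when hash codes collide.
    static Map<String, List<String>> groupByValue(List<String> keys) {
        Map<String, List<String>> groups = new HashMap<>();
        for (String key : keys) {
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(key);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                joinKey("MG!!00TNGP", "Mtge"),
                joinKey("MG!!00TNH1", "Mtge"));
        System.out.println("groups by hashCode: " + groupByHashCode(keys).size());
        System.out.println("groups by value:    " + groupByValue(keys).size());
    }
}
```

The two example keys differ only in "GP" versus "H1", and both of those
two-character strings hash to 71*31 + 80 = 72*31 + 49 = 2281, which is why the
full strings collide. Grouping by hash code yields one group for the two keys;
grouping by value yields two.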

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
>
> The following strings have colliding hashCode values, so HashJoinStream can 
> treat two tuples whose fields hold these values as a match.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP",
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1",
>   "bar":"Mtge"
> }{code}






[jira] [Created] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-14 Thread Dennis Gove (JIRA)
Dennis Gove created SOLR-12355:
--

 Summary: HashJoinStream's use of String::hashCode results in 
non-matching tuples being considered matches
 Key: SOLR-12355
 URL: https://issues.apache.org/jira/browse/SOLR-12355
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ
Affects Versions: 6.0
Reporter: Dennis Gove
Assignee: Dennis Gove


The following strings have colliding hashCode values, so HashJoinStream can 
treat two tuples whose fields hold these values as a match.


{code:java}
"MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
This means these two tuples are the same if we're comparing on field "foo"
{code:java}
{
  "foo":"MG!!00TNGP::Mtge::"
}
{
  "foo":"MG!!00TNH1::Mtge::"
}
{code}
and these two tuples are the same if we're comparing on fields "foo,bar"
{code:java}
{
  "foo":"MG!!00TNGP",
  "bar":"Mtge"
}
{
  "foo":"MG!!00TNH1",
  "bar":"Mtge"
}{code}






[jira] [Resolved] (SOLR-11924) Add the ability to watch collection set changes in ZkStateReader

2018-04-17 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove resolved SOLR-11924.

   Resolution: Fixed
 Assignee: Dennis Gove
Fix Version/s: (was: master (8.0))

> Add the ability to watch collection set changes in ZkStateReader
> 
>
> Key: SOLR-11924
> URL: https://issues.apache.org/jira/browse/SOLR-11924
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 7.4, master (8.0)
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Minor
> Fix For: 7.4
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Allow users to watch when the set of collections for a cluster is changed. 
> This is useful if a user is trying to discover collections within a cloud.
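
A rough sketch of the idea, assuming a simple observer that is told the old
and new sets of collection names whenever the set changes. Everything here
(CollectionSetWatchSketch, register, update) is hypothetical and is not the
ZkStateReader API; in this sketch the caller supplies each fresh listing
instead of a ZooKeeper watch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.BiConsumer;

// Hypothetical illustration only; not the ZkStateReader implementation.
final class CollectionSetWatchSketch {

    private Set<String> current = new TreeSet<>();
    private final List<BiConsumer<Set<String>, Set<String>>> listeners = new ArrayList<>();

    // Registers a callback that receives (oldCollections, newCollections).
    void register(BiConsumer<Set<String>, Set<String>> listener) {
        listeners.add(listener);
    }

    // Feeds in the latest listing; listeners fire only if the set changed.
    // Returns true when a change was detected.
    boolean update(Collection<String> latest) {
        Set<String> next = new TreeSet<>(latest);
        if (next.equals(current)) {
            return false;
        }
        Set<String> previous = current;
        current = next;
        for (BiConsumer<Set<String>, Set<String>> listener : listeners) {
            listener.accept(Collections.unmodifiableSet(previous),
                    Collections.unmodifiableSet(next));
        }
        return true;
    }

    public static void main(String[] args) {
        CollectionSetWatchSketch watcher = new CollectionSetWatchSketch();
        watcher.register((oldSet, newSet) ->
                System.out.println("collections changed: " + oldSet + " -> " + newSet));
        watcher.update(Collections.singletonList("products"));
        watcher.update(Collections.singletonList("products")); // unchanged, no callback
    }
}
```

Passing both the old and the new set lets a listener work out which
collections were added or removed without keeping its own copy.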






[jira] [Commented] (SOLR-11914) Remove/move questionable SolrParams methods

2018-04-16 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440179#comment-16440179
 ] 

Dennis Gove commented on SOLR-11914:


I agree with [~dsmiley] - the code in the streaming classes appears to be an 
oddly roundabout way of doing things. The changes you've made here appear to 
be a much better approach.

> Remove/move questionable SolrParams methods
> ---
>
> Key: SOLR-11914
> URL: https://issues.apache.org/jira/browse/SOLR-11914
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Reporter: David Smiley
>Priority: Minor
>  Labels: newdev
> Attachments: SOLR-11914.patch
>
>
> {{Map<String, Object> getAll(Map<String, Object> sink, Collection 
> params)}} 
> Is only used by the CollectionsHandler, and has particular rules about how it 
> handles multi-valued data that make it not very generic, and thus I think 
> doesn't belong here.  Furthermore, the existence of this method is confusing 
> in that it gives the user yet another option to weigh against toMap (there 
> are two overloaded variants).
> {{SolrParams toFilteredSolrParams(List names)}}
> Is only called in one place, and something about it bothers me, perhaps just 
> the name or that it ought to be a view maybe.
> {{static Map<String,String> toMap(NamedList params)}}
> Isn't used and I don't like it; it doesn't even involve a SolrParams!  Legacy 
> of 2006.
> {{static Map<String,String[]> toMultiMap(NamedList params)}}
> It doesn't even involve a SolrParams! Legacy of 2006 with some updates since. 
> Used in some places. Perhaps should be moved to NamedList as an instance 
> method.
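
To make the multi-valued concern concrete, here is a minimal sketch of the two conversions discussed above, written over plain name/value pairs rather than a real NamedList. The function names are hypothetical, and the "first value wins" rule in the single-valued form is an assumption of this sketch, not a statement about what SolrParams does.

```python
from typing import Dict, List, Sequence, Tuple

# Ordered name/value pairs; a name may appear more than once.
Pairs = Sequence[Tuple[str, str]]


def to_multi_map(pairs: Pairs) -> Dict[str, List[str]]:
    """Keep every value, grouped by name, in encounter order."""
    result: Dict[str, List[str]] = {}
    for name, value in pairs:
        result.setdefault(name, []).append(value)
    return result


def to_single_map(pairs: Pairs) -> Dict[str, str]:
    """Keep one value per name; here the first one wins (an assumption of this sketch)."""
    result: Dict[str, str] = {}
    for name, value in pairs:
        result.setdefault(name, value)
    return result


params = [("q", "*:*"), ("fq", "type:book"), ("fq", "year:2010")]
multi = to_multi_map(params)
single = to_single_map(params)
```

The single-valued form silently drops the second `fq`, which is the kind of special-case rule that makes such a helper hard to treat as generic.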






Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Dennis Gove
Welcome Dat!

On Mon, Apr 2, 2018 at 5:30 PM, Steve Rowe  wrote:

> Congrats and welcome Đạt!
>
> --
> Steve
> www.lucidworks.com
>
> > On Apr 2, 2018, at 3:50 PM, Adrien Grand  wrote:
> >
> > Fixing the subject of the email.
> >
> > On Mon, Apr 2, 2018 at 21:48, Adrien Grand wrote:
> > I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
> invitation to join.
> >
> > Welcome Đạt!
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[jira] [Commented] (SOLR-10512) Innerjoin streaming expressions - Invalid JoinStream error

2018-03-09 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392851#comment-16392851
 ] 

Dennis Gove commented on SOLR-10512:


It was certainly designed such that the left field in the on clause is the 
field from the first incoming stream and the right field in the on clause is 
the field from the second incoming stream. If that is not occurring then this 
is a very clear bug.
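
The intended semantics can be sketched as a merge join over two streams that are each sorted on their own side of the on clause. This is a simplified model, not the JoinStream implementation: the function names, the single-field on clause, and the sort check are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def parse_on(on: str) -> Tuple[str, str]:
    """Split 'left=right' into (field of first stream, field of second stream)."""
    left, right = (part.strip() for part in on.split("=", 1))
    return left, right


def inner_join(left: Sequence[Row], left_sort: str,
               right: Sequence[Row], right_sort: str, on: str) -> List[Row]:
    left_field, right_field = parse_on(on)
    # Simplified stand-in for the real validation: each stream must be
    # sorted on the field it contributes to the on clause.
    if left_sort != left_field or right_sort != right_field:
        raise ValueError("each incoming stream must be sorted on its on-clause field")
    joined: List[Row] = []
    i = j = 0
    while i < len(left) and j < len(right):
        lkey, rkey = left[i][left_field], right[j][right_field]
        if lkey < rkey:
            i += 1
        elif lkey > rkey:
            j += 1
        else:
            # Pair this left row with every right row sharing the key.
            k = j
            while k < len(right) and right[k][right_field] == lkey:
                joined.append({**left[i], **right[k]})
                k += 1
            i += 1
    return joined


books = [{"id": "book1"}, {"id": "book2"}]
reviews = [
    {"id_book_s": "book1", "stars_i": 3},
    {"id_book_s": "book1", "stars_i": 4},
    {"id_book_s": "book2", "stars_i": 3},
    {"id_book_s": "book2", "stars_i": 4},
]
rows = inner_join(books, "id", reviews, "id_book_s", on="id=id_book_s")

# A right-hand field name that does not match the sort field is rejected.
try:
    inner_join(books, "id", reviews, "id_book_s", on="id=id_books_s")
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

The second call only illustrates the kind of mismatch such a check would reject; it is not a diagnosis of the reported failure.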

> Innerjoin streaming expressions - Invalid JoinStream error
> --
>
> Key: SOLR-10512
> URL: https://issues.apache.org/jira/browse/SOLR-10512
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Affects Versions: 6.4.2, 6.5
> Environment: Debian Jessie
>Reporter: Dominique Béjean
>Priority: Major
>
> It looks like innerJoin streaming expressions do not work as explained in the 
> documentation. An invalid JoinStream error occurs.
> {noformat}
> curl --data-urlencode 'expr=innerJoin(
> search(books, 
>q="*:*", 
>fl="id", 
>sort="id asc"),
> search(reviews, 
>q="*:*", 
>fl="id_book_s", 
>sort="id_book_s asc"), 
> on="id=id_books_s"
> )' http://localhost:8983/solr/books/stream
>   
> {"result-set":{"docs":[{"EXCEPTION":"Invalid JoinStream - all incoming stream 
> comparators (sort) must be a superset of this stream's 
> equalitor.","EOF":true}]}}   
> {noformat}
> It is essentially the same as the documentation example
> 
> {noformat}
> innerJoin(
>   search(people, q=*:*, fl="personId,name", sort="personId asc"),
>   search(pets, q=type:cat, fl="ownerId,petName", sort="ownerId asc"),
>   on="personId=ownerId"
> )
> {noformat}
> Queries on each collection give :
> {noformat}
> $ curl --data-urlencode 'expr=search(books, 
>q="*:*", 
>fl="id, title_s, pubyear_i", 
>sort="pubyear_i asc", 
>qt="/export")' 
> http://localhost:8983/solr/books/stream
> {
>   "result-set": {
> "docs": [
>   {
> "title_s": "Friends",
> "pubyear_i": 1994,
> "id": "book2"
>   },
>   {
> "title_s": "The Way of Kings",
> "pubyear_i": 2010,
> "id": "book1"
>   },
>   {
> "EOF": true,
> "RESPONSE_TIME": 16
>   }
> ]
>   }
> }
> $ curl --data-urlencode 'expr=search(reviews, 
>q="author_s:d*", 
>fl="id, id_book_s, stars_i, review_dt", 
>sort="id_book_s asc", 
>qt="/export")' 
> http://localhost:8983/solr/reviews/stream
>  
> {
>   "result-set": {
> "docs": [
>   {
> "stars_i": 3,
> "id": "book1_c2",
> "id_book_s": "book1",
> "review_dt": "2014-03-15T12:00:00Z"
>   },
>   {
> "stars_i": 4,
> "id": "book1_c3",
> "id_book_s": "book1",
> "review_dt": "2014-12-15T12:00:00Z"
>   },
>   {
> "stars_i": 3,
> "id": "book2_c2",
> "id_book_s": "book2",
> "review_dt": "1994-03-15T12:00:00Z"
>   },
>   {
> "stars_i": 4,
> "id": "book2_c3",
> "id_book_s": "book2",
> "review_dt": "1

Re: Welcome Jason Gerlowski as committer

2018-02-08 Thread Dennis Gove
Welcome Jason!

On Feb 8, 2018 12:04 PM, "Adrien Grand"  wrote:

Welcome Jason!

On Thu, Feb 8, 2018 at 18:03, David Smiley wrote:

> Hello everyone,
>
> It's my pleasure to announce that Jason Gerlowski is our latest committer
> for Lucene/Solr in recognition for his contributions to the project!
> Please join me in welcoming him.  Jason, it's tradition for you to
> introduce yourself with a brief bio.
>
> Congratulations and Welcome!
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>


Re: Welcome Dennis Gove to the PMC

2017-12-29 Thread Dennis Gove
Thanks everyone!

On Thu, Dec 28, 2017 at 4:32 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> Welcome Dennis!
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Tue, Dec 26, 2017 at 8:12 AM, Joel Bernstein <joels...@gmail.com>
> wrote:
>
>> I am pleased to announce that Dennis Gove has accepted the PMC's
>> invitation to join.
>>
>> Welcome Dennis!
>>
>
>


Re: Welcome Karl Wright to the PMC

2017-12-28 Thread Dennis Gove
Welcome Karl! Congratulations!

On Thu, Dec 28, 2017 at 2:32 PM, David Smiley 
wrote:

> Welcome Karl!
>
> On Thu, Dec 28, 2017 at 12:36 PM Ahmet Arslan 
> wrote:
>
>> Congratulations  Karl!
>>
>> Ahmet
>>
>>
>> On Thursday, December 28, 2017, 7:32:41 PM GMT+3, Steve Rowe <
>> sar...@gmail.com> wrote:
>>
>>
>> Congrats and welcome Karl!
>>
>> --
>> Steve
>> www.lucidworks.com
>>
>> > On Dec 28, 2017, at 9:08 AM, Adrien Grand  wrote:
>> >
>> > I am pleased to announce that Karl Wright has accepted the PMC's
>> invitation to join.
>> >
>> > Welcome Karl!
>>
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>


[jira] [Resolved] (SOLR-11146) Analytics Component 2.0 Bug Fixes

2017-11-01 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove resolved SOLR-11146.

   Resolution: Fixed
Fix Version/s: (was: 7.0)
   7.2

> Analytics Component 2.0 Bug Fixes
> -
>
> Key: SOLR-11146
> URL: https://issues.apache.org/jira/browse/SOLR-11146
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.1
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Blocker
> Fix For: 7.2
>
>
> The new Analytics Component has several small bugs in mapping functions and 
> other places. This ticket is a fix for a large number of them. This patch 
> should allow all unit tests created in SOLR-11145 to pass.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Resolved] (SOLR-11145) Comprehensive Unit Tests for the Analytics Component

2017-11-01 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove resolved SOLR-11145.

   Resolution: Fixed
Fix Version/s: (was: 7.0)
   7.2

> Comprehensive Unit Tests for the Analytics Component
> 
>
> Key: SOLR-11145
> URL: https://issues.apache.org/jira/browse/SOLR-11145
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.1
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Critical
> Fix For: 7.2
>
>
> Adding comprehensive unit tests for the new Analytics Component.






[jira] [Assigned] (SOLR-11146) Analytics Component 2.0 Bug Fixes

2017-10-13 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reassigned SOLR-11146:
--

Assignee: Dennis Gove

> Analytics Component 2.0 Bug Fixes
> -
>
> Key: SOLR-11146
> URL: https://issues.apache.org/jira/browse/SOLR-11146
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.1
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Blocker
> Fix For: 7.0
>
>
> The new Analytics Component has several small bugs in mapping functions and 
> other places. This ticket is a fix for a large number of them. This patch 
> should allow all unit tests created in SOLR-11145 to pass.






[jira] [Assigned] (SOLR-11145) Comprehensive Unit Tests for the Analytics Component

2017-10-13 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reassigned SOLR-11145:
--

Assignee: Dennis Gove

> Comprehensive Unit Tests for the Analytics Component
> 
>
> Key: SOLR-11145
> URL: https://issues.apache.org/jira/browse/SOLR-11145
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.1
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Critical
> Fix For: 7.0
>
>
> Adding comprehensive unit tests for the new Analytics Component.






[jira] [Updated] (SOLR-11283) Refactor all Stream Evaluators in solrj.io.eval to simplify them

2017-08-25 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-11283:
---
Attachment: SOLR-11283.patch

All stream related tests pass.

The following 3 tests failed but appear to be completely unrelated to my 
changes.

{code}
[junit4] Tests with failures [seed: EE1F61CDA04C3749]:
[junit4]   - org.apache.solr.cloud.HttpPartitionTest.test
[junit4]   - org.apache.solr.cloud.ForceLeaderTest.testReplicasInLIRNoLeader
[junit4]   - org.apache.solr.update.processor.UpdateRequestProcessorFactoryTest.testUpdateDistribChainSkipping
{code}

When run individually those tests pass.

> Refactor all Stream Evaluators in solrj.io.eval to simplify them
> 
>
> Key: SOLR-11283
> URL: https://issues.apache.org/jira/browse/SOLR-11283
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
> Attachments: SOLR-11283.patch, SOLR-11283.patch
>
>
> As Stream Evaluators have been evolving we are seeing a need to better handle 
> differing types of data within evaluators. For example, allowing some to 
> evaluate over individual values or arrays of values, like
> {code}
> sin(a)
> sin(a,b,c,d)
> sin([a,b,c,d])
> {code}
> The current structure of Evaluators makes this difficult and repetitive work. 
> Also, the hierarchy of classes behind evaluators can be confusing for 
> developers creating new evaluators. For example, when to use a 
> ComplexEvaluator vs a BooleanEvaluator.
> A full refactoring of these classes will greatly enhance the usability and 
> future evolution of evaluators.






[jira] [Commented] (SOLR-11283) Refactor all Stream Evaluators in solrj.io.eval to simplify them

2017-08-23 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139070#comment-16139070
 ] 

Dennis Gove commented on SOLR-11283:


I should add, this patch has a bunch of new tests marked to fail with 
NotImplementedException. I intend to implement these tests before committing.

> Refactor all Stream Evaluators in solrj.io.eval to simplify them
> 
>
> Key: SOLR-11283
> URL: https://issues.apache.org/jira/browse/SOLR-11283
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
> Attachments: SOLR-11283.patch
>
>
> As Stream Evaluators have been evolving we are seeing a need to better handle 
> differing types of data within evaluators. For example, allowing some to 
> evaluate over individual values or arrays of values, like
> {code}
> sin(a)
> sin(a,b,c,d)
> sin([a,b,c,d])
> {code}
> The current structure of Evaluators makes this difficult and repetitive work. 
> Also, the hierarchy of classes behind evaluators can be confusing for 
> developers creating new evaluators. For example, when to use a 
> ComplexEvaluator vs a BooleanEvaluator.
> A full refactoring of these classes will greatly enhance the usability and 
> future evolution of evaluators.






[jira] [Updated] (SOLR-11283) Refactor all Stream Evaluators in solrj.io.eval to simplify them

2017-08-23 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-11283:
---
Attachment: SOLR-11283.patch

Full patch. All evaluator and stream related tests pass. Have not yet run full 
tests or precommit checks.

All evaluators are backward compatible in functionality and name/parameters, 
except for addAll which has been renamed to append.

> Refactor all Stream Evaluators in solrj.io.eval to simplify them
> 
>
> Key: SOLR-11283
> URL: https://issues.apache.org/jira/browse/SOLR-11283
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
> Attachments: SOLR-11283.patch
>
>
> As Stream Evaluators have been evolving we are seeing a need to better handle 
> differing types of data within evaluators. For example, allowing some to 
> evaluate over individual values or arrays of values, like
> {code}
> sin(a)
> sin(a,b,c,d)
> sin([a,b,c,d])
> {code}
> The current structure of Evaluators makes this difficult and repetitive work. 
> Also, the hierarchy of classes behind evaluators can be confusing for 
> developers creating new evaluators. For example, when to use a 
> ComplexEvaluator vs a BooleanEvaluator.
> A full refactoring of these classes will greatly enhance the usability and 
> future evolution of evaluators.






[jira] [Created] (SOLR-11283) Refactor all Stream Evaluators in solrj.io.eval to simplify them

2017-08-23 Thread Dennis Gove (JIRA)
Dennis Gove created SOLR-11283:
--

 Summary: Refactor all Stream Evaluators in solrj.io.eval to 
simplify them
 Key: SOLR-11283
 URL: https://issues.apache.org/jira/browse/SOLR-11283
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Dennis Gove
Assignee: Dennis Gove


As Stream Evaluators have been evolving we are seeing a need to better handle 
differing types of data within evaluators. For example, allowing some to 
evaluate over individual values or arrays of values, like
{code}
sin(a)
sin(a,b,c,d)
sin([a,b,c,d])
{code}

The current structure of Evaluators makes this difficult and repetitive work. 

Also, the hierarchy of classes behind evaluators can be confusing for 
developers creating new evaluators. For example, when to use a ComplexEvaluator 
vs a BooleanEvaluator.

A full refactoring of these classes will greatly enhance the usability and 
future evolution of evaluators.
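
A minimal sketch of the kind of uniform handling described above: one wrapper that lets a scalar function accept a single value, several values, or an array of values. The `lift` helper is a hypothetical name, and returning a list for the multi-argument form is an assumption of this sketch rather than the behavior defined by the patch.

```python
import math


def lift(fn):
    """Wrap a scalar function so it accepts a scalar, several scalars, or (nested) lists."""

    def apply(value):
        # Recurse into arrays so sin([a, [b, c]]) keeps its shape.
        if isinstance(value, (list, tuple)):
            return [apply(item) for item in value]
        return fn(value)

    def evaluator(*args):
        if len(args) == 1:
            return apply(args[0])
        # Several arguments are treated like an array of values (assumption).
        return [apply(arg) for arg in args]

    return evaluator


sin = lift(math.sin)
single = sin(0.0)                      # sin(a)
several = sin(0.0, math.pi / 2)        # sin(a,b)
array = sin([0.0, math.pi / 2])        # sin([a,b])
```

With the shape handling in one place, each new evaluator only has to supply the scalar function.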






[jira] [Commented] (SOLR-10086) Add Streaming Expression for Kafka Streams

2017-07-30 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106528#comment-16106528
 ] 

Dennis Gove commented on SOLR-10086:


I've got a working -SNAPSHOT version of a Kafka Consumer available at 
https://oss.sonatype.org/content/repositories/snapshots/com/dennisgove/streaming-expressions-kafka/
and some initial documentation at 
https://dennisgove.github.io/streaming-expressions/kafka_overview.html

> Add Streaming Expression for Kafka Streams
> --
>
> Key: SOLR-10086
> URL: https://issues.apache.org/jira/browse/SOLR-10086
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Reporter: Susheel Kumar
>Priority: Minor
>
> This request is for SolrCloud to pull data from a Kafka topic periodically 
> using the DataImport Handler. 
> Adding streaming expression support to pull data from Kafka would be a good 
> feature to have.  






[jira] [Commented] (SOLR-10949) analytics component has hard coded assumptions about Trie numeric fields -- tests fail with randomized point fields

2017-07-20 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095580#comment-16095580
 ] 

Dennis Gove commented on SOLR-10949:


Assumptions abo

> analytics component has hard coded assumptions about Trie numeric fields -- 
> tests fail with randomized point fields
> ---
>
> Key: SOLR-10949
> URL: https://issues.apache.org/jira/browse/SOLR-10949
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>  Labels: numeric-tries-to-points
>
> Found as part of SOLR-10947... attempting to use numeric PointFields in 
> contrib/analytics tests causes problems in some tests due to classes like 
> StatsCollectorSupplierFactory, RangeEndpointCalculator, and AnalyticsParsers 
> having hard coded assumptions about using Trie based numeric fields (via 
> instanceof and class equality checks)
> (It's not immediately obvious if replacing these checks with inspection of 
> {{FieldType.getNumberType()}} would solve all the problems, or if other 
> assumptions are made downstream in the code)
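
The difference between the two kinds of check can be sketched with toy classes. Everything below is a stand-in written for illustration: the Python classes only borrow the names of Solr field types, and `get_number_type` mimics the idea of `FieldType.getNumberType()` without calling any Solr code.

```python
from enum import Enum


class NumberType(Enum):
    INTEGER = "integer"
    DOUBLE = "double"


class FieldType:
    """Toy base class; non-numeric fields report no number type."""

    def get_number_type(self):
        return None


class TrieIntField(FieldType):
    def get_number_type(self):
        return NumberType.INTEGER


class IntPointField(FieldType):
    def get_number_type(self):
        return NumberType.INTEGER


class StrField(FieldType):
    pass


def is_integer_by_class(field_type: FieldType) -> bool:
    # Hard-coded assumption: only Trie-based fields count as integers.
    return isinstance(field_type, TrieIntField)


def is_integer_by_number_type(field_type: FieldType) -> bool:
    # Asks the field what kind of number it holds, whatever its implementation.
    return field_type.get_number_type() is NumberType.INTEGER


class_check = [is_integer_by_class(f) for f in (TrieIntField(), IntPointField(), StrField())]
type_check = [is_integer_by_number_type(f) for f in (TrieIntField(), IntPointField(), StrField())]
```

The class-based check misses the point field, while the number-type check treats both integer implementations alike.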






[jira] [Comment Edited] (SOLR-10949) analytics component has hard coded assumptions about Trie numeric fields -- tests fail with randomized point fields

2017-07-20 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095580#comment-16095580
 ] 

Dennis Gove edited comment on SOLR-10949 at 7/20/17 11:52 PM:
--

Assumptions about Trie numeric fields in the Analytics component have been 
resolved as part of SOLR-10123. That assumption is no longer being made.


was (Author: dpgove):
Assumptions abo

> analytics component has hard coded assumptions about Trie numeric fields -- 
> tests fail with randomized point fields
> ---
>
> Key: SOLR-10949
> URL: https://issues.apache.org/jira/browse/SOLR-10949
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>  Labels: numeric-tries-to-points
>
> Found as part of SOLR-10947... attempting to use numeric PointFields in 
> contrib/analytics tests causes problems in some tests due to classes like 
> StatsCollectorSupplierFactory, RangeEndpointCalculator, and AnalyticsParsers 
> having hard coded assumptions about using Trie based numeric fields (via 
> instanceof and class equality checks)
> (It's not immediately obvious if replacing these checks with inspection of 
> {{FieldType.getNumberType()}} would solve all the problems, or if other 
> assumptions are made downstream in the code)






[jira] [Commented] (SOLR-10949) analytics component has hard coded assumptions about Trie numeric fields -- tests fail with randomized point fields

2017-07-20 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095579#comment-16095579
 ] 

Dennis Gove commented on SOLR-10949:


Assumptions abo

> analytics component has hard coded assumptions about Trie numeric fields -- 
> tests fail with randomized point fields
> ---
>
> Key: SOLR-10949
> URL: https://issues.apache.org/jira/browse/SOLR-10949
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>  Labels: numeric-tries-to-points
>
> Found as part of SOLR-10947... attempting to use numeric PointFields in 
> contrib/analytics tests causes problems in some tests due to classes like 
> StatsCollectorSupplierFactory, RangeEndpointCalculator, and AnalyticsParsers 
> having hard coded assumptions about using Trie based numeric fields (via 
> instanceof and class equality checks)
> (It's not immediately obvious if replacing these checks with inspection of 
> {{FieldType.getNumberType()}} would solve all the problems, or if other 
> assumptions are made downstream in the code)






[jira] [Closed] (SOLR-10123) Analytics Component 2.0

2017-07-20 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove closed SOLR-10123.
--
Resolution: Fixed

Work related to this ticket is complete.

> Analytics Component 2.0
> ---
>
> Key: SOLR-10123
> URL: https://issues.apache.org/jira/browse/SOLR-10123
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.0
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Blocker
>  Labels: features
> Fix For: 7.0
>
> Attachments: SOLR-10123.patch, SOLR-10123.patch, SOLR-10123.patch, 
> SOLR-10123.patch.bugfixes
>
>
> A completely redesigned Analytics Component, introducing the following 
> features:
> * Support for distributed collections
> * New JSON request language, and response format that fits JSON better.
> * Faceting over mapping functions in addition to fields (Value Faceting)
> * PivotFaceting with ValueFacets
> * More advanced facet sorting
> * Support for PointField types
> * Expressions over multi-valued fields
> * New types of mapping functions
> ** Logical
> ** Conditional
> ** Comparison
> * Concurrent request execution
> * Custom user functions, defined within the request
> Fully backwards compatible with the original Analytics Component with the 
> following exceptions:
> * All fields used must have doc-values enabled
> * Expression results can no longer be used when defining Range and Query 
> facets
> * The reverse(string) mapping function is no longer a native function






[jira] [Commented] (SOLR-10123) Analytics Component 2.0

2017-07-20 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095572#comment-16095572
 ] 

Dennis Gove commented on SOLR-10123:


I thought it might be something related to the cloud not yet being ready, but 
QueryFacetCloudTest extends AbstractAnalyticsFacetCloudTest and calls 
[`setupCluster`|https://github.com/apache/lucene-solr/blob/master/solr/contrib/analytics/src/test/org/apache/solr/analytics/facet/QueryFacetCloudTest.java#L44]
 which calls 
[`AbstractDistribZkTestBase.waitForRecoveriesToFinish`|https://github.com/apache/lucene-solr/blob/master/solr/contrib/analytics/src/test/org/apache/solr/analytics/facet/AbstractAnalyticsFacetCloudTest.java#L59].

I think after that call the cloud is ready for documents to be added.

> Analytics Component 2.0
> ---
>
> Key: SOLR-10123
> URL: https://issues.apache.org/jira/browse/SOLR-10123
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.0
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Blocker
>  Labels: features
> Fix For: 7.0
>
> Attachments: SOLR-10123.patch, SOLR-10123.patch, SOLR-10123.patch, 
> SOLR-10123.patch.bugfixes
>
>
> A completely redesigned Analytics Component, introducing the following 
> features:
> * Support for distributed collections
> * New JSON request language, and response format that fits JSON better.
> * Faceting over mapping functions in addition to fields (Value Faceting)
> * PivotFaceting with ValueFacets
> * More advanced facet sorting
> * Support for PointField types
> * Expressions over multi-valued fields
> * New types of mapping functions
> ** Logical
> ** Conditional
> ** Comparison
> * Concurrent request execution
> * Custom user functions, defined within the request
> Fully backwards compatible with the original Analytics Component with the 
> following exceptions:
> * All fields used must have doc-values enabled
> * Expression results can no longer be used when defining Range and Query 
> facets
> * The reverse(string) mapping function is no longer a native function






[jira] [Commented] (SOLR-10123) Analytics Component 2.0

2017-07-20 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095247#comment-16095247
 ] 

Dennis Gove commented on SOLR-10123:


In my opinion this ticket is done and can be marked as such. The feature is in. 
I'm not convinced that the test failure above ([~steve_rowe]) is related. And, 
while additional tests would be nice, I believe they should be added under 
another ticket altogether.

[~hossman], what are your thoughts on me closing this ticket?

> Analytics Component 2.0
> ---
>
> Key: SOLR-10123
> URL: https://issues.apache.org/jira/browse/SOLR-10123
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.0
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Blocker
>  Labels: features
> Fix For: 7.0
>
> Attachments: SOLR-10123.patch, SOLR-10123.patch, SOLR-10123.patch, 
> SOLR-10123.patch.bugfixes
>
>
> A completely redesigned Analytics Component, introducing the following 
> features:
> * Support for distributed collections
> * New JSON request language, and response format that fits JSON better.
> * Faceting over mapping functions in addition to fields (Value Faceting)
> * PivotFaceting with ValueFacets
> * More advanced facet sorting
> * Support for PointField types
> * Expressions over multi-valued fields
> * New types of mapping functions
> ** Logical
> ** Conditional
> ** Comparison
> * Concurrent request execution
> * Custom user functions, defined within the request
> Fully backwards compatible with the original Analytics Component with the 
> following exceptions:
> * All fields used must have doc-values enabled
> * Expression results can no longer be used when defining Range and Query 
> facets
> * The reverse(string) mapping function is no longer a native function






[jira] [Commented] (SOLR-11075) Refactor handling of params in CloudSolrStream and FacetStream

2017-07-15 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088686#comment-16088686
 ] 

Dennis Gove commented on SOLR-11075:


I do not recall any particular reason why parameter values are joined with a 
comma instead of adding the parameter for each value. Looking at the history, 
support for multiple values for a single parameter was added in [this 
commit|https://github.com/apache/lucene-solr/commit/f4359ff8ffd96253ba610865c5e29172307c3c7a#diff-eba4f20196b5119b62f729727a21bf00].
 Supporting multiple is certainly the appropriate thing to do, but I do believe 
you're correct in wondering why commas are used as opposed to adding the 
parameter multiple times.

From my perspective, and in the world of expressions, there's no effective 
difference.
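
For readers comparing the two encodings, the sketch below shows how a repeated parameter and a comma-joined parameter look on the wire and what a "first value wins" reader would see for each. It uses only Python's standard URL helpers; the `first_value` function is a hypothetical stand-in and does not reproduce Solr's own parameter parsing.

```python
from urllib.parse import parse_qs, urlencode

values = ["clause1", "clause2"]

# Option 1: add the parameter once per value.
repeated = urlencode([("q", value) for value in values])

# Option 2: join the values with a comma into a single parameter.
joined = urlencode({"q": ",".join(values)})


def first_value(query_string: str, name: str) -> str:
    """Return the first value of a parameter, as a first-one-wins reader would."""
    return parse_qs(query_string)[name][0]


seen_repeated = first_value(repeated, "q")
seen_joined = first_value(joined, "q")
```

Whether the two forms behave the same therefore depends on the consumer: a reader that collects all values can treat them alike, while one that takes only the first value sees different input.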

> Refactor handling of params in CloudSolrStream and FacetStream
> --
>
> Key: SOLR-11075
> URL: https://issues.apache.org/jira/browse/SOLR-11075
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-11075.patch
>
>
> We started to look more closely at how toExpression is used in these classes 
> and the more we looked the more puzzled we became (Varun and I, that is).
> Is there any reason other than history why the params are pulled apart then 
> reconstructed into comma-separated lists when there is more than one of any 
> particular parameter? I suspect that when I worked on SOLR-8467 I didn't 
> delve deeply enough here.
> [~dpgove] [~joel.bernstein] [~risdenk] in particular we'd like your opinion. 
> Arguably this is going to lead to anomalies, i.e. differences in what 
> streaming selects vs. what standard Solr would select.
> For instance, let's say the user puts two "q" parameters in. Standard Solr 
> parsing uses the first one encountered. What happens when we get 
> q=clause1,clause2 as a result of the toExpression is anybody's guess. It just 
> shouldn't be different than straight-up Solr IMO.
> "fl" parameters on the other hand are all honored, as are "fq" clauses.
> With multiple "sort" clauses, it appears the first one wins.
> So my question is whether it makes sense to just add the parameter multiple 
> times, presumably reflecting the actual query.
> Assigning to myself but someone else should feel free to take it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-10123) Analytics Component 2.0

2017-07-06 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10123:
---
Affects Version/s: 7.0

> Analytics Component 2.0
> ---
>
> Key: SOLR-10123
> URL: https://issues.apache.org/jira/browse/SOLR-10123
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.0
>Reporter: Houston Putman
>  Labels: features
> Fix For: 7.0
>
> Attachments: SOLR-10123.patch, SOLR-10123.patch, SOLR-10123.patch, 
> SOLR-10123.patch.bugfixes
>
>
> A completely redesigned Analytics Component, introducing the following 
> features:
> * Support for distributed collections
> * New JSON request language, and response format that fits JSON better.
> * Faceting over mapping functions in addition to fields (Value Faceting)
> * PivotFaceting with ValueFacets
> * More advanced facet sorting
> * Support for PointField types
> * Expressions over multi-valued fields
> * New types of mapping functions
> ** Logical
> ** Conditional
> ** Comparison
> * Concurrent request execution
> * Custom user functions, defined within the request
> Fully backwards compatible with the original Analytics Component with the 
> following exceptions:
> * All fields used must have doc-values enabled
> * Expression results can no longer be used when defining Range and Query 
> facets
> * The reverse(string) mapping function is no longer a native function






[jira] [Updated] (SOLR-10123) Analytics Component 2.0

2017-07-06 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10123:
---
Fix Version/s: 7.0

> Analytics Component 2.0
> ---
>
> Key: SOLR-10123
> URL: https://issues.apache.org/jira/browse/SOLR-10123
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.0
>Reporter: Houston Putman
>  Labels: features
> Fix For: 7.0
>
> Attachments: SOLR-10123.patch, SOLR-10123.patch, SOLR-10123.patch, 
> SOLR-10123.patch.bugfixes
>
>
> A completely redesigned Analytics Component, introducing the following 
> features:
> * Support for distributed collections
> * New JSON request language, and response format that fits JSON better.
> * Faceting over mapping functions in addition to fields (Value Faceting)
> * PivotFaceting with ValueFacets
> * More advanced facet sorting
> * Support for PointField types
> * Expressions over multi-valued fields
> * New types of mapping functions
> ** Logical
> ** Conditional
> ** Comparison
> * Concurrent request execution
> * Custom user functions, defined within the request
> Fully backwards compatible with the original Analytics Component with the 
> following exceptions:
> * All fields used must have doc-values enabled
> * Expression results can no longer be used when defining Range and Query 
> facets
> * The reverse(string) mapping function is no longer a native function






[jira] [Updated] (SOLR-10123) Analytics Component 2.0

2017-07-06 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10123:
---
Attachment: SOLR-10123.patch.bugfixes

This patch contains only the changes on commit 88b7ed1. It will need to be 
applied to branches 7x and 7.0.

> Analytics Component 2.0
> ---
>
> Key: SOLR-10123
> URL: https://issues.apache.org/jira/browse/SOLR-10123
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Houston Putman
>  Labels: features
> Attachments: SOLR-10123.patch, SOLR-10123.patch, SOLR-10123.patch, 
> SOLR-10123.patch.bugfixes
>
>
> A completely redesigned Analytics Component, introducing the following 
> features:
> * Support for distributed collections
> * New JSON request language, and response format that fits JSON better.
> * Faceting over mapping functions in addition to fields (Value Faceting)
> * PivotFaceting with ValueFacets
> * More advanced facet sorting
> * Support for PointField types
> * Expressions over multi-valued fields
> * New types of mapping functions
> ** Logical
> ** Conditional
> ** Comparison
> * Concurrent request execution
> * Custom user functions, defined within the request
> Fully backwards compatible with the original Analytics Component with the 
> following exceptions:
> * All fields used must have doc-values enabled
> * Expression results can no longer be used when defining Range and Query 
> facets
> * The reverse(string) mapping function is no longer a native function






[jira] [Commented] (SOLR-10123) Analytics Component 2.0

2017-06-28 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16067007#comment-16067007
 ] 

Dennis Gove commented on SOLR-10123:


We had some issues with completely unrelated tests failing when running 
{code}ant clean test{code} Sometimes when we ran the full test suite, varying 
sets of tests would fail, but re-running with the same seed would see those 
tests pass. There was no rhyme or reason to which tests failed or why, and the 
failures we saw had nothing to do with analytics. Because Analytics is a 
contrib module, both Houston and I are of the opinion that the failures are 
unrelated to this code change. We also saw many full test suite runs which 
showed *no* failures.

I cannot say that with 100% certainty, however, so I want to document it here. 
Houston will be watching the daily build/test log and will investigate any 
related failures.

{code}ant precommit{code} does pass.

> Analytics Component 2.0
> ---
>
> Key: SOLR-10123
> URL: https://issues.apache.org/jira/browse/SOLR-10123
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Houston Putman
>  Labels: features
> Attachments: SOLR-10123.patch, SOLR-10123.patch, SOLR-10123.patch
>
>
> A completely redesigned Analytics Component, introducing the following 
> features:
> * Support for distributed collections
> * New JSON request language, and response format that fits JSON better.
> * Faceting over mapping functions in addition to fields (Value Faceting)
> * PivotFaceting with ValueFacets
> * More advanced facet sorting
> * Support for PointField types
> * Expressions over multi-valued fields
> * New types of mapping functions
> ** Logical
> ** Conditional
> ** Comparison
> * Concurrent request execution
> * Custom user functions, defined within the request
> Fully backwards compatible with the original Analytics Component with the 
> following exceptions:
> * All fields used must have doc-values enabled
> * Expression results can no longer be used when defining Range and Query 
> facets
> * The reverse(string) mapping function is no longer a native function






[jira] [Updated] (SOLR-10123) Analytics Component 2.0

2017-06-28 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10123:
---
Attachment: SOLR-10123.patch

This patch version has been applied to master. Houston will be updating this 
ticket with additional documentation and comments describing this change.

> Analytics Component 2.0
> ---
>
> Key: SOLR-10123
> URL: https://issues.apache.org/jira/browse/SOLR-10123
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Houston Putman
>  Labels: features
> Attachments: SOLR-10123.patch, SOLR-10123.patch, SOLR-10123.patch
>
>
> A completely redesigned Analytics Component, introducing the following 
> features:
> * Support for distributed collections
> * New JSON request language, and response format that fits JSON better.
> * Faceting over mapping functions in addition to fields (Value Faceting)
> * PivotFaceting with ValueFacets
> * More advanced facet sorting
> * Support for PointField types
> * Expressions over multi-valued fields
> * New types of mapping functions
> ** Logical
> ** Conditional
> ** Comparison
> * Concurrent request execution
> * Custom user functions, defined within the request
> Fully backwards compatible with the original Analytics Component with the 
> following exceptions:
> * All fields used must have doc-values enabled
> * Expression results can no longer be used when defining Range and Query 
> facets
> * The reverse(string) mapping function is no longer a native function






[jira] [Updated] (SOLR-10123) Analytics Component 2.0

2017-06-26 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10123:
---
Attachment: SOLR-10123.patch

This is the current patch. It appears to have failing tests and precommit 
issues. Houston is working on both.

> Analytics Component 2.0
> ---
>
> Key: SOLR-10123
> URL: https://issues.apache.org/jira/browse/SOLR-10123
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Houston Putman
>  Labels: features
> Attachments: SOLR-10123.patch, SOLR-10123.patch
>
>
> A completely redesigned Analytics Component, introducing the following 
> features:
> * Support for distributed collections
> * New JSON request language, and response format that fits JSON better.
> * Faceting over mapping functions in addition to fields (Value Faceting)
> * PivotFaceting with ValueFacets
> * More advanced facet sorting
> * Support for PointField types
> * Expressions over multi-valued fields
> * New types of mapping functions
> ** Logical
> ** Conditional
> ** Comparison
> * Concurrent request execution
> * Custom user functions, defined within the request
> Fully backwards compatible with the original Analytics Component with the 
> following exceptions:
> * All fields used must have doc-values enabled
> * Expression results can no longer be used when defining Range and Query 
> facets
> * The reverse(string) mapping function is no longer a native function






[jira] [Closed] (SOLR-9981) Multiple analytics fixes/performance improvements

2017-06-26 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove closed SOLR-9981.
-
   Resolution: Won't Fix
Fix Version/s: (was: master (7.0))

Closing as Won't Fix, since this is being dealt with in SOLR-10123.

> Multiple analytics fixes/performance improvements
> -
>
> Key: SOLR-9981
> URL: https://issues.apache.org/jira/browse/SOLR-9981
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Houston Putman
>    Assignee: Dennis Gove
>Priority: Minor
>  Labels: patch
> Attachments: SOLR-9983.patch
>
>
> Included are the following improvements/fixes:
> * Improving the unit test case.
> * Performance fix that stops the reading of ALL lucene segments over and 
> again for each stats collector.
> ** The AtomicReaderContext that refers to the "current" segment is reused.
> ** This fix shows an improvement of about 25% in query time for a dataset of 
> ~10M (=9.8M) records.
> ** Given the nature of the fix, the improvement should get better as the 
> dataset increases.
> * Fix for the NPE during comparison






[jira] [Reopened] (SOLR-9981) Multiple analytics fixes/performance improvements

2017-06-26 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reopened SOLR-9981:
---

> Multiple analytics fixes/performance improvements
> -
>
> Key: SOLR-9981
> URL: https://issues.apache.org/jira/browse/SOLR-9981
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Houston Putman
>    Assignee: Dennis Gove
>Priority: Minor
>  Labels: patch
> Fix For: master (7.0)
>
> Attachments: SOLR-9983.patch
>
>
> Included are the following improvements/fixes:
> * Improving the unit test case.
> * Performance fix that stops the reading of ALL lucene segments over and 
> again for each stats collector.
> ** The AtomicReaderContext that refers to the "current" segment is reused.
> ** This fix shows an improvement of about 25% in query time for a dataset of 
> ~10M (=9.8M) records.
> ** Given the nature of the fix, the improvement should get better as the 
> dataset increases.
> * Fix for the NPE during comparison






[jira] [Commented] (SOLR-9981) Multiple analytics fixes/performance improvements

2017-06-26 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063516#comment-16063516
 ] 

Dennis Gove commented on SOLR-9981:
---

I am going to revert this change.

> Multiple analytics fixes/performance improvements
> -
>
> Key: SOLR-9981
> URL: https://issues.apache.org/jira/browse/SOLR-9981
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Houston Putman
>    Assignee: Dennis Gove
>Priority: Minor
>  Labels: patch
> Fix For: master (7.0)
>
> Attachments: SOLR-9983.patch
>
>
> Included are the following improvements/fixes:
> * Improving the unit test case.
> * Performance fix that stops the reading of ALL lucene segments over and 
> again for each stats collector.
> ** The AtomicReaderContext that refers to the "current" segment is reused.
> ** This fix shows an improvement of about 25% in query time for a dataset of 
> ~10M (=9.8M) records.
> ** Given the nature of the fix, the improvement should get better as the 
> dataset increases.
> * Fix for the NPE during comparison






[jira] [Commented] (SOLR-9981) Multiple analytics fixes/performance improvements

2017-06-26 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063187#comment-16063187
 ] 

Dennis Gove commented on SOLR-9981:
---

Thanks [~steve_rowe]. Houston is taking a look. All tests passed when I ran 
with this patch on Saturday. The failures are perhaps related to the seed choice.

> Multiple analytics fixes/performance improvements
> -
>
> Key: SOLR-9981
> URL: https://issues.apache.org/jira/browse/SOLR-9981
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Houston Putman
>    Assignee: Dennis Gove
>Priority: Minor
>  Labels: patch
> Fix For: master (7.0)
>
> Attachments: SOLR-9983.patch
>
>
> Included are the following improvements/fixes:
> * Improving the unit test case.
> * Performance fix that stops the reading of ALL lucene segments over and 
> again for each stats collector.
> ** The AtomicReaderContext that refers to the "current" segment is reused.
> ** This fix shows an improvement of about 25% in query time for a dataset of 
> ~10M (=9.8M) records.
> ** Given the nature of the fix, the improvement should get better as the 
> dataset increases.
> * Fix for the NPE during comparison






[jira] [Closed] (SOLR-9981) Multiple analytics fixes/performance improvements

2017-06-24 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove closed SOLR-9981.
-
   Resolution: Fixed
Fix Version/s: master (7.0)

> Multiple analytics fixes/performance improvements
> -
>
> Key: SOLR-9981
> URL: https://issues.apache.org/jira/browse/SOLR-9981
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Houston Putman
>    Assignee: Dennis Gove
>Priority: Minor
>  Labels: patch
> Fix For: master (7.0)
>
> Attachments: SOLR-9983.patch
>
>
> Included are the following improvements/fixes:
> * Improving the unit test case.
> * Performance fix that stops the reading of ALL lucene segments over and 
> again for each stats collector.
> ** The AtomicReaderContext that refers to the "current" segment is reused.
> ** This fix shows an improvement of about 25% in query time for a dataset of 
> ~10M (=9.8M) records.
> ** Given the nature of the fix, the improvement should get better as the 
> dataset increases.
> * Fix for the NPE during comparison






[jira] [Assigned] (SOLR-9981) Multiple analytics fixes/performance improvements

2017-06-24 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reassigned SOLR-9981:
-

Assignee: Dennis Gove

> Multiple analytics fixes/performance improvements
> -
>
> Key: SOLR-9981
> URL: https://issues.apache.org/jira/browse/SOLR-9981
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Houston Putman
>    Assignee: Dennis Gove
>Priority: Minor
>  Labels: patch
> Attachments: SOLR-9983.patch
>
>
> Included are the following improvements/fixes:
> * Improving the unit test case.
> * Performance fix that stops the reading of ALL lucene segments over and 
> again for each stats collector.
> ** The AtomicReaderContext that refers to the "current" segment is reused.
> ** This fix shows an improvement of about 25% in query time for a dataset of 
> ~10M (=9.8M) records.
> ** Given the nature of the fix, the improvement should get better as the 
> dataset increases.
> * Fix for the NPE during comparison






[jira] [Commented] (SOLR-10882) Restructure and Cleanup Stream Evaluators

2017-06-19 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16054002#comment-16054002
 ] 

Dennis Gove commented on SOLR-10882:


It appears that this
{code}
Comparator comparator = "asc".equals(sortOrder) ? (left,right) -> left.compareTo(right) : (left,right) -> right.compareTo(left);
list = list.stream().map(value -> (Comparable)value).sorted(comparator).collect(Collectors.toList());
{code}

doesn't take into account differing types (double and long). I will correct it 
with a type normalization pass.
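
For illustration, one way a normalization pass could look. This is a rough sketch under my own assumptions, not the change that went into the patch: every number is widened to BigDecimal before sorting so that a Long and a Double can be compared directly. The class and method names are made up.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; not the evaluator code from the patch.
class NormalizedSort {
    // Converts each value to BigDecimal, then sorts in the requested order.
    // Assumes finite values: NaN and infinities have no BigDecimal form.
    static List<BigDecimal> sortNumbers(List<? extends Number> values, String sortOrder) {
        Comparator<BigDecimal> comparator = "asc".equals(sortOrder)
            ? Comparator.<BigDecimal>naturalOrder()
            : Comparator.<BigDecimal>reverseOrder();
        return values.stream()
            .map(value -> new BigDecimal(value.toString())) // normalization step
            .sorted(comparator)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A mix of Long and Double values, which cannot be compared to each
        // other through their own compareTo methods.
        List<Number> mixed = Arrays.<Number>asList(3L, 1.5, 2L);
        System.out.println(sortNumbers(mixed, "asc"));
        System.out.println(sortNumbers(mixed, "desc"));
    }
}
```

Normalizing to double instead would be cheaper, at the cost of precision for large long values.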


> Restructure and Cleanup Stream Evaluators
> -
>
> Key: SOLR-10882
> URL: https://issues.apache.org/jira/browse/SOLR-10882
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Dennis Gove
> Attachments: SOLR-10882.patch
>
>
> There is a suite of new Stream Evaluators that I'd like to clean up and 
> restructure prior to the cutting of v7. This ticket is to track that progress.






Re: Release planning for 7.0

2017-06-18 Thread Dennis Gove
I've committed the most critical changes I wanted to make. Please don't
hold up the v7 release on my account.

Thanks!
Dennis

On Tue, Jun 13, 2017 at 9:27 AM, Dennis Gove <dpg...@gmail.com> wrote:

> Hi,
>
> I also have some cleanup I'd like to do prior to a cut of 7. There are
> some new stream evaluators that I'm finding don't flow with the general
> flavor of evaluators. I'm using
> https://issues.apache.org/jira/browse/SOLR-10882 for the cleanup, but I do
> intend to be complete by June 16th.
>
> Thanks,
> Dennis
>
>
> On Sat, Jun 10, 2017 at 11:21 AM, Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> Hi Anshum,
>> I would like to request you to consider delaying the branch cutting by a
>> bit till we finalize the SOLR-10574 discussions and make the changes.
>> Alternatively, we could backport the changes to that branch after you cut
>> the branch now.
>> Regards,
>> Ishan
>>
>> On Sat, Jun 3, 2017 at 1:02 AM, Steve Rowe <sar...@gmail.com> wrote:
>>
>>>
>>> > On Jun 2, 2017, at 5:40 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>>> >
>>> > On 6/2/2017 10:23 AM, Steve Rowe wrote:
>>> >
>>> >> I see zero benefits from cutting branch_7x now.  Shawn, can you
>>> describe why you think we should do this?
>>> >>
>>> >> My interpretation of your argument is that you’re in favor of
>>> delaying cutting branch_7_0 until feature freeze - which BTW is the status
>>> quo - but I don’t get why that argues for cutting branch_7x now.
>>> >
>>> > I think I read something in the message I replied to that wasn't
>>> > actually stated.  I hate it when I don't read things closely enough.
>>> >
>>> > I meant to address the idea of making both branch_7x and branch_7_0 at
>>> > the same time, whenever the branching happens.  Somehow I came up with
>>> > the idea that the gist of the discussion included making the branches
>>> > now, which I can see is not the case.
>>> >
>>> > My point, which I think applies equally to branch_7x, is to wait as
>>> long
>>> > as practical before creating a branch, so that there is as little
>>> > backporting as we can manage, particularly minimizing the amount of
>>> time
>>> > that we have more than two branches being actively changed.
>>>
>>> +1
>>>
>>> --
>>> Steve
>>> www.lucidworks.com
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>>
>


[jira] [Updated] (SOLR-10882) Restructure and Cleanup Stream Evaluators

2017-06-15 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10882:
---
Attachment: SOLR-10882.patch

Changes so far.

> Restructure and Cleanup Stream Evaluators
> -
>
> Key: SOLR-10882
> URL: https://issues.apache.org/jira/browse/SOLR-10882
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
> Attachments: SOLR-10882.patch
>
>
> There is a suite of new Stream Evaluators that I'd like to clean up and 
> restructure prior to the cutting of v7. This ticket is to track that progress.






[jira] [Commented] (SOLR-10882) Restructure and Cleanup Stream Evaluators

2017-06-13 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047905#comment-16047905
 ] 

Dennis Gove commented on SOLR-10882:


Working branch can be found at 
https://github.com/dennisgove/lucene-solr/tree/SOLR-10882

> Restructure and Cleanup Stream Evaluators
> -
>
> Key: SOLR-10882
> URL: https://issues.apache.org/jira/browse/SOLR-10882
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>
> There is a suite of new Stream Evaluators that I'd like to clean up and 
> restructure prior to the cutting of v7. This ticket is to track that progress.






Re: Release planning for 7.0

2017-06-13 Thread Dennis Gove
Hi,

I also have some cleanup I'd like to do prior to a cut of 7. There are some
new stream evaluators that I'm finding don't flow with the general flavor
of evaluators. I'm using https://issues.apache.org/jira/browse/SOLR-10882
for the cleanup, but I do intend to be complete by June 16th.

Thanks,
Dennis


On Sat, Jun 10, 2017 at 11:21 AM, Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> Hi Anshum,
> I would like to request you to consider delaying the branch cutting by a
> bit till we finalize the SOLR-10574 discussions and make the changes.
> Alternatively, we could backport the changes to that branch after you cut
> the branch now.
> Regards,
> Ishan
>
> On Sat, Jun 3, 2017 at 1:02 AM, Steve Rowe  wrote:
>
>>
>> > On Jun 2, 2017, at 5:40 PM, Shawn Heisey  wrote:
>> >
>> > On 6/2/2017 10:23 AM, Steve Rowe wrote:
>> >
>> >> I see zero benefits from cutting branch_7x now.  Shawn, can you
>> describe why you think we should do this?
>> >>
>> >> My interpretation of your argument is that you’re in favor of delaying
>> cutting branch_7_0 until feature freeze - which BTW is the status quo - but
>> I don’t get why that argues for cutting branch_7x now.
>> >
>> > I think I read something in the message I replied to that wasn't
>> > actually stated.  I hate it when I don't read things closely enough.
>> >
>> > I meant to address the idea of making both branch_7x and branch_7_0 at
>> > the same time, whenever the branching happens.  Somehow I came up with
>> > the idea that the gist of the discussion included making the branches
>> > now, which I can see is not the case.
>> >
>> > My point, which I think applies equally to branch_7x, is to wait as long
>> > as practical before creating a branch, so that there is as little
>> > backporting as we can manage, particularly minimizing the amount of time
>> > that we have more than two branches being actively changed.
>>
>> +1
>>
>> --
>> Steve
>> www.lucidworks.com
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>


[jira] [Created] (SOLR-10882) Restructure and Cleanup Stream Evaluators

2017-06-13 Thread Dennis Gove (JIRA)
Dennis Gove created SOLR-10882:
--

 Summary: Restructure and Cleanup Stream Evaluators
 Key: SOLR-10882
 URL: https://issues.apache.org/jira/browse/SOLR-10882
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Dennis Gove


There is a suite of new Stream Evaluators that I'd like to clean up and 
restructure prior to the cutting of v7. This ticket is to track that progress.






[jira] [Commented] (SOLR-10753) Add array Stream Evaluator

2017-05-26 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026388#comment-16026388
 ] 

Dennis Gove commented on SOLR-10753:


Maybe merge should support an inOrder option which doesn't maintain sort 
between the streams.

> Add array Stream Evaluator
> --
>
> Key: SOLR-10753
> URL: https://issues.apache.org/jira/browse/SOLR-10753
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: master (7.0)
>
> Attachments: SOLR-10753.patch
>
>
> The *array* Stream Evaluator returns an array of numbers. It can contain 
> numbers and evaluators that return numbers.
> Syntax:
> {code}
> a = array(1, 2, 3, 4, 5, 6)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10753) Add array Stream Evaluator

2017-05-26 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026375#comment-16026375
 ] 

Dennis Gove commented on SOLR-10753:


If that's the case, isn't it the same as 
[merge|https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions#StreamingExpressions-merge]?

> Add array Stream Evaluator
> --
>
> Key: SOLR-10753
> URL: https://issues.apache.org/jira/browse/SOLR-10753
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: master (7.0)
>
> Attachments: SOLR-10753.patch
>
>
> The *array* Stream Evaluator returns an array of numbers. It can contain 
> numbers and evaluators that return numbers.
> Syntax:
> {code}
> a = array(1, 2, 3, 4, 5, 6)
> {code}






[jira] [Commented] (SOLR-10753) Add array Stream Evaluator

2017-05-26 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026348#comment-16026348
 ] 

Dennis Gove commented on SOLR-10753:


The array itself isn't doing the vector math operations, though, right? I'd 
think it'd be up to the function doing the math operation to validate its 
input, which means accepting a list that could be filled with anything is 
all right, because it'll be validated anyway.

I'm concerned that there'll end up being a lot of very similar things used for 
very different reasons. And users will be confused about when to use which.

> Add array Stream Evaluator
> --
>
> Key: SOLR-10753
> URL: https://issues.apache.org/jira/browse/SOLR-10753
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: master (7.0)
>
> Attachments: SOLR-10753.patch
>
>
> The *array* Stream Evaluator returns an array of numbers. It can contain 
> numbers and evaluators that return numbers.
> Syntax:
> {code}
> a = array(1, 2, 3, 4, 5, 6)
> {code}






[jira] [Commented] (SOLR-10753) Add array Stream Evaluator

2017-05-26 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026326#comment-16026326
 ] 

Dennis Gove commented on SOLR-10753:


Could it just be a thing that returns a list of objects? Then it's up to the 
container to handle whatever they are.

{code}
list(1,2,3,4)
list(1,add(2,3),if(gt(a,b),a,b))
list(1,"foo", search())
{code}

Basically, a single function that creates a list/array of whatever. It is up to 
the containing function to decide if the list is valid for its purpose. For 
example,
{code}
add(1, list(2,3,4,5)) is the same as add(1,2,3,4,5)
add(1, list("foo","bar")) is deemed invalid
{code}

And map with a list would allow things like
{code}
map(add(1,?), over=list(2,3,4,5)) would result in list(1 + 2, 1 + 3, 1 + 4, 1 + 5)
{code}
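
To make the idea concrete, here is a minimal Java sketch of a list that accepts anything, with validation left to the containing function. The class and method names (ListEvaluatorSketch, list, add, flatten) are hypothetical stand-ins for illustration only and are not the Solr evaluator API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListEvaluatorSketch {

  // Hypothetical stand-in for list(...): holds any values and validates nothing.
  public static List<Object> list(Object... values) {
    return new ArrayList<>(Arrays.asList(values));
  }

  // Hypothetical stand-in for add(...): flattens nested lists, then validates
  // that every operand is numeric before summing.
  public static double add(Object... operands) {
    double sum = 0;
    for (Object operand : flatten(operands)) {
      if (!(operand instanceof Number)) {
        throw new IllegalArgumentException("add() requires numbers but found: " + operand);
      }
      sum += ((Number) operand).doubleValue();
    }
    return sum;
  }

  // Expands any List operand (recursively) into its individual elements.
  public static List<Object> flatten(Object... operands) {
    List<Object> flat = new ArrayList<>();
    for (Object operand : operands) {
      if (operand instanceof List) {
        for (Object element : (List<?>) operand) {
          flat.addAll(flatten(element));
        }
      } else {
        flat.add(operand);
      }
    }
    return flat;
  }
}
```

With this sketch, add(1, list(2, 3, 4, 5)) gives the same sum as add(1, 2, 3, 4, 5), while add(1, list("foo", "bar")) is rejected by add itself rather than by list.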

> Add array Stream Evaluator
> --
>
> Key: SOLR-10753
> URL: https://issues.apache.org/jira/browse/SOLR-10753
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Fix For: master (7.0)
>
> Attachments: SOLR-10753.patch
>
>
> The *array* Stream Evaluator returns an array of numbers. It can contain 
> numbers and evaluators that return numbers.
> Syntax:
> {code}
> a = array(1, 2, 3, 4, 5, 6)
> {code}






[jira] [Commented] (SOLR-10292) Add cartesian Streaming Expression to build cartesian products from multi-value fields and text fields

2017-05-19 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018223#comment-16018223
 ] 

Dennis Gove commented on SOLR-10292:


[~varunthacker], thank you. I've corrected this.

> Add cartesian Streaming Expression to build cartesian products from 
> multi-value fields and text fields
> --
>
> Key: SOLR-10292
> URL: https://issues.apache.org/jira/browse/SOLR-10292
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Fix For: 6.6
>
> Attachments: SOLR-10292.patch, SOLR-10292.patch, SOLR-10292.patch, 
> SOLR-10292.patch, SOLR-10292.patch
>
>
> Currently all the Streaming Expressions such as rollups, intersections, fetch 
> etc. work on single-value fields. The *cartesian* expression would create a 
> stream of tuples from a single tuple with a multi-value field. This would 
> allow multi-valued fields to be operated on by the wider library of Streaming 
> Expressions.
> For example a single tuple with a multi-valued field:
> id: 1
> author: [Jim, Jack, Steve]
> Would be transformed into the following three tuples:
> id:1
> author:Jim
> id:1
> author:Jack
> id:1
> author:Steve






[jira] [Closed] (SOLR-10292) Add cartesian Streaming Expression to build cartesian products from multi-value fields and text fields

2017-05-19 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove closed SOLR-10292.
--
   Resolution: Fixed
Fix Version/s: 6.6

> Add cartesian Streaming Expression to build cartesian products from 
> multi-value fields and text fields
> --
>
> Key: SOLR-10292
> URL: https://issues.apache.org/jira/browse/SOLR-10292
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Fix For: 6.6
>
> Attachments: SOLR-10292.patch, SOLR-10292.patch, SOLR-10292.patch, 
> SOLR-10292.patch, SOLR-10292.patch
>
>
> Currently all the Streaming Expressions such as rollups, intersections, fetch 
> etc. work on single-value fields. The *cartesian* expression would create a 
> stream of tuples from a single tuple with a multi-value field. This would 
> allow multi-valued fields to be operated on by the wider library of Streaming 
> Expressions.
> For example a single tuple with a multi-valued field:
> id: 1
> author: [Jim, Jack, Steve]
> Would be transformed into the following three tuples:
> id:1
> author:Jim
> id:1
> author:Jack
> id:1
> author:Steve






[jira] [Commented] (SOLR-10693) Add copyOfRange Stream Evaluator

2017-05-16 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012524#comment-16012524
 ] 

Dennis Gove commented on SOLR-10693:


That makes sense. I was thinking about String.substring, but following what 
Arrays does seems right.

> Add copyOfRange Stream Evaluator
> 
>
> Key: SOLR-10693
> URL: https://issues.apache.org/jira/browse/SOLR-10693
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>
> The copyOfRange Stream Evaluator copies a range of an array to a new array.
> Syntax:
> {code}
> a = copyOfRange(colA, 1, 4)
> {code}






[jira] [Commented] (SOLR-10693) Add copyOfRange Stream Evaluator

2017-05-16 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012082#comment-16012082
 ] 

Dennis Gove commented on SOLR-10693:


copyOfRange and copyOf are the same thing, with optional parameters, I think.

{code}
copyOf(array)
copyOf(array, startIdx)
copyOf(array, startIdx, length)
{code}
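
A minimal Java sketch of the three forms, assuming a double[] input. The class name CopyOfSketch and the (startIdx, length) parameter order are illustrative only. Note that java.util.Arrays.copyOfRange takes an exclusive end index rather than a length, so the three-argument form translates between the two.

```java
import java.util.Arrays;

public class CopyOfSketch {

  // copyOf(array): copy everything.
  public static double[] copyOf(double[] array) {
    return copyOf(array, 0);
  }

  // copyOf(array, startIdx): copy from startIdx to the end.
  public static double[] copyOf(double[] array, int startIdx) {
    return copyOf(array, startIdx, array.length - startIdx);
  }

  // copyOf(array, startIdx, length): copy 'length' elements beginning at startIdx.
  // Arrays.copyOfRange expects (from, to) with 'to' exclusive, hence startIdx + length.
  public static double[] copyOf(double[] array, int startIdx, int length) {
    return Arrays.copyOfRange(array, startIdx, startIdx + length);
  }
}
```

The choice between (start, length) and (from, to) is the only real difference between the two names; the overloads themselves all reduce to a single range copy.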


> Add copyOfRange Stream Evaluator
> 
>
> Key: SOLR-10693
> URL: https://issues.apache.org/jira/browse/SOLR-10693
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>
> The copyOfRange Stream Evaluator copies a range of an array to a new array.
> Syntax:
> {code}
> a = copyOfRange(colA, 1, 4)
> {code}






[jira] [Comment Edited] (SOLR-10682) Add variance Stream Evaluator

2017-05-13 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009505#comment-16009505
 ] 

Dennis Gove edited comment on SOLR-10682 at 5/13/17 8:53 PM:
-

I think overloading and relying somewhat on context should be fine. An 
overloaded function in java is contingent on the incoming type of argument, so 
min(array), min(tupleStream) both make sense within the context of usage. 
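
As a rough illustration in plain Java, the compiler picks the overload from the declared argument type, so the same name can serve both cases. The class name MinOverloadSketch is hypothetical, and the Iterator<Double> is only an assumed stand-in for a tuple stream.

```java
import java.util.Arrays;
import java.util.Iterator;

public class MinOverloadSketch {

  // min(array): smallest value in an in-memory array.
  public static double min(double[] array) {
    return Arrays.stream(array)
        .min()
        .orElseThrow(() -> new IllegalArgumentException("empty array"));
  }

  // min(tupleStream): smallest value seen while draining a stream of values.
  public static double min(Iterator<Double> tupleValues) {
    if (!tupleValues.hasNext()) {
      throw new IllegalArgumentException("empty stream");
    }
    double smallest = tupleValues.next();
    while (tupleValues.hasNext()) {
      smallest = Math.min(smallest, tupleValues.next());
    }
    return smallest;
  }
}
```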


was (Author: dpgove):
I think overloading and relying somewhat on context sho

> Add variance Stream Evaluator
> -
>
> Key: SOLR-10682
> URL: https://issues.apache.org/jira/browse/SOLR-10682
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>
> The variance Stream Evaluator will calculate the variance of a vector of 
> numbers.
> {code}
> v = var(colA)
> {code}






[jira] [Commented] (SOLR-10682) Add variance Stream Evaluator

2017-05-13 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009505#comment-16009505
 ] 

Dennis Gove commented on SOLR-10682:


I think overloading and relying somewhat on context sho

> Add variance Stream Evaluator
> -
>
> Key: SOLR-10682
> URL: https://issues.apache.org/jira/browse/SOLR-10682
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>
> The variance Stream Evaluator will calculate the variance of a vector of 
> numbers.
> {code}
> v = var(colA)
> {code}






[jira] [Comment Edited] (SOLR-10617) JDBCStream: support more data types

2017-05-10 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005815#comment-16005815
 ] 

Dennis Gove edited comment on SOLR-10617 at 5/11/17 2:43 AM:
-

bq. Does the Interface at the bottom need to be in its own source file per our 
guidelines?
The interface is only used internally. I've moved it inside the class body so 
it's a clear part of the JDBCStream class.

bq. How does the java.sql.Array bit work? I haven't encountered sql ARRAY type 
before, and I'm not seeing anything on it in the docs/tests/etc.
Some databases are able to return Array objects. For example, Postgres's jsonb 
function jsonb_build_array 
(https://www.postgresql.org/docs/9.5/static/functions-json.html) can result in 
an array existing in the resultset. The idea here is that an object array would 
be placed in the Tuple under the field name.


was (Author: dpgove):
bq. Does the Interface at the bottom need to be in its own source file per our 
guidelines?
The interface is only used internally. I've moved it inside the class body so 
it's a clear part of the JDBCStream class.

bq. How does the java.sql.Array bit work? I haven't encountered sql ARRAY type 
before, and I'm not seeing anything on it in the docs/tests/etc.
Some databases are able to return Array objects. For example, Postgres's jsonb 
function jsonb_build_array 
(https://www.postgresql.org/docs/9.5/static/functions-json.html) can result in 
an array existing in the resultset.

> JDBCStream: support more data types
> ---
>
> Key: SOLR-10617
> URL: https://issues.apache.org/jira/browse/SOLR-10617
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Reporter: James Dyer
>Assignee: James Dyer
>Priority: Minor
> Attachments: SOLR-10617.patch, SOLR-10617.patch, SOLR-10617.patch
>
>
> It would be nice if JDBCStream could support Decimal types as well as 
> Timestamp-related types, and CLOBs.






[jira] [Comment Edited] (SOLR-10617) JDBCStream: support more data types

2017-05-10 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005815#comment-16005815
 ] 

Dennis Gove edited comment on SOLR-10617 at 5/11/17 2:42 AM:
-

bq. Does the Interface at the bottom need to be in its own source file per our 
guidelines?
The interface is only used internally. I've moved it inside the class body so 
it's a clear part of the JDBCStream class.

bq. How does the java.sql.Array bit work? I haven't encountered sql ARRAY type 
before, and I'm not seeing anything on it in the docs/tests/etc.
Some databases are able to return Array objects. For example, Postgres's jsonb 
function jsonb_build_array 
(https://www.postgresql.org/docs/9.5/static/functions-json.html) can result in 
an array existing in the resultset.


was (Author: dpgove):
> Does the Interface at the bottom need to be in its own source file per our 
> guidelines?
The interface is only used internally. I've moved it inside the class body so 
it's a clear part of the JDBCStream class.

> How does the java.sql.Array bit work? I haven't encountered sql ARRAY type 
> before, and I'm not seeing anything on it in the docs/tests/etc.
Some databases are able to return Array objects. For example, Postgres's jsonb 
function jsonb_build_array 
(https://www.postgresql.org/docs/9.5/static/functions-json.html) can result in 
an array existing in the resultset.

> JDBCStream: support more data types
> ---
>
> Key: SOLR-10617
> URL: https://issues.apache.org/jira/browse/SOLR-10617
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Reporter: James Dyer
>Assignee: James Dyer
>Priority: Minor
> Attachments: SOLR-10617.patch, SOLR-10617.patch, SOLR-10617.patch
>
>
> It would be nice if JDBCStream could support Decimal types as well as 
> Timestamp-related types, and CLOBs.






[jira] [Commented] (SOLR-10617) JDBCStream: support more data types

2017-05-10 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005815#comment-16005815
 ] 

Dennis Gove commented on SOLR-10617:


> Does the Interface at the bottom need to be in its own source file per our 
> guidelines?
The interface is only used internally. I've moved it inside the class body so 
it's a clear part of the JDBCStream class.

> How does the java.sql.Array bit work? I haven't encountered sql ARRAY type 
> before, and I'm not seeing anything on it in the docs/tests/etc.
Some databases are able to return Array objects. For example, Postgres's jsonb 
function jsonb_build_array 
(https://www.postgresql.org/docs/9.5/static/functions-json.html) can result in 
an array existing in the resultset.

> JDBCStream: support more data types
> ---
>
> Key: SOLR-10617
> URL: https://issues.apache.org/jira/browse/SOLR-10617
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Reporter: James Dyer
>Assignee: James Dyer
>Priority: Minor
> Attachments: SOLR-10617.patch, SOLR-10617.patch, SOLR-10617.patch
>
>
> It would be nice if JDBCStream could support Decimal types as well as 
> Timestamp-related types, and CLOBs.






[jira] [Updated] (SOLR-10617) JDBCStream: support more data types

2017-05-10 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10617:
---
Attachment: SOLR-10617.patch

This all looks good. I've just added a new patch with some additional comments 
on why in open() we switch from using the Java class type to the SQL data type.

> JDBCStream: support more data types
> ---
>
> Key: SOLR-10617
> URL: https://issues.apache.org/jira/browse/SOLR-10617
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Reporter: James Dyer
>Assignee: James Dyer
>Priority: Minor
> Attachments: SOLR-10617.patch, SOLR-10617.patch, SOLR-10617.patch
>
>
> It would be nice if JDBCStream could support Decimal types as well as 
> Timestamp-related types, and CLOBs.






[jira] [Commented] (SOLR-10617) JDBCStream: support more data types

2017-05-09 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002906#comment-16002906
 ] 

Dennis Gove commented on SOLR-10617:


I agree. I think adding support for additional types is a great thing. I'll 
take a closer look tonight.

> JDBCStream: support more data types
> ---
>
> Key: SOLR-10617
> URL: https://issues.apache.org/jira/browse/SOLR-10617
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Reporter: James Dyer
>Assignee: James Dyer
>Priority: Minor
> Attachments: SOLR-10617.patch, SOLR-10617.patch
>
>
> It would be nice if JDBCStream could support Decimal types as well as 
> Timestamp-related types, and CLOBs.






Re: Welcome Mike Drob as Lucene/Solr committer

2017-05-08 Thread Dennis Gove
Welcome Mike!

On Mon, May 8, 2017 at 11:42 AM, Mark Miller  wrote:

> I'm pleased to announce that Mike Drob has accepted the PMC's
> invitation to become a committer.
>
> Mike, it's tradition that you introduce yourself with a brief bio /
> origin story, explaining how you arrived here.
>
> Your existing Apache handle has already been added to the "lucene" LDAP group,
> so you now have commit privileges.
>
> Please celebrate this rite of passage, and confirm that the right
> karma has in fact been enabled, by embarking on the challenge of adding
> yourself to the committers section of the Who We Are page on the
> website: http://lucene.apache.org/whoweare.html (use the ASF CMS
> bookmarklet
> at the bottom of the page here: https://cms.apache.org/#bookmark -
> more info here http://www.apache.org/dev/cms.html).
>
> Congratulations and welcome!
> --
> - Mark
> about.me/markrmiller
>


[jira] [Commented] (SOLR-10617) JDBCStream: support more data types

2017-05-08 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001842#comment-16001842
 ] 

Dennis Gove commented on SOLR-10617:


I've only taken a quick look at this point, but I do want to point this out.

{code}
} else if (jdbcType == Types.DATE || jdbcType == Types.TIME || jdbcType == Types.TIMESTAMP) {
  valueSelectors[columnIdx] = new ResultSetValueSelector() {
    public Object selectValue(ResultSet resultSet) throws SQLException {
      if (jdbcType == Types.TIME) {
        Time sqlTime = resultSet.getTime(columnNumber);
        return resultSet.wasNull() ? null : sqlTime.toString();
      } else if (jdbcType == Types.DATE) {
        Date sqlDate = resultSet.getDate(columnNumber);
        return resultSet.wasNull() ? null : sqlDate.toString();
      } else {
        Timestamp sqlTimestamp = resultSet.getTimestamp(columnNumber);
        return resultSet.wasNull() ? null : sqlTimestamp.toInstant().toString();
      }
    }

    public String getColumnName() {
      return columnName;
    }
  };
}
{code}

The value selectors are constructed on open so that we can avoid executing the 
same conditional check on each row in the result set. By putting the jdbc type 
check inside of selectValue it is now repeating the same conditional for each 
row, even though every row will end up going down the same path.

While splitting these checks up does result in repeated code, the performance 
saving is well worth it.
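
A rough sketch of the split version, assuming a simplified selector interface. The names ValueSelectorSketch and selectorFor are hypothetical and this is not the JDBCStream code itself. The JDBC type is inspected once per column, and each returned selector contains no type check at all.

```java
import java.sql.Date;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Time;
import java.sql.Timestamp;
import java.sql.Types;

public class ValueSelectorSketch {

  // Simplified, hypothetical version of the selector interface in the patch.
  public interface ResultSetValueSelector {
    Object selectValue(ResultSet resultSet) throws SQLException;
  }

  // Intended to be called once per column when the stream is opened.
  // The branch taken here is never re-evaluated per row.
  public static ResultSetValueSelector selectorFor(int jdbcType, int columnNumber) {
    if (jdbcType == Types.TIME) {
      return resultSet -> {
        Time sqlTime = resultSet.getTime(columnNumber);
        return resultSet.wasNull() ? null : sqlTime.toString();
      };
    } else if (jdbcType == Types.DATE) {
      return resultSet -> {
        Date sqlDate = resultSet.getDate(columnNumber);
        return resultSet.wasNull() ? null : sqlDate.toString();
      };
    } else if (jdbcType == Types.TIMESTAMP) {
      return resultSet -> {
        Timestamp sqlTimestamp = resultSet.getTimestamp(columnNumber);
        return resultSet.wasNull() ? null : sqlTimestamp.toInstant().toString();
      };
    }
    throw new IllegalArgumentException("Unsupported JDBC type: " + jdbcType);
  }
}
```

The three lambdas repeat the null handling, which is the duplication mentioned above, but each row then runs exactly one getter and one wasNull() call.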

> JDBCStream: support more data types
> ---
>
> Key: SOLR-10617
> URL: https://issues.apache.org/jira/browse/SOLR-10617
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Reporter: James Dyer
>Assignee: James Dyer
>Priority: Minor
> Attachments: SOLR-10617.patch
>
>
> It would be nice if JDBCStream could support Decimal types as well as 
> Timestamp-related types, and CLOBs.






[jira] [Comment Edited] (SOLR-10512) Innerjoin streaming expressions - Invalid JoinStream error

2017-04-27 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986078#comment-15986078
 ] 

Dennis Gove edited comment on SOLR-10512 at 4/27/17 6:10 AM:
-

This is clearly a bug as the left field in 
{code}
on="fieldA=fieldB"
{code}
is the field from the first stream and the right field is the field from the 
second stream.

I'll take a look.


was (Author: dpgove):
This is clearly a bug as the left field in 
{code}
on=fieldA=fieldB
{code}
is the field from the first stream and the right field is the field from the 
second stream.

I'll take a look.

> Innerjoin streaming expressions - Invalid JoinStream error
> --
>
> Key: SOLR-10512
> URL: https://issues.apache.org/jira/browse/SOLR-10512
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Affects Versions: 6.4.2, 6.5
> Environment: Debian Jessie
>Reporter: Dominique Béjean
>
> It looks like the innerJoin streaming expression does not work as explained in the 
> documentation. An invalid JoinStream error occurs.
> {noformat}
> curl --data-urlencode 'expr=innerJoin(
> search(books, 
>q="*:*", 
>fl="id", 
>sort="id asc"),
> search(reviews, 
>q="*:*", 
>fl="id_book_s", 
>sort="id_book_s asc"), 
> on="id=id_books_s"
> )' http://localhost:8983/solr/books/stream
>   
> {"result-set":{"docs":[{"EXCEPTION":"Invalid JoinStream - all incoming stream 
> comparators (sort) must be a superset of this stream's 
> equalitor.","EOF":true}]}}   
> {noformat}
> It is totally similar to the documentation example
> 
> {noformat}
> innerJoin(
>   search(people, q=*:*, fl="personId,name", sort="personId asc"),
>   search(pets, q=type:cat, fl="ownerId,petName", sort="ownerId asc"),
>   on="personId=ownerId"
> )
> {noformat}
> Queries on each collection give :
> {noformat}
> $ curl --data-urlencode 'expr=search(books, 
>q="*:*", 
>fl="id, title_s, pubyear_i", 
>sort="pubyear_i asc", 
>qt="/export")' 
> http://localhost:8983/solr/books/stream
> {
>   "result-set": {
> "docs": [
>   {
> "title_s": "Friends",
> "pubyear_i": 1994,
> "id": "book2"
>   },
>   {
> "title_s": "The Way of Kings",
> "pubyear_i": 2010,
> "id": "book1"
>   },
>   {
> "EOF": true,
> "RESPONSE_TIME": 16
>   }
> ]
>   }
> }
> $ curl --data-urlencode 'expr=search(reviews, 
>q="author_s:d*", 
>fl="id, id_book_s, stars_i, review_dt", 
>sort="id_book_s asc", 
>qt="/export")' 
> http://localhost:8983/solr/reviews/stream
>  
> {
>   "result-set": {
> "docs": [
>   {
> "stars_i": 3,
> "id": "book1_c2",
> "id_book_s": "book1",
> "review_dt": "2014-03-15T12:00:00Z"
>   },
>   {
> "stars_i": 4,
> "id": "book1_c3",
> "id_book_s": "book1",
> "review_dt": "2014-12-15T12:00:00Z"
>   },
>   {
> "stars_i": 3,
> "id": "book2_c2",
> "id_book_s": "book2",
> "review_dt": "1994-03-15T12:00:00Z"
>   },
>   {
> 

[jira] [Commented] (SOLR-10512) Innerjoin streaming expressions - Invalid JoinStream error

2017-04-27 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986078#comment-15986078
 ] 

Dennis Gove commented on SOLR-10512:


This is clearly a bug as the left field in 
{code}
on=fieldA=fieldB
{code}
is the field from the first stream and the right field is the field from the 
second stream.

I'll take a look.

> Innerjoin streaming expressions - Invalid JoinStream error
> --
>
> Key: SOLR-10512
> URL: https://issues.apache.org/jira/browse/SOLR-10512
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Affects Versions: 6.4.2, 6.5
> Environment: Debian Jessie
>Reporter: Dominique Béjean
>
> It looks like the innerJoin streaming expression does not work as explained in the 
> documentation. An invalid JoinStream error occurs.
> {noformat}
> curl --data-urlencode 'expr=innerJoin(
> search(books, 
>q="*:*", 
>fl="id", 
>sort="id asc"),
> search(reviews, 
>q="*:*", 
>fl="id_book_s", 
>sort="id_book_s asc"), 
> on="id=id_books_s"
> )' http://localhost:8983/solr/books/stream
>   
> {"result-set":{"docs":[{"EXCEPTION":"Invalid JoinStream - all incoming stream 
> comparators (sort) must be a superset of this stream's 
> equalitor.","EOF":true}]}}   
> {noformat}
> It is totally similar to the documentation example
> 
> {noformat}
> innerJoin(
>   search(people, q=*:*, fl="personId,name", sort="personId asc"),
>   search(pets, q=type:cat, fl="ownerId,petName", sort="ownerId asc"),
>   on="personId=ownerId"
> )
> {noformat}
> Queries on each collection give :
> {noformat}
> $ curl --data-urlencode 'expr=search(books, 
>q="*:*", 
>fl="id, title_s, pubyear_i", 
>sort="pubyear_i asc", 
>qt="/export")' 
> http://localhost:8983/solr/books/stream
> {
>   "result-set": {
> "docs": [
>   {
> "title_s": "Friends",
> "pubyear_i": 1994,
> "id": "book2"
>   },
>   {
> "title_s": "The Way of Kings",
> "pubyear_i": 2010,
> "id": "book1"
>   },
>   {
> "EOF": true,
> "RESPONSE_TIME": 16
>   }
> ]
>   }
> }
> $ curl --data-urlencode 'expr=search(reviews, 
>q="author_s:d*", 
>fl="id, id_book_s, stars_i, review_dt", 
>sort="id_book_s asc", 
>qt="/export")' 
> http://localhost:8983/solr/reviews/stream
>  
> {
>   "result-set": {
> "docs": [
>   {
> "stars_i": 3,
> "id": "book1_c2",
> "id_book_s": "book1",
> "review_dt": "2014-03-15T12:00:00Z"
>   },
>   {
> "stars_i": 4,
> "id": "book1_c3",
> "id_book_s": "book1",
> "review_dt": "2014-12-15T12:00:00Z"
>   },
>   {
> "stars_i": 3,
> "id": "book2_c2",
> "id_book_s": "book2",
> "review_dt": "1994-03-15T12:00:00Z"
>   },
>   {
> "stars_i": 4,
> "id": "book2_c3",
> "id_book_s": "book2",
> "review_dt": "1994-12-15T12:00:00Z"
>   },
>   {
> "EOF": true,
>   

[jira] [Commented] (SOLR-10486) Add Length Conversion Evaluators

2017-04-13 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15968413#comment-15968413
 ] 

Dennis Gove commented on SOLR-10486:


Aww, that's horrible. I can't believe I did that. Good catch!

> Add Length Conversion Evaluators
> 
>
> Key: SOLR-10486
> URL: https://issues.apache.org/jira/browse/SOLR-10486
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-10486.patch, SOLR-10486.patch
>
>
> Based on the work in SOLR-10485 I think it makes sense to add Conversion 
> evaluators. I think a good place to start is with the conversions listed here:
> https://www.learner.org/interactives/metric/conversion_chart.html#1
> This ticket will add length conversions and lay down the initial syntax.
> Sample syntax:
> {code}
> select(eval(), convert(inches, meters, 22) as meters)
> {code}
> This will return a single tuple with 22 inches converted to meters.






[jira] [Comment Edited] (SOLR-10486) Add Length Conversion Evaluators

2017-04-13 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15968331#comment-15968331
 ] 

Dennis Gove edited comment on SOLR-10486 at 4/13/17 10:27 PM:
--

The switch clauses in {code}evaluate(){code} result in the same code path 
for every tuple in the stream. It would be more efficient to construct a lambda 
with the appropriate code during construction of this class (or on open) and 
execute that in evaluate.

Also, the exception (for invalid conversion) would be thrown before reading a 
single tuple.
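
A small Java sketch of that shape, assuming string unit names and only two conversions. ConversionSketch and its methods are hypothetical and do not reflect the patch. The conversion is resolved once in the constructor, so an unsupported pair fails before any tuple is read, and evaluate only applies the stored function.

```java
import java.util.function.DoubleUnaryOperator;

public class ConversionSketch {

  private final DoubleUnaryOperator convert;

  // The switch runs once here rather than once per tuple.
  public ConversionSketch(String from, String to) {
    this.convert = resolve(from, to);
  }

  private static DoubleUnaryOperator resolve(String from, String to) {
    String key = from + "->" + to;
    switch (key) {
      case "inches->meters":
        return value -> value * 0.0254;
      case "meters->inches":
        return value -> value / 0.0254;
      default:
        throw new IllegalArgumentException("Unsupported conversion: " + key);
    }
  }

  // Per-tuple work is a single function call with no branching.
  public double evaluate(double value) {
    return convert.applyAsDouble(value);
  }
}
```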


was (Author: dpgove):
The switch clauses in {code}evaluate(){code} result in the same code path 
for every tuple in the stream. It would be more efficient to construct a lambda 
with the appropriate code during construction of this class (or on open) and 
execute that in evaluate.

> Add Length Conversion Evaluators
> 
>
> Key: SOLR-10486
> URL: https://issues.apache.org/jira/browse/SOLR-10486
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-10486.patch
>
>
> Based on the work in SOLR-10485 I think it makes sense to add Conversion 
> evaluators. I think a good place to start is with the conversions listed here:
> https://www.learner.org/interactives/metric/conversion_chart.html#1
> This ticket will add length conversions and lay down the initial syntax.
> Sample syntax:
> {code}
> select(eval(), convert(inches, meters, 22) as meters)
> {code}
> This will return a single tuple with 22 inches converted to meters.






[jira] [Commented] (SOLR-10486) Add Length Conversion Evaluators

2017-04-13 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15968331#comment-15968331
 ] 

Dennis Gove commented on SOLR-10486:


The switch clauses in {code}evaluate(){code} result in the same code path 
for every tuple in the stream. It would be more efficient to construct a lambda 
with the appropriate code during construction of this class (or on open) and 
execute that in evaluate.

> Add Length Conversion Evaluators
> 
>
> Key: SOLR-10486
> URL: https://issues.apache.org/jira/browse/SOLR-10486
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-10486.patch
>
>
> Based on the work in SOLR-10485 I think it makes sense to add Conversion 
> evaluators. I think a good place to start is with the conversions listed here:
> https://www.learner.org/interactives/metric/conversion_chart.html#1
> This ticket will add length conversions and lay down the initial syntax.
> Sample syntax:
> {code}
> select(eval(), convert(inches, meters, 22) as meters)
> {code}
> This will return a single tuple with 22 inches converted to meters.






[jira] [Commented] (SOLR-10303) Add date/time Stream Evaluators

2017-04-05 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957541#comment-15957541
 ] 

Dennis Gove commented on SOLR-10303:


[~covolution] I'm sorry. I forgot to submit the review on GitHub.

> Add date/time Stream Evaluators
> ---
>
> Key: SOLR-10303
> URL: https://issues.apache.org/jira/browse/SOLR-10303
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-10303.patch
>
>
> This ticket will add Stream Evaluators that extract date/time values from a 
> Solr date field. The following Evaluators will be supported:
> hour(date)
> minute(date)
> month(date)
> monthname(date)
> quarter(date)
> second(date)
> year(date)
> Syntax:
> {code}
> select(id,
>        year(recdate) as year,
>        month(recdate) as month,
>        day(recdate) as day,
>        search(logs, q="blah", fl="id, recdate", sort="recdate asc", qt="/export"))
> {code}
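
The evaluators named in the description above can be approximated with
java.time. This is only a sketch, assuming the field value is an ISO-8601
instant string in UTC (the form Solr date fields use); the class and method
names are hypothetical and are not taken from the patch.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical helpers mirroring a few of the proposed evaluators.
class DateEvaluatorSketch {

  // Parse a string such as "2017-04-05T12:34:56Z" and pin it to UTC.
  static ZonedDateTime parse(String solrDate) {
    return Instant.parse(solrDate).atZone(ZoneOffset.UTC);
  }

  static int year(String solrDate)   { return parse(solrDate).getYear(); }
  static int month(String solrDate)  { return parse(solrDate).getMonthValue(); }
  static int day(String solrDate)    { return parse(solrDate).getDayOfMonth(); }
  static int hour(String solrDate)   { return parse(solrDate).getHour(); }
  static int minute(String solrDate) { return parse(solrDate).getMinute(); }
  static int second(String solrDate) { return parse(solrDate).getSecond(); }

  // Quarter 1 covers months 1-3, quarter 2 covers months 4-6, and so on.
  static int quarter(String solrDate) {
    return (month(solrDate) - 1) / 3 + 1;
  }

  public static void main(String[] args) {
    String recdate = "2017-04-05T12:34:56Z";
    System.out.println(year(recdate) + " Q" + quarter(recdate));
  }
}
```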






[jira] [Commented] (SOLR-10303) Add date/time Stream Evaluators

2017-04-05 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956946#comment-15956946
 ] 

Dennis Gove commented on SOLR-10303:


In case you missed them, I added comments to the PR at 
https://github.com/apache/lucene-solr/pull/171

> Add date/time Stream Evaluators
> ---
>
> Key: SOLR-10303
> URL: https://issues.apache.org/jira/browse/SOLR-10303
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-10303.patch
>
>
> This ticket will add Stream Evaluators that extract date/time values from a 
> Solr date field. The following Evaluators will be supported:
> hour(date)
> minute(date)
> month(date)
> monthname(date)
> quarter(date)
> second(date)
> year(date)
> Syntax:
> {code}
> select(id,
>        year(recdate) as year,
>        month(recdate) as month,
>        day(recdate) as day,
>        search(logs, q="blah", fl="id, recdate", sort="recdate asc", qt="/export"))
> {code}






[jira] [Updated] (SOLR-10393) Add UUID Stream Evaluator

2017-04-01 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10393:
---
Attachment: SOLR-10393.patch

> Add UUID Stream Evaluator
> -
>
> Key: SOLR-10393
> URL: https://issues.apache.org/jira/browse/SOLR-10393
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>    Assignee: Dennis Gove
> Attachments: SOLR-10393.patch, SOLR-10393.patch
>
>
> The cartesianProduct function emits multiple tuples from a single tuple. To 
> save the cartesian product in another collection it would be useful to be 
> able to dynamically assign new unique ids to tuples. The uuid() stream 
> evaluator will allow us to do this.
> sample syntax:
> {code}
> cartesianProduct(expr, fielda, uuid() as id)
> {code} 
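
What "uuid() as id" does per tuple can be pictured with java.util.UUID. This is
a sketch under assumptions: tuples are modeled as plain maps, and the class and
method names are hypothetical, not taken from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical illustration of attaching a fresh unique id to each tuple.
class UuidEvaluatorSketch {

  // Returns a copy of the tuple whose "id" field holds a new random UUID.
  static Map<String, Object> withId(Map<String, Object> tuple) {
    Map<String, Object> copy = new LinkedHashMap<>(tuple);
    copy.put("id", UUID.randomUUID().toString());
    return copy;
  }

  public static void main(String[] args) {
    Map<String, Object> tuple = new LinkedHashMap<>();
    tuple.put("fielda", "value");
    System.out.println(withId(tuple));
  }
}
```

Because each call generates a new UUID, tuples emitted from the same source
tuple receive distinct ids and can be stored side by side in another
collection.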






[jira] [Commented] (SOLR-10351) Add analyze Stream Evaluator to support streaming NLP

2017-03-31 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951915#comment-15951915
 ] 

Dennis Gove commented on SOLR-10351:


That makes sense, and I agree it is probably worth including in the 
evaluators. In the most recent patch for SOLR-10356 (which I plan to commit to 
master and branch_6x tonight) I've refactored it slightly to move the 
implementation of 
{code}public void setStreamContext(StreamContext streamContext){code} 
back a level into ComplexEvaluator.

> Add analyze Stream Evaluator to support streaming NLP
> -
>
> Key: SOLR-10351
> URL: https://issues.apache.org/jira/browse/SOLR-10351
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>  Labels: NLP, Streaming
> Fix For: 6.6
>
> Attachments: SOLR-10351.patch, SOLR-10351.patch, SOLR-10351.patch, 
> SOLR-10351.patch
>
>
> The *analyze* Stream Evaluator uses a Solr analyzer to return a collection of 
> tokens from a *text field*. The collection of tokens can then be streamed out 
> by  the *cartesianProduct* Streaming Expression or attached to documents as 
> multi-valued fields by the *select* Streaming Expression.
> This allows Streaming Expressions to leverage all the existing tokenizers and 
> filters and provides a place for future NLP analyzers to be added to 
> Streaming Expressions.
> Sample syntax:
> {code}
> cartesianProduct(expr, analyze(analyzerField, textField) as outfield )
> {code}
> {code}
> select(expr, analyze(analyzerField, textField) as outfield )
> {code}
> Combined with Solr's batch text processing capabilities this provides an 
> entire parallel NLP framework. Solr's batch processing capabilities are 
> described here:
> *Batch jobs, Parallel ETL and Streaming Text Transformation*
> http://joelsolr.blogspot.com/2016/10/solr-63-batch-jobs-parallel-etl-and.html






[jira] [Updated] (SOLR-10393) Add UUID Stream Evaluator

2017-03-31 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10393:
---
Attachment: SOLR-10393.patch

Ready to go.

> Add UUID Stream Evaluator
> -
>
> Key: SOLR-10393
> URL: https://issues.apache.org/jira/browse/SOLR-10393
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>    Assignee: Dennis Gove
> Attachments: SOLR-10393.patch
>
>
> The cartesianProduct function emits multiple tuples from a single tuple. To 
> save the cartesian product in another collection it would be useful to be 
> able to dynamically assign new unique ids to tuples. The uuid() stream 
> evaluator will allow us to do this.
> sample syntax:
> {code}
> cartesianProduct(expr, fielda, uuid() as id)
> {code} 






[jira] [Updated] (SOLR-10356) Add Streaming Evaluators for basic math functions

2017-03-31 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10356:
---
Attachment: SOLR-10356.patch

Includes some refactoring required by the addition of StreamContext to the 
evaluators.

> Add Streaming Evaluators for basic math functions
> -
>
> Key: SOLR-10356
> URL: https://issues.apache.org/jira/browse/SOLR-10356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
>Priority: Minor
> Attachments: SOLR-10356.patch, SOLR-10356.patch, SOLR-10356.patch, 
> SOLR-10356.patch
>
>







[jira] [Commented] (SOLR-10351) Add analyze Stream Evaluator to support streaming NLP

2017-03-31 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951863#comment-15951863
 ] 

Dennis Gove commented on SOLR-10351:


What's the purpose of a StreamContext in the evaluators?

> Add analyze Stream Evaluator to support streaming NLP
> -
>
> Key: SOLR-10351
> URL: https://issues.apache.org/jira/browse/SOLR-10351
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>  Labels: NLP, Streaming
> Fix For: 6.6
>
> Attachments: SOLR-10351.patch, SOLR-10351.patch, SOLR-10351.patch, 
> SOLR-10351.patch
>
>
> The *analyze* Stream Evaluator uses a Solr analyzer to return a collection of 
> tokens from a *text field*. The collection of tokens can then be streamed out 
> by  the *cartesianProduct* Streaming Expression or attached to documents as 
> multi-valued fields by the *select* Streaming Expression.
> This allows Streaming Expressions to leverage all the existing tokenizers and 
> filters and provides a place for future NLP analyzers to be added to 
> Streaming Expressions.
> Sample syntax:
> {code}
> cartesianProduct(expr, analyze(analyzerField, textField) as outfield )
> {code}
> {code}
> select(expr, analyze(analyzerField, textField) as outfield )
> {code}
> Combined with Solr's batch text processing capabilities this provides an 
> entire parallel NLP framework. Solr's batch processing capabilities are 
> described here:
> *Batch jobs, Parallel ETL and Streaming Text Transformation*
> http://joelsolr.blogspot.com/2016/10/solr-63-batch-jobs-parallel-etl-and.html






[jira] [Assigned] (SOLR-10393) Add UUID Stream Evaluator

2017-03-31 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reassigned SOLR-10393:
--

Assignee: Dennis Gove

> Add UUID Stream Evaluator
> -
>
> Key: SOLR-10393
> URL: https://issues.apache.org/jira/browse/SOLR-10393
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>    Assignee: Dennis Gove
>
> The cartesianProduct function emits multiple tuples from a single tuple. To 
> save the cartesian product in another collection it would be useful to be 
> able to dynamically assign new unique ids to tuples. The uuid() stream 
> evaluator will allow us to do this.
> sample syntax:
> {code}
> cartesianProduct(expr, fielda, uuid() as id)
> {code} 






[jira] [Updated] (SOLR-10356) Add Streaming Evaluators for basic math functions

2017-03-25 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10356:
---
Attachment: SOLR-10356.patch

Includes everything discussed so far (except round(a,b)), including tests.

> Add Streaming Evaluators for basic math functions
> -
>
> Key: SOLR-10356
> URL: https://issues.apache.org/jira/browse/SOLR-10356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
>Priority: Minor
> Attachments: SOLR-10356.patch, SOLR-10356.patch, SOLR-10356.patch
>
>







[jira] [Updated] (SOLR-10356) Add Streaming Evaluators for basic math functions

2017-03-25 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10356:
---
Attachment: SOLR-10356.patch

Adds 
{code}
+  .withFunctionName("pow", PowerEvaluator.class)
+  .withFunctionName("mod", ModuloEvaluator.class)
+  .withFunctionName("ceil", CeilingEvaluator.class)
+  .withFunctionName("floor", FloorEvaluator.class)
+  .withFunctionName("sin", SineEvaluator.class)
+  .withFunctionName("asin", ArcSineEvaluator.class)
+  .withFunctionName("sinh", HyperbolicSineEvaluator.class)
+  .withFunctionName("cos", CosineEvaluator.class)
+  .withFunctionName("acos", ArcCosineEvaluator.class)
+  .withFunctionName("cosh", HyperbolicCosineEvaluator.class)
+  .withFunctionName("tan", TangentEvaluator.class)
+  .withFunctionName("atan", ArcTangentEvaluator.class)
+  .withFunctionName("tanh", HyperbolicTangentEvaluator.class)
{code}

> Add Streaming Evaluators for basic math functions
> -
>
> Key: SOLR-10356
> URL: https://issues.apache.org/jira/browse/SOLR-10356
>     Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Minor
> Attachments: SOLR-10356.patch, SOLR-10356.patch
>
>







[jira] [Commented] (SOLR-10356) Add Streaming Evaluators for basic math functions

2017-03-24 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940294#comment-15940294
 ] 

Dennis Gove commented on SOLR-10356:


I haven't confirmed, but I think BigDecimal can accomplish this by setting the 
scale:
{code}
a.setScale(b, RoundingMode.HALF_UP)
{code}
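
A fuller sketch of that idea, assuming round(a,b) means "round a to b decimal
places". The class and method names are hypothetical; BigDecimal.valueOf,
setScale, and RoundingMode.HALF_UP are standard java.math.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper showing round(a, b) via BigDecimal's scale.
class RoundSketch {

  // Rounds a to b decimal places, with ties rounding away from zero.
  // BigDecimal.valueOf throws NumberFormatException for NaN or infinite
  // input, so a real evaluator would need to handle those separately.
  static double round(double a, int b) {
    return BigDecimal.valueOf(a)
        .setScale(b, RoundingMode.HALF_UP)
        .doubleValue();
  }

  public static void main(String[] args) {
    System.out.println(round(3.14159, 2));
  }
}
```

BigDecimal.valueOf(double) goes through the double's canonical string form, so
a value written as 2.345 is rounded as the decimal 2.345 rather than as its
binary approximation. A negative b rounds to the left of the decimal point.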

> Add Streaming Evaluators for basic math functions
> -
>
> Key: SOLR-10356
> URL: https://issues.apache.org/jira/browse/SOLR-10356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
>Priority: Minor
> Attachments: SOLR-10356.patch
>
>







[jira] [Commented] (SOLR-10356) Add Streaming Evaluators for basic math functions

2017-03-24 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940215#comment-15940215
 ] 

Dennis Gove commented on SOLR-10356:


Yeah, intended for streaming expressions. So far we have the following 

{code}
abs(a) // |a|
add(a,b,...,z) // a + b + ... + z
div(a,b) // a/b
mult(a,b,...,z) // a * b * ... * z
sub(a,b,...,z) // a - b - ... - z
log(a) // natural log
pow(a,b) // a^b
mod(a,b) // a % b
ceil(a) // ceiling of a
floor(a) // floor of a
{code}

I'll add these:
{code}
coalesce(a,b,...,z) // not actually specific to math; we can coalesce on any value type
round(a)
sqrt(a)
cbrt(a) // cube root
sin(a) // sine of a
sinh(a) // hyperbolic sine of a
asin(a) // arc sine of a
cos(a) // cosine of a
cosh(a) // hyperbolic cosine of a
acos(a) // arc cosine of a
tan(a) // tangent of a
tanh(a) // hyperbolic tangent of a
atan(a) // arc tangent of a
{code}

What were you thinking for round(a,b)?
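
The variadic forms in the first list above can be read as left-to-right folds.
This is a minimal sketch of those semantics only, going by the inline comments
(a + b + ... + z, a * b * ... * z, a - b - ... - z); the class and method names
are hypothetical and say nothing about how the evaluators are implemented.

```java
import java.util.Arrays;

// Hypothetical illustration of the variadic add/mult/sub semantics.
class VariadicMathSketch {

  // add(a,b,...,z) = a + b + ... + z
  static double add(double... values) {
    return Arrays.stream(values).sum();
  }

  // mult(a,b,...,z) = a * b * ... * z
  static double mult(double... values) {
    return Arrays.stream(values).reduce(1.0, (x, y) -> x * y);
  }

  // sub(a,b,...,z) = a - b - ... - z, applied left to right
  static double sub(double first, double... rest) {
    double result = first;
    for (double value : rest) {
      result -= value;
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(sub(10, 3, 2));
  }
}
```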

> Add Streaming Evaluators for basic math functions
> -
>
> Key: SOLR-10356
> URL: https://issues.apache.org/jira/browse/SOLR-10356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
>Priority: Minor
> Attachments: SOLR-10356.patch
>
>







[jira] [Updated] (SOLR-10356) Add Streaming Evaluators for basic math functions

2017-03-24 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10356:
---
Summary: Add Streaming Evaluators for basic math functions  (was: Add 
evaluators for basic math functions)

> Add Streaming Evaluators for basic math functions
> -
>
> Key: SOLR-10356
> URL: https://issues.apache.org/jira/browse/SOLR-10356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
>Priority: Minor
> Attachments: SOLR-10356.patch
>
>







[jira] [Updated] (SOLR-10356) Add evaluators for basic math functions

2017-03-23 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10356:
---
Summary: Add evaluators for basic math functions  (was: Add evaluators for 
basic math function)

> Add evaluators for basic math functions
> ---
>
> Key: SOLR-10356
> URL: https://issues.apache.org/jira/browse/SOLR-10356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
>Priority: Minor
>







[jira] [Updated] (SOLR-10356) Add evaluators for basic math functions

2017-03-23 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10356:
---
Attachment: SOLR-10356.patch

> Add evaluators for basic math functions
> ---
>
> Key: SOLR-10356
> URL: https://issues.apache.org/jira/browse/SOLR-10356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
>Priority: Minor
> Attachments: SOLR-10356.patch
>
>







[jira] [Updated] (SOLR-10356) Add evaluators for basic math function

2017-03-23 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10356:
---
Summary: Add evaluators for basic math function  (was: Add PowerEvaluator 
to support stream evaluator pow(value,exponent))

> Add evaluators for basic math function
> --
>
> Key: SOLR-10356
> URL: https://issues.apache.org/jira/browse/SOLR-10356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
>Priority: Minor
>







[jira] [Commented] (SOLR-10356) Add evaluators for basic math functions

2017-03-23 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939496#comment-15939496
 ] 

Dennis Gove commented on SOLR-10356:


I'm going to keep this open for a week or so. If you have other basic math 
evaluators you'd like to see added, please let me know and I'll include them 
here.

> Add evaluators for basic math functions
> ---
>
> Key: SOLR-10356
> URL: https://issues.apache.org/jira/browse/SOLR-10356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
>Priority: Minor
> Attachments: SOLR-10356.patch
>
>







[jira] [Commented] (SOLR-10356) Add evaluators for basic math function

2017-03-23 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939495#comment-15939495
 ] 

Dennis Gove commented on SOLR-10356:


{code}
pow(a,b) // a^b
mod(a,b) // a % b
ceil(a) // Math.ceil(a)
floor(a) // Math.floor(a)
{code}

> Add evaluators for basic math function
> --
>
> Key: SOLR-10356
> URL: https://issues.apache.org/jira/browse/SOLR-10356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>    Reporter: Dennis Gove
>    Assignee: Dennis Gove
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-10356) Add PowerEvaluator to support stream evaluator pow(value,exponent)

2017-03-23 Thread Dennis Gove (JIRA)
Dennis Gove created SOLR-10356:
--

 Summary: Add PowerEvaluator to support stream evaluator 
pow(value,exponent)
 Key: SOLR-10356
 URL: https://issues.apache.org/jira/browse/SOLR-10356
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Dennis Gove
Assignee: Dennis Gove
Priority: Minor









[jira] [Commented] (SOLR-10303) Add date/time Stream Evaluators

2017-03-19 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932130#comment-15932130
 ] 

Dennis Gove commented on SOLR-10303:


For the function names, any concern with using camel casing, as in dayOfYear 
and dayOfMonth? That would be consistent with other multi-word function names 
in streaming.

> Add date/time Stream Evaluators
> ---
>
> Key: SOLR-10303
> URL: https://issues.apache.org/jira/browse/SOLR-10303
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-10303.patch
>
>
> This ticket will add Stream Evaluators that extract date/time values from a 
> Solr date field. The following Evaluators will be supported:
> hour(date)
> minute(date)
> month(date)
> monthname(date)
> quarter(date)
> second(date)
> year(date)
> Syntax:
> {code}
> select(id,
>        year(recdate) as year,
>        month(recdate) as month,
>        day(recdate) as day,
>        search(logs, q="blah", fl="id, recdate", sort="recdate asc", qt="/export"))
> {code}






[jira] [Updated] (SOLR-10292) Add cartesian Streaming Expression to build cartesian products from multi-value fields and text fields

2017-03-19 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-10292:
---
Attachment: SOLR-10292.patch

I think this is ready to go. I've decided to be explicit and register it under 
the function name 'cartesianProduct'.

Full suite of tests and precommit pass.

> Add cartesian Streaming Expression to build cartesian products from 
> multi-value fields and text fields
> --
>
> Key: SOLR-10292
> URL: https://issues.apache.org/jira/browse/SOLR-10292
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-10292.patch, SOLR-10292.patch, SOLR-10292.patch, 
> SOLR-10292.patch, SOLR-10292.patch
>
>
> Currently all the Streaming Expressions, such as rollups, intersections, 
> fetch, etc., work on single-valued fields. The *cartesian* expression would 
> create a stream of tuples from a single tuple with a multi-valued field. This 
> would allow multi-valued fields to be operated on by the wider library of 
> Streaming Expressions.
> For example, a single tuple with a multi-valued field:
> id: 1
> author: [Jim, Jack, Steve]
> would be transformed into the following three tuples:
> id:1
> author:Jim
> id:1
> author:Jack
> id:1
> author:Steve






[jira] [Comment Edited] (SOLR-10292) Add cartesian Streaming Expression to build cartesian products from multi-value fields and text fields

2017-03-19 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931943#comment-15931943
 ] 

Dennis Gove edited comment on SOLR-10292 at 3/19/17 8:31 PM:
-

Tests added and passing.

This does not add any additional evaluators. I think those can be added in 
other tickets. All evaluators are supported by this stream, so anything you 
think to add (regex matching, sentence creation, etc.) will work. The stream 
works with both multi-valued and single-valued fields, insofar as it treats a 
single-valued field as a collection with a single item.
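
The single-valued handling described above can be sketched roughly as follows.
This is an illustration under assumptions, not the stream's actual code:
tuples are modeled as plain maps, only one field is expanded, and the class
and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of expanding one field into multiple tuples.
class CartesianProductSketch {

  // Emits one copy of the tuple per value of the given field. A value that
  // is not a collection is wrapped in a single-item list, so a single-valued
  // field simply yields one tuple. (A missing field is treated the same way
  // here and yields one tuple with a null value; that is a simplification.)
  static List<Map<String, Object>> expand(Map<String, Object> tuple, String field) {
    Object raw = tuple.get(field);
    Collection<?> values = (raw instanceof Collection)
        ? (Collection<?>) raw
        : Collections.singletonList(raw);

    List<Map<String, Object>> out = new ArrayList<>();
    for (Object value : values) {
      Map<String, Object> copy = new LinkedHashMap<>(tuple);
      copy.put(field, value);
      out.add(copy);
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Object> tuple = new LinkedHashMap<>();
    tuple.put("id", 1);
    tuple.put("author", java.util.Arrays.asList("Jim", "Jack", "Steve"));
    for (Map<String, Object> t : expand(tuple, "author")) {
      System.out.println(t);
    }
  }
}
```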


was (Author: dpgove):
Tests added and passing.

This does not add any additional evaluators. I think those can be added in 
other tickets. All evaluators are supported by this stream so anything you 
think to add (regex matching, sentence creation, etc...) will work. The stream 
works with both multi-valued and sing-valued fields in so much that it will 
treat single-valued fields as a collection with a single item.

> Add cartesian Streaming Expression to build cartesian products from 
> multi-value fields and text fields
> --
>
> Key: SOLR-10292
> URL: https://issues.apache.org/jira/browse/SOLR-10292
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
> Attachments: SOLR-10292.patch, SOLR-10292.patch, SOLR-10292.patch, 
> SOLR-10292.patch
>
>
> Currently all the Streaming Expressions, such as rollups, intersections, 
> fetch, etc., work on single-valued fields. The *cartesian* expression would 
> create a stream of tuples from a single tuple with a multi-valued field. This 
> would allow multi-valued fields to be operated on by the wider library of 
> Streaming Expressions.
> For example, a single tuple with a multi-valued field:
> id: 1
> author: [Jim, Jack, Steve]
> would be transformed into the following three tuples:
> id:1
> author:Jim
> id:1
> author:Jack
> id:1
> author:Steve





