Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-23 Thread Joe Lawson
FYI everyone, I've updated the README.md to be fully up to date for Solr
6.0 and the latest plugin release.
https://github.com/healthonnet/hon-lucene-synonyms/blob/master/README.md

On Fri, Jun 17, 2016 at 2:34 PM, MaryJo Sminkey  wrote:

> > OK - Slapping forehead now... D'oh!
> >
> > 1.2
> > Float, not int!
> >
>
>
> LOL, we've all been there. I'm surprised I didn't notice that myself.
>
> MJ
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-17 Thread MaryJo Sminkey
> OK - Slapping forehead now... D'oh!
>
> 1.2
> Float, not int!
>


LOL, we've all been there. I'm surprised I didn't notice that myself.

MJ
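
For readers following along: the archive stripped the XML markup from these
messages, so only the value 1.2 survives. The sketch below assumes the failing
line declared synonyms.originalBoost with an <int> element, which would fit
"Float, not int!". It shows a small Python check that flags such entries. The
element names, the handler defaults, and the helper find_bad_ints are
illustrative assumptions, not text recovered from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the handler defaults. The element and
# parameter layout is an assumption; the archive lost the original markup.
DEFAULTS_XML = """
<lst name="defaults">
  <str name="defType">synonym_edismax</str>
  <bool name="synonyms">true</bool>
  <int name="synonyms.originalBoost">1.2</int>
</lst>
"""


def find_bad_ints(xml_text):
    """Return the names of <int> entries whose value is not a whole number."""
    bad = []
    for elem in ET.fromstring(xml_text.strip()).iter("int"):
        try:
            int((elem.text or "").strip())
        except ValueError:
            bad.append(elem.get("name"))
    return bad


print(find_bad_ints(DEFAULTS_XML))
```

Under these assumptions, declaring the boost with a <float> element instead of
<int> is the change that makes the check come back empty.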


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-17 Thread John Bickerstaff
OK - Slapping forehead now... D'oh!

1.2
Float, not int!

On Fri, Jun 17, 2016 at 2:15 PM, John Bickerstaff wrote:

> Hi all -
>
> I've successfully run the hon-lucene-synonyms plugin from the Admin
> console by adding the following to the Raw Query Parameters field...
>
>
> =text=synonym_edismax=true=1.2=1.1
>
> I got those from the Read Me on the github account.
>
> Now I'm trying to make this work via a requestHandler in solrconfig.xml.
>
> I think the following should work, but it just hangs if I add the last
> line referencing synonyms.originalBoost
>
> 
> 
>  
>explicit
>10
>synonym_edismax
>text
>true
>1.2 --> If I add this
> line, the admin console just hangs when I hit /test1
>  
>  
>
> If I do NOT add the last line and only have the line that sets
> synonyms=true, it appears to work fine.
>
> I see the dot notation all over the sample entries in solrconfig.xml...
> Am I missing something here?
>
> Essentially, how do I get these variables set correctly from inside a
> requestHandler configured in the solrconfig.xml file?
>
> On Tue, Jun 7, 2016 at 11:47 AM, Joe Lawson <
> jlaw...@opensourceconnections.com> wrote:
>
>> MaryJo you might want to start a new thread, I think we kinda hijacked
>> this
>> one. Also if you are interested in tuning queries check out
>> http://splainer.io/ and https://www.quepid.com which are interactive
>> tools
>> (both of which my company makes) to tune for search relevancy.
>>
>> On Tue, Jun 7, 2016 at 1:45 PM, MaryJo Sminkey 
>> wrote:
>>
>> > I'm really thinking this just might not be the right tool for us, what
>> we
>> > really need is a solution that works like the normal synonym filter
>> does,
>> > just with proper multi-term support, so I can apply the synonyms only on
>> > certain fields (copied fields) that have their own, lower boost
>> settings.
>> > The way this plugin works across the entire query just seems too
>> > problematic when you need to do complex queries with lots of different
>> > boost settings to get good relevancy. Anyone used a different method of
>> > handling multi-term synonyms that isn't as global?
>> >
>> > Mary Jo
>> >
>> >
>> >
>> > On Tue, Jun 7, 2016 at 1:31 PM, MaryJo Sminkey 
>> > wrote:
>> >
>> > > Here's the issue I am still having with getting the right search
>> > relevancy
>> > > with the synonym plugin in place. We typically have users searching on
>> > > multiple terms, and we want matches across multiple terms,
>> particularly
>> > > those that appear as phrases, to appear higher than matches for the
>> same
>> > > term multiple times. The synonym filter makes this complicated since
>> we
>> > may
>> > > have cases where the term the user enters, like "sbc", maps to a
>> > multi-term
>> > > synonym like "small block", and we always want the matches for the
>> > original
>> > > term to pop up first, so I'm trying to make sure the original boost is
>> > high
>> > > enough to override a phrase boost that the multi-term synonym would
>> give.
>> > > Unfortunately this then means matches on the same term multiple times
>> get
>> > > pushed up over my phrase matches...those aren't going to be the most
>> > > relevant matches. Not sure there's a way to solve this successfully,
>> > > without a completely different approach to the synonyms... or not
>> > counting
>> > > the number of matches on terms (I assume you can drop that ability,
>> > > although that's not ideal either...just better than what I have now).
>> > >
>> > > MJ
>> > >
>> > >
>> > >
>> > >
>> > > On Mon, Jun 6, 2016 at 9:39 PM, MaryJo Sminkey 
>> > > wrote:
>> > >
>> > >>
>> > >> On Mon, Jun 6, 2016 at 7:36 PM, Joe Lawson <
>> > >> jlaw...@opensourceconnections.com> wrote:
>> > >>
>> > >>>
>> > >>> We were thinking, as you experimented with, that the 0.5 and 2.0
>> boosts
>> > >>> were no match for the product name and keyword field boosts so that
>> > would
>> > >>> influence your search as well.
>> > >>
>> > >>
>> > >>
>> > >> Yeah I definitely will have to play with the values a bit as we want
>> the
>> > >> product name matches to always appear highest, whether original or
>> > >> synonyms, but I'll have to figure out how to get that result without
>> one
>> > >> word terms that have multi word synonyms getting overly boosted for a
>> > >> phrase match while still sufficiently boosting the normal phrase
>> > match
>> > >> stuff too. With the normal synonym filter I was able to just copy
>> fields
>> > >> that could have synonyms to a new field (which would be the only one
>> > with
>> > >> the synonym filter), and use a different, lower boost on those
>> fields,
>> > but
>> > >> that won't work with this plugin which applies across everything in
>> the
>> > >> query. Makes it a bit more complicated to get everything just right.
>> > >>
>> > >> MJ
>> > >>
>> > >>
>> > >> Sent 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-17 Thread MaryJo Sminkey
On Fri, Jun 17, 2016 at 2:15 PM, John Bickerstaff 
wrote:

> If I do NOT add the last line and only have the line that sets
> synonyms=true, it appears to work fine.
>
> I see the dot notation all over the sample entries in solrconfig.xml...  Am
> I missing something here?
>
> Essentially, how do I get these variables set correctly from inside a
> requestHandler configured in the solrconfig.xml file?
>


I know I didn't have any issues using those boosts but I was sending them
on the query string (or otherwise as part of my query request), rather than
setting them in the config. You might try that to see if it makes a
difference.

Mary Jo


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-17 Thread John Bickerstaff
Hi all -

I've successfully run the hon-lucene-synonyms plugin from the Admin console
by adding the following to the Raw Query Parameters field...

=text=synonym_edismax=true=1.2=1.1

I got those from the README in the GitHub repo.

Now I'm trying to make this work via a requestHandler in solrconfig.xml.

I think the following should work, but it just hangs if I add the last line
referencing synonyms.originalBoost



 
   explicit
   10
   synonym_edismax
   text
   true
   1.2 --> If I add this line,
the admin console just hangs when I hit /test1
 
 

If I do NOT add the last line and only have the line that sets
synonyms=true, it appears to work fine.

I see the dot notation all over the sample entries in solrconfig.xml...  Am
I missing something here?

Essentially, how do I get these variables set correctly from inside a
requestHandler configured in the solrconfig.xml file?
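
A note on the parameter string above: the archive dropped the parameter names,
leaving only the values. The sketch below rebuilds a plausible query string in
Python. The names defType, synonyms, synonyms.originalBoost and
synonyms.synonymBoost come from the parameter list quoted elsewhere in this
thread; using qf for the value "text" is a guess (it could equally have been
df), so treat that key as an assumption.

```python
from urllib.parse import urlencode

# Values are the ones that survive in the message; the "qf" key is assumed.
params = {
    "qf": "text",
    "defType": "synonym_edismax",
    "synonyms": "true",
    "synonyms.originalBoost": 1.2,
    "synonyms.synonymBoost": 1.1,
}

# urlencode keeps insertion order and leaves the dots in the names untouched.
query_string = urlencode(params)
print(query_string)
```

The same key/value pairs are what a requestHandler's defaults section would
need to carry, each with an element type that matches its value.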

On Tue, Jun 7, 2016 at 11:47 AM, Joe Lawson <
jlaw...@opensourceconnections.com> wrote:

> MaryJo you might want to start a new thread, I think we kinda hijacked this
> one. Also if you are interested in tuning queries check out
> http://splainer.io/ and https://www.quepid.com which are interactive tools
> (both of which my company makes) to tune for search relevancy.
>
> On Tue, Jun 7, 2016 at 1:45 PM, MaryJo Sminkey 
> wrote:
>
> > I'm really thinking this just might not be the right tool for us, what we
> > really need is a solution that works like the normal synonym filter does,
> > just with proper multi-term support, so I can apply the synonyms only on
> > certain fields (copied fields) that have their own, lower boost settings.
> > The way this plugin works across the entire query just seems too
> > problematic when you need to do complex queries with lots of different
> > boost settings to get good relevancy. Anyone used a different method of
> > handling multi-term synonyms that isn't as global?
> >
> > Mary Jo
> >
> >
> >
> > On Tue, Jun 7, 2016 at 1:31 PM, MaryJo Sminkey 
> > wrote:
> >
> > > Here's the issue I am still having with getting the right search
> > relevancy
> > > with the synonym plugin in place. We typically have users searching on
> > > multiple terms, and we want matches across multiple terms, particularly
> > > those that appear as phrases, to appear higher than matches for the
> same
> > > term multiple times. The synonym filter makes this complicated since we
> > may
> > > have cases where the term the user enters, like "sbc", maps to a
> > multi-term
> > > synonym like "small block", and we always want the matches for the
> > original
> > > term to pop up first, so I'm trying to make sure the original boost is
> > high
> > > enough to override a phrase boost that the multi-term synonym would
> give.
> > > Unfortunately this then means matches on the same term multiple times
> get
> > > pushed up over my phrase matches...those aren't going to be the most
> > > relevant matches. Not sure there's a way to solve this successfully,
> > > without a completely different approach to the synonyms... or not
> > counting
> > > the number of matches on terms (I assume you can drop that ability,
> > > although that's not ideal either...just better than what I have now).
> > >
> > > MJ
> > >
> > >
> > >
> > >
> > > On Mon, Jun 6, 2016 at 9:39 PM, MaryJo Sminkey 
> > > wrote:
> > >
> > >>
> > >> On Mon, Jun 6, 2016 at 7:36 PM, Joe Lawson <
> > >> jlaw...@opensourceconnections.com> wrote:
> > >>
> > >>>
> > >>> We were thinking, as you experimented with, that the 0.5 and 2.0
> boosts
> > >>> were no match for the product name and keyword field boosts so that
> > would
> > >>> influence your search as well.
> > >>
> > >>
> > >>
> > >> Yeah I definitely will have to play with the values a bit as we want
> the
> > >> product name matches to always appear highest, whether original or
> > >> synonyms, but I'll have to figure out how to get that result without
> one
> > >> word terms that have multi word synonyms getting overly boosted for a
> > >> phrase match while still sufficiently boosting the normal phrase
> > match
> > >> stuff too. With the normal synonym filter I was able to just copy
> fields
> > >> that could have synonyms to a new field (which would be the only one
> > with
> > >> the synonym filter), and use a different, lower boost on those fields,
> > but
> > >> that won't work with this plugin which applies across everything in
> the
> > >> query. Makes it a bit more complicated to get everything just right.
> > >>
> > >> MJ
> > >>
> > >>
> > >>
> > >
> > >
> >
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-07 Thread Joe Lawson
MaryJo you might want to start a new thread, I think we kinda hijacked this
one. Also if you are interested in tuning queries check out
http://splainer.io/ and https://www.quepid.com which are interactive tools
(both of which my company makes) to tune for search relevancy.

On Tue, Jun 7, 2016 at 1:45 PM, MaryJo Sminkey  wrote:

> I'm really thinking this just might not be the right tool for us, what we
> really need is a solution that works like the normal synonym filter does,
> just with proper multi-term support, so I can apply the synonyms only on
> certain fields (copied fields) that have their own, lower boost settings.
> The way this plugin works across the entire query just seems too
> problematic when you need to do complex queries with lots of different
> boost settings to get good relevancy. Anyone used a different method of
> handling multi-term synonyms that isn't as global?
>
> Mary Jo
>
>
>
> On Tue, Jun 7, 2016 at 1:31 PM, MaryJo Sminkey 
> wrote:
>
> > Here's the issue I am still having with getting the right search
> relevancy
> > with the synonym plugin in place. We typically have users searching on
> > multiple terms, and we want matches across multiple terms, particularly
> > those that appear as phrases, to appear higher than matches for the same
> > term multiple times. The synonym filter makes this complicated since we
> may
> > have cases where the term the user enters, like "sbc", maps to a
> multi-term
> > synonym like "small block", and we always want the matches for the
> original
> > term to pop up first, so I'm trying to make sure the original boost is
> high
> > enough to override a phrase boost that the multi-term synonym would give.
> > Unfortunately this then means matches on the same term multiple times get
> > pushed up over my phrase matches...those aren't going to be the most
> > relevant matches. Not sure there's a way to solve this successfully,
> > without a completely different approach to the synonyms... or not
> counting
> > the number of matches on terms (I assume you can drop that ability,
> > although that's not ideal either...just better than what I have now).
> >
> > MJ
> >
> >
> >
> >
> > On Mon, Jun 6, 2016 at 9:39 PM, MaryJo Sminkey 
> > wrote:
> >
> >>
> >> On Mon, Jun 6, 2016 at 7:36 PM, Joe Lawson <
> >> jlaw...@opensourceconnections.com> wrote:
> >>
> >>>
> >>> We were thinking, as you experimented with, that the 0.5 and 2.0 boosts
> >>> were no match for the product name and keyword field boosts so that
> would
> >>> influence your search as well.
> >>
> >>
> >>
> >> Yeah I definitely will have to play with the values a bit as we want the
> >> product name matches to always appear highest, whether original or
> >> synonyms, but I'll have to figure out how to get that result without one
> >> word terms that have multi word synonyms getting overly boosted for a
> >> phrase match while still sufficiently boosting the normal phrase
> match
> >> stuff too. With the normal synonym filter I was able to just copy fields
> >> that could have synonyms to a new field (which would be the only one
> with
> >> the synonym filter), and use a different, lower boost on those fields,
> but
> >> that won't work with this plugin which applies across everything in the
> >> query. Makes it a bit more complicated to get everything just right.
> >>
> >> MJ
> >>
> >>
> >>
> >
> >
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-07 Thread MaryJo Sminkey
I'm really thinking this just might not be the right tool for us, what we
really need is a solution that works like the normal synonym filter does,
just with proper multi-term support, so I can apply the synonyms only on
certain fields (copied fields) that have their own, lower boost settings.
The way this plugin works across the entire query just seems too
problematic when you need to do complex queries with lots of different
boost settings to get good relevancy. Anyone used a different method of
handling multi-term synonyms that isn't as global?

Mary Jo



On Tue, Jun 7, 2016 at 1:31 PM, MaryJo Sminkey  wrote:

> Here's the issue I am still having with getting the right search relevancy
> with the synonym plugin in place. We typically have users searching on
> multiple terms, and we want matches across multiple terms, particularly
> those that appear as phrases, to appear higher than matches for the same
> term multiple times. The synonym filter makes this complicated since we may
> have cases where the term the user enters, like "sbc", maps to a multi-term
> synonym like "small block", and we always want the matches for the original
> term to pop up first, so I'm trying to make sure the original boost is high
> enough to override a phrase boost that the multi-term synonym would give.
> Unfortunately this then means matches on the same term multiple times get
> pushed up over my phrase matches...those aren't going to be the most
> relevant matches. Not sure there's a way to solve this successfully,
> without a completely different approach to the synonyms... or not counting
> the number of matches on terms (I assume you can drop that ability,
> although that's not ideal either...just better than what I have now).
>
> MJ
>
>
>
>
> On Mon, Jun 6, 2016 at 9:39 PM, MaryJo Sminkey 
> wrote:
>
>>
>> On Mon, Jun 6, 2016 at 7:36 PM, Joe Lawson <
>> jlaw...@opensourceconnections.com> wrote:
>>
>>>
>>> We were thinking, as you experimented with, that the 0.5 and 2.0 boosts
>>> were no match for the product name and keyword field boosts so that would
>>> influence your search as well.
>>
>>
>>
>> Yeah I definitely will have to play with the values a bit as we want the
>> product name matches to always appear highest, whether original or
>> synonyms, but I'll have to figure out how to get that result without one
>> word terms that have multi word synonyms getting overly boosted for a
>> phrase match while still sufficiently boosting the normal phrase match
>> stuff too. With the normal synonym filter I was able to just copy fields
>> that could have synonyms to a new field (which would be the only one with
>> the synonym filter), and use a different, lower boost on those fields, but
>> that won't work with this plugin which applies across everything in the
>> query. Makes it a bit more complicated to get everything just right.
>>
>> MJ
>>
>>
>>
>
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-07 Thread MaryJo Sminkey
Here's the issue I am still having with getting the right search relevancy
with the synonym plugin in place. We typically have users searching on
multiple terms, and we want matches across multiple terms, particularly
those that appear as phrases, to appear higher than matches for the same
term multiple times. The synonym filter makes this complicated since we may
have cases where the term the user enters, like "sbc", maps to a multi-term
synonym like "small block", and we always want the matches for the original
term to pop up first, so I'm trying to make sure the original boost is high
enough to override a phrase boost that the multi-term synonym would give.
Unfortunately this then means matches on the same term multiple times get
pushed up over my phrase matches...those aren't going to be the most
relevant matches. Not sure there's a way to solve this successfully,
without a completely different approach to the synonyms... or not counting
the number of matches on terms (I assume you can drop that ability,
although that's not ideal either...just better than what I have now).

MJ




On Mon, Jun 6, 2016 at 9:39 PM, MaryJo Sminkey  wrote:

>
> On Mon, Jun 6, 2016 at 7:36 PM, Joe Lawson <
> jlaw...@opensourceconnections.com> wrote:
>
>>
>> We were thinking, as you experimented with, that the 0.5 and 2.0 boosts
>> were no match for the product name and keyword field boosts so that would
>> influence your search as well.
>
>
>
> Yeah I definitely will have to play with the values a bit as we want the
> product name matches to always appear highest, whether original or
> synonyms, but I'll have to figure out how to get that result without one
> word terms that have multi word synonyms getting overly boosted for a
> phrase match while still sufficiently boosting the normal phrase match
> stuff too. With the normal synonym filter I was able to just copy fields
> that could have synonyms to a new field (which would be the only one with
> the synonym filter), and use a different, lower boost on those fields, but
> that won't work with this plugin which applies across everything in the
> query. Makes it a bit more complicated to get everything just right.
>
> MJ
>
>
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-06 Thread MaryJo Sminkey
On Mon, Jun 6, 2016 at 7:36 PM, Joe Lawson <
jlaw...@opensourceconnections.com> wrote:

>
> We were thinking, as you experimented with, that the 0.5 and 2.0 boosts
> were no match for the product name and keyword field boosts so that would
> influence your search as well.



Yeah I definitely will have to play with the values a bit as we want the
product name matches to always appear highest, whether original or
synonyms, but I'll have to figure out how to get that result without one
word terms that have multi word synonyms getting overly boosted for a
phrase match while still sufficiently boosting the normal phrase match
stuff too. With the normal synonym filter I was able to just copy fields
that could have synonyms to a new field (which would be the only one with
the synonym filter), and use a different, lower boost on those fields, but
that won't work with this plugin which applies across everything in the
query. Makes it a bit more complicated to get everything just right.

MJ




Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-06 Thread Joe Lawson
Yeah, I thought the scale of the boosts was off as well, but I got caught up
verifying that the plugin was working. My colleague suggested that because
"small block" is a phrase, it gets a higher score: you basically get a
phrase match each time, which causes it to float to the top. You should
check out his post about Solr's latest scoring engine. It explains the
notion of TF*IDF, which drives almost all the theory in information
retrieval (aka search).

http://opensourceconnections.com/blog/2015/10/16/bm25-the-next-generation-of-lucene-relevation/

We were thinking, as you experimented with, that the 0.5 and 2.0 boosts
were no match for the product name and keyword field boosts so that would
influence your search as well.
On Jun 6, 2016 6:03 PM, "MaryJo Sminkey"  wrote:

> Oh thanks, yeah I did miss that one field which had a parent type with the
> normal synonym filter. However, that's our product SKU field so really
> doesn't even come into play. I verified that none of the other fields have
> a synonym filter set and even removed prodnumbertext just to make
> sure it wasn't doing anything. I was still getting the same results, the
> matches with "SBC" in the name are buried under the "small block" matches.
> After thinking over the issue, I realized what the solution was, I just
> needed to set synonyms.originalBoost high enough that it would be higher
> than the boosts provided by the phrase boosting, which is clearly what was
> letting "small block" jump ahead of "sbc". So I bumped that up to 100
> leaving the synonymBoost at 1 and now I'm getting the results I'm looking
> for.
>
> Thanks for the help!
>
> Mary Jo
>
>
> On Mon, Jun 6, 2016 at 4:57 PM, Joe Lawson <
> jlaw...@opensourceconnections.com> wrote:
>
> > Mary Jo.
> >
> > It appears to be working correctly but you have a very complex query
> going
> > on so it can be confusing. Assuming you are using the queryParser as
> > provided in examples your query would look like "+sbc" when it enters the
> > queryParser and would look like "+((sbc)^2.0 (sb)^0.5 (small block)^0.5)"
> > when it came out and then it would enter the normal pipeline and
> everything
> > would be processed as individual tokens.
> >
> > It appears that you have synonyms being processed at query time on the
> > prodnumbertext field. For example when (sbc)^2.0 enters into the normal
> > query stage, it has all the qf, pf, ps and tie modifiers added so the
> > first one turns into something like
> >
> > "(body:sbc^0.5 | productinfo:sbc^1.0 | keywords:sbc^2.0 |
> prodname:sbc^10.0
> > | prodnumbertext:sbc^20.0)^2.0"
> >
> > Then the query time synonym expansion on prodnumbertext combined with a
> > phrase and default mm being 100% (
> >
> >
> https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser#TheDisMaxQueryParser-Themm(MinimumShouldMatch)Parameter
> > )
> > you end up with query being
> >
> > (((prodnumbertext:sbc prodnumbertext:sb prodnumbertext:small)
> > prodnumbertext:block)~2)^20.0
> >
> > The ~2 comes from mm=100% and having the phrase "small block" as a
> synonym.
> > This messes up your results as well as anything in prodnumbertext will
> have
> > to match "sbc block" "sb block" or "small block" which of course is only
> > going to match small block. Check out the section "Multi-word synonyms
> > won't work as phrase queries" in
> > https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ for
> > more info.
> >
> > Advice: make sure on the schema that none of the fields you are running
> > queries against do any complex query operations, especially make sure
> they
> > aren't doing additional synonym resolution against the same file.
> >
> > I think you are getting hit by the MM bug.  Try tuning it way down to
> > something like 0.01% and see how the matches go.
> >
> >
> >
> > On Fri, Jun 3, 2016 at 2:21 PM, MaryJo Sminkey 
> > wrote:
> >
> > > Okay so big thanks for the help with getting the hon_lucene_synonyms
> > plugin
> > > working. That is a big load off to finally have a solution in place for
> > all
> > > our multi-term synonyms. We did find that the information in Step 8
> about
> > > the plugin showing "SynonymExpandingExtendedDismaxQParser" for QParser
> > does
> > > not seem to be correct, we only ever get "ExtendedDismaxQParser" but
> the
> > > synonym expansion is definitely working.
> > >
> > > In implementing it though, the one thing I'm still having an issue with
> > is
> > > trying to figure out how I can get results on the original term to
> appear
> > > first in our results and matches on the synonyms lower in the results.
> > The
> > > plugin includes settings for an originalboost and synonymboost, but
> that
> > > doesn't seem to be working along with all the other edismax boosts I'm
> > > doing. We search across a number of 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-06 Thread MaryJo Sminkey
Oh thanks, yeah I did miss that one field which had a parent type with the
normal synonym filter. However, that's our product SKU field so it really
doesn't even come into play. I verified that none of the other fields have
a synonym filter set and even removed prodnumbertext just to make
sure it wasn't doing anything. I was still getting the same results: the
matches with "SBC" in the name were buried under the "small block" matches.
After thinking over the issue, I realized what the solution was: I just
needed to set synonyms.originalBoost high enough that it would be higher
than the boosts provided by the phrase boosting, which is clearly what was
letting "small block" jump ahead of "sbc". So I bumped that up to 100,
leaving the synonymBoost at 1, and now I'm getting the results I'm looking
for.

Thanks for the help!

Mary Jo
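
As a sketch of the change described above, here is the parameter set posted
earlier in the thread with the two synonym boosts adjusted. The Python
dictionary form and the variable names are illustrative only; the values are
the ones quoted in this thread.

```python
# Parameters as posted earlier in the thread (original boosts 2.0 and 0.5).
base_params = {
    "defType": "synonym_edismax",
    "qf": "body^0.5 productinfo^1.0 keywords^2.0 prodname^10.0 prodnumbertext^20.0",
    "pf": "productinfo^1 body^5 keywords^10 prodname^50",
    "pf2": "productinfo^1 body^5 keywords^10 prodname^50",
    "pf3": "productinfo^1 body^5 keywords^10 prodname^50",
    "ps": 1,
    "tie": 0.1,
    "synonyms": "true",
    "synonyms.originalBoost": 2.0,
    "synonyms.synonymBoost": 0.5,
}

# The adjustment: push the original-term boost above the largest phrase boost
# (50 on prodname) and leave the synonym boost neutral.
tuned_params = {
    **base_params,
    "synonyms.originalBoost": 100,
    "synonyms.synonymBoost": 1,
}

print(tuned_params["synonyms.originalBoost"], tuned_params["synonyms.synonymBoost"])
```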


On Mon, Jun 6, 2016 at 4:57 PM, Joe Lawson <
jlaw...@opensourceconnections.com> wrote:

> Mary Jo.
>
> It appears to be working correctly but you have a very complex query going
> on so it can be confusing. Assuming you are using the queryParser as
> provided in examples your query would look like "+sbc" when it enters the
> queryParser and would look like "+((sbc)^2.0 (sb)^0.5 (small block)^0.5)"
> when it came out and then it would enter the normal pipeline and everything
> would be processed as individual tokens.
>
> It appears that you have synonyms being processed at query time on the
> prodnumbertext field. For example when (sbc)^2.0 enters into the normal
> query stage, it has all the qf, pf, ps and tie modifiers added so the
> first one turns into something like
>
> "(body:sbc^0.5 | productinfo:sbc^1.0 | keywords:sbc^2.0 | prodname:sbc^10.0
> | prodnumbertext:sbc^20.0)^2.0"
>
> Then the query time synonym expansion on prodnumbertext combined with a
> phrase and default mm being 100% (
>
> https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser#TheDisMaxQueryParser-Themm(MinimumShouldMatch)Parameter
> )
> you end up with query being
>
> (((prodnumbertext:sbc prodnumbertext:sb prodnumbertext:small)
> prodnumbertext:block)~2)^20.0
>
> The ~2 comes from mm=100% and having the phrase "small block" as a synonym.
> This messes up your results as well as anything in prodnumbertext will have
> to match "sbc block" "sb block" or "small block" which of course is only
> going to match small block. Check out the section "Multi-word synonyms
> won't work as phrase queries" in
> https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ for
> more info.
>
> Advice: make sure on the schema that none of the fields you are running
> queries against do any complex query operations, especially make sure they
> aren't doing additional synonym resolution against the same file.
>
> I think you are getting hit by the MM bug.  Try tuning it way down to
> something like 0.01% and see how the matches go.
>
>
>
> On Fri, Jun 3, 2016 at 2:21 PM, MaryJo Sminkey 
> wrote:
>
> > Okay so big thanks for the help with getting the hon_lucene_synonyms
> plugin
> > working. That is a big load off to finally have a solution in place for
> all
> > our multi-term synonyms. We did find that the information in Step 8 about
> > the plugin showing "SynonymExpandingExtendedDismaxQParser" for QParser
> does
> > not seem to be correct, we only ever get "ExtendedDismaxQParser" but the
> > synonym expansion is definitely working.
> >
> > In implementing it though, the one thing I'm still having an issue with
> is
> > trying to figure out how I can get results on the original term to appear
> > first in our results and matches on the synonyms lower in the results.
> The
> > plugin includes settings for an originalboost and synonymboost, but that
> > doesn't seem to be working along with all the other edismax boosts I'm
> > doing. We search across a number of fields, each with their own boost and
> > then do phrase searches with boosts as well. My params look like this:
> >
> > params["defType"] = 'synonym_edismax';
> > params["qf"] = 'body^0.5 productinfo^1.0 keywords^2.0 prodname^10.0
> > prodnumbertext^20.0';
> > params["pf"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> > params["pf2"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> > params["pf3"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> > params["ps"] = 1;
> > params["tie"] = 0.1;
> > params["synonyms"] = true;
> > params["synonyms.originalBoost"] = 2.0;
> > params["synonyms.synonymBoost"] = 0.5;
> >
> > And here's an example of what the plugin gives me for a search on "sbc"
> > which includes synonyms for "sb" and "small block" I don't really
> know
> > enough about this to figure out what exactly it's doing but since all of
> > the results I am getting first are ones with "small block" in the name,
> and
> > the ones with "sbc" in the prodname field which should be first are
> buried
> > about 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-06 Thread Joe Lawson
>
> Advice: make sure on the schema that none of the fields you are running
> queries against do any complex query operations, especially make sure they
> aren't doing additional synonym resolution against the same file.
>

BTW. I'd do this first before messing with MM


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-06 Thread Joe Lawson
Mary Jo.

It appears to be working correctly but you have a very complex query going
on so it can be confusing. Assuming you are using the queryParser as
provided in examples your query would look like "+sbc" when it enters the
queryParser and would look like "+((sbc)^2.0 (sb)^0.5 (small block)^0.5)"
when it came out and then it would enter the normal pipeline and everything
would be processed as individual tokens.
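
As a rough illustration of the rewrite described above (not the plugin's actual code), the expansion step can be sketched in Python. The function name and its arguments are hypothetical; the boosts are the synonyms.originalBoost and synonyms.synonymBoost values from the params in this thread.

```python
def expand_with_synonyms(term, synonyms, original_boost, synonym_boost):
    """Build a boosted OR-group: the original term plus each synonym.

    Hypothetical helper for illustration only; it mimics the shape of the
    string quoted in the message, not the plugin's internal logic.
    """
    clauses = [f"({term})^{original_boost}"]
    clauses += [f"({synonym})^{synonym_boost}" for synonym in synonyms]
    return "+(" + " ".join(clauses) + ")"


expanded = expand_with_synonyms("sbc", ["sb", "small block"], 2.0, 0.5)
print(expanded)  # +((sbc)^2.0 (sb)^0.5 (small block)^0.5)
```

Each parenthesised group is then handed to the normal edismax pipeline, which is where the per-field boosts get applied.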

It appears that you have synonyms being processed at query time on the
prodnumbertext field. For example, when (sbc)^2.0 enters the normal
query stage, it has all the qf, pf, ps and tie modifiers added, so the
first one turns into something like

"(body:sbc^0.5 | productinfo:sbc^1.0 | keywords:sbc^2.0 | prodname:sbc^10.0
| prodnumbertext:sbc^20.0)^2.0"
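
To make the per-field step concrete, here is a small Python sketch that reproduces the shape of the clause above from the qf string in this thread. The helper names are made up for illustration, and this only mimics the printed form of the parsed query; it says nothing about how Solr builds it internally.

```python
def parse_qf(qf):
    """Split a qf string such as 'body^0.5 prodname^10.0' into (field, boost) pairs."""
    fields = []
    for item in qf.split():
        name, _, boost = item.partition("^")
        # A field listed without an explicit boost is treated as 1.0 here.
        fields.append((name, float(boost) if boost else 1.0))
    return fields


def dismax_clause(term, qf, outer_boost):
    """Render one term across every qf field, wrapped in the outer synonym boost."""
    parts = [f"{name}:{term}^{boost}" for name, boost in parse_qf(qf)]
    return "(" + " | ".join(parts) + f")^{outer_boost}"


QF = "body^0.5 productinfo^1.0 keywords^2.0 prodname^10.0 prodnumbertext^20.0"
clause = dismax_clause("sbc", QF, 2.0)
print(clause)
```

The outer ^2.0 is the originalBoost; the inner boosts come straight from qf, so the two multiply rather than replace each other.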

Then, with the query-time synonym expansion on prodnumbertext combined with a
phrase and the default mm being 100% (
https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser#TheDisMaxQueryParser-Themm(MinimumShouldMatch)Parameter),
you end up with the query being

(((prodnumbertext:sbc prodnumbertext:sb prodnumbertext:small)
prodnumbertext:block)~2)^20.0

The ~2 comes from mm=100% and having the phrase "small block" as a synonym.
This messes up your results as well, as anything in prodnumbertext will have
to match "sbc block", "sb block" or "small block", which of course is only
going to match small block. Check out the section "Multi-word synonyms
won't work as phrase queries" in
https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ for
more info.

Advice: make sure on the schema that none of the fields you are running
queries against do any complex query operations, especially make sure they
aren't doing additional synonym resolution against the same file.

I think you are getting hit by the MM bug.  Try tuning it way down to
something like 0.01% and see how the matches go.
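
A minimal sketch of that experiment, assuming the query is sent to a /select handler over HTTP (the handler path and the q and debugQuery entries are assumptions added here; the rest are the params quoted in this thread). The mm value is copied verbatim from the suggestion above; whether a fractional percentage is accepted may depend on the Solr version, so treat it as something to verify.

```python
from urllib.parse import urlencode

# Params from the thread, plus the lowered mm under test.
params = {
    "q": "sbc",
    "defType": "synonym_edismax",
    "qf": "body^0.5 productinfo^1.0 keywords^2.0 prodname^10.0 prodnumbertext^20.0",
    "pf": "productinfo^1 body^5 keywords^10 prodname^50",
    "pf2": "productinfo^1 body^5 keywords^10 prodname^50",
    "pf3": "productinfo^1 body^5 keywords^10 prodname^50",
    "ps": "1",
    "tie": "0.1",
    "synonyms": "true",
    "synonyms.originalBoost": "2.0",
    "synonyms.synonymBoost": "0.5",
    "mm": "0.01%",         # value suggested above; not verified against Solr
    "debugQuery": "true",  # assumption: used to inspect the parsed query
}

query_string = urlencode(params)
print("/select?" + query_string)
```

Comparing the parsedquery debug output with and without the mm entry should show whether the ~2 on the prodnumbertext clause goes away.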



On Fri, Jun 3, 2016 at 2:21 PM, MaryJo Sminkey  wrote:

> Okay so big thanks for the help with getting the hon_lucene_synonyms plugin
> working. That is a big load off to finally have a solution in place for all
> our multi-term synonyms. We did find that the information in Step 8 about
> the plugin showing "SynonymExpandingExtendedDismaxQParser" for QParser does
> not seem to be correct, we only ever get "ExtendedDismaxQParser" but the
> synonym expansion is definitely working.
>
> In implementing it though, the one thing I'm still having an issue with is
> trying to figure out how I can get results on the original term to appear
> first in our results and matches on the synonyms lower in the results. The
> plugin includes settings for an originalboost and synonymboost, but that
> doesn't seem to be working along with all the other edismax boosts I'm
> doing. We search across a number of fields, each with their own boost and
> then do phrase searches with boosts as well. My params look like this:
>
> params["defType"] = 'synonym_edismax';
> params["qf"] = 'body^0.5 productinfo^1.0 keywords^2.0 prodname^10.0
> prodnumbertext^20.0';
> params["pf"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> params["pf2"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> params["pf3"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> params["ps"] = 1;
> params["tie"] = 0.1;
> params["synonyms"] = true;
> params["synonyms.originalBoost"] = 2.0;
> params["synonyms.synonymBoost"] = 0.5;
>
> And here's an example of what the plugin gives me for a search on "sbc"
> which includes synonyms for "sb" and "small block" I don't really know
> enough about this to figure out what exactly it's doing but since all of
> the results I am getting first are ones with "small block" in the name, and
> the ones with "sbc" in the prodname field which should be first are buried
> about 1000 documents in, I know the originalboost and synonymboost aren't
> working with all this other stuff. Ideas how to fix this? With the normal
> synonym filter we just set up copies of the fields that could have synonyms
> to use with that filter applied and had a lower boost on those. Not sure
> how to make it work with this custom query parser though.
>
> +((prodname:sbc^10.0 | body:sbc^0.5 | productinfo:sbc | keywords:sbc^2.0 |
> (((prodnumbertext:sbc prodnumbertext:small prodnumbertext:sb)
> prodnumbertext:block)~2)^20.0)~0.1^2.0 (((+(prodname:sb^10.0 | body:sb^0.5
> | productinfo:sb | keywords:sb^2.0 | (((prodnumbertext:sb
> prodnumbertext:small prodnumbertext:sbc) prodnumbertext:block)~2)^20.0)~0.1
> ()))^0.5) (((+(((prodname:small^10.0 | body:small^0.5 | productinfo:small |
> keywords:small^2.0 | prodnumbertext:small^20.0)~0.1 (prodname:block^10.0 |
> body:block^0.5 | productinfo:block | keywords:block^2.0 |
> prodnumbertext:block^20.0)~0.1)~2) (productinfo:"small block"~1 |
> body:"small block"~1^5.0 | keywords:"small block"~1^10.0 | prodname:"small
> block"~1^50.0)~0.1 (productinfo:"small block"~1 | body:"small block"~1^5.0
> | keywords:"small block"~1^10.0 | 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-05 Thread John Bickerstaff
Yes, query parameters/modifications mentioned in the readme.  Beyond those
I don't have useful advice at this point
On Jun 4, 2016 10:56 PM, "MaryJo Sminkey"  wrote:

> On Sat, Jun 4, 2016 at 11:47 PM, John Bickerstaff <
> j...@johnbickerstaff.com>
> wrote:
>
> > MaryJo - I'm on vacation but can't resist... iirc there are some very
> > useful query modifications suggested in the readme on the github for the
> > plugin... can't access right now.
> >
>
>
> I'm assuming you mean the various query parameters. The only ones I see in
> there that would be of use for me are the ones I'm already using. As far as
> can tell from their description.
>
> MJ
>
>
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-04 Thread MaryJo Sminkey
On Sat, Jun 4, 2016 at 11:47 PM, John Bickerstaff 
wrote:

> MaryJo - I'm on vacation but can't resist... iirc there are some very
> useful query modifications suggested in the readme on the github for the
> plugin... can't access right now.
>


I'm assuming you mean the various query parameters. The only ones I see in
there that would be of use for me are the ones I'm already using. As far as
can tell from their description.

MJ





Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-04 Thread John Bickerstaff
MaryJo - I'm on vacation but can't resist... iirc there are some very
useful query modifications suggested in the readme on the github for the
plugin... can't access right now.

You may know about them already, but if it's been a while since you looked,
those may help...
On Jun 3, 2016 12:28 PM, "MaryJo Sminkey"  wrote:

On some additional tests, it looks like it's the phrase matching in
particular that is the issue, if I take that out I do seem to be getting
better results. I definitely don't want to get rid of those so need to find
a way to make them work together.




On Fri, Jun 3, 2016 at 2:21 PM, MaryJo Sminkey  wrote:

> Okay so big thanks for the help with getting the hon_lucene_synonyms
> plugin working. That is a big load off to finally have a solution in place
> for all our multi-term synonyms. We did find that the information in Step
8
> about the plugin showing "SynonymExpandingExtendedDismaxQParser" for
> QParser does not seem to be correct, we only ever get
> "ExtendedDismaxQParser" but the synonym expansion is definitely working.
>
> In implementing it though, the one thing I'm still having an issue with is
> trying to figure out how I can get results on the original term to appear
> first in our results and matches on the synonyms lower in the results. The
> plugin includes settings for an originalboost and synonymboost, but that
> doesn't seem to be working along with all the other edismax boosts I'm
> doing. We search across a number of fields, each with their own boost and
> then do phrase searches with boosts as well. My params look like this:
>
> params["defType"] = 'synonym_edismax';
> params["qf"] = 'body^0.5 productinfo^1.0 keywords^2.0 prodname^10.0
> prodnumbertext^20.0';
> params["pf"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> params["pf2"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> params["pf3"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> params["ps"] = 1;
> params["tie"] = 0.1;
> params["synonyms"] = true;
> params["synonyms.originalBoost"] = 2.0;
> params["synonyms.synonymBoost"] = 0.5;
>
> And here's an example of what the plugin gives me for a search on "sbc"
> which includes synonyms for "sb" and "small block" I don't really know
> enough about this to figure out what exactly it's doing but since all of
> the results I am getting first are ones with "small block" in the name,
and
> the ones with "sbc" in the prodname field which should be first are buried
> about 1000 documents in, I know the originalboost and synonymboost aren't
> working with all this other stuff. Ideas how to fix this? With the normal
> synonym filter we just set up copies of the fields that could have
synonyms
> to use with that filter applied and had a lower boost on those. Not sure
> how to make it work with this custom query parser though.
>
> +((prodname:sbc^10.0 | body:sbc^0.5 | productinfo:sbc | keywords:sbc^2.0 |
> (((prodnumbertext:sbc prodnumbertext:small prodnumbertext:sb)
> prodnumbertext:block)~2)^20.0)~0.1^2.0 (((+(prodname:sb^10.0 | body:sb^0.5
> | productinfo:sb | keywords:sb^2.0 | (((prodnumbertext:sb
> prodnumbertext:small prodnumbertext:sbc)
prodnumbertext:block)~2)^20.0)~0.1
> ()))^0.5) (((+(((prodname:small^10.0 | body:small^0.5 | productinfo:small
|
> keywords:small^2.0 | prodnumbertext:small^20.0)~0.1 (prodname:block^10.0 |
> body:block^0.5 | productinfo:block | keywords:block^2.0 |
> prodnumbertext:block^20.0)~0.1)~2) (productinfo:"small block"~1 |
> body:"small block"~1^5.0 | keywords:"small block"~1^10.0 | prodname:"small
> block"~1^50.0)~0.1 (productinfo:"small block"~1 | body:"small block"~1^5.0
> | keywords:"small block"~1^10.0 | prodname:"small
> block"~1^50.0)~0.1))^0.5)) ()
>
>
> Mary Jo
>
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-03 Thread MaryJo Sminkey
On some additional tests, it looks like it's the phrase matching in
particular that is the issue; if I take that out I do seem to be getting
better results. I definitely don't want to get rid of those, so I need to find
a way to make them work together.





On Fri, Jun 3, 2016 at 2:21 PM, MaryJo Sminkey  wrote:

> Okay so big thanks for the help with getting the hon_lucene_synonyms
> plugin working. That is a big load off to finally have a solution in place
> for all our multi-term synonyms. We did find that the information in Step 8
> about the plugin showing "SynonymExpandingExtendedDismaxQParser" for
> QParser does not seem to be correct, we only ever get
> "ExtendedDismaxQParser" but the synonym expansion is definitely working.
>
> In implementing it though, the one thing I'm still having an issue with is
> trying to figure out how I can get results on the original term to appear
> first in our results and matches on the synonyms lower in the results. The
> plugin includes settings for an originalboost and synonymboost, but that
> doesn't seem to be working along with all the other edismax boosts I'm
> doing. We search across a number of fields, each with their own boost and
> then do phrase searches with boosts as well. My params look like this:
>
> params["defType"] = 'synonym_edismax';
> params["qf"] = 'body^0.5 productinfo^1.0 keywords^2.0 prodname^10.0
> prodnumbertext^20.0';
> params["pf"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> params["pf2"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> params["pf3"] = 'productinfo^1 body^5 keywords^10 prodname^50';
> params["ps"] = 1;
> params["tie"] = 0.1;
> params["synonyms"] = true;
> params["synonyms.originalBoost"] = 2.0;
> params["synonyms.synonymBoost"] = 0.5;
>
> And here's an example of what the plugin gives me for a search on "sbc"
> which includes synonyms for "sb" and "small block" I don't really know
> enough about this to figure out what exactly it's doing but since all of
> the results I am getting first are ones with "small block" in the name, and
> the ones with "sbc" in the prodname field which should be first are buried
> about 1000 documents in, I know the originalboost and synonymboost aren't
> working with all this other stuff. Ideas how to fix this? With the normal
> synonym filter we just set up copies of the fields that could have synonyms
> to use with that filter applied and had a lower boost on those. Not sure
> how to make it work with this custom query parser though.
>
> +((prodname:sbc^10.0 | body:sbc^0.5 | productinfo:sbc | keywords:sbc^2.0 |
> (((prodnumbertext:sbc prodnumbertext:small prodnumbertext:sb)
> prodnumbertext:block)~2)^20.0)~0.1^2.0 (((+(prodname:sb^10.0 | body:sb^0.5
> | productinfo:sb | keywords:sb^2.0 | (((prodnumbertext:sb
> prodnumbertext:small prodnumbertext:sbc) prodnumbertext:block)~2)^20.0)~0.1
> ()))^0.5) (((+(((prodname:small^10.0 | body:small^0.5 | productinfo:small |
> keywords:small^2.0 | prodnumbertext:small^20.0)~0.1 (prodname:block^10.0 |
> body:block^0.5 | productinfo:block | keywords:block^2.0 |
> prodnumbertext:block^20.0)~0.1)~2) (productinfo:"small block"~1 |
> body:"small block"~1^5.0 | keywords:"small block"~1^10.0 | prodname:"small
> block"~1^50.0)~0.1 (productinfo:"small block"~1 | body:"small block"~1^5.0
> | keywords:"small block"~1^10.0 | prodname:"small
> block"~1^50.0)~0.1))^0.5)) ()
>
>
> Mary Jo
>
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-03 Thread MaryJo Sminkey
Okay so big thanks for the help with getting the hon_lucene_synonyms plugin
working. That is a big load off to finally have a solution in place for all
our multi-term synonyms. We did find that the information in Step 8 about
the plugin showing "SynonymExpandingExtendedDismaxQParser" for QParser does
not seem to be correct, we only ever get "ExtendedDismaxQParser" but the
synonym expansion is definitely working.

In implementing it though, the one thing I'm still having an issue with is
trying to figure out how I can get results on the original term to appear
first in our results and matches on the synonyms lower in the results. The
plugin includes settings for an originalboost and synonymboost, but that
doesn't seem to be working along with all the other edismax boosts I'm
doing. We search across a number of fields, each with their own boost and
then do phrase searches with boosts as well. My params look like this:

params["defType"] = 'synonym_edismax';
params["qf"] = 'body^0.5 productinfo^1.0 keywords^2.0 prodname^10.0
prodnumbertext^20.0';
params["pf"] = 'productinfo^1 body^5 keywords^10 prodname^50';
params["pf2"] = 'productinfo^1 body^5 keywords^10 prodname^50';
params["pf3"] = 'productinfo^1 body^5 keywords^10 prodname^50';
params["ps"] = 1;
params["tie"] = 0.1;
params["synonyms"] = true;
params["synonyms.originalBoost"] = 2.0;
params["synonyms.synonymBoost"] = 0.5;

And here's an example of what the plugin gives me for a search on "sbc",
which includes synonyms for "sb" and "small block". I don't really know
enough about this to figure out what exactly it's doing, but since all of
the results I am getting first are ones with "small block" in the name, and
the ones with "sbc" in the prodname field, which should be first, are buried
about 1000 documents in, I know the originalboost and synonymboost aren't
working with all this other stuff. Ideas how to fix this? With the normal
synonym filter we just set up copies of the fields that could have synonyms
to use with that filter applied and had a lower boost on those. Not sure
how to make it work with this custom query parser though.

+((prodname:sbc^10.0 | body:sbc^0.5 | productinfo:sbc | keywords:sbc^2.0 |
(((prodnumbertext:sbc prodnumbertext:small prodnumbertext:sb)
prodnumbertext:block)~2)^20.0)~0.1^2.0 (((+(prodname:sb^10.0 | body:sb^0.5
| productinfo:sb | keywords:sb^2.0 | (((prodnumbertext:sb
prodnumbertext:small prodnumbertext:sbc) prodnumbertext:block)~2)^20.0)~0.1
()))^0.5) (((+(((prodname:small^10.0 | body:small^0.5 | productinfo:small |
keywords:small^2.0 | prodnumbertext:small^20.0)~0.1 (prodname:block^10.0 |
body:block^0.5 | productinfo:block | keywords:block^2.0 |
prodnumbertext:block^20.0)~0.1)~2) (productinfo:"small block"~1 |
body:"small block"~1^5.0 | keywords:"small block"~1^10.0 | prodname:"small
block"~1^50.0)~0.1 (productinfo:"small block"~1 | body:"small block"~1^5.0
| keywords:"small block"~1^10.0 | prodname:"small
block"~1^50.0)~0.1))^0.5)) ()


Mary Jo


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread John Bickerstaff
Yes, I get that, thanks.
On Jun 1, 2016 6:38 PM, "Joe Lawson" 
wrote:

> 2.0 is compiled with Solr 5 and Java 7. It uses the namespace
> solr.SynonymExpandingExtendedDismaxQParserPlugin
>
> 5.0.4 is compiled with Solr 6 and Java 8 and is the first release that made
> it to maven central. It uses the namespace
> com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin
>
> The features are the same for all versions.
>
> Hope this clears things up.
>
> -Joe
> On Jun 1, 2016 8:11 PM, "John Bickerstaff" 
> wrote:
>
> > Just to be clear, I got version 2.0 of the jar from github...  should I be
> > looking for something in a maven repository?  A bit confused at this point
> > given all the version numbers...
> >
> > I want the latest and greatest unless there's any special
> considerations..
> >
> > Thanks for the assistance!
> > On Jun 1, 2016 5:46 PM, "MaryJo Sminkey"  wrote:
> >
> > Yup that was the issue for us as well. It doesn't seem to be throwing the
> > class error now, although I have not been able to successfully get back
> > results that seem to be using it, it's showing up as the deftype in my
> > params but the QParser in my debug is the normal edismax one. I will have
> > to play around with my config some more tomorrow and try to figure out
> what
> > we're doing wrong.
> >
> > MJ
> >
> >
> >
> > On Wed, Jun 1, 2016 at 6:38 PM, Joe Lawson <
> > jlaw...@opensourceconnections.com> wrote:
> >
> > > Nothing up until 5.0.4 was distributed on maven central. 5.0 -> 5.0.4
> was
> > > just a bunch of clean up to get it ready for maven (including the
> > namespace
> > > change).
> > >
> > > Being that nearly all docs and articles talking about the plugin
> > reference
> > > the old 2.0 one could reasonably get confused as to what config to use
> > esp
> > > when I linked the latest 5.0.4 test config prior.
> > >
> > > You can get the older jars from the links off the readme.md.
> > > On Jun 1, 2016 6:14 PM, "Shawn Heisey"  wrote:
> > >
> > > On 6/1/2016 1:10 PM, John Bickerstaff wrote:
> > > > @Joe:
> > > >
> > > > Is it possible that the jar's package name does not match the entry
> in
> > > the
> > > > sample solrconfig.xml file?
> > > >
> > > > The solrconfig.xml example file in the test directory contains the
> > > > following package name:
> > > >  > > >
> > >
> > >
> >
> >
> class="com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin">
> > > >
> > > > However, the jar file (when unzipped) has the following directory
> > > structure
> > > > down to the same class name:
> > > >
> > > > org --> apache --> solr --> search
> > > >
> > > > I just tried with the name change to the org.apache package name
> in
> > > the
> > > > solrconfig.xml file and got no errors.
> > >
> > > Looks like the package name is indeed the problem here.
> > >
> > > They changed the package name from org.apache.solr.search to
> > > com.github.healthonnet.search in the LATEST source code release --
> > > 5.0.4.  The code in the 5.0.3 version (and the 2.0.0 version indicated
> > > in the earlier message) uses org.apache.solr.search.
> > >
> > > I cannot find any files in the 2.0.0 zipfile download that contain the
> > > new package name, so I'm curious where the incorrect information on how
> > > to configure Solr to use the plugin was found.  I did not check the
> > > tarball download.
> > >
> > > Thanks,
> > > Shawn
> > >
> >
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread Joe Lawson
2.0 is compiled with Solr 5 and Java 7. It uses the namespace
solr.SynonymExpandingExtendedDismaxQParserPlugin

5.0.4 is compiled with Solr 6 and Java 8 and is the first release that made
it to maven central. It uses the namespace
com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin
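
For anyone scripting their config, the version split can be sketched as below. The function name is hypothetical, and the rule is inferred only from this thread: releases before 5.0.4 ship the class under org.apache.solr.search (which the solr. shorthand above appears to refer to), and 5.0.4 uses the com.github.healthonnet.search package. Treating anything newer than 5.0.4 the same way is an assumption.

```python
OLD_PACKAGE = "org.apache.solr.search"
NEW_PACKAGE = "com.github.healthonnet.search"
CLASS_NAME = "SynonymExpandingExtendedDismaxQParserPlugin"


def parser_class_for(version):
    """Return the fully qualified parser class for a plugin version string."""
    parts = tuple(int(piece) for piece in version.split("."))
    # Pad short versions such as "2.0" to three components for comparison.
    parts = parts + (0,) * (3 - len(parts))
    package = NEW_PACKAGE if parts >= (5, 0, 4) else OLD_PACKAGE
    return f"{package}.{CLASS_NAME}"


for plugin_version in ("2.0", "5.0.3", "5.0.4"):
    print(plugin_version, "->", parser_class_for(plugin_version))
```

The returned string is what goes in the class attribute of the queryParser entry in solrconfig.xml for the jar you actually installed.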

The features are the same for all versions.

Hope this clears things up.

-Joe
On Jun 1, 2016 8:11 PM, "John Bickerstaff"  wrote:

> Just to be clear, I got version 2.0 of the jar from github...  should I be
> looking for something in a maven repository?  A bit confused at this point
> given all the version numbers...
>
> I want the latest and greatest unless there's any special considerations..
>
> Thanks for the assistance!
> On Jun 1, 2016 5:46 PM, "MaryJo Sminkey"  wrote:
>
> Yup that was the issue for us as well. It doesn't seem to be throwing the
> class error now, although I have not been able to successfully get back
> results that seem to be using it, it's showing up as the deftype in my
> params but the QParser in my debug is the normal edismax one. I will have
> to play around with my config some more tomorrow and try to figure out what
> we're doing wrong.
>
> MJ
>
>
>
> On Wed, Jun 1, 2016 at 6:38 PM, Joe Lawson <
> jlaw...@opensourceconnections.com> wrote:
>
> > Nothing up until 5.0.4 was distributed on maven central. 5.0 -> 5.0.4 was
> > just a bunch of clean up to get it ready for maven (including the
> namespace
> > change).
> >
> > Being that nearly all docs and articles talking about the plugin
> reference
> > the old 2.0 one could reasonably get confused as to what config to use
> esp
> > when I linked the latest 5.0.4 test config prior.
> >
> > You can get the older jars from the links off the readme.md.
> > On Jun 1, 2016 6:14 PM, "Shawn Heisey"  wrote:
> >
> > On 6/1/2016 1:10 PM, John Bickerstaff wrote:
> > > @Joe:
> > >
> > > Is it possible that the jar's package name does not match the entry in
> > the
> > > sample solrconfig.xml file?
> > >
> > > The solrconfig.xml example file in the test directory contains the
> > > following package name:
> > >  > >
> >
> >
>
> class="com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin">
> > >
> > > However, the jar file (when unzipped) has the following directory
> > structure
> > > down to the same class name:
> > >
> > > org --> apache --> solr --> search
> > >
> > > I just tried with the name change to the org.apache package name in
> > the
> > > solrconfig.xml file and got no errors.
> >
> > Looks like the package name is indeed the problem here.
> >
> > They changed the package name from org.apache.solr.search to
> > com.github.healthonnet.search in the LATEST source code release --
> > 5.0.4.  The code in the 5.0.3 version (and the 2.0.0 version indicated
> > in the earlier message) uses org.apache.solr.search.
> >
> > I cannot find any files in the 2.0.0 zipfile download that contain the
> > new package name, so I'm curious where the incorrect information on how
> > to configure Solr to use the plugin was found.  I did not check the
> > tarball download.
> >
> > Thanks,
> > Shawn
> >
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread John Bickerstaff
Just to be clear, I got version 2.0 of the jar from github...  should I be
looking for something in a maven repository?  A bit confused at this point
given all the version numbers...

I want the latest and greatest unless there's any special considerations..

Thanks for the assistance!
On Jun 1, 2016 5:46 PM, "MaryJo Sminkey"  wrote:

Yup that was the issue for us as well. It doesn't seem to be throwing the
class error now, although I have not been able to successfully get back
results that seem to be using it, it's showing up as the deftype in my
params but the QParser in my debug is the normal edismax one. I will have
to play around with my config some more tomorrow and try to figure out what
we're doing wrong.

MJ



On Wed, Jun 1, 2016 at 6:38 PM, Joe Lawson <
jlaw...@opensourceconnections.com> wrote:

> Nothing up until 5.0.4 was distributed on maven central. 5.0 -> 5.0.4 was
> just a bunch of clean up to get it ready for maven (including the
namespace
> change).
>
> Being that nearly all docs and articles talking about the plugin reference
> the old 2.0 one could reasonably get confused as to what config to use esp
> when I linked the latest 5.0.4 test config prior.
>
> You can get the older jars from the links off the readme.md.
> On Jun 1, 2016 6:14 PM, "Shawn Heisey"  wrote:
>
> On 6/1/2016 1:10 PM, John Bickerstaff wrote:
> > @Joe:
> >
> > Is it possible that the jar's package name does not match the entry in
> the
> > sample solrconfig.xml file?
> >
> > The solrconfig.xml example file in the test directory contains the
> > following package name:
> >  >
>
>
class="com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin">
> >
> > However, the jar file (when unzipped) has the following directory
> structure
> > down to the same class name:
> >
> > org --> apache --> solr --> search
> >
> > I just tried with the name change to the org.apache package name in
> the
> > solrconfig.xml file and got no errors.
>
> Looks like the package name is indeed the problem here.
>
> They changed the package name from org.apache.solr.search to
> com.github.healthonnet.search in the LATEST source code release --
> 5.0.4.  The code in the 5.0.3 version (and the 2.0.0 version indicated
> in the earlier message) uses org.apache.solr.search.
>
> I cannot find any files in the 2.0.0 zipfile download that contain the
> new package name, so I'm curious where the incorrect information on how
> to configure Solr to use the plugin was found.  I did not check the
> tarball download.
>
> Thanks,
> Shawn
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread MaryJo Sminkey
Yup that was the issue for us as well. It doesn't seem to be throwing the
class error now, although I have not been able to successfully get back
results that seem to be using it, it's showing up as the deftype in my
params but the QParser in my debug is the normal edismax one. I will have
to play around with my config some more tomorrow and try to figure out what
we're doing wrong.

MJ



On Wed, Jun 1, 2016 at 6:38 PM, Joe Lawson <
jlaw...@opensourceconnections.com> wrote:

> Nothing up until 5.0.4 was distributed on maven central. 5.0 -> 5.0.4 was
> just a bunch of clean up to get it ready for maven (including the namespace
> change).
>
> Being that nearly all docs and articles talking about the plugin reference
> the old 2.0 one could reasonably get confused as to what config to use esp
> when I linked the latest 5.0.4 test config prior.
>
> You can get the older jars from the links off the readme.md.
> On Jun 1, 2016 6:14 PM, "Shawn Heisey"  wrote:
>
> On 6/1/2016 1:10 PM, John Bickerstaff wrote:
> > @Joe:
> >
> > Is it possible that the jar's package name does not match the entry in
> the
> > sample solrconfig.xml file?
> >
> > The solrconfig.xml example file in the test directory contains the
> > following package name:
> >  >
>
> class="com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin">
> >
> > However, the jar file (when unzipped) has the following directory
> structure
> > down to the same class name:
> >
> > org --> apache --> solr --> search
> >
> > I just tried with the name change to the org.apache package name in
> the
> > solrconfig.xml file and got no errors.
>
> Looks like the package name is indeed the problem here.
>
> They changed the package name from org.apache.solr.search to
> com.github.healthonnet.search in the LATEST source code release --
> 5.0.4.  The code in the 5.0.3 version (and the 2.0.0 version indicated
> in the earlier message) uses org.apache.solr.search.
>
> I cannot find any files in the 2.0.0 zipfile download that contain the
> new package name, so I'm curious where the incorrect information on how
> to configure Solr to use the plugin was found.  I did not check the
> tarball download.
>
> Thanks,
> Shawn
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread Joe Lawson
Nothing up until 5.0.4 was distributed on maven central. 5.0 -> 5.0.4 was
just a bunch of clean up to get it ready for maven (including the namespace
change).

Being that nearly all docs and articles talking about the plugin reference
the old 2.0, one could reasonably get confused as to what config to use, esp.
when I linked the latest 5.0.4 test config prior.

You can get the older jars from the links off the readme.md.
On Jun 1, 2016 6:14 PM, "Shawn Heisey"  wrote:

On 6/1/2016 1:10 PM, John Bickerstaff wrote:
> @Joe:
>
> Is it possible that the jar's package name does not match the entry in the
> sample solrconfig.xml file?
>
> The solrconfig.xml example file in the test directory contains the
> following package name:
> 
class="com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin">
>
> However, the jar file (when unzipped) has the following directory
structure
> down to the same class name:
>
> org --> apache --> solr --> search
>
> I just tried with the name change to the org.apache package name in
the
> solrconfig.xml file and got no errors.

Looks like the package name is indeed the problem here.

They changed the package name from org.apache.solr.search to
com.github.healthonnet.search in the LATEST source code release --
5.0.4.  The code in the 5.0.3 version (and the 2.0.0 version indicated
in the earlier message) uses org.apache.solr.search.

I cannot find any files in the 2.0.0 zipfile download that contain the
new package name, so I'm curious where the incorrect information on how
to configure Solr to use the plugin was found.  I did not check the
tarball download.

Thanks,
Shawn


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread Shawn Heisey
On 6/1/2016 1:10 PM, John Bickerstaff wrote:
> @Joe:
>
> Is it possible that the jar's package name does not match the entry in the
> sample solrconfig.xml file?
>
> The solrconfig.xml example file in the test directory contains the
> following package name:
>  class="com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin">
>
> However, the jar file (when unzipped) has the following directory structure
> down to the same class name:
>
> org --> apache --> solr --> search
>
> I just tried with the name change to the org.apache package name in the
> solrconfig.xml file and got no errors.

Looks like the package name is indeed the problem here.

They changed the package name from org.apache.solr.search to
com.github.healthonnet.search in the LATEST source code release --
5.0.4.  The code in the 5.0.3 version (and the 2.0.0 version indicated
in the earlier message) uses org.apache.solr.search.
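
Rather than unzipping the jar by hand, the package can be checked with a few lines of Python, since a jar is an ordinary zip archive. This is only a sketch: the function name is made up, and the demo builds a tiny in-memory stand-in for the jar so it runs without the real file; in practice you would pass the path to the plugin jar you downloaded.

```python
import io
import zipfile


def find_class_packages(jar, class_name):
    """List the package(s) under which class_name appears inside a jar.

    `jar` may be a filesystem path or a file-like object.
    """
    suffix = "/" + class_name + ".class"
    with zipfile.ZipFile(jar) as archive:
        return sorted(
            entry[: -len(suffix)].replace("/", ".")
            for entry in archive.namelist()
            if entry.endswith(suffix)
        )


# Demo with an in-memory stand-in for the 2.0 jar layout described above.
fake_jar = io.BytesIO()
with zipfile.ZipFile(fake_jar, "w") as archive:
    archive.writestr(
        "org/apache/solr/search/SynonymExpandingExtendedDismaxQParserPlugin.class",
        b"",
    )
fake_jar.seek(0)

found = find_class_packages(fake_jar, "SynonymExpandingExtendedDismaxQParserPlugin")
print(found)  # ['org.apache.solr.search']
```

Whatever package this prints is the one the class attribute in solrconfig.xml needs to match.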

I cannot find any files in the 2.0.0 zipfile download that contain the
new package name, so I'm curious where the incorrect information on how
to configure Solr to use the plugin was found.  I did not check the
tarball download.

Thanks,
Shawn



Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread Joe Lawson
I mean the 5.0 namespace is different from the 2.0 not 3.0.
On Jun 1, 2016 5:43 PM, "Joe Lawson" 
wrote:

2.0 is different from 3.0 so check the test config that is associated with
the 2.0 release. Ie


https://github.com/healthonnet/hon-lucene-synonyms/blob/8f736da053510911517fcb8a712b1d8ca5c920d2/src/test/resources/solr/collection1/conf/example_solrconfig.xml


On Jun 1, 2016 3:10 PM, "John Bickerstaff"  wrote:

> @Joe:
>
> Is it possible that the jar's package name does not match the entry in the
> sample solrconfig.xml file?
>
> The solrconfig.xml example file in the test directory contains the
> following package name:
> <queryParser name="synonym_edismax"
> class="com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin">
>
> However, the jar file (when unzipped) has the following directory structure
> down to the same class name:
>
> org --> apache --> solr --> search
>
> I just tried with the name change to the org.apache package name in the
> solrconfig.xml file and got no errors.
>
> I haven't yet tried to see synonym "stuff" in the debug for a query, but
> I'm betting it's much ado about nothing - just the package name has
> changed...
>
> If that makes sense to you, you may want to edit the example file...
>
> Thanks a lot for all the work you contributed to this by the way!
>
> --JohnB
>
> @ MaryJo - this may be the problem in your situation for this specific file
> -- good luck!
>
> I put it in $SOLR_HOME/lib  - which, taking the default "for production"
> install script on Ubuntu resolved to /var/solr/data/lib
>
> Good luck!
>
> On Wed, Jun 1, 2016 at 12:49 PM, John Bickerstaff <
> j...@johnbickerstaff.com>
> wrote:
>
> > I tried this - it didn't fail.  I don't know if it really started in
> > Denable.runtime.lib=true mode or not:
> >
> > service solr start -Denable.runtime.lib=true
> >
> > Of course, I'd still really rather be able to just drop jars into
> > /var/solr/data/lib and have them work...
> >
> > Thanks all.
> >
> > On Wed, Jun 1, 2016 at 12:42 PM, John Bickerstaff <
> > j...@johnbickerstaff.com> wrote:
> >
> >> So - the instructions on using the Blob Store API say to use the
> >> Denable.runtime.lib=true option when starting Solr.
> >>
> >> Thing is, I've installed per the "for production" instructions which
> >> gives me an entry in /etc/init.d called solr.
> >>
> >> Two questions.
> >>
> >> To test this can I still use the start.jar in /opt/solr/server as long
> as
> >> I issue the "cloud mode" flag or does that no longer work in 5.x?
> >>
> >> Do I instead have to modify that start script in /etc/init.d ?
> >>
> >> On Wed, Jun 1, 2016 at 10:42 AM, John Bickerstaff <
> >> j...@johnbickerstaff.com> wrote:
> >>
> >>> Ahhh - gotcha.
> >>>
> >>> Well, not sure why it's not picked up - seems lots of other jars are...
> >>> Maybe Joe will comment...
> >>>
> >>> On Wed, Jun 1, 2016 at 10:22 AM, MaryJo Sminkey 
> >>> wrote:
> >>>
>  That refers to running Solr in cloud mode. We aren't there yet.
> 
>  MJ
> 
> 
> 
>  On Wed, Jun 1, 2016 at 12:20 PM, John Bickerstaff <
>  j...@johnbickerstaff.com>
>  wrote:
> 
>  > Hi Mary Jo,
>  >
>  > I'll point you to Joe's earlier comment about needing to use the
> Blob
>  Store
>  > API...  He put a link in his response.
>  >
>  > I'm about to try that today...  Given that Joe is a contributor to
>  > hon_lucene there's a good chance his experience is correct here
> -
>  > especially given the evidence you just provided...
>  >
>  > Here's a copy - paste for your convenience.  It's a bit convoluted,
>  > although I totally get how this kind of approach is great for large
>  Solr
>  > Cloud installations that have machines or VMs coming up and going
>  down as
>  > part of a services-based approach...
>  >
>  > Joe said:
>  > The docs are out of date for the synonym_edismax but it does work.
>  Check
>  > out the tests for working examples. I'll try to update it soon. I've
>  run
>  > the plugin on Solr 5 and 6, solrcloud and standalone. For running in
>  > SolrCloud make sure you follow
>  >
>  >
> 
> https://cwiki.apache.org/confluence/display/solr/Adding+Custom+Plugins+in+SolrCloud+Mode
>  >
>  > On Wed, Jun 1, 2016 at 10:15 AM, MaryJo Sminkey <
> mjsmin...@gmail.com>
>  > wrote:
>  >
>  > > So we still can't get this to work, here's the latest update my
>  server
>  > guy
>  > > gave me: It seems to not matter where the file is located, it does
>  not
>  > > load. Yet, the Solr Java class path shows the file has loaded.
>  Only
>  > > this path (./server/lib/hon-lucene-synonyms-2.0.0.jar) will work
> in
>  that
>  > it
>  > > loads in the java class path.  I've yet to find out what the error
>  is.
>  > All
>  > > I can see is this "Error loading class". 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread Joe Lawson
2.0 is different from 3.0 so check the test config that is associated with
the 2.0 release. Ie


https://github.com/healthonnet/hon-lucene-synonyms/blob/8f736da053510911517fcb8a712b1d8ca5c920d2/src/test/resources/solr/collection1/conf/example_solrconfig.xml


On Jun 1, 2016 3:10 PM, "John Bickerstaff"  wrote:

> @Joe:
>
> Is it possible that the jar's package name does not match the entry in the
> sample solrconfig.xml file?
>
> The solrconfig.xml example file in the test directory contains the
> following package name:
> <queryParser name="synonym_edismax"
> class="com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin">
>
> However, the jar file (when unzipped) has the following directory structure
> down to the same class name:
>
> org --> apache --> solr --> search
>
> I just tried with the name change to the org.apache package name in the
> solrconfig.xml file and got no errors.
>
> I haven't yet tried to see synonym "stuff" in the debug for a query, but
> I'm betting it's much ado about nothing - just the package name has
> changed...
>
> If that makes sense to you, you may want to edit the example file...
>
> Thanks a lot for all the work you contributed to this by the way!
>
> --JohnB
>
> @ MaryJo - this may be the problem in your situation for this specific file
> -- good luck!
>
> I put it in $SOLR_HOME/lib  - which, taking the default "for production"
> install script on Ubuntu resolved to /var/solr/data/lib
>
> Good luck!
>
> On Wed, Jun 1, 2016 at 12:49 PM, John Bickerstaff <
> j...@johnbickerstaff.com>
> wrote:
>
> > I tried this - it didn't fail.  I don't know if it really started in
> > Denable.runtime.lib=true mode or not:
> >
> > service solr start -Denable.runtime.lib=true
> >
> > Of course, I'd still really rather be able to just drop jars into
> > /var/solr/data/lib and have them work...
> >
> > Thanks all.
> >
> > On Wed, Jun 1, 2016 at 12:42 PM, John Bickerstaff <
> > j...@johnbickerstaff.com> wrote:
> >
> >> So - the instructions on using the Blob Store API say to use the
> >> Denable.runtime.lib=true option when starting Solr.
> >>
> >> Thing is, I've installed per the "for production" instructions which
> >> gives me an entry in /etc/init.d called solr.
> >>
> >> Two questions.
> >>
> >> To test this can I still use the start.jar in /opt/solr/server as long
> as
> >> I issue the "cloud mode" flag or does that no longer work in 5.x?
> >>
> >> Do I instead have to modify that start script in /etc/init.d ?
> >>
> >> On Wed, Jun 1, 2016 at 10:42 AM, John Bickerstaff <
> >> j...@johnbickerstaff.com> wrote:
> >>
> >>> Ahhh - gotcha.
> >>>
> >>> Well, not sure why it's not picked up - seems lots of other jars are...
> >>> Maybe Joe will comment...
> >>>
> >>> On Wed, Jun 1, 2016 at 10:22 AM, MaryJo Sminkey 
> >>> wrote:
> >>>
>  That refers to running Solr in cloud mode. We aren't there yet.
> 
>  MJ
> 
> 
> 
>  On Wed, Jun 1, 2016 at 12:20 PM, John Bickerstaff <
>  j...@johnbickerstaff.com>
>  wrote:
> 
>  > Hi Mary Jo,
>  >
>  > I'll point you to Joe's earlier comment about needing to use the
> Blob
>  Store
>  > API...  He put a link in his response.
>  >
>  > I'm about to try that today...  Given that Joe is a contributor to
>  > hon_lucene there's a good chance his experience is correct here
> -
>  > especially given the evidence you just provided...
>  >
>  > Here's a copy - paste for your convenience.  It's a bit convoluted,
>  > although I totally get how this kind of approach is great for large
>  Solr
>  > Cloud installations that have machines or VMs coming up and going
>  down as
>  > part of a services-based approach...
>  >
>  > Joe said:
>  > The docs are out of date for the synonym_edismax but it does work.
>  Check
>  > out the tests for working examples. I'll try to update it soon. I've
>  run
>  > the plugin on Solr 5 and 6, solrcloud and standalone. For running in
>  > SolrCloud make sure you follow
>  >
>  >
> 
> https://cwiki.apache.org/confluence/display/solr/Adding+Custom+Plugins+in+SolrCloud+Mode
>  >
>  > On Wed, Jun 1, 2016 at 10:15 AM, MaryJo Sminkey <
> mjsmin...@gmail.com>
>  > wrote:
>  >
>  > > So we still can't get this to work, here's the latest update my
>  server
>  > guy
>  > > gave me: It seems to not matter where the file is located, it does
>  not
>  > > load. Yet, the Solr Java class path shows the file has loaded.
>  Only
>  > > this path (./server/lib/hon-lucene-synonyms-2.0.0.jar) will work
> in
>  that
>  > it
>  > > loads in the java class path.  I've yet to find out what the error
>  is.
>  > All
>  > > I can see is this "Error loading class". Okay, but why? What error
>  was
>  > > encountered in trying to load the class?  I can't find any of this
>  > > information. 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread John Bickerstaff
@Joe:

Is it possible that the jar's package name does not match the entry in the
sample solrconfig.xml file?

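The solrconfig.xml example file in the test directory contains the
following package name:
<queryParser name="synonym_edismax"
class="com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin">
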
However, the jar file (when unzipped) has the following directory structure
down to the same class name:

org --> apache --> solr --> search

I just tried with the name change to the org.apache package name in the
solrconfig.xml file and got no errors.
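
For anyone who wants to confirm which package a given jar really uses before editing solrconfig.xml, here is a minimal sketch of the same check I did by unzipping. It only uses the Python standard library; the function name find_class_packages is made up for illustration and is not part of the plugin or of Solr.

```python
import zipfile


def find_class_packages(jar_path, simple_name):
    """Return the fully qualified names of every class in the jar whose
    simple (unqualified) name equals simple_name."""
    suffix = "/" + simple_name + ".class"
    matches = []
    with zipfile.ZipFile(jar_path) as jar:
        for entry in jar.namelist():
            # Match "some/package/Name.class" or a top-level "Name.class".
            if entry.endswith(suffix) or entry == simple_name + ".class":
                # "org/apache/solr/search/Foo.class" -> "org.apache.solr.search.Foo"
                matches.append(entry[: -len(".class")].replace("/", "."))
    return sorted(matches)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python find_class.py some.jar SomeClassName
    for name in find_class_packages(sys.argv[1], sys.argv[2]):
        print(name)
```

Whatever name it prints is what should go in the class attribute of the queryParser entry for that particular jar.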

I haven't yet tried to see synonym "stuff" in the debug for a query, but
I'm betting it's much ado about nothing - just the package name has
changed...

If that makes sense to you, you may want to edit the example file...

Thanks a lot for all the work you contributed to this by the way!

--JohnB

@ MaryJo - this may be the problem in your situation for this specific file
-- good luck!

I put it in $SOLR_HOME/lib  - which, taking the default "for production"
install script on Ubuntu resolved to /var/solr/data/lib

Good luck!

On Wed, Jun 1, 2016 at 12:49 PM, John Bickerstaff 
wrote:

> I tried this - it didn't fail.  I don't know if it really started in
> Denable.runtime.lib=true mode or not:
>
> service solr start -Denable.runtime.lib=true
>
> Of course, I'd still really rather be able to just drop jars into
> /var/solr/data/lib and have them work...
>
> Thanks all.
>
> On Wed, Jun 1, 2016 at 12:42 PM, John Bickerstaff <
> j...@johnbickerstaff.com> wrote:
>
>> So - the instructions on using the Blob Store API say to use the
>> Denable.runtime.lib=true option when starting Solr.
>>
>> Thing is, I've installed per the "for production" instructions which
>> gives me an entry in /etc/init.d called solr.
>>
>> Two questions.
>>
>> To test this can I still use the start.jar in /opt/solr/server as long as
>> I issue the "cloud mode" flag or does that no longer work in 5.x?
>>
>> Do I instead have to modify that start script in /etc/init.d ?
>>
>> On Wed, Jun 1, 2016 at 10:42 AM, John Bickerstaff <
>> j...@johnbickerstaff.com> wrote:
>>
>>> Ahhh - gotcha.
>>>
>>> Well, not sure why it's not picked up - seems lots of other jars are...
>>> Maybe Joe will comment...
>>>
>>> On Wed, Jun 1, 2016 at 10:22 AM, MaryJo Sminkey 
>>> wrote:
>>>
 That refers to running Solr in cloud mode. We aren't there yet.

 MJ



 On Wed, Jun 1, 2016 at 12:20 PM, John Bickerstaff <
 j...@johnbickerstaff.com>
 wrote:

 > Hi Mary Jo,
 >
 > I'll point you to Joe's earlier comment about needing to use the Blob
 Store
 > API...  He put a link in his response.
 >
 > I'm about to try that today...  Given that Joe is a contributor to
 > hon_lucene there's a good chance his experience is correct here -
 > especially given the evidence you just provided...
 >
 > Here's a copy - paste for your convenience.  It's a bit convoluted,
 > although I totally get how this kind of approach is great for large
 Solr
 > Cloud installations that have machines or VMs coming up and going
 down as
 > part of a services-based approach...
 >
 > Joe said:
 > The docs are out of date for the synonym_edismax but it does work.
 Check
 > out the tests for working examples. I'll try to update it soon. I've
 run
 > the plugin on Solr 5 and 6, solrcloud and standalone. For running in
 > SolrCloud make sure you follow
 >
 >
 https://cwiki.apache.org/confluence/display/solr/Adding+Custom+Plugins+in+SolrCloud+Mode
 >
 > On Wed, Jun 1, 2016 at 10:15 AM, MaryJo Sminkey 
 > wrote:
 >
 > > So we still can't get this to work, here's the latest update my
 server
 > guy
 > > gave me: It seems to not matter where the file is located, it does
 not
 > > load. Yet, the Solr Java class path shows the file has loaded.
 Only
 > > this path (./server/lib/hon-lucene-synonyms-2.0.0.jar) will work in
 that
 > it
 > > loads in the java class path.  I've yet to find out what the error
 is.
 > All
 > > I can see is this "Error loading class". Okay, but why? What error
 was
 > > encountered in trying to load the class?  I can't find any of this
 > > information. I'm trying to work with the documentation that is
 located
 > here
 > > http://wiki.apache.org/solr/SolrPlugins
 > >
 > > I found that the jar file was put into each of these locations in an
 > > attempt to find a place where it will load without error.
 > >
 > > find .|grep hon-lucene
 > >
 > > ./server/lib/hon-lucene-synonyms-2.0.0.jar
 > >
 > > ./server/solr/plugins/hon-lucene-synonyms-2.0.0.jar
 > >
 > > ./server/solr/classic_newdb/lib/hon-lucene-synonyms-2.0.0.jar
 > >
 > > ./server/solr/classic_search/lib/hon-lucene-synonyms-2.0.0.jar
 > >
 > >
 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread John Bickerstaff
I tried this - it didn't fail.  I don't know if it really started in
-Denable.runtime.lib=true mode or not:

service solr start -Denable.runtime.lib=true
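
One way to find out whether the flag actually reached the JVM is to look at the arguments of the running java process. The following is a minimal sketch, assuming a Linux box where /proc/<pid>/cmdline is readable and where you already know the PID of the Solr process; jvm_has_property is a made-up helper name, not a Solr tool.

```python
def jvm_has_property(cmdline, name, value):
    """Return True if the raw bytes of /proc/<pid>/cmdline (NUL-separated
    arguments) contain the JVM argument -D<name>=<value>."""
    wanted = "-D{}={}".format(name, value)
    args = cmdline.decode("utf-8", errors="replace").split("\0")
    return wanted in args


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_flag.py <pid-of-solr-java-process>
    with open("/proc/{}/cmdline".format(sys.argv[1]), "rb") as f:
        print(jvm_has_property(f.read(), "enable.runtime.lib", "true"))
```

If it prints False, the init script probably did not pass the extra argument through. As far as I know the service install reads additional JVM options from SOLR_OPTS in its include file (solr.in.sh), so that may be the more reliable place for the flag, but treat that as an assumption to check against your own install.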

Of course, I'd still really rather be able to just drop jars into
/var/solr/data/lib and have them work...

Thanks all.

On Wed, Jun 1, 2016 at 12:42 PM, John Bickerstaff 
wrote:

> So - the instructions on using the Blob Store API say to use the
> Denable.runtime.lib=true option when starting Solr.
>
> Thing is, I've installed per the "for production" instructions which gives
> me an entry in /etc/init.d called solr.
>
> Two questions.
>
> To test this can I still use the start.jar in /opt/solr/server as long as
> I issue the "cloud mode" flag or does that no longer work in 5.x?
>
> Do I instead have to modify that start script in /etc/init.d ?
>
> On Wed, Jun 1, 2016 at 10:42 AM, John Bickerstaff <
> j...@johnbickerstaff.com> wrote:
>
>> Ahhh - gotcha.
>>
>> Well, not sure why it's not picked up - seems lots of other jars are...
>> Maybe Joe will comment...
>>
>> On Wed, Jun 1, 2016 at 10:22 AM, MaryJo Sminkey 
>> wrote:
>>
>>> That refers to running Solr in cloud mode. We aren't there yet.
>>>
>>> MJ
>>>
>>>
>>>
>>> On Wed, Jun 1, 2016 at 12:20 PM, John Bickerstaff <
>>> j...@johnbickerstaff.com>
>>> wrote:
>>>
>>> > Hi Mary Jo,
>>> >
>>> > I'll point you to Joe's earlier comment about needing to use the Blob
>>> Store
>>> > API...  He put a link in his response.
>>> >
>>> > I'm about to try that today...  Given that Joe is a contributor to
>>> > hon_lucene there's a good chance his experience is correct here -
>>> > especially given the evidence you just provided...
>>> >
>>> > Here's a copy - paste for your convenience.  It's a bit convoluted,
>>> > although I totally get how this kind of approach is great for large
>>> Solr
>>> > Cloud installations that have machines or VMs coming up and going down
>>> as
>>> > part of a services-based approach...
>>> >
>>> > Joe said:
>>> > The docs are out of date for the synonym_edismax but it does work.
>>> Check
>>> > out the tests for working examples. I'll try to update it soon. I've
>>> run
>>> > the plugin on Solr 5 and 6, solrcloud and standalone. For running in
>>> > SolrCloud make sure you follow
>>> >
>>> >
>>> https://cwiki.apache.org/confluence/display/solr/Adding+Custom+Plugins+in+SolrCloud+Mode
>>> >
>>> > On Wed, Jun 1, 2016 at 10:15 AM, MaryJo Sminkey 
>>> > wrote:
>>> >
>>> > > So we still can't get this to work, here's the latest update my
>>> server
>>> > guy
>>> > > gave me: It seems to not matter where the file is located, it does
>>> not
>>> > > load. Yet, the Solr Java class path shows the file has loaded.
>>> Only
>>> > > this path (./server/lib/hon-lucene-synonyms-2.0.0.jar) will work in
>>> that
>>> > it
>>> > > loads in the java class path.  I've yet to find out what the error
>>> is.
>>> > All
>>> > > I can see is this "Error loading class". Okay, but why? What error
>>> was
>>> > > encountered in trying to load the class?  I can't find any of this
>>> > > information. I'm trying to work with the documentation that is
>>> located
>>> > here
>>> > > http://wiki.apache.org/solr/SolrPlugins
>>> > >
>>> > > I found that the jar file was put into each of these locations in an
>>> > > attempt to find a place where it will load without error.
>>> > >
>>> > > find .|grep hon-lucene
>>> > >
>>> > > ./server/lib/hon-lucene-synonyms-2.0.0.jar
>>> > >
>>> > > ./server/solr/plugins/hon-lucene-synonyms-2.0.0.jar
>>> > >
>>> > > ./server/solr/classic_newdb/lib/hon-lucene-synonyms-2.0.0.jar
>>> > >
>>> > > ./server/solr/classic_search/lib/hon-lucene-synonyms-2.0.0.jar
>>> > >
>>> > > ./server/solr-webapp/webapp/WEB-INF/lib/hon-lucene-synonyms-2.0.0.jar
>>> > >
>>> > >  The config specifies that files in certain paths can be loaded as
>>> > plugins
>>> > > or I can specify a path. Following the instructions I added this path
>>> > >
>>> > >   <lib dir="${solr.install.dir:../../../..}/contrib/hon-lucene-synonyms/lib"
>>> > > regex=".*\.jar" />
>>> > >
>>> > > And I put the jar file in that location.  This did not work either. I
>>> > also
>>> > > tried using an absolute path like this.
>>> > >
>>> > > <lib
>>> > > dir="/opt/solr/contrib/hon-lucene-synonyms/lib/hon-lucene-synonyms-2.0.0.jar"
>>> > > />
>>> > >
>>> > > This did not work.
>>> > >
>>> > >
>>> > >
>>> > > I'm starting to think this isn't a configuration problem, but a
>>> > > compatibility problem. I have not seen anything from the maker of
>>> this
>>> > > plugin that it works on the exact version of Solr we are using.
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > The best info I have found so far in the logs is this stack trace of
>>> the
>>> > > error. It still does not say why it failed to load.
>>> > >
>>> > > 2016-06-01 00:22:13.470 ERROR (qtp2096057945-14) [   ] o.a.s.s.HttpSolrCall
>>> > > null:org.apache.solr.common.SolrException: 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread John Bickerstaff
 >> >> > > time.
> >> >> > > >
> >> >> > > > There isn't any explicit listing of "text_autophrase" as the
> >> default
> >> >> > > search
> >> >> > > > field in the /autophrase search handler
> >> >> > > >
> >> >> > > > There isn't any explicit statement of "df=text_autophrase" in
> the
> >> >> query
> >> >> > > > statement: [/autophrase?q=New+York]
> >> >> > > >
> >> >> > > > Therefore it seems to me that if someone tries to implement
> this,
> >> >> > they're
> >> >> > > > going to be disappointed in the results unless they:
> >> >> > > > a. copy or otherwise get ALL the text they're interested in --
> >> into
> >> >> the
> >> >> > > > "text_autophrase" field as part of the schema.xml setup (to
> >> happen at
> >> >> > > index
> >> >> > > > time)
> >> >> > > > b. somehow explicitly declare "text_autophrase" as the default
> >> search
> >> >> > > field
> >> >> > > > - either in the searchHandler or wherever else the default
> field
> >> is
> >> >> > > > configured.
> >> >> > > >
> >> >> > > > If anyone out there has done this specific approach - could you
> >> >> > validate
> >> >> > > > whether my thought process is correct and / or if I'm missing
> >> >> > something?
> >> >> > > > Yes - I get that I can set it all up and try - but it's what I
> >> don't
> >> >> > > know I
> >> >> > > > don't know that bothers me...
> >> >> > > >
> >> >> > > > On Fri, May 27, 2016 at 11:57 AM, John Bickerstaff <
> >> >> > > > j...@johnbickerstaff.com
> >> >> > > > > wrote:
> >> >> > > >
> >> >> > > > > Thank you Steve -- very helpful.
> >> >> > > > >
> >> >> > > > > I can see that whatever implementation I decide to try, some
> >> >> testing
> >> >> > > will
> >> >> > > > > be in order.  If anyone is aware of significant gotchas with
> >> this
> >> >> > > synonym
> >> >> > > > > thing that are not mentioned in the already-listed URLs,
> please
> >> >> feel
> >> >> > > free
> >> >> > > > > to comment.
> >> >> > > > >
> >> >> > > > > On Fri, May 27, 2016 at 10:28 AM, Steve Rowe <
> sar...@gmail.com>
> >> >> > wrote:
> >> >> > > > >
> >> >> > > > >> I’m working on addressing problems using multi-term
> synonyms at
> >> >> > query
> >> >> > > > >> time in Lucene and Solr.
> >> >> > > > >>
> >> >> > > > >> I recommend these two blogs for understanding the issues
> (the
> >> >> second
> >> >> > > one
> >> >> > > > >> was mentioned earlier in this thread):
> >> >> > > > >>
> >> >> > > > >> <
> >> >> > > > >>
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >>
> http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
> >> >> > > > >> >
> >> >> > > > >> <
> >> >> >
> https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/>
> >> >> > > > >>
> >> >> > > > >> In addition to the already-mentioned projects, there is
> also:
> >> >> > > > >>
> >> >> > > > >> <https://issues.apache.org/jira/browse/SOLR-5379>
> >> >> > > > >>
> >> >> > > > >> All of these projects try in various ways to work around the
> >> fact
> >> >> > that
> >> >>

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread Jeff Wartes
>> >> > > > field is configured.
>> >> > > >
>> >> > > > If anyone out there has done this specific approach - could you
>> >> > validate
>> >> > > > whether my thought process is correct and / or if I'm missing
>> >> > something?
>> >> > > > Yes - I get that I can set it all up and try - but it's what I
>> don't
>> >> > > know I
>> >> > > > don't know that bothers me...
>> >> > > >
>> >> > > > On Fri, May 27, 2016 at 11:57 AM, John Bickerstaff <
>> >> > > > j...@johnbickerstaff.com
>> >> > > > > wrote:
>> >> > > >
>> >> > > > > Thank you Steve -- very helpful.
>> >> > > > >
>> >> > > > > I can see that whatever implementation I decide to try, some
>> >> testing
>> >> > > will
>> >> > > > > be in order.  If anyone is aware of significant gotchas with
>> this
>> >> > > synonym
>> >> > > > > thing that are not mentioned in the already-listed URLs, please
>> >> feel
>> >> > > free
>> >> > > > > to comment.
>> >> > > > >
>> >> > > > > On Fri, May 27, 2016 at 10:28 AM, Steve Rowe <sar...@gmail.com>
>> >> > wrote:
>> >> > > > >
>> >> > > > >> I’m working on addressing problems using multi-term synonyms at
>> >> > query
>> >> > > > >> time in Lucene and Solr.
>> >> > > > >>
>> >> > > > >> I recommend these two blogs for understanding the issues (the
>> >> second
>> >> > > one
>> >> > > > >> was mentioned earlier in this thread):
>> >> > > > >>
>> >> > > > >> <
>> >> > > > >>
>> >> > > >
>> >> > >
>> >> >
>> >>
>> http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
>> >> > > > >> >
>> >> > > > >> <
>> >> > https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/>
>> >> > > > >>
>> >> > > > >> In addition to the already-mentioned projects, there is also:
>> >> > > > >>
>> >> > > > >> <https://issues.apache.org/jira/browse/SOLR-5379>
>> >> > > > >>
>> >> > > > >> All of these projects try in various ways to work around the
>> fact
>> >> > that
>> >> > > > >> Lucene’s QueryParser splits on whitespace before sending text
>> to
>> >> > > > analysis,
>> >> > > > >> one token at a time, so in a synonym filter, multi-word
>> synonyms
>> >> can
>> >> > > > never
>> >> > > > >> match and add alternatives.  See <
>> >> > > > >> https://issues.apache.org/jira/browse/LUCENE-2605>, where I’ve
>> >> > > posted a
>> >> > > > >> patch to directly address that problem - note that it’s still a
>> >> work
>> >> > > in
>> >> > > > >> progress.
>> >> > > > >>
>> >> > > > >> Once LUCENE-2605 has been fixed, there is still work to do
>> getting
>> >> > > > >> (e)dismax to work with the modified Lucene QueryParser, and
>> >> > addressing
>> >> > > > >> problems with how queries are constructed from Lucene’s
>> >> “sausagized”
>> >> > > > token
>> >> > > > >> stream.
>> >> > > > >>
>> >> > > > >> --
>> >> > > > >> Steve
>> >> > > > >> www.lucidworks.com
>> >> > > > >>
>> >> > > > >> > On May 26, 2016, at 2:21 PM, John Bickerstaff <
>> >> > > > j...@johnbickerstaff.com>
>> >> > > > >> wrote:
>> >> > > > >> >
>> >> > > > >> > Thanks Chris --
>> >> >

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread John Bickerstaff
So - the instructions on using the Blob Store API say to use the
-Denable.runtime.lib=true option when starting Solr.

Thing is, I've installed per the "for production" instructions which gives
me an entry in /etc/init.d called solr.

Two questions.

To test this can I still use the start.jar in /opt/solr/server as long as I
issue the "cloud mode" flag or does that no longer work in 5.x?

Do I instead have to modify that start script in /etc/init.d ?

On Wed, Jun 1, 2016 at 10:42 AM, John Bickerstaff 
wrote:

> Ahhh - gotcha.
>
> Well, not sure why it's not picked up - seems lots of other jars are...
> Maybe Joe will comment...
>
> On Wed, Jun 1, 2016 at 10:22 AM, MaryJo Sminkey 
> wrote:
>
>> That refers to running Solr in cloud mode. We aren't there yet.
>>
>> MJ
>>
>>
>>
>> On Wed, Jun 1, 2016 at 12:20 PM, John Bickerstaff <
>> j...@johnbickerstaff.com>
>> wrote:
>>
>> > Hi Mary Jo,
>> >
>> > I'll point you to Joe's earlier comment about needing to use the Blob
>> Store
>> > API...  He put a link in his response.
>> >
>> > I'm about to try that today...  Given that Joe is a contributor to
>> > hon_lucene there's a good chance his experience is correct here -
>> > especially given the evidence you just provided...
>> >
>> > Here's a copy - paste for your convenience.  It's a bit convoluted,
>> > although I totally get how this kind of approach is great for large Solr
>> > Cloud installations that have machines or VMs coming up and going down
>> as
>> > part of a services-based approach...
>> >
>> > Joe said:
>> > The docs are out of date for the synonym_edismax but it does work. Check
>> > out the tests for working examples. I'll try to update it soon. I've run
>> > the plugin on Solr 5 and 6, solrcloud and standalone. For running in
>> > SolrCloud make sure you follow
>> >
>> >
>> https://cwiki.apache.org/confluence/display/solr/Adding+Custom+Plugins+in+SolrCloud+Mode
>> >
>> > On Wed, Jun 1, 2016 at 10:15 AM, MaryJo Sminkey 
>> > wrote:
>> >
>> > > So we still can't get this to work, here's the latest update my server
>> > guy
>> > > gave me: It seems to not matter where the file is located, it does not
>> > > load. Yet, the Solr Java class path shows the file has loaded.
>> Only
>> > > this path (./server/lib/hon-lucene-synonyms-2.0.0.jar) will work in
>> that
>> > it
>> > > loads in the java class path.  I've yet to find out what the error is.
>> > All
>> > > I can see is this "Error loading class". Okay, but why? What error was
>> > > encountered in trying to load the class?  I can't find any of this
>> > > information. I'm trying to work with the documentation that is located
>> > here
>> > > http://wiki.apache.org/solr/SolrPlugins
>> > >
>> > > I found that the jar file was put into each of these locations in an
>> > > attempt to find a place where it will load without error.
>> > >
>> > > find .|grep hon-lucene
>> > >
>> > > ./server/lib/hon-lucene-synonyms-2.0.0.jar
>> > >
>> > > ./server/solr/plugins/hon-lucene-synonyms-2.0.0.jar
>> > >
>> > > ./server/solr/classic_newdb/lib/hon-lucene-synonyms-2.0.0.jar
>> > >
>> > > ./server/solr/classic_search/lib/hon-lucene-synonyms-2.0.0.jar
>> > >
>> > > ./server/solr-webapp/webapp/WEB-INF/lib/hon-lucene-synonyms-2.0.0.jar
>> > >
>> > >  The config specifies that files in certain paths can be loaded as
>> > plugins
>> > > or I can specify a path. Following the instructions I added this path
>> > >
>> > >   <lib dir="${solr.install.dir:../../../..}/contrib/hon-lucene-synonyms/lib"
>> > > regex=".*\.jar" />
>> > >
>> > > And I put the jar file in that location.  This did not work either. I
>> > also
>> > > tried using an absolute path like this.
>> > >
>> > > <lib
>> > > dir="/opt/solr/contrib/hon-lucene-synonyms/lib/hon-lucene-synonyms-2.0.0.jar"
>> > > />
>> > >
>> > > This did not work.
>> > >
>> > >
>> > >
>> > > I'm starting to think this isn't a configuration problem, but a
>> > > compatibility problem. I have not seen anything from the maker of this
>> > > plugin that it works on the exact version of Solr we are using.
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > The best info I have found so far in the logs is this stack trace of
>> the
>> > > error. It still does not say why it failed to load.
>> > >
>> > > 2016-06-01 00:22:13.470 ERROR (qtp2096057945-14) [   ] o.a.s.s.HttpSolrCall
>> > > null:org.apache.solr.common.SolrException: SolrCore 'classic_search' is not
>> > > available due to init failure: Error loading class
>> > > 'com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin'
>> > >
>> > > at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:993)
>> > > at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:249)
>> > > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:411)
>> > >
>> > > at
>> > >
>> > >
>> >
>> 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread John Bickerstaff
Ahhh - gotcha.

Well, not sure why it's not picked up - seems lots of other jars are...
Maybe Joe will comment...
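
One thing that might be worth ruling out in the dir= entries quoted below: as I understand the solrconfig.xml lib directive, dir is meant to name a directory (optionally filtered by regex), while a single jar is normally referenced with the path attribute instead. The second entry points dir at the jar file itself. Here is a minimal sketch that flags entries of that shape; suspicious_lib_entries is a made-up name, and the check is only a heuristic on the attribute text, it does not resolve ${...} properties or touch the filesystem.

```python
import xml.etree.ElementTree as ET


def suspicious_lib_entries(solrconfig_xml):
    """Return the dir values of <lib> elements whose dir attribute names a
    .jar file rather than a directory."""
    root = ET.fromstring(solrconfig_xml)
    flagged = []
    for lib in root.iter("lib"):
        directory = lib.get("dir")
        if directory is not None and directory.lower().endswith(".jar"):
            flagged.append(directory)
    return flagged


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_lib.py /path/to/solrconfig.xml
    with open(sys.argv[1], encoding="utf-8") as f:
        for entry in suspicious_lib_entries(f.read()):
            print("dir points at a jar file:", entry)
```

This would not explain the "Error loading class" message on its own, but it removes one variable before looking at anything else.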

On Wed, Jun 1, 2016 at 10:22 AM, MaryJo Sminkey  wrote:

> That refers to running Solr in cloud mode. We aren't there yet.
>
> MJ
>
>
>
> On Wed, Jun 1, 2016 at 12:20 PM, John Bickerstaff <
> j...@johnbickerstaff.com>
> wrote:
>
> > Hi Mary Jo,
> >
> > I'll point you to Joe's earlier comment about needing to use the Blob
> Store
> > API...  He put a link in his response.
> >
> > I'm about to try that today...  Given that Joe is a contributor to
> > hon_lucene there's a good chance his experience is correct here -
> > especially given the evidence you just provided...
> >
> > Here's a copy - paste for your convenience.  It's a bit convoluted,
> > although I totally get how this kind of approach is great for large Solr
> > Cloud installations that have machines or VMs coming up and going down as
> > part of a services-based approach...
> >
> > Joe said:
> > The docs are out of date for the synonym_edismax but it does work. Check
> > out the tests for working examples. I'll try to update it soon. I've run
> > the plugin on Solr 5 and 6, solrcloud and standalone. For running in
> > SolrCloud make sure you follow
> >
> >
> https://cwiki.apache.org/confluence/display/solr/Adding+Custom+Plugins+in+SolrCloud+Mode
> >
> > On Wed, Jun 1, 2016 at 10:15 AM, MaryJo Sminkey 
> > wrote:
> >
> > > So we still can't get this to work, here's the latest update my server
> > guy
> > > gave me: It seems to not matter where the file is located, it does not
> > > > load. Yet, the Solr Java class path shows the file has loaded.
> Only
> > > this path (./server/lib/hon-lucene-synonyms-2.0.0.jar) will work in
> that
> > it
> > > loads in the java class path.  I've yet to find out what the error is.
> > All
> > > I can see is this "Error loading class". Okay, but why? What error was
> > > encountered in trying to load the class?  I can't find any of this
> > > information. I'm trying to work with the documentation that is located
> > here
> > > http://wiki.apache.org/solr/SolrPlugins
> > >
> > > I found that the jar file was put into each of these locations in an
> > > attempt to find a place where it will load without error.
> > >
> > > find .|grep hon-lucene
> > >
> > > ./server/lib/hon-lucene-synonyms-2.0.0.jar
> > >
> > > ./server/solr/plugins/hon-lucene-synonyms-2.0.0.jar
> > >
> > > ./server/solr/classic_newdb/lib/hon-lucene-synonyms-2.0.0.jar
> > >
> > > ./server/solr/classic_search/lib/hon-lucene-synonyms-2.0.0.jar
> > >
> > > ./server/solr-webapp/webapp/WEB-INF/lib/hon-lucene-synonyms-2.0.0.jar
> > >
> > >  The config specifies that files in certain paths can be loaded as
> > plugins
> > > or I can specify a path. Following the instructions I added this path
> > >
> > >> > dir="${solr.install.dir:../../../..}/contrib/hon-lucene-synonyms/lib"
> > > regex=".*\.jar" />
> > >
> > > And I put the jar file in that location.  This did not work either. I
> > also
> > > tried using an absolute path like this.
> > >
> > >  > >
> > >
> >
> dir="/opt/solr/contrib/hon-lucene-synonyms/lib/hon-lucene-synonyms-2.0.0.jar"
> > > />
> > >
> > > This did not work.
> > >
> > >
> > >
> > > I'm starting to think this isn't a configuration problem, but a
> > > compatibility problem. I have not seen anything from the maker of this
> > > plugin that it works on the exact version of Solr we are using.
> > >
> > >
> > >
> > >
> > >
> > > The best info I have found so far in the logs is this stack trace of
> the
> > > error. It still does not say why it failed to load.
> > >
> > > 2016-06-01 00:22:13.470 ERROR (qtp2096057945-14) [   ]
> > o.a.s.s.HttpSolrCall
> > > null:org.apache.solr.common.SolrException: SolrCore 'classic_search' is
> > not
> > > available due to init failure: Error loading class
> > > 'com.github.healthonnet.search.Syno
> > >
> > > nymExpandingExtendedDismaxQParserPlugin'
> > >
> > > at
> > > org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:993)
> > >
> > > at
> > org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:249)
> > >
> > > at
> > org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:411)
> > >
> > > at
> > >
> > >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:222)
> > >
> > > at
> > >
> > >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:181)
> > >
> > > at
> > >
> > >
> >
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
> > >
> > > at
> > >
> >
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
> > >
> > > at
> > >
> > >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > >
> > > at
> > >
> >
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
> > >
> > > 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread MaryJo Sminkey
That refers to running Solr in cloud mode. We aren't there yet.

MJ



On Wed, Jun 1, 2016 at 12:20 PM, John Bickerstaff 
wrote:

> Hi Mary Jo,
>
> I'll point you to Joe's earlier comment about needing to use the Blob Store
> API...  He put a link in his response.
>
> I'm about to try that today...  Given that Joe is a contributor to
> hon_lucene there's a good chance his experience is correct here -
> especially given the evidence you just provided...
>
> Here's a copy - paste for your convenience.  It's a bit convoluted,
> although I totally get how this kind of approach is great for large Solr
> Cloud installations that have machines or VMs coming up and going down as
> part of a services-based approach...
>
> Joe said:
> The docs are out of date for the synonym_edismax but it does work. Check
> out the tests for working examples. I'll try to update it soon. I've run
> the plugin on Solr 5 and 6, solrcloud and standalone. For running in
> SolrCloud make sure you follow
>
> https://cwiki.apache.org/confluence/display/solr/Adding+Custom+Plugins+in+SolrCloud+Mode

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread John Bickerstaff
Hi Mary Jo,

I'll point you to Joe's earlier comment about needing to use the Blob Store
API...  He put a link in his response.

I'm about to try that today...  Given that Joe is a contributor to
hon_lucene there's a good chance his experience is correct here -
especially given the evidence you just provided...

Here's a copy - paste for your convenience.  It's a bit convoluted,
although I totally get how this kind of approach is great for large Solr
Cloud installations that have machines or VMs coming up and going down as
part of a services-based approach...

Joe said:
The docs are out of date for the synonym_edismax but it does work. Check
out the tests for working examples. I'll try to update it soon. I've run
the plugin on Solr 5 and 6, solrcloud and standalone. For running in
SolrCloud make sure you follow
https://cwiki.apache.org/confluence/display/solr/Adding+Custom+Plugins+in+SolrCloud+Mode


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread MaryJo Sminkey
So we still can't get this to work. Here's the latest update my server guy
gave me: It seems not to matter where the file is located; it does not
load. Yet the Solr Java class path shows the file has loaded.  Only
this path (./server/lib/hon-lucene-synonyms-2.0.0.jar) will work, in that it
loads in the Java class path.  I've yet to find out what the error is. All
I can see is this "Error loading class". Okay, but why? What error was
encountered in trying to load the class?  I can't find any of this
information. I'm trying to work with the documentation that is located here:
http://wiki.apache.org/solr/SolrPlugins

I found that the jar file was put into each of these locations in an
attempt to find a place where it will load without error.

find .|grep hon-lucene

./server/lib/hon-lucene-synonyms-2.0.0.jar

./server/solr/plugins/hon-lucene-synonyms-2.0.0.jar

./server/solr/classic_newdb/lib/hon-lucene-synonyms-2.0.0.jar

./server/solr/classic_search/lib/hon-lucene-synonyms-2.0.0.jar

./server/solr-webapp/webapp/WEB-INF/lib/hon-lucene-synonyms-2.0.0.jar
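
One way to rule out a packaging problem before chasing class-path issues is
to check that each of those jar copies really contains the class named in the
"Error loading class" message. The following is only a rough sketch: the
function names are made up for illustration, and it assumes the class would
be stored in the jar under its usual package path.

```python
import zipfile
from pathlib import Path

# Class named in the "Error loading class" message from the Solr log.
PLUGIN_CLASS = "com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin"


def jar_contains_class(jar_path, class_name=PLUGIN_CLASS):
    """Return True if the jar holds a .class entry for class_name."""
    # A jar is a zip archive; classes are stored as package/path/Name.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


def check_jars(root, pattern="hon-lucene-synonyms-*.jar"):
    """Map every matching jar under root to whether it contains the class."""
    return {str(p): jar_contains_class(p) for p in sorted(Path(root).rglob(pattern))}
```

Calling check_jars(".") from the Solr install directory would cover the same
files as the find command. If every copy reports True, the jar itself is
probably fine and the cause is more likely a missing dependency or a version
mismatch, which this check cannot detect.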

The config specifies that files in certain paths can be loaded as plugins
or I can specify a path. Following the instructions I added this path:

<lib dir="${solr.install.dir:../../../..}/contrib/hon-lucene-synonyms/lib" regex=".*\.jar" />

And I put the jar file in that location.  This did not work either. I also
tried using an absolute path like this:

<lib dir="/opt/solr/contrib/hon-lucene-synonyms/lib/hon-lucene-synonyms-2.0.0.jar" />

This did not work.
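
A possible factor in the absolute-path attempt: it set
dir="/opt/solr/contrib/hon-lucene-synonyms/lib/hon-lucene-synonyms-2.0.0.jar",
i.e. a jar file rather than a directory. As far as I recall from the comments
in the stock solrconfig.xml, dir names a directory (optionally filtered with
regex) and a single jar is referenced with path instead; please verify that
against the Solr version in use. Under that assumption, a small sketch like
this (the function name is hypothetical) flags lib entries whose dir looks
like a file:

```python
import xml.etree.ElementTree as ET


def suspicious_lib_directives(solrconfig_xml):
    """Return dir values of <lib> elements that end in .jar.

    Assumption: dir is meant to name a directory, so a value ending in
    .jar probably should have been written as path="...".
    """
    root = ET.fromstring(solrconfig_xml)
    return [
        lib.get("dir")
        for lib in root.iter("lib")
        if lib.get("dir", "").lower().endswith(".jar")
    ]
```

Passing the text of solrconfig.xml to suspicious_lib_directives would list
the second entry above and leave the regex-based one alone. It only inspects
the attribute text; it does not resolve ${...} properties or touch the file
system.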



I'm starting to think this isn't a configuration problem, but a
compatibility problem. I have not seen anything from the maker of this
plugin that it works on the exact version of Solr we are using.





The best info I have found so far in the logs is this stack trace of the
error. It still does not say why it failed to load.

2016-06-01 00:22:13.470 ERROR (qtp2096057945-14) [   ] o.a.s.s.HttpSolrCall
null:org.apache.solr.common.SolrException: SolrCore 'classic_search' is not
available due to init failure: Error loading class
'com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin'
    at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:993)
    at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:249)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:411)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:222)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:181)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:499)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: Error loading class
'com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin'
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:824)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:665)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:742)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:462)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:453)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:232)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at
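
The trace as posted is cut off before the end of the chain. In Java traces
generally, the last "Caused by:" entry tends to name the underlying problem
(for a plugin that fails to load, often a ClassNotFoundException or
NoClassDefFoundError for some other class it depends on), so it may be worth
pulling just those lines out of the full log. A minimal sketch, with
made-up function names and no assumption about where the Solr log lives:

```python
import re

# Matches "Caused by: ..." at the start of a line, allowing leading spaces/tabs.
CAUSED_BY = re.compile(r"^[ \t]*Caused by:[ \t]*(.+)$", re.MULTILINE)


def caused_by_chain(log_text):
    """Return the 'Caused by:' messages in log order (outermost first)."""
    return [m.strip() for m in CAUSED_BY.findall(log_text)]


def root_cause(log_text):
    """Return the last 'Caused by:' message, or None if there is none."""
    chain = caused_by_chain(log_text)
    return chain[-1] if chain else None
```

Reading the whole log file into a string and calling root_cause on it would
give the innermost cause in the file. Messages that wrap onto a second line,
like the one above, are returned only up to the line break.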

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread John Bickerstaff
Thanks Shawn

Yup - I created a /lib inside my $SOLR_HOME directory (which by default was
/var/solr/data)

I put the hon_lucene jar file in there and rebooted - same errors
about class not found.

Tried again in what looked like the next most obvious spot
server/solr-webapp/webapp/WEB-INF/lib

Same result...  Class not found.

I'll go back and triple check

Joe - is that recommendation of using the Blob Store API an absolute?  I
know my IT guys are going to want to have the signing - it would be a lot
easier to just drop in jars we care about without worrying about the
signing.  Yes - I'm being lazy, I know. 

Thanks all!

On Tue, May 31, 2016 at 11:35 PM, Shawn Heisey  wrote:

> On 5/31/2016 3:13 PM, John Bickerstaff wrote:
> > The suggestion on the readme is that I can drop the
> > hon_lucene_synonyms jar file into the $SOLR_HOME directory, but this
> > does not seem to be working - I'm getting class not found exceptions.
>
> What I typically do with *all* extra jars (dataimport, mysql, ICU jars,
> etc) is put them into $SOLR_HOME/lib ... a directory that you will
> usually need to create.  If the installer script is used with default
> options, that directory will be /var/solr/data/lib.
>
> Any jar that you place in that directory will be loaded once at Solr
> startup and available to all cores.  The best thing about this directory
> is that it requires zero configuration.
>
> For 5.3 and later, loading jars into
> server/solr-webapp/webapp/WEB-INF/lib should also work, but then you are
> modifying the actual Solr install, which I normally avoid because it
> makes it a little bit harder to upgrade Solr.
>
> > Does anyone on this list have direct experience with getting this
> > plugin to work in Solr 5.x?
>
> I don't have any experience with that specific plugin, but I have
> successfully used other plugin jars with the lib directory mentioned above.
>
> Thanks,
> Shawn
>
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread Shawn Heisey
On 5/31/2016 3:13 PM, John Bickerstaff wrote:
> The suggestion on the readme is that I can drop the
> hon_lucene_synonyms jar file into the $SOLR_HOME directory, but this
> does not seem to be working - I'm getting class not found exceptions. 

What I typically do with *all* extra jars (dataimport, mysql, ICU jars,
etc) is put them into $SOLR_HOME/lib ... a directory that you will
usually need to create.  If the installer script is used with default
options, that directory will be /var/solr/data/lib.

Any jar that you place in that directory will be loaded once at Solr
startup and available to all cores.  The best thing about this directory
is that it requires zero configuration.

For 5.3 and later, loading jars into
server/solr-webapp/webapp/WEB-INF/lib should also work, but then you are
modifying the actual Solr install, which I normally avoid because it
makes it a little bit harder to upgrade Solr.
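
For anyone scripting this, the $SOLR_HOME/lib approach could be sketched
roughly as below. The function name is invented for illustration, and the
sketch does not deal with file ownership or permissions for the user Solr
runs as, nor with restarting Solr afterwards.

```python
import shutil
from pathlib import Path


def install_plugin_jar(jar_path, solr_home):
    """Copy jar_path into <solr_home>/lib, creating lib/ if it is missing."""
    lib_dir = Path(solr_home) / "lib"
    # The lib directory usually does not exist yet, so create it on demand.
    lib_dir.mkdir(parents=True, exist_ok=True)
    target = lib_dir / Path(jar_path).name
    shutil.copy2(jar_path, target)
    return target
```

With the default installer layout mentioned above, a call such as
install_plugin_jar("hon-lucene-synonyms-2.0.0.jar", "/var/solr/data") would
place the jar in /var/solr/data/lib; Solr then needs a restart to pick it up.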

> Does anyone on this list have direct experience with getting this
> plugin to work in Solr 5.x? 

I don't have any experience with that specific plugin, but I have
successfully used other plugin jars with the lib directory mentioned above.

Thanks,
Shawn



Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread John Bickerstaff
>> >> > > > If anyone out there has done this specific approach - could you
>> >> > validate
>> >> > > > whether my thought process is correct and / or if I'm missing
>> >> > something?
>> >> > > > Yes - I get that I can set it all up and try - but it's what I
>> don't
>> >> > > know I
>> >> > > > don't know that bothers me...
>> >> > > >
>> >> > > > On Fri, May 27, 2016 at 11:57 AM, John Bickerstaff <
>> >> > > > j...@johnbickerstaff.com
>> >> > > > > wrote:
>> >> > > >
>> >> > > > > Thank you Steve -- very helpful.
>> >> > > > >
>> >> > > > > I can see that whatever implementation I decide to try, some
>> >> testing
>> >> > > will
>> >> > > > > be in order.  If anyone is aware of significant gotchas with
>> this
>> >> > > synonym
>> >> > > > > thing that are not mentioned in the already-listed URLs, please
>> >> feel
>> >> > > free
>> >> > > > > to comment.
>> >> > > > >
>> >> > > > > On Fri, May 27, 2016 at 10:28 AM, Steve Rowe <sar...@gmail.com
>> >
>> >> > wrote:
>> >> > > > >
>> >> > > > >> I’m working on addressing problems using multi-term synonyms
>> at
>> >> > query
>> >> > > > >> time in Lucene and Solr.
>> >> > > > >>
>> >> > > > >> I recommend these two blogs for understanding the issues (the
>> >> second
>> >> > > one
>> >> > > > >> was mentioned earlier in this thread):
>> >> > > > >>
>> >> > > > >> <http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html>
>> >> > > > >> <https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/>
>> >> > > > >>
>> >> > > > >> In addition to the already-mentioned projects, there is also:
>> >> > > > >>
>> >> > > > >> <https://issues.apache.org/jira/browse/SOLR-5379>
>> >> > > > >>
>> >> > > > >> All of these projects try in various ways to work around the
>> fact
>> >> > that
>> >> > > > >> Lucene’s QueryParser splits on whitespace before sending text
>> to
>> >> > > > analysis,
>> >> > > > >> one token at a time, so in a synonym filter, multi-word
>> synonyms
>> >> can
>> >> > > > never
>> >> > > > >> match and add alternatives.  See <
>> >> > > > >> https://issues.apache.org/jira/browse/LUCENE-2605>, where
>> I’ve
>> >> > > posted a
>> >> > > > >> patch to directly address that problem - note that it’s still
>> a
>> >> work
>> >> > > in
>> >> > > > >> progress.
>> >> > > > >>
>> >> > > > >> Once LUCENE-2605 has been fixed, there is still work to do
>> getting
>> >> > > > >> (e)dismax to work with the modified Lucene QueryParser, and
>> >> > addressing
>> >> > > > >> problems with how queries are constructed from Lucene’s
>> >> “sausagized”
>> >> > > > token
>> >> > > > >> stream.
>> >> > > > >>
>> >> > > > >> --
>> >> > > > >> Steve
>> >> > > > >> www.lucidworks.com
>> >> > > > >>
>> >> > > > >> > On May 26, 2016, at 2:21 PM, John Bickerstaff <
>> >> > > > j...@johnbickerstaff.com>
>> >> > > > >> wrote:
>> >> > > > >> >
>> >> > > > >> > Thanks Chris --
>> >> > > > >> >
>> >> > > > 
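
Steve's point quoted above, that the query parser splits on whitespace
before analysis so a synonym filter only ever sees one token at a time, can
be illustrated with a toy model. This is not Lucene code; the synonym table
and function names are made up purely to show why a multi-word entry can
never match under per-token analysis.

```python
# Toy synonym table: one multi-word entry and one single-word entry (invented).
SYNONYMS = {
    ("dog", "food"): [("kibble",)],
    ("tv",): [("television",)],
}


def expand(tokens):
    """Return alternatives for every synonym key found as a run in tokens."""
    found = []
    for start in range(len(tokens)):
        for key, alternatives in SYNONYMS.items():
            if tuple(tokens[start:start + len(key)]) == key:
                found.extend(alternatives)
    return found


def analyze_per_token(query):
    """Split on whitespace first, then analyze each token in isolation."""
    found = []
    for token in query.split():
        found.extend(expand([token]))
    return found


def analyze_whole(query):
    """Analyze the full token sequence in a single pass."""
    return expand(query.split())
```

Here analyze_per_token("dog food") finds nothing, because the filter is
handed "dog" and "food" separately, while analyze_whole("dog food") finds
the "kibble" alternative. Single-word entries such as "tv" match either way.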

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread John Bickerstaff
> >> > > > > On Fri, May 27, 2016 at 10:28 AM, Steve Rowe <sar...@gmail.com>
> >> > wrote:
> >> > > > >
> >> > > > >> I’m working on addressing problems using multi-term synonyms at
> >> > query
> >> > > > >> time in Lucene and Solr.
> >> > > > >>
> >> > > > >> I recommend these two blogs for understanding the issues (the
> >> second
> >> > > one
> >> > > > >> was mentioned earlier in this thread):
> >> > > > >>
> >> > > > >> <http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html>
> >> > > > >> <https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/>
> >> > > > >>
> >> > > > >> In addition to the already-mentioned projects, there is also:
> >> > > > >>
> >> > > > >> <https://issues.apache.org/jira/browse/SOLR-5379>
> >> > > > >>
> >> > > > >> All of these projects try in various ways to work around the
> fact
> >> > that
> >> > > > >> Lucene’s QueryParser splits on whitespace before sending text
> to
> >> > > > analysis,
> >> > > > >> one token at a time, so in a synonym filter, multi-word
> synonyms
> >> can
> >> > > > never
> >> > > > >> match and add alternatives.  See <
> >> > > > >> https://issues.apache.org/jira/browse/LUCENE-2605>, where I’ve
> >> > > posted a
> >> > > > >> patch to directly address that problem - note that it’s still a
> >> work
> >> > > in
> >> > > > >> progress.
> >> > > > >>
> >> > > > >> Once LUCENE-2605 has been fixed, there is still work to do
> getting
> >> > > > >> (e)dismax to work with the modified Lucene QueryParser, and
> >> > addressing
> >> > > > >> problems with how queries are constructed from Lucene’s
> >> “sausagized”
> >> > > > token
> >> > > > >> stream.
> >> > > > >>
> >> > > > >> --
> >> > > > >> Steve
> >> > > > >> www.lucidworks.com
> >> > > > >>
> >> > > > >> > On May 26, 2016, at 2:21 PM, John Bickerstaff <
> >> > > > j...@johnbickerstaff.com>
> >> > > > >> wrote:
> >> > > > >> >
> >> > > > >> > Thanks Chris --
> >> > > > >> >
> >> > > > >> > The two projects I'm aware of are:
> >> > > > >> >
> >> > > > >> > https://github.com/healthonnet/hon-lucene-synonyms
> >> > > > >> >
> >> > > > >> > and the one referenced from the Lucidworks page here:
> >> > > > >> >
> >> > > > >> > https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> >> > > > >> >
> >> > > > >> > ... which is here: https://github.com/LucidWorks/auto-phrase-tokenfilter
> >> > > > >> >
> >> > > > >> > Is there anything else out there that you would recommend I
> look
> >> > at?
> >> > > > >> >
> >> > > > >> > On Thu, May 26, 2016 at 12:01 PM, Chris Morley <
> >> > ch...@depahelix.com
> >> > > >
> >> > > > >> wrote:
> >> > > > >> >
> >> > > > >> >> Chris Morley here, from Wayfair.  (Depahelix = my domain)
> >> > > > >> >>
> >> > > > >> >> Suyash Sonawane and I have worked on multiple word synonyms
> at
> >> > > > Wayfair.
> >> > > > >> >> We worked mostly off of Ted Sullivan's work and also off of
> >> some

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread Jeff Wartes
never
>> > > > >> match and add alternatives.  See <
>> > > > >> https://issues.apache.org/jira/browse/LUCENE-2605>, where I’ve
>> > > posted a
>> > > > >> patch to directly address that problem - note that it’s still a
>> work
>> > > in
>> > > > >> progress.
>> > > > >>
>> > > > >> Once LUCENE-2605 has been fixed, there is still work to do getting
>> > > > >> (e)dismax to work with the modified Lucene QueryParser, and
>> > addressing
>> > > > >> problems with how queries are constructed from Lucene’s
>> “sausagized”
>> > > > token
>> > > > >> stream.
>> > > > >>
>> > > > >> --
>> > > > >> Steve
>> > > > >> www.lucidworks.com
>> > > > >>
>> > > > >> > On May 26, 2016, at 2:21 PM, John Bickerstaff <
>> > > > j...@johnbickerstaff.com>
>> > > > >> wrote:
>> > > > >> >
>> > > > >> > Thanks Chris --
>> > > > >> >
>> > > > >> > The two projects I'm aware of are:
>> > > > >> >
>> > > > >> > https://github.com/healthonnet/hon-lucene-synonyms
>> > > > >> >
>> > > > >> > and the one referenced from the Lucidworks page here:
>> > > > >> >
>> > > > >> > https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
>> > > > >> >
>> > > > >> > ... which is here: https://github.com/LucidWorks/auto-phrase-tokenfilter
>> > > > >> >
>> > > > >> > Is there anything else out there that you would recommend I look
>> > at?
>> > > > >> >
>> > > > >> > On Thu, May 26, 2016 at 12:01 PM, Chris Morley <
>> > ch...@depahelix.com
>> > > >
>> > > > >> wrote:
>> > > > >> >
>> > > > >> >> Chris Morley here, from Wayfair.  (Depahelix = my domain)
>> > > > >> >>
>> > > > >> >> Suyash Sonawane and I have worked on multiple word synonyms at
>> > > > Wayfair.
>> > > > >> >> We worked mostly off of Ted Sullivan's work and also off of
>> some
>> > > > >> >> suggestions from Koorosh Vakhshoori.  We have gotten to a point
>> > > where
>> > > > >> we
>> > > > >> >> have a more sophisticated internal implementation, however,
>> we've
>> > > > found
>> > > > >> >> that it is very difficult to make it do what you want it to do,
>> > and
>> > > > >> also be
>> > > > >> >> sufficiently performant.  Watch out for exceptional situations
>> > with
>> > > > mm
>> > > > >> >> (minimum should match).
>> > > > >> >>
>> > > > >> >> Trey Grainger (now at Lucidworks) and Simon Hughes of Dice.com
>> > have
>> > > > >> also
>> > > > >> >> done work in this area.
>> > > > >> >>
>> > > > >> >> It should be very possible to get this kind of thing working on
>> > > > >> >> SolrCloud.  I haven't tried it yet but I think theoretically,
>> it
>> > > > should
>> > > > >> >> just work.  The synonyms stuff is mostly about doing things at
>> > > index
>> > > > >> time
>> > > > >> >> and query time.  The index time stuff should translate to
>> > SolrCloud
>> > > > >> >> directly, while the query time stuff might pose some issues,
>> but
>> > > > >> probably
>> > > > >> >> not too bad, if there are any issues at all.
>> > > > >> >>
>> > > > >> >> I've had decent luck porting our various plugins from 4.10.x to
>> > > 5.5.0
>> > > > >> >> because a lot of stuff is just Java, and it still works within
>> > the
>>

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread John Bickerstaff
> at
> > > > query
> > > > > > >> time in Lucene and Solr.
> > > > > > >>
> > > > > > >> I recommend these two blogs for understanding the issues (the
> > > second
> > > > > one
> > > > > > >> was mentioned earlier in this thread):
> > > > > > >>
> > > > > > >> <
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
> > > > > > >> >
> > > > > > >> <
> > > > https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/>
> > > > > > >>
> > > > > > >> In addition to the already-mentioned projects, there is also:
> > > > > > >>
> > > > > > >> <https://issues.apache.org/jira/browse/SOLR-5379>
> > > > > > >>
> > > > > > >> All of these projects try in various ways to work around the
> > fact
> > > > that
> > > > > > >> Lucene’s QueryParser splits on whitespace before sending text
> to
> > > > > > analysis,
> > > > > > >> one token at a time, so in a synonym filter, multi-word
> synonyms
> > > can
> > > > > > never
> > > > > > >> match and add alternatives.  See <
> > > > > > >> https://issues.apache.org/jira/browse/LUCENE-2605>, where
> I’ve
> > > > > posted a
> > > > > > >> patch to directly address that problem - note that it’s still
> a
> > > work
> > > > > in
> > > > > > >> progress.
> > > > > > >>
> > > > > > >> Once LUCENE-2605 has been fixed, there is still work to do
> > getting
> > > > > > >> (e)dismax to work with the modified Lucene QueryParser, and
> > > > addressing
> > > > > > >> problems with how queries are constructed from Lucene’s
> > > “sausagized”
> > > > > > token
> > > > > > >> stream.
> > > > > > >>
> > > > > > >> --
> > > > > > >> Steve
> > > > > > >> www.lucidworks.com
> > > > > > >>
> > > > > > >> > On May 26, 2016, at 2:21 PM, John Bickerstaff <
> > > > > > j...@johnbickerstaff.com>
> > > > > > >> wrote:
> > > > > > >> >
> > > > > > >> > Thanks Chris --
> > > > > > >> >
> > > > > > >> > The two projects I'm aware of are:
> > > > > > >> >
> > > > > > >> > https://github.com/healthonnet/hon-lucene-synonyms
> > > > > > >> >
> > > > > > >> > and the one referenced from the Lucidworks page here:
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > > > > > >> >
> > > > > > >> > ... which is here :
> > > > > > >> https://github.com/LucidWorks/auto-phrase-tokenfilter
> > > > > > >> >
> > > > > > >> > Is there anything else out there that you would recommend I
> > look
> > > > at?
> > > > > > >> >
> > > > > > >> > On Thu, May 26, 2016 at 12:01 PM, Chris Morley <
> > > > ch...@depahelix.com
> > > > > >
> > > > > > >> wrote:
> > > > > > >> >
> > > > > > >> >> Chris Morley here, from Wayfair.  (Depahelix = my domain)
> > > > > > >> >>
> > > > > > >> >> Suyash Sonawane and I have worked on multiple word synonyms
> > at
> > > > > > Wayfair.
> > > > > > >> >> We worked mostly off of Ted Sullivan's work and also off of
> > > some
> > > > > > >> >> suggestions from Kooros

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread Joe Lawson
whitespace before sending text to
> > > > > analysis,
> > > > > >> one token at a time, so in a synonym filter, multi-word synonyms
> > can
> > > > > never
> > > > > >> match and add alternatives.  See <
> > > > > >> https://issues.apache.org/jira/browse/LUCENE-2605>, where I’ve
> > > > posted a
> > > > > >> patch to directly address that problem - note that it’s still a
> > work
> > > > in
> > > > > >> progress.
> > > > > >>
> > > > > >> Once LUCENE-2605 has been fixed, there is still work to do
> getting
> > > > > >> (e)dismax to work with the modified Lucene QueryParser, and
> > > addressing
> > > > > >> problems with how queries are constructed from Lucene’s
> > “sausagized”
> > > > > token
> > > > > >> stream.
> > > > > >>
> > > > > >> --
> > > > > >> Steve
> > > > > >> www.lucidworks.com
> > > > > >>
> > > > > >> > On May 26, 2016, at 2:21 PM, John Bickerstaff <
> > > > > j...@johnbickerstaff.com>
> > > > > >> wrote:
> > > > > >> >
> > > > > >> > Thanks Chris --
> > > > > >> >
> > > > > >> > The two projects I'm aware of are:
> > > > > >> >
> > > > > >> > https://github.com/healthonnet/hon-lucene-synonyms
> > > > > >> >
> > > > > >> > and the one referenced from the Lucidworks page here:
> > > > > >> >
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > > > > >> >
> > > > > >> > ... which is here :
> > > > > >> https://github.com/LucidWorks/auto-phrase-tokenfilter
> > > > > >> >
> > > > > >> > Is there anything else out there that you would recommend I
> look
> > > at?
> > > > > >> >
> > > > > >> > On Thu, May 26, 2016 at 12:01 PM, Chris Morley <
> > > ch...@depahelix.com
> > > > >
> > > > > >> wrote:
> > > > > >> >
> > > > > >> >> Chris Morley here, from Wayfair.  (Depahelix = my domain)
> > > > > >> >>
> > > > > >> >> Suyash Sonawane and I have worked on multiple word synonyms
> at
> > > > > Wayfair.
> > > > > >> >> We worked mostly off of Ted Sullivan's work and also off of
> > some
> > > > > >> >> suggestions from Koorosh Vakhshoori.  We have gotten to a
> point
> > > > where
> > > > > >> we
> > > > > >> >> have a more sophisticated internal implementation, however,
> > we've
> > > > > found
> > > > > >> >> that it is very difficult to make it do what you want it to
> do,
> > > and
> > > > > >> also be
> > > > > >> >> sufficiently performant.  Watch out for exceptional
> situations
> > > with
> > > > > mm
> > > > > >> >> (minimum should match).
> > > > > >> >>
> > > > > >> >> Trey Grainger (now at Lucidworks) and Simon Hughes of
> Dice.com
> > > have
> > > > > >> also
> > > > > >> >> done work in this area.
> > > > > >> >>
> > > > > >> >> It should be very possible to get this kind of thing working
> on
> > > > > >> >> SolrCloud.  I haven't tried it yet but I think theoretically,
> > it
> > > > > should
> > > > > >> >> just work.  The synonyms stuff is mostly about doing things
> at
> > > > index
> > > > > >> time
> > > > > >> >> and query time.  The index time stuff should translate to
> > > SolrCloud
> > > > > >> >> directly, while the query time stuff might pose some issues,
> > but
> > > > > >> probably
> > > > > >> >>

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread John Bickerstaff
> > > > >> > https://github.com/healthonnet/hon-lucene-synonyms
> > > > >> >
> > > > >> > and the one referenced from the Lucidworks page here:
> > > > >> >
> > > > >>
> > > >
> > >
> >
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > > > >> >
> > > > >> > ... which is here :
> > > > >> https://github.com/LucidWorks/auto-phrase-tokenfilter
> > > > >> >
> > > > >> > Is there anything else out there that you would recommend I look
> > at?
> > > > >> >
> > > > >> > On Thu, May 26, 2016 at 12:01 PM, Chris Morley <
> > ch...@depahelix.com
> > > >
> > > > >> wrote:
> > > > >> >
> > > > >> >> Chris Morley here, from Wayfair.  (Depahelix = my domain)
> > > > >> >>
> > > > >> >> Suyash Sonawane and I have worked on multiple word synonyms at
> > > > Wayfair.
> > > > >> >> We worked mostly off of Ted Sullivan's work and also off of
> some
> > > > >> >> suggestions from Koorosh Vakhshoori.  We have gotten to a point
> > > where
> > > > >> we
> > > > >> >> have a more sophisticated internal implementation, however,
> we've
> > > > found
> > > > >> >> that it is very difficult to make it do what you want it to do,
> > and
> > > > >> also be
> > > > >> >> sufficiently performant.  Watch out for exceptional situations
> > with
> > > > mm
> > > > >> >> (minimum should match).
> > > > >> >>
> > > > >> >> Trey Grainger (now at Lucidworks) and Simon Hughes of Dice.com
> > have
> > > > >> also
> > > > >> >> done work in this area.
> > > > >> >>
> > > > >> >> It should be very possible to get this kind of thing working on
> > > > >> >> SolrCloud.  I haven't tried it yet but I think theoretically,
> it
> > > > should
> > > > >> >> just work.  The synonyms stuff is mostly about doing things at
> > > index
> > > > >> time
> > > > >> >> and query time.  The index time stuff should translate to
> > SolrCloud
> > > > >> >> directly, while the query time stuff might pose some issues,
> but
> > > > >> probably
> > > > >> >> not too bad, if there are any issues at all.
> > > > >> >>
> > > > >> >> I've had decent luck porting our various plugins from 4.10.x to
> > > 5.5.0
> > > > >> >> because a lot of stuff is just Java, and it still works within
> > the
> > > > >> Jetty
> > > > >> >> context.
> > > > >> >>
> > > > >> >> -Chris.
> > > > >> >>
> > > > >> >>
> > > > >> >>
> > > > >> >>
> > > > >> >> 
> > > > >> >> From: "John Bickerstaff" <j...@johnbickerstaff.com>
> > > > >> >> Sent: Thursday, May 26, 2016 1:51 PM
> > > > >> >> To: solr-user@lucene.apache.org
> > > > >> >> Subject: Re: Solr Cloud and Multi-word Synonyms ::
> > synonym_edismax
> > > > >> parser
> > > > >> >> Hey Jeff (or anyone interested in multi-word synonyms) here are
> > > some
> > > > >> >> potentially interesting links...
> > > > >> >>
> > > > >> >> http://wiki.apache.org/solr/QueryParser (search the page for
> > > > > >> >> synonym_edismax)
> > > > >> >>
> > > > >> >>
> > > https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
> > > > >> (blog
> > > > > >> >> post about what became the synonym_edismax Query Parser)
> > > > >> >>
> > > > >> >>
> > > > >> >>
> > > > >>
> > > >
> > >
> >
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > > > >> >>
> > > > >> >> This last was useful for lots of reasons and contains links to
> > > other
> > > > >> >> interesting, related web pages...
> > > > >> >>
> > > > >> >> On Thu, May 26, 2016 at 11:45 AM, Jeff Wartes <
> > > > jwar...@whitepages.com>
> > > > >> >> wrote:
> > > > >> >>
> > > > > >> >>> Oh, interesting. I've certainly encountered issues with
> > multi-word
> > > > >> >>> synonyms, but I hadn't come across this. If you end up using
> it
> > > > with a
> > > > > >> >>> recent Solr version, I'd be glad to hear your experience.
> > > > >> >>>
> > > > >> >>> I haven't used it, but I am aware of one other project in this
> > > vein
> > > > >> that
> > > > >> >>> you might be interested in looking at:
> > > > >> >>> https://github.com/LucidWorks/auto-phrase-tokenfilter
> > > > >> >>>
> > > > >> >>>
> > > > >> >>> On 5/26/16, 9:29 AM, "John Bickerstaff" <
> > j...@johnbickerstaff.com
> > > >
> > > > >> >> wrote:
> > > > >> >>>
> > > > >> >>>> Ahh - for question #3 I may have spoken too soon. This line
> > from
> > > > the
> > > > >> >>>> github repository readme suggests a way.
> > > > >> >>>>
> > > > >> >>>> Update: We have tested to run with the jar in $SOLR_HOME/lib
> as
> > > > well,
> > > > >> >> and
> > > > >> >>>> it works (Jetty).
> > > > >> >>>>
> > > > >> >>>> I'll try that and only respond back if that doesn't work.
> > > > >> >>>>
> > > > >> >>>> Questions 1 and 2 still stand of course... If anyone on the
> > list
> > > > has
> > > > >> >>>> experience in this area...
> > > > >> >>>>
> > > > >> >>>> Thanks.
> > > > >> >>>>
> > > > >> >>>> On Thu, May 26, 2016 at 10:25 AM, John Bickerstaff <
> > > > >> >>> j...@johnbickerstaff.com
> > > > >> >>>>> wrote:
> > > > >> >>>>
> > > > >> >>>>> Hi all,
> > > > >> >>>>>
> > > > >> >>>>> I'm creating a Solr Cloud that will index and search medical
> > > text.
> > > > >> >>>>> Multi-word synonyms are a pretty important factor.
> > > > >> >>>>>
> > > > >> >>>>> I find that there are some challenges around multi-word
> > synonyms
> > > > >> and I
> > > > >> >>>>> also found on the wiki that there is a recommended 3rd-party
> > > > parser
> > > > >> >>>>> (synonym_edismax parser) created by Nolan Lawson and found
> > here:
> > > > >> >>>>> https://github.com/healthonnet/hon-lucene-synonyms
> > > > >> >>>>>
> > > > >> >>>>> Here's the thing - the instructions on the github site
> involve
> > > > >> >> bringing
> > > > >> >>>>> the jar file into the war file - which is not applicable any
> > > > more...
> > > > >> >> at
> > > > >> >>>>> least I think it's not...
> > > > >> >>>>>
> > > > >> >>>>> I have three questions:
> > > > >> >>>>>
> > > > >> >>>>> 1. Is this still a good solution for multi-word synonyms
> (I.e.
> > > > Solr
> > > > >> >>> Cloud
> > > > >> >>>>> doesn't break it in some way)
> > > > >> >>>>> 2. Is there a tool or plug-in out there that the
> contributors
> > > > would
> > > > >> >>>>> recommend above this one?
> > > > >> >>>>> 3. Assuming 1 = yes and 2 = no, can anyone tell me an
> updated
> > > > >> >> procedure
> > > > >> >>>>> for bringing it in to Solr Cloud (I'm running 5.4.x)
> > > > >> >>>>>
> > > > >> >>>>> Thanks
> > > > >> >>>>>
> > > > >> >>>
> > > > >> >>>
> > > > >> >>
> > > > >> >>
> > > > >> >>
> > > > >>
> > > > >>
> > > > >
> > > >
> > >
> >
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-30 Thread MaryJo Sminkey
get that I can set it all up and try - but it's what I don't
> > know I
> > > don't know that bothers me...
> > >
> > > On Fri, May 27, 2016 at 11:57 AM, John Bickerstaff <
> > > j...@johnbickerstaff.com
> > > > wrote:
> > >
> > > > Thank you Steve -- very helpful.
> > > >
> > > > I can see that whatever implementation I decide to try, some testing
> > will
> > > > be in order.  If anyone is aware of significant gotchas with this
> > synonym
> > > > thing that are not mentioned in the already-listed URLs, please feel
> > free
> > > > to comment.
> > > >
> > > > On Fri, May 27, 2016 at 10:28 AM, Steve Rowe <sar...@gmail.com>
> wrote:
> > > >
> > > >> I’m working on addressing problems using multi-term synonyms at
> query
> > > >> time in Lucene and Solr.
> > > >>
> > > >> I recommend these two blogs for understanding the issues (the second
> > one
> > > >> was mentioned earlier in this thread):
> > > >>
> > > >> <
> > > >>
> > >
> >
> http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
> > > >> >
> > > >> <
> https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/>
> > > >>
> > > >> In addition to the already-mentioned projects, there is also:
> > > >>
> > > >> <https://issues.apache.org/jira/browse/SOLR-5379>
> > > >>
> > > >> All of these projects try in various ways to work around the fact
> that
> > > >> Lucene’s QueryParser splits on whitespace before sending text to
> > > analysis,
> > > >> one token at a time, so in a synonym filter, multi-word synonyms can
> > > never
> > > >> match and add alternatives.  See <
> > > >> https://issues.apache.org/jira/browse/LUCENE-2605>, where I’ve
> > posted a
> > > >> patch to directly address that problem - note that it’s still a work
> > in
> > > >> progress.
> > > >>
> > > >> Once LUCENE-2605 has been fixed, there is still work to do getting
> > > >> (e)dismax to work with the modified Lucene QueryParser, and
> addressing
> > > >> problems with how queries are constructed from Lucene’s “sausagized”
> > > token
> > > >> stream.
> > > >>
> > > >> --
> > > >> Steve
> > > >> www.lucidworks.com
> > > >>
> > > >> > On May 26, 2016, at 2:21 PM, John Bickerstaff <
> > > j...@johnbickerstaff.com>
> > > >> wrote:
> > > >> >
> > > >> > Thanks Chris --
> > > >> >
> > > >> > The two projects I'm aware of are:
> > > >> >
> > > >> > https://github.com/healthonnet/hon-lucene-synonyms
> > > >> >
> > > >> > and the one referenced from the Lucidworks page here:
> > > >> >
> > > >>
> > >
> >
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > > >> >
> > > >> > ... which is here :
> > > >> https://github.com/LucidWorks/auto-phrase-tokenfilter
> > > >> >
> > > >> > Is there anything else out there that you would recommend I look
> at?
> > > >> >
> > > >> > On Thu, May 26, 2016 at 12:01 PM, Chris Morley <
> ch...@depahelix.com
> > >
> > > >> wrote:
> > > >> >
> > > >> >> Chris Morley here, from Wayfair.  (Depahelix = my domain)
> > > >> >>
> > > >> >> Suyash Sonawane and I have worked on multiple word synonyms at
> > > Wayfair.
> > > >> >> We worked mostly off of Ted Sullivan's work and also off of some
> > > >> >> suggestions from Koorosh Vakhshoori.  We have gotten to a point
> > where
> > > >> we
> > > >> >> have a more sophisticated internal implementation, however, we've
> > > found
> > > >> >> that it is very difficult to make it do what you want it to do,
> and
> > > >> also be
> > > >> >> sufficiently performant.  Watch out for exceptional situations
> w

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-30 Thread John Bickerstaff
> http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
> > >> >
> > >> <https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/>
> > >>
> > >> In addition to the already-mentioned projects, there is also:
> > >>
> > >> <https://issues.apache.org/jira/browse/SOLR-5379>
> > >>
> > >> All of these projects try in various ways to work around the fact that
> > >> Lucene’s QueryParser splits on whitespace before sending text to
> > analysis,
> > >> one token at a time, so in a synonym filter, multi-word synonyms can
> > never
> > >> match and add alternatives.  See <
> > >> https://issues.apache.org/jira/browse/LUCENE-2605>, where I’ve
> posted a
> > >> patch to directly address that problem - note that it’s still a work
> in
> > >> progress.
> > >>
> > >> Once LUCENE-2605 has been fixed, there is still work to do getting
> > >> (e)dismax to work with the modified Lucene QueryParser, and addressing
> > >> problems with how queries are constructed from Lucene’s “sausagized”
> > token
> > >> stream.
> > >>
> > >> --
> > >> Steve
> > >> www.lucidworks.com
> > >>
> > >> > On May 26, 2016, at 2:21 PM, John Bickerstaff <
> > j...@johnbickerstaff.com>
> > >> wrote:
> > >> >
> > >> > Thanks Chris --
> > >> >
> > >> > The two projects I'm aware of are:
> > >> >
> > >> > https://github.com/healthonnet/hon-lucene-synonyms
> > >> >
> > >> > and the one referenced from the Lucidworks page here:
> > >> >
> > >>
> >
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > >> >
> > >> > ... which is here :
> > >> https://github.com/LucidWorks/auto-phrase-tokenfilter
> > >> >
> > >> > Is there anything else out there that you would recommend I look at?
> > >> >
> > >> > On Thu, May 26, 2016 at 12:01 PM, Chris Morley <ch...@depahelix.com
> >
> > >> wrote:
> > >> >
> > >> >> Chris Morley here, from Wayfair.  (Depahelix = my domain)
> > >> >>
> > >> >> Suyash Sonawane and I have worked on multiple word synonyms at
> > Wayfair.
> > >> >> We worked mostly off of Ted Sullivan's work and also off of some
> > >> >> suggestions from Koorosh Vakhshoori.  We have gotten to a point
> where
> > >> we
> > >> >> have a more sophisticated internal implementation, however, we've
> > found
> > >> >> that it is very difficult to make it do what you want it to do, and
> > >> also be
> > >> >> sufficiently performant.  Watch out for exceptional situations with
> > mm
> > >> >> (minimum should match).
> > >> >>
> > >> >> Trey Grainger (now at Lucidworks) and Simon Hughes of Dice.com have
> > >> also
> > >> >> done work in this area.
> > >> >>
> > >> >> It should be very possible to get this kind of thing working on
> > >> >> SolrCloud.  I haven't tried it yet but I think theoretically, it
> > should
> > >> >> just work.  The synonyms stuff is mostly about doing things at
> index
> > >> time
> > >> >> and query time.  The index time stuff should translate to SolrCloud
> > >> >> directly, while the query time stuff might pose some issues, but
> > >> probably
> > >> >> not too bad, if there are any issues at all.
> > >> >>
> > >> >> I've had decent luck porting our various plugins from 4.10.x to
> 5.5.0
> > >> >> because a lot of stuff is just Java, and it still works within the
> > >> Jetty
> > >> >> context.
> > >> >>
> > >> >> -Chris.
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> 
> > >> >> From: "John Bickerstaff" <j...@johnbickerstaff.com>
> > >> >> Sent: Thursday, May 26, 2016 1:51 PM
> > >> >> To: solr-user@lucene.apache.org
> > >

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-30 Thread MaryJo Sminkey
progress.
> >>
> >> Once LUCENE-2605 has been fixed, there is still work to do getting
> >> (e)dismax to work with the modified Lucene QueryParser, and addressing
> >> problems with how queries are constructed from Lucene’s “sausagized”
> token
> >> stream.
> >>
> >> --
> >> Steve
> >> www.lucidworks.com
> >>
> >> > On May 26, 2016, at 2:21 PM, John Bickerstaff <
> j...@johnbickerstaff.com>
> >> wrote:
> >> >
> >> > Thanks Chris --
> >> >
> >> > The two projects I'm aware of are:
> >> >
> >> > https://github.com/healthonnet/hon-lucene-synonyms
> >> >
> >> > and the one referenced from the Lucidworks page here:
> >> >
> >>
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> >> >
> >> > ... which is here :
> >> https://github.com/LucidWorks/auto-phrase-tokenfilter
> >> >
> >> > Is there anything else out there that you would recommend I look at?
> >> >
> >> > On Thu, May 26, 2016 at 12:01 PM, Chris Morley <ch...@depahelix.com>
> >> wrote:
> >> >
> >> >> Chris Morley here, from Wayfair.  (Depahelix = my domain)
> >> >>
> >> >> Suyash Sonawane and I have worked on multiple word synonyms at
> Wayfair.
> >> >> We worked mostly off of Ted Sullivan's work and also off of some
> >> >> suggestions from Koorosh Vakhshoori.  We have gotten to a point where
> >> we
> >> >> have a more sophisticated internal implementation, however, we've
> found
> >> >> that it is very difficult to make it do what you want it to do, and
> >> also be
> >> >> sufficiently performant.  Watch out for exceptional situations with
> mm
> >> >> (minimum should match).
> >> >>
> >> >> Trey Grainger (now at Lucidworks) and Simon Hughes of Dice.com have
> >> also
> >> >> done work in this area.
> >> >>
> >> >> It should be very possible to get this kind of thing working on
> >> >> SolrCloud.  I haven't tried it yet but I think theoretically, it
> should
> >> >> just work.  The synonyms stuff is mostly about doing things at index
> >> time
> >> >> and query time.  The index time stuff should translate to SolrCloud
> >> >> directly, while the query time stuff might pose some issues, but
> >> probably
> >> >> not too bad, if there are any issues at all.
> >> >>
> >> >> I've had decent luck porting our various plugins from 4.10.x to 5.5.0
> >> >> because a lot of stuff is just Java, and it still works within the
> >> Jetty
> >> >> context.
> >> >>
> >> >> -Chris.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> 
> >> >> From: "John Bickerstaff" <j...@johnbickerstaff.com>
> >> >> Sent: Thursday, May 26, 2016 1:51 PM
> >> >> To: solr-user@lucene.apache.org
> >> >> Subject: Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax
> >> parser
> >> >> Hey Jeff (or anyone interested in multi-word synonyms) here are some
> >> >> potentially interesting links...
> >> >>
> >> >> http://wiki.apache.org/solr/QueryParser (search the page for
> >> >> synonym_edismax)
> >> >>
> >> >> https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
> >> (blog
> >> >> post about what became the synonym_edismax Query Parser)
> >> >>
> >> >>
> >> >>
> >>
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> >> >>
> >> >> This last was useful for lots of reasons and contains links to other
> >> >> interesting, related web pages...
> >> >>
> >> >> On Thu, May 26, 2016 at 11:45 AM, Jeff Wartes <
> jwar...@whitepages.com>
> >> >> wrote:
> >> >>
> >> >>> Oh, interesting. I've certainly encountered issues with multi-word
> >> >>> synonyms, but I hadn't come across this. If you end up using it
> with a
> >> >>>

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-30 Thread John Bickerstaff
difficult to make it do what you want it to do, and
>> also be
>> >> sufficiently performant.  Watch out for exceptional situations with mm
>> >> (minimum should match).
>> >>
>> >> Trey Grainger (now at Lucidworks) and Simon Hughes of Dice.com have
>> also
>> >> done work in this area.
>> >>
>> >> It should be very possible to get this kind of thing working on
>> >> SolrCloud.  I haven't tried it yet but I think theoretically, it should
>> >> just work.  The synonyms stuff is mostly about doing things at index
>> time
>> >> and query time.  The index time stuff should translate to SolrCloud
>> >> directly, while the query time stuff might pose some issues, but
>> probably
>> >> not too bad, if there are any issues at all.
>> >>
>> >> I've had decent luck porting our various plugins from 4.10.x to 5.5.0
>> >> because a lot of stuff is just Java, and it still works within the
>> Jetty
>> >> context.
>> >>
>> >> -Chris.
>> >>
>> >>
>> >>
>> >>
>> >> 
>> >> From: "John Bickerstaff" <j...@johnbickerstaff.com>
>> >> Sent: Thursday, May 26, 2016 1:51 PM
>> >> To: solr-user@lucene.apache.org
>> >> Subject: Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax
>> parser
>> >> Hey Jeff (or anyone interested in multi-word synonyms) here are some
>> >> potentially interesting links...
>> >>
>> >> http://wiki.apache.org/solr/QueryParser (search the page for
>> >> synonym_edismax)
>> >>
>> >> https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
>> (blog
>> >> post about what became the synonym_edismax Query Parser)
>> >>
>> >>
>> >>
>> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
>> >>
>> >> This last was useful for lots of reasons and contains links to other
>> >> interesting, related web pages...
>> >>
>> >> On Thu, May 26, 2016 at 11:45 AM, Jeff Wartes <jwar...@whitepages.com>
>> >> wrote:
>> >>
>> >>> Oh, interesting. I've certainly encountered issues with multi-word
>> >>> synonyms, but I hadn't come across this. If you end up using it with a
>> >>> recent Solr version, I'd be glad to hear your experience.
>> >>>
>> >>> I haven't used it, but I am aware of one other project in this vein
>> that
>> >>> you might be interested in looking at:
>> >>> https://github.com/LucidWorks/auto-phrase-tokenfilter
>> >>>
>> >>>
>> >>> On 5/26/16, 9:29 AM, "John Bickerstaff" <j...@johnbickerstaff.com>
>> >> wrote:
>> >>>
>> >>>> Ahh - for question #3 I may have spoken too soon. This line from the
>> >>>> github repository readme suggests a way.
>> >>>>
>> >>>> Update: We have tested to run with the jar in $SOLR_HOME/lib as well,
>> >> and
>> >>>> it works (Jetty).
>> >>>>
>> >>>> I'll try that and only respond back if that doesn't work.
>> >>>>
>> >>>> Questions 1 and 2 still stand of course... If anyone on the list has
>> >>>> experience in this area...
>> >>>>
>> >>>> Thanks.
>> >>>>
>> >>>> On Thu, May 26, 2016 at 10:25 AM, John Bickerstaff <
>> >>> j...@johnbickerstaff.com
>> >>>>> wrote:
>> >>>>
>> >>>>> Hi all,
>> >>>>>
>> >>>>> I'm creating a Solr Cloud that will index and search medical text.
>> >>>>> Multi-word synonyms are a pretty important factor.
>> >>>>>
>> >>>>> I find that there are some challenges around multi-word synonyms
>> and I
>> >>>>> also found on the wiki that there is a recommended 3rd-party parser
>> >>>>> (synonym_edismax parser) created by Nolan Lawson and found here:
>> >>>>> https://github.com/healthonnet/hon-lucene-synonyms
>> >>>>>
>> >>>>> Here's the thing - the instructions on the github site involve
>> >> bringing
>> >>>>> the jar file into the war file - which is not applicable any more...
>> >> at
>> >>>>> least I think it's not...
>> >>>>>
>> >>>>> I have three questions:
>> >>>>>
>> >>>>> 1. Is this still a good solution for multi-word synonyms (I.e. Solr
>> >>> Cloud
>> >>>>> doesn't break it in some way)
>> >>>>> 2. Is there a tool or plug-in out there that the contributors would
>> >>>>> recommend above this one?
>> >>>>> 3. Assuming 1 = yes and 2 = no, can anyone tell me an updated
>> >> procedure
>> >>>>> for bringing it in to Solr Cloud (I'm running 5.4.x)
>> >>>>>
>> >>>>> Thanks
>> >>>>>
>> >>>
>> >>>
>> >>
>> >>
>> >>
>>
>>
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-27 Thread John Bickerstaff
Thank you Steve -- very helpful.

I can see that whatever implementation I decide to try, some testing will
be in order.  If anyone is aware of significant gotchas with this synonym
thing that are not mentioned in the already-listed URLs, please feel free
to comment.

On Fri, May 27, 2016 at 10:28 AM, Steve Rowe <sar...@gmail.com> wrote:

> I’m working on addressing problems using multi-term synonyms at query time
> in Lucene and Solr.
>
> I recommend these two blogs for understanding the issues (the second one
> was mentioned earlier in this thread):
>
> <
> http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
> >
> <https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/>
>
> In addition to the already-mentioned projects, there is also:
>
> <https://issues.apache.org/jira/browse/SOLR-5379>
>
> All of these projects try in various ways to work around the fact that
> Lucene’s QueryParser splits on whitespace before sending text to analysis,
> one token at a time, so in a synonym filter, multi-word synonyms can never
> match and add alternatives.  See <
> https://issues.apache.org/jira/browse/LUCENE-2605>, where I’ve posted a
> patch to directly address that problem - note that it’s still a work in
> progress.
>
> Once LUCENE-2605 has been fixed, there is still work to do getting
> (e)dismax to work with the modified Lucene QueryParser, and addressing
> problems with how queries are constructed from Lucene’s “sausagized” token
> stream.
>
> --
> Steve
> www.lucidworks.com
>
> > On May 26, 2016, at 2:21 PM, John Bickerstaff <j...@johnbickerstaff.com>
> wrote:
> >
> > Thanks Chris --
> >
> > The two projects I'm aware of are:
> >
> > https://github.com/healthonnet/hon-lucene-synonyms
> >
> > and the one referenced from the Lucidworks page here:
> >
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> >
> > ... which is here :
> https://github.com/LucidWorks/auto-phrase-tokenfilter
> >
> > Is there anything else out there that you would recommend I look at?
> >
> > On Thu, May 26, 2016 at 12:01 PM, Chris Morley <ch...@depahelix.com>
> wrote:
> >
> >> Chris Morley here, from Wayfair.  (Depahelix = my domain)
> >>
> >> Suyash Sonawane and I have worked on multiple word synonyms at Wayfair.
> >> We worked mostly off of Ted Sullivan's work and also off of some
> >> suggestions from Koorosh Vakhshoori.  We have gotten to a point where we
> >> have a more sophisticated internal implementation, however, we've found
> >> that it is very difficult to make it do what you want it to do, and
> also be
> >> sufficiently performant.  Watch out for exceptional situations with mm
> >> (minimum should match).
> >>
> >> Trey Grainger (now at Lucidworks) and Simon Hughes of Dice.com have also
> >> done work in this area.
> >>
> >> It should be very possible to get this kind of thing working on
> >> SolrCloud.  I haven't tried it yet but I think theoretically, it should
> >> just work.  The synonyms stuff is mostly about doing things at index
> time
> >> and query time.  The index time stuff should translate to SolrCloud
> >> directly, while the query time stuff might pose some issues, but
> probably
> >> not too bad, if there are any issues at all.
> >>
> >> I've had decent luck porting our various plugins from 4.10.x to 5.5.0
> >> because a lot of stuff is just Java, and it still works within the Jetty
> >> context.
> >>
> >> -Chris.
> >>
> >>
> >>
> >>
> >> 
> >> From: "John Bickerstaff" <j...@johnbickerstaff.com>
> >> Sent: Thursday, May 26, 2016 1:51 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax
> parser
> >> Hey Jeff (or anyone interested in multi-word synonyms) here are some
> >> potentially interesting links...
> >>
> >> http://wiki.apache.org/solr/QueryParser (search the page for
> >> synonym_edismax)
> >>
> >> https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
> (blog
> >> post about what became the synonym_edismax Query Parser)
> >>
> >>
> >>
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> >>
> >> This last was us

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-27 Thread Steve Rowe
I’m working on addressing problems using multi-term synonyms at query time in 
Lucene and Solr.

I recommend these two blogs for understanding the issues (the second one was 
mentioned earlier in this thread):

<http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html>
<https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/>

In addition to the already-mentioned projects, there is also:

<https://issues.apache.org/jira/browse/SOLR-5379>

All of these projects try in various ways to work around the fact that Lucene’s 
QueryParser splits on whitespace before sending text to analysis, one token at 
a time, so in a synonym filter, multi-word synonyms can never match and add 
alternatives.  See <https://issues.apache.org/jira/browse/LUCENE-2605>, where 
I’ve posted a patch to directly address that problem - note that it’s still a 
work in progress.
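
To make the whitespace-splitting problem concrete, here is a minimal Python
sketch. It is a toy model and not Lucene's API: the SYNONYMS table, the
function names, and the "heart attack" rule are all made up for illustration.
It only shows that a synonym filter which sees one whitespace-separated token
at a time can never match a two-word rule, while the same filter given the
whole query can.

```python
# Toy model only; none of these names exist in Lucene or Solr.
SYNONYMS = {
    ("heart", "attack"): ["myocardial infarction"],  # hypothetical multi-word rule
    ("mi",): ["myocardial infarction"],              # hypothetical single-word rule
}


def synonym_filter(tokens):
    """Return the alternatives for every run of tokens that matches a rule."""
    found = []
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            found.extend(SYNONYMS.get(tuple(tokens[start:end]), []))
    return found


def analyze_whole_query(query):
    # The filter is handed every token of the query at once.
    return synonym_filter(query.lower().split())


def analyze_after_whitespace_split(query):
    # Mimics the behaviour described above: the query is split on
    # whitespace first, and each piece is analyzed on its own.
    found = []
    for piece in query.lower().split():
        found.extend(synonym_filter([piece]))
    return found


print(analyze_whole_query("heart attack"))             # ['myocardial infarction']
print(analyze_after_whitespace_split("heart attack"))  # []
print(analyze_after_whitespace_split("mi"))            # ['myocardial infarction']
```

Single-word rules survive the split, which is why the problem tends to show up
only once multi-word entries are added to the synonyms file.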

Once LUCENE-2605 has been fixed, there is still work to do getting (e)dismax to 
work with the modified Lucene QueryParser, and addressing problems with how 
queries are constructed from Lucene’s “sausagized” token stream.
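
As a rough illustration of what "sausagized" means here (the idea is taken
from the first blog post linked above; the code is a toy model, not Lucene's
API), the Python sketch below stacks the words of a multi-word synonym onto
consecutive token positions. The function name, the sample sentence, and the
"dns" rule are assumptions made up for the example. Once the graph is
flattened this way, a query built from one term per position can mix the two
readings, and the correct expanded reading is no longer representable.

```python
from itertools import product


def sausagize(tokens, rule_from, rule_to):
    """Toy flattening: stack replacement terms onto consecutive positions,
    starting at the position of the matched token."""
    positions = [[t] for t in tokens]
    i = tokens.index(rule_from)
    for offset, term in enumerate(rule_to):
        while len(positions) <= i + offset:
            positions.append([])
        positions[i + offset].append(term)
    return positions


# Hypothetical rule: dns => domain name service
positions = sausagize(["dns", "is", "up"], "dns", ["domain", "name", "service"])
print(positions)
# [['dns', 'domain'], ['is', 'name'], ['up', 'service']]

# Every "one term per position" reading of the flattened stream.
paths = {" ".join(p) for p in product(*positions)}

print("dns is up" in paths)                   # True  (the original reading)
print("domain is up" in paths)                # True  (spurious mix of both)
print("domain name service is up" in paths)   # False (the intended expansion)
```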

--
Steve
www.lucidworks.com

> On May 26, 2016, at 2:21 PM, John Bickerstaff <j...@johnbickerstaff.com> 
> wrote:
> 
> Thanks Chris --
> 
> The two projects I'm aware of are:
> 
> https://github.com/healthonnet/hon-lucene-synonyms
> 
> and the one referenced from the Lucidworks page here:
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> 
> ... which is here : https://github.com/LucidWorks/auto-phrase-tokenfilter
> 
> Is there anything else out there that you would recommend I look at?
> 
> On Thu, May 26, 2016 at 12:01 PM, Chris Morley <ch...@depahelix.com> wrote:
> 
>> Chris Morley here, from Wayfair.  (Depahelix = my domain)
>> 
>> Suyash Sonawane and I have worked on multiple word synonyms at Wayfair.
>> We worked mostly off of Ted Sullivan's work and also off of some
>> suggestions from Koorosh Vakhshoori.  We have gotten to a point where we
>> have a more sophisticated internal implementation, however, we've found
>> that it is very difficult to make it do what you want it to do, and also be
>> sufficiently performant.  Watch out for exceptional situations with mm
>> (minimum should match).
>> 
>> Trey Grainger (now at Lucidworks) and Simon Hughes of Dice.com have also
>> done work in this area.
>> 
>> It should be very possible to get this kind of thing working on
>> SolrCloud.  I haven't tried it yet but I think theoretically, it should
>> just work.  The synonyms stuff is mostly about doing things at index time
>> and query time.  The index time stuff should translate to SolrCloud
>> directly, while the query time stuff might pose some issues, but probably
>> not too bad, if there are any issues at all.
>> 
>> I've had decent luck porting our various plugins from 4.10.x to 5.5.0
>> because a lot of stuff is just Java, and it still works within the Jetty
>> context.
>> 
>> -Chris.
>> 
>> 
>> 
>> 
>> 
>> From: "John Bickerstaff" <j...@johnbickerstaff.com>
>> Sent: Thursday, May 26, 2016 1:51 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser
>>
>> Hey Jeff (or anyone interested in multi-word synonyms) here are some
>> potentially interesting links...
>> 
>> http://wiki.apache.org/solr/QueryParser (search the page for
>> synonum_edismax)
>> 
>> https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ (blog
>> post about what became the synonym_edismax Query Parser)
>> 
>> 
>> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
>> 
>> This last was useful for lots of reasons and contains links to other
>> interesting, related web pages...
>> 
>> On Thu, May 26, 2016 at 11:45 AM, Jeff Wartes <jwar...@whitepages.com>
>> wrote:
>> 
>>> Oh, interesting. I've certainly encountered issues with multi-word
>>> synonyms, but I hadn't come across this. If you end up using it with a
>>> recent Solr version, I'd be glad to hear your experience.
>>> 
>>> I haven't used it, but I am aware of one other project in this vein that
>>> you might be interested in looking at:
>>> https://github.com/LucidWorks/auto-phrase-tokenfilter
>>> 
>>> 
>>> On 5/26/16, 9:29 AM, "John Bickerstaff" <j...@johnbickerstaff.com> wrote:
>>> 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-26 Thread John Bickerstaff
Thanks Chris --

The two projects I'm aware of are:

https://github.com/healthonnet/hon-lucene-synonyms

and the one referenced from the Lucidworks page here:
https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/

... which is here : https://github.com/LucidWorks/auto-phrase-tokenfilter

Is there anything else out there that you would recommend I look at?

On Thu, May 26, 2016 at 12:01 PM, Chris Morley <ch...@depahelix.com> wrote:

> Chris Morley here, from Wayfair.  (Depahelix = my domain)
>
>  Suyash Sonawane and I have worked on multiple word synonyms at Wayfair.
> We worked mostly off of Ted Sullivan's work and also off of some
> suggestions from Koorosh Vakhshoori.  We have gotten to a point where we
> have a more sophisticated internal implementation, however, we've found
> that it is very difficult to make it do what you want it to do, and also be
> sufficiently performant.  Watch out for exceptional situations with mm
> (minimum should match).
>
>  Trey Grainger (now at Lucidworks) and Simon Hughes of Dice.com have also
> done work in this area.
>
>  It should be very possible to get this kind of thing working on
> SolrCloud.  I haven't tried it yet but I think theoretically, it should
> just work.  The synonyms stuff is mostly about doing things at index time
> and query time.  The index time stuff should translate to SolrCloud
> directly, while the query time stuff might pose some issues, but probably
> not too bad, if there are any issues at all.
>
>  I've had decent luck porting our various plugins from 4.10.x to 5.5.0
> because a lot of stuff is just Java, and it still works within the Jetty
> context.
>
>  -Chris.
>
>
>
>
> 
>  From: "John Bickerstaff" <j...@johnbickerstaff.com>
> Sent: Thursday, May 26, 2016 1:51 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser
>
> Hey Jeff (or anyone interested in multi-word synonyms) here are some
> potentially interesting links...
>
> http://wiki.apache.org/solr/QueryParser (search the page for
> synonum_edismax)
>
> https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ (blog
> post about what became the synonym_edismax Query Parser)
>
>
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
>
> This last was useful for lots of reasons and contains links to other
> interesting, related web pages...
>
> On Thu, May 26, 2016 at 11:45 AM, Jeff Wartes <jwar...@whitepages.com>
> wrote:
>
> > Oh, interesting. I've certainly encountered issues with multi-word
> > synonyms, but I hadn't come across this. If you end up using it with a
> > recent Solr version, I'd be glad to hear your experience.
> >
> > I haven't used it, but I am aware of one other project in this vein that
> > you might be interested in looking at:
> > https://github.com/LucidWorks/auto-phrase-tokenfilter
> >
> >
> > On 5/26/16, 9:29 AM, "John Bickerstaff" <j...@johnbickerstaff.com> wrote:
> >
> > >Ahh - for question #3 I may have spoken too soon. This line from the
> > >github repository readme suggests a way.
> > >
> > >Update: We have tested to run with the jar in $SOLR_HOME/lib as well, and
> > >it works (Jetty).
> > >
> > >I'll try that and only respond back if that doesn't work.
> > >
> > >Questions 1 and 2 still stand of course... If anyone on the list has
> > >experience in this area...
> > >
> > >Thanks.
> > >
> > >On Thu, May 26, 2016 at 10:25 AM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> > >
> > >> Hi all,
> > >>
> > >> I'm creating a Solr Cloud that will index and search medical text.
> > >> Multi-word synonyms are a pretty important factor.
> > >>
> > >> I find that there are some challenges around multi-word synonyms and I
> > >> also found on the wiki that there is a recommended 3rd-party parser
> > >> (synonym_edismax parser) created by Nolan Lawson and found here:
> > >> https://github.com/healthonnet/hon-lucene-synonyms
> > >>
> > >> Here's the thing - the instructions on the github site involve bringing
> > >> the jar file into the war file - which is not applicable any more... at
> > >> least I think it's not...
> > >>
> > >> I have three questions:
> > >>
> > >> 1. Is this still a good solution for multi-word synonyms (I.e. Solr Cloud
> > >> doesn't break it in some way)
> > >> 2. Is there a tool or plug-in out there that the contributors would
> > >> recommend above this one?
> > >> 3. Assuming 1 = yes and 2 = no, can anyone tell me an updated procedure
> > >> for bringing it in to Solr Cloud (I'm running 5.4.x)
> > >>
> > >> Thanks
> > >>
> >
> >
>
>
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-26 Thread Chris Morley
Chris Morley here, from Wayfair.  (Depahelix = my domain)

 Suyash Sonawane and I have worked on multiple word synonyms at Wayfair.  We 
worked mostly off of Ted Sullivan's work and also off of some suggestions from 
Koorosh Vakhshoori.  We have gotten to a point where we have a more 
sophisticated internal implementation, however, we've found that it is very 
difficult to make it do what you want it to do, and also be sufficiently 
performant.  Watch out for exceptional situations with mm (minimum should 
match).

 Trey Grainger (now at Lucidworks) and Simon Hughes of Dice.com have also done 
work in this area.

 It should be very possible to get this kind of thing working on SolrCloud.  I 
haven't tried it yet but I think theoretically, it should just work.  The 
synonyms stuff is mostly about doing things at index time and query time.  The 
index time stuff should translate to SolrCloud directly, while the query time 
stuff might pose some issues, but probably not too bad, if there are any issues 
at all.

 I've had decent luck porting our various plugins from 4.10.x to 5.5.0 because 
a lot of stuff is just Java, and it still works within the Jetty context.

 -Chris.





 From: "John Bickerstaff" <j...@johnbickerstaff.com>
Sent: Thursday, May 26, 2016 1:51 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

Hey Jeff (or anyone interested in multi-word synonyms) here are some
potentially interesting links...

http://wiki.apache.org/solr/QueryParser (search the page for
synonum_edismax)

https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ (blog
post about what became the synonym_edismax Query Parser)

https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/

This last was useful for lots of reasons and contains links to other
interesting, related web pages...

On Thu, May 26, 2016 at 11:45 AM, Jeff Wartes <jwar...@whitepages.com>
wrote:

> Oh, interesting. I've certainly encountered issues with multi-word
> synonyms, but I hadn't come across this. If you end up using it with a
> recent Solr version, I'd be glad to hear your experience.
>
> I haven't used it, but I am aware of one other project in this vein that
> you might be interested in looking at:
> https://github.com/LucidWorks/auto-phrase-tokenfilter
>
>
> On 5/26/16, 9:29 AM, "John Bickerstaff" <j...@johnbickerstaff.com> wrote:
>
> >Ahh - for question #3 I may have spoken too soon. This line from the
> >github repository readme suggests a way.
> >
> >Update: We have tested to run with the jar in $SOLR_HOME/lib as well, and
> >it works (Jetty).
> >
> >I'll try that and only respond back if that doesn't work.
> >
> >Questions 1 and 2 still stand of course... If anyone on the list has
> >experience in this area...
> >
> >Thanks.
> >
> >On Thu, May 26, 2016 at 10:25 AM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> >
> >> Hi all,
> >>
> >> I'm creating a Solr Cloud that will index and search medical text.
> >> Multi-word synonyms are a pretty important factor.
> >>
> >> I find that there are some challenges around multi-word synonyms and I
> >> also found on the wiki that there is a recommended 3rd-party parser
> >> (synonym_edismax parser) created by Nolan Lawson and found here:
> >> https://github.com/healthonnet/hon-lucene-synonyms
> >>
> >> Here's the thing - the instructions on the github site involve bringing
> >> the jar file into the war file - which is not applicable any more... at
> >> least I think it's not...
> >>
> >> I have three questions:
> >>
> >> 1. Is this still a good solution for multi-word synonyms (I.e. Solr Cloud
> >> doesn't break it in some way)
> >> 2. Is there a tool or plug-in out there that the contributors would
> >> recommend above this one?
> >> 3. Assuming 1 = yes and 2 = no, can anyone tell me an updated procedure
> >> for bringing it in to Solr Cloud (I'm running 5.4.x)
> >>
> >> Thanks
> >>
>
>




Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-26 Thread John Bickerstaff
fixing typo:

http://wiki.apache.org/solr/QueryParser  (search the page for
synonym_edismax)

On Thu, May 26, 2016 at 11:50 AM, John Bickerstaff  wrote:

> Hey Jeff (or anyone interested in multi-word synonyms) here are some
> potentially interesting links...
>
> http://wiki.apache.org/solr/QueryParser  (search the page for
> synonum_edismax)
>
> https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
>  (blog post about what became the synonym_edismax Query Parser)
>
>
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
>
> This last was useful for lots of reasons and contains links to other
> interesting, related web pages...
>
> On Thu, May 26, 2016 at 11:45 AM, Jeff Wartes 
> wrote:
>
>> Oh, interesting. I’ve certainly encountered issues with multi-word
>> synonyms, but I hadn’t come across this. If you end up using it with a
>> recent Solr version, I’d be glad to hear your experience.
>>
>> I haven’t used it, but I am aware of one other project in this vein that
>> you might be interested in looking at:
>> https://github.com/LucidWorks/auto-phrase-tokenfilter
>>
>>
>> On 5/26/16, 9:29 AM, "John Bickerstaff"  wrote:
>>
>> >Ahh - for question #3 I may have spoken too soon.  This line from the
>> >github repository readme suggests a way.
>> >
>> >Update: We have tested to run with the jar in $SOLR_HOME/lib as well, and
>> >it works (Jetty).
>> >
>> >I'll try that and only respond back if that doesn't work.
>> >
>> >Questions 1 and 2 still stand of course...  If anyone on the list has
>> >experience in this area...
>> >
>> >Thanks.
>> >
>> >On Thu, May 26, 2016 at 10:25 AM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
>> >
>> >> Hi all,
>> >>
>> >> I'm creating a Solr Cloud that will index and search medical text.
>> >> Multi-word synonyms are a pretty important factor.
>> >>
>> >> I find that there are some challenges around multi-word synonyms and I
>> >> also found on the wiki that there is a recommended 3rd-party parser
>> >> (synonym_edismax parser) created by Nolan Lawson and found here:
>> >> https://github.com/healthonnet/hon-lucene-synonyms
>> >>
>> >> Here's the thing - the instructions on the github site involve bringing
>> >> the jar file into the war file - which is not applicable any more... at
>> >> least I think it's not...
>> >>
>> >> I have three questions:
>> >>
>> >> 1. Is this still a good solution for multi-word synonyms (I.e. Solr Cloud
>> >> doesn't break it in some way)
>> >> 2. Is there a tool or plug-in out there that the contributors would
>> >> recommend above this one?
>> >> 3. Assuming 1 = yes and 2 = no, can anyone tell me an updated procedure
>> >> for bringing it in to Solr Cloud (I'm running 5.4.x)
>> >>
>> >> Thanks
>> >>
>>
>>
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-26 Thread John Bickerstaff
Hey Jeff (or anyone interested in multi-word synonyms) here are some
potentially interesting links...

http://wiki.apache.org/solr/QueryParser  (search the page for
synonum_edismax)

https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/  (blog
post about what became the synonym_edismax Query Parser)

https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/

This last was useful for lots of reasons and contains links to other
interesting, related web pages...

On Thu, May 26, 2016 at 11:45 AM, Jeff Wartes 
wrote:

> Oh, interesting. I’ve certainly encountered issues with multi-word
> synonyms, but I hadn’t come across this. If you end up using it with a
> recent Solr version, I’d be glad to hear your experience.
>
> I haven’t used it, but I am aware of one other project in this vein that
> you might be interested in looking at:
> https://github.com/LucidWorks/auto-phrase-tokenfilter
>
>
> On 5/26/16, 9:29 AM, "John Bickerstaff"  wrote:
>
> >Ahh - for question #3 I may have spoken too soon.  This line from the
> >github repository readme suggests a way.
> >
> >Update: We have tested to run with the jar in $SOLR_HOME/lib as well, and
> >it works (Jetty).
> >
> >I'll try that and only respond back if that doesn't work.
> >
> >Questions 1 and 2 still stand of course...  If anyone on the list has
> >experience in this area...
> >
> >Thanks.
> >
> >On Thu, May 26, 2016 at 10:25 AM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> >
> >> Hi all,
> >>
> >> I'm creating a Solr Cloud that will index and search medical text.
> >> Multi-word synonyms are a pretty important factor.
> >>
> >> I find that there are some challenges around multi-word synonyms and I
> >> also found on the wiki that there is a recommended 3rd-party parser
> >> (synonym_edismax parser) created by Nolan Lawson and found here:
> >> https://github.com/healthonnet/hon-lucene-synonyms
> >>
> >> Here's the thing - the instructions on the github site involve bringing
> >> the jar file into the war file - which is not applicable any more... at
> >> least I think it's not...
> >>
> >> I have three questions:
> >>
> >> 1. Is this still a good solution for multi-word synonyms (I.e. Solr Cloud
> >> doesn't break it in some way)
> >> 2. Is there a tool or plug-in out there that the contributors would
> >> recommend above this one?
> >> 3. Assuming 1 = yes and 2 = no, can anyone tell me an updated procedure
> >> for bringing it in to Solr Cloud (I'm running 5.4.x)
> >>
> >> Thanks
> >>
>
>


Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-26 Thread Jeff Wartes
Oh, interesting. I’ve certainly encountered issues with multi-word synonyms, 
but I hadn’t come across this. If you end up using it with a recent Solr 
version, I’d be glad to hear your experience.

I haven’t used it, but I am aware of one other project in this vein that you 
might be interested in looking at: 
https://github.com/LucidWorks/auto-phrase-tokenfilter


On 5/26/16, 9:29 AM, "John Bickerstaff"  wrote:

>Ahh - for question #3 I may have spoken too soon.  This line from the
>github repository readme suggests a way.
>
>Update: We have tested to run with the jar in $SOLR_HOME/lib as well, and
>it works (Jetty).
>
>I'll try that and only respond back if that doesn't work.
>
>Questions 1 and 2 still stand of course...  If anyone on the list has
>experience in this area...
>
>Thanks.
>
>On Thu, May 26, 2016 at 10:25 AM, John Bickerstaff wrote:
>
>> Hi all,
>>
>> I'm creating a Solr Cloud that will index and search medical text.
>> Multi-word synonyms are a pretty important factor.
>>
>> I find that there are some challenges around multi-word synonyms and I
>> also found on the wiki that there is a recommended 3rd-party parser
>> (synonym_edismax parser) created by Nolan Lawson and found here:
>> https://github.com/healthonnet/hon-lucene-synonyms
>>
>> Here's the thing - the instructions on the github site involve bringing
>> the jar file into the war file - which is not applicable any more... at
>> least I think it's not...
>>
>> I have three questions:
>>
>> 1. Is this still a good solution for multi-word synonyms (I.e. Solr Cloud
>> doesn't break it in some way)
>> 2. Is there a tool or plug-in out there that the contributors would
>> recommend above this one?
>> 3. Assuming 1 = yes and 2 = no, can anyone tell me an updated procedure
>> for bringing it in to Solr Cloud (I'm running 5.4.x)
>>
>> Thanks
>>



Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-26 Thread John Bickerstaff
Ahh - for question #3 I may have spoken too soon.  This line from the
github repository readme suggests a way.

Update: We have tested to run with the jar in $SOLR_HOME/lib as well, and
it works (Jetty).

I'll try that and only respond back if that doesn't work.

Questions 1 and 2 still stand of course...  If anyone on the list has
experience in this area...

Thanks.

On Thu, May 26, 2016 at 10:25 AM, John Bickerstaff  wrote:

> Hi all,
>
> I'm creating a Solr Cloud that will index and search medical text.
> Multi-word synonyms are a pretty important factor.
>
> I find that there are some challenges around multi-word synonyms and I
> also found on the wiki that there is a recommended 3rd-party parser
> (synonym_edismax parser) created by Nolan Lawson and found here:
> https://github.com/healthonnet/hon-lucene-synonyms
>
> Here's the thing - the instructions on the github site involve bringing
> the jar file into the war file - which is not applicable any more... at
> least I think it's not...
>
> I have three questions:
>
> 1. Is this still a good solution for multi-word synonyms (I.e. Solr Cloud
> doesn't break it in some way)
> 2. Is there a tool or plug-in out there that the contributors would
> recommend above this one?
> 3. Assuming 1 = yes and 2 = no, can anyone tell me an updated procedure
> for bringing it in to Solr Cloud (I'm running 5.4.x)
>
> Thanks
>