Re: Wildcard in FL parameter not working with Solr 4.10.0

2014-09-10 Thread Mike Hugo
This may have been introduced by changes made to solve
https://issues.apache.org/jira/browse/SOLR-5968

I created https://issues.apache.org/jira/browse/SOLR-6501 to track the new
bug.

On Tue, Sep 9, 2014 at 4:53 PM, Mike Hugo m...@piragua.com wrote:

 Hello,

 With Solr 4.7 we had some queries that return dynamic fields by passing in
 a fl=*_exact parameter; this is not working for us after upgrading to Solr
 4.10.0.  This appears to only be a problem when requesting wildcarded
 fields via SolrJ

 With Solr 4.10.0 - I downloaded the binary and set up the example:

 cd example
 java -jar start.jar
 java -jar post.jar solr.xml monitor.xml

 In a browser, if I request

 http://localhost:8983/solr/collection1/select?q=*:*&wt=json&indent=true&fl=*d

 All is well with the world:

 {"responseHeader": {"status": 0, "QTime": 1, "params": {"fl": "*d", "indent": "true",
 "q": "*:*", "wt": "json"}}, "response": {"numFound": 2, "start": 0,
 "docs": [{"id": "SOLR1000"}, {"id": "3007WFP"}]}}

 However if I do the same query with SolrJ (groovy script)


 @Grab(group = 'org.apache.solr', module = 'solr-solrj', version = '4.10.0')

 import org.apache.solr.client.solrj.SolrQuery
 import org.apache.solr.client.solrj.impl.HttpSolrServer

 HttpSolrServer solrServer = new HttpSolrServer(
 "http://localhost:8983/solr/collection1")
 SolrQuery q = new SolrQuery("*:*")
 q.setFields("*d")
 println solrServer.query(q)


 No fields are returned:


 {responseHeader={status=0,QTime=0,params={fl=*d,q=*:*,wt=javabin,version=2}},response={numFound=2,start=0,docs=[SolrDocument{},
 SolrDocument{}]}}



 Any ideas as to why when using SolrJ wildcarded fl fields are not returned?

 Thanks,

 Mike
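
A hedged way to narrow this down (not from the thread, just a sketch assuming
SolrJ 4.10.0): run the same query with the default javabin parser and again
with the XML response parser, to see whether the wildcarded fl is lost on the
request side or while reading the binary response.

@Grab(group = 'org.apache.solr', module = 'solr-solrj', version = '4.10.0')

import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.HttpSolrServer
import org.apache.solr.client.solrj.impl.XMLResponseParser

HttpSolrServer solrServer = new HttpSolrServer("http://localhost:8983/solr/collection1")
SolrQuery q = new SolrQuery("*:*")
q.setFields("*d")

// default javabin parser - the case reported above returns empty SolrDocuments
println "javabin: " + solrServer.query(q).results

// switch the same client to the XML response writer/parser and compare
solrServer.setParser(new XMLResponseParser())
println "xml:     " + solrServer.query(q).results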



Wildcard in FL parameter not working with Solr 4.10.0

2014-09-09 Thread Mike Hugo
Hello,

With Solr 4.7 we had some queries that return dynamic fields by passing in
a fl=*_exact parameter; this is not working for us after upgrading to Solr
4.10.0.  This appears to only be a problem when requesting wildcarded
fields via SolrJ

With Solr 4.10.0 - I downloaded the binary and set up the example:

cd example
java -jar start.jar
java -jar post.jar solr.xml monitor.xml

In a browser, if I request

http://localhost:8983/solr/collection1/select?q=*:*&wt=json&indent=true&fl=*d

All is well with the world:

{"responseHeader": {"status": 0, "QTime": 1, "params": {"fl": "*d", "indent": "true",
"q": "*:*", "wt": "json"}}, "response": {"numFound": 2, "start": 0,
"docs": [{"id": "SOLR1000"}, {"id": "3007WFP"}]}}

However if I do the same query with SolrJ (groovy script)


@Grab(group = 'org.apache.solr', module = 'solr-solrj', version = '4.10.0')

import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.HttpSolrServer

HttpSolrServer solrServer = new HttpSolrServer(
"http://localhost:8983/solr/collection1")
SolrQuery q = new SolrQuery("*:*")
q.setFields("*d")
println solrServer.query(q)


No fields are returned:

{responseHeader={status=0,QTime=0,params={fl=*d,q=*:*,wt=javabin,version=2}},response={numFound=2,start=0,docs=[SolrDocument{},
SolrDocument{}]}}



Any ideas as to why when using SolrJ wildcarded fl fields are not returned?

Thanks,

Mike


Deep paging in parallel with solr cloud - OutOfMemory

2014-03-17 Thread Mike Hugo
Hello,

We recently upgraded to Solr Cloud 4.7 (went from a single node Solr 4.0
instance to 3 node Solr 4.7 cluster).

Part of our application does an automated traversal of all documents that
match a specific query.  It does this by iterating through the results by
setting the start and rows parameters, starting with start=0 and rows=1000,
then start=1000, rows=1000, then start=2000, rows=1000, and so on.

We do this in parallel fashion with multiple workers on multiple nodes.
 It's easy to chunk up the work to be done by figuring out how many total
results there are and then creating 'chunks' (0-1000, 1000-2000, 2000-3000)
and sending each chunk to a worker in a pool of multi-threaded workers.
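
A minimal sketch of that chunked start/rows harvesting (assuming SolrJ 4.x;
processChunk is a hypothetical per-chunk handler, not part of SolrJ):

@Grab(group = 'org.apache.solr', module = 'solr-solrj', version = '4.7.0')

import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.HttpSolrServer
import java.util.concurrent.Executors

def solr = new HttpSolrServer("http://localhost:8983/solr/collection1")
int chunkSize = 1000
// one cheap query to learn the total, then carve the range into chunks
long total = solr.query(new SolrQuery("*:*").setRows(0)).results.numFound

def pool = Executors.newFixedThreadPool(4)
for (long start = 0; start < total; start += chunkSize) {
    long chunkStart = start
    pool.submit {
        SolrQuery q = new SolrQuery("*:*")
        q.set("sort", "id asc")
        q.setStart((int) chunkStart)
        q.setRows(chunkSize)
        def docs = solr.query(q).results
        // processChunk(docs)   // hypothetical handler for this page of documents
    }
}
pool.shutdown()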

This worked well for us with a single server.  However upon upgrading to
solr cloud, we've found that this quickly (within the first 4 or 5
requests) causes an OutOfMemory error on the coordinating node that
receives the query.   I don't fully understand what's going on here, but it
looks like the coordinating node receives the query and sends it to the
shard requested.  For example, given:

shards=shard3&sort=id+asc&start=4000&q=*:*&rows=1000

The coordinating node sends this query to shard3:

NOW=1395086719189&shard.url=http://shard3_url_goes_here:8080/solr/collection1/&fl=id&sort=id+asc&start=0&q=*:*&distrib=false&wt=javabin&isShard=true&fsv=true&version=2&rows=5000

Notice the rows parameter is 5000 (start + rows).  If the coordinator node
is able to process the result set (which works for the first few pages;
after that it will quickly run out of memory), it eventually issues this
request back to shard3:

NOW=1395086719189&shard.url=http://10.128.215.226:8080/extera-search/gemindex/&start=4000&ids=a..bunch...(1000)..of..doc..ids..go..here&q=*:*&distrib=false&wt=javabin&isShard=true&version=2&rows=1000

and then finally returns the response to the client.

One possible workaround:  We've found that if we issue non-distributed
requests to specific shards, we get performance along the same lines
that we did before.  E.g., issue a query with shards=shard3&distrib=false
directly to the URL of the shard3 instance, rather than going through the
CloudSolrServer SolrJ API.
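
A hedged sketch of that workaround (assuming SolrJ 4.x and a hypothetical
shard URL; the point is only the distrib=false parameter sent straight to one core):

@Grab(group = 'org.apache.solr', module = 'solr-solrj', version = '4.7.0')

import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.HttpSolrServer

// talk to the shard's core directly instead of routing through CloudSolrServer
def shard3 = new HttpSolrServer("http://shard3_url_goes_here:8080/solr/collection1")

SolrQuery q = new SolrQuery("*:*")
q.set("sort", "id asc")
q.set("distrib", false)      // keep the query on this core only
q.setStart(4000)
q.setRows(1000)

println shard3.query(q).results.numFound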

The other workaround is to adapt to use the new cursorMark
functionality.  I've manually tried a few requests and it is pretty
efficient, and doesn't result in the OOM errors on the coordinating node.
 However, I've only done this in a single-threaded manner.  I'm wondering if
there would be a way to get cursor marks for an entire result set at a
given page interval, so that they could then be fed to the pool of parallel
workers to get the results in parallel rather than single-threaded.  Is
there a way to do this so we could process the results in parallel?
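
A minimal single-threaded cursorMark walk along those lines (assuming SolrJ
and Solr 4.7+, where cursorMark is available; process(doc) is a hypothetical
per-document handler and the ZooKeeper address is made up):

@Grab(group = 'org.apache.solr', module = 'solr-solrj', version = '4.7.0')

import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.CloudSolrServer
import org.apache.solr.common.params.CursorMarkParams

def solr = new CloudSolrServer("zkhost:2181")   // hypothetical ZooKeeper address
solr.defaultCollection = "collection1"

SolrQuery q = new SolrQuery("*:*")
q.set("sort", "id asc")       // cursorMark requires a sort that ends on the uniqueKey
q.setRows(1000)

String cursor = CursorMarkParams.CURSOR_MARK_START
while (true) {
    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor)
    def rsp = solr.query(q)
    rsp.results.each { doc ->
        // process(doc)        // hypothetical per-document handler
    }
    String next = rsp.nextCursorMark
    if (cursor == next) break  // an unchanged mark means the result set is exhausted
    cursor = next
}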

Any other possible solutions?  Thanks in advance.

Mike


Re: Deep paging in parallel with solr cloud - OutOfMemory

2014-03-17 Thread Mike Hugo
I should add that each node has 16GB of RAM, 8GB of which is allocated to the
JVM.  Each node has about 200k docs and happily uses only about 3 or 4GB of
RAM during normal operation.  It's only during this deep pagination that we
have seen OOM errors.


On Mon, Mar 17, 2014 at 3:14 PM, Mike Hugo m...@piragua.com wrote:

 Hello,

 We recently upgraded to Solr Cloud 4.7 (went from a single node Solr 4.0
 instance to 3 node Solr 4.7 cluster).

 Part of out application does an automated traversal of all documents that
 match a specific query.  It does this by iterating through results by
 setting the start and rows parameters, starting with start=0 and rows=1000,
 then start=1000, rows=1000, start = 2000, rows=1000, etc etc.

 We do this in parallel fashion with multiple workers on multiple nodes.
  It's easy to chunk up the work to be done by figuring out how many total
 results there are and then creating 'chunks' (0-1000, 1000-2000, 2000-3000)
 and sending each chunk to a worker in a pool of multi-threaded workers.

 This worked well for us with a single server.  However upon upgrading to
 solr cloud, we've found that this quickly (within the first 4 or 5
 requests) causes an OutOfMemory error on the coordinating node that
 receives the query.   I don't fully understand what's going on here, but it
 looks like the coordinating node receives the query and sends it to the
 shard requested.  For example, given:

 shards=shard3sort=id+ascstart=4000q=*:*rows=1000

 The coordinating node sends this query to shard3:

 NOW=1395086719189shard.url=
 http://shard3_url_goes_here:8080/solr/collection1/fl=idsort=id+ascstart=0q=*:*distrib=falsewt=javabinisShard=truefsv=trueversion=2rows=5000

 Notice the rows parameter is 5000 (start + rows).  If the coordinator node
 is able to process the result set (which works for the first few pages,
 after that it will quickly run out of memory), it eventually issues this
 request back to shard3:

 NOW=1395086719189shard.url=
 http://10.128.215.226:8080/extera-search/gemindex/start=4000ids=a..bunch...(1000)..of..doc..ids..go..hereq=*:*distrib=falsewt=javabinisShard=trueversion=2rows=1000

 and then finally returns the response to the client.

 One possible workaround:  We've found that if we issue non-distributed
 requests to specific shards, that we get performance along the same lines
 that we did before.  E.g. issue a query with shards=shard3distrib=false
 directly to the url of the shard3 instance, rather than going through the
 cloud solr server solrj API.

 The other workaround is to adapt to use the new new cursorMark
 functionality.  I've manually tried a few requests and it is pretty
 efficient, and doesn't result in the OOM errors on the coordinating node.
  However, i've only done this in single threaded manner.  I'm wondering if
 there would be a way to get cursor marks for an entire result set at a
 given page interval, so that they could then be fed to the pool of parallel
 workers to get the results in parallel rather than single threaded.  Is
 there a way to do this so we could process the results in parallel?

 Any other possible solutions?  Thanks in advance.

 Mike





Re: Deep paging in parallel with solr cloud - OutOfMemory

2014-03-17 Thread Mike Hugo
Thanks Steve,

That certainly looks like it could be the culprit.  Any word on a release
date for 4.7.1?  Days?  Weeks?  Months?

Mike


On Mon, Mar 17, 2014 at 3:31 PM, Steve Rowe sar...@gmail.com wrote:

 Hi Mike,

 The OOM you're seeing is likely a result of the bug described in (and
 fixed by a commit under) SOLR-5875: 
 https://issues.apache.org/jira/browse/SOLR-5875.

 If you can build from source, it would be great if you could confirm the
 fix addresses the issue you're facing.

 This fix will be part of a to-be-released Solr 4.7.1.

 Steve

 On Mar 17, 2014, at 4:14 PM, Mike Hugo m...@piragua.com wrote:

  Hello,
 
  We recently upgraded to Solr Cloud 4.7 (went from a single node Solr 4.0
  instance to 3 node Solr 4.7 cluster).
 
  Part of out application does an automated traversal of all documents that
  match a specific query.  It does this by iterating through results by
  setting the start and rows parameters, starting with start=0 and
 rows=1000,
  then start=1000, rows=1000, start = 2000, rows=1000, etc etc.
 
  We do this in parallel fashion with multiple workers on multiple nodes.
  It's easy to chunk up the work to be done by figuring out how many total
  results there are and then creating 'chunks' (0-1000, 1000-2000,
 2000-3000)
  and sending each chunk to a worker in a pool of multi-threaded workers.
 
  This worked well for us with a single server.  However upon upgrading to
  solr cloud, we've found that this quickly (within the first 4 or 5
  requests) causes an OutOfMemory error on the coordinating node that
  receives the query.   I don't fully understand what's going on here, but
 it
  looks like the coordinating node receives the query and sends it to the
  shard requested.  For example, given:
 
  shards=shard3sort=id+ascstart=4000q=*:*rows=1000
 
  The coordinating node sends this query to shard3:
 
  NOW=1395086719189shard.url=
 
 http://shard3_url_goes_here:8080/solr/collection1/fl=idsort=id+ascstart=0q=*:*distrib=falsewt=javabinisShard=truefsv=trueversion=2rows=5000
 
  Notice the rows parameter is 5000 (start + rows).  If the coordinator
 node
  is able to process the result set (which works for the first few pages,
  after that it will quickly run out of memory), it eventually issues this
  request back to shard3:
 
  NOW=1395086719189shard.url=
 
 http://10.128.215.226:8080/extera-search/gemindex/start=4000ids=a..bunch...(1000)..of..doc..ids..go..hereq=*:*distrib=falsewt=javabinisShard=trueversion=2rows=1000
 
  and then finally returns the response to the client.
 
  One possible workaround:  We've found that if we issue non-distributed
  requests to specific shards, that we get performance along the same lines
  that we did before.  E.g. issue a query with shards=shard3distrib=false
  directly to the url of the shard3 instance, rather than going through the
  cloud solr server solrj API.
 
  The other workaround is to adapt to use the new new cursorMark
  functionality.  I've manually tried a few requests and it is pretty
  efficient, and doesn't result in the OOM errors on the coordinating node.
  However, i've only done this in single threaded manner.  I'm wondering if
  there would be a way to get cursor marks for an entire result set at a
  given page interval, so that they could then be fed to the pool of
 parallel
  workers to get the results in parallel rather than single threaded.  Is
  there a way to do this so we could process the results in parallel?
 
  Any other possible solutions?  Thanks in advance.
 
  Mike




Re: Deep paging in parallel with solr cloud - OutOfMemory

2014-03-17 Thread Mike Hugo
Thanks!


On Mon, Mar 17, 2014 at 3:47 PM, Steve Rowe sar...@gmail.com wrote:

 Mike,

 Days.  I plan on making a 4.7.1 release candidate a week from today, and
 assuming nobody finds any problems with the RC, it will be released roughly
 four days thereafter (three days for voting + one day for release
 propagation to the Apache mirrors): i.e., next Friday-ish.

 Steve

 On Mar 17, 2014, at 4:40 PM, Mike Hugo m...@piragua.com wrote:

  Thanks Steve,
 
  That certainly looks like it could be the culprit.  Any word on a release
  date for 4.7.1?  Days?  Weeks?  Months?
 
  Mike
 
 
  On Mon, Mar 17, 2014 at 3:31 PM, Steve Rowe sar...@gmail.com wrote:
 
  Hi Mike,
 
  The OOM you're seeing is likely a result of the bug described in (and
  fixed by a commit under) SOLR-5875: 
  https://issues.apache.org/jira/browse/SOLR-5875.
 
  If you can build from source, it would be great if you could confirm the
  fix addresses the issue you're facing.
 
  This fix will be part of a to-be-released Solr 4.7.1.
 
  Steve
 
  On Mar 17, 2014, at 4:14 PM, Mike Hugo m...@piragua.com wrote:
 
  Hello,
 
  We recently upgraded to Solr Cloud 4.7 (went from a single node Solr
 4.0
  instance to 3 node Solr 4.7 cluster).
 
  Part of out application does an automated traversal of all documents
 that
  match a specific query.  It does this by iterating through results by
  setting the start and rows parameters, starting with start=0 and
  rows=1000,
  then start=1000, rows=1000, start = 2000, rows=1000, etc etc.
 
  We do this in parallel fashion with multiple workers on multiple nodes.
  It's easy to chunk up the work to be done by figuring out how many
 total
  results there are and then creating 'chunks' (0-1000, 1000-2000,
  2000-3000)
  and sending each chunk to a worker in a pool of multi-threaded workers.
 
  This worked well for us with a single server.  However upon upgrading
 to
  solr cloud, we've found that this quickly (within the first 4 or 5
  requests) causes an OutOfMemory error on the coordinating node that
  receives the query.   I don't fully understand what's going on here,
 but
  it
  looks like the coordinating node receives the query and sends it to the
  shard requested.  For example, given:
 
  shards=shard3sort=id+ascstart=4000q=*:*rows=1000
 
  The coordinating node sends this query to shard3:
 
  NOW=1395086719189shard.url=
 
 
 http://shard3_url_goes_here:8080/solr/collection1/fl=idsort=id+ascstart=0q=*:*distrib=falsewt=javabinisShard=truefsv=trueversion=2rows=5000
 
  Notice the rows parameter is 5000 (start + rows).  If the coordinator
  node
  is able to process the result set (which works for the first few pages,
  after that it will quickly run out of memory), it eventually issues
 this
  request back to shard3:
 
  NOW=1395086719189shard.url=
 
 
 http://10.128.215.226:8080/extera-search/gemindex/start=4000ids=a..bunch...(1000)..of..doc..ids..go..hereq=*:*distrib=falsewt=javabinisShard=trueversion=2rows=1000
 
  and then finally returns the response to the client.
 
  One possible workaround:  We've found that if we issue non-distributed
  requests to specific shards, that we get performance along the same
 lines
  that we did before.  E.g. issue a query with
 shards=shard3distrib=false
  directly to the url of the shard3 instance, rather than going through
 the
  cloud solr server solrj API.
 
  The other workaround is to adapt to use the new new cursorMark
  functionality.  I've manually tried a few requests and it is pretty
  efficient, and doesn't result in the OOM errors on the coordinating
 node.
  However, i've only done this in single threaded manner.  I'm wondering
 if
  there would be a way to get cursor marks for an entire result set at a
  given page interval, so that they could then be fed to the pool of
  parallel
  workers to get the results in parallel rather than single threaded.  Is
  there a way to do this so we could process the results in parallel?
 
  Any other possible solutions?  Thanks in advance.
 
  Mike
 
 




Re: Deep paging in parallel with solr cloud - OutOfMemory

2014-03-17 Thread Mike Hugo
Cursor mark definitely seems like the way to go.  If I can get it to work
in parallel then that's additional bonus


On Mon, Mar 17, 2014 at 5:41 PM, Greg Pendlebury
greg.pendleb...@gmail.comwrote:

 Shouldn't all deep pagination against a cluster use the new cursor mark
 feature instead of 'start' and 'rows'?

 4 or 5 requests still seems a very low limit to be running into OOM
 issues though, so perhaps it is both issues combined?

 Ta,
 Greg



 On 18 March 2014 07:49, Mike Hugo m...@piragua.com wrote:

  Thanks!
 
 
  On Mon, Mar 17, 2014 at 3:47 PM, Steve Rowe sar...@gmail.com wrote:
 
   Mike,
  
   Days.  I plan on making a 4.7.1 release candidate a week from today,
 and
   assuming nobody finds any problems with the RC, it will be released
  roughly
   four days thereafter (three days for voting + one day for release
   propogation to the Apache mirrors): i.e., next Friday-ish.
  
   Steve
  
   On Mar 17, 2014, at 4:40 PM, Mike Hugo m...@piragua.com wrote:
  
Thanks Steve,
   
That certainly looks like it could be the culprit.  Any word on a
  release
date for 4.7.1?  Days?  Weeks?  Months?
   
Mike
   
   
On Mon, Mar 17, 2014 at 3:31 PM, Steve Rowe sar...@gmail.com
 wrote:
   
Hi Mike,
   
The OOM you're seeing is likely a result of the bug described in
 (and
fixed by a commit under) SOLR-5875: 
https://issues.apache.org/jira/browse/SOLR-5875.
   
If you can build from source, it would be great if you could confirm
  the
fix addresses the issue you're facing.
   
This fix will be part of a to-be-released Solr 4.7.1.
   
Steve
   
On Mar 17, 2014, at 4:14 PM, Mike Hugo m...@piragua.com wrote:
   
Hello,
   
We recently upgraded to Solr Cloud 4.7 (went from a single node
 Solr
   4.0
instance to 3 node Solr 4.7 cluster).
   
Part of out application does an automated traversal of all
 documents
   that
match a specific query.  It does this by iterating through results
 by
setting the start and rows parameters, starting with start=0 and
rows=1000,
then start=1000, rows=1000, start = 2000, rows=1000, etc etc.
   
We do this in parallel fashion with multiple workers on multiple
  nodes.
It's easy to chunk up the work to be done by figuring out how many
   total
results there are and then creating 'chunks' (0-1000, 1000-2000,
2000-3000)
and sending each chunk to a worker in a pool of multi-threaded
  workers.
   
This worked well for us with a single server.  However upon
 upgrading
   to
solr cloud, we've found that this quickly (within the first 4 or 5
requests) causes an OutOfMemory error on the coordinating node that
receives the query.   I don't fully understand what's going on
 here,
   but
it
looks like the coordinating node receives the query and sends it to
  the
shard requested.  For example, given:
   
shards=shard3sort=id+ascstart=4000q=*:*rows=1000
   
The coordinating node sends this query to shard3:
   
NOW=1395086719189shard.url=
   
   
  
 
 http://shard3_url_goes_here:8080/solr/collection1/fl=idsort=id+ascstart=0q=*:*distrib=falsewt=javabinisShard=truefsv=trueversion=2rows=5000
   
Notice the rows parameter is 5000 (start + rows).  If the
 coordinator
node
is able to process the result set (which works for the first few
  pages,
after that it will quickly run out of memory), it eventually issues
   this
request back to shard3:
   
NOW=1395086719189shard.url=
   
   
  
 
 http://10.128.215.226:8080/extera-search/gemindex/start=4000ids=a..bunch...(1000)..of..doc..ids..go..hereq=*:*distrib=falsewt=javabinisShard=trueversion=2rows=1000
   
and then finally returns the response to the client.
   
One possible workaround:  We've found that if we issue
  non-distributed
requests to specific shards, that we get performance along the same
   lines
that we did before.  E.g. issue a query with
   shards=shard3distrib=false
directly to the url of the shard3 instance, rather than going
 through
   the
cloud solr server solrj API.
   
The other workaround is to adapt to use the new new cursorMark
functionality.  I've manually tried a few requests and it is pretty
efficient, and doesn't result in the OOM errors on the coordinating
   node.
However, i've only done this in single threaded manner.  I'm
  wondering
   if
there would be a way to get cursor marks for an entire result set
 at
  a
given page interval, so that they could then be fed to the pool of
parallel
workers to get the results in parallel rather than single threaded.
   Is
there a way to do this so we could process the results in parallel?
   
Any other possible solutions?  Thanks in advance.
   
Mike
   
   
  
  
 



Re: Deep paging in parallel with solr cloud - OutOfMemory

2014-03-17 Thread Mike Hugo
Greg and I are talking about the same type of parallel.

We do the same thing - if I know there are 10,000 results, we can chunk
that up across multiple worker threads up front without having to page
through the results.  We know there are 10 chunks of 1,000, so we can have
one thread process 0-1000 while another thread starts on 1000-2000 at the
same time.

The only idea I've had so far is that you could have a single thread up
front iterate through the entire result set, perhaps asking for 'null' from
the fl param (to make the response more lightweight), and record all
the next cursorMark tokens - then just fire those off to the workers as you
get them.  Depending on the amount of processing being done for each
response it might give you some optimizations from being
multi-threaded... or maybe the overhead of calculating the cursorMarks isn't
worth the effort.  Haven't tried either way yet.
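
A hedged sketch of that collector idea (assuming SolrJ / Solr 4.7+; markQueue
and the worker pool that would consume it are hypothetical, and fl is limited
to the id field to keep the collector's responses light):

@Grab(group = 'org.apache.solr', module = 'solr-solrj', version = '4.7.0')

import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.CloudSolrServer
import org.apache.solr.common.params.CursorMarkParams
import java.util.concurrent.LinkedBlockingQueue

def solr = new CloudSolrServer("zkhost:2181")   // hypothetical ZooKeeper address
solr.defaultCollection = "collection1"

def markQueue = new LinkedBlockingQueue<String>()

SolrQuery q = new SolrQuery("*:*")
q.set("sort", "id asc")
q.setFields("id")             // lightweight responses while collecting marks
q.setRows(1000)

String cursor = CursorMarkParams.CURSOR_MARK_START
while (true) {
    markQueue.put(cursor)     // a worker could later re-issue this page with the full fl
    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor)
    def rsp = solr.query(q)
    String next = rsp.nextCursorMark
    if (cursor == next) break
    cursor = next
}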

Mike


On Mon, Mar 17, 2014 at 6:54 PM, Greg Pendlebury
greg.pendleb...@gmail.comwrote:

 Sorry, I meant one thread requesting records 1 - 1000, whilst the next
 thread requests 1001 - 2000 from the same ordered result set. We've
 observed several of our customers trying to harvest our data with
 multi-threaded scripts that work like this. I thought it would not work
 using cursor marks... but:

 A) I could be wrong, and
 B) I could be talking about parallel in a different way to Mike.

 Ta,
 Greg



 On 18 March 2014 10:24, Yonik Seeley yo...@heliosearch.com wrote:

  On Mon, Mar 17, 2014 at 7:14 PM, Greg Pendlebury
  greg.pendleb...@gmail.com wrote:
   My suspicion is that it won't work in parallel
 
  Deep paging with cursorMark does work with distributed search
  (assuming that's what you meant by parallel... querying sub-shards
  in parallel?).
 
  -Yonik
  http://heliosearch.org - solve Solr GC pauses with off-heap filters
  and fieldcache
 



Change replication factor

2014-03-12 Thread Mike Hugo
After a collection has been created in SolrCloud, is there a way to modify
the Replication Factor?

Say I start with a few nodes in the cluster, and have a replication factor
of 2.  Over time, the index grows and we add more nodes to the cluster, can
I increase the replication factor to 3?

Thanks!

Mike


Re: Change replication factor

2014-03-12 Thread Mike Hugo
Thanks Mark!

Mike


On Wed, Mar 12, 2014 at 12:43 PM, Mark Miller markrmil...@gmail.com wrote:

 You can simply create a new SolrCore with the same collection and shard id
 as the collection and shard you want to add a replica to.

 There is also an addReplica command coming to the collections API. Or
 perhaps it's in 4.7, I don't know; this JIRA issue is a little confusing as
 it's still open, though it looks like stuff has been committed:
 https://issues.apache.org/jira/browse/SOLR-5130
 --
 Mark Miller
 about.me/markrmiller

 On March 12, 2014 at 10:40:15 AM, Mike Hugo (m...@piragua.com) wrote:

 After a collection has been created in SolrCloud, is there a way to modify
 the Replication Factor?

 Say I start with a few nodes in the cluster, and have a replication factor
 of 2. Over time, the index grows and we add more nodes to the cluster, can
 I increase the replication factor to 3?

 Thanks!

 Mike
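
For reference, a hedged sketch of what the addReplica call might look like
once the Collections API action from SOLR-5130 is available (the action and
parameter names here assume the 4.8+ Collections API, so check them against
the release you are actually running):

@Grab(group = 'org.apache.solr', module = 'solr-solrj', version = '4.8.0')

import org.apache.solr.client.solrj.impl.HttpSolrServer
import org.apache.solr.client.solrj.request.QueryRequest
import org.apache.solr.common.params.ModifiableSolrParams

def server = new HttpSolrServer("http://localhost:8983/solr")
def params = new ModifiableSolrParams()
params.set("action", "ADDREPLICA")
params.set("collection", "collection1")
params.set("shard", "shard1")

def req = new QueryRequest(params)
req.path = "/admin/collections"   // route the request at the Collections API handler
println server.request(req)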



Re: Expanding sets of words

2013-05-21 Thread Mike Hugo
I'll buy that book :)

Does this work with multi-word terms?

(common lisp or assembly language)
(programming or coding or development)


I tried:

{!surround}(common lisp OR assembly language) W (programming)

but that returns a parse error.

Putting quotes around the multi-word terms parses but returns 0 results

{!surround}("common lisp" OR "assembly language") W (programming)


On Tue, May 21, 2013 at 8:32 AM, Jack Krupansky j...@basetechnology.comwrote:

 I'll make sure to include that specific example in the new Solr book.


 -- Jack Krupansky

 -Original Message- From: Mike Hugo
 Sent: Tuesday, May 21, 2013 12:29 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Expanding sets of words


 Fantastic!  Thanks!


 On Mon, May 20, 2013 at 11:21 PM, Jack Krupansky j...@basetechnology.com
 **wrote:

  Yes, with the Solr surround query parser:

 q=(java OR groovy OR scala) W (programming OR coding OR development)

 BUT... there is the caveat that the surround query parser does no
 analysis. So, maybe you need Java OR java etc. Or, if you know that the
 index is lower case.

 Try this dataset:

 curl "http://localhost:8983/solr/collection1/update?commit=true" -H 'Content-type:application/csv' -d '

 id,features
 doc-1,java coding
 doc-2,java programming
 doc-3,java development
 doc-4,groovy coding
 doc-5,groovy programming
 doc-6,groovy development
 doc-7,scala coding
 doc-8,scala programming
 doc-9,scala development
 doc-10,c coding
 doc-11,c programming
 doc-12,c development
 doc-13,java language
 doc-14,groovy language
 doc-15,scala language'

 And try these commands:

 curl "http://localhost:8983/solr/select/?q=(java+OR+scala)+W+programming&df=features&defType=surround&indent=true"

 curl "http://localhost:8983/solr/select/?q=(java+OR+scala)+W+(programming+OR+coding)&df=features&defType=surround&indent=true"

 curl "http://localhost:8983/solr/select/?q=(java+OR+groovy+OR+scala)+W+(programming+OR+coding+OR+development)&df=features&defType=surround&indent=true"


 The LucidWorks Search query parser also supports NEAR, BEFORE, and AFTER
 operators, in conjunction with OR and - to generate span queries:

 q=(java OR groovy OR scala) BEFORE:0 (programming OR coding OR
 development)

 -- Jack Krupansky

 -Original Message- From: Mike Hugo
 Sent: Monday, May 20, 2013 11:42 PM
 To: solr-user@lucene.apache.org
 Subject: Expanding sets of words


 Is there a way to query for combinations of two sets of words?  For
 example, if I had

 (java or groovy or scala)
 (programming or coding or development)

 Is there a query parser that, at query time, would expand that into
 combinations like

 java programming
 groovy programming
 scala programming
 java coding
 java development
 
 etc etc etc

 Thanks!

 Mike





Re: Expanding sets of words

2013-05-21 Thread Mike Hugo
Fantastic!  Thanks for following up - this is great.

Mike



On Tue, May 21, 2013 at 11:17 PM, Jack Krupansky j...@basetechnology.comwrote:

 Ah... and the answer is:

 curl "http://localhost:8983/solr/select/?q=(assembly+W+language+OR+scala)+W+programming&df=features&defType=surround&indent=true"

 IOW, any quoted phrase like "a b c d" can be written in surround as a W b
 W c W d.

 Presto!

 I'll make sure that example is in the book as well.

 -- Jack Krupansky

 -Original Message- From: Jack Krupansky
 Sent: Tuesday, May 21, 2013 11:37 AM

 To: solr-user@lucene.apache.org
 Subject: Re: Expanding sets of words

 Hmmm... I did a quick test and quoted phrase wasn't working for me either.
 Oh well.

 But... it should work for the LucidWorks Search query parser!

 -- Jack Krupansky

 -Original Message- From: Mike Hugo
 Sent: Tuesday, May 21, 2013 11:26 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Expanding sets of words

 I'll buy that book :)

  Does this work with multi-word terms?

 (common lisp or assembly language)
 (programming or coding or development)


 I tried:

 {!surround}(common lisp OR assembly language) W (programming)

 but that returns a parse error.

 Putting quotes around the multi-word terms parses but returns 0 results

  {!surround}("common lisp" OR "assembly language") W (programming)


 On Tue, May 21, 2013 at 8:32 AM, Jack Krupansky
 j...@basetechnology.com**wrote:

  I'll make sure to include that specific example in the new Solr book.


 -- Jack Krupansky

 -Original Message- From: Mike Hugo
 Sent: Tuesday, May 21, 2013 12:29 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Expanding sets of words


 Fantastic!  Thanks!


 On Mon, May 20, 2013 at 11:21 PM, Jack Krupansky j...@basetechnology.com
 
 **wrote:

  Yes, with the Solr surround query parser:


 q=(java OR groovy OR scala) W (programming OR coding OR development)

 BUT... there is the caveat that the surround query parser does no
 analysis. So, maybe you need Java OR java etc. Or, if you know that the
 index is lower case.

 Try this dataset:

  curl "http://localhost:8983/solr/collection1/update?commit=true" -H 'Content-type:application/csv' -d '

 id,features
 doc-1,java coding
 doc-2,java programming
 doc-3,java development
 doc-4,groovy coding
 doc-5,groovy programming
 doc-6,groovy development
 doc-7,scala coding
 doc-8,scala programming
 doc-9,scala development
 doc-10,c coding
 doc-11,c programming
 doc-12,c development
 doc-13,java language
 doc-14,groovy language
 doc-15,scala language'

 And try these commands:

  curl "http://localhost:8983/solr/select/?q=(java+OR+scala)+W+programming&df=features&defType=surround&indent=true"

  curl "http://localhost:8983/solr/select/?q=(java+OR+scala)+W+(programming+OR+coding)&df=features&defType=surround&indent=true"

  curl "http://localhost:8983/solr/select/?q=(java+OR+groovy+OR+scala)+W+(programming+OR+coding+OR+development)&df=features&defType=surround&indent=true"


 The LucidWorks Search query parser also supports NEAR, BEFORE, and AFTER
 operators, in conjunction with OR and - to generate span queries:

 q=(java OR groovy OR scala) BEFORE:0 (programming OR coding OR
 development)

 -- Jack Krupansky

 -Original Message- From: Mike Hugo
 Sent: Monday, May 20, 2013 11:42 PM
 To: solr-user@lucene.apache.org
 Subject: Expanding sets of words


 Is there a way to query for combinations of two sets of words?  For
 example, if I had

Expanding sets of words

2013-05-20 Thread Mike Hugo
Is there a way to query for combinations of two sets of words?  For
example, if I had

(java or groovy or scala)
(programming or coding or development)

Is there a query parser that, at query time, would expand that into
combinations like

java programming
groovy programming
scala programming
java coding
java development

etc etc etc

Thanks!

Mike


Re: Expanding sets of words

2013-05-20 Thread Mike Hugo
Fantastic!  Thanks!


On Mon, May 20, 2013 at 11:21 PM, Jack Krupansky j...@basetechnology.comwrote:

 Yes, with the Solr surround query parser:

 q=(java OR groovy OR scala) W (programming OR coding OR development)

 BUT... there is the caveat that the surround query parser does no
 analysis. So, maybe you need Java OR java etc. Or, if you know that the
 index is lower case.

 Try this dataset:

 curl "http://localhost:8983/solr/collection1/update?commit=true" -H 'Content-type:application/csv' -d '
 id,features
 doc-1,java coding
 doc-2,java programming
 doc-3,java development
 doc-4,groovy coding
 doc-5,groovy programming
 doc-6,groovy development
 doc-7,scala coding
 doc-8,scala programming
 doc-9,scala development
 doc-10,c coding
 doc-11,c programming
 doc-12,c development
 doc-13,java language
 doc-14,groovy language
 doc-15,scala language'

 And try these commands:

 curl "http://localhost:8983/solr/select/?q=(java+OR+scala)+W+programming&df=features&defType=surround&indent=true"

 curl "http://localhost:8983/solr/select/?q=(java+OR+scala)+W+(programming+OR+coding)&df=features&defType=surround&indent=true"

 curl "http://localhost:8983/solr/select/?q=(java+OR+groovy+OR+scala)+W+(programming+OR+coding+OR+development)&df=features&defType=surround&indent=true"

 The LucidWorks Search query parser also supports NEAR, BEFORE, and AFTER
 operators, in conjunction with OR and - to generate span queries:

 q=(java OR groovy OR scala) BEFORE:0 (programming OR coding OR development)

 -- Jack Krupansky

 -Original Message- From: Mike Hugo
 Sent: Monday, May 20, 2013 11:42 PM
 To: solr-user@lucene.apache.org
 Subject: Expanding sets of words


 Is there a way to query for combinations of two sets of words?  For
 example, if I had

 (java or groovy or scala)
 (programming or coding or development)

 Is there a query parser that, at query time, would expand that into
 combinations like

 java programming
 groovy programming
 scala programming
 java coding
 java development
 
 etc etc etc

 Thanks!

 Mike



ConcurrentUpdateSolrServer flush on size of documents rather than queue size

2013-03-01 Thread Mike Hugo
Does anyone know if a version of ConcurrentUpdateSolrServer exists that
would use the size in memory of the queue to decide when to send documents
to the solr server?

For example, if I set up a ConcurrentUpdateSolrServer with 4 threads and a
batch size of 200 that works if my documents are small.  But if I am
building up documents that have a lot of text, I have run into an
OutOfMemory exception in my process that builds the docs.  The document
sizes are variable.

What I'd like to be able to do is submit documents to the Solr server when
the size of the queue reaches (or is greater than) 200MB or something like
that, so rather than specifying the number of documents to put in the queue,
I'd specify the size in MB to build up before submitting.

Does something like this exist already?

Thanks,

Mike
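
A hedged sketch of that idea (not an existing SolrJ class, and it flushes
through a plain HttpSolrServer rather than ConcurrentUpdateSolrServer's
background queue): buffer documents until a rough byte estimate crosses a
threshold, then send the batch.  The size estimate below is deliberately
crude and purely illustrative.

@Grab(group = 'org.apache.solr', module = 'solr-solrj', version = '4.2.0')

import org.apache.solr.client.solrj.impl.HttpSolrServer
import org.apache.solr.common.SolrInputDocument

class SizeBoundedBatcher {
    HttpSolrServer server
    long thresholdBytes

    private List<SolrInputDocument> buffer = []
    private long bufferedBytes = 0

    void add(SolrInputDocument doc) {
        buffer << doc
        // rough estimate: total length of the string form of every field value
        doc.getFieldNames().each { name ->
            doc.getFieldValues(name).each { value ->
                bufferedBytes += value.toString().length()
            }
        }
        if (bufferedBytes >= thresholdBytes) {
            flush()
        }
    }

    void flush() {
        if (!buffer.isEmpty()) {
            server.add(buffer)      // send the whole batch in one request
            buffer.clear()
            bufferedBytes = 0
        }
    }
}

// usage: call batcher.add(doc) for each document, and batcher.flush() at the end
def batcher = new SizeBoundedBatcher(
        server: new HttpSolrServer("http://localhost:8983/solr/collection1"),
        thresholdBytes: 200L * 1024 * 1024)   // flush at roughly 200MB of buffered text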


Re: always getting distinct count of -1 in luke response (solr4 snapshot)

2012-05-23 Thread Mike Hugo
Explicitly running an optimize on the index via the admin screens solved
this problem - the correct counts are now being returned.

On Tue, May 22, 2012 at 4:33 PM, Mike Hugo m...@piragua.com wrote:

 We're testing a snapshot of Solr4 and I'm looking at some of the responses
 from the Luke request handler.  Everything looks good so far, with the
 exception of the distinct attribute which (in Solr3) shows me the
 distinct number of terms for a given field.

 Given the request below, I'm consistently getting a response back with a
 value in the distinct field of -1.  Is there something different I need
 to do to get back the actual distinct count?

 Thanks!

 Mike

 http://localhost:8080/solr/core1/admin/luke?wt=json&fl=label&numTerms=1

 "fields": {
   "label": {
     "type": "text_general",
     "schema": "IT-M--",
     "index": "(unstored field)",
     "docs": 63887,
     "distinct": -1,
     "topTerms": [



always getting distinct count of -1 in luke response (solr4 snapshot)

2012-05-22 Thread Mike Hugo
We're testing a snapshot of Solr4 and I'm looking at some of the responses
from the Luke request handler.  Everything looks good so far, with the
exception of the distinct attribute which (in Solr3) shows me the
distinct number of terms for a given field.

Given the request below, I'm consistently getting a response back with a
value in the distinct field of -1.  Is there something different I need
to do to get back the actual distinct count?

Thanks!

Mike

http://localhost:8080/solr/core1/admin/luke?wt=json&fl=label&numTerms=1

"fields": {
  "label": {
    "type": "text_general",
    "schema": "IT-M--",
    "index": "(unstored field)",
    "docs": 63887,
    "distinct": -1,
    "topTerms": [


Re: Size of suggest dictionary

2012-02-16 Thread Mike Hugo
Thanks Em!

What if we use a threshold value in the suggest configuration, like 

  <float name="threshold">0.005</float>

I assume the dictionary size will then be smaller than the total number of
distinct terms; is there any way to determine what that size is?

Thanks,

Mike


On Wednesday, February 15, 2012 at 4:39 PM, Em wrote:

 Hello Mike,
 
 have a look at Solr's Schema Browser. Click on FIELDS, select label
 and have a look at the number of distinct (term-)values.
 
 Regards,
 Em
 
 
 Am 15.02.2012 23:07, schrieb Mike Hugo:
  Hello,
  
  We're building an auto suggest component based on the label field of
  documents. Is there a way to see how many terms are in the dictionary, or
  how much memory it's taking up? I looked on the statistics page but didn't
  find anything obvious.
  
  Thanks in advance,
  
  Mike
  
  ps- here's the config:
  
  <searchComponent name="suggestlabel" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">suggestlabel</str>
      <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
      <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
      <str name="field">label</str>
      <str name="buildOnOptimize">true</str>
    </lst>
  </searchComponent>

  <requestHandler name="suggestlabel"
                  class="org.apache.solr.handler.component.SearchHandler">
    <lst name="defaults">
      <str name="spellcheck">true</str>
      <str name="spellcheck.dictionary">suggestlabel</str>
      <str name="spellcheck.count">10</str>
    </lst>
    <arr name="components">
      <str>suggestlabel</str>
    </arr>
  </requestHandler>
  
 
 
 




Size of suggest dictionary

2012-02-15 Thread Mike Hugo
Hello,

We're building an auto suggest component based on the label field of
documents.  Is there a way to see how many terms are in the dictionary, or
how much memory it's taking up?  I looked on the statistics page but didn't
find anything obvious.

Thanks in advance,

Mike

ps- here's the config:

<searchComponent name="suggestlabel" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggestlabel</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">label</str>
    <str name="buildOnOptimize">true</str>
  </lst>
</searchComponent>

<requestHandler name="suggestlabel"
                class="org.apache.solr.handler.component.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggestlabel</str>
    <str name="spellcheck.count">10</str>
  </lst>
  <arr name="components">
    <str>suggestlabel</str>
  </arr>
</requestHandler>


Re: Solr Join query with fq not correctly filtering results?

2012-02-01 Thread Mike Hugo
Thanks Yonik!!

The join functionality is proving extremely useful for us in a specific use
case - we're really looking forward to join and other cool features coming
in Solr4!!

Mike

On Wed, Feb 1, 2012 at 3:30 PM, Yonik Seeley yo...@lucidimagination.comwrote:

 Thanks for your persistence in tracking this down Mike!
 I'm going to start looking into this now...

 -Yonik
 lucidimagination.com



 On Thu, Jan 26, 2012 at 11:06 PM, Mike Hugo m...@piragua.com wrote:
  I created issue https://issues.apache.org/jira/browse/SOLR-3062 for this
  problem.  I was able to track it down to something in this commit -
   http://svn.apache.org/viewvc?view=revision&revision=1188624 (LUCENE-1536:
  Filters can now be applied down-low, if their DocIdSet implements a new
  bits() method, returning all documents in a random access way
  ) - before that commit the join / fq functionality works as expected /
  documented on the wiki page.  After that commit it's broken.
 
  Any assistance is greatly appreciated!
 
  Thanks,
 
  Mike
 
  On Thu, Jan 26, 2012 at 11:04 AM, Mike Hugo m...@piragua.com wrote:
 
  Hello,
 
  I'm trying out the Solr JOIN query functionality on trunk.  I have the
  latest checkout, revision #1236272 - I did the following steps to get
 the
  example up and running:
 
  cd solr
  ant example
  java -jar start.jar
  cd exampledocs
  java -jar post.jar *.xml
 
  Then I tried a few of the sample queries on the wiki page
  http://wiki.apache.org/solr/Join.  In particular, this is one that I'm
  interest in
 
  Find all manufacturer docs named belkin, then join them against
  (product) docs and filter that list to only products with a price less
 than
  12 dollars
 
 
  http://localhost:8983/solr/select?q={!join+from=id+to=manu_id_s}compName_s:Belkin&fq=price:%5B%2A+TO+12%5D
 
 
 
  However, when I run that query, I get two results, one with a price of
  19.95 and another with a price of 11.5  Because of the filter query, I'm
  only expecting to see one result - the one with a price of 11.99.
 
  I was also able to replicate this in a unit test added to
  org.apache.solr.TestJoin:
 
@Test
public void testJoin_withFilterQuery() throws Exception {
  assertU(add(doc(id, 1,name, john, title, Director,
  dept_s,Engineering)));
  assertU(add(doc(id, 2,name, mark, title, VP,
  dept_s,Marketing)));
  assertU(add(doc(id, 3,name, nancy, title, MTS,
  dept_s,Sales)));
  assertU(add(doc(id, 4,name, dave, title, MTS,
  dept_s,Support, dept_s,Engineering)));
  assertU(add(doc(id, 5,name, tina, title, VP,
  dept_s,Engineering)));
 
  assertU(add(doc(id,10, dept_id_s, Engineering, text,These
  guys develop stuff)));
  assertU(add(doc(id,11, dept_id_s, Marketing, text,These
  guys make you look good)));
  assertU(add(doc(id,12, dept_id_s, Sales, text,These guys
  sell stuff)));
  assertU(add(doc(id,13, dept_id_s, Support, text,These
 guys
  help customers)));
 
  assertU(commit());
 
  //***
  //This works as expected - the correct number of results are found
  //***
  // find people that develop stuff
  assertJQ(req(q,{!join from=dept_id_s to=dept_s}text:develop,
  fl,id)
 
 
 ,/response=={'numFound':3,'start':0,'docs':[{'id':'1'},{'id':'4'},{'id':'5'}]}
  );
 
  *//
  *// this fails - the response returned finds all three people - it
  should only find John*
  *//expected
  =/response=={numFound:1,start:0,docs:[{id:1}]}*
  *//response = {*
  *//responseHeader:{*
  *//  status:0,*
  *//  QTime:4},*
  *//response:{numFound:3,start:0,docs:[*
  *//  {*
  *//id:1},*
  *//  {*
  *//id:4},*
  *//  {*
  *//id:5}]*
  *//}}*
  *//
  *// find people that develop stuff - but limit via filter query to a
  name of john*
  *assertJQ(req(q,{!join from=dept_id_s to=dept_s}text:develop,
  fl,id, fq, name:john)*
  *,/response=={'numFound':1,'start':0,'docs':[{'id':'1'}]}*
  *);*
 
}
 
 
  Interestingly, I know this worked at some point.  I had a snapshot build
  in my ivy cache from 10/2/2011 and it was working with that
  build maven_artifacts/org/apache/solr/
  solr/4.0-SNAPSHOT/solr-4.0-20111002.161157-1.pom
 
 
  Mike
 



Re: Solr Join query with fq not correctly filtering results?

2012-01-31 Thread Mike Hugo
I've been looking into this a bit further and am trying to figure out why
the FQ isn't getting applied.

Can anyone point me to a good spot in the code to start looking at how FQ
parameters are applied to query results in Solr4?

Thanks,

Mike

On Thu, Jan 26, 2012 at 10:06 PM, Mike Hugo m...@piragua.com wrote:

 I created issue https://issues.apache.org/jira/browse/SOLR-3062 for this
 problem.  I was able to track it down to something in this commit -
  http://svn.apache.org/viewvc?view=revision&revision=1188624 (LUCENE-1536:
 Filters can now be applied down-low, if their DocIdSet implements a new
 bits() method, returning all documents in a random access way
 ) - before that commit the join / fq functionality works as expected /
 documented on the wiki page.  After that commit it's broken.

 Any assistance is greatly appreciated!

 Thanks,

 Mike


 On Thu, Jan 26, 2012 at 11:04 AM, Mike Hugo m...@piragua.com wrote:

 Hello,

 I'm trying out the Solr JOIN query functionality on trunk.  I have the
 latest checkout, revision #1236272 - I did the following steps to get the
 example up and running:

 cd solr
 ant example
 java -jar start.jar
 cd exampledocs
 java -jar post.jar *.xml

 Then I tried a few of the sample queries on the wiki page
 http://wiki.apache.org/solr/Join.  In particular, this is one that I'm
 interest in

 Find all manufacturer docs named belkin, then join them against
 (product) docs and filter that list to only products with a price less than
 12 dollars

  http://localhost:8983/solr/select?q={!join+from=id+to=manu_id_s}compName_s:Belkin&fq=price:%5B%2A+TO+12%5D


 However, when I run that query, I get two results, one with a price of
 19.95 and another with a price of 11.5  Because of the filter query, I'm
 only expecting to see one result - the one with a price of 11.99.

 I was also able to replicate this in a unit test added to
 org.apache.solr.TestJoin:

   @Test
   public void testJoin_withFilterQuery() throws Exception {
 assertU(add(doc(id, 1,name, john, title, Director,
 dept_s,Engineering)));
 assertU(add(doc(id, 2,name, mark, title, VP,
 dept_s,Marketing)));
 assertU(add(doc(id, 3,name, nancy, title, MTS,
 dept_s,Sales)));
 assertU(add(doc(id, 4,name, dave, title, MTS,
 dept_s,Support, dept_s,Engineering)));
 assertU(add(doc(id, 5,name, tina, title, VP,
 dept_s,Engineering)));

 assertU(add(doc(id,10, dept_id_s, Engineering, text,These
 guys develop stuff)));
 assertU(add(doc(id,11, dept_id_s, Marketing, text,These
 guys make you look good)));
 assertU(add(doc(id,12, dept_id_s, Sales, text,These guys
 sell stuff)));
 assertU(add(doc(id,13, dept_id_s, Support, text,These guys
 help customers)));

 assertU(commit());

 //***
 //This works as expected - the correct number of results are found
 //***
 // find people that develop stuff
 assertJQ(req(q,{!join from=dept_id_s to=dept_s}text:develop,
 fl,id)

 ,/response=={'numFound':3,'start':0,'docs':[{'id':'1'},{'id':'4'},{'id':'5'}]}
 );

 *//
 *// this fails - the response returned finds all three people - it
 should only find John*
 *//expected
 =/response=={numFound:1,start:0,docs:[{id:1}]}*
 *//response = {*
 *//responseHeader:{*
 *//  status:0,*
 *//  QTime:4},*
 *//response:{numFound:3,start:0,docs:[*
 *//  {*
 *//id:1},*
 *//  {*
 *//id:4},*
 *//  {*
 *//id:5}]*
 *//}}*
 *//
 *// find people that develop stuff - but limit via filter query to a
 name of john*
 *assertJQ(req(q,{!join from=dept_id_s to=dept_s}text:develop,
 fl,id, fq, name:john)*
 *,/response=={'numFound':1,'start':0,'docs':[{'id':'1'}]}*
 *);*

   }


 Interestingly, I know this worked at some point.  I had a snapshot build
 in my ivy cache from 10/2/2011 and it was working with that
 build maven_artifacts/org/apache/solr/
 solr/4.0-SNAPSHOT/solr-4.0-20111002.161157-1.pom


 Mike





Solr Join query with fq not correctly filtering results?

2012-01-26 Thread Mike Hugo
Hello,

I'm trying out the Solr JOIN query functionality on trunk.  I have the
latest checkout, revision #1236272 - I did the following steps to get the
example up and running:

cd solr
ant example
java -jar start.jar
cd exampledocs
java -jar post.jar *.xml

Then I tried a few of the sample queries on the wiki page
http://wiki.apache.org/solr/Join.  In particular, this is one that I'm
interested in:

Find all manufacturer docs named belkin, then join them against (product)
 docs and filter that list to only products with a price less than 12 dollars

 http://localhost:8983/solr/select?q={!join+from=id+to=manu_id_s}compName_s:Belkin&fq=price:%5B%2A+TO+12%5D


However, when I run that query, I get two results, one with a price of
19.95 and another with a price of 11.5  Because of the filter query, I'm
only expecting to see one result - the one with a price of 11.99.

I was also able to replicate this in a unit test added to
org.apache.solr.TestJoin:

  @Test
  public void testJoin_withFilterQuery() throws Exception {
    assertU(add(doc("id", "1", "name", "john", "title", "Director", "dept_s", "Engineering")));
    assertU(add(doc("id", "2", "name", "mark", "title", "VP", "dept_s", "Marketing")));
    assertU(add(doc("id", "3", "name", "nancy", "title", "MTS", "dept_s", "Sales")));
    assertU(add(doc("id", "4", "name", "dave", "title", "MTS", "dept_s", "Support", "dept_s", "Engineering")));
    assertU(add(doc("id", "5", "name", "tina", "title", "VP", "dept_s", "Engineering")));

    assertU(add(doc("id", "10", "dept_id_s", "Engineering", "text", "These guys develop stuff")));
    assertU(add(doc("id", "11", "dept_id_s", "Marketing", "text", "These guys make you look good")));
    assertU(add(doc("id", "12", "dept_id_s", "Sales", "text", "These guys sell stuff")));
    assertU(add(doc("id", "13", "dept_id_s", "Support", "text", "These guys help customers")));

    assertU(commit());

    //***
    // This works as expected - the correct number of results are found
    //***
    // find people that develop stuff
    assertJQ(req("q", "{!join from=dept_id_s to=dept_s}text:develop", "fl", "id")
        , "/response=={'numFound':3,'start':0,'docs':[{'id':'1'},{'id':'4'},{'id':'5'}]}"
    );

    //***
    // this fails - the response returned finds all three people - it should only find John
    // expected = /response=={"numFound":1,"start":0,"docs":[{"id":"1"}]}
    // response = {
    //   "responseHeader":{
    //     "status":0,
    //     "QTime":4},
    //   "response":{"numFound":3,"start":0,"docs":[
    //     {"id":"1"},
    //     {"id":"4"},
    //     {"id":"5"}]
    //   }}
    //***
    // find people that develop stuff - but limit via filter query to a name of john
    assertJQ(req("q", "{!join from=dept_id_s to=dept_s}text:develop", "fl", "id", "fq", "name:john")
        , "/response=={'numFound':1,'start':0,'docs':[{'id':'1'}]}"
    );
  }


Interestingly, I know this worked at some point.  I had a snapshot build in
my ivy cache from 10/2/2011 and it was working with that
build maven_artifacts/org/apache/solr/
solr/4.0-SNAPSHOT/solr-4.0-20111002.161157-1.pom


Mike


Re: Solr Join query with fq not correctly filtering results?

2012-01-26 Thread Mike Hugo
I created issue https://issues.apache.org/jira/browse/SOLR-3062 for this
problem.  I was able to track it down to something in this commit -
http://svn.apache.org/viewvc?view=revision&revision=1188624 (LUCENE-1536:
Filters can now be applied down-low, if their DocIdSet implements a new
bits() method, returning all documents in a random access way
) - before that commit the join / fq functionality works as expected /
documented on the wiki page.  After that commit it's broken.

Any assistance is greatly appreciated!

Thanks,

Mike

On Thu, Jan 26, 2012 at 11:04 AM, Mike Hugo m...@piragua.com wrote:

 Hello,

 I'm trying out the Solr JOIN query functionality on trunk.  I have the
 latest checkout, revision #1236272 - I did the following steps to get the
 example up and running:

 cd solr
 ant example
 java -jar start.jar
 cd exampledocs
 java -jar post.jar *.xml

 Then I tried a few of the sample queries on the wiki page
 http://wiki.apache.org/solr/Join.  In particular, this is one that I'm
 interest in

 Find all manufacturer docs named belkin, then join them against
 (product) docs and filter that list to only products with a price less than
 12 dollars

  http://localhost:8983/solr/select?q={!join+from=id+to=manu_id_s}compName_s:Belkin&fq=price:%5B%2A+TO+12%5D


 However, when I run that query, I get two results, one with a price of
 19.95 and another with a price of 11.5  Because of the filter query, I'm
 only expecting to see one result - the one with a price of 11.99.

 I was also able to replicate this in a unit test added to
 org.apache.solr.TestJoin:

   @Test
   public void testJoin_withFilterQuery() throws Exception {
 assertU(add(doc(id, 1,name, john, title, Director,
 dept_s,Engineering)));
 assertU(add(doc(id, 2,name, mark, title, VP,
 dept_s,Marketing)));
 assertU(add(doc(id, 3,name, nancy, title, MTS,
 dept_s,Sales)));
 assertU(add(doc(id, 4,name, dave, title, MTS,
 dept_s,Support, dept_s,Engineering)));
 assertU(add(doc(id, 5,name, tina, title, VP,
 dept_s,Engineering)));

 assertU(add(doc(id,10, dept_id_s, Engineering, text,These
 guys develop stuff)));
 assertU(add(doc(id,11, dept_id_s, Marketing, text,These
 guys make you look good)));
 assertU(add(doc(id,12, dept_id_s, Sales, text,These guys
 sell stuff)));
 assertU(add(doc(id,13, dept_id_s, Support, text,These guys
 help customers)));

 assertU(commit());

 //***
 //This works as expected - the correct number of results are found
 //***
 // find people that develop stuff
 assertJQ(req(q,{!join from=dept_id_s to=dept_s}text:develop,
 fl,id)

 ,/response=={'numFound':3,'start':0,'docs':[{'id':'1'},{'id':'4'},{'id':'5'}]}
 );

 *//
 *// this fails - the response returned finds all three people - it
 should only find John*
 *//expected
 =/response=={numFound:1,start:0,docs:[{id:1}]}*
 *//response = {*
 *//responseHeader:{*
 *//  status:0,*
 *//  QTime:4},*
 *//response:{numFound:3,start:0,docs:[*
 *//  {*
 *//id:1},*
 *//  {*
 *//id:4},*
 *//  {*
 *//id:5}]*
 *//}}*
 *//
 *// find people that develop stuff - but limit via filter query to a
 name of john*
 *assertJQ(req(q,{!join from=dept_id_s to=dept_s}text:develop,
 fl,id, fq, name:john)*
 *,/response=={'numFound':1,'start':0,'docs':[{'id':'1'}]}*
 *);*

   }


 Interestingly, I know this worked at some point.  I had a snapshot build
 in my ivy cache from 10/2/2011 and it was working with that
 build maven_artifacts/org/apache/solr/
 solr/4.0-SNAPSHOT/solr-4.0-20111002.161157-1.pom


 Mike



Re: HTMLStripCharFilterFactory not working in Solr4?

2012-01-25 Thread Mike Hugo
Thanks guys!  I'll grab the latest build from the solr4 jenkins server when
those commits get picked up and try it out.  Thanks for the quick
turnaround!

Mike

On Wed, Jan 25, 2012 at 11:01 AM, Steven A Rowe sar...@syr.edu wrote:

 Hi Mike,

 Yonik committed a fix to Solr trunk - your test on LUCENE-3721 succeeds
 for me now.  (On Solr trunk, *all* CharFilters have been non-functional
 since LUCENE-3396 was committed in r1175297 on 25 Sept 2011, until Yonik's
 fix today in r1235810; Solr 3.x was not affected - CharFilters have been
 working there all along.)

 Steve

  -Original Message-
  From: Mike Hugo [mailto:m...@piragua.com]
  Sent: Tuesday, January 24, 2012 3:56 PM
  To: solr-user@lucene.apache.org
  Subject: Re: HTMLStripCharFilterFactory not working in Solr4?
 
  Thanks for the responses everyone.
 
  Steve, the test method you provided also works for me.  However, when I
  try
  a more end to end test with the HTMLStripCharFilterFactory configured for
  a
  field I am still having the same problem.  I attached a failing unit test
  and configuration to the following issue in JIRA:
 
  https://issues.apache.org/jira/browse/LUCENE-3721
 
  I appreciate all the prompt responses!  Looking forward to finding the
  root
  cause of this guy :)  If there's something I'm doing incorrectly in the
  configuration, please let me know!
 
  Mike
 
  On Tue, Jan 24, 2012 at 1:57 PM, Steven A Rowe sar...@syr.edu wrote:
 
   Hi Mike,
  
   When I add the following test to TestHTMLStripCharFilterFactory.java on
   Solr trunk, it passes:
  
    public void testNumericCharacterEntities() throws Exception {
      final String text = "Bose&#174; &#8482;";  // |Bose® ™|
      HTMLStripCharFilterFactory htmlStripFactory = new HTMLStripCharFilterFactory();
      htmlStripFactory.init(Collections.<String,String>emptyMap());
      CharStream charStream = htmlStripFactory.create(CharReader.get(new StringReader(text)));
      StandardTokenizerFactory stdTokFactory = new StandardTokenizerFactory();
      stdTokFactory.init(DEFAULT_VERSION_PARAM);
      Tokenizer stream = stdTokFactory.create(charStream);
      assertTokenStreamContents(stream, new String[] { "Bose" });
    }
  
   What's happening:
  
    First, htmlStripFactory converts &#174; to ® and &#8482; to ™.
    Then stdTokFactory declines to tokenize ® and ™, because they belong
    to the Unicode general category Symbol, Other, and so are not
    included in any of the output tokens.
  
    StandardTokenizer uses the Word Break rules found in UAX #29
    http://unicode.org/reports/tr29/ to find token boundaries, and then
    outputs only alphanumeric tokens.  See the JFlex grammar for details:
    http://svn.apache.org/viewvc/lucene/dev/trunk/modules/analysis/common/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex?view=markup
  
   The behavior you're seeing is not consistent with the above test.
  
   Steve
  
-Original Message-
From: Mike Hugo [mailto:m...@piragua.com]
Sent: Tuesday, January 24, 2012 1:34 PM
To: solr-user@lucene.apache.org
Subject: HTMLStripCharFilterFactory not working in Solr4?
   
We recently updated to the latest build of Solr4 and everything is
   working
really well so far!  There is one case that is not working the same
  way
   it
was in Solr 3.4 - we strip out certain HTML constructs (like
 trademark
   and
registered, for example) in a field as defined below - it was working
  in
Solr3.4 with the configuration shown here, but is not working the
 same
   way
in Solr4.
   
The label field is defined as type=text_general
field name=label type=text_general indexed=true stored=false
required=false multiValued=true/
   
Here's the type definition for text_general field:
fieldType name=text_general class=solr.TextField
positionIncrementGap=100
analyzer type=index
tokenizer class=solr.StandardTokenizerFactory/
charFilter class=solr.HTMLStripCharFilterFactory/
filter class=solr.StopFilterFactory
  ignoreCase=true
words=stopwords.txt
enablePositionIncrements=true/
filter class=solr.LowerCaseFilterFactory/
/analyzer
analyzer type=query
tokenizer class=solr.StandardTokenizerFactory/
charFilter class=solr.HTMLStripCharFilterFactory/
filter class=solr.StopFilterFactory
  ignoreCase=true
words=stopwords.txt
enablePositionIncrements=true/
filter class=solr.LowerCaseFilterFactory/
/analyzer
/fieldType
   
   
In Solr 3.4, that configuration was completely stripping html
  constructs
out of the indexed field which is exactly what we wanted.  If for
   example,
we then do a facet on the label field, like in the test below, we're
getting some terms

HTMLStripCharFilterFactory not working in Solr4?

2012-01-24 Thread Mike Hugo
We recently updated to the latest build of Solr4 and everything is working
really well so far!  There is one case that is not working the same way it
was in Solr 3.4 - we strip out certain HTML constructs (like trademark and
registered, for example) in a field as defined below - it was working in
Solr3.4 with the configuration shown here, but is not working the same way
in Solr4.

The label field is defined as type="text_general":
<field name="label" type="text_general" indexed="true" stored="false"
required="false" multiValued="true"/>

Here's the type definition for the text_general field:
<fieldType name="text_general" class="solr.TextField"
positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>


In Solr 3.4, that configuration was completely stripping html constructs
out of the indexed field which is exactly what we wanted.  If for example,
we then do a facet on the label field, like in the test below, we're
getting some terms in the response that we would not like to be there.


// test case (groovy)
void specialHtmlConstructsGetStripped() {
SolrInputDocument inputDocument = new SolrInputDocument()
inputDocument.addField('label', 'Bose&#174; &#8482;')

solrServer.add(inputDocument)
solrServer.commit()

QueryResponse response = solrServer.query(new SolrQuery('bose'))
assert 1 == response.results.numFound

SolrQuery facetQuery = new SolrQuery('bose')
facetQuery.facet = true
facetQuery.set(FacetParams.FACET_FIELD, 'label')
facetQuery.set(FacetParams.FACET_MINCOUNT, '1')

response = solrServer.query(facetQuery)
FacetField ff = response.facetFields.find {it.name == 'label'}

List suggestResponse = []

for (FacetField.Count facetField in ff?.values) {
suggestResponse << facetField.name
}

assert suggestResponse == ['bose']
}

With the upgrade to Solr4, the assertion fails, the suggested response
contains 174 and 8482 as terms.  Test output is:

Assertion failed:

assert suggestResponse == ['bose']
   |   |
   |   false
   [174, 8482, bose]
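
For what it's worth, a quick way to double check that those stray tokens
really made it into the index is to query for them directly (just a sketch
reusing the solrServer from the test case above); both counts come back 1 on
this build, and should be 0 once the char filter is actually applied:

println solrServer.query(new SolrQuery('label:174')).results.numFound   // 1 here, expected 0
println solrServer.query(new SolrQuery('label:8482')).results.numFound  // 1 here, expected 0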


I just tried again using the latest build from today, namely:
https://builds.apache.org/job/Lucene-Solr-Maven-trunk/369/ and we're still
getting the failing assertion. Is there a different way to configure the
HTMLStripCharFilterFactory in Solr4?

Thanks in advance for any tips!

Mike


Re: HTMLStripCharFilterFactory not working in Solr4?

2012-01-24 Thread Mike Hugo
Thanks for the response Yonik,
Interestingly enough, changing to the LegacyHTMLStripCharFilterFactory
does NOT solve the problem - in fact, I get the same result.

I can see that the LegacyHTMLStripCharFilterFactory is being applied at
startup:

Jan 24, 2012 1:25:29 PM org.apache.solr.util.plugin.AbstractPluginLoader
load
INFO: created : org.apache.solr.analysis.LegacyHTMLStripCharFilterFactory

however, I'm still getting the same assertion error.  Any thoughts?
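
In case it helps narrow things down, another way to poke at this would be the
field analysis handler from the stock example solrconfig (assuming
/analysis/field is enabled; the parameters below are the handler's standard
analysis.fieldname and analysis.fieldvalue), e.g. in a browser:

http://localhost:8983/solr/analysis/field?analysis.fieldname=label&analysis.fieldvalue=Bose%26%23174%3B%20%26%238482%3B&wt=json&indent=true

That shows the output of each stage, so it should make it obvious whether the
HTMLStripCharFilterFactory output ever reaches the StandardTokenizer.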

Mike


On Tue, Jan 24, 2012 at 12:40 PM, Yonik Seeley
yo...@lucidimagination.comwrote:

 You can use LegacyHTMLStripCharFilterFactory to get the previous behavior.
 See https://issues.apache.org/jira/browse/LUCENE-3690 for more details.

 -Yonik
 http://www.lucidimagination.com



 On Tue, Jan 24, 2012 at 1:34 PM, Mike Hugo m...@piragua.com wrote:
  We recently updated to the latest build of Solr4 and everything is
 working
  really well so far!  There is one case that is not working the same way
 it
  was in Solr 3.4 - we strip out certain HTML constructs (like trademark
 and
  registered, for example) in a field as defined below - it was working in
  Solr3.4 with the configuration shown here, but is not working the same
 way
  in Solr4.
 
  The label field is defined as type=text_general
  field name=label type=text_general indexed=true stored=false
  required=false multiValued=true/
 
  Here's the type definition for text_general field:
  fieldType name=text_general class=solr.TextField
  positionIncrementGap=100
 analyzer type=index
 tokenizer class=solr.StandardTokenizerFactory/
 charFilter class=solr.HTMLStripCharFilterFactory/
 filter class=solr.StopFilterFactory ignoreCase=true
  words=stopwords.txt
 enablePositionIncrements=true/
 filter class=solr.LowerCaseFilterFactory/
 /analyzer
 analyzer type=query
 tokenizer class=solr.StandardTokenizerFactory/
 charFilter class=solr.HTMLStripCharFilterFactory/
 filter class=solr.StopFilterFactory ignoreCase=true
  words=stopwords.txt
 enablePositionIncrements=true/
 filter class=solr.LowerCaseFilterFactory/
 /analyzer
 /fieldType
 
 
  In Solr 3.4, that configuration was completely stripping html constructs
  out of the indexed field which is exactly what we wanted.  If for
 example,
  we then do a facet on the label field, like in the test below, we're
  getting some terms in the response that we would not like to be there.
 
 
  // test case (groovy)
  void specialHtmlConstructsGetStripped() {
 SolrInputDocument inputDocument = new SolrInputDocument()
  inputDocument.addField('label', 'Bose&#174; &#8482;')
 
 solrServer.add(inputDocument)
 solrServer.commit()
 
 QueryResponse response = solrServer.query(new SolrQuery('bose'))
 assert 1 == response.results.numFound
 
 SolrQuery facetQuery = new SolrQuery('bose')
 facetQuery.facet = true
 facetQuery.set(FacetParams.FACET_FIELD, 'label')
 facetQuery.set(FacetParams.FACET_MINCOUNT, '1')
 
 response = solrServer.query(facetQuery)
 FacetField ff = response.facetFields.find {it.name == 'label'}
 
 List suggestResponse = []
 
 for (FacetField.Count facetField in ff?.values) {
  suggestResponse << facetField.name
 }
 
 assert suggestResponse == ['bose']
  }
 
  With the upgrade to Solr4, the assertion fails, the suggested response
  contains 174 and 8482 as terms.  Test output is:
 
  Assertion failed:
 
  assert suggestResponse == ['bose']
|   |
|   false
[174, 8482, bose]
 
 
  I just tried again using the latest build from today, namely:
  https://builds.apache.org/job/Lucene-Solr-Maven-trunk/369/ and we're
 still
  getting the failing assertion. Is there a different way to configure the
  HTMLStripCharFilterFactory in Solr4?
 
  Thanks in advance for any tips!
 
  Mike



Re: HTMLStripCharFilterFactory not working in Solr4?

2012-01-24 Thread Mike Hugo
Thanks for the responses everyone.

Steve, the test method you provided also works for me.  However, when I try
a more end to end test with the HTMLStripCharFilterFactory configured for a
field I am still having the same problem.  I attached a failing unit test
and configuration to the following issue in JIRA:

https://issues.apache.org/jira/browse/LUCENE-3721

I appreciate all the prompt responses!  Looking forward to finding the root
cause of this guy :)  If there's something I'm doing incorrectly in the
configuration, please let me know!

Mike

On Tue, Jan 24, 2012 at 1:57 PM, Steven A Rowe sar...@syr.edu wrote:

 Hi Mike,

 When I add the following test to TestHTMLStripCharFilterFactory.java on
 Solr trunk, it passes:

 public void testNumericCharacterEntities() throws Exception {
   final String text = "Bose&#174; &#8482;";  // |Bose® ™|
   HTMLStripCharFilterFactory htmlStripFactory = new HTMLStripCharFilterFactory();
   htmlStripFactory.init(Collections.<String,String>emptyMap());
   CharStream charStream = htmlStripFactory.create(CharReader.get(new StringReader(text)));
   StandardTokenizerFactory stdTokFactory = new StandardTokenizerFactory();
   stdTokFactory.init(DEFAULT_VERSION_PARAM);
   Tokenizer stream = stdTokFactory.create(charStream);
   assertTokenStreamContents(stream, new String[] { "Bose" });
 }

 What's happening:

  First, htmlStripFactory converts &#174; to ® and &#8482; to ™.
  Then stdTokFactory declines to tokenize ® and ™, because they belong
  to the Unicode general category Symbol, Other, and so are not
  included in any of the output tokens.

  StandardTokenizer uses the Word Break rules found in UAX #29
 http://unicode.org/reports/tr29/ to find token boundaries, and then
 outputs only alphanumeric tokens.  See the JFlex grammar for details: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/modules/analysis/common/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex?view=markup
 .

 The behavior you're seeing is not consistent with the above test.

 Steve

  -Original Message-
  From: Mike Hugo [mailto:m...@piragua.com]
  Sent: Tuesday, January 24, 2012 1:34 PM
  To: solr-user@lucene.apache.org
  Subject: HTMLStripCharFilterFactory not working in Solr4?
 
  We recently updated to the latest build of Solr4 and everything is
 working
  really well so far!  There is one case that is not working the same way
 it
  was in Solr 3.4 - we strip out certain HTML constructs (like trademark
 and
  registered, for example) in a field as defined below - it was working in
  Solr3.4 with the configuration shown here, but is not working the same
 way
  in Solr4.
 
  The label field is defined as type=text_general
  field name=label type=text_general indexed=true stored=false
  required=false multiValued=true/
 
  Here's the type definition for text_general field:
  fieldType name=text_general class=solr.TextField
  positionIncrementGap=100
  analyzer type=index
  tokenizer class=solr.StandardTokenizerFactory/
  charFilter class=solr.HTMLStripCharFilterFactory/
  filter class=solr.StopFilterFactory ignoreCase=true
  words=stopwords.txt
  enablePositionIncrements=true/
  filter class=solr.LowerCaseFilterFactory/
  /analyzer
  analyzer type=query
  tokenizer class=solr.StandardTokenizerFactory/
  charFilter class=solr.HTMLStripCharFilterFactory/
  filter class=solr.StopFilterFactory ignoreCase=true
  words=stopwords.txt
  enablePositionIncrements=true/
  filter class=solr.LowerCaseFilterFactory/
  /analyzer
  /fieldType
 
 
  In Solr 3.4, that configuration was completely stripping html constructs
  out of the indexed field which is exactly what we wanted.  If for
 example,
  we then do a facet on the label field, like in the test below, we're
  getting some terms in the response that we would not like to be there.
 
 
  // test case (groovy)
  void specialHtmlConstructsGetStripped() {
  SolrInputDocument inputDocument = new SolrInputDocument()
   inputDocument.addField('label', 'Bose&#174; &#8482;')
 
  solrServer.add(inputDocument)
  solrServer.commit()
 
  QueryResponse response = solrServer.query(new SolrQuery('bose'))
  assert 1 == response.results.numFound
 
  SolrQuery facetQuery = new SolrQuery('bose')
  facetQuery.facet = true
  facetQuery.set(FacetParams.FACET_FIELD, 'label')
  facetQuery.set(FacetParams.FACET_MINCOUNT, '1')
 
  response = solrServer.query(facetQuery)
  FacetField ff = response.facetFields.find {it.name == 'label'}
 
  List suggestResponse = []
 
  for (FacetField.Count facetField in ff?.values) {
   suggestResponse << facetField.name
  }
 
  assert suggestResponse == ['bose']
  }
 
  With the upgrade to Solr4, the assertion fails, the suggested response
  contains