Re: Solr admin interface freezes on Chrome

2019-10-02 Thread Solr User
> Works fine on Firefox, and I
> haven't made any changes to our Solr instance (v8.1.1) in a while.

Had a co-worker with a similar issue. He had a pop-up blocker enabled in
Chrome that was preventing some resource call (or something similar).
When he switched to Firefox, everything worked without issue.  Any chance
something is showing in the developer tools console?


Solr standalone timeouts after upgrading to SOLR 7

2019-10-02 Thread Solr User
Hello all,

We moved from Solr 6 to Solr 7 about 2 weeks ago. Once each week since
(including today) we have experienced query timeout issues with
corresponding GC events. There was a spike in CPU up to 66%, which is not
something we previously saw with Solr 6. From the Solr logs it looks like
something inside the JVM has happened; Solr is reporting closed
connections from Jetty. Our data size is relatively small, but we do run
5 cores within the one Jetty instance. Their index sizes are anywhere
between 200 MB and 2 GB.

Our memory consumption is relatively low:

"free":"296.1 MB",
  "total":"569.6 MB",
  "max":"9.6 GB",
  "used":"273.5 MB (%2.8)",



We had a spike in traffic about 5 minutes prior to some longer GC events
(similar situation last week).

Any help would be appreciated. Below is my current system info along with a
GC log snippet and the corresponding SOLR log error.

*System info:*
Amazon Linux 2
8 cores, 32 GB RAM
*Java:* 1.8.0_222-ea 25.222-b03
*Solr:* "solr-spec-version":"7.7.2"
*Start options:*
"-Xms512m",
"-Xmx10g",
"-XX:NewRatio=3",
"-XX:SurvivorRatio=4",
"-XX:TargetSurvivorRatio=90",
"-XX:MaxTenuringThreshold=8",
"-XX:+UseConcMarkSweepGC",
"-XX:ConcGCThreads=4",
"-XX:ParallelGCThreads=4",
"-XX:+CMSScavengeBeforeRemark",
"-XX:PretenureSizeThreshold=64m",
"-XX:+UseCMSInitiatingOccupancyOnly",
"-XX:CMSInitiatingOccupancyFraction=50",
"-XX:CMSMaxAbortablePrecleanTime=6000",
"-XX:+CMSParallelRemarkEnabled",
"-XX:+ParallelRefProcEnabled",
"-XX:-OmitStackTraceInFastThrow",
"-verbose:gc",
"-XX:+PrintHeapAtGC",
"-XX:+PrintGCDetails",
"-XX:+PrintGCDateStamps",
"-XX:+PrintGCTimeStamps",
"-XX:+PrintTenuringDistribution",
"-XX:+PrintGCApplicationStoppedTime",
"-XX:+UseGCLogFileRotation",
"-XX:NumberOfGCLogFiles=9",
"-XX:GCLogFileSize=20M",
"-Xss256k",
"-Dsolr.log.muteconsole"

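For reference, a quick sanity check of what these flags imply for the
generation sizes (a sketch in shell arithmetic; it assumes -Xmx10g is the
fully grown heap, which matches the GC log below):

# Old generation = Xmx * NewRatio/(NewRatio+1) = 3/4 of the heap:
echo "old gen:   $(( 10 * 1024 * 1024 * 3 / 4 ))K"  # 7864320K, as in the log
# Young generation = the remaining 1/4:
echo "young gen: $(( 10 * 1024 * 1024 / 4 ))K"      # 2621440K
# SurvivorRatio=4 splits young into 4 parts eden + 2 parts survivor,
# so each survivor space is young/6:
echo "survivor:  $(( 10 * 1024 * 1024 / 4 / 6 ))K"  # ~436906K; log shows 436864K

Note that in the log the old generation is effectively full (7864319K of
7864320K) when the concurrent mode failure hits, and -Xms512m means the
heap has to grow all the way from 512 MB under load.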
Here is an example from the GC log:

2019-10-02T16:03:15.888+: 265318.624: [Full GC (Allocation Failure) 2019-10-02T16:03:15.888+: 265318.624: [CMS2019-10-02T16:03:16.134+: 265318.870: [CMS-concurrent-mark: 1.773/1.783 secs] [Times: user=13.14 sys=0.00, real=1.78 secs]
 (concurrent mode failure): 7864319K->7864319K(7864320K), 9.5890129 secs] 10048895K->8863021K(10048896K), [Metaspace: 53159K->53159K(1097728K)], 9.5892061 secs] [Times: user=10.31 sys=0.00, real=9.59 secs]
Heap after GC invocations=296656 (full 546):
 par new generation   total 2184576K, used 998701K [0x00054000, 0x0005e000, 0x0005e000)
  eden space 1747712K,  57% used [0x00054000, 0x00057cf4b4f0, 0x0005aaac)
  from space 436864K,   0% used [0x0005aaac, 0x0005aaac, 0x0005c556)
  to   space 436864K,   0% used [0x0005c556, 0x0005c556, 0x0005e000)
 concurrent mark-sweep generation total 7864320K, used 7864319K [0x0005e000, 0x0007c000, 0x0007c000)
 Metaspace       used 53159K, capacity 54766K, committed 55148K, reserved 1097728K
  class space    used 5589K, capacity 5950K, committed 6000K, reserved 1048576K
}
2019-10-02T16:03:25.477+: 265328.214: Total time for which application threads were stopped: 9.5906157 seconds, Stopping threads took: 0.0001274 seconds
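
A quick way to quantify pauses like this across the rotated GC logs (a
sketch; the awk field position assumes the -XX:+PrintGCApplicationStoppedTime
line format shown above, and the log file names are an assumption):

grep -h "Total time for which application threads were stopped" solr_gc.log* \
  | awk '{ sum += $11; if ($11 > 1) long++ } END { printf "%d pauses over 1s, %.1fs stopped in total\n", long, sum }'
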
*With the following from the Solr log:*

[   x:core] o.a.s.s.HttpSolrCall Unable to write response, client closed connection or we are shutting down
org.eclipse.jetty.io.EofException: Closed
    at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:665) ~[jetty-server-9.4.14.v20181114.jar:9.4.14.v20181114]
    at org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:126) ~[solr-core-7.7.2.jar:7.7.2 d4c30fc2856154f2c1fefc589eb7cd070a415b94 - janhoy - 2019-05-28 23:37:48]
    at org.apache.solr.response.QueryResponseWriterUtil$1.write(QueryResponseWriterUtil.java:54) ~[solr-core-7.7.2.jar:7.7.2 d4c30fc2856154f2c1fefc589eb7cd070a415b94 - janhoy - 2019-05-28 23:37:48]
    at java.io.OutputStream.write(OutputStream.java:116) ~[?:1.8.0_222-ea]
    at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) ~[?:1.8.0_222-ea]
    at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282) ~[?:1.8.0_222-ea]
    at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125) ~[?:1.8.0_222-ea]
    at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207) ~[?:1.8.0_222-ea]
    at org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140) ~[solr-solrj-7.7.2.jar:7.7.2 d4c30fc2856154f2c1fefc589eb7cd070a415b94 - janhoy - 2019-05-28 23:37:52]
    at org.apache.solr.common.util.FastWriter.flushBuffer(FastWriter.java:154) ~[solr-solrj-7.7.2.jar:7.7.2 d4c30fc2856154f2c1fefc589eb7cd070a415b94 - janhoy - 2019-05-28 23:37:52]
    at 


Re: Work-around for "indexed without position data"

2017-07-03 Thread Solr User
Not sure if it helps beyond the steps to reproduce that I supplied above,
but I also see that "Omit Term Frequencies & Positions" is still set on the
field according to the LukeRequestHandler:

ITS--OF--
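
(For anyone following along, a sketch of a Luke request that shows those
field flags, assuming the techproducts core from the reproduction steps
below:)

curl "http://localhost:8983/solr/techproducts/admin/luke?fl=cat&numTerms=0&wt=json"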



On Mon, Jun 5, 2017 at 1:18 PM, Solr User <solr...@gmail.com> wrote:

> Sorry for the delay.  I was able to reproduce this easily with my setup,
> but reproducing this on a Solr example proved challenging.  Hopefully the
> work that I did to find the situation in which this is produced will help
> in resolving the problem.  The driving factor for this appears to be how
> updates are sent to Solr.  When sending batches of updates with commits,
> the problem is reproduced.  If the commit is held until after all updates
> are sent, then no problem is produced.  This leads me to believe that this
> issue has something to do with overlapping commits or index merges.  This
> was reproducible regardless of running classic or managed schema and
> regardless of running Solr core or SolrCloud.
>
> There are not many steps to reproduce this, but you will need a way to
> send these updates.  I have included inline create.sh and create.pl
> scripts to generate the data and send the updates.  You can index a
> lastModified field or something to convince yourself that everything has
> been re-indexed.  I left that out to keep the steps lean.  Also, this test
> is using commit statements from the client sending the updates for
> simplicity even though it is not a good practice.  My normal setup is using
> Solrj with commitWithin to allow Solr to manage when the commits take
> place, but the same error is produced either way.
>
>
> *STEPS TO REPRODUCE*
>
>1. Install Solr 5.5.3 and change to that working directory
>2. bin/solr -e techproducts
>3. bin/solr stop [Why these next 3 steps?  These are to start the
>index completely new without the 32 example documents as opposed to a
>delete query.  The documents are not posted after the core is detected the
>second time.]
>4. rm -rf ./example/techproducts/solr/techproducts/data/
>5. bin/solr -e techproducts
>6. ./create.sh
>    7. curl -X POST -H 'Content-type:application/json' --data-binary '{ "replace-field":{ "name":"cat", "type":"text_en_splitting", "indexed":true, "multiValued":true, "stored":true } }' http://localhost:8983/solr/techproducts/schema
>    8. http://localhost:8983/solr/techproducts/select?q=cat:%22hard%20drive%22  [error]
>    9. ./create.sh
>    10. http://localhost:8983/solr/techproducts/select?q=cat:%22hard%20drive%22  [error even though all documents have been re-indexed]
>
> *create.sh*
> #!/bin/bash
> for i in {1..100}; do
> echo "$i"
> ./create.pl $i > ./create.xml$i
> curl http://localhost:8983/solr/techproducts/update?commit=true -H "Content-Type: text/xml" --data-binary @./create.xml$i
> done
>
> *create.pl <http://create.pl>*
> #!/usr/bin/perl
> my $S = $ARGV[0];
> my $I = 100;
> my $N = $S*$I + $I;
> my $i;
> print "<add>\n";
> for($i=$S*$I; $i<$N; $i++) {
>    print "<doc><field name=\"id\">SP${i}</field><field name=\"cat\">hard drive ${i}</field></doc>\n";
> }
> print "</add>\n";
>
>
> On Fri, May 26, 2017 at 2:14 AM, Rick Leir <rl...@leirtech.com> wrote:
>
>> Can you reproduce this error? What are the steps you take to reproduce
>> it? ( simple is better).
>>
>> cheers -- Rick
>>
>>
>>
>> On 2017-05-25 05:46 PM, Solr User wrote:
>>
>>> This is in regards to changing a field type from string to
>>> text_en_splitting, re-indexing all documents, even optimizing to give the
>>> index a chance to merge segments and rewrite itself entirely, and then
>>> getting this error when running a phrase query:
>>> java.lang.IllegalStateException: field "blah" was indexed without
>>> position
>>> data; cannot run PhraseQuery
>>>
>>> I have encountered this issue before and have always done one of the
>>> following as a work-around:
>>> 1.  Instead of changing the field type on an existing field just create a
>>> new field and retire the old one.
>>> 2.  Delete the index directory and start from scratch.
>>>
>>> These work-arounds are not always ideal.  Does anyone know what is
>>> holding
>>> onto that old field type definition?  What thinks it is still a string?
>>> Every document has been re-indexed and I am sure of this because I have a
>>> time stamp indexed.  Is there any other way to get this to work?
>>>
>>> For what it is worth, I am running this in SolrCloud mode but I remember
>>> seeing this issue before SolrCloud was released as well.
>>>
>>>
>>
>


Re: Anonymous Read?

2017-06-06 Thread Solr User
Thanks!  The null role value did the trick.  I tried this with the
predefined permissions and it worked as well.  Thanks again!
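
For context, a minimal sketch of a complete security.json built around the
permissions quoted below (user name, password hash, and role are
placeholders; the hash/salt pair must be generated for a real password):

cat > security.json <<'EOF'
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "credentials": { "admin": "<password-hash> <salt>" }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      { "name": "open_select", "path": "/select/*", "role": null },
      { "name": "all-admin", "collection": null, "path": "/*", "role": "allgen" },
      { "name": "all-core-handlers", "path": "/*", "role": "allgen" }
    ],
    "user-role": { "admin": "allgen" }
  }
}
EOF
# Upload to ZooKeeper for SolrCloud (the zkcli.sh path varies by install):
server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd putfile /security.json security.json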

On Tue, Jun 6, 2017 at 2:08 PM, Oakley, Craig (NIH/NLM/NCBI) [C] <
craig.oak...@nih.gov> wrote:

> We usually end security.json with the permissions
>
>{
> "name":"open_select",
>  "path":"/select/*",
>  "role":null},
>  {
> "name":"all-admin",
> "collection":null,
> "path":"/*",
> "role":"allgen"},
>  {
> "name":"all-core-handlers",
> "path":"/*",
>  "role":"allgen"}]
>  } }
>
>
> ...and then assign the "allgen" role to all users
>
> This allows a select without a login & password, but requires a login &
> password for anything else (including the front page of the GUI)
>
> -Original Message-
> From: Solr User [mailto:solr...@gmail.com]
> Sent: Tuesday, June 06, 2017 2:27 PM
> To: solr-user@lucene.apache.org
> Subject: Anonymous Read?
>
> Is it possible to setup Solr security to allow anonymous query (/select
> etc.) but restricted access to other permissions as described in
> https://lucidworks.com/2015/08/17/securing-solr-basic-
> auth-permission-rules/
> ?
>


Anonymous Read?

2017-06-06 Thread Solr User
Is it possible to setup Solr security to allow anonymous query (/select
etc.) but restricted access to other permissions as described in
https://lucidworks.com/2015/08/17/securing-solr-basic-auth-permission-rules/
?


Re: Work-around for "indexed without position data"

2017-06-05 Thread Solr User
Sorry for the delay.  I was able to reproduce this easily with my setup,
but reproducing this on a Solr example proved challenging.  Hopefully the
work that I did to find the situation in which this is produced will help
in resolving the problem.  The driving factor for this appears to be how
updates are sent to Solr.  When sending batches of updates with commits,
the problem is reproduced.  If the commit is held until after all updates
are sent, then no problem is produced.  This leads me to believe that this
issue has something to do with overlapping commits or index merges.  This
was reproducible regardless of running classic or managed schema and
regardless of running Solr core or SolrCloud.

There are not many steps to reproduce this, but you will need a way to send
these updates.  I have included inline create.sh and create.pl scripts to
generate the data and send the updates.  You can index a lastModified field
or something to convince yourself that everything has been re-indexed.  I
left that out to keep the steps lean.  Also, this test is using commit
statements from the client sending the updates for simplicity even though
it is not a good practice.  My normal setup is using Solrj with
commitWithin to allow Solr to manage when the commits take place, but the
same error is produced either way.


*STEPS TO REPRODUCE*

   1. Install Solr 5.5.3 and change to that working directory
   2. bin/solr -e techproducts
   3. bin/solr stop [Why these next 3 steps?  These are to start the
   index completely new without the 32 example documents as opposed to a
   delete query.  The documents are not posted after the core is detected the
   second time.]
   4. rm -rf ./example/techproducts/solr/techproducts/data/
   5. bin/solr -e techproducts
   6. ./create.sh
   7. curl -X POST -H 'Content-type:application/json' --data-binary '{
   "replace-field":{ "name":"cat", "type":"text_en_splitting", "indexed":true,
   "multiValued":true, "stored":true } }'
   http://localhost:8983/solr/techproducts/schema
   8.
   http://localhost:8983/solr/techproducts/select?q=cat:%22hard%20drive%22
   [error]
   9. ./create.sh
   10.
   http://localhost:8983/solr/techproducts/select?q=cat:%22hard%20drive%22
   [error even though all documents have been re-indexed]

*create.sh*
#!/bin/bash
for i in {1..100}; do
echo "$i"
./create.pl $i > ./create.xml$i
curl http://localhost:8983/solr/techproducts/update?commit=true -H "Content-Type: text/xml" --data-binary @./create.xml$i
done

*create.pl <http://create.pl>*
#!/usr/bin/perl
my $S = $ARGV[0];
my $I = 100;
my $N = $S*$I + $I;
my $i;
print "<add>\n";
for($i=$S*$I; $i<$N; $i++) {
   print "<doc><field name=\"id\">SP${i}</field><field name=\"cat\">hard drive ${i}</field></doc>\n";
}
print "</add>\n";


On Fri, May 26, 2017 at 2:14 AM, Rick Leir <rl...@leirtech.com> wrote:

> Can you reproduce this error? What are the steps you take to reproduce it?
> ( simple is better).
>
> cheers -- Rick
>
>
>
> On 2017-05-25 05:46 PM, Solr User wrote:
>
>> This is in regards to changing a field type from string to
>> text_en_splitting, re-indexing all documents, even optimizing to give the
>> index a chance to merge segments and rewrite itself entirely, and then
>> getting this error when running a phrase query:
>> java.lang.IllegalStateException: field "blah" was indexed without
>> position
>> data; cannot run PhraseQuery
>>
>> I have encountered this issue before and have always done one of the
>> following as a work-around:
>> 1.  Instead of changing the field type on an existing field just create a
>> new field and retire the old one.
>> 2.  Delete the index directory and start from scratch.
>>
>> These work-arounds are not always ideal.  Does anyone know what is holding
>> onto that old field type definition?  What thinks it is still a string?
>> Every document has been re-indexed and I am sure of this because I have a
>> time stamp indexed.  Is there any other way to get this to work?
>>
>> For what it is worth, I am running this in SolrCloud mode but I remember
>> seeing this issue before SolrCloud was released as well.
>>
>>
>


Work-around for "indexed without position data"

2017-05-25 Thread Solr User
This is in regards to changing a field type from string to
text_en_splitting, re-indexing all documents, even optimizing to give the
index a chance to merge segments and rewrite itself entirely, and then
getting this error when running a phrase query:
java.lang.IllegalStateException: field "blah" was indexed without position
data; cannot run PhraseQuery

I have encountered this issue before and have always done one of the
following as a work-around:
1.  Instead of changing the field type on an existing field just create a
new field and retire the old one.
2.  Delete the index directory and start from scratch.

These work-arounds are not always ideal.  Does anyone know what is holding
onto that old field type definition?  What thinks it is still a string?
Every document has been re-indexed and I am sure of this because I have a
time stamp indexed.  Is there any other way to get this to work?

For what it is worth, I am running this in SolrCloud mode but I remember
seeing this issue before SolrCloud was released as well.


Re: Faceting and Grouping Performance Degradation in Solr 5

2017-02-06 Thread Solr User
I am pleased to report that we are in Production on Solr 5.5.3 with
comparable performance to Solr 4.8.1 through leveraging facet.method=uif as
well as https://issues.apache.org/jira/browse/SOLR-9176.  Thanks to
everyone who worked on these!
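
(As an illustration, this is the kind of facet request affected; collection
and field names here are hypothetical:)

curl "http://localhost:8983/solr/mycollection/select?q=*:*&rows=0&facet=true&facet.field=category&facet.method=uif"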

On Mon, Oct 3, 2016 at 3:55 PM, Solr User <solr...@gmail.com> wrote:

> Below is some further testing.  This was done in an environment that had
> no other queries or updates during testing.  We ran through several
> scenarios; the results are in the table below.  The times are average
> times in milliseconds.  Same test methodology as above except there was
> a 5 minute warmup and a 15 minute test.
>
> Note that both the segment and deletions were recorded from only 1 out of
> 2 of the shards so we cannot try to extrapolate a function between them and
> the outcome.  In other words, just view them as "non-optimized" versus
> "optimized" and "has deletions" versus "no deletions".  The only exceptions
> are the 0 deletes were true for both shards and the 1 segment and 8 segment
> cases were true for both shards.  A few of the tests were repeated as well.
>
> The only conclusion that I could draw is that the number of segments and
> the number of deletes appear to greatly influence the response times, at
> least more than any difference in Solr version.  There also appears to be
> some external contributor to variance... maybe network, etc.
>
> Thoughts?
>
>
> Date       Solr Version  Deleted Docs  Segment Count  facet.method=uif  Scenario #1  Scenario #2
> 9/29/2016  5.5.2         57873         34             YES               198          92
> 9/29/2016  5.5.2         57873         34             YES               210          88
> 9/29/2016  4.8.1         176958        18             N/A               145          59
> 9/30/2016  4.8.1         593694        27             N/A               186          62
> 9/30/2016  4.8.1         593694        27             N/A               190          58
> 9/30/2016  5.5.2         57873         34             YES               208          72
> 9/30/2016  5.5.2         57873         34             YES               209          70
> 9/30/2016  5.5.2         57873         34             NO                210          77
> 9/30/2016  5.5.2         57873         34             NO                206          74
> 9/30/2016  5.5.2         0             8              NO                109          68
> 9/30/2016  5.5.2         0             8              YES               142          73
> 9/30/2016  5.5.2         0             1              YES               73           63
> 9/30/2016  5.5.2         0             1              NO                70           61
> 10/3/2016  4.8.1         0             8              N/A               160          66
> 10/3/2016  4.8.1         0             8              N/A               109          54
> 10/3/2016  4.8.1         0             1              N/A               83           52
> 10/3/2016  4.8.1         0             1              N/A               85           51
>
>
>
>
> On Wed, Sep 28, 2016 at 4:44 PM, Solr User <solr...@gmail.com> wrote:
>
>> I plan to re-test this in a separate environment that I have more control
>> over and will share the results when I can.
>>
>> On Wed, Sep 28, 2016 at 3:37 PM, Solr User <solr...@gmail.com> wrote:
>>
>>> Certainly.  And I would of course welcome anyone else to test this for
>>> themselves especially with facet.method=uif to see if that has indeed
>>> bridged the gap between Solr 4 and Solr 5.  I would be very happy if my
>>> testing is invalid due to variance, problem in process, etc.  One thing I
>>> was pondering is if I should force merge the index to a certain amount of
>>> segments because indexing yields a random number of segments and
>>> deletions.  The only thing stopping me short of doing that were
>>> observations of longer Solr 4 times even with more deletions and similar
>>> number of segments.
>>>
>>> We use Soasta as our testing tool.  Before testing, load is sent for
>>> 10-15 minutes to make sure any Solr caches have stabilized.  Then the test
>>> is run for 30 minutes of steady volume with Scenario #1 tested at 15
>>> req/sec and Scenario #2 tested at 100 req/sec.  Each request is different
>>> with input being pulled from data files.  The requests are repeatable test
>>> to test.
>>>
>>> The numbers posted above are average response times as reported by
>>> Soasta.  However, respective time differences are supported by Splunk which
>>> indexes the Solr logs and Dynatrace which is instrumented on one of the
>>> JVM's.
>>>
>>> The versions are deployed to the same machines thereby overlaying the
>>> previous installation.  Going Solr 4 to Solr 5, full indexing is run with
>>> the same input data.  Being in SolrCloud mode, the full indexing comprises
>>> of indexing all documents and then deleting any that were not touched.
>>> Going Solr 5 back to Solr 4, the snapshot is restored since Solr 4 will not
>>> load with a Solr 5 index.  Testing Solr 4 after reverting yields the same
>>> results as the previous Solr 4 test.
>>>
>>>
>>> On Wed, Sep 28, 2016 at 4:02 AM, Toke Eskildsen <t...@statsbiblioteket.dk>
>>>

Re: ClassNotFoundException with Custom ZkACLProvider

2016-11-15 Thread Solr User
For those interested, I ended up bundling the customized ACL provider with
the solr.war.  I could not stomach looking at the stack trace in the logs.

On Mon, Nov 7, 2016 at 4:47 PM, Solr User <solr...@gmail.com> wrote:

> This is mostly just an FYI regarding future work on issues like SOLR-8792.
>
> I wanted admin update but world read on ZK since I do not have anything
> sensitive from a read perspective in the Solr data and did not want to
> force all SolrCloud clients to implement authentication just for read.  So,
> I extended DefaultZkACLProvider and implemented a replacement for
> VMParamsAllAndReadonlyDigestZkACLProvider.
>
> My custom code is loaded from the sharedLib in solr.xml.  However, there
> is a temporary ZK lookup to read solr.xml (and chroot) which is obviously
> done before loading sharedLib.  Therefore, I am faced with a
> ClassNotFoundException.  This has no negative effect on the ACL
> functionality, just the annoying stack trace in the logs.  I do not want
> to package this custom code with the Solr code and do not want to package
> this along with Solr dependencies in the Jetty lib/ext.
>
> So, I am planning to live with the stack trace and just wanted to share
> this for any future work on the dynamic solr.xml and chroot lookups or in
> case I am missing some work-around.
>
> Thanks!
>
>


ClassNotFoundException with Custom ZkACLProvider

2016-11-07 Thread Solr User
This is mostly just an FYI regarding future work on issues like SOLR-8792.

I wanted admin update but world read on ZK since I do not have anything
sensitive from a read perspective in the Solr data and did not want to
force all SolrCloud clients to implement authentication just for read.  So,
I extended DefaultZkACLProvider and implemented a replacement for
VMParamsAllAndReadonlyDigestZkACLProvider.

My custom code is loaded from the sharedLib in solr.xml.  However, there is
a temporary ZK lookup to read solr.xml (and chroot) which is obviously done
before loading sharedLib.  Therefore, I am faced with a
ClassNotFoundException.  This has no negative effect on the ACL
functionality, just the annoying stack trace in the logs.  I do not want
to package this custom code with the Solr code and do not want to package
this along with Solr dependencies in the Jetty lib/ext.

So, I am planning to live with the stack trace and just wanted to share
this for any future work on the dynamic solr.xml and chroot lookups or in
case I am missing some work-around.

Thanks!


Re: Faceting and Grouping Performance Degradation in Solr 5

2016-10-03 Thread Solr User
Below is some further testing.  This was done in an environment that had no
other queries or updates during testing.  We ran through several scenarios;
the results are in the table below.  The times are average times in
milliseconds.  Same test methodology as above except there was a 5 minute
warmup and a 15 minute test.

Note that both the segment and deletions were recorded from only 1 out of 2
of the shards so we cannot try to extrapolate a function between them and
the outcome.  In other words, just view them as "non-optimized" versus
"optimized" and "has deletions" versus "no deletions".  The only exceptions
are the 0 deletes were true for both shards and the 1 segment and 8 segment
cases were true for both shards.  A few of the tests were repeated as well.

The only conclusion that I could draw is that the number of segments and
the number of deletes appear to greatly influence the response times, at
least more than any difference in Solr version.  There also appears to be
some external contributor to variance... maybe network, etc.

Thoughts?


Date       Solr Version  Deleted Docs  Segment Count  facet.method=uif  Scenario #1  Scenario #2
9/29/2016  5.5.2         57873         34             YES               198          92
9/29/2016  5.5.2         57873         34             YES               210          88
9/29/2016  4.8.1         176958        18             N/A               145          59
9/30/2016  4.8.1         593694        27             N/A               186          62
9/30/2016  4.8.1         593694        27             N/A               190          58
9/30/2016  5.5.2         57873         34             YES               208          72
9/30/2016  5.5.2         57873         34             YES               209          70
9/30/2016  5.5.2         57873         34             NO                210          77
9/30/2016  5.5.2         57873         34             NO                206          74
9/30/2016  5.5.2         0             8              NO                109          68
9/30/2016  5.5.2         0             8              YES               142          73
9/30/2016  5.5.2         0             1              YES               73           63
9/30/2016  5.5.2         0             1              NO                70           61
10/3/2016  4.8.1         0             8              N/A               160          66
10/3/2016  4.8.1         0             8              N/A               109          54
10/3/2016  4.8.1         0             1              N/A               83           52
10/3/2016  4.8.1         0             1              N/A               85           51




On Wed, Sep 28, 2016 at 4:44 PM, Solr User <solr...@gmail.com> wrote:

> I plan to re-test this in a separate environment that I have more control
> over and will share the results when I can.
>
> On Wed, Sep 28, 2016 at 3:37 PM, Solr User <solr...@gmail.com> wrote:
>
>> Certainly.  And I would of course welcome anyone else to test this for
>> themselves especially with facet.method=uif to see if that has indeed
>> bridged the gap between Solr 4 and Solr 5.  I would be very happy if my
>> testing is invalid due to variance, problem in process, etc.  One thing I
>> was pondering is if I should force merge the index to a certain amount of
>> segments because indexing yields a random number of segments and
>> deletions.  The only thing stopping me short of doing that were
>> observations of longer Solr 4 times even with more deletions and similar
>> number of segments.
>>
>> We use Soasta as our testing tool.  Before testing, load is sent for
>> 10-15 minutes to make sure any Solr caches have stabilized.  Then the test
>> is run for 30 minutes of steady volume with Scenario #1 tested at 15
>> req/sec and Scenario #2 tested at 100 req/sec.  Each request is different
>> with input being pulled from data files.  The requests are repeatable test
>> to test.
>>
>> The numbers posted above are average response times as reported by
>> Soasta.  However, respective time differences are supported by Splunk which
>> indexes the Solr logs and Dynatrace which is instrumented on one of the
>> JVM's.
>>
>> The versions are deployed to the same machines thereby overlaying the
>> previous installation.  Going Solr 4 to Solr 5, full indexing is run with
>> the same input data.  Being in SolrCloud mode, the full indexing comprises
>> of indexing all documents and then deleting any that were not touched.
>> Going Solr 5 back to Solr 4, the snapshot is restored since Solr 4 will not
>> load with a Solr 5 index.  Testing Solr 4 after reverting yields the same
>> results as the previous Solr 4 test.
>>
>>
>> On Wed, Sep 28, 2016 at 4:02 AM, Toke Eskildsen <t...@statsbiblioteket.dk>
>> wrote:
>>
>>> On Tue, 2016-09-27 at 15:08 -0500, Solr User wrote:
>>> > Further testing indicates that any performance difference is not due
>>> > to deletes.  Both Solr 4.8.1 and Solr 5.5.2 benefited from removing
>>> > deletes.
>>>
>>> Sanity check: Could you describe how you test?
>>>
>>> * How many queries do you issue for each test?
>>> * Are each query a new one or do you re-use the same query?
>>> * Do you discard the first X calls?
>>> * Are the numbers averages, medians or something third?
>>> * What do you do about disk cache?
>>> * Are both Solr's on the same machine?
>>> * Do they use the same index?
>>> * Do you alternate between testing 4.8.1 and 5.5.2 first?
>>>
>>> - Toke Eskildsen, State and University Library, Denmark
>>>
>>
>>
>


Re: Faceting and Grouping Performance Degradation in Solr 5

2016-09-28 Thread Solr User
I plan to re-test this in a separate environment that I have more control
over and will share the results when I can.

On Wed, Sep 28, 2016 at 3:37 PM, Solr User <solr...@gmail.com> wrote:

> Certainly.  And I would of course welcome anyone else to test this for
> themselves especially with facet.method=uif to see if that has indeed
> bridged the gap between Solr 4 and Solr 5.  I would be very happy if my
> testing is invalid due to variance, problem in process, etc.  One thing I
> was pondering is if I should force merge the index to a certain amount of
> segments because indexing yields a random number of segments and
> deletions.  The only thing stopping me short of doing that were
> observations of longer Solr 4 times even with more deletions and similar
> number of segments.
>
> We use Soasta as our testing tool.  Before testing, load is sent for 10-15
> minutes to make sure any Solr caches have stabilized.  Then the test is run
> for 30 minutes of steady volume with Scenario #1 tested at 15 req/sec and
> Scenario #2 tested at 100 req/sec.  Each request is different with input
> being pulled from data files.  The requests are repeatable test to test.
>
> The numbers posted above are average response times as reported by
> Soasta.  However, respective time differences are supported by Splunk which
> indexes the Solr logs and Dynatrace which is instrumented on one of the
> JVM's.
>
> The versions are deployed to the same machines thereby overlaying the
> previous installation.  Going Solr 4 to Solr 5, full indexing is run with
> the same input data.  Being in SolrCloud mode, the full indexing comprises
> of indexing all documents and then deleting any that were not touched.
> Going Solr 5 back to Solr 4, the snapshot is restored since Solr 4 will not
> load with a Solr 5 index.  Testing Solr 4 after reverting yields the same
> results as the previous Solr 4 test.
>
>
> On Wed, Sep 28, 2016 at 4:02 AM, Toke Eskildsen <t...@statsbiblioteket.dk>
> wrote:
>
>> On Tue, 2016-09-27 at 15:08 -0500, Solr User wrote:
>> > Further testing indicates that any performance difference is not due
>> > to deletes.  Both Solr 4.8.1 and Solr 5.5.2 benefited from removing
>> > deletes.
>>
>> Sanity check: Could you describe how you test?
>>
>> * How many queries do you issue for each test?
>> * Are each query a new one or do you re-use the same query?
>> * Do you discard the first X calls?
>> * Are the numbers averages, medians or something third?
>> * What do you do about disk cache?
>> * Are both Solr's on the same machine?
>> * Do they use the same index?
>> * Do you alternate between testing 4.8.1 and 5.5.2 first?
>>
>> - Toke Eskildsen, State and University Library, Denmark
>>
>
>


Re: Faceting and Grouping Performance Degradation in Solr 5

2016-09-28 Thread Solr User
Certainly.  And I would of course welcome anyone else to test this for
themselves especially with facet.method=uif to see if that has indeed
bridged the gap between Solr 4 and Solr 5.  I would be very happy if my
testing is invalid due to variance, problem in process, etc.  One thing I
was pondering is if I should force merge the index to a certain amount of
segments because indexing yields a random number of segments and
deletions.  The only thing stopping me short of doing that were
observations of longer Solr 4 times even with more deletions and similar
number of segments.

We use Soasta as our testing tool.  Before testing, load is sent for 10-15
minutes to make sure any Solr caches have stabilized.  Then the test is run
for 30 minutes of steady volume with Scenario #1 tested at 15 req/sec and
Scenario #2 tested at 100 req/sec.  Each request is different with input
being pulled from data files.  The requests are repeatable test to test.

The numbers posted above are average response times as reported by Soasta.
However, respective time differences are supported by Splunk which indexes
the Solr logs and Dynatrace which is instrumented on one of the JVM's.

The versions are deployed to the same machines thereby overlaying the
previous installation.  Going Solr 4 to Solr 5, full indexing is run with
the same input data.  Being in SolrCloud mode, the full indexing comprises
of indexing all documents and then deleting any that were not touched.
Going Solr 5 back to Solr 4, the snapshot is restored since Solr 4 will not
load with a Solr 5 index.  Testing Solr 4 after reverting yields the same
results as the previous Solr 4 test.


On Wed, Sep 28, 2016 at 4:02 AM, Toke Eskildsen <t...@statsbiblioteket.dk>
wrote:

> On Tue, 2016-09-27 at 15:08 -0500, Solr User wrote:
> > Further testing indicates that any performance difference is not due
> > to deletes.  Both Solr 4.8.1 and Solr 5.5.2 benefited from removing
> > deletes.
>
> Sanity check: Could you describe how you test?
>
> * How many queries do you issue for each test?
> * Are each query a new one or do you re-use the same query?
> * Do you discard the first X calls?
> * Are the numbers averages, medians or something third?
> * What do you do about disk cache?
> * Are both Solr's on the same machine?
> * Do they use the same index?
> * Do you alternate between testing 4.8.1 and 5.5.2 first?
>
> - Toke Eskildsen, State and University Library, Denmark
>


Re: Faceting and Grouping Performance Degradation in Solr 5

2016-09-27 Thread Solr User
Further testing indicates that any performance difference is not due to
deletes.  Both Solr 4.8.1 and Solr 5.5.2 benefited from removing deletes.
The times appear to converge on an optimized index.  Below are the
details.  Not sure what else to make of this at this point other than
moving forward with an upgrade with an optimized index wherever possible.

Scenario #1:  Using facet.method=uif with faceting on several multi-valued
fields.
4.8.1 (with deletes): 115 ms
5.5.2 (with deletes): 155 ms
4.8.1 (without deletes): 104 ms
5.5.2 (without deletes): 125 ms
4.8.1 (1 segment without deletes): 55 ms
5.5.2 (1 segment without deletes): 44 ms

Scenario #2:  Using facet.method=enum with faceting on several multi-valued
fields.  These fields are different than Scenario #1 and perform much
better with enum hence that method is used instead.
4.8.1 (with deletes): 38 ms
5.5.2 (with deletes): 49 ms
4.8.1 (without deletes): 35 ms
5.5.2 (without deletes): 42 ms
4.8.1 (1 segment without deletes): 28 ms
5.5.2 (1 segment without deletes): 34 ms
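
(The "without deletes" runs can be produced by purging deleted documents
with a commit; a sketch, with a hypothetical collection name:)

curl "http://localhost:8983/solr/mycollection/update?commit=true&expungeDeletes=true"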

On Tue, Sep 27, 2016 at 3:45 AM, Alessandro Benedetti <abenede...@apache.org
> wrote:

> Hi !
> At the time we didn't investigate the deletion implication at all.
> This can be interesting.
> if you proceed with your investigations and discover what changed in the
> deletion approach, I would be more than happy to help!
>
> Cheers
>
> On Mon, Sep 26, 2016 at 10:59 PM, Solr User <solr...@gmail.com> wrote:
>
> > Thanks again for your work on honoring the facet.method.  I have an
> > observation that I would like to share and get your feedback on if
> > possible.
> >
> > I performance tested Solr 5.5.2 with various facet queries and the only
> way
> > I get comparable results to Solr 4.8.1 is when I expungeDeletes.  Is it
> > possible that Solr 5 is not as efficiently ignoring deletes as Solr 4?
> > Here are the details.
> >
> > Scenario #1:  Using facet.method=uif with faceting on several
> multi-valued
> > fields.
> > 4.8.1 (with deletes): 115 ms
> > 5.5.2 (with deletes): 155 ms
> > 5.5.2 (without deletes): 125 ms
> > 5.5.2 (1 segment without deletes): 44 ms
> >
> > Scenario #2:  Using facet.method=enum with faceting on several
> multi-valued
> > fields.  These fields are different than Scenario #1 and perform much
> > better with enum hence that method is used instead.
> > 4.8.1 (with deletes): 38 ms
> > 5.5.2 (with deletes): 49 ms
> > 5.5.2 (without deletes): 42 ms
> > 5.5.2 (1 segment without deletes): 34 ms
> >
> >
> >
> > On Tue, May 31, 2016 at 11:57 AM, Alessandro Benedetti <
> > abenede...@apache.org> wrote:
> >
> > > Interesting developments :
> > >
> > > https://issues.apache.org/jira/browse/SOLR-9176
> > >
> > > I think we found why term Enum seems slower in recent Solr !
> > > In our case it is likely to be related to the commit I mention in the
> > Jira.
> > > Have a check Joel !
> > >
> > > On Wed, May 25, 2016 at 12:30 PM, Alessandro Benedetti <
> > > abenede...@apache.org> wrote:
> > >
> > > > I am investigating this scenario right now.
> > > > I can confirm that the enum slowness is in Solr 6.0 as well.
> > > > And I agree with Joel, it seems to be un-related with the famous
> > faceting
> > > > regression :(
> > > >
> > > > Furthermore with the legacy facet approach, if you set docValues for
> > the
> > > > field you are not going to be able to try the enum approach anymore.
> > > >
> > > > org/apache/solr/request/SimpleFacets.java:448
> > > >
> > > > if (method == FacetMethod.ENUM && sf.hasDocValues()) {
> > > >   // only fc can handle docvalues types
> > > >   method = FacetMethod.FC;
> > > > }
> > > >
> > > >
> > > > I got really horrible regressions simply using term enum in both
> Solr 4
> > > > and Solr 6.
> > > >
> > > > And even the most optimized fcs approach with docValues and
> > > > facet.threads=nCore does not perform as the simple enum in Solr 4 .
> > > >
> > > > i.e.
> > > >
> > > > For some sample queries I have 40 ms vs 160 ms and similar...
> > > > I think we should open an issue if we can confirm it is not related
> > with
> > > > the other.
> > > > A lot of people will continue using the legacy approach for a
> while...
> > > >
> > > > On Wed, May 18, 2016 at 10:42 PM, Joel Bernstein <joels...@gmail.com
> >

Re: Faceting and Grouping Performance Degradation in Solr 5

2016-09-26 Thread Solr User
Thanks again for your work on honoring the facet.method.  I have an
observation that I would like to share and get your feedback on if possible.

I performance tested Solr 5.5.2 with various facet queries and the only way
I get comparable results to Solr 4.8.1 is when I expungeDeletes.  Is it
possible that Solr 5 is not as efficiently ignoring deletes as Solr 4?
Here are the details.

Scenario #1:  Using facet.method=uif with faceting on several multi-valued
fields.
4.8.1 (with deletes): 115 ms
5.5.2 (with deletes): 155 ms
5.5.2 (without deletes): 125 ms
5.5.2 (1 segment without deletes): 44 ms

Scenario #2:  Using facet.method=enum with faceting on several multi-valued
fields.  These fields are different than Scenario #1 and perform much
better with enum hence that method is used instead.
4.8.1 (with deletes): 38 ms
5.5.2 (with deletes): 49 ms
5.5.2 (without deletes): 42 ms
5.5.2 (1 segment without deletes): 34 ms



On Tue, May 31, 2016 at 11:57 AM, Alessandro Benedetti <
abenede...@apache.org> wrote:

> Interesting developments :
>
> https://issues.apache.org/jira/browse/SOLR-9176
>
> I think we found why term Enum seems slower in recent Solr !
> In our case it is likely to be related to the commit I mention in the Jira.
> Have a check Joel !
>
> On Wed, May 25, 2016 at 12:30 PM, Alessandro Benedetti <
> abenede...@apache.org> wrote:
>
> > I am investigating this scenario right now.
> > I can confirm that the enum slowness is in Solr 6.0 as well.
> > And I agree with Joel, it seems to be un-related with the famous faceting
> > regression :(
> >
> > Furthermore with the legacy facet approach, if you set docValues for the
> > field you are not going to be able to try the enum approach anymore.
> >
> > org/apache/solr/request/SimpleFacets.java:448
> >
> > if (method == FacetMethod.ENUM && sf.hasDocValues()) {
> >   // only fc can handle docvalues types
> >   method = FacetMethod.FC;
> > }
> >
> >
> > I got really horrible regressions simply using term enum in both Solr 4
> > and Solr 6.
> >
> > And even the most optimized fcs approach with docValues and
> > facet.threads=nCore does not perform as the simple enum in Solr 4 .
> >
> > i.e.
> >
> > For some sample queries I have 40 ms vs 160 ms and similar...
> > I think we should open an issue if we can confirm it is not related with
> > the other.
> > A lot of people will continue using the legacy approach for a while...
> >
> > On Wed, May 18, 2016 at 10:42 PM, Joel Bernstein <joels...@gmail.com>
> > wrote:
> >
> >> The enum slowness is interesting. It would appear on the surface to not
> be
> >> related to the FieldCache issue. I don't think the main emphasis of the
> >> JSON facet API has been the enum approach. You may find using the JSON
> >> facet API and eliminating the use of enum meets your performance needs.
> >>
> >> With the CollapsingQParserPlugin top_fc is definitely faster during
> >> queries. The tradeoff is slower warming times and increased memory usage
> >> if
> >> the collapse fields are used in faceting, as faceting will load the
> field
> >> into a different cache.
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >> On Wed, May 18, 2016 at 5:28 PM, Solr User <solr...@gmail.com> wrote:
> >>
> >> > Joel,
> >> >
> >> > Thank you for taking the time to respond to my question.  I tried the
> >> JSON
> >> > Facet API for one query that uses facet.method=enum (since this one
> has
> >> a
> >> > ton of unique values and performed better with enum) but this was way
> >> > slower than even the slower Solr 5 times.  I did not try the new API
> >> with
> >> > the non-enum queries though so I will give that a go.  It looks like
> >> Solr
> >> > 5.5.1 also has a facet.method=uif which will be interesting to try.
> >> >
> >> > If these do not prove helpful, it looks like I will need to wait for
> >> > SOLR-8096 to be resolved before upgrading.
> >> >
> >> > Thanks also for your comment on top_fc for the CollapsingQParser.  I
> use
> >> > collapse/expand for some queries but traditional grouping for others
> >> due to
> >> > performance.  It will be interesting to see if those grouping queries
> >> > perform better now using CollapsingQParser with top_fc.
> >> >
> >> > On Wed, May 18, 2016 at 11:39 AM, Joel Bernstein <joels...@gmail.com>
> >> > wrote:

Re: Indexing a (File attached to a document)

2016-05-24 Thread Solr User
Hi 
I am using the MapReduceIndexerTool to index data from HDFS, using
morphlines as the ETL tool.

The data paths are specified as XPath expressions in the morphline file.

Sorry for the delay.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-a-File-attached-to-a-document-tp4276334p4278730.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Faceting and Grouping Performance Degradation in Solr 5

2016-05-18 Thread Solr User
Joel,

Thank you for taking the time to respond to my question.  I tried the JSON
Facet API for one query that uses facet.method=enum (since this one has a
ton of unique values and performed better with enum) but this was way
slower than even the slower Solr 5 times.  I did not try the new API with
the non-enum queries though so I will give that a go.  It looks like Solr
5.5.1 also has a facet.method=uif which will be interesting to try.

If these do not prove helpful, it looks like I will need to wait for
SOLR-8096 to be resolved before upgrading.

Thanks also for your comment on top_fc for the CollapsingQParser.  I use
collapse/expand for some queries but traditional grouping for others due to
performance.  It will be interesting to see if those grouping queries
perform better now using CollapsingQParser with top_fc.
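
(For reference, a sketch of a collapse query with the top-level FieldCache
hint; the collapse field name is hypothetical:)

curl -G "http://localhost:8983/solr/mycollection/select" \
  --data-urlencode 'q=*:*' \
  --data-urlencode 'fq={!collapse field=groupId hint=top_fc}' \
  --data-urlencode 'expand=true'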

On Wed, May 18, 2016 at 11:39 AM, Joel Bernstein <joels...@gmail.com> wrote:

> Yes, SOLR-8096 is the issue here.
>
> I don't believe indexing with docValues is going to help too much with
> this. The enum slowness may not be related, but I'm not positive about
> that.
>
> The major slowdowns are likely due to the removal of the top level
> FieldCache from general use and the removal of the FieldValuesCache which
> was used for multi-value field faceting.
>
> The JSON facet API covers all the functionality in the traditional
> faceting, and it has been developed to be very performant.
>
> You may also want to see if Collapse/Expand can meet your applications
> needs rather Grouping. It allows you to specify using a top level
> FieldCache if performance is a blocker without it.
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Wed, May 18, 2016 at 10:42 AM, Solr User <solr...@gmail.com> wrote:
>
> > Does anyone know the answer to this?
> >
> > On Wed, May 4, 2016 at 2:19 PM, Solr User <solr...@gmail.com> wrote:
> >
> > > I recently was attempting to upgrade from Solr 4.8.1 to Solr 5.4.1 but
> > had
> > > to abort due to average response times degraded from a baseline volume
> > > performance test.  The affected queries involved faceting (both enum
> > method
> > > and default) and grouping.  There is a critical bug
> > > https://issues.apache.org/jira/browse/SOLR-8096 currently open which I
> > > gather is the cause of the slower response times.  One concern I have
> is
> > > that discussions around the issue offer the suggestion of indexing with
> > > docValues which alleviated the problem in at least that one reported
> > case.
> > > However, indexing with docValues did not improve the performance in my
> > case.
> > >
> > > Can someone please confirm or correct my understanding that this issue
> > has
> > > no path forward at this time and specifically that it is already known
> > that
> > > docValues does not necessarily solve this?
> > >
> > > Thanks in advance!
> > >
> > >
> > >
> >
>


Re: Faceting and Grouping Performance Degradation in Solr 5

2016-05-18 Thread Solr User
Does anyone know the answer to this?

On Wed, May 4, 2016 at 2:19 PM, Solr User <solr...@gmail.com> wrote:

> I recently was attempting to upgrade from Solr 4.8.1 to Solr 5.4.1 but had
> to abort due to average response times degraded from a baseline volume
> performance test.  The affected queries involved faceting (both enum method
> and default) and grouping.  There is a critical bug
> https://issues.apache.org/jira/browse/SOLR-8096 currently open which I
> gather is the cause of the slower response times.  One concern I have is
> that discussions around the issue offer the suggestion of indexing with
> docValues which alleviated the problem in at least that one reported case.
> However, indexing with docValues did not improve the performance in my case.
>
> Can someone please confirm or correct my understanding that this issue has
> no path forward at this time and specifically that it is already known that
> docValues does not necessarily solve this?
>
> Thanks in advance!
>
>
>


Indexing a (File attached to a document)

2016-05-12 Thread Solr User
Hi

If I index a document with a file attachment in Solr, can I also see the
data of that attached file when querying that particular document? Please
help me with this.


Thanks & Regards
Vidya Nadella



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-a-File-attached-to-a-document-tp4276334.html
Sent from the Solr - User mailing list archive at Nabble.com.


Faceting and Grouping Performance Degradation in Solr 5

2016-05-04 Thread Solr User
I recently was attempting to upgrade from Solr 4.8.1 to Solr 5.4.1 but had
to abort due to average response times degraded from a baseline volume
performance test.  The affected queries involved faceting (both enum method
and default) and grouping.  There is a critical bug
https://issues.apache.org/jira/browse/SOLR-8096 currently open which I
gather is the cause of the slower response times.  One concern I have is
that discussions around the issue offer the suggestion of indexing with
docValues which alleviated the problem in at least that one reported case.
However, indexing with docValues did not improve the performance in my case.

Can someone please confirm or correct my understanding that this issue has
no path forward at this time and specifically that it is already known that
docValues does not necessarily solve this?

Thanks in advance!


Re: Solr suggester throws error on core reload.

2015-08-14 Thread Nutch Solr User
 I want to use AnalyzingInfixLookupFactory for my autosuggestions.

Any idea when this issue will get fixed? Do we have any workaround for this
issue?



-
Nutch Solr User

The ultimate search engine would basically understand everything in the world, 
and it would always give you the right thing.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-suggester-throws-error-on-core-reload-tp4220725p4222902.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr suggester throws error on core reload.

2015-08-14 Thread Nutch Solr User
Hi Erick,

Sorry for the confusion caused; next time I will be more careful when
posting questions to the forum.

Actually we are using AnalyzingInfixLookupFactory for auto-suggestions, and
it currently has an open issue with core reload
(https://issues.apache.org/jira/browse/SOLR-6246). So my question was
related to the resolution of that issue.



-
Nutch Solr User

The ultimate search engine would basically understand everything in the world, 
and it would always give you the right thing.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-suggester-throws-error-on-core-reload-tp4220725p4223098.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: multiple but identical suggestions in autocomplete

2015-08-05 Thread Nutch Solr User
You will need to call this service from the UI the same way you are calling
the suggester component currently (perhaps on every key-press event in the
text box), passing the required parameters.

The service will internally form a Solr suggester query and query Solr.
From the returned response it will keep only the unique suggestions among
the top N and return those to the UI.



-
Nutch Solr User

The ultimate search engine would basically understand everything in the world, 
and it would always give you the right thing.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/multiple-but-identical-suggestions-in-autocomplete-tp4220055p4220953.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr suggester throws error on core reload.

2015-08-04 Thread Nutch Solr User
I am using AnalyzingInfixSuggester for the auto-suggest feature, but
whenever I try to reload the Solr core the following error is thrown:

org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
NativeFSLock@E:\SSearch\SolrServer\solr-5.2.1\server\solr\ssearch\data\main-suggest\write.lock

After a restart everything works fine. What could be the reason for this?







-
Nutch Solr User

The ultimate search engine would basically understand everything in the world, 
and it would always give you the right thing.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-suggester-throws-error-on-core-reload-tp4220725.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: multiple but identical suggestions in autocomplete

2015-08-04 Thread Nutch Solr User
Maybe you are using DocumentDictionaryFactory, because
HighFrequencyDictionaryFactory will never return duplicate terms.

We also had the same problem with *DocumentDictionaryFactory +
AnalyzingInfixSuggester*. We created a service between the UI and Solr
which groups duplicate suggestions and returns a list to the UI that
contains only unique suggestions.
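
(A minimal sketch of that de-duplication step, shown here with jq against a
hypothetical suggest endpoint instead of a separate service:)

curl -s "http://localhost:8983/solr/mycore/suggest?suggest=true&suggest.q=treatm&wt=json" \
  | jq '[.suggest[] | .[] | .suggestions[].term] | unique'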



-
Nutch Solr User

The ultimate search engine would basically understand everything in the world, 
and it would always give you the right thing.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/multiple-but-identical-suggestions-in-autocomplete-tp4220055p4220727.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr suggester throws error on core reload.

2015-08-04 Thread Nutch Solr User
I found an existing issue here: https://issues.apache.org/jira/browse/SOLR-6246
. It says fix version 5.2, but the Resolution is unresolved.



-
Nutch Solr User

The ultimate search engine would basically understand everything in the world, 
and it would always give you the right thing.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-suggester-throws-error-on-core-reload-tp4220725p4220730.html
Sent from the Solr - User mailing list archive at Nabble.com.


Suggester always highlights suggestions even if we pass highlight=false

2015-07-30 Thread Nutch Solr User
I am still experiencing the https://issues.apache.org/jira/browse/SOLR-6648 issue
with Solr 5.2.1: even if I send highlight=false, Solr returns highlighted
suggestions. Any idea why this is happening?

*URL:*
http://solrhost:solrport/mycorename/suggest?suggest.dictionary=altSuggester&suggest.dictionary=mainSuggester&wt=json&suggest.q=treatm&suggest.count=20&highlight=false

*Response:*

{
  "responseHeader": {
    "status": 0,
    "QTime": 6
  },
  "suggest": {
    "mainSuggester": {
      "treatm": {
        "numFound": 20,
        "suggestions": [
          {
            "term": "<b>Treatm</b>ent Refusal",
            "weight": 0,
            "payload": ""
          },
          {
            "term": "Withholding <b>Treatm</b>ent",
            "weight": 0,
            "payload": ""
          },
          {
            "term": "<b>Treatm</b>ent Refusal",
            "weight": 0,
            "payload": ""
          },
          {
            "term": "Withholding <b>Treatm</b>ent",
            "weight": 0,
            "payload": ""
          }
        ]
      }
    },
    "altSuggester": {
      "treatm": {
        "numFound": 2,
        "suggestions": [
          {
            "term": "<b>treatm</b>ent",
            "weight": 197,
            "payload": ""
          },
          {
            "term": "<b>treatm</b>ents",
            "weight": 5,
            "payload": ""
          }
        ]
      }
    }
  }
}


*My configuration:*

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mainSuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">keyphrases</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="indexPath">main-suggest</str>
    <str name="buildOnStartup">true</str>
  </lst>
  <lst name="suggester">
    <str name="name">altSuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
    <str name="field">text</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="indexPath">alt-suggest</str>
    <str name="allTermsRequired">false</str>
    <str name="buildOnStartup">true</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mainSuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
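
Side note: since the request parameter is being ignored here, one possible
workaround is disabling highlighting per suggester in the config instead.
A sketch only, assuming this Solr version honors AnalyzingInfixLookupFactory's
highlight option (all other suggester parameters unchanged):

<lst name="suggester">
  <str name="name">mainSuggester</str>
  <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
  <!-- disable highlighting at the suggester level instead of per request -->
  <str name="highlight">false</str>
</lst>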





Query token access in solr function queries

2015-07-29 Thread Nutch Solr User
How can I access each query token separately in a function query? I want to
pass each token to the ttf function to get the total term frequency for that
token. Currently I only have access to the main query via the $q parameter.

Do I have to write some code to tokenize the original query and add the tokens
as additional parameters, say t1, t2, t3, before sending the query to Solr?

Is there any other way to do this using existing Solr functions?

One more question: if I have to write my own function for this, how should
I return these tokens?
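
For reference: if the client does tokenize the query up front, the literal-term
form of ttf can be called once per token — an illustrative request (the field
name, the tokens and the ttf1/ttf2/ttf3 aliases are made up):

http://localhost:8983/solr/mycore/select?q=fort+saint+john&fl=id,score,ttf1:ttf(text,'fort'),ttf2:ttf(text,'saint'),ttf3:ttf(text,'john')

Each aliased pseudo-field then carries the total term frequency of one token.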





Re: Tips for faster indexing

2015-07-21 Thread solr . user . 1507
I can confirm this behavior when sending JSON docs in batches; it never
happens when sending one by one, but it is sporadic when sending batches.

As if Solr/Jetty drops a couple of documents out of the batch.

Regards

 On 21 Jul 2015, at 21:38, Vineeth Dasaraju vineeth.ii...@gmail.com wrote:
 
 Hi,
 
 Thank you Erick for your inputs. I tried creating batches of 1000 objects
 and indexing them to Solr. The performance is way better than before, but I
 find that the number of indexed documents shown in the dashboard is lower
 than the number of documents I actually indexed through SolrJ. My code is
 as follows:
 
 private static String SOLR_SERVER_URL = "http://localhost:8983/solr/newcore";
 private static String JSON_FILE_PATH = "/home/vineeth/week1_fixed.json";
 private static JSONParser parser = new JSONParser();
 private static SolrClient solr = new HttpSolrClient(SOLR_SERVER_URL);

 public static void main(String[] args) throws IOException,
         SolrServerException, ParseException {
     File file = new File(JSON_FILE_PATH);
     Scanner scn = new Scanner(file, "UTF-8");
     JSONObject object;
     int i = 0;
     Collection<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
     while (scn.hasNext()) {
         object = (JSONObject) parser.parse(scn.nextLine());
         SolrInputDocument doc = indexJSON(object);
         batch.add(doc);
         if (i % 1000 == 0) {
             System.out.println("Indexed " + (i + 1) + " objects.");
             solr.add(batch);
             batch = new ArrayList<SolrInputDocument>();
         }
         i++;
     }
     solr.add(batch);   // send the final partial batch
     solr.commit();
     System.out.println("Indexed " + i + " objects.");
 }

 public static SolrInputDocument indexJSON(JSONObject jsonOBJ) throws
         ParseException, IOException, SolrServerException {
     SolrInputDocument mainEvent = new SolrInputDocument();
     mainEvent.addField("id", generateID());
     mainEvent.addField("RawEventMessage", jsonOBJ.get("RawEventMessage"));
     mainEvent.addField("EventUid", jsonOBJ.get("EventUid"));
     mainEvent.addField("EventCollector", jsonOBJ.get("EventCollector"));
     mainEvent.addField("EventMessageType", jsonOBJ.get("EventMessageType"));
     mainEvent.addField("TimeOfEvent", jsonOBJ.get("TimeOfEvent"));
     mainEvent.addField("TimeOfEventUTC", jsonOBJ.get("TimeOfEventUTC"));

     Object obj = parser.parse(jsonOBJ.get("User").toString());
     JSONObject userObj = (JSONObject) obj;

     SolrInputDocument childUserEvent = new SolrInputDocument();
     childUserEvent.addField("id", generateID());
     childUserEvent.addField("User", userObj.get("User"));

     obj = parser.parse(jsonOBJ.get("EventDescription").toString());
     JSONObject eventdescriptionObj = (JSONObject) obj;

     SolrInputDocument childEventDescEvent = new SolrInputDocument();
     childEventDescEvent.addField("id", generateID());
     childEventDescEvent.addField("EventApplicationName",
             eventdescriptionObj.get("EventApplicationName"));
     childEventDescEvent.addField("Query", eventdescriptionObj.get("Query"));

     obj = JSONValue.parse(eventdescriptionObj.get("Information").toString());
     JSONArray informationArray = (JSONArray) obj;

     for (int i = 0; i < informationArray.size(); i++) {
         JSONObject domain = (JSONObject) informationArray.get(i);

         SolrInputDocument domainDoc = new SolrInputDocument();
         domainDoc.addField("id", generateID());
         domainDoc.addField("domainName", domain.get("domainName"));

         String s = domain.get("columns").toString();
         obj = JSONValue.parse(s);
         JSONArray ColumnsArray = (JSONArray) obj;

         SolrInputDocument columnsDoc = new SolrInputDocument();
         columnsDoc.addField("id", generateID());

         for (int j = 0; j < ColumnsArray.size(); j++) {
             JSONObject ColumnsObj = (JSONObject) ColumnsArray.get(j);
             SolrInputDocument columnDoc = new SolrInputDocument();
             columnDoc.addField("id", generateID());
             columnDoc.addField("movieName", ColumnsObj.get("movieName"));
             columnsDoc.addChildDocument(columnDoc);
         }
         domainDoc.addChildDocument(columnsDoc);
         childEventDescEvent.addChildDocument(domainDoc);
     }

     mainEvent.addChildDocument(childEventDescEvent);
     mainEvent.addChildDocument(childUserEvent);
     return mainEvent;
 }
 
 I would be grateful if you could let me know what I am missing.
 
 On Sun, Jul 19, 2015 at 2:16 PM, Erick Erickson erickerick...@gmail.com
 wrote:
 
 First thing is it looks like you're only sending one document at a
 time, perhaps with child objects. This is not optimal at all. I
 usually batch my docs up in groups of 1,000, and there is anecdotal
 evidence that there may (depending on the docs) be some gains above
 that number. Gotta balance the batch size off against how bug the docs
 are of course.
 
 Assuming that you really are calling this method for one doc (and
 

Re: Basic auth

2015-07-19 Thread solr . user . 1507
I followed this guide:
http://learnsubjects.drupalgardens.com/content/how-place-http-authentication-solr

But there is something wrong; can anyone help or refer to a guide on how
to set up HTTP basic auth?

Regards

 On 19 Jul 2015, at 01:10, solr.user.1...@gmail.com wrote:
 
 SOLR-4470 is about:
 Support for basic auth in internal Solr  requests.
 
 What is wrong with the internal requests?
 Can someone help simplify: would it ever be possible to run with basic auth?
 What workarounds are there?
 
 Regards


Basic auth

2015-07-18 Thread solr . user . 1507
SOLR-4470 is about:
Support for basic auth in internal Solr  requests.

What is wrong with the internal requests?
Can someone help simplify: would it ever be possible to run with basic auth?
What workarounds are there?

Regards

Re: Programmatically find out if node is overseer

2015-07-17 Thread solr . user . 1507
Hi Anshum, what do you mean by:
"ideally, there shouldn't be a point where you have multiple active
Overseers in a single cluster"?

How can multiple Overseers happen? And what are the consequences?

Regards

 On 17 Jul 2015, at 19:37, Anshum Gupta ans...@anshumgupta.net wrote:
 
 ideally, there shouldn't be a point where you have multiple active
 Overseers in a single cluster


Re: Setup cloud collection

2015-07-16 Thread solr . user . 1507
Thanks Shawn, but I don't want to build something in front of SolrCloud to help
Solr assign the leader role to distribute the indexing load.

Instead of this manual step (rebalance leaders), maybe one host should not
take the leader role for multiple shards of the same collection if the number of
live nodes equals the number of shards.

But given that you say it will happen over time, maybe I'll continue
indexing and see whether the leaders get rebalanced soon.

Regards

 On 16 Jul 2015, at 14:57, Shawn Heisey apa...@elyograg.org wrote:
 
 On 7/16/2015 5:51 AM, SolrUser2015 wrote:
 Hi, I'm new to solr!
 
 So downloaded version 5.2 and modified the solr file so it allows me to 
 create a 5 node cluster:
 
 5 shards and replication factor 3 
 
 Now I see that one node is marked as leader for 3 shards.
 
 So my question is, how can 1 node serve requests for 3 shards, wouldn't that 
 be uneven distribution of load?  
 
 SolrCloud will distribute individual queries to different replicas, so
 over time the entire cloud will be used.  The leader role shouldn't
 affect queries, that role is mostly there for indexing and fault handling.
 
 If you are really concerned about this, you can assign preferred leaders
 and then ask Solr to reshuffle them.  I have never used this
 functionality.  Here's the documentation on it:
 
 https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-RebalanceLeaders
 
 Thanks,
 Shawn
 


Re: Setup cloud collection

2015-07-16 Thread solr . user . 1507
Thank you, very good explanation.

Regards

 On 16 Jul 2015, at 17:12, Shawn Heisey apa...@elyograg.org wrote:
 
 On 7/16/2015 7:47 AM, solr.user.1...@gmail.com wrote:
 Thanks Shawn, but don't want to build something in front of Solr cloud to 
 help Solr assign leader role to distribute load of indexing.
 
 Instead of doing this manual step (rebalance leaders) maybe one host should 
 not take the leader role of multiple shards for same collection if the 
 number of live nodes are equal to number of shards.
 
 But assuming that when you say it will happen over time, Maybe I'll 
 continue indexing and see that leaders will be rebalanced soon.
 
 Unless you have a fairly major event (like Solr restarting or an
 operation taking longer than zkClientTimeout) your leaders will never
 change.  It's a semi-permanent role.  When a qualifying event happens,
 SolrCloud does an election process to determine the leader, but
 elections do not happen unless you force them with a REBALANCELEADERS
 action or one of several errors occurs.
 
 You don't have to build anything in front of Solr.  You simply have to
 assign a preferred leader for each shard, an action that can be done
 with an HTTP call in a browser.  I don't think we have anything in the
 admin UI to assign preferred leaders ... I will look into it and open an
 issue if necessary.
 
 The thing that I'm saying will happen over time is that all replicas
 will be used for queries.  If you send a thousand queries, you'll find
 that they will be divided fairly evenly among all replicas.  The fact
 that you have one node as leader for three of your shards is not very
 much of a big deal, but if you really want to change it, you can do so
 with the preferred leader feature.
 
 Thanks,
 Shawn
 


Per field mm parameter

2015-05-28 Thread Nutch Solr User
How do I specify a per-field mm parameter in an edismax query?





Re: Sorting on multivalued field in Solr

2015-05-12 Thread Nutch Solr User
Thanks Alex that was really useful.





Re: Solr 4.10.2 Found core but I get No cores available in dashboard page

2014-12-20 Thread solr-user
interesting.  unfortunately, it's time to take a break, so I will have to deal
with this in the new year though.

Merry Christmas and thanks for all the time and effort you guys put in
answering all of our questions.  It is much appreciated.





what does this "write.lock does not exist" mean??

2014-12-19 Thread solr-user
I looked for messages on the following error but don't see anything in Nabble.
Does anyone know what this error means and how to correct it?

SEVERE: java.lang.IllegalArgumentException:
/var/apache/my-solr-slave/solr/coreA/data/index/write.lock does not exist

I also occasionally see error messages about specific index files such as
this:

SEVERE: null:java.lang.IllegalArgumentException:
/var/apache/my_solr-slave/solr/coreA/data/index/_md39_1.del does not exist

I am using Solr 4.0.0, with Java 1.7.0_11-b21 and tomcat 7.0.34, running on
a 12GB CentOS box; we have a master/slave setup with multiple slave searchers
per indexer.

any thoughts on this would be appreciated





Re: Solr 4.10.2 Found core but I get No cores available in dashboard page

2014-12-12 Thread solr-user
I did find out the cause of my problems.  Turns out the problem wasn't due to
the solrconfig.xml file; it was in the schema.xml file.

I spent a fair bit of time making my solrconfig closer to the default
solrconfig.xml in the solr download; when that didn't get rid of the error I
went back to the only other file we had that was different.

Turns out the line that was causing the problem was the middle line in this
location_rpt fieldtype definition:

<fieldType name="location_rpt"
    class="solr.SpatialRecursivePrefixTreeFieldType"
    spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
    geo="true" distErrPct="0.025" maxDistErr="0.09" units="degrees" />

The spatialContextFactory line caused the core not to load, even though no
error/warning messages were shown.

I missed that extra line somehow; mea culpa.

Anyhow, I really appreciate the responses/help I got on this issue.  many
thanks





Re: Solr 4.10.2 Found core but I get No cores available in dashboard page

2014-12-11 Thread solr-user
my apologies for the lack of clarity

our internal name for the project to upgrade solr from 4.0 to 4.10.2 is
"helios", and so we named our test folder "heliosearch".  I was not even
aware of the github project Heliosearch, and nothing we are doing is related
to it.

to simplify things for this post, we reduced the setup to one solr instance
with two cores: coreX contains the collection1 files/folders as per the
downloaded solr 4.10.2 package, while coreA uses the same collection1
files/folders but with schema.xml and solrconfig.xml changes to meet our
needs

so file and foldername-wise, here is what we did:

1. C:\SOLR\solr-4.10.2.zip\solr-4.10.2\example renamed to
C:\SOLR\helios-4.10.2\Master
2. renamed example\solr\collection1 to example\solr\coreX; no files modified
here
3. copied example\solr\coreX to example\solr\coreA
4. modified the coreA schema to match our current production schema; ie our
field names, etc
5. modified the coreA solrconfig.xml to meet our needs (see below)

here are the solrconfig.xml changes we made to coreA

1. <directoryFactory name="DirectoryFactory"
   class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
2. <mergeFactor>4</mergeFactor>
3. <reopenReaders>false</reopenReaders>
4. <infoStream>false</infoStream>
5. commented out the autoCommit section
6. commented out the autoSoftCommit section
7. commented out the <cache name="perSegFilter" ...> section
8. <maxWarmingSearchers>4</maxWarmingSearchers>
9. <requestParsers enableRemoteStreaming="true"
   multipartUploadLimitInKB="2048000" />
10. <requestHandler name="/select" class="solr.SearchHandler"> contains <arr
    name="last-components"><str>geocluster</str></arr>
11. commented out these sections:
    <requestHandler name="/browse" class="solr.SearchHandler">
    <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
    <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
    <searchComponent name="suggest" class="solr.SuggestComponent">
    <searchComponent name="tvComponent" class="solr.TermVectorComponent"/>
    <requestHandler name="/tvrh" class="solr.SearchHandler" startup="lazy">
    <searchComponent name="clustering" ...>
    <requestHandler name="/clustering" ...>
    <searchComponent name="elevator" class="solr.QueryElevationComponent">
    <requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy">
    <queryResponseWriter name="xslt" class="solr.XSLTResponseWriter">

here are the schema.xml changes we made to our copy of the downloaded solr
4.10.2 package (aside from replacing the example fields provided in the
downloaded solr 4.10.2):

1. <schema name="Helios" version="1.5">
2. removed the example fields provided in the downloaded solr 4.10.2
3. deleted various types we don't use in our current schemas
4. added fieldtypes that are in our current solr 4.0 instances
5. added various fieldtypes that are in our current solr 4.0 instances
6. re-added the text field as apparently required: <field name="text"
   type="text_general" indexed="true" stored="false" multiValued="true"/>

also note that we are using java 1.7.0_67 and jetty-8.1.10.v20130312

all in all, I don't see anything that we have done that would keep the cores
from being discovered.

hope that helps.







Re: Solr 4.10.2 Found core but I get No cores available in dashboard page

2014-12-11 Thread solr-user
small correction;  coreX (the one with the unmodified schema.xml and
solrconfig.xml) IS seen by solr and appears on the solr admin page, but
coreA (which has our modified schema and solrconfig) is found by solr but is
not shown in the solr admin page:

1494 [main] INFO  org.apache.solr.core.CoresLocator  - Looking for core
definitions underneath C:\SOLR\helios-4.10.2\Master\solr
1502 [main] INFO  org.apache.solr.core.CoresLocator  - Found core coreA in
C:\SOLR\helios-4.10.2\Master\solr\coreA\
1502 [main] INFO  org.apache.solr.core.CoresLocator  - Found core coreX in
C:\SOLR\helios-4.10.2\Master\solr\coreX\
1503 [main] INFO  org.apache.solr.core.CoresLocator  - Found 2 core
definitions







Re: Solr 4.10.2 Found core but I get No cores available in dashboard page

2014-12-11 Thread solr-user
yes, have triple checked the schema and solrconfig XML; various tools have
indicated the XML is valid

no missing types or dupes, and have not disabled the admin handler

as mentioned in my most recent response, I can see the coreX core (the
renamed and unmodified collection1 core from the downloaded package) and
query it with no issues, but coreA (which has our specific schema and
solrconfig changes) is not showing in the admin interface and cannot be
queried (I get a 404)

both cores are located in the same solr folder.

appreciate the suggestions; looks like I will need to gradually move my
schema and core changes towards the collection1 content and see where things
start working; will take a while...sigh

will let you know what I find out.






Re: Solr 4.10.2 Found core but I get No cores available in dashboard page

2014-12-11 Thread solr-user
Chris, will get the schema and solrconfig ready for uploading.





Solr 4.10.2 Found core but I get No cores available in dashboard page

2014-12-10 Thread solr-user
  org.apache.solr.servlet.SolrDispatchFilter  -
user.dir=C:\SOLR\helios-4.10.2\Instance\Master
1864 [main] INFO  org.apache.solr.servlet.SolrDispatchFilter  -
SolrDispatchFilter.init() done
1885 [main] INFO  org.eclipse.jetty.server.AbstractConnector  - Started
SocketConnector@0.0.0.0:8086
9895 [qtp618640318-19] INFO  org.apache.solr.servlet.SolrDispatchFilter  -
[admin] webapp=null path=/admin/cores
params={indexInfo=false&_=1418236560709&wt=json} status=0 QTime=17

9931 [qtp618640318-19] INFO  org.apache.solr.servlet.SolrDispatchFilter  -
[admin] webapp=null path=/admin/info/system params={_=1418236560885&wt=json}
status=0 QTime=2






Re: Solr 4.10.2 Found core but I get No cores available in dashboard page

2014-12-10 Thread solr-user
definitely puzzling.

am running this on my local box (ie using http://localhost:8086/solr) and it
is the only running instance of any solr.





Re: Solr 4.10.2 Found core but I get No cores available in dashboard page

2014-12-10 Thread solr-user
the log tab shows "No Events available"
no errors at all in the CMD console

my test version hasn't got any logging changes beyond what is already in the
default solr 4.10.2 package

some kind of warning or error message would have been helpful...





confused about how to set a solr query timeout when using tomcat

2014-11-27 Thread solr-user
I inherited a set of some old 1.4x Solrs running under tomcat6/java6

while I will eventually upgrade them to a more recent solr/tomcat/java, I am
unable to do so in the near term

one of my priority fixes though is to implement some sort of timeout for solr
queries that exceed 1000ms (or so); ie if a query takes longer than that,
I want to abort that query (returning nothing or an error or whatever) so
that solr can process other queries.  while we have optimized our queries
for an average 50ms response time, we do occasionally see some that can run
between 10 and 100 seconds.

I know that this version of Solr itself doesn't have a built-in timeout
mechanism, which leaves me with figuring out what to do (it seems to me that
I have to figure out how to get Tomcat to time out the queries somehow)

note that I DID google until my fingers hurt and have not been able to find
clear (at least not clear to me) instructions on how to do so

Details:

1. the setup uses the DataImportHandler to update Solr, and updates occur
often and can be quite large; we use batchSize=1 and autoCommit=true
with doc size being around 1400 to 1600 bytes.  I don't want the timeout to
kill the imports, of course

2. I tried adding a timeout param to the tomcat configuration but it doesn't
work:

<Connector port="8086" protocol="HTTP/1.1" connectionTimeout="2"
    timeout="1" />

any thoughts??   can anyone point me in the right direction on how to
implement this?

any help appreciated.  thx in advance





Re: confused about how to set a solr query timeout when using tomcat

2014-11-27 Thread solr-user
millions of documents per shard, with a number of shards
~40GB index folder size
12GB of heap on a 16GB machine (this old Solr doesn't use O/S mem space like
4.x does)
servers are hosted internally, and are powerful

understood.  as mentioned, we tuned the bulk of our queries to run very
quickly (50ms or less), but we do occasionally see queries (ie internal ones
for statistics/tests) that can be excessively long running

Basically, we want to be able to enforce how long those long-running queries
are allowed to run





RE: confused about how to set a solr query timeout when using tomcat

2014-11-27 Thread solr-user
yes, my understanding and concern as well was that solr continues to run
the query on the solr server even after the connection is broken

I was hoping I had overlooked or missed something in the Solr or Tomcat
documentation that might do the job

it is unfortunate

if anyone else can think of something, let me know






how do I stop queries from being logged in two different log files in Tomcat

2014-11-10 Thread solr-user
hi all.  

We have a number of solr 1.4x and solr 4.x installations running on tomcat

We are trying to standardize the content of our log files so that we can
automate log analysis; we don't want to use log4j at this time.

In our solr 1.4x installations, the following conf\logging.properties file
is correctly logging queries only to our localhost_access_log.xxx.txt files,
and tomcat type messages to our catalina.xxx.log files

However

in our solr 4.x installations, we are seeing solr queries being logged in
both our localhost_access_log.xxx.txt files and our catalina.xxx.log files.
We don't want the solr queries logged in catalina.xxx.log files since that
more than doubles the amount of logging being done and doubles the disk space
requirement (which can be huge).

Is there a way to configure logging, without using log4j (for now), to only
log solr queries to the localhost_access_log.xxx.txt files?

I have looked at various tomcat logging info and dont see how to do it.

Any help appreciated.



# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

handlers = 1catalina.org.apache.juli.FileHandler,
2localhost.org.apache.juli.FileHandler,
3manager.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler

.handlers = 1catalina.org.apache.juli.FileHandler,
java.util.logging.ConsoleHandler


# Handler specific properties.
# Describes specific configuration info for Handlers.


1catalina.org.apache.juli.FileHandler.level = FINE
1catalina.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
1catalina.org.apache.juli.FileHandler.prefix = catalina.

2localhost.org.apache.juli.FileHandler.level = FINE
2localhost.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
2localhost.org.apache.juli.FileHandler.prefix = localhost.

3manager.org.apache.juli.FileHandler.level = FINE
3manager.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
3manager.org.apache.juli.FileHandler.prefix = manager.

java.util.logging.ConsoleHandler.level = WARNING
java.util.logging.ConsoleHandler.formatter =
java.util.logging.SimpleFormatter



# Facility specific properties.
# Provides extra control for each logger.


org.apache.catalina.core.ContainerBase.[Catalina].[localhost].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].handlers =
2localhost.org.apache.juli.FileHandler

org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].level
= INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].handlers
= 3manager.org.apache.juli.FileHandler

# For example, set the org.apache.catalina.util.LifecycleBase logger to log
# each component that extends LifecycleBase changing state:
#org.apache.catalina.util.LifecycleBase.level = FINE
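
For what it's worth, if Solr here is logging through java.util.logging (i.e.
the slf4j-jdk14 binding, since log4j is out of the picture), one extra line in
this file should keep Solr's per-request INFO chatter out of the catalina logs
while leaving the access log alone — a sketch only, not tested:

# keep Solr's INFO-level query logging out of catalina.xxx.log;
# WARNING and above still get through
org.apache.solr.level = WARNING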






Re: how do I stop queries from being logged in two different log files in Tomcat

2014-11-10 Thread solr-user
awesome Mike.  that does exactly what I want.

many thanks





Re: how do I get search for fort st john to match ft saint john

2014-04-03 Thread solr-user
thanks guys.

unfortunately the solr that contains this schema/data is in a legacy system
that requires the fields to not be changed.

we will, hopefully in the near future, be able to look at redesigning the
schema.

alternatively, I could look at boning up on Java (which I haven't used in a
long time) and see if I can write a subword synonym plugin of some sort to
perform this type of synonyming

thanks anyhow.





Re: how do I get search for fort st john to match ft saint john

2014-04-02 Thread solr-user
Hi Eric.

No, that doesn't fix the problem either (I have tested this previously and
did so again just now).

Since the PatternTokenizerFactory is not tokenizing on whitespace (by design,
since I want the user to search by phrase), the phrase "marina former fort
ord" (for example) does not get turned into four tokens ("marina", "former",
"fort" and "ord"), and so the SynonymFilterFactory does not create synonyms
for them (by design).

the original question remains: is there a tokenizer/plugin that will allow
me to synonym words in an unbroken phrase?

note: the reason I don't want to tokenize the data by whitespace is that it
would cause way too many results to get returned if I, for example, search on
"new" or "st" ...  However, I still want to be able to include "fort saint
john" in the results if the user searches for "ft st john" or "fort st john"
or ...





Re: how do I get search for fort st john to match ft saint john

2014-04-01 Thread solr-user
Hi Eric.

Sorry, been away.  

The city_index_synonyms.txt file is pretty small as it contains just these
two lines:

saint,st,ste
fort,ft

There is nothing at all in the city_query_synonyms.txt file, and it isn't
used either.

My understanding is that solr would create the appropriate synonym entries
in the index and so treat "fort" and "ft" as equal.

if you have a simple one-line schema (that uses the type definition from my
original email) and index "fort saint john", does it work for you?  i.e.
does it return results if you search for "ft st john" and "ft saint john"
and "fort st john"?

My Solr 4.6.1 instance doesn't.  I am wondering if synonyms just don't work
for all/some words in a phrase




Re: how do I get search for fort st john to match ft saint john

2014-03-28 Thread solr-user
yes, and I can see that (as expected) per the field type:

1. the indexed value is lowercased
2. stripped of non-alpha characters
3. multiple consecutive whitespace is removed
4. trimmed
5. goes thru the SynonymFilterFactory, where:

a. the indexed value of "Marina/Former Fort Ord" is "marina former fort ord"
b. the search value of "Marina/Former Ft Ord" is "marina former ft ord"

This I already knew.  My question wasn't "why they don't match"; it is: how
do I get a search for "fort st john" to match "ft saint john"?  i.e. is
there a way to index/search that would allow the search to match?

the SynonymFilterFactory during indexing does not create a matching term for
"marina former ft ord", which I think it would do if the indexed value were a
word instead of a phrase (i.e. "fort" vs "Marina/Former Fort Ord")

(note that my terms/understanding of how this works may be incorrect, hence
my request for assistance/understanding)
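
For what it's worth, the token-stream view makes the mismatch explicit — a
hand-worked sketch of the index-time analysis (values illustrative, as the
admin Analysis page would show them):

input:                Fort Saint John
after tokenizer:      [Fort Saint John]    <- one token; the pattern only splits on ^ - , |
after lowercase etc:  [fort saint john]
after synonym filter: [fort saint john]    <- no rule matches the whole token, nothing expands

With whitespace tokenization the stream would instead be [fort] [saint] [john],
and the rules fort,ft and saint,st,ste would expand each token individually.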







how do I get search for fort st john to match ft saint john

2014-03-26 Thread solr-user
I have been using solr for a while but started running across situations
where synonyms are required.

the example I have is a group of city names that look like "Fort Saint John"
(a city), in a text field.  Users may want to search for "Ft St John" or
"Fort St John" or "Ft Saint John", however.

My attempted solution was to create a type that uses SynonymFilterFactory
and a text file of city-based synonyms like this:

   saint,st,ste
   fort,ft

this doesn't work, however, and I am not sure I understand why.

any help appreciated.  thx

p.s. I am using Solr 4.6.1 and here is the field type definition from the
schema.xml:

<fieldtype name="geo_search_area_text" class="solr.TextField"
    positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[\^\-,|]"
        group="-1" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[^\w\s]"
        replacement=" " replace="all" />
    <filter class="solr.PatternReplaceFilterFactory" pattern="[\s]{2,}"
        replacement=" " replace="all" />
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.SynonymFilterFactory"
        synonyms="city_index_synonyms.txt" ignoreCase="true" expand="true" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[\^\-,|]"
        group="-1" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[^\w\s]"
        replacement=" " replace="all" />
    <filter class="solr.PatternReplaceFilterFactory" pattern="[\s]{2,}"
        replacement=" " replace="all" />
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldtype>





does shards.tolerant deal with this scenario?

2014-03-18 Thread solr-user
hi all

I have some questions re shards.tolerant=true and timeAllowed=xxx

I have seen situations where shards.tolerant=true works; if one of the
shards specified in a query is dead, shards.tolerant seems to work and I get
results from the non-dead shards

However, if one of the shards goes down during the execution of a query, I
have to wait for the primary searcher (the solr sending the request to the
shards) to time out, which can last minutes.  ie shards.tolerant doesn't seem
to work

question 1: is timeAllowed shard-aware?  ie in a sharded query, does this
param get used by all the shards specified or does it only get used by the
primary searcher?

question 2: Since shards.tolerant=true is not helping when a shard goes down
during query execution, is there any other way to deal with this?  If
timeAllowed is shard-aware, I would think that I could use timeAllowed and the
primary searcher would then wait xxx milliseconds and return with whatever
the other shards had sent back.  Is that correct?
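
For reference, an illustrative combined request (hosts, port and core name are
made up; both shards.tolerant and timeAllowed are plain request parameters):

http://solr1:8983/solr/core0/select?q=*:*&shards=solr1:8983/solr/core0,solr2:8983/solr/core0&shards.tolerant=true&timeAllowed=1000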

thanks in advance






Are there any Java versions we should avoid with Solr

2014-03-04 Thread solr-user
we are currently using Oracle Java 1.7.0_11 23.6-b04 JDK with our Solr 4.6.1
setup

I was looking at upgrading to a more recent version but am wondering, are
there any versions to avoid?

reason I ask is that I see some versions that have GC issues but am not sure
how/if Solr is affected by them.

7u40 has a bug with "New minimum young generation size is not properly checked
by the JVM", and with "Irregular crash or corrupt term vectors in the Lucene
libraries"

7u51 has a bug with "Memory leak when GCNotifier uses
create_from_platform_dependent_str()"






is it possible to consolidate filterquery cache strings

2014-03-03 Thread solr-user
let's say I have a largish set of data (120M docs) and that I am partitioning
my data by groups of states (using the state codes)

Someone suggested that I could use the following format in my solrconfig.xml
when defining the filter-query warming queries:

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="fq">State:AL</str>
      <str name="fq">State:AK</str>
      ...
      <str name="fq">State:WY</str>
    </lst>
  </arr>
</listener>

Would that work, and if so how would I know that the cache is being hit?

Or do I need to use the following traditional syntax instead:

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="fq">State:AL</str>
    </lst>
    <lst>
      <str name="q">*:*</str>
      <str name="fq">State:AK</str>
    </lst>
    ...
    <lst>
      <str name="q">*:*</str>
      <str name="fq">State:WY</str>
    </lst>
  </arr>
</listener>
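
On the "how would I know that the cache is being hit" part: the filterCache
statistics can be compared before and after warming — an illustrative request
(host and core made up; the same numbers appear on the admin UI's
Plugins / Stats page):

http://localhost:8983/solr/core0/admin/mbeans?stats=true&cat=CACHE&wt=json

If the warming works, filterCache inserts go up when the newSearcher listener
fires, and cumulative_hits climbs as real queries reuse the cached entries.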

any help appreciated





Re: is it possible to consolidate filterquery cache strings

2014-03-03 Thread solr-user
note: by partitioning I mean that I have sharded the 120M docs into 9 Solr
partitions (each on a separate server)






Re: is it possible to consolidate filterquery cache strings

2014-03-03 Thread solr-user
would not breaking the FQs out by state be faster for warming up the fq
caches?






Re: Solr3.4 on tomcat 7.0.23 - hung with error threw exception java.lang.IllegalStateException: Cannot call sendError() after the response has been committed

2013-12-18 Thread solr-user
were you able to resolve this issue, and if so how??

I am encountering the same issue in a couple of solr versions (including 4.0
and 4.5)





what is difference between 4.1 and 5.x

2013-01-09 Thread solr-user
just curious as to what the difference is between 4.1 and 5.0

i.e. is 4.1 a maintenance branch for what is currently 4.0 or are they very
different designs/architectures





spatial searches and geo-json data

2012-12-11 Thread solr-user
hi all.  I have a large amount of spatial data in GeoJSON format that I get
from MS SQL Server.

I want to be able to index that data and am trying to figure out how to
convert the data into WKT format, since solr only accepts WKT.

is anyone aware of any solr module or T-SQL code or C# code that would help me
with the conversion?





Re: is there a way to prevent abusing rows parameter

2012-11-22 Thread solr-user
Thanks guys.  This is a problem with the front end not validating requests.
I was hoping there might be a simple config value I could enter/change,
rather than going through the long process of migrating a proper fix all the
way up to our production servers.  Looks like not, but thx.





upgrading from 4.0 to 4.1 causes CorruptIndexException: checksum mismatch in segments file

2012-11-22 Thread solr-user
hi all

I have been working on moving us from 4.0 to a newer build of 4.1

I am seeing a "CorruptIndexException: checksum mismatch in segments file"
error when I try to use the existing index files.

I did see something in the build log for #119 re LUCENE-4446 that mentions
"flip file formats to point to 4.1 format".

Do I just need to reindex or is this some other issue (ie do I need to
configure something differently)?

or should I move back a few builds?

note, we are currently using:

solr-spec 4.0.0.2012.04.05.15.05.52
solr-impl 4.0-SNAPSHOT 1310094M - - 2012-04-05 15:05:52
lucene-spec 4.0-SNAPSHOT
lucene-impl 4.0-SNAPSHOT 1309921 - - 2012-04-05 10:25:27

and are considering moving to:

solr-spec 4.1.0.2012.11.03.18.08.42
solr-impl 4.1-2012-11-03_18-05-49 1405392 - hudson - 2012-11-03 18:08:42
lucene-spec 4.1-2012-11-03_18-05-49
lucene-impl 4.1-2012-11-03_18-05-49 1405392 - hudson - 2012-11-03 18:06:50
(aka apache-solr-4.1-2012-11-03_18-05-49)







is there a way to prevent abusing rows parameter

2012-11-20 Thread solr-user
silly question

is there any configuration value I can set to prevent someone from entering
a bad value for the rows parameter?

ie to prevent something like rows=1  from crashing my servers?

the server I am looking at is a solr v3.6
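
For reference, one blunt config-only option in this era of Solr is pinning
rows through handler invariants in solrconfig.xml — a sketch only (handler
names vary by config, the cap of 100 is illustrative, and it makes paging
beyond the cap impossible):

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="invariants">
    <!-- invariants override whatever the client sends for this parameter -->
    <int name="rows">100</int>
  </lst>
</requestHandler>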





Intersects spatial query returns polygons it shouldn't

2012-09-18 Thread solr-user
(The message body was truncated in the archive; only the tail of a very large
WKT POLYGON coordinate list survives, ending with
"... -93.2659274057232 45.18986823069059, -93.26592485308495 45.18931973506328))'".)






Re: question(s) re lucene spatial toolkit aka LSP aka spatial4j

2012-08-09 Thread solr-user
Thanks David.  No worries about the delay; am always happy and appreciative
when someone responds.

I don't understand what you mean by "All center points get cached into
memory upon first use in a score" in question 2, about the Java OOM errors I
am seeing.

The Solr instance I have set up for testing has around 200k docs, with one
WKT field per doc (indexed and stored and set to multivalued).

I did a count of the number of points that get indexed in Solr (computed in
MS SQL by counting the number of points (using STNumPoints) for each
geometry (using STNumGeometries) in the WKT data I am indexing), and I have
around 35M points total.

If only the center points for 190K docs get cached, wouldn't that easily fit
in 7GB of heap? 

Even if Solr was caching 35M points, that still doesn't sound like 7GB worth
of data.





question(s) re lucene spatial toolkit aka LSP aka spatial4j

2012-07-27 Thread solr-user
hopefully someone is using the lucene spatial toolkit aka LSP aka spatial4j,
and can answer this question

we are using this spatial tool for doing searches.  overall, it seems to
work very well.  however, finding documentation is difficult.

I have a couple of questions:

1. I have a geohash field in my solr schema that contains indexed geographic
polygon data.  I want to find all docs where that polygon intersects a given
lat/long.  I was experimenting with returning distance in the resultset and
with sorting by distance, and found that the following query works.  However,
I don't know what "distance" means in the query.  i.e. is it the distance from
the point to the polygon centroid, or to the closest outer edge of the polygon,
or is it a useless random value, etc.? Does anyone know?

http://solrserver:solrport/solr/core0/select?q=*:*&fq={!v=$geoq%20cache=false}&geoq=wkt_search:%22Intersects(Circle(-97.057%2047.924%20d=0.01))%22&sort=query($geoq)+asc&fl=catchment_wkt1_trimmed,school_name,latitude,longitude,dist:query($geoq,-1),loc_city,loc_state

2. some of the polygons, being geographic representations, are very big (ie
state/province polygons).  when solr starts processing a spatial query (like
the one above), I can see (INFO: Building Cache [xx]) it fills in some
sort of memory cache
(org.apache.lucene.spatial.strategy.util.ShapeFieldCache) of the indexed
polygon data.  We are encountering Java OOM issues when this occurs (even
when we boosted the mem to 7GB). I know that some of the polygons can have
more than 2300 points, but heavy trimming isn't really an option due to
level of detail issues. Can we control this caching, or the indexing of the
polygons, in any way to reduce the memory requirements??





Re: Using Customized sorting in Solr

2012-04-30 Thread solr user
Hi,

Any suggestions?

Am I trying to do too much with solr? Is there any other search engine
which should be used here?

I am looking into the solr codebase and planning to modify QueryComponent. Will
this be the right approach?

Regards,

Shivam

On Fri, Apr 27, 2012 at 10:48 AM, solr user solr.user...@gmail.com wrote:

 Jan,

 Thanks for the response,

 I thought of using it, but it will be suboptimal to do this in the scenario
 I have. I guess I have to explain the scenario better, let me try it again:

 1. I have importance-based buckets in the system; this is implemented
 using a variable named bucket_count having integer values 0,1,2,3, and I
 have to show results in order of bucket_count, i.e. results from the 0th bucket
 at top, then results from the 1st bucket and so on. That is done by doing an asc
 sort on this variable.
 2. Now *within these buckets* I need to ensure that the 1st listing of every
 advertiser comes at top, then the 2nd listing from every advertiser and so on.

 Now if I go with the grouping on advertiserId and use the
 group.offset, then probably I also need to do additive filtering on
 bucket_count. To explain it better, the pseudo algorithm will be like

 1. query solr with group.offset 0 and bucket count 0
 2. if results more than zero in step1 then increase group offset and
 follow step 1 again
 3. else increase bucket count with group offset zero and start from step 1.

 With this logic, in the worst case I need to query solr (number of
 importance buckets)*(max number of listings by an advertiser) times, which
 could be a very high number of solr queries for a single user query. Please
 suggest if I can do this in a more optimal way. I am also open to doing
 modifications in solr/lucene code if needed.

 Regards,
 BC Rathore



 On Fri, Apr 27, 2012 at 4:09 AM, Jan Høydahl jan@cominvent.comwrote:

 Hi,

 How about trying grouping with paging?
 First you do
 group=true&group.field=advertiserId&group.limit=1&group.offset=0&group.main=true&sort=something&group.sort=how-much-paid desc

 That gives you one listing per advertiser, sorted the way you like.
 Then to grab the next batch of ads, you go group.offset=1 etc etc.

 --
 Jan Høydahl, search solution architect
 Cominvent AS - www.cominvent.com
 Solr Training - www.solrtraining.com

 On 26. apr. 2012, at 08:10, solr user wrote:

  Hi,
 
  We are planning to move the search of one of our listing based portal to
  solr/lucene search server from sphinx search server. But we are facing a
  challenge is porting customized sorting being used in our portal. We
 only
  have last 60 days of data live.The algorithm is as follows:-
 
1.  Put all listings into 54 buckets – (Date bucket for 60 days)  i.e.
buckets of 7day, 1 day, 1 day……
2.  For each date bucket we make 2 buckets –(Paid / free bucket)
3.  For each paid / free bucket cycle the advertisers on uniqueness
 basis
 
   i.e. inside a bucket the ordering should be 1st listing
  of each advertiser, 2nd listing of each advertiser and so on
   in other words within a *sub-bucket* second listing of
 an
  advertiser will be displayed only after first listing of all advertiser
 has
  been displayed.
 
  For taking care of point 1 and 2 we have created a field named
 bucket_index
  at the time of indexing the data and get the results sorted by this
 index,
  but we are not able to find a way to create a sort field at index time
 or
  think of a sort function for the point no 3.  Please suggest if there
 is a
  way to do so in solr.
 
  Tia,
 
  BC Rathore





Using Customized sorting in Solr

2012-04-26 Thread solr user
Hi,

We are planning to move the search of one of our listing-based portals to
the solr/lucene search server from the sphinx search server. But we are facing
a challenge in porting the customized sorting being used in our portal. We only
have the last 60 days of data live. The algorithm is as follows:

   1.  Put all listings into 54 buckets – (Date bucket for 60 days)  i.e.
   buckets of 7day, 1 day, 1 day……
   2.  For each date bucket we make 2 buckets –(Paid / free bucket)
   3.  For each paid / free bucket cycle the advertisers on uniqueness basis

  i.e. inside a bucket the ordering should be 1st listing
of each advertiser, 2nd listing of each advertiser and so on
  in other words within a *sub-bucket* second listing of an
advertiser will be displayed only after first listing of all advertiser has
been displayed.

For taking care of point 1 and 2 we have created a field named bucket_index
at the time of indexing the data and get the results sorted by this index,
but we are not able to find a way to create a sort field at index time or
think of a sort function for the point no 3.  Please suggest if there is a
way to do so in solr.

Tia,

BC Rathore


Re: Limiting term frequency in a document to a specific term

2012-01-24 Thread solr user
With the Solr search relevancy functions, I get a ParseException: unknown
function ttf in FunctionQuery.

http://localhost:8983/solr/select/?fl=score,documentPageId&defType=func&q=ttf(contents,amplifiers)

where contents is a field name, and amplifiers is a term in that field.

Just curious why I get a parse exception for the above syntax.




On Monday, January 23, 2012, Ahmet Arslan iori...@yahoo.com wrote:
 Below is an example query to search for the term frequency
 in a document,
 but it is returning the frequency for all the terms.

 http://localhost:8983/solr/select/?fl=documentPageId&q=documentPageId:49667.3&qt=tvrh&tv.tf=true&tv.fl=contents

 I would like to be able to limit the query to just one term
 that I know
 occurs in the document.

 I don't fully follow but http://wiki.apache.org/solr/FunctionQuery#tf may
be what you want?



Re: Getting a word count frequency out of a page field

2012-01-23 Thread solr user
Thanks for the article.

I am indexing each page of a document as if it were a document.

I think the answer is to configure SOLR for use of the TermVector Component:
 http://wiki.apache.org/solr/TermVectorComponent

I have not tried it yet, but someone told me on the StackExchange forum to
try this one.
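
From what I can tell, the configuration would be roughly like this in
solrconfig.xml (untested on my side; the component and handler names are
just what the examples use):

<searchComponent name="tvComponent" class="solr.TermVectorComponent"/>

<requestHandler name="tvrh" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="tv">true</bool>
  </lst>
  <arr name="last-components">
    <str>tvComponent</str>
  </arr>
</requestHandler>

which would then be queried with qt=tvrh, as in the URLs earlier in this
thread.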

-Melanie

On Sun, Jan 22, 2012 at 8:56 PM, Erick Erickson erickerick...@gmail.com wrote:

 Here's Hoss' XY problem writeup:
 http://people.apache.org/~hossman/#xyproblem
 but this doesn't appear to be that.

 There's no way out of the box that I know of to do what you want. It starts
 with the fact that Solr has no clue what a "page" is in the first place. Or
 a paragraph. Or a sentence. So you're really on your own here...
 Solr only knows about *documents*. If each document is a page,
 you can do some stuff with term frequencies etc. But for a larger
 document you'll be getting into some pretty low-level analysis
 of the data to accomplish this.

 Sorry I can't be more help.
 Erick

 On Sun, Jan 22, 2012 at 5:35 PM, solr user mvidaat...@gmail.com wrote:
  See comments inline below.
 
  On Sun, Jan 22, 2012 at 8:27 PM, Erick Erickson erickerick...@gmail.com
 
  wrote:
 
  Faceting won't work at all. Its function is to return the count
  of the *documents* that a value occurs in, so that's no good
  for your use case.
 
  I don't know how to issue a proper SOLR query that returns a word count
  for a paragraph of text, such as the term "amplifier" in a field. For
  some reason it only returns a count of 1.
 
  This is really unclear. Are you asking for the word counts of a
  paragraph that contains "amplifier"? The number of times "amplifier"
  appears in a paragraph? In a document?
 
 
  I'm looking for the number of times the word or term appears in a
  paragraph that I'm indexing as the field named "contents". I'm storing
  and indexing the field "contents", which contains multiple occurrences
  of the term/word. However, when I query for that term it only reports
  that the word/term appeared once in the field "contents".
 
 
 
  And why do you want this information anyway? It might be an XY problem.
 
 
  I want to be able to search for word frequency for a page in a document
  that has many pages, so I can report to the user that the term/word
  occurred on page 1 ten times. The user can click on the result and go
  right to the page where the word/term appeared most frequently.
 
  What do you mean by an XY problem?
 
 
 
 
  Best
  Erick
 
  On Fri, Jan 20, 2012 at 1:06 PM, solr user mvidaat...@gmail.com
 wrote:
   SOLR reports the term occurrence for terms over all the documents. I
 am
   having trouble making a query that returns the term occurrence in a
   specific page field called, documentPageId.
  
   I don't know how to issue a proper SOLR query that returns a word
   count for a paragraph of text, such as the term "amplifier" in a
   field. For some reason it only returns a count of 1.
  
   The things I've tried only return a count for 1 occurrence of the
   term even though I see the term in the paragraph more than just once.
  
   I've tried faceting on the field, contents
  
  
  
  http://localhost:8983/solr/select?indent=on&q=*:*&wt=standard&facet=on&facet.field=documentPageId&facet.query=amplifier&facet.sort=lex&facet.missing=on&facet.method=count
  
    <lst name="facet_counts">
     <lst name="facet_queries">
      <int name="amplifier">21</int>
     </lst>
     <lst name="facet_fields">
      <lst name="documentPageId">
       <int name="49667.1">1</int>
       <int name="49667.10">1</int>
       <int name="49667.11">1</int>
       <int name="49667.12">1</int>
       <int name="49667.13">1</int>
       <int name="49667.14">1</int>
       <int name="49667.15">1</int>
       <int name="49667.16">1</int>
       <int name="49667.17">1</int>
       <int name="49667.18">1</int>
       <int name="49667.19">1</int>
       <int name="49667.2">1</int>
       <int name="49667.20">1</int>
       <int name="49667.21">1</int>
       <int name="49667.3">1</int>
       <int name="49667.4">1</int>
       <int name="49667.5">1</int>
       <int name="49667.6">1</int>
       <int name="49667.7">1</int>
       <int name="49667.8">1</int>
       <int name="49667.9">1</int>
       <int name="49670.1">1</int>
       <int name="49670.2">1</int>
       <int name="49670.3">1</int>
       <int name="49670.4">1</int>
       <int name="49677.1">1</int>
       <int name="49677.2">1</int>
       <int name="49677.3">1</int>
       <int>0</int>
      </lst>
     </lst>
     <lst name="facet_dates"/>
     <lst name="facet_ranges"/>
    </lst>
    </response>
  
  
   In schema.xml:
    <field name="contents" type="bucketFirstLetter" stored="true" indexed="true" />
    <field name="documentPageId" type="string" indexed="true" stored="true" multiValued="false"/>
   
    In solrconfig.xml:
   
    <str name="facet.field">filewrapper</str>
    <str name="facet.field">caseNumber</str>
    <str name="facet.field">pageNumber</str>
    <str name="facet.field">documentId</str>
    <str name="facet.field">contents</str>
    <str name="facet.query">documentId</str>
    <str name="facet.query">caseNumber</str>
    <str name="facet.query">pageNumber</str>
    <str name="facet.field">documentPageId</str>
    <str name="facet.query">contents</str>
  
   Thanks in advance,
 
 



Limiting term frequency in a document to a specific term

2012-01-23 Thread solr user


What is the proper query URL to limit the term frequency to just one term
in a document?

Below is an example query to search for the term frequency in a document,
but it is returning the frequency for all the terms.

http://localhost:8983/solr/select/?fl=documentPageId&q=documentPageId:49667.3&qt=tvrh&tv.tf=true&tv.fl=contents

I would like to be able to limit the query to just one term that I know
occurs in the document. The documentation for Term Frequency said to
specify the following:

   f.fieldName.tv.tf - Turns on Term Frequency for the fieldName specified.

This is in the wiki documentation:
http://wiki.apache.org/solr/TermVectorComponent

I tried various combinations of the above for the term "amplifier" in the
URL but I could not get it to work. I would appreciate the appropriate
syntax for a specific term such as "amplifier".
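
Going by that doc line, I would have expected something like the following
to work (just my guess at the syntax, which is what I'm asking about):

http://localhost:8983/solr/select/?q=documentPageId:49667.3&qt=tvrh&tv.fl=contents&f.contents.tv.tf=true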


Re: Getting a word count frequency out of a page field

2012-01-22 Thread solr user
See comments inline below.

On Sun, Jan 22, 2012 at 8:27 PM, Erick Erickson erickerick...@gmail.com wrote:

 Faceting won't work at all. Its function is to return the count
 of the *documents* that a value occurs in, so that's no good
 for your use case.

 I don't know how to issue a proper SOLR query that returns a word count
 for a paragraph of text, such as the term "amplifier" in a field. For
 some reason it only returns a count of 1.

 This is really unclear. Are you asking for the word counts of a paragraph
 that contains "amplifier"? The number of times "amplifier" appears in
 a paragraph? In a document?


I'm looking for the number of times the word or term appears in a paragraph
that I'm indexing as the field named "contents". I'm storing and indexing
the field "contents", which contains multiple occurrences of the
term/word. However, when I query for that term it only reports that the
word/term appeared once in the field "contents".



 And why do you want this information anyway? It might be an XY problem.


I want to be able to search for word frequency for a page in a document
that has many pages, so I can report to the user that the term/word
occurred on page 1 ten times. The user can click on the result and go
right to the page where the word/term appeared most frequently.

What do you mean by an XY problem?




 Best
 Erick

 On Fri, Jan 20, 2012 at 1:06 PM, solr user mvidaat...@gmail.com wrote:
  SOLR reports the term occurrence for terms over all the documents. I am
  having trouble making a query that returns the term occurrence in a
  specific page field called, documentPageId.
 
  I don't know how to issue a proper SOLR query that returns a word count
 for
  a paragraph of text such as the term amplifier for a field. For some
  reason it only returns.
 
  The things I've tried only return a count for 1 occurrence of the term
 even
  though I see the term in the paragraph more than just once.
 
  I've tried faceting on the field, contents
 
 
  http://localhost:8983/solr/select?indent=on&q=*:*&wt=standard&facet=on&facet.field=documentPageId&facet.query=amplifier&facet.sort=lex&facet.missing=on&facet.method=count
 
  <lst name="facet_counts">
   <lst name="facet_queries">
    <int name="amplifier">21</int>
   </lst>
   <lst name="facet_fields">
    <lst name="documentPageId">
     <int name="49667.1">1</int>
     <int name="49667.10">1</int>
     <int name="49667.11">1</int>
     <int name="49667.12">1</int>
     <int name="49667.13">1</int>
     <int name="49667.14">1</int>
     <int name="49667.15">1</int>
     <int name="49667.16">1</int>
     <int name="49667.17">1</int>
     <int name="49667.18">1</int>
     <int name="49667.19">1</int>
     <int name="49667.2">1</int>
     <int name="49667.20">1</int>
     <int name="49667.21">1</int>
     <int name="49667.3">1</int>
     <int name="49667.4">1</int>
     <int name="49667.5">1</int>
     <int name="49667.6">1</int>
     <int name="49667.7">1</int>
     <int name="49667.8">1</int>
     <int name="49667.9">1</int>
     <int name="49670.1">1</int>
     <int name="49670.2">1</int>
     <int name="49670.3">1</int>
     <int name="49670.4">1</int>
     <int name="49677.1">1</int>
     <int name="49677.2">1</int>
     <int name="49677.3">1</int>
     <int>0</int>
    </lst>
   </lst>
   <lst name="facet_dates"/>
   <lst name="facet_ranges"/>
  </lst>
  </response>
 
 
  In schema.xml:
   <field name="contents" type="bucketFirstLetter" stored="true" indexed="true" />
   <field name="documentPageId" type="string" indexed="true" stored="true" multiValued="false"/>
  
  In solrconfig.xml:
 
   <str name="facet.field">filewrapper</str>
   <str name="facet.field">caseNumber</str>
   <str name="facet.field">pageNumber</str>
   <str name="facet.field">documentId</str>
   <str name="facet.field">contents</str>
   <str name="facet.query">documentId</str>
   <str name="facet.query">caseNumber</str>
   <str name="facet.query">pageNumber</str>
   <str name="facet.field">documentPageId</str>
   <str name="facet.query">contents</str>
 
  Thanks in advance,



Getting a word count frequency out of a page field

2012-01-20 Thread solr user
SOLR reports the term occurrence for terms over all the documents. I am
having trouble making a query that returns the term occurrence in a
specific page field called, documentPageId.

I don't know how to issue a proper SOLR query that returns a word count for
a paragraph of text, such as the term "amplifier" in a field. For some
reason it only returns a count of 1.

The things I've tried only return a count for 1 occurrence of the term even
though I see the term in the paragraph more than just once.

I've tried faceting on the field, contents

http://localhost:8983/solr/select?indent=on&q=*:*&wt=standard&facet=on&facet.field=documentPageId&facet.query=amplifier&facet.sort=lex&facet.missing=on&facet.method=count

<lst name="facet_counts">
 <lst name="facet_queries">
  <int name="amplifier">21</int>
 </lst>
 <lst name="facet_fields">
  <lst name="documentPageId">
   <int name="49667.1">1</int>
   <int name="49667.10">1</int>
   <int name="49667.11">1</int>
   <int name="49667.12">1</int>
   <int name="49667.13">1</int>
   <int name="49667.14">1</int>
   <int name="49667.15">1</int>
   <int name="49667.16">1</int>
   <int name="49667.17">1</int>
   <int name="49667.18">1</int>
   <int name="49667.19">1</int>
   <int name="49667.2">1</int>
   <int name="49667.20">1</int>
   <int name="49667.21">1</int>
   <int name="49667.3">1</int>
   <int name="49667.4">1</int>
   <int name="49667.5">1</int>
   <int name="49667.6">1</int>
   <int name="49667.7">1</int>
   <int name="49667.8">1</int>
   <int name="49667.9">1</int>
   <int name="49670.1">1</int>
   <int name="49670.2">1</int>
   <int name="49670.3">1</int>
   <int name="49670.4">1</int>
   <int name="49677.1">1</int>
   <int name="49677.2">1</int>
   <int name="49677.3">1</int>
   <int>0</int>
  </lst>
 </lst>
 <lst name="facet_dates"/>
 <lst name="facet_ranges"/>
</lst>
</response>


In schema.xml:
 <field name="contents" type="bucketFirstLetter" stored="true" indexed="true" />
 <field name="documentPageId" type="string" indexed="true" stored="true" multiValued="false"/>

In solrconfig.xml:

   <str name="facet.field">filewrapper</str>
   <str name="facet.field">caseNumber</str>
   <str name="facet.field">pageNumber</str>
   <str name="facet.field">documentId</str>
   <str name="facet.field">contents</str>
   <str name="facet.query">documentId</str>
   <str name="facet.query">caseNumber</str>
   <str name="facet.query">pageNumber</str>
   <str name="facet.field">documentPageId</str>
   <str name="facet.query">contents</str>

Thanks in advance,


Re: Terms Component - solr-1.4.0

2011-05-26 Thread Solr User
Hi All,

Please help me in implementing TermsComponent in my current Solr solution.

Regards,
Solr User

On Tue, May 17, 2011 at 4:12 PM, Solr User solr...@gmail.com wrote:

 Hi All,

 I am using Solr 1.4.0 and dismax as the request handler. I have the
 following in my solrconfig.xml in the dismax request handler tag:

 <arr name="last-components">
   <str>spellcheck</str>
 </arr>

 The above tags help to find terms if there are spelling issues. I tried
 configuring the terms component, with no luck.

 May I know how to configure the terms component with dismax? Or do I need
 to call the terms component directly to get auto suggestions?

 Thank you so much in advance.

 Regards,
 Solr User



Terms Component - solr-1.4.0

2011-05-17 Thread Solr User
Hi All,

I am using Solr 1.4.0 and dismax as the request handler. I have the
following in my solrconfig.xml in the dismax request handler tag:

<arr name="last-components">
  <str>spellcheck</str>
</arr>

The above tags help to find terms if there are spelling issues. I tried
configuring the terms component, with no luck.

May I know how to configure the terms component with dismax? Or do I need
to call the terms component directly to get auto suggestions?
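
What I tried was roughly the following, adapted from the wiki examples (so
this may well be wrong, which is what I'm asking about; the field name is
just an example):

<searchComponent name="terms" class="solr.TermsComponent"/>

<requestHandler name="/terms" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="terms">true</bool>
  </lst>
  <arr name="components">
    <str>terms</str>
  </arr>
</requestHandler>

queried with something like /terms?terms.fl=searchFields&terms.prefix=aut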

Thank you so much in advance.

Regards,
Solr User


Out of memory while creating indexes

2011-03-03 Thread Solr User
Hi All,

I am trying to create indexes out of a 400MB XML file using the following
command, and I am running into an out of memory exception:

$JAVA_HOME/bin/java -Xms768m -Xmx1024m -Durl=http://$SOLR_HOST:$SOLR_PORT/solr/customercarecore/update -jar $SOLRBASEDIR/dataconvertor/common/lib/post.jar $SOLRBASEDIR/dataconvertor/customercare/xml/CustomerData.xml

I am planning to bump up the memory and try again.
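
For example, the same command with a bigger heap (just what I intend to
try, not a verified fix):

$JAVA_HOME/bin/java -Xms768m -Xmx2048m -Durl=http://$SOLR_HOST:$SOLR_PORT/solr/customercarecore/update -jar $SOLRBASEDIR/dataconvertor/common/lib/post.jar $SOLRBASEDIR/dataconvertor/customercare/xml/CustomerData.xml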

Did anyone run into a similar issue? Any inputs would be very helpful to
resolve the out of memory exception.

I was able to create indexes with a small file but not with the large file.
I am not using SolrJ.

Thanks,
Solr User


Re: what would cause large numbers of executeWithRetry INFO messages?

2011-01-18 Thread solr-user

sorry, never did find a solution to that.

if you do happen to figure it out, pls post a reply to this thread.  thanks
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/what-would-cause-large-numbers-of-executeWithRetry-INFO-messages-tp1453417p2281087.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to get all the search results?

2010-12-13 Thread Solr User
Hi,

I tried *:* using dismax and I get no results.

Is there a way that I can get all the search results using dismax?

Thanks,
Murali

On Mon, Dec 6, 2010 at 11:17 AM, Savvas-Andreas Moysidis 
savvas.andreas.moysi...@googlemail.com wrote:

 Hello,

 shouldn't that query syntax be *:* ?

 Regards,
 -- Savvas.

 On 6 December 2010 16:10, Solr User solr...@gmail.com wrote:

   Hi,
  
   First off, thanks to the group for guiding me to move from the default
   search handler to dismax.
  
   I have a question related to getting all the search results. In the
   past, with the default search handler, I was getting all the search
   results (8000) if I passed q=* as the search string, but with dismax I
   was getting only 16 results instead of 8000.
  
   How do I get all the search results using dismax? Do I need to
   configure anything to make * (asterisk) work?
 
  Thanks,
  Solr User
 



Re: How to get all the search results?

2010-12-13 Thread Solr User
Hi Shawn,

Yes you did.

I tried it and it did not work, so I asked the same question again.

Now I understood and tried directly on the Solr admin and I got all the
search results. I will implement the same on the website.

Thank you so much Shawn.


On Mon, Dec 13, 2010 at 5:16 PM, Shawn Heisey s...@elyograg.org wrote:

 On 12/13/2010 9:59 AM, Solr User wrote:

 Hi,

 I tried *:* using dismax and I get no results.

 Is there a way that I can get all the search results using dismax?


 For dismax, use q= or simply leave the q parameter off the URL entirely.
  It appears that you need to have q.alt set to *:* for this to work.  It
 would be a good idea to include this in your handler definition:

 <str name="q.alt">*:*</str>
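
 For example, a request like this should then match everything (a sketch;
 adjust the handler and params to your setup):

 /select?defType=dismax&q.alt=*:*&rows=8000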

 Two people (myself and Peter Karich) gave this answer on this thread last
 week, within 15 minutes of the time your original question was posted.
  Here's the entire thread on nabble:


 http://lucene.472066.n3.nabble.com/How-to-get-all-the-search-results-td2028233.html

 Shawn




How to get all the search results?

2010-12-06 Thread Solr User
Hi,

First off, thanks to the group for guiding me to move from the default
search handler to dismax.

I have a question related to getting all the search results. In the past,
with the default search handler, I was getting all the search results
(8000) if I passed q=* as the search string, but with dismax I was getting
only 16 results instead of 8000.

How do I get all the search results using dismax? Do I need to configure
anything to make * (asterisk) work?

Thanks,
Solr User


Re: Dismax - Boosting

2010-11-22 Thread Solr User
Hi Ahmet,

In the past we used /spell, and if there was no match we would get a list
of suggestions and then make another call with the first suggestion to get
search results. After that we show the user both the suggestions for the
spelling mistake and the results of the first suggestion.

I think the URL that you provided, which has a plugin, will help do that.

Is there a way to directly get the spelling suggestions as well as the
first suggestion's data from Solr at the same time?

For example:

if the search keyword is mooon (typed by mistake instead of moon)

then we need all suggestions like:

Did you mean:  moon, mo, mooing, moonen, soon, mood, moose, moore,
spoon, moons?

and also the search results for the first suggestion, moon.

Thanks,
Solr User

On Fri, Nov 19, 2010 at 6:41 PM, Ahmet Arslan iori...@yahoo.com wrote:

   The below is my previous configuration, which used to work correctly.
  
   <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
     <str name="queryAnalyzerFieldType">textSpell</str>
     <lst name="spellchecker">
       <str name="name">default</str>
       <str name="field">searchFields</str>
       <str name="spellcheckIndexDir">/solr/qa/tradedata/spellchecker</str>
       <str name="buildOnCommit">true</str>
     </lst>
   </searchComponent>
 
   We used to search only in one field, searchFields, but with dismax we
   are searching in different fields like
  
   title^9.0 subtitle^3.0 author^2.0 desc shortdesc imprint category
   isbn13 isbn10 format series season bisacsub award.
  
   Do we need to modify the above configuration to include all the above
   fields? Please give me an example.

 Searching and spell checking are independent. For example, you can search
 on 10 fields and create suggestions from 2 fields. The spell checker
 accepts one field in its configuration, so you need to populate this field
 with copyField, using the fields you want spell checking on. The type of
 this field should be textSpell in your case. You can use the above config.
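
 Roughly like this in schema.xml (field names are just an example):

 <field name="spellText" type="textSpell" indexed="true" stored="false" multiValued="true"/>
 <copyField source="title" dest="spellText"/>
 <copyField source="author" dest="spellText"/>

 and then in the spellchecker configuration:

 <str name="field">spellText</str>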

 
   In the past we used to query twice: first to get the suggestions, and
   then using the first suggestion to show the data.
  
   Is there a way that we can do it in one step?

 Are you talking about queries that return 0 numFound? Re-executing the
 search, like described here:
 http://sematext.com/products/dym-researcher/index.html

 Not out-of-the-box.






Special Characters

2010-11-22 Thread Solr User
Hi,

I am searching for j.r.r. tolkien and getting results back, but if I search
for jrr I am not getting any results. I am also not getting any results if
I search for jrr tolkien. I am using AND as the default operator.

The search results should work for both j.r.r. tolkien and jrr tolkien.

What configuration changes do I need to make so that special characters
like hyphen (-) and period (.) are ignored while indexing? Or any other
suggestions?

Thanks,
Solr User


Re: Special Characters

2010-11-22 Thread Solr User
Hi Erick,

I use solr version 1.4.0 and below is my schema.xml

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
            ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         add enablePositionIncrements=true in both the index and query
         analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
</fieldType>

It creates 3 tokens (j, r, r), so j.r.r. tolkien works fine but jrr tolkien
does not.

I will read about PatternReplaceCharFilterFactory and try it. Please let me
know if I need to do anything differently.
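
From my reading so far, I would try something like this at the top of both
analyzers (an untested guess on my part; the pattern just strips periods
and hyphens before tokenizing, and would need refining so it doesn't eat
decimal points like 16.0):

<charFilter class="solr.PatternReplaceCharFilterFactory"
            pattern="[.\-]" replacement=""/>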

Thanks,
Solr User



On Mon, Nov 22, 2010 at 8:19 AM, Erick Erickson erickerick...@gmail.com wrote:

 What version of Solr are you using? You can think about
 PatternReplaceCharFilterFactory if you're using the right
 version of Solr.

 But you have other problems than that. Let's claim you
 get the periods removed. Do you tokenize three tokens or
 one? I.e. jrr or j r r? In the latter case your search still won't
 match.

 Best
 Erick

 On Mon, Nov 22, 2010 at 7:45 AM, Solr User solr...@gmail.com wrote:

  Hi,
 
   I am searching for j.r.r. tolkien and getting results back, but if I
   search for jrr I am not getting any results. I am also not getting any
   results if I search for jrr tolkien. I am using AND as the default
   operator.
  
   The search results should work for both j.r.r. tolkien and jrr tolkien.
  
   What configuration changes do I need to make so that special characters
   like hyphen (-) and period (.) are ignored while indexing? Or any other
   suggestions?
 
  Thanks,
  Solr User
 



Facet - Range Query issue

2010-11-22 Thread Solr User
Hi,

I am having issue with querying and using facet.

This was working fine earlier:

/spell/?q=(sun) AND (pubyear:[1991 TO
2011])&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&facet.field=category&facet.field=award&facet.field=age&facet.field=reading&facet.field=grade&facet.field=price&spellcheck=true&debugQuery=on

After modifying to use the dismax handler with the new schema, the query
below does not work:

/select/?q=(sun) AND (pubyear:[1991 TO
2011])&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear_facet&facet.field=format_facet&facet.field=series_facet&facet.field=season_facet&facet.field=imprint_facet&facet.field=category_facet&facet.field=award_facet&facet.field=age_facet&facet.field=reading_facet&facet.field=grade_facet&facet.field=price_facet&spellcheck=true&debugQuery=on

<lst name="debug">
  <str name="rawquerystring">(sun) AND (pubyear:[1991 TO 2011])</str>
  <str name="querystring">(sun) AND (pubyear:[1991 TO 2011])</str>
  <str name="parsedquery">+((+DisjunctionMaxQuery((series:sun | desc:sun |
bisacsub:sun | award:sun | format:sun | shortdesc:sun | pubyear:sun |
author:sun^2.0 | category:sun | title:sun^9.0 | isbn10:sun | season:sun |
imprint:sun | subtitle:sun^3.0 | isbn13:sun))
+DisjunctionMaxQuery((series:"pubyear 1991" | desc:"pubyear 1991" |
bisacsub:"pubyear 1991" | award:"pubyear 1991" | format:"pubyear 1991" |
shortdesc:"pubyear 1991" | pubyear:"pubyear 1991" | author:"pubyear
1991"^2.0 | category:"pubyear 1991" | title:"pubyear 1991"^9.0 |
isbn10:"pubyear 1991" | season:"pubyear 1991" | imprint:"pubyear 1991" |
subtitle:"pubyear 1991"^3.0 | isbn13:"pubyear 1991"))
DisjunctionMaxQuery((series:2011 | desc:2011 | bisacsub:2011 | award:2011 |
format:2011 | shortdesc:2011 | pubyear:2011 | author:2011^2.0 |
category:2011 | title:2011^9.0 | isbn10:2011 | season:2011 | imprint:2011 |
subtitle:2011^3.0 | isbn13:2011)))~1) ()</str>
  <str name="parsedquery_toString">+((+(series:sun | desc:sun | bisacsub:sun
| award:sun | format:sun | shortdesc:sun | pubyear:sun | author:sun^2.0 |
category:sun | title:sun^9.0 | isbn10:sun | season:sun | imprint:sun |
subtitle:sun^3.0 | isbn13:sun) +(series:"pubyear 1991" | desc:"pubyear 1991"
| bisacsub:"pubyear 1991" | award:"pubyear 1991" | format:"pubyear 1991" |
shortdesc:"pubyear 1991" | pubyear:"pubyear 1991" | author:"pubyear
1991"^2.0 | category:"pubyear 1991" | title:"pubyear 1991"^9.0 |
isbn10:"pubyear 1991" | season:"pubyear 1991" | imprint:"pubyear 1991" |
subtitle:"pubyear 1991"^3.0 | isbn13:"pubyear 1991") (series:2011 |
desc:2011 | bisacsub:2011 | award:2011 | format:2011 | shortdesc:2011 |
pubyear:2011 | author:2011^2.0 | category:2011 | title:2011^9.0 |
isbn10:2011 | season:2011 | imprint:2011 | subtitle:2011^3.0 |
isbn13:2011))~1) ()</str>
  <lst name="explain" />
  <str name="QParser">DisMaxQParser</str>

Basically we are trying to pass the query string along with a facet field
and the range. Is there any syntax issue? Please help; this is urgent, as I
am stuck.

Thanks,
Solr user


Re: Facet - Range Query issue

2010-11-22 Thread Solr User
Erick,

I solved the issue by adding an fq parameter to the query. Thank you so
much for your reply.
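
For the archives, the working approach was along these lines (a sketch of
the idea, not the exact production query):

/select/?q=sun&fq=pubyear:[1991 TO 2011]&rows=9&facet=true&facet.mincount=1&facet.field=pubyear_facet&spellcheck=true&debugQuery=on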

Thanks,
Murali

On Mon, Nov 22, 2010 at 1:51 PM, Erick Erickson erickerick...@gmail.com wrote:

 Well, without seeing the changes you made to the schema, it's hard to tell
 much. Also, could you define "not work"? What, exactly, fails to do what
 you expect?

 But the first question I have is: did you reindex after changing your
 schema?

 And have you checked your index to verify that there are values in the
 fields you changed?

 Best
 Erick

 On Mon, Nov 22, 2010 at 1:42 PM, Solr User solr...@gmail.com wrote:

  Hi,
 
  I am having issue with querying and using facet.
 
  This was working fine earlier:
 
  /spell/?q=(sun) AND (pubyear:[1991 TO
 2011])&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&facet.field=category&facet.field=award&facet.field=age&facet.field=reading&facet.field=grade&facet.field=price&spellcheck=true&debugQuery=on
 
  After modifying to use dismax handler with new schema the below query
 does
  not work:
 
  /select/?q=(sun) AND (pubyear:[1991 TO
 2011])&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear_facet&facet.field=format_facet&facet.field=series_facet&facet.field=season_facet&facet.field=imprint_facet&facet.field=category_facet&facet.field=award_facet&facet.field=age_facet&facet.field=reading_facet&facet.field=grade_facet&facet.field=price_facet&spellcheck=true&debugQuery=on
 
  <lst name="debug">
    <str name="rawquerystring">(sun) AND (pubyear:[1991 TO 2011])</str>
    <str name="querystring">(sun) AND (pubyear:[1991 TO 2011])</str>
    <str name="parsedquery">+((+DisjunctionMaxQuery((series:sun | desc:sun |
  bisacsub:sun | award:sun | format:sun | shortdesc:sun | pubyear:sun |
  author:sun^2.0 | category:sun | title:sun^9.0 | isbn10:sun | season:sun |
  imprint:sun | subtitle:sun^3.0 | isbn13:sun))
  +DisjunctionMaxQuery((series:"pubyear 1991" | desc:"pubyear 1991" |
  bisacsub:"pubyear 1991" | award:"pubyear 1991" | format:"pubyear 1991" |
  shortdesc:"pubyear 1991" | pubyear:"pubyear 1991" | author:"pubyear
  1991"^2.0 | category:"pubyear 1991" | title:"pubyear 1991"^9.0 |
  isbn10:"pubyear 1991" | season:"pubyear 1991" | imprint:"pubyear 1991" |
  subtitle:"pubyear 1991"^3.0 | isbn13:"pubyear 1991"))
  DisjunctionMaxQuery((series:2011 | desc:2011 | bisacsub:2011 | award:2011 |
  format:2011 | shortdesc:2011 | pubyear:2011 | author:2011^2.0 |
  category:2011 | title:2011^9.0 | isbn10:2011 | season:2011 | imprint:2011 |
  subtitle:2011^3.0 | isbn13:2011)))~1) ()</str>
    <str name="parsedquery_toString">+((+(series:sun | desc:sun | bisacsub:sun
  | award:sun | format:sun | shortdesc:sun | pubyear:sun | author:sun^2.0 |
  category:sun | title:sun^9.0 | isbn10:sun | season:sun | imprint:sun |
  subtitle:sun^3.0 | isbn13:sun) +(series:"pubyear 1991" | desc:"pubyear 1991"
  | bisacsub:"pubyear 1991" | award:"pubyear 1991" | format:"pubyear 1991" |
  shortdesc:"pubyear 1991" | pubyear:"pubyear 1991" | author:"pubyear
  1991"^2.0 | category:"pubyear 1991" | title:"pubyear 1991"^9.0 |
  isbn10:"pubyear 1991" | season:"pubyear 1991" | imprint:"pubyear 1991" |
  subtitle:"pubyear 1991"^3.0 | isbn13:"pubyear 1991") (series:2011 |
  desc:2011 | bisacsub:2011 | award:2011 | format:2011 | shortdesc:2011 |
  pubyear:2011 | author:2011^2.0 | category:2011 | title:2011^9.0 |
  isbn10:2011 | season:2011 | imprint:2011 | subtitle:2011^3.0 |
  isbn13:2011))~1) ()</str>
    <lst name="explain" />
    <str name="QParser">DisMaxQParser</str>
 
   Basically we are trying to pass the query string along with a facet
   field and the range. Is there any syntax issue? Please help; this is
   urgent, as I am stuck.
 
  Thanks,
  Solr user
 



Re: Dismax - Boosting

2010-11-19 Thread Solr User
Hi Ahmet,

The below is my previous configuration, which used to work correctly.

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">searchFields</str>
    <str name="spellcheckIndexDir">/solr/qa/tradedata/spellchecker</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

We used to search only in one field, searchFields, but with dismax we are
searching in different fields like

title^9.0 subtitle^3.0 author^2.0 desc shortdesc imprint category isbn13
isbn10 format series season bisacsub award.

Do we need to modify the above configuration to include all the above
fields? Please give me an example.

In the past we used to query twice: first to get the suggestions, and then
using the first suggestion to show the data.

Is there a way that we can do it in one step?

Thanks,

Murali




On Wed, Nov 17, 2010 at 7:00 PM, Ahmet Arslan iori...@yahoo.com wrote:


  2. How to use the spell checker request handler along with dismax?

 Just append this at the end of the dismax request handler definition:

 <arr name="last-components">
   <str>spellcheck</str>
 </arr>

 </requestHandler>






Re: Dismax - Boosting

2010-11-18 Thread Solr User
Ahmet,

I modified the schema as follows (added more fields for faceting):


<field name="title" type="text" indexed="true" stored="true" omitNorms="true" />
<field name="author" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
<field name="authortype" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
<field name="isbn13" type="text" indexed="true" stored="true" />
<field name="isbn10" type="text" indexed="true" stored="true" />
<field name="material" type="text" indexed="true" stored="true" />
<field name="pubdate" type="text" indexed="true" stored="true" />
<field name="pubyear" type="text" indexed="true" stored="true" />
<field name="reldate" type="text" indexed="false" stored="true" />
<field name="format" type="text" indexed="true" stored="true" />
<field name="pages" type="text" indexed="false" stored="true" />
<field name="desc" type="text" indexed="true" stored="true" />
<field name="series" type="text" indexed="true" stored="true" />
<field name="season" type="text" indexed="true" stored="true" />
<field name="imprint" type="text" indexed="true" stored="true" />
<field name="bisacsub" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
<field name="bisacstatus" type="text" indexed="false" stored="true" />
<field name="category" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
<field name="award" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
<field name="age" type="text" indexed="true" stored="true" />
<field name="reading" type="text" indexed="true" stored="true" />
<field name="grade" type="text" indexed="true" stored="true" />
<field name="path" type="text" indexed="false" stored="true" />
<field name="shortdesc" type="text" indexed="true" stored="true" />
<field name="subtitle" type="text" indexed="true" stored="true" omitNorms="true"/>
<field name="price" type="float" indexed="true" stored="true"/>
<field name="author_facet" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="pubyear_facet" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true"/>
<field name="format_facet" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="series_facet" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="season_facet" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="imprint_facet" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="category_facet" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true"/>
<field name="award_facet" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true"/>
<field name="age_facet" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="reading_facet" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="grade_facet" type="string" indexed="true" stored="true" omitNorms="true"/>
<field name="price_facet" type="string" indexed="true" stored="true" omitNorms="true"/>

Also added Copy Fields as below:

<copyField source="author" dest="author_facet"/>
<copyField source="pubyear" dest="pubyear_facet"/>
<copyField source="format" dest="format_facet"/>
<copyField source="series" dest="series_facet"/>
<copyField source="season" dest="season_facet"/>
<copyField source="imprint" dest="imprint_facet"/>
<copyField source="category" dest="category_facet"/>
<copyField source="award" dest="award_facet"/>
<copyField source="age" dest="age_facet"/>
<copyField source="reading" dest="reading_facet"/>
<copyField source="grade" dest="grade_facet"/>
<copyField source="price" dest="price_facet"/>

With the above changes I am not getting any facet data back. Why is the
facet data not returning, and what mistake did I make in the schema?

Thanks,
Solr User

On Wed, Nov 17, 2010 at 6:42 PM, Ahmet Arslan iori...@yahoo.com wrote:



 Wow, you facet on many fields:

 author,pubyear,format,series,season,imprint,category,award,age,reading,grade,price

 The fields you facet on should be of an untokenized type: string, int,
 tint, date, etc.

 The fields you want full-text search on, e.g. the ones you specify in the
 qf and pf parameters, should be of a text type.
 (title subtitle author desc shortdesc imprint category isbn13 isbn10 format
 series season bisacsub award)

 If you have common fields, for example category, you need two copies of
 them: one string, one text, so that you can both full-text search and
 facet. Use copyField for this.

 <copyField source="category" dest="category_string"/>

 Example document:
 category: electronic devices


 The query electronic will return it, and facets on category_string will be
 displayed as:

 electronic devices (1)

 not:

 electronic (1)
 devices (1)



 --- On Wed, 11/17/10, Solr User solr...@gmail.com wrote:

  From: Solr User solr...@gmail.com
  Subject: Re: Dismax - Boosting
  To: solr-user@lucene.apache.org
  Date: Wednesday, November 17, 2010, 11:31 PM
   Ahmet,
 
  Thanks for the reply and it was very helpful.
 
   The query that I used before changing to dismax was:
  
 
 /solr/tradecore/spell/?q=curious&wt=json&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&facet.field=category&facet.field=award&facet.field=age&facet.field=reading&facet.field

Re: Dismax - Boosting

2010-11-17 Thread Solr User
Ahmet,

Thanks for the reply and it was very helpful.

The query that I used before changing to dismax was:

/solr/tradecore/spell/?q=curious&wt=json&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&facet.field=category&facet.field=award&facet.field=age&facet.field=reading&facet.field=grade&facet.field=price&spellcheck=true

The above query used to return all the facet data, the documents, and any
suggestions for spelling mistakes properly.

The configuration after modifying using dismax is as below:

Schema.xml:

   <field name="title" type="text" indexed="true" stored="true" omitNorms="true" />
   <field name="author" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
   <field name="authortype" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
   <field name="isbn13" type="text" indexed="true" stored="true" />
   <field name="isbn10" type="text" indexed="true" stored="true" />
   <field name="material" type="text" indexed="true" stored="true" />
   <field name="pubdate" type="text" indexed="true" stored="true" />
   <field name="pubyear" type="text" indexed="true" stored="true" />
   <field name="reldate" type="text" indexed="false" stored="true" />
   <field name="format" type="text" indexed="true" stored="true" />
   <field name="pages" type="text" indexed="false" stored="true" />
   <field name="desc" type="text" indexed="true" stored="true" />
   <field name="series" type="text" indexed="true" stored="true" />
   <field name="season" type="text" indexed="true" stored="true" />
   <field name="imprint" type="text" indexed="true" stored="true" />
   <field name="bisacsub" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
   <field name="bisacstatus" type="text" indexed="false" stored="true" />
   <field name="category" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
   <field name="award" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
   <field name="age" type="text" indexed="true" stored="true" />
   <field name="reading" type="text" indexed="true" stored="true" />
   <field name="grade" type="text" indexed="true" stored="true" />
   <field name="path" type="text" indexed="false" stored="true" />
   <field name="shortdesc" type="text" indexed="true" stored="true" />
   <field name="subtitle" type="text" indexed="true" stored="true" omitNorms="true"/>
   <field name="price" type="float" indexed="true" stored="true"/>

SolrConfig.xml:

  <requestHandler name="dismax" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="echoParams">explicit</str>
      <!-- <float name="tie">0.01</float> -->
      <str name="qf">
        title^9.0 subtitle^3.0 author^1.0 desc shortdesc imprint category
        isbn13 isbn10 format series season bisacsub award
      </str>
      <!--
      <str name="pf">
        text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
      </str>
      <str name="bf">
        popularity^0.5 recip(price,1,1000,1000)^0.3
      </str>
      -->
      <str name="fl">
        *
      </str>
      <!--
      <str name="mm">
        2&lt;-1 5&lt;-2 6&lt;90%
      </str>
      <int name="ps">100</int>
      <str name="q.alt">*:*</str>
      -->
      <!-- example highlighter config, enable per-query with hl=true -->
      <!--
      <str name="hl.fl">text features name</str>
      -->
      <!-- for this field, we want no fragmenting, just highlighting -->
      <!--
      <str name="f.name.hl.fragsize">0</str>
      -->
      <!-- instructs Solr to return the field itself if no query terms are
           found -->
      <!--
      <str name="f.name.hl.alternateField">name</str>
      <str name="f.text.hl.fragmenter">regex</str>
      -->
      <!-- defined below -->
    </lst>
  </requestHandler>

The query that I used after changing to dismax is:

solr/tradecore/select/?q=curious&wt=json&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&facet.field=category&facet.field=award&facet.field=age&facet.field=reading&facet.field=grade&facet.field=price&spellcheck=true


The following are the issues that I am having after moving to dismax:

1. The facet data is not coming back correctly; a lot of extra data is
returned. Why, and how do I fix it?
2. How do I use the spell checker request handler along with dismax?

Thanks,
Murali

On Mon, Nov 15, 2010 at 5:38 PM, Ahmet Arslan iori...@yahoo.com wrote:

  1. Do we need to change the above DisMax handler configuration as per
  our requirements? Or leave it as it is? What changes?

 Yes, you need to edit it, at least the field names. Does your schema have
 a field named sku?

  2. Do we need to make DisMax the default request handler? Do I need to
  add the attribute default=true to the tag?

 If you are going to always use it, why not; change it by adding
 default=true. By doing so you won't need to add the qt parameter to every
 request. But don't forget to delete the other default=true. There can be
 only one default=true :)

  3. I read in the documentation that the Default Search Handler and DisMax
  are the same except that to use the DisMaxQueryParser you add
  defType=dismax in the query string. Is there anything else we need to do?

 Above dismax 

Dismax - Boosting

2010-11-15 Thread Solr User
Hi,

Currently we are using the StandardRequestHandler, and the configuration in
SolrConfig.xml is as below:

  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <!-- default values for query parameters -->
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <!--
      <int name="rows">10</int>
      <str name="fl">*</str>
      <str name="version">2.1</str>
      -->
    </lst>
  </requestHandler>


We would like to switch to the DisMax request handler, and the
configuration in SolrConfig.xml is:

  <requestHandler name="dismax" class="solr.SearchHandler" >
    <lst name="defaults">
      <str name="defType">dismax</str>
      <str name="echoParams">explicit</str>
      <float name="tie">0.01</float>
      <str name="qf">
        text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
      </str>
      <str name="pf">
        text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
      </str>
      <str name="bf">
        popularity^0.5 recip(price,1,1000,1000)^0.3
      </str>
      <str name="fl">
        id,name,price,score
      </str>
      <str name="mm">
        2&lt;-1 5&lt;-2 6&lt;90%
      </str>
      <int name="ps">100</int>
      <str name="q.alt">*:*</str>
      <!-- example highlighter config, enable per-query with hl=true -->
      <str name="hl.fl">text features name</str>
      <!-- for this field, we want no fragmenting, just highlighting -->
      <str name="f.name.hl.fragsize">0</str>
      <!-- instructs Solr to return the field itself if no query terms are
           found -->
      <str name="f.name.hl.alternateField">name</str>
      <str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
    </lst>
  </requestHandler>

Questions:

1. Do we need to change the above DisMax handler configuration as per our
requirements, or leave it as it is? What changes?
2. Do we need to make DisMax the default request handler? Do I need to add
the attribute default=true to the tag?
3. I read in the documentation that the Default Search Handler and DisMax
are the same except that to use the DisMaxQueryParser you add
defType=dismax in the query string. Is there anything else we need to do?

We are basically moving to the dismax handler and trying to understand what
changes we need to make to SolrConfig.xml. I understood what changes need
to be made to schema.xml in a different thread on this forum.
Thanks,
Solr User


Re: WELCOME to solr-user@lucene.apache.org

2010-11-12 Thread Solr User
Ahmet,

Thanks for the reply.

select/?q=built+to+last&defType=dismax&qf=searchFields^0.2+title^20&debugQuery=on

For some reason, if I use the title field in my query I don't get any
results.

I am copying all searchable fields into the searchFields field, so I am
able to search only in the searchFields field and not in any other fields.

I request you all to clarify if anything is wrong with my schema.xml. The
schema.xml is at the bottom of this email.

I am not able to get the boosting working on the title field. Please help
me here too.

Thanks,
Solr User

On Thu, Nov 11, 2010 at 5:11 PM, Ahmet Arslan iori...@yahoo.com wrote:

 There are several mistakes in your approach:

 copyField just copies data. Index-time boost is not copied.

 There is no such boosting syntax: /select?q=Each&title^9&fl=score

 You are searching on your default field.

 This is not the cause of your problem, but omitNorms=true disables
 index-time boosts.

 http://wiki.apache.org/solr/DisMaxQParserPlugin can satisfy your need.


 --- On Thu, 11/11/10, Solr User solr...@gmail.com wrote:

  From: Solr User solr...@gmail.com
  Subject: Re: WELCOME to solr-user@lucene.apache.org
  To: solr-user@lucene.apache.org
  Date: Thursday, November 11, 2010, 11:54 PM
  Erick,
 
  Thank you so much for the reply, and I apologize for not providing all
  the details.
 
  The following are the field definitions in my schema.xml:
 
  <field name="title" type="string" indexed="true" stored="true" omitNorms="false" />
  <field name="author" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" />
  <field name="authortype" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" />
  <field name="isbn13" type="string" indexed="true" stored="true" />
  <field name="isbn10" type="string" indexed="true" stored="true" />
  <field name="material" type="string" indexed="true" stored="true" />
  <field name="pubdate" type="string" indexed="true" stored="true" />
  <field name="pubyear" type="string" indexed="true" stored="true" />
  <field name="reldate" type="string" indexed="false" stored="true" />
  <field name="format" type="string" indexed="true" stored="true" />
  <field name="pages" type="string" indexed="false" stored="true" />
  <field name="desc" type="string" indexed="true" stored="true" />
  <field name="series" type="string" indexed="true" stored="true" />
  <field name="season" type="string" indexed="true" stored="true" />
  <field name="imprint" type="string" indexed="true" stored="true" />
  <field name="bisacsub" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" />
  <field name="bisacstatus" type="string" indexed="false" stored="true" />
  <field name="category" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" />
  <field name="award" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" />
  <field name="age" type="string" indexed="true" stored="true" />
  <field name="reading" type="string" indexed="true" stored="true" />
  <field name="grade" type="string" indexed="true" stored="true" />
  <field name="path" type="string" indexed="false" stored="true" />
  <field name="shortdesc" type="string" indexed="true" stored="true" />
  <field name="subtitle" type="string" indexed="true" stored="true" omitNorms="true"/>
  <field name="price" type="float" indexed="true" stored="true"/>
  <field name="searchFields" type="textSpell" indexed="true" stored="true" multiValued="true" omitNorms="true"/>
 
  Copy Fields:
 
  <copyField source="title" dest="searchFields"/>
  <copyField source="author" dest="searchFields"/>
  <copyField source="isbn13" dest="searchFields"/>
  <copyField source="isbn10" dest="searchFields"/>
  <copyField source="format" dest="searchFields"/>
  <copyField source="series" dest="searchFields"/>
  <copyField source="season" dest="searchFields"/>
  <copyField source="imprint" dest="searchFields"/>
  <copyField source="bisacsub" dest="searchFields"/>
  <copyField source="category" dest="searchFields"/>
  <copyField source="award" dest="searchFields"/>
  <copyField source="shortdesc" dest="searchFields"/>
  <copyField source="desc" dest="searchFields"/>
  <copyField source="subtitle" dest="searchFields"/>
 
  <defaultSearchField>searchFields</defaultSearchField>
 
 
 
  Before creating the indexes I feed an XML file to the Solr job to create
  the index files. I added a boost attribute to the title field before
  creating the indexes; an example is below:
 
  <?xml version="1.0" encoding="UTF-8" standalone="no"?>
  <add><doc>
    <field name="material">1785440</field>
    <field boost="10.0" name="title">Each Little Bird That Sings</field>
    <field name="price">16.0</field>
    <field name="isbn10">0152051139</field>
    <field name="isbn13">9780152051136</field>
    <field name="format">Hardcover</field>
    <field name="pubdate">2005-03-01</field>
    <field name="pubyear">2005</field>
    <field name="reldate">2005-02-22</field>
    <field name="pages">272</field>
    <field name="bisacstatus">Active</field>
    <field name="season">Spring 2005</field>
    <field name="imprint">Children's</field>
    <field name="age">8.0-12.0</field>
    <field name="grade">3-6</field>
    <field name="author">Marla Frazee</field>
    <field name="authortype">Jacket Illustrator</field>
    <field name="author">Deborah Wiles</field>

Re: WELCOME to solr-user@lucene.apache.org

2010-11-12 Thread Solr User
Ahmet,

In the production system we are using

/spell/?q=built+to+last

so that we can check the spelling. We are not using /select?q=built+to+last

Can I use dismax with /spell?

I understood from your reply that I need to change my schema.xml and modify
the field types.

Do I still need to use the searchFields field, and what do I need to
specify in the defaultSearchField tag?

searchFields is one of the field names that we provided.

Thanks,
Solr User


On Fri, Nov 12, 2010 at 10:26 AM, Ahmet Arslan iori...@yahoo.com wrote:

 
 select/?q=built+to+last&defType=dismax&qf=searchFields^0.2+title^20&debugQuery=on
 
  For some reason, if I use the title field in my query I don't get any
  results.
 
  I am copying all searchable fields into the searchFields field, so I am
  able to search only in the searchFields field and not in any other
  fields.
 
  I request you all to clarify if anything is wrong with my schema.xml.
  The schema.xml is at the bottom of this email.
 
  I am not able to get the boosting working on the title field. Please
  help me here too.

 Change the type of your title field. It is string now; make it
 solr.TextField. Actually you don't need a catch-all copy field with
 dismax. Just change the field types from string to text and append them to
 the qf= parameter.
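
 Roughly like this (untested sketch):

 <field name="title" type="text" indexed="true" stored="true"/>

 and in your dismax defaults:

 <str name="qf">title^20 searchFields^0.2</str>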






Re: WELCOME to solr-user@lucene.apache.org

2010-11-11 Thread Solr User
Hi,

I have a question about boosting.

I have the following fields in my schema.xml:

1. title
2. description
3. ISBN

etc

I want to boost the field title. I tried index-time boosting but it did not
work. I also tried query-time boosting, but with no luck.

Can someone help me with how to implement boosting on a specific field like
title?

Thanks,
Solr User

On Thu, Nov 11, 2010 at 10:26 AM, solr-user-h...@lucene.apache.org wrote:

 Hi! This is the ezmlm program. I'm managing the
 solr-user@lucene.apache.org mailing list.

 I'm working for my owner, who can be reached
 at solr-user-ow...@lucene.apache.org.

 Acknowledgment: I have added the address

   solr...@gmail.com

 to the solr-user mailing list.

 Welcome to solr-u...@lucene.apache.org!

 Please save this message so that you know the address you are
 subscribed under, in case you later want to unsubscribe or change your
 subscription address.


 --- Administrative commands for the solr-user list ---

 I can handle administrative requests automatically. Please
 do not send them to the list address! Instead, send
 your message to the correct command address:

 To subscribe to the list, send a message to:
   solr-user-subscr...@lucene.apache.org

 To remove your address from the list, send a message to:
   solr-user-unsubscr...@lucene.apache.org

 Send mail to the following for info and FAQ for this list:
   solr-user-i...@lucene.apache.org
   solr-user-...@lucene.apache.org

 Similar addresses exist for the digest list:
   solr-user-digest-subscr...@lucene.apache.org
   solr-user-digest-unsubscr...@lucene.apache.org

 To get messages 123 through 145 (a maximum of 100 per request), mail:
   solr-user-get.123_...@lucene.apache.org

 To get an index with subject and author for messages 123-456 , mail:
   solr-user-index.123_...@lucene.apache.org

 They are always returned as sets of 100, max 2000 per request,
 so you'll actually get 100-499.

 To receive all messages with the same subject as message 12345,
 send a short message to:
   solr-user-thread.12...@lucene.apache.org

 The messages should contain one line or word of text to avoid being
 treated as s...@m, but I will ignore their content.
 Only the ADDRESS you send to is important.

 You can start a subscription for an alternate address,
 for example j...@host.domain, just add a hyphen and your
 address (with '=' instead of '@') after the command word:
 solr-user-subscribe-john=host.dom...@lucene.apache.org

 To stop subscription for this address, mail:
 solr-user-unsubscribe-john=host.dom...@lucene.apache.org

 In both cases, I'll send a confirmation message to that address. When
 you receive it, simply reply to it to complete your subscription.

 If despite following these instructions, you do not get the
 desired results, please contact my owner at
 solr-user-ow...@lucene.apache.org. Please be patient, my owner is a
 lot slower than I am ;-)


Boosting

2010-11-11 Thread Solr User
Hi,

I have a question about boosting.

I have the following fields in my schema.xml:

1. title
2. description
3. ISBN

etc

I want to boost the field title. I tried index-time boosting but it did not
work. I also tried query-time boosting, but with no luck.

Can someone help me with how to implement boosting on a specific field like
title?

Thanks,
Solr User


Re: WELCOME to solr-user@lucene.apache.org

2010-11-11 Thread Solr User
 learns about life's surprises in this funny, poignant, and very Southern
coming-of-age story.</field></doc>
<doc>
  <field name="material">1195443</field>
  <field boost="10.0" name="title">Baby Bear's Chairs</field>
  <field name="price">16.0</field>
  <field name="isbn10">0152051147</field>
  <field name="isbn13">9780152051143</field>
  <field name="format">Hardcover</field>
  <field name="pubdate">2005-09-01</field>
  <field name="pubyear">2005</field>
  <field name="reldate">2005-08-01</field>
  <field name="pages">40</field>
  <field name="bisacstatus">Active</field>
  <field name="season">Fall 2005</field>
  <field name="imprint">Children's</field>
  <field name="age">2.0-5.0</field>
  <field name="grade">P-K</field>
  <field name="author">Jane Yolen</field>
  <field name="authortype">Author</field>
  <field name="author">Melissa Sweet</field>
  <field name="authortype">Illustrator</field>
  <field name="bisacsub">Bedtime &amp; Dreams</field>
  <field name="bisacsub">Animals/Bears</field>
  <field name="bisacsub">Family/General (see also headings under Social Issues)</field>
  <field name="bisacsub">Social Issues/Emotions &amp; Feelings</field>
  <field name="bisacsub">Family/Parents</field>
  <field name="category">Animals/Bears</field>
  <field name="category">Bedtime Books</field>
  <field name="category">Family Relationships/Parent-Child</field>
  <field name="path">/assets/product/0152051147.gif</field>
  <field name="desc">&lt;div&gt;Baby Bear is the littlest bear in his family, and sometimes that's not so easy. Mama and Papa Bear get to stay up late in their great big chairs. Big brother gets to play fun games in his middle-sized chair. And Baby Bear only seems to cause trouble in his own tiny chair. But at the end of the day, he finds the one&lt;i&gt; &lt;/i&gt;perfect chair that's comfier and cozier than all the rest.&lt;br&gt; &lt;br&gt;Bestselling author Jane Yolen and popular illustrator Melissa Sweet have come together to create a lyrical bedtime tale about a baby bear trying to find his place in a family. With a playful rhyming text and adorable, fun illustrations, here is a book for parents and their own baby bears to treasure.&lt;br&gt;&lt;/div&gt;</field>
  <field name="shortdesc">In this sweet, bedtime story, Baby Bear discovers that Papa's lap is the best chair of all!</field>
</doc></add>

I am trying to boost the title field so that documents whose titles actually
match the query come back first in the results.

Adding a boost attribute to the title field at index time did not change the
search results. I also tried query-time boosting, as below, but with no luck:

/select?q=Each+Little+Bird+That+Sings&title^9&fl=score

Any help fixing this issue would be really appreciated.

Thanks,

Solr User
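
Two things worth checking, offered as guesses rather than a diagnosis:
index-time boosts are folded into the field norms, so they take effect only
after the boosted documents are re-indexed and committed, and only if the
title field is defined without omitNorms="true" in schema.xml. Also, in the
query above, title^9 is not a request parameter the standard handler
understands on its own; the boost has to be attached to a query clause or to
a dismax field weight. A minimal sketch, where the field type and handler
path are assumptions:

    <!-- schema.xml: norms must be enabled for an index-time boost to matter -->
    <field name="title" type="text" indexed="true" stored="true" omitNorms="false"/>

    # query-time alternative via the dismax parser
    /select?defType=dismax&q=Each+Little+Bird+That+Sings&qf=title^9&fl=*,score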




what would cause large numbers of executeWithRetry INFO messages?

2010-09-10 Thread solr-user

I see a large number (~1000) of the following executeWithRetry messages in my
Apache Catalina log files every day (see the snippet below). They seem to
appear at random intervals.

Since they are not flagged as errors or warnings, I have been ignoring them
for now. However, I started wondering whether the INFO level is a red herring
and there might be an actual problem somewhere.

Does anyone know what would cause this type of message? Are they normal? I
have not seen anything in my Google searches for Solr that mentions this
message.

Details:

1. My CPU usage seems fine, as does my heap; we have lots of CPU capacity and
heap space.
2. The log is from a searcher, but I know the intervals do not correspond to
replication (every 15 min, on the hour).
3. The INFO lines appear in all searcher logs (we have a number of searchers).
4. The data is around 10M records per searcher and occupies around 14 GB.
5. I am not noticing any problems performing queries on Solr (so no trace
info to give you); performance and queries seem fine.

Log snippet:
Sep 10, 2010 2:17:59 AM org.apache.solr.handler.SnapPuller fetchLatestIndex
INFO: Slave in sync with master.
Sep 10, 2010 2:18:20 AM org.apache.commons.httpclient.HttpMethodDirector
executeWithRetry
INFO: I/O exception (org.apache.commons.httpclient.NoHttpResponseException)
caught when processing request: The server xxx.admin.inf failed to respond
Sep 10, 2010 2:18:20 AM org.apache.commons.httpclient.HttpMethodDirector
executeWithRetry
INFO: Retrying request
Sep 10, 2010 2:18:20 AM org.apache.solr.handler.SnapPuller fetchLatestIndex
INFO: Slave in sync with master.

any info appreciated.  thx
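
For context, a hedged sketch of the slave-side replication setup implied by
the SnapPuller lines; the host, core, and interval values below are
placeholders, not taken from the original message:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <str name="masterUrl">http://master-host:8983/solr/core/replication</str>
        <str name="pollInterval">00:15:00</str>
      </lst>
    </requestHandler>

Each poll opens an HTTP connection to the master; if the master (or a proxy
in between) has silently closed a kept-alive connection, commons-httpclient
reports it as a NoHttpResponseException and retries the request, logging both
events at INFO. That pattern is usually benign, which would square with the
lack of any query-side impact described above.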

