Having an issue with pivot faceting

2014-10-07 Thread cwhi
I'm having an issue getting pivot faceting to work as expected.  I'm trying
to filter by a specific criterion, then facet by one of my document
attributes, item_generator, and then split each of those facet buckets into
2 sets: the first counting the documents in that bucket with
number_of_items_generated set to 0, the other counting the documents with
number_of_items_generated greater than 0.  This seems simple enough, and
pivot faceting with facet.interval.set looks like the solution.  However,
I'm not getting the expected results.  Here are my request and response:

Request:

http://localhost/solr/select/?facet=true&facet.sort=true&q=item_type:Food&facet.field=item_generator&f.number_of_items_generated.facet.interval.set=[0,0]&f.number_of_items_generated.facet.interval.set=[1,100]&rows=0&version=2.2&wt=json

Response:

{"responseHeader":{"status":0,"QTime":0,"params":{"f.number_of_items_generated.facet.interval.set":["[0,0]","[1,100]"],"facet":"true","facet.sort":"true","q":"item_type:Food","facet.field":"item_generator","wt":"json","version":"2.2","rows":"0"}},"response":{"numFound":743,"start":0,"docs":[]},"facet_counts":{"facet_queries":{},"facet_fields":{"item_generator":["food-creator",387,"toy-creator",356]},"facet_dates":{},"facet_ranges":{}}}

I've set docValues="true" for number_of_items_generated in my schema.xml.
What am I doing wrong in my query?
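For reference, a sketch of the same request with interval faceting explicitly enabled. This assumes Solr 4.10+ (where interval faceting first shipped) and the field names from the post: the target field has to be named in a facet.interval parameter, or the per-field f.<field>.facet.interval.set entries are silently ignored, which would explain a response with no interval counts at all.

```shell
# Sketch only (assumes Solr 4.10+ and the field names from the post above):
# interval faceting runs only for fields listed in facet.interval;
# f.<field>.facet.interval.set on its own does nothing.
URL="http://localhost/solr/select/?q=item_type:Food&rows=0&wt=json&facet=true"
URL="${URL}&facet.field=item_generator"
URL="${URL}&facet.interval=number_of_items_generated"
URL="${URL}&f.number_of_items_generated.facet.interval.set=%5B0,0%5D"   # [0,0]
URL="${URL}&f.number_of_items_generated.facet.interval.set=%5B1,100%5D" # [1,100]
echo "$URL"   # against a live node: curl "$URL"
```

Note that interval counts are computed over the whole result set, not nested under each item_generator value; getting the zero / greater-than-zero split per generator would need one filtered request per bucket (e.g. fq=number_of_items_generated:0 plus facet.field=item_generator).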



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Having-an-issue-with-pivot-faceting-tp4163158.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrException Error when indexing new documents at scale in SolrCloud

2014-01-16 Thread cwhi
Hi Shawn,

Thanks for the helpful and thorough response.  While I understand all of the
factors that you've outlined for memory requirements (in fact, I'd
previously read your page on Solr performance problems), it is baffling to
me why two identical SolrCloud instances, each sharded across 3 machines
with identical hardware, would run into these memory issues at such
different points (one SolrCloud instance started seeing OOM issues
at 2 million indexed documents, the other only between
20 and 30 million indexed documents). 

When I stated approximately 1.5 GB, I meant that this is how much heap
space I allocated when launching Java with -Xmx, and I can see the Java
process using that full amount of RAM.  

From a usage perspective, the load doesn't seem all that heavy.  I'm
indexing about 600k documents an hour (each of which have ~20 short numeric
or string fields).  I have the autoSoftCommit parameter set for once a
second, and the autoCommit time set for every 5 minutes, with openSearcher
set to false.  Finally, I have maxWarmingSearchers at 2.  Besides indexing
those documents, I've been doing a few small queries just to check how many
documents have been indexed, and a few other small queries, sorting by a
single attribute.  These searches are very infrequent though, maybe 5 or 6
an hour.
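The commit settings described above correspond to roughly the following in solrconfig.xml (a sketch reconstructed from the description, not the poster's actual config):

```xml
<!-- Sketch of the commit policy as described in this post. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- autoSoftCommit: new documents become searchable within ~1 second. -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
  <!-- autoCommit: flush to disk every 5 minutes without opening a searcher. -->
  <autoCommit>
    <maxTime>300000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>

<query>
  <maxWarmingSearchers>2</maxWarmingSearchers>
</query>
```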

Seems like a strange issue indeed.  My expectation is that Solr would hit a
point where it becomes horribly slow after some threshold where things don't
fit in the cache, but I'd never expect it to simply crash like it's doing.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrException-Error-when-indexing-new-documents-at-scale-in-SolrCloud-tp4111551p4111680.html


Re: SolrException Error when indexing new documents at scale in SolrCloud

2014-01-15 Thread cwhi
Hi Shawn,

Thanks for the quick reply.  I did notice the exception you pointed out and
had some thoughts about it maybe being the client library I'm using to
connect to Solr (C# SolrNet) disconnecting too early, but that doesn't
explain it eventually running out of memory altogether.  A large index
shouldn't cause Solr to run out of memory, since it would just go to disk on
queries to process requests instead of holding the entire index in memory.  

I'm also not sure that index size is the cause, because I have another
SolrCloud instance running where I saw this behaviour at ~20 million
documents, rather than 2 million (same type of documents, so a much larger
index on disk). 
The machines these are running on are identical Amazon EC2 instances as
well, so that rules out the larger index succeeding for longer due to better
hardware.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrException-Error-when-indexing-new-documents-at-scale-in-SolrCloud-tp4111551p4111561.html


SolrException Error when indexing new documents at scale in SolrCloud

2014-01-15 Thread cwhi
I have a SolrCloud installation with about 2 million documents indexed in it. 
It's been buzzing along without issue for the past 8 days, but today started
throwing errors on document adds that eventually resulted in out of memory
exceptions.  There is nothing funny going on.  There are a few infrequent
searches on the index every few minutes, and documents are being added in
batch (batches of 1000-5000) every few minutes as well.

The exceptions I'm receiving don't seem very informative.  The first
exception looks like this:

org.apache.solr.common.SolrException
at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
at
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
-- snip --

I've now experienced this with two SolrCloud instances in a row.  The
SolrCloud instance has 3 shards, each on a separate machine (each machine is
also running ZooKeeper).  Each of the machines has 4 GB of RAM, with ~1.5
GB allocated to Solr.  Solr seems to be maxing out the CPU during indexing,
so I don't know if that's related.
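When a node does die, a couple of standard JVM flags make the next OOM diagnosable rather than a mystery. A sketch of a launch line, where the heap size, dump paths, and ZooKeeper addresses are assumptions:

```shell
# Sketch (heap size, dump path, and zkHost are assumptions): capture a heap
# dump and GC log the next time a node runs out of memory.
HEAP="-Xms1536m -Xmx1536m"
DIAG="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/solr-dumps"
GC="-verbose:gc -Xloggc:/tmp/solr-gc.log"
echo "java $HEAP $DIAG $GC -DzkHost=zk1:2181,zk2:2181,zk3:2181 -jar start.jar"
```

The resulting heap dump can then be opened in a tool like Eclipse MAT to see which caches or field structures actually filled the 1.5 GB.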

If anybody could help me in sorting out these issues, it would be greatly
appreciated.  I pulled the Solr log file and have uploaded it at
https://www.dropbox.com/s/co3r4esjnsas0tl/solr.log

Also, a short snippet of the first exception is available on pastebin at
http://pastebin.com/pWZrkGEr

Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrException-Error-when-indexing-new-documents-at-scale-in-SolrCloud-tp4111551.html


Re: Seemingly arbitrary error on document adds to SolrCloud - Server Error - request: http://10.0.0.5:8443/solr

2014-01-07 Thread cwhi
I agree, but that's the only information that was returned from the
interface.  I had a look through the logs and there isn't a deeper stack
trace, but I noticed these errors directly before that exception that might
be helpful:

ERROR - 2014-01-06 16:23:12.426; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 1
ERROR - 2014-01-06 16:23:12.926; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 2
ERROR - 2014-01-06 16:23:16.765; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 3
ERROR - 2014-01-06 16:23:17.385; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 4
ERROR - 2014-01-06 16:23:18.331; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 5
ERROR - 2014-01-06 16:23:18.945; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 6
ERROR - 2014-01-06 16:23:19.561; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 7
ERROR - 2014-01-06 16:23:20.173; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 8
ERROR - 2014-01-06 16:23:20.805; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 9
ERROR - 2014-01-06 16:23:21.631; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 10
ERROR - 2014-01-06 16:23:22.202; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 11
ERROR - 2014-01-06 16:23:22.811; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 12
ERROR - 2014-01-06 16:23:23.588; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 13
ERROR - 2014-01-06 16:23:24.197; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 14
ERROR - 2014-01-06 16:23:24.779; org.apache.solr.common.SolrException;
forwarding update to http://10.0.0.5:8443/solr/collection1/ failed -
retrying ... retries: 15
ERROR - 2014-01-06 16:23:25.990;
org.apache.solr.update.StreamingSolrServers$1; error
org.apache.solr.common.SolrException: Server Error

request:
http://10.0.0.5:8443/solr/collection1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F10.0.0.229%3A8443%2Fsolr%2Fcollection1%2F&wt=javabin&version=2
at
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:240)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Seemingly-arbitrary-error-on-document-adds-to-SolrCloud-Server-Error-request-http-10-0-0-5-8443-solr-tp4109864p4109994.html


Seemingly arbitrary error on document adds to SolrCloud - Server Error - request: http://10.0.0.5:8443/solr

2014-01-06 Thread cwhi
I'm adding dozens of documents every few minutes to a SolrCloud instance with
3 machines and ~ 25 million documents.  I'm starting to see issues where
adds are throwing these ugly errors that seem to indicate there might be
some issues with the nodes communicating to one another.  My posts are of
the following form, but with about 30 fields rather than just 1: 

<add>
  <doc>
    <field name="id">112370241</field>
  </doc>
</add>

And here is the error that Solr is throwing:

null:org.apache.solr.common.SolrException: Server Error

request:
http://10.0.0.5:8443/solr/collection1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F10.0.0.229%3A8443%2Fsolr%2Fcollection1%2F&wt=javabin&version=2
at
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:240)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)


What is the source of these errors, and how can I resolve them?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Seemingly-arbitrary-error-on-document-adds-to-SolrCloud-Server-Error-request-http-10-0-0-5-8443-solr-tp4109864.html


Re: Shards stuck in "down" state after splitting shard - How can we recover from a failed SPLITSHARD?

2013-12-20 Thread cwhi
Thanks again for your replies.  I'm using Solr 4.6.  I just tried splitting
another shard so I could grab the exceptions from the logs; here is the
log output.

I noticed a few obvious exceptions that might have caused this to fail,
such as this:

ERROR - 2013-12-20 20:18:24.231; org.apache.solr.core.CoreContainer; Unable
to create core: collection1_shard3_1_replica1
java.lang.RuntimeException: java.io.IOException: Error opening
/configs/config1/stopwords.txt
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:169)
at
org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
at
org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:254)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:590)
at
org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:498)
at
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:152)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at
org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:662)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at
org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: Error opening /configs/config1/stopwords.txt
at
org.apache.solr.cloud.ZkSolrResourceLoader.openResource(ZkSolrResourceLoader.java:83)
at
org.apache.lucene.analysis.util.AbstractAnalysisFactory.getLines(AbstractAnalysisFactory.java:255)
at
org.apache.lucene.analysis.util.AbstractAnalysisFactory.getWordSet(AbstractAnalysisFactory.java:243)
at
org.apache.lucene.analysis.core.StopFilterFactory.inform(StopFilterFactory.java:99)
at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:655)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:167)
... 35 more


That exception claims that it can't read stopwords.txt, but the file is
definitely present locally at solr/conf/stopwords.txt, and it's present in
zookeeper at /configs/config1/stopwords.txt (I just checked with zkCli.cmd).




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Shards-stuck-in-down-state-after-splitting-shard-How-can-we-recover-from-a-failed-SPLITSHARD-tp4107297p4107668.html


Re: Shards stuck in "down" state after splitting shard - How can we recover from a failed SPLITSHARD?

2013-12-20 Thread cwhi
My apologies, I forgot to paste the output of clusterstate.json in my last
post.  Here it is:

[zk: localhost:2181(CONNECTED) 1] get /clusterstate.json
{"collection1":{
    "shards":{
      "shard1":{
        "range":"8000-d554",
        "state":"active",
        "replicas":{"10.0.0.229:8443_solr_collection1":{
            "state":"active",
            "base_url":"http://10.0.0.229:8443/solr",
            "core":"collection1",
            "node_name":"10.0.0.229:8443_solr",
            "leader":"true"}}},
      "shard2":{
        "range":"d555-2aa9",
        "state":"active",
        "replicas":{"10.0.0.5:8443_solr_collection1":{
            "state":"active",
            "base_url":"http://10.0.0.5:8443/solr",
            "core":"collection1",
            "node_name":"10.0.0.5:8443_solr",
            "leader":"true"}}},
      "shard3":{
        "range":"2aaa-7fff",
        "state":"active",
        "replicas":{"10.0.0.246:8443_solr_collection1":{
            "state":"active",
            "base_url":"http://10.0.0.246:8443/solr",
            "core":"collection1",
            "node_name":"10.0.0.246:8443_solr",
            "leader":"true"}}},
      "shard1_0":{
        "range":"8000-aaa9",
        "state":"construction",
        "parent":"shard1",
        "replicas":{"10.0.0.229:8443_solr_collection1_shard1_0_replica1":{
            "state":"down",
            "base_url":"http://10.0.0.229:8443/solr",
            "core":"collection1_shard1_0_replica1",
            "node_name":"10.0.0.229:8443_solr"}}},
      "shard1_1":{
        "range":"-d554",
        "state":"construction",
        "parent":"shard1",
        "replicas":{"10.0.0.229:8443_solr_collection1_shard1_1_replica1":{
            "state":"down",
            "base_url":"http://10.0.0.229:8443/solr",
            "core":"collection1_shard1_1_replica1",
            "node_name":"10.0.0.229:8443_solr"}}},
      "shard2_0":{
        "range":"d555-fffe",
        "state":"construction",
        "parent":"shard2",
        "replicas":{"10.0.0.5:8443_solr_collection1_shard2_0_replica1":{
            "state":"down",
            "base_url":"http://10.0.0.5:8443/solr",
            "core":"collection1_shard2_0_replica1",
            "node_name":"10.0.0.5:8443_solr",
            "leader":"true"}}},
      "shard2_1":{
        "range":"-2aa9",
        "state":"construction",
        "parent":"shard2",
        "replicas":{"10.0.0.5:8443_solr_collection1_shard2_1_replica1":{
            "state":"down",
            "base_url":"http://10.0.0.5:8443/solr",
            "core":"collection1_shard2_1_replica1",
            "node_name":"10.0.0.5:8443_solr",
            "leader":"true"}}}},
    "maxShardsPerNode":"1",
    "router":{"name":"compositeId"},
    "replicationFactor":"1",
    "autoCreated":"true"}}



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Shards-stuck-in-down-state-after-splitting-shard-How-can-we-recover-from-a-failed-SPLITSHARD-tp4107297p4107622.html


Re: Shards stuck in "down" state after splitting shard - How can we recover from a failed SPLITSHARD?

2013-12-20 Thread cwhi
Thanks for your reply Anshum.  I took a look at clusterstate.json, and it
seems they are stuck in "construction" while the others are still active. 
I'm able to query my index again (that seems to have been an unrelated
issue), but I'd still like to remove these stuck shards and recreate them
(or fix the existing ones).
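For anyone hitting the same state: subshards stuck in "construction" can usually be removed with the Collections API DELETESHARD action (available in Solr 4.6) and the split retried. A sketch, with the host and shard names taken from this thread; DELETESHARD is meant for shards that are not "active", so the live parents should be safe:

```shell
# Sketch (host and shard names assumed from the clusterstate in this thread):
# drop the stuck subshards so the split can be retried. DELETESHARD refuses
# "active" shards, so shard1/shard2/shard3 are protected from this loop.
HOST="http://10.0.0.229:8443/solr"
for SHARD in shard1_0 shard1_1 shard2_0 shard2_1; do
  echo "curl '${HOST}/admin/collections?action=DELETESHARD&collection=collection1&shard=${SHARD}'"
done
```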



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Shards-stuck-in-down-state-after-splitting-shard-How-can-we-recover-from-a-failed-SPLITSHARD-tp4107297p4107620.html


Shards stuck in "down" state after splitting shard - How can we recover from a failed SPLITSHARD?

2013-12-18 Thread cwhi
I called SPLITSHARD on a shard in an existing SolrCloud instance, where the
shard had ~1 million documents in it.  It's been about 3 hours since the
split completed, and the subshards are still stuck in a "Down"
state.  They are reported as down in localhost/solr/#/~cloud, and I'm unable
to query my index.

How can we recover from a failed SPLITSHARD operation?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Shards-stuck-in-down-state-after-splitting-shard-How-can-we-recover-from-a-failed-SPLITSHARD-tp4107297.html


How can you move a shard from one SolrCloud node to another?

2013-12-15 Thread cwhi
Let's say I want to rebalance a SolrCloud collection.  I call SPLITSHARD to
split an existing shard, and then I'd like to move one of the subshards to a
new machine so the index is more balanced.  Can this be done?  If not, how
do you rebalance an existing SolrCloud collection?
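There is no single "move" command in Solr 4.x, but a common manual recipe (the host and core names below are assumptions) is: create a core for the target shard on the new node via the Core Admin API, which joins it to the collection as a replica and pulls the index from the shard leader; once it shows active, unload the old core.

```shell
# Sketch (hosts and core names are assumptions): replicate shard1_0 onto a
# new node, then retire the original copy once the new replica is active.
NEW_NODE="http://newhost:8983/solr"
OLD_NODE="http://oldhost:8983/solr"
# 1. A core created with collection/shard parameters joins as a replica and
#    recovers the index from the current shard leader.
echo "curl '${NEW_NODE}/admin/cores?action=CREATE&name=collection1_shard1_0_replica2&collection=collection1&shard=shard1_0'"
# 2. After /solr/#/~cloud shows the new replica active, drop the old core.
echo "curl '${OLD_NODE}/admin/cores?action=UNLOAD&core=collection1_shard1_0_replica1&deleteIndex=true'"
```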



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-can-you-move-a-shard-from-one-SolrCloud-node-to-another-tp4106815.html


Re: Rebalancing a SolrCloud index after adding new nodes

2013-12-10 Thread cwhi
Interesting, thanks for the response.  Let's say I wanted to use
SPLITSHARD to balance my index.  Once I call it, the original shard has been
split into 2 subshards that are both then on the same machine.  Can I then
somehow move one of the shards to my new server?  This would allow me to
balance manually, at least.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Rebalancing-a-SolrCloud-index-after-adding-new-nodes-tp4105956p4106000.html


Rebalancing a SolrCloud index after adding new nodes

2013-12-10 Thread cwhi
I'm wondering if there is an automated way to rebalance a Solr collection
once a new node is added, so the maximum size of a single shard is
decreased.  I was pointed to this document, which seems
to imply such functionality exists, but I can't find any concrete examples.

How do we rebalance a collection when a new node becomes available for
SolrCloud?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Rebalancing-a-SolrCloud-index-after-adding-new-nodes-tp4105956.html


Replicating from the correct collections in SolrCloud on solr start

2013-12-09 Thread cwhi
I have a Solr configuration that I am trying to replicate on several machines
as part of a package installation.  I have a cluster of machines that will
run the SolrCloud, with 3 machines in the cluster running a zookeeper
ensemble.  As part of the installation of each machine, Solr is started with
the desired configuration uploaded (java
-Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf
-DzkHost=ipaddress1:2181,ipaddress2:2181,ipaddress3:2181 -jar start.jar).

My problem is that when I add a new machine to my SolrCloud cluster, I
expect it to replicate data from the collections I have in SolrCloud.  This
doesn't appear to be happening.  Instead, each new machine just replicates
the default collection1 collection.  I'd added the collection in question
with this command:

http://localhost:8983/solr/admin/collections?action=CREATE&name=SolrCloudTest&numShards=1&replicationFactor=2&collection.configName=myconf

So my question is simple: Why is it that when I start a new Solr instance on
the same zookeeper ensemble, it does not replicate the data from the
SolrCloudTest collection, and instead only replicates collection1?
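One hedged guess at the cause: -Dbootstrap_confdir is a one-time bootstrap flag, and every node started with it re-creates a collection1 core linked to the uploaded config. A node joining an existing cluster only needs -DzkHost; a replica of SolrCloudTest can then be placed on it explicitly. A sketch, where the new node's host name is an assumption:

```shell
# Sketch (new-node host name is an assumption): join the existing ensemble
# without re-bootstrapping config or re-creating collection1.
ZK="ipaddress1:2181,ipaddress2:2181,ipaddress3:2181"
echo "java -DzkHost=$ZK -jar start.jar"
# Then put a replica of the existing collection on the new node:
echo "curl 'http://newnode:8983/solr/admin/cores?action=CREATE&name=SolrCloudTest_shard1_replica3&collection=SolrCloudTest&shard=shard1'"
```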



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Replicating-from-the-correct-collections-in-SolrCloud-on-solr-start-tp4105754.html