Re: Solr Cloud A/B Deployment Issue
Great. Thanks for the work on this patch! Jim -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Cloud-A-B-Deployment-Issue-tp4302810p4303357.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Cloud A/B Deployment Issue
It appears this has all been resolved by the following ticket: https://issues.apache.org/jira/browse/SOLR-9446 My scenario fails in 6.2.1 but works in 6.3 and master, where this bug has been fixed. In the meantime, we can use our workaround of issuing a simple delete command that deletes a non-existent document. Jim
Re: Solr Cloud A/B Deployment Issue
Also, if we issue a delete-by-query where the query is "_version_:0", it creates a transaction log and then has no trouble transferring leadership between old and new nodes. Still, it seems like some sort of transaction log should be started when we ADDREPLICA. Jim
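The delete-by-query trick above as a concrete call (host and collection name are placeholders, not from the thread); a query that matches nothing still writes a transaction-log entry:

```shell
# Issue a no-op delete-by-query so the new replicas get a transaction log.
# "mycollection" is a placeholder collection name.
curl "http://localhost:8983/solr/mycollection/update?commit=true" \
  -H "Content-Type: application/json" \
  -d '{"delete": {"query": "_version_:0"}}'
```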
Re: Solr Cloud A/B Deployment Issue
Interestingly, if I simply add one document to the full cluster after all 6 nodes are active, this entire problem goes away. This appears to be because a transaction log entry is created, which in turn prevents the new nodes from going into full replication recovery upon leader change. Adding a document is a hacky solution, however. It seems like new nodes that were added via ADDREPLICA should know more about versions than they currently do.
Re: Solr 6.0 Highlighting Not Working
Perhaps you need to wrap your inner tags in a CDATA section?
Solr Cloud A/B Deployment Issue
We are running into a timing issue when trying to do a scripted deployment of our Solr Cloud cluster. Scenario to reproduce (sometimes):

1. Launch 3 clean Solr nodes connected to ZooKeeper.
2. Create a 1-shard collection with replicas on each node.
3. Load data (more data makes the problem worse).
4. Launch 3 more nodes.
5. Add replicas on each new node.
6. Once the entire cluster is healthy, start killing the first three nodes.

Depending on the timing, the second three nodes all end up in RECOVERING state without a leader. This appears to happen because when the first leader dies, all the new nodes go into full replication recovery, and if all the old boxes happen to die during that state, the new boxes are stuck. They cannot serve requests and eventually (1-8 hours) go into RECOVERY_FAILED state. This state is easy to fix with a FORCELEADER call to the Collections API, but that's only remediation, not prevention.

My question is this: why do the new nodes have to go into full replication recovery when they are already up to date? I just added the replicas, so they shouldn't have to do a full replication again.

Jim
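The FORCELEADER remediation mentioned above, spelled out (host, collection, and shard names are placeholders):

```shell
# Force a leader election for a shard stuck with all replicas in
# RECOVERING/RECOVERY_FAILED. Remediation only, not prevention.
curl "http://localhost:8983/solr/admin/collections?action=FORCELEADER&collection=mycollection&shard=shard1"
```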
Re: Solr Cloud prevent Ping Request From Forwarding Request
It seems like all the parameters in the PingHandler get processed by the remote server, so things like shards=localhost or distrib=false take effect too late.
Solr Cloud prevent Ping Request From Forwarding Request
Here's the scenario: boxes 1, 2, and 3 have replicas of collections dogs and cats. Box 4 has only a replica of dogs. All of these boxes have a healthcheck file on them that works with the PingRequestHandler to say whether the box is up or not. If I hit Box4/cats/admin/ping, Solr forwards the ping request to another box, which returns with status OK. Is there any way to stop a box from forwarding a request to another node? Thanks!
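For reference, a minimal sketch of the health-check setup described above (the file name is illustrative); the healthcheckFile gates the ping response, but it does not stop the query itself from being forwarded to another node:

```xml
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <!-- Ping returns OK only while this file exists. -->
  <str name="healthcheckFile">server-enabled.txt</str>
</requestHandler>
```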
Re: Can't create collection without plugin, can't load plugin without collection
Sadly, that didn't work. Without a core to hit, /[COLLECTION]/config returns a 404 error. The best bet at this point may be one of the following:

1. Programmatically modify the configoverlay.json file to add the runtime libs when I upload the config.
2. Patch Solr so that schema.xml loads custom classes directly from the BlobStore like solrconfig.xml does.
3. Patch Solr so that you can specify configSets instead of a collection when associating a runtimeLib.
Can't create collection without plugin, can't load plugin without collection
I've run into an orchestration problem while creating collections and loading plugins via the Config API in Solr Cloud. Here's the scenario:

1. I create a configSet that references a custom class in schema.xml.
2. I upload the jar to the BlobStore and issue add-runtimelib using the Config API. This fails because the collection doesn't exist yet.
3. I try to create the collection with the configSet, but it fails because the custom plugin is not available yet.

I can force this to work by removing the custom reference, creating the collection, loading the jar, and then adding the custom reference back in place. This is fine as a manual one-time setup, but not feasible in a scripted production deployment. I wish I could create a collection without actually needing to create any cores. Then I could get all of the configuration for a collection set up before creating the cores.
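Steps 2 and 3 above as concrete calls (host, collection, and jar names are placeholders); the /config call is the one that fails before the collection exists:

```shell
# 1. Upload the plugin jar to the .system blob store.
curl -X POST -H "Content-Type: application/octet-stream" \
  --data-binary @myplugin.jar "http://localhost:8983/solr/.system/blob/myplugin"

# 2. Register the blob as a runtime lib for the collection -- this
#    fails if the collection does not exist yet.
curl "http://localhost:8983/solr/mycollection/config" \
  -H "Content-Type: application/json" \
  -d '{"add-runtimelib": {"name": "myplugin", "version": 1}}'
```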
Re: Why Doesn't Solr Really Quit on Zookeeper Exceptions?
Thanks Shawn. I'm leaning towards a retry as well. So, there's no mechanism that currently exists within Solr that would allow me to automatically retry the ZooKeeper connection on launch? My options then would be:

1. Externally monitor the status of Solr (e.g. /solr/admin/collections?action=CLUSTERSTATUS or bin/solr status) and force a restart.
2. Write a patch to retry ZooKeeper connections based on configuration values that specify attempts and wait times.
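Option 1 could also live outside Solr entirely; a minimal sketch of a generic retry wrapper (the health-check command is a placeholder -- in practice it might be `bin/solr status` or a curl to CLUSTERSTATUS):

```shell
# Retry a command up to N times with a short pause, then give up.
retry() {
  attempts=$1; shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0            # command succeeded
    fi
    echo "attempt $i failed; retrying..." >&2
    i=$((i + 1))
    sleep 1
  done
  return 1                # all attempts failed
}

# Placeholder check: replace `true` with the real health check.
retry 3 true && echo "healthy"   # prints: healthy
```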
Why Doesn't Solr Really Quit on Zookeeper Exceptions?
When I try to launch Solr 6.0 in cloud mode and connect it to a specific chroot in ZooKeeper that doesn't exist, I get an error in my solr.log. That's expected, but the Solr process continues to launch and succeeds. Why wouldn't we want the start process to simply fail and exit? There's no mechanism to trigger a retry, so Solr just sits there like a zombie.
Re: Using a RequestHandler to expand query parameter
Never got a response on this... just looking for the best way to handle it.
Re: Using a RequestHandler to expand query parameter
So, the problem I found that's driving this is that I have several phrase synonyms set up, for example "ipod mini" into "ipad mini". This synonym is only applied if you submit it as a phrase in quotes. So the pf param doesn't help because it's not the right phrase in the first place. I can fix this by sending in the query as (ipod mini "ipod mini").
Using a RequestHandler to expand query parameter
I would like to send only one query to my custom request handler and have the request handler expand that query into a more complicated query. Example: */myHandler?q=kids+books* ... would turn into a more complicated edismax query of: *kids books "kids books"* Is this achievable via a request handler definition in solrconfig.xml? Thanks! Jim
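A partial sketch, not a confirmed answer (field names are assumptions): a handler definition in solrconfig.xml can fix edismax parameters such as pf to boost the phrase form of the query, but rewriting q itself into a new string would need a custom component:

```xml
<requestHandler name="/myHandler" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">title description</str>
    <!-- pf re-scores documents where the whole query matches as a phrase -->
    <str name="pf">title^10</str>
  </lst>
</requestHandler>
```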
CloudSolrServer vs Software/Hardware Load Balancer
Hi there, We're trying to evaluate whether to use the CloudSolrServer in SolrJ or to use the HttpSolrServer pointed at a software or hardware load balancer such as haproxy or F5. This would be in production. Can anyone provide any experiential pros or cons on these? In addition to performance, I'm interested in management, scalability, and stability. Technically at this point we can already support both, so I'm really looking for best practices. Thanks! Jim
Help importing xml file as raw xml
Hi, I found a few threads out there dealing with this problem, but there didn't really seem to be much detail to the solution. I have large XML files (500 MB to 2+ GB) with a complex nested structure. It's impossible for me to import the exact structure into a Solr representation, and, honestly, I don't need to. But I do need to store the raw XML for each main item in a Solr field for use by other clients.

I tried using the xsl option for the XPathEntityProcessor, and it works perfectly for small files. However, it cannot handle the big file -- or at least the machine I have doesn't have enough memory for the task. A normal import with the XPathEntityProcessor takes just a few minutes. I do this job a couple times a day and I don't want it to eat up all the memory on one of my nodes. I tried using xsltproc to pre-transform the file, but it also took a long time and eventually failed due to memory.

My best option now would seem to be using awk or sed to transform the file prior to Solr import, perhaps by removing line breaks and using the LineEntityProcessor and some scripts. My other thought is that since the XPathEntityProcessor knows the structure, there must be some way for it to be extended so that it outputs the raw input if requested. Anyone have any other thoughts? Thanks! Jim
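Not from the thread, but one more option worth sketching: Python's stdlib iterparse can stream an arbitrarily large file and emit each item's raw XML with flat memory use. The <item> tag name and the sample document are assumptions:

```python
import io
import xml.etree.ElementTree as ET

def iter_raw_items(source, tag="item"):
    """Stream `source` and yield the raw XML of each <tag> element."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield ET.tostring(elem, encoding="unicode")
            elem.clear()  # release the subtree so memory stays flat

# Tiny stand-in for a multi-gigabyte file:
sample = io.BytesIO(b"<root><item id='1'><a>x</a></item><item id='2'/></root>")
raw_docs = list(iter_raw_items(sample))
```

Each yielded string could then be posted into a stored Solr field alongside whatever structured fields you do extract.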
How to uncache a query to debug?
I have a query that runs slowly occasionally. I'm having trouble debugging it because once it's cached, it runs fast -- under 10 ms. But throughout the day it occasionally takes up to 3 secs. It seems like it could be one of the following:

1. My autoCommit (30 and openSearcher=false) and softAutoCommit (1) settings
2. Something to do with distributed search -- there are three nodes, but only 1 shard each.
3. Just a slow query that is getting blown out of cache periodically

This is in Solr 4.2. I like that it runs fast when cached, but if it's going to be blown out quickly, then I'd really like to just optimize the query to run fast uncached. *Is there any way to run a query using no caching whatsoever?* The query changes, but has *:* for the q param and 4 fq parameters. It's also trying to do field collapsing.

Jim
Re: How to uncache a query to debug?
Thanks, but that doesn't seem to do much. I've added it to all four of the fq params and the q param, but it only makes things marginally slower -- like 50 ms instead of 2 ms. There appears to be a deeper or more widely encompassing cache at work here.

Jim

On Thu, Aug 1, 2013 at 2:49 PM, Mikhail Khludnev wrote:

Hello Jim, does q={!cache=false}lorem ipsum work for you?
Re: How to uncache a query to debug?
Thanks. I'd rather not turn off caching completely because it only seems to show up in production and I don't want to turn reboot all the solr processes on each node. Jim On Thu, Aug 1, 2013 at 12:30 PM, Roman Chyla [via Lucene] ml-node+s472066n4082014...@n3.nabble.com wrote: When you set your cache (solrconfig.xml) to size=0, you are not using a cache. so you can debug more easily roman On Thu, Aug 1, 2013 at 1:12 PM, jimtronic [hidden email]http://user/SendEmail.jtp?type=nodenode=4082014i=0 wrote: I have a query that runs slow occasionally. I'm having trouble debugging it because once it's cached, it runs fast -- under 10 ms. But throughout the day it occasionally takes up to 3 secs. It seems like it could be one of the following: 1. My autoCommit (30 and openSearcher=false) and softAutoCommit (1) settings 2. Something to do with distributed search -- There are three nodes, but only 1 shard each. 3. Just a slow query that is getting blown out of cache periodically This is in Solr 4.2. I like that it runs fast when cached, but if it's going to be blown out quickly, then I'd really like to just optimize the query to run fast uncached. *Is there any way to run a query using no caching whatsoever?* The query changes, but has *:* for the q param and 4 fq parameters. It's also trying to do field collapsing. Jim -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-uncache-a-query-to-debug-tp4082010.html Sent from the Solr - User mailing list archive at Nabble.com. -- If you reply to this email, your message will be added to the discussion below: http://lucene.472066.n3.nabble.com/How-to-uncache-a-query-to-debug-tp4082010p4082014.html To unsubscribe from How to uncache a query to debug?, click herehttp://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_codenode=4082010code=amltdHJvbmljQGdtYWlsLmNvbXw0MDgyMDEwfDEzMjQ4NDk0MTQ= . 
NAMLhttp://lucene.472066.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewerid=instant_html%21nabble%3Aemail.namlbase=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespacebreadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-uncache-a-query-to-debug-tp4082010p4082047.html Sent from the Solr - User mailing list archive at Nabble.com.
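Roman's suggestion above as a solrconfig.xml sketch (the cache classes and zero sizes are illustrative); note this requires restarting or reloading the cores, which is exactly what Jim wants to avoid in production:

```xml
<filterCache class="solr.FastLRUCache" size="0" initialSize="0" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="0" initialSize="0" autowarmCount="0"/>
```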
How to debug an OutOfMemoryError?
I've encountered an OOM that seems to come after the server has been up for a few weeks. While I would love for someone to just tell me "you did X wrong," I'm more interested in trying to debug this. So, given the error below, where would I look next? The only odd thing that sticks out to me is that my log file had grown to about 70 GB. Would that cause an error like this? This is Solr 4.2.

Jul 24, 2013 3:08:09 PM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
    at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:651)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:364)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:365)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:642)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.util.OpenBitSet.init(OpenBitSet.java:88)
    at org.apache.solr.search.DocSetCollector.collect(DocSetCollector.java:65)
    at org.apache.lucene.search.Scorer.score(Scorer.java:64)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:605)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
    at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:1060)
    at org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:763)
    at org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:880)
    at org.apache.solr.search.Grouping.execute(Grouping.java:284)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:384)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:637)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
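Not an answer from the thread, but standard HotSpot flags that make the next OOM debuggable: capture a heap dump at the moment of failure and analyze it offline (the paths and heap size are placeholders):

```shell
java -Xmx2g \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/var/log/solr/ \
  -jar start.jar
```

The resulting dump can then be opened in a heap analyzer such as Eclipse MAT to see what was holding the memory.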
Re: Node down, but not out
Wow! Awesome. Give me a bit to try to plug this into my environment.

The other way I was going to attempt this was to use the health-check file option for the ping request handler. I would have to write a separate process in Python or something that would poll ZooKeeper for active nodes, and if the current box's IP is there, create the health-check file, which would make the ping work. I'd prefer not to introduce yet another process that I need to keep running, so this looks promising.

Jim

On Wed, Jul 24, 2013 at 11:49 AM, Timothy Potter wrote:

Hi Jim,

Based on our discussion, I cooked up this solution for my book Solr in Action and would appreciate you looking it over to see if it meets your needs. The basic idea is to extend Solr's built-in PingRequestHandler to verify a replica is connected to Zookeeper and is in the active state. To enable this, install the custom JAR and then update your solrconfig.xml to use this class instead of the built-in one for the /admin/ping request handler:

<requestHandler name="/admin/ping" class="sia.ch13.ClusterStateAwarePingRequestHandler"/>

Code:

package sia.ch13;

import org.apache.solr.cloud.CloudDescriptor;
import org.apache.solr.cloud.ZkController;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.cloud.ClusterState;
import org.apache.solr.common.cloud.Slice;
import org.apache.solr.core.CoreContainer;
import org.apache.solr.core.CoreDescriptor;
import org.apache.solr.core.SolrCore;
import org.apache.solr.handler.PingRequestHandler;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Extends Solr's PingRequestHandler to check a replica's cluster status
 * as part of the health check.
 */
public class ClusterStateAwarePingRequestHandler extends PingRequestHandler {
  public static Logger log = LoggerFactory.getLogger(ClusterStateAwarePingRequestHandler.class);

  @Override
  public void handleRequestBody(SolrQueryRequest solrQueryRequest, SolrQueryResponse solrQueryResponse) throws Exception {
    // delegate to the base class to check the status of this local index
    super.handleRequestBody(solrQueryRequest, solrQueryResponse);
    // if ping status is OK, then check cluster state of this core
    if ("OK".equals(solrQueryResponse.getValues().get("status"))) {
      verifyThisReplicaIsActive(solrQueryRequest.getCore());
    }
  }

  /** Verifies this replica is active. */
  protected void verifyThisReplicaIsActive(SolrCore solrCore) throws SolrException {
    String replicaState = "unknown";
    String nodeName = "?";
    String shardName = "?";
    String collectionName = "?";
    String role = "?";
    Exception exc = null;
    try {
      CoreDescriptor coreDescriptor = solrCore.getCoreDescriptor();
      CoreContainer coreContainer = coreDescriptor.getCoreContainer();
      CloudDescriptor cloud = coreDescriptor.getCloudDescriptor();
      shardName = cloud.getShardId();
      collectionName = cloud.getCollectionName();
      role = (cloud.isLeader() ? "Leader" : "Replica");
      ZkController zkController = coreContainer.getZkController();
      if (zkController != null) {
        nodeName = zkController.getNodeName();
        if (zkController.isConnected()) {
          ClusterState clusterState = zkController.getClusterState();
          Slice slice = clusterState.getSlice(collectionName, shardName);
          replicaState = (slice != null) ? slice.getState() : "gone";
        } else {
          replicaState = "not connected to Zookeeper";
        }
      } else {
        replicaState = "Zookeeper not enabled/configured";
      }
    } catch (Exception e) {
      replicaState = "error determining cluster state";
      exc = e;
    }
    if ("active".equals(replicaState)) {
      log.info(String.format("%s at %s for %s in the %s collection is active.",
          role, nodeName, shardName, collectionName));
    } else {
      // fail the ping by raising an exception
      String errMsg = String.format("%s at %s for %s in the %s collection is not active! State is: %s",
          role, nodeName, shardName, collectionName, replicaState);
      if (exc != null) {
        throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, errMsg, exc);
      } else {
        throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, errMsg);
      }
    }
  }
}
Re: Node down, but not out
I think the best bet here would be a ping-like handler that would simply return the state of only this box in the cluster: something like /admin/state, which would return down, active, leader, or recovering. I'm not really sure where to begin, however. Any ideas?

Jim

On Mon, Jul 22, 2013 at 12:52 PM, Timothy Potter wrote:

There is, but I couldn't get it to work in my environment on Jetty, see: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201306.mbox/%3CCAJt9Wnib+p_woYODtrSPhF==v8Vx==mDBd_qH=x_knbw-bn...@mail.gmail.com%3E Let me know if you have any better luck. I had to resort to something hacky but was out of time I could devote to such unproductive endeavors ;-)
Node down, but not out
I've run into a problem recently that's difficult to debug and search for: I have three nodes in a cluster, and this weekend one of the nodes went partially down. It no longer responds to distributed updates and it is marked as GONE in the Cloud view of the admin screen. That's not ideal, but there are still two boxes up, so not the end of the world. The problem is that it is still responding to ping requests and returning queries successfully. In my setup, I have the three servers behind an haproxy load balancer so that I can distribute requests and have clients stick to a specific Solr box. Because the bad node still returns OK to the ping requests and still returns results for simple queries, the load balancer does not remove it from the group. Is there a ping-like request handler that would tell me whether the given box I'm hitting is still in the cloud? Thanks! Jim Musil
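For context, this is the shape of the health check being discussed, a sketch only (the backend name, hostnames, ports, and core name are all placeholders, not from the thread). haproxy's httpchk points at Solr's PingRequestHandler:

```
# haproxy backend sketch (hypothetical hosts/ports); /admin/ping returns
# a non-200 status when the configured ping query fails on that core
backend solr_nodes
    option httpchk GET /solr/collection1/admin/ping
    server solr1 10.0.0.1:8983 check inter 2000 rise 2 fall 3
    server solr2 10.0.0.2:8983 check inter 2000 rise 2 fall 3
```

As the message above describes, though, this check is exactly what proved insufficient: /admin/ping can keep answering OK on a node that has dropped out of the cloud, which is why a state-aware handler is being asked for.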
Re: Node down, but not out
I'm not sure why it went down exactly -- I restarted the process and lost the logs. (d'oh!) An OOM seems likely, however. Is there a setting for killing the process when Solr encounters an OOM? Thanks! Jim
Best way to match umlauts
I'm trying to make Brüno come up in my results when the user types in Bruno. What's the best way to accomplish this? Using Solr 4.2.
Re: Best way to match umlauts
Thanks! Sorry for the basic question, but I was having trouble finding the results through Google.

On Thu, Jun 13, 2013 at 10:39 AM, Jack Krupansky-2 [via Lucene] wrote:

<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>

-- Jack Krupansky
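For completeness, a minimal field type built around that charFilter (a sketch: the field type name and the rest of the analysis chain are my assumptions, not part of the reply). Applied at both index and query time, "Brüno" and "Bruno" analyze to the same tokens:

```xml
<!-- hypothetical field type; the charFilter line is the one from the reply -->
<fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

solr.ASCIIFoldingFilterFactory is a common alternative that folds accented characters without needing a mapping file.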
Re: dataimporter.last_index_time SolrCloud
Is this a bug? I can create the ticket in Jira if it is, but it's not clear to me what should be happening. I noticed that it is using the value set in the home directory, but that value does not get updated, so my imports get slower and slower. I guess I could create a cron job to update that time, but this seems kind of wonky. Thanks! Jim
dataimporter.last_index_time SolrCloud
My data-config files use the dataimporter.last_index_time variable, but it seems to have stopped working when I upgraded to 4.2. In previous 4.x versions, I saw that it was being written to ZooKeeper, but now there's nothing there. Did anything change? Or should I be doing something differently? Thanks! Jim
bootstrap_conf without restarting
I'm making fairly frequent changes to the data-config.xml files on some of my cores in a Solr Cloud setup. Is there any way to get these files active and up to ZooKeeper without restarting the instance? I've noticed that if I just launch another instance of Solr with the bootstrap_conf flag set to true, it uploads the new settings, but it dies because there's already a Solr instance running on that port. It also seems to make the original one unresponsive, or at least down in ZooKeeper's eyes. I then just restart that instance and everything is back up. It'd be nice if I could bootstrap without actually starting Solr. What's the best practice for deploying changes to data-config.xml? Thanks, Jim
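One way to do this without launching a second Solr instance (a sketch, assuming a 4.x distribution that ships the cloud-scripts/zkcli.sh tool; the ZooKeeper host, config directory, config name, and collection name are placeholders):

```shell
# upload the edited config set straight to ZooKeeper
cloud-scripts/zkcli.sh -cmd upconfig \
  -zkhost zookeeper:2181 \
  -confdir /path/to/conf \
  -confname myconf

# then reload the collection so the running nodes pick up the change
curl 'http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection'
```

This keeps bootstrap_conf as a one-time bootstrapping step rather than the ongoing deployment mechanism.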
Re: Did something change with Payloads?
Created: https://issues.apache.org/jira/browse/SOLR-4639 Thanks!

On Fri, Mar 22, 2013 at 5:01 PM, Mark Miller-3 [via Lucene] wrote:

On Mar 22, 2013, at 5:54 PM, jimtronic wrote:

Ok, this is very bizarre. If I insert more than one document at a time using the update handler, like so: [{"id":1,"foo_ap":"bar|50"},{"id":2,"foo_ap":"bar|75"}] it actually stores the same payload value (50) for both docs. That seems like a bug, no? There was a core change in 4.1 to how payloads were stored. I'm wondering if Solr is not handling them properly?

This could be - if you have compiled a lot of evidence (sorry I have not had time to follow up on this myself), please create a jira issue for more prominence. - Mark

Jim
Re: Did something change with Payloads?
Ok, this is very bizarre. If I insert more than one document at a time using the update handler, like so: [{"id":1,"foo_ap":"bar|50"},{"id":2,"foo_ap":"bar|75"}] it actually stores the same payload value (50) for both docs. That seems like a bug, no? There was a core change in 4.1 to how payloads were stored. I'm wondering if Solr is not handling them properly? Jim
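As an aside, the snippet as originally posted had an extra closing brace and unquoted keys, so it wasn't valid JSON. A well-formed update body (a sketch with hypothetical ids) looks like this, and each document carries its own payload value in the request, which is what makes the identical stored payloads look like an indexing-side bug:

```python
import json

# two docs, each with a distinct payload after the "|" delimiter
docs = [
    {"id": "1", "foo_ap": "bar|50"},
    {"id": "2", "foo_ap": "bar|75"},
]
body = json.dumps(docs)  # this is the body POSTed to /update

# round-trip the body: the payloads are distinct per document
parsed = json.loads(body)
payloads = [d["foo_ap"].split("|")[1] for d in parsed]
```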
Re: Did something change with Payloads?
Ok, yes, I have now recompiled against the 4.2.0 libraries. I needed to change a few things, but the problem still exists using the new libraries. I think the problem may actually be on the indexing side of things. Here's why:

1. I had an old index created under 4.0, running 4.0. Works as expected.
2. I used the same index, but running under 4.2. Works as expected.
3. I started fresh with 4.2 and did a fresh import of the data. Does not work.

By "does not work" I mean this: the payload values that I enter are not the payload values I get back using my custom query plugin. These are stored fields, so I can see clearly what the payload value should be. Is there any way to see what the payload value is at a very low level? Thanks! Jim
Re: Did something change with Payloads?
Something has definitely changed in 4.1. I've installed 4.0, 4.1, and 4.2 side by side and conducted the same tests on each one. Only 4.0 is returning the expected results. Apologies for cross-posting this here and in the Lucene forum, but I really can't tell if this is a Solr or a Lucene issue. In my tests, I have the following two documents and a custom query plugin that should average the payload of the term bing and use that as the score.

In 4.1 and 4.2, I get:

docs: [
  { id: 3, foo_ap: ["bing|9", "bing|7"], score: 9.0 },
  { id: 1, foo_ap: ["bing|9 bing|7", "badda|9 bing|7"], score: 9.0 }
]

Using 4.0, I get these results:

docs: [
  { id: 1, foo_ap: ["bing|9 bing|7", "badda|9 bing|7"], score: 7.665 },
  { id: 3, foo_ap: ["bing|9", "bing|7"], score: 8.0 }
]

Thanks for any input.
Did something change with Payloads?
I've been using payloads through several versions of Solr, including 4.0, but now they are no longer working correctly in 4.2. I had originally followed Grant's article here: http://searchhub.org/2009/08/05/getting-started-with-payloads/ I have a custom query plugin, {!payload}, that returns the payload value for a given term, but now it's returning erratic results. No errors, just the wrong values. Thanks for any help! Jim
Re: Did something change with Payloads?
Actually, this is more like the code I've got in place: http://sujitpal.blogspot.com/2011/01/payloads-with-solr.html Jim
Zookeeper specs
I understand this may be a better question for the ZooKeeper list, but I'm asking here because I'm not completely clear on how much load ZooKeeper takes on in a Solr Cloud setup. I'm trying to determine what specs my ZooKeeper boxes should have. I'm on EC2, so what I'm curious about is whether ZooKeeper should have high I/O, high memory, or high CPU. I've been running ZooKeeper on micro instances with no problem, but I want to understand what the potential bottlenecks might be. Thanks for any input! Jim
Practicality of enormous fields
What are the likely ramifications of having a stored field with millions of words? For example, if I had an article and wanted to store the user id of every user who has read it, stuck into a simple whitespace-delimited field. What would go wrong, and when? My tests lead me to believe this is not a problem, but it feels weird. Jim
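A rough back-of-envelope for the scenario (my own arithmetic, not from the thread; the id length and reader count are assumptions): a few million whitespace-delimited ids already makes a single stored field tens of megabytes, all of which is re-read and re-written every time that document is updated.

```python
# rough size estimate for one whitespace-delimited reader-id field
n_readers = 2_000_000        # assumed readers of a single article
avg_id_len = 8               # assumed characters per user id
field_bytes = n_readers * (avg_id_len + 1)  # +1 for the separator space
field_mb = field_bytes / (1024 * 1024)      # tens of MB for one field value
```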
Scaling SolrCloud and DIH
I'm curious how people are using DIH with SolrCloud. I have cron jobs set up to trigger the dataimports, which come from both XML files and a SQL database. Some are frequent small delta imports, while others are larger daily XML imports. Here's what I've tried:

1. Set up a micro box that sends the dataimport requests to a load balancer using cron. This didn't work because frequent requests would get spread around, and at one point all my nodes were doing dataimport requests at the same time.

2. Designate one box as the indexer and call dataimport via localhost. The problem here is that I now have a single point of failure for indexing -- I always have to have that box running.

I love that SolrCloud is distributed, so I can have 3 boxes in my cluster and not care which one goes down. I don't really know what the solution is, but I guess it would be nice if the dataimport was cloud aware, meaning that the cluster knows an update is happening on one of the boxes and won't let another one start. That way I could just send the dataimport request up through the load balancer and forget about it. Anyway, I thought I would see how others are handling this issue. Cheers, Jim
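One mitigation for the overlap problem in option 1 (my own sketch, not from the thread; the core name, URLs, and the exact status string in the DIH response are assumptions to verify against your setup): serialize imports on the calling side, and only fire when the handler reports idle.

```shell
#!/bin/sh
# hypothetical wrapper for a cron-triggered DIH delta import;
# flock ensures at most one import request per core is in flight
(
  flock -n 9 || exit 1   # previous run still holds the lock; skip this cycle
  status=$(curl -s 'http://localhost:8983/solr/mycore/dataimport?command=status&wt=json')
  echo "$status" | grep -q '"status":"idle"' || exit 1
  curl -s 'http://localhost:8983/solr/mycore/dataimport?command=delta-import'
) 9>/var/lock/mycore-dih.lock
```

This doesn't make DIH cloud aware, but it does stop a single scheduler from stacking imports on top of each other.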
Some nodes have all the load
I was doing some rolling updates of my cluster (12 cores, 4 servers) and I ended up in a situation where one node was elected leader for all the cores. This seemed very taxing to that one node. It was also still trying to serve query requests, so it slowed everything down. I'm trying to do a lot of frequent atomic updates along with some periodic DIH syncs. My solution to this situation was to try to take the supreme leader out of the cluster and let leader election start. This was not easy, as there was so much load on it that I couldn't take it out gracefully. Some of my cores became unreachable for a while. This was all under fictitious load, but it made me nervous about a high-load production situation. I'm sure there are several things I'm doing wrong in all this, so I thought I'd see what you guys think. Jim
Re: Some nodes have all the load
The load test was fairly heavy (i.e. lots of users) and designed to mimic a fully operational system with lots of users doing normal things. There were two things I gleaned from the logs: "PERFORMANCE WARNING: Overlapping onDeckSearchers=2" appeared for several of my more active cores, and the non-leaders were throwing errors saying that the leader was not responding while trying to forward updates. (Sorry, can't find that specific error now.) My best guess is that it has something to do with the commits:

a. frequent user-generated writes using /update?commitWithin=500&waitFlush=false&waitSearcher=false
b. softCommit set to 3000
c. autoCommit set to 300,000 and openSearcher false
d. frequent periodic DIH updates, which I guess are commit=true by default

Should I omit commitWithin, set DIH to commit=false, and just let soft commit and autoCommit do their jobs? Cheers, Jim
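For reference, the settings in (b) and (c) as they would appear in solrconfig.xml (a sketch of the configuration described above, not a recommendation from the thread):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- hard commit: flush to disk every 5 minutes, no new searcher -->
  <autoCommit>
    <maxTime>300000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- soft commit: new documents become visible within 3 seconds -->
  <autoSoftCommit>
    <maxTime>3000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With this in place, per-request commitWithin and DIH's default commit=true mostly add extra searcher openings on top of the scheduled ones, which is consistent with the overlapping onDeckSearchers warning.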
Re: Feeding Custom QueryParser with Nested Query
It seems like I could accomplish this by following the JoinQParserPlugin logic. I can actually get pretty close using the join query, but I need to do some extra math in the middle. The difference in my case is that I need to access both the id and the score. I *think* the logic would go something like this:

1. Do a sub-query to get doc ids and scores.
2. Feed the resulting doc ids into another query.
3. Write a custom scorer that uses the score from the sub-query to determine the scores of the final results.

Thanks for any suggestions... Jim
optimal maxWarmingSearchers in solr cloud
The notes for maxWarmingSearchers in solrconfig.xml state: "Recommend values of 1-2 for read-only slaves, higher for masters w/o cache warming." Since Solr Cloud nodes can be both leader and non-leader depending on the current state of the cloud, what would be the optimal setting here? Thanks! Jim
Re: Multiple Collections in one Zookeeper
Ok, I'm a little confused. I had originally bootstrapped ZooKeeper using a solr.xml file which specified the following cores: cats, dogs, birds. In my /solr/#/cloud?view=tree view I see that I have:

/collections
  /cats
  /dogs
  /birds
/configs
  /cats
  /dogs
  /birds

When I launch a new server and connect it to ZooKeeper, it creates all three collections. What I'd like to do is move cats to its own set of boxes. When I run:

java -DzkHost=zookeeper:9893/cats -jar start.jar

or

java -DzkHost=zookeeper:9893,zookeeper:9893/cats -jar start.jar

I get this error: SEVERE: Could not create Overseer node. For simplicity, I'd like to have only one ZooKeeper ensemble.
Feeding Custom QueryParser with Nested Query
I've written a custom query parser that we'll call {!doFoo}, which takes two parameters: a field name and a space-delimited list of values. The parser does some calculations between the list of values and the field in question. In some cases, the list is quite long, and as it turns out, the core already has the information. I think most of my latency in this operation is just passing big lists around. Ideally, I'd like to accomplish something like this:

{!doFoo f=my_field v='query(...)'}

Or, even better, if I could just pass a parameter in and get the results:

{!doFoo with='bar'}

Thanks for any advice! Jim
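One built-in piece that may partly help (a sketch; doFoo and the parameter names are of course hypothetical): local-params values can dereference another request parameter with $, so the long list only has to appear once per request:

```
/select?q={!doFoo f=my_field v=$vals}&vals=17 42 99 103
```

This removes duplication within the request but still ships the list over the wire each time; keeping the list server-side would require the parser itself to look it up, along the lines of the join-style approach discussed in the thread.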
Multiple Collections in one Zookeeper
Hi, I have a SolrCloud cluster running several cores and pointing at one ZooKeeper. For performance reasons, I'd like to move one of the cores onto its own dedicated cluster of servers. Can I use the same ZooKeeper to keep track of both clusters? Thanks! Jim
Nodes out of sync, deletes fail
I'm not sure how it happened, but one of my nodes has different data than the others. When I try to delete the offending document by posting JSON to the /update URL, it hangs, and after a minute it just fails with no reply. I disconnected the offending node from the cloud and was able to delete the problem docs without issue. It seems as though there's a real problem here, though, if a delete tries to propagate to other nodes that don't have that document. I tried deleting by id and by query.
Re: Nodes out of sync, deletes fail
solrspec: 5.0.0.2012.12.03.13.10.02
Re: Nodes out of sync, deletes fail
Oddly, not much info there. Here's what I do know:

- I had a three-node cluster running.
- Adding documents was also failing in the same exact way.
- Updates/deletes would make it to the elected leader, but then never show up on the other nodes.
- Eventually, after 30 seconds or so, the write to the leader would succeed, but it never showed up on any other node. This caused my nodes to be out of sync.
- Once I restarted Solr on the other nodes, everything worked great. Updates/deletes worked immediately.

It seems odd that the write should succeed on the leader even though it didn't work on the other nodes. Jim

On Wed, Feb 27, 2013 at 1:06 PM, Mark Miller-3 [via Lucene] wrote:

You are working off trunk? Do you have any interesting info in the logs? - Mark
Re: Nodes out of sync, deletes fail
Mark wrote:

"Currently, a leader does an update locally before sending in parallel to all replicas. If we can't send an update to a replica, because it crashed, or because of some other reason, we ask that replica to recover if we can. In that case, it's either gone and will come back and recover, or, oddly, the request failed and it's still in normal operations, in which case we ask it to recover because something must be wrong. So if a leader can't send to any replicas, he's going to assume they are all screwed (they are if he can't send to them) and think he is the only part of the cluster. It might be nice if we had a param for you to say, consider this a fail unless it hits this many replicas - but still the leader is going to have carried out the request."

This seems to violate the strong consistency model, doesn't it? If a write doesn't succeed at a replica, it shouldn't succeed anywhere. Cassandra seems to have this same problem -- http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure -- except that it returns a timeout error and saves the hint for later. I was assuming that Solr was acting like CONSISTENCY ALL for writes and CONSISTENCY ANY for reads. If that were the case, I'd like to ensure that my nodes don't get out of sync when an otherwise healthy node can't perform the update, and that the original write would be rolled back.

Mark wrote:

"What you need to figure out is why the leader could not talk to the replicas - very weird to not see log errors about that! Were the replicas responding to requests? OOMs are bad for SolrCloud, by the way - a JVM that has OOMed is out of control - you really want to use the option that kills the JVM on OOMs."

This does seem to be the biggest problem. The replica was responding normally. I'll try upping the memory and getting the latest version.
Re: SolrCloud as my primary data store
Yes, these are good points. I'm using Solr to leverage user preference data, and I need that data available in real time. SQL just can't do the kinds of things I'm able to do in Solr, so I have to wait until the write (a user action, a user preference, etc.) gets to Solr from the db anyway. I'm kind of curious how many single documents I can send through via the JSON update in a day. Millions would be nice, but I wonder what the upper limit would be.
SolrCloud as my primary data store
Now that I've been running Solr Cloud for a couple of months and gotten comfortable with it, I think it's time to revisit this subject. When I search online for the topic of using Solr as a primary db, I get lots of discussions from 2-3 years ago, and usually they point out a lot of hurdles that have now largely been eliminated with the release of Solr Cloud. I've stopped using the standard method of writing to my db and pushing out periodically to Solr. Instead, I'm writing simultaneously to Solr and the db, with less frequent syncs from the database just to be safe. I find this to be much faster and easier than doing delta imports via the DIH handler. In fact, it's gone so smoothly that I'm really wondering why I need to keep writing to the db at all. I've always got several nodes running, and launching new ones takes only minutes to be fully operational. I'm taking frequent snapshots, and my test restores have been painless and quick. So, if I'm looking at other NoSQL solutions like MongoDB or Cassandra, why wouldn't I just use Solr? It's distributed, fast, and stable. It has a great HTTP API and it's nearly schema-less using dynamic fields. And, most importantly, it offers the most powerful query language available. I'd really like to hear from someone who has made the leap. Cheers, Jim
DIH clean=true behavior in SolrCloud
I'm confused about the behavior of clean=true with the DataImportHandler. When I use clean=true on just one instance, it doesn't blow all the data out until the import succeeds. In a cluster, however, it appears to blow all the data out of the other nodes first, then starts adding new docs. Am I wrong about this? Jim
If bootstrap a new solrconfig file to zookeeper, do I need to restart all nodes?
I have a simple cluster of three servers and a dedicated ZooKeeper server running separately. If I make a change to my solrconfig.xml file on one of the servers and restart that server with the bootstrap_conf=true option, will the change be sent to the other nodes? Or will I have to log into each node and restart it?
Filter results based on custom scoring and _val_
I'm using Solr function queries to generate my own custom score. I achieve this using something along these lines:

q=_val_:my_custom_function()

This populates the score field as expected, but it also includes documents that score 0. I need a way to filter the results so that scores at or below zero are not included. I realize that I'm using score in a non-standard way, and that normally the score Lucene/Solr produces is not absolute. However, producing my own score works really well for my needs. I've tried using {!frange l=0}, but this causes the score for all documents to be 1.0. I've found that I can do the following:

q=*:*&fl=foo:my_custom_function()&fq={!frange l=1}my_custom_function()

This puts my custom score into foo, but it requires me to list all the logic twice. Sometimes my logic is very long.
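One way to avoid repeating the function (a sketch using parameter dereferencing in local params; my_custom_function is the placeholder from the question): define the function once as its own request parameter and reference it with $ from both the main query and the frange filter. The {!func} query keeps the function value as the score, while the frange fq only filters.

```
/select?q={!func v=$myfunc}&fq={!frange l=0 incl=false v=$myfunc}&myfunc=my_custom_function()
```

incl=false makes the lower bound exclusive, so only documents with a strictly positive function value survive. The function is still evaluated by both the query and the filter, but its text appears only once in the request.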
Re: How to post atomic updates using xml
For multi-valued fields, you can use "add" to add a value to the list. If the value already exists, it will be there twice. "set" will replace the entire list with the value(s) that you specify. There's currently no method to remove a single value, although the issue has been logged: https://issues.apache.org/jira/browse/SOLR-3862 You can always edit the list by pulling down all the values and uploading the new set. Jim
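Since the question was about XML specifically, a minimal sketch of an atomic "add" on a multi-valued field (the field name and id are hypothetical):

```xml
<add>
  <doc>
    <field name="id">99</field>
    <!-- appends "python" to the existing skills list -->
    <field name="skills" update="add">python</field>
    <!-- update="set" here would instead replace the whole list -->
  </doc>
</add>
```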
Re: need best solution for indexing and searching multiple, related database tables
I'm not sure if this will be relevant for you, but this is roughly what I do. Apologies if it's too basic. I have a complex view that normalizes all the data that I need together -- from over a dozen different tables. For one-to-many and many-to-many relationships, I have SQL turn the data into a comma-delimited string, which the data import handler and the RegexTransformer will split into a multi-valued field. So you might have a document like this:

<id>123</id>
<name_s>John Smith</name_s>
<attr_products>
  <str>python</str>
  <str>java</str>
  <str>javascript</str>
</attr_products>

Often I've found that I don't really need the data together in one Solr core, and it works better to just create a separate core for that schema.
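A sketch of the DIH side of that approach (data-config.xml; the view, column, and field names are made up): the RegexTransformer's splitBy turns the comma-delimited SQL string into a multi-valued field.

```xml
<entity name="person" transformer="RegexTransformer"
        query="SELECT id, name_s, products_csv FROM person_view">
  <field column="id" />
  <field column="name_s" />
  <!-- splits "python,java,javascript" into multiple attr_products values -->
  <field column="attr_products" sourceColName="products_csv" splitBy="," />
</entity>
```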
Re: some general solr 4.0 questions
I've got a setup like yours -- lots of cores and replicas, but no need for shards -- and here's what I've found so far:

1. ZooKeeper is tiny. I would think network I/O is going to be the biggest concern.

2. I think this is more about high availability than performance. I've been experimenting with taking down parts of my setup to see what happens. When ZooKeeper goes down, the Solr instances still serve requests. It appears, however, that updating and replication stop. I want to make frequent updates, so this is a big concern for me.

3. On EC2, I launch a server which is configured to register itself with my ZooKeeper box upon launch. When it's ready, I add it to my load balancer. Theoretically, ZooKeeper would help further balance the nodes, but right now I find those queries to be too slow. Since the load balancer is already distributing the load, I'm adding the parameter distrib=false to my queries. This forces the request to stay on the box the load balancer chose.

4. This is interesting. I started down the path of wanting to maintain a master, but I've moved towards a system where all of my update requests go through my load balancer. Since ZooKeeper dynamically elects a leader, no matter which box gets the update, the leader gets it anyway. This is very nice for me because I want all my Solr instances to be identical.

Since there's not a lot of documentation on this yet, I hope other people share their findings, too.
Backup strategy for SolrCloud
I'm trying to determine my options for backing up data from a SolrCloud cluster. For me, bringing up my cluster from scratch can take several hours. It's way faster to take snapshots of the index periodically and then use one of these when booting a new instance. Since I use static xml files and delta-imports, everything catches up quickly. Sorry if this is a dumb question, but where do I pull the snapshots from? Zookeeper? Any box in the cluster? The leader? Thanks! Jim -- View this message in context: http://lucene.472066.n3.nabble.com/Backup-strategy-for-SolrCloud-tp4009291.html Sent from the Solr - User mailing list archive at Nabble.com.
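Not an answer from the thread, but one way snapshots can be pulled is the ReplicationHandler's backup command, issued against the core you want to snapshot. A sketch (host, core name, and backup location are placeholder assumptions):

```python
from urllib.parse import urlencode

# Sketch: trigger an index snapshot via the ReplicationHandler's
# backup command. Host, core name, and location are assumptions;
# the snapshot is written on the node that receives the request.
params = {"command": "backup", "location": "/var/backups/solr"}
url = "http://localhost:8983/solr/mycore/replication?" + urlencode(params)
print(url)
```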
Re: deleting a single value from multivalued field
Just added this today. https://issues.apache.org/jira/browse/SOLR-3862 -- View this message in context: http://lucene.472066.n3.nabble.com/deleting-a-single-value-from-multivalued-field-tp4009092p4009292.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Backup strategy for SolrCloud
I'm thinking about catastrophic failure and recovery. If, for some reason, the cluster should go down or become unusable and I simply want to bring it back up as quickly as possible, what's the best way to accomplish that? Maybe I'm thinking about this incorrectly? Is this not a concern? -- View this message in context: http://lucene.472066.n3.nabble.com/Backup-strategy-for-SolrCloud-tp4009291p4009297.html Sent from the Solr - User mailing list archive at Nabble.com.
Help with slow Solr Cloud query
Hi, I've got a setup as follows:

- 13 cores
- 2 servers
- running Solr 4.0 Beta with numShards=1 and an embedded zookeeper.

I'm trying to figure out why some complex queries are running so slowly in this setup versus quickly in standalone mode. Given a query like /select?q=(some complex query), it runs fast and gets faster (caches) when only running one server:

1. ?fl=*&q=(complex query)&wt=json&rows=24 (QTime 3)

When I issue the same query to the cluster and watch the logs, it looks like it's actually performing the query 3 times, like so:

1. ?q=(complex query)&distrib=false&wt=javabin&rows=24&version=2&NOW=1347911018556&shard.url=(server1)|(server2)&fl=id,score&df=text&start=0&isShard=true&fsv=true (QTime 2)

2. ?ids=(ids from query 1)&distrib=false&wt=javabin&rows=24&version=2&df=text&fl=*&shard.url=(server1)|(server2)&NOW=1347911018556&start=0&q=(complex query)&isShard=true (QTime 4)

3. ?fl=*&q=(complex query)&wt=json&rows=24 (QTime 459)

Why is it performing #3? It already has everything it needs in #2, and #3 seems to be really slow even when warmed and cached. As stated above, this query is fast when running on a single server that is warmed and cached. Since my query is complex, I could understand some slowness if I was attempting this across multiple shards, but since there's only one shard, shouldn't it just pick one server and query it? Thanks! Jim -- View this message in context: http://lucene.472066.n3.nabble.com/Help-with-slow-Solr-Cloud-query-tp4008448.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to post atomic updates using xml
Actually, the correct method appears to be this: an atomic update in JSON:

{ "id" : "book1", "author" : {"set" : "Neal Stephenson"} }

the same in XML:

<add>
  <doc>
    <field name="id">book1</field>
    <field name="author" update="set">Neal Stephenson</field>
  </doc>
</add>

Jim -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-post-atomic-updates-using-xml-tp4007323p4007517.html Sent from the Solr - User mailing list archive at Nabble.com.
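The XML form above can be generated with the standard library; a sketch that only builds the payload (posting it to /update is left out, and the document values are the ones from the post):

```python
import xml.etree.ElementTree as ET

# Build the atomic-update XML shown above: a set operation on the
# "author" field of document "book1".
add = ET.Element("add")
doc = ET.SubElement(add, "doc")
ET.SubElement(doc, "field", name="id").text = "book1"
ET.SubElement(doc, "field", name="author", update="set").text = "Neal Stephenson"

payload = ET.tostring(add, encoding="unicode")
print(payload)  # one-line <add><doc>...</doc></add> payload
```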
How to post atomic updates using xml
There's a good intro to atomic updates here: http://yonik.com/solr/atomic-updates/ but it does not describe how to structure the updates using xml. Anyone have any idea on how these would look? Thanks! Jim -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-post-atomic-updates-using-xml-tp4007323.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to post atomic updates using xml
Figured it out. In JSON:

{ "id" : "book1", "author" : {"set" : "Neal Stephenson"} }

In XML:

<add>
  <doc>
    <field name="id">book1</field>
    <field name="author" set="Neal Stephenson"></field>
  </doc>
</add>

This seems to work. Jim -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-post-atomic-updates-using-xml-tp4007323p4007325.html Sent from the Solr - User mailing list archive at Nabble.com.
Atomic Updates, Payloads, Non-stored data
Hi, I'm using payloads to tie a value to an attribute for a document -- eg a user's rating for a document. I do not store this data, but I index it and access the value through function queries. I was really excited about atomic updates, but they do not work for me because they are blowing out all of my non-stored payload data. I can make the fields stored, but that is not desirable as in some cases there's a lot of data. I was wondering how feasible it would be for me to modify the DistributedUpdateProcessor so that it preserves my non-stored payloads while performing the atomic updates. Thanks! Jim -- View this message in context: http://lucene.472066.n3.nabble.com/Atomic-Updates-Payloads-Non-stored-data-tp4006678.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How can I use a function or fieldvalue as the default for query(subquery, default)?
I was able to use solr 3.1 functions to accomplish this logic: /solr/select?q=_val_:sum(query({!dismax qf=text v='solr rocks'}),product(map(query({!dismax qf=text v='solr rocks'},-1),0,100,0,1), product(this_field,that_field))) -- View this message in context: http://lucene.472066.n3.nabble.com/How-can-I-use-a-function-or-fieldvalue-as-the-default-for-query-subquery-default-tp3924172p3926183.html Sent from the Solr - User mailing list archive at Nabble.com.
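My reading of the map() trick above, as a plain-Python sketch (not Solr code): map(query(...,-1),0,100,0,1) acts as a "did not match" indicator, so the sum takes the subquery score when it matches and falls back to product(this_field,that_field) otherwise. The score range [0,100] is the assumption baked into the map arguments.

```python
# Plain-Python sketch of the fallback the function query encodes.
# query(sub, -1) returns the score if sub matches, else -1;
# map(x, 0, 100, 0, 1) maps in-range scores to 0 and -1 to 1,
# turning the subquery result into a no-match indicator.
def fallback_score(sub_score, this_field, that_field):
    q = sub_score if sub_score is not None else -1.0    # query(sub, -1)
    indicator = 0.0 if 0.0 <= q <= 100.0 else 1.0       # map(q, 0, 100, 0, 1)
    return max(q, 0.0) + indicator * this_field * that_field

print(fallback_score(1.5, 2.0, 3.0))   # 1.5 (subquery matched)
print(fallback_score(None, 2.0, 3.0))  # 6.0 (fallback: 2.0 * 3.0)
```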
How can I use a function or fieldvalue as the default for query(subquery, default)?
Hi, For the solr function query(subquery, default) I'd like to be able to specify the value of another field or even a function as the default. For example, I might have: /solr/select?q=_val_:query({!dismax qf=text v='solr rocks'}, product(this_field, that_field)) Is this possible? I see that Boolean functions are coming in Solr 4, but it is unclear whether these would accept functions as defaults. Thanks, Jim -- View this message in context: http://lucene.472066.n3.nabble.com/How-can-I-use-a-function-or-fieldvalue-as-the-default-for-query-subquery-default-tp3924172p3924172.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Concatenate multivalued DIH fields
I solved this problem using the flatten=true attribute. Given this XML:

<people>
  <person>
    <names>
      <name>
        <firstName>Joe</firstName>
        <lastName>Smith</lastName>
      </name>
    </names>
  </person>
</people>

and this data import handler field:

<field column="attr_names" xpath="/people/person/names/name" flatten="true" />

attr_names is a multiValued field in my schema.xml. The flatten attribute tells Solr to take all the text from the specified node and below. -- View this message in context: http://lucene.472066.n3.nabble.com/Concatenate-multivalued-DIH-fields-tp2749988p2875435.html Sent from the Solr - User mailing list archive at Nabble.com.