Solr autocomplete keyword and geolocation based
I am looking to get autocomplete suggestions from Solr based on a keyword as well as geolocation. Is there a way, using the 'Suggester' component or otherwise, for Solr to take multiple fields into account for autocompletion? For example, if I have a restaurants database and I want suggestions for the keyword 'Piz', the results should be based both on the keyword 'Piz' and on proximity to a given latitude/longitude. Is there a way to do this in Solr? Thanks.

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-autocomplete-keyword-and-geolocation-based-tp4025466.html Sent from the Solr - User mailing list archive at Nabble.com.
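For what it's worth, one common way to combine the two is a prefix (edge-ngram) query plus a {!geofilt} filter, sorted by distance. The sketch below is hedged: the field names `name_prefix` (an EdgeNGram-analyzed copy of the restaurant name) and `location` (a spatial field) are hypothetical and would need to match your schema.

```python
from urllib.parse import urlencode

def suggest_params(prefix, lat, lon, radius_km, rows=10):
    """Combine keyword autocompletion with a geo filter.

    Hypothetical schema: 'name_prefix' is an EdgeNGram-analyzed copy of the
    restaurant name; 'location' is a lat/lon spatial field.
    """
    return {
        "q": "name_prefix:%s" % prefix.lower(),
        # geofilt restricts results to within radius_km of the given point
        "fq": "{!geofilt sfield=location pt=%s,%s d=%s}" % (lat, lon, radius_km),
        # sort nearest-first
        "sort": "geodist(location,%s,%s) asc" % (lat, lon),
        "rows": rows,
        "wt": "json",
    }

# e.g. GET /solr/restaurants/select?<encoded params>
qs = urlencode(suggest_params("Piz", 40.7484, -73.9857, 5))
```

Whether you drive this through the Suggester or a plain /select handler, the key point is that filtering and sorting by geodist happen independently of the keyword match.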
Re: SOLR4 (sharded) and join query
On Thu, Dec 6, 2012 at 6:47 PM, Erick Erickson erickerick...@gmail.com wrote: "see http://wiki.apache.org/solr/DistributedSearch -- joins aren't supported in distributed search."

Any time you have more than one shard in SolrCloud you are, by definition, doing distributed search. Joins are supported, but there is a limitation: each join is calculated locally, per shard, based on common terms within that shard. And you can ensure that certain sets of related documents land on the same shard with the new document routing feature: https://issues.apache.org/jira/browse/SOLR-2592 -Yonik http://lucidworks.com
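To make Yonik's two points concrete, here is a hedged sketch (field names, routing key, and document shapes are invented for illustration): a composite routing key of the form "key!id" co-locates related documents on one shard, and a {!join} query then resolves locally within that shard.

```python
def routed_id(route_key, doc_id):
    # With document routing (SOLR-2592), ids of the form "key!id" hash on
    # "key", so all docs sharing the key land on the same shard.
    return "%s!%s" % (route_key, doc_id)

def shard_local_join(from_field, to_field, restriction):
    # Each shard evaluates this join only over its own documents, which is
    # why co-locating related docs matters.
    return "{!join from=%s to=%s}%s" % (from_field, to_field, restriction)

# Parent and child share the "acme" routing key, so they index to one shard.
parent = {"id": routed_id("acme", "p1"), "type": "parent"}
child = {"id": routed_id("acme", "c1"), "type": "child", "parent_id": "p1"}
q = shard_local_join("parent_id", "id", "type:child")
```

The join query above finds parents whose `id` matches some child's `parent_id`; because the join is per-shard, it only works if both documents were routed together.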
Re: Modeling openinghours using multipoints
Thanks for the discussion, I've added this to my bag of tricks, way cool! Erick

On Sat, Dec 8, 2012 at 10:52 PM, britske gbr...@gmail.com wrote: Brilliant! Got some great ideas for this. Indeed, all sorts of use cases that involve multiple temporal ranges could benefit. E.g., another guy on Stack Overflow asked me about this some days ago: he wants to model multiple temporary offers per product (free shipping for Christmas, 20% discount for Black Friday, etc.). All possible with this out of the box. Factor 'offer category' into x and y as well for some extra powerful querying. Yup, I'm enthusiastic about it, which I'm sure you can tell :) Thanks a lot David, Cheers, Geert-Jan

Sent from my iPhone

On 9 dec. 2012, at 05:35, David Smiley (@MITRE.org) [via Lucene] ml-node+s472066n4025434...@n3.nabble.com wrote:

britske wrote: That's seriously awesome! One change in the query though. You described how "to query for a business that is open during at least some part of a given time duration". I want to query for a business that is open during at least the entire given time duration. Feels like a small difference but probably isn't (I'm still wrapping my head around the intersect query, I must admit).

So this would be a slightly different rectangle query. Interestingly, you simply swap the locations in the rectangle where you put the start and end time. In summary:

Indexed span CONTAINS query span: minX minY maxX maxY = 0, end, start, *
Indexed span INTERSECTS (i.e. OVERLAPS) query span: minX minY maxX maxY = 0, start, end, *
Indexed span WITHIN query span: minX minY maxX maxY = start, 0, *, end

I'm using '*' here to denote the max possible value. At some point I may add that as a feature. That was a fun exercise! I give you credit for prodding me in this direction, as I'm not sure this use of spatial would have occurred to me otherwise.

britske wrote: Moreover, any indication on performance? Should, say, 50,000 docs with about 100-200 points each (1 or 2 open-close spans per day) be ok? (I know 'your mileage may vary' etc., but just a guesstimate :)

You should have absolutely no problem. The real clincher in your favor is the fact that you only need 9600 discrete time values (so you said), not Long.MAX_VALUE. Using Long.MAX_VALUE would simply not be possible with the current implementation because it uses doubles, which have 52 bits of precision, not the 64 that would be required to be a complete substitute for any time/date. Even given the 52 bits, a quad SpatialPrefixTree with maxLevels=52 would probably not perform well, or might fail; not sure. Eventually, when I have time to work on an implementation that can be based on a configurable number of grid cells (not unlike how you can configure precisionStep on the Trie numeric fields), 52 should be no problem. I'll have to remember to refer back to this email on the approach if I create a field type that wraps this functionality. ~ David

britske wrote: Again, this looks good! Geert-Jan

2012/12/8 David Smiley (@MITRE.org) [via Lucene] [hidden email]:

Hello again Geert-Jan! What you're trying to do is indeed possible with Solr 4 out of the box. Other terminology people use for this is "multi-value time duration". This creative solution is a pure application of spatial without the geospatial notion -- we're not using an earth or other sphere model; it's a flat plane. So no need to refer to longitude/latitude; it's x and y. I would put the opening time into x, and the closing time into y. To express a point, use "x y" (x, space, y), and supply this as a string to your SpatialRecursivePrefixTreeFieldType-based field for indexing. You can give it multiple values and it will work correctly; this is one of RPT's main features that sets it apart from Solr 3 spatial.

To query for a business that is open during at least some part of a given time duration, say 6-8 o'clock, the query would look like openDuration:Intersects(minX minY maxX maxY), where you put 0 for minX (always), 6 for minY (start time), 8 for maxX (end time), and the largest possible value for maxY. You wouldn't actually use 6 and 8; you'd use the number of 15-minute intervals since your epoch for the equivalent time span. You'll need to configure the field correctly: geo=false, worldBounds="0 0 maxTime maxTime" (substituting an appropriate value for maxTime based on your unit of time, i.e. the number of 15-minute intervals you need), and distErrPct=0 (full precision). Let me know how this works for you. ~ David

Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
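As a hedged illustration of the scheme above (the interval math and the `openDuration` field name are mine, not from the thread; the epoch here is a single week of 15-minute cells rather than the ~9600 cells discussed):

```python
# One week of 15-minute intervals: 7 days * 96 intervals/day = 672 cells.
MAX_T = 7 * 96  # stands in for '*', the largest possible value

def interval(day, hour, minute=0):
    """Index of a 15-minute interval since the start of the week (day 0 = Monday)."""
    return day * 96 + hour * 4 + minute // 15

def indexed_point(open_t, close_t):
    # x = opening time, y = closing time; one "x y" string per open-close span,
    # and the RPT field accepts many of these per document.
    return "%d %d" % (open_t, close_t)

def open_during_some_part(field, start, end):
    # Indexed span INTERSECTS query span: minX=0, minY=start, maxX=end, maxY=*
    return "%s:Intersects(0 %d %d %d)" % (field, start, end, MAX_T)

def open_during_entire(field, start, end):
    # Indexed span CONTAINS query span: minX=0, minY=end, maxX=start, maxY=*
    return "%s:Intersects(0 %d %d %d)" % (field, end, start, MAX_T)

# Monday 06:00-20:00 indexes as the point "24 80"; the query asks for
# businesses open for all of Monday 08:00-10:00.
point = indexed_point(interval(0, 6), interval(0, 20))
q = open_during_entire("openDuration", interval(0, 8), interval(0, 10))
```

A point (open, close) falls inside the CONTAINS rectangle exactly when open <= start and close >= end, which is the "open during the entire duration" condition.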
Re: SolrCloud - Query performance degrades with multiple servers
Thank you very much, will wait for the results from your tests.

From: Mark Miller-3 [via Lucene] ml-node+s472066n4025457...@n3.nabble.com
Date: Saturday, December 8, 2012 11:08 PM
To: Sarkar, Sauvik sausar...@ebay.com
Subject: Re: SolrCloud - Query performance degrades with multiple servers

If that's true, we will fix it for 4.1. I can look closer tomorrow. Mark

Sent from my iPhone

On Dec 9, 2012, at 2:04 AM, sausarkar [hidden email] wrote: Spoke too early; it seems that SolrCloud is still distributing queries to all the servers even if numShards=1. We are seeing POST requests to all servers in the cluster, please let me know what the solution is. Here is an example (the variable isShard should be false in our case, as there is a single shard, please help):

POST /solr/core0/select HTTP/1.1
Content-Charset: UTF-8
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
Content-Length: 991
Host: server1
Connection: Keep-Alive

lowercaseOperators=true&mm=70%&fl=EntityId&df=EntityId&q.op=AND&q.alt=*:*&qs=10&stopwords=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&isShard=true&shard.url=server1:9090/solr/core0/|server2:9090/solr/core0/|server3:9090/solr/core0/&NOW=1354918880447&wt=javabin&version=2

Re: SolrCloud - Query performance degrades with multiple servers Dec 06, 2012; 6:29pm, by Mark Miller-3

On Dec 6, 2012, at 5:08 PM, sausarkar [hidden email] wrote: We solved the issue by explicitly adding the numShards=1 argument to the Solr startup script. Is this a bug?

Sounds like it... perhaps related to SOLR-3971... not sure though. - Mark
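When diagnosing this kind of fan-out, one hedged trick (host and core names below are placeholders) is to hit a single core directly with distrib=false, which asks Solr for a purely local, non-distributed search, and compare timings against the normal distributed request:

```python
from urllib.parse import urlencode

def local_query_url(host, core, q="*:*", rows=10):
    # distrib=false asks the core to answer from its local index only,
    # with no fan-out to other shards or replicas.
    params = urlencode({"q": q, "rows": rows, "distrib": "false", "wt": "json"})
    return "http://%s:9090/solr/%s/select?%s" % (host, core, params)

url = local_query_url("server1", "core0")
```

If the distrib=false timing is fine but the default request is slow, the overhead is in the distributed stage rather than in the index itself.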
Re: error opening index solr 4.0 with lukeall-4.0.0-ALPHA.jar
Hi, Thanks for the package, it is useful. I decided to adapt it to Lucene trunk (ver. 5.0-SNAPSHOT). The package with the source code and a binary (dir: target) can be found at the same link. It worked fine against a trunk Solr/Lucene index. There could be bugs though; please drop a line if you test it and find any. Regards, Dmitry Kan

On Fri, Dec 7, 2012 at 5:50 PM, Neil Ireson n.ire...@dcs.shef.ac.uk wrote: In case it is of use, I have just uploaded an updated and mavenised version of the Luke code to the Luke discussion list, see https://groups.google.com/d/topic/luke-discuss/MNT_teDxVno/discussion . It seems to work with the latest (4.0.0 and 4.1-SNAPSHOT) versions of Lucene. N
Re: Modeling openinghours using multipoints
If these are not raw times, but quantized on-the-hour, would it be faster to create a bit map of hours and then query across the bit maps?

On Sun, Dec 9, 2012 at 8:06 AM, Erick Erickson erickerick...@gmail.com wrote: [...]
Re: SolrCloud stops handling collection CREATE/DELETE (but responds HTTP 200)
Thanks. It looks like my cluster is in a wedged state after I tried to delete a collection that didn't exist. There are about 80 items in the queue after the delete op (which it can't get past). Is that a known bug? I guess for now I'll just check that a collection exists before sending any deletes. :) Brett

On Fri, Dec 7, 2012 at 10:50 AM, Mark Miller markrmil...@gmail.com wrote: Anything in any of the other logs (the other nodes)? The key is getting the logs from the node designated as the overseer - it should hopefully have the error. Right now, because you pass this stuff off to the overseer, you will always get back a 200 - there is a JIRA issue that addresses this though (collection API responses) and I hope to get it committed soon. - Mark

On Dec 7, 2012, at 7:26 AM, Brett Hoerner br...@bretthoerner.com wrote: For what it's worth, this is the log output with DEBUG on:

Dec 07, 2012 2:00:48 PM org.apache.solr.handler.admin.CollectionsHandler handleCreateAction
INFO: Creating Collection : action=CREATE&name=foo&numShards=4
Dec 07, 2012 2:01:03 PM org.apache.solr.core.SolrCore execute
INFO: [15671] webapp=/solr path=/admin/system params={wt=json} status=0 QTime=5
Dec 07, 2012 2:01:15 PM org.apache.solr.handler.admin.CollectionsHandler handleDeleteAction
INFO: Deleting Collection : action=DELETE&name=default
Dec 07, 2012 2:01:20 PM org.apache.solr.core.SolrCore execute

Neither the CREATE nor the DELETE actually did anything, though. (Again, HTTP 200 OK.) Still stuck here, any ideas? Brett

On Tue, Dec 4, 2012 at 7:19 PM, Brett Hoerner br...@bretthoerner.com wrote: Hi, I have a Cloud setup of 4 machines. I bootstrapped them with 1 collection, which I called "default" and haven't used since. I'm using an external ZK ensemble that was completely empty before I started this cloud. Once I had all 4 nodes in the cloud I used the collection API to create the real collections I wanted. I also tested that deleting works. For example:

# this worked
curl "http://localhost:8984/solr/admin/collections?action=CREATE&name=15678&numShards=4"
# this worked
curl "http://localhost:8984/solr/admin/collections?action=DELETE&name=15678"

Next, I started my indexer service, which happily sent many, many updates to the cloud. Queries against the collections also work just fine. Finally, a few hours later, I tried doing a create and a delete. Both operations did nothing, although Solr replied with a 200 OK:

$ curl -i "http://localhost:8984/solr/admin/collections?action=CREATE&name=15679&numShards=4"
HTTP/1.1 200 OK
Content-Type: application/xml; charset=UTF-8
Transfer-Encoding: chunked

<?xml version="1.0" encoding="UTF-8"?>
<response><lst name="responseHeader"><int name="status">0</int><int name="QTime">3</int></lst></response>

There is nothing in the stdout/stderr logs, nor the Java logs (I have it set to WARN). I have tried bouncing the nodes and it doesn't change anything. Any ideas? How can I further debug this, or what else can I provide?
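Brett's workaround ("check that a collection exists before sending any deletes") might look like the hedged sketch below. How you obtain the set of live collections is an assumption here (e.g. by reading clusterstate.json from ZooKeeper); the function names are invented.

```python
from urllib.parse import urlencode

def safe_delete_url(base_url, name, live_collections):
    # Refuse to build a DELETE for an unknown collection, since deleting a
    # nonexistent one could wedge the overseer queue (the bug in this thread).
    if name not in live_collections:
        raise ValueError("collection %r not found; skipping DELETE" % name)
    return "%s/admin/collections?%s" % (
        base_url, urlencode({"action": "DELETE", "name": name}))

url = safe_delete_url("http://localhost:8984/solr", "15678", {"default", "15678"})
```

Since the Collections API returns 200 even when the overseer later fails, this client-side guard is the only protection until the API reports real statuses.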
Re: SolrCloud stops handling collection CREATE/DELETE (but responds HTTP 200)
Yeah it is - this was fixed a while ago on 4x and will be in 4.1. An exception would kill the collection manager wait loop. - Mark

On Sun, Dec 9, 2012 at 9:21 PM, Brett Hoerner br...@bretthoerner.com wrote: [...]
Re: stress testing Solr 4.x
Hmmm... EOF on the segments file is odd... How were you killing the nodes? Just stopping them, or kill -9, or what? - Mark

On Sun, Dec 9, 2012 at 1:37 PM, Alain Rogister alain.rogis...@gmail.com wrote: Hi, I have re-run my tests today after I updated Solr 4.1 to apply the patch. First, the good news: it works, i.e. if I stop all three Solr servers and then restart one, it will try to find the other two for a while (about 3 minutes, I think), then give up, become the leader and start processing requests. Now, the not-so-good: I encountered several exceptions that seem to indicate 2 other issues. Here are the relevant bits.

1) The ZK session expiry problem: not sure what caused it, but I did a few Solr or ZK node restarts while the system was under load.

SEVERE: There was a problem finding the leader in zk:org.apache.solr.common.SolrException: Could not get leader props
  at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:732)
  at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:696)
  at org.apache.solr.cloud.ZkController.waitForLeaderToSeeDownState(ZkController.java:1095)
  at org.apache.solr.cloud.ZkController.registerAllCoresAsDown(ZkController.java:265)
  at org.apache.solr.cloud.ZkController.access$100(ZkController.java:84)
  at org.apache.solr.cloud.ZkController$1.command(ZkController.java:184)
  at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:116)
  at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
  at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:90)
  at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
  at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /collections/adressage/leaders/shard1
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
  at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
  at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:244)
  at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:241)
  at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:63)
  at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:241)
  at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:710)
  ... 10 more

SEVERE: :org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /overseer/queue/qn-
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
  at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
  at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:210)
  at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:207)
  at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:63)
  at org.apache.solr.common.cloud.SolrZkClient.create(SolrZkClient.java:207)
  at org.apache.solr.cloud.DistributedQueue.offer(DistributedQueue.java:229)
  at org.apache.solr.cloud.ZkController.publish(ZkController.java:824)
  at org.apache.solr.cloud.ZkController.publish(ZkController.java:797)
  at org.apache.solr.cloud.ZkController.registerAllCoresAsDown(ZkController.java:258)
  at org.apache.solr.cloud.ZkController.access$100(ZkController.java:84)
  at org.apache.solr.cloud.ZkController$1.command(ZkController.java:184)
  at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:116)
  at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
  at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:90)
  at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
  at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)

2) Data corruption of 1 core on 2 out of 3 Solr servers. This core failed to start due to the exceptions below, and both servers went into a seemingly endless loop of exponential retries. The fix was to stop both faulty servers, remove the data directory of this core and restart: replication then took place correctly. As above, not sure what exactly caused this to happen; no updates were taking place, only searches. On server 1:

INFO: Closing directory: /Users/arogister/Dev/apache-solr-4.1-branch/solr/forem/solr/formabanque/data/index.20121209152525785
Dec 09, 2012 3:25:25 PM org.apache.solr.common.SolrException log
SEVERE: SnapPull failed :org.apache.solr.common.SolrException: Index fetch failed :
  at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:400)
  at
Re: Modeling openinghours using multipoints
Colleagues, what are the benefits of this approach in contrast to block join? Thanks.

On 10.12.2012 at 3:35, Lance Norskog goks...@gmail.com wrote: [...]
How does fq skip score value?
Hello, I would like to understand how the fq parameter avoids dealing with scoring. I have been digging through the code to see where Solr separates and executes fq parameters, but I couldn't find it yet. Does anyone know how fq works such that it skips score information?

- Zeki ama calismiyor... Calissa yapar...
Re: How does fq skip score value?
Sure. Here the fq's DocSets are intersected: https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L864 and here that DocSet is passed to the Lucene search: https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1471

On Mon, Dec 10, 2012 at 10:06 AM, deniz denizdurmu...@gmail.com wrote: [...]

-- Sincerely yours, Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
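In practice, this is why the usual advice is to put scoring criteria in q and pure constraints in fq. A hedged sketch (the field names are invented): each fq is resolved to a DocSet that SolrIndexSearcher caches and intersects with the main query's results, so it restricts matches without ever contributing to the score.

```python
def search_params(user_query, category, rows=10):
    # q is scored and determines ranking; fq only selects documents, is
    # cached in the filterCache, and never affects the score.
    return {
        "q": user_query,
        "fq": "category:%s" % category,
        "rows": rows,
    }

params = search_params("pizza", "restaurant")
```

Moving a clause between q and fq therefore changes the ranking (and cache behavior) but not which documents can match.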