Re: How could I monitor solr cache
I am wondering how I could get the running status of the Solr caches. I know there is a JMX bean containing that information. I just want to know what tool or method you use to monitor the caches, in order to improve performance or detect issues.
You might find this interesting: http://sematext.com/spm/solr-performance-monitoring/index.html http://sematext.com/spm/index.html
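Beyond a hosted service, the cache stats Solr exposes (via JMX, or the admin stats page) can be polled and graphed with a small script. A minimal sketch, assuming a simplified XML layout for one cache entry; the real stats output varies by Solr version, so adjust the parsing to what yours emits:

```python
import xml.etree.ElementTree as ET

# Simplified sample of one cache entry; the real stats page layout
# differs by Solr version.
SAMPLE = """<entry>
  <name>queryResultCache</name>
  <stats>
    <stat name="lookups">1500</stat>
    <stat name="hits">1200</stat>
    <stat name="evictions">10</stat>
  </stats>
</entry>"""

def cache_hit_ratio(xml_text):
    """hits / lookups for a single cache entry."""
    stats = {s.get("name"): float(s.text)
             for s in ET.fromstring(xml_text).iter("stat")}
    return stats["hits"] / stats["lookups"]
```

Watching the hit ratio and evictions over time is usually enough to spot an undersized cache.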
solr slave's performance issue after replicate the optimized index
Hi all, I have a performance issue. I do an optimize on the Solr master every night, but since about a month ago, every time the slaves fetch the new optimized index, system CPU usage rises from 0.3-0.5% to 7-10% (daily average), and the servers' load average also becomes more than 2 times higher than normal. The load average remains high even if I restart Tomcat. After many days of testing, I found 4 ways to bring the slaves back to a normal load average:
1. reboot the Linux server
2. shut down Tomcat, manually rm the index data, and replicate again
3. shut down Tomcat, copy indexdata to indexdata2, rm indexdata, mv indexdata2 to indexdata, start Tomcat
4. shut down Tomcat, use a C program to allocate 20G of memory and free it, start the server
I can only guess it has some relationship with memory or the system cache. Is this a Solr bug, a Lucene bug, or just a system issue?
My system: CentOS 5.6 x64, Tomcat 7.0, JRockit 6, Intel E5620 x2, 24GB DDR3, Solr 3.1, index size 7G (after optimize) / 8G (before optimize). Many thanks~
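Fix #4 in the list above hints that the kernel page cache is involved rather than Solr itself. Before assuming a Solr or Lucene bug, it may help to record the Cached figure from /proc/meminfo before and after a replication. A small parsing sketch (the sample values are made up):

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style lines into {field: kB}."""
    values = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, rest = line.partition(":")
            values[key.strip()] = int(rest.split()[0])
    return values

# Made-up sample; on a slave you would call
# parse_meminfo(open("/proc/meminfo").read()) before and after
# the replication finishes and compare the Cached values.
SAMPLE = """MemTotal:       24675000 kB
MemFree:          512000 kB
Cached:         18200000 kB"""
```

If Cached collapses after the new index arrives, the slowdown is likely the old and new index copies competing for page cache, not a Solr defect.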
Re: ' invisible ' words
Hi Erick, thank you for the advice... I will be doing as you advised and update you here... - Zeki ama calismiyor... Calissa yapar... -- View this message in context: http://lucene.472066.n3.nabble.com/invisible-words-tp3158060p3181647.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps
Nagendra, In another email you mentioned there's a problem where, if an existing document is updated, both the old and new version will show up in search results. Has that been solved in Solr-RA 3.3? --- On Mon, 7/18/11, Nagendra Nagarajayya nnagaraja...@transaxtions.com wrote: From: Nagendra Nagarajayya nnagaraja...@transaxtions.com Subject: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps To: solr-user@lucene.apache.org Date: Monday, July 18, 2011, 10:43 AM Hi! I would like to announce the availability of Solr 3.3 with RankingAlgorithm and Near Real Time (NRT) search capability. The NRT performance is very high, 10,000 documents/sec with the MBArtists 390k index. The NRT functionality allows you to add documents without the IndexSearchers being closed or caches being cleared. A commit is also not needed with the document update. Searches can run concurrently with document updates. No changes are needed except for enabling NRT through solrconfig.xml. RankingAlgorithm query performance is now 3x faster than before and is exposed through the Lucene API. This release also adds support for the last document with a unique id to be searchable and visible in search results in case of multiple updates of the document. I have a wiki page that describes NRT performance in detail; it can be accessed from here: http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x You can download Solr 3.3 with RankingAlgorithm (NRT version) from here: http://solr-ra.tgels.org I would like to invite you to give this version a try as the performance is very high. Regards, - Nagendra Nagarajayya http://solr-ra.tgels.org http://rankingalgorithm.tgels.org
Solr 3.3: SEVERE: java.io.IOException: seek past EOF
Hi Developers and Users, a serious problem occurred:
19.07.2011 10:50:32 org.apache.solr.common.SolrException log
SEVERE: java.io.IOException: seek past EOF
at org.apache.lucene.store.MMapDirectory$MMapIndexInput.seek(MMapDirectory.java:343)
at org.apache.lucene.index.FieldsReader.seekIndex(FieldsReader.java:226)
at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:242)
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:471)
at org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:564)
at org.apache.solr.search.SolrIndexReader.document(SolrIndexReader.java:260)
at org.apache.solr.search.SolrIndexSearcher.doc(SolrIndexSearcher.java:440)
at org.apache.solr.util.SolrPluginUtils.optimizePreFetchDocs(SolrPluginUtils.java:270)
at org.apache.solr.handler.component.QueryComponent.doPrefetch(QueryComponent.java:358)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:265)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:202)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:164)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:462)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:562)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.valves.RequestFilterValve.process(RequestFilterValve.java:210)
at org.apache.catalina.valves.RemoteAddrValve.invoke(RemoteAddrValve.java:85)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:395)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:250)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:188)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:302)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
at java.lang.Thread.run(Thread.java:736)
Fresh index with Solr 3.3. It only occurs with some words (in this case it was "Graf", no idea why). Query type (dismax, standard, edismax), highlighting and faceting have no effect; only the term being searched matters. And it seems to affect only OCR fields, which are usually larger than fields for metadata. Any ideas? Greetings, best regards, Sebastian -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-3-3-SEVERE-java-io-IOException-seek-past-EOF-tp3181869p3181869.html Sent from the Solr - User mailing list archive at Nabble.com.
How to find whether solr server is running or not
I am running an application that gets search results from a Solr server. But when the server is not running I get no response from it. Is there any way I can find out that my server is not running, so that I can give a proper error message? - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3181870.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to find whether solr server is running or not
Check the HTTP response code: anything other than 200 means the service is not OK. On 19 July 2011 14:39, Romi romijain3...@gmail.com wrote: I am running an application that get search results from solr server. But when server is not running i get no response from the server. Is there any way i can found that my server is not running so that i can give proper error message regarding it - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3181870.html Sent from the Solr - User mailing list archive at Nabble.com. -- Thanks and Regards Mohammad Shariq
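Done from application code rather than a browser, the check could look like this sketch, which treats both a non-200 status and a failed connection as "down" (the URL is a placeholder for whatever ping handler you use):

```python
import urllib.request

def is_solr_up(ping_url, timeout=2):
    """Return True only if the ping URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(ping_url, timeout=timeout) as resp:
            return resp.getcode() == 200
    except OSError:
        # Covers connection refused, DNS failures and timeouts,
        # including urllib.error.URLError (a subclass of OSError).
        return False
```

The key point is that a dead server produces an exception, not a response, so the client must catch the error and map it to its own "server down" message.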
Re: How to find whether solr server is running or not
I am getting a JSON response from Solr like this: *$.getJSON("http://192.168.1.9:8983/solr/db/select/?qt=dismax&wt=json&start=0&rows=100&q=elegant&hl=true&hl.fl=text&hl.usePhraseHighlighter=true&sort=score&json.wrf=?", function(result){* How can I check whether I get a response or not? - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3181942.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr 3.3: SEVERE: java.io.IOException: seek past EOF
Oops, false alarm. A CustomSimilarity, combined with a very small set of documents, caused the problem. Greetings, Sebastian -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-3-3-SEVERE-java-io-IOException-seek-past-EOF-tp3181869p3181943.html Sent from the Solr - User mailing list archive at Nabble.com.
Spatial Search with distance as a parameter
Hi all, I have the following problem: The documents in the index of my solr instance correspond to persons. Each document (=person) has lat/lon coordinates and additionally a travel radius. The coordinates correspond to the office of the person, the travel radius indicates a distance which the person is willing to travel. I would like to search for all persons which are willing to travel to a particular place (also given as lat/lon coordinates). In other words I have to do a query with the geofilt filter. The problem here: the distance parameter d cannot be defined in advance, but should correspond to the travel radius (which may be different for each person). Any ideas how this problem could be solved? Thanks in advance Michi
Re: How to find whether solr server is running or not
You can use ping: http://host:port/solr/admin/ping The response is something like this:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int><lst name="params"><str name="echoParams">all</str><str name="rows">10</str><str name="echoParams">all</str><str name="q">solrpingquery</str><str name="qt">search</str></lst></lst><str name="status">OK</str>
</response>
or with a JSON response: http://host:port/solr/admin/ping?wt=json
{"responseHeader":{"status":0,"QTime":2,"params":{"echoParams":"all","rows":"10","echoParams":"all","q":"solrpingquery","qt":"search","wt":"json"}},"status":"OK"}
Hope this helps. Péter -- eXtensible Catalog http://drupal.org/project/xc 2011/7/19 Romi romijain3...@gmail.com: i am getting json response from solr as *$.getJSON("http://192.168.1.9:8983/solr/db/select/?qt=dismax&wt=json&start=0&rows=100&q=elegant&hl=true&hl.fl=text&hl.usePhraseHighlighter=true&sort=score&json.wrf=?", function(result){* how can i check whether i get response or not - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3181942.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to find whether solr server is running or not
But the problem is that when the Solr server is not running, *http://host:port/solr/admin/ping* will not give me any JSON response, so how will I get the status? :( When I run this URL the browser gives me the following error: *Unable to connect. Firefox can't establish a connection to the server at 192.168.1.9:8983.* - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3182202.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: - character in search query
Anybody? -- View this message in context: http://lucene.472066.n3.nabble.com/character-in-search-query-tp3168604p3182228.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: any detailed tutorials on plugin development?
Which plugin specifically are you going to implement? On Mon, Jul 18, 2011 at 2:24 AM, deniz denizdurmu...@gmail.com wrote: anyone know any tutorials on implementing plugins? there is one page on the wiki but i dont think we can call it a tutorial... i am looking for something with some example code... - Zeki ama calismiyor... Calissa yapar... -- View this message in context: http://lucene.472066.n3.nabble.com/any-detailed-tutorials-on-plugin-development-tp3177821p3177821.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Dmitry Kan
Re: How to find whether solr server is running or not
Try doing this from a program rather than the browser. If Solr isn't running, you have to infer that fact from the lack of a response. Best Erick On Tue, Jul 19, 2011 at 7:42 AM, Romi romijain3...@gmail.com wrote: But the problem is when solr server is not runing *http://host:port/solr/admin/ping* will not give me any json response then how will i get the status :( when i run this url browser gives me following error *Unable to connect Firefox can't establish a connection to the server at 192.168.1.9:8983.* - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3182202.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to find whether solr server is running or not
I think anything but a 200 OK means it is dead, like the proverbial parrot :) François On Jul 19, 2011, at 7:42 AM, Romi wrote: But the problem is when solr server is not runing *http://host:port/solr/admin/ping* will not give me any json response then how will i get the status :( when i run this url browser gives me following error *Unable to connect Firefox can't establish a connection to the server at 192.168.1.9:8983.* - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3182202.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: - character in search query
Let's see the complete fieldType definition. Have you looked at your index with, say, Luke and seen what's actually in your index? And do you re-index after each schema change? What does your admin/analysis page look like? Have you considered PatternReplaceCharFilterFactory rather than the tokenizer? Best Erick On Tue, Jul 19, 2011 at 7:48 AM, roySolr royrutten1...@gmail.com wrote: Anybody? -- View this message in context: http://lucene.472066.n3.nabble.com/character-in-search-query-tp3168604p3182228.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Need Suggestion
Look at this link: http://wiki.apache.org/solr/DistributedSearch This will help when you have a large index. -----Original Message----- From: Rohit Gupta [mailto:ro...@in-rev.com] Sent: Friday, July 15, 2011 11:37 PM To: solr-user@lucene.apache.org Subject: Re: Need Suggestion I am using -Xms2g and -Xmx6g. What would be the ideal JVM size? Regards, Rohit From: Mohammad Shariq shariqn...@gmail.com To: solr-user@lucene.apache.org Sent: Fri, 15 July, 2011 7:27:38 PM Subject: Re: Need Suggestion Below are certain things to do to reduce search latency. 1) Do bulk inserts. 2) Commit after every ~5000 docs. 3) Do an optimize once a day. 4) In search queries, use the fq parameter. What is the size of the JVM you are using? On 15 July 2011 17:44, Rohit ro...@in-rev.com wrote: I am facing some performance issues on my Solr installation (3-core server). I am indexing live Twitter data based on certain keywords; as you can imagine, the rate at which documents are received is very high, so updates to the cores are very frequent and regular. Given below are the document counts on my three cores. Twitter - 26874747 Core2 - 3027800 Core3 - 6074253 My server has 8GB RAM, but now we are experiencing a performance drop. What can be done to improve this? Also, I have a few questions. 1. Does the number of commits take a lot of memory? Will reducing the number of commits per hour help? 2. Most of my queries are field- or date-faceting based; how do I improve those? Regards, Rohit Mobile: +91-9901768202 About Me: http://about.me/rohitg -- Thanks and Regards Mohammad Shariq
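The "commit after every ~5000 docs" advice can be packaged as a small buffering wrapper around whatever client call you use to post documents. A sketch; `send` here is a stand-in callback, not a real Solr client API:

```python
class BatchIndexer:
    """Buffer documents and send them in bulk; `send` is a stand-in
    for whatever client call posts a list of docs to Solr."""

    def __init__(self, send, batch_size=5000):
        self.send = send
        self.batch_size = batch_size
        self.buffer = []
        self.flushes = 0

    def add(self, doc):
        self.buffer.append(doc)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # One network round trip (and eventually one commit) per
        # batch instead of per document.
        if self.buffer:
            self.send(self.buffer)
            self.buffer = []
            self.flushes += 1
```

The point is that the cost of a commit is roughly fixed, so amortizing it over thousands of documents cuts the per-document overhead dramatically.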
Re: defType argument weirdness
On Jul 18, 2011, at 19:15, Naomi Dushay wrote: I found a weird behavior with the Solr defType argument, perhaps with respect to default queries? q={!defType=dismax}*:* hits. This is the confusing one: defType is a Solr request parameter, but not something that works as a local (inside the {!} brackets) parameter. Confusing, indeed. But just not how local params/defType work at the moment. So, with defType being ignored inside those curly brackets, you're getting the default lucene query parser. Check it out with debugQuery=true and see how the queries parse. Erik
Re: XInclude Multiple Elements
On Mon, Jul 18, 2011 at 8:06 PM, Chris Hostetter hossman_luc...@fucit.org wrote: Can you post the details of your JVM / ServletContainer and the full stack trace of the exception? My understanding is that fragment identifiers are a mandatory part of the xinclude/xpointer specs. It would also be good to know if you tried the explicit xpointer attribute approach on the xinclude syntax also mentioned in that thread... I think it would be something like... <xi:include href="solrconfigIncludes.xml" xpointer="xpointer(//requestHandler)"/> In general, Solr really isn't doing anything special with XInclude ... it's all just delegated to the XML libraries. You might want to start by ignoring Solr, reading up on XInclude/XPointer tutorials in general, and experimenting with command-line XML tools to figure out the syntax you need to get the final XML structures you want -- then apply that knowledge to the Solr config files. -Hoss This is running on Java 1.6.0_26 and Jetty 7.4.4.v20110707. The stack trace in the case of the use of the fragment is: 2011-07-13 18:52:42,953 [main] ERROR org.apache.solr.core.Config - Exception during parsing file: solrconfig.xml:org.xml.sax.SAXParseException: Fragment identifiers must not be used. The 'href' attribute value '../../conf/solrconfigIncludes.xml#xpointer(root/node())' is not permitted.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.reportFatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.handleIncludeElement(Unknown Source)
at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.emptyElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at org.apache.solr.core.Config.init(Config.java:159)
at org.apache.solr.core.SolrConfig.init(SolrConfig.java:131)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:435)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:58)
at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:742)
at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:245)
at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1208)
at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:586)
at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:449)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:58)
at org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:89)
at org.eclipse.jetty.server.Server.doStart(Server.java:258)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:58)
at com.issinc.cidne.solr.App.main(App.java:41)
I did attempt the xpointer=xpointer(//requestHandler) syntax, and received this error: 2011-07-13 18:49:06,640 [main] WARN org.apache.solr.core.Config - XML parse warning in solrres:/solrconfig.xml, line 3, column 133:
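For reference, the explicit xpointer-attribute form discussed above would look roughly like this once the XInclude namespace is declared; this is a sketch only, and whether the JDK's bundled Xerces accepts the xpointer() scheme is exactly what this thread is probing:

```xml
<!-- Sketch: pull all requestHandler elements from an external file. -->
<config xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="solrconfigIncludes.xml"
              xpointer="xpointer(//requestHandler)"/>
</config>
```

Testing the same include with a standalone XML tool, outside Solr, narrows down whether the limitation is in the parser or in the config.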
Re: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps
Yes, this problem has been solved, though not completely; there is still a refresh problem. To eliminate duplicate documents with a unique id during update, you need to set <maxBufferedDeleteTerms>1</maxBufferedDeleteTerms>. This makes the most recently updated document become searchable as well as removing the older documents. There is a catch though: if some of the fields in a document are different and this is updated, older content might show up as part of the results even though the query matches the most recent document content. I.e. if the most recent doc has afield set to <doc><afield>abc</afield></doc> and this is updated, and the old docs were <doc><afield>xyz</afield></doc>, at query time q=afield:abc matches, but the results may show <doc><afield>xyz</afield></doc>. I am still researching this. You can get more information about the performance and known issues here: http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver_3.x Regards, - Nagendra Nagarajayya http://solr-ra.tgels.org http://rankingalgorithm.tgels.org On 7/19/2011 1:21 AM, Andy wrote: Nagendra, In another email you mentioned there's a problem where if an existing document is updated both the old and new version will show up in search results. Has that been solved in Solr-RA 3.3? --- On Mon, 7/18/11, Nagendra Nagarajayya nnagaraja...@transaxtions.com wrote: From: Nagendra Nagarajayya nnagaraja...@transaxtions.com Subject: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps To: solr-user@lucene.apache.org Date: Monday, July 18, 2011, 10:43 AM Hi! I would like to announce the availability of Solr 3.3 with RankingAlgorithm and Near Real Time (NRT) search capability. The NRT performance is very high, 10,000 documents/sec with the MBArtists 390k index. The NRT functionality allows you to add documents without the IndexSearchers being closed or caches being cleared. A commit is also not needed with the document update. Searches can run concurrently with document updates. No changes are needed except for enabling NRT through solrconfig.xml. RankingAlgorithm query performance is now 3x faster than before and is exposed through the Lucene API. This release also adds support for the last document with a unique id to be searchable and visible in search results in case of multiple updates of the document. I have a wiki page that describes NRT performance in detail; it can be accessed from here: http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x You can download Solr 3.3 with RankingAlgorithm (NRT version) from here: http://solr-ra.tgels.org I would like to invite you to give this version a try as the performance is very high. Regards, - Nagendra Nagarajayya http://solr-ra.tgels.org http://rankingalgorithm.tgels.org
Re: Solr User Interface
Hi, You can send the wt param to Solr as follows: wt=json or wt=phps. In the first case, Solr results are returned in JSON format, and in the second case, in PHP serialized format. Regards. On 19/07/11 15:46, serenity keningston wrote: Hi, I installed Solr 3.2 and able to search results successfully from the crawled data, however, I would like to develop UI for the http or json response. Can anyone guide me with the tutorial or sample? I referred few thing like Ajax Solr but am not sure how to do the things. Serenity
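Once wt=json is set, the response body is plain JSON and any JSON parser can consume it. A sketch against a trimmed sample response (real responses carry more fields in responseHeader and in each doc):

```python
import json

# Trimmed sample of a wt=json select response.
SAMPLE = """{
  "responseHeader": {"status": 0, "QTime": 2},
  "response": {"numFound": 2, "start": 0,
    "docs": [{"id": "1", "name": "alpha"},
             {"id": "2", "name": "beta"}]}
}"""

def doc_names(raw):
    """Pull one field out of every returned document."""
    data = json.loads(raw)
    return [doc["name"] for doc in data["response"]["docs"]]
```

A UI layer then only has to render the list that comes back, which is why wt=json is the usual choice for browser front ends.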
Solr UI
Hi, I installed Solr 3.2 and am able to get search results successfully from the crawled data. However, I would like to develop a UI for the HTTP or JSON response. Can anyone guide me to a tutorial or sample? I referred to a few things like Ajax Solr but am not sure how to proceed. Serenity -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-UI-tp3182594p3182594.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Spatial Search with distance as a parameter
Hi Michael, It appears that you want to index circles (aka point-radius) and run a query with a point, matching documents where that point falls within an indexed circle. I'm working with a couple of Lucene/Solr committers on a geospatial module that can do this today against Solr 4 (trunk), but it's rough around the edges and needs testing. In the meantime, I suggest you look at this post from Spaceman Steve, in which he indexed lat-lon boxes and queried with a box. With a bit of creativity, you may be able to adapt it to your needs. http://lucene.472066.n3.nabble.com/intersecting-map-extent-with-solr-spatial-documents-tc3104098.html ~ David Smiley Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/ On Jul 19, 2011, at 6:14 AM, Michael Lorz wrote: Hi all, I have the following problem: The documents in the index of my solr instance correspond to persons. Each document (=person) has lat/lon coordinates and additionally a travel radius. The coordinates correspond to the office of the person, the travel radius indicates a distance which the person is willing to travel. I would like to search for all persons which are willing to travel to a particular place (also given as lat/lon coordinates). In other words I have to do a query with the geofilt filter. The problem here: the distance parameter d cannot be defined in advance, but should correspond to the travel radius (which may be different for each person). Any ideas how this problem could be solved? Thanks in advance Michi
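Depending on the Solr version, another angle worth trying is a function-query filter: keep the travel radius as an indexed field and match documents whose radius is at least the distance to the query point. A sketch of the request parameters, assuming a travel_radius field and a location field used by geodist(); geodist, frange and sub are standard function-query pieces, but verify their availability against your version:

```python
from urllib.parse import urlencode

# The frange filter keeps documents where
# travel_radius - geodist() >= 0, i.e. the person is willing to
# travel at least as far as the distance from their office to the
# query point. Field names are assumptions.
params = {
    "q": "*:*",
    "sfield": "location",   # spatial field consulted by geodist()
    "pt": "49.45,11.07",    # the query point (lat,lon)
    "fq": "{!frange l=0}sub(travel_radius,geodist())",
}
query_string = urlencode(params)
```

This inverts the usual geofilt pattern: instead of one global d, the effective radius comes from each document.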
Re: Solr UI
There are several starting points for a Solr UI out there, but really the best choice is whatever fits your environment and the skills/resources you have handy. Here are a few off the top of my head:
* Blacklight - a Ruby on Rails full-featured search UI powered by Solr. It can be customized fairly easily to work with any arbitrary Solr schema, but by default it is kinda library-specific in its out-of-the-box experience. It powers UVa, Stanford, and other libraries and sites out there in production now - http://projectblacklight.org/
* Flare - the first prototype of Blacklight, fairly dusty and prototypical, but I still think a good example of how lean a search UI with a number of fancy features can be - http://wiki.apache.org/solr/Flare/HowTo
* Solritas/VelocityResponseWriter - built right into Solr; allows easy templating of Solr responses. It's the /browse interface out of the box. While probably not how someone would deploy a production search UI, it can make proofs of concept and getting up and running quite quick and easy - http://wiki.apache.org/solr/VelocityResponseWriter
And there's a new little tinkering I started a while back that might be good food for thought for the same sorts of ideas as the above, but in a slightly different direction - https://github.com/lucidimagination/Prism
Erik
On Jul 19, 2011, at 10:00, serenity wrote: Hi, I installed Solr 3.2 and able to search results successfully from the crawled data, however, I would like to develop UI for the http or json response. Can anyone guide me with the tutorial or sample? I referred few thing like Ajax Solr but am not sure how to do the things. Serenity -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-UI-tp3182594p3182594.html Sent from the Solr - User mailing list archive at Nabble.com.
query time boosting in solr
Hi, Is query-time boosting possible in Solr? Here is what I want to do: I want to boost the ranking of certain documents which have their relevant field values in a particular range (selected by the user at query time). When I do something like: http://localhost:8085/solr/select?indent=on&version=2.2&q=scientific+temper&fq=field1:[10%20TO%2030]&start=0&rows=10 - I guess it is just a filter over the normal results and not exactly a query. I tried this: http://localhost:8085/solr/select?indent=on&version=2.2&q=scientific+temper+field1:[10%20TO%2030]&start=0&rows=10 - This still worked and gave me different results, but I did not quite understand what the second query means. Does it mean: rank those documents with a field1 value in 10-30 better than those without? S -- Sowmya V.B. Losing optimism is blasphemy! http://vbsowmya.wordpress.com
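One common reading of the second URL above: with the default OR operator, the range clause becomes an optional term, so documents inside the range score higher while documents outside it can still match on the text terms. Making the boost explicit states that intent more clearly; field1 and the boost factor here are just illustrative values:

```python
from urllib.parse import urlencode

# The range clause as an optional, explicitly boosted term: docs
# with field1 in [10,30] rank higher, docs outside the range still
# match on the text terms (assumes the default OR operator).
params = {
    "q": "scientific temper field1:[10 TO 30]^2.0",
    "fl": "*,score",   # return the score to observe the boost
}
boosted_query = urlencode(params)
```

Running the same request with debugQuery=true shows exactly how each clause contributes to the score.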
Error 400 in Solr 1.4
Hi, I have a problem when I try to send the qt param to Solr 1.4 with dismax value. I get the following error from Solr response: HTTP ERROR: 400 undefined field price RequestURI=/solr/select Any idea? Regards.
Re: Error 400 in Solr 1.4
Just a hunch, ;), but I'm guessing you don't have a price field defined. qt is for selecting a request handler you have defined in your solrconfig.xml - you need to customize the parameters to your schema. Erik On Jul 19, 2011, at 04:32 , Yusniel Hidalgo Delgado wrote: Hi, I have a problem when I try to send the qt param to Solr 1.4 with dismax value. I get the following error from Solr response: HTTP ERROR: 400 undefined field price RequestURI=/solr/select Any idea? Regards.
Edismax and leading wildcards
My schema.xml currently has a content field and a content_rev field, which is the same field run through the reversed wildcard filter. My question is: does edismax support using this field? Reading through this JIRA (https://issues.apache.org/jira/browse/SOLR-1321), it seems to indicate that SolrQueryParser was updated to support such a field, but it feels like there should be something I'd need to configure to let Solr know that, when doing wildcard queries on the content field, content_rev should be used if it has a lower cost. Is that not the case?
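For what it's worth, the usual setup from SOLR-1321 puts ReversedWildcardFilterFactory on the index-time analyzer of the field you actually query, and the query parser detects it from the schema rather than from a separate cost setting. A rough sketch of such a fieldType; the attribute values are illustrative, not tuned:

```xml
<!-- Sketch only: a field type whose index analyzer also stores
     reversed tokens so leading wildcards stay cheap. -->
<fieldType name="text_rev" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory"
            withOriginal="true"
            maxPosAsterisk="3" maxPosQuestion="2"
            maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
```

A field using a type like this handles leading wildcards itself, so a separate content_rev field matters only if you want to keep the reversed terms out of the main content field's index.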
Re: Error 400 in Solr 1.4
Thanks Erik for your quick reply. You are right: in my solrconfig.xml file, I had a wrong configuration option. Thanks again. On 19/07/11 16:37, Erik Hatcher wrote: Just a hunch, ;), but I'm guessing you don't have a price field defined. qt is for selecting a request handler you have defined in your solrconfig.xml - you need to customize the parameters to your schema. Erik On Jul 19, 2011, at 04:32, Yusniel Hidalgo Delgado wrote: Hi, I have a problem when I try to send the qt param to Solr 1.4 with dismax value. I get the following error from Solr response: HTTP ERROR: 400 undefined field price RequestURI=/solr/select Any idea? Regards.
Re: How to find whether solr server is running or not
I am new to Hadoop testing. Can anybody tell me about Hadoop testing: what should be sufficient for a tester to test Hadoop-based projects? Please help me. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3182201.html Sent from the Solr - User mailing list archive at Nabble.com.
delta import exception
Hi, I am trying to trace the exception I get from the deletedPkQuery I am running. When I kick off the delta-import, the statusMessage has the following after 2 hours, but not a single document was modified or deleted: <str name="Total Rows Fetched">2813450</str> and then it bailed out when I submitted another heavy query at the mysql prompt. Does it mean that the importer was still trying to identify documents to update/delete? It might be because the deletedPkQuery takes too long to return the documents? Where is the source code for getNext()? I could not find it under apache-solr-1.4.0/src/java/org/apache/solr/handler Caused by: java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost. at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2455) at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2906) ... 22 more Jul 19, 2011 11:35:00 AM org.apache.solr.handler.dataimport.EntityProcessorBase getNext SEVERE: getNext() failed for query ' ***removed***' org.apache.solr.handler.dataimport.DataImportHandlerException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was 0 milliseconds ago. The last packet sent successfully to the server was 9042 milliseconds ago, which is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.hasnext(JdbcDataSource.java:339) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.access$700(JdbcDataSource.java:228) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.hasNext(JdbcDataSource.java:262) at org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:78) at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextDeletedRowKey(SqlEntityProcessor.java:93) at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextDeletedRowKey(EntityProcessorWrapper.java:258) at org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:636) at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:258) at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:172) at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:352) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391) at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370) Elaine
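On the source question: getNext() lives in org.apache.solr.handler.dataimport.EntityProcessorBase, which ships as a contrib module (contrib/dataimporthandler in the 1.4 source tree), not under src/java. On the timeout itself, the driver's own advice from the exception can be tried straight from data-config.xml; a sketch, with connection details as placeholders:

```xml
<dataConfig>
  <!-- autoReconnect follows the Connector/J hint in the exception;
       batchSize="-1" makes the MySQL driver stream rows instead of
       buffering the whole result set in memory -->
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
      url="jdbc:mysql://dbhost:3306/mydb?autoReconnect=true"
      batchSize="-1" readOnly="true"
      user="solr" password="secret"/>
  <!-- entities as before -->
</dataConfig>
```

Streaming matters for a deletedPkQuery that fetches millions of keys; raising wait_timeout on the MySQL side is the other half of the fix.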
RE: Searching for strings
Thanks Rob. It turns out this was a false alarm; I was misinterpreting a different problem with my crawl. -Original Message- From: Rob Casson [mailto:rob.cas...@gmail.com] Sent: Monday, July 18, 2011 5:58 PM To: solr-user@lucene.apache.org Subject: Re: Searching for strings chip, gonna need more information about your particular analysis chain, content, and example searches to give a better answer, but phrase queries (using quotes) are supported in both the standard and dismax query parsers that being said, lots of things may not match a person's idea of an exact string...stopwords, synonyms, slop, etc. cheers, rob On Mon, Jul 18, 2011 at 5:25 PM, Chip Calhoun ccalh...@aip.org wrote: Is there a way to search for a specific string using Solr, either by putting it in quotes or by some other means? I haven't been able to do this, but I may be missing something. Thanks, Chip
Re: Data Import from a Queue
Let me provide some more details to the question: I was unable to find any example implementations where individual documents (single document per message) are read from a message queue (like ActiveMQ or RabbitMQ) and then added to Solr via SolrJ, a HTTP POST or another method. Does anyone know of any available examples for this type of import? If no examples exist, what would be a recommended commit strategy for performance? My best guess for this would be to have a queue per core and commit once the queue is empty. Thanks. On Mon, Jul 18, 2011 at 6:52 PM, Erick Erickson erickerick...@gmail.com wrote: This is a really cryptic problem statement. you might want to review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Fri, Jul 15, 2011 at 1:52 PM, Brandon Fish brandon.j.f...@gmail.com wrote: Does anyone know of any existing examples of importing data from a queue into Solr? Thank you.
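There doesn't seem to be a canonical example for this. As a sketch of the "commit once the queue is empty" strategy, here is a toy drain loop in Python, with plain callables standing in for the Solr client (a real consumer would call SolrJ's add/commit or POST to /update; the names here are illustrative):

```python
import queue

def drain_and_index(q, add, commit):
    """Drain the message queue, adding one document per message,
    then issue a single commit once the queue is empty."""
    added = 0
    while True:
        try:
            doc = q.get_nowait()   # a real consumer would block with a timeout
        except queue.Empty:
            break
        add(doc)                   # stand-in for solr.add(doc) / HTTP POST
        added += 1
    if added:
        commit()                   # one commit per drained batch, not per doc
    return added
```

Batching like this keeps the commit frequency low, which matters because each commit opens a new searcher and invalidates Solr's caches.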
Re: defType argument weirdness
qf_dismax and pf_dismax are irrelevant -- I shouldn't have included that info. They are passed in the url and they work; they do not affect this problem. Your reminder of debugQuery was a good one - I use that a lot but forgot in this case. Regardless, I thought that defType=dismax&q=*:* is supposed to be equivalent to q={!defType=dismax}*:* and also equivalent to q={!dismax}*:*

defType=dismax&q=*:* DOESN'T WORK
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">+() ()</str>
<str name="parsedquery_toString">+() ()</str>

leaving out the explicit query, defType=dismax WORKS
<null name="rawquerystring"/>
<null name="querystring"/>
<str name="parsedquery">+MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">+*:*</str>

q={!dismax}*:* DOESN'T WORK
<str name="rawquerystring">*:*</str>
<str name="querystring">*:*</str>
<str name="parsedquery">+() ()</str>
<str name="parsedquery_toString">+() ()</str>

leaving out the explicit query, q={!dismax} WORKS
<str name="rawquerystring">{!dismax}</str>
<str name="querystring">{!dismax}</str>
<str name="parsedquery">+MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">+*:*</str>

q={!defType=dismax}*:* WORKS
<str name="rawquerystring">{!defType=dismax}*:*</str>
<str name="querystring">{!defType=dismax}*:*</str>
<str name="parsedquery">MatchAllDocsQuery(*:*)</str>
<str name="parsedquery_toString">*:*</str>

leaving out the explicit query, q={!defType=dismax} DOESN'T WORK
org.apache.lucene.queryParser.ParseException: Cannot parse '': Encountered "<EOF>" at line 1, column 0.

On Jul 18, 2011, at 5:44 PM, Erick Erickson wrote: What are qf_dismax and pf_dismax? They are meaningless to Solr. Try adding debugQuery=on to your URL and you'll see the parsed query, which helps a lot here. If you change these to the proper dismax values (qf and pf) you'll get better results.
As it is, I think you'll see output like: <str name="parsedquery">+() ()</str> showing that your query isn't actually going against any fields. Best Erick On Mon, Jul 18, 2011 at 7:15 PM, Naomi Dushay ndus...@stanford.edu wrote: I found a weird behavior with the Solr defType argument, perhaps with respect to default queries?

defType=dismax&q=*:* no hits
q={!defType=dismax}*:* hits
defType=dismax hits

Here is the request handler, which I explicitly indicate:

<requestHandler name="search" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">lucene</str>
    <!-- lucene params -->
    <str name="df">has_model_s</str>
    <str name="q.op">AND</str>
    <!-- dismax params -->
    <str name="mm">2<-1 5<-2 6<90%</str>
    <str name="q.alt">*:*</str>
    <str name="qf_dismax">id^0.8 id_t^0.8 title_t^0.3 mods_t^0.2 text</str>
    <str name="pf_dismax">id^0.9 id_t^0.9 title_t^0.5 mods_t^0.2 text</str>
    <int name="ps">100</int>
    <float name="tie">0.01</float>
  </lst>
</requestHandler>

Solr Specification Version: 1.4.0 Solr Implementation Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:33:40 Lucene Specification Version: 2.9.1 Lucene Implementation Version: 2.9.1 832363 - 2009-11-03 04:37:25 - Naomi
Question on the appropriate software
Greetings, I'm interested in having a server-based personal document library with a few specific features, and I'm trying to determine the most appropriate tools to build it. I have the following content which I wish to include in the archive: 1. A smallish collection of technical books in PDF format (around 100) 2. Many years of several different magazine subscriptions in PDF format (probably another 100 - 200 PDFs) 3. Several years of personal documents which were scanned in and converted to searchable PDF format (300 - 500 documents) 4. I also have local mirrors of several HTML-based reference sites. I'd like the ability to index all of this content and search it from a web form (so that I and a few others can reach it from multiple locations). Here are two examples of the functionality I'm looking for: Scenario 1. What was that software that has all the nutritional data and hooks up to some USDA database? I know I read about it in one of my Linux Journals last year. Now I'd like to be able to pull up the webform and search for nutrition USDA. I'd like to restrict the search to the Linux Journal magazine PDFs (or refine the results). I'd like results to contain context snippets with each search result. Finally, most importantly, I'd like multiple results per PDF (or all occurrences). The last one is important so that I can actually quickly find the right issue (in case there is some advertisement in every issue for the last year that contains those terms). When I click on the desired result, the PDF is downloaded by my browser. Scenario 2. How much have I been paying for property taxes for the last five years again? (the bills are all scanned in) In this case I'd like to search for my property identification number (which is on the bills) and the results should show all the documents that have it, with context. Clicking on results downloads the documents. I assume this example is simple to achieve if example 1 can be done.
So in general, my question is - can this be done in a fairly straight forward manner with Solr? Is there a more appropriate tool to be using (e.g. Nutch?). Also, I have looked high and low for a free, already baked solution which can do scenario 1 but haven't been able to find something - so if someone knows of such a thing, please let me know. Thanks! -Matt
use case: structured DB records with a bunch of related files
Greetings. I have a bunch of highly structured DB records, and I'm pretty clear on how to index those. However, each of those records may have any number of related documents (Word, Excel, PDF, PPT, etc.). All of this information will change over time. Can someone point me to a use case or some good reading to get me started on configuring Solr to index the DB records and files in such a way as to relate the two types of information? By relate, I mean that if there's a hit in a related file, then I need to show the user a link to the DB record as well as a link to the file. Thanks in advance. cheers, Travis -- Travis Low, Director of Development t...@4centurion.com Centurion Research Solutions, LLC 14048 ParkEast Circle, Suite 100, Chantilly, VA 20151 703-956-6276 / 703-378-4474 (fax) http://www.centurionresearch.com
RE: Analysis page output vs. actually getting search matches, a discrepency?
Thanks Erick, Unfortunately I'm stemming the same on both sides, similar to the SOLR example settings for the text type field. Default search field is moreWords, as I want, yes. I don't have this problem for any other mfg names at all in our index of almost 10 mm product docs, and this shows that it should match, in my best estimation. Note: LucidKStemFilterFactory does not take 'Sterling' down to 'Sterl' in indexing nor searching; it stays as 'Sterling'. I have given up on this. I've decided it is just an unexplainable anomaly, and have solved it by inserting a LucidKStemFilterFactory and just modifying that word to its searchable form before hitting the WhitespaceTokenizerFactory, which is kind of hackish but solves my problem at least. This seller only has a couple hundred cheap products on our site, so I have bigger fish to fry at this point. I've wasted too much time trying to chase this down. Cheers all Robi -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Monday, July 18, 2011 5:33 PM To: solr-user@lucene.apache.org Subject: Re: Analysis page output vs. actually getting search matches, a discrepency? Hmmm, is there any chance that you're stemming one place and not the other? And I infer from your output that your default search field is moreWords, is that true and expected? You might use luke or the TermsComponent to see what's actually in the index. I'm going to guess that you'll find sterl but not sterling as an indexed term and your problem is stemming, but that's a shot in the dark. Best Erick On Mon, Jul 18, 2011 at 5:37 PM, Robert Petersen rober...@buy.com wrote: OK I did what Hoss said, it only confirms I don't get a match when I should and that the query parser is doing the expected. Here are the details for one test sku. My analysis page output is shown in my email starting this thread and here is my query debug output. This absolutely should match but doesn't.
Both the indexing side and the query side are splitting on case changes. This actually isn't a problem for any of our other content, for instance there is no issue searching for 'VideoSecu'. Their products come up fine in our searches regardless of casing in the query. Only SterlingTek's products seem to be causing us issues.

Indexed content has camel case, stored in the text field 'moreWords': SterlingTek's NB-2LH 2 Pack Batteries + Charger Combo for Canon DC301
Search term not matching with camel case: SterlingTek's
Search term matching if no case changes: Sterlingtek's

Indexing:
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" preserveOriginal="0"/>

Searching:
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" preserveOriginal="0"/>

Thanks

http://ssdevrh01.buy.com:8983/solr/1/select?indent=on&version=2.2&q=SterlingTek%27s&fq=&start=0&rows=1&fl=*%2Cscore&qt=standard&wt=standard&debugQuery=on&explainOther=sku%3A216473417&hl=on&hl.fl=&echoHandler=true

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">4</int>
<str name="handler">org.apache.solr.handler.component.SearchHandler</str>
<lst name="params">
<str name="explainOther">sku:216473417</str>
<str name="indent">on</str>
<str name="echoHandler">true</str>
<str name="hl.fl"/>
<str name="wt">standard</str>
<str name="hl">on</str>
<str name="rows">1</str>
<str name="version">2.2</str>
<str name="fl">*,score</str>
<str name="debugQuery">on</str>
<str name="start">0</str>
<str name="q">SterlingTek's</str>
<str name="qt">standard</str>
<str name="fq"/>
</lst>
</lst>
<result name="response" numFound="0" start="0" maxScore="0.0"/>
<lst name="highlighting"/>
<lst name="debug">
<str name="rawquerystring">SterlingTek's</str>
<str name="querystring">SterlingTek's</str>
<str name="parsedquery">PhraseQuery(moreWords:"sterling tek")</str>
<str name="parsedquery_toString">moreWords:"sterling tek"</str>
<lst name="explain"/>
<str name="otherQuery">sku:216473417</str>
<lst name="explainOther">
<str name="216473417">
0.0 = fieldWeight(moreWords:"sterling tek" in 76351), product of:
  0.0 = tf(phraseFreq=0.0)
  19.502613 = idf(moreWords: sterling=1 tek=72)
  0.15625 = fieldNorm(field=moreWords, doc=76351)
</str>
</lst>
<str name="QParser">LuceneQParser</str>
<arr name="filter_queries"><str/></arr>

-Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Friday, July 15, 2011 4:36 PM To: solr-user@lucene.apache.org Subject: Re: Analysis page output vs. actually getting search matches, a discrepency? : Subject: Analysis page output vs. actually getting search matches, : a discrepency? 99% of the time when people ask questions like this, it's because of confusion about how/when QueryParsing comes into play (as opposed to
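For what it's worth, the phrase "sterling tek" in the parsed query is what WordDelimiterFilter's splitOnCaseChange plus its default English-possessive stripping would produce from SterlingTek's. A toy Python emulation (very rough; the real filter also handles catenation, number parts, preserveOriginal, and more) illustrates how both analyzer chains arrive at those two words:

```python
import re

def toy_word_delimiter(token):
    """Rough emulation of WordDelimiterFilterFactory with
    splitOnCaseChange=1 plus the default English possessive
    stripping. Not the real algorithm, just an illustration."""
    token = re.sub(r"'s$", "", token)                    # strip trailing 's
    token = re.sub(r"([a-z])([A-Z])", r"\1 \2", token)   # split on case change
    parts = re.split(r"[^0-9A-Za-z]+", token)            # split on delimiters
    return [p.lower() for p in parts if p]
```

toy_word_delimiter("SterlingTek's") gives ['sterling', 'tek'], matching the PhraseQuery in the debug output, while the no-case-change variant "Sterlingtek's" stays a single token, matching the report that only the camel-case form fails.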
Re: DIH full-import - when is commit() actally triggered?
Ahmet Arslan wrote: I am running a full import with a quite plain data-config (a root entity with three sub-entities) from a JDBC datasource. This import is expected to add approximately 10 mio documents. What I now see from my logfiles is that a newSearcher event is fired about every five seconds. This is triggered by autoCommit every 300,000 milliseconds. You need to remove <maxTime>300000</maxTime> to disable this mechanism. Thanks Ahmet, indeed I had to remove the maxDocs entry. So now a commit happens only every five minutes. -- with kind regards, Frank Wesemann Fotofinder GmbH USt-IdNr. DE812854514 Software Entwicklung Web: http://www.fotofinder.com/ Potsdamer Str. 96 Tel: +49 30 25 79 28 90 10785 Berlin Fax: +49 30 25 79 28 999 Sitz: Berlin Amtsgericht Berlin Charlottenburg (HRB 73099) Geschäftsführer: Ali Paczensky
Geospatial queries in Solr
I have looked at the code being shared on the lucene-spatial-playground and was wondering if anyone could provide some details as to its state. Specifically I'm looking to add geospatial support to my application based on a user provided polygon, is this currently possible using this extension?
Re: Using FieldCache in SolrIndexSearcher - crazy idea?
: Quite probably ... you typically can't assume that a FieldCache can be : constructed for *any* field, but it should be a safe assumption for the : uniqueKey field, so for that initial request of the multiphase distributed : search it's quite possible it would speed things up. : : Ah, thanks Hoss - I had meant to respond to the original email, but : then I lost track of it. : : Via pseudo-fields, we actually already have the ability to retrieve : values via FieldCache. : fl=id:{!func}id isn't that kind of orthogonal to the question though? ... a user can use the new pseudo-field functionality to request values from the FieldCache instead of stored fields, but specifically in the case of distributed search, when the first request is only asking for the uniqueKey values and scores, shouldn't that use the FieldCache to get those values? (w/o the user needing to jump through hoops in how the request is made/configured) -Hoss
Re: Using FieldCache in SolrIndexSearcher - crazy idea?
On Tue, Jul 19, 2011 at 3:20 PM, Chris Hostetter hossman_luc...@fucit.org wrote: [quoted text trimmed; see the previous message] Well, I was pointing out that distributed search could be easily modified to use the field-cache by changing id to id:{!func}id But I'm not sure we should do that by default - the memory of a full fieldCache entry is non-trivial for some people. Using a CSF id field would be better I think (the type where it doesn't populate a fieldcache entry). -Yonik http://www.lucidimagination.com
Re: Geospatial queries in Solr
Hi Jamie. I work on LSP; it can index polygons and query for them. Although the capability is there, we have more testing & benchmarking to do, and then we need to put together a tutorial to explain how to use it at the Solr layer. I recently cleaned up the READMEs a bit. Try downloading the trunk codebase and follow the README. It points to another README which shows off a demo webapp. At the conclusion of this, you'll need to examine the tests and webapp a bit to figure out how to apply it in your app. We don't yet have a tutorial, as the framework has been in flux, although it has stabilized a good deal. Oh... by the way, this works off of Lucene/Solr trunk. Within the past week there was a major change to trunk and LSP won't compile until we make updates. Either Ryan McKinley or I will get to that by the end of the week. So unless you have access to 2-week-old maven artifacts of Lucene/Solr, you're stuck right now. ~ David Smiley Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/ On Jul 19, 2011, at 3:03 PM, Jamie Johnson wrote: [quoted text trimmed; see the original message above]
RE: Analysis page output vs. actually getting search matches, a discrepency?
Um, sorry for any confusion. I meant to say I solved my issue by inserting a charFilter before the WhitespaceTokenizerFactory to convert my problem word to a searchable form. I had a cut-n-paste malfunction below. Thanks guys. -Original Message- From: Robert Petersen [mailto:rober...@buy.com] Sent: Tuesday, July 19, 2011 11:06 AM To: solr-user@lucene.apache.org Subject: RE: Analysis page output vs. actually getting search matches, a discrepency? [quoted text trimmed; see the earlier messages in this thread]
Re: Solr Request Logging
: I am using the trunk version of solr and I am getting a ton more logging : information than I really care to see and definitely more than 1.4, but : I cant really see a way to change it. http://wiki.apache.org/solr/SolrLogging -Hoss
Using functions in fq
My documents have two prices, retail_price and current_price. I want to get products which have a sale of x%; the x is dynamic and can be specified by the user. I was trying to achieve this by using fq. If I want all Sony TVs that are at least 20% off, I want to write something like q=sony tv&fq=current_price:[0 TO product(retail_price,0.80)] but this does not work, as a function is not accepted in fq. How else can I achieve this? Thanks
Re: Using functions in fq
On Tue, Jul 19, 2011 at 6:49 PM, solr nps solr...@gmail.com wrote: My documents have two prices retail_price and current_price. I want to get products which have a sale of x%, the x is dynamic and can be specified by the user. I was trying to achieve this by using fq. If I want all sony tv's that are at least 20% off, I want to write something like q=sony tv&fq=current_price:[0 TO product(retail_price,0.80)] this does not work as the function is not expected in fq. how else can I achieve this? The frange query parser may do what you want. http://www.lucidimagination.com/blog/2009/07/06/ranges-over-functions-in-solr-14/ fq={!frange l=0 u=0.8}div(current_price, retail_price) -Yonik http://www.lucidimagination.com
Re: omitNorms and omitTermFreqAndPosition
As a general rule, if you are looking at the score explanations from debugQuery and you don't understand why you get the scores that you do, then you should actually send the score explanations along with your email when you ask why it doesn't match what you expect. In the absence of any other information to go on, I'm going to guess that the reason for the different scores is that category may be a multiValued field, and some docs are matching multiple clauses of your query -- so the coord factor of the boolean query comes into play (rewarding docs for matching multiple clauses) ... but as I said, I can't be certain because you didn't actually tell us what the score explanation said. Assuming I'm right, and assuming you want all the docs to score the same, or for the score to be driven by some other factor besides the relevancy of the query you are sending, then another general rule comes into play: if you don't care about the score of a query, then that query probably makes more sense as a filter: fq=category:(X OR Y OR Z)&q=...whatever, maybe *:*... : i have a problem with omitTermFreqAndPosition and omitNorms. : In my schema i have some fields with these property set True. : for example the field category : : then i make a query like: : select?q=category:(x OR y or Z) : : it returns all docs that have as category x or y or z. : : i make a debugQuery=on to see the score and i see every docs have different : score. : why? the tf is calculated and, also normalization. why? they should be have : the same score.. : cause it's not a full-text search but i search only docs that are inside a : group. stop : Thank you very much -Hoss
Re: Using functions in fq
I read about frange but didn't think about using it like you mentioned :) Thank you. On Tue, Jul 19, 2011 at 4:12 PM, Yonik Seeley yo...@lucidimagination.com wrote: [quoted text trimmed; see the previous message]
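To make the mapping from "x% off" to the frange bound explicit, here is a tiny helper (field names taken from the thread; the filter-string format follows Yonik's example):

```python
def discount_filter(min_pct_off):
    """Return an fq matching docs discounted by at least min_pct_off
    percent, i.e. current_price / retail_price <= 1 - x/100."""
    upper = 1.0 - min_pct_off / 100.0
    return "{!frange l=0 u=%.2f}div(current_price,retail_price)" % upper
```

The l=0 lower bound keeps out degenerate negative ratios; a user-supplied "at least 20% off" becomes an upper bound of 0.80 on the price ratio.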
solr chewing up system swap
I have arrived at a site where solr is being run under jetty. It is ubuntu 10.04 i386 hosted on AWS (xen). Our combined solr index size is a mere 21 MB. What I am seeing is that solr is steadily consuming about 150 MB of swap per week and won't relinquish it until sunspot is restarted. Oddly, Jetty doesn't seem to have any memory parameters to speak of supplied to it, which may very well be the problem in that no garbage collection is taking place, but I wanted to see if anyone else who uses solr/jetty has encountered this and whether they added some memory parameters to jetty's java args. Thanks, Matthew -- View this message in context: http://lucene.472066.n3.nabble.com/solr-chewing-up-system-swap-tp3184083p3184083.html Sent from the Solr - User mailing list archive at Nabble.com.
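For reference, the example jetty that ships with Solr takes heap settings on the java command line; without an explicit -Xmx the JVM picks its own default, which may be larger than you want on a small instance. A typical invocation looks like the following (the values are illustrative, not a recommendation; tune to your box):

```shell
# from the directory containing start.jar (Solr's example jetty)
java -Xms128m -Xmx512m -XX:+UseConcMarkSweepGC -jar start.jar
```

Capping the heap won't stop the OS from swapping on its own, but it does bound how much memory the JVM can grow into before garbage collection kicks in.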
Re: any detailed tutorials on plugin development?
Gosh, sorry for my typo in the first msg... I just realized it now. Well, anyway... I would like to find a detailed tutorial about how to implement an analyzer or a request handler plugin, but I have got nothing from the documentation on the Solr wiki... - Zeki ama calismiyor... Calissa yapar... -- View this message in context: http://lucene.472066.n3.nabble.com/any-detailed-tutorials-on-plugin-development-tp3177821p3184160.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: defType argument weirdness
On Tue, Jul 19, 2011 at 1:25 PM, Naomi Dushay ndus...@stanford.edu wrote: Regardless, I thought that defType=dismaxq=*:* is supposed to be equivalent to q={!defType=dismax}*:* and also equivalent to q={!dismax}*:* Not quite - there is a very subtle distinction. {!dismax} is short for {!type=dismax}, the type of the actual query, and this may not be overridden. The defType local param is only the default type for sub-queries (as opposed to the current query). It's useful in conjunction with the query or nested query qparser: http://lucene.apache.org/solr/api/org/apache/solr/search/NestedQParserPlugin.html -Yonik http://www.lucidimagination.com
Re: How could I monitor solr cache
I am working on dev performance tuning. I am looking for a method that could record cache status into log files. On Tue, Jul 19, 2011 at 2:24 PM, Ahmet Arslan iori...@yahoo.com wrote: I am wondering how could I get solr cache running status. I know there is a JMX containing those information. Just want to know what tool or method do you make use of to monitor cache, in order to enhance performance or detect issue. You might find this interesting : http://sematext.com/spm/solr-performance-monitoring/index.html http://sematext.com/spm/index.html
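Besides JMX, Solr 3.x exposes the same cache statistics over HTTP on the admin stats page, so a cron-style poller can append them to a log file. A minimal sketch in Python; the URL path and the stat keys (hits, lookups, etc.) are assumptions from memory of Solr 3.x and should be verified against your install:

```python
import time
import urllib.request

# Assumed Solr 3.x stats endpoint; verify the path on your install.
STATS_URL = "http://localhost:8983/solr/admin/stats.jsp"

def format_cache_line(ts, cache_name, stats):
    """Render one grep-friendly log line per cache per poll."""
    fields = " ".join(f"{k}={stats[k]}" for k in sorted(stats))
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime(ts))
    return f"{stamp} {cache_name} {fields}"

def poll_once(logfile="cache-stats.log"):
    raw = urllib.request.urlopen(STATS_URL).read()
    # Parse the queryResultCache / filterCache / documentCache entries out
    # of `raw` here (XML in 3.x), then append one line per cache:
    # with open(logfile, "a") as f:
    #     f.write(format_cache_line(time.time(), name, stats) + "\n")
```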
Re: I found a sorting bug in solr/lucene
According to that bug list, there are other characters that break the sorting function. Is there a list of safe characters I can use as a delimiter? On Mon, Jul 18, 2011 at 1:31 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : When I try to sort by a column with a colon in it like : scores:rails_f, solr has cutoff the column name from the colon : forward so scores:rails_f becomes scores Yes, this bug was recently reported against the 3.x line, but no fix has yet been identified... https://issues.apache.org/jira/browse/SOLR-2606 : Can anyone else confirm this is a bug? Is this in lucene or solr? I believe : the issue resides in solr. it's specific to the param parsing, likely due to the addition of support for functions in the sort param. -Hoss -- - sent from my mobile 6176064373
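I don't know of an official whitelist, but the safe characters are the ones with no meaning in Solr's query/sort syntax: sticking to [A-Za-z0-9_] for field names avoids ':' (field/value separator), ',' (sort-clause separator), and whitespace (which separates the field from asc/desc). A small hedged helper for sanitizing dynamically generated field names:

```python
import re

def safe_field_name(name):
    """Replace anything outside [A-Za-z0-9_] with '_' so the name is
    safe to use in the sort param (and in queries generally)."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)

print(safe_field_name("scores:rails_f"))  # scores_rails_f
```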
embedded solrj doesn't refresh index
Hi, I am using embedded SolrJ. After I add a new doc to the index, I can see the change through the Solr web interface, but not from embedded SolrJ. But after I restart the embedded SolrJ instance, I do see the change. It works as if there were a cache. Does anyone know the problem? Thanks. Jianbin
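Not from the thread, but the usual explanation for this symptom: a Lucene/Solr searcher is a snapshot taken when it was opened, and an embedded instance will not see commits made through another path (e.g. the web app writing to the same index directory) until its own searcher is reopened. A toy model of that visibility rule, in plain Python with no Solr dependency:

```python
class ToyIndex:
    """Toy model of commit/searcher visibility: a searcher sees only the
    committed state at the time it was (re)opened, like Lucene's IndexSearcher."""
    def __init__(self):
        self._committed = []
        self._pending = []

    def add(self, doc):
        self._pending.append(doc)

    def commit(self):
        self._committed.extend(self._pending)
        self._pending = []

    def open_searcher(self):
        # A searcher is a frozen snapshot of the committed docs.
        return list(self._committed)

idx = ToyIndex()
stale_searcher = idx.open_searcher()  # opened before the update
idx.add("new doc")
idx.commit()
print("new doc" in stale_searcher)       # stale snapshot, like embedded solrj before restart
print("new doc" in idx.open_searcher())  # reopening (or restarting) picks up the commit
```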
how to get solr core information using solrj
hi all, Our solr server contains two cores, core0 and core1, and they both work well. Now I'm trying to find a way to get information about core0 and core1. Can solrj or some other api do this? thanks very much.
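SolrJ does have a route for this via the CoreAdmin STATUS action (CoreAdminRequest.getStatus in SolrJ), and the same information is available over plain HTTP from /solr/admin/cores?action=STATUS. A small sketch building that request in Python; the base URL is an assumption for illustration:

```python
from urllib.parse import urlencode

def core_status_url(base="http://localhost:8983/solr", core=None):
    """Build a CoreAdmin STATUS request; omit 'core' to list all cores."""
    params = {"action": "STATUS", "wt": "json"}
    if core is not None:
        params["core"] = core
    return f"{base}/admin/cores?{urlencode(sorted(params.items()))}"

print(core_status_url(core="core0"))
print(core_status_url())  # status of every core, core0 and core1 included
```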
RE: defType argument weirdness
Is it generally recognized that this terminology is confusing, or is it just me? I do understand what they do (at least well enough to use them), but I find it confusing that it's called defType as a main param, but type in a LocalParam, when to me they both seem to do the same thing -- which I think should probably be called 'queryParser' rather than 'type' or 'defType'. That's what they do, choose the query parser for the query they apply to, right? (And if they did/do different things, 'defType' vs 'type' doesn't really provide much hint as to what!) These are both the same, right, but with different param names depending on position: defType=lucene&q=foo q={!type=lucene}foo # uri escaping not shown (and then there's 'qt', often confused with defType/type by newbies, since they guess it stands for 'query type', but which should probably actually have been called 'requestHandler'/'rh' instead, since that's what it actually chooses, no? It gets very confusing). If it's generally recognized it's confusing and perhaps a somewhat inconsistent mental model being implied, I wonder if there'd be any interest in renaming these to be more clear, leaving the old ones as aliases/synonyms for backwards compatibility (perhaps with a long deprecation period, or perhaps existing forever). I know it was very confusing to me to keep track of these parameters and what they did for quite a while, and still trips me up from time to time. Jonathan From: ysee...@gmail.com [ysee...@gmail.com] on behalf of Yonik Seeley [yo...@lucidimagination.com] Sent: Tuesday, July 19, 2011 9:40 PM To: solr-user@lucene.apache.org Subject: Re: defType argument weirdness On Tue, Jul 19, 2011 at 1:25 PM, Naomi Dushay ndus...@stanford.edu wrote: Regardless, I thought that defType=dismax&q=*:* is supposed to be equivalent to q={!defType=dismax}*:* and also equivalent to q={!dismax}*:* Not quite - there is a very subtle distinction. 
{!dismax} is short for {!type=dismax}, the type of the actual query, and this may not be overridden. The defType local param is only the default type for sub-queries (as opposed to the current query). It's useful in conjunction with the query or nested query qparser: http://lucene.apache.org/solr/api/org/apache/solr/search/NestedQParserPlugin.html -Yonik http://www.lucidimagination.com
Re: How to find whether solr server is running or not
François Schiettecatte, how will I get the 200 status code? I am getting a JSON response from the Solr server, as *$.getJSON(http://192.168.1.9:8983/solr/db/select/?qt=dismax&wt=json&start=0&rows=100&q=elegant&hl=true&hl.fl=text&hl.usePhraseHighlighter=true&sort=score&json.wrf=?, function(result){ * but if the Solr server is not running this code does not execute... how can I check then that the server is not running? - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3184556.html Sent from the Solr - User mailing list archive at Nabble.com.
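$.getJSON gives you no error hook (especially with JSONP), so one common approach is a separate liveness check against Solr's ping handler (/solr/admin/ping, or /solr/db/admin/ping per core) and only issuing the query when it answers; with jQuery you would use $.ajax with an error callback instead. A sketch of the check in Python for illustration (the host/port are taken from the question, the ping path is an assumption to verify):

```python
import urllib.error
import urllib.request

def solr_is_up(ping_url, timeout=2):
    """True iff the URL answers with HTTP 200 within the timeout;
    a refused connection, timeout, or non-200 status all count as down."""
    try:
        with urllib.request.urlopen(ping_url, timeout=timeout) as resp:
            return resp.getcode() == 200
    except (urllib.error.URLError, OSError):
        return False

# e.g. solr_is_up("http://192.168.1.9:8983/solr/db/admin/ping")
```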