CachedSqlEntityProcessor And Delta Imports

2010-02-26 Thread KirstyS

Hi,
I am on the 1.4 Nightly build from September (still need to upgrade).
I am using CachedSqlEntityProcessor for my main queries but was hoping to
use it for my delta-imports as well. Is this possible?
I have a main entity called 'Article' and then:
<entity name="body" pk="CmsArticleBodyPageId"
        query="SELECT Body, CmsArticleId
               FROM CmsArticleBodyPage"
        processor="CachedSqlEntityProcessor"
        cacheKey="CmsArticleId"
        cacheLookup="article.CmsArticleId"
        deltaQuery="SELECT Body, CmsArticleId
                    FROM CmsArticleBodyPage
                    WHERE convert(nvarchar(50), UpdatedDate, 127) >
                          convert(nvarchar(50), replace('${dataimporter.last_index_time}', '\', ''), 127)"
        parentDeltaQuery="select * from SolrSearch where
                          convert(nvarchar(50), CmsArticleId) = convert(nvarchar(50), '${body.CmsArticleId}')">
  <field column="Body" name="Body"/>
</entity>

If you can use CachedSqlEntityProcessor for delta queries, how would one
change this?
Thanks in advance
Kirsty
-- 
View this message in context: 
http://old.nabble.com/CachedSqlEntityProcessor-And-Delta-Imports-tp27717661p27717661.html
Sent from the Solr - User mailing list archive at Nabble.com.



Spellcheck in mulitlanguage

2010-02-26 Thread Sudhakar_Thangavel

Hi

I need spell-check suggestions for user queries (in Italian). How can I
get this in my Solr?



RE: Solr Cell RTF Woes

2010-02-26 Thread David.Dankwerth
Are you running on a Linux/Unix box that has no X server? Did you try the
headless options?
http://java.sun.com/developer/technicalArticles/J2SE/Desktop/headless/

Tika's RTF parser uses Swing and AWT to analyze the RTF; these in turn will
attempt to use graphics libraries unless you run headless.
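
A minimal sketch of how headless mode can be checked from Java, assuming a standard JDK. The class name HeadlessCheck is made up for illustration; java.awt.headless and GraphicsEnvironment.isHeadless() are the standard JDK property and call. In a servlet container you would normally pass -Djava.awt.headless=true in the JVM options rather than set it in code.

```java
import java.awt.GraphicsEnvironment;

// Hypothetical helper class, not part of Solr or Tika.
class HeadlessCheck {
    /**
     * Sets the java.awt.headless system property and reports whether the JVM
     * now considers itself headless. The property only takes effect if it is
     * set before AWT first evaluates its headless state.
     */
    static boolean enableHeadless() {
        System.setProperty("java.awt.headless", "true");
        return GraphicsEnvironment.isHeadless();
    }

    public static void main(String[] args) {
        System.out.println("headless = " + enableHeadless());
    }
}
```

If this reports false on the server, the property was probably applied too late, which is one reason to prefer the JVM option.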



-----Original Message-----
From: Bill Engle [mailto:billengle...@gmail.com] 
Sent: 25 February 2010 19:09
To: solr-user@lucene.apache.org
Subject: Solr Cell RTF Woes

Any RTF file I tried to index in Solr 1.4 throws these errors out.  I
have no issues with doc, pdf.  Any thoughts?  Thanks.

Apache Tomcat/6.0.18 - Error report

HTTP Status 500 - Could not initialize class java.awt.EventQueue

java.lang.NoClassDefFoundError: Could not initialize class java.awt.EventQueue
    at javax.swing.SwingUtilities.isEventDispatchThread(SwingUtilities.java:1333)
    at javax.swing.text.StyleContext.reclaim(StyleContext.java:437)
    at javax.swing.text.StyleContext.addAttribute(StyleContext.java:294)
    at javax.swing.text.StyleContext$NamedStyle.addAttribute(StyleContext.java:1488)
    at javax.swing.text.StyleContext$NamedStyle.setName(StyleContext.java:1298)
    at javax.swing.text.StyleContext$NamedStyle.<init>(StyleContext.java:1245)
    at javax.swing.text.StyleContext.addStyle(StyleContext.java:90)
    at javax.swing.text.StyleContext.<init>(StyleContext.java:70)
    at javax.swing.text.DefaultStyledDocument.<init>(DefaultStyledDocument.java:95)
    at org.apache.tika.parser.rtf.RTFParser.parse(RTFParser.java:42)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:119)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:105)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:190)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:637)

type Status report
message Could not initialize class java.awt.EventQueue

Content Extraction

2010-02-26 Thread Lee Smith
Hey All

Hope someone can advise.

I followed the example in the wiki on how to extract an HTML page, i.e.

curl 'http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true' -F myfi...@tutorial.html

It displayed an HTML page with a 404 and did not index the document.

Any suggestions on how I can fix this?

Thanks if you can advise.

Lee



Re: Index size

2010-02-26 Thread Jean-Sebastien Vachon
Hi,

Each document can be up to 10K. Most of it comes from a single field which
is both indexed and stored.
The data is uncompressed because compression would eat up too much CPU
considering the volume we have. We have around 30 fields in all.
We also need to compute some facets, collapse the documents forming
the result set, and be able to sort them on any field.
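
For a rough sense of scale, here is a back-of-the-envelope sketch using only the two figures from this thread (about half a billion documents, up to 10K each). It gives an upper bound on the raw document data, not the index size, which also depends on analysis and stored fields. Reading "10K" as 10 KiB is an assumption, and the class and method names are made up for illustration.

```java
// Hypothetical helper for the arithmetic only.
class RawDataEstimate {
    /** Upper bound on raw document bytes: document count times maximum document size. */
    static long maxRawBytes(long documents, long maxBytesPerDocument) {
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(documents, maxBytesPerDocument);
    }

    public static void main(String[] args) {
        long documents = 500_000_000L;   // "about half a billion documents"
        long maxDocBytes = 10L * 1024L;  // "up to 10K", read here as 10 KiB (assumption)
        long bytes = maxRawBytes(documents, maxDocBytes);
        System.out.printf("upper bound: %d bytes (about %.2f TiB)%n",
                bytes, bytes / Math.pow(1024, 4));
    }
}
```

Since most documents are presumably smaller than the maximum, the real figure should be lower.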

Thx

On 2010-02-25, at 5:50 PM, Otis Gospodnetic wrote:

 It depends on many factors - how big those docs are (compare a tweet to a 
 news article to a book chapter) whether you store the data or just index it, 
 whether you compress it, how and how much you analyze the data, etc.
 
 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Hadoop ecosystem search :: http://search-hadoop.com/
 
 
 
 ----- Original Message ----
 From: Jean-Sebastien Vachon js.vac...@videotron.ca
 To: solr-user@lucene.apache.org
 Sent: Wed, February 24, 2010 8:57:21 AM
 Subject: Index size
 
 Hi All,
 
 I'm currently looking at integrating Solr and I'd like to have some hints on
 the size of the index (number of documents) I could possibly host on a
 Double-Quad server (16 cores) with 48Gb of RAM running Linux. Basically, I
 need to determine how many of these servers would be required to host about
 half a billion documents. Should I set up multiple Solr instances (in
 Virtual Machines or not) or should I run a single instance (with multicores
 or not) using all available memory as the cache?
 
 I also made some tests with sharding on this same server and I could not see
 any improvement (at least not with 4.5 million documents). Should all the
 shards be hosted on different servers? I shall try with more documents in
 the following days.
 
 Thx
 



Auto suggestion

2010-02-26 Thread Suram

Hi 

Auto-suggestion does not return newly indexed data. How can I configure
that? Can anyone help me?

Thanks in advance



Highest frequency

2010-02-26 Thread pcmanprogrammeur

Hello all (sorry if my English is bad, I'm French)!

I have a Solr index with ads which contain a title and a description.
For example:
ad 1 : title = test / description = [empty]
ad 2 : title = test on test / description = this is a test
Now, if I execute the request "test" in solr/admin, ad 1 is the first
result, whereas ad 2 is more relevant because the word "test" occurs
more often in it.
So, is it possible to tell Solr to sort the results by word frequency?

Thanks for your help !



Re: new/first searcher

2010-02-26 Thread Marc Sturlese

There's no problem with having the same warming queries in both cases.
firstSearcher queries are used to warm the index when you start the Solr
instance. newSearcher queries warm the index after a commit is executed, for
example.
With firstSearcher warming there is no previously opened IndexSearcher. With
newSearcher warming there is one, and it keeps serving search requests while
the new one is being warmed.

solrquestion6 wrote:
 
 Hi,
 
 Is it the wrong approach to have the same warmup queries in both new and
 first searcher? The wiki shows a sorting query for the newSearcher and the
 same sorting query plus facet/filter queries for the firstSearcher. 
 
 
 
 




Re: Highest frequency

2010-02-26 Thread Marc Sturlese

As far as I know it's not supported by default. I think you should implement
your own custom Lucene Similarity class and plug it into Solr via schema.xml

pcmanprogrammeur wrote:
 
 Hello all (sorry if my English is bad, I'm French)!
 
 I have a Solr index with ads which contain a title and a description.
 For example:
 ad 1 : title = test / description = [empty]
 ad 2 : title = test on test / description = this is a test
 Now, if I execute the request "test" in solr/admin, ad 1 is the first
 result, whereas ad 2 is more relevant because the word "test" occurs
 more often in it.
 So, is it possible to tell Solr to sort the results by word frequency?
 
 Thanks for your help !
 




Re: Content Extraction

2010-02-26 Thread Erick Erickson
You really have to provide more details of
a) what you did.
b) what the results were.

Have you looked at your index with the admin page and/or Luke?
Have you tried querying in the admin page?
Have you examined the logs to see what they report?

Best
Erick

On Fri, Feb 26, 2010 at 7:54 AM, Lee Smith l...@weblee.co.uk wrote:

 Hey All

 Hope someone can advise.

 I followed the example in the wiki on how to extract an HTML page, i.e.

 curl 'http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true' -F myfi...@tutorial.html

 It displayed an HTML page with a 404 and did not index the document.

 Any suggestions on how I can fix this?

 Thanks if you can advise.

 Lee




Re: Auto suggestion

2010-02-26 Thread Erick Erickson
Have you reopened the index after you added the data?

Erick

On Fri, Feb 26, 2010 at 9:23 AM, Suram reactive...@yahoo.com wrote:


 Hi

 Auto-suggestion does not return newly indexed data. How can I configure
 that? Can anyone help me?

 Thanks in advance




SEVERE: java.lang.NumberFormatException: For input string: 104708

2010-02-26 Thread Pablo Mercado
Hello,

Solr is raising the following exception when processing queries that
sort on an integer attribute.  The same queries and sorts have been
running fine in production for almost a year now.   If I run the query
without the sort on the integer attribute, the query runs fine.  If I
run a query that would return 0 results, but still has a sort
parameter the exception is raised.  The stack trace is the same no
matter what the query.

I need help troubleshooting this issue.  Any clues, or suggested
approaches would be helpful.  Thank you in advance!

The stack trace is as follows:

SEVERE: java.lang.NumberFormatException: For input string: 104708
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Integer.parseInt(Integer.java:456)
    at java.lang.Integer.parseInt(Integer.java:497)
    at org.apache.lucene.search.FieldCacheImpl$3.parseInt(FieldCacheImpl.java:148)
    at org.apache.lucene.search.FieldCacheImpl$7.createValue(FieldCacheImpl.java:262)
    at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
    at org.apache.lucene.search.FieldCacheImpl.getInts(FieldCacheImpl.java:245)
    at org.apache.lucene.search.FieldCacheImpl.getInts(FieldCacheImpl.java:239)
    at org.apache.lucene.search.FieldSortedHitQueue.comparatorInt(FieldSortedHitQueue.java:291)
    at org.apache.lucene.search.FieldSortedHitQueue$1.createValue(FieldSortedHitQueue.java:188)
    at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
    at org.apache.lucene.search.FieldSortedHitQueue.getCachedComparator(FieldSortedHitQueue.java:168)
    at org.apache.lucene.search.FieldSortedHitQueue.<init>(FieldSortedHitQueue.java:56)
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:907)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:838)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:269)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:160)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:169)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)


Our solr info is:
Solr Specification Version: 1.3.0
Solr Implementation Version: 1.3.0 694707 - grantingersoll - 2008-09-12 11:06:47
Lucene Specification Version: 2.4-dev
Lucene Implementation Version: 2.4-dev 691741 - 2008-09-03 15:25:16


replication. when the slave goes down...

2010-02-26 Thread Matthieu Labour
Hi
I have 2 Solr machines: 1 master and 1 slave replicating the index from the
master.
The machine on which the slave is running went down while replication was
running.
I suppose the index must be corrupted. Can I safely remove the index on the
slave and restart it, and will the slave then start the replication over
from scratch?
Thank you



  

Re: Solr Cell RTF Woes

2010-02-26 Thread Bill Engle
Thanks.   Headless put me in the right direction.

I am running on a headless Mac OSX 10.6 Server.

I added the below to my {CATALINA_HOME}/bin/setenv.sh file and now I am
indexing RTF.

export JAVA_OPTS="-d64 -server -Xmx1024m -XX:MaxPermSize=512m -Djava.awt.headless=true -Dsun.lang.ClassLoader.allowArraySyntax=true"


Thanks again!

-Bill

On Fri, Feb 26, 2010 at 7:50 AM, david.dankwe...@ubs.com wrote:

 Are you running on a Linux/Unix box that has no X server? Did you try the
 headless options?
 http://java.sun.com/developer/technicalArticles/J2SE/Desktop/headless/

 Tika's RTF parser uses Swing and AWT to analyze the RTF; these in turn will
 attempt to use graphics libraries unless you run headless.



 -----Original Message-----
 From: Bill Engle [mailto:billengle...@gmail.com]
 Sent: 25 February 2010 19:09
 To: solr-user@lucene.apache.org
 Subject: Solr Cell RTF Woes

 Any RTF file I tried to index in Solr 1.4 throws these errors out.  I
 have no issues with doc, pdf.  Any thoughts?  Thanks.


Re: SEVERE: java.lang.NumberFormatException: For input string: 104708

2010-02-26 Thread Yonik Seeley
One of your field values isn't a valid integer: it's "104708".
You're probably using the straight integer type in 1.3, which is meant
for backward compatibility with existing Lucene indexes and currently
doesn't do validation on its input.

For Solr 1.4, int is a new field type (the example schema maps it to
TrieIntField) that does do validation at index time, and is just as
efficient for sorting.
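
To illustrate the kind of check the stack trace is failing on: sorting ends up in Integer.parseInt, which rejects anything that is not a plain integer, including surrounding whitespace. The sketch below is a hypothetical pre-indexing check, not Solr code; the class name and sample values are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical validation helper; only Integer.parseInt is a real JDK call here.
class IntFieldCheck {
    /** Returns the values that Integer.parseInt rejects, in input order. */
    static List<String> invalidInts(List<String> values) {
        List<String> bad = new ArrayList<>();
        for (String v : values) {
            try {
                Integer.parseInt(v);
            } catch (NumberFormatException e) {
                bad.add(v);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Made-up sample values; the second one has a trailing space.
        List<String> values = Arrays.asList("104708", "104708 ", "12x", "-7");
        System.out.println(invalidInts(values));
    }
}
```

Running a check like this on the source data before indexing would catch such values with a field type that does not validate its input.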

-Yonik
http://www.lucidimagination.com


On Fri, Feb 26, 2010 at 9:59 AM, Pablo Mercado pa...@sbnation.com wrote:
 Hello,

 Solr is raising the following exception when processing queries that
 sort on integer attribute.  The same queries and sorts have been
 running fine in production for almost a year now.   If I run the query
 without the sort on the integer attribute, the query runs fine.  If I
 run a query that would return 0 results, but still has a sort
 parameter the exception is raised.  The stack trace is the same no
 matter what the query.

 I need help troubleshooting this issue.  Any clues, or suggested
 approaches would be helpful.  Thank you in advance!.



 Our solr info is:
 Solr Specification Version: 1.3.0
 Solr Implementation Version: 1.3.0 694707 - grantingersoll - 2008-09-12 
 11:06:47
 Lucene Specification Version: 2.4-dev
 Lucene Implementation Version: 2.4-dev 691741 - 2008-09-03 15:25:16



Re: Highest frequency

2010-02-26 Thread Erick Erickson
The underlying Lucene automatically takes this into account: it scores on
the term frequency in relation to the length of the field rather than just a
term count. So in your example, doc 1 has a complete field match on title,
so it scores higher.
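
A simplified sketch of that effect, assuming the default Lucene similarity, where the term-frequency factor is sqrt(freq) and the length norm is 1/sqrt(number of terms in the field). It leaves out idf, boosts, coord, query normalization and the lossy encoding of norms, and it assumes "on" is not removed as a stop word. The class and method names are made up.

```java
// Illustration only; this is not Lucene's actual scoring code.
class TfLengthNormSketch {
    /** Term-frequency factor: square root of the number of occurrences. */
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    /** Length normalization: 1 / sqrt(number of terms in the field). */
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    /** The part of the score that depends on frequency and field length. */
    static double tfTimesNorm(int freq, int numTerms) {
        return tf(freq) * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // doc 1: title "test"         -> 1 occurrence in a 1-term field
        // doc 2: title "test on test" -> 2 occurrences in a 3-term field
        System.out.println("doc 1: " + tfTimesNorm(1, 1));
        System.out.println("doc 2: " + tfTimesNorm(2, 3));
    }
}
```

Under these assumptions doc 1 gets a factor of 1.0 and doc 2 about 0.82, so the shorter title wins even though the other one repeats the term.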

Also, depending upon how you set things up you may not be searching
on description. Unless you specify otherwise, searches only go against the
default field (see your schema for the default field).

Which brings up the question of whether you really want to override this
behavior. Do you really want a document with 10,000 tokens in it that
mentions test five times to score higher than a document with 3 tokens that
mentions test three times?

This page may help you resolve this kind of question...

http://lucene.apache.org/java/2_4_0/scoring.html

HTH
Erick

On Fri, Feb 26, 2010 at 9:30 AM, pcmanprogrammeur
pcmanprogramm...@neuf.fr wrote:


 Hello all (sorry if my English is bad, I'm French)!

 I have a Solr index with ads which contain a title and a description.
 For example:
 ad 1 : title = test / description = [empty]
 ad 2 : title = test on test / description = this is a test
 Now, if I execute the request "test" in solr/admin, ad 1 is the first
 result, whereas ad 2 is more relevant because the word "test" occurs
 more often in it.
 So, is it possible to tell Solr to sort the results by word frequency?

 Thanks for your help !




Re: Content Extraction

2010-02-26 Thread Lee Smith
Hi Erick

I posted more details yesterday but got no response.

I have a screenshot of what it does: http://screencast.com/t/MGRiZTU5M

After running it, I ran a query that returned 0 results, and when I checked
how many docs are indexed the value was 0.

Hope you can shed some more light for me.

Lee

On 26 Feb 2010, at 14:57, Erick Erickson wrote:

 You really have to provide more details of
 a) what you did.
 b) what the results were.
 
 Have you looked at your index with the admin page and/or Luke?
 Have you tried querying in the admin page?
 Have you examined the logs to see what they report?
 
 Best
 Erick
 
 On Fri, Feb 26, 2010 at 7:54 AM, Lee Smith l...@weblee.co.uk wrote:
 
 Hey All
 
 Hope someone can advise.
 
 I followed the example in the wiki on how to extract an HTML page, i.e.
 
 curl 'http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true' -F myfi...@tutorial.html
 
 It displayed an HTML page with a 404 and did not index the document.
 
 Any suggestions on how I can fix this?
 
 Thanks if you can advise.
 
 Lee
 
 



Re: SEVERE: java.lang.NumberFormatException: For input string: 104708

2010-02-26 Thread Pablo Mercado
Thank you for taking the time to look at my issue and respond.

Do you have any suggestions for purging the document with this field
from the index?  Would that even help?

I do not know which document has the corrupt value, and searching for
the document with something like

pk_i:104708

does return a document with that value.

(pk_i is the integer field that we try to sort on and that,
presumably, has a non-integer value stored for some document)
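
One possibility, and it is only an assumption since the log line shows nothing unusual, is that the stored value carries a character that is not visible when printed (for example a non-breaking space). A small sketch for making such characters visible in a suspicious value; the class name and the sample value are made up.

```java
// Hypothetical debugging helper, not part of Solr or Lucene.
class CodePointDump {
    /** Renders each character of the value as U+XXXX so non-printing characters show up. */
    static String dump(String value) {
        StringBuilder sb = new StringBuilder();
        value.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up example: digits followed by a non-breaking space (U+00A0)
        System.out.println(dump("104708\u00A0"));
    }
}
```

Dumping the field value of suspect documents this way (or the corresponding rows in the source database) would show whether anything besides the digits U+0030 to U+0039 is present.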




On Fri, Feb 26, 2010 at 10:26, Yonik Seeley yo...@lucidimagination.com wrote:
 One of your field values isn't a valid integer: it's "104708".
 You're probably using the straight integer type in 1.3, which is meant
 for backward compatibility with existing Lucene indexes and currently
 doesn't do validation on its input.

 For Solr 1.4, int is a new field type (example schema maps it to
 TrieIntField) that does do validation at index time, and is just as
 efficient for sorting.

 -Yonik
 http://www.lucidimagination.com


 On Fri, Feb 26, 2010 at 9:59 AM, Pablo Mercado pa...@sbnation.com wrote:
 Hello,

 Solr is raising the following exception when processing queries that
 sort on integer attribute.  The same queries and sorts have been
 running fine in production for almost a year now.   If I run the query
 without the sort on the integer attribute, the query runs fine.  If I
 run a query that would return 0 results, but still has a sort
 parameter the exception is raised.  The stack trace is the same no
 matter what the query.

 I need help troubleshooting this issue.  Any clues, or suggested
 approaches would be helpful.  Thank you in advance!.

 The stack trace is as follows:

 SEVERE: java.lang.NumberFormatException: For input string: 104708
        at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
        at java.lang.Integer.parseInt(Integer.java:456)
        at java.lang.Integer.parseInt(Integer.java:497)
        at 
 org.apache.lucene.search.FieldCacheImpl$3.parseInt(FieldCacheImpl.java:148)
        at 
 org.apache.lucene.search.FieldCacheImpl$7.createValue(FieldCacheImpl.java:262)
        at 
 org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
        at 
 org.apache.lucene.search.FieldCacheImpl.getInts(FieldCacheImpl.java:245)
        at 
 org.apache.lucene.search.FieldCacheImpl.getInts(FieldCacheImpl.java:239)
        at 
 org.apache.lucene.search.FieldSortedHitQueue.comparatorInt(FieldSortedHitQueue.java:291)
        at 
 org.apache.lucene.search.FieldSortedHitQueue$1.createValue(FieldSortedHitQueue.java:188)
        at 
 org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:72)
        at 
 org.apache.lucene.search.FieldSortedHitQueue.getCachedComparator(FieldSortedHitQueue.java:168)
        at 
 org.apache.lucene.search.FieldSortedHitQueue.init(FieldSortedHitQueue.java:56)
        at 
 org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:907)
        at 
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:838)
        at 
 org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:269)
        at 
 org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:160)
        at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:169)
        at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
        at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
        at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
        at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
        at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
        at 
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
        at 
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
        at 
 org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
        at org.mortbay.jetty.Server.handle(Server.java:285)
        at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
        at 
 org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
        at 
 

Re: Solr Cell and Deduplication - Get ID of doc

2010-02-26 Thread Bill Engle
Any thoughts on this? I would like to get the id back in the request after
indexing.  My initial thoughts were to do a search to get the docid  based
on the attr_stream_name after indexing but now that I reread my message I
mentioned the attr_stream_name (file_name) may be different so that is
unreliable.  My only option is to somehow return the id in the XML
response.  Any guidance is greatly appreciated.

-Bill

On Wed, Feb 24, 2010 at 12:06 PM, Bill Engle billengle...@gmail.com wrote:

 Hi -

 New Solr user here.  I am using Solr Cell to index files (PDF, doc, docx,
 txt, htm, etc.) and there is a good chance that a new file will have
 duplicate content but not necessarily the same file name.  To avoid this I
 am using the deduplication feature of Solr.

 <updateRequestProcessorChain name="dedupe">
   <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
     <bool name="enabled">true</bool>
     <str name="signatureField">id</str>
     <bool name="overwriteDupes">true</bool>
     <str name="fields">attr_content</str>
     <str name="signatureClass">org.apache.solr.update.processor.</str>
   </processor>
   <processor class="solr.LogUpdateProcessorFactory" />
   <processor class="solr.RunUpdateProcessorFactory" />
 </updateRequestProcessorChain>

 How do I get the id value after Solr processing?  Is there some way to
 modify the curl response so that the id is returned?  I need this id because I
 would like to rename the file to the id value.  I could probably do a Solr
 search after the fact to get the id field based on the attr_stream_name, but
 I would like to do only one request.

 curl 'http://localhost:8080/solr/update/extract?uprefix=attr_&fmap.content=attr_content&commit=true' \
 -F myfi...@myfile.pdf

 Thanks,
 Bill



Re: SEVERE: java.lang.NumberFormatException: For input string: 104708

2010-02-26 Thread Mark Miller

You have to find the document with the bad value somehow.

In the past I have used Luke to help with this.

Then you need to delete the document.

Finally, you have to get the deleted document out of the index through a
merge (otherwise the bad term will still be loaded by the FieldCache). The
easiest way to do this is an optimize.
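
The delete / merge steps above can be sketched as the XML update messages one would send to Solr, in order. This is a minimal sketch, assuming the messages are POSTed to the core's XML update handler and that the bad document's unique key is already known; the helper names and the example id are hypothetical.

```python
from xml.sax.saxutils import escape


def delete_message(doc_id):
    """XML update message that deletes one document by its unique key."""
    return "<delete><id>%s</id></delete>" % escape(doc_id)


def cleanup_messages(doc_id):
    """Messages to send, in order: delete, commit, then optimize to force the merge."""
    return [delete_message(doc_id), "<commit/>", "<optimize/>"]


if __name__ == "__main__":
    # "doc-42" is a made-up id used only for illustration.
    for message in cleanup_messages("doc-42"):
        print(message)
```

The optimize comes last because, as noted above, the deleted document's terms stay visible to the FieldCache until the segments are merged.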



--
- Mark

http://www.lucidimagination.com




Re: Changing term frequency according to value of one of the fields

2010-02-26 Thread Joe Calderon
Extend the Similarity class, compile it against the jars in lib, put it on
a path Solr can find, and set your schema to use it.

http://wiki.apache.org/solr/SolrPlugins#Similarity
On 02/25/2010 10:09 PM, Pooja Verlani wrote:

Hi,
I want to modify Similarity class for my app like the following-
Right now tf is Math.sqrt(termFrequency)
I would like to modify it to
Math.sqrt(termFrequency/solrDoc.getFieldValue(count))
where count is one of the fields in the particular solr document.
Is it possible to do so? Can I import solrDocument class and take the
particular solrDoc for calculating tf in the similarity class?
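
To make the proposed change concrete, here is the arithmetic only, as a small sketch. It does not show how a Similarity subclass would get hold of the per-document count value, which is the open question in this message; the function names are hypothetical and count is simply passed in as a number.

```python
import math


def default_tf(term_frequency):
    """Current behaviour described above: square root of the raw term frequency."""
    return math.sqrt(term_frequency)


def scaled_tf(term_frequency, count):
    """Proposed behaviour: divide by the document's count field before the root."""
    if count <= 0:
        raise ValueError("count must be positive")
    return math.sqrt(term_frequency / count)
```

With a term frequency of 8 and a count of 2, the proposed formula gives 2.0, where the current one gives roughly 2.83; with a count of 1 the two agree.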

Please suggest.

regards,
Pooja

   




Solrsharp

2010-02-26 Thread Frederico Azeiteiro
Hi,

I don't know if this list covers this kind of help, but I'm using
Solrsharp with C# to operate Solr. Please let me know if this is off-topic.

I'm having a little trouble running a search that excludes terms using
the query parameters.

Does anyone use Solrsharp around here? Do you manage to exclude terms
in searches?

Br

Frederico



Re: SEVERE: java.lang.NumberFormatException: For input string: 104708

2010-02-26 Thread Yonik Seeley
On Fri, Feb 26, 2010 at 10:59 AM, Mark Miller markrmil...@gmail.com wrote:
 You have to find the document with the bad value somehow.

 In the past I have used Luke to help with this.

 Then you need to delete the document.

You can also find the document with a raw term query.

q={!raw f=myfield}104708
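
When sending that raw term query over HTTP, the braces, exclamation mark and space need URL encoding. A minimal sketch, assuming a Solr instance at http://localhost:8080/solr and using the pk_i field mentioned earlier in the thread; the helper name is made up.

```python
from urllib.parse import urlencode


def raw_term_query_url(base_url, field, term):
    """Build a /select URL whose q parameter is a {!raw} term query."""
    params = {"q": "{!raw f=%s}%s" % (field, term)}
    return base_url + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(raw_term_query_url("http://localhost:8080/solr", "pk_i", "104708"))
```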

-Yonik
http://www.lucidimagination.com


Re: Free Webinar: Mastering Solr 1.4 with Yonik Seeley

2010-02-26 Thread Jay Hill
Yes, it will be recorded and available to view after the presentation.

-Jay


On Thu, Feb 25, 2010 at 2:19 PM, Bernadette Houghton 
bernadette.hough...@deakin.edu.au wrote:

 Yonik, can you please advise whether this event will be recorded and
 available for later download? (It starts 5am our time ;-)  )

 Regards
 Bern

 -Original Message-
 From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
 Seeley
 Sent: Thursday, 25 February 2010 10:23 AM
 To: solr-user@lucene.apache.org
 Subject: Free Webinar: Mastering Solr 1.4 with Yonik Seeley

 I'd like to invite you to join me for an in-depth review of Solr's
 powerful, versatile new features and functions. The free webinar,
 sponsored by my company, Lucid Imagination, covers an intensive
 how-to for the features you need to make the most of Solr for your
 search application:

* Faceting deep dive, from document fields to performance management
* Best practices for sharding, index partitioning and scaling
* How to construct efficient Range Queries and function queries
* Sneak preview: Solr 1.5 roadmap

 Join us for a free webinar
 Thursday, March 4, 2010
 10:00 AM PST / 1:00 PM EST / 18:00 GMT
 Follow this link to sign up

 http://www.eventsvc.com/lucidimagination/030410?trk=WR-MAR2010-AP

 Thanks,

 -Yonik
 http://www.lucidimagination.com



Re: Solr 1.4 distributed search configuration

2010-02-26 Thread Jeffrey Zhao
Now I've got it; I just forgot to put qt=search in the query.

By the way, in Solr 1.3 I used shards.txt under the conf directory and
distributed=true in the query for distributed search.  That way, in my
Java application, I could hard-code the Solr query with distributed=true and
control the use of distributed search by defining shards.txt or not.

In Solr 1.4, it is more difficult to use distributed search dynamically. Is
there a way to make distributed search work by changing only the
configuration, without changing the query?

Thanks, 



From:   Mark Miller markrmil...@gmail.com
To: solr-user@lucene.apache.org
Date:   25/02/2010 04:13 PM
Subject:Re: Solr 1.4 distributed search configuration



Can you elaborate on "doesn't work" when you put it in the /search
handler?

You get an error in the logs? Nothing happens?

On 02/25/2010 03:47 PM, Jeffrey Zhao wrote:
 Hi Mark,

 Thanks for your reply. I did make a new handler as follows, but it does
 not work. Is anything wrong with my configuration?

 Thanks,

 <requestHandler name="search" class="solr.SearchHandler">
   <!-- default values for query parameters -->
   <lst name="defaults">
     <str name="shards">202.161.196.189:8080/solr,localhost:8080/solr</str>
   </lst>
   <arr name="components">
     <str>query</str>
     <str>facet</str>
     <str>spellcheck</str>
     <str>debug</str>
   </arr>
 </requestHandler>



 From:   Mark Millermarkrmil...@gmail.com
 To: solr-user@lucene.apache.org
 Date:   25/02/2010 03:41 PM
 Subject:Re: Solr 1.4 distributed search configuration



 On 02/25/2010 03:32 PM, Jeffrey Zhao wrote:
 
 How do I define a new search handler with a shards parameter?  I defined it
 in the following way but it doesn't work. If I put the shards parameter in
 the default handler, it seems I get an infinite loop.


 <requestHandler name="standard" class="solr.SearchHandler" default="true">
   <!-- default values for query parameters -->
   <lst name="defaults">
     <str name="echoParams">explicit</str>
   </lst>
 </requestHandler>

 <requestHandler name="search" class="solr.SearchHandler">
   <!-- default values for query parameters -->
   <lst name="defaults">
     <str name="shards">202.161.196.189:8080/solr,localhost:8080/solr</str>
   </lst>
   <arr name="components">
     <str>query</str>
     <str>facet</str>
     <str>spellcheck</str>
     <str>debug</str>
   </arr>
 </requestHandler>


 Thanks,

 
 Not seeing this on the wiki (it should be there), but you can't put the
 shards param on the default search handler without causing an infinite
 loop - you have to make a new request handler and put it on that.

 


-- 
- Mark

http://www.lucidimagination.com







RE: Extended stats via JMX

2010-02-26 Thread Dan Trainor
-Original Message-
From: Matthew Runo [mailto:mr...@zappos.com] 
Sent: Thursday, February 25, 2010 12:18 PM
To: solr-user@lucene.apache.org
Subject: Re: Extended stats via JMX

https://issues.apache.org/jira/browse/SOLR-1750 might help you, since I don't 
think that all of stats.jsp is exposed via MBeans. I could be wrong about that 
though.. (apologies, our solr servers are firewalled and I can't connect via 
JMX at the moment)

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mr...@zappos.com - 702-943-7833



Hi, Matthew -

Looks like Shalin confirmed that those values can in fact be found inside the 
RequestHandler's MBean (thanks, Shalin).

Thanks for getting me going in the right direction.  I appreciate it.

Thanks
-dant


Re: Solr 1.4 distributed search configuration

2010-02-26 Thread Joe Calderon
You can set a default shards parameter on the request handler that does
distributed search. You can set up two different request handlers: one
with a shards default and one without.
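
On the client side, switching between the two handlers could then be a matter of adding or omitting qt. A minimal sketch, assuming the sharded handler is the one named search from the earlier configuration and that the on/off switch comes from the application's own settings; the function name and the distributed flag are hypothetical.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, distributed):
    """Build a /select URL, routing to the sharded handler only when asked to."""
    params = [("q", query)]
    if distributed:
        # qt=search selects the handler that carries the shards default.
        params.append(("qt", "search"))
    return base_url + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url("http://localhost:8080/solr", "ipod", True))
    print(build_select_url("http://localhost:8080/solr", "ipod", False))
```

The query itself stays the same in both cases; only the handler selection changes, so the flag can live in application configuration.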


Question on Facets and Multiple values (confusion from the Wiki)

2010-02-26 Thread Mark Bennett
Certainly lots of matches on Solr and facets.

Contrived example:
* Solr 1.4, etc.
* Yellow pages, business listings.
* Business listings have a zip code that I will use in Faceted search.
* Companies with multiple stores/outlets/offices still only have one record,
but all applicable zip codes are listed. (Yes, there are other ways to solve
this, I know.)
* I want each listing to show in all of its zip codes when zip-code facets
are presented
* I declare one or more fields in schema.xml per the Wiki, etc. So 3 fields,
etc.

The behavior I will actually get will be:
(a) As described, a business in 3 zipcodes will show up under all 3 facets
(b) Nope, only the first zip code will work correctly
(c) Yes and No. Only the first zip code will show up in the facets.  But I
could certainly still search on the other codes and find that listing.  If
other businesses are in the other facets, a click on those zip codes will
return my original business, but if it's the ONLY business in the zip code,
the facet will not get displayed

Normally I'd say (a).  Playing with facets and reading online, this should
be possible, though it may take 2 or 3 versions of the field.
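
To pin down what behaviour (a) would mean, here is a small sketch that counts facet values over in-memory records. It is plain Python over dictionaries, not Solr's faceting code, and the listings and zip codes are invented for the example.

```python
from collections import Counter


def facet_counts(listings, field):
    """Count each listing once under every distinct value of a multi-valued field."""
    counts = Counter()
    for listing in listings:
        # set() ensures a listing is counted once per distinct value.
        for value in set(listing.get(field, [])):
            counts[value] += 1
    return counts


if __name__ == "__main__":
    listings = [
        {"name": "Business A", "zip": ["94085", "94086", "94087"]},
        {"name": "Business B", "zip": ["94086"]},
    ]
    print(facet_counts(listings, "zip"))
```

Under (a), Business A contributes to all three zip-code buckets, so a zip code where it is the only business still shows up with a count of 1.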

But why then would I even ask about (b) or (c) ?  Well, there's stuff in the
Wiki that makes me hesitate.  First, look at this page:

http://wiki.apache.org/solr/SolrFacetingOverview
First section, 3rd bullet point, where it says:
For faceting: Primary author only, using a solr.StringField:
* Schildt, Herbert

Obviously if I were sorting on this field the first author would matter a
lot.  And it's a bit ambiguous which copy of the field I'm using, etc.

Other things that cause me to hesitate:

http://wiki.apache.org/solr/SchemaXml#Common_field_options
The multiValued=true|false

And this:
http://wiki.apache.org/solr/FieldOptionsByUseCase
multiValued is left blank in many cases, and not filled in for facets.



--
Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513


replication issue

2010-02-26 Thread Matthieu Labour
Hi

I am still having issues with the replication and wonder if things are working 
properly

So I have 1 master and 1 slave

On the slave, I deleted the data/index directory and 
data/replication.properties file and restarted solr.

When slave is pulling data from master, I can see that the size of data 
directory is growing

r...@slr8:/raid/data# du -sh
3.7M    .
r...@slr8:/raid/data# du -sh
4.7M    .

and I can see that data/replication.properties  file got created and also a 
directory data/index.20100226063400

Soon after, index.20100226063400 disappears and the size of data/index is back to
12K

r...@slr8:/raid/data/index# du -sh
12K    .

And when I look for the number of documents via the admin interface, I still 
see 0 documents so I feel something is wrong

One more thing: I have a symlink for /solr/data -> /raid/data

Thank you for your help !

matt






  

ConcurrentModificationException

2010-02-26 Thread Dan Hertz (Insight 49, LLC)

Hi guys,

SOLR 1.4 (final) and 1.5 nightly work fine on a Windows box, but on our 
Centos 5 box, we're getting a ConcurrentModificationException when 
starting Tomcat 6.


Any tips on how to solve this and/or troubleshoot?

Made sure there are no duplicate libs in Tomcat and solr/lib, and tried 
to cut down contrib stuff to see if it helped, but no luck.


Thanks, Dan.

= = =  Log Below: = = =

INFO   | jvm 1| 2010/02/24 21:27:04 | SEVERE: 
java.util.ConcurrentModificationException
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
java.util.AbstractList$Itr.next(AbstractList.java:343)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:507)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.solr.core.SolrCore.init(SolrCore.java:606)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:429)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.solr.core.CoreContainer.load(CoreContainer.java:285)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:86)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:108)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3800)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.StandardContext.start(StandardContext.java:4450)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:630)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:556)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:491)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.startup.HostConfig.start(HostConfig.java:1206)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:314)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.StandardHost.start(StandardHost.java:722)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.StandardService.start(StandardService.java:516)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.startup.Catalina.start(Catalina.java:583)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
java.lang.reflect.Method.invoke(Method.java:597)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:288)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:413)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
INFO   | jvm 1| 2010/02/24 21:27:04 | at 

Re: ConcurrentModificationException

2010-02-26 Thread Yonik Seeley
Could you open a JIRA issue for this?

After a quick look, it could be firstSearcher / newSearcher events
that are being executed concurrently that change the list?
Could you try commenting out  firstSearcher/newSearcher events in
solrconfig.xml and see if that fixes it?

It could be that a lazy loaded component is triggered by a
firstSearcher/newSearcher event, which happen in other threads,
causing another bean to be added.

-Yonik
http://www.lucidimagination.com




Re: ConcurrentModificationException

2010-02-26 Thread Yonik Seeley
Yep, definitely a bug.
It looks like resourceLoader.newInstance() is fundamentally not thread safe.

-Yonik
http://www.lucidimagination.com

On Fri, Feb 26, 2010 at 2:48 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
 Could you open a JIRA issue for this?

 After a quick look, it could be firstSearcher / newSearcher events
 that are being executed concurrently that change the list?
 Could you try commenting out  firstSearcher/newSearcher events in
 solrconfig.xml and see if that fixes it?

 It could be that a lazy loaded component is triggered by a
 firstSearcher/newSearcher event, which happen in other threads,
 causing another bean to be added.

 -Yonik
 http://www.lucidimagination.com



 On Fri, Feb 26, 2010 at 2:28 PM, Dan Hertz (Insight 49, LLC)
 insigh...@gmail.com wrote:
 Hi guys,

 SOLR 1.4 (final) and 1.5 nightly work fine on a Windows box, but on our
 Centos 5 box, we're getting a ConcurrentModificationException when starting
 Tomcat 6.

 Any tips on how to solve this and/or troubleshoot?

 Made sure there are no duplicate libs in Tomcat and solr/lib, and tried to
 cut down contrib stuff to see if it helped, but no luck.

 Thanks, Dan.

 = = =  Log Below: = = =

 INFO   | jvm 1    | 2010/02/24 21:27:04 | SEVERE:
 java.util.ConcurrentModificationException
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 java.util.AbstractList$Itr.next(AbstractList.java:343)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:507)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.solr.core.SolrCore.init(SolrCore.java:606)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.solr.core.CoreContainer.create(CoreContainer.java:429)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.solr.core.CoreContainer.load(CoreContainer.java:285)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:86)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:108)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3800)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.StandardContext.start(StandardContext.java:4450)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:630)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:556)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:491)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.startup.HostConfig.start(HostConfig.java:1206)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:314)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.StandardHost.start(StandardHost.java:722)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.StandardService.start(StandardService.java:516)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 org.apache.catalina.startup.Catalina.start(Catalina.java:583)
 INFO   | jvm 1    | 2010/02/24 21:27:04 |     at
 

Re: replication issue

2010-02-26 Thread Shalin Shekhar Mangar
On Sat, Feb 27, 2010 at 12:13 AM, Matthieu Labour matthieu_lab...@yahoo.com
 wrote:

 Hi

 I am still having issues with the replication and wonder if things are
 working properly

 So I have 1 master and 1 slave

 On the slave, I deleted the data/index directory and
 data/replication.properties file and restarted solr.

 When slave is pulling data from master, I can see that the size of data
 directory is growing

 r...@slr8:/raid/data# du -sh
 3.7M.
 r...@slr8:/raid/data# du -sh
 4.7M.

 and I can see that data/replication.properties  file got created and also a
 directory data/index.20100226063400

 soon after index.20100226063400 disappears and the size of data/index is
 back to 12K

 r...@slr8:/raid/data/index# du -sh
 12K.

 And when I look for the number of documents via the admin interface, I
 still see 0 documents so I feel something is wrong

 One more thing, I have a symlink for /solr/data ---> /raid/data


The ReplicationHandler moves files out of the temp directory into the index
directory. Java's File#renameTo can fail if the source and target
directories are on different partitions/disks. Is that the case here? I
believe SOLR-1736 fixes this issue in trunk but that was implemented after
the 1.4 release.

-- 
Regards,
Shalin Shekhar Mangar.
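
The cross-partition rename problem described above can be sketched in Python. This is only an analogy under stated assumptions: the function names are hypothetical, it is not what SOLR-1736 does, and Python's os.rename raises an OSError where Java's File#renameTo simply returns false. The first helper checks whether two directories live on the same device (symlinks are followed, which matters for a setup like /solr/data pointing at /raid/data); the second falls back to copy-and-delete when a plain rename is refused.

```python
import errno
import os
import shutil


def same_filesystem(path_a, path_b):
    """True if both paths resolve to the same device (symlinks are followed)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def move_with_fallback(src, dst):
    """Rename src to dst; if they are on different filesystems, copy then delete."""
    try:
        os.rename(src, dst)
    except OSError as exc:
        if exc.errno != errno.EXDEV:  # only handle the cross-device case
            raise
        shutil.copy2(src, dst)
        os.remove(src)
```

Running same_filesystem on the temp directory and the index directory would answer the "different partitions?" question directly.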


Re: HTTP ERROR: 404 missing core name in path after integrating nutch

2010-02-26 Thread Ian Evans
Just wanted to give an update on my efforts.

I installed the Feb. 26 update this morning and was able to access /solr/admin.

I copied over the Nutch schema.xml, restarted Solr, and was still able to
access /solr/admin.

I then edited solrconfig.xml to add the Nutch requestHandler snippet from
Lucid Imagination. After restarting Solr I got the "404 missing core name in
path" error.

What in the requestHandler snippet (see below) could be causing this error?

from http://bit.ly/1mOb

<requestHandler name="/nutch" class="solr.SearchHandler">
<lst name="defaults">
<str name="defType">dismax</str>
<str name="echoParams">explicit</str>
<float name="tie">0.01</float>
<str name="qf">
content^0.5 anchor^1.0 title^1.2
</str>
<str name="pf">
content^0.5 anchor^1.5 title^1.2 site^1.5
</str>
<str name="fl">
url
</str>
<str name="mm">
2&lt;-1 5&lt;-2 6&lt;90%
</str>
<int name="ps">100</int>
<bool hl="true"/>
<str name="q.alt">*:*</str>
<str name="hl.fl">title url content</str>
<str name="f.title.hl.fragsize">0</str>
<str name="f.title.hl.alternateField">title</str>
<str name="f.url.hl.fragsize">0</str>
<str name="f.url.hl.alternateField">url</str>
<str name="f.content.hl.fragmenter">regex</str>
</lst>
</requestHandler>

Have a great weekend.
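
One way to narrow down which line of a handler snippet is at fault is a small lint pass over the XML. The sketch below is only that: it assumes that each entry inside a lst block is expected to carry a name attribute and a text value, and it flags entries that do not. The function name and the shortened sample are hypothetical, and whether a flagged entry is what actually trips Solr here is not confirmed in this thread.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the handler snippet, used only as sample input.
SNIPPET = """
<requestHandler name="/nutch" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <int name="ps">100</int>
    <bool hl="true"/>
    <str name="q.alt">*:*</str>
  </lst>
</requestHandler>
"""


def suspicious_entries(xml_text):
    """Return <lst> children that lack a name attribute or a text value."""
    root = ET.fromstring(xml_text)
    flagged = []
    for lst in root.iter("lst"):
        for child in lst:
            if child.tag in ("lst", "arr"):
                continue  # containers hold children, not text
            has_name = "name" in child.attrib
            has_text = bool((child.text or "").strip())
            if not (has_name and has_text):
                flagged.append(ET.tostring(child, encoding="unicode").strip())
    return flagged
```

On the sample above this flags only the bool entry, which would be the first line to compare against the other entries in the block.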


Re: replication issue

2010-02-26 Thread Matthieu Labour
Shalin

Thank you so much for your answer.
This is the case here.
How can I find out which temp directory Solr replication is using?
Is there a way to set the source (the temp directory used by Solr) and the
target directory in the Solr config file so that they live on the same partition?
Thank you
matt



Re: SEVERE: java.lang.NumberFormatException: For input string: 104708

2010-02-26 Thread Pablo Mercado
A big thanks to Yonik and Mark.  Using the raw term query I was able
to find the range(!) of documents that had bad integer field values.
Deleting those documents, committing and optimizing cleared up the
issue.

Still not sure how the bad values were inserted in the first place,
but that is another task.  Thanks again for being so helpful.



On Fri, Feb 26, 2010 at 11:29, Yonik Seeley yo...@lucidimagination.com wrote:
 On Fri, Feb 26, 2010 at 10:59 AM, Mark Miller markrmil...@gmail.com wrote:
 You have to find the document with the bad value somehow.

 In the past I have used Luke to help with this.

 Then you need to delete the document.

 You can also find the document with a raw term query.

 q={!raw f=myfield}104708

 -Yonik
 http://www.lucidimagination.com
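
The raw term query above contains braces, an exclamation mark, and a space, so it needs URL-encoding when sent over HTTP. A minimal sketch of building such a URL follows; the function name is hypothetical and the host, port, and /solr/select path are assumptions that will differ per installation.

```python
from urllib.parse import urlencode


def raw_term_query_url(base_url, field, value):
    """Build a select URL for q={!raw f=<field>}<value>, URL-encoded."""
    params = {"q": "{!raw f=%s}%s" % (field, value)}
    return base_url + "?" + urlencode(params)


# Assumed base URL; adjust to the actual Solr instance.
url = raw_term_query_url("http://localhost:8983/solr/select", "myfield", "104708")
```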



Re: ConcurrentModificationException

2010-02-26 Thread Dan Hertz (Insight 49, LLC)

On 2010-02-26 12:55 PM, Yonik Seeley wrote:

Yep, definitely a bug.
It looks like resourceLoader.newInstance() is fundamentally not thread safe.

-Yonik

On Fri, Feb 26, 2010 at 2:48 PM, Yonik Seeley
yo...@lucidimagination.com  wrote:
   

Could you open a JIRA issue for this?

Yonik,

Do you still need me to open a JIRA issue, or has one been opened?
(I'm having trouble connecting to issues.apache.org)

Thanks, Dan



Re: Question on Facets and Multiple values (confusion from the Wiki)

2010-02-26 Thread Jan Høydahl / Cominvent
Hi Mark,

If (a) is the wanted behaviour, i.e. having a business show up in the facets
for all of its ZIPs, you should define a multi-valued ZIP field. Since a ZIP is
a number, I don't see any reason to run any analysis on it; a String or a
lightly normalized field type would do the job both for search and for facets.

What I think confuses you is the author example in SolrFacetingOverview, which
chooses to use only the main (first) author for faceting. This is a business
decision for that application and has nothing to do with faceting as such. The
default would be to include all ZIPs. The example on that page should probably
clarify this behaviour.

When it comes to the table in FieldOptionsByUseCase, I agree that for faceting
it makes sense to recommend multiValued if you have multi-valued content, but
it is not required for faceting. I think this table was made to explain which
params you MUST set to enable certain functionality on a field. I would put
true[6] for multiValued, with a footnote that it must be used for multi-value
faceting.

--
Jan Høydahl  - search architect
Cominvent AS - www.cominvent.com
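
To make behaviour (a) concrete, here is a small Python sketch of how facet counts come out when a field holds several values per document. It only mimics the counting; the listings, the field name, and the function are hypothetical and none of this is Solr code.

```python
from collections import Counter

# Hypothetical business listings; "zip" plays the role of a multi-valued field.
listings = [
    {"name": "Acme Hardware", "zip": ["94086", "94087", "95014"]},
    {"name": "Bay Books", "zip": ["94086"]},
]


def facet_counts(docs, field):
    """Count each document once under every distinct value it has for field."""
    counts = Counter()
    for doc in docs:
        for value in set(doc.get(field, [])):
            counts[value] += 1
    return counts


counts = facet_counts(listings, "zip")
```

The business with three ZIPs contributes to three facet buckets, including buckets where it is the only entry.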

On 26. feb. 2010, at 19.12, Mark Bennett wrote:

 Certainly lots of matches on Solr and facets.
 
 Contrived example:
 * Solr 1.4, etc.
 * Yellow pages, business listings.
 * Business listings have a zip code that I will use in Faceted search.
 * Companies with multiple stores/outlets/offices still only have one record,
 but all applicable zip codes are listed. (yes, others ways to solve this, I
 know)
 * I want each listing to show in all of its zip codes when zip-code facets
 are presented
 * I declare one or more fields in schema.xml per the Wiki, etc. So 3 fields,
 etc.
 
 The behavior I will actually get will be:
 (a) As described, a business in 3 zipcodes will show up under all 3 facets
 (b) Nope, only the first zip code will work correctly
 (c) Yes and No. Only the first zip code will show up in the facets.  But I
 could certainly still search on the other codes and find that listing.  If
 other businesses are in the other facets, a click on those zip codes will
 return my original business, but if it's the ONLY business in the zip code,
 the facet will not get displayed
 
 Normally I'd say (a).  Playing with facets and reading online, this should
 be possible, though it may take 2 or 3 versions of the field.
 
 But why then would I even ask about (b) or (c) ?  Well, there's stuff in the
 Wiki that makes me hesitate.  First, look at this page:
 
 http://wiki.apache.org/solr/SolrFacetingOverview
 First section, 3rd bullet point, where it says:
 For faceting: Primary author only, using a solr.StringField:
* Schildt, Herbert
 
 Obviously if I were sorting on this field the first author would matter a
 lot.  And it's a bit ambiguous which copy of the field I'm using, etc.
 
 Other things that cause me to hesitate:
 
 http://wiki.apache.org/solr/SchemaXml#Common_field_options
 The multiValued=true|false
 
 And this:
 http://wiki.apache.org/solr/FieldOptionsByUseCase
 multiValued is left blank in many cases, and not filled in for facets.
 
 
 
 --
 Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
 Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513



Re: If you could have one feature in Solr...

2010-02-26 Thread Don Werve

Realtime search, hands down.


Re: If you could have one feature in Solr...

2010-02-26 Thread Stephen Weiss

+1

I have several projects backburnered in the hope realtime search will  
come to solr soon...


[m]

On Feb 26, 2010, at 8:37 PM, Don Werve d...@madwombat.com wrote:


Realtime search, hands down.


RE: If you could have one feature in Solr...

2010-02-26 Thread Stuart Yeates
The indexer looking for an xml:lang attribute on text fields and using the
value to pick the tokeniser, dictionaries, etc. automatically (and knowing to
look for them in the standard places).

cheers
stuart

Re: Solr Cell and Deduplication - Get ID of doc

2010-02-26 Thread Lance Norskog
You could create your own unique ID and pass it in with the
literal.field=value feature.

http://wiki.apache.org/solr/ExtractingRequestHandler#Input_Parameters
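
A minimal sketch of that idea in Python: compute an ID on the client before sending the file, and pass it as literal.id so the caller already knows it. The function name is hypothetical, the base URL is an assumption, and hashing the raw file bytes is my own stand-in for "your own unique ID"; it is not the same as Solr's signature over the extracted attr_content, so two files with identical text but different bytes would get different IDs. If the dedupe chain still writes its signature into the id field, it may overwrite this value; that interaction is not covered in this thread.

```python
import hashlib
from urllib.parse import urlencode


def extract_url_with_id(base_url, file_bytes):
    """Return (doc_id, url) where doc_id is a client-side hash of the file bytes."""
    doc_id = hashlib.sha1(file_bytes).hexdigest()
    params = [
        ("literal.id", doc_id),           # client-chosen unique ID
        ("uprefix", "attr_"),
        ("fmap.content", "attr_content"),
        ("commit", "true"),
    ]
    return doc_id, base_url + "?" + urlencode(params)
```

The returned doc_id can then be used to rename the file without a second request.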

On Fri, Feb 26, 2010 at 7:56 AM, Bill Engle billengle...@gmail.com wrote:
 Any thoughts on this? I would like to get the id back in the request after
 indexing.  My initial thoughts were to do a search to get the docid  based
 on the attr_stream_name after indexing but now that I reread my message I
 mentioned the attr_stream_name (file_name) may be different so that is
 unreliable.  My only option is to somehow return the id in the XML
 response.  Any guidance is greatly appreciated.

 -Bill

 On Wed, Feb 24, 2010 at 12:06 PM, Bill Engle billengle...@gmail.com wrote:

 Hi -

 New Solr user here.  I am using Solr Cell to index files (PDF, doc, docx,
 txt, htm, etc.) and there is a good chance that a new file will have
 duplicate content but not necessarily the same file name.  To avoid this I
 am using the deduplication feature of Solr.

   <updateRequestProcessorChain name="dedupe">
     <processor
 class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
       <bool name="enabled">true</bool>
       <str name="signatureField">id</str>
       <bool name="overwriteDupes">true</bool>
       <str name="fields">attr_content</str>
       <str name="signatureClass">org.apache.solr.update.processor.</str>
     </processor>
     <processor class="solr.LogUpdateProcessorFactory" />
     <processor class="solr.RunUpdateProcessorFactory" />
   </updateRequestProcessorChain>

 How do I get the id value after Solr processing?  Is there some way to
 modify the curl response so that the id is returned?  I need this id because I
 would like to rename the file to the id value.  I could probably do a Solr
 search after the fact to get the id field based on the attr_stream_name but
 I would like to do only one request.

 curl '
 http://localhost:8080/solr/update/extract?uprefix=attr_&fmap.content=attr_content&commit=true'
 -F myfi...@myfile.pdf

 Thanks,
 Bill





-- 
Lance Norskog
goks...@gmail.com


Re: If you could have one feature in Solr...

2010-02-26 Thread Dave Searle
To have a coffee waiting for me every morning when I wake up. Marriage  
material indeed. 


solr for reporting purposes

2010-02-26 Thread adeelmahmood

We are trying to use Solr as somewhat of a reporting system too (along with
search). Since it provides such amazing control over queries, and basically
over the data that the user wants, they might as well be able to dump that
data into an Excel file if needed. Our data isn't too much: close to 25K
docs with 15-20 fields in each doc, and most of these reports will cover
about 500 - 4000 records. I am thinking about setting up a simple servlet
that submits the user query to Solr over HTTP, grabs all of the result data,
and dumps it into an Excel file. I was just hoping to get some idea of
whether this is going to have any performance impact on Solr search,
especially since it's all on the same server and some users will be running
reports while others are searching. Right now search is working GREAT, it's
blazing fast. I don't want to lose this, but at the same time reporting is an
important requirement as well.

Also, I would appreciate any hints towards some creative ways of doing it,
something like getting 500 or so records in a single request and then using
a timer task to repeat the process.

Thanks for your help
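
The "500 records per request" idea can be sketched as a simple paging loop over Solr's start and rows parameters. The Python below is only an outline under stated assumptions: the function names are hypothetical, and fetch_page stands in for whatever code actually sends the query to Solr and returns one page of documents.

```python
def page_windows(total, page_size=500):
    """Yield (start, rows) pairs that cover total results in fixed-size pages."""
    for start in range(0, total, page_size):
        yield start, min(page_size, total - start)


def collect_rows(fetch_page, total, page_size=500):
    """Call fetch_page(start, rows) once per window and concatenate the results."""
    rows = []
    for start, count in page_windows(total, page_size):
        rows.extend(fetch_page(start, count))
    return rows
```

Fetching in pages keeps each individual request small; whether that is needed at all for 500 to 4000 records is something only a measurement on the real server can tell.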



Re: solr for reporting purposes

2010-02-26 Thread adeelmahmood

I just want to clarify, if it's not obvious, that the reason I am concerned
about the performance of Solr is because for reporting requests I will
probably have to request all result rows at the same time, instead of 10
or 20.






indexing using inbuilt lucene in Solr

2010-02-26 Thread mamathahl

Hi
I'm new to Solr.  I have a database which consists of latitude, longitude
and relevant news.  This file has been imported using dataimport.  I think
it has been indexed successfully by Solr.  Now I need to move ahead and run a
few queries like the ones below.
# hsin (great circle): http://localhost:8983/solr/select/?q=name:Minneapolis
AND _val_:recip(hsin(0.78, -1.6, lat_rad, lon_rad, 3963.205), 1, 1, 0)^100

# dist (Euclidean, Manhattan, p-norm):
http://localhost:8983/solr/select/?q=name:Minneapolis AND
_val_:recip(dist(2, lat, lon, 44.794, -93.2696), 1, 1, 0)^100 

How do I get started with it?  Should lucene be used explicitly to index the
file again? Kindly help me to get started off with it.  Thanks in advance.
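
For orientation, the two functions that appear in the first query can be written out in plain Python. This is a sketch of the math only, not of Solr's implementation: the argument order simply follows the query above (point in radians, field values in radians, radius), and recip follows the usual definition a / (m * x + b).

```python
import math


def hsin(lat1, lon1, lat2, lon2, radius):
    """Great-circle (haversine) distance between two points given in radians."""
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def recip(x, m, a, b):
    """Reciprocal a / (m * x + b); turns a distance into a score that shrinks with x."""
    return a / (m * x + b)
```

With m=1, a=1, b=0 as in the queries above, the score is simply 1 / distance, so in this sketch a document at distance 0 would divide by zero; a non-zero b avoids that.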