Re: Sub entities

2011-03-01 Thread Stefan Matheis
Brian,

Except for the SQL syntax error in the specie_relations query, SELECT
specie_id FROMspecie_relations .. (missing whitespace after FROM),
your config looks okay.

following questions:
* is there a field named specie in your schema? (otherwise dih will
silently ignore it)
* did you check your mysql-query log? to see which queries were
executed and what their result is?

And, just as a quick notice: there is no need to use <field
column="foo" name="foo" /> when both attributes have the same value.
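To illustrate (a sketch, with a hypothetical column name):

```xml
<!-- the mapping is only needed when the column and the schema field differ -->
<field column="specie_name" name="specie" />
<!-- when they are identical, the <field/> line can be omitted entirely:
     DIH copies same-named columns to schema fields automatically -->
```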

Regards
Stefan

On Mon, Feb 28, 2011 at 9:52 PM, Brian Lamb
brian.l...@journalexperts.com wrote:
 Hi all,

 I was able to get my dataimport to work correctly but I'm a little unclear
 as to how the entity within an entity works in regards to search results.
 When I do a search for all results, it seems only the outermost responses
 are returned. For example, I have the following in my db config file:

 <dataConfig>
   <dataSource type="JdbcDataSource" name="mystuff" batchSize="-1"
     driver="com.mysql.jdbc.Driver"
     url="jdbc:mysql://localhost/db?characterEncoding=UTF8&amp;zeroDateTimeBehavior=convertToNull"
     user="user" password="password"/>
   <document>
     <entity name="animal" dataSource="mystuff" query="SELECT * FROM animals">
       <field column="id" name="id" />
       <field column="type" name="type" />
       <field column="genus" name="genus" />

       <!-- Add in the species -->
       <entity name="specie_relations" dataSource="mystuff" query="SELECT
 specie_id FROMspecie_relations WHERE animal_id=${animal.id}">
         <entity name="species" dataSource="mystuff" query="SELECT specie
 FROM species WHERE id=${specie_relations.specie_id}">
           <field column="specie" name="specie" />
         </entity>
       </entity>
     </entity>
   </document>
 </dataConfig>

 However, specie never shows up in my search results:

 <doc>
   <str name="type">Mammal</str>
   <str name="id">1</str>
   <str name="genus">Canis</str>
 </doc>

 I had hoped the results would include the species. Can it? If so, what is my
 malfunction?



Re: Disabling caching for fq param?

2011-03-01 Thread Markus Jelsma
If filterCache hitratio is low then just disable it in solrconfig by deleting 
the section or setting its values to 0.
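As a sketch, the relevant section of solrconfig.xml (attribute values modeled on the stock example config) can either be deleted or zeroed out:

```xml
<query>
  <!-- delete this element, or disable it by zeroing the sizes -->
  <filterCache class="solr.FastLRUCache"
               size="0"
               initialSize="0"
               autowarmCount="0"/>
</query>
```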

 Based on what I've read here and what I could find on the web, it seems
 that each fq clause essentially gets its own results cache.  Is that
 correct?
 
 We have a corporate policy of passing the user's Oracle OLS labels into the
 index in order to be matched against the labels field.  I currently
 separate this from the user's query text by sticking it into an fq
 param...
 
 ?q=user-entered expression
 &fq=labels:the label values expression
 &qf=song metadata copy field song lyrics field
 &tie=0.1
 &defType=dismax
 
 ...but since its value (a collection of hundreds of label values) only
 applies to that user, the accompanying result set won't be reusable by other
 users:
 
 My understanding is that this query will result in two result sets (q and
 fq) being cached separately, with the union of the two sets being returned
 to the user.  (Is that correct?)
 
 There are thousands of users, each with a unique combination of labels, so
 there seems to be little value in caching the result set created from the
 fq labels param.  Is there some kind of fq parameter override that tells
 Solr not to cache those results?
 
 
 Thanks!


Re: Problem with sorting using functions.

2011-03-01 Thread Jan Høydahl
Also, if you're on 3.1, the function needs to be without spaces since sort will 
split on space to find the sort order.
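For example (a sketch; both URLs assume the stock /select handler):

```
/select/?q=*:*&sort=sum(1,1) desc      works on 3.1: no space inside the function
/select/?q=*:*&sort=sum(1, 1) desc     fails: the space after the comma splits the sort spec
```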

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 28. feb. 2011, at 22.34, John Sherwood wrote:

 Fair call.  Thanks.
 
 On Tue, Mar 1, 2011 at 8:21 AM, Geert-Jan Brits gbr...@gmail.com wrote:
 sort by functionquery is only available from solr 3.1 (from :
 http://wiki.apache.org/solr/FunctionQuery#Sort_By_Function)
 
 
 2011/2/28 John Sherwood j...@storecrowd.com
 
 This works:
 /select/?q=*:*&sort=price desc
 
 This throws a 400 error:
 /select/?q=*:*&sort=sum(1, 1) desc
 
 Missing sort order.
 
 I'm using 1.4.2.  I've tried all sorts of different numbers, functions, and
 fields but nothing seems to change that error.  Any ideas?
 
 



Re: multi-core solr, specifying the data directory

2011-03-01 Thread Jan Høydahl
Have you tried removing the <dataDir> tag from solrconfig.xml? Then it should 
fall back to default ./data relative to core instancedir.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 1. mars 2011, at 00.00, Jonathan Rochkind wrote:

 Unless I'm doing something wrong, in my experience in multi-core Solr in 
 1.4.1, you NEED to explicitly provide an absolute path to the 'data' dir.
 
 I set up multi-core like this:
 
 <cores adminPath="/admin/cores">
   <core name="some_core" instanceDir="some_core">
   </core>
 </cores>
 
 
 Now, setting instanceDir like that works for Solr to look for the 'conf' 
 directory in the default location you'd expect, ./some_core/conf.
 
 You'd expect it to look for the 'data' dir for an index in ./some_core/data 
 too, by default.  But it does not seem to. It's still looking for the 'data' 
 directory in the _main_ solr.home/data, not under the relevant core directory.
 
 The only way I can manage to get it to look for the /data directory where I 
 expect is to spell it out with a full absolute path:
 
 <core name="some_core" instanceDir="some_core">
   <property name="dataDir" value="/path/to/main/solr/some_core/data" />
 </core>
 
 And then in the solrconfig.xml add a <dataDir>${dataDir}</dataDir>
 
 Is this what everyone else does too? Or am I missing a better way of doing 
 this?  I would have thought it would just work, with Solr by default 
 looking for a ./data subdir of the specified instanceDir.  But it definitely 
 doesn't seem to do that.
 
 Should it? Anyone know if Solr in trunk past 1.4.1 has been changed to do 
 what I expect? Or am I wrong to expect it? Or does everyone else do 
 multi-core in some different way than me where this doesn't come up?
 
 Jonathan
 



Re: Problem with Solr and Nutch integration

2011-03-01 Thread Paul Rogers
Hi Anurag

The request handler has been added to the solrconfig file.

I'll try your attached requesthandler and see if that helps.

Interestingly enough, the whole setup worked when I was using nutch 1.2/solr 1.4.1.
 It is only since moving to nutch trunk/solr branch_3x that the problem has
occurred.  I assume that something has changed in between and the tutorial's
request handler is incorrect for the later solr version.  Which versions of
solr/nutch are you using?

Assuming the catalina.out file is the correct log file the output I get is
shown below.  This output occurs on restarting the solr-example after adding
the new requesthandler.  When I access the solr admin page no additional
logging occurs.  Can anyone see the problem?

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome

INFO: Using JNDI solr.home: /opt/solr/example/solr

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader init

INFO: Solr home set to '/opt/solr/example/solr/'

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
addToClassLoader

SEVERE: Can't find (or read) file to add to classloader:
/opt/solr/example/solr/./lib

Feb 28, 2011 6:28:59 PM org.apache.solr.servlet.SolrDispatchFilter init

INFO: SolrDispatchFilter.init()

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome

INFO: Using JNDI solr.home: /opt/solr/example/solr

Feb 28, 2011 6:28:59 PM org.apache.solr.core.CoreContainer$Initializer
initialize
INFO: looking for solr.xml: /opt/solr/example/solr/solr.xml

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome

INFO: Using JNDI solr.home: /opt/solr/example/solr

Feb 28, 2011 6:28:59 PM org.apache.solr.core.CoreContainer init

INFO: New CoreContainer: solrHome=/opt/solr/example/solr/ instance=6794958

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader init

INFO: Solr home set to '/opt/solr/example/solr/'

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
addToClassLoader

SEVERE: Can't find (or read) file to add to classloader:
/opt/solr/example/solr/./lib

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader init

INFO: Solr home set to '/opt/solr/example/solr/./'

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
addToClassLoader

SEVERE: Can't find (or read) file to add to classloader:
/opt/solr/example/solr/././lib

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrConfig initLibs

INFO: Adding specified lib dirs to ClassLoader

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding
'file:/opt/solr/contrib/extraction/lib/commons-compress-1.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/log4j-1.2.14.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding
'file:/opt/solr/contrib/extraction/lib/commons-logging-1.1.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/tika-parsers-0.8.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/asm-3.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/icu4j-4_6.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/xercesImpl-2.8.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/bcmail-jdk15-1.45.jar'
to classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/fontbox-1.3.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/poi-3.7.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/dom4j-1.6.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding
'file:/opt/solr/contrib/extraction/lib/geronimo-stax-api_1.0_spec-1.0.1.jar'
to classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/poi-ooxml-3.7.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/xml-apis-1.0.b2.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader


Re: Problem with Solr and Nutch integration

2011-03-01 Thread Anurag
I have Nutch 1.0 and Apache Solr 1.3.0 (integrated these two).

On 3/1/11, Paul Rogers [via Lucene]
ml-node+2601915-1461428819-146...@n3.nabble.com wrote:


 [...]

Error during auto-warming of key

2011-03-01 Thread Markus Jelsma
Hi,

Yesterday's error log contains something peculiar: 

 ERROR [solr.search.SolrCache] - [pool-29-thread-1] - : Error during auto-
warming of key:+*:* 
(1.0/(7.71E-8*float(ms(const(1298682616680),date(sort_date)))+1.0))^20.0:java.lang.NullPointerException
at org.apache.lucene.util.StringHelper.intern(StringHelper.java:36)
at 
org.apache.lucene.search.FieldCacheImpl$Entry.<init>(FieldCacheImpl.java:275)
at 
org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:525)
at 
org.apache.solr.search.function.LongFieldSource.getValues(LongFieldSource.java:57)
at 
org.apache.solr.search.function.DualFloatFunction.getValues(DualFloatFunction.java:48)
at 
org.apache.solr.search.function.ReciprocalFloatFunction.getValues(ReciprocalFloatFunction.java:61)
at 
org.apache.solr.search.function.FunctionQuery$AllScorer.<init>(FunctionQuery.java:123)
at 
org.apache.solr.search.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:93)
at 
org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
at 
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:246)
at org.apache.lucene.search.Searcher.search(Searcher.java:171)
at 
org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:651)
at 
org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:545)
at 
org.apache.solr.search.SolrIndexSearcher.cacheDocSet(SolrIndexSearcher.java:520)
at 
org.apache.solr.search.SolrIndexSearcher$2.regenerateItem(SolrIndexSearcher.java:296)
at org.apache.solr.search.FastLRUCache.warm(FastLRUCache.java:168)
at 
org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1481)
at org.apache.solr.core.SolrCore$2.call(SolrCore.java:1131)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)


Well, I use Dismax's bf parameter to boost very recent documents. I'm not using 
the queryResultCache or documentCache, only the filterCache and the Lucene fieldCache. 
I've checked LUCENE-1890 but am unsure if that's the issue. Any thoughts on 
this one?

https://issues.apache.org/jira/browse/LUCENE-1890

Cheers,

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


Problem with sorting using functions

2011-03-01 Thread John Sherwood
This works:
/select/?q=*:*&sort=price desc

This throws a 400 error:
/select/?q=*:*&sort=sum(1, 1) desc

Missing sort order.

I'm using 1.4.2.  I've tried all sorts of different numbers/functions/fields
and nothing seems to change that error.  Any ideas?


Retrieving payload from each highlighted term

2011-03-01 Thread Fabiano Nunes
How can I get the payload from each highlighted term?


RE: Query on multivalue field

2011-03-01 Thread Steven A Rowe
Hi Scott,

Querying against a multi-valued field just works - no special incantation 
required.

Steve

 -Original Message-
 From: Scott Yeadon [mailto:scott.yea...@anu.edu.au]
 Sent: Monday, February 28, 2011 11:50 PM
 To: solr-user@lucene.apache.org
 Subject: Query on multivalue field
 
 Hi,
 
 I have a variable number of text-based fields associated with each
 primary record which I wanted to apply a search across. I wanted to
 avoid the use of dynamic fields if possible or having to create a
 different document type in the index (as the app is based around the
 primary record and different views mean a lot of work to revamp
 pagination etc).
 
 So, is there a way to apply a query to each value of a multivalued field
 or is it always treated as a single field from a query perspective?
 
 Thanks.
 
 Scott.
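For what it's worth, a hedged sketch (field name and type made up for illustration): declare the field multiValued in schema.xml and query it like any single-valued field:

```xml
<!-- schema.xml: one field, any number of values per document -->
<field name="review_text" type="text" indexed="true" stored="true"
       multiValued="true"/>
```

A query such as q=review_text:foo then matches a document if any one of its values contains the term; the positionIncrementGap on the field type keeps phrase queries from matching across value boundaries.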


Help with explain query syntax

2011-03-01 Thread Glòria Martínez
Hello,

I can't understand why this query is not matching anything. Could someone
help me please?

*Query*
http://localhost:8894/solr/select?q=linguajob.pl&qf=company_name&wt=xml&qt=dismax&debugQuery=on&explainOther=id%3A1

<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">12</int>
  <lst name="params">
    <str name="explainOther">id:1</str>
    <str name="debugQuery">on</str>
    <str name="q">linguajob.pl</str>
    <str name="qf">company_name</str>
    <str name="wt">xml</str>
    <str name="qt">dismax</str>
  </lst>
</lst>
<result name="response" numFound="0" start="0"/>
<lst name="debug">
  <str name="rawquerystring">linguajob.pl</str>
  <str name="querystring">linguajob.pl</str>
  <str name="parsedquery">
    +DisjunctionMaxQuery((company_name:(linguajob.pl linguajob) pl)~0.01) ()
  </str>
  <str name="parsedquery_toString">
    +(company_name:(linguajob.pl linguajob) pl)~0.01 ()
  </str>
  <lst name="explain"/>
  <str name="otherQuery">id:1</str>
  <lst name="explainOther">
    <str name="1">
0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited
clause(s)
  0.0 = no match on required clause (company_name:(linguajob.pl linguajob)
pl) *- What does this syntax (field:(token1 token2) token3) mean?*
    0.0 = (NON-MATCH) fieldWeight(company_name:(linguajob.pl linguajob) pl
in 0), product of:
      0.0 = tf(phraseFreq=0.0)
      1.6137056 = idf(company_name:(linguajob.pl linguajob) pl)
      0.4375 = fieldNorm(field=company_name, doc=0)
    </str>
  </lst>
  <str name="QParser">DisMaxQParser</str>
  <null name="altquerystring"/>
  <null name="boostfuncs"/>
  <lst name="timing">
  ...
</response>



There's only one document indexed:

*Document*
http://localhost:8894/solr/select?q=1&qf=id&wt=xml&qt=dismax
<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">2</int>
  <lst name="params">
    <str name="qf">id</str>
    <str name="wt">xml</str>
    <str name="qt">dismax</str>
    <str name="q">1</str>
  </lst>
</lst>
<result name="response" numFound="1" start="0">
  <doc>
    <str name="company_name">LinguaJob.pl</str>
    <str name="id">1</str>
    <int name="status">6</int>
    <date name="timestamp">2011-03-01T11:14:24.553Z</date>
  </doc>
</result>
</response>

*Solr Admin Schema*
Field: company_name
Field Type: text
Properties: Indexed, Tokenized, Stored
Schema: Indexed, Tokenized, Stored
Index: Indexed, Tokenized, Stored

Position Increment Gap: 100

Index Analyzer: org.apache.solr.analysis.TokenizerChain Details
Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory
Filters:
schema.UnicodeNormalizationFilterFactory args:{composed: false
remove_modifiers: true fold: true version: java6 remove_diacritics: true }
org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt
ignoreCase: true enablePositionIncrements: true }
org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal:
1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 1
generateWordParts: 1 catenateAll: 0 catenateNumbers: 1 }
org.apache.solr.analysis.LowerCaseFilterFactory args:{}
org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}

Query Analyzer: org.apache.solr.analysis.TokenizerChain Details
Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory
Filters:
schema.UnicodeNormalizationFilterFactory args:{composed: false
remove_modifiers: true fold: true version: java6 remove_diacritics: true }
org.apache.solr.analysis.SynonymFilterFactory args:{synonyms: synonyms.txt
expand: true ignoreCase: true }
org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt
ignoreCase: true }
org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal:
1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 0
generateWordParts: 1 catenateAll: 0 catenateNumbers: 0 }
org.apache.solr.analysis.LowerCaseFilterFactory args:{}
org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}

Docs: 1
Distinct: 5
Top 5 terms
term frequency
lingua 1
linguajob.pl 1
linguajobpl 1
pl 1
job 1

*Solr Analysis*
Field name: company_name
Field value (Index): LinguaJob.pl
Field value (Query): linguajob.pl

*Index Analyzer*

org.apache.solr.analysis.WhitespaceTokenizerFactory {}
term position 1
term text LinguaJob.pl
term type word
source start,end 0,12
payload

schema.UnicodeNormalizationFilterFactory {composed=false,
remove_modifiers=true, fold=true, version=java6, remove_diacritics=true}
term position 1
term text LinguaJob.pl
term type word
source start,end 0,12
payload

org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
ignoreCase=true, enablePositionIncrements=true}
term position 1
term text LinguaJob.pl
term type word
source start,end 0,12
payload

org.apache.solr.analysis.WordDelimiterFilterFactory {preserveOriginal=1,
splitOnCaseChange=1, generateNumberParts=1, catenateWords=1,
generateWordParts=1, catenateAll=0, catenateNumbers=1}
term position      1              2      3
term text          LinguaJob.pl   Job    pl
                   Lingua                LinguaJobpl
term type          word           word   word
                   word                  word
source start,end   0,12           6,9    10,12
                   0,6                   0,12
payload

org.apache.solr.analysis.LowerCaseFilterFactory {}
term position      1              2      3
term text          linguajob.pl   job    pl
                   lingua                linguajobpl
term type          word           word   word
                   word                  word
source start,end   0,12           6,9    10,12
                   0,6                   0,12
payload


Re: multi-core solr, specifying the data directory

2011-03-01 Thread Jonathan Rochkind
I did try that, yes. I tried that first in fact!  It seems to fall back 
to a ./data directory relative to the _main_ solr directory (the one 
above all the cores), not the core instancedir.  Which is not what I 
expected either.


I wonder if this should be considered a bug? I wonder if anyone has 
considered this and thought of changing/fixing it?


On 3/1/2011 4:23 AM, Jan Høydahl wrote:

Have you tried removing the <dataDir> tag from solrconfig.xml? Then it should 
fall back to default ./data relative to core instancedir.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 1. mars 2011, at 00.00, Jonathan Rochkind wrote:


[...]





MLT with boost

2011-03-01 Thread Mark
Is it possible to add function queries/boosts to the results that are returned 
by MLT? If not out of the box, how would one go about achieving this 
functionality?


Thanks


Re: please make JSONWriter public

2011-03-01 Thread Ryan McKinley
You may have noticed the ResponseWriter code is pretty hairy!  Things
are package protected so that the API can change between minor releases
without concern for back compatibility.

In 4.0 (/trunk) I hope to rework the whole ResponseWriter framework so
that it is cleaner and hopefully stable enough that making parts of it
public is helpful.

For now, you can:
- copy the code
- put your class in the same package name
- make it public in your own distribution

ryan



On Mon, Feb 28, 2011 at 2:56 PM, Paul Libbrecht p...@hoplahup.net wrote:

 Hello fellow SOLR experts,

 may I ask to make top-level and public the class
    org.apache.solr.request.JSONWriter
 inside
    org.apache.solr.request.JSONResponseWriter
 I am re-using it to output JSON search results to code that I wish not to 
 change on the client, but the current visibility settings (JSONWriter is 
 package protected) make it impossible for me without actually copying the 
 code (which is possible thanks to its good open-source nature).

 thanks in advance

 paul


Re: Sub entities

2011-03-01 Thread Brian Lamb
Yes, it looks like I had left off the field (misspelled it actually). I
reran the full import and the fields did properly show up. However, it is
still not working as expected. Using the example below, a result returned
would only list one specie instead of a list of species. I have the
following in my schema.xml file:

<field column="specie" multiValued="true" name="specie" type="string"
indexed="true" stored="true" required="false" />

I reran the fullimport but it is still only listing one specie instead of
multiple. Is my above declaration incorrect?

On Tue, Mar 1, 2011 at 3:41 AM, Stefan Matheis 
matheis.ste...@googlemail.com wrote:

 Brian,

 except for your sql-syntax error in the specie_relations-query SELECT
 specie_id FROMspecie_relations .. (missing whitespace after FROM)
 your config looks okay.

 following questions:
 * is there a field named specie in your schema? (otherwise dih will
 silently ignore it)
 * did you check your mysql-query log? to see which queries were
 executed and what their result is?

 And, just as a quick notice: there is no need to use <field
 column="foo" name="foo" /> when both attributes have the same value.

 Regards
 Stefan

 On Mon, Feb 28, 2011 at 9:52 PM, Brian Lamb
 brian.l...@journalexperts.com wrote:
 [...]



Re: Indexed, but cannot search

2011-03-01 Thread Brian Lamb
Thank you for your reply but the searching is still not working out. For
example, when I go to:

http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on

I get the following as a response:

<result name="response" numFound="249943" start="0">
  <doc>
    <str name="type">Mammal</str>
    <str name="id">1</str>
    <str name="genus">Canis</str>
  </doc>
</result>

(plus some other docs but one is enough for this example)

But if I go to
http://localhost:8983/solr/select/?q=type%3AMammal&version=2.2&start=0&rows=10&indent=on

I only get:

<result name="response" numFound="0" start="0"/>

But it seems that should return at least the result I have listed above.
What am I doing incorrectly?

On Mon, Feb 28, 2011 at 6:57 PM, Upayavira u...@odoko.co.uk wrote:

 q=dog is equivalent to q=text:dog (where the default search field is
 defined as text at the bottom of schema.xml).

 If you want to specify a different field, well, you need to tell it :-)

 Is that it?

 Upayavira

 On Mon, 28 Feb 2011 15:38 -0500, Brian Lamb
 brian.l...@journalexperts.com wrote:
  Hi all,
 
  I was able to get my installation of Solr indexed using dataimport.
  However,
  I cannot seem to get search working. I can verify that the data is there
  by
  going to:
 
 
 http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
 
  This gives me the response: result name=response numFound=234961
  start=0
 
  But when I go to
 
 
 http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&indent=on
 
  I get the response: result name=response numFound=0 start=0
 
  I know that dog should return some results because it is the first result
  when I select all the records. So what am I doing incorrectly that would
  prevent me from seeing results?
 
 ---
 Enterprise Search Consultant at Sourcesense UK,
 Making Sense of Open Source
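A concrete way to see the two query forms: the ':' in q=type:Mammal must be URL-encoded as %3A when the URL is built by hand, and a URL library does this automatically. A small Python sketch using only the standard library (the host and parameters mirror the URLs in the thread above):

```python
from urllib.parse import urlencode

def solr_select_url(q: str, host: str = "http://localhost:8983/solr") -> str:
    """Build a /select URL; urlencode escapes ':' and other reserved chars."""
    params = {"q": q, "version": "2.2", "start": 0, "rows": 10, "indent": "on"}
    return f"{host}/select/?{urlencode(params)}"

print(solr_select_url("*:*"))          # the match-all query, ':' becomes %3A
print(solr_select_url("type:Mammal"))  # explicit field, not the default field
```

With no field prefix, q=dog searches only the default field (text in the example schema), which is why the field must be named explicitly for anything else.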




Re: Sub entities

2011-03-01 Thread Stefan Matheis
Brian,

On Tue, Mar 1, 2011 at 4:52 PM, Brian Lamb
brian.l...@journalexperts.com wrote:
 <field column="specie" multiValued="true" name="specie" type="string"
  indexed="true" stored="true" required="false" />

Not sure, but iirc field in this context has no column-Attribute ..
that should normally not break your solr-configuration.

Are you sure, that your animal has multiple species assigned? Checked
the Query from the MySQL-Query-Log and verified that it returns more
than one record?

Otherwise you could enable
http://wiki.apache.org/solr/DataImportHandler#LogTransformer for your
dataimport, which outputs a log row for every record .. just to
ensure that your query results are imported correctly.

HTH, Regards
Stefan


Re: Indexed, but cannot search

2011-03-01 Thread Edoardo Tosca
Hi,
i'm not sure if it is a typo, anyway the second query you mentioned should
be:
http://localhost:8983/solr/select/?q=type:*

HTH,

Edo

On Tue, Mar 1, 2011 at 4:06 PM, Brian Lamb brian.l...@journalexperts.com wrote:

 Thank you for your reply but the searching is still not working out. For
 example, when I go to:

 http://localhost:8983/solr/select/?q=*%3A*
 http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10indent=on
 

 I get the following as a response:

 result name=response numFound=249943 start=0
  doc
str name=typeMammal/str
str name=id1/str
str name=genusCanis/str
  /doc
 /response

 (plus some other docs but one is enough for this example)

 But if I go to http://localhost:8983/solr/select/?q=type%3A
 http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10indent=on
 
 Mammal

 I only get:

 result name=response numFound=0 start=0

 But it seems that should return at least the result I have listed above.
 What am I doing incorrectly?

 On Mon, Feb 28, 2011 at 6:57 PM, Upayavira u...@odoko.co.uk wrote:

  q=dog is equivalent to q=text:dog (where the default search field is
  defined as text at the bottom of schema.xml).
 
  If you want to specify a different field, well, you need to tell it :-)
 
  Is that it?
 
  Upayavira
 
  On Mon, 28 Feb 2011 15:38 -0500, Brian Lamb
  brian.l...@journalexperts.com wrote:
   Hi all,
  
   I was able to get my installation of Solr indexed using dataimport.
   However,
   I cannot seem to get search working. I can verify that the data is
 there
   by
   going to:
  
  
 
 http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10indent=on
  
   This gives me the response: result name=response numFound=234961
   start=0
  
   But when I go to
  
  
 
 http://localhost:8983/solr/select/?q=dogversion=2.2start=0rows=10indent=on
  
   I get the response: result name=response numFound=0 start=0
  
   I know that dog should return some results because it is the first
 result
   when I select all the records. So what am I doing incorrectly that
 would
   prevent me from seeing results?
  
  ---
  Enterprise Search Consultant at Sourcesense UK,
  Making Sense of Open Source
 
 




-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: Indexed, but cannot search

2011-03-01 Thread Upayavira
Next question, do you have your type field set to indexed="true" in your
schema?

Upayavira

On Tue, 01 Mar 2011 11:06 -0500, Brian Lamb
brian.l...@journalexperts.com wrote:
 Thank you for your reply but the searching is still not working out. For
 example, when I go to:
 
 http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
 
 I get the following as a response:
 
 result name=response numFound=249943 start=0
   doc
 str name=typeMammal/str
 str name=id1/str
 str name=genusCanis/str
   /doc
 /response
 
 (plus some other docs but one is enough for this example)
 
 But if I go to
 http://localhost:8983/solr/select/?q=type%3AMammal&version=2.2&start=0&rows=10&indent=on
 
 I only get:
 
 result name=response numFound=0 start=0
 
 But it seems that should return at least the result I have listed above.
 What am I doing incorrectly?
 
 On Mon, Feb 28, 2011 at 6:57 PM, Upayavira u...@odoko.co.uk wrote:
 
  q=dog is equivalent to q=text:dog (where the default search field is
  defined as text at the bottom of schema.xml).
 
  If you want to specify a different field, well, you need to tell it :-)
 
  Is that it?
 
  Upayavira
 
  On Mon, 28 Feb 2011 15:38 -0500, Brian Lamb
  brian.l...@journalexperts.com wrote:
   Hi all,
  
   I was able to get my installation of Solr indexed using dataimport.
   However,
   I cannot seem to get search working. I can verify that the data is there
   by
   going to:
  
  
  http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10indent=on
  
   This gives me the response: result name=response numFound=234961
   start=0
  
   But when I go to
  
  
  http://localhost:8983/solr/select/?q=dogversion=2.2start=0rows=10indent=on
  
   I get the response: result name=response numFound=0 start=0
  
   I know that dog should return some results because it is the first result
   when I select all the records. So what am I doing incorrectly that would
   prevent me from seeing results?
  
  ---
  Enterprise Search Consultant at Sourcesense UK,
  Making Sense of Open Source
 
 
 
--- 
Enterprise Search Consultant at Sourcesense UK, 
Making Sense of Open Source



Re: Question on writing custom UpdateHandler

2011-03-01 Thread Chris Hostetter

In your first attempt, the crux of your problem was probably that you were 
never closing the searcher/reader.

: Or how can I perform a query on the current state of the index from within an
: UpdateProcessor?

If you implement UpdateRequestProcessorFactory, the getInstance method is 
given the SolrQueryRequest, which you can use to access the current 
SolrIndexSearcher.

this will only show you the state of the index as of the last commit, so 
it won't be real time as you are streaming new documents, but it will give 
you the same results as a search query happening concurrently with your 
update.


-Hoss


Re: multi-core solr, specifying the data directory

2011-03-01 Thread Chris Hostetter

: Unless I'm doing something wrong, in my experience in multi-core Solr in
: 1.4.1, you NEED to explicitly provide an absolute path to the 'data' dir.

have you looked at the example/multicore directory that was included in 
the 1.4.1 release?

it has a solr.xml that loads two cores w/o specifying a data dir in the 
solr.xml (or the solrconfig.xml) and it uses the data dir inside the 
specified instanceDir.

If that example works for you, but your own configs do not, then we'll 
need more details about your own configs -- how are you running solr, what 
does the solrconfig.xml of the core look like, etc...


-Hoss


Re: please make JSONWriter public

2011-03-01 Thread Paul Libbrecht
Ryan,

honestly, hairyness was rather mild.
I found it fairly readable.

paul


Le 1 mars 2011 à 16:46, Ryan McKinley a écrit :

 You may have noticed the ResponseWriter code is pretty hairy!  Things
 are package protected so that the API can change between minor release
 without concern for back compatibility.
 
 In 4.0 (/trunk) I hope to rework the whole ResponseWriter framework so
 that it is more clean and hopefully stable enough that making parts
 public is helpful.
 
 For now, you can:
 - copy the code
 - put your class in the same package name
 - make it public in your own distribution
 
 ryan
 
 
 
 On Mon, Feb 28, 2011 at 2:56 PM, Paul Libbrecht p...@hoplahup.net wrote:
 
 Hello fellow SOLR experts,
 
 may I ask to make top-level and public the class
org.apache.solr.request.JSONWriter
 inside
org.apache.solr.request.JSONResponseWriter
 I am re-using it to output JSON search result to code that I wish not to 
 change on the client but the current visibility settings (JSONWriter is 
 package protected) makes it impossible for me without actually copying the 
 code (which is possible thanks to the good open-source nature).
 
 thanks in advance
 
 paul



solr different sizes on master and slave

2011-03-01 Thread Mike Franon
I was curious why would the size be dramatically different even though
the index versions are the same?

One is 1.2 GB, and on the slave it is 512 MB

I would think they should both be the same size no?

Thanks


Re: Sub entities

2011-03-01 Thread Brian Lamb
Thanks for the help Stefan. It seems removing column="specie" fixed it.

On Tue, Mar 1, 2011 at 11:18 AM, Stefan Matheis 
matheis.ste...@googlemail.com wrote:

 Brian,

 On Tue, Mar 1, 2011 at 4:52 PM, Brian Lamb
 brian.l...@journalexperts.com wrote:
  <field column="specie" multiValued="true" name="specie" type="string"
   indexed="true" stored="true" required="false" />

 Not sure, but iirc field in this context has no column-Attribute ..
 that should normally not break your solr-configuration.

 Are you sure, that your animal has multiple species assigned? Checked
 the Query from the MySQL-Query-Log and verified that it returns more
 than one record?

 Otherwise you could enable
 http://wiki.apache.org/solr/DataImportHandler#LogTransformer for your
 dataimport, which outputs a log-row for every record .. just to
 ensure, that your Query-Results is correctly imported

 HTH, Regards
 Stefan



Re: Indexed, but cannot search

2011-03-01 Thread Brian Lamb
Hi all,

The problem was that my fields were defined as type="string" instead of
type="text". Once I corrected that, it seems to be fixed. The only part that
still is not working though is the search across all fields.

For example:

http://localhost:8983/solr/select/?q=type%3AMammal

Now correctly returns the records matching mammal. But if I try to do a
global search across all fields:

http://localhost:8983/solr/select/?q=Mammal
http://localhost:8983/solr/select/?q=text%3AMammal

I get no results returned. Here is how the schema is set up:

<field name="text" type="text" indexed="true" stored="false"
multiValued="true"/>
<defaultSearchField>text</defaultSearchField>
<copyField source="*" dest="text" />

Thanks to everyone for your help so far. I think this is the last hurdle I
have to jump over.
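The string-vs-text distinction that caused the earlier failure can be simulated: a string field indexes the whole value as a single unanalyzed term, while a typical text field tokenizes and lowercases, so term queries match individual words. A rough Python sketch of the two behaviors (illustrative only, not Solr's actual analyzers):

```python
import re

def analyze_string(value: str) -> list[str]:
    """'string' type: one exact, unanalyzed term."""
    return [value]

def analyze_text(value: str) -> list[str]:
    """Simplified 'text' type: split on non-word chars, then lowercase."""
    return [t.lower() for t in re.findall(r"\w+", value)]

doc_value = "Canis lupus familiaris"
# A term query matches iff the query term appears among the indexed terms:
print("canis" in analyze_string(doc_value))  # False: one exact term "Canis lupus familiaris"
print("canis" in analyze_text(doc_value))    # True: lowercased word-level terms
```

This is also why reindexing after an analyzer change matters: the terms stored in the index are whatever the analyzer produced at index time.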

On Tue, Mar 1, 2011 at 12:34 PM, Upayavira u...@odoko.co.uk wrote:

 Next question, do you have your type field set to index=true in your
 schema?

 Upayavira

 On Tue, 01 Mar 2011 11:06 -0500, Brian Lamb
 brian.l...@journalexperts.com wrote:
  Thank you for your reply but the searching is still not working out. For
  example, when I go to:
 
  http://localhost:8983/solr/select/?q=*%3A*
 http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10indent=on
 
 
  I get the following as a response:
 
  result name=response numFound=249943 start=0
doc
  str name=typeMammal/str
  str name=id1/str
  str name=genusCanis/str
/doc
  /response
 
  (plus some other docs but one is enough for this example)
 
  But if I go to
  http://localhost:8983/solr/select/?q=type%3A
 http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10indent=on
 
  Mammal
 
  I only get:
 
  result name=response numFound=0 start=0
 
  But it seems that should return at least the result I have listed above.
  What am I doing incorrectly?
 
  On Mon, Feb 28, 2011 at 6:57 PM, Upayavira u...@odoko.co.uk wrote:
 
   q=dog is equivalent to q=text:dog (where the default search field is
   defined as text at the bottom of schema.xml).
  
   If you want to specify a different field, well, you need to tell it :-)
  
   Is that it?
  
   Upayavira
  
   On Mon, 28 Feb 2011 15:38 -0500, Brian Lamb
   brian.l...@journalexperts.com wrote:
Hi all,
   
I was able to get my installation of Solr indexed using dataimport.
However,
I cannot seem to get search working. I can verify that the data is
 there
by
going to:
   
   
  
 http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10indent=on
   
This gives me the response: result name=response numFound=234961
start=0
   
But when I go to
   
   
  
 http://localhost:8983/solr/select/?q=dogversion=2.2start=0rows=10indent=on
   
I get the response: result name=response numFound=0 start=0
   
I know that dog should return some results because it is the first
 result
when I select all the records. So what am I doing incorrectly that
 would
prevent me from seeing results?
   
   ---
   Enterprise Search Consultant at Sourcesense UK,
   Making Sense of Open Source
  
  
 
 ---
 Enterprise Search Consultant at Sourcesense UK,
 Making Sense of Open Source




Re: Indexed, but cannot search

2011-03-01 Thread Markus Jelsma
Traditionally, people forget to reindex ;)

 Hi all,
 
 The problem was that my fields were defined as type=string instead of
 type=text. Once I corrected that, it seems to be fixed. The only part
 that still is not working though is the search across all fields.
 
 For example:
 
 http://localhost:8983/solr/select/?q=type%3AMammal
 
 Now correctly returns the records matching mammal. But if I try to do a
 global search across all fields:
 
 http://localhost:8983/solr/select/?q=Mammal
 http://localhost:8983/solr/select/?q=text%3AMammal
 
 I get no results returned. Here is how the schema is set up:
 
 field name=text type=text indexed=true stored=false
 multiValued=true/
 defaultSearchFieldtext/defaultSearchField
 copyField source=* dest=text /
 
 Thanks to everyone for your help so far. I think this is the last hurdle I
 have to jump over.
 
 On Tue, Mar 1, 2011 at 12:34 PM, Upayavira u...@odoko.co.uk wrote:
  Next question, do you have your type field set to index=true in your
  schema?
  
  Upayavira
  
  On Tue, 01 Mar 2011 11:06 -0500, Brian Lamb
  
  brian.l...@journalexperts.com wrote:
   Thank you for your reply but the searching is still not working out.
   For example, when I go to:
   
   http://localhost:8983/solr/select/?q=*%3A*
  
  http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10in
  dent=on
  
   I get the following as a response:
   
   result name=response numFound=249943 start=0
   
 doc
 
   str name=typeMammal/str
   str name=id1/str
   str name=genusCanis/str
 
 /doc
   
   /response
   
   (plus some other docs but one is enough for this example)
   
   But if I go to
   http://localhost:8983/solr/select/?q=type%3A
  
  http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10in
  dent=on
  
   Mammal
   
   I only get:
   
   result name=response numFound=0 start=0
   
   But it seems that should return at least the result I have listed
   above. What am I doing incorrectly?
   
   On Mon, Feb 28, 2011 at 6:57 PM, Upayavira u...@odoko.co.uk wrote:
q=dog is equivalent to q=text:dog (where the default search field is
defined as text at the bottom of schema.xml).

If you want to specify a different field, well, you need to tell it
:-)

Is that it?

Upayavira

On Mon, 28 Feb 2011 15:38 -0500, Brian Lamb

brian.l...@journalexperts.com wrote:
 Hi all,
 
 I was able to get my installation of Solr indexed using dataimport.
 However,
 I cannot seem to get search working. I can verify that the data is
  
  there
  
 by
  
 going to:
  http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10in
  dent=on
  
 This gives me the response: result name=response
 numFound=234961 start=0
 
 But when I go to
  
  http://localhost:8983/solr/select/?q=dogversion=2.2start=0rows=10inde
  nt=on
  
 I get the response: result name=response numFound=0 start=0
 
 I know that dog should return some results because it is the first
  
  result
  
 when I select all the records. So what am I doing incorrectly that
  
  would
  
 prevent me from seeing results?

---
Enterprise Search Consultant at Sourcesense UK,
Making Sense of Open Source
  
  ---
  Enterprise Search Consultant at Sourcesense UK,
  Making Sense of Open Source


Re: solr different sizes on master and slave

2011-03-01 Thread Markus Jelsma
Are there pending commits on the master?

 I was curious why would the size be dramatically different even though
 the index versions are the same?
 
 One is 1.2 Gb, and on the slave it is 512 MB
 
 I would think they should both be the same size no?
 
 Thanks


Re: Indexed, but cannot search

2011-03-01 Thread Brian Lamb
Oh if only it were that easy :-). I have reindexed since making that change
which is how I was able to get the regular search working. I have not
however been able to get the search across all fields to work.

On Tue, Mar 1, 2011 at 3:01 PM, Markus Jelsma markus.jel...@openindex.io wrote:

 Traditionally, people forget to reindex ;)

  Hi all,
 
  The problem was that my fields were defined as type=string instead of
  type=text. Once I corrected that, it seems to be fixed. The only part
  that still is not working though is the search across all fields.
 
  For example:
 
  http://localhost:8983/solr/select/?q=type%3AMammal
 
  Now correctly returns the records matching mammal. But if I try to do a
  global search across all fields:
 
  http://localhost:8983/solr/select/?q=Mammal
  http://localhost:8983/solr/select/?q=text%3AMammal
 
  I get no results returned. Here is how the schema is set up:
 
  field name=text type=text indexed=true stored=false
  multiValued=true/
  defaultSearchFieldtext/defaultSearchField
  copyField source=* dest=text /
 
  Thanks to everyone for your help so far. I think this is the last hurdle
 I
  have to jump over.
 
  On Tue, Mar 1, 2011 at 12:34 PM, Upayavira u...@odoko.co.uk wrote:
   Next question, do you have your type field set to index=true in
 your
   schema?
  
   Upayavira
  
   On Tue, 01 Mar 2011 11:06 -0500, Brian Lamb
  
   brian.l...@journalexperts.com wrote:
Thank you for your reply but the searching is still not working out.
For example, when I go to:
   
http://localhost:8983/solr/select/?q=*%3A*
  
  
 http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10in
   dent=on
  
I get the following as a response:
   
result name=response numFound=249943 start=0
   
  doc
   
str name=typeMammal/str
str name=id1/str
str name=genusCanis/str
   
  /doc
   
/response
   
(plus some other docs but one is enough for this example)
   
But if I go to
http://localhost:8983/solr/select/?q=type%3A
  
  
 http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10in
   dent=on
  
Mammal
   
I only get:
   
result name=response numFound=0 start=0
   
But it seems that should return at least the result I have listed
above. What am I doing incorrectly?
   
On Mon, Feb 28, 2011 at 6:57 PM, Upayavira u...@odoko.co.uk wrote:
 q=dog is equivalent to q=text:dog (where the default search field
 is
 defined as text at the bottom of schema.xml).

 If you want to specify a different field, well, you need to tell it
 :-)

 Is that it?

 Upayavira

 On Mon, 28 Feb 2011 15:38 -0500, Brian Lamb

 brian.l...@journalexperts.com wrote:
  Hi all,
 
  I was able to get my installation of Solr indexed using
 dataimport.
  However,
  I cannot seem to get search working. I can verify that the data
 is
  
   there
  
  by
  
  going to:
  
 http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10in
   dent=on
  
  This gives me the response: result name=response
  numFound=234961 start=0
 
  But when I go to
  
  
 http://localhost:8983/solr/select/?q=dogversion=2.2start=0rows=10inde
   nt=on
  
  I get the response: result name=response numFound=0
 start=0
 
  I know that dog should return some results because it is the
 first
  
   result
  
  when I select all the records. So what am I doing incorrectly
 that
  
   would
  
  prevent me from seeing results?

 ---
 Enterprise Search Consultant at Sourcesense UK,
 Making Sense of Open Source
  
   ---
   Enterprise Search Consultant at Sourcesense UK,
   Making Sense of Open Source



Re: solr different sizes on master and slave

2011-03-01 Thread Mike Franon
No pending commits, what it looks like is there are almost two copies
of the index on the master, not sure how that happened.



On Tue, Mar 1, 2011 at 3:08 PM, Markus Jelsma
markus.jel...@openindex.io wrote:
 Are there pending commits on the master?

 I was curious why would the size be dramatically different even though
 the index versions are the same?

 One is 1.2 Gb, and on the slave it is 512 MB

 I would think they should both be the same size no?

 Thanks



numeric or string type for non-sortable field?

2011-03-01 Thread cyang2010
I wonder if I should use a Solr int or string type for a field with the
following requirements:

multi-valued
facet needed
sort not needed


The field value is an id. Therefore, I can store it as either a numeric field
or just a string. Shall I choose string for efficiency?

Thanks.

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/numberic-or-string-type-for-non-sortable-field-tp2606353p2606353.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: numeric or string type for non-sortable field?

2011-03-01 Thread Ahmet Arslan
 I wonder if i shall use solr int or string for such field with following
 requirement

 multi-value
 facet needed
 sort not needed

 The field value is an id. Therefore, i can store as either a numeric field
 or just a string. Shall i choose string for efficiency?

Trie based integer (tint) is preferred for faster faceting.
 





Re: Query on multivalue field

2011-03-01 Thread Scott Yeadon

Thanks, but just to confirm the way multiValued fields work:

In a multiValued field, call it field1, if I have two values indexed to 
this field, say value 1 = "some text...termA...more text" and value 2 = 
"some text...termB...more text", and do a search such as field1:(termA termB)
(where <solrQueryParser defaultOperator="AND"/>), I'm getting a hit 
returned even though both terms don't occur within a single value in the 
multiValued field.


What I'm wondering is if there is a way of applying the query against 
each value of the field rather than against the field in its entirety. 
The reason being is the number of values I want to store is variable and 
I'd like to avoid the use of dynamic fields or restructuring the index 
if possible.


Scott.

On 2/03/11 12:35 AM, Steven A Rowe wrote:

Hi Scott,

Querying against a multi-valued field just works - no special incantation 
required.

Steve


-Original Message-
From: Scott Yeadon [mailto:scott.yea...@anu.edu.au]
Sent: Monday, February 28, 2011 11:50 PM
To:solr-user@lucene.apache.org
Subject: Query on multivalue field

Hi,

I have a variable number of text-based fields associated with each
primary record which I wanted to apply a search across. I wanted to
avoid the use of dynamic fields if possible or having to create a
different document type in the index (as the app is based around the
primary record and different views mean a lot of work to revamp
pagination etc).

So, is there a way to apply a query to each value of a multivalued field
or is it always treated as a single field from a query perspective?

Thanks.

Scott.




Re: solr different sizes on master and slave

2011-03-01 Thread Mike Franon
ok doing some more research I noticed, on the slave it has multiple
folders where it keeps them for example

index
index.20110204010900
index.20110204013355
index.20110218125400

and then there is an index.properties that shows which index it is using.

I am just curious why does it keep multiple copies?  Is there a
setting somewhere I can change to only keep one copy so not to lose
space?

Thanks

On Tue, Mar 1, 2011 at 3:26 PM, Mike Franon kongfra...@gmail.com wrote:
 No pending commits, what it looks like is there are almost two copies
 of the index on the master, not sure how that happened.



 On Tue, Mar 1, 2011 at 3:08 PM, Markus Jelsma
 markus.jel...@openindex.io wrote:
 Are there pending commits on the master?

 I was curious why would the size be dramatically different even though
 the index versions are the same?

 One is 1.2 Gb, and on the slave it is 512 MB

 I would think they should both be the same size no?

 Thanks




Re: numeric or string type for non-sortable field?

2011-03-01 Thread Chris Hostetter

:  The field value is an id.  Therefore, i can store as
:  either numeric field
:  or just a string.   Shall i choose string
:  for efficiency?
: 
: Trie based integer (tint) is preferred for faster faceting.

range faceting/filtering yes -- not for field faceting which is what i 
think he's asking about.

in that case int would still probably be more efficient, but you don't want 
precision steps (those will introduce added terms)

-Hoss

Re: multi-core solr, specifying the data directory

2011-03-01 Thread Jonathan Rochkind
Hmm, okay, have to try to find time to install the example/multicore and 
see.


It's definitely never worked for me, weird.

Thanks.

On 3/1/2011 2:38 PM, Chris Hostetter wrote:

: Unless I'm doing something wrong, in my experience in multi-core Solr in
: 1.4.1, you NEED to explicitly provide an absolute path to the 'data' dir.

have you looked at the example/multicore directory that was included in
the 1.4.1 release?

it has a solr.xml that loads two cores w/o specifying a data dir in the
solr.xml (or the solrconfig.xml) and it uses the data dir inside the
specified instanceDir.

If that example works for you, but your own configs do not, then we'll
need more details about your own configs -- how are you running solr, what
does the solrconfig.xml of the core look like, etc...


-Hoss



Re: solr different sizes on master and slave

2011-03-01 Thread Jonathan Rochkind
The slave should not keep multiple copies _permanently_, but might 
temporarily after it's fetched the new files from master, but before 
it's committed them and fully warmed the new index searchers in the 
slave.  Could that be what's going on, is your slave just still working 
on committing and warming the new version(s) of the index?


[If you do a 'commit' to the slave (and a replication pull counts as a 
'commit') so quickly that you get overlapping commits before the slave was 
able to warm a new index... it's going to be trouble all around.]


On 3/1/2011 4:27 PM, Mike Franon wrote:

ok doing some more research I noticed, on the slave it has multiple
folders where it keeps them for example

index
index.20110204010900
index.20110204013355
index.20110218125400

and then there is an index.properties that shows which index it is using.

I am just curious why does it keep multiple copies?  Is there a
setting somewhere I can change to only keep one copy so not to lose
space?

Thanks

On Tue, Mar 1, 2011 at 3:26 PM, Mike Franonkongfra...@gmail.com  wrote:

No pending commits, what it looks like is there are almost two copies
of the index on the master, not sure how that happened.



On Tue, Mar 1, 2011 at 3:08 PM, Markus Jelsma
markus.jel...@openindex.io  wrote:

Are there pending commits on the master?


I was curious why would the size be dramatically different even though
the index versions are the same?

One is 1.2 Gb, and on the slave it is 512 MB

I would think they should both be the same size no?

Thanks


Re: numeric or string type for non-sortable field?

2011-03-01 Thread cyang2010
Sorry, I didn't make my question clear.

I will only facet on the field value, not do range queries (it is just some
ids in a multi-valued field). And I won't sort on the field either.

In that case, is string more efficient for the requirement?

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/numberic-or-string-type-for-non-sortable-field-tp2606353p2606762.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Query on multivalue field

2011-03-01 Thread Ahmet Arslan
 In a multiValued field, call it field1, if I have two
 values indexed to 
 this field, say value 1 = some text...termA...more text
 and value 2 = 
 some text...termB...more text and do a search such as
 field1:(termA termB)
 (where solrQueryParser defaultOperator=AND/) I'm
 getting a hit 
 returned even though both terms don't occur within a single
 value in the 
 multiValued field.
 
 What I'm wondering is if there is a way of applying the
 query against 
 each value of the field rather than against the field in
 its entirety. 
 The reason being is the number of values I want to store is
 variable and 
 I'd like to avoid the use of dynamic fields or
 restructuring the index 
 if possible.

Your best bet can be using positionIncrementGap and issuing a phrase query 
(implicit AND) with the appropriate slop value. 

If you have positionIncrementGap="100", you can simulate this using
q=field1:"termA termB"~100

http://search-lucene.com/m/Hbdvz1og7D71/
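The mechanics behind that trick can be sketched: each value of a multiValued field has its token positions offset by positionIncrementGap, so a phrase query whose slop is smaller than the gap cannot match across value boundaries. A rough Python illustration (not Solr's actual matching code; the gap bookkeeping is simplified):

```python
def index_positions(values, gap=100):
    """Assign token positions; each new value starts `gap` past the last token."""
    positions = {}
    pos = 0
    for value in values:
        for token in value.lower().split():
            positions.setdefault(token, []).append(pos)
            pos += 1
        pos += gap  # the positionIncrementGap between values
    return positions

def sloppy_phrase_match(positions, term_a, term_b, slop):
    """True if some occurrence of the two terms lies within `slop` positions."""
    return any(abs(pa - pb) <= slop
               for pa in positions.get(term_a, [])
               for pb in positions.get(term_b, []))

field1 = ["some text termA more text", "other text termB more text"]
idx = index_positions(field1, gap=100)
# Slop 100 spans any distance within one value, but not the 100-position gap:
print(sloppy_phrase_match(idx, "terma", "termb", slop=100))
# → False (the gap keeps terms from different values apart)
```

So with the gap set larger than the longest value's token count, the sloppy phrase query behaves like a per-value AND.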


  


Re: numeric or string type for non-sortable field?

2011-03-01 Thread Ahmet Arslan
 I will only facet based on field value, not ranged
 query  (it is just some
 ids for a  multi-value field).   And i
 won't do sort on the field either.
 
 In that case, is string more efficient for the
 requirement?

Hoss was saying to use <fieldType name="int" class="solr.TrieIntField"
precisionStep="0" omitNorms="true" positionIncrementGap="0"/> 
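Putting the two pieces together, a hedged sketch of how that type might be declared and wired to the multi-valued id field being discussed (the field name category_ids is invented for illustration):

```xml
<!-- Single term per value: precisionStep="0" adds no extra range-query terms -->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0"
           omitNorms="true" positionIncrementGap="0"/>

<!-- A multi-valued id field used only for field faceting; not sorted -->
<field name="category_ids" type="int" indexed="true" stored="true"
       multiValued="true"/>
```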





Searching all terms - SolrJ

2011-03-01 Thread openvictor Open
Dear all,

First, I am sorry if this question has already been asked (I am sure it
was...) but I can't find the right option with SolrJ.

I want to query only documents that contains ALL query terms.
Let me take an example, I have 4 documents that are simple sequences  ( they
have only one field : text ):

1 : The cat is on the roof
2 : The dog is on the roof
3 : The cat is black
4 : the cat is black and on the roof

if I search "cat roof" I will have docs 1, 2, 3, 4.
In my case I would like to have only doc 1 and doc 4 (either cat or roof
doesn't appear in docs 2 and 3).

Is there a simple way to do that automatically with SolrJ, or should I use
something like:
text:cat AND text:roof ?

Thank you very much for your help !

Best regards,
Victor


Re: Searching all terms - SolrJ

2011-03-01 Thread Ahmet Arslan

--- On Wed, 3/2/11, openvictor Open openvic...@gmail.com wrote:

 From: openvictor Open openvic...@gmail.com
 Subject: Searching all terms - SolrJ
 To: solr-user@lucene.apache.org
 Date: Wednesday, March 2, 2011, 12:20 AM
 Dear all,
 
 First I am sorry if this question has already been asked (
 I am sure it
 was...) but I can't find the right option with solrj.
 
 I want to query only documents that contains ALL query
 terms.
 Let me take an example, I have 4 documents that are simple
 sequences  ( they
 have only one field : text ):
 
 1 : The cat is on the roof
 2 : The dog is on the roof
 3 : The cat is black
 4 : the cat is black and on the roof
 
 if I search cat roof I will have doc 1,2,3,4
 In my case I would like to have only : doc 1 and doc 4
 (either cat or roof
 don't appear in doc 2 and 3).
 
 Is there a simple way to do that automatically with SolrJ
 or should I should
 something like :
 text:cat AND text:roof ?
 
 Thank you very much for your help !

You can use <solrQueryParser defaultOperator="AND"/> in your schema.xml
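If changing the schema-wide default operator is not desirable, the explicit form works per query. A small Python sketch that joins terms with AND before sending them to Solr (plain query-string building, not a SolrJ API; the field name text matches the example):

```python
from urllib.parse import urlencode

def all_terms_query(field: str, terms: list[str]) -> str:
    """Require every term: field:t1 AND field:t2 AND ..."""
    return " AND ".join(f"{field}:{t}" for t in terms)

q = all_terms_query("text", ["cat", "roof"])
print(q)                    # → text:cat AND text:roof
print(urlencode({"q": q}))  # URL-encoded, ready to append to /solr/select/?
```

With the default OR operator this query still requires both terms, because AND binds each clause explicitly.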





Re: Query on multivalue field

2011-03-01 Thread Scott Yeadon
The only trick with this is ensuring the searches return the right 
results and don't go across value boundaries. If I set the gap to the 
largest text size we expect (approx 5000 chars), what impact does such a 
large value have (i.e. does Solr physically separate these fragments in 
the index, or just apply the figure as part of any query)?


Scott.

On 2/03/11 9:01 AM, Ahmet Arslan wrote:

In a multiValued field, call it field1, if I have two
values indexed to
this field, say value 1 = some text...termA...more text
and value 2 =
some text...termB...more text and do a search such as
field1:(termA termB)
(where solrQueryParser defaultOperator=AND/) I'm
getting a hit
returned even though both terms don't occur within a single
value in the
multiValued field.

What I'm wondering is if there is a way of applying the
query against
each value of the field rather than against the field in
its entirety.
The reason being is the number of values I want to store is
variable and
I'd like to avoid the use of dynamic fields or
restructuring the index
if possible.

Your best bet may be using positionIncrementGap and issuing a phrase query 
(implicit AND) with the appropriate slop value.

If you have positionIncrementGap=100, you can simulate this with
q=field1:"termA termB"~100

http://search-lucene.com/m/Hbdvz1og7D71/








Re: Distances in spatial search (Solr 4.0)

2011-03-01 Thread Alexandre Rocco
Hi Bill,

I was using a different approach to sort by the distance with the dist()
function, since geodist() is not documented on the wiki (
http://wiki.apache.org/solr/FunctionQuery)

Tried something like:
sort=dist(2, 45.15,-93.85, lat, lng) asc

I made some tests with geodist() function as you pointed and got different
results.
Is it safe to assume that geodist() is the correct way of doing it?

Also, can you clarify how I can see the distance using the _Val_ as you
mentioned?

Thanks!
Alexandre

On Tue, Mar 1, 2011 at 12:03 AM, Bill Bell billnb...@gmail.com wrote:

 Use sort with geodist() to sort by distance.

 Getting the distance returned is documented on the wiki if you are not
 using score. See reference to _Val_

 Bill Bell
 Sent from mobile


 On Feb 28, 2011, at 7:54 AM, Alexandre Rocco alel...@gmail.com wrote:

  Hi guys,
 
  We are implementing a separate index on our website, that will be
 dedicated
  to spatial search.
  I've downloaded a build of Solr 4.0 to try the spatial features and got
 the
  geodist working really fast.
 
  We now have 2 other features that will be needed on this project:
  1. Returning the distance from the reference point to the search hit (in
  kilometers)
  2. Sorting by the distance.
 
  On item 2, the wiki doc points that a distance function can be used but I
  was not able to find good info on how to accomplish it.
  Also, returning the distance (item 1) is noted as currently being in
  development and there is some workaround to get it.
 
  Anyone had experience with the spatial feature and could help with some
  pointers on how to achieve it?
 
  Thanks,
  Alexandre



Re: multi-core solr, specifying the data directory

2011-03-01 Thread Michael Sokolov
I tried this in my 1.4.0 installation (commenting out what had been 
working, hoping the default would work as you said it does in the example):


solr persistent=true sharedLib=lib
cores adminPath=/admin/cores
core name=bpro instanceDir=bpro
!-- property name=solr.data.dir value=solr/bpro/data/ --
/core
core name=pfapp instanceDir=pfapp
property name=solr.data.dir value=solr/pfapp/data/
/core
/cores
/solr

In the log after starting up, I get these messages (among many others):

...

Mar 1, 2011 7:51:23 PM org.apache.solr.core.CoreContainer$Initializer 
initialize

INFO: looking for solr.xml: /usr/local/tomcat/solr/solr.xml
Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader 
locateSolrHome

INFO: No /solr/home in JNDI
Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader 
locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or 
JNDI)

Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader init
INFO: Solr home set to 'solr/'

Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader init
INFO: Solr home set to 'solr/bpro/'
...
Mar 1, 2011 7:51:24 PM org.apache.solr.core.SolrCore init
INFO: [bpro] Opening new SolrCore at solr/bpro/, dataDir=./solr/data/
...
Mar 1, 2011 7:51:25 PM org.apache.solr.core.SolrResourceLoader init
INFO: Solr home set to 'solr/pfapp/'
...
Mar 1, 2011 7:51:26 PM org.apache.solr.core.SolrCore init
INFO: [pfapp] Opening new SolrCore at solr/pfapp/, dataDir=solr/pfapp/data/

and it's pretty clearly using the wrong directory at that point.

Some more details:

/usr/local/tomcat has the usual tomcat distribution (this is 6.0.29)
conf/server.xml has:
Host name=localhost  appBase=webapps
unpackWARs=true autoDeploy=true
xmlValidation=false xmlNamespaceAware=false

Aliasrosen/Alias
Aliasrosen.ifactory.com/Alias
Context path= docBase=/usr/local/tomcat/webapps/solr /

/Host

There is a solrconfig.xml in each of the core directories (should there 
only be one of these?).  I believe these are pretty generic (and they 
are identical); the one in the bpro folder has:


!-- Used to specify an alternate directory to hold all index data
   other than the default ./data under the Solr home.
   If replication is in use, this should match the replication 
configuration

. --
dataDir${solr.data.dir:./solr/data}/dataDir



-Mike

On 3/1/2011 4:38 PM, Jonathan Rochkind wrote:
Hmm, okay, have to try to find time to install the example/multicore 
and see.


It's definitely never worked for me, weird.

Thanks.

On 3/1/2011 2:38 PM, Chris Hostetter wrote:
: Unless I'm doing something wrong, in my experience in multi-core 
Solr in
: 1.4.1, you NEED to explicitly provide an absolute path to the 
'data' dir.


have you looked at the example/multicore directory that was included in
the 1.4.1 release?

it has a solr.xml that loads two cores w/o specifying a data dir in the
solr.xml (or the solrconfig.xml) and it uses the data dir inside the
specified instanceDir.

If that example works for you, but your own configs do not, then we'll
need more details about your own configs -- how are you running solr, 
what

does the solrconfig.xml of the core look like, etc...


-Hoss





Re: Query on multivalue field

2011-03-01 Thread Jonathan Rochkind
Each token has a position set on it. So if you index the value alpha 
beta gamma, it winds up stored in Solr as (sort of, for the way we want 
to look at it)


document1:
alpha:position 1
beta:position 2
gamma: position 3

 If you set the position increment gap large, then after one value in a 
multi-valued field ends, the position increment gap will be added to the 
positions for the next value. Solr doesn't actually internally have much 
of any idea of a multi-valued field, ALL a multi-valued indexed field 
is, is a position increment gap separating tokens from different 'values'.


So index in a multi-valued field, with position increment gap 10000,  
the values:  [alpha beta gamma, aleph bet], you get kind of like:


document1:
alpha: 1
beta: 2
gamma: 3
aleph: 10004
bet: 10005

A large position increment gap, as far as I know and can tell (please 
someone correct me if I'm wrong, I am not a Solr developer) has no 
effect on the size or efficiency of your index on disk.


I am not sure why positionIncrementGap doesn't just default to a very 
large number, to provide behavior that more closely matches what people expect 
from the idea of a multi-valued field. So maybe there is some flaw in 
my understanding that justifies some reason for it not to be this way?


But I set my positionIncrementGap very large, and haven't seen any issues.
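The position bookkeeping described above can be simulated in a few lines of plain Java (a sketch of the idea only, not Lucene's actual internals; positions here start at 1 to match the example):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PositionGapDemo {
    // Assign positions the way a multi-valued field does: tokens within one
    // value get consecutive positions, and positionIncrementGap is added
    // between values.
    static Map<String, Integer> positions(List<String> values, int gap) {
        Map<String, Integer> out = new LinkedHashMap<>();
        int pos = 0;
        boolean firstValue = true;
        for (String value : values) {
            if (!firstValue) pos += gap;   // the gap separating 'values'
            firstValue = false;
            for (String token : value.split("\\s+")) {
                out.put(token, ++pos);     // default increment of 1 per token
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Reproduces the example: alpha=1, beta=2, gamma=3, aleph=10004, bet=10005
        System.out.println(positions(List.of("alpha beta gamma", "aleph bet"), 10000));
    }
}
```

A phrase query with a slop smaller than the gap then cannot match across two values, which is what the slop trick in this thread relies on.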


On 3/1/2011 5:46 PM, Scott Yeadon wrote:

The only trick with this is ensuring the searches return the right
results and don't go across value boundaries. If I set the gap to the
largest text size we expect (approx 5000 chars) what impact does such a
large value have (i.e. does Solr physically separate these fragments in
the index or just apply the figure as part of any query?

Scott.

On 2/03/11 9:01 AM, Ahmet Arslan wrote:

In a multiValued field, call it field1, if I have two
values indexed to
this field, say value 1 = some text...termA...more text
and value 2 =
some text...termB...more text and do a search such as
field1:(termA termB)
(wheresolrQueryParser defaultOperator=AND/) I'm
getting a hit
returned even though both terms don't occur within a single
value in the
multiValued field.

What I'm wondering is if there is a way of applying the
query against
each value of the field rather than against the field in
its entirety.
The reason being is the number of values I want to store is
variable and
I'd like to avoid the use of dynamic fields or
restructuring the index
if possible.

Your best bet can be using positionIncrementGap and to issue a phrase query 
(implicit AND) with the appropriate slop value.

If you have positionIncrementGap=100, you can simulate this with
q=field1:"termA termB"~100

http://search-lucene.com/m/Hbdvz1og7D71/








Re: multi-core solr, specifying the data directory

2011-03-01 Thread Jonathan Rochkind
This definitely matches my own experience, and I've heard it from 
others. I haven't heard of anyone who HAS gotten it to work like that.  
But apparently there's a multi-core example distributed with Solr that is 
claimed to work in a way it doesn't for us.


One of us has to try the Solr distro multi-core example, as Hoss 
suggested/asked, to see if the problem shows up even there, and if not, 
figure out what the difference is.  Sorry, haven't found time to figure 
out how to install and start up the demo.


I am running in Tomcat, I wonder if container could matter, and maybe it 
somehow works in Jetty or something?


Jonathan


On 3/1/2011 7:05 PM, Michael Sokolov wrote:

I tried this in my 1.4.0 installation (commenting out what had been
working, hoping the default would be as you said works in the example):

solr persistent=true sharedLib=lib
cores adminPath=/admin/cores
core name=bpro instanceDir=bpro
!--property name=solr.data.dir value=solr/bpro/data/  --
/core
core name=pfapp instanceDir=pfapp
property name=solr.data.dir value=solr/pfapp/data/
/core
/cores
/solr

In the log after starting up, I get these messages (among many others):

...

Mar 1, 2011 7:51:23 PM org.apache.solr.core.CoreContainer$Initializer
initialize
INFO: looking for solr.xml: /usr/local/tomcat/solr/solr.xml
Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: No /solr/home in JNDI
Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or
JNDI)
Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoaderinit
INFO: Solr home set to 'solr/'

Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoaderinit
INFO: Solr home set to 'solr/bpro/'
...
Mar 1, 2011 7:51:24 PM org.apache.solr.core.SolrCoreinit
INFO: [bpro] Opening new SolrCore at solr/bpro/, dataDir=./solr/data/
...
Mar 1, 2011 7:51:25 PM org.apache.solr.core.SolrResourceLoaderinit
INFO: Solr home set to 'solr/pfapp/'
...
Mar 1, 2011 7:51:26 PM org.apache.solr.core.SolrCoreinit
INFO: [pfapp] Opening new SolrCore at solr/pfapp/, dataDir=solr/pfapp/data/

and it's pretty clearly using the wrong directory at that point.

Some more details:

/usr/local/tomcat has the usual tomcat distribution (this is 6.0.29)
conf/server.xml has:
Host name=localhost  appBase=webapps
  unpackWARs=true autoDeploy=true
  xmlValidation=false xmlNamespaceAware=false

Aliasrosen/Alias
Aliasrosen.ifactory.com/Alias
Context path= docBase=/usr/local/tomcat/webapps/solr /

/Host

There is a solrconfig.xml in each of the core directories (should there
only be one of these?).  I believe these are pretty generic (and they
are identical); the one in the bpro folder has:

!-- Used to specify an alternate directory to hold all index data
 other than the default ./data under the Solr home.
 If replication is in use, this should match the replication
configuration
. --
dataDir${solr.data.dir:./solr/data}/dataDir



-Mike

On 3/1/2011 4:38 PM, Jonathan Rochkind wrote:

Hmm, okay, have to try to find time to install the example/multicore
and see.

It's definitely never worked for me, weird.

Thanks.

On 3/1/2011 2:38 PM, Chris Hostetter wrote:

: Unless I'm doing something wrong, in my experience in multi-core
Solr in
: 1.4.1, you NEED to explicitly provide an absolute path to the
'data' dir.

have you looked at the example/multicore directory that was included in
the 1.4.1 release?

it has a solr.xml that loads two cores w/o specifying a data dir in the
solr.xml (or the solrconfig.xml) and it uses the data dir inside the
specified instanceDir.

If that example works for you, but your own configs do not, then we'll
need more details about your own configs -- how are you running solr,
what
does the solrconfig.xml of the core look like, etc...


-Hoss





[ANNOUNCE] Web Crawler

2011-03-01 Thread Dominique Bejean

Hi,

I would like to announce Crawl Anywhere. Crawl-Anywhere is a Java Web 
Crawler. It includes :


   * a crawler
   * a document processing pipeline
   * a solr indexer

The crawler has a web administration interface for managing the web sites to be 
crawled. Each web site crawl is configured with many possible 
parameters (not all mandatory):


   * number of simultaneous items crawled by site
   * recrawl period rules based on item type (html, PDF, …)
   * item type inclusion / exclusion rules
   * item path inclusion / exclusion / strategy rules
   * max depth
   * web site authentication
   * language
   * country
   * tags
   * collections
   * ...

The pipeline includes various ready-to-use stages (text extraction, 
language detection, a writer producing Solr-ready index XML, ...).


All of this is very configurable and extensible, either by scripting or Java coding.

With scripting technology, you can help the crawler handle javascript 
links, or help the pipeline extract the relevant title and clean up the 
html pages (remove menus, headers, footers, ...).


With java coding, you can develop your own pipeline stage.

The Crawl Anywhere web site provides good explanations and screenshots. 
Everything is documented in a wiki.


The current version is 1.1.4. You can download and try it out from: 
www.crawl-anywhere.com



Regards

Dominique



Re: Searching all terms - SolrJ

2011-03-01 Thread openvictor Open
Yes, but I want to leave the choice to the user.

He can either search all the terms or just some.

Is there any more flexible solution? Even if I have to code it by hand?
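If you do code it by hand, the simplest route is to build the query string yourself: join the user's terms with AND when they want all terms, OR otherwise, and pass the result as q (e.g. via SolrQuery.setQuery in SolrJ). A minimal sketch; the class, method, and field names here are illustrative, not SolrJ API:

```java
import java.util.List;

public class QueryBuilder {
    // Join user terms into a Lucene-syntax query against one field.
    // requireAll=true -> every term must match (AND); false -> any term (OR).
    static String build(String field, List<String> terms, boolean requireAll) {
        String op = requireAll ? " AND " : " OR ";
        StringBuilder sb = new StringBuilder();
        for (String term : terms) {
            if (sb.length() > 0) sb.append(op);
            sb.append(field).append(':').append(term);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "cat roof" with all terms required becomes text:cat AND text:roof
        System.out.println(build("text", List.of("cat", "roof"), true));
    }
}
```

Terms that may contain spaces or query syntax would additionally need quoting or escaping before being joined this way.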



2011/3/1 Ahmet Arslan iori...@yahoo.com


 --- On Wed, 3/2/11, openvictor Open openvic...@gmail.com wrote:

  From: openvictor Open openvic...@gmail.com
  Subject: Searching all terms - SolrJ
  To: solr-user@lucene.apache.org
  Date: Wednesday, March 2, 2011, 12:20 AM
  Dear all,
 
  First I am sorry if this question has already been asked (
  I am sure it
  was...) but I can't find the right option with solrj.
 
  I want to query only documents that contains ALL query
  terms.
  Let me take an example, I have 4 documents that are simple
  sequences  ( they
  have only one field : text ):
 
  1 : The cat is on the roof
  2 : The dog is on the roof
  3 : The cat is black
  4 : the cat is black and on the roof
 
  if I search cat roof I will have doc 1,2,3,4
  In my case I would like to have only : doc 1 and doc 4
  (either cat or roof
  don't appear in doc 2 and 3).
 
  Is there a simple way to do that automatically with SolrJ
  or should I should
  something like :
  text:cat AND text:roof ?
 
  Thank you very much for your help !

 You can use solrQueryParser defaultOperator=AND/ in your schema.xml






Re: Query on multivalue field

2011-03-01 Thread Scott Yeadon
Tested it out and it seems to work well as long as I set the gap to a value 
much longer than the text - 1 appears to work fine for our current 
data. Thanks heaps for all the help, guys!


Scott.

On 2/03/11 11:13 AM, Jonathan Rochkind wrote:
Each token has a position set on it. So if you index the value alpha 
beta gamma, it winds up stored in Solr as (sort of, for the way we 
want to look at it)


document1:
alpha:position 1
beta:position 2
gamma: position 3

 If you set the position increment gap large, then after one value in 
a multi-valued field ends, the position increment gap will be added to 
the positions for the next value. Solr doesn't actually internally 
have much of any idea of a multi-valued field, ALL a multi-valued 
indexed field is, is a position increment gap separating tokens from 
different 'values'.


So index in a multi-valued field, with position increment gap 10000,  
the values:  [alpha beta gamma, aleph bet], you get kind of like:


document1:
alpha: 1
beta: 2
gamma: 3
aleph: 10004
bet: 10005

A large position increment gap, as far as I know and can tell (please 
someone correct me if I'm wrong, I am not a Solr developer) has no 
effect on the size or efficiency of your index on disk.


I am not sure why positionIncrementGap doesn't just default to a very 
large number, to provide behavior that more closely matches what people expect 
from the idea of a multi-valued field. So maybe there is some flaw 
in my understanding that justifies some reason for it not to be this 
way?


But I set my positionIncrementGap very large, and haven't seen any 
issues.



On 3/1/2011 5:46 PM, Scott Yeadon wrote:

The only trick with this is ensuring the searches return the right
results and don't go across value boundaries. If I set the gap to the
largest text size we expect (approx 5000 chars) what impact does such a
large value have (i.e. does Solr physically separate these fragments in
the index or just apply the figure as part of any query?

Scott.

On 2/03/11 9:01 AM, Ahmet Arslan wrote:

In a multiValued field, call it field1, if I have two
values indexed to
this field, say value 1 = some text...termA...more text
and value 2 =
some text...termB...more text and do a search such as
field1:(termA termB)
(wheresolrQueryParser defaultOperator=AND/) I'm
getting a hit
returned even though both terms don't occur within a single
value in the
multiValued field.

What I'm wondering is if there is a way of applying the
query against
each value of the field rather than against the field in
its entirety.
The reason being is the number of values I want to store is
variable and
I'd like to avoid the use of dynamic fields or
restructuring the index
if possible.
Your best bet can be using positionIncrementGap and to issue a 
phrase query (implicit AND) with the appropriate slop value.


If you have positionIncrementGap=100, you can simulate this with 
using

q=field1:"termA termB"~100

http://search-lucene.com/m/Hbdvz1og7D71/












Re: numberic or string type for non-sortable field?

2011-03-01 Thread cyang2010
Can I know why?  I thought solr is tuned for strings if no sorting or faceting by
range query is needed.

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/numberic-or-string-type-for-non-sortable-field-tp2606353p2607932.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr different sizes on master and slave

2011-03-01 Thread Markus Jelsma
Indeed, the slave should not keep useless copies, but it does, at least in 
1.4.0; i haven't seen it in 3.x, but that was just a small test that did not 
exactly match my other production installs.

In 1.4.0 Solr does not remove old copies at startup and it does not cleanly 
abort running replications at shutdown. Between shutdown and startup there 
might be a higher index version; it will then proceed as expected, download 
the new version and continue. Old copies will appear.

There is an earlier thread i started, but without a patch. You can, however, work 
around the problem by letting Solr delete a running replication: 1) disable 
polling, and then 2) abort replication. You can also write a script that 
compares the current and available replication directories before startup and 
acts accordingly.
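Such a pre-startup check can be sketched as follows; it only lists the snapshot directories that index.properties no longer references, leaving the actual deletion to the caller. The index= property key and directory layout are taken from this thread, so verify them against your own install before relying on this:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class PruneStaleIndexes {
    // Return the names of index.* directories that index.properties does not
    // point at; these are the stale replication copies discussed above.
    static List<String> stale(Path dataDir) throws IOException {
        Path props = dataDir.resolve("index.properties");
        if (!Files.exists(props)) return List.of(); // plain ./index is in use
        Properties p = new Properties();
        try (InputStream in = Files.newInputStream(props)) {
            p.load(in);
        }
        String current = p.getProperty("index", "index");
        List<String> result = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dataDir, "index.*")) {
            for (Path d : ds) {
                String name = d.getFileName().toString();
                if (Files.isDirectory(d) && !name.equals(current)) {
                    result.add(name);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Demo against a throwaway directory mimicking the layout in the thread.
        Path tmp = Files.createTempDirectory("prune-demo");
        Files.writeString(tmp.resolve("index.properties"), "index=index.20110218125400\n");
        Files.createDirectory(tmp.resolve("index.20110204010900"));
        Files.createDirectory(tmp.resolve("index.20110218125400"));
        System.out.println(stale(tmp)); // only the unreferenced snapshot is listed
    }
}
```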


 The slave should not keep multiple copies _permanently_, but might
 temporarily after it's fetched the new files from master, but before
 it's committed them and fully warmed the new index searchers in the
 slave.  Could that be what's going on, is your slave just still working
 on committing and warming the new version(s) of the index?
 
 [If you do 'commit' to slave (and a replication pull counts as a
 'commit') so quick that you get overlapping commits before the slave was
 able to warm a new index... it's going to be trouble all around.]
 
 On 3/1/2011 4:27 PM, Mike Franon wrote:
  ok doing some more research I noticed, on the slave it has multiple
  folders where it keeps them for example
  
  index
  index.20110204010900
  index.20110204013355
  index.20110218125400
  
  and then there is an index.properties that shows which index it is using.
  
  I am just curious why does it keep multiple copies?  Is there a
  setting somewhere I can change to only keep one copy so not to lose
  space?
  
  Thanks
  
  On Tue, Mar 1, 2011 at 3:26 PM, Mike Franonkongfra...@gmail.com  wrote:
  No pending commits, what it looks like is there are almost two copies
  of the index on the master, not sure how that happened.
  
  
  
  On Tue, Mar 1, 2011 at 3:08 PM, Markus Jelsma
  
  markus.jel...@openindex.io  wrote:
  Are there pending commits on the master?
  
  I was curious why would the size be dramatically different even though
  the index versions are the same?
  
  One is 1.2 Gb, and on the slave it is 512 MB
  
  I would think they should both be the same size no?
  
  Thanks


Re: Indexed, but cannot search

2011-03-01 Thread Markus Jelsma
Hmm, please provide the analyzer of text and the output of debugQuery=true. Anyway, if 
the field type is fieldType text and the catchall field text is fieldType text as 
well, and you reindexed, it should work as expected.

 Oh if only it were that easy :-). I have reindexed since making that change
 which is how I was able to get the regular search working. I have not
 however been able to get the search across all fields to work.
 
 On Tue, Mar 1, 2011 at 3:01 PM, Markus Jelsma 
markus.jel...@openindex.iowrote:
  Traditionally, people forget to reindex ;)
  
   Hi all,
   
   The problem was that my fields were defined as type=string instead of
   type=text. Once I corrected that, it seems to be fixed. The only part
   that still is not working though is the search across all fields.
   
   For example:
   
   http://localhost:8983/solr/select/?q=type%3AMammal
   
   Now correctly returns the records matching mammal. But if I try to do a
   global search across all fields:
   
   http://localhost:8983/solr/select/?q=Mammal
   http://localhost:8983/solr/select/?q=text%3AMammal
   
   I get no results returned. Here is how the schema is set up:
   
   field name=text type=text indexed=true stored=false
   multiValued=true/
   defaultSearchFieldtext/defaultSearchField
   copyField source=* dest=text /
   
   Thanks to everyone for your help so far. I think this is the last
   hurdle
  
  I
  
   have to jump over.
   
   On Tue, Mar 1, 2011 at 12:34 PM, Upayavira u...@odoko.co.uk wrote:
Next question, do you have your type field set to index=true in
  
  your
  
schema?

Upayavira

On Tue, 01 Mar 2011 11:06 -0500, Brian Lamb

brian.l...@journalexperts.com wrote:
 Thank you for your reply but the searching is still not working
 out. For example, when I go to:
 
 http://localhost:8983/solr/select/?q=*%3A*
  
  http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10in
  
dent=on

 I get the following as a response:
 
 result name=response numFound=249943 start=0
 
   doc
   
 str name=typeMammal/str
 str name=id1/str
 str name=genusCanis/str
   
   /doc
 
 /response
 
 (plus some other docs but one is enough for this example)
 
 But if I go to
 http://localhost:8983/solr/select/?q=type%3A
  
  http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10in
  
dent=on

 Mammal
 
 I only get:
 
 result name=response numFound=0 start=0
 
 But it seems that should return at least the result I have listed
 above. What am I doing incorrectly?
 
 On Mon, Feb 28, 2011 at 6:57 PM, Upayavira u...@odoko.co.uk wrote:
  q=dog is equivalent to q=text:dog (where the default search field
  
  is
  
  defined as text at the bottom of schema.xml).
  
  If you want to specify a different field, well, you need to tell
  it
  
  :-)
  
  Is that it?
  
  Upayavira
  
  On Mon, 28 Feb 2011 15:38 -0500, Brian Lamb
  
  brian.l...@journalexperts.com wrote:
   Hi all,
   
   I was able to get my installation of Solr indexed using
  
  dataimport.
  
   However,
   I cannot seem to get search working. I can verify that the data
  
  is
  
there

   by
  
   going to:
  http://localhost:8983/solr/select/?q=*%3A*version=2.2start=0rows=10in
  
dent=on

   This gives me the response: result name=response
   numFound=234961 start=0
   
   But when I go to
  
  http://localhost:8983/solr/select/?q=dogversion=2.2start=0rows=10inde
  
nt=on

   I get the response: result name=response numFound=0
  
  start=0
  
   I know that dog should return some results because it is the
  
  first
  
result

   when I select all the records. So what am I doing incorrectly
  
  that
  
would

   prevent me from seeing results?
  
  ---
  Enterprise Search Consultant at Sourcesense UK,
  Making Sense of Open Source

---
Enterprise Search Consultant at Sourcesense UK,
Making Sense of Open Source


Re: multi-core solr, specifying the data directory

2011-03-01 Thread Chris Hostetter
: !-- Used to specify an alternate directory to hold all index data
:other than the default ./data under the Solr home.
:If replication is in use, this should match the replication
: configuration
: . --
: dataDir${solr.data.dir:./solr/data}/dataDir

that directive says: use the solr.data.dir system property to pick a path, 
and if it is not set, use ./solr/data (relative to the CWD)

if you want it to use the default, then you need to eliminate it 
completely, or you need to change it to the empty string...

   dataDir${solr.data.dir:}/dataDir

or...

   dataDir/dataDir


-Hoss


Re: numberic or string type for non-sortable field?

2011-03-01 Thread Chris Hostetter

: Can I know why?  I thought solr is tuned for string if no sorting of facet by
: range query is needed.

tuned for string doesn't really mean anything to me; i'm not sure what 
that's in reference to.  nothing that i know of is particularly optimized 
for strings.  Almost anything can be indexed/stored/represented as a 
string (in some form or another) and that tends to work fine in solr, but 
some things are optimized for other more specialized datatypes.

the reason i suggested that using ints might (marginally) be better is 
because of the FieldCache and the fieldValueCache -- the int 
representation uses less memory than if it was holding strings 
representing the same ints.

worrying about that is really a premature optimization though -- model 
your data in the way that makes the most sense -- if your ids are 
inherently ints, model them as ints until you come up with a reason to 
model them otherwise and move on to the next problem.


-Hoss


Re: Distances in spatial search (Solr 4.0)

2011-03-01 Thread William Bell
See http://wiki.apache.org/solr/SpatialSearch and yes, use sort=geodist()+asc

This wiki page has everything you should need.
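As I understand it, geodist() sorts by great-circle distance in kilometers, which is essentially the haversine formula; here is a standalone sketch of that computation (the earth-radius constant is an approximation and may differ slightly from Solr's):

```java
public class Haversine {
    static final double EARTH_RADIUS_KM = 6371.0; // mean earth radius, approximate

    // Great-circle distance in kilometers between two lat/lon points,
    // computed with the haversine formula.
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // Distance from the reference point used earlier in the thread
        System.out.println(distanceKm(45.15, -93.85, 44.98, -93.27));
    }
}
```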


On Tue, Mar 1, 2011 at 3:49 PM, Alexandre Rocco alel...@gmail.com wrote:
 Hi Bill,

 I was using a different approach to sort by the distance with the dist()
 function, since geodist() is not documented on the wiki (
 http://wiki.apache.org/solr/FunctionQuery)

 Tried something like:
 sort=dist(2, 45.15,-93.85, lat, lng) asc

 I made some tests with geodist() function as you pointed and got different
 results.
 Is it safe to assume that geodist() is the correct way of doing it?

 Also, can you clear up how can I see the distance using the _Val_ as you
 told?

 Thanks!
 Alexandre

 On Tue, Mar 1, 2011 at 12:03 AM, Bill Bell billnb...@gmail.com wrote:

 Use sort with geodist() to sort by distance.

 Getting the distance returned is documented on the wiki if you are not
 using score. see reference to _Val_

 Bill Bell
 Sent from mobile


 On Feb 28, 2011, at 7:54 AM, Alexandre Rocco alel...@gmail.com wrote:

  Hi guys,
 
  We are implementing a separate index on our website, that will be
 dedicated
  to spatial search.
  I've downloaded a build of Solr 4.0 to try the spatial features and got
 the
  geodist working really fast.
 
  We now have 2 other features that will be needed on this project:
  1. Returning the distance from the reference point to the search hit (in
  kilometers)
  2. Sorting by the distance.
 
  On item 2, the wiki doc points that a distance function can be used but I
  was not able to find good info on how to accomplish it.
  Also, returning the distance (item 1) is noted as currently being in
  development and there is some workaround to get it.
 
  Anyone had experience with the spatial feature and could help with some
  pointers on how to achieve it?
 
  Thanks,
  Alexandre




Re: Question about Nested Span Near Query

2011-03-01 Thread William Bell
I am not 100% sure. But why did you not use the standard config for text?

fieldType name=text class=solr.TextField
positionIncrementGap=100 autoGeneratePhraseQueries=true
  analyzer type=index
tokenizer class=solr.WhitespaceTokenizerFactory/
!-- in this example, we will only use synonyms at query time
filter class=solr.SynonymFilterFactory
synonyms=index_synonyms.txt ignoreCase=true expand=false/
--
!-- Case insensitive stop word removal.
  add enablePositionIncrements=true in both the index and query
  analyzers to leave a 'gap' for more accurate phrase queries.
--
filter class=solr.StopFilterFactory
ignoreCase=true
words=stopwords.txt
enablePositionIncrements=true
/
filter class=solr.WordDelimiterFilterFactory
generateWordParts=1 generateNumberParts=1 catenateWords=1
catenateNumbers=1 catenateAll=0 splitOnCaseChange=1/
filter class=solr.LowerCaseFilterFactory/
filter class=solr.KeywordMarkerFilterFactory
protected=protwords.txt/
filter class=solr.PorterStemFilterFactory/
  /analyzer
  analyzer type=query
tokenizer class=solr.WhitespaceTokenizerFactory/
filter class=solr.SynonymFilterFactory
synonyms=synonyms.txt ignoreCase=true expand=true/
filter class=solr.StopFilterFactory
ignoreCase=true
words=stopwords.txt
enablePositionIncrements=true
/
filter class=solr.WordDelimiterFilterFactory
generateWordParts=1 generateNumberParts=1 catenateWords=0
catenateNumbers=0 catenateAll=0 splitOnCaseChange=1/
filter class=solr.LowerCaseFilterFactory/
filter class=solr.KeywordMarkerFilterFactory
protected=protwords.txt/
filter class=solr.PorterStemFilterFactory/
  /analyzer
/fieldType


You are using:

- fieldtype name=text class=solr.TextField
- analyzer
  tokenizer class=solr.StandardTokenizerFactory
luceneMatchVersion=LUCENE_29 /
  filter class=solr.StandardFilterFactory /
  filter class=solr.LowerCaseFilterFactory /
- !--
 filter class=solr.StopFilterFactory luceneMatchVersion=LUCENE_29/
  filter class=solr.EnglishPorterFilterFactory/

  --
  /analyzer
  /fieldtype


Can you try a more standard approach ?

solr.WhitespaceTokenizerFactory
solr.LowerCaseFilterFactory

??

Thanks.


On Mon, Feb 28, 2011 at 2:38 AM, Ahsan |qbal ahsan.iqbal...@gmail.com wrote:
 Hi Bill
 Any update..

 On Thu, Feb 24, 2011 at 8:58 PM, Ahsan |qbal ahsan.iqbal...@gmail.com
 wrote:

 Hi
 schema and document are attached.

 On Thu, Feb 24, 2011 at 8:24 PM, Bill Bell billnb...@gmail.com wrote:

 Send schema and document in XML format and I'll look at it

 Bill Bell
 Sent from mobile


 On Feb 24, 2011, at 7:26 AM, Ahsan |qbal ahsan.iqbal...@gmail.com
 wrote:

  Hi
 
  To narrow down the issue I indexed a single document with one of the
  sample
  queries (given below) which was giving issue.
 
  *evaluation of loan and lease portfolios for purposes of assessing the
  adequacy of *
 
  Now when i Perform a search query (*TextContents:evaluation of loan
  and
  lease portfolios for purposes of assessing the adequacy of*) the
  parsed
  query is
 
 
  *spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([Contents:evaluation,
  Contents:of], 0, true), Contents:loan], 0, true), Contents:and], 0,
  true),
  Contents:lease], 0, true), Contents:portfolios], 0, true),
  Contents:for], 0,
  true), Contents:purposes], 0, true), Contents:of], 0, true),
  Contents:assessing], 0, true), Contents:the], 0, true),
  Contents:adequacy],
  0, true), Contents:of], 0, true)*
 
  and search is not successful.
 
  If I remove '*evaluation*' from start OR *'assessing the adequacy of*'
  from
  end it works fine. Issue seems to come on relatively long phrases but I
  have
  not been able to find a pattern and its really mind boggling coz I
  thought
  this issue might be due to large position list but this is a single
  document
  with one phrase. So its definitely not related to size of index.
 
  Any ideas whats going on??
 
  On Thu, Feb 24, 2011 at 10:25 AM, Ahsan |qbal
  ahsan.iqbal...@gmail.comwrote:
 
  Hi
 
  It didn't search (meaning no results are found even though results
  exist). One observation is that it works well even on long phrases,
  but when a long phrase contains stop words and the same stop word
  exists two or more times in the phrase, then Solr can't search with
  the query parsed in this way.
 
 
  On Wed, Feb 23, 2011 at 11:49 PM, Otis Gospodnetic 
  otis_gospodne...@yahoo.com wrote:
 
  Hi,
 
  What do you mean by this doesn't work fine?  Does it not work
  correctly
  or is
  it slow or ...
 
  I was going to suggest you look at Surround QP, but it looks like you
  already
  did that.  Wouldn't it be better to get Surround QP to work?
 
  Otis
  
  Sematext :: 

indexing mysql dateTime/timestamp into solr date field

2011-03-01 Thread cyang2010
Hi,

I can't seem to be able to index into a Solr date field from a query result
using DataImportHandler.  Does anyone know how to resolve the problem?

<entity name="title"
        query="select ID, title_full as TITLE_NAME, YEAR,
               COUNTRY_OF_ORIGIN, modified as RELEASE_DATE from title limit 10">

    <field column="ID" name="id" />
    <field column="TITLE_NAME" name="title_name" />
    <field column="YEAR" name="year" />
    <field column="COUNTRY_OF_ORIGIN" name="country" />
    <field column="RELEASE_DATE" name="release_date" />
</entity>

When I check the Solr document, there is no term populated for the
release_date field.  All other fields are populated with terms.

The release_date field is a Solr date-type field.


Appreciate your help.


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-mysql-dateTime-timestamp-into-solr-date-field-tp2608327p2608327.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: More Date Math: NOW/WEEK

2011-03-01 Thread Chris Hostetter
: Digging into the source code of DateMathParser.java, I found the following
: comment:
:
:   // NOTE: consciously choosing not to support WEEK at this time,
:   // because of complexity in rounding down to the nearest week
:   // arround a month/year boundry.
:   // (Not to mention: it's not clear what people would *expect*)
:
: I was able to implement a work-around in my ruby client using the following
: pseudo code:
:   wd=NOW.wday; "NOW-#{wd}DAY/DAY"

The main issue that comment in DateMathParser.java is referring to is the
ambiguity of what should happen when you try to do something like
2009-01-02T00:00:00Z/WEEK

WEEK would be the only unit where rounding changed a unit
*larger* than the one you rounded on -- ie: rounding on day only affects
hours, minutes, seconds, millis; rounding on month only affects days,
hours, minutes, seconds, millis; but in an example like the one above,
where Jan 2 2009 was a Friday, rounding down a week (using logic similar
to what you have) would result in 2008-12-28T00:00:00Z -- changing the
month and year.

It's not really clear that that is what people would expect -- I'm
guessing at least a few people would expect it to stop at the 1st of the
month.

The ambiguity of what behavior makes the most sense is why I never got
around to implementing it -- it's certainly possible, but the
various options seemed too confusing to really be very generally useful
and easy to understand.

As you point out: people who really want special logic like this (and know
how they want it to behave) have an easy workaround by evaluating NOW
in the client, since every week has exactly seven days.
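
The client-side workaround can be sketched like this (a Python rendering of the ruby pseudo code above; the function name and the Sunday-based start of week are assumptions):

```python
from datetime import date

def start_of_week_param(today):
    """Build a Solr date-math expression that rounds 'today' down to
    the most recent Sunday, mirroring the ruby wd=NOW.wday workaround.
    Ruby's wday is 0 for Sunday; Python's weekday() is 0 for Monday,
    so we convert."""
    wd = (today.weekday() + 1) % 7   # days elapsed since Sunday
    return "NOW-%dDAY/DAY" % wd

# Friday 2009-01-02 is 5 days after Sunday 2008-12-28
print(start_of_week_param(date(2009, 1, 2)))  # NOW-5DAY/DAY
```

The resulting string can be dropped straight into a range query such as release_date:[NOW-5DAY/DAY TO NOW].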



-Hoss


Re: indexing mysql dateTime/timestamp into solr date field

2011-03-01 Thread Chris Hostetter
:   query=select ID,  title_full as TITLE_NAME, YEAR,
: COUNTRY_OF_ORIGIN,  modified as RELEASE_DATE from title limit 10

Are you certain that the first 10 results returned (you have limit 10)
all have a value in the modified field?

If modified is nullable, you could very easily just happen to be getting 10
docs that don't have values in that field.


-Hoss


Re: Searching all terms - SolrJ

2011-03-01 Thread Chris Hostetter

: Yes but I want to leave the choice to the user.
: 
: He can either search all the terms or just some.
: 
: Is there any more flexible solution ? Even if I have to code it by hand ?

The solrQueryParser defaultOperator declaration in the schema dictates the
default.

You can override the default at query time using the q.op param (ie:
q.op=AND, q.op=OR) in the request.

In SolrJ you would just call solrQuery.set("q.op", "OR") on your SolrQuery
object.
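
For a raw HTTP request the same override is just another query parameter; a small sketch of building such a URL (the host, port, and /solr/select path are assumptions for illustration):

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host/core to your setup.
base = "http://localhost:8983/solr/select"

def search_url(user_query, match_all):
    # q.op=AND requires every term to match; q.op=OR matches any term,
    # overriding the default declared in schema.xml.
    params = {"q": user_query, "q.op": "AND" if match_all else "OR"}
    return base + "?" + urlencode(params)

# The generated URL carries q.op=OR, so only some terms need to match.
print(search_url("loan lease portfolios", match_all=False))
```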

-Hoss


Re: indexing mysql dateTime/timestamp into solr date field

2011-03-01 Thread cyang2010
Yes, I am pretty sure every row has a modified field.   I did my testing
before posting the question.

I tried adding DateFormatTransformer; it still didn't help.


<entity name="title"
        query="select ID, title_full as TITLE_NAME, YEAR,
               COUNTRY_OF_ORIGIN, modified as RELEASE_DATE from title limit 10"
        transformer="RegexTransformer,DateFormatTransformer,TemplateTransformer">

    <field column="ID" name="id" />
    <field column="TITLE_NAME" name="title_name" />
    <field column="YEAR" name="year" />
    <field column="COUNTRY_OF_ORIGIN" name="country" />
    <field column="RELEASE_DATE" name="release_date"
           dateTimeFormat="yyyy-MM-dd"/>
</entity>

I assume it is ok to just get the date part of the information out of a
datetime field?

Any thought on this?

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-mysql-dateTime-timestamp-into-solr-date-field-tp2608327p2608452.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: indexing mysql dateTime/timestamp into solr date field

2011-03-01 Thread William Bell
<field column="date" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss" />

Did you convert the date to standard GMT format as above in DIH?

Also add transformer="DateFormatTransformer,..."

http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html
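
For reference, the conversion involved can be sketched in Python (assuming the MySQL value is stored as UTC; dateTimeFormat describes the *incoming* value, while Solr's DateField expects the trailing-'Z' ISO-8601 form):

```python
from datetime import datetime

def mysql_to_solr_date(value):
    """Convert a MySQL DATETIME string (assumed to be UTC) into the
    ISO-8601 form Solr's DateField expects, with a trailing 'Z'."""
    dt = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")  # parse incoming value
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")            # emit Solr date form

print(mysql_to_solr_date("2011-03-01 19:54:07"))  # 2011-03-01T19:54:07Z
```

Note the 24-hour HH in the parse pattern; a 12-hour hh would fail on afternoon timestamps.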



On Tue, Mar 1, 2011 at 7:54 PM, cyang2010 ysxsu...@hotmail.com wrote:
 Yes, I am pretty sure every row has a modified field.   I did my testing
 before posting question.

 I tried with adding DateFormatTransformer, still not help.


        entity name=title
                        query=select ID,  title_full as TITLE_NAME, YEAR,
 COUNTRY_OF_ORIGIN,  modified as RELEASE_DATE from title limit 10


 transformer=RegexTransformer,DateFormatTransformer,TemplateTransformer

            field column=ID name=id /

            field column=TITLE_NAME name=title_name /

            field column=YEAR name=year /
            field column=COUNTRY_OF_ORIGIN name=country /

            field column=RELEASE_DATE name=release_date
 dateTimeFormat=-MM-dd/

 I assume it is ok to just get the date part of the information out of a
 datetime field?

 Any thought on this?

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/indexing-mysql-dateTime-timestamp-into-solr-date-field-tp2608327p2608452.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: [ANNOUNCE] Web Crawler

2011-03-01 Thread David Smiley (@MITRE.org)
Dominique,
The obvious number one question is of course why you re-invented this wheel
when there are several existing crawlers to choose from.  Your website says
the reason is that the UIs on existing crawlers (e.g. Nutch, Heritrix, ...)
weren't sufficiently user-friendly or had the site-specific configuration
you wanted.  Well if that is the case, why didn't you add/enhance such
capabilities for an existing crawler?

~ David Smiley

-
 Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/ANNOUNCE-Web-Crawler-tp2607831p2608956.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: indexing mysql dateTime/timestamp into solr date field

2011-03-01 Thread cyang2010
Bill,

I did try the approach you suggested above.  Unfortunately it does not
work either.

It is pretty much the same as my last reply, except with
dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss"

Thanks,

cyang

--
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-mysql-dateTime-timestamp-into-solr-date-field-tp2608327p2609053.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Searching all terms - SolrJ

2011-03-01 Thread openvictor Open
Great!

Thank you very much Chris, it will come in handy!

Best regards,
Victor

2011/3/1 Chris Hostetter hossman_luc...@fucit.org


 : Yes but I want to leave the choice to the user.
 :
 : He can either search all the terms or just some.
 :
 : Is there any more flexible solution ? Even if I have to code it by hand ?

 the declaration in the schema dictates the default.

 you can override the default at query time using the q.op param (ie:
 q.op=AND, q.op=OR) in the request.

 in SolrJ you would just call solrQuery.set(q.op,OR) on your SolrQuery
 object.

 -Hoss



how to debug dataimporthandler

2011-03-01 Thread cyang2010
I wonder how to run DataImportHandler in debug mode.  Currently I can't get
data into the index correctly through DataImportHandler, especially a
timestamp column mapped to a Solr date field.  I want to debug the process.

According to this wiki page:

Commands
The handler exposes all its API as HTTP requests. The following are the
possible operations:
* full-import: a full import operation can be started by hitting the URL
  http://:/solr/dataimport?command=full-import
...
  * clean: (default 'true'). Tells whether to clean up the index before the
    indexing is started.
  * commit: (default 'true'). Tells whether to commit after the operation.
  * optimize: (default 'true'). Tells whether to optimize after the operation.
  * debug: (default false). Runs in debug mode. It is used by the interactive
    development mode (see here).
  * Please note that in debug mode, documents are never committed
    automatically. If you want to run debug mode and commit the results too,
    add 'commit=true' as a request parameter.


Therefore, I run:

http://:/solr/dataimport?command=full-import&debug=true

Not only did I not see any DEBUG-level log output, it also crashed my machine
a few times.   I was surprised it could even do that ...

Has anyone tried to debug the process before?  What was your experience
with it?
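
As a side note, the command and the debug flag have to be joined with '&' in a single query string; a small sketch of building the request URL (the host and port are assumptions, and per the wiki text above, debug mode needs commit=true to actually commit):

```python
from urllib.parse import urlencode

# Hypothetical host/port; adjust to your deployment.
base = "http://localhost:8983/solr/dataimport"

# Parameters are joined with '&'; debug runs skip the commit unless
# commit=true is passed explicitly.
params = {"command": "full-import", "debug": "true", "commit": "true"}
url = base + "?" + urlencode(params)
print(url)
```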



--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-debug-dataimporthandler-tp2611506p2611506.html
Sent from the Solr - User mailing list archive at Nabble.com.


MorelikeThis not working with Shards(distributed Search)

2011-03-01 Thread Isha Garg



Hi,

  I am experimenting with *MoreLikeThis* to see if it also works
with *distributed* search, but I have not found a solution yet. Can anyone
help me with this? Please provide a detailed description, as I did not
get it working by updating
MoreLikeThisComponent.java, MoreLikeThisHandler.java, and ShardRequest.java
as specified in the AlternateDistributedMLT.patch.

Thanks in advance..
Isha Garg