Re: Suggest component

2011-03-29 Thread Grijesh
have you checked with q=*:*?
You mentioned buildOnCommit=true in your config.
So have you checked that your indexing process ends with a commit?
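A minimal sketch of the kind of setup being discussed (component and field names here are hypothetical, not taken from the original poster's config):

```xml
<!-- suggest component built on top of the spellcheck machinery (Solr 3.x era) -->
<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="field">suggest_field</str>
    <!-- the dictionary is only rebuilt when a commit actually happens -->
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```

If the indexing run never issues a commit, buildOnCommit never fires, which is exactly the failure mode Grijesh is asking about.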

-
Thanx: 
Grijesh 
www.gettinhahead.co.in 
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Suggest-component-tp2725438p2747100.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr result problem

2011-03-29 Thread Grijesh
Try LucidImagination's KStemmer

-
Thanx: 
Grijesh 
www.gettinhahead.co.in 
--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-result-problem-tp2746849p2747106.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Highlighting Problem

2011-03-29 Thread Pierre GOSSE
Looks like special chars are filtered out at index time and not replaced by spaces, 
which would keep the correct offsets of terms. Can you paste here the definition of 
the fieldtype in your schema.xml?


Pierre
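One way to get the offsets to line up, along the lines Pierre describes, is to strip the characters before tokenization with a charFilter (which records offset corrections) instead of a token filter (which does not). A sketch, assuming a pattern like [\[\]\(\)]; the field type name is hypothetical:

```xml
<fieldType name="text_ngram_hl" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- charFilters run before the tokenizer and keep an offset-correction
         map, so highlight positions stay aligned with the stored text -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[\[\]\(\)]" replacement=""/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
  </analyzer>
</fieldType>
```

Pierre's alternative, replacing the characters with spaces rather than deleting them, also preserves offsets, since the text length does not change.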

-Original Message-
From: pottw...@freenet.de [mailto:pottw...@freenet.de] 
Sent: Monday, March 28, 2011 11:16
To: solr-user@lucene.apache.org
Subject: Highlighting Problem

dear solr specialists,

my data looks like this:

j]s(dh)fjk [hf]sjkadh asdj(kfh) [skdjfh aslkfjhalwe uigfrhj bsd bsdfga sjfg 
asdlfj.

if I want to query for the first word, the following queries must match:

j]s(dh)fjk
j]s(dhfjk
j]sdhfjk
jsdhfjk
dhf

So the matching should ignore some characters like ( ) [ ] and should match 
substrings.

So far I have the following field definition in the schema.xml:

    <fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="[\[\]\(\)]" replacement="" replace="all"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
        <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="[\[\]\(\)]" replacement="" replace="all"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
        <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
      </analyzer>
    </fieldType>


With this definition the matching works as planned, but not for highlighting: 
there the special characters seem to move the <em> tags to wrong positions. For 
example, searching for jsdhfjk misses the last 3 letters of the word (= 3 
special characters from PatternReplaceFilterFactory):

<em>j]s(dh)</em>fjk

Solr has so many bells and whistles - what must I do to get a correctly working 
highlighting?

kind regards,
F.




Re: Fields not being indexed?

2011-03-29 Thread Stefan Matheis
Charles,

On Tue, Mar 29, 2011 at 3:32 AM, Charles Wardell
charles.ward...@bcsolution.com wrote:
 <add>
 <doc>
 <field name="guid">http://twitter.com/AshleyxArsenic/statuses/52164920388763648</field>
 <![CDATA[<field name="title">@Richard_Colo I realy don't think its for 
 awhile. Ill study when it comes up</field>]]>
 <![CDATA[<field name="authorName">AshleyxArsenic (Ashley Hoffman)</field>]]>

did you see the difference between the first and the following two
definitions? Actually it's still well-formed (they are treated as
text nodes, as a colleague told me) - but they are not used from
the Solr perspective.

do it like this, should work:
<field name="title"><![CDATA[@Richard_Colo I realy don't think its for
awhile. Ill study when it comes up]]></field>

and have a look on http://en.wikipedia.org/wiki/CDATA
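Putting Stefan's correction together, a well-formed version of the whole update message might look like this (field values copied from the fragment quoted above):

```xml
<add>
  <doc>
    <field name="guid">http://twitter.com/AshleyxArsenic/statuses/52164920388763648</field>
    <!-- CDATA goes inside the element, wrapping only the text value -->
    <field name="title"><![CDATA[@Richard_Colo I realy don't think its for awhile. Ill study when it comes up]]></field>
    <field name="authorName"><![CDATA[AshleyxArsenic (Ashley Hoffman)]]></field>
  </doc>
</add>
```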

Regards
Stefan


Error while performing facet search across shards..

2011-03-29 Thread rajini maski
 An error occurs while performing a facet search across shards. The following is
the query:

http://localhost:8090/InstantOne/select?indent=on
&shards=localhost:8090/InstantOne,localhost:8091/InstantTwo,localhost:8093/InstantThree
&q=filenumber:10&facet=on&facet.field=studyId

No studyId fields are blank across any shards.  I have apache solr 1.4.1
version set up for this.

Error is :  common.SolrException log SEVERE: java.lang.NullPointerException
at
org.apache.solr.handler.component.FacetComponent.refineFacets(FacetComponent.java:331)
at org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:232)

What might be the reason for this? Any particular configuration or set up
needed to be done?

Awaiting reply.
Rajani


Highlighting problem

2011-03-29 Thread Stefan Mueller
dear solr users,

my data looks like this:

j]s(dh)fjk [hf]sjkadh asdj(kfh) [skdjfh aslkfjhalwe uigfrhj bsd bsdfga sjfg 
asdlfj.

if I want to query for the first word, the following queries must match:

j]s(dh)fjk
j]s(dhfjk
j]sdhfjk
jsdhfjk
dhf

So the matching should ignore some characters like ( ) [ ] and should match 
substrings.

So far I have the following field definition in the schema.xml:

    <fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="[\[\]\(\)]" replacement="" replace="all"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
        <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="[\[\]\(\)]" replacement="" replace="all"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
        <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
      </analyzer>
    </fieldType>


With this definition the matching works as planned, but not for highlighting: 
there the special characters seem to move the <em> tags to wrong positions. For 
example, searching for jsdhfjk misses the last 3 letters of the word (= 3 
special characters from PatternReplaceFilterFactory):

<em>j]s(dh)</em>fjk

Solr has so many bells and whistles - what must I do to get a correctly working 
highlighting?

kind regards,
Stefan





Re: copyField at search time / multi-language support

2011-03-29 Thread lboutros
Tom,

to solve this kind of problem, if I understand it well, you could extend the
query parser to support something like meta-fields. I'm currently developing
a QueryParser Plugin to support a specific syntax. The support of
meta-fields to search on different fields (multiple languages) is one of the
functionalities that this parser will contain.

Ludovic.

2011/3/29 Markus Jelsma-2 [via Lucene] 
ml-node+2747011-315348515-383...@n3.nabble.com

 I haven't tried this as an UpdateProcessor but it relies on Tika and that
 LanguageIdentifier works well, except for short texts.

  Thanks Markus.
 
  Do you know if this patch is good enough for production use? Thanks.
 
  Andy
 
  --- On Tue, 3/29/11, Markus Jelsma [hidden email] wrote:
   From: Markus Jelsma [hidden email]

   Subject: Re: copyField at search time / multi-language support
   To: [hidden email]
   Cc: Andy [hidden email]

   Date: Tuesday, March 29, 2011, 1:29 AM
   https://issues.apache.org/jira/browse/SOLR-1979
  
Tom,

Could you share the method you use to perform language
detection? Any open source tools that do that?

Thanks.

--- On Mon, 3/28/11, Tom Mortimer [hidden email] wrote:
 From: Tom Mortimer [hidden email]
 Subject: copyField at search time / multi-language support
 To: [hidden email]
 Date: Monday, March 28, 2011, 4:45 AM

 Hi,

 Here's my problem: I'm indexing a corpus with text in a
 variety of languages. I'm planning to detect these at index
 time and send the text to one of a suitably-configured field
 (e.g. mytext_de for German, mytext_cjk for
 Chinese/Japanese/Korean etc.)

 At search time I want to search all of these fields.
 However, there will be at least 12 of them, which could lead
 to a very long query string. (Also I need to use the
 standard query parser rather than dismax, for full query
 syntax.)

 Therefore I was wondering if there was a way to copy fields
 at search time, so I can have my mytext query in a single
 field and have it copied to mytext_de, mytext_cjk etc.
 Something like:

   <copyQueryField source="mytext" dest="mytext_de" />
   <copyQueryField source="mytext" dest="mytext_cjk" />
   ...

 If this is not currently possible, could someone give me
 some pointers for hacking Solr to support it? Should I
 subclass solr.SearchHandler? I know nothing about Solr
 internals at the moment...

 thanks,
 Tom



-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/copyField-at-search-time-multi-language-support-tp2746017p2747386.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Highlighting problem

2011-03-29 Thread Stefan Matheis
Stefan,

this is a duplicate post of
http://lucene.472066.n3.nabble.com/Highlighting-Problem-td2746022.html
no? If so, please stick with one of them.

Regards
Stefan

On Tue, Mar 29, 2011 at 10:30 AM, Stefan Mueller solru...@yahoo.com wrote:
 dear solr users,

 my data looks like this:

 j]s(dh)fjk [hf]sjkadh asdj(kfh) [skdjfh aslkfjhalwe uigfrhj bsd bsdfga sjfg 
 asdlfj.

 if I want to query for the first word, the following queries must match:

 j]s(dh)fjk
 j]s(dhfjk
 j]sdhfjk
 jsdhfjk
 dhf

 So the matching should ignore some characters like ( ) [ ] and should match 
 substrings.

 So far I have the following field definition in the schema.xml:

     <fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
       <analyzer type="index">
         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
         <filter class="solr.PatternReplaceFilterFactory" pattern="[\[\]\(\)]" replacement="" replace="all"/>
         <filter class="solr.LowerCaseFilterFactory"/>
         <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
         <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
       </analyzer>
       <analyzer type="query">
         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
         <filter class="solr.PatternReplaceFilterFactory" pattern="[\[\]\(\)]" replacement="" replace="all"/>
         <filter class="solr.LowerCaseFilterFactory"/>
         <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
         <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
       </analyzer>
     </fieldType>


 With this definition the matching works as planned, but not for highlighting: 
 there the special characters seem to move the <em> tags to wrong positions. 
 For example, searching for jsdhfjk misses the last 3 letters of the word (= 
 3 special characters from PatternReplaceFilterFactory):

 <em>j]s(dh)</em>fjk

 Solr has so many bells and whistles - what must I do to get a correctly 
 working highlighting?

 kind regards,
 Stefan






Re: RamBufferSize and AutoCommit

2011-03-29 Thread Isan Fulia
Hi Erick,
I'm actually getting an out-of-memory error.
As I told you earlier, my ramBufferSize is the default (32MB). What could be the
reasons for getting this error?
Can you please share your views.


On 28 March 2011 17:55, Erick Erickson erickerick...@gmail.com wrote:

 Also note that making RAMBufferSize too big isn't useful. Lucid
 recommends 128M as the point over which you hit diminishing
 returns. But unless you're having problems speed-wise with the
 default, why change it?

 And are you actually getting OOMs or is this a background question?

 Best
 Erick

 On Mon, Mar 28, 2011 at 6:23 AM, Li Li fancye...@gmail.com wrote:
  there are 3 conditions that will trigger an auto flushing in lucene
  1. size of index in ram is larger than ram buffer size
  2. documents in mamory is larger than the number set by
 setMaxBufferedDocs.
  3. deleted term number is larger than the ratio set by
  setMaxBufferedDeleteTerms.
 
  auto flushing by time interval is added by solr
 
  rambufferSize  will use estimated size and the real used memory may be
  larger than this value. So if  your Xmx is 2700m, setRAMBufferSizeMB.
  should set value less than it. if you setRAMBufferSizeMB to 2700m and
  the other 3 conditions are not
  triggered, I think it will hit OOM exception.
 
  2011/3/28 Isan Fulia isan.fu...@germinait.com:
  Hi all ,
 
  I would like to know is there any relation between autocommit and
  rambuffersize.
  My solr config does not  contain rambuffersize which mean its
  deault(32mb).Autocommit setting are after 500 docs or 80 sec
  whichever is first.
  Solr starts with Xmx 2700M .Total Ram is 4 GB.
  Does the rambufferSize is alloted outside the heap memory(2700M)?
  How does rambuffersize is related to out of memory errors.
  What is the optimal value for rambuffersize.
 
  --
  Thanks & Regards,
  Isan Fulia.
 
 




-- 
Thanks & Regards,
Isan Fulia.
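For reference, both settings discussed in this thread live in solrconfig.xml; a sketch with the values mentioned in the thread (32MB default buffer, autocommit after 500 docs or 80 seconds):

```xml
<indexDefaults>
  <!-- flush to disk once the in-memory index exceeds this size -->
  <ramBufferSizeMB>32</ramBufferSizeMB>
</indexDefaults>

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>500</maxDocs>      <!-- commit after this many docs -->
    <maxTime>80000</maxTime>    <!-- ...or after this many milliseconds -->
  </autoCommit>
</updateHandler>
```

Note that the RAM buffer is allocated inside the JVM heap (Xmx), not outside it, which is why sizing it close to Xmx risks an OutOfMemoryError.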


Re: Cant retrieve data

2011-03-29 Thread Erick Erickson
Your documents aren't getting in your index. Did you follow up on Gora's
comment that you weren't selecting an ID? IDs are NOT generated by Solr,
you need to supply them as part of your document.


Second, look in your Solr logs, or just look at the screen where you started
Solr when you index. I bet you'll see a bunch of errors. Like 400K of them
telling you that your document isn't correct, but that's only a guess.

Third, there is a little-known DIH debugging console,
...solr/admin/dataimport.jsp
that might help.

I'd guess, but it's only a guess since we haven't seen your schema
file, that you
used the example and left either required=true for the id field and/or
defined a uniqueKey field and you aren't passing that through to
Solr...

Best
Erick
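A sketch of the schema.xml pieces Erick is referring to; the field name "id" is the example schema's default and may differ in your setup:

```xml
<fields>
  <!-- required="true": documents missing this field are rejected -->
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
</fields>

<!-- the uniqueKey must be supplied by you; Solr does not generate it -->
<uniqueKey>id</uniqueKey>
```

With DIH, that usually means the entity's query must select a column mapped to this field.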

On Mon, Mar 28, 2011 at 8:52 AM, Walter Andreas Pucko
a...@globosapiens.net wrote:


 On Mon, 28 Mar 2011 13:12 +0100, Upayavira u...@odoko.co.uk wrote:
 What query are you doing?


 /solr/select/?q=welpe%0D%0A&version=2.2&start=0&rows=10&indent=on

 Try q=*:*


 returns:
 <response>
 <lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">5</int>
 <lst name="params">
 <str name="q">*:*</str>
 </lst>
 </lst>
 <result name="response" numFound="0" start="0"/>
 </response>


 Also, what does /solr/admin/stats.jsp report for number of docs?

 That's a good question. Core states: 0, however rows fetched is about
 400K?!
 Isn't that the same? If no, what must I do to get the documents from the
 rows?

 Here are the stats:

 Solr Statistics: (example)
 snake.fritz.box
 Category
[Core] [Cache] [Query] [Update] [Highlighting] [Other]
Current Time: Mon Mar 28 14:46:41 CEST 2011
Server Start Time: Mon Mar 28 12:47:27 CEST 2011

 Core

 name:   core
 class:
 version:1.0
 description:SolrCore
 stats:  coreName :
 startTime : Mon Mar 28 12:47:27 CEST 2011
 refCount : 2
 aliases : []

 name:   searcher
 class:  org.apache.solr.search.SolrIndexSearcher
 version:1.0
 description:index searcher
 stats:  searcherName : Searcher@1cf662f main
 caching : true
 numDocs : 0
 maxDoc : 0
 reader :
 SolrIndexReader{this=1d8f162,r=ReadOnlyDirectoryReader@1d8f162,refCnt=1,segments=0}
 readerDir :
 org.apache.lucene.store.NIOFSDirectory@/home/andy/sw/apache-solr-1.4.1/example/solr/data/index
 indexVersion : 1301253679802
 openedAt : Mon Mar 28 12:48:00 CEST 2011
 registeredAt : Mon Mar 28 12:48:00 CEST 2011
 warmupTime : 3

 name:   Searcher@1cf662f main
 class:  org.apache.solr.search.SolrIndexSearcher
 version:1.0
 description:index searcher
 stats:  searcherName : Searcher@1cf662f main
 caching : true
 numDocs : 0
 maxDoc : 0
 reader :
 SolrIndexReader{this=1d8f162,r=ReadOnlyDirectoryReader@1d8f162,refCnt=1,segments=0}
 readerDir :
 org.apache.lucene.store.NIOFSDirectory@/home/andy/sw/apache-solr-1.4.1/example/solr/data/index
 indexVersion : 1301253679802
 openedAt : Mon Mar 28 12:48:00 CEST 2011
 registeredAt : Mon Mar 28 12:48:00 CEST 2011
 warmupTime : 3


 Query Handlers

 name:   /admin/properties
 class:  org.apache.solr.handler.admin.PropertiesRequestHandler
 version:$Revision: 790580 $
 description:Get System Properties
 stats:  handlerStart : 1301309248506
 requests : 0
 errors : 0
 timeouts : 0
 totalTime : 0
 avgTimePerRequest : NaN
 avgRequestsPerSecond : 0.0

 name:   /update/csv
 class:  Lazy[solr.CSVRequestHandler]
 version:$Revision: 817165 $
 description:Lazy[solr.CSVRequestHandler]
 stats:  note : not initialized yet

 name:   /admin/file
 class:  org.apache.solr.handler.admin.ShowFileRequestHandler
 version:$Revision: 790580 $
 description:Admin Get File -- view config files directly
 stats:  handlerStart : 1301309248509
 requests : 0
 errors : 0
 timeouts : 0
 totalTime : 0
 avgTimePerRequest : NaN
 avgRequestsPerSecond : 0.0

 name:   org.apache.solr.handler.dataimport.DataImportHandler
 class:  org.apache.solr.handler.dataimport.DataImportHandler
 version:1.0
 description:Manage data import from databases to Solr
 stats:  Status : IDLE
 Documents Processed : 0
 Requests made to DataSource : 1
 Rows Fetched : 404575
 Documents Deleted : 0
 Documents Skipped : 0
 Total Documents Processed : 0
 Total Requests made to DataSource : 2
 Total Rows Fetched : 809150
 Total Documents Deleted : 0
 Total Documents Skipped : 0
 handlerStart : 1301309248131
 requests : 4
 errors : 0
 timeouts : 0
 totalTime : 24
 avgTimePerRequest : 6.0
 avgRequestsPerSecond : 5.591581E-4

 name:   org.apache.solr.handler.DumpRequestHandler
 class:  org.apache.solr.handler.DumpRequestHandler
 version:$Revision: 954340 $
 description:Dump handler (debug)
 stats:  handlerStart : 1301309248115
 requests : 0
 errors : 0
 timeouts : 0
 totalTime : 0
 avgTimePerRequest : NaN
 avgRequestsPerSecond : 0.0

 name:   /admin/threads
 class:  org.apache.solr.handler.admin.ThreadDumpHandler
 version:$Revision: 790580 $
 description:Thread Dump
 stats:  handlerStart : 1301309248505
 requests : 0
 

Re: Solrcore.properties

2011-03-29 Thread Ezequiel Calderara
Hi Jayendra, this is the content of the files:
In the Master:
 + SolrConfig.xml : http://pastebin.com/JhvwMTdd
In the Slave:
 + solrconfig.xml: http://pastebin.com/XPuwAkmW
 + solrcore.properties: http://pastebin.com/6HZhQG8z

I don't know which other files do you need or could be involved in this.

I checked the home environment key in the tomcat instance and its ok too.

Any light on this would be appreciated!


On Mon, Mar 28, 2011 at 6:26 PM, Jayendra Patil 
jayendra.patil@gmail.com wrote:

 Can you please attach the other files.
 It doesn't seem to find the enable.master property, so you may want to
 check the properties file exists on the box having issues

 We have the following configuration in the core :-

Core -
 - solrconfig.xml - Master & Slave
 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="master">
     <str name="enable">${enable.master:false}</str>
     <str name="replicateAfter">commit</str>
     <str name="confFiles">solrcore_slave.properties:solrcore.properties,solrconfig.xml,schema.xml</str>
   </lst>
   <lst name="slave">
     <str name="enable">${enable.slave:false}</str>
     <str name="masterUrl">http://master_host:port/solr/corename/replication</str>
   </lst>
 </requestHandler>

- solrcore.properties - Master
enable.master=true
enable.slave=false

- solrcore_slave.properties - Slave
enable.master=false
enable.slave=true

 We have the default values and separate properties file for Master and
 slave.
 Replication is enabled for the solrcore.properties file.

 Regards,
 Jayendra

 On Mon, Mar 28, 2011 at 2:06 PM, Ezequiel Calderara ezech...@gmail.com
 wrote:
  Hi all, i'm having problems when deploying solr in the production
 machines.
 
  I have a master solr, and 3 slaves.
  The master replicates the schema and the solrconfig for the slaves (this
  file in the master is named like solrconfig_slave.xml).
  The solrconfig of the slaves has for example the ${data.dir} and other
  values in the solrtcore.properties
 
  I think that solr isn't recognizing that file, because i get this error:
 
  HTTP Status 500 - Severe errors in solr configuration. Check your log
  files for more detailed information on what may be wrong. If you want
 solr
  to continue after configuration errors, change:
  <abortOnConfigurationError>false</abortOnConfigurationError> in null
  -
  org.apache.solr.common.SolrException: No system property or default
 value
  specified for enable.master at
  org.apache.solr.common.util.DOMUtil.substituteProperty(DOMUtil.java:311)
  ... MORE STACK TRACE INFO...
 
  But here is the thing:
  org.apache.solr.common.SolrException: No system property or default value
  specified for enable.master
 
  I'm attaching the master schema, the master solr config, the solr config
 of
  the slaves and the solrcore.properties.
 
  If anyone has any info on this i would be more than appreciated!...
 
  Thanks
 
 
  --
  __
  Ezequiel.
 
  Http://www.ironicnet.com
 




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: copyField at search time / multi-language support

2011-03-29 Thread Erick Erickson
This may not be all that helpful, but have you looked at edismax?
https://issues.apache.org/jira/browse/SOLR-1553

It allows the full Solr query syntax while preserving the goodness of
dismax.

This is standard equipment on 3.1, which is being released even as we
speak, and I also know it's being used in production situations.

If going to 3.1 is not an option, I know people have applied that patch
to 1.4.1, but haven't done it myself.

Best
Erick
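Since the long qf list is the only awkward part, it can be generated client-side rather than maintained by hand. A sketch in Python (the mytext_* field names follow Tom's convention; the query value is a placeholder):

```python
from urllib.parse import urlencode

# one suitably-analyzed field per language detected at index time
langs = ["de", "cjk", "fr", "es"]

params = {
    "defType": "edismax",  # full query syntax plus dismax-style qf
    "q": "some user query",
    # search every language field without writing the list by hand
    "qf": " ".join(f"mytext_{lang}" for lang in langs),
}

print(urlencode(params))
```

The resulting query string is appended to /select on the Solr server.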

On Mon, Mar 28, 2011 at 4:45 AM, Tom Mortimer t...@flax.co.uk wrote:
 Hi,

 Here's my problem: I'm indexing a corpus with text in a variety of
 languages. I'm planning to detect these at index time and send the
 text to one of a suitably-configured field (e.g. mytext_de for
 German, mytext_cjk for Chinese/Japanese/Korean etc.)

 At search time I want to search all of these fields. However, there
 will be at least 12 of them, which could lead to a very long query
 string. (Also I need to use the standard query parser rather than
 dismax, for full query syntax.)

 Therefore I was wondering if there was a way to copy fields at search
 time, so I can have my mytext query in a single field and have it
 copied to mytext_de, mytext_cjk etc. Something like:

    <copyQueryField source="mytext" dest="mytext_de" />
    <copyQueryField source="mytext" dest="mytext_cjk" />
   ...

 If this is not currently possible, could someone give me some pointers
 for hacking Solr to support it? Should I subclass solr.SearchHandler?
 I know nothing about Solr internals at the moment...

 thanks,
 Tom



Re: Error while performing facet search across shards..

2011-03-29 Thread Yonik Seeley
On Tue, Mar 29, 2011 at 3:55 AM, rajini maski rajinima...@gmail.com wrote:
             An error while performing facet across shards..The following is
 the query:

 http://localhost:8090/InstantOne/select?indent=on
 &shards=localhost:8090/InstantOne,localhost:8091/InstantTwo,localhost:8093/InstantThree
 &q=filenumber:10&facet=on&facet.field=studyId

 No studyId fields are blank across any shards.  I have apache solr 1.4.1
 version set up for this.

 Error is :  common.SolrException log SEVERE: java.lang.NullPointerException
 at
 org.apache.solr.handler.component.FacetComponent.refineFacets(FacetComponent.java:331)
 at org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:232)

 What might be the reason for this? Any particular configuration or set up
 needed to be done?

This bug has been fixed in 3.1 and trunk.
This can happen when there was some sort of exception during the sub-request.
Go look at the logs on each server (or execute the request on each
server w/o the shards param) to see what the exception is.

-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco


how to start GarbageCollector

2011-03-29 Thread stockii
Hello,

my problem is that after a full-import Solr has reserved all of my RAM, and my
delta-imports need about 1 hour for fewer than 5000 small documents.

How can I start the GarbageCollector to get the RAM back ? 

-
--- System 

One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents, other Cores < 100.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx
--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-start-GarbageCollector-tp2748080p2748080.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: how to start GarbageCollector

2011-03-29 Thread Erick Erickson
I doubt this is your issue, the garbage collector will run automatically
at need.

What happens if you do a full import, stop and restart your Solr server
and then try the delta? If the delta takes an hour then it has nothing to
do with garbage collection.

What are you importing from? I'd suspect that your problem is more
along the lines that your extraction process somehow takes a long
time to just extract the data and Solr is a red herring...

Best
Erick

On Tue, Mar 29, 2011 at 8:22 AM, stockii stock.jo...@googlemail.com wrote:
 Hello,

 my problem is, that after a full-import solr reserved all of my RAM and my
 delta-imports need about 1 hour for less than 5000 small documents.

 How can i start GarbageCollector to get the RAM back ?

 -
 --- System 
 

 One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
 1 Core with 31 Million Documents, other Cores < 100.000

 - Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
 - Solr2 for Update-Request  - delta every Minute - 4GB Xmx
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/how-to-start-GarbageCollector-tp2748080p2748080.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: how to start GarbageCollector

2011-03-29 Thread Markus Jelsma
I seriously doubt heap usage is actually your problem. Usually a garbage 
collector is running; if it (somehow) doesn't, you will definitely run out of 
memory at some point. Where did you check memory usage?



 Hello,
 
 my problem is, that after a full-import solr reserved all of my RAM and my
 delta-imports need about 1 hour for less than 5000 small documents.
 
 How can i start GarbageCollector to get the RAM back ?
 
 -
 --- System
 
 
 One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
 1 Core with 31 Million Documents, other Cores < 100.000
 
 - Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
 - Solr2 for Update-Request  - delta every Minute - 4GB Xmx
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/how-to-start-GarbageCollector-tp2748080
 p2748080.html Sent from the Solr - User mailing list archive at Nabble.com.


Re: how to start GarbageCollector

2011-03-29 Thread stockii
I run a full-import via DIH, 35 million documents; I don't restart Solr. My
cronjob automatically starts a delta. If I restart Solr, the delta completes in
~10 seconds ...


free -m and top show me how much RAM is being used. The server is
only for Solr, so no other processes are using my RAM.

-
--- System 

One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents, other Cores < 100.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx
--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-start-GarbageCollector-tp2748080p2748134.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solrcore.properties

2011-03-29 Thread Ezequiel Calderara
I think that i found the problem:
The contents of the solrcore.properties were:

 #solrcore.properties
 data.dir=D:\Solr\data\solr\
 enable.master=false
 enable.slave=true
 masterUrl=http://url:8787/solr/
 pollInterval=00:00:60

I found a folder in D:\ called: SolrDatasolrenable.master=false
So I researched a little, tested a little more, and found that I have to
escape the backslashes in data.dir like this:

 #solrcore.properties
 data.dir=D:\\Solr\\data\\solr\\
 enable.master=false
 enable.slave=true
 masterUrl=http://url:8787/solr/
 pollInterval=00:00:60

And Problem solved, for now at least :P
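An alternative to doubling the backslashes: the Java Properties parser treats a backslash as an escape character, but the JVM accepts forward slashes in Windows paths, so a variant like this (a sketch of the same file) avoids the problem entirely:

```properties
#solrcore.properties
data.dir=D:/Solr/data/solr/
enable.master=false
enable.slave=true
masterUrl=http://url:8787/solr/
pollInterval=00:00:60
```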

On Tue, Mar 29, 2011 at 8:37 AM, Ezequiel Calderara ezech...@gmail.comwrote:

 Hi Jayendra, this is the content of the files:
 In the Master:
  + SolrConfig.xml : http://pastebin.com/JhvwMTdd
 In the Slave:
  + solrconfig.xml: http://pastebin.com/XPuwAkmW
  + solrcore.properties: http://pastebin.com/6HZhQG8z

 I don't know which other files do you need or could be involved in this.

 I checked the home environment key in the tomcat instance and its ok too.

 Any light on this would be appreciated!


 On Mon, Mar 28, 2011 at 6:26 PM, Jayendra Patil 
 jayendra.patil@gmail.com wrote:

 Can you please attach the other files.
 It doesn't seem to find the enable.master property, so you may want to
 check the properties file exists on the box having issues

 We have the following configuration in the core :-

Core -
 - solrconfig.xml - Master & Slave
 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="master">
     <str name="enable">${enable.master:false}</str>
     <str name="replicateAfter">commit</str>
     <str name="confFiles">solrcore_slave.properties:solrcore.properties,solrconfig.xml,schema.xml</str>
   </lst>
   <lst name="slave">
     <str name="enable">${enable.slave:false}</str>
     <str name="masterUrl">http://master_host:port/solr/corename/replication</str>
   </lst>
 </requestHandler>

- solrcore.properties - Master
enable.master=true
enable.slave=false

- solrcore_slave.properties - Slave
enable.master=false
enable.slave=true

 We have the default values and separate properties file for Master and
 slave.
 Replication is enabled for the solrcore.properties file.

 Regards,
 Jayendra

 On Mon, Mar 28, 2011 at 2:06 PM, Ezequiel Calderara ezech...@gmail.com
 wrote:
  Hi all, i'm having problems when deploying solr in the production
 machines.
 
  I have a master solr, and 3 slaves.
  The master replicates the schema and the solrconfig for the slaves (this
  file in the master is named like solrconfig_slave.xml).
  The solrconfig of the slaves has for example the ${data.dir} and other
  values in the solrtcore.properties
 
  I think that solr isn't recognizing that file, because i get this error:
 
  HTTP Status 500 - Severe errors in solr configuration. Check your log
  files for more detailed information on what may be wrong. If you want
 solr
  to continue after configuration errors, change:
  <abortOnConfigurationError>false</abortOnConfigurationError> in null
  -
  org.apache.solr.common.SolrException: No system property or default
 value
  specified for enable.master at
 
 org.apache.solr.common.util.DOMUtil.substituteProperty(DOMUtil.java:311)
  ... MORE STACK TRACE INFO...
 
  But here is the thing:
  org.apache.solr.common.SolrException: No system property or default
 value
  specified for enable.master
 
  I'm attaching the master schema, the master solr config, the solr config
 of
  the slaves and the solrcore.properties.
 
  If anyone has any info on this i would be more than appreciated!...
 
  Thanks
 
 
  --
  __
  Ezequiel.
 
  Http://www.ironicnet.com
 




 --
 __
 Ezequiel.

 Http://www.ironicnet.com




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: how to start GarbageCollector

2011-03-29 Thread Markus Jelsma
Unix tools won't show heap usage statistics. Please use tools that come with 
your JVM, such as jps, jtop and jstat, or set up monitoring over JMX to get a good 
picture.

All aside, RAM is most likely not your problem.

 i run a full-import via DIH, 35 million documents, and i don't restart solr. my
 cronjob automatically starts a delta. if i restart solr, the delta completes in ~10
 seconds ...
 
 
 free -m and top show me how much RAM is being used. the server is
 only for solr, so no other processes are using my RAM.
 
 -
 --- System
 
 
 One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
 1 Core with 31 Million Documents, other Cores < 100.000
 
 - Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
 - Solr2 for Update-Request  - delta every Minute - 4GB Xmx
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/how-to-start-GarbageCollector-tp2748080
 p2748134.html Sent from the Solr - User mailing list archive at Nabble.com.


RE: [WKT] Spatial Searching

2011-03-29 Thread Smiley, David W.
Thanks for the links Chris.  I think my approach in SOLR-2155 complies with 
those rules.  The only part where I'm unsure whether this applies is the rule 
regarding the default action of a build script:
YOU MUST NOT distribute build scripts or documentation within an Apache product 
with the purpose of causing the default/standard build of an Apache product to 
include any part of a prohibited work.

It's not saying specifically we can't compile against LGPL; it's ambiguously 
saying "include". I take that to mean the result of the build (e.g. class 
and jar files), which may not include LGPL.

RE SIS... I wonder how the expertise on that project compares to that of JTS's 
Martin Davis -- an expert, and the library has been in use for 10 years.

~ David

From: Mattmann, Chris A (388J) [chris.a.mattm...@jpl.nasa.gov]
Sent: Tuesday, March 29, 2011 1:00 AM
To: solr-user@lucene.apache.org
Cc: Adam Estrada
Subject: Re: [WKT] Spatial Searching

LGPL licenses and Apache aren't exactly compatible, see:

http://www.apache.org/legal/3party.html#transition-examples-lgpl
http://www.apache.org/legal/resolved.html#category-x

In practice, this was the reason we started the SIS project.

Cheers,
Chris

On Mar 28, 2011, at 11:16 AM, Smiley, David W. wrote:

 (This is one of those messages that I would have responded to at the time if 
 I only noticed it.)

 There is not yet indexing of arbitrary shapes (i.e. your data can only be 
 points), but with SOLR-2155 you can query via WKT thanks to JTS.  If you want 
 to index shapes then you'll have to wait a month or two for work that is 
 underway right now.  It's coming; be patient.

 I don't see the LGPL licensing as a problem; it's *L*GPL, not GPL, after all. 
 In the SOLR-2155 patch I take measures to download this library dynamically 
 at build time and compile against it.  JTS need not ship with Solr; the user 
 can get it themselves if they want this capability.  Non-JTS query shapes 
 should work without the presence of JTS.

 ~ David Smiley
 Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/

 On Feb 8, 2011, at 11:18 PM, Adam Estrada wrote:

 I just came across a ~nudge post over in the SIS list on what the status is 
 for that project. This got me looking more in to spatial mods with Solr4.0.  
 I found this enhancement in Jira. 
 https://issues.apache.org/jira/browse/SOLR-2155. In this issue, David 
 mentions that he's already integrated JTS in to Solr4.0 for querying on 
 polygons stored as WKT.

 It's relatively easy to get WKT strings in to Solr but does the Field type 
 exist yet? Is there a patch or something that I can test out?

 Here's how I would do it using GDAL/OGR and the already existing csv update 
 handler. http://www.gdal.org/ogr/drv_csv.html

 ogr2ogr -f CSV output.csv input.shp -lco GEOMETRY=AS_WKT
 This converts a shapefile to a CSV with the geometries intact in the form 
 of WKT. You can then get the data in to Solr by running the following 
 command.
 curl 'http://localhost:8983/solr/update/csv?commit=true&separator=%2C&fieldnames=id,attr1,attr2,attr3,geom&stream.file=C:\tmp\output.csv&overwrite=true&stream.contentType=text/plain;charset=utf-8'
 There are lots of flavors of geometries so I suspect that this will be a 
 daunting task but because JTS recognizes each geometry type it should be 
 possible to work with them.
 Does anyone know of a patch or even when this functionality might be 
 included in to Solr4.0? I need to query for polygons ;-)
 Thanks,
 Adam


++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++



2 index within the same Solr server ?

2011-03-29 Thread Amel Fraisse
Hello every body,

Is it possible to create 2 index within the same Solr server ?

Thank you.

Amel.


Re: 2 index within the same Solr server ?

2011-03-29 Thread Markus Jelsma
http://wiki.apache.org/solr/CoreAdmin
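A minimal multicore solr.xml sketch (core names and instanceDirs are illustrative; each core keeps its own conf/ and data/ under its instanceDir):

```xml
<!-- solr.xml in the Solr home directory -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="index1" instanceDir="index1" />
    <core name="index2" instanceDir="index2" />
  </cores>
</solr>
```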

 Hello every body,
 
 Is it possible to create 2 index within the same Solr server ?
 
 Thank you.
 
 Amel.


Re: [WKT] Spatial Searching

2011-03-29 Thread Ryan McKinley
 Does anyone know of a patch or even when this functionality might be included 
 in to Solr4.0? I need to query for polygons ;-)

check:
http://code.google.com/p/lucene-spatial-playground/

This is my sketch / soon-to-be-proposal for what I think lucene
spatial should look like.  It includes a WKTField that can do complex
geometry queries:

https://lucene-spatial-playground.googlecode.com/svn/trunk/spatial-lucene/src/main/java/org/apache/lucene/spatial/search/jts/


ryan


Re: Solrcore.properties

2011-03-29 Thread Ezequiel Calderara
Just for the record, in case anyone else is having trouble: the masterUrl should
be: http://url:port/solr/replication (don't forget the /replication/ part!)

On Tue, Mar 29, 2011 at 9:44 AM, Ezequiel Calderara ezech...@gmail.comwrote:

 I think that i found the problem:
 The contents of the solrcore.properties were:

 #solrcore.properties
 data.dir=D:\Solr\data\solr\

 enable.master=false
 enable.slave=true
 masterUrl=http://url:8787/solr/
 pollInterval=00:00:60

 I found a folder in the D:\ called: SolrDatasolrenable.master=false
 So i researched a little and tested a little more, and i found that i
 have to escape the data.dir like this:

 #solrcore.properties
 data.dir=D:\\Solr\\data\\solr\\

 enable.master=false
 enable.slave=true
 masterUrl=http://url:8787/solr/
 pollInterval=00:00:60

 And Problem solved, for now at least :P

 On Tue, Mar 29, 2011 at 8:37 AM, Ezequiel Calderara ezech...@gmail.comwrote:

 Hi Jayendra, this is the content of the files:
 In the Master:
  + SolrConfig.xml : http://pastebin.com/JhvwMTdd
 In the Slave:
  + solrconfig.xml: http://pastebin.com/XPuwAkmW
  + solrcore.properties: http://pastebin.com/6HZhQG8z

 I don't know which other files do you need or could be involved in this.

 I checked the home environment key in the tomcat instance and its ok too.

 Any light on this would be appreciated!


 On Mon, Mar 28, 2011 at 6:26 PM, Jayendra Patil 
 jayendra.patil@gmail.com wrote:

 Can you please attach the other files.
 It doesn't seem to find the enable.master property, so you may want to
 check the properties file exists on the box having issues

 We have the following configuration in the core :-

       Core -
       - solrconfig.xml - Master & Slave
       <requestHandler name="/replication" class="solr.ReplicationHandler">
         <lst name="master">
           <str name="enable">${enable.master:false}</str>
           <str name="replicateAfter">commit</str>
           <str name="confFiles">solrcore_slave.properties:solrcore.properties,solrconfig.xml,schema.xml</str>
         </lst>
         <lst name="slave">
           <str name="enable">${enable.slave:false}</str>
           <str name="masterUrl">http://master_host:port/solr/corename/replication</str>
         </lst>
       </requestHandler>

- solrcore.properties - Master
enable.master=true
enable.slave=false

- solrcore_slave.properties - Slave
enable.master=false
enable.slave=true

 We have the default values and separate properties file for Master and
 slave.
 Replication is enabled for the solrcore.proerties file.

 Regards,
 Jayendra

 On Mon, Mar 28, 2011 at 2:06 PM, Ezequiel Calderara ezech...@gmail.com
 wrote:
  Hi all, i'm having problems when deploying solr in the production
 machines.
 
  I have a master solr, and 3 slaves.
  The master replicates the schema and the solrconfig for the slaves
 (this
  file in the master is named like solrconfig_slave.xml).
  The solrconfig of the slaves has for example the ${data.dir} and other
   values in the solrcore.properties
 
  I think that solr isn't recognizing that file, because i get this
 error:
 
  HTTP Status 500 - Severe errors in solr configuration. Check your log
  files for more detailed information on what may be wrong. If you want
 solr
  to continue after configuration errors, change:
   <abortOnConfigurationError>false</abortOnConfigurationError> in null
  -
  org.apache.solr.common.SolrException: No system property or default
 value
  specified for enable.master at
 
 org.apache.solr.common.util.DOMUtil.substituteProperty(DOMUtil.java:311)
  ... MORE STACK TRACE INFO...
 
  But here is the thing:
  org.apache.solr.common.SolrException: No system property or default
 value
  specified for enable.master
 
  I'm attaching the master schema, the master solr config, the solr
 config of
  the slaves and the solrcore.properties.
 
  If anyone has any info on this i would be more than appreciated!...
 
  Thanks
 
 
  --
  __
  Ezequiel.
 
  Http://www.ironicnet.com
 




 --
 __
 Ezequiel.

 Http://www.ironicnet.com




 --
 __
 Ezequiel.

 Http://www.ironicnet.com




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: how to start GarbageCollector

2011-03-29 Thread stockii
okay, i installed a monitor, jconsole and jvisualvm. how can i see with
this where my problem is?

what data is needed? :/

-
--- System 

One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents, other Cores < 100.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx
--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-start-GarbageCollector-tp2748080p2748421.html
Sent from the Solr - User mailing list archive at Nabble.com.


DisMaxQueryParser: Unknown function min in FunctionQuery

2011-03-29 Thread Robert Gründler

Hi all,

i'm trying to implement a FunctionQuery using the bf parameter of the 
DisMaxQueryParser, however, i'm getting an exception:


Unknown function min in FunctionQuery('min(1,2)', pos=4)

The request that causes the error looks like this:

http://localhost:2345/solr/main/select?qt=dismax&qf=name^0.1&qf=name_exact^10.0&debugQuery=true&bf=min(1,2)&version=1.2&wt=json&json.nl=map&q=+foo&start=0&rows=3


I'm not sure where the pos=4 part of the FunctionQuery is coming from.

My Solr version is 1.4.1.

Has anyone a hint why i'm getting this error?


thanks!

-robert




[infomercial] Lucene Refcard at DZone

2011-03-29 Thread Erik Hatcher
I've written an "Understanding Lucene" refcard that has just been published at 
DZone.  See here for details:

   
http://www.lucidimagination.com/blog/2011/03/28/understanding-lucene-by-erik-hatcher-free-dzone-refcard-now-available/

If you're new to Lucene or Solr, this refcard will be a nice grounding in the 
fundamental concepts.  For you old timers, pass it on to your friends and 
coworkers :)

Erik



Long list of shards breaks solrj query

2011-03-29 Thread JohnRodey
So I have a simple class that builds a SolrQuery and sets the shards param. 
I have a really long list of shards, over 250.  My search seems to work
until I get my shard list up to a certain length. As soon as I add one more
shard I get:

org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: I/O exception (java.net.SocketException) caught when processing
request: Connection reset by peer: socket write error
org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: Retrying request

My class just looks like this:
public static void main(String[] args) {
    try {
        SolrServer s = new CommonsHttpSolrServer("http://mynode:8080/solr");
        SolrQuery q = new SolrQuery();
        q.setQuery("test");
        q.setHighlight(true);
        q.setRows(50);
        q.setStart(0);
        q.setParam("shards", "node1:1010/solr/core01,node1:1010/solr/core02,...");
        QueryResponse rsp = s.query(q);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

If I execute the same request in a browser it returns fine.

One other question I had was even if I set the version to 2.2 the response
has version=1.  Is that normal?  In a browser it returns version=2.2 though.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Long-list-of-shards-breaks-solrj-query-tp2748556p2748556.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: DisMaxQueryParser: Unknown function min in FunctionQuery

2011-03-29 Thread Erik Hatcher

On Mar 29, 2011, at 10:01 , Robert Gründler wrote:

 Hi all,
 
 i'm trying to implement a FunctionQuery using the bf parameter of the 
 DisMaxQueryParser, however, i'm getting an exception:
 
 Unknown function min in FunctionQuery('min(1,2)', pos=4)
 
 The request that causes the error looks like this:
 
 http://localhost:2345/solr/main/select?qt=dismax&qf=name^0.1&qf=name_exact^10.0&debugQuery=true&bf=min(1,2)&version=1.2&wt=json&json.nl=map&q=+foo&start=0&rows=3
 
 
 I'm not sure where the pos=4 part of the FunctionQuery is coming from.
 
 My Solr version is 1.4.1.
 
 Has anyone a hint why i'm getting this error?

From http://wiki.apache.org/solr/FunctionQuery#min - min() is 3.2 (though I 
think that really means 3.1 now, right??).  Definitely not in 1.4.1.

Erik



Re: Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

2011-03-29 Thread fr . jurain
Hi Solrists, 

thank you for your kind responses. 
Grant, François, I'll keep your advice in mind & your links in store; 
they may be useful in one of my use cases, 
even though I doubt they might in the primary one. 
As RDF literals, my documents are affixed with language tags @en, @fr, @ja, etc., 
as per ISO 639-1, so that language identification is straightforward. 
It's the discovery of the corresponding Analyzer subclasses & constructors
I'm trying to automate. 
 
Solr looks like it's up to the server admins to specify in XML 
what Analyzer subclasses they want in a given case, 
then it's up to Solr to instantiate those subclasses by Java reflection. 
I would like to spare myself the burden of writing & maintaining this XML. 
 
Rather, I'd use Java code to build the mapping 
by inventorying the classpath, with rules like 
on finding jarentry /whats/this/package/analysis/xx/WhatsThisAnalyzer.class, 
if class WhatsThisAnalyzer is a subclass of lucene.analysis.Analyzer, 
if reflection reveals a public new WhatsThisAnalyzer(lucene.util.Version), 
if instantiation succeeds,
then the instance is the presumptive default analyzer for ISO 639-1 code xx. 
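A plain-Java sketch of that per-class check (the class name and the generalization over the base type are mine; the real code would pass org.apache.lucene.analysis.Analyzer as the base and a lucene.util.Version as the constructor argument):

```java
import java.lang.reflect.Constructor;

public class AnalyzerDiscovery {
    // Generalized form of the rules above: load the candidate class, verify it
    // extends the required base type, look for a public one-argument
    // constructor matching the given argument, and instantiate it.
    public static <T> T tryInstantiate(Class<T> base, String className, Object ctorArg) {
        try {
            Class<?> c = Class.forName(className);
            if (!base.isAssignableFrom(c)) {
                return null; // not a subclass of the required base type
            }
            Constructor<?> ctor = c.getConstructor(ctorArg.getClass());
            return base.cast(ctor.newInstance(ctorArg));
        } catch (ReflectiveOperationException e) {
            return null; // missing class, no matching public ctor, or failed instantiation
        }
    }
}
```

Returning null on any failed check lets the scanner simply skip non-conforming classes and keep walking the jar entries.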
 
Might make a Lucene submission, more properly than a Solr one. 
 
Thanks again for your time & your help.
Best regards,
 François Jurain.

 Message du 25/03/11  à 23h06
 De : François Schiettecatte fschietteca...@gmail.com
 A : solr-user@lucene.apache.org
 Copie à : 
 Objet : Re: Wanted: a directory of quick-and-(not too)dirty analyzers for 
 multi-language RDF.
 
 
 François
 
 I think there is a language identification tool in the Nutch code base, 
 otherwise I have written one in Perl which could easily be translated to 
 Java. I wont have access to it for 10 days (I am traveling), but I am happy 
 to send you a link to it when I get back (and anyone else who wants it).
 
 Cheers
 
 François
 
 On Mar 25, 2011, at 11:59 AM, Grant Ingersoll wrote:
 
  You are looking for a language identification tool.  You could check 
  https://issues.apache.org/jira/browse/SOLR-1979 for the start of this.  
  Otherwise, you have to roll your own or buy a third party one.
  
  On Mar 24, 2011, at 12:24 PM, fr.jur...@voila.fr wrote:
  
  Hello Solrists,
  
  As it says in the subject line, I'm looking for a Java component that,
  given an ISO 639-1 code or some equivalent,
  would return a Lucene Analyzer ready to gobble documents in the 
  corresponding language.
  Solr looks like it has to contain one,
  only I've not been able to locate it so far; 
  can you point the spot?
  
  I've found org.apache.solr.analysis,
  and things like org.apache.lucene.analysis.bg etc. in lucene/modules,
  with many classes which I'm sure are related, however the factory itself 
  still eludes me;
  I mean the Java class.method that'd decide on request, what to do with all 
  these packages
  to bring the requisite object to existence, once the language is specified.
  Where should I look? Or was I mistaken & Solr has nothing of the kind, at 
  least in Java?
  Thanks in advance for your help.
  
  Best regards,
François Jurain.
  
  
  
  
  
  
  
  --
  Grant Ingersoll
  http://www.lucidimagination.com/
  
  Search the Lucene ecosystem docs using Solr/Lucene:
  http://www.lucidimagination.com/search
  
 
 









Conditional Scoring (was: Re: DisMaxQueryParser: Unknown function min in FunctionQuery)

2011-03-29 Thread Robert Gründler

sorry, didn't see that.


So, as the relevance functions are also only available in Solr >= 4.0 
(http://wiki.apache.org/solr/FunctionQuery#Relevance_Functions), i'm not
sure if i can solve our requirement in one query (i thought i could use 
a function query for this).


Here's our Problem:

We have 3 Fields:

1. exact_match ( text )
2. fuzzy_match ( text )
3. popularity ( integer )

Our requirement looks as follows:

All results which have a match in exact_match MUST score higher than 
results without a match in exact_match, regardless of the value in the 
popularity field. All results which have no match in exact_match 
should use the popularity field for scoring.


Is this possible without using a function query ?


thanks.


-robert





On 29.03.11 16:34, Erik Hatcher wrote:

On Mar 29, 2011, at 10:01 , Robert Gründler wrote:


Hi all,

i'm trying to implement a FunctionQuery using the bf parameter of the 
DisMaxQueryParser, however, i'm getting an exception:

Unknown function min in FunctionQuery('min(1,2)', pos=4)

The request that causes the error looks like this:

http://localhost:2345/solr/main/select?qt=dismax&qf=name^0.1&qf=name_exact^10.0&debugQuery=true&bf=min(1,2)&version=1.2&wt=json&json.nl=map&q=+foo&start=0&rows=3


I'm not sure where the pos=4 part of the FunctionQuery is coming from.

My Solr version is 1.4.1.

Has anyone a hint why i'm getting this error?

 From http://wiki.apache.org/solr/FunctionQuery#min - min() is 3.2 (though I 
think that really means 3.1 now, right??).  Definitely not in 1.4.1.

Erik





Re: [WKT] Spatial Searching

2011-03-29 Thread Walter Underwood
On Mar 29, 2011, at 8:12 AM, Mattmann, Chris A (388J) wrote:

 
 RE SIS... I wonder how the expertise on that project compares to that of 
 JTS's Martin Davis -- an expert, and the library has been in use for 10 
 years.
 
 Time will tell. I'd favor the Apache model where instead of name dropping we 
 rely on the collective expertise of a group of like-minded individuals.

Lucene was originally the work of one expert. That's not a bad way to start.

wunder
--
Walter Underwood





String field

2011-03-29 Thread Brian Lamb
Hi all,

I'm a little confused about the string field. I read somewhere that if I
want to do an exact match, I should use a string type. So I made a few
modifications to my schema file:

field name=id type=string indexed=true stored=true required=false
/
field name=common_names multiValued=true type=string indexed=true
stored=true required=false /
field name=genus type=string indexed=true stored=true
required=false /
field name=species type=string indexed=true stored=true
required=false /

And did a full import but when I do a search and return all fields, only id
is showing up. The only difference is that id is my primary key field so
that could be why it is showing up but why aren't the others showing up?

Thanks,

Brian Lamb


Why do .nfs* files still exist after opening a new searcher?

2011-03-29 Thread flin
I'm running a Solr (1.3) slave machine over NFS. 
The problem is, whenever I update the index directory from the newest
snapshot and try to use bin/commit or bin/readercycle to open a new
searcher, the temporary .nfs* files do not get cleared. I thought that the
new searcher should read the new segment files and release the old files.
How do I get rid of those .nfs* files? Any ideas?

Thanks.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Why-do-nfs-files-still-exist-after-opening-a-new-searcher-tp2749079p2749079.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: String field

2011-03-29 Thread Scott Gonyea
First, make sure your request handler is set to spit out everything.  I take
it you did, but I hate to assume.

Second, I suggest indexing your data twice.  One as tokenized-text, the
other as a string.  It'll save you from howling at the moon in anguish...
Unless you really only do care about pure, exact-matching.  IE, down to the
character-case.

Scott

On Tue, Mar 29, 2011 at 8:46 AM, Brian Lamb
brian.l...@journalexperts.comwrote:

 Hi all,

 I'm a little confused about the string field. I read somewhere that if I
  want to do an exact match, I should use a string type. So I made a few
 modifications to my schema file:

 field name=id type=string indexed=true stored=true
 required=false
 /
 field name=common_names multiValued=true type=string indexed=true
 stored=true required=false /
 field name=genus type=string indexed=true stored=true
 required=false /
 field name=species type=string indexed=true stored=true
 required=false /

 And did a full import but when I do a search and return all fields, only id
 is showing up. The only difference is that id is my primary key field so
 that could be why it is showing up but why aren't the others showing up?

 Thanks,

 Brian Lamb



FW: no results searching for stadium seating chairs

2011-03-29 Thread Robert Petersen
 

Very interestingly, LucidKStemFilterFactory is stemming ‘ing’s differently for 
different words.  The word ‘seating’ doesn't lose the 'ing' but the word 
‘counseling’ does!  Can anyone explain the difference here?  protwords.txt is 
empty btw.

 

com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory {protected=protwords.txt}

term position    | 1     | 2
term text        | privy | counsel
term type        | word  | word
source start,end | 0,5   | 6,16
payload          |       |

com.lucidimagination.solrworks.analysis.LucidKStemFilterFactory {protected=protwords.txt}

term position    | 1
term text        | seating
term type        | word
source start,end | 0,7

 



Re: FW: no results searching for stadium seating chairs

2011-03-29 Thread Yonik Seeley
On Tue, Mar 29, 2011 at 1:17 PM, Robert Petersen rober...@buy.com wrote:
 Very interestingly, LucidKStemFilterFactory is stemming ‘ing’s differently 
 for different words.  The word ‘seating’ doesn't lose the 'ing' but the word 
 ‘counseling’ does!  Can anyone explain the difference here?  protwords.txt is 
 empty btw.

KStem is dictionary driven, so seating is probably in the
dictionary.  I guess the author decided that seating and seat were
sufficiently different.


-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco


RE: FW: no results searching for stadium seating chairs

2011-03-29 Thread Robert Petersen
For retail product title search, would there be a better stemmer to use?  We 
wanted a less aggressive stemmer, but I would expect the term seating to stem.  
I have found several other words which end in ing and do not get stemmed.  
Amongst our product lines are four million books with all kinds of crazy 
titles, like the following oddity!  Here counseling stems and unknowing doesn't:

1. The Cloud of Unknowing and the Book of Privy Counseling 
Buy New: $29.95 $18.30
3 New and Used from $18.30


-Original Message-
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Tuesday, March 29, 2011 10:27 AM
To: solr-user@lucene.apache.org
Cc: Robert Petersen
Subject: Re: FW: no results searching for stadium seating chairs

On Tue, Mar 29, 2011 at 1:17 PM, Robert Petersen rober...@buy.com wrote:
 Very interestingly, LucidKStemFilterFactory is stemming 'ing's differently 
 for different words.  The word 'seating' doesn't lose the 'ing' but the word 
 'counseling' does!  Can anyone explain the difference here?  protwords.txt is 
 empty btw.

KStem is dictionary driven, so seating is probably in the
dictionary.  I guess the author decided that seating and seat were
sufficiently different.


-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco


Re: 2 index within the same Solr server ?

2011-03-29 Thread litan1...@gmail.com
Yes, you can use multicore to run a 2nd index alongside the 1st

Sent from my iPhone

On Mar 29, 2011, at 6:01, Amel Fraisse amel.frai...@gmail.com wrote:

 Hello every body,
 
 Is it possible to create 2 index within the same Solr server ?
 
 Thank you.
 
 Amel.



Re: FW: no results searching for stadium seating chairs

2011-03-29 Thread Jonathan Rochkind
It seems unlikely you are going to find something that stems everything 
exactly how you want it, and nothing how you don't want it. This is very 
domain dependent, as you've discovered. I doubt there's even such a 
thing as the way everyone doing a 'retail product title search' would 
want it; it's going to vary.


You could use the synonym feature to make your own stemming dictionary, 
tell it to stem seating to seat.


Of course, that's also very expensive in terms of your time, to create 
your own custom dictionary.  But you're going to have to live with one 
of the compromises; software can't do magic!


For particular titles, you could also, in your own metadata control, add 
alternate titles that you want it to match on, before it even gets 
indexed.


On 3/29/2011 1:43 PM, Robert Petersen wrote:

For retail product title search, would there be a better stemmer to use?  We 
wanted a less aggressive stemmer, but I would expect the term seating to stem.  
I have found several other words which end in ing and do not get stemmed.  
Amongst our product lines are four million books with all kinds of crazy 
titles, like the following oddity!  Here counseling stems and unknowing doesn't:

1. The Cloud of Unknowing and the Book of Privy Counseling
Buy New: $29.95 $18.30
3 New and Used from $18.30


-Original Message-
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Tuesday, March 29, 2011 10:27 AM
To: solr-user@lucene.apache.org
Cc: Robert Petersen
Subject: Re: FW: no results searching for stadium seating chairs

On Tue, Mar 29, 2011 at 1:17 PM, Robert Petersenrober...@buy.com  wrote:

Very interestingly, LucidKStemFilterFactory is stemming 'ing's differently for 
different words.  The word 'seating' doesn't lose the 'ing' but the word 
'counseling' does!  Can anyone explain the difference here?  protwords.txt is 
empty btw.

KStem is dictionary driven, so seating is probably in the
dictionary.  I guess the author decided that seating and seat were
sufficiently different.


-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco



Re: 2 index within the same Solr server ?

2011-03-29 Thread Rahul Warawdekar
Please refer
http://wiki.apache.org/solr/MultipleIndexes

On 3/29/11, Amel Fraisse amel.frai...@gmail.com wrote:
 Hello every body,

 Is it possible to create 2 index within the same Solr server ?

 Thank you.

 Amel.



-- 
Thanks and Regards
Rahul A. Warawdekar


Re: String field

2011-03-29 Thread Erick Erickson
try the schema browser from the admin page to be sure the fields
you *think* are in the index really are. Did you do a commit
after indexing? Did you re-index after the schema changes? Are
you 100% sure that, if you did re-index, the new fields were in the
docs submitted?

Best
Erick

On Tue, Mar 29, 2011 at 11:46 AM, Brian Lamb
brian.l...@journalexperts.com wrote:
 Hi all,

 I'm a little confused about the string field. I read somewhere that if I
  want to do an exact match, I should use a string type. So I made a few
 modifications to my schema file:

 field name=id type=string indexed=true stored=true required=false
 /
 field name=common_names multiValued=true type=string indexed=true
 stored=true required=false /
 field name=genus type=string indexed=true stored=true
 required=false /
 field name=species type=string indexed=true stored=true
 required=false /

 And did a full import but when I do a search and return all fields, only id
 is showing up. The only difference is that id is my primary key field so
 that could be why it is showing up but why aren't the others showing up?

 Thanks,

 Brian Lamb



Concatenate multivalued DIH fields

2011-03-29 Thread neha
I have two multivalued DIH fields fname and lname. I want to concatenate
each of the fname and lname pairs to get a third multivalued DIH field
name.

I tried this :



 
But the result is :  [Lars L., Helle K., Thomas A., Jes] [Thomsen, Iversen,
Brinck, Olesen],  instead of   Lars L. Thomsen, Helle K. Iverson, Thomas A
Brinck, Jes Oleson.

Is there a way to iterate through the multivalued fields or is there
something more simple to do this.

Thanks,
Neha



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Concatenate-multivalued-DIH-fields-tp2749988p2749988.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Concatenate multivalued DIH fields

2011-03-29 Thread Markus Jelsma
Haven't tried your use case but i believe DIH's ScriptTransformer can do the 
trick. It seems to operate on rows so you can fetch both fields and add a 
concatenated field.

http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer
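
Untested sketch of that idea (entity, column, and function names are taken from the post; it assumes fname/lname come through the row as parallel java.util.Lists for multivalued columns):

```xml
<dataConfig>
  <script><![CDATA[
    function concatNames(row) {
      var fnames = row.get('fname');
      var lnames = row.get('lname');
      var names = new java.util.ArrayList();
      for (var i = 0; i < fnames.size(); i++) {
        // pair up the i-th first and last name
        names.add(fnames.get(i) + ' ' + lnames.get(i));
      }
      row.put('name', names);
      return row;
    }
  ]]></script>
  <document>
    <entity name="person" transformer="script:concatNames" query="...">
      <field column="fname" />
      <field column="lname" />
      <field column="name" />
    </entity>
  </document>
</dataConfig>
```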

 I have two multivalued DIH fields fname and lname. I want to
 concatenate each of the fname and lname pairs to get a third multivalued
 DIH field name.
 
 I tried this :
 
 
 
 
 But the result is :  [Lars L., Helle K., Thomas A., Jes] [Thomsen, Iversen,
 Brinck, Olesen],  instead of   Lars L. Thomsen, Helle K. Iverson, Thomas A
 Brinck, Jes Oleson.
 
 Is there a way to iterate through the multivalued fields or is there
 something more simple to do this.
 
 Thanks,
 Neha
 
 
 
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Concatenate-multivalued-DIH-fields-tp27
 49988p2749988.html Sent from the Solr - User mailing list archive at
 Nabble.com.


Javabin-JSon

2011-03-29 Thread paulohess
Hi guys,

I have a Javabin object and I need to convert it to a JSON object. How?
Please help.
I am using solrj (client), which doesn't support JSON, so (wt=json) won't
convert it to JSON.

thanks
Paulo

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Javabin-JSon-tp2750066p2750066.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Javabin-JSon

2011-03-29 Thread Markus Jelsma
You've asked this twice now. This is a Java specific question and unless 
someone feels like answering i'd try googling somewhere else.

 Hi guys,
 
 I have a Javabin object  and I need to convert that to a JSon object. How ?
 pls help?
 I am using solrj (client) that doesn't support JSON so (wt=json) won't
 convert it to JSon.
 
 thanks
 Paulo
 
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Javabin-JSon-tp2750066p2750066.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Concatenate multivalued DIH fields

2011-03-29 Thread neha
Thank you for your reply, but is there more documentation on
ScriptTransformer? I am a newbie to Solr DIH. Can I send two rows as parameters to
the script transformer function? Also, what is the syntax to call the script
transformer in the DIH field? The documentation is not very clear about it.

Thanks,
Neha



DIH with XML question.

2011-03-29 Thread Marcelo Iturbe
Hello,
I have an XML with multiple nodes with the same name.

In the data-config.xml document, I have set up:

   <field column="email" xpath="/feed/entry/email/@address" />

But this only feeds the last email address into Solr (in the case below,
mv...@yahoo.org appears in Solr).

The XML for each entity is as follows:

<entry>
    <id>http://www.google.com/m8/feeds/contacts/me%40here.com/base/0</id>
    <updated>2011-03-25T06:34:29.714Z</updated>
    <category scheme='http://schemas.google.com/g/2005#kind' term='http://schemas.google.com/contact/2008#contact'/>
    <title type='text'>Marcelo Vera</title>
    <link rel='self' type='application/atom+xml' href='https://www.google.com/m8/feeds/contacts/me%40here.com/full/0'/>
    <link rel='edit' type='application/atom+xml' href='https://www.google.com/m8/feeds/contacts/me%40here.com/full/0/1301034869714001'/>
    <gd:email rel='http://schemas.google.com/g/2005#other' address='marc...@here.com' primary='true'/>
    <gd:email rel='http://schemas.google.com/g/2005#home' address='marcelo.v...@there.com'/>
    <gd:email rel='http://schemas.google.com/g/2005#other' address='mv...@yahoo.org'/>
</entry>

The gd:email node can repeat an indefinite number of times; how can I feed
each value of address into Solr?

Thanks


Re: String field

2011-03-29 Thread Brian Lamb
The full import wasn't spitting out any errors on the web page but in
looking at the logs, there were errors. Correcting those errors solved that
issue.

Thanks,

Brian Lamb

On Tue, Mar 29, 2011 at 2:44 PM, Erick Erickson erickerick...@gmail.com wrote:

 try the schema browser from the admin page to be sure the fields
 you *think* are in the index really are. Did you do a commit
 after indexing? Did you re-index after the schema changes? Are
 you 100% sure that, if you did re-index, the new fields were in the
 docs submitted?

 Best
 Erick

 On Tue, Mar 29, 2011 at 11:46 AM, Brian Lamb
 brian.l...@journalexperts.com wrote:
  Hi all,
 
  I'm a little confused about the string field. I read somewhere that if I
  want to do an exact match, I should use a string field. So I made a few
  modifications to my schema file:
 
  <field name="id" type="string" indexed="true" stored="true" required="false" />
  <field name="common_names" multiValued="true" type="string" indexed="true" stored="true" required="false" />
  <field name="genus" type="string" indexed="true" stored="true" required="false" />
  <field name="species" type="string" indexed="true" stored="true" required="false" />
 
  And did a full import but when I do a search and return all fields, only
 id
  is showing up. The only difference is that id is my primary key field so
  that could be why it is showing up but why aren't the others showing up?
 
  Thanks,
 
  Brian Lamb
 



Re: DIH with XML question.

2011-03-29 Thread neha
Make sure the field "email" is multiValued in the schema.xml file.

Neha 



Re: DIH with XML question.

2011-03-29 Thread Erick Erickson
Set the multiValued="true" attribute on the field definition. Note it
is case-sensitive.

See: http://wiki.apache.org/solr/SchemaXml

Best
Erick

On Tue, Mar 29, 2011 at 3:58 PM, Marcelo Iturbe marc...@santiago.cl wrote:
 Hello,
 I have an XML with multiple nodes with the same name.

 In the data-config.xml document, I have set up:

   <field column="email" xpath="/feed/entry/email/@address" />

 But this only feeds the last email address into Solr (in the case bellow,
 mv...@yahoo.org appears in Solr).

 The XML for each entity is as follows:

        <entry>
            <id>http://www.google.com/m8/feeds/contacts/me%40here.com/base/0</id>
            <updated>2011-03-25T06:34:29.714Z</updated>
            <category scheme='http://schemas.google.com/g/2005#kind' term='http://schemas.google.com/contact/2008#contact'/>
            <title type='text'>Marcelo Vera</title>
            <link rel='self' type='application/atom+xml' href='https://www.google.com/m8/feeds/contacts/me%40here.com/full/0'/>
            <link rel='edit' type='application/atom+xml' href='https://www.google.com/m8/feeds/contacts/me%40here.com/full/0/1301034869714001'/>
            <gd:email rel='http://schemas.google.com/g/2005#other' address='marc...@here.com' primary='true'/>
            <gd:email rel='http://schemas.google.com/g/2005#home' address='marcelo.v...@there.com'/>
            <gd:email rel='http://schemas.google.com/g/2005#other' address='mv...@yahoo.org'/>
        </entry>

 The gd:email node can repeat an indefinite number of times; how can I feed
 each value of address into Solr?

 Thanks
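Putting the replies in this thread together, the two pieces involved might look like the sketch below (untested; the entity name and source URL are assumptions, and it assumes XPathEntityProcessor matches on local element names so the gd: prefix is not needed in the xpath):

```xml
<!-- data-config.xml: one Solr document per /feed/entry; the xpath
     matches every repeating email node, not just the last one -->
<entity name="contacts" processor="XPathEntityProcessor"
        url="contacts.xml" forEach="/feed/entry">
  <field column="email" xpath="/feed/entry/email/@address" />
</entity>

<!-- schema.xml: the field must be multiValued to hold all the addresses -->
<field name="email" type="string" indexed="true" stored="true"
       multiValued="true" />
```

Without multiValued="true" on the schema field, each extracted value overwrites the previous one, which matches the "only the last address" symptom described above.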



Re: Javabin-JSon

2011-03-29 Thread paulohess
you are not helping 



RE: Javabin-JSon

2011-03-29 Thread Tim Gilbert
Markus is right, this isn't the list for Java questions, but you can
look into Jackson. Jackson is a Java data binder that can convert Java
POJOs into JSON.

http://jackson.codehaus.org/

I use it in Spring MVC to convert my output to json.

Tim

-Original Message-
From: paulohess [mailto:pauloh...@yahoo.com] 
Sent: Tuesday, March 29, 2011 3:16 PM
To: solr-user@lucene.apache.org
Subject: Javabin-JSon

Hi guys,

I have a Javabin object  and I need to convert that to a JSon object.
How ?
pls help?
I am using solrj (client) that doesn't support JSON so (wt=json) won't
convert it to JSon.

thanks
Paulo

--
View this message in context:
http://lucene.472066.n3.nabble.com/Javabin-JSon-tp2750066p2750066.html
Sent from the Solr - User mailing list archive at Nabble.com.
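If pulling in a binder like Jackson is not an option, the shape of the conversion can be sketched with a small hand-rolled serializer. JsonSketch and toJson below are hypothetical names, not part of SolrJ or Jackson, and the escaping is deliberately minimal; for real code a proper library handles edge cases far better:

```java
import java.util.*;

public class JsonSketch {
    // Minimal JSON serializer for Maps, Collections, Strings, numbers and
    // booleans -- enough to render the field/value pairs of a response.
    static String toJson(Object o) {
        if (o == null) return "null";
        if (o instanceof Number || o instanceof Boolean) return o.toString();
        if (o instanceof Map) {
            StringBuilder sb = new StringBuilder("{");
            boolean first = true;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) o).entrySet()) {
                if (!first) sb.append(",");
                sb.append(toJson(String.valueOf(e.getKey())))
                  .append(":").append(toJson(e.getValue()));
                first = false;
            }
            return sb.append("}").toString();
        }
        if (o instanceof Collection) {
            StringBuilder sb = new StringBuilder("[");
            boolean first = true;
            for (Object e : (Collection<?>) o) {
                if (!first) sb.append(",");
                sb.append(toJson(e));
                first = false;
            }
            return sb.append("]").toString();
        }
        // Strings and everything else: quote, escaping only the quote character.
        return "\"" + o.toString().replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("common_names", Arrays.asList("pooch", "man's best friend"));
        System.out.println(toJson(doc));
        // prints {"id":"1","common_names":["pooch","man's best friend"]}
    }
}
```

The same recursive walk works for any nested Map/List structure a response decodes into.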


Matching on a multi valued field

2011-03-29 Thread Brian Lamb
Hi all,

I have a field set up like this:

<field name="common_names" multiValued="true" type="text" indexed="true"
stored="true" required="false" />

And I have some records:

RECORD1
<arr name="common_names">
  <str>man's best friend</str>
  <str>pooch</str>
</arr>

RECORD2
<arr name="common_names">
  <str>man's worst enemy</str>
  <str>friend to no one</str>
</arr>

Now if I do a search such as:
http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND df=common_names}man's
friend

Both records are returned. However, I only want RECORD1 returned. I
understand why RECORD2 is returned but how can I structure my query so that
only RECORD1 is returned?

Thanks,

Brian Lamb


Re: Matching on a multi valued field

2011-03-29 Thread Jonathan Rochkind
As far as I know, there's no support in Solr for "all words must match 
in the same value" of a multi-valued field.


I agree it would be useful in some cases.

As long as you don't need to do an _actual_ phrase search, you can kind 
of fake it by using a phrase query, with the query slop set so high that 
it will encompass the whole field. Just make sure your 
positionIncrementGap in your schema.xml is higher than your phrase 
slop, to keep your phrase slop from leaking over into another value of 
the multi-valued field.


fq="man's friend"~1
(but url encode the value)

On 3/29/2011 4:57 PM, Brian Lamb wrote:

Hi all,

I have a field set up like this:

<field name="common_names" multiValued="true" type="text" indexed="true"
stored="true" required="false" />

And I have some records:

RECORD1
<arr name="common_names">
   <str>man's best friend</str>
   <str>pooch</str>
</arr>

RECORD2
<arr name="common_names">
   <str>man's worst enemy</str>
   <str>friend to no one</str>
</arr>

Now if I do a search such as:
http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND df=common_names}man's
friend

Both records are returned. However, I only want RECORD1 returned. I
understand why RECORD2 is returned but how can I structure my query so that
only RECORD1 is returned?

Thanks,

Brian Lamb



Re: Matching on a multi valued field

2011-03-29 Thread Savvas-Andreas Moysidis
I assume you are using the Standard Handler?
In that case wouldn't something like:
q=common_names:(man's friend)&q.op=AND work?

On 29 March 2011 21:57, Brian Lamb brian.l...@journalexperts.com wrote:

 Hi all,

 I have a field set up like this:

 <field name="common_names" multiValued="true" type="text" indexed="true"
 stored="true" required="false" />

 And I have some records:

 RECORD1
 <arr name="common_names">
  <str>man's best friend</str>
  <str>pooch</str>
 </arr>

 RECORD2
 <arr name="common_names">
  <str>man's worst enemy</str>
  <str>friend to no one</str>
 </arr>

 Now if I do a search such as:
 http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND df=common_names}man's
 friend

 Both records are returned. However, I only want RECORD1 returned. I
 understand why RECORD2 is returned but how can I structure my query so that
 only RECORD1 is returned?

 Thanks,

 Brian Lamb



Re: Matching on a multi valued field

2011-03-29 Thread Erick Erickson
Two things need to be done. First, define positionIncrementGap
(see http://wiki.apache.org/solr/SchemaXml) for the field.

Then use phrase searches with the slop less than what you've
defined for positionIncrementGap.

Of course you'll have to have a positionIncrementGap larger than the
number of tokens in any single entry in your multiValued field, and you'll
have to re-index.

Best
Erick

On Tue, Mar 29, 2011 at 4:57 PM, Brian Lamb
brian.l...@journalexperts.com wrote:
 Hi all,

 I have a field set up like this:

 <field name="common_names" multiValued="true" type="text" indexed="true"
 stored="true" required="false" />

 And I have some records:

 RECORD1
 <arr name="common_names">
  <str>man's best friend</str>
  <str>pooch</str>
 </arr>

 RECORD2
 <arr name="common_names">
  <str>man's worst enemy</str>
  <str>friend to no one</str>
 </arr>

 Now if I do a search such as:
 http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND df=common_names}man's
 friend

 Both records are returned. However, I only want RECORD1 returned. I
 understand why RECORD2 is returned but how can I structure my query so that
 only RECORD1 is returned?

 Thanks,

 Brian Lamb
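The two steps Erick describes might look like this in schema.xml (a sketch only; the field type name and the gap value of 100 are assumptions):

```xml
<!-- A gap of 100 positions is inserted between the values of a
     multiValued field at index time -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <!-- analyzer definition as before -->
</fieldType>

<field name="common_names" type="text" multiValued="true"
       indexed="true" stored="true" />
```

A phrase query whose slop stays below the gap, e.g. fq={!df=common_names}"man's friend"~99, can then only match terms within a single value, never across two values of the field.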



Re: Matching on a multi valued field

2011-03-29 Thread Savvas-Andreas Moysidis
my bad..just realised your problem.. :D

On 29 March 2011 22:07, Savvas-Andreas Moysidis 
savvas.andreas.moysi...@googlemail.com wrote:

 I assume you are using the Standard Handler?
 In that case wouldn't something like:
 q=common_names:(man's friend)&q.op=AND work?

 On 29 March 2011 21:57, Brian Lamb brian.l...@journalexperts.com wrote:

 Hi all,

 I have a field set up like this:

 <field name="common_names" multiValued="true" type="text" indexed="true"
 stored="true" required="false" />

 And I have some records:

 RECORD1
 <arr name="common_names">
  <str>man's best friend</str>
  <str>pooch</str>
 </arr>

 RECORD2
 <arr name="common_names">
  <str>man's worst enemy</str>
  <str>friend to no one</str>
 </arr>

 Now if I do a search such as:
 http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND df=common_names}man's
 friend

 Both records are returned. However, I only want RECORD1 returned. I
 understand why RECORD2 is returned but how can I structure my query so
 that
 only RECORD1 is returned?

 Thanks,

 Brian Lamb





Re: Matching on a multi valued field

2011-03-29 Thread Markus Jelsma
Hi,

Your filter query is looking for a match of "man's friend" in a single field. 
Regardless of analysis of the common_names field, all terms are present in the 
common_names field of both documents. A multiValued field is actually a single 
field with all values separated by a position increment gap. Try setting that 
value high enough and use a PhraseQuery. 

That should work

Cheers,

 Hi all,
 
 I have a field set up like this:
 
 <field name="common_names" multiValued="true" type="text" indexed="true"
 stored="true" required="false" />

 And I have some records:

 RECORD1
 <arr name="common_names">
   <str>man's best friend</str>
   <str>pooch</str>
 </arr>

 RECORD2
 <arr name="common_names">
   <str>man's worst enemy</str>
   <str>friend to no one</str>
 </arr>

 Now if I do a search such as:
 http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND
 df=common_names}man's friend
 
 Both records are returned. However, I only want RECORD1 returned. I
 understand why RECORD2 is returned but how can I structure my query so that
 only RECORD1 is returned?
 
 Thanks,
 
 Brian Lamb


Re: Matching on a multi valued field

2011-03-29 Thread Markus Jelsma
orly, all replies came in while sending =)

 Hi,
 
 Your filter query is looking for a match of man's friend in a single
 field. Regardless of analysis of the common_names field, all terms are
 present in the common_names field of both documents. A multiValued field
 is actually a single field with all data separated with positionIncrement.
 Try setting that value high enough and use a PhraseQuery.
 
 That should work
 
 Cheers,
 
  Hi all,
  
  I have a field set up like this:
  
  <field name="common_names" multiValued="true" type="text" indexed="true"
  stored="true" required="false" />

  And I have some records:

  RECORD1
  <arr name="common_names">
    <str>man's best friend</str>
    <str>pooch</str>
  </arr>

  RECORD2
  <arr name="common_names">
    <str>man's worst enemy</str>
    <str>friend to no one</str>
  </arr>

  Now if I do a search such as:
  http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND
  df=common_names}man's friend
  
  Both records are returned. However, I only want RECORD1 returned. I
  understand why RECORD2 is returned but how can I structure my query so
  that only RECORD1 is returned?
  
  Thanks,
  
  Brian Lamb


Re: Challenges of bundling Solr out-of-box

2011-03-29 Thread Markus Jelsma
Hi,

You're right, there are new technical challenges for customers that don't have 
the experience in-house. Some customers have personnel you can teach to 
monitor and maintain an installation. Others take a service-level agreement, 
or simply let it run forever without issues if the environment allows it.

Customers that don't want to become search-savvy administrators all of a 
sudden can opt for an SLA.

Cheers,

 I am hoping to get some feedback from the users in this list about how they
 tackled the challenges of bundling Solr out-of-box with an existing product
 that was already being sold to customers.
 
 Technical challenges of indexing & searching by adopting Solr are trivial
 when compared to the fact that the customer now has to be told how to
 maintain their search indices for optimum performance using Solr
 administration!!!
 
 I do understand that any product that adopts Solr probably had this coming
 to them anyway but for folks that move away from structured-search database
 technologies and into the text-search world ... they aren't quite sure how
 to ask their customers to become search-savvy administrators all of a
 sudden.
 
 Any thoughts on how you broke your customers into loving the Solr
 experience?
 
 - Pulkit


catch_all field versus multiple OR Boolean query

2011-03-29 Thread Savvas-Andreas Moysidis
Hello,

Currently in our index we have multiple fields and a copyfield / catch_all
field. When users select all search options we specify the catch_all field
as the field to search on. This has worked very well for our needs but a
question was recently raised within our team regarding  the difference
between using a catch_all field and specifying a Boolean query by OR-ing all
fields together.
From our own experimentation, we have observed that using those two
different strategies we get back different results lists.

By looking at the Similarity class, we can understand how the score is
calculated for the catch_all field but is there any input on how the score
gets calculated for the Boolean query?

Regards,
- Savvas


Re: Matching on a multi valued field

2011-03-29 Thread Juan Pablo Mora
 A multiValued field
 is actually a single field with all data separated with positionIncrement.
 Try setting that value high enough and use a PhraseQuery.


That is true, but you cannot do things like:

q="bar* foo*"~10 with the default query parser.

and if you use dismax you will have the same problems with multivalued fields. 
Imagine the situation:

Doc1:
field A: [foo bar,dooh] 2 values

Doc2:
field A: [bar dooh, whatever] Another 2 values

the query:

qt=dismax & qf=fieldA & q=(bar dooh)

will return both Doc1 and Doc2. The only thing you can do in this situation is 
boost the phrase query with the pf parameter in order to get Doc2 into the 
first position of the results:

pf = fieldA^1


Thanks,
JP.


El 29/03/2011, a las 23:14, Markus Jelsma escribió:

 orly, all replies came in while sending =)
 
 Hi,
 
 Your filter query is looking for a match of man's friend in a single
 field. Regardless of analysis of the common_names field, all terms are
 present in the common_names field of both documents. A multiValued field
 is actually a single field with all data separated with positionIncrement.
 Try setting that value high enough and use a PhraseQuery.
 
 That should work
 
 Cheers,
 
 Hi all,
 
 I have a field set up like this:
 
 <field name="common_names" multiValued="true" type="text" indexed="true"
 stored="true" required="false" />

 And I have some records:

 RECORD1
 <arr name="common_names">
  <str>man's best friend</str>
  <str>pooch</str>
 </arr>

 RECORD2
 <arr name="common_names">
  <str>man's worst enemy</str>
  <str>friend to no one</str>
 </arr>

 Now if I do a search such as:
 http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND
 df=common_names}man's friend
 
 Both records are returned. However, I only want RECORD1 returned. I
 understand why RECORD2 is returned but how can I structure my query so
 that only RECORD1 is returned?
 
 Thanks,
 
 Brian Lamb



Re: catch_all field versus multiple OR Boolean query

2011-03-29 Thread Erick Erickson
It's not so much the Boolean as it is different field characteristics.
The length
of a field factors into the score, and a boolean query that goes against the
individual fields will certainly score differently than putting all
the fields in a
catch-all which is, obviously, longer.

Have you looked at the dismax query parser? It allows you to
distribute queries over
fields automatically, even with varying boosts.

Finally, consider adding debugQuery=on to your query to see what each field
contributes to the score, that'll help with understanding the scoring,
although it's
a little hard to read...

Best
Erick

On Tue, Mar 29, 2011 at 6:06 PM, Savvas-Andreas Moysidis
savvas.andreas.moysi...@googlemail.com wrote:
 Hello,

 Currently in our index we have multiple fields and a copyfield / catch_all
 field. When users select all search options we specify the catch_all field
 as the field to search on. This has worked very well for our needs but a
 question was recently raised within our team regarding  the difference
 between using a catch_all field and specifying a Boolean query by OR-ing all
 fields together.
 From our own experimentation, we have observed that using those two
 different strategies we get back different results lists.

 By looking at the Similarity class, we can understand how the score is
 calculated for the catch_all field but is there any input on how the score
 gets calculated for the Boolean query?

 Regards,
 - Savvas
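For reference, the dismax approach Erick mentions might be configured along these lines (a sketch only; the handler name, field names, and boost values are assumptions):

```xml
<!-- solrconfig.xml: query each field individually, with per-field boosts,
     instead of one long catch-all field -->
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">title^2.0 description author</str>
  </lst>
</requestHandler>
```

Scores then come from a disjunction-max over the per-field scores rather than from a single concatenated field, which also sidesteps the length-normalization difference Erick describes.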



Re: Fwd: machine tags, copy fields and pattern tokenizers

2011-03-29 Thread sukhdev
Hi,

Were you able to solve the machine tag problem in Solr? Actually, I am also
looking into whether machine tags can be stored in a Solr index and searched
in an efficient way.

Regards




Exporting to CSV

2011-03-29 Thread Charles Wardell
Is there an easy way to get queried data exported from solr in a csv format?
Hoping there is a handler or library for this.

Regards,
charlie


Re: Exporting to CSV

2011-03-29 Thread Koji Sekiguchi

(11/03/30 10:59), Charles Wardell wrote:

Is there an easy way to get queried data exported from solr in a csv format?
Hoping there is a handler or library for this.


Charlie,

Solr 3.1, which will be released shortly, has a CSV response writer that is
implicitly defined. Try the wt=csv request parameter.

Koji
--
http://www.rondhuit.com/en/


Re: Exporting to CSV

2011-03-29 Thread Charles Wardell
Hi Koji,

Do you mean that adding wt=csv to my HTTP request will give me a CSV?
The only downloads that I see on the Solr site are for 1.4.x.
Is there a 3.1 beta?


On Mar 29, 2011, at 10:32 PM, Koji Sekiguchi wrote:

 (11/03/30 10:59), Charles Wardell wrote:
 Is there an easy way to get queried data exported from solr in a csv format?
 Hoping there is a handler or library for this.
 
 Charlie,
 
 Solr 3.1, will be released shortly, has csv response writer which is 
 implicitly
 defined. Try wt=csv request parameter.
 
 Koji
 -- 
 http://www.rondhuit.com/en/



Re: Exporting to CSV

2011-03-29 Thread Estrada Groups
Check out the trunk version of Solr and build that. Those mods are in there for 
sure. I think the version in trunk is 4.0 but that discussion should be on a 
different thread ;-)

Adam


On Mar 29, 2011, at 11:35 PM, Charles Wardell charles.ward...@bcsolution.com 
wrote:

 Hi Koji,
 
 Do you mean that adding wt=csv to my http request will give me a csv?
 The only downloads that I see on the SOLR site is for 1.4.x
 Is there a 3.1 beta?
 
 
 On Mar 29, 2011, at 10:32 PM, Koji Sekiguchi wrote:
 
 (11/03/30 10:59), Charles Wardell wrote:
 Is there an easy way to get queried data exported from solr in a csv format?
 Hoping there is a handler or library for this.
 
 Charlie,
 
 Solr 3.1, will be released shortly, has csv response writer which is 
 implicitly
 defined. Try wt=csv request parameter.
 
 Koji
 -- 
 http://www.rondhuit.com/en/