Re: Single Core or Multiple Core?

2009-09-04 Thread Shalin Shekhar Mangar
On Fri, Sep 4, 2009 at 4:35 AM, Jonathan Ariel ionat...@gmail.com wrote:

 It seems like it is really hard to decide when the Multiple Core solution
 is more appropriate. As far as I could understand from this list and the wiki,
 the Multiple Core feature was designed to address the need of handling
 different sets of data within the same Solr instance, where the sets of data
 don't need to be joined.


Correct. It is also useful when you don't want to set up multiple boxes or
Tomcat instances for each Solr.


 In my case the documents are of a specific site and country. So document A
 can be of Site 1 / Country 1, B of Site 2 / Country 1, C of Site 1 /
 Country 2, and so on.
 For the use cases of my application I will never query across countries or
 sites. I will always have to provide the country id and the site id to the
 query.
 Would you suggest splitting my data into cores? I have a few sites (around 20)
 and more countries (around 90).
 Should I split my data into sites (around 20 cores) and within a core
 filter by country? Should I split by Site and Country (around 1800 cores)?
 What should I consider when splitting my data into multiple cores?


The first question is: why do you want to split at all? Is the schema or
solrconfig different? Are the different sites or countries updated at
different times? Is the combined index so big that response times jump
wildly when all the caches are thrown out because documents related to one
site or country are updated? Does warmup, optimize, or replication take too
much time with one big index?

Each core will have its own configuration files (maintenance) and you need
to set up replication separately for each core (which is a pain with the
script-based replication). Also note that by keeping all cores in one Tomcat
(one JVM), a stop-the-world GC will stop all cores, which is not the case
when using separate JVMs for each index/core.
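
To make the single-index alternative concrete, here is a minimal sketch of how a client could narrow one shared index by site and country with filter queries instead of routing to one of roughly 1800 cores. The field names site_id and country_id, the base URL, and the helper name are assumptions for illustration; they do not come from the poster's schema.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust to your own Solr instance.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_select_url(user_query, site_id, country_id):
    """Build a select URL that narrows one shared index by site and country.

    site_id and country_id are assumed field names, not taken from the thread.
    """
    params = [
        ("q", user_query),
        ("fq", f"site_id:{site_id}"),        # one filter query per restriction
        ("fq", f"country_id:{country_id}"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


url = build_select_url("red bicycle", 1, 2)
print(url)

# Number of cores needed if the data were split by site and country instead.
cores_needed = 20 * 90
print(cores_needed)
```

Whether this beats separate cores still depends on the questions above (index size, update patterns, warmup time); the sketch only shows that the site/country restriction itself does not require a split.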

-- 
Regards,
Shalin Shekhar Mangar.


Re: Problem querying for a value with a space

2009-09-04 Thread Shalin Shekhar Mangar
On Fri, Sep 4, 2009 at 6:40 AM, Chris Hostetter hossman_luc...@fucit.org wrote:


 : Use +specific_LIST_s:(For Sale)
 : or
 : +specific_LIST_s:"For Sale"

 those are *VERY* different queries.

 The first is just syntactic sugar for...
  +specific_LIST_s:For +specific_LIST_s:Sale

 ...which is not the same as the second query (especially when using
 StrField, or KeywordTokenizer)
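
The difference can be illustrated with a toy model, reading the second query as a quoted phrase. This is a sketch under simplifying assumptions, not Lucene's actual matching code: an untokenized (string-like) field is modelled as indexing the whole value as one term, and all function names are made up for the example.

```python
def index_terms(value, tokenized):
    """Return the set of indexed terms for one field value.

    tokenized=False mimics a string-like field (the whole value is one term);
    tokenized=True mimics simple whitespace tokenization.
    """
    return set(value.split()) if tokenized else {value}


def matches_all_terms(terms, query_terms):
    """field:(For Sale) read as +field:For +field:Sale: every term is required."""
    return all(t in terms for t in query_terms)


def matches_phrase(terms, phrase, tokenized):
    """field:"For Sale": for an untokenized field the phrase is a single term."""
    if not tokenized:
        return phrase in terms
    # Toy check only: real phrase queries also verify term positions.
    return all(t in terms for t in phrase.split())


string_terms = index_terms("For Sale", tokenized=False)
grouped_hit = matches_all_terms(string_terms, ["For", "Sale"])
phrase_hit = matches_phrase(string_terms, "For Sale", tokenized=False)
print(grouped_hit, phrase_hit)
```

In this model the grouped form finds nothing in the string field because neither For nor Sale exists as a term on its own, while the phrase form matches the single stored term.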


Ah, my bad. I learnt something new today :)

-- 
Regards,
Shalin Shekhar Mangar.


Re: Exact Word Search

2009-09-04 Thread Shalin Shekhar Mangar
If you want to do an exact match (case sensitive) in Solr, you should have a
string type field and the query should be made as fieldname:value

However, reading your mail I get the feeling that the query is actually
being made by Carrot2.

On Fri, Sep 4, 2009 at 7:41 AM, bhaskar chandrasekar
bas_s...@yahoo.co.in wrote:

 Hi shalin,

 Thanks for your reply.
 I am not sure how the query is formed in Solr.
 If you could shed some light on this, it would be helpful.
 Is it achievable?

 Regards
 Bhaskar


 --- On Thu, 9/3/09, Shalin Shekhar Mangar shalinman...@gmail.com wrote:


 From: Shalin Shekhar Mangar shalinman...@gmail.com
 Subject: Re: Exact Word Search
 To: solr-user@lucene.apache.org
 Date: Thursday, September 3, 2009, 5:14 AM


 On Thu, Sep 3, 2009 at 1:33 PM, bhaskar chandrasekar
 bas_s...@yahoo.co.in wrote:

  Hi,
 
  Can anyone help me with the scenario below?
 
  Scenario:
 
  I have integrated Solr with Carrot2.
  The issue is:
  Assuming I give bhaskar as the input string for search,
  it should give me search results pertaining to bhaskar only.
  Example: It should not display search results such as chandarbhaskar or
  bhaskarc.
  Basically the search should happen based on an exact word match. I am not
  bothered about case sensitivity here.
  How can I achieve the above scenario in Carrot2?
 

 Bhaskar, I think this question is better suited for the Carrot mailing
 lists. Unless you yourself control how the solr query is created, we will
 not be able to help you.

 --
 Regards,
 Shalin Shekhar Mangar.








-- 
Regards,
Shalin Shekhar Mangar.


Re: Exact Word Search

2009-09-04 Thread bhaskar chandrasekar
 
Hi Shalin,
 
Where, or in which file, should I set the values you have mentioned?
Let me know how to set them.
 
Regards
Bhaskar


Re: Solr, JNDI config, dataDir, and solr home problem

2009-09-04 Thread Noble Paul നോബിള്‍ नोब्ळ्
It is nowhere mentioned that you can use a variable ${solr.home} in
your solrconfig.xml. There is a bug related to this issue:
https://issues.apache.org/jira/browse/SOLR-1267

On Fri, Sep 4, 2009 at 5:47 AM, Archon810 archon...@gmail.com wrote:

 Here's my problem.

 I'm trying to follow a multi Solr setup, straight from the Solr wiki -
 http://wiki.apache.org/solr/SolrTomcat#head-024d7e11209030f1dbcac9974e55106abae837ac.

 Here's the relevant code:
 <Context docBase="/some/path/solr.war" debug="0" crossContext="true">
   <Environment name="solr/home" type="java.lang.String"
                value="/some/path/solr1home" override="true" />
 </Context>

 Now I want to set the Solr <dataDir> in solrconfig.xml, relative to
 the solr home property. The instructions
 http://wiki.apache.org/solr/SolrConfigXml#head-e8fbf2d748d90c5900aac712d0e3385ced5bd128
 say <dataDir> is used to specify an alternate directory to hold all
 index data other than the default ./data under the Solr home. If replication
 is in use, this should match the replication configuration. If this
 directory is not absolute, then it is relative to the current working
 directory of the servlet container.

 However, no matter how I try to set the dataDir property, solr home is not
 being found. For example,
  <dataDir>${solr.home}/data</dataDir>

 What's even more confusing are these INFO notices in the log:
 INFO: No /solr/home in JNDI
 Sep 3, 2009 4:33:26 PM org.apache.solr.core.SolrResourceLoader
 locateSolrHome
 INFO: solr home defaulted to 'solr/' (could not find system property or
 JNDI)

 The JNDI instructions say to specify solr/home, the log complains
 about /solr/home (extra slash), the solrconfig.xml file seems to expect
 ${solr.home} - how much more confusing can it get?

 This person is having the same issue:
 http://mysolr.com/tips/setting-solr-home-solrhome-in-jndi-on-tomcat-55/

 So, how does one refer to solr home from solrconfig.xml in a JNDI
 configuration scenario? Also, is there a way to debug/see variables that are
 defined in a specific context, such as solrconfig.xml? I feel like I'm
 completely blind here.

 Thank you!
 --
 View this message in context: 
 http://www.nabble.com/Solr%2C-JNDI-config%2C-dataDir%2C-and-solr-home-problem-tp25286277p25286277.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: Solr, JNDI config, dataDir, and solr home problem

2009-09-04 Thread Archon810

I saw it being used in the default solrconfig.xml in this phrase:
If you wish to hide files under ${solr.home}/conf, explicitly register the
ShowFileRequestHandler using...

It was only natural to assume it would work for something as trivial as
dataDir.

So, there's no way to refer to the solr/home value defined in JNDI?



-- 
View this message in context: 
http://www.nabble.com/Solr%2C-JNDI-config%2C-dataDir%2C-and-solr-home-problem-tp25286277p25292025.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Exact Word Search

2009-09-04 Thread bhaskar chandrasekar




Hi, 

  
I have integrated Solr with Carrot2 Cluster Engine (v 3.1.0).

Carrot2 is used as a presentation layer. Carrot2 sends the requested query to
an external source (Solr) and gets the results from Solr.
Carrot2 may not be responsible for forming the query. It would have been
handled on the Solr end.

Please help me with the scenario below.

Scenario: (Please ignore case sensitivity)

Assuming I give bhaskar as the input string,
it should give me search results pertaining to the word ‘bhaskar’ only.

I am expecting output like that of the database query below:
Select * from MASTER where name = 'bhaskar';

The above query is supposed to return matched records for ‘bhaskar’.

My Carrot2 search results should have similar output.

It should not display search results where bhaskar is only a prefix or suffix.

  



Regards 
Bhaskar 


Facet search field returning results on split words

2009-09-04 Thread EwanH

Hi

I have a solr search where a particular field named location is a place
name.  I have the field indexed and stored.  It is quite likely that a field
value could comprise more than one term or at least 2 words split by a space
such as Burnham Market.  Now if I search on location:burnham I get the
appropriate docs returned ok but the facet results return

<lst name="location">
  <int name="burnham">2</int>
  <int name="thorp">2</int>
</lst>

i.e. values for both words, which I don't want.  What can I do about this?
Can I somehow escape the space when adding the data for indexing?

-- Ewan 
-- 
View this message in context: 
http://www.nabble.com/Facet-search-field-returning-results-on-split-words-tp25293787p25293787.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Exact Word Search

2009-09-04 Thread Shalin Shekhar Mangar
On Fri, Sep 4, 2009 at 6:06 PM, bhaskar chandrasekar
bas_s...@yahoo.co.in wrote:


 Hi,


 I have integrated Solr with Carrot2 Cluster Engine (v 3.1.0).

 Carrot2 is used as a presentation layer. Carrot2 sends the requested query to
 an external source (Solr) and gets the results from Solr.
 Carrot2 may not be responsible for forming the query. It would have been
 handled on the Solr end.


Can you post the exact query that your application or Carrot2 is sending to
Solr? Can you also list the Solr field and type defined in schema.xml which
is being searched?



 Please help me with the below scenarios.

 Scenario: (Please ignore case sensitivity)

 Assuming I give bhaskar as the input string,
 it should give me search results pertaining to the word ‘bhaskar’ only.

 I am expecting output like that of the database query below:
 Select * from MASTER where name = 'bhaskar';

 The above query is supposed to return matched records for ‘bhaskar’.


Use a solr.TextField with KeywordTokenizer and LowerCaseFilter and search
with q=field-name:field-value
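
As a rough illustration of why that analysis chain gives a case-insensitive exact match, here is a toy Python stand-in. It is a sketch, not Solr code: the whole value is treated as a single term (the KeywordTokenizer part) and lowercased (the LowerCaseFilter part), and the sample names are invented.

```python
def analyze_keyword_lowercase(value):
    """Toy stand-in for KeywordTokenizer + LowerCaseFilter:
    the whole value becomes a single lowercased term."""
    return value.lower()


def exact_matches(names, user_input):
    """Return names whose single analyzed term equals the analyzed input."""
    wanted = analyze_keyword_lowercase(user_input)
    return [n for n in names if analyze_keyword_lowercase(n) == wanted]


names = ["bhaskar", "Bhaskar", "chandarbhaskar", "bhaskarc"]
result = exact_matches(names, "BHASKAR")
print(result)
```

Because each value is one term, chandarbhaskar and bhaskarc can never match a query for bhaskar, while differences in case disappear during analysis.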

-- 
Regards,
Shalin Shekhar Mangar.


Re: Solr, JNDI config, dataDir, and solr home problem

2009-09-04 Thread Noble Paul നോബിള്‍ नोब्ळ्
${solr.home} is used for documentation purposes only. It is not set as a variable.
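
For readers wondering what "not set as a variable" means in practice: to my understanding, placeholders in solrconfig.xml take the form ${name} or ${name:default} and are resolved from Java system properties, which is how the stock ${solr.data.dir:./solr/data} entry works. The following is a toy Python model of that substitution, written for illustration only; the function name and the error behaviour are assumptions, not Solr's actual implementation.

```python
import re

# Matches ${name} or ${name:default}
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def substitute(text, properties):
    """Replace ${name} / ${name:default} using a dict of defined properties.

    Raises KeyError when a property is undefined and has no default.
    """
    def replace(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError(f"no value for property {name!r}")

    return _PLACEHOLDER.sub(replace, text)


# A defined property resolves to its value (the path is illustrative).
defined = substitute("${solr.data.dir:./solr/data}", {"solr.data.dir": "/var/solr/data"})
# With nothing defined, the default after the colon is used.
defaulted = substitute("${solr.data.dir:./solr/data}", {})
print(defined, defaulted)

# ${solr.home} with nothing defined and no default cannot be resolved.
try:
    substitute("${solr.home}/data", {})
    unresolved = False
except KeyError:
    unresolved = True
print(unresolved)
```

In this model a JNDI entry named solr/home never appears in the property dictionary, which mirrors the situation described in this thread: the JNDI value locates Solr home, but it is not exposed as a substitutable property.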


-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: Facet search field returning results on split words

2009-09-04 Thread Avlesh Singh
Your field needs to be untokenized for the expected results. Faceting on the
text field that you use for searching will give you facets like these. You can
index the same data in a separate string field and facet on that field.

PS: You can use copyField to copy data at index time from one field to
another.
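
A small sketch of why the tokenized field produces per-word facets, assuming simple lowercase whitespace tokenization (a real text field may also stem words, which is presumably why thorp showed up in the output above). The sample values and the function name are made up for illustration.

```python
from collections import Counter


def facet_counts(values, tokenized):
    """Count facet terms: per lowercased word if tokenized, else per whole value."""
    counts = Counter()
    for value in values:
        terms = value.lower().split() if tokenized else [value]
        counts.update(set(terms))  # each document counts once per distinct term
    return counts


locations = ["Burnham Market", "Burnham Thorpe"]  # illustrative values
text_facets = facet_counts(locations, tokenized=True)
string_facets = facet_counts(locations, tokenized=False)
print(text_facets)
print(string_facets)
```

Faceting on the untokenized copy keeps "Burnham Market" intact as one facet value, while searching can still use the tokenized field.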

Cheers
Avlesh


Re: Solr, JNDI config, dataDir, and solr home problem

2009-09-04 Thread Archon810

OK, so I can't access it by ${solr.home}, but is there a way to access it?
After all, it's a variable defined in JNDI, shouldn't there be a way to
refer to it?

Also, what about the INFO message that says it can't find /solr/home, while
the instructions refer to solr/home ?




-- 
View this message in context: 
http://www.nabble.com/Solr%2C-JNDI-config%2C-dataDir%2C-and-solr-home-problem-tp25286277p25296862.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Schema for group/child entity setup

2009-09-04 Thread Aakash Dharmadhikari
Can't you store the locations as part of the parent listing when you store it?
That way there would be only one document per parent listing, and all the
location-related information could be multi-valued attributes, one per
property, or be stored some other way depending on the attributes.

2009/9/3 R. Tan tanrihae...@gmail.com

 Hi Solrers,
 I would like to get your opinion on how to best approach a search
 requirement that I have. The scenario is I have a set of business listings
 that may be grouped into one parent business (such as 7-eleven having several
 locations). On the results page, I only want 7-eleven to show up once but
 also show how many locations matched the query (facet filtered by state,
 for example) and maybe a preview of some of the locations.

 Searching for the business name is straightforward but the locations within
 a result are quite tricky. I can do the opposite, searching for the
 locations and faceting on business names, but it will still basically be
 the same thing and repeat results with the same business name.

 Any advice?

 Thanks,
 R



Re: how to scan dynamic field without specifying each field in query

2009-09-04 Thread Aakash Dharmadhikari
What other searches would you like to perform on these fields?

From the proposed function definition I gather that when foo*:3 is
searched, all foo* fields are searched and none are excluded. Assuming
that this is the only search to be performed on these fields, we
might declare the dynamic field foo* and, rather than constructing the field's
actual name from the property key, construct it from the value to be searched
and store the key as the value.

So assume we want to search fooA:X fooB:X fooC:X. I would rather store
fooX as a multivalued field and store A, B and C as its values.

The search query can then be fooX:*, that is, if field fooX exists, get all
its values.
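
The inversion described above can be sketched on the client side before documents are sent for indexing. This is a minimal illustration under assumed names (the helper, the sample document, and the foo prefix are hypothetical), not a Solr feature.

```python
from collections import defaultdict


def invert_dynamic_fields(doc, prefix):
    """Turn {prefix+key: value} pairs into {prefix+value: [keys]}.

    Fields that do not start with prefix are copied through unchanged.
    """
    inverted = defaultdict(list)
    passthrough = {}
    for field, value in doc.items():
        if field.startswith(prefix):
            key = field[len(prefix):]
            inverted[f"{prefix}{value}"].append(key)
        else:
            passthrough[field] = value
    return {**passthrough, **inverted}


doc = {"id": "42", "fooA": "X", "fooB": "X", "fooC": "Y"}
inverted_doc = invert_dynamic_fields(doc, "foo")
print(inverted_doc)
```

With documents shaped this way, a single clause on fooX replaces the OR across fooA, fooB and fooC, at the cost of making other kinds of queries on those fields harder.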

But again, as I asked at the start, it depends on what other kinds of queries
you want to perform.

regards,
aakash

2009/9/4 gdeconto gerald.deco...@topproducer.com


 I am thinking that my example was too simple/generic :-U.  It is possible
 for several more dynamic fields to exist and other functionality to be
 required, i.e. what if my example had read:

 http://localhost:8994/solr/select?q=((Foo1:3 OR Foo2:3 OR Foo3:3 OR … Foo999:3)
 AND (Bar1:1 OR Bar2:1 OR Bar3:1...Bar999:1) AND (Etc1:7 OR Etc2:7
 OR Etc3:7...Etc999:7))

 obviously a nasty query (and care would be needed for MAX_BOOLEAN_CLAUSES).
 that said, are there other mechanisms to better handle that type of query,
 i.e.:

 http://localhost:8994/solr/select?q=(myfunction('Foo', 3) AND
 myfunction('Bar', 1) AND myfunction('Etc', 7))


 gdeconto wrote:
 
  say I have a dynamic field called Foo* (where * can be in the hundreds)
  and want to search Foo* for a value of 3 (for example)
 
  I know I can do this via this:
 
  http://localhost:8994/solr/select?q=(Foo1:3 OR Foo2:3 OR Foo3:3 OR … Foo999:3)
 
  However, is there a better way?  i.e. is there some way to query by a
  function I create, possibly something like this:
 
  http://localhost:8994/solr/select?q=myfunction('Foo', 3)
 
  where myfunction itself iterates thru all the instances of Foo*
 
  any help appreciated
 
 

 --
 View this message in context:
 http://www.nabble.com/how-to-scan-dynamic-field-without-specifying-each-field-in-query-tp25280228p25283094.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Re: how to scan dynamic field without specifying each field in query

2009-09-04 Thread gdeconto

I don't have that answer, as I was asking a general question (not one for a
specific situation I am encountering).

What I am essentially asking is: is there a short, simple and generic
method/technique to deal with large numbers of dynamic fields (rather than
having to specify each and every test on each and every dynamic field) in a
query?

what originally prompted this question is I was looking at FunctionQueries
(http://wiki.apache.org/solr/FunctionQuery) and started to wonder if there
was some way to create my own functions to handle dynamic fields.


Aakash Dharmadhikari wrote:
 
 what all other searches you would like to perform on these fields?
 
 ...
 

-- 
View this message in context: 
http://www.nabble.com/how-to-scan-dynamic-field-without-specifying-each-field-in-query-tp25280228p25297439.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: how to scan dynamic field without specifying each field in query

2009-09-04 Thread Avlesh Singh

 I dont have that answer as I was asking a general question, not one for a
 specific situation I am encountering).

I can understand :)

 what I am essentially asking for is: is there a short, simple and generic
 method/technique to deal with large numbers of dynamic fields (rather than
 having to specify each and every test on each and every dynamic field) in a
 query

Not as of now. There are a lot of open issues in Solr aiming to handle
dynamic fields in an intuitive way. SolrJ has already been made capable of
binding dynamic field content into Java beans (
https://issues.apache.org/jira/browse/SOLR-1129). Faceting on myField_* (
https://issues.apache.org/jira/browse/SOLR-1387) and adding SolrDocuments
with Map<String, String> myField_* (
https://issues.apache.org/jira/browse/SOLR-1357) are just some of the
enhancements on the way.

 what originally prompted this question is I was looking at FunctionQueries (
 http://wiki.apache.org/solr/FunctionQuery) and started to wonder if there
 was some way to create my own functions to handle dynamic fields.

I don't think you need function queries here. Function queries are supposed
to return score for a document based on their ValueSource. What you probably
need is a custom QueryParser.
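
Until something like that exists, one workaround is to expand the pattern on the client before sending the query. The sketch below is not a Solr QueryParser; it is a plain client-side helper under assumed names, and it presumes the application already knows which concrete field names exist.

```python
def expand_prefix_clause(field_names, prefix, value):
    """Expand a pseudo-clause like Foo*:3 into (Foo1:3 OR Foo2:3 ...)."""
    matching = sorted(f for f in field_names if f.startswith(prefix))
    if not matching:
        raise ValueError(f"no fields start with {prefix!r}")
    return "(" + " OR ".join(f"{f}:{value}" for f in matching) + ")"


def build_query(field_names, clauses):
    """AND together one expanded group per (prefix, value) pair."""
    return " AND ".join(expand_prefix_clause(field_names, p, v) for p, v in clauses)


# Hypothetical field list; in practice it would come from your own schema knowledge.
fields = ["Foo1", "Foo2", "Bar1", "Bar2", "id"]
query = build_query(fields, [("Foo", 3), ("Bar", 1)])
print(query)
```

The expanded query is still subject to the boolean clause limit mentioned earlier in the thread, so this only hides the verbosity from the caller; it does not reduce the work Solr has to do.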

Cheers
Avlesh


capturing field length into a stored document field

2009-09-04 Thread mike.schultz

For various statistics I collect from an index it's important for me to know
the length (measured in tokens) of a document field.  I can get that
information to some degree from the norms for the field but a) the
resolution isn't that great, and b) more importantly, if boosts are used
it's almost impossible to get lengths from this.

Here are two ideas I was thinking about that maybe someone can comment on.

1) Use copyField to copy the field in question, fieldA, to an additional field,
fieldALength, which has an extra filter that just counts the tokens and only
outputs a token representing the length of the field.  This has the
disadvantage of retokenizing basically the whole document (because the field
in question is basically the body).  Plus I would think littering the term
space with these tokens might be bad for performance, but I'm not sure.

2) Add a filter to the field in question which again counts the tokens. 
This filter allows the regular tokens to be indexed as usual but somehow
manages to get the token-count into a stored field of the document.  This
has the advantage of not having to retokenize the field and instead of
littering the token space, the count becomes docdata for each doc.  Can this
be done?  Maybe using threadLocal to temporarily store the count?

Thanks.

-- 
View this message in context: 
http://www.nabble.com/capturing-field-length-into-a-stored-document-field-tp25297690p25297690.html
Sent from the Solr - User mailing list archive at Nabble.com.
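
A different route from the two ideas above is to count the tokens in the indexing client and send the count as an ordinary stored field. The sketch below is only an illustration of that alternative: it assumes plain whitespace tokenization, which will only approximate what the field's real analyzer produces, and the helper name with_token_count is hypothetical.

```python
def with_token_count(doc, field, count_field):
    """Return a copy of doc with the token count of `field` added.

    Assumption: tokens are separated by whitespace. A real analyzer
    (stop words, word delimiters, etc.) may give a different count.
    """
    tokens = doc[field].split()
    enriched = dict(doc)  # leave the caller's document untouched
    enriched[count_field] = len(tokens)
    return enriched


doc = {"id": "1", "fieldA": "the quick brown fox"}
enriched = with_token_count(doc, "fieldA", "fieldALength")
print(enriched["fieldALength"])
```

The count is then independent of norms and boosts, at the cost of tokenizing the text once more outside Solr.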



Re: Schema for group/child entity setup

2009-09-04 Thread R. Tan
I can't, because there are facet values for each location, such as
state/city/neighborhood and facilities. An example result is "7 Eleven, 100
locations" when no location filters are applied; when there is a filter for
state, it should show "7 Eleven, 20 locations".

On Fri, Sep 4, 2009 at 11:57 PM, Aakash Dharmadhikari aaka...@gmail.com wrote:

 can't you store the locations as part of the parent listing while storing.
 This way there would be only one document per parent listing. And all the
 locations related information can be multi valued attributes per property
 or
 any other way depending on the attributes.

 2009/9/3 R. Tan tanrihae...@gmail.com

  Hi Solrers,
  I would like to get your opinion on how to best approach a search
  requirement that I have. The scenario is I have a set of business listings
  that may be grouped into one parent business (such as 7-eleven having
  several locations). On the results page, I only want 7-eleven to show up
  once but also show how many locations matched the query (facet filtered by
  state, for example) and maybe a preview of some of the locations.
  
  Searching for the business name is straightforward but the locations
  within a result are quite tricky. I can do the opposite, searching for the
  locations and faceting on business names, but it will still basically be
  the same thing and repeat results with the same business name.
 
  Any advice?
 
  Thanks,
  R
 



Re: Field Collapsing (was Re: Schema for group/child entity setup)

2009-09-04 Thread R. Tan
Okay. Thanks for giving an insight on how it works in general. Without
trying it myself, are the field values for the collapsed ones also part of
the results data?
What is the latest build that is safe to use on a production environment?
I'd probably go for that and use field collapsing.

Thank you very much.


On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness ubon...@gmail.com wrote:

 The collapsed documents are represented by one master document which can
 be part of the normal search result (the doc list), so pagination just works
 as expected, meaning taking only the returned documents into account (ignoring
 the collapsed ones). As for the scoring, the master document is actually
 the document with the highest score in the collapsed group.

 As for Solr 1.3 compatibility... well... it's very hard to tell. All the latest
 patches are certainly *not* 1.3 compatible (I think they also depend on
 some changes in Lucene which are not available for Solr 1.3). I guess you'll
 have to try some of the old patches, but I'm not sure about their stability.

 cheers,
 Uri


 R. Tan wrote:

 Thanks Uri. How does paging and scoring work when using field collapsing?
 What patch works with 1.3? Is it production ready?

 R


 On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness ubon...@gmail.com wrote:



 The development on this patch is quite active. It works well for a single
 Solr instance, but distributed search (i.e. shards) is not yet supported.
 Using this patch you can group search results based on a specific field.
 There are two flavors of field collapsing: adjacent and non-adjacent. The
 former collapses only documents which happen to be located next to each
 other in the otherwise-non-collapsed result set. The latter (the
 non-adjacent one) collapses all documents with the same field value
 (regardless of their position in the otherwise-non-collapsed result set).
 Note that non-adjacent performs better than the adjacent one. There's
 currently discussion to extend this support so that, in addition to
 collapsing the documents, extra information will be returned for the
 collapsed documents (see the discussion on the issue page).

 Uri
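
The difference between the two flavors described above can be illustrated outside Solr. This is a toy sketch, not the patch's implementation: it assumes the input list is already ranked best-first, and the function names and sample documents are made up for the illustration.

```python
def collapse_adjacent(docs, field):
    """Collapse only runs of neighbouring docs that share the field value."""
    result = []
    for doc in docs:
        if result and result[-1][field] == doc[field]:
            continue  # same value as the previously kept doc: collapse it
        result.append(doc)
    return result


def collapse_non_adjacent(docs, field):
    """Collapse every doc whose field value was already seen anywhere."""
    seen = set()
    result = []
    for doc in docs:
        if doc[field] in seen:
            continue  # value already represented by an earlier doc
        seen.add(doc[field])
        result.append(doc)
    return result


# Assumed ranked result list (best score first).
ranked = [
    {"id": 1, "business": "7-eleven"},
    {"id": 2, "business": "7-eleven"},
    {"id": 3, "business": "circle-k"},
    {"id": 4, "business": "7-eleven"},
]
adjacent_ids = [d["id"] for d in collapse_adjacent(ranked, "business")]
non_adjacent_ids = [d["id"] for d in collapse_non_adjacent(ranked, "business")]
print(adjacent_ids, non_adjacent_ids)
```

Because the list is ranked best-first, the document kept for each group is the highest-scoring one, which matches the description of the master document. In the adjacent case document 4 survives because a different business sits between it and the earlier 7-eleven entries.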


 R. Tan wrote:



 I think this is what I'm looking for. What is the status of this patch?

 On Thu, Sep 3, 2009 at 12:00 PM, R. Tan tanrihae...@gmail.com wrote:





 Hi Solrers,
 I would like to get your opinion on how to best approach a search
 requirement that I have. The scenario is I have a set of business listings
 that may be grouped into one parent business (such as 7-eleven having
 several locations). On the results page, I only want 7-eleven to show up
 once but also show how many locations matched the query (facet filtered by
 state, for example) and maybe a preview of some of the locations.

 Searching for the business name is straightforward but the locations
 within a result are quite tricky. I can do the opposite, searching for the
 locations and faceting on business names, but it will still basically be
 the same thing and repeat results with the same business name.

 Any advice?

 Thanks,
 R


Re: Return 2 fields per facet.. name and id, for example? / facet value search

2009-09-04 Thread R. Tan
Thanks. I guess it will have to be the workaround then.

On Thu, Sep 3, 2009 at 3:34 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 On Fri, Aug 28, 2009 at 12:57 AM, Rihaed Tan tanrihae...@gmail.com
 wrote:

  Hi,
 
  I have a similar requirement to Matthew (from his post 2 years ago). Is
  this
  still the way to go in storing both the ID and name/value for facet
 values?
  I'm planning to use id#name format if this is still the case and doing a
  prefix query. I believe this is a common requirement so I'd appreciate if
  any of you guys can share what's the best way to do it.
 
  Also, I'm indexing the facet values for text search as well. Should the
  field declaration below suffice the requirement?
 
  <field name="category" type="text" indexed="true" stored="true"
  required="true" multiValued="true"/>
 

 There have been talks of having a pair field type in Solr but there is no
 patch yet. So I guess the way proposed by Yonik is a good solution.

 --
 Regards,
 Shalin Shekhar Mangar.



Re: Schema for group/child entity setup

2009-09-04 Thread R. Tan
Hmmm, interesting solution. But, as I've discovered the field collapsing
feature recently (although I haven't tested it), can't it solve this
requirement?

On Sat, Sep 5, 2009 at 1:14 AM, Avlesh Singh avl...@gmail.com wrote:

 Well you are talking about a very relational behavior, Tan.
 You can declare a locations and location_* field in your schema. While
 indexing a document, put all the locations inside the field locations.
 Populate location_state, location_city etc .. with their corresponding
 location values. That way, when no filter is applied, you can facet on the
 locations field to get all the locations. In all other scenarios when a
 filter on field foo is applied, faceting on location_foo will give you
 the desired results.

 Cheers
 Avlesh

 On Fri, Sep 4, 2009 at 10:16 PM, R. Tan tanrihae...@gmail.com wrote:

  I can't because there are facet values for each location, such as
  state/city/neighborhood and facilities. Example result is 7 Eleven, 100
  locations when no location filters are applied, where there is a filter
 for
  state, it should show 7 Eleven, 20 locations.
 
  On Fri, Sep 4, 2009 at 11:57 PM, Aakash Dharmadhikari aaka...@gmail.com
  wrote:
 
   can't you store the locations as part of the parent listing while
  storing.
   This way there would be only one document per parent listing. And all
 the
   locations related information can be multi valued attributes per
 property
   or
   any other way depending on the attributes.
  
   2009/9/3 R. Tan tanrihae...@gmail.com
  
Hi Solrers,
I would like to get your opinion on how to best approach a search
requirement that I have. The scenario is I have a set of business listings
that may be grouped into one parent business (such as 7-eleven having
several locations). On the results page, I only want 7-eleven to show up
once but also show how many locations matched the query (facet filtered by
state, for example) and maybe a preview of some of the locations.

Searching for the business name is straightforward but the locations
within a result are quite tricky. I can do the opposite, searching for the
locations and faceting on business names, but it will still basically be
the same thing and repeat results with the same business name.
   
Any advice?
   
Thanks,
R
   
  
 



Re: Schema for group/child entity setup

2009-09-04 Thread Avlesh Singh
Well you are talking about a very relational behavior, Tan.
You can declare a locations and location_* field in your schema. While
indexing a document, put all the locations inside the field locations.
Populate location_state, location_city etc .. with their corresponding
location values. That way, when no filter is applied, you can facet on the
locations field to get all the locations. In all other scenarios when a
filter on field foo is applied, faceting on location_foo will give you
the desired results.

Cheers
Avlesh
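
The suggestion above could translate into request parameters roughly as follows. This is a minimal sketch, not a tested setup: the sample document values are invented, the helper names are hypothetical, and expressing the filter as an fq on the same location_* field is an assumption about how the filter would be applied.

```python
from urllib.parse import urlencode, parse_qs

# Assumed shape of one indexed parent document (values are illustrative).
sample_doc = {
    "id": "7-eleven",
    "locations": ["loc-1", "loc-2", "loc-3"],
    "location_state": ["CA", "CA", "NY"],
    "location_city": ["San Jose", "San Francisco", "New York"],
}


def facet_field_for(filter_name=None):
    """Facet on `locations` when unfiltered, otherwise on location_<name>."""
    if filter_name is None:
        return "locations"
    return "location_" + filter_name


def build_params(query, filter_name=None, filter_value=None):
    params = [
        ("q", query),
        ("facet", "true"),
        ("facet.field", facet_field_for(filter_name)),
    ]
    if filter_name is not None:
        # Assumption: the filter itself is an fq on the location_* field.
        fq = '{}:"{}"'.format(facet_field_for(filter_name), filter_value)
        params.append(("fq", fq))
    return urlencode(params)


unfiltered = parse_qs(build_params("7 eleven"))
by_state = parse_qs(build_params("7 eleven", "state", "CA"))
print(unfiltered["facet.field"], by_state["facet.field"])
```

Whether the facet counts on a multi-valued location_state field correspond to the number of matching locations depends on how the values are populated, so that part would need checking against real data.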

On Fri, Sep 4, 2009 at 10:16 PM, R. Tan tanrihae...@gmail.com wrote:

 I can't because there are facet values for each location, such as
 state/city/neighborhood and facilities. Example result is 7 Eleven, 100
 locations when no location filters are applied, where there is a filter for
 state, it should show 7 Eleven, 20 locations.

 On Fri, Sep 4, 2009 at 11:57 PM, Aakash Dharmadhikari aaka...@gmail.com
 wrote:

  can't you store the locations as part of the parent listing while
 storing.
  This way there would be only one document per parent listing. And all the
  locations related information can be multi valued attributes per property
  or
  any other way depending on the attributes.
 
  2009/9/3 R. Tan tanrihae...@gmail.com
 
   Hi Solrers,
   I would like to get your opinion on how to best approach a search
   requirement that I have. The scenario is I have a set of business listings
   that may be grouped into one parent business (such as 7-eleven having
   several locations). On the results page, I only want 7-eleven to show up
   once but also show how many locations matched the query (facet filtered by
   state, for example) and maybe a preview of some of the locations.
   
   Searching for the business name is straightforward but the locations
   within a result are quite tricky. I can do the opposite, searching for the
   locations and faceting on business names, but it will still basically be
   the same thing and repeat results with the same business name.
  
   Any advice?
  
   Thanks,
   R
  
 



Re: Schema for group/child entity setup

2009-09-04 Thread Avlesh Singh

 But, as I've discovered the field collapsing feature recently (although I
 haven't tested it), can't it solve this requirement?

Off the top of my head, no. The answer might change on deeper thought. It is
one of the most popular features yet to be incorporated into Solr.

Cheers
Avlesh

On Fri, Sep 4, 2009 at 10:58 PM, R. Tan tanrihae...@gmail.com wrote:

 Hmmm, interesting solution. But, as I've discovered the field collapsing
 feature recently (although I haven't tested it), can't it solve this
 requirement?

 On Sat, Sep 5, 2009 at 1:14 AM, Avlesh Singh avl...@gmail.com wrote:

  Well you are talking about a very relational behavior, Tan.
  You can declare a locations and location_* field in your schema.
 While
  indexing a document, put all the locations inside the field locations.
  Populate location_state, location_city etc .. with their
 corresponding
  location values. That way, when no filter is applied, you can facet on
 the
  locations field to get all the locations. In all other scenarios when a
  filter on field foo is applied, faceting on location_foo will give
 you
  the desired results.
 
  Cheers
  Avlesh
 
  On Fri, Sep 4, 2009 at 10:16 PM, R. Tan tanrihae...@gmail.com wrote:
 
   I can't because there are facet values for each location, such as
   state/city/neighborhood and facilities. Example result is 7 Eleven,
 100
   locations when no location filters are applied, where there is a filter
  for
   state, it should show 7 Eleven, 20 locations.
  
   On Fri, Sep 4, 2009 at 11:57 PM, Aakash Dharmadhikari 
 aaka...@gmail.com
   wrote:
  
can't you store the locations as part of the parent listing while
   storing.
This way there would be only one document per parent listing. And all
  the
locations related information can be multi valued attributes per
  property
or
any other way depending on the attributes.
   
2009/9/3 R. Tan tanrihae...@gmail.com
   
 Hi Solrers,
 I would like to get your opinion on how to best approach a search
 requirement that I have. The scenario is I have a set of business listings
 that may be grouped into one parent business (such as 7-eleven having
 several locations). On the results page, I only want 7-eleven to show up
 once but also show how many locations matched the query (facet filtered by
 state, for example) and maybe a preview of some of the locations.

 Searching for the business name is straightforward but the locations
 within a result are quite tricky. I can do the opposite, searching for the
 locations and faceting on business names, but it will still basically be
 the same thing and repeat results with the same business name.

 Any advice?

 Thanks,
 R

   
  
 



dismax matches ranking

2009-09-04 Thread ram_sj

Hi, I have the following questions about the dismax query handler. Can
someone clarify them for me?

1. dismax query handler and filter query (fq)

With query = coffee and fq = yiw_bus_city: san jose,

I get 0 results, but the same query works fine if I specify qt=standard
(the standard query handler).

2. dismax and ranking

q=san jose

My collection has more documents for San Francisco and fewer for San Jose.

a. San Francisco is sometimes listed, or listed before San Jose. I guess
this is because of the term frequency of "san francisco".

How can I present the results for the exact query match first? For some
reasons I don't want to manually boost the particular keyword. Listing the
exact matches first, followed by the other results, would be good.


configs:

<requestHandler name="dismax" class="solr.SearchHandler" default="true">
<lst name="defaults">
  <str name="defType">dismax</str>
  <str name="echoParams">explicit</str>
  <float name="tie">0.01</float>
  <str name="qf">
yiw_bus_name^1.0 yiw_bus_city^1.0 yiw_bus_ps_info^0.2
yiw_bus_description^0.2 yiw_bus_general_information^0.2 yiw_bus_zip^0.5
yiw_bus_street^0.5
  </str>
  <str name="pf">
yiw_bus_city^1.0 yiw_bus_zip^0.5 yiw_bus_street^0.5
  </str>
  <str name="bf">
ord(yiw_bus_name)^0.5 recip(rord(yiw_bus_city),1,1000,1000)^0.3
  </str>
  <!--
 <str name="fl"></str>
 -->
  <str name="mm">
2&lt;-1 5&lt;-2 6&lt;70%
  </str>
  <int name="ps">100</int>
  <str name="q.alt">*:*</str>
  <!-- example highlighter config, enable per-query with hl=true -->
  <str name="hl.fl"></str>
  <!-- for this field, we want no fragmenting, just highlighting -->
  <str name="f.name.hl.fragsize">0</str>
  <!-- instructs Solr to return the field itself if no query terms are
  found -->
  <str name="f.name.hl.alternateField">yiw_bus_name</str>
  <str name="f.text.hl.fragmenter">regex</str>
  <!-- defined below -->
</lst>
  </requestHandler>

schema:

<field name="yiw_bus_general_information" type="text" indexed="true"
stored="true" default="NA" />
<field name="yiw_bus_ps_info" type="string" indexed="true" stored="true"
default="NA" />
<field name="yiw_bus_city" type="string" indexed="true" stored="true"
multiValued="false" default="NA" />
<field name="yiw_bus_state" type="string" indexed="true" stored="true"
multiValued="false" default="NA" />
<field name="yiw_bus_country" type="string" indexed="true" stored="true"
multiValued="false" default="NA" />
<field name="yiw_bus_street" type="string" indexed="true" stored="true"
multiValued="false" default="NA" />
<field name="yiw_bus_zip" type="string" indexed="true" stored="true"
multiValued="false" default="0" />
 
-- 
View this message in context: 
http://www.nabble.com/dismax-matches---ranking-tp25300011p25300011.html
Sent from the Solr - User mailing list archive at Nabble.com.
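
One thing that may be worth checking for question 1 is how the filter value is written. The schema above declares yiw_bus_city with type string; assuming that type is the usual untokenized string type, each city is indexed as a single exact term, so a value containing a space would need to be quoted and to match the stored case. The sketch below only builds such a request; the helper name dismax_params and the value "San Jose" are assumptions, and it is not a confirmed diagnosis of the 0 results.

```python
from urllib.parse import urlencode, parse_qs


def dismax_params(user_query, city):
    """Build request parameters with the city filter as a quoted phrase."""
    # Escape embedded quotes so the value stays a single phrase.
    safe_city = city.replace('"', '\\"')
    return urlencode([
        ("qt", "dismax"),  # handler name taken from the config above
        ("q", user_query),
        ("fq", 'yiw_bus_city:"{}"'.format(safe_city)),
    ])


params = dismax_params("coffee", "San Jose")
decoded = parse_qs(params)
print(decoded["fq"][0])
```

Without the quotes, only the first word would be matched against yiw_bus_city and the second word would be parsed as a separate clause, which could explain an empty result.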



Re: how to scan dynamic field without specifying each field in query

2009-09-04 Thread gdeconto

Thanks Avlesh.

I was only thinking of something 'like' function queries (since they
appeared to have similar behavior).

Agree that custom QueryParser is looking like my only choice.  Now have to
figure out how to do that :-)



Avlesh Singh wrote:
 
 I don't think you need function queries here. Function queries are
 supposed
 to return score for a document based on their ValueSource. What you
 probably
 need is a custom QueryParser.
 
 

-- 
View this message in context: 
http://www.nabble.com/how-to-scan-dynamic-field-without-specifying-each-field-in-query-tp25280228p25300855.html
Sent from the Solr - User mailing list archive at Nabble.com.



how to create a custom queryparse to handle new functions

2009-09-04 Thread gdeconto

Can someone point me in the general direction of how to create a custom
queryparser that would allow me to create custom query commands like this:

http://localhost:8994/solr/select?q=myfunction(‘Foo’, 3)

or point me towards an example?

note that the actual functionality of myfunction is not defined.  I am just
wondering if this sort of extensibility is possible.
-- 
View this message in context: 
http://www.nabble.com/how-to-create-a-custom-queryparse-to-handle-new-functions-tp25301698p25301698.html
Sent from the Solr - User mailing list archive at Nabble.com.



TermsComponent

2009-09-04 Thread Todd Benge
Hi,

I was looking at TermsComponent in Solr 1.4 as a way of building an
autocomplete function.  I have a prototype working but noticed that terms
that have whitespace in them when indexed are missing the whitespace when
returned from the TermsComponent.

Any ideas on why that may be happening?  Am I just missing a configuration
option?

Thanks,

Todd
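
For reference, a request to the TermsComponent can be assembled as in the sketch below. The component returns terms as they exist in the index, that is, after analysis, so one possible explanation (an assumption, not something confirmed in this thread) is that the field's analyzer alters the whitespace; pointing terms.fl at an untokenized copy of the field would be one way to test that. The host, port, handler path and the field name name_exact are all hypothetical.

```python
from urllib.parse import urlencode, parse_qs


def terms_params(field, prefix, limit=10):
    """Build TermsComponent parameters for a prefix lookup."""
    return urlencode([
        ("terms", "true"),
        ("terms.fl", field),      # field whose indexed terms are listed
        ("terms.prefix", prefix),  # what the user has typed so far
        ("terms.limit", str(limit)),
    ])


# Assumed URL layout; adjust to wherever the component is registered.
url = "http://localhost:8983/solr/terms?" + terms_params("name_exact", "san j")
print(url)
```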


Re: how to create a custom queryparse to handle new functions

2009-09-04 Thread Shalin Shekhar Mangar
On Sat, Sep 5, 2009 at 2:15 AM, gdeconto gerald.deco...@topproducer.com wrote:


 Can someone point me in the general direction of how to create a custom
 queryparser that would allow me to create custom query commands like this:

 http://localhost:8994/solr/select?q=myfunction(‘Foo’, 3)

 or point me towards an example?

 note that the actual functionality of myfunction is not defined.  I am just
 wondering if this sort of extensibility is possible.


You do not need to create a custom query parser for this. You just need to
create a custom function query. Look at one of the existing function queries
in Solr as an example.


-- 
Regards,
Shalin Shekhar Mangar.


Re: capturing field length into a stored document field

2009-09-04 Thread Grant Ingersoll
The Similarity.lengthNorm() is a callback from Lucene that gives you  
the information you seek.  Of course, the trick still is how to use  
that.  Perhaps you can describe a bit more about why you need that  
length.


On Sep 4, 2009, at 11:34 AM, mike.schultz wrote:



For various statistics I collect from an index it's important for me to know
the length (measured in tokens) of a document field.  I can get that
information to some degree from the norms for the field but a) the
resolution isn't that great, and b) more importantly, if boosts are used
it's almost impossible to get lengths from this.

Here are two ideas I was thinking about that maybe someone can comment on.

1) Use copyField to copy the field in question, fieldA, to an additional field,
fieldALength, which has an extra filter that just counts the tokens and only
outputs a token representing the length of the field.  This has the
disadvantage of retokenizing basically the whole document (because the field
in question is basically the body).  Plus I would think littering the term
space with these tokens might be bad for performance, but I'm not sure.

2) Add a filter to the field in question which again counts the tokens.
This filter allows the regular tokens to be indexed as usual but somehow
manages to get the token-count into a stored field of the document.  This
has the advantage of not having to retokenize the field and instead of
littering the token space, the count becomes docdata for each doc.  Can this
be done?  Maybe using threadLocal to temporarily store the count?

Thanks.

--
View this message in context: 
http://www.nabble.com/capturing-field-length-into-a-stored-document-field-tp25297690p25297690.html
Sent from the Solr - User mailing list archive at Nabble.com.



--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Filter query to check for not null filed

2009-09-04 Thread Mohamed Parvez
Say I have 3 fields, named field1, field2 and field3.

I want to query all records that have xxx in field1, and field3 should not
be null.

I tried
1] q=field1:xxx&fq=?
Got an error saying = org.apache.lucene.queryParser.ParseException: Cannot
parse 'title:?': '*' or '?' not allowed as first character in WildcardQuery

2] q=field1:xxx&fq=-x
Got an error saying = org.apache.lucene.queryParser.ParseException: Cannot
parse 'title:-smb': Encountered  - -  at line 1, column 6. Was
expecting one of: ( ... * ... ... ... ... ... [ ... { ... ...


Any suggestions?


Thanks/Regards,
Parvez
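
In Solr query syntax, an open-ended range such as field3:[* TO *] matches documents that have any value in field3, so it can be used as the filter here. The sketch below only builds the request parameters; field1, field3 and xxx are taken from the question, and everything else is illustrative.

```python
from urllib.parse import urlencode, parse_qs

# q selects on field1; fq keeps only documents where field3 has a value.
params = urlencode([
    ("q", "field1:xxx"),
    ("fq", "field3:[* TO *]"),
])
decoded = parse_qs(params)
print(params)
```

urlencode takes care of escaping the brackets, asterisks and spaces, which avoids hand-encoding mistakes in the URL.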


stemming plurals

2009-09-04 Thread Joe Calderon
I saw some posts in the archives from 2008 regarding stemming plurals.
I was wondering if this was ever integrated or if custom hackery is
still needed. Is there something like a stem-plurals analyzer, or is the
KStemmer the closest thing?


thx much
--joe


Re: how to create a custom queryparse to handle new functions

2009-09-04 Thread Avlesh Singh

 You do not need to create a custom query parser for this. You just need to
 create a custom function query. Look at one of the existing function queries
 in Solr as an example.

This is where the need originates from -
http://www.lucidimagination.com/search/document/a4bb0dfee53f7493/how_to_scan_dynamic_field_without_specifying_each_field_in_query

Within the function, the intent is to rewrite the incoming parameter into a
different query. Can this be done? AFAIK, no.

Cheers
Avlesh

On Sat, Sep 5, 2009 at 3:21 AM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 On Sat, Sep 5, 2009 at 2:15 AM, gdeconto gerald.deco...@topproducer.com
 wrote:

 
  Can someone point me in the general direction of how to create a custom
  queryparser that would allow me to create custom query commands like
 this:
 
  http://localhost:8994/solr/select?q=myfunction(‘Foo’,
  3)
 
  or point me towards an example?
 
  note that the actual functionality of myfunction is not defined.  I am
 just
  wondering if this sort of extensibility is possible.
 

 You do not need to create a custom query parser for this. You just need to
 create a custom function query. Look at one of the existing function
 queries
 in Solr as an example.


 --
 Regards,
 Shalin Shekhar Mangar.



Re: Field Collapsing (was Re: Schema for group/child entity setup)

2009-09-04 Thread R. Tan
Is anybody using it on a public site? I would love to see some live examples.

On Sat, Sep 5, 2009 at 12:50 AM, R. Tan tanrihae...@gmail.com wrote:

 Okay. Thanks for giving an insight on how it works in general. Without
 trying it myself, are the field values for the collapsed ones also part of
 the results data?
 What is the latest build that is safe to use on a production environment?
 I'd probably go for that and use field collapsing.

 Thank you very much.


 On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness ubon...@gmail.com wrote:

 The collapsed documents are represented by one master document which can
 be part of the normal search result (the doc list), so pagination just works
 as expected, meaning taking only the returned documents into account (ignoring
 the collapsed ones). As for the scoring, the master document is actually
 the document with the highest score in the collapsed group.

 As for Solr 1.3 compatibility... well... it's very hard to tell. All the
 latest patches are certainly *not* 1.3 compatible (I think they also
 depend on some changes in Lucene which are not available for Solr 1.3). I
 guess you'll have to try some of the old patches, but I'm not sure about
 their stability.

 cheers,
 Uri


 R. Tan wrote:

 Thanks Uri. How does paging and scoring work when using field collapsing?
 What patch works with 1.3? Is it production ready?

 R


 On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness ubon...@gmail.com wrote:



 The development on this patch is quite active. It works well for a single
 Solr instance, but distributed search (i.e. shards) is not yet supported.
 Using this patch you can group search results based on a specific field.
 There are two flavors of field collapsing: adjacent and non-adjacent. The
 former collapses only documents which happen to be located next to each
 other in the otherwise-non-collapsed result set. The latter (the
 non-adjacent one) collapses all documents with the same field value
 (regardless of their position in the otherwise-non-collapsed result set).
 Note that non-adjacent performs better than the adjacent one. There's
 currently discussion to extend this support so that, in addition to
 collapsing the documents, extra information will be returned for the
 collapsed documents (see the discussion on the issue page).

 Uri


 R. Tan wrote:



 I think this is what I'm looking for. What is the status of this patch?

 On Thu, Sep 3, 2009 at 12:00 PM, R. Tan tanrihae...@gmail.com wrote:





 Hi Solrers,
 I would like to get your opinion on how to best approach a search
 requirement that I have. The scenario is I have a set of business listings
 that may be grouped into one parent business (such as 7-eleven having
 several locations). On the results page, I only want 7-eleven to show up
 once but also show how many locations matched the query (facet filtered by
 state, for example) and maybe a preview of some of the locations.

 Searching for the business name is straightforward but the locations
 within a result are quite tricky. I can do the opposite, searching for the
 locations and faceting on business names, but it will still basically be
 the same thing and repeat results with the same business name.

 Any advice?

 Thanks,
 R