Date Search with q query parameter
Hi, I am facing an issue with a date field I have in my records. I am using the q query parameter and passing a string such as test as the search criteria. The query is formed as: column1:test | column2:test | column3:test ... I have one date column, whose name is suffixed with _dt, like column4_dt. When the query is created as column1:test | column2:test | column3:test | column4_dt:test it throws an exception saying Invalid date format. Please suggest how I can prevent this. Thanks, Amit Garg -- View this message in context: http://www.nabble.com/Date-Search-with-q-query-parameter-tp22471072p22471072.html Sent from the Solr - User mailing list archive at Nabble.com.
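A client-side sketch of one way to avoid the problem described above: when fanning a free-text term out across columns, skip any *_dt column unless the term actually parses as a Solr date. This is a hypothetical illustration (the query-building helper is not part of Solr; field names are taken from the thread):

```python
from datetime import datetime

def build_q(term, fields):
    """Build a per-field OR query, skipping *_dt fields when the
    term does not parse as a Solr date (hypothetical client-side fix)."""
    clauses = []
    for f in fields:
        if f.endswith("_dt"):
            try:
                datetime.strptime(term, "%Y-%m-%dT%H:%M:%SZ")
            except ValueError:
                continue  # not a date: leave the date field out entirely
        clauses.append(f"{f}:{term}")
    return " | ".join(clauses)

print(build_q("test", ["column1", "column2", "column3", "column4_dt"]))
# column1:test | column2:test | column3:test
```

With a real date value the date field is kept: `build_q("2007-01-01T00:00:00Z", ["column4_dt"])` yields `column4_dt:2007-01-01T00:00:00Z`.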
Highlighting the searched term in resultset
I was wondering if there is any way of highlighting the searched term in the result set directly, instead of having it as a separate lst element. Doing it through an XSL transformation would be one way. Has anybody implemented a better solution? e.g.
<result name="response" numFound="293" start="0">
  <doc>
    <str name="item_type"><HIGHLIGHTED>iPhone</HIGHLIGHTED></str>
    <str name="keywords">iphone sell buy</str>
    <date name="last_modified">2007-11-20T05:36:29Z</date>
    <date name="releasedate">2007-11-17T06:00:00Z</date>
    <str name="type">ARTICLE</str>
  </doc>
</result>
TIA.
Re: Date Search with q query parameter
Is your final query in this format? col1:[2009-01-01T00:00:00Z+TO+2009-01-01T23:59:59Z] From: dabboo ag...@sapient.com To: solr-user@lucene.apache.org Sent: Thursday, March 12, 2009 12:27:48 AM Subject: Date Search with q query parameter [quoted text snipped]
Solr 1.3 and Solr 1.4 difference?
Hi, What is the exact difference between Solr 1.3 and Solr 1.4 (nightly build as of now)? I heard SolrJ is not part of Solr and performance is great in Solr 1.4. Please tell me exactly what is going to differ in Solr 1.4. If possible, please provide a pointer which describes the same. - Regards, Praveen
Re: Problem using DIH templatetransformer to create uniqueKey: solved
Folks, TemplateTransformer will fail to return a row if a variable is undefined; however, the RegexTransformer does still return. So where the following would fail:
<field column="id" template="${jc.fileAbsolutePath}${x.vurl}" />
this can be used instead:
<field column="id" regex="(.*)" replaceWith="$1${x.vurl}" sourceColName="fileAbsolutePath" />
So I guess we have the best of both worlds! Fergus. Hmmm. Just gave that a go! No luck. But how many layers of defaults do we need? Rgds Fergus. What about having the TemplateTransformer support ${field:default} syntax? I'm assuming it doesn't support that currently, right? The replace stuff in the config files does, though. Erik On Feb 13, 2009, at 8:17 AM, Fergus McMenemie wrote: Paul, Following up your usenet suggestion:
<field column="id" template="${jc.fileAbsolutePath}${x.vurl}" ignoreMissingVariables="true"/>
and to add more to what I was thinking... if the field is undefined in the input document, but schema.xml does allow a default value, then TemplateTransformer can use the default value. If there is no default value defined in schema.xml then it can fail as at present. This would allow this, or any other value, to be fed into TemplateTransformer, and still enable avoidance of the partial strings you referred to. Regards Fergus. Hello, TemplateTransformer behaves rather ungracefully if one of the replacement fields is missing. Looking at TemplateString.java I see that, left to itself, fillTokens would replace a missing variable with an empty string. It is an extra check in TemplateTransformer that is throwing the warning and stopping the row being returned. Commenting out the check seems to solve my problem. Having done this, an undefined replacement string in TemplateTransformer is replaced with an empty string. However, a neater fix would probably involve making use of the default value which can be assigned to a row in schema.xml. I am parsing a single XML document into multiple separate Solr documents.
It turns out that none of the source document's fields can be used to create a uniqueKey alone. I need to combine two, using TemplateTransformer as follows:
<entity name="x" dataSource="myfilereader" processor="XPathEntityProcessor"
        url="${jc.fileAbsolutePath}" rootEntity="true" stream="false"
        forEach="/record | /record/mediaBlock"
        transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer">
  <field column="fileAbsolutePath" template="${jc.fileAbsolutePath}" />
  <field column="fileWebPath" regex="${dataimporter.request.installdir}(.*)" replaceWith="/ford$1" sourceColName="fileAbsolutePath"/>
  <field column="id" template="${jc.fileAbsolutePath}${x.vurl}" />
  <field column="vurl" xpath="/record/mediaBlock/mediaObject/@vurl" />
</entity>
The trouble is that vurl is only defined as a child of /record/mediaBlock, so my attempt to create id, the uniqueKey, fails for the parent document /record. I am hacking around with TemplateTransformer.java to sort this but was wondering if there was a good reason for this behavior. -- === Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer ===
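For reference, the regex/replaceWith behaviour this thread leans on can be sketched in a few lines. This is only a toy model of DIH's RegexTransformer semantics (apply a regex to the source column, substituting $1 with the first capture group); the paths below are invented:

```python
import re

def regex_transform(value, regex, replace_with):
    # Toy model of DIH RegexTransformer: apply the regex once to the
    # source column's value, with $1 standing for capture group 1.
    return re.sub(regex, replace_with.replace("$1", r"\1"), value, count=1)

# Analogous to: regex="${dataimporter.request.installdir}(.*)"
#               replaceWith="/ford$1"  (installdir made up here)
path = "/opt/solr/data/docs/record1.xml"
print(regex_transform(path, r"/opt/solr(.*)", "/ford$1"))
# /ford/data/docs/record1.xml
```

Note `count=1`: without it, Python's `re.sub` with a pattern like `(.*)` would also match the empty string at the end and apply the replacement twice.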
Re: Solr 1.3 and Solr 1.4 difference?
here is the exhaustive list of all changes in 1.4 http://svn.apache.org/repos/asf/lucene/solr/trunk/CHANGES.txt On Thu, Mar 12, 2009 at 3:29 PM, Praveen Kumar Jayaram praveen198...@gmail.com wrote: [quoted text snipped] -- --Noble Paul
Re: Date Search with q query parameter
On Thu, Mar 12, 2009 at 4:39 PM, dabboo ag...@sapient.com wrote: Hi, I am able to rectify that exception, but now what I am looking for is: how can I pass a value to the date field to search for records with a specific date value? e.g. I want to retrieve all the records of Jan 01, 2007. How will I pass the value with the column name? If I pass the value it throws an exception saying that it is expecting TO. The format for a range search is your_date_field:[minDate TO maxDate] and for a normal term query it is your_date_field:the_date. Each of the dates should be in the format described in the example schema.xml. -- Regards, Shalin Shekhar Mangar.
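The two query shapes above can be generated mechanically. A small sketch (the field name comes from this thread; the helper functions are illustrative, not Solr APIs) that formats a whole day as the kind of range query described:

```python
from datetime import datetime

def solr_date(dt):
    # Solr dates are full ISO-8601 UTC with a trailing 'Z'
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")

def day_range_query(field, year, month, day):
    """Range query covering one calendar day, as in
    field:[minDate TO maxDate]."""
    start = datetime(year, month, day, 0, 0, 0)
    end = datetime(year, month, day, 23, 59, 59)
    return f"{field}:[{solr_date(start)} TO {solr_date(end)}]"

print(day_range_query("column4_dt", 2007, 1, 1))
# column4_dt:[2007-01-01T00:00:00Z TO 2007-01-01T23:59:59Z]
```

When this goes into a URL, the colons in the field value also need escaping (or the whole thing URL-encoded), which is a frequent source of the "Invalid Date String" errors seen later in this thread.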
Re: Date Search with q query parameter
Hi, The date range query is working fine for me. This is the query I entered: q=productPublicationDate_product_dt:1993-02-01T12:00:00Z&version=2.2&start=0&rows=10&indent=on&qt=dismaxrequest It threw this exception: type Status report message Invalid Date String:'1993-02-01t12' description The request sent by the client was syntactically incorrect (Invalid Date String:'1993-02-01t12'). thanks, Amit Garg Shalin Shekhar Mangar wrote: [quoted text snipped] -- View this message in context: http://www.nabble.com/Date-Search-with-q-query-parameter-tp22471072p22474608.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Date Search with q query parameter
Hi, I am able to rectify that exception, but now what I am looking for is: how can I pass a value to the date field to search for records with a specific date value? e.g. I want to retrieve all the records of Jan 01, 2007. How will I pass the value with the column name? If I pass the value it throws an exception saying that it is expecting TO. Please suggest. thanks, Amit Garg Venu Mittal wrote: [quoted text snipped]
Re: Tomcat holding deleted snapshots until it's restarted
The old IndexSearcher is being closed correctly: 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.core.SolrCore - [core_01] Registered new searcher searc...@c6692 main 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.search.SolrIndexSearcher - Closing searc...@1c5cd7 main hossman wrote: : If the problem is not there, the other thing that comes to my mind is : lucene2.9-dev... maybe there's a problem closing indexWriter?... obviously : it's just a thought. you never answered yonik's question about whether you see any Closing Searcher messages in your log. also it's useful to know what you see in the CORE section when you look at stats.jsp ... typically the main searcher is listed there twice, but during warming you'll see the old searcher as well ... if older searchers aren't getting closed for some reason, they should be listed there. i'd start by confirming/ruling out the old searchers before speculating about the indexwriter or other problems. : On a quiet system, you should see the original searcher closed right : after the new searcher is registered. : : Example: : Mar 11, 2009 2:22:25 PM org.apache.solr.core.SolrCore registerSearcher : INFO: [] Registered new searcher searc...@1f1cbf6 main : Mar 11, 2009 2:22:25 PM org.apache.solr.search.SolrIndexSearcher close : INFO: Closing searc...@acdd02 main -Hoss
Re: Custom path for solr lib and data folder
Hoss, Assume my current working directory is C:/MyApplication/searchApp and in solr.xml I am specifying C:/lib as the shared lib; then the console output contains the following line: INFO: loading shared library: C:\MyApplication\searchApp\C:\lib Thanks con hossman wrote: : But how can i redirect solr to a separate lib directory that is outside of : the solr.home : : Is this possible in solr 1.3 : : I don't believe it is possible (but please correct me if I'm wrong). From : SolrResourceLoader: : : log.info("Solr home set to '" + this.instanceDir + "'"); : this.classLoader = createClassLoader(new File(this.instanceDir + "lib/"), parent); : : So only a lib/ under the Solr home directory is used. It would be nice... that's the lib directory specific to the core (hence it's relative to the instanceDir). In con's original post he was claiming to have problems getting solr.xml's sharedLib option to point to an absolute path ... this should work fine. con: when your solr.xml is loaded, you should see an INFO message starting with loading shared library:... -- what path is listed on that line? your sharedLib=%COMMON_LIB% example won't work (for the reasons Noble mentioned) but your sharedLib=C:\lib should work (assuming that path exists), and then immediately following the log message i mentioned above, you should see INFO messages like... Adding file:///...foo.jar to Solr classloader ...for each jar in that directory. if there are none, or the directory can't be found, you might see Reusing parent classloader or Can't construct solr lib class loader messages instead. what do you see in your logs? -Hoss
How to remove stemming from the analyzer - Finding blah when searching for blah*
Hi, I am trying to disable stemming in the analyzer, but I am not sure how to do it. For instance, I have a field that contains blah, but when I search for blah* it cannot find it, whereas if I search for bla* it does. I was using the text field type from the example schema.xml. How should I modify it so that stemming is not done and I can find blah when I search for blah*?
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal. add enablePositionIncrements=true in both the index and query analyzers to leave a 'gap' for more accurate phrase queries. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
I have tried using the textTight type to no avail.
Most of the fields in my documents have this structure: DOC1 field gene name: brca2 DOC2 field gene name: brca23 If I searched for brca2* I would like to find both documents. My field values normally contain colons ':', which should be treated as stop characters. Thank you in advance, Bruno
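One common reason wildcard searches behave this way (offered as a hedged aside, not a confirmed diagnosis of this particular setup) is that wildcard terms are not analyzed at query time, while indexed tokens are. If index-time analysis (stemming, lowercasing) changes a token, a literal prefix taken from the original word can miss it. A toy simulation with a made-up stemmer:

```python
def toy_stem(token):
    # Made-up stand-in for a Porter-style stemmer (illustration only)
    return token[:-3] if token.endswith("ing") else token

def index_tokens(text):
    # Index-time analysis: lowercase then stem every token
    return [toy_stem(t.lower()) for t in text.split()]

def prefix_match(prefix, tokens):
    # Wildcard/prefix terms are NOT analyzed: matched literally
    # (here we at least lowercase; Solr historically did not even do that)
    return any(t.startswith(prefix.lower()) for t in tokens)

tokens = index_tokens("Searching genes")
print(tokens)                              # ['search', 'genes']
print(prefix_match("searching", tokens))   # False: the suffix was stemmed away
print(prefix_match("search", tokens))      # True
```

The same logic explains why a shorter prefix (bla*) can match while the full word with a wildcard (blah*) does not, whenever the indexed token differs from the surface form.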
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
Remove the EnglishPorterFilterFactory from your text analyzer configuration (both index and query sides). And reindex all documents. Erik On Mar 12, 2009, at 8:28 AM, Bruno Aranda wrote: [quoted text snipped]
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
Thanks for your answer, I am trying now with this custom text field:
<fieldType name="textIntact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" expand="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
And still it does not find blah when using the wildcard and searching for blah*. Am I missing something? Thanks, Bruno 2009/3/12 Erik Hatcher e...@ehatchersolutions.com [quoted text snipped]
How to correctly boost results in Solr Dismax query
Hi, I have managed to build an index in Solr which I can search on keyword, produce facets, query facets etc. This is all working great. I have implemented my search using a dismax query so it searches predetermined fields. However, my results are coming back sorted by score, which appears to be calculated by keyword relevancy only. I would like to adjust the score where fields have pre-determined values. I think I can do this with boost queries and boost functions, but the documentation here: http://wiki.apache.org/solr/DisMaxRequestHandler#head-6862070cf279d9a09bdab971309135c7aea22fb3 is not particularly helpful. I tried adding a bq argument to my search: bq=media:DVD^2 (yes, this is an index of films!) but I find when I start adding more and more: bq=media:DVD^2&bq=media:BLU-RAY^1.5 the negative results - e.g. films that are DVD but are not BLU-RAY - get negatively affected in their score. In the end it all seems to even out and my score is as it was before I started boosting. I must be doing this wrong, and I wonder whether boost functions come in somewhere. Any ideas on how to correctly use boost? Cheers, Pete -- Pete Smith Developer No.9 | 6 Portal Way | London | W3 6RU | T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111 LOVEFiLM.com
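One easy mistake with repeated parameters is concatenating them into one string; each bq should be sent as a separate parameter. A small sketch of building such a request query string (the q value and the exact handler setup are invented for illustration; the bq values come from the message above):

```python
from urllib.parse import urlencode

# A list of tuples lets a parameter repeat, which is how multiple
# bq clauses are expressed in a dismax request.
params = [
    ("q", "star wars"),
    ("qt", "dismax"),
    ("bq", "media:DVD^2"),
    ("bq", "media:BLU-RAY^1.5"),
]
query_string = urlencode(params)
print(query_string)
# q=star+wars&qt=dismax&bq=media%3ADVD%5E2&bq=media%3ABLU-RAY%5E1.5
```

Note the ':' and '^' are percent-encoded by urlencode, which also sidesteps shell-quoting issues when testing with curl.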
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
What is the full query you're issuing to Solr and the corresponding request handler configuration? Chances are you're using the dismax query parser, which does not support wildcards. Other things to check: be sure you've tied the field to your new textIntact type, and that you're searching that field (see defaultField in schema.xml). Try something like /solr/select?q=field_name:blah* Erik On Mar 12, 2009, at 9:09 AM, Bruno Aranda wrote: [quoted text snipped]
RE: Combination of EmbeddedSolrServer and CommonHttpSolrServer
Hi Shalin Shekhar Mangar, Thanks for your inputs. Please see my comments below. I wish to know if there is any user who has used EmbeddedSolrServer for indexing and CommonsHttpSolrServer for search. I have found that this combination offers better performance for indexing. Searching becomes flexible as you can search from a greater number of HTTP clients simultaneously. Does anyone have any related performance data? Thanks, Ajit -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, March 11, 2009 7:24 PM To: solr-user@lucene.apache.org Subject: Re: Combination of EmbeddedSolrServer and CommonHttpSolrServer On Wed, Mar 11, 2009 at 6:37 PM, Kulkarni, Ajit Kamalakar ajkulka...@ptc.com wrote: If we index the documents using CommonsHttpSolrServer and search using the same, we get the updated results. That means we can search the latest added document even if it is not committed to the file system. That is not possible. Without calling commit, new documents will not be visible to a searcher. Ajit: When I tested using CommonsHttpSolrServer for indexing as well as searching, I could search the latest added document through the Solr admin page. I could also search the document through CommonsHttpSolrServer without explicitly calling commit. I am even more surprised to see the same result by using EmbeddedSolrServer for indexing and CommonsHttpSolrServer for searching. I used embeddedSolrServer = new EmbeddedSolrServer(SolrCore.getSolrCore()); which is a deprecated API.
For this I did not need to call commit on CommonsHttpSolrServer to get the latest document searched, either on the Solr admin page or programmatically through CommonsHttpSolrServer. However, if I use
CoreContainer multicore = new CoreContainer();
File home = new File( getSolrHome() );
File f = new File( home, "solr.xml" );
multicore.load( getSolrHome(), f );
embeddedSolrServer = new EmbeddedSolrServer( multicore, SolrIndexConstants.DEFAULT_CORE );
I had to call commit on CommonsHttpSolrServer to search the latest added documents, and the document was available through the Solr admin page only when I programmatically searched after calling commit on CommonsHttpSolrServer. This is consistent with what you mentioned above. So it looks like there is some kind of cache that is used by both the index and search logic inside Solr for a given SolrServer component (e.g. CommonsHttpSolrServer, EmbeddedSolrServer). Indexing does not create any cache. The caching is done only by the searcher. The old searcher/cache is discarded and a new searcher/cache is created when you call commit. Setting autowarmCount on the caches in solrconfig.xml makes the new searcher run some of the most recently used queries on the old searcher to warm up the new cache. Calling commit on the SolrServer to sync with the index data may not be a good option, as I suppose it to be an expensive operation. It is the only option. But you may be able to make the operation cheaper by tweaking the autowarmCount on the caches (this is specified in solrconfig.xml). However, caches are important for good search performance. Depending on your search traffic, you'll need to find a sweet spot. The cache and hard disk data synchronization should be independent of the SolrServer instances managed by the Solr web application inside Tomcat. SolrServer is not really a server in itself. It is (a pointer to?) a server being used by a solrj client. The CommonsHttpSolrServer refers to a remote server URL and makes calls through HTTP.
SolrCore is the internal class which manages the state of the server. A SolrCore is created by the Solr webapp. When you create another SolrCore for use by EmbeddedSolrServer, they do not know about each other. Therefore you need to notify it if you change the index through another core. Ajit: If the same JVM is managing the responding searchers for EmbeddedSolrServer as well as CommonsHttpSolrServer, then why can't the responding searcher be the same? I understand that the EmbeddedSolrServer and CommonsHttpSolrServer clients are separate, but if the searchers are managed in the same JVM, theoretically we should be able to make a singleton searcher attached to every kind of SolrServer. This searcher should be a listener for the indexer. Since searching is a read operation, there won't be any threading or scalability issue, but there should be a single indexer. Since I don't have enough knowledge about Solr and Lucene, I may be totally wrong! The issue still will be that EmbeddedSolrServer may directly access the index data on disk, as it may bypass the Solr web app totally. I am embedding Tomcat in my RMI server. The RMI server is going to use EmbeddedSolrServer and it also hosts the Solr webapp inside its Tomcat instance. So I guess I should be able to manage a singleton cache that is given to both,
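The commit-visibility rule discussed above (new documents are invisible to searchers until a commit opens a new one) can be modelled with a toy class. This is purely an illustration of the semantics, not of Solr internals:

```python
class ToyIndex:
    """Toy model of Solr commit semantics: a searcher sees a
    point-in-time snapshot; documents become visible only after commit()."""

    def __init__(self):
        self._pending = []   # added but not yet committed
        self._visible = []   # what the current "searcher" can see

    def add(self, doc):
        self._pending.append(doc)

    def commit(self):
        # Models opening a new searcher over all committed documents
        self._visible.extend(self._pending)
        self._pending = []

    def search(self, term):
        return [d for d in self._visible if term in d]

ix = ToyIndex()
ix.add("solr rocks")
print(ix.search("solr"))  # []  -- not committed yet
ix.commit()
print(ix.search("solr"))  # ['solr rocks']
```

This is why indexing through one SolrCore and searching through another requires a commit (and a notification) before the second core's searcher can see the change.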
Operators and Minimum Match with Dismax handler
Hi All, I have a question regarding the dismax handler and minimum match (mm=). I have an index where we are setting the default operator to AND. Am I right in saying that, using the dismax handler, the default operator in the schema file is effectively ignored? (This is the conclusion I've reached from my own testing.) So I have set the mm value to 100%. The issue I have with this is that if I want to include an OR in my phrase, it is effectively ignored. The parser still tries to match 100% of the search terms, e.g. 'lucene OR query' still only finds matches for 'lucene AND query'; the parsed query is: +(((drug_name:lucen) (drug_name:queri))~2) () I know I could programmatically set mm=0 if my phrase contains certain keywords, however this would get very complicated with more terms in the phrase (I'd have to preserve/inject operators to keep my default), and I assume I would effectively be duplicating what the dismax handler does for the most part already. Does anyone have any advice as to how I could deal with this kind of problem? Thanks Waseem
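The ~2 in the parsed query above is the minimum-should-match clause count. A simplified sketch of how an mm percentage turns into a required clause count and decides whether a document matches (the real dismax mm spec supports richer conditional expressions; this is a deliberately reduced toy):

```python
def clauses_required(mm_percent, num_clauses):
    # Simplified dismax-style minimum-match: the fraction of optional
    # clauses that must match, truncated to a whole number of clauses.
    return int(num_clauses * mm_percent / 100)

def matches(doc_terms, query_terms, mm_percent):
    matched = sum(1 for t in query_terms if t in doc_terms)
    return matched >= clauses_required(mm_percent, len(query_terms))

doc = {"lucene", "index"}
print(matches(doc, ["lucene", "query"], 100))  # False: only 1 of 2 clauses match
print(matches(doc, ["lucene", "query"], 50))   # True: 1 of 2 is enough
```

This also shows why explicit OR operators inside q have no effect here: with mm=100% every optional clause is still required, regardless of the operators typed by the user.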
Re: Solr 1.3; Data Import w/ Dynamic Fields
I was successful at distributing the Solr 1.4-dev data import functionality within the Solr 1.3 war. 1. Copied the data import's src directory from 1.4 to 1.3. 2. Made sure to use the data import's build.xml already existing in Solr 1.3. 3. Commented out all code within the SolrWriter#rollback method. 4. Commented out the following import statement from SolrWriter: import org.apache.solr.update.RollbackUpdateCommand; 5. Copied the required libraries for logging from 1.4/lib to 1.3/lib: slf4j-api-1.5.5.jar slf4j-jdk14-1.5.5.jar I was planning on replacing the Solr 1.4 logging scheme with the style in Solr 1.3, but that was unnecessary work. Continuing my testing with this customized distribution. Thanks again, Wesley. On 3/11/09 6:35 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Mar 11, 2009 at 4:01 PM, Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com wrote: I guess you can take the trunk and comment out the contents of SolrWriter#rollback() and it should work with Solr 1.3. I agree. Rollback is the only feature which depends on enhancements in the Solr/Lucene libraries. So if you remove this feature, everything else should work fine with 1.3. -- Regards, Shalin Shekhar Mangar.
Is wiki page still accurate
Folks, Is the section titled Full Import Example on http://wiki.apache.org/solr/DataImportHandler still accurate? The steps referring to the example-solr-home.jar and the SOLR-469 patch seem out of date with where 1.4 is today. It seems like the example-DIH stuff is a simpler/more direct example? Eric - Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com Free/Busy: http://tinyurl.com/eric-cal
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
Thanks again. This is the default request handler: requestHandler name=standard class=solr.SearchHandler default=true !-- default values for query parameters -- lst name=defaults str name=echoParamsexplicit/str /lst /requestHandler Doing this query: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh Finds 1 result. The term Nefh is found in the field mitab. Doing: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh* Finds nothing. I have realised that Ne* or Nef* do not return results either, using the textIntact type... Thank you, Bruno 2009/3/12 Erik Hatcher e...@ehatchersolutions.com What is the full query you're issuing to Solr and the corresponding request handler configuration? Chances are you're using the dismax query parser, which does not support wildcards. Other things to check: be sure you've tied the field to your new textIntact type, and that you're searching that field (see defaultField in schema.xml). Try something like /solr/select?q=field_name:blah* Erik On Mar 12, 2009, at 9:09 AM, Bruno Aranda wrote: Thanks for your answer, I am trying now with this custom text field: fieldType name=textIntact class=solr.TextField positionIncrementGap=100 analyzer tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=0 catenateWords=0 catenateNumbers=0 catenateAll=0 expand=0 splitOnCaseChange=0/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType And still it does not find blah when using the wildcard and searching for blah*. Am I missing something? Thanks, Bruno 2009/3/12 Erik Hatcher e...@ehatchersolutions.com Remove the EnglishPorterFilterFactory from your text analyzer configuration (both index and query sides), and reindex all documents.
Erik On Mar 12, 2009, at 8:28 AM, Bruno Aranda wrote: Hi, I am trying to disable stemming from the analyzer, but I am not sure how to do it. For instance, I have a field that contains blah, but when I search for blah* it cannot find it, whereas if I search for bla* it does. I was using the text type field, from the example schema.xml. How should I modify it so that stemming is not done and I can find blah when I search for blah*? fieldType name=text class=solr.TextField positionIncrementGap=100 analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ !-- in this example, we will only use synonyms at query time filter class=solr.SynonymFilterFactory synonyms=index_synonyms.txt ignoreCase=true expand=false/ -- !-- Case insensitive stop word removal. add enablePositionIncrements=true in both the index and query analyzers to leave a 'gap' for more accurate phrase queries. -- filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true / filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.EnglishPorterFilterFactory protected=protwords.txt/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true / filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=0 catenateNumbers=0 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.EnglishPorterFilterFactory protected=protwords.txt/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType I have tried using the textTight type to no avail. 
Most of the fields in my documents have this structure: DOC1 field gene name:brca2 DOC2 field gene name:brca23 If I searched for brca2* I would like to find both documents. My field values normally contain colons ':' that should be used as stop words. Thank you in advance, Bruno
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
On Mar 12, 2009, at 10:47 AM, Bruno Aranda wrote: Doing this query: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh Finds 1 result. The term Nefh is found in the field mitab. Doing: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh* Finds nothing. I have realised that Ne* or Nef* do not return results either, using the textIntact type... Ah... the problem is that wildcarded query terms do not get analyzed, nor do they get lowercased (there is an open issue for Solr to at least make lowercasing configurable; Lucene supports it). Try lowercasing in your query client; that should do the trick. Erik
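Since wildcard terms bypass analysis, the lowercasing has to happen in the client, as Erik suggests. A minimal sketch (a hypothetical helper, assuming simple whitespace-separated field:term tokens):

```python
def lowercase_wildcards(query):
    """Lowercase only the terms that carry a wildcard, because Solr will
    not run them through the analyzer, so the index-side LowerCaseFilter
    never gets a chance to normalize them."""
    out = []
    for token in query.split():
        if "*" in token or "?" in token:
            field, sep, term = token.rpartition(":")
            out.append(field + sep + term.lower())  # keep the field name intact
        else:
            out.append(token)
    return " ".join(out)
```

So mitab:Nefh* would be sent as mitab:nefh*, which matches the lowercased terms in the index.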
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
Thank you! Next time I will remember not to change the words to make the example simpler... blah is not the same as Nefh :-) Thanks, Bruno 2009/3/12 Erik Hatcher e...@ehatchersolutions.com On Mar 12, 2009, at 10:47 AM, Bruno Aranda wrote: Doing this query: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh Finds 1 result. The term Nefh is found in the field mitab. Doing: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh* Finds nothing. I have realised that Ne* or Nef* do not return results either, using the textIntact type... Ah... the problem is that wildcarded query terms do not get analyzed, nor do they get lowercased (there is an open issue for Solr to at least make lowercasing configurable; Lucene supports it). Try lowercasing in your query client; that should do the trick. Erik
Programmatic access to other handlers
Hi, I've designed a front handler that will send requests to other handlers and return an aggregated response. Inside this handler, I call other handlers like this (inside the method handleRequestBody): SolrCore core = req.getCore(); SolrRequestHandler mlt = core.getRequestHandler(/mlt); ModifiableSolrParams params = new ModifiableSolrParams(req.getParams()); params.set(mlt.fl, nFullText); req.setParams(params); mlt.handleRequest(req, rsp); First question: is this the recommended way to call another handler? Second question: how could I call a handler of another core? -- View this message in context: http://www.nabble.com/Programmatic-access-to-other-handlers-tp22477731p22477731.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Tomcat holding deleted snapshots until it's restarted
I have noticed that the first time I execute a full import (having an old index in the index folder), once it is done, the old IndexSearcher will be closed: 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.core.SolrCore - [core_01] Registered new searcher searc...@c6692 main 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.search.SolrIndexSearcher - Closing searc...@1c5cd7 The problem is that if I do another full-import... the old searcher will not be closed; there will just appear the line: 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.core.SolrCore - [core_01] Registered new searcher searc...@c6692 main If I keep doing full-imports the old searchers will never be closed. It seems that they are only closed after the first full import... Does it mean anything to anyone? Marc Sturlese wrote: The old IndexSearcher is being closed correctly: 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.core.SolrCore - [core_01] Registered new searcher searc...@c6692 main 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.search.SolrIndexSearcher - Closing searc...@1c5cd7 main hossman wrote: : If the problem is not there the other thing that comes to my mind is : lucene2.9-dev... maybe there's a problem closing indexWriter?... obviously : it's just a thought. you never answered Yonik's question about whether you see any Closing Searcher messages in your log, also it's useful to know what you see in the CORE section when you look at stats.jsp ... typically the main searcher is listed there twice, but during warming you'll see the old searcher as well ... if older searchers aren't getting closed for some reason, they should be listed there. i'd start by confirming/ruling out the old searchers before speculating about the indexwriter or other problems. : On a quiet system, you should see the original searcher closed right : after the new searcher is registered.
: : Example: : Mar 11, 2009 2:22:25 PM org.apache.solr.core.SolrCore registerSearcher : INFO: [] Registered new searcher searc...@1f1cbf6 main : Mar 11, 2009 2:22:25 PM org.apache.solr.search.SolrIndexSearcher close : INFO: Closing searc...@acdd02 main -Hoss -- View this message in context: http://www.nabble.com/Tomcat-holding-deleted-snapshots-until-it%27s-restarted-tp22451252p22478204.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr 1.4: filter documents using fields
Hi all! I'm using StandardRequestHandler and I want to filter results by two fields in order to avoid duplicate results (in this case the documents are very similar, with differences only in fields that are not returned in the query response). For example, considering the response: doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3142/long str name=topologynameLocais/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3141/long str name=topologynameinventario/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3141/long str name=topologynameinventario/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3140/long str name=topologynameCPE/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3140/long str name=topologynameCPE/str /doc I want to filter by instancekey and topologyid in order to get the following response: doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3142/long str name=topologynameLocais/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3141/long str name=topologynameinventario/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3140/long str name=topologynameCPE/str /doc I managed to do the filtering in the client, but then the paging doesn't work as it should (some pages may contain more duplicated results than others). Is there a way (a query or another RequestHandler) to do this? Thanks, Rui Pereira
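For reference, the client-side dedupe itself is simple — the paging problem the poster describes is the real obstacle, since the server doesn't know how many duplicates each page hides. A sketch of the collapse (plain Python over the response docs, not a Solr feature):

```python
def dedupe(docs, keys=("instancekey", "topologyid")):
    """Keep the first doc seen for each (instancekey, topologyid)
    pair, preserving the response order."""
    seen, unique = set(), []
    for doc in docs:
        k = tuple(doc[key] for key in keys)
        if k not in seen:
            seen.add(k)
            unique.append(doc)
    return unique
```

Doing this server-side so paging stays correct needs collapsing support in Solr itself; as far as I know that was only available as the uncommitted field-collapsing patch in this era, not in the stock StandardRequestHandler.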
Re: Is wiki page still accurate
On Thu, Mar 12, 2009 at 8:05 PM, Eric Pugh ep...@opensourceconnections.comwrote: Folks, Is this section title Full Import Example on http://wiki.apache.org/solr/DataImportHandler still accurate? The steps referring to the example-solr-home.jar and the SOLR-469 patch seem out of date with where 1.4 is today? Seems like the example-DIH stuff is simpler/more direct example??? Yikes! I'll fix it. -- Regards, Shalin Shekhar Mangar.
RE: Replication in 1.3
Just so I'm clear on it, do you mean Windows replication via Cygwin is not supported or not possible? If it's possible, I'm just curious if anyone else on the list has experience with it. Thanks, Laurent -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Wednesday, March 11, 2009 5:03 PM To: solr-user@lucene.apache.org Subject: Re: Replication in 1.3 On Wed, Mar 11, 2009 at 1:29 PM, Vauthrin, Laurent laurent.vauth...@disney.com wrote: I'm hoping to use Solr version 1.4 but in the meantime I'm trying to get replication to work in version 1.3. I'm running Tomcat as a Windows service and have Cygwin installed. The rsync method of replication is not supported under Windows (due to differing OS/filesystem semantics). The Java-based synchronization in Solr 1.4 does support Windows though. -Yonik http://www.lucidimagination.com
Re: Is wiki page still accurate
On Thu, Mar 12, 2009 at 10:04 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Thu, Mar 12, 2009 at 8:05 PM, Eric Pugh ep...@opensourceconnections.com wrote: Folks, Is this section title Full Import Example on http://wiki.apache.org/solr/DataImportHandler still accurate? The steps referring to the example-solr-home.jar and the SOLR-469 patch seem out of date with where 1.4 is today? Seems like the example-DIH stuff is simpler/more direct example??? Yikes! I'll fix it. I've updated the instructions. Thanks for reporting this, Eric. -- Regards, Shalin Shekhar Mangar.
Re: Replication in 1.3
On Thu, Mar 12, 2009 at 12:34 PM, Vauthrin, Laurent laurent.vauth...@disney.com wrote: Just so I'm clear on it, do you mean Windows replication via Cygwin is not supported or not possible? Not really possible - the strategy the scripts use won't work on Windows because of the different filesystem semantics. Things like the fact that you can make a hard link, but you can't move or delete any of the links to an open file like you can with UNIX. -Yonik http://www.lucidimagination.com
Re: Tomcat holding deleted snapshots until it's restarted
: I have noticed that the first time I execute full import (having an old index : in the index folder) once it is done, the old indexsearcher will be closed: ... : The problem is that if I do another full-import... the old searcher will not : be closed, there will just appear the line: ... : If I keep doing full-imports the old searchers will never be closed. seems : that they are just closed in the first full import... : Does it mean something to anyone? Hmmm... sounds like maybe DIH is triggering something weird. Just to clarify: a) what does the stats page show (in terms of the number of Searchers listed in the CORE section) after a couple of full imports? b) can you reproduce this doing full builds even with replication disabled? c) can you reproduce this using the example DIH configs? -Hoss
Re: Tomcat holding deleted snapshots until it's restarted
: Just to clarify: : a) what does the stats page show (in terms of the number of : Searchers listed in the CORE section) after a couple of full imports? After 4 full-imports it will show 3 IndexSearchers. I have also printed the var _searchers from SolrCore.java and it shows me 3 IndexSearchers. : b) can you reproduce this doing full builds even with replication : disabled? I have replication disabled. I use Solr collection distribution, but for all these tests I am not even using that. I just use one machine and index there. : c) can you reproduce this using the example DIH configs? My configs look really similar to the defaults. I get data from a MySQL database in data-config.xml. Solrconfig.xml has the caches and warming settings the same as the defaults. I have disabled the SolrDeletionPolicy stuff (and replication as well). I have checked the official 1.3 release and I have seen that DirectUpdateHandler2.java is quite different from the one in the nightlies. In the commit method... 1.3 calls a closeSearcher function: public void commit(CommitUpdateCommand cmd) throws IOException { if (cmd.optimize) { optimizeCommands.incrementAndGet(); } else { commitCommands.incrementAndGet(); } Future[] waitSearcher = null; if (cmd.waitSearcher) { waitSearcher = new Future[1]; } boolean error=true; iwCommit.lock(); try { log.info(start +cmd); if (cmd.optimize) { closeSearcher(); openWriter(); writer.optimize(cmd.maxOptimizeSegments); } closeSearcher(); closeWriter(); This closeSearcher function doesn't exist in the nightly (I suppose the whole process works in a different way now). It seems that once DataImportHandler does the first import, it touches something that prevents IndexSearchers from ever being freed again. hossman wrote: : I have noticed that the first time I execute full import (having an old index : in the index folder) once it is done, the old indexsearcher will be closed: ... : The problem is that if I do another full-import...
the old searcher will not : be closed, there will just appear the line: ... : If I keep doing full-imports the old searchers will never be closed. seems : that they are just closed in the first full import... : Does it mean something to anyone? Hmmm... sounds like maybe DIH is triggering something weird. Just to clarify: a) what does the stats page show (in terms of the number of Searchers listed in the CORE section) after a couple of full imports? b) can you reproduce this doing full builds even with replication disabled? c) can you reproduce this using the example DIH configs? -Hoss -- View this message in context: http://www.nabble.com/Tomcat-holding-deleted-snapshots-until-it%27s-restarted-tp22451252p22481571.html Sent from the Solr - User mailing list archive at Nabble.com.
stemming (maybe?) question
is it possible to make solr think that omeara and o'meara are the same thing? -jsd-
RE: Replication in 1.3
Thanks for the reply. Hopefully 1.4 will come soon enough so that we can still use Windows. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Thursday, March 12, 2009 9:55 AM To: solr-user@lucene.apache.org Subject: Re: Replication in 1.3 On Thu, Mar 12, 2009 at 12:34 PM, Vauthrin, Laurent laurent.vauth...@disney.com wrote: Just so I'm clear on it, do you mean Windows replication via Cygwin is not supported or not possible? Not really possible - the strategy the scripts use won't work on Windows because of the different filesystem semantics. Things like the fact that you can make a hard link, but you can't move or delete any of the links to an open file like you can with UNIX. -Yonik http://www.lucidimagination.com
fl wildcards
If I wanted to hack Solr so that it has the ability to process wildcards for the field list parameter (fl), where would I look? (Perhaps I should look on the solr-dev mailing list, but since I am already on this one I thought I would start here). Thanks! -- -a Ideally, a code library must be immediately usable by naive developers, easily customized by more sophisticated developers, and readily extensible by experts. -- L. Stein
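Short of patching Solr, the same effect can be approximated on the client: fetch the list of stored fields once (e.g. from the Luke request handler) and expand the glob before sending the query. A sketch with hypothetical field names:

```python
import fnmatch

def expand_fl(fl, known_fields):
    """Expand glob patterns in an fl parameter value against a list of
    known stored fields; entries that match nothing pass through unchanged."""
    expanded = []
    for pat in (p.strip() for p in fl.split(",")):
        matches = fnmatch.filter(known_fields, pat)
        expanded.extend(matches if matches else [pat])
    return ",".join(expanded)
```

Server-side support would still be nicer, since the client then doesn't need to know the schema up front.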
Re: Tomcat holding deleted snapshots until it's restarted
On Thu, Mar 12, 2009 at 1:34 PM, Marc Sturlese marc.sturl...@gmail.com wrote: : Just to clarify: : a) what does the stats page show (in terms of the number of : Searchers listed in the CORE section) after a couple of full imports? After 4 full-imports it will show 3 indexsearchers. I have also printed the var _searchers from SolrCore.java and it shows me 3 indexsearchers. Definitely seems like a bug somewhere... Could you try a recent nightly build to see if it's fixed or not? -Yonik http://www.lucidimagination.com
Adding authentication Token to the CommonsHttpSolrServer
Hi, We have installed Solr in a Tomcat server and enabled a security constraint at the Tomcat level. We need to pass the authentication token (cookie) with the search call that is made using CommonsHttpSolrServer. I would like to know how I can add the token to the CommonsHttpSolrServer. Appreciate any ideas on this. Thanks. Karthik
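I haven't tried this against CommonsHttpSolrServer specifically, but the general shape would be to build the Cookie header value yourself and attach it to every request via the underlying HTTP client (if I recall correctly, CommonsHttpSolrServer exposes its HttpClient — treat that as an assumption to verify). A language-neutral sketch of just the header construction:

```python
def cookie_header(cookies):
    """Join name->value pairs into a single Cookie request header,
    e.g. for a Tomcat session token (JSESSIONID here is illustrative)."""
    return {"Cookie": "; ".join(f"{name}={value}" for name, value in cookies.items())}
```

The resulting header then needs to be set on each outgoing search request in whatever HTTP layer the client uses.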
Re: stemming (maybe?) question
On Thu, Mar 12, 2009 at 1:36 PM, Jon Drukman jdruk...@gmail.com wrote: is it possible to make solr think that omeara and o'meara are the same thing? WordDelimiter would handle it if the document had o'meara (but you may or may not want the other stuff that comes with WordDelimiterFilter). You could also use a PatternReplaceFilter to normalize tokens like this. -Yonik http://www.lucidimagination.com
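The normalization such a PatternReplaceFilter would perform (pattern ' replaced by the empty string, applied at both index and query time so both sides agree) looks like this — the sketch below is plain Python illustrating the token transform, not the filter itself:

```python
import re

APOSTROPHES = re.compile(r"['\u2019]")  # ASCII and curly apostrophes

def normalize(token):
    """Strip apostrophes (after lowercasing) so o'meara and omeara
    reduce to the same indexed term."""
    return APOSTROPHES.sub("", token.lower())
```

Applying the same transform on both index and query sides is what makes the two spellings match.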
DIH outer joins
I have queries with outer joins defined in some entities, and for the same root object I can have two or more lines with different objects. For example, taking the following 3 tables, and a query defined in the entity with outer joins between the tables: Table1 - Table2 - Table3 I can have the following lines returned by the query: Table1Instance1 - Table2Instance1 - Table3Instance1 Table1Instance1 - Table2Instance1 - Table3Instance2 Table1Instance1 - Table2Instance2 - Table3Instance3 Table1Instance2 - Table2Instance3 - Table3Instance4 I wanted to have a single document per root object instance (in this case per Table1 instance) but with the values from the different lines returned. Is it possible to have this behavior in DataImportHandler? How? Thanks in advance, Rui Pereira
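DIH's usual answer is to model Table2/Table3 as nested sub-entities of the root entity and mark the target schema fields multiValued, so each root row yields one document; whether that fits your outer-join SQL depends on your schema. The collapse being asked for, sketched in plain Python with hypothetical column names:

```python
def rows_to_docs(rows, root_key="table1_id"):
    """Collapse flattened join rows into one document per root instance,
    collecting the joined columns as multi-valued fields."""
    docs = {}
    for row in rows:
        doc = docs.setdefault(row[root_key], {root_key: row[root_key]})
        for col, val in row.items():
            if col == root_key or val is None:  # outer joins can yield NULLs
                continue
            values = doc.setdefault(col, [])
            if val not in values:
                values.append(val)
    return list(docs.values())
```

Each resulting doc carries the distinct Table2/Table3 values for its root instance, which is what a multiValued field would hold.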
Re: Programmatic access to other handlers
I found this code to access another core from my custom request handler: CoreContainer.Initializer initializer = new CoreContainer.Initializer(); CoreContainer cores = initializer.initialize(); SolrCore otherCore = cores.getCore(otherCore); It seems to work in some small tests, but is it a recommended approach? Pascal Dimassimo wrote: Hi, I've designed a front handler that will send requests to other handlers and return an aggregated response. Inside this handler, I call other handlers like this (inside the method handleRequestBody): SolrCore core = req.getCore(); SolrRequestHandler mlt = core.getRequestHandler(/mlt); ModifiableSolrParams params = new ModifiableSolrParams(req.getParams()); params.set(mlt.fl, nFullText); req.setParams(params); mlt.handleRequest(req, rsp); First question: is this the recommended way to call another handler? Second question: how could I call a handler of another core? -- View this message in context: http://www.nabble.com/Programmatic-access-to-other-handlers-tp22477731p22483357.html Sent from the Solr - User mailing list archive at Nabble.com.
DIH use of the ?command=full-import entity= command option
Hello, Can anybody describe the intended purpose, or provide a few examples, of how the DIH entity= command option works. Am I supposed to build a data-conf.xml file which contains many different alternate entities.. or Regards -- === Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer ===
Re: Programmatic access to other handlers
Thanks ryantxu for your answer. I implemented the interface and it returns the current core. But how is it different from doing request.getCore() from handleRequestBody()? And I don't see how this can give me access to other cores. I think that what I need is to get access to an instance of CoreContainer, so I can call getCore(name) and getAdminCore to manage the different cores. So I'm wondering if this is a good way to get that instance: CoreContainer.Initializer initializer = new CoreContainer.Initializer(); CoreContainer cores = initializer.initialize(); ryantxu wrote: If you are doing this in a RequestHandler, implement SolrCoreAware and you will get a callback with the Core http://wiki.apache.org/solr/SolrPlugins#head-8b3ac1fc3584fe1e822924b98af23d72b02ab134 On Mar 12, 2009, at 3:04 PM, Pascal Dimassimo wrote: I found this code to access another core from my custom request handler: CoreContainer.Initializer initializer = new CoreContainer.Initializer(); CoreContainer cores = initializer.initialize(); SolrCore otherCore = cores.getCore(otherCore); It seems to work in some small tests, but is it a recommended approach? Pascal Dimassimo wrote: Hi, I've designed a front handler that will send requests to other handlers and return an aggregated response. Inside this handler, I call other handlers like this (inside the method handleRequestBody): SolrCore core = req.getCore(); SolrRequestHandler mlt = core.getRequestHandler(/mlt); ModifiableSolrParams params = new ModifiableSolrParams(req.getParams()); params.set(mlt.fl, nFullText); req.setParams(params); mlt.handleRequest(req, rsp); First question: is this the recommended way to call another handler? Second question: how could I call a handler of another core? -- View this message in context: http://www.nabble.com/Programmatic-access-to-other-handlers-tp22477731p22486235.html Sent from the Solr - User mailing list archive at Nabble.com.
-- View this message in context: http://www.nabble.com/Programmatic-access-to-other-handlers-tp22477731p22486235.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Programmatic access to other handlers
: I implemented the interface and it returns the current core. But how is it : different from doing request.getCore() from handleRequestBody()? And I don't i think ryan misunderstood your goal .. that's just a way for you to get access to your core prior to handling requests. : see how this can give me access to other cores. I think that what I need is : to get access to an instance of CoreContainer, so I can call getCore(name) : and getAdminCore to manage the different cores. So I'm wondering if this is : a good way to get that instance: I'm not positive, but i think the code you listed will actually reconstruct new copies of all of the cores. the simplest way to get access to the CoreContainer is via the CoreDescriptor... yourCore.getCoreDescriptor().getCoreContainer().getCore(otherCoreName); (note i've never actually done this, it's just what i remember off the top of my head from the past multicore design discussions ... the class/method names may be slightly wrong) -Hoss
Issues with stale searchers.
I have Solr 1.3 running on Apache Tomcat 5.5.27. I'm running into an issue where searchers are opened up right away when tomcat starts, and never goes away. This is causing read locks on the Lucene index holding open deleted files during merges. This causes our server to run out of disk space in our index. Wondering what is causing this issue as I have been searching for two days without any real answers. Thanks, LSOF Output java 7322 tomcat 70r REG 253,0 2569538 2883610 /opt/solr/data/index/_m5n.cfs (deleted) java 7322 tomcat 71r REG 253,0 2338291 2883609 /opt/solr/data/index/_m5m.cfs (deleted) java 7322 tomcat 72r REG 253,0 13398930 2883608 /opt/solr/data/index/_m5l.cfs (deleted) java 7322 tomcat 73r REG 253,0 2692917 2883598 /opt/solr/data/index/_m5k.cfs (deleted) java 7322 tomcat 74r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted) java 7322 tomcat 75r REG 253,0 6767344 2883603 /opt/solr/data/index/_m5j.cfs (deleted) java 7322 tomcat 76r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted) java 7322 tomcat 77r REG 253,0 15937346 2883600 /opt/solr/data/index/_m5i.cfs (deleted) Stats page on Solr Admin searc...@66952905 main class: org.apache.solr.search.SolrIndexSearcher version:1.0 description:index searcher stats: searcherName : searc...@66952905 main caching : true numDocs : 187169908 maxDoc : 187169908 readerImpl : ReadOnlyMultiSegmentReader readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index indexVersion : 1224609883675 openedAt : Thu Mar 12 17:13:15 CDT 2009 registeredAt : Thu Mar 12 17:13:23 CDT 2009 warmupTime : 0 name: core class: version:1.0 description:SolrCore stats: coreName : startTime : Thu Mar 12 17:13:15 CDT 2009 refCount : 2 aliases : [] name: searcher class: org.apache.solr.search.SolrIndexSearcher version:1.0 description:index searcher stats: searcherName : searc...@66952905 main caching : true numDocs : 187169908 maxDoc : 187169908 readerImpl : ReadOnlyMultiSegmentReader readerDir : 
org.apache.lucene.store.FSDirectory@/opt/solr/data/index indexVersion : 1224609883675 openedAt : Thu Mar 12 17:13:15 CDT 2009 registeredAt : Thu Mar 12 17:13:23 CDT 2009 warmupTime : 0 Jeremy Carroll Sr. Network Engineer Networked Insights
Re: DIH use of the ?command=full-import entity= command option
Wouldn't an entity be something such as a stream, or a DB, a manifest-channel? The name source would be better to me but... there are the SQL data-sources. paul Le 12-mars-09 à 22:47, Fergus McMenemie a écrit : Can anybody describe the intended purpose, or provide a few examples, of how the DIH entity= command option works. Am I supposed to build a data-conf.xml file which contains many different alternate entities.. or
Re: OR/NOT query syntax
I might be wrong on this, but since you can't do a query that's just a NOT statement, this wouldn't work either. I believe the NOT must negate results of a query, not the entire dataset. On Wed, Mar 11, 2009 at 6:56 PM, Andrew Wall rew...@gmail.com wrote: I'm attempting to write a solr query that ensures that if one field has a particular value that another field also have a particular value. I've arrived at this syntax, but it doesn't seem to work correctly. ((myField:superneat AND myOtherField:somethingElse) OR NOT myField:superneat) either operand functions correctly on its own - but not when joined together with the or not condition. I don't understand why this syntax doesn't work - can someone shed some light on this? Thanks! Andrew Wall -- Jonathan Haddad http://www.rustyrazorblade.com
Re: OR/NOT query syntax
On Wed, Mar 11, 2009 at 9:56 PM, Andrew Wall rew...@gmail.com wrote: I'm attempting to write a solr query that ensures that if one field has a particular value that another field also have a particular value. I've arrived at this syntax, but it doesn't seem to work correctly. ((myField:superneat AND myOtherField:somethingElse) OR NOT myField:superneat) Try (myField:superneat AND myOtherField:somethingElse) OR (*:* -myField:superneat) -Yonik http://www.lucidimagination.com
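The *:* prefix works because Lucene cannot execute a purely negative clause on its own; it needs a positive document set to subtract from. Logically the rewritten query is just the implication "if myField is superneat then myOtherField must be somethingElse", which a quick check over sample docs confirms (plain Python, using the field values from the thread):

```python
def matches(doc):
    """(A AND B) OR (NOT A) is equivalent to 'A implies B'."""
    a = doc.get("myField") == "superneat"
    b = doc.get("myOtherField") == "somethingElse"
    return (a and b) or (not a)
```

Docs without myField=superneat always match; docs with it match only when myOtherField also has the required value.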
RE: Issues with stale searchers.
If that's the case, it is causing out-of-disk issues with Solr. We have a 187M-document index which is about ~200GB in size. Over a period of about a week after optimizations, etc., the count of open-but-deleted files grows very large, and the system can no longer optimize due to lack of disk space. Also, new documents that are indexed are not showing up in search results. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Thursday, March 12, 2009 7:43 PM To: solr-user@lucene.apache.org Subject: Re: Issues with stale searchers. On Thu, Mar 12, 2009 at 6:29 PM, Jeremy Carroll jeremy.carr...@networkedinsights.com wrote: I have Solr 1.3 running on Apache Tomcat 5.5.27. I'm running into an issue where searchers are opened up right away when tomcat starts, and never goes away. This is causing read locks on the Lucene index holding open deleted files during merges. Deleted files being held open can be normal - that's the current IndexSearcher serving requests (even though those files may have been deleted by the IndexWriter already). Looking at your Stats, I only see one Searcher, so things look fine there too. -Yonik http://www.lucidimagination.com This causes our server to run out of disk space in our index. Wondering what is causing this issue as I have been searching for two days without any real answers.
Thanks,

LSOF output:
java 7322 tomcat 70r REG 253,0  2569538 2883610 /opt/solr/data/index/_m5n.cfs (deleted)
java 7322 tomcat 71r REG 253,0  2338291 2883609 /opt/solr/data/index/_m5m.cfs (deleted)
java 7322 tomcat 72r REG 253,0 13398930 2883608 /opt/solr/data/index/_m5l.cfs (deleted)
java 7322 tomcat 73r REG 253,0  2692917 2883598 /opt/solr/data/index/_m5k.cfs (deleted)
java 7322 tomcat 74r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted)
java 7322 tomcat 75r REG 253,0  6767344 2883603 /opt/solr/data/index/_m5j.cfs (deleted)
java 7322 tomcat 76r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted)
java 7322 tomcat 77r REG 253,0 15937346 2883600 /opt/solr/data/index/_m5i.cfs (deleted)

Stats page on Solr Admin:

searc...@66952905 main
  class: org.apache.solr.search.SolrIndexSearcher
  version: 1.0
  description: index searcher
  stats:
    searcherName : searc...@66952905 main
    caching : true
    numDocs : 187169908
    maxDoc : 187169908
    readerImpl : ReadOnlyMultiSegmentReader
    readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index
    indexVersion : 1224609883675
    openedAt : Thu Mar 12 17:13:15 CDT 2009
    registeredAt : Thu Mar 12 17:13:23 CDT 2009
    warmupTime : 0

name: core
  class:
  version: 1.0
  description: SolrCore
  stats:
    coreName :
    startTime : Thu Mar 12 17:13:15 CDT 2009
    refCount : 2
    aliases : []

name: searcher
  class: org.apache.solr.search.SolrIndexSearcher
  version: 1.0
  description: index searcher
  stats:
    searcherName : searc...@66952905 main
    caching : true
    numDocs : 187169908
    maxDoc : 187169908
    readerImpl : ReadOnlyMultiSegmentReader
    readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index
    indexVersion : 1224609883675
    openedAt : Thu Mar 12 17:13:15 CDT 2009
    registeredAt : Thu Mar 12 17:13:23 CDT 2009
    warmupTime : 0

Jeremy Carroll
Sr. Network Engineer
Networked Insights
SolrJ : EmbeddedSolrServer and database data indexing
Is it possible to index DB data directly into Solr using EmbeddedSolrServer? I tried using a data-config file and the full-import command, and it works, so I assume using CommonsHttpSolrServer will also work. But can I do it with EmbeddedSolrServer?? Thanks in advance... Ashish -- View this message in context: http://www.nabble.com/SolrJ-%3A-EmbeddedSolrServer-and-database-data-indexing-tp22488697p22488697.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Issues with stale searchers.
On Thu, Mar 12, 2009 at 9:38 PM, Jeremy Carroll jeremy.carr...@networkedinsights.com wrote:
If that's the case it is causing out of disk issues with Solr. We have a 187m document count index which is about ~200Gb in size. Over a period of about a week after optimizations, etc... the open file but deleted count grows very large. Causing the system to not be able to optimize due to lack of disk space. Also new documents that are indexed are not showing up in search results.

Multiply the index size by 3 to get the max disk space:
- 1 for the index currently open for searching
- up to 1 for new segments written by the index writer (including merges)
- up to 1 when the index writer does major merges or optimizes (the index writer can't delete the old segment files until it's sure that the new index has been written successfully)

That said, what you are seeing could be normal, or could be a bug.

-Yonik
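The worst-case budget Yonik describes is simple arithmetic but easy to under-provision: during an optimize, up to three full copies of the index can coexist on disk. A tiny sketch with the numbers from this thread (the class is illustrative, not Solr code):

```java
// Sketch of the worst-case disk budget during a Solr/Lucene optimize:
// old searchable segments + newly written segments + the merge output
// can all exist at once before the old files are deleted.
public class DiskBudget {

    // Peak disk usage is up to 3x the steady-state index size
    static long worstCase(long indexSizeGb) {
        return 3 * indexSizeGb;
    }

    public static void main(String[] args) {
        long indexGb = 200; // the ~200GB index discussed in this thread
        System.out.println("Plan for up to " + worstCase(indexGb) + "GB free");
        // prints: Plan for up to 600GB free
    }
}
```

So for the ~200GB index here, the volume should have roughly 600GB available before an optimize is safe.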
Re: SolrJ : EmbeddedSolrServer and database data indexing
Is there any API in SolrJ that calls the DataImportHandler to execute commands like full-import and delta-import? Please help...

Ashish P wrote:
Is it possible to index DB data directly to solr using EmbeddedSolrServer. I tried using data-Config File and Full-import commad, it works. So assuming using CommonsHttpServer will also work. But can I do it with EmbeddedSolrServer?? Thanks in advance... Ashish

-- View this message in context: http://www.nabble.com/SolrJ-%3A-EmbeddedSolrServer-and-database-data-indexing-tp22488697p22489420.html Sent from the Solr - User mailing list archive at Nabble.com.
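As far as I know there is no dedicated DataImportHandler API in SolrJ; DIH is an ordinary request handler, so you address its registered path (assumed here to be /dataimport in solrconfig.xml) with a command parameter - with EmbeddedSolrServer you would pass the same parameters via SolrJ's request API. This self-contained sketch only shows the request being assembled; the commit parameter and the helper names are my assumptions, not confirmed by the thread:

```java
// Hedged sketch: build the parameter string a DIH full-import or
// delta-import request carries. With SolrJ you would set these same
// key/value pairs on the request rather than concatenating a URL.
import java.util.LinkedHashMap;
import java.util.Map;

public class DihCommand {

    // Assemble /dataimport?command=...&commit=true
    static String requestString(String command) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("command", command); // "full-import" or "delta-import"
        params.put("commit", "true");   // commit once the import finishes (assumed default behavior)
        StringBuilder sb = new StringBuilder("/dataimport?");
        params.forEach((k, v) -> sb.append(k).append('=').append(v).append('&'));
        sb.setLength(sb.length() - 1);  // drop the trailing '&'
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(requestString("full-import"));
        // prints: /dataimport?command=full-import&commit=true
    }
}
```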
Re: DIH use of the ?command=full-import entity= command option
On Fri, Mar 13, 2009 at 3:17 AM, Fergus McMenemie fer...@twig.me.uk wrote:
Hello, can anybody describe the intended purpose, or provide a few examples, of how the DIH entity= command option works? Am I supposed to build a data-conf.xml file which contains many different alternate entities, or...

With the entity parameter you can specify the name of any root entity and import only that one. You can specify multiple entity parameters too. For example:
/dataimport?command=full-import&entity=x&entity=y

You may need to specify preImportDeleteQuery separately on each entity to make sure all documents are not deleted.

-- Regards, Shalin Shekhar Mangar.
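Since the entity parameter repeats once per root entity, a client has to append it multiple times rather than comma-joining the names. A small self-contained sketch of building that request (class and method names are illustrative only):

```java
// Sketch: build a DIH full-import request targeting specific root
// entities. Each entity gets its own repeated "entity=" parameter,
// matching the /dataimport?command=full-import&entity=x&entity=y form.
import java.util.List;

public class EntityImportUrl {

    static String url(String solrBase, List<String> entities) {
        StringBuilder sb = new StringBuilder(solrBase)
                .append("/dataimport?command=full-import");
        for (String e : entities) {
            sb.append("&entity=").append(e); // one parameter per root entity
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(url("http://localhost:8983/solr", List.of("x", "y")));
        // prints: http://localhost:8983/solr/dataimport?command=full-import&entity=x&entity=y
    }
}
```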
Re: DIH use of the ?command=full-import entity= command option
If my data-config.xml contains multiple root-level entities, what is the expected action if I call full-import without an entity=XXX sub-command? Does it process all entities one after the other, or only the first? (It would be useful IMHO if it only did the first.)

On Fri, Mar 13, 2009 at 3:17 AM, Fergus McMenemie fer...@twig.me.uk wrote:
Hello, can anybody describe the intended purpose, or provide a few examples, of how the DIH entity= command option works? Am I supposed to build a data-conf.xml file which contains many different alternate entities, or...

With the entity parameter you can specify the name of any root entity and import only that one. You can specify multiple entity parameters too. For example:
/dataimport?command=full-import&entity=x&entity=y

You may need to specify preImportDeleteQuery separately on each entity to make sure all documents are not deleted.

-- Regards, Shalin Shekhar Mangar.

--
===============================
Fergus McMenemie     Email: fer...@twig.me.uk
Techmore Ltd         Phone: (UK) 07721 376021
Unix/Mac/Intranets   Analyst Programmer
===============================
Re: CJKAnalyzer and Chinese Text sort
Thanks Hoss for your comments! I don't mind submitting it as a patch; shall I create an issue in Jira and submit the patch with that? Also, I didn't modify core Solr for locale-based sorting; I just created a jar file with the class file and copied it over to the lib folder. As part of the patch, shall I add it to the core Solr code-base (so users who want this don't need to do anything extra) or add it as a contrib module (they would need to compile it as a jar and copy it over to the lib folder)?

Thanks!

-- Original Message --
From: Chris Hostetter hossman_luc...@fucit.org
To: solr-user@lucene.apache.org
Subject: Re: CJKAnalyzer and Chinese Text sort
Date: Wed, 11 Mar 2009 15:50:40 -0700 (PDT)

First off: you can't sort on a field where any doc has more than one token - that's why sorting on a TextField doesn't work unless you use something like the KeywordTokenizer.

Second...
: I found out that reason the strings are not getting sorted is because
: there is no way to pass the locale information to StrField, I ended up
: extending StrField to take an additional attribute in schema.xml and
: then had to override the getSortString method where in I create a new
: Locale based on the schema attribute and pass it to the StrField. I put
: this newly created jar file in the lib folder and everything seems to be
: working fine after that. Since, my java knowledge is almost zilch, I was
: wondering is this the right way to solve this problem or is there any
: other recommended approach for this?

I don't remember what the state of Locale-based sorting is, but the modifications you describe sound right based on what I remember... would you be interested in submitting them back as a patch? http://wiki.apache.org/solr/HowToContribute

-Hoss
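The mechanism the Locale-aware StrField subclass would delegate to is the JDK's java.text.Collator, which replaces raw code-point comparison with locale collation rules. A minimal sketch of the difference (the exact ordering of Han characters depends on the collation data shipped with your JDK, so no particular order is claimed here):

```java
// Sketch: locale-aware ordering via java.text.Collator, the standard JDK
// facility a Locale-based sort field would build on. Binary String order
// and collator order can differ for non-ASCII text.
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class LocaleSortDemo {
    public static void main(String[] args) {
        String[] terms = {"张", "王", "李"};

        // Plain String.compareTo sorts by UTF-16 code point,
        // which is what an unmodified StrField sort gives you
        Arrays.sort(terms);
        System.out.println("code point order: " + Arrays.toString(terms));

        // Collator applies the locale's collation rules instead;
        // results vary with the JDK's collation tables for zh
        Collator zh = Collator.getInstance(Locale.CHINA);
        Arrays.sort(terms, zh);
        System.out.println("zh locale order:  " + Arrays.toString(terms));
    }
}
```

A subclass along the lines the poster describes would construct the Collator from the locale named in schema.xml and use it when producing the sort comparator.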
Re: DIH use of the ?command=full-import entity= command option
On Fri, Mar 13, 2009 at 10:44 AM, Fergus McMenemie fer...@twig.me.uk wrote:
If my data-config.xml contains multiple root level entities what is the expected action if I call full-import without an entity=XXX sub-command? Does it process all entities one after the other or only the first? (It would be useful IMHO if it only did the first.)

It processes all entities one after the other. If you want to import only one, use the entity parameter.

-- Regards, Shalin Shekhar Mangar.