Number of terms in a SOLR field
Hi all, I am attempting to test some changes I made to my DIH-based indexing process. The changes only affect the way I describe my fields in data-config.xml; there should be no changes to the way the data is indexed or stored. As a QA check I wanted to compare the results from indexing the same data before/after the change. I was looking for a way of getting counts of terms in each field. I guess Luke etc. must allow this, but how?

Regards, Fergus.
Re: Number of terms in a SOLR field
Fergus McMenemie wrote:
> Hi all, I am attempting to test some changes I made to my DIH-based indexing process. The changes only affect the way I describe my fields in data-config.xml; there should be no changes to the way the data is indexed or stored. As a QA check I wanted to compare the results from indexing the same data before/after the change. I was looking for a way of getting counts of terms in each field. I guess Luke etc. must allow this, but how?

Luke uses a brute-force approach - it traverses all terms and counts terms per field. This is easy to implement yourself - just get the IndexReader.terms() enumeration and traverse it.

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web; Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
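A minimal sketch of that traversal (using the Lucene 2.x-era API bundled with Solr 1.3; the index path is a placeholder, and you would run this against a copy of the index rather than a live one):

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermEnum;

public class TermCounter {
    public static void main(String[] args) throws Exception {
        // Point this at your Solr core's data/index directory (placeholder path)
        IndexReader reader = IndexReader.open("/path/to/solr/data/index");
        Map<String, Integer> counts = new HashMap<String, Integer>();
        // reader.terms() is positioned before the first term; next() advances
        TermEnum terms = reader.terms();
        while (terms.next()) {
            String field = terms.term().field();
            Integer c = counts.get(field);
            counts.put(field, c == null ? 1 : c + 1);
        }
        terms.close();
        reader.close();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Running it before and after the data-config.xml change and diffing the two outputs would give the QA comparison described above.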
Re: Create new core on the fly
On Wed, Sep 30, 2009 at 3:48 AM, djain101 dharmveer_j...@yahoo.com wrote:
> Hi Shalin, Can you please elaborate, why we need to do unload after create?

No, you don't need to. You can unload if you want to, for your own reasons.

> So, if we do a create, will it modify the solr.xml everytime? Can it be avoided in subsequent requests for create?

No, solr.xml will be modified only if persist=true is passed as a request param. I don't understand your second question. Why would you want to issue create commands for the same core multiple times?

> Also, if we want to implement Load, can you please give some directions to implement load action?

I don't know what you want to do. Loading cores without restarting Solr is possible right now by using the create command.

--
Regards, Shalin Shekhar Mangar.
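For reference, a CREATE call against the CoreAdmin handler looks roughly like this (host, core name, and instanceDir are placeholders; adjust for your setup):

```
http://localhost:8983/solr/admin/cores?action=CREATE&name=core1&instanceDir=/path/to/core1&persist=true
```

With persist=true the new core is written into solr.xml so it survives a restart; without it the core exists only until Solr is restarted.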
delay while adding document to solr index
hi all, I have indexed 10 documents (daily around 5000 documents will be indexed, one at a time, to Solr). At the same time, daily a few (around 2000) indexed documents (added 30 days back) are deleted using DeleteByQuery of SolrJ. Previously each document used to be indexed within 5ms, but recently I am facing a delay (sometimes 2min to 10min) while adding a document to the index. And my index (folder) size has also increased to 625MB, which is very large; previously it was around 230MB.

My questions are:
1) Is Solr not deleting the older documents (added 30 days back) permanently from the index even after committing?
2) Why has the index size increased?
3) What is the reason for the delay (2min to 10min) while adding documents one at a time to the index?

Help is appreciated. Thanks in advance.
--
View this message in context: http://www.nabble.com/delay-while-adding-document-to-solr-index-tp25676777p25676777.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Number of terms in a SOLR field
Fergus McMenemie wrote:
>> [...] I was looking for a way of getting counts of terms in each field. I guess Luke etc. must allow this, but how?
> Luke uses a brute-force approach - it traverses all terms and counts terms per field. This is easy to implement yourself - just get the IndexReader.terms() enumeration and traverse it.

Thanks Andrzej. This is just a one-off QA check. How do I get Luke to display terms and counts?

Fergus.
Re: Number of terms in a SOLR field
Fergus McMenemie wrote:
> [...] Thanks Andrzej. This is just a one-off QA check. How do I get Luke to display terms and counts?

1. Get Luke 0.9.9.
2. Open the index with Luke.
3. Look at the Overview panel; you will see the list titled "Available fields and term counts per field".

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web; Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
Re: Problem getting Solr home from JNDI in Tomcat
hossman wrote:
> : Hi all, I'm having problems getting Solr to start on Tomcat 6.
> which version of Solr?

Sorry -- a nightly build from about a month ago. Re. your other message, I was sure the two machines had the same version on, but maybe not -- when I'm back in the office tomorrow I'll upgrade them both to a fresh nightly.

hossman wrote:
> : Tomcat is installed in /opt/apache-tomcat , solr is in /opt/apache-tomcat/webapps/solr , and my Solr home directory is /opt/solr .
> if solr is in /opt/apache-tomcat/webapps/solr -- meaning that you put the solr.war in /opt/apache-tomcat/webapps/ and tomcat expanded it into /opt/apache-tomcat/webapps/solr -- then that is your problem: tomcat isn't even looking at your context file (it only looks at the context files to resolve URLs that it can't resolve by looking in the webapps directory).

Yes, it's auto-expanded from a war in webapps. I have to admit to being a bit baffled though -- I can't find this rule anywhere in the Tomcat docs, but I'm a beginner really and they're not the clearest :-)

hossman wrote:
> This is why the examples of using context files on the wiki talk about keeping the war *outside* of the webapps directory, and using docBase in your Context declaration... http://wiki.apache.org/solr/SolrTomcat

Great, I'll try it this way and see if it clears up. Is it okay to keep the war file *inside* the Solr home directory (/opt/solr in my case) so it's all self-contained?

Many thanks, Andrew.
Invalid response with search key having numbers
Hi all, I am getting incorrect results when I search with numbers only, or with a string containing numbers. When such a search is done, all the results in the index are returned, irrespective of the search key. For example, the phone number field is mapped to TextField; it can contain values like 653-23345. Also, a search string like john25, searched against name, will show all the results. My analyzer looks like:

<fieldType name="mytype" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" catenateNumbers="1"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" catenateNumbers="1"/>
  </analyzer>
</fieldType>

Anything wrong in the analyzer? Do I need to use any other filters instead of catenateAll?

Thanks, C
Re: search for non empty field
Hi, I'm not getting the expected results when using [* TO *]; the results include empty fields. Here is my configuration:

schema.xml:
<field name="refFaseExp" type="string" indexed="true" stored="true" multiValued="true"/>

bean:
@Field private List<String> refFaseExp = new ArrayList<String>();

query:
http://host.com/select?rows=0&facet=true&facet.field=refFaseExp&q=*:* AND refFaseExp:[* TO *]

query results:
(...)
<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">
    <lst name="refFaseExp">
      <int name="">32</int>
(...)

I tried changing type="string" to long and nothing changed. When I use -refFaseExp:[* TO *], it returns 0 documents. Any idea? Thx in advance.

On Mon, Mar 31, 2008 at 2:07 PM, Matt Mitchell goodie...@gmail.com wrote:
> Thanks Erik. I think this is the thread here: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200709.mbox/%3c67117a73-2208-401f-ab5d-148634c77...@variogr.am%3e
> Matt

On Sun, Mar 30, 2008 at 9:50 PM, Erik Hatcher e...@ehatchersolutions.com wrote:
> Documents with a particular field can be matched using: field:[* TO *]
> Or documents without a particular field with: -field:[* TO *]
> An empty field? Meaning one that was indexed but with no terms? I'm not sure about that one. Seems like Hoss replied to something similar on this last week or so though - check the archives.
> Erik

On Mar 30, 2008, at 9:43 PM, Matt Mitchell wrote:
> I'm looking for the exact same thing.

On Sun, Mar 30, 2008 at 8:45 PM, Ismail Siddiqui ism...@gmail.com wrote:
> Hi all, I have a situation where I have to filter results on a non-empty field. A wildcard won't work as it will have to match with a letter. How can I form a query to return results where a particular field is non-empty?
> Ismail
Re: delay while adding document to solr index
Swapna, my answers are inline.

2009/9/30 swapna_here swapna.here...@gmail.com:
> 1) is solr not deleting the older documents (added 30 days back) permanently from index even after committing

Have you run optimize?

> 2) Why the index size is increased

If 5000 docs are added daily and only 2000 deleted, the index size would increase because of the remaining 3000 documents.

> 3) reason for delay (2min to 10 mins) while adding the document one at a time to index

I don't know why this would happen. Is your disk nearly full? Which OS are you running on? What is the configuration of Solr?

Hope this helps,
Pravin
Solr Porting to .Net
Hi All, I'm wondering if a Solr version for .Net is already available, or if it is still under development/planning. I've searched on the Solr website but I've found only info on the Lucene.Net project.

Best Regards, Antonio
--
Antonio Calò -- Software Developer Engineer @ Intellisemantic
Mail anton.c...@gmail.com Tel. 011-56.90.429
--
Re: Solr Porting to .Net
You may want to check out http://code.google.com/p/solrnet/

2009/9/30 Antonio Calò anton.c...@gmail.com:
> Hi All, I'm wondering if a Solr version for .Net is already available, or if it is still under development/planning. [...]
Re: delay while adding document to solr index
Also, what is your merge factor set to?

Pravin

2009/9/30 Pravin Paratey prav...@gmail.com:
> Swapna, my answers are inline. [...]
Re: delay while adding document to solr index
Thanks for your reply. I have not optimized at all. My understanding is that optimize improves query performance but takes more disk space; beyond that, I have no idea how to use it. Previously, for 10 documents, the size occupied was around 250MB, but after 2 months it is 625MB. Why did this happen? Is it because I have not optimized the index? Can anybody tell me when and how to optimize the index (with configuration details)?
Re: delay while adding document to solr index
Swapna,

While the disk space does increase during the process of optimization, it should almost always return to the original size or slightly less.

This is a silly question, but off the top of my head I can't think of any other reason why the index size would increase - are you running a <commit/> after adding documents? If you are, you might want to compare the size of each document being currently indexed with the ones you indexed a few months back.

To optimize the index, simply post <optimize/> to Solr. Or read [http://wiki.apache.org/solr/SolrOperationsTools]

Pravin

2009/9/30 swapna_here swapna.here...@gmail.com:
> thanks for your reply i have not optimized at all [...]
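For example, with curl (assuming the default example port and the standard /update handler; adjust the URL for your deployment):

```shell
curl 'http://localhost:8983/solr/update' \
     -H 'Content-Type: text/xml' \
     --data-binary '<optimize/>'
```

Optimizing merges the index down to a single segment and expunges deleted documents, which is why the on-disk size usually shrinks once it finishes.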
Re: ${dataimporter.last_index_time} as an argument to newerThan in FileListEntityProcessor?
On Tue, Sep 29, 2009 at 11:43 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:
> On Tue, Sep 29, 2009 at 8:14 PM, Bill Dueber b...@dueber.com wrote:
>> Is this possible? I can't figure out a syntax that works, and all the examples show using last_index_time as an argument to an SQL query.
> It is possible but it doesn't work right now. I've created an issue and I will give a patch shortly. https://issues.apache.org/jira/browse/SOLR-1473

Bill, this fix is now available in trunk. A sample usage would look like the following:

<dataConfig>
  <document>
    <entity name="x" processor="FileListEntityProcessor"
            fileName=".*" newerThan="${dih.last_index_time}"
            baseDir="/data" transformer="TemplateTransformer">
      <field column="id" template="${x.file}"/>
    </entity>
  </document>
</dataConfig>

Thanks for reporting this!
--
Regards, Shalin Shekhar Mangar.
Re: delay while adding document to solr index
Thanks again for your immediate response. Yes, I am running a commit after each document is indexed. I don't understand why my index size has increased to 625MB (for the 10 documents) when it was previously 250MB. Is this because I have not optimized my index at all, or because I am adding documents individually? I need a solution for this urgently. Thanks a lot.
Re: search for non empty field
field:[* TO *] matches documents that have one or more terms in that field. If your indexer is sending a value, it'll end up with a term. Note that changing from string to long requires reindexing, though that isn't the issue here.

Erik

On Sep 30, 2009, at 2:39 AM, Jorge Agudo Praena wrote:
> Hi, I'm not getting the expected results when using [* TO *]; the results include empty fields. Here is my configuration:
> schema.xml: <field name="refFaseExp" type="string" indexed="true" stored="true" multiValued="true"/>
> bean: @Field private List<String> refFaseExp = new ArrayList<String>();
> query: http://host.com/select?rows=0&facet=true&facet.field=refFaseExp&q=*:* AND refFaseExp:[* TO *]
> I tried changing type="string" to long and nothing changed. When I use -refFaseExp:[* TO *], it returns 0 documents. Any idea? [...]
init parameters for queryParser
Hi all,

I've got my own query parser plugin defined thanks to the queryParser tag:

<queryParser name="myqueryparser" class="my.package.MyQueryParserPlugin"/>

The QParserPlugin class has got an init method like this:

public void init(NamedList args);

Where and how do I put my args to be passed to init for my query parser plugin? I'm trying:

<queryParser name="myqueryparser" class="my.package.MyQueryParserPlugin">
  <lst name="defaults">
    <str name="param1">value1</str>
    <str name="param1">value1</str>
  </lst>
</queryParser>

But I'm not sure if it's the right way. Could we also update the wiki about this? http://wiki.apache.org/solr/SolrPlugins#QParserPlugin

Jerome.
--
Jerome Eteve. http://www.eteve.net jer...@eteve.net
Re: Solr Porting to .Net
SolrNet is only an HTTP client to Solr.

I've been experimenting with IKVM but wasn't very successful... There seem to be some issues with class loading, but unfortunately I don't have much time to continue these experiments right now. In case you're interested in continuing this, here's the repository: http://code.google.com/p/mausch/source/browse/trunk/SolrIKVM

Also, recently someone registered a project on Google Code with the same intentions, but no commits yet: http://code.google.com/p/solrwin/

Cheers, Mauricio

On Wed, Sep 30, 2009 at 7:09 AM, Pravin Paratey prav...@gmail.com wrote:
> You may want to check out http://code.google.com/p/solrnet/ [...]
Re: delay while adding document to solr index
Hi,

- Try to let Solr do the commits for you (by setting up the autocommit feature), and stop committing after inserting each document. This should greatly improve the delays you're experiencing.
- If you do not optimize, it's normal that your index size only grows. Optimize regularly, when your load is minimal.

Jerome.

2009/9/30 swapna_here swapna.here...@gmail.com:
> thanks again for your immediate response [...]
--
Jerome Eteve. http://www.eteve.net jer...@eteve.net
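For reference, autocommit is configured in the updateHandler section of solrconfig.xml; a sketch with illustrative thresholds (tune both values for your load):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>1000</maxDocs>   <!-- commit after 1000 pending documents -->
    <maxTime>60000</maxTime>  <!-- or after 60 seconds, whichever comes first -->
  </autoCommit>
</updateHandler>
```

With this in place the client just adds documents and Solr batches the commits, instead of paying the commit cost (reopening searchers, flushing segments) on every single add.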
Re: init parameters for queryParser
On Wed, Sep 30, 2009 at 7:14 PM, Jérôme Etévé jerome.et...@gmail.com wrote:
> Hi all, I've got my own query parser plugin defined thanks to the queryParser tag. [...] Where and how do I put my args to be passed to init for my query parser plugin? [...] But I'm not sure if it's the right way.

You don't need to put <lst name="defaults"> - defaults, appends, and invariants are keys used by RequestHandlers. Just put all the params you need directly:

<queryParser name="myqueryparser" class="my.package.MyQueryParserPlugin">
  <str name="param1">value1</str>
  <bool name="param2">true</bool>
</queryParser>

--
Regards, Shalin Shekhar Mangar.
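On the plugin side, those values then arrive in init() as entries of the NamedList. A rough sketch (the param names and types are taken from the config example above; createParser is stubbed out):

```java
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;

public class MyQueryParserPlugin extends QParserPlugin {
    private String param1;
    private boolean param2;

    public void init(NamedList args) {
        // <str name="param1">value1</str> arrives as a String
        param1 = (String) args.get("param1");
        // <bool name="param2">true</bool> arrives as a Boolean
        Boolean b = (Boolean) args.get("param2");
        param2 = (b != null && b.booleanValue());
    }

    public QParser createParser(String qstr, SolrParams localParams,
                                SolrParams params, SolrQueryRequest req) {
        // ... construct and return your QParser here, using param1/param2 ...
        return null;
    }
}
```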
Where do I need to install Solr
Does Solr have to be installed on the web server, or can I install Solr on a different server and access it from my web server? Kevin Miller Web Services
Re: Where do I need to install Solr
Kevin Miller wrote:
> Does Solr have to be installed on the web server, or can I install Solr on a different server and access it from my web server?

You can access it from your webserver (or browser) via HTTP/XML requests and responses. Have a look at the Solr tutorial: http://lucene.apache.org/solr/tutorial.html and this one: http://www.xml.com/lpt/a/1668

--
Claudio Martella
Digital Technologies Unit Research & Development - Engineer
TIS innovation park
Via Siemens 19 | Siemensstr. 19
39100 Bolzano | 39100 Bozen
Tel. +39 0471 068 123 Fax +39 0471 068 129
claudio.marte...@tis.bz.it http://www.tis.bz.it

Short information regarding use of personal data. According to Section 13 of Italian Legislative Decree no. 196 of 30 June 2003, we inform you that we process your personal data in order to fulfil contractual and fiscal obligations and also to send you information regarding our services and events. Your personal data are processed with and without electronic means and by respecting data subjects' rights, fundamental freedoms and dignity, particularly with regard to confidentiality, personal identity and the right to personal data protection. At any time and without formalities you can write an e-mail to priv...@tis.bz.it in order to object the processing of your personal data for the purpose of sending advertising materials and also to exercise the right to access personal data and other rights referred to in Section 7 of Decree 196/2003. The data controller is TIS Techno Innovation Alto Adige, Siemens Street n. 19, Bolzano. You can find the complete information on the web site www.tis.bz.it.
Re: Solr Porting to .Net
Hi guys, thanks for your prompt feedback. So, are you saying that SolrNet is just a wrapper written in C# that connects to Solr (still written in Java, running on IKVM)? Is my understanding correct?

Regards, Antonio

2009/9/30 Mauricio Scheffer mauricioschef...@gmail.com:
> SolrNet is only an HTTP client to Solr. I've been experimenting with IKVM but wasn't very successful... [...]

--
Antonio Calò -- Software Developer Engineer @ Intellisemantic
Mail anton.c...@gmail.com Tel. 011-56.90.429
--
Re: Where do I need to install Solr
Solr is a separate service, in the same way an RDBMS is a separate service. Whether you install it on the same machine as your webserver or not, it's logically separated from your server.

Jerome.

2009/9/30 Claudio Martella claudio.marte...@tis.bz.it:
> You can access it from your webserver (or browser) via HTTP/XML requests and responses. Have a look at the Solr tutorial: http://lucene.apache.org/solr/tutorial.html and this one: http://www.xml.com/lpt/a/1668 [...]

--
Jerome Eteve. http://www.eteve.net jer...@eteve.net
Re: Solr Porting to .Net
Solr is a server that runs on Java, and it exposes an HTTP interface. SolrNet is a client library for .Net that connects to a Solr instance via its HTTP interface. My experiment (let's call it SolrIKVM) is an attempt to run Solr itself on .Net.

Hope that clears things up.

On Wed, Sep 30, 2009 at 11:50 AM, Antonio Calò anton.c...@gmail.com wrote:
> Hi guys, thanks for your prompt feedback. So, are you saying that SolrNet is just a wrapper written in C# that connects to Solr (still written in Java, running on IKVM)? Is my understanding correct? [...]
Questions about synonyms and highlighting
Hi,

Can you please give me some answers to these questions:

1 - How can I get the synonyms found for a keyword? I mean, I search for foo and I have in my synonyms.txt file the following tokens: foo, foobar, fee (with expand=true). My index contains foo and foobar. I want to display a message on the results page (in the header, for example) with only the 2 matched tokens and not fee, like "Results found for foo and foobar".

2 - Can Solr run analysis on an index to extract associations between tokens? For example, if foo often appears with fee in a field, it will associate the 2 tokens.

3 - Is it possible, and if so how, to configure Solr to enable or disable highlighting for tokens with diacritics?
Settings for vélo (all highlighted) == the two words <em>vélo</em> and <em>velo</em> are highlighted
Settings for vélo == the first word <em>vélo</em> is highlighted but not the second: velo

4 - The same question for highlighting with lemmatisation:
Settings for manage (all highlighted) == the two words <em>manage</em> and <em>management</em> are highlighted
Settings for manage == the first word <em>manage</em> is highlighted but not the second: management

Thanks in advance. Regards, Nourredine.
Multi-valued field cache
I want to build a FunctionQuery that scores documents based on a multi-valued field. My intention was to use the field cache, but that doesn't get me multiple values per document. I saw other posts suggesting UnInvertedField as the solution, but I don't see a method in the UnInvertedField class that will give me a list of field values per document; I only see methods that give values per document set. Should I use one of those methods and create document sets of size 1 for each document?

Thanks, Wojtek
Adding data from nutch to a Solr index
Alright, first post to this list and I hope the question is not too stupid or misplaced ...

What I currently have:
- a nicely working Solr 1.3 index with information about some entities, e.g. organisations, indexed from an RDBMS. Many of these entities have a URL pointing at further information, e.g. the website of an institute or company.
- an installation of nutch 0.9 with which I can crawl the URLs that I can extract from the RDBMS mentioned above and put into a seed file
- tutorials about how to put crawled and indexed data from nutch 1.0 (which I could install w/o problems) into a separate Solr index

What I want:
- to combine the indexed information from the RDBMS and the website in one Solr index, so that I can search both at once and with the capability of using all the Solr features. E.g. having the following (example) fields in one document:

<doc>
  name-from-RDBMS
  indexed-content-from-RDBMS
  indexed-content-from-website
  URL
  ...
</doc>

Any input appreciated! Cheers, Sönke
NGramTokenFilter behaviour
If I index the following text: I live in Dublin Ireland where Guinness is brewed, then search for duvlin, should Solr return a match? In the admin interface, under the analysis section, Solr highlights some NGram matches. But when I enter the following query string into my browser address bar, I get 0 results: http://localhost:8983/solr/select/?q=duvlin&debugQuery=true Nor do I get results for dub, dubli, ublin, dublin (du does return a result). I also notice, when I use debugQuery=true, that the parsed query is a PhraseQuery. This doesn't make sense to me, as surely the point of the NGram is to use a Boolean OR between each gram? However, if I don't use an NGramFilterFactory at query time, I can get results for dub, ublin, du, but not duvlin. My field type is:

<fieldType name="text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
</fieldType>

Can someone please clarify what the purpose of the NGram filter/tokenizer is, if not to allow for misspellings/morphological variation, and also what the correct configuration is in terms of use at index/query time? Any help appreciated! Aodh. Solr 1.3, JDK 1.6
Re: Adding data from nutch to a Solr index
Sönke Goldbeck wrote: ... what I want: combine the indexed information from the RDBMS and the website in one Solr index so that I can search both in one and with the capability of using all the Solr features. I believe that this kind of document merging is not possible (at least not easily) - you have to assemble the whole document before you index it in Solr. If these documents use the same primary key (I guess they do, otherwise how would you merge them...), then you can do the merging in your front-end application: it would submit the main query to Solr, and then for each Solr document in the list of results it would retrieve the corresponding Nutch document (using the NutchBean API). (The not-so-easy way involves writing a SearchComponent that does the latter part of that process on the Solr side.) -- Best regards, Andrzej Bialecki http://www.sigram.com Contact: info at sigram dot com
RE: NGramTokenFilter behaviour
My understanding of n-gram tokenizing is that it helps with languages that don't necessarily use spaces as word delimiters (Japanese et al.). In that case bi-gramming is used to find words contained within a stream of unbroken characters, and you want to match all of the bi-grams of your input search query; an OR wouldn't work as well, as you would get tons of hits. -Todd Feak
Re: n-Gram, only works with queries of 2 letters
Has this issue been fixed yet? Can anyone shed some light on what's going on here, please? N-gramming is critical to my app; I will have to look at something other than Solr if it's not possible to do :(
Re: Number of terms in a SOLR field
Fergus McMenemie wrote: Thanks Andrzej. This is just a one-off QA check. How do I get Luke to display terms and counts? Andrzej Bialecki wrote: 1. Get Luke 0.9.9. 2. Open the index with Luke. 3. Look at the Overview panel; you will see the list titled Available fields and term counts per field. Thanks, that got me going, and I felt a little stupid after stumbling across http://wiki.apache.org/solr/LukeRequestHandler Regards Fergus
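For anyone wanting to script this QA check instead of clicking through Luke, here is a minimal sketch of the brute-force traversal Andrzej describes. The tallying logic is shown against a plain list of field names standing in for the field of each term returned by IndexReader.terms(); the Lucene wiring (opening the reader, walking the TermEnum) is sketched only in a comment and is an assumption about your setup.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldTermCounter {
    // Count how many (distinct) terms each field has. In a real run the
    // input would be the field() of every term from IndexReader.terms()
    // (Lucene 2.x-era API), roughly:
    //   TermEnum te = reader.terms();
    //   while (te.next()) { fields.add(te.term().field()); }
    public static Map<String, Integer> countTermsPerField(List<String> termFields) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String field : termFields) {
            Integer c = counts.get(field);
            counts.put(field, c == null ? 1 : c + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Five terms spread over two fields; map iteration order may vary.
        Map<String, Integer> counts = countTermsPerField(
            Arrays.asList("title", "body", "body", "title", "body"));
        System.out.println(counts);
    }
}
```

Running this before and after the data-config.xml change and diffing the two maps should show whether the per-field term counts moved.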
Re: NGramTokenFilter behaviour
On Wed, Sep 30, 2009 at 11:24 PM, aod...@gmail.com wrote: Can someone please clarify what the purpose of the NGramFilter/tokenizer is, if not to allow for misspellings/morphological variation and also, what the correct configuration is in terms of use at index/query time. If it is spellcheck you are interested in, take a look at http://wiki.apache.org/solr/SpellCheckComponent -- Regards, Shalin Shekhar Mangar.
Re: NGramTokenFilter behaviour
On Wed, Sep 30, 2009 at 11:24 PM, aod...@gmail.com wrote: When I enter the following query string into my browser address bar, I get 0 results: http://localhost:8983/solr/select/?q=duvlin&debugQuery=true Is the n-grammed field specified as the defaultSearchField in your schema.xml? If not, then you will have to specify the field name during querying, e.g. field_name:duvlin. You can see exactly how your query is being parsed if you add debugQuery=on as a request parameter. -- Regards, Shalin Shekhar Mangar.
Conditional deduplication
If I index a bunch of email documents, is there a way to say: show me all email documents, but only one per To: email address, so that if there are a total of 10 distinct To: fields in the corpus, I get back 10 email documents? I'm aware of http://wiki.apache.org/solr/Deduplication but I want to retain the ability to search across all of my email documents most of the time, and only occasionally search for the distinct ones. Essentially I want to do a SELECT DISTINCT to_field FROM documents, where a normal search is a SELECT * FROM documents. Thanks for any pointers.
Re: Conditional deduplication
See http://wiki.apache.org/solr/FieldCollapsing On Wed, Sep 30, 2009 at 4:41 PM, Michael solrco...@gmail.com wrote: Essentially I want to do a SELECT DISTINCT to_field FROM documents, where a normal search is a SELECT * FROM documents.
field collapsing sums
hello all, i have a question on the field collapsing patch. say i have an integer field called num_in_stock and i collapse by some other column; is it possible to sum up that integer field and return the total in the output? if not, how would i go about extending the collapsing component to support that? thx much --joe
mergefactor=1 questions
In order to make maximal use of our storage by avoiding the dead 2x overhead needed to optimize the index we are considering setting mergefactor=1 and living with the slow indexing performance which is not a problem in our use case. Some questions: 1) Does mergefactor=1 mean that the size of the index on disk increases only due to add/s or is there some sort of merging that happens that temporarily inflates disk usage? 2) It was mentioned that, with per-segment readers, an optimized index may not be the best option. What are per-segment readers? Is this configurable or some sort of default? What are the cases where an optimized index (one segment) might not be the best option? Thanks! Phil
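For reference, here is where the merge factor is set in a Solr 1.3-era solrconfig.xml; this fragment is an illustrative sketch, not the original poster's config. Note that, as far as I know, Lucene's merge policy rejects merge factors below 2, so 2 is the lowest legal setting rather than 1, and even at the lowest setting a merge still needs transient extra disk, because the merged segment is written out before the old segments are deleted.

```xml
<!-- solrconfig.xml: indexDefaults applies when a new index is created -->
<indexDefaults>
  <!-- Lowest legal value is 2; values below that are rejected by Lucene -->
  <mergeFactor>2</mergeFactor>
</indexDefaults>
```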
Seeking Solr/Nutch consultant in San Jose, CA
Hi, I am working with a SaaS vendor who is integrated with Nutch 0.9 and SOLR. We are looking for some help to migrate this to Nutch 1.0. The work involves: 1) We made changes to Nutch 0.9; these need to be ported to Nutch 1.0. 2) Configure SOLR integration with Nutch 1.0 3) Configure SOLR to do Japanese indexing; expose this configuration as part of Baynote configuration. 4) Check if indexes are portable between Nutch 0.9 and Nutch 1.0 - should we re-index? Please email me if there is interest. The work is in San Jose, CA. Duration and rate are not yet known. Best regards, Leann Leann Pereira | o: +1 650.425.7950 | le...@1sourcestaffing.com | Senior Technical Recruiter
Re: Seattle / PNW Hadoop/Lucene/HBase Meetup, Wed Sep 30th
As Bradford is out of town this evening, I will take up the mantle of Person-on-Point. Contact me with questions re: tonight's gathering. See you tonight! -Nick 614.657.0267 On Mon, Sep 28, 2009 at 4:33 PM, Bradford Stephens bradfordsteph...@gmail.com wrote: Hello everyone! Don't forget that the Meetup is THIS Wednesday! I'm looking forward to hearing about Hive from the Facebook team ... and there might be a few other interesting talks as well. Here's the details in the wiki: http://wiki.apache.org/hadoop/PNW_Hadoop_%2B_Apache_Cloud_Stack_User_Group Cheers, Bradford On Mon, Sep 14, 2009 at 11:35 AM, Bradford Stephens bradfordsteph...@gmail.com wrote: Greetings, It's time for another Hadoop/Lucene/Apache Cloud Stack meetup! This month it'll be on Wednesday, the 30th, at 6:45 pm. We should have a few interesting guests this time around -- someone from Facebook may be stopping by to talk about Hive :) We've had great attendance in the past few months, let's keep it up! I'm always amazed by the things I learn from everyone. We're back at the University of Washington, Allen Computer Science Center (not Computer Engineering). Map: http://www.washington.edu/home/maps/?CSE Room: 303 -or- the Entry level. If there are changes, signs will be posted. More Info: The meetup is about 2 hours (and there's usually food): we'll have two in-depth talks of 15-20 minutes each, and then several lightning talks of 5 minutes. If no one offers to talk, we'll just have general discussion and 'social time'. Let me know if you're interested in speaking or attending. We'd like to focus on education, so every presentation *needs* to ask some questions at the end. We can talk about these after the presentations, and I'll record what we've learned in a wiki and share it with the rest of us.
Contact: Bradford Stephens, 904-415-3009, bradfordsteph...@gmail.com Cheers, Bradford -- http://www.roadtofailure.com -- The Fringes of Scalability, Social Media, and Computer Science
Re: Writing optimized index to different storage?
Sorry, I should have given more background. We have, at the moment, 3.8 million documents averaging 0.7MB/doc, so we have extremely large shards. We build about 400,000 documents to a shard, resulting in 200GB/shard. We are also using LVM snapshots to manage a snapshot of the shard, which we serve while we continue to build. In order to optimize the building shard of around 200GB, we need 400GB of disk space to allow for the 2x size increase. Due to the nature of snapshotting, the volume containing the snapshot has to be as large as the build volume, i.e. 400GB. If we could write the optimized build shard elsewhere instead of in place, we could avoid the need for the serving volume to match the size of the building volume; we'd like to avoid having 200GB+ hanging around just to optimize. The responses we got make it clear that writing the optimized index elsewhere is not a solution. I posted another question to the list just a bit ago asking whether mergeFactor=1 would give us a single-segment index that is always optimized, so that we don't have the 2x overhead. However, running a build with mergeFactor=1 shows that lots of segments get created/merged, and that the index grows in size but also shrinks at intervals; it is not clear how big the index is at any point in time. Chris Hostetter wrote: : Is it possible to tell Solr or Lucene, when optimizing, to write the files : that constitute the optimized index to somewhere other than : SOLR_HOME/data/index, or is there something about the optimize that requires : the final segment to be created in SOLR_HOME/data/index? For what purpose? http://people.apache.org/~hossman/#xyproblem XY Problem Your question appears to be an XY Problem ... that is: you are dealing with X, you are assuming Y will help you, and you are asking about Y without giving more details about the X so that we can understand the full issue. Perhaps the best solution doesn't involve Y at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341 -Hoss
Webinar: Apache Solr 1.4 – Faster, Easier, and More Versatile than Ever
Excuse the cross-posting and gratuitous marketing :) Erik My company, Lucid Imagination, is sponsoring a free and in-depth technical webinar with Erik Hatcher, one of our co-founders at Lucid Imagination, as well as co-author of Lucene in Action and a Lucene/Solr PMC member and committer. Sign up here: http://www.eventsvc.com/lucidimagination/100909?trk=WR-OCT2009-AP Friday, October 9th 2009, 10:00AM – 11:00AM PDT / 1:00 – 2:00PM EDT. If you’ve got a lot of data to tame in a variety of formats, there’s no better, deeper, faster platform to build your search application with than Solr. Apache Solr 1.4 expands the power and versatility of the leading open source search server, with its convenient web-services interfaces and well-packaged server implementation. Erik will present and discuss key features and innovations of Solr 1.4, covering, among others: * Faster, more streamlined document and query processing * New powerful search methods including multi-select faceting, deduplication and numeric range handling * Simplified, powerful, highly-scalable deployment improvements with new Java server infrastructure Sign up for the free webinar at http://www.eventsvc.com/lucidimagination/100909?trk=WR-OCT2009-AP About the presenter: Erik Hatcher is the co-author of “Lucene in Action” as well as co-author of “Java Development with Ant”. Erik has been an active member of the Lucene community – a leading Lucene and Solr committer, member of the Lucene Project Management Committee, member of the Apache Software Foundation, and a frequent invited speaker at various industry events.
changing dismax parser to not treat symbols differently
how would i go about modifying the dismax parser to treat +/- as regular text?
Re: changing dismax parser to not treat symbols differently
Joe Calderon wrote: how would i go about modifying the dismax parser to treat +/- as regular text? Would be nice if there was a tiny simple method you could override for this, but: You should extend the dismax parser and override addMainQuery Where it calls SolrPluginUtils.partialEscape, call your own escape method that does what that one does, but also escapes + and -. I think that should work alright. -- - Mark http://www.lucidimagination.com
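A minimal sketch of the escape step Mark describes. The class and method names here are made up for illustration, and the exact integration point inside addMainQuery is an assumption; the stock SolrPluginUtils.partialEscape already handles the other query-syntax characters, so this only adds '+' and '-'.

```java
public class DismaxEscapeUtil {
    // Backslash-escape '+' and '-' so the dismax parser treats them as
    // literal text instead of mandatory/prohibited operators. In a
    // DisMaxQParser subclass you would apply this to the user query where
    // the stock parser calls SolrPluginUtils.partialEscape.
    public static String escapePlusMinus(CharSequence userQuery) {
        StringBuilder sb = new StringBuilder(userQuery.length());
        for (int i = 0; i < userQuery.length(); i++) {
            char c = userQuery.charAt(i);
            if (c == '+' || c == '-') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: c\+\+ \-fast
        System.out.println(escapePlusMinus("c++ -fast"));
    }
}
```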
Re: field collapsing sums
Hi, At the moment I think the most appropriate place to put it is in the AbstractDocumentCollapser (in the getCollapseInfo method). Though, it might not be the most efficient. Cheers, Uri Joe Calderon wrote: hello all, i have a question on the field collapsing patch, say i have an integer field called num_in_stock and i collapse by some other column, is it possible to sum up that integer field and return the total in the output, if not how would i go about extending the collapsing component to support that? thx much --joe
Re: Create new core on the fly
So, if we do a create, will it modify the solr.xml every time? Can it be avoided in subsequent requests for create? No, solr.xml will be modified only if persist=true is passed as a request param. I don't understand your second question. Why would you want to issue create commands for the same core multiple times? Shalin, persist=true does not work with the create action. I am creating the core using the URL below, and every time it modifies solr.xml: http://localhost:8080/app/solr/admin/cores?action=CREATE&name=core1&instanceDir=core1&persist=false Our requirement is that we have multiple Solr app servers behind a load balancer. If we hit a URL to create a Solr core, it will hit any one of the app servers and the core will be loaded on to that app server only. All the other app servers will not be aware of the new Solr core, and search requests will fail if they hit an app server on which the core is not loaded. That's the reason we need to CREATE the core on each of the app servers, but we don't necessarily want solr.xml to be modified. Thanks, Dharmveer
Re: field collapsing sums
You might want to see how the stats component works with field collapsing. Thanks, Matt Weber On Sep 30, 2009, at 5:16 PM, Uri Boness wrote: At the moment I think the most appropriate place to put it is in the AbstractDocumentCollapser (in the getCollapseInfo method), though it might not be the most efficient.