Re: indexing pdf documents

2008-05-13 Thread Cam Bazz
yes, I have seen the documentation on RichDocumentRequestHandler at the
http://wiki.apache.org/solr/UpdateRichDocuments page.
However, from what I understand this just feeds documents to solr. How can I
construct something like: document_id, document_name, document_text and feed
it in. (i.e. my documents have labels)

Best.
-C.B.

On Tue, May 13, 2008 at 1:30 AM, Chris Harris [EMAIL PROTECTED] wrote:

 Solr does not have this support built in, but there's a patch for it:

 https://issues.apache.org/jira/browse/SOLR-284

 On Mon, May 12, 2008 at 2:02 PM, Cam Bazz [EMAIL PROTECTED] wrote:
  Hello,
 
  Before making a little program to extract the txt from my pdfs and feed it
  into solr with xml, I just wanted to check if solr has capability to digest
  pdf files apart from xml?

  Best Regards,
  -C.B.
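
For reference, a sketch of the plain-XML route C.B. describes — extract the text yourself and POST it with labels as ordinary fields. The field names (document_id, document_name, document_text) are illustrative and must be declared in schema.xml:

  <add>
    <doc>
      <field name="document_id">42</field>
      <field name="document_name">report.pdf</field>
      <field name="document_text">...extracted PDF text...</field>
    </doc>
  </add>

posted to the /update handler, followed by a <commit/>.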
 



how to clean an index ?

2008-05-13 Thread Pierre-Yves LANDRON
Hello,

I want to clean an index (i.e. delete all documents), but cannot delete the index
directory.
Is it possible via the REST interface?

Thanks,

Pierre-Yves Landron


phrase query with DismaxHandler

2008-05-13 Thread KhushbooLohia

Hi All,

I am using EnglishPorterFilterFactory in text field for stemming the words. 
Also I am using DisMaxRequestHandler for handling requests.
When a phrase query is passed to solr, e.g. "windows installation",
sometimes the results obtained are correct, but sometimes the results match
only the word "install", or just "windows", or just "installation".
I've observed that if the phrase doesn't have anything to be stemmed, like
"windows" or "company", the results are returned as expected. But phrases with
words like "combination" or "conclusion" get stemmed to "combine" or "conclude"
and bring weird results.


Please revert back.

Thanks
Khushboo

-- 
View this message in context: 
http://www.nabble.com/phrase-query-with-DismaxHandler-tp17204921p17204921.html
Sent from the Solr - User mailing list archive at Nabble.com.



Duplicates results when using a non optimized index

2008-05-13 Thread Tim Mahy
Hi all,

is this expected behavior when having an index like this :

numDocs : 9479963
maxDoc : 12622942
readerImpl : MultiReader

which is in the process of being optimized, and when we search through the index we
get this:

<doc>
  <long name="id">15257559</long>
</doc>
<doc>
  <long name="id">15257559</long>
</doc>
<doc>
  <long name="id">17177888</long>
</doc>
<doc>
  <long name="id">11825631</long>
</doc>
<doc>
  <long name="id">11825631</long>
</doc>

The id field is declared like this:
<field name="id" type="long" indexed="true" stored="true" required="true" />

and is set as the unique key like this in the schema.xml:
  <uniqueKey>id</uniqueKey>

so the question: is this expected behavior, and if so, is there a way to have
Solr return only unique documents?

greetings and thanx in advance,
Tim






RE: how to clean an index ?

2008-05-13 Thread Tim Mahy
Hi,

you can issue a delete request matching all your documents, using the query *:*

greetings,
Tim
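
For reference, a sketch of the full exchange Tim describes, assuming the standard /update handler — POST

  <delete><query>*:*</query></delete>

to /update, then

  <commit/>

so the deletion becomes visible to searchers.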

From: Pierre-Yves LANDRON [EMAIL PROTECTED]
Sent: Tuesday, May 13, 2008 11:53
To: solr-user@lucene.apache.org
Subject: how to clean an index ?

Hello,

I want to clean an index (i.e. delete all documents), but cannot delete the index
directory.
Is it possible via the REST interface?

Thanks,

Pierre-Yves Landron







Re: help for preprocessing the query

2008-05-13 Thread Umar Shah
On Mon, May 12, 2008 at 10:30 PM, Shalin Shekhar Mangar 
[EMAIL PROTECTED] wrote:

 You'll *not* write a servlet. You'll implement the Filter interface:
 http://java.sun.com/j2ee/sdk_1.3/techdocs/api/javax/servlet/Filter.html

 In the doFilter method, you'll create a ServletRequestWrapper which
 changes
 the incoming param. Then you'll call chain.doFilter with the new request
 object. You'll need to add this filter before the SolrRequestFilter in
 Solr's web.xml

I created a CustomFilter that would dump the request contents to a file,
built the jar, and added it to the solr.war in the WEB-INF/lib folder.
I edited the web.xml in the same folder to include the following lines:

<filter>
  <filter-name>CustomFilter</filter-name>
  <filter-class>(packagename).CustomFilter</filter-class>
</filter>

where CustomFilter is the name of the class implementing javax.servlet.Filter.

I don't see anything in the contents of the file..

thanks for your help
-umar
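
For anyone following this thread, here is a minimal sketch of the filter Shalin describes — illustrative code, not from the thread; preprocess() is a hypothetical hook standing in for whatever query transformation is needed:

  import java.io.IOException;
  import javax.servlet.Filter;
  import javax.servlet.FilterChain;
  import javax.servlet.FilterConfig;
  import javax.servlet.ServletException;
  import javax.servlet.ServletRequest;
  import javax.servlet.ServletResponse;
  import javax.servlet.http.HttpServletRequest;
  import javax.servlet.http.HttpServletRequestWrapper;

  public class QueryRewriteFilter implements Filter {

      public void init(FilterConfig config) throws ServletException {}

      public void doFilter(ServletRequest request, ServletResponse response,
                           FilterChain chain) throws IOException, ServletException {
          // Wrap the request so that downstream code (Solr) sees the rewritten q.
          HttpServletRequest http = (HttpServletRequest) request;
          ServletRequest wrapped = new HttpServletRequestWrapper(http) {
              public String getParameter(String name) {
                  String value = super.getParameter(name);
                  return "q".equals(name) ? preprocess(value) : value;
              }
          };
          chain.doFilter(wrapped, response);
      }

      public void destroy() {}

      // Hypothetical transformation hook: turn the incoming qIn into qOut.
      private String preprocess(String qIn) {
          return qIn == null ? null : qIn.trim();
      }
  }

Note that a complete wrapper would also override getParameterValues() and getParameterMap(), since Solr may read parameters through those as well.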



 Look at

  http://www.onjava.com/pub/a/onjava/2001/05/10/servlet_filters.html?page=1 for
 more details.

 On Mon, May 12, 2008 at 8:51 PM, Umar Shah [EMAIL PROTECTED] wrote:

  On Mon, May 12, 2008 at 8:42 PM, Shalin Shekhar Mangar 
  [EMAIL PROTECTED] wrote:
 
   ServletRequest and ServletRequestWrapper are part of the Java
  servlet-api
   (not Solr). Basically, Koji is hinting at writing a ServletFilter
   implementation (again using servlet-api) and creating a wrapper
   ServletRequest which modifies the underlying request params which can
  then
   be used by Solr.
  
 
  sorry for the silly question, basically i am new to servlets.
  Now If my understanding is right , I will need to create a
 servlet/wrapper
  that would listen the user facing queries and then pass the processed
 text
  to solr request handler and I need to pack this servlet class file into
  Solr
  war file.
 
  But How would I ensure that my servlet is called instead of solr request
  handler?
 
 
   On Mon, May 12, 2008 at 8:36 PM, Umar Shah [EMAIL PROTECTED] wrote:
  
On Mon, May 12, 2008 at 2:50 PM, Koji Sekiguchi [EMAIL PROTECTED]
wrote:
   
 Hi Umar,

 You may be able to preprocess your request parameter in your
 servlet filter. In the doFilter() method, you do:

 ServletRequest myRequest = new MyServletRequestWrapper( request );
   
   
Thanks for your response,
   
Where is the ServletRequest class? I am using Solr 1.3 trunk code;
I found SolrServlet, but it is deprecated. Which class can I use instead of
SolrRequest in the 1.3 codebase?
   
   
I also tried overloading the standard request handler. How do I rewrite
query params there?
   
Can you point me to some documentation?
   
   
   :
 chain.doFilter( myRequest, response );

 And you have MyServletRequestWrapper that extends
   ServletRequestWrapper.
 Then you can get|set q* parameters through getParameter() method.

 Hope this helps,

 Koji



 Umar Shah wrote:

  Hi,
 
  Due some requirement I need to transform the user queries before
passing
  it
  to the standard handler in Solr,  can anyone suggest me the best
  way
to
  do
  this.
 
  I will need to use a transformation class that would provide
   functions
to
  process the input query 'qIn' and transform it to the resultant
   query
  'qOut'
  and then pass it to solr handler as if qOut were the original
 user
  query.
 
  thanks in anticipation,
  -umar
 
 
 


   
  
  
  
   --
   Regards,
   Shalin Shekhar Mangar.
  
 



 --
 Regards,
 Shalin Shekhar Mangar.



Differences between nightly builds

2008-05-13 Thread Lucas F. A. Teixeira

Hello,

Here we use a nightly build from Aug '07. It does what we need, with some
bugs that we've worked around.
I want to change this to a newer nightly build, but as this one is 'stable',
people are afraid of changing to an 'unknown' build.


Is there some place where I can find all changes between some date (my
Aug '07) and nowadays? Maybe with this I can change their minds!


Thank you.

[]s,


--
Lucas Frare A. Teixeira
[EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
Tel: +55 11 3660.1622 - R3018



Re: help for preprocessing the query

2008-05-13 Thread Shalin Shekhar Mangar
Did you put a filter-mapping in web.xml?

On Tue, May 13, 2008 at 4:20 PM, Umar Shah [EMAIL PROTECTED] wrote:

 On Mon, May 12, 2008 at 10:30 PM, Shalin Shekhar Mangar 
 [EMAIL PROTECTED] wrote:

  You'll *not* write a servlet. You'll implement the Filter interface:
  http://java.sun.com/j2ee/sdk_1.3/techdocs/api/javax/servlet/Filter.html
 
  In the doFilter method, you'll create a ServletRequestWrapper which
  changes
  the incoming param. Then you'll call chain.doFilter with the new request
  object. You'll need to add this filter before the SolrRequestFilter in
  Solr's web.xml

 I created a CustomFilter that would dump the request contents to a file,
 built the jar, and added it to the solr.war in the WEB-INF/lib folder.
 I edited the web.xml in the same folder to include the following lines:

 <filter>
   <filter-name>CustomFilter</filter-name>
   <filter-class>(packagename).CustomFilter</filter-class>
 </filter>

 where CustomFilter is the name of the class implementing javax.servlet.Filter.

 I don't see anything in the contents of the file..

 thanks for your help
 -umar


 
  Look at
 
 
  http://www.onjava.com/pub/a/onjava/2001/05/10/servlet_filters.html?page=1 for
  more details.
 
  On Mon, May 12, 2008 at 8:51 PM, Umar Shah [EMAIL PROTECTED] wrote:
 
   On Mon, May 12, 2008 at 8:42 PM, Shalin Shekhar Mangar 
   [EMAIL PROTECTED] wrote:
  
ServletRequest and ServletRequestWrapper are part of the Java
   servlet-api
(not Solr). Basically, Koji is hinting at writing a ServletFilter
implementation (again using servlet-api) and creating a wrapper
ServletRequest which modifies the underlying request params which
 can
   then
be used by Solr.
   
  
   sorry for the silly question, basically i am new to servlets.
   Now If my understanding is right , I will need to create a
  servlet/wrapper
   that would listen the user facing queries and then pass the processed
  text
   to solr request handler and I need to pack this servlet class file
 into
   Solr
   war file.
  
   But How would I ensure that my servlet is called instead of solr
 request
   handler?
  
  
On Mon, May 12, 2008 at 8:36 PM, Umar Shah [EMAIL PROTECTED]
 wrote:
   
 On Mon, May 12, 2008 at 2:50 PM, Koji Sekiguchi 
 [EMAIL PROTECTED]
 wrote:

  Hi Umar,
 
  You may be able to preprocess your request parameter in your
  servlet filter. In the doFilter() method, you do:
 
  ServletRequest myRequest = new MyServletRequestWrapper( request
 );


 Thanks for your response,

  Where is the ServletRequest class? I am using Solr 1.3 trunk code;
  I found SolrServlet, but it is deprecated. Which class can I use instead of
  SolrRequest in the 1.3 codebase?


  I also tried overloading the standard request handler. How do I rewrite
  query params there?

 Can you point me to some documentation?


:
  chain.doFilter( myRequest, response );
 
  And you have MyServletRequestWrapper that extends
ServletRequestWrapper.
  Then you can get|set q* parameters through getParameter()
 method.
 
  Hope this helps,
 
  Koji
 
 
 
  Umar Shah wrote:
 
   Hi,
  
   Due some requirement I need to transform the user queries
 before
 passing
   it
   to the standard handler in Solr,  can anyone suggest me the
 best
   way
 to
   do
   this.
  
   I will need to use a transfomation class that would provide
functions
 to
   process the input query 'qIn' and transform it to the
 resultant
query
   'qOut'
   and then pass it to solr handler as if qOut were the original
  user
   query.
  
   thanks in anticipation,
   -umar
  
  
  
 
 

   
   
   
--
Regards,
Shalin Shekhar Mangar.
   
  
 
 
 
  --
  Regards,
  Shalin Shekhar Mangar.
 




-- 
Regards,
Shalin Shekhar Mangar.


Re: help for preprocessing the query

2008-05-13 Thread Umar Shah
On Tue, May 13, 2008 at 4:39 PM, Shalin Shekhar Mangar 
[EMAIL PROTECTED] wrote:

 Did you put a filter-mapping in web.xml?


no,
I just did that and it seems to be working...

what is filter-mapping required for?



 On Tue, May 13, 2008 at 4:20 PM, Umar Shah [EMAIL PROTECTED] wrote:

  On Mon, May 12, 2008 at 10:30 PM, Shalin Shekhar Mangar 
  [EMAIL PROTECTED] wrote:
 
   You'll *not* write a servlet. You'll implement the Filter interface:
   http://java.sun.com/j2ee/sdk_1.3/techdocs/api/javax/servlet/Filter.html
  
   In the doFilter method, you'll create a ServletRequestWrapper which
   changes
   the incoming param. Then you'll call chain.doFilter with the new
 request
   object. You'll need to add this filter before the SolrRequestFilter in
   Solr's web.xml
 
  I created a CustomFilter that would dump the request contents to a file,
  built the jar, and added it to the solr.war in the WEB-INF/lib folder.
  I edited the web.xml in the same folder to include the following lines:

  <filter>
    <filter-name>CustomFilter</filter-name>
    <filter-class>(packagename).CustomFilter</filter-class>
  </filter>

  where CustomFilter is the name of the class implementing javax.servlet.Filter.

  I don't see anything in the contents of the file..
 
  thanks for your help
  -umar
 
 
  
   Look at
  
  
 
  http://www.onjava.com/pub/a/onjava/2001/05/10/servlet_filters.html?page=1 for
   more details.
  
   On Mon, May 12, 2008 at 8:51 PM, Umar Shah [EMAIL PROTECTED] wrote:
  
On Mon, May 12, 2008 at 8:42 PM, Shalin Shekhar Mangar 
[EMAIL PROTECTED] wrote:
   
 ServletRequest and ServletRequestWrapper are part of the Java
servlet-api
 (not Solr). Basically, Koji is hinting at writing a ServletFilter
 implementation (again using servlet-api) and creating a wrapper
 ServletRequest which modifies the underlying request params which
  can
then
 be used by Solr.

   
sorry for the silly question, basically i am new to servlets.
Now If my understanding is right , I will need to create a
   servlet/wrapper
that would listen the user facing queries and then pass the
 processed
   text
to solr request handler and I need to pack this servlet class file
  into
Solr
war file.
   
But How would I ensure that my servlet is called instead of solr
  request
handler?
   
   
 On Mon, May 12, 2008 at 8:36 PM, Umar Shah [EMAIL PROTECTED]
  wrote:

  On Mon, May 12, 2008 at 2:50 PM, Koji Sekiguchi 
  [EMAIL PROTECTED]
  wrote:
 
   Hi Umar,
  
   You may be able to preprocess your request parameter in your
   servlet filter. In the doFilter() method, you do:
  
   ServletRequest myRequest = new MyServletRequestWrapper(
 request
  );
 
 
  Thanks for your response,
 
   Where is the ServletRequest class? I am using Solr 1.3 trunk code;
   I found SolrServlet, but it is deprecated. Which class can I use instead of
   SolrRequest in the 1.3 codebase?
 
 
   I also tried overloading the standard request handler. How do I rewrite
   query params there?
 
  Can you point me to some documentation?
 
 
 :
   chain.doFilter( myRequest, response );
  
   And you have MyServletRequestWrapper that extends
 ServletRequestWrapper.
   Then you can get|set q* parameters through getParameter()
  method.
  
   Hope this helps,
  
   Koji
  
  
  
   Umar Shah wrote:
  
Hi,
   
Due some requirement I need to transform the user queries
  before
  passing
it
to the standard handler in Solr,  can anyone suggest me the
  best
way
  to
do
this.
   
I will need to use a transfomation class that would provide
 functions
  to
process the input query 'qIn' and transform it to the
  resultant
 query
'qOut'
and then pass it to solr handler as if qOut were the
 original
   user
query.
   
thanks in anticipation,
-umar
   
   
   
  
  
 



 --
 Regards,
 Shalin Shekhar Mangar.

   
  
  
  
   --
   Regards,
   Shalin Shekhar Mangar.
  
 



 --
 Regards,
 Shalin Shekhar Mangar.



Warning: latest Tomcat 6 release is broken (was Re: Weird problems with document size)

2008-05-13 Thread Andrew Savory
Hi,

Here's a warning for anyone trying to use solr in the latest release
of tomcat, 6.0.16.

Previously I was having problems successfully posting updates to a
solr instance running in tomcat:

2008/5/9 Andrew Savory [EMAIL PROTECTED]:

  Meanwhile it seems that these documents can successfully be added to
  solr when it is running in jetty, so I'm now trying to find out what
  Tomcat is doing to break things.

A colleague (thanks, Alexis!) has just unearthed a regression bug in
tomcat dating back to February that causes posts of more than 8k to be
truncated: https://issues.apache.org/bugzilla/show_bug.cgi?id=44494

So if you're using Tomcat, aim for 6.0.14 instead.


Andrew.
--
[EMAIL PROTECTED] / [EMAIL PROTECTED]
http://www.andrewsavory.com/


RE: how to clean an index ?

2008-05-13 Thread Pierre-Yves LANDRON

Thanks!

I should have known!

Anyway, it works fine.



 From: [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Date: Tue, 13 May 2008 11:58:16 +0200
 Subject: RE: how to clean an index ?
 
 Hi,
 
  you can issue a delete request matching all your documents, using the query *:*
 
 greetings,
 Tim
 
  From: Pierre-Yves LANDRON [EMAIL PROTECTED]
  Sent: Tuesday, May 13, 2008 11:53
  To: solr-user@lucene.apache.org
  Subject: how to clean an index ?
 
 Hello,
 
  I want to clean an index (i.e. delete all documents), but cannot delete the
  index directory.
  Is it possible via the REST interface?
 
 Thanks,
 
 Pierre-Yves Landron
 
 
 
 
 


Commit problems on Solr 1.2 with Tomcat

2008-05-13 Thread William Pierce
Hi,

I am having problems with Solr 1.2 running Tomcat version 6.0.16 (I also tried
6.0.14 but the same problems exist).  Here is the situation:  I have an ASP.net
application where I am trying to add and commit a single document to an
index.   After I add the document and issue the <commit />, I can see (in the
solr stats page) that the commit count has been incremented, but docsPending
is 1,  and my document is still not visible from a search perspective.

When I issue another <commit />,  the commit counter increments,  docsPending is
now zero,  and my document is visible and searchable.

I saw that someone was observing problems with 6.0.16 tomcat,  so I reverted 
back to 6.0.14.  Same problem.

Can anyone help?

-- Bill

Re: JMX monitoring

2008-05-13 Thread Marshall Weir

Thank you, Shalin!

It works great.

Marshall

On May 13, 2008, at 1:57 AM, Shalin Shekhar Mangar wrote:


Hi Marshall,

I've uploaded a new patch which works off the current trunk. Let me know if
you run into any problems with this.

On Tue, May 13, 2008 at 2:36 AM, Marshall Weir [EMAIL PROTECTED] wrote:


Hi,

I'm new to Solr and I've been attempting to get JMX monitoring working. I
can get simple information by using the -Dcom.sun.management.jmxremote
command line switch, but I'd like to get more useful statistics. I've been
working on applying the SOLR-256 and jmx patch, but the original revisions
are pretty old and I'm having to spend a lot of time wandering through the
source.

Is there a better solution to getting this working, or a newer version of
the patch?

Thank you,
Marshall





--
Regards,
Shalin Shekhar Mangar.




Re: indexing pdf documents

2008-05-13 Thread Bess Sadler
C.B., are you saying you have metadata about your PDF files (i.e.,  
title, author, etc) separate from the PDF file itself, or are you  
saying you want to extract that information from the PDF file? The  
first of these is pretty easy, the second of these can be difficult  
or impossible, depending on how your PDF file was generated and how  
consistent your files are.


It's a bit of a hack, but I've had great success in the past with  
using XTF (http://www.cdlib.org/inside/projects/xtf/) to index my PDF  
files, and then pointing solr at the resulting lucene index.  It's  
worth checking to see if this would do the trick for you.


Bess

Elizabeth (Bess) Sadler
Research and Development Librarian
Digital Scholarship Services
Box 400129
Alderman Library
University of Virginia
Charlottesville, VA 22904

On May 13, 2008, at 3:58 AM, Cam Bazz wrote:
yes, I have seen the documentation on RichDocumentRequestHandler at the
http://wiki.apache.org/solr/UpdateRichDocuments page.
However, from what I understand this just feeds documents to solr. How can I
construct something like: document_id, document_name, document_text and feed
it in. (i.e. my documents have labels)

Best.
-C.B.

On Tue, May 13, 2008 at 1:30 AM, Chris Harris [EMAIL PROTECTED]  
wrote:



Solr does not have this support built in, but there's a patch for it:

https://issues.apache.org/jira/browse/SOLR-284

On Mon, May 12, 2008 at 2:02 PM, Cam Bazz [EMAIL PROTECTED] wrote:

Hello,

 Before making a little program to extract the txt from my pdfs  
and feed

it
 into solr with xml, I just wanted to check if solr has  
capability to

digest

 pdf files apart from xml?

 Best Regards,
 -C.B.









Re: Commit problems on Solr 1.2 with Tomcat

2008-05-13 Thread Alexander Ramos Jardim
Maybe a delay in commit? How much time elapsed between commits?

2008/5/13 William Pierce [EMAIL PROTECTED]:

 Hi,

 I am having problems with Solr 1.2 running tomcat version 6.0.16 (I also
 tried 6.0.14 but same problems exist).  Here is the situation:  I have an
 ASP.net application where I am trying to add and commit a single
 document to an index.   After I add the document and issue the commit / I
 can see (in the solr stats page) that the commit count has been increment
 but the docsPending is 1,  and my document is still not visible from a
 search perspective.

 When I issue another commit/,  the commit counter increments,
  docsPending is now zero,  and my document is visible and searchable.

 I saw that someone was observing problems with 6.0.16 tomcat,  so I
 reverted back to 6.0.14.  Same problem.

 Can anyone help?

 -- Bill




-- 
Alexander Ramos Jardim


Re: Commit problems on Solr 1.2 with Tomcat

2008-05-13 Thread Yonik Seeley
By default, a commit won't return until a new searcher has been opened
and the results are visible.
So just make sure you wait for the commit command to return before querying.

Also, if you are committing every add, you can avoid a separate commit
command by putting ?commit=true in the URL of the add command.

-Yonik
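
For reference, the combined form Yonik mentions looks something like this (host, port, and core path are assumed):

  POST http://localhost:8983/solr/update?commit=true

with the <add>...</add> message in the request body — one round trip instead of an add followed by a separate commit.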

On Tue, May 13, 2008 at 9:31 AM, Alexander Ramos Jardim
[EMAIL PROTECTED] wrote:
 Maybe a delay in commit? How may time elapsed between commits?

  2008/5/13 William Pierce [EMAIL PROTECTED]:



   Hi,
  
   I am having problems with Solr 1.2 running tomcat version 6.0.16 (I also
   tried 6.0.14 but same problems exist).  Here is the situation:  I have an
   ASP.net application where I am trying to add and commit a single
   document to an index.   After I add the document and issue the commit / I
   can see (in the solr stats page) that the commit count has been increment
   but the docsPending is 1,  and my document is still not visible from a
   search perspective.
  
   When I issue another commit/,  the commit counter increments,
docsPending is now zero,  and my document is visible and searchable.
  
   I saw that someone was observing problems with 6.0.16 tomcat,  so I
   reverted back to 6.0.14.  Same problem.
  
   Can anyone help?
  
   -- Bill




  --
  Alexander Ramos Jardim



Re: ERROR:unknown field, but what document was it?

2008-05-13 Thread Alexander Ramos Jardim
Well,

Keep-Alive is part of the HTTP/1.1 standard; it is not a Java-specific feature.

2008/5/8 Chris Hostetter [EMAIL PROTECTED]:


 : My tests showed that it was a big difference. It took about 1.2 seconds
 to
 : index 500 separate adds in separate xml files (with a single commit
 : afterwards), compared to about 200 milliseconds when sending a single
 xml with
 : 500 adds. And according to the documentation java automatically uses
 : keep-alive (I found no way to force it myself).

 I'm not sure what you mean by java automatically uses keep-alive ... you
 mean you wrote your client code using java?  but how do you initiate your
 connections to Solr?

 Nothing I know of in the way Solr handles updates should make adding
 multiple docs in one request faster then adding one doc per request -- any
 added overhead should be in the servlet container (and keep-alive should
 minimize that) ... if you have a simple reproducable test that
 demonstrates otherwise, i would consider that a performance bug.

 :  i thought we added something like this ... but i guess not.
 : 
 :  feel free to file a feature request in Jira.
 :
 : ah, but I guess it is only available in a nightly build? Do you know a
 jira
 : issue number I can look at? I didn't find anything related to this.

 no, i mean: i thought we added it, but when i tried on the trunk i see the
 same thing you see ... please file a feature request.



 -Hoss




-- 
Alexander Ramos Jardim
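
For context, "a single xml with 500 adds" means one <add> element carrying many <doc> blocks. A sketch, with illustrative field names:

  <add>
    <doc><field name="id">1</field><field name="text">first document</field></doc>
    <doc><field name="id">2</field><field name="text">second document</field></doc>
    <!-- ... remaining documents ... -->
  </add>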


Re: help for preprocessing the query

2008-05-13 Thread Noble Paul നോബിള്‍ नोब्ळ्
http://java.sun.com/products/servlet/Filters.html
this is a servlet container feature.
BTW, this may not be the right forum for this topic.
--Noble
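
In short: the <filter> element only declares the filter class; a <filter-mapping> tells the container which requests the filter intercepts, so without one the filter is never invoked. A minimal sketch (the URL pattern here is an assumption — it routes every request through the filter):

  <filter-mapping>
    <filter-name>CustomFilter</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>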

On Tue, May 13, 2008 at 5:04 PM, Umar Shah [EMAIL PROTECTED] wrote:
 On Tue, May 13, 2008 at 4:39 PM, Shalin Shekhar Mangar 
  [EMAIL PROTECTED] wrote:

   Did you put a filter-mapping in web.xml?


  no,
  I just did that and it seems to be working...

  what is filter-mapping required for?




  
   On Tue, May 13, 2008 at 4:20 PM, Umar Shah [EMAIL PROTECTED] wrote:
  
On Mon, May 12, 2008 at 10:30 PM, Shalin Shekhar Mangar 
[EMAIL PROTECTED] wrote:
   
  You'll *not* write a servlet. You'll implement the Filter interface:

   http://java.sun.com/j2ee/sdk_1.3/techdocs/api/javax/servlet/Filter.html

 In the doFilter method, you'll create a ServletRequestWrapper which
 changes
 the incoming param. Then you'll call chain.doFilter with the new
   request
 object. You'll need to add this filter before the SolrRequestFilter in
 Solr's web.xml
   
I created a CustomFilter that would dump the request contents to a file,
I created the jar and added it to the solr.war in WEB_INF/lib folder
I edited the web.xml in the same folder to include the following lines:
<filter>
   <filter-name>CustomFilter</filter-name>
   <filter-class>(packagename).CustomFilter</filter-class>
 </filter>
   
where CustomFilter is the name of the class extending
javax.servlet.Filter.
   
I don't see anything in the contents of the file..
   
thanks for your help
-umar
   
   

 Look at


   
   
  http://www.onjava.com/pub/a/onjava/2001/05/10/servlet_filters.html?page=1 for
 more details.

 On Mon, May 12, 2008 at 8:51 PM, Umar Shah [EMAIL PROTECTED] wrote:

  On Mon, May 12, 2008 at 8:42 PM, Shalin Shekhar Mangar 
  [EMAIL PROTECTED] wrote:
 
   ServletRequest and ServletRequestWrapper are part of the Java
  servlet-api
   (not Solr). Basically, Koji is hinting at writing a ServletFilter
   implementation (again using servlet-api) and creating a wrapper
   ServletRequest which modifies the underlying request params which
can
  then
   be used by Solr.
  
 
  sorry for the silly question, basically i am new to servlets.
  Now If my understanding is right , I will need to create a
 servlet/wrapper
  that would listen the user facing queries and then pass the
   processed
 text
  to solr request handler and I need to pack this servlet class file
into
  Solr
  war file.
 
  But How would I ensure that my servlet is called instead of solr
request
  handler?
 
 
   On Mon, May 12, 2008 at 8:36 PM, Umar Shah [EMAIL PROTECTED]
wrote:
  
On Mon, May 12, 2008 at 2:50 PM, Koji Sekiguchi 
[EMAIL PROTECTED]
wrote:
   
 Hi Umar,

 You may be able to preprocess your request parameter in your
 servlet filter. In the doFilter() method, you do:

 ServletRequest myRequest = new MyServletRequestWrapper(
   request
);
   
   
Thanks for your response,
   
 Where is the ServletRequest class? I am using Solr 1.3 trunk code;
 I found SolrServlet, but it is deprecated. Which class can I use instead of
 SolrRequest in the 1.3 codebase?
   
   
 I also tried overloading the standard request handler. How do I rewrite
 query params there?
   
Can you point me to some documentation?
   
   
   :
 chain.doFilter( myRequest, response );

 And you have MyServletRequestWrapper that extends
   ServletRequestWrapper.
 Then you can get|set q* parameters through getParameter()
method.

 Hope this helps,

 Koji



 Umar Shah wrote:

  Hi,
 
  Due some requirement I need to transform the user queries
before
passing
  it
  to the standard handler in Solr,  can anyone suggest me the
best
  way
to
  do
  this.
 
  I will need to use a transfomation class that would provide
   functions
to
  process the input query 'qIn' and transform it to the
resultant
   query
  'qOut'
  and then pass it to solr handler as if qOut were the
   original
 user
  query.
 
  thanks in anticipation,
  -umar
 
 
 


   
  
  
  
   --
   Regards,
   Shalin Shekhar Mangar.
  
 



 --
 Regards,
 Shalin Shekhar Mangar.

   
  
  
  
   --
   Regards,
   Shalin Shekhar Mangar.
  




-- 
--Noble Paul


Re: ERROR:unknown field, but what document was it?

2008-05-13 Thread Yonik Seeley
On Thu, May 8, 2008 at 4:59 PM,  [EMAIL PROTECTED] wrote:
  My tests showed that it was a big difference. It took about 1.2 seconds to
 index 500 separate adds in separate xml files (with a single commit
 afterwards), compared to about 200 milliseconds when sending a single xml
 with 500 adds.

Did you overlap the adds (use multiple threads)?

-Yonik


Re: Commit problems on Solr 1.2 with Tomcat

2008-05-13 Thread William Pierce

Thanks for the comments

The reason I am just adding one document followed by a commit is for this 
particular test --- in actuality,  I will be loading documents from a db. 
But thanks for the pointer on the ?commit=true on the add command.


Now on the <commit /> problem itself,  I am still confused:  Doesn't the
commit count of 1 indicate that the commit is completed?


In any event,  just for testing purposes,  I started everything from scratch
(deleted all documents, stopped/restarted tomcat).  I noticed that the only
files in my index folder were:  segments.gen and segments_1.


Then I did the add followed by <commit /> and noticed that there were now
three files:  segments.gen, segments_1 and write.lock.


Now it is 7 minutes later, and when I query the index using the
"http://localhost:59575/splus1/admin/" url, I still do not see the document.


Again, when I issue another <commit /> command, everything seems to work.
Why are TWO commit commands apparently required?


Thanks,

Sridhar

--
From: Yonik Seeley [EMAIL PROTECTED]
Sent: Tuesday, May 13, 2008 6:42 AM
To: solr-user@lucene.apache.org
Subject: Re: Commit problems on Solr 1.2 with Tomcat


By default, a commit won't return until a new searcher has been opened
and the results are visible.
So just make sure you wait for the commit command to return before 
querying.


Also, if you are committing every add, you can avoid a separate commit
command by putting ?commit=true in the URL of the add command.

-Yonik

On Tue, May 13, 2008 at 9:31 AM, Alexander Ramos Jardim
[EMAIL PROTECTED] wrote:

Maybe a delay in commit? How may time elapsed between commits?

 2008/5/13 William Pierce [EMAIL PROTECTED]:



  Hi,
 
  I am having problems with Solr 1.2 running tomcat version 6.0.16 (I 
also
  tried 6.0.14 but same problems exist).  Here is the situation:  I have 
an

  ASP.net application where I am trying to add and commit a single
  document to an index.   After I add the document and issue the commit 
/ I
  can see (in the solr stats page) that the commit count has been 
increment

  but the docsPending is 1,  and my document is still not visible from a
  search perspective.
 
  When I issue another commit/,  the commit counter increments,
   docsPending is now zero,  and my document is visible and searchable.
 
  I saw that someone was observing problems with 6.0.16 tomcat,  so I
  reverted back to 6.0.14.  Same problem.
 
  Can anyone help?
 
  -- Bill




 --
 Alexander Ramos Jardim





Re: Commit problems on Solr 1.2 with Tomcat

2008-05-13 Thread Erik Hatcher
I'm not sure if you are issuing a separate <commit/> _request_ after
your add, or putting a <commit/> into the same request.  Solr only
supports one command (add or commit, but not both) per request.


Erik


On May 13, 2008, at 10:36 AM, William Pierce wrote:


Thanks for the comments

The reason I am just adding one document followed by a commit is  
for this particular test --- in actuality,  I will be loading  
documents from a db. But thanks for the pointer on the ?commit=true  
on the add command.


Now on the commit / problem itself,  I am still confused:   
Doesn't the commit count of 1 indicate that the commit is completed?


In any event,  just for testing purposes,  I started everything  
from scratch (deleted all documents, stopped/restarted tomcat).  I  
noticed that the only files in my index folder were:  segments.gen  
and segments_1.


Then I did the add followed by commit / and noticed that there  
were now three files:  segments.gen, segments_1 and write.lock.


Now it is 7 minutes later, and when I query the index using the  
http://localhost:59575/splus1/admin/; url, I still do not see the  
document.


Again, when I issue another commit / command everything seems to  
work. Why are TWO commit commands apparently required?


Thanks,

Sridhar

--
From: Yonik Seeley [EMAIL PROTECTED]
Sent: Tuesday, May 13, 2008 6:42 AM
To: solr-user@lucene.apache.org
Subject: Re: Commit problems on Solr 1.2 with Tomcat

By default, a commit won't return until a new searcher has been  
opened

and the results are visible.
So just make sure you wait for the commit command to return before  
querying.


Also, if you are committing every add, you can avoid a separate  
commit

command by putting ?commit=true in the URL of the add command.

-Yonik

On Tue, May 13, 2008 at 9:31 AM, Alexander Ramos Jardim
[EMAIL PROTECTED] wrote:

Maybe a delay in commit? How may time elapsed between commits?

 2008/5/13 William Pierce [EMAIL PROTECTED]:



  Hi,
 
  I am having problems with Solr 1.2 running tomcat version  
6.0.16 (I also
  tried 6.0.14 but same problems exist).  Here is the  
situation:  I have an
  ASP.net application where I am trying to add and commit a  
single
  document to an index.   After I add the document and issue the  
commit / I
  can see (in the solr stats page) that the commit count has  
been increment
  but the docsPending is 1,  and my document is still not  
visible from a

  search perspective.
 
  When I issue another commit/,  the commit counter increments,
   docsPending is now zero,  and my document is visible and  
searchable.

 
  I saw that someone was observing problems with 6.0.16 tomcat,   
so I

  reverted back to 6.0.14.  Same problem.
 
  Can anyone help?
 
  -- Bill




 --
 Alexander Ramos Jardim





Re: Commit problems on Solr 1.2 with Tomcat

2008-05-13 Thread William Pierce

Erik:  I am indeed issuing multiple Solr requests.

Here is my code snippet (deletexml and addxml are the strings that contain 
the add and delete strings for the items to be added or deleted).   For 
our simple example,  nothing is being deleted so stufftodelete is always 
false.


//we are done...we now need to post the requests...
if (stufftodelete)
{
    SendSolrIndexingRequest(deletexml);
}
if (stufftoadd)
{
    SendSolrIndexingRequest(addxml);
}

if (stufftodelete || stufftoadd)
{
    SendSolrIndexingRequest("<commit waitFlush=\"true\" waitSearcher=\"true\"/>");
}

I am using the full form of the commit here just to see if the <commit />
was somehow not working.


The SendSolrIndexingRequest is the routine that takes the string argument 
and issues the POST request to the update URL.


Thanks,

Bill

--
From: Erik Hatcher [EMAIL PROTECTED]
Sent: Tuesday, May 13, 2008 7:40 AM
To: solr-user@lucene.apache.org
Subject: Re: Commit problems on Solr 1.2 with Tomcat

I'm not sure if you are issuing a separate <commit/> _request_ after your
add, or putting a <commit/> into the same request.  Solr only supports
one command (add or commit, but not both) per request.


Erik


On May 13, 2008, at 10:36 AM, William Pierce wrote:


Thanks for the comments

The reason I am just adding one document followed by a commit is  for 
this particular test --- in actuality,  I will be loading  documents from 
a db. But thanks for the pointer on the ?commit=true  on the add command.


Now on the commit / problem itself,  I am still confused:   Doesn't the 
commit count of 1 indicate that the commit is completed?


In any event,  just for testing purposes,  I started everything  from 
scratch (deleted all documents, stopped/restarted tomcat).  I  noticed 
that the only files in my index folder were:  segments.gen  and 
segments_1.


Then I did the add followed by commit / and noticed that there  were 
now three files:  segments.gen, segments_1 and write.lock.


Now it is 7 minutes later, and when I query the index using the 
http://localhost:59575/splus1/admin/; url, I still do not see the 
document.


Again, when I issue another commit / command everything seems to  work. 
Why are TWO commit commands apparently required?


Thanks,

Sridhar

--
From: Yonik Seeley [EMAIL PROTECTED]
Sent: Tuesday, May 13, 2008 6:42 AM
To: solr-user@lucene.apache.org
Subject: Re: Commit problems on Solr 1.2 with Tomcat


By default, a commit won't return until a new searcher has been  opened
and the results are visible.
So just make sure you wait for the commit command to return before 
querying.


Also, if you are committing every add, you can avoid a separate  commit
command by putting ?commit=true in the URL of the add command.

-Yonik

On Tue, May 13, 2008 at 9:31 AM, Alexander Ramos Jardim
[EMAIL PROTECTED] wrote:

Maybe a delay in commit? How may time elapsed between commits?

 2008/5/13 William Pierce [EMAIL PROTECTED]:



  Hi,
 
  I am having problems with Solr 1.2 running tomcat version  6.0.16 (I 
also
  tried 6.0.14 but same problems exist).  Here is the  situation:  I 
have an
  ASP.net application where I am trying to add and commit a 
single
  document to an index.   After I add the document and issue the 
commit / I
  can see (in the solr stats page) that the commit count has  been 
increment
  but the docsPending is 1,  and my document is still not  visible 
from a

  search perspective.
 
  When I issue another commit/,  the commit counter increments,
   docsPending is now zero,  and my document is visible and 
searchable.

 
  I saw that someone was observing problems with 6.0.16 tomcat,   so I
  reverted back to 6.0.14.  Same problem.
 
  Can anyone help?
 
  -- Bill




 --
 Alexander Ramos Jardim






Re: Commit problems on Solr 1.2 with Tomcat

2008-05-13 Thread Yonik Seeley
Is SendSolrIndexingRequest synchronous or asynchronous?
If the call to SendSolrIndexingRequest() can return before the
response from the add is received, then the commit could sneak in and
finish *before* the add is done (in which case, you won't see it
before the next commit).

-Yonik

On Tue, May 13, 2008 at 10:49 AM, William Pierce [EMAIL PROTECTED] wrote:
 Erik:  I am indeed issuing multiple Solr requests.

  Here is my code snippet (deletexml and addxml are the strings that contain
 the add and delete strings for the items to be added or deleted).   For
 our simple example,  nothing is being deleted so stufftodelete is always
 false.

 //we are done...we now need to post the requests...
if (stufftodelete)
{
SendSolrIndexingRequest(deletexml);
}
if (stufftoadd)
{
SendSolrIndexingRequest(addxml);
}

if ( stufftodelete || stufftoadd)
{
        SendSolrIndexingRequest("<commit waitFlush=\"true\" waitSearcher=\"true\"/>");
    }

  I am using the full form of the commit here just to see if the <commit />
 was somehow not working.

  The SendSolrIndexingRequest is the routine that takes the string argument
 and issues the POST request to the update URL.

  Thanks,

  Bill

  --
  From: Erik Hatcher [EMAIL PROTECTED]
  Sent: Tuesday, May 13, 2008 7:40 AM


  To: solr-user@lucene.apache.org
  Subject: Re: Commit problems on Solr 1.2 with Tomcat


  I'm not sure if you are issuing a separate commit/ _request_ after  your
 add, or putting a commit/ into the same request.  Solr only  supports
 one command (add or commit, but not both) per request.
 
  Erik
 
 
  On May 13, 2008, at 10:36 AM, William Pierce wrote:
 
 
   Thanks for the comments
  
   The reason I am just adding one document followed by a commit is  for
 this particular test --- in actuality,  I will be loading  documents from a
 db. But thanks for the pointer on the ?commit=true  on the add command.
  
   Now on the commit / problem itself,  I am still confused:   Doesn't
 the commit count of 1 indicate that the commit is completed?
  
   In any event,  just for testing purposes,  I started everything  from
 scratch (deleted all documents, stopped/restarted tomcat).  I  noticed that
 the only files in my index folder were:  segments.gen  and segments_1.
  
   Then I did the add followed by commit / and noticed that there  were
 now three files:  segments.gen, segments_1 and write.lock.
  
   Now it is 7 minutes later, and when I query the index using the
 http://localhost:59575/splus1/admin/; url, I still do not see the document.
  
   Again, when I issue another commit / command everything seems to
 work. Why are TWO commit commands apparently required?
  
   Thanks,
  
   Sridhar
  
   --
   From: Yonik Seeley [EMAIL PROTECTED]
   Sent: Tuesday, May 13, 2008 6:42 AM
   To: solr-user@lucene.apache.org
   Subject: Re: Commit problems on Solr 1.2 with Tomcat
  
  
By default, a commit won't return until a new searcher has been
 opened
and the results are visible.
So just make sure you wait for the commit command to return before
 querying.
   
Also, if you are committing every add, you can avoid a separate
 commit
command by putting ?commit=true in the URL of the add command.
   
-Yonik
   
On Tue, May 13, 2008 at 9:31 AM, Alexander Ramos Jardim
[EMAIL PROTECTED] wrote:
   
 Maybe a delay in commit? How may time elapsed between commits?

  2008/5/13 William Pierce [EMAIL PROTECTED]:



   Hi,
  
   I am having problems with Solr 1.2 running tomcat version  6.0.16
 (I also
   tried 6.0.14 but same problems exist).  Here is the  situation:
 I have an
   ASP.net application where I am trying to add and commit a
 single
   document to an index.   After I add the document and issue the
 commit / I
   can see (in the solr stats page) that the commit count has  been
 increment
   but the docsPending is 1,  and my document is still not  visible
 from a
   search perspective.
  
   When I issue another commit/,  the commit counter increments,
docsPending is now zero,  and my document is visible and
 searchable.
  
   I saw that someone was observing problems with 6.0.16 tomcat,
 so I
   reverted back to 6.0.14.  Same problem.
  
   Can anyone help?
  
   -- Bill




  --
  Alexander Ramos Jardim


   
  
 
 
 



Re: How Special Character '' used in indexing

2008-05-13 Thread Walter Underwood
ASAP means As Soon As Possible, not As Soon As Convenient.
Please don't say that if you don't mean it. --wunder

On 5/12/08 6:48 AM, Ricky [EMAIL PROTECTED] wrote:

 Hi Mike,
 
 Thanx for your reply. I have got the answer to the question posted.
 
 I know people are donating time here. ASAP doesnt mean that am demanding
 them to reply fast. Please read the lines before you comment something(*Please
 kindly* reply ASAP). Am a newbie and with curiosity i have requested to
 answer. I dont know if it has hurt you(Am sorry for that)
 
 Thanks,
 Ricky.
 
 
 On Fri, May 9, 2008 at 3:30 PM, Mike Klaas [EMAIL PROTECTED] wrote:
 
 
 On 9-May-08, at 6:26 AM, Ricky wrote:
 
   I have tried sending '&amp;' instead of '&', like the following:
  <field name="company">A &amp; K Inc</field>.

  But I still get the same error: entity reference name can not contain
  character ' ', position: START_TAG seen ...<field name="company">A &amp
  ..
 
 
 Please use a library for doing xml encoding--there is absolutely no reason
 to do this yourself.
 
  Please kindly reply ASAP.
 
 
 Please also realize that people responding here are donating their time
 and that it is inappropriate to ask for an expedited response.
 
 -Mike
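
  A sketch of the library route Mike suggests, using Commons Lang's
  StringEscapeUtils as one option among several (the field name and value
  are illustrative, and commons-lang is assumed to be on the classpath):

    import org.apache.commons.lang.StringEscapeUtils;

    public class XmlFieldEncoder {
        public static void main(String[] args) {
            String raw = "A & K Inc";
            // escapeXml turns & into &amp;, < into &lt;, and so on.
            String xml = "<field name=\"company\">"
                    + StringEscapeUtils.escapeXml(raw)
                    + "</field>";
            System.out.println(xml); // <field name="company">A &amp; K Inc</field>
        }
    }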
 
 



Re: Extending XmlRequestHandler

2008-05-13 Thread Walter Underwood
There is one huge advantage of talking to Solr with SolrJ (or any
other client that uses the REST API), and that is that you can
put an HTTP cache between that and Solr. We get a 75% hit rate
on that cache. SOAP is not cacheable in any useful sense.

I designed and implemented the SOAP interface for all the search
engines at Verity, so I'm not just guessing about this.

wunder

On 5/12/08 7:02 AM, Erik Hatcher [EMAIL PROTECTED] wrote:

 
 On May 12, 2008, at 9:52 AM, Alexander Ramos Jardim wrote:
 I understood what you said about putting the SOAP at Solr. I agree.
 That's
 not smart.
 Now, I am thinking about the web service talking with an embedded Solr
 server.
 Is that you were talking about?
 
 Quite pleasantly you don't even really have to code in that level of
 detail in any hardcoded way.  You can use SolrJ behind a SOAP
 interface, and use it with a SolrServer.  The implementation of that
 can switch between embedded (which I'm not even really sure what
 that means exactly) or via HTTP the good ol' fashioned way.
 
 Erik
 
 
 



Re: single character terms in index - why?

2008-05-13 Thread Walter Underwood
We have some useful single character terms in the rating field,
like G and R, alongside PG and others.

wunder

On 5/12/08 1:33 PM, Yonik Seeley [EMAIL PROTECTED] wrote:

 On Mon, May 12, 2008 at 4:13 PM, Naomi Dushay [EMAIL PROTECTED] wrote:
  So I'm now asking:  why would SOLR want single character terms?
 
 Solr, like Lucene, can be configured however you want.  The example
 schema is just that - an example.
 
 But, there are many field types that might be interested in keeping
 single letter terms.
 One can even think of examples where single letter terms would be
 useful for normal full-text fields, depending on the domain or on the
 analysys.
 
 One simple example:  d-day might be alternately indexed as d day
 so it would be found with a query of d day
 
 -Yonik



Re: JMX monitoring

2008-05-13 Thread Chris Hostetter

: Thank you, Shalin!
: 
: It works great.

please post feedback like that in the Jira issue (and ideally: vote for 
the issue as well)

comments on issues from people saying that they tried out patches and
found them useful help committers assess the utility of features and the
effectiveness of the patch.




-Hoss



Re: Field Grouping

2008-05-13 Thread oleg_gnatovskiy

There is an XSLT example here: http://wiki.apache.org/solr/XsltResponseWriter
, but it doesn't seem like that would work either... This example would only
do a group by for the current page. If I use Solr for pagination, this would
not work for me.


oleg_gnatovskiy wrote:
 
 But I don't want the search results to be ranked based on that field. I
 only want all the documents with the same value grouped together... The
 way my system is set up, most documents will have that field empty. Thus,
 if I sort by it, those documents that have a value will bubble to the
 top...
 
 
 
 Yonik Seeley wrote:
 
 On Mon, May 12, 2008 at 9:58 PM, oleg_gnatovskiy
 [EMAIL PROTECTED] wrote:
  Hello. I was wondering if there is a way to get solr to return fields
 with
  the same value for a particular field together. For example I might
 want to
  have all the documents with exactly the same name field all returned
 next to
  each other. Is this possible? Thanks!
 
 Sort by that field.  Since you can only sort by fields with a single
 term at most (this rules out full-text fields), you might want to do a
 copyField of the name field to something like a name_s field which
 is of type string (which can be sorted on).
 
 -Yonik
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Field-Grouping-tp17199592p17215641.html
Sent from the Solr - User mailing list archive at Nabble.com.
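
For reference, Yonik's copyField suggestion from earlier in the thread looks something like this in schema.xml (name_s is an illustrative field name):

  <field name="name_s" type="string" indexed="true" stored="false"/>
  <copyField source="name" dest="name_s"/>

Sorting on it (sort=name_s asc) then groups identical names together, with the caveats discussed above.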



Re: Unlimited number of return documents?

2008-05-13 Thread Marc Bechler

Hi Walter,

thanks for your advice and, indeed, that is correct, too (and I will
likely implement the cleaning mechanism this way). (Btw: what would the
query look like to get rows 101-200 in the second chunk?) However, fetching in
chunks is not atomic, so you may not get consistent results.


Regards,

 marc
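
(On the chunking question: Solr pages with the start and rows parameters, where start is a zero-based offset, so rows 101-200 would be something like

  http://localhost:8983/solr/select?q=*:*&start=100&rows=100

— host and handler path assumed.)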



Walter Underwood wrote:

Nope. You should fetch all the rows in 100 row chunks. Much, much
better than getting them all in one request. I do that to load
the auto-complete table.

I really cannot think of a good reason to fetch all the rows
in one request. That is more like a denial of service attack
than like a useful engineering solution.

wunder

On 5/9/08 11:11 AM, Marc Bechler [EMAIL PROTECTED] wrote:


Hi all,

one possible use case could be to synchronize the index against a given
database. E.g., assume that you have a filesystem that is indexed
periodically. If files are deleted on this filesystem, they will not be
deleted in the index. This way, you can get (e.g.) the complete content
from your index in order to check for consistency.

Btw: I also played around with the rows parameter in order to get the
whole index back, but I got exceptions (insufficient heap space) when
setting rows above certain thresholds.

Regards,

  marc


Erik Hatcher wrote:

Or make two requests...  one with rows=0 to see how many documents match
without retrieving any, then another with that amount specified.

Erik


On May 9, 2008, at 8:54 AM, Francisco Sanmartin wrote:

Yeah, I understand the possible problems of changing this value. It's
just a very particular case and there won't be a lot of documents to
return. I guess I'll have to use a very high int number, I just wanted
to know if there was any proper configuration for this situation.

Thanks for the answer!

Pako


Otis Gospodnetic wrote:

Will something a la rows=max int here work? ;) But are you sure you
want to do that?  It could be sloow.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 


From: Francisco Sanmartin [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Thursday, May 8, 2008 4:18:46 PM
Subject: Unlimited number of return documents?

What is the value to set for rows in solrconfig.xml in order not to
have any limitation on the number of returned documents? I've
tried with -1 and 0 but no luck...

<int name="rows">10</int>
I want solr to return all available documents by default.

Thanks!

Pako









Re: Unlimited number of return documents?

2008-05-13 Thread Alexander Ramos Jardim
I think that keeping a transaction log is the best approach for your use case.

2008/5/13 Marc Bechler [EMAIL PROTECTED]:

 Hi Walter,

 thanks for your advice and, indeed, that is correct, too (and I will
 likely implement the cleaning mechanism this way). (Btw: what would the
 query look like to get row 101-200 in the second chunk?) However, using
 chunks is not atomic so you may not get results of inegrity.

 Regards,

  marc



  Walter Underwood wrote:

  Nope. You should fetch all the rows in 100 row chunks. Much, much
  better than getting them all in one request. I do that to load
  the auto-complete table.
 
  I really cannot think of a good reason to fetch all the rows
  in one request. That is more like a denial of service attack
  than like a useful engineering solution.
 
  wunder
 
  On 5/9/08 11:11 AM, Marc Bechler [EMAIL PROTECTED] wrote:
 
   Hi all,
  
   one possible use case could be to synchronize the index against a
   given
   database. E.g., assume that you have a filesystem that is indexed
   periodically. If files are deleted on this filesystem, they will not
   be
   deleted in the index. This way, you can get (e.g.) the complete
   content
   from your index in order to check for consistency.
  
   Btw: I also played around with the rows parameter in order to get the
   overall index; but I got exceptions (not sufficient heap space),
   when
   setting up rows above some higher thresholds.
  
   Regards,
  
marc
  
  
   Erik Hatcher wrote:
  
Or make two requests...  one with rows=0 to see how many documents
match
without retrieving any, then another with that amount specified.
   
   Erik
   
   
On May 9, 2008, at 8:54 AM, Francisco Sanmartin wrote:
   
 Yeah, I understand the possible problems of changing this value.
 It's
 just a very particular case and there won't be a lot of documents
 to
 return. I guess I'll have to use a very high int number, I just
 wanted
 to know if there was any proper configuration for this
 situation.

 Thanks for the answer!

 Pako


 Otis Gospodnetic wrote:

  Will something a la rows=max int here work? ;) But are you
  sure you
  want to do that?  It could be sloow.
 
 
  Otis
  --
  Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
  - Original Message 
 
   From: Francisco Sanmartin [EMAIL PROTECTED]
   To: solr-user@lucene.apache.org
   Sent: Thursday, May 8, 2008 4:18:46 PM
   Subject: Unlimited number of return documents?
  
   What is the value to set to rows in solrconfig.xml in order
   not to
   have any limitation about the number of returned documents?
   I've
   tried with -1 and 0 but not luck...
  
   solr 0 name=rows*10*
   I want solr to return all available documents by default.
  
   Thanks!
  
   Pako
  
  
 
 
 
 


-- 
Alexander Ramos Jardim


Re: Field Grouping

2008-05-13 Thread Ryan McKinley

You may want to check field collapsing
https://issues.apache.org/jira/browse/SOLR-236

There is a patch that works against 1.2, but the one for trunk needs  
some work before it can work...


ryan


On May 13, 2008, at 2:46 PM, oleg_gnatovskiy wrote:


There is an XSLT example here: http://wiki.apache.org/solr/XsltResponseWriter
, but it doesn't seem like that would work either... This example  
would only
do a group by for the current page. If I use Solr for pagination,  
this would

not work for me.


oleg_gnatovskiy wrote:


But I don't want the search results to be ranked based on that  
field. I
only want all the documents with the same value grouped together...  
The
way my system is set up, most documents will have that field empty.  
Thus,

if Is rot by it, those documents that have a value will bubble to the
top...



Yonik Seeley wrote:


On Mon, May 12, 2008 at 9:58 PM, oleg_gnatovskiy
[EMAIL PROTECTED] wrote:
Hello. I was wondering if there is a way to get solr to return  
fields

with
the same value for a particular field together. For example I might
want to
have all the documents with exactly the same name field all  
returned

next to
each other. Is this possible? Thanks!


Sort by that field.  Since you can only sort by fields with a single
term at most (this rules out full-text fields), you might want to  
do a
copyField of the name field to something like a name_s field  
which

is of type string (which can be sorted on).

-Yonik







--
View this message in context: 
http://www.nabble.com/Field-Grouping-tp17199592p17215641.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: Differences between nightly builds

2008-05-13 Thread Otis Gospodnetic
Lucas,

Look at the Solr svn repository's root and you will see a file called
CHANGES.txt.  That contains all major Solr changes back to January 2006.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
 From: Lucas F. A. Teixeira [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Tuesday, May 13, 2008 6:59:55 AM
 Subject: Differences between nightly builds
 
 Hello,
 
 Here we use a nightly build from Aug '07. It does what we need, with some
 bugs that we've worked around.
 I want to change this to a newer nightly build, but as this one is 'stable',
 people are afraid of changing to an 'unknown' build.

 Is there some place where I can find all changes between some date (my
 Aug '07) and nowadays? Maybe with this I can change their minds!
 
 Thank you.
 
 []s,
 
 
 -- 
 Lucas Frare A. Teixeira
 [EMAIL PROTECTED] 
 Tel: +55 11 3660.1622 - R3018



Re: phrase query with DismaxHandler

2008-05-13 Thread Otis Gospodnetic
Hi,

I don't think what you said makes 100% sense.  Both words "windows" and
"installation" will be different when stemmed.  Also, the word "combination" will
not get stemmed to "combine" (that's not how the Porter stemmer would chop it
down).

Go to the Solr admin page, enter "windows installation", then modify the URL and
add qt=dismax&debugQuery=true, and have a look at the XML.  It will contain
the query string rewritten by DisMax, which will tell you what's going on.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
 From: KhushbooLohia [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Tuesday, May 13, 2008 5:50:30 AM
 Subject: phrase query with DismaxHandler
 
 
 Hi All,
 
 I am using EnglishPorterFilterFactory in text field for stemming the words. 
 Also I am using DisMaxRequestHandler for handling requests.
 When a phrase query is passed to solr, e.g. "windows installation",
 sometimes the results obtained are correct, but sometimes the results match
 only the word "install", or just "windows", or just "installation".
 I've observed that if the phrase doesn't have anything to be stemmed, like
 "windows" or "company", the results are returned as expected. But phrases with
 words like "combination" or "conclusion" get stemmed to "combine" or "conclude"
 and bring weird results.
 
 
 Please revert back.
 
 Thanks
 Khushboo
 
 -- 
 View this message in context: 
 http://www.nabble.com/phrase-query-with-DismaxHandler-tp17204921p17204921.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: the time factor

2008-05-13 Thread Otis Gospodnetic
Jack,

The answer is: function queries! :)
You can easily use function queries with DisMaxRequestHandler.  For example, 
this is what you can add to the dismax config section in solrconfig.xml:

 <str name="bf">
recip(rord(addDate),1,1000,1000)^2.5
 </str>

Assuming you have an addDate field, this will give fresher documents some boost:
recip(x,m,a,b) computes a/(m*x + b), and rord(addDate) is the reverse ordinal of
the date (1 for the newest document), so recent documents score close to 1 and
older ones decay.  Look for this on the Wiki, it's all there.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
 From: JLIST [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Tuesday, May 13, 2008 5:42:38 AM
 Subject: the time factor
 
 Hi,
 
 I'm indexing news articles from a few news feeds.
 With news, there's the factor of relevance and also the
 factor of freshness. Relevance-only results are not satisfactory.
 Sorting on feed update time is not satisfactory, either,
 because one source may update more frequently than the
 others and it tends to occupy the first rows most of
 the time. I wonder what is the best way of combining the
 time factor in news search?
 
 Thanks,
 Jack



Re: Duplicates results when using a non optimized index

2008-05-13 Thread Otis Gospodnetic
Hm, not sure why that is happening, but here is some info regarding other stuff 
from your email

- there should be no duplicates even if you are searching an index that is 
being optimized
- why are you searching an index that is being optimized?  It's doable, but 
people typically perform index-modifying operations on a Solr master and 
read-only operations on Solr query slave(s)
- do duplicates go away after optimization is done?
- are the duplicate IDs that you are seeing IDs of previously deleted documents?
- which Solr version are you using and can you try a recent nightly?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
 From: Tim Mahy [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Tuesday, May 13, 2008 5:59:28 AM
 Subject: Duplicates results when using a non optimized index
 
 Hi all,
 
 is this expected behavior when having an index like this :
 
 numDocs : 9479963
 maxDoc : 12622942
 readerImpl : MultiReader
 
 which is in the process of being optimized, and when we search through the
 index we get this:

 <doc>
   <long name="id">15257559</long>
 </doc>
 <doc>
   <long name="id">15257559</long>
 </doc>
 <doc>
   <long name="id">17177888</long>
 </doc>
 <doc>
   <long name="id">11825631</long>
 </doc>
 <doc>
   <long name="id">11825631</long>
 </doc>

 The id field is declared like this:
 <field name="id" type="long" indexed="true" stored="true" required="true" />

 and is set as the unique key like this in the schema.xml:
   <uniqueKey>id</uniqueKey>

 so the question : is this expected behavior and if so is there a way to let 
 Solr 
 only return unique documents ?
 
 greetings and thanx in advance,
 Tim
 
 
 
 