Word Delimiter struggles

2009-01-16 Thread David Shettler
This has likely been covered before, and I've tried searching through the
archives, but I'm having trouble finding an answer.

On OSVDB.org, if you search for:

title:PHPGroupWare

You get...nothing

If you search for:

title:phpGroupWare

(which is how the entry is indexed originally), you get a match of course.

The same goes for phpgroupware.

If I get rid of the word delimiter filter, then things are fine, unless you
want to search for PHP GroupWare and get a match...

Basically, I need to get a match on any of these searches:

PHPGroupWare
PHPGroupware
phpGroupware
phpGroupWare
phpgroupware
php groupware
php group ware
PHPGroup ware

etc.

We've been dealing with this problem for about 36 months now, but
there has to be a better way...or am I dreaming? :)

Can anyone suggest a schema that would accommodate this?  I've
tried every combination of word delimiter options that I can think of, but
I'm no expert on the topic.

I can also manipulate the input prior to search and indexing if you can
think of a way there.  It's wanting the best of SQL's SELECT ... LIKE and
Solr's voodoo... perhaps I'm wanting too much!
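
For what it's worth, a schema sketch along these lines should cover those
variants (the options are stock WordDelimiterFilterFactory parameters, but
treat the exact combination as an assumption to verify on the analysis admin
page, not a known-good recipe):

  <fieldType name="text_wd" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <!-- split on case changes, and also index the catenated form and the
           original token, so phpGroupWare yields php/group/ware/phpgroupware -->
      <filter class="solr.WordDelimiterFilterFactory"
              generateWordParts="1" catenateWords="1" catenateAll="1"
              splitOnCaseChange="1" preserveOriginal="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory"
              generateWordParts="1" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

With lowercasing after the delimiter, PHPGroupWare and phpGroupWare reduce to
the same parts as the indexed entry, "php group ware" matches the generated
parts, and phpgroupware matches the catenated token; the PHPGroupware case
(a single case change) may additionally need catenateAll on the query side,
so checking each variant on the analysis page is worth the time.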

Cheers,

Dave
OSVDB.org


Commit frequency

2009-01-16 Thread George Aroush
Hi Folks,

I'm trying to collect some data -- if you can share it -- about the
commit frequency you have set on your index and what rate you found
acceptable.  This is for a non-master/slave setup.

In my case, in a test environment, I have experimented with a 1-minute
interval (every minute I commit anywhere between 0 and 10 new documents and
0 and 10 updated documents).  While the commit is ongoing, I'm also searching
the index.

For this experiment, my index size is about 3.5 GB, and I have about 1.2
million documents.  My experiment was done on a Windows 2003 server, with 4
GB RAM and two 3 GHz Xeon CPUs.

So, if you can share your setup, at least the commit frequency, I would
appreciate it.

What I'm trying to get out of this is the shortest commit interval that
Solr can handle.

Regards,

-- George



Re: about the xml output format

2009-01-16 Thread Marc Sturlese

Yeah, I saw that... I am wondering if the PHP serialized response uses UTF-8
encoding by default...
Thanks

Bill Au wrote:
> 
> Solr does have a PHPResponseWriter:
> 
> http://wiki.apache.org/solr/SolPHP?highlight=(CategoryQueryResponseWriter)|((CategoryQueryResponseWriter))
> 
> http://lucene.apache.org/solr/api/org/apache/solr/request/PHPResponseWriter.html
> 
> Bill
> 
> On Fri, Jan 16, 2009 at 1:09 PM, Marc Sturlese
> wrote:
> 
>>
>> Thanks, I have to study it, but I think the PHP serialized format will be
>> the best for my case...
>>
>> Erik Hatcher wrote:
>> >
>> >
>> > On Jan 16, 2009, at 12:26 PM, Marc Sturlese wrote:
>> >> I would like to know if there is any way to customize the output XML
>> >> that contains the response. I have been checking the source, and it
>> >> looks to me like it should be something close to XMLWriter.java and
>> >> XMLResponseWriter.java, but I'm not sure about that... Is there any
>> >> way to write a plugin to do that?
>> >
>> > There's an XSL option that you can use without writing any code:
>> >
>> > 
>> >
>> > But beyond that, you can write your own QueryResponseWriter
>> > implementation and plug it into solrconfig.xml for
>> > &wt=my_custom_writer.  See Solr's example solrconfig.xml for details
>> > on that.
>> >
>> >   Erik
>> >
>> >
>> >
>>
>>
>>
> 
> 




Re: populating synonyms.txt

2009-01-16 Thread Walter Underwood
Synonyms are domain-specific. A food site would list "arugula" and
"rocket" as synonyms, but that would be a bad idea for NASA.

wunder

On 1/16/09 1:35 PM, "Daniel Lovins"  wrote:

> Hello list.
> 
> Are there standardized lists out there for populating synonyms.txt?
> Entering the terms manually seems like a bad idea.
> 
> Thanks for your help.
> 
> Daniel



populating synonyms.txt

2009-01-16 Thread Daniel Lovins
Hello list.

Are there standardized lists out there for populating synonyms.txt?
Entering the terms manually seems like a bad idea.

Thanks for your help.

Daniel


Re: Index files deleted but still accessed by Tomcat

2009-01-16 Thread Chris Hostetter

: Problem solved. We gave the index space five times its original size, just
: to be safe. Then the optimize finished without problems and without
: leaving deleted, yet still open files behind. During the optimize the
: index directory peaked at almost three times its size, and now,
: afterwards, it's shrunk by about a fourth!

FWIW: adjusting your mergeFactor so that Solr merges segments more often
during regular adds can help reduce the amount of disk needed during a
full optimize.

Also: the recently added "maxSegments" option on optimize commands can be
used to only partially optimize ... I believe a series of partial
optimizes might wind up requiring less total disk, since it gradually frees
up disk space merging away the smaller files and lets that space be
reclaimed as new searchers get opened and the old searchers close ... but
I haven't really sat down to work it out to make sure I'm correct on
that one.
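
(A sketch of such a partial optimize via the XML update syntax; the target
segment count here is an arbitrary example:)

  curl http://localhost:8983/solr/update -H 'Content-Type: text/xml' \
       --data-binary '<optimize maxSegments="16"/>'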



-Hoss



Re: Is it just me or multicore default is broken? Can't ping

2009-01-16 Thread Chris Hostetter

: Also I can open the admin at
: 
: http://localhost:8983/solr/core1/admin/

: But then trying to ping
: http://localhost:8983/solr/core1/admin/ping
: 
: I get  error 500 INTERNAL SERVER ERROR

The ping URL only works if you configure the PingRequestHandler with a
default query, or use the extremely deprecated <pingQuery> syntax in the
<admin> block of your solrconfig.xml ... the multicore example has a very
bare-bones solrconfig.xml, so neither of these things is configured.
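
A sketch of the first option for solrconfig.xml, mirroring the stock
single-core example config:

  <requestHandler name="/admin/ping" class="PingRequestHandler">
    <lst name="defaults">
      <str name="qt">standard</str>
      <str name="q">solrpingquery</str>
      <str name="echoParams">all</str>
    </lst>
  </requestHandler>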

The error you are seeing is fairly unfortunate ... logging a better error
message about the lack of configuration would be ideal; I've opened
SOLR-965 to track this as an improvement.

-Hoss



Re: CoreAdmin for replication STATUS

2009-01-16 Thread Chris Hostetter

: > How do I find out the status of a slave's index?  I have the following
: > scenario:

FYI: regardless of where your index comes from (java replication, script
replication, manual mucking, etc...) the stats page of any Solr core will
tell you the index version number in use by the current searcher, which
always increases as the index is modified over time.

The replication handler might tell you that the slave is currently
replicating, but maybe it's already finished replicating the version you
care about and has moved on to replicating an even newer version?  Or
maybe replication has finished, but auto-warming is still taking place (I'm
not sure what the replication handler status says in that case).

Checking the info page on your master, remembering the version number, and
then checking the info page on the slaves will always tell you whether the
slave is currently searching an index version equal to or greater than
the one you last saw on the master...

http://localhost:8983/solr/admin/stats.jsp
...
  searcher
    ...
    indexVersion : 1200621623713
...


-Hoss



Re: Index files deleted but still accessed by Tomcat

2009-01-16 Thread Dominik Schramm
Hi again,

Problem solved. We gave the index space five times its original size, just
to be safe. Then the optimize finished without problems and without
leaving deleted, yet still open files behind. During the optimize the
index directory peaked at almost three times its size, and now,
afterwards, it's shrunk by about a fourth!

Thanks for your help, Alexander!

Dominik

Alexander Ramos Jardim wrote:
> Dominik,
>
> From my experience with Solr, I have seen the index double in size
> during an optimize.
>
> I always keep my index directory on a partition that has at least triple
> the index size, so I have a margin for optimizes and for natural doc
> quantity growth.
>
> 2009/1/16 Dominik Schramm 
>
>> One more thing that might help identify the problem source (which I've
>> discovered just now) -- at the time the optimize "finished" (or broke
>> off), Tomcat logged the following:
>>
>> r...@cms004:~# /opt/apache-tomcat-sba1-live/logs/catalina.2009-01-16.log
>> ...
>> Jan 16, 2009 11:40:14 AM org.apache.solr.core.SolrCore execute
>> INFO: /update/csv
>>
>> header=false&separator=;&commit=true&fieldnames=key,siln,biln,prln,date,validfrom,contractno,currency,
>>
>> reserve1,reserve2,posno,processingcode,ean,ek,pb,quantityunit,p_s_ek1,p_s_m1,p_s_q1,p_s_ek2,p_s_m2,p_s_q2,p_s_ek3,p_s_m3
>>
>> ,p_s_q3,p_s_ek4,p_s_m4,p_s_q4,p_s_ek5,p_s_m5,p_s_q5,p_s_ek6,p_s_m6,p_s_q6,p_s_ek7,p_s_m7,p_s_q7,p_s_ek8,p_s_m8,p_s_q8,p_
>>
>> s_ek9,p_s_m9,p_s_q9,p_s_ek10,p_s_m10,p_s_q10,a_s_ek1,a_s_m1,a_s_q1,a_s_ek2,a_s_m2,a_s_q2,a_s_ek3,a_s_m3,a_s_q3,a_s_ek4,a
>>
>> _s_m4,a_s_q4,a_s_ek5,a_s_m5,a_s_q5,a_s_ek6,a_s_m6,a_s_q6,a_s_ek7,a_s_m7,a_s_q7,a_s_ek8,a_s_m8,a_s_q8,a_s_ek9,a_s_m9,a_s_
>> q9,a_s_ek10,a_s_m10,a_s_q10&stream.file=/opt/asdf/bla/bla 3754
>> Jan 16, 2009 11:43:14 AM org.apache.solr.update.DirectUpdateHandler2
>> commit
>> INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
>> Jan 16, 2009 11:43:14 AM org.apache.solr.update.DirectUpdateHandler2
>> commit
>> INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
>> Jan 16, 2009 12:08:01 PM org.apache.solr.core.SolrException log
>> SEVERE: java.io.IOException: No space left on device
>>at java.io.RandomAccessFile.writeBytes(Native Method)
>>at java.io.RandomAccessFile.write(RandomAccessFile.java:456)
>>at
>>
>> org.apache.lucene.store.FSDirectory$FSIndexOutput.flushBuffer(FSDirectory.java:589)
>>at
>>
>> org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
>>at
>>
>> org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85)
>>at
>>
>> org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:109)
>>at
>>
>> org.apache.lucene.store.FSDirectory$FSIndexOutput.close(FSDirectory.java:594)
>>at
>> org.apache.lucene.index.FieldsWriter.close(FieldsWriter.java:48)
>>at
>> org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:211)
>>at
>> org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:96)
>>at
>> org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:1835)
>>at
>> org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1195)
>>at
>>
>> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:508)
>>at
>> de.businessmart.cai.solr.CAIUpdateHandler.commit(CAIUpdateHandler.java:343)
>>at
>>
>> org.apache.solr.handler.XmlUpdateRequestHandler.update(XmlUpdateRequestHandler.java:214)
>>at
>>
>> org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:84)
>>at
>>
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:77)
>>at org.apache.solr.core.SolrCore.execute(SolrCore.java:658)
>>at
>>
>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:191)
>>at
>>
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:159)
>>at
>>
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
>>at
>>
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
>>at
>>
>> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
>>at
>>
>> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
>>at
>>
>> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
>>at
>>
>> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
>>at
>> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:541)
>>at
>>
>> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
>>at
>> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
>>at
>>
>> org.apache.coyote.http11.Http11AprProc

Date Format in QueryParsing

2009-01-16 Thread Hana

Hi

When I parse a date range query in a custom RequestHandler I get the dates in
yyyy-MM-dd'T'HH:mm:ss format, but I would like them with the trailing 'Z' for
UTC time. Is there a way to set the desired date format?

Here is a snippet of the code:

SolrParams p = req.getParams();
String query = p.get(CommonParams.Q);
Query q = QueryParsing.parseQuery(query, req.getSchema());
log.debug(q.toString()); // outputs the dates in yyyy-MM-dd'T'HH:mm:ss format
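
If the goal is just to render the parsed Date values back out in Solr's
canonical UTC form, one option is to format them yourself (a sketch using
plain java.text, not a QueryParsing setting):

  import java.text.SimpleDateFormat;
  import java.util.Date;
  import java.util.TimeZone;

  // Solr's external date form, with the literal trailing 'Z' for UTC.
  SimpleDateFormat utc = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
  utc.setTimeZone(TimeZone.getTimeZone("UTC"));
  String external = utc.format(new Date());  // e.g. 2009-01-16T12:00:00Z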

Cheers! 

Hana




Re: about the xml output format

2009-01-16 Thread Bill Au
Solr does have a PHPResponseWriter:

http://wiki.apache.org/solr/SolPHP?highlight=(CategoryQueryResponseWriter)|((CategoryQueryResponseWriter))

http://lucene.apache.org/solr/api/org/apache/solr/request/PHPResponseWriter.html

Bill

On Fri, Jan 16, 2009 at 1:09 PM, Marc Sturlese wrote:

>
> Thanks, I have to study it, but I think the PHP serialized format will be
> the best for my case...
>
> Erik Hatcher wrote:
> >
> >
> > On Jan 16, 2009, at 12:26 PM, Marc Sturlese wrote:
> >> I would like to know if there is any way to customize the output XML
> >> that contains the response. I have been checking the source, and it
> >> looks to me like it should be something close to XMLWriter.java and
> >> XMLResponseWriter.java, but I'm not sure about that... Is there any
> >> way to write a plugin to do that?
> >
> > There's an XSL option that you can use without writing any code:
> >
> > 
> >
> > But beyond that, you can write your own QueryResponseWriter
> > implementation and plug it into solrconfig.xml for
> > &wt=my_custom_writer.  See Solr's example solrconfig.xml for details
> > on that.
> >
> >   Erik
> >
> >
> >
>
>
>


Re: about the xml output format

2009-01-16 Thread Marc Sturlese

Thanks, I have to study it, but I think the PHP serialized format will be
the best for my case...

Erik Hatcher wrote:
> 
> 
> On Jan 16, 2009, at 12:26 PM, Marc Sturlese wrote:
>> I would like to know if there is any way to customize the output XML
>> that contains the response. I have been checking the source, and it
>> looks to me like it should be something close to XMLWriter.java and
>> XMLResponseWriter.java, but I'm not sure about that... Is there any
>> way to write a plugin to do that?
> 
> There's an XSL option that you can use without writing any code:
> 
> 
> 
> But beyond that, you can write your own QueryResponseWriter  
> implementation and plug it into solrconfig.xml for  
> &wt=my_custom_writer.  See Solr's example solrconfig.xml for details  
> on that.
> 
>   Erik
> 
> 
> 




How to select *actual* match from a multi-valued field

2009-01-16 Thread Feak, Todd
At a high level, I'm trying to do some more intelligent searching using
an app that will send multiple queries to Solr. My current issue is
around multi-valued fields and determining which entry actually
generated the "hit" for a particular query.

 

For example, let's say that I have a multi-valued field containing
people's names, associated with the document (trying to be non-specific
on purpose). In one document, I have the following names:

Jane Smith, Bob Smith, Roger Smith, Jane Doe. If the user performs a
search for Bob Smith, this document is returned. What I want to know is
that this document was returned because of "Bob Smith", not because of
Jane or Roger. I've tried using the highlighting settings. They do
provide some help, as the Jane Doe entry doesn't come back highlighted,
but both Jane and Roger do. I've tried using hl.requireFieldMatch, but
that seems to pertain only to fields, not entries within a multi-valued
field.

 

Using Solr, is there a way to get the information I am looking for?
Specifically, that "Bob Smith" is the value in the multi-valued field
that triggered the hit?

 

-Todd Feak



Re: about the xml output format

2009-01-16 Thread Erik Hatcher


On Jan 16, 2009, at 12:26 PM, Marc Sturlese wrote:
I would like to know if there is any way to customize the output XML
that contains the response. I have been checking the source, and it
looks to me like it should be something close to XMLWriter.java and
XMLResponseWriter.java, but I'm not sure about that... Is there any way
to write a plugin to do that?


There's an XSL option that you can use without writing any code.

But beyond that, you can write your own QueryResponseWriter
implementation and plug it into solrconfig.xml for
&wt=my_custom_writer.  See Solr's example solrconfig.xml for details
on that.
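
(Presumably the XSL option is the XSLTResponseWriter; a minimal sketch of
both routes, with the stylesheet name taken from the stock example and the
custom writer class being purely hypothetical:)

  # XSLT route: transform the XML response with a stylesheet from conf/xslt/
  http://localhost:8983/solr/select?q=solr&wt=xslt&tr=example.xsl

  <!-- custom writer route, registered in solrconfig.xml -->
  <queryResponseWriter name="my_custom_writer"
                       class="com.example.MyResponseWriter"/>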


Erik



about the xml output format

2009-01-16 Thread Marc Sturlese

Hey there,
I would like to know if there is any way to customize the output XML that
contains the response. I have been checking the source, and it looks to me
like it should be something close to XMLWriter.java and XMLResponseWriter.java,
but I'm not sure about that... Is there any way to write a plugin to do that?

Thanks in advance 



Re: Query Parsing in Custom Request Handler

2009-01-16 Thread Hana

Sorry to all, there was a terrible bug in my code.
I should have checked whether the query was changed by comparing
q.toString().equals(newQuery.toString()) instead of (q != newQuery)!
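
In other words, a minimal sketch of the corrected check:

  // Object identity (q != newQuery) can differ even when the rewritten
  // query is textually identical; compare the rendered queries instead.
  if (!q.toString().equals(newQuery.toString()))
  {
    // the query really changed -- swap in the rewritten version
  }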





Hana wrote:
> 
> Hi
> 
> I need help with boolean queries in my custom RequestHandler. The
> purpose of the handler is to translate a human-readable
> date (like January 1990 or 15.2.1983 or 1995) into two date range fields
> using the internal date representation.
> 
> E.g. simple search 'q=chronological:1942' translates to
> 
> '+from:[1942-01-01T00:00:01Z TO 1942-12-31T23:59:59Z]
> +to:[1942-01-01T00:00:01Z TO 1942-12-31T23:59:59Z]'
> 
> Everything works fine in the previous search, but when I try more complex
> boolean search it returns no result.
> 
> E.g complex search 'q=London AND chronological:1942'
> 
> my RequestHandler translates it to 
> 
> '+text:london +(+from:[1942-01-01T00:00:01Z TO 1942-12-31T23:59:59Z]
> +to:[1942-01-01T00:00:01Z TO 1942-12-31T23:59:59Z])'
> 
> So this query above doesn't work, and I don't see the reason why, because
> it seems to produce a correct query.
> 
> 
> I have checked it with the direct query below; it returns correct results:
> 
> 'q=London AND (from:[1942-01-01T00:00:00Z TO 1942-12-31T23:59:59Z] AND
> to:[1942-01-01T00:00:00Z TO 1942-12-31T23:59:59Z])'
> 
> and the boolean query syntax is:
> 
> '+text:london +(+from:[1942-01-01T00:00:00 TO 1942-12-31T23:59:59]
> +to:[1942-01-01T00:00:00 TO 1942-12-31T23:59:59])'
> 
> 
> So I do not understand why the previous query is not working when the
> boolean query is exactly the same except for the 'Z' char in the date
> strings. But as the simple query works, that doesn't seem to be the reason
> the complex query fails.
> 
> 
> Cheers
> 
> Hana
> 
> 
> Here's the code of the RequestHandler:
> 
> 
> public class CenturyShareRequestHandling extends StandardRequestHandler
> {
>  
>   public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse
> rsp) throws Exception
>   {
> SolrParams p = req.getParams();
> String query = p.get(CommonParams.Q);
> Query q = QueryParsing.parseQuery(query, req.getSchema());
> Query newQuery = searchChronological(q);
> if (q != newQuery)
> {
>ModifiableSolrParams m = new
> ModifiableSolrParams(SolrParams.toMultiMap(p.toNamedList()));
>   m.remove(CommonParams.Q);
>   m.add(CommonParams.Q, newQuery.toString());
>   req.setParams(m);
> }
> super.handleRequestBody(req, rsp);
>   }
> 
>  
>   private Query searchChronological(Query q)
>   {
> if (q instanceof BooleanQuery)
> {
>   BooleanQuery bq = (BooleanQuery) q;
>   BooleanClause[] cl = bq.getClauses();
>   for (int i = 0; i < cl.length; i++)
>   {
> if (cl[i].getQuery() instanceof BooleanQuery)
> {
>   searchChronological(cl[i].getQuery());
> } else if (cl[i].getQuery() instanceof TermQuery)
> {
>   String result = getTemporalTerm((TermQuery) cl[i].getQuery());
>   if (result != null)
>   {
> Query dateQuery = replaceChronological(result);
> if (dateQuery != null)
>   cl[i].setQuery(dateQuery);
>   }
> }
>   }
> }
> else if (q instanceof TermQuery)
> {
>   String result = getTemporalTerm((TermQuery) q);
>if (result != null)
>   {
> Query dateQuery = replaceChronological(result);
> if (dateQuery != null)
>   q = dateQuery;
>  }
> }
>  return q;
>   }
> 
>   private String getTemporalTerm(TermQuery tq)
>   {
>  if ("chronological".equals(tq.getTerm().field()))
>   return tq.getTerm().text();
> else
>   return null;
>   }
> 
>   private Query replaceChronological(String chronological)
>   {
> DateRange r = getDateRange(chronological);
> BooleanQuery query = null;
> if (r.getStartDate() != null && r.getEndDate() != null)
> {
>   String startDate = r.getFormatedStartDate();
>   String endDate = r.getFormatedEndDate();
>   Term start = new Term("from", startDate);
>   Term end = new Term("from", endDate);
>   
>   RangeQuery startQuery = new RangeQuery(start, end, true);
>   start = new Term("to", startDate);
>   end = new Term("to", endDate);
>   RangeQuery endQuery = new RangeQuery(start, end, true);
>   query = new BooleanQuery();
>   query.add(new BooleanClause(startQuery, BooleanClause.Occur.MUST));
>   query.add(new BooleanClause(endQuery, BooleanClause.Occur.MUST));
> }
> return query;
>   }
> 
>   private DateRange getDateRange(String text)
>   {
> if (text == null)
>   return null;
> else
> {
>   DateParser p = new DateParser();
>   return p.parseDateRange(text);
> }
>   }
> 
> }
> 
> 




Re: Help with Solr 1.3 lockups?

2009-01-16 Thread Bryan Talbot
I think it's pretty easy to check if Solr is alive.  Even from a shell
script, a simple command like

curl -iIs --url "http://solrhost/solr/select?start=0&rows=0" | grep -c "HTTP/1.1 200 OK"

will return 1 if the response is an HTTP 200.  If the return is not 1,
then there is a problem.  A load balancer or other tool can probably
internalize the check and not need to fork processes like a shell
script would, but the check can be the same.  This simply requests an
HTTP HEAD (doesn't return any content) for a fast-executing query.
In this case, a query with no "q=" specified seems to default to *:*
when using dismax, which is my default handler.
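
Building on that probe, a minimal watchdog sketch one could run from cron
every minute (the restart hook and log path are assumptions, not part of
the original post):

  #!/bin/sh
  # Hedged sketch: probe Solr, restart the container when the probe fails.
  URL="http://solrhost/solr/select?start=0&rows=0"
  if curl -iIs --url "$URL" | grep -q "HTTP/1.1 200 OK"; then
      exit 0                          # healthy, nothing to do
  fi
  echo "`date`: Solr probe failed, restarting container" >> /var/log/solr-watchdog.log
  /etc/init.d/tomcat restart          # assumed restart hook; adjust to taste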



-Bryan




On Jan 15, 2009, at 2:13 PM, Stephen Weiss wrote:

I've been wondering about this one myself - most of the services we
have installed work this way: if they crash out for whatever reason
they restart automatically (Apache, MySQL, even the OS itself).
Failures are detected and corrected by the load balancers and also
in some cases by the machine itself (like with kernel panics).  But
not Solr, and I'm not quite sure what to do to get it there.  We use
Jetty but it's the same story.  It's not like it fails out all that
often, but when it does it will still respond to HTTP requests
(because Jetty itself is still working), which makes it a lot harder
to detect a failure... I've tried writing something for Nagios but
the problem is that most responses Solr would give to a request vary
depending on index updates, so it's not like I can just take a
checksum and compare it - and even then, it would only really alert
us to the problem; we'd still have to go in and restart everything
(personally I don't enjoy restarting servers from my BlackBerry
nearly as much as I should).


I'd have to come up with something that can intelligently interpret
the response and decide if the server's still working properly or
not, and the processing time on that alone might make it too
inefficient to run every few seconds, but at least with that we'd be
able to tell the cluster "don't send anything to this server for
now".  Is there some really obvious way to track whether a particular
servlet is still running properly (in either Tomcat or Jetty,
because if Tomcat has this I'd switch) and restart the container if
it's not?


Thanks!!

--
Steve

On Jan 15, 2009, at 1:57 PM, Jerome L Quinn wrote:



An even bigger problem is the fact that once Solr is wedged, it stays that
way until a human notices and restarts things.  The Tomcat stays running,
and there's no automatic detection that will either restart Solr or
restart the Tomcat container.

Any suggestions on either front?

Thanks,
Jerry Quinn







Re: Is it just me or multicore default is broken? Can't ping

2009-01-16 Thread Fergus McMenemie
Julian,

This is with the nightly from Jan 12.

I am using multicore and playing about with DIH. I can't get its
interactive development mode to work properly, and suspect
that to use it I need to run in single-core mode.

I am still developing, so I have nothing set up within the Tomcat startup
files; it all depends on the directory I launch Tomcat from, which is
/Volumes/spare/ts:-

fergus: ls -al /Volumes/spare/ts 
   total 2657816
   drwxrwxrwx  19 rootfergus646 Jan 14 11:06 .
   drwxrwxr-x  18 rootadmin 680 Jan 13 10:46 ..
   -rw-rw-rw-@  1 fergus  fergus   6148 Jan 16 14:58 .DS_Store
   drwxr-xr-x  16 fergus  fergus544 Apr  8  2008 apache-solr-bc
   drwxr-xr-x@ 15 fergus  fergus510 Jan 14 11:06 apache-solr-nightly
   drwxr-xr-x   3 fergus  fergus102 Jan 13 11:06 solr
   -rw-r--r--@  1 fergus  fergus   57874925 Jan 12 22:31 solr-2009-01-12.tgz
   drwxr-xr-x   8 fergus  fergus272 Dec 16 17:53 solrbc
   drwxr-xr-x   7 fergus  fergus238 Jan 16 12:08 solrnightlyjanes

fergus: ls -al /Volumes/spare/ts/solr
   total 8
   drwxr-xr-x   3 fergus  fergus  102 Jan 13 11:06 .
   drwxrwxrwx  19 rootfergus  646 Jan 14 11:06 ..
   -rw-rw-rw-@  1 fergus  fergus  500 Jan 13 11:07 solr.xml

fergus: more /Volumes/spare/ts/solr/solr.xml
   [solr.xml markup stripped by the archive; it declares the two cores,
    gazetteer (instanceDir solr/../solrbc/) and janesdocs (instanceDir
    solr/../solrnightlyjanes/), shown in the core status output below]


Here is a fragment from the top of one of my solrconfig.xml files. Note the
use of solr.data.dir.

fergus: more /Volumes/spare/ts/solrnightlyjanes/conf/solrconfig.xml

   <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>

   <dataDir>${solr.data.dir:./solr/data}</dataDir>






fergus: get 'http://localhost:8080/solr/admin/cores' | perl -p -e 
's[()][$1\n  ]g;'
  
  
  0
 2
 
 gazetteer
 solr/../solrbc/
 solrbc/data/
 2009-01-16T12:08:56.033Z
 3078174
 6705364
 6705364
 1229202899164
 false
 true
 false
 org.apache.lucene.store.NIOFSDirectory:org.apache.lucene.store.NIOFSDirectory@/Volumes/spare/ts/solrbc/data/index
 2008-12-13T21:39:08Z
 
 
 janesdocs
 solr/../solrnightlyjanes/
 solrnightlyjanes/data/
 2009-01-16T12:08:56.613Z
 3077596
 269
 269
 1232107736664
 true
 true
 false
 org.apache.lucene.store.NIOFSDirectory:org.apache.lucene.store.NIOFSDirectory@/Volumes/spare/ts/solrnightlyjanes/data/index
 2009-01-16T12:57:40Z
 
 
   
 
 
fergus: get 'http://localhost:8080/solr/janesdocs/admin/ping' | perl -p -e 
's[()][$1\n  ]g;'
  
  
  0
 2
 all
 all
 solrpingquery
 standard
 
 
 OK
 
 
fergus: get 'http://localhost:8080/solr/gazetteer/admin/ping' | perl -p -e 
's[()][$1\n  ]g;'
 
 
 0
 2
 all
 all
 solrpingquery
 standard
 
 
 OK
 
 
Hope this helps.


>I gave a few new shots today:
>- with Jetty and the Jan 16 nightly build - same problem, null pointer exception
>- Then I decided not to use Solr multicore but rather Tomcat to
>handle this. So I got the latest Tomcat and, again using the 1.3.0 solr.war,
>I set it all up as explained at
>http://wiki.apache.org/solr/SolrTomcat#head-024d7e11209030f1dbcac9974e55106abae837ac
>Links again are all smooth for admin and all, but I still get 500 on pings :(
>
>Is everyone using Solr with a single index (core)?
>
>Cheers
>
>All setup is smooth, working
>
>Julian Davchev wrote:
>> Hi,
>>
>> I am trying with 1.3.0 from
>> http://apache.cbox.biz/lucene/solr/1.3.0/apache-solr-1.3.0.tgz
>>
>> which I suppose is the stable release.
>>
>> Otis Gospodnetic wrote:
>>   
>>> Not sure, I'd have to try it.  But you didn't mention which version of Solr 
>>> you are using.  Nightly build?
>>>
>>>
>>> Otis
>>> --
>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>
>>>
>>>
>>> - Original Message 
>>>   
>>> 
 From: Julian Davchev 
 To: solr-user@lucene.apache.org
 Sent: Thursday, January 15, 2009 9:53:37 AM
 Subject: Is it just me or multicore default is broken? Can't ping

 Hi,
 I am trying to set up multicore Solr. So I just downloaded the default one with
 jetty... go to example/
 and run
 java -Dsolr.solr.home=multicore -jar start.jar


 All looks smooth without errors on startup.
 Also I can open the admin at

 http://localhost:8983/solr/core1/admin/


 But then trying to ping
 http://localhost:8983/solr/core1/admin/ping

 I get  error 500 INTERNAL SERVER ERROR


 And tons of exceptions in background starting with nullpointer

 Anyone have a clue? Is Solr stable to be used, or is multicore something
 recently added and not to be trusted yet?
 
   
>>>   
>>> 
>>
>>   

-- 

===
Fergus McMenemie   Email:fer...@twig.me.uk

Query Parsing in Custom Request Handler

2009-01-16 Thread Hana

Hi

I need help with boolean queries in my custom RequestHandler. The purpose
of the handler is to translate a human-readable
date (like January 1990 or 15.2.1983 or 1995) into two date range fields
using the internal date representation.

E.g. simple search 'q=chronological:1942' translates to

'+from:[1942-01-01T00:00:01Z TO 1942-12-31T23:59:59Z]
+to:[1942-01-01T00:00:01Z TO 1942-12-31T23:59:59Z]'

Everything works fine in the previous search, but when I try more complex
boolean search it returns no result.

E.g complex search 'q=London AND chronological:1942'

my RequestHandler translates it to 

'+text:london +(+from:[1942-01-01T00:00:01Z TO 1942-12-31T23:59:59Z]
+to:[1942-01-01T00:00:01Z TO 1942-12-31T23:59:59Z])'

So this query above doesn't work, and I don't see the reason why, because it
seems to produce a correct query.


I have checked it with the direct query below; it returns correct results:

'q=London AND (from:[1942-01-01T00:00:00Z TO 1942-12-31T23:59:59Z] AND
to:[1942-01-01T00:00:00Z TO 1942-12-31T23:59:59Z])'

and the boolean query syntax is:

'+text:london +(+from:[1942-01-01T00:00:00 TO 1942-12-31T23:59:59]
+to:[1942-01-01T00:00:00 TO 1942-12-31T23:59:59])'


So I do not understand why the previous query is not working when the boolean
query is exactly the same except for the 'Z' char in the date strings. But as
the simple query works, that doesn't seem to be the reason the complex query
fails.


Cheers

Hana


Here's the code of the RequestHandler:


public class CenturyShareRequestHandling extends StandardRequestHandler
{
 
  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
throws Exception
  {
SolrParams p = req.getParams();
String query = p.get(CommonParams.Q);
Query q = QueryParsing.parseQuery(query, req.getSchema());
Query newQuery = searchChronological(q);
if (q != newQuery)
{
   ModifiableSolrParams m = new
ModifiableSolrParams(SolrParams.toMultiMap(p.toNamedList()));
  m.remove(CommonParams.Q);
  m.add(CommonParams.Q, newQuery.toString());
  req.setParams(m);
}
super.handleRequestBody(req, rsp);
  }

 
  private Query searchChronological(Query q)
  {
if (q instanceof BooleanQuery)
{
  BooleanQuery bq = (BooleanQuery) q;
  BooleanClause[] cl = bq.getClauses();
  for (int i = 0; i < cl.length; i++)
  {
if (cl[i].getQuery() instanceof BooleanQuery)
{
  searchChronological(cl[i].getQuery());
} else if (cl[i].getQuery() instanceof TermQuery)
{
  String result = getTemporalTerm((TermQuery) cl[i].getQuery());
  if (result != null)
  {
Query dateQuery = replaceChronological(result);
if (dateQuery != null)
  cl[i].setQuery(dateQuery);
  }
}
  }
}
else if (q instanceof TermQuery)
{
  String result = getTemporalTerm((TermQuery) q);
   if (result != null)
  {
Query dateQuery = replaceChronological(result);
if (dateQuery != null)
  q = dateQuery;
 }
}
 return q;
  }

  private String getTemporalTerm(TermQuery tq)
  {
 if ("chronological".equals(tq.getTerm().field()))
  return tq.getTerm().text();
else
  return null;
  }

  private Query replaceChronological(String chronological)
  {
DateRange r = getDateRange(chronological);
BooleanQuery query = null;
if (r.getStartDate() != null && r.getEndDate() != null)
{
  String startDate = r.getFormatedStartDate();
  String endDate = r.getFormatedEndDate();
  Term start = new Term("from", startDate);
  Term end = new Term("from", endDate);
  
  RangeQuery startQuery = new RangeQuery(start, end, true);
  start = new Term("to", startDate);
  end = new Term("to", endDate);
  RangeQuery endQuery = new RangeQuery(start, end, true);
  query = new BooleanQuery();
  query.add(new BooleanClause(startQuery, BooleanClause.Occur.MUST));
  query.add(new BooleanClause(endQuery, BooleanClause.Occur.MUST));
}
return query;
  }

  private DateRange getDateRange(String text)
  {
if (text == null)
  return null;
else
{
  DateParser p = new DateParser();
  return p.parseDateRange(text);
}
  }

}




Re: Having no luck with built-in replication and multicore

2009-01-16 Thread Jacob Singh
I'm not sure what else to share here... I can try to code dive a bit
this week, but I imagine it is over my head and at a Lucene level.
Fortunately for us, re-indexing is not an issue, so we can manage it.
If someone can confirm it, I think it would be good to update the wiki
and let people know even if it doesn't get fixed.
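
For reference, the replication handler configs were stripped from the quoted
messages below; reconstructed from the values that survived, the setup
presumably looked roughly like the stock java-replication config. On the
master:

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="confFiles">schema.xml,stopwords.txt,elevate.xml</str>
    </lst>
  </requestHandler>

and on the slave:

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <str name="masterUrl">http://mydomain:8080/solr/065f079c24914a4103e2a57178164bbe/replication</str>
      <str name="pollInterval">00:00:20</str>
    </lst>
  </requestHandler>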

Best,
Jacob

On Fri, Jan 16, 2009 at 9:44 AM, Noble Paul നോബിള്‍  नोब्ळ्
 wrote:
> On Fri, Jan 16, 2009 at 7:14 PM, Jacob Singh  wrote:
>> Hi Shalin,
>>
>> Sorry, my post was unclear.  I am calling snappull from the slave, I
>> get that part, just obfuscating the domains incorrectly :).  The
>> problem, it seems, is the index version.
>>
>> The index was a 1.3 index and I've since moved to 1.4.  It's been
>> working great so far until I hit the replication bit.  It silently
>> doesn't work.  When I deleted the index, and re-indexed (so it is
>> native 1.4) it started working.
>>
>> Is this documented somewhere?
> This is not a piece that we have tested, so please do share
> your findings.
>>
>> Best,
>> Jacob
>>
>> On Fri, Jan 16, 2009 at 1:27 AM, Shalin Shekhar Mangar
>>  wrote:
>>> Hi Jacob,
>>>
>>> You don't need to call snapshoot on the master. That is only used to create
>>> a backup of the index files.
>>>
>>> You are calling snappull on the master. It is only applicable for the
>>> slaves. You don't need to issue these calls yourself at all. The
>>> ReplicationHandler is designed to take care of these.
>>>
>>> The master is showing indexversion as 0 because you haven't called commit on
>>> the master yet. Can you call commit and see if replication happens on the
>>> slave?
>>>
>>> On Fri, Jan 16, 2009 at 2:24 AM, Jacob Singh  wrote:
>>>
 Hi Shalin,

 Thanks for responding!  This used to be a 1.3 index (could that be the
 issue?)

 curl '
 http://mydomain.com:8080/solr/065f079c24914a4103e2a57178164bbe//replication?command=indexversion
 '
 
 
 0>>> name="QTime">00>>> name="generation">0
 

 Best,
 Jacob


 On Jan 15, 2009 3:32pm, Shalin Shekhar Mangar 
 wrote:
 > What is the output of /replication?command=indexversion on the master?
 >
 >
 >
 > On Fri, Jan 16, 2009 at 1:27 AM, Jacob Singh jacobsi...@gmail.com>
 wrote:
 >
 >
 >
 > > Hi folks,
 >
 > >
 >
 > > Here's what I've got going:
 >
 > >
 >
 > > Master Server with the following config:
 >
 > >
 >
 > >
 >
 > >commit
 >
 > >schema.xml,stopwords.txt,elevate.xml
 >
 > >
 >
 > >
 >
 > >
 >
 > > Slave server with the following:
 >
 > >
 >
 > >
 >
 > >
 >
 > >
 >
 > > http://mydomain:8080/solr/065f079c24914a4103e2a57178164bbe/replication
 >
 > >
 >
 > >00:00:20
 >
 > >
 >
 > >
 >
 > >
 >
 > > I think there is a bug in the JSP for the admin pages (which I can
 >
 > > post a patch if desired) where the replication link goes to
 >
 > > replication/ and index.jsp doesn't get loaded automatically (at least
 >
 > > on my box).  I managed to get to the dashboard by adding index.jsp,
 >
 > > and it seems that while the slave is polling constantly, it never
 >
 > > receives an update.
 >
 > >
 >
 > > I tried the following:
 >
 > >
 >
 > > curl '
 >
 > >
 http://mydomain.com:8080/solr/065f079c24914a4103e2a57178164bbe/replication?command=snapshoot
 >
 > > '
 >
 > >
 >
 > >
 >
 > > 0
 > > name="QTime">1
 > >
 >
 > >
 name="exception">java.lang.NullPointerException:java.lang.NullPointerException
 >
 > >
 >
 > >
 >
 > > The index has about 400 docs in it, and old style replication used to
 >
 > > work just fine on it.
 >
 > >
 >
 > > When I run the snappull command from the slave:
 >
 > >
 >
 > > curl '
 >
 > >
 http://mydomain.com:8080/solr/065f079c24914a4103e2a57178164bbe/replication?command=snappull
 >
 > > '
 >
 > >
 >
 > >
 >
 > > 0
 > > name="QTime">1OK
 >
 > >
 >
 > >
 >
 > > The replication page also remains unchanged and there are no docs on
 the
 >
 > > slave.
 >
 > >
 >
 > > Any ideas?
 >
 > >
 >
 > > Thanks,
 >
 > > Jacob
 >
 > >
 >
 > >
 >
 > >
 >
 > >
 >
 > >
 >
 > >
 >
 > > --
 >
 > >
 >
 > > +1 510 277-0891 (o)
 >
 > > +91  33 7458 (m)
 >
 > >
 >
 > > web: http://pajamadesign.com
 >
 > >
 >
 > > Skype: pajamadesign
 >
 > > Yahoo: jacobsingh
 >
 > > AIM: jacobsingh
 >
 > > gTalk: jacobsi...@gmail.com
 >
 > >
 >
 >
 >
 >
 >
>>

Re: Having no luck with built-in replication and multicore

2009-01-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Fri, Jan 16, 2009 at 7:14 PM, Jacob Singh  wrote:
> Hi Shalin,
>
> Sorry, my post was unclear.  I am calling snappull from the slave, I
> get that part, just obfuscating the domains incorrectly :).  The
> problem, it seems, is the index version.
>
> The index was a 1.3 index and I've since moved to 1.4.  It's been
> working great so far until I hit the replication bit.  It silently
> doesn't work.  When I deleted the index, and re-indexed (so it is
> native 1.4) it started working.
>
> Is this documented somewhere?
This is not a piece that we have tested, so please do share
your findings.
>
> Best,
> Jacob
>
> On Fri, Jan 16, 2009 at 1:27 AM, Shalin Shekhar Mangar
>  wrote:
>> Hi Jacob,
>>
>> You don't need to call snapshoot on the master. That is only used to create
>> a backup of the index files.
>>
>> You are calling snappull on the master. It is only applicable for the
>> slaves. You don't need to issue these calls yourself at all. The
>> ReplicationHandler is designed to take care of these.
>>
>> The master is showing indexversion as 0 because you haven't called commit on
>> the master yet. Can you call commit and see if replication happens on the
>> slave?
>>
>> On Fri, Jan 16, 2009 at 2:24 AM, Jacob Singh  wrote:
>>
>>> Hi Shalin,
>>>
>>> Thanks for responding!  This used to be a 1.3 index (could that be the
>>> issue?)
>>>
>>> curl '
>>> http://mydomain.com:8080/solr/065f079c24914a4103e2a57178164bbe//replication?command=indexversion
>>> '
>>> 
>>> 
>>> 0>> name="QTime">00>> name="generation">0
>>> 
>>>
>>> Best,
>>> Jacob
>>>
>>>
>>> On Jan 15, 2009 3:32pm, Shalin Shekhar Mangar 
>>> wrote:
>>> > What is the output of /replication?command=indexversion on the master?
>>> >
>>> >
>>> >
>>> > On Fri, Jan 16, 2009 at 1:27 AM, Jacob Singh jacobsi...@gmail.com>
>>> wrote:
>>> >
>>> >
>>> >
>>> > > Hi folks,
>>> >
>>> > >
>>> >
>>> > > Here's what I've got going:
>>> >
>>> > >
>>> >
>>> > > Master Server with the following config:
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > >commit
>>> >
>>> > >schema.xml,stopwords.txt,elevate.xml
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > > Slave server with the following:
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > > http://mydomain:8080/solr/065f079c24914a4103e2a57178164bbe/replication
>>> >
>>> > >
>>> >
>>> > >00:00:20
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > > I think there is a bug in the JSP for the admin pages (which I can
>>> >
>>> > > post a patch if desired) where the replication link goes to
>>> >
>>> > > replication/ and index.jsp doesn't get loaded automatically (at least
>>> >
>>> > > on my box).  I managed to get to the dashboard by adding index.jsp,
>>> >
>>> > > and it seems that while the slave is polling constantly, it never
>>> >
>>> > > receives an update.
>>> >
>>> > >
>>> >
>>> > > I tried the following:
>>> >
>>> > >
>>> >
>>> > > curl '
>>> >
>>> > >
>>> http://mydomain.com:8080/solr/065f079c24914a4103e2a57178164bbe/replication?command=snapshoot
>>> >
>>> > > '
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > > 0
>>> > > name="QTime">1
>>> > >
>>> >
>>> > >
>>> name="exception">java.lang.NullPointerException:java.lang.NullPointerException
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > > The index has about 400 docs in it, and old style replication used to
>>> >
>>> > > work just fine on it.
>>> >
>>> > >
>>> >
>>> > > When I run the snappull command from the slave:
>>> >
>>> > >
>>> >
>>> > > curl '
>>> >
>>> > >
>>> http://mydomain.com:8080/solr/065f079c24914a4103e2a57178164bbe/replication?command=snappull
>>> >
>>> > > '
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > > 0
>>> > > name="QTime">1OK
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > > The replication page also remains unchanged and there are no docs on
>>> the
>>> >
>>> > > slave.
>>> >
>>> > >
>>> >
>>> > > Any ideas?
>>> >
>>> > >
>>> >
>>> > > Thanks,
>>> >
>>> > > Jacob
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > >
>>> >
>>> > > --
>>> >
>>> > >
>>> >
>>> > > +1 510 277-0891 (o)
>>> >
>>> > > +91  33 7458 (m)
>>> >
>>> > >
>>> >
>>> > > web: http://pajamadesign.com
>>> >
>>> > >
>>> >
>>> > > Skype: pajamadesign
>>> >
>>> > > Yahoo: jacobsingh
>>> >
>>> > > AIM: jacobsingh
>>> >
>>> > > gTalk: jacobsi...@gmail.com
>>> >
>>> > >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> >
>>> > Regards,
>>> >
>>> > Shalin Shekhar Mangar.
>>> >
>>>
>>
>>
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>>
>
>
>
> --
>
> +1 510 277-0891 (o)
> +91  33 7458 (m)
>
> web: http://pajamadesign.com
>
> Skype: pajamadesign
> Yahoo: jacobsingh
> AIM: jacobsingh
> gTalk: jacobsi...@gmail.com
>



-- 
--Noble Paul


Re: Querying Solr Index for date fields

2009-01-16 Thread Erik Hatcher
It doesn't really make sense to use a date field in a dismax qf
parameter.  Use an fq parameter instead, to filter results by a date
field.

Dismax is aimed at end users' textual queries, not at field selection
or more refined typed queries like date or numeric ranges.
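
For example, a sketch reusing the field names from this thread:

  ?q=SearchString&qt=dismax&fq=dateField:[NOW-45DAYS TO NOW]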


Erik


On Jan 16, 2009, at 9:23 AM, prerna07 wrote:




We also query on date ranges; it works when you use the NOW function.
Try using:
?q=dateField:[* TO NOW]
?q=dateField:[NOW-45DAYS TO NOW]
?q=dateField:[NOW TO NOW+45DAYS]


Issue: the current issue I am facing is with the dismax request handler for
a date field. As soon as I add dateField to the qf of the dismax request,
dismax for the other string / text attributes stops working. My search query
is ?q=SearchString, and the error I get is

"The request sent by the client was syntactically incorrect (Invalid Date
String:'searchTerm')."

Please suggest how I can use a date field in the qf of a dismax request.

Thanks,
Prerna


Akshay-8 wrote:


You will have to URL-encode the string correctly and supply the date in the
format Solr expects. Please check this:
http://wiki.apache.org/solr/SolrQuerySyntax

On Fri, Jan 9, 2009 at 12:21 PM, Rayudu   
wrote:




Hi All,
    I have a field which is solr.DateField in my schema file. If I want to
get the docs for a given date, e.g. all the docs whose date value is
2009-01-09, then how can I query my index? As Solr's date format is
yyyy-mm-ddThh:mm:ss:

    if I give the date as 2009-01-09T00:00:00Z it is throwing an
exception "solr.SolrException: HTTP code=400, reason=Invalid Date
String:'2009-01-09T00'".
    if I give the date as 2009-01-09 it is throwing an
exception, solr.SolrException: HTTP code=400, reason=Invalid Date
String:'2009-01-09'

Thanks,
Rayudu.





--
Regards,
Akshay Ukey.








Re: Querying Solr Index for date fields

2009-01-16 Thread prerna07


We also query on date ranges; it works when you use the NOW function.
Try using:
?q=dateField:[* TO NOW]  
?q=dateField:[NOW-45DAYS TO NOW]
?q=dateField:[NOW TO NOW+45DAYS]


Issue: the current issue I am facing is with the dismax request handler for a
date field. As soon as I add dateField to the qf of the dismax request, dismax
for the other string / text attributes stops working. My search query is
?q=SearchString, and the error I get is
"The request sent by the client was syntactically incorrect (Invalid Date
String:'searchTerm')."

Please suggest how I can use a date field in the qf of a dismax request.

Thanks,
Prerna


Akshay-8 wrote:
> 
> You will have to URL-encode the string correctly and supply the date in the
> format Solr expects. Please check this:
> http://wiki.apache.org/solr/SolrQuerySyntax
> 
> On Fri, Jan 9, 2009 at 12:21 PM, Rayudu  wrote:
> 
>>
>> Hi All,
>>  I have a field which is solr.DateField in my schema file. If I want to
>> get the docs for a given date, e.g. all the docs whose date value is
>> 2009-01-09, then how can I query my index? As Solr's date format is
>> yyyy-mm-ddThh:mm:ss:
>>
>> if I give the date as 2009-01-09T00:00:00Z it is throwing an
>> exception "solr.SolrException: HTTP code=400, reason=Invalid Date
>> String:'2009-01-09T00'".
>> if I give the date as 2009-01-09 it is throwing an
>> exception, solr.SolrException: HTTP code=400, reason=Invalid Date
>> String:'2009-01-09'
>>
>> Thanks,
>> Rayudu.
>>
>>
> 
> 
> -- 
> Regards,
> Akshay Ukey.
> 
> 




Re: How to open a new searcher and close the old one

2009-01-16 Thread Alexander Ramos Jardim
No,

You can't assume that. You have to set a good autoCommit value in your
solrconfig.xml so you don't run out of memory from not committing to Solr
often enough; the right value depends on your environment, memory share,
doc size and update frequency.
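
For reference, a minimal autoCommit sketch for solrconfig.xml (the
thresholds are placeholders to tune, not recommendations):

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- commit automatically once either threshold is reached -->
    <autoCommit>
      <maxDocs>10000</maxDocs>  <!-- pending documents -->
      <maxTime>60000</maxTime>  <!-- milliseconds -->
    </autoCommit>
  </updateHandler>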

2009/1/16 Manupriya 

>
> Thanks for the information!!
>
> So can I safely assume that we will not face any memory issue due to
> caching even if we do not send commits that frequently? (If we don't send
> a commit, then a new searcher won't be initialized, so I can assume that
> the current searcher will correctly manage the cache without any memory
> issues.)
>
> Thanks,
> Manu
>
>


-- 
Alexander Ramos Jardim


Re: DIH XPathEntityProcessor fails with docs containing a DOCTYPE declaration

2009-01-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
I have raised an issue and provided a patch.
Please confirm whether it helps:
https://issues.apache.org/jira/browse/SOLR-964
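
A hedged sketch of doing that with the standard javax.xml.stream API
(whether the Woodstox implementation honors both properties is an
assumption to verify):

  import javax.xml.stream.XMLInputFactory;

  XMLInputFactory factory = XMLInputFactory.newInstance();
  // Skip DTD processing and external entity resolution so the parser
  // never tries to fetch the referenced .dtd file.
  factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
  factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);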

On Fri, Jan 16, 2009 at 3:52 PM, Noble Paul നോബിള്‍  नोब्ळ्
 wrote:
> The StAX parser automatically tries to fetch the DTD. How can we disable
> that at the parser level?
>
> On Fri, Jan 16, 2009 at 3:34 PM, Fergus McMenemie  wrote:
>> Hello all, as the subject says:
>>   DIH XPathEntityProcessor fails with docs containing a DOCTYPE declaration
>>
>> This is using a solr nightly build from monday.
>>
>> INFO: Server startup in 3623 ms
>> Jan 16, 2009 9:54:12 AM org.apache.solr.handler.dataimport.SolrWriter 
>> readIndexerProperties
>> INFO: Read dataimport.properties
>> Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrCore execute
>> INFO: [jdocs] webapp=/solr path=/walkj params={command=full-import} status=0 
>> QTime=13
>> Jan 16, 2009 9:54:12 AM org.apache.solr.handler.dataimport.DataImporter 
>> doFullImport
>> INFO: Starting Full Import
>> Jan 16, 2009 9:54:12 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
>> INFO: [jdocs] REMOVING ALL DOCUMENTS FROM INDEX
>> Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrDeletionPolicy onInit
>> INFO: SolrDeletionPolicy.onInit: commits:num=2
>>
>> commit{dir=/Volumes/spare/ts/solrnightlyj/data/index,segFN=segments_c,version=1232026423291,generation=12,filenames=[segments_c,
>>  _4.fnm, _4.frq, _4.prx, _4.tis, _4.tii, _4.nrm, _4.fdx, _4.fdt]
>>
>> commit{dir=/Volumes/spare/ts/solrnightlyj/data/index,segFN=segments_d,version=1232026423292,generation=13,filenames=[segments_d]
>> Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrDeletionPolicy updateCommits
>> INFO: last commit = 1232026423292
>> Jan 16, 2009 9:54:13 AM org.apache.solr.handler.dataimport.DocBuilder 
>> buildDocument
>> SEVERE: Exception while processing: jcurrent document : null
>> org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing 
>> failed for xml, url:/j/dtd/jxml/data/news/2008/frp70450.xmlrows processed :0 
>> Processing Document # 1
>>at 
>> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
>>at 
>> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:252)
>>at 
>> org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:177)
>>at 
>> org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
>>at 
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
>>at 
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
>>at 
>> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
>>at 
>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
>>at 
>> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
>>at 
>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
>>at 
>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
>> Caused by: java.lang.RuntimeException: 
>> com.ctc.wstx.exc.WstxParsingException: (was java.io.FileNotFoundException) 
>> /../config/jml-delivery-norm-2.1.dtd (No such file or directory)
>>  at [row,col {unknown-source}]: [3,81]
>>at 
>> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85)
>>at 
>> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:242)
>>... 9 more
>> Caused by: com.ctc.wstx.exc.WstxParsingException: (was 
>> java.io.FileNotFoundException) /../config/jml-delivery-norm-2.1.dtd (No such 
>> file or directory)
>>  at [row,col {unknown-source}]: [3,81]
>>at 
>> com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
>>at 
>> com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
>>at 
>> com.ctc.wstx.sr.ValidatingStreamReader.findDtdExtSubset(ValidatingStreamReader.java:475)
>>at 
>> com.ctc.wstx.sr.ValidatingStreamReader.finishDTD(ValidatingStreamReader.java:358)
>>at 
>> com.ctc.wstx.sr.BasicStreamReader.skipToken(BasicStreamReader.java:3351)
>>at 
>> com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:1988)
>>at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1069)
>>at 
>> org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:141)
>>at 
>> org.apache.solr.handler.dataimport.XPathRecordReader$Node.access$000(XPathRecordReader.java:89)
>>at 
>> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:82)
>>... 10 more
>> Jan 16, 2009 9:54:13 AM org.apache.solr.handler.dataimport.DataImporter 
>> doFullImport
>> SEVERE: Full Import failed
>>
>> A fragment from the t

Re: How to open a new searcher and close the old one

2009-01-16 Thread Manupriya

Thanks for the information!!

So can I safely assume that we will not face any memory issue due to caching
even if we do not send commits that frequently? (If we don't send a commit,
then a new searcher won't be initialized, so I can assume that the current
searcher will correctly manage the cache without any memory issues.)

Thanks,
Manu



Re: Index files deleted but still accessed by Tomcat

2009-01-16 Thread Alexander Ramos Jardim
Dominik,

From my experience with Solr, I have seen the index double in size during
an optimize.

I always keep my index directory on a partition that has at least triple the
index size, so I have a margin for optimizes and for natural doc quantity growth.

2009/1/16 Dominik Schramm 

> One more thing that might help identify the problem source (which I've
> discovered just now) -- at the time the optimize "finished" (or broke
> off), Tomcat logged the following:
>
> r...@cms004:~# /opt/apache-tomcat-sba1-live/logs/catalina.2009-01-16.log
> ...
> Jan 16, 2009 11:40:14 AM org.apache.solr.core.SolrCore execute
> INFO: /update/csv
>
> header=false&separator=;&commit=true&fieldnames=key,siln,biln,prln,date,validfrom,contractno,currency,
>
> reserve1,reserve2,posno,processingcode,ean,ek,pb,quantityunit,p_s_ek1,p_s_m1,p_s_q1,p_s_ek2,p_s_m2,p_s_q2,p_s_ek3,p_s_m3
>
> ,p_s_q3,p_s_ek4,p_s_m4,p_s_q4,p_s_ek5,p_s_m5,p_s_q5,p_s_ek6,p_s_m6,p_s_q6,p_s_ek7,p_s_m7,p_s_q7,p_s_ek8,p_s_m8,p_s_q8,p_
>
> s_ek9,p_s_m9,p_s_q9,p_s_ek10,p_s_m10,p_s_q10,a_s_ek1,a_s_m1,a_s_q1,a_s_ek2,a_s_m2,a_s_q2,a_s_ek3,a_s_m3,a_s_q3,a_s_ek4,a
>
> _s_m4,a_s_q4,a_s_ek5,a_s_m5,a_s_q5,a_s_ek6,a_s_m6,a_s_q6,a_s_ek7,a_s_m7,a_s_q7,a_s_ek8,a_s_m8,a_s_q8,a_s_ek9,a_s_m9,a_s_
> q9,a_s_ek10,a_s_m10,a_s_q10&stream.file=/opt/asdf/bla/bla 3754
> Jan 16, 2009 11:43:14 AM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
> Jan 16, 2009 11:43:14 AM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
> Jan 16, 2009 12:08:01 PM org.apache.solr.core.SolrException log
> SEVERE: java.io.IOException: No space left on device
>at java.io.RandomAccessFile.writeBytes(Native Method)
>at java.io.RandomAccessFile.write(RandomAccessFile.java:456)
>at
>
> org.apache.lucene.store.FSDirectory$FSIndexOutput.flushBuffer(FSDirectory.java:589)
>at
>
> org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
>at
>
> org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85)
>at
>
> org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:109)
>at
>
> org.apache.lucene.store.FSDirectory$FSIndexOutput.close(FSDirectory.java:594)
>at org.apache.lucene.index.FieldsWriter.close(FieldsWriter.java:48)
>at
> org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:211)
>at
> org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:96)
>at
> org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:1835)
>at
> org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1195)
>at
>
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:508)
>at
> de.businessmart.cai.solr.CAIUpdateHandler.commit(CAIUpdateHandler.java:343)
>at
>
> org.apache.solr.handler.XmlUpdateRequestHandler.update(XmlUpdateRequestHandler.java:214)
>at
>
> org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:84)
>at
>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:77)
>at org.apache.solr.core.SolrCore.execute(SolrCore.java:658)
>at
>
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:191)
>at
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:159)
>at
>
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
>at
>
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
>at
>
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
>at
>
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
>at
>
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
>at
>
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
>at
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:541)
>at
>
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
>at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
>at
>
> org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:833)
>at
>
> org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:639)
>at
> org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1285)
>at java.lang.Thread.run(Thread.java:595)
> ...
>
> So a new question is: how much free disk space does an optimize need? As
> much again as the size of the index?
>
> Dominik
>
> Dominik Schramm wrote:
> > Hello,
> >
> > we are running "Solr Implementation Version: 1.2.0 - Yonik 

Re: Index files deleted but still accessed by Tomcat

2009-01-16 Thread Dominik Schramm
One more thing that might help to identify the problem source (which I've
discovered just now) -- at the time the optimize "finished" (or broke
off), Tomcat logged the following:

r...@cms004:~# /opt/apache-tomcat-sba1-live/logs/catalina.2009-01-16.log
...
Jan 16, 2009 11:40:14 AM org.apache.solr.core.SolrCore execute
INFO: /update/csv
header=false&separator=;&commit=true&fieldnames=key,siln,biln,prln,date,validfrom,contractno,currency,
reserve1,reserve2,posno,processingcode,ean,ek,pb,quantityunit,p_s_ek1,p_s_m1,p_s_q1,p_s_ek2,p_s_m2,p_s_q2,p_s_ek3,p_s_m3
,p_s_q3,p_s_ek4,p_s_m4,p_s_q4,p_s_ek5,p_s_m5,p_s_q5,p_s_ek6,p_s_m6,p_s_q6,p_s_ek7,p_s_m7,p_s_q7,p_s_ek8,p_s_m8,p_s_q8,p_
s_ek9,p_s_m9,p_s_q9,p_s_ek10,p_s_m10,p_s_q10,a_s_ek1,a_s_m1,a_s_q1,a_s_ek2,a_s_m2,a_s_q2,a_s_ek3,a_s_m3,a_s_q3,a_s_ek4,a
_s_m4,a_s_q4,a_s_ek5,a_s_m5,a_s_q5,a_s_ek6,a_s_m6,a_s_q6,a_s_ek7,a_s_m7,a_s_q7,a_s_ek8,a_s_m8,a_s_q8,a_s_ek9,a_s_m9,a_s_
q9,a_s_ek10,a_s_m10,a_s_q10&stream.file=/opt/asdf/bla/bla 3754
Jan 16, 2009 11:43:14 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
Jan 16, 2009 11:43:14 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
Jan 16, 2009 12:08:01 PM org.apache.solr.core.SolrException log
SEVERE: java.io.IOException: No space left on device
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:456)
at
org.apache.lucene.store.FSDirectory$FSIndexOutput.flushBuffer(FSDirectory.java:589)
at
org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
at
org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85)
at
org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:109)
at
org.apache.lucene.store.FSDirectory$FSIndexOutput.close(FSDirectory.java:594)
at org.apache.lucene.index.FieldsWriter.close(FieldsWriter.java:48)
at
org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:211)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:96)
at
org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:1835)
at
org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1195)
at
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:508)
at
de.businessmart.cai.solr.CAIUpdateHandler.commit(CAIUpdateHandler.java:343)
at
org.apache.solr.handler.XmlUpdateRequestHandler.update(XmlUpdateRequestHandler.java:214)
at
org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:84)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:77)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:658)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:191)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:159)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:541)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
at
org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:833)
at
org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:639)
at
org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1285)
at java.lang.Thread.run(Thread.java:595)
...

So a new question is: how much free disk space does an optimize need? As
much again as the size of the index?

Dominik

Dominik Schramm wrote:
> Hello,
>
> we are running "Solr Implementation Version: 1.2.0 - Yonik - 2007-06-02
> 17:35:12" with "Lucene Implementation Version: build 2007-05-20" in a
> Tomcat application server "Apache Tomcat/5.5.20" on a 64-bit Ubuntu 7.10.
>
> For some time now (probably due to the continuous growth of the index,
> which is now roughly 40 GB in size) we experience a problem with deleted
> but still growing index files:
>
> r...@cms004:~# lsof | grep deleted
> ...
> java  10601root   84u  REG8,9   237359104
>   2981966 /

Re: Having no luck with built-in replication and multicore

2009-01-16 Thread Jacob Singh
Hi Shalin,

Sorry, my post was unclear. I am calling snappull from the slave, I
get that part; I was just obfuscating the domain names inconsistently :).
The problem, it seems, is the index version.

The index was a 1.3 index and I've since moved to 1.4. Everything had been
working great until I hit the replication bit, which silently does nothing.
When I deleted the index and re-indexed (so it is a native 1.4 index),
replication started working.
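
Incidentally, one way to see which format an index is in is Lucene's
CheckIndex tool, which ships in the Lucene core jar (a sketch; the index
path is made up, and the tool's exact output and flags vary between Lucene
releases):

import org.apache.lucene.index.CheckIndex;

public class IndexFormatCheck {
    public static void main(String[] args) throws Exception {
        // Prints per-segment diagnostics for the given index directory,
        // including the segment file format the index was written with.
        CheckIndex.main(new String[] { "/path/to/solr/data/index" });
    }
}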

Is this documented somewhere?

Best,
Jacob

On Fri, Jan 16, 2009 at 1:27 AM, Shalin Shekhar Mangar
 wrote:
> Hi Jacob,
>
> You don't need to call snapshoot on the master. That is only used to create
> a backup of the index files.
>
> You are calling snappull on the master. It is only applicable for the
> slaves. You don't need to issue these calls yourself at all. The
> ReplicationHandler is designed to take care of these.
>
> The master is showing indexversion as 0 because you haven't called commit on
> the master yet. Can you call commit and see if replication happens on the
> slave?
>
> On Fri, Jan 16, 2009 at 2:24 AM, Jacob Singh  wrote:
>
>> Hi Shalin,
>>
>> Thanks for responding!  This used to be a 1.3 index (could that be the
>> issue?)
>>
>> curl '
>> http://mydomain.com:8080/solr/065f079c24914a4103e2a57178164bbe//replication?command=indexversion
>> '
>> 
>> 
>> 0> name="QTime">00> name="generation">0
>> 
>>
>> Best,
>> Jacob
>>
>>
>> On Jan 15, 2009 3:32pm, Shalin Shekhar Mangar 
>> wrote:
>> > What is the output of /replication?command=indexversion on the master?
>> >
>> >
>> >
>> > On Fri, Jan 16, 2009 at 1:27 AM, Jacob Singh jacobsi...@gmail.com>
>> wrote:
>> >
>> >
>> >
>> > > Hi folks,
>> >
>> > >
>> >
>> > > Here's what I've got going:
>> >
>> > >
>> >
>> > > Master Server with the following config:
>> >
>> > >
>> >
>> > >
>> >
>> > >commit
>> >
>> > >schema.xml,stopwords.txt,elevate.xml
>> >
>> > >
>> >
>> > >
>> >
>> > >
>> >
>> > > Slave server with the following:
>> >
>> > >
>> >
>> > >
>> >
>> > >
>> >
>> > >
>> >
>> > > http://mydomain:8080/solr/065f079c24914a4103e2a57178164bbe/replication
>> >
>> > >
>> >
>> > >00:00:20
>> >
>> > >
>> >
>> > >
>> >
>> > >
>> >
>> > > I think there is a bug in the JSP for the admin pages (which I can
>> >
>> > > post a patch if desired) where the replication link goes to
>> >
>> > > replication/ and index.jsp doesn't get loaded automatically (at least
>> >
>> > > on my box).  I managed to get to the dashboard by adding index.jsp,
>> >
>> > > and it seems that while the slave is polling constantly, it never
>> >
>> > > receives an update.
>> >
>> > >
>> >
>> > > I tried the following:
>> >
>> > >
>> >
>> > > curl '
>> >
>> > >
>> http://mydomain.com:8080/solr/065f079c24914a4103e2a57178164bbe/replication?command=snapshoot
>> >
>> > > '
>> >
>> > >
>> >
>> > >
>> >
>> > > 0
>> > > name="QTime">1
>> > >
>> >
>> > >
>> name="exception">java.lang.NullPointerException:java.lang.NullPointerException
>> >
>> > >
>> >
>> > >
>> >
>> > > The index has about 400 docs in it, and old style replication used to
>> >
>> > > work just fine on it.
>> >
>> > >
>> >
>> > > When I run the snappull command from the slave:
>> >
>> > >
>> >
>> > > curl '
>> >
>> > >
>> http://mydomain.com:8080/solr/065f079c24914a4103e2a57178164bbe/replication?command=snappull
>> >
>> > > '
>> >
>> > >
>> >
>> > >
>> >
>> > > 0
>> > > name="QTime">1OK
>> >
>> > >
>> >
>> > >
>> >
>> > > The replication page also remains unchanged and there are no docs on
>> the
>> >
>> > > slave.
>> >
>> > >
>> >
>> > > Any ideas?
>> >
>> > >
>> >
>> > > Thanks,
>> >
>> > > Jacob
>> >
>> > >
>> >
>> > >
>> >
>> > >
>> >
>> > >
>> >
>> > >
>> >
>> > >
>> >
>> > > --
>> >
>> > >
>> >
>> > > +1 510 277-0891 (o)
>> >
>> > > +91  33 7458 (m)
>> >
>> > >
>> >
>> > > web: http://pajamadesign.com
>> >
>> > >
>> >
>> > > Skype: pajamadesign
>> >
>> > > Yahoo: jacobsingh
>> >
>> > > AIM: jacobsingh
>> >
>> > > gTalk: jacobsi...@gmail.com
>> >
>> > >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > --
>> >
>> > Regards,
>> >
>> > Shalin Shekhar Mangar.
>> >
>>
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



-- 

+1 510 277-0891 (o)
+91  33 7458 (m)

web: http://pajamadesign.com

Skype: pajamadesign
Yahoo: jacobsingh
AIM: jacobsingh
gTalk: jacobsi...@gmail.com


Index files deleted but still accessed by Tomcat

2009-01-16 Thread Dominik Schramm
Hello,

we are running "Solr Implementation Version: 1.2.0 - Yonik - 2007-06-02
17:35:12" with "Lucene Implementation Version: build 2007-05-20" in a
Tomcat application server "Apache Tomcat/5.5.20" on a 64-bit Ubuntu 7.10.

For some time now (probably due to the continuous growth of the index,
which is now roughly 40 GB in size) we have been experiencing a problem with
index files that are deleted but still growing:

r...@cms004:~# lsof | grep deleted
...
java  10601root   84u  REG8,9   237359104
  2981966 /opt/solr1/data/index/_3tmy9.frq (deleted)
java  10601root   85u  REG8,9   120507392
  2981967 /opt/solr1/data/index/_3tmy9.prx (deleted)
java  10601root   86u  REG8,914528512
  2981968 /opt/solr1/data/index/_3tmy9.tis (deleted)
java  10601root   87u  REG8,9  233472
  2981971 /opt/solr1/data/index/_3tmy9.tii (deleted)
...
r...@cms004:~# ps -fp 10601
UIDPID  PPID  C STIME TTY  TIME CMD
root 10601 1 82 Jan15 pts/220:37:04
/usr/lib/jvm/java-1.5.0-sun-1.5.0.13/bin/java -Djava.awt.headless=true -
r...@cms004:~#

During the runs of the optimize.pl script (daily at night) the number of
files marked as deleted increases and drops again, but not to zero!
Several large files always remain even after the optimizer has finished (the
lsof snapshot above is from after an optimize run). Even more
important is the fact that the Tomcat process keeps writing to them,
eventually filling up the partition. The only solution right now is to
restart the Tomcat process once a day. This is not super critical because
we run a staging environment, but it is a nuisance.

I've noticed that the commit preceding the optimize always fails:

2009/01/16 12:08:01 started by d
2009/01/16 12:08:01 command: /opt/solr1/bin/commit
2009/01/16 12:08:04 commit request to Solr at
http://localhost:8083/solr1/update failed:
2009/01/16 12:08:04   02423 
2009/01/16 12:08:04 failed (elapsed time: 3 sec)

What can I do about this? Is this a known phenomenon in version 1.2.0 or
with Tomcat etc., and has it been solved in subsequent versions? I couldn't
find any specific hints about this in the changelogs. An upgrade would be
non-trivial and time-consuming, so I would like to make sure that the
problem will go away afterwards.

BTW: Danilo Fantinato described what appears to be the same problem on Thu,
27 Sep 2007 16:37:01 GMT (when the version we are still using was more or
less current); the subject was "Problem with handle hold deleted files" and
there was no reply. See here:
http://www.nabble.com/Problem-with-handle-hold-deleted-files-td12925293.html

Thanks in advance for any help.

Dominik










Re: How to open a new searcher and close the old one

2009-01-16 Thread Alexander Ramos Jardim
Shalin is right about cache management, but for the sake of completeness:
every time you send a commit to Solr, it will close the old Searcher and
open a new one.
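
For example, a minimal sketch (plain Java; host, port and path are
assumptions, adjust them to your install) that triggers this searcher swap
by posting a commit to the XML update handler:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class CommitTrigger {
    public static void main(String[] args) throws Exception {
        // Assumed Solr location.
        URL url = new URL("http://localhost:8983/solr/update");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");

        // Sending <commit/> makes Solr warm and register a new searcher,
        // then retire the old one once its in-flight requests finish.
        OutputStream out = conn.getOutputStream();
        out.write("<commit/>".getBytes("UTF-8"));
        out.close();

        System.out.println("Commit returned HTTP " + conn.getResponseCode());
        conn.disconnect();
    }
}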

2009/1/16 Shalin Shekhar Mangar 

> On Fri, Jan 16, 2009 at 4:17 PM, Manupriya wrote:
>
> >
> > Hi,
> >
> > We are using Solr as a standalone server. And our web application sends a
> > HTTP request for searching. We receive JSON result back and use the
> result.
> >
> > I had initially asked about Searcher
> > (http://www.nabble.com/What-do-we-mean-by-Searcher--td21436737.html).
> Now
> > I
> > understand it better.
> >
> > As per my understanding, when I send a search query for the first time, a
> > new searcher is opened. And this searcher caters to subsequent requests
> as
> > well. When I stop the Solr server, the current searcher stops/closes with
> > it. So on restarting, a new searcher will be initialized.
>
>
> The first searcher is opened on startup.
>
>
> > Now, I want to know: how can I close the current searcher and open a new
> > searcher through an HTTP request only? I do not want to restart the server
> > to open a new searcher.
>
>
> Why do you need to do that?
>
>
> > We would be implementing caching for our application. And I read that in
> > Solr, cached objects will be valid as long as the Searcher is valid. So
> in
> > order to properly manage cache, we would want to understand if there is
> any
> > way that we can close/open searcher through HTTP requests.
>
>
> Solr automatically manages the cache correctly. Whenever a new commit
> happens on the index, a new searcher is created, warmed up and then it
> replaces the active searcher. All new requests then go to the new searcher.
> The old searcher is closed after it has finished handling all the previous
> requests which had been directed to it.
>
> The cache in Solr will never be stale. So you do not need to worry about
> these things.
>
>
> >
> >
> > Thanks,
> > Manu
> >
> >
> > --
> > View this message in context:
> >
> http://www.nabble.com/How-to-open-a-new-searcher-and-close-the-old-one-tp21496803p21496803.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
> >
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



-- 
Alexander Ramos Jardim


Re: Is it just me or multicore default is broken? Can't ping

2009-01-16 Thread Julian Davchev
I gave it a few new shots today:
- with Jetty and the 16 Jan nightly build: same null pointer exception
- Then I decided not to use Solr multicore but to let Tomcat handle this
instead. So I got the latest Tomcat and, again using the 1.3.0 solr.war,
set everything up as explained at
http://wiki.apache.org/solr/SolrTomcat#head-024d7e11209030f1dbcac9974e55106abae837ac
The admin links all work fine, but I still get a 500 on pings :(

Is everyone using Solr with a single index (core)?

Cheers


Julian Davchev wrote:
> Hi,
>
> I am trying with 1.3.0 from
> http://apache.cbox.biz/lucene/solr/1.3.0/apache-solr-1.3.0.tgz
>
> which I suppose is the stable release.
>
> Otis Gospodnetic wrote:
>   
>> Not sure, I'd have to try it.  But you didn't mention which version of Solr 
>> you are using.  Nightly build?
>>
>>
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>>
>>
>> - Original Message 
>>   
>> 
>>> From: Julian Davchev 
>>> To: solr-user@lucene.apache.org
>>> Sent: Thursday, January 15, 2009 9:53:37 AM
>>> Subject: Is it just me or multicore default is broken? Can't ping
>>>
>>> Hi,
>>> I am trying to setup multicore solr. So I just download default one with
>>> jetty...goto example/
>>> and run
>>> java -Dsolr.solr.home=multicore -jar start.jar
>>>
>>>
>>> All looks smooth without errors on startup.
>>> Also can can open admin at
>>>
>>> http://localhost:8983/solr/core1/admin/
>>>
>>>
>>> But then trying to ping
>>> http://localhost:8983/solr/core1/admin/ping
>>>
>>> I get  error 500 INTERNAL SERVER ERROR
>>>
>>>
>>> And tons of exceptions in background starting with nullpointer
>>>
>>> Anyone have a clue? Is Solr stable to use, or is multicore something
>>> recently added and not to be trusted yet?
>>> 
>>>   
>>   
>> 
>
>   



Re: How to open a new searcher and close the old one

2009-01-16 Thread Shalin Shekhar Mangar
On Fri, Jan 16, 2009 at 4:17 PM, Manupriya wrote:

>
> Hi,
>
> We are using Solr as a standalone server. And our web application sends a
> HTTP request for searching. We receive JSON result back and use the result.
>
> I had initially asked about Searcher
> (http://www.nabble.com/What-do-we-mean-by-Searcher--td21436737.html). Now
> I
> understand it better.
>
> As per my understanding, when I send a search query for the first time, a
> new searcher is opened. And this searcher caters to subsequent requests as
> well. When I stop the Solr server, the current searcher stops/closes with
> it. So on restarting, a new searcher will be initialized.


The first searcher is opened on startup.


> Now, I want to know: how can I close the current searcher and open a new
> searcher through an HTTP request only? I do not want to restart the server
> to open a new searcher.


Why do you need to do that?


> We would be implementing caching for our application. And I read that in
> Solr, cached objects will be valid as long as the Searcher is valid. So in
> order to properly manage cache, we would want to understand if there is any
> way that we can close/open searcher through HTTP requests.


Solr automatically manages the cache correctly. Whenever a new commit
happens on the index, a new searcher is created, warmed up and then it
replaces the active searcher. All new requests then go to the new searcher.
The old searcher is closed after it has finished handling all the previous
requests which had been directed to it.

The cache in Solr will never be stale. So you do not need to worry about
these things.
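
To make that concrete, here is a small sketch (plain Java; the URL, field
name and document are made up, and your schema must have a matching field)
that relies on exactly this behaviour: once the commit returns, the very
next query already reflects the new document, with no searcher handling on
the client side.

import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class FreshnessDemo {
    // POST a small XML body to the Solr update handler.
    static void post(String body) throws Exception {
        HttpURLConnection c = (HttpURLConnection)
                new URL("http://localhost:8983/solr/update").openConnection();
        c.setRequestMethod("POST");
        c.setDoOutput(true);
        c.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
        OutputStream out = c.getOutputStream();
        out.write(body.getBytes("UTF-8"));
        out.close();
        c.getResponseCode();   // drain the status so the request completes
        c.disconnect();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document; field names depend on your schema.
        post("<add><doc><field name=\"id\">demo-1</field></doc></add>");
        post("<commit/>");     // blocks until the new searcher is registered

        // The next query is served by the freshly warmed searcher.
        URL q = new URL("http://localhost:8983/solr/select?q=id:demo-1&wt=json");
        InputStream in = q.openStream();
        for (int b; (b = in.read()) != -1; ) System.out.write(b);
        System.out.flush();
        in.close();
    }
}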


>
>
> Thanks,
> Manu
>
>
> --
> View this message in context:
> http://www.nabble.com/How-to-open-a-new-searcher-and-close-the-old-one-tp21496803p21496803.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
Regards,
Shalin Shekhar Mangar.


How to open a new searcher and close the old one

2009-01-16 Thread Manupriya

Hi,

We are using Solr as a standalone server, and our web application sends an
HTTP request for searching. We receive a JSON result back and use it.

I had initially asked about Searcher
(http://www.nabble.com/What-do-we-mean-by-Searcher--td21436737.html). Now I
understand it better.

As per my understanding, when I send a search query for the first time, a
new searcher is opened. And this searcher caters to subsequent requests as
well. When I stop the Solr server, the current searcher stops/closes with
it. So on restarting, a new searcher will be initialized.

Now, I want to know: how can I close the current searcher and open a new
searcher through an HTTP request only? I do not want to restart the server
to open a new searcher.

We would be implementing caching for our application. And I read that in
Solr, cached objects will be valid as long as the Searcher is valid. So in
order to properly manage cache, we would want to understand if there is any
way that we can close/open searcher through HTTP requests.

Thanks,
Manu


-- 
View this message in context: 
http://www.nabble.com/How-to-open-a-new-searcher-and-close-the-old-one-tp21496803p21496803.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: DIH XPathEntityProcessor fails with docs containing a DOCTYPE

2009-01-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
The StAX parser automatically tries to fetch the DTD. How can we disable
that at the parser level?
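
For what it's worth, a sketch of one way this might be done with the
standard StAX API (untested against DIH's parser setup; Woodstox may also
need its own implementation-specific properties):

import java.io.ByteArrayInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLResolver;
import javax.xml.stream.XMLStreamException;

public class NoDtdFactory {
    public static XMLInputFactory create() {
        XMLInputFactory factory = XMLInputFactory.newInstance();

        // Ask the parser not to process DTDs at all. Some implementations
        // still read the external subset, hence the resolver below as well.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);

        // If the DTD is consulted anyway, resolve every external entity
        // to an empty stream instead of touching the filesystem or network.
        factory.setXMLResolver(new XMLResolver() {
            public Object resolveEntity(String publicID, String systemID,
                                        String baseURI, String namespace)
                    throws XMLStreamException {
                return new ByteArrayInputStream(new byte[0]);
            }
        });
        return factory;
    }
}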

On Fri, Jan 16, 2009 at 3:34 PM, Fergus McMenemie  wrote:
> Hello all, as the subject says:
>   DIH XPathEntityProcessor fails with docs containing a DOCTYPE declaration.
>
> This is using a Solr nightly build from Monday.
>
> INFO: Server startup in 3623 ms
> Jan 16, 2009 9:54:12 AM org.apache.solr.handler.dataimport.SolrWriter 
> readIndexerProperties
> INFO: Read dataimport.properties
> Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrCore execute
> INFO: [jdocs] webapp=/solr path=/walkj params={command=full-import} status=0 
> QTime=13
> Jan 16, 2009 9:54:12 AM org.apache.solr.handler.dataimport.DataImporter 
> doFullImport
> INFO: Starting Full Import
> Jan 16, 2009 9:54:12 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
> INFO: [jdocs] REMOVING ALL DOCUMENTS FROM INDEX
> Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrDeletionPolicy onInit
> INFO: SolrDeletionPolicy.onInit: commits:num=2
>
> commit{dir=/Volumes/spare/ts/solrnightlyj/data/index,segFN=segments_c,version=1232026423291,generation=12,filenames=[segments_c,
>  _4.fnm, _4.frq, _4.prx, _4.tis, _4.tii, _4.nrm, _4.fdx, _4.fdt]
>
> commit{dir=/Volumes/spare/ts/solrnightlyj/data/index,segFN=segments_d,version=1232026423292,generation=13,filenames=[segments_d]
> Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrDeletionPolicy updateCommits
> INFO: last commit = 1232026423292
> Jan 16, 2009 9:54:13 AM org.apache.solr.handler.dataimport.DocBuilder 
> buildDocument
> SEVERE: Exception while processing: jcurrent document : null
> org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed 
> for xml, url:/j/dtd/jxml/data/news/2008/frp70450.xmlrows processed :0 
> Processing Document # 1
>at 
> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
>at 
> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:252)
>at 
> org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:177)
>at 
> org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
>at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
>at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
>at 
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
>at 
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
>at 
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
>at 
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
>at 
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
> Caused by: java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: 
> (was java.io.FileNotFoundException) /../config/jml-delivery-norm-2.1.dtd (No 
> such file or directory)
>  at [row,col {unknown-source}]: [3,81]
>at 
> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85)
>at 
> org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:242)
>... 9 more
> Caused by: com.ctc.wstx.exc.WstxParsingException: (was 
> java.io.FileNotFoundException) /../config/jml-delivery-norm-2.1.dtd (No such 
> file or directory)
>  at [row,col {unknown-source}]: [3,81]
>at 
> com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
>at 
> com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
>at 
> com.ctc.wstx.sr.ValidatingStreamReader.findDtdExtSubset(ValidatingStreamReader.java:475)
>at 
> com.ctc.wstx.sr.ValidatingStreamReader.finishDTD(ValidatingStreamReader.java:358)
>at 
> com.ctc.wstx.sr.BasicStreamReader.skipToken(BasicStreamReader.java:3351)
>at 
> com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:1988)
>at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1069)
>at 
> org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:141)
>at 
> org.apache.solr.handler.dataimport.XPathRecordReader$Node.access$000(XPathRecordReader.java:89)
>at 
> org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:82)
>... 10 more
> Jan 16, 2009 9:54:13 AM org.apache.solr.handler.dataimport.DataImporter 
> doFullImport
> SEVERE: Full Import failed
>
> A fragment from the top of the failing document is
>
> <?xml version=...?>
> <?xml-stylesheet href="../../../../config/support/j-deliver.xsl"?>
> <!DOCTYPE ...>
> <... xmlns="http://dtd.j.com/2002/Content/" id="frp70450" urname="record">
>   <... xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href=""
> urname="metadata" xlink:type="simple">
>     <... xmlns:dc="http://purl.org/dc/elements/1.1/"

DIH XPathEntityProcessor fails with docs containing a DOCTYPE

2009-01-16 Thread Fergus McMenemie
Hello all, as the subject says:
   DIH XPathEntityProcessor fails with docs containing a DOCTYPE declaration.

This is using a Solr nightly build from Monday.

INFO: Server startup in 3623 ms
Jan 16, 2009 9:54:12 AM org.apache.solr.handler.dataimport.SolrWriter 
readIndexerProperties
INFO: Read dataimport.properties
Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrCore execute
INFO: [jdocs] webapp=/solr path=/walkj params={command=full-import} status=0 
QTime=13 
Jan 16, 2009 9:54:12 AM org.apache.solr.handler.dataimport.DataImporter 
doFullImport
INFO: Starting Full Import
Jan 16, 2009 9:54:12 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
INFO: [jdocs] REMOVING ALL DOCUMENTS FROM INDEX
Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrDeletionPolicy onInit
INFO: SolrDeletionPolicy.onInit: commits:num=2

commit{dir=/Volumes/spare/ts/solrnightlyj/data/index,segFN=segments_c,version=1232026423291,generation=12,filenames=[segments_c,
 _4.fnm, _4.frq, _4.prx, _4.tis, _4.tii, _4.nrm, _4.fdx, _4.fdt]

commit{dir=/Volumes/spare/ts/solrnightlyj/data/index,segFN=segments_d,version=1232026423292,generation=13,filenames=[segments_d]
Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: last commit = 1232026423292
Jan 16, 2009 9:54:13 AM org.apache.solr.handler.dataimport.DocBuilder 
buildDocument
SEVERE: Exception while processing: jcurrent document : null
org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed 
for xml, url:/j/dtd/jxml/data/news/2008/frp70450.xmlrows processed :0 
Processing Document # 1
at 
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:252)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:177)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
at 
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
Caused by: java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: 
(was java.io.FileNotFoundException) /../config/jml-delivery-norm-2.1.dtd (No 
such file or directory)
 at [row,col {unknown-source}]: [3,81]
at 
org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:242)
... 9 more
Caused by: com.ctc.wstx.exc.WstxParsingException: (was 
java.io.FileNotFoundException) /../config/jml-delivery-norm-2.1.dtd (No such 
file or directory)
 at [row,col {unknown-source}]: [3,81]
at 
com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
at 
com.ctc.wstx.sr.ValidatingStreamReader.findDtdExtSubset(ValidatingStreamReader.java:475)
at 
com.ctc.wstx.sr.ValidatingStreamReader.finishDTD(ValidatingStreamReader.java:358)
at 
com.ctc.wstx.sr.BasicStreamReader.skipToken(BasicStreamReader.java:3351)
at 
com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:1988)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1069)
at 
org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:141)
at 
org.apache.solr.handler.dataimport.XPathRecordReader$Node.access$000(XPathRecordReader.java:89)
at 
org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:82)
... 10 more
Jan 16, 2009 9:54:13 AM org.apache.solr.handler.dataimport.DataImporter 
doFullImport
SEVERE: Full Import failed

A fragment from the top of the failing document is (markup mangled in the
archive; lost element names are shown as "..."):

<?xml version=...?>
<?xml-stylesheet href="../../../../config/support/j-deliver.xsl"?>
<!DOCTYPE ...>
<... xmlns="http://dtd.j.com/2002/Content/" id="frp70450" urname="record">
  <... xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href=""
urname="metadata" xlink:type="simple">
    <... xmlns:dc="http://purl.org/dc/elements/1.1/"
qualifier="pdate">20080131</...>

The DTD does exist at the specified location. Removing the DOCTYPE directive
fixes everything. I know that use of DOCTYPE is out of fashion, and it does
not appear in our newer documents; however, there are lots of older XML docs
about!

Regards Fergus.
-- 

===
Fergus McMenemie 

Re: Unwanted clustering of search results after sorting by score

2009-01-16 Thread Axel Tetzlaff

Hi Otis,

thanks for your input. Although I agree that we may have to go over the
search result once more, I don't think doing so for the first result page
only is sufficient.
In the first example I showed before, you can see that some of the desired
products (from shops B and C) in fact occur on later pages - and that example
is heavily simplified. With over half a million products, searches for
single words (which are the most common) can easily match a huge set of
documents.


Otis Gospodnetic wrote:
> 
>   This should be doable with a function query, too.
> 
I had a look at function queries as well, and couldn't figure out how to
use them for this purpose. AFAIK one can only operate on numeric
fields, which have to be set up at index time. But the distribution of
shops across the products in a search result can only be determined
at search time.
Can you give me a more concrete hint on how you would aggregate this
information with a function query?

thanks,
Axel
-- 
View this message in context: 
http://www.nabble.com/Unwanted-clustering-of-search-results-after-sorting-by-score-tp20977761p21495453.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: collectionDistribution vs SolrReplication

2009-01-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Fri, Jan 16, 2009 at 3:37 AM, Chris Hostetter
 wrote:
>
> : I would like to know the advantages of moving from:
> : a master-slave system using CollectionDistribution with all their .sh
> : scripts
> : http://wiki.apache.org/solr/CollectionDistribution
> : to:
> : use SolrReplication and his solrconfig.xml configuration.
> : http://wiki.apache.org/solr/SolrReplication
>
> in addition to other comments posted it's important to keep in mind that
> one of the original motivations for the new style of replication was to
> have a 100% java based solution; as a result, it is the only
> replication approach that works on windows.
>
> (in particular: it has no dependency on being able to delete hardlinks, or
> on running rsync, or on using ssh, or on having external crons, etc..)
>
> I still haven't had a chance to really kick the tires on the java based
> replication, so i have no real experience to base either of these claims
> on, but my hunch is that:
>  1) new users will find the java based replication *much* easier to get
> up and running (a lot fewer moving parts and external processes to deal
> with)
>  2) existing users who already have the script based replication working
> for them may find the java based replication less transparent and harder
> to manipulate in tricky ways.
>
> ...that second hunch comes from the fact that since the java replication
> is all self contained in solr, and doesn't use all of the various
> external processes (cron, rsync, snapshooter, snappuller, ssh, etc...)
> there are fewer places for people to manipulate the replication when doing
> 'atypical' operations ... for example: during a phased rollout of some new
> code/schema, you might disable all replication by shutting down the rsyncd
> port; then disabling it for a few slaves by commenting out the snappuller
> cron before turning rsyncd back on ... etc.
The built-in replication can also replicate schema/config files, which makes
a lot of these steps unnecessary.
All the disable/enable operations are exposed as HTTP commands.
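
For instance (a sketch; the host name is made up, and the command names are
the ones listed on the SolrReplication wiki page, so double-check them
against your build), pausing and resuming a slave's polling is just an HTTP
call:

import java.net.HttpURLConnection;
import java.net.URL;

public class PollToggle {
    static int call(String command) throws Exception {
        // Assumed slave location; adjust to your own setup.
        URL url = new URL("http://slave-host:8983/solr/replication?command="
                + command);
        HttpURLConnection c = (HttpURLConnection) url.openConnection();
        int status = c.getResponseCode();
        c.disconnect();
        return status;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("disablepoll -> HTTP " + call("disablepoll"));
        // ... roll out the new code/schema here ...
        System.out.println("enablepoll  -> HTTP " + call("enablepoll"));
    }
}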
>
> these types of tricks are probably unnecessary in 90% of the use cases,
> and people who aren't used to being able to do them probably won't care,
> but if you are used to having that level of control, you might miss them.
>
> (but as i said: i haven't had a chance to try out the java replication at
> all, so for all i know it's just as tweakable and i'm just an idiot.)
>
> -Hoss
>
>



-- 
--Noble Paul