Re: Can I still search documents once updated?

2011-07-12 Thread Gabriele Kahlout
It indeed is not stored, but this is still unexpected behavior. It's an
indexed field, so why has the indexed data been lost?


On Wed, Jul 13, 2011 at 12:44 AM, Erick Erickson wrote:

> Unless you stored your "content" field, the value you put in there won't
> be fetched from the index. Verify that the doc you retrieve from the index
> has values for "content", I bet it doesn't
>
> Best
> Erick
>
> On Tue, Jul 12, 2011 at 9:38 AM, Gabriele Kahlout
>  wrote:
> >  @Test
> >    public void testUpdateLoseTermsSimplified() throws Exception {
> >        IndexWriter writer = indexDoc();
> >        assertEquals(1, writer.numDocs());
> >        IndexSearcher searcher = getSearcher(writer);
> >        final TermQuery termQuery = new TermQuery(new Term(content, "essen"));
> >
> >        TopDocs docs = searcher.search(termQuery, 1);
> >        assertEquals(1, docs.totalHits);
> >        Document doc = searcher.doc(0);
> >
> >        writer.updateDocument(new Term(id, doc.get(id)), doc);
> >
> >        searcher = getSearcher(writer);
> >        docs = searcher.search(termQuery, 1);
> >        assertEquals(1, docs.totalHits); // docs.totalHits == 0 !
> >    }
> >
> > testUpdateLosesTerms(com.mysimpatico.me.indexplugins.WcTest)  Time
> elapsed:
> > 0.346 sec  <<< FAILURE!
> > java.lang.AssertionError: expected:<1> but was:<0>
> >at org.junit.Assert.fail(Assert.java:91)
> >at org.junit.Assert.failNotEquals(Assert.java:645)
> >at org.junit.Assert.assertEquals(Assert.java:126)
> >at org.junit.Assert.assertEquals(Assert.java:470)
> >at org.junit.Assert.assertEquals(Assert.java:454)
> >at
> >
> com.mysimpatico.me.indexplugins.WcTest.testUpdateLosesTerms(WcTest.java:271)
> >
> > I have not changed anything (as you can see) during the update. I just
> > retrieve a document and then update it. But then the termQuery that worked
> > before doesn't work anymore (while the "id" field wasn't changed). Is this
> > to be expected when the content field is not stored?
> >
> > --
> > Regards,
> > K. Gabriele
> >
> > --- unchanged since 20/9/10 ---
> > P.S. If the subject contains "[LON]" or the addressee acknowledges the
> > receipt within 48 hours then I don't resend the email.
> > subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
> time(x)
> > < Now + 48h) ⇒ ¬resend(I, this).
> >
> > If an email is sent by a sender that is not a trusted contact or the
> email
> > does not contain a valid code then the email is not received. A valid
> code
> > starts with a hyphen and ends with "X".
> > ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
> > L(-[a-z]+[0-9]X)).
> >
>



-- 
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains "[LON]" or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
< Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with "X".
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).


Re: nutch 1.2, solr 3.3, tomcat6. java.io.IOException: Job failed! problem when building solrindex

2011-07-12 Thread Geek Gamer
you need to update the solrj libs to the 3.x version; the javabin format
has changed.
I made the change a few months back, you can pull the changes from
https://github.com/geek4377/nutch/tree/geek5377-1.2.1

hope that helps,
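
If it helps, the swap looks roughly like this (a sketch from memory; adjust
the paths and jar names to your layout before trusting it):

    # drop the old 1.x solrj jar, copy in the 3.x one, rebuild the job
    rm nutch-1.2/lib/apache-solr-solrj-*.jar
    cp apache-solr-3.3.0/dist/apache-solr-solrj-3.3.0.jar nutch-1.2/lib/
    cd nutch-1.2 && ant job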


On Wed, Jul 13, 2011 at 8:58 AM, Leo Subscriptions
 wrote:
> I'm running 64bit Ubuntu 11.04, nutch 1.2, solr 3.3 (downloaded, not
> built) and tomcat6 following this (and some other) links
> http://wiki.apache.org/nutch/RunningNutchAndSolr
>
> I have added the nutch schema and can access/view this schema via the
> admin page. nutch also works as I can perform successful searches.
>
> When I execute the following:
>
>>> ./bin/nutch solrindex http://localhost:8080/solr/core0 crawl/crawldb
> crawl/linkdb crawl/segments/*
>
> I (eventually) get an io error.
>
> The above command creates the following files in
> /var/lib/tomcat6/solr/core0/data/index/
>
> ---
> 544 -rw-r--r-- 1 tomcat6 tomcat6 557056 2011-07-13 11:09 _1.fdt
>  0 -rw-r--r-- 1 tomcat6 tomcat6      0 2011-07-13 11:00 _1.fdx
>  4 -rw-r--r-- 1 tomcat6 tomcat6     32 2011-07-13 10:59 segments_2
>  4 -rw-r--r-- 1 tomcat6 tomcat6     20 2011-07-13 10:59 segments.gen
>  0 -rw-r--r-- 1 tomcat6 tomcat6      0 2011-07-13 11:00 write.lock
> ---
>
> but the hadoop.log reports the following error
>
> ---
> 2011-07-13 11:09:47,665 INFO  indexer.IndexingFilters - Adding
> org.apache.nutch.indexer.basic.BasicIndexingFilter
> 2011-07-13 11:09:47,666 INFO  indexer.IndexingFilters - Adding
> org.apache.nutch.indexer.anchor.AnchorIndexingFilter
> 2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: content
> dest: content
> 2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: site
> dest: site
> 2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: title
> dest: title
> 2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: host
> dest: host
> 2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: segment
> dest: segment
> 2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: boost
> dest: boost
> 2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: digest
> dest: digest
> 2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: tstamp
> dest: tstamp
> 2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: url dest:
> id
> 2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: url dest:
> url
> 2011-07-13 11:09:49,272 WARN  mapred.LocalJobRunner - job_local_0001
> java.lang.RuntimeException: Invalid version or the data in not in
> 'javabin' format
>        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
>        at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
>        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:466)
>        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
>        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
>        at org.apache.nutch.indexer.solr.SolrWriter.write(SolrWriter.java:64)
>        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:54)
>        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:44)
>        at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:440)
>        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:159)
>        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
>        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
>        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
>        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> 2011-07-13 11:09:49,611 ERROR solr.SolrIndexer - java.io.IOException:
> Job failed!
> ---
>
> I'd appreciate any help with this.
>
> Thanks,
>
> Leo
>
>
>
>


nutch 1.2, solr 3.3, tomcat6. java.io.IOException: Job failed! problem when building solrindex

2011-07-12 Thread Leo Subscriptions
I'm running 64bit Ubuntu 11.04, nutch 1.2, solr 3.3 (downloaded, not
built) and tomcat6 following this (and some other) links
http://wiki.apache.org/nutch/RunningNutchAndSolr

I have added the nutch schema and can access/view this schema via the
admin page. nutch also works as I can perform successful searches.

When I execute the following:

>> ./bin/nutch solrindex http://localhost:8080/solr/core0 crawl/crawldb
crawl/linkdb crawl/segments/*

I (eventually) get an io error. 

The above command creates the following files in
/var/lib/tomcat6/solr/core0/data/index/

---
544 -rw-r--r-- 1 tomcat6 tomcat6 557056 2011-07-13 11:09 _1.fdt
  0 -rw-r--r-- 1 tomcat6 tomcat6  0 2011-07-13 11:00 _1.fdx
  4 -rw-r--r-- 1 tomcat6 tomcat6 32 2011-07-13 10:59 segments_2
  4 -rw-r--r-- 1 tomcat6 tomcat6 20 2011-07-13 10:59 segments.gen
  0 -rw-r--r-- 1 tomcat6 tomcat6  0 2011-07-13 11:00 write.lock
---

but the hadoop.log reports the following error

---
2011-07-13 11:09:47,665 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.basic.BasicIndexingFilter
2011-07-13 11:09:47,666 INFO  indexer.IndexingFilters - Adding
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: content
dest: content
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: site
dest: site
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: title
dest: title
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: host
dest: host
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: segment
dest: segment
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: boost
dest: boost
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: digest
dest: digest
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: tstamp
dest: tstamp
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: url dest:
id
2011-07-13 11:09:47,690 INFO  solr.SolrMappingReader - source: url dest:
url
2011-07-13 11:09:49,272 WARN  mapred.LocalJobRunner - job_local_0001
java.lang.RuntimeException: Invalid version or the data in not in
'javabin' format
        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
        at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:466)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
        at org.apache.nutch.indexer.solr.SolrWriter.write(SolrWriter.java:64)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:54)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:44)
        at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:440)
        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:159)
        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2011-07-13 11:09:49,611 ERROR solr.SolrIndexer - java.io.IOException:
Job failed!
---

I'd appreciate any help with this.

Thanks,

Leo





Re: Possible bug in Solr 3.3 grouping

2011-07-12 Thread Nikhil Chhaochharia
Thanks Martijn - I should be able to patch the Solr 3.3 release based on 
r1145748.

- Nikhil




From: Martijn v Groningen
To: solr-user@lucene.apache.org; Nikhil Chhaochharia
Sent: Wednesday, 13 July 2011 2:04 AM
Subject: Re: Possible bug in Solr 3.3 grouping


Hi Nikhil,

Thanks for raising this issue. I checked this particular issue in a test case 
and I ran into the same error, so this is indeed a bug. I've fixed this issue 
for 3x in revision 1145748.
So checking out the latest 3x branch and building Solr yourself should give you 
this bug fix. Or you can wait until the 3x build produces new nightly artifacts:
https://builds.apache.org/job/Solr-3.x/lastSuccessfulBuild/artifact/artifacts/

The numFound in your case is based on the number of documents, not groups. So
you might still get an empty result, because there might be fewer than 40
groups in this result set.
You can see the number of groups by using the group.ngroups=true parameter;
this includes the number of groups, but group.format must be grouped,
otherwise you don't get the number of groups in the response.

Martijn

On 12 July 2011 09:00, Nikhil Chhaochharia  wrote:

Hi,
>
>I am using Solr 3.3 and have run into a problem with grouping.  If 
>'group.main' is 'true' and 'start' is greater than 'rows', then I do not get 
>any results.  A sample response is:
>
>
><response>
><lst name="responseHeader"><int name="status">0</int><int name="QTime">602</int></lst>
><result name="response" numFound="2239538" start="40" maxScore="6.140459"/>
></response>
>
>
>
>If 'group.main' is false, then I get results.
>
>Did anyone else come across this problem?  Using the grouping feature with 
>pagination of results will make start > rows from the third page onwards.
>
>Thanks,
>Nikhil
>
>

Re: ' invisible ' words

2011-07-12 Thread deniz
nothing was changed... the result is still the same... should i implement my
own analyzer or tokenizer for the problem?

-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/invisible-words-tp3158060p3164670.html
Sent from the Solr - User mailing list archive at Nabble.com.


Grouping / Collapse Query

2011-07-12 Thread entdeveloper
I'm messing around with the field collapsing in 4.x
http://wiki.apache.org/solr/FieldCollapsing . Is it currently possible to
group by a field with a certain value only and leave all the others
ungrouped using the group.query param? This currently doesn't seem to work
the way I want it to.

For example, I have documents all with a "type" field. Possible values are:
picture, video, game, other. I want to only group the pictures, and leave
all other documents ungrouped.

If I query something like:
q=dogs&group=true&group.query=type:picture

I ONLY get pictures back. Seems like this behaves more like an 'fq'

What I want is a result set that looks like this:

1. doc 1, type=video
2. doc 2, type=game
3. doc 3, type=picture, + 3 other pictures
4. doc 4, type=video
5. doc 5, type=video
...

I've also tried:
q=dogs&group=true&group.query=type:picture&group.query=-type:video
-type:game

But this doesn't work because the ordering of the groups doesn't produce the
correct order of results that should be displayed.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Grouping-Collapse-Query-tp3164433p3164433.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr/velocity: function for sorting asc/desc

2011-07-12 Thread Erick Erickson
Velocity should have nothing to do with it, you specify on the query
something like &sort=title asc or &sort=title desc. But do note that
sorting only really makes sense when the field is NOT tokenized. Often
using a combination of a Keyword tokenizer and LowerCaseFilter gives
the results you're looking for...
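
For example, a field type along these lines in schema.xml (a sketch; the
example schema ships something similar under the name "alphaOnlySort"):

    <fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

Then copyField your title into a field of that type and sort on it with
&sort=title_sort asc (the field name is up to you).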

Best
Erick

On Tue, Jul 12, 2011 at 3:00 PM, okayndc  wrote:
> hello,
>
> was wondering if there is a solr/velocity function out there that can sort,
> say, a title name, by clicking on a link named "sort title", ascending or
> descending alphabetically?  or is this a frontend/jquery type of thing?
>
> thanks
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/solr-velocity-funtion-for-sorting-asc-desc-tp3163549p3163549.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: ContentStreamLoader Problem

2011-07-12 Thread Erick Erickson
This is a shot in the dark, but this smells like a classpath issue,
and since you have
a 1.4.1 installation on the machine, I'm *guessing* that you're getting a mix of
old and new Jars. What happens if you try this on a machine that doesn't have
1.4.1 on it? If that works, then it's likely a classpath issue

Best
Erick

On Tue, Jul 12, 2011 at 1:33 PM, Tod  wrote:
> I'm getting this error testing Solr V3.3.0 using the
> ExtractingRequestHandler.  I'm taking advantage of the REST interface and
> batching my documents in using stream.url.   It happens for every document I
> try to index.  It works fine under Solr 1.4.1.
>
> I'm running everything under Tomcat.  I already have an existing 1.4.1
> instance running, could that be causing the problem?
>
>
> Thanks - Tod
>
>
>
>
> Jul 12, 2011 1:11:31 PM org.apache.solr.update.processor.LogUpdateProcessor
> finish
> INFO: {} 0 1
> Jul 12, 2011 1:11:31 PM org.apache.solr.common.SolrException log
> SEVERE: java.lang.AbstractMethodError:
> org/apache/solr/handler/ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)V
>        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:67)
>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
>        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
>        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
>        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
>        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
>        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
>        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
>        at java.lang.Thread.run(Thread.java:811)
>
>


Re: how to build lucene-solr (especially if behind a firewall)?

2011-07-12 Thread Erick Erickson
What target did you try to build? Did you try "ant dist"?
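
For the proxy part, the kind of invocation I mean looks roughly like this
(untested sketch; the proxy host and port are placeholders for your own):

    export ANT_OPTS="-Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=3128 \
        -Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=3128"
    ant dist

As you found, the forked javadoc tasks don't inherit these, which is why the
-J-Dhttp.proxyHost form has to be passed to them separately.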

Best
Erick

On Tue, Jul 12, 2011 at 12:38 PM, Will Milspec  wrote:
> hi all,
>
> building lucene/solr behind the firewall fails for us due to proxy errors.
>
> I tried setting the ant_opts -Dhttp.proxyHost, etc, but found the "lucene"
> portion still failed on javadoc links.
>
> I worked around this by changing failonjavadocerror to 'false' in
> lucene/common-build.xml (or alternatively adding -J-Dhttp.proxyHost, etc as
> "args" element to the
> javadoc tasks), but then 'changes2html' failed to connect to
> https://issues.apache.org.
>
> I'm posting to the solr-user group (even though compiling is developer-ish
> stuff) as we need to apply a few patches to lucene-solr.
>
> Would someone be so kind as to post the following?
> * Easiest way to build lucene-solr from source
> * same, but if you're behind the firewall.
>
> thanks,
>
> will
>


Re: Can I still search documents once updated?

2011-07-12 Thread Erick Erickson
Unless you stored your "content" field, the value you put in there won't
be fetched from the index. Verify that the doc you retrieve from the index
has values for "content", I bet it doesn't
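
To illustrate (a rough sketch, untested, against the Lucene 3.x API;
originalContentText is a placeholder for wherever the raw text really comes
from, since it is not in the index):

    // searcher.doc() returns only the *stored* fields, so "content"
    // (indexed but not stored) is absent from the retrieved Document.
    Document stored = searcher.doc(0);
    // re-add the unstored field before re-indexing, otherwise the
    // updated document ends up with no "content" terms at all
    stored.add(new Field(content, originalContentText,
                         Field.Store.NO, Field.Index.ANALYZED));
    writer.updateDocument(new Term(id, stored.get(id)), stored);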

Best
Erick

On Tue, Jul 12, 2011 at 9:38 AM, Gabriele Kahlout
 wrote:
>  @Test
>    public void testUpdateLoseTermsSimplified() throws Exception {
>        IndexWriter writer = indexDoc();
>        assertEquals(1, writer.numDocs());
>        IndexSearcher searcher = getSearcher(writer);
>        final TermQuery termQuery = new TermQuery(new Term(content, "essen"));
>
>        TopDocs docs = searcher.search(termQuery, 1);
>        assertEquals(1, docs.totalHits);
>        Document doc = searcher.doc(0);
>
>        writer.updateDocument(new Term(id, doc.get(id)), doc);
>
>        searcher = getSearcher(writer);
>        docs = searcher.search(termQuery, 1);
>        assertEquals(1, docs.totalHits); // docs.totalHits == 0 !
>    }
>
> testUpdateLosesTerms(com.mysimpatico.me.indexplugins.WcTest)  Time elapsed:
> 0.346 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<1> but was:<0>
>    at org.junit.Assert.fail(Assert.java:91)
>    at org.junit.Assert.failNotEquals(Assert.java:645)
>    at org.junit.Assert.assertEquals(Assert.java:126)
>    at org.junit.Assert.assertEquals(Assert.java:470)
>    at org.junit.Assert.assertEquals(Assert.java:454)
>    at
> com.mysimpatico.me.indexplugins.WcTest.testUpdateLosesTerms(WcTest.java:271)
>
> I have not changed anything (as you can see) during the update. I just
> retrieve a document and then update it. But then the termQuery that worked
> before doesn't work anymore (while the "id" field wasn't changed). Is this
> to be expected when the content field is not stored?
>
> --
> Regards,
> K. Gabriele
>
> --- unchanged since 20/9/10 ---
> P.S. If the subject contains "[LON]" or the addressee acknowledges the
> receipt within 48 hours then I don't resend the email.
> subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
> < Now + 48h) ⇒ ¬resend(I, this).
>
> If an email is sent by a sender that is not a trusted contact or the email
> does not contain a valid code then the email is not received. A valid code
> starts with a hyphen and ends with "X".
> ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
> L(-[a-z]+[0-9]X)).
>


Re: Query Rewrite

2011-07-12 Thread Jamie Johnson
I'm not following where the aliasing feature I'm looking for is.
Looking at the patch I didn't see it either.  Essentially what I'm
looking for is that when a user searches for person_name:john, the
query turns into person_name:john OR person_name_first:john OR
person_name_last:john.  I don't see anything like that here; am I just
missing it?
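
For comparison, here's my guess (untested, pieced together from the patches)
at what the per-field aliasing I'm after would look like:

    defType=edismax&q=person_name:john
      &f.person_name.qf=person_name person_name_first person_name_last

i.e. person_name fans out into the three concrete fields.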

On Tue, Jul 12, 2011 at 3:06 PM, Chris Hostetter
 wrote:
>
> : Thanks Hoss.  I'm not really sure where to begin looking with this, I
> : quickly read the JIRA but don't see mention of exposing the multiple
> : aliases.  Can you provide any more details?
>
> i referred to it as "uf" or "user fields" ... note the specific comment i
> linked to in the first url, and the subsequent patch
>
> the colon bug in edismax is what hung me up at the time.
>
> :
> : On Tue, Jul 12, 2011 at 1:19 PM, Chris Hostetter
> :  wrote:
> : > : Taking a closer look at this it seems as if the
> : > : DisjunctionMaxQueryParser supports doing multiple aliases and
> : > : generating multiple queries, I didn't see this same capability in the
> : > : ExtendedDismaxQParser, am I just missing it?  If this capability were
> : >
> : > it's never been exposed at a user level ... i started looking at adding it
> : > to edismax but ran into a bug i couldn't uncover in the time i had to work
> : > on it (which i think has since been fixed)...
> : >
> : > 
> https://issues.apache.org/jira/browse/SOLR-1553?focusedCommentId=12839892&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12839892
> : > https://issues.apache.org/jira/browse/SOLR-2409
> : > https://issues.apache.org/jira/browse/SOLR-2368
> : >
> : >
> : > -Hoss
> : >
> :
>
> -Hoss


Re: Possible bug in Solr 3.3 grouping

2011-07-12 Thread Martijn v Groningen
Hi Nikhil,

Thanks for raising this issue. I checked this particular issue in a test
case and I ran into the same error, so this is indeed a bug. I've fixed this
issue for 3x in revision 1145748.
So checking out the latest 3x branch and building Solr yourself should give
you this bug fix. Or you can wait until the 3x build produces
new nightly artifacts:
https://builds.apache.org/job/Solr-3.x/lastSuccessfulBuild/artifact/artifacts/

The numFound in your case is based on the number of documents, not groups. So
you might still get an empty result, because there might be fewer than 40
groups in this result set.
You can see the number of groups by using the group.ngroups=true parameter;
this includes the number of groups, but group.format must be grouped,
otherwise you don't get the number of groups in the response.
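
For example, a request along these lines (sketch; substitute your own field
name):

    q=*:*&group=true&group.field=yourfield&group.ngroups=true&group.format=grouped&start=40&rows=20

The ngroups value then appears next to the document matches for the grouped
field.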

Martijn

On 12 July 2011 09:00, Nikhil Chhaochharia  wrote:

> Hi,
>
> I am using Solr 3.3 and have run into a problem with grouping.  If
> 'group.main' is 'true' and 'start' is greater than 'rows', then I do not get
> any results.  A sample response is:
>
> 
> <response>
> <lst name="responseHeader"><int name="status">0</int><int name="QTime">602</int></lst>
> <result name="response" numFound="2239538" start="40" maxScore="6.140459"/>
> </response>
> 
>
>
> If 'group.main' is false, then I get results.
>
> Did anyone else come across this problem?  Using the grouping feature with
> pagination of results will make start > rows from the third page onwards.
>
> Thanks,
> Nikhil
>
>


Re: Solr and Php Question

2011-07-12 Thread Damien Camilleri
Hi mate,

I'm a php dev. Try Zend Server Community Edition from Zend. Install is easy and
it comes with most things you need, and may solve this for you. CentOS is always
a bit behind for stability reasons. ZS (Zend Server) also has lots of other goodies.

Personally I use a php library rather than a php extension so has no 
dependencies and no installation needed and I can read their source and extend 
it to suit my needs.

There's a few php libraries around for solr. 

Phpmyadmin can run under php 5.3, but personally I just use mysql client. 

Damien




Sent from my iPhone

On 13/07/2011, at 3:07 AM, Cupbearer  wrote:

> Total Linux noob, 1 month into first server ever...
> 
> CentOS 5.6 Final
> Loaded php52 from ius
> yum install of mysql
> tomcat
> 
> I've got Nutch up and running and working (may need to work on filters at
> some point) and I have Solr up and running and indexing everything.  But,
> then I came to the point of trying to get the results from the Crawl and
> Index to display on the webpage that I'm building with Php and it points me
> to php.net.  The minimum requirements are:
> 
> The libxml and curl extensions must also be enabled for the Apache Solr
> extension to be available.
> 
> libxml2 2.6.31 or later is required.
> 
> libcurl 7.18.0 or later is also required.
> 
> The Yum install is 2.6.15 and like 7.13.0 (didn't get past the first one
> yet).  I have zero idea how to upgrade these!  What am I going to have to
> check to make sure gets upgraded?  I've only run yum installs from different
> repositories so far and haven't had to compile anything myself, so maybe if
> I have to do that someone can point me to a tutorial.  Or, the Search and
> Results pages should be pretty simple can I just avoid a few of the newer
> commands in php because they aren't compatible and not worry about it and
> hope everything gets fixed with centos 6.0?  It also seems that 3 or 4 days
> ago you only needed libxml2 2.6.17 and now it's up to x.x.31 for a
> prerequisite, is this something that I need to learn anyways to keep current
> on since it seems to be changing rather regularly?  Should I just try to
> upgrade my php52 to php53 and see if that gets me the newer repositories? 
> When I looked at that my yum installed phpmyadmin wasn't compatible with 53
> which is why I went with 52.  I didn't bother updating mysql from 5.0 since
> there didn't seem to be any blazingly obvious reason to do so (especially
> since it's such a small instance).
> 
> Thanks,
> 
> -
> 
> Cupbearer 
> Jerry E. Craig, Jr.
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-and-Php-Question-tp3163155p3163155.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Query Rewrite

2011-07-12 Thread Chris Hostetter

: Thanks Hoss.  I'm not really sure where to begin looking with this, I
: quickly read the JIRA but don't see mention of exposing the multiple
: aliases.  Can you provide any more details?

i referred to it as "uf" or "user fields" ... note the specific comment i 
linked to in the first url, and the subsequent patch

the colon bug in edismax is what hung me up at the time.

: 
: On Tue, Jul 12, 2011 at 1:19 PM, Chris Hostetter
:  wrote:
: > : Taking a closer look at this it seems as if the
: > : DisjunctionMaxQueryParser supports doing multiple aliases and
: > : generating multiple queries, I didn't see this same capability in the
: > : ExtendedDismaxQParser, am I just missing it?  If this capability were
: >
: > it's never been exposed at a user level ... i started looking at adding it
: > to edismax but ran into a bug i couldn't uncover in the time i had to work
: > on it (which i think has since been fixed)...
: >
: > 
https://issues.apache.org/jira/browse/SOLR-1553?focusedCommentId=12839892&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12839892
: > https://issues.apache.org/jira/browse/SOLR-2409
: > https://issues.apache.org/jira/browse/SOLR-2368
: >
: >
: > -Hoss
: >
: 

-Hoss

solr/velocity: function for sorting asc/desc

2011-07-12 Thread okayndc
hello,

was wondering if there is a solr/velocity function out there that can sort,
say, a title name, by clicking on a link named "sort title", ascending or
descending alphabetically?  or is this a frontend/jquery type of thing?

thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-velocity-funtion-for-sorting-asc-desc-tp3163549p3163549.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Query Rewrite

2011-07-12 Thread Jamie Johnson
Thanks Hoss.  I'm not really sure where to begin looking with this, I
quickly read the JIRA but don't see mention of exposing the multiple
aliases.  Can you provide any more details?

On Tue, Jul 12, 2011 at 1:19 PM, Chris Hostetter
 wrote:
> : Taking a closer look at this it seems as if the
> : DisjunctionMaxQueryParser supports doing multiple aliases and
> : generating multiple queries, I didn't see this same capability in the
> : ExtendedDismaxQParser, am I just missing it?  If this capability were
>
> it's never been exposed at a user level ... i started looking at adding it
> to edismax but ran into a bug i couldn't uncover in the time i had to work
> on it (which i think has since been fixed)...
>
> https://issues.apache.org/jira/browse/SOLR-1553?focusedCommentId=12839892&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12839892
> https://issues.apache.org/jira/browse/SOLR-2409
> https://issues.apache.org/jira/browse/SOLR-2368
>
>
> -Hoss
>


[Announce] Lucene-Eurocon Call for Participation Closes Friday, JULY 15

2011-07-12 Thread Mark Miller
Hey all - just a friendly FYI reminder:

CALL FOR PARTICIPATION CLOSES FRIDAY, JULY 15!
TO SUBMIT A TOPIC, GO TO: http://2011.lucene-eurocon.org/pages/cfp

Now in its second year, Apache Lucene Eurocon 2011 comes to Barcelona, Spain, 
providing an unparalleled opportunity for  European search application 
developers and technologists to connect and  network. The conference takes 
place October 19 - 20, preceded by two days of optional training workshops 
October 17 - 18.

Get Involved Today! The Call for Participation Closes This Week!   

Consider presenting at Apache Lucene EuroCon 2011. Submit your ideas by July 
15. If  you have a great Solr or Lucene story to tell, the community wants to 
hear about it. Share your expertise and innovations! To submit a topic, go to:
http://2011.lucene-eurocon.org/pages/cfp

Sample topics of interest include:

* Lucene and Solr in the Enterprise (case studies, implementation, return on 
investment, etc.)
* “How We Did It”  Development Case Studies
* Relevance in Practice
* Spatial/Geo search
* Lucene and Solr in the Cloud
* Scalability and Performance Tuning
* Large Scale Search
* Real Time Search
* Data Integration/Data Management
* Tika, Nutch and Mahout
* Faceting and Categorization
* Lucene & Solr for Mobile Applications
* Multi-language Support
* Indexing and Analysis Techniques
* Advanced Topics in Lucene & Solr Development

Want to be added to the conference mailing list? Is your organization 
interested in sponsorship opportunities? Please send an email to  
i...@lucene-eurocon.org 

Best Regards,

Suzanne Kushner
Lucid Imagination Corporate Marketing
www.lucidimagination.com

DATE: OCTOBER 17 - 20 2011

LOCATION:
Hotel Meliá Barcelona
C/ Avenida Sarriá,
50 Barcelona - SPAIN 08029
Tel: (0034) 93 4106060

Apache Lucene EuroCon 2011 is presented by Lucid Imagination, the commercial 
entity for Apache Solr/Lucene Open Source Search; proceeds of the conference 
benefit The Apache Software Foundation.

"Lucene" and "Apache Solr" are trademarks of the Apache Software Foundation.





- Mark Miller
lucidimagination.com


Re: Many Cores with Solr

2011-07-12 Thread Shalin Shekhar Mangar
Hi Torsten,

On Tue, Jul 12, 2011 at 2:45 PM, Torsten Kunze wrote:

> Hi,
>
> as a feasibility study I am trying to run Solr with multiple thousands of
> cores in the same shard to have small indexes that can be created and
> removed very fast.
> Now, I have a Tomcat running with 1,600 cores. Memory and open file handles
> have been adjusted to be enough for that scenario.
>
> I am using SolrJ and I implemented a feeder using timer threads to realize
> auto commits to each Solr Core independently.
> Feeding is done randomly to the cores in parallel. Auto commit is enabled.
>
> My questions:
> Do I need to execute a commit to each core itself or does a commit to one
> dedicated core commit all changes of the whole shard?
>

You need to execute a commit to each core to commit the updates done on it.
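
For example, with SolrJ (a sketch, untested; the base URL and core names are
placeholders):

    // one server object per core; commit() only affects that core
    for (String core : coreNames) {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8080/solr/" + core);
        server.commit();
    }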


> Can I feed in parallel to some cores, if  a commit or optimize to another
> core is currently applied or does Solr block further content integration
> requests during that time?
>

You can feed in parallel and Solr won't block requests to a different core
during that time. That being said, your disk would become the bottleneck.


>
> Because of that many cores, it would be better to define cores for lazy
> loading during creation. Unfortunately the current implementation of
> CoreAdminHandler does not allow to set the 'loadOnStart' parameter of
> solr.xml. Is there a possibility to do this or do I need to implement my own
> handlers?
>
> Does anybody have some good or bad experiences with using many many cores?
>

I've done a bunch of stuff. There are some (very old) patches as well but
probably not useful by themselves. See
http://wiki.apache.org/solr/LotsOfCores

-- 
Regards,
Shalin Shekhar Mangar.


ContentStreamLoader Problem

2011-07-12 Thread Tod
I'm getting this error testing Solr V3.3.0 using the 
ExtractingRequestHandler.  I'm taking advantage of the REST interface 
and batching my documents in using stream.url.   It happens for every 
document I try to index.  It works fine under Solr 1.4.1.


I'm running everything under Tomcat.  I already have an existing 1.4.1 
instance running, could that be causing the problem?



Thanks - Tod




Jul 12, 2011 1:11:31 PM 
org.apache.solr.update.processor.LogUpdateProcessor finish

INFO: {} 0 1
Jul 12, 2011 1:11:31 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.AbstractMethodError: 
org/apache/solr/handler/ContentStreamLoader.load(Lorg/apache/solr/request/SolrQueryRequest;Lorg/apache/solr/response/SolrQueryResponse;Lorg/apache/solr/common/util/ContentStream;)V
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:67)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:811)



Re: Query Rewrite

2011-07-12 Thread Chris Hostetter
: Taking a closer look at this it seems as if the
: DisjunctionMaxQueryParser supports doing multiple aliases and
: generating multiple queries, I didn't see this same capability in the
: ExtendedDismaxQParser, am I just missing it?  If this capability were

it's never been exposed at a user level ... i started looking at adding it 
to edismax but ran into a bug i couldn't uncover in the time i had to work 
on it (which i think has since been fixed)...

https://issues.apache.org/jira/browse/SOLR-1553?focusedCommentId=12839892&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12839892
https://issues.apache.org/jira/browse/SOLR-2409
https://issues.apache.org/jira/browse/SOLR-2368


-Hoss


Delta import possible when no timestamp of specific field is set

2011-07-12 Thread PeterKerk
I want to show the new avatar of an ad as soon as the thumbid value of that
ad is updated.

This is what I had before delta query which works all fine:


 











in my schema.xml I have this field:



Here's the definition of ad_photos table: 
CREATE TABLE [dbo].[ad_photos]( 
[id] [int] IDENTITY(1,1) NOT NULL, 
[adid] [int] NOT NULL, 
[locpath] [nvarchar](150) NOT NULL, 
[title] [nvarchar](50) NULL, 
[createdate] [datetime] NOT NULL, 
 CONSTRAINT [PK_ad_photos] PRIMARY KEY CLUSTERED 
( 
[id] ASC 
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY =
OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY] 
) ON [PRIMARY] 
GO  



Now I'm trying to add delta query functionality to the ad_thumb entity...
My guess is that the problem is that it is saved nowhere WHEN the thumbid
was changed; the only trace of a new thumb is that the value of column
[ad].[thumbid] has changed, but without a timestamp.

I tried this:
 




But as you can see it doesn't really make sense... :$ I have no idea how to
approach this, or if it's possible without having a timestamp for when the
thumb was changed.
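
The only thing I can think of is maintaining such a timestamp myself in the
database (an untested sketch for MSSQL; it assumes ads has an updatedate
column that the delta query can read):

    -- bump ads.updatedate whenever thumbid changes, so a deltaQuery
    -- on updatedate picks the row up
    CREATE TRIGGER tr_ads_thumbid ON [dbo].[ads]
    AFTER UPDATE
    AS
    IF UPDATE(thumbid)
        UPDATE a SET updatedate = GETDATE()
        FROM [dbo].[ads] a
        INNER JOIN inserted i ON a.id = i.id;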

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Delta-import-possible-when-no-timestamp-of-specific-field-is-set-tp3163190p3163190.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: moreLikeThis with filter?

2011-07-12 Thread Elaine Li
Thanks for the link. Using the fq with MLT request handler works.

Elaine

On Tue, Jul 12, 2011 at 12:55 PM, Koji Sekiguchi  wrote:
> (11/07/13 1:40), Elaine Li wrote:
>>
>> Hi Folks,
>>
>> I need to filter out the returns if the document's field activated =
>> false.
>>
>> I tried the following in order to retrieve similar products which are
>> all activated products. But this does not work.
>> http://localhost:8983/solr/mlt?q=(id:2043144 AND
>> activated:true)&mlt.count=10
>>
>> Can you suggest something?
>
> Try MoreLikeThisHandler instead of MoreLikeThisComponent:
>
> http://wiki.apache.org/solr/MoreLikeThis#MoreLikeThisHandler
>
> koji
> --
> http://www.rondhuit.com/en/
>


Solr and Php Question

2011-07-12 Thread Cupbearer
Total Linux noob, 1 month into first server ever...

CentOS 5.6 Final
Loaded php52 from ius
yum install of mysql
tomcat

I've got Nutch up and running and working (may need to work on filters at
some point) and I have Solr up and running and indexing everything.  But,
then I came to the point of trying to get the results from the Crawl and
Index to display on the webpage that I'm building with Php and it points me
to php.net.  The minimum requirements are:

The libxml and curl extensions must also be enabled for the Apache Solr
extension to be available.

libxml2 2.6.31 or later is required.

libcurl 7.18.0 or later is also required.

The Yum install is 2.6.15 and like 7.13.0 (didn't get past the first one
yet).  I have zero idea how to upgrade these!  What am I going to have to
check to make sure gets upgraded?  I've only run yum installs from different
repositories so far and haven't had to compile anything myself, so maybe if
I have to do that someone can point me to a tutorial.

Or, the Search and Results pages should be pretty simple: can I just avoid a
few of the newer commands in php because they aren't compatible, not worry
about it, and hope everything gets fixed with CentOS 6.0?

It also seems that 3 or 4 days ago you only needed libxml2 2.6.17 and now
it's up to x.x.31 for a prerequisite; is this something that I need to learn
anyways to keep current on, since it seems to be changing rather regularly?

Should I just try to upgrade my php52 to php53 and see if that gets me the
newer repositories?  When I looked at that, my yum-installed phpmyadmin
wasn't compatible with 53, which is why I went with 52.  I didn't bother
updating mysql from 5.0 since there didn't seem to be any blazingly obvious
reason to do so (especially since it's such a small instance).

Thanks,

-

Cupbearer 
Jerry E. Craig, Jr.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-and-Php-Question-tp3163155p3163155.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: moreLikeThis with filter?

2011-07-12 Thread Koji Sekiguchi

(11/07/13 1:40), Elaine Li wrote:

Hi Folks,

I need to filter out the returns if the document's field activated = false.

I tried the following in order to retrieve similar products which are
all activated products. But this does not work.
http://localhost:8983/solr/mlt?q=(id:2043144 AND activated:true)&mlt.count=10

Can you suggest something?


Try MoreLikeThisHandler instead of MoreLikeThisComponent:

http://wiki.apache.org/solr/MoreLikeThis#MoreLikeThisHandler
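
For example (a sketch based on the wiki page, using the id from your mail):

    http://localhost:8983/solr/mlt?q=id:2043144&fq=activated:true&mlt.count=10

i.e. keep q as the document lookup and move the activated clause into fq.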

koji
--
http://www.rondhuit.com/en/


moreLikeThis with filter?

2011-07-12 Thread Elaine Li
Hi Folks,

I need to filter out the returns if the document's field activated = false.

I tried the following in order to retrieve similar products which are
all activated products. But this does not work.
http://localhost:8983/solr/mlt?q=(id:2043144 AND activated:true)&mlt.count=10

Can you suggest something?

Thanks.

Elaine


how to build lucene-solr (especially if behind a firewall)?

2011-07-12 Thread Will Milspec
hi all,

building lucene/solr behind the firewall fails for us due to proxy errors.

I tried setting the ant_opts -Dhttp.proxyHost, etc, but found the "lucene"
portion still failed on javadoc links.

I worked around this by changing failonjavadocerror to 'false' in
lucene/common-build.xml (or alternatively adding -J-Dhttp.proxyHost, etc as
"args" element to the
javadoc tasks), but then 'changes2html' failed to connect to
https://issues.apache.org.

I'm posting to the solr-user group (even though compiling is developer-ish
stuff) as we need to apply a few patches to lucene-solr.

Would someone be so kind as to post the following?
* Easiest way to build lucene-solr from source
* same, but if you're behind the firewall.

thanks,

will


Re: Delta import issue

2011-07-12 Thread PeterKerk
That did the trick! Thanks!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Delta-import-issue-tp3162581p3163009.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Delta import issue

2011-07-12 Thread Rahul Warawdekar


On Tue, Jul 12, 2011 at 11:34 AM, PeterKerk  wrote:

> Hi Rahul,
>
> Not sure how I would do this "Try adding the primary key attribute to the
> root entity 'ad'"?
>
> In my entity ad I already have these fields (I left those out earlier for
> readability):
><-- this is primary key of ads table
> 
> 
>
> Is that what you mean?
>
> And I'm using MSSQL2008
>
>
> Thanks!
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Delta-import-issue-tp3162581p3162809.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Thanks and Regards
Rahul A. Warawdekar


Re: Delta import issue

2011-07-12 Thread PeterKerk
Hi Rahul,

Not sure how I would do this "Try adding the primary key attribute to the
root entity 'ad'"?

In my entity ad I already have these fields (I left those out earlier for
readability):
   <-- this is primary key of ads table



Is that what you mean?

And I'm using MSSQL2008


Thanks!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Delta-import-issue-tp3162581p3162809.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to get doc # to use in reader.norms("content")[doc]?

2011-07-12 Thread Gabriele Kahlout
Hello,
I'm trying to get the norm of an indexed document for a given field, but
besides reader.norms(fieldName) I'm not finding any API to retrieve it. Now
reader.norms(..) returns an array with the norms for that field for all
indexed documents. How do I know the index of my document in there?

In TermQuery.explain():

    ...
    byte[] fieldNorms = reader.norms(field);
    float fieldNorm =
        fieldNorms != null ? similarity.decodeNormValue(fieldNorms[doc]) : 1.0f;
    fieldNormExpl.setValue(fieldNorm);
    ...

In here doc comes from:

    DocSlice docs = (DocSlice) values.get("response");
    for (DocIterator it = docs.iterator(); it.hasNext();) {
        final int docId = it.nextDoc();

but what about when I don't have a SolrQueryResponse?
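
The only workaround I see is resolving the internal doc id from the unique id
term first (a rough sketch, untested, Lucene 3.x; "id"/"content" and
myUniqueId are placeholders for the real field names and key):

    TermDocs td = reader.termDocs(new Term("id", myUniqueId));
    if (td.next()) {
        int docId = td.doc();  // internal id, only valid for this reader
        byte[] norms = reader.norms("content");
        float norm = similarity.decodeNormValue(norms[docId]);
    }
    td.close();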

-- 
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains "[LON]" or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
< Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with "X".
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).


Re: Delta import issue

2011-07-12 Thread Rahul Warawdekar
Hi Peter,

Try adding the primary key attribute to the root entity 'ad' and check if
delta import works.
By the way, which database are you using ?

On Tue, Jul 12, 2011 at 10:27 AM, PeterKerk  wrote:

>
> I'm having an issue with a delta import.
>
> I have the following in my data-config.xml:
>
> <entity name="ad"
>     query="select * from ads WHERE approvedate > '1/1/1900' and publishdate < getdate() AND depublishdate > getdate() and deletedate = '1/1/1900'"
>     deltaImportQuery="select * from ads WHERE approvedate > '1/1/1900' and publishdate < getdate() AND depublishdate > getdate() and deletedate = '1/1/1900' and id='${dataimporter.delta.id}'"
>     deltaQuery="select id from ads where updatedate > '${dataimporter.last_index_time}'">
>     <entity name="ad_photos"
>         deltaImportQuery="select locpath as locpath FROM ad_photos where adid='${dataimporter.delta.id}'"
>         deltaQuery="select locpath as locpath FROM ad_photos where createdate > '${dataimporter.last_index_time}'">
>     </entity>
> </entity>
>
> Now, when I add a new photo to the ad_photos table, it's not indexed when I
> perform a delta import like so:
> http://localhost:8983/solr/i2m/dataimport?command=delta-import.
> When I do a FULL import I do see the new images.
>
>
> Here's the definition of ad_photos table:
>
> CREATE TABLE [dbo].[ad_photos](
>[id] [int] IDENTITY(1,1) NOT NULL,
>[adid] [int] NOT NULL,
>[locpath] [nvarchar](150) NOT NULL,
>[title] [nvarchar](50) NULL,
>[createdate] [datetime] NOT NULL,
>  CONSTRAINT [PK_ad_photos] PRIMARY KEY CLUSTERED
> (
>[id] ASC
> )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY =
> OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
> ) ON [PRIMARY]
>
> GO
>
>
>
> What am I doing wrong?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Delta-import-issue-tp3162581p3162581.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Thanks and Regards
Rahul A. Warawdekar


Re: Call indexer after action on website

2011-07-12 Thread PeterKerk
Bit of a delayed answer, but your suggestion worked, so thank you! :)

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Call-indexer-after-action-on-website-tp3105153p3162743.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: POST for queries, length/complexity limit of fq?

2011-07-12 Thread Marian Steinbach
Thanks for the reply!

I have a follow-up question: can I customize Tomcat logging so that the
"fq" value is not logged?

Marian


On Tue, Jul 12, 2011 at 15:34, Ahmet Arslan  wrote:
> ...
>
> Yes it accepts both GET and POST.
>
> http://wiki.apache.org/solr/SolrTomcat#Enabling_Longer_Query_Requests
>
> I think some users choose to construct similar long fq's inside solr.
>
>...
>
> It takes into consideration the complete fq term.
>


Re: Restart Solr

2011-07-12 Thread Koji Sekiguchi

How to restart Solr? I am using Solr with Windchill 10.


I don't know windchill 10, but if you run Solr on Jetty or Tomcat,
CTRL+C to shutdown then start Solr again. Or if you use CoreAdmin,
RELOAD action might help:

http://wiki.apache.org/solr/CoreAdmin#RELOAD
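
For example, with the default core admin handler and a core named "core0",
the reload is just (sketch):

    http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0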

koji
--
http://www.rondhuit.com/en/


Re: (Solr-UIMA) Doubt regarding integrating UIMA in to solr - Configuration.

2011-07-12 Thread Sowmya V.B.
Hi

I've fixed the missing license issues by attaching a license.txt file to my
jar files.

Everything gets built now, with a single error.

/Users/svajjala/Downloads/apache-solr-3.3.0/solr/contrib/uima/src/main/java/org/apache/solr/uima/processor/UIMAToSolrMapper.java:62:
type org.apache.uima.cas.FeatureStructure does not take parameters

[javac] FeatureStructure fs = iterator.next();

Any ideas if UIMA class definitions changed dramatically? I used 2.2.0...and
the present one which came with Solr 3.3 seems to be 2.3.1.

Sowmya.

On Tue, Jul 12, 2011 at 2:13 PM, Sowmya V.B.  wrote:

> Hi Koji
>
> Yes, I do use SolrJ.
>
> I began recompiling the whole thing... since I thought the problem is the
> UIMA snapshot.
> Previously, I compiled files from eclipse and it worked fine.
> (Now I realize that eclipse compiled it because I added my jar files to its
> build path)
>
> I am now getting build errors when I say "ant clean dist" inside
> solr/contrib/uima, after adding all my requisite annotators and libraries
> inside it at requisite places.
> it says:
>
> There are missing LICENSE files in:
> /Users/svajjala/Downloads/apache-solr-3.3.0/solr/contrib/uima/lib Jar file
> count: 13 License Count: 6
> There may be missing NOTICE files in:
> /Users/svajjala/Downloads/apache-solr-3.3.0/solr/contrib/uima/lib.  Note,
> not all files require a NOTICE. Jar file count: 13 Notice Count: 6
>
> But the other 7 in these are the jar files that I added, for my annotators
> to work. They don't have any license or notice files. How should I go about
> this now?
>
> S
>
>
> On Tue, Jul 12, 2011 at 1:55 PM, Koji Sekiguchi wrote:
>
>> Hmm, I'm bit confused. Do you really use SolrJ ?
>>
>> If so:
>>
>>
>> > If I put it inside /update, the following is the stacktrace:
>> > request:
>> > http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
>> > org.apache.solr.common.SolrException: Bad Request
>>
>> what did you mean by "it" and why did you put it inside /update, but still
>> tried to use /update/javabin ?
>>
>>
>> koji
>> --
>> http://www.rondhuit.com/en/
>>
>>
>> (11/07/12 16:10), Sowmya V.B. wrote:
>>
>>> Yes, I do have an '/update/javabin' request handler in SolrConfig.
>>>
>>> But, should I remove that?
>>>
>>> I tried putting the UIMA update chain inside /update/javabin instead of
>>> /update request handler..
>>> >>   class="solr.**BinaryUpdateRequestHandler">
>>> 
>>> uima
>>> 
>>>   
>>>
>>> .and here is the stacktrace:
>>>
>>> request:
>>> http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
>>> org.apache.solr.common.SolrException: Internal Server Error
>>>
>>> Internal Server Error
>>>
>>> request:
>>> http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
>>> at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
>>> at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
>>> at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>>> at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
>>> at indexerapp.ir4llSolrIndexer.indexAll(ir4llSolrIndexer.java:150)
>>> at indexerapp.ir4llSolrIndexer.main(ir4llSolrIndexer.java:57)
>>>
>>>
>>> *
>>>
>>> If I put it inside /update, the following is the stacktrace:
>>> request:
>>> http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
>>> org.apache.solr.common.SolrException: Bad Request
>>>
>>> Bad Request
>>>
>>> request:
>>> http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
>>> at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
>>> at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
>>> at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>>> at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
>>> at indexerapp.ir4llSolrIndexer.indexAll(ir4llSolrIndexer.java:150)
>>> at indexerapp.ir4llSolrIndexer.main(ir4llSolrIndexer.java:57)
>>>
>>> But, I still don't understand where I can see a more detailed log of Solr
>>> Server.
>>> On my tomcat logs (I am running from Eclipse), (path:
>>> /Users/svajjala/Documents/workspace

Delta import issue

2011-07-12 Thread PeterKerk

I'm having an issue with a delta import.

I have the following in my data-config.xml:

<entity name="ad"
    query="select * from ads WHERE approvedate > '1/1/1900' and publishdate < getdate() AND depublishdate > getdate() and deletedate = '1/1/1900'"
    deltaImportQuery="select * from ads WHERE approvedate > '1/1/1900' and publishdate < getdate() AND depublishdate > getdate() and deletedate = '1/1/1900' and id='${dataimporter.delta.id}'"
    deltaQuery="select id from ads where updatedate > '${dataimporter.last_index_time}'">
    <entity name="ad_photos"
        deltaImportQuery="select locpath as locpath FROM ad_photos where adid='${dataimporter.delta.id}'"
        deltaQuery="select locpath as locpath FROM ad_photos where createdate > '${dataimporter.last_index_time}'">
    </entity>
</entity>

Now, when I add a new photo to the ad_photos table, it's not indexed when I
perform a delta import like so:
http://localhost:8983/solr/i2m/dataimport?command=delta-import.
When I do a FULL import I do see the new images.


Here's the definition of ad_photos table:

CREATE TABLE [dbo].[ad_photos](
[id] [int] IDENTITY(1,1) NOT NULL,
[adid] [int] NOT NULL,
[locpath] [nvarchar](150) NOT NULL,
[title] [nvarchar](50) NULL,
[createdate] [datetime] NOT NULL,
 CONSTRAINT [PK_ad_photos] PRIMARY KEY CLUSTERED 
(
[id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY =
OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO



What am I doing wrong?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Delta-import-issue-tp3162581p3162581.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: POST for queries, length/complexity limit of fq?

2011-07-12 Thread Ahmet Arslan
> I am going to propose a concept where pretty long and complex filter
> query (fq) parameters can occur. They are used to enforce permissions
> so that a user sees only those documents which he has permissions for.
> 
> 1. I assume that it's worthwhile to rely on POST method instead of GET
> when issuing a search. Right? As I can see, this should work.

Yes it accepts both GET and POST. 

http://wiki.apache.org/solr/SolrTomcat#Enabling_Longer_Query_Requests

I think some users choose to construct similar long fq's inside solr.

> 3. Does filter caching work granularly, or does it always take into
> consideration the complete fq term? I.e. if someone searches with
> "fq=(category:1 AND tag:foo)" and then searches for
> "fq=(category:1)", does the second request benefit from the first
> being cached?

It takes into consideration the complete fq term.
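
For example (each distinct fq string is its own filterCache entry):

    fq=(category:1 AND tag:foo)   ->  one cache entry for the whole clause
    fq=category:1&fq=tag:foo      ->  two entries, reusable independently

So if reuse matters, splitting the filters into separate fq parameters lets a
later fq=category:1 hit the cache.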


POST for queries, length/complexity limit of fq?

2011-07-12 Thread Marian Steinbach
Hi!

I am going to propose a concept where pretty long and complex filter
query (fq) parameters can occur. They are used to enforce permissions
so that a user sees only those documents which he has permissions for.

1. I assume that it's worthwhile to rely on POST method instead of GET
when issuing a search. Right? As I can see, this should work.

2. Is there a limit on the length of the fq value or on the level of
brace nestings? Are there serious performance drawbacks to take into
consideration?

3. Does filter caching work granularly, or does it always take into
consideration the complete fq term? I.e., if someone searches with
"fq=(category:1 AND tag:foo)" and then searches for
"fq=(category:1)", does the second request benefit from the first
being cached?

As soon as possible, I will try things for myself, but if there is
someone here with enough experience to answer any of these questions,
I'd be glad to know upfront.

Thank you very much!

Marian


Re: (Solr-UIMA) Doubt regarding integrating UIMA in to solr - Configuration.

2011-07-12 Thread Sowmya V.B.
Hi Koji

Yes, I do use SolrJ.

I began recompiling the whole thing, since I thought the problem was the
UIMA snapshot.
Previously, I compiled files from Eclipse and it worked fine.
(Now I realize that Eclipse compiled it because I had added my jar files to
its build path.)

I am now getting build errors when I run "ant clean dist" inside
solr/contrib/uima, after adding all my requisite annotators and libraries
inside it at the requisite places.
It says:

There are missing LICENSE files in:
/Users/svajjala/Downloads/apache-solr-3.3.0/solr/contrib/uima/lib Jar file
count: 13 License Count: 6
There may be missing NOTICE files in:
/Users/svajjala/Downloads/apache-solr-3.3.0/solr/contrib/uima/lib.  Note,
not all files require a NOTICE. Jar file count: 13 Notice Count: 6

But the other 7 of these are the jar files that I added for my annotators
to work. They don't have any license or notice files. How should I go about
this now?

S

On Tue, Jul 12, 2011 at 1:55 PM, Koji Sekiguchi  wrote:

> Hmm, I'm bit confused. Do you really use SolrJ ?
>
> If so:
>
>
> > If I put it inside /update, the following is the stacktrace:
> > request:
> > http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
> > org.apache.solr.common.SolrException: Bad Request
>
> what did you mean by "it" and why did you put it inside /update, but still
> tried to use /update/javabin ?
>
>
> koji
> --
> http://www.rondhuit.com/en/
>
>
> (11/07/12 16:10), Sowmya V.B. wrote:
>
>> Yes, I do have an '/update/javabin' request handler in SolrConfig.
>>
>> But, should I remove that?
>>
>> I tried putting the UIMA update chain inside /update/javabin instead of
>> the /update request handler:
>>
>> [requestHandler snippet stripped by the list archive: it registers
>> solr.BinaryUpdateRequestHandler with the "uima" chain as its default]
>>
>> ...and here is the stacktrace:
>>
>> request:
>> http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
>> org.apache.solr.common.SolrException: Internal Server Error
>>
>> Internal Server Error
>>
>> request:
>> http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
>> at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
>> at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
>> at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>> at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
>> at indexerapp.ir4llSolrIndexer.indexAll(ir4llSolrIndexer.java:150)
>> at indexerapp.ir4llSolrIndexer.main(ir4llSolrIndexer.java:57)
>>
>>
>>
>> If I put it inside /update, the following is the stacktrace:
>> request:
>> http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
>> org.apache.solr.common.SolrException: Bad Request
>>
>> Bad Request
>>
>> request:
>> http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
>> at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
>> at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
>> at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>> at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
>> at indexerapp.ir4llSolrIndexer.indexAll(ir4llSolrIndexer.java:150)
>> at indexerapp.ir4llSolrIndexer.main(ir4llSolrIndexer.java:57)
>>
>> But I still don't understand where I can see a more detailed log of the Solr
>> server.
>> On my Tomcat logs (I am running from Eclipse), (path:
>> /Users/svajjala/Documents/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/logs)
>> I don't see anything except a single line:
>>
>> 134.2.129.160 - - [12/Jul/2011:09:02:16 +0200] "POST
>> /apache-solr-3.3.0/update/javabin?wt=javabin&version=2 HTTP/1.1" 400 1262
>>
>> It is difficult to understand what's going on. Can anyone tell me where I
>> can see a more detailed log?
>>
>> S.
>>
>>
>> On Tue, Jul 12, 2011 at 2:39 AM, Koji Sekiguchi
>>  wrote:
>>
>>> I don't think you have a wrong setting in UIMA, but maybe the request
>>> handler named "/update/javabin" in your solrconfig.xml is not correct?
>>>
>>>
>>> koji
>>> --
>>> http://www.rondhuit.com/en/
>>>
>>> (11/07/12 0:52), Sowmya V.B. wrote:
>>>
>>>  Hi

 I just added the fiel

Re: (Solr-UIMA) Doubt regarding integrating UIMA in to solr - Configuration.

2011-07-12 Thread Koji Sekiguchi

Hmm, I'm a bit confused. Do you really use SolrJ?

If so:

> If I put it inside /update, the following is the stacktrace:
> request:
> http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
> org.apache.solr.common.SolrException: Bad Request

What did you mean by "it", and why did you put it inside /update but still
try to use /update/javabin?

koji
--
http://www.rondhuit.com/en/


(11/07/12 16:10), Sowmya V.B. wrote:

Yes, I do have an '/update/javabin' request handler in SolrConfig.

But, should I remove that?

I tried putting the UIMA update chain inside /update/javabin instead of the
/update request handler:

[requestHandler snippet stripped by the list archive: it registers
solr.BinaryUpdateRequestHandler with the "uima" chain as its default]

...and here is the stacktrace:

request:
http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
org.apache.solr.common.SolrException: Internal Server Error

Internal Server Error

request:
http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
 at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
 at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
 at
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
 at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
 at indexerapp.ir4llSolrIndexer.indexAll(ir4llSolrIndexer.java:150)
 at indexerapp.ir4llSolrIndexer.main(ir4llSolrIndexer.java:57)


*

If I put it inside /update, the following is the stacktrace:
request:
http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
org.apache.solr.common.SolrException: Bad Request

Bad Request

request:
http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
 at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
 at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
 at
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
 at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
 at indexerapp.ir4llSolrIndexer.indexAll(ir4llSolrIndexer.java:150)
 at indexerapp.ir4llSolrIndexer.main(ir4llSolrIndexer.java:57)
**

But I still don't understand where I can see a more detailed log of the Solr
server.
On my Tomcat logs (I am running from Eclipse), (path:
/Users/svajjala/Documents/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/logs)
I don't see anything except a single line:

134.2.129.160 - - [12/Jul/2011:09:02:16 +0200] "POST
/apache-solr-3.3.0/update/javabin?wt=javabin&version=2 HTTP/1.1" 400 1262

It is difficult to understand what's going on. Can anyone tell me where I can
see a more detailed log?

S.


On Tue, Jul 12, 2011 at 2:39 AM, Koji Sekiguchi  wrote:


I don't think you have a wrong setting in UIMA, but maybe the request
handler named "/update/javabin" in your solrconfig.xml is not correct?


koji
--
http://www.rondhuit.com/en/

(11/07/12 0:52), Sowmya V.B. wrote:


Hi

I just added, in the fieldMappings section, the fields which one of the
annotators adds to the index. I am not getting any compilation errors and
still see the admin interface. However, when I index, I just get a
SolrException:

org.apache.solr.common.SolrException: Bad Request.

On the server log, I don't see anything except for this:
127.0.0.1 - - [11/Jul/2011:17:44:04 +0200] "POST
/apache-solr-3.3.0/update/javabin?wt=javabin&version=2 HTTP/1.1" 400 1328

Here is my updateRequestProcessorChain in solrconfig.xml (just changed the
original path names for privacy's sake!):

[updateRequestProcessorChain snippet stripped by the list archive; the
visible values include
tokenizerModelFileLocation=/Users/svajjala/Documents/EnglishTok.bin.gz,
taggerModelFileLocation=/Users/svajjala/Documents/tag.bin.gz,
language=english, GreenlineLists=/Users/svajjala/Documents/NewGreenline,
analysisEngine=/Users/svajjala/Documents/ir4icallPipeline.xml, and the
analyzed field "text"]
 
I don't understand where exactly I can see a more detailed log of why it's
not getting indexed.

Sowmya.

On Mon, Jul 11, 2011 at 5:26 PM, Koji Sekiguchi
  wrote:

disclaimer: I'm not an expert on UIMA. I've just started using it when Solr
3.1 integrated UIMA!


  Thanks for the clarification. Now, I get it.


Should the fieldMappings section mention all the annotators, even if the
annotators do not add any new fields?



For example, if I have a pipeline starting from "parser", "tokenizer", and
"tagger", all of them operate on a field called "text", which is the content
of the document.

Questions about failover in master / slave configuration

2011-07-12 Thread Laurent Moret
Hi

Our team plans to use Solr for our searches (we currently use Lucene
directly, but some new business requirements such as "near real time" index
updates seem to be easier to handle with Solr).

Our application runs with 2 data centers in active / active mode.
Read operations in the index are made by web or web services applications,
write operations by batch programs.

We studied the documentation and wiki on the replication mechanisms, but we
have some questions about the way it will work...

What we plan :
- In data center A (where batch programs updating the index run in "normal"
behaviour) : a master solr server M1 (updates will be done to this one), and
a slave S1 replicating from M1 queried by applications from this data center
- In data center B : a pseudo master M2 configured to synchronise from M1,
and as a repeater to dispatch updates to this data center, and a slave S2
replicating from M2 queried by applications from this data center

Our questions are about failover management :
- If a slave is down in a datacenter, we can use a standard HTTP failover /
load balancing system to use the slave from the other data center for search
operations from applications (if S1 is down for example, search operations
from application in both data centers A and B will use S2), this is the
simple point
- If M2 is down, we have to update the Solr config on S2 so it replicates
directly from M1 instead of using the M2 repeater. How can we do this
without restarting S2? (See the sketch after this list.)
- And the most complicated: M1 is down... In this case, M2 can become the
"new master". But to do so, we have to update the endpoint for the batch
programs that do the updates (so they push updates to M2 instead of M1),
update the Solr config on M2 so it doesn't try to synchronize from M1, and
update the Solr config on S1 so it uses M2 as its master server... And what
will happen when M1 is back? Its index will be older than the one on M2, so
it has to sync from it...
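(Regarding the S2 case above: if we read the SolrReplication wiki right, the
fetchindex command can be given a masterUrl parameter at request time, e.g.

http://S2:8983/solr/replication?command=fetchindex&masterUrl=http://M1:8983/solr/replication

so a one-off fetch could be repointed without a restart; whether that is a
clean permanent switch is one of the things we would like to confirm.)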

We don't have any experience with Solr, and this must be a common problem.
Maybe someone can help and give us some tips ?

Thanks !

Laurent


Re: Result list order in case of ties

2011-07-12 Thread François Schiettecatte
You just need to provide a second sort field along the lines of:

sort=score desc, author desc
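For example (author here is just an illustrative field; a sort field should
be indexed, single-valued and not tokenized, e.g. a string field):

http://localhost:8983/solr/select?q=foo&sort=score+desc,author+desc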

François

On Jul 12, 2011, at 6:13 AM, Lox wrote:

> Hi,
> 
> In the case where two or more documents are returned with the same score, is
> there a way to tell Solr to sort them alphabetically?
> 
> I have already tried to use the tie-breaker, but I have just one field to
> search.
> 
> Thank you.
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Result-list-order-in-case-of-ties-tp3162001p3162001.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Result list order in case of ties

2011-07-12 Thread Michael Kuhlmann
Am 12.07.2011 12:13, schrieb Lox:
> Hi,
> 
> In the case where two or more documents are returned with the same score, is
> there a way to tell Solr to sort them alphabetically?

Yes, add the parameter

sort=score desc,your_field_that_shall_be_sorted_alphabetically asc

to your request.

Greetings,
Kuli


Re: Average PDF index time

2011-07-12 Thread Michael Kuhlmann
Am 12.07.2011 12:03, schrieb alexander sulz:
> Still, why the PHP stops working correctly is beyond me, but it seems to
> be fixed now.

You should mind the max_execution_time parameter in your php.ini.
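For example (illustrative value):

; php.ini: allow long-running indexing scripts
max_execution_time = 300

or per script:

set_time_limit(300);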

Greetings,
Kuli


Result list order in case of ties

2011-07-12 Thread Lox
Hi,

In the case where two or more documents are returned with the same score, is
there a way to tell Solr to sort them alphabetically?

I have already tried to use the tie-breaker, but I have just one field to
search.

Thank you.


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Result-list-order-in-case-of-ties-tp3162001p3162001.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Average PDF index time

2011-07-12 Thread alexander sulz

Am 12.07.2011 10:08, schrieb alexander sulz:



Hi all,

Are there some kind of average indexing times for PDFs in relation to
their size?
I have here a 10MB PDF (50 pages) which takes about 30 seconds to 
index!

Is that normal?

Depends on your hardware. PDF parsing is a lot more tedious than XML, and
besides parsing it's also analyzed and stored and maybe even committed. Is it
a problem, or do you have many thousands of files of this size?


Luckily I don't; there are just about 500 of them all in all, and about 100
of them are bigger, 10 of them even problematically big, so that my PHP
script stops working, but that's another problem.
Unfortunately I don't have a clue about the server specs or know anyone
who does.

greetings
   alex


So I figured out I had my "bleeding-edge" version of Solr running.
It was 3.3 with the latest Tika pulled from SVN (tika1.0-SNAPSHOT).
I reverted back to the stable 0.9 release and now I get a 2-second index
time for the same PDF!
Still, why the PHP stopped working correctly is beyond me, but it seems to
be fixed now.


regards
 alex



Saravanan Chinnadurai/Actionimages is out of the office.

2011-07-12 Thread Saravanan . Chinnadurai
I will be out of the office starting  12/07/2011 and will not return until
14/07/2011.

Please email to itsta...@actionimages.com  for any urgent issues.


Action Images is a division of Reuters Limited and your data will therefore be 
protected
in accordance with the Reuters Group Privacy / Data Protection notice which is 
available
in the privacy footer at www.reuters.com
Registered in England No. 145516   VAT REG: 397000555


Many Cores with Solr

2011-07-12 Thread Torsten Kunze
Hi,

As a feasibility study, I am trying to run Solr with multiple thousands of
cores in the same shard, to have small indexes that can be created and
removed very fast.
Now I have a Tomcat running with 1,600 cores. Memory and open file handles
have been adjusted to be enough for that scenario.

I am using SolrJ and I implemented a feeder using timer threads to realize auto 
commits to each Solr Core independently.
Feeding is done randomly to the cores in parallel. Auto commit is enabled.

My questions:
Do I need to execute a commit on each core itself, or does a commit to one
dedicated core commit all changes of the whole shard?
Can I feed some cores in parallel while a commit or optimize is applied to
another core, or does Solr block further content integration requests
during that time?

Because of that many cores, it would be better to mark cores for lazy
loading during creation. Unfortunately the current implementation of
CoreAdminHandler does not allow setting the 'loadOnStart' parameter of
solr.xml. Is there a possibility to do this, or do I need to implement my
own handlers? (An example of the stock call is below.)
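For reference, this is the shape of the stock call, where no loadOnStart
parameter is accepted (name and instanceDir are illustrative):

http://localhost:8983/solr/admin/cores?action=CREATE&name=core1234&instanceDir=core1234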

Does anybody have some good or bad experiences with using many, many cores?

Thanks and Regards,
Torsten

-- 
This email was Anti Virus checked by B-S-S GmbH Astaro Security Gateway


Re: How to create a solr core if no solr cores were created before?

2011-07-12 Thread Gabriele Kahlout
If you need the core just for testing, then use the Solr test framework, as
in the link; a sketch follows.
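A minimal, untested sketch (it assumes the solr test-framework and JUnit 4
jars on the classpath, and test solrconfig.xml/schema.xml resources; names
are illustrative):

import org.apache.solr.SolrTestCaseJ4;
import org.junit.BeforeClass;
import org.junit.Test;

public class MyCoreTest extends SolrTestCaseJ4 {

    @BeforeClass
    public static void createCore() throws Exception {
        // Spins up an embedded test core from the given config and schema
        initCore("solrconfig.xml", "schema.xml");
    }

    @Test
    public void testAddAndQuery() throws Exception {
        assertU(adoc("id", "1"));                         // index one doc
        assertU(commit());
        assertQ(req("id:1"), "//result[@numFound='1']");  // and find it
    }
}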

On Tue, Jul 12, 2011 at 10:29 AM, Mark Schoy  wrote:

> Thanks for your answer, but your answer is a little bit useless for
> me. Could you please add more information in addition to this link?
>
> Do I have to create a "root" core to create other cores?
> How can I create a "root" core? Manually adding in the solr.xml config?
>

It should all be answered there; see http://wiki.apache.org/solr/SolrTomcat .
For multiple cores, use a solr.xml along these lines:

<solr persistent="true" sharedLib="lib">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
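With a core defined there, point the embedded server at it by name; a rough,
untested sketch based on your own snippet (the empty core name is the likely
culprit; "core0" and the home path are assumptions):

import java.io.File;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;
import org.apache.solr.core.CoreContainer;

String pathToSolrHome = "/path/to/solr/home";   // hypothetical path
File home = new File(pathToSolrHome);

CoreContainer coreContainer = new CoreContainer();
coreContainer.load(pathToSolrHome, new File(home, "solr.xml"));

// Bind to an existing core instead of the empty string
EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "core0");

// CoreAdmin requests then have a live core to route through
CoreAdminRequest.createCore("coreName", "coreDir", server);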

>
> 2011/7/11 Gabriele Kahlout :
> > have a look here [1].
> >
> > [1]
> >
> https://issues.apache.org/jira/browse/SOLR-2645?focusedCommentId=13062748&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13062748
> >
> > On Mon, Jul 11, 2011 at 4:46 PM, Mark Schoy  wrote:
> >
> >> Hi,
> >>
> >> I tried to create a solr core but I always get "No such solr
> >> core:"-Exception.
> >>
> >> -
> >> File home = new File( pathToSolrHome );
> >> File f = new File( home, "solr.xml" );
> >>
> >> CoreContainer coreContainer = new CoreContainer();
> >> coreContainer.load( pathToSolrHome, f );
> >>
> >> EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "");
> >> CoreAdminRequest.createCore("coreName", "coreDir", server);
> >> -
> >>
> >> I think the problem is the "" in new EmbeddedSolrServer(coreContainer,
> "");
> >>
> >> Thanks.
> >>
> >
> >
> >
> > --
> > Regards,
> > K. Gabriele
> >
> > --- unchanged since 20/9/10 ---
> > P.S. If the subject contains "[LON]" or the addressee acknowledges the
> > receipt within 48 hours then I don't resend the email.
> > subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
> time(x)
> > < Now + 48h) ⇒ ¬resend(I, this).
> >
> > If an email is sent by a sender that is not a trusted contact or the
> email
> > does not contain a valid code then the email is not received. A valid
> code
> > starts with a hyphen and ends with "X".
> > ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
> > L(-[a-z]+[0-9]X)).
> >
>



-- 
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains "[LON]" or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
< Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with "X".
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).


Re: How to create a solr core if no solr cores were created before?

2011-07-12 Thread Mark Schoy
Thanks for your answer, but on its own it is a little bit useless for
me. Could you please add more information in addition to this link?

Do I have to create a "root" core to create other cores?
How can I create a "root" core? Manually adding in the solr.xml config?

2011/7/11 Gabriele Kahlout :
> have a look here [1].
>
> [1]
> https://issues.apache.org/jira/browse/SOLR-2645?focusedCommentId=13062748&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13062748
>
> On Mon, Jul 11, 2011 at 4:46 PM, Mark Schoy  wrote:
>
>> Hi,
>>
>> I tried to create a solr core but I always get "No such solr
>> core:"-Exception.
>>
>> -
>> File home = new File( pathToSolrHome );
>> File f = new File( home, "solr.xml" );
>>
>> CoreContainer coreContainer = new CoreContainer();
>> coreContainer.load( pathToSolrHome, f );
>>
>> EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "");
>> CoreAdminRequest.createCore("coreName", "coreDir", server);
>> -
>>
>> I think the problem is the "" in new EmbeddedSolrServer(coreContainer, "");
>>
>> Thanks.
>>
>
>
>
> --
> Regards,
> K. Gabriele
>
> --- unchanged since 20/9/10 ---
> P.S. If the subject contains "[LON]" or the addressee acknowledges the
> receipt within 48 hours then I don't resend the email.
> subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
> < Now + 48h) ⇒ ¬resend(I, this).
>
> If an email is sent by a sender that is not a trusted contact or the email
> does not contain a valid code then the email is not received. A valid code
> starts with a hyphen and ends with "X".
> ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
> L(-[a-z]+[0-9]X)).
>


Re: Average PDF index time

2011-07-12 Thread alexander sulz



Hi all,

Are there some kind of average indexing times for PDFs in relation to
their size?
I have here a 10MB PDF (50 pages) which takes about 30 seconds to index!
Is that normal?

Depends on your hardware. PDF parsing is a lot more tedious than XML, and
besides parsing it's also analyzed and stored and maybe even committed. Is it
a problem, or do you have many thousands of files of this size?
Luckily I don't; there are just about 500 of them all in all, and about 100
of them are bigger, 10 of them even problematically big, so that my PHP
script stops working, but that's another problem.
Unfortunately I don't have a clue about the server specs or know anyone
who does.

greetings
   alex




Re: (Solr-UIMA) Doubt regarding integrating UIMA in to solr - Configuration.

2011-07-12 Thread Sowmya V.B.
Yes, I do have an '/update/javabin' request handler in SolrConfig.

But, should I remove that?

I tried putting the UIMA update chain inside /update/javabin instead of the
/update request handler:

[requestHandler snippet stripped by the list archive: it registers
solr.BinaryUpdateRequestHandler with the "uima" chain as its default]

...and here is the stacktrace:

request:
http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
org.apache.solr.common.SolrException: Internal Server Error

Internal Server Error

request:
http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
at indexerapp.ir4llSolrIndexer.indexAll(ir4llSolrIndexer.java:150)
at indexerapp.ir4llSolrIndexer.main(ir4llSolrIndexer.java:57)



If I put it inside /update, the following is the stacktrace:
request:
http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
org.apache.solr.common.SolrException: Bad Request

Bad Request

request:
http://localhost:8080/apache-solr-3.3.0/update/javabin?wt=javabin&version=2
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
at indexerapp.ir4llSolrIndexer.indexAll(ir4llSolrIndexer.java:150)
at indexerapp.ir4llSolrIndexer.main(ir4llSolrIndexer.java:57)

But I still don't understand where I can see a more detailed log of the Solr
server.
On my Tomcat logs (I am running from Eclipse), (path:
/Users/svajjala/Documents/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/logs)
I don't see anything except a single line:

134.2.129.160 - - [12/Jul/2011:09:02:16 +0200] "POST
/apache-solr-3.3.0/update/javabin?wt=javabin&version=2 HTTP/1.1" 400 1262

It is difficult to understand what's going on. Can anyone tell me where I can
see a more detailed log?

S.


On Tue, Jul 12, 2011 at 2:39 AM, Koji Sekiguchi  wrote:

> I don't think you have a wrong setting in UIMA, but maybe the request
> handler named "/update/javabin" in your solrconfig.xml is not correct?
>
>
> koji
> --
> http://www.rondhuit.com/en/
>
> (11/07/12 0:52), Sowmya V.B. wrote:
>
>> Hi
>>
>> I just added, in the fieldMappings section, the fields which one of the
>> annotators adds to the index. I am not getting any compilation errors and
>> still see the admin interface. However, when I index, I just get a
>> SolrException:
>>
>> org.apache.solr.common.SolrException: Bad Request.
>>
>> On the server log, I don't see anything except for this:
>> 127.0.0.1 - - [11/Jul/2011:17:44:04 +0200]  "POST
>> /apache-solr-3.3.0/update/javabin?wt=javabin&version=2 HTTP/1.1" 400 1328
>>
>> Here is my updateRequestProcessorChain in solrconfig.xml (just changed the
>> original path names for privacy's sake!):
>>
>> [updateRequestProcessorChain snippet stripped by the list archive; the
>> visible values include
>> tokenizerModelFileLocation=/Users/svajjala/Documents/EnglishTok.bin.gz,
>> taggerModelFileLocation=/Users/svajjala/Documents/tag.bin.gz,
>> language=english, GreenlineLists=/Users/svajjala/Documents/NewGreenline,
>> analysisEngine=/Users/svajjala/Documents/ir4icallPipeline.xml, the
>> analyzed field "text", and a mapping to field="Generic_TotalWordCount"]
>> 
>> I don't understand where exactly I can see a more detailed log of why it's
>> not getting indexed.
>>
>> Sowmya.
>>
>> On Mon, Jul 11, 2011 at 5:26 PM, Koji Sekiguchi
>>  wrote:
>>
>>> disclaimer: I'm not an expert on UIMA. I've just started using it when
>>> Solr 3.1 integrated UIMA!
>>>
>>>
>>>  Thanks for the clarification. Now, I get it.
>>>
Should the fieldMappings section mention all the annotators, even if the
annotators do not add any new fields?


>>> For example, if I have a pipeline starting from "parser", "tokenizer", and
>>> "tagger", all of them operate on a field called "text"..whi

Possible bug in Solr 3.3 grouping

2011-07-12 Thread Nikhil Chhaochharia
Hi,

I am using Solr 3.3 and have run into a problem with grouping.  If 'group.main' 
is 'true' and 'start' is greater than 'rows', then I do not get any results.  A 
sample response is:


[sample XML response stripped by the list archive; it contained an empty
result list]
If 'group.main' is false, then I get results.

Did anyone else come across this problem?  Using the grouping feature with 
pagination of results will make start > rows from the third page onwards.
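For reference, the kind of request that fails looks roughly like this (the
group field is just an example):

http://localhost:8983/solr/select?q=*:*&group=true&group.field=author&group.main=true&start=20&rows=10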

Thanks,
Nikhil