Re: How to reserve ids?

2011-09-27 Thread Gabriele Kahlout
Otis,

I'm following up on this because solving my problem through the stopwords
mechanism would be great. *Do stopwords also apply to the url/id field?*

Continuing with the msn.com example: with msn.com as a stopword, the
msn.com webpage may actually still be indexed if neither the title nor
the body contains msn.com. Is that right?
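
To make the question concrete, here is a minimal sketch of the behaviour Otis describes in the quoted reply further down. It is a toy inverted index written for illustration, not Lucene or Solr code; the class name StopwordIndexDemo and the sample text are assumptions. A stopword token is dropped at index time, but the document is still indexed and still matches its other terms.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy inverted index (hypothetical, NOT Lucene/Solr): shows what a stopword does.
class StopwordIndexDemo {
    private final Set<String> stopwords;
    // term -> ids of the documents that contain the term
    private final Map<String, Set<String>> postings = new HashMap<>();

    StopwordIndexDemo(Set<String> stopwords) {
        this.stopwords = stopwords;
    }

    // Tokenize on whitespace and skip stopwords; every other token is indexed.
    void index(String docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty() || stopwords.contains(token)) {
                continue;
            }
            postings.computeIfAbsent(token, k -> new HashSet<>()).add(docId);
        }
    }

    // Returns the ids of the documents indexed under the given term.
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                Collections.<String>emptySet());
    }

    public static void main(String[] args) {
        StopwordIndexDemo idx = new StopwordIndexDemo(Collections.singleton("msn.com"));
        idx.index("http://msn.com/", "msn.com hotmail sign in");
        System.out.println(idx.search("msn.com")); // no hits: the token was never indexed
        System.out.println(idx.search("hotmail")); // the msn.com page still matches
    }
}
```

In this model the stopword only removes a term from one analyzed text; it does not keep the document itself out of the index.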

P.S.
I just click on 'reply to all' (or reply on the phone). If it bothers you
I'll make the less lazy effort of selecting 'reply'
On Tue, Sep 27, 2011 at 6:40 PM, Otis Gospodnetic 
otis_gospodne...@yahoo.com wrote:

 Gabriele,

 Using msn.com as a stopword would simply mean that msn.com would not be
 indexed and therefore a search for msn.com would not yield results.  You
 could still search for hotmail and it may match documents that have 
 msn.com token stored in them, even though msn.com is a stopword.

 Otis

 P.S.
 No need to CC me, I'm on the list.
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/


 
 From: Gabriele Kahlout gabri...@mysimpatico.com
 To: solr-user@lucene.apache.org; Otis Gospodnetic 
 otis_gospodne...@yahoo.com
 Sent: Tuesday, September 27, 2011 1:58 AM
 Subject: Re: How to reserve ids?
 
 I'm interested in the stopwords solution as it sounds like less work, but
 I'm not sure I understand how it works. Having msn.com as a stopword
 doesn't mean I won't get msn.com as a result for, say, 'hotmail'. My
 understanding is that msn.com will never make it to the similarity
 function and thus never affect the score calculation. But the url seldom
 does anyway (in my searches on content)!
 
 




-- 
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains [LON] or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
 < Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with X.
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).


How to reserve ids?

2011-09-26 Thread Gabriele Kahlout
Hello,

While indexing, there are certain urls/ids I never want to appear in the
search results (so they should not be indexed). Is there already a
'supported by design' mechanism for this that you can point me to, or should
I just implement this blacklist as a processor in the update chain?
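
A minimal sketch of the blacklist check I have in mind. It is plain Java with no Solr dependency; the class name IdBlacklist and its methods are hypothetical, and wiring the check into an actual update processor is left out because that part depends on the Solr version.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: decides whether a url/id may be indexed at all.
class IdBlacklist {
    private final Set<String> reserved;

    IdBlacklist(Collection<String> reservedIds) {
        this.reserved = new HashSet<>(reservedIds);
    }

    // True when the id is not reserved, i.e. the document may be indexed.
    boolean allows(String id) {
        return !reserved.contains(id);
    }

    // Keeps only the ids that may be indexed, preserving their order.
    List<String> filter(List<String> ids) {
        List<String> allowed = new ArrayList<>();
        for (String id : ids) {
            if (allows(id)) {
                allowed.add(id);
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        List<String> reservedIds = new ArrayList<>();
        reservedIds.add("http://msn.com/");
        IdBlacklist blacklist = new IdBlacklist(reservedIds);

        List<String> incoming = new ArrayList<>();
        incoming.add("http://msn.com/");
        incoming.add("http://example.org/");
        System.out.println(blacklist.filter(incoming)); // only the non-reserved id remains
    }
}
```

A processor in the update chain would call something like allows(id) for each incoming document and skip those for which it returns false.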

-- 
Regards,
K. Gabriele



Re: How to reserve ids?

2011-09-26 Thread Gabriele Kahlout
I'm interested in the stopwords solution as it sounds like less work, but I'm
not sure I understand how it works. Having msn.com as a stopword doesn't
mean I won't get msn.com as a result for, say, 'hotmail'. My understanding is
that msn.com will never make it to the similarity function and thus never
affect the score calculation. But the url seldom does anyway (in my searches
on content)!

Re: How to make the url id case insensitive?

2011-09-05 Thread Gabriele Kahlout
On Mon, Sep 5, 2011 at 1:22 PM, Markus Jelsma markus.jel...@openindex.io wrote:

 Hi,

 URI paths are case-sensitive. If you really want to treat all URLs as
 case-insensitive, I would suggest modifying the basic URL normalizer to
 lowercase all URLs so that they also end up lowercased in the CrawlDB.

 What is your problem? I would strongly suggest another solution if you're
 doing wide web crawls.


I don't want duplicate results where the only real difference is the case of
some letters in the URL.
What other solution?
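
For reference, the lowercasing Markus suggests amounts to something like the sketch below. It is plain Java; the class LowercasingUrlNormalizer is hypothetical and does not implement Nutch's normalizer plugin interface. As he points out, URI paths are case-sensitive, so this can merge pages that are genuinely different.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: lowercases whole URLs so that case variants collapse to one id.
class LowercasingUrlNormalizer {

    // Locale.ROOT avoids locale-specific case rules (e.g. the Turkish dotless i).
    static String normalize(String url) {
        return url.toLowerCase(Locale.ROOT);
    }

    // Normalizes every URL and drops duplicates, keeping first-seen order.
    static Set<String> distinct(List<String> urls) {
        Set<String> unique = new LinkedHashSet<>();
        for (String url : urls) {
            unique.add(normalize(url));
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        urls.add("http://www.atory.com/dupe_checker_pro/");
        urls.add("http://www.atory.com/dupe_checker_PRO/");
        System.out.println(distinct(urls)); // a single lowercased URL remains
    }
}
```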



 Cheers,

  Hi,
  I've just noticed that two search results of indexed data have the same
  url:
 
  http://www.atory.com/dupe_checker_pro/
  http://www.atory.com/dupe_checker_PRO/
 
  I thought the url/id was case-insensitively unique. Is there a way I can
  set it up to be so?
 
  For Solr it makes sense that this is not the default, given its disparate
  uses, but for Nutch it does not.




-- 
Regards,
K. Gabriele



How to make the url id case insensitive?

2011-09-04 Thread Gabriele Kahlout
Hi,
I've just noticed that two search results of indexed data have the same url:

http://www.atory.com/dupe_checker_pro/
http://www.atory.com/dupe_checker_PRO/

I thought the url/id was case-insensitively unique. Is there a way I can set
it up to be so?

For Solr it makes sense that this is not the default, given its disparate
uses, but for Nutch it does not.

-- 
Regards,
K. Gabriele



Re: How to get all the terms in a document as Luke does?

2011-08-30 Thread Gabriele Kahlout
The Term Vector Component (TVC) is a SearchComponent designed to return
information about documents that is stored when setting the termVector
attribute on a field:

Will I have to re-index after adding that to the schema?

On Tue, Aug 30, 2011 at 11:06 PM, Jayendra Patil 
jayendra.patil@gmail.com wrote:

 you might want to check - http://wiki.apache.org/solr/TermVectorComponent
 Should provide you with the term vectors with a lot of additional info.

 Regards,
 Jayendra

 On Tue, Aug 30, 2011 at 3:34 AM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:
  Hello,
 
  This time I'm trying to duplicate Luke's functionality of knowing which
  terms occur in a search result/document (w/o parsing it again). Any Solrj
  API to do that?
 
  P.S. I've also posted the question on SO:
  http://stackoverflow.com/q/7219111/300248
 
  On Wed, Jul 6, 2011 at 11:09 AM, Gabriele Kahlout
  gabri...@mysimpatico.com wrote:
 
  From your patch I see TermFreqVector, which provides the information I
  want.
 
  I also found FieldInvertState.getLength(), which seems to be exactly what
  I want. I'm after the word count (the sum of tf over every term in the
  doc). I'm just not sure whether FieldInvertState.getLength() returns only
  the number of distinct terms (not multiplied by the frequency of each
  term) or the word count. It seems to return the word count, but I've not
  tested it sufficiently.
 
 
  On Wed, Jul 6, 2011 at 1:39 AM, Trey Grainger
  the.apache.t...@gmail.com wrote:
 
  Gabriele,
 
  I created a patch that does this about a year ago.  See
  https://issues.apache.org/jira/browse/SOLR-1837.  It was written for Solr
  1.4 and is based upon the Document Reconstructor in Luke.  The patch adds
  a link to the main solr admin page to a docinspector page which will
  reconstruct the document given a uniqueid (required).  Keep in mind that
  you're only looking at what's in the index for non-stored fields, not the
  original text.
 
  If you have any issues using this on the most recent release, let me know
  and I'd be happy to create a new patch for solr 3.3.  One of these days
  I'll remove the JSP dependency and this may eventually make it into trunk.
 
  Thanks,
 
  -Trey Grainger
  Search Technology Development Team Lead, Careerbuilder.com
  Site Architect, Celiaccess.com
 
 
  On Tue, Jul 5, 2011 at 3:59 PM, Gabriele Kahlout
  gabri...@mysimpatico.com wrote:
 
   Hello,
  
   With an inverted index the term is the key, and the documents are the
   values. Is it still however possible that given a document id I get
 the
   terms indexed for that document?
  
   --
   Regards,
   K. Gabriele
  
  
 
 
 
 
  --
  Regards,
  K. Gabriele
 
 
 
 
 
  --
  Regards,
  K. Gabriele
 
 




-- 
Regards,
K. Gabriele


How to get all the terms in a document as Luke does?

2011-08-29 Thread Gabriele Kahlout
Hello,

This time I'm trying to duplicate Luke's functionality of knowing which
terms occur in a search result/document (w/o parsing it again). Any Solrj
API to do that?

P.S. I've also posted the question on SO:
http://stackoverflow.com/q/7219111/300248

On Wed, Jul 6, 2011 at 11:09 AM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:

 From your patch I see TermFreqVector, which provides the information I
 want.

 I also found FieldInvertState.getLength(), which seems to be exactly what I
 want. I'm after the word count (the sum of tf over every term in the doc).
 I'm just not sure whether FieldInvertState.getLength() returns only the
 number of distinct terms (not multiplied by the frequency of each term) or
 the word count. It seems to return the word count, but I've not tested it
 sufficiently.
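
To pin down the two quantities being distinguished above, here is a small sketch in plain Java. It involves no Lucene classes; TermStats and its whitespace tokenization are assumptions for illustration, and it makes no claim about what FieldInvertState.getLength() actually returns.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper contrasting "number of distinct terms" with "word count".
class TermStats {

    // Term -> frequency (tf) for a whitespace-tokenized, lowercased text.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Number of distinct terms in the document.
    static int distinctTerms(Map<String, Integer> tf) {
        return tf.size();
    }

    // Word count: the sum of tf over every term in the document.
    static int wordCount(Map<String, Integer> tf) {
        int sum = 0;
        for (int frequency : tf.values()) {
            sum += frequency;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> tf = termFrequencies("blah blah blah essen");
        System.out.println(distinctTerms(tf)); // 2 distinct terms
        System.out.println(wordCount(tf));     // 4 words in total
    }
}
```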


 On Wed, Jul 6, 2011 at 1:39 AM, Trey Grainger
 the.apache.t...@gmail.com wrote:

 Gabriele,

 I created a patch that does this about a year ago.  See
 https://issues.apache.org/jira/browse/SOLR-1837.  It was written for Solr
 1.4 and is based upon the Document Reconstructor in Luke.  The patch adds
 a link to the main solr admin page to a docinspector page which will
 reconstruct the document given a uniqueid (required).  Keep in mind that
 you're only looking at what's in the index for non-stored fields, not the
 original text.

 If you have any issues using this on the most recent release, let me know
 and I'd be happy to create a new patch for solr 3.3.  One of these days
 I'll remove the JSP dependency and this may eventually make it into trunk.

 Thanks,

 -Trey Grainger
 Search Technology Development Team Lead, Careerbuilder.com
 Site Architect, Celiaccess.com


 On Tue, Jul 5, 2011 at 3:59 PM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:

  Hello,
 
  With an inverted index the term is the key, and the documents are the
  values. Is it still however possible that given a document id I get the
  terms indexed for that document?
 
  --
  Regards,
  K. Gabriele
 
 




 --
 Regards,
 K. Gabriele





-- 
Regards,
K. Gabriele



Re: Why are not query keywords treated as a set?

2011-08-20 Thread Gabriele Kahlout
Part of the query is 'injected' by my application, which is unaware of the
user query. Had I known that 'paste past' would end up together as the query
'past past', I would not have injected anything, since it distorts the score
calculation. I could inject after it, but that is not easy.


So, trying to solve it right in the RequestHandler, I have difficulties with
queries that contain phrases ("") or the 'must be present' + operator. For
example, I'd not want to touch a user query such as
+"zusammen essen" +"alein essen", where 'essen' is the duplicate term.

My 'good enough solution' is thus to not remove the duplicate in clauses
prefixed by + or ".

C := set of clauses in which the duplicated term t occurs.
for each clause c in C:
do
  if (!c.toString().startsWith("\"") &&
      !c.toString().startsWith("+") &&
      |C| > 1) {
    C.remove(c);
  }
end

What do you think? Better solutions or algorithms to make sure the same term
occurs only once in a query, or at least it's weighted once only in the
score calculation?
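
A minimal sketch of the idea in plain Java, working on already-split clause strings rather than on Lucene query objects. The class QueryTermDeduplicator is hypothetical, and unlike the pseudocode it keeps the first unprefixed occurrence of a term rather than the last.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: drops unprefixed clauses whose terms already occur elsewhere.
class QueryTermDeduplicator {

    // Clauses starting with + or a double quote are never removed.
    static boolean isProtected(String clause) {
        return clause.startsWith("+") || clause.startsWith("\"");
    }

    // The bare terms of a clause, with + and quote characters stripped.
    static List<String> terms(String clause) {
        String bare = clause.replace("+", "").replace("\"", "").trim();
        if (bare.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(bare.split("\\s+"));
    }

    static List<String> dedupe(List<String> clauses) {
        // Terms of protected clauses count as already present.
        Set<String> seen = new HashSet<>();
        for (String clause : clauses) {
            if (isProtected(clause)) {
                seen.addAll(terms(clause));
            }
        }
        List<String> kept = new ArrayList<>();
        for (String clause : clauses) {
            if (isProtected(clause)) {
                kept.add(clause);
                continue;
            }
            List<String> clauseTerms = terms(clause);
            if (seen.containsAll(clauseTerms)) {
                continue; // every term of this clause is already in the query
            }
            seen.addAll(clauseTerms);
            kept.add(clause);
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(dedupe(Arrays.asList("past", "past")));
        System.out.println(dedupe(Arrays.asList("+\"zusammen essen\"", "+\"alein essen\"")));
    }
}
```

With this, 'past past' collapses to a single clause, while the two required phrase clauses from the example are left untouched.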


On Mon, Jun 20, 2011 at 11:15 AM, Markus Jelsma
markus.jel...@openindex.io wrote:

 That only removes tokens at the same position, as the wiki explains.

 Gabriele, why would you expect that? You input two tokens so you query for
 two tokens, why would it be a `set`?

  this might help in your analysis chain
 
  http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.RemoveDuplicatesTokenFilterFactory
 
  On 20 June 2011 04:21, Gabriele Kahlout gabri...@mysimpatico.com
 wrote:
   <str name="rawquerystring">past past</str>
   <str name="querystring">past past</str>
   <str name="parsedquery">content:past content:past</str>
  
   I was expecting the query to get parsed into content:past only and not
   content:past content:past.
  
   On Mon, Jun 20, 2011 at 12:12 AM, lee carroll
   lee.a.carr...@googlemail.com wrote:
   do you mean a phrase query? "past past"
   can you give some more detail?
  
   On 18 June 2011 13:02, Gabriele Kahlout gabri...@mysimpatico.com
 wrote:
q=past past
   
1.0 = (MATCH) sum of:
*  0.5 = (MATCH) fieldWeight(content:past in 0), product of:*
  1.0 = tf(termFreq(content:past)=1)
  1.0 = idf(docFreq=1, maxDocs=2)
  0.5 = fieldNorm(field=content, doc=0)
*  0.5 = (MATCH) fieldWeight(content:past in 0), product of:*
  1.0 = tf(termFreq(content:past)=1)
  1.0 = idf(docFreq=1, maxDocs=2)
  0.5 = fieldNorm(field=content, doc=0)
   
Is there a way I can treat the query keywords as a set?
   
--
Regards,
K. Gabriele
   
  
   --
   Regards,
   K. Gabriele
  




-- 
Regards,
K. Gabriele



Re: How to add TrieIntField to a SolrInputDocument?

2011-07-14 Thread Gabriele Kahlout
This works:

doc.remove("wc");
SolrInputField wcField = new SolrInputField("wc");
wcField.setValue(150, 1.0f);
doc.put("wc", wcField);

On Wed, Jul 13, 2011 at 4:19 PM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:

 SolrInputDocument doc = new SolrInputDocument();
 doc.setField("id", 0);
 doc.setField("url", getURL(0));
 doc.setField("content", "blah blah blah");
 doc.setField("wc", 150); // wc is of the solr.TrieIntField field type in schema.xml
 assertU(adoc(doc));
 assertU(commit());
 assertNumFound(1);

 The above test fails until I change the following in schema.xml:
  - <fieldType name="int" class="solr.TrieIntField" omitNorms="true"/>
  + <fieldType name="int" class="solr.IntField" omitNorms="true"/>


 On Sun, Jul 10, 2011 at 10:36 PM, Gabriele Kahlout 
 gabri...@mysimpatico.com wrote:


 This was my problem:
 <fieldType name="int" class="solr.TrieIntField" omitNorms="true"/>

 I had taken my cue from Nutch's schema:
 <fieldType name="long" class="solr.LongField" omitNorms="true"/>



 On Sat, Jul 9, 2011 at 4:55 PM, Yonik Seeley
 yo...@lucidimagination.com wrote:

 Something is wrong with your indexing.
 Is wc an indexed field?  If not, change it so it is, then re-index your
 data.

 If so, I'd recommend starting with the example data and filter for
 something like popularity:[6 TO 10] to convince yourself it works,
 then figuring out what you did differently in your schema/data.

 -Yonik
 http://www.lucidimagination.com

 On Sat, Jul 9, 2011 at 10:50 AM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:
  http://localhost:8080/solr/select?indent=on&version=2.2&q=*%3A*&fq=wc%3A%5B255+TO+257%5D&start=0&rows=10&fl=*%2Cscore&qt=&wt=xml&explainOther=&hl.fl=
 
  The toString of the request:
 
  {explainOther=&fl=*,score&indent=on&start=0&q=*:*&hl.fl=&qt=&wt=xml&fq=wc:[255+TO+257]&rows=1&version=2.2}
 
  Even when the FilterQuery is constructed in Java it doesn't work (I get
  results that ignore the filter query completely).
 
 
  On Sat, Jul 9, 2011 at 3:40 PM, Ahmet Arslan iori...@yahoo.com
 wrote:
 
   I don't get it to work!
  
   If I specify no fq I get the first result with
   <int name="wc">256</int>
  
   With wc:[255 TO 257] (fq=wc%3A%5B255+TO+257%5D) nothing
   comes out.
 
  If you give us the full URL you are using, it would be helpful.
 
  Correct syntax is fq=wc:[255 TO 257]
 
  You can use more than one fq in a request.
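
As a quick check that the escaped form used in the URL corresponds to that syntax, here is a sketch using only the JDK. The class RangeFilterQuery and its method names are hypothetical; only java.net.URLEncoder is a real API here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds a range filter query and URL-encodes it.
class RangeFilterQuery {

    // Produces the range syntax quoted above, e.g. wc:[255 TO 257].
    static String rangeFilter(String field, int from, int to) {
        return field + ":[" + from + " TO " + to + "]";
    }

    // Form-encodes a parameter value as UTF-8 (spaces become +).
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Assembles a select URL with one q and one fq parameter.
    static String selectUrl(String base, String q, String fq) {
        return base + "?q=" + encode(q) + "&fq=" + encode(fq);
    }

    public static void main(String[] args) {
        String fq = rangeFilter("wc", 255, 257);
        System.out.println(encode(fq)); // wc%3A%5B255+TO+257%5D
        System.out.println(selectUrl("http://localhost:8080/solr/select", "*:*", fq));
    }
}
```

The encoded value matches the fq parameter in the URL quoted earlier in the thread, so the escaping itself is not what goes wrong.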
 
 
 
 
  --
  Regards,
  K. Gabriele
 
 




 --
 Regards,
 K. Gabriele





 --
 Regards,
 K. Gabriele





-- 
Regards,
K. Gabriele



Why cannot I open a read-only IndexReader from TestHarness.getIndexDir() ?

2011-07-14 Thread Gabriele Kahlout
 IndexReader getReader() throws CorruptIndexException, IOException {
return IndexReader.open(FSDirectory.open(new
File(h.getCore().getIndexDir())), true);
}

*org.apache.lucene.index.IndexNotFoundException: no segments* file found in
org.apache.lucene.store.NIOFSDirectory@/private/var/folders/54/54wUdohaH8eR-mvbJL0l2k+++TI/-Tmp-/solrtest-SolrTestCaseJ4-1310631397578/index
lockFactory=org.apache.lucene.store.NativeFSLockFactory@62d337d3: files: []*
at
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:694)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:75)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:428)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:288)
at com.mysimpatico.me.indexplugins.SolrTest.getReader(SolrTest.java:43)

I'm calling it right after an assertU(commit()) and assertQ(req("*:*"),
getNumFoundXPath(1)), which asserts that a document has been indexed.

-- 
Regards,
K. Gabriele



Re: Why cannot I open a read-only IndexReader from TestHarness.getIndexDir() ?

2011-07-14 Thread Gabriele Kahlout
I don't know about the path; TestHarness chose it (it seems to be a temporary
directory). Does this work for you?


assertU(adoc("id", "0", "url", getURL(docUID), "content", "blah blah blah"));
assertU(commit());
assertNumFound(1); // this is a helper method of mine
// For me it fails here. But since the document was added, I suspect this is a bug.
IndexReader.open(FSDirectory.open(new File(h.getCore().getIndexDir())), true);


On Thu, Jul 14, 2011 at 10:48 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 On Thu, Jul 14, 2011 at 1:56 PM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:

   IndexReader getReader() throws CorruptIndexException, IOException {
 return IndexReader.open(FSDirectory.open(new
  File(h.getCore().getIndexDir())), true);
 }
 
  *org.apache.lucene.index.IndexNotFoundException: no segments* file found
 in
  org.apache.lucene.store.NIOFSDirectory@
 
 /private/var/folders/54/54wUdohaH8eR-mvbJL0l2k+++TI/-Tmp-/solrtest-SolrTestCaseJ4-1310631397578/index
  lockFactory=org.apache.lucene.store.NativeFSLockFactory@62d337d3: files:
  []*
 at
 
 
 org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:694)
 at
 org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:75)
 at org.apache.lucene.index.IndexReader.open(IndexReader.java:428)
 at org.apache.lucene.index.IndexReader.open(IndexReader.java:288)
 at
 com.mysimpatico.me.indexplugins.SolrTest.getReader(SolrTest.java:43)
 
  I'm calling it right after a assertU(commit()) and assertQ(req(*:*),
  getNumFoundXPath(1)) which asserts a document has been indexed.
 
 
 I'm not sure but the error indicates that the index does not exist. Perhaps
 the path is wrong?

 --
 Regards,
 Shalin Shekhar Mangar.




-- 
Regards,
K. Gabriele



Re: Can I still search documents once updated?

2011-07-13 Thread Gabriele Kahlout
It indeed is not stored, but this is still unexpected behavior. It's a
stored and indexed field, why has the index data been lost?


On Wed, Jul 13, 2011 at 12:44 AM, Erick Erickson erickerick...@gmail.com wrote:

 Unless you stored your content field, the value you put in there won't
 be fetched from the index. Verify that the doc you retrieve from the index
 has values for content, I bet it doesn't

 Best
 Erick

 On Tue, Jul 12, 2011 at 9:38 AM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:
  @Test
  public void testUpdateLoseTermsSimplified() throws Exception {
      IndexWriter writer = indexDoc();
      assertEquals(1, writer.numDocs());
      IndexSearcher searcher = getSearcher(writer);
      final TermQuery termQuery = new TermQuery(new Term("content", "essen"));

      TopDocs docs = searcher.search(termQuery, 1);
      assertEquals(1, docs.totalHits);
      Document doc = searcher.doc(0);

      writer.updateDocument(new Term("id", doc.get("id")), doc);

      searcher = getSearcher(writer);
      docs = searcher.search(termQuery, 1);
      assertEquals(1, docs.totalHits); // docs.totalHits == 0 !
  }
 
  testUpdateLosesTerms(com.mysimpatico.me.indexplugins.WcTest)  Time elapsed:
  0.346 sec  <<< FAILURE!
  java.lang.AssertionError: expected:<1> but was:<0>
 at org.junit.Assert.fail(Assert.java:91)
 at org.junit.Assert.failNotEquals(Assert.java:645)
 at org.junit.Assert.assertEquals(Assert.java:126)
 at org.junit.Assert.assertEquals(Assert.java:470)
 at org.junit.Assert.assertEquals(Assert.java:454)
 at
 
 com.mysimpatico.me.indexplugins.WcTest.testUpdateLosesTerms(WcTest.java:271)
 
  I have not changed anything (as you can see) during the update. I just
  retrieve a document and then update it. But then the termQuery that worked
  before doesn't work anymore (while the id field wasn't changed). Is this
  to be expected when the content field is not stored?
 
  --
  Regards,
  K. Gabriele
 
 




-- 
Regards,
K. Gabriele



Re: Can I still search documents once updated?

2011-07-13 Thread Gabriele Kahlout
On Wed, Jul 13, 2011 at 1:57 PM, Erick Erickson erickerick...@gmail.com wrote:

 Wait, you directly contradicted yourself <G>. You say it's
 not stored, then you say it's stored and indexed, which is it?


Yes, I meant indexed and not stored.



 When you fetch a document, only stored fields are returned
 and the returned data is the verbatim copy of the original
 data. No attempt is made to return un-stored fields. This
 has always been the behavior. If you attempted to return
 indexed but not stored data, you'd get stemmed versions,
 stop words would be removed, synonyms would be in place
 etc. Not to mention it would be very slow.


Getting the analyzed form back is what I was expecting. Otherwise updating a
field of a document that has an unstored but indexed field is impossible
without losing the unstored but indexed field: in effect, updating one field
of a document also deletes all its unstored but indexed fields.
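
To make the behavior discussed here concrete, below is a minimal sketch of a
toy in-memory index. ToyIndex and all of its methods are made up for
illustration and are not Lucene's API; the only assumption carried over from
the thread is that an update is a delete followed by an add, and that a fetch
returns stored fields only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of an index; NOT Lucene's API. All names here are hypothetical. */
public class ToyIndex {
    // term -> ids of the documents containing that term (the "indexed" side);
    // terms are not scoped per field, to keep the toy short
    private final Map<String, Set<String>> postings = new HashMap<>();
    // id -> verbatim values of the stored fields (the "stored" side)
    private final Map<String, Map<String, String>> stored = new HashMap<>();

    /** Delete-then-add: every field is indexed, only storedNames are stored. */
    public void update(String id, Map<String, String> fields, Set<String> storedNames) {
        for (Set<String> ids : postings.values()) {
            ids.remove(id); // drop all old postings for this document
        }
        Map<String, String> kept = new HashMap<>();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            for (String term : field.getValue().toLowerCase().split("\\s+")) {
                postings.computeIfAbsent(term, t -> new HashSet<>()).add(id);
            }
            if (storedNames.contains(field.getKey())) {
                kept.put(field.getKey(), field.getValue());
            }
        }
        stored.put(id, kept);
    }

    /** Fetching returns stored fields only; indexed-only values are not returned. */
    public Map<String, String> fetch(String id) {
        return new HashMap<>(stored.getOrDefault(id, Collections.<String, String>emptyMap()));
    }

    /** Number of documents matching a single term. */
    public int hits(String term) {
        return postings.getOrDefault(term, Collections.<String>emptySet()).size();
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        Map<String, String> doc = new HashMap<>();
        doc.put("id", "0");
        doc.put("content", "essen");
        Set<String> storedNames = Collections.singleton("id"); // content: indexed, not stored

        index.update("0", doc, storedNames);
        System.out.println("hits before update: " + index.hits("essen"));

        // Same round trip as in the test: fetch the document, write it back unchanged.
        index.update("0", index.fetch("0"), storedNames);
        System.out.println("hits after update:  " + index.hits("essen"));
    }
}
```

The fetched document no longer carries the content value, so writing it back
re-indexes only the id and the term query stops matching.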


 If the field is stored, then there's another problem, you might
 want to dump the document after reading it from the IR.

 Best
 Erick

 On Wed, Jul 13, 2011 at 2:25 AM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:
  It indeed is not stored, but this is still unexpected behavior. It's a
  stored and indexed field, why has the index data been lost?
 
 
  On Wed, Jul 13, 2011 at 12:44 AM, Erick Erickson 
 erickerick...@gmail.comwrote:
 
  Unless you stored your content field, the value you put in there won't
  be fetched from the index. Verify that the doc you retrieve from the
 index
  has values for content, I bet it doesn't
 
  Best
  Erick
 
  On Tue, Jul 12, 2011 at 9:38 AM, Gabriele Kahlout
  gabri...@mysimpatico.com wrote:
   @Test
   public void testUpdateLoseTermsSimplified() throws Exception {
       IndexWriter writer = indexDoc();
       assertEquals(1, writer.numDocs());
       IndexSearcher searcher = getSearcher(writer);
       final TermQuery termQuery = new TermQuery(new Term("content", "essen"));

       TopDocs docs = searcher.search(termQuery, 1);
       assertEquals(1, docs.totalHits);
       Document doc = searcher.doc(0);

       writer.updateDocument(new Term("id", doc.get("id")), doc);

       searcher = getSearcher(writer);
       docs = searcher.search(termQuery, 1);
       assertEquals(1, docs.totalHits); // docs.totalHits == 0 !
   }
  
   testUpdateLosesTerms(com.mysimpatico.me.indexplugins.WcTest)  Time elapsed: 0.346 sec  <<< FAILURE!
   java.lang.AssertionError: expected:<1> but was:<0>
       at org.junit.Assert.fail(Assert.java:91)
       at org.junit.Assert.failNotEquals(Assert.java:645)
       at org.junit.Assert.assertEquals(Assert.java:126)
       at org.junit.Assert.assertEquals(Assert.java:470)
       at org.junit.Assert.assertEquals(Assert.java:454)
       at com.mysimpatico.me.indexplugins.WcTest.testUpdateLosesTerms(WcTest.java:271)
  
   I have not changed anything (as you can see) during the update. I just
   retrieve a document and then update it. But then the termQuery that worked
   before doesn't work anymore (while the id field wasn't changed). Is this
   to be expected when the content field is not stored?
  
 




-- 
Regards,
K. Gabriele



Re: Can I still search documents once updated?

2011-07-13 Thread Gabriele Kahlout
Well, I'm not sure how usual this scenario would be. In general, those using
Solr with Nutch don't store the content field, to avoid storing the whole
web/intranet in their index twice (once in the form of stored data, and once
in the form of indexed data).

Now every time they need to update a field unrelated to content (the number
of inbound links, for example) they would have to re-crawl the page.
This is, at the least, not intuitive.


On Wed, Jul 13, 2011 at 2:40 PM, Michael Kuhlmann s...@kuli.org wrote:

 On 13.07.2011 14:05, Gabriele Kahlout wrote:
  this is what i was expecting. Otherwise updating a field of a document that
  has an unstored but indexed field is impossible (without losing the unstored
  but indexed field. I call this updating a field of a document AND
  deleting/updating all its unstored but indexed fields).

 Not necessarily. The usual use case is that you have some kind of
 existing data source from which you fill your Solr index. When you want
 to update a field of a document, you simply re-index from that
 source. There's no need to fetch data from Solr first.

 Otherwise, if you really don't have such an existing data source because
 a horde of typewriting monkeys filled your Solr index, then you had
 better declare all your fields as stored. Otherwise you'll never have a
 chance to get that data back.

 Greetings,
 Kuli




-- 
Regards,
K. Gabriele



Re: Can I still search documents once updated?

2011-07-13 Thread Gabriele Kahlout
On Wed, Jul 13, 2011 at 3:54 PM, Michael Kuhlmann s...@kuli.org wrote:

 On 13.07.2011 15:37, Gabriele Kahlout wrote:
  Well, I'm not sure how usual this scenario would be. In general, those
  using Solr with Nutch don't store the content field, to avoid storing the
  whole web/intranet in their index twice (once in the form of stored data,
  and once in the form of indexed data).
 

 Not exactly. The indexed form is quite different from the stored form:
 only the tokens are stored, each token only once, plus some additional
 data such as the document count and, maybe, shingle information.

 Hence, indexed data usually needs much less space on disk than the
 original data.


I realized that. Maybe I should have said 1.X (1 in the form of stored data
and 0.X in the form of indexed data).


 There's no practical alternative to storing the content in a stored
 field. What would you otherwise display as a search result? "The
 following web pages have your search term somewhere in their contents,
 don't know where, take a look on your own"?

Display the title and URL (and implicitly say "The following web pages
have your search term somewhere in their contents, don't REMEMBER where,
take a look on your own").

Solr is already configured by default not to store more than a
maxFieldLength anyway. Usually one stores content only to display
snippets.



 Greetings,
 Kuli




-- 
Regards,
K. Gabriele



How to add TrieIntField to a SolrInputDocument?

2011-07-13 Thread Gabriele Kahlout
SolrInputDocument doc = new SolrInputDocument();
doc.setField("id", 0);
doc.setField("url", getURL(0));
doc.setField("content", "blah blah blah");
doc.setField("wc", 150); // wc is of solr.TrieIntField field type in schema.xml
assertU(adoc(doc));
assertU(commit());
assertNumFound(1);

The above test fails until I change the following in schema.xml:
 - <fieldType name="int" class="solr.TrieIntField" omitNorms="true"/>
 + <fieldType name="int" class="solr.IntField" omitNorms="true"/>


On Sun, Jul 10, 2011 at 10:36 PM, Gabriele Kahlout gabri...@mysimpatico.com
 wrote:


 This was my problem:
 <fieldType name="int" class="solr.TrieIntField" omitNorms="true"/>

 I had taken my cue from Nutch's schema:
 <fieldType name="long" class="solr.LongField" omitNorms="true"/>



 On Sat, Jul 9, 2011 at 4:55 PM, Yonik Seeley 
 yo...@lucidimagination.comwrote:

 Something is wrong with your indexing.
 Is wc an indexed field?  If not, change it so it is, then re-index your
 data.

 If so, I'd recommend starting with the example data and filter for
 something like popularity:[6 TO 10] to convince yourself it works,
 then figuring out what you did differently in your schema/data.

 -Yonik
 http://www.lucidimagination.com

 On Sat, Jul 9, 2011 at 10:50 AM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:
  http://localhost:8080/solr/select?indent=on&version=2.2&q=*%3A*&fq=wc%3A%5B255+TO+257%5D&start=0&rows=10&fl=*%2Cscore&qt=&wt=xml&explainOther=&hl.fl=
 
  The toString of the request:
 
  {explainOther=&fl=*,score&indent=on&start=0&q=*:*&hl.fl=&qt=&wt=xml&fq=wc:[255+TO+257]&rows=1&version=2.2}
 
  Even when the FilterQuery is constructed in Java it doesn't work (i get
  results that ignore the filter query completely).
 
 
  On Sat, Jul 9, 2011 at 3:40 PM, Ahmet Arslan iori...@yahoo.com wrote:
 
   I can't get it to work!
  
   If I specify no fq I get the first result with
   <int name="wc">256</int>
  
   With wc:[255 TO 257] (fq=wc%3A%5B255+TO+257%5D) nothing
   comes out.
 
  If you give us the full URL you are using, it can be helpful.
 
  Correct syntax is fq=wc:[255 TO 257]
 
  You can use more than one fq in a request.
 
 
 
 




-- 
Regards,
K. Gabriele



Re: How to create a solr core if no solr cores were created before?

2011-07-12 Thread Gabriele Kahlout
If you need the core just for testing, then use the Solr test framework as in
the link.

On Tue, Jul 12, 2011 at 10:29 AM, Mark Schoy hei...@gmx.de wrote:

 Thanks for your answer, but your answer is a little bit useless for
 me. Could you please add more information in addition to this link?

 Do I have to create a root core to create other cores?
 How can I create a root core? Manually adding in the solr.xml config?


This should all be answered at http://wiki.apache.org/solr/SolrTomcat.
For multiple cores, use solr.xml:

<?xml version="1.0" encoding="UTF-8"?>
<solr persistent="true" sharedLib="lib">
 <cores adminPath="/admin/cores" defaultCoreName="live" shareSchema="true">
  <core name="live" instanceDir="." dataDir="live" />
  <core name="test" instanceDir="." dataDir="test" />
 </cores>
</solr>



 2011/7/11 Gabriele Kahlout gabri...@mysimpatico.com:
  have a look here [1].
 
  [1]
 
 https://issues.apache.org/jira/browse/SOLR-2645?focusedCommentId=13062748&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13062748
 
  On Mon, Jul 11, 2011 at 4:46 PM, Mark Schoy hei...@gmx.de wrote:
 
  Hi,
 
  I tried to create a solr core but I always get a "No such solr
  core:" exception.
 
  -
  File home = new File( pathToSolrHome );
  File f = new File( home, "solr.xml" );
 
  CoreContainer coreContainer = new CoreContainer();
  coreContainer.load( pathToSolrHome, f );
 
  EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "");
  CoreAdminRequest.createCore(coreName, coreDir, server);
  -
 
  I think the problem is the "" in new EmbeddedSolrServer(coreContainer, "");
 
  Thanks.
 
 
 
 
 




-- 
Regards,
K. Gabriele



How to get doc # to use in reader.norms(content)[doc]?

2011-07-12 Thread Gabriele Kahlout
Hello,
I'm trying to get the norm of an indexed document for a given field, but
besides reader.norms(fieldName) I'm not finding any API to retrieve it.
reader.norms(..) returns an array with the norms for that field for all
indexed documents. How do I know the index of my document in there?

TermQuery.explain() {
    ...
    byte[] fieldNorms = reader.norms(field);
    float fieldNorm =
        fieldNorms != null ? similarity.decodeNormValue(fieldNorms[doc]) : 1.0f;
    fieldNormExpl.setValue(fieldNorm);
    ...

Here doc is obtained as:

DocSlice docs = (DocSlice) values.get("response");
for (DocIterator it = docs.iterator(); it.hasNext();) {
    final int docId = it.nextDoc();

But what about when I don't have a SolrQueryResponse?

-- 
Regards,
K. Gabriele



How do I specify a different analyzer at search-time?

2011-07-11 Thread Gabriele Kahlout
With a Lucene QueryParser instance it's possible to set the analyzer in use.
I suspect Solr doesn't use the same analyzer at search time that it used at
indexing (the one defined in schema.xml), but I cannot verify that without
the QueryParser instance. From Jan's diagram it seems this is set in the
SearchHandler's init. Is it? How?

On Sun, Apr 10, 2011 at 11:05 AM, Jan Høydahl jan@cominvent.com wrote:

  Looks really good, but two bits that i think might confuse people are
  the implications that a Query Parser then invokes a series of search
  components; and that analysis (and the pieces of an analyzer chain)
  are what do lookups in the underlying lucene index.
 
  the first might just be the ambiguity of Query .. using the term
  request parser might make more sense, in comparison to the update
  parsing from the other side of the diagram.

 Thanks for commenting.

 Yea, the purpose is more to show a conceptual rather than actual relation
 between the different components, focusing on the flow. A 100% technical
 correct diagram would be too complex for beginners to comprehend,
 although it could certainly be useful for developers.

 I've removed the arrow between QueryParser and search components to
 clarify.
 The boxes first and foremost show that query parsing and response writers
 are within the realm of search request handler.

  the analysis piece is a little harder to fix cleanly.  you really want the
  end of the analysis chain to feed back up to the search components, and
  then show it (most of the search components really) talking to the Lucene
  index.

 Yea, I know. Showing how Faceting communicates with the main index and
 the spellchecker with its spellchecker index could also be useful, but I
 think that would be for another, more detailed diagram.

 I felt it was more important for beginners to realize visually that
 analysis happens both at index and search time, and that the analyzers
 align 1:1. At this stage in the diagram I often explain the importance
 of matching up the analysis on both sides to get a match in the index.
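
The point about matching analysis on both sides can be illustrated with a
small sketch. It assumes a trivial lowercasing, split-on-punctuation
"analyzer"; AnalysisMatch and its methods are hypothetical and stand in for
whatever chain schema.xml actually defines, they are not Solr classes.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Toy illustration only; not Solr's or Lucene's analysis API. */
public class AnalysisMatch {
    /** Hypothetical analyzer: lowercase, then split on anything that is not a letter or digit. */
    public static Set<String> analyze(String text) {
        Set<String> terms = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    /**
     * With analyzeQuery the query goes through the same analyzer as the
     * indexed text; without it the raw query string is looked up as-is.
     */
    public static boolean matches(Set<String> indexedTerms, String query, boolean analyzeQuery) {
        if (!analyzeQuery) {
            return indexedTerms.contains(query);
        }
        Set<String> queryTerms = analyze(query);
        return !queryTerms.isEmpty() && indexedTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        Set<String> indexed = analyze("Hotmail by MSN");
        System.out.println("analyzed query matches: " + matches(indexed, "Hotmail", true));
        System.out.println("raw query matches:      " + matches(indexed, "Hotmail", false));
    }
}
```

The index only ever contains the analyzed term, so a query that skips the
same analysis looks up a string that was never written to the index.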

 --
 Jan Høydahl, search solution architect
 Cominvent AS - www.cominvent.com




-- 
Regards,
K. Gabriele



Re: How to create a solr core if no solr cores were created before?

2011-07-11 Thread Gabriele Kahlout
have a look here [1].

[1]
https://issues.apache.org/jira/browse/SOLR-2645?focusedCommentId=13062748&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13062748

On Mon, Jul 11, 2011 at 4:46 PM, Mark Schoy hei...@gmx.de wrote:

 Hi,

 I tried to create a solr core but I always get a "No such solr
 core:" exception.

 -
 File home = new File( pathToSolrHome );
 File f = new File( home, "solr.xml" );

 CoreContainer coreContainer = new CoreContainer();
 coreContainer.load( pathToSolrHome, f );

 EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "");
 CoreAdminRequest.createCore(coreName, coreDir, server);
 -

 I think the problem is the "" in new EmbeddedSolrServer(coreContainer, "");

 Thanks.




-- 
Regards,
K. Gabriele



Can I write to the index from within RequestHandler.handleRequestBody(..)?

2011-07-10 Thread Gabriele Kahlout
Hello,

IndexWriter writer = new IndexWriter(FSDirectory.open(new
File(req.getCore().getDataDir(), "index")), req.getSchema().getAnalyzer(),
IndexWriter.MaxFieldLength.LIMITED);
updateSolrIndex(writer);


But this is what I get (I know that RequestHandlers are not intended to write
updates).

HTTP Status 500 - null java.nio.channels.OverlappingFileLockException
    at sun.nio.ch.FileChannelImpl$SharedFileLockTable.checkList(FileChannelImpl.java:1166)
    at sun.nio.ch.FileChannelImpl$SharedFileLockTable.add(FileChannelImpl.java:1068)
    at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:868)
    at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
    at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:216)
    at org.apache.lucene.store.Lock.obtain(Lock.java:72)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:955)
    at com.mysimpatico.me.indexplugins.MRequestHandler.handleRequestBody(MRequestHandler.java:97)
    at
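
The exception at the top of this trace comes from FileChannel.tryLock, and it
can be reproduced without Lucene. The sketch below uses only the JDK;
DoubleLockDemo is a hypothetical name, and the temporary file merely stands
in for the index's lock file. The assumption is that the running Solr core
already holds the lock, so a second IndexWriter opened from the same JVM
asks for a lock that the JVM itself already owns.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** JDK-only illustration; the temp file stands in for an index lock file. */
public class DoubleLockDemo {
    /** Returns true if a second lock attempt from the same JVM is rejected. */
    public static boolean secondLockRejected() {
        try {
            Path lockFile = Files.createTempFile("write", ".lock");
            try (FileChannel first = FileChannel.open(lockFile, StandardOpenOption.WRITE);
                 FileChannel second = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
                // Plays the role of the lock held by the already-running writer.
                FileLock held = first.tryLock();
                if (held == null) {
                    throw new IllegalStateException("lock is held by another process");
                }
                try {
                    // Plays the role of the second writer opened inside the handler.
                    FileLock again = second.tryLock();
                    if (again != null) {
                        again.release();
                    }
                    return false;
                } catch (OverlappingFileLockException e) {
                    return true; // same JVM already holds an overlapping lock
                } finally {
                    held.release();
                }
            } finally {
                Files.deleteIfExists(lockFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second lock rejected: " + secondLockRejected());
    }
}
```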
-- 
Regards,
K. Gabriele



Re: Can I write to the index from within RequestHandler.handleRequestBody(..)?

2011-07-10 Thread Gabriele Kahlout
On Sun, Jul 10, 2011 at 6:21 PM, Koji Sekiguchi k...@r.email.ne.jp wrote:

 There are such RequestHandlers. Look at CSVRequestHandler, for example.



  IndexWriter writer = new IndexWriter(FSDirectory.open(new
 File(req.getCore().getDataDir(), "index")),
 req.getSchema().getAnalyzer(),
 IndexWriter.MaxFieldLength.LIMITED);
 updateSolrIndex(writer);


 Don't use your own writer for the same index. Use
 UpdateRequestProcessor.processAdd() instead.


What you seem to be suggesting is:

UpdateRequestProcessorChain processorChain =
    req.getCore().getUpdateProcessingChain("WcUpdate");
UpdateRequestProcessor processor =
    processorChain.createProcessor(req, rsp);
try {
    RequestHandlerUtils.handleCommit(processor, params, false);
    RequestHandlerUtils.handleRollback(processor, params, false);
} finally {
    processor.finish();
}

But this is not what I want. I want an IndexWriter instance from which I can
get a reader, the analyzer in use, and the similarity class. If I shouldn't
create a new IndexWriter for the same index, can I reuse the current one
directly, without the UpdateRequestProcessor interface?

koji
 --
 http://www.rondhuit.com/en/




-- 
Regards,
K. Gabriele



Re: What's the fq= syntax for NumericRangeFilter?

2011-07-10 Thread Gabriele Kahlout
This was my problem:
<fieldType name="int" class="solr.TrieIntField" omitNorms="true"/>

I had taken my cue from Nutch's schema:
<fieldType name="long" class="solr.LongField" omitNorms="true"/>


On Sat, Jul 9, 2011 at 4:55 PM, Yonik Seeley yo...@lucidimagination.comwrote:

 Something is wrong with your indexing.
 Is wc an indexed field?  If not, change it so it is, then re-index your
 data.

 If so, I'd recommend starting with the example data and filter for
 something like popularity:[6 TO 10] to convince yourself it works,
 then figuring out what you did differently in your schema/data.

 -Yonik
 http://www.lucidimagination.com

 On Sat, Jul 9, 2011 at 10:50 AM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:
  http://localhost:8080/solr/select?indent=on&version=2.2&q=*%3A*&fq=wc%3A%5B255+TO+257%5D&start=0&rows=10&fl=*%2Cscore&qt=&wt=xml&explainOther=&hl.fl=
 
  The toString of the request:
 
  {explainOther=&fl=*,score&indent=on&start=0&q=*:*&hl.fl=&qt=&wt=xml&fq=wc:[255+TO+257]&rows=1&version=2.2}
 
  Even when the FilterQuery is constructed in Java it doesn't work (i get
  results that ignore the filter query completely).
 
 
  On Sat, Jul 9, 2011 at 3:40 PM, Ahmet Arslan iori...@yahoo.com wrote:
 
   I can't get it to work!
  
   If I specify no fq I get the first result with
   <int name="wc">256</int>
  
   With wc:[255 TO 257] (fq=wc%3A%5B255+TO+257%5D) nothing
   comes out.
 
  If you give us the full URL you are using, it can be helpful.
 
  Correct syntax is fq=wc:[255 TO 257]
 
  You can use more than one fq in a request.
 
 
 
 
 




-- 
Regards,
K. Gabriele



Can I delete the stored value?

2011-07-09 Thread Gabriele Kahlout
I've stored the contents of some pages I no longer need. How can I now
delete the stored content without re-crawling the pages (i.e. using
updateDocument)? I cannot just remove the field, since I still want the
field to be indexed; I just don't want to store anything with it.
My understanding is that field.setValue() won't do, since that would
affect the indexed value as well.

-- 
Regards,
K. Gabriele



What's the fq= syntax for NumericRangeFilter?

2011-07-09 Thread Gabriele Kahlout
I'm trying to filter a query by the value of a numeric field. I can do it in
Java as follows, but I don't know how to do it with the query syntax, and I
found no documentation of it.

@Test
public void testFqWc() throws Exception {
    IndexSearcher searcher = wc();
    Filter wc3 = NumericRangeFilter.newIntRange(wc, 3, 3, true, true);
    final MatchAllDocsQuery allQ = new MatchAllDocsQuery();
    TopDocs allDocs = searcher.search(allQ, 10);
    assertEquals(1, allDocs.totalHits);
    int wc =
        Integer.parseInt(searcher.doc(allDocs.scoreDocs[0].doc).get(this.wc));
    assertEquals(3, wc);
    TopDocs docs = searcher.search(allQ, wc3, 10);
    assertEquals(allDocs.totalHits, docs.totalHits);
}

On Sun, Jun 19, 2011 at 12:43 PM, Ahmet Arslan iori...@yahoo.com wrote:

  Beside creating an index with just the site in question, is
  it possible like
  with Google to search for results only in a given domain?

 If you have an appropriate field that is indexed, yes. fq=site:foo.com
 http://wiki.apache.org/solr/CommonQueryParameters#fq




-- 
Regards,
K. Gabriele



Re: What's the fq= syntax for NumericRangeFilter?

2011-07-09 Thread Gabriele Kahlout
I can't get it to work!

If I specify no fq I get the first result with <int name="wc">256</int>

With wc:[255 TO 257] (fq=wc%3A%5B255+TO+257%5D) nothing comes out.
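
For reference, the encoded form quoted above is what standard form encoding
produces from the raw filter, so the encoding itself is not the problem. A
minimal sketch using only the JDK (FqParam is a made-up helper name, not
part of Solr or SolrJ):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper; builds the fq=... part of a Solr select URL. */
public class FqParam {
    /** Form-encodes a raw filter query such as wc:[255 TO 257]. */
    public static String encode(String rawFilter) {
        try {
            return "fq=" + URLEncoder.encode(rawFilter, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("wc:[255 TO 257]"));
    }
}
```

The colon becomes %3A, the brackets %5B and %5D, and each space a plus sign.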

On Sat, Jul 9, 2011 at 12:29 PM, Markus Jelsma
markus.jel...@openindex.iowrote:

 Huh? It's described in the link Ahmet gave you.

  I'm trying to filter a query by the value of a numeric field. I can do it
  in Java as follows, but I don't know how to do it with the query syntax,
  and I found no documentation of it.
 
  @Test
  public void testFqWc() throws Exception {
      IndexSearcher searcher = wc();
      Filter wc3 = NumericRangeFilter.newIntRange(wc, 3, 3, true, true);
      final MatchAllDocsQuery allQ = new MatchAllDocsQuery();
      TopDocs allDocs = searcher.search(allQ, 10);
      assertEquals(1, allDocs.totalHits);
      int wc =
          Integer.parseInt(searcher.doc(allDocs.scoreDocs[0].doc).get(this.wc));
      assertEquals(3, wc);
      TopDocs docs = searcher.search(allQ, wc3, 10);
      assertEquals(allDocs.totalHits, docs.totalHits);
  }
 
  On Sun, Jun 19, 2011 at 12:43 PM, Ahmet Arslan iori...@yahoo.com
 wrote:
Beside creating an index with just the site in question, is
it possible like
with Google to search for results only in a given domain?
  
   If you have an appropriate field that is indexed, yes. fq=site:foo.com
   http://wiki.apache.org/solr/CommonQueryParameters#fq




-- 
Regards,
K. Gabriele



Re: What's the fq= syntax for NumericRangeFilter?

2011-07-09 Thread Gabriele Kahlout
http://localhost:8080/solr/select?indent=on&version=2.2&q=*%3A*&fq=wc%3A%5B255+TO+257%5D&start=0&rows=10&fl=*%2Cscore&qt=&wt=xml&explainOther=&hl.fl=

The toString of the request:
{explainOther=&fl=*,score&indent=on&start=0&q=*:*&hl.fl=&qt=&wt=xml&fq=wc:[255+TO+257]&rows=1&version=2.2}

Even when the FilterQuery is constructed in Java it doesn't work (I get
results that ignore the filter query completely).


On Sat, Jul 9, 2011 at 3:40 PM, Ahmet Arslan iori...@yahoo.com wrote:

  I don't get it to work!
 
  If I specify no fq I get the first result with
  <int name="wc">256</int>
 
  With wc:[255 TO 257] (fq=wc%3A%5B255+TO+257%5D) nothing
  comes out.

 If you give us the Full URL you are using, it can be helpful.

 Correct syntax is fq=wc:[255 TO 257]

 You can use more than one fq in a request.
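
The syntax above can be sketched as a small URL builder. This is only an illustration using the JDK's URLEncoder; the class name FqUrl and its methods are hypothetical helpers, not part of Solr or SolrJ, and the base URL is the one quoted in this thread. Each fq value is encoded exactly once, and several fq parameters can be appended to the same request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper (not a Solr class): builds a /select URL with encoded fq parameters.
class FqUrl {

    // URL-encode one parameter value as UTF-8 (space becomes '+', ':' becomes %3A, ...).
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Inclusive range filter in query syntax, e.g. wc:[255 TO 257].
    static String rangeFilter(String field, int lower, int upper) {
        return field + ":[" + lower + " TO " + upper + "]";
    }

    // Append q, then one fq parameter per filter, then rows.
    static String selectUrl(String base, String q, int rows, String... filters) {
        StringBuilder url = new StringBuilder(base).append("/select?q=").append(encode(q));
        for (String fq : filters) {
            url.append("&fq=").append(encode(fq));
        }
        return url.append("&rows=").append(rows).toString();
    }

    public static void main(String[] args) {
        System.out.println(selectUrl("http://localhost:8080/solr", "*:*", 10,
                rangeFilter("wc", 255, 257), "site:foo.com"));
    }
}
```

Running main prints a URL whose first filter is fq=wc%3A%5B255+TO+257%5D, the same encoded form used in the request quoted above.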






Re: How do I add a custom field?

2011-07-07 Thread Gabriele Kahlout
So, how about this:
    Document doc = searcher.doc(i); // get the doc
    doc.removeField(wc); // remove the field in case it is already there
    addWc(doc, docLength); // add the new field
    writer.updateDocument(new Term(id, Integer.toString(i++)), doc); // update the doc

For some reason it doesn't get added to the index. Should it?

On 7/3/11, Michael Sokolov soko...@ifactory.com wrote:
 You'll need to index the field.  I would think you would want to
 index/store the field along with the associated document, in which case
 you'll have to reindex the documents as well - there's no single-field
 update capability in Lucene (yet?).

 -Mike

 On 7/3/2011 1:09 PM, Gabriele Kahlout wrote:
 Is there a way I can compute and add the field to all indexed documents
 without re-indexing? MyField counts the number of terms per document
 (unique word count).

 On Sun, Jul 3, 2011 at 12:24 PM, lee carroll
 lee.a.carr...@googlemail.com wrote:

 Hi Gabriele,
 Did you index any docs with your new field ?

 The results will just bring back docs and what fields they have. They
 won't bring back null fields just because they are in your schema. Lucene
 is schema-less. Solr adds the schema to make it nice to administer and very
 powerful to use.





 On 3 July 2011 11:01, Gabriele Kahlout gabri...@mysimpatico.com wrote:
 Hello,

 I want to have an additional field that appears for every document in
 search results. I understand that I should do this by adding the field to
 the schema.xml, so I add:
 <field name="myField" default="0" type="integer" stored="true"
 indexed="false"/>
 Then I restart Solr (so that it loads the new schema.xml) and make a query
 specifying that it should return myField too, but it doesn't. Will it do so
 only for newly indexed documents? Am I missing something?










Re: Can I invert the inverted index?

2011-07-06 Thread Gabriele Kahlout
From your patch I see TermFreqVector, which provides the information I want.

I also found FieldInvertState.getLength(), which seems to be exactly what I
want. I'm after the word count (the sum of tf over every term in the doc).
I'm just not sure whether FieldInvertState.getLength() returns that word
count or only the number of unique terms (that is, without multiplying each
term by its frequency). It seems to return the word count, but I've not
tested it sufficiently.
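
To make the two quantities concrete, here is a minimal sketch in plain Java that does not use Lucene at all. The class TermStats and its whitespace tokenization are assumptions made for illustration; it says nothing about what FieldInvertState.getLength() actually returns, it only shows how the number of unique terms differs from the word count (the sum of term frequencies).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration (no Lucene involved): unique terms vs. word count.
class TermStats {

    // Count how often each lower-cased, whitespace-separated token occurs.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new LinkedHashMap<String, Integer>();
        for (String token : text.toLowerCase().trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            Integer old = tf.get(token);
            tf.put(token, old == null ? 1 : old + 1);
        }
        return tf;
    }

    // Number of distinct terms in the document.
    static int uniqueTerms(Map<String, Integer> tf) {
        return tf.size();
    }

    // Word count: the sum of tf over every term.
    static int wordCount(Map<String, Integer> tf) {
        int sum = 0;
        for (int freq : tf.values()) {
            sum += freq;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> tf = termFrequencies("to be or not to be");
        System.out.println(uniqueTerms(tf)); // 4 (to, be, or, not)
        System.out.println(wordCount(tf));   // 6
    }
}
```

A test document like this one, where some terms repeat, is enough to tell the two interpretations apart.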

On Wed, Jul 6, 2011 at 1:39 AM, Trey Grainger the.apache.t...@gmail.com wrote:

 Gabriele,

 I created a patch that does this about a year ago.  See
 https://issues.apache.org/jira/browse/SOLR-1837.  It was written for Solr
 1.4 and is based upon the Document Reconstructor in Luke.  The patch adds a
 link to the main solr admin page to a docinspector page which will
 reconstruct the document given a uniqueid (required).  Keep in mind that
 you're only looking at what's in the index for non-stored fields, not the
 original text.

 If you have any issues using this on the most recent release, let me know
 and I'd be happy to create a new patch for Solr 3.3.  One of these days I'll
 remove the JSP dependency and this may eventually make it into trunk.

 Thanks,

 -Trey Grainger
 Search Technology Development Team Lead, Careerbuilder.com
 Site Architect, Celiaccess.com


 On Tue, Jul 5, 2011 at 3:59 PM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:

  Hello,
 
  With an inverted index the term is the key, and the documents are the
  values. Is it still however possible that given a document id I get the
  terms indexed for that document?
 
 






Re: Does Nutch make any use of solr.WhitespaceTokenizerFactory defined in schema.xml?

2011-07-05 Thread Gabriele Kahlout
nice...where?

I'm trying to figure out 2 things:
1) How to create an analyzer that corresponds to the one in the schema.xml.

 <analyzer>
   <tokenizer class="solr.StandardTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter class="solr.WordDelimiterFilterFactory"
           generateWordParts="1" generateNumberParts="1"/>
   <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
 </analyzer>

2) I'd like to see the code that creates it reading it from schema.xml .

On Tue, Jul 5, 2011 at 12:33 PM, Markus Jelsma
markus.jel...@openindex.io wrote:

 No. SolrJ only builds input docs from NutchDocument objects. Solr will do
 analysis. The integration is analogous to XML post of Solr documents.

 On Tuesday 05 July 2011 12:28:21 Gabriele Kahlout wrote:
  Hello,
 
  I'm trying to better understand the Nutch and Solr integration. My
  understanding is that Documents are added to the Solr index from
  SolrWriter's write(NutchDocument doc) method. But does it make any use of
  the WhitespaceTokenizerFactory?

 --
 Markus Jelsma - CTO - Openindex
 http://www.linkedin.com/in/markus17
 050-8536620 / 06-50258350






Re: Does Nutch make any use of solr.WhitespaceTokenizerFactory defined in schema.xml?

2011-07-05 Thread Gabriele Kahlout
I suspect the following should do (1). I'm just not sure about file
references as in stopInit.put("words", "stopwords.txt"). (2) should
clarify.

1)
class SchemaAnalyzer extends Analyzer {

    @Override
    public TokenStream tokenStream(String fieldName, Reader reader) {
        // Stop filter configuration
        HashMap<String, String> stopInit = new HashMap<String, String>();
        stopInit.put("words", "stopwords.txt");
        stopInit.put("ignoreCase", Boolean.TRUE.toString());
        StopFilterFactory stopFilterFactory = new StopFilterFactory();
        stopFilterFactory.init(stopInit);

        // Word delimiter filter configuration
        final HashMap<String, String> wordDelimInit = new HashMap<String, String>();
        wordDelimInit.put("generateWordParts", "1");
        wordDelimInit.put("generateNumberParts", "1");
        wordDelimInit.put("catenateWords", "1");
        wordDelimInit.put("catenateNumbers", "1");
        wordDelimInit.put("catenateAll", "0");
        wordDelimInit.put("splitOnCaseChange", "1");

        WordDelimiterFilterFactory wordDelimiterFilterFactory = new WordDelimiterFilterFactory();
        wordDelimiterFilterFactory.init(wordDelimInit);

        // Stemmer configuration
        HashMap<String, String> porterInit = new HashMap<String, String>();
        porterInit.put("protected", "protwords.txt");
        EnglishPorterFilterFactory englishPorterFilterFactory = new EnglishPorterFilterFactory();
        englishPorterFilterFactory.init(porterInit);

        // Chain: whitespace tokenizer -> stop -> word delimiter -> lower case -> stemmer -> dedupe
        return new RemoveDuplicatesTokenFilter(
            englishPorterFilterFactory.create(
                new LowerCaseFilter(
                    wordDelimiterFilterFactory.create(
                        stopFilterFactory.create(
                            new WhitespaceTokenizer(reader))))));
    }
}






Re: Does Nutch make any use of solr.WhitespaceTokenizerFactory defined in schema.xml?

2011-07-05 Thread Gabriele Kahlout
Not yet an answer to 2), but this is where and how Solr initializes the
Analyzer defined in the schema.xml:

//org.apache.solr.schema.IndexSchema
    // Load the Tokenizer
    // Although an analyzer only allows a single Tokenizer, we load a list to make sure
    // the configuration is ok
    //
    final ArrayList<TokenizerFactory> tokenizers = new ArrayList<TokenizerFactory>(1);
    AbstractPluginLoader<TokenizerFactory> tokenizerLoader =
      new AbstractPluginLoader<TokenizerFactory>( "[schema.xml] analyzer/tokenizer", false, false )
    {
      @Override
      protected void init(TokenizerFactory plugin, Node node) throws Exception {
        if( !tokenizers.isEmpty() ) {
          throw new SolrException( SolrException.ErrorCode.SERVER_ERROR,
              "The schema defines multiple tokenizers for: "+node );
        }
        final Map<String,String> params = DOMUtil.toMapExcept(node.getAttributes(),"class");
        // copy the luceneMatchVersion from config, if not set
        if (!params.containsKey(LUCENE_MATCH_VERSION_PARAM))
          params.put(LUCENE_MATCH_VERSION_PARAM, solrConfig.luceneMatchVersion.toString());
        plugin.init( params );
        tokenizers.add( plugin );
      }

      @Override
      protected TokenizerFactory register(String name, TokenizerFactory plugin) throws Exception {
        return null; // used for map registration
      }
    };
    tokenizerLoader.load( loader, (NodeList)xpath.evaluate("./tokenizer", node, XPathConstants.NODESET) );

    // Make sure something was loaded
    if( tokenizers.isEmpty() ) {
      throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
          "analyzer without class or tokenizer & filter list");
    }


    // Load the Filters
    //
    final ArrayList<TokenFilterFactory> filters = new ArrayList<TokenFilterFactory>();
    AbstractPluginLoader<TokenFilterFactory> filterLoader =
      new AbstractPluginLoader<TokenFilterFactory>( "[schema.xml] analyzer/filter", false, false )
    {
      @Override
      protected void init(TokenFilterFactory plugin, Node node) throws Exception {
        if( plugin != null ) {
          final Map<String,String> params = DOMUtil.toMapExcept(node.getAttributes(),"class");
          // copy the luceneMatchVersion from config, if not set
          if (!params.containsKey(LUCENE_MATCH_VERSION_PARAM))
            params.put(LUCENE_MATCH_VERSION_PARAM, solrConfig.luceneMatchVersion.toString());
          plugin.init( params );
          filters.add( plugin );
        }
      }

      @Override
      protected TokenFilterFactory register(String name, TokenFilterFactory plugin) throws Exception {
        return null; // used for map registration
      }
    };
    filterLoader.load( loader, (NodeList)xpath.evaluate("./filter", node, XPathConstants.NODESET) );

    return new TokenizerChain(charFilters.toArray(new CharFilterFactory[charFilters.size()]),
        tokenizers.get(0), filters.toArray(new TokenFilterFactory[filters.size()]));
  };



Re: Does Nutch make any use of solr.WhitespaceTokenizerFactory defined in schema.xml?

2011-07-05 Thread Gabriele Kahlout
The answer to 2) is new IndexSchema(solrConf, schema).getAnalyzer();



Cannot I search documents added by IndexWriter after commit?

2011-07-05 Thread Gabriele Kahlout
@Test
public void testUpdate() throws IOException,
        ParserConfigurationException, SAXException, ParseException {
    Analyzer analyzer = getAnalyzer();
    QueryParser parser = new QueryParser(Version.LUCENE_32, "content", analyzer);
    Query allQ = parser.parse("*:*");

    IndexWriter writer = getWriter();
    IndexSearcher searcher = new IndexSearcher(IndexReader.open(writer, true));
    TopDocs docs = searcher.search(allQ, 10);
    assertEquals(0, docs.totalHits); // empty/no index

    Document doc = getDoc();
    writer.addDocument(doc);
    writer.commit();

    docs = searcher.search(allQ, 10);
    assertEquals(1, docs.totalHits); // it fails here. docs.totalHits equals 0
}
What am I doing wrong here?

If I initialize searcher with new IndexSearcher(directory) I'm told:
org.apache.lucene.index.IndexNotFoundException: no segments* file found in
org.apache.lucene.store.RAMDirectory@3caa4b
lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@ed0220c:
files: []



Re: Cannot I search documents added by IndexWriter after commit?

2011-07-05 Thread Gabriele Kahlout
And how do you do that? There is no reopen method.

On Tue, Jul 5, 2011 at 8:09 PM, Michael McCandless 
luc...@mikemccandless.com wrote:

 After your writer.commit you need to reopen your searcher to see the
 changes.

 Mike McCandless

 http://blog.mikemccandless.com







Re: Cannot I search documents added by IndexWriter after commit?

2011-07-05 Thread Gabriele Kahlout
Still won't work (same as before).

@Test
public void testUpdate() throws IOException,
        ParserConfigurationException, SAXException, ParseException {
    Analyzer analyzer = getAnalyzer();
    QueryParser parser = new QueryParser(Version.LUCENE_32, "content", analyzer);
    Query allQ = parser.parse("*:*");

    IndexWriter writer = getWriter();
    final IndexReader indexReader = IndexReader.open(writer, true);

    IndexSearcher searcher = new IndexSearcher(indexReader);
    TopDocs docs = searcher.search(allQ, 10);
    assertEquals(0, docs.totalHits); // empty/no index

    Document doc = getDoc();
    writer.addDocument(doc);
    writer.commit();

    indexReader.reopen();
    searcher = new IndexSearcher(indexReader);
    docs = searcher.search(allQ, 10);
    assertEquals(1, docs.totalHits);
}

private Document getDoc() {
    Document doc = new Document();
    doc.add(new Field("id", "0", Field.Store.YES, Field.Index.NOT_ANALYZED));
    return doc;
}

private IndexWriter getWriter() throws IOException {
    return new IndexWriter(directory, new WhitespaceAnalyzer(),
            IndexWriter.MaxFieldLength.UNLIMITED);
}

On Tue, Jul 5, 2011 at 8:15 PM, Michael McCandless 
luc...@mikemccandless.com wrote:

 Sorry, you must reopen the underlying IndexReader, and then make a new
 IndexSearcher from the reopened reader.

 Mike McCandless

 http://blog.mikemccandless.com







Re: Cannot I search documents added by IndexWriter after commit?

2011-07-05 Thread Gabriele Kahlout
Re-open doesn't work, but open does.

@Test
public void testUpdate() throws IOException,
        ParserConfigurationException, SAXException, ParseException {
    Analyzer analyzer = getAnalyzer();
    QueryParser parser = new QueryParser(Version.LUCENE_32, "content", analyzer);
    Query allQ = parser.parse("*:*");

    IndexWriter writer = getWriter();
    final IndexReader indexReader = IndexReader.open(writer, true);

    IndexSearcher searcher = new IndexSearcher(indexReader);
    TopDocs docs = searcher.search(allQ, 10);
    assertEquals(0, docs.totalHits); // empty/no index

    Document doc = getDoc();
    writer.addDocument(doc);
    writer.commit();

    searcher = new IndexSearcher(IndexReader.open(writer, true)); // new IndexSearcher(directory);
    docs = searcher.search(allQ, 10);
    assertEquals(1, docs.totalHits);
}
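
A likely explanation for why the reopen attempt in this thread kept returning 0 hits: in Lucene 3.x, IndexReader.reopen() does not refresh the reader it is called on. It returns a new reader when the index has changed and leaves the original as a point-in-time view, so the return value has to be used. The sketch below models only that behaviour with hypothetical stand-in classes (ToyIndex and ToyReader are not Lucene classes and nothing here calls Lucene).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of point-in-time readers; ToyIndex and ToyReader are hypothetical, not Lucene API.
class ReopenDemo {

    static class ToyIndex {
        private final List<String> committed = new ArrayList<String>();

        void addAndCommit(String doc) {
            committed.add(doc);
        }

        int committedSize() {
            return committed.size();
        }

        // Each reader gets its own copy of what is committed right now.
        ToyReader open() {
            return new ToyReader(this, new ArrayList<String>(committed));
        }
    }

    static class ToyReader {
        private final ToyIndex index;
        private final List<String> snapshot;

        ToyReader(ToyIndex index, List<String> snapshot) {
            this.index = index;
            this.snapshot = snapshot;
        }

        int numDocs() {
            return snapshot.size();
        }

        // Mirrors the reopen contract: same instance if nothing changed, otherwise a NEW reader.
        ToyReader reopen() {
            return index.committedSize() == snapshot.size() ? this : index.open();
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        ToyReader reader = index.open();
        index.addAndCommit("doc-0");

        reader.reopen();                      // return value dropped: reader is still the old view
        System.out.println(reader.numDocs()); // 0

        reader = reader.reopen();             // keep the returned reader
        System.out.println(reader.numDocs()); // 1
    }
}
```

Applied to the failing test, this would mean assigning the result of reopen() to a reader variable and building the new IndexSearcher from that, which is effectively what opening a fresh reader from the writer achieves.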

On Tue, Jul 5, 2011 at 8:23 PM, Gabriele Kahlout
gabri...@mysimpatico.comwrote:

 Still won't work (same as before).

 @Test
 public void testUpdate() throws IOException,
         ParserConfigurationException, SAXException, ParseException {
     Analyzer analyzer = getAnalyzer();
     QueryParser parser = new QueryParser(Version.LUCENE_32, "content", analyzer);
     Query allQ = parser.parse("*:*");

     IndexWriter writer = getWriter();
     final IndexReader indexReader = IndexReader.open(writer, true);

     IndexSearcher searcher = new IndexSearcher(indexReader);

     TopDocs docs = searcher.search(allQ, 10);
     assertEquals(0, docs.totalHits); // empty/no index

     Document doc = getDoc();
     writer.addDocument(doc);
     writer.commit();

     // changed lines: reopen the reader, then build a new searcher from it
     indexReader.reopen();
     searcher = new IndexSearcher(indexReader);

     docs = searcher.search(allQ, 10);
     assertEquals(1, docs.totalHits);
 }

 private Document getDoc() {
     Document doc = new Document();
     doc.add(new Field("id", "0", Field.Store.YES, Field.Index.NOT_ANALYZED));
     return doc;
 }

 private IndexWriter getWriter() throws IOException {
     return new IndexWriter(directory, new WhitespaceAnalyzer(),
             IndexWriter.MaxFieldLength.UNLIMITED);
 }

 On Tue, Jul 5, 2011 at 8:15 PM, Michael McCandless 
 luc...@mikemccandless.com wrote:

 Sorry, you must reopen the underlying IndexReader, and then make a new
 IndexSearcher from the reopened reader.

 Mike McCandless

 http://blog.mikemccandless.com

 On Tue, Jul 5, 2011 at 2:12 PM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:
  and how do you do that? There is no reopen method
 
  On Tue, Jul 5, 2011 at 8:09 PM, Michael McCandless 
  luc...@mikemccandless.com wrote:
 
  After your writer.commit you need to reopen your searcher to see the
  changes.
 
  Mike McCandless
 
  http://blog.mikemccandless.com
 
  On Tue, Jul 5, 2011 at 1:48 PM, Gabriele Kahlout
  gabri...@mysimpatico.com wrote:
   @Test
   public void testUpdate() throws IOException,
           ParserConfigurationException, SAXException, ParseException {
       Analyzer analyzer = getAnalyzer();
       QueryParser parser = new QueryParser(Version.LUCENE_32, "content", analyzer);
       Query allQ = parser.parse("*:*");

       IndexWriter writer = getWriter();
       IndexSearcher searcher = new IndexSearcher(IndexReader.open(writer, true));
       TopDocs docs = searcher.search(allQ, 10);
       assertEquals(0, docs.totalHits); // empty/no index

       Document doc = getDoc();
       writer.addDocument(doc);
       writer.commit();

       docs = searcher.search(allQ, 10);
       assertEquals(1, docs.totalHits); // it fails here: docs.totalHits equals 0
   }

   What am I doing wrong here?

   If I initialize searcher with new IndexSearcher(directory) I'm told:
   org.apache.lucene.index.IndexNotFoundException: no segments* file found in
   org.apache.lucene.store.RAMDirectory@3caa4b
   lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@ed0220c:
   files: []
  

Can I invert the inverted index?

2011-07-05 Thread Gabriele Kahlout
Hello,

With an inverted index the term is the key, and the documents are the
values. Is it still however possible that given a document id I get the
terms indexed for that document?

-- 
Regards,
K. Gabriele



Re: Can I invert the inverted index?

2011-07-05 Thread Gabriele Kahlout
I had looked at term vectors but don't see how they would solve my problem.
Consider the following index entries:

t0, doc0, doc1
t1, doc0

From the 2nd entry we know that t1 is only present in doc0.
Now, my problem: given doc0, how can I find out which terms occur in it (t0 and
t1) without storing the content?
One way is to go over all terms in the index using the term dictionary.
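
The term-dictionary scan can be sketched in plain Java on a toy index. This is only an illustration: ToyInvertedIndex and its methods are made-up names, not Lucene or Solr API, and a real index holds far more terms than is comfortable to scan once per document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index (hypothetical, not Lucene API): term -> sorted set of document ids. */
final class ToyInvertedIndex {
    private final Map<String, Set<String>> postings = new LinkedHashMap<>();

    void add(String term, String... docIds) {
        Set<String> docs = postings.computeIfAbsent(term, t -> new TreeSet<>());
        for (String id : docIds) {
            docs.add(id);
        }
    }

    /** Walks the whole term dictionary and keeps the terms whose postings contain docId. */
    List<String> termsOf(String docId) {
        List<String> terms = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : postings.entrySet()) {
            if (entry.getValue().contains(docId)) {
                terms.add(entry.getKey());
            }
        }
        return terms;
    }
}
```

With the two entries from the example, termsOf("doc0") yields t0 and t1, at the cost of visiting every term in the dictionary.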


On Tue, Jul 5, 2011 at 10:14 PM, lboutros boutr...@gmail.com wrote:

 Hi Gabriele,

 I'm not sure I understand your problem, but the TermVectorComponent may fit
 your needs?

 http://wiki.apache.org/solr/TermVectorComponent
 http://wiki.apache.org/solr/TermVectorComponentExampleEnabled

 Ludovic.

 -
 Jouve
 France.




-- 
Regards,
K. Gabriele



Re: How do I compute and store a field?

2011-07-04 Thread Gabriele Kahlout
Gee, I was about to post. I figured my issue is that of computing the unique
terms per document. One approach to compute that value is to run the
analyzer on the document before calling addDocument and count the
number of tokens.
Then I can invoke addDocument with the computed value of the field.

The only issue is that I'm assuming that if I use the same Analyzer that
addDocument uses, my count will always equal the number of terms indexed for
that document. Is that a right assumption? Is there any alternative where I
don't need to make this assumption?
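
As a rough sketch of the pre-counting idea, the snippet below counts distinct tokens before the document is handed to the indexer. It assumes whitespace tokenization with lowercasing, which matches the indexed terms only if the field's analyzer does the same; UniqueTermCounter is a hypothetical helper, not part of Lucene or Solr.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class UniqueTermCounter {
    private UniqueTermCounter() {}

    /** Counts distinct whitespace-separated, lowercased tokens in the text. */
    static int countUniqueTerms(String text) {
        Set<String> seen = new HashSet<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                seen.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return seen.size();
    }
}
```

Stop-word removal, stemming, or synonym filters in the real analysis chain would change the count, so the result is only as trustworthy as the match between this tokenizer and the field's analyzer.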


On Tue, Jul 5, 2011 at 1:29 AM, Markus Jelsma markus.jel...@openindex.io wrote:

 You can create a custom update processor. The passed AddUpdateCommand object
 has an accessor to the SolrInputDocument you're about to add. In the
 processAdd method you can add a new field with whatever you want.

 The wiki has a good example:
 http://wiki.apache.org/solr/UpdateRequestProcessor


  Hello,

  I'm trying to add a field that counts the number of terms in a document to
  my schema. So far I've been computing this value at query-time. Is there a
  way I could compute this only once and store the field?

  final SolrIndexSearcher searcher = request.getSearcher();
  final SolrIndexReader reader = searcher.getReader();
  final String content = "content";

  final byte[] norms = reader.norms(content);
  final int[] docLengths;
  if (norms == null) {
      docLengths = null;
  } else {
      docLengths = new int[norms.length];
      int i = 0;
      for (byte b : norms) {
          float docNorm = searcher.getSimilarity().decodeNormValue(b);
          int docLength = 0;
          if (docNorm != 0) {
              docLength = (int) (1 / docNorm); // reciprocal
          }
          docLengths[i++] = docLength;
      }
  ...
  final NumericField docLenNormField = new
          NumericField(TestQueryResponseWriter.DOC_LENGHT);
  docLenNormField.setIntValue(docLengths[id]);
  doc.add(docLenNormField);




-- 
Regards,
K. Gabriele



How do I add a custom field?

2011-07-03 Thread Gabriele Kahlout
Hello,

I want to have an additional field that appears for every document in
search results. I understand that I should do this by adding the field to
the schema.xml, so I add:
<field name="myField" default="0" type="integer" stored="true"
indexed="false"/>
Then I restart Solr (so that it loads the new schema.xml) and make a query
specifying that it should return myField too, but it doesn't. Will it do so
only for newly indexed documents? Am I missing something?

-- 
Regards,
K. Gabriele



Re: How do I add a custom field?

2011-07-03 Thread Gabriele Kahlout
Is there a way to compute and add the field to all indexed documents
without re-indexing? myField counts the number of terms per document (unique
word count).

On Sun, Jul 3, 2011 at 12:24 PM, lee carroll lee.a.carr...@googlemail.com wrote:

 Hi Gabriele,
 Did you index any docs with your new field ?

 The results will just bring back docs and what fields they have. They won't
 bring back null fields just because they are in your schema. Lucene
 is schema-less.
 Solr adds the schema to make it nice to administer and very powerful to
 use.





 On 3 July 2011 11:01, Gabriele Kahlout gabri...@mysimpatico.com wrote:
  Hello,
 
  I want to have an additional field that appears for every document in
  search results. I understand that I should do this by adding the field to
  the schema.xml, so I add:
  <field name="myField" default="0" type="integer" stored="true"
  indexed="false"/>
  Then I restart Solr (so that it loads the new schema.xml) and make a query
  specifying that it should return myField too, but it doesn't. Will it do so
  only for newly indexed documents? Am I missing something?
 
 




-- 
Regards,
K. Gabriele



How do I compute and store a field?

2011-07-03 Thread Gabriele Kahlout
Hello,

I'm trying to add a field that counts the number of terms in a document to
my schema. So far I've been computing this value at query-time. Is there a way
I could compute this only once and store the field?

final SolrIndexSearcher searcher = request.getSearcher();
final SolrIndexReader reader = searcher.getReader();
final String content = "content";

final byte[] norms = reader.norms(content);
final int[] docLengths;
if (norms == null) {
    docLengths = null;
} else {
    docLengths = new int[norms.length];
    int i = 0;
    for (byte b : norms) {
        float docNorm = searcher.getSimilarity().decodeNormValue(b);
        int docLength = 0;
        if (docNorm != 0) {
            docLength = (int) (1 / docNorm); // reciprocal
        }
        docLengths[i++] = docLength;
    }
...
final NumericField docLenNormField = new
        NumericField(TestQueryResponseWriter.DOC_LENGHT);
docLenNormField.setIntValue(docLengths[id]);
doc.add(docLenNormField);
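
Converting a decoded norm back to a length depends on which length norm the Similarity applied at index time. The sketch below shows two cases: the plain reciprocal used in the snippet above presumes a norm of 1/numTerms (an assumption about the custom Similarity in use), whereas Lucene's DefaultSimilarity uses 1/sqrt(numTerms), which calls for the reciprocal of the square. NormToLength is a hypothetical helper, and rounding rather than truncating is a choice made here.

```java
final class NormToLength {
    private NormToLength() {}

    /** Inverts norm = 1 / numTerms (assumed custom Similarity). */
    static int fromReciprocalNorm(float norm) {
        return norm == 0f ? 0 : Math.round(1f / norm);
    }

    /** Inverts norm = 1 / sqrt(numTerms), the DefaultSimilarity length norm. */
    static int fromInverseSqrtNorm(float norm) {
        return norm == 0f ? 0 : Math.round(1f / (norm * norm));
    }
}
```

Because norms are stored in a single byte and also fold in index-time boosts, either result is an approximation of the true term count.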

-- 
Regards,
K. Gabriele



site: feature in Solr?

2011-06-19 Thread Gabriele Kahlout
Hello,

Besides creating an index with just the site in question, is it possible, as
with Google's site: operator, to search for results only in a given domain?

-- 
Regards,
K. Gabriele



Re: Why are not query keywords treated as a set?

2011-06-19 Thread Gabriele Kahlout
<str name="rawquerystring">past past</str>
<str name="querystring">past past</str>
<str name="parsedquery">content:past content:past</str>

I was expecting the query to get parsed into content:past only and not
content:past content:past.
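
One client-side workaround, sketched below, is to collapse repeated keywords before the query string reaches the parser. It assumes plain whitespace-separated keywords with no quotes, field prefixes, or operators; QueryTerms.dedupe is a hypothetical helper rather than anything Solr provides.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class QueryTerms {
    private QueryTerms() {}

    /** Removes repeated keywords while keeping first-seen order. */
    static String dedupe(String rawQuery) {
        Set<String> unique = new LinkedHashSet<>();
        for (String term : rawQuery.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                unique.add(term);
            }
        }
        return String.join(" ", unique);
    }
}
```

The deduplication happens before analysis, so two different keywords that stem to the same term would still produce two identical clauses.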

On Mon, Jun 20, 2011 at 12:12 AM, lee carroll lee.a.carr...@googlemail.com wrote:

 do you mean a phrase query? past past
 can you give some more detail?

 On 18 June 2011 13:02, Gabriele Kahlout gabri...@mysimpatico.com wrote:
  q=past past
 
  1.0 = (MATCH) sum of:
  *  0.5 = (MATCH) fieldWeight(content:past in 0), product of:*
1.0 = tf(termFreq(content:past)=1)
1.0 = idf(docFreq=1, maxDocs=2)
0.5 = fieldNorm(field=content, doc=0)
  *  0.5 = (MATCH) fieldWeight(content:past in 0), product of:*
1.0 = tf(termFreq(content:past)=1)
1.0 = idf(docFreq=1, maxDocs=2)
0.5 = fieldNorm(field=content, doc=0)
 
  Is there how I can treat the query keywords as a set?
 




-- 
Regards,
K. Gabriele



Why does paste get parsed into past?

2011-06-18 Thread Gabriele Kahlout
Hello,

Debugging query results I find that:
<str name="querystring">paste</str>
<str name="parsedquery">content:past</str>

Now paste and past are two different words. Why does Solr not treat them as
different? How do I make it do so?

--
Regards,
K. Gabriele



Why are not query keywords treated as a set?

2011-06-18 Thread Gabriele Kahlout
q=past past

1.0 = (MATCH) sum of:
  0.5 = (MATCH) fieldWeight(content:past in 0), product of:
    1.0 = tf(termFreq(content:past)=1)
    1.0 = idf(docFreq=1, maxDocs=2)
    0.5 = fieldNorm(field=content, doc=0)
  0.5 = (MATCH) fieldWeight(content:past in 0), product of:
    1.0 = tf(termFreq(content:past)=1)
    1.0 = idf(docFreq=1, maxDocs=2)
    0.5 = fieldNorm(field=content, doc=0)

Is there a way to treat the query keywords as a set?

-- 
Regards,
K. Gabriele



Re: Why does paste get parsed into past?

2011-06-18 Thread Gabriele Kahlout
I'm not sure where those are set, but on reflection I'd keep the default
settings. My real issue is: why are query keywords not treated as a set?
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201106.mbox/%3CBANLkTikHunhyWc2WVTofRYU4ZW=c8oe...@mail.gmail.com%3E

2011/6/18 François Schiettecatte fschietteca...@gmail.com

 What do you have set up for stemming?

 François

 On Jun 18, 2011, at 8:00 AM, Gabriele Kahlout wrote:

  Hello,
 
  Debugging query results I find that:
  <str name="querystring">paste</str>
  <str name="parsedquery">content:past</str>

  Now paste and past are two different words. Why does Solr not treat them as
  different? How do I make it do so?
 




-- 
Regards,
K. Gabriele



Is it true that I cannot delete stored content from the index?

2011-06-18 Thread Gabriele Kahlout
Hello,

I've been indexing with the content field stored. Now I'd like to delete all
stored content. Is there a way to do that without re-indexing?

It seems not, going by the Lucene FAQ
(http://wiki.apache.org/lucene-java/LuceneFAQ#How_do_I_update_a_document_or_a_set_of_documents_that_are_already_indexed.3F):

How do I update a document or a set of documents that are already indexed?
There is no direct update procedure in Lucene. To update an index incrementally
you must first *delete* the documents that were updated, and *then re-add*
them to the index.

-- 
Regards,
K. Gabriele



It's not possible to decide at run-time which similarity class to use, right?

2011-06-16 Thread Gabriele Kahlout
Hello,

I'm testing out different Similarity implementations. Each time I want to try
a different similarity class I change the class attribute of the similarity
element in schema.xml and restart Solr. Besides running multiple cores, each
with its own schema, is there a way to tell the RequestHandler which
similarity class to use?

-- 
Regards,
K. Gabriele



Re: It's not possible to decide at run-time which similarity class to use, right?

2011-06-16 Thread Gabriele Kahlout
On Thu, Jun 16, 2011 at 9:14 PM, Erik Hatcher erik.hatc...@gmail.com wrote:

 No, there's not a way to control Similarity on a per-request basis.

 Some factors from Similarity are computed at index-time though.


You got me on this.


 What factors are you trying to tweak that way and why?  Maybe doing
 boosting using some other mechanism (boosting functions, boosting clauses)
 would be a better way to go?

I'm trying to assess the impact of coord (a search-time factor) on QTime. In
one implementation coord returns 1, while in the other it's actually computed.
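
For reference, the two coord variants being compared can be sketched as plain functions. The computed form, overlap divided by maxOverlap, is what Lucene's DefaultSimilarity uses; CoordVariants itself is a hypothetical stand-alone class, not a Similarity subclass.

```java
final class CoordVariants {
    private CoordVariants() {}

    /** Constant variant: matching more query terms earns no extra credit. */
    static float constantCoord(int overlap, int maxOverlap) {
        return 1.0f;
    }

    /** Computed variant: fraction of the query's terms that the document matched. */
    static float computedCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }
}
```

The arithmetic difference is a single division per scored document, so any measurable effect on QTime would more likely come from how the changed scores interact with the rest of the query.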

Running multiple cores adds considerable complication (the cores must be
configured to share data but not conf).
Patching the request handler to change similarity (I haven't looked into this
yet) would only change the search-time similarity. How about splitting
Similarity into a search-time part and an index-time part? Then the request
handler could take a parameter to 'safely' set the search-time similarity.
I think many would welcome such a separation of responsibilities.


Erik




 On Jun 16, 2011, at 14:55 , Gabriele Kahlout wrote:

  Hello,
 
  I'm testing out different Similarity implementations. Each time I want to
  try a different similarity class I change the class attribute of the
  similarity element in schema.xml and restart Solr. Besides running
  multiple cores, each with its own schema, is there a way to tell the
  RequestHandler which similarity class to use?
 




-- 
Regards,
K. Gabriele



Re: How do I make sure the resulting documents contain the query terms?

2011-06-07 Thread Gabriele Kahlout
Sorry for being unclear, and thank you for answering.
Consider the following documents A(k0,k1,k2), B(k1,k2,k3), and C(k0,k2,k3),
where A, B, C are document identifiers and the k's in brackets are the
terms each document contains.
So the Solr inverted index should be something like:

k0 -- A | C
k1 -- A | B
k2 -- A | B | C
k3 -- B | C

Now let q=k1; how do I make sure C doesn't appear as a result, since it
doesn't contain any occurrence of k1?
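
A toy version of this index, written as a sketch in plain Java, shows what a single-term lookup returns. The class and method names are hypothetical, and real Lucene postings are far more elaborate.

```java
import java.util.Collections;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

final class ToyTermLookup {
    private final Map<String, SortedSet<String>> postings = new TreeMap<>();

    /** Indexes a document by recording its id under each of its terms. */
    void index(String docId, String... terms) {
        for (String term : terms) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    /** Returns the ids of documents containing the term; empty if the term is unknown. */
    SortedSet<String> search(String term) {
        SortedSet<String> docs = postings.get(term);
        if (docs == null) {
            return Collections.<String>emptySortedSet();
        }
        return Collections.unmodifiableSortedSet(docs);
    }
}
```

After indexing A, B, and C as above, search("k1") contains only A and B: a document that lacks the term is never a candidate in the first place.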

On Tue, Jun 7, 2011 at 12:21 AM, Erick Erickson erickerick...@gmail.com wrote:

 I'm having a hard time understanding what you're driving at, can
 you provide some examples? This *looks* like filter queries,
 but I think you already know about those...

 Best
 Erick

 On Mon, Jun 6, 2011 at 4:00 PM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:
  Hello,
 
  I've seen that through boosting it's possible to influence the scoring
  function, but what I would like is sort of a boolean property. In some
 way
  it's to search only the indexed documents by that keyword (or the
  intersection/union) rather than the whole set.
  Is this supported in any way?
 
 
 




-- 
Regards,
K. Gabriele



Re: How do I make sure the resulting documents contain the query terms?

2011-06-07 Thread Gabriele Kahlout
On Tue, Jun 7, 2011 at 8:43 AM, pravesh suyalprav...@yahoo.com wrote:

 k0 -- A | C
 k1 -- A | B
 k2 -- A | B | C
 k3 -- B | C
 Now let q=k1, how do I make sure C doesn't appear as a result since it
 doesn't contain any occurence of k1?
 Do we bother to do that. Now that's what lucene does :)

Lucene/Solr doesn't do that. It ranks documents based on a scoring
function, and with that it lacks the possibility of specifying that a
particular term must appear (the closest way I know of is boosting it).

The solution would be a way to tell Solr/Lucene which documents/indices to
query, i.e. query only the union/intersection of the documents in which
k1,...,kn appear, instead of querying all indexed documents and applying the
ranking function (which will give weight to documents that contain
k1...kn).
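
Restricting the candidate set in this way amounts to set operations on postings lists. A minimal sketch, assuming each postings list is available as a sorted set of document ids (PostingsOps is a hypothetical helper, not Lucene API):

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

final class PostingsOps {
    private PostingsOps() {}

    /** Documents that appear in every postings list (all terms required). */
    static SortedSet<String> intersection(List<SortedSet<String>> postings) {
        SortedSet<String> result = new TreeSet<>();
        if (postings.isEmpty()) {
            return result;
        }
        result.addAll(postings.get(0));
        for (SortedSet<String> docs : postings.subList(1, postings.size())) {
            result.retainAll(docs);
        }
        return result;
    }

    /** Documents that appear in at least one postings list (any term suffices). */
    static SortedSet<String> union(List<SortedSet<String>> postings) {
        SortedSet<String> result = new TreeSet<>();
        for (SortedSet<String> docs : postings) {
            result.addAll(docs);
        }
        return result;
    }
}
```

In Lucene query syntax the intersection corresponds to marking every term as required (+k1 +k3) and the union to the default OR of optional terms (k1 k3).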







-- 
Regards,
K. Gabriele



Re: How do I make sure the resulting documents contain the query terms?

2011-06-07 Thread Gabriele Kahlout
You are right, Lucene will return results based on my scoring function
implementation (Similarity class,
http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/search/Similarity.html):

score(q,d) = coord(q,d) · queryNorm(q) · ∑ (t in q) ( tf(t in d) · idf(t)² · t.getBoost() · norm(t,d) )

It can be seen that whenever tf(t in d) = 0 the term contributes nothing to the
sum, so for a single-term query the whole score is 0 and, as you say, C will
never be returned.

My issue is when the query has multiple terms (my example was too simple!),
and some are 'mandatory' while others are not. In that case I should make a
query that uses the + operator
(http://lucene.apache.org/java/2_9_1/queryparsersyntax.html#+), e.g. q=+k1.
I'm unsure I'll get the syntax right, but let's say k1 is mandatory and
k2 and k3 are optional, then q=k2 k3 +k1. I see that queries made through
solrj are received with + in place of the space (default to OR), so
q=k2+k3++k1.
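
How such a query string looks on the wire can be checked with the JDK's own URL encoder. This sketch assumes standard application/x-www-form-urlencoded encoding, under which a space is sent as + and a literal + as %2B; whether a given client or log shows the encoded or the decoded form is a separate matter. QueryParamEncoder is a hypothetical helper.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class QueryParamEncoder {
    private QueryParamEncoder() {}

    /** Form-encodes a query value as UTF-8. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Under that assumption encode("k2 k3 +k1") gives k2+k3+%2Bk1, so a mandatory-term marker typed into a URL by hand needs to be written as %2B to survive decoding.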



On Tue, Jun 7, 2011 at 5:23 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

 Um, normally that would never happen, because, well, like you say, the
 inverted index doesn't have docC for term K1, because doc C didn't include
 term K1.

 If you search on q=K1, then how/why would docC ever be in your result set?
  Are you seeing it in your result set? The question then would be _why_,
 what weird thing is going on to make that happen,  that's not expected.

 The result set _starts_ from only the documents that actually include the
 term.  Boosting/relevancy ranking only effects what order these documents
 appear in, but there's no reason documentC should be in the result set at
 all in your case of q=k1, where docC is not indexed under k1.


 On 6/7/2011 2:35 AM, Gabriele Kahlout wrote:

 Sorry being unclear and thank you for answering.
 Consider the following documents A(k0,k1,k2), B(k1,k2,k3), and
 C(k0,k2,k3),
 where A,B,C are document identifiers and the ks in bracket with each are
 the
 terms each contains.
 So Solr inverted index should be something like:

 k0 --  A | C
 k1 --  A | B
 k2 --  A | B | C
 k3 --  B | C

 Now let q=k1, how do I make sure C doesn't appear as a result since it
 doesn't contain any occurence of k1?




-- 
Regards,
K. Gabriele



How do I make sure the resulting documents contain the query terms?

2011-06-06 Thread Gabriele Kahlout
Hello,

I've seen that through boosting it's possible to influence the scoring
function, but what I would like is more of a boolean property. In a way,
I want to search only the documents indexed under that keyword (or the
intersection/union for several keywords) rather than the whole set.
Is this supported in any way?


-- 
Regards,
K. Gabriele



Re: How to know how many documents are indexed? Anything more elegant than parsing numFound?

2011-06-04 Thread Gabriele Kahlout
sorry, this was my bad.. I should have used > and not >> (append)

On Fri, Jun 3, 2011 at 9:45 PM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:

 $ curl --fail "http://192.168.34.51:8080/solr/admin/stats.jsp" >> resp.xml
 $ xmlstarlet sel -t -v "//@numDocs" resp.xml
 *Extra content at the end of the document*


 On Fri, Jun 3, 2011 at 8:56 PM, Ahmet Arslan iori...@yahoo.com wrote:

 : How to know how many documents are indexed? Anything more elegant than
 : parsing numFound?
  $ curl "http://192.168.34.51:8080/solr/select?q=*%3A*&rows=0" > resp.xml
  $ xmlstarlet sel -t -v "//@numFound" resp.xml

 solr/admin/stats.jsp is actually an xml too and contains numDocs and
 maxDoc info.

 I think you can get numDocs with jmx too.
 http://wiki.apache.org/solr/SolrJmx




 --
 Regards,
 K. Gabriele





-- 
Regards,
K. Gabriele



How to know how many documents are indexed? Anything more elegant than parsing numFound?

2011-06-03 Thread Gabriele Kahlout
$ curl "http://192.168.34.51:8080/solr/select?q=*%3A*&rows=0" > resp.xml
$ xmlstarlet sel -t -v "//@numFound" resp.xml


-- 
Regards,
K. Gabriele



Re: How to know how many documents are indexed? Anything more elegant than parsing numFound?

2011-06-03 Thread Gabriele Kahlout
$ curl --fail "http://192.168.34.51:8080/solr/admin/stats.jsp" >> resp.xml
$ xmlstarlet sel -t -v "//@numDocs" resp.xml
*Extra content at the end of the document*
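One likely cause of this error, consistent with the follow-up message in this thread, is that resp.xml ended up holding two responses back to back, which is no longer a single well-formed XML document. The sketch below reproduces the same XPath lookup with the JDK's own XPath API; the class name and the sample XML strings are hypothetical and are not real Solr output.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical helper mirroring "xmlstarlet sel -t -v //@numFound". */
final class NumFoundSketch {

    /** Returns the value of the first numFound attribute in the given XML text. */
    static long numFound(String responseXml) {
        try {
            String value = XPathFactory.newInstance().newXPath()
                    .evaluate("//@numFound", new InputSource(new StringReader(responseXml)));
            return Long.parseLong(value);
        } catch (XPathExpressionException e) {
            // Raised, among other cases, when the input is not one well-formed document,
            // e.g. two root elements written one after the other.
            throw new IllegalArgumentException("not a single well-formed XML document", e);
        }
    }
}
```

A single response parses fine, while the same text repeated twice (as an appending redirect would produce) is rejected by the parser.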

On Fri, Jun 3, 2011 at 8:56 PM, Ahmet Arslan iori...@yahoo.com wrote:

 : How to know how many documents are indexed? Anything more elegant than
 : parsing numFound?
  $ curl "http://192.168.34.51:8080/solr/select?q=*%3A*&rows=0" > resp.xml
  $ xmlstarlet sel -t -v "//@numFound" resp.xml

 solr/admin/stats.jsp is actually an xml too and contains numDocs and maxDoc
 info.

 I think you can get numDocs with jmx too.
 http://wiki.apache.org/solr/SolrJmx




-- 
Regards,
K. Gabriele



What's the need for a complicated SolrTestCaseJ4.getClassName() ?

2011-05-23 Thread Gabriele Kahlout
Hello,

When I subclass SolrTestCaseJ4 I cannot use
this.getClass().getSimpleName(), and I don't understand why. I wonder if the
following complicated methods in SolrTestCaseJ4 have anything to do with
it?

  protected static String getClassName() {
    StackTraceElement[] stack = new RuntimeException("WhoAmI").fillInStackTrace().getStackTrace();
    for (int i = stack.length-1; i>=0; i--) {
      StackTraceElement ste = stack[i];
      String cname = ste.getClassName();
      if (cname.indexOf(".lucene.")>=0 || cname.indexOf(".solr.")>=0) {
        return cname;
      }
    }
    return SolrTestCaseJ4.class.getName();
  }

  protected static String getSimpleClassName() {
    String cname = getClassName();
    return cname.substring(cname.lastIndexOf('.')+1);
  }
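The thread does not state why the stack walk is there. One plausible explanation, offered here only as an assumption, is that these helpers are static, so there is no this to call getClass() on (for instance from a static setup method), and the stack trace is the remaining way to find out which test class is running. A self-contained sketch of the same technique, with hypothetical names:

```java
/** Hypothetical stand-alone version of the stack-walking idea quoted above. */
final class StackNames {

    /**
     * Walks the current stack from the outermost frame inwards and returns the first
     * class name containing the given fragment, or the fallback if none matches.
     */
    static String outermostClassMatching(String fragment, String fallback) {
        StackTraceElement[] stack = new RuntimeException("WhoAmI").getStackTrace();
        for (int i = stack.length - 1; i >= 0; i--) {
            String cname = stack[i].getClassName();
            if (cname.contains(fragment)) {
                return cname;
            }
        }
        return fallback;
    }

    /** Strips the package prefix: "org.example.Foo" becomes "Foo". */
    static String simpleName(String cname) {
        return cname.substring(cname.lastIndexOf('.') + 1);
    }
}
```

Because the loop starts at the outermost frame, the first match is the class that started the call chain, which for a test run would typically be the concrete test class rather than the base class.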

-- 
Regards,
K. Gabriele



(How) can I use SolrTestCaseJ4.assertQ(..) to test an existing index?

2011-05-21 Thread Gabriele Kahlout
Hello,

Examining the Solr Core example, it seems that a new index is created in a temp
dataDir that is deleted after each test (good practice, agreed). But before I start
debugging adoc(..) I'm wondering if I can query the same index that I see
working through the Solr web server interface. Also, for large indexes I find it
faster and easier to just copy-paste a test resource index and
assertQ(..) on it.

So far it's not working for me: although I specify the dataDir, it always finds
no documents. Examining the logs, I figured out that SolrCore.initIndex() never
picks up my index.
The issue is that the directory factory set up by
SolrCore.initDirectoryFactory(), called from SolrCore.initIndex(), is a
RAMDirectoryFactory, which understandably returns false for
getDirectoryFactory().exists(indexDir).

Other than hacking it to use the StandardDirectoryFactory, how can I test an
existing index?

I've been trying for multiple days to figure out how to test with Solr!

-- 
Regards,
K. Gabriele



Re: (How) can I use SolrTestCaseJ4.assertQ(..) to test an existing index?

2011-05-21 Thread Gabriele Kahlout
On Sat, May 21, 2011 at 3:29 PM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:

 Hello,

 Examining Solr Core example it seems that a new index is created in a temp
 dataDir deleted after each test (Good practice - agreed).


Errr... from one test to the next only the dataDir is removed, not the in-memory
index. That is blown away with the core in @AfterClass. I'm not sure what
the point is, then, of deleting the dataDir after each test. It's at least
counter-intuitive (to me).

@Test
public void testAddDoc() throws Exception {
    final String docUID = getDocUID();
    assertU(adoc("id", docUID, "url", getURL(docUID), "content", "blah blah blah"));
    assertU(commit());
    assertQ(req(anythingQ), "//*[@numFound='1']");
}

@Test
public void testAddOtherDoc() throws Exception {
    final String docUID = getDocUID();
    assertU(adoc("id", docUID, "url", getURL(docUID), "content", "blah blah blah"));
    assertU(commit());
    assertQ(req(anythingQ), "//*[@numFound='2']");
}


 But before I start debugging adoc(..) I'm wondering if I can query the
 same index which I see to work through Solr Web Server interface. Also for
 large indeces I see it faster and easier to just copy paste a test resource
 index and just assertQ(..) on it.

 Examining the logs I figure out that SolrCore.initIndex() never picks up
 my index.
 The issue is
 So far, it's not working me, although I specify the dataDir it always finds
 no document.
 The issue is that SolrCore.initDirectoryFactory() called from
 SolrCore.initIndex()is initialized to RAMDirectoryFactory which
 understandably returns false to getDirectoryFactory().exists(indexDir).

 Other than hacking to use the StandardDirectoryFactory* how I can test an
 existing index?*

  It's been multiple days that I'm trying to figure out how to test with
 Solr!

 --
 Regards,
 K. Gabriele





-- 
Regards,
K. Gabriele



Re: How to test Solr Integartion - how to get EmbeddedSolrServer?

2011-05-18 Thread Gabriele Kahlout
Thinking more about it, I can solve my immediate problem by just
copy-pasting the classes I need into my own project packages (KISS,
like here: https://github.com/Filirom1/solr-test-exemple).

I'd nevertheless suggest refactoring the Solr code structure to be much more
defaults-compliant, making it easier for external developers to understand
and hopefully easier for committers to maintain (with fewer special-needs
configurations). I've done some of those refactorings on my local copy of
Solr and would be glad to contribute.

For this particular problem the KISS solution would be to create one
more module for tests, which depends on Solr Core and on the Test Framework.
I believe the easier build configuration outweighs the organizational burden
of that extra module.



On Tue, May 17, 2011 at 7:11 PM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:


 http://stackoverflow.com/questions/6034513/can-i-avoid-a-dependency-cycle-with-one-edge-being-a-test-dependency


 On Tue, May 17, 2011 at 6:49 PM, Gabriele Kahlout 
 gabri...@mysimpatico.com wrote:




 On Tue, May 17, 2011 at 3:52 PM, Gabriele Kahlout 
 gabri...@mysimpatico.com wrote:



 On Tue, May 17, 2011 at 3:44 PM, Steven A Rowe sar...@syr.edu wrote:

 Hi Gabriele,

 On 5/17/2011 at 9:34 AM, Gabriele Kahlout wrote:
  Solr Core should declare a test dependency on Solr Test Framework.

 I agree:

 - Solr Core should have a test-scope dependency on Solr Test Framework.
 - Solr Test Framework should have a compile-scope dependency on Solr
 Core.

 But Maven views this as a circular dependency.


 I've seen that, but adding it with <scope>test</scope> works. The logic:
 the src is compiled first and then re-used (I'm assuming maven does
 something smart about not including the full jar).


 Not quite. I've tried a demo and the reactor complains. I'll try to see if
 maven could become 'smarter', or if the 2-build phase solution will work.

 The projects in the reactor contain a cyclic reference: Edge between
 'Vertex{label='com.mysimpatico:TestFramework:1.0-SNAPSHOT'}' and
 'Vertex{label='org.apache:DummyCore:1.0-SNAPSHOT'}' introduces to cycle in
 the graph org.apache:DummyCore:1.0-SNAPSHOT -->
 com.mysimpatico:TestFramework:1.0-SNAPSHOT -->
 org.apache:DummyCore:1.0-SNAPSHOT -> [Help 1]
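 To illustrate why the reactor objects, here is a minimal sketch of cycle detection over a module graph. It is not Maven's implementation; it only assumes, as the error above suggests, that every dependency counts as an edge between modules regardless of its scope. The class name and the module names used with it are hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical depth-first cycle check over "module -> modules it depends on". */
final class ReactorCycleSketch {

    /** Returns true if following dependency edges can lead back to a module already on the path. */
    static boolean hasCycle(Map<String, List<String>> dependencies) {
        Set<String> finished = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String module : dependencies.keySet()) {
            if (visit(module, dependencies, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String module, Map<String, List<String>> dependencies,
                                 Set<String> finished, Set<String> onPath) {
        if (finished.contains(module)) {
            return false; // already fully explored, no cycle through it
        }
        if (!onPath.add(module)) {
            return true; // module is already on the current path: cycle
        }
        for (String dep : dependencies.getOrDefault(module, Collections.<String>emptyList())) {
            if (visit(dep, dependencies, finished, onPath)) {
                return true;
            }
        }
        onPath.remove(module);
        finished.add(module);
        return false;
    }
}
```

 With DummyCore depending on TestFramework and TestFramework depending on DummyCore the check reports a cycle. Moving the tests into a third module that depends on both, as suggested elsewhere in this thread, leaves the graph acyclic.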







 The workaround: Solr Core includes the source of Solr Test Framework as
 part of its test source code.  It's not pretty, but it works.

 I'd be happy to entertain other (functional) approaches.


 In dp4j.com pom.xml I build in 2 phases to compile with the same
 annotations in the project itself (but i don't think we need that here)



 Steve




 --
 Regards,
 K. Gabriele





 --
 Regards,
 K. Gabriele





 --
 Regards,
 K. Gabriele





-- 
Regards,
K. Gabriele



Re: How to list/see all the indexed terms of a particular field in a document?

2011-05-18 Thread Gabriele Kahlout
ant luke?

On Wed, May 18, 2011 at 11:47 AM, Gnanakumar gna...@zoniac.com wrote:

 Hi,

 I'm using Apache Solr v3.1.

 How do I list/get to see all the indexed terms of a particular field in a
 document (by passing Unique Key ID of the document)?

 For example, I've the following field definition in schema.xml:

 <field name="mydocumentid" type="string" indexed="true" stored="true"
 required="true" />
 <field name="mytextcontent" type="text" indexed="true" stored="true"
 required="true" />

 In this case, I expect/want to list/see all the indexed terms of a
 particular document (mydocumentid:x) for the document field
 mytextcontent.

 Regards,
 Gnanam




-- 
Regards,
K. Gabriele



Does every Solr request-response require a running server?

2011-05-18 Thread Gabriele Kahlout
Hello,

I'm wondering whether the Solr test framework always ends up running an
embedded/Jetty server (i.e. the only way to interact with Solr is through a
web server: no web server, no Solr), or whether the tests interact without one,
calling the underlying methods directly?

Judging from SolrTestCaseJ4, the latter seems to be the case. That
would be more white-box than otherwise.

-- 
Regards,
K. Gabriele



Re: Does every Solr request-response require a running server?

2011-05-18 Thread Gabriele Kahlout
On Wed, May 18, 2011 at 5:09 PM, Yonik Seeley yo...@lucidimagination.com wrote:

 On Wed, May 18, 2011 at 10:50 AM, Gabriele Kahlout
 gabri...@mysimpatico.com wrote:
  Hello,
 
  I'm wondering if Solr Test framework at the end of the day always runs an
  embedded/jetty server (which is the only way to interact with solr, i.e.
 no
  web server -- no solr) or in the tests they interact without one,
 calling
  directly the under line methods?
 
  The latter seems to be the case trying to understand SolrTestCaseJ4. That
  would be more white-box than otherwise.

 Solr does either, depending on the test.

 Most tests start only an
 embedded solr server w/ no web server,


What is confusing me is the solr server. Is it SolrCore? In what aspects is
it a 'server'? In my understanding it's the core of the Solr Web application
which makes up the servlets interface, i.e. it's under the servlets not on
top of them.


 but others use an embedded
 jetty server so one can talk HTTP to it.  JettySolrRunner is used for
 the latter.

 -Yonik
 http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
 25-26, San Francisco




-- 
Regards,
K. Gabriele



How to test Solr Integartion - how to get EmbeddedSolrServer?

2011-05-17 Thread Gabriele Kahlout
Hello,

I'm starting to write tests of my Solr integration, and have unfortunately
spent a lot of time chasing updated documentation.

Below is a test I found here:
http://blog.synyx.de/2011/01/integration-tests-for-your-solr-config/ which
uses an EmbeddedSolrServer to communicate with the server and run some
queries.

@Test
public void testThatNoResultsAreReturned() throws SolrServerException {
    SolrParams params = new SolrQuery("text that is not found");
    assertQ(TEST_SEED, null, tests);

    QueryResponse response = req(params);
    assertEquals(0L, response.getResults().getNumFound());
}

The issue is that I cannot add a dependency on Solr-3.2-SNAPSHOT since it's
packaged as a war. I've tried to attach the sources and make the dependency
of type classes but it still won't work.

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-war-plugin</artifactId>
  <configuration>
    <warSourceDirectory>web</warSourceDirectory>
    <webXml>web/WEB-INF/web.xml</webXml>
    <attachClasses>true</attachClasses>
  </configuration>
</plugin>

How could you use EmbeddedSolrServer outside of Solr Webapp?

I've seen that org.apache.solr.client.solrj.embedded.TestSolrProperties does
that in Solr Core, but not through a dependency on the Solr webapp (and I can't
figure out where it comes from).


-- 
Regards,
K. Gabriele



Re: How to test Solr Integartion - how to get EmbeddedSolrServer?

2011-05-17 Thread Gabriele Kahlout
thank you. I'd like to stick to the same version (i.e. 3.2-SNAPSHOT). It
seems things have changed there.

To reproduce (should we file this, and add my test as a regression test to keep
it from coming up again?):

$ svn co -r 1104120 http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/ solr
$ cd solr; ant get-maven-poms; mvn -N -Pbootstrap install; mvn -DskipTests install
$ wget http://dp4j.sf.net/debug/embeddedServerTest.zip
$ unzip embeddedServerTest.zip
$ cd embeddedServerTest; mvn -X test

P.S. I realize the example is not an SSCCE (but it's close, and I've already
uploaded it).

<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.8.2</version>
    <scope>test</scope>
    <type>jar</type>
  </dependency>
  <dependency>
    <groupId>org.apache.solr</groupId>
    <artifactId>solr-core</artifactId>
    <version>3.2-SNAPSHOT</version>
  </dependency>
  <dependency>
    <groupId>org.apache.solr</groupId>
    <artifactId>solr-test-framework</artifactId>
    <version>3.2-SNAPSHOT</version>
  </dependency>
</dependencies>

import org.junit.Before;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.util.AbstractSolrTestCase;


public class SolrConfigTest extends AbstractSolrTestCase {

    public String getSchemaFile() {
        return "/conf/schema.xml";
    }

    public String getSolrConfigFile() {
        return "/conf/solrconfig.xml";
    }

    @Before
    @Override
    public void setUp() throws Exception {
        super.setUp();
        new EmbeddedSolrServer(h.getCoreContainer(), h.getCore().getName());
    }

}



On Tue, May 17, 2011 at 2:38 PM, Colin Vipurs
colin.vip...@shazamteam.com wrote:

  I use the following:

 <dependency>
   <groupId>org.apache.solr</groupId>
   <artifactId>solr-core</artifactId>
   <version>3.1.0</version>
 </dependency>
 <dependency>
   <groupId>org.apache.solr</groupId>
   <artifactId>solr-solrj</artifactId>
   <version>3.1.0</version>
 </dependency>

  Hello,

 I'm starting to write tests of my Solr integration, and have unfortunately
 spent a lot of time chasing updated documentation.

 Follows a test I found
 herehttp://blog.synyx.de/2011/01/integration-tests-for-your-solr-config/which
 uses anEmbeddedSolrServerto communicate with the server and run some
 queries.

 @Test
 public void testThatNoResultsAreReturned() throws SolrServerException {
 SolrParams params = new SolrQuery(text that is not found);
 assertQ(TEST_SEED, null, tests);

 QueryResponse response = req(params);
 assertEquals(0L, response.getResults().getNumFound());
 }

 The issue is that I cannot add a dependency on Solr-3.2-SNAPSHOT since it's
 packaged as a war. I've tried to attach the sources and make the dependency
 of type classes but it still won't work.

 plugin
 groupIdorg.apache.maven.plugins/groupId
 artifactIdmaven-war-plugin/artifactId
 configuration
   warSourceDirectoryweb/warSourceDirectory
   webXmlweb/WEB-INF/web.xml/webXml
 *  attachClassestrue/attachClasses*
 /configuration
   /plugin

 How could you use EmbeddedSolrServer outside of Solr Webapp?

 I've see that org.apache.solr.client.solrj.embedded.TestSolrProperties does
 that in Solr Core, but not through a dependency on Solr Webapp (and I'm not
 figuring out where it comes from).


 --
 Regards,
 K. Gabriele




   --


 *Colin Vipurs*
 *Server Team Lead*

 *Shazam Entertainment Ltd   *
 *26-28 Hammersmith Grove, London W6 7HA*
 m:   +44 (0)  000 000   t: +44 (0) 20 8742 6820
 w:*www.shazam.com*


Re: How to test Solr Integartion - how to get EmbeddedSolrServer?

2011-05-17 Thread Gabriele Kahlout
On Tue, May 17, 2011 at 3:44 PM, Steven A Rowe sar...@syr.edu wrote:

 Hi Gabriele,

 On 5/17/2011 at 9:34 AM, Gabriele Kahlout wrote:
  Solr Core should declare a test dependency on Solr Test Framework.

 I agree:

 - Solr Core should have a test-scope dependency on Solr Test Framework.
 - Solr Test Framework should have a compile-scope dependency on Solr Core.

 But Maven views this as a circular dependency.


I've seen that, but adding it with <scope>test</scope> works. The logic: the
src is compiled first and then re-used (I'm assuming maven does something
smart about not including the full jar).




 The workaround: Solr Core includes the source of Solr Test Framework as
 part of its test source code.  It's not pretty, but it works.

 I'd be happy to entertain other (functional) approaches.


In dp4j.com pom.xml I build in 2 phases to compile with the same annotations
in the project itself (but i don't think we need that here)



 Steve




-- 
Regards,
K. Gabriele



Re: How do i I modify XMLWriter to write foobar?

2011-05-15 Thread Gabriele Kahlout
Got this sorted by checking out the branch revision.

On Thu, May 5, 2011 at 9:44 PM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:

 I've now tried to write my own QueryResponseWriter plugin[1], as a maven
 project depending on Solr Core 3.1, which is the same version of Solr I've
 installed. It seems I'm not able to get rid of some cache.



 $ xmlstarlet sel -t -c /config/queryResponseWriter conf/solrconfig.xml
 <queryResponseWriter name="xml" class="org.apache.solr.request.XMLResponseWriter"/>
 <queryResponseWriter name="Test" class="com.mysimpatico.me.indexplugins.TestQueryResponseWriter" default="true"/>

 Restarted tomcat after changing solrconfig.xml and placing indexplugins.jar
 in $SOLR_HOME/
 At tomcat boot:
 INFO: Adding 'file:/Users/simpatico/SOLR_HOME/lib/IndexPlugins.jar' to
 classloader

 I get legacy code of the plugin for both, and I don't understand why. At
 least the xml should be different. Why could this be? How to find out?
 http://localhost:8080/solr/select?q=apache&wt=Test and
 http://localhost:8080/solr/select?q=apache&wt=xml
 XML Parsing Error: syntax error
 Location: http://localhost:8080/solr/select?q=apachewt=xml (//Test
 Line Number 1, Column 1:
 foobarresponseHeaderstatusQTimeparamsqapachewtxmlresponse00foobar
 ^

 It seems the new code for TestQueryResponseWriter[1] seems to never be
 executed since i added a severe log statement that doesn't appear in tomcat
 logs. Where are those caches?

 Thank you in advance.

 [1]
 package com.mysimpatico.me.indexplugins;

 import java.io.*;
 import java.util.logging.Level;
 import java.util.logging.Logger;
 import org.apache.solr.request.XMLResponseWriter;


 /**
  * Hello world!
  *
  */
 public class TestQueryResponseWriter extends XMLResponseWriter {

     @Override
     public void write(Writer writer,
             org.apache.solr.request.SolrQueryRequest request,
             org.apache.solr.response.SolrQueryResponse response) throws IOException {
         Logger.getLogger(TestQueryResponseWriter.class.getName())
                 .log(Level.SEVERE, "Hello from TestQueryResponseWriter");
         super.write(writer, request, response);
     }
 }


 On Thu, May 5, 2011 at 9:01 PM, Chris Hostetter 
 hossman_luc...@fucit.org wrote:


 : $ xmlstarlet sel -t -c /config/queryResponseWriter conf/solrconfig.xml
 : <queryResponseWriter name="xml" class="org.apache.solr.request.XMLResponseWriter" default="true"/>
 :
 : Now I comment out the line in solrconfig.xml, and there's no more writer.
 : $ xmlstarlet sel -t -c /config/queryResponseWriter conf/solrconfig.xml
 :
 : I make a query, and the XMLResponseWriter is still in charge.
 : $ curl -L http://localhost:8080/solr/select?q=apache
 : <?xml version="1.0" encoding="UTF-8"?>

 ...

 Your example request is not specifying a wt param.

 in addition to the response writers declared in your solrconfig.xml, there
 are response writers that exist implicitly unless you define your own
 instances that override those names (xml, json, python, etc...)

 the real question is: what writer do you *want* to have used when no wt is
 specified?

 whatever the answer is: declare an instance of that writer with
 default=true in your solrconfig.xml


 -Hoss






How to get the filtered terms from a Query in the ResponseWriter?

2011-05-15 Thread Gabriele Kahlout
Hello,

For a given q string I'm trying to extract the terms (identifiers of tokens)
that the Query Parser identified as terms (and shows when explaining
results). I manage to do it as follows, but *I hope there is a better (more
direct) way that you can tell me about:*


NamedList analysis = new FieldAnalysisRequestHandler().doAnalysis(request);
// doAnalysis is protected; one should extend the handler with a dummy subclass
// to get at it, but for now this is just a hack
SimpleOrderedMap fieldsMap = (SimpleOrderedMap) analysis.get("field_names");
SimpleOrderedMap contentMap = (SimpleOrderedMap) fieldsMap.get("content");
final Set terms = new HashSet();
for (Object object : contentMap) {
    List termsList = (List) object;
    for (Object object1 : termsList) {
        SimpleOrderedMap termMap = (SimpleOrderedMap) object1;
        // actually I want the intersection of the terms returned here (i.e. those
        // that made it through all the filters), not the union
        terms.add((String) termMap.get("text"));
    }
}
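
The intersection mentioned in the comment above (terms that survive every analysis stage, rather than the union across stages) can be sketched with plain collections. This is only an illustration, not Solr API: `stages` is a hypothetical list holding the token texts seen after each tokenizer/filter step. It also ignores filters that rewrite a token (lowercasing, stemming), in which case the output of the last stage is probably what you want instead.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SurvivingTerms {

    /** Returns the token texts present after every stage (set intersection). */
    static Set<String> intersect(List<List<String>> stages) {
        if (stages.isEmpty()) {
            return new LinkedHashSet<String>();
        }
        // Start from the first stage and drop whatever a later stage lost.
        Set<String> surviving = new LinkedHashSet<String>(stages.get(0));
        for (List<String> stage : stages.subList(1, stages.size())) {
            surviving.retainAll(stage);
        }
        return surviving;
    }

    public static void main(String[] args) {
        List<List<String>> stages = Arrays.asList(
                Arrays.asList("the", "apache", "solr"),  // hypothetical tokenizer output
                Arrays.asList("apache", "solr"));        // hypothetical stop-filter output
        System.out.println(intersect(stages));           // prints [apache, solr]
    }
}
```

Starting from the first stage and calling `retainAll` keeps the original token order, which a `HashSet` union would lose.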







How to plugin the value of a Field? DocInverterPerField?

2011-05-14 Thread Gabriele Kahlout
Hello,

I'm trying to add an extra field to schema.xml that is only stored, but
since Nutch doesn't know about it, I don't know how to tell Solr its value
for each document. I'd like to plug in the computation, much as is done
with Similarity, but I'm not sure how to do that.

From SOLR-1566 https://issues.apache.org/jira/browse/SOLR-1566:
Currently it is not possible for components to add fields to outgoing
documents which are not in the the stored fields of the document.
That's my next problem, but let's say I'm okay storing the field, how do I
do that?

BTW, I tried hacking the code to add the fields at response time to the
defaultFields per document, but since it always has a cached document it
won't add them (I could still force the adding, but I'm not sure what else
will break as a result).



Re: How to plugin the value of a Field? DocInverterPerField?

2011-05-14 Thread Gabriele Kahlout
It looks like I have to contact the
  <updateHandler class="solr.DirectUpdateHandler2"> with an AddUpdateCommand.


On Sat, May 14, 2011 at 12:36 PM, Gabriele Kahlout gabri...@mysimpatico.com
 wrote:

 Hello,

 I'm trying to add an extra field to the schema.xml that is only stored, but
 with nutch not knowing about it, I don't know how to tell Solr of its value
 for each document. I'd like to plugin the computation, something like is
 done with Similarity, but I'm not sure how to do that.

 From SOLR-1566 https://issues.apache.org/jira/browse/SOLR-1566:
 Currently it is not possible for components to add fields to outgoing
 documents which are not in the the stored fields of the document.
 That's my next problem, but let's say I'm okay storing the field, how do I
 do that?

 BTW, I tried hacking the code to add the fields a response-time to the
 defaultFields per document, but since it always has a cached document it'll
 not add them (I could still force the adding, but I'm not sure what else
 will break as a result).



Re: How to plugin the value of a Field? DocInverterPerField?

2011-05-14 Thread Gabriele Kahlout
I calculate it from search-time + index-time field values.
For example, say I want to print the reciprocal of the content field norm
(available at index-time) along every document in the results. What's the
'clean' way of doing that?

On Sat, May 14, 2011 at 3:42 PM, Markus Jelsma
markus.jel...@openindex.io wrote:

 I'm not sure what you're trying to do. Where does the field value need to
 come from?

  Hello,
 
  I'm trying to add an extra field to the schema.xml that is only stored, but
  with nutch not knowing about it, I don't know how to tell Solr of its value
  for each document. I'd like to plugin the computation, something like is
  done with Similarity, but I'm not sure how to do that.
 
  From SOLR-1566 https://issues.apache.org/jira/browse/SOLR-1566:
  Currently it is not possible for components to add fields to outgoing
  documents which are not in the the stored fields of the document.
  That's my next problem, but let's say I'm okay storing the field, how do I
  do that?
 
  BTW, I tried hacking the code to add the fields a response-time to the
  defaultFields per document, but since it always has a cached document it'll
  not add them (I could still force the adding, but I'm not sure what else
  will break as a result).






Re: Want to Delete Existing Index create fresh index

2011-05-14 Thread Gabriele Kahlout
I guess you are having issues with the dataDir. Did you set the dataDir in
solrconfig.xml?

On Sat, May 14, 2011 at 4:10 PM, Pawan Darira pawan.dar...@gmail.com wrote:

 Hi

 I am using Solr 1.4 and had already changed the schema. When I created the index
 for the first time, the directory was automatically created & the index was made
 perfectly fine.

 Now, I want to create the index from scratch, so I deleted the whole
 data/index directory & ran the script. Now it is only creating empty
 directories & NO index files inside them.

 Thanks
 Pawan


 On Sat, May 14, 2011 at 6:54 PM, Dmitry Kan dmitry@gmail.com wrote:

  Hi Pawan,
 
  Which SOLR version do you have installed?
 
  It should be absolutely normal for the data/ sub directory to create when
  starting up SOLR.
 
  So just go ahead and post your data into SOLR, if you have changed the
  schema already.
 
  --
  Regards,
 
  Dmitry Kan
 
  On Sat, May 14, 2011 at 4:01 PM, Pawan Darira pawan.dar...@gmail.com
  wrote:
 
   I did that. Index directory is created but not contents in that
  
   2011/5/14 François Schiettecatte fschietteca...@gmail.com
  
You can also shut down solr/lucene, do:
   
   rm -rf /YourIndexName/data/index
   
and restart, the index directory will be automatically recreated.
   
François
   
On May 14, 2011, at 1:53 AM, Gabriele Kahlout wrote:
   
 curl --fail "$solrIndex/update?commit=true" -d '<delete><query>*:*</query></delete>' #empty index [1]

 [1] http://wiki.apache.org/nutch/Whole-Web%20Crawling%20incremental%20script

 did u try?


 On Sat, May 14, 2011 at 7:26 AM, Pawan Darira 
  pawan.dar...@gmail.com
wrote:

 Hi

 I had an existing index created months back. now my database schema has
 changed. i wanted to delete the current data/index directory & re-create
 the fresh index

 but it is saying that segments file not found & just create blank
 data/index directory. Please help

 --
 Thanks,
 Pawan Darira




   
   
  
  
   --
   Thanks,
   Pawan Darira
  
 



 --
 Thanks,
 Pawan Darira






Re: How to plugin the value of a Field? DocInverterPerField?

2011-05-14 Thread Gabriele Kahlout
Just reporting on progress:

By hacking my own ResponseWriter I managed to add the field to the doc
just in time, before it's written. It's not that messy after all, and I
suspect the fields could also be declared in schema.xml (if we want to be
able to disable them at run-time), with the value computed and added only
when the field is present.
As acknowledged by others before me, there's room for refactoring the
ResponseWriters to at least make them more reusable.

I hope SOLR-1566 https://issues.apache.org/jira/browse/SOLR-1566 comes up
with a cleaner solution.
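
To make the just-in-time idea concrete, here is a toy model in plain Java. It is a sketch under assumptions, not the actual ResponseWriter hack: the document is modelled as a `Map`, and the field name `content_inv_norm` and the way the norm value reaches the method are both hypothetical. It computes the reciprocal of a field norm (the example from earlier in this thread) and adds it to a copy of the stored fields right before they would be written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class JustInTimeField {

    /** Copies the stored fields and appends one computed field; the input map is not modified. */
    static Map<String, Object> withInverseNorm(Map<String, Object> storedFields, float norm) {
        Map<String, Object> out = new LinkedHashMap<String, Object>(storedFields);
        // Skip the field for a zero norm (e.g. norms omitted) instead of emitting Infinity.
        if (norm != 0f) {
            out.put("content_inv_norm", 1f / norm);  // hypothetical field name
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<String, Object>();
        doc.put("id", "doc1");
        System.out.println(withInverseNorm(doc, 0.5f));  // prints {id=doc1, content_inv_norm=2.0}
    }
}
```

Working on a copy is a deliberate choice in this sketch: it leaves a possibly cached document untouched, which sidesteps the worry about what else might break if the cached instance were changed.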

On Sat, May 14, 2011 at 3:55 PM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:

 I calculate it from search-time + index-time field values.
 For example, say I want to print the reciprocal of the content field norm
 (available at index-time) along every document in the results. What's the
 'clean' way of doing that?


 On Sat, May 14, 2011 at 3:42 PM, Markus Jelsma markus.jel...@openindex.io
  wrote:

 I'm not sure what you're trying to do. Where does the field value need to
 come from?

  Hello,
 
  I'm trying to add an extra field to the schema.xml that is only stored, but
  with nutch not knowing about it, I don't know how to tell Solr of its value
  for each document. I'd like to plugin the computation, something like is
  done with Similarity, but I'm not sure how to do that.
 
  From SOLR-1566 https://issues.apache.org/jira/browse/SOLR-1566:
  Currently it is not possible for components to add fields to outgoing
  documents which are not in the the stored fields of the document.
  That's my next problem, but let's say I'm okay storing the field, how do I
  do that?
 
  BTW, I tried hacking the code to add the fields a response-time to the
  defaultFields per document, but since it always has a cached document it'll
  not add them (I could still force the adding, but I'm not sure what else
  will break as a result).






Editor loads wrong version of IndexSearcher while debugging - how to fix?

2011-05-13 Thread Gabriele Kahlout
Hello,

I'm debugging Solr built as a Maven project in NetBeans (NB), and when I step
into the code of a Lucene dependency, namely
org.apache.lucene.search.IndexSearcher.explain(..), the call stack expects
this method to be at line 599 while in the editor the class ends at line 304.

from solr-core's pom.xml:
<dependency>
  <groupId>${project.groupId}</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>${project.version}</version>
</dependency>

from solrj's pom.xml:
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-core</artifactId>
  <version>${project.version}</version>
</dependency>

Looking up the actual class, it is indeed an 846-line class, and the editor is
loading a faulty version of sources.jar (download sourcecode).
So the code in the sources.jar doesn't correspond to the binary code.
Now the big question is, *why do I get sources different from the binary of
the same version for a dependency*? How else could this be debugged? I don't
know how NB downloads a dependency's sources (from googling, it seems that each
IDE has its own plugin for doing that).
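
One way to narrow this down, sketched here with standard JDK calls only, is to ask the running JVM which jar a class was actually loaded from and then compare that jar with the sources.jar the editor opened. The class name `WhereLoadedFrom` and the placeholder string are made up for the illustration. Run it on the same classpath as the debugged application and pass e.g. `org.apache.lucene.search.IndexSearcher` as the argument.

```java
import java.security.CodeSource;

final class WhereLoadedFrom {

    /** Returns the jar or directory a class was loaded from, or a placeholder if the JVM reports none. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults to this class itself when no class name is given.
        String name = args.length > 0 ? args[0] : WhereLoadedFrom.class.getName();
        System.out.println(name + " <- " + locationOf(Class.forName(name)));
    }
}
```

If the reported jar is not the artifact version the pom resolves to, the mismatch is on the binary side rather than in the downloaded sources.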



Re: Want to Delete Existing Index create fresh index

2011-05-13 Thread Gabriele Kahlout
curl --fail "$solrIndex/update?commit=true" -d '<delete><query>*:*</query></delete>' #empty index [1]

[1] http://wiki.apache.org/nutch/Whole-Web%20Crawling%20incremental%20script

did u try?
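
For anyone who prefers to do the same from Java rather than curl, here is a rough equivalent using only the JDK. It is a sketch under assumptions: the base URL is whatever `$solrIndex` points at, and the `text/xml` content type is my choice rather than what curl's `-d` sends by default.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class EmptyIndex {

    /** The XML delete-by-query message that matches every document. */
    static String deleteAllBody() {
        return "<delete><query>*:*</query></delete>";
    }

    /** Appends the update path and commit flag to a base URL such as http://localhost:8080/solr. */
    static String updateUrl(String solrBase) {
        String base = solrBase.endsWith("/")
                ? solrBase.substring(0, solrBase.length() - 1)
                : solrBase;
        return base + "/update?commit=true";
    }

    /** POSTs the delete message and returns the HTTP status code. */
    static int post(String solrBase) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(updateUrl(solrBase)).openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream out = con.getOutputStream()) {
            out.write(deleteAllBody().getBytes(StandardCharsets.UTF_8));
        }
        try {
            return con.getResponseCode();
        } finally {
            con.disconnect();
        }
    }
}
```

Calling `EmptyIndex.post("http://localhost:8080/solr")` should return 200 when the delete and commit go through; anything else corresponds to the case `--fail` guards against in the curl version.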


On Sat, May 14, 2011 at 7:26 AM, Pawan Darira pawan.dar...@gmail.com wrote:

 Hi

 I had an existing index created months back. now my database schema has
 changed. i wanted to delete the current data/index directory & re-create
 the fresh index

 but it is saying that segments file not found & just create blank
 data/index directory. Please help

 --
 Thanks,
 Pawan Darira






Re: Is it possible to build Solr as a maven project?

2011-05-12 Thread Gabriele Kahlout
On Tue, May 10, 2011 at 3:56 PM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:



 On Tue, May 10, 2011 at 3:50 PM, Steven A Rowe sar...@syr.edu wrote:

 Hi Gabriele,

 There are some Maven instructions here (not in Lucene/Solr 3.1 because I
 just wrote the file a couple of days ago):
 
 http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/dev-tools/maven/README.maven
 

 My recommendation, since the Solr 3.1 source tarball does not include
 dev-tools/, is to check out the 3.1-tagged sources from Subversion:

 svn co http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1

 and then follow the instructions in the above-linked README.maven.  I did
 that just now and it worked for me.  The results are in solr/package/maven/.


 I did that and i think they worked for me but i didn't get nutch to work
 with it, so I preferred to revert to what is officially supported (not even,
 but...).

 I'll be trying and report back.


Everything worked! These are the revisions used:

$ svn co -r 1101526
http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1 solr 1086822
$ svn co -r 1101540
http://svn.apache.org/repos/asf/nutch/branches/branch-1.3 nutch



 Thank you






 Please write back if you run into any problems.

 Steve


 From: Gabriele Kahlout [mailto:gabri...@mysimpatico.com]
 Sent: Tuesday, May 10, 2011 8:37 AM
 To: boutr...@gmail.com
 Cc: solr-user@lucene.apache.org; Steven A Rowe; ryan...@gmail.com
 Subject: Re: Is it possible to build Solr as a maven project?


 sorry, this was not the target I used (this one should work too, but...),

 Can we expand on the but...?

 $ wget http://apache.panu.it//lucene/solr/3.1.0/apache-solr-3.1.0-src.tgz
 $ tar xf apache-solr-3.1.0-src.tgz
 $ cd apache-solr-3.1.0
 $ ant generate-maven-artifacts
 generate-maven-artifacts:

 get-maven-poms:

 BUILD FAILED
 /Users/simpatico/Downloads/apache-solr-3.1.0/build.xml:59: The following
 error occurred while executing this line:
 /Users/simpatico/Downloads/apache-solr-3.1.0/lucene/build.xml:445: The
 following error occurred while executing this line:
 /Users/simpatico/Downloads/apache-solr-3.1.0/build.xml:45:
 /Users/simpatico/Downloads/apache-solr-3.1.0/dev-tools/maven does not exist.



 Now for those that build this, it must have worked sometime. How? Or is
 this a bug in the release?
 Looking at the revision history of the build script, this might be related to
 LUCENE-2490 https://issues.apache.org/jira/browse/LUCENE-2490 but I'm
 not sure I understand the solution. I've checked out dev-tools, but even
 with it things don't work (I tried the one from the 3.1.0 release).




 the one I used is get-maven-poms. That will just create pom files and copy
 them to their right target locations.

 I'm using netbeans and I'm using the plugin Automatic Projects to do
 everything inside the IDE.

 Which version of Solr are you using ?

 Ludovic.

 2011/5/4 Gabriele Kahlout [via Lucene] 
 ml-node+2898211-2124746009-383...@n3.nabble.com

  generate-maven-artifacts:
 [mkdir] Created dir: /Users/simpatico/SOLR_HOME/build/maven
 [mkdir] Created dir: /Users/simpatico/SOLR_HOME/dist/maven
  [copy] Copying 1 file to
  /Users/simpatico/SOLR_HOME/build/maven/src/maven
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
 
  *BUILD FAILED*
  /Users/simpatico/SOLR_HOME/*build.xml:800*: The following error occurred
  while executing this line:
  /Users/simpatico/SOLR_HOME/common-build.xml:274: artifact:deploy doesn't
  support the uniqueVersion attribute
 
 
  *build.xml:800:* <m2-deploy
  pom.xml="src/maven/solr-parent-pom.xml.template"/>
 
  removed the uniqueVersion attribute:
 
  generate-maven-artifacts:
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
  [artifact:deploy] Deploying to
 file:///Users/simpatico/SOLR_HOME/dist/maven
 
  [artifact:deploy] [INFO] Retrieving previous build number from remote
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'artifact
  org.apache.solr:solr-parent'
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot
  org.apache.solr:solr-parent:1.4.2-SNAPSHOT'
   [copy] Copying 1 file to /Users/simpatico/SOLR_HOME/build/maven/lib
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
  [artifact:deploy] Deploying to
 file:///Users/simpatico/SOLR_HOME/dist/maven
 
  [artifact:deploy] [INFO] Retrieving previous build number from remote
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata

Coord in queryExplain

2011-05-12 Thread Gabriele Kahlout
Hello,

I'm wondering why the results of coord() are not displayed when debugging
query results, as described in the wiki [1]. I'd like to see it.
Could someone point out how to make it appear with the debug fields?

[1] http://wiki.apache.org/solr/SolrRelevancyFAQ#Why_does_id:archangel_come_before_id:hawkgirl_when_querying_for_.22wings.22



Re: Coord in queryExplain

2011-05-12 Thread Gabriele Kahlout
You are right!

On Thu, May 12, 2011 at 2:54 PM, Ahmet Arslan iori...@yahoo.com wrote:

  I'm wondering why the results of coord() are not displayed
  when debugging
  query results, as described in the
  wiki[1
 http://wiki.apache.org/solr/SolrRelevancyFAQ#Why_does_id:archangel_come_before_id:hawkgirl_when_querying_for_.22wings.22
 ].
  I'd like to see it.
  Could someone point to how to make it appear with the debug
  fields?

 coord info is displayed; however, it seems that it is not displayed for a
 value of 1.0.
 To see coord, issue a multi-word query, and advance to the end of the list
 via the start param.
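
For reference, the coordination factor in Lucene's default similarity is, as far as I know, simply the fraction of query terms a document matched. The sketch below mirrors that formula in plain Java (it is not a call into Lucene). A document matching every term gets 1.0, the value that seems to be left out of the explain output, while hits further down the list for a multi-word query tend to match fewer terms.

```java
final class CoordDemo {

    /** Fraction of the query's terms that the document matched. */
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        System.out.println(coord(2, 2));  // prints 1.0 (all terms matched)
        System.out.println(coord(1, 2));  // prints 0.5 (one of two terms matched)
    }
}
```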






No more standard query type?

2011-05-11 Thread Gabriele Kahlout
Is the tagged release of solr 3.1 different from the one distributed in the
downloads page? It looks like a reproducible bug.

svn co -r 1101526
http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1 solr

This is the default query I get from
http://localhost:8080/solr/admin/form.jsp:

http://localhost:8080/solr/select?indent=on&version=2.2&q=*%3A*&fq=&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl.fl=
HTTP Status 400 - unknown handler: standard
--

*type* Status report

*message* *unknown handler: standard*

*description* *The request sent by the client was syntactically incorrect
(unknown handler: standard).*
--
Apache Tomcat/6.0.29

I get the same with
http://localhost:8080/solr/select?q=*%3A*&wt=standard&qt=standard, but not
with:
http://localhost:8080/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
(from http://localhost:8080/solr/admin)

(good)
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">5</int>
<lst name="params">
<str name="indent">on</str>
<str name="start">0</str>
<str name="q">*:*</str>
<str name="rows">10</str>
<str name="version">2.2</str>
</lst>
</lst>
<result name="response" numFound="0" start="0"/>
</response>
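
When comparing request URLs like these by hand, it helps to build them from explicit parameters so that separators and escaping cannot go wrong. A small sketch with the JDK's `URLEncoder` follows; the base URL is the one from this message, and the class and method names are invented for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class SelectUrl {

    /** URL-encodes one key or value as UTF-8. */
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    /** Builds base + "/select?k1=v1&k2=v2..." in the iteration order of the map. */
    static String build(String base, Map<String, String> params) {
        StringBuilder url = new StringBuilder(base).append("/select");
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator).append(enc(p.getKey())).append('=').append(enc(p.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "*:*");
        params.put("rows", "10");
        // prints http://localhost:8080/solr/select?q=*%3A*&rows=10
        System.out.println(build("http://localhost:8080/solr", params));
    }
}
```

Adding or omitting `qt` and `wt` is then a one-line change, which makes it easy to check which of the two parameters triggers the 400.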

The

On Thu, May 5, 2011 at 9:01 PM, Chris Hostetter hossman_luc...@fucit.org wrote:


 : $ xmlstarlet sel -t -c /config/queryResponseWriter conf/solrconfig.xml
 : <queryResponseWriter name="xml" class="org.apache.solr.request.XMLResponseWriter" default="true"/>
 :
 : Now I comment out the line in solrconfig.xml, and there's no more writer.
 : $ xmlstarlet sel -t -c /config/queryResponseWriter conf/solrconfig.xml
 :
 : I make a query, and the XMLResponseWriter is still in charge.
 : $ curl -L http://localhost:8080/solr/select?q=apache
 : <?xml version="1.0" encoding="UTF-8"?>

 ...

 Your example request is not specifying a wt param.

 in addition to the response writers declared in your solrconfig.xml, there
 are response writers that exist implicitly unless you define your own
 instances that override those names (xml, json, python, etc...)

 the real question is: what writer do you *want* to have used when no wt is
 specified?

 whatever the answer is: declare an instance of that writer with
 default="true" in your solrconfig.xml


 -Hoss
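
The lookup rules described in the quoted reply can be pictured with a toy model. This is not Solr's code, only a sketch of the behaviour as described in this thread: implicit writers exist under well-known names, a declaration in solrconfig.xml overrides or adds a name, and the instance marked as default is used when no wt is given. The XML fallback in the constructor is an assumption based on the curl output shown above, and what the real Solr does for an unknown wt is not modelled (this sketch just returns null).

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class WriterLookup {

    private final Map<String, String> writers = new LinkedHashMap<String, String>();
    private String defaultName;

    WriterLookup() {
        // Implicit writers: present even when solrconfig.xml declares none.
        writers.put("xml", "implicit XML writer");
        writers.put("json", "implicit JSON writer");
        writers.put("python", "implicit Python writer");
        defaultName = "xml";  // assumed fallback when nothing is declared as default
    }

    /** Models one queryResponseWriter declaration with its name, class and default flag. */
    void declare(String name, String implementation, boolean isDefault) {
        writers.put(name, implementation);
        if (isDefault) {
            defaultName = name;
        }
    }

    /** Picks the writer for a request; wt == null means the parameter was omitted. */
    String resolve(String wt) {
        return writers.get(wt == null ? defaultName : wt);
    }

    public static void main(String[] args) {
        WriterLookup lookup = new WriterLookup();
        System.out.println(lookup.resolve(null));  // prints implicit XML writer
        lookup.declare("test", "com.mysimpatico.me.indexplugins.TestQueryResponseWriter", true);
        System.out.println(lookup.resolve(null));  // prints the TestQueryResponseWriter class name
    }
}
```

Seen this way, commenting out the only declaration changes nothing for a request without wt, which matches the curl output quoted above.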






Re: Is it possible to build Solr as a maven project?

2011-05-10 Thread Gabriele Kahlout
 sorry, this was not the target I used (this one should work too, but...),


Can we expand on the but...?

$ wget http://apache.panu.it//lucene/solr/3.1.0/apache-solr-3.1.0-src.tgz
$ tar xf apache-solr-3.1.0-src.tgz
$ cd apache-solr-3.1.0
$ ant generate-maven-artifacts
*generate-maven-artifacts:

get-maven-poms:

BUILD FAILED
/Users/simpatico/Downloads/apache-solr-3.1.0/build.xml:59: The following
error occurred while executing this line:
/Users/simpatico/Downloads/apache-solr-3.1.0/lucene/build.xml:445: The
following error occurred while executing this line:
/Users/simpatico/Downloads/apache-solr-3.1.0/build.xml:45:
/Users/simpatico/Downloads/apache-solr-3.1.0/dev-tools/maven does not exist.
*


Now for those that build this, it must have worked at some point. How? Or is
this a bug in the release?
Looking at the revision history of the build script, this might be related to
LUCENE-2490 https://issues.apache.org/jira/browse/LUCENE-2490 but I'm not
sure I understand the solution. I've checked out dev-tools, but even with
it things don't work (I tried the one from the 3.1.0 release).





 the one I used is get-maven-poms. That will just create pom files and copy
 them to their right target locations.

 I'm using netbeans and I'm using the plugin Automatic Projects to do
 everything inside the IDE.

 Which version of Solr are you using ?

 Ludovic.

 2011/5/4 Gabriele Kahlout [via Lucene] 
 ml-node+2898211-2124746009-383...@n3.nabble.com

  generate-maven-artifacts:
 [mkdir] Created dir: /Users/simpatico/SOLR_HOME/build/maven
 [mkdir] Created dir: /Users/simpatico/SOLR_HOME/dist/maven
  [copy] Copying 1 file to
  /Users/simpatico/SOLR_HOME/build/maven/src/maven
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
 
  *BUILD FAILED*
  /Users/simpatico/SOLR_HOME/*build.xml:800*: The following error occurred
  while executing this line:
  /Users/simpatico/SOLR_HOME/common-build.xml:274: artifact:deploy doesn't
  support the uniqueVersion attribute
 
 
  *build.xml:800:* <m2-deploy
  pom.xml="src/maven/solr-parent-pom.xml.template"/>
 
  removed the uniqueVersion attribute:
 
  generate-maven-artifacts:
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
  [artifact:deploy] Deploying to
 file:///Users/simpatico/SOLR_HOME/dist/maven
 
  [artifact:deploy] [INFO] Retrieving previous build number from remote
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'artifact
  org.apache.solr:solr-parent'
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot
  org.apache.solr:solr-parent:1.4.2-SNAPSHOT'
   [copy] Copying 1 file to /Users/simpatico/SOLR_HOME/build/maven/lib
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
  [artifact:deploy] Deploying to
 file:///Users/simpatico/SOLR_HOME/dist/maven
 
  [artifact:deploy] [INFO] Retrieving previous build number from remote
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'artifact
  org.apache.solr:solr-commons-csv'
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading project information for
 solr-commons-csv
 
  1.4.2-SNAPSHOT
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot
  org.apache.solr:solr-commons-csv:1.4.2-SNAPSHOT'
   [copy] Copying 1 file to
  /Users/simpatico/SOLR_HOME/build/maven/contrib/dataimporthandler
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
 
  BUILD FAILED
  /Users/simpatico/SOLR_HOME/build.xml:809: The following error occurred
  while
  executing this line:
  */Users/simpatico/SOLR_HOME/common-build.xml:274: artifact:deploy doesn't
  support the nested attach element*
 
 


 -
 Jouve
 France.




-- 
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains [LON] or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
< Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with X.
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).


Re: Is it possible to build Solr as a maven project?

2011-05-10 Thread Gabriele Kahlout
On Tue, May 10, 2011 at 3:50 PM, Steven A Rowe sar...@syr.edu wrote:

 Hi Gabriele,

 There are some Maven instructions here (not in Lucene/Solr 3.1 because I
 just wrote the file a couple of days ago):
 
 http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/dev-tools/maven/README.maven
 

 My recommendation, since the Solr 3.1 source tarball does not include
 dev-tools/, is to check out the 3.1-tagged sources from Subversion:

 svn co http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1

 and then follow the instructions in the above-linked README.maven.  I did
 that just now and it worked for me.  The results are in solr/package/maven/.


I did that and I think it worked for me, but I didn't get Nutch to work
with it, so I preferred to revert to what is officially supported (not even,
but...).

I'll try again and report back. Thank you in advance.



 Please write back if you run into any problems.

 Steve


 From: Gabriele Kahlout [mailto:gabri...@mysimpatico.com]
 Sent: Tuesday, May 10, 2011 8:37 AM
 To: boutr...@gmail.com
 Cc: solr-user@lucene.apache.org; Steven A Rowe; ryan...@gmail.com
 Subject: Re: Is it possible to build Solr as a maven project?


 sorry, this was not the target I used (this one should work too, but...),

 Can we expand on the but...?

 $ wget http://apache.panu.it/lucene/solr/3.1.0/apache-solr-3.1.0-src.tgz
 $ tar xf apache-solr-3.1.0-src.tgz
 $ cd apache-solr-3.1.0
 $ ant generate-maven-artifacts
 generate-maven-artifacts:

 get-maven-poms:

 BUILD FAILED
 /Users/simpatico/Downloads/apache-solr-3.1.0/build.xml:59: The following
 error occurred while executing this line:
 /Users/simpatico/Downloads/apache-solr-3.1.0/lucene/build.xml:445: The
 following error occurred while executing this line:
 /Users/simpatico/Downloads/apache-solr-3.1.0/build.xml:45:
 /Users/simpatico/Downloads/apache-solr-3.1.0/dev-tools/maven does not exist.



 For those who build this, it must have worked at some point. How? Or is
 this a bug in the release?
 Looking at the revision history of the build script, I might be referring to
 LUCENE-2490 (https://issues.apache.org/jira/browse/LUCENE-2490), but I'm not
 sure I understand the solution. I've checked out dev-tools, but even with
 it things don't work (I tried the one from the 3.1.0 release).




 the one I used is get-maven-poms. That will just create pom files and copy
 them to their right target locations.

 I'm using netbeans and I'm using the plugin Automatic Projects to do
 everything inside the IDE.

 Which version of Solr are you using ?

 Ludovic.

 2011/5/4 Gabriele Kahlout [via Lucene]
 ml-node+2898211-2124746009-383...@n3.nabble.com

  generate-maven-artifacts:
 [mkdir] Created dir: /Users/simpatico/SOLR_HOME/build/maven
 [mkdir] Created dir: /Users/simpatico/SOLR_HOME/dist/maven
  [copy] Copying 1 file to
  /Users/simpatico/SOLR_HOME/build/maven/src/maven
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
 
  *BUILD FAILED*
  /Users/simpatico/SOLR_HOME/*build.xml:800*: The following error occurred
  while executing this line:
  /Users/simpatico/SOLR_HOME/common-build.xml:274: artifact:deploy doesn't
  support the uniqueVersion attribute
 
 
  *build.xml:800:* <m2-deploy pom.xml="src/maven/solr-parent-pom.xml.template"/>
 
  Removed the uniqueVersion attribute:
 
  generate-maven-artifacts:
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
  [artifact:deploy] Deploying to
 file:///Users/simpatico/SOLR_HOME/dist/maven
 
  [artifact:deploy] [INFO] Retrieving previous build number from remote
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'artifact
  org.apache.solr:solr-parent'
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot
  org.apache.solr:solr-parent:1.4.2-SNAPSHOT'
   [copy] Copying 1 file to /Users/simpatico/SOLR_HOME/build/maven/lib
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
  [artifact:deploy] Deploying to
 file:///Users/simpatico/SOLR_HOME/dist/maven
 
  [artifact:deploy] [INFO] Retrieving previous build number from remote
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'artifact
  org.apache.solr:solr-commons-csv'
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading project information for solr-commons-csv 1.4.2-SNAPSHOT
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading

SolrHome ends with /./ - is this normal?

2011-05-10 Thread Gabriele Kahlout
 Hello,

I'm having trouble getting Solr 3.1 to work with nutch-1.3. I'm not sure
where the problem is, but I'm wondering why the solrHome path ends with
/./.

cwd=/Applications/NetBeans/apache-tomcat-7.0.6/bin
SolrHome=/Users/simpatico/apache-solr-3.1.0/solr/./

In the web.xml of solr:

<env-entry>
   <env-entry-name>solr/home</env-entry-name>
   <env-entry-value>${user.home}/apache-solr-3.1.0/solr</env-entry-value>
   <env-entry-type>java.lang.String</env-entry-type>
</env-entry>


-- 
Regards,
K. Gabriele



Re: SolrHome ends with /./ - is this normal?

2011-05-10 Thread Gabriele Kahlout
It apparently is normal, and my issue is indeed with nutch.
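
The thread does not say why Solr prints the trailing /./, so the following is
only a minimal sketch of why the suffix is harmless: a "." segment names the
current directory, so both spellings resolve to the same location. The class
name SolrHomePathCheck is made up for this illustration and is not part of
Solr.

```java
import java.nio.file.Paths;

class SolrHomePathCheck {
    // Collapses redundant "." and ".." segments lexically, without touching the file system.
    static String normalize(String rawPath) {
        return Paths.get(rawPath).normalize().toString();
    }

    public static void main(String[] args) {
        String reported = "/Users/simpatico/apache-solr-3.1.0/solr/./"; // as printed in the log
        String configured = "/Users/simpatico/apache-solr-3.1.0/solr";  // as set in web.xml
        // Prints whether the two spellings name the same directory after normalization.
        System.out.println(normalize(reported).equals(normalize(configured)));
    }
}
```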

I've modified post.sh from the example docs to use the solr in
http://localhost:8080/apache-solr-3.1-SNAPSHOT and now finally data made it
to the index.
$ post.sh solr.xml monitor.xml

With nutch I'm at:

$ svn info
Path: .
URL: http://svn.apache.org/repos/asf/nutch/branches/branch-1.3
Repository Root: http://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: *1101459*
Node Kind: directory
Schedule: normal
Last Changed Author: markus
Last Changed Rev: 1101280
Last Changed Date: 2011-05-10 02:46:04 +0200 (Tue, 10 May 2011)

Does this work for you? All I've done is svn co nutch 1.3 and execute my
script which up to now worked.


On Tue, May 10, 2011 at 4:11 PM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:

 Hello,

 I'm having trouble getting Solr 3.1 to work with nutch-1.3. I'm not sure
 where the problem is, but I'm wondering why the solrHome path ends with
 /./.

 cwd=/Applications/NetBeans/apache-tomcat-7.0.6/bin
 SolrHome=/Users/simpatico/apache-solr-3.1.0/solr/./

 In the web.xml of solr:

 <env-entry>
    <env-entry-name>solr/home</env-entry-name>
    <env-entry-value>${user.home}/apache-solr-3.1.0/solr</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
 </env-entry>


 --
 Regards,
 K. Gabriele




-- 
Regards,
K. Gabriele



Re: SolrHome ends with /./ - is this normal?

2011-05-10 Thread Gabriele Kahlout
From solr logs:

May 10, 2011 4:33:20 PM org.apache.solr.common.SolrException log
*SEVERE: org.apache.solr.common.SolrException: ERROR:unknown field 'content'*
    at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:321)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:147)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:244)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.netbeans.modules.web.monitor.server.MonitorFilter.doFilter(MonitorFilter.java:393)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:244)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:550)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:380)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:243)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:188)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:166)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:288)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)


In Nutch's conf/schema.xml:
   <!-- fields for index-basic plugin -->
<field name="host" type="url" stored="false" indexed="true"/>
<field name="site" type="string" stored="false" indexed="true"/>
<field name="url" type="url" stored="true" indexed="true" required="true"/>
*<field name="content" type="text" stored="false" indexed="true"/>*

In conf/solrindex-mapping.xml:
<fields>
<field dest="content" source="content"/>

In recent Solr versions I think this has been renamed to text?

Solr's conf/schema.xml:
via copyField further on in this schema  -->
*   <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>*
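
The mismatch above (Nutch sends a field named content, Solr's stock schema
only declares text) can be checked mechanically. The sketch below is a
hypothetical helper, not part of Solr or Nutch: it parses a schema fragment
with the JDK's XML parser and lists the declared field names. The two inline
schema strings are trimmed-down assumptions based on the snippets quoted
above, not the full files.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class SchemaFieldCheck {
    // Returns the name attribute of every <field> element in the given schema XML.
    static Set<String> declaredFields(String schemaXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(schemaXml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("field");
            Set<String> names = new LinkedHashSet<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse schema XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed fragments, reduced to the fields discussed in this thread.
        String nutchSchema = "<schema><fields>"
                + "<field name=\"url\" type=\"url\" stored=\"true\" indexed=\"true\"/>"
                + "<field name=\"content\" type=\"text\" stored=\"false\" indexed=\"true\"/>"
                + "</fields></schema>";
        String solrExampleSchema = "<schema><fields>"
                + "<field name=\"text\" type=\"text\" indexed=\"true\" stored=\"false\" multiValued=\"true\"/>"
                + "</fields></schema>";

        String sentByNutch = "content";
        System.out.println("Nutch schema declares it: " + declaredFields(nutchSchema).contains(sentByNutch));
        System.out.println("Solr example schema declares it: " + declaredFields(solrExampleSchema).contains(sentByNutch));
    }
}
```

A document field that the active schema does not declare is what produces the
"unknown field" error in the log above.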

On Tue, May 10, 2011 at 4:30 PM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:

 It apparently is normal, and my issue is indeed with nutch.

 I've modified post.sh from the example docs to use the solr in
 http://localhost:8080/apache-solr-3.1-SNAPSHOT and now finally data made
 it to the index.
 $ post.sh solr.xml monitor.xml

 With nutch I'm at:

 $ svn info
 Path: .
 URL: http://svn.apache.org/repos/asf/nutch/branches/branch-1.3
 Repository Root: http://svn.apache.org/repos/asf
 Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
 Revision: *1101459*
 Node Kind: directory
 Schedule: normal
 Last Changed Author: markus
 Last Changed Rev: 1101280
 Last Changed Date: 2011-05-10 02:46:04 +0200 (Tue, 10 May 2011)

 Does this work for you? All I've done is svn co nutch 1.3 and execute my
 script which up to now worked.




Re: SolrHome ends with /./ - is this normal?

2011-05-10 Thread Gabriele Kahlout
I don't follow. Are you talking about conf/schema.xml? That's what I'm
referring to. Am I supposed to do something with Nutch's
conf/schema.xml?

On Tue, May 10, 2011 at 4:46 PM, Markus Jelsma
markus.jel...@openindex.io wrote:

 There is a working example schema in Nutch's conf directory.


Re: SolrHome ends with /./ - is this normal?

2011-05-10 Thread Gabriele Kahlout
You mean that I should copy it from nutch into solr?

$ cp $NUTCH_HOME/conf/schema.xml $SOLR_HOME/conf/schema.xml

After restarting Tomcat and re-executing the script, nothing changed.

On Tue, May 10, 2011 at 5:35 PM, Markus Jelsma
markus.jel...@openindex.io wrote:

 You need to use the schema.xml shipped with Nutch in Solr. It provides most
 fields that you need.


Re: SolrHome ends with /./ - is this normal?

2011-05-10 Thread Gabriele Kahlout
Actually, something did change: I managed to crawl and index some pages (the
others must have to do with regex-urls). Thank you!

Was this always necessary? Any pointers to a discussion of why it's needed?

On Tue, May 10, 2011 at 5:40 PM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:

 You mean that I should copy it from nutch into solr?

 $ cp $NUTCH_HOME/conf/schema.xml $SOLR_HOME/conf/schema.xml

 After restarting tomcat, and re-executing the script nothing changed.


 On Tue, May 10, 2011 at 5:35 PM, Markus Jelsma markus.jel...@openindex.io
 wrote:

 You need to use the schema.xml shipped with Nutch in Solr. It provides most
 fields that you need.


Re: Solr 4.0

2011-05-09 Thread Gabriele Kahlout
REPOST as a more general question about ivy dependencies:
http://stackoverflow.com/questions/5941789/do-ivy-dependency-revisions-have-anything-to-do-with-svns


On Mon, May 9, 2011 at 11:31 AM, Gabriele Kahlout
gabri...@mysimpatico.com wrote:

 I think you are talking about this dependency:

 <dependency org="org.apache.solr" name="solr-solrj" *rev="1.4.1"*
 conf="*->default" />

 I've checked out solr 4 svn revision 1099940[1]. What value should I use
 for rev?

 [1]
 http://lucene.472066.n3.nabble.com/Is-it-possible-to-build-Solr-as-a-maven-project-tp2898068p2905051.html


 On Tue, Apr 19, 2011 at 2:48 PM, Julien Nioche 
 lists.digitalpeb...@gmail.com wrote:

 You need to change the version of SOLR in ivy/ivy.xml then rebuild unless
 you change the jars straight in to nutch-1.3/runtime/local/lib - assuming
 that you're running Nutch locally only

 On 19 April 2011 07:09, Haspadar haspa...@gmail.com wrote:

  Yes, it occurred after removing the SolrJ 1.4 jar and copying in the 4.0
  version. Before that, I upgraded Nutch for Solr 3.1 the same way and
  everything worked fine.
 
  Thanks
 
  2011/4/19 Markus Jelsma markus.jel...@openindex.io
 
   Hi,
  
 Hello.
 I'm using Nutch 1.3. I decided to upgrade Solr to version 4.0 and I
 replaced the Nutch libs (Snapshot and SolrJ) with the ones from the Solr
 dist. After that I got this error in SolrIndexer at the Reduce stage:

 11/04/19 01:47:19 INFO mapred.JobClient:  map 100% reduce 27%
 11/04/19 01:47:21 INFO mapred.JobClient: Task Id :
 attempt_201104190142_0009_r_00_0, Status : FAILED
 org.apache.solr.common.SolrException: ERROR: [doc=http://www.site.net/]
 Error adding field 'tstamp'='2011-04-18T22:45:17.404Z'

 ERROR: [doc=http://www.site.net/] Error adding field
 'tstamp'='2011-04-18T22:45:17.404Z'

 request: http://127.0.0.1:8983/solr/update?wt=javabin&version=2
     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:436)
     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:245)
     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:50)
     at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:75)
     at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
     at org.apache.hadoop.mapred.Child.main(Child.java:170)

   If you are using Solr > 1.4.x then you must upgrade the SolrJ jars in
   Nutch. Solr 1.4.x and higher are not compatible. Just remove the 1.4.x
   jars and copy over the new ones.

 I tried to remove tstamp from solrindex-mapping.xml and Solr's schema.xml.
 But this field is required in schema.xml and I got the error:

 11/04/19 01:58:03 INFO mapred.JobClient: Task Id :
 attempt_201104190142_0010_r_00_0, Status : FAILED
 org.apache.solr.common.SolrException: ERROR: [doc=http://www.site.net/]
 unknown field 'tstamp'

 ERROR: [doc=http://www.site.net/] unknown field 'tstamp'
  
   Removing a mapping doesn't mean the field isn't copied over. All unmapped
   fields are copied as is. The example mapping seems rather useless as it
   copies exact field names. It's only useful if your source fields and
   destination fields are actually different, which is usually not the case if
   you dedicate a Solr core for a Nutch crawl.

   You must either not create the field by some plugin or add the field to
   your Solr index.

   I'm surprised this error actually showed up considering the incompatible
   Javabin versions. Perhaps you already upgraded the SolrJ api?
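
The mapping behaviour described above can be pictured with a small sketch. This is a toy model under stated assumptions, not Nutch's actual code: the class name FieldMappingSketch and the method applyMapping are made up for illustration. It only shows why deleting the tstamp entry from the mapping does not stop tstamp from reaching Solr.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the behaviour described above; NOT Nutch's actual code.
class FieldMappingSketch {
    /**
     * Renames fields that have a mapping entry and copies every other
     * field through unchanged, so an unmapped field still reaches the index.
     */
    static Map<String, Object> applyMapping(Map<String, Object> doc,
                                            Map<String, String> mapping) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : doc.entrySet()) {
            // No mapping entry means the source name is used as the destination name.
            String dest = mapping.getOrDefault(field.getKey(), field.getKey());
            out.put(dest, field.getValue());
        }
        return out;
    }
}
```

With an empty mapping, a document containing tstamp still comes out with a tstamp field, which matches the "unknown field 'tstamp'" error once the field is gone from schema.xml.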
  
   
How can I upgrade Solr to version 4?
   
Thank you.
  
 







-- 
Regards,
K. Gabriele


Why is org.apache.solr.response.XMLWriter final?

2011-05-05 Thread Gabriele Kahlout
Hello,

The class is declared final in trunk, and has been ever since it was first
committed in 2006 (revision 372455). Why?
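
For context, the usual workaround when a class is final is to wrap it rather than extend it. The sketch below illustrates that idea only; FinalXmlWriter is a hypothetical stand-in and not Solr's XMLWriter, and FoobarXmlWriter and render are made-up names.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical stand-in for a final library class; NOT Solr's XMLWriter.
final class FinalXmlWriter {
    void writeResponse(Writer out, String body) throws IOException {
        out.write("<response>" + body + "</response>");
    }
}

// Wraps the final class (composition) instead of extending it.
class FoobarXmlWriter {
    private final FinalXmlWriter delegate = new FinalXmlWriter();

    void writeResponse(Writer out, String body) throws IOException {
        out.write("<foobar>");
        delegate.writeResponse(out, body); // reuse the final class's output as-is
        out.write("</foobar>");
    }

    /** Convenience helper that renders into a String. */
    static String render(String body) {
        StringWriter sw = new StringWriter();
        try {
            new FoobarXmlWriter().writeResponse(sw, body);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return sw.toString();
    }
}
```

Wrapping only helps when the extra output goes around what the final class writes; it cannot change what the class emits internally, which is where subclassing would be needed.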

-- 
Regards,
K. Gabriele



How do I debug "Unable to evaluate expression using this context" printed at start?

2011-05-05 Thread Gabriele Kahlout
I've tried to re-install solr on tomcat, and now when I launch tomcat in
debug mode I see the following exception relating to solr. It's not enough
to understand the problem (and fix it), but I don't know where to look for
more (or what to do). Please help me.

Following the tutorial and discussion here, this is my context descriptor
(solr.xml):

<?xml version="1.0" encoding="utf-8"?>
<Context docBase="/Users/simpatico/SOLR_HOME/dist/solr.war" debug="0"
         crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="/Users/simpatico/SOLR_HOME" override="true"/>
</Context>

(the war exists)
$ ls $SOLR_HOME/dist/solr.war
/Users/simpatico/SOLR_HOME//dist/solr.war

$ ls $SOLR_HOME/conf/solrconfig.xml
/Users/simpatico/SOLR_HOME//conf/solrconfig.xml

When Tomcat starts:

INFO: Using JNDI solr.home: /Users/simpatico/SOLR_HOME
May 5, 2011 2:46:50 PM org.apache.solr.core.SolrResourceLoader init
INFO: Solr home set to '/Users/simpatico/SOLR_HOME/'
...
INFO: Adding 'file:/Users/simpatico/SOLR_HOME/lib/wstx-asl-3.2.7.jar' to
classloader
May 5, 2011 2:46:50 PM org.apache.solr.common.SolrException log
SEVERE:
*javax.xml.transform.TransformerException: Unable to evaluate expression
using this context*
at com.sun.org.apache.xpath.internal.XPath.execute(XPath.java:363)
at
com.sun.org.apache.xpath.internal.jaxp.XPathImpl.eval(XPathImpl.java:213)
at
com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:275)
at
org.apache.solr.core.CoreContainer.readProperties(CoreContainer.java:303)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:242)
at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
at
org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
at
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
at
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
at
org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:98)
at
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
at
org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
at
org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.RuntimeException: Unable to evaluate expression using
this context
at
com.sun.org.apache.xpath.internal.axes.NodeSequence.setRoot(NodeSequence.java:212)
at
com.sun.org.apache.xpath.internal.axes.LocPathIterator.execute(LocPathIterator.java:210)
at com.sun.org.apache.xpath.internal.XPath.execute(XPath.java:335)
... 18 more
-
java.lang.RuntimeException: Unable to evaluate expression using this context
at
com.sun.org.apache.xpath.internal.axes.NodeSequence.setRoot(NodeSequence.java:212)
at
com.sun.org.apache.xpath.internal.axes.LocPathIterator.execute(LocPathIterator.java:210)
at com.sun.org.apache.xpath.internal.XPath.execute(XPath.java:335)
at
com.sun.org.apache.xpath.internal.jaxp.XPathImpl.eval(XPathImpl.java:213)
at
com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:275)
at
org.apache.solr.core.CoreContainer.readProperties(CoreContainer.java:303)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:242)
at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
at
org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
at
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
at
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
at
org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:98)
at
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
at
org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
at
org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
--- 

Re: Is it possible to build Solr as a maven project?

2011-05-05 Thread Gabriele Kahlout
Okay, that sequence worked, but then shouldn't I be able to do $ mvn install
afterwards? This is what I get:

...
Compiling 478 source files to /Users/simpatico/debug/solr4/solr/build/solr
-
COMPILATION ERROR :
-
org/apache/solr/spelling/suggest/fst/InputStreamDataInput.java:[7,27]
package com.google.common.io does not exist
org/apache/solr/spelling/suggest/fst/FSTLookup.java:[28,32] package
com.google.common.collect does not exist
org/apache/solr/spelling/suggest/fst/FSTLookup.java:[29,27] package
com.google.common.io does not exist
org/apache/solr/spelling/suggest/fst/InputStreamDataInput.java:[29,4] cannot
find symbol
symbol  : variable ByteStreams
location: class org.apache.solr.spelling.suggest.fst.InputStreamDataInput
org/apache/solr/spelling/suggest/fst/FSTLookup.java:[128,57] cannot find
symbol
symbol  : variable Lists
location: class org.apache.solr.spelling.suggest.fst.FSTLookup
org/apache/solr/spelling/suggest/fst/FSTLookup.java:[170,26] cannot find
symbol
symbol  : variable Lists
location: class org.apache.solr.spelling.suggest.fst.FSTLookup
org/apache/solr/spelling/suggest/fst/FSTLookup.java:[203,35] cannot find
symbol
symbol  : variable Lists
location: class org.apache.solr.spelling.suggest.fst.FSTLookup
org/apache/solr/spelling/suggest/fst/FSTLookup.java:[529,6] cannot find
symbol
symbol  : variable Closeables
location: class org.apache.solr.spelling.suggest.fst.FSTLookup
org/apache/solr/spelling/suggest/fst/FSTLookup.java:[551,6] cannot find
symbol
symbol  : variable Closeables
location: class org.apache.solr.spelling.suggest.fst.FSTLookup
9 errors
-

Reactor Summary:

Grandparent POM for Apache Lucene Java and Apache Solr  SUCCESS [13.255s]
Lucene parent POM . SUCCESS [0.199s]
Lucene Core ... SUCCESS [15.528s]
Lucene Test Framework . SUCCESS [4.657s]
Lucene Common Analyzers ... SUCCESS [16.770s]
Lucene Contrib Ant  SUCCESS [1.103s]
Lucene Contrib bdb  SUCCESS [0.883s]
Lucene Contrib bdb-je . SUCCESS [0.872s]
Lucene Database aggregator POM  SUCCESS [0.091s]
Lucene Demo ... SUCCESS [0.842s]
Lucene Memory . SUCCESS [0.726s]
Lucene Queries  SUCCESS [1.559s]
Lucene Highlighter  SUCCESS [3.007s]
Lucene InstantiatedIndex .. SUCCESS [1.224s]
Lucene Lucli .. SUCCESS [1.579s]
Lucene Miscellaneous .. SUCCESS [1.163s]
Lucene Query Parser ... SUCCESS [4.274s]
Lucene Spatial  SUCCESS [1.159s]
Lucene Spellchecker ... SUCCESS [0.841s]
Lucene Swing .. SUCCESS [1.177s]
Lucene Wordnet  SUCCESS [0.816s]
Lucene XML Query Parser ... SUCCESS [1.197s]
Lucene Contrib aggregator POM . SUCCESS [0.079s]
Lucene ICU Analysis Components  SUCCESS [1.494s]
Lucene Phonetic Filters ... SUCCESS [0.759s]
Lucene Smart Chinese Analyzer . SUCCESS [3.534s]
Lucene Stempel Analyzer ... SUCCESS [1.537s]
Lucene Analysis Modules aggregator POM  SUCCESS [0.081s]
Lucene Benchmark .. SUCCESS [3.693s]
Lucene Modules aggregator POM . SUCCESS [0.147s]
Apache Solr parent POM  SUCCESS [0.099s]
Apache Solr Solrj . SUCCESS [3.670s]
Apache Solr Core .. FAILURE [7.842s]

On Thu, May 5, 2011 at 3:36 PM, Steven A Rowe sar...@syr.edu wrote:

 Hi Gabriele,

 The sequence should be

 1. svn update
 2. ant get-maven-poms
 3. mvn -N -Pbootstrap install

 I think you left out #2 - there was a very recent change to the POMs that
 affects the noggit jar name.

 Steve

  -Original Message-
  From: Gabriele Kahlout [mailto:gabri...@mysimpatico.com]
  Sent: Thursday, May 05, 2011 1:22 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Is it possible to build Solr as a maven project?
 
  Thank you so much for this gem, David!
 
  I still don't manage to build though:
  $ svn update
  At revision 1099684.
 
  $ mvn clean
 
  $ mvn -N -Pbootstrap install
 
  [INFO]
  
  [INFO] BUILD FAILURE
  [INFO

Re: How do I debug "Unable to evaluate expression using this context" printed at start?

2011-05-05 Thread Gabriele Kahlout
While the question remains valid, I found the reason for my problem.
While backing up, I had saved Tomcat's context descriptor file (solr.xml) in
my $SOLR_HOME, and Solr was trying to read it as its own solr.xml, as
described on the CoreAdmin wiki page (http://wiki.apache.org/solr/CoreAdmin).

What saved me was remembering Chris's earlier remark
(http://markmail.org/thread/3y4zqieyjqfi5vl3). Thank you Chris!
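
For anyone hitting the same exception, a quick check is to look at the root element of whatever solr.xml sits in $SOLR_HOME. The sketch below is a minimal illustration using only the JDK's XML parser; the class SolrXmlCheck and its methods are made up for this example and are not part of Solr, and the assumption that a multicore descriptor has a root element named solr comes from the CoreAdmin wiki page.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical diagnostic helper; not part of Solr.
class SolrXmlCheck {
    /** Returns the name of the root element of the given XML text. */
    static String rootElementName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalArgumentException("not parseable as XML", e);
        }
    }

    /** Assumption: a Solr multicore descriptor has <solr> as its root element. */
    static boolean looksLikeSolrCoreDescriptor(String xml) {
        return "solr".equals(rootElementName(xml));
    }
}
```

A Tomcat context descriptor has Context as its root element, so it fails this check even though the file name is the same.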



Re: Is it possible to build Solr as a maven project?

2011-05-05 Thread Gabriele Kahlout
Steven, thank you!

$ mvn -DskipTests=true install
works!

[INFO] Reactor Summary:
[INFO]
[INFO] Grandparent POM for Apache Lucene Java and Apache Solr  SUCCESS
[13.142s]
[INFO] Lucene parent POM . SUCCESS [0.345s]
[INFO] Lucene Core ... SUCCESS [18.448s]
[INFO] Lucene Test Framework . SUCCESS [3.560s]
[INFO] Lucene Common Analyzers ... SUCCESS [7.739s]
[INFO] Lucene Contrib Ant  SUCCESS [1.265s]
[INFO] Lucene Contrib bdb  SUCCESS [1.332s]
[INFO] Lucene Contrib bdb-je . SUCCESS [1.321s]
[INFO] Lucene Database aggregator POM  SUCCESS [0.242s]
[INFO] Lucene Demo ... SUCCESS [1.813s]
[INFO] Lucene Memory . SUCCESS [2.412s]
[INFO] Lucene Queries  SUCCESS [2.275s]
[INFO] Lucene Highlighter  SUCCESS [2.985s]
[INFO] Lucene InstantiatedIndex .. SUCCESS [2.170s]
[INFO] Lucene Lucli .. SUCCESS [1.814s]
[INFO] Lucene Miscellaneous .. SUCCESS [1.998s]
[INFO] Lucene Query Parser ... SUCCESS [2.755s]
[INFO] Lucene Spatial  SUCCESS [1.314s]
[INFO] Lucene Spellchecker ... SUCCESS [1.535s]
[INFO] Lucene Swing .. SUCCESS [1.233s]
[INFO] Lucene Wordnet  SUCCESS [1.309s]
[INFO] Lucene XML Query Parser ... SUCCESS [1.483s]
[INFO] Lucene Contrib aggregator POM . SUCCESS [0.151s]
[INFO] Lucene ICU Analysis Components  SUCCESS [2.728s]
[INFO] Lucene Phonetic Filters ... SUCCESS [1.765s]
[INFO] Lucene Smart Chinese Analyzer . SUCCESS [3.709s]
[INFO] Lucene Stempel Analyzer ... SUCCESS [4.241s]
[INFO] Lucene Analysis Modules aggregator POM  SUCCESS [0.213s]
[INFO] Lucene Benchmark .. SUCCESS [2.926s]
[INFO] Lucene Modules aggregator POM . SUCCESS [0.307s]
[INFO] Apache Solr parent POM  SUCCESS [0.233s]
[INFO] Apache Solr Solrj . SUCCESS [3.780s]
[INFO] Apache Solr Core .. SUCCESS [9.693s]
[INFO] Apache Solr Search Server . SUCCESS [6.739s]
[INFO] Apache Solr Test Framework  SUCCESS [2.699s]
[INFO] Apache Solr Analysis Extras ... SUCCESS [3.868s]
[INFO] Apache Solr Clustering  SUCCESS [6.736s]
[INFO] Apache Solr DataImportHandler . SUCCESS [4.914s]
[INFO] Apache Solr DataImportHandler Extras .. SUCCESS [2.721s]
[INFO] Apache Solr DataImportHandler aggregator POM .. SUCCESS [0.253s]
[INFO] Apache Solr Content Extraction Library  SUCCESS [1.909s]
[INFO] Apache Solr - UIMA integration  SUCCESS [1.922s]
[INFO] Apache Solr Contrib aggregator POM  SUCCESS [0.211s]
[INFO]

[INFO] BUILD SUCCESS
[INFO]

[INFO] Total time: 2:18.040s
[INFO] Finished at: Thu May 05 20:39:09 CEST 2011
[INFO] Final Memory: 38M/90M
[INFO]


On Thu, May 5, 2011 at 6:53 PM, Steven A Rowe sar...@syr.edu wrote:

 Hi Gabriele,

 On 5/5/2011 at 9:57 AM, Gabriele Kahlout wrote:
  Okay, that sequence worked, but then shouldn't I be able to do $ mvn
  install afterwards? This is what I get:
 ...
  COMPILATION ERROR :
  -
  org/apache/solr/spelling/suggest/fst/InputStreamDataInput.java:[7,27]
  package com.google.common.io does not exist
  org/apache/solr/spelling/suggest/fst/FSTLookup.java:[28,32] package
  com.google.common.collect does not exist
 ...

 mvn install should work, but it doesn't - I can reproduce this error on
 my machine.  This is a bug in the Maven build.

 The nightly Lucene/Solr Maven build on Jenkins should have caught this
 compilation failure three weeks ago, when Dawid Weiss committed his work
 under https://issues.apache.org/jira/browse/SOLR-2378.  Unfortunately,
 the nightly builds were using the results of compilation under the Ant
 build, rather than compiling from scratch.  I have committed a fix to the
 nightly build script so this won't happen again.

 The Maven build bug is that the Solr-core Google Guava dependency was
 scoped as test-only.  Until SOLR-2378, that was true, but it is no longer.
  So

Re: How do I modify XMLWriter to write foobar?

2011-05-05 Thread Gabriele Kahlout
I've now tried to write my own QueryResponseWriter plugin[1], as a maven
project depending on Solr Core 3.1, which is the same version of Solr I've
installed. It seems I'm not able to get rid of some cache.


$ xmlstarlet sel -t -c /config/queryResponseWriter conf/solrconfig.xml
<queryResponseWriter name="xml" class="org.apache.solr.request.XMLResponseWriter"/>
<queryResponseWriter name="Test" class="com.mysimpatico.me.indexplugins.TestQueryResponseWriter" default="true"/>

Restarted tomcat after changing solrconfig.xml and placing indexplugins.jar
in $SOLR_HOME/
At tomcat boot:
INFO: Adding 'file:/Users/simpatico/SOLR_HOME/lib/IndexPlugins.jar' to
classloader

For both of the following URLs I get the output of an old version of the
plugin, and I don't understand why. At least the xml one should be different.
Why could this be, and how can I find out?
http://localhost:8080/solr/select?q=apache&wt=Test and
http://localhost:8080/solr/select?q=apache&wt=xml
XML Parsing Error: syntax error
Location: http://localhost:8080/solr/select?q=apache&wt=xml (//Test
Line Number 1, Column 1:
foobarresponseHeaderstatusQTimeparamsqapachewtxmlresponse00foobar
^

The new code for TestQueryResponseWriter [1] never seems to be executed: I
added a SEVERE log statement that doesn't appear in the Tomcat logs. Where
are those caches?

Thank you in advance.

[1]
package com.mysimpatico.me.indexplugins;

import java.io.*;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.solr.request.XMLResponseWriter;

/**
 * Test response writer: logs a message, then delegates to XMLResponseWriter.
 */
public class TestQueryResponseWriter extends XMLResponseWriter {

    @Override
    public void write(Writer writer,
            org.apache.solr.request.SolrQueryRequest request,
            org.apache.solr.response.SolrQueryResponse response) throws IOException {
        // Logged at SEVERE so it is not filtered out by the logging level.
        Logger.getLogger(TestQueryResponseWriter.class.getName())
                .log(Level.SEVERE, "Hello from TestQueryResponseWriter");
        super.write(writer, request, response);
    }
}
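
One way to track down a stale copy of a plugin is to ask the JVM where it actually loaded a class from. The sketch below uses only standard JDK calls; the class name WhereLoadedFrom is made up for this example, and calling it from inside the plugin (for instance in the log statement above) is a suggestion, not something the thread confirms was tried.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Solr.
class WhereLoadedFrom {
    /**
     * Returns the jar or directory a class was loaded from,
     * or null when the class has no code source (e.g. JDK bootstrap classes).
     */
    static URL locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? null : src.getLocation();
    }

    public static void main(String[] args) {
        // Prints the location this helper itself was loaded from.
        System.out.println(locationOf(WhereLoadedFrom.class));
    }
}
```

If the printed location is not the jar that was just rebuilt, an older copy elsewhere on the classpath (or in the container's work directory) is being picked up instead.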


On Thu, May 5, 2011 at 9:01 PM, Chris Hostetter hossman_luc...@fucit.orgwrote:


 : $ xmlstarlet sel -t -c /config/queryResponseWriter conf/solrconfig.xml
 : <queryResponseWriter name="xml" class="org.apache.solr.request.XMLResponseWriter" default="true"/>
 :
 : Now I comment the line in solrconfig.xml, and there's no more writer.
 : $ xmlstarlet sel -t -c /config/queryResponseWriter conf/solrconfig.xml
 :
 : I make a query, and the XMLResponseWriter is still in charge.
 : $ curl -L http://localhost:8080/solr/select?q=apache
 : <?xml version="1.0" encoding="UTF-8"?>

 ...

 Your example request is not specifying a wt param.

 in addition to the response writers declared in your solrconfig.xml, there
 are response writers that exist implicitly unless you define your own
 instances that override those names (xml, json, python, etc...)

 the real question is: what writer do you *want* to have used when no wt is
 specified?

 whatever the answer is: declare an instance of that writer with
 default="true" in your solrconfig.xml
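
The lookup Hoss describes can be modelled roughly as follows. This is a toy model under stated assumptions, not Solr's implementation: the class WriterRegistry, its methods and the string values are invented for illustration, and the fallback to the default writer for an unknown wt is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the lookup described above; NOT Solr's actual implementation.
class WriterRegistry {
    private final Map<String, String> writers = new HashMap<>();
    // The thread shows xml answering when nothing is declared as default.
    private String defaultName = "xml";

    WriterRegistry() {
        // Implicit writers that exist even if solrconfig.xml declares none.
        writers.put("xml", "implicit-xml");
        writers.put("json", "implicit-json");
        writers.put("python", "implicit-python");
    }

    /** A declaration in solrconfig.xml; overrides an implicit writer of the same name. */
    void declare(String name, String impl, boolean isDefault) {
        writers.put(name, impl);
        if (isDefault) {
            defaultName = name;
        }
    }

    /** Resolves the wt parameter; a missing wt selects the default writer. */
    String resolve(String wt) {
        String name = (wt == null) ? defaultName : wt;
        String impl = writers.get(name);
        // Assumption: an unknown wt falls back to the default writer.
        return impl != null ? impl : writers.get(defaultName);
    }
}
```

In this model, commenting out the only declared writer changes nothing for a request without wt, because the implicit xml writer is still registered and still the default.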


 -Hoss




-- 
Regards,
K. Gabriele



Re: Is it possible to build Solr as a maven project?

2011-05-05 Thread Gabriele Kahlout
Just for the reference.

$ svn update
At revision 1099940.

On Thu, May 5, 2011 at 9:14 PM, Steven A Rowe sar...@syr.edu wrote:

 You're welcome, I'm glad you got it to work. - Steve


Is it possible to build Solr as a maven project?

2011-05-04 Thread Gabriele Kahlout
Hello,

I'm trying to modify Solr and I think debugging will be very useful to
understand what's going on. Hence I'd like to use an IDE (NetBeans)
which automatically supports Maven projects. I see under src/maven
that there are templates, but I'm not sure how to use them to mavenize
the build/project. There is nothing on the Wiki. I've seen issue SOLR-19
and some older messages on the mailing list too.

Any instructions?


-- 
Regards,
K. Gabriele



Re: Is it possible to build Solr as a maven project?

2011-05-04 Thread Gabriele Kahlout
generate-maven-artifacts:
   [mkdir] Created dir: /Users/simpatico/SOLR_HOME/build/maven
   [mkdir] Created dir: /Users/simpatico/SOLR_HOME/dist/maven
[copy] Copying 1 file to
/Users/simpatico/SOLR_HOME/build/maven/src/maven
[artifact:install-provider] Installing provider:
org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2

*BUILD FAILED*
/Users/simpatico/SOLR_HOME/*build.xml:800*: The following error occurred
while executing this line:
/Users/simpatico/SOLR_HOME/common-build.xml:274: artifact:deploy doesn't
support the uniqueVersion attribute


build.xml:800: <m2-deploy pom.xml="src/maven/solr-parent-pom.xml.template"/>

After removing the uniqueVersion attribute:

generate-maven-artifacts:
[artifact:install-provider] Installing provider:
org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
[artifact:deploy] Deploying to file:///Users/simpatico/SOLR_HOME/dist/maven
[artifact:deploy] [INFO] Retrieving previous build number from remote
[artifact:deploy] [INFO] Retrieving previous metadata from remote
[artifact:deploy] [INFO] Uploading repository metadata for: 'artifact
org.apache.solr:solr-parent'
[artifact:deploy] [INFO] Retrieving previous metadata from remote
[artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot
org.apache.solr:solr-parent:1.4.2-SNAPSHOT'
 [copy] Copying 1 file to /Users/simpatico/SOLR_HOME/build/maven/lib
[artifact:install-provider] Installing provider:
org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
[artifact:deploy] Deploying to file:///Users/simpatico/SOLR_HOME/dist/maven
[artifact:deploy] [INFO] Retrieving previous build number from remote
[artifact:deploy] [INFO] Retrieving previous metadata from remote
[artifact:deploy] [INFO] Uploading repository metadata for: 'artifact
org.apache.solr:solr-commons-csv'
[artifact:deploy] [INFO] Retrieving previous metadata from remote
[artifact:deploy] [INFO] Uploading project information for solr-commons-csv
1.4.2-SNAPSHOT
[artifact:deploy] [INFO] Retrieving previous metadata from remote
[artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot
org.apache.solr:solr-commons-csv:1.4.2-SNAPSHOT'
 [copy] Copying 1 file to
/Users/simpatico/SOLR_HOME/build/maven/contrib/dataimporthandler
[artifact:install-provider] Installing provider:
org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2

BUILD FAILED
/Users/simpatico/SOLR_HOME/build.xml:809: The following error occurred while
executing this line:
*/Users/simpatico/SOLR_HOME/common-build.xml:274: artifact:deploy doesn't
support the nested attach element*

On Wed, May 4, 2011 at 11:50 AM, lboutros boutr...@gmail.com wrote:
 In the ant script there is a target to generate maven's artifacts.

 After that, you will be able to open the project as a standard maven
 project.

 Ludovic.




-- 
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains [LON] or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
 Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid

Re: Is it possible to build Solr as a maven project?

2011-05-04 Thread Gabriele Kahlout
On Wed, May 4, 2011 at 1:11 PM, lboutros boutr...@gmail.com wrote:

 Oops,

 sorry, that was not the target I used (it should work too, but...).
 The one I used is get-maven-poms, which just creates the pom files and
 copies them to their target locations.


I don't have a get-maven-poms target in my build script.
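
A minimal sketch of how one might check which targets a given build.xml
declares before invoking one. The helper names list_targets and has_target
are hypothetical, and the sketch assumes each <target> tag carries its name
attribute on the same line:

```shell
#!/bin/sh
# list_targets FILE: print the name of every <target> declared in an Ant
# build file (assumes name="..." sits on the same line as <target).
list_targets() {
  grep -o '<target[^>]*[[:space:]]name="[^"]*"' "$1" |
    sed 's/.*[[:space:]]name="\([^"]*\)".*/\1/'
}

# has_target FILE NAME: succeed if FILE declares a target called exactly NAME.
has_target() {
  list_targets "$1" | grep -qxF -- "$2"
}
```

For example, `has_target build.xml get-maven-poms && ant get-maven-poms`
runs the target only if the script actually defines it. Ant's own
-projecthelp option also prints a build file's documented targets.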


 I'm using NetBeans with the Automatic Projects plugin to do
 everything inside the IDE.

 Which version of Solr are you using?


The latest official release: 3.1

Maybe I can copy-paste from the build script you are using?



 Ludovic.

 2011/5/4 Gabriele Kahlout [via Lucene] 
 ml-node+2898211-2124746009-383...@n3.nabble.com

  generate-maven-artifacts:
 [mkdir] Created dir: /Users/simpatico/SOLR_HOME/build/maven
 [mkdir] Created dir: /Users/simpatico/SOLR_HOME/dist/maven
  [copy] Copying 1 file to
  /Users/simpatico/SOLR_HOME/build/maven/src/maven
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
 
  *BUILD FAILED*
  /Users/simpatico/SOLR_HOME/*build.xml:800*: The following error occurred
  while executing this line:
  /Users/simpatico/SOLR_HOME/common-build.xml:274: artifact:deploy doesn't
  support the uniqueVersion attribute
 
 
  build.xml:800: <m2-deploy pom.xml="src/maven/solr-parent-pom.xml.template"/>
 
  After removing the uniqueVersion attribute:
 
  generate-maven-artifacts:
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
  [artifact:deploy] Deploying to
 file:///Users/simpatico/SOLR_HOME/dist/maven
 
  [artifact:deploy] [INFO] Retrieving previous build number from remote
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'artifact
  org.apache.solr:solr-parent'
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot
  org.apache.solr:solr-parent:1.4.2-SNAPSHOT'
   [copy] Copying 1 file to /Users/simpatico/SOLR_HOME/build/maven/lib
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
  [artifact:deploy] Deploying to
 file:///Users/simpatico/SOLR_HOME/dist/maven
 
  [artifact:deploy] [INFO] Retrieving previous build number from remote
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'artifact
  org.apache.solr:solr-commons-csv'
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading project information for solr-commons-csv
  1.4.2-SNAPSHOT
  [artifact:deploy] [INFO] Retrieving previous metadata from remote
  [artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot
  org.apache.solr:solr-commons-csv:1.4.2-SNAPSHOT'
   [copy] Copying 1 file to
  /Users/simpatico/SOLR_HOME/build/maven/contrib/dataimporthandler
  [artifact:install-provider] Installing provider:
  org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-2
 
  BUILD FAILED
  /Users/simpatico/SOLR_HOME/build.xml:809: The following error occurred
  while executing this line:
  */Users/simpatico/SOLR_HOME/common-build.xml:274: artifact:deploy doesn't
  support the nested attach element*
 
 





