Problem with AND clause in multi core search query
Hi, I have 2 cores configured in my Solr instance. Both cores use the same schema. I have indexed column1 in core0 and column2 in core1. My search query is

http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:A AND column2:B

No result found.

http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:A OR column2:B

Is AND supported in multi-core search? Thanks, ravi -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: problem with date searching.
In fact I am able to see the scanneddate field when I add a query like this:

responseHeader: {
  q: "ibrahim.hamid 2012-02-02T04:00:52Z",
  qf: "userid scanneddate",
  wt: "json",
  defType: "dismax",
  version: "2.2",
  rows: 50
},
response: {
  numFound: 20, start: 0,
  docs: [
    { scanneddate: ["2012-02-02T04:00:52Z"], ... },
    ...
  ]
}

-- View this message in context: http://lucene.472066.n3.nabble.com/problem-with-date-searching-tp3961761p3983801.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: problem with date searching.
select/?defType=dismax&q=+ibrahim.hamid+2012-02-02T04:00:52Z&qf=+userid+scanneddate&version=2.2&start=0&rows=50&indent=on&wt=json&debugQuery=on -- View this message in context: http://lucene.472066.n3.nabble.com/problem-with-date-searching-tp3961761p3983802.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Problem with AND clause in multi core search query
The latter is supposed to work:

http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:A OR column2:B

The first query cannot work, because there is no document in either core0 or core1 that has A in field column1 AND B in field column2; there are only documents that have B in column2 (in core1) OR A in column1 (in core0).

Regards, Tommaso

2012/5/15 ravicv ravichandra...@gmail.com Hi, I have 2 cores configured in my Solr instance. Both cores use the same schema. I have indexed column1 in core0 and column2 in core1. My search query is http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:A AND column2:B No result found. http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:A OR column2:B Is AND supported in multi-core search? Thanks, ravi -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800.html Sent from the Solr - User mailing list archive at Nabble.com.
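Tommaso's point can be illustrated with a toy model of the two cores (hypothetical documents, assuming each document carries only the field indexed in its own core, as described in the thread):

```python
# Toy model: each shard holds documents with only one of the two fields
# populated, mirroring the setup described in the thread (hypothetical data).
core0 = [{"id": "1", "column1": "A"}]   # column1 is only indexed here
core1 = [{"id": "2", "column2": "B"}]   # column2 is only indexed here

def matches_and(doc):
    # q=column1:A AND column2:B requires BOTH fields on the SAME document
    return doc.get("column1") == "A" and doc.get("column2") == "B"

def matches_or(doc):
    # q=column1:A OR column2:B matches if EITHER field matches
    return doc.get("column1") == "A" or doc.get("column2") == "B"

all_docs = core0 + core1   # a sharded query unions per-shard results
and_hits = [d["id"] for d in all_docs if matches_and(d)]
or_hits = [d["id"] for d in all_docs if matches_or(d)]
print(and_hits)  # [] - no single document satisfies both clauses
print(or_hits)   # ['1', '2']
```

Sharding evaluates the whole query against each document individually, so AND across fields that never co-occur on one document can never match.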
Re: Show a portion of searchable text in Solr
Can somebody tell me where I should place the highlighting parameters? When I add them to the query, it is not working:

hl=true&hl.requireFieldMatch=true&hl.fl=*

FYI: I am new to Solr. My aim is to have emphasis tags on the queried words, and I need to display only the query-relevant snippet of the content. Thanks Shameema

On Mon, May 14, 2012 at 1:18 PM, Ahmet Arslan iori...@yahoo.com wrote: I have indexed very large documents; in some cases these documents have 100,000 characters. Is there a way to return a portion of the documents (say the first 300 characters) when I am querying Solr? Is there any attribute to set in schema.xml or solrconfig.xml to achieve this? I have a set-up with very large documents too. Here are two different solutions that I have used in the past:

1) Use highlighting with hl.alternateField and hl.maxAlternateFieldLength: http://wiki.apache.org/solr/HighlightingParameters

2) Create an extra field (indexed="false" and stored="true") using copyField, just for display purposes (fl=shortField):

<copyField source="largeField" dest="shortField" maxChars="300"/>

http://wiki.apache.org/solr/SchemaXml#Copy_Fields

Also, I didn't use it myself yet, but I *think* this can be accomplished by using a custom Transformer too: http://wiki.apache.org/solr/DocTransformers
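The copyField/maxChars approach keeps roughly the first N characters at index time. The same idea can be sketched client-side (a simple illustration only; Solr's copyField cuts at exactly maxChars, while this version also avoids cutting mid-word):

```python
def snippet(text, max_chars=300):
    """Return roughly the first max_chars characters, cut at a word
    boundary so the snippet does not end mid-word (client-side sketch;
    Solr's copyField maxChars performs a hard cut at max_chars)."""
    if len(text) <= max_chars:
        return text
    cut = text.rfind(" ", 0, max_chars)
    if cut == -1:          # no space found: fall back to a hard cut
        cut = max_chars
    return text[:cut] + "..."

print(snippet("word " * 100, 30))
```

Doing this at index time (option 2 in Ahmet's reply) is usually preferable, since the stored large field never needs to travel over the wire.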
Re: Problem with AND clause in multi core search query
Thanks Tommaso. Could you please tell me, is there any way to get this scenario to work?

http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:A AND column2:B

Is there any way we can achieve the scenario below? query: column1:A should be searched in core0 and column2:B should be searched in core1, and later the results from both queries should be combined with AND to give the final response, since both return a common field in the response.

For reference my schema is:

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="value" type="string" indexed="true" stored="true"/>
<field name="column1" type="string" indexed="true" stored="true"/>
<field name="column2" type="string" indexed="true" stored="true"/>

Thanks, Ravi -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800p3983806.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: problem with date searching.
If I use q=scanneddate:[2011-09-22T22:40:30Z TO 2012-02-02T01:30:52Z] it works fine, but when I try it with a dismax query it does not work. EX:

select/?defType=dismax&q=[2011-09-22T22:40:30Z TO 2012-02-02T01:30:52Z]&qf=scanneddate&version=2.2&start=0&rows=50&indent=on&wt=json&debugQuery=on

Please comment on the same. -- View this message in context: http://lucene.472066.n3.nabble.com/problem-with-date-searching-tp3961761p3983807.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SOLR Security
Thanks for the suggestions. I tried to use SolrJ within my Servlet, but the SolrJ QueryResponse does not return a well-formed JSON object. I need the JSON string with quotes, as below; QueryResponse.toString() doesn't return JSON with quotes at all:

jsonp1337064466204({responseHeader:{status:0,QTime:0,params:{json.wrf:jsonp1337064466204,facet:true,facet.mincount:1,q:*:*,facet.limit:-1,json.nl:map,facet.field:[title,abstract],wt:json,rows:0}},response:{numFound:0,start:0,docs:[]},facet_counts:{facet_queries:{},facet_fields:{title:{},abstract:{}},facet_dates:{},facet_ranges:{}}})

Regards Anupam

On Fri, May 11, 2012 at 7:56 PM, Welty, Richard rwe...@ltionline.com wrote: In fact, there's a sample proxy.php on the ajax-solr web page which can easily be modified into a security layer. My Solr servers only listen to requests issued by a narrow list of systems, and everything gets routed through a modified copy of the proxy.php file, which checks whether the user is logged in and adds terms to the query to limit returned results to those the user is permitted to see.

-Original Message- From: Jan Høydahl [mailto:j...@hoydahl.no] Sent: Fri 5/11/2012 9:45 AM To: solr-user@lucene.apache.org Subject: Re: SOLR Security

Hi, there is nothing stopping you from pointing Ajax-Solr to a URL on your app server, which acts as a security insulation layer between the Solr backend and the world. In this (thin) layer you can analyze the input and choose carefully what to let through and what not.

-- Jan Høydahl, search solution architect Cominvent AS - www.facebook.com/Cominvent Solr Training - www.solrtraining.com

On 11. mai 2012, at 06:37, Anupam Bhattacharya wrote: Yes, I agree with you. But the Ajax-Solr framework doesn't fit in that manner. Any alternative solution?
Anupam

On Fri, May 11, 2012 at 9:41 AM, Klostermeyer, Michael mklosterme...@riskexchange.com wrote: Instead of hitting the Solr server directly from the client, I think I would go through your application server, which has access to all the user data and can forward it to the Solr server, thereby hiding it from the client. Mike

-Original Message- From: Anupam Bhattacharya [mailto:anupam...@gmail.com] Sent: Thursday, May 10, 2012 9:53 PM To: solr-user@lucene.apache.org Subject: SOLR Security

I am using the Ajax-Solr framework for creating a search interface, and the search interface works well. In my case the results have document-level security, so indexing records with their authorized users lets me filter results per user based on the user's authentication. The problem is that I always have to pass a parameter userid={xyz} to the Solr server, which anyone can discover from the Solr URL (the Ajax call URL) using the Firebug Net console in Firefox, and can then change this parameter value to see other users' records which he/she is not authorized to see. Basically it is a cross-site scripting issue. I have read about some approaches to Solr security, like Nginx, or Jetty with .htaccess-based security. Overall, what I understand from this is that we can restrict users from doing update/delete operations on Solr, and we can restrict the Solr admin interface to certain IPs. But how can I restrict the {solr-server}/solr/select results from access by different user ids?
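The proxy-layer idea Richard and Jan describe can be sketched as a tiny request filter: the client-supplied userid (and any client-supplied fq) is discarded, and a server-enforced filter derived from the session is appended before the query is forwarded to Solr. Names and the parameter layout here are hypothetical; a real layer would live in your app server:

```python
from urllib.parse import urlencode

def build_solr_params(client_params, session_userid):
    """Rebuild the outgoing Solr query string: drop any client-supplied
    userid or fq and append a server-enforced filter instead (sketch)."""
    params = {k: v for k, v in client_params.items()
              if k not in ("userid", "fq")}      # never trust these from the client
    params["fq"] = "userid:%s" % session_userid  # enforced server-side
    return urlencode(params)

# A tampered request claiming someone else's userid is overridden:
qs = build_solr_params({"q": "report", "userid": "victim"}, "alice")
print(qs)  # q=report&fq=userid%3Aalice
```

The key design point: the security-relevant parameter never originates from the browser, so Firebug-style tampering has nothing to tamper with.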
Re: Boosting on field empty or not
Basically I want documents that have a given field populated to have a higher score than the documents that don't. So if you search for foo, I want documents that contain foo, but I want the documents that have field a populated to have a higher score...

Hi Donald, since you are using edismax, it is better to use bq (boost query) for this:

bq=reqularprice:[* TO *]^50

http://wiki.apache.org/solr/DisMaxQParserPlugin#bq_.28Boost_Query.29

defType=edismax&qf=nameSuggest^10 name^10 codeTXT^2 description^1 brand_search^0 cat_search^10&q=chairs&bq=reqularprice:[* TO *]^50
Query regarding multi core search
Hi, I want to configure 2 cores in my Solr instance. Now I want to query core0 with one query and core1 with a different query, and finally merge the results. Please suggest the best way to do this. Thanks, Ravi -- View this message in context: http://lucene.472066.n3.nabble.com/Query-regarding-multi-core-search-tp3983813.html Sent from the Solr - User mailing list archive at Nabble.com.
simple query help
Hi Can someone please give me some help with a simple query. If I search q=skcode:2021051 and flength:368.0 I get 1 document returned (doc A) If I search q=skcode:2021049 and ent_no:1040970907 I get 1 document returned (doc B) But if I search q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907 I get no documents returned. Shouldn't I get both docA and docB? Thanks, Peter
Re: Multi-words synonyms matching
Without reading the whole thread, let me say that you should not trust the Solr admin analysis page. It takes the whole multiword search and runs it all together at once through each analyzer step (factory). But this is not how the real system works.

First pitfall: the query parser also splits at whitespace (if it is not a phrase query). Because of this, a multiword query is sent chunk by chunk through the analyzer, and, second pitfall, each chunk runs through the whole analyzer on its own. So if you are dealing with multiword synonyms you have the following problems: either you turn your query into a phrase, so that the whole phrase is analyzed at once and therefore looked up as a multiword synonym (but phrase queries are not analyzed!!!), or you send your query chunk by chunk through the analyzer, but then the chunks are not multiwords anymore and are not found in your synonyms.txt.

From my experience I can say that it requires some deep work to get it done, but it is possible. I have connected a thesaurus to Solr which does query-time expansion (no need to reindex if the thesaurus changes). The thesaurus holds synonyms and used-for terms in 24 languages, so it is also some kind of language translation. And naturally the thesaurus translates from single-term to multi-term synonyms and vice versa. Regards, Bernd

Am 14.05.2012 13:54, schrieb elisabeth benoit: Just for the record, I'd like to conclude this thread. First, you were right, there was no behaviour difference between the fq and q parameters. I realized that: 1) my synonym (hotel de ville) has a stopword in it (de), and since I used tokenizerFactory=solr.KeywordTokenizerFactory in my synonyms declaration, there was no stopword removal in the indexed expression; so when requesting hotel de ville, after stopword removal in the query, Solr was comparing hotel de ville with hotel ville. But my queries never even got to that point, since 2) I made a mistake using mairie alone in the admin interface when testing my schema.
The real field was something like collectivités territoriales mairie, so the synonym hotel de ville was not even applied, because the tokenizerFactory=solr.KeywordTokenizerFactory in my synonym definition does not split the field into words when parsing. So my problem is not solved, and I'm considering solving it outside of Solr's scope, unless someone else has a clue. Thanks again, Elisabeth

2012/4/25 Erick Erickson erickerick...@gmail.com A little farther down the debug info output you'll find something like this (I specified fq=name:features):

<arr name="parsed_filter_queries"><str>name:features</str></arr>

so it may well give you some clue. But unless I'm reading things wrong, your q goes against a field that has much more information than the CATEGORY_ANALYZED field; is it possible that the data from your test cases simply isn't _in_ CATEGORY_ANALYZED? Best Erick

On Wed, Apr 25, 2012 at 9:39 AM, elisabeth benoit elisaelisael...@gmail.com wrote: I'm not at the office until next Wednesday and I don't have my Solr at hand, but isn't debugQuery=on giving information only about q parameter matching and nothing about the fq parameter? Or do you mean parsed_filter_queries gives information about fq? CATEGORY_ANALYZED is populated by a copyField instruction in schema.xml and has the same field type as my catchall field, the search field for my searchHandler (the one used by the q parameter). CATEGORY (a string) is copied into CATEGORY_ANALYZED (field type is text). CATEGORY (a string) is also copied into the catchall field (field type is text), along with a lot of other fields. So as far as I can see, the same analysis should be done in both cases, but obviously I'm missing something, and the only thing I can think of is a different behavior between the q and fq parameters. I'll check that parsed_filter_queries first thing in the morning next Wednesday. Thanks a lot for your help.
Elisabeth

2012/4/24 Erick Erickson erickerick...@gmail.com Elisabeth: What shows up in the debug section of the response when you add debugQuery=on? There should be some part of that section like: parsed_filter_queries. My other question is: are you absolutely sure that your CATEGORY_ANALYZED field has the correct content? How does it get populated? Nothing jumps out at me here. Best Erick

On Tue, Apr 24, 2012 at 9:55 AM, elisabeth benoit elisaelisael...@gmail.com wrote: yes, thanks, but this is NOT my question. I was wondering why I have multiple matches with q=hotel de ville and no match with fq=CATEGORY_ANALYZED:hotel de ville, since in both cases I'm searching in the same Solr fieldType. Why is the q parameter behaving differently in that case? Why do the quotes work in one case and not in the other? Does anyone know? Thanks, Elisabeth

2012/4/24 Jeevanandam je...@myjeeva.com usage of q and fq: q is typically the main query for the search
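The query-time expansion Bernd describes, matching multiword synonym entries before the parser splits on whitespace, can be sketched as a longest-match scan over the raw query string. The synonym entries here are toy data, and a real implementation would handle quoting and overlaps more carefully:

```python
# Toy query-time expansion: find multiword synonym keys in the raw query
# BEFORE whitespace splitting, and expand them into an OR group (sketch).
SYNONYMS = {
    "hotel de ville": ["mairie"],   # hypothetical thesaurus entries
    "laptop": ["notebook"],
}

def expand(query):
    out = query
    # Match longer keys first so "hotel de ville" wins over any sub-phrase.
    for key in sorted(SYNONYMS, key=len, reverse=True):
        if key in out:
            alternatives = ['"%s"' % key] + ['"%s"' % s for s in SYNONYMS[key]]
            out = out.replace(key, "(" + " OR ".join(alternatives) + ")")
    return out

print(expand("hotel de ville paris"))
# ("hotel de ville" OR "mairie") paris
```

Quoting each alternative keeps multiword synonyms as phrases, sidestepping the whitespace-splitting pitfall described above; the expanded string is then handed to the normal query parser.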
query with DATE FIELD AND RANGE query using dismax
Hi, my queries are working with the standard query handler but not in dismax.

*It is working fine:* EX: q=scanneddate:[2012-02-02T01:30:52Z TO 2011-09-22T22:40:30Z]

*Not working:* EX: defType=dismax&q=[2012-02-02T01:30:52Z TO 2011-09-22T22:40:30Z]&qf=scanneddate

How can I check for date ranges using Solr's dismax query handler? -- View this message in context: http://lucene.472066.n3.nabble.com/query-with-DATE-FIELD-AND-RANGE-query-using-dismax-tp3983819.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: authentication for solr admin page?
I have written an article on this, covering the various steps to restrict / authenticate the Solr admin interface: http://www.findbestopensource.com/article-detail/restrict-solr-admin-access Regards Aditya www.findbestopensource.com

On Thu, Mar 29, 2012 at 1:06 AM, geeky2 gee...@hotmail.com wrote: update - ok - I was reading about replication here: http://wiki.apache.org/solr/SolrReplication and noticed comments in the solrconfig.xml file related to HTTP Basic Authentication and the usage of the following tags:

<str name="httpBasicAuthUser">username</str>
<str name="httpBasicAuthPassword">password</str>

*Can I place these tags in the request handler to achieve an authentication scheme for the /admin page?*

// snipped from the solrconfig.xml file
<requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers"/>

thanks for any help, mark -- View this message in context: http://lucene.472066.n3.nabble.com/authentication-for-solr-admin-page-tp3865665p3865747.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: query with DATE FIELD AND RANGE query using dismax
Hi, you can't. Try eDisMax instead: http://wiki.apache.org/solr/ExtendedDisMax -- Jan Høydahl, search solution architect Cominvent AS - www.facebook.com/Cominvent Solr Training - www.solrtraining.com

On 15. mai 2012, at 11:05, ayyappan wrote: Hi, my queries are working with the standard query handler but not in dismax. *It is working fine:* EX: q=scanneddate:[2012-02-02T01:30:52Z TO 2011-09-22T22:40:30Z] *Not working:* EX: defType=dismax&q=[2012-02-02T01:30:52Z TO 2011-09-22T22:40:30Z]&qf=scanneddate How can I check for date ranges using Solr's dismax query handler? -- View this message in context: http://lucene.472066.n3.nabble.com/query-with-DATE-FIELD-AND-RANGE-query-using-dismax-tp3983819.html Sent from the Solr - User mailing list archive at Nabble.com.
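Besides switching to eDisMax, a common workaround is to keep dismax for the free-text part and move the range into an fq, which is parsed by the standard lucene parser. A sketch of building such a request URL (host, field names, and the free-text term are taken from the thread; the exact parameter layout here is an assumption):

```python
from urllib.parse import urlencode

# dismax cannot parse range syntax in q, so the range goes into fq,
# which always uses the standard lucene query parser.
params = {
    "defType": "dismax",
    "q": "ibrahim.hamid",                 # free-text part stays in q
    "qf": "userid",
    "fq": "scanneddate:[2011-09-22T22:40:30Z TO 2012-02-02T01:30:52Z]",
    "wt": "json",
    "rows": 50,
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

As a bonus, putting the range in fq makes it cacheable in the filter cache, which is usually what you want for date windows anyway.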
Re: adding an OR to a fq makes some doc that matched not match anymore
That does not change the results for me:

suggest?q=suggest_terms:lap*&fq=type:P&fq=((-type:B))&debugQuery=true
found 1

suggest?q=suggest_terms:lap*&fq=type:P&fq=((-type:B)+OR+name:aa)&debugQuery=true
found 0

Looks like a bug? xab -- View this message in context: http://lucene.472066.n3.nabble.com/adding-an-OR-to-a-fq-makes-some-doc-that-matched-not-match-anymore-tp3983775p3983828.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: adding an OR to a fq makes some doc that matched not match anymore
That does not change the results for me:

suggest?q=suggest_terms:lap*&fq=type:P&fq=((-type:B))&debugQuery=true
found 1

suggest?q=suggest_terms:lap*&fq=type:P&fq=((-type:B)+OR+name:aa)&debugQuery=true
found 0

A negative clause combined with an OR clause does not work like this. fq=+*:* -type:B name:aa should work.
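The behavior can be modeled with set semantics: a purely negative query has nothing to subtract from unless it is top-level, where Solr implicitly rewrites it as *:* minus the negated clause; inside an OR group that rewrite does not happen, so the negative clause contributes no documents. A toy simulation with hypothetical documents:

```python
# Hypothetical docs standing in for the suggest index in the thread.
docs = [
    {"id": 1, "type": "P", "name": "laptop"},
    {"id": 2, "type": "B", "name": "aa"},
]
all_ids = {d["id"] for d in docs}

def match(field, value):
    return {d["id"] for d in docs if d.get(field) == value}

fq_type_P = match("type", "P")                     # fq=type:P -> {1}

# fq=((-type:B)): a TOP-LEVEL pure-negative query is rewritten to *:* -type:B
fq_neg = all_ids - match("type", "B")
print(fq_type_P & fq_neg)                          # found 1

# fq=((-type:B) OR name:aa): inside the OR the negative clause is no longer
# top-level and contributes NO documents; only name:aa does
fq_neg_or = set() | match("name", "aa")
print(fq_type_P & fq_neg_or)                       # found 0

# Ahmet's fix: fq=+*:* -type:B name:aa (required match-all, then exclude B;
# name:aa is merely an optional SHOULD clause)
fq_fixed = all_ids - match("type", "B")
print(fq_type_P & fq_fixed)                        # found 1 again
```

Making *:* an explicit required clause gives the negative clause something to subtract from, which is exactly what the suggested fq does.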
Re: simple query help
Hi, you should use parentheses, have you tried that? q=(skcode:2021051 and flength:368.0) or (skcode:2021049 and ent_no:1040970907) http://robotlibrarian.billdueber.com/solr-and-boolean-operators/ Bye, Andras

2012/5/15 Peter Kirk p...@alpha-solutions.dk Hi Can someone please give me some help with a simple query. If I search q=skcode:2021051 and flength:368.0 I get 1 document returned (doc A) If I search q=skcode:2021049 and ent_no:1040970907 I get 1 document returned (doc B) But if I search q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907 I get no documents returned. Shouldn't I get both docA and docB? Thanks, Peter
RE: simple query help
Hi - thanks for the response. Yes I have tried with parentheses, to group as you suggest. It doesn't make a difference. But now I'm thinking there's something completely odd - and I wonder if it's necessary to use a special search-handler to achieve what I want. For example, if I execute q=(skcode:2021051 AND flength:368.0) I get no results. If I omit the parentheses, I get 1 result. (Let alone trying to combine several Boolean clauses). /Peter -Original Message- From: András Bártházi [mailto:and...@barthazi.hu] Sent: 15. maj 2012 12:51 To: solr-user@lucene.apache.org Subject: Re: simple query help Hi, You should use parantheses, have you tried that? q=(skcode:2021051 and flength:368.0) or (skcode:2021049 and ent_no:1040970907) http://robotlibrarian.billdueber.com/solr-and-boolean-operators/ Bye, Andras 2012/5/15 Peter Kirk p...@alpha-solutions.dk Hi Can someone please give me some help with a simple query. If I search q=skcode:2021051 and flength:368.0 I get 1 document returned (doc A) If I search q=skcode:2021049 and ent_no:1040970907 I get 1 document returned (doc B) But if I search q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907 I get no documents returned. Shouldn't I get both docA and docB? Thanks, Peter
RE: simple query help
It doesn't make a difference. But now I'm thinking there's something completely odd - and I wonder if it's necessary to use a special search-handler to achieve what I want. For example, if I execute q=(skcode:2021051 AND flength:368.0) I get no results. If I omit the parentheses, I get 1 result. (Let alone trying to combine several Boolean clauses). Which query parser are you using?
Re: - Solr 4.0 - How do I enable JSP support ? ...
What do you mean by JSP support? What is it you're trying to do with JSP? What servlet container are you using? Details matter. Best Erick On Mon, May 14, 2012 at 5:34 PM, Naga Vijayapuram nvija...@tibco.com wrote: Hello, How do I enable JSP support in Solr 4.0 ? Thanks Naga
Re: document cache
Yes. In fact, all the caches get flushed on every commit/replication cycle. Some of the caches get autowarmed when a new searcher is opened, which happens... you guessed it... every time a commit/replication happens. Best Erick

On Tue, May 15, 2012 at 1:32 AM, shinkanze rajatrastogi...@gmail.com wrote: hi, I want to know the internal mechanism of how the document cache works, specifically its flushing cycle, i.e. does it get flushed on every commit/replication? regards Rajat Rastogi -- View this message in context: http://lucene.472066.n3.nabble.com/document-cache-tp3983796.html Sent from the Solr - User mailing list archive at Nabble.com.
Issue in Applying patch file
Hi, we have checked out the latest version of the Solr source code from svn and are trying to apply the following patch to it: https://issues.apache.org/jira/browse/SOLR-3430

While applying the patch using Eclipse (i.e. Team > Apply Patch), we get cross marks for certain Java files; only the following file gets updated, and we can see the patch changes for it alone: solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/TestThreaded.java

Why is the patch not applied for the other Java files present in the patch file? Sometimes we also get a "file does not exist" error even though the corresponding files are present. And when I try to build with ant after applying the patch, I get the following error: common-build.xml:949: Error starting modern compiler

Can you tell me if I'm missing anything? Can you please guide me on this? Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/Issue-in-Applying-patch-file-tp3983842.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: simple query help
Hi, it is AND (uppercase), not and (lowercase), and OR instead of or. Regards, Peter

2012/5/15 András Bártházi and...@barthazi.hu: Hi, you should use parentheses, have you tried that? q=(skcode:2021051 and flength:368.0) or (skcode:2021049 and ent_no:1040970907) http://robotlibrarian.billdueber.com/solr-and-boolean-operators/ Bye, Andras 2012/5/15 Peter Kirk p...@alpha-solutions.dk Hi Can someone please give me some help with a simple query. If I search q=skcode:2021051 and flength:368.0 I get 1 document returned (doc A) If I search q=skcode:2021049 and ent_no:1040970907 I get 1 document returned (doc B) But if I search q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907 I get no documents returned. Shouldn't I get both docA and docB? Thanks, Peter -- Péter Király eXtensible Catalog http://eXtensibleCatalog.org http://drupal.org/project/xc
Re: Problem with AND clause in multi core search query
I really don't understand what you're trying to achieve.

query: column1:A should be searched in core0 and column2:B should be searched in core1, and later the results from both queries should be combined with AND to give the final response?

core1 and core0 are completely separate cores, with separate documents. The only relationship between documents in the two cores is that they should conform to the same schema, since you're using shards. So saying that your query should search just one column in each core and then AND the results really doesn't make any sense to me. I suspect there are some assumptions you're not explicitly stating about the relationship between documents in separate cores that would help here... Best Erick

On Tue, May 15, 2012 at 3:07 AM, ravicv ravichandra...@gmail.com wrote: Thanks Tommaso. Could you please tell me, is there any way to get this scenario to work? http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:A AND column2:B Is there any way we can achieve the scenario below? query: column1:A should be searched in core0 and column2:B should be searched in core1, and later the results from both queries should be combined with AND to give the final response, since both return a common field in the response. For reference my schema is: <field name="id" type="string" indexed="true" stored="true" required="true"/> <field name="value" type="string" indexed="true" stored="true"/> <field name="column1" type="string" indexed="true" stored="true"/> <field name="column2" type="string" indexed="true" stored="true"/> Thanks, Ravi -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800p3983806.html Sent from the Solr - User mailing list archive at Nabble.com.
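If the two columns really must be ANDed across cores, one workable pattern outside Solr's sharding is to run each query against its own core and intersect the results on the shared field in the application. A sketch using set intersection over the common `value` field (the response data here is hypothetical and stands in for two real /select calls made via HTTP or SolrJ):

```python
# Hypothetical per-core responses; in practice these would come from
# two separate /select requests, one against each core.
core0_hits = [{"id": "10", "value": "x"}, {"id": "11", "value": "y"}]  # q=column1:A
core1_hits = [{"id": "20", "value": "y"}, {"id": "21", "value": "z"}]  # q=column2:B

def and_merge(hits_a, hits_b, key="value"):
    """Client-side AND: keep only values of the shared key that occur
    in both result lists (sketch)."""
    common = {d[key] for d in hits_a} & {d[key] for d in hits_b}
    return sorted(common)

print(and_merge(core0_hits, core1_hits))  # ['y']
```

This is effectively a client-side join, so it only scales while both result sets are small enough to fetch; for large sets the data usually belongs in one core.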
Re: - Solr 4.0 - How do I enable JSP support ? ...
Afaik we disabled JSP functionality in SOLR-3159 while upgrading Jetty. On Tuesday, May 15, 2012 at 1:44 PM, Erick Erickson wrote: What do you mean by JSP support? What is it you're trying to do with JSP? What servlet container are you using? Details matter. Best Erick On Mon, May 14, 2012 at 5:34 PM, Naga Vijayapuram nvija...@tibco.com (mailto:nvija...@tibco.com) wrote: Hello, How do I enable JSP support in Solr 4.0 ? Thanks Naga
RE: simple query help
Hi, if I understand the terms correctly, the search handler was configured to use edismax. The start of the configuration in solrconfig.xml looks like this:

<requestHandler name="/search" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">edismax</str>

In any case, when I commented out the defType entry and restarted the Solr webapp, things began to function as I expected. But whether or not it was simply the act of restarting, I'm not sure. (I had also found out that AND and OR should be written in uppercase, but this made no difference until after I had restarted.) Thanks for your time, Peter

-Original Message- From: Ahmet Arslan [mailto:iori...@yahoo.com] Sent: 15. maj 2012 13:25 To: solr-user@lucene.apache.org Subject: RE: simple query help

It doesn't make a difference. But now I'm thinking there's something completely odd, and I wonder if it's necessary to use a special search handler to achieve what I want. For example, if I execute q=(skcode:2021051 AND flength:368.0) I get no results. If I omit the parentheses, I get 1 result. (Let alone trying to combine several Boolean clauses.) Which query parser are you using?
Re: simple query help
Are you using the edismax query parser (which permits lower case and and or operators)? If so, there is a bug with parenthesized sub-queries. If you have a left parenthesis immediately before a field name (which you do in this case) the query fails. The short-term workaround is to place a space between the left parenthesis and the field name. See: https://issues.apache.org/jira/browse/SOLR-3377 -- Jack Krupansky -Original Message- From: Peter Kirk Sent: Tuesday, May 15, 2012 7:04 AM To: solr-user@lucene.apache.org Subject: RE: simple query help Hi - thanks for the response. Yes I have tried with parentheses, to group as you suggest. It doesn't make a difference. But now I'm thinking there's something completely odd - and I wonder if it's necessary to use a special search-handler to achieve what I want. For example, if I execute q=(skcode:2021051 AND flength:368.0) I get no results. If I omit the parentheses, I get 1 result. (Let alone trying to combine several Boolean clauses). /Peter -Original Message- From: András Bártházi [mailto:and...@barthazi.hu] Sent: 15. maj 2012 12:51 To: solr-user@lucene.apache.org Subject: Re: simple query help Hi, You should use parantheses, have you tried that? q=(skcode:2021051 and flength:368.0) or (skcode:2021049 and ent_no:1040970907) http://robotlibrarian.billdueber.com/solr-and-boolean-operators/ Bye, Andras 2012/5/15 Peter Kirk p...@alpha-solutions.dk Hi Can someone please give me some help with a simple query. If I search q=skcode:2021051 and flength:368.0 I get 1 document returned (doc A) If I search q=skcode:2021049 and ent_no:1040970907 I get 1 document returned (doc B) But if I search q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907 I get no documents returned. Shouldn't I get both docA and docB? Thanks, Peter
RE: simple query help
But whether or not it was simply the act of restarting - I'm not sure. (I had also found out that AND and OR should be written in uppercase, but this made no difference until after I had restarted). By the way, there is a control parameter for this. lowercaseOperators A Boolean parameter indicating if lowercase and and or should be treated the same as operators AND and OR. http://lucidworks.lucidimagination.com/display/solr/The+Extended+DisMax+Query+Parser
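Where lowercaseOperators is not available (or when targeting the classic parser, which only recognizes uppercase operators), a small client-side normalization can uppercase bare and/or tokens before the query is sent. A naive sketch that splits on whitespace and deliberately does not handle quoted phrases:

```python
def uppercase_operators(query):
    """Uppercase bare 'and'/'or' tokens so the classic query parser
    treats them as boolean operators (naive sketch: quoted phrases
    containing the words 'and'/'or' would be mangled)."""
    return " ".join(tok.upper() if tok.lower() in ("and", "or") else tok
                    for tok in query.split())

print(uppercase_operators("skcode:2021051 and flength:368.0"))
# skcode:2021051 AND flength:368.0
```

With edismax, setting lowercaseOperators in the handler defaults is the cleaner option, since it keeps the rewriting inside Solr.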
Re: simple query help
By removing the defType you reverted to using the traditional Solr/Lucene query parser, which supports the particular query syntax you used (as long as AND is in uppercase) and does not have the parenthesis bug of edismax. -- Jack Krupansky

-Original Message- From: Peter Kirk Sent: Tuesday, May 15, 2012 8:23 AM To: solr-user@lucene.apache.org Subject: RE: simple query help

Hi, if I understand the terms correctly, the search handler was configured to use edismax. The start of the configuration in solrconfig.xml looks like this:

<requestHandler name="/search" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">edismax</str>

In any case, when I commented out the defType entry and restarted the Solr webapp, things began to function as I expected. But whether or not it was simply the act of restarting, I'm not sure. (I had also found out that AND and OR should be written in uppercase, but this made no difference until after I had restarted.) Thanks for your time, Peter

-Original Message- From: Ahmet Arslan [mailto:iori...@yahoo.com] Sent: 15. maj 2012 13:25 To: solr-user@lucene.apache.org Subject: RE: simple query help

It doesn't make a difference. But now I'm thinking there's something completely odd, and I wonder if it's necessary to use a special search handler to achieve what I want. For example, if I execute q=(skcode:2021051 AND flength:368.0) I get no results. If I omit the parentheses, I get 1 result. (Let alone trying to combine several Boolean clauses.) Which query parser are you using?
Re: simple query help
Yes, the parentheses are needed to prioritize the operator precedence (do the ANDs and then OR those results.) And, add a space after both left parentheses to account for the edismax bug. (https://issues.apache.org/jira/browse/SOLR-3377) -- Jack Krupansky -Original Message- From: András Bártházi Sent: Tuesday, May 15, 2012 6:50 AM To: solr-user@lucene.apache.org Subject: Re: simple query help Hi, You should use parantheses, have you tried that? q=(skcode:2021051 and flength:368.0) or (skcode:2021049 and ent_no:1040970907) http://robotlibrarian.billdueber.com/solr-and-boolean-operators/ Bye, Andras 2012/5/15 Peter Kirk p...@alpha-solutions.dk Hi Can someone please give me some help with a simple query. If I search q=skcode:2021051 and flength:368.0 I get 1 document returned (doc A) If I search q=skcode:2021049 and ent_no:1040970907 I get 1 document returned (doc B) But if I search q=skcode:2021051 and flength:368.0 or skcode:2021049 and ent_no:1040970907 I get no documents returned. Shouldn't I get both docA and docB? Thanks, Peter
Solr tmp working directory
Hi :) I'm using SolrJ to index documents. I noticed that during the indexing process, .tmp files are created in my /tmp folder. These files contain the XML add commands for the documents I add to the index. Can I change this folder in the Solr config, and where is it? Thanks, Gary
Re: Show a portion of searchable text in Solr
See the /browse request handler in the example config. Only stored fields will be highlighted. -- Jack Krupansky -Original Message- From: Shameema Umer Sent: Tuesday, May 15, 2012 2:59 AM To: solr-user@lucene.apache.org Subject: Re: Show a portion of searchable text in Solr Can somebody tell me where I should place the highlighting parameters? When I add them to the query, it is not working: hl=true&hl.requireFieldMatch=true&hl.fl=* FYI: I am new to Solr. My aim is to have emphasis tags on the queried words and to display only the query-relevant snippet of the content. Thanks Shameema On Mon, May 14, 2012 at 1:18 PM, Ahmet Arslan iori...@yahoo.com wrote: I have indexed very large documents. In some cases these documents have 100,000 characters. Is there a way to return a portion of the documents (let's say the first 300 characters) when I am querying Solr? Is there any attribute to set in the schema.xml or solrconfig.xml to achieve this? I have a set-up with very large documents too. Here are two different solutions that I have used in the past: 1) Use highlighting with hl.alternateField and hl.maxAlternateFieldLength http://wiki.apache.org/solr/HighlightingParameters 2) Create an extra field (indexed=false and stored=true) using copyField just for display purposes. (fl=shortField) <copyField source="largeField" dest="shortField" maxChars="300"/> http://wiki.apache.org/solr/SchemaXml#Copy_Fields Also, I haven't used it myself yet, but I *think* this can be accomplished by using a custom Transformer too. http://wiki.apache.org/solr/DocTransformers
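A sketch of option 1 from Ahmet's reply as a single request (the field name content and the host/port are assumptions, not taken from Shameema's setup):

```
http://localhost:8983/solr/select?q=foo
  &hl=true&hl.fl=content
  &hl.alternateField=content&hl.maxAlternateFieldLength=300
```

If the query terms match, the highlighter returns snippets with emphasis tags; if not, hl.alternateField falls back to the first 300 characters of the stored field.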
Re: Boosting on field empty or not
The problem with what you provided is it is boosting ALL documents whether the field is empty or not On Tue, May 15, 2012 at 3:52 AM, Ahmet Arslan iori...@yahoo.com wrote: Basically I want documents that have a given field populated to have a higher score than the documents that don't. So if you search for foo I want documents that contain foo, but I want the documents that have field a populated to have a higher score... Hi Donald, Since you are using edismax, it is better to use bq (boost query) for this. bq=regularprice:[* TO *]^50 http://wiki.apache.org/solr/DisMaxQParserPlugin#bq_.28Boost_Query.29 defType=edismax&qf=nameSuggest^10 name^10 codeTXT^2 description^1 brand_search^0 cat_search^10&q=chairs&bq=regularprice:[* TO *]^50
Re: Boosting on field empty or not
The problem with what you provided is it is boosting ALL documents whether the field is empty or not Then all of your fields are non-empty? What is the type of your field?
Re: Solr tmp working directory
Solr is probably simply using the Java JVM default. Set the java.io.tmpdir system property. Something equivalent to the following: java -Djava.io.tmpdir=/mytempdir ... On Windows you can set the TMP environment variable. -- Jack Krupansky -Original Message- From: G.Long Sent: Tuesday, May 15, 2012 9:04 AM To: solr-user@lucene.apache.org Subject: Solr tmp working directory Hi :) I'm using SolrJ to index documents. I noticed that during the indexing process, .tmp files are created in my /tmp folder. These files contain the xml commands add for the documents I add to the index. Can I change this folder in Solr config and where is it? Thanks, Gary
Re: Solr tmp working directory
Thank you :) Gary Le 15/05/2012 15:27, Jack Krupansky a écrit : Solr is probably simply using the Java JVM default. Set the java.io.tmpdir system property. Something equivalent to the following: java -Djava.io.tmpdir=/mytempdir ... On Windows you can set the TMP environment variable. -- Jack Krupansky -Original Message- From: G.Long Sent: Tuesday, May 15, 2012 9:04 AM To: solr-user@lucene.apache.org Subject: Solr tmp working directory Hi :) I'm using SolrJ to index documents. I noticed that during the indexing process, .tmp files are created in my /tmp folder. These files contain the xml commands add for the documents I add to the index. Can I change this folder in Solr config and where is it? Thanks, Gary
Re: adding an OR to a fq makes some doc that matched not match anymore
oh yeah, forgot about negatives and *:*... thanks -- View this message in context: http://lucene.472066.n3.nabble.com/adding-an-OR-to-a-fq-makes-some-doc-that-matched-not-match-anymore-tp3983775p3983863.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Boosting on field empty or not
The problem with what you provided is it is boosting ALL documents whether the field is empty or not Then all of your fields are non-empty? What is the type of your field? How do you feed your documents to Solr? Maybe you are indexing an empty string? Is your field indexed=true? http://wiki.apache.org/solr/SolrQuerySyntax#Differences_From_Lucene_Query_Parser -field:[* TO *] finds all documents without a value for field Another approach is to use default=SOMETHING in your field definition (schema.xml): <field name="id" type="int" indexed="true" stored="true" default="0"/> Then you can use field:SOMETHING to retrieve empty fields. +*:* -field:SOMETHING retrieves non-empty documents.
Re: Editing long Solr URLs - Chrome Extension
Jan Thanks for your feedback! If possible can you file these requests on the github page for the extension so I can work on them? They sound like great ideas and I'll try to incorporate all of them in future releases. Thanks Amit On May 11, 2012 9:57 AM, Jan Høydahl j...@hoydahl.no wrote: I've been testing https://chrome.google.com/webstore/detail/mbnigpeabbgkmbcbhkkbnlidcobbapff?hl=en but I don't think it's great. Great work on this one. Simple and straight forward. A few wishes: * Sticky mode? This tool would make sense in a sidebar, to do rapid refinements * If you edit a value and click TAB, it is not updated :( * It should not be necessary to URLencode all non-ascii chars - why not leave colon, caret (^) etc as is, for better readability? * Some param values in Solr may be large, such as fl, qf or bf. Would be nice if the edit box was multi-line, or perhaps adjusts to the size of the content -- Jan Høydahl, search solution architect Cominvent AS - www.facebook.com/Cominvent Solr Training - www.solrtraining.com On 11. mai 2012, at 07:32, Amit Nithian wrote: Hey all, I don't know about you but most of the Solr URLs I issue are fairly lengthy, full of parameters on the query string, and browser location bars aren't long enough/don't have multi-line capabilities. I tried to find something that does this but couldn't, so I wrote a Chrome extension to help. Please check out my blog post on the subject and please let me know if something doesn't work or needs improvement. Of course this can work for any URL with a query string, but my motivation was to help edit my long Solr URLs. http://hokiesuns.blogspot.com/2012/05/manipulating-urls-with-long-query.html Thanks! Amit
Re: Boosting on field empty or not
Let's go back to this step where things look correct, but we ran into the edismax bug which requires that you put a space between each left parenthesis and field name. First, verify whether you are using edismax or not. Then, change: q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)sort=score desc to q=chairs AND ( regularprice:*^5 OR ( *:* -regularprice:*)^0.5)sort=score desc (Note the space after each (.) And make sure to URL-encode your spaces as + or %20. Also, try this to verify whether you really have chairs without prices: q=chairs AND ( *:* -regularprice:*)sort=score desc (Note that space after (.) And for sanity, try this as well: q=chairs AND ( -regularprice:*)sort=score desc (Again, note that space after (.) Those two queries should give identical results. Finally, technically you should be able to use * or [* TO *] to match all values or negate them to match all documents without a value in a field, but try both to see that they do return the identical set of documents. -- Jack Krupansky -Original Message- From: Donald Organ Sent: Monday, May 14, 2012 4:19 PM To: solr-user@lucene.apache.org Subject: Re: Boosting on field empty or not q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5)sort=score desc Same effect. On Mon, May 14, 2012 at 4:12 PM, Jack Krupansky j...@basetechnology.comwrote: Change the second boost to 0.5 to de-boost docs that are missing the field value. You had them the same. 
-- Jack Krupansky -Original Message- From: Donald Organ Sent: Monday, May 14, 2012 4:01 PM To: solr-user@lucene.apache.org Subject: Re: Boosting on field empty or not OK it looks like the query change is working, but it looks like it is boosting everything, even documents that have that field empty On Mon, May 14, 2012 at 3:41 PM, Donald Organ dor...@donaldorgan.com wrote: OK I must be missing something: defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10 codeTXT^2 description^1 brand_search^0 cat_search^10&spellcheck=true&spellcheck.collate=true&spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^5)&sort=score desc On Mon, May 14, 2012 at 3:36 PM, Jack Krupansky j...@basetechnology.com wrote: (*:* -regularprice:*)5 should be (*:* -regularprice:*)^0.5 - the missing boost operator. -- Jack Krupansky -Original Message- From: Donald Organ Sent: Monday, May 14, 2012 3:31 PM To: solr-user@lucene.apache.org Subject: Re: Boosting on field empty or not Still doesn't appear to be working. Here is the full query string: defType=edismax&start=0&rows=24&facet=true&qf=nameSuggest^10 name^10 codeTXT^2 description^1 brand_search^0 cat_search^10&spellcheck=true&spellcheck.collate=true&spellcheck.q=chairs&facet.mincount=1&fl=code,score&q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)5) On Mon, May 14, 2012 at 3:28 PM, Jack Krupansky j...@basetechnology.com wrote: Sorry, make that: q=chairs AND (regularprice:*^5 OR (*:* -regularprice:*)^0.5) I forgot that pure negative queries are broken again, so you need the *:* in there. I noticed that your second boost operator was missing as well. 
-- Jack Krupansky -Original Message- From: Donald Organ Sent: Monday, May 14, 2012 3:24 PM To: solr-user@lucene.apache.org Subject: Re: Boosting on field empty or not OK I just tried: q=chairs AND (regularprice:*^5 OR (-regularprice:*)5) And that gives me 0 results On Mon, May 14, 2012 at 2:51 PM, Jack Krupansky j...@basetechnology.com wrote: foo AND (field:*^2.0 OR (-field:*)^0.5) So, if a doc has anything in the field, it gets boosted, and if the doc does not have anything in the field, de-boost it. Choose the boost factors to suit your desired boosting effect. -- Jack Krupansky -Original Message- From: Donald Organ Sent: Monday, May 14, 2012 2:38 PM To: solr-user@lucene.apache.org Subject: Re: Boosting on field empty or not OK maybe I need to describe this a little more. Basically I want documents that have a given field populated to have a higher score than the documents that don't. So if you search for foo I want documents that contain foo, but I want the documents that have field a populated to have a higher score... Is there a way to do this? On Mon, May 14, 2012 at 2:22 PM, Jack Krupansky j...@basetechnology.com wrote: In a query or filter query you can write +field:* to require that a field be populated or +(-field:*) to require that it not be populated -- Jack Krupansky -Original Message- From: Donald Organ Sent: Monday, May 14, 2012 2:10 PM To: solr-user Subject: Boosting on field empty or not Is there a way to boost a document based on whether the field is empty or not. I am looking to boost documents that have a specific field populated.
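Jack's note about encoding the spaces can be checked with any URL library. A minimal Python sketch (the query string is the corrected one from his mail; the variable names are mine):

```python
from urllib.parse import quote_plus

# The corrected edismax query, with the SOLR-3377 space after each "("
q = "chairs AND ( regularprice:*^5 OR ( *:* -regularprice:*)^0.5)"

# quote_plus turns spaces into "+" (they could equally be "%20")
# and percent-escapes reserved characters like "(", "*", "^" and ":"
encoded = quote_plus(q)
print(encoded)
```

Pasting the encoded value after `q=` avoids the browser or HTTP client mangling the raw spaces and parentheses.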
RE: Issue in Applying patch file
SOLR-3430 is already applied to the latest 3.6 and 4.x (trunk) source code. Be sure you have sources from May 7, 2012 or later (for 3.6 this is SVN r1335205 + ; for trunk it is SVN r1335196 + ) No patches are needed. About the modern compiler error, make sure you're running a 1.6 or 1.7 JDK (the default JDK on some Linux distributions is often inadequate). Issue javac -version from the command line as a sanity check. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: mechravi25 [mailto:mechrav...@yahoo.co.in] Sent: Tuesday, May 15, 2012 6:54 AM To: solr-user@lucene.apache.org Subject: Issue in Applying patch file Hi, We have checked out the latest version of the Solr source code from svn. We are trying to apply the following patch file to it: https://issues.apache.org/jira/browse/SOLR-3430 While applying the patch file using Eclipse (i.e. using the Team -> Apply Patch options), we are getting cross marks for certain java files, and it is getting updated for the following java file alone, and we are able to see the patch file changes for this alone: solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/TestThreaded.java Why is it that it's not getting applied for the other set of java files which is present in the patch file? And sometimes we are getting a 'file does not exist' error even if the corresponding files are present. And also, when I try to ant build it after applying the patch, I'm getting the following error: common-build.xml:949: Error starting modern compiler Can you tell me if I'm missing anything? Can you please guide me on this? Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/Issue-in-Applying-patch-file-tp3983842.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Editing long Solr URLs - Chrome Extension
I think I put one up already, but in case I messed up github: complex params, like the fq here, aren't properly handled: http://localhost:8983/solr/select?q=*:*&fq={!geofilt sfield=store pt=52.67,7.30 d=5} But I'm already using it occasionally Erick On Tue, May 15, 2012 at 10:02 AM, Amit Nithian anith...@gmail.com wrote: Jan Thanks for your feedback! If possible can you file these requests on the github page for the extension so I can work on them? They sound like great ideas and I'll try to incorporate all of them in future releases. Thanks Amit On May 11, 2012 9:57 AM, Jan Høydahl j...@hoydahl.no wrote: I've been testing https://chrome.google.com/webstore/detail/mbnigpeabbgkmbcbhkkbnlidcobbapff?hl=en but I don't think it's great. Great work on this one. Simple and straight forward. A few wishes: * Sticky mode? This tool would make sense in a sidebar, to do rapid refinements * If you edit a value and click TAB, it is not updated :( * It should not be necessary to URLencode all non-ascii chars - why not leave colon, caret (^) etc as is, for better readability? * Some param values in Solr may be large, such as fl, qf or bf. Would be nice if the edit box was multi-line, or perhaps adjusts to the size of the content -- Jan Høydahl, search solution architect Cominvent AS - www.facebook.com/Cominvent Solr Training - www.solrtraining.com On 11. mai 2012, at 07:32, Amit Nithian wrote: Hey all, I don't know about you but most of the Solr URLs I issue are fairly lengthy, full of parameters on the query string, and browser location bars aren't long enough/don't have multi-line capabilities. I tried to find something that does this but couldn't, so I wrote a Chrome extension to help. Please check out my blog post on the subject and please let me know if something doesn't work or needs improvement. Of course this can work for any URL with a query string, but my motivation was to help edit my long Solr URLs. http://hokiesuns.blogspot.com/2012/05/manipulating-urls-with-long-query.html Thanks! 
Amit
Highlight feature
Hello friends I have noticed that the highlighted terms of a query are returned in a second xml struct (named highlighting). Is it possible to return the highlighted terms in the doc field itself? I don't need the solr generated ids of the highlighted field. Thanks, Tom -- View this message in context: http://lucene.472066.n3.nabble.com/Highlight-feature-tp3983875.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Problem with AND clause in multi core search query
Hi Erick, My schema is as follows: <field name="id" type="string" indexed="true" stored="true" required="true"/> <field name="value" type="string" indexed="true" stored="true"/> <field name="column1" type="string" indexed="true" stored="true"/> <field name="column2" type="string" indexed="true" stored="true"/> The data which I am indexing in core0 is id:1, value:'123456', column1:'A', column2:'null' id:2, value:'1234567895252', column1:'B', column2:'null' The data which I am indexing in core1 is id:3, value:'123456', column1:'null', column2:'C' Now my query is http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:A AND column2:C Response: No data In a database we can achieve this by querying separately, as follows: select value from core0 where column1='A' intersect select value from core1 where column2='C' The same scenario I am trying to implement in my multi-core SOLR setup, but I am unable to do so. Please let me know what I should do to implement this type of scenario in SOLR. I am using SOLR 1.4 version. Thanks Ravi -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800p3983881.html Sent from the Solr - User mailing list archive at Nabble.com.
need help with getting exact matches to score higher
Hello all, I am trying to tune our core for exact matches on a single field (itemNo) and having issues getting it to work. In addition, I need help understanding the output from debugQuery=on where it presents the scoring. My goal is to get exact matches to arrive at the top of the results. However, what I am seeing is non-exact matches arriving at the top of the results with MUCH higher scores. // from schema.xml - I am copying itemNo into the string field for use in boosting <field name="itemNoExactMatchStr" type="string" indexed="true" stored="false"/> <copyField source="itemNo" dest="itemNoExactMatchStr"/> // from solrconfig.xml - I have the boost set for my special exact match field and the sorting on score desc <requestHandler name="itemNoProductTypeBrandSearch" class="solr.SearchHandler" default="false"> <lst name="defaults"> <str name="defType">edismax</str> <str name="echoParams">all</str> <int name="rows">10</int> <str name="qf">itemNoExactMatchStr^30 itemNo^.9 divProductTypeDesc^.8 brand^.5</str> <str name="q.alt">*:*</str> <str name="sort">score desc</str> <str name="facet">true</str> <str name="facet.field">itemDescFacet</str> <str name="facet.field">brandFacet</str> <str name="facet.field">divProductTypeIdFacet</str> </lst> <lst name="appends"/> <lst name="invariants"/> </requestHandler> // analysis output from debugQuery=on Here you can see that the top score for itemNo:9030 is a part that does not start with 9030. The entries below (there are 4) all have exact matches - but they rank below this part - ??? 
str name="0904000,1354 ,<b>2TTZ9030C1000A* 0.585678 = (MATCH) max of: 0.585678 = (MATCH) weight(itemNo:9030^0.9 in 582979), product of: 0.021552926 = queryWeight(itemNo:9030^0.9), product of: 0.9 = boost 10.270785 = idf(docFreq=55, maxDocs=594893) 0.0023316324 = queryNorm 27.173943 = (MATCH) fieldWeight(itemNo:9030 in 582979), product of: 2.6457512 = tf(termFreq(itemNo:9030)=7) 10.270785 = idf(docFreq=55, maxDocs=594893) 1.0 = fieldNorm(field=itemNo, doc=582979) /str str name="122,1232 ,<b>9030* 0.22136548 = (MATCH) max of: 0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 499864), product of: 0.021552926 = queryWeight(itemNo:9030^0.9), product of: 0.9 = boost 10.270785 = idf(docFreq=55, maxDocs=594893) 0.0023316324 = queryNorm 10.270785 = (MATCH) fieldWeight(itemNo:9030 in 499864), product of: 1.0 = tf(termFreq(itemNo:9030)=1) 10.270785 = idf(docFreq=55, maxDocs=594893) 1.0 = fieldNorm(field=itemNo, doc=499864) /str str name="0537220,1882 ,<b>9030 * 0.22136548 = (MATCH) max of: 0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 538826), product of: 0.021552926 = queryWeight(itemNo:9030^0.9), product of: 0.9 = boost 10.270785 = idf(docFreq=55, maxDocs=594893) 0.0023316324 = queryNorm 10.270785 = (MATCH) fieldWeight(itemNo:9030 in 538826), product of: 1.0 = tf(termFreq(itemNo:9030)=1) 10.270785 = idf(docFreq=55, maxDocs=594893) 1.0 = fieldNorm(field=itemNo, doc=538826) /str str name="0537220,2123 ,<b>9030 * 0.22136548 = (MATCH) max of: 0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 544313), product of: 0.021552926 = queryWeight(itemNo:9030^0.9), product of: 0.9 = boost 10.270785 = idf(docFreq=55, maxDocs=594893) 0.0023316324 = queryNorm 10.270785 = (MATCH) fieldWeight(itemNo:9030 in 544313), product of: 1.0 = tf(termFreq(itemNo:9030)=1) 10.270785 = idf(docFreq=55, maxDocs=594893) 1.0 = fieldNorm(field=itemNo, doc=544313) /str str name="0537220,2087 ,<b>9030 * 0.22136548 = (MATCH) max of: 0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 544657), product of: 0.021552926 = queryWeight(itemNo:9030^0.9), product of: 0.9 = boost 10.270785 = idf(docFreq=55, maxDocs=594893) 0.0023316324 = queryNorm 10.270785 = (MATCH) fieldWeight(itemNo:9030 in 544657), product of: 1.0 = tf(termFreq(itemNo:9030)=1) 10.270785 = idf(docFreq=55, maxDocs=594893) 1.0 = fieldNorm(field=itemNo, doc=544657) /str -- View this message in context: http://lucene.472066.n3.nabble.com/need-help-with-getting-exact-matches-to-score-higher-tp3983882.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Urgent! Highlighting not working as expected
Hi Jack, Thanks for your reply. I did not specify dismax when querying with highlighting enabled: q=text:G-Money&hl=true&hl.fl=*, that was the whole query string I sent. What puzzled me is that the string field cr_firstname was copied to text, but it was not highlighted. But if I use q=cr_fristname:G-Money&hl=true&hl.fl=*, it will be highlighted. I attached my solrconfig.xml here; could you please take a look? Thanks again! http://lucene.472066.n3.nabble.com/file/n3983883/solrconfig.xml solrconfig.xml -- View this message in context: http://lucene.472066.n3.nabble.com/Urgent-Highlighting-not-working-as-expected-tp3983755p3983883.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Problem with AND clause in multi core search query
Right, but for that to work, there's an implicit connection between the docs in core1 and core0, I assume provided by 123456 as a foreign key or something. There's nothing automatically built in like this in Solr 1.4 (joins come close, but those are trunk). Whenever you try to make Solr act just like a database, you're probably doing something you shouldn't. Solr is a very good search engine, but it's not an RDBMS and shouldn't be asked to behave like one. In your case, consider de-normalizing the data and indexing all the related data in a single document, even if it means repeating the data. Sometimes this requires some judicious creativity, but it's the first thing I'd look at. Best Erick On Tue, May 15, 2012 at 10:54 AM, ravicv ravichandra...@gmail.com wrote: Hi Erick, My schema is as follows: <field name="id" type="string" indexed="true" stored="true" required="true"/> <field name="value" type="string" indexed="true" stored="true"/> <field name="column1" type="string" indexed="true" stored="true"/> <field name="column2" type="string" indexed="true" stored="true"/> The data which I am indexing in core0 is id:1, value:'123456', column1:'A', column2:'null' id:2, value:'1234567895252', column1:'B', column2:'null' The data which I am indexing in core1 is id:3, value:'123456', column1:'null', column2:'C' Now my query is http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:A AND column2:C Response: No data In a database we can achieve this by querying separately, as follows: select value from core0 where column1='A' intersect select value from core1 where column2='C' The same scenario I am trying to implement in my multi-core SOLR setup, but I am unable to do so. Please let me know what I should do to implement this type of scenario in SOLR. I am using SOLR 1.4 version. 
Thanks Ravi -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-with-AND-clause-in-multi-core-search-query-tp3983800p3983881.html Sent from the Solr - User mailing list archive at Nabble.com.
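Erick's de-normalization advice can be illustrated with a toy model in Python (plain dicts standing in for indexed documents - this is not Solr code, just the boolean logic):

```python
# Toy model of the two cores, using Ravi's data from the thread above
core0 = [{"id": "1", "value": "123456", "column1": "A"},
         {"id": "2", "value": "1234567895252", "column1": "B"}]
core1 = [{"id": "3", "value": "123456", "column2": "C"}]

# A sharded query applies the whole boolean expression to each document
# individually, so "column1:A AND column2:C" matches nothing:
hits = [d for d in core0 + core1
        if d.get("column1") == "A" and d.get("column2") == "C"]
assert hits == []

# De-normalized: join the related rows on "value" into single documents
# BEFORE indexing; now the same AND matches.
def denormalize(a, b, key="value"):
    by_key = {d[key]: dict(d) for d in a}
    merged = []
    for d in b:
        doc = dict(by_key.get(d[key], {key: d[key]}))
        doc.update({k: v for k, v in d.items() if k != "id"})
        merged.append(doc)
    return merged

docs = denormalize(core0, core1)
hits = [d for d in docs if d.get("column1") == "A" and d.get("column2") == "C"]
print(len(hits))  # one merged document matches
```

The `denormalize` helper is hypothetical; in practice the join would be done in whatever ETL feeds Solr, but the point stands: the AND can only match fields that live in the same document.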
Re: need help with getting exact matches to score higher
Hello, From the response you pasted here, it looks like the field itemNoExactMatchStr never matched. Can you try matching in that field only and ensure you have matches? Given the ^30 boost, you should have high scores on this field... Hope this helps, -- Tanguy 2012/5/15 geeky2 gee...@hotmail.com Hello all, I am trying to tune our core for exact matches on a single field (itemNo) and having issues getting it to work. In addition, I need help understanding the output from debugQuery=on where it presents the scoring. My goal is to get exact matches to arrive at the top of the results. However, what I am seeing is non-exact matches arriving at the top of the results with MUCH higher scores. // from schema.xml - I am copying itemNo into the string field for use in boosting <field name="itemNoExactMatchStr" type="string" indexed="true" stored="false"/> <copyField source="itemNo" dest="itemNoExactMatchStr"/> // from solrconfig.xml - I have the boost set for my special exact match field and the sorting on score desc <requestHandler name="itemNoProductTypeBrandSearch" class="solr.SearchHandler" default="false"> <lst name="defaults"> <str name="defType">edismax</str> <str name="echoParams">all</str> <int name="rows">10</int> <str name="qf">itemNoExactMatchStr^30 itemNo^.9 divProductTypeDesc^.8 brand^.5</str> <str name="q.alt">*:*</str> <str name="sort">score desc</str> <str name="facet">true</str> <str name="facet.field">itemDescFacet</str> <str name="facet.field">brandFacet</str> <str name="facet.field">divProductTypeIdFacet</str> </lst> <lst name="appends"/> <lst name="invariants"/> </requestHandler> // analysis output from debugQuery=on Here you can see that the top score for itemNo:9030 is a part that does not start with 9030. The entries below (there are 4) all have exact matches - but they rank below this part - ??? 
str name=0904000,1354 ,b2TTZ9030C1000A* 0.585678 = (MATCH) max of: 0.585678 = (MATCH) weight(itemNo:9030^0.9 in 582979), product of: 0.021552926 = queryWeight(itemNo:9030^0.9), product of: 0.9 = boost 10.270785 = idf(docFreq=55, maxDocs=594893) 0.0023316324 = queryNorm 27.173943 = (MATCH) fieldWeight(itemNo:9030 in 582979), product of: 2.6457512 = tf(termFreq(itemNo:9030)=7) 10.270785 = idf(docFreq=55, maxDocs=594893) 1.0 = fieldNorm(field=itemNo, doc=582979) /str str name=122,1232 ,b9030* 0.22136548 = (MATCH) max of: 0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 499864), product of: 0.021552926 = queryWeight(itemNo:9030^0.9), product of: 0.9 = boost 10.270785 = idf(docFreq=55, maxDocs=594893) 0.0023316324 = queryNorm 10.270785 = (MATCH) fieldWeight(itemNo:9030 in 499864), product of: 1.0 = tf(termFreq(itemNo:9030)=1) 10.270785 = idf(docFreq=55, maxDocs=594893) 1.0 = fieldNorm(field=itemNo, doc=499864) /str str name=0537220,1882 ,b9030 * 0.22136548 = (MATCH) max of: 0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 538826), product of: 0.021552926 = queryWeight(itemNo:9030^0.9), product of: 0.9 = boost 10.270785 = idf(docFreq=55, maxDocs=594893) 0.0023316324 = queryNorm 10.270785 = (MATCH) fieldWeight(itemNo:9030 in 538826), product of: 1.0 = tf(termFreq(itemNo:9030)=1) 10.270785 = idf(docFreq=55, maxDocs=594893) 1.0 = fieldNorm(field=itemNo, doc=538826) /str str name=0537220,2123 ,b9030 * 0.22136548 = (MATCH) max of: 0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 544313), product of: 0.021552926 = queryWeight(itemNo:9030^0.9), product of: 0.9 = boost 10.270785 = idf(docFreq=55, maxDocs=594893) 0.0023316324 = queryNorm 10.270785 = (MATCH) fieldWeight(itemNo:9030 in 544313), product of: 1.0 = tf(termFreq(itemNo:9030)=1) 10.270785 = idf(docFreq=55, maxDocs=594893) 1.0 = fieldNorm(field=itemNo, doc=544313) /str str name=0537220,2087 ,b9030 * 0.22136548 = (MATCH) max of: 0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 544657), product of: 0.021552926 = 
queryWeight(itemNo:9030^0.9), product of: 0.9 = boost 10.270785 = idf(docFreq=55, maxDocs=594893) 0.0023316324 = queryNorm 10.270785 = (MATCH) fieldWeight(itemNo:9030 in 544657), product of: 1.0 = tf(termFreq(itemNo:9030)=1) 10.270785 = idf(docFreq=55, maxDocs=594893) 1.0 = fieldNorm(field=itemNo, doc=544657) /str -- View this message in context: http://lucene.472066.n3.nabble.com/need-help-with-getting-exact-matches-to-score-higher-tp3983882.html Sent from the Solr - User mailing list archive at Nabble.com.
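Tanguy's diagnosis fits the arithmetic in the explain output: with itemNoExactMatchStr never matching, only the itemNo clause scores, and Lucene's default similarity uses tf = sqrt(termFreq), which rewards the document where 9030 occurs seven times. A quick check of the numbers taken from the debug output above:

```python
import math

idf = 10.270785          # idf(docFreq=55, maxDocs=594893) from the output
field_norm = 1.0         # fieldNorm(field=itemNo) from the output

# "2TTZ9030C1000A..." doc: the token 9030 occurs 7 times in itemNo
tf_multi = math.sqrt(7)  # Lucene DefaultSimilarity: tf = sqrt(termFreq)
# exact-match docs: 9030 occurs once
tf_exact = math.sqrt(1)

print(round(tf_multi * idf * field_norm, 6))  # 27.173943, as in the output
print(round(tf_exact * idf * field_norm, 6))  # 10.270785, as in the output
```

So the ranking is behaving exactly as the formula says; the fix is to make the itemNoExactMatchStr^30 clause actually match, which is what Tanguy suggests checking.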
Re: Highlight feature
I am also working on highlighting. I don't think so. And the ids in the highlighting part are the ids of the docs retrieved. -- View this message in context: http://lucene.472066.n3.nabble.com/Highlight-feature-tp3983875p3983887.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Urgent! Highlighting not working as expected
In the case of text:G-Money, the term is analyzed by Solr into the phrase g money, which matches in the text field, but will not match for a string field containing the literal text G-Money. But when you query cr_fristname:G-Money, the term is not tokenized by the Solr analyzer because it is a value for a string field, and a literal match occurs in the string field cr_fristname. I think that fully accounts for the behavior you see. You might consider having a cr_fristname_text field which is tokenized text with a copyField from cr_fristname that fully supports highlighting of text terms. BTW, I presume that should be first name, not frist name. -- Jack Krupansky -Original Message- From: TJ Tong Sent: Tuesday, May 15, 2012 11:15 AM To: solr-user@lucene.apache.org Subject: Re: Urgent! Highlighting not working as expected Hi Jack, Thanks for your reply. I did not specify dismax when querying with highlighting enabled: q=text:G-Money&hl=true&hl.fl=*, that was the whole query string I sent. What puzzled me is that the string field cr_firstname was copied to text, but it was not highlighted. But if I use q=cr_fristname:G-Money&hl=true&hl.fl=*, it will be highlighted. I attached my solrconfig.xml here; could you please take a look? Thanks again! http://lucene.472066.n3.nabble.com/file/n3983883/solrconfig.xml solrconfig.xml -- View this message in context: http://lucene.472066.n3.nabble.com/Urgent-Highlighting-not-working-as-expected-tp3983755p3983883.html Sent from the Solr - User mailing list archive at Nabble.com.
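A sketch of the tokenized-copy idea Jack describes (the type name text_general is an assumption - substitute whatever tokenized text type the schema already defines):

```xml
<!-- schema.xml sketch: tokenized copy of the string field, for highlighting -->
<field name="cr_firstname_text" type="text_general" indexed="true" stored="true"/>
<copyField source="cr_firstname" dest="cr_firstname_text"/>
```

Querying and highlighting against cr_firstname_text then goes through the same analysis chain as the text field, so G-Money and g money match and highlight consistently.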
Index an URL
Hi, I have a few questions, please bear with me: 1- I have a theory: nutch may be used to index to Solr when we don't have access to the URL's file system, while we can use curl when we do have access. Am I correct? 2- A tutorial I have been reading talks about different levels of id. Is there such a thing (exid6, exid7 etc)? 3- When I use curl "http://localhost:8983/solr/update/extract?literal.id=exid7&commit=true" -F myfile=@serialized-form.html, I get ERROR: [doc=exid7] unknown field 'ignored_link'. Is this something exid7 gives me? Where does this field ignored_link come from? Do I need to add all these fields to schema.xml in order not to get such an error? What is the safest way? Regards,
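On question 3: the name ignored_link suggests the extracting handler's uprefix mapping is in play - unknown extracted fields get prefixed and then need somewhere to land. The stock example configs handle this with an "ignored" dynamic field; a hedged sketch (check your own solrconfig.xml for the actual uprefix value):

```xml
<!-- schema.xml: swallow extracted metadata fields the schema doesn't model -->
<fieldtype name="ignored" stored="false" indexed="false" multiValued="true"
           class="solr.StrField"/>
<dynamicField name="ignored_*" type="ignored"/>

<!-- solrconfig.xml, inside the /update/extract handler defaults:
     map any field the schema doesn't know to ignored_* -->
<str name="uprefix">ignored_</str>
```

With the dynamicField present, the extracted "link" field becomes ignored_link and is silently dropped instead of raising the unknown-field error, so you do not have to enumerate every possible extracted field in schema.xml.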
Re: Boosting on field empty or not
I have figured it out using your recommendation...I just had to give it a high enough boost. BTW its a float field On Tue, May 15, 2012 at 9:21 AM, Ahmet Arslan iori...@yahoo.com wrote: The problem with what you provided is it is boosting ALL documents whether the field is empty or not Then all of your fields are non-empty? What is the type of your field?
Re: - Solr 4.0 - How do I enable JSP support ? ...
In 4.0, solr no longer uses JSP, so it is not enabled in the example setup. You can enable JSP in your servlet container using whatever method they provide. For Jetty, using start.jar, you need to add the command line: java -jar start.jar -OPTIONS=jsp ryan On Mon, May 14, 2012 at 2:34 PM, Naga Vijayapuram nvija...@tibco.com wrote: Hello, How do I enable JSP support in Solr 4.0 ? Thanks Naga
Re: Highlight feature
That is the default response format. If you would like to change it, you could extend the search handler or post-process the XML data. Another option would be to use javabin (if your app is Java based) and build the XML the way your app needs it. Best Regards, Ramesh
Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!
Hello, Unfortunately it seems like I spoke too early. Today morning I received the same error again even after disabling the iptables. The weird thing is only one out of 6 or 7 queries fails as evidenced in the stack traces below. The query below the stack trace gave a 'status=500' subsequent queries look fine [#|2012-05-15T08:12:38.703-0400|SEVERE|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=32;_ThreadName=httpSSLWorkerThread-9001-8;_RequestID=9f54ea89-357a-4c1b-87a1-fbaacc9fd0ee;|org.apache.solr.common.SolrException at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:275) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:246) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:313) at org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:287) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:218) at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648) at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593) at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:94) at com.sun.enterprise.web.PESessionLockingStandardPipeline.invoke(PESessionLockingStandardPipeline.java:98) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:222) at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648) at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593) at 
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:587) at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1093) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:166) at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648) at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593) at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:587) at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1093) at org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:291) at com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.invokeAdapter(DefaultProcessorTask.java:670) at com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.doProcess(DefaultProcessorTask.java:601) at com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.process(DefaultProcessorTask.java:875) at com.sun.enterprise.web.connector.grizzly.DefaultReadTask.executeProcessorTask(DefaultReadTask.java:365) at com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:285) at com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:221) at com.sun.enterprise.web.connector.grizzly.TaskBase.run(TaskBase.java:269) at com.sun.enterprise.web.connector.grizzly.ssl.SSLWorkerThread.run(SSLWorkerThread.java:111) Caused by: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99) at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:469) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:249) at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:129) at 
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:103) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
Re: Urgent! Highlighting not working as expected
Thanks, Jack! I think you are right. But I also copied cr_firstname to text, and I assumed Solr would highlight cr_firstname if there were a match. I guess the only solution is to copy all fields to another field which is not tokenized. Yes, it is firstname, good catch! Thanks again! TJ -- View this message in context: http://lucene.472066.n3.nabble.com/Urgent-Highlighting-not-working-as-expected-tp3983755p3983907.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!
I have seen similar errors before when the solr version and solrj version in the client don't match. Best Regards, Ramesh
apostrophe / ayn / alif
We are using the ICUFoldingFilterFactory with great success to fold diacritics so searches with and without the diacritics get the same results. We recently discovered we have some Korean records that use an alif diacritic instead of an apostrophe, and this diacritic is NOT getting folded. Has anyone experienced this for alif or ayn characters? Do you have a solution? - Naomi
Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!
I have already triple cross-checked that all my clients are using the same version as the server, which is 3.6. Thanks Ravi Kiran On Tue, May 15, 2012 at 2:09 PM, Ramesh K Balasubramanian beeyar...@yahoo.com wrote: I have seen similar errors before when the solr version and solrj version in the client don't match. Best Regards, Ramesh
Solr Caches
Hello, I am trying to understand how I can size the caches for my solr powered application. Some details on the index and application : Solr Version : 1.3 JDK : 1.5.0_14 32 bit OS : Solaris 10 App Server : Weblogic 10 MP1 Number of documents : 1 million Total number of fields : 1000 (750 strings, 225 int/float/double/long, 25 boolean) Number of fields on which faceting and filtering can be done : 400 Physical size of index : 600MB Number of unique values for a field : Ranges from 5 - 1000. Average of 150 -Xms and -Xmx vals for jvm : 3G Expected number of concurrent users : 15 No sorting planned for now Now I want to set appropriate values for the caches. I have put below some of my understanding and questions about the caches. Please correct and answer accordingly. FilterCache: As per the solr wiki, this is used to store an unordered list of Ids of matching documents for an fq param. So if a query contains two fq params, it will create two separate entries, one for each of these fq params. The value of each entry is the list of ids of all documents across the index that match the corresponding fq param. Each entry is independent of any other entry. A minimum size for filterCache could be (total number of fields * avg number of unique values per field) ? Is this correct ? I have not enabled useFilterForSortedQuery. Max physical size of the filter cache would be (size * avg byte size of a document id * avg number of docs returned per fq param) ? QueryResultsCache: Used to store an ordered list of ids of the documents that match the most commonly used searches. So if my query is something like q=Status:Active&fq=Org:Apache&fq=Version:13, it will create one entry that contains the list of ids of documents that match this full query. Is this correct ? How can I size my queryResultsCache ? 
Some entries from solrconfig.xml : <queryResultWindowSize>50</queryResultWindowSize> <queryResultMaxDocsCached>200</queryResultMaxDocsCached> Max physical size of the queryResultsCache would be (size * avg byte size of a document id * avg number of docs per query). Is this correct ? documentCache: Stores the documents that are stored in the index. So I do two searches that return three documents each, with 1 document being common between both result sets. This will result in 5 entries in the documentCache for the 5 unique documents that have been returned for the two queries ? Is this correct ? For sizing, SolrWiki states that *The size for the documentCache should always be greater than max_results * max_concurrent_queries*. Why do we need the max_concurrent_queries parameter here ? Is it when max_results is much lesser than numDocs ? In my case, a q=*:* search is done the first time the index is loaded. So, will setting documentCache size to numDocs be correct ? Can this be like the max that I need to allocate ? Max physical size of document cache would be (size * avg byte size of a document in the index). Is this correct ? Thank you -Rahul
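For reference, the three caches discussed above are configured in solrconfig.xml like the sketch below (the sizes are illustrative starting points, not recommendations for this particular index):

```xml
<!-- filterCache: one entry per distinct fq; value is the doc-id set matching that filter. -->
<filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/>
<!-- queryResultCache: one entry per full (q, fq, sort) combination; value is an ordered id list. -->
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/>
<!-- documentCache: one entry per unique stored document fetched; no autowarming possible. -->
<documentCache class="solr.LRUCache" size="1024" initialSize="1024" autowarmCount="0"/>
```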
Re: Boosting on field empty or not
Scratch that...it still seems to be boosting documents where the value of the field is empty. bq=regularprice:[0.01 TO *]^50 Results with bq set: <doc><float name="score">2.2172112</float><str name="code">bhl-ltab-30</str></doc> Results without bq set: <doc><float name="score">2.4847748</float><str name="code">bhl-ltab-30</str></doc> On Tue, May 15, 2012 at 12:40 PM, Donald Organ dor...@donaldorgan.com wrote: I have figured it out using your recommendation...I just had to give it a high enough boost. BTW its a float field On Tue, May 15, 2012 at 9:21 AM, Ahmet Arslan iori...@yahoo.com wrote: The problem with what you provided is it is boosting ALL documents whether the field is empty or not Then all of your fields are non-empty? What is the type of your field?
Re: - Solr 4.0 - How do I enable JSP support ? ...
Alright; thanks. Tried with -OPTIONS=jsp and am still seeing this on console … 2012-05-15 12:47:08.837:INFO:solr:No JSP support. Check that JSP jars are in lib/jsp and that the JSP option has been specified to start.jar I am trying to go after http://localhost:8983/solr/collection1/admin/zookeeper.jsp (or its equivalent in 4.0) after going through http://wiki.apache.org/solr/SolrCloud May I know the right zookeeper url in 4.0 please? Thanks Naga On 5/15/12 10:56 AM, Ryan McKinley ryan...@gmail.com wrote: In 4.0, solr no longer uses JSP, so it is not enabled in the example setup. You can enable JSP in your servlet container using whatever method they provide. For Jetty, using start.jar, you need to add the command line: java -jar start.jar -OPTIONS=jsp ryan On Mon, May 14, 2012 at 2:34 PM, Naga Vijayapuram nvija...@tibco.com wrote: Hello, How do I enable JSP support in Solr 4.0 ? Thanks Naga
Replacing payloads for per-document-per-keyword scores
Hello Hoss and the list, We are currently using Lucene payloads to store per-document-per-keyword scores for our dataset. Our dataset consists of photos with keywords assigned (only once each) to them. The index is about 90 GB, running on 24-core machines with dedicated 10k SAS drives, and 16/32 GB allocated to the JVM. When searching the payloads field, our 98 percentile query time is at 2 seconds even with trivially low queries per second. I have asked several Lucene committers about this and it's believed that the implementation of payloads being so general is the cause of the slowness. Hoss guessed that we could override Term Frequency with PreAnalyzedField[1] for the per-keyword scores, since keywords (tags) always have a Term Frequency of 1 and the TF calculation is very fast. However it turns out that you can't[2] specify TF in the PreAnalyzedField. Is there any other way to override Term Frequency during index time? If not, where in the code could this be implemented? An obvious option is to repeat the keyword as many times as its payload score, but that would drastically increase the amount of data per document sent during index time. I'd welcome any other per-document-per-keyword score solutions, or some way to speed up searching a payload field. Thanks, - Neil [1] https://issues.apache.org/jira/browse/SOLR-1535 [2] https://issues.apache.org/jira/browse/SOLR-1535?focusedCommentId=13273501#comment-13273501
Re: apostrophe / ayn / alif
On Tue, May 15, 2012 at 2:47 PM, Naomi Dushay ndus...@stanford.edu wrote: We are using the ICUFoldingFilterFactory with great success to fold diacritics so searches with and without the diacritics get the same results. We recently discovered we have some Korean records that use an alif diacritic instead of an apostrophe, and this diacritic is NOT getting folded. Has anyone experienced this for alif or ayn characters? Do you have a solution? What do you mean alif diacritic in Korean? Alif (ا) isn't a diacritic and isn't used in Korean. Or did you mean arabic dagger alif ( ٰ ) ? This is not a diacritic in unicode (though its a combining mark). -- lucidimagination.com
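If the characters in question are the ones library romanization schemes write as ayn and alif (this is an assumption; the records would need to be inspected), a quick plain-JDK check shows why a diacritic-folding filter can leave them alone: Unicode classifies them as letters, not as combining diacritical marks.

```java
public class CharClasses {
    public static void main(String[] args) {
        // U+02BB MODIFIER LETTER TURNED COMMA (the "ayn" of ALA-LC romanization)
        // and U+02BC MODIFIER LETTER APOSTROPHE (the "alif") have category Lm,
        // i.e. they are modifier LETTERS, not diacritics.
        System.out.println(Character.getType('\u02BB') == Character.MODIFIER_LETTER); // true
        System.out.println(Character.getType('\u02BC') == Character.MODIFIER_LETTER); // true
        // A genuine combining diacritic, e.g. U+0301 COMBINING ACUTE ACCENT, is Mn:
        System.out.println(Character.getType('\u0301') == Character.NON_SPACING_MARK); // true
    }
}
```

Since folding targets marks rather than letters, mapping U+02BB/U+02BC to an apostrophe would need an explicit char mapping (e.g. a charFilter) rather than relying on folding alone.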
Re: Show a portion of searchable text in Solr
Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Show-a-portion-of-searchable-text-in-Solr-tp3983613p3983942.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Replacing payloads for per-document-per-keyword scores
Hello Neil, if manipulating tf is a possible approach, why not extend KeywordTokenizer to make it work in the following manner: 3|wheel -> {wheel, wheel, wheel}? That would let you supply your per-term-per-doc boosts as prefixes on the field values and multiply them during indexing internally. The second consideration is: have you considered Click Scoring Tools from LucidWorks as a relevant approach? Regards On Wed, May 16, 2012 at 12:02 AM, Neil Hooey nho...@gmail.com wrote: Hello Hoss and the list, We are currently using Lucene payloads to store per-document-per-keyword scores for our dataset. Our dataset consists of photos with keywords assigned (only once each) to them. The index is about 90 GB, running on 24-core machines with dedicated 10k SAS drives, and 16/32 GB allocated to the JVM. When searching the payloads field, our 98th percentile query time is at 2 seconds even with trivially low queries per second. I have asked several Lucene committers about this and it's believed that the implementation of payloads being so general is the cause of the slowness. Hoss guessed that we could override Term Frequency with PreAnalyzedField[1] for the per-keyword scores, since keywords (tags) always have a Term Frequency of 1 and the TF calculation is very fast. However it turns out that you can't[2] specify TF in the PreAnalyzedField. Is there any other way to override Term Frequency during index time? If not, where in the code could this be implemented? An obvious option is to repeat the keyword as many times as its payload score, but that would drastically increase the amount of data per document sent during index time. I'd welcome any other per-document-per-keyword score solutions, or some way to speed up searching a payload field. 
Thanks, - Neil [1] https://issues.apache.org/jira/browse/SOLR-1535 [2] https://issues.apache.org/jira/browse/SOLR-1535?focusedCommentId=13273501#comment-13273501 -- Sincerely yours Mikhail Khludnev Tech Lead Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
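A self-contained sketch of the prefix-expansion idea Mikhail describes (plain Java, not an actual Lucene TokenFilter; the "3|wheel" value format is the one from his example and purely illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class BoostPrefixExpander {
    /**
     * Expand "3|wheel" into [wheel, wheel, wheel] so that term frequency
     * carries the per-document-per-keyword score. A value without a prefix
     * defaults to a single occurrence.
     */
    static List<String> expand(String fieldValue) {
        int bar = fieldValue.indexOf('|');
        int repeat = 1;
        String term = fieldValue;
        if (bar > 0) {
            repeat = Integer.parseInt(fieldValue.substring(0, bar));
            term = fieldValue.substring(bar + 1);
        }
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < repeat; i++) {
            tokens.add(term);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(expand("3|wheel")); // [wheel, wheel, wheel]
    }
}
```

Inside Solr this logic would live in a custom Tokenizer or TokenFilter that emits the repeated term (ideally at the same position, so phrase queries are unaffected); the class above only demonstrates the expansion itself. Unlike repeating the keyword in the source document, the boost prefix keeps the data sent at index time small.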
doing a full-import after deleting records in the database - maxDocs
hello, After doing a DIH full-import (with clean=true) after deleting records in the database, i noticed that the number of documents processed did change. example: Indexing completed. Added/Updated: 595908 documents. Deleted 0 documents. however, i noticed the numbers on the statistics page did not change, nor do they match the number of indexed records - can someone help me understand the difference in these numbers and the meaning of maxDoc / numDoc? numDocs : 594893 maxDoc : 594893 -- View this message in context: http://lucene.472066.n3.nabble.com/doing-a-full-import-after-deleting-records-in-the-database-maxDocs-tp3983948.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Boosting on field empty or not
Scratch that...it still seems to be boosting documents where the value of the field is empty. bq=regularprice:[0.01 TO *]^50 Results with bq set: <doc><float name="score">2.2172112</float><str name="code">bhl-ltab-30</str></doc> Results without bq set: <doc><float name="score">2.4847748</float><str name="code">bhl-ltab-30</str></doc> The important thing is the order. Does the order of results change in the way that you want (when you add bq)? It is not a good idea to compare scores of two different queries. I *think* queryNorm is causing this difference. You can add debugQuery=on and see what the difference is.
Re: Boosting on field empty or not
If the bq is only supposed to apply the boost when the field value is greater than 0.01, why would trying another query make sure this is working? It's applying the boost to all the documents: yes, when the boost is high enough, most documents with a value GT 0.01 show up first; however, since it is applying the boost to all the documents, sometimes documents without a value in this field appear before those that do. On Tue, May 15, 2012 at 4:51 PM, Ahmet Arslan iori...@yahoo.com wrote: Scratch that...it still seems to be boosting documents where the value of the field is empty. bq=regularprice:[0.01 TO *]^50 Results with bq set: <doc><float name="score">2.2172112</float><str name="code">bhl-ltab-30</str></doc> Results without bq set: <doc><float name="score">2.4847748</float><str name="code">bhl-ltab-30</str></doc> Important thing is the order. Does the order of results change in a way that you want? (When you add bq) It is not a good idea to compare scores of two different queries. I *think* queryNorm is causing this difference. You can add debugQuery=on and see what is the difference.
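Following Ahmet's debugQuery suggestion, a query of this shape shows exactly what bq contributes per document (the host, handler, and fl list here are illustrative; substitute your own query):

```text
http://localhost:8983/solr/select?defType=dismax&q=<your query>&bq=regularprice:[0.01 TO *]^50&debugQuery=on&fl=code,regularprice,score
```

In the debug explain output, documents with an empty regularprice should show no score contribution from the bq clause; if their rank still shifts relative to the same query without bq, the shift comes from score normalization across the two different queries rather than from the boost being applied to them.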
Re: Exception in DataImportHandler (stack overflow)
Hi, Jon: Well, you don't see that every day! Is it possible that you have something weird going on in your DDL and/or queries, like a tree schema that now suddenly has a cyclical reference? Michael On Tue, May 15, 2012 at 4:33 PM, Jon Drukman jdruk...@gmail.com wrote: I have a machine which does a full update using DataImportHandler every hour. It worked up until a little while ago. I did not change the dataconfig.xml or version of Solr. Here is the beginning of the error in the log (the real thing runs for thousands of lines) 2012-05-15 12:44:30.724166500 SEVERE: Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.StackOverflowError 2012-05-15 12:44:30.724168500 at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669) 2012-05-15 12:44:30.724169500 at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268) 2012-05-15 12:44:30.724171500 at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187) 2012-05-15 12:44:30.724219500 at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359) 2012-05-15 12:44:30.724221500 at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427) 2012-05-15 12:44:30.724223500 at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408) 2012-05-15 12:44:30.724224500 Caused by: java.lang.StackOverflowError 2012-05-15 12:44:30.724225500 at java.lang.String.checkBounds(String.java:404) 2012-05-15 12:44:30.724234500 at java.lang.String.init(String.java:450) 2012-05-15 12:44:30.724235500 at java.lang.String.init(String.java:523) 2012-05-15 12:44:30.724236500 at java.net.SocketOutputStream.socketWrite0(Native Method) 2012-05-15 12:44:30.724238500 at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) 2012-05-15 12:44:30.724239500 at java.net.SocketOutputStream.write(SocketOutputStream.java:153) 2012-05-15 12:44:30.724253500 at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 2012-05-15 12:44:30.724254500 at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 2012-05-15 12:44:30.724256500 at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3345) 2012-05-15 12:44:30.724257500 at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1983) 2012-05-15 12:44:30.724259500 at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2163) 2012-05-15 12:44:30.724267500 at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2618) 2012-05-15 12:44:30.724268500 at com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644) 2012-05-15 12:44:30.724270500 at com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198) 2012-05-15 12:44:30.724271500 at com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617) 2012-05-15 12:44:30.724273500 at com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907) 2012-05-15 12:44:30.724280500 at com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2478) 2012-05-15 12:44:30.724282500 at com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1584) 2012-05-15 12:44:30.724283500 at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4364) 2012-05-15 12:44:30.724285500 at com.mysql.jdbc.ConnectionImpl.cleanup(ConnectionImpl.java:1360) 2012-05-15 12:44:30.724286500 at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2652) 2012-05-15 12:44:30.724321500 at com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644) 2012-05-15 12:44:30.724322500 at com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198) 2012-05-15 12:44:30.724324500 at com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617) 2012-05-15 12:44:30.724325500 at com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907) 2012-05-15 12:44:30.724327500 at com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2478) 2012-05-15 12:44:30.724334500 at 
com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1584) 2012-05-15 12:44:30.724335500 at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4364) 2012-05-15 12:44:30.724336500 at com.mysql.jdbc.ConnectionImpl.cleanup(ConnectionImpl.java:1360) 2012-05-15 12:44:30.724338500 at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2652) 2012-05-15 12:44:30.724339500 at com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644) 2012-05-15 12:44:30.724345500 at com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198) 2012-05-15 12:44:30.724347500 at com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617) 2012-05-15 12:44:30.724348500 at com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907) 2012-05-15 12:44:30.724350500 at com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2478) 2012-05-15 12:44:30.724351500 at com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1584) 2012-05-15
Re: doing a full-import after deleting records in the database - maxDocs
Hello, geeky2: In statistics in the update section, do you see a non-zero value for docsPending? Thanks, Michael On Tue, May 15, 2012 at 4:49 PM, geeky2 gee...@hotmail.com wrote: hello, After doing a DIH full-import (with clean=true) after deleting records in the database, i noticed that the number of documents processed, did change. example: Indexing completed. Added/Updated: 595908 documents. Deleted 0 documents. however, i noticed the numbers on the statistics page did not change nor do they match the number of indexed records - can someone help me understand the difference in these numbers and the meaning of maxDoc / numDoc? numDocs : 594893 maxDoc : 594893 -- View this message in context: http://lucene.472066.n3.nabble.com/doing-a-full-import-after-deleting-records-in-the-database-maxDocs-tp3983948.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Exception in DataImportHandler (stack overflow)
i don't think so, my config is straightforward: <dataConfig> <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://x/xx" user="x" password="x" batchSize="-1" /> <document> <entity name="content" query="select content_id, description, title, add_date from content_solr where active = '1'"> <entity name="tag" query="select tag_id from tags_assoc where content_id = '${content.content_id}'" /> <entity name="likes" query="select count(1) as likes from votes where content_id = '${content.content_id}'" /> <entity name="views" query="select sum(views) as views from media_views mv join content_media cm USING (media_id) WHERE cm.content_id = '${content.content_id}'" /> </entity> </document> </dataConfig> i'm triggering the import with: http://localhost:8983/solr/dataimport?command=full-import&clean=true&commit=true On Tue, May 15, 2012 at 2:07 PM, Michael Della Bitta michael.della.bi...@appinions.com wrote: Hi, Jon: Well, you don't see that every day! Is it possible that you have something weird going on in your DDL and/or queries, like a tree schema that now suddenly has a cyclical reference? Michael On Tue, May 15, 2012 at 4:33 PM, Jon Drukman jdruk...@gmail.com wrote: I have a machine which does a full update using DataImportHandler every hour. It worked up until a little while ago. I did not change the dataconfig.xml or version of Solr. 
Here is the beginning of the error in the log (the real thing runs for thousands of lines) 2012-05-15 12:44:30.724166500 SEVERE: Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.StackOverflowError 2012-05-15 12:44:30.724168500 at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669) 2012-05-15 12:44:30.724169500 at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268) 2012-05-15 12:44:30.724171500 at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187) 2012-05-15 12:44:30.724219500 at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359) 2012-05-15 12:44:30.724221500 at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427) 2012-05-15 12:44:30.724223500 at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408) 2012-05-15 12:44:30.724224500 Caused by: java.lang.StackOverflowError 2012-05-15 12:44:30.724225500 at java.lang.String.checkBounds(String.java:404) 2012-05-15 12:44:30.724234500 at java.lang.String.init(String.java:450) 2012-05-15 12:44:30.724235500 at java.lang.String.init(String.java:523) 2012-05-15 12:44:30.724236500 at java.net.SocketOutputStream.socketWrite0(Native Method) 2012-05-15 12:44:30.724238500 at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) 2012-05-15 12:44:30.724239500 at java.net.SocketOutputStream.write(SocketOutputStream.java:153) 2012-05-15 12:44:30.724253500 at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 2012-05-15 12:44:30.724254500 at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 2012-05-15 12:44:30.724256500 at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3345) 2012-05-15 12:44:30.724257500 at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1983) 2012-05-15 12:44:30.724259500 at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2163) 2012-05-15 12:44:30.724267500 at 
com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2618) 2012-05-15 12:44:30.724268500 at com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644) 2012-05-15 12:44:30.724270500 at com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198) 2012-05-15 12:44:30.724271500 at com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617) 2012-05-15 12:44:30.724273500 at com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907) 2012-05-15 12:44:30.724280500 at com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2478) 2012-05-15 12:44:30.724282500 at com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1584) 2012-05-15 12:44:30.724283500 at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4364) 2012-05-15 12:44:30.724285500 at com.mysql.jdbc.ConnectionImpl.cleanup(ConnectionImpl.java:1360) 2012-05-15 12:44:30.724286500 at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2652) 2012-05-15 12:44:30.724321500 at com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644) 2012-05-15 12:44:30.724322500 at com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198) 2012-05-15 12:44:30.724324500 at com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617) 2012-05-15 12:44:30.724325500 at com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907) 2012-05-15 12:44:30.724327500 at
Re: - Solr 4.0 - How do I enable JSP support ? ...
Finally got a handle on this by looking into the New Admin UI - http://localhost:8983/solr/#/~cloud Thanks Naga On 5/15/12 12:53 PM, Naga Vijayapuram nvija...@tibco.com wrote: Alright; thanks. Tried with -OPTIONS=jsp and am still seeing this on console … 2012-05-15 12:47:08.837:INFO:solr:No JSP support. Check that JSP jars are in lib/jsp and that the JSP option has been specified to start.jar I am trying to go after http://localhost:8983/solr/collection1/admin/zookeeper.jsp (or its equivalent in 4.0) after going through http://wiki.apache.org/solr/SolrCloud May I know the right zookeeper url in 4.0 please? Thanks Naga On 5/15/12 10:56 AM, Ryan McKinley ryan...@gmail.com wrote: In 4.0, solr no longer uses JSP, so it is not enabled in the example setup. You can enable JSP in your servlet container using whatever method they provide. For Jetty, using start.jar, you need to add the command line: java -jar start.jar -OPTIONS=jsp ryan On Mon, May 14, 2012 at 2:34 PM, Naga Vijayapuram nvija...@tibco.com wrote: Hello, How do I enable JSP support in Solr 4.0 ? Thanks Naga
- When is Solr 4.0 due for Release? ...
… Any idea, anyone? Thanks Naga
RE: Exception in DataImportHandler (stack overflow)
Shot in the dark here, but try adding readOnly="true" to your dataSource tag. <dataSource readOnly="true" type="JdbcDataSource" ... /> This sets autocommit to true and sets the Holdability to ResultSet.CLOSE_CURSORS_AT_COMMIT. DIH does not explicitly close resultsets, and maybe if your JDBC driver also manages this poorly you could end up with strange conditions like the one you're getting? It could be a case where your data has grown just over the limit your setup can handle under such an unfortunate circumstance. Let me know if this solves it. If so, we probably should open a bug report and get this fixed in DIH. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Jon Drukman [mailto:jdruk...@gmail.com] Sent: Tuesday, May 15, 2012 4:12 PM To: solr-user@lucene.apache.org Subject: Re: Exception in DataImportHandler (stack overflow) i don't think so, my config is straightforward: <dataConfig> <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://x/xx" user="x" password="x" batchSize="-1" /> <document> <entity name="content" query="select content_id, description, title, add_date from content_solr where active = '1'"> <entity name="tag" query="select tag_id from tags_assoc where content_id = '${content.content_id}'" /> <entity name="likes" query="select count(1) as likes from votes where content_id = '${content.content_id}'" /> <entity name="views" query="select sum(views) as views from media_views mv join content_media cm USING (media_id) WHERE cm.content_id = '${content.content_id}'" /> </entity> </document> </dataConfig> i'm triggering the import with: http://localhost:8983/solr/dataimport?command=full-import&clean=true&commit=true On Tue, May 15, 2012 at 2:07 PM, Michael Della Bitta michael.della.bi...@appinions.com wrote: Hi, Jon: Well, you don't see that every day! Is it possible that you have something weird going on in your DDL and/or queries, like a tree schema that now suddenly has a cyclical reference? 
Michael

On Tue, May 15, 2012 at 4:33 PM, Jon Drukman jdruk...@gmail.com wrote:

 I have a machine which does a full update using DataImportHandler every hour. It worked up until a little while ago. I did not change the dataconfig.xml or version of Solr. Here is the beginning of the error in the log (the real thing runs for thousands of lines):

 2012-05-15 12:44:30.724166500 SEVERE: Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.StackOverflowError
 2012-05-15 12:44:30.724168500 at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669)
 2012-05-15 12:44:30.724169500 at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
 2012-05-15 12:44:30.724171500 at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
 2012-05-15 12:44:30.724219500 at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
 2012-05-15 12:44:30.724221500 at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
 2012-05-15 12:44:30.724223500 at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
 2012-05-15 12:44:30.724224500 Caused by: java.lang.StackOverflowError
 2012-05-15 12:44:30.724225500 at java.lang.String.checkBounds(String.java:404)
 2012-05-15 12:44:30.724234500 at java.lang.String.<init>(String.java:450)
 2012-05-15 12:44:30.724235500 at java.lang.String.<init>(String.java:523)
 2012-05-15 12:44:30.724236500 at java.net.SocketOutputStream.socketWrite0(Native Method)
 2012-05-15 12:44:30.724238500 at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
 2012-05-15 12:44:30.724239500 at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
 2012-05-15 12:44:30.724253500 at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
 2012-05-15 12:44:30.724254500 at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
 2012-05-15 12:44:30.724256500 at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3345)
 2012-05-15 12:44:30.724257500 at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1983)
 2012-05-15 12:44:30.724259500 at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2163)
 2012-05-15 12:44:30.724267500 at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2618)
 2012-05-15 12:44:30.724268500 at com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1644)
 2012-05-15 12:44:30.724270500 at com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:198)
 2012-05-15 12:44:30.724271500 at com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7617)
 2012-05-15 12:44:30.724273500 at com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:907)
 2012-05-15 12:44:30.724280500 at
Re: Boosting on field empty or not
 If the bq is only supposed to apply the boost when the field value is greater than 0.01, why would trying another query make sure this is working? It's applying the boost to all the fields. Yes, when the boost is high enough, most documents with a value GT 0.01 show up first; however, since it is applying the boost to all the documents, sometimes documents without a value in this field appear before those that do.

If boosting is applied to all documents, then why is the result order changing? Sometimes documents without a value can show up before, because there are other factors that contribute to the score calculation:

http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/search/Similarity.html

If you add debugQuery=on, you can see a detailed explanation of how the calculation is done.
Re: Exception in DataImportHandler (stack overflow)
I fixed it for now by upping the wait_timeout on the mysql server. Apparently Solr doesn't like having its connection yanked out from under it, and/or isn't smart enough to reconnect if the server goes away. I'll set it back the way it was and try your readOnly option.

Is there an option with DataImportHandler to have it transmit one or more arbitrary SQL statements after connecting? If there was, I could just send SET wait_timeout=86400; after connecting. That would probably prevent this issue.

-jsd-

On Tue, May 15, 2012 at 2:35 PM, Dyer, James james.d...@ingrambook.com wrote:

 Shot in the dark here, but try adding readOnly=true to your dataSource tag:

 <dataSource readOnly="true" type="JdbcDataSource" ... />

 This sets autocommit to true and sets the holdability to ResultSet.CLOSE_CURSORS_AT_COMMIT. DIH does not explicitly close resultsets, and if your JDBC driver also manages this poorly you could end up with strange conditions like the one you're getting. It could be a case where your data has grown just over the limit your setup can handle under such an unfortunate circumstance. Let me know if this solves it. If so, we probably should open a bug report and get this fixed in DIH.
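If DIH has no hook for running arbitrary SQL on connect, MySQL Connector/J can set session variables through the JDBC URL itself, which might achieve the same thing. A possible workaround sketch, not a tested DIH configuration (the sessionVariables URL parameter is a Connector/J feature; the x/xx placeholders are from the config above):

```xml
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://x/xx?sessionVariables=wait_timeout=86400"
            readOnly="true"
            user="x" password="x" batchSize="-1"/>
```

This would set wait_timeout for the DIH connection only, without changing the server-wide default.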
should i upgrade
We're running Solr v1.4.1 with approx 30M - 40M records at any given time. Often, socket timeout exceptions occur for a search query. Is there a compelling reason to upgrade? I.e., can you set a socket timeout in solrconfig.xml in the latest version and not in v1.4.1?
Re: First query to find meta data, second to search. How to group into one?
Hi Samarendra,

This does look like a candidate for a custom query component if you want to do this inside Solr. You can of course continue to do this at the client.

-sujit

On May 15, 2012, at 12:26 PM, Samarendra Pratap wrote:

 Hi, I need a suggestion for improving relevance of search results. Any help/pointers are appreciated.

 We have the following fields (plus a lot more) in our schema: title, description, category_id (multivalued). We are using mm=70% in solrconfig.xml and qf=title description, and we are not doing phrase queries in q.

 In case of a multi-word search text, the end results are mostly junk, because the words in the search text appear in different fields and in different contexts. For example, searching for water proof (without double quotes) brings a record where title = "rose water" and description = "... no proof of contamination ...".

 Our priority is to remove irrelevant results as much as possible. Increasing mm will not solve this completely because user input may not always be precise enough to benefit from a high mm. To remove irrelevant records we worked on the following solution (or work-around):

 - We fire a first query to get the top n results. We assume the first n results are mostly good results. n is dynamic within a predefined minimum and maximum value.
 - We calculate the frequency of category ids in these top results. We are not using facets because facets count all results, relevant or irrelevant.
 - Based on category frequencies within the top matching results, we find the few most frequent categories by a simple calculation. Now we are quite confident that these categories are the ones which best suit our query.
 - Finally we fire a second query with the top categories, calculated above, in a filter query (fq).

 The quality of results increased very much, so I thought to try it the standard way. Does it require writing a plugin if I want to move the above logic into Solr? Which component do I need to modify - QueryComponent? Or is there any better or equivalent method in Solr of doing this or a similar thing?

 Thanks

 --
 Regards,
 Samar
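Until this moves into a custom SearchComponent, the two-pass approach described above can also stay at the client. A minimal sketch of the category-frequency step (the field name category_id is from the schema above; the sample docs and the water proof query are illustrative assumptions, with the HTTP calls left as comments):

```python
from collections import Counter

def top_categories(docs, max_cats=3):
    # Count category_id frequencies across the top-n docs and return the
    # most frequent ones -- the candidates for the second query's fq.
    counts = Counter(cat for doc in docs for cat in doc.get("category_id", []))
    return [cat for cat, _ in counts.most_common(max_cats)]

# First pass (hypothetical): fetch the top n results, e.g. via
#   /select?q=water+proof&defType=dismax&qf=title+description&fl=category_id&rows=n
docs = [
    {"category_id": [12, 7]},
    {"category_id": [12]},
    {"category_id": [7, 12]},
    {"category_id": [99]},
]

cats = top_categories(docs, max_cats=2)

# Second pass: re-issue the same q with the dominant categories as a filter
fq = "category_id:(" + " OR ".join(str(c) for c in cats) + ")"
print(fq)  # category_id:(12 OR 7)
```

Doing this inside Solr would mean subclassing QueryComponent (or adding a second component after it) that runs the same counting over the first pass's DocList before rewriting the filter list.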
Re: Boosting on field empty or not
Just tested to make sure. queryNorm changes after you add the bq parameter. For example, 0.00317763 = queryNorm becomes 0.0028020076 = queryNorm. Since all scores are multiplied by this queryNorm factor, the score of a document (even one that is not affected/boosted by bq) changes.

Before bq=SOURCE:Haberler^100:

<doc>
  <float name="score">5.246903</float>
  <str name="ID">4529806</str>
  <str name="SOURCE">EnSonHaber</str>
</doc>

After bq=SOURCE:Haberler^100:

<doc>
  <float name="score">4.626675</float>
  <str name="ID">4529806</str>
  <str name="SOURCE">EnSonHaber</str>
</doc>

Does that make sense?

 If the bq is only supposed to apply the boost when the field value is greater than 0.01, why would trying another query make sure this is working? It's applying the boost to all the fields. Yes, when the boost is high enough, most documents with a value GT 0.01 show up first; however, since it is applying the boost to all the documents, sometimes documents without a value in this field appear before those that do.
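The key point is that queryNorm is a per-query constant: changing it shifts every absolute score but cannot, by itself, reorder documents. A small sketch (the raw scores are hypothetical values chosen so that the "before" number matches the 5.246903 above; the queryNorm values are the ones quoted):

```python
# queryNorm is a per-query constant multiplied into every document's score.
# Adding bq changes queryNorm, so absolute scores shift even for documents
# the bq never matches -- but a constant factor alone cannot reorder them.
raw = {"4529806": 1651.20, "4529900": 1020.00}  # hypothetical raw scores

before = {d: s * 0.00317763 for d, s in raw.items()}    # queryNorm without bq
after = {d: s * 0.0028020076 for d, s in raw.items()}   # queryNorm with bq

def ranking(scores):
    # Document ids sorted by descending score
    return sorted(scores, key=scores.get, reverse=True)

same_order = ranking(before) == ranking(after)
print(same_order)  # True: order preserved, absolute scores changed
```

Any actual reordering therefore comes from the bq term itself adding score to the documents it matches, not from the queryNorm change.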
Distributed search between solrclouds?
Hi, Would distributed search (the old way where you provide the solr host IP's etc.) still work between different solrclouds? thanks, Darren
Re: Exception in DataImportHandler (stack overflow)
OK, setting the wait_timeout back to its previous value and adding readOnly didn't help; I got the stack overflow again. I re-upped the mysql timeout value again.

-jsd-

On Tue, May 15, 2012 at 2:42 PM, Jon Drukman jdruk...@gmail.com wrote:

 I fixed it for now by upping the wait_timeout on the mysql server. I'll set it back the way it was and try your readOnly option.

 On Tue, May 15, 2012 at 2:35 PM, Dyer, James james.d...@ingrambook.com wrote:

  Shot in the dark here, but try adding readOnly=true to your dataSource tag.
Re: doing a full-import after deleting records in the database - maxDocs
hello, thanks for the reply. this is the output - docsPending = 0

commits : 1786
autocommit maxDocs : 1000
autocommit maxTime : 6ms
autocommits : 1786
optimizes : 3
rollbacks : 0
expungeDeletes : 0
docsPending : 0
adds : 0
deletesById : 0
deletesByQuery : 0
errors : 0
cumulative_adds : 1787752
cumulative_deletesById : 0
cumulative_deletesByQuery : 3
cumulative_errors : 0

--
View this message in context: http://lucene.472066.n3.nabble.com/doing-a-full-import-after-deleting-records-in-the-database-maxDocs-tp3983948p3983995.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Editing long Solr URLs - Chrome Extension
Erick, yes thanks, I did see that and am working on a solution to that already. Hope to post a new revision shortly and eventually migrate to the extension store.

Cheers
Amit

On May 15, 2012 9:20 AM, Erick Erickson erickerick...@gmail.com wrote:

 I think I put one up already, but in case I messed up github, complex params like the fq here:

 http://localhost:8983/solr/select?q=*:*&fq={!geofilt sfield=store pt=52.67,7.30 d=5}

 aren't properly handled. But I'm already using it occasionally.

 Erick

 On Tue, May 15, 2012 at 10:02 AM, Amit Nithian anith...@gmail.com wrote:

  Jan, thanks for your feedback! If possible can you file these requests on the github page for the extension so I can work on them? They sound like great ideas and I'll try to incorporate all of them in future releases.

  Thanks
  Amit

  On May 11, 2012 9:57 AM, Jan Høydahl j...@hoydahl.no wrote:

   I've been testing https://chrome.google.com/webstore/detail/mbnigpeabbgkmbcbhkkbnlidcobbapff?hl=en but I don't think it's great. Great work on this one. Simple and straightforward. A few wishes:

   * Sticky mode? This tool would make sense in a sidebar, to do rapid refinements
   * If you edit a value and click TAB, it is not updated :(
   * It should not be necessary to URL-encode all non-ASCII chars - why not leave colon, caret (^) etc. as is, for better readability?
   * Some param values in Solr may be large, such as fl, qf or bf. Would be nice if the edit box was multi-line, or perhaps adjusts to the size of the content

   --
   Jan Høydahl, search solution architect
   Cominvent AS - www.facebook.com/Cominvent
   Solr Training - www.solrtraining.com

   On 11. mai 2012, at 07:32, Amit Nithian wrote:

    Hey all, I don't know about you but most of the Solr URLs I issue are fairly lengthy, full of parameters on the query string, and browser location bars aren't long enough / don't have multi-line capabilities. I tried to find something that does this but couldn't, so I wrote a Chrome extension to help.
Please check out my blog post on the subject and please let me know if something doesn't work or needs improvement. Of course this can work for any URL with a query string but my motivation was to help edit my long Solr URLs. http://hokiesuns.blogspot.com/2012/05/manipulating-urls-with-long-query.html Thanks! Amit
Re: should i upgrade
Hi,

I don't think you can set that, but you may still want to upgrade. Solr 3.6 has a lower memory footprint, is faster, and has more features.

Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

From: Jon Kirton jkir...@3taps.com
To: solr-user@lucene.apache.org
Sent: Tuesday, May 15, 2012 5:47 PM
Subject: should i upgrade

We're running Solr v1.4.1 with approx 30M - 40M records at any given time. Often, socket timeout exceptions occur for a search query. Is there a compelling reason to upgrade? I.e., can you set a socket timeout in solrconfig.xml in the latest version and not in v1.4.1?
Re: - When is Solr 4.0 due for Release? ...
Hi Naga,

I'll guess ... Fall 2012.

Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

From: Naga Vijayapuram nvija...@tibco.com
To: solr-user@lucene.apache.org
Sent: Tuesday, May 15, 2012 5:17 PM
Subject: - When is Solr 4.0 due for Release? ...

Any idea, anyone?

Thanks
Naga
Re: Solr Caches
Rahul,

Get SPM for Solr from http://sematext.com/spm and you'll get all the insight into your cache utilization you need and more. Through it you will get (faster) answers to all your questions if you play with your Solr config settings and observe the cache metrics in SPM.

Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

From: Rahul R rahul.s...@gmail.com
To: solr-user@lucene.apache.org
Sent: Tuesday, May 15, 2012 3:20 PM
Subject: Solr Caches

Hello,

I am trying to understand how I can size the caches for my Solr-powered application. Some details on the index and application:

Solr Version : 1.3
JDK : 1.5.0_14 32 bit
OS : Solaris 10
App Server : Weblogic 10 MP1
Number of documents : 1 million
Total number of fields : 1000 (750 strings, 225 int/float/double/long, 25 boolean)
Number of fields on which faceting and filtering can be done : 400
Physical size of index : 600MB
Number of unique values for a field : Ranges from 5 - 1000. Average of 150
-Xms and -Xmx vals for jvm : 3G
Expected number of concurrent users : 15
No sorting planned for now

Now I want to set appropriate values for the caches. I have put below some of my understanding and questions about the caches. Please correct and answer accordingly.

filterCache: As per the Solr wiki, this is used to store an unordered list of ids of matching documents for an fq param. So if a query contains two fq params, it will create two separate entries, one for each of these fq params. The value of each entry is the list of ids of all documents across the index that match the corresponding fq param. Each entry is independent of any other entry. A minimum size for filterCache could be (total number of fields * avg number of unique values per field)? Is this correct? I have not enabled useFilterForSortedQuery. Max physical size of the filter cache would be (size * avg byte size of a document id * avg number of docs returned per fq param)?

queryResultCache: Used to store an ordered list of ids of the documents that match the most commonly used searches. So if my query is something like q=Status:Active&fq=Org:Apache&fq=Version:13, it will create one entry that contains the list of ids of documents that match this full query. Is this correct? How can I size my queryResultCache? Some entries from solrconfig.xml:

<queryResultWindowSize>50</queryResultWindowSize>
<queryResultMaxDocsCached>200</queryResultMaxDocsCached>

Max physical size of the queryResultCache would be (size * avg byte size of a document id * avg number of docs per query). Is this correct?

documentCache: Stores the documents that are stored in the index. Say I do two searches that return three documents each, with 1 document common between both result sets. This will result in 5 entries in the documentCache for the 5 unique documents that have been returned for the two queries? Is this correct? For sizing, the Solr wiki states that "The size for the documentCache should always be greater than max_results * max_concurrent_queries". Why do we need the max_concurrent_queries parameter here? Is it when max_results is much less than numDocs? In my case, a q=*:* search is done the first time the index is loaded. So, will setting documentCache size to numDocs be correct? Can this be like the max that I need to allocate? Max physical size of the document cache would be (size * avg byte size of a document in the index). Is this correct?

Thank you
-Rahul
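To get a feel for why the "total fields * avg unique values" upper bound matters, here is some back-of-the-envelope arithmetic using only the figures stated in the question. The per-entry byte costs are rough assumptions (a dense filter stored as a bitset over the index), not authoritative Solr internals:

```python
# Figures from the question above
faceted_fields = 400        # fields usable for faceting/filtering
avg_unique_values = 150     # average unique values per field
num_docs = 1_000_000

# Worst case: one filterCache entry per distinct field:value fq
filter_entries = faceted_fields * avg_unique_values
print(filter_entries)  # 60000

# Assumption: a dense cached filter costs one bit per document in the index
bytes_per_dense_filter = num_docs // 8
print(bytes_per_dense_filter)  # 125000

# Caching every possible single-field filter at that cost would need far
# more than the 3G heap, so the configured cache size must stay much smaller
print(round(filter_entries * bytes_per_dense_filter / 2**30, 1))  # 7.0
```

So sizing filterCache to hold every possible fq is not realistic here; sizing it to the working set of filters actually repeated across queries (and watching the hit ratio) is the practical approach.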
Re: - Solr 4.0 - How do I enable JSP support ? ...
just use the admin UI -- look at the 'cloud' tab

On Tue, May 15, 2012 at 12:53 PM, Naga Vijayapuram nvija...@tibco.com wrote:

 Alright; thanks. Tried with -OPTIONS=jsp and am still seeing this on the console:

 2012-05-15 12:47:08.837:INFO:solr:No JSP support. Check that JSP jars are in lib/jsp and that the JSP option has been specified to start.jar

 I am trying to go after http://localhost:8983/solr/collection1/admin/zookeeper.jsp (or its equivalent in 4.0) after going through http://wiki.apache.org/solr/SolrCloud

 May I know the right zookeeper url in 4.0 please?

 Thanks
 Naga

 On 5/15/12 10:56 AM, Ryan McKinley ryan...@gmail.com wrote:

  In 4.0, Solr no longer uses JSP, so it is not enabled in the example setup. You can enable JSP in your servlet container using whatever method they provide. For Jetty, using start.jar, you need to add on the command line: java -jar start.jar -OPTIONS=jsp

  ryan

  On Mon, May 14, 2012 at 2:34 PM, Naga Vijayapuram nvija...@tibco.com wrote:

   Hello, how do I enable JSP support in Solr 4.0?

   Thanks
   Naga