Re: Show filename in search result using a FileListEntityProcessor

2011-05-19 Thread Daniel Rijkhof
You should use file instead of fileName in column

<field column="file" name="fileName"/>

Don't forget to add the 'fileName' to the schema.xml in the fields section.

<field name="fileName" type="string" indexed="true" stored="true" />
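
For reference, a minimal data-config.xml sketch along these lines (baseDir and the fileName regex are placeholders; the entity names 'f' and 'tika-test' follow the thread below):

<dataConfig>
  <dataSource type="BinFileDataSource" />
  <document>
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="/data/docs" fileName=".*\.pdf" recursive="true" rootEntity="false">
      <!-- 'file' is the implicit column FileListEntityProcessor fills with the file name -->
      <field column="file" name="fileName"/>
      <entity name="tika-test" processor="TikaEntityProcessor"
              url="${f.fileAbsolutePath}" format="text">
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>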


Have fun,

Daniel Rijkhof
06 12 14 12 17



On Mon, May 16, 2011 at 4:20 PM, Marcel Panse marcel.pa...@gmail.comwrote:

 Hi, thanks for the reply.

 I tried a couple of things both in the tika-test entity and in the entity
 named 'f'.
 In the tika-test entity I tried:

 <field column="fileName" name="${f.fileName}" />
 <field column="fileName" name="${f.file}" />

 even

 <field column="fileName" name="${f.fileAbsolutePath}" />

 I also tried doing things in the entity 'f' like:

 <field column="fileName" name="fileName"/>
 <field column="fileName" name="file"/>

 None of it works. I also added fileName to the schema like:

 <field name="fileName" type="string" indexed="true" stored="true" />

 In fields. Doesn't help.

 Can anyone provide me with a working example? I'm pretty stuck here on
 something that seems really trivial and simple :-(



 On Sat, May 14, 2011 at 22:56, kbootz kbo...@caci.com wrote:

  There is a JIRA item(can't recall it atm) that addresses the issue with
 the
  docs. I'm running 3.1 and per your example you should be able to get it
  using ${f.file}. I think* it should also be in the entity desc. but I'm
  also
  new and that's just how I access it.
 
  GL
 
  --
  View this message in context:
 
 http://lucene.472066.n3.nabble.com/Show-filename-in-search-result-using-a-FileListEntityProcessor-tp2939193p2941305.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 



Re: filter cache and negative filter query

2011-05-19 Thread Juan Antonio Farré Basurte
 : query that in fact returns the negative results. As a simple example, 
 : I believe that, for a boolean field, -field:true is exactly the same as 
 : +field:false, but the former is a negative query and the latter is a 
 
 that's not strictly true in all cases... 
 
 * if the field is multivalued=true, a doc may contain both false and 
   true in field, in which case it would match +field:false but it 
   would not match -field:true
 
 * if the field is not multiValued, and required=false, a doc
   may not contain any value, in which case it would match -field:true but 
   it would not match +field:false

You're totally right. But it was just an example. I just didn't think about 
specifying the field to be single-valued and required.

I did some testing yesterday about how filters are cached, using the admin 
interface.
I noticed that if I perform a facet.query on a boolean field testing it to be 
true or false, it always seems to add two entries to the query cache. Maybe it 
also adds an entry to test for non-existence of the value?
And if I perform a facet.field on the same boolean field, three new entries are 
inserted into the filter cache. Maybe one for true, one for false and one for 
non-existence? I really don't know what it's doing exactly, but at first sight 
it doesn't look like very optimal behaviour...
I'm testing on 1.4.1 lucidworks version of solr, using the boolean field 
inStock of its example schema, with its example data.
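
For illustration, the kind of requests described above would look roughly like this against the example data (a sketch, not the exact URLs used in the test):

http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.query=inStock:true&facet.query=inStock:false
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=inStock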

Out of memory on sorting

2011-05-19 Thread Rohit
Hi,

 

We are moving to a multi-core Solr installation with each of the cores having
millions of documents; documents are also added to the index on an
hourly basis. Everything seems to run fine and I am getting the expected
results and performance, except where sorting is concerned.

 

I have an index of 13217121 documents. Now, when I want to get documents
between two dates and then sort them by ID, Solr goes out of memory. This is
with just me using the system; we might also have simultaneous users. How
can I improve this performance?

 

Rohit



Re: Out of memory on sorting

2011-05-19 Thread rajini maski
Explicit Warming of Sort Fields

If you do a lot of field based sorting, it is advantageous to add explicitly
warming queries to the newSearcher and firstSearcher event listeners in
your solrconfig which sort on those fields, so the FieldCache is populated
prior to any queries being executed by your users.
firstSearcher:
<lst><str name="q">solr rocks</str><str name="start">0</str><str
name="rows">10</str><str name="sort">empID asc</str></lst>
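
For context, a sketch of how such a warming query is usually wired into solrconfig.xml (empID is just the wiki's example sort field):

<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">solr rocks</str><str name="start">0</str>
      <str name="rows">10</str><str name="sort">empID asc</str>
    </lst>
  </arr>
</listener>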



On Thu, May 19, 2011 at 2:39 PM, Rohit ro...@in-rev.com wrote:

 Hi,



 We are moving to a multi-core Solr installation with each of the core
 having
 millions of documents, also documents would be added to the index on an
 hourly basis.  Everything seems to run find and I getting the expected
 result and performance, except where sorting is concerned.



 I have an index size of 13217121 documents, now when I want to get
 documents
 between two dates and then sort them by ID  solr goes out of memory. This
 is
 with just me using the system, we might also have simultaneous users, how
 can I improve this performance?



 Rohit




SOLR Custom datasource integration

2011-05-19 Thread amit.b....@gmail.com
Hi,

We are trying to build an enterprise search solution using SOLR; our data source
is a database which is interfaced with JPA.

The solution looks like:

SOLR INDEX <- JPA <- Oracle database.

We need help to find out the best approach to integrate the Solr index with
JPA.

We tried out two approaches (see the sketch after this list):

Approach 1 -
1. Populating SolrInputDocument with data from JPA
2. Updating EmbeddedSolrServer with the captured data using the SolrJ API.

Approach 2 -
1. Customizing the dataimporthandler of HTTPSolrServer
2. Retrieving data in the dataimporthandler using a JPA entity.
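
A minimal SolrJ sketch of Approach 1 (Solr home path, core name and field values are hypothetical; error handling omitted):

import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;

public class JpaToSolr {
  public static void main(String[] args) throws Exception {
    // Point at the Solr home that contains solr.xml and the core's conf/ (hypothetical path)
    System.setProperty("solr.solr.home", "/path/to/solr/home");
    CoreContainer.Initializer initializer = new CoreContainer.Initializer();
    CoreContainer coreContainer = initializer.initialize();
    EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "core1");

    // In a real implementation these values would come from the JPA entity
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "42");
    doc.addField("title", "Example document");
    server.add(doc);
    server.commit();

    coreContainer.shutdown();
  }
}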

Functional requirements -
1. The solution should be performant for a huge volume of data
2. It should be scalable

We have a few questions which will help us decide on a solution:
Which approach is better to meet our requirements?
Is it a good idea to integrate with Lucene directly as opposed to using EmbeddedSolrServer +
JPA?
If the JVM crashes, will the EmbeddedSolrServer content be lost on reboot?
Can we get support from the Jasper Experts team? Can we buy it? How?






--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-Custom-datasource-integration-tp2960475p2960475.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Highlighting does not work when using !boost as a nested query

2011-05-19 Thread Juan Antonio Farré Basurte
Hi,

The query is generated dynamically and can be more or less complex depending on 
different parameters. I'm also not free to give many details of our 
implementation, but I'll give you the minimal query string that fails and the 
relevant pieces of the config.
The query string is:

/select?q=+id:12345^0.01 +_query_:"{!boost b=$dateboost v=$qq 
deftype=dismax}"&dateboost=recip(ms(NOW/DAY,published_date),3.16e-11,1,1)&qq=user_text&qf=text1^2
 text2&pf=text1^2 text2&tie=0.1&q.alt=*:*&hl=true&hl.fl=text1 
text2&hl.mergeContiguous=true

where id is an int and text1 and text2 are of type text. hl.fl has proven to be 
necessary whenever I use dismax in an inner query. Otherwise, only text2 (the 
default field) is highlighted, and not both fields appearing in qf. For example,
q={!dismax v=$qq}... does not require hl.fl to highlight both text1 and 
text2.
q=+_query_:{!dismax v=$qq}... only highlights text2, unless I specify 
hl.fl.

The given query is probably not minimal in the sense that some of the 
dismax-related parameters can be omitted and the query still fails. But the one 
given always fails (and adding more complexity to it does not make it work, 
quite obviously). Unfortunately, hl.requireFieldMatch=false does not help.

Request handler config is the following:

<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>

Highlighter config is the following:

<highlighting>
  <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter" 
default="true">
    <lst name="defaults">
      <int name="hl.fragsize">100</int>
    </lst>
  </fragmenter>
  <fragmenter name="regex" class="org.apache.solr.highlight.RegexFragmenter">
    <lst name="defaults">
      <int name="hl.fragsize">70</int>
      <float name="hl.regex.slop">0.5</float>
      <str name="hl.regex.pattern">[-\w ,/\n\&quot;']{20,200}</str>
    </lst>
  </fragmenter>
  <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter" 
default="true">
    <lst name="defaults">
      <str name="hl.simple.pre"><![CDATA[<em>]]></str>
      <str name="hl.simple.post"><![CDATA[</em>]]></str>
    </lst>
  </formatter>
</highlighting>

If there's any other information that could be useful, just ask.
Thank you very much for your help,

Juan

El 16/05/2011, a las 23:18, Chris Hostetter escribió:

 
 : As I said in my previous message, if I issue:
 : q=+field1:range +field2:value +_query_:{!dismax v=$qq}
 : highlighting works. I've just discovered the problem is not just with 
 {!boost...}. If I just add a bf parameter to the previous query, highlighting 
 also fails.
 : Anybody knows what can be happening? I'm really stuck on this problem...
 
 Just a hunch, but i suspect the problem has to do with 
 highlighter (or maybe it's the fragment generator?) trying to determine
 matches from query types it doens't understand 
 
 I thought there was a query param you could use to tell the highlighter to 
 use an alternate query string (that would be simpler) instead of the 
 real query ... but i'm not seeing it in the docs.
 
 hl.requireFieldMatch=false might also help (not sure)
 
 In general it would probably be helpful for folks if you could post the 
 *entire* request you are making (full query string and all request params) 
 along with the solrconfig.xml sections that show how your request handler 
 and highlighter are configured.
 
 
 
 -Hoss



How do I write/build query using qf parameter of dismax handler for my use case?

2011-05-19 Thread Gnanakumar
Hi,

How do I write/build a Solr query using dismax handler for my application
specific use case explained below:

Snippet of fields definition from schema.xml:

<field name="documentid" type="string" indexed="true" stored="true"
required="true" />
<field name="companyid" type="long" indexed="true" stored="true"
required="true" />
<field name="textfield1" type="text" indexed="true" stored="false"
required="true" />
<field name="textfield2" type="text" indexed="true" stored="false"
required="true" />
<field name="textfield3" type="text" indexed="true" stored="false"
required="true" />

<uniqueKey>documentid</uniqueKey>
<defaultSearchField>textfield1</defaultSearchField>

Now, I want to search for documents containing solr and struts in all 3
text fields (textfield1, textfield2, textfield3) but within the companyid =
100.

As you can see from above statement, companyid=100 is common here but search
keywords should be searched only in 3 text fields (textfield1, textfield2,
textfield3).

I also understand that this can be written as shown below by qualifying all
the 3 text fields explicitly:
http://localhost/solr/select?q=companyid:100&textfield1:solr AND
struts&textfield2:solr AND struts&textfield3:solr AND struts

But how do I write/build a query using qf parameter of dismax query
handler, so that I don't need to specify all the 3 fields explicitly.

Wiki says: For each word in the query string, dismax builds a
DisjunctionMaxQuery object for that word across all of the fields in the qf
param

NOTE: I'm using edismax as my default query type in my Search Handler.

Regards,
Gnanam



RE: Out of memory on sorting

2011-05-19 Thread Rohit
Thanks for pointing me in the right direction. Now I see that for the configuration
of firstSearcher or newSearcher, the <str name="q"> needs to be configured
in advance. In my case the q is ever changing; users can actually search
for anything and the possibilities of queries are unlimited.

How can I make this generic?

-Rohit



-Original Message-
From: rajini maski [mailto:rajinima...@gmail.com] 
Sent: 19 May 2011 14:53
To: solr-user@lucene.apache.org
Subject: Re: Out of memory on sorting

Explicit Warming of Sort Fields

If you do a lot of field based sorting, it is advantageous to add explicitly
warming queries to the newSearcher and firstSearcher event listeners in
your solrconfig which sort on those fields, so the FieldCache is populated
prior to any queries being executed by your users.
firstSearcher
lst str name=qsolr rocks/strstr name=start0/strstr
name=rows10/strstr name=sortempID asc/str/lst



On Thu, May 19, 2011 at 2:39 PM, Rohit ro...@in-rev.com wrote:

 Hi,



 We are moving to a multi-core Solr installation with each of the core
 having
 millions of documents, also documents would be added to the index on an
 hourly basis.  Everything seems to run find and I getting the expected
 result and performance, except where sorting is concerned.



 I have an index size of 13217121 documents, now when I want to get
 documents
 between two dates and then sort them by ID  solr goes out of memory. This
 is
 with just me using the system, we might also have simultaneous users, how
 can I improve this performance?



 Rohit





Field collapsing patch issues

2011-05-19 Thread Isha Garg

Hi All!


Kindly provide me the links for suitable patches that are applied to 
solr version 1.4.1 and 3.0 so that field collapsing should work properly.



Thanks in advance!
Isha garg


Re: How do I write/build query using qf parameter of dismax handler for my use case?

2011-05-19 Thread Grijesh
edismax supports the full query format of the lucene parser. But you can search using
filter queries, e.g.:

qf=textfield1, textfield2, textfield3&fq=textfield1:solr AND
struts&fq=textfield2:solr AND struts&fq=textfield3:solr AND struts
&fq=companyid:100


-
Thanx: 
Grijesh 
www.gettinhahead.co.in 
--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-do-I-write-build-query-using-qf-parameter-of-dismax-handler-for-my-use-case-tp2960766p2960911.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: filter cache and negative filter query

2011-05-19 Thread Juan Antonio Farré Basurte
 lookups to work with an arbitrary query, you would either need to change 
 the cache structure from Query=>DocSet to a mapping of 
 Query=>[DocSet,inversionBit] and store the same cache value 
 with two keys -- both the positive and the negative; or you keep the 

Well, I don't know how it's working right now, but I guess that, as the 
positive version is being stored, when you look a negative query up, you 
already have a similar lookup problem: either you store two keys for the same value 
or you just transform the negative query into a positive canonical one before 
looking it up. The same could be done in this case, with the difference that, 
yes, you need an inversion bit stored too. The double lookup option sounds 
worse, though benchmarking should be done to know for sure.
Would this optimization influence only memory usage, or are smaller sets also 
faster to intersect, for example? Well, in any case, saving memory allows the 
additional memory to be used to speed up the application, for example, with bigger 
caches.

Re: Highlighting does not work when using !boost as a nested query

2011-05-19 Thread Juan Antonio Farré Basurte
By the way, I was wrong when saying that using bf instead of !boost did not 
work either. I probably hit more than one problem at the same time when I first 
tested that.
I've retested now and this works:

/select?q=+id:12345^0.01 +_query_:"{!dismax 
v=$qq}"&bf=recip(ms(NOW/DAY,published_date),3.16e-11,1,1)&qq=user_text&qf=text1^2
 text2&pf=text1^2 text2&tie=0.1&q.alt=*:*&hl=true&hl.fl=text1 
text2&hl.mergeContiguous=true

But I don't get the multiplicative boost I'd like to use...

El 19/05/2011, a las 11:31, Juan Antonio Farré Basurte escribió:

 Hi,
 
 The query is generated dynamically and can be more or less complex depending 
 on different parameters. I'm also not free to give many details of our 
 implementation, but I'll give you the minimal query string that fails and the 
 relevant pieces of the config.
 The query string is:
 
 /select?q=+id:12345^0.01 +_query_:{!boost b=$dateboost v=$qq 
 deftype=dismax}dateboost=recip(ms(NOW/DAY,published_date),3.16e-11,1,1)qq=user_textqf=text1^2
  text2pf=text1^2 text2tie=0.1q.alt=*:*hl=truehl.fl=text1 
 text2hl.mergeContiguous=true
 
 where id is an int and text1 and text2 are type text. hl.fl has proven to be 
 necessary whenever I use dismax in an inner query. Ohterwise, only text2 (the 
 default field) is highlighted, and not both fields appearing in qf. For 
 example,
 q={!dismax v=$qq}... does not require hl.fl to highlight both text1 and 
 text2.
 q=+_query_:{!dismax v=$qq}... only highlights text2, unless I specify 
 hl.fl.
 
 The given query is probably not minimal in the sense that some of the 
 dismax-related parameters can be omitted and the query still fails. But the 
 one given always fails (and adding more complexity to it does not make it 
 work, quite obviously). Unfortunately, hl.requireFieldMatch=false does not 
 help.
 
 Request handler config is the following:
 
 requestHandler name=standard class=solr.SearchHandler default=true
   lst name=defaults
 str name=echoParamsexplicit/str
   /lst
 /requestHandler
 
 Highlighter config is the following:
 
 highlighting
   fragmenter name=gap class=org.apache.solr.highlight.GapFragmenter 
 default=true
 lst name=defaults
   int name=hl.fragsize100/int
 /lst
   /fragmenter
   fragmenter name=regex class=org.apache.solr.highlight.RegexFragmenter
 lst name=defaults
   int name=hl.fragsize70/int
   float name=hl.regex.slop0.5/float
   str name=hl.regex.pattern[-\w ,/\n\']{20,200}/str
 /lst
   /fragmenter
   formatter name=html class=org.apache.solr.highlight.HtmlFormatter 
 default=true
 lst name=defaults
   str name=hl.simple.preem/str
   str name=hl.simple.post/em/str
 /lst
   /formatter
 /highlighting
 
 If there's any other information that could be useful, just ask.
 Thank you very much for your help,
 
 Juan
 
 El 16/05/2011, a las 23:18, Chris Hostetter escribió:
 
 
 : As I said in my previous message, if I issue:
 : q=+field1:range +field2:value +_query_:{!dismax v=$qq}
 : highlighting works. I've just discovered the problem is not just with 
 {!boost...}. If I just add a bf parameter to the previous query, 
 highlighting also fails.
 : Anybody knows what can be happening? I'm really stuck on this problem...
 
 Just a hunch, but i suspect the problem has to do with 
 highlighter (or maybe it's the fragment generator?) trying to determine
 matches from query types it doens't understand 
 
 I thought there was a query param you could use to tell the highlighter to 
 use an alternate query string (that would be simpler) instead of the 
 real query ... but i'm not seeing it in the docs.
 
 hl.requireFieldMatch=false might also help (not sure)
 
 In general it would probably be helpful for folks if you could post the 
 *entire* request you are making (full query string and all request params) 
 along with the solrconfig.xml sections that show how your request handler 
 and highlighter are configured.
 
 
 
 -Hoss
 



Solr book

2011-05-19 Thread Savvas-Andreas Moysidis
Hello,

Does anyone know if there is a v 3.1 book coming any time soon?

Regards,
Savvas


Re: indexing directed graph

2011-05-19 Thread dani.b.angelov
Thank you Gora in advance!

However, I decided to create a bean for indexing something like that:
...
String[] vertices
String[] edges
int[] triple_inx_levels
...
So I can search for vertex text & edge text in the vertices & edges array
fields, and I hope to find the relation from the triple_inx_levels array, where
I will save indexes of the upper two arrays in a specific order (with some math
function I have not figured out yet). I will try it this way; I hope this will be
enough for me.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-directed-graph-tp2949556p2960964.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr book

2011-05-19 Thread Rafał Kuć
Hello!

  Take   a   look   at   the   Solr   resources   page   on  the  wiki
(http://wiki.apache.org/solr/SolrResources). 


-- 
Regards,
 Rafał Kuć
 http://solr.pl



RE: How do I write/build query using qf parameter of dismax handler for my use case?

2011-05-19 Thread Gnanakumar
 edismax supports full query format of lucene parser.But you can search
using
 filter queries eg.

 qf=textfield1, textfield2, textfield3fq=textfield1:solr AND
 strutsfq=textfield2:solr AND strutsfq=textfield3:solr AND struts
 fq=companyid:100
Is it not possible to build the query without filter queries (fq)?

For example, something like this (I believe this is syntactically not
correct, but something equivalent to this):
q=companyid:100 AND solr AND struts&qf=textfield1,textfield2,textfield3

Basically, I'm just trying to simplify the query syntax.
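
For reference, one way to express this with edismax is to keep the keywords in q, list the fields in qf, require all terms via mm, and restrict the company with a filter query (a sketch, field names as above):

http://localhost/solr/select?defType=edismax&q=solr struts&mm=100%
&qf=textfield1 textfield2 textfield3&fq=companyid:100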



SOLR-2209

2011-05-19 Thread Jean-Sebastien Vachon
Hi All,

I am having some problems with the presence of unnecessary parentheses in my 
query.
A query such as:
title:software AND (title:engineer)
will return no results. Removing the parentheses fixes the issue, but since my 
users can enter parentheses themselves I need to find a way to fix or 
work around this bug. I found that this is related to SOLR-2209, but there is no 
activity on that issue.

Anyone know if this will get fixed some time in the future or if it is already 
fixed in Solr 4?

Otherwise, could someone point me to the code handling this so that I can 
attempt to make a fix?

Thx


Re: Solr book

2011-05-19 Thread Savvas-Andreas Moysidis
great, thanks!

So, I guess  the Solr In Action and Solr Cookbook will be based on 3.1..
:)

2011/5/19 Rafał Kuć ra...@alud.com.pl

 Hello!

  Take   a   look   at   the   Solr   resources   page   on  the  wiki
 (http://wiki.apache.org/solr/SolrResources).


 --
 Regards,
  Rafał Kuć
  http://solr.pl




Re: sorting on date field in facet query

2011-05-19 Thread Dmitry Kan
Hi Erick,

It is about ordering the facet information. The result set is empty via
rows=0.

Here is the logic and an example:

Each doc has a string field someStr and a date field associated with it, and
the same doc id always has the same value of the date field. Question: is it possible to
sort the facet values given below on that date field?

curl
http://localhost:8983/solr/select?q=someStr:network&facet=true&facet.field=id&facet.limit=1000&facet.mincount=1&rows=0

result excerpt:

<lst name="facet_fields">
  <lst name="id">
    <int name="T-AS_1386229">54</int>
    <int name="T-AS_1386181">45</int>
    <int name="T-CP_1370095">36</int>
    <int name="T-AS_1377809">25</int>
    <int name="T-CP_1380207">18</int>
    <int name="T-CP_1373820">11</int>
    <int name="T-AS_1372073-1">8</int>
    <int name="T-AS_1367577">6</int>
    <int name="T-AS_1383141">5</int>
    <int name="T-AS_1383648-1">5</int>
    <int name="T-AS_1351183-1">4</int>
  </lst>
</lst>


Regards,

Dmitry




On Wed, May 18, 2011 at 3:33 PM, Erick Erickson erickerick...@gmail.comwrote:

 Can you provide an example of what you are trying to do? Are you
 referring to ordering the result set or the facet information?

 Best
 Erick

 On Wed, May 18, 2011 at 7:21 AM, Dmitry Kan dmitry@gmail.com wrote:
  Hello list,
 
  Is it possible to sort on date field in a facet query in SOLR 3.1?
 
  --
  Regards,
 
  Dmitry Kan
 



Re: Fuzzy search and solr 4.0

2011-05-19 Thread Michael McCandless
Well the good news is FuzzyQuery is indeed much faster in Lucene/Solr 4.0.

But the bad news is... FuzzyQuery won't do what you need here.  You
need some sort of FuzzyPhraseQuery, which is able to replace terms
similar to one another (comp/company/corporation) by some metric.  I
don't know of such a query in Lucene/Solr... but it'd be a nice
addition.  Others have asked about this before.

FuzzyQuery finds terms close to other terms, when measured by edit
distance, eg fuzzy/wuzzy/muzzy are all edit distance one from each
other.

Mike

http://blog.mikemccandless.com

On Wed, May 18, 2011 at 8:03 PM, Guilherme Aiolfi grad...@gmail.com wrote:
 Hi,

 I want to do a fuzzy search that compare a phrase to a field in solr. For
 example:

 abc company ltda will be compared to abc comp, abc corporation, def
 company ltda, nothing to match here.

 The thing is the it has to always returns documents sorted by its score.

 I've found some good algorithms to do that, like StrikeAMatch[1] and
 JaroWinkler.

 Using the JaroWinkler with strdist() I can do exactly that. But, I rather
 prefer to use the StrikeAMatch that had a patch in the lucene jira that was
 never commited.

 So, I contacted the author of that patch and he told me that I should use
 the solr 4.0 that it has now some pretty good new fuzzy search enhancements
 that made StrikeAMatch seems toys for kids.

 Anyone know how can I achieve that using solr 4.0?

 [1] http://www.catalysoft.com/articles/StrikeAMatch.html
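
As a side note, the strdist()/JaroWinkler approach mentioned in the quoted message can be written as a sort by function query; a sketch (the field name is hypothetical, and sorting by a function requires Solr 3.1+):

http://localhost:8983/solr/select?q=*:*&fl=name,score
&sort=strdist("abc company ltda",name,jw) desc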



Re: Out of memory on sorting

2011-05-19 Thread Erick Erickson
The warming queries warm up the caches used in sorting. So
just including the sort=... clause will warm the sort caches; the terms
searched are not important. The same is true with facets...

However, I don't understand how that relates to your OOM problems. I'd
expect the OOM to start happening on startup, you'd be doing
the operation that runs you out of memory on startup...

So, we need more details:
1 how is your sort field defined? String? Integer? If it's a string
 and you could change it to a numeric type, you'd use a lot
 less memory.
2 How many distinct terms? I'm guessing one/document actually,
 this is somewhat of an anti-pattern in Solr for all it's sometimes
 necessary.
3 How much memory are you allocating for the JVM?
4 What other fields are you sorting on and how many unique values
 in each? Solr Admin can help you here

Best
Erick


On Thu, May 19, 2011 at 6:20 AM, Rohit ro...@in-rev.com wrote:
 Thanks for pointing me in the right direction, now I see the configuration
 for firstsearcher or newsearcher, the str name=q needs to configured
 previously. In my case the q is every changing, users can actually search
 for anything and the possibilities of queries unlimited.

 How can I make this generic?

 -Rohit



 -Original Message-
 From: rajini maski [mailto:rajinima...@gmail.com]
 Sent: 19 May 2011 14:53
 To: solr-user@lucene.apache.org
 Subject: Re: Out of memory on sorting

 Explicit Warming of Sort Fields

 If you do a lot of field based sorting, it is advantageous to add explicitly
 warming queries to the newSearcher and firstSearcher event listeners in
 your solrconfig which sort on those fields, so the FieldCache is populated
 prior to any queries being executed by your users.
 firstSearcher
 lst str name=qsolr rocks/strstr name=start0/strstr
 name=rows10/strstr name=sortempID asc/str/lst



 On Thu, May 19, 2011 at 2:39 PM, Rohit ro...@in-rev.com wrote:

 Hi,



 We are moving to a multi-core Solr installation with each of the core
 having
 millions of documents, also documents would be added to the index on an
 hourly basis.  Everything seems to run find and I getting the expected
 result and performance, except where sorting is concerned.



 I have an index size of 13217121 documents, now when I want to get
 documents
 between two dates and then sort them by ID  solr goes out of memory. This
 is
 with just me using the system, we might also have simultaneous users, how
 can I improve this performance?



 Rohit






Re: Field collapsing patch issues

2011-05-19 Thread Erick Erickson
Here's the root issue, and all available patches:
https://issues.apache.org/jira/browse/SOLR-236

I confess I have no clue what's what here, so
you're largely on your own. There are some
encouraging titles (note you can sort the patches
by date, which might help in figuring out which
to use)..

Best
Erick


On Thu, May 19, 2011 at 6:43 AM, Isha Garg isha.g...@orkash.com wrote:
 Hi All!


 Kindly provide me the links for suitable patches that are applied to solr
 version 1.4.1 and 3.0 so that field collapsing should work properly.


 Thanks in advance!
 Isha garg



Spatial search with SolrJ 3.1 ? How to

2011-05-19 Thread martin_groenhof
How do you construct a query in Java for spatial search? Not via the default
Solr REST interface.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Spatial-search-with-SolrJ-3-1-How-to-tp2961136p2961136.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Out of memory on sorting

2011-05-19 Thread Rohit
Hi Erick,

My OOM problem starts when I query the core with 13217121 documents. My
schema and other details are given below,

1 how is your sort field defined? String? Integer? If it's a string and you
could change it to a numeric type, you'd use a lot less memory.

We primarily use two different sort criteria: one is a date field and the
other is a string (id). I cannot change the id field as this is also the
uniquekey for my schema.

2 How many distinct terms? I'm guessing one/document actually,this is
somewhat of an anti-pattern in Solr for all it's sometimes necessary.

Since one of the fields is a timestamp and the other a unique key,
all values are distinct. (These are tweets happening for a keyword.)

3 How much memory are you allocating for the JVM?

I am starting solr with the following command java -Xms1024M -Xmx-2048M
start.jar


All our test cases for moving to Solr have passed, so this is proving to be a big
setback. Help would be greatly appreciated.

Regards,
Rohit



-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: 19 May 2011 18:21
To: solr-user@lucene.apache.org
Subject: Re: Out of memory on sorting

The warming queries warm up the caches used in sorting. So
just including the sort=. will warm the sort caches. the terms
searched are not important. The same is true with facets...

However, I don't understand how that relates to your OOM problems. I'd
expect the OOM to start happening on startup, you'd be doing
the operation that runs you out of memory on startup...

So, we need more details:
1 how is your sort field defined? String? Integer? If it's a string
 and you could change it to a numeric type, you'd use a lot
 less memory.
2 How many distinct terms? I'm guessing one/document actually,
 this is somewhat of an anti-pattern in Solr for all it's sometimes
 necessary.
3 How much memory are you allocating for the JVM?
4 What other fields are you sorting on and how many unique values
 in each? Solr Admin can help you here

Best
Erick


On Thu, May 19, 2011 at 6:20 AM, Rohit ro...@in-rev.com wrote:
 Thanks for pointing me in the right direction, now I see the configuration
 for firstsearcher or newsearcher, the str name=q needs to configured
 previously. In my case the q is every changing, users can actually search
 for anything and the possibilities of queries unlimited.

 How can I make this generic?

 -Rohit



 -Original Message-
 From: rajini maski [mailto:rajinima...@gmail.com]
 Sent: 19 May 2011 14:53
 To: solr-user@lucene.apache.org
 Subject: Re: Out of memory on sorting

 Explicit Warming of Sort Fields

 If you do a lot of field based sorting, it is advantageous to add
explicitly
 warming queries to the newSearcher and firstSearcher event listeners
in
 your solrconfig which sort on those fields, so the FieldCache is populated
 prior to any queries being executed by your users.
 firstSearcher
 lst str name=qsolr rocks/strstr name=start0/strstr
 name=rows10/strstr name=sortempID asc/str/lst



 On Thu, May 19, 2011 at 2:39 PM, Rohit ro...@in-rev.com wrote:

 Hi,



 We are moving to a multi-core Solr installation with each of the core
 having
 millions of documents, also documents would be added to the index on an
 hourly basis.  Everything seems to run find and I getting the expected
 result and performance, except where sorting is concerned.



 I have an index size of 13217121 documents, now when I want to get
 documents
 between two dates and then sort them by ID  solr goes out of memory. This
 is
 with just me using the system, we might also have simultaneous users, how
 can I improve this performance?



 Rohit







Re: Spatial search with SolrJ 3.1 ? How to

2011-05-19 Thread Yonik Seeley
On Thu, May 19, 2011 at 8:52 AM, martin_groenhof
martin.groen...@yahoo.com wrote:
 How do you construct a query in java for spatial search ? not the default
 solr REST interface

It depends on what you are trying to do - a spatial request (as
currently implemented in Solr) is typically more than just a query...
it can be filtering by a bounding box, filtering by a distance radius,
 or using a distance (geodist) function query in another way such as
sorting by it or using it as a factor in relevance.


-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco


Re: Fuzzy search and solr 4.0

2011-05-19 Thread Guilherme Aiolfi
Do you, or any other Solr member, know of a good fuzzy string matching library to
recommend?

On Thu, May 19, 2011 at 9:39 AM, Michael McCandless 
luc...@mikemccandless.com wrote:

 Well the good news is FuzzyQuery is indeed much faster in Lucene/Solr
 4.0.

 But the bad news is... FuzzyQuery won't do what you need here.  You
 need some sort of FuzzyPhraseQuery, which is able to replace terms
 similar to one another (comp/company/corporation) by some metric.  I
 don't know of such a query in Lucene/Solr... but it'd be a nice
 addition.  Others have asked about this before.

 FuzzyQuery finds terms close to other terms, when measured by edit
 distance, eg fuzzy/wuzzy/muzzy are all edit distance one from each
 other.

 Mike

 http://blog.mikemccandless.com

 On Wed, May 18, 2011 at 8:03 PM, Guilherme Aiolfi grad...@gmail.com
 wrote:
  Hi,
 
  I want to do a fuzzy search that compare a phrase to a field in solr. For
  example:
 
  abc company ltda will be compared to abc comp, abc corporation,
 def
  company ltda, nothing to match here.
 
  The thing is the it has to always returns documents sorted by its score.
 
  I've found some good algorithms to do that, like StrikeAMatch[1] and
  JaroWinkler.
 
  Using the JaroWinkler with strdist() I can do exactly that. But, I rather
  prefer to use the StrikeAMatch that had a patch in the lucene jira that
 was
  never commited.
 
  So, I contacted the author of that patch and he told me that I should use
  the solr 4.0 that it has now some pretty good new fuzzy search
 enhancements
  that made StrikeAMatch seems toys for kids.
 
  Anyone know how can I achieve that using solr 4.0?
 
  [1] http://www.catalysoft.com/articles/StrikeAMatch.html
 



[Announce] White paper describing Near Real Time Implementation with Solr and RankingAlgorithm

2011-05-19 Thread Nagendra Nagarajayya

Hi!

I would like to announce a white paper that describes the technical 
details of  Near Real Time implementation with Solr and the 
RankingAlgorithm. The paper discusses the modifications made to enable NRT.


You can download the white paper from here:
http://solr-ra.tgels.com/papers/NRT_Solr_RankingAlgorithm.pdf

The modified src can also be downloaded from here:
http://solr-ra.tgels.com

Regards,

- Nagendra Nagarajayya
http://solr-ra.tgels.com





Re: sorting on date field in facet query

2011-05-19 Thread Erick Erickson
The only two ways to influence facet order are by count and alphabetically.

facet.sort=index will sort by alpha, the default is facet.sort=count

All that said, I still don't quite understand what you're asking for. Facets
are simply a count of the documents that have unique values for, in your
case, the id field. It doesn't make sense to sort the returned facets
by some other field. You can facet on the other field and sort *that*.

Sorting the documents returned is unrelated, but I don't think that's what
you're asking...

Or I completely miss the point...

Best
Erick

On Thu, May 19, 2011 at 8:24 AM, Dmitry Kan dmitry@gmail.com wrote:
 Hi Erick,

 It is about ordering the facet information. The result set is empty via
 rows=0.

 Here is the logics and example:

 Each doc has string field someStr and a date field associated with it, and
 same doc id has same value of the date field. Question: is it possible to
 sort the facet values given below on that date field?

 curl
 http://localhost:8983/solr/select?q=someStr:networkfacet=truefacet.field=idfacet.limit=1000facet.mincount=1rows=0

 result excerpt:

 lst name=facet_fields
 lst name=id
 int name=T-AS_1386229
 54
 /int
 int name=T-AS_1386181
 45
 /int
 int name=T-CP_1370095
 36
 /int
 int name=T-AS_1377809
 25
 /int
 int name=T-CP_1380207
 18
 /int
 int name=T-CP_1373820
 11
 /int
 int name=T-AS_1372073-1
 8
 /int
 int name=T-AS_1367577
 6
 /int
 int name=T-AS_1383141
 5
 /int
 int name=T-AS_1383648-1
 5
 /int
 int name=T-AS_1351183-1
 4
 /int
 /lst
 /lst


 Regards,

 Dmitry




 On Wed, May 18, 2011 at 3:33 PM, Erick Erickson 
 erickerick...@gmail.comwrote:

 Can you provide an example of what you are trying to do? Are you
 referring to ordering the result set or the facet information?

 Best
 Erick

 On Wed, May 18, 2011 at 7:21 AM, Dmitry Kan dmitry@gmail.com wrote:
  Hello list,
 
  Is it possible to sort on date field in a facet query in SOLR 3.1?
 
  --
  Regards,
 
  Dmitry Kan
 




Re: sorting on date field in facet query

2011-05-19 Thread Stefan Matheis
Dmitry,

how should that work? Take this short sample data:

id | date
T-AS_1386229 | 1995-12-31T23:59:59Z
T-AS_1386181 | 1996-12-31T23:59:59Z
T-AS_1386229 | 1997-12-31T23:59:59Z

So, you'll have two facets for the ids .. but how should they be
sorted? One (of the two) is the first and the other the last Document
.. so, sort by lowest date? highest date? i guess, that would/could
not really work.

Perhaps we have to ask another Question .. what are you trying to
achieve? Boost by Date?

Regards
Stefan

On Thu, May 19, 2011 at 2:24 PM, Dmitry Kan dmitry@gmail.com wrote:
 Hi Erick,

 It is about ordering the facet information. The result set is empty via
 rows=0.

 Here is the logics and example:

 Each doc has string field someStr and a date field associated with it, and
 same doc id has same value of the date field. Question: is it possible to
 sort the facet values given below on that date field?

 curl
 http://localhost:8983/solr/select?q=someStr:networkfacet=truefacet.field=idfacet.limit=1000facet.mincount=1rows=0

 result excerpt:

 lst name=facet_fields
 lst name=id
 int name=T-AS_1386229
 54
 /int
 int name=T-AS_1386181
 45
 /int
 int name=T-CP_1370095
 36
 /int
 int name=T-AS_1377809
 25
 /int
 int name=T-CP_1380207
 18
 /int
 int name=T-CP_1373820
 11
 /int
 int name=T-AS_1372073-1
 8
 /int
 int name=T-AS_1367577
 6
 /int
 int name=T-AS_1383141
 5
 /int
 int name=T-AS_1383648-1
 5
 /int
 int name=T-AS_1351183-1
 4
 /int
 /lst
 /lst


 Regards,

 Dmitry




 On Wed, May 18, 2011 at 3:33 PM, Erick Erickson 
 erickerick...@gmail.comwrote:

 Can you provide an example of what you are trying to do? Are you
 referring to ordering the result set or the facet information?

 Best
 Erick

 On Wed, May 18, 2011 at 7:21 AM, Dmitry Kan dmitry@gmail.com wrote:
  Hello list,
 
  Is it possible to sort on date field in a facet query in SOLR 3.1?
 
  --
  Regards,
 
  Dmitry Kan
 




Re: Out of memory on sorting

2011-05-19 Thread Erick Erickson
See below:

On Thu, May 19, 2011 at 9:06 AM, Rohit ro...@in-rev.com wrote:
 Hi Erick,

 My OOM problem starts when I query the core with 13217121 documents. My
 schema and other details are given below,

Hmmm, how many cores are you running and what are they doing? Because they
all use the same memory pool, so you may be getting some carry-over. So one
strategy would be just to move this core to a dedicated machine.


 1 how is your sort field defined? String? Integer? If it's a string and you
 could change it to a numeric type, you'd use a lot less memory.

 We primarily use two different sort criteria one is a date field and the
 other is string (id). I cannot change the id field as this is also the
 uniquekey for my schema.

OK, but can you use a separate field just for sorting? Populate it with
a copyField and sort on that rather than ID. This is only helpful if
you can make a compact representation, e.g. integer.
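
A schema sketch of that idea (the field name and the numeric type are illustrative, and it only works if the id values can be parsed as numbers):

<field name="id_sort" type="tlong" indexed="true" stored="false" />
<copyField source="id" dest="id_sort" />

Queries would then use sort=id_sort asc instead of sorting on the string id field.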


 2 How many distinct terms? I'm guessing one/document actually,this is
 somewhat of an anti-pattern in Solr for all it's sometimes necessary.

 Since one of the field is a timestamp instance and the other a unique key
 all are distinct. (These are tweets happening for keyword)


Not one, but two fields where all values are distinct. Although  I don't think
the timestamp is much of a problem, assuming you're storing it as one
of the numeric types (I'd especially make sure it was one of the Trie types,
specifically tdate if you're going to do range queries). There are tricks for
dealing with this, but your id field will get you a bigger bang for the buck,
concentrate on that first.

 3 How much memory are you allocating for the JVM?

 I am starting solr with the following command java -Xms1024M -Xmx-2048M
 start.jar


Well, you can bump this higher if you're on a 64-bit OS. The other possibility is
to shard your index. But really, with 13M documents this should fit on one
machine.

What does your statistics page tell you, especially about cache usage?




 All out test case for moving to solr has passed, this is proving to be a big
 set back. Help would be greatly appreciated.

 Regards,
 Rohit



 -Original Message-
 From: Erick Erickson [mailto:erickerick...@gmail.com]
 Sent: 19 May 2011 18:21
 To: solr-user@lucene.apache.org
 Subject: Re: Out of memory on sorting

 The warming queries warm up the caches used in sorting. So
 just including the sort=. will warm the sort caches. the terms
 searched are not important. The same is true with facets...

 However, I don't understand how that relates to your OOM problems. I'd
 expect the OOM to start happening on startup, you'd be doing
 the operation that runs you out of memory on startup...

 So, we need more details:
 1 how is your sort field defined? String? Integer? If it's a string
     and you could change it to a numeric type, you'd use a lot
     less memory.
 2 How many distinct terms? I'm guessing one/document actually,
     this is somewhat of an anti-pattern in Solr for all it's sometimes
     necessary.
 3 How much memory are you allocating for the JVM?
 4 What other fields are you sorting on and how many unique values
     in each? Solr Admin can help you here

 Best
 Erick


 On Thu, May 19, 2011 at 6:20 AM, Rohit ro...@in-rev.com wrote:
 Thanks for pointing me in the right direction, now I see the configuration
 for firstsearcher or newsearcher, the str name=q needs to configured
 previously. In my case the q is every changing, users can actually search
 for anything and the possibilities of queries unlimited.

 How can I make this generic?

 -Rohit



 -Original Message-
 From: rajini maski [mailto:rajinima...@gmail.com]
 Sent: 19 May 2011 14:53
 To: solr-user@lucene.apache.org
 Subject: Re: Out of memory on sorting

 Explicit Warming of Sort Fields

 If you do a lot of field based sorting, it is advantageous to add
 explicitly
 warming queries to the newSearcher and firstSearcher event listeners
 in
 your solrconfig which sort on those fields, so the FieldCache is populated
 prior to any queries being executed by your users.
 firstSearcher
 lst str name=qsolr rocks/strstr name=start0/strstr
 name=rows10/strstr name=sortempID asc/str/lst



 On Thu, May 19, 2011 at 2:39 PM, Rohit ro...@in-rev.com wrote:

 Hi,



 We are moving to a multi-core Solr installation with each of the core
 having
 millions of documents, also documents would be added to the index on an
 hourly basis.  Everything seems to run find and I getting the expected
 result and performance, except where sorting is concerned.



 I have an index size of 13217121 documents, now when I want to get
 documents
 between two dates and then sort them by ID  solr goes out of memory. This
 is
 with just me using the system, we might also have simultaneous users, how
 can I improve this performance?



 Rohit








Re: sorting on date field in facet query

2011-05-19 Thread Dmitry Kan
Hi,

Thanks for the questions, guys, and sorry for the confusion. I should start
with a broader picture of what we are trying to achieve. The only problem is
that I cannot speak about specifics of the task we are solving the way we
do. We currently sort the facets on the client side, having the date values
at hand (obtained by a boolean query to SOLR with a list of ids). However,
sometimes we have glitches: since we limit the facets to the first
facet.limit ones, and there is no date boosting, some of the facet values we
need may end up beyond the returned range, which is unfortunate. One way around
it would be to facet with pagination, where a page would correspond to a
date subrange in the range of required dates. But we haven't tried that yet,
before we investigate what can be done inside SOLR (by modifying its source
code, if needed).

So, as said, every solr doc that has some id in the solr index (this id is
used to combine several solr docs logically, for that purpose only; this design
comes from the task definition) has a date field, and the value of that date
field is always the same for a given doc id across all the solr docs with the
same doc id.

Now, taking the Stefan's example, I would like to sort desc the facets by
date (yes, date boosting during the facet gathering process) that were
calculated against someStr field:

<int name="T-AS_1386181">45</int>
<int name="T-AS_1386229">54</int>

So SOLR facet component would ignore the counts and sort the facets by dates
desc (in reverse chronological order).

Is it possible to implement such a solution through some class inheritance
in facet component?

Regards,

Dmitry

On Thu, May 19, 2011 at 4:25 PM, Stefan Matheis 
matheis.ste...@googlemail.com wrote:

 Dmitry,

 how should that work? Take a this short sample-data:

 id | date
 T-AS_1386229 | 1995-12-31T23:59:59Z
 T-AS_1386181 | 1996-12-31T23:59:59Z
 T-AS_1386229 | 1997-12-31T23:59:59Z

 So, you'll have two facets for the ids .. but how should they be
 sorted? One (of the two) is the first and the other the last Document
 .. so, sort by lowest date? highest date? i guess, that would/could
 not really work.

 Perhaps we have to ask another Question .. what are you trying to
 achieve? Boost by Date?

 Regards
 Stefan

 On Thu, May 19, 2011 at 2:24 PM, Dmitry Kan dmitry@gmail.com wrote:
  Hi Erick,
 
  It is about ordering the facet information. The result set is empty via
  rows=0.
 
  Here is the logics and example:
 
  Each doc has string field someStr and a date field associated with it,
 and
  same doc id has same value of the date field. Question: is it possible to
  sort the facet values given below on that date field?
 
  curl
 
 http://localhost:8983/solr/select?q=someStr:networkfacet=truefacet.field=idfacet.limit=1000facet.mincount=1rows=0
 
  result excerpt:
 
  lst name=facet_fields
  lst name=id
  int name=T-AS_1386229
  54
  /int
  int name=T-AS_1386181
  45
  /int
  int name=T-CP_1370095
  36
  /int
  int name=T-AS_1377809
  25
  /int
  int name=T-CP_1380207
  18
  /int
  int name=T-CP_1373820
  11
  /int
  int name=T-AS_1372073-1
  8
  /int
  int name=T-AS_1367577
  6
  /int
  int name=T-AS_1383141
  5
  /int
  int name=T-AS_1383648-1
  5
  /int
  int name=T-AS_1351183-1
  4
  /int
  /lst
  /lst
 
 
  Regards,
 
  Dmitry
 
 
 
 
  On Wed, May 18, 2011 at 3:33 PM, Erick Erickson erickerick...@gmail.com
 wrote:
 
  Can you provide an example of what you are trying to do? Are you
  referring to ordering the result set or the facet information?
 
  Best
  Erick
 
  On Wed, May 18, 2011 at 7:21 AM, Dmitry Kan dmitry@gmail.com
 wrote:
   Hello list,
  
   Is it possible to sort on date field in a facet query in SOLR 3.1?
  
   --
   Regards,
  
   Dmitry Kan
  
 
 



Re: SOLR-2209

2011-05-19 Thread Erick Erickson
What version of Solr are you using? Because this works fine for me.

Could you attach the results of adding debugQuery=on in both instances?
The parsed form of the query is identical in 1.4.1 as far as I can tell. The bug
you're referencing is a peculiarity of the not (-) operator I think.

Best
Erick

On Thu, May 19, 2011 at 7:25 AM, Jean-Sebastien Vachon
jean-sebastien.vac...@wantedtech.com wrote:
 Hi All,

 I am having some problems with the presence of unnecessary  parenthesis in my 
 query.
 A query such as:
                title:software AND (title:engineer)
 will return no results. Remove the parenthesis fix the issue but then since 
 my user can enter the parenthesis by himself I need to find a way to fix or 
 work-around this bug. I found that this is related to SOLR-2209 but there is 
 no activity on this bug.

 Anyone know if this will get fixed some time in the future or if it is 
 already fixed in Solr 4?

 Otherwise, could someone point me to the code handling this so that I can 
 attempt to make a fix?

 Thx



Re: Spatial search with SolrJ 3.1 ? How to

2011-05-19 Thread martin_groenhof
I don't care about the method, I just want results within let's say 10km of a
lat,lng ?

(I can do this with REST) but don't know how to with a Java API

[code]SpatialOptions spatialOptions =
new SpatialOptions(company.getLatitude() + "," +
company.getLongitude(),
10, new SchemaField("geolocation", null), searchName, 20,
DistanceUnits.KILOMETERS);

LatLonType latLonType = new LatLonType();

Query query = latLonType.createSpatialQuery(new
SpatialFilterQParser(searchString.toString(), solrq, solrq, null, true),
spatialOptions);[/code]

(I am trying with this, but it does not seem to be compatible with solr only
lucene)

Any example will do, Thx
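
For what it's worth, a minimal SolrJ sketch using the standard {!geofilt} parser instead of the internal classes above (URL, field name and coordinates are placeholders; assumes the field is a LatLonType and Solr 3.1+):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class SpatialSearchExample {
  public static void main(String[] args) throws Exception {
    SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

    SolrQuery query = new SolrQuery("*:*");
    query.addFilterQuery("{!geofilt}");      // filter by distance from the point below
    query.set("sfield", "geolocation");      // the LatLonType field to filter on
    query.set("pt", "52.37,4.89");           // centre point as lat,lon
    query.set("d", "10");                    // radius in km
    query.addSortField("geodist()", SolrQuery.ORDER.asc);  // optional: nearest first

    QueryResponse response = server.query(query);
    System.out.println("Found " + response.getResults().getNumFound() + " results");
  }
}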

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Spatial-search-with-SolrJ-3-1-How-to-tp2961136p2961452.html
Sent from the Solr - User mailing list archive at Nabble.com.


Facetting: Some questions concerning method:fc

2011-05-19 Thread Erik Fäßler

 Hey all!

I have a few questions concerning the field cache method for faceting.
The wiki says for the enum method: "This was the default (and only) method 
for faceting multi-valued fields prior to Solr 1.4." And for the fc 
method: "This was the default method for single valued fields prior to 
Solr 1.4."
I just ran into the problem of using fc for a field which can have 
multiple terms for one field. The facet counts would be wrong, seemingly 
only counting the first term in the field of each document. I observed 
this in Solr 1.4.1 and in 3.1 with the same index.


Question 1: The quotes above say prior to Solr 1.4. Has this changed? 
Is there another method for multi-valued faceting since Solr 1.4?
Question 2: Another observation is very weird: when faceting on another 
field, namely the text field holding a large variety of terms and 
especially a lot of different terms in one single field, the fc method 
seems to count everything correctly. In fact, the results between fc and 
enum don't seem to differ. The field in which the fc and enum faceting 
results differ consists of a lot of terms which all have start and end 
offsets of 0, 0 and a position increment of 1. Could this be a problem?
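
For reference, requests of the following form (the field name is a placeholder) make it easy to compare the two methods directly:

http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=myfield&facet.method=fc
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=myfield&facet.method=enum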


Best regards,

Erik


how to convert YYYY-MM-DD to YYYY-MM-DD hh:mm:ss - DIH

2011-05-19 Thread stockii
Hello

I want to index some date fields with the date format yyyy-MM-dd. Solr
throws an exception like this: "can not be represented as java.sql.Date"

I am using ...transformer=DateFormatTransformer
and ...zeroDateTimeBehavior=convertToNull

How can I tell DIH to convert these fields into the correct format? Thanks
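
A sketch of the usual DateFormatTransformer usage in data-config.xml (entity and column names are placeholders):

<entity name="item" transformer="DateFormatTransformer" query="select ...">
  <!-- parse the incoming yyyy-MM-dd string into a full date for the Solr date field -->
  <field column="mydate" dateTimeFormat="yyyy-MM-dd" />
</entity>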

-
--- System 

One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents other Cores  100.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx
--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-convert--MM-DD-to-YYY-MM-DD-hh-mm-ss-DIH-tp2961481p2961481.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Facetting: Some questions concerning method:fc

2011-05-19 Thread Yonik Seeley
On Thu, May 19, 2011 at 9:56 AM, Erik Fäßler erik.faess...@uni-jena.de wrote:
 I have a few questions concerning the field cache method for faceting.
 The wiki says for enum method: This was the default (and only) method for
 faceting multi-valued fields prior to Solr 1.4. . And for fc method: This
 was the default method for single valued fields prior to Solr 1.4. .
 I just ran into the problem of using fc for a field which can have multiple
 terms for one field. The facet counts would be wrong, seemingly only
 counting the first term in the field of each document. I observed this in
 Solr 1.4.1 and in 3.1 with the same index.

That doesn't sound right... the results should always be identical
between facet.method=fc and facet.method=enum. Are you sure you didn't
index a multi-valued field and then change the fieldType in the schema
to be single valued? Are you sure the field is indexed the way you
think it is?  If so, is there an easy way for someone to reproduce
what you are seeing?

-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco


Re: sorting on date field in facet query

2011-05-19 Thread Erick Erickson
Oh, isn't that ducky. The facet.sort parameter only sorts ascending
as far as I can tell. Which is exactly the reverse of what you want.

Would it work to cleverly encode the facet field to do what you want
just by a lexical sort? Something like use a very large constant,
subtract the date for each record from that and then put that in a
new field that you facet/sort by? Then un-transform it for display? Let's
say you have a range from 0-9. Then your facet field could be
something like
original doc values
doc 1: 2 - oldest
doc 2: 5
doc 3: 8  - newest

You'd store values like these in facetme (9 - orig value) + text
doc1: 7_docid1
doc2: 4_docid2
doc3: 1_docid3

Now a natural ordering (facet.sort=index) would return them in
date order. If this was a well-defined process you could easily
transform it back for proper display. Although watch out for
leading zeros!
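
A tiny sketch of that encoding at index time (the constant and values are illustrative; zero-padding handles the leading-zeros caveat):

// Build a lexically sortable facet key: newer dates get smaller keys,
// so facet.sort=index returns facets in reverse chronological order.
long maxDate = 99999999L;                  // larger than any yyyyMMdd value
long docDate = 20110519L;                  // the document's date as yyyyMMdd
String facetKey = String.format("%08d", maxDate - docDate) + "_" + "T-AS_1386229";
// facetKey -> "79889480_T-AS_1386229"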

Thinking off the top of my head here

Erick

On Thu, May 19, 2011 at 9:46 AM, Dmitry Kan dmitry@gmail.com wrote:
 Hi,

 Thanks for the questions, guys, and sorry for the confusion. I should start
 with a broader picture of what we are trying to achieve. The only problem is
 that I cannot speak about specifics of the task we are solving the way we
 do. We currently sort the facets on the client side, having the date values
 at hand (done by an boolean query to SOLR with a list of ids). However,
 sometimes we have glitches, that is since we limit the facets to first
 facet.limit ones, and there is no date boosting we may have some facet
 counts end up beyond the facet counts range and that's sad. One way around
 it would be to facet with pagination, where a page would correspond to a
 date subrange in the range of required dates. But we haven't tried it yet
 before we investigate what can be done inside SOLR (by modifying its source
 code, if needed).

 So as said every solr doc that has some id in the solr index (this id is
 used to combine several solr docs logically, only that purpose; this design
 comes from the task definition) has a date field, and the value of that date
 field is always same for a given doc id across all the solr docs with the
 same doc id.

 Now, taking the Stefan's example, I would like to sort desc the facets by
 date (yes, date boosting during the facet gathering process) that were
 calculated against someStr field:

 int name=T-AS_1386181
 45
 /int
 int name=T-AS_1386229
 54
 /int

 So SOLR facet component would ignore the counts and sort the facets by dates
 desc (in reverse chronological order).

 Is it possible to implement such a solution through some class inheritance
 in facet component?

 Regards,

 Dmitry

 On Thu, May 19, 2011 at 4:25 PM, Stefan Matheis 
 matheis.ste...@googlemail.com wrote:

 Dmitry,

 how should that work? Take a this short sample-data:

 id | date
 T-AS_1386229 | 1995-12-31T23:59:59Z
 T-AS_1386181 | 1996-12-31T23:59:59Z
 T-AS_1386229 | 1997-12-31T23:59:59Z

 So, you'll have two facets for the ids .. but how should they be
 sorted? One (of the two) is the first and the other the last Document
 .. so, sort by lowest date? highest date? i guess, that would/could
 not really work.

 Perhaps we have to ask another Question .. what are you trying to
 achieve? Boost by Date?

 Regards
 Stefan

 On Thu, May 19, 2011 at 2:24 PM, Dmitry Kan dmitry@gmail.com wrote:
  Hi Erick,
 
  It is about ordering the facet information. The result set is empty via
  rows=0.
 
  Here is the logics and example:
 
  Each doc has string field someStr and a date field associated with it,
 and
  same doc id has same value of the date field. Question: is it possible to
  sort the facet values given below on that date field?
 
  curl
 
 http://localhost:8983/solr/select?q=someStr:networkfacet=truefacet.field=idfacet.limit=1000facet.mincount=1rows=0
 
  result excerpt:
 
  lst name=facet_fields
  lst name=id
  int name=T-AS_1386229
  54
  /int
  int name=T-AS_1386181
  45
  /int
  int name=T-CP_1370095
  36
  /int
  int name=T-AS_1377809
  25
  /int
  int name=T-CP_1380207
  18
  /int
  int name=T-CP_1373820
  11
  /int
  int name=T-AS_1372073-1
  8
  /int
  int name=T-AS_1367577
  6
  /int
  int name=T-AS_1383141
  5
  /int
  int name=T-AS_1383648-1
  5
  /int
  int name=T-AS_1351183-1
  4
  /int
  /lst
  /lst
 
 
  Regards,
 
  Dmitry
 
 
 
 
  On Wed, May 18, 2011 at 3:33 PM, Erick Erickson erickerick...@gmail.com
 wrote:
 
  Can you provide an example of what you are trying to do? Are you
  referring to ordering the result set or the facet information?
 
  Best
  Erick
 
  On Wed, May 18, 2011 at 7:21 AM, Dmitry Kan dmitry@gmail.com
 wrote:
   Hello list,
  
   Is it possible to sort on date field in a facet query in SOLR 3.1?
  
   --
   Regards,
  
   Dmitry Kan
  
 
 




Re: sorting on date field in facet query

2011-05-19 Thread Dmitry Kan
Thanks Erick, this sounds solid to me!

It will of course require re-posting the entire index (a pretty big one,
sharded), but that's not an issue as we periodically do that anyway.

Thanks and regards,

Dmitry

On Thu, May 19, 2011 at 5:08 PM, Erick Erickson erickerick...@gmail.comwrote:

 Oh, isn't that ducky. The facet.sort parameter only sorts ascending
 as far as I can tell. Which is exactly the reverse of what you want.

 Would it work to cleverly encode the facet field to do what you want
 just by a lexical sort? Something like use a very large constant,
 subtract the date for each record from that and then put that in a
 new field that you facet/sort by? Then un-transform it for display? Let's
 say you have a range from 0-9. Then your facet field could be
 something like
 original doc values
 doc 1: 2 - oldest
 doc 2: 5
 doc 3: 8  - newest

 You'd store values like these in facetme (9 - orig value) + text
 doc1: 7_docid1
 doc2: 4_docid2
 doc3: 1_docid3

 Now a natural ordering (facet.sort=index) would return them in
 date order. If this was a well-defined process you could easily
 transform it back for proper display. Although watch out for
 leading zeros!

 Thinking off the top of my head here

 Erick

 On Thu, May 19, 2011 at 9:46 AM, Dmitry Kan dmitry@gmail.com wrote:
  Hi,
 
  Thanks for the questions, guys, and sorry for the confusion. I should
 start
  with a broader picture of what we are trying to achieve. The only problem
 is
  that I cannot speak about specifics of the task we are solving the way we
  do. We currently sort the facets on the client side, having the date
 values
  at hand (done by an boolean query to SOLR with a list of ids). However,
  sometimes we have glitches, that is since we limit the facets to first
  facet.limit ones, and there is no date boosting we may have some facet
  counts end up beyond the facet counts range and that's sad. One way
 around
  it would be to facet with pagination, where a page would correspond to a
  date subrange in the range of required dates. But we haven't tried it yet
  before we investigate what can be done inside SOLR (by modifying its
 source
  code, if needed).
 
  So as said every solr doc that has some id in the solr index (this id is
  used to combine several solr docs logically, only that purpose; this
 design
  comes from the task definition) has a date field, and the value of that
 date
  field is always same for a given doc id across all the solr docs with the
  same doc id.
 
  Now, taking the Stefan's example, I would like to sort desc the facets by
  date (yes, date boosting during the facet gathering process) that were
  calculated against someStr field:
 
  int name=T-AS_1386181
  45
  /int
  int name=T-AS_1386229
  54
  /int
 
  So SOLR facet component would ignore the counts and sort the facets by
 dates
  desc (in reverse chronological order).
 
  Is it possible to implement such a solution through some class
 inheritance
  in facet component?
 
  Regards,
 
  Dmitry
 
  On Thu, May 19, 2011 at 4:25 PM, Stefan Matheis 
  matheis.ste...@googlemail.com wrote:
 
  Dmitry,
 
  how should that work? Take a this short sample-data:
 
  id | date
  T-AS_1386229 | 1995-12-31T23:59:59Z
  T-AS_1386181 | 1996-12-31T23:59:59Z
  T-AS_1386229 | 1997-12-31T23:59:59Z
 
  So, you'll have two facets for the ids .. but how should they be
  sorted? One (of the two) is the first and the other the last Document
  .. so, sort by lowest date? highest date? i guess, that would/could
  not really work.
 
  Perhaps we have to ask another Question .. what are you trying to
  achieve? Boost by Date?
 
  Regards
  Stefan
 
  On Thu, May 19, 2011 at 2:24 PM, Dmitry Kan dmitry@gmail.com
 wrote:
   Hi Erick,
  
   It is about ordering the facet information. The result set is empty
 via
   rows=0.
  
   Here is the logics and example:
  
   Each doc has string field someStr and a date field associated with it,
  and
   same doc id has same value of the date field. Question: is it possible
 to
   sort the facet values given below on that date field?
  
   curl
  
 
 http://localhost:8983/solr/select?q=someStr:networkfacet=truefacet.field=idfacet.limit=1000facet.mincount=1rows=0
  
   result excerpt:
  
   lst name=facet_fields
   lst name=id
   int name=T-AS_1386229
   54
   /int
   int name=T-AS_1386181
   45
   /int
   int name=T-CP_1370095
   36
   /int
   int name=T-AS_1377809
   25
   /int
   int name=T-CP_1380207
   18
   /int
   int name=T-CP_1373820
   11
   /int
   int name=T-AS_1372073-1
   8
   /int
   int name=T-AS_1367577
   6
   /int
   int name=T-AS_1383141
   5
   /int
   int name=T-AS_1383648-1
   5
   /int
   int name=T-AS_1351183-1
   4
   /int
   /lst
   /lst
  
  
   Regards,
  
   Dmitry
  
  
  
  
   On Wed, May 18, 2011 at 3:33 PM, Erick Erickson 
 erickerick...@gmail.com
  wrote:
  
   Can you provide an example of what you are trying to do? Are you
   referring to ordering the result set or the facet information?
  
 

Re: lucene parser, negative OR operands

2011-05-19 Thread Jonathan Rochkind

On 5/18/2011 9:07 PM, Chris Hostetter wrote:

You could implement a parser like that relatively easily -- just make sure
you put a MatchAllDocsQuery in every BooleanQuery object that you
construct, and only ever use the PROHIBITED and MANDATORY clause types
(never OPTIONAL) ...  the thing is, a parser like that isn't as useful
as you think it might be when dealing with search results.  OPTIONAL
clauses are where most of the useful factors of scoring documents come
into play.


Thanks for the background and ideas, very helpful.

Hmm, but what if it DID use OPTIONAL clause types but just turned 
all pure-negative clauses into the alternative combination with 
MatchAllDocsQuery ( *:* AND $pure_negative)?  Just like lucene query 
parser does now, but not only for top-level clauses. Seems like that 
might maintain the power of optional clauses for scoring, but still 
allow negative clauses to work the 'boolean logic' way people expect -- 
same rationale that has the query parser doing this at top-level, why 
not do it for sub-clauses as well? Does that have any promise do you think?


Jonathan
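
For reference, a minimal Lucene 3.x sketch of the rewrite being discussed, pairing a
pure-negative clause with a MatchAllDocsQuery so it still matches documents. This is only
an illustration of the idea, not what the Lucene or Solr query parsers actually do for
sub-clauses today:

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class PureNegativeRewrite {
    // wraps a pure-negative clause so it has something to subtract from
    static Query rewrite(Query negated) {
        BooleanQuery wrapped = new BooleanQuery();
        wrapped.add(new MatchAllDocsQuery(), Occur.MUST);   // the *:* part
        wrapped.add(negated, Occur.MUST_NOT);               // the negative part
        return wrapped;
    }

    // e.g. title:java AND (-title:programmer) would become
    //      +title:java +(+*:* -title:programmer)
    static Query example() {
        BooleanQuery top = new BooleanQuery();
        top.add(new TermQuery(new Term("title", "java")), Occur.MUST);
        top.add(rewrite(new TermQuery(new Term("title", "programmer"))), Occur.MUST);
        return top;
    }
}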


Re: how to convert YYYY-MM-DD to YYY-MM-DD hh:mm:ss - DIH

2011-05-19 Thread roySolr
Try this in your query:

TIME_FORMAT(timeDb, '%H:%i') as timefield

http://www.java2s.com/Tutorial/MySQL/0280__Date-Time-Functions/TIMEFORMATtimeformat.htm


--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-convert--MM-DD-to-YYY-MM-DD-hh-mm-ss-DIH-tp2961481p2961591.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: sorting on date field in facet query

2011-05-19 Thread kenf_nc
This is more speculation than direction; I don't currently use Field
Collapsing, but my take on it is that it returns the number of docs
collapsed. So instead of faceting, could you do a search returning DocID,
collapsing on DocID and sorting on date? Then the count of collapsed docs
*should* match the facet count.

Just wondering.
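
If it helps, a SolrJ sketch of roughly what that could look like once result grouping is
available (grouping shipped after 3.1, so the parameter names below are from later versions
and may differ):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.response.QueryResponse;

public class CollapseByDocId {
    public static QueryResponse collapsedByDate(SolrServer server) throws SolrServerException {
        SolrQuery q = new SolrQuery("someStr:network");
        q.set("group", true);          // collapse the result set
        q.set("group.field", "id");    // one group per logical doc id
        q.set("sort", "date desc");    // newest group first
        q.set("group.limit", 1);       // only the counts are needed, not the members
        // each group's numFound is the number of docs collapsed into it,
        // which is the figure the facet count would otherwise provide
        return server.query(q);
    }
}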

--
View this message in context: 
http://lucene.472066.n3.nabble.com/sorting-on-date-field-in-facet-query-tp2956540p2961612.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: sorting on date field in facet query

2011-05-19 Thread Dmitry Kan
Hi,

1. Is it possible to produce the collapsed docs count in the same query?
2. What is the performance of Field Collapsing versus Facet Search?

Dmitry

On Thu, May 19, 2011 at 5:36 PM, kenf_nc ken.fos...@realestate.com wrote:

 This is more a speculation than direction, I don't currently use Field
 Collapsing but my take on it is that it returns the number of docs
 collapsed. So instead of faceting could you do a search returning DocID,
 collapsing on DocID sorting on date, then the count of collapsed docs
 *should* match the facet count?

 Just wondering.

 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/sorting-on-date-field-in-facet-query-tp2956540p2961612.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Regards,

Dmitry Kan


Re: how to convert YYYY-MM-DD to YYY-MM-DD hh:mm:ss - DIH

2011-05-19 Thread stockii
did you mean something like this ? 

DATE_FORMAT(cp.field, '%Y-%m-%di %H:%i:%s') AS field ???

i think i need to add the timestamp to my date fields? or not?
why can't DIH handle this?

-
--- System 

One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents other Cores  100.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx
--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-convert--MM-DD-to-YYY-MM-DD-hh-mm-ss-DIH-tp2961481p2961684.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: sorting on date field in facet query

2011-05-19 Thread Erick Erickson
Oooh, that's clever

The glitch is that field collapsing is scheduled for 3.2, but that
probably means
the patch is close to being applicable to 3.1 but I don't know that for sure.

Erick

On Thu, May 19, 2011 at 10:36 AM, kenf_nc ken.fos...@realestate.com wrote:
 This is more a speculation than direction, I don't currently use Field
 Collapsing but my take on it is that it returns the number of docs
 collapsed. So instead of faceting could you do a search returning DocID,
 collapsing on DocID sorting on date, then the count of collapsed docs
 *should* match the facet count?

 Just wondering.

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/sorting-on-date-field-in-facet-query-tp2956540p2961612.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: how to convert YYYY-MM-DD to YYY-MM-DD hh:mm:ss - DIH

2011-05-19 Thread Erick Erickson
Offhand, I don't think the problem is DIH since your stack trace
specifies a SQL error. What is the SQL you're using? And
the DIH configuration?

Best
Erick

On Thu, May 19, 2011 at 10:53 AM, stockii stock.jo...@googlemail.com wrote:
 did you mean something like this ?

 DATE_FORMAT(cp.field, '%Y-%m-%di %H:%i:%s') AS field ???

 i think i need to add the timestamp to my date fields? or not ?
 why cannot DIH handle with this ?

 -
 --- System 
 

 One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
 1 Core with 31 Million Documents other Cores  100.000

 - Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
 - Solr2 for Update-Request  - delta every Minute - 4GB Xmx
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/how-to-convert--MM-DD-to-YYY-MM-DD-hh-mm-ss-DIH-tp2961481p2961684.html
 Sent from the Solr - User mailing list archive at Nabble.com.



New release of Python/Solr library Sunburnt

2011-05-19 Thread Toby White
Hi,

I'd like to announce the release of a new version of my Python-Solr
library, sunburnt:
http://pypi.python.org/pypi/sunburnt/0.5

Documentation and tutorial examples are available at:
http://opensource.timetric.com/sunburnt/

and there's a mailing list for discussion at
http://groups.google.com/group/python-sunburnt

Sunburnt was written initially for use with the Timetric platform
(http://timetric.com) and is in use by several other internet-scale
sites.

Toby

-- 
http://timetric.com
2nd Floor, White Bear Yard, 144a Clerkenwell Road, London EC1R 5DF
phone: +44 20 3286 0677 (office), +44 7747 603618 (mobile)
twitter: @timetric, @tow21 | skype: tobyohwhite


Re: how to convert YYYY-MM-DD to YYY-MM-DD hh:mm:ss - DIH

2011-05-19 Thread stockii
<entity name="foo" pk="cp_id" transformer="DateFormatTransformer"
query="SELECT ...,
...some fields ...

cp.start_date_1,
cp.start_date_2,
cp.end_date_1,
cp.end_date_2,

.. some other fields ..

FROM ... ">
</entity>


That does not work with fields that have this value: 0000-00-00 OR/AND 2011-05-18


I'd tried with:
<field column="start_date_1" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss" />


but Solr always says that these fields have a wrong format! I tried my
SQL selects before I posted them here ;-)

-
--- System 

One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents other Cores  100.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx
--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-convert--MM-DD-to-YYY-MM-DD-hh-mm-ss-DIH-tp2961481p2961787.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: how to convert YYYY-MM-DD to YYY-MM-DD hh:mm:ss - DIH

2011-05-19 Thread stockii
okay, i found the problem.


i put the fields two times in my data-config ;-)

-
--- System 

One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents other Cores  100.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx
--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-convert--MM-DD-to-YYY-MM-DD-hh-mm-ss-DIH-tp2961481p2961834.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Replication - replicated failed at the same time?

2011-05-19 Thread Stefan Matheis
Hm, anyone?

On Sat, May 14, 2011 at 7:11 PM, Stefan Matheis
matheis.ste...@googlemail.com wrote:
 Hi Guys,

 while working on the UI for Replication, i've got confused sometimes because
 of the following response (from /replication?command=details):

 <?xml version="1.0" encoding="UTF-8"?>
 <response>
  <lst name="details">
    <lst name="slave">
      <!-- .. -->
      <str name="indexReplicatedAt">Sat May 14 16:25:53 UTC 2011</str>
      <arr name="indexReplicatedAtList">
        <str>Sat May 14 16:25:53 UTC 2011</str>
      </arr>
      <str name="replicationFailedAt">Sat May 14 16:25:53 UTC 2011</str>
      <arr name="replicationFailedAtList">
        <str>Sat May 14 16:25:53 UTC 2011</str>
      </arr>
      <!-- .. -->
    </lst>
  </lst>
 </response>

 To reproduce that: Start with Solr-Instance (with a clean index), trigger
 replication, abort fetch - look at the details.

 Does not really make sense to me? If it's okay .. please let me know: how &
 why - especially interested in how to display that information in the UI
 (Current State: http://files.mathe.is/solr-admin/10_replication.png).

 Regards
 Stefan



Re: Facetting: Some questions concerning method:fc

2011-05-19 Thread Erik Fäßler

 Am 19.05.2011 16:07, schrieb Yonik Seeley:

On Thu, May 19, 2011 at 9:56 AM, Erik Fäßlererik.faess...@uni-jena.de  wrote:

I have a few questions concerning the field cache method for faceting.
The wiki says for enum method: This was the default (and only) method for
faceting multi-valued fields prior to Solr 1.4. . And for fc method: This
was the default method for single valued fields prior to Solr 1.4. .
I just ran into the problem of using fc for a field which can have multiple
terms for one field. The facet counts would be wrong, seemingly only
counting the first term in the field of each document. I observed this in
Solr 1.4.1 and in 3.1 with the same index.

That doesn't sound right... the results should always be identical
between facet.method=fc and facet.method=enum. Are you sure you didn't
index a multi-valued field and then change the fieldType in the schema
to be single valued? Are you sure the field is indexed the way you
think it is?  If so, is there an easy way for someone to reproduce
what you are seeing?

-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco
Thanks a lot for your help: changing the field type to multiValued did
the trick. The point is, I built the index using Lucene directly (I need
to for some special manipulation of offsets and position increments). So
my question is what requirements a Lucene field has to fulfill so that
Solr's faceting works correctly.
A particular question: in Lucene terms, what exactly is denoted by a
multiValued field? I thought that would result in multiple Lucene
Field instances with the same name for a single document. But I think my
field has only one instance per document (though I should double-check that).


Thanks again for your quick and helpful answer!

Erik
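
In case it helps with the multiple-Field-instances question above, this is roughly what a
multi-valued field looks like when the index is built with Lucene 3.x directly (field names
invented):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class MultiValuedExample {
    static Document buildDoc() {
        Document doc = new Document();
        // two Field instances with the same name means two values for "category"
        doc.add(new Field("category", "databases", Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("category", "search", Field.Store.YES, Field.Index.NOT_ANALYZED));
        // the corresponding Solr schema field then needs multiValued="true"
        return doc;
    }
}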


DIH Response

2011-05-19 Thread Savvas-Andreas Moysidis
Hello,

We have configured solr for delta processing through DIH and we kick off the
index request from within a batch process.
However, we somehow need to know whether our indexing request succeeded or
not because we want to be able to rollback a db transaction if that step
fails.

By looking at the SolrServer API we weren't able to find a method that could
help us with that, so the only solution we see is to constantly poll the
server and parse the response for the idle or Rolledback words.

What we noticed though is that the response also contains a message saying
This response format is experimental.  It is likely to change in the
future.

Does this mean that we can't rely on this response to build our module? Is
there a better way?

Thank you,
Savvas
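
For what it's worth, a rough SolrJ polling sketch along those lines. It parses the same
experimental status response, so the caveat above still applies, and the /dataimport handler
path is an assumption about the local solrconfig:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class DihStatusPoller {
    // polls /dataimport until it goes idle; returns false if a rollback was reported
    public static boolean waitForImport(SolrServer server) throws Exception {
        while (true) {
            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("command", "status");
            QueryRequest req = new QueryRequest(params);
            req.setPath("/dataimport");                 // handler path from solrconfig.xml
            NamedList<Object> rsp = server.request(req);
            if ("idle".equals(rsp.get("status"))) {
                // crude string match against the experimental statusMessages block
                String messages = String.valueOf(rsp.get("statusMessages"));
                return !messages.contains("Rolledback");
            }
            Thread.sleep(5000);
        }
    }
}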


Similarity class for an individual field

2011-05-19 Thread Brian Lamb
Hi all,

Based on advice I received on a previous email thread, I applied patch
https://issues.apache.org/jira/browse/SOLR-2338. My goal was to be able to
apply a similarity class to certain fields but not all fields.

I ran the following commands:

$ cd <your Solr trunk checkout dir>
$ svn up
$ wget https://issues.apache.org/jira/secure/attachment/12475027/SOLR-2338.patch
$ patch -p0 -i SOLR-2338.patch

And I did not get any errors. I then created my own SimilarityClass
listed below because it isn't very large:

package org.apache.lucene.misc;
import org.apache.lucene.search.DefaultSimilarity;

public class SimpleSimilarity extends DefaultSimilarity {
  public SimpleSimilarity() { super(); }
  public float idf(int dont, int care) { return 1; }
}

As you can see, it isn't very complicated. I'm just trying to remove
the idf from the scoring equation in certain cases.

Next, I make a change to the schema.xml file:

<fieldType name="string_noidf" class="solr.StrField"
sortMissingLast="true" omitNorms="true">
  <similarity class="org.apache.lucene.misc.SimpleSimilarity"/>
</fieldType>

And apply that to the field in question:

<field name="string_noidf" multiValued="true" type="string_noidf"
indexed="true" stored="true" required="false" omitNorms="true" />

But I think something did not get applied correctly to the patch. I
restarted and did a full import but the scores are exactly the same.
Also, I tried using the existing SweetSpotSimilarity:
<fieldType name="string_noidf" class="solr.StrField"
sortMissingLast="true" omitNorms="true">
  <similarity class="org.apache.lucene.misc.SweetSpotSimilarity"/>
</fieldType>

But the scores remained unchanged even in that case. At this point,
I'm not quite sure how to debug this to see whether the problem is
with the patch or the similarity class but given that the SweetSpot
similarity class didn't work either, I'm inclined to think it was a
problem with the patch.

Any thoughts on this one?

Thanks,

Brian Lamb
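
One way to check whether a custom similarity is being picked up at all, independent of the
patch, is to look at the idf factors in the score explanations. A small SolrJ sketch with a
made-up query term:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class SimilarityCheck {
    static void printExplain(SolrServer server) throws Exception {
        SolrQuery q = new SolrQuery("string_noidf:dinosaur");   // made-up term
        q.set("debugQuery", true);
        q.set("fl", "*,score");
        QueryResponse rsp = server.query(q);
        // with SimpleSimilarity active, every idf(...) factor in the explanations should be 1.0
        System.out.println(rsp.getDebugMap().get("explain"));
    }
}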


Re: Similarity class for an individual field

2011-05-19 Thread Brian Lamb
Also, I've tried adding:

<similarity class="org.apache.lucene.misc.SweetSpotSimilarity"/>

To the end of the schema file so that it is applied globally but it does not
appear to change the score either. What am I doing incorrectly?

Thanks,

Brian Lamb

On Thu, May 19, 2011 at 2:45 PM, Brian Lamb
brian.l...@journalexperts.comwrote:

 Hi all,

 Based on advice I received on a previous email thread, I applied patch
 https://issues.apache.org/jira/browse/SOLR-2338. My goal was to be able to
 apply a similarity class to certain fields but not all fields.

 I ran the following commands:

 $ cd your Solr trunk checkout dir
 $ svn up
 $ wget 
 https://issues.apache.org/jira/secure/attachment/12475027/SOLR-2338.patch
 $ patch -p0 -i SOLR-2338.patch

 And I did not get any errors. I then created my own SimilarityClass listed 
 below because it isn't very large:

 package org.apache.lucene.misc;
 import org.apache.lucene.search.DefaultSimilarity;

 public class SimpleSimilarity extends DefaultSimilarity {
   public SimpleSimilarity() { super(); }

   public float idf(int dont, int care) { return 1; }
 }

 As you can see, it isn't very complicated. I'm just trying to remove the idf 
 from the scoring equation in certain cases.

 Next, I make a change to the schema.xml file:

 fieldType name=string_noidf class=solr.StrField sortMissingLast=true 
 omitNorms=true

   similarity class=org.apache.lucene.misc.SimpleSimilarity/
 /fieldType

 And apply that to the field in question:

 field name=string_noidf multiValued=true type=string_noidf 
 indexed=true stored=true required=false omitNorms=true /

 But I think something did not get applied correctly to the patch. I restarted 
 and did a full import but the scores are exactly the same. Also, I tried 
 using the existing SweetSpotSimilarity:
 fieldType name=string_noidf class=solr.StrField sortMissingLast=true 
 omitNorms=true
   similarity class=org.apache.lucene.misc.SweetSpotSimilarity/

 /fieldType

 But the scores remained unchanged even in that case. At this point, I'm not 
 quite sure how to debug this to see whether the problem is with the patch or 
 the similarity class but given that the SweetSpot similarity class didn't 
 work either, I'm inclined to think it was a problem with the patch.

 Any thoughts on this one?

 Thanks,

 Brian Lamb





Re: Similarity class for an individual field

2011-05-19 Thread Brian Lamb
I tried editing the SweetSpotSimilarity class located at
lucene/contrib/misc/src/java/org/apache/lucene/misc/SweetSpotSimilarity.java
to just return 1 for each function and the score does not change at all.
This has led me to believe that it does not recognize similarity at all. At
this point, all I have for similarity is the line at the end of the file to
apply similarity to all searches but that does not even work. So where am I
going wrong?

Thanks,

Brian Lamb

On Thu, May 19, 2011 at 3:41 PM, Brian Lamb
brian.l...@journalexperts.comwrote:

 Also, I've tried adding:

 similarity class=org.apache.lucene.misc.SweetSpotSimilarity/

 To the end of the schema file so that it is applied globally but it does
 not appear to change the score either. What am I doing incorrectly?

 Thanks,

 Brian Lamb

 On Thu, May 19, 2011 at 2:45 PM, Brian Lamb brian.l...@journalexperts.com
  wrote:

 Hi all,

 Based on advice I received on a previous email thread, I applied patch
 https://issues.apache.org/jira/browse/SOLR-2338. My goal was to be able
 to apply a similarity class to certain fields but not all fields.

 I ran the following commands:

 $ cd your Solr trunk checkout dir
 $ svn up
 $ wget 
 https://issues.apache.org/jira/secure/attachment/12475027/SOLR-2338.patch
 $ patch -p0 -i SOLR-2338.patch

 And I did not get any errors. I then created my own SimilarityClass listed 
 below because it isn't very large:

 package org.apache.lucene.misc;
 import org.apache.lucene.search.DefaultSimilarity;

 public class SimpleSimilarity extends DefaultSimilarity {
   public SimpleSimilarity() { super(); }


   public float idf(int dont, int care) { return 1; }
 }

 As you can see, it isn't very complicated. I'm just trying to remove the idf 
 from the scoring equation in certain cases.


 Next, I make a change to the schema.xml file:

 fieldType name=string_noidf class=solr.StrField sortMissingLast=true 
 omitNorms=true


   similarity class=org.apache.lucene.misc.SimpleSimilarity/
 /fieldType

 And apply that to the field in question:

 field name=string_noidf multiValued=true type=string_noidf 
 indexed=true stored=true required=false omitNorms=true /


 But I think something did not get applied correctly to the patch. I 
 restarted and did a full import but the scores are exactly the same. Also, I 
 tried using the existing SweetSpotSimilarity:
 fieldType name=string_noidf class=solr.StrField sortMissingLast=true 
 omitNorms=true
   similarity class=org.apache.lucene.misc.SweetSpotSimilarity/


 /fieldType

 But the scores remained unchanged even in that case. At this point, I'm not 
 quite sure how to debug this to see whether the problem is with the patch or 
 the similarity class but given that the SweetSpot similarity class didn't 
 work either, I'm inclined to think it was a problem with the patch.


 Any thoughts on this one?

 Thanks,

 Brian Lamb






Re: solr sorting on multiple conditions, please help

2011-05-19 Thread Chris Hostetter

: sort=query({!v=area_id: 78153}) desc, score desc
: 
: What I want to achieve is sort by if there is a match with area_id, then
: sort by the actual score

I think you can use the map function here to map all scores greater than
zero (matching docs) to some fixed value.  Something like this should
work...

qq=area_id:78153
sort=map(query($qq,-1),0,,1) desc, score desc

http://wiki.apache.org/solr/FunctionQuery#map

-Hoss
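
A filled-in SolrJ version of that request might look like the sketch below. The upper bound
handed to map() (9999999 here) is just an arbitrary value larger than any score, not
something taken from the message above:

import org.apache.solr.client.solrj.SolrQuery;

public class AreaFirstSort {
    static SolrQuery build() {
        SolrQuery q = new SolrQuery("*:*");
        q.set("qq", "area_id:78153");
        // matching docs (score >= 0) map to 1, non-matching docs stay at -1,
        // so area matches sort first and ties break on relevancy score
        q.set("sort", "map(query($qq,-1),0,9999999,1) desc, score desc");
        return q;
    }
}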


How to get Error caught in SOLR layer to SOLRj layer

2011-05-19 Thread geeta...@gmail.com
Hi,

I have a code logic to push documents to SOLR using SOLRj APIs.
Due to an error in schema, i get appropriate error in SOLR logs printed in
catalina.log inside tomcat. Here is a snippet:

SEVERE: org.apache.solr.common.SolrException: ERROR: multiple values
encountered for non multiValued copy field suggestion: E:\Files\lpsimdev.inf
at
org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:288)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:147)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.solr.servlet.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:104)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at
org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:859)
at
org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579)
at 
org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1555)
at java.lang.Thread.run(Unknown Source)


But in my JAVA logs, i simply get this snippet:
###  13  05/19 17:27:52:333  ###  Runner@9be1041:: (SOLR failed with
SolrException for DocId = [2dac611a5bb7ce87831dc0245ffcb66a] and detailed
Exception: [org.apache.solr.common.SolrException: Internal Server Error

Internal Server Error

request:
http://dm2search2.dm2.commvault.com:27000/solr/update/extract?fmap.content=bodyliteral.contentid=2dac611a5bb7ce87831dc0245ffcb66aliteral.jid=5literal.afln=27009287literal.conv=lpsimdev.infliteral.cvowner=SX1X5X32X544literal.cvreadacls=SX1X5X32X544;SX1X5X18;SX1X5X32X544;SX1X5X32X545literal.mtmstr=1000502032literal.afofstr=43
02 25 29 17
literal.bktm=2011-5-19T14:56:3Zliteral.mtm=2001-9-14T21:13:52Zliteral.afof=4302252917literal.atyp=33literal.clid=2literal.cijid=22literal.afid=1literal.szkb=822literal.ccn=-1literal.apid=6literal.url=E:\Files\lpsimdev.infliteral.cistate=1wt=javabinversion=2
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:436)
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:245)
at
org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer.request(StreamingUpdateSolrServer.java:202)
at
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:33)
at
com.commvault.commclient.ciengine.CVRequestWrapper.processRequests(CVRequestWrapper.java:551)
at
com.commvault.commclient.ciengine.solr.SOLRHTTPConnector$Runner.run(SOLRHTTPConnector.java:692)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

How can i get the same error in my solrj side, so that i can debug easily?

Thanks a lot for your time  help,
Geeta

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-get-Error-caught-in-SOLR-layer-to-SOLRj-layer-tp2963446p2963446.html
Sent from the Solr - User mailing list archive at Nabble.com.


Help, Data Import not indexing in solr.

2011-05-19 Thread fredylee
Newbie at SOLR,

When I ran through my test data config, it was able to find my 91 sample
rows.  However, it didn't add any of them to my index.

Can someone help me and tell me why?   

Please find the data config below:

<dataConfig>
<dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
url="jdbc:sqlserver://localhost\TESTSERVER:4317;databaseName=Northwind;user=sa;password=datapassword"
/>
<document>
<entity name="Customers" query="select * from Customers">
<field column="CustomerID" name="customerid" />
<field column="CompanyName" name="companyname" />
<field column="ContactName" name="contactname" />
<field column="Address" name="address" />
<field column="City" name="city" />
<field column="ContactTitle" name="contacttitle" />

</entity>
</document>
</dataConfig>


Here is the result when I run http://localhost:8983/solr/dataimport? (after
I ran the full import)
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">15</int>
</lst>
<lst name="initArgs">
<lst name="defaults">
<str name="config">dataconfig.xml</str>
</lst>
</lst>
<str name="status">idle</str>
<str name="importResponse"/>
<lst name="statusMessages">
<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">91</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2011-05-19 15:09:56</str>
<str name="">
Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.
</str>
<str name="Committed">2011-05-19 15:09:57</str>
<str name="Optimized">2011-05-19 15:09:57</str>
<str name="Total Documents Processed">0</str>
<str name="Time taken ">0:0:1.765</str>
</lst>
<str name="WARNING">
This response format is experimental. It is likely to change in the future.
</str>
</response>

Please help. 

Thx.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-Data-Import-not-indexing-in-solr-tp2963450p2963450.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Faceting on distance in Solr: how do you generate links that search withing a given range of distance?

2011-05-19 Thread Chris Hostetter

: It is fairly simple to generate facets for ranges or 'buckets' of
: distance in Solr:
: http://wiki.apache.org/solr/SpatialSearch#How_to_facet_by_distance.
: What isnt described is how to generate the links for these facets

any query you specify in a facet.query to generate a constraint count can 
be specified in an fq to actaully apply that constraint.

So if you use...
   facet.query={!frange l=5.001 u=3000}geodist()

...to get a count of 34 and the user wants to constrain to those docs, 
you would add...

   fq={!frange l=5.001 u=3000}geodist()

...to the query to do that.


-Hoss
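
As a concrete SolrJ sketch of that facet.query / fq pairing (the sfield and pt values here
are placeholders, not from the thread):

import org.apache.solr.client.solrj.SolrQuery;

public class DistanceBuckets {
    static SolrQuery build() {
        String bucket = "{!frange l=5.001 u=3000}geodist()";
        SolrQuery q = new SolrQuery("*:*");
        q.set("sfield", "store");        // placeholder location field
        q.set("pt", "45.15,-93.85");     // placeholder centre point
        q.setFacet(true);
        q.addFacetQuery(bucket);         // produces the constraint count
        q.addFilterQuery(bucket);        // applied once the user clicks that facet link
        return q;
    }
}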


Re: Embedded Solr Optimize under Windows

2011-05-19 Thread Chris Hostetter

: Thanks for the reply. I'm at home right now, or I'd try this myself, but is
: the suggestion that two optimize() calls in a row would resolve the issue?

it might ... I think the situations in which it happens have evolved a bit 
over the years as IndexWRiter has gotten smarter about knowing when it 
really needs to touch the disk to reduce IO.

there's a relatively new explicit method (IndexWriter.deleteUnusedFiles) 
that can force this...

https://issues.apache.org/jira/browse/LUCENE-2259

...but it's only on trunk, and there isn't any user level hook for it in 
Solr yet (i opened SOLR-2532 to consider adding it)


-Hoss


Re: SOLR Custom datasource integration

2011-05-19 Thread Lance Norskog
What is JPA?

You are better off pulling from JPA yourself than coding with the
DataImportHandler. It will be much easier.

EmbeddedSolr is just like web solr: when you commit data it is on the
disk. If you crash during indexing, it may or may not be available to
commit. EmbeddedSolr does not do anything special with index storage.

Lance
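
A rough outline of the pull-from-JPA-yourself-and-push-with-SolrJ route (the JPQL entity,
its fields and the Solr URL below are invented for illustration):

import java.util.ArrayList;
import java.util.List;
import javax.persistence.EntityManager;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class JpaIndexer {
    @SuppressWarnings("unchecked")
    public static void index(EntityManager em) throws Exception {
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        // "Product" is a made-up JPA entity; select whatever fields map onto the schema
        List<Object[]> rows = em.createQuery("select p.id, p.name from Product p").getResultList();
        List<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        for (Object[] row : rows) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", row[0]);
            doc.addField("name", row[1]);
            docs.add(doc);
        }
        solr.add(docs);   // push the batch
        solr.commit();    // once committed, the documents are searchable and on disk
    }
}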

On Thu, May 19, 2011 at 2:08 AM, amit.b@gmail.com
amit.b@gmail.com wrote:
 Hi,

 We are trying to build an enterprise search solution using SOLR; our data source
 is a database which is interfaced with JPA.

 Solution looks like

 SOLR INDEX  JPA  Oracle database.

 We need help to find out the best approach to integrate the Solr index with
 JPA.

 We tried out two approaches

 Approach 1 -
 1 Populating SolrInputDocument with data from JPA
 2 Updating EmbeddedSolrServer with captured data using the SolrJ API.

 Approach 2 -
 1 Customizing the dataimporthandler of HTTPSolrServer
 2 Retrieving data in the dataimporthandler using a JPA entity.

 Functional requirements -
 1 The solution should be performant for a huge magnitude of data
 2 Should be scalable

 We have a few questions which will help us to decide on a solution:
 Which one is the better approach to meet our requirements?
 Is it a good idea to integrate with Lucene directly as against using EmbeddedSolrServer +
 JPA?
 If the JVM crashes, will EmbeddedSolrServer content be lost on reboot?
 Can we get support from the Jasper Experts team? Can we buy it? How?






 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/SOLR-Custom-datasource-integration-tp2960475p2960475.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Lance Norskog
goks...@gmail.com


Re: Embedded Solr Optimize under Windows

2011-05-19 Thread Greg Pendlebury
Ahh, thanks. I might try a basic commit() then and see, although it's not a
huge deal for me. It occurred to me that two optimize() calls would probably
leave exactly the same problem behind.

On 20 May 2011 09:52, Chris Hostetter hossman_luc...@fucit.org wrote:


 : Thanks for the reply. I'm at home right now, or I'd try this myself, but
 is
 : the suggestion that two optimize() calls in a row would resolve the
 issue?

 it might ... I think the situations in which it happens have evolved a bit
 over the years as IndexWRiter has gotten smarter about knowing when it
 really needs to touch the disk to reduce IO.

 there's a relatively new explicit method (IndexWriter.deleteUnusedFiles)
 that can force this...

 https://issues.apache.org/jira/browse/LUCENE-2259

 ...but it's only on trunk, and there isn't any user level hook for it in
 Solr yet (i opened SOLR-2532 to consider adding it)


 -Hoss



Mysql vs Postgres DIH

2011-05-19 Thread antonio
Hi, 
i make the same query to import my data with MySQL and Postgres.
But only Postgres indexes all the data (17090).
MySQL indexes 17086, then 197085, then 17087... never 17090. But the
response tells me that it has skipped 0 documents. I don't understand!

Help me please, I would like to use MySQL for my application...

Thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Mysql-vs-Postgres-DIH-tp2963822p2963822.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Mysql vs Postgres DIH

2011-05-19 Thread antonio
Excuse me, i wrong to write 197085, correct is 17085. But never the same
count...

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Mysql-vs-Postgres-DIH-tp2963822p2963824.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: error while doing full import

2011-05-19 Thread deniz
thank you dan... i have checked the code that produces XML for solr and then
fixed the &nbsp; problem

-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/error-while-doing-full-import-tp2951185p2963832.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Too slow indexing while using 2 different data sources

2011-05-19 Thread deniz
hi Gora,

i guess you are right, i have checked and the url seems to be serving data slowly...
maybe it's because of the crappy test env too...

thank you so much

-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Too-slow-indexing-while-using-2-different-data-sources-tp2959551p2963833.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: SOLR-2209

2011-05-19 Thread Jean-Sebastien Vachon
I'm using Solr 1.4...

I thought I had a case without a NOT but it seems to work now :S
It might be a glitch on my server.

The problem is easily reproducible with the NOT operator

http://10.0.5.221:8983/jobs/select?q=title:java%20AND%20(-title:programmer)
http://10.0.5.221:8983/jobs/select?q=title:java%20AND%20(-(title:programmer)
)

both queries return 0 results while...

http://10.0.5.221:8983/jobs/select?q=title:java%20AND%20-(title:programmer)
(note the position of the negation operator)

returns more than 50 000 results
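
Not from this thread, but the workaround usually suggested for this behaviour is to pair the
pure-negative group with *:* on the client before sending the query, e.g.:

import org.apache.solr.client.solrj.SolrQuery;

public class NegationWorkaround {
    static SolrQuery rewriteExample() {
        // title:java AND (-title:programmer)  becomes  title:java AND (*:* -title:programmer)
        return new SolrQuery("title:java AND (*:* -title:programmer)");
    }
}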

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: May-19-11 9:53 AM
To: solr-user@lucene.apache.org
Subject: Re: SOLR-2209

What version of Solr are you using? Because this works fine for me.

Could you attach the results of adding debugQuery=on in both instances?
The parsed form of the query is identical in 1.4.1 as far as I can tell. The
bug you're referencing is a peculiarity of the not (-) operator I think.

Best
Erick

On Thu, May 19, 2011 at 7:25 AM, Jean-Sebastien Vachon
jean-sebastien.vac...@wantedtech.com wrote:
 Hi All,

 I am having some problems with the presence of unnecessary  parenthesis in
my query.
 A query such as:
                title:software AND (title:engineer) will return no 
 results. Remove the parenthesis fix the issue but then since my user can
enter the parenthesis by himself I need to find a way to fix or work-around
this bug. I found that this is related to SOLR-2209 but there is no activity
on this bug.

 Anyone know if this will get fixed some time in the future or if it is
already fixed in Solr 4?

 Otherwise, could someone point me to the code handling this so that I can
attempt to make a fix?

 Thx




Re: Problem about Solrj

2011-05-19 Thread deniz
you mean you have changed the code of the solr admin page to remove all
indexes?  and also, when you say the indexes are gone, do you mean they are deleted,
or that solr sees no indexes when you run it? a little bit of a confusing post :)

-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-about-Solrj-tp2952009p2963901.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Problem about Solrj

2011-05-19 Thread deniz
sorry for the typos in the prev msg... a little bit drowsy still...

so if you can make your problem a little bit clearer, we can help
you

-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-about-Solrj-tp2952009p2963935.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Using Boost fields for a sum total score.

2011-05-19 Thread ronveenstra
Apologies if this is obvious, but I've been banging my head against a wall.

I can define a query like the following:

http://HOST_NAME/solr/select?q=$search_term&bq=boost_high:$search_term^1.5&bq=boost_medium:$search_term^1.3&bq=boost_max:$search_term^1.7&bq=boost_low:$search_term^1.1

This does precisely what I'm looking for (assuming $search_term is a string
like dinosaur).  The search term is found in the default/defined search
fields, and then a boost is applied if this term is also found in one of the
defined boost fields.  What I'm looking to do is define this setup in
solrconfig.xml such that I need only hit a URL like:

http://HOST_NAME/solr/select?q=$search_term

I can define a bq <str> in solrconfig, but seem unable to reference the q
query parameter in order to boost only when the search term is found.  Any
help would be greatly appreciated.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Using-Boost-fields-for-a-sum-total-score-tp2958968p2963986.html
Sent from the Solr - User mailing list archive at Nabble.com.


Please Unsubscribe

2011-05-19 Thread omohamme
Could you please unsubscribe me.



From: ronveenstra ron-s...@agathongroup.com
Reply-To: solr-user@lucene.apache.org
Date: Thu, 19 May 2011 18:52:52 -0700 (PDT)
To: solr-user@lucene.apache.org
Subject: Re: Using Boost fields for a sum total score.

Apologies if this is obvious, but I've been banging my head against a wall.

I can define a query like the following:

http://HOST_NAME/solr/select?q=$search_termbq=boost_high:$search_term^1.5b
q=boost_medium:$search_term^1.3bq=boost_max:$search_term^1.7bq=boost_low:$
search_term^1.1

This does precisely what I'm looking for (assuming $search_term is a string
like dinosaur)  The search term is found in the default/defined search
fields, and then a boost is applied if this term is also found in one of the
defined boost fields.  What I'm looking to do is define this setup in
solrconfig.xml such that I need only hit a URL like:

http://HOST_NAME/solr/select?q=$search_term

I can define a bq str in solrconfig, but seem unable to reference the q
query parameter in order to boost only when the search term is found.  Any
help would be greatly appreciated.

--
View this message in context:
http://lucene.472066.n3.nabble.com/Using-Boost-fields-for-a-sum-total-score-
tp2958968p2963986.html
Sent from the Solr - User mailing list archive at Nabble.com.



How can I query mutlitcore with solrJ

2011-05-19 Thread Zhao, Zane
Dear team.

I installed two cores on my Tomcat:

http://localhost:8983/solr/fund_dih/admin/
http://localhost:8983/solr/fund_tika/admin/

How can I send one query request via Solrj to these URLs?

Thanks and Regards

Zane
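
A minimal SolrJ sketch for this is below, with one server instance per core URL. (If the
cores shared a schema, the distributed shards parameter could merge them into one result
set, but separate requests are the simplest route.)

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MultiCoreQuery {
    public static void main(String[] args) throws Exception {
        SolrServer fundDih = new CommonsHttpSolrServer("http://localhost:8983/solr/fund_dih");
        SolrServer fundTika = new CommonsHttpSolrServer("http://localhost:8983/solr/fund_tika");

        SolrQuery q = new SolrQuery("fund");   // made-up query term
        QueryResponse dihResults = fundDih.query(q);
        QueryResponse tikaResults = fundTika.query(q);
        System.out.println(dihResults.getResults().getNumFound());
        System.out.println(tikaResults.getResults().getNumFound());
    }
}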



How to get right facet counts?

2011-05-19 Thread Bill Bell
We are having an issue with facet counts and grouping...

We have multiple doctors with addresses. How do I search these lat longs?

1. Using SOLR 3.1, I can duplicate all fields except lat_long, and use
group.field for the key.
2. I can use David Smiley's solution for multiple points (but it seems to be
abandoned?)
3. I can add a parameter that will calculate facet.field after the group by
(can I get some help?)
4. Others?
The whole thing does not sound good, since there is sooo much duplication.
It would be perfect to support a
multiValued field with many lat_longs for a row in SOLR directly.

Ideas?

Thanks.




Re: Using Boost fields for a sum total score.

2011-05-19 Thread Bill Bell
Put everything except q in solrconfig... Then just use
qt=nameinsolrconfig&q=



On 5/19/11 7:52 PM, ronveenstra ron-s...@agathongroup.com wrote:

Apologies if this is obvious, but I've been banging my head against a
wall.

I can define a query like the following:

http://HOST_NAME/solr/select?q=$search_termbq=boost_high:$search_term^1.5
bq=boost_medium:$search_term^1.3bq=boost_max:$search_term^1.7bq=boost_l
ow:$search_term^1.1

This does precisely what I'm looking for (assuming $search_term is a
string
like dinosaur)  The search term is found in the default/defined search
fields, and then a boost is applied if this term is also found in one of
the
defined boost fields.  What I'm looking to do is define this setup in
solrconfig.xml such that I need only hit a URL like:

http://HOST_NAME/solr/select?q=$search_term

I can define a bq str in solrconfig, but seem unable to reference the
q
query parameter in order to boost only when the search term is found.  Any
help would be greatly appreciated.

--
View this message in context:
http://lucene.472066.n3.nabble.com/Using-Boost-fields-for-a-sum-total-scor
e-tp2958968p2963986.html
Sent from the Solr - User mailing list archive at Nabble.com.