Re: how to do a Parent/Child Mapping using entities

2009-12-30 Thread magui

Thanks Sascha for your post - I find it interesting, but in my case I
don't want to use an additional field; I want to be able, with the same
schema, to run both a simple query like q=res_url:"some url" and a query
like the other one.
In other words: is there any solution to make two or more multivalued
fields in the same document linked with each other? E.g.,
in this result:
in this result:

<result name="response" numFound="1" start="0">
  <doc>
    <str name="id">1</str>
    <str name="keyword">Key1</str>
    <arr name="res_url">
      <str>url1</str>
      <str>url2</str>
      <str>url3</str>
      <str>url4</str>
    </arr>
    <arr name="res_rank">
      <str>1</str>
      <str>2</str>
      <str>3</str>
      <str>4</str>
    </arr>
  </doc>
</result>

I would like to make Solr understand that, for this document, value "url1"
of the res_url field is linked to value "1" of the res_rank field, and all
of them are linked to the common field "keyword".
I think I should use a custom field analyzer or something like that,
but I don't know what to do.

Thanks for everything; any help would be greatly appreciated.


Sascha Szott wrote:
 
 Hi,
 
 you could create an additional index field res_ranked_url that contains
 the concatenated value of an url and its corresponding rank, e.g.,

   res_rank + " " + res_url

 Then, q=res_ranked_url:"1 url1" retrieves all documents with url1 as the
 first url.

 A drawback of this workaround is that you have to use a phrase query,
 thus preventing wildcard searches for urls.
 
 -Sascha
 

 Hello everybody, I would like to know how to create an index supporting a
 parent/child mapping and then query the child to get the results.
 In other words, imagine that we have a database containing 2 tables:
 Keyword[id(int), value(string)] and Result[id(int), res_url(text),
 res_text(text), res_date(date), res_rank(int)].
 For indexing, I used the DataImportHandler to import data and it works
 well, and my query response seems good (q=*:*) (imagine that we have only
 these two keywords and their results):

<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="q">*:*</str>
    </lst>
  </lst>
  <result name="response" numFound="2" start="0">
    <doc>
      <str name="id">1</str>
      <str name="keyword">Key1</str>
      <arr name="res_url">
        <str>url1</str>
        <str>url2</str>
        <str>url3</str>
        <str>url4</str>
      </arr>
      <arr name="res_rank">
        <str>1</str>
        <str>2</str>
        <str>3</str>
        <str>4</str>
      </arr>
    </doc>
    <doc>
      <str name="id">2</str>
      <str name="keyword">Key2</str>
      <arr name="res_url">
        <str>url1</str>
        <str>url5</str>
        <str>url8</str>
        <str>url7</str>
      </arr>
      <arr name="res_rank">
        <str>1</str>
        <str>2</str>
        <str>3</str>
        <str>4</str>
      </arr>
    </doc>
  </result>
</response>

 But the problem is when I type a query like this: q=res_url:url2 AND
 res_rank:1, meaning that I want to search for the keywords in which
 the url (url2) is ranked at the first position. I get a result like
 this:

<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="q">res_url:url2 AND res_rank:1</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">1</str>
      <str name="keyword">Key1</str>
      <arr name="res_url">
        <str>url1</str>
        <str>url2</str>
        <str>url3</str>
        <str>url4</str>
      </arr>
      <arr name="res_rank">
        <str>1</str>
        <str>2</str>
        <str>3</str>
        <str>4</str>
      </arr>
    </doc>
  </result>
</response>

 But this is not right, because the url present in the 1st position in the
 results of the keyword key1 is url1, not url2.
 So what I want to ask is: is there any solution to link the values of the
 multivalued fields?
 In our case, the previous result says that:
   - url1 is present in 1st position of key1's results
   - url2 is present in 2nd position of key1's results
   - url3 is present in 3rd position of key1's results
   - url4 is present in 4th position of key1's results

 and I would like Solr to take this into account when executing queries.

 Any help please; and thanks for all :)
 
 
 

-- 
View this message in context: 
http://old.nabble.com/how-to-do-a-Parent-Child-Mapping-using-entities-tp26956426p26965478.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using the new tokenizer API from a jar file

2009-12-30 Thread Ahmed El-dawy
Thanks all for your interest, especially Uwe. I asked this question on
solr-user at the beginning but I got no reply. That's why I re-asked the
question at java-user.
Thanks for your efforts. I will try it now.

On Mon, Dec 28, 2009 at 12:02 PM, Uwe Schindler u...@thetaphi.de wrote:

 I opened https://issues.apache.org/jira/browse/LUCENE-2182 about this
 problem and already have a fix.

 This is really a bug. The solution is simple: you have to load the
 Impl class using the same classloader as the passed-in interface. The
 default for Class.forName is the classloader of AttributeSource.class,
 which is the wrong one.
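For readers following along, the resolution step can be sketched in plain Java (a standalone illustration, not the actual LUCENE-2182 patch; CharSequence and java.lang.String merely stand in for the attribute interface and its Impl class):

```java
public class ForNameDemo {
    public static void main(String[] args) throws Exception {
        // Class.forName(name) resolves against the classloader of the
        // *calling* class -- for Lucene that is AttributeSource's loader,
        // which cannot see jars in Solr's plugin lib directory.
        // The fix is to resolve the Impl class against the classloader
        // of the interface that was passed in:
        Class<?> iface = CharSequence.class;
        ClassLoader cl = iface.getClassLoader(); // null for bootstrap classes
        Class<?> impl = Class.forName("java.lang.String", true,
                cl != null ? cl : ClassLoader.getSystemClassLoader());
        System.out.println(impl.getName());
    }
}
```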

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de

  -Original Message-
  From: Uwe Schindler [mailto:u...@thetaphi.de]
  Sent: Monday, December 28, 2009 9:20 AM
  To: java-u...@lucene.apache.org
  Cc: solr-user@lucene.apache.org
  Subject: RE: Using the new tokenizer API from a jar file
 
  The question on this list was OK, as it shows a minor problem of using the
  new TokenStream API with Solr.

  His plugin was loaded correctly: if Lucene says that it cannot find
  the *Impl class, it was able to load the interface class before - the JAR
  file is visible to the JVM.
 
  The problem is the following and has to do with classloaders:
 
  1. We have different class loaders for different places in Solr. Solr
 uses
  for plugins a SolrResourceLoader that searches for JAR files in the local
  lib folder before handling over to the webapp's classloader.
 
  2. Initially, the lucene JAR is loaded by the Webapp's class loader
 
  3. If an AttributeImpl is placed into a jar file, e.g. in the plugin
  folder of solr (the lib folder where solr loads all resources, stop
  words, ...), the loading mechanism inside
  AttributeSource.DEFAULT_ATTRIBUTE_FACTORY is unable to locate the class
  file, because Class.forName() always uses the class's own classloader and
  not the global/thread one. So AttributeSource will only find the class
  file if it is in the *same* directory as the lucene-core.jar file
  (WEB-INF/lib) and thus accessible by the webapp's class loader.
 
  A good introduction about the problem is this one:
 
 http://www.theserverside.com/tt/articles/content/dm_classForname/DynLoad.p
  df
 
  The problem is here described for the JVM extensions folder but also
  applies
  to solr, because it has another classloader for plugins.
 
  A solution to fix this would be in lucene to use the thread's context
  class
  loader in AttributeSource.DEFAULT_ATTRIBUTE_FACTORY, but I strongly
  discourage this, as it would break the whole AttributeSource
 functionality
  if you add two different attributes with same class names from different
  class loaders to the AttributeSource.
 
  The only solution to the problem is placing the JAR file inside the
  WEB-INF/lib folder where lucene-core.jar is. Plugins in Solr cannot define
  their own attribute implementations. Alternatively he could try to
  force-preload the class by calling Class.forName in his plugin
  initialization code on the Impl class. But I am not sure if this works
  (as Java handles classes from different classloaders differently).
 
  Uwe
 
  -
  Uwe Schindler
  H.-H.-Meier-Allee 63, D-28213 Bremen
  http://www.thetaphi.de
  eMail: u...@thetaphi.de
 
   -Original Message-
   From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
   Sent: Monday, December 28, 2009 4:27 AM
   To: java-u...@lucene.apache.org
   Subject: Re: Using the new tokenizer API from a jar file
  
  
   : I tried to use it with solr and the problems began. It's always
  telling
   me
   : that it cannot find the class GlossAttributeImpl. I think the problem
  is
   : that my jar file is added to the class path at run time not from the
   command
   : line. Do you have a good solution or workaround?
  
    You're likely to get more helpful answers from other people in the Solr
    User community (solr-u...@lucene.a.o)

    As long as you put your jar in the lib directory under your solr home
    (or reference it using a <lib/> directive in your solrconfig.xml) Solr's
    plugin loader will take care of the classloading for you.

    If you are confident you have your jar in the correct place, please
    email solr-user with the ClassNotFound stack trace from your solr logs,
    as well as the hierarchy of files from your solr home (ie: the output
    of find .)
  
  
   -Hoss
  
  
   -
   To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
   For additional commands, e-mail: java-user-h...@lucene.apache.org
 
 
 
  -
  To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
  For additional commands, e-mail: java-user-h...@lucene.apache.org



 -
 To unsubscribe, 

Re: Checkin mistake - example does not work in trunk

2009-12-30 Thread Grant Ingersoll
'ant example' is how the solr.war gets generated for the example. It's not
checked in.


On Dec 29, 2009, at 10:22 PM, Lance Norskog wrote:

 The distributed binaries do not include the new spatial types, so the
 .../trunk/example/ store app does not start.
 
 Please either always check in the latest binaries (a pain), or edit
 the README.txt to include "now first do an 'ant clean dist'".  (And
 maybe not include the binaries?)
 
 http://svn.apache.org/repos/asf/lucene/solr/trunk/example/README.txt
 
 
 -- 
 Lance Norskog
 goks...@gmail.com

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: 
http://www.lucidimagination.com/search



Re: Checkin mistake - example does not work in trunk

2009-12-30 Thread Yonik Seeley
On Tue, Dec 29, 2009 at 10:22 PM, Lance Norskog goks...@gmail.com wrote:
 The distributed binaries do not include the new spatial types, so the
 .../trunk/example/ store app does not start.

?
What distributed binaries are you referring to?  The nightly builds?
Are they missing a jar?

-Yonik
http://www.lucidimagination.com



Re: performance question

2009-12-30 Thread Grant Ingersoll

On Dec 29, 2009, at 2:19 PM, A. Steven Anderson wrote:

 Greetings!
 
 Is there any significant negative performance impact of using a
 dynamicField?

There can be an impact if you are searching against a lot of fields or if you 
are indexing a lot of fields on every document, but for the most part in most 
applications it is negligible. 

 
 Likewise for multivalued fields?

No.  Multivalued fields are just concatenated together with a large position 
gap underneath the hood.
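For illustration, the gap Grant mentions is configured per field type in schema.xml (a minimal sketch; the type and field names here are made up, not from the original post):

```xml
<!-- Values of a multivalued field are indexed back-to-back, separated by
     positionIncrementGap positions, so phrase/proximity queries do not
     accidentally match across two different values. -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
<field name="tags" type="text" indexed="true" stored="true" multiValued="true"/>
```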

 
 The reason why I ask is that our system basically aggregates data from many
 disparate data sources (structured, unstructured, and semi-structured), and
 the management of the schema.xml has become unwieldy; i.e. we currently have
 dozens of fields, a list which grows every time we add a new data source.
 
 I was considering redefining the domain model outside of Solr which would be
 used to generate the fields for the indexing process and the metadata (e.g.
 display names) for the search process.
 
 Thoughts?

It probably can't hurt to be more streamlined, but without knowing more about 
your model, it's hard to say.  I've built apps that were totally dynamic field 
based and they worked just fine, but these were more for discovery than just 
pure search.  In other words, the user was interacting with the system in a 
reflective model that selected which fields to search on.

-Grant

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: 
http://www.lucidimagination.com/search



Re: performance question

2009-12-30 Thread A. Steven Anderson
 There can be an impact if you are searching against a lot of fields or if
 you are indexing a lot of fields on every document, but for the most part in
 most applications it is negligible.


We index a lot of fields at one time, but we can tolerate the performance
impact at index time.

It probably can't hurt to be more streamlined, but without knowing more
 about your model, it's hard to say.  I've built apps that were totally
 dynamic field based and they worked just fine, but these were more for
 discovery than just pure search.  In other words, the user was interacting
 with the system in a reflective model that selected which fields to search
 on.


Our application is as much about discovery as search, so this is good to
know.

Thanks for the feedback. It was very helpful.
-- 
A. Steven Anderson
Independent Consultant
st...@asanderson.com


StreamingUpdateSolrServer

2009-12-30 Thread Patrick Sauts

Hi All,

I'm testing StreamingUpdateSolrServer for indexing, but I don't see the
last message:  finished: 
org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner@
in my logs. Do I have to call a special function to wait until the update is 
effective?


Another question (maybe an easy one for you): I'm running Solr on Tomcat 
5.0.28 and sometimes - not at a time of rsync, big traffic, or a commit - 
it stops responding and uptime is very high.


Thank you for your help.

Patrick.




RE: Delete, commit, optimize doesn't reduce index file size

2009-12-30 Thread Giovanni Fernandez-Kincade
Is there another way to make this happen without making further changes to the 
index? Maybe a bounce of the servlet server?



On Tue, Dec 29, 2009 at 1:23 PM, markwaddle m...@markwaddle.com wrote:
I have an index that used to have ~38M docs at 17.2GB. I deleted all but 13K
docs using a delete by query, commit and then optimize. A *:* query now
returns 13K docs. The problem is that the files on disk are still 17.1GB in
size. I expected the optimize to shrink the files. Is there a way I can
shrink them now that the index only has 13K docs?

Are you on Windows?
The IndexWriter can't delete files in use by the current IndexReader
(like it can in UNIX) when the commit is done.
If you make further changes to the index and do a commit, you should
see the space go down.

-Yonik
http://www.lucidimagination.com



Automating implementation of SolrInfoMBean

2009-12-30 Thread Mat Brown
Hi all,

Is there a standard way to automatically update the values returned by
the methods in SolrInfoMBean? Particularly those concerning revision
control etc. I'm assuming folks don't just update that by hand every
commit...

Thanks!
Mat


Build index by consuming web service

2009-12-30 Thread javaxmlsoapdev

I am in need of a handler which consumes a web service and builds an index
from the results returned by the service. Until now I was building the index
by reading data directly from a database query using DataImportHandler.

There are new functional requirements to index calculated fields and allow
search on them. I have exposed an application API as a web service, which
returns all attributes for indexing. How can I ask Solr to consume this
service and index the attributes returned by the service?

Any pointers would be appreciated.

Thanks,


-- 
View this message in context: 
http://old.nabble.com/Build-index-by-consuming-web-service-tp26970642p26970642.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: how to do a Parent/Child Mapping using entities

2009-12-30 Thread Ryan McKinley

Ya, structured data gets a little funny.

For starters, the order of multi-valued fields should be maintained,  
so if you have:


<doc>
  <field name="url">http://aaa</field>
  <field name="url_rank">5</field>
  <field name="url">http://bbb</field>
  <field name="url_rank">4</field>
</doc>

the response will return results in order, so you can map them with
array indices.
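A minimal client-side sketch of that index-based mapping (plain Java, using the example values from this thread rather than a live Solr call):

```java
import java.util.List;

public class PairFields {
    public static void main(String[] args) {
        // Values as they would come back from a Solr response; since the
        // order of multivalued fields is preserved, the parallel arrays
        // can be zipped together by index on the client side.
        List<String> urls = List.of("http://aaa", "http://bbb");
        List<String> ranks = List.of("5", "4");
        for (int i = 0; i < urls.size(); i++) {
            System.out.println(urls.get(i) + " -> rank " + ranks.get(i));
        }
    }
}
```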


I have played some tricks with a JSON field analyzer that give you  
some more control.


For example, if you index:

<doc>
  <field name="url">{"url": "http://host/", "rank": 5}</field>
</doc>

Then I use an analyzer that indexes the terms:
  url:http://host/
  rank:5

I just posted SOLR-1690, if you want to take a look at that approach

ryan


On Dec 30, 2009, at 4:25 AM, magui wrote:



Thanks Sascha for your post, but i find it interresting, but in my case i
don't want to use an additionnal field [...] is there any solution to make
two or more multivalued fields in the same document linked with each other?
[...]





Re: weird sorting behavior

2009-12-30 Thread Joel Nylund

Hi, so this is only available in 1.5?

I tried in 1.4 and got :

org.apache.solr.common.SolrException: Error loading class  
'solr.CollationKeyFilterFactory'


Is there a way to do this in 1.4?

The link Shalin sent is a 1.5 link I think.

thanks
Joel

On Dec 25, 2009, at 10:52 PM, Robert Muir wrote:

Hello, as Shalin said, you might want to try  
CollationKeyFilterFactory.


Below is an example (using the multilingual root locale), where the
spaces will sort after the letters and numbers as you mentioned, but
it will still not be case-sensitive. This is because strength is
'secondary'.

But are you really sure you want the spaces sorted after the letters
and numbers? Or instead do you just want them ignored for sorting? If
this is the case, then try 'primary', so that spaces, punctuation,
accents and things like that, in addition to case, are ignored in the
sort: for example " Test-1234" and "test1234" sort the same with
primary, but not with secondary (the one with leading spaces will sort
last)

If all else fails, you can write custom rules for it too, as Shalin  
mentioned.


<fieldType name="collatedROOT" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.CollationKeyFilterFactory"
            language=""
            strength="secondary"
            />
  </analyzer>
</fieldType>
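The strength setting maps down to the JDK collator; its effect on case sensitivity can be seen in isolation (a standalone sketch, not Solr code):

```java
import java.text.Collator;
import java.util.Locale;

public class CollatorDemo {
    public static void main(String[] args) {
        Collator c = Collator.getInstance(Locale.ROOT);
        // SECONDARY strength: case differences are ignored
        // (accent differences still matter at this level).
        c.setStrength(Collator.SECONDARY);
        System.out.println(c.compare("Test", "test") == 0);
        // TERTIARY strength: case differences now distinguish the strings.
        c.setStrength(Collator.TERTIARY);
        System.out.println(c.compare("Test", "test") == 0);
    }
}
```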

On Fri, Dec 25, 2009 at 5:37 AM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:


On Thu, Dec 24, 2009 at 11:51 PM, Joel Nylund jnyl...@yahoo.com  
wrote:


update, I tried changing to datatype string, and it sorts the  
numerics

better, but the other sorts are not as good.

Is there a way to control sorting for special chars, for example,  
I want

blanks to sort after letters and numbers.


In the general case, CollationKeyFilterFactory will do the trick.  
You could
create a custom rule set which sorts spaces after letters and  
numbers. See

http://wiki.apache.org/solr/UnicodeCollation



using alphaOnlySort - sorts nicely for alpha, but numbers dont work
string - sorts nicely for numbers and letters, but special chars  
like

blanks show up first in the list


alphaOnlySort has a PatternReplaceFilterFactory which removes all
characters
except a-z. This is the reason behind those weird results. You
could try

removing that filter and see if thats what you need.

--
Regards,
Shalin Shekhar Mangar.




--
Robert Muir
rcm...@gmail.com




Result ordering for Wildcard/Prefix queries or ConstantScoreQueries

2009-12-30 Thread Prasanna R
All documents matched for Wildcard and Prefix queries get the same score as
they are scored as a ConstantScoreQuery. Example query - title:abc*

In such cases, what determines the ordering of the results? Is it simply the
same order in which those document terms appeared when enumerating through
the terms of the field matched in the index?

Also, would it be possible to specify criteria determining the ordering of
such matches? I am assuming that should be possible but have little idea how
that could be done. Kindly provide guidance/help.

Regards,

Prasanna.


score = result of function query

2009-12-30 Thread Joe Calderon
how can i make the score be solely the output of a function query?

the function query wiki page details something like
 q=boxname:findbox+_val_:"product(product(x,y),z)"&fl=*,score


but that doesn't seem to work


--joe


Requesting feedback on solr-spatial plugin

2009-12-30 Thread Mat Brown
Hi all,

I've been working on a small Solr plugin to expose the basic
functionality of lucene-spatial as unobtrusively as possible. I've got
a basic implementation up and passing tests, and I was hoping to get
some feedback on it. Though I've coded against Lucene for a production
app in the past, this is my first time writing code for Solr's plugin
API, so I could easily be entirely on the wrong track.

Honest (even brutal!) feedback would be very much appreciated:

http://github.com/outoftime/solr-spatial

Thanks much,
Mat

P.S. I definitely don't want to step on anyone's toes with the name -
if solr-spatial is already in use, or reserved for a future official
contrib for Solr, let me know and I'll come up with something else!


Re: how to do a Parent/Child Mapping using entities

2009-12-30 Thread Sascha Szott

Hi,


Thanks Sascha for your post, but i find it interresting, but in my case i
don't want to use an additionnal field, i want to be able with the same
schema to do a simple query like : q=res_url:some url, and a query like
the other one;

You could easily write your own query parser (a QParserPlugin, in Solr's
terminology) that internally translates queries like


  q = res_url:url AND res_rank:rank

into

  q = res_ranked_url:"rank url"

thus hiding the res_ranked_url field from the user/client.

I'm not sure, but maybe it's possible to utilize the order of values 
within the multi-valued field res_url directly in the newly created 
parser. This seems like the cleanest solution to me.
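A sketch of just the rewriting step such a parser would perform (plain Java string manipulation; where exactly it would live inside a real QParserPlugin is left open here):

```java
public class QueryRewrite {
    // Translate a res_url/res_rank conjunction into a phrase query on the
    // combined res_ranked_url field, as described in the thread. The field
    // name assumes the res_ranked_url workaround from Sascha's first reply.
    static String rewrite(String url, String rank) {
        return "res_ranked_url:\"" + rank + " " + url + "\"";
    }

    public static void main(String[] args) {
        System.out.println(rewrite("url2", "1"));
    }
}
```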


-Sascha




Re: score = result of function query

2009-12-30 Thread Grant Ingersoll

On Dec 30, 2009, at 5:27 PM, Joe Calderon wrote:

 how can i make the score be solely the output of a function query?
 
 the function query wiki page details something like
 q=boxname:findbox+_val_:"product(product(x,y),z)"&fl=*,score
 
 

Wrap the non-function query part in parentheses and boost it by 0. In Solr 
1.5, you will be able to sort by a function query.
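Applied to the query from the original post, that suggestion would look something like this (a hedged sketch, not a tested query):

```
q=(boxname:findbox)^0 _val_:"product(product(x,y),z)"&fl=*,score
```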

-Grant

Re: Result ordering for Wildcard/Prefix queries or ConstantScoreQueries

2009-12-30 Thread Grant Ingersoll

On Dec 30, 2009, at 3:21 PM, Prasanna R wrote:

 All documents matched for Wildcard and Prefix queries get the same score as
 they are scored as a ConstantScoreQuery. Example query - title:abc*
 
 In such cases, what determines the ordering of the results? Is it simply the
 same order in which those document terms appeared when enumerating through
 the terms of the field matched in the index?

I'm assuming they are just in order of internal Lucene doc id, but I'd have to 
look to be sure.  There were also some changes to Lucene that allowed the 
collectors to take docs out of order, but again, I'd have to check whether 
that is the case.

 
 Also, would it be possible to specify criteria determining the ordering of
 such matches? I am assuming that should be possible but have little idea how
 that could be done. Kindly provide guidance/help.

Sort?

What problem are you trying to solve?

-Grant

Re: Requesting feedback on solr-spatial plugin

2009-12-30 Thread Mattmann, Chris A (388J)
Hi Mat,

Taking a quick look at your code via the gitHub browser (and not having
downloaded or run it, that's for later! :) ), it looks _very_ clean, and
well commented. Bravo!

If you get a chance and are interested in participating in the SOLR spatial
effort, there are a few issues you could take a look at, in particular,
based on what you have so far, I would take a look at SOLR-1568, having to
do with creating a QParserPlugin for spatial:

http://issues.apache.org/jira/browse/SOLR-1568

SOLR-773 tracks the general progress of all of the spatial work, here:

http://issues.apache.org/jira/browse/SOLR-773

There is also a wiki page for the community efforts:

http://wiki.apache.org/solr/SpatialSearch

If you're not familiar with it yet, there has been a ton of work on Local
SOLR and LocalLucene as well. You may want to check out those pages too,
located here:

http://www.gissearch.com/localsolr

Again, bravo on such a clean, easy to understand plugin! I'll try and test
out your code and provide some feedback if I get a chance soon. Also I
welcome and encourage your contribution/discussion on the SOLR mailing lists
and wiki area.

Cheers,
Chris


On 12/30/09 3:51 PM, Mat Brown m...@patch.com wrote:

 Hi all,
 
 I've been working on a small Solr plugin to expose the basic
 functionality of lucene-spatial as unobtrusively as possible. I've got
 a basic implementation up and passing tests, and I was hoping to get
 some feedback on it. Though I've coded against Lucene for a production
 app in the past, this is my first time writing code for Solr's plugin
 API, so I could easily be entirely on the wrong track.
 
 Honest (even brutal!) feedback would be very much appreciated:
 
 http://github.com/outoftime/solr-spatial
 
 Thanks much,
 Mat
 
 P.S. I definitely don't want to step on anyone's toes with the name -
 if solr-spatial is already in use, or reserved for a future official
 contrib for Solr, let me know and I'll come up with something else!
 


++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++




Re: Checkin mistake - example does not work in trunk

2009-12-30 Thread Lance Norskog
Rats! I did not rebuild after updating, so the new schema.xml tripped
over my old example solr.war. Never mind.

On 12/30/09, Yonik Seeley yo...@lucidimagination.com wrote:
 On Tue, Dec 29, 2009 at 10:22 PM, Lance Norskog goks...@gmail.com wrote:
 The distributed binaries do not include the new spatial types, so the
 .../trunk/example/ store app does not start.

 ?
 What distributed binaries are you referring to?  The nightly builds?
 Are they missing a jar?

 -Yonik
 http://www.lucidimagination.com



-- 
Lance Norskog
goks...@gmail.com


serialize SolrInputDocument to java.io.File and back again?

2009-12-30 Thread Phillip Rhodes
I want to store a SolrInputDocument to the filesystem until it can be sent
to the solr server via the solrj client.

I will be using a quartz job to periodically query a table that contains a
listing of SolrInputDocuments stored as java.io.File that need to be
processed.

Thanks for your time.
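One low-tech way to do the spooling is plain Java serialization, sketched below. The class and file names here are invented for illustration, and a HashMap of field name to values stands in for the SolrInputDocument, since whether SolrInputDocument itself implements java.io.Serializable depends on your Solr version — check yours, or copy its fields into a plain map like this before writing. (Solr's XML update format is another reasonable on-disk representation.)

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.List;

// Hypothetical spool helper: writes a document (as a field-name -> values map)
// to a file, and reads it back so a scheduled job can forward it to Solr later.
public class DocSpool {
    static void write(File f, HashMap<String, List<String>> doc) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f))) {
            out.writeObject(doc);
        }
    }

    @SuppressWarnings("unchecked")
    static HashMap<String, List<String>> read(File f) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(f))) {
            return (HashMap<String, List<String>>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("solr-doc", ".ser");
        HashMap<String, List<String>> doc = new HashMap<>();
        doc.put("id", java.util.Arrays.asList("42"));
        doc.put("title", java.util.Arrays.asList("spooled document"));
        write(f, doc);
        System.out.println(read(f).equals(doc));  // true
        f.delete();
    }
}
```

The quartz job would then list the spool directory (or query the tracking table), `read` each file, convert the map back into a SolrInputDocument, and send it via SolrJ.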


Re: Requesting feedback on solr-spatial plugin

2009-12-30 Thread Mat Brown
Hi Grant,

Thanks for the info and your point is well taken. I should have been
clearer that I have no intention of this project being a long-term
solution for spatial search in Solr - rather I was looking to build a
rough and ready solution that gives some basic spatial search
capabilities to tide us over until the real deal is available in Solr
1.5. That being said, I'd love to be of use in the official spatial
efforts, so I'll be sure to take a look at the related tickets and see
if there is anywhere I can help out.

Mat

On Wed, Dec 30, 2009 at 19:36, Grant Ingersoll gsi...@gmail.com wrote:
 Hi Mat,

 This is an area of active work in Solr right now (see SOLR-773 in JIRA for 
 the top level tracking issue).  Obviously you can do as you wish, but it 
 would be really great if you chipped in on making the capabilities in Solr 
 better (we've already added in the Lucene spatial jar, a bunch of distance 
 functions, sort by functions and a few spatial field types) instead of doing 
 something separate.

 In other words, spatial support is going to be baked into Solr 1.5, riding on 
 the tail of a whole slew of features that make Solr even more capable.  See 
 http://wiki.apache.org/solr/SpatialSearch for more details.

 Cheers,
 Grant

 On Dec 30, 2009, at 6:51 PM, Mat Brown wrote:

 Hi all,

 I've been working on a small Solr plugin to expose the basic
 functionality of lucene-spatial as unobtrusively as possible. I've got
 a basic implementation up and passing tests, and I was hoping to get
 some feedback on it. Though I've coded against Lucene for a production
 app in the past, this is my first time writing code for Solr's plugin
 API, so I could easily be entirely on the wrong track.

 Honest (even brutal!) feedback would be very much appreciated:

 http://github.com/outoftime/solr-spatial

 Thanks much,
 Mat

 P.S. I definitely don't want to step on anyone's toes with the name -
 if solr-spatial is already in use, or reserved for a future official
 contrib for Solr, let me know and I'll come up with something else!




Correct syntax for solrJ filter queries

2009-12-30 Thread Jay Fisher
I'm using solrJ to construct a query and it works just fine until I add the
following.


query.setFilterQueries("price:[*+TO+500]", "price:[500+TO+*]");


That generates this error


Caused by: org.apache.solr.common.SolrException: Bad Request


Bad Request


request:
http://balboa:8085/apache-solr-1.4.0/core0/select?q=red&facet=true&fl=*,score&rows=20&fq=price:[*+TO+500]&fq=price:[500+TO+*]&wt=javabin&version=1


What is the proper syntax for specifying a set of facet.queries?


RE: Search both diacritics and non-diacritics

2009-12-30 Thread Olala

I have followed it, but if I query with a diacritic it responds only with the
non-diacritic form. What I want is to query without diacritics and have Solr
respond with both the diacritic and non-diacritic forms :( 


Steven A Rowe wrote:
 
 Hi Olala,
 
 You can get something similar to what you want by copying the original
 field to another one where, as Hoss suggests, you apply
 ASCIIFoldingFilterFactory, and then rewriting queries to match against both
 fields, with a higher boost given to the original field.
 
 @Hoss: Olala would benefit from a feature that AFAICT Solr doesn't
 currently have: the ability to add synonyms based on arbitrary
 transforms.
 
 Steve
 
 On 12/28/2009 at 5:33 AM, Olala wrote:
 
  I tried it, but it is still not correct :(
 
 hossman wrote:
   I am developing a search engine with Solr, and now I want to search
   both with and without diacritics. For example: if I query kho, it
   should return kho, khó, khò, ... But if I query khó, it will
   return only khó.
   
   Does anyone have a solution? I have used <filter
   class="solr.ISOLatin1AccentFilterFactory"/> but it is not correct
  :(
  
  try ASCIIFoldingFilterFactory instead.
  
  -Hoss
 
 
 

-- 
View this message in context: 
http://old.nabble.com/Search-both-diacritics-and-non-diacritics-tp26897627p26975115.html
Sent from the Solr - User mailing list archive at Nabble.com.
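Steve's copyField-plus-boost suggestion could look roughly like this in schema.xml — a sketch only, with the field and type names (`name`, `name_folded`, `text_folded`) invented for illustration:

```xml
<!-- Analysis type whose ASCIIFoldingFilter collapses khó/khò/kho to "kho" -->
<fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Folded copy of the original field -->
<field name="name_folded" type="text_folded" indexed="true" stored="false"/>
<copyField source="name" dest="name_folded"/>
```

With dismax, something like qf=name^2.0 name_folded then matches both forms while ranking exact diacritic matches higher.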



Re: Correct syntax for solrJ filter queries

2009-12-30 Thread Erik Hatcher

Use query.addFacetQuery(str) instead.

Erik

On Dec 30, 2009, at 10:16 PM, Jay Fisher wrote:

I'm using solrJ to construct a query and it works just fine until I  
add the

following.


query.setFilterQueries("price:[*+TO+500]", "price:[500+TO+*]");


That generates this error


Caused by: org.apache.solr.common.SolrException: Bad Request


Bad Request


request:
http://balboa:8085/apache-solr-1.4.0/core0/select?q=red&facet=true&fl=*,score&rows=20&fq=price:[*+TO+500]&fq=price:[500+TO+*]&wt=javabin&version=1



What is the proper syntax for specifying a set of facet.queries?
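For what it's worth, part of the confusion here is often pre-encoding: SolrJ URL-encodes parameter values itself, so filter queries should be passed as raw strings with literal spaces (e.g. query.setFilterQueries("price:[* TO 500]", "price:[500 TO *]")), not with + signs already baked in. A quick stdlib sketch (class name invented) of what the form encoder produces from a raw range query:

```java
import java.net.URLEncoder;

// Illustration only: this is what application/x-www-form-urlencoded encoding
// does to a raw Solr range query. SolrJ applies this encoding for you, so a
// pre-encoded "price:[*+TO+500]" would have its '+' signs escaped to %2B and
// reach Solr as literal plus characters.
public class FilterQueryEncoding {
    static String encode(String rawFilterQuery) throws Exception {
        return URLEncoder.encode(rawFilterQuery, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // Spaces become '+', ':' and brackets become %XX escapes:
        System.out.println(encode("price:[* TO 500]"));  // price%3A%5B*+TO+500%5D
    }
}
```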




Re: absolute search

2009-12-30 Thread Olala

Can anyone help me??? plz!


Olala wrote:
 
 uhm, I am sorry, here is the debug output :)
 
 <lst name="debug">
   <str name="rawquerystring">book</str>
   <str name="querystring">book</str>
   <str name="parsedquery">+DisjunctionMaxQuery((name:book)~0.01) ()</str>
   <str name="parsedquery_toString">+(name:book)~0.01 ()</str>
   <lst name="explain">
     <str name="19534">
7.903358 = (MATCH) sum of:
  7.903358 = (MATCH) fieldWeight(name:book in 19533), product of:
    1.0 = tf(termFreq(name:book)=1)
    7.903358 = idf(docFreq=79, maxDocs=79649)
    1.0 = fieldNorm(field=name, doc=19533)
     </str>
     <str name="5925">
3.951679 = (MATCH) sum of:
  3.951679 = (MATCH) fieldWeight(name:book in 5924), product of:
    1.0 = tf(termFreq(name:book)=1)
    7.903358 = idf(docFreq=79, maxDocs=79649)
    0.5 = fieldNorm(field=name, doc=5924)
     </str>
     <str name="5933">
3.951679 = (MATCH) sum of:
  3.951679 = (MATCH) fieldWeight(name:book in 5932), product of:
    1.0 = tf(termFreq(name:book)=1)
    7.903358 = idf(docFreq=79, maxDocs=79649)
    0.5 = fieldNorm(field=name, doc=5932)
     </str>
     <str name="8049">
3.951679 = (MATCH) sum of:
  3.951679 = (MATCH) fieldWeight(name:book in 8048), product of:
    1.0 = tf(termFreq(name:book)=1)
    7.903358 = idf(docFreq=79, maxDocs=79649)
    0.5 = fieldNorm(field=name, doc=8048)
     </str>
     <str name="9358">
3.951679 = (MATCH) sum of:
  3.951679 = (MATCH) fieldWeight(name:book in 9357), product of:
    1.0 = tf(termFreq(name:book)=1)
    7.903358 = idf(docFreq=79, maxDocs=79649)
    0.5 = fieldNorm(field=name, doc=9357)
     </str>
   </lst>
   <str name="QParser">DisMaxQParser</str>
   <null name="altquerystring"/>
   <null name="boostfuncs"/>
   <arr name="filter_queries">
     <str/>
   </arr>
   <arr name="parsed_filter_queries"/>
   <lst name="timing">
     <double name="time">0.0</double>
     <lst name="prepare">
       <double name="time">0.0</double>
       <lst name="org.apache.solr.handler.component.QueryComponent">
         <double name="time">0.0</double>
       </lst>
       <lst name="org.apache.solr.handler.component.FacetComponent">
         <double name="time">0.0</double>
       </lst>
       <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
         <double name="time">0.0</double>
       </lst>
       <lst name="org.apache.solr.handler.component.HighlightComponent">
         <double name="time">0.0</double>
       </lst>
       <lst name="org.apache.solr.handler.component.StatsComponent">
         <double name="time">0.0</double>
       </lst>
       <lst name="org.apache.solr.handler.component.DebugComponent">
         <double name="time">0.0</double>
       </lst>
     </lst>
     <lst name="process">
       <double name="time">0.0</double>
       <lst name="org.apache.solr.handler.component.QueryComponent">
         <double name="time">0.0</double>
       </lst>
       <lst name="org.apache.solr.handler.component.FacetComponent">
         <double name="time">0.0</double>
       </lst>
       <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
         <double name="time">0.0</double>
       </lst>
       <lst name="org.apache.solr.handler.component.HighlightComponent">
         <double name="time">0.0</double>
       </lst>
       <lst name="org.apache.solr.handler.component.StatsComponent">
         <double name="time">0.0</double>
       </lst>
       <lst name="org.apache.solr.handler.component.DebugComponent">
         <double name="time">0.0</double>
       </lst>
     </lst>
   </lst>
 </lst>
 
 
 Erick Erickson wrote:
 
 Hmmm, nothing jumps out at me. What does Luke show you
 is actually in your index in the field in question? And what does
 adding debugQuery=on to the query show?
 
 On Thu, Dec 24, 2009 at 8:44 PM, Olala hthie...@gmail.com wrote:
 


 Oh, yes, here is my schema config:

 <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.StopFilterFactory"
             ignoreCase="true"
             words="stopwords.txt"
             enablePositionIncrements="true"/>
     <filter class="solr.WordDelimiterFilterFactory"
             generateWordParts="1" generateNumberParts="1" catenateWords="1"
             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.SnowballPorterFilterFactory"
             language="English"
             protected="protwords.txt"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
             ignoreCase="true" expand="true"/>
     <filter class="solr.StopFilterFactory"
             ignoreCase="true"
             words="stopwords.txt"
             enablePositionIncrements="true"/>
     <filter class="solr.WordDelimiterFilterFactory"
             generateWordParts="1" generateNumberParts="1" catenateWords="0"
             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.SnowballPorterFilterFactory"
             language="English"
             protected="protwords.txt"/>
   </analyzer>
 </fieldType>


 <field name="name" type="text" indexed="true" stored="true"
        multiValued="true"/>

 And my solrconfig.xml for search with dismax:

 <requestHandler name="dismax" class="solr.SearchHandler">
   <lst name="defaults">
     <str name="defType">dismax</str>
     <str name="echoParams">explicit</str>
     <float 

Re: Result ordering for Wildcard/Prefix queries or ConstantScoreQueries

2009-12-30 Thread Prasanna R
On Wed, Dec 30, 2009 at 5:04 PM, Grant Ingersoll gsi...@gmail.com wrote:


 On Dec 30, 2009, at 3:21 PM, Prasanna R wrote:

  All documents matched for Wildcard and Prefix queries get the same score
 as
  they are scored as a ConstantScoreQuery. Example query - title:abc*
 
  In such cases, what determines the ordering of the results? Is it simply
 the
  same order in which those document terms appeared when enumerating
 through
  the terms of the field matched in the index?

 I'm assuming they are just in order of internal Lucene doc id, but I'd have
 to look for sure.  There were also some changes to Lucene that allowed the
 collectors to take docs out of order, but again, I'd have to check to see if
 that is the case.

 
  Also, would it be possible to specify criteria determining the ordering
 of
  such matches? I am assuming that should be possible but have little idea
 how
  that could be done. Kindly provide guidance/help.

 Sort?

 What problem are you trying to solve?

 I am using a prefix query to match a bunch of documents and would like to
specify an ordering for the documents matched by that prefix query. This is
part of the work I am doing to implement an autocomplete feature, and I am
using the dismax query parser with some custom modifications. I assume you
mean that I can apply a sort ordering to the prefix query matches as part of
the results handler. I was not aware of that. Will look into it.

Thanks a lot for the help.

Regards,

Prasanna.


Re: Correct syntax for solrJ filter queries

2009-12-30 Thread Jay Fisher
Thanks! That did it.

~ Jay

On Wed, Dec 30, 2009 at 9:58 PM, Erik Hatcher erik.hatc...@gmail.comwrote:

 Use query.addFacetQuery(str) instead.

Erik


 On Dec 30, 2009, at 10:16 PM, Jay Fisher wrote:

  I'm using solrJ to construct a query and it works just fine until I add
 the
 following.


 query.setFilterQueries("price:[*+TO+500]", "price:[500+TO+*]");


 That generates this error


 Caused by: org.apache.solr.common.SolrException: Bad Request


 Bad Request


 request:

 http://balboa:8085/apache-solr-1.4.0/core0/select?q=red&facet=true&fl=*,score&rows=20&fq=price:[*+TO+500]&fq=price:[500+TO+*]&wt=javabin&version=1


 What is the proper syntax for specifying a set of facet.queries?





numFound is changing when query across distributed-seach with the same query.

2009-12-30 Thread johnson hong

Hi,all.
I found a problem on distributed-seach.
When I use ?q=keyword&start=0&rows=20 to query across a distributed search,
it returns numFound=181; then when I change the start param from 0 to 100, it
returns numFound=131.
Why does the same query return a different numFound?

-- 
View this message in context: 
http://old.nabble.com/numFound-is-changing-when-query-across-distributed-seach-with-the-same-query.-tp26976128p26976128.html
Sent from the Solr - User mailing list archive at Nabble.com.