Re: field collapsing bug (java.lang.ArrayIndexOutOfBoundsException)

2009-10-25 Thread Martijn v Groningen
Hi Joe,

Can you give a bit more context info? Like the exact search and the
field types you are using for example. Also are you doing a lot of
frequent updates to the index?

Cheers,

Martijn

2009/10/23 Joe Calderon calderon@gmail.com:
 seems to happen when sort on anything besides strictly score, even
 score desc, num desc triggers it, using latest nightly and 10/14 patch

 Problem accessing /solr/core1/select. Reason:

    4731592

 java.lang.ArrayIndexOutOfBoundsException: 4731592
        at org.apache.lucene.search.FieldComparator$StringOrdValComparator.copy(FieldComparator.java:660)
        at org.apache.solr.search.NonAdjacentDocumentCollapser$DocumentComparator.compare(NonAdjacentDocumentCollapser.java:235)
        at org.apache.solr.search.NonAdjacentDocumentCollapser$DocumentPriorityQueue.lessThan(NonAdjacentDocumentCollapser.java:173)
        at org.apache.lucene.util.PriorityQueue.insertWithOverflow(PriorityQueue.java:158)
        at org.apache.solr.search.NonAdjacentDocumentCollapser.doCollapsing(NonAdjacentDocumentCollapser.java:95)
        at org.apache.solr.search.AbstractDocumentCollapser.collapse(AbstractDocumentCollapser.java:208)
        at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:98)
        at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:66)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1148)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:387)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:539)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:520)


Re: Which query parser handles nested queries?

2009-10-25 Thread Chris Hostetter
: http://www.lucidimagination.com/blog/2009/03/31/nested-queries-in-solr/).
: When working with Solr 1.3 stable, I'm able to use this syntax
: effectively using the default requestHandler, but when I am
: hand-rolling my own requestHandler, it doesn't recognize the _query_

the magic field _query_ is special syntax of the SolrQueryParser (aka the 
lucene QParserPlugin).  (FYI: if you know Java, which I assume you do if 
you're writing your own requestHandler, you can find this by grepping the 
code base for '_query_')

So if you use SolrQueryParser in your own request handler you should be 
fine ... if you're writing a custom request handler you'll have to add 
that same special handling.
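
For reference, here is a rough sketch of what the nested-query syntax looks like when a client assembles it by hand. The helper name, the field names and the dismax local params are invented for illustration; only the `_query_:"..."` form itself comes from this thread and the linked blog post.

```python
def nested_query(local_params, query_text):
    """Wrap a sub-query in the _query_ magic field syntax.

    local_params is something like '{!dismax qf=title}' (hypothetical here).
    Backslashes and double quotes are escaped so the sub-query survives
    being embedded in a quoted string.
    """
    escaped = query_text.replace("\\", "\\\\").replace('"', '\\"')
    return f'_query_:"{local_params}{escaped}"'


# Combine an ordinary clause with a nested dismax clause.
q = "inStock:true AND " + nested_query("{!dismax qf=title}", "solr nested")
print(q)
```

The resulting string is only understood by a parser that implements the `_query_` hook, which is the point made above.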


-Hoss



Re: Store tika extracted result as xhtml

2009-10-25 Thread Chris Hostetter

: My objective is to be able to store it as xhtml in the field and be 
: able to retrieve it as cached output. Since tika is already giving xhtml 
: output, I wonder why Solr saves it as plain text. (Maybe I missed 
: something in the configuration??)

I'm not very familiar with Tika or Solr CELL, but I think what you are 
seeing is that Solr only asks Tika for the *content* of the DOM Nodes 
matched by the xpath and/or capture params (ie: node.getTextContent()).

I suspect it wouldn't be too hard to add an option to allow the capture of 
the serialized DOM Nodes.



-Hoss



Re: MoreLikeThis support Dismax parameters

2009-10-25 Thread Chris Hostetter

In the current code base the MLT Handler has generally been superseded 
by a MLT Component which may do what you want -- you can use any QParser 
you want to generate a DocList and the MLT Component then suggests similar 
docs for each doc in your list.

As I said: that may be what you're looking for (it's hard to tell based on 
your email) but the other possibility is that you want to be able to 
specify bq (and maybe bf) type params to influence the MLT portion of the 
request (ie: apply a bias so docs matching a particular query/func are 
more likely to be suggested) ... this is an area that hasn't really been 
very well explored as far as I can remember.

: From what I've read/found, MoreLikeThis doesn't support the dismax
: parameters that are available in the StandardRequestHandler (such as bq). Is
: it possible that we might get support for those parameters some time? What
: are the issues with MLT Handler inheriting from the StandardRequestHandler
: instead of RequestHandlerBase?


-Hoss



Re: Shards param accepts spaces between commas?

2009-10-25 Thread Chris Hostetter

: It seems like no, and should be an easy change.  I'm putting newlines
: after the commas so the large shards list doesn't scroll off the
: screen.

Yeah ... for some odd reason QueryComponent is using 
StrUtils.splitSmart() ... SolrPluginUtils.split() seems like a saner 
choice.
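
To make the requested behaviour concrete, here is a minimal sketch of a whitespace-tolerant split of a shards value. It is not the Solr implementation of either utility method named above; the function name and host names are made up for the example.

```python
import re


def split_shards(shards_param):
    """Split a shards value on commas, ignoring spaces and newlines around them."""
    parts = re.split(r"\s*,\s*", shards_param.strip())
    # Drop empty entries caused by a trailing or doubled comma.
    return [p for p in parts if p]


shards = """host1:8983/solr,
host2:8983/solr ,  host3:8983/solr"""
print(split_shards(shards))
```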

A better question is probably why the shards param isn't just multivalued.

(Yonik?)






-Hoss



Re: Solr and bitwise comparaison

2009-10-25 Thread Chris Hostetter

: I am trying to make a request in Solr similar to SELECT COUNT(*) FROM
: InscriptionNew WHERE choices & 17 > 0; in MySQL.
: Is it possible? Do you have an idea?

bitmask operations in DB queries like that are usually a result of using a 
single physical column to store many logical boolean columns -- either out 
of space concerns or as a way to pre-allocate boolean fields for later use 
without needing to add columns.

In Solr: just use a BoolField for each bit.
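
As a rough illustration of that suggestion, the sketch below expands an integer bitmask into one boolean field per bit at index time, and builds the query clause that corresponds to testing a mask such as 17. The field prefix `choice_bit_` and both function names are hypothetical; the schema would need a matching BoolField (or dynamic field) for each bit.

```python
def bitmask_to_bool_fields(mask, num_bits, prefix="choice_bit_"):
    """Return {field_name: bool} with one entry per bit of the mask."""
    return {f"{prefix}{i}": bool((mask >> i) & 1) for i in range(num_bits)}


def any_bit_query(mask, num_bits, prefix="choice_bit_"):
    """Build a query matching docs where at least one bit of the mask is set."""
    clauses = [f"{prefix}{i}:true" for i in range(num_bits) if (mask >> i) & 1]
    return " OR ".join(clauses)


# 17 is binary 10001, i.e. bits 0 and 4.
print(bitmask_to_bool_fields(5, 3))
print(any_bit_query(17, 5))
```

Counting the matches is then just a matter of reading numFound from the response.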




-Hoss



Re: Solrj Javabin and JSON

2009-10-25 Thread Patrick Jungermann
Hi Stefan,

you don't need to convert the Java objects built from the result
returned as Javabin. Instead of this, you could easily use the JSON
return format by setting wt=json. See also at [0] for more information
about this.
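
A minimal sketch of such a request, built with the standard library only. The base URL, the query and the function name are placeholders, and nothing is sent over the network here; the only part taken from the advice above is the wt=json parameter.

```python
from urllib.parse import urlencode


def select_url(base_url, query, rows=10):
    """Build a /select URL that asks Solr for a JSON response."""
    params = {"q": query, "wt": "json", "rows": rows}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local Solr instance.
print(select_url("http://localhost:8983/solr", "title:solr"))
```

A servlet could fetch this URL server-side and stream the body back to the browser unchanged.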


Patrick


[0] http://wiki.apache.org/solr/SolJSON


SGE0 schrieb:
 Hi Paul,
 
 
 fair enough. Is this included in the Solrj package ? Any examples how to do
 this ?
 
 
 Stefan
 
 
 
 Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
 There is no point converting javabin to json. javabin is an
 intermediate format; it is converted to Java objects as soon as it
 arrives. You just need a means to convert the Java objects to json.



 On Sat, Oct 24, 2009 at 12:10 PM, SGE0 stefangee...@hotmail.com wrote:
 Hi,

 did anyone write a Javabin to JSON converter and is willing to share it?

 In our servlet we use a CommonsHttpSolrServer instance to execute a query.

 The problem is that it returns Javabin format and we need to send the
 result back to the browser using JSON format.

 And no, the browser is not allowed to directly query Lucene with the
 wt=json format.

 Regards,

 S.
 --
 View this message in context:
 http://www.nabble.com/Solrj-Javabin-and-JSON-tp26036551p26036551.html
 Sent from the Solr - User mailing list archive at Nabble.com.




 -- 
 -
 Noble Paul | Principal Engineer| AOL | http://aol.com


 



Dismax params, mm lt explanation

2009-10-25 Thread ram_sj

Hi,
consider this minimum match params in dismax query handler, 

  <str name="mm">
2&lt;-1 3&lt;-2 6&lt;100%
  </str>

I requested solr to match at least two fields, which i understood from the
documents. Can someone give me explanations for other params in it? 

&lt;-1 3

&lt;-2 6

&lt;100%

how are they significant? 

thanks
ram



Re: question about merging indexes

2009-10-25 Thread Chris Hostetter

: I need some help about the mergeindex command. I have 2 cores A and B
: that I want to merge into a new index RES. A has 100 docs and B 10
: docs. All of B's docs are from A, except that one attribute is
: changed. The goal is to bring the updated attributes from B into A.

that's not how mergeindex works ... merging two indexes is essentially 
just adding all the docs from one index into the other (but w/o the 
reindexing step - it works by copying the raw term info)

There is no way to modify a doc once it's been indexed.

: When I issue the mergeindexes command  my RES core only has 10 docs. I
: expect RES to have 100  or even 110 docs, but 10 is very puzzling. Am
: I misunderstanding something about merging indexes?

what exactly was the command you used to do the merge?  you should have 
gotten 110 docs.
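
For comparison, a sketch of how a CoreAdmin merge request is usually put together, with one indexDir parameter per source index. The admin URL and directory paths are placeholders and the helper name is made up; check the parameters against the documentation for your Solr version before relying on them.

```python
from urllib.parse import urlencode


def merge_indexes_url(admin_url, target_core, index_dirs):
    """Build a mergeindexes request with one indexDir per source index."""
    params = [("action", "mergeindexes"), ("core", target_core)]
    params += [("indexDir", d) for d in index_dirs]
    return f"{admin_url}?{urlencode(params)}"


url = merge_indexes_url(
    "http://localhost:8983/solr/admin/cores",  # hypothetical host
    "RES",
    ["/path/to/A/data/index", "/path/to/B/data/index"],  # hypothetical paths
)
print(url)

# Merging only appends documents, so the expected count is a plain sum.
expected_docs = 100 + 10
print(expected_docs)
```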



-Hoss



Re: Searching over all Dynamic Fields: different things tested, multiple issues experienced

2009-10-25 Thread Chris Hostetter

: When I test it, if I test it with stored=true, it works as expected, if I
: test with stored=false the resultset is empty.

Adding stored=false has no impact on anything related to searching -- 
it only affects what values can be written out by the response writer. 
There's no way only changing that attribute on a field could produce the 
behavior you're describing.

If you post your schema.xml, some sample data, and examples of the queries 
you are attempting; people could probably help you spot what may be 
causing your problem.

-Hoss



Re: Constant Score Queries and Function Queries

2009-10-25 Thread Chris Hostetter

: Fair enough, I guess I was just kind of expecting a constant score query + a
: function query to result in a score of whatever the function query is.  This
: is a common trick to sort by a function, but it's easy enough to just ^0 the
: non function clause.

I think the root of the issue is that a ConstantScoreQuery has a constant 
score according to its boost, which defaults to 1, so all the usual 
queryNorm effects apply when using it in a BooleanQuery.

Random thought: one way to implement sort by function more naturally 
would be to add a sortfunc param that used the FunctionQParser, take the 
q query and move it to the Filter query list, then use the func query as 
your main query.  (all of this could be triggered by a new magic sort 
field _func_ desc which was equivalent to score desc after the 
query swapping ... things like sort=inStock desc, _func_ desc would 
still work as well)
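
A client-side sketch of that query swapping, done by hand on a parameter dict. This is only an illustration of the idea floated above, not an existing Solr feature; the function name and the sample field names are made up, and `{!func}` is used here as the local-params way of selecting the function query parser.

```python
def swap_for_function_sort(params, func):
    """Move q into fq and make the function the main (scoring) query."""
    swapped = dict(params)
    original_q = swapped.pop("q")
    # Keep any existing filter queries and add the old main query to them.
    swapped["fq"] = list(swapped.get("fq", [])) + [original_q]
    swapped["q"] = "{!func}" + func
    # With the function as the only scoring clause, score order is function order.
    swapped.setdefault("sort", "score desc")
    return swapped


print(swap_for_function_sort({"q": "text:ipod", "fq": ["inStock:true"]}, "popularity"))
```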



-Hoss



Re: field collapsing bug (java.lang.ArrayIndexOutOfBoundsException)

2009-10-25 Thread Martijn v Groningen
I was able to reproduce the exact same stacktrace you have sent. The
exception occurred when I removed a document from a newly created index
(with a commit) and then did a search with field collapsing enabled. I
have attached a new patch to SOLR-236 that includes a fix for this
bug.

Martijn

2009/10/25 Martijn v Groningen martijn.is.h...@gmail.com:
 Hi Joe,

 Can you give a bit more context info? Like the exact search and the
 field types you are using for example. Also are you doing a lot of
 frequent updates to the index?

 Cheers,

 Martijn

 2009/10/23 Joe Calderon calderon@gmail.com:
 seems to happen when sort on anything besides strictly score, even
 score desc, num desc triggers it, using latest nightly and 10/14 patch




Re: Dismax params, mm lt explanation

2009-10-25 Thread ram_sj

After reading the explanation in the book, it's very clear now. Thank you
for citing it with the page number.

Ram


hossman wrote:
 
 
 What you are looking at is an XML escaped version of this string...
 
   2<-1 3<-2 6<100%
 
 ...the syntax is documented here...
 
 http://wiki.apache.org/solr/DisMaxRequestHandler#mm_.28Minimum_.27Should.27_Match.29
 http://lucene.apache.org/solr/api/org/apache/solr/util/doc-files/min-should-match.html
 
 ...note that the string you have listed there actually makes very little 
 sense because of the 100% condition.  it says that for queries of more 
 than 6 clauses all of them are required (usually the mm param gets less 
 strict as the number of clauses increases)
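
To see the effect per clause count, here is a small sketch that evaluates a conditional mm spec such as `2<-1 3<-2 6<100%`. It follows the semantics described in the linked documentation as I understand them and is not Solr's own code; the function name is made up.

```python
def min_should_match(clause_count, spec):
    """Return how many optional clauses must match for a given mm spec."""
    result = clause_count
    spec = spec.strip()
    if "<" in spec:
        # Conditional specs: "N<X" means X applies when there are more than N clauses.
        for part in spec.split():
            bound, sub_spec = part.split("<", 1)
            if clause_count <= int(bound):
                return result
            result = min_should_match(clause_count, sub_spec)
        return result
    if spec.endswith("%"):
        calc = clause_count * int(spec[:-1]) / 100
        # Negative values mean "all but this many"; int() truncates toward zero.
        result = clause_count + int(calc) if calc < 0 else int(calc)
    else:
        calc = int(spec)
        result = clause_count + calc if calc < 0 else calc
    return min(clause_count, max(result, 0))


for n in range(1, 8):
    print(n, min_should_match(n, "2<-1 3<-2 6<100%"))
```

The jump back up to "all clauses" above 6 is the oddity pointed out above.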
 
 (FYI: As the creator of the 'mm' param syntax, one of my favorite parts of 
 the new Solr 1.4 book is the explanation of mm options with multiple 
 clauses.  It describes it in a completely different way from anything I'd 
 ever thought of before (I was convinced it was a huge mistake the first 
 two times I read that section before the light bulb went off and I 
 realized how brilliant it was) and is probably a lot easier for many 
 people to understand -- if you have the book it's on p139)
 
 : 2&lt;-1 3&lt;-2 6&lt;100%
   ...
 : I requested solr to match at least two fields, which i understood from the
 : documents. Can someone give me explanations for other params in it? 
 : 
 : &lt;-1 3
 : 
 : &lt;-2 6
 : 
 : &lt;100%
 
 
 
 -Hoss
 
 
 




begins with searches

2009-10-25 Thread Bernadette Houghton
We need to offer "begins with" type searches, e.g. a search for "surname, f" 
will retrieve "surname, firstname", "surname, f", "surname fm" etc.

Ideally, the user would be able to enter something like "surname f*".

However, wildcards don't work on phrase searches, nor do range searches.

Any suggestions as to how best to search for "begins with" phrases; or, how to 
best configure solr to support such searches?

TIA
Bernadette Houghton, Library Business Applications Developer
Deakin University Geelong Victoria 3217 Australia.
Phone: 03 5227 8230 International: +61 3 5227 8230
Fax: 03 5227 8000 International: +61 3 5227 8000
MSN: bern_hough...@hotmail.com
Email: bernadette.hough...@deakin.edu.au
Website: http://www.deakin.edu.au
Deakin University CRICOS Provider Code 00113B (Vic)




weird behaviour while inserting records into solr

2009-10-25 Thread Rakhi Khatwani
Hi,
 i was trying to insert one million records in solr (keeping the id from
0 to 100). things were fine till it inserted (id =  523932). after that
it started inserting it from 1 (i.e updating). i am not able to understand
this behaviour. any pointers??
Regards,
Raakhi


Re: Shards param accepts spaces between commas?

2009-10-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Sun, Oct 25, 2009 at 9:34 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : It seems like no, and should be an easy change.  I'm putting newlines
 : after the commas so the large shards list doesn't scroll off the
 : screen.

 Yeah ... for some odd reason QueryComponent is using
 StrUtils.splitSmart() ... SolrPluginUtils.split() seems like a saner
 choice.

 A better question is probably why the shards param isn't just multivalued.
good question. I guess it should be


 (Yonik?)






 -Hoss





-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com