Re: Data Import from a Queue

2011-07-20 Thread Stefan Matheis
Brandon, I don't know how they are using it in detail, but part of Chef's architecture is this: Chef Server - RabbitMQ - Chef Solr Indexer - Solr http://wiki.opscode.com/download/attachments/7274878/chef-server-arch.png Perhaps not exactly what you're looking for, but it may give you an

Re: how to get solr core information using solrj

2011-07-20 Thread Stefan Matheis
Jiang, what about http://wiki.apache.org/solr/CoreAdmin#STATUS ? Regards, Stefan. On 20.07.2011 05:40, Jiang mingyuan wrote: Hi all, our Solr server contains two cores, core0 and core1, and both work well. Now I am trying to find a way to get information about core0 and core1. Can solrj or

suggester component from trunk throwing error

2011-07-20 Thread abhayd
Hi, I am trying to configure the suggester component. I downloaded Solr from trunk and did a build. Here is my config: <requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler"> <lst name="defaults"> <str name="spellcheck">true</str> <str

Re: - character in search query

2011-07-20 Thread roySolr
Here is my complete fieldtype: <fieldType name="name" class="solr.TextField" positionIncrementGap="100"> <analyzer> <charFilter class="solr.HTMLStripCharFilterFactory"/> <tokenizer class="solr.PatternTokenizerFactory" pattern="\s|, " /> <filter class="solr.LowerCaseFilterFactory"/>

Re: query time boosting in solr

2011-07-20 Thread Sowmya V.B.
Can anyone throw some light on this issue? My problem is to give a query-time boost to certain documents whose field, say field1, falls in a range that the user chooses at query time. I think the below link indicates a range query:

Re: Solr UI

2011-07-20 Thread Gora Mohanty
On Tue, Jul 19, 2011 at 7:51 PM, Erik Hatcher erik.hatc...@gmail.com wrote: There's several starting points for Solr UI out there, but really the best choice is whatever fits your environment and the skills/resources you have handy.  Here's a few off the top of my head - [...] Besides these

Re: any detailed tutorials on plugin development?

2011-07-20 Thread Gora Mohanty
On Wed, Jul 20, 2011 at 6:29 AM, deniz denizdurmu...@gmail.com wrote: gosh sorry for my typo in msg first... i just realized it now... well anyway... i would like to find a detailed tutorial about how to implement an analyzer or a request handler plugin... but all i have got is nothing from

Re: - character in search query

2011-07-20 Thread roySolr
When I use the edismax handler the escaping works great (before, I used the dismax handler). The debugQuery shows me this: +((DisjunctionMaxQuery((name:arsenal)~1.0) DisjunctionMaxQuery((name:london)~1.0))~2 The \ is not in the parsedquery, so I get the results I wanted. I don't know why the dismax

Re: any detailed tutorials on plugin development?

2011-07-20 Thread samuele.mattiuzzo
Actually, I'm rewriting this wiki page, http://wiki.apache.org/solr/UpdateRequestProcessor , with a more detailed how-to. It will be ready and online after I get back from work!

term positions performance

2011-07-20 Thread Marco Martinez
Hi, I am developing a new term-proximity query and I am using term positions to get the positions of each term. I want to know if there are any tips for increasing the performance of term positions, at index time or at query time. All the fields to which I am applying term positions are

Re: term positions performance

2011-07-20 Thread Marco Martinez
Also, I developed this query as a function query; I wonder whether doing it as a normal query would increase the performance. Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de Europa, 26. Ática 5. 3ª Planta 28224 Pozuelo de Alarcón Tel.: 91 352 59 42 2011/7/20 Marco Martinez

Re: POST VS GET and NON English Characters

2011-07-20 Thread Sujatha Arun
Paul, I added the following line to catalina.sh and restarted the server, but this does not seem to help. JAVA_OPTS="-Djavax.servlet.request.encoding=UTF-8 -Dfile.encoding=UTF-8" Regards, Sujatha On Sun, Jul 17, 2011 at 3:51 AM, Paul Libbrecht p...@hoplahup.net wrote: If you have the option,
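
A minimal sketch of why the charset used to decode a GET query string matters, written in Python for brevity. The term "Müller" is an illustrative assumption and is not taken from the thread. It shows that a UTF-8 percent-encoded term round-trips only when the receiving side decodes it as UTF-8.

```python
from urllib.parse import quote, unquote

# Hypothetical non-English search term, used only for illustration.
term = "Müller"

# Percent-encode the term as UTF-8, as a client typically would for a GET URL.
encoded_utf8 = quote(term, encoding="utf-8")

# Decoding with the matching charset restores the original text.
decoded_ok = unquote(encoded_utf8, encoding="utf-8")

# Decoding the same bytes as ISO-8859-1 yields mojibake instead.
decoded_wrong = unquote(encoded_utf8, encoding="latin-1")

print(encoded_utf8)
print(decoded_ok)
print(decoded_wrong)
```

If the servlet container decodes the URI with a different charset than the client used, the query reaches Solr already garbled, regardless of how the index itself is encoded.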

Re: embeded solrj doesn't refresh index

2011-07-20 Thread Marco Martinez
You should send a commit to your embedded Solr. Marco Martínez Bautista 2011/7/20 Jianbin Dai j...@huawei.com Hi, I am using embedded solrj. After I add new doc to the

Re: query time boosting in solr

2011-07-20 Thread Sowmya V.B.
Hi Tomasso Thanks for a quick response. So, if I say: http://localhost:8085/apache-solr-3.3.0/select?indent=on&version=2.2&defType=dismax&q=scientific&bq=Field1:[20%20TO%2025]^10&start=0&rows=30 will it be right? The above query: boosts the documents which suit the given query (scientific), which

Re: query time boosting in solr

2011-07-20 Thread Tomás Fernández Löbbe
Yes, it should, but make sure you specify at least the qf parameter for dismax. You can activate debugQuery and you'll see which documents get boosted and which aren't. On Wed, Jul 20, 2011 at 9:21 AM, Sowmya V.B. vbsow...@gmail.com wrote: Hi Tomasso Thanks for a quick response. So, if I

Re: Solr 3.3: Exception in thread Lucene Merge Thread #1

2011-07-20 Thread mdz-munich
Update. After adding 1626 documents without doing a commit or optimize: /Exception in thread Lucene Merge Thread #1 org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException: Map failed at

Culr Tika not working with blanks into literal.field

2011-07-20 Thread Peralta Gutiérrez del Álamo
Hi. I'm trying to index binary documents with curl, using Tika to extract the text. The problem is that when I set a field value containing spaces through the input parameter literal.fieldname=value, the document is not indexed. The command I send is the following: curl
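
One thing worth checking in a case like this is whether the spaces in the literal value are percent-encoded before the URL is sent. The sketch below is in Python and only builds the URL; the host, port, handler path, field names and values are all assumptions for illustration, since the original curl command is cut off above.

```python
from urllib.parse import urlencode, quote


def build_extract_url(base_url, params):
    """Return base_url plus a query string with spaces encoded as %20."""
    # quote_via=quote encodes a space as %20 rather than '+'.
    query = urlencode(params, quote_via=quote)
    return f"{base_url}?{query}"


# Hypothetical endpoint and literal fields, for illustration only.
url = build_extract_url(
    "http://localhost:8983/solr/update/extract",
    {
        "literal.id": "doc1",
        "literal.title": "annual report 2011",
        "commit": "true",
    },
)
print(url)
```

The resulting URL can then be passed to curl inside quotes, so that the shell does not split it at any remaining special characters.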

Re: defType argument weirdness

2011-07-20 Thread Yonik Seeley
On Tue, Jul 19, 2011 at 11:41 PM, Jonathan Rochkind rochk...@jhu.edu wrote: Is it generally recognized that this terminology is confusing, or is it just me? I do understand what they do (at least well enough to use them), but I find it confusing that it's called defType as a main param, but

Re: query time boosting in solr

2011-07-20 Thread Sowmya V.B.
Hi Tomas Here is what I was trying to give. http://localhost:8085/apache-solr-3.3.0/select?indent=on&version=2.2&defType=dismax&q=scientific&bq=Field1:[20%20TO%2030]^10&start=0&rows=30&qf=text&fl=Field1,docid&debugQuery=on Over here, I was trying to change the range of Field1, keeping everything else
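
A minimal sketch, in Python, of assembling the dismax parameters discussed in this thread (q, qf, and a range boost in bq) and letting the library do the URL encoding. The helper name dismax_boost_params is hypothetical; the parameter values mirror the URL quoted above.

```python
from urllib.parse import urlencode


def dismax_boost_params(query, field, low, high, boost, qf):
    """Build dismax parameters with a boost query on a field range."""
    return {
        "defType": "dismax",
        "q": query,
        "qf": qf,  # dismax needs to know which fields to search
        "bq": f"{field}:[{low} TO {high}]^{boost}",
        "start": 0,
        "rows": 30,
        "debugQuery": "on",  # shows how each document's score was computed
    }


params = dismax_boost_params("scientific", "Field1", 20, 30, 10, "text")
query_string = urlencode(params)
print(query_string)
```

Changing only the low and high arguments changes which documents receive the boost while leaving the matched result set determined by q and qf.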

Re: Geospatial queries in Solr

2011-07-20 Thread Jamie Johnson
Thanks for responding so quickly, I don't mind waiting a bit. I'll hang out until the updates have been made. Thanks again. On Tue, Jul 19, 2011 at 3:51 PM, Smiley, David W. dsmi...@mitre.org wrote: Hi Jamie. I work on LSP; it can index polygons and query for them. Although the capability

Re: Geospatial queries in Solr

2011-07-20 Thread Smiley, David W.
Ryan just updated LSP for Lucene/Solr trunk compatibility so you should do a mvn clean install and you'll be back in business. On Jul 20, 2011, at 10:37 AM, Jamie Johnson wrote: Thanks for responding so quickly, I don't mind waiting a bit. I'll hang out until the updates have been made.

Reading Solr's JSON

2011-07-20 Thread Sowmya V.B.
Hi all, what is the best way to read Solr's JSON output from Java code? There seems to be a JSONParser in one of the jar files in SolrLib (org.apache.noggit..), but I don't understand how to read the parsed output from it. Are there any better JSON parsers for Java? S -- Sowmya V.B.

Re: Solr suggester and spell checker

2011-07-20 Thread abhayd
Hi, I am having the same issue. Did you find a solution to this problem?

Re: Reading Solr's JSON

2011-07-20 Thread Yonik Seeley
On Wed, Jul 20, 2011 at 10:58 AM, Sowmya V.B. vbsow...@gmail.com wrote: Which is the best way to read Solr's JSON output, from a Java code? You could use SolrJ - it handles parsing for you (and uses the most efficient binary format by default). There seems to be a JSONParser in one of the jar
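
For readers who only need to pick values out of the JSON response, a minimal sketch follows. It is in Python rather than Java for brevity, and the sample response is a hand-written illustration of the usual layout (responseHeader, then response with numFound, start and docs); the field names id and Field1 are assumptions.

```python
import json

# Hand-written sample in the shape of a Solr wt=json response (illustrative).
sample = """
{
  "responseHeader": {"status": 0, "QTime": 1},
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"id": "doc1", "Field1": 21},
      {"id": "doc2", "Field1": 27}
    ]
  }
}
"""

data = json.loads(sample)

# The matching documents live under response.docs as a list of objects.
num_found = data["response"]["numFound"]
ids = [doc["id"] for doc in data["response"]["docs"]]

print(num_found, ids)
```

The same navigation (top-level object, then "response", then "docs") applies whichever JSON library is used to parse the text.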

Manipulating a Fuzzy Query's Prefix Length

2011-07-20 Thread Kyle Lee
We're performing fuzzy searches on a field possessing a large number of unique terms. Specifying a required minimum similarity of 0.7 results in a query execution time of 13-15 seconds, which stands in stark contrast to our average query time of 40ms. We suspect that the performance problem most

Tokenizer Question

2011-07-20 Thread Jamie Johnson
I have a query which starts out as something like name:john, and I need to expand it to something like name:(john johnny). I've implemented a custom tokenizer which gets close but isn't quite right: it outputs name:john johnny. Is there a simple example of doing what I'm attempting?

How can i find a document by a special id?

2011-07-20 Thread Per Newgro
Hi, I'm new to Solr. I built an application using the standard Solr 3.3 examples as default. My id field is a string and is copied to a solr.TextField (searchtext) for search queries. All works fine except when I try to get documents by a special id. Let me explain the details. Assume id =

Re: How can i find a document by a special id?

2011-07-20 Thread Kyle Lee
Perhaps I'm missing something, but if your fields are indexed as 1234567 but users are searching for AB1234567, is it not possible simply to strip the prefix from the user's input before sending the request? On Wed, Jul 20, 2011 at 10:57 AM, Per Newgro per.new...@gmx.ch wrote: Hi, i'm new to

Re: Tokenizer Question

2011-07-20 Thread Kyle Lee
I'm not sure how to accomplish what you're asking, but have you considered using a synonyms file? This would also allow you to catch ostensibly unrelated name substitutes such as Robert - Bob and Richard - Dick. On Wed, Jul 20, 2011 at 10:57 AM, Jamie Johnson jej2...@gmail.com wrote: I have a

Re: Tokenizer Question

2011-07-20 Thread Jamie Johnson
My use case really isn't names, I just used that as a simplification. I did look at the Synonym filter to see if I could implement a similar filter (if that was a more appropriate place to do so) but even after doing that I ended up with the same result. On Wed, Jul 20, 2011 at 12:07 PM, Kyle Lee

Re: Geospatial queries in Solr

2011-07-20 Thread Jamie Johnson
Thanks for the update David, I'll give that a try now. On Wed, Jul 20, 2011 at 10:58 AM, Smiley, David W. dsmi...@mitre.org wrote: Ryan just updated LSP for Lucene/Solr trunk compatibility so you should do a mvn clean install and you'll be back in business. On Jul 20, 2011, at 10:37 AM,

Re: How can i find a document by a special id?

2011-07-20 Thread Per Newgro
On 20.07.2011 18:03, Kyle Lee wrote: Perhaps I'm missing something, but if your fields are indexed as 1234567 but users are searching for AB1234567, is it not possible simply to strip the prefix from the user's input before sending the request? On Wed, Jul 20, 2011 at 10:57 AM, Per

Wiki Error JSON syntax

2011-07-20 Thread Remy Loubradou
Hi, I was writing a Solr client API for Node and I found an error on this page: http://wiki.apache.org/solr/UpdateJSON . In the section Update Commands, the JSON is not valid because there are duplicate keys: add and delete each appear twice. I tried with an array and it doesn't work either; I got

Re: Solr 3.3: Exception in thread Lucene Merge Thread #1

2011-07-20 Thread mdz-munich
Here we go ... This time we tried to use the old LogByteSizeMergePolicy and SerialMergeScheduler: <mergePolicy class="org.apache.lucene.index.LogByteSizeMergePolicy"/> <mergeScheduler class="org.apache.lucene.index.SerialMergeScheduler"/> We did this before, just to be sure ... ~300 Documents: /

Re: Wiki Error JSON syntax

2011-07-20 Thread Yonik Seeley
On Wed, Jul 20, 2011 at 12:16 PM, Remy Loubradou remyloubra...@gmail.com wrote: Hi, I was writing a Solr client API for Node and I found an error on this page: http://wiki.apache.org/solr/UpdateJSON . In the section Update Commands, the JSON is not valid because there are duplicate keys and two

Re: query time boosting in solr

2011-07-20 Thread Tomás Fernández Löbbe
So, what you want is to have the same exact results set as if the query was scientific, but the documents that also match Field1:[20 TO 30] to have more score, right? On Wed, Jul 20, 2011 at 10:53 AM, Sowmya V.B. vbsow...@gmail.com wrote: Hi Tomas Here is what I was trying to give.

Re: How can i find a document by a special id?

2011-07-20 Thread Kyle Lee
Is the mediacode always alphabetic, and is the ID always numeric?

Schema design/data import

2011-07-20 Thread Travis Low
Greetings. I am struggling to design a schema and a data import/update strategy for some semi-complicated data. I would appreciate any input. What we have is a bunch of database records that may or may not have files attached. Sometimes no files, sometimes 50. The requirement is to index the

Re: How can i find a document by a special id?

2011-07-20 Thread Per Newgro
On 20.07.2011 19:23, Kyle Lee wrote: Is the mediacode always alphabetic, and is the ID always numeric? No, sadly not. We expose our products on too many media :-). Per

Re: Geospatial queries in Solr

2011-07-20 Thread Jamie Johnson
So I've pulled the latest and can run the example. I've tried to move my config over and am having a bit of an issue when executing queries; specifically I get this: Unable to read: POLYGON((... Looking at the code, it's using the simple spatial context. How do I specify JtsSpatialContext? On

Re: Wiki Error JSON syntax

2011-07-20 Thread Remy Loubradou
I think I can trust you, but this is weird. Funny thing: if you try to validate this JSON on http://jsonlint.com/, duplicate keys are automatically removed. But the thing is, how can you possibly generate this JSON with a JavaScript object? It would be really nice to combine both ways that you show
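
On the question of generating such a message: an ordinary map or object cannot hold the same key twice, so one option is to serialize a list of (key, value) pairs by hand. The sketch below does this in Python; dumps_pairs is a hypothetical helper, and the commands are made-up examples in the spirit of the wiki page under discussion.

```python
import json


def dumps_pairs(pairs):
    """Serialize (key, value) pairs as one JSON object, keeping duplicates."""
    body = ", ".join(f"{json.dumps(k)}: {json.dumps(v)}" for k, v in pairs)
    return "{" + body + "}"


# Illustrative update commands; the same key "add" appears twice on purpose.
commands = [
    ("add", {"doc": {"id": "1"}}),
    ("add", {"doc": {"id": "2"}}),
    ("commit", {}),
]
text = dumps_pairs(commands)

# A plain parse keeps only the last value for a repeated key ...
as_dict = json.loads(text)

# ... while object_pairs_hook exposes every pair in document order.
as_pairs = json.loads(text, object_pairs_hook=list)
keys = [key for key, _ in as_pairs]

print(text)
print(as_dict["add"])
print(keys)
```

This also illustrates why a validator that loads the text into a plain object appears to "remove" the duplicates: the later value silently replaces the earlier one.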

Re: Geospatial queries in Solr

2011-07-20 Thread Smiley, David W.
You can set the system property SpatialContextProvider to com.googlecode.lucene.spatial.base.context.JtsSpatialContext ~ David On Jul 20, 2011, at 2:02 PM, Jamie Johnson wrote: So I've pulled the latest and can run the example, I've tried to move my config over and am having a bit of an

Re: Geospatial queries in Solr

2011-07-20 Thread Jamie Johnson
Where do you set that? On Wed, Jul 20, 2011 at 2:37 PM, Smiley, David W. dsmi...@mitre.org wrote: You can set the system property SpatialContextProvider to com.googlecode.lucene.spatial.base.context.JtsSpatialContext ~ David On Jul 20, 2011, at 2:02 PM, Jamie Johnson wrote: So I've

Re: Geospatial queries in Solr

2011-07-20 Thread Smiley, David W.
The notion of a system property is a java concept; google it and you'll learn more. BTW, despite my responsiveness in helping right now; I'm pretty busy this week so this won't necessarily last long. ~ David On Jul 20, 2011, at 2:43 PM, Jamie Johnson wrote: Where do you set that? On Wed,

RE: embeded solrj doesn't refresh index

2011-07-20 Thread Jianbin Dai
Hi Thanks for response. Here is the whole picture: I use DIH to import and index data. And use embedded solrj connecting to the index file for search and other operations. Here is what I found: Once data are indexed (and committed), I can see the changes through solr web server, but not from

set queryNorm to 1?

2011-07-20 Thread Elaine Li
Hi folks, my boost function is bf=div(product(num_clicks,0.3),sum(num_clicks,25)). I would like to add its score directly to the final score instead of letting it be normalized by the queryNorm value. Is there any way to do it? Thanks. Elaine
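
To make the shape of this boost function concrete, here is the same arithmetic as a small Python function: 0.3 * num_clicks / (num_clicks + 25). The function name and keyword arguments are hypothetical; only the formula comes from the message above. This shows the raw value before any normalization is applied.

```python
def click_boost(num_clicks, weight=0.3, damping=25):
    """Raw value of div(product(num_clicks, weight), sum(num_clicks, damping))."""
    return (num_clicks * weight) / (num_clicks + damping)


# The value starts at 0, reaches half of the weight when num_clicks equals
# the damping constant, and approaches (but never reaches) the weight itself.
for clicks in (0, 25, 100, 10_000):
    print(clicks, round(click_boost(clicks), 4))
```

So the raw contribution of this function is bounded between 0 and 0.3, whatever the click count.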

Re: query time boosting in solr

2011-07-20 Thread Sowmya V.B.
Hi Tomas Yeah, I now understand it. I was confused about interpreting the output. Thanks for the comments. Sowmya. 2011/7/20 Tomás Fernández Löbbe tomasflo...@gmail.com So, what you want is to have the same exact results set as if the query was scientific, but the documents that also match

Schema Design/Data Import

2011-07-20 Thread travis
[Apologies if this is a duplicate -- I have sent several messages from my work email and they just vanish, so I subscribed with my personal email] Greetings. I am struggling to design a schema and a data import/update strategy for some semi-complicated data. I would appreciate any input.

Re: How can i find a document by a special id?

2011-07-20 Thread Chris Hostetter
: On 20.07.2011 19:23, Kyle Lee wrote: : Is the mediacode always alphabetic, and is the ID always numeric? : : No sadly not. We expose our products on too many medias :-). If I'm understanding you correctly, you're saying even the prefix AB is not special, that there could be any number of

Re: Tokenizer Question

2011-07-20 Thread Chris Hostetter
When the QueryParser gives hunks of text to an analyzer, and that analyzer produces multiple terms, the query parser has to decide how to build a query out of it. If the terms have identical position information, then it always builds an OR query (this is the typical synonym situation). If
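
A toy illustration, in Python, of the one rule stated above: tokens that share a single position are combined with OR. This is not Lucene code; the function name and the (term, position) tuples are assumptions for illustration, and the case of differing positions is deliberately left unhandled because the message is cut off before describing it.

```python
def or_query_if_same_position(field, tokens):
    """Return 'field:(a OR b)' when all tokens share one position, else None.

    tokens is a list of (term, position) pairs, as a toy stand-in for the
    terms and position information an analyzer would produce.
    """
    positions = {position for _, position in tokens}
    if len(positions) != 1:
        return None  # not covered by the rule quoted above
    terms = " OR ".join(term for term, _ in tokens)
    return f"{field}:({terms})"


# Both tokens sit at position 0, as a synonym-style expansion would emit them.
print(or_query_if_same_position("name", [("john", 0), ("johnny", 0)]))
```

In terms of the tokenizer question in this thread, this suggests the expanded term should be emitted at the same position as the original term rather than at the next one.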

RE: defType argument weirdness

2011-07-20 Thread Chris Hostetter
: I do understand what they do (at least well enough to use them), but I : find it confusing that it's called defType as a main param, but type : in a LocalParam, when to me they both seem to do the same thing -- which type as a localparam in a query string defines the type of query string

Re: defType argument weirdness

2011-07-20 Thread Jonathan Rochkind
Huh, I'm still not completely following. I'm sure it makes sense if you understand the underlying implementation, but I don't understand how 'type' and 'defType' don't mean exactly the same thing, just need to be expressed differently in different locations. Sorry for beating a dead horse, but

Re: Tokenizer Question

2011-07-20 Thread Jamie Johnson
Thanks, I'll try that now, I'm assuming I need to add the position increment and offset attributes? On Wed, Jul 20, 2011 at 3:44 PM, Chris Hostetter hossman_luc...@fucit.org wrote: When the QueryParser gives hunks of text to an analyzer, and that analyzer produces multiple terms, the query

solrj and XML result sets

2011-07-20 Thread Joe Shubitowski
Does anyone have advice as to how to produce an XML result set using SolrJ?? My Java coder says he can *only* produce result sets in javabin - which is fine in most cases - but we have a need for an XML output stream as well. Thanks...

RE: Solr 3.3: Exception in thread Lucene Merge Thread #1

2011-07-20 Thread Robert Petersen
Says it is caused by a Java out of memory error, no? -Original Message- From: mdz-munich [mailto:sebastian.lu...@bsb-muenchen.de] Sent: Wednesday, July 20, 2011 9:18 AM To: solr-user@lucene.apache.org Subject: Re: Solr 3.3: Exception in thread Lucene Merge Thread #1 Here we go ...

Re: How can i find a document by a special id?

2011-07-20 Thread Bill Bell
Why not just search the 2 fields? q=*:*&fq=mediacode:AB OR id:123456 You could take the user input and replace it: q=*:*&fq=mediacode:$input OR id:$input Of course you can also use dismax and wrap with an OR. Bill Bell Sent from mobile On Jul 20, 2011, at 3:38 PM, Chris Hostetter
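
The substitution suggested above can be sketched as follows, in Python. The helper names are hypothetical; the field names mediacode and id come from the message. The user input is wrapped in double quotes, with embedded quotes and backslashes escaped, so that it is treated as a single value. That quoting step is an added precaution, not something the message itself specifies.

```python
from urllib.parse import urlencode


def quote_term(value):
    """Wrap user input in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def either_field_params(user_input):
    """Match all documents, then filter on mediacode OR id for the input."""
    term = quote_term(user_input)
    return {"q": "*:*", "fq": f"mediacode:{term} OR id:{term}"}


params = either_field_params("AB1234567")
print(params["fq"])
print(urlencode(params))
```

Because the same input is tried against both fields, the caller does not need to know in advance which part of the input is the media code and which is the id.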

Re: Data Import from a Queue

2011-07-20 Thread Bill Bell
Yes this is a good reason for using a queue. I have used Amazon SQS this way and it was simple to set up. Bill Bell Sent from mobile On Jul 20, 2011, at 2:59 AM, Stefan Matheis matheis.ste...@googlemail.com wrote: Brandon, i don't know how they are using it in detail, but Part of Chef's

Re: Geospatial queries in Solr

2011-07-20 Thread Jamie Johnson
Thanks David. When trying to execute queries on a complex irregular polygon (say the shape of NJ) I'm getting results which are actually outside of that polygon. Is there a setting which controls this resolution? On Wed, Jul 20, 2011 at 2:53 PM, Smiley, David W. dsmi...@mitre.org wrote: The

Updating fields in an existing document

2011-07-20 Thread Benson Margulies
We find ourselves in the following quandary: at initial index time, we store a value in a field, and we use it for faceting. So it, seemingly, has to be there as a field. However, from time to time, something happens that causes us to want to change this value. As far as we know, this requires

Re: Question on the appropriate software

2011-07-20 Thread Erick Erickson
Solr would work fine for this. Your PDF files would have to be interpreted by Tika, but see Data Import Handler, FileListEntityProcessor and TikaEntityProcessor. I don't quite think Nutch is the tool here. You'll be wanting to do highlighting and a couple of other things. You'll spend some

RE: Updating fields in an existing document

2011-07-20 Thread Jonathan Rochkind
Nope, you're not missing anything, there's no way to alter a document in an index but reindexing the whole document. Solr's architecture would make it difficult (although never say impossible) to do otherwise. But you're right it would be convenient for people other than you. Reindexing a

RE: Solr 3.3: Exception in thread Lucene Merge Thread #1

2011-07-20 Thread mdz-munich
Yeah, indeed. But since the VM is equipped with plenty of RAM (22GB) and it works so far (Solr 3.2) very well with this setup, I AM slightly confused, am I? Maybe we should LOWER the dedicated Physical Memory? The remaining 10GB are used for a second tomcat (8GB) and the OS (Suse). As far as I

Solr not returning results for some key words

2011-07-20 Thread Matthew Twomey
Greetings, I'm having trouble getting Solr to return results for key words that I know for sure are in the index. As a test, I've indexed a PDF of a book on Java. I'm trying to search the index for UnsupportedOperationException but I get no results. I can see it in the index though: #

Re: Manipulating a Fuzzy Query's Prefix Length

2011-07-20 Thread Kyle Lee
Update: Solr/Lucene 4.0 will incorporate a new fuzzy search algorithm with substantial performance improvements. To tide us over until this release, we've simply rebuilt from source with a default prefix length of 2, which will suit our needs until then. On Wed, Jul 20, 2011 at 10:09 AM, Kyle
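
A toy sketch, in Python, of why a non-zero prefix length helps: only terms that share the first few characters with the query term need to be compared at all. This is not Lucene's implementation; the function name and the term list are assumptions for illustration.

```python
def fuzzy_candidates(terms, query, prefix_length):
    """Return the terms that share the first prefix_length characters."""
    prefix = query[:prefix_length]
    return [term for term in terms if term.startswith(prefix)]


# Hypothetical set of unique terms in a field.
terms = ["solr", "solar", "polar", "sonar", "color", "lucene"]

# With prefix length 0, every term is a candidate for the similarity check.
all_terms = fuzzy_candidates(terms, "solr", 0)

# With prefix length 2, only terms starting with "so" remain.
narrowed = fuzzy_candidates(terms, "solr", 2)

print(len(all_terms), all_terms)
print(len(narrowed), narrowed)
```

The trade-off is that a misspelling within the prefix itself (for example "polr" for "solr") can no longer match.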

Announcement/Invitation: Melbourne Solr/Lucene Users Group

2011-07-20 Thread Tal Rotbart
Hi all, I hope you won't mind me informing the list, but I thought some Melbourne-based members would find this relevant. We have noticed that there is a blossoming of Apache Solr/Lucene usage development in Melbourne in addition to a lack of an unofficial, relaxed gathering to allow some

Re: Announcement/Invitation: Melbourne Solr/Lucene Users Group

2011-07-20 Thread Dave Hall
Hi Tal, On 21/07/11 14:04, Tal Rotbart wrote: We have noticed that there is a blossoming of Apache Solr/Lucene usage development in Melbourne in addition to a lack of an unofficial, relaxed gathering to allow some fruitful information and experience exchange. We're trying to put together a

Re: Solr not returning results for some key words

2011-07-20 Thread Matthew Twomey
Ok, apparently I'm not the first to have fallen prey to the maxFieldLength gotcha: http://lucene.472066.n3.nabble.com/Solr-ignoring-maxFieldLength-td473263.html All fixed now. -Matt On 07/20/2011 07:13 PM, Matthew Twomey wrote: Greetings, I'm having trouble getting Solr to return results for

Re: Question on the appropriate software

2011-07-20 Thread Matthew Twomey
Excellent, thanks for the confirmation, Erick. I've started working with Solr (just getting my feet wet at this point). -Matt On 07/20/2011 05:38 PM, Erick Erickson wrote: Solr would work fine for this, your PDF files would have to be interpreted by Tika, but see Data Import handler,

Re: Announcement/Invitation: Melbourne Solr/Lucene Users Group

2011-07-20 Thread Mark Mandel
Sounds great :) I'll sign up as well. Look forward to a meeting! Mark On Thu, Jul 21, 2011 at 2:14 PM, Dave Hall dave.h...@skwashd.com wrote: Hi Tal, On 21/07/11 14:04, Tal Rotbart wrote: We have noticed that there is a blossoming of Apache Solr/Lucene usage development in Melbourne

Re: Announcement/Invitation: Melbourne Solr/Lucene Users Group

2011-07-20 Thread Ranveer Kumar
Hi, I'm interested in attending, but I'm not in Australia. :-( Regards On 21-Jul-2011 9:45 AM, Dave Hall dave.h...@skwashd.com wrote: Hi Tal, On 21/07/11 14:04, Tal Rotbart wrote: We have noticed that there is a blossoming of Apache Solr/Lucene usage development in Melbourne in addition to a lack of an