Re: NOT keyword - doesn't work with dismax?

2010-04-29 Thread Chris Hostetter
: Ah, dismax doesn't support top-level NOT query. Hmm, yeah, I don't think support for purely negated queries was ever added to dismax. I'm pretty sure that as a workaround you can add something like... bq=*:*^0.001 ...to your query. Based on the dismax structure, that should
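
A minimal sketch of the workaround suggested above, which the poster describes only as something he is "pretty sure" about. It just builds the request URL and sends nothing. The host, the qf fields, the search term and the function name are hypothetical; only bq=*:*^0.001 comes from the message.

```python
from urllib.parse import urlencode


def dismax_negative_query_params(negated_terms, qf):
    """Build dismax parameters for a purely negative user query.

    The bq value is the workaround quoted in the thread; everything
    else (function name, arguments) is a hypothetical illustration.
    """
    return {
        "defType": "dismax",
        "q": negated_terms,   # e.g. "-draft": only prohibited terms
        "qf": qf,             # hypothetical query fields
        "bq": "*:*^0.001",    # match-all boost query with a tiny weight
    }


params = dismax_negative_query_params("-draft", "title body")
# Hypothetical Solr location; adjust to the local installation.
print("http://localhost:8983/solr/select?" + urlencode(params))
```

Whether the boost query really rescues a purely negative dismax query should be checked with debugQuery=on against the Solr version in use.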

Security/authentication strategies

2010-04-29 Thread Andrew McCombe
Hi, I'm planning on adding some protection to our Solr servers and would like to know what others are doing in this area. Basically, I have a few Solr cores running under tomcat6, and all use DIH to populate the Solr index. This is all behind a firewall and only accessible from certain IP

Re: How are (multiple) filter queries processed?

2010-04-29 Thread Alexander Valet
Hi, thanks for your help, I figured it out myself, I guess. All parts of an fq are always intersected, so it has no effect to put a boolean operator inside a fq like in fq=+tags:(Gucci) OR -tags:(watch sunglasses) (would be a mildly strange query anyway). The order in which the intersections are
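
As a hedged illustration of the intersection behaviour for separate fq parameters (not of boolean operators inside a single fq, which the message above discusses), the sketch below builds a request with two filters and mimics the intersection with Python sets. The field names and document ids are made up.

```python
from urllib.parse import urlencode


def select_params(query, filters):
    """Return request parameters with one fq entry per filter."""
    params = [("q", query)]
    params.extend(("fq", f) for f in filters)
    return params


# Hypothetical filters; each becomes its own fq parameter.
params = select_params("*:*", ["tags:Gucci", "instock:true"])
print(urlencode(params))

# Stand-in for what the server does: every fq yields a set of matching
# documents and the final result is the intersection of those sets.
matches_per_filter = [{1, 2, 3, 5}, {2, 3, 8}]
result = set.intersection(*matches_per_filter)
print(sorted(result))
```

Because the filters are intersected, the order in which they are listed does not change which documents are returned.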

Re: CDATA For All Fields?

2010-04-29 Thread Erik Hatcher
yes, that's totally fine. On Apr 28, 2010, at 7:14 PM, Thomas Nguyen wrote: Is there anything wrong with wrapping the text content of all fields with CDATA whether they be analyzed, not analyzed, indexed, not indexed and etc.? I have a script that creates update XML documents and it's just

AW: No highlighting results with dismax?

2010-04-29 Thread Markus.Rietzler
We use dismax and highlighting works fine. The only thing we had to add to the query URL was hl.fl=FIELD1,FIELD2, so we had to specify which fields should be used for highlighting. -Original Message- From: fabritw [mailto:fabr...@gmail.com] Sent: Wednesday, 28.
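
A small sketch of a dismax request with highlighting switched on, along the lines described above. FIELD1 and FIELD2 are the placeholders from the message; the host, the search term and the qf value are assumptions.

```python
from urllib.parse import urlencode

# Hypothetical dismax request; only hl.fl=FIELD1,FIELD2 is from the thread.
highlight_params = {
    "defType": "dismax",
    "q": "guitar",              # made-up search term
    "qf": "FIELD1 FIELD2",      # assumed query fields
    "hl": "true",               # enable highlighting
    "hl.fl": "FIELD1,FIELD2",   # fields to generate snippets for
}

print("http://localhost:8983/solr/select?" + urlencode(highlight_params))
```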

solr multi indexes and scoring

2010-04-29 Thread khirb7
Hello everybody, In our application we are dealing with music. In our index we are storing music tracks (3 million documents). We have a popularity field inside the track document; this field contains the number of times the track has been listened to. The issue is that we are forced to

Re: solr multi indexes and scoring

2010-04-29 Thread Koji Sekiguchi
khirb7 wrote: Hello everybody, In our application we are dealing with music. In our index we are storing music tracks (3 million documents). We have a popularity field inside the track document; this field contains the number of times the track has been listened to. The issue is that we

Re: require synonym filter on string field

2010-04-29 Thread Koji Sekiguchi
Ranveer Kumar wrote: Hi, I need to configure synonyms for exact matching. The field I need to search is of string type. I tried to configure it as a text field, but with text, due to the whitespace tokenizer, the exact match is not found. My requirement is: suppose a user searches for solr user and exact solr user (or

Re: require synonym filter on string field

2010-04-29 Thread Ranveer
On 4/29/10 3:45 PM, Koji Sekiguchi wrote: Ranveer Kumar wrote: Hi, I need to configure synonyms for exact matching. The field I need to search is of string type. I tried to configure it as a text field, but with text, due to the whitespace tokenizer, the exact match is not found. My requirement is: suppose a user

Re: Slow Date-Range Queries

2010-04-29 Thread Ahmet Arslan
I am currently having serious performance problems with date range queries. What I am doing is validating a dataset's published status by a valid_from and a valid_till date field. I did get a performance boost of ~ 100% by switching from a normal solr.DateField to a solr.TrieDateField

AW: Slow Date-Range Queries

2010-04-29 Thread Jan Simon Winkelmann
((valid_from:[* TO 2010-04-29T10:34:12Z]) AND (valid_till:[2010-04-29T10:34:12Z TO *])) OR ((*:* -valid_from:[* TO *]) AND (*:* -valid_till:[* TO *]))) I use the empty checks for datasets which do not have a valid from/till range. Is there any way to get this any faster? I can

Re: require synonym filter on string field

2010-04-29 Thread Ahmet Arslan
I am wondering whether KeywordTokenizerFactory will work in a text field. As I understand it, KeywordTokenizerFactory tokenizes the keywords; for example, 'solr user' will be tokenized to 'solr' and 'user' because solr and user are keywords. My

Re: Using QueryElevationComponent without specifying top results?

2010-04-29 Thread Oliver Beattie
Just wondering if anyone had any further thoughts on how I might do this? On 26 April 2010 19:18, Oliver Beattie oli...@obeattie.com wrote: Hi Grant, Thanks for getting back to me. Yes, indeed, #1 is exactly what I'm looking for. Results are already ranked by distance (among other things),

Re: require synonym filter on string field

2010-04-29 Thread Koji Sekiguchi
Hi Koji, thanks for the reply. Where should I use the KeywordTokenizerFactory, in a string or in a text field? I am wondering whether KeywordTokenizerFactory will work in a text field. As I understand it, KeywordTokenizerFactory tokenizes the
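
The difference being asked about can be sketched in plain Python. These two functions are only stand-ins for the corresponding Solr tokenizers, not Solr code: a keyword tokenizer emits the whole input as a single token, while a whitespace tokenizer splits it on whitespace.

```python
def whitespace_tokenize(text):
    """Stand-in for a whitespace tokenizer: one token per word."""
    return text.split()


def keyword_tokenize(text):
    """Stand-in for a keyword tokenizer: the entire input is one token."""
    return [text]


print(whitespace_tokenize("solr user"))  # two tokens
print(keyword_tokenize("solr user"))     # one token, phrase kept intact
```

Since the phrase survives as one token, a filter applied after a keyword tokenizer sees 'solr user' as a unit, which is what exact matching on a multi-word value needs.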

Re: Security/authentication strategies

2010-04-29 Thread Peter Sturge
Hi Andrew, Today, authentication is handled by the container (e.g. Tomcat, Jetty etc.). There's a thread I found to be very useful on this topic here: http://www.lucidimagination.com/search/document/d1e338dc452db2e4/how_can_i_protect_the_solr_cores This was for Jetty, but the idea is pretty

RE: Problem with DIH delta-import on JDBC

2010-04-29 Thread cbennett
Hi, It looks like the deltaImportQuery needs to be changed: you are using dataimporter.delta.id, which is not correct. You are selecting objectid in the deltaQuery, so the deltaImportQuery should be using dataimporter.delta.objectid. So try this: <entity name="test" pk="objectid" query="select *

Re: Problem in solr search

2010-04-29 Thread stockii
Hey, try the fq parameter? ...fq=(title:A country:USA)

JTeam Spatial Plugin

2010-04-29 Thread Jean-Sebastien Vachon
Hi All, I am using JTeam's Spatial Plugin RC3 to perform spatial searches on my index and it works great. However, I can't seem to get it to return the computed distances. My query component is run before the geoDistanceComponent and the distanceField is set to distance Fields for lat/long

How to make documents low priority

2010-04-29 Thread Doddamani, Prakash
Hi, I am using the boost factors as below: <str name="qf"> field1^20.0 field2^5 field3^2.5 field4^.5 </str> where it searches first in field1, then field2, and so on. Is there a way I can make some documents very low priority so that they come at the end? Scenario : doc

synonym filter problem for string or phrase

2010-04-29 Thread Ranveer
Hi, I am trying to configure a synonym filter. My requirement is: when a user searches by a phrase like what is solr user? then it should be replaced with solr user. Something like: what is solr user? => solr user. My schema for the particular field is: <fieldType name="text_sync" class="solr.TextField"

RE: Solr date range problem - specific date problem

2010-04-29 Thread Ankit Bhatnagar
You should do this - http://localhost:8080/solr/select/?q=*:*&fq=pubdate:[2010-03-25T00:00:00Z%20TO%202010-03-25T23:59:59Z] Ankit -Original Message- From: Hamid Vahedi [mailto:hvb...@yahoo.com] Sent: Thursday, April 29, 2010 5:33 AM To: solr-user@lucene.apache.org Subject: Solr
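
The same request can be assembled programmatically so that the separator and the URL escaping are handled by the library. This is a sketch only: the helper name is hypothetical, while the pubdate field, host and day are the ones from the URL above.

```python
from datetime import date
from urllib.parse import urlencode


def day_range_filter(field, day):
    """Return a range filter covering one calendar day (UTC)."""
    start = f"{day.isoformat()}T00:00:00Z"
    end = f"{day.isoformat()}T23:59:59Z"
    return f"{field}:[{start} TO {end}]"


fq = day_range_filter("pubdate", date(2010, 3, 25))
print(fq)
# urlencode escapes the brackets, colons and spaces in the filter.
print("http://localhost:8080/solr/select/?" + urlencode({"q": "*:*", "fq": fq}))
```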

RE: Problem with DIH delta-import on JDBC

2010-04-29 Thread safl
Hi, I did a debugger session and found that the column names are case sensitive (at least with Oracle). The column names are retrieved from the JDBC metadata and I found that my objectid is in fact OBJECTID. So now, I'm able to do an update with the following config (pay attention to the

Re: How to make documents low priority

2010-04-29 Thread Jon Baer
Does a sort=field5+desc on the query param not work? - Jon On Apr 29, 2010, at 9:32 AM, Doddamani, Prakash wrote: Hi, I am using the boost factor as below <str name="qf"> field1^20.0 field2^5 field3^2.5 field4^.5 </str> Where it searches first in field1 then field1 and

RE: Slow Date-Range Queries

2010-04-29 Thread Nagelberg, Kallin
You might want to look at DateMath, http://lucene.apache.org/solr/api/org/apache/solr/util/DateMathParser.html. I believe the default precision is to the millisecond, so if you can afford to round to the nearest second or even minute you might see some performance gains. -Kallin Nagelberg
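
A sketch of the rounding idea on the client side, using the valid_from and valid_till fields from this thread. The helper names are hypothetical. One plausible reason rounding helps is that every request within the same minute then sends an identical query string, which gives caches a chance to be reused; the thread itself only says gains are possible.

```python
from datetime import datetime, timezone


def solr_timestamp(moment, round_to_minute=True):
    """Format an aware datetime as a Solr UTC date, optionally floored."""
    moment = moment.astimezone(timezone.utc)
    if round_to_minute:
        moment = moment.replace(second=0, microsecond=0)
    else:
        moment = moment.replace(microsecond=0)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def validity_filter(now_text):
    """Build the 'currently valid' part of the date-range query."""
    return f"valid_from:[* TO {now_text}] AND valid_till:[{now_text} TO *]"


now = datetime(2010, 4, 29, 10, 34, 12, tzinfo=timezone.utc)
print(validity_filter(solr_timestamp(now)))
# Server-side alternative with date math: pass "NOW/MINUTE" instead.
print(validity_filter("NOW/MINUTE"))
```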

Relevancy Practices

2010-04-29 Thread Grant Ingersoll
I'm putting on a talk at Lucene Eurocon (http://lucene-eurocon.org/sessions-track1-day2.html#1) on Practical Relevance and I'm curious as to what people put in practice for testing and improving relevance. I have my own inclinations, but I don't want to muddy the water just yet. So, if you

Re: Problem with DIH delta-import on JDBC

2010-04-29 Thread Jon Baer
All that stuff happens in the JDBC driver associated w/ the DataSource so probably not unless there is something which can be set in the Oracle driver itself. One thing that might have helped in this case might have been if readFieldNames() in the JDBCDataSource dumped its return to debug log

RE: How to make documents low priority

2010-04-29 Thread Doddamani, Prakash
Thanks Jon, it's a very nice idea; I didn't think about it. But I am already sorting on one more field, sort=field1+desc. Can I sort on 2 fields, something like sort=field1+descfield5+desc, or is there something else I should do? Thanks, Prakash -Original Message- From: Jon Baer

Re: How to make documents low priority

2010-04-29 Thread Koji Sekiguchi
Doddamani, Prakash wrote: Thanks Jon, it's a very nice idea; I didn't think about it. But I am already sorting on one more field, sort=field1+desc. Can I sort on 2 fields, something like sort=field1+descfield5+desc Yes, you can: sort=field1+desc,field5+desc
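
The comma-separated form in the answer above can be generated from a list of sort clauses. A minimal sketch with a hypothetical helper name; field1 and field5 are the field names used in the thread.

```python
from urllib.parse import urlencode


def sort_param(*clauses):
    """Join (field, direction) pairs into a single sort value."""
    return ",".join(f"{field} {direction}" for field, direction in clauses)


sort = sort_param(("field1", "desc"), ("field5", "desc"))
print(sort)
# In the encoded URL the space becomes '+' and the comma '%2C'.
print(urlencode({"q": "*:*", "sort": sort}))
```

The first clause is the primary sort; later clauses only break ties among documents that compare equal on the earlier ones.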

Re: synonym filter problem for string or phrase

2010-04-29 Thread Marco Martinez
Hi Ranveer, If you don't specify a field type in the q parameter, the search will be done searching in your default search field defined in the solrconfig.xml. Is your default field a text_sync field? Regards, Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de Europa, 26.

Re: Using NoOpMergePolicy (Lucene 2331) from Solr

2010-04-29 Thread Koji Sekiguchi
Jason Rutherglen wrote: Tom, Interesting, can you post your findings after you've found them? :) Jason On Tue, Apr 27, 2010 at 2:33 PM, Burton-West, Tom tburt...@umich.edu wrote: Is it possible to use the NoOpMergePolicy ( https://issues.apache.org/jira/browse/LUCENE-2331 ) from Solr?

Re: synonym filter problem for string or phrase

2010-04-29 Thread Ranveer
On 4/29/10 8:50 PM, Marco Martinez wrote: Hi Ranveer, If you don't specify a field type in the q parameter, the search will be done searching in your default search field defined in the solrconfig.xml. Is your default field a text_sync field? Regards, Marco Martínez Bautista

Re: Slow Date-Range Queries

2010-04-29 Thread Erick Erickson
Hmmm, what does the rest of your query look like? And does adding debugQuery=on show anything interesting? Best Erick On Thu, Apr 29, 2010 at 6:54 AM, Jan Simon Winkelmann winkelm...@newsfactory.de wrote: ((valid_from:[* TO 2010-04-29T10:34:12Z]) AND (valid_till:[2010-04-29T10:34:12Z TO

Solr configuration to enable indexing/searching webapp log files

2010-04-29 Thread Stefan Maric
I thought I remembered seeing some information about this, but have been unable to find it. Does anyone know if there is a configuration / module that would allow us to set up Solr to take in the (large) log files generated by our web/app servers, so that we can query for things like peak time

Re: Solr Cloud Gossip Protocols

2010-04-29 Thread Jon Baer
Thanks, I'm looking at the atomic broadcast messaging protocol of ZooKeeper and think I have found what I was looking for ... - Jon On Apr 28, 2010, at 11:27 PM, Yonik Seeley wrote: On Wed, Apr 28, 2010 at 2:23 PM, Jon Baer jonb...@gmail.com wrote: From what I understand Cassandra uses a

Re: Solr configuration to enable indexing/searching webapp log files

2010-04-29 Thread Jon Baer
Good question, +1 on finding answer, my take ... Depending on how large of log files you are talking about it might be better off to do this w/ HDFS / Hadoop (and a script language like Pig) (or Amazon EMR) http://developer.amazonwebservices.com/connect/entry.jspa?externalID=873 Theoretically

Evangelism

2010-04-29 Thread Daniel Baughman
Hi, I'm new to the list here. I'd like to steer someone in the direction of Solr, and I see the list of companies using Solr, but none have a "powered by Solr" logo or anything. Does anyone have any great links with evidence of majorly successful Solr projects? Thanks in advance, Dan B.

Re: Evangelism

2010-04-29 Thread Peter Wolanin
A very abbreviated list of sites using Apache Solr + Drupal here: http://drupal.org/node/447564 -Peter On Thu, Apr 29, 2010 at 2:10 PM, Daniel Baughman da...@hostworks.com wrote: Hi I'm new to the list here, I'd like to steer someone in the direction of Solr, and I see the list of

Re: Evangelism

2010-04-29 Thread Israel Ekpo
Check out Lucid Imagination http://www.lucidimagination.com/About-Search This should convince you. On Thu, Apr 29, 2010 at 2:10 PM, Daniel Baughman da...@hostworks.com wrote: Hi I'm new to the list here, I'd like to steer someone in the direction of Solr, and I see the list of companies

Re: Evangelism

2010-04-29 Thread Israel Ekpo
Their main search page has the Powered by Solr logo http://www.lucidimagination.com/search/ On Thu, Apr 29, 2010 at 2:18 PM, Israel Ekpo israele...@gmail.com wrote: Check out Lucid Imagination http://www.lucidimagination.com/About-Search This should convince you. On Thu, Apr 29, 2010

RE: Evangelism

2010-04-29 Thread Nagelberg, Kallin
I had a very hard time selling Solr to business folks. Most are of the mind that if you're not paying for something it can't be any good. That might also be why they refrain from posting 'powered by solr' on their website, as if it might show them to be cheap. They are also fearful of lack of

Re: Evangelism

2010-04-29 Thread Israel Ekpo
A lot of high performing websites use MySQL, Oracle and Microsoft SQL Server for data storage and other RDBMS needs without necessarily putting the powered by logo on the sites. If you need the certified version of Apache Solr, you can contact Lucid Imagination. Just like MySQL, Apache Solr and

Re: Evangelism

2010-04-29 Thread Erick Erickson
This is a Lucene story, but may well apply... By the time I'd sent a request for assistance to the vendor of one of our search tools and received the reply you didn't give us the right license number, I'd found Lucene, indexed part of my corpus and run successful searches against it. And had

RE: Evangelism

2010-04-29 Thread Jason Chaffee
Netflix search is built with Solr. That seems like a fairly big and recognizable company. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Thursday, April 29, 2010 11:44 AM To: solr-user@lucene.apache.org Subject: Re: Evangelism This is a Lucene story, but

Re: Solr configuration to enable indexing/searching webapp log files

2010-04-29 Thread Jon Baer
To follow up on it ... it seems dumping to Solr is common ... http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data - Jon On Apr 29, 2010, at 1:58 PM, Jon Baer wrote: Good question, +1 on finding answer, my take ... Depending on how large of log files you

Re: Evangelism

2010-04-29 Thread Grant Ingersoll
Hi Daniel, There are lots of sites running Solr ranging from very large to very small. Because it is open source, people aren't required to report, but there are several places where people have reported: http://wiki.apache.org/solr/PublicServers

RE: Evangelism

2010-04-29 Thread Daniel Baughman
ColdFusion 9 is now shipping with it, as well. Thanks everyone for the inputs. -Original Message- From: Grant Ingersoll [mailto:gsi...@gmail.com] On Behalf Of Grant Ingersoll Sent: Thursday, April 29, 2010 1:35 PM To: solr-user@lucene.apache.org Subject: Re: Evangelism Hi Daniel, There

Re: benefits of float vs. string

2010-04-29 Thread Lance Norskog
Floats are Trie types and are stored in a compressed format. They will search faster. They will also sort with much less space. One thing to point out is that doing bitwise comparison on floats is to live in a state of sin. Your string representations must parse exactly right. On Wed, Apr 28,

Re: Security/authentication strategies

2010-04-29 Thread Andrew McCombe
Thanks for this Peter. I have managed to get this working with Tomcat. Andrew On 29 April 2010 12:11, Peter Sturge peter.stu...@googlemail.com wrote: Hi Andrew, Today, authentication is handled by the container (e.g. Tomcat, Jetty etc.). There's a thread I found to be very useful on this

Solr Dismax query - prefix matching

2010-04-29 Thread Belagodu, Bharath
Folks, Greetings. Using the dismax query parser, is there a way to perform a prefix match? For example: If I have a field called 'booktitle' with the actual values as 'Code Complete', 'Coding standard 101', then I'd like to search for the query string 'cod' and have the dismax match against both the

RE: Using NoOpMergePolicy (Lucene 2331) from Solr

2010-04-29 Thread Burton-West, Tom
Thanks Koji, That was the information I was looking for. I'll be sure to post the test results to the list. It may be a few weeks before we can schedule the tests for our test server. Tom I've never tried it but NoMergePolicy and NoMergeScheduler can be specified in solrconfig.xml:

Re: StreamingUpdateSolrServer hangs

2010-04-29 Thread Yonik Seeley
On Fri, Apr 16, 2010 at 1:34 PM, Sascha Szott sz...@zib.de wrote: In my case the whole application hangs and never recovers (CPU utilization goes down to near 0%). Interestingly, the problem reproducibly occurs only if SUSS is created with *more than 2* threads. Is your application also using

Re: Relevancy Practices

2010-04-29 Thread MitchK
I think the problems one has to solve depend on the use cases one has to deal with. It makes a difference whether I have many documents that are bloody similar but with different contexts, and I have to determine which query applies to which context with what probability for which document - or

Re: Using QueryElevationComponent without specifying top results?

2010-04-29 Thread Lance Norskog
What you want is: All results within the area and whatever results the QueryElevationComponent adds, sorted by some relevance function. If this is it, you can get the results, with the elevated output, and do a second query with all of the ids, sorted by distance. This second query would not

Re: Slow Date-Range Queries

2010-04-29 Thread Lance Norskog
Do you really need the *:* stuff in the date range subqueries? That may add to the execution time. On Thu, Apr 29, 2010 at 9:52 AM, Erick Erickson erickerick...@gmail.com wrote: Hmmm, what does the rest of your query look like? And does adding debugQuery=on show anything interesting? Best

Re: Solr configuration to enable indexing/searching webapp log files

2010-04-29 Thread Lance Norskog
It sounds like you want a data warehouse, not a text search engine. Splunk and Pentaho are good things to try. On Thu, Apr 29, 2010 at 12:03 PM, Jon Baer jonb...@gmail.com wrote: To follow up it ... it seems dumping to Solr is common ...

Re: StreamingUpdateSolrServer hangs

2010-04-29 Thread Lance Norskog
In solrconfig.xml, there is a parameter controlling remote streaming: <requestDispatcher handleSelect="true"> <!--Make sure your system has some authentication before enabling remote streaming! --> <requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048000" /> 1) Is this

Re: Evangelism

2010-04-29 Thread Ryan Grange
DollarDays.com is currently using it and we display the powered by logo as at least a gesture of giving back to the community. Ryan T. Grange, IT Manager DollarDays International, Inc. rgra...@dollardays.com (480)922-8155 x106 On 4/29/2010 11:10 AM, Daniel Baughman wrote: Hi I'm new to the

Re: StreamingUpdateSolrServer hangs

2010-04-29 Thread Yonik Seeley
On Thu, Apr 29, 2010 at 6:04 PM, Lance Norskog goks...@gmail.com wrote: In solrconfig.xml, there is a parameter controlling remote streaming: <requestDispatcher handleSelect="true"> <!--Make sure your system has some authentication before enabling remote streaming! --> <requestParsers

Re: StreamingUpdateSolrServer hangs

2010-04-29 Thread Lance Norskog
What is the garbage collection status when this happens? What are the open sockets in the OS when this happens? Run 'netstat -an | fgrep 8983' where 8983 is the Solr incoming port number. A side note on sockets: SUSS uses the MultiThreadedHttpConnectionManager but never calls

Re: StreamingUpdateSolrServer hangs

2010-04-29 Thread Yonik Seeley
I'm trying to reproduce now... single thread adding documents to a multithreaded client, StreamingUpdateSolrServer(addr,32,4) I'm currently at the 2.5 hour mark and 100M documents - no issues so far. -Yonik Apache Lucene Eurocon 2010 18-21 May 2010 | Prague On Thu, Apr 29, 2010 at 5:12 PM,

Re: synonym filter problem for string or phrase

2010-04-29 Thread Jonty Rhods
On 4/29/10 8:50 PM, Marco Martinez wrote: Hi Ranveer, If you don't specify a field type in the q parameter, the search will be done searching in your default search field defined in the solrconfig.xml. Is your default field a text_sync field? Regards, Marco Martínez Bautista

ubuntu lucid package

2010-04-29 Thread pablo platt
Hi, I've installed the solr-tomcat package on Ubuntu Lucid (10.04, latest). It automatically installs Java and Tomcat and hopefully all other dependencies. I can access Tomcat at http://localhost:8080 but am not sure where to find the Solr web admin; http://localhost:8180 gives me nothing. Is this package

Re: ubuntu lucid package

2010-04-29 Thread Otis Gospodnetic
Pablo, Ubuntu Lucid is *brand* new :) try: find / -name \*solr\* or locate solr.war Or simply try http://localhost:8080/solr/admin/ Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From:

copyField - how does it work?

2010-04-29 Thread Naga Darbha
Hi, I have my config something like clubbed_text of type text and clubbed_string of type string: BLOCK-1... <field name="field_A" type="text" indexed="true" stored="true" /> <field name="field_B" type="text" indexed="true" stored="true" /> BLOCK-2... <field name="clubbed_text" type="text" indexed="true"

RE: How to make documents low priority

2010-04-29 Thread Doddamani, Prakash
Thanks much Koji, Let me have a look at this. Regards, Prakash -Original Message- From: Koji Sekiguchi [mailto:k...@r.email.ne.jp] Sent: Thursday, April 29, 2010 8:25 PM To: solr-user@lucene.apache.org Subject: Re: How to make documents low priority Doddamani, Prakash wrote: Thanks