Usage of Double quotes for single terms (camelcase) while querying

2011-10-25 Thread Nasima Banu
Hello Solr, Do we have to specify double quotes for a single term (if the term is camelcase, e.g. OrientalTradingCo) while querying? I am using apache-solr-3.3.0. For example, the query q=OrientalTradingCo&debugQuery=true gives the debugging response as --- <lst name="debug"> <str

MoreLikeThis - Too many hits

2011-10-25 Thread vraa
Hi, I'm using the MoreLikeThis functionality http://wiki.apache.org/solr/MoreLikeThis, and it works almost perfectly for my situation. But I get too many hits, and maybe that's the whole idea of MoreLikeThis, but I'm going to ask anyway. My query looks like

Re: questions about autocommit committing documents

2011-10-25 Thread darul
Well, until now I was using the SolrJ API to commit() changes (for each document added...), but I wonder whether, for a production deployment, it would be better to use the AutoCommit feature instead. With AutoCommit parameters, is it mandatory to remove the commit() instruction called on

Re: questions about autocommit committing documents

2011-10-25 Thread Mark Miller
It's not 'mandatory', but it makes no sense to keep it. Even without autocommit, committing after every doc add is horribly inefficient. On Oct 25, 2011, at 9:45 AM, darul wrote: Well until now I was using SolrJ API to commit() (for each document added...) changes but wonder in case of a
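
To make the cost difference concrete, here is a minimal Python sketch that only counts commits under the two strategies. `FakeIndex` is a hypothetical in-memory stand-in invented for illustration; it is not SolrJ or any real Solr client, and it models only the add/commit call pattern, not the actual cost of a commit.

```python
class FakeIndex:
    """Hypothetical stand-in for a Solr client; it only counts calls."""

    def __init__(self):
        self.pending = []      # documents added but not yet committed
        self.committed = []    # documents visible after a commit
        self.commit_count = 0

    def add(self, doc):
        self.pending.append(doc)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()
        self.commit_count += 1


def index_commit_per_doc(index, docs):
    # The pattern from the question: one commit per added document.
    for doc in docs:
        index.add(doc)
        index.commit()


def index_commit_once(index, docs):
    # Add everything first, then commit a single time at the end
    # (or leave the commit to autocommit on the server side).
    for doc in docs:
        index.add(doc)
    index.commit()


docs = [{"id": i} for i in range(1000)]

per_doc = FakeIndex()
index_commit_per_doc(per_doc, docs)

batched = FakeIndex()
index_commit_once(batched, docs)

print(per_doc.commit_count, batched.commit_count)
```

Both runs end with the same documents committed; the only difference is how many commits were issued along the way.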

Re: questions about autocommit committing documents

2011-10-25 Thread darul
I was not sure thank you. -- View this message in context: http://lucene.472066.n3.nabble.com/questions-about-autocommit-committing-documents-tp1582487p3450794.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: [ANNOUNCEMENT] PHP Solr Extension 1.0.1 Stable Has Been Released

2011-10-25 Thread alex
Hello roySolr, roySolr wrote: Are you working on some changes to support earlier versions of PHP? What is the status? I have supplied a patch, so that it can be compiled with PHP 5.2: https://bugs.php.net/bug.php?id=59808 I contacted Israel a while ago to integrate this into the package,

data import handler issue

2011-10-25 Thread Tanweer Noor
Hi, I am having an issue fetching records from the db; I am using the *eXist* database. Please see if you can help look into this. http://localhost:8983/solr/dataimport?command=full-import <?xml version="1.0" encoding="UTF-8"?> - http://localhost:8983/solr/lab2/dataimport?command=full-import#

prefix search

2011-10-25 Thread Radha Krishna Reddy
Hi, I indexed words like 'Joe Tom' and 'Terry'. When I do a prefix query like q=t*, I get both 'Joe Tom' and 'Terry' as results. But I want results only for complete strings that start with 'T', meaning I want only 'Terry' as the result. Can I do this? Thanks and Regards, Radha Krishna.

Re: Is there a good web front end application / interface for solr

2011-10-25 Thread Erik Hatcher
Blacklight - http://projectblacklight.org/ It's a full featured application fronting Solr. It's Ruby on Rails based, and powers many library front-ends but is becoming much more general purpose for other domains. See examples here:

Re: Is there a good web front end application / interface for solr

2011-10-25 Thread Memory Makers
Looks very interesting -- actually I looked at it a while back but in a different context -- for a non-RoR person, how much of a learning curve is it to set up? Thanks. On Tue, Oct 25, 2011 at 5:49 AM, Erik Hatcher erik.hatc...@gmail.com wrote: Blacklight - http://projectblacklight.org/ It's a

Re: Solr main query response input to facet query

2011-10-25 Thread Erik Hatcher
I'm not following exactly what you're looking for here, but it sounds like you want to facet on name... facet=on&facet.field=name1 and then to filter on a selected one, you can use fq=name:name1 Erik On Oct 24, 2011, at 20:18 , solrdude wrote: Hi, I am implementing a Solr solution
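
As a rough illustration of the two requests described above, the following Python sketch builds the query strings with the standard library so that the parameter separators and escaping come out right. The endpoint URL, the `*:*` query, and the field and value names used in the example calls are assumptions for illustration, not details from the thread.

```python
from urllib.parse import urlencode

# Assumed endpoint; adjust host, port and core to your own setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def facet_url(query, facet_field):
    """URL for a query that also asks for facet counts on one field."""
    params = [("q", query), ("facet", "on"), ("facet.field", facet_field)]
    return SOLR_SELECT + "?" + urlencode(params)


def filtered_url(query, field, value):
    """URL for the same query narrowed by a filter query (fq)."""
    params = [("q", query), ("fq", f"{field}:{value}")]
    return SOLR_SELECT + "?" + urlencode(params)


# Hypothetical field "name" and selected facet value "name1".
print(facet_url("*:*", "name"))
print(filtered_url("*:*", "name", "name1"))
```

Building the string with `urlencode` rather than by hand avoids losing the `&` separators or leaving `:` unescaped.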

Re: Is there a good web front end application / interface for solr

2011-10-25 Thread Erik Hatcher
You could be up and running with Blacklight by following the quickstart instructions in only a few minutes, but Ruby and RoR know-how will be needed to go further with the types of customizations you mentioned. Some things will be purely in configuration sections (but still within Ruby code

Re: Is there a good web front end application / interface for solr

2011-10-25 Thread Memory Makers
Kool -- I was hoping to avoid adding another language :-( python/java/php were going to be it for me -- but I guess not. Thanks. On Tue, Oct 25, 2011 at 6:02 AM, Erik Hatcher erik.hatc...@gmail.com wrote: You could be up and running with Blacklight by following the quickstart instructions in

Re: prefix search

2011-10-25 Thread Alireza Salimi
That's because the phrases are being tokenized and then indexed by Solr. You have to define a new fieldType which is not tokenized. http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.KeywordTokenizerFactory I'm not sure if it would solve your problem On Tue, Oct 25, 2011 at 5:46
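
The effect of tokenization on a prefix query can be modelled in a few lines of plain Python. This is only a toy model, not Solr's analysis chain: whitespace splitting and lowercasing are assumptions standing in for whatever tokenizer and filters the field type actually uses.

```python
def tokenized_prefix_match(value, prefix):
    # Tokenized field: the value is split into terms, and the prefix
    # can match any one of them.
    tokens = value.lower().split()
    return any(token.startswith(prefix.lower()) for token in tokens)


def keyword_prefix_match(value, prefix):
    # Untokenized (keyword-style) field: the whole value is one term,
    # so the prefix must match the start of the complete string.
    return value.lower().startswith(prefix.lower())


names = ["Joe Tom", "Terry"]

tokenized_hits = [n for n in names if tokenized_prefix_match(n, "t")]
keyword_hits = [n for n in names if keyword_prefix_match(n, "t")]

print(tokenized_hits)
print(keyword_hits)
```

In the tokenized model 'Joe Tom' matches because of its second term; in the keyword model only 'Terry' matches.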

Queries suggestion (not the suggester :P)

2011-10-25 Thread Simone Tripodi
Hi all, I'm working on a search service that uses Solr as its search engine; the results are provided in Atom form, containing some OpenSearch tags. What I'd like to understand is whether it is possible, via Solr, to have in the response some suggestions for other queries in order to

Re: Is there a good web front end application / interface for solr

2011-10-25 Thread Robert Stewart
It is really not very difficult to build a decent web front-end to SOLR using one of the available client libraries (such as solrpy for python). I recently built a pretty full-featured search front-end to SOLR in python (using tornado web server and templates) and it was not difficult at all to

Re: Is there a good web front end application / interface for solr

2011-10-25 Thread Erik Hatcher
On Oct 25, 2011, at 07:24 , Robert Stewart wrote: It is really not very difficult to build a decent web front-end to SOLR using one of the available client libraries Or even just not using any client library at all (other than an HTTP library). I've done a bit of

Re: About the indexing process

2011-10-25 Thread Martijn v Groningen
Hi Amos, How are you currently indexing files? Are you indexing Solr input documents or just regular files? You can use Solr Cell to index binary files: http://wiki.apache.org/solr/ExtractingRequestHandler Martijn On 25 October 2011 10:21, 刘浪 liu.l...@eisoo.com wrote: Hi, I appreciate

Re: Date boosting with dismax question

2011-10-25 Thread Erik Hatcher
Also, those boosts on your qf and pf are a red flag and may be causing you issues. Look at explains provided with debugQuery=true output to see how your field/phrase boosts are working in conjunction with your date boosting attempts. Erik On Oct 23, 2011, at 17:15 , Erick Erickson

Re: prefix search

2011-10-25 Thread Michael Kuhlmann
I think what Radha Krishna (is this really her name?) means is different: She wants to return only the matching token instead of the complete field value. Indeed, this is not possible. But you could use highlighting (http://wiki.apache.org/solr/HighlightingParameters), and then extract the

Re: Is there a good web front end application / interface for solr

2011-10-25 Thread Fred Zimmerman
what about something that's a bit less discovery-oriented? for my particular application I am most concerned with bringing back a straightforward top ten answer set and having users look at it. I actually don't want to bother them with faceting, etc. at this juncture. Fred On Tue, Oct 25, 2011

RE: some basic information on Solr

2011-10-25 Thread Jaeger, Jay - DOT
I am not a developer either. We are just using it in a project here. -Original Message- From: Dan Wu [mailto:wudan1...@gmail.com] Sent: Monday, October 24, 2011 2:16 PM To: solr-user@lucene.apache.org Subject: Re: some basic information on Solr JRJ, We did check the solr official

Re: Is there a good web front end application / interface for solr

2011-10-25 Thread Memory Makers
Well https://github.com/evolvingweb/ajax-solr is fairly decent for that -- haven't used it in a while but that is a minimalist client -- however I find it hard to customize. MM. On Tue, Oct 25, 2011 at 8:34 AM, Fred Zimmerman zimzaz@gmail.com wrote: what about something that's a bit less

Re: Is there a good web front end application / interface for solr

2011-10-25 Thread Erik Hatcher
Well, if what you want is straightforward like this, why not just use and tweak the templates that come with Solr's VelocityResponseWriter? Have a look at /browse from a recent Solr distro to see what I mean. It's very easily customizable. Prism is my tinkering to pull the (Velocity, or

RE: sort non-roman character strings last

2011-10-25 Thread Jaeger, Jay - DOT
Could you replace it with something that will sort it last instead of an empty string? (Say, for example, replacement={}). This would still give something that looks empty to a person, and would sort last. BTW, it looks to me as though your pattern only requires that the input contain just
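
Why a placeholder such as {} would land at the end: in plain code-point ordering, '{' (U+007B) comes directly after 'z' (U+007A). The Python sketch below demonstrates that ordering only; whether a given Solr sort field orders values the same way depends on its field type and any collation in use, which is an assumption here.

```python
# Sample values; "{}" stands in for a string that was filtered down to
# nothing and then replaced by the placeholder.
values = ["banana", "{}", "apple", "zebra"]

# Python's default string sort compares code points, like a plain
# (non-collated) string sort.
ordered = sorted(values)

print(ordered)
print(ord("z"), ord("{"))
```

Note that this holds for lower-case and upper-case ASCII letters and digits; other characters with code points above '{' (for example '~' or most non-ASCII characters) would still sort after the placeholder.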

gets time out error in full import with data import handler

2011-10-25 Thread vrpar...@gmail.com
Hello all, I am using the data import handler with JDBC to get data from the db for indexing. I have one query which takes a long time to return data; when I do a full import, it gives me a timeout error. Please help me solve this problem: can I set a timeout anywhere, or is there any other way? Thanks, Vishal Parekh

accessing the query string from inside TokenFilter

2011-10-25 Thread Bernd Fehling
Dear list, while writing some TokenFilter for my analyzer chain I need access to the query string from inside of my TokenFilter for some comparison, but the Filters are working with a TokenStream and get separate Tokens. Currently I couldn't get any access to the query string. Any idea how to

Re: accessing the query string from inside TokenFilter

2011-10-25 Thread Simon Willnauer
On Tue, Oct 25, 2011 at 3:51 PM, Bernd Fehling bernd.fehl...@uni-bielefeld.de wrote: Dear list, while writing some TokenFilter for my analyzer chain I need access to the query string from inside of my TokenFilter for some comparison, but the Filters are working with a TokenStream and get

RE: sort non-roman character strings last

2011-10-25 Thread themanwho
Jay, Thanks, good call on the pattern. Still, my embedded question: if a field is filtered down to a zero-length string, does this qualify as missing so far as sortMissingLast is concerned? If not, your suggestion should work fine -- appreciated!! Cheers, Bill

Points to processing hashtags

2011-10-25 Thread Memory Makers
Greetings, I am trying to index hashtags from twitter -- so they are tokens that start with a # symbol and can have any number of alpha numeric characters. Examples: 1. #jane 2. #Jane 3. #Jane! At a high level I'd like to be able to: 1. differentiate between say #jane and #jane! 2.

Dismax handler - whitespace and special character behaviour

2011-10-25 Thread Rohk
Hello, I've got strange results when I have special characters in my query. Here is my request: q=histoire-france&start=0&rows=10&sort=score+desc&defType=dismax&qf=any^1.0&mm=100% Parsed query: <str name="parsedquery_toString">+((any:histoir any:franc)) ()</str> I've got 17000 results because Solr is

Solr Replication: relative path in confFiles Element?

2011-10-25 Thread Mark Schoy
Hi, is it possible to define a relative path in confFile? For example: <str name="confFiles">../../x.xml</str> If yes, to which location will the file be copied at the slave? Thanks.

Re: Bet you didn't know Lucene can...

2011-10-25 Thread Mikhail Garber
Solr as enterprise event warehouse. Multiple heterogeneous applications and log file sweepers posting stuff to centralized Solr index. On Sat, Oct 22, 2011 at 2:12 AM, Grant Ingersoll gsing...@apache.org wrote: Hi All, I'm giving a talk at ApacheCon titled Bet you didn't know Lucene can...

Search for the single hash # character never returns results

2011-10-25 Thread Daniel Bradley
When running a search such as: field_name:# field_name:# field_name:\# where there is a record with the value of exactly #, solr returns 0 rows. The workaround we are having to use is to use a range query on the field such as: field_name:[# TO #] and this returns the correct documents.

DisMax and WordDelimiterFilterFactory

2011-10-25 Thread Demian Katz
I've seen a couple of threads related to this subject (for example, http://www.mail-archive.com/solr-user@lucene.apache.org/msg33400.html), but I haven't found an answer that addresses the aspect of the problem that concerns me... I have a field type set up like this: <fieldType name="text"

Replication issues with multiple Slaves

2011-10-25 Thread Rob Nicholls
Hey all, We have a Master (1 server) and 2 Slaves (2 servers) setup and running replication across multiple cores. However, the replication appears to behave sporadically and often fails when left to replicate automatically via poll. More often than not a replicate will fail after the slave

RE: Dismax handler - whitespace and special character behaviour

2011-10-25 Thread Demian Katz
I just sent an email to the list about DisMax interacting with WordDelimiterFilterFactory, and I think our problems are at least partially related -- I think the reason you are seeing an OR where you expect an AND is that you have autoGeneratePhraseQueries set to false, which changes the way

Re: joins and filter queries affecting scoring

2011-10-25 Thread Yonik Seeley
Can you give an example of the request (URL) you are sending to Solr? -Yonik http://www.lucidimagination.com On Mon, Oct 24, 2011 at 3:31 PM, Jason Toy jason...@gmail.com wrote: I have 2 types of docs, users and posts. I want to view all the docs that belong to certain users by joining posts

Replication issues with multiple Slaves

2011-10-25 Thread Rob Nicholls
Hey guys, We have a Master (1 server) and 2 Slaves (2 servers) setup and running replication across multiple cores. However, the replication appears to behave sporadically and often fails when left to replicate automatically via poll. More often than not a replicate will fail after the slave

Re: joins and filter queries affecting scoring

2011-10-25 Thread Jason Toy
Hi Yonik, Without a join I would normally query user docs with: q=data_text:test&fq=is_active_boolean:true When joining users with posts, I get no results: q={!join from=self_id_i to=user_id_i}data_text:test&fq=is_active_boolean:true&fq=posts_text:hello I am able to use this query, but it

Re: some basic information on Solr

2011-10-25 Thread Simon Willnauer
hey, 2011/10/24 Dan Wu wudan1...@gmail.com: Hi all, I am doing a student project on search engine research. Right now I have some basic questions about Solr. 1. How many types of data file can Solr support (estimate)? i.e. No. of file types Solr can look at for indexing and searching.

Re: Optimization /Commit memory

2011-10-25 Thread Simon Willnauer
RAM costs during optimize / merge are generally low. Optimize is basically a merge of all segments into one, however there are exceptions. Lucene streams existing segments from disk and serializes the new segment on the fly. When you optimize or in general when you merge segments you need disk

RE: sort non-roman character strings last

2011-10-25 Thread Jaeger, Jay - DOT
As far as I know, in the index, a string that is zero length is still a string, and would not count as missing. The CSV importer has a way to not index empty entries, but once it is in the index, it is in the index -- as an empty string. i.e. String silly = null; Is not the same
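
The same distinction exists in Python, where an empty string is not None. The sketch below shows one way to act on it on the client side: leave zero-length values out of a document before sending it, so the field is simply absent. `drop_empty_fields` and the sample field names are hypothetical, and whether an absent field gives the desired sortMissingLast behaviour in a particular schema is an assumption to verify.

```python
def drop_empty_fields(doc):
    """Return a copy of doc without None or zero-length string values."""
    return {
        key: value
        for key, value in doc.items()
        if value is not None and value != ""
    }


# An empty string is a real value, distinct from "no value at all".
silly = ""
print(silly is None)

# Hypothetical document whose sort field was filtered down to nothing.
doc = {"id": "42", "title_sort": ""}
cleaned = drop_empty_fields(doc)
print(cleaned)
```

This only helps for documents that have not been indexed yet; as noted above, an empty string that is already in the index stays there until the document is reindexed.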

Adding a DocSet as a filter from a custom search component

2011-10-25 Thread Marc Sturlese
Hey there, I'm wondering if there's a cleaner way to do this: I've written a SearchComponent that runs as last-component. In the prepare method I build a DocSet (SortedIntDocSet) based on whether some values in the fieldCache of a given field satisfy some rules (if the rules are satisfied,

RE: Points to processing hashtags

2011-10-25 Thread Jaeger, Jay - DOT
Sounds like a possible application of solr.PatternTokenizerFactory http://lucene.apache.org/solr/api/org/apache/solr/analysis/PatternTokenizerFactory.html You could use copyField to copy the entire string to a separate field (or set of fields) that are processed by patterns. JRJ
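
A pattern-based approach can be prototyped outside Solr before committing to a tokenizer configuration. The sketch below uses Python's `re` module, not solr.PatternTokenizerFactory (which takes Java regular expressions), and the pattern itself is an assumption: it treats a hashtag as '#' followed by alphanumerics, keeps an optional trailing '!' so that #jane and #jane! stay distinct, and preserves case.

```python
import re

# Assumed pattern: '#', one or more ASCII letters or digits, optional '!'.
HASHTAG = re.compile(r"#[A-Za-z0-9]+!?")


def extract_hashtags(text):
    """Return hashtags in order of appearance, case preserved."""
    return HASHTAG.findall(text)


tags = extract_hashtags("Hi #jane and #Jane! met #Jane")
print(tags)
```

Once the pattern behaves as wanted on sample text, an equivalent Java regex could be tried in the tokenizer configuration on the copied field.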

RE: Replication issues with multiple Slaves

2011-10-25 Thread Jaeger, Jay - DOT
I noted that in these messages the left hand side is lower case collection, but the right hand side is upper case Collection. Assuming you did a cut/paste, could you have a core name mismatch between a master and a slave somehow? Otherwise (shudder): could you be doing a commit while the

Re: Replication issues with multiple Slaves

2011-10-25 Thread Markus Jelsma
Are you frequently adding and deleting documents and committing those mutations? Then it might try to download a file that doesn't exist anymore. If that is the case, try increasing: <str name="maxCommitsToKeep"></str> I noted that in these messages the left hand side is lower case collection, but

Loading data to SOLR first time ( taking too long)

2011-10-25 Thread Awasthi, Shishir
Hi, I recently started working on SOLR and loaded approximately 4 million records to the solr using DataImportHandler. It took 5 days to complete this process. Can you please suggest how this can be improved? I would like this to be done in less than 6 hrs. Thanks, Shishir

RE: Loading data to SOLR first time ( taking too long)

2011-10-25 Thread Jaeger, Jay - DOT
My goodness. We do 4 million in about 1/2 HOUR (7+ million in 40 minutes). First question: Are you somehow forcing Solr to do a commit for each and every record? If so, that way leads to the house of PAIN. The thing to do next, I suppose, might be to try and figure out whether the issue is

java.net.SocketException: Too many open files

2011-10-25 Thread Jonty Rhods
Hi, I am using SolrJ, and for the connection to the server I am using an instance of the Solr server: SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/core0"); I noticed that after a few minutes it starts throwing the exception java.net.SocketException: Too many open files. It seems that

Re: java.net.SocketException: Too many open files

2011-10-25 Thread Yonik Seeley
On Tue, Oct 25, 2011 at 4:03 PM, Jonty Rhods jonty.rh...@gmail.com wrote: Hi, I am using solrj and for connection to server I am using instance of the solr server: SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/core0"); Are you reusing the server object for all

Re: java.net.SocketException: Too many open files

2011-10-25 Thread Markus Jelsma
This is on Linux? This should help: echo fs.file-max = 16384 /etc/sysctl.conf On some distro's like Debian it seems you also have to add these settings to security.conf, otherwise it may not persist between reboots or even shell sessions: echo systems hard nofile 16384 systems soft nofile
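
Before changing any limits, it can help to see what a process actually gets. This small Python sketch reads the per-process open-file limit through the standard `resource` module (Unix only). It reports the limit of the Python process that runs it, which is an assumption worth keeping in mind: the servlet container may run under a different user with different limits.

```python
import resource

# RLIMIT_NOFILE is the maximum number of open file descriptors
# for the current process: (soft limit, hard limit).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

print(f"open-file limit: soft={soft}, hard={hard}")
```

Running it as the same user that starts Solr gives the most relevant numbers.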

RE: Replication issues with multiple Slaves

2011-10-25 Thread Rob Nicholls
1) Hmm, maybe, didn't notice that... but I'd be very confused why it works occasionally, and manual replication (through Solr Admin) always works ok in that case? 2) This was my initial thought, it was happening on one core (multiple commits while replication in progress), but I noticed it

Re: Replication issues with multiple Slaves

2011-10-25 Thread Markus Jelsma
1) Hmm, maybe, didn't notice that... but I'd be very confused why it works occasionally, and manual replication (through Solr Admin) always works ok in that case? 2) This was my initial thought, it was happening on one core (multiple commits while replication in progress), but I noticed it

Re: Solr Replication: relative path in confFiles Element?

2011-10-25 Thread Yury Kats
On 10/25/2011 11:24 AM, Mark Schoy wrote: Hi, is it possible to define a relative path in confFile? For example: <str name="confFiles">../../x.xml</str> If yes, to which location will the file be copied at the slave? I don't think it's possible. Replication copies confFiles from master

RE: Loading data to SOLR first time ( taking too long)

2011-10-25 Thread Awasthi, Shishir
Ok that makes me feel better. We have around 40 fields being loaded from multiple tables. Other than not committing every row, is there any other setting that you make? Are you also using DataImportHandler? -Original Message- From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov] Sent:

Re: Loading data to SOLR first time ( taking too long)

2011-10-25 Thread Alain Rogister
Are you loading data from multiple tables? How many levels deep? After some experimenting, I gave up on the DIH because I found it to generate very chatty (one row at a time) SQL against my schema, and I experienced concurrency bugs unless multithreading was set to false, and I wasn't too

RE: Replication issues with multiple Slaves

2011-10-25 Thread Rob Nicholls
Thanks... Yes, and no. The main thing is, after the replicate failed below, I checked the master and the files that it complains about below (and several others) did exist... which is where I'm stumped about what is causing the issue (I have added the maxCommits setting you mention below

RE: Loading data to SOLR first time ( taking too long)

2011-10-25 Thread Awasthi, Shishir
Alain, How many rows did you export in this fashion and what was the performance? We do have oracle as underlying database with data obtained from multiple tables. The data is only 1 level deep except for one table where we need to traverse hierarchy to get information. How many XML files did

Re: Loading data to SOLR first time ( taking too long)

2011-10-25 Thread Alain Rogister
Shishir, I believe our main table has about half a million rows, which isn't a lot but it has multiple dependent tables, several levels deep. The resulting XML files were about 1 GB in total, split into around 15 files. We could feed these files one at a time into Solr in as little as a few

Re: org.apache.pdfbox.pdmodel.PDPage Error

2011-10-25 Thread Mike Sokolov
On 10/24/2011 02:35 PM, MBD wrote: Is this really a stumper? This is my first experience with Solr and having spent only an hour or so with it I hit this barrier (below). I'm sure *I* am doing something completely wrong just hoping someone more familiar with the platform can help me identify

Incorrect Search Results showing up

2011-10-25 Thread aronitin
Hi Group, I've defined a type text in the Solr schema as shown below. <fieldType name="text" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter

Re: Query/Delete performance difference between straight HTTP and SolrJ

2011-10-25 Thread Shawn Heisey
On 10/20/2011 11:00 AM, Shawn Heisey wrote: I've got two build systems for my Solr index that I wrote. The first one is in Perl and uses GET/POST requests via HTTP, the second is in Java using SolrJ. I've noticed a performance discrepancy when processing every one of my delete records,

Re: java.net.SocketException: Too many open files

2011-10-25 Thread Péter Király
One note on this. I had trouble resetting root's limit in Ubuntu. I read somewhere that Ubuntu doesn't even give you the correct value of the limit. The solution to this problem is to run Solr under another user. Péter 2011/10/25 Markus Jelsma markus.jel...@openindex.io: This is on Linux?

Re: Solr main query response input to facet query

2011-10-25 Thread lee carroll
Take a look at facet query. You can facet on a query's results, not just terms in a field http://wiki.apache.org/solr/SimpleFacetParameters#facet.query_:_Arbitrary_Query_Faceting On 25 October 2011 10:56, Erik Hatcher erik.hatc...@gmail.com wrote: I'm not following exactly what you're looking

Difficulties Installing Solr with Jetty 7.x

2011-10-25 Thread Scott Vanderbilt
Hello. I am having trouble installing Solr 3.4.0 with Jetty 7.5.3. My OS is OpenBSD 5.0, and JDK is 1.7.0. I was able to successfully run the Solr example application which comes bundled with an earlier version of Jetty (not sure which, but I'm assuming pre-version 7). I would like--if at all

Re: help needed on solr-uima integration

2011-10-25 Thread Xue-Feng Yang
I configured the solr-uima integration following the resource(s) I could find, but the data import results had empty data from uima. The other fields, not from uima, were there, and there were no error messages. The following were the steps I took: 1) set schema.xml with all fields, both uima and non-uima. 2) set

Error loading ICUTokenizerFactory

2011-10-25 Thread Tomek Rej
Hi everyone, I'm getting an exception when trying to use the solr.ICUTokenizerFactory: SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.ICUTokenizerFactory' The code in the schema.xml that isn't working is: <fieldType name="text_en_splitting" class="solr.TextField"

Re: Error loading ICUTokenizerFactory

2011-10-25 Thread Tomek Rej
Looks like another person had the same problem as me. The solution to the issue can be found here: http://lucene.472066.n3.nabble.com/Solr-3-1-ICU-filters-error-loading-class-td2835323.html Perhaps the person in charge of the documentation could add apache-solr-analysis-extras-X.Y.jar as a

Re: java.net.SocketException: Too many open files

2011-10-25 Thread Bui Van Quy
Hi, I had the same problem, "Too many open files", but it was logged by the Tomcat server. Please check your index directory; if there are too many index files, please execute the Solr optimize command. This exception is raised by the server's OS; you can search Google to research it. On 10/26/2011 3:07 AM, Yonik

Re: java.net.SocketException: Too many open files

2011-10-25 Thread Jonty Rhods
Hi Yonik, thanks for the reply. Currently I have more than 50 classes, and every class has its own SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/core0"); The majority of classes connect to core0; however, there are many cores which are connected to from different classes. My
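
The alternative to one client object per class is one shared client per core URL. The Python sketch below shows that idea in a language-neutral way; `FakeSolrClient` is a hypothetical stand-in that only counts how many instances get created, and it is not SolrJ's CommonsHttpSolrServer or any real client. The core1 URL is likewise made up for the example.

```python
from functools import lru_cache


class FakeSolrClient:
    """Hypothetical stand-in for a client that owns network connections."""

    instances_created = 0

    def __init__(self, base_url):
        self.base_url = base_url
        FakeSolrClient.instances_created += 1


@lru_cache(maxsize=None)
def get_client(base_url):
    # The first call for a URL creates the client; later calls with the
    # same URL return the cached object instead of building a new one.
    return FakeSolrClient(base_url)


# Three lookups from three different "classes", but only two distinct cores.
a = get_client("http://localhost:8080/solr/core0")
b = get_client("http://localhost:8080/solr/core0")
c = get_client("http://localhost:8080/solr/core1")

print(a is b, FakeSolrClient.instances_created)
```

With this shape, the number of client objects tracks the number of cores rather than the number of classes that talk to them.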