Re: Re: Searching for empty fields possible?

2010-01-26 Thread Jan-Simon Winkelmann
I'm not sure, theoretically fields with a null value (php-side) should end up not having the field. But then again i don't think it's relevant just yet. What bugs me is that if I add the -puid:[* TO *], all results for puid:[0 TO *] disappear, even though I am using OR. - operator does not
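The symptom described here matches a known property of Lucene query syntax: a negative clause placed next to a positive one does not act as an independent alternative, it subtracts from the other clause. A common workaround is to anchor the negative clause to `*:*` inside parentheses. The sketch below only builds the query string; the helper name is hypothetical, and since the message is cut off, applying this to the poster's case is an assumption.

```python
from urllib.parse import urlencode, parse_qs


def missing_or_range(field, low="0"):
    """Build a query for documents where `field` is absent OR >= low.

    The negative clause is anchored to *:* so it selects documents
    on its own instead of subtracting from the range clause.
    """
    missing = f"(*:* -{field}:[* TO *])"
    ranged = f"{field}:[{low} TO *]"
    return f"{missing} OR {ranged}"


query = missing_or_range("puid")
params = urlencode({"q": query})  # URL-encoded form for a request
print(query)
```

The printed query is `(*:* -puid:[* TO *]) OR puid:[0 TO *]`.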

Re: StreamingUpdateSolrServer seems to hang on indexing big batches

2010-01-26 Thread Tim Terlegård
2010/1/26 Jake Brownell ja...@benetech.org: I swapped our indexing process over to the streaming update server, but now I'm seeing places where our indexing code adds several documents, but eventually hangs. It hangs just before the completion message, which comes directly after sending to

Re: Solr wiki link broken

2010-01-26 Thread Erik Hatcher
All seems well now. The wiki does have its flakey moments though. Erik On Jan 26, 2010, at 1:23 AM, Teruhiko Kurosaka wrote: In http://lucene.apache.org/solr/ the wiki tab and Docs (wiki) hyper text in the side bar text after expansion are the link to http://wiki.apache.org/solr

Re: Solr wiki link broken

2010-01-26 Thread Sven Maurmann
Hi, you might want to try the link called Frontpage on the generic wiki page. But well, this seems to be kind of broken for some locales. Regards, Sven --On Tuesday, 26 January 2010 01:23 -0500 Teruhiko Kurosaka k...@basistech.com wrote: In http://lucene.apache.org/solr/ the wiki tab

Need hardware recommendation

2010-01-26 Thread Jayesh Wadhwani
I am trying to do the following: Index 6 million database records (SQL Server 2008): full index daily, differential every 15 minutes. Index 2 million rich documents: full index weekly, differential every 15 minutes. Search queries: 1 per minute. 20 cores. I am looking for hardware

Re: StreamingUpdateSolrServer seems to hang on indexing big batches

2010-01-26 Thread Erick Erickson
My indexing script has been running all night and has accomplished nothing. I see lots of disk activity though, which is weird. One explanation would be that you're memory-starved and the disk activity you see is thrashing. How much memory do you allocate to your JVM? A further indication that

Re: Invalid CRLF - StreamingUpdateSolrServer ?

2010-01-26 Thread Patrick Sauts
I've patched the solrj release (tag) 1.4 with SOLR-1595; it has been online for about two weeks now and it's working just fine. Thanks a lot. Patrick. P.S.: It's a pity there is no plan for a 1.4.1 release. Yonik Seeley wrote: It could be this bug, fixed in trunk: * SOLR-1595:

RE: Solr vs. Compass

2010-01-26 Thread Minutello, Nick
Ultimately... You're right, to some extent, the transaction synchronisation isn't ideal for sheer throughput if you have many small transactions (as Lucene benefits from batching documents when you index...). However, the subindex feature gives you decidedly more throughput since the locking is at

Re: StreamingUpdateSolrServer seems to hang on indexing big batches

2010-01-26 Thread Tim Terlegård
2010/1/26 Erick Erickson erickerick...@gmail.com: My indexing script has been running all night and has accomplished nothing. I see lots of disk activity though, which is weird. One explanation would be that you're memory-starved and the disk activity you see is thrashing. How much

RE: Solr vs. Compass

2010-01-26 Thread Shay Banon
Hi, Well, I thought I would jump here as the creator of Compass (up until this point, the discussion was great and very objective). Compass is here for about 5/6 years now (man, how time passes). Concentrating on the transactional implementation it provides, there have been changes to it

Re: determine which value produced a hit in multivalued field type

2010-01-26 Thread Renaud Delbru
Hi, SIREn [1] could provide you with such information (it returns the value index in the multi-valued field). But currently only a Lucene extension is available, and you'll have to modify the SIREn query operator a little so that it returns the value position in the query results. [1]

Re: StreamingUpdateSolrServer seems to hang on indexing big batches

2010-01-26 Thread Erick Erickson
I'll have to defer that one for now. 2010/1/26 Tim Terlegård tim.terleg...@gmail.com 2010/1/26 Erick Erickson erickerick...@gmail.com: My indexing script has been running all night and has accomplished nothing. I see lots of disk activity though, which is weird. One

Re: Lock problems: Lock obtain timed out

2010-01-26 Thread Ian Connor
We traced one of the lock files, and it had been around for 3 hours. A restart removed it - but is 3 hours normal for one of these locks? Ian. On Mon, Jan 25, 2010 at 4:14 PM, mike anderson saidthero...@gmail.com wrote: I am getting this exception as well, but disk space is not my problem. What

Re: Solr wiki link broken

2010-01-26 Thread Sven Maurmann
Hi Erik, one observation from me who is using the wiki from a browser living in a non-US locale: I usually get the standard wiki frontpage (in German) and not (!) the Solr-Frontpage I get, if I use a US locale (or click on the link FrontPage). B.t.w I know that this does not strictly belong to

solr1.5

2010-01-26 Thread Matthieu Labour
Hi quick question: Is there any release date scheduled for solr 1.5 with all the wonderful patches (StreamingUpdateSolrServer etc ...)? Thank you !

Behaviour Indicative of Throttling

2010-01-26 Thread Raf Gemmail
I've been working on benchmarking our solr response times in relation to a variable number of concurrent queries. With maxThreads=150 - I've tried running between 20-100 queries concurrently against our solr instance and have noted that for all n-way (20) queries I'm finding that performance

Re: Behaviour Indicative of Throttling

2010-01-26 Thread Jeff Newburn
Have you tried watching the threads in a monitoring program like VisualVM? We have found that at a certain point the solr software starts locking in the synchronous calls including logging. -- Jeff Newburn Software Engineer, Zappos.com jnewb...@zappos.com - 702-943-7562 From: Raf Gemmail

replication setup

2010-01-26 Thread Matthieu Labour
Hi, I have set up replication following the wiki. I downloaded the latest apache-solr-1.4 release and exploded it in 2 different directories. I modified both solrconfig.xml files, for the master and the slave, as described on the wiki page. In both directories, I started solr from the example directory

Solr wiki link broken

2010-01-26 Thread Teruhiko Kurosaka
In http://lucene.apache.org/solr/ the wiki tab and Docs (wiki) hyper text in the side bar text after expansion are the link to http://wiki.apache.org/solr But the wiki site seems to be broken. The above link took me to a generic help page of the Wiki system. What's going on? Did I just hit

RE: Solr wiki link broken

2010-01-26 Thread Teruhiko Kurosaka
I'm sorry. Please ignore this duplicate posting. From: Teruhiko Kurosaka Sent: Tuesday, January 26, 2010 8:32 AM To: solr-user@lucene.apache.org Subject: Solr wiki link broken In http://lucene.apache.org/solr/ the wiki tab and Docs (wiki) hyper text in the

RE: Solr wiki link broken

2010-01-26 Thread Teruhiko Kurosaka
Sven, You are right. The wiki can't be read if the preferred language is not English. The wiki system seems to implement or be configured to use a wrong way of choosing its locale. Erik, let me know if I can help solving this. Kuro From: Sven Maurmann

Re: Specify logging options from command line in Solr 1.4?

2010-01-26 Thread Mat Brown
On Mon, Jan 18, 2010 at 19:15, Mark Miller markrmil...@gmail.com wrote: Mat Brown wrote: Hi all, Wondering if anyone can point me at a simple way to specify basic logging options (log level, log file location) when starting the Solr example jar from the command line. As a bit of

Re: StreamingUpdateSolrServer seems to hang on indexing big batches

2010-01-26 Thread Yonik Seeley
On Mon, Jan 25, 2010 at 7:27 PM, Jake Brownell ja...@benetech.org wrote: I swapped our indexing process over to the streaming update server, but now I'm seeing places where our indexing code adds several documents, but eventually hangs. It hangs just before the completion message, which comes

RE: Solr wiki link broken

2010-01-26 Thread Teruhiko Kurosaka
One more comment on this. I can see this page http://wiki.apache.org/solr/SolrTomcat w/o a problem, for example. Or I can see this: http://wiki.apache.org/solr/FrontPage I think it's only the main page without actual page name http://wiki.apache.org/solr/ that is having the problem. So the

Mail config

2010-01-26 Thread Bogdan Vatkov
Hi, I do not want to receive all the emails from this mail list, I only want to receive the answers to my questions, is this possible? If I am not mistaken when I unsubscribed I sent an email which did not reach the mail list at all (therefore there was of course no chance to get any replies).

To store or not to store serialized objects in solr

2010-01-26 Thread Andre Parodi
Hi, We currently are storing all of our data in sql database and use solr for indexing. We get a list of id's from solr and retrieve the data from the db. We are considering storing all the data in solr to simplify administration and remove any synchronisation and are considering the

Query 2 Cats

2010-01-26 Thread Lee Smith
Sorry if this is a poor question, but I can't seem to get it to work. I have a field called cat set up so I can query against specific categories. It works if I search all or one, but I can't seem to make it search over multiples, i.e. q=string AND cat:name1 AND cat:name2. I have tried the following variations.

RE: DataImportHandler TikaEntityProcessor FieldReaderDataSource

2010-01-26 Thread Shah, Nirmal
Hi Jorg, This is working now. If you look at SOLR-1583 (http://issues.apache.org/jira/browse/SOLR-1583) you can see that an InputStream was needed from the DataSource for file and URL data sources. The same is true for the FieldReaderDataSource. I created a class, BinFieldReaderDataSource

Re: To store or not to store serialized objects in solr

2010-01-26 Thread Markus Jelsma
Hello Andre, We have used this approach before. We did keep all our data in a RDBMS but added serialized objects to the index so we could simply query the record and display it as is, without any hassle and SQL connections. Although storing this data sounds a bit strange, it actually works well

Re: Query 2 Cats

2010-01-26 Thread Erick Erickson
Tell us more about the cat field. Is there one (and only one) value per document? Or are there multiple values per document? Because if there's only one cat value/doc, you want something like q=string AND (cat:name1 OR cat:name2) Erick On Tue, Jan 26, 2010 at 1:52 PM, Lee Smith l...@weblee.co.uk
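The query shape suggested in this reply can be sketched as a small string builder. The helper names are hypothetical; only the field name `cat` and the values `name1` and `name2` come from the thread.

```python
def any_of(field, values):
    """Return a parenthesised OR group such as (cat:name1 OR cat:name2)."""
    clauses = " OR ".join(f"{field}:{value}" for value in values)
    return f"({clauses})"


def build_query(term, field, values):
    """Require `term` and at least one of the given field values."""
    return f"{term} AND {any_of(field, values)}"


query = build_query("string", "cat", ["name1", "name2"])
print(query)
```

This prints `string AND (cat:name1 OR cat:name2)`. With one cat value per document, ANDing the two cat clauses can never match, which is why the OR group is needed.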

Re: Query 2 Cats

2010-01-26 Thread Lee Smith
Thank you Dave, Eric. Worked a charm. On 26 Jan 2010, at 18:58, Dave Searle wrote: Try q=string AND (cat:name1 OR cat:name2) On 26 Jan 2010, at 18:53, Lee Smith l...@weblee.co.uk wrote: Sorry if this is a poor Q but can't seem to get it to work. I have a field called cat setup so I

Basic questions about Solr cost in programming time

2010-01-26 Thread Jeff Crump
Hi, I hope this message is OK for this list. I'm looking into search solutions for an intranet site built with Drupal. Eventually we'd like to scale to enterprise search, which would include the Drupal site, a document repository, and Jive SBS (collaboration software). I'm interested in

Re: Basic questions about Solr cost in programming time

2010-01-26 Thread Israel Ekpo
On Tue, Jan 26, 2010 at 3:00 PM, Jeff Crump jcr...@hq.mercycorps.org wrote: Hi, I hope this message is OK for this list. I'm looking into search solutions for an intranet site built with Drupal. Eventually we'd like to scale to enterprise search, which would include the Drupal site, a

SOLR index file system size estimate

2010-01-26 Thread SHS SOLR
We wanted to estimate the file system size requirements for the index. Although space is very cheap, it's not so here, as we have to go through a process to add space to the file system. So we don't want to end up underestimating and have the process kick in. Is there an estimation tool for index sizes

RE: matching exact/whole phrase

2010-01-26 Thread darniz
Extending this thread. Is it safe to say that in order to do exact matches the field should be a string? Let's say, for example, I have two fields: one is caption, which is of type string, and the other is regular text. So if I index caption as my car is the best car in the world it will be stored and i copy

How to Create dynamic field names using script transformers

2010-01-26 Thread JavaGuy84
Hi, I am trying to generate a dynamic fieldname using custom transformers but couldn't achieve the expected results. My requirement is that I do not want to hardcode some of the field names used by SOLR for indexing, instead the field name should be generated using the data retrieved from a table.

Re: How to Create dynamic field names using script transformers

2010-01-26 Thread Erik Hatcher
Barani - Give us some details of what you tried, what you expected to happen, and what actually happened. Erik On Jan 26, 2010, at 4:15 PM, JavaGuy84 wrote: Hi, I am trying to generate a dynamic fieldname using custom transformers but couldn't achieve the expected results.

Re: How to Create dynamic field names using script transformers

2010-01-26 Thread JavaGuy84
Hey Erik, Thanks a lot for your reply.. I am a newbie to SOLR ... I am just trying to use the example present in Apache WIKI to understand how the scriptTransformer works. I want to know how to pass the data from table.field to transformer and get back the data from transformer and set the

Re: How to Create dynamic field names using script transformers

2010-01-26 Thread JavaGuy84
To add some more details, this is what I am trying to achieve... There are 2 fields present in a database table and I am trying to make those 2 fields as key value pair. Eg: Consider there are 2 fields associated with each other (Propertyid and propertyValue) I want the property id as field
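The key/value mapping described here can be illustrated independently of the DataImportHandler. The sketch below is plain Python, not a ScriptTransformer; the function name, the `_s` suffix (assumed to match a dynamic field pattern in the schema), and the sample rows are all hypothetical.

```python
def rows_to_fields(rows, suffix="_s"):
    """Turn (property_id, property_value) rows into document fields.

    Each property id becomes a field name; the suffix is an assumed
    dynamic-field pattern. Repeated ids collect multiple values.
    """
    doc = {}
    for property_id, property_value in rows:
        doc.setdefault(f"{property_id}{suffix}", []).append(property_value)
    return doc


# Made-up rows standing in for the result of the second query.
rows = [("color", "red"), ("size", "XL"), ("color", "blue")]
doc = rows_to_fields(rows)
print(doc)
```

Collecting values in lists keeps the sketch safe when one property id appears in several rows.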

RE: determine which value produced a hit in multivalued field type

2010-01-26 Thread Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS]
I guess it's not possible for all types then: int, sdate, etc. Because, Highlighting will only work on text fields. -Original Message- From: Lance Norskog [mailto:goks...@gmail.com] Sent: Monday, January 25, 2010 3:47 PM To: solr-user@lucene.apache.org Subject: Re: determine which

Re: SOLR index file system size estimate

2010-01-26 Thread Erick Erickson
10K documents of 20K each is only 200M as a base, so I don't think you need to worry. Especially since your question is unanswerable given the number of variables. About the only thing you can really do is measure, with the understanding that the first documents are more expensive space-wise
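The arithmetic behind the 200M figure, plus a naive way to scale up a measured sample, can be written out as follows. The function names and the sample measurement are hypothetical; the linear extrapolation is an assumption, and because the first documents cost more space, a small sample will tend to overestimate.

```python
def raw_corpus_bytes(doc_count, avg_doc_bytes):
    """Raw size of the source documents, before any indexing overhead."""
    return doc_count * avg_doc_bytes


def extrapolate_index_bytes(sample_index_bytes, sample_docs, total_docs):
    """Scale a measured index size linearly to the full document count."""
    return sample_index_bytes * total_docs // sample_docs


base = raw_corpus_bytes(10_000, 20_000)
print(f"raw corpus: {base // 1_000_000} MB")

# Made-up measurement: 1,000 indexed documents occupying 30 MB on disk.
estimate = extrapolate_index_bytes(30_000_000, 1_000, 10_000)
print(f"extrapolated index: {estimate // 1_000_000} MB")
```

Measuring at two or three sample sizes gives a better sense of how the per-document cost falls as the index grows.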

Re: Dynamic boosting of ids at search time

2010-01-26 Thread Chris Hostetter
: I mean, if for query x, ids to be boosted are 243452,346563,773567, then for : query y the ids to be boosted won't be the same. They are calculated at the : search time. : Also, I cant keep them in the lucene query as the list goes in thousands. : Please suggest a good resolution to it. I'm at

Re: Comparison of Solr with Sharepoint Search

2010-01-26 Thread Chris Hostetter
: Has anyone done a functionality comparison of Solr with Sharepoint/Fast : Search? there's been some discussion on this over the years comparing Solr with FAST if you go looking for it... http://old.nabble.com/SOLR-X-FAST-to14284618.html

Re: How can I boost bq in FieldQParserPlugin?

2010-01-26 Thread Chris Hostetter
: My original query is: : http://myhost:8080/solr/select?q=ipod&bq=userId:12345^0.5&fq=&start=0&rows=10&fl=*%2Cscore&qt=dismax&wt=standard&debugQuery=on&explainOther=&hl.fl= : But I would like to place the bq phrase in the default solrconfig.xml : configuration to make the query string more brief, so I

Re: Design Question - Dynamic Field Names (*)

2010-01-26 Thread Chris Hostetter
: - We are indexing CSV files and generating field names dynamically from the : header line. : User should be able to *list all the possible header names* (i.e. dynamic : field names), and filter results based on some of the field names. : - Also, list* all possible values* associated to for a

Multiple Cores Vs. Single Core for the following use case

2010-01-26 Thread Matthieu Labour
Hi Shall I set up Multiple Core or Single core for the following use case: I have X number of users. When I do a search, I always know for which user I am doing a search Shall I set up X cores, 1 for each user ? Or shall I set up 1 core and add a userId field to each document? If I

RE: Solr wiki link broken

2010-01-26 Thread Chris Hostetter
: You are right. The wiki can't be read if the preferred language is not English. : The wiki system seems to implement or be configured to use a wrong way of choosing its locale. : Erik, let me know if I can help solving this. Interesting. When accessing http://wiki.apache.org/solr/;

How to index the fields as key value pair if a query returns multiple rows

2010-01-26 Thread JavaGuy84
Hi all, I have a scenario where a particular query returns multiple results and I need to map those results as a key value pair. Ex: entity=t1 query=select id from table1 entity=t2 query=select PropertyName, propertyid from table2 where id ='${t1.id}' -- This query returns multiple values

RE: Comparison of Solr with Sharepoint Search

2010-01-26 Thread Fuad Efendi
I can only tell that Liferay Portal (WebDAV) Document Library Portlet has same functionality as Sharepoint (it has even /servlet/ URL with suffix '/sharepoint'); Liferay also has plugin (web-hook) for SOLR (it has generic search wrapper; any kind of search service provider can be hooked in

Re: Basic questions about Solr cost in programming time

2010-01-26 Thread Peter Wolanin
Having worked quite a bit on the Drupal integration - here's my quick take: If you have someone help you the first time, you can have a basic implementation running in Jetty in about 15 minutes. On your own, a couple hours maybe. For a non-public site (intranet) with modest traffic and no

Re: Solr 1.4 - stats page slow

2010-01-26 Thread Peter Wolanin
Sorry for not following up sooner - it's been a busy last couple of weeks. We do see a significant insanity count - could this be due to updating indexes from the dev Solr build? E.g. on one server I see stat name=insanity_count 61 /stat and entries like: stat

Re: schema.xml and Xinclude

2010-01-26 Thread Peter Wolanin
It doesn't really work with the schema.xml - I beat my head on it for a few hours not long ago - maybe I sent an e-mail to this list about it? Yes, here: http://www.lucidimagination.com/search/document/ba68aa6f2f7702c3/is_it_possible_to_use_xinclude_in_schema_xml -Peter On Wed, Jan 6, 2010 at

Re: Solr 1.4 - stats page slow

2010-01-26 Thread Yonik Seeley
On Tue, Jan 26, 2010 at 8:49 PM, Peter Wolanin peter.wola...@acquia.com wrote: Sorry for not following up sooner- been a busy last couple weeks. We do see a significant insanity count - could this be due to updating indexes from the dev Solr build? E.g. on one server I see Do you both sort

Re: Multiple Cores Vs. Single Core for the following use case

2010-01-26 Thread Trey
Hi Matt, In most cases you are going to be better off going with the userid method unless you have a very small number of users and a very large number of docs/user. The userid method will likely be much easier to manage, as you won't have to spin up a new core every time you add a new user. I
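With the single-core approach, each search is scoped to one user by adding a filter query on the user id. The sketch below builds the request parameters only; the function name is hypothetical, and `userId` is the field name proposed in the original question.

```python
from urllib.parse import urlencode, parse_qs


def user_scoped_params(user_query, user_id):
    """Return URL-encoded parameters restricting a search to one user.

    The restriction goes into fq (Solr's filter query parameter) so it
    narrows the result set without affecting relevance scoring.
    """
    return urlencode({"q": user_query, "fq": f"userId:{user_id}"})


params = user_scoped_params("ipod", 42)
print(params)
```

Keeping the user restriction in `fq` rather than in `q` also lets Solr cache the per-user filter separately from the main query.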

Re: Wildcard Search and Filter in Solr

2010-01-26 Thread ashokcz
Hi just looked at the analysis.jsp and found out what it does during index / query Index Analyzer Intel intel intel intel intel intel Query Analyzer Inte* Inte* inte* inte inte inte int I think somewhere my configuration or my definition of the type text is wrong. This is my

Re: DataImportHandler TikaEntityProcessor FieldReaderDataSource

2010-01-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
There is no corresponding DataSource which can be used with TikaEntityProcessor which reads from BLOB. I have opened an issue. https://issues.apache.org/jira/browse/SOLR-1737 On Mon, Jan 25, 2010 at 10:57 PM, Shah, Nirmal ns...@columnit.com wrote: Hi, I am fairly new to Solr and would like to

Re: Fastest way to use solrj

2010-01-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
If you write only a few docs you may not observe much difference in size. If you write a large number of docs you may observe a big difference. 2010/1/27 Tim Terlegård tim.terleg...@gmail.com: I got the binary format to work perfectly now. Performance is better than with xml. Thanks! Although, it