Re: Solrj Javabin and JSON

2009-10-24 Thread SGE0
Hi Paul, fair enough. Is this included in the Solrj package? Any examples of how to do this? Stefan Noble Paul നോബിള്‍ नोब्ळ्-2 wrote: There is no point converting javabin to JSON. Javabin is an intermediate format; it is converted to Java objects as soon as it arrives. You just need
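In SolrJ the javabin response is already materialized as plain Java objects by the time query() returns, so there is nothing left to convert. A minimal sketch, assuming a Solr 1.4-era SolrJ setup (the URL and query are placeholders):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;

    public class JavabinToObjects {
        public static void main(String[] args) throws Exception {
            // SolrJ uses javabin on the wire by default, but the caller
            // only ever sees ordinary Java objects.
            CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");
            QueryResponse rsp = server.query(new SolrQuery("*:*"));

            // Walk the documents and emit whatever output format is needed.
            for (SolrDocument doc : rsp.getResults()) {
                System.out.println(doc.getFieldValuesMap());
            }
        }
    }

If JSON straight from Solr is wanted instead, skip SolrJ and fetch the response over plain HTTP with wt=json.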

Re: Solrj client API and response in XML format (Solr 1.4)

2009-10-24 Thread SGE0
Hi Paul, thx again. Can I use this technique from within a servlet? Do I need an instance of HttpClient to do that? I noticed I can instantiate CommonsHttpSolrServer with an HttpClient. I did not find any relevant examples of how to use this. If you can help me out with this
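For reference, a sketch of how a shared CommonsHttpSolrServer might be created inside a servlet, assuming SolrJ 1.4 with Commons HttpClient 3.x (the URL is a placeholder):

    import java.net.MalformedURLException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import org.apache.commons.httpclient.HttpClient;
    import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class SearchServlet extends HttpServlet {
        private CommonsHttpSolrServer solr;

        @Override
        public void init() throws ServletException {
            try {
                // A multithreaded connection manager lets one shared
                // instance safely serve concurrent servlet requests.
                HttpClient client =
                    new HttpClient(new MultiThreadedHttpConnectionManager());
                solr = new CommonsHttpSolrServer("http://localhost:8983/solr", client);
            } catch (MalformedURLException e) {
                throw new ServletException(e);
            }
        }
    }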

Date Facet Giving Count more than actual

2009-10-24 Thread Aakash Dharmadhikari
hi guys, I am indexing events in Solr, where every Event contains a startDate and an endDate. On the search page, I would like to have a date facet where users can quickly browse through the dates they are interested in. I have a field daysForFilter in each document which stores timestamps from
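A sketch of the kind of date-facet request being described, assuming Solr 1.4's facet.date parameters and the poster's daysForFilter field (the range and gap values are placeholders):

    import org.apache.solr.client.solrj.SolrQuery;

    public class DateFacetQuery {
        public static SolrQuery build() {
            SolrQuery q = new SolrQuery("*:*");
            q.setFacet(true);
            // Bucket events by day over a 30-day window.
            q.set("facet.date", "daysForFilter");
            q.set("facet.date.start", "NOW/DAY-30DAYS");
            q.set("facet.date.end", "NOW/DAY+1DAY");
            q.set("facet.date.gap", "+1DAY");
            return q;
        }
    }

Note that with a multivalued date field, a document contributes to every bucket one of its values falls in, so counts summed across buckets can exceed the number of matching documents; that is one way facet counts run higher than expected.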

Solr under tomcat - UTF-8 issue

2009-10-24 Thread Glock, Thomas
Hoping someone can help. Problem: querying for non-English phrases such as Добавить does not return any results under Tomcat, but does work when using the Jetty example. Both Tomcat and Jetty are being queried by the same custom (Flash) client and both reference the same

Re: Solr under tomcat - UTF-8 issue

2009-10-24 Thread Zsolt Czinkos
Hello. Have you set the URIEncoding attribute to UTF-8 in Tomcat's server.xml (on the Connector element)? Like: <Connector URIEncoding="UTF-8" connectionTimeout="20000" port="8080" protocol="HTTP/1.1" redirectPort="8443"/> Hope this helps. Best regards, czinkos

RE: Too many open files

2009-10-24 Thread Fuad Efendi
I had an extremely specific use case: an update rate of about 5000 documents per second (small documents), where some documents can be repeatedly sent to SOLR with a different timestamp field (and the same unique document ID). Nothing breaks, just a great performance gain which was impossible with a 32Mb buffer (- it

Re: Solrj client API and response in XML format (Solr 1.4)

2009-10-24 Thread Noble Paul നോബിള്‍ नोब्ळ्
No need to use HttpClient. Use java.net.URL#openConnection(url) and read the InputStream into a buffer, and that is it.
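A minimal sketch of that approach (the URL is a placeholder):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;

    public class RawXmlQuery {
        public static void main(String[] args) throws Exception {
            // Plain JDK HTTP: no HttpClient dependency needed.
            URL url = new URL("http://localhost:8983/solr/select?q=*:*&wt=xml");
            URLConnection conn = url.openConnection();
            BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"));
            StringBuilder buf = new StringBuilder();
            for (String line; (line = in.readLine()) != null; ) {
                buf.append(line).append('\n');
            }
            in.close();
            System.out.println(buf); // the raw XML response
        }
    }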

RE: Too many open files

2009-10-24 Thread Fuad Efendi
Thanks for pointing to it, but it is so obvious:
1. The buffer is used as RAM storage for index updates.
2. An int has 2 x Gb different values (2^32).
3. We can have _up_to_ 2Gb of _Documents_ (stored as key-value pairs, inverted index).
In case of the 5 fields which I have, I need 5 arrays (up to 2Gb of

RE: Too many open files

2009-10-24 Thread Fuad Efendi
Mark, I don't understand this; of course it is use-case specific, but I haven't seen any terrible behaviour with 8Gb... 32Mb is extremely small for Nutch-SOLR-like applications, but it is acceptable for Liferay-SOLR... Please note also that I have some documents with the same IDs updated many thousands

RE: Too many open files

2009-10-24 Thread Fuad Efendi
This JavaDoc is incorrect, especially for SOLR, when you store a raw (non-tokenized, non-indexed) text value with a document (which almost everyone does). Try to store 1,000,000 documents with a 1000-byte non-tokenized field: you will need 1Gb just for this array.

Re: Too many open files

2009-10-24 Thread Yonik Seeley
If you had gone over 2GB of actual buffer *usage*, it would have broken... Guaranteed. We've now added a check in
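A sketch of that guard, assuming the Lucene 2.9 API (the directory and analyzer are placeholders):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.RAMDirectory;
    import org.apache.lucene.util.Version;

    public class RamBufferCap {
        public static void main(String[] args) throws Exception {
            IndexWriter writer = new IndexWriter(new RAMDirectory(),
                new StandardAnalyzer(Version.LUCENE_29),
                IndexWriter.MaxFieldLength.UNLIMITED);
            writer.setRAMBufferSizeMB(2000.0); // under the cap: accepted
            try {
                // Over 2048MB: rejected in 2.9.1+ per the check described above.
                writer.setRAMBufferSizeMB(8192.0);
            } catch (IllegalArgumentException e) {
                System.out.println("Refused: " + e.getMessage());
            }
            writer.close();
        }
    }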

Re: Solr under tomcat - UTF-8 issue

2009-10-24 Thread Yonik Seeley
Try using example/exampledocs/test_utf8.sh to narrow down whether the charset problems you're hitting are due to servlet container configuration. -Yonik http://www.lucidimagination.com Glock, Thomas (thomas.gl...@pfizer.com) wrote: Thanks, but it's not working... I did have the URIEncoding in place

RE: Solr under tomcat - UTF-8 issue

2009-10-24 Thread Glock, Thomas
Thanks - I now think it must be due to my client not sending enough (or correct) headers in the request. Tomcat does work when using an HTTP GET but fails on the POST from my Flash client. For example, putting this URL in both the Firefox and IE browsers works correctly:
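That diagnosis fits a well-known Tomcat behaviour: URIEncoding only applies to the GET query string, while a POST body is decoded as ISO-8859-1 unless the request declares a charset. A minimal sketch of the usual servlet-filter workaround (the filter still has to be mapped in web.xml):

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;

    public class Utf8Filter implements Filter {
        public void init(FilterConfig cfg) {}

        public void doFilter(ServletRequest req, ServletResponse res,
                             FilterChain chain) throws IOException, ServletException {
            // Force UTF-8 only when the client sent no charset of its own.
            if (req.getCharacterEncoding() == null) {
                req.setCharacterEncoding("UTF-8");
            }
            chain.doFilter(req, res);
        }

        public void destroy() {}
    }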

RE: Too many open files

2009-10-24 Thread Fuad Efendi
Hi Yonik, I am still using pre-2.9 Lucene (taken from SOLR trunk two months ago). 2048 is the limit for documents, not for the array of pointers to documents. And especially for the new uninverted SOLR features, plus non-tokenized stored fields, we need 1Gb to store 1Mb of a simple field only (size of

RE: Too many open files

2009-10-24 Thread Fuad Efendi
"Try to store 1,000,000 documents with a 1000-byte non-tokenized field: you will need 1Gb just for this array." Nope. You shouldn't even need 1GB of buffer space for that. The size

Re: Solr under tomcat - UTF-8 issue

2009-10-24 Thread Walter Underwood
Don't use POST. That is the wrong HTTP semantic for search results. Use GET. That will make it possible to cache the results, will make your HTTP logs useful, and all sorts of other good things. wunder

RE: Solr under tomcat - UTF-8 issue

2009-10-24 Thread Glock, Thomas
Thanks - I agree. However, my application requires that results be trimmed per user based on roles. The roles are repeating values on the documents, and users have many different role combinations, as do documents. I recognize this is going to hamper caching - but using a GET will tend to limit the
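One way to keep the role trimming GET-friendly is to express it as a filter query, which Solr caches in its filterCache independently of the main query, so users sharing a role combination reuse the same cached filter. A sketch, where the role field name is hypothetical:

    import org.apache.solr.client.solrj.SolrQuery;

    public class RoleTrimmedQuery {
        public static SolrQuery build(String userQuery, String[] roles) {
            SolrQuery q = new SolrQuery(userQuery);
            // e.g. roles = {"manager", "editor"} gives fq=role:(manager OR editor)
            StringBuilder fq = new StringBuilder("role:(");
            for (int i = 0; i < roles.length; i++) {
                if (i > 0) fq.append(" OR ");
                fq.append(roles[i]);
            }
            q.addFilterQuery(fq.append(')').toString());
            return q;
        }
    }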

RE: Too many open files

2009-10-24 Thread Fuad Efendi
If you had gone over 2GB of actual buffer *usage*, it would have broken... Guaranteed. We've now added a check in Lucene 2.9.1 that will throw an exception if you try to go over 2048MB. And as the javadoc says, to be on the safe side you probably shouldn't go too near 2048 - perhaps 2000MB

RE: StreamingUpdateSolrServer - indexing process stops in a couple of hours

2009-10-24 Thread Dadasheva, Olga
I am using Java 1.6.0_05. To illustrate what is happening, I wrote this test program, which has 10 threads adding a collection of documents and one thread optimizing the index every 10 seconds. I am seeing that after the first optimize there is only one thread that keeps adding documents. The other
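A reconstruction of the kind of test described, assuming SolrJ 1.4's StreamingUpdateSolrServer (the URL, queue size, thread count, and id field are placeholders):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class StreamingStressTest {
        public static void main(String[] args) throws Exception {
            final StreamingUpdateSolrServer solr =
                new StreamingUpdateSolrServer("http://localhost:8983/solr", 100, 4);

            ExecutorService pool = Executors.newFixedThreadPool(11);
            for (int t = 0; t < 10; t++) { // ten writer threads
                final int base = t * 1000000;
                pool.execute(new Runnable() {
                    public void run() {
                        try {
                            for (int i = 0; ; i++) {
                                SolrInputDocument doc = new SolrInputDocument();
                                doc.addField("id", Integer.toString(base + i));
                                solr.add(doc);
                            }
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                });
            }
            pool.execute(new Runnable() { // one optimizer thread
                public void run() {
                    try {
                        while (true) {
                            Thread.sleep(10000); // optimize every 10 seconds
                            solr.optimize();
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            });
        }
    }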