Re: System requirements in my case?

2012-05-22 Thread findbestopensource
A dedicated server may not be required. If you want to cut costs, prefer a shared server. How much RAM do you have? Regards Aditya www.findbestopensource.com On Tue, May 22, 2012 at 12:36 PM, Bruno Mannina bmann...@free.fr wrote: Dear Solr users, My company would like to use solr to index

Re: Strategy for maintaining De-normalized indexes

2012-05-22 Thread findbestopensource
That's how de-normalization works. You need to update all child products. If you just need the count and you are using facets, then maintain a map between category and main product, and between main product and child product. Lucene db has no schema. You could retrieve the data based on its type. Category

Re: Multicore Solr

2012-05-22 Thread findbestopensource
Having a core per user is not a good idea. The count is too high. Keep everything in a single core. You could filter the data based on user name or user id. Regards Aditya www.findbestopensource.com On Tue, May 22, 2012 at 2:29 PM, Shanu Jha shanuu@gmail.com wrote: Hi all, greetings from my
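A minimal sketch of the single-core approach, assuming each document carries a `user_id` field (a hypothetical field name) and that the core is reachable at the placeholder URL shown. It only builds the request URL, using Solr's `q` and `fq` parameters, and does not contact a server.

```python
from urllib.parse import urlencode


def build_user_query(base_url, text, user_id):
    """Build a Solr select URL restricted to one user's documents."""
    # Escape backslashes and double quotes so the id stays inside the quoted phrase.
    safe_id = user_id.replace("\\", "\\\\").replace('"', '\\"')
    params = [
        ("q", text),                      # the search terms
        ("fq", f'user_id:"{safe_id}"'),   # filter query; user_id is an assumed field name
        ("wt", "json"),
    ]
    return f"{base_url}/select?{urlencode(params)}"


# Placeholder host and core name, for illustration only.
url = build_user_query("http://localhost:8983/solr/core0", "invoice", "u42")
```

All users share one index, and the filter query narrows each search to the documents belonging to the requesting user.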

Re: System requirements in my case?

2012-05-22 Thread findbestopensource
www.findbestopensource.com On Tue, May 22, 2012 at 2:36 PM, Bruno Mannina bmann...@free.fr wrote: My choice: http://www.ovh.com/fr/serveurs_dedies/eg_best_of.xml 24 GB DDR3 On 22/05/2012 10:26, findbestopensource wrote: Dedicated Server may

Re: is commit a sequential process in solr indexing

2012-05-22 Thread findbestopensource
Yes. Lucene / Solr supports multi-threaded environments. You could commit from two different threads to the same core or to different cores. Regards Aditya www.findbestopensource.com On Tue, May 22, 2012 at 12:35 AM, jame vaalet jamevaa...@gmail.com wrote: hi, my use case here is to search all the

Re: Fault tolerant Solr replication architecture

2012-05-21 Thread findbestopensource
Hi Parvin, A fault-tolerant architecture is something you need to decide based on your requirements. At some point, manual intervention may be required to recover from a crash. You need to see what percentage of fault tolerance you can support. It certainly may not be 100. We could handle

Re: curl or nutch

2012-05-16 Thread findbestopensource
You could very well use Solr. It supports indexing PDF and XML files. If you want to index websites and search using page rank, then choose Nutch. Regards Aditya www.findbestopensource.com On Wed, May 16, 2012 at 1:13 PM, Tolga to...@ozses.net wrote: Hi, I have been trying for a week.

Re: authentication for solr admin page?

2012-05-15 Thread findbestopensource
I have written an article on this, covering the various steps to restrict / authenticate the Solr admin interface. http://www.findbestopensource.com/article-detail/restrict-solr-admin-access Regards Aditya www.findbestopensource.com On Thu, Mar 29, 2012 at 1:06 AM, geeky2 gee...@hotmail.com wrote: update

Large data set or data corpus

2012-01-11 Thread findbestopensource
Hello all, Recently I saw a couple of discussions in a LinkedIn group about generating a large data set or data corpus. I have compiled them into an article. I hope it will be helpful. If you have any other links where we could get large data sets for free, please reply to this mail thread, I will

Re: Search Issue

2012-01-11 Thread findbestopensource
While indexing, @ is removed. You need to use your own Tokenizer which will consider @rohit as one word. Another option is to break the tweet into two fields, @username and the tweet. Index both fields but don't use any tokenizer for the @username field. Just index it as it is. While querying
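The two-field option can be sketched on the client side as follows. This is an illustration under assumptions: the field names `mentions` and `text` are hypothetical, and the `mentions` field would need a non-tokenized type in the schema so that values such as @rohit are indexed as they are.

```python
import re

# An @ followed by word characters (letters, digits, underscore).
MENTION = re.compile(r"@\w+")


def tweet_to_doc(tweet_id, tweet):
    """Split a tweet into a list of @usernames plus the full text."""
    return {
        "id": tweet_id,
        "mentions": MENTION.findall(tweet),  # kept whole, including the @
        "text": tweet,                       # analyzed as usual by the index
    }


doc = tweet_to_doc("1", "@rohit thanks for the link, cc @solr_user")
```

A search for a user name would then target the `mentions` field, while ordinary keyword searches go against `text`.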

Re: Thoughts on Search Analytics?

2011-05-06 Thread findbestopensource
1. Reports based on location. Group by city / country 2. Total searches performed per hour / week / month 3. Frequently used search keywords 4. Analytics based on search keywords. Regards Aditya www.findbestopensource.com On Fri, May 6, 2011 at 3:55 AM, Otis Gospodnetic otis_gospodne...@yahoo.com
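A rough sketch of how the first three reports could be computed from a search log. The log format, a list of (ISO timestamp, country, query) tuples, is an assumption made for illustration and is not taken from the thread.

```python
from collections import Counter
from datetime import datetime


def summarize(log):
    """Count searches per hour, per country, and per keyword."""
    per_hour = Counter()
    per_country = Counter()
    keywords = Counter()
    for timestamp, country, query in log:
        hour = datetime.fromisoformat(timestamp).strftime("%Y-%m-%d %H:00")
        per_hour[hour] += 1
        per_country[country] += 1
        keywords.update(query.lower().split())  # naive whitespace keyword split
    return per_hour, per_country, keywords


# Hypothetical sample entries.
sample_log = [
    ("2011-05-06T03:10:00", "IN", "solr facets"),
    ("2011-05-06T03:45:00", "US", "Solr replication"),
    ("2011-05-06T04:05:00", "IN", "lucene analyzer"),
]
per_hour, per_country, keywords = summarize(sample_log)
```

Weekly and monthly totals follow the same pattern with a different `strftime` format, and `keywords.most_common(n)` gives the frequently used search keywords.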

Re: How can i use Solr based Search Engine for My University?

2011-05-06 Thread findbestopensource
Hello Anurag, Google is always there to do internet search. You need to support search for your university. My advice would be not to crawl the sites. You require only Solr and not Nutch. 1. Provide an interface for university students to upload documents. The documents could be previous

Re: Is it possible to use sub-fields or multivalued fields for boosting?

2011-05-05 Thread findbestopensource
Hello deniz, You could create a new field, say FullName, which is a copyField of firstname and surname. Search on both the new field and location, but boost the query on the new field. Regards Aditya www.findbestopensource.com On Thu, May 5, 2011 at 9:21 AM, deniz denizdurmu...@gmail.com wrote:
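The boosted search can be sketched as a query string in Lucene query syntax, where `^` applies a boost to a clause. The boost value of 2.0 is an arbitrary assumption, and the sketch assumes the terms are plain words that need no escaping.

```python
def boosted_query(terms, boost=2.0):
    """Search FullName and location, weighting FullName matches higher."""
    grouped = "(" + " ".join(terms) + ")"
    # FullName is the copyField target suggested above; location is the existing field.
    return f"FullName:{grouped}^{boost} OR location:{grouped}"


query = boosted_query(["john", "smith"])
```

The resulting string would be passed as the `q` parameter; the boost would need tuning against real data.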

Re: [ANNOUNCE] Web Crawler

2011-03-02 Thread findbestopensource
Hello Dominique Bejean, Good job. We identified almost 8 open source web crawlers http://www.findbestopensource.com/tagged/webcrawler I don't know how different yours is from the rest. Your license states that it is not open source but it is free for personal use. Regards Aditya

Re: Does Solr supports indexing search for Hebrew.

2011-01-18 Thread findbestopensource
You may need to use a Hebrew analyzer. http://www.findbestopensource.com/search/?query=hebrew Regards Aditya www.findbestopensource.com On Tue, Jan 18, 2011 at 2:34 PM, prasad deshpande prasad.deshpand...@gmail.com wrote: Hello, With reference to below links I haven't found Hebrew support

Re: Spatial Search - Best choice ?

2010-07-15 Thread findbestopensource
Some more pointers to spatial search, http://www.jteam.nl/products/spatialsolrplugin.html http://code.google.com/p/spatial-search-lucene/ http://sujitpal.blogspot.com/2008/02/spatial-search-with-lucene.html Regards Aditya www.findbestopensource.com On Thu, Jul 15, 2010 at 3:54 PM, Saïd

Re: Cache full text into memory

2010-07-14 Thread findbestopensource
You have two options: 1. Store the compressed text as part of a stored field in Solr. 2. Use external caching. http://www.findbestopensource.com/tagged/distributed-caching You could use ehcache / Memcache / Membase. The problem with external caching is you need to synchronize the deletions and
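A minimal sketch of the external-caching option with compression. A plain in-process dictionary stands in for ehcache / Memcache / Membase here, which is an assumption made to keep the example self-contained; the class and method names are hypothetical.

```python
import zlib


class CompressedTextCache:
    """Keeps document text compressed in memory, keyed by document id."""

    def __init__(self):
        self._store = {}

    def put(self, doc_id, text):
        self._store[doc_id] = zlib.compress(text.encode("utf-8"))

    def get(self, doc_id):
        blob = self._store.get(doc_id)
        return None if blob is None else zlib.decompress(blob).decode("utf-8")

    def delete(self, doc_id):
        # Call this whenever the document is deleted from the index,
        # so that the cache and the index stay in sync.
        self._store.pop(doc_id, None)


cache = CompressedTextCache()
cache.put("doc-1", "full text of the document " * 100)
```

The `delete` method marks the synchronization point mentioned above: every deletion in the index needs a matching call on the cache.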

Re: Cache full text into memory

2010-07-14 Thread findbestopensource
. To load more into memory, I want to compress it in memory. I don't care much about disk space so whether or not it's compressed in lucene . 2010/7/14 findbestopensource findbestopensou...@gmail.com: You have two options 1. Store the compressed text as part of stored field in Solr. 2. Using

Re: Cache full text into memory

2010-07-14 Thread findbestopensource
). 2010/7/14 findbestopensource findbestopensou...@gmail.com: I have just provided you two options. Since you already store it as part of the index, you could try external caching. Try using ehcache / Membase http://www.findbestopensource.com/tagged/distributed-caching . The caching system

Re: Use of EmbeddedSolrServer

2010-06-11 Thread findbestopensource
Refer to http://wiki.apache.org/solr/Solrj#EmbeddedSolrServer Regards Aditya www.findbestopensource.com On Fri, Jun 11, 2010 at 2:25 PM, Robert Naczinski robert.naczin...@googlemail.com wrote: Hello experts, we would like to use Solr in our search application. We want to index a large

Re: Indexing link targets in HTML fragments

2010-06-07 Thread findbestopensource
Could you tell us the schema you use for indexing? In my opinion, using StandardAnalyzer / Snowball analyzer will work best. They will not break the URLs. Add href and other related HTML tags as stop words and they will be removed while indexing. Regards Aditya www.findbestopensource.com On

Re: Query Question

2010-06-02 Thread findbestopensource
What analyzer are you using to index and search? Check out schema.xml. You are currently using an analyzer which breaks the words. If you don't want to break them, then you need to use <tokenizer class="solr.KeywordTokenizerFactory"/>. Regards Aditya www.findbestopensource.com On Wed, Jun 2, 2010 at 2:41
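The difference can be illustrated with a small conceptual sketch. These functions only mimic the behaviour described above and are not Solr code: one splits a field value into words, the other keeps the whole value as a single token.

```python
def whitespace_tokens(value):
    """Rough analogue of a tokenizer that breaks the value into words."""
    return value.split()


def keyword_token(value):
    """Rough analogue of a keyword tokenizer: the whole value is one token."""
    return [value]


value = "New York Times"
split_tokens = whitespace_tokens(value)  # three separate tokens
single_token = keyword_token(value)      # one token, matched only as a whole
```

With the single-token form, a query has to match the entire field value rather than any one word in it.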

Re: logic for auto-index

2010-06-02 Thread findbestopensource
You need to schedule your task. Check out the schedulers available in all programming languages. http://www.findbestopensource.com/tagged/job-scheduler Regards Aditya www.findbestopensource.com On Wed, Jun 2, 2010 at 2:39 PM, Jonty Rhods jonty.rh...@gmail.com wrote: Hi Peter, actually I

Re: newbie question on how to batch commit documents

2010-06-01 Thread findbestopensource
Add the commit after the loop. I would advise committing in a separate thread. I keep a separate timer thread, where I commit every minute and optimize the index at the end of every day. Regards Aditya www.findbestopensource.com On Tue, Jun 1, 2010 at 2:57 AM, Steve Kuo
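Both suggestions can be sketched as below. `FakeIndex` is a stand-in written for this example, not a real Solr client; its `add` and `commit` methods are hypothetical names that only record what a real client would send.

```python
import threading


class FakeIndex:
    """Stand-in for a Solr client; add() and commit() are hypothetical."""

    def __init__(self):
        self.pending = []
        self.committed = []
        self._lock = threading.Lock()

    def add(self, doc):
        with self._lock:
            self.pending.append(doc)

    def commit(self):
        with self._lock:
            self.committed.extend(self.pending)
            self.pending.clear()


def index_batch(index, docs):
    """Add every document first, then commit once after the loop."""
    for doc in docs:
        index.add(doc)
    index.commit()


def start_periodic_commit(index, interval_seconds, stop_event):
    """Commit from a separate thread every interval until stop_event is set."""
    def run():
        # wait() returns False on timeout, True once the event is set.
        while not stop_event.wait(interval_seconds):
            index.commit()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

For the every-minute commit described above, `interval_seconds` would be 60; a daily optimize could run on a second timer built the same way.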

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread findbestopensource
To retrieve all documents, you need to use the query/filter *FieldName:*:** Regards Aditya www.findbestopensource.com On Tue, May 25, 2010 at 4:14 PM, Rakhi Khatwani rkhatw...@gmail.com wrote: Hi, Is there any way to get all the fields (irrespective of whether it contains a value

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread findbestopensource
Resending as there was a typo. To retrieve all documents, you need to use the query/filter FieldName:*:* . Regards Aditya www.findbestopensource.com On Tue, May 25, 2010 at 4:29 PM, findbestopensource findbestopensou...@gmail.com wrote: To retrieve all documents, you need to use
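For reference, in Solr's standard query syntax the match-all query is written `*:*`, and `fl=*` asks for all stored fields of each returned document. The sketch below only builds such a request URL; the host and core name are placeholders.

```python
from urllib.parse import urlencode


def match_all_url(base_url, rows=10):
    """Build a Solr select URL that matches every document."""
    params = {
        "q": "*:*",    # match-all query
        "fl": "*",     # return all stored fields
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


all_docs_url = match_all_url("http://localhost:8983/solr/core0", rows=5)
```

A field that has no value in a given document is simply absent from that document in the response.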

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread findbestopensource
, May 25, 2010 at 5:07 PM, findbestopensource findbestopensou...@gmail.com wrote: Resending as there was a typo. To retrieve all documents, you need to use the query/filter FieldName:*:* . Regards Aditya www.findbestopensource.com On Tue, May 25, 2010 at 4:29 PM

Re: Personalized Search

2010-05-20 Thread findbestopensource
Hi Rih, Are you going to include either of the two fields, bought or like, per member/visitor, OR a unique field per member/visitor? If one or two common fields are included, then there will not be any impact on performance. If you want to include a unique field, then you need to consider multi

Re: Moving from Lucene to Solr?

2010-05-19 Thread findbestopensource
Hi Peter,
You need to use Lucene,
- To have more control
- If you cannot depend on any web server
- To use termvector, termdocs etc.
- To easily extend it with your own Analyzer
You need to use Solr,
- To index and search docs easily by writing little code
- Solr is a

Re: Solr Deployment Question

2010-05-14 Thread findbestopensource
. Only one index is being processed/optimized. Also, if I may add to my same question, how can I find the amount of memory that an index would use, theoretically? i.e.: Is there a formulae etc? Thanks Madu -Original Message- From: findbestopensource [mailto:findbestopensou

Re: Solr Deployment Question

2010-05-13 Thread findbestopensource
You may use one index at a time, but both indexes are active and have all their terms loaded in memory. Memory consumption will certainly be higher. Regards Aditya http://www.findbestopensource.com On Fri, May 14, 2010 at 10:28 AM, Maduranga Kannangara mkannang...@infomedia.com.au wrote: Hi We use

Re: multi-valued associated fields

2010-05-12 Thread findbestopensource
Hello Eric, Certainly it is possible. I would strongly advise having a field which differentiates the record type (RECORD_TYPE:CAR / PROPERTY). In general I was also wondering how Solr developers implement websites that use tag filters. For example, a user clicks on Hard drives then get tags

Re: Solr 1.4 Enterprise Search Server book examples

2010-04-27 Thread findbestopensource
I downloaded the 5883_Code.zip file but am not able to extract the complete contents. Regards Aditya www.findbestopensource.com On Tue, Apr 27, 2010 at 12:45 AM, Johan Cwiklinski johan.cwiklin...@ajlsm.com wrote: Hello, On 26/04/2010 20:53, findbestopensource wrote: I am able

Re: hybrid approach to using cloud servers for Solr/Lucene

2010-04-25 Thread findbestopensource
Hello Dennis, If the load goes up, then queries are sent to the cloud at a certain point. My advice is to do load balancing between local and cloud. Your local system seems to be capable as it is a dedicated host. Another option is to index locally and sync the index with the cloud. Cloud will be
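The load-balancing suggestion can be sketched as a simple round-robin choice between the two sides. The endpoint URLs are placeholders, and a real setup would more likely use a dedicated load balancer with health checks; this only illustrates the idea.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out endpoints in turn, so queries alternate between hosts."""

    def __init__(self, endpoints):
        self._endpoints = cycle(endpoints)

    def next_endpoint(self):
        return next(self._endpoints)


balancer = RoundRobinBalancer([
    "http://local.example/solr",  # hypothetical local dedicated host
    "http://cloud.example/solr",  # hypothetical cloud instance
])
picks = [balancer.next_endpoint() for _ in range(4)]
```

Each query would be sent to the endpoint returned by `next_endpoint()`, spreading the load evenly across both.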

Re: Best Open Source

2010-04-22 Thread findbestopensource
Thank you Dave and Michael for your feedback. We are currently in beta and we will fix these issues soon. Regards Aditya www.findbestopensource.com On Tue, Apr 20, 2010 at 3:01 PM, Michael Kuhlmann michael.kuhlm...@zalando.de wrote: Nice site. Really! In addition to Dave: How do I