solr requirements

2010-10-18 Thread satya swaroop
Hi All,
I am planning to have a separate server for solr and regarding
hardware requirements i have a doubt about what configuration to be needed.
I know it will be hard to tell but i just need a minimum requirement for the
particular situation as follows::


1) There are 1000 regular users using solr and Every day each user indexes
10 files of 1KB each and totally it leads to a size of 10MB for a day and it
goes on...???

2)How much of RAM is used by solr in genral???

Thanks,
satya


Re: solr requirements

2010-10-18 Thread Erick Erickson
Well, always get the biggest, fastest machine you can G...

On a serious note, you're right, there's not much info to go
on here. And even if there were more info, Solr performance
depends on how you search your data as well as how much
data you have...

About the only way you can really tell is to set your system
up and use the adminstatistics page to monitor your
system. In particular, monitor your cache evictions etc.

This page may also help:
http://wiki.apache.org/solr/SolrPerformanceFactors

Best
Erick

On Mon, Oct 18, 2010 at 5:59 AM, satya swaroop satya.yada...@gmail.comwrote:

 Hi All,
I am planning to have a separate server for solr and regarding
 hardware requirements i have a doubt about what configuration to be needed.
 I know it will be hard to tell but i just need a minimum requirement for
 the
 particular situation as follows::


 1) There are 1000 regular users using solr and Every day each user indexes
 10 files of 1KB each and totally it leads to a size of 10MB for a day and
 it
 goes on...???

 2)How much of RAM is used by solr in genral???

 Thanks,
 satya



Re: solr requirements

2010-10-18 Thread satya swaroop
Hi,
   here is some more info about it. I use Solr to output only the file
names(file id's). Here i enclose the fields in my schema.xml and presently i
have only about 40MB of indexed data.


   field name=id type=string indexed=true stored=true
required=true /
   field name=sku type=textTight indexed=true stored=false
omitNorms=true/
   field name=name type=textgen indexed=true stored=false/

   field name=manu type=textgen indexed=true stored=false
omitNorms=true/
   field name=cat type=text_ws indexed=true stored=false
multiValued=true omitNorms=true /
   field name=features type=text indexed=true stored=false
multiValued=true/
   field name=includes type=text indexed=true stored=false
termVectors=true termPositions=true termOffsets=true /

   field name=weight type=float indexed=true stored=false/
   field name=price  type=float indexed=true stored=false/
   field name=popularity type=int indexed=true stored=false /
   field name=inStock type=boolean indexed=true stored=false /

   !--
   The following store examples are used to demonstrate the various ways one
might _CHOOSE_ to
implement spatial.  It is highly unlikely that you would ever have ALL
of these fields defined.
--
   field name=store type=location indexed=true stored=false/
   field name=store_lat_lon type=latLon indexed=true stored=false/
   field name=store_hash type=geohash indexed=true stored=false/


   !-- Common metadata fields, named specifically to match up with
 SolrCell metadata when parsing rich documents such as Word, PDF.
 Some fields are multiValued only because Tika currently may return
 multiple values for them.
   --
   field name=title type=text indexed=true stored=true
multiValued=true/
   field name=subject type=text indexed=true stored=false/
   field name=description type=text indexed=true stored=false/
   field name=comments type=text indexed=true stored=false/
   field name=author type=textgen indexed=true stored=false/
   field name=keywords type=textgen indexed=true stored=false/
   field name=category type=textgen indexed=true stored=false/
   field name=content_type type=string indexed=true stored=false
multiValued=true/
   field name=last_modified type=date indexed=true stored=false/
   field name=links type=string indexed=true stored=false
multiValued=true/
!-- added here content satya--
   field name=content type=spell indexed=true stored=false
multiValued=true/


   !-- catchall field, containing all other searchable text fields
(implemented
via copyField further on in this schema  --
   field name=text type=text indexed=true stored=false
multiValued=true termVectors=true/

   !-- catchall text field that indexes tokens both normally and in reverse
for efficient
leading wildcard queries.  here satya--
   field name=text_rev type=text_rev indexed=true stored=false
multiValued=true/

   !-- non-tokenized version of manufacturer to make it easier to sort or
group
results by manufacturer.  copied from manu via copyField here
satya--
   field name=manu_exact type=string indexed=true stored=false/
   field name=spell type=spell indexed=true stored=false
multiValued=true/
!-- heere changed --
   field name=payloads type=payloads indexed=true stored=false/

 field name=timestamp type=date indexed=true stored=false
default=NOW multiValued=false/



Regards,
satya