Re: SEVERE: Could not start SOLR. Check solr/home property

2010-04-26 Thread Siddhant Goel
Did you by any chance set up multicore? Try passing in the path to the Solr
home directory as -Dsolr.solr.home=/path/to/solr/home while you start Solr.

On Mon, Apr 26, 2010 at 1:04 PM, Jon Drukman jdruk...@gmail.com wrote:

 What does this error mean?

 SEVERE: Could not start SOLR. Check solr/home property

 I've had this solr installation working before, but I haven't looked at it
 in a few months.  I checked it today and the web side is returning a 500
 error, the log file shows this when starting up:


 SEVERE: Could not start SOLR. Check solr/home property
  java.lang.RuntimeException: java.io.IOException: read past EOF
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
   at org.apache.solr.core.SolrCore.init(SolrCore.java:579)
   at
 org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
   at
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
   at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
   at
 org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
   at
 org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
   at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
   at
 org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)

 For the record, I've never explicitly set solr/home.  It always just
 worked.

 -jsd-




-- 
- Siddhant


Re: What hardware do I need ?

2010-04-24 Thread Siddhant Goel
If it's worth mentioning here, in my case the disk read speeds seemed to have
a really noticeable effect on the query times. What disks are you planning
on using? Also, as Otis has already pointed out, I doubt that a single box of
that capacity can handle 100-700 queries per second.

On Fri, Apr 23, 2010 at 1:32 PM, Otis Gospodnetic 
otis_gospodne...@yahoo.com wrote:

 Xavier,

 100-700 QPS is still high.  I'm guessing your 1 box won't handle that
 without sweating a lot (read: slow queries).
  Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/



 - Original Message 
  From: Xavier Schepler xavier.schep...@sciences-po.fr
  To: solr-user@lucene.apache.org
  Sent: Fri, April 23, 2010 11:53:23 AM
  Subject: Re: What hardware do I need ?
 
  Le 23/04/2010 17:08, Otis Gospodnetic a écrit :
  Xavier,
 
 
  0-1000 QPS is a pretty wide range.  Plus, it depends on how good your
  auto-complete is, which depends on types of queries it issues, among
 other
  things.
  100K short docs is small, so that will all fit in RAM nicely,
  assuming those other processes leave enough RAM for the OS to cache the
  index.
 
  That said, you do need more than 1 box if you want
  your auto-complete to be more fault tolerant.
 
  Otis
 
  
  Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
  Lucene ecosystem search :: http://search-lucene.com/
 
 
 
  - Original Message 
 
  From: Xavier Schepler xavier.schep...@sciences-po.fr
  To: solr-user@lucene.apache.org
  Sent: Fri, April 23, 2010 11:01:24 AM
  Subject: What hardware do I need ?
 
   Hi,
 
   I'm working with Solr 1.4.
   My schema has about 50 fields.
 
   I'm using full text search in short strings (~30-100 terms) and faceted
   search.
 
   My index will have 100 000 documents.
 
   The number of requests per second will be low. Let's say between 0 and
   1000 because of auto-complete.
 
   Is a standard server (3ghz proc, 4gb ram) with the client application
   (apache + php5 + ZF + apc) and Tomcat + Solr enough?
 
   Do I need more hardware?
 
   Thanks in advance,
 
   Xavier S.
 
 
  Well, my auto-complete is built on the facet prefix search component.
  I think that 100-700 requests per second is maybe a better approximation.




-- 
- Siddhant


Re: exclude words?

2010-03-31 Thread Siddhant Goel
I think you can use something like q=hello world -books. Should do.
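To make the negoation operator concrete, here is a minimal sketch of how such a query string gets encoded (the base URL is an assumption about your setup, not something from this thread):

```python
from urllib.parse import urlencode

# q = hello world -books : the leading minus is the standard Lucene prohibit
# operator, excluding any document that contains "books".
params = urlencode({"q": "hello world -books"})
url = "http://localhost:8983/solr/select?" + params  # base URL is an assumption

print(url)
```

I believe that with the default OR operator this reads as (hello OR world) AND NOT books; use +hello +world -books if both positive terms must match.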

On Wed, Mar 31, 2010 at 7:34 PM, Sebastian Funk
qbasti.f...@googlemail.comwrote:

 Hey there,

 I'm sure this is a pretty easy thing, but I can't find the solution:
 can I search for text with one word (e.g. books) specifically not in it?
 So Solr returns all documents that don't have books anywhere in them?

 thanks for the help,
 sebastian




-- 
- Siddhant


Re: jmap output help

2010-03-29 Thread Siddhant Goel
Gentle bounce

On Sun, Mar 28, 2010 at 11:31 AM, Siddhant Goel siddhantg...@gmail.comwrote:

 Hi everyone,

 The output of jmap -histo:live 27959 | head -30 is something like the
 following :

  num   #instances     #bytes  class name
  ----------------------------------------
    1:      448441  180299464  [C
    2:        5311  135734480  [I
    3:        3623   68389720  [B
    4:      445669   17826760  java.lang.String
    5:      391739   15669560  org.apache.lucene.index.TermInfo
    6:      417442   13358144  org.apache.lucene.index.Term
    7:       58767    5171496  org.apache.lucene.index.FieldsReader$LazyField
    8:       32902    5049760  constMethodKlass
    9:       32902    3955920  methodKlass
   10:        2843    3512688  constantPoolKlass
   11:        2397    3128048  [Lorg.apache.lucene.index.Term;
   12:          35    3053592  [J
   13:           3    3044288  [Lorg.apache.lucene.index.TermInfo;
   14:       55671    2707536  symbolKlass
   15:       27282    2701352  [Ljava.lang.Object;
   16:        2843    2212384  instanceKlassKlass
   17:        2343    2132224  constantPoolCacheKlass
   18:       26424    1056960  java.util.ArrayList
   19:       16423    1051072  java.util.LinkedHashMap$Entry
   20:        2039    1028944  methodDataKlass
   21:       14336     917504  org.apache.lucene.document.Field
   22:       29587     710088  java.lang.Integer
   23:        3171     583464  java.lang.Class
   24:         813     492880  [Ljava.util.HashMap$Entry;
   25:        8471     474376  org.apache.lucene.search.PhraseQuery
   26:        4184     402848  [[I
   27:        4277     380704  [S

 Is it ok to assume that the top 3 entries (character/integer/byte arrays)
 are referring to the entries inside the solr cache?

 Thanks,


 --
 - Siddhant




-- 
- Siddhant


Re: Solr Performance Issues

2010-03-12 Thread Siddhant Goel
I've allocated 4GB to Solr, so the rest of the 4GB is free for the OS disk
caching.

I think that at any point in time, there can be at most as many concurrent
requests as there are threads, which happens to make sense btw (does it?).

As I increase the number of threads, the load average shown by top goes up
to as high as 80%. But if I keep the number of threads low (~10), the load
average never goes beyond ~8. So probably that's the number of requests I
can expect Solr to serve concurrently on this index size with this hardware.

Can anyone give a general opinion as to how much hardware should be
sufficient for a Solr deployment with an index size of ~43GB, containing
around 2.5 million documents? I'm expecting it to serve at least 20 requests
per second. Any experiences?

Thanks

On Fri, Mar 12, 2010 at 12:47 AM, Tom Burton-West tburtonw...@gmail.comwrote:


 How much of your memory are you allocating to the JVM and how much are you
 leaving free?

 If you don't leave enough free memory for the OS, the OS won't have a large
 enough disk cache, and you will be hitting the disk for lots of queries.

 You might want to monitor your Disk I/O using iostat and look at the
 iowait.

 If you are doing phrase queries and your *prx file is significantly larger
 than the available memory then when a slow phrase query hits Solr, the
 contention for disk I/O with other queries could be slowing everything
 down.
 You might also want to look at the 90th and 99th percentile query times in
 addition to the average. For our large indexes, we found at least an order
 of magnitude difference between the average and 99th percentile queries.
 Again, if Solr gets hit with a few of those 99th percentile slow queries and
 you're not hitting your caches, chances are you will see serious contention
 for disk I/O.

 Of course if you don't see any waiting on i/o, then your bottleneck is
 probably somewhere else:)

 See

 http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-1
 for more background on our experience.

 Tom Burton-West
 University of Michigan Library
 www.hathitrust.org



 
  On Thu, Mar 11, 2010 at 9:39 AM, Siddhant Goel siddhantg...@gmail.com
  wrote:
 
   Hi everyone,
  
   I have an index corresponding to ~2.5 million documents. The index size
  is
   43GB. The configuration of the machine which is running Solr is - Dual
   Processor Quad Core Xeon 5430 - 2.66GHz (Harpertown) - 2 x 12MB cache,
  8GB
   RAM, and 250 GB HDD.
  
   I'm observing a strange trend in the queries that I send to Solr. The
  query
   times for queries that I send earlier is much lesser than the queries I
   send
   afterwards. For instance, if I write a script to query solr 5000 times
   (with
   5000 distinct queries, most of them containing not more than 3-5 words)
   with
   10 threads running in parallel, the average times for queries goes from
   ~50ms in the beginning to ~6000ms. Is this expected or is there
  something
   wrong with my configuration. Currently I've configured the
  queryResultCache
   and the documentCache to contain 2048 entries (hit ratios for both is
  close
   to 50%).
  
   Apart from this, a general question that I want to ask is that is such
 a
   hardware enough for this scenario? I'm aiming at achieving around 20
   queries
   per second with the hardware mentioned above.
  
   Thanks,
  
   Regards,
  
   --
   - Siddhant
  
 



 --
 - Siddhant



 --
 View this message in context:
 http://old.nabble.com/Solr-Performance-Issues-tp27864278p27868456.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
- Siddhant


Re: Solr Performance Issues

2010-03-12 Thread Siddhant Goel
Hi,

Thanks for your responses. It actually feels good to be able to locate where
the bottlenecks are.

I've created two sets of data - in the first one I'm measuring the time taken
purely on Solr's end, and in the other one I'm including network latency
(just for reference). The data that I'm posting below contains the time taken
purely by Solr.

I'm running 10 threads simultaneously and the average response time (for
each query in each thread) remains close to 40 to 50 ms. But as soon as I
increase the number of threads to something like 100, the response time goes
up to ~600ms, and further up when the number of threads is close to 500. Yes
the average time definitely depends on the number of concurrent requests.

Going from memory, debugQuery=on will let you know how much time
 was spent in various operations in SOLR. It's important to know
 whether it was the searching, assembling the response, or
 transmitting the data back to the client.


I just tried this. The information that it gives me for a query that took
7165ms is - http://pastebin.ca/1835644

So out of the total time 7165ms, QueryComponent took most of the time. Plus
I can see the load average going up when the number of threads is really
high. So it actually makes sense. (I didn't add any other component while
searching; it was a plain /select?q=query call).
Like I mentioned earlier in this mail, I'm maintaining separate sets for
data with/without network latency, and I don't think it's the bottleneck.


 How many threads does it take to peg the CPU? And what
 response times are you getting when your number of threads is
 around 10?


If the number of threads is greater than 100, that really takes its toll on
the CPU. So probably that's the number.

When the number of threads is around 10, the response times average to
something like 60ms (and 95% of the queries fall within 100ms of that
value).
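Computing those average-vs-percentile figures from raw timings is straightforward; a quick sketch (the sample numbers below are invented, not from this test run):

```python
# Average vs. high-percentile latency from a list of per-query times (ms).
# One slow outlier barely moves the average but dominates the tail.
times_ms = [42, 55, 61, 48, 52, 58, 47, 950, 63, 51]

def percentile(values, pct):
    """Nearest-rank percentile: the value at rank ceil-ish pct% of the sample."""
    ordered = sorted(values)
    rank = max(1, round(pct / 100 * len(ordered)))
    return ordered[rank - 1]

avg = sum(times_ms) / len(times_ms)
p99 = percentile(times_ms, 99)
print(f"avg={avg:.1f}ms p99={p99}ms")
```

This is exactly the effect Tom describes: the 99th percentile can sit an order of magnitude above the average.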

Thanks,





 Erick

 On Fri, Mar 12, 2010 at 3:39 AM, Siddhant Goel siddhantg...@gmail.com
 wrote:

  I've allocated 4GB to Solr, so the rest of the 4GB is free for the OS
 disk
  caching.
 
  I think that at any point in time, there can be at most as many concurrent
  requests as there are threads, which happens to make sense btw (does it?).
 
  As I increase the number of threads, the load average shown by top goes
 up
  to as high as 80%. But if I keep the number of threads low (~10), the
 load
  average never goes beyond ~8. So probably that's the number of requests I
  can expect Solr to serve concurrently on this index size with this
  hardware.
 
  Can anyone give a general opinion as to how much hardware should be
  sufficient for a Solr deployment with an index size of ~43GB, containing
  around 2.5 million documents? I'm expecting it to serve at least 20
  requests
  per second. Any experiences?
 
  Thanks
 
  On Fri, Mar 12, 2010 at 12:47 AM, Tom Burton-West tburtonw...@gmail.com
  wrote:
 
  
   How much of your memory are you allocating to the JVM and how much are
  you
   leaving free?
  
   If you don't leave enough free memory for the OS, the OS won't have a
  large
   enough disk cache, and you will be hitting the disk for lots of
 queries.
  
   You might want to monitor your Disk I/O using iostat and look at the
   iowait.
  
   If you are doing phrase queries and your *prx file is significantly
  larger
   than the available memory then when a slow phrase query hits Solr, the
   contention for disk I/O with other queries could be slowing everything
   down.
   You might also want to look at the 90th and 99th percentile query times
  in
   addition to the average. For our large indexes, we found at least an
  order
   of magnitude difference between the average and 99th percentile
 queries.
   Again, if Solr gets hit with a few of those 99th percentile slow
 queries
   and
    you're not hitting your caches, chances are you will see serious
 contention
   for disk I/O..
  
   Of course if you don't see any waiting on i/o, then your bottleneck is
   probably somewhere else:)
  
   See
  
  
 
 http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-1
   for more background on our experience.
  
   Tom Burton-West
   University of Michigan Library
   www.hathitrust.org
  
  
  
   
On Thu, Mar 11, 2010 at 9:39 AM, Siddhant Goel 
 siddhantg...@gmail.com
wrote:
   
 Hi everyone,

 I have an index corresponding to ~2.5 million documents. The index
  size
is
 43GB. The configuration of the machine which is running Solr is -
  Dual
 Processor Quad Core Xeon 5430 - 2.66GHz (Harpertown) - 2 x 12MB
  cache,
8GB
 RAM, and 250 GB HDD.

 I'm observing a strange trend in the queries that I send to Solr.
 The
query
 times for queries that I send earlier is much lesser than the
 queries
  I
 send
 afterwards. For instance, if I write a script to query solr 5000
  times
 (with
 5000 distinct queries, most of them containing not more than 3-5
  words

Solr Performance Issues

2010-03-11 Thread Siddhant Goel
Hi everyone,

I have an index corresponding to ~2.5 million documents. The index size is
43GB. The configuration of the machine which is running Solr is - Dual
Processor Quad Core Xeon 5430 - 2.66GHz (Harpertown) - 2 x 12MB cache, 8GB
RAM, and 250 GB HDD.

I'm observing a strange trend in the queries that I send to Solr. The query
times for queries that I send earlier are much lower than for the queries I
send afterwards. For instance, if I write a script to query solr 5000 times
(with 5000 distinct queries, most of them containing not more than 3-5 words)
with 10 threads running in parallel, the average time for queries goes from
~50ms in the beginning to ~6000ms. Is this expected, or is there something
wrong with my configuration? Currently I've configured the queryResultCache
and the documentCache to contain 2048 entries (hit ratios for both are close
to 50%).

Apart from this, a general question that I want to ask is: is such
hardware enough for this scenario? I'm aiming at achieving around 20 queries
per second with the hardware mentioned above.

Thanks,

Regards,

-- 
- Siddhant


Re: Solr Performance Issues

2010-03-11 Thread Siddhant Goel
Hi Erick,

The way the load test works is that it picks up 5000 queries, splits them
according to the number of threads (so if we have 10 threads, it schedules
10 threads - each one sending 500 queries). So it might be possible that the
number of queries at a point later in time is greater than the number of
queries earlier in time. I'm not very sure about that though. Its a simple
Ruby script that starts up threads, calls the search function in each
thread, and then waits for each of them to exit.

How many queries per second can we expect Solr to serve, given this kind of
hardware? If what you suggest is true, then is it possible that while Solr
is serving a query, another query hits it, which increases the response time
even further? I'm not sure about it. But yes I can observe the query times
going up as I increase the number of threads.
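For comparison, a minimal Python version of that load-test loop might look like this (the search function below is a placeholder stub, not a real Solr call, and the query list is invented):

```python
# Sketch of the load test described above: N worker threads, each sending its
# share of the queries and recording per-query latency.
import time
from concurrent.futures import ThreadPoolExecutor

QUERIES = [f"term{i}" for i in range(50)]  # stand-in for the 5000 real queries
NUM_THREADS = 10

def search(query):
    """Placeholder for the actual HTTP request to /select?q=..."""
    time.sleep(0.001)  # simulate a fast query; replace with a real HTTP call
    return query

def timed_search(query):
    start = time.monotonic()
    search(query)
    return (time.monotonic() - start) * 1000  # latency in ms

with ThreadPoolExecutor(max_workers=NUM_THREADS) as pool:
    latencies = list(pool.map(timed_search, QUERIES))

print(f"{len(latencies)} queries, avg {sum(latencies)/len(latencies):.1f} ms")
```

With a structure like this, the number of in-flight requests is capped at NUM_THREADS, which is the kind of limiting Erick suggests.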

Thanks,

Regards,

On Thu, Mar 11, 2010 at 8:30 PM, Erick Erickson erickerick...@gmail.comwrote:

 How many outstanding queries do you have at a time? Is it possible
 that when you start, you have only a few queries executing concurrently
 but as your test runs you have hundreds?

 This really is a question of how your load test is structured. You might
 get a better sense of how it works if your tester had a limited number
 of threads running so the max concurrent requests SOLR was serving
 at once were capped (30, 50, whatever).

 But no, I wouldn't expect SOLR to bog down the way you're describing
 just because it was running for a while.

 HTH
 Erick

 On Thu, Mar 11, 2010 at 9:39 AM, Siddhant Goel siddhantg...@gmail.com
 wrote:

  Hi everyone,
 
  I have an index corresponding to ~2.5 million documents. The index size
 is
  43GB. The configuration of the machine which is running Solr is - Dual
  Processor Quad Core Xeon 5430 - 2.66GHz (Harpertown) - 2 x 12MB cache,
 8GB
  RAM, and 250 GB HDD.
 
  I'm observing a strange trend in the queries that I send to Solr. The
 query
  times for queries that I send earlier is much lesser than the queries I
  send
  afterwards. For instance, if I write a script to query solr 5000 times
  (with
  5000 distinct queries, most of them containing not more than 3-5 words)
  with
  10 threads running in parallel, the average times for queries goes from
  ~50ms in the beginning to ~6000ms. Is this expected or is there something
  wrong with my configuration. Currently I've configured the
 queryResultCache
  and the documentCache to contain 2048 entries (hit ratios for both is
 close
  to 50%).
 
  Apart from this, a general question that I want to ask is that is such a
  hardware enough for this scenario? I'm aiming at achieving around 20
  queries
  per second with the hardware mentioned above.
 
  Thanks,
 
  Regards,
 
  --
  - Siddhant
 




-- 
- Siddhant


Re: field length normalization

2010-03-11 Thread Siddhant Goel
Did you reindex after setting omitNorms to false? I believe norms are written
at index time, so the change won't take effect until you do.

On Thu, Mar 11, 2010 at 5:34 PM, muneeb muneeba...@hotmail.com wrote:


 Hi,

 In my schema, the document title field has omitNorms=false, which, if I
 am
 not wrong, causes length of titles to be counted in the scoring.

 But when I query with word1 word2 word3, I don't know why the top two
 documents' titles have these words plus other words, whereas the document
 whose title has exactly and only these query words comes in third place.

 Setting omitNorms to false should bring the titles with exact words to the
 top, shouldn't it?

 Also I realized, when I debugged the query, that all three top documents have
 the same score; shouldn't this be different, as they have different title
 lengths?

 Thanks very much.
 -A
 --
 View this message in context:
 http://old.nabble.com/field-length-normalization-tp27862618p27862618.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
- Siddhant


Re: Question about fieldNorms

2010-03-08 Thread Siddhant Goel
Wonderful! That explains it. Thanks a lot!

Regards,

On Mon, Mar 8, 2010 at 6:39 AM, Jay Hill jayallenh...@gmail.com wrote:

 Yes, if omitNorms=true, then no lengthNorm calculation will be done, and
 the
 fieldNorm value will be 1.0, and lengths of the field in question will not
 be a factor in the score.

 To see an example of this you can do a quick test. Add two text fields,
 and on one omitNorms:

   <field name="foo" type="text" indexed="true" stored="true"/>
   <field name="bar" type="text" indexed="true" stored="true"
  omitNorms="true"/>

 Index a doc with the same value for both fields:
  <field name="foo">1 2 3 4 5</field>
  <field name="bar">1 2 3 4 5</field>

 Set debugQuery=true and do two queries: q=foo:5   q=bar:5

 in the explain section of the debug output note that the fieldNorm value
 for the foo query is this:

0.4375 = fieldNorm(field=foo, doc=1)

 and the value for the bar query is this:

1.0 = fieldNorm(field=bar, doc=1)

 A simplified description of how the fieldNorm value is computed: fieldNorm =
 lengthNorm * documentBoost * documentFieldBoosts

 and the lengthNorm is calculated like this: lengthNorm  =
 1/(numTermsInField)**.5
 [note that the value is encoded as a single byte, so there is some
 precision
 loss]
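 A quick check of that arithmetic (this only shows the raw lengthNorm; the
 single-byte encoding Lucene applies afterwards is more involved):

```python
import math

# lengthNorm = 1 / sqrt(number of terms in the field)
# For the 5-term field "1 2 3 4 5":
raw_norm = 1 / math.sqrt(5)
print(round(raw_norm, 4))  # 0.4472

# Lucene then stores this in a single byte, and that precision loss is why
# the debug output above reports 0.4375 rather than 0.4472.
```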

 When omitNorms=true no norm calculation is done, so fieldNorm will always
 be
 one on those fields.

 You can also use the Luke utility to view the document in the index, and it
 will show that there is a norm value for the foo field, but not the bar
 field.

 -Jay
 http://www.lucidimagination.com


 On Sun, Mar 7, 2010 at 5:55 AM, Siddhant Goel siddhantg...@gmail.com
 wrote:

  Hi everyone,
 
  Is the fieldNorm calculation altered by the omitNorms factor? I saw on
 this
  page (http://old.nabble.com/Question-about-fieldNorm-td17782701.html)
 the
  formula for calculation of fieldNorms (fieldNorm =
  fieldBoost/sqrt(numTermsForField)).
 
  Does this mean that for a document containing a string like A B C D E
 in
  its field, its fieldNorm would be boost/sqrt(5), and for another document
  containing the string A B C in the same field, its fieldNorm would be
  boost/sqrt(3). Is that correct?
 
  If yes, then is *this* what omitNorms affects?
 
  Thanks,
 
  --
  - Siddhant
 




-- 
- Siddhant


Re: Free Webinar: Mastering Solr 1.4 with Yonik Seeley

2010-03-07 Thread Siddhant Goel
Now that I missed attending it, where can I view it? :-)

Thanks

On Fri, Feb 26, 2010 at 10:11 PM, Jay Hill jayallenh...@gmail.com wrote:

 Yes, it will be recorded and available to view after the presentation.

 -Jay


 On Thu, Feb 25, 2010 at 2:19 PM, Bernadette Houghton 
 bernadette.hough...@deakin.edu.au wrote:

  Yonik, can you please advise whether this event will be recorded and
  available for later download? (It starts 5am our time ;-)  )
 
  Regards
  Bern
 
  -Original Message-
  From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
  Seeley
  Sent: Thursday, 25 February 2010 10:23 AM
  To: solr-user@lucene.apache.org
  Subject: Free Webinar: Mastering Solr 1.4 with Yonik Seeley
 
  I'd like to invite you to join me for an in-depth review of Solr's
  powerful, versatile new features and functions. The free webinar,
  sponsored by my company, Lucid Imagination, covers an intensive
  how-to for the features you need to make the most of Solr for your
  search application:
 
 * Faceting deep dive, from document fields to performance management
 * Best practices for sharding, index partitioning and scaling
 * How to construct efficient Range Queries and function queries
 * Sneak preview: Solr 1.5 roadmap
 
  Join us for a free webinar
  Thursday, March 4, 2010
  10:00 AM PST / 1:00 PM EST / 18:00 GMT
  Follow this link to sign up
 
  http://www.eventsvc.com/lucidimagination/030410?trk=WR-MAR2010-AP
 
  Thanks,
 
  -Yonik
  http://www.lucidimagination.com
 




-- 
- Siddhant


Question about fieldNorms

2010-03-07 Thread Siddhant Goel
Hi everyone,

Is the fieldNorm calculation altered by the omitNorms factor? I saw on this
page (http://old.nabble.com/Question-about-fieldNorm-td17782701.html) the
formula for calculation of fieldNorms (fieldNorm =
fieldBoost/sqrt(numTermsForField)).

Does this mean that for a document containing a string like A B C D E in
its field, its fieldNorm would be boost/sqrt(5), and for another document
containing the string A B C in the same field, its fieldNorm would be
boost/sqrt(3). Is that correct?

If yes, then is *this* what omitNorms affects?

Thanks,

-- 
- Siddhant


Re: multiCore

2010-03-05 Thread Siddhant Goel
Can you provide the error message that you got?

On Sat, Mar 6, 2010 at 11:13 AM, Suram reactive...@yahoo.com wrote:


 Hi,


  How can I send the xml file to Solr after creating the multicore? I tried,
 but it refuses to accept it.
 --
 View this message in context:
 http://old.nabble.com/multiCore-tp27802043p27802043.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
- Siddhant


Re: field not found for search

2010-03-04 Thread Siddhant Goel
Did you send a commit after indexing those files?

On Thu, Mar 4, 2010 at 6:30 PM, Suram reactive...@yahoo.com wrote:


 Hi,

    I newly indexed some xml files, but they are not found by search or
 autosuggestion.

 My xml index file: http://old.nabble.com/file/p27780413/Nike.xml

 and my schema is: http://old.nabble.com/file/p27780413/schema.xml

 How can I achieve this?
 --
 View this message in context:
 http://old.nabble.com/field-not-found-for-search-tp27780413p27780413.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
- Siddhant


Re: fieldType text

2010-03-02 Thread Siddhant Goel
I think that's because of the internal tokenization that Solr does. If a
document contains HP1, and you're using the default text field type, Solr
would tokenize that to HP and 1, so that document figures in the list of
documents containing HP, and hence that document appears in the search
results for HP. Creating a separate text field which does not tokenize like
that might be what you want.

The various filter/tokenizer types are listed here -
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
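As a rough illustration of that splitting (this regex only mimics the letter/digit splitting behavior; it is not Solr's actual analyzer, which is configured per field type in schema.xml):

```python
import re

# Crude stand-in for the default text type's word-delimiter behavior:
# split on boundaries between runs of letters and runs of digits.
def toy_tokenize(text):
    return re.findall(r"[A-Za-z]+|[0-9]+", text)

print(toy_tokenize("HP1"))  # ['HP', '1'] -- so a doc with "HP1" matches q=HP
print(toy_tokenize("TCS"))  # ['TCS']
```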

On Tue, Mar 2, 2010 at 6:07 PM, Frederico Azeiteiro 
frederico.azeite...@cision.com wrote:

 Hi,

 I'm using the default text  field type that comes with the example.



 When searching for simple words such as 'HP' or 'TCS', solr is returning
 results that contain 'HP1' or 'TCS'.

 Is there a way to avoid this?



 Thanks,

 Frederico




-- 
- Siddhant


Re: Indexing HTML document

2010-03-02 Thread Siddhant Goel
There is an HTML filter documented here, which might be of some help -
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory

Control characters can be eliminated using code like this -
http://bitbucket.org/cogtree/python-solr/src/tip/pythonsolr/pysolr.py#cl-449

On Tue, Mar 2, 2010 at 9:37 PM, György Frivolt gyorgy.friv...@gmail.comwrote:

 Hi, how do I properly index HTML documents? All the documents are HTML, some
 containing characters encoded like &#x17E;&#xED; ... Is there a character
 filter for filtering these codes? Is there a way to strip the HTML tags out?
 Does solr weight the terms in the document based on where they appear?
 Words in headers (H1, H2, ..) would be supposed to describe the document
 more than words in paragraphs.

 Thanks for help,

   Georg




-- 
- Siddhant


Re: updating particular field

2010-03-01 Thread Siddhant Goel
Yes. You can just re-add the document with your changes, and the rest of the
fields in the document will remain unchanged.
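A sketch of such a re-add using the standard XML update format (the field values below come from the example document later in this thread; every stored field must be included, not just the changed one):

```python
import xml.etree.ElementTree as ET

# Re-adding a document: send ALL fields again, with the changed value.
# Solr replaces the old document that has the same uniqueKey (id).
fields = {
    "id": "EN7800GTX/2DHTV/256M",
    "manu": "ASUS Computer Inc.",
    "inStock": "true",  # the one value we actually changed
    # ... every other stored field of the document goes here as well ...
}

add = ET.Element("add")
doc = ET.SubElement(add, "doc")
for name, value in fields.items():
    field = ET.SubElement(doc, "field", name=name)
    field.text = value

payload = ET.tostring(add, encoding="unicode")
print(payload)  # POST this to the /update handler, then send <commit/>
```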

On Mon, Mar 1, 2010 at 5:09 PM, Suram reactive...@yahoo.com wrote:


 Hi,

 <doc>
   <field name="id">EN7800GTX/2DHTV/256M</field>
   <field name="manu">ASUS Computer Inc.</field>
   <field name="cat">electronics</field>
   <field name="cat">graphics card</field>
   <field name="features">NVIDIA GeForce 7800 GTX GPU/VPU clocked at
 486MHz</field>
   <field name="features">256MB GDDR3 Memory clocked at 1.35GHz</field>
   <field name="price">479.95</field>
   <field name="popularity">7</field>
   <field name="inStock">false</field>
   <field name="manufacturedate_dt">2006-02-13T15:26:37Z/DAY</field>
 </doc>

 Can I update <field name="inStock">true</field> without affecting
 any other field of my previous document?

 Thanks in advance
 --
 View this message in context:
 http://old.nabble.com/updating-particular-field-tp27742399p27742399.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
- Siddhant


Re: updating particular field

2010-03-01 Thread Siddhant Goel
Yep. I think an update in Lucene means first a deletion, and then an
addition. So the entire document needs to be sent for the update.

On Mon, Mar 1, 2010 at 7:24 PM, Israel Ekpo israele...@gmail.com wrote:

 Unfortunately, because of how Lucene works internally, you will not be able
 to update just one or two fields. You have to resubmit the entire document.

 If you only send just one or two fields, then the updated document will
 only
 have the fields sent in the last update.

 On Mon, Mar 1, 2010 at 7:09 AM, Suram reactive...@yahoo.com wrote:

 
 
 
  Siddhant wrote:
  
   Yes. You can just re-add the document with your changes, and the rest
 of
   the
   fields in the document will remain unchanged.
  
   On Mon, Mar 1, 2010 at 5:09 PM, Suram reactive...@yahoo.com wrote:
  
  
   Hi,
  
    <doc>
      <field name="id">EN7800GTX/2DHTV/256M</field>
      <field name="manu">ASUS Computer Inc.</field>
      <field name="cat">electronics</field>
      <field name="cat">graphics card</field>
      <field name="features">NVIDIA GeForce 7800 GTX GPU/VPU clocked at
      486MHz</field>
      <field name="features">256MB GDDR3 Memory clocked at 1.35GHz</field>
      <field name="price">479.95</field>
      <field name="popularity">7</field>
      <field name="inStock">false</field>
      <field name="manufacturedate_dt">2006-02-13T15:26:37Z/DAY</field>
    </doc>
  
    Can I update <field name="inStock">true</field> without affecting
    any field of my previous document?
  
   Thanks in advance
   --
   View this message in context:
  
  http://old.nabble.com/updating-particular-field-tp27742399p27742399.html
   Sent from the Solr - User mailing list archive at Nabble.com.
  
  
  
  
   --
   - Siddhant
  
  
 
 
   Hi,
     Here I don't want to reload the entire data; I just want to update the
   fields I need to change (i.e. one or more fields by id, not the whole
   document).
 
 
  --
  View this message in context:
  http://old.nabble.com/updating-particular-field-tp27742399p27742671.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 
 


 --
 Good Enough is not good enough.
 To give anything less than your best is to sacrifice the gift.
 Quality First. Measure Twice. Cut Once.
 http://www.israelekpo.com/




-- 
- Siddhant


Re: CoreAdmin

2010-02-25 Thread Siddhant Goel
Hi,

Did you *really* go through this page -
http://wiki.apache.org/solr/CoreAdmin ?

On Thu, Feb 25, 2010 at 7:40 PM, Sudhakar_Thangavel
reactive...@yahoo.comwrote:


 Hi,
    I am new to Solr. I am not understanding the wiki clearly. Can anyone
 tell me how to configure CoreAdmin? I need step-by-step instructions.



 --
 View this message in context:
 http://old.nabble.com/CoreAdmin-tp27714440p27714440.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
- Siddhant


Ruby client fails to build

2010-01-20 Thread Siddhant Goel
Hi,

I'm using Solr 1.4 (and trying to use the Ruby client (solr-ruby) to access
it). The problem is that I just can't get it to work. :-)

If I run the tests (rake test), it fails giving me the following output -
/path/to/solr-ruby/test/unit/delete_test.rb:52: invalid multibyte char
(US-ASCII)
/path/to/solr-ruby/test/unit/delete_test.rb:52: syntax error, unexpected
$end, expecting ')'
request = Solr::Request::Delete.new(:query => 'ëäïöü')
 ^
from
/home/mango/.gem/ruby/1.9.1/gems/rake-0.8.7/lib/rake/rake_test_loader.rb:5:in
`block in main'
from
/home/mango/.gem/ruby/1.9.1/gems/rake-0.8.7/lib/rake/rake_test_loader.rb:5:in
`each'
from
/home/mango/.gem/ruby/1.9.1/gems/rake-0.8.7/lib/rake/rake_test_loader.rb:5:in
`main'
rake aborted!
Command failed with status (1): [/usr/bin/ruby -Ilib -r solr -r
test/unit...]


And If I try to build the gem anyway, it fails giving me the following error
(after quite a few lines of output) -
rake aborted!
private method `rm_f' called for File:Class
/path/to/solr-ruby/Rakefile:79:in `block (2 levels) in top (required)'


Could anyone please tell me what am I missing here?

Thanks,


-- 
- Siddhant


Re: Ruby client fails to build

2010-01-20 Thread Siddhant Goel
On Wed, Jan 20, 2010 at 4:19 PM, Erik Hatcher erik.hatc...@gmail.comwrote:

 Where are you getting your solr-ruby code from?  You can simply gem
 install it to pull in an already pre-built gem.


I'm just picking it up from the 1.4 release. I also tried checking out the
latest copy from svn, but the results were the same.

So I just figured out I was using the pre-built gem the wrong way. It's
working fine here. Is there any documentation that you could point me to?
Right now I'm just figuring out how to use it by trial and error and
random googling. The wiki page doesn't tell me much about all the search
options supported.

Thanks,

-- 
- Siddhant


Re: Queries of type field:value not functioning

2010-01-13 Thread Siddhant Goel
Hi,

Thanks for the responses.
q.alt did the job. Turns out that the dismax query parser was at fault, and
wasn't able to handle queries of the type *:*. Putting the query in q.alt,
or adding a defType=lucene (as pointed out to me on the irc channel) worked.
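For the record, the working combination looks roughly like this (the base URL is an assumption; only the parameters matter):

```python
from urllib.parse import urlencode

# With dismax as the default parser, a match-all query goes in q.alt, not q:
# dismax can't parse *:*, but q.alt is parsed with the standard lucene parser.
params = urlencode({
    "defType": "dismax",
    "q.alt": "*:*",
    "rows": "10",
})
url = "http://localhost:8983/solr/select?" + params  # assumed base URL
print(url)
```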

Thanks,


-- 
- Siddhant


Re: Reload synonyms

2010-01-05 Thread Siddhant Goel
On Tue, Jan 5, 2010 at 2:24 PM, Peter A. Kirk p...@alpha-solutions.dk wrote:

 Thanks for the answer. How does one reload a core? Is there an API, or a
 url one can use?


I think this should be it - http://wiki.apache.org/solr/CoreAdmin#RELOAD
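A sketch of hitting that RELOAD action (host, port, and core name are assumptions for illustration):

```python
from urllib.parse import urlencode

# Reloading a core via the CoreAdmin handler; "core0" is an example core name.
params = urlencode({"action": "RELOAD", "core": "core0"})
reload_url = "http://localhost:8983/solr/admin/cores?" + params

print(reload_url)
# GET this URL and the core re-reads its configuration (including
# synonyms.txt) without restarting the servlet container.
```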

-- 
- Siddhant


Re: Adaptive search?

2009-12-22 Thread Siddhant Goel
On Tue, Dec 22, 2009 at 12:01 PM, Ryan Kennedy rcken...@gmail.com wrote:

 This approach will be limited to applying a global rank to all the
 documents, which may have some unintended consequences. The most
 popular document in your index will be the most popular, even for
 queries for which it was never clicked on.


Right. Makes so much sense. Thanks for sharing.

-- 
- Siddhant


Adaptive search?

2009-12-17 Thread Siddhant Goel
Hi,

Does Solr provide adaptive searching? Can it adapt to user clicks within the
search results it provides? Or that has to be done externally?

I couldn't find anything on googling for it.

Thanks,

-- 
- Siddhant


Re: Adaptive search?

2009-12-17 Thread Siddhant Goel
Let's say we have a search engine (a simple front end - web app kind of a
thing - responsible for querying Solr and then displaying the results in a
human readable form) based on Solr. If a user searches for something, gets
quite a few search results, and then clicks on one such result - is there
any mechanism by which we can notify Solr to boost the score/relevance of
that particular result in future searches? If not, then any pointers on how
to go about doing that would be very helpful.
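One external approach (purely a sketch; Solr itself doesn't track clicks) is to keep a click counter per document, periodically reindex it into a numeric field, and feed it into a dismax boost function. The field names and the log() boost function below are assumptions about such a setup:

```python
from urllib.parse import urlencode

# Click counts live outside Solr; they are reindexed into a "clicks" field,
# and dismax's bf (boost function) parameter turns them into a score boost.
params = urlencode({
    "q": "user query",       # placeholder query
    "defType": "dismax",
    "qf": "title body",      # hypothetical searchable fields
    "bf": "log(clicks)",     # more clicks => higher score, dampened by log
})
url = "http://localhost:8983/solr/select?" + params  # assumed base URL
print(url)
```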

Thanks,

On Thu, Dec 17, 2009 at 7:50 PM, Paul Libbrecht p...@activemath.org wrote:

 What can it mean to adapt to user clicks ? Quite many things in my head.
 Do you have maybe a citation that inspires you here?

 paul


 Le 17-déc.-09 à 13:52, Siddhant Goel a écrit :


  Does Solr provide adaptive searching? Can it adapt to user clicks within
 the
 search results it provides? Or that has to be done externally?





-- 
- Siddhant