Re: Solr 4.0 - disappointing results sharding on 1 machine

2012-09-20 Thread Yonik Seeley
Depends on where the bottlenecks are I guess.

On a single system, increasing shards decreases throughput  (this
isn't specific to Solr).  The increased parallelism *can* decrease
latency to the degree that the parts that were parallelized outweigh
the overhead.

Going from one shard to two shards is also the most extreme case since
the unsharded case has no distributed overhead whatsoever.

What's the average CPU load during your tests?
How are you testing (i.e. how many requests are in progress at the same time?)
In your unsharded case, what's taking up the bulk of the time?

-Yonik
http://lucidworks.com


On Thu, Sep 20, 2012 at 9:39 AM, Tom Mortimer tom.m.f...@gmail.com wrote:
 Hi all,

 After reading 
 http://carsabi.com/car-news/2012/03/23/optimizing-solr-7x-your-search-speed/ 
 , I thought I'd do my own experiments. I used 2M docs from wikipedia, indexed 
 in Solr 4.0 Beta on a standard EC2 large instance. I compared an unsharded 
 and 2-shard configuration (the latter set up with SolrCloud following the 
 http://wiki.apache.org/solr/SolrCloud example). I wrote a simple python 
 script to randomly throw queries from a hand-compiled list at Solr. The only 
 extra I had turned on was facets (on document category).
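
(For illustration, a faceted query of the sort such a script might send; the collection URL, query term and "category" field name here are assumptions, not taken from the actual test:)

  http://localhost:8983/solr/select?q=wikipedia&facet=true&facet.field=category&rows=10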

 To my surprise, the performance of the 2-shard configuration is almost 
 exactly half that of the unsharded index -

 unsharded
 4983912891 results in 24920 searches; 0 errors
 70.02 mean qps
 0.35s mean query time, 2.25s max, 0.00s min
 90%   of qtimes = 0.83s
 99%   of qtimes = 1.42s
 99.9% of qtimes = 1.68s

 2-shard
 4990351660 results in 24501 searches; 0 errors
 34.07 mean qps
 0.66s mean query time, 694.20s max, 0.01s min
 90%   of qtimes = 1.19s
 99%   of qtimes = 2.12s
 99.9% of qtimes = 2.95s

 All caches were set to 4096 items, and performance looks ok in both cases 
 (hit ratios close to 1.0, 0 evictions). I gave the single VM -Xmx1G and each 
 shard VM -Xmx500M.

 I must be doing something stupid - surely this result is unexpected? Does 
 anybody have any thoughts where it might be going wrong?

 cheers,
 Tom



Re: Understanding fieldCache SUBREADER insanity

2012-09-19 Thread Yonik Seeley
The other thing to realize is that it's only insanity if it's
unexpected or not-by-design (so the term is rather mis-named).
It's more for core developers - if you are just using Solr without
custom plugins, don't worry about it.

-Yonik
http://lucidworks.com


On Wed, Sep 19, 2012 at 3:27 PM, Tomás Fernández Löbbe
tomasflo...@gmail.com wrote:
 Hi Aaron, here there is some information about the insanity count:
 http://wiki.apache.org/solr/SolrCaching#The_Lucene_FieldCache

 As for the SUBREADER type, the javadocs say:
 Indicates an overlap in cache usage on a given field in sub/super readers.

 This probably means that you are using the same field for faceting and for
 sorting (tf_normalizedTotalHotttnesss), sorting uses the segment level
 cache and faceting uses by default the global field cache. This can be a
 problem because the field is duplicated in cache, and then it uses twice
 the memory.

 One way to solve this would be to change the faceting method on that field
 to 'fcs', which uses segment level cache (but may be a little bit slower).
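
For reference, a sketch of that switch as request parameters (facet.method can be overridden per field with the f.<field>. prefix; the field name is the one from the thread):

  facet.field=tf_normalizedTotalHotttnesss&f.tf_normalizedTotalHotttnesss.facet.method=fcs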

 Tomás


 On Wed, Sep 19, 2012 at 3:16 PM, Aaron Daubman daub...@gmail.com wrote:

 Hi all,

 In reviewing a solr instance with somewhat variable performance, I
 noticed that its fieldCache stats show an insanity_count of 1 with the
 insanity type SUBREADER:

 ---snip---
 insanity_count : 1
 insanity#0 : SUBREADER: Found caches for descendants of
 ReadOnlyDirectoryReader(segments_k
 _6h9(3.3):C17198463)+tf_normalizedTotalHotttnesss
 'ReadOnlyDirectoryReader(segments_k

 _6h9(3.3):C17198463)'='tf_normalizedTotalHotttnesss',float,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_FLOAT_PARSER=[F#1965982057
 'ReadOnlyDirectoryReader(segments_k

 _6h9(3.3):C17198463)'='tf_normalizedTotalHotttnesss',float,null=[F#1965982057

 'MMapIndexInput(path=/io01/p/solr/playlist/a/playlist/index/_6h9.frq)'='tf_normalizedTotalHotttnesss',float,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_FLOAT_PARSER=[F#1308116426
 ---snip---

 How can I decipher what this means and what, if anything, I should do
 to fix/improve the insanity?

 Thanks,
  Aaron



Re: Nodes cannot recover and become unavailable

2012-09-19 Thread Yonik Seeley
On Wed, Sep 19, 2012 at 4:25 PM, Mark Miller markrmil...@gmail.com wrote:
 bq. I believe there were some changes made to the clusterstate.json
 recently that are not backwards compatible.

 Indeed - I think yonik committed something the other day - we prob
 should send an email out about this.

Yeah, I was just in the process of committing another change, updating
CHANGES and sending a message.

-Yonik
http://lucidworks.com


Re: Understanding fieldCache SUBREADER insanity

2012-09-19 Thread Yonik Seeley
 already-optimized, single-segment index

That part is interesting... if true, then the type of insanity you
saw should be impossible, and either the insanity detection or
something else is broken.

-Yonik
http://lucidworks.com


SolrCloud clusterstate.json layout changes

2012-09-19 Thread Yonik Seeley
Folks,

Some changes have been committed in the past few days related to
SOLR-3815 as part of the groundwork
for SOLR-3755 (shard splitting).

The resulting clusterstate.json now looks like the following:

{"collection1":{
  "shard1":{
    "range":"8000-",
    "replicas":{"Rogue:8983_solr_collection1":{
      "shard":"shard1",
      "roles":null,
      "state":"active",
      "core":"collection1",
      "collection":"collection1",
      "node_name":"Rogue:8983_solr",
      "base_url":"http://Rogue:8983/solr",
      "leader":"true"}}},
  "shard2":{
    "range":"0-7fff",
    "replicas":{


Note the addition of the "replicas" level to make room for other
properties at the shard level, such as "range" (which defines what hash
range belongs in what shard).
Although "range" now exists, it is ignored by the current code (i.e.
indexing still uses hash MOD nShards to place documents).

-Yonik
http://lucidworks.com


Re: SolrCloud clusterstate.json layout changes

2012-09-19 Thread Yonik Seeley
On Wed, Sep 19, 2012 at 5:27 PM, Yonik Seeley yo...@lucidworks.com wrote:
 Folks,

 Some changes have been committed in the past few days related to
 SOLR-3815 as part of the groundwork
 for SOLR-3755 (shard splitting).

 The resulting clusterstate.json now looks like the following:

 {"collection1":{
   "shard1":{
     "range":"8000-",
     "replicas":{"Rogue:8983_solr_collection1":{
       "shard":"shard1",
       "roles":null,
       "state":"active",
       "core":"collection1",
       "collection":"collection1",
       "node_name":"Rogue:8983_solr",
       "base_url":"http://Rogue:8983/solr",
       "leader":"true"}}},
   "shard2":{
     "range":"0-7fff",
     "replicas":{


 Note the addition of the "replicas" level to make room for other
 properties at the shard level, such as "range" (which defines what hash
 range belongs in what shard).
 Although "range" now exists, it is ignored by the current code (i.e.
 indexing still uses hash MOD nShards to place documents).

Correction - MOD was just one of the earliest methods, not the
previous method. The previous method split the hash range up equally
between all shards, and placement should still be the same when we
switch to paying attention to the ranges.

-Yonik
http://lucidworks.com


Re: SOLR memory usage jump in JVM

2012-09-18 Thread Yonik Seeley
On Tue, Sep 18, 2012 at 7:45 AM, Bernd Fehling
bernd.fehl...@uni-bielefeld.de wrote:
 I used GC in different situations and tried back and forth.
 Yes, it reduces the used heap memory, but not by 5GB.
 Even so that GC from jconsole (or jvisualvm) is Full GC.

Whatever Full GC means ;-)
In the past at least, I've found that I had to hit Full GC from
jconsole many times in a row until heap usage stabilizes at its
lowest point.

You could check fieldCache and fieldValueCache to see how many entries
there are before and after the memory bump.
If that doesn't show anything different, I guess you may need to
resort to a heap dump before and after.

 But while you bring GC into this, there is another interesting thing.
 - I have one slave running for a week which ends up around 18 to 20GB of heap 
 memory.
 - the slave goes offline for replication (no user queries on this slave)
 - the slave gets replicated and starts a new searcher
 - the heap memory of the slave is still around 11 to 12GB
 - then I initiate a Full GC from jconsole which brings it down to about 8GB
 - then I call optimize (on an optimized index) and it then drops to 6.5GB like 
 a fresh started system


 I have already looked through Uwe's blog but he says "...As a rule of thumb: 
 Don’t use more
 than 1/4 of your physical memory as heap space for Java running 
 Lucene/Solr..."
 That would be on my server 8GB for JVM heap, can't believe that the system
 will run for longer than 10 minutes with 8GB heap.

As you probably know, it depends hugely on the usecases/queries: some
configurations would be fine with a small amount of heap, other
configurations that facet and sort on tons of different fields would
not be.


-Yonik
http://lucidworks.com


Re: FilterCache Memory consumption high

2012-09-17 Thread Yonik Seeley
On Mon, Sep 17, 2012 at 3:44 PM, Mike Schultz mike.schu...@gmail.com wrote:
 So I'm figuring 3MB per entry.  With CacheSize=512 I expect something like
 1.5GB of RAM, but with the server in steady state after 1/2 hour, it is 7GB
 larger than without the cache.

Heap size and memory use aren't quite the same thing.
Try running jconsole (it comes with every JDK), attaching to the
process, and then make it run multiple garbage collections to see what
the heap shrinks down to.

-Yonik
http://lucidworks.com


Re: [Solr4 beta] error 503 on commit

2012-09-12 Thread Yonik Seeley
On Tue, Sep 11, 2012 at 10:52 AM, Radim Kolar h...@filez.com wrote:
 After investigating more, here is the tomcat log herebelow. It is indeed
 the same problem: exceeded limit of maxWarmingSearchers=2,.

 couldn't Solr close the oldest warming searcher and replace it with the
 new one?

That approach can easily lead to starvation (i.e. you never get a new
searcher usable for queries).

-Yonik
http://lucidworks.com


Re: solr.StrField with stored=true useless or bad?

2012-09-11 Thread Yonik Seeley
On Tue, Sep 11, 2012 at 7:03 PM,  sy...@web.de wrote:
 The purpose of stored=true is to store the raw string data alongside the 
 analyzed/transformed data for display purposes. This is fine for an 
 analyzed solr.TextField, but for a StrField both values are the same. So is 
 there any reason to apply stored=true on a StrField as well?

You're over-thinking things a bit ;-)

If you want to search on it: index it.
If you want to return it in search results: store it.
Those are two orthogonal things (even for StrField).

Why?  Indexed means full-text inverted index: words (terms) point to
documents.  It's not easy/fast for a given document to find out what
terms point to it.  Stored fields are all stored together and can be
retrieved together given a document id.  Hence search finds lists of
document ids (via indexed fields), and can then return any of the
stored fields for those document ids.
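
A sketch of how those two flags look in schema.xml terms (the field names here are only examples):

  <!-- searchable and returnable -->
  <field name="title" type="string" indexed="true" stored="true"/>
  <!-- searchable but never returned in results -->
  <field name="title_search" type="string" indexed="true" stored="false"/>
  <!-- returnable but not directly queryable -->
  <field name="display_only" type="string" indexed="false" stored="true"/>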

-Yonik
http://lucidworks.com


Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Yonik Seeley
On Fri, Sep 7, 2012 at 9:39 AM, Erik Hatcher erik.hatc...@gmail.com wrote:
 A trie field probably doesn't work properly, as it indexes multiple terms 
 per value and you'd get odd values.

I don't know about pivot faceting, but all of the other types of
faceting take this into account (hence faceting works fine on trie
fields).

-Yonik
http://lucidworks.com


Re: Solr 4.0alpha: edismax complaints on certain characters

2012-09-06 Thread Yonik Seeley
I believe this is caused by the regex support in
https://issues.apache.org/jira/browse/LUCENE-2039

It certainly seems wrong to interpret a slash in the middle of the
word as the start of a regex, so I've reopened the issue.

-Yonik
http://lucidworks.com


On Thu, Sep 6, 2012 at 9:34 AM, Alexandre Rafalovitch
arafa...@gmail.com wrote:
 Hello,

 I was under the impression that edismax was supposed to be crash proof
 and just ignore bad syntax. But I am either misconfiguring it or hit a
 weird bug. I basically searched for text containing '/' and got this:

 {
   'responseHeader'=>{
     'status'=>400,
     'QTime'=>9,
     'params'=>{
       'qf'=>'TitleEN DescEN',
       'indent'=>'true',
       'wt'=>'ruby',
       'q'=>'foo/bar',
       'defType'=>'edismax'}},
   'error'=>{
     'msg'=>'org.apache.lucene.queryparser.classic.ParseException:
 Cannot parse \'foo/bar \': Lexical error at line 1, column 9.
 Encountered: <EOF> after : "/bar "',
     'code'=>400}}

 Is that normal? If it is, is there a known list of characters I need
 to escape or do I just have to catch the exception and tell user to
 not do this again?

 Regards,
Alex.

 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


Re: UnInvertedField limitations

2012-09-06 Thread Yonik Seeley
It's actually limited to 24 bits to point to the term list in a
byte[], but there are 256 different arrays, so the maximum capacity is
4B bytes of un-inverted terms. Each bucket is limited to 4B/256 though,
so the real limit can come in at a little less, depending on luck.

From the comments:

 *   There is a single int[maxDoc()] which either contains a pointer
into a byte[] for
 *   the termNumber lists, or directly contains the termNumber list if
it fits in the 4
 *   bytes of an integer.  If the first byte in the integer is 1, the
next 3 bytes
 *   are a pointer into a byte[] where the termNumber list starts.
 *
 *   There are actually 256 byte arrays, to compensate for the fact
that the pointers
 *   into the byte arrays are only 3 bytes long.  The correct byte
array for a document
 *   is a function of it's id.


-Yonik
http://lucidworks.com


On Thu, Sep 6, 2012 at 6:33 PM, Fuad Efendi f...@efendi.ca wrote:
 Hi Jack,


 24bit = 16M possibilities, it's clear; just to confirm... the rest is
 unclear, why 4-byte can have 4 million cardinality? I thought it is 4
 billions...


 And, just to confirm: UnInvertedField allows 16M cardinality, correct?




 On 12-08-20 6:51 PM, Jack Krupansky j...@basetechnology.com wrote:

It appears that there is a hard limit of 24-bits or 16M for the number of
bytes to reference the terms in a single field of a single document. It
takes 1, 2, 3, 4, or 5 bytes to reference a term. If it took 4 bytes, that
would allow 16/4 or 4 million unique terms - per document. Do you have such
large documents? This appears to be a hard limit based on the 24 bits in a
Java int.

You can try facet.method=enum, but that may be too slow.

What release of Solr are you running?

-- Jack Krupansky

-Original Message-
From: Fuad Efendi
Sent: Monday, August 20, 2012 4:34 PM
To: Solr-User@lucene.apache.org
Subject: UnInvertedField limitations

Hi All,


I have a problem… (Yonik, please!) Help me: what are the term count limits? I
possibly have 256,000,000 different terms in a field… or 16,000,000?

Thanks!


2012-08-20 16:20:19,262 ERROR [solr.core.SolrCore] - [pool-1-thread-1] - :
org.apache.solr.common.SolrException: Too many values for UnInvertedField
faceting on field enrich_keywords_string_mv
        at org.apache.solr.request.UnInvertedField.<init>(UnInvertedField.java:179)
        at org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:668)
        at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:326)
        at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:423)
        at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:206)
        at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:85)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1561)




--
Fuad Efendi
http://www.tokenizer.ca







Re: Injest pauses

2012-08-29 Thread Yonik Seeley
On Wed, Aug 29, 2012 at 11:58 AM, Voth, Brad (GE Corporate)
brad.v...@ge.com wrote:
 Anyone know the actual status of SOLR-2565, it looks to be marked as resolved 
 in 4.* but I am still seeing long pauses during commits using 4.*

SOLR-2565 is definitely committed - adds are no longer blocked by
commits (at least at the Solr level).

-Yonik
http://lucidworks.com


Re: Ordering of fields

2012-08-29 Thread Yonik Seeley
In 4.0 you can use the def function with pseudo-fields (returning
function results as doc field values)
http://wiki.apache.org/solr/FunctionQuery#def

fl=a,b,c:def(myfield,10)


-Yonik
http://lucidworks.com


On Wed, Aug 29, 2012 at 2:39 PM, Rohit Harchandani rhar...@gmail.com wrote:
 Hi all,
 Is there a way to specify the order in which fields are returned by solr?
 Also, is it possible to make solr return a blank/default value for a field
 not present for a particular document, apart from giving a default value in
 the schema and having it indexed?
 Thanks,
 Rohit Harchandani


Re: Sort on dynamic field

2012-08-16 Thread Yonik Seeley
On Thu, Aug 16, 2012 at 8:00 AM, Peter Kirk p...@alpha-solutions.dk wrote:
 Hi, a question about sorting and dynamic fields in Solr Specification 
 Version: 3.6.0.2012.04.06.11.34.07.

 I have a field defined like
 <dynamicField name="*_int" type="int" indexed="true" stored="true"
 multiValued="false"/>

 Where type int is
 <fieldType name="int" class="solr.TrieIntField" precisionStep="0"
 omitNorms="true" positionIncrementGap="0"/>

Try adding sortMissingLast=true to this type.

-Yonik
http://lucidworks.com


Re: Tlog vs. buffer + softcommit.

2012-08-10 Thread Yonik Seeley
On Fri, Aug 10, 2012 at 11:19 AM, Bing Hua bh...@cornell.edu wrote:
 Thanks for the information. It definitely helps a lot. There're
 numDeletesToKeep = 1000; numRecordsToKeep = 100; in UpdateLog so this should
 probably be what you're referring to.

 However when I was doing indexing the total size of TLogs kept on
 increasing. It doesn't sound like the case where there's a cap for number of
 documents?

No, there is no cap.  That's why the following is in solrconfig.xml:

 <autoCommit>
   <maxTime>15000</maxTime>
   <openSearcher>false</openSearcher>
 </autoCommit>

That causes a hard commit every 15 seconds w/o opening a new searcher
(i.e. you still retain control over exactly when the searcher view
changes if you want).
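
If near-realtime visibility is also wanted, a soft autocommit can sit alongside that (the 1 second value here is just an example):

  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>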

 Also for peersync, can I find some intro online?

Nothing yet - but the idea is pretty simple... sync up with peers by
getting recent updates if possible.  If that fails, we get in sync by
copying over a full index.

-Yonik
http://lucidworks.com


Re: Tuning caching of geofilt queries

2012-08-10 Thread Yonik Seeley
On Fri, Aug 10, 2012 at 1:47 PM, David Smiley (@MITRE.org)
dsmi...@mitre.org wrote:
 Information I've read vary on exactly what is the accuracy of float
 vs double but at a kilometer there's no question a double is overkill.

Back of the envelope:

23 mantissa bits + 1 implied bit == 24 effective mantissa bits in a 32
bit float.

40,000 km circumference / (2^24) = .0024 km  (i.e. our resolution at
the equator is 2.4m at best - there will be some lost unused space at
the beginning and end of the +-180 number-line).

Is that in line with what you've read?

-Yonik
http://lucidworks.com


Re: Documentation on the new updateLog transaction log feature?

2012-08-10 Thread Yonik Seeley
On Fri, Aug 10, 2012 at 2:31 PM, David Smiley (@MITRE.org)
dsmi...@mitre.org wrote:
 Is there any documentation on the updateLog transaction log feature in Solr
 4?

Not much beyond what's in solrconfig.xml

 I started a quick prototype using Solr 4 alpha with a fairly structured
 schema; no big text.  I disabled auto-commit which came pre-enabled and
 there's no soft-commit either.  With CURL I posted a 1.8GB CSV file.  After 
 some time, I find this huge ~2.6GB transaction log file that didn't want to
 go away.  FWIW A small number of records had errors, and maybe half of the
 records were duplicates of existing records in the file because of
 duplicated IDs.  When I restarted Solr, Solr spent a long time reading from
 the transaction log before it was ready.  But the file is still there; I
 manually deleted it.  This isn't a great user experience for a feature I
 have no intention of using


Simply comment out the following in solrconfig.xml

<updateLog>
  <str name="dir">${solr.data.dir:}</str>
</updateLog>

 (no Solr Cloud for this project, and no so-called
 realtime get which has always struck me as an odd feature).

It's often pretty important for anyone using Solr as a NoSQL store.

-Yonik
http://lucidworks.com


Re: null:java.lang.RuntimeException: [was class java.net.SocketTimeoutException] null

2012-08-09 Thread Yonik Seeley
On Thu, Aug 9, 2012 at 10:11 AM, Markus Jelsma
markus.jel...@openindex.io wrote:
 I've increased the connection time out on all 10 Tomcats from 1000ms to 
 5000ms. Indexing a larger amount of batches seems to run fine now. This, 
 however, does not really answer the issue. What is exactly timing out here 
 and why?

It can be any communication with tomcat for any reason.
For example, a commit needs to flush and fsync all segments, applying
buffered deletes, etc, then open a new searcher and run any configured
warming queries or autowarming.  That can take some time.  It's even
longer if you want to optimize.  Or a long GC pause could cause a
socket timeout.

For the stock jetty server, we set it to 50,000ms, which still may be
too short for some things frankly.

Here's the jetty documentation for the parameter:

maxIdleTime:
Set the maximum Idle time for a connection, which roughly translates
to the Socket.setSoTimeout(int) call, although with NIO
implementations other mechanisms may be used to implement the timeout.
The max idle time is applied: when waiting for a new request to be
received on a connection; when reading the headers and content of a
request; when writing the headers and content of a response. Jetty
interprets this value as the maximum time between some progress being
made on the connection. So if a single byte is read or written, then
the timeout (if implemented by jetty) is reset. However, in many
instances, the reading/writing is delegated to the JVM, and the
semantic is more strictly enforced as the maximum time a single
read/write operation can take. Note, that as Jetty supports writes of
memory mapped file buffers, then a write may take many 10s of seconds
for large content written to a slow device.
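
For reference, in the stock example of that era the setting lives on the connector in example/etc/jetty.xml, roughly like this (a sketch; check your own jetty.xml for the exact connector class and values):

  <Call name="addConnector">
    <Arg>
      <New class="org.eclipse.jetty.server.bio.SocketConnector">
        <Set name="port"><SystemProperty name="jetty.port" default="8983"/></Set>
        <Set name="maxIdleTime">50000</Set>
      </New>
    </Arg>
  </Call>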


-Yonik
http://lucidimagination.com








 I assume it's the forwarding of documents from the `indexing node` to
the correct shard leader but with 512 maxThreads it should be fine.

 Any hints?

 Thanks



 -Original message-
 From:Markus Jelsma markus.jel...@openindex.io
 Sent: Wed 08-Aug-2012 00:10
 To: solr-user@lucene.apache.org
 Subject: RE: null:java.lang.RuntimeException: [was class 
 java.net.SocketTimeoutException] null

 Jack,

 There are no peculiarities in the JVM graphs. Only increase in used threads 
 and GC time. Heap space is collected quickly and doesn't suddenly increase. 
 There's only 256MB available for the heap but it's fine.


 Yonik,

 I'll increase the time out to five seconds tomorrow and try to reproduce it 
 with a low batch size of 32. Judging from what i've seen it should throw an 
 error quickly with such a low batch size. However, what is timing out here? 
 My client connection to the indexing node or something else that i don't see?

 Unfortunately no Jetty here (yet).

 Thanks
 Markus


 -Original message-
  From:Yonik Seeley yo...@lucidimagination.com
  Sent: Tue 07-Aug-2012 23:54
  To: solr-user@lucene.apache.org
  Subject: Re: null:java.lang.RuntimeException: [was class 
  java.net.SocketTimeoutException] null
 
  Could this be just a simple case of a socket timeout?  Can you raise
  the timeout on request threads in Tomcat?
  It's a lot easier to reproduce/diagnose stuff like this when people
  use the stock jetty server shipped with Solr.
 
  -Yonik
  http://lucidimagination.com
 
 
  On Tue, Aug 7, 2012 at 5:39 PM, Markus Jelsma
  markus.jel...@openindex.io wrote:
 A significant detail is the batch size which we set to 64 documents due to 
 earlier memory limitations. We index segments of roughly 300-500k 
 records each time. Lowering the batch size to 32 led to an early 
   internal server error and the stack trace below. Increasing it to 128 
   allowed us to index some more records but it still throws the error 
   after 200k+ indexed records.
  
   Increasing it even more to 256 records per batch allowed us to index an 
   entire segment without errors.
  
   Another detail is that we do not restart the cluster between indexing 
   attempts so it seems that something only builds up during indexing 
   (nothing seems to leak afterwards) and throws an error.
  
   Any hints?
  
   Thanks,
   Markus
  
  
  
   -Original message-
   From:Markus Jelsma markus.jel...@openindex.io
   Sent: Tue 07-Aug-2012 20:08
   To: solr-user@lucene.apache.org
   Subject: null:java.lang.RuntimeException: [was class 
   java.net.SocketTimeoutException] null
  
   Hello,
  
   We sometimes see the error below in our `master` when indexing. Our 
   master is currently the node we send documents to - we've not yet 
   implemented CloudSolrServer in Apache Nutch. This causes the indexer to 
   crash when using Nutch locally, the task is retried when running on 
   Hadoop. We're running it locally in this test set up so there's only 
   one indexing thread.
  
   Anyway, for me it's quite a cryptic error because i don't know what 
   connection has timed out, i assume a connection from the indexing node 
   to some other node in the cluster 

Re: Tlog vs. buffer + softcommit.

2012-08-09 Thread Yonik Seeley
On Thu, Aug 9, 2012 at 5:39 PM, Bing Hua bh...@cornell.edu wrote:
 I'm a bit confused with the purpose of Transaction Logs (Update Logs) in
 Solr.

 My understanding is, update request comes in, first the new item is put in
 RAM buffer as well as T-Log. After a soft commit happens, the new item
 becomes searchable but not hard committed in stable storage. Configuring
 soft commit interval to 1 sec achieves NRT.

 Then what exactly T-Log is doing in this scenario?

It serves realtime-get... when even 1 second isn't acceptable (i.e.
you need to be guaranteed of getting the latest version of a
document):
http://searchhub.org/dev/2011/09/07/realtime-get/
and also allows for a peer to ask "give me the list of the last update
events you know about".
You can also kill -9 the server and solr will automatically recover
from the log.
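
For example, with the stock config (which maps the realtime get handler to /get), a document can be fetched by id even before any soft or hard commit has made it searchable (the id value here is just a placeholder):

  http://localhost:8983/solr/get?id=mydoc123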

 what circumstances is it being cleared?

A new log file is created every time a hard commit is done, and old
log files are removed if newer log files contain enough entries to
satisfy the needs of what I call peersync in solr cloud (currently
~100 updates IIRC).


-Yonik
http://lucidimagination.com


Re: Recovery problem in solrcloud

2012-08-08 Thread Yonik Seeley
Stack trace looks normal - it's just a multi-term query instantiating
a bitset.  The memory is being taken up somewhere else.
How many documents are in your index?
Can you get a heap dump or use some other memory profiler to see
what's taking up the space?

 if I stop querying for more than ten minutes, the solr instance will start normally.

Maybe queries are piling up in threads before the server is ready to
handle them and then trying to handle them all at once gives an OOM?
Is this live traffic or a test?  How many concurrent requests get sent?

-Yonik
http://lucidimagination.com


On Wed, Aug 8, 2012 at 2:43 AM, Jam Luo cooljam2...@gmail.com wrote:
 Aug 06, 2012 10:05:55 AM org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java
 heap space
 at
 org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:456)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:284)
 at
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
 at
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
 at
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
 at
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:499)
 at
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
 at
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
 at
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
 at
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
 at
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
 at
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
 at
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
 at
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
 at
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
 at org.eclipse.jetty.server.Server.handle(Server.java:351)
 at
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
 at
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
 at
 org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:900)
 at
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:954)
 at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857)
 at
 org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
 at
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
 at
 org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
 at
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
 at
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.OutOfMemoryError: Java heap space
  at org.apache.lucene.util.FixedBitSet.<init>(FixedBitSet.java:54)
 at
 org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:104)
 at
 org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:129)
 at
 org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:318)
 at
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:507)
 at
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
 at
 org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1394)
 at
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1269)
 at
 org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:384)
 at
 org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:420)
 at
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1544)
 at
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
 at
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
 at
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
 at
 

Re: Syntax for parameter substitution in function queries?

2012-08-07 Thread Yonik Seeley
On Tue, Aug 7, 2012 at 3:01 PM, Timothy Hill timothy.d.h...@gmail.com wrote:
 Hello, all ...

 According to http://wiki.apache.org/solr/FunctionQuery/#What_is_a_Function.3F,
 it is possible under Solr 4.0 to perform parameter substitutions
 within function queries.

 However, I can't get the syntax provided in the documentation there to
 work *at all* with Solr 4.0 out of the box: the only location at which
 function queries can be specified, it seems, is in the 'fl' parameter.
 And attempts at parameter substitutions here fail. Using (haphazardly
 guessed) syntax like

 select?q=*:*&fl=*, test_id:if(exists(employee), employee_id,
 socialsecurity_id), boost_id:sum($test_id, 10)&wt=xml

 results in the following error

 Error parsing fieldname: Missing param test_id while parsing function
 'sum($test_id, 10)'

test_id needs to be an actual request parameter.

This worked for me on the example data:
http://localhost:8983/solr/query?q=*:*&fl=*,%20test_id:if(exists(price),id,name),%20boost_id:sum($param,10)&param=price

-Yonik
http://lucidimagination.com


Re: null:java.lang.RuntimeException: [was class java.net.SocketTimeoutException] null

2012-08-07 Thread Yonik Seeley
Could this be just a simple case of a socket timeout?  Can you raise
the timeout on request threads in Tomcat?
It's a lot easier to reproduce/diagnose stuff like this when people
use the stock jetty server shipped with Solr.

-Yonik
http://lucidimagination.com


On Tue, Aug 7, 2012 at 5:39 PM, Markus Jelsma
markus.jel...@openindex.io wrote:
 A significant detail is the batch size which we set to 64 documents due to 
 earlier memory limitations. We index segments of roughly 300-500k records 
 each time. Lowering the batch size to 32 led to an early internal server 
 error and the stack trace below. Increasing it to 128 allowed us to index 
 some more records but it still throws the error after 200k+ indexed records.

 Increasing it even more to 256 records per batch allowed us to index an 
 entire segment without errors.

 Another detail is that we do not restart the cluster between indexing 
 attempts so it seems that something only builds up during indexing (nothing 
 seems to leak afterwards) and throws an error.

 Any hints?

 Thanks,
 Markus



 -Original message-
 From:Markus Jelsma markus.jel...@openindex.io
 Sent: Tue 07-Aug-2012 20:08
 To: solr-user@lucene.apache.org
 Subject: null:java.lang.RuntimeException: [was class 
 java.net.SocketTimeoutException] null

 Hello,

 We sometimes see the error below in our `master` when indexing. Our master 
 is currently the node we send documents to - we've not yet implemented 
 CloudSolrServer in Apache Nutch. This causes the indexer to crash when using 
 Nutch locally, the task is retried when running on Hadoop. We're running it 
 locally in this test set up so there's only one indexing thread.

 Anyway, for me it's quite a cryptic error because i don't know what 
 connection has timed out, i assume a connection from the indexing node to 
 some other node in the cluster when it passes a document to the correct 
 leader? Each node of the 10 node cluster has the same configuration, Tomcat 
 is configured with maxThreads=512 and a time out of one second.

 We're using today's trunk in this test set up and we cannot reliably 
 reproduce the error. We've seen the error before so it's not a very recent 
 issue. No errors are found in the other node's logs.

 2012-08-07 17:52:05,260 ERROR [solr.servlet.SolrDispatchFilter] - 
 [http-8080-exec-6] - : null:java.lang.RuntimeException: [was class 
 java.net.SocketTimeoutException] null
 at 
 com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
 at 
 com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)
 at 
 com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
 at 
 com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
 at 
 org.apache.solr.handler.loader.XMLLoader.readDoc(XMLLoader.java:376)
 at 
 org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:229)
 at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:157)
 at 
 org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
 at 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1656)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:454)
 at 
 io.openindex.solr.servlet.HttpResponseSolrDispatchFilter.doFilter(HttpResponseSolrDispatchFilter.java:219)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at 
 org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
 at 
 org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:744)
 at 
 org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2274)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.SocketTimeoutException
 

Re: Urgent: Facetable but not Searchable Field

2012-08-01 Thread Yonik Seeley
On Wed, Aug 1, 2012 at 7:58 AM, jayakeerthi s mail2keer...@gmail.com wrote:
 We have a requirement, where we need to implement 2 fields as Facetable,
 but the values of the fields should not be Searchable.

The user fields uf feature of the edismax parser may work for you:

http://wiki.apache.org/solr/ExtendedDisMax#uf_.28User_Fields.29
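
A sketch of what that could look like as request parameters (field names are placeholders): let users query any field except the two facet-only ones, while still faceting on them:

  defType=edismax
  qf=text
  uf=* -facet_only_a -facet_only_b
  facet=true
  facet.field=facet_only_a
  facet.field=facet_only_b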

-Yonik
http://lucidimagination.com


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-24 Thread Yonik Seeley
On Tue, Jul 24, 2012 at 8:24 AM, Nagendra Nagarajayya
nnagaraja...@transaxtions.com wrote:
 SolrIndexSearcher is a heavy object with caches, etc.

As I've said, the caches are configurable, and it's trivial to disable
all caching (to the point where the cache objects are not even
created).

 The reader member is not replaced in the existing SolrIndexSearcher object.
 The IndexSearcher.getIndexReader() method has been overriden in
 SolrIndexSearcher and all direct reader member access has been replaced with
 a getIndexReader() method call allowing a NRT reader to be supplied when
 realtime is enabled.

In a single Solr request (that runs through multiple components like
query, highlight, facet, and response writing),
does IndexSearcher.getIndexReader() always return the same reader?  If
not, this breaks pretty much every standard solr component - but it
will only be apparent under load, and if you are carefully sanity
checking the results.

-Yonik
http://lucidimagination.com


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-23 Thread Yonik Seeley
On Mon, Jul 23, 2012 at 11:37 AM, Nagendra Nagarajayya
nnagaraja...@transaxtions.com wrote:
 Realtime NRT algorithm enables NRT functionality in
 Solr by not closing the Searcher object  and so is very fast. I am in the
 process of contributing the algorithm back to Apache Solr as a patch.

Since you're in the process of contributing this back, perhaps you
could explain your approach - it never made sense to me.

Replacing the reader in an existing SolrIndexSearcher as you do means
that all the related caches will be invalid (meaning you can't use
solr's caches).  You could just ensure that there is no auto-warming
set up for Solr's caches (which is now the default), or you could
disable caching altogether.  It's not clear what you're comparing
against when you claim it's faster.

There are also consistency and concurrency issues with replacing the
reader in an existing SolrIndexSearcher, which is supposed to have a
static view of the index.  If a reader replacement happens in the
middle of a request, it's bound to cause trouble, including returning
the wrong documents!

-Yonik
http://lucidimagination.com


Re: SOLR 4 Alpha Out Of Mem Err

2012-07-18 Thread Yonik Seeley
I think what makes the most sense is to limit the number of
connections to another host.  A host only has so many CPU resources,
and beyond a certain point throughput would start to suffer anyway
(and then only make the problem worse).  It also makes sense in that a
client could generate documents faster than we can index them (either
for a short period of time, or on average) and having flow control to
prevent unlimited buffering (which is essentially what this is) makes
sense.

Nick - when you switched to HttpSolrServer, things worked because this
added an explicit flow control mechanism.
A single request (i.e. an add with one or more documents) is fully
indexed to all endpoints before the response is returned.  Hence if
you have 10 indexing threads and are adding documents in batches of
100, there can be only 1000 documents buffered in the system at any
one time.

-Yonik
http://lucidimagination.com


Re: Error 404 on every request

2012-07-17 Thread Yonik Seeley
On Tue, Jul 17, 2012 at 6:01 AM, Nils Abegg nils.ab...@ffuf.de wrote:
 I have installed the 4.0 Alpha with the build-in Jetty Server on Ubuntu 
 Server 12.04…i followed this tutorial to set it up:
 http://kingstonlabs.blogspot.de/2012/06/installing-solr-36-on-ubuntu-1204.html

Instead of trying to install Solr, I'd suggest just starting with
the stock server included with the binary distribution.
If you have Java in your path, you just do:

cd example
java -jar start.jar

-Yonik
http://lucidimagination.com


Re: Computed fields - can I put a function in fl?

2012-07-16 Thread Yonik Seeley
On Mon, Jul 16, 2012 at 4:43 AM, maurizio1976
maurizio.picc...@gmail.com wrote:
 Yes,
 sorry Just a typo.
 I meant  
 q=*:*&fq=&start=0&rows=10&qt=&wt=&explainOther=&fl=product:(if(show_product:true,
 product, )
 thanks

Functions normally derive their values from the fieldCache... there
isn't currently a function to load stored fields (e.g. your product
field), but it's not a bad idea (given this usecase).

Here's an example with the exampledocs that shows IN_STOCK_PRICE only
if the item is in stock, and otherwise shows 0.
This works because price is a single-valued indexed field that the
fieldCache works on.

http://localhost:8983/solr/query?
  q=*:*
&fl=id, inStock, IN_STOCK_PRICE:if(inStock,price,0)

-Yonik
http://lucidimagination.com


Re: SOLR 4 Alpha Out Of Mem Err

2012-07-15 Thread Yonik Seeley
Do you have the following hard autoCommit in your config (as the stock
server does)?

 <autoCommit>
   <maxTime>15000</maxTime>
   <openSearcher>false</openSearcher>
 </autoCommit>

This is now fairly important since Solr now tracks information on
every uncommitted document added.
At some point we should probably hardcode some mechanism based on
number of documents or time.

-Yonik
http://lucidimagination.com


Re: SOLR 4 Alpha Out Of Mem Err

2012-07-15 Thread Yonik Seeley
On Sun, Jul 15, 2012 at 11:52 AM, Nick Koton nick.ko...@gmail.com wrote:
 Do you have the following hard autoCommit in your config (as the stock
 server does)?
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

 I have tried with and without that setting.  When I described running with
 auto commit, that setting is what I mean.

OK cool.  You should be able to run the stock server (i.e. with this
autocommit) and blast in updates all day long - it looks like you have
more than enough memory.  If you can't, we need to fix something.  You
shouldn't need explicit commits unless you want the docs to be
searchable at that point.

 Solrj multi-threaded client sends several 1,000 docs/sec

Can you expand on that?  How many threads at once are sending docs to
solr?  Is each request a single doc or multiple?

-Yonik
http://lucidimagination.com


Re: SOLR 4 Alpha Out Of Mem Err

2012-07-15 Thread Yonik Seeley
On Sun, Jul 15, 2012 at 12:52 PM, Jack Krupansky
j...@basetechnology.com wrote:
 Maybe your rate of update is so high that the commit never gets a chance to
 run.

I don't believe that is possible.  If it is, it should be fixed.

-Yonik
http://lucidimagination.com


Re: Is it possible to alias a facet field?

2012-07-14 Thread Yonik Seeley
On Sat, Jul 14, 2012 at 10:12 AM, Jamie Johnson jej2...@gmail.com wrote:
 So this got me close

 facet.field=testfield&facet.field=%7B!key=mylabel%7Dtestfield&f.mylabel.limit=1

 but the limit on the alias didn't seem to work.  Is this expected?

Per-field params don't currently look under the alias.  I believe
there's a JIRA open for this.

-Yonik
http://lucidimagination.com


Re: Updating documents

2012-07-13 Thread Yonik Seeley
On Fri, Jul 13, 2012 at 1:41 PM, Jonatan Fournier
jonatan.fourn...@gmail.com wrote:
 On Fri, Jul 13, 2012 at 12:57 AM, Yonik Seeley
 yo...@lucidimagination.com wrote:
 On Thu, Jul 12, 2012 at 3:20 PM, Jonatan Fournier
 jonatan.fourn...@gmail.com wrote:
 Is there a flag for: if document does not exist, create it for me?

 Not currently, but it certainly makes sense.
 The implementation should be easy. The most difficult part is figuring
 out the best syntax to specify this.

 Another idea: we could possibly switch to create-if-not-exist by
 default, and use the existing optimistic concurrency mechanism to
 specify that the document should exist.

 So specify _version_=1 if the document should exist and _version_=0
 (the default) if you don't care.

 Yes that would be neat!

I've just committed this change.

 One more question related to partial document update. So far I'm able
 to append to multivalue fields, set new value to regular/multivalue
 fields. One thing I didn't find is the remove command, what is its
 JSON syntax?

Set it to the JSON value of null.
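
For example (assuming a uniqueKey field named id), this removes the field from the document while leaving everything else intact:

  [{"id":"doc1", "mv_f":{"set":null}}]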

-Yonik
http://lucidimagination.com


Re: Updating documents

2012-07-13 Thread Yonik Seeley
On Fri, Jul 13, 2012 at 3:50 PM, Jonatan Fournier
jonatan.fourn...@gmail.com wrote:
 On Thu, Jul 12, 2012 at 3:20 PM, Jonatan Fournier
 jonatan.fourn...@gmail.com wrote:
 But later on when I want to append cat3 to the field by doing this:

 "mv_f":{"add":"cat3"},
 ...

 I end up with something like this in the index:

 "mv_f":["{add=cat3}"],

 Obviously something is wrong with my syntax ;)

Are you using a custom update processor chain?  The
DistributedUpdateProcessor currently contains the logic for optimistic
concurrency and updates.
If you're not already, try some test commands with the stock server.

If you are already using the stock server, then perhaps you're not
sending what you think you are to Solr?

-Yonik
http://lucidimagination.com


Re: Updating documents

2012-07-12 Thread Yonik Seeley
On Thu, Jul 12, 2012 at 12:38 PM, Jonatan Fournier
jonatan.fourn...@gmail.com wrote:
 On Thu, Jul 12, 2012 at 11:05 AM, Erick Erickson
 The partial documents update that Jonatan references also requires
 that all the fields be stored.

 If my only fields with stored=false are copyField (e.g. I don't need
 their content to rebuild the document), are they gonna be re-copied
 with the partial document update?

Correct - your setup should be fine.  Only original source fields (non
copyField targets) should have stored=true

-Yonik
http://lucidimagination.com


Re: Updating documents

2012-07-12 Thread Yonik Seeley
On Thu, Jul 12, 2012 at 3:20 PM, Jonatan Fournier
jonatan.fourn...@gmail.com wrote:
 Is there a flag for: if document does not exist, create it for me?

Not currently, but it certainly makes sense.
The implementation should be easy. The most difficult part is figuring
out the best syntax to specify this.

Another idea: we could possibly switch to create-if-not-exist by
default, and use the existing optimistic concurrency mechanism to
specify that the document should exist.

So specify _version_=1 if the document should exist and _version_=0
(the default) if you don't care.
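
In JSON update terms the proposal would look roughly like this (document and field names are just examples):

  {"id":"doc1", "price":{"set":9.99}, "_version_":1}    (only succeeds if doc1 already exists)
  {"id":"doc1", "price":{"set":9.99}}                   (version 0 / omitted: create if missing)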

-Yonik
http://lucidimagination.com


Re: Solr 4.0 Alpha taking lot of CPU

2012-07-11 Thread Yonik Seeley
On Wed, Jul 11, 2012 at 8:11 PM, Pavitar Singh psi...@sprinklr.com wrote:
 We upgraded to Solr 4.0 Alpha and our CPU usage shot off to 400%. In
 profiling we are getting the following trace.

That could either be good or bad.  Higher CPU can mean higher
concurrency.  Have you benchmarked your indexing performance?

Example: going from 60 minutes for indexing and 200% average CPU usage
to 30 minutes at 400% CPU would generally be considered a good thing.

-Yonik
http://lucidimagination.com


- 100.0%  java.lang.Thread.run
    (42 collapsed methods)
  - 98.0%   org.apache.lucene.index.DocumentsWriter.updateDocument
    - 77.0%   org.apache.lucene.index.DocumentsWriterPerThread.updateDocument
      - 76.0%   org.apache.lucene.index.DocFieldProcessor.processDocument
        - 76.0%   org.apache.lucene.index.DocInverterPerField.processFields
          - 36.0%   org.apache.lucene.analysis.miscellaneous.TrimFilter.incrementToken
          - 35.0%   org.apache.lucene.analysis.core.LowerCaseFilter.incrementToken
          - 17.0%   org.apache.lucene.analysis.ngram.NGramTokenFilter.incrementToken
          - 9.2%    org.apache.lucene.util.AttributeSource.clearAttributes
          - 34.0%   org.apache.lucene.index.TermsHashPerField.add
            - 12.0%   org.apache.lucene.index.FreqProxTermsWriterPerField.addTerm
            - 11.0%   org.apache.lucene.index.FreqProxTermsWriterPerField.writeProx
              - 6.4%    org.apache.lucene.index.TermsHashPerField.writeVInt
            - 11.0%   org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl.fillBytesRef
    - 15.0%   org.apache.lucene.index.DocumentsWriterFlushControl.obtainAndLock

Re: Nrt and caching

2012-07-07 Thread Yonik Seeley
On Sat, Jul 7, 2012 at 9:59 AM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
 Currently the caches are stored per-multiple-segments, meaning after each
 'soft' commit, the cache(s) will be purged.

Depends which caches.  Some caches are per-segment, and some caches
are top level.
It's also a trade-off... for some things, per-segment data structures
would indeed turn around quicker on a reopen, but every query would be
slower for it.

-Yonik
http://lucidimagination.com


Re: deleteById commitWithin question

2012-07-05 Thread Yonik Seeley
On Thu, Jul 5, 2012 at 4:29 PM, Jamie Johnson jej2...@gmail.com wrote:
 I am running off of a snapshot taken 5/3/2012 of solr 4.0 and am
 noticing some issues around deleteById when a commitWithin parameter
 is included using SolrJ, specifically commit isn't executed.  If I
 later just call commit on the solr instance I see the item is deleted
 though.  Is anyone aware if this should work in that snapshot?


I thought I remembered something like this... but looking at the
commit log for DUH2, I don't see it.

/opt/code/lusolr4$ svn log
./solr/core/src/java/org/apache/solr/update/DirectUpdateHandler2.java
| less


r1357332 | yonik | 2012-07-04 12:23:09 -0400 (Wed, 04 Jul 2012) | 1 line

log DBQ reordering events

r1356858 | markrmiller | 2012-07-03 14:18:48 -0400 (Tue, 03 Jul 2012) | 1 line

SOLR-3587: After reloading a SolrCore, the original Analyzer is still
used rather than a new one

r1356845 | yonik | 2012-07-03 13:47:56 -0400 (Tue, 03 Jul 2012) | 1 line

SOLR-3559: DBQ reorder support

r1355088 | sarowe | 2012-06-28 13:51:38 -0400 (Thu, 28 Jun 2012) | 1 line

LUCENE-4172: clean up redundant throws clauses (merge from trunk)

r1348984 | hossman | 2012-06-11 15:46:14 -0400 (Mon, 11 Jun 2012) | 1 line

LUCENE-3949: fix license headers to not be javadoc style comments

r1343813 | rmuir | 2012-05-29 12:16:38 -0400 (Tue, 29 May 2012) | 1 line

create stable branch for 4.x releases

r1328890 | yonik | 2012-04-22 11:01:55 -0400 (Sun, 22 Apr 2012) | 1 line

SOLR-3392: fix search leak when openSearcher=false

r1328883 | yonik | 2012-04-22 09:58:00 -0400 (Sun, 22 Apr 2012) | 1 line

SOLR-3391: Make explicit commits cancel pending autocommits.


I'll try out trunk quick and see if it currently works.

-Yonik
http://lucidimagination.com


Re: SolrCloud cache warming issues

2012-06-27 Thread Yonik Seeley
On Tue, Jun 26, 2012 at 6:53 AM, Markus Jelsma
markus.jel...@openindex.io wrote:
 Why would the documentCache not be populated via firstSearcher warming 
 queries with a non-zero value for rows?

Solr streams documents (the stored fields) returned to the user (so
very large result sets can be supported w/o having the whole thing in
memory).
A warming query finds the document ids matching a query, but does not
send them anywhere (and the stored fields aren't needed for anything
else), hence the stored fields are never loaded.

-Yonik
http://lucidimagination.com


Re: SolrCloud cache warming issues

2012-06-27 Thread Yonik Seeley
On Wed, Jun 27, 2012 at 12:23 PM, Erik Hatcher erik.hatc...@gmail.com wrote:

 On Jun 27, 2012, at 12:01 , Yonik Seeley wrote:

 On Tue, Jun 26, 2012 at 6:53 AM, Markus Jelsma
 markus.jel...@openindex.io wrote:
 Why would the documentCache not be populated via firstSearcher warming 
 queries with a non-zero value for rows?

 Solr streams documents (the stored fields) returned to the user (so
 very large result sets can be supported w/o having the whole thing in
 memory).
 A warming query finds the document ids matching a query, but does not
 send them anywhere (and the stored fields aren't needed for anything
 else), hence the stored fields are never loaded.


 But if highlighting were enabled on those warming queries, it'd fill in the 
 document cache, right?

Correct.
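
A sketch of such a warming entry in solrconfig.xml (query and field names are placeholders):

  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">some common query</str>
        <str name="rows">10</str>
        <str name="hl">true</str>
        <str name="hl.fl">content</str>
      </lst>
    </arr>
  </listener>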

-Yonik
http://lucidimagination.com


Re: Trying to avoid filtering on score, as I'm told that's bad

2012-06-27 Thread Yonik Seeley
On Wed, Jun 27, 2012 at 6:50 PM, mcb thestreet...@gmail.com wrote:
 I have a function query that returns miles as a score between two points:

 q={!func}sub(sum(geodist(OriginCoordinates,39,-105),geodist(DestinationCoordinates,36,-97),Mileage),1000)

 The issue that I'm having now is that my results give me a list of scores:
 score: 10.1 (mi)
 score: 20 (mi)
 score: 75 (mi)
 But I would like to also add a clause that cuts off the results after X
 miles (say 50) so that 75 above would not be included in the results.
 Unfortunately I can't say fq=score:[0 TO 50], but perhaps there is another
 way? I'm on solr 4.0

If you want to cut off the whole function at 75, then frange can do that:
q={!frange u=75}sub(sum(...

http://lucene.apache.org/solr/api/org/apache/solr/search/FunctionRangeQParserPlugin.html

-Yonik
http://lucidimagination.com


Re: How to update one field without losing the others?

2012-06-16 Thread Yonik Seeley
Atomic update is a very new feature coming in 4.0 (i.e. grab a recent
nightly build to try it out).

It's not documented yet, but here's the JIRA issue:
https://issues.apache.org/jira/browse/SOLR-139?focusedCommentId=13269007page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269007

-Yonik
http://lucidimagination.com
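
As a rough sketch of the syntax on a recent nightly (the document id and field names
here are made up), an atomic update sends just the unique key plus the fields to
change, each wrapped in a modifier such as "set" or "add":

curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[{"id":"doc1", "price":{"set":100}, "tags":{"add":"new_tag"}}]'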


Re: [Announce] Solr 3.6 with RankingAlgorithm 1.4.2 - NRT support

2012-05-27 Thread Yonik Seeley
On Sun, May 27, 2012 at 11:57 AM, Radim Kolar h...@filez.com wrote:
 but i see RankingAlgorithm has fantastic results too and looking at its
 reference page it even powers sites like oracle.com and ebay.com.

What reference page are you referring to?

-Yonik
http://lucidimagination.com


Re: [Announce] Solr 3.6 with RankingAlgorithm 1.4.2 - NRT support

2012-05-27 Thread Yonik Seeley
On Sun, May 27, 2012 at 12:42 PM, Radim Kolar h...@filez.com wrote:
 What reference page are you referring to?

 http://tgels.com/wiki/en/Sites_using/downloaded_RankingAlgorithm_or_Solr-RA

Ah, ok sites using/downloaded
So someone with a .oracle email / domain checked it out - that
certainly doesn't mean they are in production with it, or even plan to
be.

-Yonik
http://lucidimagination.com


Re: What is the docs number in Solr explain query results for fieldnorm?

2012-05-25 Thread Yonik Seeley
On Fri, May 25, 2012 at 2:13 PM, Tom Burton-West tburt...@umich.edu wrote:
 The explain (debugQuery) shows the following for fieldnorm:
  0.625 = fieldNorm(field=ocr, doc=16624)
 What does the doc=16624 mean?

It's the internal document id (i.e. it's debugging info and doesn't
affect scoring)

-Yonik
http://lucidimagination.com


Re: How many doc/doc in the XML source file before indexing?

2012-05-24 Thread Yonik Seeley
On Thu, May 24, 2012 at 7:29 AM, Michael Kuhlmann k...@solarier.de wrote:
 However, I doubt it. I've not been too deeply into the UpdateHandler yet,
 but I think it first needs to parse the complete XML file before it starts
 to index.

Solr's update handlers all stream (XML, JSON, CSV), reading and
indexing a document at a time from the input.

-Yonik
http://lucidimagination.com


Re: Update JSON not working for me

2012-05-16 Thread Yonik Seeley
On Wed, May 16, 2012 at 1:43 PM, rjain15 rjai...@gmail.com wrote:
 http://localhost:8983/solr/select?q=title:monsters&wt=json&indent=true

Try switching title:monsters to name:monsters
https://issues.apache.org/jira/browse/SOLR-2598

Looks like the data was changed to use the name field instead and the
docs were never updated (big downside to our non-versioned docs).

-Yonik
http://lucidimagination.com


Re: Update JSON not working for me

2012-05-16 Thread Yonik Seeley
On Wed, May 16, 2012 at 2:36 PM, rjain15 rjai...@gmail.com wrote:
 No. Changing to name:monsters didn't work

OK, but you'll have to do that if you get the other part working.

 Here is my guess, the UpdateJSON is not adding any new documents to the
 existing index.

If that's true, the most likely culprit is your curl on windows (or
the windows shell).
You mentioned removing the single quotes in the curl command?  Perhaps
try replacing all those with double quotes.

C:\Tools\Solr\apache-solr-4.0-2012-05-15_08-20-37\example\exampledocs> C:\tools\curl\curl
"http://localhost:8983/solr/update?commit=true" --data-binary @books.json -H
"Content-type:application/json"


I'd really recommend installing cygwin if you know any unix at all...
not required, but will make your life much easier.

-Yonik
http://lucidimagination.com


Re: Update JSON not working for me

2012-05-16 Thread Yonik Seeley
On Wed, May 16, 2012 at 4:10 PM, rjain15 rjai...@gmail.com wrote:
 Hi

 Firstly, apologies for the long post, I changed the quote to double quote
 (and sometimes it is messy copying from DOS windows)

 Here is the command and the output on the Jetty Server Window. I am
 highlighting some important pieces,
 I have enabled the LOG LEVEL to DEBUG on the JETTY window.

 C:\Tools\Solr\apache-solr-4.0-2012-05-15_08-20-37\example\exampledocs> C:\tools\curl\curl
 http://localhost:8983/solr/update?commit=true; --data-binary @books.js
 on -H 'Content-type:application/json'



 May 16, 2012 4:05:49 PM org.apache.solr.update.processor.LogUpdateProcessor
 finish
 INFO: [collection1] webapp=/solr path=/update params={commit=true[
  {
    id : 978-0641723445,

There ya go - what should be the body of the post is in fact used as a
very large parameter name.
I get this behavior when I leave off the -H
'Content-type:application/json' when trying this on UNIX.

This means that your content-type is not being set correctly by your
curl command.
Did you try changing those single quotes to double quotes at the end?

C:\Tools\Solr\apache-solr-4.0-2012-05-15_08-20-37\example\exampledocs> C:\tools\curl\curl
"http://localhost:8983/solr/update?commit=true" --data-binary
@books.js -H "Content-type:application/json"

-Yonik
http://lucidimagination.com


Re: Problems with field names in solr functions

2012-05-14 Thread Yonik Seeley
In trunk, see:
* SOLR-2335: New 'field(...)' function syntax for refering to complex
  field names (containing whitespace or special characters) in functions.

The schema in trunk also specifies:
   <!-- field names should consist of alphanumeric or underscore characters only and
  not start with a digit.  This is not currently strictly enforced,
  but other field names will not have first class support from all components
  and back compatibility is not guaranteed.
   -->

-Yonik
http://lucidimagination.com
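
As an untested sketch against the poster's dynamic field names, the sort could then be
written with the quoted field(...) form instead of the bare names (URL-encoding of the
quotes and spaces left out for readability):

sort=sum(field("foo/bar-1234"),field("foo/bar-2345")) desc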


On Thu, May 10, 2012 at 11:28 AM, Iker Huerga iker.hue...@gmail.com wrote:
 Hi all,

 I am having problems when sorting solr documents using solr functions due
 to the field names.


 Imagine we want to sort the solr documents based on the sum of the scores
 of the matching fields. These field are created as follows


 <dynamicField name="foo/bar-*" type="float" indexed="true" stored="true"/>


 The idea is that these fields store float values as in this example
 <field name="foo/bar-1234">50.45</field>



 The examples below illustrate the issue


 This query -
 http://URL/solr/select/?q=(foo/bar-1234:*)+AND+(foo/bar-2345:*)&version=2.2&start=0&rows=10&indent=on&sort=sum(foo/bar-1234,foo/bar-2345)+desc&wt=json


 it gives me the following exception

 The request sent by the client was syntactically incorrect (sort param
 could not be parsed as a query, and is not a field that exists in the
 index: sum(foo/bar-1234,foo/bar-2345)).


 Whereas if I rename the field removing the / and - the following query
 will work -

 http://URL/solr/select/?q=(bar1234:*)+AND+(bar2345:*)&version=2.2&start=0&rows=10&indent=on&sort=sum(bar1234,bar2345)+desc&wt=json



  "response":{"numFound":2,"start":0,"docs":[
      {
        "primaryDescRes":"DescRes2",
        "bar1234":45.54,
        "bar2345":100.0},
      {
        "primaryDescRes":"DescRes1",
        "bar1234":100.5,
        "bar2345":25.22}]
  }}



 I tried escaping the character as indicated in solr documentation [1], i.e.
 foo%2Fbar-12345 instead of foo/bar-12345, without success



 Could this be caused by the query parser?


 I would be extremely grateful if you could let me know any workaround for
 this



 Best

 Iker



 [1]
 http://wiki.apache.org/solr/SolrQuerySyntax#NOTE:_URL_Escaping_Special_Characters

 --
 Iker Huerga
 http://www.ikerhuerga.com/


Re: Update JSON not working for me

2012-05-14 Thread Yonik Seeley
I think this may be due to https://issues.apache.org/jira/browse/SOLR-2857
JIRA is down right now so I can't check, but I thought the intent was
to have some back compat.

Try changing the URL from /update/json to just /update in the meantime

-Yonik
http://lucidimagination.com


On Mon, May 14, 2012 at 2:42 PM, Rajesh Jain rjai...@gmail.com wrote:
 Hi Jack

 I am following the http://wiki.apache.org/solr/UpdateJSON tutorials.

 The first example is of books.json, which  I executed, but I dont see any
 books

 http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks

 0 results found in 26 ms Page 0 of 0

 I modified the books.json to add my own book, but still no result. The
 money.xml works, so I converted the money.xml to money.json and added an
 extra currency. I don't see the new currency.

 My question is, how do I know if the UpdateJSON action was valid, if I
 don't see them in the
 http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks

 Is there a way to find what is happening - maybe through log files?

 I am new to Solr, please help

 Thanks
 Rajesh




 On Mon, May 14, 2012 at 2:33 PM, Jack Krupansky 
 j...@basetechnology.comwrote:

 Check the examples of update/json here:

 http://wiki.apache.org/solr/UpdateJSON

 In your case, either leave out the add level or add a doc level below
 it.

 For example:

 curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '
 {
 "add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
 "add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
 }'

 -- Jack Krupansky

 -Original Message- From: Rajesh Jain
 Sent: Monday, May 14, 2012 1:27 PM
 To: solr-user@lucene.apache.org
 Cc: Rajesh Jain
 Subject: Update JSON not working for me


 Hi,

 I am using the 4.x version of Solr, and following the UpdateJSON Solr Wiki

 1. When I try to update using :

 curl 'http://localhost:8983/solr/update/json?commit=true'
 --data-binary @books.json -H 'Content-type:application/json'

 I don't see any Category as Books in Velocity based Solr Browser the
 http://localhost:8983/solr/collection1/browse/ ?

 I see the following message on the startup window when I run this command
 C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\exampledocs> C:\tools\curl\curl
 http://localhost:8983/solr/update/json?commit=true --data-binary
 @books.json -H 'Content-type:application/json'
 {
  "responseHeader":{
   "status":0,
   "QTime":47}}

 2. I wrote my own JSON file where I added an extra add directive

 My JSON File
 [
  {
 "add":{
 "id" : "MXN",
 "cat" : ["currency"],
 "name" : "One Peso",
 "inStock" : true,
 "price_c" : "1,MXN",
 "manu" : 384,
 "manu_id_s" : "Bank Mexico",
 "features":"Coins and notes"
     }
   }
 ]

 I still don't see the addition in the existing Currency Categories.


 Please let me know if the UPDATEJSON works in 4.x or is this only for 3.6?

 Thanks
 Rajesh



Re: Update JSON not working for me

2012-05-14 Thread Yonik Seeley
On Mon, May 14, 2012 at 3:11 PM, Rajesh Jain rjai...@gmail.com wrote:
 Hi Yonik

 i tried without the json in the URL, the result was same but in XML format

Interesting... the XML response is fine (just not ideal).

When I tried it, I did get a JSON response (perhaps I'm running a
later version of trunk... the unified update handler is very new)

$ curl 'http://localhost:8983/solr/update?commit=true' --data-binary
@books.json -H 'Content-type:application/json'
{"responseHeader":{"status":0,"QTime":133}}

-Yonik
http://lucidimagination.com



 C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\exampledocs> C:\tools\curl\curl
 http://localhost:8983/solr/update?commit=true --data-binary @money.json -H
 'Content-type:application/json'
 <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">0</int><int
 name="QTime">45</int>
 </lst>
 </response>




 On Mon, May 14, 2012 at 2:58 PM, Yonik Seeley yo...@lucidimagination.com
 wrote:

 I think this may be due to https://issues.apache.org/jira/browse/SOLR-2857
 JIRA is down right now so I can't check, but I thought the intent was
 to have some back compat.

 Try changing the URL from /update/json to just /update in the meantime

 -Yonik
 http://lucidimagination.com


 On Mon, May 14, 2012 at 2:42 PM, Rajesh Jain rjai...@gmail.com wrote:
  Hi Jack
 
  I am following the http://wiki.apache.org/solr/UpdateJSON tutorials.
 
  The first example is of books.json, which  I executed, but I dont see
  any
  books
 
  http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
 
  0 results found in 26 ms Page 0 of 0
 
  I modified the books.json to add my own book, but still no result. The
  money.xml works, so I converted the money.xml to money.json and added an
  extra currency. I don't see the new currency.
 
  My question is, how do I know if the UpdateJSON action was valid, if I
  don't see them in the
  http://localhost:8983/solr/collection1/browse?q=cat%3Dbooks
 
  Is there a way to find what is happening - maybe through log files?
 
  I am new to Solr, please help
 
  Thanks
  Rajesh
 
 
 
 
  On Mon, May 14, 2012 at 2:33 PM, Jack Krupansky
  j...@basetechnology.comwrote:
 
  Check the examples of update/json here:
 
 
   http://wiki.apache.org/solr/UpdateJSON
 
  In your case, either leave out the add level or add a doc level
  below
  it.
 
  For example:
 
   curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '
   {
   "add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
   "add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
   }'
 
  -- Jack Krupansky
 
  -Original Message- From: Rajesh Jain
  Sent: Monday, May 14, 2012 1:27 PM
  To: solr-user@lucene.apache.org
  Cc: Rajesh Jain
  Subject: Update JSON not working for me
 
 
  Hi,
 
  I am using the 4.x version of Solr, and following the UpdateJSON Solr
  Wiki
 
  1. When I try to update using :
 
   curl 'http://localhost:8983/solr/update/json?commit=true'
   --data-binary @books.json -H 'Content-type:application/json'
 
  I don't see any Category as Books in Velocity based Solr Browser the
 
   http://localhost:8983/solr/collection1/browse/ ?
 
  I see the following message on the startup window when I run this
  command
   C:\Tools\Solr\apache-solr-4.0-2012-05-04_08-23-31\example\exampledocs> C:\tools\curl\curl
   http://localhost:8983/solr/update/json?commit=true --data-binary
   @books.json -H 'Content-type:application/json'
   {
    "responseHeader":{
     "status":0,
     "QTime":47}}
 
  2. I wrote my own JSON file where I added an extra add directive
 
  My JSON File
   [
    {
   "add":{
   "id" : "MXN",
   "cat" : ["currency"],
   "name" : "One Peso",
   "inStock" : true,
   "price_c" : "1,MXN",
   "manu" : 384,
   "manu_id_s" : "Bank Mexico",
   "features":"Coins and notes"
       }
     }
   ]
 
  I still don't see the addition in the existing Currency Categories.
 
 
  Please let me know if the UPDATEJSON works in 4.x or is this only for
  3.6?
 
  Thanks
  Rajesh
 




Re: 1MB file to Zookeeper

2012-05-05 Thread Yonik Seeley
On Sat, May 5, 2012 at 8:39 AM, Jan Høydahl jan@cominvent.com wrote:
 support for CouchDb, Voldemort or whatever.

Hmmm... Or Solr!

-Yonik


Re: 1MB file to Zookeeper

2012-05-04 Thread Yonik Seeley
On Fri, May 4, 2012 at 12:50 PM, Mark Miller markrmil...@gmail.com wrote:
 And how should we detect if data is compressed when
 reading from ZooKeeper?

 I was thinking we could somehow use file extensions?

 eg synonyms.txt.gzip - then you can use different compression algs depending 
 on the ext, etc.

 We would want to try and make it as transparent as possible though...

At first I thought about adding a marker to the beginning of a file, but
file extensions could work too, as long as the resource loader made it
transparent
(i.e. code would just need to ask for synonyms.txt, but the resource
loader would search
for synonyms.txt.gzip, etc, if the original name was not found)

Hmmm, but this breaks down for things like watches - I guess that's
where putting the encoding inside the file would be a better option.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: solr: how to change display name of a facet?

2012-05-03 Thread Yonik Seeley
On Thu, May 3, 2012 at 2:26 PM, okayndc bodymo...@gmail.com wrote:
[...]
 I've experimented with this:
 <str name="facet.field">{!ex=dt key=Categories and Stuff}category</str>

 I'm not really sure what 'ex=dt' does but it's obvious that 'key' is the
 desired display name? If there are spaces in the 'key' value, the display
 name gets cut off.  What am I doing wrong?

http://wiki.apache.org/solr/LocalParams
For a non-simple parameter value, enclose it in single quotes

ex excludes filters tagged with a value.
See
http://wiki.apache.org/solr/SimpleFacetParameters#Multi-Select_Faceting_and_LocalParams

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10
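
With the key enclosed in single quotes, the config entry from the question becomes:

<str name="facet.field">{!ex=dt key='Categories and Stuff'}category</str>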


Re: access document by primary key

2012-05-03 Thread Yonik Seeley
On Thu, May 3, 2012 at 3:01 PM, Tomás Fernández Löbbe
tomasflo...@gmail.com wrote:
 Is this still true? Assuming that I know that there hasn't been updates or
 that I don't care to see a different version of the document, are the term
 QP or the raw QP faster than the real-time get handler?

Sort of different things... query parsers only parse queries, not execute them.
If you're looking for documents by ID though, the realtime-get hander
should be the fastest, esp in a distributed setup.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10
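
For example, assuming the stock /get handler from the example solrconfig.xml (the id
values here are placeholders), a lookup by unique key is just:

http://localhost:8983/solr/get?id=mydoc
http://localhost:8983/solr/get?ids=mydoc1,mydoc2

The ids parameter takes a comma-separated list of keys.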


Re: NPE when faceting

2012-05-01 Thread Yonik Seeley
Darn... looks likely that it's another bug from when part of
UnInvertedField was refactored into Lucene.
We really need some random tests that can catch bugs like these though
- I'll see if I can reproduce.

Can you open a JIRA issue for this?

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


On Tue, May 1, 2012 at 4:51 PM, Jamie Johnson jej2...@gmail.com wrote:
 I had reported this issue a while back, hoping that it was something
 with my environment, but that doesn't seem to be the case.  I am
 getting the following stack trace on certain facet queries.
 Previously when I did an optimize the error went away, does anyone
 have any insight into why specifically this could be happening?

 May 1, 2012 8:48:52 PM org.apache.solr.common.SolrException log
 SEVERE: java.lang.NullPointerException
        at org.apache.lucene.index.DocTermOrds.lookupTerm(DocTermOrds.java:807)
        at 
 org.apache.solr.request.UnInvertedField.getTermValue(UnInvertedField.java:636)
        at 
 org.apache.solr.request.UnInvertedField.getCounts(UnInvertedField.java:411)
        at 
 org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:300)
        at 
 org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:396)
        at 
 org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:205)
        at 
 org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:81)
        at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
        at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1550)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
        at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
        at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
        at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
        at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
        at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
        at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
        at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
        at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
        at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
        at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
        at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
        at 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
        at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
        at org.eclipse.jetty.server.Server.handle(Server.java:351)
        at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
        at 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
        at 
 org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:900)
        at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:954)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857)
        at 
 org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
        at 
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
        at 
 org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
        at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
        at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
        at java.lang.Thread.run(Thread.java:662)


Re: commit fail

2012-04-28 Thread Yonik Seeley
On Sat, Apr 28, 2012 at 7:02 AM, mav.p...@holidaylettings.co.uk
mav.p...@holidaylettings.co.uk wrote:
 Hi,

 This is what the thread dump looks like.

 Any ideas?

Looks like the thread taking up CPU is in LukeRequestHandler

 1062730578@qtp-1535043768-5' Id=16, RUNNABLE on lock=, total cpu
 time=16156160.ms user time=16153110.ms at
 org.apache.solr.handler.admin.LukeRequestHandler.getIndexedFieldsInfo(LukeR
 equestHandler.java:320)

That probably accounts for the 1 CPU doing things... but it's not
clear at all why commits are failing.

Perhaps the commit is succeeding, but the client is just not waiting
long enough for it to complete?

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: Recovery - too many updates received since start

2012-04-27 Thread Yonik Seeley
On Tue, Apr 24, 2012 at 9:31 AM, Trym R. Møller t...@sigmat.dk wrote:
 Hi

 I experience that a Solr looses its connection with Zookeeper and
 re-establish it. After Solr is reconnection to Zookeeper it begins to
 recover.
 It has been missing the connection approximately 10 seconds and meanwhile
 the leader slice has received some documents (maybe about 1000 documents).
 Solr fails to update peer sync with the log message:
 Apr 21, 2012 10:13:40 AM org.apache.solr.update.PeerSync sync
 WARNING: PeerSync: core=mycollection_slice21_shard1
 url=zk-1:2181,zk-2:2181,zk-3:2181 too many updates received since start -
 startingUpdates no longer overlaps with our currentUpdates

 Looking into PeerSync and UpdateLog I can see that 100 updates is the
 maximum allowed updates that a shard can be behind.
 Is it correct that this is not configurable and what is the reasons for
 choosing 100?

 I suspect that one must compare the work needed to replicate the full index
 with the performance loss/resource usage when enhancing the size of the
 UpdateLog?

The peersync messages don't stream, so we need to limit how many docs
will be in memory at once.
If someone makes that streamable, I'd be more comfortable making the
limit configurable.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: commit stops

2012-04-27 Thread Yonik Seeley
On Fri, Apr 27, 2012 at 9:18 AM, mav.p...@holidaylettings.co.uk
mav.p...@holidaylettings.co.uk wrote:
 We have an index of about 3.5gb which seems to work fine until it suddenly 
 stops accepting new commits.

 Users can still search on the front end but nothing new can be committed and 
 it always times out on commit.

 Any ideas?

Perhaps the commit happens to cause a major merge which may take a
long time (and solr isn't going to allow overlapping commits).
How long does a commit request take to time out?

What Solr version is this?  Do you have any kind of auto-commit set
up?  How often are you manually committing?

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: commit fail

2012-04-27 Thread Yonik Seeley
On Fri, Apr 27, 2012 at 8:23 PM, mav.p...@holidaylettings.co.uk
mav.p...@holidaylettings.co.uk wrote:
 Hi again,

 This is the only log entry I can find, regarding the failed commits…

 Still timing out as far as the client is concerned and there is actually 
 nothing happening on the server in terms of load (staging environment).

 1 CPU core seems busy constantly with solr but unsure what is happening.

You can get a thread dump to see what the various threads are doing
(use the solr admin, or kill -3).  Sounds like it could just be either
merging in progress or a commit in progress.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10
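
For example, with the pid of the Jetty/Solr JVM (placeholder below):

kill -3 <solr-jvm-pid>

The thread dump is printed to the JVM's stdout (the console, or wherever that has been
redirected), not to the Solr log.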


Re: embedded solr populating field of type LatLonType

2012-04-25 Thread Yonik Seeley
On Tue, Apr 24, 2012 at 4:05 PM, Jason Cunning jcunn...@ucar.edu wrote:
 My question is, what is the AppropriateJavaType for populating a solr field 
 of type LatLonType?

A String with both the lat and lon separated by a comma.  Example: 12.34,56.78

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: Title Boosting and IDF

2012-04-25 Thread Yonik Seeley
On Wed, Apr 25, 2012 at 9:24 PM, Walter Underwood wun...@wunderwood.org wrote:
 Interestingly, I worked at two different web search companies with two 
 different completely different search engines, and one arrived at an 8X title 
 boost and the other at a 7.5X title boost. So I consider 8X a universal 
 physical constant.

Great info!  Do you know if that 8x was after (i.e. already included)
length normalization?

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


searcher leak on trunk after 2/1/2012

2012-04-22 Thread Yonik Seeley
Folks,
If you're using a trunk version after 2/1/2012 in conjunction with the
shipped solrconfig.xml (which uses openSearcher=false in an autoCommit
by default),
then you should upgrade to a new version.  There's a searcher leak
when openSearcher=false is used with a commit that leads to files not
being closed.

This was just fixed in https://issues.apache.org/jira/browse/SOLR-3392
so if you're looking to use nightly builds, you will need one from Apr
23 or later.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: # open files with SolrCloud

2012-04-21 Thread Yonik Seeley
I can reproduce some kind of searcher leak issue here, even w/o
SolrCloud, and I've opened
https://issues.apache.org/jira/browse/SOLR-3392

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: Solr Hanging

2012-04-19 Thread Yonik Seeley
On Thu, Apr 19, 2012 at 4:25 AM, Trym R. Møller t...@sigmat.dk wrote:
 Hi

 I am using Solr trunk and have 7 Solr instances running with 28 leaders and
 28 replicas for a single collection.
 After indexing a while (a couple of days) the solrs start hanging and doing
 a thread dump on the jvm I see blocked threads like the following:
    Thread 2369: (state = BLOCKED)
     - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame;
 information may be imprecise)
     - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14,
 line=158 (Compiled frame)
     -
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await()
 @bci=42, line=1987 (Compiled frame)
     - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=399
 (Compiled frame)
     - java.util.concurrent.ExecutorCompletionService.take() @bci=4, line=164
 (Compiled frame)
     - org.apache.solr.update.SolrCmdDistributor.checkResponses(boolean)
 @bci=27, line=350 (Compiled frame)
     - org.apache.solr.update.SolrCmdDistributor.finish() @bci=18, line=98
 (Compiled frame)
     - org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish()
 @bci=4, line=299 (Compiled frame)
     - org.apache.solr.update.processor.DistributedUpdateProcessor.finish()
 @bci=1, line=817 (Compiled frame)
    ...
     - org.mortbay.thread.QueuedThreadPool$PoolThread.run() @bci=25, line=582
 (Interpreted frame)

 I read the stack trace as my indexing client has indexed a document and this
 Solr is now waiting for the replica? to respond before returning an answer
 to the client.

Correct.  What's the full stack trace like on both a leader and replica?
We need to know what the replica is blocking on.

What version of trunk are you using?

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: Distributed FacetComponent NullPointer Exception

2012-04-17 Thread Yonik Seeley
facet.field={!terms=$organization__terms}organization

This is referring to another request parameter that Solr should have
added (organization__terms) .  Did you cut-n-paste all of the
parameters below?

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10



On Tue, Apr 17, 2012 at 10:13 AM, Jamie Johnson jej2...@gmail.com wrote:
 I'm noticing that this issue seems to be occurring with facet fields
 which have some unexpected characters.  For instance the query that I
 see going across the wire is as follows

 facet=truetie=0.1ids=3F2504E0-4F89-11D3-9A0C-0305E82C3301qf=%0a++author^0.5+type^0.5+content_mvtxt^10++subject_phonetic^1+subject_txt^20%0a+++q.alt=*:*distrib=falseTest+%0a%0a%0a%0a?+%0a%0a%0a%0aDaily+News,Test+Association,Toyota,U.S.,Washington+Postrows=10rows=10NOW=1334670761188shard.url=JamiesMac.local:8502/solr/shard5-core1/fl=*,scoreq=bobfacet.field={!terms%3D$organization__terms}organizationisShard=true

 Now there is an obvious issue here with our data having these \n
 characters in it which I will be fixing shortly (plan to use a set of
 Character replace filters to remove extra white space).  I am assuming
 that this is causing our issue, but would be nice if someone could
 confirm.


 On Tue, Apr 17, 2012 at 12:08 AM, Jamie Johnson jej2...@gmail.com wrote:
 I created to track this.  https://issues.apache.org/jira/browse/SOLR-3362

 On Mon, Apr 16, 2012 at 11:18 PM, Jamie Johnson jej2...@gmail.com wrote:
 doing some debugging this is the relevant block in FacetComponent

          String name = shardCounts.getName(j);
          long count = ((Number)shardCounts.getVal(j)).longValue();
          ShardFacetCount sfc = dff.counts.get(name);
          sfc.count += count;


 the issue is sfc is null.  I don't know if that should or should not
 occur, but if I add a check (if(sfc == null)continue;) then I think it
 would work.  Is this appropriate?

 On Mon, Apr 16, 2012 at 10:45 PM, Jamie Johnson jej2...@gmail.com wrote:
 worth notingthe error goes away at times depending on the number
 of facets asked for.

 On Mon, Apr 16, 2012 at 10:38 PM, Jamie Johnson jej2...@gmail.com wrote:
 I found (what appears to be) the issue I am experiencing here
 http://lucene.472066.n3.nabble.com/NullPointerException-with-distributed-facets-td3528165.html
 but there were no responses to it.  I've included the stack trace I am
 seeing, any ideas why this would happen?


 SEVERE: java.lang.NullPointerException
        at 
 org.apache.solr.handler.component.FacetComponent.refineFacets(FacetComponent.java:489)
        at 
 org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:278)
        at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:307)
        at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1550)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
        at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
        at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
        at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
        at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
        at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
        at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
        at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
        at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
        at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
        at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
        at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
        at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
        at 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
        at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
        at org.eclipse.jetty.server.Server.handle(Server.java:351)
        at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
        at 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
        at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
        at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
        at 
 

Re: Problem with faceting on a boolean field

2012-04-17 Thread Yonik Seeley
On Tue, Apr 17, 2012 at 2:22 PM, Kissue Kissue kissue...@gmail.com wrote:
 Hi,

 I am faceting on a boolean field called usedItem. There are a total of
 607601 items in the index and they all have value for usedItem set to
 false.

 However when i do a search for *:* and faceting on usedItem, the num
 found is set correctly to 607601 but i get the facet result below:

 <lst name="usedItem"><int name="false">17971</int></lst>

You can verify by changing the query from *:* to usedItem:false  (or
adding an additional fq to that effect).

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10
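
For instance, a request along these lines (host and handler are placeholders) shows
whether the counts line up:

http://localhost:8983/solr/select?q=usedItem:false&rows=0&facet=true&facet.field=usedItem

If numFound there is also 607601, then it is the facet count of 17971 that is suspect.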


Re: Changing precisionStep without a re-index

2012-04-16 Thread Yonik Seeley
On Mon, Apr 16, 2012 at 12:12 PM, Michael Ryan mr...@moreover.com wrote:
 Is it safe to change the precisionStep for a TrieField without doing a 
 re-index?

Not really - it changes what tokens are indexed for the numbers, and
range queries won't work correctly.
Sorting (FieldCache), function queries, etc, would still work, and
exact match queries would still work.


-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


 Specifically, I want to change a field from this:
 <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8"
 omitNorms="true" positionIncrementGap="0"/>
 to this:
 <fieldType name="long" class="solr.TrieLongField" precisionStep="0"
 omitNorms="true" positionIncrementGap="0"/>

 By safe, I mean that searches will return the correct results, a FieldCache 
 on the field will still work, clowns won't eat me...

 -Michael


Re: DeleteByQuery using xml commands in SolrCloud

2012-04-16 Thread Yonik Seeley
On Mon, Apr 16, 2012 at 4:13 PM, Jamie Johnson jej2...@gmail.com wrote:
 I tried to execute the following on my cluster, but it had no results.
  Should this work?

 curl http://host:port/solr/collection1/update/?commit=true -H
 Contenet-Type: text/xml --data-binary
 '<delete><query>*:*</query></delete>'

Is this a cut-n-paste of what you actually sent?
If so, Content-Type is misspelled (but I'm not sure if that's the issue)

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10
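
For comparison, a corrected form of the same request (header spelling fixed, everything
else unchanged) that should work:

curl 'http://host:port/solr/collection1/update?commit=true' -H 'Content-Type: text/xml' --data-binary '<delete><query>*:*</query></delete>'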


Re: Can Solr solve this simple problem?

2012-04-16 Thread Yonik Seeley
2012/4/16 Tomás Fernández Löbbe tomasflo...@gmail.com:
 I'm wondering if Solr is the best tool for this kind of usage. Solr is a
 text search engine

Well, Lucene is a full-text search library, but Solr has always been far more.
Dating back to it's first use in CNET, it was used as a browse engine
(faceted search), sometimes without much of a full-text aspect at all.
And we're moving more and more into the NoSQL realm (durability,
realtime-get, and coming real soon - optimistic locking).

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: solr 3.5 taking long to index

2012-04-15 Thread Yonik Seeley
On Thu, Apr 12, 2012 at 10:42 PM, Rohit ro...@in-rev.com wrote:
 The machine has a total ram of around 46GB. My Biggest concern is Solr index 
 time gradually increasing and then the commit stops because of timeouts, out 
 commit rate is very high, but I am not able to find the root cause of the 
 issue.

The difference you're seeing between 3.1 and 3.5 may be due to a bug
in the former where fsync was not being called:
https://issues.apache.org/jira/browse/LUCENE-3418

 We commit every 5000 documents

If you are doing bulk indexing, wait until the end to commit.
Upcoming Solr4 has near realtime (soft commit) support to make doing
frequent commits (for the purposes of visibility) less expensive.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: It's hard to google on _val_

2012-04-15 Thread Yonik Seeley
On Sun, Apr 15, 2012 at 11:34 AM, Benson Margulies
bimargul...@gmail.com wrote:
 So, I've been experimenting to learn how the _val_ participates in scores.

 It seems to me that http://wiki.apache.org/solr/FunctionQuery should
 explain the *effect* of including an _val_ term in an ordinary query,
 starting with a constant.

It's simply added to the score as any other clause in a boolean query would be.

 Positive values of _val_ did lead to
 positive increments in the score, but clearly not by simple addition.

That's just because Lucene normalizes scores.  By default, this is
really just multiplying scores by a magic constant (that by default is
the inverse of the sum of squared weights) and doesn't change relative
orderings of docs.  If you add debugQuery=true and look at the scoring
explanations, you'll see that queryNorm component.

If you want to go down the rabbit hole on trunk, see
IndexSearcher.createNormalizedWeight()

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: It's hard to google on _val_

2012-04-15 Thread Yonik Seeley
On Sun, Apr 15, 2012 at 12:14 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
 That's just because Lucene normalizes scores.  By default, this is
 really just multiplying scores by a magic constant (that by default is
 the inverse of the sum of squared weights)

Sorry... I missed the square root.   Should be inverse of the square
root of the sum of squared weights.
See DefaultSimilarity.queryNorm:

  public float queryNorm(float sumOfSquaredWeights) {
    return (float)(1.0 / Math.sqrt(sumOfSquaredWeights));
  }

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: performance impact using string or float when querying ranges

2012-04-13 Thread Yonik Seeley
On Fri, Apr 13, 2012 at 8:11 AM, Erick Erickson erickerick...@gmail.com wrote:
 Well, I guess my first question is whether using stirngs
 is fast enough, in which case there's little reason to
 make your life more complex.

 But yes, range queries will be significantly faster with
 any of the Trie types than with strings.

To elaborate on this point a bit... range queries on strings will be
the same speed as a numeric field with precisionStep=0.
You need a precisionStep > 0 (so the number will be indexed in
multiple parts) to speed up range queries on numeric fields.  (See
int vs tint in the solr schema).

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10
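
The "int" vs "tint" definitions referred to come from the example schema.xml and differ
only in precisionStep, roughly:

<fieldType name="int"  class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>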




 Trie types are
 all numeric types.


 Best
 Erick

 On Fri, Apr 13, 2012 at 3:49 AM, crive marco.cr...@gmail.com wrote:
 Hi All,
 is there a big difference in terms of performances when querying a range
 like [50.0 TO *] on a string field compared to a float field?

 At the moment I am using a dynamic field of type string to map some values
 coming from our database and their type can vary depending on the context
 (float/integer/string); it easier to use a dynamic field other than having
 to create a bespoke field for each type of value.

 Marco


Re: solr 3.4 with nTiers = 2: usage of ids param causes NullPointerException (NPE)

2012-04-12 Thread Yonik Seeley
On Wed, Apr 11, 2012 at 8:16 AM, Dmitry Kan dmitry@gmail.com wrote:
 We have a system with nTiers, that is:

 Solr front base --- Solr front -- shards

Although the architecture had this in mind (multi-tier), all of the
pieces are not yet in place to allow it.
The errors you see are a direct result of that.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: I've broken delete in SolrCloud and I'm a bit clueless as to how

2012-04-12 Thread Yonik Seeley
On Thu, Apr 12, 2012 at 2:21 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : Please see the documentation: 
 http://wiki.apache.org/solr/SolrCloud#Required_Config :

 : schema.xml
 :
 : You must have a _version_ field defined:
 :
 : <field name="_version_" type="long" indexed="true" stored="true"/>

 Seems like this is the kind of thing that should make Solr fail hard and
 fast on SolrCore init if it sees you are running in cloud mode and yet it
 doesn't find this -- similar to how some other features fail hard and fast
 if you don't have uniqueKey.

Off the top of my head:
_version_ is needed for solr cloud where a leader forwards updates to
replicas, unless you're handing update distribution yourself or
providing pre-built shards.
_version_ is needed for realtime-get and optimistic locking

We should document for sure... but at this point it's not clear what
we should enforce. (not saying we shouldn't enforce anything... just
that I haven't really thought about it)

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: SOLR 4 autocommit - is it working as I think it should?

2012-04-11 Thread Yonik Seeley
On Wed, Apr 11, 2012 at 12:58 PM, vybe3142 vybe3...@gmail.com wrote:
 This morning, I've been looking at the autocommit functionality as defined
 in solrconfig.xml. By default, it appears that it should kick in 15 seconds
 after a new document has been added. I do see this event triggered via the
 SOLR/tomcat logs, but can't see the docs/terms  in the index or query them.
 I haven't bothered with the softcommit yet as I'd like to first understand
 what the issue is wrt the autocommit.

The 15 second hard autocommit is not for the purpose of update
visibility, but for durability (hence the hard autocommit uses
openSearcher=false).  It simply makes sure that recent changes are
flushed to disk.

If you want to automatically see changes after some period of time,
use an additional soft autocommit for that (and leave the hard
autocommit exactly as configured),
or use commitWithin when you do an update... that's more flexible and
allows you to specify latency on a per-update basis.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10
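
A sketch of that combination in solrconfig.xml (the 1-second soft commit interval is
just an example value):

<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

or, per update, using commitWithin (10 seconds here, books.json as the example payload):

curl 'http://localhost:8983/solr/update?commitWithin=10000' -H 'Content-type:application/json' --data-binary @books.json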


Re: SOLR issue - too many search queries

2012-04-10 Thread Yonik Seeley
On Tue, Apr 10, 2012 at 8:51 AM, arunssasidhar arunssasid...@gmail.com wrote:
 We have a PHP web application which is using SOLR for searching. The APP is
 using CURL to connect to the SOLR server and which run in a loop with
 thousands of predefined keywords. That will create thousands of different
 search quires to SOLR at a given time.

Thousands of concurrent queries?  That's normally not a useful metric
unless you have a very strange application.

You normally want to look at the following:
 - throughput (queries per second)
 - latency (how long the queries take - average, 90%, 95%, etc)

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: SolrCloud replica and leader out of Sync somehow

2012-04-05 Thread Yonik Seeley
On Thu, Apr 5, 2012 at 12:19 AM, Jamie Johnson jej2...@gmail.com wrote:
 Not sure if this got lost in the shuffle, were there any thoughts on this?

Sorting by id could be pretty expensive (memory-wise), so I don't
think it should be default or anything.
We also need a way for a client to hit the same set of servers again
anyway (to handle other possible variations like commit time).

To handle the tiebreak stuff, you could also sort by _version_ - that
should be unique in an index and is already used under the covers and
hence shouldn't add any extra memory overhead.  versions increase over
time, so _version_ desc should give you newer documents first.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10
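
For example, appending the tiebreaker to whatever the primary sort is:

sort=score desc,_version_ desc

should keep the ordering stable across replicas without the memory cost of sorting on
the uniqueKey field.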




 On Wed, Mar 21, 2012 at 11:02 AM, Jamie Johnson jej2...@gmail.com wrote:
 Given that in a distributed environment the docids are not guaranteed
 to be the same across shards should the sorting use the uniqueId field
 as the tie breaker by default?

 On Tue, Mar 20, 2012 at 2:10 PM, Yonik Seeley
 yo...@lucidimagination.com wrote:
 On Tue, Mar 20, 2012 at 2:02 PM, Jamie Johnson jej2...@gmail.com wrote:
 I'll try to dig for the JIRA.  Also I'm assuming this could happen on
 any sort, not just score correct?  Meaning if we sorted by a date
 field and there were duplicates in that date field order wouldn't be
 guaranteed for the same reasons right?

 Correct - internal docid is the tiebreaker for all sorts.

 -Yonik
 lucenerevolution.com - Lucene/Solr Open Source Search Conference.
 Boston May 7-10


Re: Evaluating Solr

2012-04-04 Thread Yonik Seeley
On Wed, Apr 4, 2012 at 12:46 PM, Joseph Werner telco...@gmail.com wrote:
 For more routine changes, are record updates supported without the
 necessitity to rebuilt an index? For example if a description field for an
 item needs be changed, am I correct in reading that the recodrd need only
 be resubmitted?

Correct.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: solrcloud is deleteByQuery stored in transactions and forwarded like other operations?

2012-04-04 Thread Yonik Seeley
On Wed, Apr 4, 2012 at 3:04 PM, Jamie Johnson jej2...@gmail.com wrote:
 Thanks Mark.  The delete by query is a very rare operation for us and
 I really don't have the liberty to update to current trunk right now.
 Do you happen to know about when the fix was made so I can see if we
 are before or after that time?

Not definitive, but a grep of svn log in solr/core shows:

r1295665 | yonik | 2012-03-01 11:41:54 -0500 (Thu, 01 Mar 2012) | 1 line
cloud: fix distributed deadlock w/ deleteByQuery

r1243773 | yonik | 2012-02-13 22:00:22 -0500 (Mon, 13 Feb 2012) | 1 line
dbq: fix param rename

r1243768 | yonik | 2012-02-13 21:45:41 -0500 (Mon, 13 Feb 2012) | 1 line
solrcloud: send deleteByQuery to all shard leaders to version and
forward to replicas


-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: Incremantally updating a VERY LARGE field - Is this possibe ?

2012-04-04 Thread Yonik Seeley
On Wed, Apr 4, 2012 at 3:14 PM, vybe3142 vybe3...@gmail.com wrote:

 Updating a single field is not possible in solr.  The whole record has to
 be rewritten.

 Unfortunate. Lucene allows it.

I think you're mistaken - the same limitations apply to Lucene.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: How do I use localparams/joins using SolrJ and/or the Admin GUI

2012-03-31 Thread Yonik Seeley
On Sat, Mar 31, 2012 at 11:50 AM, Erick Erickson
erickerick...@gmail.com wrote:
 Try escaping the '+' with %2B (as I remember).

Shouldn't that be the other way?  The admin UI should do any necessary
escaping, so those + chars should instead be a spaces?

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: SOLR hangs - update timeout - please help

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 4:24 AM, Lance Norskog goks...@gmail.com wrote:
 5-7 seconds- there's the problem. If you want to have documents
 visible for search within that time, you want to use the trunk and
 near-real-time search. A hard commit does several hard writes to the
 disk (with the fsync() system call). It does not run smoothly at that
 rate. It is no surprise that eventually you hit a thread-locking bug.

Are you speaking of a JVM bug, or something else?  A Lucene bug?  A Solr bug?

Rafal, do you have a thread dump of when the update hangs (as opposed
to at shutdown?)

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: SOLR hangs - update timeout - please help

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 1:50 PM, Rafal Gwizdala
rafal.gwizd...@gmail.com wrote:
 Below i'm pasting the thread dump taken when the update was hung (it's also
 attached to the first message of this topic)

Interesting...
It looks like there's only one thread in solr code (the one generating
the thread dump).

The stack trace looks like you switched Jetty to use the NIO connector perhaps?
Could you try with the Jetty shipped with Solr (exactly as configured)?

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: SOLR hangs - update timeout - please help

2012-03-29 Thread Yonik Seeley
Oops... my previous replies accidentally went off-list.  I'll cut-n-paste below.

OK, so it looks like there is probably no bug here - it's simply that
commits can sometimes take a long time and updates were blocked during
that time (and would have succeeded eventually except the jetty
timeout was not set long enough).

Things are better in trunk (4.0) with soft commits and updates that
can proceed concurrently with commits.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10



On Thu, Mar 29, 2012 at 3:11 PM, Rafal Gwizdala
rafal.gwizd...@gmail.com wrote:
 You're right, this is not default Jetty from Solr - I configured it from
 scratch and then added Solr.
 Previously I had autocommit enabled and also did commit on every update so
 this might also contribute to the problem. Now I disabled it and made the
 updates less frequent.
 If the autocommit is allowed to happen together with 'manual' commit on
 update then there could be simultaneous commits, which now shouldn't happen
 - there will be at most one update/commit active at a time.
 Request timeout is default for jetty, but don't know what's that value.

 Best regards
 RG


I wrote:
On Thu, Mar 29, 2012 at 2:25 PM, Rafal Gwizdala
rafal.gwizd...@gmail.com wrote:
 Yonik, I didn't say there was an update request active at the moment the
 thread dump was made, only that previous update requests failed with a
 timeout. So maybe this is the missing piece.
 I didn't enable nio with Jetty, probably it's there by default.

Not with the jetty that comes with Solr.

bq. If solr hangs next time I'll try to make a thread dump when the
update request is waiting for completion.

Great!  We need to see where it's hanging!
Also, how long did the request take to time out?  Do you have
auto-commit enabled?
In the 3x series, updates will block while commits are in progress, so
timeouts can happen if they are set too short (and it seems like maybe
you aren't using the Jetty from Solr, so the configuration may not be
ideal).


Re: bbox query and range queries

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 6:20 PM, Alexandre Rocco alel...@gmail.com wrote:
 http://localhost:8984/solr/select?q=*:*&fq=local:[-23.6677,-46.7315 TO
 -23.6709,-46.7261]

Range queries always need to be [lower_bound TO upper_bound]
Try
http://localhost:8984/solr/select?q=*:*&fq=local:[-23.6709,-46.7315 TO
-23.6677,-46.7261]

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: bbox query and range queries

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 6:44 PM, Alexandre Rocco alel...@gmail.com wrote:
 Yonik,

 Thanks for the heads-up. That one worked.

 Just trying to wrap around how it would work on a real case. To test this
 one I just got the coordinates from Google Maps and searched within the pair
 of coordinates as I got them. Should I always check which is the lower and
 upper to assemble the query?

Yep... range query on LatLonField is currently pretty low level, and
you need to ensure yourself that lat1 <= lat2 and lon1 <= lon2 in
[lat1,lon1 TO lat2,lon2]

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: Optimizing in SolrCloud

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 7:15 PM, Jamie Johnson jej2...@gmail.com wrote:
 Thanks, does it matter that we are also updates to documents at
 various times?  Do the deleted documents get removed when doing a
 merge or does that only get done on an optimize?

Yes, any merge removes documents that have been marked as deleted
(from the segments involved in the merge).

Optimize can still make sense, but more often in scenarios where
documents are updated infrequently.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: NullPointException when Faceting

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 6:33 PM, Jamie Johnson jej2...@gmail.com wrote:
 I recently got this stack trace when trying to execute a facet based
 query on my index.  The error went away when I did an optimize but I
 was surprised to see it at all.  Can anyone shed some light on why
 this may have happened?

I don't see how that could happen (and I've never seen it happen).

I recently fixed one NPE: https://issues.apache.org/jira/browse/SOLR-3150
Hopefully this isn't another!

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: SolrCloud replica and leader out of Sync somehow

2012-03-20 Thread Yonik Seeley
On Tue, Mar 20, 2012 at 11:17 AM, Jamie Johnson jej2...@gmail.com wrote:
 ok, with my custom component out of the picture I still have the same
 issue.  Specifically, when sorting by score on a leader and replica I
 am getting different doc orderings.  Is this something anyone has
 seen?

This is certainly possible and expected - the sorting tiebreaker is the
internal lucene docid, which can change (even on a single node!)
If you need lists that don't shift around due to unrelated changes,
make sure you don't have any ties!

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: SolrCloud replica and leader out of Sync somehow

2012-03-20 Thread Yonik Seeley
On Tue, Mar 20, 2012 at 11:39 AM, Jamie Johnson jej2...@gmail.com wrote:
 Hmmm... OK, I don't see how it's possible for me to ensure that there
 are no ties.  If a query were for *:* everything has a constant score,
 if the user requested 1 page then requested the next the results on
 the second page could be duplicates from what was on the first page.
 I don't remember ever seeing this issue on older versions of
 SolrCloud, although from what you're saying I should have.  What could
 explain why I never saw this before?

If you use replication only to duplicate an index (and avoid any
merges), then you will have identical docids.

 Another possible fix to ensure proper ordering: couldn't we always
 specify a sort order which contained the key?  So for instance the
 user asks for score asc, we'd make this score asc,key asc so that
 results would be order by score and then by key so the results across
 pages would be consistent?

Yep.

And like I said, this is also an issue even on a single node.
docid A can be before docid B, then a segment merge can cause these to
be shuffled.
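
A hedged example of such a request, assuming the uniqueKey field is
called "id":

  http://localhost:8983/solr/select?q=*:*&sort=score+desc,id+asc

Because the key is unique, the secondary sort removes all ties, so the
ordering stays stable across pages and across replicas.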

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: Multi-valued polyfields - Do they exist in the wild ?

2012-03-20 Thread Yonik Seeley
On Tue, Mar 20, 2012 at 2:17 PM,  ramdev.wud...@thomsonreuters.com wrote:
 Hi:
   We have been keen on using polyfields for a while. But we have been 
 restricted from using them because they do not seem to support multi-values 
 (yet).

Poly-fields should support multi-values; it's more that what uses them may not.
For example, LatLon isn't multiValued because it doesn't have a
mechanism to correlate multiple values per document.
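
For context, an illustrative single-valued LatLon setup in schema.xml,
roughly along the lines of the stock example schema (it assumes a "tdouble"
field type is already defined):

  <fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
  <field name="store" type="location" indexed="true" stored="true" multiValued="false"/>
  <dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>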

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: Is there a way for SOLR / SOLRJ to index files directly bypassing HTTP streaming?

2012-03-19 Thread Yonik Seeley
On Mon, Mar 19, 2012 at 4:38 PM, vybe3142 vybe3...@gmail.com wrote:
 Okay, I added the javabin handler snippet to the solrconfig.xml file
 (actually shared across all cores).  I got further (the request made it past
 tomcat and into SOLR) but  haven't quite succeeded yet.

 Server trace:
 Mar 19, 2012 3:31:35 PM org.apache.solr.core.SolrCore execute
 INFO: [testcore1] webapp=/solr path=/update/javabin
params={waitSearcher=true&commit=true&literal.id=testid1&waitFlush=true&wt=javabin&stream.file=C:\work\SolrClient\data\justin2.txt&version=2}
status=500 QTime=82

Is this justin2.txt file in the javabin format?  That's what you're
telling Solr by hitting the /update/javabin URL.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: Is there a way for SOLR / SOLRJ to index files directly bypassing HTTP streaming?

2012-03-19 Thread Yonik Seeley
On Mon, Mar 19, 2012 at 5:48 PM, vybe3142 vybe3...@gmail.com wrote:
 Thanks for the response

 No, the file is plain text.

 All I'm trying to do is index plain ASCII text files via a remote reference
 to their file paths.

The XML update handler expects a specific format of XML.
The JSON, CSV, and javabin update handlers likewise expect their own
specific document formats.

If you have Word, PDF, HTML, or plain text files, one way to index them is
http://wiki.apache.org/solr/ExtractingRequestHandler (aka Solr Cell)
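
A hedged example of posting a plain text file through Solr Cell; the
handler path and parameter names follow the wiki defaults, and the id value
is just taken from the trace earlier in this thread:

  curl 'http://localhost:8983/solr/update/extract?literal.id=testid1&commit=true' \
       -F 'myfile=@/path/to/justin2.txt'

If you'd rather keep passing a server-side path instead of uploading the
file, stream.file can be used with the same handler, provided remote
streaming is enabled in solrconfig.xml.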

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: 400 Error adding field 'tags'='[a,b,c]'

2012-03-13 Thread Yonik Seeley
Hmmm, this looks like it's generated by DocumentBuilder with the code

  catch( Exception ex ) {
    throw new SolrException( SolrException.ErrorCode.BAD_REQUEST,
        "ERROR: " + getID(doc, schema) + "Error adding field '" +
            field.getName() + "'='" + field.getValue() + "'", ex );
  }

Unfortunately, you're not getting the message from the underlying exception.
Is there a full stack trace in the logs?
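
For comparison only (not necessarily the cause here): the JSON update
handler expects multi-valued fields as JSON arrays, so a minimal body for
the document mentioned below would look roughly like

  [
    {
      "id": "http://www.mysite.com",
      "tags": ["car", "house", "farm"]
    }
  ]

The full stack trace should show whether the value actually arrived in that
shape.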

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


On Tue, Mar 13, 2012 at 7:05 PM, jlark alpti...@hotmail.com wrote:
 Hey Folks,
 I'm new to lucene/solr so pardon my lack of knowledge.

 I'm trying to feed some json to my solr instance through wget.
 I'm using the command

 wget 'http://localhost:8983/solr/update/json?commit=true'
 --post-file=itemsExported.json --header='Content-type:application/json'

 however the response I get is:
 2012-03-13 14:44:44 ERROR 400: ERROR: [doc=http://www.mysite.com] Error
 adding field 'tags'='[car,house,farm]'

 where the tag field in my schema looks like.

   <field name="tags" type="string" indexed="true" stored="true"
 multiValued="true"/>

 Not sure if I'm missing something. I'm not too sure how to debug this
 further either, so any help on both would be great.

 I was able to feed and test with some dummy docs so I'm pretty sure my
 method of submission works.

 Are there any further logs I can look at or turn on?

 Thanks so much,
 Alp

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/400-Error-adding-field-tags-a-b-c-tp3823853p3823853.html
 Sent from the Solr - User mailing list archive at Nabble.com.

