Re: Highlighting stopwords
O. Klein wrote Hmm, now the synonyms aren't highlighted anymore. OK, back to basics (I'm using trunk and FVH). What is the way to go if I want to search on a field without stopwords, but still want to highlight the stopwords (and still highlight synonyms and stemmed words)? I made a new field content_hl to prevent problems coming from copyField. When using hl.q=content_hl:(spell Check) I now get highlighting including stopwords, but when using hl.q=content_hl:(SC) where SC is a synonym I get no highlighting. Can you verify whether synonyms work when using hl.q? -- View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-stopwords-tp3681901p3743317.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Improving performance for SOLR geo queries?
hey thanks all for the suggestions, didn't have time to look into them yet as we're feature-sprinting for MWC, but will report back with some feedback over the next weeks (we will have a few more performance sprints in March) Best, Matthias On Mon, Feb 13, 2012 at 2:32 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Thu, Feb 9, 2012 at 1:46 PM, Yonik Seeley yo...@lucidimagination.com wrote: One way to speed up numeric range queries (at the cost of increased index size) is to lower the precisionStep. You could try changing this from 8 to 4 and then re-indexing to see how that affects your query speed. Your issue, and the fact that I had been looking at the post-filtering code again for another client, reminded me that I had been planning on implementing post-filtering for spatial. It's now checked into trunk. If you have the ability to use trunk, you can add a high cost (like cost=200) along with cache=false to trigger it. More details here: http://www.lucidimagination.com/blog/2012/02/10/advanced-filter-caching-in-solr/ -Yonik lucidimagination.com -- Matthias Käppler Lead Developer API Mobile Qype GmbH Großer Burstah 50-52 20457 Hamburg Telephone: +49 (0)40 - 219 019 2 - 160 Skype: m_kaeppler Email: matth...@qype.com Managing Director: Ian Brotherston Amtsgericht Hamburg HRB 95913 This e-mail and its attachments may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail and its attachments. Any unauthorized copying, disclosure or distribution of this e-mail and its attachments is strictly forbidden. This notice also applies to future messages.
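For reference, the precisionStep Yonik mentions is an attribute of the Trie field types in schema.xml; a sketch of lowering it from 8 to 4 (the field type name here follows the example schema and is an assumption; the index must be rebuilt after the change):

```
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="4" omitNorms="true" positionIncrementGap="0"/>
```

A smaller precisionStep indexes more terms per value, which speeds up numeric range queries at the cost of index size.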
Re: Language specific tokenizer for purpose of multilingual search in single-core solr,
only one field element? There should be two, or? One for each language. paul On 14 Feb 2012 at 07:34, bing wrote: Hi, all, I want to do multilingual search in single-core Solr. That requires defining language-specific tokenizers in schema.xml. Say, for example, I have two tokenizers, one for English (en) and one for simplified Chinese (zh-cn). Can I just put the following definitions together in one schema.xml, and both sets of the files (stopwords, synonyms, and protwords) in one directory?

1. fieldType and field definition for English (en)

<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index" language="en">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords_en.txt"/>
  </analyzer>
  .
</fieldType>
<field name="text_en" type="text_en" indexed="true" stored="false" multiValued="true"/>

2. fieldType and field definition for Chinese (zh_cn)

<fieldType name="text_zh_ch" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index" language="zh_cn">
    <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_ch.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords_en.txt"/>
  </analyzer>
  .
</fieldType>
<field name="text_zh_cn" type="text_zh_cn" indexed="true" stored="false" multiValued="true"/>

Best Bing -- View this message in context: http://lucene.472066.n3.nabble.com/Language-specific-tokenizer-for-purpose-of-multilingual-search-in-single-core-solr-tp3742873p3742873.html Sent from the Solr - User mailing list archive at Nabble.com.
sort my results alphabetically on facetnames
I want to sort my results on the facet names (not by their number of results). So now I have this (ordered by number of results):

Instelling voor auditief gehandicapten (16)
Audiologisch centrum (13)
Huisartsenpraktijk (13)
Instelling voor lichamelijk gehandicapten (13)
Ambulancezorg (12)
Beroepsorganisatie (12)

What I want is this:

Ambulancezorg (12)
Audiologisch centrum (13)
Beroepsorganisatie (12)
Huisartsenpraktijk (13)
Instelling voor auditief gehandicapten (16)
Instelling voor lichamelijk gehandicapten (13)

How can I change my request URL to sort differently? My current request URL is like so:

http://localhost:8983/solr/zz_healthorg/select/?indent=on&facet=true&q=*:*&fl=id&facet.field=healthorganizationtypes_raw_nl&facet.mincount=1

with the result below:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="facet">true</str>
      <str name="fl">id</str>
      <str name="indent">on</str>
      <str name="facet.mincount">1</str>
      <str name="q">*:*</str>
      <str name="facet.field">healthorganizationtypes_raw_nl</str>
    </lst>
  </lst>
  <result name="response" numFound="258" start="0">
    <doc><str name="id">1</str></doc>
    <doc><str name="id">2</str></doc>
    <doc><str name="id">3</str></doc>
    <doc><str name="id">4</str></doc>
    <doc><str name="id">5</str></doc>
    <doc><str name="id">6</str></doc>
    <doc><str name="id">7</str></doc>
    <doc><str name="id">8</str></doc>
    <doc><str name="id">9</str></doc>
    <doc><str name="id">10</str></doc>
  </result>
  <lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields">
      <lst name="healthorganizationtypes_raw_nl">
        <int name="Instelling voor auditief gehandicapten">16</int>
        <int name="Audiologisch centrum">13</int>
        <int name="Huisartsenpraktijk">13</int>
        <int name="Instelling voor lichamelijk gehandicapten">13</int>
        <int name="Ambulancezorg">12</int>
        <int name="Beroepsorganisatie">12</int>
      </lst>
    </lst>
    <lst name="facet_dates"/>
    <lst name="facet_ranges"/>
  </lst>
</response>

-- View this message in context: http://lucene.472066.n3.nabble.com/sort-my-results-alphabetically-on-facetnames-tp3743471p3743471.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Re:how to monitor solr in newrelic
Try this when you start SOLR java -javaagent:/NEWRELICPATH/newrelic.jar -jar start.jar Normally you will see your SOLR installation on your newrelic dashboard in 2 minutes. -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-monitor-solr-in-newrelic-tp3739567p3743488.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: sort my results alphabetically on facetnames
Hi! On 14.02.2012 13:09, PeterKerk wrote: I want to sort my results on the facetnames (not by their number of results). From the example you gave, I'd assume you don't want to sort by facet names but by facet values. Simply add facet.sort=index to your request; see http://wiki.apache.org/solr/SimpleFacetParameters#facet.sort Or simply sort the facet result on your own. Greetings, Kuli
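If you'd rather sort the facet result on your own, a minimal client-side sketch (the facet values and counts below are the ones from the question, hard-coded for illustration):

```python
# Sort facet (value, count) pairs alphabetically by value instead of
# by count, which is what facet.sort=index does on the Solr side.
facets = [
    ("Instelling voor auditief gehandicapten", 16),
    ("Audiologisch centrum", 13),
    ("Huisartsenpraktijk", 13),
    ("Instelling voor lichamelijk gehandicapten", 13),
    ("Ambulancezorg", 12),
    ("Beroepsorganisatie", 12),
]
facets.sort(key=lambda pair: pair[0].lower())  # case-insensitive sort on the name
for name, count in facets:
    print(f"{name} ({count})")
```

Note that facet.sort=index sorts by the raw indexed term, so if case matters you may still want to normalize on the client as above.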
Re: Highlighting stopwords
O. Klein wrote O. Klein wrote Hmm, now the synonyms aren't highlighted anymore. OK, back to basics (I'm using trunk and FVH). What is the way to go if I want to search on a field without stopwords, but still want to highlight the stopwords (and still highlight synonyms and stemmed words)? I made a new field content_hl to prevent problems coming from copyField. When using hl.q=content_hl:(spell Check) I now get highlighting including stopwords, but when using hl.q=content_hl:(SC) where SC is a synonym I get no highlighting. Can you verify whether synonyms work when using hl.q? OK, I got it working by using hl.q=content_hl:(spell Check) content_text:(spell Check) but it makes no sense to me. The only difference between the 2 fields is the use of stopwords. What's also weird is that a query like hl.q=content_spell:(SC) also highlights synonyms, even though this field has no synonyms. I have not been able to find any logic in the behavior of hl.q and how it analyses the query. Could you explain how it is supposed to work? -- View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-stopwords-tp3681901p3743616.html Sent from the Solr - User mailing list archive at Nabble.com.
'foruns' doesn't match 'forum' with NGramFilterFactory (or EdgeNGramFilterFactory)
Hello all, I'm experimenting with NGramFilterFactory and EdgeNGramFilterFactory. Both of them show a match in my Solr admin analysis, but when I query 'foruns' it doesn't find any 'forum'. analysis: http://bhakta.casadomato.org:8982/solr/admin/analysis.jsp?nt=type&name=text&verbose=on&highlight=on&val=f%C3%B3runs&qverbose=on&qval=f%C3%B3runs search: http://bhakta.casadomato.org:8982/solr/select/?q=foruns&version=2.2&start=0&rows=10&indent=on Does anybody know what the problem is? bráulio
Stemming and accents (HunspellStemFilterFactory)
Hello all, I'm evaluating the HunspellStemFilterFactory and found that it works with a pt_PT dictionary. For example, if I search for 'fóruns' it stems it to 'fórum' and then finds 'fórum' references. But if I search for 'foruns' (without the accent), then HunspellStemFilterFactory cannot stem the word, as it does not exist in its dictionary. Is there any way to make HunspellStemFilterFactory work regardless of accent differences? best, bráulio
Re: SolrCloud Replication Question
Has there been any success in replicating this? I'm wondering if it could be something with my setup that is causing the issue... On Mon, Feb 13, 2012 at 8:55 AM, Jamie Johnson jej2...@gmail.com wrote: Yes, I have the following layout on the FS:

./bootstrap.sh
./example (standard example directory from distro containing jetty jars, solr confs, solr war, etc)
./slice1
  - start.sh
  - solr.xml
  - slice1_shard1
    - data
  - slice2_shard2
    - data
./slice2
  - start.sh
  - solr.xml
  - slice2_shard1
    - data
  - slice1_shard2
    - data

If it matters, I'm running everything from localhost, zk and the solr shards. On Mon, Feb 13, 2012 at 8:42 AM, Sami Siren ssi...@gmail.com wrote: Do you have a unique dataDir for each instance? 13.2.2012 14.30 Jamie Johnson jej2...@gmail.com kirjoitti:
Debugging on 3.5
I did find a solution, but the output is horrible. Why does explain look so bad?

<lst name="explain"><str name="2H7DF">6.351252 = (MATCH) boost(*:*,query(specialties_ids: #1;#0;#0;#0;#0;#0;#0;#0;#0; ,def=0.0)), product of: 1.0 = (MATCH) MatchAllDocsQuery, product of: 1.0 = queryNorm 6.351252 = query(specialties_ids: #1;#0;#0;#0;#0;#0;#0;#0;#0; ,def=0.0)=6.351252</str>

defType=edismax&boost=query($param)&param=multi_field:87

-- We like the boost parameter in Solr 3.5 with eDismax. The question we have is that we would like to replace bq with boost, but we get the multi-valued field issue when we try to do this. Bill Bell Sent from mobile
Re: Solr binary response for C#?
It's not as compact as binary format, but would just using something like JSON help enough? This is really simple, just specify wt=json (there's a method to set this on the server, at least in Java). Otherwise, you might get a more knowledgeable response on the C# java list, I'm frankly clueless Best Erick On Mon, Feb 13, 2012 at 1:15 PM, naptowndev naptowndev...@gmail.com wrote: Admittedly I'm new to this, but the project we're working on feeds results from Solr to an ASP.net application. Currently we are using XML, but our payloads can be rather large, some up to 17MB. We are looking for a way to minimize that payload and increase performance and I'm curious if there's anything anyone has been working out that creates a binary response that can be read by C# (similar to the javabin response built into Solr). That, or if anyone has experience implementing an external protocol like Thrift with Solr and consuming it with C# - again all in the effort to increase performance across the wire and while being consumed. Any help and direction would be greatly appreciated! Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-binary-response-for-C-tp3741101p3741101.html Sent from the Solr - User mailing list archive at Nabble.com.
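A minimal sketch of the "just specify wt=json" suggestion, building the request URL by hand (the host and handler path are placeholders, not from the thread; any HTTP client in C# would do the same):

```python
from urllib.parse import urlencode

# Ask Solr for a JSON response instead of XML by adding wt=json
# to the select request.
params = {"q": "*:*", "rows": 10, "wt": "json"}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```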
Mmap
Does someone have an example of using unmap in 3.5, and chunksize? I am using Solr 3.5. I noticed in solrconfig.xml:

<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>

I don't see this parameter taking effect when I set -Dsolr.directoryFactory=solr.MMapDirectoryFactory. How do I see the setting in the log or in stats.jsp? I cannot find a place that indicates whether it is set or not. I would assume StandardDirectoryFactory is being used, but I see the same thing whether I set it or not. Bill Bell Sent from mobile
Re: Improving performance for SOLR geo queries?
Can we get this back ported to 3x? Bill Bell Sent from mobile On Feb 14, 2012, at 3:45 AM, Matthias Käppler matth...@qype.com wrote: hey thanks all for the suggestions, didn't have time to look into them yet as we're feature-sprinting for MWC, but will report back with some feedback over the next weeks (we will have a few more performance sprints in March) Best, Matthias On Mon, Feb 13, 2012 at 2:32 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Thu, Feb 9, 2012 at 1:46 PM, Yonik Seeley yo...@lucidimagination.com wrote: One way to speed up numeric range queries (at the cost of increased index size) is to lower the precisionStep. You could try changing this from 8 to 4 and then re-indexing to see how that affects your query speed. Your issue, and the fact that I had been looking at the post-filtering code again for another client, reminded me that I had been planning on implementing post-filtering for spatial. It's now checked into trunk. If you have the ability to use trunk, you can add a high cost (like cost=200) along with cache=false to trigger it. More details here: http://www.lucidimagination.com/blog/2012/02/10/advanced-filter-caching-in-solr/ -Yonik lucidimagination.com
Re: Highlighting stopwords
(12/02/14 22:25), O. Klein wrote: I have not been able to find any logic in the behavior of hl.q and how it analyses the query. Could you explain how it is supposed to work? Nothing special about hl.q. If you use hl.q, its value will be used for highlighting rather than the value of q. There are no tricks, I think. When using hl.q=content_hl:(spell Check) I now get highlighting including stopwords, but when using hl.q=content_hl:(SC) where SC is a synonym I get no highlighting. Can you verify whether synonyms work when using hl.q? : OK, I got it working by using hl.q=content_hl:(spell Check) content_text:(spell Check) but it makes no sense to me. The only difference between the 2 fields is the use of stopwords. Uh, what you tried was changing the field between q and hl.q; that is not a use case I expected when I proposed hl.q. Do you think that hl.text meets your needs? https://issues.apache.org/jira/browse/SOLR-1926?focusedCommentId=12871234&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12871234 koji -- Apache Solr Query Log Visualizer http://soleami.com/
Re: Stemming and accents (HunspellStemFilterFactory)
Hi Bráulio, I don't know about HunspellStemFilterFactory specifically, but concerning accents: there are several accent filters that will remove accents from your tokens. If the Hunspell filter factory requires the accents, then simply add the accent filter after Hunspell in your index and query filter chains. You would then have Hunspell produce the tokens as the result of the stemming, and only afterwards would the accents be removed (your example: 'forum' instead of 'fórum'). Do the same on the query side in case someone inputs accents. Accent filters are: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ICUTokenizerFactory (lowercases, as well!) http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ASCIIFoldingFilterFactory and others on that page. Chantal On Tue, 2012-02-14 at 14:48 +0100, Bráulio Bhavamitra wrote: Hello all, I'm evaluating the HunspellStemFilterFactory and found that it works with a pt_PT dictionary. For example, if I search for 'fóruns' it stems it to 'fórum' and then finds 'fórum' references. But if I search for 'foruns' (without the accent), then HunspellStemFilterFactory cannot stem the word, as it does not exist in its dictionary. Is there any way to make HunspellStemFilterFactory work regardless of accent differences? best, bráulio
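A sketch in schema.xml of the chain Chantal describes (the field type name, tokenizer choice and dictionary file names here are assumptions, not from the thread; the same filter order would go in the query-side analyzer):

```
<fieldType name="text_pt" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- stem first, while the accents the Hunspell dictionary expects are still present -->
    <filter class="solr.HunspellStemFilterFactory" dictionary="pt_PT.dic" affix="pt_PT.aff" ignoreCase="true"/>
    <!-- then fold accents away, so 'fórum' and 'forum' both end up as 'forum' -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

The key point is the order: the folding filter sits after the stemmer, so accented input still stems, while unaccented and accented forms collapse to the same indexed token.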
Re: Highlighting stopwords
Koji Sekiguchi wrote Uh, what you tried was changing the field between q and hl.q; that is not a use case I expected when I proposed hl.q. Do you think that hl.text meets your needs? https://issues.apache.org/jira/browse/SOLR-1926?focusedCommentId=12871234&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12871234 koji -- Apache Solr Query Log Visualizer http://soleami.com/ Well, if I understand it correctly, yes, if this means that queries are analyzed like the field they are highlighting. That would give the highlighter a lot more flexibility. -- View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-stopwords-tp3681901p3744054.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SolrCloud Replication Question
Sorry, have not gotten it yet, but will be back trying later today - monday, tuesday tend to be slow for me (meetings and crap). - Mark On Feb 14, 2012, at 9:10 AM, Jamie Johnson wrote: Has there been any success in replicating this? I'm wondering if it could be something with my setup that is causing the issue... On Mon, Feb 13, 2012 at 8:55 AM, Jamie Johnson jej2...@gmail.com wrote: Yes, I have the following layout on the FS ./bootstrap.sh ./example (standard example directory from distro containing jetty jars, solr confs, solr war, etc) ./slice1 - start.sh -solr.xml - slice1_shard1 - data - slice2_shard2 -data ./slice2 - start.sh - solr.xml -slice2_shard1 -data -slice1_shard2 -data if it matters I'm running everything from localhost, zk and the solr shards On Mon, Feb 13, 2012 at 8:42 AM, Sami Siren ssi...@gmail.com wrote: Do you have unique dataDir for each instance? 13.2.2012 14.30 Jamie Johnson jej2...@gmail.com kirjoitti: - Mark Miller lucidimagination.com
Re: SolrJ + SolrCloud
No hard plans around that at the moment, but when I free up some time I plan on looking at the JIRA issue I pointed to. Looks like a lot of the work may already be done. - mark On Feb 12, 2012, at 8:14 AM, Darren Govoni wrote: Thanks Mark. Is there any plan to make all the Solr search handlers work with SolrCloud, like MLT? That missing feature would prohibit us from using SolrCloud at the moment. :( On Sat, 2012-02-11 at 18:24 -0500, Mark Miller wrote: On Feb 11, 2012, at 6:02 PM, Darren Govoni wrote: Hi, Do all the normal facilities of Solr work with SolrCloud from SolrJ? Things like /mlt, /cluster, facets, tvf's, etc. Darren SolrJ works the same in SolrCloud mode as it does in non-SolrCloud mode - it's fully supported. There is even a new SolrJ client called CloudSolrServer that has built-in cluster awareness and load balancing. In terms of what is supported - anything that is supported with distributed search - that is most things, but there is the odd man out - like MLT - looks like an issue is open here: https://issues.apache.org/jira/browse/SOLR-788 but it's not resolved yet. - Mark Miller lucidimagination.com - Mark Miller lucidimagination.com
Re: SolrCloud Replication Question
Thanks Mark, not a huge rush, just me trying to get to use the latest stuff on our project. On Tue, Feb 14, 2012 at 10:53 AM, Mark Miller markrmil...@gmail.com wrote: Sorry, have not gotten it yet, but will be back trying later today - monday, tuesday tend to be slow for me (meetings and crap). - Mark On Feb 14, 2012, at 9:10 AM, Jamie Johnson wrote: Has there been any success in replicating this? I'm wondering if it could be something with my setup that is causing the issue... On Mon, Feb 13, 2012 at 8:55 AM, Jamie Johnson jej2...@gmail.com wrote: Yes, I have the following layout on the FS ./bootstrap.sh ./example (standard example directory from distro containing jetty jars, solr confs, solr war, etc) ./slice1 - start.sh -solr.xml - slice1_shard1 - data - slice2_shard2 -data ./slice2 - start.sh - solr.xml -slice2_shard1 -data -slice1_shard2 -data if it matters I'm running everything from localhost, zk and the solr shards On Mon, Feb 13, 2012 at 8:42 AM, Sami Siren ssi...@gmail.com wrote: Do you have unique dataDir for each instance? 13.2.2012 14.30 Jamie Johnson jej2...@gmail.com kirjoitti: - Mark Miller lucidimagination.com
Re: OR-FilterQuery
On Mon, Feb 13, 2012 at 11:17 PM, spr...@gmx.eu wrote: Hi, how efficient is such a query: q=some text fq=id:(1 OR 2 OR 3...) Should I better use q:some text AND id:(1 OR 2 OR 3...)? 1. These two options have different scoring. 2. If you hit the same fq=id:(1 OR 2 OR 3...) many times, you have a benefit due to reading the docset from the heap instead of searching on disk. Is the Filter Cache used for the OR'ed fq? The filter cache is used for whatever filter. I guess I didn't get you. Can you rephrase your question? Thank you -- Sincerely yours Mikhail Khludnev Lucid Certified Apache Lucene/Solr Developer Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: Need help with graphing function (MATH)
Thanks, I'll have a look at this. I should have mentioned that the actual values on the graph aren't important; rather, I was showing an example of how the function should behave. On 2/13/12 6:25 PM, Kent Fitch wrote: Hi, assuming you have x and want to generate y, then maybe - if x < 50, y = 150 - if x > 175, y = 60 - otherwise: either y = (100/(e^((x-50)/75)^2)) + 50 http://www.wolframalpha.com/input/?i=plot++%28100%2F%28e^%28%28x+-50%29%2F75%29^2%29%29+%2B+50%2C+x%3D50..175 - or maybe y = sin((x+5)/38)*42+105 http://www.wolframalpha.com/input/?i=plot++sin%28%28x%2B5%29%2F38%29*42%2B105%2C+x%3D50..175 Regards, Kent Fitch On Tue, Feb 14, 2012 at 12:29 PM, Mark static.void@gmail.com wrote: I need some help with one of my boost functions. I would like the function to look something like the following mockup below. Starts off flat, then there is a gradual decline, steep decline, then gradual decline, and then back to flat. Can some of you math guys please help :) Thanks.
Re: OR-FilterQuery
bq: Is the Filter Cache used for the OR'ed fq? The filter cache is actually pretty simple conceptually. It's just a map where the key is the fq and the value is the set of documents that satisfy that fq (we'll skip the implementation here, just think of it as the list of all the docs that the fq selects). Solr doesn't attempt to do much with the key, just think of it as a single string. Whether or not an fq is reused from the cache depends upon whether the key is in the map. So fq=id:(1 OR 2 OR 3) will just look to see if id:(1 OR 2 OR 3) is a key. If so, it'll just use the document list stored in the cache. It won't match id:(1 OR 2) or id:(2) or id:1 OR id:2 OR id:3 In other words, there's no attempt to decompose the fq clause and store parts of it in the cache, it's exact-match or nothing. Hope that helps Erick On Mon, Feb 13, 2012 at 2:17 PM, spr...@gmx.eu wrote: Hi, how efficent is such an query: q=some text fq=id:(1 OR 2 OR 3...) Should I better use q:some text AND id:(1 OR 2 OR 3...)? Is the Filter Cache used for the OR'ed fq? Thank you
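Erick's exact-match behaviour can be sketched as a toy model (this simulates the keying with plain strings for illustration; the real filterCache keys on the parsed query object, but the no-decomposition point is the same):

```python
# Toy filterCache: one entry per distinct fq, with no attempt to
# decompose or normalize the clauses inside it.
filter_cache = {}

def cached_docset(fq, evaluate):
    if fq not in filter_cache:           # exact match on the whole fq, or nothing
        filter_cache[fq] = evaluate(fq)  # miss: run the filter and store the docset
    return filter_cache[fq]

evaluate = lambda fq: frozenset({1, 2, 3})  # stand-in for actually running the filter

cached_docset("id:(1 OR 2 OR 3)", evaluate)
cached_docset("id:(1 OR 2 OR 3)", evaluate)      # hit: docset reused
cached_docset("id:1 OR id:2 OR id:3", evaluate)  # logically the same filter, but a miss
print(len(filter_cache))
```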
Re: Need help with graphing function (MATH)
On 14 February 2012 23:35, Mark static.void@gmail.com wrote: Thanks, I'll have a look at this. I should have mentioned that the actual values on the graph aren't important; rather, I was showing an example of how the function should behave. [...] either y = (100/(e^((x-50)/75)^2)) + 50 [...] In general, the exponential will be better behaved than the sinusoid. You can change the exact values by tweaking the coefficients in the equation. Regards, Gora
Re: Re: Solr 3.5 not starting on CentOS 6 or RHEL 5
Nope, I don't have a custom /tmp mount in fstab, I just have a basic CentOS 6 install for development and testing... Full everyone read/write permissions are in place on /tmp too. Is /tmp a separate file system? There are problems with people mounting /tmp with 'noexec' as a security precaution, which then causes Solr to fail. Russ Bernhardt Systems Office Dudley Knox Library, Naval Postgraduate School Monterey, CA
Re: OR-FilterQuery
Hi Em, I briefly read the thread. Are you talking about combining cached clauses of a BooleanQuery, instead of evaluating the whole BQ as a filter? I found something like that in the API (but only in the API) http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean) Did I get you right? Why do you need it, btw? If I did... I have an idea how to do it in two mins: q=+f:text +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)... The right leg will be a BooleanQuery with SHOULD clauses backed by cached queries (see below). If you are not scared by the syntax yet, you can implement a trivial fqQParserPlugin, which will be just: // lazily through User/Generic Cache q = new FilteredQuery(new MatchAllDocsQuery(), new CachingWrapperFilter(new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V))))); return q; It will use a per-segment bitset, in contrast to Solr's fq which caches for the top-level reader. WDYT? On Mon, Feb 13, 2012 at 11:34 PM, Em mailformailingli...@yahoo.de wrote: Hi, have a look at: http://search-lucene.com/m/Z8lWGEiKoI I think not much has changed since then. Regards, Em Am 13.02.2012 20:17, schrieb spr...@gmx.eu: Hi, how efficient is such a query: q=some text fq=id:(1 OR 2 OR 3...) Should I better use q:some text AND id:(1 OR 2 OR 3...)? Is the Filter Cache used for the OR'ed fq? Thank you -- Sincerely yours Mikhail Khludnev Lucid Certified Apache Lucene/Solr Developer Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: Need help with graphing function (MATH)
In general this kind of function is very easy to construct using sums of basic sigmoidal functions. The logistic and probit functions are commonly used for this. Sent from my iPhone On Feb 14, 2012, at 10:05, Mark static.void@gmail.com wrote: Thanks, I'll have a look at this. I should have mentioned that the actual values on the graph aren't important; rather, I was showing an example of how the function should behave. On 2/13/12 6:25 PM, Kent Fitch wrote: Hi, assuming you have x and want to generate y, then maybe - if x < 50, y = 150 - if x > 175, y = 60 - otherwise: either y = (100/(e^((x-50)/75)^2)) + 50 http://www.wolframalpha.com/input/?i=plot++%28100%2F%28e^%28%28x+-50%29%2F75%29^2%29%29+%2B+50%2C+x%3D50..175 - or maybe y = sin((x+5)/38)*42+105 http://www.wolframalpha.com/input/?i=plot++sin%28%28x%2B5%29%2F38%29*42%2B105%2C+x%3D50..175 Regards, Kent Fitch On Tue, Feb 14, 2012 at 12:29 PM, Mark static.void@gmail.com wrote: I need some help with one of my boost functions. I would like the function to look something like the following mockup below. Starts off flat, then there is a gradual decline, steep decline, then gradual decline, and then back to flat. Can some of you math guys please help :) Thanks.
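A minimal sketch of the logistic-function approach (all constants below, the 150/60 plateaus, the centre and the steepness, are illustrative numbers chosen to match the earlier piecewise description, not values from the thread):

```python
import math

def boost(x, hi=150.0, lo=60.0, centre=112.5, steepness=0.1):
    """Logistic S-curve: flat near `hi` for small x, flat near `lo`
    for large x, with a smooth decline in between."""
    return lo + (hi - lo) / (1.0 + math.exp(steepness * (x - centre)))

# Flat at ~150 before x=50, flat at ~60 after x=175, declining between.
for x in (50, 90, 112.5, 135, 175):
    print(x, round(boost(x), 1))
```

Larger `steepness` sharpens the middle decline; shifting `centre` moves where it happens. The same shape can be expressed as a Solr function query once the constants are settled.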
Re: Need help with graphing function (MATH)
Would you mind throwing out an example of these types of functions? Looking at Wikipedia (http://en.wikipedia.org/wiki/Probit) it seems like the probit function is very similar to what I want. Thanks On 2/14/12 10:56 AM, Ted Dunning wrote: In general this kind of function is very easy to construct using sums of basic sigmoidal functions. The logistic and probit functions are commonly used for this. Sent from my iPhone On Feb 14, 2012, at 10:05, Mark static.void@gmail.com wrote: Thanks, I'll have a look at this. I should have mentioned that the actual values on the graph aren't important; rather, I was showing an example of how the function should behave. On 2/13/12 6:25 PM, Kent Fitch wrote: Hi, assuming you have x and want to generate y, then maybe - if x < 50, y = 150 - if x > 175, y = 60 - otherwise: either y = (100/(e^((x-50)/75)^2)) + 50 http://www.wolframalpha.com/input/?i=plot++%28100%2F%28e^%28%28x+-50%29%2F75%29^2%29%29+%2B+50%2C+x%3D50..175 - or maybe y = sin((x+5)/38)*42+105 http://www.wolframalpha.com/input/?i=plot++sin%28%28x%2B5%29%2F38%29*42%2B105%2C+x%3D50..175 Regards, Kent Fitch On Tue, Feb 14, 2012 at 12:29 PM, Mark static.void@gmail.com wrote: I need some help with one of my boost functions. I would like the function to look something like the following mockup below. Starts off flat, then there is a gradual decline, steep decline, then gradual decline, and then back to flat. Can some of you math guys please help :) Thanks.
Re: Need help with graphing function (MATH)
Or better yet, an example in Solr would be best :) Thanks! On 2/14/12 11:05 AM, Mark wrote: Would you mind throwing out an example of these types of functions? Looking at Wikipedia (http://en.wikipedia.org/wiki/Probit) it seems like the probit function is very similar to what I want. Thanks On 2/14/12 10:56 AM, Ted Dunning wrote: In general this kind of function is very easy to construct using sums of basic sigmoidal functions. The logistic and probit functions are commonly used for this. Sent from my iPhone On Feb 14, 2012, at 10:05, Mark static.void@gmail.com wrote: Thanks, I'll have a look at this. I should have mentioned that the actual values on the graph aren't important; rather, I was showing an example of how the function should behave. On 2/13/12 6:25 PM, Kent Fitch wrote: Hi, assuming you have x and want to generate y, then maybe - if x < 50, y = 150 - if x > 175, y = 60 - otherwise: either y = (100/(e^((x-50)/75)^2)) + 50 http://www.wolframalpha.com/input/?i=plot++%28100%2F%28e^%28%28x+-50%29%2F75%29^2%29%29+%2B+50%2C+x%3D50..175 - or maybe y = sin((x+5)/38)*42+105 http://www.wolframalpha.com/input/?i=plot++sin%28%28x%2B5%29%2F38%29*42%2B105%2C+x%3D50..175 Regards, Kent Fitch On Tue, Feb 14, 2012 at 12:29 PM, Mark static.void@gmail.com wrote: I need some help with one of my boost functions. I would like the function to look something like the following mockup below. Starts off flat, then there is a gradual decline, steep decline, then gradual decline, and then back to flat. Can some of you math guys please help :) Thanks.
Re: OR-FilterQuery
Hi Mikhail, thanks for kicking in some brainstorming-code! The given thread is almost a year old and I was working with Solr in my freetime to see where it fails to behave/perform as I expect/wish. I found out that if you got a lot of different access-patterns for a filter-query, you might end up with either a big cache to make things fast or with lower performance (impact depends on usecase and circumstances). Scenario: You got a permission-field and the client is able to filter by one to three permission-values. That is: fq=foo:user fq=foo:moderator fq=foo:manager If you can not control/guarantee the order of the fq's values, you could end up with a lot of mess which all returns the same. Example: fq=permission:user OR permission:moderator OR permission:manager fq=permission:user OR permission:manager OR permission:moderator fq=permission:moderator OR permission:user OR permission:manager ... They all return the same but where cached seperately which leads to the fact that you are wasting memory a lot. Furthermore, if your access pattern will lead to a lot of different fq's on a small set of distinct values, it may make more sense to cache each filter-query for itself from a memory-consuming point of view (may cost a little bit performance). That beeing said, if you cache a filter for foo:user, foo:moderator and foo:manager you can combine those filters with AND, OR, NOT or whatever without recomputing every filter over and over again which would be the case if your filter-cache is not large enough. However, I never compared the performance differences (in terms of speed) of a cached filter-query like foo:bar OR foo:baz With a combination of two cached filter-queries like foo:bar foo:baz combined by a logical OR. That's how the background looks like. Unfortunately I didn't had the time to implement this in the past. Back to your post: Looks like a cool idea and is almost what I had in mind! 
I would formulate an easier syntax so that one is able to parse each fq-clause on its own and cache its CachingWrapperFilter for reuse. it will use a per-segment bitset in contrast to Solr's fq which caches for the top-level reader. Could you explain why this bitset would be per-segment based, please? I don't see a reason why this *has* to be so. What is the benefit you are seeing? Kind regards, Em Am 14.02.2012 19:33, schrieb Mikhail Khludnev: Hi Em, I briefly read the thread. Are you talking about combining cached clauses of a BooleanQuery, instead of evaluating the whole BQ as a filter? I found something like that in the API (but only in the API) http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean) Am I getting you right? Why do you need it, btw? If I am .. I have an idea how to do it in two mins: q=+f:text +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)... The right leg will be a BooleanQuery with SHOULD clauses backed by cached queries (see below). If you are not scared by the syntax yet, you can implement a trivial fqQParserPlugin, which will be just // lazily through User/Generic Cache q = new FilteredQuery(new MatchAllDocsQuery(), new CachingWrapperFilter(new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V))))); return q; it will use a per-segment bitset in contrast to Solr's fq, which caches for the top-level reader. WDYT? On Mon, Feb 13, 2012 at 11:34 PM, Em mailformailingli...@yahoo.de wrote: Hi, have a look at: http://search-lucene.com/m/Z8lWGEiKoI I think not much has changed since then. Regards, Em Am 13.02.2012 20:17, schrieb spr...@gmx.eu: Hi, how efficient is such a query: q=some text fq=id:(1 OR 2 OR 3...) Should I better use q=some text AND id:(1 OR 2 OR 3...)? Is the Filter Cache used for the OR'ed fq? Thank you
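Mikhail's inline snippet lost some closing parentheses in transit. A more complete sketch of the idea he describes (a trivial QParserPlugin whose parser wraps the sub-query in a CachingWrapperFilter, which caches one bitset per segment) might look roughly like the following. This is written against the Solr/Lucene 3.x-era API of the thread and is untested; the class name and the caching strategy are assumptions:

```java
// Untested sketch, Solr/Lucene 3.x-era API. Parses the local-params value
// as a sub-query and returns it as a filtered MatchAllDocsQuery backed by
// a CachingWrapperFilter (one cached bitset per index segment).
public class FqQParserPlugin extends QParserPlugin {
  @Override
  public void init(NamedList args) {}

  @Override
  public QParser createParser(String qstr, SolrParams localParams,
                              SolrParams params, SolrQueryRequest req) {
    return new QParser(qstr, localParams, params, req) {
      @Override
      public Query parse() throws ParseException {
        Query sub = subQuery(localParams.get(QueryParsing.V), null).getQuery();
        // A real plugin should keep the CachingWrapperFilter instances in a
        // cache (e.g. a Solr generic/user cache) keyed by the sub-query, so
        // repeated clauses reuse the same cached bitsets.
        return new FilteredQuery(new MatchAllDocsQuery(),
            new CachingWrapperFilter(new QueryWrapperFilter(sub)));
      }
    };
  }
}
```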
Re: Need help with graphing function (MATH)
Hi Mark, did you already have a look at http://wiki.apache.org/solr/FunctionQuery ? Regards, Em Am 14.02.2012 20:09, schrieb Mark: Or better yet an example in solr would be best :) Thanks! On 2/14/12 11:05 AM, Mark wrote: Would you mind throwing out an example of these types of functions. Looking at Wikipedia (http://en.wikipedia.org/wiki/Probit) it seems like the Probit function is very similar to what I want. Thanks On 2/14/12 10:56 AM, Ted Dunning wrote: In general this kind of function is very easy to construct using sums of basic sigmoidal functions. The logistic and probit functions are commonly used for this. Sent from my iPhone On Feb 14, 2012, at 10:05, Mark static.void@gmail.com wrote: Thanks I'll have a look at this. I should have mentioned that the actual values on the graph aren't important rather I was showing an example of how the function should behave. On 2/13/12 6:25 PM, Kent Fitch wrote: Hi, assuming you have x and want to generate y, then maybe - if x < 50, y = 150 - if x > 175, y = 60 - otherwise : either y = (100/(e^((x-50)/75)^2)) + 50 http://www.wolframalpha.com/input/?i=plot++%28100%2F%28e^%28%28x+-50%29%2F75%29^2%29%29+%2B+50%2C+x%3D50..175 - or maybe y = sin((x+5)/38)*42+105 http://www.wolframalpha.com/input/?i=plot++sin%28%28x%2B5%29%2F38%29*42%2B105%2C+x%3D50..175 Regards, Kent Fitch On Tue, Feb 14, 2012 at 12:29 PM, Mark static.void@gmail.com wrote: I need some help with one of my boost functions. I would like the function to look something like the following mockup below. Starts off flat then there is a gradual decline, steep decline then gradual decline and then back to flat. Can some of you math guys please help :) Thanks.
Re: Need help with graphing function (MATH)
In practice, I expect a linear piecewise function (with sharp corners) would be indistinguishable from the smoothed function. It is also much easier to read, test, and debug. It might even be faster. Try the sharp corners one first. wunder On Feb 14, 2012, at 10:56 AM, Ted Dunning wrote: In general this kind of function is very easy to construct using sums of basic sigmoidal functions. The logistic and probit functions are commonly used for this. Sent from my iPhone On Feb 14, 2012, at 10:05, Mark static.void@gmail.com wrote: Thanks I'll have a look at this. I should have mentioned that the actual values on the graph aren't important rather I was showing an example of how the function should behave. On 2/13/12 6:25 PM, Kent Fitch wrote: Hi, assuming you have x and want to generate y, then maybe - if x < 50, y = 150 - if x > 175, y = 60 - otherwise : either y = (100/(e^((x-50)/75)^2)) + 50 http://www.wolframalpha.com/input/?i=plot++%28100%2F%28e^%28%28x+-50%29%2F75%29^2%29%29+%2B+50%2C+x%3D50..175 - or maybe y = sin((x+5)/38)*42+105 http://www.wolframalpha.com/input/?i=plot++sin%28%28x%2B5%29%2F38%29*42%2B105%2C+x%3D50..175 Regards, Kent Fitch On Tue, Feb 14, 2012 at 12:29 PM, Mark static.void@gmail.com wrote: I need some help with one of my boost functions. I would like the function to look something like the following mockup below. Starts off flat then there is a gradual decline, steep decline then gradual decline and then back to flat. Can some of you math guys please help :) Thanks.
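For reference, the piecewise-linear version is only a few lines of code. The breakpoints below are guesses read off the thread's description (flat around 150 until x=50, declining to 60 by x=175, flat afterwards); the interior points are invented and would be tuned to taste:

```java
public class PiecewiseBoost {
    // Breakpoints are illustrative: flat at 150 below x=50, then a gradual,
    // steep, gradual decline down to 60 at x=175, flat afterwards.
    static final double[] XS = {50, 90, 135, 175};
    static final double[] YS = {150, 130, 75, 60};

    static double boost(double x) {
        if (x <= XS[0]) return YS[0];
        if (x >= XS[XS.length - 1]) return YS[YS.length - 1];
        int i = 1;
        while (x > XS[i]) i++;
        // Linear interpolation between the two surrounding breakpoints.
        double t = (x - XS[i - 1]) / (XS[i] - XS[i - 1]);
        return YS[i - 1] + t * (YS[i] - YS[i - 1]);
    }

    public static void main(String[] args) {
        for (double x : new double[] {0, 50, 90, 135, 175, 200}) {
            System.out.println(x + " -> " + boost(x));
        }
    }
}
```

Inside Solr itself, the same shape could likely be approximated by combining map() function queries rather than custom code, at some cost in readability.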
Re: Solr 3.5 not starting on CentOS 6 or RHEL 5
Perhaps this is some kind of vufind specific issue? The server (/example) bundled with solr unpacks the war in /example/work and not /tmp -Yonik lucidimagination.com On Mon, Feb 13, 2012 at 7:06 PM, Bernhardt, Russell (CIV) rgber...@nps.edu wrote: A software package we use recently upgraded to Solr 3.5 (from 1.4.1) and now we're having problems getting the Solr server to start up under RHEL 5 or CentOS 6. I upgraded our local install of Java to the latest from Oracle and it didn't help, even removed the local OpenJDK just to be sure. When starting jetty manually (with java -jar start.jar) I get the following messages: 2012-02-13 07:52:55.954::INFO: Logging to STDERR via org.mortbay.log.StdErrLog 2012-02-13 07:52:56.120::INFO: jetty-6.1.11 2012-02-13 07:52:56.184::INFO: Extract jar:file:/opt/vufind/solr/jetty/webapps/solr.war!/ to /tmp/Jetty_0_0_0_0_8080_solr.war__solr__7k9npr/webapp 2012-02-13 07:52:56.702::WARN: Failed startup of context org.mortbay.jetty.webapp.WebAppContext@15aaf0b3{/solr,jar:file:/opt/vufind/solr/jetty/webapps/solr.war!/} java.util.zip.ZipException: error in opening zip file at java.util.zip.ZipFile.open(Native Method) at java.util.zip.ZipFile.<init>(Unknown Source) at java.util.jar.JarFile.<init>(Unknown Source) at java.util.jar.JarFile.<init>(Unknown Source) at org.mortbay.jetty.webapp.TagLibConfiguration.configureWebApp(TagLibConfiguration.java:168) at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1217) at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:513) at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:39) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152) at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:39) at 
org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:39) at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130) at org.mortbay.jetty.Server.doStart(Server.java:222) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:39) at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:977) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at org.mortbay.start.Main.invokeMain(Main.java:194) at org.mortbay.start.Main.start(Main.java:512) at org.mortbay.start.Main.main(Main.java:119) 2012-02-13 07:52:56.713::INFO: Opened /opt/vufind/solr/jetty/logs/2012_02_13.request.log 2012-02-13 07:52:56.740::INFO: Started SelectChannelConnector@0.0.0.0:8080 Jetty starts up just fine but shows a 503 error when attempting to access localhost:8080/solr/. The temp directory structure does exist in /tmp/. Any ideas? Thanks, Russ Bernhardt Systems Analyst Library Information Systems Naval Postgraduate School, Monterey CA
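A common cause of this symptom on RHEL/CentOS (an educated guess consistent with the trace above, not a verified diagnosis for this install): tmpwatch periodically deletes old files under /tmp, leaving a half-empty exploded webapp that then fails with a ZipException. Following Yonik's hint, keeping the extraction directory out of /tmp avoids it; the paths below are taken from the stack trace and are illustrative:

```sh
# Jetty 6 prefers a ./work directory next to start.jar over java.io.tmpdir,
# so creating one stops the war from being unpacked under /tmp:
mkdir /opt/vufind/solr/jetty/work

# Alternatively, redirect the JVM temp dir explicitly:
java -Djava.io.tmpdir=/opt/vufind/solr/jetty/tmp -jar start.jar
```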
Re: SolrCloud Replication Question
Okay Jamie, I think I have a handle on this. It looks like an issue with what config files are being used by cores created with the admin core handler - I think it's just picking up default config and not the correct config for the collection. This means they end up using config that has no UpdateLog defined - and so recovery fails. I've added more logging around this so that it's easy to determine that. I'm investigating more and working on a test + fix. I'll file a JIRA issue soon as well. - Mark On Feb 14, 2012, at 11:39 AM, Jamie Johnson wrote: Thanks Mark, not a huge rush, just me trying to get to use the latest stuff on our project. On Tue, Feb 14, 2012 at 10:53 AM, Mark Miller markrmil...@gmail.com wrote: Sorry, have not gotten it yet, but will be back trying later today - monday, tuesday tend to be slow for me (meetings and crap). - Mark On Feb 14, 2012, at 9:10 AM, Jamie Johnson wrote: Has there been any success in replicating this? I'm wondering if it could be something with my setup that is causing the issue... On Mon, Feb 13, 2012 at 8:55 AM, Jamie Johnson jej2...@gmail.com wrote: Yes, I have the following layout on the FS ./bootstrap.sh ./example (standard example directory from distro containing jetty jars, solr confs, solr war, etc) ./slice1 - start.sh -solr.xml - slice1_shard1 - data - slice2_shard2 -data ./slice2 - start.sh - solr.xml -slice2_shard1 -data -slice1_shard2 -data if it matters I'm running everything from localhost, zk and the solr shards On Mon, Feb 13, 2012 at 8:42 AM, Sami Siren ssi...@gmail.com wrote: Do you have unique dataDir for each instance? 13.2.2012 14.30 Jamie Johnson jej2...@gmail.com kirjoitti: - Mark Miller lucidimagination.com - Mark Miller lucidimagination.com
Re: SolrCloud Replication Question
Sounds good, if I pull the latest from trunk and rerun will that be useful or were you able to duplicate my issue now? On Tue, Feb 14, 2012 at 3:00 PM, Mark Miller markrmil...@gmail.com wrote: Okay Jamie, I think I have a handle on this. It looks like an issue with what config files are being used by cores created with the admin core handler - I think it's just picking up default config and not the correct config for the collection. This means they end up using config that has no UpdateLog defined - and so recovery fails. I've added more logging around this so that it's easy to determine that. I'm investigating more and working on a test + fix. I'll file a JIRA issue soon as well. - Mark On Feb 14, 2012, at 11:39 AM, Jamie Johnson wrote: Thanks Mark, not a huge rush, just me trying to get to use the latest stuff on our project. On Tue, Feb 14, 2012 at 10:53 AM, Mark Miller markrmil...@gmail.com wrote: Sorry, have not gotten it yet, but will be back trying later today - monday, tuesday tend to be slow for me (meetings and crap). - Mark On Feb 14, 2012, at 9:10 AM, Jamie Johnson wrote: Has there been any success in replicating this? I'm wondering if it could be something with my setup that is causing the issue... On Mon, Feb 13, 2012 at 8:55 AM, Jamie Johnson jej2...@gmail.com wrote: Yes, I have the following layout on the FS ./bootstrap.sh ./example (standard example directory from distro containing jetty jars, solr confs, solr war, etc) ./slice1 - start.sh -solr.xml - slice1_shard1 - data - slice2_shard2 -data ./slice2 - start.sh - solr.xml -slice2_shard1 -data -slice1_shard2 -data if it matters I'm running everything from localhost, zk and the solr shards On Mon, Feb 13, 2012 at 8:42 AM, Sami Siren ssi...@gmail.com wrote: Do you have unique dataDir for each instance? 13.2.2012 14.30 Jamie Johnson jej2...@gmail.com kirjoitti: - Mark Miller lucidimagination.com - Mark Miller lucidimagination.com
Re: OR-FilterQuery
Whoa! fq=id:(1 OR 2) is not the same thing at all as fq=id:1&fq=id:2 Assuming that any document had one and only one ID, the second clause would return exactly 0 documents, each and every time. Multiple fq clauses are essentially set intersections. So the first query is the set of all documents where id is 1 or 2; the second is the intersection of two sets of documents, one set with an id of 1 and one with an id of 2. Not the same thing at all. There's no support for the concept of (fq=id:1 OR fq=id:2) Best Erick On Tue, Feb 14, 2012 at 2:13 PM, Em mailformailingli...@yahoo.de wrote: Hi Mikhail, thanks for kicking in some brainstorming-code! The given thread is almost a year old and I was working with Solr in my freetime to see where it fails to behave/perform as I expect/wish. I found out that if you have a lot of different access-patterns for a filter-query, you might end up with either a big cache to make things fast or with lower performance (the impact depends on use case and circumstances). Scenario: You have a permission-field and the client is able to filter by one to three permission-values. That is: fq=foo:user fq=foo:moderator fq=foo:manager If you cannot control/guarantee the order of the fq's values, you could end up with a lot of variants which all return the same. Example: fq=permission:user OR permission:moderator OR permission:manager fq=permission:user OR permission:manager OR permission:moderator fq=permission:moderator OR permission:user OR permission:manager ... They all return the same but are cached separately, which means you are wasting a lot of memory. Furthermore, if your access pattern will lead to a lot of different fq's on a small set of distinct values, it may make more sense to cache each filter-query for itself from a memory-consumption point of view (it may cost a little bit of performance). 
That beeing said, if you cache a filter for foo:user, foo:moderator and foo:manager you can combine those filters with AND, OR, NOT or whatever without recomputing every filter over and over again which would be the case if your filter-cache is not large enough. However, I never compared the performance differences (in terms of speed) of a cached filter-query like foo:bar OR foo:baz With a combination of two cached filter-queries like foo:bar foo:baz combined by a logical OR. That's how the background looks like. Unfortunately I didn't had the time to implement this in the past. Back to your post: Looks like a cool idea and is almost what I had in mind! I would formulate an easier syntax so that one is able to parse each fq-clause on its own to cache the CachingWrapperFilter to reuse it again. it will use per segment bitset at contrast to Solr's fq which caches for top level reader. Could you explain why this bitset would be per-segment based, please? I don't see a reason why this *have* to be so. What is the benefit you are seeing? Kind regards, Em Am 14.02.2012 19:33, schrieb Mikhail Khludnev: Hi Em, I briefly read the thread. Are you talking about combing of cached clauses of BooleanQuery, instead of evaluating whole BQ as a filter? I found something like that in API (but only in API) http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean) Am I get you right? Why do you need it, btw? If I'm .. I have idea how to do it in two mins: q=+f:text +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)... Right leg will be a BooleanQuery with SHOULD clauses backed on cached queries (see below). 
if you are not scarred by the syntax yet you can implement trivial fqQParserPlugin, which will be just // lazily through User/Generic Cache q = new FilteredQuery (new MatchAllDocsQuery(), new CachingWrapperFilter(new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V); return q; it will use per segment bitset at contrast to Solr's fq which caches for top level reader. WDYT? On Mon, Feb 13, 2012 at 11:34 PM, Em mailformailingli...@yahoo.de wrote: Hi, have a look at: http://search-lucene.com/m/Z8lWGEiKoI I think not much had changed since then. Regards, Em Am 13.02.2012 20:17, schrieb spr...@gmx.eu: Hi, how efficent is such an query: q=some text fq=id:(1 OR 2 OR 3...) Should I better use q:some text AND id:(1 OR 2 OR 3...)? Is the Filter Cache used for the OR'ed fq? Thank you
Re: OR-FilterQuery
BTW, you're not the first person who would like this capability, see: https://issues.apache.org/jira/browse/SOLR-1223 But the fact that this JIRA was originally opened in June of 2009 and hasn't been implemented yet indicates that it's not super-high priority. Best Erick On Tue, Feb 14, 2012 at 4:33 PM, Erick Erickson erickerick...@gmail.com wrote: Whoa! fq=id:(1 OR 2) is not the same thing at all as fq=id:1&fq=id:2 Assuming that any document had one and only one ID, the second clause would return exactly 0 documents, each and every time. Multiple fq clauses are essentially set intersections. So the first query is the set of all documents where id is 1 or 2; the second is the intersection of two sets of documents, one set with an id of 1 and one with an id of 2. Not the same thing at all. There's no support for the concept of (fq=id:1 OR fq=id:2) Best Erick On Tue, Feb 14, 2012 at 2:13 PM, Em mailformailingli...@yahoo.de wrote: Hi Mikhail, thanks for kicking in some brainstorming-code! The given thread is almost a year old and I was working with Solr in my freetime to see where it fails to behave/perform as I expect/wish. I found out that if you have a lot of different access-patterns for a filter-query, you might end up with either a big cache to make things fast or with lower performance (the impact depends on use case and circumstances). Scenario: You have a permission-field and the client is able to filter by one to three permission-values. That is: fq=foo:user fq=foo:moderator fq=foo:manager If you cannot control/guarantee the order of the fq's values, you could end up with a lot of variants which all return the same. Example: fq=permission:user OR permission:moderator OR permission:manager fq=permission:user OR permission:manager OR permission:moderator fq=permission:moderator OR permission:user OR permission:manager ... They all return the same but are cached separately, which means you are wasting a lot of memory. 
Furthermore, if your access pattern will lead to a lot of different fq's on a small set of distinct values, it may make more sense to cache each filter-query for itself from a memory-consuming point of view (may cost a little bit performance). That beeing said, if you cache a filter for foo:user, foo:moderator and foo:manager you can combine those filters with AND, OR, NOT or whatever without recomputing every filter over and over again which would be the case if your filter-cache is not large enough. However, I never compared the performance differences (in terms of speed) of a cached filter-query like foo:bar OR foo:baz With a combination of two cached filter-queries like foo:bar foo:baz combined by a logical OR. That's how the background looks like. Unfortunately I didn't had the time to implement this in the past. Back to your post: Looks like a cool idea and is almost what I had in mind! I would formulate an easier syntax so that one is able to parse each fq-clause on its own to cache the CachingWrapperFilter to reuse it again. it will use per segment bitset at contrast to Solr's fq which caches for top level reader. Could you explain why this bitset would be per-segment based, please? I don't see a reason why this *have* to be so. What is the benefit you are seeing? Kind regards, Em Am 14.02.2012 19:33, schrieb Mikhail Khludnev: Hi Em, I briefly read the thread. Are you talking about combing of cached clauses of BooleanQuery, instead of evaluating whole BQ as a filter? I found something like that in API (but only in API) http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean) Am I get you right? Why do you need it, btw? If I'm .. I have idea how to do it in two mins: q=+f:text +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)... Right leg will be a BooleanQuery with SHOULD clauses backed on cached queries (see below). 
if you are not scarred by the syntax yet you can implement trivial fqQParserPlugin, which will be just // lazily through User/Generic Cache q = new FilteredQuery (new MatchAllDocsQuery(), new CachingWrapperFilter(new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V); return q; it will use per segment bitset at contrast to Solr's fq which caches for top level reader. WDYT? On Mon, Feb 13, 2012 at 11:34 PM, Em mailformailingli...@yahoo.de wrote: Hi, have a look at: http://search-lucene.com/m/Z8lWGEiKoI I think not much had changed since then. Regards, Em Am 13.02.2012 20:17, schrieb spr...@gmx.eu: Hi, how efficent is such an query: q=some text fq=id:(1 OR 2 OR 3...) Should I better use q:some text AND id:(1 OR 2 OR 3...)? Is the Filter Cache used for the OR'ed fq? Thank you
Re: OR-FilterQuery
Hi Erick, Whoa! fq=id:(1 OR 2) is not the same thing at all as fq=id:1&fq=id:2 Ahm, who said they would be the same? :) I mean, you are completely right in what you are saying, but it seems to me that we are talking about two different things. I was talking about caching each filter-criterion instead of the whole filter-query, to recombine the cached filter-criteria based on the boolean operators the client sends. In other words: currently fq=id:1 OR id:2 results in ONE cached filter-entry. fq=id:2 OR id:1 results in ANOTHER cached filter-entry fq=id:2 AND id:1 results in (surprise, surprise) a third filter-entry (although this example does not make sense). My idea was to cache each filter-criterion, that means caching the bitset for id:1 and the bitset for id:2, to recombine both bitsets via AND, OR, NOT etc. whenever this is necessary. This way one could save memory (and maybe computing time as well), which definitely makes sense when you have a much smaller set of filter-criteria while having a much larger set of possible (and used) combinations of those criteria, with a small number of repetitions per combination (which would otherwise destroy the benefit of caching). Don't you agree? Kind regards, Em Am 14.02.2012 22:33, schrieb Erick Erickson: Whoa! fq=id:(1 OR 2) is not the same thing at all as fq=id:1&fq=id:2 Assuming that any document had one and only one ID, the second clause would return exactly 0 documents, each and every time. Multiple fq clauses are essentially set intersections. So the first query is the set of all documents where id is 1 or 2; the second is the intersection of two sets of documents, one set with an id of 1 and one with an id of 2. Not the same thing at all. There's no support for the concept of (fq=id:1 OR fq=id:2) Best Erick On Tue, Feb 14, 2012 at 2:13 PM, Em mailformailingli...@yahoo.de wrote: Hi Mikhail, thanks for kicking in some brainstorming-code! 
The given thread is almost a year old and I was working with Solr in my freetime to see where it fails to behave/perform as I expect/wish. I found out that if you got a lot of different access-patterns for a filter-query, you might end up with either a big cache to make things fast or with lower performance (impact depends on usecase and circumstances). Scenario: You got a permission-field and the client is able to filter by one to three permission-values. That is: fq=foo:user fq=foo:moderator fq=foo:manager If you can not control/guarantee the order of the fq's values, you could end up with a lot of mess which all returns the same. Example: fq=permission:user OR permission:moderator OR permission:manager fq=permission:user OR permission:manager OR permission:moderator fq=permission:moderator OR permission:user OR permission:manager ... They all return the same but where cached seperately which leads to the fact that you are wasting memory a lot. Furthermore, if your access pattern will lead to a lot of different fq's on a small set of distinct values, it may make more sense to cache each filter-query for itself from a memory-consuming point of view (may cost a little bit performance). That beeing said, if you cache a filter for foo:user, foo:moderator and foo:manager you can combine those filters with AND, OR, NOT or whatever without recomputing every filter over and over again which would be the case if your filter-cache is not large enough. However, I never compared the performance differences (in terms of speed) of a cached filter-query like foo:bar OR foo:baz With a combination of two cached filter-queries like foo:bar foo:baz combined by a logical OR. That's how the background looks like. Unfortunately I didn't had the time to implement this in the past. Back to your post: Looks like a cool idea and is almost what I had in mind! 
I would formulate an easier syntax so that one is able to parse each fq-clause on its own to cache the CachingWrapperFilter to reuse it again. it will use per segment bitset at contrast to Solr's fq which caches for top level reader. Could you explain why this bitset would be per-segment based, please? I don't see a reason why this *have* to be so. What is the benefit you are seeing? Kind regards, Em Am 14.02.2012 19:33, schrieb Mikhail Khludnev: Hi Em, I briefly read the thread. Are you talking about combing of cached clauses of BooleanQuery, instead of evaluating whole BQ as a filter? I found something like that in API (but only in API) http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean) Am I get you right? Why do you need it, btw? If I'm .. I have idea how to do it in two mins: q=+f:text +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)... Right leg will be a BooleanQuery with SHOULD clauses backed on cached queries (see below). if you are not scarred by the syntax yet you can implement trivial
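Independent of per-clause caching inside Solr, one client-side mitigation for the permutation problem Em describes (the same OR'ed fq cached once per clause ordering) is to canonicalize the clause order before sending the request, so every permutation maps to one cache entry. A minimal sketch; the field and values are just the thread's example, and the real fix belongs wherever the client assembles its Solr request:

```java
import java.util.Arrays;

public class FqNormalizer {
    // Build an order-independent fq string so that, e.g.,
    // permission:user OR permission:moderator and
    // permission:moderator OR permission:user hit the same
    // filter-cache entry on the Solr side.
    static String normalizedFq(String field, String... values) {
        String[] sorted = values.clone();
        Arrays.sort(sorted);
        StringBuilder sb = new StringBuilder();
        for (String v : sorted) {
            if (sb.length() > 0) sb.append(" OR ");
            sb.append(field).append(':').append(v);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalizedFq("permission", "user", "moderator", "manager"));
        System.out.println(normalizedFq("permission", "manager", "user", "moderator"));
    }
}
```

This does not reduce the number of distinct combinations, only the duplicate permutations of each combination, so it complements rather than replaces the per-criterion caching idea.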
Re: SolrCloud Replication Question
Doh - looks like I was just seeing a test issue. Do you mind updating and trying the latest rev? At the least there should be some better logging around the recovery. I'll keep working on tests in the meantime. - Mark On Feb 14, 2012, at 3:15 PM, Jamie Johnson wrote: Sounds good, if I pull the latest from trunk and rerun will that be useful or were you able to duplicate my issue now? On Tue, Feb 14, 2012 at 3:00 PM, Mark Miller markrmil...@gmail.com wrote: Okay Jamie, I think I have a handle on this. It looks like an issue with what config files are being used by cores created with the admin core handler - I think it's just picking up default config and not the correct config for the collection. This means they end up using config that has no UpdateLog defined - and so recovery fails. I've added more logging around this so that it's easy to determine that. I'm investigating more and working on a test + fix. I'll file a JIRA issue soon as well. - Mark On Feb 14, 2012, at 11:39 AM, Jamie Johnson wrote: Thanks Mark, not a huge rush, just me trying to get to use the latest stuff on our project. On Tue, Feb 14, 2012 at 10:53 AM, Mark Miller markrmil...@gmail.com wrote: Sorry, have not gotten it yet, but will be back trying later today - monday, tuesday tend to be slow for me (meetings and crap). - Mark On Feb 14, 2012, at 9:10 AM, Jamie Johnson wrote: Has there been any success in replicating this? I'm wondering if it could be something with my setup that is causing the issue... 
On Mon, Feb 13, 2012 at 8:55 AM, Jamie Johnson jej2...@gmail.com wrote: Yes, I have the following layout on the FS ./bootstrap.sh ./example (standard example directory from distro containing jetty jars, solr confs, solr war, etc) ./slice1 - start.sh -solr.xml - slice1_shard1 - data - slice2_shard2 -data ./slice2 - start.sh - solr.xml -slice2_shard1 -data -slice1_shard2 -data if it matters I'm running everything from localhost, zk and the solr shards On Mon, Feb 13, 2012 at 8:42 AM, Sami Siren ssi...@gmail.com wrote: Do you have unique dataDir for each instance? 13.2.2012 14.30 Jamie Johnson jej2...@gmail.com kirjoitti: - Mark Miller lucidimagination.com - Mark Miller lucidimagination.com - Mark Miller lucidimagination.com
Re: Need help with graphing function (MATH)
agreeing with wunder - I don't know the application, but I think almost always, a set of linear approximations over a few ranges would be ok (and you could increase the number of ranges until it was), and will be faster. And if you need just one equation, a sigmoid function will do the trick, such as 110 - 50((x-100)/20)/(sqrt(1+((x-100)/20)^2)) http://www.wolframalpha.com/input/?i=plot+110+-+50%28%28x-100%29%2F20%29%2F%28sqrt%281%2B%28%28x-100%29%2F20%29 ^2%29%29%2C+x%3D0..200 Regards Kent Fitch On Wed, Feb 15, 2012 at 6:17 AM, Walter Underwood wun...@wunderwood.orgwrote: In practice, I expect a linear piecewise function (with sharp corners) would be indistinguishable from the smoothed function. It is also much easier to read, test, and debug. It might even be faster. Try the sharp corners one first. wunder On Feb 14, 2012, at 10:56 AM, Ted Dunning wrote: In general this kind of function is very easy to construct using sums of basic sigmoidal functions. The logistic and probit functions are commonly used for this. Sent from my iPhone On Feb 14, 2012, at 10:05, Mark static.void@gmail.com wrote: Thanks I'll have a look at this. I should have mentioned that the actual values on the graph aren't important rather I was showing an example of how the function should behave. On 2/13/12 6:25 PM, Kent Fitch wrote: Hi, assuming you have x and want to generate y, then maybe - if x 50, y = 150 - if x 175, y = 60 - otherwise : either y = (100/(e^((x -50)/75)^2)) + 50 http://www.wolframalpha.com/input/?i=plot++%28100%2F%28e ^%28%28x+-50%29%2F75%29^2%29%29+%2B+50%2C+x%3D50..175 - or maybe y =sin((x+5)/38)*42+105 http://www.wolframalpha.com/input/?i=plot++sin%28%28x%2B5%29%2F38%29*42%2B105%2C+x%3D50..175 Regards, Kent Fitch On Tue, Feb 14, 2012 at 12:29 PM, Mark static.void@gmail.commailto: static.void@gmail.com wrote: I need some help with one of my boost functions. I would like the function to look something like the following mockup below. 
Starts off flat then there is a gradual decline, steep decline then gradual decline and then back to flat. Can some of you math guys please help :) Thanks.
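Kent's single-equation curve is trivial to port to code: an algebraic sigmoid centred at x=100 with width 20, sliding smoothly from roughly 160 down to roughly 60 (constants taken from his formula above):

```java
public class SigmoidBoost {
    // y = 110 - 50 * u / sqrt(1 + u^2), with u = (x - 100) / 20.
    // Steepest descent is around x = 100; the tails flatten out on
    // their own, which is the advantage over a piecewise definition.
    static double boost(double x) {
        double u = (x - 100.0) / 20.0;
        return 110.0 - 50.0 * u / Math.sqrt(1.0 + u * u);
    }

    public static void main(String[] args) {
        for (double x : new double[] {0, 50, 100, 150, 200}) {
            System.out.println(x + " -> " + boost(x));
        }
    }
}
```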
Re: Semantic autocomplete with Solr
facetting? paul Le 14 févr. 2012 à 23:10, Octavian Covalschi a écrit : Hey guys, Has anyone done any kind of smart autocomplete? Let's say we have a web store, and we'd like to autocomplete user's searches. So if I'll type in jacket next word that will be suggested should be something related to jacket (color, fabric) etc... It seems to me I have to structure this data in a particular way, but that way I can do without solr, so I was wondering if Solr could help us. Thank you in advance.
Re: Can I rebuild an index and remove some fields?
I was thinking that if I make a wrapper class that aggregates another IndexReader and filters out the terms I don't want anymore, it might work. And then pass that wrapper into SegmentMerger. I think if I filter out terms on getFieldNames(...) and terms(...) it might work. Something like: HashSet<String> ignoredTerms=...; FilteringIndexReader wrapper=new FilteringIndexReader(reader); SegmentMerger merger=new SegmentMerger(writer); merger.add(wrapper); merger.merge(); On Feb 14, 2012, at 1:49 AM, Li Li wrote: for method 2, delete is wrong. we can't delete terms. you would also have to hack the tii and tis files. On Tue, Feb 14, 2012 at 2:46 PM, Li Li fancye...@gmail.com wrote: method 1, dumping data: for stored fields, you can traverse the whole index and save it somewhere else. for indexed but not stored fields, it may be more difficult. if the indexed but not stored field is not analyzed (fields such as id), it's easy to get from FieldCache.StringIndex. But for analyzed fields, though theoretically they can be restored from term vectors and term positions, it's hard to recover from the index. method 2, hack with the metadata 1. indexed fields: delete by query, e.g. field:* 2. stored fields: because all fields are stored sequentially, it's not easy to delete some fields. this will not affect search speed. but if you want to get stored fields, and the useless fields are very long, then it will slow down. also it's possible to hack with it, but it needs more effort to understand the index file format and traverse the fdt/fdx files. http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/fileformats.html this will give you some insight. On Tue, Feb 14, 2012 at 6:29 AM, Robert Stewart bstewart...@gmail.com wrote: Let's say I have a large index (100M docs, 1TB, split up between 10 indexes). And a bunch of the stored and indexed fields are not used in search at all. 
In order to save memory and disk, I'd like to rebuild that index *without* those fields, but I don't have the original documents to rebuild the entire index with (don't have the full-text anymore, etc.). Is there some way to rebuild or optimize an existing index with only a sub-set of the existing indexed fields? Or alternatively, is there a way to avoid loading some indexed fields at all (to avoid loading term infos and the terms index)? Thanks Bob
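The wrapper idea discussed above can be illustrated with a plain-Java decorator, independent of any Lucene version (all names here are hypothetical illustrations, not Lucene API): a "reader" exposes its field names, and a filtering wrapper hides the fields being dropped while delegating everything else to the wrapped instance. This is the same pattern a FilterIndexReader subclass would apply over its field-listing and terms methods.

```java
import java.util.*;

// Hypothetical minimal "reader" interface: just enough to show the decorator idea.
interface FieldSource {
    List<String> fieldNames();
}

// Decorator that hides a set of fields while delegating to the wrapped source.
class FieldFilteringSource implements FieldSource {
    private final FieldSource delegate;
    private final Set<String> ignored;

    FieldFilteringSource(FieldSource delegate, Set<String> ignored) {
        this.delegate = delegate;
        this.ignored = ignored;
    }

    @Override
    public List<String> fieldNames() {
        List<String> kept = new ArrayList<>();
        for (String f : delegate.fieldNames()) {
            if (!ignored.contains(f)) kept.add(f);  // drop the unwanted fields
        }
        return kept;
    }
}

public class FilterDemo {
    public static void main(String[] args) {
        FieldSource raw = () -> Arrays.asList("id", "title", "body", "unused_blob");
        FieldSource filtered =
            new FieldFilteringSource(raw, new HashSet<>(Collections.singleton("unused_blob")));
        System.out.println(filtered.fieldNames()); // [id, title, body]
    }
}
```

A merge run against such a wrapper would then simply never see the hidden fields, which is why the approach in the mail above seems plausible for dropping fields during a rewrite.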
Re: Semantic autocomplete with Solr
Hm... I used it for some basic group-by feature, but haven't thought of it for autocomplete. I'll give it a shot. Thanks! On Tue, Feb 14, 2012 at 4:19 PM, Paul Libbrecht p...@hoplahup.net wrote: facetting? paul On Feb 14, 2012, at 11:10 PM, Octavian Covalschi wrote: Hey guys, Has anyone done any kind of smart autocomplete? Let's say we have a web store, and we'd like to autocomplete users' searches. So if I type in jacket, the next word that is suggested should be something related to jacket (color, fabric), etc... It seems to me I have to structure this data in a particular way, but if I do that I can do without Solr, so I was wondering if Solr could help us. Thank you in advance.
Re: Semantic autocomplete with Solr
We've done something along these lines: https://svnweb.cern.ch/trac/rcarepo/wiki/InspireAutoSuggest#Autosuggestautocompletefunctionality but you would need MontySolr for that - https://github.com/romanchyla/montysolr roman On Tue, Feb 14, 2012 at 11:10 PM, Octavian Covalschi octavian.covals...@gmail.com wrote: Hey guys, Has anyone done any kind of smart autocomplete? Let's say we have a web store, and we'd like to autocomplete users' searches. So if I type in jacket, the next word that is suggested should be something related to jacket (color, fabric), etc... It seems to me I have to structure this data in a particular way, but if I do that I can do without Solr, so I was wondering if Solr could help us. Thank you in advance.
payload and exact match
Is there any way to perform an 'exact search' on a payload field? I have to index text with auxiliary info for each word. In particular, each word is associated with the bounding box containing it in the original PDF page (it is used for highlighting the search terms in the PDF). I used the payload to store that information. In schema.xml, the fieldType definition is: --- <fieldtype name="wppayloads" stored="false" indexed="true" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="identity"/>
  </analyzer>
</fieldtype> --- while the field definition is: --- <field name="words" type="wppayloads" indexed="true" stored="true" required="true" multiValued="true"/> --- When indexing, the field 'words' contains a list of word|box entries, as in the following example: --- doc_id=example words={Fonte:|307.62,948.16,324.62,954.25 Comune|326.29,948.16,349.07,954.25 di|350.74,948.16,355.62,954.25 Bologna|358.95,948.16,381.28,954.25} --- This solution works well except in the case of an exact search. For example, assuming the only indexed doc is the 'example' doc (shown above), the query words:Comune di Bologna returns no results. Does anyone know whether it is possible to perform an 'exact search' on a payload field? Thanks in advance, Leonardo -- View this message in context: http://lucene.472066.n3.nabble.com/payload-and-exact-match-tp3745369p3745369.html Sent from the Solr - User mailing list archive at Nabble.com.
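One thing worth checking about the failing query (an assumption based on the query syntax shown, not a confirmed diagnosis): without quotes, only the first term of words:Comune di Bologna is bound to the words field; di and Bologna are searched against the default field. A quoted phrase query keeps all terms on the field:

```
words:"Comune di Bologna"
```

If the quoted phrase still returns nothing, the analysis page in the Solr admin UI can show how the word|box tokens actually end up in the index after the payload filter strips the delimiter and box coordinates.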
Solr soft commit feature
Hi All, Is there a way to soft commit in the current released version of solr 3.5? Regards, Dipti Srivastava This message is private and confidential. If you have received it in error, please notify the sender and remove it from your system.
Re: OR-FilterQuery
Ah, OK, I misread your post apparently. And yes, what you suggest would result in some efficiencies, but at present I don't think there's any syntax that allows one to combine filter queries as you suggest. There was some discussion about it in the JIRA I referenced, but no action that I could see. That is, efficiencies in some circumstances, though I think it would be hard to predict. For instance, imagine a set of 100 entries in an FQ. And no, I'm not making things up, I've seen applications where this makes sense. Splitting that out into 100 separate entries in the filterCache would use up a lot of space. Likewise, I suspect that the actual process of creating the heuristics that were able to analyze an incoming filter query and do the right thing in terms of splitting it up and recombining it would be pretty hairy. Local parameters for instance, and let's throw in dereferencing too G... So I suspect that this is one of those features that is quite easy to see the benefits of in the simple case, but pretty quickly becomes a nightmare to actually implement correctly, but that's mostly a guess. And before putting the work into it, I think modeling the actual benefits would be wise, as well as convincing myself that there are enough cases where this *would* be beneficial. I mean Solr does a pretty reasonable job of caching these anyway, and with the non-cached filters it's not clear to me that the benefits are sufficient... Good luck, though, if you want to tackle it! Erick On Tue, Feb 14, 2012 at 4:54 PM, Em mailformailingli...@yahoo.de wrote: Hi Erick, Whoa! fq=id:(1 OR 2) is not the same thing at all as fq=id:1&fq=id:2 Ahm, who said they would be the same? :) I mean, you are completely right in what you are saying but it seems to me that we are talking about two different things. I was talking about caching each filter-criteria instead of the whole filter-query to recombine the cached filter-criteria based on the boolean-operators the client sends. 
In other words: currently fq=id:1 OR id:2 results in ONE cached filter-entry. fq=id:2 OR id:1 results in ANOTHER cached filter-entry. fq=id:2 AND id:1 results in (surprise, surprise) a third filter-entry (although this example does not make sense). My idea was to cache each filter-criterion, that means caching the bitset for id:1 and the bitset for id:2, to recombine both bitsets via AND, OR, NOT etc. whenever this is necessary. This way one could save memory (and maybe computing-time as well), which definitely makes sense when you got a way smaller set of filter-criteria while having a much larger set of possible (and used) combinations of each filter-criterion with a small number of repetitions per combination (which would destroy the benefit of caching). Don't you agree? Kind regards, Em On 14.02.2012 22:33, Erick Erickson wrote: Whoa! fq=id:(1 OR 2) is not the same thing at all as fq=id:1&fq=id:2 Assuming that any document had one and only one ID, the second clause would return exactly 0 documents, each and every time. Multiple fq clauses are essentially set intersections. So the first query is the set of all documents where id is 1 or 2; the second is the intersection of two sets of documents, one set with an id of 1 and one with an id of 2. Not the same thing at all. There's no support for the concept of (fq=id:1 OR fq=id:2) Best Erick On Tue, Feb 14, 2012 at 2:13 PM, Em mailformailingli...@yahoo.de wrote: Hi Mikhail, thanks for kicking in some brainstorming-code! The given thread is almost a year old and I was working with Solr in my freetime to see where it fails to behave/perform as I expect/wish. I found out that if you got a lot of different access-patterns for a filter-query, you might end up with either a big cache to make things fast or with lower performance (impact depends on usecase and circumstances). Scenario: You got a permission-field and the client is able to filter by one to three permission-values. 
That is: fq=foo:user fq=foo:moderator fq=foo:manager If you can not control/guarantee the order of the fq's values, you could end up with a lot of mess which all returns the same. Example: fq=permission:user OR permission:moderator OR permission:manager fq=permission:user OR permission:manager OR permission:moderator fq=permission:moderator OR permission:user OR permission:manager ... They all return the same but were cached separately, which leads to the fact that you are wasting a lot of memory. Furthermore, if your access pattern will lead to a lot of different fq's on a small set of distinct values, it may make more sense to cache each filter-criterion for itself from a memory-consumption point of view (may cost a little bit of performance). That being said, if you cache a filter for foo:user, foo:moderator and foo:manager, you can combine those filters with AND, OR, NOT or whatever without recomputing every filter over and over again.
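Em's idea above, caching one bitset per filter criterion and recombining them per request, can be sketched with plain java.util.BitSet (this illustrates only the recombination, not Solr's actual filterCache code; the cache keys are made up for the example):

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class CriterionCacheDemo {
    public static void main(String[] args) {
        // One cached bitset per filter criterion (doc ids that match it).
        Map<String, BitSet> cache = new HashMap<>();
        BitSet user = new BitSet();
        user.set(0); user.set(2); user.set(5);
        BitSet moderator = new BitSet();
        moderator.set(2); moderator.set(3);
        cache.put("permission:user", user);
        cache.put("permission:moderator", moderator);

        // "permission:user OR permission:moderator" -> union of the cached bitsets;
        // clause order no longer matters, so no duplicate cache entries.
        BitSet or = (BitSet) cache.get("permission:user").clone();
        or.or(cache.get("permission:moderator"));
        System.out.println(or); // {0, 2, 3, 5}

        // "permission:user AND permission:moderator" -> intersection.
        BitSet and = (BitSet) cache.get("permission:user").clone();
        and.and(cache.get("permission:moderator"));
        System.out.println(and); // {2}
    }
}
```

With this scheme the three differently ordered OR-filters from the example above would all resolve to the same two cached bitsets plus a cheap union, instead of three separate filterCache entries.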
Re: Solr soft commit feature
This has not been ported back to the 3.X line yet - mostly because it involved some rather large and invasive changes that I wanted to bake on trunk for some time first. Even still, the back port is not trivial, so I don't know that it's something I'd personally be able to get to in the short term. If I had any free time, I'd probably prefer pushing towards a 4 release with NRT. Some of the changes also broke back compat behavior in ways that are more acceptable over a major release. Someone else might jump in and do the work of course. On Feb 14, 2012, at 7:41 PM, Dipti Srivastava wrote: Hi All, Is there a way to soft commit in the current released version of solr 3.5? Regards, Dipti Srivastava - Mark Miller lucidimagination.com
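For reference, on trunk/4.x (where the feature lives, per Mark's answer), a soft commit can be requested per update via the softCommit=true request parameter, or configured to happen automatically. A sketch of the solrconfig.xml form (element names as on trunk at the time; worth double-checking against the example config shipped with your build):

```xml
<!-- solrconfig.xml (trunk/4.x): open a new near-real-time searcher at most
     every second, without flushing the index to stable storage each time -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

A regular hard autoCommit (with openSearcher=false) is still needed periodically to make the updates durable.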
Re: Can I rebuild an index and remove some fields?
I have roughly read the code of 4.0 trunk. maybe it's feasible. SegmentMerger.add(IndexReader) will add the to-be-merged readers. merge() will call mergeTerms(segmentWriteState); mergePerDoc(segmentWriteState); mergeTerms() will construct fields from the IndexReaders:
for(int readerIndex=0; readerIndex < mergeState.readers.size(); readerIndex++) {
  final MergeState.IndexReaderAndLiveDocs r = mergeState.readers.get(readerIndex);
  final Fields f = r.reader.fields();
  final int maxDoc = r.reader.maxDoc();
  if (f != null) {
    slices.add(new ReaderUtil.Slice(docBase, maxDoc, readerIndex));
    fields.add(f);
  }
  docBase += maxDoc;
}
So if you wrap your IndexReader and override its fields() method, maybe it will work for merging terms. for DocValues, it can also override AtomicReader.docValues(). just return null for fields you want to remove. maybe it should traverse CompositeReader's getSequentialSubReaders() and wrap each AtomicReader. other things like term vectors and norms are similar. On Wed, Feb 15, 2012 at 6:30 AM, Robert Stewart bstewart...@gmail.com wrote: I was thinking that if I make a wrapper class that aggregates another IndexReader and filters out terms I don't want anymore it might work. And then pass that wrapper into SegmentMerger. I think if I filter out terms on GetFieldNames(...) and Terms(...) it might work. Something like: HashSet<string> ignoredTerms=...; FilterIndexReader wrapper=new FilterIndexReader(reader); SegmentMerger merger=new SegmentMerger(writer); merger.add(wrapper); merger.Merge(); On Feb 14, 2012, at 1:49 AM, Li Li wrote: for method 2, delete is wrong. we can't delete terms. you also should hack with the tii and tis file. On Tue, Feb 14, 2012 at 2:46 PM, Li Li fancye...@gmail.com wrote: method 1, dumping data: for stored fields, you can traverse the whole index and save it to somewhere else. for indexed but not stored fields, it may be more difficult. if the indexed and not stored field is not analyzed (fields such as id), it's easy to get from FieldCache.StringIndex. 
But for analyzed fields, though theoretically it can be restored from term vector and term position, it's hard to recover from the index. method 2, hack with metadata: 1. indexed fields: delete by query, e.g. field:* 2. stored fields: because all fields are stored sequentially, it's not easy to delete some fields. this will not affect search speed. but if you want to get stored fields, and the useless fields are very long, then it will slow down. also it's possible to hack with it, but it needs more effort to understand the index file format and traverse the fdt/fdx file. http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/fileformats.html this will give you some insight. On Tue, Feb 14, 2012 at 6:29 AM, Robert Stewart bstewart...@gmail.com wrote: Let's say I have a large index (100M docs, 1TB, split up between 10 indexes). And a bunch of the stored and indexed fields are not used in search at all. In order to save memory and disk, I'd like to rebuild that index *without* those fields, but I don't have the original documents to rebuild the entire index with (don't have the full-text anymore, etc.). Is there some way to rebuild or optimize an existing index with only a sub-set of the existing indexed fields? Or alternatively, is there a way to avoid loading some indexed fields at all (to avoid loading term infos and the terms index)? Thanks Bob
Re: SolrCloud Replication Question
Doing so now, will let you know if I continue to see the same issues On Tue, Feb 14, 2012 at 4:59 PM, Mark Miller markrmil...@gmail.com wrote: Doh - looks like I was just seeing a test issue. Do you mind updating and trying the latest rev? At the least there should be some better logging around the recovery. I'll keep working on tests in the meantime. - Mark On Feb 14, 2012, at 3:15 PM, Jamie Johnson wrote: Sounds good, if I pull the latest from trunk and rerun will that be useful or were you able to duplicate my issue now? On Tue, Feb 14, 2012 at 3:00 PM, Mark Miller markrmil...@gmail.com wrote: Okay Jamie, I think I have a handle on this. It looks like an issue with what config files are being used by cores created with the admin core handler - I think it's just picking up default config and not the correct config for the collection. This means they end up using config that has no UpdateLog defined - and so recovery fails. I've added more logging around this so that it's easy to determine that. I'm investigating more and working on a test + fix. I'll file a JIRA issue soon as well. - Mark On Feb 14, 2012, at 11:39 AM, Jamie Johnson wrote: Thanks Mark, not a huge rush, just me trying to get to use the latest stuff on our project. On Tue, Feb 14, 2012 at 10:53 AM, Mark Miller markrmil...@gmail.com wrote: Sorry, have not gotten it yet, but will be back trying later today - monday, tuesday tend to be slow for me (meetings and crap). - Mark On Feb 14, 2012, at 9:10 AM, Jamie Johnson wrote: Has there been any success in replicating this? I'm wondering if it could be something with my setup that is causing the issue... 
On Mon, Feb 13, 2012 at 8:55 AM, Jamie Johnson jej2...@gmail.com wrote: Yes, I have the following layout on the FS:
./bootstrap.sh
./example (standard example directory from distro containing jetty jars, solr confs, solr war, etc)
./slice1
  - start.sh
  - solr.xml
  - slice1_shard1
    - data
  - slice2_shard2
    - data
./slice2
  - start.sh
  - solr.xml
  - slice2_shard1
    - data
  - slice1_shard2
    - data
if it matters, I'm running everything from localhost, zk and the solr shards. On Mon, Feb 13, 2012 at 8:42 AM, Sami Siren ssi...@gmail.com wrote: Do you have a unique dataDir for each instance? 13.2.2012 14.30 Jamie Johnson jej2...@gmail.com wrote: - Mark Miller lucidimagination.com
Re: SolrCloud Replication Question
All of the nodes now show as being Active. When starting the replicas I did receive the following message though. Not sure if this is expected or not.
INFO: Attempting to replicate from http://JamiesMac.local:8501/solr/slice2_shard2/
Feb 14, 2012 10:53:34 PM org.apache.solr.common.SolrException log
SEVERE: Error while trying to recover:org.apache.solr.common.SolrException: null java.lang.NullPointerException
  at org.apache.solr.handler.admin.CoreAdminHandler.handlePrepRecoveryAction(CoreAdminHandler.java:646)
  at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:181)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
  at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:358)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:172)
  at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
  at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
  at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
  at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
  at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
  at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
  at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
  at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
  at org.mortbay.jetty.Server.handle(Server.java:326)
  at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
  at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
  at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
  at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
  at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
  at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
  at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
null java.lang.NullPointerException
  at org.apache.solr.handler.admin.CoreAdminHandler.handlePrepRecoveryAction(CoreAdminHandler.java:646)
  at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:181)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
  at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:358)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:172)
  at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
  at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
  at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
  at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
  at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
  at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
  at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
  at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
  at org.mortbay.jetty.Server.handle(Server.java:326)
  at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
  at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
  at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
  at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
  at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
  at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
  at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
request: http://JamiesMac.local:8501/solr/admin/cores?action=PREPRECOVERY&core=slice2_shard2&nodeName=JamiesMac.local:8502_solr&coreNodeName=JamiesMac.local:8502_solr_slice2_shard1&wt=javabin&version=2
  at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:433)
  at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:251)
  at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:120)
  at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:208)
Feb 14, 2012 10:53:34 PM org.apache.solr.update.UpdateLog dropBufferedUpdates
feeding mahout cluster output back to solr
hi, at present we use carrot2 for clustering and doing analysis on customer feedback data. Since it's in-memory and done at search time, we are having issues with performance and cluster size. I was reading about generating clusters using mahout from solr index data. But can we feed the segmentation generated by mahout back into solr to use as facets? I am not even sure what the output from mahout looks like, so I wanted to know. -- View this message in context: http://lucene.472066.n3.nabble.com/feeding-mahout-cluster-output-back-to-solr-tp3745883p3745883.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: OR-FilterQuery
On Tue, Feb 14, 2012 at 11:13 PM, Em mailformailingli...@yahoo.de wrote: Hi Mikhail, it will use a per-segment bitset, in contrast to Solr's fq which caches for the top-level reader. Could you explain why this bitset would be per-segment based, please? I don't see a reason why this *has* to be so. it's just how org.apache.lucene.search.CachingWrapperFilter works. The first out-of-the-box stuff which I've found. as a top-level alternative we need org.apache.solr.search.SolrIndexSearcher.getDocSet(Query). btw, one more top-level snippet:
class FQParser extends QParser {
  Query parse(...) {
    return new SolrConstantScoreQuery(
      solrIndexSearcher.getDocSet(subQuery(localParam.get(V))).getTopFilter());
  }
}
What is the benefit you are seeing? It seems like two different POVs: Lucene prefers per-segment caching to have fast incremental updates, but maybe 'because it's good but not in the worst case' (I guess I've heard it there http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011/many-facets-apache-solr) while Solr prefers top-reader caches. Kind regards, Em On 14.02.2012 19:33, Mikhail Khludnev wrote: Hi Em, I briefly read the thread. Are you talking about combining cached clauses of a BooleanQuery, instead of evaluating the whole BQ as a filter? I found something like that in the API (but only in the API) http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean) Did I get you right? Why do you need it, btw? If so, I have an idea of how to do it in two mins: q=+f:text +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)... The right leg will be a BooleanQuery with SHOULD clauses backed on cached queries (see below). 
if you are not scared by the syntax yet, you can implement a trivial fqQParserPlugin, which will be just:
// lazily through User/Generic Cache
q = new FilteredQuery(new MatchAllDocsQuery(),
    new CachingWrapperFilter(new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V)))));
return q;
it will use a per-segment bitset, in contrast to Solr's fq which caches for the top-level reader. WDYT? On Mon, Feb 13, 2012 at 11:34 PM, Em mailformailingli...@yahoo.de wrote: Hi, have a look at: http://search-lucene.com/m/Z8lWGEiKoI I think not much has changed since then. Regards, Em On 13.02.2012 20:17, spr...@gmx.eu wrote: Hi, how efficient is such a query: q=some text fq=id:(1 OR 2 OR 3...) Should I better use q=some text AND id:(1 OR 2 OR 3...)? Is the Filter Cache used for the OR'ed fq? Thank you -- Sincerely yours Mikhail Khludnev Lucid Certified Apache Lucene/Solr Developer Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: OR-FilterQuery
Hi Mikhail, it's just how org.apache.lucene.search.CachingWrapperFilter works. The first out-of-the-box stuff which I've found. Thanks for your explanation and snippets - I thought this was configurable. Regards, Em On 15.02.2012 06:16, Mikhail Khludnev wrote: On Tue, Feb 14, 2012 at 11:13 PM, Em mailformailingli...@yahoo.de wrote: Hi Mikhail, it will use a per-segment bitset, in contrast to Solr's fq which caches for the top-level reader. Could you explain why this bitset would be per-segment based, please? I don't see a reason why this *has* to be so. it's just how org.apache.lucene.search.CachingWrapperFilter works. The first out-of-the-box stuff which I've found. as a top-level alternative we need org.apache.solr.search.SolrIndexSearcher.getDocSet(Query). btw, one more top-level snippet:
class FQParser extends QParser {
  Query parse(...) {
    return new SolrConstantScoreQuery(
      solrIndexSearcher.getDocSet(subQuery(localParam.get(V))).getTopFilter());
  }
}
What is the benefit you are seeing? It seems like two different POVs: Lucene prefers per-segment caching to have fast incremental updates, but maybe 'because it's good but not in the worst case' (I guess I've heard it there http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011/many-facets-apache-solr) while Solr prefers top-reader caches. Kind regards, Em On 14.02.2012 19:33, Mikhail Khludnev wrote: Hi Em, I briefly read the thread. Are you talking about combining cached clauses of a BooleanQuery, instead of evaluating the whole BQ as a filter? I found something like that in the API (but only in the API) http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean) Did I get you right? Why do you need it, btw? If so, I have an idea of how to do it in two mins: q=+f:text +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)... The right leg will be a BooleanQuery with SHOULD clauses backed on cached queries (see below). 
if you are not scared by the syntax yet, you can implement a trivial fqQParserPlugin, which will be just:
// lazily through User/Generic Cache
q = new FilteredQuery(new MatchAllDocsQuery(),
    new CachingWrapperFilter(new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V)))));
return q;
it will use a per-segment bitset, in contrast to Solr's fq which caches for the top-level reader. WDYT? On Mon, Feb 13, 2012 at 11:34 PM, Em mailformailingli...@yahoo.de wrote: Hi, have a look at: http://search-lucene.com/m/Z8lWGEiKoI I think not much has changed since then. Regards, Em On 13.02.2012 20:17, spr...@gmx.eu wrote: Hi, how efficient is such a query: q=some text fq=id:(1 OR 2 OR 3...) Should I better use q=some text AND id:(1 OR 2 OR 3...)? Is the Filter Cache used for the OR'ed fq? Thank you