What does replicationFactor really do?

2015-07-16 Thread Jim . Musil
Hi,

In 5.1, we are creating a collection using the Collections API with an initial 
replicationFactor of X. This value is then stored in the state.json file for 
that collection.

If I try to issue ADDREPLICA on this cluster, it throws an error saying that 
there are no live nodes for additional replicas.

If I connect a new solr node to zookeeper and issue an ADDREPLICA call, the 
replica is created and no errors are thrown, but replicationFactor remains at X 
in the state.json file.
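For reference, the ADDREPLICA call takes the usual Collections API form
(collection and shard names here are placeholders):

/admin/collections?action=ADDREPLICA&collection=my_collection&shard=shard1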

Why? What does replicationFactor really mean? It seems like it's being honored 
in some cases and ignored in others.

Thanks for any help you can provide.

Cheers,
Jim




CREATE collection bug or feature?

2015-06-19 Thread Jim . Musil
I noticed that when I issue the CREATE collection command to the api, it does 
not automatically put a replica on every live node connected to zookeeper.

So, for example, if I have 3 solr nodes connected to a zookeeper ensemble and 
create a collection like this:

/admin/collections?action=CREATE&name=my_collection&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=my_config

It will only create a core on one of the three nodes. I can make it work if I 
change replicationFactor to 3. When standing up an entire stack using chef, 
this all gets a bit clunky. I don't see any option such as ALL that would 
just create a replica on all nodes regardless of size.

I'm guessing this is intentional, but curious about the reasoning.

Thanks!
Jim


Re: CREATE collection bug or feature?

2015-06-19 Thread Jim . Musil
Thanks as always for the great answers!

Jim


On 6/19/15, 11:57 AM, Erick Erickson erickerick...@gmail.com wrote:

Jim:

This is by design. There's no way to tell Solr to find all the nodes
available and put one replica on each. In fact, you're explicitly
telling it to create one and only one replica, one and only one shard.
That is, your collection will have exactly one low-level core. But you
realized that...

As to the reasoning: consider heterogeneous collections all hosted on
the same Solr cluster. I have big collections, little collections,
some with high QPS rates, some not, etc. Having Solr do things like
this automatically would make managing such a cluster difficult.

Probably the real reason is nobody thought it would be useful in
the general case. And I probably concur. Adding a new node to an
existing cluster would result in unbalanced clusters etc.

I suppose a stop-gap would be to query the live_nodes in the cluster
and add that to the URL, don't know how much of a pain that would be
though.
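
Something like this SolrJ sketch (untested; zk hosts and names below are
placeholders):

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class CreateOnAllNodes {
  public static void main(String[] args) throws Exception {
    CloudSolrServer cloud = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
    cloud.connect();
    // One replica per node that is currently live.
    int liveNodes =
        cloud.getZkStateReader().getClusterState().getLiveNodes().size();

    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("action", "CREATE");
    params.set("name", "my_collection");
    params.set("numShards", 1);
    params.set("replicationFactor", liveNodes);
    params.set("maxShardsPerNode", 1);
    params.set("collection.configName", "my_config");

    QueryRequest create = new QueryRequest(params);
    create.setPath("/admin/collections");
    cloud.request(create);
    cloud.shutdown();
  }
}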

Best,
Erick

On Fri, Jun 19, 2015 at 10:15 AM, Jim.Musil jim.mu...@target.com wrote:
 I noticed that when I issue the CREATE collection command to the api,
it does not automatically put a replica on every live node connected to
zookeeper.

 So, for example, if I have 3 solr nodes connected to a zookeeper
ensemble and create a collection like this:

 
/admin/collections?action=CREATE&name=my_collection&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=my_config

 It will only create a core on one of the three nodes. I can make it
work if I change replicationFactor to 3. When standing up an entire
stack using chef, this all gets a bit clunky. I don't see any option
such as ALL that would just create a replica on all nodes regardless
of size.

 I'm guessing this is intentional, but curious about the reasoning.

 Thanks!
 Jim



Collections API and adding new boxes

2015-06-18 Thread Jim . Musil
Hi,

Let's say I have a zookeeper ensemble with several Solr nodes connected to it. 
I've created a collection successfully and all is well.

What happens when I want to add another solr node?

I've tried spinning one up and connecting it to zookeeper, but the new node 
doesn't join the collection.  What's the expected next step?
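For reference, the new node is started along these lines (zk hosts are
placeholders):

bin/solr start -c -z zk1:2181,zk2:2181,zk3:2181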

This is Solr 5.1.

Thanks!
Jim Musil


Re: Clarification on Collections API for 5.x

2015-05-27 Thread Jim . Musil
bump

On 5/21/15, 9:06 AM, Jim.Musil jim.mu...@target.com wrote:

Hi,

In the guide for moving from Solr 4.x to 5.x, it states the following:

Solr 5.0 only supports creating and removing SolrCloud collections
through the Collections API
(https://cwiki.apache.org/confluence/display/solr/Collections+API),
unlike previous versions. While not using the collections API may still
work in 5.0, it is unsupported, not recommended, and the behavior will
change in a 5.x release.

Currently, we launch several solr nodes with identical cores defined
using the new Core Discovery process. These nodes are also connected to a
zookeeper ensemble. Part of the core definition is to set the configSet
to use. This configSet is uploaded to zookeeper separately. This
effectively creates a Collection.

Is this method no longer supported in 5.x?

Thanks!
Jim Musil




Re: Clarification on Collections API for 5.x

2015-05-27 Thread Jim . Musil
Thanks for the clarification!

On 5/27/15, 12:00 PM, Erick Erickson erickerick...@gmail.com wrote:

Are you defining shards and replicas here? Or is this just a
single-node collection? In any case, this seems unnecessary. You'd get
the same thing by uploading the config set to ZK, then
just issuing a Collections CREATE command, specifying the node to use
if desired.

What you're doing _should_ work, because essentially that's what startup
does. It finds cores somewhere below SOLR_HOME and reads each
core.properties file. When it finds parameters like collection, shard,
coreNodeName, and numShards, it figures things out from those. But
you have to get all of this right manually with the process you're using
now, so why take the risk? Besides, in the future you'll have to adapt to
any back-compat breaks...

Best,
Erick

On Wed, May 27, 2015 at 8:34 AM, Jim.Musil jim.mu...@target.com wrote:
 bump

 On 5/21/15, 9:06 AM, Jim.Musil jim.mu...@target.com wrote:

Hi,

In the guide for moving from Solr 4.x to 5.x, it states the following:

Solr 5.0 only supports creating and removing SolrCloud collections
through the Collections API
(https://cwiki.apache.org/confluence/display/solr/Collections+API),
unlike previous versions. While not using the collections API may still
work in 5.0, it is unsupported, not recommended, and the behavior will
change in a 5.x release.

Currently, we launch several solr nodes with identical cores defined
using the new Core Discovery process. These nodes are also connected to
a
zookeeper ensemble. Part of the core definition is to set the configSet
to use. This configSet is uploaded to zookeeper separately. This
effectively creates a Collection.

Is this method no longer supported in 5.x?

Thanks!
Jim Musil





Clarification on Collections API for 5.x

2015-05-21 Thread Jim . Musil
Hi,

In the guide for moving from Solr 4.x to 5.x, it states the following:

Solr 5.0 only supports creating and removing SolrCloud collections through the
Collections API
(https://cwiki.apache.org/confluence/display/solr/Collections+API), unlike
previous versions. While not using the collections API may still work in 5.0,
it is unsupported, not recommended, and the behavior will change in a 5.x
release.

Currently, we launch several solr nodes with identical cores defined using the 
new Core Discovery process. These nodes are also connected to a zookeeper 
ensemble. Part of the core definition is to set the configSet to use. This 
configSet is uploaded to zookeeper separately. This effectively creates a 
Collection.
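
For concreteness, each core's core.properties is along these lines (names here
are examples; the config is uploaded to zookeeper under the same name):

name=my_collection_shard1_replica1
collection=my_collection
shard=shard1
numShards=1
configSet=my_config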

Is this method no longer supported in 5.x?

Thanks!
Jim Musil



ConfigSets and SolrCloud

2015-05-20 Thread Jim . Musil
Hi,

I need a little clarification on configSets in solr 5.x.

According to this page:

https://cwiki.apache.org/confluence/display/solr/Config+Sets

I can create named configSets to be shared by other cores. If I create them 
using this method AND am operating in SolrCloud mode, will it automatically 
upload these named config sets to zookeeper?
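(For reference, the explicit route is pushing the config with zkcli; paths and
names here are placeholders:

server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd upconfig -confdir /path/to/my_config/conf -confname my_config)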

Thanks!
Jim Musil


Confusion about zkcli.sh and solr.war

2015-05-13 Thread Jim . Musil
I'm trying to use zkcli.sh to upload configurations to zookeeper and solr 5.1.

It's throwing an error because it references webapps/solr.war which no longer 
exists.

Do I have to build my own solr.war in order to use zkcli.sh?

Please forgive me if I'm missing something here.

Jim Musil


Possible to dump clusterstate, system stats into solr log?

2015-02-11 Thread Jim . Musil
Hi,

Is it possible to periodically dump the cluster state contents (or system 
diagnostics) into the main solr log file?

We have many security protocols in place that prevent us from running
diagnostic requests directly against the solr boxes, but we do have access to
the shipped logs.

Thanks!
Jim


Re: Where can we set the parameters in Solr Config?

2015-02-03 Thread Jim . Musil
We set them as extra parameters passed to the servlet container (jetty or tomcat).

eg java -Dsolr.lock.type=native -jar start.jar

Jim

On 2/3/15, 11:58 AM, O. Olson olson_...@yahoo.it wrote:

I'm sorry if this is a basic question, but I am curious where, or at
least how, we can set the parameters in the solrconfig.xml.

E.g. consider the solrconfig.xml shown here:
http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_10/solr/example/example-DIH/solr/db/conf/solrconfig.xml?revision=1638496&view=markup

There seem to be a lot of entries of the form ${ParameterName:Value},
e.g.:
<lockType>${solr.lock.type:native}</lockType>

Where do these parameter values get set? Thank you in anticipation.







Re: SOLR retrieve data using URL

2015-02-02 Thread Jim . Musil
You don't have to use SolrJ. It's just a web request to a url, so just
issue the request in Java and parse the JSON response.

http://stackoverflow.com/questions/7467568/parsing-json-from-url

SolrJ does make it simpler, however.
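
A minimal JDK-only sketch (the URL reuses the example values from your
message and is assumed reachable):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class FetchSolrJson {
  public static void main(String[] args) throws Exception {
    // Plain HTTP GET against a Solr select URL; the body comes back as JSON.
    URL url = new URL("http://server:8983/solr/collection1/select"
        + "?q=*%3A*&wt=json&indent=true&facet=true"
        + "&facet.field=PLS_SURVY_SURVY_STATUS_MAP");
    StringBuilder json = new StringBuilder();
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
      for (String line; (line = in.readLine()) != null; ) {
        json.append(line).append('\n');
      }
    }
    System.out.println(json); // hand this off to any JSON parser
  }
}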

Jim

On 2/2/15, 12:57 PM, mathewvino vinojmat...@hotmail.com wrote:

Hi There,

I am using the solrj API to make calls to the Solr Server for the data that I
am looking for. Basically I am using the solrj api as below to get the data.
Everything is working as expected:

HttpSolrServer solr = new
HttpSolrServer("http://server:8983/solr/collection1");
SolrQuery query = new SolrQuery("*:*");
query.setFacet(true).addFacetField("PLS_SURVY_SURVY_STATUS_MAP");

Is there any API where I can use the complete URL to get the data, like below?

HttpSolrServer solr = new
HttpSolrServer("http://server:8983/solr/collection1/select?q=*%3A*&wt=json&indent=true&facet=true&facet.field=PLS_SURVY_SURVY_LANG_CHOICE_MAP");

I would like to pass the complete url to get the data instead of using the
solrj query api.

Thanks









Re: Solr throwing SocketException: Connection Reset

2015-02-02 Thread Jim . Musil
This is difficult to diagnose, but here are some questions I would ask
myself:

Can you reliably recreate the error?
Can you recreate the error faster by writing to all 100 collections at
once?
Can you recreate the error faster if you have fewer nodes?

Is just one solr node or one solr collection throwing the error?

Are all the updates coming from one machine?
Is there some other bottleneck in your network (like a load balancer) that
is limiting connections?

Good luck,
Jim Musil


On 2/2/15, 5:29 AM, nkgupta nitinkumargu...@gmail.com wrote:

I have an 8 node solr cloud cluster connected to an external zookeeper. Each
node: 30 GB, 4 cores.
I have created around 100 collections, each collection having approx. 30
shards. (Why I need it is a different story; business isolation, or any other
business requirement.)

Now, I am ingesting data into the cluster on 30 collections simultaneously. I
see that ingestion to a few collections fails. In the solr logs, I can see
this Connection Reset exception occurring. The overall time for ingestion is
on the order of 10 hours.

Any suggestions? Even if it is due to resource starvation, how can I prove
that the connection reset comes from a lack of resources?

== Exception ==
2015-01-30 09:16:14,454 ERROR [updateExecutor-1-thread-8151] ? (:) - error
java.net.SocketException: Connection reset
   at java.net.SocketInputStream.read(SocketInputStream.java:196) ~[?:1.7.0_55]
   at java.net.SocketInputStream.read(SocketInputStream.java:122) ~[?:1.7.0_55]
   at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160) ~[httpcore-4.3.jar:4.3]
   at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84) ~[httpcore-4.3.jar:4.3]
   at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273) ~[httpcore-4.3.jar:4.3]
   at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140) ~[httpclient-4.3.1.jar:4.3.1]
   at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) ~[httpclient-4.3.1.jar:4.3.1]
   at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260) ~[httpcore-4.3.jar:4.3]
   at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283) ~[httpcore-4.3.jar:4.3]
   at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251) ~[httpclient-4.3.1.jar:4.3.1]
   at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197) ~[httpclient-4.3.1.jar:4.3.1]
   at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271) ~[httpcore-4.3.jar:4.3]
   at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123) ~[httpcore-4.3.jar:4.3]
   at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682) ~[httpclient-4.3.1.jar:4.3.1]
   at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486) ~[httpclient-4.3.1.jar:4.3.1]
   at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863) ~[httpclient-4.3.1.jar:4.3.1]
   at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) ~[httpclient-4.3.1.jar:4.3.1]
   at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106) ~[httpclient-4.3.1.jar:4.3.1]
   at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57) ~[httpclient-4.3.1.jar:4.3.1]
   at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233) [solr-solrj-4.10.0.jar:4.10.0 1620776 - rjernst - 2014-08-26 20:49:51]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_55]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_55]
   at java.lang.Thread.run(Thread.java:745) [?:1.7.0_55]






Re: Solr pattern tokenizer

2015-02-02 Thread Jim . Musil
It looks to me like you simply want to split the incoming query by the
hyphen, so that it searches for exact codes like "CHQ PAID", "INWARD
TRAN", "HDFC LTD".

If that's true, I'd either just change the query at the client to do what
you want, or look into something like the PatternTokenizer:

https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternTokenizerFactory
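
A sketch of what that could look like in a query analyzer, splitting on the
hyphen (untested):

<tokenizer class="solr.PatternTokenizerFactory" pattern="-"/>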


Apologies if I'm not understanding your use case.

Thanks,
Jim

On 2/2/15, 3:56 AM, Nivedita nivedita.pa...@tcs.com wrote:

Hi,

I want to tokenize a query like CHQ PAID-INWARD TRAN-HDFC LTD in such a
way that it gives me result documents containing HDFC LTD and not HDFC
MF.

How can I do this?
I have already applied the tokenizers below:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
            maxGramSize="25" side="front"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>


Please help.






Re: An interesting approach to grouping

2015-01-27 Thread Jim . Musil
Yes, I’m trying to pin down exactly what conditions cause the bug to
appear. It seems as though it’s only when using the query function.

Jim

On 1/27/15, 12:44 PM, Ryan Josal rjo...@gmail.com wrote:

This is great, thanks Jim.  Your patch worked and the sorting solution
meets the goal, although group.limit seems like it could cut various
results out of the middle of the result set.  I will play around with it
and see if it proves helpful.  Can you let me know the Jira so I can keep
an eye on it?

Ryan

Re: An interesting approach to grouping

2015-01-27 Thread Jim . Musil
Here’s the issue:

https://issues.apache.org/jira/browse/SOLR-7046


Jim

On 1/27/15, 12:44 PM, Ryan Josal rjo...@gmail.com wrote:

This is great, thanks Jim.  Your patch worked and the sorting solution
meets the goal, although group.limit seems like it could cut various
results out of the middle of the result set.  I will play around with it
and see if it proves helpful.  Can you let me know the Jira so I can keep
an eye on it?

Ryan



Re: An interesting approach to grouping

2015-01-27 Thread Jim . Musil
I'm not sure the query you provided will do what you want, BUT I did find
the bug in the code that is causing the NullPointerException.

The variable context is supposed to be global, but when prepare() is
called, it is only defined in the scope of that function.

Here's the simple patch:

Index: core/src/java/org/apache/solr/search/Grouping.java
===
--- core/src/java/org/apache/solr/search/Grouping.java  (revision 1653358)
+++ core/src/java/org/apache/solr/search/Grouping.java  (working copy)
@@ -926,7 +926,7 @@
  */
 @Override
 protected void prepare() throws IOException {
-  Map context = ValueSource.newContext(searcher);
+  context = ValueSource.newContext(searcher);
   groupBy.createWeight(context, searcher);
   actualGroupsToFind = getMax(offset, numGroups, maxDoc);
 }


I'll search for a Jira issue and open one if I can't find one.

Jim Musil



On 1/26/15, 6:34 PM, Ryan Josal r...@josal.com wrote:

I have an index of products, and these products have a category which we
can say for now is a good approximation of its location in the store.  I'm
investigating altering the ordering of the results so that the categories
aren't interlaced as much... so that the results are a little bit more
grouped by category, but not *totally* grouped by category.  It's
interesting because it's an approach that sort of compares results to
near-scored/ranked results.  One of the hoped outcomes of this would be that
there would be somewhat fewer categories represented in the top results
for
a given query, although it is questionable if this is a good measurement
to
determine the effectiveness of the implementation.

My first attempt was to
group=true&group.main=true&group.field=category&group.func=rint(scale(query({!type=edismax
v=$q}),0,20))

Or some FunctionQuery like that, so that in order to become a member of a
group, the doc would have to have the same category, and be dropped into
the same score bucket (20 in this case).  This doesn't work out of the
gate
due to an NPE (solr 4.10.2) (although I'm not sure it would work anyway):

java.lang.NullPointerException
   at org.apache.lucene.queries.function.valuesource.ScaleFloatFunction.getValues(ScaleFloatFunction.java:104)
   at org.apache.solr.search.DoubleParser$Function.getValues(ValueSourceParser.java:)
   at org.apache.lucene.search.grouping.function.FunctionFirstPassGroupingCollector.setNextReader(FunctionFirstPassGroupingCollector.java:82)
   at org.apache.lucene.search.MultiCollector.setNextReader(MultiCollector.java:113)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:612)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
   at org.apache.solr.search.Grouping.searchWithTimeLimiter(Grouping.java:451)
   at org.apache.solr.search.Grouping.execute(Grouping.java:368)
   at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:459)
   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)


Has anyone tried something like this before, and does anyone have any
novel
ideas for how to approach it, no matter how different?  How about a
workaround for the group.func error here?  I'm very open-minded about
where
to go on this one.

Thanks,
Ryan



Re: An interesting approach to grouping

2015-01-27 Thread Jim . Musil
When using group.main=true, the results are not mixed as you expect:

"If true, the result of the last field grouping command is used as the
main result list in the response, using group.format=simple"

https://wiki.apache.org/solr/FieldCollapsing


Jim

On 1/27/15, 9:22 AM, Ryan Josal rjo...@gmail.com wrote:

Thanks a lot!  I'll try this out later this morning.  If group.func and
group.field don't combine the way I think they might, I'll try to look for
a way to put it all in group.func.






Re: An interesting approach to grouping

2015-01-27 Thread Jim . Musil
Interestingly, you can do something like this:

group=true
group.main=true
group.func=rint(scale(query({!type=edismax v=$q}),0,20)) // puts into
buckets
group.limit=20 // gives you 20 from each bucket
group.sort=category asc  // this will sort by category within each bucket,
but this can be a function as well.



Jim Musil









Re: Indexed epoch time in Solr

2015-01-26 Thread Jim . Musil
If you are using the DataImportHandler, you can leverage one of the
transformers, such as the DateFormatTransformer:

http://wiki.apache.org/solr/DataImportHandler#DateFormatTransformer
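
A sketch of the wiki's usage inside a DIH entity (entity name, query, column,
and format here are examples):

<entity name="item" transformer="DateFormatTransformer" query="select * from item">
  <field column="lastModified" dateTimeFormat="yyyy-MM-dd HH:mm:ss"/>
</entity>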


If you are updating documents directly, you can define a regex
transformation in your schema.xml:

https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternReplaceCharFilterFactory


If you have control over the input, then I always find it better to just
transform it prior to sending it into solr.

Jim

On 1/25/15, 11:35 PM, Ahmed Adel ahmed.a...@badrit.com wrote:

Hi All,

Is there a way to convert a unix time field that is already indexed to
ISO-8601 format in the query response? If this is not possible at the query
level, what is the best way to copy this field to a new Solr standard date
field?

Thanks,

-- 
*Ahmed Adel*



Does CloudSolrServer hit zookeeper for every request?

2014-06-02 Thread Jim . Musil
I’m curious how CloudSolrServer works in practice.

I understand that it gets the active solr nodes from zookeeper, but does it do 
this for every request?

If it does hit zk for every request, that seems to put a lot of pressure on the 
zk ensemble.

If it does NOT hit zk for every request, then how does it detect changes in the 
number of nodes and the status of the nodes?

Thanks!
Jim M.


Status of configName in core.properties

2014-05-30 Thread Jim . Musil
Hi,

I’m attempting to define a core using the new core discovery method described 
here:

http://wiki.apache.org/solr/Core%20Discovery%20(4.4%20and%20beyond)

At the bottom of the page is a parameter named configName that should allow me
to specify a configuration name to use for a collection. This does not seem to
be working. I have a configuration uploaded to zookeeper under a given name. I
want to share that configuration between two cores, but each core only links to
the configuration with the exact same name as the core.

This parameter is marked as “Tentative” for 4.6. What is the status?
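
For reference, a core.properties along the lines of what is being attempted
(names here are examples):

name=core2
configName=shared_config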

Thanks!
Jim