Sorting on multivalued field in Solr

2015-05-12 Thread nutchsolruser
Is there any way to sort on a multivalued field in Solr? I have two documents
with the field custom_code, whose values are as below:

Doc 1 : 11, 78, 45, 22
Doc 2 : 56, 74, 62, 10

When I sort in ascending order, the result should be:

Doc 2 : 56, 74, 62, 10
Doc 1 : 11, 78, 45, 22

Here Doc 2 comes first because its smallest element is 10, which is smaller
than Doc 1's smallest element, 11.


How can we achieve this in Solr? What is the easiest way?






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Sorting-on-multivalues-field-in-Solr-tp4204996.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR 4.10.4 - error creating document

2015-05-12 Thread Bernd Fehling
Hi Erik,

thanks for your concerns and thoughts.
There is no XY problem because we decouple input (storing)
from searching, faceting, ...
What you see is just the input for storing and the output of the original
text in the results. There is no need to do any analysis on this.
So don't worry, it has worked like a charm for years now ;-)

With the upgrade from 4.6.1 to 4.10.4 it only turned out that we had never
noticed we were missing 3 or 4 documents out of over 70 million, because
they were silently dropped; that behavior was changed by LUCENE-5472.

Regards
Bernd


Am 12.05.2015 um 00:29 schrieb Erick Erickson:
 I've got to ask _how_ are you intending to search this field? On the
 surface, this feels like an XY problem.
 It's a string type. Therefore, if this is the input:
 
 102, 111, 114, 32, 97, 32, 114, 101, 118, 105, 101, 119, 32, 115, 101,
 101, 32, 66, 114
 
 you'll only ever get a match if you search exactly:
 102, 111, 114, 32, 97, 32, 114, 101, 118, 105, 101, 119, 32, 115, 101,
 101, 32, 66, 114
 
 None of these will match
 102
 102,
 32
 32,
 119, 32, 115
 
 etc.
 
 The idea of doing a match on a single _token_ that's over 32K long is
 pretty far out there, thus
 the check.
 
 The entire multiValued discussion is _probably_ a red herring and
 won't help you. multiValued
 has nothing to do with multiple terms, that's all up to your field type.
 
 So back up and tell us _how_ you intend to search this field. I'm
 guessing you really want
 to make it a text-based type instead. But that's just a guess.
 
 Best,
 Erick.
 
 On Mon, May 11, 2015 at 8:43 AM, Bernd Fehling
 bernd.fehl...@uni-bielefeld.de wrote:
 It turned out that I didn't recognize that dcdescription is not indexed,
 only stored. So the next field in the chain is f_dcperson, where dccreator and
 dcdescription are combined and indexed. And this is why the error
 shows up on f_dcperson (the error is delayed).

 Thanks for your help, regards.
 Bernd


 Am 11.05.2015 um 15:35 schrieb Shawn Heisey:
 On 5/11/2015 7:19 AM, Bernd Fehling wrote:
 After reading https://issues.apache.org/jira/browse/LUCENE-5472
 one question still remains.

 Why is it complaining about f_dcperson, which is a copyField, when the
 original problem field is dcdescription, which is definitely much larger
 than 32766?

 I would assume it would complain about the dcdescription field. Or not?

 If the value resulting in the error does come from a copyField source
 that also uses a string type, then my guess here is that Solr has some
 prioritization that causes the copyField destination to be indexed
 before the sources.  This ordering might make things go a little faster,
 because if it happens right after copying, all or most of the data for
 the destination field would already be sitting in one or more of the CPU
 caches.  Cache hits are wonderful things for performance.

 Thanks,
 Shawn


-- 
*
Bernd Fehling              Bielefeld University Library
Dipl.-Inform. (FH)         LibTec - Library Technology
Universitätsstr. 25        and Knowledge Management
33615 Bielefeld
Tel. +49 521 106-4060   bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*


Re: Sorting on multivalued field in Solr

2015-05-12 Thread Alexandre Rafalovitch
The easiest way is to have a separate field for sorting. Make it a
docValues field as well for faster sorting.

Then you set up an Update Request Processor (URP) chain in which you clone
the field and keep the most appropriate value (the smallest). There are
URPs for that, e.g.
http://www.solr-start.com/info/update-request-processors/#MinFieldValueUpdateProcessorFactory

Regards,
   Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/
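
To make that concrete, here is a minimal SolrJ sketch of the approach. It assumes a
custom update chain named "min-clone" has been configured in solrconfig.xml (e.g.
CloneFieldUpdateProcessorFactory followed by MinFieldValueUpdateProcessorFactory) to
populate a single-valued, docValues-enabled field called custom_code_min; the chain
name and field names are illustrative, not part of any stock configuration.

    import java.util.Arrays;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.UpdateRequest;
    import org.apache.solr.common.SolrInputDocument;

    public class MinSortExample {
        public static void main(String[] args) throws Exception {
            SolrClient client = new HttpSolrClient("http://localhost:8983/solr/collection1");

            // Index a document through the (assumed) "min-clone" update chain, which
            // copies custom_code into custom_code_min and keeps only the smallest value.
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc2");
            doc.addField("custom_code", Arrays.asList(56, 74, 62, 10));

            UpdateRequest update = new UpdateRequest();
            update.setParam("update.chain", "min-clone");
            update.add(doc);
            update.process(client);
            client.commit();

            // Sort on the derived single-valued field instead of the multivalued one.
            SolrQuery query = new SolrQuery("*:*");
            query.setSort("custom_code_min", SolrQuery.ORDER.asc);
            System.out.println(client.query(query).getResults());

            client.close();
        }
    }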


On 12 May 2015 at 16:22, nutchsolruser nutchsolru...@gmail.com wrote:
 Is there any way to sort on a multivalued field in Solr? I have two documents
 with the field custom_code, whose values are as below:

 Doc 1 : 11, 78, 45, 22
 Doc 2 : 56, 74, 62, 10

 When I sort in ascending order, the result should be:

 Doc 2 : 56, 74, 62, 10
 Doc 1 : 11, 78, 45, 22

 Here Doc 2 comes first because its smallest element is 10, which is smaller
 than Doc 1's smallest element, 11.


 How can we achieve this in Solr? What is the easiest way?






 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Sorting-on-multivalues-field-in-Solr-tp4204996.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Transactional Behavior

2015-05-12 Thread Amr Ali
Hello,

I have a business case in which I need to be able to roll back. When I
tried add/commit, I was not able to prevent other threads that write to the
same Solr core from committing everything. I also tried using IndexWriter
directly, but Solr did not see the changes until we restarted it.


--
Regards,
Amr Ali

City stars capital 8 - 3rd floor, Nasr city, Cairo, Egypt
Ext: 278




Re: SolrCloud indexing

2015-05-12 Thread Bill Au
Thanks for the reply.

Actually in our case we want the timestamp to be populated locally on each
node in the SolrCloud cluster.  We want to see if there is any delay in the
document being distributed within the cluster.  Just want to confirm that
the timestamp can be used for that purpose.

Bill
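
A quick way to check for that per-node delay from SolrJ is to query each replica's
core URL directly with distrib=false and compare the stored timestamps. A rough
sketch (the replica URLs, document id, and field name below are placeholders):

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;

    public class ReplicaTimestampCheck {
        public static void main(String[] args) throws Exception {
            // Base URLs of the individual replica cores (placeholders).
            String[] replicas = {
                "http://node1:8983/solr/collection1_shard1_replica1",
                "http://node2:8983/solr/collection1_shard1_replica2"
            };

            for (String base : replicas) {
                try (SolrClient client = new HttpSolrClient(base)) {
                    SolrQuery q = new SolrQuery("id:mydoc");
                    q.setFields("id", "timestamp");
                    q.set("distrib", "false"); // answer from this core only, no fan-out
                    System.out.println(base + " -> " + client.query(q).getResults());
                }
            }
        }
    }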

On Sat, May 9, 2015 at 11:37 PM, Shawn Heisey apa...@elyograg.org wrote:

 On 5/9/2015 8:41 PM, Bill Au wrote:
  Is the behavior of document being indexed independently on each node in a
  SolrCloud cluster new in 5.x or is that true in 4.x also?
 
  If the document is indexed independently on each node, then if I query
 the
  document from each node directly, a timestamp could hold different values
  since the document is indexed independently, right?
 
  <field name="timestamp" type="date" indexed="true" stored="true" default="NOW" />

 SolrCloud has had that behavior from day one, when it was released in
 version 4.0.  You are correct that it can result in a different
 timestamp on each replica if the default comes from schema.xml.

 I am pretty sure that the solution for this problem is to set up an
 update processor chain that includes TimestampUpdateProcessorFactory to
 populate the timestamp field before the document is distributed to each
 replica.

 https://cwiki.apache.org/confluence/display/solr/Update+Request+Processors

 Thanks,
 Shawn




Re: How is the most relevant document of each group chosen when group.truncate is used?

2015-05-12 Thread Andrii Berezhynskyi
Forgot to mention that I'm using solr 5.0


JARs needed to run SolrJ

2015-05-12 Thread Steven White
Hi Everyone,

I am trying to use SolrJ to add docs to Solr.  The following line:

HttpSolrClient solrServer = new HttpSolrClient("http://localhost:8983/solr");

Is failing with exception:

Exception in thread "main" java.lang.NoClassDefFoundError: org.apache.commons.logging.LogFactory
    at org.apache.http.impl.client.CloseableHttpClient.<init>(CloseableHttpClient.java:60)
    at org.apache.http.impl.client.AbstractHttpClient.<init>(AbstractHttpClient.java:271)
    at org.apache.http.impl.client.DefaultHttpClient.<init>(DefaultHttpClient.java:127)
. . . . . . . . . . .
Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
    at java.net.URLClassLoader.findClass(URLClassLoader.java:665)
    at java.lang.ClassLoader.loadClassHelper(ClassLoader.java:942)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:851)
. . . . . . . . . . .

I pulled in everything from \solr-5.1.0\dist\solrj-lib and I included
solr-solrj-5.1.0.jar from \solr-5.1.0\dist.

Why am I getting the above error?  Is there an external JAR I need?  I want
to pull in only the required JARs.

Googling the issue suggests I need to include org-apache-commons-logging.jar
(and a few other JARs), but this JAR is not part of Solr's distribution, so I'm
not willing to add it blindly.

Thanks

Steve


Re: SolrJ vs. plain old HTTP post

2015-05-12 Thread Steven White
Thanks Shalin and all for helping with this question.  It is much
appreciated.

Steve

On Tue, May 12, 2015 at 1:24 AM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 On Mon, May 11, 2015 at 8:20 PM, Steven White swhite4...@gmail.com
 wrote:

  Thanks Erik and Emir.
 
  snip/

 
  To close the loop on this question, I will need to enable Jetty's SSL (the
  Jetty that comes with Solr 5.1).  If I do so, will SolrJ still work? Can I
  assume that SolrJ supports SSL?
 
 
 Yes, SolrJ can work with SSL enabled on the server as long as you pass the
 same JVM parameters on the client side to enable SSL e.g.

 -Djavax.net.ssl.keyStore=
 -Djavax.net.ssl.keyStorePassword=
 -Djavax.net.ssl.trustStore=
 -Djavax.net.ssl.trustStorePassword=

 See

 https://cwiki.apache.org/confluence/display/solr/Enabling+SSL#EnablingSSL-IndexadocumentusingCloudSolrClient
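
 For reference, a minimal sketch of the client side in code, for cases where you
 prefer to set those properties programmatically rather than as -D flags; the
 URL, key-store paths, and passwords below are placeholders:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class SslIndexExample {
        public static void main(String[] args) throws Exception {
            // Same SSL settings as the JVM flags above, set before any client is created.
            System.setProperty("javax.net.ssl.keyStore", "/path/to/solr-ssl.keystore.jks");
            System.setProperty("javax.net.ssl.keyStorePassword", "secret");
            System.setProperty("javax.net.ssl.trustStore", "/path/to/solr-ssl.keystore.jks");
            System.setProperty("javax.net.ssl.trustStorePassword", "secret");

            // Note the https:// URL once SSL is enabled on the Solr side.
            SolrClient client = new HttpSolrClient("https://localhost:8984/solr/collection1");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "ssl-test-1");
            client.add(doc);
            client.commit();
            client.close();
        }
    }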


  I Google'ed but cannot find the answer.
 
  Thanks again.
 
  Steve
 
  On Mon, May 11, 2015 at 8:39 AM, Erik Hatcher erik.hatc...@gmail.com
  wrote:
 
   Another advantage of SolrJ is its SolrCloud (ZK) awareness, taking
   advantage of some client-side routing optimizations so the cluster has
   fewer hops to make.
  
   —
   Erik Hatcher, Senior Solutions Architect
   http://www.lucidworks.com http://www.lucidworks.com/
  
  
  
  
On May 11, 2015, at 8:21 AM, Steven White swhite4...@gmail.com
  wrote:
   
Hi Everyone,
   
If all that I need to do is send data to Solr to add / delete a Solr
document, which tool is better for the job: SolrJ or plain old HTTP
  post?
   
 In other words, what are the advantages of using SolrJ when the need is to
 push data to Solr for indexing?
   
Thanks,
   
Steve
  
  
 



 --
 Regards,
 Shalin Shekhar Mangar.



RE: Retrieving list of synonyms and facet field values

2015-05-12 Thread Siamak Rowshan
Thanks Alessandro, managed resources was exactly what I needed.

-Original Message-
From: Alessandro Benedetti [mailto:benedetti.ale...@gmail.com] 
Sent: Tuesday, May 12, 2015 10:12 AM
To: solr-user@lucene.apache.org
Subject: Re: Retrieving list of synonyms and facet field values

Hi Siamak,

1) You can do that with the managed resources :
Take a look to the synonym section.
https://cwiki.apache.org/confluence/display/solr/Managed+Resources

Specifically :

To determine the synonyms for a specific term, you send a GET request for the 
child resource, such as /schema/analysis/synonyms/english/mad would return 
[angry,upset]. Lastly, you can delete a mapping by sending a DELETE request 
to the managed endpoint.


2) you can use the Term Component (
https://cwiki.apache.org/confluence/display/solr/The+Terms+Component)
It's quite straightforward to use .
If you are talking about the facets, when you send a query to Solr , with the 
facets enabled, you simply need to parse the resulting Json ( or xml).
In the case you are doing it programmatically SolrJ gives great support for the 
facets.

Cheers

2015-05-12 14:43 GMT+01:00 Siamak Rowshan siamak.rows...@softmart.com:

 Hi all, I'm new to Solr and would appreciate any help with this question.
 Is there a way, to retrieve the list of synonyms via the API? I also 
 need to retrieve the values of each facet field via API. For example 
 the list of Cat facet includes: fiction, non-fiction, etc.

 Thanks,
 Siamak




--
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


RE: Trying to get AnalyzingInfixSuggester to work in Solr?

2015-05-12 Thread Reitzel, Charles
Fwiw, we ended up preferring the 4.x spellcheck approach.  For starters, it is 
supported by SolrJ ... :-)

But more importantly, we wanted a mix of both terms and field values in our 
suggestions.   We found the Suggester component doesn't do that.   We also 
weren't interested in matching in the middle of words. Partial prefix matching 
was better and, thus, we used an ngram query.

In addition, we liked the Amazon-style "xyz in Dept X", "xyz in Dept Y" 
suggestions, which we produced by using facets in combination with the ngram 
query.

Finally, we needed to make a minor patch to get document frequency information 
about terms (and collations) provided by the SpellCheckComponent.

https://issues.apache.org/jira/browse/SOLR-7144


So, to summarize, we ended up with a 2-pass suggestion approach:
pass 1: spellcheck with document frequency and collation using 
WFSTLookupFactory and org.apache.solr.spelling.suggest.Suggester.
pass 2: if spellcheck has corrections, use the 1st correction instead of the original 
term as the query against an ngram field (using copyField to populate it from the fields 
we care about).  This query also has a field facet.  The facet values are used 
as "${queryTerm} in ${facet}" suggestions.  Specified fields from matching docs 
are used as suggestions (like the suggester component).

Please don't take this to mean you should be doing anything like what we are 
doing.  But, rather, I'm urging you to dig deeper into your suggestion 
functionality and think hard about what really makes sense for your 
application.  It's a major usability issue for search apps.
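
As a rough illustration of pass 1 from SolrJ, here is a small sketch; the request
handler name, query string, and parameters are assumptions about a typical
spellcheck setup, not the configuration described above:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.client.solrj.response.SpellCheckResponse;

    public class SuggestPassOne {
        public static void main(String[] args) throws Exception {
            SolrClient client = new HttpSolrClient("http://localhost:8983/solr/collection1");

            // Pass 1: ask the spellcheck component for corrections and a collation.
            SolrQuery q = new SolrQuery("ipgone");  // user's partial / misspelled input
            q.setRequestHandler("/spell");          // assumed handler name
            q.set("spellcheck", "true");
            q.set("spellcheck.collate", "true");
            q.set("spellcheck.count", "5");

            QueryResponse rsp = client.query(q);
            SpellCheckResponse spell = rsp.getSpellCheckResponse();

            String term = q.getQuery();
            if (spell != null && !spell.isCorrectlySpelled() && spell.getCollatedResult() != null) {
                term = spell.getCollatedResult(); // feed this into pass 2 (the ngram query)
            }
            System.out.println("pass 2 would query the ngram field with: " + term);
            client.close();
        }
    }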


-Original Message-
From: O. Olson [mailto:olson_...@yahoo.it] 
Sent: Thursday, May 07, 2015 4:19 PM
To: solr-user@lucene.apache.org
Subject: Re: Trying to get AnalyzingInfixSuggester to work in Solr?

Thank you Erick. I'm sorry I did not mention this earlier, but I am still on 
Solr 4.10.3. Once I upgrade to Solr 5.0+ , I would consider your suggestion in 
your blog post. 
O. O. 


Erick Erickson wrote
 Uh, you mean because I forgot to paste in the URL? Sigh...
 
 Anyway, the URL is irrelevant now that you've solved your problem, but 
 in case you're interested:
 http://lucidworks.com/blog/solr-suggester/
 
 Sorry for the confusion.
 Erick





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Trying-to-get-AnalyzingInfixSuggester-to-work-in-Solr-tp4204163p4204392.html
Sent from the Solr - User mailing list archive at Nabble.com.

*
This e-mail may contain confidential or privileged information.
If you are not the intended recipient, please notify the sender immediately and 
then delete it.

TIAA-CREF
*



Re: JARs needed to run SolrJ

2015-05-12 Thread Emir Arnautovic

Hi Steve,
You can find the list of dependencies in its pom: 
http://central.maven.org/maven2/org/apache/solr/solr-solrj/5.1.0/solr-solrj-5.1.0.pom


It would be best if you used a dependency management tool. You can use 
it in a separate project to create an all-in-one jar and then include that 
one in your project, but there is always a chance it will collide with 
other jars in your project.


Thanks,
Emir

--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com



On 12.05.2015 20:33, Steven White wrote:

Hi Everyone,

I am trying to use SolrJ to add docs to Solr.  The following line:

     HttpSolrClient solrServer = new HttpSolrClient("http://localhost:8983/solr");

Is failing with exception:

     Exception in thread "main" java.lang.NoClassDefFoundError: org.apache.commons.logging.LogFactory
         at org.apache.http.impl.client.CloseableHttpClient.<init>(CloseableHttpClient.java:60)
         at org.apache.http.impl.client.AbstractHttpClient.<init>(AbstractHttpClient.java:271)
         at org.apache.http.impl.client.DefaultHttpClient.<init>(DefaultHttpClient.java:127)
     . . . . . . . . . . .
     Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
         at java.net.URLClassLoader.findClass(URLClassLoader.java:665)
         at java.lang.ClassLoader.loadClassHelper(ClassLoader.java:942)
         at java.lang.ClassLoader.loadClass(ClassLoader.java:851)
     . . . . . . . . . . .

I pulled in everything from \solr-5.1.0\dist\solrj-lib and I included
solr-solrj-5.1.0.jar from \solr-5.1.0\dist.

Why I'm getting the above error?  Is there an external JAR I need?  I want
to pull in required JARs only.

Google'ing the issue suggest I need to include org-apache-commons-logging.jar
(and few other JARs) but this JAR is not part of Solr's distribution so Im
not willing to do so blindly.

Thanks

Steve



--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: JARs needed to run SolrJ

2015-05-12 Thread Shawn Heisey
On 5/12/2015 12:33 PM, Steven White wrote:
 Hi Everyone,

 I am trying to use SolrJ to add docs to Solr.  The following line:

  HttpSolrClient solrServer = new HttpSolrClient("http://localhost:8983/solr");

  Is failing with exception:

  Exception in thread "main" java.lang.NoClassDefFoundError: org.apache.commons.logging.LogFactory
      at org.apache.http.impl.client.CloseableHttpClient.<init>(CloseableHttpClient.java:60)
      at org.apache.http.impl.client.AbstractHttpClient.<init>(AbstractHttpClient.java:271)
      at org.apache.http.impl.client.DefaultHttpClient.<init>(DefaultHttpClient.java:127)
  . . . . . . . . . . .
  Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
      at java.net.URLClassLoader.findClass(URLClassLoader.java:665)
      at java.lang.ClassLoader.loadClassHelper(ClassLoader.java:942)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:851)
  . . . . . . . . . . .

 I pulled in everything from \solr-5.1.0\dist\solrj-lib and I included
 solr-solrj-5.1.0.jar from \solr-5.1.0\dist.

You need to make a decision about how to do your logging.  This decision
is intentionally NOT made for you, so you can do whatever you wish.

SolrJ uses the slf4j logging API, but slf4j doesn't actually do any
logging itself, you must include jars to decide which logging framework
will actually do the logging.  In the server/lib/ext directory of the
Solr binary download, you will find a set of jars.  These jars set up
the logging intercepts that Solr (and SolrJ) will need for third-party
libraries, and configure it to bind the actual logging to log4j.

While SolrJ itself uses slf4j directly, some of the third-party
libraries use other logging frameworks, which must be intercepted by
slf4j for a consistent logging experience.  For your error message
above, it is the HttpClient library that is trying to load the Apache
Commons Logging class.  The jcl-over-slf4j jar provides an
implementation of the commons logging classes, and directs those logs
through slf4j.

You will also need the server/resources/log4j.properties file somewhere
on your classpath, or specified in a system property on your program
startup, in order to configure log4j.  You'll probably want to customize
that properties file.

If you want to use a different logging framework other than log4j for
the actual logging, you will need to research how to set up your slf4j
jars to accomplish your goal.  Some limited information can be found here:

http://wiki.apache.org/solr/SolrLogging

More comprehensive information, not specific to Solr, can be found here:

http://slf4j.org/

Thanks,
Shawn



Re: Transactional Behavior

2015-05-12 Thread Jack Krupansky
Solr does have a <rollback/> command, but it is an expert feature and not
so clear how it works in SolrCloud.

See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers
and
https://wiki.apache.org/solr/UpdateXmlMessages#A.22rollback.22


-- Jack Krupansky
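
For completeness, a minimal SolrJ sketch of issuing a rollback; as noted elsewhere in
this thread, it only discards changes that are not yet committed on the server and
gives none of the isolation guarantees of a database transaction:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class RollbackSketch {
        public static void main(String[] args) throws Exception {
            SolrClient client = new HttpSolrClient("http://localhost:8983/solr/collection1");
            try {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "txn-demo-1");
                client.add(doc);   // buffered on the server, not yet committed
                // ... more adds belonging to the same logical unit of work ...
                client.commit();   // make the whole batch visible
            } catch (Exception e) {
                // Discards uncommitted adds/deletes on the server. Anything already
                // committed (autoCommit, another client's commit, a full buffer)
                // cannot be undone this way.
                client.rollback();
                throw e;
            } finally {
                client.close();
            }
        }
    }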

On Tue, May 12, 2015 at 12:58 PM, Amr Ali amr_...@siliconexpert.com wrote:

 Hello,

 I have a business case in which I need to be able to roll back. When I
 tried add/commit, I was not able to prevent other threads that write to the
 same Solr core from committing everything. I also tried using IndexWriter
 directly, but Solr did not see the changes until we restarted it.


 --
 Regards,
 Amr Ali

 City stars capital 8 - 3rd floor, Nasr city, Cairo, Egypt
 Ext: 278





Re: Transactional Behavior

2015-05-12 Thread Emir Arnautovic

Hi Amr,
One option is to include a transaction id in your documents and do a delete 
in case of a failed transaction. It is not a cheap option - it requires an additional 
field if you don't already have something you can use to identify the transaction. 
Assuming rollbacks will not happen too often, deleting is not that big an issue.


Thanks,
Emir

--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



On 12.05.2015 22:37, Amr Ali wrote:

Please check this

https://lucene.apache.org/solr/4_1_0/solr-solrj/org/apache/solr/client/solrj/SolrServer.html#rollback()
Note that this is not a true rollback as in databases. Content you have previously 
added may have been committed due to autoCommit, buffer full, other client performing a 
commit etc.

It is not a real rollback if you have two threads T1 and T2 that are adding. If T1 is 
adding 500 documents and T2 is adding 3, then T2 will commit its 3 documents PLUS the documents added 
by T1 (because T2 will finish its add/commit before T1 due to the number of documents). Solr 
transactions are server-side only.


--
Regards,
Amr Ali

City stars capital 8 - 3rd floor, Nasr city, Cairo, Egypt
Ext: 278



-Original Message-
From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
Sent: Tuesday, May 12, 2015 10:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Transactional Behavior

Solr does have a <rollback/> command, but it is an expert feature and not so 
clear how it works in SolrCloud.

See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers
and
https://wiki.apache.org/solr/UpdateXmlMessages#A.22rollback.22


-- Jack Krupansky

On Tue, May 12, 2015 at 12:58 PM, Amr Ali amr_...@siliconexpert.com wrote:


Hello,

I have a business case in which I need to be able to roll back.
When I tried add/commit, I was not able to prevent other threads that
write to the same Solr core from committing everything. I also tried using
IndexWriter directly, but Solr did not see the changes until we restarted it.


--
Regards,
Amr Ali

City stars capital 8 - 3rd floor, Nasr city, Cairo, Egypt
Ext: 278





--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: Why are these two queries different?

2015-05-12 Thread Frank li
Thanks for your help. I figured it out. Just as you said. Appreciate your
help. Somehow I forgot to reply to your post.

On Wed, Apr 29, 2015 at 9:24 AM, Chris Hostetter hossman_luc...@fucit.org
wrote:


 : We did two SOLR queries and they were supposed to return the same results but
 : did not:

 the short answer is: if you want those queries to return the same results,
 then you need to adjust your query time analyzer forthe all_text field to
 not split intra numberic tokens on ,

 i don't know *why* exactly it's doing that, because you didn't give us the
 full details of your field/fieldtypes (or other really important info: the
 full request params -- echoParams=all -- and the documents matched by your
 second query, etc... https://wiki.apache.org/solr/UsingMailingLists )
 ... but that's the reason the queries are different as evident from the
 parsedquery output.


 : Query 1: all_text:(US 4,568,649 A)
 :
 : parsedquery: (+((all_text:us ((all_text:4 all_text:568 all_text:649
 : all_text:4568649)~4))~2))/no_coord,
 :
 : Result: numFound: 0,
 :
 : Query 2: all_text:(US 4568649)
 :
 : parsedquery: (+((all_text:us all_text:4568649)~2))/no_coord,
 :
 :
 : Result: numFound: 2,
 :
 :
 : We assumed the two return the same result. Our default operator is AND.



 -Hoss
 http://www.lucidworks.com/



RE: Transactional Behavior

2015-05-12 Thread Amr Ali
Please check this

https://lucene.apache.org/solr/4_1_0/solr-solrj/org/apache/solr/client/solrj/SolrServer.html#rollback()
Note that this is not a true rollback as in databases. Content you have 
previously added may have been committed due to autoCommit, buffer full, other 
client performing a commit etc.

It is not a real rollback if you have two threads T1 and T2 that are adding. If 
T1 is adding 500 documents and T2 is adding 3, then T2 will commit its 3 documents PLUS the 
documents added by T1 (because T2 will finish its add/commit before T1 due to the 
number of documents). Solr transactions are server-side only.


--
Regards,
Amr Ali

City stars capital 8 - 3rd floor, Nasr city, Cairo, Egypt
Ext: 278



-Original Message-
From: Jack Krupansky [mailto:jack.krupan...@gmail.com] 
Sent: Tuesday, May 12, 2015 10:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Transactional Behavior

Solr does have a <rollback/> command, but it is an expert feature and not so 
clear how it works in SolrCloud.

See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers
and
https://wiki.apache.org/solr/UpdateXmlMessages#A.22rollback.22


-- Jack Krupansky

On Tue, May 12, 2015 at 12:58 PM, Amr Ali amr_...@siliconexpert.com wrote:

 Hello,

 I have a business case in which I need to be able to roll back. 
 When I tried add/commit, I was not able to prevent other threads that 
 write to the same Solr core from committing everything. I also tried 
 using IndexWriter directly, but Solr did not see the changes until we restarted it.


 --
 Regards,
 Amr Ali

 City stars capital 8 - 3rd floor, Nasr city, Cairo, Egypt
 Ext: 278





How is the most relevant document of each group chosen when group.truncate is used?

2015-05-12 Thread Andrii Berezhynskyi
Hi all,

When I use group.truncate and filtering I'm getting strange faceting
results. If I use just grouping without filtering:

group=true&group.field=parent_sku&group.ngroups=true&group.truncate=true&facet=true&facet.field=color,

 then I get:

"facet_fields": { "color": [ "white", 19742,

19742 white items.

However if I filter by white items:

group=true&group.field=parent_sku&group.ngroups=true&group.truncate=true&facet=true&facet.field=color&fq=color:white,


I'm getting 20543 items. The same happens when I use collapse query parser
instead of grouping.

I would expect those two numbers to be equal. So I assume the most relevant
document of each group is chosen somehow differently when filtering is
used. How can this be explained?

Best regards,
Andrii


Re: schema modification issue

2015-05-12 Thread Ziloo Zolr
Hi Steve,

Thanks for paying attention to this. Here is the JIRA issue I reported:
https://issues.apache.org/jira/browse/SOLR-7536.

Sorry for any inconvenience caused by my unfamiliarity with JIRA.

2015-05-12 0:22 GMT+08:00 Steve Rowe sar...@gmail.com:

 Hi,

 Thanks for reporting, I’m working on a test to reproduce it.

 Can you please create a Solr JIRA issue for this?:
 https://issues.apache.org/jira/browse/SOLR/

 Thanks,
 Steve

  On May 7, 2015, at 5:40 AM, User Zolr zolr.u...@gmail.com wrote:
 
  Hi there,
 
  I have come across a problem: when using the managed schema in
 SolrCloud,
  adding fields to the schema will SOMETIMES end up prompting "Can't find
  resource 'schema.xml' in classpath or '/configs/collectionName',
  cwd=/export/solr/solr-5.1.0/server". There is of course no schema.xml in
  the configs, only 'schema.xml.bak' and 'managed-schema'.
 
  I use SolrJ to create a collection:

      Path tempPath = getConfigPath();
      // customized configs with solrconfig.xml using ManagedIndexSchemaFactory
      client.uploadConfig(tempPath, name);
      if (numShards == 0) {
          numShards = getNumNodes(client);
      }
      Create request = new CollectionAdminRequest.Create();
      request.setCollectionName(name);
      request.setNumShards(numShards);
      replicationFactor = (replicationFactor == 0 ? DEFAULT_REPLICA_FACTOR : replicationFactor);
      request.setReplicationFactor(replicationFactor);
      request.setMaxShardsPerNode(maxShardsPerNode == 0 ? replicationFactor : maxShardsPerNode);
      CollectionAdminResponse response = request.process(client);
 
 
  and adding fields to schema, either by curl or by httpclient,  would
  sometimes yield the following error, but the error can be fixed by
  RELOADING the newly created collection once or several times:
 
  INFO  - [{  responseHeader:{status:500,QTime:5},
  errors:[Error reading input String Can't find resource 'schema.xml' in
  classpath or '/configs/collectionName',
  cwd=/export/solr/solr-5.1.0/server],  error:{msg:Can't find
  resource 'schema.xml' in classpath or '/configs/collectionName',
  cwd=/export/solr/solr-5.1.0/server,trace:java.io.IOException:
 Can't
  find resource 'schema.xml' in classpath or '/configs/collectionName',
  cwd=/export/solr/solr-5.1.0/server
 
  at
 
 org.apache.solr.cloud.ZkSolrResourceLoader.openResource(ZkSolrResourceLoader.java:98)
  at
 
 org.apache.solr.schema.SchemaManager.getFreshManagedSchema(SchemaManager.java:421)
  at
 org.apache.solr.schema.SchemaManager.doOperations(SchemaManager.java:104)
  at
 
 org.apache.solr.schema.SchemaManager.performOperations(SchemaManager.java:94)
  at
 
 org.apache.solr.handler.SchemaHandler.handleRequestBody(SchemaHandler.java:57)
  at
 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
  at
 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
  at
 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
  at
 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
  at
 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
  at
 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
  at
 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
  at
 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
  at
 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
  at
 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
  at
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
  at
 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
  at
 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
  at
 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
  at
 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
  at
 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
  at
 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
  at org.eclipse.jetty.server.Server.handle(Server.java:368)
  at
 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
  at
 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
  at
 
 org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
  at
 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
  at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
  at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
  at
 
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
  

Re: Solr Multilingual Indexing with one field- Guidance

2015-05-12 Thread TK Solr


On 5/7/15, 11:23 AM, Kuntal Ganguly wrote:

1) Is this a correct approach to do it? Or i'm missing something?

Does the user want to see documents that he/she doesn't understand?
Words such as doctor, taxi, etc. are common to many languages in 
Europe.
Would a Spanish user want to see English documents?
Of course, this issue can be worked around by having a separate language field.

How do you handle word collisions among languages?
"Kind" in German means "child" in English. If a German user searches for articles
about children, they will find lots of unrelated English
articles about someone being kind.
This too can be worked around by having a language field.

By default, Solr/Lucene hits are sorted by relevancy score, and
the score calculation uses IDF. If a search term appears in many documents,
its score is low. Because virtually all German documents contain "die", the 
article,
the score of the English word "die" will be low as well.


2) Can you give me an example where there will be problem with this above
new field type? A use-case/scenario with example will be very helpful.


If you have lots of Japanese documents indexed, try searching 京都 (Kyoto).
You will find many documents about Tokyo (東京) because the government
of the metropolitan Tokyo area is spelled as 東京都 = Tokyo Capital, which
generates two bigrams, 東京 and 京都.

Kuro





scoreMode ToParentBlockJoinQuery

2015-05-12 Thread StrW_dev
Hi

Is it possible to configure the scoreMode of the parent block join query
parser (ToParentBlockJoinQuery)?
It seems it's set to None, while I would need Max in this case.

What I want is to filter on child documents, but still use the
relevance/boost of these child documents in the final score.

Gr.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/scoreMode-ToParentBlockJoinQuery-tp4205020.html
Sent from the Solr - User mailing list archive at Nabble.com.


Creating a new collection via solrj

2015-05-12 Thread Sznajder ForMailingList
Hi,

I would like to programmatically create a new collection with a given
schema (the schema.xml file is in my Java project under a folder named
configuration/, for example).

However, I did not find a SolrJ example describing these steps.

If one of you could help..

thanks!

Benjamin


Beginner problems with solr.ICUCollationField

2015-05-12 Thread Björn Keil
Hello,

I am trying to understand solr 5.1 (trying to overcome some problems I
have with solr 3.6) by experimenting with the distributed package, but I
am having problems using a solr.ICUCollationField field.

Trying to create the collection using
bin/create_collection aborts, because it says the classloader
failed to find solr.ICUCollationField.

The documentations says:
## QUOTE ###
solr.ICUCollationField is included in the Solr analysis-extras contrib -
see solr/contrib/analysis-extras/README.txt for instructions on which
jars you need to add to your SOLR_HOME/lib in order to use it.


The mentioned README.txt file says:
## QUOTE ###
ICU relies upon lucene-libs/lucene-analyzers-icu-X.Y.jar
and lib/icu4j-X.Y.jar


Well, that's a bit odd insofar as there is no SOLR_HOME/lib directory.
When I start solr with the verbose flag it says:

SOLR_HOME = (...)/solr-5.1.0/server/solr

However, what I did is first symlink and then copy the respective
libraries to solr-5.1.0/server/lib/ext. That leads to the server not
starting at all. I also tried
solr-5.1.0/server/solr-webapp/webapp/WEB-INF/lib, but that does not make
any difference at all.

Then I had a look at how it's done in the example configurations and
noticed the <lib> tags in the respective solrconfig.xml files. But the
process still fails, even though the logs indicate that the .jar files
*did* load.

There is a message:
201946 [qtp1055930828-14] INFO  org.apache.solr.core.SolrConfig
[booklooker shard2  booklooker_shard2_replica1] – Adding specified lib
dirs to ClassLoader
201947 [qtp1055930828-14] INFO  org.apache.solr.core.SolrResourceLoader
 [booklooker shard2  booklooker_shard2_replica1] – Adding
'file:/home/bjoern/solr-5.1.0/contrib/analysis-extras/lib/icu4j-54.1.jar' to classloader
201948 [qtp1055930828-14] INFO  org.apache.solr.core.SolrResourceLoader
 [booklooker shard2  booklooker_shard2_replica1] – Adding
'file:/home/bjoern/solr-5.1.0/contrib/analysis-extras/lucene-libs/lucene-analyzers-icu-5.1.0.jar' to classloader

and later:
202810 [qtp1055930828-14] ERROR org.apache.solr.core.CoreContainer
[booklooker shard2  booklooker_shard2_replica1] – Error creating core
[booklooker_shard2_replica1]: (...) Caused by:
java.lang.ClassNotFoundException: solr.ICUCollationField

So the respective jar files have been added to the class loader, but it
does not find the field? Well, the class solr.ICUCollationField itself
should be found somewhere in the org.apache.solr tree. Not in the
org.apache.lucene tree and certainly not under com.ibm.icu4j. But I
can't find the proper jar file for it...

It would be nice if someone could tell me what's wrong here.





Re: Creating a new collection via solrj

2015-05-12 Thread Erick Erickson
See the CollectionAdminRequest.createCollection etc.

Best,
Erick
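
A minimal sketch of the flow with SolrJ 5.x, modeled on the snippet quoted elsewhere
in this digest; the ZooKeeper address, config directory, config name, and collection
name are placeholders:

    import java.nio.file.Path;
    import java.nio.file.Paths;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;
    import org.apache.solr.client.solrj.response.CollectionAdminResponse;

    public class CreateCollectionExample {
        public static void main(String[] args) throws Exception {
            CloudSolrClient client = new CloudSolrClient("localhost:9983");

            // 1) Upload the config set (solrconfig.xml, schema.xml, ...) to ZooKeeper.
            Path configDir = Paths.get("configuration"); // project folder holding schema.xml etc.
            client.uploadConfig(configDir, "myconf");

            // 2) Create the collection against that config set.
            CollectionAdminRequest.Create create = new CollectionAdminRequest.Create();
            create.setCollectionName("mycollection");
            create.setConfigName("myconf");
            create.setNumShards(1);
            create.setReplicationFactor(1);
            CollectionAdminResponse response = create.process(client);
            System.out.println("created: " + response.isSuccess());

            client.close();
        }
    }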



On Tue, May 12, 2015 at 3:53 AM, Sznajder ForMailingList
bs4mailingl...@gmail.com wrote:
 Hi,

 I would like to create programmatically a new collection with a given
 Schema (the schema.xml file is in my java project under a folder
 configuration/, for example)

 However, I did not find a solrj example describing these steps.

 If one of you could help..

 thanks!

 Benjamin


utility methods to get field values from index

2015-05-12 Thread Parvesh Garg
Hi All,

Was wondering if there is any class in Solr that provides utility methods
to fetch indexed field values for documents using docId. Something simple
like

getMultiLong(String field, int docId)

getLong(String field, int docId)

We have written a Solr component to return group-level stats like avg
score, max score, etc. over a large number of documents (say 5000+) against a
query executed using edismax. We need to get the group id field's value to do
that; it is a single-valued long field.

This component also looks at one more field, a multivalued long field,
for each document and computes a score based on frequency + document
score for each value.

Currently we are using stored fields and were wondering if such an
index-based approach would be faster.

Apologies if this is too much to ask for.

Parvesh Garg,
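
For reference, one possible sketch of reading per-document long values via docValues
inside a SearchComponent, assuming Lucene/Solr 5.x APIs and that both fields are
indexed with docValues="true"; the field names are illustrative and this is not a
stock Solr utility class:

    import java.io.IOException;
    import org.apache.lucene.index.DocValues;
    import org.apache.lucene.index.LeafReader;
    import org.apache.lucene.index.NumericDocValues;
    import org.apache.lucene.index.SortedNumericDocValues;
    import org.apache.solr.handler.component.ResponseBuilder;
    import org.apache.solr.handler.component.SearchComponent;

    public class GroupStatsComponent extends SearchComponent {

        @Override
        public void prepare(ResponseBuilder rb) throws IOException {
            // nothing to prepare in this sketch
        }

        @Override
        public void process(ResponseBuilder rb) throws IOException {
            LeafReader reader = rb.req.getSearcher().getLeafReader();

            // Single-valued long field: roughly the getLong(field, docId) asked about.
            NumericDocValues groupIds = DocValues.getNumeric(reader, "group_id");

            // Multivalued long field: roughly getMultiLong(field, docId).
            SortedNumericDocValues related = DocValues.getSortedNumeric(reader, "related_ids");

            int docId = 0; // in a real component this comes from the DocList/DocSet of matches
            long groupId = groupIds.get(docId); // key the aggregated stats by this value

            related.setDocument(docId);
            for (int i = 0; i < related.count(); i++) {
                long value = related.valueAt(i);
                // ... accumulate per-value frequency / score statistics for groupId here ...
            }
        }

        @Override
        public String getDescription() {
            return "group stats sketch";
        }

        @Override
        public String getSource() {
            return null;
        }
    }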


Re: Retrieving list of synonyms and facet field values

2015-05-12 Thread Alessandro Benedetti
Hi Siamak,

1) You can do that with the managed resources :
Take a look to the synonym section.
https://cwiki.apache.org/confluence/display/solr/Managed+Resources

Specifically :

To determine the synonyms for a specific term, you send a GET request for
the child resource, such as /schema/analysis/synonyms/english/mad would
return [angry,upset]. Lastly, you can delete a mapping by sending a
DELETE request to the managed endpoint.
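
A bare-bones Java sketch of that GET, using nothing beyond the JDK; the collection
name and the managed resource name ("english") are placeholders matching the example
path above:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class SynonymLookup {
        public static void main(String[] args) throws Exception {
            // Managed synonyms REST endpoint; "mad" is the term to look up.
            URL url = new URL("http://localhost:8983/solr/collection1/schema/analysis/synonyms/english/mad");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestProperty("Accept", "application/json");

            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // JSON body listing the mapped synonyms
                }
            }
            conn.disconnect();
        }
    }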


2) you can use the Terms Component (
https://cwiki.apache.org/confluence/display/solr/The+Terms+Component).
It's quite straightforward to use.
If you are talking about the facets: when you send a query to Solr with
facets enabled, you simply need to parse the resulting JSON (or XML).
In case you are doing it programmatically, SolrJ gives great support for
the facets.

Cheers
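
To illustrate point 2, a short SolrJ sketch of retrieving field values both via the
Terms Component and via facets, for a field named "cat" and assuming the /terms
handler is configured (names are illustrative):

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.FacetField;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.client.solrj.response.TermsResponse;

    public class FieldValuesExample {
        public static void main(String[] args) throws Exception {
            SolrClient client = new HttpSolrClient("http://localhost:8983/solr/collection1");

            // Option A: Terms Component - raw indexed terms of the field.
            SolrQuery terms = new SolrQuery();
            terms.setRequestHandler("/terms");
            terms.setTerms(true);
            terms.addTermsField("cat");
            terms.setTermsLimit(100);
            TermsResponse termsResponse = client.query(terms).getTermsResponse();
            for (TermsResponse.Term t : termsResponse.getTerms("cat")) {
                System.out.println(t.getTerm() + " (" + t.getFrequency() + ")");
            }

            // Option B: facet on the field as part of a normal query.
            SolrQuery facets = new SolrQuery("*:*");
            facets.setRows(0);
            facets.setFacet(true);
            facets.addFacetField("cat");
            QueryResponse rsp = client.query(facets);
            for (FacetField.Count c : rsp.getFacetField("cat").getValues()) {
                System.out.println(c.getName() + " (" + c.getCount() + ")");
            }

            client.close();
        }
    }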

2015-05-12 14:43 GMT+01:00 Siamak Rowshan siamak.rows...@softmart.com:

 Hi all, I'm new to Solr and would appreciate any help with this question.
 Is there a way, to retrieve the list of synonyms via the API? I also need
 to retrieve the values of each facet field via API. For example the list of
 Cat facet includes: fiction, non-fiction, etc.

 Thanks,
 Siamak




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Sorting on multivalued field in Solr

2015-05-12 Thread Nutch Solr User
Thanks Alex, that was really useful.



-
Nutch Solr User

The ultimate search engine would basically understand everything in the world, 
and it would always give you the right thing.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Sorting-on-multivalued-field-in-Solr-tp4204996p4205017.html
Sent from the Solr - User mailing list archive at Nabble.com.


Retrieving list of synonyms and facet field values

2015-05-12 Thread Siamak Rowshan
Hi all, I'm new to Solr and would appreciate any help with this question. Is 
there a way to retrieve the list of synonyms via the API? I also need to 
retrieve the values of each facet field via the API. For example, the list for the Cat 
facet includes: fiction, non-fiction, etc.

Thanks,
Siamak  


Re: scoreMode ToParentBlockJoinQuery

2015-05-12 Thread Alessandro Benedetti
Hi ,
A year or so ago, it was not possible to have the results of the join
sorted in Solr (it was not using the Lucene sorting).
In Solr it was only a filter query, with no scoring.
I should verify whether we are currently in the same scenario.
In any case, it should not be a big deal to port the Lucene feature to Solr.

Cheers


2015-05-12 11:11 GMT+01:00 StrW_dev r.j.bamb...@structweb.nl:

 Hi

 Is it possible to configure the scoreMode of the Parent block join query
 parser (ToParentBlockJoinQuery)?
 It seems it's set to none, while i would require max in this case.

 What I want is to filter on child documents, but still use the
 relevance/boost of these child documents in the final score.

 Gr.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/scoreMode-ToParentBlockJoinQuery-tp4205020.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Beginner problems with solr.ICUCollationField

2015-05-12 Thread Shawn Heisey
On 5/12/2015 7:03 AM, Björn Keil wrote:
 Well, that's a bid odd in so far as there is no SOLR_HOME/lib
 directory. When I start solr with the verbose flag it says:

 SOLR_HOME = (...)/solr-5.1.0/server/solr

 However, what I did is first symlink and then copy the respective
 libraries to solr-5.1.0/server/lib/ext. That leads to the server not
 starting at all. I also tried
 solr-5.1.0/server/solr-webapp/webapp/WEB-INF/lib, but that does not
 make any difference at all.

The ${solr.solr.home}/lib directory does not exist by default in the
Solr example, but if you create it and use it for all the contrib/user
jars that your Solr config needs, it will work.  You should completely
remove all <lib> config elements from solrconfig.xml at the same time,
and make sure that any jar you need is in that lib directory.  All the
jars will be loaded once and be available to all cores.

There seems to be some kind of problem with the classloader when certain
jars (the ICU jars being the one example I'm sure about) are loaded more
than once by the same classloader.

https://issues.apache.org/jira/browse/SOLR-4852

Thanks,
Shawn



Re: scoreMode ToParentBlockJoinQuery

2015-05-12 Thread StrW_dev
I actually did some digging and changed the default ScoreMode in the source
code, which allowed me to do what I want.

So now I use the parent block join query, which propagates the score. With
the new child transformer for the return field I can even get the child info
in the result :).



--
View this message in context: 
http://lucene.472066.n3.nabble.com/scoreMode-ToParentBlockJoinQuery-tp4205020p4205074.html
Sent from the Solr - User mailing list archive at Nabble.com.
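
For readers wondering what the child transformer refers to: a hedged SolrJ sketch of
a block-join parent query combined with the [child] doc transformer (available since
Solr 4.9). The field names are illustrative, and note that the stock 5.1 {!parent}
parser does not expose a score mode, which is exactly why the source was patched above.

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrDocument;

    public class BlockJoinChildExample {
        public static void main(String[] args) throws Exception {
            SolrClient client = new HttpSolrClient("http://localhost:8983/solr/collection1");

            // Match parents whose children match color:blue; doc_type is an
            // illustrative field marking the parent document of each block.
            SolrQuery q = new SolrQuery("{!parent which=\"doc_type:parent\"}color:blue");

            // Return the matching children nested under each parent.
            q.setFields("id", "score", "[child parentFilter=doc_type:parent limit=10]");

            for (SolrDocument parent : client.query(q).getResults()) {
                System.out.println(parent.getFieldValue("id") + " -> " + parent.getChildDocuments());
            }
            client.close();
        }
    }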


Re: scoreMode ToParentBlockJoinQuery

2015-05-12 Thread Alessandro Benedetti
So, have you customised your Solr with a plugin?
Do you have additional info or documentation? What is the new child
transformer? I never used it!

Cheers

2015-05-12 16:12 GMT+01:00 StrW_dev r.j.bamb...@structweb.nl:

 I actually did some digging and changed the default ScoreMode in the source
 code, which actually allowed me to do what I want.

 So now I use the parent block join query which propogates the score. With
 the new child transformer for the return field I can even get the child
 info
 in the result :).



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/scoreMode-ToParentBlockJoinQuery-tp4205020p4205074.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England