Re: Select TOP 10 items from Solr Query

2017-02-16 Thread Michael Kuhlmann
So basically you want faceting only on the returned result set?

I doubt that this is possible without additional queries. The issue is
that faceting and result collection are done within a single iteration, so
when a document (actually the document's internal id) is collected as a
possible result item, you can't yet tell whether it will make it into the
top x elements, because more documents are still to come.

-Michael

On 17.02.2017 at 05:00, Zheng Lin Edwin Yeo wrote:
> Hi,
>
> I would like to check: is it possible to select, say, the TOP 10 items from
> a Solr query and use that list of items in another query (e.g. a JSON
> Facet)?
>
> Currently, I'm using normal faceting to retrieve the list of the TOP 10
> items.
> After that, I have to list all 10 items as a filter when I run the JSON
> Facet, like this:
> q=itemNo:(001 002 003 004 005 006 007 008 009 010)
>
> It would help if I could combine both of these into a single query.
>
> I'm using Solr 6.4.1
>
> Regards,
> Edwin
>



Select TOP 10 items from Solr Query

2017-02-16 Thread Zheng Lin Edwin Yeo
Hi,

I would like to check: is it possible to select, say, the TOP 10 items from
a Solr query and use that list of items in another query (e.g. a JSON
Facet)?

Currently, I'm using normal faceting to retrieve the list of the TOP 10
items.
After that, I have to list all 10 items as a filter when I run the JSON
Facet, like this:
q=itemNo:(001 002 003 004 005 006 007 008 009 010)

It would help if I could combine both of these into a single query.

I'm using Solr 6.4.1

Regards,
Edwin


Sort by field Type String

2017-02-16 Thread Deeksha Sharma
Hi,

I have an index with a field that looks like below:


Below are some examples of version values:

"version":"2.0.5"},
  {
"version":"1.10-b04"},
  {
"version":"2.3.3"},
  {
"version":"2.0-M5.1"},
  {
"version":"0.4.0"},
  {
"version":"2.1.0-M01"},
  {
"version":"2.0.3"},
  {
"version":"4.2.2"},
  {
"version":"5.2.12.Final"},
  {
"version":"1.7.4"}]
 }

Following the instructions in the documentation here:
https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-ThesortParameter

I want to sort the query results by this version field. When I add
sort=version asc to my query, it gives me some weird results (shown below):


{
  "responseHeader":{
"status":0,
"QTime":0,
"params":{
  "q":"*:*",
  "indent":"true",
  "fl":"version",
  "sort":"version asc",
  "wt":"json"}},
  "response":{"numFound":5249127,"start":0,"docs":[
  {
"version":"\"1.0.0"},
  {
"version":"\"1.0.0"},
  {
"version":"$%7Bcucumber-jvm.version%7D"},
  {
"version":"$%7Bcucumber-jvm.version%7D"},
  {
"version":"$%7Bcucumber-jvm.version%7D"},
  {
"version":"$%7Blog4jplugin.version%7D"},
  {
"version":"$%7Blog4jplugin.version%7D"},
  {
"version":"$%7Blog4jplugin.version%7D"},
  {
"version":"$%7Blog4jplugin.version%7D"},
  {
"version":"${env.VERSION}"}]
  }}
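
For context, a plain string field sorts purely lexicographically by
character order, so values starting with '"' or '$' come first and "1.10"
sorts before "1.2". A small standalone Java sketch of the difference
between that and a numeric-aware comparison (illustrative only; this is not
how Solr itself sorts):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class VersionSortDemo {

    // Compare dot/dash-separated segments numerically when both sides are
    // numbers, falling back to plain string comparison otherwise
    // ("b04", "M5", "Final", ...).
    static int compareVersions(String a, String b) {
        String[] as = a.split("[.-]");
        String[] bs = b.split("[.-]");
        for (int i = 0; i < Math.min(as.length, bs.length); i++) {
            int cmp;
            if (as[i].matches("\\d+") && bs[i].matches("\\d+")) {
                cmp = Integer.compare(Integer.parseInt(as[i]), Integer.parseInt(bs[i]));
            } else {
                cmp = as[i].compareTo(bs[i]);
            }
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(as.length, bs.length);
    }

    public static void main(String[] args) {
        List<String> versions = new ArrayList<>(Arrays.asList("1.2.0", "1.10.0"));

        versions.sort(Comparator.naturalOrder());         // what a string field gives you
        System.out.println(versions);                      // [1.10.0, 1.2.0]

        versions.sort(VersionSortDemo::compareVersions);   // numeric-aware comparison
        System.out.println(versions);                      // [1.2.0, 1.10.0]
    }
}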

Can someone please help?



RE: Atomic updates to increase single field bulk updates?

2017-02-16 Thread Chris Hostetter

: partial update or a complete document. Under the hood a partial update 
: is a complete object anyway. Using partial updates you gain a little 
: bandwidth at the expense of additional stored fields.

FWIW: once SOLR-5944 lands in a released version, that won't always be
true -- atomic updates on numeric fields that are docValues="true" and
nothing else (stored=false, indexed=false) will use updatable docvalues
under the covers and should be much more efficient than either reindexing
the entire document or the default atomic update codepath of re-indexing
all fields from stored values.
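
For reference, the client side of a stored-fields atomic update from SolrJ
looks roughly like this today (untested sketch; collection name, field name
and id are made-up placeholders):

SolrClient client =
    new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build();

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "doc-42");
// "inc" bumps a numeric field by the given amount;
// "set" would replace the current value instead.
doc.addField("popularity_i", Collections.singletonMap("inc", 1));

client.add(doc);
client.commit();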



-Hoss
http://www.lucidworks.com/


Multiple bf VS Single sum bf (performance)

2017-02-16 Thread Ivan Bianchi
Hi,

I'm writing a fairly complex boost function for my search and I'm wondering
which method has better performance:

   - Single *bf* function with a *sum()* chain.

*For example*: bf: sum(sum(sum(A,B),C),D)

   - Multiple *bf* parameters (which internally produce the same sum chain).

*For example*: bf: [A,B,C,D].


I'm asking because I saw the caching benefit of multiple *fq* parameters
instead of a single AND-chained *fq*, so I'm wondering whether the same
applies to *bf*.

For debugging purposes the second one is clearly better, as I can see the
individual scores with *debugQuery*.
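
For reference, the two shapes I mean, expressed with SolrJ (rough sketch;
A, B, C and D stand in for the real functions, and defType=edismax is just
an example):

// Variant 1: one bf with an explicit sum() chain
SolrQuery single = new SolrQuery("some query");
single.set("defType", "edismax");
single.set("bf", "sum(sum(sum(A,B),C),D)");

// Variant 2: four separate bf parameters
SolrQuery multiple = new SolrQuery("some query");
multiple.set("defType", "edismax");
multiple.add("bf", "A");
multiple.add("bf", "B");
multiple.add("bf", "C");
multiple.add("bf", "D");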

Best regards,

-- 
Ivan


Re: Solr6.3.0 SolrJ API for Basic Authentication

2017-02-16 Thread Hrishikesh Gadre
Hey,

The other alternative would be to implement an HttpClientConfigurer which
can perform preemptive basic authentication (the same way SolrRequest
sends the credentials). The code is available in branch_6x here:

https://github.com/apache/lucene-solr/blob/1bfa057d5c9d89b116031baa7493ee422b4cbabb/solr/solrj/src/java/org/apache/solr/client/solrj/impl/PreemptiveBasicAuthConfigurer.java

For a 6.3.x installation, you can copy this code and include it as a
custom jar on the client side. As part of the client initialization, just
configure this client configurer implementation. Here is a sample code
snippet you can use as a reference:

https://github.com/apache/lucene-solr/blob/a986368fd0670840177a8c19fb15dcd1f0e69797/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClientUtil.java#L111-L123
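
Very roughly, the client-side wiring would look like this (untested sketch;
treat the HttpClientUtil.setConfigurer() hook as an assumption and check it
against your 6.3.x SolrJ, and note that how the copied
PreemptiveBasicAuthConfigurer class picks up the username/password is up to
that class -- see the HttpClientUtil lines linked above):

// Register the configurer before creating any SolrClient so every
// HttpClient built via HttpClientUtil authenticates preemptively.
HttpClientUtil.setConfigurer(new PreemptiveBasicAuthConfigurer());

HttpSolrClient client =
    new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build();
// index / query as usual; the underlying HttpClient now sends the
// Basic auth header preemptively on every request.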

Let me know if you have any questions.

Thanks
Hrishikesh


On Thu, Feb 16, 2017 at 6:39 AM, Bryan Bende  wrote:

> Hello,
>
> The QueryRequest was just an example, it will work with any request
> that extends SolrRequest.
>
> How are you indexing your documents?
>
> I am going to assume you are doing something like this:
>
> SolrClient client = ...
> client.add(solrInputDocument);
>
> Behind the scenes this will do something like the following:
>
> UpdateRequest req = new UpdateRequest();
> req.add(doc);
> req.setCommitWithin(commitWithinMs);
> req.process(client, collection);
>
> So you can do that yourself and first set the basic auth credentials
> on the UpdateRequest which extends SolrRequest.
>
> Thanks,
>
> Bryan
>
> On Thu, Feb 16, 2017 at 5:45 AM, vrindavda  wrote:
> > Hi Bryan,
> >
> > Thanks for your quick response.
> >
> > I am trying to ingest data into SolrCloud, hence I will not have any Solr
> > query. Would it be the right approach to use QueryRequest to index data?
> > Do I need to put in a dummy solrQuery instead?
> >
> >
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Solr6-3-0-SolrJ-API-for-Basic-Authentication-tp4320238p4320675.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Testing an ingest framework that uses Apache Tika

2017-02-16 Thread Mattmann, Chris A (3010)
++1 awesome job

++
Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, NSF & Open Source Projects Formulation and Development Offices (8212)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 180-503E, Mailstop: 180-503
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++
 

On 2/16/17, 5:28 AM, "Luís Filipe Nassif"  wrote:

Excellent, Tim! Thank you for all your great work on Apache Tika!

2017-02-16 11:23 GMT-02:00 Konstantin Gribov :

> Tim,
>
> it's an awesome feature for downstream projects' integration tests. Thanks
> for implementing it!
>
> Thu, 16 Feb 2017 at 16:17, Allison, Timothy B. :
>
> > All,
> >
> > I finally got around to documenting Apache Tika's MockParser[1].  As of
> > Tika 1.15 (unreleased), add tika-core-tests.jar to your class path, and
> you
> > can simulate:
> >
> > 1. Regular catchable exceptions
> > 2. OOMs
> > 3. Permanent hangs
> >
> > This will allow you to determine if your ingest framework is robust
> > against these issues.
> >
> > As always, we fix Tika when we can, but if history is any indicator,
> > you'll want to make sure your ingest code can handle these issues if you
> > are handling millions/billions of files from the wild.
> >
> > Cheers,
> >
> > Tim
> >
> >
> > [1] https://wiki.apache.org/tika/MockParser
> >
> --
>
> Best regards,
> Konstantin Gribov
>




Re: Solr6.3.0 SolrJ API for Basic Authentication

2017-02-16 Thread Bryan Bende
Hello,

The QueryRequest was just an example, it will work with any request
that extends SolrRequest.

How are you indexing your documents?

I am going to assume you are doing something like this:

SolrClient client = ...
client.add(solrInputDocument);

Behind the scenes this will do something like the following:

UpdateRequest req = new UpdateRequest();
req.add(doc);
req.setCommitWithin(commitWithinMs);
req.process(client, collection);

So you can do that yourself and first set the basic auth credentials
on the UpdateRequest which extends SolrRequest.
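
Putting it together, something like this (rough sketch; the zk host,
collection name, document and credentials are all placeholders):

SolrClient client = new CloudSolrClient.Builder()
    .withZkHost("localhost:9983")
    .build();

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "doc-1");

UpdateRequest req = new UpdateRequest();
req.add(doc);
req.setCommitWithin(5000);
// setBasicAuthCredentials() comes from SolrRequest, so it is available
// on UpdateRequest as well.
req.setBasicAuthCredentials("solr", "SolrRocks");
req.process(client, "mycollection");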

Thanks,

Bryan

On Thu, Feb 16, 2017 at 5:45 AM, vrindavda  wrote:
> Hi Bryan,
>
> Thanks for your quick response.
>
> I am trying to ingest data into SolrCloud, hence I will not have any Solr
> query. Would it be the right approach to use QueryRequest to index data? Do
> I need to put in a dummy solrQuery instead?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr6-3-0-SolrJ-API-for-Basic-Authentication-tp4320238p4320675.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Testing an ingest framework that uses Apache Tika

2017-02-16 Thread Luís Filipe Nassif
Excellent, Tim! Thank you for all your great work on Apache Tika!

2017-02-16 11:23 GMT-02:00 Konstantin Gribov :

> Tim,
>
> it's an awesome feature for downstream projects' integration tests. Thanks
> for implementing it!
>
> Thu, 16 Feb 2017 at 16:17, Allison, Timothy B. :
>
> > All,
> >
> > I finally got around to documenting Apache Tika's MockParser[1].  As of
> > Tika 1.15 (unreleased), add tika-core-tests.jar to your class path, and
> you
> > can simulate:
> >
> > 1. Regular catchable exceptions
> > 2. OOMs
> > 3. Permanent hangs
> >
> > This will allow you to determine if your ingest framework is robust
> > against these issues.
> >
> > As always, we fix Tika when we can, but if history is any indicator,
> > you'll want to make sure your ingest code can handle these issues if you
> > are handling millions/billions of files from the wild.
> >
> > Cheers,
> >
> > Tim
> >
> >
> > [1] https://wiki.apache.org/tika/MockParser
> >
> --
>
> Best regards,
> Konstantin Gribov
>


Re: Testing an ingest framework that uses Apache Tika

2017-02-16 Thread Konstantin Gribov
Tim,

it's an awesome feature for downstream projects' integration tests. Thanks
for implementing it!

Thu, 16 Feb 2017 at 16:17, Allison, Timothy B. :

> All,
>
> I finally got around to documenting Apache Tika's MockParser[1].  As of
> Tika 1.15 (unreleased), add tika-core-tests.jar to your class path, and you
> can simulate:
>
> 1. Regular catchable exceptions
> 2. OOMs
> 3. Permanent hangs
>
> This will allow you to determine if your ingest framework is robust
> against these issues.
>
> As always, we fix Tika when we can, but if history is any indicator,
> you'll want to make sure your ingest code can handle these issues if you
> are handling millions/billions of files from the wild.
>
> Cheers,
>
> Tim
>
>
> [1] https://wiki.apache.org/tika/MockParser
>
-- 

Best regards,
Konstantin Gribov


Testing an ingest framework that uses Apache Tika

2017-02-16 Thread Allison, Timothy B.
All,

I finally got around to documenting Apache Tika's MockParser[1].  As of Tika 
1.15 (unreleased), add tika-core-tests.jar to your class path, and you can 
simulate:

1. Regular catchable exceptions
2. OOMs
3. Permanent hangs

This will allow you to determine if your ingest framework is robust against 
these issues.

As always, we fix Tika when we can, but if history is any indicator, you'll 
want to make sure your ingest code can handle these issues if you are handling 
millions/billions of files from the wild.
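
The hangs and OOMs in particular are hard to defend against in-process;
typical mitigations are a watchdog thread with a timeout and/or running the
parse in a separate process (e.g. Tika's ForkParser or tika-server). A
rough, framework-agnostic sketch of the timeout part (plain Java, not
Tika-specific; the parse task is a stand-in for a real parse call):

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ParseWithTimeout {

    private static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    // Run one parse task on a worker thread and give up after the timeout so
    // a permanently hanging parse can't stall the whole ingest pipeline.
    static String parseOrSkip(Callable<String> parseTask, long timeoutSeconds) {
        Future<String> future = POOL.submit(parseTask);
        try {
            return future.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);   // best effort; a truly stuck thread may
            return "";             // require a forked process instead
        } catch (ExecutionException e) {
            return "";             // the "regular catchable exceptions" case
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "";
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrSkip(() -> "extracted text", 30));
        POOL.shutdown();
    }
}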

Cheers,

Tim


[1] https://wiki.apache.org/tika/MockParser


Re: Solr6.3.0 SolrJ API for Basic Authentication

2017-02-16 Thread vrindavda
Hi Bryan,

Thanks for your quick response.

I am trying to ingest data into SolrCloud, hence I will not have any Solr
query. Would it be the right approach to use QueryRequest to index data? Do
I need to put in a dummy solrQuery instead?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr6-3-0-SolrJ-API-for-Basic-Authentication-tp4320238p4320675.html
Sent from the Solr - User mailing list archive at Nabble.com.