Solr 4.10 Joins: Slow performance with millions of documents

2016-08-14 Thread Tim Frey
Hi there.  I'm trying to fix a performance problem I have with queries that
use Solr's Join feature.  The query is intended to find all Job
Applications that have an Interview in a particular state.  There are 20
million Job Applications and around 7 million Interviews, with 1 million
Interviews in the state I'm looking for.  With all other filters applied,
the total result set is around 5000 documents.  The query takes around 10
seconds.

After reading up on how Joins are essentially just subqueries, I understand
why my original approach would be slow.  However, when I further restrict
the "inner query" to a single Job Application, the entire query still takes
around 5 seconds.  In this case, the inner query matches 2 documents and the
total result set size is 1 document (as expected).
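
For illustration, a request of the kind described above might be assembled
roughly as follows. This is only a sketch: the field names (type_s, state_s,
job_application_id_i, id_i) and the state value are hypothetical placeholders,
not the actual schema. Only the {!join from=... to=...} local-params syntax
comes from Solr itself.

```python
from urllib.parse import urlencode


def build_join_params(state, application_id=None):
    """Build Solr request parameters joining Interviews to Job Applications.

    All field names are hypothetical placeholders, not the real schema.
    """
    # Inner query: runs against the Interview documents.
    inner = f'type_s:Interview AND state_s:"{state}"'
    if application_id is not None:
        # Optional extra restriction to a single Job Application.
        inner += f" AND job_application_id_i:{application_id}"
    # Map matching Interviews (from) onto Job Applications (to).
    join = "{!join from=job_application_id_i to=id_i}" + inner
    return {"q": "type_s:JobApplication", "fq": join, "rows": 10}


params = build_join_params("scheduled", application_id=12345)
print(urlencode(params))
```

The second case in this message corresponds to passing application_id, which
narrows the inner query without changing the join itself.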

Here's the debug output:
https://gist.github.com/tfrey7/50cd92c98e767ec612cc98bf430b9931

I'm using Solr 4.10.  All documents are in the same index.  The ID columns
are dynamic integer fields, because we're using the Sunspot Ruby library;
they are defined exactly as in:
https://github.com/sunspot/sunspot/blob/master/sunspot_solr/solr/solr/configsets/sunspot/conf/schema.xml#L179

Is there something obviously wrong with the query that I'm making?  Can
query-time Joins ever work for a scenario like this?

Thanks!


insertion time

2016-08-14 Thread Mahmoud Almokadem
Hello, 

We update the same documents many times using DataImportHandler. Can I
add one field that records when a document was first inserted into the index
and another field that records when it was last updated?
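
The intended behaviour can be sketched independently of DataImportHandler:
the first-insert timestamp is carried forward from the existing document,
while the last-update timestamp is overwritten on every write. The field
names inserted_at and updated_at and the in-memory index dict are
hypothetical. This illustrates the semantics only and is not a
DataImportHandler configuration.

```python
from datetime import datetime, timezone


def upsert(index, doc_id, fields, now=None):
    """Insert or replace a document, tracking two timestamps.

    `index` is a plain dict standing in for the search index (an assumption
    made for illustration).
    """
    now = now or datetime.now(timezone.utc)
    existing = index.get(doc_id)
    doc = dict(fields)
    doc["id"] = doc_id
    # Keep the original insertion time if the document already exists.
    doc["inserted_at"] = existing["inserted_at"] if existing else now
    # Always refresh the last-update time.
    doc["updated_at"] = now
    index[doc_id] = doc
    return doc
```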


Thanks,
Mahmoud 

Re: SOLR-7036 - a new Faster method for facet grouping

2016-08-14 Thread danny teichthal
Hi,
A reminder, in case anyone interested in the performance of grouped
facets missed the earlier mail.
There's a new patch for improving grouped facet performance that works with
the latest branch.

It uses the UIF method from the JSON API.
It could be a first step towards adding support for grouped facets to the JSON API.
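
For context, this is roughly how a terms facet already selects the UIF
(UnInvertedField) method in the JSON Facet API. The sketch shows that existing
option only, not the SOLR-7036 patch or any parameter it introduces. The field
name category_s and the facet label by_field are hypothetical, and whether
the "method" option is available depends on the Solr version.

```python
import json


def terms_facet_request(field, query="*:*", limit=10):
    """Build a JSON Request API body with a terms facet using method "uif"."""
    return {
        "query": query,
        "limit": 0,  # return facet counts only, no documents
        "facet": {
            "by_field": {
                "type": "terms",
                "field": field,
                "limit": limit,
                "method": "uif",  # request the UnInvertedField algorithm
            }
        },
    }


body = terms_facet_request("category_s")
print(json.dumps(body, indent=2))
```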

Please take a look at https://issues.apache.org/jira/browse/SOLR-7036
Comments and votes are welcome.

On Wed, Jul 27, 2016 at 11:31 AM, danny teichthal 
wrote:

> Hi,
> SOLR-7036 introduced a new faster method for group.facet, which uses
> UnInvertedField.
> It was patched for version 4.x.
> Over the last week, my colleague uploaded a new patch that works against
> the trunk.
>
> We would really appreciate it if anyone could take a look at it and give us
> some feedback.
> Full details and performance test results were also added to the JIRA
> issue.
>
> We are willing to keep working on it and, if possible, backport it to an
> older branch.
>
> Link:
> https://issues.apache.org/jira/browse/SOLR-7036
>
>
> Thanks in advance,
>