Re: [EXTERNAL] Re: Facet Sorting

2018-07-18 Thread Satheesh . Akkinepally
Thank you, Chris, for the immediate response. I am using JSON facets on Solr 
7.2.1, and my question was about JSON facets; I will try your approach. 
Basically I want to use the average of the scores rather than the sum, and 
sort the buckets by that average score. The scores, I am assuming, would have 
already been calculated in the QueryComponent when those docs were collected 
for the original query, so calculating the average and sorting the buckets 
should be simple enough with your approach.
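
In other words, something like this sketch - just your example below with the 
aggregate swapped to avg, assuming I've read the json.facet docs right and 
avg is supported as an aggregate there:

curl http://localhost:8983/solr/techproducts/query -d 'q=features:lcd&rows=0&
 json.facet={
   categories:{
     type : terms,
     field : cat,
     sort : { x : desc },
     facet:{
       x : "avg(query($q))",
     }
   }
 }
'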

Thanks again for the reply. 


On 7/18/18, 3:45 PM, "Chris Hostetter"  wrote:


: If I want to plug in my own sorting for facets, what would be the best 
: approach. I know, out of the box, solr supports sort by facet count and 
: sort by alpha. I want to plug in my own sorting (say by relevancy). Is 
: there a way to do that? Where should I start with if I need to write a 
: Custom Facet Component?

it sounds like you're talking about the "classic" facets (using 
"facet.field") where facet.sort only supports "count" (desc) and "index" 
(asc)

Adding a custom sort option there would be close to impossible.

The newer "json.facets" API supports a much more robust set of options, 
that includes the ability to sort on an "aggregate" function across all 
documents in the bucket...

https://lucene.apache.org/solr/guide/7_4/json-facet-api.html

some of the existing sort options there might solve your need, but it's 
also possible using that API to write your own ValueSourceParser plugin 
that can be used to sort facets as long as it returns ValueSources that 
extend "AggValueSource"

: Basically I want to plug the scores calculated in earlier steps for the 
: documents matched, do some kind of aggregation of the scores of the 
: documents that fall under a facet and use this aggregate score to rank 

IIUC what you want is possibly something like...

curl http://localhost:8983/solr/techproducts/query -d 'q=features:lcd&rows=0&
 json.facet={
   categories:{
     type : terms,
     field : cat,
     sort : { x : desc },
     facet:{
       x : "sum(query($q))",
     }
   }
 }
'

...which will sort the buckets by the sum of the scores of every document 
in that bucket (using the original query .. but you could alternatively 
sort by any aggregation of the scores from any arbitrary query / document 
based function)





-Hoss
http://www.lucidworks.com/





Re: Memory requirements for TLOGs (7.3.1)

2018-07-18 Thread Ash Ramesh
Thanks for the quick responses Shawn & Erick! Just to clarify another few
points:
 1. Does having a larger heap size impact ingesting additional documents to
the index (all CRUD operations) onto a TLOG?
 2. Does having a machine with more RAM (in this case 32gb) affect
ingestion on TLOGs also?
 3. We are currently routing queries via an Amazon ASG / Load Balancer. Is
this one of the recommended ways to set up Solr infrastructure?

Best Regards,

Ash


On Thu, Jul 19, 2018 at 12:56 AM Erick Erickson 
wrote:

> There's little good reason to _not_ route searches to your TLOG
> replicas. The only difference between the PULL and TLOG replicas is
> that the TLOG replicas get a raw copy of the incoming document from
> the leader and write them to the TLOG. I.e. there's some additional
> I/O.
>
> It's possible that if you have extremely heavy indexing you might
> notice some additional load on the TLOG .vs. PULL replicas, but from
> what you've said I doubt you have that much indexing traffic.
>
> So basically I'd configure my TLOG and PULL replicas pretty much
> identically and search them both.
>
> Best,
> Erick
>
> On Wed, Jul 18, 2018 at 7:46 AM, Shawn Heisey  wrote:
> > On 7/18/2018 12:04 AM, Ash Ramesh wrote:
> >>
> >> I have a quick question about what the memory requirements for TLOG
> >> machines are on 7.3.1. We currently run replication where there are 3
> >> TLOGs
> >> with 8gb ram (2gb heap) and N PULL replicas with 32gb ram (4gb heap). We
> >> have > 10M documents (1 collection) with the index size being ~ 17gb. We
> >> send all read traffic to the PULLs and send Updates and Additions to the
> >> Leader TLOG.
> >>
> >> We are wondering how this setup can affect performance for replication,
> >> etc. We are thinking of increasing the heap of the TLOG to 4gb but
> leaving
> >> the total memory on the machine at 8gb. What will that do to
> performance?
> >> We also expect our index to grow 3-4x in the next 6 months.
> >
> >
> > Performance has more to do with index size and memory size than the type
> of
> > replication you're doing.
> >
> > SolrCloud will load balance queries across the cloud, so your low-memory
> > TLOG replicas are most likely handling queries as well.  In a SolrCloud
> > cluster, a query is not necessarily handled by the machine that you send
> the
> > query to.
> >
> > With memory resources that low compared to index size, the 8GB machines
> > probably do not perform queries as well as the 32GB machines.  If you
> > increase the heap to 4GB, that will only leave 4GB available for the OS
> disk
> > cache, and that's going to drop query performance even further.
> >
> > There is a feature in Solr 7.4 that will allow you to prefer certain
> replica
> > types, so you can tell Solr that it should prefer PULL replicas.  But
> since
> > you're running 7.3.1, you don't have that feature.
> >
> > https://issues.apache.org/jira/browse/SOLR-11982
> >
> > There is also a "preferLocalShards" parameter that has existed for longer
> > than the new feature mentioned above.  This tells Solr that it should not
> > load balance queries in the cloud if there is a local index that can
> satisfy
> > the query.  This parameter should only be used if you have an external
> load
> > balancer.
> >
> > Indexing is a heap-intensive operation that doesn't benefit much from
> having
> > a lot of extra memory for the operating system. I have no idea whether
> 2GB
> > of heap is enough or not.  Increasing the heap size MIGHT make
> performance
> > better, or it might make no difference at all.
> >
> > Thanks,
> > Shawn
> >
>









Re: Facet Sorting

2018-07-18 Thread Chris Hostetter


: If I want to plug in my own sorting for facets, what would be the best 
: approach. I know, out of the box, solr supports sort by facet count and 
: sort by alpha. I want to plug in my own sorting (say by relevancy). Is 
: there a way to do that? Where should I start with if I need to write a 
: Custom Facet Component?

it sounds like you're talking about the "classic" facets (using 
"facet.field") where facet.sort only supports "count" (desc) and "index" 
(asc)

Adding a custom sort option there would be close to impossible.

The newer "json.facets" API supports a much more robust set of options, 
that includes the ability to sort on an "aggregate" function across all 
documents in the bucket...

https://lucene.apache.org/solr/guide/7_4/json-facet-api.html

some of the existing sort options there might solve your need, but it's 
also possible using that API to write your own ValueSourceParser plugin 
that can be used to sort facets as long as it returns ValueSources that 
extend "AggValueSource"

: Basically I want to plug the scores calculated in earlier steps for the 
: documents matched, do some kind of aggregation of the scores of the 
: documents that fall under a facet and use this aggregate score to rank 

IIUC what you want is possibly something like...

curl http://localhost:8983/solr/techproducts/query -d 'q=features:lcd&rows=0&
 json.facet={
   categories:{
     type : terms,
     field : cat,
     sort : { x : desc },
     facet:{
       x : "sum(query($q))",
     }
   }
 }
'

...which will sort the buckets by the sum of the scores of every document 
in that bucket (using the original query .. but you could alternatively 
sort by any aggregation of the scores from any arbitrary query / document 
based function)





-Hoss
http://www.lucidworks.com/


Re: NullPointerException at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:324)

2018-07-18 Thread Erick Erickson
Probably SOLR-11770 and/or SOLR-11792.

In the meantime, ensure that the field has stored=true set and
ensure that there are terms.
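
I.e. something like this in the schema (the field name is just an example,
adjust to your field):

<field name="title" type="text_general" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>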

You'll probably have to re-index though

Best,
Erick

On Wed, Jul 18, 2018 at 10:38 AM, babuasian  wrote:
> Hi, Running solr version 6.5. Trying to get tf-idf values of a term ‘price’:
> .../solr/mycore/tvrh/?q=price=0=1=on=true=true=true
> When run, I get the following message:
> java.lang.NullPointerException at
> org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:324)
> at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
> at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2440) at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723) at
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529) at
> ...
> Also, when tried against a single field, it doesn’t throw an error, but returns
> the following message. Note: I’ve enabled adding termVectors to the title field,
> but to no avail. Kindly suggest...
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr cloud in kubernetes

2018-07-18 Thread ssivashunm
https://github.com/freedev/solrcloud-zookeeper-kubernetes
provides more detail about persistent disk usage for solr data and home.

The issue I face is: since all three StatefulSet replicas use the same solr port,
they are not communicating with one another.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


NullPointerException at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:324)

2018-07-18 Thread babuasian
Hi, Running solr version 6.5. Trying to get tf-idf values of a term ‘price’:
.../solr/mycore/tvrh/?q=price=0=1=on=true=true=true
When run, I get the following message:
java.lang.NullPointerException at
org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:324)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2440) at
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723) at
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529) at
...
Also, when tried against a single field, it doesn’t throw an error, but returns
the following message. Note: I’ve enabled adding termVectors to the title field,
but to no avail. Kindly suggest...



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Memory requirements for TLOGs (7.3.1)

2018-07-18 Thread Erick Erickson
There's little good reason to _not_ route searches to your TLOG
replicas. The only difference between the PULL and TLOG replicas is
that the TLOG replicas get a raw copy of the incoming document from
the leader and write them to the TLOG. I.e. there's some additional
I/O.

It's possible that if you have extremely heavy indexing you might
notice some additional load on the TLOG .vs. PULL replicas, but from
what you've said I doubt you have that much indexing traffic.

So basically I'd configure my TLOG and PULL replicas pretty much
identically and search them both.

Best,
Erick

On Wed, Jul 18, 2018 at 7:46 AM, Shawn Heisey  wrote:
> On 7/18/2018 12:04 AM, Ash Ramesh wrote:
>>
>> I have a quick question about what the memory requirements for TLOG
>> machines are on 7.3.1. We currently run replication where there are 3
>> TLOGs
>> with 8gb ram (2gb heap) and N PULL replicas with 32gb ram (4gb heap). We
>> have > 10M documents (1 collection) with the index size being ~ 17gb. We
>> send all read traffic to the PULLs and send Updates and Additions to the
>> Leader TLOG.
>>
>> We are wondering how this setup can affect performance for replication,
>> etc. We are thinking of increasing the heap of the TLOG to 4gb but leaving
>> the total memory on the machine at 8gb. What will that do to performance?
>> We also expect our index to grow 3-4x in the next 6 months.
>
>
> Performance has more to do with index size and memory size than the type of
> replication you're doing.
>
> SolrCloud will load balance queries across the cloud, so your low-memory
> TLOG replicas are most likely handling queries as well.  In a SolrCloud
> cluster, a query is not necessarily handled by the machine that you send the
> query to.
>
> With memory resources that low compared to index size, the 8GB machines
> probably do not perform queries as well as the 32GB machines.  If you
> increase the heap to 4GB, that will only leave 4GB available for the OS disk
> cache, and that's going to drop query performance even further.
>
> There is a feature in Solr 7.4 that will allow you to prefer certain replica
> types, so you can tell Solr that it should prefer PULL replicas.  But since
> you're running 7.3.1, you don't have that feature.
>
> https://issues.apache.org/jira/browse/SOLR-11982
>
> There is also a "preferLocalShards" parameter that has existed for longer
> than the new feature mentioned above.  This tells Solr that it should not
> load balance queries in the cloud if there is a local index that can satisfy
> the query.  This parameter should only be used if you have an external load
> balancer.
>
> Indexing is a heap-intensive operation that doesn't benefit much from having
> a lot of extra memory for the operating system. I have no idea whether 2GB
> of heap is enough or not.  Increasing the heap size MIGHT make performance
> better, or it might make no difference at all.
>
> Thanks,
> Shawn
>


Re: Solr Nodes Killed During a ReIndexing Process on New VMs Out of Memory Error

2018-07-18 Thread Shawn Heisey

On 7/18/2018 8:31 AM, THADC wrote:

Thanks for the reply. I read the link you provided. I am currently not
specifying a heap size with solr so my understanding is that by default it
will just grow automatically. If I add more physical memory to the VM
without doing anything with heap size, won't that possibly fix the problem?


No, that is not how it works.  If Java is not given a heap size, then it 
will choose the heap size for you based on how much memory the machine 
has and its own internal algorithms, and limit itself to that amount.


Solr 5.0 and later, when started using the included scripts, asks Java 
for a 512MB heap by default.  This is an extremely small heap.  Nearly 
all Solr users must increase the heap size beyond 512MB.
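
For example, with the included scripts the heap can be set explicitly at
startup, or via SOLR_HEAP in solr.in.sh (the 4g value here is only an
illustration, not a recommendation):

bin/solr start -m 4g

or, in solr.in.sh:

SOLR_HEAP="4g"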


Thanks,
Shawn



Re: Memory requirements for TLOGs (7.3.1)

2018-07-18 Thread Shawn Heisey

On 7/18/2018 12:04 AM, Ash Ramesh wrote:

I have a quick question about what the memory requirements for TLOG
machines are on 7.3.1. We currently run replication where there are 3 TLOGs
with 8gb ram (2gb heap) and N PULL replicas with 32gb ram (4gb heap). We
have > 10M documents (1 collection) with the index size being ~ 17gb. We
send all read traffic to the PULLs and send Updates and Additions to the
Leader TLOG.

We are wondering how this setup can affect performance for replication,
etc. We are thinking of increasing the heap of the TLOG to 4gb but leaving
the total memory on the machine at 8gb. What will that do to performance?
We also expect our index to grow 3-4x in the next 6 months.


Performance has more to do with index size and memory size than the type 
of replication you're doing.


SolrCloud will load balance queries across the cloud, so your low-memory 
TLOG replicas are most likely handling queries as well.  In a SolrCloud 
cluster, a query is not necessarily handled by the machine that you send 
the query to.


With memory resources that low compared to index size, the 8GB machines 
probably do not perform queries as well as the 32GB machines.  If you 
increase the heap to 4GB, that will only leave 4GB available for the OS 
disk cache, and that's going to drop query performance even further.


There is a feature in Solr 7.4 that will allow you to prefer certain 
replica types, so you can tell Solr that it should prefer PULL 
replicas.  But since you're running 7.3.1, you don't have that feature.


https://issues.apache.org/jira/browse/SOLR-11982

There is also a "preferLocalShards" parameter that has existed for 
longer than the new feature mentioned above.  This tells Solr that it 
should not load balance queries in the cloud if there is a local index 
that can satisfy the query.  This parameter should only be used if you 
have an external load balancer.
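
For example (the collection name is only an illustration):

http://localhost:8983/solr/mycollection/select?q=*:*&preferLocalShards=true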


Indexing is a heap-intensive operation that doesn't benefit much from 
having a lot of extra memory for the operating system. I have no idea 
whether 2GB of heap is enough or not.  Increasing the heap size MIGHT 
make performance better, or it might make no difference at all.


Thanks,
Shawn



Re: Solr Nodes Killed During a ReIndexing Process on New VMs Out of Memory Error

2018-07-18 Thread THADC
Thanks for the reply. I read the link you provided. I am currently not
specifying a heap size with solr so my understanding is that by default it
will just grow automatically. If I add more physical memory to the VM
without doing anything with heap size, won't that possibly fix the problem?

Thanks



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr Nodes Killed During a ReIndexing Process on New VMs Out of Memory Error

2018-07-18 Thread Shawn Heisey

On 7/18/2018 7:10 AM, THADC wrote:

We performed a full reindex for the first time against our largest database
and on two new VMs dedicated to solr indexing. We have two solr nodes
(solrCloud/solr7.3) with a zookeeper cluster. Several hours into the
reindexing process, both solr nodes shut down with:

java.lang.OutOfMemoryError: Java heap space

Running OOM killer script for process blah on port blah

Does this definitely indicate we need more memory or could it just be a heap
space adjustment issue? Is there a way to better diagnose what to do?


https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

There are exactly two ways to deal with OOME:  Increase the available 
amount of the resource that's running out (heap space in this case), or 
change something so the program requires less of that resource.  
Depending on the hardware and software configuration, either of these 
options might turn out to be impossible.


The rest of that wiki page discusses memory in general as well as heap 
memory.  If you have questions after reading the page, go ahead and ask 
them.


Thanks,
Shawn



Solr Nodes Killed During a ReIndexing Process on New VMs Out of Memory Error

2018-07-18 Thread THADC
Hi,

We performed a full reindex for the first time against our largest database
and on two new VMs dedicated to solr indexing. We have two solr nodes
(solrCloud/solr7.3) with a zookeeper cluster. Several hours into the
reindexing process, both solr nodes shut down with:

java.lang.OutOfMemoryError: Java heap space

Running OOM killer script for process blah on port blah

Does this definitely indicate we need more memory or could it just be a heap
space adjustment issue? Is there a way to better diagnose what to do?

Thanks for any reply.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


RE: Cannot index to 7.2.1 collection alias

2018-07-18 Thread Markus Jelsma
Ah, it was caused by a badly made alias via the GUI. If you do not select the
destination collection in that popup, it creates a broken alias and shows these
exceptions.
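
For reference, the equivalent Collections API call (the target collection name
here is a placeholder):

http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=c1&collections=mycollection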
 
-Original message-
> From:Markus Jelsma 
> Sent: Tuesday 17th July 2018 16:52
> To: solr-user@lucene.apache.org
> Subject: RE: Cannot index to 7.2.1 collection alias
> 
> Hi Shawn,
> 
> Indexing stack trace:
> 
> null:java.lang.NullPointerException
>   at 
> org.apache.solr.servlet.HttpSolrCall.getCoreUrl(HttpSolrCall.java:931)
>   at 
> org.apache.solr.servlet.HttpSolrCall.getRemotCoreUrl(HttpSolrCall.java:902)
>   at 
> org.apache.solr.servlet.HttpSolrCall.extractRemotePath(HttpSolrCall.java:432)
>   at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:289)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:470)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
> 
> Reloading an alias is just not supported it seems: 
> 
> 2018-07-17 14:51:35.223 ERROR 
> (OverseerThreadFactory-32-thread-5-processing-n:idx2.oi.dev:8983_solr) [   ] 
> o.a.s.c.OverseerCollectionMessageHandler Collection: c1 operation: reload 
> failed:org.apache.solr.common.SolrException: Could not find collection : c1
> at 
> org.apache.solr.common.cloud.ClusterState.getCollection(ClusterState.java:111)
> at 
> org.apache.solr.cloud.OverseerCollectionMessageHandler.collectionCmd(OverseerCollectionMessageHandler.java:795)
> at 
> org.apache.solr.cloud.OverseerCollectionMessageHandler.collectionCmd(OverseerCollectionMessageHandler.java:784)
> 
>  
> Thanks,
> MArkus
>  
> -Original message-
> > From:Shawn Heisey 
> > Sent: Tuesday 17th July 2018 16:39
> > To: solr-user@lucene.apache.org
> > Subject: Re: Cannot index to 7.2.1 collection alias
> > 
> > On 7/17/2018 6:28 AM, Markus Jelsma wrote:
> > > Just attempted to connect and index a bunch of documents to a collection 
> > > alias, got a NPE right away. Can't find this error in Jira, did i 
> > > overlook something? Create new ticket?
> > 
> > Indexing to an alias should send the documents only to the first 
> > collection in the alias.  I am not aware of any problems in this 
> > functionality.
> > 
> > Before opening a Jira, can we see the full stacktrace from the error, so 
> > we can look into it?  Can you confirm that 7.2.1 is the version that 
> > created the stacktrace?
> > 
> > I don't know whether RELOAD is supported on aliases.  It would be good 
> > to see that stacktrace as well.
> > 
> > Thanks,
> > Shawn
> > 
> > 
> 


RE: API to convert solr response to Rowset

2018-07-18 Thread Srinivas Kashyap
I have a collection, and through SolrJ I query the collection and get a SolrJ 
QueryResponse object. Is there a way I can convert this query response to a 
JDBC RowSet? I see the Parallel SQL interface was introduced in the 7.x line, 
but is it possible in Solr 5.2.1?
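
Otherwise I will have to hand-roll it, roughly like this untested sketch
(the column/field names are made up, and it leans on the JDK's reference
CachedRowSet implementation):

import java.sql.Types;
import javax.sql.RowSetInternal;
import javax.sql.rowset.CachedRowSet;
import javax.sql.rowset.RowSetMetaDataImpl;
import javax.sql.rowset.RowSetProvider;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class ResponseToRowSet {
    /** Copy selected fields of a QueryResponse into a disconnected CachedRowSet. */
    public static CachedRowSet toRowSet(QueryResponse rsp) throws Exception {
        CachedRowSet crs = RowSetProvider.newFactory().createCachedRowSet();
        RowSetMetaDataImpl md = new RowSetMetaDataImpl();
        md.setColumnCount(2);
        md.setColumnName(1, "id");   md.setColumnType(1, Types.VARCHAR);
        md.setColumnName(2, "name"); md.setColumnType(2, Types.VARCHAR);
        // The reference CachedRowSetImpl implements RowSetInternal, which is
        // where setMetaData lives; other implementations may differ.
        ((RowSetInternal) crs).setMetaData(md);
        for (SolrDocument doc : rsp.getResults()) {
            crs.moveToInsertRow();
            crs.updateString(1, String.valueOf(doc.getFieldValue("id")));
            crs.updateString(2, String.valueOf(doc.getFieldValue("name")));
            crs.insertRow();
        }
        crs.moveToCurrentRow();
        return crs;
    }
}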

Thanks and Regards,
Srinivas Kashyap
             
 
-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] 
Sent: 18 July 2018 04:25 PM
To: solr-user 
Subject: Re: API to convert solr response to Rowset

Do you mean to use it with a JDBC-like source?

https://lucene.apache.org/solr/guide/7_4/parallel-sql-interface.html

Regards,
Alex

On Wed, Jul 18, 2018, 2:15 AM Srinivas Kashyap, < 
srini...@tradestonesoftware.com> wrote:

> Hello,
>
> Is there any API to convert Solr query response to JDBC Rowset?
>
> Thanks and Regards,
> Srinivas Kashyap
>
>


Re: synonyms question

2018-07-18 Thread ennio
Vicenzo,

Thank you for the tip. I restarted Solr and it worked.

-Ennio



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: API to convert solr response to Rowset

2018-07-18 Thread Alexandre Rafalovitch
Do you mean to use it with a JDBC-like source?

https://lucene.apache.org/solr/guide/7_4/parallel-sql-interface.html

Regards,
Alex

On Wed, Jul 18, 2018, 2:15 AM Srinivas Kashyap, <
srini...@tradestonesoftware.com> wrote:

> Hello,
>
> Is there any API to convert Solr query response to JDBC Rowset?
>
> Thanks and Regards,
> Srinivas Kashyap
>
>


Re: Timeout waiting for connection from pool

2018-07-18 Thread akshay
Is there any way through which I can create an external plugin and update
these values?



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Timeout waiting for connection from pool

2018-07-18 Thread akshay
I don't have an issue with increasing the request rate, but I'm facing this
issue when the system is going through recovery. It's not able to recover
properly and throws this connection error.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Timeout waiting for connection from pool

2018-07-18 Thread Zisis T.
If you are having an issue when increasing the search request rate, you should
have a look at maxConnectionsPerHost; I believe maxUpdateConnectionsPerHost
is related to indexing.
You can modify your solr.xml as follows (I believe it's not clear from the
official documentation; I had to go through the code to find out about
maxConnectionsPerHost):


15000
12
500
  

Another thing I'd check is whether you have GC pauses; they might affect your
performance.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Timeout waiting for connection from pool

2018-07-18 Thread akshay
Hey,

I am currently running solr 5.4.0 in solr cloud mode. Everything has been
working fine till now but when I start increasing the request rate I am
starting to get connection timeout errors.

Caused by: org.apache.http.conn.ConnectionPoolTimeoutException: Timeout
waiting for connection from pool

On reading more about this, I found that solr 5.4.0 has a major bug, fixed in
version 5.5, related to low values for maxUpdateConnectionsPerHost. But I
can't update my system to 5.5 as of now.
I am not able to find where/how to add/edit the above-mentioned parameter
to increase its value.

Any help would be highly appreciated.

Regards,
Akshay



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Block Join Faceting issue

2018-07-18 Thread soham gandhi
I am using solr 6.2.1. Is json.facet supported there? If not, is there a
way to combine the 2 queries into a single query in solr 6.2.1? In the
example I mentioned, could you please help me form a sample query where I
can facet on both parent as well as child docs?

Thanks,
Soham

On Wed, Jul 18, 2018 at 1:35 PM, Mikhail Khludnev  wrote:

> child.facet.field works if the parent query goes as q, i.e. q={!parent
> which="doc_type:parent"}color:blue.
> However, it's considered as deprecated and proposed to be replaced with
> json.facet and uniqueBlock
>
> On Wed, Jul 18, 2018 at 7:04 AM soham gandhi 
> wrote:
>
> > Hi,
> >
> > I am working on a query that must return parent docs and facet on both
> the
> > parent and child fields. Here's a sample doc-
> > 
> > 1
> > merc
> > car
> > sagandhi
> > 2
> > 3
> > parent
> > 
> > 2
> > child
> > S
> > blue
> > 
> > 
> > 3
> > Z
> > 25
> > child
> > 
> > 
> > I want to search for "merc" or "*" and return facets for type/user from
> > parent docs, and code/color from child docs. Currently I am not using
> > blockjoin. Instead I make two queries, one on the parent docs and the
> other
> > on the child docs. I get the applicable child_id from the first query and
> > feed it into the second query to get the child facets.
> > However this has impacted performance and is not scalable if the
> child_ids
> > I get are huge. Is there a way to combine the two queries using block
> join.
> > I tried this query -
> > q=*:*&fq=user:sagandhi&fq={!parent
> > which="doc_type:parent"}color:blue&facet.field=type&child.facet.field=code
> >
> > I get this error - Block join faceting is allowed with
> > ToParentBlockJoinQuery only
> >
> > Am I missing something here? Any pointers please.
> >
> > Thanks,
> > Soham
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Re: Block Join Faceting issue

2018-07-18 Thread Mikhail Khludnev
child.facet.field works if the parent query goes as q, i.e. q={!parent
which="doc_type:parent"}color:blue.
However, it's considered as deprecated and proposed to be replaced with
json.facet and uniqueBlock
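
A rough sketch of the json.facet equivalent, using the field names from your
example (note this needs a Solr version that has uniqueBlock, which is newer
than your 6.2.1):

q={!parent which="doc_type:parent"}color:blue&rows=0&
json.facet={
  codes: {
    type: terms,
    field: code,
    domain: { blockChildren: "doc_type:parent" },
    facet: { parents: "uniqueBlock(_root_)" }
  }
}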

On Wed, Jul 18, 2018 at 7:04 AM soham gandhi 
wrote:

> Hi,
>
> I am working on a query that must return parent docs and facet on both the
> parent and child fields. Here's a sample doc-
> 
> 1
> merc
> car
> sagandhi
> 2
> 3
> parent
> 
> 2
> child
> S
> blue
> 
> 
> 3
> Z
> 25
> child
> 
> 
> I want to search for "merc" or "*" and return facets for type/user from
> parent docs, and code/color from child docs. Currently I am not using
> blockjoin. Instead I make two queries, one on the parent docs and the other
> on the child docs. I get the applicable child_id from the first query and
> feed it into the second query to get the child facets.
> However this has impacted performance and is not scalable if the child_ids
> I get are huge. Is there a way to combine the two queries using block join.
> I tried this query -
> q=*:*&fq=user:sagandhi&fq={!parent
> which="doc_type:parent"}color:blue&facet.field=type&child.facet.field=code
>
> I get this error - Block join faceting is allowed with
> ToParentBlockJoinQuery only
>
> Am I missing something here? Any pointers please.
>
> Thanks,
> Soham
>


-- 
Sincerely yours
Mikhail Khludnev


API to convert solr response to Rowset

2018-07-18 Thread Srinivas Kashyap
Hello,

Is there any API to convert Solr query response to JDBC Rowset?

Thanks and Regards,
Srinivas Kashyap



Memory requirements for TLOGs (7.3.1)

2018-07-18 Thread Ash Ramesh
Hi everybody,

I have a quick question about what the memory requirements for TLOG
machines are on 7.3.1. We currently run replication where there are 3 TLOGs
with 8gb ram (2gb heap) and N PULL replicas with 32gb ram (4gb heap). We
have > 10M documents (1 collection) with the index size being ~ 17gb. We
send all read traffic to the PULLs and send Updates and Additions to the
Leader TLOG.

We are wondering how this setup can affect performance for replication,
etc. We are thinking of increasing the heap of the TLOG to 4gb but leaving
the total memory on the machine at 8gb. What will that do to performance?
We also expect our index to grow 3-4x in the next 6 months.

Any assistance would be well appreciated :)

Regards,

Ash
