[jira] [Commented] (SOLR-12692) Add hints/warnings for the ZK Status Admin UI

2018-08-28 Thread Greg Harris (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595291#comment-16595291
 ] 

Greg Harris commented on SOLR-12692:


An additional feature request: a click to run the 'cons' command, which shows 
latencies and packets received/sent for every connection. This can be useful for 
determining whether the reported max latency is an outlier or a significant 
problem, and for checking packet traffic on a connection. You could also add 
clicks for 'crst' (connection reset of stats) and 'srst' (server reset of stats). 
Possibly 'dump' as well, for connection ids and attached ephemeral nodes, though 
that is perhaps getting farther out there. I think the most important one here 
is 'cons'.
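
For reference, a minimal sketch of pulling the 'cons' output that such a click would 
surface, by writing the four-letter word to the ZooKeeper client port (host and port 
are illustrative, and on ZK 3.5+ the command must be allowed by 4lw.commands.whitelist):

{code}
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ZkFourLetterWord {
  // Sends a four-letter word ("cons", "crst", "srst", "dump", ...) and returns ZK's reply.
  public static String send(String host, int port, String cmd) throws Exception {
    try (Socket socket = new Socket(host, port)) {
      OutputStream out = socket.getOutputStream();
      out.write(cmd.getBytes(StandardCharsets.US_ASCII));
      out.flush();
      socket.shutdownOutput();                    // tell ZK the request is complete
      InputStream in = socket.getInputStream();
      // ZK writes the full response and closes the connection (readAllBytes needs Java 9+).
      return new String(in.readAllBytes(), StandardCharsets.US_ASCII);
    }
  }

  public static void main(String[] args) throws Exception {
    // Per-connection latencies and packets received/sent, one line per client connection.
    System.out.println(send("localhost", 2181, "cons"));
  }
}
{code}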

> Add hints/warnings for the ZK Status Admin UI
> -
>
> Key: SOLR-12692
> URL: https://issues.apache.org/jira/browse/SOLR-12692
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI
>Reporter: Varun Thacker
>Priority: Minor
> Attachments: SOLR-12692.patch, wrong_zk_warning.png, zk_ensemble.png
>
>
> Firstly, I love the new UI pages (ZK Status and Nodes). Thanks [~janhoy] for all 
> the great work!
> I set up a 3-node ZK ensemble to play around with the UI and am attaching the 
> screenshot for reference.
>  
> Here are a few suggestions I had:
>  # Let’s show Approximate Size in human-readable form. We can use 
> RamUsageEstimator#humanReadableUnits to calculate it (a sketch follows below the list).
>  # Show a warning symbol when the ensemble is standalone.
>  # If maxSessionTimeout < Solr's ZK_CLIENT_TIMEOUT, then ZK will only honor up to 
> the maxSessionTimeout value for the Solr->ZK connection. We could mark that as a warning.
>  # If maxClientCnxns < live_nodes, show this in red? Each Solr node connects to all 
> ZK nodes, so if the number of nodes in the cluster is high, maxClientCnxns should 
> also be increased.
>  
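
A minimal sketch of item 1, formatting the raw approximate data size with Lucene's 
RamUsageEstimator (the byte count below is illustrative):

{code}
import org.apache.lucene.util.RamUsageEstimator;

public class HumanReadableSize {
  public static void main(String[] args) {
    long approximateDataSize = 1536217L;  // raw value as reported by ZooKeeper
    // Prints something like "1.5 MB" instead of the raw byte count.
    System.out.println(RamUsageEstimator.humanReadableUnits(approximateDataSize));
  }
}
{code}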



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-11445) Overseer.processQueueItem().... zkStateWriter.enqueueUpdate might ideally have a try{}catch{} around it

2017-10-06 Thread Greg Harris (JIRA)
Greg Harris created SOLR-11445:
--

 Summary: Overseer.processQueueItem()  
zkStateWriter.enqueueUpdate might ideally have a try{}catch{} around it
 Key: SOLR-11445
 URL: https://issues.apache.org/jira/browse/SOLR-11445
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 7.0, 6.6.1, master (8.0)
Reporter: Greg Harris



So we had the following stack trace with a customer:

2017-10-04 11:25:30.339 ERROR () [ ] o.a.s.c.Overseer Exception in Overseer 
main queue loop
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /collections//state.json
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at 
org.apache.solr.common.cloud.SolrZkClient$9.execute(SolrZkClient.java:391)
at 
org.apache.solr.common.cloud.SolrZkClient$9.execute(SolrZkClient.java:388)
at 
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
at org.apache.solr.common.cloud.SolrZkClient.create(SolrZkClient.java:388)
at 
org.apache.solr.cloud.overseer.ZkStateWriter.writePendingUpdates(ZkStateWriter.java:235)
at 
org.apache.solr.cloud.overseer.ZkStateWriter.enqueueUpdate(ZkStateWriter.java:152)
at 
org.apache.solr.cloud.Overseer$ClusterStateUpdater.processQueueItem(Overseer.java:271)
at org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:199)
at java.lang.Thread.run(Thread.java:748)

I want to highlight: 
  at 
org.apache.solr.cloud.overseer.ZkStateWriter.enqueueUpdate(ZkStateWriter.java:152)
at 
org.apache.solr.cloud.Overseer$ClusterStateUpdater.processQueueItem(Overseer.java:271)

This ends up coming from Overseer:
while (data != null) {
  final ZkNodeProps message = ZkNodeProps.load(data);
  log.debug("processMessage: workQueueSize: {}, message = {}",
      workQueue.getStats().getQueueLength(), message);
  // force flush to ZK after each message because there is no fallback if workQueue items
  // are removed from workQueue but fail to be written to ZK
  clusterState = processQueueItem(message, clusterState, zkStateWriter, false, null);
  workQueue.poll(); // poll-ing removes the element we got by peek-ing
  data = workQueue.peek();
}

Note: processQueueItem comes before the poll, so when it throws, the same 
node/message that cannot be processed stays at the head of the queue and becomes 
stuck. This made a large cluster unable to come up on its own without deleting the 
problem node.
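
A minimal sketch of the guard the summary asks for, so a message that cannot be 
written is logged and dropped instead of wedging the loop (the exception type caught 
and the drop-the-message policy are assumptions, not the committed fix):

{code}
while (data != null) {
  final ZkNodeProps message = ZkNodeProps.load(data);
  try {
    clusterState = processQueueItem(message, clusterState, zkStateWriter, false, null);
  } catch (KeeperException.NoNodeException e) {
    // state.json for the collection is gone; skip this message rather than retrying it forever
    log.error("Overseer could not process message, skipping: {}", message, e);
  }
  workQueue.poll(); // always remove the element we got by peek-ing
  data = workQueue.peek();
}
{code}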








[jira] [Updated] (SOLR-6822) Block Join duplicates when _root_ is different type than uniqueKey

2014-12-05 Thread Greg Harris (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris updated SOLR-6822:
--
Description: 
If you have in your setup _ root _ a different type than your uniqueKey, when 
you update blocks with the same id, you start getting duplicates of the entire 
block. So if you have parent-child-grandchild and update the block with the 
same or different children you will have both blocks in their entirety still in 
the index found with star:star

This can further create weirdness when doing calls from different shards as a 
call to the shard itself will give back all results... a call to a different 
shard with shards=dupeshard will take out the duplicates being sent for rows=x 
and affects the numFound and total docs returned if there are dupes in the rows 
being returned. So it is not consistent.

  was:
If you have in your setup {{_root_}} a different type than your uniqueKey, when 
you update blocks with the same id, you start getting duplicates of the entire 
block. So if you have parent-child-grandchild and update the block with the 
same or different children you will have both blocks in their entirety still in 
the index found with star:star

This can further create weirdness when doing calls from different shards as a 
call to the shard itself will give back all results... a call to a different 
shard with shards=dupeshard will take out the duplicates being sent for rows=x 
and affects the numFound and total docs returned if there are dupes in the rows 
being returned. So it is not consistent.


 Block Join duplicates when _root_ is different type than uniqueKey
 --

 Key: SOLR-6822
 URL: https://issues.apache.org/jira/browse/SOLR-6822
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10
Reporter: Greg Harris

 If you have in your setup _ root _ a different type than your uniqueKey, when 
 you update blocks with the same id, you start getting duplicates of the 
 entire block. So if you have parent-child-grandchild and update the block 
 with the same or different children you will have both blocks in their 
 entirety still in the index found with star:star
 This can further create weirdness when doing calls from different shards as a 
 call to the shard itself will give back all results... a call to a different 
 shard with shards=dupeshard will take out the duplicates being sent for 
 rows=x and affects the numFound and total docs returned if there are dupes in 
 the rows being returned. So it is not consistent.






[jira] [Created] (SOLR-6822) Block Join duplicates when _root_ is different type than uniqueKey

2014-12-04 Thread Greg Harris (JIRA)
Greg Harris created SOLR-6822:
-

 Summary: Block Join duplicates when _root_ is different type than 
uniqueKey
 Key: SOLR-6822
 URL: https://issues.apache.org/jira/browse/SOLR-6822
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10
Reporter: Greg Harris


If you have in your setup _root_ a different type than your uniqueKey, when you 
update blocks with the same id, you start getting duplicates of the entire 
block. So if you have parent-child-grandchild and update the block with the 
same or different children you will have both blocks in their entirety still in 
the index found with *:*

This can further create weirdness when doing calls from different shards as a 
call to the shard itself will give back all results... a call to a different 
shard with shards=dupeshard will take out the duplicates and affects the 
numFound and total docs returned, so it is not consistent.
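
A minimal SolrJ sketch of the update pattern that shows the duplication (the base 
URL and field names are illustrative; _root_'s type must differ from uniqueKey in 
the schema for the problem to appear):

{code}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BlockJoinDuplicate {
  public static void main(String[] args) throws Exception {
    try (SolrClient client = new HttpSolrClient.Builder(
        "http://localhost:8983/solr/collection1").build()) {
      SolrInputDocument parent = new SolrInputDocument();
      parent.addField("id", "parent-1");
      parent.addField("type_s", "parent");
      SolrInputDocument child = new SolrInputDocument();
      child.addField("id", "child-1");
      child.addField("type_s", "child");
      parent.addChildDocument(child);

      client.add(parent);   // index the block
      client.commit();
      client.add(parent);   // re-send the same block; it should replace the old one
      client.commit();
      // With the type mismatch, q=*:* now finds both copies of the block.
    }
  }
}
{code}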






[jira] [Updated] (SOLR-6822) Block Join duplicates when _root_ is different type than uniqueKey

2014-12-04 Thread Greg Harris (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris updated SOLR-6822:
--
Description: 
If you have in your setup _ root _ a different type than your uniqueKey, when 
you update blocks with the same id, you start getting duplicates of the entire 
block. So if you have parent-child-grandchild and update the block with the 
same or different children you will have both blocks in their entirety still in 
the index found with *:*

This can further create weirdness when doing calls from different shards as a 
call to the shard itself will give back all results... a call to a different 
shard with shards=dupeshard will take out the duplicates and affects the 
numFound and total docs returned, so it is not consistent.

  was:
If you have in your setup _root_ a different type than your uniqueKey, when you 
update blocks with the same id, you start getting duplicates of the entire 
block. So if you have parent-child-grandchild and update the block with the 
same or different children you will have both blocks in their entirety still in 
the index found with *:*

This can further create weirdness when doing calls from different shards as a 
call to the shard itself will give back all results... a call to a different 
shard with shards=dupeshard will take out the duplicates and affects the 
numFound and total docs returned, so it is not consistent.


 Block Join duplicates when _root_ is different type than uniqueKey
 --

 Key: SOLR-6822
 URL: https://issues.apache.org/jira/browse/SOLR-6822
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10
Reporter: Greg Harris

 If you have in your setup _ root _ a different type than your uniqueKey, when 
 you update blocks with the same id, you start getting duplicates of the 
 entire block. So if you have parent-child-grandchild and update the block 
 with the same or different children you will have both blocks in their 
 entirety still in the index found with *:*
 This can further create weirdness when doing calls from different shards as a 
 call to the shard itself will give back all results... a call to a different 
 shard with shards=dupeshard will take out the duplicates and affects the 
 numFound and total docs returned, so it is not consistent.






[jira] [Updated] (SOLR-6822) Block Join duplicates when _root_ is different type than uniqueKey

2014-12-04 Thread Greg Harris (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris updated SOLR-6822:
--
Description: 
If you have in your setup _ root _ a different type than your uniqueKey, when 
you update blocks with the same id, you start getting duplicates of the entire 
block. So if you have parent-child-grandchild and update the block with the 
same or different children you will have both blocks in their entirety still in 
the index found with star:star

This can further create weirdness when doing calls from different shards as a 
call to the shard itself will give back all results... a call to a different 
shard with shards=dupeshard will take out the duplicates and affects the 
numFound and total docs returned, so it is not consistent.

  was:
If you have in your setup _ root _ a different type than your uniqueKey, when 
you update blocks with the same id, you start getting duplicates of the entire 
block. So if you have parent-child-grandchild and update the block with the 
same or different children you will have both blocks in their entirety still in 
the index found with *:*

This can further create weirdness when doing calls from different shards as a 
call to the shard itself will give back all results... a call to a different 
shard with shards=dupeshard will take out the duplicates and affects the 
numFound and total docs returned, so it is not consistent.


 Block Join duplicates when _root_ is different type than uniqueKey
 --

 Key: SOLR-6822
 URL: https://issues.apache.org/jira/browse/SOLR-6822
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10
Reporter: Greg Harris

 If you have in your setup _ root _ a different type than your uniqueKey, when 
 you update blocks with the same id, you start getting duplicates of the 
 entire block. So if you have parent-child-grandchild and update the block 
 with the same or different children you will have both blocks in their 
 entirety still in the index found with star:star
 This can further create weirdness when doing calls from different shards as a 
 call to the shard itself will give back all results... a call to a different 
 shard with shards=dupeshard will take out the duplicates and affects the 
 numFound and total docs returned, so it is not consistent.






[jira] [Updated] (SOLR-6822) Block Join duplicates when _root_ is different type than uniqueKey

2014-12-04 Thread Greg Harris (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris updated SOLR-6822:
--
Description: 
If you have in your setup _ root _ a different type than your uniqueKey, when 
you update blocks with the same id, you start getting duplicates of the entire 
block. So if you have parent-child-grandchild and update the block with the 
same or different children you will have both blocks in their entirety still in 
the index found with star:star

This can further create weirdness when doing calls from different shards as a 
call to the shard itself will give back all results... a call to a different 
shard with shards=dupeshard will take out the duplicates being sent for rows=x 
and affects the numFound and total docs returned if there are dupes in the 
resultset. So it is not consistent.

  was:
If you have in your setup _ root _ a different type than your uniqueKey, when 
you update blocks with the same id, you start getting duplicates of the entire 
block. So if you have parent-child-grandchild and update the block with the 
same or different children you will have both blocks in their entirety still in 
the index found with star:star

This can further create weirdness when doing calls from different shards as a 
call to the shard itself will give back all results... a call to a different 
shard with shards=dupeshard will take out the duplicates and affects the 
numFound and total docs returned, so it is not consistent.


 Block Join duplicates when _root_ is different type than uniqueKey
 --

 Key: SOLR-6822
 URL: https://issues.apache.org/jira/browse/SOLR-6822
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10
Reporter: Greg Harris

 If you have in your setup _ root _ a different type than your uniqueKey, when 
 you update blocks with the same id, you start getting duplicates of the 
 entire block. So if you have parent-child-grandchild and update the block 
 with the same or different children you will have both blocks in their 
 entirety still in the index found with star:star
 This can further create weirdness when doing calls from different shards as a 
 call to the shard itself will give back all results... a call to a different 
 shard with shards=dupeshard will take out the duplicates being sent for 
 rows=x and affects the numFound and total docs returned if there are dupes in 
 the resultset. So it is not consistent.






[jira] [Updated] (SOLR-6822) Block Join duplicates when _root_ is different type than uniqueKey

2014-12-04 Thread Greg Harris (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris updated SOLR-6822:
--
Description: 
If you have in your setup _ root _ a different type than your uniqueKey, when 
you update blocks with the same id, you start getting duplicates of the entire 
block. So if you have parent-child-grandchild and update the block with the 
same or different children you will have both blocks in their entirety still in 
the index found with star:star

This can further create weirdness when doing calls from different shards as a 
call to the shard itself will give back all results... a call to a different 
shard with shards=dupeshard will take out the duplicates being sent for rows=x 
and affects the numFound and total docs returned if there are dupes in the rows 
being returned. So it is not consistent.

  was:
If you have in your setup _ root _ a different type than your uniqueKey, when 
you update blocks with the same id, you start getting duplicates of the entire 
block. So if you have parent-child-grandchild and update the block with the 
same or different children you will have both blocks in their entirety still in 
the index found with star:star

This can further create weirdness when doing calls from different shards as a 
call to the shard itself will give back all results... a call to a different 
shard with shards=dupeshard will take out the duplicates being sent for rows=x 
and affects the numFound and total docs returned if there are dupes in the 
resultset. So it is not consistent.


 Block Join duplicates when _root_ is different type than uniqueKey
 --

 Key: SOLR-6822
 URL: https://issues.apache.org/jira/browse/SOLR-6822
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10
Reporter: Greg Harris

 If you have in your setup _ root _ a different type than your uniqueKey, when 
 you update blocks with the same id, you start getting duplicates of the 
 entire block. So if you have parent-child-grandchild and update the block 
 with the same or different children you will have both blocks in their 
 entirety still in the index found with star:star
 This can further create weirdness when doing calls from different shards as a 
 call to the shard itself will give back all results... a call to a different 
 shard with shards=dupeshard will take out the duplicates being sent for 
 rows=x and affects the numFound and total docs returned if there are dupes in 
 the rows being returned. So it is not consistent.






[jira] [Created] (SOLR-6771) Sending DIH request to non-leader can result in different number of successful documents

2014-11-20 Thread Greg Harris (JIRA)
Greg Harris created SOLR-6771:
-

 Summary: Sending DIH request to non-leader can result in different 
number of successful documents
 Key: SOLR-6771
 URL: https://issues.apache.org/jira/browse/SOLR-6771
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10
Reporter: Greg Harris


Basically, if you send a DIH request to a non-leader, the following set of 
circumstances can occur:
1) If there are errors in some of the documents, the request itself is rejected 
by the leader (try making a required field null in some documents to make sure 
there are rejections).
2) This causes all documents on that request to appear to fail. The number of 
documents the follower manages to index via DIH appears variable.
3) It appears you need a large number of documents to see the anomaly.

This results in the following error on the follower:
2014-11-20 12:06:16.470; 34054 [Thread-18] WARN  
org.apache.solr.update.processor.DistributedUpdateProcessor  – Error sending 
update
org.apache.solr.common.SolrException: Bad Request



request: 
http://10.0.2.15:8983/solr/collection1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F10.0.2.15%3A8982%2Fsolr%2Fcollection1%2F&wt=javabin&version=2
at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:240)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
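
A minimal SolrJ sketch for reproducing this by sending the import to a follower 
replica (the base URL and the /dataimport handler path are assumptions about the setup):

{code}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class DihOnFollower {
  public static void main(String[] args) throws Exception {
    // Point at a replica that is NOT the shard leader.
    try (SolrClient follower = new HttpSolrClient.Builder(
        "http://10.0.2.15:8982/solr/collection1").build()) {
      ModifiableSolrParams params = new ModifiableSolrParams();
      params.set("command", "full-import");
      params.set("commit", "true");
      QueryRequest req = new QueryRequest(params);
      req.setPath("/dataimport");   // DIH request handler
      // Compare the processed/failed counts here with the same import sent to the leader.
      System.out.println(follower.request(req));
    }
  }
}
{code}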






[jira] [Created] (SOLR-6753) Add GC Info to System Admin Page including % of uptime

2014-11-17 Thread Greg Harris (JIRA)
Greg Harris created SOLR-6753:
-

 Summary: Add GC Info to System Admin Page including % of uptime
 Key: SOLR-6753
 URL: https://issues.apache.org/jira/browse/SOLR-6753
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.10.2
Reporter: Greg Harris
Priority: Minor


This is just a patch that adds a little bit of GC information to the system 
admin page. I mainly wanted to show, for both GC collectors, the percentage of 
uptime spent in collections.
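
A minimal sketch of the ratio in question, computed from the standard JMX beans 
(illustrative only, not the attached patch):

{code}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcUptimePercent {
  public static void main(String[] args) {
    long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      // Fraction of JVM uptime spent in this collector.
      double pct = 100.0 * gc.getCollectionTime() / Math.max(uptimeMs, 1);
      System.out.printf("%s: %d collections, %d ms, %.2f%% of uptime%n",
          gc.getName(), gc.getCollectionCount(), gc.getCollectionTime(), pct);
    }
  }
}
{code}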






[jira] [Updated] (SOLR-6753) Add GC Info to System Admin Page including % of uptime

2014-11-17 Thread Greg Harris (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris updated SOLR-6753:
--
Attachment: SOLR-6753.patch

 Add GC Info to System Admin Page including % of uptime
 --

 Key: SOLR-6753
 URL: https://issues.apache.org/jira/browse/SOLR-6753
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.10.2
Reporter: Greg Harris
Priority: Minor
 Attachments: SOLR-6753.patch


 This is just a patch that adds a little bit of GC information to the system 
 admin page. I mainly wanted to show, for both GC collectors, the percentage of 
 uptime spent in collections.






[jira] [Created] (SOLR-6652) Expand Component does not search across collections like Collapse does

2014-10-24 Thread Greg Harris (JIRA)
Greg Harris created SOLR-6652:
-

 Summary: Expand Component does not search across collections like 
Collapse does
 Key: SOLR-6652
 URL: https://issues.apache.org/jira/browse/SOLR-6652
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.10
Reporter: Greg Harris


It seems the collapse query parser supports searching multiple collections via 
the collection=xx,yy,zz parameter. However, expand=true does not support this, 
and all documents are returned from a single collection. This is kind of 
confusing, since expand is meant to be used together with collapse.






[jira] [Created] (SOLR-6210) Suggester Version 2 doesn't support multiValued fields

2014-06-27 Thread Greg Harris (JIRA)
Greg Harris created SOLR-6210:
-

 Summary: Suggester Version 2 doesn't support multiValued fields
 Key: SOLR-6210
 URL: https://issues.apache.org/jira/browse/SOLR-6210
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 4.8.1
Reporter: Greg Harris


If you use a multiValued field in the new suggester, it will not pick up any 
value after the first one. It treats the first value as the only term it will 
build its dictionary from.

This is the suggester I'm talking about:
https://issues.apache.org/jira/browse/SOLR-5378






[jira] [Commented] (SOLR-5378) Suggester Version 2

2014-06-27 Thread Greg Harris (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046286#comment-14046286
 ] 

Greg Harris commented on SOLR-5378:
---

Just made a new JIRA for the fact that this suggester doesn't support 
multiValued fields. It will only grab the first entry to make terms.

JIRA is here:
https://issues.apache.org/jira/browse/SOLR-6210

 Suggester Version 2
 ---

 Key: SOLR-5378
 URL: https://issues.apache.org/jira/browse/SOLR-5378
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Areek Zillur
Assignee: Shalin Shekhar Mangar
 Fix For: 4.7, 5.0

 Attachments: SOLR-5378-maven-fix.patch, SOLR-5378.patch, 
 SOLR-5378.patch, SOLR-5378.patch, SOLR-5378.patch, SOLR-5378.patch, 
 SOLR-5378.patch, SOLR-5378.patch, SOLR-5378.patch, SOLR-5378.patch, 
 SOLR-5378.patch, SOLR-5378.patch, SOLR-5378.patch, SOLR-5378.patch


 The idea is to add a new Suggester Component that will eventually replace the 
 Suggester support through the SpellCheck Component.
 This will enable Solr to fully utilize the Lucene suggester module (along 
 with supporting most of the existing features) in the following ways:
- Dictionary pluggability (give users the option to choose the dictionary 
 implementation to use for their suggesters to consume)
- Map the suggester options/ suggester result format (e.g. support for 
 payload)
- The new Component will also allow us to have beefier Lookup support 
 instead of resorting to collation and such. (Move computation from query time 
 to index time) with more freedom
 In addition to this, this suggester version should also have support for 
 distributed support, which was awkward at best with the previous 
 implementation due to SpellCheck requirements.
 Config (index time) options:
   - name - name of suggester
   - sourceLocation - external file location (for file-based suggesters)
   - lookupImpl - type of lookup to use [default JaspellLookupFactory]
   - dictionaryImpl - type of dictionary to use (lookup input) [default
 (sourceLocation == null ? HighFrequencyDictionaryFactory : 
 FileDictionaryFactory)]
   - storeDir - location to store in-memory data structure in disk
   - buildOnCommit - command to build suggester for every commit
   - buildOnOptimize - command to build suggester for every optimize
 Query time options:
   - suggest.dictionary - name of suggester to use
   - suggest.count - number of suggestions to return
   - suggest.q - query to use for lookup
   - suggest.build - command to build the suggester
   - suggest.reload - command to reload the suggester
 Example query:
 {code}
 http://localhost:8983/solr/suggest?suggest.dictionary=mySuggester&suggest=true&suggest.build=true&suggest.q=elec
 {code}
 Distributed query:
 {code}
 http://localhost:7574/solr/suggest?suggest.dictionary=mySuggester&suggest=true&suggest.build=true&suggest.q=elec&shards=localhost:8983/solr,localhost:7574/solr&shards.qt=/suggest
 {code}
 Example Response:
 {code}
 <response>
   <lst name="responseHeader">
     <int name="status">0</int>
     <int name="QTime">28</int>
   </lst>
   <str name="command">build</str>
   <result name="response" numFound="0" start="0" maxScore="0.0"/>
   <lst name="suggest">
     <lst name="suggestions">
       <lst name="e">
         <int name="numFound">2</int>
         <lst name="suggestion">
           <str name="term">electronics and computer1</str>
           <long name="weight">2199</long>
           <str name="payload"/>
         </lst>
         <lst name="suggestion">
           <str name="term">electronics</str>
           <long name="weight">649</long>
           <str name="payload"/>
         </lst>
       </lst>
     </lst>
   </lst>
 </response>
 {code}
 Example config file:
 - Using DocumentDictionary and FuzzySuggester
 -- Suggestion on product_name sorted by popularity with the additional product_id in the payload
 {code}
 <searchComponent class="solr.SuggestComponent" name="suggest">
   <lst name="suggester">
     <str name="name">suggest_fuzzy_doc_dict</str>
     <str name="lookupImpl">FuzzyLookupFactory</str>
     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
     <str name="field">product_name</str>
     <str name="weightField">popularity</str>
     <str name="payloadField">product_id</str>
     <str name="storeDir">suggest_fuzzy_doc_dict_payload</str>
     <str name="suggestAnalyzerFieldType">text</str>
   </lst>
 </searchComponent>
 {code}
 - Using DocumentExpressionDictionary and FuzzySuggester
 -- Suggestion on product_name sorted by the expression ((price * 2) + ln(popularity)) (where both price and popularity are fields in the document)
 {code}
 <searchComponent class="solr.SuggestComponent" name="suggest">
   <lst name="suggester">
     <str name="name">suggest_fuzzy_doc_expr_dict</str>
     <str name="dictionaryImpl">DocumentExpressionDictionaryFactory</str>
     <str 

[jira] [Created] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment

2014-04-28 Thread Greg Harris (JIRA)
Greg Harris created SOLR-6029:
-

 Summary: CollapsingQParserPlugin throws 
ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment
 Key: SOLR-6029
 URL: https://issues.apache.org/jira/browse/SOLR-6029
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.7.1
Reporter: Greg Harris
Priority: Minor
 Attachments: SOLR-6029.patch

CollapsingQParserPlugin fails to detect that a document is not found in a segment 
when the docid previously existed in that segment, i.e. was deleted.

Relevant code bit from CollapsingQParserPlugin needs to be changed from:
-if(doc != -1) {
+if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) {

What happens is that if the doc is not found, the returned value is 
DocsEnum.NO_MORE_DOCS. This would then get used as the doc location in the fq 
bitSet array, causing an ArrayIndexOutOfBoundsException since the array is only as 
big as maxDocs.
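
For illustration, NO_MORE_DOCS is Integer.MAX_VALUE, so using the "not found" return 
value as an index into a maxDoc-sized array blows up exactly as described (the array 
size and doc values below are made up):

{code}
import org.apache.lucene.search.DocIdSetIterator;

public class NoMoreDocsSentinel {
  public static void main(String[] args) {
    int maxDoc = 1000;
    float[] scores = new float[maxDoc];
    int doc = DocIdSetIterator.NO_MORE_DOCS;  // what the lookup returns when the doc was deleted
    System.out.println(doc);                  // 2147483647
    scores[doc] = 1.0f;                       // ArrayIndexOutOfBoundsException
  }
}
{code}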







[jira] [Updated] (SOLR-6029) CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc has been deleted from a segment

2014-04-28 Thread Greg Harris (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris updated SOLR-6029:
--

Attachment: SOLR-6029.patch

Patch with test for 4.7

 CollapsingQParserPlugin throws ArrayIndexOutOfBoundsException if elevated doc 
 has been deleted from a segment
 -

 Key: SOLR-6029
 URL: https://issues.apache.org/jira/browse/SOLR-6029
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.7.1
Reporter: Greg Harris
Priority: Minor
 Attachments: SOLR-6029.patch


 CollapsingQParserPlugin fails to detect that a document is not found in a segment 
 when the docid previously existed in that segment, i.e. was deleted.
 Relevant code bit from CollapsingQParserPlugin needs to be changed from:
 -if(doc != -1) {
 +if((doc != -1) && (doc != DocsEnum.NO_MORE_DOCS)) {
 What happens is that if the doc is not found, the returned value is 
 DocsEnum.NO_MORE_DOCS. This would then get used as the doc location in the fq 
 bitSet array, causing an ArrayIndexOutOfBoundsException since the array is only 
 as big as maxDocs.






[jira] [Created] (SOLR-5964) Fuzzy Search should be configurable in how many terms are expanded

2014-04-04 Thread Greg Harris (JIRA)
Greg Harris created SOLR-5964:
-

 Summary: Fuzzy Search should be configurable in how many terms are 
expanded
 Key: SOLR-5964
 URL: https://issues.apache.org/jira/browse/SOLR-5964
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.5
Reporter: Greg Harris
Priority: Minor


Fuzzy search gets expanded into a boolean query of up to 50 terms. If your 
field has more than 50 terms within the edit distance, some will be rejected as 
terms to search for and thus not be found. This should be configurable via 
local-param syntax. The default is currently hardcoded at 50 and instantiated 
at that number in FuzzyQuery.
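
A minimal Lucene sketch showing where the cap lives: the maxExpansions argument of 
the FuzzyQuery constructor, which defaults to 50 (field name and values are illustrative):

{code}
import org.apache.lucene.index.Term;
import org.apache.lucene.search.FuzzyQuery;

public class FuzzyExpansionExample {
  public static void main(String[] args) {
    Term term = new Term("name_s", "harris");
    // maxEdits=2, prefixLength=0, maxExpansions=500 (instead of the default 50), transpositions=true
    FuzzyQuery q = new FuzzyQuery(term, 2, 0, 500, true);
    System.out.println(q);  // name_s:harris~2
  }
}
{code}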






[jira] [Commented] (SOLR-5480) Make MoreLikeThisHandler distributable

2014-03-12 Thread Greg Harris (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932402#comment-13932402
 ] 

Greg Harris commented on SOLR-5480:
---

I tried to use this patch with mlt.fq. Essentially I'm having trouble when the 
filter field is an integer; a String field seems to work. The value of the filter 
query for an int field gets parsed as junk. I don't know exactly what needs to be 
fixed, but it has to do with using the toString() method on the TermQuery object 
to build the fq.

Basically at line 584 in SingleMoreLikeThisComponent:
String[] mltFilters = new String[rb._mltFilters.size()];
for (int i = 0; i < mltFilters.length; i++) {
  mltFilters[i] = rb._mltFilters.get(i).toString();
}

The toString() eventually calls TermQuery.toString(String field). Since the term is 
a BytesRef, it seems TermQuery.text(), called inside toString(), doesn't return 
what should be an int value and instead returns junk. The filter then fails.

 Make MoreLikeThisHandler distributable
 --

 Key: SOLR-5480
 URL: https://issues.apache.org/jira/browse/SOLR-5480
 Project: Solr
  Issue Type: Improvement
Reporter: Steve Molloy
 Attachments: SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, 
 SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch


 The MoreLikeThis component, when used in the standard search handler supports 
 distributed searches. But the MoreLikeThisHandler itself doesn't, which 
 prevents from say, passing in text to perform the query. I'll start looking 
 into adapting the SearchHandler logic to the MoreLikeThisHandler. If anyone 
 has some work done already and want to share, or want to contribute, any help 
 will be welcomed. 






[jira] [Commented] (SOLR-5027) Field Collapsing PostFilter

2013-11-05 Thread Greg Harris (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814418#comment-13814418
 ] 

Greg Harris commented on SOLR-5027:
---

I have a request from a customer who would really benefit from this filter: the 
ability to sort by two fields. I have looked into the code and understand this 
may not be easily feasible. Just getting it out there.

 Field Collapsing PostFilter
 ---

 Key: SOLR-5027
 URL: https://issues.apache.org/jira/browse/SOLR-5027
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.6, 5.0

 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch


 This ticket introduces the *CollapsingQParserPlugin* 
 The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
 This is a high performance alternative to standard Solr field collapsing 
 (with *ngroups*) when the number of distinct groups in the result set is high.
 For example in one performance test, a search with 10 million full results 
 and 1 million collapsed groups:
 Standard grouping with ngroups : 17 seconds.
 CollapsingQParserPlugin: 300 milli-seconds.
 Sample syntax:
 Collapse based on the highest scoring document:
 {code}
 fq={!collapse field=field_name}
 {code}
 Collapse based on the min value of a numeric field:
 {code}
 fq={!collapse field=field_name min=field_name}
 {code}
 Collapse based on the max value of a numeric field:
 {code}
 fq={!collapse field=field_name max=field_name}
 {code}
 Collapse with a null policy:
 {code}
 fq={!collapse field=field_name nullPolicy=null_policy}
 {code}
 There are three null policies:
 ignore : removes docs with a null value in the collapse field (default).
 expand : treats each doc with a null value in the collapse field as a 
 separate group.
 collapse : collapses all docs with a null value into a single group using 
 either highest score, or min/max.
 The CollapsingQParserPlugin also fully supports the QueryElevationComponent
 *Note:* The July 16 patch also includes an ExpandComponent that expands the 
 collapsed groups for the current search result page. This functionality will 
 be moved to its own ticket.






[jira] [Created] (SOLR-5425) PostFilters are executed twice when used with grouping

2013-11-05 Thread Greg Harris (JIRA)
Greg Harris created SOLR-5425:
-

 Summary: PostFilters are executed twice when used with grouping
 Key: SOLR-5425
 URL: https://issues.apache.org/jira/browse/SOLR-5425
 Project: Solr
  Issue Type: Improvement
Reporter: Greg Harris
Priority: Minor


PostFilters are executed twice when used with grouping:

I have experimented with removing the second execution at one stage, but it always 
produced wrong answers, so for now it seems to be necessary. It makes an expensive 
PostFilter doubly expensive, which can be a problem.



