[jira] [Comment Edited] (SOLR-4449) Enable backup requests for the internal solr load balancer

2013-02-24 Thread Raintung Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585682#comment-13585682
 ] 

Raintung Li edited comment on SOLR-4449 at 2/25/13 7:25 AM:


This is only a sample of my idea; it is untested.

  was (Author: raintung.li):
It is only the sample, not test.
  
> Enable backup requests for the internal solr load balancer
> --
>
> Key: SOLR-4449
> URL: https://issues.apache.org/jira/browse/SOLR-4449
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: philip hoy
>Priority: Minor
> Attachments: patch-4449.txt, SOLR-4449.patch
>
>
> Add the ability to configure the built-in solr load balancer such that it 
> submits a backup request to the next server in the list if the initial 
> request takes too long. Employing such an algorithm could improve the latency 
> of the 9xth percentile albeit at the expense of increasing overall load due 
> to additional requests. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4449) Enable backup requests for the internal solr load balancer

2013-02-24 Thread Raintung Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raintung Li updated SOLR-4449:
--

Attachment: patch-4449.txt

This is only a sample; it is untested.

> Enable backup requests for the internal solr load balancer
> --
>
> Key: SOLR-4449
> URL: https://issues.apache.org/jira/browse/SOLR-4449
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: philip hoy
>Priority: Minor
> Attachments: patch-4449.txt, SOLR-4449.patch
>
>
> Add the ability to configure the built-in solr load balancer such that it 
> submits a backup request to the next server in the list if the initial 
> request takes too long. Employing such an algorithm could improve the latency 
> of the 9xth percentile albeit at the expense of increasing overall load due 
> to additional requests. 




[jira] [Commented] (SOLR-3755) shard splitting

2013-02-24 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585668#comment-13585668
 ] 

Anshum Gupta commented on SOLR-3755:


Any suggestions/feedback on the earlier comment about the Collections API would 
be good. Here's what the collections API call(s) would look like:

"The collections api may be invoked as follows:
http://host:port/solr/admin/collections?action=SPLIT&shard=&shard=

Sometimes, shard names are automatically assigned by SolrCloud and it may be 
more convenient for users to specify shards by shard keys instead of shard 
names e.g.
http://host:port/solr/admin/collections?action=SPLIT&shard.keys=shardKey1,shardKey2";
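As a rough illustration (host, port, and shard names below are placeholders, and the exact parameter names simply follow the proposal above), the two call styles could be assembled like this:

```java
// Sketch: building the proposed SPLIT calls for the Collections API.
// The host/port and shard names are hypothetical, not real endpoints.
public class SplitUrlExample {
    // Split by explicit shard names: ...?action=SPLIT&shard=...&shard=...
    static String splitByName(String base, String... shards) {
        StringBuilder sb = new StringBuilder(base)
                .append("/admin/collections?action=SPLIT");
        for (String s : shards) sb.append("&shard=").append(s);
        return sb.toString();
    }

    // Split by shard keys: ...?action=SPLIT&shard.keys=key1,key2
    static String splitByKeys(String base, String keys) {
        return base + "/admin/collections?action=SPLIT&shard.keys=" + keys;
    }

    public static void main(String[] args) {
        System.out.println(splitByName("http://localhost:8983/solr", "shard1", "shard2"));
        System.out.println(splitByKeys("http://localhost:8983/solr", "shardKey1,shardKey2"));
    }
}
```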

> shard splitting
> ---
>
> Key: SOLR-3755
> URL: https://issues.apache.org/jira/browse/SOLR-3755
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Yonik Seeley
> Attachments: SOLR-3755-CoreAdmin.patch, SOLR-3755.patch, 
> SOLR-3755.patch, SOLR-3755-testSplitter.patch, SOLR-3755-testSplitter.patch
>
>
> We can currently easily add replicas to handle increases in query volume, but 
> we should also add a way to add additional shards dynamically by splitting 
> existing shards.




[jira] [Commented] (SOLR-4078) Allow custom naming of nodes so that a new host:port combination can take over for a previous shard.

2013-02-24 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585664#comment-13585664
 ] 

Mark Miller commented on SOLR-4078:
---

I'm close with this - working on some final quirks and tests.

> Allow custom naming of nodes so that a new host:port combination can take 
> over for a previous shard.
> 
>
> Key: SOLR-4078
> URL: https://issues.apache.org/jira/browse/SOLR-4078
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 4.2, 5.0
>
> Attachments: SOLR-4078.patch
>
>
> Currently we auto assign a unique node name based on the host address and 
> core name - we should let the user optionally override this so that a new 
> host address + core name combo can take over the duties of a previous 
> registered node.
> Especially useful for ec2 if you are not using elastic ips.




[jira] [Commented] (SOLR-4073) Overseer will miss operations in some cases for OverseerCollectionProcessor

2013-02-24 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585575#comment-13585575
 ] 

Mark Miller commented on SOLR-4073:
---

So what are we still missing?

> Overseer will miss  operations in some cases for OverseerCollectionProcessor
> 
>
> Key: SOLR-4073
> URL: https://issues.apache.org/jira/browse/SOLR-4073
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
> Environment: Solr cloud
>Reporter: Raintung Li
>Assignee: Mark Miller
> Fix For: 4.2, 5.0
>
> Attachments: patch-4073
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> One overseer disconnects from ZooKeeper, but its overseer thread still 
> handles request(A) from the DistributedQueue. Example: the overseer thread 
> reconnects to ZooKeeper and tries to remove the top request via 
> "workQueue.remove();".
> Meanwhile, another server takes over the overseer role because the old 
> overseer disconnected. Its overseer thread starts, handles queue request(A) 
> again, removes request(A) from the queue, and then tries to get the new top 
> request (B), which it does not get yet. At this moment the old overseer 
> reconnects to ZooKeeper and removes the top request from the queue. The top 
> request is now B, so it is removed by the old overseer server. The new 
> overseer server never processes request B, because it was deleted by the old 
> overseer server; in the end the operations for request(B) are lost.
> Ideally, distributedQueue.peek should return the ID of the request that will 
> be removed, so that workQueue.remove(ID) can be used instead of removing the 
> top request blindly.
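The suggested fix can be sketched with a toy in-memory queue (the real code works against ZooKeeper; the class and method names here are purely illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the suggested fix: peek() hands back the request's ID so a
// possibly-stale overseer removes exactly the request it processed, never
// whatever happens to be at the head when its remove finally lands.
public class IdQueueExample {
    record Request(String id, String payload) {}

    private final Deque<Request> queue = new ArrayDeque<>();

    void offer(Request r) { queue.add(r); }

    Request peek() { return queue.peek(); }

    // Safe removal: only succeeds if the head is still the request we peeked.
    boolean remove(String id) {
        Request head = queue.peek();
        if (head != null && head.id().equals(id)) {
            queue.poll();
            return true;
        }
        return false; // already removed by the new overseer; do not touch B
    }

    int size() { return queue.size(); }
}
```

With remove(id), the old overseer's late remove() call becomes a no-op instead of silently deleting request B.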




[jira] [Commented] (SOLR-4073) Overseer will miss operations in some cases for OverseerCollectionProcessor

2013-02-24 Thread Raintung Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585569#comment-13585569
 ] 

Raintung Li commented on SOLR-4073:
---

Yes, I have made some of the changes in the other patch.

> Overseer will miss  operations in some cases for OverseerCollectionProcessor
> 
>
> Key: SOLR-4073
> URL: https://issues.apache.org/jira/browse/SOLR-4073
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
> Environment: Solr cloud
>Reporter: Raintung Li
>Assignee: Mark Miller
> Fix For: 4.2, 5.0
>
> Attachments: patch-4073
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> One overseer disconnects from ZooKeeper, but its overseer thread still 
> handles request(A) from the DistributedQueue. Example: the overseer thread 
> reconnects to ZooKeeper and tries to remove the top request via 
> "workQueue.remove();".
> Meanwhile, another server takes over the overseer role because the old 
> overseer disconnected. Its overseer thread starts, handles queue request(A) 
> again, removes request(A) from the queue, and then tries to get the new top 
> request (B), which it does not get yet. At this moment the old overseer 
> reconnects to ZooKeeper and removes the top request from the queue. The top 
> request is now B, so it is removed by the old overseer server. The new 
> overseer server never processes request B, because it was deleted by the old 
> overseer server; in the end the operations for request(B) are lost.
> Ideally, distributedQueue.peek should return the ID of the request that will 
> be removed, so that workQueue.remove(ID) can be used instead of removing the 
> top request blindly.




[jira] [Commented] (LUCENE-777) SpanWithinQuery - A SpanNotQuery that allows a specified number of intersections

2013-02-24 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585566#comment-13585566
 ] 

Mark Miller commented on LUCENE-777:


Hey Mike - did you include a test for that bug? 

I'd like to finally get this committed.

> SpanWithinQuery - A SpanNotQuery that allows a specified number of 
> intersections
> 
>
> Key: LUCENE-777
> URL: https://issues.apache.org/jira/browse/LUCENE-777
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/search
>Reporter: Mark Miller
>Priority: Minor
> Attachments: LUCENE-777-3X.patch, LUCENE-777-3X.patch, 
> LUCENE-777.patch, LUCENE-777.patch, SpanWithinQuery.java, SpanWithinQuery.java
>
>
> A SpanNotQuery that allows a specified number of intersections.




[JENKINS] Lucene-Solr-4.x-Linux (32bit/ibm-j9-jdk7) - Build # 4439 - Still Failing!

2013-02-24 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/4439/
Java: 32bit/ibm-j9-jdk7 

1 tests failed.
FAILED:  org.apache.solr.client.solrj.TestBatchUpdate.testWithBinary

Error Message:
IOException occured when talking to server at: https://127.0.0.1:50450/solr

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://127.0.0.1:50450/solr
    at __randomizedtesting.SeedInfo.seed([BB129E76CD0D4961:E964D0694AE29372]:0)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:416)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.add(HttpSolrServer.java:553)
    at org.apache.solr.client.solrj.TestBatchUpdate.doIt(TestBatchUpdate.java:109)
    at org.apache.solr.client.solrj.TestBatchUpdate.testWithBinary(TestBatchUpdate.java:62)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:88)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
    at java.lang.reflect.Method.invoke(Method.java:613)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(

[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer

2013-02-24 Thread Raintung Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585565#comment-13585565
 ] 

Raintung Li commented on SOLR-4449:
---

Let me describe it clearly so that we are on the same page.

Normal case:
The client sends 1 search request; the servlet handles it in a thread that we 
call the main request thread.
The main thread then starts 3 threads to send the request to the 3 shards, 
because this collection has 3 shards.
The main thread blocks until all 3 threads (shards) respond.
Result:
We need 4 threads in the normal case where no second request is sent.

Your case:
The client sends 1 search request.
The main thread then starts 6 threads to send the request to the 3 shards.
The main thread blocks waiting for the 3 primary threads to respond; the other 
3 threads are started in the LB.
Result:
We need 7 threads in the normal case where no second request is sent.

My case:
The client sends 1 search request.
The main thread then starts 3 threads, as in the normal case.
Change: the main thread waits a fixed time for the 3 threads to respond. For 
whichever shard times out, the main thread submits the second (backup) request.
Result:
We need 4 threads in the normal case where no second request is sent.
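The timed-wait idea can be sketched roughly as follows (class and method names are hypothetical, not Solr's actual LBHttpSolrServer code): wait a fixed time for the primary response, and only if it is late fire one backup request and take whichever answer arrives first.

```java
import java.util.concurrent.*;
import java.util.function.Supplier;

// Sketch of the timed-backup idea: no extra thread is spent unless the
// primary shard is slow, matching the 4-threads-in-the-normal-case result.
public class BackupRequestExample {
    @SuppressWarnings("unchecked")
    static <T> T queryWithBackup(Supplier<T> primary, Supplier<T> backup,
                                 long timeoutMillis, ExecutorService pool) throws Exception {
        CompletableFuture<T> first = CompletableFuture.supplyAsync(primary, pool);
        try {
            return first.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Primary is slow: submit the backup and accept the fastest answer.
            CompletableFuture<T> second = CompletableFuture.supplyAsync(backup, pool);
            return (T) CompletableFuture.anyOf(first, second).get();
        }
    }

    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        String r = queryWithBackup(
                () -> { sleep(500); return "primary"; }, // slow shard
                () -> "backup",                          // fast backup replica
                50, pool);
        System.out.println(r);
        pool.shutdown();
    }
}
```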
   




> Enable backup requests for the internal solr load balancer
> --
>
> Key: SOLR-4449
> URL: https://issues.apache.org/jira/browse/SOLR-4449
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: philip hoy
>Priority: Minor
> Attachments: SOLR-4449.patch
>
>
> Add the ability to configure the built-in solr load balancer such that it 
> submits a backup request to the next server in the list if the initial 
> request takes too long. Employing such an algorithm could improve the latency 
> of the 9xth percentile albeit at the expense of increasing overall load due 
> to additional requests. 




[jira] [Created] (SOLR-4499) StatsComponent could use some serious TLC

2013-02-24 Thread Robert Muir (JIRA)
Robert Muir created SOLR-4499:
-

 Summary: StatsComponent could use some serious TLC
 Key: SOLR-4499
 URL: https://issues.apache.org/jira/browse/SOLR-4499
 Project: Solr
  Issue Type: Bug
Reporter: Robert Muir


Most of these problems are actually documented on the wiki page, but here is my 
go at ideas for improving it, after reviewing this thing today.

# The external API should be made performant (e.g. some sort of paging for the 
stats.facet, vs returning ALL values)
# The code for multi-valued fields is clearly broken: it tries to use a 
combination of UninvertedField with a single-valued fieldcache for multivalued 
fields. 
# The behavior for multi-valued fields could be unexpected: whether it's 
UninvertedField or DocValues, these data structures return the *unique* set of 
ordinals for the document. So I think it can be very misleading to return stats 
like 'sum' for multivalued fields. 
# The stats returned should be implemented in ways that are fast. For example 
the string case returns min/max, but does this by looking up every single 
ordinal to term and using String.compareTo. The ords are themselves comparable, 
so count/missing/min/max can all be satisfied with 2 ord->term lookups per 
segment. These are also the only stats I think multi-valued numerics should 
return (see above).
# Things like accumulate(NamedList) appear to have scary runtime (I think this 
one is only used for merging distributed results?). They should not use the 
O(n) get() method over and over in accumulate(), but instead do a single pass 
through the list.
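The last point can be illustrated with a toy name/value list standing in for Solr's NamedList, which stores entries as a flat ordered list, so a lookup by name is a linear scan (class and method names below are illustrative only):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy illustration of point 5: calling get() per key during a merge is
// O(n^2) because each lookup scans the whole list; iterating the entries
// once is O(n). The list-of-entries structure mimics a NamedList.
public class SinglePassExample {
    static long mergeWithRepeatedGet(List<Map.Entry<String, Long>> stats, List<String> keys) {
        long total = 0;
        for (String k : keys) {
            for (Map.Entry<String, Long> e : stats) {   // O(n) scan per key
                if (e.getKey().equals(k)) { total += e.getValue(); break; }
            }
        }
        return total;
    }

    static long mergeSinglePass(List<Map.Entry<String, Long>> stats, Set<String> keys) {
        long total = 0;
        for (Map.Entry<String, Long> e : stats)         // one pass, O(n)
            if (keys.contains(e.getKey())) total += e.getValue();
        return total;
    }
}
```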

Finally the code is pretty difficult to follow, and tests are inadequate for 
what all is going on here.





[jira] [Resolved] (SOLR-4498) it would be useful for ZkCLI to expose printLayoutToStdOut to aid debugging

2013-02-24 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4498.
---

Resolution: Fixed

Thanks Roman!

> it would be useful for ZkCLI to expose printLayoutToStdOut to aid debugging
> ---
>
> Key: SOLR-4498
> URL: https://issues.apache.org/jira/browse/SOLR-4498
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 4.1
>Reporter: Roman Shaposhnik
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 4.2, 5.0
>
> Attachments: SOLR-4498.patch.txt
>
>
> It would be nice to have a -cmd list that would simply call 
> zkClient.printLayoutToStdOut()




[jira] [Commented] (SOLR-4498) it would be useful for ZkCLI to expose printLayoutToStdOut to aid debugging

2013-02-24 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585562#comment-13585562
 ] 

Commit Tag Bot commented on SOLR-4498:
--

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1449579

SOLR-4498: Add list command to ZkCLI that prints out the contents of ZooKeeper.


> it would be useful for ZkCLI to expose printLayoutToStdOut to aid debugging
> ---
>
> Key: SOLR-4498
> URL: https://issues.apache.org/jira/browse/SOLR-4498
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 4.1
>Reporter: Roman Shaposhnik
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 4.2, 5.0
>
> Attachments: SOLR-4498.patch.txt
>
>
> It would be nice to have a -cmd list that would simply call 
> zkClient.printLayoutToStdOut()




[jira] [Updated] (SOLR-4490) add support for multivalued docvalues

2013-02-24 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-4490:
--

Attachment: SOLR-4490.patch

Updated patch: I think it's ready.

I decided to bail on stats, as that has larger issues beyond docvalues (I'll 
open a separate issue). Actually it doesn't work right for multivalued fields 
at all right now anyway.

> add support  for multivalued docvalues
> --
>
> Key: SOLR-4490
> URL: https://issues.apache.org/jira/browse/SOLR-4490
> Project: Solr
>  Issue Type: New Feature
>Reporter: Robert Muir
> Attachments: SOLR-4490.patch, SOLR-4490.patch
>
>
> exposing LUCENE-4765 essentially. 
> I think we don't need any new options, it just means doing the right thing 
> when someone has docValues=true and multivalued=true.




[jira] [Commented] (SOLR-4498) it would be useful for ZkCLI to expose printLayoutToStdOut to aid debugging

2013-02-24 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585559#comment-13585559
 ] 

Commit Tag Bot commented on SOLR-4498:
--

[trunk commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1449578

SOLR-4498: Add list command to ZkCLI that prints out the contents of ZooKeeper.


> it would be useful for ZkCLI to expose printLayoutToStdOut to aid debugging
> ---
>
> Key: SOLR-4498
> URL: https://issues.apache.org/jira/browse/SOLR-4498
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 4.1
>Reporter: Roman Shaposhnik
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 4.2, 5.0
>
> Attachments: SOLR-4498.patch.txt
>
>
> It would be nice to have a -cmd list that would simply call 
> zkClient.printLayoutToStdOut()




[jira] [Updated] (SOLR-4498) it would be useful for ZkCLI to expose printLayoutToStdOut to aid debugging

2013-02-24 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4498:
--

Fix Version/s: 5.0
   4.2
 Assignee: Mark Miller
   Issue Type: New Feature  (was: Bug)

> it would be useful for ZkCLI to expose printLayoutToStdOut to aid debugging
> ---
>
> Key: SOLR-4498
> URL: https://issues.apache.org/jira/browse/SOLR-4498
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 4.1
>Reporter: Roman Shaposhnik
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 4.2, 5.0
>
> Attachments: SOLR-4498.patch.txt
>
>
> It would be nice to have a -cmd list that would simply call 
> zkClient.printLayoutToStdOut()




[jira] [Updated] (SOLR-4498) it would be useful for ZkCLI to expose printLayoutToStdOut to aid debugging

2013-02-24 Thread Roman Shaposhnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated SOLR-4498:
---

Attachment: SOLR-4498.patch.txt

Attaching a trivial patch.

> it would be useful for ZkCLI to expose printLayoutToStdOut to aid debugging
> ---
>
> Key: SOLR-4498
> URL: https://issues.apache.org/jira/browse/SOLR-4498
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.1
>Reporter: Roman Shaposhnik
>Priority: Minor
> Attachments: SOLR-4498.patch.txt
>
>
> It would be nice to have a -cmd list that would simply call 
> zkClient.printLayoutToStdOut()




[jira] [Created] (SOLR-4498) it would be useful for ZkCLI to expose printLayoutToStdOut to aid debugging

2013-02-24 Thread Roman Shaposhnik (JIRA)
Roman Shaposhnik created SOLR-4498:
--

 Summary: it would be useful for ZkCLI to expose 
printLayoutToStdOut to aid debugging
 Key: SOLR-4498
 URL: https://issues.apache.org/jira/browse/SOLR-4498
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Roman Shaposhnik
Priority: Minor


It would be nice to have a -cmd list that would simply call 
zkClient.printLayoutToStdOut()




[jira] [Updated] (SOLR-4497) Collection Aliasing.

2013-02-24 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4497:
--

Attachment: SOLR-4497.patch

Here is a patch for an initial pass. 

* There are two new actions for the collections API, createalias and 
deletealias. 
* createalias is also currently used to update an existing alias.
* On the search side you can map one alias to multiple collections.
* On the update side, only a one-to-one mapping will work.


> Collection Aliasing.
> 
>
> Key: SOLR-4497
> URL: https://issues.apache.org/jira/browse/SOLR-4497
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 4.2, 5.0
>
> Attachments: SOLR-4497.patch
>
>
> We should bring back the old aliasing feature, but for SolrCloud and with the 
> ability to alias one collection to many.
> The old alias feature was of more limited use and had some problems, so we 
> dropped it, but I think we can do this in a more useful way with SolrCloud, 
> and at a level where it's not invasive to the CoreContainer.
> Initially, the search side will allow mapping a single alias to multiple 
> collections, but the index side will only support mapping a single alias to a 
> single collection.




[jira] [Created] (SOLR-4497) Collection Aliasing.

2013-02-24 Thread Mark Miller (JIRA)
Mark Miller created SOLR-4497:
-

 Summary: Collection Aliasing.
 Key: SOLR-4497
 URL: https://issues.apache.org/jira/browse/SOLR-4497
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.2, 5.0


We should bring back the old aliasing feature, but for SolrCloud and with the 
ability to alias one collection to many.

The old alias feature was of more limited use and had some problems, so we 
dropped it, but I think we can do this in a more useful way with SolrCloud, and 
at a level where it's not invasive to the CoreContainer.

Initially, the search side will allow mapping a single alias to multiple 
collections, but the index side will only support mapping a single alias to a 
single collection.




[jira] [Updated] (SOLR-4496) Support for faceting on the start of values

2013-02-24 Thread Teun Duynstee (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teun Duynstee updated SOLR-4496:


Attachment: limitLength-limitDelim-1st.patch

This implements the idea, but will throw an exception for multivalued fields. 
Tests have been added. Please have mercy on my coding style; I don't know my 
way around Java that well. 

> Support for faceting on the start of values
> ---
>
> Key: SOLR-4496
> URL: https://issues.apache.org/jira/browse/SOLR-4496
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Teun Duynstee
>Priority: Minor
> Attachments: limitLength-limitDelim-1st.patch
>
>
> The SimpleFacets component supports the prefix parameter to return only 
> facets starting with that prefix. This feature should (IMO) be complemented 
> by two more parameters to make it much more useful (the names could be 
> improved, of course):
> - limitLength: returns facets for only the first x characters of the real 
> facets. If the real values are AAA, CC and CCC, the limitLength=1 parameter 
> would cause the facets A and C to be returned, with the sum of the counts. 
> This could typically be used for a UI that lets you select a first letter 
> for fields with many facets.
> - limitDelim: this would not truncate at a fixed length, but at the 
> occurrence of a certain character after the prefix. This would allow the 
> user to search hierarchical fields without having to include each level of 
> the hierarchy at index analysis. This way, the value of the field cat would 
> be 'Comics>Marvel>Batman' and it would be found using 
> prefix=Comics>&limitDelim=>. This would return the facet Marvel with the 
> combined count for all underlying cat values.
> I am working on a patch that achieves this by postprocessing the resulting 
> counts in getTermCounts(). However, it will not return the correct counts 
> for multivalued fields. Also, the combination with field.limit is not easy. 
> Any tips for how to implement this? I'm available to work on a patch. Or is 
> it a bad idea anyway?
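
The limitDelim idea can be sketched outside Solr as a plain truncate-and-merge 
pass over term counts (an illustrative toy, not the attached patch; the class 
and method names are made up):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LimitDelimSketch {
    // Postprocess facet counts: keep values under the prefix, truncate each
    // at the first delimiter after the prefix, and sum the counts of values
    // that collapse to the same truncated key.
    static Map<String, Integer> truncate(Map<String, Integer> counts,
                                         String prefix, char delim) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            String value = e.getKey();
            if (!value.startsWith(prefix)) continue;
            int i = value.indexOf(delim, prefix.length());
            String key = i < 0 ? value.substring(prefix.length())
                               : value.substring(prefix.length(), i);
            out.merge(key, e.getValue(), Integer::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("Comics>Marvel>Batman", 3);
        counts.put("Comics>Marvel>X-Men", 2);
        counts.put("Comics>DC>Superman", 4);
        System.out.println(truncate(counts, "Comics>", '>'));
        // prints: {Marvel=5, DC=4}
    }
}
```

As in the description's example, prefix=Comics> with delimiter '>' collapses 
the values to Marvel and DC with summed counts.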




[jira] [Commented] (LUCENE-4792) Smaller doc maps

2013-02-24 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585490#comment-13585490
 ] 

Adrien Grand commented on LUCENE-4792:
--

In case someone would like to use this class, I'd add that:
 - the encoded sequence does not strictly need to be monotonic: it can encode 
any sequence of values, but it compresses best when the stream contains 
monotonic sub-sequences of at least 1024 longs (for example, it would have a 
good compression ratio if the stream first contained a run of increasing 
values and then 5000 decreasing values),
 - it can address up to 2^42 values,
 - there are writer/reader equivalents called MonotonicBlockPackedWriter and 
MonotonicBlockPackedReader (which can either load values in memory or read 
from disk).
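
A simplified sketch of the trick behind these classes (not the actual Lucene 
implementation; block handling and bit packing are elided): each block stores 
a start value and an average slope, so only the small per-value deviations 
need to be bit-packed.

```java
public class MonotonicSketch {
    // Model each value as min + slope * i + deviation[i]. For (near-)
    // monotonic blocks the deviations are tiny, so they pack into very
    // few bits per value; that is where the compression comes from.
    static long[] deviations(long[] block) {
        int n = block.length;
        float slope = n == 1 ? 0 : (float) (block[n - 1] - block[0]) / (n - 1);
        long[] dev = new long[n];
        long min = Long.MAX_VALUE;
        for (int i = 0; i < n; i++) {
            dev[i] = block[i] - (long) (slope * i);
            min = Math.min(min, dev[i]);
        }
        for (int i = 0; i < n; i++) dev[i] -= min; // non-negative and small
        return dev;
    }

    public static void main(String[] args) {
        long[] docMap = {0, 2, 4, 6, 8, 10}; // a perfectly monotonic block
        for (long d : deviations(docMap)) System.out.print(d + " ");
        // prints: 0 0 0 0 0 0  (zero deviations, so ~0 bits per value)
    }
}
```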

> Smaller doc maps
> 
>
> Key: LUCENE-4792
> URL: https://issues.apache.org/jira/browse/LUCENE-4792
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: 4.2
>
> Attachments: LUCENE-4792.patch
>
>
> MergeState.DocMap could leverage MonotonicAppendingLongBuffer to save memory.




[jira] [Commented] (LUCENE-4792) Smaller doc maps

2013-02-24 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585474#comment-13585474
 ] 

Robert Muir commented on LUCENE-4792:
-

We are using the same compression for (as far as I know):
* stored fields, term vectors, docvalues "disk" addresses
* multidocvalues ordinal maps

We could consider trying it out for the fieldcache and other places, for 
example; I'm not sure what the perf hit would be.
(I'm not very interested in optimizing the fieldcache myself.)

> Smaller doc maps
> 
>
> Key: LUCENE-4792
> URL: https://issues.apache.org/jira/browse/LUCENE-4792
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: 4.2
>
> Attachments: LUCENE-4792.patch
>
>
> MergeState.DocMap could leverage MonotonicAppendingLongBuffer to save memory.




[jira] [Commented] (LUCENE-4792) Smaller doc maps

2013-02-24 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585471#comment-13585471
 ] 

Michael McCandless commented on LUCENE-4792:


These RAM savings are AWESOME!  Where else can we use 
MonotonicAppendingLongBuffer?

> Smaller doc maps
> 
>
> Key: LUCENE-4792
> URL: https://issues.apache.org/jira/browse/LUCENE-4792
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: 4.2
>
> Attachments: LUCENE-4792.patch
>
>
> MergeState.DocMap could leverage MonotonicAppendingLongBuffer to save memory.




[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans

2013-02-24 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585466#comment-13585466
 ] 

Alan Woodward commented on LUCENE-2878:
---

I've been chipping away at this for a bit.  Here's a summary of what I've done:
- Applied LUCENE-4524, and also added startPosition() and endPosition() to 
DocsEnum
- Changed the postings readers to return NO_MORE_POSITIONS once nextPosition() 
has been called freq() times
- Extended the ConjunctionScorer and DisjunctionScorer implementations to 
return positions
- Added an abstract PositionFilteredScorer with reset(int doc) and 
doNextPosition() methods
- Added a bunch of concrete implementations (ExactPhraseQuery, NotWithinQuery, 
OrderedNearQuery, UnorderedNearQuery, RangeFilterQuery) with tests - these are 
all in the posfilter package

I still need to implement SloppyPhraseQuery and MultiPhraseQuery, but I 
actually think these won't be too difficult with this API.  Plus there are a 
bunch of nocommits regarding freq() calculations, and this doesn't work at all 
with BooleanScorer - we'll probably need a way to tell the scorer that we do or 
don't want position information.

[~simonw] and I talked about this on IRC the other day, about resolving 
collisions in ExactPhraseQuery, but I think that problem may go away doing 
things this way.  I may have misunderstood though - if so, could you add a test 
to TestExactPhraseQuery showing what I'm missing, Simon?
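
The "return NO_MORE_POSITIONS once nextPosition() has been called freq() 
times" contract above can be sketched with a toy positions enum (my own 
stand-in for illustration, not the patch's DocsEnum API):

```java
public class PositionsSketch {
    static final int NO_MORE_POSITIONS = Integer.MAX_VALUE;

    // Toy enum over one document's positions: nextPosition() yields each
    // position once, then NO_MORE_POSITIONS after freq() calls, so a
    // consumer can loop without tracking freq() itself.
    static class Positions {
        private final int[] positions;
        private int upto;
        Positions(int... positions) { this.positions = positions; }
        int freq() { return positions.length; }
        int nextPosition() {
            return upto < positions.length ? positions[upto++]
                                           : NO_MORE_POSITIONS;
        }
    }

    public static void main(String[] args) {
        Positions p = new Positions(3, 7, 42);
        int pos;
        while ((pos = p.nextPosition()) != NO_MORE_POSITIONS) {
            System.out.print(pos + " ");  // prints: 3 7 42
        }
    }
}
```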

> Allow Scorer to expose positions and payloads aka. nuke spans 
> --
>
> Key: LUCENE-2878
> URL: https://issues.apache.org/jira/browse/LUCENE-2878
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: Positions Branch
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>  Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, 
> mentor
> Fix For: Positions Branch
>
> Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, 
> LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
> LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
> LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
> LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
> LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
> LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
> LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, LUCENE-2878-vs-trunk.patch, 
> PosHighlighter.patch, PosHighlighter.patch
>
>
> Currently we have two somewhat separate types of queries: the ones that can 
> make use of positions (mainly spans) and payloads (spans). Yet Span*Query 
> doesn't really do scoring comparable to what other queries do, and at the 
> end of the day they duplicate a lot of code all over Lucene. Span*Queries 
> are also limited to other Span*Query instances, such that you can not use a 
> TermQuery or a BooleanQuery with SpanNear or anything like that. 
> Besides the Span*Query limitation, other queries lack a quite interesting 
> feature, since they can not score based on term proximity: scores don't 
> expose any positional information. All those problems bugged me for a while 
> now, so I started working on that using the bulkpostings API. I would have 
> done that first cut on trunk, but TermScorer works on BlockReaders that do 
> not expose positions, while the one in this branch does. I started adding a 
> new Positions class which users can pull from a scorer; to prevent 
> unnecessary positions enums I added ScorerContext#needsPositions and 
> eventually Scorer#needsPayloads to create the corresponding enum on demand. 
> Yet, currently only TermQuery / TermScorer implements this API and the 
> others simply return null instead. 
> To show that the API really works, and that our BulkPostings work fine with 
> positions too, I cut over TermSpanQuery to use a TermScorer under the hood 
> and nuked TermSpans entirely. A nice side effect of this was that the 
> Position BulkReading implementation got some exercise, which now all works 
> with positions :), while Payloads for bulk reading are kind of experimental 
> in the patch and only work with the Standard codec. 
> So all spans now work on top of TermScorer (I truly hate spans since today), 
> including the ones that need Payloads (StandardCodec ONLY)!!  I didn't 
> bother to implement the other codecs yet, since I want to get feedback on 
> the API and on this first cut before I go on with it. I will upload the 
> corresponding patch in a minute. 
> I also had to cut over SpanQuery.getSpans(IR) to 
> SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk 
> first, but after that pain toda

[jira] [Updated] (LUCENE-4748) Add DrillSideways helper class to Lucene facets module

2013-02-24 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-4748:
---

Attachment: LUCENE-4748.patch

Good idea!  New patch pulling [package private] DSQ out.

> Add DrillSideways helper class to Lucene facets module
> --
>
> Key: LUCENE-4748
> URL: https://issues.apache.org/jira/browse/LUCENE-4748
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.2, 5.0
>
> Attachments: DrillSideways-alternative.tar.gz, LUCENE-4748.patch, 
> LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
> LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
> LUCENE-4748.patch, LUCENE-4748.patch
>
>
> This came out of a discussion on the java-user list with subject
> "Faceted search in OR": http://markmail.org/thread/jmnq6z2x7ayzci5k
> The basic idea is to count "near misses" during collection, i.e.
> documents that matched the main query and also all except one of the
> drill-down filters.
> Drill sideways makes for a very nice faceted search UI because you
> don't "lose" the facet counts after drilling in.  E.g. maybe you do a
> search for "cameras", and you see facets for the manufacturer, so you
> drill into "Nikon".
> With drill sideways, even after drilling down, you'll still get the
> counts for all the other brands, where each count tells you how many
> hits you'd get if you changed to a different manufacturer.
> This becomes more fun if you add further drill-downs, e.g. maybe I next
> drill down into "Resolution=10 megapixels", and then I can see how many 10
> megapixel cameras all other manufacturers offer, and what other resolutions
> Nikon cameras offer.
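
The near-miss counting described above can be sketched independently of 
Lucene (illustrative only; the real DrillSideways works against collectors 
and drill-down queries, and all names here are invented):

```java
import java.util.List;
import java.util.function.Predicate;

public class DrillSidewaysSketch {
    record Cam(String brand, int mp) {}

    // For each drill-down dimension i, count docs matching the main query
    // and every dimension EXCEPT possibly i: full hits count everywhere,
    // and a doc failing exactly one dimension counts only for that one.
    static <D> int[] nearMissCounts(List<D> docs, Predicate<D> mainQuery,
                                    List<Predicate<D>> dims) {
        int[] counts = new int[dims.size()];
        for (D doc : docs) {
            if (!mainQuery.test(doc)) continue;
            int failed = -1;                               // the one failed dim
            for (int i = 0; i < dims.size(); i++) {
                if (!dims.get(i).test(doc)) {
                    if (failed >= 0) { failed = -2; break; } // failed 2+: skip
                    failed = i;
                }
            }
            if (failed == -2) continue;
            if (failed >= 0) counts[failed]++;             // sideways hit
            else for (int i = 0; i < counts.length; i++) counts[i]++; // full hit
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Cam> docs = List.of(new Cam("Nikon", 10), new Cam("Canon", 10),
                                 new Cam("Nikon", 12));
        List<Predicate<Cam>> dims = List.of(d -> d.brand().equals("Nikon"),
                                            d -> d.mp() == 10);
        int[] c = nearMissCounts(docs, d -> true, dims);
        System.out.println(c[0] + " " + c[1]); // prints: 2 2
    }
}
```

After drilling into Nikon and 10 megapixels, the brand facet still reflects 
every 10-megapixel camera regardless of brand, and vice versa.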




[jira] [Updated] (SOLR-4490) add support for multivalued docvalues

2013-02-24 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-4490:
--

Attachment: SOLR-4490.patch

Dumping my current state: queries and faceting are working, but I haven't 
tackled Stats yet.

The current faceting "fc" code works for the single-valued DV case, but I 
broke it out into a different implementation to handle multiple values and 
walk per-segment to avoid the binary search in MultiDocValues.
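
The per-segment walk mentioned above can be sketched abstractly (a toy model, 
not the patch: "segments" here are just local ordinal arrays with their own 
ord-to-term dictionaries):

```java
import java.util.Map;
import java.util.TreeMap;

public class PerSegmentFacetSketch {
    // Count against each segment's small local ordinal space, then merge by
    // term once per segment, instead of mapping every doc's local ordinal to
    // a global one per hit (the binary search a MultiDocValues view implies).
    static Map<String, Integer> count(int[][] segmentOrds,
                                      String[][] segmentTerms) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int seg = 0; seg < segmentOrds.length; seg++) {
            int[] local = new int[segmentTerms[seg].length];
            for (int ord : segmentOrds[seg]) local[ord]++;  // cheap local pass
            for (int ord = 0; ord < local.length; ord++)    // merge per segment
                if (local[ord] > 0)
                    counts.merge(segmentTerms[seg][ord], local[ord],
                                 Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[][] ords = {{0, 1, 0}, {0, 0}};
        String[][] terms = {{"solr", "lucene"}, {"lucene"}};
        System.out.println(count(ords, terms)); // prints: {lucene=3, solr=2}
    }
}
```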

> add support  for multivalued docvalues
> --
>
> Key: SOLR-4490
> URL: https://issues.apache.org/jira/browse/SOLR-4490
> Project: Solr
>  Issue Type: New Feature
>Reporter: Robert Muir
> Attachments: SOLR-4490.patch
>
>
> exposing LUCENE-4765 essentially. 
> I think we don't need any new options, it just means doing the right thing 
> when someone has docValues=true and multivalued=true.




[jira] [Commented] (LUCENE-4748) Add DrillSideways helper class to Lucene facets module

2013-02-24 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585458#comment-13585458
 ] 

Shai Erera commented on LUCENE-4748:


Should DSQ perhaps be in its own class? Just to improve DS readability... 

> Add DrillSideways helper class to Lucene facets module
> --
>
> Key: LUCENE-4748
> URL: https://issues.apache.org/jira/browse/LUCENE-4748
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.2, 5.0
>
> Attachments: DrillSideways-alternative.tar.gz, LUCENE-4748.patch, 
> LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
> LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
> LUCENE-4748.patch
>
>
> This came out of a discussion on the java-user list with subject
> "Faceted search in OR": http://markmail.org/thread/jmnq6z2x7ayzci5k
> The basic idea is to count "near misses" during collection, i.e.
> documents that matched the main query and also all except one of the
> drill-down filters.
> Drill sideways makes for a very nice faceted search UI because you
> don't "lose" the facet counts after drilling in.  E.g. maybe you do a
> search for "cameras", and you see facets for the manufacturer, so you
> drill into "Nikon".
> With drill sideways, even after drilling down, you'll still get the
> counts for all the other brands, where each count tells you how many
> hits you'd get if you changed to a different manufacturer.
> This becomes more fun if you add further drill-downs, e.g. maybe I next
> drill down into "Resolution=10 megapixels", and then I can see how many 10
> megapixel cameras all other manufacturers offer, and what other resolutions
> Nikon cameras offer.




[jira] [Commented] (SOLR-4316) Admin UI - SolrCloud - extend core options to collections

2013-02-24 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585446#comment-13585446
 ] 

Mark Miller commented on SOLR-4316:
---

bq. a checkbox for that parameter.

I think it would be cool if someone added that.

bq.  I think it's a good idea to make collections available as entities within 
the UI.

I think the right first step is to add a UI for the collections API - similar 
to the one for the CoreAdmin API.

For UI stuff though, I've been depending on the kindness of one stranger mostly 
:)

> Admin UI - SolrCloud - extend core options to collections
> -
>
> Key: SOLR-4316
> URL: https://issues.apache.org/jira/browse/SOLR-4316
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Affects Versions: 4.1
>Reporter: Shawn Heisey
> Fix For: 4.2, 5.0
>
>
> There are a number of sections available when you are looking at a core in 
> the UI - Ping, Query, Schema, Config, Replication, Analysis, Schema Browser, 
> Plugins / Stats, and Dataimport are the ones that I can see.
> A list of collections should be available, with as many of those options as 
> can apply to a collection. If options specific to collections/SolrCloud can 
> be implemented, those should be there too.




[jira] [Commented] (LUCENE-3550) Create example code for core

2013-02-24 Thread Manpreet (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585434#comment-13585434
 ] 

Manpreet commented on LUCENE-3550:
--

Hi Shai - I've created the patch for LUCENE-3550. Kindly review.

Thanks
-MS

> Create example code for core
> 
>
> Key: LUCENE-3550
> URL: https://issues.apache.org/jira/browse/LUCENE-3550
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/other
>Reporter: Shai Erera
>  Labels: newdev
> Attachments: LUCENE-3550.patch
>
>
> Trunk has gone through lots of API changes, some of which are not trivial, 
> and the migration path from 3.x to 4.0 seems hard. I'd like to propose a way 
> to tackle this, by means of live example code.
> The facet module implements this approach. There is live Java code under 
> src/examples that demonstrates some well-documented scenarios. The code 
> itself is documented, in addition to javadoc. Also, the code itself is 
> being unit tested regularly.
> We found it very difficult to keep documentation up-to-date -- javadocs 
> always lag behind, Wiki pages get old, etc. However, when you have live Java 
> code, you're *forced* to keep it up-to-date. It doesn't compile if you break 
> the API, and it fails to run if you change internal impl behavior. If you 
> keep it simple enough, its documentation stays simple too.
> And if we are successful at maintaining it (which we must be, otherwise the 
> build should fail), then people should have an easy experience migrating 
> between releases. So say you take the simple scenario "I'd like to index 
> documents which have the fields ID, date and body". Then you create an 
> example class/method that accomplishes that. And between releases, this 
> code gets updated, and people can follow the changes required to implement 
> that scenario.
> I'm not saying the example code should always stay optimized. We can aim at 
> that, but I won't fool myself into thinking we'll succeed. But at least we 
> can get it compiled and regularly unit tested.
> I think it would be good if we introduced the concept of examples such that 
> if a module (core, contrib, modules) has a src/examples, we package it in a 
> .jar and include it with the binary distribution. That's a first step. We 
> can also have meta examples, under their own module/contrib, that show how 
> to combine several modules together (this might even uncover API problems), 
> but that's definitely a second phase.
> At first, let's do the "unit examples" (a la unit tests), starting with 
> core. Whatever we succeed at writing for 4.0 will only help users. So let's 
> use this issue to:
> # List example scenarios that we want to demonstrate for core
> # Build the infrastructure in our build system to package and distribute a 
> module's examples.
> Please feel free to list example scenarios that come to mind. We can then 
> track what's been done and what's not. The more we do, the better.




[jira] [Updated] (LUCENE-3550) Create example code for core

2013-02-24 Thread Manpreet (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manpreet updated LUCENE-3550:
-

Attachment: LUCENE-3550.patch

Patch for Example Code

> Create example code for core
> 
>
> Key: LUCENE-3550
> URL: https://issues.apache.org/jira/browse/LUCENE-3550
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/other
>Reporter: Shai Erera
>  Labels: newdev
> Attachments: LUCENE-3550.patch
>
>
> Trunk has gone through lots of API changes, some of which are not trivial, 
> and the migration path from 3.x to 4.0 seems hard. I'd like to propose a way 
> to tackle this, by means of live example code.
> The facet module implements this approach. There is live Java code under 
> src/examples that demonstrates some well-documented scenarios. The code 
> itself is documented, in addition to javadoc. Also, the code itself is 
> being unit tested regularly.
> We found it very difficult to keep documentation up-to-date -- javadocs 
> always lag behind, Wiki pages get old, etc. However, when you have live Java 
> code, you're *forced* to keep it up-to-date. It doesn't compile if you break 
> the API, and it fails to run if you change internal impl behavior. If you 
> keep it simple enough, its documentation stays simple too.
> And if we are successful at maintaining it (which we must be, otherwise the 
> build should fail), then people should have an easy experience migrating 
> between releases. So say you take the simple scenario "I'd like to index 
> documents which have the fields ID, date and body". Then you create an 
> example class/method that accomplishes that. And between releases, this 
> code gets updated, and people can follow the changes required to implement 
> that scenario.
> I'm not saying the example code should always stay optimized. We can aim at 
> that, but I won't fool myself into thinking we'll succeed. But at least we 
> can get it compiled and regularly unit tested.
> I think it would be good if we introduced the concept of examples such that 
> if a module (core, contrib, modules) has a src/examples, we package it in a 
> .jar and include it with the binary distribution. That's a first step. We 
> can also have meta examples, under their own module/contrib, that show how 
> to combine several modules together (this might even uncover API problems), 
> but that's definitely a second phase.
> At first, let's do the "unit examples" (a la unit tests), starting with 
> core. Whatever we succeed at writing for 4.0 will only help users. So let's 
> use this issue to:
> # List example scenarios that we want to demonstrate for core
> # Build the infrastructure in our build system to package and distribute a 
> module's examples.
> Please feel free to list example scenarios that come to mind. We can then 
> track what's been done and what's not. The more we do, the better.




[jira] [Updated] (LUCENE-3550) Create example code for core

2013-02-24 Thread Manpreet (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manpreet updated LUCENE-3550:
-

Attachment: (was: LUCENE-3550.patch)

> Create example code for core
> 
>
> Key: LUCENE-3550
> URL: https://issues.apache.org/jira/browse/LUCENE-3550
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/other
>Reporter: Shai Erera
>  Labels: newdev
> Attachments: LUCENE-3550.patch
>
>
> Trunk has gone through lots of API changes, some of which are not trivial, 
> and the migration path from 3.x to 4.0 seems hard. I'd like to propose a way 
> to tackle this, by means of live example code.
> The facet module implements this approach. There is live Java code under 
> src/examples that demonstrates some well-documented scenarios. The code 
> itself is documented, in addition to javadoc. Also, the code itself is 
> being unit tested regularly.
> We found it very difficult to keep documentation up-to-date -- javadocs 
> always lag behind, Wiki pages get old, etc. However, when you have live Java 
> code, you're *forced* to keep it up-to-date. It doesn't compile if you break 
> the API, and it fails to run if you change internal impl behavior. If you 
> keep it simple enough, its documentation stays simple too.
> And if we are successful at maintaining it (which we must be, otherwise the 
> build should fail), then people should have an easy experience migrating 
> between releases. So say you take the simple scenario "I'd like to index 
> documents which have the fields ID, date and body". Then you create an 
> example class/method that accomplishes that. And between releases, this 
> code gets updated, and people can follow the changes required to implement 
> that scenario.
> I'm not saying the example code should always stay optimized. We can aim at 
> that, but I won't fool myself into thinking we'll succeed. But at least we 
> can get it compiled and regularly unit tested.
> I think it would be good if we introduced the concept of examples such that 
> if a module (core, contrib, modules) has a src/examples, we package it in a 
> .jar and include it with the binary distribution. That's a first step. We 
> can also have meta examples, under their own module/contrib, that show how 
> to combine several modules together (this might even uncover API problems), 
> but that's definitely a second phase.
> At first, let's do the "unit examples" (a la unit tests), starting with 
> core. Whatever we succeed at writing for 4.0 will only help users. So let's 
> use this issue to:
> # List example scenarios that we want to demonstrate for core
> # Build the infrastructure in our build system to package and distribute a 
> module's examples.
> Please feel free to list example scenarios that come to mind. We can then 
> track what's been done and what's not. The more we do, the better.




[jira] [Updated] (LUCENE-4748) Add DrillSideways helper class to Lucene facets module

2013-02-24 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-4748:
---

Attachment: LUCENE-4748.patch

Thanks Shai, I fixed those issues.

For the first TODO, I removed the comment: I think an app should not make a 
DDQ, add no drill-downs to it, and then pass it to DS.

For the 2nd TODO, I fixed it so that the "pure browse" case works, but I put 
another TODO to improve the implementation later ...

> Add DrillSideways helper class to Lucene facets module
> --
>
> Key: LUCENE-4748
> URL: https://issues.apache.org/jira/browse/LUCENE-4748
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.2, 5.0
>
> Attachments: DrillSideways-alternative.tar.gz, LUCENE-4748.patch, 
> LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
> LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
> LUCENE-4748.patch
>
>
> This came out of a discussion on the java-user list with subject
> "Faceted search in OR": http://markmail.org/thread/jmnq6z2x7ayzci5k
> The basic idea is to count "near misses" during collection, i.e.
> documents that matched the main query and also all except one of the
> drill-down filters.
> Drill sideways makes for a very nice faceted search UI because you
> don't "lose" the facet counts after drilling in.  E.g. maybe you do a
> search for "cameras", and you see facets for the manufacturer, so you
> drill into "Nikon".
> With drill sideways, even after drilling down, you'll still get the
> counts for all the other brands, where each count tells you how many
> hits you'd get if you changed to a different manufacturer.
> This becomes more fun if you add further drill-downs, e.g. maybe I next
> drill down into "Resolution=10 megapixels", and then I can see how many 10
> megapixel cameras all other manufacturers offer, and what other resolutions
> Nikon cameras offer.




[jira] [Updated] (SOLR-4496) Support for faceting on the start of values

2013-02-24 Thread Teun Duynstee (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teun Duynstee updated SOLR-4496:


Description: 
The SimpleFacets component supports the prefix parameter to return only facets
starting with that prefix. This feature should (IMO) be complemented by two
more parameters that would make it much more useful (the names could of course
be improved):
- limitLength: returns facets for only the first x characters of the real
values. If the real values are AAA, CC and CCC, the limitLength=1 parameter
would cause the facets A and C to be returned, with the sum of the counts. This
could typically be used for a UI that lets the user select a first letter for
fields with many facets.
- limitDelim: this would not truncate at a fixed length, but at the occurrence
of a certain character after the prefix. This would allow the user to search
hierarchical fields without having to index each level of the hierarchy
separately at analysis time. This way, the value of the field cat could be
'Comics>Marvel>Batman' and it would be found using
prefix=Comics>&limitDelim=>. This would return the facet Marvel with the
combined count for all underlying cat values.

I am working on a patch that would achieve this by post-processing the
resulting counts in getTermCounts(). However, this will not return the correct
counts for multivalued fields. Also, the combination with facet.limit is not
easy. Any tips on how to implement this? I'm available to work on a patch. Or
is it a bad idea anyway?
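
A minimal sketch of the post-processing step the description proposes, over
plain in-memory counts. The parameter names limitLength and limitDelim come
from the proposal; the class, method signatures, and map-based representation
are illustrative assumptions, not the SimpleFacets API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch of the proposed limitLength/limitDelim post-processing. */
public class FacetTruncation {

    /** Collapse facet values to their first {@code limitLength} characters, summing counts. */
    public static Map<String, Integer> limitLength(Map<String, Integer> counts, int limitLength) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            String key = e.getKey().length() > limitLength
                ? e.getKey().substring(0, limitLength)
                : e.getKey();
            out.merge(key, e.getValue(), Integer::sum);   // sum counts of collapsed values
        }
        return out;
    }

    /** Collapse values at the first {@code delim} after {@code prefix}, summing counts. */
    public static Map<String, Integer> limitDelim(Map<String, Integer> counts, String prefix, char delim) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            String v = e.getKey();
            if (!v.startsWith(prefix)) continue;          // honour the facet prefix
            int cut = v.indexOf(delim, prefix.length());  // next delimiter after the prefix
            String key = cut >= 0 ? v.substring(0, cut) : v;
            out.merge(key, e.getValue(), Integer::sum);
        }
        return out;
    }
}
```

With values AAA=2, CC=1, CCC=3, limitLength=1 yields A=2 and C=4; with
'Comics>Marvel>Batman' and 'Comics>Marvel>Spiderman', prefix=Comics> and
delim='>' collapse both into one 'Comics>Marvel' bucket. Note that, as the
description warns, summing per-value counts like this over-counts documents
that hold several values of a multivalued field.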


  was:
The SimpleFacets component supports the prefix parameter to return only facets
starting with that prefix. This feature should (IMO) be complemented by two
more parameters that would make it much more useful (the names could of course
be improved):
- limitLength: returns facets for only the first x characters of the real
values. If the real values are AAA, CC and CCC, the limitLength=1 parameter
would cause the facets A and C to be returned, with the sum of the counts. This
could typically be used for a UI that lets the user select a first letter for
fields with many facets.
- limitDelim: this would not truncate at a fixed length, but at the occurrence
of a certain character after the prefix. This would allow the user to search
hierarchical fields without having to index each level of the hierarchy
separately at analysis time. This way, the value of the field cat could be
'Comics>Marvel>Batman' and it would be found using
prefix=Comics>&limitDelim=>. This would return the facet Marvel with the
combined count for all underlying cat values.

I am working on a patch that would achieve this by post-processing the
resulting counts in getTermCounts(). However, this will not return the correct
counts for multivalued fields. Also, the combination with facet.limit is not
easy. Any tips on how to implement this? I'm available to work on a patch. Or
is it a bad adie anyway?



> Support for faceting on the start of values
> ---
>
> Key: SOLR-4496
> URL: https://issues.apache.org/jira/browse/SOLR-4496
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Teun Duynstee
>Priority: Minor
>
> The SimpleFacets component supports the prefix parameter to return only
> facets starting with that prefix. This feature should (IMO) be complemented
> by two more parameters that would make it much more useful (the names could
> of course be improved):
> - limitLength: returns facets for only the first x characters of the real
> values. If the real values are AAA, CC and CCC, the limitLength=1 parameter
> would cause the facets A and C to be returned, with the sum of the counts.
> This could typically be used for a UI that lets the user select a first
> letter for fields with many facets.
> - limitDelim: this would not truncate at a fixed length, but at the
> occurrence of a certain character after the prefix. This would allow the
> user to search hierarchical fields without having to index each level of the
> hierarchy separately at analysis time. This way, the value of the field cat
> could be 'Comics>Marvel>Batman' and it would be found using
> prefix=Comics>&limitDelim=>. This would return the facet Marvel with the
> combined count for all underlying cat values.
> I am working on a patch that would achieve this by post-processing the
> resulting counts in getTermCounts(). However, this will not return the
> correct counts for multivalued fields. Also, the combination with
> facet.limit is not easy. Any tips on how to implement this? I'm available
> to work on a patch. Or is it a bad idea anyway?


[jira] [Updated] (SOLR-4496) Support for faceting on the start of values

2013-02-24 Thread Teun Duynstee (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teun Duynstee updated SOLR-4496:


Description: 
The SimpleFacets component supports the prefix parameter to return only facets
starting with that prefix. This feature should (IMO) be complemented by two
more parameters that would make it much more useful (the names could of course
be improved):
- limitLength: returns facets for only the first x characters of the real
values. If the real values are AAA, CC and CCC, the limitLength=1 parameter
would cause the facets A and C to be returned, with the sum of the counts. This
could typically be used for a UI that lets the user select a first letter for
fields with many facets.
- limitDelim: this would not truncate at a fixed length, but at the occurrence
of a certain character after the prefix. This would allow the user to search
hierarchical fields without having to index each level of the hierarchy
separately at analysis time. This way, the value of the field cat could be
'Comics>Marvel>Batman' and it would be found using
prefix=Comics>&limitDelim=>. This would return the facet Marvel with the
combined count for all underlying cat values.

I am working on a patch that would achieve this by post-processing the
resulting counts in getTermCounts(). However, this will not return the correct
counts for multivalued fields. Also, the combination with facet.limit is not
easy. Any tips on how to implement this? I'm available to work on a patch. Or
is it a bad idea anyway?


  was:
The SimpleFacets component supports the prefix parameter to return only facets
starting with that prefix. This feature should (IMO) be complemented by two
more parameters that would make it much more useful (the names could of course
be improved):
- limitLength: returns facets for only the first x characters of the real
values. If the real values are AAA, CC and CCC, the limitLength=1 parameter
would cause the facets A and C to be returned, with the sum of the counts. This
could typically be used for a UI that lets the user select a first letter for
fields with many facets.
- limitDelim: this would not truncate at a fixed length, but at the occurrence
of a certain character after the prefix. This would allow the user to search
hierarchical fields without having to index each level of the hierarchy
separately at analysis time. This way, the value of the field cat could be
'Comics>Marvel>Batman' and it would be found using
prefix=Comics>&limitDelim=>. This would return the facet Marvel with the
combined count for all underlying cat values.

I am working on a patch that would achieve this by post-processing the
resulting counts in getTermCounts(). However, this will not return the correct
counts for multivalued fields. Also, the combination with facet.limit is not
easy. Any tips on how to implement this? I'm available to work on a patch. Or
is it a bad idia anyway?



> Support for faceting on the start of values
> ---
>
> Key: SOLR-4496
> URL: https://issues.apache.org/jira/browse/SOLR-4496
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Teun Duynstee
>Priority: Minor
>
> The SimpleFacets component supports the prefix parameter to return only
> facets starting with that prefix. This feature should (IMO) be complemented
> by two more parameters that would make it much more useful (the names could
> of course be improved):
> - limitLength: returns facets for only the first x characters of the real
> values. If the real values are AAA, CC and CCC, the limitLength=1 parameter
> would cause the facets A and C to be returned, with the sum of the counts.
> This could typically be used for a UI that lets the user select a first
> letter for fields with many facets.
> - limitDelim: this would not truncate at a fixed length, but at the
> occurrence of a certain character after the prefix. This would allow the
> user to search hierarchical fields without having to index each level of the
> hierarchy separately at analysis time. This way, the value of the field cat
> could be 'Comics>Marvel>Batman' and it would be found using
> prefix=Comics>&limitDelim=>. This would return the facet Marvel with the
> combined count for all underlying cat values.
> I am working on a patch that would achieve this by post-processing the
> resulting counts in getTermCounts(). However, this will not return the
> correct counts for multivalued fields. Also, the combination with
> facet.limit is not easy. Any tips on how to implement this? I'm available
> to work on a patch. Or is it a bad idea anyway?


[jira] [Created] (SOLR-4496) Support for faceting on the start of values

2013-02-24 Thread Teun Duynstee (JIRA)
Teun Duynstee created SOLR-4496:
---

 Summary: Support for faceting on the start of values
 Key: SOLR-4496
 URL: https://issues.apache.org/jira/browse/SOLR-4496
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Teun Duynstee
Priority: Minor


The SimpleFacets component supports the prefix parameter to return only facets
starting with that prefix. This feature should (IMO) be complemented by two
more parameters that would make it much more useful (the names could of course
be improved):
- limitLength: returns facets for only the first x characters of the real
values. If the real values are AAA, CC and CCC, the limitLength=1 parameter
would cause the facets A and C to be returned, with the sum of the counts. This
could typically be used for a UI that lets the user select a first letter for
fields with many facets.
- limitDelim: this would not truncate at a fixed length, but at the occurrence
of a certain character after the prefix. This would allow the user to search
hierarchical fields without having to index each level of the hierarchy
separately at analysis time. This way, the value of the field cat could be
'Comics>Marvel>Batman' and it would be found using
prefix=Comics>&limitDelim=>. This would return the facet Marvel with the
combined count for all underlying cat values.

I am working on a patch that would achieve this by post-processing the
resulting counts in getTermCounts(). However, this will not return the correct
counts for multivalued fields. Also, the combination with facet.limit is not
easy. Any tips on how to implement this? I'm available to work on a patch. Or
is it a bad idea anyway?





[jira] [Updated] (SOLR-4495) solr.xml sharedLib attribute should take a list of paths

2013-02-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-4495:
--

Attachment: SOLR-4495.patch

First patch attempt; it works, but has no tests.

> solr.xml sharedLib attribute should take a list of paths
> 
>
> Key: SOLR-4495
> URL: https://issues.apache.org/jira/browse/SOLR-4495
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Jan Høydahl
>  Labels: classpath, solr.xml
> Attachments: SOLR-4495.patch
>
>
> solr.xml's sharedLib is a great way to add plugins that should be shared 
> across all cores/collections.
> For increased flexibility I would like it to take a list of paths. Then
> I'd put Solr's own contrib libs in one shared folder "solrJars" and custom
> plugins with their deps in another "customerJars". That eases Solr upgrades,
> since we can simply wipe and replace all jars in "solrJars" during an
> upgrade.
> I realize that solr.xml is going away, so the same request will be valid
> for whatever replaces it, whether that be a system property or a properties
> file.
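
A sketch of what a multi-path sharedLib could look like, assuming a
comma-separated attribute value (e.g. sharedLib="solrJars,customerJars") and
resolution of relative entries against the Solr home directory. Both the
delimiter and the resolution rule are assumptions for illustration, not
necessarily what the attached patch does:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Illustrative parsing of a comma-separated sharedLib attribute value. */
public class SharedLibPaths {

    /** Split the attribute value and resolve each relative entry against solrHome. */
    public static List<File> resolve(String sharedLib, File solrHome) {
        List<File> dirs = new ArrayList<>();
        if (sharedLib == null || sharedLib.trim().isEmpty()) {
            return dirs;                                   // attribute absent: no shared dirs
        }
        for (String part : sharedLib.split(",")) {
            String p = part.trim();
            if (p.isEmpty()) continue;                     // tolerate stray commas
            File f = new File(p);
            dirs.add(f.isAbsolute() ? f : new File(solrHome, p));
        }
        return dirs;
    }
}
```

Each resolved directory would then be scanned for jars and added to the shared
classloader, exactly as the single sharedLib directory is today.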




[jira] [Commented] (LUCENE-4748) Add DrillSideways helper class to Lucene facets module

2013-02-24 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585339#comment-13585339
 ] 

Shai Erera commented on LUCENE-4748:


Great work Mike!

Few comments:
* The CHANGES line should not remove 'static methods' right?

* Can you add jdoc to FA.create(), something like "returns FA if all requests 
are CountFR, StandardFA otherwise"?
** And while on that, fix FacetsCollector.create to say "calls create(FA) by 
creating FA using FA.create" ... something like that

* This "// TODO: remove this limitation: it's silly?" -- can we handle it now? 
It's odd that we add a 'silly' limitation :). If we don't want the app to do 
it, then just throw the exception, and remove the TODO. Otherwise allow it?
** Same goes for this "// TODO: remove this limitation (allow pure browse".
** Unless, it's not easy to handle these TODOs now.

* This "for(int i=0;i

> Add DrillSideways helper class to Lucene facets module
> --
>
> Key: LUCENE-4748
> URL: https://issues.apache.org/jira/browse/LUCENE-4748
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.2, 5.0
>
> Attachments: DrillSideways-alternative.tar.gz, LUCENE-4748.patch, 
> LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
> LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch
>
>
> This came out of a discussion on the java-user list with subject
> "Faceted search in OR": http://markmail.org/thread/jmnq6z2x7ayzci5k
> The basic idea is to count "near misses" during collection, ie
> documents that matched the main query and also all except one of the
> drill down filters.
> Drill sideways makes for a very nice faceted search UI because you
> don't "lose" the facet counts after drilling in.  Eg maybe you do a
> search for "cameras", and you see facets for the manufacturer, so you
> drill into "Nikon".
> With drill sideways, even after drilling down, you'll still get the
> counts for all the other brands, where each count tells you how many
> hits you'd get if you changed to a different manufacturer.
> This becomes more fun if you add further drill-downs, eg maybe I next drill
> down into Resolution="10 megapixels", and then I can see how many 10
> megapixel cameras all other manufacturers offer, and what other resolutions
> Nikon cameras offer.
