[jira] [Commented] (SOLR-4658) In preparation for dynamic schema modification via REST API, add a "managed" schema facility

2013-03-31 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618654#comment-13618654
 ] 

Steve Rowe commented on SOLR-4658:
--

bq. It seems a little weird to tie all this zookeeper etc stuff into 
indexschema

Well, since it's only used by IndexSchema, it seemed like the logical location. 
 Do you have an alternative suggestion?

bq. If the goal is to have multiple implementations of indexschema (immutable 
ones backed by human edited files, mutable ones saved to some opaque "database" 
that can be edited by REST), then why not make IndexSchema abstract and 
pluggable from solrconfig.xml like anything else?

I imagine you're thinking of a hierarchy like:

* IndexSchema
** MutableIndexSchema
*** MutableZooKeeperIndexSchema
*** MutableLocalIndexSchema
*** ...
** ImmutableIndexSchema
*** ImmutableZooKeeperIndexSchema
*** ImmutableLocalIndexSchema
*** ...

Is that right?

Then solrconfig.xml config could be something like 
{{<schema managed="true" storage="zookeeper"/>}}
And the implementation could be chosen using SPI or something like it.
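A minimal Java sketch of that idea (class and method names here are purely illustrative, not actual Solr code): an abstract schema type plus a provider interface, with the concrete implementation chosen by whichever registered provider supports the configured attributes. In practice the providers could come from {{java.util.ServiceLoader}}.

```java
// Illustrative sketch only -- these are not real Solr classes.
abstract class AbstractIndexSchema {
    abstract boolean isMutable();
}

// Implementations would register as providers (e.g. via java.util.ServiceLoader);
// each one declares which managed/storage combination it supports.
interface SchemaProvider {
    boolean supports(boolean managed, String storage);
    AbstractIndexSchema create(String resourceName);
}

class SchemaResolver {
    // Pick the first provider supporting the requested configuration; in real
    // code the Iterable would come from ServiceLoader.load(SchemaProvider.class).
    static AbstractIndexSchema resolve(Iterable<SchemaProvider> providers,
                                       boolean managed, String storage,
                                       String resourceName) {
        for (SchemaProvider p : providers) {
            if (p.supports(managed, storage)) {
                return p.create(resourceName);
            }
        }
        throw new IllegalArgumentException(
            "no schema implementation for managed=" + managed + " storage=" + storage);
    }
}
```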

One problem I see with this: the decision about the storage location for 
configs is made elsewhere - it would definitely be an issue if people chose 
e.g. {{managed="true" storage="local"}} in SolrCloud mode.

Or maybe I've misrepresented what you had in mind, Robert?

> In preparation for dynamic schema modification via REST API, add a "managed" 
> schema facility
> 
>
> Key: SOLR-4658
> URL: https://issues.apache.org/jira/browse/SOLR-4658
> Project: Solr
>  Issue Type: Sub-task
>  Components: Schema and Analysis
>Reporter: Steve Rowe
>Assignee: Steve Rowe
>Priority: Minor
> Fix For: 4.3
>
> Attachments: SOLR-4658.patch
>
>
> The idea is to have a set of configuration items in {{solrconfig.xml}}:
> {code:xml}
> <schema managed="true" mutable="true" managedSchemaResourceName="managed-schema"/>
> {code}
> It will be a precondition for future dynamic schema modification APIs that 
> {{mutable="true"}}.  {{solrconfig.xml}} parsing will fail if 
> {{mutable="true"}} but {{managed="false"}}.
> When {{managed="true"}}, and the resource named in 
> {{managedSchemaResourceName}} doesn't exist, Solr will automatically upgrade 
> the schema to "managed": the non-managed schema resource (typically 
> {{schema.xml}}) is parsed and then persisted at {{managedSchemaResourceName}} 
> under {{$solrHome/$collectionOrCore/conf/}}, or on ZooKeeper at 
> {{/configs/$configName/}}, and the non-managed schema resource is renamed by 
> appending {{.bak}}, e.g. {{schema.xml.bak}}.
> Once the upgrade has taken place, users can get the full schema from the 
> {{/schema?wt=schema.xml}} REST API, and can use this as the basis for 
> modifications which can then be used to manually downgrade back to 
> non-managed schema: put the {{schema.xml}} in place, then add 
> {{<schema managed="false"/>}} to {{solrconfig.xml}} (or remove the whole 
> {{<schema/>}} element, since {{managed="false"}} is the default).
> If users take no action, then Solr behaves the same as always: the example 
> {{solrconfig.xml}} will include {{<schema managed="false"/>}}.
> For a discussion of rationale for this feature, see 
> [~hossman_luc...@fucit.org]'s post to the solr-user mailing list in the 
> thread "Dynamic schema design: feedback requested" 
> [http://markmail.org/message/76zj24dru2gkop7b]:
>  
> {quote}
> Ignoring for a moment what format is used to persist schema information, I 
> think it's important to have a conceptual distinction between "data" that 
> is managed by applications and manipulated by a REST API, and "config" 
> that is managed by the user and loaded by solr on init -- or via an 
> explicit "reload config" REST API.
> Past experience with how users perceive(d) solr.xml has heavily reinforced 
> this opinion: on one hand, it's a place users must specify some config 
> information -- so people want to be able to keep it in version control 
> with other config files.  On the other hand it's a "live" data file that 
> is rewritten by solr when cores are added.  (God help you if you want to do a 
> rolling deploy of a new version of solr.xml where you've edited some of the 
> config values while simultaneously clients are creating new SolrCores)
> As we move forward towards having REST APIs that treat schema information 
> as "data" that can be manipulated, I anticipate the same types of 
> confusion, misunderstanding, and grumblings if we try to use the same 
> pattern of treating the existing schema.xml (or some new schema.json) as a 
> hybrid configs & data file.  "Edit it by hand if you want, the /schema/* 
> REST API will too!"  ... Even assuming we don't make any of the same 
> technical mistakes that have caused problems with solr.xml round t

[jira] [Updated] (SOLR-4652) Resource loader has broken behavior for solr.xml plugins (basically ShardHandlerFactory)

2013-03-31 Thread Ryan Ernst (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Ernst updated SOLR-4652:
-

Attachment: SOLR-4652.patch

{quote}
Can a test be written that just does the simplest checks: getting the loaders 
and verifying the "hierarchy" with Classloader.getParent()?
{quote}

Good idea.  Done with this new patch.

> Resource loader has broken behavior for solr.xml plugins (basically 
> ShardHandlerFactory)
> 
>
> Key: SOLR-4652
> URL: https://issues.apache.org/jira/browse/SOLR-4652
> Project: Solr
>  Issue Type: Bug
>Reporter: Ryan Ernst
> Attachments: SOLR-4652.patch, SOLR-4652.patch
>
>
> I have the following scenario:
> MyShardHandlerFactory is plugged in via solr.xml.  The jar containing 
> MyShardHandlerFactory is in the shared lib dir.  There are a couple issues:
> 1. From within a per core handler (that is loaded within the core's lib dir), 
> if you grab the ShardHandlerFactory from CoreContainer, casting it to 
> MyShardHandlerFactory results in a ClassCastException with a message 
> like "cannot cast instance of MyShardHandlerFactory to MyShardHandlerFactory".
> 2. Adding a custom dir for shared lib (for example "mylib") does not work.  
> The ShardHandlerFactory is initialized before sharedLib is loaded.
> I've been poring through the code on this and I don't see an easy fix.  I'll 
> keep looking at it, but I wanted to get this up so hopefully others have some 
> thoughts on how best to fix.  IMO, it seems like there needs to be a clear 
> chain of resource loaders (one for loading solr.xml, a child for loading the 
> lib dir, used for solr.xml plugins, a grandchild for per core config, and a 
> great grandchild for per core lib dir based plugins).  Right now there are 
> some siblings, because any place a SolrResourceLoader is created with a null 
> parent classloader, it gets the jetty thread's classloader as the parent.
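The chain proposed above can be sketched with plain classloaders (illustrative only; this is not Solr's actual SolrResourceLoader API):

```java
import java.net.URL;
import java.net.URLClassLoader;

// Illustrative sketch (not Solr's actual SolrResourceLoader): the proposed
// chain, where each level gets the previous level as its explicit parent
// classloader, so nothing falls back to a sibling or to the jetty thread's
// context classloader.
public class LoaderChainSketch {
    final ClassLoader solrXmlLoader;   // loads solr.xml itself
    final ClassLoader sharedLibLoader; // shared lib dir, for solr.xml plugins
    final ClassLoader coreConfLoader;  // per-core config
    final ClassLoader coreLibLoader;   // per-core lib dir plugins

    LoaderChainSketch(ClassLoader root) {
        solrXmlLoader   = new URLClassLoader(new URL[0], root);
        sharedLibLoader = new URLClassLoader(new URL[0], solrXmlLoader);
        coreConfLoader  = new URLClassLoader(new URL[0], sharedLibLoader);
        coreLibLoader   = new URLClassLoader(new URL[0], coreConfLoader);
    }
}
```

Walking {{getParent()}} from the leaf then visits each ancestor in order, which is exactly the kind of hierarchy a test could assert.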

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4659) SolrResourceLoader should always get parent classloader explicitly (no "default" behavior)

2013-03-31 Thread Ryan Ernst (JIRA)
Ryan Ernst created SOLR-4659:


 Summary: SolrResourceLoader should always get parent classloader 
explicitly (no "default" behavior)
 Key: SOLR-4659
 URL: https://issues.apache.org/jira/browse/SOLR-4659
 Project: Solr
  Issue Type: Bug
Reporter: Ryan Ernst


The webapp classloader should be retrieved once from the context (during static 
initialization), and passed through when loading the shared resource loader.

This was spun off of https://issues.apache.org/jira/browse/SOLR-4652




[jira] [Commented] (SOLR-4652) Resource loader has broken behavior for solr.xml plugins (basically ShardHandlerFactory)

2013-03-31 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618638#comment-13618638
 ] 

Ryan Ernst commented on SOLR-4652:
--

{quote}
Separately about the refactoring, I think Uwe mentioned (then deleted) a key 
step for a future issue: in order to be able to guarantee it's correct in the 
future, a good rote refactoring step after this issue would be to remove the 
'treat null as context classloader' behavior (instead throw an exception if 
it's null!) from SolrResourceLoader and remove the ctor that takes no parent 
classloader: this way it's explicit what is happening everywhere, which would 
reduce the confusion.
{quote}

Yeah, this is the route I had started going down when I realized it was a 
sizable undertaking.  I've created a separate issue:
https://issues.apache.org/jira/browse/SOLR-4659





[jira] [Resolved] (LUCENE-4893) Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as FacetsCollector.getFacetResults is called.

2013-03-31 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-4893.


   Resolution: Fixed
Fix Version/s: 4.3
   5.0
 Assignee: Shai Erera
Lucene Fields: New,Patch Available  (was: New)

Committed to trunk and 4x. I added defensive code to prevent an app from 
tripping itself up: if it calls getFacetResults and then runs another search 
without calling reset, setNextReader now clears the cached results.
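The bug and the fix can be illustrated with a stripped-down sketch (not the actual Lucene facet classes): a shared count array that is re-accumulated on each result call would multiply counts, so accumulation happens once per cached result, and a new reader invalidates the cache.

```java
import java.util.Arrays;

// Stripped-down sketch of the bug and the fix (not the actual Lucene facet
// classes): accumulation into the shared arrays happens once per cached
// result, and a new reader invalidates the cache and clears the arrays.
class FacetAccumulatorSketch {
    private final int[] facetArrays = new int[2];   // counts per facet ordinal
    private final int[] collectedOrds = {0, 1, 1};  // hits gathered during search
    private int[] cachedResults;

    // The defensive fix: starting a new reader drops any cached results and
    // resets the shared arrays before the next accumulation.
    void setNextReader() {
        cachedResults = null;
        Arrays.fill(facetArrays, 0);
    }

    int[] getResults() {
        if (cachedResults == null) {                 // accumulate only once
            for (int ord : collectedOrds) {
                facetArrays[ord]++;
            }
            cachedResults = facetArrays.clone();
        }
        return cachedResults;                        // repeat calls see the same counts
    }
}
```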

Thanks crocket for reporting this!

> Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as 
> FacetsCollector.getFacetResults is called.
> 
>
> Key: LUCENE-4893
> URL: https://issues.apache.org/jira/browse/LUCENE-4893
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: 4.2
>Reporter: crocket
>Assignee: Shai Erera
> Fix For: 5.0, 4.3
>
> Attachments: LUCENE-4893.patch, LUCENE-4893.patch, LUCENE-4893.patch, 
> LUCENE-4893.patch
>
>
> In lucene 4.1, only StandardFacetsAccumulator could be instantiated.
> And as of lucene 4.2, it became possible to instantiate FacetsAccumulator.
> I invoked FacetsCollector.getFacetResults twice, and I saw doubled facet 
> counts.
> If I invoke it three times, I see facet counts multiplied three times.
> It all happens in FacetsAccumulator.accumulate.
> StandardFacetsAccumulator doesn't have this bug since it frees facetArrays 
> whenever StandardFacetsAccumulator.accumulate is called.
> Is it a feature or a bug?




[jira] [Resolved] (LUCENE-4861) can we use a single PostingsHighlighter for both whole and snippet highlighting?

2013-03-31 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-4861.
-

Resolution: Fixed

> can we use a single PostingsHighlighter for both whole and snippet 
> highlighting?
> 
>
> Key: LUCENE-4861
> URL: https://issues.apache.org/jira/browse/LUCENE-4861
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 5.0, 4.3
>
> Attachments: LUCENE-4861.patch
>
>
> Right now, because we pass the BreakIterator to the ctor, you have to make 
> two instances if you sometimes want whole and sometimes want snippets.
> But I think this is a fairly common use case, eg I want entire title field 
> (with matches highlighted) but I want body field (snippets + highlights).  It 
> would be nice to have this work with a single instance ...




[jira] [Commented] (SOLR-4652) Resource loader has broken behavior for solr.xml plugins (basically ShardHandlerFactory)

2013-03-31 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618603#comment-13618603
 ] 

Robert Muir commented on SOLR-4652:
---

Can a test be written that just does the simplest checks: getting the loaders 
and verifying the "hierarchy" with Classloader.getParent()?

Separately about the refactoring, I think Uwe mentioned (then deleted) a key 
step for a future issue: in order to be able to guarantee it's correct in the 
future, a good rote refactoring step after this issue would be to remove the 
'treat null as context classloader' behavior (instead throw an exception if 
it's null!) from SolrResourceLoader and remove the ctor that takes no parent 
classloader: this way it's explicit what is happening everywhere, which would 
reduce the confusion.






[jira] [Commented] (SOLR-4658) In preparation for dynamic schema modification via REST API, add a "managed" schema facility

2013-03-31 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618599#comment-13618599
 ] 

Robert Muir commented on SOLR-4658:
---

It seems a little weird to tie all this zookeeper etc stuff into 
indexschema, and I'm still trying to figure out the mutable/managed stuff. 

If the goal is to have multiple implementations of indexschema (immutable ones 
backed by human edited files, mutable ones saved to some opaque "database" that 
can be edited by REST), then why not make IndexSchema abstract and pluggable 
from solrconfig.xml like anything else?


[jira] [Updated] (SOLR-4658) In preparation for dynamic schema modification via REST API, add a "managed" schema facility

2013-03-31 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-4658:
-

Attachment: SOLR-4658.patch

Patch implementing the idea.

This makes the IndexSchema constructor private, and adds a factory method named 
{{create()}}, which manages the upgrade-to-managed-schema process when 
necessary.

The persistence format is kept as XML.  A comment at the top says:

{code:xml}
<!-- Solr managed schema - automatically generated - DO NOT EDIT! -->
{code}

This patch also adds a method to {{core.Config}} to test for unexpected element 
attributes when parsing {{solrconfig.xml}}: 
{{complainAboutUnknownAttributes()}}.  I'm only using it for the {{<schema/>}} 
tag at this point, but it should be useful for any other config elements that 
have a known fixed set of attributes.
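The check itself could look something like this sketch (hypothetical signature; the real method lives in Solr's {{core.Config}}):

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the check described above (the real method lives in
// Solr's core.Config; this signature is illustrative): given the attributes
// present on an element and the known-valid set, fail fast on anything
// unexpected so typos in solrconfig.xml surface at parse time.
class AttributeValidator {
    static void complainAboutUnknownAttributes(String elementName,
                                               List<String> present,
                                               Set<String> known) {
        for (String attr : present) {
            if (!known.contains(attr)) {
                throw new IllegalArgumentException(
                    "Unknown attribute '" + attr + "' on <" + elementName + ">");
            }
        }
    }
}
```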

Tests added for SolrCloud and standalone modes.

I think it's ready to go.



[jira] [Created] (SOLR-4658) In preparation for dynamic schema modification via REST API, add a "managed" schema facility

2013-03-31 Thread Steve Rowe (JIRA)
Steve Rowe created SOLR-4658:


 Summary: In preparation for dynamic schema modification via REST 
API, add a "managed" schema facility
 Key: SOLR-4658
 URL: https://issues.apache.org/jira/browse/SOLR-4658
 Project: Solr
  Issue Type: Sub-task
  Components: Schema and Analysis
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Minor
 Fix For: 4.3


The idea is to have a set of configuration items in {{solrconfig.xml}}:

{code:xml}
<schema managed="true" mutable="true" managedSchemaResourceName="managed-schema"/>
{code}

It will be a precondition for future dynamic schema modification APIs that 
{{mutable="true"}}.  {{solrconfig.xml}} parsing will fail if {{mutable="true"}} 
but {{managed="false"}}.
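That precondition amounts to a one-line check at parse time; a minimal sketch (illustrative, not the actual parsing code):

```java
// Minimal sketch of the precondition above (illustrative, not the actual
// solrconfig.xml parsing code): mutable="true" only makes sense for a
// managed schema, so that combination is rejected while parsing.
class SchemaConfigCheck {
    static void validate(boolean managed, boolean mutable) {
        if (mutable && !managed) {
            throw new IllegalArgumentException(
                "schema cannot be mutable=\"true\" when managed=\"false\"");
        }
    }
}
```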

When {{managed="true"}}, and the resource named in 
{{managedSchemaResourceName}} doesn't exist, Solr will automatically upgrade 
the schema to "managed": the non-managed schema resource (typically 
{{schema.xml}}) is parsed and then persisted at {{managedSchemaResourceName}} 
under {{$solrHome/$collectionOrCore/conf/}}, or on ZooKeeper at 
{{/configs/$configName/}}, and the non-managed schema resource is renamed by 
appending {{.bak}}, e.g. {{schema.xml.bak}}.
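For the local (non-ZooKeeper) case, the upgrade described above amounts to roughly the following sketch (illustrative: the real code parses and re-persists the schema rather than copying the file, and also covers the ZooKeeper path):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Rough sketch of the local (non-ZooKeeper) upgrade step (illustrative: the
// real implementation parses the legacy schema and persists the result,
// rather than copying the file, and also covers the ZooKeeper case).
class UpgradeToManagedSketch {
    static void upgrade(Path confDir, String legacyName, String managedName)
            throws IOException {
        Path legacy = confDir.resolve(legacyName);    // e.g. schema.xml
        Path managed = confDir.resolve(managedName);  // e.g. managed-schema
        if (Files.exists(legacy) && !Files.exists(managed)) {
            // "parse then persist" is elided: just put content at the managed name...
            Files.copy(legacy, managed);
            // ...and rename the non-managed resource out of the way.
            Files.move(legacy, legacy.resolveSibling(legacyName + ".bak"));
        }
    }
}
```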

Once the upgrade has taken place, users can get the full schema from the 
{{/schema?wt=schema.xml}} REST API, and can use this as the basis for 
modifications which can then be used to manually downgrade back to non-managed 
schema: put the {{schema.xml}} in place, then add {{<schema managed="false"/>}} 
to {{solrconfig.xml}} (or remove the whole {{<schema/>}} element, since 
{{managed="false"}} is the default).

If users take no action, then Solr behaves the same as always: the example 
{{solrconfig.xml}} will include {{<schema managed="false"/>}}.

For a discussion of rationale for this feature, see 
[~hossman_luc...@fucit.org]'s post to the solr-user mailing list in the thread 
"Dynamic schema design: feedback requested" 
[http://markmail.org/message/76zj24dru2gkop7b]:
 
{quote}
Ignoring for a moment what format is used to persist schema information, I 
think it's important to have a conceptual distinction between "data" that 
is managed by applications and manipulated by a REST API, and "config" 
that is managed by the user and loaded by solr on init -- or via an 
explicit "reload config" REST API.

Past experience with how users perceive(d) solr.xml has heavily reinforced 
this opinion: on one hand, it's a place users must specify some config 
information -- so people want to be able to keep it in version control 
with other config files.  On the other hand it's a "live" data file that 
is rewritten by solr when cores are added.  (God help you if you want to do a 
rolling deploy of a new version of solr.xml where you've edited some of the 
config values while simultaneously clients are creating new SolrCores)

As we move forward towards having REST APIs that treat schema information 
as "data" that can be manipulated, I anticipate the same types of 
confusion, misunderstanding, and grumblings if we try to use the same 
pattern of treating the existing schema.xml (or some new schema.json) as a 
hybrid configs & data file.  "Edit it by hand if you want, the /schema/* 
REST API will too!"  ... Even assuming we don't make any of the same 
technical mistakes that have caused problems with solr.xml round tripping 
in the past (ie: losing comments, reading new config options that we 
forget to write back out, etc...) I'm fairly certain there is still going 
to be a lot of things that will look weird and confusing to people.

(XML may have been designed to be both "human readable & writable" and 
"machine readable & writable", but practically speaking it's hard to have a 
single XML file be "machine and human readable & writable")

I think it would make a lot of sense -- not just in terms of 
implementation but also for end user clarity -- to have some simple, 
straightforward to understand caveats about maintaining schema 
information...

1) If you want to keep schema information in an authoritative config file 
that you can manually edit, then the /schema REST API will be read only. 

2) If you wish to use the /schema REST API for read and write operations, 
then schema information will be persisted under the covers in a data store 
whose format is an implementation detail just like the index file format.

3) If you are using a schema config file and you wish to switch to using 
the /schema REST API for managing schema information, there is a 
tool/command/API you can run to do so. 

4) If you are using the /schema REST API for managing schema information, 
and you wish to switch to using a schema config file, there is a 
tool/command/API you can run to export the schema info in a config file 
format.


...whether or not the "under the covers in a data store" used by the REST 
API is JSON, or some binary data, or an XML file that's just schema.xml w/o 
whitespace/comments should be an implementation detail.  Likewise is the 
question of whether some new config f

[jira] [Commented] (SOLR-4657) Failing OpenCloseCoreStressTest

2013-03-31 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618561#comment-13618561
 ] 

Erick Erickson commented on SOLR-4657:
--

trunk r: 1463068
4.x r: 1463076

Now we'll see if the "failure to close resource" goes away.

> Failing OpenCloseCoreStressTest
> ---
>
> Key: SOLR-4657
> URL: https://issues.apache.org/jira/browse/SOLR-4657
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: SOLR-4657.patch
>
>
> I have an idea what's happening with the test that apparently doesn't close a 
> core, but it'll be something of a hit-or-miss process to fix it as it looks 
> timing related. Might have a patch up later today other commitments willing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: asking for help to access the bugzilla for lucene

2013-03-31 Thread Han Jiang
They don't use "bugzilla" ... https://issues.apache.org/jira/browse/LUCENE

On Mon, Apr 1, 2013 at 9:32 AM, 陈秀招  wrote:

> Hi,
>
> I’m a graduate student in Peking University. And due to the research
> recently in my lab, I want to get access to the bugzilla for lucene. What
> should I do? Thank you!
>
> Best wishes!
>



-- 
Han Jiang

Team of Search Engine and Web Mining,
School of Electronic Engineering  and Computer Science,
Peking University, China


asking for help to access the bugzilla for lucene

2013-03-31 Thread 陈秀招
Hi,

I'm a graduate student at Peking University, and due to recent research
in my lab I want to get access to the bugzilla for lucene. What
should I do? Thank you!

Best wishes!



[jira] [Updated] (SOLR-4657) Failing OpenCloseCoreStressTest

2013-03-31 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-4657:
-

Attachment: SOLR-4657.patch

See if this takes care of dangling cores during the stress test.





Re: [JENKINS] Lucene-Solr-SmokeRelease-4.2.1 - Build # 11 - Still Failing

2013-03-31 Thread Robert Muir
On Sun, Mar 31, 2013 at 6:42 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> On Sun, Mar 31, 2013 at 6:23 PM, Robert Muir  wrote:
> > hmm there goes that theory. Maybe it's just a leftover process that didn't
> > get killed from a previous smoketester: I think to be safe the python
> code
> > should always terminate the server it starts with 'kill -9' and nothing
> > else!
>
> It does kill the server ... but there was a bug in that logic such
> that if the 30 minute startup wait elapsed it failed to kill it.
>
> Which is curious ... it seems to mean that the server took 30 minutes,
> didn't seem to start, but did in fact start (after 30 minutes) and
> bind to the port.
>

maybe it's related to the blackhole :)


Re: [JENKINS] Lucene-Solr-SmokeRelease-4.2.1 - Build # 11 - Still Failing

2013-03-31 Thread Michael McCandless
On Sun, Mar 31, 2013 at 6:23 PM, Robert Muir  wrote:
> hmm there goes that theory. Maybe it's just a leftover process that didn't
> get killed from a previous smoketester: I think to be safe the python code
> should always terminate the server it starts with 'kill -9' and nothing
> else!

It does kill the server ... but there was a bug in that logic such
that if the 30 minute startup wait elapsed it failed to kill it.

Which is curious ... it seems to mean that the server took 30 minutes,
didn't seem to start, but did in fact start (after 30 minutes) and
bind to the port.

But this could just be a stupid bug in smokeTester.  All it looks for
is "Started SocketConnector@0.0.0.0:8983" in the server's stderr
output.  Maybe this is too brittle?
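
The "always terminate what you start" fix being discussed can be sketched as 
follows. This is only an illustration (the actual smoke tester is a Python 
script, and the class and method names here are invented): start the process, 
wait a bounded time for the startup marker, and guarantee the kill in a 
finally block so a timeout can never leave the server bound to the port.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the actual smoke-tester code (which is Python).
class ServerStarter {
    static final String STARTUP_MARKER = "Started SocketConnector@0.0.0.0:8983";

    // Returns true if the marker appeared before the deadline; the started
    // process is force-killed on every exit path, including timeout.
    static boolean startAndCheck(ProcessBuilder pb, long timeoutMillis) throws Exception {
        pb.redirectErrorStream(true);  // the real tester scans stderr; merge streams here
        Process server = pb.start();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(server.getInputStream()))) {
            String line;
            // Note: readLine() can block past the deadline if the server goes
            // silent; a production version would read on a separate thread.
            while (System.currentTimeMillis() < deadline
                    && (line = r.readLine()) != null) {
                if (line.contains(STARTUP_MARKER)) {
                    return true;
                }
            }
            return false;  // timed out, or output ended without the marker
        } finally {
            // The bug described above: the old logic skipped this step when the
            // 30 minute wait elapsed, leaving the server holding the port.
            server.destroyForcibly();
            server.waitFor(10, TimeUnit.SECONDS);
        }
    }
}
```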

Mike McCandless

http://blog.mikemccandless.com




Re: [JENKINS] Lucene-Solr-SmokeRelease-4.2.1 - Build # 11 - Still Failing

2013-03-31 Thread Steve Rowe
I think there's only one executor on the lucene slave, though, so no concurrent 
jobs.

On Mar 31, 2013, at 6:15 PM, Robert Muir  wrote:

> Maybe the fact we now have this 4.2.1-SmokeRelease job (didn't the vote pass?) 
> created a situation where two smoke-testing jobs (e.g. 5.x and 4.2.1 or 
> something) were running concurrently and both tried to bind to the same port?
> 
> On Sun, Mar 31, 2013 at 5:29 PM, Michael McCandless 
>  wrote:
> Hmm cascading errors.  First, the 4.x smoke tester failed because
> Solr's example (java -jar start.jar) took more than 30 minutes to
> start:
> 
> https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.x/61/console
> 
> But then because of a bug in the smoke tester, it left this server
> running, which then caused future smoke testers to fail since they
> can't bind to the port.
> 
> So:
> 
>   * I killed the leftover "java -jar start.jar" on Jenkins
> 
>   * I'll fix smoke tester to not leave leftover processes
> 
> Still not sure why the Solr example took more than 30 minutes to start
> though ... could just be a bug in the smoke tester.
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> On Sun, Mar 31, 2013 at 4:43 PM, Apache Jenkins Server
>  wrote:
> > Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.2.1/11/
> >
> > No tests ran.
> >

Re: [JENKINS] Lucene-Solr-SmokeRelease-4.2.1 - Build # 11 - Still Failing

2013-03-31 Thread Steve Rowe
Good point.

I'll take down all the 4.2.1 jobs.

Steve

On Mar 31, 2013, at 6:15 PM, Robert Muir  wrote:

> Maybe the fact we now have this 4.2.1-SmokeRelease job (didn't the vote pass?) 
> created a situation where two smoke-testing jobs (e.g. 5.x and 4.2.1 or 
> something) were running concurrently and both tried to bind to the same port?
> 
> On Sun, Mar 31, 2013 at 5:29 PM, Michael McCandless 
>  wrote:
> Hmm cascading errors.  First, the 4.x smoke tester failed because
> Solr's example (java -jar start.jar) took more than 30 minutes to
> start:
> 
> https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.x/61/console
> 
> But then because of a bug in the smoke tester, it left this server
> running, which then caused future smoke testers to fail since they
> can't bind to the port.
> 
> So:
> 
>   * I killed the leftover "java -jar start.jar" on Jenkins
> 
>   * I'll fix smoke tester to not leave leftover processes
> 
> Still not sure why the Solr example took more than 30 minutes to start
> though ... could just be a bug in the smoke tester.
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> On Sun, Mar 31, 2013 at 4:43 PM, Apache Jenkins Server
>  wrote:
> > Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.2.1/11/
> >
> > No tests ran.
> >

[jira] [Commented] (LUCENE-4877) Fix analyzer factories to throw exception when arguments are invalid

2013-03-31 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618477#comment-13618477
 ] 

Michael McCandless commented on LUCENE-4877:


+1

> Fix analyzer factories to throw exception when arguments are invalid
> 
>
> Key: LUCENE-4877
> URL: https://issues.apache.org/jira/browse/LUCENE-4877
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Reporter: Robert Muir
> Attachments: LUCENE-4877_one_solution_prototype.patch, 
> LUCENE-4877.patch, LUCENE-4877.patch
>
>
> Currently if someone typos an argument "someParamater=xyz" instead of 
> someParameter=xyz, they get no exception and sometimes incorrect behavior.
> It would be way better if these factories threw exception on unknown params, 
> e.g. they removed the args they used and checked they were empty at the end.
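
The consume-and-check pattern described in the issue can be sketched as 
follows; the class name and argument handling here are illustrative only, 
not the actual Lucene analysis factory API:

```java
import java.util.Map;

// Illustrative sketch of "remove the args you use, then fail on leftovers";
// not the real Lucene analysis factory API.
class ArgCheckingFactory {
    final boolean ignoreCase;

    ArgCheckingFactory(Map<String, String> args) {
        // Map.remove() both reads and consumes the recognized argument.
        ignoreCase = Boolean.parseBoolean(args.remove("ignoreCase"));
        // Anything still in the map was never consumed: a typo or an
        // unsupported parameter, so fail loudly instead of silently ignoring it.
        if (!args.isEmpty()) {
            throw new IllegalArgumentException("Unknown parameters: " + args.keySet());
        }
    }
}
```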




[jira] [Commented] (LUCENE-4893) Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as FacetsCollector.getFacetResults is called.

2013-03-31 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618474#comment-13618474
 ] 

Michael McCandless commented on LUCENE-4893:


+1, thanks Shai!

> Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as 
> FacetsCollector.getFacetResults is called.
> 
>
> Key: LUCENE-4893
> URL: https://issues.apache.org/jira/browse/LUCENE-4893
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: 4.2
>Reporter: crocket
> Attachments: LUCENE-4893.patch, LUCENE-4893.patch, LUCENE-4893.patch, 
> LUCENE-4893.patch
>
>
> In lucene 4.1, only StandardFacetsAccumulator could be instantiated.
> And as of lucene 4.2, it became possible to instantiate FacetsAccumulator.
> I invoked FacetsCollector.getFacetResults twice, and I saw doubled facet 
> counts.
> If I invoke it three times, I see facet counts multiplied three times.
> It all happens in FacetsAccumulator.accumulate.
> StandardFacetsAccumulator doesn't have this bug since it frees facetArrays 
> whenever StandardFacetsAccumulator.accumulate is called.
> Is it a feature or a bug?




Re: [JENKINS] Lucene-Solr-SmokeRelease-4.2.1 - Build # 11 - Still Failing

2013-03-31 Thread Michael McCandless
Hmm cascading errors.  First, the 4.x smoke tester failed because
Solr's example (java -jar start.jar) took more than 30 minutes to
start:

https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.x/61/console

But then because of a bug in the smoke tester, it left this server
running, which then caused future smoke testers to fail since they
can't bind to the port.

So:

  * I killed the leftover "java -jar start.jar" on Jenkins

  * I'll fix smoke tester to not leave leftover processes

Still not sure why the Solr example took more than 30 minutes to start
though ... could just be a bug in the smoke tester.

Mike McCandless

http://blog.mikemccandless.com

On Sun, Mar 31, 2013 at 4:43 PM, Apache Jenkins Server
 wrote:
> Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.2.1/11/
>
> No tests ran.
>

[JENKINS] Lucene-Solr-SmokeRelease-4.2.1 - Build # 11 - Still Failing

2013-03-31 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.2.1/11/

No tests ran.

Build Log:
[...truncated 32490 lines...]
prepare-release-no-sign:
[mkdir] Created dir: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeRelease
 [copy] Copying 401 files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeRelease/lucene
 [copy] Copying 194 files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeRelease/solr
 [exec] JAVA6_HOME is /home/hudson/tools/java/latest1.6
 [exec] JAVA7_HOME is /home/hudson/tools/java/latest1.7
 [exec] NOTE: output encoding is US-ASCII
 [exec] 
 [exec] Load release URL 
"file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeRelease/"...
 [exec] 
 [exec] Test Lucene...
 [exec]   test basics...
 [exec]   get KEYS
 [exec] 0.1 MB
 [exec]   check changes HTML...
 [exec]   download lucene-4.2.1-src.tgz...
 [exec] 26.7 MB
 [exec] verify md5/sha1 digests
 [exec]   download lucene-4.2.1.tgz...
 [exec] 48.1 MB
 [exec] verify md5/sha1 digests
 [exec]   download lucene-4.2.1.zip...
 [exec] 57.7 MB
 [exec] verify md5/sha1 digests
 [exec]   unpack lucene-4.2.1.tgz...
 [exec] verify JAR/WAR metadata...
 [exec] test demo with 1.6...
 [exec]   got 5450 hits for query "lucene"
 [exec] test demo with 1.7...
 [exec]   got 5450 hits for query "lucene"
 [exec] check Lucene's javadoc JAR
 [exec]   unpack lucene-4.2.1.zip...
 [exec] verify JAR/WAR metadata...
 [exec] test demo with 1.6...
 [exec]   got 5450 hits for query "lucene"
 [exec] test demo with 1.7...
 [exec]   got 5450 hits for query "lucene"
 [exec] check Lucene's javadoc JAR
 [exec]   unpack lucene-4.2.1-src.tgz...
 [exec] make sure no JARs/WARs in src dist...
 [exec] run "ant validate"
 [exec] run tests w/ Java 6...
 [exec] test demo with 1.6...
 [exec]   got 222 hits for query "lucene"
 [exec] generate javadocs w/ Java 6...
 [exec] run tests w/ Java 7...
 [exec] test demo with 1.7...
 [exec]   got 222 hits for query "lucene"
 [exec] generate javadocs w/ Java 7...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [exec] 
 [exec] Test Solr...
 [exec]   test basics...
 [exec]   get KEYS
 [exec] 0.1 MB
 [exec]   check changes HTML...
 [exec]   download solr-4.2.1-src.tgz...
 [exec] 30.3 MB
 [exec] verify md5/sha1 digests
 [exec]   download solr-4.2.1.tgz...
 [exec] 111.0 MB
 [exec] verify md5/sha1 digests
 [exec]   download solr-4.2.1.zip...
 [exec] 115.5 MB
 [exec] verify md5/sha1 digests
 [exec]   unpack solr-4.2.1.tgz...
 [exec] verify JAR/WAR metadata...
 [exec]   **WARNING**: skipping check of 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeReleaseTmp/unpack/solr-4.2.1/contrib/dataimporthandler/lib/activation-1.1.jar:
 it has javax.* classes
 [exec]   **WARNING**: skipping check of 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeReleaseTmp/unpack/solr-4.2.1/contrib/dataimporthandler/lib/mail-1.4.1.jar:
 it has javax.* classes
 [exec] make sure WAR file has no javax.* or java.* classes...
 [exec] copying unpacked distribution for Java 6 ...
 [exec] test solr example w/ Java 6...
 [exec]   start Solr instance 
(log=/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeReleaseTmp/unpack/solr-4.2.1-java6/solr-example.log)...
 [exec] Startup failed; see log 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeReleaseTmp/unpack/solr-4.2.1-java6/solr-example.log
 [exec] 2013-03-31 20:43:07.748:INFO:oejs.Server:jetty-8.1.8.v20121106
 [exec] 2013-03-31 20:43:07.767:INFO:oejdp.ScanningAppProvider:Deployment 
monitor 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeReleaseTmp/unpack/solr-4.2.1-java6/example/contexts
 at interval 0
 [exec] 2013-03-31 20:43:07.772:INFO:oejd.DeploymentManager:Deployable 
added: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeReleaseTmp/unpack/solr-4.2.1-java6/example/contexts/solr-jetty-context.xml
 [exec] 2013-03-31 20:43:07.823:INFO:oejw.WebInfConfiguration:Extract 
jar:file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeReleaseTmp/unpack/solr-4.2.1-java6/example/webapps/solr.war!/
 to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.2.1/lucene/build/fakeReleaseTmp/unpack/solr-4.2

[jira] [Commented] (SOLR-3755) shard splitting

2013-03-31 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618464#comment-13618464
 ] 

Anshum Gupta commented on SOLR-3755:


I was trying to look into it, but strangely I haven't run into it over 15 
consecutive runs.

> shard splitting
> ---
>
> Key: SOLR-3755
> URL: https://issues.apache.org/jira/browse/SOLR-3755
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Yonik Seeley
> Attachments: SOLR-3755-combined.patch, 
> SOLR-3755-combinedWithReplication.patch, SOLR-3755-CoreAdmin.patch, 
> SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, 
> SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, SOLR-3755.patch, 
> SOLR-3755-testSplitter.patch, SOLR-3755-testSplitter.patch
>
>
> We can currently easily add replicas to handle increases in query volume, but 
> we should also add a way to add additional shards dynamically by splitting 
> existing shards.




[jira] [Commented] (LUCENE-4858) Early termination with SortingMergePolicy

2013-03-31 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618459#comment-13618459
 ] 

Shai Erera commented on LUCENE-4858:


Another way to fix addIndexes is to never merge, but rather behave like 
addIndexes(Directory) -- iterate on the leaves and call SegmentMerger.merge() on 
each one of them. The app can call maybeMerge afterwards. addIndexes(IndexReader) 
is intended, mostly I think, for filtering readers; otherwise the Directory 
version is much faster. Fixing addIndexes like that makes it consistent with 
the Directory version while still accomplishing its goal. Note that it does not 
address the 'sorted' issue, but as I wrote a couple of times already, the two are 
unrelated.

> Early termination with SortingMergePolicy
> -
>
> Key: LUCENE-4858
> URL: https://issues.apache.org/jira/browse/LUCENE-4858
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: 4.3
>
> Attachments: LUCENE-4858.patch, LUCENE-4858.patch
>
>
> Spin-off of LUCENE-4752, see 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13606565&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13606565
>  and 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13607282&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13607282
> When an index is sorted per-segment, queries that sort according to the index 
> sort order could be early terminated.




[jira] [Updated] (LUCENE-4893) Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as FacetsCollector.getFacetResults is called.

2013-03-31 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4893:
---

Attachment: LUCENE-4893.patch

Patch makes FacetsCollector cache the facet results, so .get is now a getter. 
reset() clears the cached results. Added additional test for reset().
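
The cache-then-reset behavior the patch describes can be sketched like this 
(names illustrative; the real FacetsCollector API differs):

```java
import java.util.Collections;
import java.util.List;

// Illustrative sketch of the caching getter + reset() described in the patch;
// not the actual FacetsCollector API.
class CachingResultsSketch {
    private List<String> cachedResults;  // null until first computed
    private int accumulateCount = 0;

    List<String> getFacetResults() {
        if (cachedResults == null) {
            // The expensive accumulation runs once; repeated calls no longer
            // re-accumulate (which is what multiplied the counts).
            accumulateCount++;
            cachedResults = Collections.singletonList("facet-count");
        }
        return cachedResults;
    }

    void reset() {
        cachedResults = null;  // the next get recomputes from scratch
    }

    int getAccumulateCount() { return accumulateCount; }
}
```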





[jira] [Commented] (LUCENE-4858) Early termination with SortingMergePolicy

2013-03-31 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618452#comment-13618452
 ] 

Shai Erera commented on LUCENE-4858:


I don't disagree, I just think that, completely orthogonal to this issue, 
addIndexes should respect IW's MP. I've been working with an app that relied on 
having segments no bigger than X GB (controlled by MP), and having addIndexes 
completely ignore these settings is a bug, at least to me. Who said that 
addIndexes must end up w/ a single segment? Anyway, this is something we should 
discuss on a separate issue, since it's unrelated to sorting. I'm going to 
reproduce this separately, and then open a dedicated issue. This one can wait 
or proceed independently.

bq. The current patch goes to great extremes to add more and more and more 
complexity at the index level but misses the forest for the trees.

I don't think so? It only attempts to protect the user from silently tripping 
over his own bugs. To the user, nothing really changes (except that his Sorter 
impls will need to define a sortKey or whatever ... big deal). The rest is 
completely opaque. I'm sure you won't object to protecting users from 
themselves, at least not when all we need is to make this tiny little change 
to OneMerge.

> Early termination with SortingMergePolicy
> -
>
> Key: LUCENE-4858
> URL: https://issues.apache.org/jira/browse/LUCENE-4858
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: 4.3
>
> Attachments: LUCENE-4858.patch, LUCENE-4858.patch
>
>
> Spin-off of LUCENE-4752, see 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13606565&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13606565
>  and 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13607282&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13607282
> When an index is sorted per-segment, queries that sort according to the index 
> sort order could be early terminated.




[jira] [Commented] (LUCENE-4893) Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as FacetsCollector.getFacetResults is called.

2013-03-31 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618413#comment-13618413
 ] 

Michael McCandless commented on LUCENE-4893:


I think caching the result (so .getXXX acts like a normal getter) is good?

> Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as 
> FacetsCollector.getFacetResults is called.
> 
>
> Key: LUCENE-4893
> URL: https://issues.apache.org/jira/browse/LUCENE-4893
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: 4.2
>Reporter: crocket
> Attachments: LUCENE-4893.patch, LUCENE-4893.patch, LUCENE-4893.patch
>
>
> In lucene 4.1, only StandardFacetsAccumulator could be instantiated.
> And as of lucene 4.2, it became possible to instantiate FacetsAccumulator.
> I invoked FacetsCollector.getFacetResults twice, and I saw doubled facet 
> counts.
> If I invoke it three times, I see facet counts multiplied three times.
> It all happens in FacetsAccumulator.accumulate.
> StandardFacetsAccumulator doesn't have this bug since it frees facetArrays 
> whenever StandardFacetsAccumulator.accumulate is called.
> Is it a feature or a bug?




[jira] [Commented] (LUCENE-4858) Early termination with SortingMergePolicy

2013-03-31 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618404#comment-13618404
 ] 

Robert Muir commented on LUCENE-4858:
-

Just to be clear, I'm not against allowing someone to trade off relevance for 
speed (even where the requested sort is e.g. by score), but I just think that 
it's way more important to make these tradeoffs clear in the APIs than to worry 
about expert things like addIndexes.

By having good APIs that are clear about this (e.g. a "safe" way and also an 
"unsafe" way) with good javadocs, it's more likely users will be happy and 
won't run into traps.

The stuff like addIndexes is still good, I just don't think it's as important in 
the big picture: it's so esoteric that it need not even be addressed in the 
initial commit. I know good APIs and javadocs aren't as sexy as adding more 
stuff to the index, but they're much more important here IMO.

> Early termination with SortingMergePolicy
> -
>
> Key: LUCENE-4858
> URL: https://issues.apache.org/jira/browse/LUCENE-4858
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: 4.3
>
> Attachments: LUCENE-4858.patch, LUCENE-4858.patch
>
>
> Spin-off of LUCENE-4752, see 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13606565&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13606565
>  and 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13607282&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13607282
> When an index is sorted per-segment, queries that sort according to the index 
> sort order could be early terminated.




[jira] [Commented] (LUCENE-4858) Early termination with SortingMergePolicy

2013-03-31 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618399#comment-13618399
 ] 

Robert Muir commented on LUCENE-4858:
-

{quote}
+1. This makes sense. We need to be as robust as possible. If a user makes a 
mistake, it's best if he can avoid tripping himself.
{quote}

Then step number 1 is not to futz around with adding more and more metadata to 
the index and changing the behavior of addIndexes and so on, but instead to 
enforce that the collector is only used when sorting by index order!

The current patch goes to great extremes to add more and more and more 
complexity at the index level but misses the forest for the trees.

> Early termination with SortingMergePolicy
> -
>
> Key: LUCENE-4858
> URL: https://issues.apache.org/jira/browse/LUCENE-4858
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: 4.3
>
> Attachments: LUCENE-4858.patch, LUCENE-4858.patch
>
>
> Spin-off of LUCENE-4752, see 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13606565&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13606565
>  and 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13607282&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13607282
> When an index is sorted per-segment, queries that sort according to the index 
> sort order could be early terminated.




[jira] [Commented] (LUCENE-4858) Early termination with SortingMergePolicy

2013-03-31 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618393#comment-13618393
 ] 

Shai Erera commented on LUCENE-4858:


bq. Maybe we could even go further and add an identifier of the Sorter which 
has been used to sort the segment

+1. This makes sense. We need to be as robust as possible. If a user makes a 
mistake, it's best if he can avoid tripping himself. It needs to be something 
unique, i.e. not just the sorter class, but e.g. for NumericDV also the field. 
Perhaps Sorter should have a sortKey? Then we record 
Sorter.class_Sorter.sortKey?
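That idea can be sketched with assumed key names (Lucene's actual diagnostics keys and classes may differ): record the sorter identity in the segment's diagnostics map, and have the query-time collector compare it against the sort it expects before early-terminating.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: identify a sorter as class name + sortKey (e.g. the
// DocValues field it sorts on), store that in the segment diagnostics, and
// check it before trusting the segment's sort order.
class SorterDiagnosticsSketch {
    static String sorterId(String sorterClass, String sortKey) {
        return sorterClass + "_" + sortKey;
    }

    static Map<String, String> segmentDiagnostics(String sorterClass, String sortKey) {
        Map<String, String> diagnostics = new HashMap<>();
        diagnostics.put("sorter", sorterId(sorterClass, sortKey)); // assumed key name
        return diagnostics;
    }

    // Early termination is only safe when the recorded sorter matches the
    // one the collector expects, so changing the index sort order (or the
    // sorted field) silently disables early termination instead of
    // returning wrong results.
    static boolean safeToEarlyTerminate(Map<String, String> diagnostics,
                                        String expectedSorterId) {
        return expectedSorterId.equals(diagnostics.get("sorter"));
    }
}
```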

I agree that addIndexes should use MergePolicy. Unlike the Directory version, 
which shallow-copies the segments, including whatever Diagnostics information 
they contain, the IR version uses SegmentMerger but bypasses the MP. So e.g. 
if the app uses TieredMP, limiting the merged segment size to 10 GB, you can 
addIndexes a 20-segment index totalling 100 GB and end up with a single 100 GB 
segment. That's ... unexpected.

So I think we need something on MP, maybe findMergesForAddIndexes... and then 
it will be easier to control how these indexes are added. If that's the 
direction, perhaps we do this in a different issue, as it's unrelated to 
sorting?

And, while diagnostics allow us to record sorted + sorter, we're still limited 
to SegmentReader. In practice this may not be a true limitation, but I feel 
that if AtomicReader exposed metadata(), like commitData() for the composite, 
it would give us more freedom. This collector does not need to be limited to 
SegmentReader only ... but I guess it's ok for now; at least, I know others 
don't like the idea of having metadata() on AR.

> Early termination with SortingMergePolicy
> -
>
> Key: LUCENE-4858
> URL: https://issues.apache.org/jira/browse/LUCENE-4858
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: 4.3
>
> Attachments: LUCENE-4858.patch, LUCENE-4858.patch
>
>
> Spin-off of LUCENE-4752, see 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13606565&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13606565
>  and 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13607282&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13607282
> When an index is sorted per-segment, queries that sort according to the index 
> sort order could be early terminated.




[jira] [Updated] (SOLR-4656) Add hl.maxMultiValuedToExamine to limit the number of multiValued entries examined while highlighting

2013-03-31 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-4656:
-

Attachment: SOLR-4656-4x.patch

When I was reconciling the patch for 4x I decremented the mvToMatch outside the 
for loop. Harmless since I wasn't looking at it any more, but unnecessary.

> Add hl.maxMultiValuedToExamine to limit the number of multiValued entries 
> examined while highlighting
> -
>
> Key: SOLR-4656
> URL: https://issues.apache.org/jira/browse/SOLR-4656
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Affects Versions: 4.3, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: SOLR-4656-4x.patch, SOLR-4656-4x.patch, SOLR-4656.patch, 
> SOLR-4656-trunk.patch
>
>
> I'm looking at an admittedly pathological case of many, many entries in a 
> multiValued field, and trying to implement a way to limit the number 
> examined, analogous to maxAnalyzedChars, see the patch.
> Along the way, I noticed that we do what looks like unnecessary copying of 
> the fields to be examined. We call Document.getFields, which copies all of 
> the fields and values to the returned array. Then we copy all of those to 
> another array, converting them to Strings. Then we actually examine them. a> 
> this doesn't seem very efficient and b> reduces the benefit from limiting the 
> number of mv values examined.
> So the attached does two things:
> 1> attempts to fix this
> 2> implements hl.maxMultiValuedToExamine
> I'd _really_ love it if someone who knows the highlighting code takes a peek 
> at the fix to see if I've messed things up, the changes are actually pretty 
> minimal.




[jira] [Commented] (SOLR-4656) Add hl.maxMultiValuedToExamine to limit the number of multiValued entries examined while highlighting

2013-03-31 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618367#comment-13618367
 ] 

Erick Erickson commented on SOLR-4656:
--

I plan on committing this Tuesday or so unless there are objections

hl.maxMultiValuedToMatch   - stops examining the values in a multiValued field 
after N matches are found. Defaults to Integer.MAX_VALUE.

hl.maxMultiValuedToExamine - stops examining the values in a multiValued field 
after N values are examined, regardless of how many matches have been found (no 
matches is perfectly reasonable). Defaults to Integer.MAX_VALUE.

If both are specified, the first condition met stops the comparisons.
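A minimal sketch of that interaction, using assumed names and plain substring matching in place of the highlighter's real match logic:

```java
import java.util.List;

// Illustrative sketch: scan a multiValued field's values, stopping at
// whichever limit is hit first - values examined or matches found.
class MultiValuedLimitsSketch {
    static int examined(List<String> values, String term,
                        int maxToExamine, int maxToMatch) {
        int examinedCount = 0;
        int matchCount = 0;
        for (String value : values) {
            if (examinedCount >= maxToExamine) {
                break; // hl.maxMultiValuedToExamine reached
            }
            examinedCount++;
            if (value.contains(term) && ++matchCount >= maxToMatch) {
                break; // hl.maxMultiValuedToMatch reached
            }
        }
        return examinedCount;
    }
}
```

With maxToExamine=3 the scan stops after three values even if nothing matched; with maxToMatch=2 it stops as soon as the second match is found.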

The patch also restructures traversing the fields in the document so we aren't 
copying things around so much, I'd particularly like someone to glance at that 
code. All tests pass, but a second set of eyes would be welcome.

Also along the way I found this parameter that I'd never seen before: 
hl.preserveMulti and added it to the highlight parameter page 
(http://wiki.apache.org/solr/HighlightingParameters) with the explanation from 
a comment in the code, some clarification there might be a good thing.

Fortunately, the changes are actually relatively minor, most of the bulk of the 
patch is additional tests.

> Add hl.maxMultiValuedToExamine to limit the number of multiValued entries 
> examined while highlighting
> -
>
> Key: SOLR-4656
> URL: https://issues.apache.org/jira/browse/SOLR-4656
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Affects Versions: 4.3, 5.0
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Minor
> Attachments: SOLR-4656-4x.patch, SOLR-4656.patch, 
> SOLR-4656-trunk.patch
>
>
> I'm looking at an admittedly pathological case of many, many entries in a 
> multiValued field, and trying to implement a way to limit the number 
> examined, analogous to maxAnalyzedChars, see the patch.
> Along the way, I noticed that we do what looks like unnecessary copying of 
> the fields to be examined. We call Document.getFields, which copies all of 
> the fields and values to the returned array. Then we copy all of those to 
> another array, converting them to Strings. Then we actually examine them. a> 
> this doesn't seem very efficient and b> reduces the benefit from limiting the 
> number of mv values examined.
> So the attached does two things:
> 1> attempts to fix this
> 2> implements hl.maxMultiValuedToExamine
> I'd _really_ love it if someone who knows the highlighting code takes a peek 
> at the fix to see if I've messed things up, the changes are actually pretty 
> minimal.




[jira] [Created] (SOLR-4657) Failing OpenCloseCoreStressTest

2013-03-31 Thread Erick Erickson (JIRA)
Erick Erickson created SOLR-4657:


 Summary: Failing OpenCloseCoreStressTest
 Key: SOLR-4657
 URL: https://issues.apache.org/jira/browse/SOLR-4657
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.3, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor


I have an idea what's happening with the test that apparently doesn't close a 
core, but it'll be something of a hit-or-miss process to fix it as it looks 
timing related. Might have a patch up later today, other commitments willing.




Re: New implementation of MLT

2013-03-31 Thread Erick Erickson
Gagan:

Absolutely open up a JIRA and attach a patch!

Erick

On Sun, Mar 31, 2013 at 1:18 AM, Gagandeep singh  wrote:
> Hi folks
>
> We started using the default implementation of MLT
> (org.apache.solr.handler.MoreLikeThisHandler) recently and found that there
> are a couple of things it lacks:
>
> Searching for terms in the same field as the original document:
>
> the current implementation picks the top field to search an interesting term
> in based on docFreq, however this can give bad results if say original
> product is from brand:"RED Valentino", and we end up searching red in color
> field.
>
> Phrase boosts:
>
> if product name is "business cards", then it makes sense to give a phrase
> boost to products which are also business cards.
>
> Support for bq, bf, fq, multiplicative boost:
>
> you might want to filter out_of_stock products, give a multiplicative boost
> to a product based on their price similarity / launch date.
>
> Support of explainOther
>
> We had a use case for each of these and i ended up writing my own
> MLTQueryParser which builds the MLT query for a given document. It also has
> a new concept called childDocs. You can think of some documents as products,
> and a collection of products can be thought of as a category page. You could
> search for similar documents based on the products a category page has.
>
> I was wondering if you guys would be interested in an alternate
> implementation of MLT that supports all the knobs that solr search does. I
> could post a patch file maybe?
>
> Thanks
> Gagan
>




Orphaned feature: commitIntervalLowerBound?

2013-03-31 Thread Jack Krupansky
I see code in SolrConfig.java to parse and store something called 
“commitIntervalLowerBound”, part of the  configuration, but it 
doesn’t appear to be used in the actual update handler. Is this a partially 
implemented feature that should be preserved for future implementation, or the 
lingering remains of an old feature/idea that should be removed?

If it should be preserved, somebody should come up with a TODO that explains 
why and what it is there for.

I find no reference to it in Jira, but a few references in the mail archives.

It sounds like it should be removed.

-- Jack Krupansky

[jira] [Commented] (LUCENE-4858) Early termination with SortingMergePolicy

2013-03-31 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618349#comment-13618349
 ] 

Adrien Grand commented on LUCENE-4858:
--

I like making it more robust by ensuring the segment is sorted and not only the 
result from a merge! Maybe we could even go further and add an identifier of 
the Sorter which has been used to sort the segment. This way, early query 
termination could keep working correctly even if the user decides to change the 
sort order of the index?

bq. This still does not address addIndexes. I think it will be good if we can 
have a SortingEarlyTerminationCollector which works with both modes. I'll try 
that later.

I'm still thinking about whether IndexWriter should sort the provided readers 
in addIndexes. On the one hand, I understand that given that the user can wrap 
the readers with a SortingAtomicReader himself, there is little added value in 
sorting the readers in addIndexes. But on the other hand, this method "merges 
the provided indexes into this index" (quoting the javadocs) so not sorting the 
readers while a SortingMergePolicy is used feels like the MergePolicy is being 
bypassed. So net/net I think I prefer making addIndexes sort the readers and 
have a dedicated method in MergePolicy to handle addIndexes? (And this would 
make it easy to add additional diagnostics to the segments resulting from a 
call to addIndexes.)

> Early termination with SortingMergePolicy
> -
>
> Key: LUCENE-4858
> URL: https://issues.apache.org/jira/browse/LUCENE-4858
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: 4.3
>
> Attachments: LUCENE-4858.patch, LUCENE-4858.patch
>
>
> Spin-off of LUCENE-4752, see 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13606565&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13606565
>  and 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13607282&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13607282
> When an index is sorted per-segment, queries that sort according to the index 
> sort order could be early terminated.




[JENKINS] Lucene-Solr-SmokeRelease-trunk - Build # 69 - Still Failing

2013-03-31 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-trunk/69/

No tests ran.

Build Log:
[...truncated 33170 lines...]
prepare-release-no-sign:
[mkdir] Created dir: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeRelease
 [copy] Copying 401 files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeRelease/lucene
 [copy] Copying 194 files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeRelease/solr
 [exec] JAVA7_HOME is /home/hudson/tools/java/latest1.7
 [exec] NOTE: output encoding is US-ASCII
 [exec] 
 [exec] Load release URL 
"file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeRelease/"...
 [exec] 
 [exec] Test Lucene...
 [exec]   test basics...
 [exec]   get KEYS
 [exec] 0.1 MB
 [exec]   check changes HTML...
 [exec]   download lucene-5.0.0-src.tgz...
 [exec] 26.4 MB
 [exec] verify md5/sha1 digests
 [exec]   download lucene-5.0.0.tgz...
 [exec] 48.1 MB
 [exec] verify md5/sha1 digests
 [exec]   download lucene-5.0.0.zip...
 [exec] 57.3 MB
 [exec] verify md5/sha1 digests
 [exec]   unpack lucene-5.0.0.tgz...
 [exec] verify JAR/WAR metadata...
 [exec] test demo with 1.7...
 [exec]   got 5394 hits for query "lucene"
 [exec] check Lucene's javadoc JAR
 [exec]   unpack lucene-5.0.0.zip...
 [exec] verify JAR/WAR metadata...
 [exec] test demo with 1.7...
 [exec]   got 5394 hits for query "lucene"
 [exec] check Lucene's javadoc JAR
 [exec]   unpack lucene-5.0.0-src.tgz...
 [exec] make sure no JARs/WARs in src dist...
 [exec] run "ant validate"
 [exec] run tests w/ Java 7...
 [exec] test demo with 1.7...
 [exec]   got 209 hits for query "lucene"
 [exec] generate javadocs w/ Java 7...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [exec] 
 [exec] Test Solr...
 [exec]   test basics...
 [exec]   get KEYS
 [exec] 0.1 MB
 [exec]   check changes HTML...
 [exec]   download solr-5.0.0-src.tgz...
 [exec] 30.1 MB
 [exec] verify md5/sha1 digests
 [exec]   download solr-5.0.0.tgz...
 [exec] 111.2 MB
 [exec] verify md5/sha1 digests
 [exec]   download solr-5.0.0.zip...
 [exec] 115.6 MB
 [exec] verify md5/sha1 digests
 [exec]   unpack solr-5.0.0.tgz...
 [exec] verify JAR/WAR metadata...
 [exec]   **WARNING**: skipping check of 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeReleaseTmp/unpack/solr-5.0.0/contrib/dataimporthandler/lib/activation-1.1.jar:
 it has javax.* classes
 [exec]   **WARNING**: skipping check of 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeReleaseTmp/unpack/solr-5.0.0/contrib/dataimporthandler/lib/mail-1.4.1.jar:
 it has javax.* classes
 [exec] make sure WAR file has no javax.* or java.* classes...
 [exec] copying unpacked distribution for Java 7 ...
 [exec] test solr example w/ Java 7...
 [exec]   start Solr instance 
(log=/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeReleaseTmp/unpack/solr-5.0.0-java7/solr-example.log)...
 [exec] Startup failed; see log 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-trunk/lucene/build/fakeReleaseTmp/unpack/solr-5.0.0-java7/solr-example.log
 [exec] Null identity service, trying login service: null
 [exec] Finding identity service: null
 [exec] java.lang.reflect.InvocationTargetException
 [exec] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [exec] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 [exec] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 [exec] at java.lang.reflect.Method.invoke(Method.java:601)
 [exec] at org.eclipse.jetty.start.Main.invokeMain(Main.java:472)
 [exec] at org.eclipse.jetty.start.Main.start(Main.java:620)
 [exec] at org.eclipse.jetty.start.Main.main(Main.java:95)
 [exec] Caused by: java.net.BindException: Address already in use
 [exec] at java.net.PlainSocketImpl.socketBind(Native Method)
 [exec] at 
java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)
 [exec] at java.net.ServerSocket.bind(ServerSocket.java:376)
 [exec] at java.net.ServerSocket.<init>(ServerSocket.java:237)
 [exec] at java.net.ServerSocket.<init>(ServerSocket.java:181)
 [exec] at 
org.eclipse.jetty.server.bio.SocketConnector.newServerSocket(SocketConnector.java:96)
 [exec] at 
org.eclipse.jetty.server.bio.SocketConnector.open(S

[jira] [Updated] (LUCENE-4893) Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as FacetsCollector.getFacetResults is called.

2013-03-31 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4893:
---

Attachment: LUCENE-4893.patch

I still haven't fixed the javadocs; this patch throws IllegalStateException if 
getFacetResults is called more than once, or if no search was executed. But this 
causes TestDrillSideways.testBasic to fail, because DrillSideways (line 168) 
assumes it can call getFacetResults() even if the scorer it got was null.

I wonder what's the best course of action - track in FacetsCollector only the 
case where getFacetResults was called more than once, or simply cache the 
List<FacetResult> and return it in .get() if it isn't null. An exception now 
seems too obtrusive to me ...

> Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as 
> FacetsCollector.getFacetResults is called.
> 
>
> Key: LUCENE-4893
> URL: https://issues.apache.org/jira/browse/LUCENE-4893
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: 4.2
>Reporter: crocket
> Attachments: LUCENE-4893.patch, LUCENE-4893.patch, LUCENE-4893.patch
>
>
> In lucene 4.1, only StandardFacetsAccumulator could be instantiated.
> And as of lucene 4.2, it became possible to instantiate FacetsAccumulator.
> I invoked FacetsCollector.getFacetResults twice, and I saw doubled facet 
> counts.
> If I invoke it three times, I see facet counts multiplied three times.
> It all happens in FacetsAccumulator.accumulate.
> StandardFacetsAccumulator doesn't have this bug since it frees facetArrays 
> whenever StandardFacetsAccumulator.accumulate is called.
> Is it a feature or a bug?




[jira] [Commented] (LUCENE-4893) Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as FacetsCollector.getFacetResults is called.

2013-03-31 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618324#comment-13618324
 ] 

Shai Erera commented on LUCENE-4893:


That would mean FacetsCollector will need to track whether getFacetResults was 
called or not, and distinguish that from "no results were found". I guess it 
can be done by having matchingDocs set to null by getFacetResults(), and 
initialized in setNextReader, so getFacetResults can check if matchingDocs is 
null and throw IllegalStateException indicating that no search has been 
performed yet (or not since the last call to getFacetResults). TopDocsCollector 
can be fixed like that too, but I'd prefer to do that in a separate issue.
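That bookkeeping can be sketched as follows, with assumed method shapes (the real FacetsCollector is driven by IndexSearcher rather than called directly):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: matchingDocs is nulled by the getter and re-created
// when a new search starts (setNextReader), so calling the getter twice
// without an intervening search fails fast instead of returning bogus counts.
class ThrowOnReuseCollectorSketch {
    private List<Integer> matchingDocs;

    // Called by the search machinery at the start of each segment.
    void setNextReader() {
        if (matchingDocs == null) {
            matchingDocs = new ArrayList<>();
        }
    }

    void collect(int doc) {
        matchingDocs.add(doc);
    }

    List<Integer> getFacetResults() {
        if (matchingDocs == null) {
            throw new IllegalStateException(
                "no search was executed since the last call to getFacetResults()");
        }
        List<Integer> results = matchingDocs;
        matchingDocs = null; // require a fresh search before the next call
        return results;
    }
}
```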

> Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as 
> FacetsCollector.getFacetResults is called.
> 
>
> Key: LUCENE-4893
> URL: https://issues.apache.org/jira/browse/LUCENE-4893
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/facet
>Affects Versions: 4.2
>Reporter: crocket
> Attachments: LUCENE-4893.patch, LUCENE-4893.patch
>
>
> In lucene 4.1, only StandardFacetsAccumulator could be instantiated.
> And as of lucene 4.2, it became possible to instantiate FacetsAccumulator.
> I invoked FacetsCollector.getFacetResults twice, and I saw doubled facet 
> counts.
> If I invoke it three times, I see facet counts multiplied three times.
> It all happens in FacetsAccumulator.accumulate.
> StandardFacetsAccumulator doesn't have this bug since it frees facetArrays 
> whenever StandardFacetsAccumulator.accumulate is called.
> Is it a feature or a bug?




[jira] [Commented] (LUCENE-4893) Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as FacetsCollector.getFacetResults is called.

2013-03-31 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618319#comment-13618319
 ] 

Michael McCandless commented on LUCENE-4893:


bq. I chose not to throw an exception because TopDocsCollector returns an empty 
TopDocs if called twice.

Actually I think it's bad for TopDocsCollector to do this: it's trappy. I think 
users don't hit it because typically it's IndexSearcher.search that calls 
this method and returns the TopDocs.

I'd rather fix both of these classes to throw an exception if you call their 
"getter" methods more than once than have them silently pretend that the 2nd 
time there were no results.




[jira] [Updated] (LUCENE-4893) Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as FacetsCollector.getFacetResults is called.

2013-03-31 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4893:
---

Attachment: LUCENE-4893.patch

Thanks crocket. I found a typo in the test's comment, so if you meant another 
one, please specify which file contains the typo. I also improved the 
FacetsCollector.getFacetResults documentation.





[jira] [Commented] (LUCENE-4893) Facet counts in FacetsAccumulator.facetArrays are multiplied as many times as FacetsCollector.getFacetResults is called.

2013-03-31 Thread crocket (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618308#comment-13618308
 ] 

crocket commented on LUCENE-4893:
-

LUCENE-4893.patch has some typos in comments.




[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests

2013-03-31 Thread Tim Vaillancourt (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618289#comment-13618289
 ] 

Tim Vaillancourt commented on SOLR-4470:


Scratch my last suggestion; I now see the conditions where credentials are 
needed but not provided by a super-request. In my case I am only restricting 
/admin/*, which I believe is only used by super-requests, however.

I think keeping the credentials in a property file would resolve my concerns 
about them appearing on the JVM command line. I'll see if I can get that to work.
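The property-file idea could be sketched roughly as below. This is an assumption-laden illustration, not part of the actual patch: the file path and the property keys (internalAuthUser, internalAuthPassword) are invented for the example. The point is that credentials read from a file are not visible in `ps` output the way `-D` JVM arguments are.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Properties;

public class InternalAuthCredentials {
    // Builds the Authorization header value for internal basic-auth
    // requests from a properties file instead of JVM command-line args.
    public static String basicAuthHeader(String propsPath) throws IOException {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(propsPath)) {
            props.load(in);
        }
        String user = props.getProperty("internalAuthUser");
        String pass = props.getProperty("internalAuthPassword");
        if (user == null || pass == null) {
            throw new IllegalStateException("credentials missing from " + propsPath);
        }
        String token = Base64.getEncoder()
            .encodeToString((user + ":" + pass).getBytes(StandardCharsets.UTF_8));
        return "Basic " + token;
    }
}
```

The file itself would still need restrictive filesystem permissions, but at least the secret no longer leaks through the process table.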

> Support for basic http auth in internal solr requests
> -
>
> Key: SOLR-4470
> URL: https://issues.apache.org/jira/browse/SOLR-4470
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java, multicore, replication (java), SolrCloud
>Affects Versions: 4.0
>Reporter: Per Steffensen
>  Labels: authentication, https, solrclient, solrcloud, ssl
> Fix For: 4.3
>
> Attachments: SOLR-4470_branch_4x_r1452629.patch, 
> SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, 
> SOLR-4470.patch
>
>
> We want to protect any HTTP resource (URL). We want to require credentials no 
> matter what kind of HTTP request you make to a Solr node.
> It can fairly easily be achieved as described on 
> http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr nodes 
> also make "internal" requests to other Solr nodes, and for those to work, 
> credentials need to be provided as well.
> Ideally we would like to "forward" the credentials from a particular request to 
> all the "internal" sub-requests it triggers, e.g. for search and update 
> requests.
> But there are also "internal" requests
> * that are only indirectly/asynchronously triggered by "outside" requests (e.g. 
> shard creation/deletion/etc. based on calls to the "Collection API")
> * that have no relation at all to an "outside" "super"-request (e.g. 
> replica syncing)
> We would like to aim at a solution where the "original" credentials are 
> "forwarded" when a request directly/synchronously triggers a sub-request, with a 
> fallback to configured "internal credentials" for the 
> asynchronous/non-rooted requests.
> In our solution we would aim at only supporting basic HTTP auth, but we would 
> like to build a "framework" around it, so that not too much refactoring is 
> needed if you later want to add support for other kinds of auth (e.g. digest).
> We will work on a solution, but are creating this JIRA issue early in order to get 
> input/comments from the community as early as possible.




[jira] [Updated] (LUCENE-4858) Early termination with SortingMergePolicy

2013-03-31 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4858:
---

Attachment: LUCENE-4858.patch

Here's a quick patch that adds OneMerge.setInfo, which SortingOneMerge overrides 
to add a 'sorted' property. SortingEarlyTerminationCollector was modified to read 
that property instead of SOURCE. 'core' and 'misc' tests pass.

This still does not address addIndexes. I think it would be good if we could have 
a SortingEarlyTerminationCollector that works with both modes. I'll try that 
later.

> Early termination with SortingMergePolicy
> -
>
> Key: LUCENE-4858
> URL: https://issues.apache.org/jira/browse/LUCENE-4858
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Fix For: 4.3
>
> Attachments: LUCENE-4858.patch, LUCENE-4858.patch
>
>
> Spin-off of LUCENE-4752, see 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13606565&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13606565
>  and 
> https://issues.apache.org/jira/browse/LUCENE-4752?focusedCommentId=13607282&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13607282
> When an index is sorted per-segment, queries that sort according to the index 
> sort order could be early terminated.




[jira] [Commented] (LUCENE-4858) Early termination with SortingMergePolicy

2013-03-31 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618280#comment-13618280
 ] 

Shai Erera commented on LUCENE-4858:


Also, I don't think this works with addIndexes, right? So if someone 
follows the SortingAtomicReader addIndexes javadoc example, they cannot use this 
collector.




[jira] [Commented] (LUCENE-4858) Early termination with SortingMergePolicy

2013-03-31 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618274#comment-13618274
 ] 

Shai Erera commented on LUCENE-4858:


Patch looks good. I wonder if we can make the 'sorted' decision more specific, 
though. E.g., if OneMerge.info were not assigned directly by IndexWriter, but 
rather IW called OneMerge.setInfo() and SortingOneMerge added another property, 
'sorted=true', to the diagnostics, then the collector would be more 
robust -- right now you can use it with unsorted segments and silently 
get wrong results. Even though this is documented, I feel that with this tiny 
hook we can make it work correctly, without tripping up users who don't read 
javadocs. What do you think?
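The proposed hook can be sketched as follows. This is an illustrative mock, not committed Lucene code: the class names mirror the proposal (OneMerge, SortingOneMerge) but the bodies are simplified stand-ins, and the diagnostics map is plain java.util here.

```java
import java.util.HashMap;
import java.util.Map;

class OneMerge {
    protected Map<String, String> diagnostics;

    // The proposed hook: IndexWriter would call this instead of
    // assigning the merged segment's info/diagnostics directly.
    public void setInfo(Map<String, String> diagnostics) {
        this.diagnostics = diagnostics;
    }
}

class SortingOneMerge extends OneMerge {
    @Override
    public void setInfo(Map<String, String> diagnostics) {
        // Tag segments produced by the sorting merge so consumers can
        // tell them apart from ordinary (unsorted) merged segments.
        diagnostics.put("sorted", "true");
        super.setInfo(diagnostics);
    }
}

public class SortedCheck {
    // What an early-termination collector would test per segment before
    // deciding it is safe to terminate early.
    public static boolean isSorted(Map<String, String> diagnostics) {
        return "true".equals(diagnostics.get("sorted"));
    }
}
```

Because only SortingOneMerge writes the marker, a collector checking it would refuse to early-terminate on unsorted segments instead of silently returning wrong results.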
