[jira] [Commented] (SOLR-4237) Implement index aliasing

2013-03-15 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603174#comment-13603174
 ] 

Mark Miller commented on SOLR-4237:
---

I couldn't say - it depends on what you intended to implement for index aliases 
here. I have very specific thoughts about what I intend for collection and 
shard aliasing. 

 Implement index aliasing
 

 Key: SOLR-4237
 URL: https://issues.apache.org/jira/browse/SOLR-4237
 Project: Solr
  Issue Type: New Feature
Reporter: Otis Gospodnetic
 Fix For: 4.3


 This is handy for searching log indices and in all other situations where 
 indices are added (and possibly deleted) over time.  Index aliasing allows 
 one to map an arbitrary set of indices to an alias and avoid needing to 
 change the search application to point it to new indices.
 See http://search-lucene.com/m/YBn4w1UAbEB
 It may also be worth thinking about using aliases when indexing.  This 
 question comes up once in a while on the ElasticSearch mailing list for 
 example.
 See 
 http://search-lucene.com/?q=index+time+alias&fc_project=ElasticSearch&fc_type=mail+_hash_+user
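The indirection described above can be sketched in a few lines. This is an illustrative model only, not Solr's implementation; `AliasRegistry`, `createAlias`, and `resolve` are hypothetical names:

```java
import java.util.*;

// Illustrative sketch: the core idea of index aliasing is a level of
// indirection resolved at request time, so the search application keeps
// querying a stable alias name while the underlying set of indices changes.
class AliasRegistry {
    private final Map<String, List<String>> aliases = new HashMap<>();

    // Point an alias at an arbitrary set of indices; re-pointing takes
    // effect on the next request because resolution happens per request.
    void createAlias(String alias, List<String> indices) {
        aliases.put(alias, new ArrayList<>(indices));
    }

    // Resolve a name: an alias expands to its indices, anything else is
    // assumed to be a concrete index name.
    List<String> resolve(String name) {
        List<String> target = aliases.get(name);
        return target != null ? Collections.unmodifiableList(target)
                              : Collections.singletonList(name);
    }
}
```

With this shape, rolling to a new day's log index is a single `createAlias` call; the search application never changes.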

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4830) Sorter API: use an abstract doc map instead of an array

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603203#comment-13603203
 ] 

Commit Tag Bot commented on LUCENE-4830:


[trunk commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456787

LUCENE-4830: Sorter API: Make the doc ID mapping an abstract class.



 Sorter API: use an abstract doc map instead of an array
 ---

 Key: LUCENE-4830
 URL: https://issues.apache.org/jira/browse/LUCENE-4830
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Trivial
 Fix For: 4.3

 Attachments: LUCENE-4830.patch


 The sorter API uses arrays to store the old-new and new-old doc IDs 
 mappings. It should rather be an abstract class given that in some cases an 
 array is not required at all (reverse mapping for example).
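The idea can be sketched as follows. This is a simplified model, not Lucene's actual class; the method and factory names are illustrative:

```java
// Simplified sketch: expose the old->new / new->old doc ID mappings behind
// an abstract class so implementations that can compute the mapping
// (e.g. a reversal) need no backing array.
abstract class DocMap {
    abstract int oldToNew(int oldDocID);
    abstract int newToOld(int newDocID);

    // Array-free reverse mapping over maxDoc documents.
    static DocMap reverse(final int maxDoc) {
        return new DocMap() {
            @Override int oldToNew(int oldDocID) { return maxDoc - 1 - oldDocID; }
            @Override int newToOld(int newDocID) { return maxDoc - 1 - newDocID; }
        };
    }

    // Array-backed mapping for arbitrary permutations.
    static DocMap fromArray(final int[] oldToNew) {
        final int[] newToOld = new int[oldToNew.length];
        for (int i = 0; i < oldToNew.length; i++) newToOld[oldToNew[i]] = i;
        return new DocMap() {
            @Override int oldToNew(int oldDocID) { return oldToNew[oldDocID]; }
            @Override int newToOld(int newDocID) { return newToOld[newDocID]; }
        };
    }
}
```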




Re: Protecting content in zookeeper

2013-03-15 Thread Per Steffensen

On 3/14/13 5:21 PM, Mark Miller wrote:

On Mar 14, 2013, at 3:16 AM, Per Steffensen st...@designware.dk wrote:


Even though you do not share zookeeper you might want to set up permissions 
anyway, but never mind.

That's just the only reason I care about. Otherwise, I'm of a similar mind with 
ZooKeeper security as I am with Solr - lock it up behind closed doors and only 
allow trusted access. Makes things simpler for us.

The problem is that it's really nice to only have to run one ZooKeeper for many 
services. And in that case, it's really nice to ensure they won't interfere 
with each other due to bugs or misconfiguration.

So for that reason, I'd support this change. The other reasons really don't 
sway me at all.
That's cool. I understand your points. It's just that my customer is very, 
very paranoid - like CIA-level paranoid. We do not even trust people 
with access to the actual machines running Solr or ZK (at least not all 
of them, depending on how many stars are on their shoulders), and we 
need to run CloudSolrServers in an environment where we trust people 
even less. I know this is probably not a feature that will be used by 
many people, but what the heck: if it's transparent and the default is no 
protection, no harm done supporting it. And it will not be a lot of code.
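For digest-based ZooKeeper ACLs, as I understand the scheme, a user is identified as `user:base64(SHA-1(user:password))`. A JDK-only sketch of computing such an id (`DigestId` is a hypothetical helper name; attaching the resulting ACL to znodes would go through the ZooKeeper client API):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Sketch: compute a ZooKeeper-style digest ACL id from "user:password".
// Assumption: this mirrors the digest scheme's base64(SHA-1(idPassword))
// construction; verify against the ZooKeeper version in use.
class DigestId {
    static String generate(String userPassword) {
        try {
            String user = userPassword.split(":", 2)[0];
            byte[] sha = MessageDigest.getInstance("SHA-1")
                    .digest(userPassword.getBytes(StandardCharsets.UTF_8));
            return user + ":" + Base64.getEncoder().encodeToString(sha);
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e); // SHA-1 is always available
        }
    }
}
```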


Regards, Steff




[jira] [Commented] (LUCENE-4830) Sorter API: use an abstract doc map instead of an array

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603207#comment-13603207
 ] 

Commit Tag Bot commented on LUCENE-4830:


[branch_4x commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456789

LUCENE-4830: Sorter API: Make the doc ID mapping an abstract class (merged from 
r1456787).






[jira] [Commented] (LUCENE-4830) Sorter API: use an abstract doc map instead of an array

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603211#comment-13603211
 ] 

Commit Tag Bot commented on LUCENE-4830:


[trunk commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456796

LUCENE-4830: Add missing @Override.






[jira] [Resolved] (LUCENE-4830) Sorter API: use an abstract doc map instead of an array

2013-03-15 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-4830.
--

Resolution: Fixed

Thank you for the review, Shai!




[jira] [Commented] (LUCENE-4830) Sorter API: use an abstract doc map instead of an array

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603215#comment-13603215
 ] 

Commit Tag Bot commented on LUCENE-4830:


[branch_4x commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456797

LUCENE-4830: Add missing @Override (merged from r1456796).






[jira] [Updated] (SOLR-3706) Ship setup to log with log4j.

2013-03-15 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley updated SOLR-3706:


Attachment: SOLR-3706-solr-log4j.patch

This patch switches things to log4j.

I'm not wild about adding log4j to the .war, but that is the closest to what we 
currently do with JUL.

Docs and sample configs would need to be updated too... 


 Ship setup to log with log4j.
 -

 Key: SOLR-3706
 URL: https://issues.apache.org/jira/browse/SOLR-3706
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-3706-solr-log4j.patch


 Currently we default to java util logging and it's terrible in my opinion.
 * Its simple built-in logger is a 2-line logger.
 * You have to jump through hoops to use your own custom formatter with jetty - 
 either putting your class in the start.jar or other pain-in-the-butt 
 solutions.
 * It can't roll files by date out of the box.
 I'm sure there are more issues, but those are the ones annoying me now. We 
 should switch to log4j - it's much nicer and it's easy to get a nice single 
 line format and roll by date, etc.
 If someone wants to use JUL they still can - but at least users could start 
 with something decent.




[jira] [Resolved] (SOLR-3358) Capture Logging Events from JUL and Log4j

2013-03-15 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-3358.
-

Resolution: Duplicate

Closing this issue; the work will be included in SOLR-3706.

 Capture Logging Events from JUL and Log4j
 -

 Key: SOLR-3358
 URL: https://issues.apache.org/jira/browse/SOLR-3358
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Attachments: SOLR-3358-compile-path.patch, SOLR-3358-logging.patch, 
 SOLR-3358-logging.patch


 The UI should be able to show the last few log messages.  To support this, we 
 will need to register an Appender (log4j) or Handler
 (JUL) and keep a buffer of recent log events.




[jira] [Resolved] (SOLR-3399) distribute/assume log4j logging rather then JUL

2013-03-15 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-3399.
-

Resolution: Duplicate

 distribute/assume log4j logging rather then JUL
 ---

 Key: SOLR-3399
 URL: https://issues.apache.org/jira/browse/SOLR-3399
 Project: Solr
  Issue Type: Improvement
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Priority: Minor

 The discussion on SOLR-3358 has many threads, so I will break this out into its 
 own issue.
 Currently we use SLF4J to define logging and the war file distributes the 
 JUL binding.  To improve the out-of-the-box logging experience, I think we 
 should switch to log4j.  I suggest we:
  * keep using SLF4J (especially in solrj)
  * replace the JUL log watcher with a log4j version
  * this will let us have the admin UI logging stuff work against a single 
 Appender rather than the root loggers




[jira] [Resolved] (SOLR-3979) slf4j bindings other than jdk -- cannot change log levels

2013-03-15 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-3979.
-

Resolution: Duplicate

check SOLR-3706

 slf4j bindings other than jdk -- cannot change log levels
 -

 Key: SOLR-3979
 URL: https://issues.apache.org/jira/browse/SOLR-3979
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Shawn Heisey
 Fix For: 4.3

 Attachments: log4j-solr-stuff.zip


 Once I finally got log4j logging working, I was slightly surprised by the 
 message related to SOLR-3426.  I did not really consider that to be a big 
 deal, because if I want to look at my log, I'll be on the commandline anyway.
 I was even more surprised to find that I cannot change any of the log levels 
 from the admin gui.  My default log level is WARN for performance reasons, 
 but every once in a while I like to bump the log level to INFO to 
 troubleshoot a specific problem, then turn it back down.  This is very easy 
 with jdk logging in either 3.x or 4.0.  I changed to log4j because it easily 
 allows me to put the date of a log message on the same line as the first line 
 of the actual log message, so when I grep for things, I have the timestamp in 
 the grep output.
 Currently the only way for me to change my log level is by updating 
 log4j.properties and restarting Solr.  If the capability to figure this out 
 on a class-by-class basis isn't there with log4j, I would at least like to be 
 able to set the root logging level.  Is that possible?
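The runtime level switching being asked for can be modeled as below. This is a simplified sketch, not log4j itself; with the log4j 1.2 API the equivalent is roughly `Logger.getRootLogger().setLevel(...)` for the root level and `LogManager.getLogger(name).setLevel(...)` per logger:

```java
import java.util.*;

// Simplified model of runtime-adjustable log levels: per-logger overrides
// that fall back to a mutable root level, so no restart is needed.
class LevelRegistry {
    private int rootLevel;                        // e.g. 30 = WARN, 20 = INFO
    private final Map<String, Integer> overrides = new HashMap<>();

    LevelRegistry(int rootLevel) { this.rootLevel = rootLevel; }

    void setRootLevel(int level) { rootLevel = level; }

    void setLevel(String logger, int level) { overrides.put(logger, level); }

    // Effective level: the per-logger override if present, else the root level.
    int effectiveLevel(String logger) {
        return overrides.getOrDefault(logger, rootLevel);
    }
}
```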




[jira] [Resolved] (SOLR-4129) Solr UI doesn't support log4j

2013-03-15 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-4129.
-

Resolution: Duplicate

consolidating log4j issues

 Solr UI doesn't support log4j 
 --

 Key: SOLR-4129
 URL: https://issues.apache.org/jira/browse/SOLR-4129
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0
Reporter: Raintung Li
  Labels: log
 Fix For: 4.3

 Attachments: patch-4129.txt


 Many projects use log4j, while Solr actually uses the SLF4J logging framework, 
 which by design integrates easily with log4j. 
 Solr uses log4j-over-slf4j.jar to handle the log4j case.
 This jar has some issues:
 a. Logging ultimately goes through SLF4J (for Solr that means 
 JDK14 logging).
 b. It does not implement all log4j functionality, e.g. Logger.setLevel(). 
 c. JDK14 logging misses some functionality, e.g. thread info and daily rolling. 
 Some dependent projects already use log4j, and customers still want to 
 use it. JDK14 logging differs from log4j in many ways; at the very least, 
 configuration files can't be reused.
 The bad thing is that log4j-over-slf4j.jar conflicts with log4j, so to use Solr, 
 other projects have to remove log4j.
 I think Solr shouldn't use log4j-over-slf4j.jar, and should instead reuse log4j 
 if the customer wants to use it.




[jira] [Created] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs

2013-03-15 Thread Adrien Grand (JIRA)
Adrien Grand created LUCENE-4834:


 Summary: Sorter API: Make TermsEnum.docs accept any source of 
liveDocs
 Key: LUCENE-4834
 URL: https://issues.apache.org/jira/browse/LUCENE-4834
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.3


TermsEnum.docs currently only works when liveDocs is null or the reader's 
liveDocs. This is enough for addIndexes but it would be cleaner to follow 
TermsEnum.docs contract and accept any source of liveDocs.
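The contract in question can be sketched with a minimal model. This is not Lucene's `TermsEnum`; `Bits` and `PostingsFilter` here are simplified stand-ins for illustration:

```java
// Sketch: a postings iterator should skip documents for which an arbitrary
// caller-supplied liveDocs bitset returns false - not only when liveDocs is
// null or the reader's own deletion bitset.
interface Bits {
    boolean get(int index);
}

class PostingsFilter {
    // Return the doc IDs from 'postings' that are live under 'liveDocs';
    // a null liveDocs means every document is live.
    static int[] filter(int[] postings, Bits liveDocs) {
        return java.util.Arrays.stream(postings)
                .filter(d -> liveDocs == null || liveDocs.get(d))
                .toArray();
    }
}
```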




[jira] [Updated] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs

2013-03-15 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-4834:
-

Attachment: LUCENE-4834.patch

Patch. I'll commit soon.




[jira] [Commented] (SOLR-3369) shards.tolerant=true broken on group and facet queries

2013-03-15 Thread Ferry Landzaat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603244#comment-13603244
 ] 

Ferry Landzaat commented on SOLR-3369:
--

Is there any plan to fix this issue? We want to upgrade from 3.x and really 
need this patch to make the system reliable.

 shards.tolerant=true broken on group and facet queries
 --

 Key: SOLR-3369
 URL: https://issues.apache.org/jira/browse/SOLR-3369
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 4.0-ALPHA
 Environment: Distributed environment (shards)
Reporter: Russell Black
  Labels: patch
 Attachments: SOLR-3369-shards-tolerant.patch


 In a distributed environment, shards.tolerant=true allows for partial results 
 to be returned when individual shards are down.  For group=true and 
 facet=true queries, using this feature results in an error when shards are 
 down.  This patch allows users to use the shard tolerance feature with facet 
 and grouping queries. 
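The tolerant-merge semantics can be sketched as below. This is an illustrative model, not Solr's distributed search code; `ShardMerger` and its parameters are hypothetical:

```java
import java.util.*;

// Sketch of shards.tolerant semantics: with tolerant=true, results from
// failed shards are skipped (partial results), instead of the whole
// request failing because one shard is down.
class ShardMerger {
    static List<String> merge(Map<String, List<String>> shardResults,
                              Set<String> failedShards, boolean tolerant) {
        List<String> merged = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : shardResults.entrySet()) {
            if (failedShards.contains(e.getKey())) {
                if (!tolerant) {
                    throw new IllegalStateException("shard down: " + e.getKey());
                }
                continue;  // tolerant: drop this shard's docs, keep going
            }
            merged.addAll(e.getValue());
        }
        return merged;
    }
}
```

The bug described above is, in effect, that the group/facet merge paths lacked this "skip and continue" branch.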




[jira] [Commented] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs

2013-03-15 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603264#comment-13603264
 ] 

Shai Erera commented on LUCENE-4834:


Looks good! +1 to commit




[jira] [Commented] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603266#comment-13603266
 ] 

Commit Tag Bot commented on LUCENE-4834:


[trunk commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456842

LUCENE-4834: Sorter API: Make TermsEnum.docs accept any source of liveDocs.






[jira] [Commented] (LUCENE-4752) Merge segments to sort them

2013-03-15 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603271#comment-13603271
 ] 

Shai Erera commented on LUCENE-4752:


This looks great. I think it's fine that we let people override 
SegmentMerger ... it's a super-expert API; no sane person would ever want to do 
that, but for those who do, it's good to have the option.

Adrien, perhaps add a SortingSegmentMerger to the sorter package? Or at least 
add a test that verifies merges keep things sorted?

 Merge segments to sort them
 ---

 Key: LUCENE-4752
 URL: https://issues.apache.org/jira/browse/LUCENE-4752
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: David Smiley
Assignee: Adrien Grand
 Attachments: LUCENE-4752.patch


 It would be awesome if Lucene could write the documents out in a segment 
 based on a configurable order.  This of course applies to merging segments 
 too.  The benefit is increased locality on disk of documents that are likely to 
 be accessed together.  This often applies to documents near each other in 
 time, but also spatially.
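The core of a sorting merge is deriving the doc ID permutation from a sort key. A sketch under the assumption of a simple long key (e.g. a timestamp); `SortPermutation` is an illustrative name, not the actual SegmentMerger hook:

```java
import java.util.*;

// Sketch: order documents by a per-document sort key and derive the
// old->new doc ID permutation that a sorting merge would apply.
class SortPermutation {
    static int[] oldToNew(final long[] sortKeys) {
        // Doc IDs sorted by their key value.
        Integer[] byKey = new Integer[sortKeys.length];
        for (int i = 0; i < byKey.length; i++) byKey[i] = i;
        Arrays.sort(byKey, Comparator.comparingLong(d -> sortKeys[d]));

        // Invert: oldToNew[oldDoc] = position of oldDoc in the sorted order.
        int[] oldToNew = new int[sortKeys.length];
        for (int newDoc = 0; newDoc < byKey.length; newDoc++) {
            oldToNew[byKey[newDoc]] = newDoc;
        }
        return oldToNew;
    }
}
```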




[jira] [Commented] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603273#comment-13603273
 ] 

Commit Tag Bot commented on LUCENE-4834:


[branch_4x commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1456851

LUCENE-4834: Sorter API: Make TermsEnum.docs accept any source of liveDocs 
(merged from r1456842).






[jira] [Resolved] (LUCENE-4834) Sorter API: Make TermsEnum.docs accept any source of liveDocs

2013-03-15 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-4834.
--

Resolution: Fixed

Thanks Shai.




[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-03-15 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603299#comment-13603299
 ] 

Michael McCandless commented on LUCENE-4795:


bq. Then perhaps just drop a comment in the ctor?

OK I'll put a comment where I append w/ the delim explaining why I can't use 
CP.toString ...

Thanks Shai!  I'll commit soon...

 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, 
 LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, 
 pleaseBenchmarkMe.patch


 Recently (LUCENE-4765) we added multi-valued DocValues field
 (SortedSetDocValuesField), and this can be used for faceting in Solr
 (SOLR-4490).  I think we should also add support in the facet module?
 It'd be an option with different tradeoffs.  Eg, it wouldn't require
 the taxonomy index, since the main index handles label/ord resolving.
 There are at least two possible approaches:
   * On every reopen, build the seg -> global ord map, and then on
 every collect, get the seg ord, map it to the global ord space,
 and increment counts.  This adds cost during reopen in proportion
 to the number of unique terms ...
   * On every collect, increment counts based on the seg ords, and then
 do a merge in the end just like distributed faceting does.
 The first approach is much easier so I built a quick prototype using
 that.  The prototype does the counting, but it does NOT do the top K
 facets gathering in the end, and it doesn't know parent/child ord
 relationships, so there's tons more to do before this is real.  I also
 was unsure how to properly integrate it since the existing classes
 seem to expect that you use a taxonomy index to resolve ords.
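The first approach above can be sketched with a minimal model. This is illustrative only, not the prototype's code; strings stand in for terms and `GlobalOrdCounts` is a hypothetical name:

```java
import java.util.*;

// Sketch of the first approach: on reopen, build a per-segment
// local ord -> global ord map from each segment's sorted terms; counting
// at collect time is then a single array lookup plus increment.
class GlobalOrdCounts {
    // Map each segment-local ord into the ord space of the merged
    // (global) sorted term list. Both inputs must be sorted.
    static int[] segToGlobal(String[] segTerms, String[] globalTerms) {
        int[] map = new int[segTerms.length];
        for (int ord = 0; ord < segTerms.length; ord++) {
            map[ord] = Arrays.binarySearch(globalTerms, segTerms[ord]);
        }
        return map;
    }

    // collect(): increment the global count for a hit's segment-local ord.
    static void collect(int[] counts, int[] segToGlobal, int segOrd) {
        counts[segToGlobal[segOrd]]++;
    }
}
```

The reopen-time cost the description mentions is the `segToGlobal` construction, proportional to the number of unique terms per segment.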
 I ran a quick performance test.  base = trunk except I disabled the
 compute top-K in FacetsAccumulator to make the comparison fair; comp
 = using the prototype collector in the patch:
 {noformat}
 Task                  QPS base  StdDev    QPS comp  StdDev        Pct diff
 OrHighLow                18.79  (2.5%)      14.36  (3.3%)   -23.6% ( -28% - -18%)
 HighTerm                 21.58  (2.4%)      16.53  (3.7%)   -23.4% ( -28% - -17%)
 OrHighMed                18.20  (2.5%)      13.99  (3.3%)   -23.2% ( -28% - -17%)
 Prefix3                  14.37  (1.5%)      11.62  (3.5%)   -19.1% ( -23% - -14%)
 LowTerm                 130.80  (1.6%)     106.95  (2.4%)   -18.2% ( -21% - -14%)
 OrHighHigh                9.60  (2.6%)       7.88  (3.5%)   -17.9% ( -23% - -12%)
 AndHighHigh              24.61  (0.7%)      20.74  (1.9%)   -15.7% ( -18% - -13%)
 Fuzzy1                   49.40  (2.5%)      43.48  (1.9%)   -12.0% ( -15% -  -7%)
 MedSloppyPhrase          27.06  (1.6%)      23.95  (2.3%)   -11.5% ( -15% -  -7%)
 MedTerm                  51.43  (2.0%)      46.21  (2.7%)   -10.2% ( -14% -  -5%)
 IntNRQ                    4.02  (1.6%)       3.63  (4.0%)    -9.7% ( -15% -  -4%)
 Wildcard                 29.14  (1.5%)      26.46  (2.5%)    -9.2% ( -13% -  -5%)
 HighSloppyPhrase          0.92  (4.5%)       0.87  (5.8%)    -5.4% ( -15% -   5%)
 MedSpanNear              29.51  (2.5%)      27.94  (2.2%)    -5.3% (  -9% -   0%)
 HighSpanNear              3.55  (2.4%)       3.38  (2.0%)    -4.9% (  -9% -   0%)
 AndHighMed              108.34  (0.9%)     104.55  (1.1%)    -3.5% (  -5% -  -1%)
 LowSloppyPhrase          20.50  (2.0%)      20.09  (4.2%)    -2.0% (  -8% -   4%)
 LowPhrase                21.60  (6.0%)      21.26  (5.1%)    -1.6% ( -11% -  10%)
 Fuzzy2                   53.16  (3.9%)      52.40  (2.7%)    -1.4% (  -7% -   5%)
 LowSpanNear               8.42  (3.2%)       8.45  (3.0%)     0.3% (  -5% -   6%)
 Respell                  45.17  (4.3%)      45.38  (4.4%)     0.5% (  -7% -   9%)
 MedPhrase               113.93  (5.8%)     115.02  (4.9%)     1.0% (  -9% -  12%)
 AndHighLow              596.42  (2.5%)     617.12  (2.8%)     3.5% (  -1% -   8%)
 HighPhrase               17.30 (10.5%)      18.36  (9.1%)     6.2% ( -12% -  28%)
 {noformat}
 I'm impressed that this approach is only ~24% slower in the worst
 case!  I think this 

[jira] [Commented] (LUCENE-4832) Unbounded getTopGroups for ToParentBlockJoinCollector

2013-03-15 Thread Aleksey Aleev (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603313#comment-13603313
 ] 

Aleksey Aleev commented on LUCENE-4832:
---

Michael, thank you for replying!
I agree with you about Integer.MAX_VALUE; it looks much better that way.
The reason I introduced the GroupDocsAccumulator class is that I wanted to 
reduce the size of the getTopGroups(...) method and make it more readable.
Could you please tell me whether you dislike introducing a new class and 
creating an instance of it at all, or whether you think it's not clear what the 
accumulate() method should do?
Maybe it would be clearer if the loop over groups remained in getTopGroups() 
and only the loop body were extracted into an accumulate() method? So we'd have:
{code:java}
for (int groupIDX = offset; groupIDX < sortedGroups.length; groupIDX++) {
  groupDocsAccumulator.accumulate(groupIDX);
}
{code}

Please tell me WDYT about it and I'll update the patch.
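The extraction being discussed could look roughly like the sketch below. This is only an illustration of the suggested shape (the GroupDocsAccumulator and accumulate() names are from the discussion, but the body here is a stand-in, not the actual patch code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the refactor: the per-group body of getTopGroups(...)
// moves into an accumulator object so the outer loop stays one line long.
class GroupDocsAccumulator {
    private final int[] sortedGroups;            // stand-in for the real sorted group array
    private final List<String> collected = new ArrayList<>();

    GroupDocsAccumulator(int[] sortedGroups) {
        this.sortedGroups = sortedGroups;
    }

    // One group's worth of work, pulled out of the getTopGroups(...) loop body.
    void accumulate(int groupIDX) {
        collected.add("group-" + sortedGroups[groupIDX]);
    }

    List<String> result() {
        return collected;
    }
}
```

The caller then keeps only the offset loop shown above, which is what makes getTopGroups(...) shorter without changing its behavior.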

 Unbounded getTopGroups for ToParentBlockJoinCollector
 -

 Key: LUCENE-4832
 URL: https://issues.apache.org/jira/browse/LUCENE-4832
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/join
Reporter: Aleksey Aleev
 Attachments: LUCENE-4832.patch


 _ToParentBlockJoinCollector#getTopGroups_ method takes several arguments:
 {code:java}
 public TopGroups<Integer> getTopGroups(ToParentBlockJoinQuery query, 
Sort withinGroupSort,
int offset,
int maxDocsPerGroup,
int withinGroupOffset,
boolean fillSortFields)
 {code}
 and one of them is {{maxDocsPerGroup}} which specifies upper bound of child 
 documents number returned within each group. 
 {{ToParentBlockJoinCollector}} collects and caches all child documents 
 matched by given {{ToParentBlockJoinQuery}} in {{OneGroup}} objects during 
 search so it is possible to create {{GroupDocs}} with all matched child 
 documents instead of part of them bounded by {{maxDocsPerGroup}}.
 When you specify {{maxDocsPerGroup}}, new queues (I mean 
 {{TopScoreDocCollector}}/{{TopFieldCollector}}) will be created for each 
 group, with {{maxDocsPerGroup}} objects created within each queue, which could 
 lead to redundant memory allocation when the number of child documents within 
 a group is less than {{maxDocsPerGroup}}.
 I suppose that there are many cases where you need to get all child documents 
 matched by query so it could be nice to have ability to get top groups with 
 all matched child documents without unnecessary memory allocation. 
 Possible solution is to pass negative {{maxDocsPerGroup}} in case when you 
 need to get all matched child documents within each group and check 
 {{maxDocsPerGroup}} value: if it is negative then we need to create queue 
 with size of matched child documents number; otherwise create queue with size 
 equals to {{maxDocsPerGroup}}. 
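 The proposed sizing rule can be sketched as follows. The class and method names are illustrative only (this is not the actual ToParentBlockJoinCollector code), but the logic mirrors the proposal: a negative {{maxDocsPerGroup}} means "return every matched child document":

{code:java}
// Sketch of the proposed per-group queue sizing: a negative maxDocsPerGroup
// switches to unbounded mode, sizing the queue from the group's actual hit
// count; otherwise the fixed cap is used, as today.
class GroupQueueSizing {
    static int queueSize(int maxDocsPerGroup, int numChildHits) {
        if (maxDocsPerGroup < 0) {
            // Unbounded: allocate exactly as many slots as the group has hits.
            return numChildHits;
        }
        // Bounded: the caller-supplied cap.
        return maxDocsPerGroup;
    }
}
{code}

 Under this rule, passing e.g. -1 as {{maxDocsPerGroup}} would return all children per group without allocating queue slots that can never be filled.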

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Stress tests

2013-03-15 Thread Erick Erickson
The OpenCloseCoreStressTest is occasionally failing, I was looking at two
of them last night and I'm wondering if there's an issue with warming
searchers or perhaps background merging. I've been rather assuming that
core.close() waits until all that is done, but I don't have any proof of
that.

Is there a good way to find out if a searcher for a particular SolrCore is
in the process of autowarming, or if there's a background merge going on? I
might try preventing opens/closes if a background thread is warming a core.

I already prevent load/unload/reload operations from occurring
simultaneously, but I'm wondering if I'm running afoul of either
1) background merging
2) autowarming

This will be an ugly one to debug, so any thoughts welcome

Thanks,
Erick


[JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 26880 - Failure!

2013-03-15 Thread builder
Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/26880/

1 tests failed.
REGRESSION:  
org.apache.lucene.search.TestTimeLimitingCollector.testSearchMultiThreaded

Error Message:
Captured an uncaught exception in thread: Thread[id=95, name=Thread-59, 
state=RUNNABLE, group=TGRP-TestTimeLimitingCollector]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught 
exception in thread: Thread[id=95, name=Thread-59, state=RUNNABLE, 
group=TGRP-TestTimeLimitingCollector]
Caused by: java.lang.OutOfMemoryError: Java heap space
at __randomizedtesting.SeedInfo.seed([4AB0D0AAEC8C1237]:0)
at 
org.apache.lucene.codecs.sep.SepPostingsReader.readTermsBlock(SepPostingsReader.java:191)
at 
org.apache.lucene.codecs.pulsing.PulsingPostingsReader.readTermsBlock(PulsingPostingsReader.java:135)
at 
org.apache.lucene.codecs.blockterms.BlockTermsReader$FieldReader$SegmentTermsEnum.nextBlock(BlockTermsReader.java:832)
at 
org.apache.lucene.codecs.blockterms.BlockTermsReader$FieldReader$SegmentTermsEnum._next(BlockTermsReader.java:659)
at 
org.apache.lucene.codecs.blockterms.BlockTermsReader$FieldReader$SegmentTermsEnum.next(BlockTermsReader.java:651)
at org.apache.lucene.index.MultiTermsEnum.reset(MultiTermsEnum.java:128)
at org.apache.lucene.index.MultiTerms.iterator(MultiTerms.java:110)
at 
org.apache.lucene.search.TermQuery$TermWeight.getTermsEnum(TermQuery.java:101)
at 
org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:81)
at 
org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:308)
at 
org.apache.lucene.search.AssertingIndexSearcher$1.scorer(AssertingIndexSearcher.java:80)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:596)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:302)
at 
org.apache.lucene.search.TestTimeLimitingCollector.search(TestTimeLimitingCollector.java:124)
at 
org.apache.lucene.search.TestTimeLimitingCollector.doTestSearch(TestTimeLimitingCollector.java:139)
at 
org.apache.lucene.search.TestTimeLimitingCollector.access$200(TestTimeLimitingCollector.java:42)
at 
org.apache.lucene.search.TestTimeLimitingCollector$1.run(TestTimeLimitingCollector.java:292)




Build Log:
[...truncated 1203 lines...]
[junit4:junit4] Suite: org.apache.lucene.search.TestTimeLimitingCollector
[junit4:junit4]   2 NOTE: reproduce with: ant test  
-Dtestcase=TestTimeLimitingCollector -Dtests.method=testSearchMultiThreaded 
-Dtests.seed=4AB0D0AAEC8C1237 -Dtests.slow=true -Dtests.locale=es_VE 
-Dtests.timezone=Europe/Oslo -Dtests.file.encoding=ISO-8859-1
[junit4:junit4] ERROR246s J1 | 
TestTimeLimitingCollector.testSearchMultiThreaded 
[junit4:junit4] Throwable #1: java.lang.AssertionError: some threads 
failed! expected:<50> but was:<32>
[junit4:junit4]at org.junit.Assert.fail(Assert.java:93)
[junit4:junit4]at org.junit.Assert.failNotEquals(Assert.java:647)
[junit4:junit4]at org.junit.Assert.assertEquals(Assert.java:128)
[junit4:junit4]at org.junit.Assert.assertEquals(Assert.java:472)
[junit4:junit4]at 
org.apache.lucene.search.TestTimeLimitingCollector.doTestMultiThreads(TestTimeLimitingCollector.java:306)
[junit4:junit4]at 
org.apache.lucene.search.TestTimeLimitingCollector.testSearchMultiThreaded(TestTimeLimitingCollector.java:271)
[junit4:junit4]at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
[junit4:junit4]at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
[junit4:junit4]at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4:junit4]at java.lang.reflect.Method.invoke(Method.java:601)
[junit4:junit4]at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
[junit4:junit4]at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
[junit4:junit4]at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
[junit4:junit4]at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
[junit4:junit4]at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
[junit4:junit4]at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
[junit4:junit4]at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
[junit4:junit4]at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
[junit4:junit4]at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
[junit4:junit4]at 

[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr

2013-03-15 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603338#comment-13603338
 ] 

Mikhail Khludnev commented on SOLR-4586:


bq. maxBooleanClauses no longer applies, that the limitation was removed from 
Lucene sometime in the 3.x series.

[really?|https://github.com/apache/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java#L137]
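The check the link points at is the clause-count guard in BooleanQuery. A minimal self-contained illustration of that guard (this is a sketch of the pattern, not the actual Lucene class; Lucene's real default for maxClauseCount is 1024):

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch of the clause-count guard that the linked
// BooleanQuery source still contains: adding more clauses than
// maxClauseCount allows throws a TooManyClauses exception.
class ClauseLimitedQuery {
    static int maxClauseCount = 1024;  // Lucene's documented default

    static class TooManyClauses extends RuntimeException {}

    private final List<String> clauses = new ArrayList<>();

    void add(String clause) {
        if (clauses.size() >= maxClauseCount) {
            // The guard in question: the limit is still enforced at add time.
            throw new TooManyClauses();
        }
        clauses.add(clause);
    }

    int size() {
        return clauses.size();
    }
}
```

So the config option may be unused by Solr's own code paths, but the underlying Lucene limit itself has not gone away.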

 Remove maxBooleanClauses from Solr
 --

 Key: SOLR-4586
 URL: https://issues.apache.org/jira/browse/SOLR-4586
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.2
Reporter: Shawn Heisey
 Attachments: SOLR-4586.patch


 In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to 
 someone asking a question about queries.  Mark Miller told me that 
 maxBooleanClauses no longer applies, that the limitation was removed from 
 Lucene sometime in the 3.x series.  The config still shows up in the example 
 even in the just-released 4.2.
 Checking through the source code, I found that the config option is parsed 
 and the value stored in objects, but does not actually seem to be used by 
 anything.  I removed every trace of it that I could find, and all tests still 
 pass.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4196) Untangle XML-specific nature of Config and Container classes

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603341#comment-13603341
 ] 

Commit Tag Bot commented on SOLR-4196:
--

[trunk commit] Erick Erickson
http://svn.apache.org/viewvc?view=revision&revision=1456938

Added comments for deprecating solr.xml (SOLR-4196 etc)


 Untangle XML-specific nature of Config and Container classes
 

 Key: SOLR-4196
 URL: https://issues.apache.org/jira/browse/SOLR-4196
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, StressTest.zip, 
 StressTest.zip, StressTest.zip, StressTest.zip


 sub-task for SOLR-4083. If we're going to try to obsolete solr.xml, we need 
 to pull all of the specific XML processing out of Config and Container. 
 Currently, we refer to xpaths all over the place. This JIRA is about 
 providing a thunking layer to isolate the XML-esque nature of solr.xml and 
 allow a simple properties file to be used instead which will lead, 
 eventually, to solr.xml going away.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: questions about solr.xml/solr.properties

2013-03-15 Thread Erick Erickson
Added some bits to CHANGES.txt for SOLR-4196


On Thu, Mar 14, 2013 at 10:01 PM, Erick Erickson erickerick...@gmail.comwrote:

 OK, I'll see what I can put in tomorrow. It won't be comprehensive,
 probably just refer to the Wiki page after a very brief explanation.


 On Thu, Mar 14, 2013 at 9:45 PM, Mark Miller markrmil...@gmail.comwrote:

 Okay - leaving it out on purpose can get kind of confusing - someone that
 wanted to look at the state of trunk right now might think, oh, only bug
 fixes and very minor changes, but surprise, there is actually a major
 structural change.

 I think we should try and keep CHANGES up to date with reality for our
 'trunk', '4x' users.

 - Mark

 On Mar 14, 2013, at 9:24 PM, Erick Erickson erickerick...@gmail.com
 wrote:

  bq: Is there any mention of this in CHANGES yet
 
  Nope, it's one of the JIRAs I've assigned to myself. SOLR-4542. I have
 started a Wiki page here:
  http://wiki.apache.org/solr/Core%20Discovery%20%284.3%20and%20beyond%29
 
  linked to from here:
  http://wiki.apache.org/solr/CoreAdmin#Configuration
 
  But I've been waiting for the dust to settle before fleshing this out
 much. Although the more exposure it gets, I suppose the more chance people
 will have to comment on it. If we're agreed that solr.properties is the way
 to go, then I'll put something in CHANGES Real Soon Now and perhaps let the
 Wiki page evolve in fits and starts.
 
 
 
  On Thu, Mar 14, 2013 at 8:43 PM, Mark Miller markrmil...@gmail.com
 wrote:
  Is there any mention of this in CHANGES yet erick? Was just browsing
 for it…
 
  - Mark
 
  On Mar 14, 2013, at 6:37 PM, Jan Høydahl jan@cominvent.com wrote:
 
   solr.yml :-)
  
   --
   Jan Høydahl, search solution architect
   Cominvent AS - www.cominvent.com
   Solr Training - www.solrtraining.com
  
   14. mars 2013 kl. 22:02 skrev Yonik Seeley yo...@lucidworks.com:
  
   On Thu, Mar 14, 2013 at 3:46 PM, Robert Muir rcm...@gmail.com
 wrote:
   It seems to me there are two changes involved:
   1. ability to auto-discover cores from the filesystem so you don't
   need to explicitly list them
   2. changing .xml format to .properties
  
   These are indeed completely independent.
   My main concern/goal in this area has been #1.
   I assume #2 is just because developer tastes have been shifting away
   from XML, but like you I worry about what happens for config that
   needs more structure.
  
   -Yonik
   http://lucidworks.com
  
   -
   To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
   For additional commands, e-mail: dev-h...@lucene.apache.org
  
  
  
   -
   To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
   For additional commands, e-mail: dev-h...@lucene.apache.org
  
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: dev-h...@lucene.apache.org
 
 


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org





[jira] [Commented] (SOLR-4196) Untangle XML-specific nature of Config and Container classes

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603356#comment-13603356
 ] 

Commit Tag Bot commented on SOLR-4196:
--

[branch_4x commit] Erick Erickson
http://svn.apache.org/viewvc?view=revision&revision=1456941

Added comments for deprecating solr.xml (SOLR-4196 etc)


 Untangle XML-specific nature of Config and Container classes
 

 Key: SOLR-4196
 URL: https://issues.apache.org/jira/browse/SOLR-4196
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, 
 SOLR-4196.patch, SOLR-4196.patch, SOLR-4196.patch, StressTest.zip, 
 StressTest.zip, StressTest.zip, StressTest.zip


 sub-task for SOLR-4083. If we're going to try to obsolete solr.xml, we need 
 to pull all of the specific XML processing out of Config and Container. 
 Currently, we refer to xpaths all over the place. This JIRA is about 
 providing a thunking layer to isolate the XML-esque nature of solr.xml and 
 allow a simple properties file to be used instead which will lead, 
 eventually, to solr.xml going away.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4588) Partial Update of Poly Field Corrupts Data

2013-03-15 Thread John Crygier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Crygier updated SOLR-4588:
---

Attachment: schema.xml

Sample Schema

 Partial Update of Poly Field Corrupts Data
 --

 Key: SOLR-4588
 URL: https://issues.apache.org/jira/browse/SOLR-4588
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: John Crygier
 Attachments: schema.xml


 When updating a field that is a poly type (Testing with LatLonType), when you 
 do a partial document update, the poly fields will become multi-valued.  This 
 occurs even when the field is configured to not be multi-valued.
 Test Case
 Use the following schema:
 <schema name='JohnTest' version='1.5'>
   <fields>
     <field name='id' type='String' indexed='true' stored='true' required='true' multiValued='false' />
     <field name='_version_' type='int' indexed='true' stored='true' required='false' multiValued='false' />

     <dynamicField name='*LatLon' type='location' indexed='true' stored='true' required='false' multiValued='false' />
     <dynamicField name='*_coordinate' type='int' indexed='true' stored='true' required='false' multiValued='false' />
   </fields>
   <uniqueKey>id</uniqueKey>
   <types>
     <fieldType sortMissingLast='true' name='String' class='solr.StrField' />
     <fieldType name='int' class='solr.TrieIntField' precisionStep='0' positionIncrementGap='0'/>
     <fieldType name='location' class='solr.LatLonType' subFieldSuffix='_coordinate'/>
   </types>
 </schema>
 And issue the following commands (With responses):
 curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument", "JohnTestLatLon" : "0,0"}]'
 RESPONSE: {"responseHeader":{"status":0,"QTime":2133}}
 curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
 RESPONSE: {
   "responseHeader":{
     "status":0,
     "QTime":2,
     "params":{
       "indent":"true",
       "q":"*:*",
       "wt":"json"}},
   "response":{"numFound":1,"start":0,"docs":[
       {
         "id":"JohnTestDocument",
         "JohnTestLatLon_0_coordinate":0.0,
         "JohnTestLatLon_1_coordinate":0.0,
         "JohnTestLatLon":"0,0",
         "_version_":-1596981248}]
   }}

 curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument","JohnTestLatLon":{"set":"5,7"}}]'
 RESPONSE: {"responseHeader":{"status":0,"QTime":218}}
 curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
 RESPONSE: {
   "responseHeader":{
     "status":0,
     "QTime":2,
     "params":{
       "indent":"true",
       "q":"*:*",
       "wt":"json"}},
   "response":{"numFound":1,"start":0,"docs":[
       {
         "id":"JohnTestDocument",
         "JohnTestLatLon_0_coordinate":[0.0, 5.0],
         "JohnTestLatLon_1_coordinate":[0.0, 7.0],
         "JohnTestLatLon":"5,7",
         "_version_":-118489088}]
   }}
 As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and 
 JohnTestLatLon_1_coordinate.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4588) Partial Update of Poly Field Corrupts Data

2013-03-15 Thread John Crygier (JIRA)
John Crygier created SOLR-4588:
--

 Summary: Partial Update of Poly Field Corrupts Data
 Key: SOLR-4588
 URL: https://issues.apache.org/jira/browse/SOLR-4588
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: John Crygier
 Attachments: schema.xml

When updating a field that is a poly type (Testing with LatLonType), when you 
do a partial document update, the poly fields will become multi-valued.  This 
occurs even when the field is configured to not be multi-valued.

Test Case
Use the following schema:
<schema name='JohnTest' version='1.5'>
    <fields>
        <field name='id' type='String' indexed='true' stored='true' required='true' multiValued='false' />
        <field name='_version_' type='int' indexed='true' stored='true' required='false' multiValued='false' />

        <dynamicField name='*LatLon' type='location' indexed='true' stored='true' required='false' multiValued='false' />
        <dynamicField name='*_coordinate' type='int' indexed='true' stored='true' required='false' multiValued='false' />
    </fields>
    <uniqueKey>id</uniqueKey>
    <types>
        <fieldType sortMissingLast='true' name='String' class='solr.StrField' />
        <fieldType name='int' class='solr.TrieIntField' precisionStep='0' positionIncrementGap='0'/>
        <fieldType name='location' class='solr.LatLonType' subFieldSuffix='_coordinate'/>
    </types>
</schema>

And issue the following commands (With responses):
curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument", "JohnTestLatLon" : "0,0"}]'
RESPONSE: {"responseHeader":{"status":0,"QTime":2133}}

curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
RESPONSE: {
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "indent":"true",
      "q":"*:*",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"JohnTestDocument",
        "JohnTestLatLon_0_coordinate":0.0,
        "JohnTestLatLon_1_coordinate":0.0,
        "JohnTestLatLon":"0,0",
        "_version_":-1596981248}]
  }}

curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument","JohnTestLatLon":{"set":"5,7"}}]'
RESPONSE: {"responseHeader":{"status":0,"QTime":218}}

curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
RESPONSE: {
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "indent":"true",
      "q":"*:*",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"JohnTestDocument",
        "JohnTestLatLon_0_coordinate":[0.0, 5.0],
        "JohnTestLatLon_1_coordinate":[0.0, 7.0],
        "JohnTestLatLon":"5,7",
        "_version_":-118489088}]
  }}

As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and 
JohnTestLatLon_1_coordinate.
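The mechanics that would produce this result can be sketched as follows. This is an assumed model of the failure, not Solr's actual update code: an atomic update rebuilds the document from its *stored* fields and re-indexes it, so the stored `*_coordinate` subfields from the old document are carried over, and re-indexing the new LatLon value generates them a second time:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Assumed sketch of why the stale coordinates survive a partial update.
// Field names follow the report above; the logic is illustrative only.
class PartialUpdateSketch {
    static Map<String, List<Object>> applySet(Map<String, List<Object>> stored,
                                              String field, Object newValue) {
        // 1. Rebuild the document from every stored field of the old version,
        //    including the stored *_coordinate subfields.
        Map<String, List<Object>> doc = new LinkedHashMap<>();
        stored.forEach((k, v) -> doc.put(k, new ArrayList<>(v)));

        // 2. Apply the {"set": ...} to the poly field itself.
        doc.put(field, new ArrayList<>(Collections.singletonList(newValue)));

        // 3. Re-indexing the poly field derives fresh subfield values and
        //    appends them -- without removing the stale stored copies, so a
        //    single-valued subfield ends up holding two values.
        String[] parts = ((String) newValue).split(",");
        doc.get(field + "_0_coordinate").add(Double.parseDouble(parts[0]));
        doc.get(field + "_1_coordinate").add(Double.parseDouble(parts[1]));
        return doc;
    }
}
```

Under this model, a fix would need step 3 to replace (or step 1 to drop) the derived subfields rather than append to them.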

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4588) Partial Update of Poly Field Corrupts Data

2013-03-15 Thread John Crygier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Crygier updated SOLR-4588:
---

Description: 
When updating a field that is a poly type (Testing with LatLonType), when you 
do a partial document update, the poly fields will become multi-valued.  This 
occurs even when the field is configured to not be multi-valued.

Test Case
Use the attached schema (schema.xml)

And issue the following commands (With responses):
curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument", "JohnTestLatLon" : "0,0"}]'
RESPONSE: {"responseHeader":{"status":0,"QTime":2133}}

curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
RESPONSE: {
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "indent":"true",
      "q":"*:*",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"JohnTestDocument",
        "JohnTestLatLon_0_coordinate":0.0,
        "JohnTestLatLon_1_coordinate":0.0,
        "JohnTestLatLon":"0,0",
        "_version_":-1596981248}]
  }}

curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument","JohnTestLatLon":{"set":"5,7"}}]'
RESPONSE: {"responseHeader":{"status":0,"QTime":218}}

curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
RESPONSE: {
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "indent":"true",
      "q":"*:*",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"JohnTestDocument",
        "JohnTestLatLon_0_coordinate":[0.0, 5.0],
        "JohnTestLatLon_1_coordinate":[0.0, 7.0],
        "JohnTestLatLon":"5,7",
        "_version_":-118489088}]
  }}

As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and 
JohnTestLatLon_1_coordinate.

  was:
When updating a field that is a poly type (Testing with LatLonType), when you 
do a partial document update, the poly fields will become multi-valued.  This 
occurs even when the field is configured to not be multi-valued.

Test Case
Use the following schema:
<schema name='JohnTest' version='1.5'>
    <fields>
        <field name='id' type='String' indexed='true' stored='true' required='true' multiValued='false' />
        <field name='_version_' type='int' indexed='true' stored='true' required='false' multiValued='false' />

        <dynamicField name='*LatLon' type='location' indexed='true' stored='true' required='false' multiValued='false' />
        <dynamicField name='*_coordinate' type='int' indexed='true' stored='true' required='false' multiValued='false' />
    </fields>
    <uniqueKey>id</uniqueKey>
    <types>
        <fieldType sortMissingLast='true' name='String' class='solr.StrField' />
        <fieldType name='int' class='solr.TrieIntField' precisionStep='0' positionIncrementGap='0'/>
        <fieldType name='location' class='solr.LatLonType' subFieldSuffix='_coordinate'/>
    </types>
</schema>

And issue the following commands (With responses):
curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument", "JohnTestLatLon" : "0,0"}]'
RESPONSE: {"responseHeader":{"status":0,"QTime":2133}}

curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
RESPONSE: {
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "indent":"true",
      "q":"*:*",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"JohnTestDocument",
        "JohnTestLatLon_0_coordinate":0.0,
        "JohnTestLatLon_1_coordinate":0.0,
        "JohnTestLatLon":"0,0",
        "_version_":-1596981248}]
  }}

curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument","JohnTestLatLon":{"set":"5,7"}}]'
RESPONSE: {"responseHeader":{"status":0,"QTime":218}}

curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
RESPONSE: {
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "indent":"true",
      "q":"*:*",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"JohnTestDocument",
        "JohnTestLatLon_0_coordinate":[0.0, 5.0],
        "JohnTestLatLon_1_coordinate":[0.0, 7.0],
        "JohnTestLatLon":"5,7",
        "_version_":-118489088}]
  }}

As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and 
JohnTestLatLon_1_coordinate.


 Partial Update of Poly Field Corrupts Data
 --

 Key: SOLR-4588
 URL: https://issues.apache.org/jira/browse/SOLR-4588
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: John Crygier
 Attachments: schema.xml


 When updating a field that is a poly type (Testing with LatLonType), when you 
 do a partial document update, the poly fields will become multi-valued.  This 
 occurs even when the field is configured to not 

[jira] [Updated] (SOLR-4588) Partial Update of Poly Field Corrupts Data

2013-03-15 Thread John Crygier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Crygier updated SOLR-4588:
---

Priority: Minor  (was: Major)

 Partial Update of Poly Field Corrupts Data
 --

 Key: SOLR-4588
 URL: https://issues.apache.org/jira/browse/SOLR-4588
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: John Crygier
Priority: Minor
 Attachments: schema.xml


 When updating a field that is a poly type (Testing with LatLonType), when you 
 do a partial document update, the poly fields will become multi-valued.  This 
 occurs even when the field is configured to not be multi-valued.
 Test Case
 Use the attached schema (schema.xml)
 And issue the following commands (With responses):
 curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument", "JohnTestLatLon" : "0,0"}]'
 RESPONSE: {"responseHeader":{"status":0,"QTime":2133}}
 curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
 RESPONSE: {
   "responseHeader":{
     "status":0,
     "QTime":2,
     "params":{
       "indent":"true",
       "q":"*:*",
       "wt":"json"}},
   "response":{"numFound":1,"start":0,"docs":[
       {
         "id":"JohnTestDocument",
         "JohnTestLatLon_0_coordinate":0.0,
         "JohnTestLatLon_1_coordinate":0.0,
         "JohnTestLatLon":"0,0",
         "_version_":-1596981248}]
   }}

 curl 'localhost:8983/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"JohnTestDocument","JohnTestLatLon":{"set":"5,7"}}]'
 RESPONSE: {"responseHeader":{"status":0,"QTime":218}}
 curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
 RESPONSE: {
   "responseHeader":{
     "status":0,
     "QTime":2,
     "params":{
       "indent":"true",
       "q":"*:*",
       "wt":"json"}},
   "response":{"numFound":1,"start":0,"docs":[
       {
         "id":"JohnTestDocument",
         "JohnTestLatLon_0_coordinate":[0.0, 5.0],
         "JohnTestLatLon_1_coordinate":[0.0, 7.0],
         "JohnTestLatLon":"5,7",
         "_version_":-118489088}]
   }}
 As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and 
 JohnTestLatLon_1_coordinate.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: questions about solr.xml/solr.properties

2013-03-15 Thread Erick Erickson
By far, the bulk of the work was trying to un-entangle solr.xml from
CoreContainer, especially as far as the <core> tags were concerned, and
allow cores to come and go. Whether or not we decide on some mechanism
other than solr.properties, altering how we do this should be much easier
now.

The work for rapidly opening/closing (and lazily loading) cores was
enhanced as part of this restructuring, if I were going to do it all over
again I'd probably break it into smaller chunks. But even if we change
course, the opening/closing stuff won't be affected so we're probably OK
there.

As far as nesting is concerned, it seems like we can emulate it with
multi-level properties if necessary, at least for simple structuring; you can
see a little of that on the page I started here:
http://wiki.apache.org/solr/Core%20Discovery%20%284.3%20and%20beyond%29, where
I did a straightforward mapping from <cores> to a bunch of properties like
cores.hostPort
cores.adminPath

etc. Now, all I'm saying with that is that simple structure is still
possible; I'm not particularly wedded to the idea. Consider Log4j
properties. (OK, maybe that isn't a great example, I find it more than a
little confusing, but...).
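For concreteness, a flat mapping of today's <cores> attributes might look like the sketch below. The file name and key names are illustrative only, following the cores.* convention from the wiki page, and are not a settled format:

```properties
# Hypothetical solr.properties sketch -- a flat, automation-friendly mapping
# of solr.xml's <cores> attributes. Key names are illustrative, not final.
cores.adminPath=/admin/cores
cores.hostPort=8983

# Per-core entries could be listed the same way (or discovered on disk
# instead of being listed explicitly):
core.collection1.name=collection1
core.collection1.instanceDir=collection1/
```

The point is only that simple key/value structure can carry the current configuration; anything genuinely hierarchical would need more thought.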

But part of the motivation for this is moving to the SolrCloud way of doing
things. The present setup has assumptions built into it from
single-core-only days that make dynamic adjustments hard. Just one example:
CoreContainer.load essentially assumed that it was the only thread
operating when loading cores. To allow cores to come and go I had to do
quite a bit of coordinating work. If we extend this to cores coming and
going in response to load, or cores/collections being created on-the-fly,
etc., I'm not sure solr.xml is going to adapt, whereas system properties are
a more automation-friendly way of doing things.

If we're going to eventually expand/contract/dynamically have nodes come
and go we probably need to be able to, essentially, define all our
properties at run-time rather than have a static, edit-by-hand
configuration file.

All that said, I'm open to whatever consensus we build. It'll about break
my heart to _undo_ code, but I'll survive somehow, partially consoled by
the fact that actually reading the solr.properties file wasn't much of the
work G...

Erick



On Fri, Mar 15, 2013 at 9:08 AM, Erick Erickson erickerick...@gmail.comwrote:

 Added some bits to CHANGES.txt for SOLR-4196


 On Thu, Mar 14, 2013 at 10:01 PM, Erick Erickson 
 erickerick...@gmail.comwrote:

 OK, I'll see what I can put in tomorrow. It won't be comprehensive,
 probably just refer to the Wiki page after a very brief explanation.


 On Thu, Mar 14, 2013 at 9:45 PM, Mark Miller markrmil...@gmail.comwrote:

 Okay - leaving it out on purpose can get kind of confusing - someone
 that wanted to look at the state of trunk right now might think, oh, only
 bug fixes and very minor changes, but surprise, there is actually a major
 structural change.

 I think we should try and keep CHANGES up to date with reality for our
 'trunk', '4x' users.

 - Mark

 On Mar 14, 2013, at 9:24 PM, Erick Erickson erickerick...@gmail.com
 wrote:

  bq: Is there any mention of this in CHANGES yet
 
  Nope, it's one of the JIRAs I've assigned to myself. SOLR-4542. I have
 started a Wiki page here:
 
 http://wiki.apache.org/solr/Core%20Discovery%20%284.3%20and%20beyond%29
 
  linked to from here:
  http://wiki.apache.org/solr/CoreAdmin#Configuration
 
  But I've been waiting for the dust to settle before fleshing this out
 much. Although the more exposure it gets, I suppose the more chance people
 will have to comment on it. If we're agreed that solr.properties is the way
 to go, then I'll put something in CHANGES Real Soon Now and perhaps let the
 Wiki page evolve in fits and starts.
 
 
 
  On Thu, Mar 14, 2013 at 8:43 PM, Mark Miller markrmil...@gmail.com
 wrote:
  Is there any mention of this in CHANGES yet erick? Was just browsing
 for it…
 
  - Mark
 
  On Mar 14, 2013, at 6:37 PM, Jan Høydahl jan@cominvent.com
 wrote:
 
   solr.yml :-)
  
   --
   Jan Høydahl, search solution architect
   Cominvent AS - www.cominvent.com
   Solr Training - www.solrtraining.com
  
   14. mars 2013 kl. 22:02 skrev Yonik Seeley yo...@lucidworks.com:
  
   On Thu, Mar 14, 2013 at 3:46 PM, Robert Muir rcm...@gmail.com
 wrote:
   It seems to me there are two changes involved:
   1. ability to auto-discover cores from the filesystem so you don't
   need to explicitly list them
   2. changing .xml format to .properties
  
   These are indeed completely independent.
   My main concern/goal in this area has been #1.
   I assume #2 is just because developer tastes have been shifting away
   from XML, but like you I worry about what happens for config that
   needs more structure.
  
   -Yonik
   http://lucidworks.com
  
  
 -
   To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
   For additional commands, e-mail: 

[jira] [Commented] (SOLR-4588) Partial Update of Poly Field Corrupts Data

2013-03-15 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603373#comment-13603373
 ] 

Erick Erickson commented on SOLR-4588:
--

Does this happen on a recent build? I _think_ I remember something about this; 
it sounds related to https://issues.apache.org/jira/browse/SOLR-4134.

Could you try it with 4.2 and let us know if it's still an issue?

 Partial Update of Poly Field Corrupts Data
 --

 Key: SOLR-4588
 URL: https://issues.apache.org/jira/browse/SOLR-4588
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: John Crygier
Priority: Minor
 Attachments: schema.xml


 When updating a field that is a poly type (Testing with LatLonType), when you 
 do a partial document update, the poly fields will become multi-valued.  This 
 occurs even when the field is configured to not be multi-valued.
 Test Case
 Use the attached schema (schema.xml)
 And issue the following commands (With responses):
 curl 'localhost:8983/solr/update?commit=true' -H 
 'Content-type:application/json' -d '[{"id":"JohnTestDocument", 
 "JohnTestLatLon" : "0,0"}]'
 RESPONSE: {"responseHeader":{"status":0,"QTime":2133}}
 curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
 RESPONSE: {
   "responseHeader":{
 "status":0,
 "QTime":2,
 "params":{
   "indent":"true",
   "q":"*:*",
   "wt":"json"}},
   "response":{"numFound":1,"start":0,"docs":[
   {
 "id":"JohnTestDocument",
 "JohnTestLatLon_0_coordinate":0.0,
 "JohnTestLatLon_1_coordinate":0.0,
 "JohnTestLatLon":"0,0",
 "_version_":-1596981248}]
   }}
   
 curl 'localhost:8983/solr/update?commit=true' -H 
 'Content-type:application/json' -d 
 '[{"id":"JohnTestDocument","JohnTestLatLon":{"set":"5,7"}}]'
 RESPONSE: {"responseHeader":{"status":0,"QTime":218}}
 curl 'http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true'
 RESPONSE: {
   "responseHeader":{
 "status":0,
 "QTime":2,
 "params":{
   "indent":"true",
   "q":"*:*",
   "wt":"json"}},
   "response":{"numFound":1,"start":0,"docs":[
   {
 "id":"JohnTestDocument",
 "JohnTestLatLon_0_coordinate":[0.0,
   5.0],
 "JohnTestLatLon_1_coordinate":[0.0,
   7.0],
 "JohnTestLatLon":"5,7",
 "_version_":-118489088}]
   }}
 As you can see, the 0.0 hangs around in JohnTestLatLon_0_coordinate and 
 JohnTestLatLon_1_coordinate.




[jira] [Commented] (LUCENE-4828) BooleanQuery.extractTerms should not recurse into MUST_NOT clauses

2013-03-15 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603392#comment-13603392
 ] 

Yonik Seeley commented on LUCENE-4828:
--

bq. I think ideally we would not weight or score MUST_NOT or constant-scored 
clauses at all. I know this isn't the case today, but I just think it's dumb.

Not weighting prohibited clauses would needlessly break certain types of 
queries.

 BooleanQuery.extractTerms should not recurse into MUST_NOT clauses
 --

 Key: LUCENE-4828
 URL: https://issues.apache.org/jira/browse/LUCENE-4828
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/search
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, 4.3

 Attachments: LUCENE-4828.patch







[jira] [Commented] (SOLR-4588) Partial Update of Poly Field Corrupts Data

2013-03-15 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603400#comment-13603400
 ] 

Yonik Seeley commented on SOLR-4588:


Partial update isn't currently going to work well with stored copyField targets 
or stored sub-fields of a polyField.
Further, sub-fields of a polyField should normally be stored=false... they are 
meant more as an implementation detail than as an interface for clients.

This is the definition in the stock schema (notice stored=false):
{code}
   <!-- Type used to index the lat and lon components for the location 
FieldType -->
   <dynamicField name="*_coordinate"  type="tdouble" indexed="true"  
stored="false" />
{code}
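
As a sketch of the interaction being discussed (field names taken from the
reporter's test case; nothing here is actually sent to a server), the atomic
update applies a {"set": ...} modifier to the source polyField only - with the
*_coordinate sub-fields stored=false, Solr regenerates them from the new value
instead of accumulating stale stored copies:

```python
import json

# Atomic ("partial") update payload for the LatLonType field from the
# reproduction above; the {"set": ...} modifier replaces the field value
# rather than appending to it.
doc = {"id": "JohnTestDocument", "JohnTestLatLon": {"set": "5,7"}}

# Solr's JSON update handler takes a list of documents; this is the body
# that would be POSTed to /solr/update with Content-type: application/json.
payload = json.dumps([doc])
print(payload)
```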

 Partial Update of Poly Field Corrupts Data
 --

 Key: SOLR-4588
 URL: https://issues.apache.org/jira/browse/SOLR-4588
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: John Crygier
Priority: Minor
 Attachments: schema.xml





[jira] [Commented] (LUCENE-4828) BooleanQuery.extractTerms should not recurse into MUST_NOT clauses

2013-03-15 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603402#comment-13603402
 ] 

Robert Muir commented on LUCENE-4828:
-

What kind of queries would this break? 

Just to be clear, when I say weight, I mean similarity: we'd still 
createWeight, it just wouldn't fetch any term statistics.

 BooleanQuery.extractTerms should not recurse into MUST_NOT clauses
 --

 Key: LUCENE-4828
 URL: https://issues.apache.org/jira/browse/LUCENE-4828
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/search
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, 4.3

 Attachments: LUCENE-4828.patch







[jira] [Commented] (SOLR-4588) Partial Update of Poly Field Corrupts Data

2013-03-15 Thread John Crygier (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603414#comment-13603414
 ] 

John Crygier commented on SOLR-4588:


Thanks all for the input.  I did verify that the behavior is the same in 4.2.  
I didn't include the full story here about what actually led me to need this 
bug fix.

I actually have a custom field that I've written that works with strings.  My 
intention is to use highlighting on the dynamic poly fields, so the user knows 
when there is a hit in a certain column.  I thought I had read that 
highlighting only works on stored fields, so that's why I was working with 
stored fields.

It's a minor issue, and since I'm coding custom fields anyway, I should be able 
to work around it.

Thanks again!

 Partial Update of Poly Field Corrupts Data
 --

 Key: SOLR-4588
 URL: https://issues.apache.org/jira/browse/SOLR-4588
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: John Crygier
Priority: Minor
 Attachments: schema.xml





[jira] [Updated] (SOLR-4588) Partial Update of Poly Field Corrupts Data

2013-03-15 Thread John Crygier (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Crygier updated SOLR-4588:
---

Affects Version/s: 4.2

 Partial Update of Poly Field Corrupts Data
 --

 Key: SOLR-4588
 URL: https://issues.apache.org/jira/browse/SOLR-4588
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0, 4.2
Reporter: John Crygier
Priority: Minor
 Attachments: schema.xml





[jira] [Resolved] (SOLR-4574) The Collections API will silently return success on an unknown ACTION parameter.

2013-03-15 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4574.
---

Resolution: Fixed

 The Collections API will silently return success on an unknown ACTION 
 parameter.
 

 Key: SOLR-4574
 URL: https://issues.apache.org/jira/browse/SOLR-4574
 Project: Solr
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-4574.patch







[jira] [Resolved] (SOLR-4576) Collections API validation errors should cause an exception on clients and otherwise act as validation errors with the Core Admin API.

2013-03-15 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4576.
---

Resolution: Fixed

 Collections API validation errors should cause an exception on clients and 
 otherwise act as validation errors with the Core Admin API.
 --

 Key: SOLR-4576
 URL: https://issues.apache.org/jira/browse/SOLR-4576
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-4576.patch







[jira] [Resolved] (SOLR-4577) The collections API should return responses (success or failure) for each node it attempts to work with.

2013-03-15 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4577.
---

Resolution: Fixed

 The collections API should return responses (success or failure) for each node 
 it attempts to work with.
 ---

 Key: SOLR-4577
 URL: https://issues.apache.org/jira/browse/SOLR-4577
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0


 This is when the command itself is successful on the node, but then we need a 
 report of the sub command result on each node.
 There is some code that sort of attempts to do this that came in with the 
 collection api response contribution, but it's not really working currently.




[jira] [Commented] (LUCENE-4752) Merge segments to sort them

2013-03-15 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603427#comment-13603427
 ] 

Robert Muir commented on LUCENE-4752:
-

I disagree... and I guess I'm willing to go to bat for this.

There is real cost in exposing stuff like this. I'm already frustrated by 
the amount of stuff around this area that is 'public' solely due to packaging 
(e.g. the .codecs package and the .index package both need it).

Finally, if we have code in Lucene itself that relies upon the inner details 
because it e.g. subclasses SegmentMerger, this makes it harder to refactor core 
Lucene and evolve it in the future, because we have modules doing sorting or 
shuffling or god knows what that rely upon its API. In such a case where I want 
to refactor SM, I could just eradicate those modules and nobody would complain, 
right?

I don't think we should do it.


 Merge segments to sort them
 ---

 Key: LUCENE-4752
 URL: https://issues.apache.org/jira/browse/LUCENE-4752
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: David Smiley
Assignee: Adrien Grand
 Attachments: LUCENE-4752.patch


 It would be awesome if Lucene could write the documents out in a segment 
 based on a configurable order.  This of course applies to merging segments, 
 too.  The benefit is increased locality on disk of documents that are likely 
 to be accessed together.  This often applies to documents near each other in 
 time, but also spatially.




[jira] [Commented] (LUCENE-4752) Merge segments to sort them

2013-03-15 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603430#comment-13603430
 ] 

Robert Muir commented on LUCENE-4752:
-

Another example of a 'super expert API' that I think is OK is IndexingChain, 
where it stays package private, and the IWConfig setter is also package 
private. I think something like this might be reasonable.

 Merge segments to sort them
 ---

 Key: LUCENE-4752
 URL: https://issues.apache.org/jira/browse/LUCENE-4752
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: David Smiley
Assignee: Adrien Grand
 Attachments: LUCENE-4752.patch





[jira] [Commented] (SOLR-4585) The Collections API validates numShards with < 0 but should use <= 0.

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603438#comment-13603438
 ] 

Commit Tag Bot commented on SOLR-4585:
--

[trunk commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1456979

SOLR-4585: The Collections API validates numShards with < 0 but should use <= 
0.

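
The comparison operators in this commit message appear to have been stripped in
transit; assuming the intended fix is that validation must reject numShards <= 0
rather than only numShards < 0, the off-by-one looks like this sketch
(hypothetical, not the actual Solr source):

```python
def validate_num_shards(num_shards: int) -> int:
    # The buggy check rejected only num_shards < 0, which lets a useless
    # value of 0 through; a collection with zero shards cannot hold data,
    # so the validation must reject num_shards <= 0.
    if num_shards <= 0:
        raise ValueError("numShards must be a positive integer")
    return num_shards
```

With this check, numShards=0 raises instead of silently creating a zero-shard
collection.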

 The Collections API validates numShards with < 0 but should use <= 0.
 -

 Key: SOLR-4585
 URL: https://issues.apache.org/jira/browse/SOLR-4585
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0







[jira] [Commented] (LUCENE-4752) Merge segments to sort them

2013-03-15 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603440#comment-13603440
 ] 

Robert Muir commented on LUCENE-4752:
-

And finally, I think it would be way better to provide whatever 'hook' is needed 
for this kind of stuff rather than allow subclassing of SegmentMerger - like a 
proper pluggable API (e.g. Codec is an example of this) versus letting people 
just subclass concrete things.

 Merge segments to sort them
 ---

 Key: LUCENE-4752
 URL: https://issues.apache.org/jira/browse/LUCENE-4752
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: David Smiley
Assignee: Adrien Grand
 Attachments: LUCENE-4752.patch





Re: questions about solr.xml/solr.properties

2013-03-15 Thread Mark Miller
Cool - you probably also want to move another entry under that? Usually I've 
been using Additional Changes: below for this:

* SOLR-4543: setting shardHandlerFactory in solr.xml/solr.properties does not 
work.
  (Ryan Ernst, Robert Muir via Erick Erickson)

That's not a released bug, right? If not, we don't want it to appear so - we 
still want to give credit and have the tracking for trunk users, I think; that's 
why I use the Additional Changes section for follow-on JIRAs to large CHANGES.

- Mark

On Mar 15, 2013, at 9:08 AM, Erick Erickson erickerick...@gmail.com wrote:

 Added some bits to CHANGES.txt for SOLR-4196
 
 
 On Thu, Mar 14, 2013 at 10:01 PM, Erick Erickson erickerick...@gmail.com 
 wrote:
 OK, I'll see what I can put in tomorrow. It won't be comprehensive, probably 
 just refer to the Wiki page after a very brief explanation.
 
 
 On Thu, Mar 14, 2013 at 9:45 PM, Mark Miller markrmil...@gmail.com wrote:
 Okay - leaving it out on purpose can get kind of confusing - someone that 
 wanted to look at the state of trunk right now might think, oh, only bug 
 fixes and very minor changes, but surprise, there is actually a major 
 structural change.
 
 I think we should try and keep CHANGES up to date with reality for our 
 'trunk', '4x' users.
 
 - Mark
 
 On Mar 14, 2013, at 9:24 PM, Erick Erickson erickerick...@gmail.com wrote:
 
  bq: Is there any mention of this in CHANGES yet
 
  Nope, it's one of the JIRAs I've assigned to myself. SOLR-4542. I have 
  started a Wiki page here:
  http://wiki.apache.org/solr/Core%20Discovery%20%284.3%20and%20beyond%29
 
  linked to from here:
  http://wiki.apache.org/solr/CoreAdmin#Configuration
 
  But I've been waiting for the dust to settle before fleshing this out much. 
  Although the more exposure it gets, I suppose the more chance people will 
  have to comment on it. If we're agreed that solr.properties is the way to 
  go, then I'll put something in CHANGES Real Soon Now and perhaps let the 
  Wiki page evolve in fits and starts.
 
 
 
  On Thu, Mar 14, 2013 at 8:43 PM, Mark Miller markrmil...@gmail.com wrote:
  Is there any mention of this in CHANGES yet erick? Was just browsing for it…
 
  - Mark
 
  On Mar 14, 2013, at 6:37 PM, Jan Høydahl jan@cominvent.com wrote:
 
   solr.yml :-)
  
   --
   Jan Høydahl, search solution architect
   Cominvent AS - www.cominvent.com
   Solr Training - www.solrtraining.com
  
   14. mars 2013 kl. 22:02 skrev Yonik Seeley yo...@lucidworks.com:
  
   On Thu, Mar 14, 2013 at 3:46 PM, Robert Muir rcm...@gmail.com wrote:
   It seems to me there are two changes involved:
   1. ability to auto-discover cores from the filesystem so you don't
   need to explicitly list them
   2. changing .xml format to .properties
  
   These are indeed completely independent.
   My main concern/goal in this area has been #1.
   I assume #2 is just because developer tastes have been shifting away
   from XML, but like you I worry about what happens for config that
   needs more structure.
  
   -Yonik
   http://lucidworks.com
  
   -
   To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
   For additional commands, e-mail: dev-h...@lucene.apache.org
  
  
  
   -
   To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
   For additional commands, e-mail: dev-h...@lucene.apache.org
  
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 





[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #274: POMs out of sync

2013-03-15 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/274/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.SyncSliceTest.testDistribSearch

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
at 
__randomizedtesting.SeedInfo.seed([6894893B8E7703A5:E9720723F9286399]:0)
at org.junit.Assert.fail(Assert.java:92)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertTrue(Assert.java:54)
at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:196)




Build Log:
[...truncated 23171 lines...]




[jira] [Commented] (SOLR-4585) The Collections API validates numShards with < 0 but should use <= 0.

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603447#comment-13603447
 ] 

Commit Tag Bot commented on SOLR-4585:
--

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1456981

SOLR-4585: The Collections API validates numShards with < 0 but should use <= 
0.


 The Collections API validates numShards with < 0 but should use <= 0.
 -

 Key: SOLR-4585
 URL: https://issues.apache.org/jira/browse/SOLR-4585
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0







Re: questions about solr.xml/solr.properties

2013-03-15 Thread Robert Muir
On Fri, Mar 15, 2013 at 11:12 AM, Mark Miller markrmil...@gmail.com wrote:
 Cool - you probably also want to move another entry under that? Usually I've 
 been using Additional Changes: below for this:

 * SOLR-4543: setting shardHandlerFactory in solr.xml/solr.properties does not 
 work.
   (Ryan Ernst, Robert Muir via Erick Erickson)

 That's not a released bug right? If not we don't want it to appear so - we 
 still want to give credit and have the tracking for trunk users I think, 
 that's why I use the Additional Changes for follow on JIRAs to large CHANGES.

It's a released bug: it looks like it never worked from solr.xml (see
4.2 src code CoreContainer:707)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: questions about solr.xml/solr.properties

2013-03-15 Thread Mark Miller

On Mar 15, 2013, at 11:17 AM, Robert Muir rcm...@gmail.com wrote:

 It's a released bug: it looks like it never worked from solr.xml (see
 4.2 src code CoreContainer:707)

Ah okay - it would ignore the shard handler class before and just accept 
settings.

- Mark
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.

2013-03-15 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603461#comment-13603461
 ] 

Erick Erickson commented on SOLR-4318:
--

Actually, on a second look I think the original patch is the right thing to do. 
There is actually a default tokenizer assigned to a TextField, admittedly it is 
rudimentary, but it's still defined. A note in the docs would be good though.

 NullPointerException encountered when /select query on solr.TextField.
 --

 Key: SOLR-4318
 URL: https://issues.apache.org/jira/browse/SOLR-4318
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 4.0
Reporter: Junaid Surve
Assignee: Erick Erickson
  Labels: query, select
 Attachments: SOLR-4318.patch


 I have two fields, one is title and the other is description in my Solr 
 schema like -
 Type - <fieldType name="text" class="solr.TextField" 
 positionIncrementGap="100"/>
 Declaration - <field name="description" type="text" indexed="true" 
 stored="true"/>
 without any tokenizer or filter.
 On querying /select?q=description:myText it works. However when I add a '*' 
 it fails.
 Failure scenario -
 /select?q=description:*
 /select?q=description:myText*
 .. etc 
 solrconfig.xml - 
 <requestHandler name="/select" class="solr.SearchHandler">
   <lst name="defaults">
     <str name="echoParams">explicit</str>
     <int name="rows">10</int>
     <str name="df">title</str>
   </lst>
 </requestHandler>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.

2013-03-15 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603461#comment-13603461
 ] 

Erick Erickson edited comment on SOLR-4318 at 3/15/13 3:50 PM:
---

Actually, on a second look I think the original patch is the right thing to do. 
There is actually a default tokenizer assigned to a TextField, admittedly it is 
rudimentary, but it's still defined. So my original statement was entirely 
wrong, a TextField type with no analysis chain is perfectly correct, it was 
entirely a problem with the MultiTermAware code. Fixing it with the original 
patch is fine. I'll add a test too.

  was (Author: erickerickson):
Actually, on a second look I think the original patch is the right thing to 
do. There is actually a default tokenizer assigned to a TextField, admittedly 
it is rudimentary, but it's still defined. A note in the docs would be good 
though.
  
 NullPointerException encountered when /select query on solr.TextField.
 --

 Key: SOLR-4318
 URL: https://issues.apache.org/jira/browse/SOLR-4318
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 4.0
Reporter: Junaid Surve
Assignee: Erick Erickson
  Labels: query, select
 Attachments: SOLR-4318.patch


 I have two fields, one is title and the other is description in my Solr 
 schema like -
 Type - <fieldType name="text" class="solr.TextField" 
 positionIncrementGap="100"/>
 Declaration - <field name="description" type="text" indexed="true" 
 stored="true"/>
 without any tokenizer or filter.
 On querying /select?q=description:myText it works. However when I add a '*' 
 it fails.
 Failure scenario -
 /select?q=description:*
 /select?q=description:myText*
 .. etc 
 solrconfig.xml - 
 <requestHandler name="/select" class="solr.SearchHandler">
   <lst name="defaults">
     <str name="echoParams">explicit</str>
     <int name="rows">10</int>
     <str name="df">title</str>
   </lst>
 </requestHandler>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr Log4j?

2013-03-15 Thread Ryan Ernst
Strongly in favor here!  I'm currently repackaging the war to use log4j
bindings and would love to avoid that step.

Really it seems like this should be an option in the build (since it must
be in the war, or at least if slf4j is in the war, then so must the
binding).  Like if there were an ant property you could set with -D that
says which binding you want?


On Fri, Mar 15, 2013 at 9:38 AM, Ryan McKinley ryan...@gmail.com wrote:

 We have discussed a few times shipping with or supporting log4j rather
 then defaulting to JUL

 I just updated SOLR-3706 to do this.

 What are peoples thoughts on this issue now?

 Thanks
 Ryan





Re: Solr Log4j?

2013-03-15 Thread Ryan McKinley


 Really it seems like this should be an option in the build


Note that it already is, and has been for a while.

There is an option to build a .war with no logging bindings included:
ant dist-war-excl-slf4j


[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.

2013-03-15 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603503#comment-13603503
 ] 

Mark Miller commented on SOLR-3706:
---

I'd prefer a solution closer to what we are discussing.

There are also other things to look at I think:

*existing log ant targets - are they now broken?
*existing pre logging setup in jetty.xml and stuff in the README
*an example log4j conf file rather than the java util one
*consider switching our tests to it so that devs actually deal with what users 
will see

I think that Jan is on the right track for how we should tackle the switch.
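For reference, a minimal log4j.properties along the lines being discussed (single-line pattern, daily file rolling) might look like the sketch below; the file location, appender names, and pattern are illustrative assumptions, not the config that eventually shipped:

```properties
# Hypothetical log4j 1.x setup: one line per log record, file rolled daily.
log4j.rootLogger=INFO, file, console

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p (%t) [%c{1}] %m%n

# DailyRollingFileAppender rolls by date out of the box, unlike the JUL default.
log4j.appender.file=org.apache.log4j.DailyRollingFileAppender
log4j.appender.file.File=logs/solr.log
log4j.appender.file.DatePattern='.'yyyy-MM-dd
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p (%t) [%c{1}] %m%n
```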

 Ship setup to log with log4j.
 -

 Key: SOLR-3706
 URL: https://issues.apache.org/jira/browse/SOLR-3706
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-3706-solr-log4j.patch


 Currently we default to java util logging and it's terrible in my opinion.
 *Its simple built-in logger is a 2-line logger.
 *You have to jump through hoops to use your own custom formatter with jetty - 
 either putting your class in the start.jar or other pain in the butt 
 solutions.
 *It can't roll files by date out of the box.
 I'm sure there are more issues, but those are the ones annoying me now. We 
 should switch to log4j - it's much nicer and it's easy to get a nice single 
 line format and roll by date, etc.
 If someone wants to use JUL they still can - but at least users could start 
 with something decent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.

2013-03-15 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603512#comment-13603512
 ] 

Ryan McKinley commented on SOLR-3706:
-

Does that mean:
- no logging slf4j/log4j in .war  (like dist-war-excl-slf4j)
- put logging files slf4j+log4j in example/lib

I'm a big +1 to that

but last we discussed inertia pointed towards keeping concrete logging 
dependencies in solr.war 



 Ship setup to log with log4j.
 -

 Key: SOLR-3706
 URL: https://issues.apache.org/jira/browse/SOLR-3706
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-3706-solr-log4j.patch


 Currently we default to java util logging and it's terrible in my opinion.
 *Its simple built-in logger is a 2-line logger.
 *You have to jump through hoops to use your own custom formatter with jetty - 
 either putting your class in the start.jar or other pain in the butt 
 solutions.
 *It can't roll files by date out of the box.
 I'm sure there are more issues, but those are the ones annoying me now. We 
 should switch to log4j - it's much nicer and it's easy to get a nice single 
 line format and roll by date, etc.
 If someone wants to use JUL they still can - but at least users could start 
 with something decent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.

2013-03-15 Thread Christian Moen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603516#comment-13603516
 ] 

Christian Moen commented on SOLR-3706:
--

{quote}
Mark, have you tried Logback? That's a good logging implementation; arguably a 
better one.
{quote}

David and Mark, I believe [Log4J 2|http://logging.apache.org/log4j/2.x/] 
addresses a lot of the weaknesses in Log4J 1.x also addressed by Logback.  
However, Log4J 2 hasn't been released yet.

To me it sounds like a good idea to use Log4J 1.x now and move to Log4J 2 in 
the future.

 Ship setup to log with log4j.
 -

 Key: SOLR-3706
 URL: https://issues.apache.org/jira/browse/SOLR-3706
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-3706-solr-log4j.patch


 Currently we default to java util logging and it's terrible in my opinion.
 *Its simple built-in logger is a 2-line logger.
 *You have to jump through hoops to use your own custom formatter with jetty - 
 either putting your class in the start.jar or other pain in the butt 
 solutions.
 *It can't roll files by date out of the box.
 I'm sure there are more issues, but those are the ones annoying me now. We 
 should switch to log4j - it's much nicer and it's easy to get a nice single 
 line format and roll by date, etc.
 If someone wants to use JUL they still can - but at least users could start 
 with something decent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.

2013-03-15 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603517#comment-13603517
 ] 

Mark Miller commented on SOLR-3706:
---

bq. but last we discussed inertia pointed towards keeping concrete logging 
dependencies in solr.war

Why? I don't see anyone arguing for that here.

There seem to be plenty of advantages to getting it out of the webapp and 
plenty of downsides to having it in.

Where and what is someone else arguing?

 Ship setup to log with log4j.
 -

 Key: SOLR-3706
 URL: https://issues.apache.org/jira/browse/SOLR-3706
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-3706-solr-log4j.patch


 Currently we default to java util logging and it's terrible in my opinion.
 *Its simple built-in logger is a 2-line logger.
 *You have to jump through hoops to use your own custom formatter with jetty - 
 either putting your class in the start.jar or other pain in the butt 
 solutions.
 *It can't roll files by date out of the box.
 I'm sure there are more issues, but those are the ones annoying me now. We 
 should switch to log4j - it's much nicer and it's easy to get a nice single 
 line format and roll by date, etc.
 If someone wants to use JUL they still can - but at least users could start 
 with something decent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4561) CachedSqlEntityProcessor with parameterized query is broken

2013-03-15 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-4561.
--

Resolution: Duplicate

This is the same as SOLR-3857.

 CachedSqlEntityProcessor with parameterized query is broken
 ---

 Key: SOLR-4561
 URL: https://issues.apache.org/jira/browse/SOLR-4561
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: Sudheer Prem
   Original Estimate: 1m
  Remaining Estimate: 1m

 When child entities are created and the child entity is provided with a 
 parametrized query as below, 
 {code:xml} 
 <entity name="x" query="select * from x">
   <entity name="y" query="select * from y where xid=${x.id}" 
   processor="CachedSqlEntityProcessor">
   </entity>
 </entity>
 {code} 
 the Entity Processor always returns the result from the first query even though 
 the parameter is changed. It is happening because the 
 EntityProcessorBase.getNext() method doesn't reset the query and rowIterator 
 after calling the DIHCacheSupport.getCacheData() method.
 This can be fixed by changing the else block in getNext() method of 
 EntityProcessorBase from
 {code} 
 else {
   return cacheSupport.getCacheData(context, query, rowIterator);
 }
 {code} 
 to the code mentioned below:
 {code} 
 else {
   Map<String,Object> cacheData = cacheSupport.getCacheData(context, 
 query, rowIterator);
   query = null;
   rowIterator = null;
   return cacheData;
 }
 {code}   
 Update: But then, the caching doesn't seem to be working...
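The failure mode described above (a cached query/iterator pair that is never cleared between calls) can be illustrated with a small toy; this Python sketch is not Solr code, and the class and method names are made up for illustration only:

```python
# Toy model of the stale-state bug: if the processor's cached "query" is not
# reset after a cache lookup, every later call keeps returning rows for the
# first parameter value it ever saw.

class CacheSupport:
    def __init__(self, data):
        self.data = data  # maps a parameter value to its rows

    def get_cache_data(self, query):
        return list(self.data.get(query, []))

class BuggyProcessor:
    def __init__(self, cache):
        self.cache = cache
        self.query = None

    def get_next(self, new_query):
        if self.query is None:      # bug: set once, never reset afterwards
            self.query = new_query
        return self.cache.get_cache_data(self.query)

class FixedProcessor(BuggyProcessor):
    def get_next(self, new_query):
        self.query = new_query      # use the current parameter
        rows = self.cache.get_cache_data(self.query)
        self.query = None           # fix: reset state after the lookup
        return rows

cache = CacheSupport({"1": ["row-x1"], "2": ["row-x2"]})
buggy, fixed = BuggyProcessor(cache), FixedProcessor(cache)
buggy.get_next("1"); fixed.get_next("1")
print(buggy.get_next("2"))  # ['row-x1'] -- stale result for the old parameter
print(fixed.get_next("2"))  # ['row-x2']
```

The buggy variant keeps serving rows for the first parameter value; clearing the cached state after each lookup, as the proposed patch does for query and rowIterator, restores per-parameter results.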

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.

2013-03-15 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603532#comment-13603532
 ] 

Ryan McKinley commented on SOLR-3706:
-

 Where and what is someone else arguing?

I am remembering discussion from a long time ago (year+) the last time I really 
paid attention to this discussion.  I can dig it up, but am much happier to 
drop it from the .war!


I will switch things around to pull the logging jars out of the war and put 
them in the example lib folder

Are there concrete patches on other issues that I ignored?  The key stuff in 
this one is for the admin UI managing log4j levels.

 Ship setup to log with log4j.
 -

 Key: SOLR-3706
 URL: https://issues.apache.org/jira/browse/SOLR-3706
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-3706-solr-log4j.patch


 Currently we default to java util logging and it's terrible in my opinion.
 *Its simple built-in logger is a 2-line logger.
 *You have to jump through hoops to use your own custom formatter with jetty - 
 either putting your class in the start.jar or other pain in the butt 
 solutions.
 *It can't roll files by date out of the box.
 I'm sure there are more issues, but those are the ones annoying me now. We 
 should switch to log4j - it's much nicer and it's easy to get a nice single 
 line format and roll by date, etc.
 If someone wants to use JUL they still can - but at least users could start 
 with something decent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.

2013-03-15 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603537#comment-13603537
 ] 

Mark Miller commented on SOLR-3706:
---

bq. I can dig it up, but am much happier to drop it from the .war!

Great, this is my feeling - a lot has changed in a year - we have a new 
discussion here that seems to be making progress and has a lot of visibility. 
If someone wants to toss a monkey wrench in, the lane is wide open :)

 Ship setup to log with log4j.
 -

 Key: SOLR-3706
 URL: https://issues.apache.org/jira/browse/SOLR-3706
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-3706-solr-log4j.patch


 Currently we default to java util logging and it's terrible in my opinion.
 *Its simple built-in logger is a 2-line logger.
 *You have to jump through hoops to use your own custom formatter with jetty - 
 either putting your class in the start.jar or other pain in the butt 
 solutions.
 *It can't roll files by date out of the box.
 I'm sure there are more issues, but those are the ones annoying me now. We 
 should switch to log4j - it's much nicer and it's easy to get a nice single 
 line format and roll by date, etc.
 If someone wants to use JUL they still can - but at least users could start 
 with something decent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3706) Ship setup to log with log4j.

2013-03-15 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603539#comment-13603539
 ] 

Mark Miller commented on SOLR-3706:
---

bq. I will switch things around to pull the logging jars out of the war and put 
them in the example lib folder

I'm happy to help out too - I've been meaning to get to this and just have not 
yet. Happy to let someone else do it, but if you need any help, I have a strong 
interest in making this work well.

 Ship setup to log with log4j.
 -

 Key: SOLR-3706
 URL: https://issues.apache.org/jira/browse/SOLR-3706
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-3706-solr-log4j.patch


 Currently we default to java util logging and it's terrible in my opinion.
 *Its simple built-in logger is a 2-line logger.
 *You have to jump through hoops to use your own custom formatter with jetty - 
 either putting your class in the start.jar or other pain in the butt 
 solutions.
 *It can't roll files by date out of the box.
 I'm sure there are more issues, but those are the ones annoying me now. We 
 should switch to log4j - it's much nicer and it's easy to get a nice single 
 line format and roll by date, etc.
 If someone wants to use JUL they still can - but at least users could start 
 with something decent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4585) The Collections API validates numShards with > 0 but should use >= 0.

2013-03-15 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4585.
---

Resolution: Fixed

 The Collections API validates numShards with > 0 but should use >= 0.
 -

 Key: SOLR-4585
 URL: https://issues.apache.org/jira/browse/SOLR-4585
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4478) Allow cores to specify a named config set

2013-03-15 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603565#comment-13603565
 ] 

Erick Erickson commented on SOLR-4478:
--

Starting on this finally, couple of points for discussion:

What do we do with each of these if we find a configSet entry in the 
core.properties file?
  instanceDir - nothing to do here except we don't look here for configuration 
  files
  dataDir - again, nothing. The meaning remains unchanged.
  config  - check that it exists in the config set and blow up if we don't 
  find it.
  schema  - treat as config.

for config and schema, it hurts my head to think of resolving relative paths, 
absolute paths, the relationship to solr_home, the relationship of referenced 
files ( stopwords, etc). At least for the first cut I want to allow the config 
and schema files to be a different name, but that's it. And require that they 
live in the configSet directory. Unless all of this just automagically happens 
through the resource loader.

The properties entry in the core.properties file (doesn't depend on configSet) 
- does it make sense to have it any more at all? I propose we deprecate it.

Is there a convenient place in the SolrCloud code that I can rip off? I'll look 
but I don't want to re-invent the wheel if I miss it
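Under the straw-man being discussed, a discovery-mode core.properties using the proposed configSet parameter might look like the sketch below; the keys and values are illustrative, not a finalized format:

```properties
# Hypothetical core.properties for core discovery mode.
name=core1
dataDir=data
# Resolved under solr_home/configsets/ when relative:
configSet=myconf
# Both files must exist inside the config set directory:
config=solrconfig.xml
schema=schema.xml
```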


 Allow cores to specify a named config set
 -

 Key: SOLR-4478
 URL: https://issues.apache.org/jira/browse/SOLR-4478
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.2, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson

 Part of moving forward to the new way, after SOLR-4196 etc... I propose an 
 additional parameter specified on the core node in solr.xml or as a 
 parameter in the discovery mode core.properties file, call it configSet, 
 where the value provided is a path to a directory, either absolute or 
 relative. Really, this is as though you copied the conf directory somewhere 
 to be used by more than one core.
 Straw-man: There will be a directory solr_home/configsets which will be the 
 default. If the configSet parameter is, say, myconf, then I'd expect a 
 directory named myconf to exist in solr_home/configsets, which would look 
 something like
 solr_home/configsets/myconf/schema.xml
   solrconfig.xml
   stopwords.txt
   velocity
   velocity/query.vm
 etc.
 If multiple cores used the same configSet, schema, solrconfig etc. would all 
 be shared (i.e. shareSchema=true would be assumed). I don't see a good 
 use-case for _not_ sharing schemas, so I don't propose to allow this to be 
 turned off. Hmmm, what if shareSchema is explicitly set to false in the 
 solr.xml or properties file? I'd guess it should be honored but maybe log a 
 warning?
 Mostly I'm putting this up for comments. I know that there are already 
 thoughts about how this all should work floating around, so before I start 
 any work on this I thought I'd at least get an idea of whether this is the 
 way people are thinking about going.
 Configset can be either a relative or absolute path, if relative it's assumed 
 to be relative to solr_home.
 Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603578#comment-13603578
 ] 

Commit Tag Bot commented on SOLR-4318:
--

[trunk commit] Erick Erickson
http://svn.apache.org/viewvc?view=revision&revision=1457032

SOLR-4318 NPE when doing a wildcard query on a TextField with the default 
analysis chain


 NullPointerException encountered when /select query on solr.TextField.
 --

 Key: SOLR-4318
 URL: https://issues.apache.org/jira/browse/SOLR-4318
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 4.0
Reporter: Junaid Surve
Assignee: Erick Erickson
  Labels: query, select
 Attachments: SOLR-4318.patch


 I have two fields, one is title and the other is description in my Solr 
 schema like -
 Type - <fieldType name="text" class="solr.TextField" 
 positionIncrementGap="100"/>
 Declaration - <field name="description" type="text" indexed="true" 
 stored="true"/>
 without any tokenizer or filter.
 On querying /select?q=description:myText it works. However when I add a '*' 
 it fails.
 Failure scenario -
 /select?q=description:*
 /select?q=description:myText*
 .. etc 
 solrconfig.xml - 
 <requestHandler name="/select" class="solr.SearchHandler">
   <lst name="defaults">
     <str name="echoParams">explicit</str>
     <int name="rows">10</int>
     <str name="df">title</str>
   </lst>
 </requestHandler>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr Log4j?

2013-03-15 Thread Erik Hatcher
How about logback rather than log4j?   
http://logback.qos.ch/reasonsToSwitch.html

Erik

On Mar 15, 2013, at 12:38 , Ryan McKinley wrote:

 We have discussed a few times shipping with or supporting log4j rather then 
 defaulting to JUL
 
 I just updated SOLR-3706 to do this.
 
 What are peoples thoughts on this issue now?
 
 Thanks
 Ryan
 
 


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4569) waitForReplicasToComeUp should bail right away if it doesn't see the expected slice in the clusterstate rather than waiting.

2013-03-15 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4569.
---

Resolution: Fixed

 waitForReplicasToComeUp should bail right away if it doesn't see the expected 
 slice in the clusterstate rather than waiting.
 

 Key: SOLR-4569
 URL: https://issues.apache.org/jira/browse/SOLR-4569
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4570) Even if an explicit shard id is used, ZkController#preRegister should still wait to see the shard id in it's current ClusterState.

2013-03-15 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4570.
---

Resolution: Fixed

 Even if an explicit shard id is used, ZkController#preRegister should still 
 wait to see the shard id in it's current ClusterState.
 --

 Key: SOLR-4570
 URL: https://issues.apache.org/jira/browse/SOLR-4570
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4571) SolrZkClient#setData should return Stat object.

2013-03-15 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4571.
---

Resolution: Fixed

 SolrZkClient#setData should return Stat object.
 ---

 Key: SOLR-4571
 URL: https://issues.apache.org/jira/browse/SOLR-4571
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4568) The lastPublished state check before becoming a leader is not working.

2013-03-15 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4568.
---

Resolution: Fixed

 The lastPublished state check before becoming a leader is not working.
 --

 Key: SOLR-4568
 URL: https://issues.apache.org/jira/browse/SOLR-4568
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.2
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.3, 5.0







[jira] [Resolved] (SOLR-4578) CoreAdminHandler#handleCreateAction gets a SolrCore and does not close it in SolrCloud mode when a core with the same name already exists.

2013-03-15 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4578.
---

Resolution: Fixed

 CoreAdminHandler#handleCreateAction gets a SolrCore and does not close it in 
 SolrCloud mode when a core with the same name already exists.
 --

 Key: SOLR-4578
 URL: https://issues.apache.org/jira/browse/SOLR-4578
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0


 {noformat}
 if (coreContainer.getZkController() != null) {
   if (coreContainer.getCore(name) != null) {
     log.info("Re-creating a core with existing name is not allowed in cloud mode");
     throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
         "Core with name '" + name + "' already exists.");
   }
 }
 {noformat}
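The leak can be illustrated with a plain-Java model (hypothetical classes, not Solr's real CoreContainer/SolrCore API): getCore() bumps a reference count, so even a pure existence check must close the core it received.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java model (not Solr's actual classes) of the leak described above:
// getCore() increments a reference count, so checking "does this core exist?"
// without close() leaves a dangling reference. try-with-resources releases it.
public class CoreRefDemo {
    static final AtomicInteger refs = new AtomicInteger();

    static class Core implements AutoCloseable {
        Core() { refs.incrementAndGet(); }
        @Override public void close() { refs.decrementAndGet(); }
    }

    static Core getCore(String name) { return new Core(); }

    public static void main(String[] args) {
        // Buggy pattern: the returned core is never closed.
        if (getCore("collection1") != null) { /* reject duplicate name */ }
        System.out.println(refs.get());   // 1 -- leaked reference

        // Fixed pattern: close the core even on the error-checking path.
        try (Core c = getCore("collection1")) {
            if (c != null) { /* reject duplicate name */ }
        }
        System.out.println(refs.get());   // still 1 (only the earlier leak remains)
    }
}
```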




[jira] [Updated] (SOLR-4073) Overseer will miss operations in some cases for OverseerCollectionProcessor

2013-03-15 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4073:
--

Fix Version/s: 5.0
Affects Version/s: 4.1
   4.2

 Overseer will miss operations in some cases for OverseerCollectionProcessor
 

 Key: SOLR-4073
 URL: https://issues.apache.org/jira/browse/SOLR-4073
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0, 4.1, 4.2
 Environment: Solr cloud
Reporter: Raintung Li
Assignee: Mark Miller
 Fix For: 4.3, 5.0

 Attachments: patch-4073

   Original Estimate: 168h
  Remaining Estimate: 168h

 One overseer disconnects from ZooKeeper, but its overseer thread still handles 
 request (A) from the DistributedQueue. Example: the overseer thread reconnects 
 to ZooKeeper and tries to remove the top request via workQueue.remove(). 
 Meanwhile, another server takes over the overseer role because the old 
 overseer disconnected. Its overseer thread handles request (A) again, removes 
 it from the queue, then tries to get the new top request (B) but fails to get 
 it: in the meantime the old overseer has reconnected to ZooKeeper and removed 
 the top of the queue. The top request at that point is B, so it is removed by 
 the old overseer server. The new overseer never processes request (B), because 
 it was deleted by the old overseer server, and request (B)'s operations are 
 lost.
 At best, DistributedQueue.peek could return the request's ID so that 
 workQueue.remove(ID) removes that specific request, not the top request.
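The remove-by-ID fix suggested above can be sketched in plain Java. The names here are hypothetical; Solr's real DistributedQueue is ZooKeeper-backed and looks different, but the principle is the same: peek() exposes an ID, and remove(ID) only deletes that specific element.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OverseerQueueDemo {
    static final class QueueItem {
        final String id;
        QueueItem(String id) { this.id = id; }
    }

    // In-memory stand-in for the work queue: peek() exposes the item's id and
    // remove(id) deletes only that item, so a stale overseer cannot pop B.
    static final class IdAwareQueue {
        private final Deque<QueueItem> q = new ArrayDeque<>();
        synchronized void offer(QueueItem item) { q.addLast(item); }
        synchronized QueueItem peek() { return q.peekFirst(); }
        synchronized boolean remove(String id) {
            QueueItem head = q.peekFirst();
            if (head != null && head.id.equals(id)) {
                q.removeFirst();
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        IdAwareQueue q = new IdAwareQueue();
        q.offer(new QueueItem("A"));
        q.offer(new QueueItem("B"));

        QueueItem seenByOld = q.peek();   // old overseer peeks A, then stalls
        QueueItem seenByNew = q.peek();   // new overseer takes over, handles A
        q.remove(seenByNew.id);           // new overseer removes A by id

        // Old overseer wakes up: remove-by-id refuses to delete B by mistake.
        System.out.println(q.remove(seenByOld.id));  // false
        System.out.println(q.peek().id);             // B
    }
}
```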




[jira] [Updated] (SOLR-3706) Ship setup to log with log4j.

2013-03-15 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley updated SOLR-3706:


Attachment: SOLR-3706-solr-log4j.patch

Mark, can you take a look at this?

removes all logging from solr.war and *tries* to copy the files to example/lib

for some reason the log4j dependency does not copy into that folder

 Ship setup to log with log4j.
 -

 Key: SOLR-3706
 URL: https://issues.apache.org/jira/browse/SOLR-3706
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0

 Attachments: SOLR-3706-solr-log4j.patch, SOLR-3706-solr-log4j.patch


 Currently we default to java.util.logging and it's terrible in my opinion.
 * Its simple built-in logger is a 2-line logger.
 * You have to jump through hoops to use your own custom formatter with jetty - 
 either putting your class in the start.jar or other pain-in-the-butt 
 solutions.
 * It can't roll files by date out of the box.
 I'm sure there are more issues, but those are the ones annoying me now. We 
 should switch to log4j - it's much nicer, and it's easy to get a nice 
 single-line format, roll by date, etc.
 If someone wants to use JUL they still can - but at least users could start 
 with something decent.




Re: Solr Log4j?

2013-03-15 Thread Ryan McKinley
logback is great too

I'm happy with any of them -- assuming we stick to SLF4j bindings



On Fri, Mar 15, 2013 at 10:50 AM, Erik Hatcher erik.hatc...@gmail.com wrote:

 How about logback rather than log4j?   
 http://logback.qos.ch/reasonsToSwitch.html

 Erik

 On Mar 15, 2013, at 12:38 , Ryan McKinley wrote:

  We have discussed a few times shipping with or supporting log4j rather
 than defaulting to JUL
 
  I just updated SOLR-3706 to do this.
 
  What are people's thoughts on this issue now?
 
  Thanks
  Ryan
 
 






Re: Solr Log4j?

2013-03-15 Thread Yonik Seeley
On Fri, Mar 15, 2013 at 1:50 PM, Erik Hatcher erik.hatc...@gmail.com wrote:
 How about logback rather than log4j?   
 http://logback.qos.ch/reasonsToSwitch.html

+1, looks promising.

-Yonik
http://lucidworks.com




[jira] [Created] (SOLR-4589) 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time

2013-03-15 Thread Hoss Man (JIRA)
Hoss Man created SOLR-4589:
--

 Summary: 4.x + enableLazyFieldLoading + large multivalued fields + 
varying fl = pathological CPU load & response time
 Key: SOLR-4589
 URL: https://issues.apache.org/jira/browse/SOLR-4589
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.2, 4.1, 4.0
Reporter: Hoss Man


Following up on a [user report of extreme CPU usage in 
4.1|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201302.mbox/%3c1362019882934-4043543.p...@n3.nabble.com%3E],
 I've discovered that the following combination of factors can result in 
extreme CPU usage and excessive HTTP response times...

* Solr 4.x (tested 3.6.1, 4.0.0, and 4.2.0)
* enableLazyFieldLoading == true (included in example solrconfig.xml)
* documents with a large number of values in multivalued fields (eg: tested 
~10-15K values)
* multiple requests returning the same doc with different fl lists

I haven't dug into the root cause yet, but the essential observation is: if 
lazy loading is used in 4.x, then once a document has been fetched with an 
initial fl list X, subsequent requests for that document using a different fl 
list Y can be many orders of magnitude slower (while pegging the CPU) -- even 
if those same requests using fl Y uncached (or w/o lazy loading) would be 
extremely fast.





[jira] [Updated] (SOLR-4589) 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time

2013-03-15 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-4589:
---

Attachment: test-just-queries.sh
test-just-queries.out__4.0.0_mmap_lazy_using36index.txt
test.sh
test.out__4.2.0_nio_nolazy.txt
test.out__4.2.0_nio_lazy.txt
test.out__4.2.0_mmap_nolazy.txt
test.out__4.2.0_mmap_lazy.txt
test.out__4.0.0_nio_nolazy.txt
test.out__4.0.0_nio_lazy.txt
test.out__4.0.0_mmap_nolazy.txt
test.out__4.0.0_mmap_lazy.txt
test.out__3.6.1_nio_nolazy.txt
test.out__3.6.1_nio_lazy.txt
test.out__3.6.1_mmap_nolazy.txt
test.out__3.6.1_mmap_lazy.txt


The attached files include a test.sh script that:
* creates some data where fields have a large number of values
* loads the data into solr
* execs 2 queries for a single doc using two different fl options
* triggers a commit to flush caches
* execs the same two queries in a different order

Also attached are the raw results of running this script on my Thinkpad T430s 
against the example jetty & solr configs, where the version of solr, lazy 
field loading, and the directory impl were varied...

* version of solr
** 3.6.1
** 4.0.0
** 4.2.0
* lazy field loading:
** lazy: default example configs
** nolazy: perl -i -pe 
's{<enableLazyFieldLoading>true}{<enableLazyFieldLoading>false}' solrconfig.xml
* directory impl:
** mmap: java -Dsolr.directoryFactory=solr.MMapDirectoryFactory -jar start.jar
** nio: java -Dsolr.directoryFactory=solr.NIOFSDirectoryFactory -jar start.jar

There was no apparent difference based on the directory impl chosen, or between 
4.0 and 4.2.  Here are the summary results for 3.6 vs 4.0 using mmap...

|| step || 3.6 nolazy || 3.6 lazy || 4.0 nolazy || 4.0 lazy ||
| small fl | 0m0.308s | 0m0.998s | 0m0.260s | 0m0.202s | 
| big fl | 0m0.178s | 0m0.263s | 0m0.084s | *16m15.735s* | 
| commit | XXX | XXX | XXX | XXX |
| big fl | 0m0.157s | 0m0.118s | 0m0.218s | 0m0.133s |
| small fl | 0m0.036s | 0m0.035s | 0m0.049s | *3m2.814s* |

Also attached are the results of a single test I did running Solr 4.0 
pointed at the configs & index built with 3.6.1 to rule out codec changes: it 
behaved essentially the same as the 4.0 tests that built the index from scratch.


 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = 
 pathological CPU load & response time
 

 Key: SOLR-4589
 URL: https://issues.apache.org/jira/browse/SOLR-4589
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0, 4.1, 4.2
Reporter: Hoss Man
 Attachments: test-just-queries.out__4.0.0_mmap_lazy_using36index.txt, 
 test-just-queries.sh, test.out__3.6.1_mmap_lazy.txt, 
 test.out__3.6.1_mmap_nolazy.txt, test.out__3.6.1_nio_lazy.txt, 
 test.out__3.6.1_nio_nolazy.txt, test.out__4.0.0_mmap_lazy.txt, 
 test.out__4.0.0_mmap_nolazy.txt, test.out__4.0.0_nio_lazy.txt, 
 test.out__4.0.0_nio_nolazy.txt, test.out__4.2.0_mmap_lazy.txt, 
 test.out__4.2.0_mmap_nolazy.txt, test.out__4.2.0_nio_lazy.txt, 
 test.out__4.2.0_nio_nolazy.txt, test.sh


 Following up on a [user report of extreme CPU usage in 
 4.1|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201302.mbox/%3c1362019882934-4043543.p...@n3.nabble.com%3E],
  I've discovered that the following combination of factors can result in 
 extreme CPU usage and excessive HTTP response times...
 * Solr 4.x (tested 3.6.1, 4.0.0, and 4.2.0)
 * enableLazyFieldLoading == true (included in example solrconfig.xml)
 * documents with a large number of values in multivalued fields (eg: tested 
 ~10-15K values)
 * multiple requests returning the same doc with different fl lists
 I haven't dug into the root cause yet, but the essential observation is: if 
 lazy loading is used in 4.x, then once a document has been fetched with an 
 initial fl list X, subsequent requests for that document using a different fl 
 list Y can be many orders of magnitude slower (while pegging the CPU) -- even 
 if those same requests using fl Y uncached (or w/o lazy loading) would be 
 extremely fast.




[jira] [Commented] (SOLR-4589) 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time

2013-03-15 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603649#comment-13603649
 ] 

Yonik Seeley commented on SOLR-4589:


I wonder if this could be related to index compression (and maybe the same 
block being repeatedly decompressed for each lazy field being accessed?)

 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = 
 pathological CPU load & response time
 

 Key: SOLR-4589
 URL: https://issues.apache.org/jira/browse/SOLR-4589
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0, 4.1, 4.2
Reporter: Hoss Man
 Attachments: test-just-queries.out__4.0.0_mmap_lazy_using36index.txt, 
 test-just-queries.sh, test.out__3.6.1_mmap_lazy.txt, 
 test.out__3.6.1_mmap_nolazy.txt, test.out__3.6.1_nio_lazy.txt, 
 test.out__3.6.1_nio_nolazy.txt, test.out__4.0.0_mmap_lazy.txt, 
 test.out__4.0.0_mmap_nolazy.txt, test.out__4.0.0_nio_lazy.txt, 
 test.out__4.0.0_nio_nolazy.txt, test.out__4.2.0_mmap_lazy.txt, 
 test.out__4.2.0_mmap_nolazy.txt, test.out__4.2.0_nio_lazy.txt, 
 test.out__4.2.0_nio_nolazy.txt, test.sh


 Following up on a [user report of extreme CPU usage in 
 4.1|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201302.mbox/%3c1362019882934-4043543.p...@n3.nabble.com%3E],
  I've discovered that the following combination of factors can result in 
 extreme CPU usage and excessive HTTP response times...
 * Solr 4.x (tested 3.6.1, 4.0.0, and 4.2.0)
 * enableLazyFieldLoading == true (included in example solrconfig.xml)
 * documents with a large number of values in multivalued fields (eg: tested 
 ~10-15K values)
 * multiple requests returning the same doc with different fl lists
 I haven't dug into the root cause yet, but the essential observation is: if 
 lazy loading is used in 4.x, then once a document has been fetched with an 
 initial fl list X, subsequent requests for that document using a different fl 
 list Y can be many orders of magnitude slower (while pegging the CPU) -- even 
 if those same requests using fl Y uncached (or w/o lazy loading) would be 
 extremely fast.




Re: Solr Log4j?

2013-03-15 Thread Ryan McKinley
Looking at SOLR-3706, there are really two issues that need consensus:

1. The default logging framework shipped in the examples
2. Removing an explicit logging framework from solr.war

I am happy with either log4j or logback for #1




On Fri, Mar 15, 2013 at 11:24 AM, Yonik Seeley yo...@lucidworks.com wrote:

 On Fri, Mar 15, 2013 at 1:50 PM, Erik Hatcher erik.hatc...@gmail.com
 wrote:
  How about logback rather than log4j?   
 http://logback.qos.ch/reasonsToSwitch.html

 +1, looks promising.

 -Yonik
 http://lucidworks.com





[jira] [Commented] (SOLR-4589) 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = pathological CPU load & response time

2013-03-15 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603683#comment-13603683
 ] 

Uwe Schindler commented on SOLR-4589:
-

bq. I wonder if this could be related to index compression (and maybe the same 
block being repeatedly decompressed for each lazy field being accessed?)

This also happens in Solr 4.0, which had no compression.

The reason here might be the changes in stored fields altogether. Lucene 
natively no longer supports lazy field loading, but there is a backwards 
layer just for Solr in modules/misc (LazyDocument.java). The document does not 
use maps for lookup; if you have many fields, it is always a scan through the 
ArrayList of all fields in the document. The laziness in LazyDocument is only 
that loading of the *whole* document is delayed; single fields are no longer 
lazy.
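The cost of that per-field list scan can be illustrated with a self-contained sketch (plain Java, not the actual LazyDocument code). Looking each field up by scanning an ArrayList costs O(n), so touching all n fields of a document costs O(n^2); with the ~10-15K values reported above, that quadratic blowup is enough to peg a CPU, while a one-time map index keeps the work linear.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldLookupDemo {
    static final class Field {
        final String name;
        Field(String name) { this.name = name; }
    }

    // One lookup per field, each lookup scanning the list from the start.
    static long scanComparisons(List<Field> fields) {
        long comparisons = 0;
        for (Field wanted : fields) {
            for (Field f : fields) {
                comparisons++;
                if (f.name.equals(wanted.name)) break;
            }
        }
        return comparisons;            // ~n^2/2 string comparisons in total
    }

    // The same lookups through a one-time index: n map probes in total.
    static long mapProbes(List<Field> fields) {
        Map<String, Field> byName = new HashMap<>();
        for (Field f : fields) byName.put(f.name, f);
        long probes = 0;
        for (Field wanted : fields) {
            byName.get(wanted.name);
            probes++;
        }
        return probes;
    }

    public static void main(String[] args) {
        List<Field> fields = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) fields.add(new Field("f" + i));
        // The repeated scan does orders of magnitude more work.
        System.out.println(scanComparisons(fields) > 100L * mapProbes(fields));
    }
}
```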

 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = 
 pathological CPU load & response time
 

 Key: SOLR-4589
 URL: https://issues.apache.org/jira/browse/SOLR-4589
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0, 4.1, 4.2
Reporter: Hoss Man
 Attachments: test-just-queries.out__4.0.0_mmap_lazy_using36index.txt, 
 test-just-queries.sh, test.out__3.6.1_mmap_lazy.txt, 
 test.out__3.6.1_mmap_nolazy.txt, test.out__3.6.1_nio_lazy.txt, 
 test.out__3.6.1_nio_nolazy.txt, test.out__4.0.0_mmap_lazy.txt, 
 test.out__4.0.0_mmap_nolazy.txt, test.out__4.0.0_nio_lazy.txt, 
 test.out__4.0.0_nio_nolazy.txt, test.out__4.2.0_mmap_lazy.txt, 
 test.out__4.2.0_mmap_nolazy.txt, test.out__4.2.0_nio_lazy.txt, 
 test.out__4.2.0_nio_nolazy.txt, test.sh


 Following up on a [user report of extreme CPU usage in 
 4.1|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201302.mbox/%3c1362019882934-4043543.p...@n3.nabble.com%3E],
  I've discovered that the following combination of factors can result in 
 extreme CPU usage and excessive HTTP response times...
 * Solr 4.x (tested 3.6.1, 4.0.0, and 4.2.0)
 * enableLazyFieldLoading == true (included in example solrconfig.xml)
 * documents with a large number of values in multivalued fields (eg: tested 
 ~10-15K values)
 * multiple requests returning the same doc with different fl lists
 I haven't dug into the root cause yet, but the essential observation is: if 
 lazy loading is used in 4.x, then once a document has been fetched with an 
 initial fl list X, subsequent requests for that document using a different fl 
 list Y can be many orders of magnitude slower (while pegging the CPU) -- even 
 if those same requests using fl Y uncached (or w/o lazy loading) would be 
 extremely fast.




[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr

2013-03-15 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603725#comment-13603725
 ] 

Shawn Heisey commented on SOLR-4586:


Mikhail, I was just going by what a committer told me in IRC.  If that's wrong, 
then the patch shouldn't be applied and this issue can be closed.  I tried the 
patched Solr out after removing maxBooleanClauses from my config, and a 
1500-clause query fails, saying too many clauses.  Dropping that to 1024 allows 
the query to complete.  There were no results found, but it parsed and said 
numFound=0.

If the information about Lucene no longer having such a limitation is correct, 
perhaps Solr's code needs updating?


 Remove maxBooleanClauses from Solr
 --

 Key: SOLR-4586
 URL: https://issues.apache.org/jira/browse/SOLR-4586
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.2
Reporter: Shawn Heisey
 Attachments: SOLR-4586.patch


 In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to 
 someone asking a question about queries.  Mark Miller told me that 
 maxBooleanClauses no longer applies, that the limitation was removed from 
 Lucene sometime in the 3.x series.  The config still shows up in the example 
 even in the just-released 4.2.
 Checking through the source code, I found that the config option is parsed 
 and the value stored in objects, but does not actually seem to be used by 
 anything.  I removed every trace of it that I could find, and all tests still 
 pass.




[jira] [Commented] (LUCENE-4707) Track file reference kept by readers that are opened through the writer

2013-03-15 Thread Jessica Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603737#comment-13603737
 ] 

Jessica Cheng commented on LUCENE-4707:
---

Hi Michael,

I did what you suggested, but I ran into a related problem involving a race 
between the merge and the reader being returned, before I can protect the 
references. In the log trace below, the thread executing IndexWriter.getReader 
gets stalled when maybeMerge is called, at which point the Lucene Merge Thread 
came in and deleted files referred to by the segmentInfos that the getReader 
call had already cloned. Since getReader had not returned yet, those files 
were not protected (incRef'ed), and the Lucene Merge Thread was able to delete 
them. (I'm guessing in this case the file was created and merged within a 
softCommit cycle, so the previous NRT reader/searcher never had a reference to 
it.)

Questions:
1. What's my best way to get around that?
2. How does the OS-level file protection help in this case since the 
segmentInfos are just clone()ed in getReader and the call seems to just copy 
around references and never registered in any way with the directory?

Thanks so much for your help again.

Log:
BD 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: applyDeletes: 
infos=[_41(4.1):C2006, _9p(4.1):C17686, _3s(4.1):C1163, _4d(4.1):c313, 
_3y(4.1):c365, _4b(4.1):c423, _4a(4.1):C881, _4c(4.1):c54, _4f(4.1):c186, 
_4e(4.1):c30, _4g(4.1):c3, _ao(4.1):C3734, _ch(4.1):C3464, _d1(4.1):c708, 
_dk(4.1):c269, _dh(4.1):c36, _di(4.1):c4, _dj(4.1):c47, _dm(4.1):c3, 
_dl(4.1):c1, _dn(4.1):c1, _do(4.1):c1, _dp(4.1):c1, _dq(4.1):c49, _dr(4.1):c15, 
_ds(4.1):c1, _du(4.1):c101, _dv(4.1):c27, _dw(4.1):c3, _dx(4.1):c1, 
_dy(4.1):c1, _e0(4.1):c1, _dz(4.1):c1] packetCount=1\
...
BD 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: seg=_d1(4.1):c708 
segGen=763 coalesced deletes=[CoalescedDeletes(termSets=1,queries=0)] 
newDelCount=0\
...
IW 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: return reader 
version=1013 reader=StandardDirectoryReader(segments_3:1013:nrt _41(4.1):C2006 
_9p(4.1):C17686 _3s(4.1):C1163 _4d(4.1):c313 _3y(4.1):c365 _4b(4.1):c423 
_4a(4.1):C881 _4c(4.1):c54 _4f(4.1):c186 _4e(4.1):c30 _4g(4.1):c3 
_ao(4.1):C3734 _ch(4.1):C3464 _d1(4.1):c708 _dk(4.1):c269 _dh(4.1):c36 
_di(4.1):c4 _dj(4.1):c47 _dm(4.1):c3 _dl(4.1):c1 _dn(4.1):c1 _do(4.1):c1 
_dp(4.1):c1 _dq(4.1):c49 _dr(4.1):c15 _ds(4.1):c1 _du(4.1):c101 _dv(4.1):c27 
_dw(4.1):c3 _dx(4.1):c1 _dy(4.1):c1 _e0(4.1):c1 _dz(4.1):c1)\
DW 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: commitScheduler-333 
finishFullFlush success=true\
TMP 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: findMerges: 33 
segments\
...
TMP 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]:   
seg=_d1(4.1):c708 size=1.144 MB [merging] [floored]\
…
IW 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: registerMerge 
merging= [_4c, _do, _dl, _4f, _41, _dn, _4g, _4d, _di, _4a, _4b, _dh, _3y, _dp
, _4e, _dk, _dj, _d1, _dm, _3s, ]\
...
CMS 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]:   index: 
_41(4.1):C2006 _9p(4.1):C17686 _3s(4.1):C1163 _4d(4.1):c313 _3y(4.1):c365 
_4b(4.1):c423 _4a(4.1):C881 _4c(4.1):c54 _4f(4.1):c186 _4e(4.1):c30 _4g(4.1):c3 
_ao(4.1):C3734 _ch(4.1):C3464 _d1(4.1):c708 _dk(4.1):c269 _dh(4.1):c36 
_di(4.1):c4 _dj(4.1):c47 _dm(4.1):c3 _dl(4.1):c1 _dn(4.1):c1 _do(4.1):c1 
_dp(4.1):c1 _dq(4.1):c49 _dr(4.1):c15 _ds(4.1):c1 _du(4.1):c101 _dv(4.1):c27 
_dw(4.1):c3 _dx(4.1):c1 _dy(4.1):c1 _e0(4.1):c1 _dz(4.1):c1\
CMS 129 [Thu Mar 14 17:41:23 PDT 2013; commitScheduler-333]: too many 
merges; stalling...\
IFD 129 [Thu Mar 14 17:41:24 PDT 2013; Lucene Merge Thread #20]:   DecRef 
_d1.cfs: pre-decr count is 1\
IFD 129 [Thu Mar 14 17:41:24 PDT 2013; Lucene Merge Thread #20]: delete 
_d1.cfs\
...
...at this point commitScheduler-333 tries to incRef _d1.cfs but it's too late.
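The incRef-before-return protection being discussed can be sketched with a plain-Java reference counter (hypothetical classes, not Lucene's actual IndexFileDeleter): as long as the reader's incRef happens under the same lock the deleter's decRef uses, a concurrent decRef finds a nonzero count and cannot delete the file.

```java
import java.util.HashMap;
import java.util.Map;

public class RefCountDemo {
    private final Map<String, Integer> refCounts = new HashMap<>();
    private final Object writerLock = new Object();

    // Reader registers its snapshot's files before the writer lock is released.
    void incRef(String file) {
        synchronized (writerLock) {
            refCounts.merge(file, 1, Integer::sum);
        }
    }

    // Returns true when the last reference is dropped and the file is deleted.
    boolean decRef(String file) {
        synchronized (writerLock) {
            int n = refCounts.merge(file, -1, Integer::sum);
            if (n <= 0) { refCounts.remove(file); return true; }
            return false;
        }
    }

    public static void main(String[] args) {
        RefCountDemo ifd = new RefCountDemo();
        ifd.incRef("_d1.cfs");  // commit point holds a reference
        ifd.incRef("_d1.cfs");  // NRT reader protected *before* getReader returns
        // Merge drops the commit's reference; the reader's reference survives.
        System.out.println(ifd.decRef("_d1.cfs"));  // false: file not deleted
    }
}
```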

 Track file reference kept by readers that are opened through the writer
 ---

 Key: LUCENE-4707
 URL: https://issues.apache.org/jira/browse/LUCENE-4707
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0
 Environment: Mac OS X 10.8.2 and Linux 2.6.32
Reporter: Jessica Cheng

 We ran into a bug where files (mostly CFS) that are still referred to by our 
 NRT reader/searcher are deleted by IndexFileDeleter. As far as I can see from 
 the verbose logging and reading the code, it seems that the problem is the 
 creation and merging of these CFS files between hard commits. The files 
 referred to by hard commits are incRef’ed at commit checkpoints, so these 
 files won’t be deleted until they are decRef’ed when the commit is deleted 
 according to the 

[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr

2013-03-15 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603749#comment-13603749
 ] 

Mark Miller commented on SOLR-4586:
---

It's been a long time, but as far as I remember, this isn't supposed to be a 
problem anymore. 

It's still used to limit BQs in Lucene, but Solr shouldn't be creating those 
large BQs - I think it's possibly a bug if we are. I think for all normal 
cases we should be using the smart multi-term queries that were made to avoid 
this problem?

I'd have to dig to be sure. I also think I remember Shawn saying in IRC that 
he confirmed that no code reads this setting in Solr anymore.
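The clause limit being discussed can be modeled in plain Java (a hypothetical sketch, not Lucene's actual BooleanQuery API): a fixed cap of 1024 accepts a 1024-clause query and rejects a 1500-clause one, matching the behavior Shawn observed.

```java
import java.util.ArrayList;
import java.util.List;

public class MaxClauseDemo {
    // 1024 is Lucene's historical default for maxClauseCount.
    static final int MAX_CLAUSES = 1024;

    // Builds a disjunction of simple term clauses, refusing to exceed the cap.
    static List<String> buildQuery(int clauses) {
        List<String> q = new ArrayList<>();
        for (int i = 0; i < clauses; i++) {
            if (q.size() >= MAX_CLAUSES)
                throw new IllegalStateException("too many clauses: " + clauses);
            q.add("field:term" + i);
        }
        return q;
    }

    public static void main(String[] args) {
        System.out.println(buildQuery(1024).size());  // 1024 clauses accepted
        try {
            buildQuery(1500);                          // rejected at the cap
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```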

 Remove maxBooleanClauses from Solr
 --

 Key: SOLR-4586
 URL: https://issues.apache.org/jira/browse/SOLR-4586
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.2
Reporter: Shawn Heisey
 Attachments: SOLR-4586.patch


 In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to 
 someone asking a question about queries.  Mark Miller told me that 
 maxBooleanClauses no longer applies, that the limitation was removed from 
 Lucene sometime in the 3.x series.  The config still shows up in the example 
 even in the just-released 4.2.
 Checking through the source code, I found that the config option is parsed 
 and the value stored in objects, but does not actually seem to be used by 
 anything.  I removed every trace of it that I could find, and all tests still 
 pass.




[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr

2013-03-15 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603751#comment-13603751
 ] 

Mark Miller commented on SOLR-4586:
---

Basically, the idea is that a user should not need this setting or we still 
have work to do.

 Remove maxBooleanClauses from Solr
 --

 Key: SOLR-4586
 URL: https://issues.apache.org/jira/browse/SOLR-4586
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.2
Reporter: Shawn Heisey
 Attachments: SOLR-4586.patch


 In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to 
 someone asking a question about queries.  Mark Miller told me that 
 maxBooleanClauses no longer applies, that the limitation was removed from 
 Lucene sometime in the 3.x series.  The config still shows up in the example 
 even in the just-released 4.2.
 Checking through the source code, I found that the config option is parsed 
 and the value stored in objects, but does not actually seem to be used by 
 anything.  I removed every trace of it that I could find, and all tests still 
 pass.




[jira] [Updated] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.

2013-03-15 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-4318:
-

Attachment: SOLR-4138.patch

Patch with CHANGES entry as well as code.

 NullPointerException encountered when /select query on solr.TextField.
 --

 Key: SOLR-4318
 URL: https://issues.apache.org/jira/browse/SOLR-4318
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 4.0
Reporter: Junaid Surve
Assignee: Erick Erickson
  Labels: query, select
 Attachments: SOLR-4138.patch, SOLR-4318.patch


 I have two fields, one is title and the other is description in my Solr 
 schema like -
 Type - <fieldType name="text" class="solr.TextField" 
 positionIncrementGap="100"/>
 Declaration - <field name="description" type="text" indexed="true" 
 stored="true"/>
 without any tokenizer or filter.
 On querying /select?q=description:myText it works. However when I add a '*' 
 it fails.
 Failure scenario -
 /select?q=description:*
 /select?q=description:myText*
 .. etc 
 solrconfig.xml - 
 <requestHandler name="/select" class="solr.SearchHandler">
   <lst name="defaults">
     <str name="echoParams">explicit</str>
     <int name="rows">10</int>
     <str name="df">title</str>
   </lst>
 </requestHandler>




[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2013-03-15 Thread Andrew Muldowney (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603758#comment-13603758
 ] 

Andrew Muldowney commented on SOLR-2894:


The issue lies in how the refinement requests were formatted and how they were 
parsed on the shard side. I've made changes that should alleviate this issue, 
and I'll push out a patch soon.

 Implement distributed pivot faceting
 

 Key: SOLR-2894
 URL: https://issues.apache.org/jira/browse/SOLR-2894
 Project: Solr
  Issue Type: Improvement
Reporter: Erik Hatcher
 Fix For: 4.3

 Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch


 Following up on SOLR-792, pivot faceting currently only supports 
 undistributed mode.  Distributed pivot faceting needs to be implemented.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.

2013-03-15 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-4318.
--

   Resolution: Fixed
Fix Version/s: 5.0
   4.3

Trunk r: 1457032
4x r: 1457077

Thanks for reporting this Junaid!

 NullPointerException encountered when /select query on solr.TextField.
 --

 Key: SOLR-4318
 URL: https://issues.apache.org/jira/browse/SOLR-4318
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 4.0
Reporter: Junaid Surve
Assignee: Erick Erickson
  Labels: query, select
 Fix For: 4.3, 5.0

 Attachments: SOLR-4138.patch, SOLR-4318.patch


 I have two fields, one is title and the other is description, in my Solr 
 schema like -
 Type - <fieldType name="text" class="solr.TextField" positionIncrementGap="100"/>
 Declaration - <field name="description" type="text" indexed="true" stored="true"/>
 without any tokenizer or filter.
 On querying /select?q=description:myText it works. However when I add a '*' 
 it fails.
 Failure scenario -
 /select?q=description:*
 /select?q=description:myText*
 .. etc 
 solrconfig.xml - 
 <requestHandler name="/select" class="solr.SearchHandler">
   <lst name="defaults">
     <str name="echoParams">explicit</str>
     <int name="rows">10</int>
     <str name="df">title</str>
   </lst>
 </requestHandler>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr

2013-03-15 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603762#comment-13603762
 ] 

Shawn Heisey commented on SOLR-4586:


bq. I'd have to dig to be sure. I also thought I remember shawn saying in irc 
that he confirmed that no code was reading this setting in solr anymore.

I was wrong about the value not actually being used anywhere.  I think that can 
be attributed to not grokking Lucene internals and having only a short history 
with Java.  I have since located the following bit of code that is removed from 
SolrCore.java by my patch.  At the time it didn't look like anything important.

{code}
BooleanQuery.setMaxClauseCount(boolean_query_max_clause_count);
{code}
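The reason a single config value can matter even when "nothing seems to use it" is that the limit Shawn's patch touched is a JVM-wide static. A minimal self-contained sketch of how such a limit behaves (hypothetical `ClauseLimit` class, not Lucene's actual `BooleanQuery` implementation):

```java
// Hypothetical sketch of a global clause-count limit, in the spirit of
// BooleanQuery.setMaxClauseCount; class and method names are illustrative.
final class ClauseLimit {
    // One JVM-wide limit shared by every query -- which is why a value
    // parsed from solrconfig.xml but fed in by a single line of code can
    // silently cap unrelated queries.
    private static int maxClauseCount = 1024;

    static void setMaxClauseCount(int max) {
        if (max < 1) throw new IllegalArgumentException("max must be >= 1");
        maxClauseCount = max;
    }

    static int getMaxClauseCount() {
        return maxClauseCount;
    }

    // Checked as clauses are added; throws once the limit is exceeded.
    static void checkClauseCount(int numClauses) {
        if (numClauses > maxClauseCount) {
            throw new IllegalStateException(
                "too many boolean clauses: " + numClauses + " > " + maxClauseCount);
        }
    }
}
```

With the default of 1024 no longer raised by the removed config line, a 1500-clause query trips the check, matching the failure quoted later in this thread.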


 Remove maxBooleanClauses from Solr
 --

 Key: SOLR-4586
 URL: https://issues.apache.org/jira/browse/SOLR-4586
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.2
Reporter: Shawn Heisey
 Attachments: SOLR-4586.patch


 In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to 
 someone asking a question about queries.  Mark Miller told me that 
 maxBooleanClauses no longer applies, that the limitation was removed from 
 Lucene sometime in the 3.x series.  The config still shows up in the example 
 even in the just-released 4.2.
 Checking through the source code, I found that the config option is parsed 
 and the value stored in objects, but does not actually seem to be used by 
 anything.  I removed every trace of it that I could find, and all tests still 
 pass.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4832) Unbounded getTopGroups for ToParentBlockJoinCollector

2013-03-15 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603763#comment-13603763
 ] 

Michael McCandless commented on LUCENE-4832:


Hi Aleksey, reducing that method size would be nice!  Can we just make it a new 
method (accumulate is good), instead of a new class?  (And also the 
Integer.MAX_VALUE fix).  I think this will be a good improvement...

 Unbounded getTopGroups for ToParentBlockJoinCollector
 -

 Key: LUCENE-4832
 URL: https://issues.apache.org/jira/browse/LUCENE-4832
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/join
Reporter: Aleksey Aleev
 Attachments: LUCENE-4832.patch


 _ToParentBlockJoinCollector#getTopGroups_ method takes several arguments:
 {code:java}
 public TopGroups<Integer> getTopGroups(ToParentBlockJoinQuery query, 
Sort withinGroupSort,
int offset,
int maxDocsPerGroup,
int withinGroupOffset,
boolean fillSortFields)
 {code}
 and one of them is {{maxDocsPerGroup}} which specifies upper bound of child 
 documents number returned within each group. 
 {{ToParentBlockJoinCollector}} collects and caches all child documents 
 matched by given {{ToParentBlockJoinQuery}} in {{OneGroup}} objects during 
 search so it is possible to create {{GroupDocs}} with all matched child 
 documents instead of part of them bounded by {{maxDocsPerGroup}}.
 When you specify {{maxDocsPerGroup}}, new queues (I mean 
 {{TopScoreDocCollector}}/{{TopFieldCollector}}) are created for each 
 group, each sized to {{maxDocsPerGroup}} objects, which can lead to 
 redundant memory allocation when the number of child documents within a 
 group is less than {{maxDocsPerGroup}}.
 I suppose that there are many cases where you need to get all child documents 
 matched by the query, so it would be nice to have the ability to get top 
 groups with all matched child documents without unnecessary memory allocation. 
 A possible solution is to pass a negative {{maxDocsPerGroup}} when you need 
 all matched child documents within each group, and to check the 
 {{maxDocsPerGroup}} value: if it is negative, create the queue with the 
 number of matched child documents as its size; otherwise create the queue 
 with size equal to {{maxDocsPerGroup}}. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr

2013-03-15 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603770#comment-13603770
 ] 

Mark Miller commented on SOLR-4586:
---

Yup - that's the one.

I tried finding a JIRA issue I was involved in from years ago about this 
setting, but couldn't dig it/them up.

We worked hard to limit the problems it was causing Lucene and Solr users. I 
think it's kind of a crappy setting, always have, and it used to be a very 
common pain point before things got better.

Anywhere sane should be using multi-term queries that switch over to constant 
score and don't have this limitation.

What did you do to trip this, Shawn?

 Remove maxBooleanClauses from Solr
 --

 Key: SOLR-4586
 URL: https://issues.apache.org/jira/browse/SOLR-4586
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.2
Reporter: Shawn Heisey
 Attachments: SOLR-4586.patch


 In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to 
 someone asking a question about queries.  Mark Miller told me that 
 maxBooleanClauses no longer applies, that the limitation was removed from 
 Lucene sometime in the 3.x series.  The config still shows up in the example 
 even in the just-released 4.2.
 Checking through the source code, I found that the config option is parsed 
 and the value stored in objects, but does not actually seem to be used by 
 anything.  I removed every trace of it that I could find, and all tests still 
 pass.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4318) NullPointerException encountered when /select query on solr.TextField.

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603779#comment-13603779
 ] 

Commit Tag Bot commented on SOLR-4318:
--

[branch_4x commit] Erick Erickson
http://svn.apache.org/viewvc?view=revision&revision=1457077

SOLR-4318 NPE when doing a wildcard query on a TextField with the default 
analysis chain


 NullPointerException encountered when /select query on solr.TextField.
 --

 Key: SOLR-4318
 URL: https://issues.apache.org/jira/browse/SOLR-4318
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 4.0
Reporter: Junaid Surve
Assignee: Erick Erickson
  Labels: query, select
 Fix For: 4.3, 5.0

 Attachments: SOLR-4138.patch, SOLR-4318.patch


 I have two fields, one is title and the other is description, in my Solr 
 schema like -
 Type - <fieldType name="text" class="solr.TextField" positionIncrementGap="100"/>
 Declaration - <field name="description" type="text" indexed="true" stored="true"/>
 without any tokenizer or filter.
 On querying /select?q=description:myText it works. However when I add a '*' 
 it fails.
 Failure scenario -
 /select?q=description:*
 /select?q=description:myText*
 .. etc 
 solrconfig.xml - 
 <requestHandler name="/select" class="solr.SearchHandler">
   <lst name="defaults">
     <str name="echoParams">explicit</str>
     <int name="rows">10</int>
     <str name="df">title</str>
   </lst>
 </requestHandler>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr

2013-03-15 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603784#comment-13603784
 ] 

Shawn Heisey commented on SOLR-4586:


bq. What did you do to trip this Shawn?

I was just warning someone on IRC about the existence of the 1024-clause limit, 
then you mentioned it doesn't exist any more.  After that we discussed whether 
or not to remove it from Solr.

And then there's us escaping now. -- Wheatley


 Remove maxBooleanClauses from Solr
 --

 Key: SOLR-4586
 URL: https://issues.apache.org/jira/browse/SOLR-4586
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.2
Reporter: Shawn Heisey
 Attachments: SOLR-4586.patch


 In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to 
 someone asking a question about queries.  Mark Miller told me that 
 maxBooleanClauses no longer applies, that the limitation was removed from 
 Lucene sometime in the 3.x series.  The config still shows up in the example 
 even in the just-released 4.2.
 Checking through the source code, I found that the config option is parsed 
 and the value stored in objects, but does not actually seem to be used by 
 anything.  I removed every trace of it that I could find, and all tests still 
 pass.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3857) DIH: SqlEntityProcessor with simple cache broken

2013-03-15 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-3857:
-

Attachment: SOLR-3857.patch

Here is a working patch based on the fix Sudheer Prem suggested on SOLR-4561.  
All tests pass and it restores pre-3.6 functionality.

The way this feature works (and always has) is by creating a new cache for 
every key.  If using the default cache impl, this means a 1-element SortedMap 
in memory in addition to your data.  In addition all of these 1-element caches 
are kept in a map, keyed by the query text with tokens replaced.  This is why 
Sudheer's fix needs to replace tokens first and then see if the cache exists 
second, because each version of the query gets its own cache.  Using 
SortedMapBackedCache (the default), this is merely a memory waste (and possibly 
a net gain if you are caching far less data).  But the point of the recent 
cache refactorings is to allow for pluggable cache implementations, including 
those that persist data to disk.  Clearly this behavior is not going to work 
for the general case.

While the way it ought to work is easy to conceptualize, the DIH structure 
doesn't make it easy.  The query's tokens get replaced several calls up the 
stack from the cache layer.

Those who want this functionality can apply and build with this patch.  But 
perhaps a better way is simply to put a subselect in your child entity query.  
For instance:

{code:xml}
<entity name="parent" query="SELECT * FROM PARENT" pk="ID">
  <entity name="child" cacheImpl="SortedMapBackedCache" 
          query="SELECT * FROM CHILD WHERE CHILD_ID IN (SELECT CHILD_ID FROM PARENT)" />
</entity>
{code} 

Although this does not give you lazy loading, it does cause only the needed 
data to be cached.
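The "one tiny cache per resolved query" behavior James describes - tokens replaced first, then the lookup, because each resolved variant of the query owns its own cache - can be sketched roughly as follows (hypothetical `SimpleQueryCache` class, not DIH's actual code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Rough sketch of the "simple cache": one small cache per resolved query
// string, all held in an outer map keyed by the query text with tokens
// replaced.  Names are hypothetical; this is not DIH's implementation.
final class SimpleQueryCache {
    // Outer map: resolved query text -> that query's own tiny cache of rows.
    private final Map<String, List<Map<String, Object>>> cachePerQuery = new HashMap<>();

    // Tokens (e.g. ${x.id}) must be replaced *before* the lookup, because
    // each resolved variant of the query gets its own cache entry.
    String resolve(String template, Map<String, String> row) {
        String q = template;
        for (Map.Entry<String, String> e : row.entrySet()) {
            q = q.replace("${" + e.getKey() + "}", e.getValue());
        }
        return q;
    }

    List<Map<String, Object>> get(String resolvedQuery) {
        return cachePerQuery.get(resolvedQuery);
    }

    void put(String resolvedQuery, List<Map<String, Object>> rows) {
        cachePerQuery.put(resolvedQuery, new ArrayList<>(rows));
    }
}
```

With an in-memory map this is merely wasteful, as noted above; a disk-backed cache implementation would multiply files, which is why the behavior cannot carry over to the general pluggable-cache case.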

 DIH: SqlEntityProcessor with simple cache broken
 --

 Key: SOLR-3857
 URL: https://issues.apache.org/jira/browse/SOLR-3857
 Project: Solr
  Issue Type: Bug
Affects Versions: 3.6.1, 4.0-BETA
Reporter: James Dyer
 Attachments: SOLR-3857.patch


 The wiki describes a usage of CachedSqlEntityProcessor like this:
 {code:xml}
 <entity name="y" query="select * from y where xid=${x.id}" 
  processor="CachedSqlEntityProcessor"/>
 This creates what the code refers to as a simple cache.  Rather than build 
 the entire cache up-front, the cache is built on the go.  I think this has 
 limited use cases but it would be nice to preserve the feature if possible.
 Unfortunately this was not included in any (effective) unit tests, and 
 SOLR-2382 entirely broke the functionality for 3.6/4.0-alpha+ .  At a first 
 glance, the fix may not be entirely straightforward.
 This was found while writing tests for SOLR-3856.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4590) Collections API should return a nice error when not in SolrCloud mode.

2013-03-15 Thread Mark Miller (JIRA)
Mark Miller created SOLR-4590:
-

 Summary: Collections API should return a nice error when not in 
SolrCloud mode.
 Key: SOLR-4590
 URL: https://issues.apache.org/jira/browse/SOLR-4590
 Project: Solr
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.3, 5.0




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4707) Track file reference kept by readers that are opened through the writer

2013-03-15 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603787#comment-13603787
 ] 

Michael McCandless commented on LUCENE-4707:


Hi Jessica,

How about changing the approach in your Directory wrapper: instead of 
incRef'ing when you get an NRT reader, incRef whenever openInput is called, and 
refuse to delete the file if it's still held open by anything (throw an 
IOException in deleteFile: IndexWriter catches this and will retry the 
deletion later).

This will make Unix behave like Windows, i.e. still-open files cannot be deleted.

I think that should fix this race condition, because the NRT reader must first 
openInput all the files it uses ...
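The bookkeeping this suggests - incRef on every openInput, decRef on close, refuse deletion while the count is nonzero so the writer retries later - can be sketched with a plain ref-count table (hypothetical `OpenFileRefs` class, pure JDK, not Lucene's Directory API):

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Sketch of a Directory wrapper's bookkeeping: incRef on every openInput,
// decRef on close, and refuse deleteFile while the count is nonzero so the
// writer's deleter retries later (as it already does when Windows refuses
// a delete).  Hypothetical class; not Lucene's actual Directory API.
final class OpenFileRefs {
    private final Map<String, Integer> refs = new HashMap<>();

    synchronized void onOpenInput(String name) {
        refs.merge(name, 1, Integer::sum);
    }

    synchronized void onClose(String name) {
        // Decrement; drop the entry entirely once the count reaches zero.
        refs.computeIfPresent(name, (n, c) -> c > 1 ? c - 1 : null);
    }

    // Called from deleteFile: throwing makes the deleter retry later.
    synchronized void checkDeletable(String name) throws IOException {
        Integer c = refs.get(name);
        if (c != null && c > 0) {
            throw new IOException("still open: " + name + " (refCount=" + c + ")");
        }
    }
}
```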

 Track file reference kept by readers that are opened through the writer
 ---

 Key: LUCENE-4707
 URL: https://issues.apache.org/jira/browse/LUCENE-4707
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0
 Environment: Mac OS X 10.8.2 and Linux 2.6.32
Reporter: Jessica Cheng

 We ran into a bug where files (mostly CFS) that are still referred to by our 
 NRT reader/searcher are deleted by IndexFileDeleter. As far as I can see from 
 the verbose logging and reading the code, it seems that the problem is the 
 creation and merging of these CFS files between hard commits. The files 
 referred to by hard commits are incRef’ed at commit checkpoints, so these 
 files won’t be deleted until they are decRef’ed when the commit is deleted 
 according to the DeletionPolicy (good). However, intermediate files that are 
 created and merged between the hard commits only have refs through the 
 regular checkpoints, so as soon as a new checkpoint no longer includes those 
 files, they are immediately deleted by the deleter. See the abridged verbose 
 log lines that illustrate this behavior:
 IW 11 [Mon Jan 21 17:30:35 PST 2013; commitScheduler]: create compound file 
 _8.cfs
 IFD 7 [Mon Jan 21 17:23:41 PST 2013; commitScheduler]: now checkpoint 
 _0(4.0.0.2):C3 _1(4.0.0.2):C7 _2(4.0.0.2):C16 _3(4.0.0.2):C21 _4(4.0.0.2):C5 
 _5(4.0.0.2):C5 _6(4.0.0.2):C5 _7(4.0.0.2):C7 _8(4.0.0.2):c6 [9 segments ; 
 isCommit = false]
 IFD 7 [Mon Jan 21 17:23:41 PST 2013; commitScheduler]:   IncRef _8.cfs: 
 pre-incr count is 0
 IFD 7 [Mon Jan 21 17:23:42 PST 2013; commitScheduler]: now checkpoint 
 _0(4.0.0.2):C3 _1(4.0.0.2):C7 _2(4.0.0.2):C16 _3(4.0.0.2):C21 _4(4.0.0.2):C5 
 _5(4.0.0.2):C5 _6(4.0.0.2):C5 _7(4.0.0.2):C7 _8(4.0.0.2):c6 _9(4.0.0.2):c6 
 [10 segments ; isCommit = false]
 IFD 7 [Mon Jan 21 17:23:42 PST 2013; commitScheduler]:   IncRef _8.cfs: 
 pre-incr count is 1
 IFD 7 [Mon Jan 21 17:23:42 PST 2013; commitScheduler]:   DecRef _8.cfs: 
 pre-decr count is 2
 IFD 7 [Mon Jan 21 17:23:42 PST 2013; Lucene Merge Thread #0]: now checkpoint 
 _b(4.0.0.2):C81 [1 segments ; isCommit = false]
 IFD 7 [Mon Jan 21 17:23:42 PST 2013; Lucene Merge Thread #0]:   DecRef 
 _8.cfs: pre-decr count is 1
 IFD 7 [Mon Jan 21 17:23:42 PST 2013; Lucene Merge Thread #0]: delete _8.cfs
 With this behavior, it seems no matter how frequently we refresh the reader 
 (unless we do it at every read), we’d run into the race where the reader 
 still holds a reference to the file that’s just been deleted by the deleter. 
 My proposal is to count the file reference handed out to the NRT 
 reader/searcher when writer.getReader(boolean) is called and decRef the files 
 only when the said reader is closed.
 Please take a look and evaluate if my observations are correct and if the 
 proposal makes sense. Thanks!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4539) Consistently failing seed for SyncSliceTest

2013-03-15 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603793#comment-13603793
 ] 

Mark Miller commented on SOLR-4539:
---

This seems to be a problem with ephemeral dir factories - I'm guessing the log 
clearing is still not quite working right.

 Consistently failing seed for SyncSliceTest
 ---

 Key: SOLR-4539
 URL: https://issues.apache.org/jira/browse/SOLR-4539
 Project: Solr
  Issue Type: Bug
Reporter: Shawn Heisey
Assignee: Mark Miller
 Fix For: 4.3, 5.0


 http://mail-archives.us.apache.org/mod_mbox/lucene-dev/201303.mbox/%3c513933dd.5000...@elyograg.org%3E
 {quote}
 [junit4:junit4]   2 NOTE: reproduce with: ant test  -Dtestcase=SyncSliceTest 
 -Dtests.method=testDistribSearch -Dtests.seed=1D1206F80A77FE6F 
 -Dtests.nightly=true -Dtests.weekly=true -Dtests.slow=true 
 -Dtests.locale=ar_LY -Dtests.timezone=BET -Dtests.file.encoding=UTF-8
 [junit4:junit4] FAILURE  109s | SyncSliceTest.testDistribSearch 
 [junit4:junit4] Throwable #1: java.lang.AssertionError: shard1 is not 
 consistent.  Got 305 from http://127.0.0.1:44083/collection1lastClient and 
 got 5 from http://127.0.0.1:43445/collection1
 [junit4:junit4]at 
 __randomizedtesting.SeedInfo.seed([1D1206F80A77FE6F:9CF488E07D289E53]:0)
 [junit4:junit4]at org.junit.Assert.fail(Assert.java:93)
 [junit4:junit4]at 
 org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:963)
 [junit4:junit4]at 
 org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:234)
 [junit4:junit4]at 
 org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:806)
 {quote}
 (issue files by Hoss on Shawn's behalf so we don't lose track of it)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4586) Remove maxBooleanClauses from Solr

2013-03-15 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603795#comment-13603795
 ] 

Mark Miller commented on SOLR-4586:
---

I mean this:

bq. I tried the patched Solr out after removed maxBooleanClauses from my 
config, and a 1500-clause query fails, saying too many clauses.

 Remove maxBooleanClauses from Solr
 --

 Key: SOLR-4586
 URL: https://issues.apache.org/jira/browse/SOLR-4586
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.2
Reporter: Shawn Heisey
 Attachments: SOLR-4586.patch


 In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to 
 someone asking a question about queries.  Mark Miller told me that 
 maxBooleanClauses no longer applies, that the limitation was removed from 
 Lucene sometime in the 3.x series.  The config still shows up in the example 
 even in the just-released 4.2.
 Checking through the source code, I found that the config option is parsed 
 and the value stored in objects, but does not actually seem to be used by 
 anything.  I removed every trace of it that I could find, and all tests still 
 pass.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-03-15 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-4795.


   Resolution: Fixed
Fix Version/s: 4.3
   5.0

 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, 4.3

 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, 
 LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, 
 pleaseBenchmarkMe.patch


 Recently (LUCENE-4765) we added multi-valued DocValues field
 (SortedSetDocValuesField), and this can be used for faceting in Solr
 (SOLR-4490).  I think we should also add support in the facet module?
 It'd be an option with different tradeoffs.  Eg, it wouldn't require
 the taxonomy index, since the main index handles label/ord resolving.
 There are at least two possible approaches:
   * On every reopen, build the seg - global ord map, and then on
 every collect, get the seg ord, map it to the global ord space,
 and increment counts.  This adds cost during reopen in proportion
 to number of unique terms ...
   * On every collect, increment counts based on the seg ords, and then
 do a merge in the end just like distributed faceting does.
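The first approach above (build a segment-to-global ordinal map on reopen, then bump a global counter per collect) can be sketched with plain arrays. This is a simplification under a stated assumption: a real global ord map merges the per-segment term dictionaries so that shared terms get one global ord, whereas this hypothetical `GlobalOrdCounts` sketch simply concatenates the segment ranges.

```java
// Sketch of the first approach: at reopen, compute a per-segment offset
// into a combined (global) ordinal space; at collect time, increment the
// global counter for each segment ordinal.  Hypothetical and simplified:
// real code must merge per-segment term dictionaries, not concatenate them.
final class GlobalOrdCounts {
    private final int[] segOffsets;  // segOffsets[seg] = first global ord of seg
    private final int[] counts;      // one slot per global ord

    GlobalOrdCounts(int[] uniqueTermsPerSegment) {
        segOffsets = new int[uniqueTermsPerSegment.length];
        int total = 0;
        for (int i = 0; i < uniqueTermsPerSegment.length; i++) {
            segOffsets[i] = total;
            total += uniqueTermsPerSegment[i];
        }
        counts = new int[total];
    }

    // Per-collect work is one add and one array increment, which is why the
    // reopen-time map building dominates the cost of this approach.
    void collect(int segment, int segOrd) {
        counts[segOffsets[segment] + segOrd]++;
    }

    int count(int globalOrd) {
        return counts[globalOrd];
    }
}
```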
 The first approach is much easier so I built a quick prototype using
 that.  The prototype does the counting, but it does NOT do the top K
 facets gathering in the end, and it doesn't know parent/child ord
 relationships, so there's tons more to do before this is real.  I also
 was unsure how to properly integrate it since the existing classes
 seem to expect that you use a taxonomy index to resolve ords.
 I ran a quick performance test.  base = trunk except I disabled the
 compute top-K in FacetsAccumulator to make the comparison fair; comp
 = using the prototype collector in the patch:
 {noformat}
              Task       QPS base  StdDev   QPS comp  StdDev             Pct diff
         OrHighLow          18.79  (2.5%)      14.36  (3.3%)   -23.6% ( -28% - -18%)
          HighTerm          21.58  (2.4%)      16.53  (3.7%)   -23.4% ( -28% - -17%)
         OrHighMed          18.20  (2.5%)      13.99  (3.3%)   -23.2% ( -28% - -17%)
           Prefix3          14.37  (1.5%)      11.62  (3.5%)   -19.1% ( -23% - -14%)
           LowTerm         130.80  (1.6%)     106.95  (2.4%)   -18.2% ( -21% - -14%)
        OrHighHigh           9.60  (2.6%)       7.88  (3.5%)   -17.9% ( -23% - -12%)
       AndHighHigh          24.61  (0.7%)      20.74  (1.9%)   -15.7% ( -18% - -13%)
            Fuzzy1          49.40  (2.5%)      43.48  (1.9%)   -12.0% ( -15% -  -7%)
   MedSloppyPhrase          27.06  (1.6%)      23.95  (2.3%)   -11.5% ( -15% -  -7%)
           MedTerm          51.43  (2.0%)      46.21  (2.7%)   -10.2% ( -14% -  -5%)
            IntNRQ           4.02  (1.6%)       3.63  (4.0%)    -9.7% ( -15% -  -4%)
          Wildcard          29.14  (1.5%)      26.46  (2.5%)    -9.2% ( -13% -  -5%)
  HighSloppyPhrase           0.92  (4.5%)       0.87  (5.8%)    -5.4% ( -15% -   5%)
       MedSpanNear          29.51  (2.5%)      27.94  (2.2%)    -5.3% (  -9% -   0%)
      HighSpanNear           3.55  (2.4%)       3.38  (2.0%)    -4.9% (  -9% -   0%)
        AndHighMed         108.34  (0.9%)     104.55  (1.1%)    -3.5% (  -5% -  -1%)
   LowSloppyPhrase          20.50  (2.0%)      20.09  (4.2%)    -2.0% (  -8% -   4%)
         LowPhrase          21.60  (6.0%)      21.26  (5.1%)    -1.6% ( -11% -  10%)
            Fuzzy2          53.16  (3.9%)      52.40  (2.7%)    -1.4% (  -7% -   5%)
       LowSpanNear           8.42  (3.2%)       8.45  (3.0%)     0.3% (  -5% -   6%)
           Respell          45.17  (4.3%)      45.38  (4.4%)     0.5% (  -7% -   9%)
         MedPhrase         113.93  (5.8%)     115.02  (4.9%)     1.0% (  -9% -  12%)
        AndHighLow         596.42  (2.5%)     617.12  (2.8%)     3.5% (  -1% -   8%)
        HighPhrase          17.30 (10.5%)      18.36  (9.1%)     6.2% ( -12% -  28%)
 {noformat}
 I'm impressed that this approach is only ~24% slower in the worst
 case!  I think this means it's a good option to make available?  Yes
 it has downsides (NRT reopen more costly, small added RAM usage,
 slightly slower faceting), 

[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-03-15 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603796#comment-13603796
 ] 

Commit Tag Bot commented on LUCENE-4795:


[trunk commit] Michael McCandless
http://svn.apache.org/viewvc?view=revision&revision=1457092

LUCENE-4795: add new facet method to facet from SortedSetDocValues without 
using taxonomy index


 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, 4.3

 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, 
 LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, 
 pleaseBenchmarkMe.patch


 Recently (LUCENE-4765) we added multi-valued DocValues field
 (SortedSetDocValuesField), and this can be used for faceting in Solr
 (SOLR-4490).  I think we should also add support in the facet module?
 It'd be an option with different tradeoffs.  Eg, it wouldn't require
 the taxonomy index, since the main index handles label/ord resolving.
 There are at least two possible approaches:
   * On every reopen, build the seg - global ord map, and then on
 every collect, get the seg ord, map it to the global ord space,
 and increment counts.  This adds cost during reopen in proportion
 to number of unique terms ...
   * On every collect, increment counts based on the seg ords, and then
 do a merge in the end just like distributed faceting does.
 The first approach is much easier so I built a quick prototype using
 that.  The prototype does the counting, but it does NOT do the top K
 facets gathering in the end, and it doesn't know parent/child ord
 relationships, so there's tons more to do before this is real.  I also
 was unsure how to properly integrate it since the existing classes
 seem to expect that you use a taxonomy index to resolve ords.
 I ran a quick performance test.  base = trunk except I disabled the
 compute top-K in FacetsAccumulator to make the comparison fair; comp
 = using the prototype collector in the patch:
 {noformat}
              Task       QPS base  StdDev   QPS comp  StdDev             Pct diff
         OrHighLow          18.79  (2.5%)      14.36  (3.3%)   -23.6% ( -28% - -18%)
          HighTerm          21.58  (2.4%)      16.53  (3.7%)   -23.4% ( -28% - -17%)
         OrHighMed          18.20  (2.5%)      13.99  (3.3%)   -23.2% ( -28% - -17%)
           Prefix3          14.37  (1.5%)      11.62  (3.5%)   -19.1% ( -23% - -14%)
           LowTerm         130.80  (1.6%)     106.95  (2.4%)   -18.2% ( -21% - -14%)
        OrHighHigh           9.60  (2.6%)       7.88  (3.5%)   -17.9% ( -23% - -12%)
       AndHighHigh          24.61  (0.7%)      20.74  (1.9%)   -15.7% ( -18% - -13%)
            Fuzzy1          49.40  (2.5%)      43.48  (1.9%)   -12.0% ( -15% -  -7%)
   MedSloppyPhrase          27.06  (1.6%)      23.95  (2.3%)   -11.5% ( -15% -  -7%)
           MedTerm          51.43  (2.0%)      46.21  (2.7%)   -10.2% ( -14% -  -5%)
            IntNRQ           4.02  (1.6%)       3.63  (4.0%)    -9.7% ( -15% -  -4%)
          Wildcard          29.14  (1.5%)      26.46  (2.5%)    -9.2% ( -13% -  -5%)
  HighSloppyPhrase           0.92  (4.5%)       0.87  (5.8%)    -5.4% ( -15% -   5%)
       MedSpanNear          29.51  (2.5%)      27.94  (2.2%)    -5.3% (  -9% -   0%)
      HighSpanNear           3.55  (2.4%)       3.38  (2.0%)    -4.9% (  -9% -   0%)
        AndHighMed         108.34  (0.9%)     104.55  (1.1%)    -3.5% (  -5% -  -1%)
   LowSloppyPhrase          20.50  (2.0%)      20.09  (4.2%)    -2.0% (  -8% -   4%)
         LowPhrase          21.60  (6.0%)      21.26  (5.1%)    -1.6% ( -11% -  10%)
            Fuzzy2          53.16  (3.9%)      52.40  (2.7%)    -1.4% (  -7% -   5%)
       LowSpanNear           8.42  (3.2%)       8.45  (3.0%)     0.3% (  -5% -   6%)
           Respell          45.17  (4.3%)      45.38  (4.4%)     0.5% (  -7% -   9%)
         MedPhrase         113.93  (5.8%)     115.02  (4.9%)     1.0% (  -9% -  12%)
        AndHighLow         596.42  (2.5%)     617.12  (2.8%)     3.5% (  -1% -   8%)
        HighPhrase          17.30 (10.5%)      18.36  (9.1%)     6.2% ( -12% -  28%)
 {noformat}
 I'm impressed that this approach is only ~24% slower 
