Re: [VOTE] Release PyLucene 4.4.0-1

2013-08-22 Thread Eric Hall
On Tue, Aug 20, 2013 at 07:06:06AM -0400, Michael McCandless wrote:
 +1 to release; I smoke tested by indexing the first 100K docs of a
 Wikipedia English export and running a few searches, on OS X.
 

I was able to build PyLucene 4.4.0-1 on OS X 10.8.4 with the
system Python and Java 1.6; all tests pass.  I was also able to
build it with a MacPorts-installed Python 2.7.5 after making one
adjustment to the MacPorts Python, and all tests passed with that as well.


-eric



Re: [NAG][VOTE] Release PyLucene 4.4.0-1

2013-08-22 Thread Steve Rowe
+1

'make test' succeeds for me on OS X 10.8.4 with stock Python 2.7.

However, when I enable either the smartcn or the spatial contrib by uncommenting 
the lines that add them to JARS in the Makefile, the PyLucene build fails:

== Smartcn enabled ==

/usr/bin/python -m jcc --shared --arch x86_64 --jar 
lucene-java-4.4.0/lucene/build/core/lucene-core-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/analysis/common/lucene-analyzers-common-4.4.0.jar
 --jar lucene-java-4.4.0/lucene/build/memory/lucene-memory-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/highlighter/lucene-highlighter-4.4.0.jar --jar 
build/jar/extensions.jar --jar 
lucene-java-4.4.0/lucene/build/queries/lucene-queries-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/queryparser/lucene-queryparser-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/sandbox/lucene-sandbox-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/analysis/smartcn/lucene-analyzers-smartcn-4.4.0.jar
 --jar 
lucene-java-4.4.0/lucene/build/analysis/stempel/lucene-analyzers-stempel-4.4.0.jar
 --jar lucene-java-4.4.0/lucene/build/grouping/lucene-grouping-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/join/lucene-join-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/facet/lucene-facet-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/suggest/lucene-suggest-4.4.0.jar --include 
lucene-java-4.4.0/lucene/build/misc/lucene-misc-4.4.0.jar  --use_full_names 
--package java.lang java.lang.System java.lang.Runtime --package java.util 
java.util.Arrays java.util.Collections java.util.HashMap java.util.HashSet 
java.util.TreeSet java.lang.IllegalStateException 
java.lang.IndexOutOfBoundsException java.util.NoSuchElementException 
java.text.SimpleDateFormat java.text.DecimalFormat java.text.Collator --package 
java.util.concurrent java.util.concurrent.Executors --package java.util.regex 
--package java.io java.io.StringReader java.io.InputStreamReader 
java.io.FileInputStream java.io.DataInputStream --exclude 
org.apache.lucene.sandbox.queries.regex.JakartaRegexpCapabilities --exclude 
org.apache.regexp.RegexpTunnel --python lucene --mapping 
org.apache.lucene.document.Document 
'get:(Ljava/lang/String;)Ljava/lang/String;' --mapping java.util.Properties 
'getProperty:(Ljava/lang/String;)Ljava/lang/String;' --sequence 
java.util.AbstractList 'size:()I' 'get:(I)Ljava/lang/Object;' 
org.apache.lucene.index.IndexWriter:getReader --version 4.4.0 --module 
python/collections.py --module python/ICUNormalizer2Filter.py --module 
python/ICUFoldingFilter.py --module python/ICUTransformFilter.py  --files 8 
--build 
While loading org/apache/lucene/analysis/cn/smart/AnalyzerProfile
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Library/Python/2.7/site-packages/JCC-2.17-py2.7-macosx-10.8-intel.egg/jcc/__main__.py", line 107, in <module>
    cpp.jcc(sys.argv)
  File "/Library/Python/2.7/site-packages/JCC-2.17-py2.7-macosx-10.8-intel.egg/jcc/cpp.py", line 583, in jcc
    cls = findClass(className.replace('.', '/'))
  File "/Library/Python/2.7/site-packages/JCC-2.17-py2.7-macosx-10.8-intel.egg/jcc/cpp.py", line 73, in findClass
    cls = _findClass(className)
jcc.cpp.JavaError: java.lang.ExceptionInInitializerError
Java stacktrace:
java.lang.ExceptionInInitializerError
Caused by: java.lang.RuntimeException: WARNING: Can not find lexical dictionary 
directory! This will cause unpredictable exceptions in your application! Please 
refer to the manual to download the dictionaries.
at org.apache.lucene.analysis.cn.smart.AnalyzerProfile.init(AnalyzerProfile.java:72)
at org.apache.lucene.analysis.cn.smart.AnalyzerProfile.<clinit>(AnalyzerProfile.java:43)

make: *** [compile] Error 255
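
The dictionary failure above comes from AnalyzerProfile's static initializer. 
As a hedged workaround sketch (not verified against this build): 
AnalyzerProfile.init() consults the "analysis.data.dir" system property, so 
setting it to point at separately downloaded dictionaries before the class is 
first loaded should avoid the ExceptionInInitializerError. The path below is a 
placeholder; getting the property onto the JVM that jcc starts is a separate 
question I haven't dug into.

public class SmartcnDictCheck {
  public static void main(String[] args) throws Exception {
    // "analysis.data.dir" is the system property AnalyzerProfile.init()
    // reads; "/path/to/analysis-data" is a placeholder for separately
    // downloaded dictionaries. It must be set before the class loads.
    System.setProperty("analysis.data.dir", "/path/to/analysis-data");
    // This mirrors what jcc does when it enumerates the smartcn jar:
    Class.forName("org.apache.lucene.analysis.cn.smart.AnalyzerProfile");
    System.out.println("AnalyzerProfile initialized without error");
  }
}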

===

=== Spatial enabled ===
/usr/bin/python -m jcc --shared --arch x86_64 --jar 
lucene-java-4.4.0/lucene/build/core/lucene-core-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/analysis/common/lucene-analyzers-common-4.4.0.jar
 --jar lucene-java-4.4.0/lucene/build/memory/lucene-memory-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/highlighter/lucene-highlighter-4.4.0.jar --jar 
build/jar/extensions.jar --jar 
lucene-java-4.4.0/lucene/build/queries/lucene-queries-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/queryparser/lucene-queryparser-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/sandbox/lucene-sandbox-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/analysis/smartcn/lucene-analyzers-smartcn-4.4.0.jar
 --jar 
lucene-java-4.4.0/lucene/build/analysis/stempel/lucene-analyzers-stempel-4.4.0.jar
 --jar lucene-java-4.4.0/lucene/build/spatial/lucene-spatial-4.4.0.jar --jar 
lucene-java-4.4.0/lucene/build/grouping/lucene-grouping-4.4.0.jar --jar 

[jira] [Commented] (LUCENE-5186) Add CachingWrapperFilter.getFilter()

2013-08-22 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747308#comment-13747308
 ] 

Adrien Grand commented on LUCENE-5186:
--

This sounds good to me. Would you like to write a patch?

 Add CachingWrapperFilter.getFilter()
 

 Key: LUCENE-5186
 URL: https://issues.apache.org/jira/browse/LUCENE-5186
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Reporter: Trejkaz
Priority: Minor

 There are a couple of use cases I can think of where being able to get the 
 underlying filter out of CachingWrapperFilter would be useful:
 1. You might want to introspect the filter to figure out what's in it (the 
 use case we hit.)
 2. You might want to serialise the filter since Lucene no longer supports 
 that itself.
 We currently work around this by subclassing, keeping another copy of the 
 underlying filter reference and implementing a trivial getter, which is an 
 easy workaround, but the trap is that a junior developer could unknowingly 
 create a CachingWrapperFilter without knowing that the 
 BetterCachingWrapperFilter exists, introducing a filter which cannot be 
 introspected.
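
 A minimal sketch of that subclassing workaround, assuming the 4.4 
 constructor CachingWrapperFilter(Filter):

{code}
import org.apache.lucene.search.CachingWrapperFilter;
import org.apache.lucene.search.Filter;

// Sketch of the workaround described above: keep a second reference to
// the wrapped filter and expose it, since CachingWrapperFilter itself
// offers no getter in 4.4.
public class BetterCachingWrapperFilter extends CachingWrapperFilter {
  private final Filter wrapped;

  public BetterCachingWrapperFilter(Filter filter) {
    super(filter);
    this.wrapped = filter;
  }

  // The trivial getter this issue proposes moving into
  // CachingWrapperFilter itself.
  public Filter getFilter() {
    return wrapped;
  }
}
{code}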

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5186) Add CachingWrapperFilter.getFilter()

2013-08-22 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5186:
-

Assignee: Adrien Grand

 Add CachingWrapperFilter.getFilter()
 

 Key: LUCENE-5186
 URL: https://issues.apache.org/jira/browse/LUCENE-5186
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Reporter: Trejkaz
Assignee: Adrien Grand
Priority: Minor

 There are a couple of use cases I can think of where being able to get the 
 underlying filter out of CachingWrapperFilter would be useful:
 1. You might want to introspect the filter to figure out what's in it (the 
 use case we hit.)
 2. You might want to serialise the filter since Lucene no longer supports 
 that itself.
 We currently work around this by subclassing, keeping another copy of the 
 underlying filter reference and implementing a trivial getter, which is an 
 easy workaround, but the trap is that a junior developer could unknowingly 
 create a CachingWrapperFilter without knowing that the 
 BetterCachingWrapperFilter exists, introducing a filter which cannot be 
 introspected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2750) add Kamikaze 3.0.1 into Lucene

2013-08-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747331#comment-13747331
 ] 

ASF subversion and git services commented on LUCENE-2750:
-

Commit 1516375 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1516375 ]

LUCENE-2750: PForDeltaDocIdSet, an in-memory DocIdSet impl based on PFOR 
encoding.
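
A hedged usage sketch of the new set (builder names as in the committed patch, 
to the best of my reading; treat them as assumptions):

{code}
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.PForDeltaDocIdSet;

public class PForDeltaExample {
  public static void main(String[] args) throws Exception {
    // Doc IDs must be added in increasing order; the builder packs them
    // into PFOR-encoded blocks that are held entirely in memory.
    PForDeltaDocIdSet set = new PForDeltaDocIdSet.Builder()
        .add(3).add(17).add(4096)
        .build();
    DocIdSetIterator it = set.iterator();
    for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS;
         doc = it.nextDoc()) {
      System.out.println(doc);
    }
  }
}
{code}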

 add Kamikaze 3.0.1 into Lucene
 --

 Key: LUCENE-2750
 URL: https://issues.apache.org/jira/browse/LUCENE-2750
 Project: Lucene - Core
  Issue Type: Sub-task
  Components: modules/other
Reporter: hao yan
Assignee: Adrien Grand
 Attachments: LUCENE-2750.patch, LUCENE-2750.patch, LUCENE-2750.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Kamikaze 3.0.1 is the updated version of Kamikaze 2.0.0. It can achieve 
 significantly better performance than Kamikaze 2.0.0 in terms of both 
 compressed size and decompression speed. The main difference between the two 
 versions is that Kamikaze 3.0.x uses a much more efficient implementation of 
 the PForDelta compression algorithm. My goal is to integrate this highly 
 efficient PForDelta implementation into a Lucene codec.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2750) add Kamikaze 3.0.1 into Lucene

2013-08-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747339#comment-13747339
 ] 

ASF subversion and git services commented on LUCENE-2750:
-

Commit 1516380 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1516380 ]

LUCENE-2750: PForDeltaDocIdSet, an in-memory DocIdSet impl based on PFOR 
encoding.

 add Kamikaze 3.0.1 into Lucene
 --

 Key: LUCENE-2750
 URL: https://issues.apache.org/jira/browse/LUCENE-2750
 Project: Lucene - Core
  Issue Type: Sub-task
  Components: modules/other
Reporter: hao yan
Assignee: Adrien Grand
 Attachments: LUCENE-2750.patch, LUCENE-2750.patch, LUCENE-2750.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Kamikaze 3.0.1 is the updated version of Kamikaze 2.0.0. It can achieve 
 significantly better performance than Kamikaze 2.0.0 in terms of both 
 compressed size and decompression speed. The main difference between the two 
 versions is that Kamikaze 3.0.x uses a much more efficient implementation of 
 the PForDelta compression algorithm. My goal is to integrate this highly 
 efficient PForDelta implementation into a Lucene codec.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-2750) add Kamikaze 3.0.1 into Lucene

2013-08-22 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-2750.
--

   Resolution: Fixed
Fix Version/s: 4.5
   5.0

 add Kamikaze 3.0.1 into Lucene
 --

 Key: LUCENE-2750
 URL: https://issues.apache.org/jira/browse/LUCENE-2750
 Project: Lucene - Core
  Issue Type: Sub-task
  Components: modules/other
Reporter: hao yan
Assignee: Adrien Grand
 Fix For: 5.0, 4.5

 Attachments: LUCENE-2750.patch, LUCENE-2750.patch, LUCENE-2750.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Kamikaze 3.0.1 is the updated version of Kamikaze 2.0.0. It can achieve 
 significantly better performance than Kamikaze 2.0.0 in terms of both 
 compressed size and decompression speed. The main difference between the two 
 versions is that Kamikaze 3.0.x uses a much more efficient implementation of 
 the PForDelta compression algorithm. My goal is to integrate this highly 
 efficient PForDelta implementation into a Lucene codec.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3191) field exclusion from fl

2013-08-22 Thread Andrea Gazzarini (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747346#comment-13747346
 ] 

Andrea Gazzarini commented on SOLR-3191:


Hi all, I'm going to complete the implementation of the field exclusion (using 
trunk code), but before I can finish I have some questions (so I'll probably 
have to change my code a bit):

* *New ReturnFields implementation or SolrReturnFields change?*
At the moment I created another class, so the old implementation 
(SolrReturnFields) is still there. There's no code duplication (or just a 
little that could be removed, if that is OK, by changing SolrReturnFields too) 
because the fl parsing uses a different logic (basically no 
QueryParsing.StrParser). 

* *SolrReturnFields:83 support for fl=' ' = *,score*
There's a comment below this line about an old feature that could be removed. 
The wiki doesn't mention it, so can I remove it?

* *About globs*
What would be the right behaviour (see below)? Full name-expansion support, or 
just \*aaa and bbb\*? Does Solr need that complexity? I started playing with 
Solr in 2009 and honestly I have never used globs in fl, so I have no concrete 
experience on which to base a decision.
** The wiki gives just one example (with a trailing wildcard)
** _org.apache.solr.search.ReturnFieldsTest.testWilcards()_ covers only 
leading- and trailing-wildcard cases (e.g. \*aaa, bbb\*)
** But the code supports a wide range of globs (e.g. a*a, a?a, n*m*, n*m?)

* *fl expressions* (I will probably attach some other ambiguous cases later) 
What would be the expected behaviour in these cases?
** \* -name test
** -name test \*
** -name \* test 

 



 field exclusion from fl
 ---

 Key: SOLR-3191
 URL: https://issues.apache.org/jira/browse/SOLR-3191
 Project: Solr
  Issue Type: Improvement
Reporter: Luca Cavanna
Priority: Minor

 I think it would be useful to add a way to exclude fields from the Solr 
 response. If I have, for example, 100 stored fields and I want to return all 
 of them but one, it would be handy to list just the field I want to exclude 
 instead of the 99 fields for inclusion through fl.
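
 A hedged SolrJ sketch of what the requested syntax could look like; the 
 '-field' exclusion form is the feature under discussion here, not an 
 existing capability:

{code}
import org.apache.solr.client.solrj.SolrQuery;

public class FlExclusionSketch {
  public static void main(String[] args) {
    // Hypothetical: "-internal_field" uses the exclusion syntax this
    // issue proposes; current Solr would not interpret it this way.
    SolrQuery q = new SolrQuery("*:*");
    q.setFields("*", "-internal_field"); // everything except one field
    System.out.println(q);
  }
}
{code}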

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5148) SortedSetDocValues caching / state

2013-08-22 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5148:
-

Assignee: Adrien Grand

 SortedSetDocValues caching / state
 --

 Key: LUCENE-5148
 URL: https://issues.apache.org/jira/browse/LUCENE-5148
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor

 I just spent some time digging into a bug which was due to the fact that 
 SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per 
 thread. So if you try to get two instances from the same field in the same 
 thread, you will actually get the same instance and won't be able to iterate 
 over the ords of two documents in parallel.
 This is not necessarily a bug, since this behavior can be documented, but I 
 think it would be nice if the API could prevent such mistakes by storing the 
 state in a separate object, or by cloning the SortedSetDocValues object in 
 SegmentCoreReaders.getSortedSetDocValues?
 What do you think?
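
 A short sketch of the trap, assuming an AtomicReader named 'reader' over a 
 segment with a SORTED_SET field:

{code}
import java.io.IOException;
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.SortedSetDocValues;

public class SortedSetTrap {
  static void demo(AtomicReader reader) throws IOException {
    // Both lookups return the same cached per-thread instance...
    SortedSetDocValues a = reader.getSortedSetDocValues("field");
    SortedSetDocValues b = reader.getSortedSetDocValues("field");
    a.setDocument(0);
    b.setDocument(1);       // ...so this clobbers a's iteration state,
    long ord = a.nextOrd(); // and this returns doc 1's ords, not doc 0's.
    System.out.println(ord);
  }
}
{code}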

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-3191) field exclusion from fl

2013-08-22 Thread Andrea Gazzarini (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747346#comment-13747346
 ] 

Andrea Gazzarini edited comment on SOLR-3191 at 8/22/13 8:32 AM:
-

Hi all, I'm going to complete the implementation of the field exclusion (using 
trunk code), but before I can finish I have some questions (so I'll probably 
have to change my code a bit):

* *New ReturnFields implementation or SolrReturnFields change?*
At the moment I created another class, so the old implementation 
(SolrReturnFields) is still there. There's no code duplication (or just a 
little that could be removed, if that is OK, by changing SolrReturnFields too) 
because the fl parsing uses a different logic (basically no 
QueryParsing.StrParser). 

* *SolrReturnFields:83 support for fl=' ' = *,score*
There's a comment below this line about an old feature that could be removed. 
The wiki doesn't mention it, so can I remove it?

* *About globs*
What would be the right behaviour (see below)? Full name-expansion support, or 
just \*aaa and bbb\*? Does Solr need that complexity? I started playing with 
Solr in 2009 and honestly I have never used globs in fl, so I have no concrete 
experience on which to base a decision.
** The wiki gives just one example (with a trailing wildcard)
** _org.apache.solr.search.ReturnFieldsTest.testWilcards()_ covers only 
leading- and trailing-wildcard cases (e.g. \*aaa, bbb\*)
** But the code supports a wide range of globs (e.g. a*a, a?a, n*m*, n*m?)

* *fl expressions* (I will probably attach some other ambiguous cases later) 
What would be the expected behaviour in these cases?
** \* -name test
** -name test \*
** -name \* test 
** -* I understand that this should be wrong, but what would be the correct 
behaviour? A SyntaxError?

 



  was (Author: a.gazzarini):
Hi all, I'm going to complete the implementation of the field exclusion 
(using trunk code), but before I can finish I have some questions (so I'll 
probably have to change my code a bit):

* *New ReturnFields implementation or SolrReturnFields change?*
At the moment I created another class, so the old implementation 
(SolrReturnFields) is still there. There's no code duplication (or just a 
little that could be removed, if that is OK, by changing SolrReturnFields too) 
because the fl parsing uses a different logic (basically no 
QueryParsing.StrParser). 

* *SolrReturnFields:83 support for fl=' ' = *,score*
There's a comment below this line about an old feature that could be removed. 
The wiki doesn't mention it, so can I remove it?

* *About globs*
What would be the right behaviour (see below)? Full name-expansion support, or 
just \*aaa and bbb\*? Does Solr need that complexity? I started playing with 
Solr in 2009 and honestly I have never used globs in fl, so I have no concrete 
experience on which to base a decision.
** The wiki gives just one example (with a trailing wildcard)
** _org.apache.solr.search.ReturnFieldsTest.testWilcards()_ covers only 
leading- and trailing-wildcard cases (e.g. \*aaa, bbb\*)
** But the code supports a wide range of globs (e.g. a*a, a?a, n*m*, n*m?)

* *fl expressions* (I will probably attach some other ambiguous cases later) 
What would be the expected behaviour in these cases?
** \* -name test
** -name test \*
** -name \* test 

 


  
 field exclusion from fl
 ---

 Key: SOLR-3191
 URL: https://issues.apache.org/jira/browse/SOLR-3191
 Project: Solr
  Issue Type: Improvement
Reporter: Luca Cavanna
Priority: Minor

 I think it would be useful to add a way to exclude fields from the Solr 
 response. If I have, for example, 100 stored fields and I want to return all 
 of them but one, it would be handy to list just the field I want to exclude 
 instead of the 99 fields for inclusion through fl.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-3191) field exclusion from fl

2013-08-22 Thread Andrea Gazzarini (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747346#comment-13747346
 ] 

Andrea Gazzarini edited comment on SOLR-3191 at 8/22/13 8:32 AM:
-

Hi all, I'm going to complete the implementation of the field exclusion (using 
trunk code), but before I can finish I have some questions (so I'll probably 
have to change my code a bit):

* *New ReturnFields implementation or SolrReturnFields change?*
At the moment I created another class, so the old implementation 
(SolrReturnFields) is still there. There's no code duplication (or just a 
little that could be removed, if that is OK, by changing SolrReturnFields too) 
because the fl parsing uses a different logic (basically no 
QueryParsing.StrParser). 

* *SolrReturnFields:83 support for fl=' ' = *,score*
There's a comment below this line about an old feature that could be removed. 
The wiki doesn't mention it, so can I remove it?

* *About globs*
What would be the right behaviour (see below)? Full name-expansion support, or 
just \*aaa and bbb\*? Does Solr need that complexity? I started playing with 
Solr in 2009 and honestly I have never used globs in fl, so I have no concrete 
experience on which to base a decision.
** The wiki gives just one example (with a trailing wildcard)
** _org.apache.solr.search.ReturnFieldsTest.testWilcards()_ covers only 
leading- and trailing-wildcard cases (e.g. \*aaa, bbb\*)
** But the code supports a wide range of globs (e.g. a*a, a?a, n*m*, n*m?)

* *fl expressions* (I will probably attach some other ambiguous cases later) 
What would be the expected behaviour in these cases?
** \* -name test
** -name test \*
** -name \* test 
** -\* I understand that this should be wrong, but what would be the correct 
behaviour? A SyntaxError?

 



  was (Author: a.gazzarini):
Hi all, I'm going to complete the implementation of the field exclusion 
(using trunk code), but before I can finish I have some questions (so I'll 
probably have to change my code a bit):

* *New ReturnFields implementation or SolrReturnFields change?*
At the moment I created another class, so the old implementation 
(SolrReturnFields) is still there. There's no code duplication (or just a 
little that could be removed, if that is OK, by changing SolrReturnFields too) 
because the fl parsing uses a different logic (basically no 
QueryParsing.StrParser). 

* *SolrReturnFields:83 support for fl=' ' = *,score*
There's a comment below this line about an old feature that could be removed. 
The wiki doesn't mention it, so can I remove it?

* *About globs*
What would be the right behaviour (see below)? Full name-expansion support, or 
just \*aaa and bbb\*? Does Solr need that complexity? I started playing with 
Solr in 2009 and honestly I have never used globs in fl, so I have no concrete 
experience on which to base a decision.
** The wiki gives just one example (with a trailing wildcard)
** _org.apache.solr.search.ReturnFieldsTest.testWilcards()_ covers only 
leading- and trailing-wildcard cases (e.g. \*aaa, bbb\*)
** But the code supports a wide range of globs (e.g. a*a, a?a, n*m*, n*m?)

* *fl expressions* (I will probably attach some other ambiguous cases later) 
What would be the expected behaviour in these cases?
** \* -name test
** -name test \*
** -name \* test 
** -* I understand that this should be wrong, but what would be the correct 
behaviour? A SyntaxError?

 


  
 field exclusion from fl
 ---

 Key: SOLR-3191
 URL: https://issues.apache.org/jira/browse/SOLR-3191
 Project: Solr
  Issue Type: Improvement
Reporter: Luca Cavanna
Priority: Minor

 I think it would be useful to add a way to exclude fields from the Solr 
 response. If I have, for example, 100 stored fields and I want to return all 
 of them but one, it would be handy to list just the field I want to exclude 
 instead of the 99 fields for inclusion through fl.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Lucene search returns no result

2013-08-22 Thread Caba
I am trying to create a Hibernate full-text search using
*hibernate-search-4.3.0.Final.jar*. There are no errors in this application,
but my Lucene query using the query DSL doesn't return any results; it
doesn't return any of the rows in the table. Can anyone please help me?

This is my function:


and this is my Entity class:



I have another folder in my project, *Indexes*; in this folder I have two
files. If I open these two files with Luke, I see nothing, so it seems that
these two files are empty.

What should I do to solve this problem? Thank you so much.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Lucene-search-returns-no-result-tp4086046.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: Lucene search returns no result

2013-08-22 Thread Uwe Schindler
Hi,

this is the Lucene developers' mailing list, for discussing source code 
development of the Lucene library. If you have a question about using Lucene, 
ask on java-u...@lucene.apache.org. But for your current question, I don't 
think Lucene's mailing lists are the correct place to ask. Hibernate Search is 
a separate, non-Apache project which is just based on Lucene but is not managed 
by the Lucene development crew, so it is better to ask your question on the 
Hibernate(-Search) mailing lists.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Caba [mailto:babak...@gmail.com]
 Sent: Thursday, August 22, 2013 11:07 AM
 To: dev@lucene.apache.org
 Subject: Lucene search returns no result
 
 I am trying to create a Hibernate full-text search using
 *hibernate-search-4.3.0.Final.jar*. There are no errors in this application, but
 my Lucene query using the query DSL doesn't return any results; it
 doesn't return any of the rows in the table. Can anyone please help me?
 
 This is my function:
 
 
 and this is my Entity class:
 
 
 
 I have another folder in my project, *Indexes*; in this folder I have two 
 files. If I open these two files with Luke, I see nothing, so it seems that
 these two files are empty.
 
 What should I do to solve this problem? Thank you so much.
 
 
 
 --
 View this message in context: http://lucene.472066.n3.nabble.com/Lucene-
 search-returns-no-result-tp4086046.html
 Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4408) Server hanging on startup

2013-08-22 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-4408.
--

   Resolution: Cannot Reproduce
Fix Version/s: (was: 4.5)
   (was: 5.0)

Since this is so out of date and I haven't seen any recent reports like it, 
I'm going to close this. We can reopen if necessary.

 Server hanging on startup
 -

 Key: SOLR-4408
 URL: https://issues.apache.org/jira/browse/SOLR-4408
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.1
 Environment: OpenJDK 64-Bit Server VM (23.2-b09 mixed mode)
 Tomcat 7.0
 Eclipse Juno + WTP
Reporter: Francois-Xavier Bonnet
Assignee: Erick Erickson
 Attachments: patch-4408.txt


 While starting, the server hangs indefinitely. Everything works fine when I 
 first start the server with no index created yet, but if I fill the index, 
 then stop and start the server, it hangs. Could it be a lock that is never 
 released?
 Here is what I get in a full thread dump:
 2013-02-06 16:28:52
 Full thread dump OpenJDK 64-Bit Server VM (23.2-b09 mixed mode):
 searcherExecutor-4-thread-1 prio=10 tid=0x7fbdfc16a800 nid=0x42c6 in 
 Object.wait() [0x7fbe0ab1]
java.lang.Thread.State: WAITING (on object monitor)
   at java.lang.Object.wait(Native Method)
   - waiting on 0xc34c1c48 (a java.lang.Object)
   at java.lang.Object.wait(Object.java:503)
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1492)
   - locked 0xc34c1c48 (a java.lang.Object)
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1312)
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1247)
   at 
 org.apache.solr.request.SolrQueryRequestBase.getSearcher(SolrQueryRequestBase.java:94)
   at 
 org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:213)
   at 
 org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:112)
   at 
 org.apache.solr.handler.component.SpellCheckComponent.addCollationsToResponse(SpellCheckComponent.java:203)
   at 
 org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:180)
   at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
   at 
 org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
   at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1594)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 coreLoadExecutor-3-thread-1 prio=10 tid=0x7fbe04194000 nid=0x42c5 in 
 Object.wait() [0x7fbe0ac11000]
java.lang.Thread.State: WAITING (on object monitor)
   at java.lang.Object.wait(Native Method)
   - waiting on 0xc34c1c48 (a java.lang.Object)
   at java.lang.Object.wait(Object.java:503)
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1492)
   - locked 0xc34c1c48 (a java.lang.Object)
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1312)
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1247)
   at 
 org.apache.solr.handler.ReplicationHandler.getIndexVersion(ReplicationHandler.java:495)
   at 
 org.apache.solr.handler.ReplicationHandler.getStatistics(ReplicationHandler.java:518)
   at 
 org.apache.solr.core.JmxMonitoredMap$SolrDynamicMBean.getMBeanInfo(JmxMonitoredMap.java:232)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333)
   at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:319)
   at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:512)
   at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:140)
   at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:51)
   at 
 org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:636)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:809)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:607)
   at 
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1003)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
   at 

[jira] [Comment Edited] (SOLR-2894) Implement distributed pivot faceting

2013-08-22 Thread William Harris (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745938#comment-13745938
 ] 

William Harris edited comment on SOLR-2894 at 8/22/13 12:31 PM:


shards.tolerant=true did indeed yield a more descriptive error:
{code}
ERROR - 2013-08-21 12:54:17.392; org.apache.solr.common.SolrException; 
null:java.lang.NullPointerException
at 
org.apache.solr.handler.component.FacetComponent.refinePivotFacets(FacetComponent.java:882)
at 
org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:411)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1850)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:724)
{code}

I also reindexed everything, replacing the values of all string fields with 
their corresponding hashes, to see if the error could be caused by some odd 
strings, but the same error occurs.
I am also seeing this error after I switched to MD5 hashes for document IDs:
{code}
ERROR - 2013-08-22 14:28:25.248; org.apache.solr.common.SolrException; 
null:java.lang.NullPointerException
at 
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:903)
at 
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:649)
at 
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:628)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1850)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:724)
{code}

  was (Author: killscreen):
shards.tolerant=true did indeed yield a more descriptive error:
{code}
ERROR - 2013-08-21 12:54:17.392; org.apache.solr.common.SolrException; 
null:java.lang.NullPointerException
at 

[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 756 - Failure!

2013-08-22 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/756/
Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseG1GC

No tests ran.

Build Log:
[...truncated 9924 lines...]
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected 
termination of the channel
hudson.remoting.RequestAbortedException: 
hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected 
termination of the channel
at 
hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
at 
hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
at hudson.remoting.Request.call(Request.java:174)
at hudson.remoting.Channel.call(Channel.java:714)
at 
hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:167)
at com.sun.proxy.$Proxy76.join(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:925)
at hudson.Launcher$ProcStarter.join(Launcher.java:360)
at hudson.tasks.Ant.perform(Ant.java:217)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
at hudson.model.Build$BuildExecution.build(Build.java:199)
at hudson.model.Build$BuildExecution.doRun(Build.java:160)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
at hudson.model.Run.execute(Run.java:1603)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:247)
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: 
Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:299)
at hudson.remoting.Channel.terminate(Channel.java:774)
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
Caused by: java.io.IOException: Unexpected termination of the channel
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
at 
java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1316)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at hudson.remoting.Command.readFrom(Command.java:92)
at 
hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:71)
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5181) Passage knows its own docID

2013-08-22 Thread Luca Cavanna (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747493#comment-13747493
 ] 

Luca Cavanna commented on LUCENE-5181:
--

True, having the doc ID would be useful there. Why not add it directly to the 
Passage, so that one can know which document the Passage comes from?

 Passage knows its own docID
 ---

 Key: LUCENE-5181
 URL: https://issues.apache.org/jira/browse/LUCENE-5181
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.4
Reporter: Jon Stewart
Priority: Minor

 The new PostingsHighlight package allows for retrieval of term matches from a 
 query if one creates a class that extends PassageFormatter and overrides 
 format(). However, class Passage does not have a docID field, nor is this 
 provided via PassageFormatter.format(). Therefore, it's very difficult to 
 know which Document contains a given Passage.
 It would suffice for PassageFormatter.format() to be passed the docID as a 
 parameter. From the code in PostingsHighlight, this seems like it would be 
 easy.
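
 A sketch of the limitation, assuming the 4.4 signature 
 format(Passage[], String); the subclass name is hypothetical:

{code}
import org.apache.lucene.search.postingshighlight.Passage;
import org.apache.lucene.search.postingshighlight.PassageFormatter;

// This formatter can slice matched passages out of 'content', but nothing
// tells it which document 'content' belongs to; that is the gap the issue
// proposes closing by passing in (or storing on Passage) the docID.
public class MatchCollectingFormatter extends PassageFormatter {
  @Override
  public String format(Passage[] passages, String content) {
    StringBuilder sb = new StringBuilder();
    for (Passage p : passages) {
      sb.append(content, p.getStartOffset(), p.getEndOffset());
      sb.append('\n');
    }
    return sb.toString();
  }
}
{code}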

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5181) Passage knows its own docID

2013-08-22 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747499#comment-13747499
 ] 

Robert Muir commented on LUCENE-5181:
-

Can you give a concrete example where docid is actually useful?


 Passage knows its own docID
 ---

 Key: LUCENE-5181
 URL: https://issues.apache.org/jira/browse/LUCENE-5181
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.4
Reporter: Jon Stewart
Priority: Minor

 The new PostingsHighlight package allows for retrieval of term matches from a 
 query if one creates a class that extends PassageFormatter and overrides 
 format(). However, class Passage does not have a docID field, nor is this 
 provided via PassageFormatter.format(). Therefore, it's very difficult to 
 know which Document contains a given Passage.
 It would suffice for PassageFormatter.format() to be passed the docID as a 
 parameter. From the code in PostingsHighlight, this seems like it would be 
 easy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5183) Add block support for JSONLoader

2013-08-22 Thread Varun Thacker (JIRA)
Varun Thacker created SOLR-5183:
---

 Summary: Add block support for JSONLoader
 Key: SOLR-5183
 URL: https://issues.apache.org/jira/browse/SOLR-5183
 Project: Solr
  Issue Type: Sub-task
Reporter: Varun Thacker


We should be able to index block documents in JSON format

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5183) Add block support for JSONLoader

2013-08-22 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747524#comment-13747524
 ] 

Varun Thacker commented on SOLR-5183:
-

Example JSON:
{code:json}
{
  "add": {
    "doc": {
      "id": 1,
      "parent": true,
      "doc": {
        "id": 2,
        "subject": "black"
      },
      "doc": {
        "id": 3,
        "subject": "blue"
      }
    }
  },
  "add": {
    "doc": {
      "id": 4,
      "parent": true,
      "doc": {
        "id": 5,
        "subject": "black"
      },
      "doc": {
        "id": 6,
        "subject": "red"
      }
    }
  }
}
{code}
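
For comparison, the SolrJ side of the first block above might look like this 
(a hedged sketch; addChildDocument is assumed from the SolrJ child-document 
support that accompanies block indexing):

{code}
import org.apache.solr.common.SolrInputDocument;

public class BlockJsonEquivalent {
  public static void main(String[] args) {
    SolrInputDocument parent = new SolrInputDocument();
    parent.addField("id", "1");
    parent.addField("parent", true);

    SolrInputDocument child = new SolrInputDocument();
    child.addField("id", "2");
    child.addField("subject", "black");
    // Assumed API: attach the child so the block is indexed together.
    parent.addChildDocument(child);

    System.out.println(parent);
  }
}
{code}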

 Add block support for JSONLoader
 

 Key: SOLR-5183
 URL: https://issues.apache.org/jira/browse/SOLR-5183
 Project: Solr
  Issue Type: Sub-task
Reporter: Varun Thacker
 Fix For: 4.5, 5.0


 We should be able to index block documents in JSON format

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5183) Add block support for JSONLoader

2013-08-22 Thread Varun Thacker (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-5183:


Attachment: SOLR-5183.patch

Patch which can parse the above-mentioned format. If this is okay, I'll add 
tests in AddBlockUpdateTest.java.

 Add block support for JSONLoader
 

 Key: SOLR-5183
 URL: https://issues.apache.org/jira/browse/SOLR-5183
 Project: Solr
  Issue Type: Sub-task
Reporter: Varun Thacker
 Fix For: 4.5, 5.0

 Attachments: SOLR-5183.patch


 We should be able to index block documents in JSON format

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5148) SortedSetDocValues caching / state

2013-08-22 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5148:
-

Attachment: LUCENE-5148.patch

I tried to add auto-cloning to see its impact:
 - SortedSet instances are cached per-thread and cloned by SegmentCoreReaders 
when requested,
 - clones are only available for use in the current thread (no cloning of the 
index inputs).

So nothing changes for users; it just removes the trap mentioned in the 
summary. However, it requires codec implementers to implement clone() correctly 
so that two different instances on the same field can be used in parallel in 
the same thread. A test has been added to BaseDocValuesFormatTestCase to make 
sure all our impls do that correctly.

Robert, what do you think?

 SortedSetDocValues caching / state
 --

 Key: LUCENE-5148
 URL: https://issues.apache.org/jira/browse/LUCENE-5148
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-5148.patch


 I just spent some time digging into a bug which was due to the fact that 
 SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per 
 thread. So if you try to get two instances from the same field in the same 
 thread, you will actually get the same instance and won't be able to iterate 
 over the ords of two documents in parallel.
 This is not necessarily a bug, since this behavior can be documented, but I 
 think it would be nice if the API could prevent such mistakes by storing the 
 state in a separate object, or by cloning the SortedSetDocValues object in 
 SegmentCoreReaders.getSortedSetDocValues?
 What do you think?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.7.0_25) - Build # 3179 - Still Failing!

2013-08-22 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/3179/
Java: 64bit/jdk1.7.0_25 -XX:+UseCompressedOops -XX:+UseG1GC

1 tests failed.
REGRESSION:  
org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom

Error Message:
CheckReader failed

Stack Trace:
java.lang.RuntimeException: CheckReader failed
at 
__randomizedtesting.SeedInfo.seed([C17F45D490FFB13E:B33360DB219F074D]:0)
at org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:261)
at org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:240)
at 
org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1310)
at 
org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1286)
at 
org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom(DistinctValuesCollectorTest.java:253)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:724)




Build Log:
[...truncated 7213 lines...]
   [junit4] Suite: org.apache.lucene.search.grouping.DistinctValuesCollectorTest
   [junit4]   1 CheckReader failed
   [junit4]   1 test: field norms.OK [1 fields]
   [junit4]   1 test: terms, freq, 

[jira] [Commented] (LUCENE-5148) SortedSetDocValues caching / state

2013-08-22 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747532#comment-13747532
 ] 

Robert Muir commented on LUCENE-5148:
-

Right: I'm still convinced the trap only impacts committers writing unit tests 
that compare against slow-wrappers :)

The patch seems to have a very large number of changes for such a small 
thing... is there some reformatting happening?

If we can't implement this without major changes, then I don't think we should 
do it.

 SortedSetDocValues caching / state
 --

 Key: LUCENE-5148
 URL: https://issues.apache.org/jira/browse/LUCENE-5148
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-5148.patch


 I just spent some time digging into a bug which was due to the fact that 
 SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per 
 thread. So if you try to get two instances from the same field in the same 
 thread, you will actually get the same instance and won't be able to iterate 
 over the ords of two documents in parallel.
 This is not necessarily a bug, since this behavior can be documented, but I 
 think it would be nice if the API could prevent such mistakes by storing the 
 state in a separate object, or by cloning the SortedSetDocValues object in 
 SegmentCoreReaders.getSortedSetDocValues?
 What do you think?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.7.0_25) - Build # 3179 - Still Failing!

2013-08-22 Thread Robert Muir
I will look into this.

On Thu, Aug 22, 2013 at 9:42 AM, Policeman Jenkins Server
jenk...@thetaphi.de wrote:
 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/3179/
 Java: 64bit/jdk1.7.0_25 -XX:+UseCompressedOops -XX:+UseG1GC

 1 tests failed.
 REGRESSION:  
 org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom

 Error Message:
 CheckReader failed

 Stack Trace:
 java.lang.RuntimeException: CheckReader failed
 at 
 __randomizedtesting.SeedInfo.seed([C17F45D490FFB13E:B33360DB219F074D]:0)
 at org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:261)
 at org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:240)
 at 
 org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1310)
 at 
 org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1286)
 at 
 org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom(DistinctValuesCollectorTest.java:253)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
 at 
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at 
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at 
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
 at java.lang.Thread.run(Thread.java:724)




 Build Log:
 [...truncated 7213 lines...]
[junit4] Suite: 

[jira] [Commented] (LUCENE-5148) SortedSetDocValues caching / state

2013-08-22 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747538#comment-13747538
 ] 

Robert Muir commented on LUCENE-5148:
-

and FieldCache should be consistent as well.

 SortedSetDocValues caching / state
 --

 Key: LUCENE-5148
 URL: https://issues.apache.org/jira/browse/LUCENE-5148
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-5148.patch


 I just spent some time digging into a bug which was due to the fact that 
 SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per 
 thread. So if you try to get two instances from the same field in the same 
 thread, you will actually get the same instance and won't be able to iterate 
 over ords of two documents in parallel.
 This is not necessarily a bug, since this behavior can be documented, but I 
 think it would be nice if the API could prevent such mistakes by storing the 
 state in a separate object or cloning the SortedSetDocValues object in 
 SegmentCoreReaders.getSortedSetDocValues?
 What do you think?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-3191) field exclusion from fl

2013-08-22 Thread Andrea Gazzarini (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747346#comment-13747346
 ] 

Andrea Gazzarini edited comment on SOLR-3191 at 8/22/13 1:58 PM:
-

Hi all, I'm going to complete the implementation of the field exclusion (using 
trunk code), but before I can finish I have some questions (so I'll probably 
have to change my code a little):

* *New ReturnFields implementor or SolrReturnFields change?*
At the moment I created another class, so the old implementation 
(SolrReturnFields) is still there. There's no code duplication (or just a 
little bit that could be removed, if that is OK, by changing SolrReturnFields 
too) because the fl parsing uses different logic (basically no 
QueryParsing.StrParser). 

* *SolrReturnFields:83 support for fl=' ' = *,score*
There's a comment below this line about an old feature that could be removed. 
On the wiki there's no mention of it, so can I remove that?

* *About glob*
What would be the right behaviour (see below)? Full name expansion support, or 
just \*aaa and bbb\*? Does SOLR need that complexity? I started playing with 
SOLR in 2009 and honestly I have never used globs in fl, so I have no concrete 
experience to base a decision on.
** The Wiki gives just one example (with a trailing wildcard)
** _org.apache.solr.search.ReturnFieldsTest.testWilcards()_ considers only 
leading and trailing wildcard cases (e.g. \*aaa, bbb\*)
** But the code supports a wide range of globs (e.g. a*a, a?a, n*m*, n*m?; see 
the sketch below)

* *fl expressions* (I will probably attach some other ambiguous cases later) 
What would be the expected behaviour in these cases?
** \* -name test
** -name test \*
** -name \* test 
** -\* I understand that this should be wrong, but what would be the correct 
behaviour? SyntaxError?
** name name:manu (at the moment it seems the aliased field always wins)
** pippo:name pippo:manu (at the moment it seems the last alias wins) 

For the last three I believe a SyntaxError would be more appropriate

  was (Author: a.gazzarini):
Hi all, I'm going to complete the implementation of the field exclusion 
(using trunk code), but before I can finish I have some questions (so I'll 
probably have to change my code a little):

* *New ReturnFields implementor or SolrReturnFields change?*
At the moment I created another class, so the old implementation 
(SolrReturnFields) is still there. There's no code duplication (or just a 
little bit that could be removed, if that is OK, by changing SolrReturnFields 
too) because the fl parsing uses different logic (basically no 
QueryParsing.StrParser). 

* *SolrReturnFields:83 support for fl=' ' = *,score*
There's a comment below this line about an old feature that could be removed. 
On the wiki there's no mention of it, so can I remove that?

* *About glob*
What would be the right behaviour (see below)? Full name expansion support, or 
just \*aaa and bbb\*? Does SOLR need that complexity? I started playing with 
SOLR in 2009 and honestly I have never used globs in fl, so I have no concrete 
experience to base a decision on.
** The Wiki gives just one example (with a trailing wildcard)
** _org.apache.solr.search.ReturnFieldsTest.testWilcards()_ considers only 
leading and trailing wildcard cases (e.g. \*aaa, bbb\*)
** But the code supports a wide range of globs (e.g. a*a, a?a, n*m*, n*m?)

* *fl expressions* (I will probably attach some other ambiguous cases later) 
What would be the expected behaviour in these cases?
** \* -name test
** -name test \*
** -name \* test 
** -\* I understand that this should be wrong, but what would be the correct 
behaviour? SyntaxError?
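A minimal sketch of the glob matching under discussion, assuming only the * and 
? wildcards from the cases above; the class and method names are made up:

{code}
import java.util.regex.Pattern;

class FieldGlob {
  // Translate a field-name glob (*, ?) into a regex and match against it.
  static boolean globMatch(String glob, String fieldName) {
    StringBuilder re = new StringBuilder();
    for (char c : glob.toCharArray()) {
      if (c == '*') re.append(".*");
      else if (c == '?') re.append('.');
      else re.append(Pattern.quote(String.valueOf(c)));
    }
    return fieldName.matches(re.toString());
  }

  public static void main(String[] args) {
    System.out.println(globMatch("n*m?", "name"));  // true
    System.out.println(globMatch("a*a", "alpha"));  // true
    System.out.println(globMatch("*aaa", "bbb"));   // false
  }
}
{code}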

 


  
 field exclusion from fl
 ---

 Key: SOLR-3191
 URL: https://issues.apache.org/jira/browse/SOLR-3191
 Project: Solr
  Issue Type: Improvement
Reporter: Luca Cavanna
Priority: Minor

 I think it would be useful to add a way to exclude fields from the Solr 
 response. If I have, for example, 100 stored fields and I want to return all 
 of them but one, it would be handy to list just the field I want to exclude 
 instead of the 99 fields for inclusion through fl.
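A hedged SolrJ sketch of what this issue asks for, assuming the "-field" 
exclusion syntax discussed in the comments above (a proposal, not a released 
feature); the excluded field name is made up:

{code}
import org.apache.solr.client.solrj.SolrQuery;

class ExclusionExample {
  static SolrQuery buildQuery() {
    SolrQuery q = new SolrQuery("*:*");
    // Return every stored field except the one we want to exclude.
    q.setFields("*", "-internal_notes"); // hypothetical excluded field
    return q;
  }
}
{code}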

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.7.0_25) - Build # 3179 - Still Failing!

2013-08-22 Thread Robert Muir
I committed a fix: slowwrapper bug

On Thu, Aug 22, 2013 at 9:42 AM, Policeman Jenkins Server
jenk...@thetaphi.de wrote:
 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/3179/
 Java: 64bit/jdk1.7.0_25 -XX:+UseCompressedOops -XX:+UseG1GC

 1 tests failed.
 REGRESSION:  
 org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom

 Error Message:
 CheckReader failed

 Stack Trace:
 java.lang.RuntimeException: CheckReader failed
 at 
 __randomizedtesting.SeedInfo.seed([C17F45D490FFB13E:B33360DB219F074D]:0)
 at org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:261)
 at org.apache.lucene.util._TestUtil.checkReader(_TestUtil.java:240)
 at 
 org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1310)
 at 
 org.apache.lucene.util.LuceneTestCase.newSearcher(LuceneTestCase.java:1286)
 at 
 org.apache.lucene.search.grouping.DistinctValuesCollectorTest.testRandom(DistinctValuesCollectorTest.java:253)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
 at 
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at 
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at 
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
 at java.lang.Thread.run(Thread.java:724)




 Build Log:
 [...truncated 7213 lines...]

[jira] [Resolved] (LUCENE-5148) SortedSetDocValues caching / state

2013-08-22 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-5148.
--

Resolution: Won't Fix

bq. The patch seems to have a very large amount of changes for such a small 
thing... is there some reformatting happening?

Yes. In some cases I couldn't use anonymous classes to implement clone properly, 
so I had to use named classes for the codec-specific SortedSet impls, and the 
indentation shrank by 2 spaces.

bq. If we can't implement this without major changes: then I don't think we 
should do it.

I wanted to know your opinion first but I came to a similar conclusion. I 
initially hadn't thought about the issue of cloning too many index inputs... 
Thanks for your input!

 SortedSetDocValues caching / state
 --

 Key: LUCENE-5148
 URL: https://issues.apache.org/jira/browse/LUCENE-5148
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-5148.patch


 I just spent some time digging into a bug which was due to the fact that 
 SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per 
 thread. So if you try to get two instances from the same field in the same 
 thread, you will actually get the same instance and won't be able to iterate 
 over ords of two documents in parallel.
 This is not necessarily a bug, since this behavior can be documented, but I 
 think it would be nice if the API could prevent such mistakes by storing the 
 state in a separate object or cloning the SortedSetDocValues object in 
 SegmentCoreReaders.getSortedSetDocValues?
 What do you think?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5184) now collations are sorted by ascending internal rank.

2013-08-22 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-5184:
---

 Summary:  now collations are sorted by ascending internal rank.
 Key: SOLR-5184
 URL: https://issues.apache.org/jira/browse/SOLR-5184
 Project: Solr
  Issue Type: Bug
Reporter: Markus Jelsma




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5184) Sort collations by hits descending

2013-08-22 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-5184:


Summary: Sort collations by hits descending  (was:  now collations are 
sorted by ascending internal rank.)

 Sort collations by hits descending
 --

 Key: SOLR-5184
 URL: https://issues.apache.org/jira/browse/SOLR-5184
 Project: Solr
  Issue Type: Bug
Reporter: Markus Jelsma



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5184) Sort collations by hits descending

2013-08-22 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-5184:


  Component/s: spellchecker
Fix Version/s: 5.0
   4.5
 Priority: Minor  (was: Major)
  Description: Collations are sorted by internal rank ascending. In quite a 
few cases this results in bad collations since we always get the first instead 
of iterating over them in the client. These cases can be improved upon by 
sorting on the collation's hits descending.
   Issue Type: Improvement  (was: Bug)

 Sort collations by hits descending
 --

 Key: SOLR-5184
 URL: https://issues.apache.org/jira/browse/SOLR-5184
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 4.5, 5.0


 Collations are sorted by internal rank ascending. In quite a few cases this 
 results in bad collations since we always get the first instead of iterating 
 over them in the client. These cases can be improved upon by sorting on the 
 collation's hits descending.
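A minimal sketch of the proposed ordering; Collation here is a stand-in type, 
not Solr's actual spellchecker class:

{code}
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class CollationSort {
  static class Collation {
    final String query;
    final long hits;
    Collation(String query, long hits) { this.query = query; this.hits = hits; }
  }

  // Order collations by hit count, highest first, instead of by ascending
  // internal rank.
  static void sortByHitsDesc(List<Collation> collations) {
    Collections.sort(collations, new Comparator<Collation>() {
      @Override
      public int compare(Collation c1, Collation c2) {
        return Long.compare(c2.hits, c1.hits); // descending by hits
      }
    });
  }
}
{code}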

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-4787) Join Contrib

2013-08-22 Thread Kranti Parisa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747175#comment-13747175
 ] 

Kranti Parisa edited comment on SOLR-4787 at 8/22/13 3:59 PM:
--

Nested Joins! That's exactly what I am trying :) and thought about adding fq to 
the solr joins.

Using local-param in the first join:{code} {!join fromIndex=a from=f1 to=f2 
v=$joinQ}&joinQ=(field:123 AND _query_={another join}){code}. So here another 
join could be passed as a FQ and it should get results faster!!

  was (Author: krantiparisa):
Nested Joins! That's exactly what I am trying :) and thought about adding 
fq to the solr joins.

Using local-param in the first join:{code} {!joinfromIndex=a from=f1 to=f2 
v=$joinQ}&joinQ=(field:123 AND _query_={another join}){code}. So here another 
join could be passed as a FQ and it should get results faster!!
  
 Join Contrib
 

 Key: SOLR-4787
 URL: https://issues.apache.org/jira/browse/SOLR-4787
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.2.1
Reporter: Joel Bernstein
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch


 This contrib provides a place where different join implementations can be 
 contributed to Solr. This contrib currently includes 2 join implementations. 
 The initial patch was generated from the Solr 4.3 tag. Because of changes in 
 the FieldCache API this patch will only build with Solr 4.2 or above.
 *PostFilterJoinQParserPlugin aka pjoin*
 The pjoin provides a join implementation that filters results in one core 
 based on the results of a search in another core. This is similar in 
 functionality to the JoinQParserPlugin but the implementation differs in a 
 couple of important ways.
 The first way is that the pjoin is designed to work with integer join keys 
 only. So, in order to use pjoin, integer join keys must be included in both 
 the to and from core.
 The second difference is that the pjoin builds memory structures that are 
 used to quickly connect the join keys. It also uses a custom SolrCache named 
 join to hold intermediate DocSets which are needed to build the join memory 
 structures. So, the pjoin will need more memory than the JoinQParserPlugin to 
 perform the join.
 The main advantage of the pjoin is that it can scale to join millions of keys 
 between cores.
 Because it's a PostFilter, it only needs to join records that match the main 
 query.
 The syntax of the pjoin is the same as the JoinQParserPlugin except that the 
 plugin is referenced by the string pjoin rather than join.
 fq=\{!pjoin fromCore=collection2 from=id_i to=id_i\}user:customer1
 The example filter query above will search the fromCore (collection2) for 
 user:customer1. This query will generate a list of values from the from 
 field that will be used to filter the main query. Only records from the main 
 query, where the to field is present in the from list will be included in 
 the results.
 The solrconfig.xml in the main query core must contain the reference to the 
 pjoin.
 <queryParser name="pjoin" 
 class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>
 And the join contrib jars must be registered in the solrconfig.xml.
 <lib dir="../../../dist/" regex="solr-joins-\d.*\.jar" />
 The solrconfig.xml in the fromcore must have the join SolrCache configured.
  <cache name="join"
   class="solr.LRUCache"
   size="4096"
   initialSize="1024"
   />
 *ValueSourceJoinParserPlugin aka vjoin*
 The second implementation is the ValueSourceJoinParserPlugin aka vjoin. 
 This implements a ValueSource function query that can return a value from a 
 second core based on join keys and limiting query. The limiting query can be 
 used to select a specific subset of data from the join core. This allows 
 customer specific relevance data to be stored in a separate core and then 
 joined in the main query.
 The vjoin is called using the vjoin function query. For example:
 bf=vjoin(fromCore, fromKey, fromVal, toKey, query)
 This example shows vjoin being called by the edismax boost function 
 parameter. This example will return the fromVal from the fromCore. The 
 fromKey and toKey are used to link the records from the main query to the 
 records in the fromCore. The query is used to select a specific set of 
 records to join with in fromCore.
 Currently the fromKey and toKey must be longs but this will change in future 
 versions. Like the pjoin, the join SolrCache is used to hold the join 
 memory structures.
 To 

[jira] [Comment Edited] (SOLR-4787) Join Contrib

2013-08-22 Thread Kranti Parisa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747175#comment-13747175
 ] 

Kranti Parisa edited comment on SOLR-4787 at 8/22/13 3:59 PM:
--

Nested Joins! That's exactly what I am trying :) and thought about adding fq to 
the solr joins.

Using local-param in the first join:{code} {!joinfromIndex=a from=f1 to=f2 
v=$joinQ}&joinQ=(field:123 AND _query_={another join}){code}. So here another 
join could be passed as a FQ and it should get results faster!!

  was (Author: krantiparisa):
Nested Joins! That's exactly what I am trying :) and thought about adding 
fq to the solr joins.

Using local-param in the first join: v=$joinQ&joinQ=(field:123 AND 
_query_={another join}). So here another join could be passed as a FQ and it 
should get results faster!!
  
 Join Contrib
 

 Key: SOLR-4787
 URL: https://issues.apache.org/jira/browse/SOLR-4787
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.2.1
Reporter: Joel Bernstein
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch


 This contrib provides a place where different join implementations can be 
 contributed to Solr. This contrib currently includes 2 join implementations. 
 The initial patch was generated from the Solr 4.3 tag. Because of changes in 
 the FieldCache API this patch will only build with Solr 4.2 or above.
 *PostFilterJoinQParserPlugin aka pjoin*
 The pjoin provides a join implementation that filters results in one core 
 based on the results of a search in another core. This is similar in 
 functionality to the JoinQParserPlugin but the implementation differs in a 
 couple of important ways.
 The first way is that the pjoin is designed to work with integer join keys 
 only. So, in order to use pjoin, integer join keys must be included in both 
 the to and from core.
 The second difference is that the pjoin builds memory structures that are 
 used to quickly connect the join keys. It also uses a custom SolrCache named 
 join to hold intermediate DocSets which are needed to build the join memory 
 structures. So, the pjoin will need more memory than the JoinQParserPlugin to 
 perform the join.
 The main advantage of the pjoin is that it can scale to join millions of keys 
 between cores.
 Because it's a PostFilter, it only needs to join records that match the main 
 query.
 The syntax of the pjoin is the same as the JoinQParserPlugin except that the 
 plugin is referenced by the string pjoin rather than join.
 fq=\{!pjoin fromCore=collection2 from=id_i to=id_i\}user:customer1
 The example filter query above will search the fromCore (collection2) for 
 user:customer1. This query will generate a list of values from the from 
 field that will be used to filter the main query. Only records from the main 
 query, where the to field is present in the from list will be included in 
 the results.
 The solrconfig.xml in the main query core must contain the reference to the 
 pjoin.
 <queryParser name="pjoin" 
 class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>
 And the join contrib jars must be registered in the solrconfig.xml.
 <lib dir="../../../dist/" regex="solr-joins-\d.*\.jar" />
 The solrconfig.xml in the fromcore must have the join SolrCache configured.
  <cache name="join"
   class="solr.LRUCache"
   size="4096"
   initialSize="1024"
   />
 *ValueSourceJoinParserPlugin aka vjoin*
 The second implementation is the ValueSourceJoinParserPlugin aka vjoin. 
 This implements a ValueSource function query that can return a value from a 
 second core based on join keys and limiting query. The limiting query can be 
 used to select a specific subset of data from the join core. This allows 
 customer specific relevance data to be stored in a separate core and then 
 joined in the main query.
 The vjoin is called using the vjoin function query. For example:
 bf=vjoin(fromCore, fromKey, fromVal, toKey, query)
 This example shows vjoin being called by the edismax boost function 
 parameter. This example will return the fromVal from the fromCore. The 
 fromKey and toKey are used to link the records from the main query to the 
 records in the fromCore. The query is used to select a specific set of 
 records to join with in fromCore.
 Currently the fromKey and toKey must be longs but this will change in future 
 versions. Like the pjoin, the join SolrCache is used to hold the join 
 memory structures.
 To configure the vjoin you must register the 

[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer

2013-08-22 Thread Isaac Hebsh (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747639#comment-13747639
 ] 

Isaac Hebsh commented on SOLR-4449:
---

[~phloy], can you provide an example for solrconfig.xml, which this plugin?

 Enable backup requests for the internal solr load balancer
 --

 Key: SOLR-4449
 URL: https://issues.apache.org/jira/browse/SOLR-4449
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: philip hoy
Priority: Minor
 Attachments: patch-4449.txt, SOLR-4449.patch, SOLR-4449.patch, 
 SOLR-4449.patch, solr-back-request-lb-plugin.jar


 Add the ability to configure the built-in solr load balancer such that it 
 submits a backup request to the next server in the list if the initial 
 request takes too long. Employing such an algorithm could improve the latency 
 of the 9xth percentile albeit at the expense of increasing overall load due 
 to additional requests. 
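A generic sketch of the backup-request idea described above, using plain 
java.util.concurrent rather than the plugin's actual API:

{code}
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class BackupRequestSketch {
  // Fire the request at one server; if it hasn't answered within the
  // threshold, fire the same request at the next server and return
  // whichever response arrives first.
  static <T> T queryWithBackup(Callable<T> primary, Callable<T> backup,
                               long thresholdMs, ExecutorService pool)
      throws Exception {
    CompletionService<T> cs = new ExecutorCompletionService<T>(pool);
    cs.submit(primary);
    Future<T> first = cs.poll(thresholdMs, TimeUnit.MILLISECONDS);
    if (first != null) {
      return first.get();   // primary answered within the threshold
    }
    cs.submit(backup);      // primary is slow: hedge with a backup request
    return cs.take().get(); // whichever request completes first wins
  }
}
{code}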

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-4449) Enable backup requests for the internal solr load balancer

2013-08-22 Thread Isaac Hebsh (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747639#comment-13747639
 ] 

Isaac Hebsh edited comment on SOLR-4449 at 8/22/13 4:31 PM:


[~phloy], can you provide an example for solrconfig.xml, with this plugin?

  was (Author: isaachebsh):
[~phloy], can you provide an example for solrconfig.xml, which this plugin?
  
 Enable backup requests for the internal solr load balancer
 --

 Key: SOLR-4449
 URL: https://issues.apache.org/jira/browse/SOLR-4449
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: philip hoy
Priority: Minor
 Attachments: patch-4449.txt, SOLR-4449.patch, SOLR-4449.patch, 
 SOLR-4449.patch, solr-back-request-lb-plugin.jar


 Add the ability to configure the built-in solr load balancer such that it 
 submits a backup request to the next server in the list if the initial 
 request takes too long. Employing such an algorithm could improve the latency 
 of the 9xth percentile albeit at the expense of increasing overall load due 
 to additional requests. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer

2013-08-22 Thread Isaac Hebsh (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747646#comment-13747646
 ] 

Isaac Hebsh commented on SOLR-4449:
---

And, as I asked in the duplicate issue SOLR-5092, the code contains a comment 
that says it's important to use the same replica for all phases of a 
distributed request.

I don't think it's important, because it might happen even without this plugin, 
if one replica goes down between two phases of the request.

Do you have any thoughts/conclusions about this?

 Enable backup requests for the internal solr load balancer
 --

 Key: SOLR-4449
 URL: https://issues.apache.org/jira/browse/SOLR-4449
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: philip hoy
Priority: Minor
 Attachments: patch-4449.txt, SOLR-4449.patch, SOLR-4449.patch, 
 SOLR-4449.patch, solr-back-request-lb-plugin.jar


 Add the ability to configure the built-in solr load balancer such that it 
 submits a backup request to the next server in the list if the initial 
 request takes too long. Employing such an algorithm could improve the latency 
 of the 9xth percentile albeit at the expense of increasing overall load due 
 to additional requests. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2013-08-22 Thread Andrew Muldowney (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747675#comment-13747675
 ] 

Andrew Muldowney commented on SOLR-2894:


The first error occurs on a line where the refinement response is being mined 
for its information. The line asks for the value and gets an NPE. Does your 
data contain nulls? I have code in place to deal with that situation, but it's 
possible I'm missing an edge case. Do you have any suggestions for a test case 
that would reproduce this error?

The second error never gets to anything I've changed, so I think MD5ing your 
docIDs is causing all sorts of other issues unrelated to this patch.

 Implement distributed pivot faceting
 

 Key: SOLR-2894
 URL: https://issues.apache.org/jira/browse/SOLR-2894
 Project: Solr
  Issue Type: Improvement
Reporter: Erik Hatcher
 Fix For: 4.5

 Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch


 Following up on SOLR-792, pivot faceting currently only supports 
 undistributed mode.  Distributed pivot faceting needs to be implemented.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5183) Add block support for JSONLoader

2013-08-22 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747760#comment-13747760
 ] 

Mikhail Khludnev commented on SOLR-5183:


Varun,

I'm not experienced in JSON, but wouldn't it be better to put them in an array?

{code}
"childrenDocs": [
  {
    "id": 5,
    "subject": "black"
  },
  {
    "id": 6,
    "subject": "red"
  }
]
{code}

wdyt?

 Add block support for JSONLoader
 

 Key: SOLR-5183
 URL: https://issues.apache.org/jira/browse/SOLR-5183
 Project: Solr
  Issue Type: Sub-task
Reporter: Varun Thacker
 Fix For: 4.5, 5.0

 Attachments: SOLR-5183.patch


 We should be able to index block documents in JSON format

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4267) binary packaging should include licenses/

2013-08-22 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747771#comment-13747771
 ] 

Steve Rowe commented on LUCENE-4267:


This can be resolved, fix version 4.0, no?

 binary packaging should include licenses/
 -

 Key: LUCENE-4267
 URL: https://issues.apache.org/jira/browse/LUCENE-4267
 Project: Lucene - Core
  Issue Type: Task
Reporter: Robert Muir

 I heavy-committed LUCENE-4262 to enable ivy's sync=true (which means not just 
 get the right jars, but delete shit that shouldn't be there) to end the whole 
 clean-jars issue.
 It's working except for the solr/lib (SOLR-3686), which we must fix for a 
 number of reasons.
 Anyway, because of this I made lucene/licenses and solr/licenses 
 directories respectively that contain all the .sha1/license/notice files for 
 3rd party jars, so ivy wouldn't delete them.
 We should update build patterns so these directories are in the binary 
 release; it's valuable information on our 3rd party licensing and additional 
 verification for consumers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-4787) Join Contrib

2013-08-22 Thread Kranti Parisa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747175#comment-13747175
 ] 

Kranti Parisa edited comment on SOLR-4787 at 8/22/13 9:21 PM:
--

Nested Joins! That's exactly what I am trying :) and thought about adding fq to 
the solr joins.

Using local-param in the first join:{code} {!join fromIndex=a from=f1 to=f2 
v=$joinQ}&joinQ=(field:123 AND _query_={another join}){code}. So here another 
join could be passed as a FQ and it should get results faster!! Hence the 
above query would look like:

{code} {!join fromIndex=a from=f1 to=f2 v=$joinQ 
fq=$joinFQ}&joinQ=(field:123)&joinFQ={another join}{code}

  was (Author: krantiparisa):
Nested Joins! That's exactly what I am trying :) and thought about adding 
fq to the solr joins.

Using local-param in the first join:{code} {!join fromIndex=a from=f1 to=f2 
v=$joinQ}&joinQ=(field:123 AND _query_={another join}){code}. So here another 
join could be passed as a FQ and it should get results faster!!
  
 Join Contrib
 

 Key: SOLR-4787
 URL: https://issues.apache.org/jira/browse/SOLR-4787
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.2.1
Reporter: Joel Bernstein
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch


 This contrib provides a place where different join implementations can be 
 contributed to Solr. This contrib currently includes 2 join implementations. 
 The initial patch was generated from the Solr 4.3 tag. Because of changes in 
 the FieldCache API this patch will only build with Solr 4.2 or above.
 *PostFilterJoinQParserPlugin aka pjoin*
 The pjoin provides a join implementation that filters results in one core 
 based on the results of a search in another core. This is similar in 
 functionality to the JoinQParserPlugin but the implementation differs in a 
 couple of important ways.
 The first way is that the pjoin is designed to work with integer join keys 
 only. So, in order to use pjoin, integer join keys must be included in both 
 the to and from core.
 The second difference is that the pjoin builds memory structures that are 
 used to quickly connect the join keys. It also uses a custom SolrCache named 
 join to hold intermediate DocSets which are needed to build the join memory 
 structures. So, the pjoin will need more memory than the JoinQParserPlugin to 
 perform the join.
 The main advantage of the pjoin is that it can scale to join millions of keys 
 between cores.
 Because it's a PostFilter, it only needs to join records that match the main 
 query.
 The syntax of the pjoin is the same as the JoinQParserPlugin except that the 
 plugin is referenced by the string pjoin rather than join.
 fq=\{!pjoin fromCore=collection2 from=id_i to=id_i\}user:customer1
 The example filter query above will search the fromCore (collection2) for 
 user:customer1. This query will generate a list of values from the from 
 field that will be used to filter the main query. Only records from the main 
 query, where the to field is present in the from list will be included in 
 the results.
 The solrconfig.xml in the main query core must contain the reference to the 
 pjoin.
 <queryParser name="pjoin" 
 class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>
 And the join contrib jars must be registered in the solrconfig.xml.
 <lib dir="../../../dist/" regex="solr-joins-\d.*\.jar" />
 The solrconfig.xml in the fromcore must have the join SolrCache configured.
  <cache name="join"
   class="solr.LRUCache"
   size="4096"
   initialSize="1024"
   />
 *ValueSourceJoinParserPlugin aka vjoin*
 The second implementation is the ValueSourceJoinParserPlugin aka vjoin. 
 This implements a ValueSource function query that can return a value from a 
 second core based on join keys and limiting query. The limiting query can be 
 used to select a specific subset of data from the join core. This allows 
 customer specific relevance data to be stored in a separate core and then 
 joined in the main query.
 The vjoin is called using the vjoin function query. For example:
 bf=vjoin(fromCore, fromKey, fromVal, toKey, query)
 This example shows vjoin being called by the edismax boost function 
 parameter. This example will return the fromVal from the fromCore. The 
 fromKey and toKey are used to link the records from the main query to the 
 records in the fromCore. The query is used to select a specific set of 
 records to join with in fromCore.
 Currently the fromKey and 

[jira] [Commented] (SOLR-4787) Join Contrib

2013-08-22 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747907#comment-13747907
 ] 

Joel Bernstein commented on SOLR-4787:
--

That's exactly the syntax. I'm just working out the caching details and then 
I'll put up the code.

Getting the queryResultCache and FilterCache to play nicely with nested joins 
is tricky.
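A hedged SolrJ rendering of the nested-join request sketched above; the fq 
local-param inside {!join} is this patch's proposed extension, not stock Solr, 
and the inner join query is illustrative:

{code}
import org.apache.solr.client.solrj.SolrQuery;

class NestedJoinExample {
  static SolrQuery buildQuery() {
    SolrQuery q = new SolrQuery();
    q.setQuery("{!join fromIndex=a from=f1 to=f2 v=$joinQ fq=$joinFQ}");
    q.set("joinQ", "field:123");
    q.set("joinFQ", "{!join fromIndex=b from=f3 to=f4}field2:456"); // made-up inner join
    return q;
  }
}
{code}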

 Join Contrib
 

 Key: SOLR-4787
 URL: https://issues.apache.org/jira/browse/SOLR-4787
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.2.1
Reporter: Joel Bernstein
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch


 This contrib provides a place where different join implementations can be 
 contributed to Solr. This contrib currently includes 2 join implementations. 
 The initial patch was generated from the Solr 4.3 tag. Because of changes in 
 the FieldCache API this patch will only build with Solr 4.2 or above.
 *PostFilterJoinQParserPlugin aka pjoin*
 The pjoin provides a join implementation that filters results in one core 
 based on the results of a search in another core. This is similar in 
 functionality to the JoinQParserPlugin but the implementation differs in a 
 couple of important ways.
 The first way is that the pjoin is designed to work with integer join keys 
 only. So, in order to use pjoin, integer join keys must be included in both 
 the to and from core.
 The second difference is that the pjoin builds memory structures that are 
 used to quickly connect the join keys. It also uses a custom SolrCache named 
 join to hold intermediate DocSets which are needed to build the join memory 
 structures. So, the pjoin will need more memory than the JoinQParserPlugin to 
 perform the join.
 The main advantage of the pjoin is that it can scale to join millions of keys 
 between cores.
 Because it's a PostFilter, it only needs to join records that match the main 
 query.
 The syntax of the pjoin is the same as the JoinQParserPlugin except that the 
 plugin is referenced by the string pjoin rather than join.
 fq=\{!pjoin fromCore=collection2 from=id_i to=id_i\}user:customer1
 The example filter query above will search the fromCore (collection2) for 
 user:customer1. This query will generate a list of values from the from 
 field that will be used to filter the main query. Only records from the main 
 query, where the to field is present in the from list will be included in 
 the results.
 The solrconfig.xml in the main query core must contain the reference to the 
 pjoin.
 <queryParser name="pjoin" 
 class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>
 And the join contrib jars must be registered in the solrconfig.xml.
 <lib dir="../../../dist/" regex="solr-joins-\d.*\.jar" />
 The solrconfig.xml in the fromcore must have the join SolrCache configured.
  <cache name="join"
   class="solr.LRUCache"
   size="4096"
   initialSize="1024"
   />
 *ValueSourceJoinParserPlugin aka vjoin*
 The second implementation is the ValueSourceJoinParserPlugin aka vjoin. 
 This implements a ValueSource function query that can return a value from a 
 second core based on join keys and limiting query. The limiting query can be 
 used to select a specific subset of data from the join core. This allows 
 customer specific relevance data to be stored in a separate core and then 
 joined in the main query.
 The vjoin is called using the vjoin function query. For example:
 bf=vjoin(fromCore, fromKey, fromVal, toKey, query)
 This example shows vjoin being called by the edismax boost function 
 parameter. This example will return the fromVal from the fromCore. The 
 fromKey and toKey are used to link the records from the main query to the 
 records in the fromCore. The query is used to select a specific set of 
 records to join with in fromCore.
 Currently the fromKey and toKey must be longs but this will change in future 
 versions. Like the pjoin, the join SolrCache is used to hold the join 
 memory structures.
 To configure the vjoin you must register the ValueSource plugin in the 
 solrconfig.xml as follows:
 <valueSourceParser name="vjoin" 
 class="org.apache.solr.joins.ValueSourceJoinParserPlugin" />
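A hedged SolrJ sketch of issuing the pjoin filter query and the vjoin boost 
described above; core, field, and function names are taken verbatim from the 
examples:

{code}
import org.apache.solr.client.solrj.SolrQuery;

class JoinContribExample {
  static SolrQuery buildQuery() {
    SolrQuery q = new SolrQuery("*:*");
    // PostFilter join: keep only main-query docs whose id_i appears in
    // collection2 docs matching user:customer1.
    q.addFilterQuery("{!pjoin fromCore=collection2 from=id_i to=id_i}user:customer1");
    // ValueSource join used as an edismax boost function.
    q.set("bf", "vjoin(fromCore, fromKey, fromVal, toKey, query)");
    return q;
  }
}
{code}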

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, 

[jira] [Commented] (SOLR-4787) Join Contrib

2013-08-22 Thread Kranti Parisa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747912#comment-13747912
 ] 

Kranti Parisa commented on SOLR-4787:
-

Cool, will you be able to put the code up here sometime tomorrow? I want to 
apply that patch and see how it performs.


 Join Contrib
 

 Key: SOLR-4787
 URL: https://issues.apache.org/jira/browse/SOLR-4787
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.2.1
Reporter: Joel Bernstein
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch


 This contrib provides a place where different join implementations can be 
 contributed to Solr. This contrib currently includes 2 join implementations. 
 The initial patch was generated from the Solr 4.3 tag. Because of changes in 
 the FieldCache API this patch will only build with Solr 4.2 or above.
 *PostFilterJoinQParserPlugin aka pjoin*
 The pjoin provides a join implementation that filters results in one core 
 based on the results of a search in another core. This is similar in 
 functionality to the JoinQParserPlugin but the implementation differs in a 
 couple of important ways.
 The first way is that the pjoin is designed to work with integer join keys 
 only. So, in order to use pjoin, integer join keys must be included in both 
 the to and from core.
 The second difference is that the pjoin builds memory structures that are 
 used to quickly connect the join keys. It also uses a custom SolrCache named 
 join to hold intermediate DocSets which are needed to build the join memory 
 structures. So, the pjoin will need more memory than the JoinQParserPlugin to 
 perform the join.
 The main advantage of the pjoin is that it can scale to join millions of keys 
 between cores.
 Because it's a PostFilter, it only needs to join records that match the main 
 query.
 The syntax of the pjoin is the same as the JoinQParserPlugin except that the 
 plugin is referenced by the string pjoin rather than join.
 fq=\{!pjoin fromCore=collection2 from=id_i to=id_i\}user:customer1
 The example filter query above will search the fromCore (collection2) for 
 user:customer1. This query will generate a list of values from the from 
 field that will be used to filter the main query. Only records from the main 
 query, where the to field is present in the from list will be included in 
 the results.
 The solrconfig.xml in the main query core must contain the reference to the 
 pjoin.
 <queryParser name="pjoin" 
 class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>
 And the join contrib jars must be registered in the solrconfig.xml.
 <lib dir="../../../dist/" regex="solr-joins-\d.*\.jar" />
 The solrconfig.xml in the fromcore must have the join SolrCache configured.
  <cache name="join"
   class="solr.LRUCache"
   size="4096"
   initialSize="1024"
   />
 *ValueSourceJoinParserPlugin aka vjoin*
 The second implementation is the ValueSourceJoinParserPlugin aka vjoin. 
 This implements a ValueSource function query that can return a value from a 
 second core based on join keys and limiting query. The limiting query can be 
 used to select a specific subset of data from the join core. This allows 
 customer specific relevance data to be stored in a separate core and then 
 joined in the main query.
 The vjoin is called using the vjoin function query. For example:
 bf=vjoin(fromCore, fromKey, fromVal, toKey, query)
 This example shows vjoin being called by the edismax boost function 
 parameter. This example will return the fromVal from the fromCore. The 
 fromKey and toKey are used to link the records from the main query to the 
 records in the fromCore. The query is used to select a specific set of 
 records to join with in fromCore.
 Currently the fromKey and toKey must be longs but this will change in future 
 versions. Like the pjoin, the join SolrCache is used to hold the join 
 memory structures.
 To configure the vjoin you must register the ValueSource plugin in the 
 solrconfig.xml as follows:
 <valueSourceParser name="vjoin" 
 class="org.apache.solr.joins.ValueSourceJoinParserPlugin" />

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4787) Join Contrib

2013-08-22 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747915#comment-13747915
 ] 

Joel Bernstein commented on SOLR-4787:
--

Yes. I'm very close now. I'll also need to write some quick docs because these 
joins have a lot more functionality. 

 Join Contrib
 

 Key: SOLR-4787
 URL: https://issues.apache.org/jira/browse/SOLR-4787
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.2.1
Reporter: Joel Bernstein
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
 SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch


 This contrib provides a place where different join implementations can be 
 contributed to Solr. This contrib currently includes 2 join implementations. 
 The initial patch was generated from the Solr 4.3 tag. Because of changes in 
 the FieldCache API this patch will only build with Solr 4.2 or above.
 *PostFilterJoinQParserPlugin aka pjoin*
 The pjoin provides a join implementation that filters results in one core 
 based on the results of a search in another core. This is similar in 
 functionality to the JoinQParserPlugin but the implementation differs in a 
 couple of important ways.
 The first way is that the pjoin is designed to work with integer join keys 
 only. So, in order to use pjoin, integer join keys must be included in both 
 the to and from core.
 The second difference is that the pjoin builds memory structures that are 
 used to quickly connect the join keys. It also uses a custom SolrCache named 
 join to hold intermediate DocSets which are needed to build the join memory 
 structures. So, the pjoin will need more memory than the JoinQParserPlugin to 
 perform the join.
 The main advantage of the pjoin is that it can scale to join millions of keys 
 between cores.
 Because it's a PostFilter, it only needs to join records that match the main 
 query.
 The syntax of the pjoin is the same as the JoinQParserPlugin except that the 
 plugin is referenced by the string pjoin rather than join.
 fq=\{!pjoin fromCore=collection2 from=id_i to=id_i\}user:customer1
 The example filter query above will search the fromCore (collection2) for 
 user:customer1. This query will generate a list of values from the from 
 field that will be used to filter the main query. Only records from the main 
 query, where the to field is present in the from list will be included in 
 the results.
 The solrconfig.xml in the main query core must contain the reference to the 
 pjoin.
 <queryParser name="pjoin" 
 class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>
 And the join contrib jars must be registered in the solrconfig.xml.
 <lib dir="../../../dist/" regex="solr-joins-\d.*\.jar" />
 The solrconfig.xml in the fromcore must have the join SolrCache configured.
  <cache name="join"
   class="solr.LRUCache"
   size="4096"
   initialSize="1024"
   />
 *ValueSourceJoinParserPlugin aka vjoin*
 The second implementation is the ValueSourceJoinParserPlugin aka vjoin. 
 This implements a ValueSource function query that can return a value from a 
 second core based on join keys and limiting query. The limiting query can be 
 used to select a specific subset of data from the join core. This allows 
 customer specific relevance data to be stored in a separate core and then 
 joined in the main query.
 The vjoin is called using the vjoin function query. For example:
 bf=vjoin(fromCore, fromKey, fromVal, toKey, query)
 This example shows vjoin being called by the edismax boost function 
 parameter. This example will return the fromVal from the fromCore. The 
 fromKey and toKey are used to link the records from the main query to the 
 records in the fromCore. The query is used to select a specific set of 
 records to join with in fromCore.
 Currently the fromKey and toKey must be longs, but this will change in future 
 versions. Like the pjoin, the "join" SolrCache is used to hold the join 
 memory structures.
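 As a sketch, an edismax request using vjoin might look like this (core and 
 field names are illustrative; id_l and boost_l are assumed long fields, per 
 the key restriction above):
{code}
http://localhost:8983/solr/collection1/select?defType=edismax&q=ipod&bf=vjoin(collection2,id_l,boost_l,id_l,user:customer1)
{code}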
 To configure the vjoin you must register the ValueSource plugin in the 
 solrconfig.xml as follows:
 <valueSourceParser name="vjoin" class="org.apache.solr.joins.ValueSourceJoinParserPlugin" />

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4787) Join Contrib

2013-08-22 Thread Kranti Parisa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747919#comment-13747919
 ] 

Kranti Parisa commented on SOLR-4787:
-

Awesome!

 Join Contrib
 

 Key: SOLR-4787
 URL: https://issues.apache.org/jira/browse/SOLR-4787




[jira] [Commented] (SOLR-5182) add regenerator for blockjoin cache

2013-08-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748020#comment-13748020
 ] 

ASF subversion and git services commented on SOLR-5182:
---

Commit 1516653 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1516653 ]

SOLR-5182: add regenerator for blockjoin cache

 add regenerator for blockjoin cache
 ---

 Key: SOLR-5182
 URL: https://issues.apache.org/jira/browse/SOLR-5182
 Project: Solr
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: SOLR-5182.patch


 The BlockJoin parsers cache by default with CachingWrapperFilter, but unless 
 the user writes some code, the parent filter will be totally discarded on every 
 commit (losing all cached segments and behaving as if it were top-level).
 This defeats the point... we should provide a regenerator that just copies 
 elements over for things like this.
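 As a sketch, the kind of cache declaration this would enable in solrconfig.xml 
 (the cache name and sizes are illustrative; solr.NoOpRegenerator stands for a 
 regenerator that simply carries entries over on commit instead of discarding them):
{code:xml}
<cache name="perSegFilter"
       class="solr.LRUCache"
       size="10"
       initialSize="0"
       autowarmCount="10"
       regenerator="solr.NoOpRegenerator" />
{code}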




[jira] [Resolved] (SOLR-5182) add regenerator for blockjoin cache

2013-08-22 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved SOLR-5182.
---

   Resolution: Fixed
Fix Version/s: 5.0
   4.5

 add regenerator for blockjoin cache
 ---

 Key: SOLR-5182
 URL: https://issues.apache.org/jira/browse/SOLR-5182
 Project: Solr
  Issue Type: Bug
Reporter: Robert Muir
 Fix For: 4.5, 5.0

 Attachments: SOLR-5182.patch






[jira] [Commented] (SOLR-5182) add regenerator for blockjoin cache

2013-08-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748023#comment-13748023
 ] 

ASF subversion and git services commented on SOLR-5182:
---

Commit 1516655 from [~rcmuir] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1516655 ]

SOLR-5182: add regenerator for blockjoin cache

 add regenerator for blockjoin cache
 ---

 Key: SOLR-5182
 URL: https://issues.apache.org/jira/browse/SOLR-5182
 Project: Solr
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: SOLR-5182.patch






[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud

2013-08-22 Thread Kevin Osborn (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748047#comment-13748047
 ] 

Kevin Osborn commented on SOLR-5081:


I may have this issue as well. I am posting batches of 1000 through SolrJ. I 
have autoCommit set to 15000 with openSearcher=false. autoSoftCommit is set to 
3. During my initial testing, I was able to recreate it after just a couple of 
updates. I then changed the limit on the number of open files for the process 
from 4096 to 15000. This seemed to help, but only to a point.
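
For reference, the hard-commit settings described above correspond to a 
solrconfig.xml fragment along these lines (the autoSoftCommit value is left 
out, since its intended units aren't clear from the text):
{code:xml}
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
{code}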

If all my updates are at once, it seems to succeed. But if I have pauses 
between updates, it seems to have problems. I have also only seen this error 
when I have more than 1 node in my SolrCloud cluster.

I also took a look at netstat. There seemed to be a lot of connections between 
my two nodes. Could the frequency of my updates be overwhelming the 
connection from the leader to the replica?

Deletes also fail, but queries still seem to work.

Restarting the nodes fixes the problem.

 Highly parallel document insertion hangs SolrCloud
 --

 Key: SOLR-5081
 URL: https://issues.apache.org/jira/browse/SOLR-5081
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3.1
Reporter: Mike Schrag
 Attachments: threads.txt


 If I do a highly parallel document load using a Hadoop cluster into an 18-node 
 SolrCloud cluster, I can deadlock Solr every time.
 The ulimits on the nodes are:
 core file size  (blocks, -c) 0
 data seg size   (kbytes, -d) unlimited
 scheduling priority (-e) 0
 file size   (blocks, -f) unlimited
 pending signals (-i) 1031181
 max locked memory   (kbytes, -l) unlimited
 max memory size (kbytes, -m) unlimited
 open files  (-n) 32768
 pipe size(512 bytes, -p) 8
 POSIX message queues (bytes, -q) 819200
 real-time priority  (-r) 0
 stack size  (kbytes, -s) 10240
 cpu time   (seconds, -t) unlimited
 max user processes  (-u) 515590
 virtual memory  (kbytes, -v) unlimited
 file locks  (-x) unlimited
 The open file count is only around 4000 when this happens.
 If I bounce all the servers, things start working again, which makes me think 
 this is Solr and not ZK.
 I'll attach the stack trace from one of the servers.




Of resource loaders, CacheHeaderTest and being puzzled.

2013-08-22 Thread Erick Erickson
I'm working on SOLR-4817 on trunk. The idea there is that if there is no
path to solr.xml, we should fail. That part is easy to do.

Where I'm having trouble is that this causes CacheHeaderTest to fail
miserably. I've fixed other test failures by setting up a Jetty instance,
creating a temporary directory as in other tests, and populating it with a
minimal set of config files.

But CacheHeaderTest doesn't succeed if I do that. The glaring difference is
this call:
createJetty("solr/", null, null);

If I create a temp dir that populates a directory (myHome) with solr.xml and
collection1/conf/ (the good stuff), and call createJetty(myHome.getAbsolutePath(),
null, null), then the test fails in one of several flavors.

I'm having real trouble figuring out where the hell the configs are read
from when the solrHome of "solr/" is used, and why it would behave
differently than a full configuration with an absolute path.

Any pointers appreciated. Otherwise I'll just try it again in the morning.

I know this is incoherent, but I'm at my wits' end...

Erick


[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-08-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748193#comment-13748193
 ] 

ASF subversion and git services commented on LUCENE-3069:
-

Commit 1516677 from [~billy] in branch 'dev/branches/lucene3069'
[ https://svn.apache.org/r1516677 ]

LUCENE-3069: API refactoring on MockRandom, revert suppress codec in 
compatibility test

 Lucene should have an entirely memory resident term dictionary
 --

 Key: LUCENE-3069
 URL: https://issues.apache.org/jira/browse/LUCENE-3069
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index, core/search
Affects Versions: 4.0-ALPHA
Reporter: Simon Willnauer
Assignee: Han Jiang
  Labels: gsoc2013
 Fix For: 5.0, 4.5

 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
 LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
 LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
 LUCENE-3069.patch


 The FST-based TermDictionary has been a great improvement, yet it still uses a 
 delta codec file for scanning to terms. Some environments have enough memory 
 available to keep the entire FST-based term dict in memory. We should add a 
 TermDictionary implementation that encodes all needed information for each 
 term into the FST (custom fst.Output) and builds an FST from the entire term, 
 not just the delta.
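 A minimal sketch of the idea, using the Lucene 4.x FST API (exact signatures 
 vary by version; the terms and values are illustrative):
{code:java}
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.IntsRef;
import org.apache.lucene.util.fst.*;

public class FstTermDictDemo {
  public static void main(String[] args) throws Exception {
    // Outputs that carry a per-term value directly in the FST,
    // rather than just a pointer/delta into a separate terms file.
    PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
    Builder<Long> builder = new Builder<Long>(FST.INPUT_TYPE.BYTE1, outputs);
    IntsRef scratch = new IntsRef();
    // Terms must be added in sorted order.
    builder.add(Util.toIntsRef(new BytesRef("apple"), scratch), 42L);
    builder.add(Util.toIntsRef(new BytesRef("banana"), scratch), 7L);
    FST<Long> fst = builder.finish();
    // Look a term's value back up straight from the FST.
    System.out.println(Util.get(fst, new BytesRef("apple"))); // 42
  }
}
{code}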




[jira] [Updated] (LUCENE-5186) Add CachingWrapperFilter.getFilter()

2013-08-22 Thread Trejkaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trejkaz updated LUCENE-5186:


Attachment: LUCENE-5186.patch

Since it's pretty trivial. :)

 Add CachingWrapperFilter.getFilter()
 

 Key: LUCENE-5186
 URL: https://issues.apache.org/jira/browse/LUCENE-5186
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Reporter: Trejkaz
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-5186.patch


 There are a couple of use cases I can think of where being able to get the 
 underlying filter out of CachingWrapperFilter would be useful:
 1. You might want to introspect the filter to figure out what's in it (the 
 use case we hit.)
 2. You might want to serialise the filter since Lucene no longer supports 
 that itself.
 We currently work around this by subclassing, keeping another copy of the 
 underlying filter reference, and implementing a trivial getter, which is an 
 easy workaround. The trap is that a junior developer could unknowingly 
 create a CachingWrapperFilter without knowing that the 
 BetterCachingWrapperFilter exists, introducing a filter which cannot be 
 introspected.
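 For the record, the workaround subclass amounts to this (a sketch; 
 BetterCachingWrapperFilter is just what we call it internally):
{code:java}
import org.apache.lucene.search.CachingWrapperFilter;
import org.apache.lucene.search.Filter;

// Keeps its own reference to the wrapped filter and exposes it,
// since CachingWrapperFilter itself does not.
public class BetterCachingWrapperFilter extends CachingWrapperFilter {
  private final Filter wrapped;

  public BetterCachingWrapperFilter(Filter filter) {
    super(filter);
    this.wrapped = filter;
  }

  public Filter getFilter() {
    return wrapped;
  }
}
{code}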




[jira] [Commented] (SOLR-5183) Add block support for JSONLoader

2013-08-22 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748280#comment-13748280
 ] 

Varun Thacker commented on SOLR-5183:
-

Hi Mikhail,

Ideally that would be the best way to represent the child docs.

The reason I thought of this format is the way we currently do single doc 
updates in JSON. We use:
{code:xml}
{
  "add": {
    "doc": {
      "id": 1
    }
  },
  "add": {
    "doc": {
      "id": 2
    }
  }
}
{code}

Instead of...

{code:xml}
{
  "add": {
    "docs": [
      { "id": 1 },
      { "id": 2 }
    ]
  }
}
{code}
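
A nested representation, just as a sketch of one possibility (the 
_childDocuments_ key is purely illustrative, not a committed format):

{code:xml}
{
  "add": {
    "doc": {
      "id": 1,
      "_childDocuments_": [
        { "id": 2 },
        { "id": 3 }
      ]
    }
  }
}
{code}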

 Add block support for JSONLoader
 

 Key: SOLR-5183
 URL: https://issues.apache.org/jira/browse/SOLR-5183
 Project: Solr
  Issue Type: Sub-task
Reporter: Varun Thacker
 Fix For: 4.5, 5.0

 Attachments: SOLR-5183.patch


 We should be able to index block documents in JSON format




[jira] [Resolved] (LUCENE-5185) licenses/*.jar.sha1 don't belong in Lucene and Solr binary distributions

2013-08-22 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe resolved LUCENE-5185.


   Resolution: Won't Fix
Fix Version/s: (was: 4.5)
   (was: 5.0)

I don't think it's useful to have checksums for only some of the files within 
an archive that is itself checksummed.  But you seem to think it's useful, 
Robert, so I won't pursue removing them.

 licenses/*.jar.sha1 don't belong in Lucene and Solr binary distributions
 

 Key: LUCENE-5185
 URL: https://issues.apache.org/jira/browse/LUCENE-5185
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/build
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Minor

 On LUCENE-3945, where external dependency checksum verification was put in 
 place, [~hossman_luc...@fucit.org] wrote:
 bq. So i propose that we include checksum files in svn and in our source 
 releases that can be used by users to verify that the jars they get from ivy 
 match the jars we tested against.
 That is, checksum files in *binary* distributions was not part of the 
 proposal.
 And [in his comment associated with the final 
patch|https://issues.apache.org/jira/browse/LUCENE-3945?focusedCommentId=13246476&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13246476]:
 bq. 2) fixes the binary releases to exclude the sha1 files
 Somewhere between then and now, {{\*.jar.sha1}} files snuck back into the 
 Lucene and Solr binary releases, under the {{licenses/}} directory.  They 
 should not be there.




[jira] [Commented] (SOLR-5170) Spatial multi-value distance sort via DocValues

2013-08-22 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748318#comment-13748318
 ] 

Bill Bell commented on SOLR-5170:
-

David,

How many points is the limit when it adds up? Does it give an OOM exception? 
Or does it just take longer and longer to respond? 

In most use cases there is almost no need to cache the geospatial search 
results, since most users are running queries from multiple locations (with GEO 
IP targeting). At least that is our use case. If the corpus of points is high, 
is there an approximation that can be used to reduce it first, and then run the 
circle radius? For example, fq={!cache=false cost=10}lat:[X TO Y] AND long:[X1 TO Y1], 
and then apply fq={!geofilt cost=100} or geodist?

We have found that doing that speeds things up... Wonder if the code could just 
do that for us?
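
Concretely, the two-step filtering described above might look like this (field 
names and coordinates are illustrative):

{code}
fq={!cache=false cost=10}lat:[44.0 TO 46.0] AND long:[-94.0 TO -92.0]
fq={!geofilt cache=false cost=100 sfield=store pt=45.15,-93.85 d=5}
{code}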



 Spatial multi-value distance sort via DocValues
 ---

 Key: SOLR-5170
 URL: https://issues.apache.org/jira/browse/SOLR-5170
 Project: Solr
  Issue Type: New Feature
  Components: spatial
Reporter: David Smiley
Assignee: David Smiley
 Attachments: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch


 The attached patch implements spatial multi-value distance sorting. In other 
 words, a document can have more than one point per field, and using a 
 provided function query, it will return the distance to the closest point. 
 The data goes into binary DocValues, and as such it's pretty friendly to 
 realtime search requirements, and it only uses 8 bytes per point.
