[Lucene.Net] [jira] [Updated] (LUCENENET-430) Contrib.ChainedFilter

2011-07-04 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-430:
---

Attachment: ChainedFilterTest.cs
ChainedFilter.cs

 Contrib.ChainedFilter
 -

 Key: LUCENENET-430
 URL: https://issues.apache.org/jira/browse/LUCENENET-430
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4g

 Attachments: ChainedFilter.cs, ChainedFilterTest.cs


 Port of Lucene Java 3.0.3's ChainedFilter and its test cases.
 See the StackOverflow question: How to combine multiple filters within one 
 search?
 http://stackoverflow.com/questions/6570477/multiple-filters-in-lucene-net
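
As a rough illustration of what combining several filters in one search amounts to, the sketch below folds per-filter document bitsets together with AND, OR, ANDNOT and XOR. It is a simplified Python model and not the attached C# port: the names `combine`, `chain` and `matching_docs` and the operator constants are hypothetical, and each filter is assumed to be a plain integer whose bit i is set when document i passes.

```python
from functools import reduce

# Hypothetical operator names for this sketch (not taken from the attached port).
AND, OR, ANDNOT, XOR = "AND", "OR", "ANDNOT", "XOR"

def combine(acc, bits, op):
    """Fold one filter's bitset into the accumulated result."""
    if op == AND:
        return acc & bits
    if op == OR:
        return acc | bits
    if op == ANDNOT:
        return acc & ~bits
    if op == XOR:
        return acc ^ bits
    raise ValueError("unknown operator: %s" % op)

def chain(filters, op):
    """Combine all filter bitsets left to right with a single operator.

    Assumption of this sketch: the first filter seeds the result.
    """
    if not filters:
        return 0
    return reduce(lambda acc, bits: combine(acc, bits, op), filters[1:], filters[0])

def matching_docs(bits):
    """List the document ids whose bit is set."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

# Three example filters over documents 0..5.
in_stock = 0b101101   # docs 0, 2, 3, 5
recent = 0b001111     # docs 0, 1, 2, 3
flagged = 0b000100    # doc 2

both = matching_docs(chain([in_stock, recent], AND))
not_flagged = matching_docs(chain([in_stock, flagged], ANDNOT))
```

Here `both` holds the documents accepted by both filters and `not_flagged` the in-stock documents that are not flagged.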

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[Lucene.Net] [jira] [Created] (LUCENENET-430) Contrib.ChainedFilter

2011-07-04 Thread Digy (JIRA)
Contrib.ChainedFilter
-

 Key: LUCENENET-430
 URL: https://issues.apache.org/jira/browse/LUCENENET-430
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4g
 Attachments: ChainedFilter.cs, ChainedFilterTest.cs

Port of Lucene Java 3.0.3's ChainedFilter and its test cases.

See the StackOverflow question: How to combine multiple filters within one 
search?
http://stackoverflow.com/questions/6570477/multiple-filters-in-lucene-net





Re: [VOTE] Release PyLucene 3.3.0

2011-07-04 Thread Michael McCandless
Sorry, I should have included my errors the first time around:

In file included from build/_lucene/__wrap03__.cpp:514:
build/_lucene/org/apache/lucene/search/grouping/AbstractSecondPassGroupingCollector$SearchGroupDocs.h:55:
error: expected unqualified-id before '' token
build/_lucene/org/apache/lucene/search/grouping/AbstractSecondPassGroupingCollector$SearchGroupDocs.h:55:
error: expected ',' or '...' before '' token
build/_lucene/__wrap03__.cpp:548: error: expected unqualified-id
before '' token
build/_lucene/__wrap03__.cpp:548: error: expected ',' or '...' before '' token
build/_lucene/__wrap03__.cpp: In constructor
'org::apache::lucene::search::grouping::AbstractSecondPassGroupingCollector$SearchGroupDocs::AbstractSecondPassGroupingCollector$SearchGroupDocs()':
build/_lucene/__wrap03__.cpp:548: error: 'a0' was not declared in this scope
build/_lucene/__wrap03__.cpp:548: error: 'a1' was not declared in this scope
build/_lucene/__wrap03__.cpp:548: error: 'a2' was not declared in this scope
build/_lucene/__wrap03__.cpp: In function 'int
org::apache::lucene::search::grouping::t_AbstractSecondPassGroupingCollector$SearchGroupDocs_init_(org::apache::lucene::search::grouping::t_AbstractSecondPassGroupingCollector$SearchGroupDocs*,
PyObject*, PyObject*)':
build/_lucene/__wrap03__.cpp:653: error:
'AbstractSecondPassGroupingCollector' is not a member of
'org::apache::lucene::search::grouping'
build/_lucene/__wrap03__.cpp:653: error: expected `;' before 'a0'
build/_lucene/__wrap03__.cpp:660: error:
'org::apache::lucene::search::grouping::AbstractSecondPassGroupingCollector'
has not been declared
build/_lucene/__wrap03__.cpp:660: error: 'a0' was not declared in this scope
build/_lucene/__wrap03__.cpp:660: error:
'org::apache::lucene::search::grouping::t_AbstractSecondPassGroupingCollector'
has not been declared
error: command 'gcc-4.2' failed with exit status 1
make: *** [compile] Error 1

My env is OS X 10.6.6, Apple's build of Python (2.6.1), Java 1.6.0_22.

Mike McCandless

http://blog.mikemccandless.com

On Sun, Jul 3, 2011 at 12:17 PM, Andi Vajda va...@apache.org wrote:

  Hi Mike,

 On Sun, 3 Jul 2011, Michael McCandless wrote:

 Re-send, this time to pylucene-dev:

 Everything looks good -- I was able to compile, run all tests
 successfully, and run my usual smoke test (indexing, optimizing, and
 searching on the first 100K Wikipedia docs), but...

 I then tried to enable the grouping module (lucene/contrib/grouping),
 by adding a GROUPING_JAR matching all the other contrib jars, and
 running make.  This then hit various compilation errors -- is anyone
 able to enable the grouping module and compile successfully?

 What kind of errors ?

 So I added the grouping module to the PyLucene branch_3x build and it just
 built (tm). I even committed the change to the build (rev 1142455) but I
 didn't check that the grouping module was functional in PyLucene as I didn't
 port any unit tests or even know much about it.

 Andi..


 Mike McCandless

 http://blog.mikemccandless.com

 On Sun, Jul 3, 2011 at 10:14 AM, Michael McCandless
 luc...@mikemccandless.com wrote:

 Everything looks good -- I was able to compile, run all tests
 successfully, and run my usual smoke test (indexing, optimizing, and
 searching on the first 100K Wikipedia docs), but...

 I then tried to enable the grouping module (lucene/contrib/grouping),
 by adding a GROUPING_JAR matching all the other contrib jars, and
 running make.  This then hit various compilation errors -- is anyone
 able to enable the grouping module and compile successfully?

 Mike McCandless

 http://blog.mikemccandless.com

 On Fri, Jul 1, 2011 at 8:24 AM, Andi Vajda va...@apache.org wrote:

 The PyLucene 3.3.0-1 release closely tracking the recent release of
 Lucene
 Java 3.3 is ready.

 A release candidate is available from:
 http://people.apache.org/~vajda/staging_area/

 A list of changes in this release can be seen at:

 http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_3/CHANGES

 PyLucene 3.3.0 is built with JCC 2.9 included in these release
 artifacts.

 A list of Lucene Java changes can be seen at:

 http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_3/lucene/CHANGES.txt

 Please vote to release these artifacts as PyLucene 3.3.0-1.

 Thanks !

 Andi..

 ps: the KEYS file for PyLucene release signing is at:
 http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS
 http://people.apache.org/~vajda/staging_area/KEYS

 pps: here is my +1





Problems building JCC

2011-07-04 Thread Petrus Hyvönen
Hi,

This is likely another FAQ, but:

I've moved to a Windows 7 machine (64-bit) and am trying to compile JCC. The
mingw32 compiler, JDK, and JRE are installed. I'm getting a
"libjcc.a: No such file or directory" error. javac is available at the
command prompt.

Building with:

python setup.py build --compiler=mingw32

Any help highly appreciated.
/Petrus


writing build\temp.win32-2.6\Release\jcc\sources\jcc.def
C:\Program Files (x86)\pythonxy\mingw\bin\g++.exe -mno-cygwin -mdll -static --entry _DllMain@12 -Wl,--out-implib,build\lib.win32-2.6\jcc\jcc.lib --output-lib build\temp.win32-2.6\Release\jcc\sources\libjcc.a --def build\temp.win32-2.6\Release\jcc\sources\jcc.def -s build\temp.win32-2.6\Release\jcc\sources\jcc.o build\temp.win32-2.6\Release\jcc\sources\jccenv.o -LC:\Python26\libs -LC:\Python26\PCbuild -lpython26 -lmsvcr90 -o build\lib.win32-2.6\jcc.dll -LC:\Program Files (x86)\Java\jdk1.6.0_26/lib -ljvm -Wl,-S -Wl,--out-implib,jcc\jcc.lib
g++: build\temp.win32-2.6\Release\jcc\sources\libjcc.a: No such file or directory
error: command 'g++' failed with exit status 1


[jira] [Updated] (LUCENE-2979) Simplify configuration API of contrib Query Parser

2011-07-04 Thread Phillipe Ramalho (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phillipe Ramalho updated LUCENE-2979:
-

Attachment: LUCENE-2979_phillipe_ramalho_2.patch

As Adriano asked me, here is the first patch ready to be committed. It includes 
javadoc and package.html and overview.html updated based on the changes I made 
to the code.

I am still working on integrating the new API with the old API.

 Simplify configuration API of contrib Query Parser
 --

 Key: LUCENE-2979
 URL: https://issues.apache.org/jira/browse/LUCENE-2979
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 2.9, 3.0
Reporter: Adriano Crestani
Assignee: Adriano Crestani
  Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor
 Fix For: 3.4, 4.0

 Attachments: LUCENE-2979_phillipe_ramalho_2.patch, 
 LUCENE-2979_phillipe_reamalho.patch


 The current configuration API is very complicated and inherits the concept 
 used by the Attribute API to store token information in token streams. However, 
 the requirements for the two (QP config and token stream) are not the same, so 
 they shouldn't be using the same thing.
 I propose to simplify QP config and make it less scary for people intending 
 to use contrib QP. The task is not difficult; it will just require a lot of 
 code changes and figuring out the best way to do it. That's why it's a good 
 candidate for a GSoC project.
 I would like to hear good proposals about how to make the API more friendly 
 and less scary :)




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-tests-only-trunk - Build # 9300 - Failure

2011-07-04 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9300/

1 tests failed.
REGRESSION:  org.apache.solr.client.solrj.TestLBHttpSolrServer.testSimple

Error Message:
expected:<3> but was:<2>

Stack Trace:
junit.framework.AssertionFailedError: expected:<3> but was:<2>
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1430)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1348)
at 
org.apache.solr.client.solrj.TestLBHttpSolrServer.testSimple(TestLBHttpSolrServer.java:127)




Build Log (for compile errors):
[...truncated 7907 lines...]






[jira] [Commented] (LUCENE-2979) Simplify configuration API of contrib Query Parser

2011-07-04 Thread Adriano Crestani (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059329#comment-13059329
 ] 

Adriano Crestani commented on LUCENE-2979:
--

Hi Phillipe, thanks for the patch. However, as you made many changes to the 
javadocs, I decided to run ant javadocs, and it fails. It seems your patch 
references the constants in StandardQueryConfigHandler.ConfigurationKeys many 
times using the @see tag; unfortunately, you forgot to create javadoc for those 
constants, and that is causing the ant script to fail. Please add the missing 
javadocs, run ant javadocs on contrib/queryparser to check that it finishes 
successfully, and then submit a new patch.

Besides that, great job, tests are running fine even after your big change :)

Thanks!





[jira] [Commented] (SOLR-1499) SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ

2011-07-04 Thread Ahmet Arslan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059333#comment-13059333
 ] 

Ahmet Arslan commented on SOLR-1499:


Lance, I used it once to upgrade. 

 SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via 
 SolrJ
 -

 Key: SOLR-1499
 URL: https://issues.apache.org/jira/browse/SOLR-1499
 Project: Solr
  Issue Type: New Feature
  Components: contrib - DataImportHandler
Reporter: Lance Norskog
 Fix For: 3.4, 4.0

 Attachments: SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, 
 SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch


 The SolrEntityProcessor queries an external Solr instance. The Solr documents 
 returned are unpacked and emitted as DIH fields.
 The SolrEntityProcessor uses the following attributes:
 * solr='http://localhost:8983/solr/sms'
 ** This gives the URL of the target Solr instance.
 *** Note: the connection to the target Solr uses the binary SolrJ format.
 * query='Jefferson&sort=id+asc'
 ** This gives the base query string used with Solr. It can include any 
 standard Solr request parameter. This attribute is processed under the 
 variable resolution rules and can be driven in an inner stage of the indexing 
 pipeline.
 * rows='10'
 ** This gives the number of rows to fetch per request.
 ** The SolrEntityProcessor always fetches every document that matches the 
 request.
 * fields='id,tag'
 ** This selects the fields to be returned from the Solr request.
 ** These must also be declared as field elements.
 ** As with all fields, template processors can be used to alter the contents 
 to be passed downwards.
 * timeout='30'
 ** This limits the query to 30 seconds. This can be used as a fail-safe to 
 prevent the indexing session from freezing up. By default the timeout is 5 
 minutes.
 Limitations:
 * Solr errors are not handled correctly.
 * Loop control constructs have not been tested.
 * Multi-valued returned fields have not been tested.
 The unit tests give examples of how to use it as the root entity and an inner 
 entity.
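
The combination of a per-request `rows` value and "always fetches every document" can be read as a paging loop: ask for `rows` documents at a time and advance the offset until a short page comes back. The Python sketch below only illustrates that reading and is not the patch's code; `fetch_all` is a made-up name, and `fetch_page(start, rows)` is a hypothetical callable standing in for the real SolrJ request.

```python
def fetch_all(fetch_page, rows=10):
    """Yield every matching document, requesting `rows` documents per call.

    `fetch_page(start, rows)` is a hypothetical callable returning the list
    of documents for one request (a stand-in for the real Solr query).
    """
    start = 0
    while True:
        page = fetch_page(start, rows)
        for doc in page:
            yield doc
        if len(page) < rows:
            break  # a short (or empty) page means nothing is left
        start += rows

# Stand-in for a Solr index holding 25 matching documents.
INDEX = [{"id": i, "tag": "t%d" % i} for i in range(25)]
calls = []

def fake_fetch(start, rows):
    """Record the requested offset and return one slice of the fake index."""
    calls.append(start)
    return INDEX[start:start + rows]

docs = list(fetch_all(fake_fetch, rows=10))
```

With 25 matching documents and rows=10, the loop issues three requests (offsets 0, 10 and 20) and stops after the third, shorter page.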




[jira] [Updated] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser

2011-07-04 Thread Chris Male (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Male updated LUCENE-3273:
---

Attachment: LUCENE-3273.patch

Patch with first shot at this.

- MockQueryParser is introduced.  It handles a very simple syntax consisting of 
boolean operators and can identify Wildcard queries.  Cannot handle complex 
BooleanQuerys, boosts or PhraseQuerys.
- QueryBuilderHelper is introduced, which provides some utility methods for 
building queries (currently just to create a TermQuery with a boost)
- BooleanQueryBuilder and PhraseQueryBuilder are introduced to ease the process 
of programmatically creating complex BooleanQuerys and PhraseQuerys.

- All core Lucene tests (apart from those in the queryparser package) have been 
moved away from relying on QueryParser.  In extremely trivial situations, 
TermQuerys are now directly instantiated.  In others, the MockQueryParser is 
used.  In complex scenarios, the Builder classes are used to programmatically 
create the queries.
- Some tests have been split up and moved around.  Tests that did both parsing 
assertions and search assertions have been split so the parsing assertions go 
into TestQueryParser (since they are testing the QP's supported language).

Next step is to visit the contrib tests and clear those out too, so we can 
prevent any back dependencies on the queryparser module.

 Convert Lucene Core tests over to a simple MockQueryParser
 --

 Key: LUCENE-3273
 URL: https://issues.apache.org/jira/browse/LUCENE-3273
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: core/other
Reporter: Chris Male
 Attachments: LUCENE-3273.patch


 Most tests use Lucene Core's QueryParser for convenience.  We want to 
 consolidate it into a QP module which we can't have as a dependency.  We 
 should add a simple MockQueryParser which does String.split() on the query 
 string, analyzes the terms and builds a BooleanQuery if necessary.  Any more 
 complex Queries (such as phrases) should be done programmatically. 
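
The split-analyze-build idea described above can be sketched in a few lines. This is a Python model under stated assumptions, not the code in the attached patch: the `TermQuery`, `WildcardQuery` and `BooleanQuery` classes here are small stand-ins, the `+`/`-` prefix syntax is assumed for the boolean operators, and "analysis" is reduced to lowercasing.

```python
from dataclasses import dataclass

@dataclass
class TermQuery:
    field: str
    text: str

@dataclass
class WildcardQuery:
    field: str
    pattern: str

@dataclass
class BooleanQuery:
    clauses: list  # list of (occur, query) pairs

def parse(query, default_field, analyze=str.lower):
    """Split on whitespace, analyze each term, wrap in a BooleanQuery if needed."""
    clauses = []
    for token in query.split():
        occur = "SHOULD"
        if token[0] == "+":
            occur, token = "MUST", token[1:]
        elif token[0] == "-":
            occur, token = "MUST_NOT", token[1:]
        if not token:
            continue  # a bare "+" or "-" carries no term
        if "*" in token or "?" in token:
            # Wildcard patterns are passed through without analysis.
            q = WildcardQuery(default_field, token)
        else:
            q = TermQuery(default_field, analyze(token))
        clauses.append((occur, q))
    # A single non-negated term does not need a BooleanQuery wrapper.
    if len(clauses) == 1 and clauses[0][0] != "MUST_NOT":
        return clauses[0][1]
    return BooleanQuery(clauses)

single = parse("Lucene", "body")
mixed = parse("+fast search*", "body")
```

Anything beyond this (phrases, boosts, nested booleans) would be built programmatically rather than parsed, as the description says.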




[jira] [Commented] (SOLR-2632) Highlighting does not work for embedded boost query that boosts a dismax query

2011-07-04 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059371#comment-13059371
 ] 

Koji Sekiguchi commented on SOLR-2632:
--

{quote}
http://localhost:8983/solr/select?q=%2binStock:true%20%2b_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name
For this query, highlighting does not work. Specifying hl.fl or not does not 
influence the result. The result is:
<lst name="highlighting">
  <lst name="GB18030TEST"/>
  <lst name="UTF8TEST"/>
</lst>
{quote}

This request creates a BooleanQuery that is composed of TermQuery(inStock, 
true) and a BoostedQuery. Lucene's Highlighter knows TermQuery but doesn't know 
how to deal with Solr's BoostedQuery. The BoostedQuery should include the 
TermQuery(name, test) that you want to highlight, but Lucene knows nothing about 
BoostedQuery, so the Highlighter ignores the entire BoostedQuery.
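
The explanation above can be modelled as a term extractor that descends only into query types it recognizes. The Python sketch below is a deliberately simplified model and not Lucene's or Solr's actual highlighter code; the three query classes and `extract_terms` are hypothetical stand-ins.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TermQuery:
    field: str
    text: str

@dataclass(frozen=True)
class BooleanQuery:
    clauses: tuple

@dataclass(frozen=True)
class BoostedQuery:
    inner: object   # the wrapped query
    boost: str      # the boost function, kept as text in this sketch

def extract_terms(query):
    """Collect highlightable terms, descending only into known query types."""
    if isinstance(query, TermQuery):
        return [query]
    if isinstance(query, BooleanQuery):
        terms = []
        for clause in query.clauses:
            terms.extend(extract_terms(clause))
        return terms
    # Unknown wrapper (here: BoostedQuery): its inner terms are never seen.
    return []

q = BooleanQuery((
    TermQuery("inStock", "true"),
    BoostedQuery(TermQuery("name", "test"),
                 "recip(ms(NOW,last_modified),3.16e-11,1,1)"),
))
terms = extract_terms(q)
```

In this model only the `inStock` term is extracted, so the `name:test` term wrapped by the boost never reaches the highlighter.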

 Highlighting does not work for embedded boost query that boosts a dismax query
 --

 Key: SOLR-2632
 URL: https://issues.apache.org/jira/browse/SOLR-2632
 Project: Solr
  Issue Type: Bug
  Components: highlighter
Affects Versions: 1.4.1, 3.2, 3.3
 Environment: Linux.
 Reproduced in different machines with different Linux distributions and 
 different JDK's.
 Solr 3.3 and Lucidworks for solr 1.4.1 and 3.2.
Reporter: Juan Antonio Farré Basurte
Priority: Minor
  Labels: _query_, boost, dismax, edismax, embedded, highlighting, 
 hl.fl, query

 I need to issue a dismax query, with date boost (I'd like to use the 
 multiplicative boost provided by boost queries) and also filtering for other 
 fields with too many possible distinct values to fit in a filter query. To 
 achieve this, I use the boost query as a nested query using the pseudofield 
 _query_. I also need highlighting for the fields used in the dismax query, 
 but highlighting does not work. If I just use the boosted dismax query 
 without embedding it inside another query, it works correctly. If I use bf 
 instead of a boost query, and embed directly the dismax query, it works too, 
 but hl.fl needs to be specified.
 It's a bit complicated to explain, so, I'll give examples using the example 
 data that comes with solr (the problem is reproducible in the example solr 
 distribution, not only in my concrete project).
 http://localhost:8983/solr/select?q=%2binStock:true%20%2b_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name
 For this query, highlighting does not work. Specifying hl.fl or not does not 
 influence the result. The result is:
 <lst name="highlighting">
   <lst name="GB18030TEST"/>
   <lst name="UTF8TEST"/>
 </lst>
 http://localhost:8983/solr/select?q=_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name
 This doesn't work either. Same result.
 http://localhost:8983/solr/select?q={!boost b=$dateboost v=$qq defType=dismax}&qq=test&qf=name&dateboost=recip(ms(NOW,last_modified),3.16e-11,1,1)&hl=true
 In this case, highlighting works correctly:
 <lst name="highlighting">
   <lst name="GB18030TEST">
     <arr name="name">
       <str><em>Test</em> with some GB18030 encoded characters</str>
     </arr>
   </lst>
   <lst name="UTF8TEST">
     <arr name="name">
       <str><em>Test</em> with some UTF-8 encoded characters</str>
     </arr>
   </lst>
 </lst>
 http://localhost:8983/solr/select?q=%2BinStock:true%20%2B_query_:%22{!dismax%20v=$qq}%22&qq=test&qf=name&bf=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name
 This also works. Same result as before. But in this case hl.fl is needed. 
 Without it, highlighting does not work either.
 Thanks.
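
Request URLs of this kind are easy to get wrong by hand, because the nested query, the local-params block and the `$`-dereferenced parameters all need escaping. A small Python sketch using only the standard library is shown below; the parameter values are taken from the first example in this report, while the variable names are arbitrary.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Each request parameter is kept separate; urlencode escapes the values and
# joins the pairs with "&".
params = {
    "q": '+inStock:true +_query_:"{!boost b=$dateboost v=$qq defType=dismax}"',
    "qq": "test",
    "qf": "name",
    "dateboost": "recip(ms(NOW,last_modified),3.16e-11,1,1)",
    "hl": "true",
    "hl.fl": "name",
}
url = "http://localhost:8983/solr/select?" + urlencode(params)

# Decoding the query string again recovers the original parameter values.
decoded = parse_qs(urlsplit(url).query)
```

`urlencode` writes spaces as `+` and a literal plus sign as `%2B`, which is why the required-clause markers in `q` must not be typed as raw `+` characters in a hand-written URL.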




[jira] [Commented] (SOLR-2632) Highlighting does not work for embedded boost query that boosts a dismax query

2011-07-04 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059375#comment-13059375
 ] 

Juan Antonio Farré Basurte commented on SOLR-2632:
--

Sounds logical, but... if highlighter doesn't know how to deal with 
BoostedQuery, then why does it work when I issue the boosted query alone, 
without embedding it in the boolean query?
Maybe I'm wrong, but it looks to me more like a problem of embedding the 
boosted query into the boolean query than a problem with the boosted query itself. 
In fact, as you can see in my examples, if I directly embed the dismax query 
(without the boost query) in the boolean query, it works, but it requires 
specifying hl.fl, when I believe it should just use the qf.
My feeling is that the highlighter has problems dealing with embedded queries. 
The problems get worse if you embed boosted queries.





[jira] [Updated] (LUCENE-3269) Speed up Top-K sampling tests

2011-07-04 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3269:


Attachment: LUCENE-3269.patch

here's a patch that speeds up the slowest ones a bit (doesn't really solve the 
problem, but helps as a step)

 Speed up Top-K sampling tests
 -

 Key: LUCENE-3269
 URL: https://issues.apache.org/jira/browse/LUCENE-3269
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3269.patch


 speed up the top-k sampling tests (but make sure they are thorough on nightly 
 etc still)
 usually we would do this with use of atLeast(), but these tests are somewhat 
 tricky,
 so maybe a different approach is needed.




[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans

2011-07-04 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059381#comment-13059381
 ] 

Simon Willnauer commented on LUCENE-2878:
-

Mike, it's so awesome that you are helping here. I will be back on Wednesday and 
post comments / suggestions then.

simon

 Allow Scorer to expose positions and payloads aka. nuke spans 
 --

 Key: LUCENE-2878
 URL: https://issues.apache.org/jira/browse/LUCENE-2878
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Affects Versions: Bulk Postings branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
  Labels: gsoc2011, lucene-gsoc-11, mentor
 Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, 
 PosHighlighter.patch


 Currently we have two somewhat separate types of queries: those that can 
 make use of positions (mainly spans) and payloads (spans), and those that 
 cannot. Yet Span*Query doesn't really do scoring comparable to what other 
 queries do, and at the end of the day they are duplicating a lot of code all 
 over Lucene. Span*Queries are also limited to other Span*Query instances, such 
 that you cannot use a TermQuery or a BooleanQuery with SpanNear or anything 
 like that. 
 Besides the Span*Query limitation, other queries lack a quite interesting 
 feature: they cannot score based on term proximity, since scorers don't 
 expose any positional information. All those problems have bugged me for a 
 while now, so I started working on that using the bulkpostings API. I would 
 have done that first cut on trunk, but TermScorer there is working on a 
 BlockReader that does not expose positions, while the one in this branch does. 
 I started adding a new Positions class which users can pull from a scorer. To 
 prevent unnecessary positions enums I added ScorerContext#needsPositions and 
 eventually Scorer#needsPayloads to create the corresponding enum on demand. 
 Yet, currently only TermQuery / TermScorer implements this API and the others 
 simply return null instead. 
 To show that the API really works and that our BulkPostings work fine with 
 positions too, I cut over TermSpanQuery to use a TermScorer under the hood and 
 nuked TermSpans entirely. A nice side effect of this was that the Position 
 BulkReading implementation got some exercise, and it now :) works with 
 positions throughout, while Payloads for bulk reading are kind of experimental 
 in the patch and only work with the Standard codec. 
 So all spans now work on top of TermScorer (I truly hate spans since today), 
 including the ones that need Payloads (StandardCodec ONLY)!!  I didn't bother 
 to implement the other codecs yet since I want to get feedback on the API and 
 on this first cut before I go on with it. I will upload the corresponding 
 patch in a minute. 
 I also had to cut over SpanQuery.getSpans(IR) to 
 SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk 
 first, but after that pain today I need a break first :).
 The patch passes all core tests 
 (org.apache.lucene.search.highlight.HighlighterTest still fails, but I didn't 
 look into the MemoryIndex BulkPostings API yet)




[jira] [Commented] (LUCENE-3269) Speed up Top-K sampling tests

2011-07-04 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059388#comment-13059388
 ] 

Shai Erera commented on LUCENE-3269:


Patch looks good. One other idea I think we should try is to create the large 
indexes once for all of the Top-K test extensions. There are several references 
to FacetTestBase.initIndex(), and I think that the TopK tests can create their 
indexes (which are the same) at @BeforeClass, perhaps all indexes for each of the 
partition sizes that are tested, and then proceed with testing. I think that 
will cut away a large portion of the running time.





[jira] [Updated] (LUCENE-3269) Speed up Top-K sampling tests

2011-07-04 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3269:


Attachment: LUCENE-3269.patch

Hi Shai, here is an updated patch that achieves the same thing; now the tests 
don't create redundant indexes.

 Speed up Top-K sampling tests
 -

 Key: LUCENE-3269
 URL: https://issues.apache.org/jira/browse/LUCENE-3269
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3269.patch, LUCENE-3269.patch


 speed up the top-k sampling tests (but make sure they are thorough on nightly 
 etc still)
 usually we would do this with use of atLeast(), but these tests are somewhat 
 tricky,
 so maybe a different approach is needed.




[jira] [Updated] (LUCENE-3269) Speed up Top-K sampling tests

2011-07-04 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3269:


Attachment: LUCENE-3269.patch

One more tweak; this one seems to help a lot. It allows subclasses to tweak the 
IWConfig (we use the same trick here that we use for NRQ tests to prevent 
really slow behavior for such large indexes).

 Speed up Top-K sampling tests
 -

 Key: LUCENE-3269
 URL: https://issues.apache.org/jira/browse/LUCENE-3269
 Project: Lucene - Java
  Issue Type: Test
  Components: modules/facet
Reporter: Robert Muir
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3269.patch, LUCENE-3269.patch, LUCENE-3269.patch


 speed up the top-k sampling tests (but make sure they are thorough on nightly 
 etc still)
 usually we would do this with use of atLeast(), but these tests are somewhat 
 tricky,
 so maybe a different approach is needed.




[jira] [Commented] (LUCENE-3268) TestScoredDocIDsUtils.testWithDeletions test failure

2011-07-04 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059404#comment-13059404
 ] 

Robert Muir commented on LUCENE-3268:
-

Hi Shai, I found another fail in this test:
ant test -Dtestcase=TestScoredDocIDsUtils -Dtestmethod=testWithDeletions 
-Dtests.seed=-203625378244176964:-5047330594665853233

 TestScoredDocIDsUtils.testWithDeletions test failure
 

 Key: LUCENE-3268
 URL: https://issues.apache.org/jira/browse/LUCENE-3268
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Robert Muir
Assignee: Shai Erera
 Fix For: 3.4, 4.0


 ant test -Dtestcase=TestScoredDocIDsUtils -Dtestmethod=testWithDeletions 
 -Dtests.seed=-2216133137948616963:2693740419732273624 -Dtests.multiplier=5
 In general, on both 3.x and trunk, if you run this test with -Dtests.iter=100 
 it tends to fail 2% of the time.




[jira] [Created] (LUCENE-3275) hang on 1.6.0u26

2011-07-04 Thread Robert Muir (JIRA)
hang on 1.6.0u26


 Key: LUCENE-3275
 URL: https://issues.apache.org/jira/browse/LUCENE-3275
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir


on the mac, a system update pushed out an upgrade to 1.6.0u26

basically, if i run 'ant test' from the faceting module, my jre completely 
hangs (0% cpu, won't even respond to kill -QUIT to print a stacktrace).
This is reproducible... it always happens inside SamplingAccumulatorTest.

Of course if i run this test by itself, or anything else, it doesn't want to 
hang... but you should be able to reproduce by running 'ant test 
-Dtests.threadspercpu=0' which runs all tests sequentially.

Acts like http://forums.oracle.com/forums/thread.jspa?threadID=2246699

I think this JRE version (update 26) is broken. If your mac asks you to 
upgrade, just say no.





[jira] [Commented] (LUCENE-3268) TestScoredDocIDsUtils.testWithDeletions test failure

2011-07-04 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059450#comment-13059450
 ] 

Shai Erera commented on LUCENE-3268:


Committed revision 1142675 (3x).
Committed revision 1142676 (trunk).




[jira] [Commented] (SOLR-1932) add relevancy function queries

2011-07-04 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059453#comment-13059453
 ] 

Yonik Seeley commented on SOLR-1932:


Hmm, yeah, I didn't even know about Terms.getSumTotalTermFreq!

 add relevancy function queries
 --

 Key: SOLR-1932
 URL: https://issues.apache.org/jira/browse/SOLR-1932
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-1932.patch, SOLR-1932_totaltermfreq.patch


 Add function queries for relevancy factors such as tf, idf, etc.




[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #171: POMs out of sync

2011-07-04 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/171/

No tests ran.

Build Log (for compile errors):
[...truncated 7447 lines...]






[jira] [Commented] (SOLR-2632) Highlighting does not work for embedded boost query that boosts a dismax query

2011-07-04 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059462#comment-13059462
 ] 

Koji Sekiguchi commented on SOLR-2632:
--

bq. What I'm not sure is about the conclusion. Is this a bug that should be 
corrected?

I'm not sure. If getHighlightQuery() is for providing a basic query so that 
Lucene's highlighter can understand what kind of query it is, it looks like a 
bug to me.

BTW, what do you think of the idea in SOLR-1926? If it can be used, does it 
solve your problem?


 Highlighting does not work for embedded boost query that boosts a dismax query
 --

 Key: SOLR-2632
 URL: https://issues.apache.org/jira/browse/SOLR-2632
 Project: Solr
  Issue Type: Bug
  Components: highlighter
Affects Versions: 1.4.1, 3.2, 3.3
 Environment: Linux.
 Reproduced in different machines with different Linux distributions and 
 different JDK's.
 Solr 3.3 and Lucidworks for solr 1.4.1 and 3.2.
Reporter: Juan Antonio Farré Basurte
Priority: Minor
  Labels: _query_, boost, dismax, edismax, embedded, highlighting, 
 hl.fl, query

 I need to issue a dismax query, with date boost (I'd like to use the 
 multiplicative boost provided by boost queries) and also filtering for other 
 fields with too many possible distinct values to fit in a filter query. To 
 achieve this, I use the boost query as a nested query using the pseudofield 
 _query_. I also need highlighting for the fields used in the dismax query, 
 but highlighting does not work. If I just use the boosted dismax query 
 without embedding it inside another query, it works correctly. If I use bf 
 instead of a boost query, and embed directly the dismax query, it works too, 
 but hl.fl needs to be specified.
 It's a bit complicated to explain, so, I'll give examples using the example 
 data that comes with solr (the problem is reproducible in the example solr 
 distribution, not only in my concrete project).
 http://localhost:8983/solr/select?q=%2binStock:true%20%2b_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name
 For this query, highlighting does not work. Specifying hl.fl or not does not 
 influence the result. The result is:
 <lst name="highlighting">
   <lst name="GB18030TEST"/>
   <lst name="UTF8TEST"/>
 </lst>
 http://localhost:8983/solr/select?q=_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name
 This doesn't work either. Same result.
 http://localhost:8983/solr/select?q={!boost b=$dateboost v=$qq defType=dismax}&qq=test&qf=name&dateboost=recip(ms(NOW,last_modified),3.16e-11,1,1)&hl=true
 In this case, highlighting works correctly:
 <lst name="highlighting">
   <lst name="GB18030TEST">
     <arr name="name">
       <str><em>Test</em> with some GB18030 encoded characters</str>
     </arr>
   </lst>
   <lst name="UTF8TEST">
     <arr name="name">
       <str><em>Test</em> with some UTF-8 encoded characters</str>
     </arr>
   </lst>
 </lst>
 http://localhost:8983/solr/select?q=%2BinStock:true%20%2B_query_:%22{!dismax%20v=$qq}%22&qq=test&qf=name&bf=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name
 This also works. Same result as before. But in this case hl.fl is needed. 
 Without it, highlighting does not work, either.
 Thanks.
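
 For anyone reproducing this, the first request above can be assembled programmatically instead of hand-encoding it. The following is a rough sketch using only the JDK; `SolrUrlSketch` and its methods are hypothetical names, and the parameter values are copied from the report. Note that `URLEncoder` applies form encoding (a space becomes `+`), so the output is not byte-identical to the hand-written URL even though it decodes to the same parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class SolrUrlSketch {
    /** Form-encodes one key or value as UTF-8. */
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    /** Joins the parameters into a query string, in insertion order. */
    static String build(String base, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(base).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            first = false;
        }
        return sb.toString();
    }

    /** The nested boost query from the report, with highlighting enabled. */
    static String nestedBoostUrl() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "+inStock:true +_query_:\"{!boost b=$dateboost v=$qq defType=dismax}\"");
        p.put("qq", "test");
        p.put("qf", "name");
        p.put("dateboost", "recip(ms(NOW,last_modified),3.16e-11,1,1)");
        p.put("hl", "true");
        p.put("hl.fl", "name");
        return build("http://localhost:8983/solr/select", p);
    }
}
```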




[jira] [Commented] (SOLR-2632) Highlighting does not work for embedded boost query that boosts a dismax query

2011-07-04 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059471#comment-13059471
 ] 

Juan Antonio Farré Basurte commented on SOLR-2632:
--

Interesting idea.
For my concrete problem, it would probably provide a workaround, yes.
The comment by Hoss Man also sounds quite reasonable. I can't think of a 
situation where having hl.q provides a clear advantage over the hl.text 
suggested by Hoss Man, though maybe I just haven't come up with the use case.




[jira] [Updated] (SOLR-1932) add relevancy function queries

2011-07-04 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1932:
---

Attachment: SOLR-1932_sumtotaltermfreq.patch

Here's an update that includes sumtotaltermfreq and aliases
totaltermfreq to ttf and sumtotaltermfreq to sttf.
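
As a quick illustration of what the two statistics mean (this is not the patch's code): totaltermfreq is the number of occurrences of one term across all documents, and sumtotaltermfreq is that count summed over every term in the field. The class name and the in-memory map of per-document frequencies below are hypothetical stand-ins for a real index.

```java
import java.util.Map;
import java.util.TreeMap;

class TermStatsSketch {
    /** totaltermfreq (ttf): occurrences of one term across all documents. */
    static long totalTermFreq(int[] perDocFreqs) {
        long sum = 0;
        for (int f : perDocFreqs) {
            sum += f;
        }
        return sum;
    }

    /** sumtotaltermfreq (sttf): totalTermFreq summed over every term of the field. */
    static long sumTotalTermFreq(Map<String, int[]> field) {
        long sum = 0;
        for (int[] freqs : field.values()) {
            sum += totalTermFreq(freqs);
        }
        return sum;
    }

    /** Toy field with three documents: term -> frequency in doc 0, 1, 2. */
    static Map<String, int[]> sampleField() {
        Map<String, int[]> field = new TreeMap<>();
        field.put("solr", new int[] {2, 0, 1});
        field.put("lucene", new int[] {1, 1, 0});
        return field;
    }
}
```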


 add relevancy function queries
 --

 Key: SOLR-1932
 URL: https://issues.apache.org/jira/browse/SOLR-1932
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-1932.patch, SOLR-1932_sumtotaltermfreq.patch, 
 SOLR-1932_totaltermfreq.patch


 Add function queries for relevancy factors such as tf, idf, etc.




[jira] [Updated] (LUCENE-3220) Implement various ranking models as Similarities

2011-07-04 Thread David Mark Nemeskey (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mark Nemeskey updated LUCENE-3220:


Attachment: LUCENE-3220.patch

Fixed a few things in MockBM25Similarity.

 Implement various ranking models as Similarities
 

 Key: LUCENE-3220
 URL: https://issues.apache.org/jira/browse/LUCENE-3220
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: core/search
Affects Versions: flexscoring branch
Reporter: David Mark Nemeskey
Assignee: David Mark Nemeskey
  Labels: gsoc
 Attachments: LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, 
 LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, 
 LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 With [LUCENE-3174|https://issues.apache.org/jira/browse/LUCENE-3174] done, we 
 can finally work on implementing the standard ranking models. Currently DFR, 
 BM25 and LM are on the menu.
 TODO:
  * {{EasyStats}}: contains all statistics that might be relevant for a 
 ranking algorithm
  * {{EasySimilarity}}: the ancestor of all the other similarities. Hides the 
 DocScorers and as much implementation detail as possible
  * _BM25_: the current mock implementation might be OK
  * _LM_
  * _DFR_
 Done:
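
 For readers unfamiliar with the models on the menu, here is a sketch of BM25 in one common textbook formulation. It is not taken from the attached patch or from MockBM25Similarity; the class name, the parameter names and the particular idf variant are assumptions made for illustration.

```java
class Bm25Sketch {
    /** A common BM25 idf variant: ln(1 + (N - n + 0.5) / (n + 0.5)). */
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /**
     * Score of one term in one document.
     * k1 controls term-frequency saturation; b controls length normalization.
     */
    static double score(double tf, long docFreq, long docCount,
                        double docLen, double avgDocLen, double k1, double b) {
        double norm = k1 * (1.0 - b + b * docLen / avgDocLen);
        return idf(docCount, docFreq) * tf * (k1 + 1.0) / (tf + norm);
    }
}
```

 With these definitions, a document of exactly average length that contains the term once scores exactly `idf`, which makes a convenient sanity check.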




EmbeddedSolrServer

2011-07-04 Thread Clecio Varjao
Hi,

Shouldn't the org.apache.solr.client.solrj.embedded.EmbeddedSolrServer
class be located under
https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/solrj
instead of
https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/webapp/src
?

Thanks,

Clécio




[jira] [Resolved] (SOLR-2631) PingRequestHandler can infinite loop if called with a qt that points to itsself

2011-07-04 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-2631.


Resolution: Fixed

Uwe, sorry for my brevity -- my point was that you had fixed the infinite loop 
by adding a sanity check that will throw an error, but the example & test 
configs should also be improved to demonstrate better practices when using the 
PingRequestHandler so people using them can never encounter the sanity checking 
you added.

Committed revision 1142722. - trunk
Committed revision 1142730. - trunk stupid mistake
Committed revision 1142731. - 3x



 PingRequestHandler can infinite loop if called with a qt that points to 
 itsself
 ---

 Key: SOLR-2631
 URL: https://issues.apache.org/jira/browse/SOLR-2631
 Project: Solr
  Issue Type: Bug
  Components: search, web gui
Affects Versions: 1.4, 3.1, 3.2, 3.3
Reporter: Uwe Schindler
Assignee: Uwe Schindler
  Labels: security
 Fix For: 3.4, 4.0

 Attachments: SOLR-2631.patch


 We got a security report to priv...@lucene.apache.org, that Solr can infinite 
 loop, use 100% CPU and stack overflow, if you execute the following HTTP 
 request: 
 - http://localhost:8983/solr/select?qt=/admin/ping
 - http://localhost:8983/solr/admin/ping?qt=/admin/ping
 The qt parameter instructs PingRequestHandler to call the given request 
 handler. This leads to an infinite loop. This is not a security issue, but 
 for an unprotected Solr server with unprotected /solr/select path this makes 
 it stop working.
 The fix is to prevent the infinite loop by disallowing the handler from calling itself.
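
 The kind of guard described here can be sketched as follows. This is a simplified illustration, not the code from SOLR-2631.patch: the `Handler` interface, the registry and the exception message are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class PingGuardSketch {
    /** Hypothetical minimal request handler. */
    interface Handler {
        String handle(Map<String, String> params);
    }

    /** Hypothetical registry mapping handler paths to handlers. */
    static final Map<String, Handler> HANDLERS = new HashMap<>();

    /** Delegates to the handler named by qt, but refuses to delegate to a ping handler. */
    static class PingHandler implements Handler {
        @Override
        public String handle(Map<String, String> params) {
            Handler target = HANDLERS.get(params.get("qt"));
            if (target == null) {
                throw new IllegalArgumentException("unknown handler: " + params.get("qt"));
            }
            if (target instanceof PingHandler) {
                // Without this check the call below would recurse until the stack overflows.
                throw new IllegalArgumentException("ping handler must not call itself");
            }
            return target.handle(params);
        }
    }

    static {
        HANDLERS.put("/admin/ping", new PingHandler());
        HANDLERS.put("/select", params -> "OK");
    }
}
```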




failonjavadocwarning to false for ant generate-maven-artifacts

2011-07-04 Thread Eric Charles

Hi,
In current trunk, I had to set failonjavadocwarning to false to 
successfully generate the pom (via ant generate-maven-artifacts).


(invoking ant javadoc in lucene folder also fails).

I was simply looking for the pom.xml generation, but much more was done.

I'm not worried about that (just wanted to share it).
Thx.
--
Eric




Solr - MOD function

2011-07-04 Thread Radek Majer
Hi,

I was looking for a MOD function in Solr, but I couldn't find it. Is there a
solution that isn't built into Solr, or can you implement this function (and
if so, when)?

It's a very important function for our project. For example, we need to search
by five-year period, decade, etc.

Regards
Radek Majer


Re: revisit naming for grouping/join?

2011-07-04 Thread Chris Hostetter

: In my example the city was parent -- I raised this example to explain
: that index-time joining is more general than just nested docs (ie, I
: think we should keep the name join for this module... also because
: we should factor out more general search-time-only join capabilities
: into it).

i think that may be the wrong approach to take when discussing examples. 
while it's great to say there are dozens of use cases that these features 
can all support in dozens of different ways, we should really focus on 
naming/demoing these use cases in the ways where they really make the most 
sense.

In other words, i don't think we should say "All of these types of problems 
are different types of nails, and all of these modules are specialty 
hammers that are slightly distinct from each other in how they work, but 
you can use any of these hammers on any of these nails" -- instead we should 
say "here are some specialty hammers, you can use them for lots of 
types of nails, but for each hammer here is the type of nail where it 
really shines"


block-index-join as i understand it requires all the docs you want to 
join up to be in one contiguous range of docids in the index, so if you want to 
re-index one doc in a block you have to re-index the entire block -- so 
the city/doctor example doesn't sound like a good generic example of 
when/why to use this (because a doctor might change his office 
hours, or address -- maybe even leaving the city completely, while a 
city might change its population w/o the doctor being affected at all).
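
To make the contiguity constraint concrete, here is a small sketch of how a block layout lets you find a child's parent and the extent of a block. It assumes the parent doc is stored last in its block and that parents are tracked in a bit set; both are assumptions for illustration, and `BlockJoinSketch` is a hypothetical name, not a class in the module.

```java
import java.util.BitSet;

class BlockJoinSketch {
    /** Parent of a child doc: the first parent at or after the child's docid. */
    static int parentOf(BitSet parents, int childDoc) {
        return parents.nextSetBit(childDoc);
    }

    /** First docid of the block whose last doc is parentDoc. */
    static int blockStart(BitSet parents, int parentDoc) {
        // previousSetBit returns -1 when there is no earlier parent.
        return parents.previousSetBit(parentDoc - 1) + 1;
    }
}
```

With two books indexed as pages 0-2 plus parent 3, and pages 4-5 plus parent 6, changing page 5 means rewriting docs 4 through 6 as a unit, which is why data that changes in lock step fits this layout best.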

The "book and pages" example seems much more appropriate, since in the 
real world these things change in lock step -- pages aren't added/removed to 
a book; pages don't change w/o the book itself being fundamentally 
changed.  the fields of a page document are the text of that page, and 
that is inherently data about the book -- the fields of a doctor 
document are metadata about the doctor, and that is not inherently data 
about the city the doctor lives in.

as for the name ... i understand why it's called module/join and i 
understand why the classes are called BlockJoinQuery and 
BlockJoinCollector but i don't think those names really stand out and 
convey to end users what they do and how/why they are useful.

Personally i think better names would be modules/subdocuments, 
ParentDocumentQuery and ChildDocumentsCollector

I know McCandless isn't a fan of the name "Nested Documents" because this 
functionality *can* be used for use cases where the data being modeled is 
not strictly organized in a nested relationship, but that doesn't mean 
it's *optimal* or easy for a user to apply to other use cases, because they 
have to design their model (and their indexing strategy) in such a way 
that they think of them as nested or hierarchical documents.  

Naming it module/subdocuments would not only emphasize the use case where 
it really shines, it would more importantly draw attention to how users 
have to model their data in order to take advantage of it -- and using 
ParentDocument and ChildDocuments in the names of the Query/Collector 
would make it clear what they match on relative to the underlying query 
that they wrap/collect

it would also help distinguish from more general joins like what solr 
does today -- it seems like that should eventually take the name 
module/join

At a minimum we should rename what we have now modules/block-join or 
modules/index-join (but the latter is confusing) and eventually add 
modules/query-join  (yes, yes, block joins provide a query, but the 
difference is when you have to make a decision about how you want to 
join your model: at index time or at query time).


-Hoss




[jira] [Updated] (LUCENE-3233) HuperDuperSynonymsFilter™

2011-07-04 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3233:
---

Attachment: LUCENE-3233.patch

New patch w/ current state.

I think it's closer; the test has more cases now (but I'd still like to make a 
random test), fewer nocommits, etc.

 HuperDuperSynonymsFilter™
 -

 Key: LUCENE-3233
 URL: https://issues.apache.org/jira/browse/LUCENE-3233
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Robert Muir
 Attachments: LUCENE-3223.patch, LUCENE-3233.patch, LUCENE-3233.patch


 The current synonymsfilter uses a lot of ram and cpu, especially at build 
 time.
 I think yesterday I heard about huge synonyms files three times.
 So, I think we should use an FST-based structure, sharing the inputs and 
 outputs.
 And we should be more efficient with the tokenStream api, e.g. using 
 save/restoreState instead of cloneAttributes()




[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-07-04 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059527#comment-13059527
 ] 

Michael McCandless commented on LUCENE-2793:


I think BufferedIndexInput doesn't need a set/getMergeBufferSize?  Ie, BII only 
knows its bufferSize, regardless of the context from its parent.

Otherwise I think your patch is good: today on trunk we hardwire the 4 KB 
buffer size for merges, which is the same thing your patch is doing; the only 
difference is the constant MERGE_BUFFER_SIZE has moved from IW to BII, and each 
Dir impl now has the if.  As a future improvement we can add a 
set/getMergeBufferSize to each Dir impl...
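
A minimal sketch of the per-Directory "if" mentioned above, assuming a two-value context enum. `IoContextSketch`, `Context` and the 1 KB default are illustrative assumptions; only the 4 KB merge size comes from the comment above, and none of this is the patch's actual code.

```java
class IoContextSketch {
    /** Hypothetical reduced view of what an IOContext would distinguish. */
    enum Context { MERGE, READ }

    static final int BUFFER_SIZE = 1024;       // assumed default for searching
    static final int MERGE_BUFFER_SIZE = 4096; // the hardwired 4 KB used for merges

    /** The check each Directory implementation would perform when opening an input. */
    static int bufferSize(Context context) {
        if (context == Context.MERGE) {
            return MERGE_BUFFER_SIZE;
        }
        return BUFFER_SIZE;
    }
}
```

The buffered input itself only ever sees the resulting size, which is the point made above about not needing a set/getMergeBufferSize on BufferedIndexInput.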

 Directory createOutput and openInput should take an IOContext
 -

 Key: LUCENE-2793
 URL: https://issues.apache.org/jira/browse/LUCENE-2793
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/store
Reporter: Michael McCandless
Assignee: Varun Thacker
  Labels: gsoc2011, lucene-gsoc-11, mentor
 Attachments: LUCENE-2793-nrt.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
 LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch


 Today for merging we pass down a larger readBufferSize than for searching 
 because we get better performance.
 I think we should generalize this to a class (IOContext), which would hold 
 the buffer size, but then could hold other flags like DIRECT (bypass OS's 
 buffer cache), SEQUENTIAL, etc.
 Then, we can make the DirectIOLinuxDirectory fully usable because we would 
 only use DIRECT/SEQUENTIAL during merging.
 This will require fixing how IW pools readers, so that a reader opened for 
 merging is not then used for searching, and vice/versa.  Really, it's only 
 all the open file handles that need to be different -- we could in theory 
 share del docs, norms, etc, if that were somehow possible.




[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser

2011-07-04 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059531#comment-13059531
 ] 

Hoss Man commented on LUCENE-3273:
--

I'm in favor of eliminating the QueryParser dependency, but i feel like this 
approach of adding things like BooleanQueryBuilder leads us down the road 
towards tests that are so verbose in query construction it will draw attention 
away from the important parts of the test -- doing something with those queries.

a while back when i wrote TestExplanations, i added a bunch of convenience 
methods for constructing esoteric queries that i couldn't get cleanly from the 
QueryParser (mainly spans) -- perhaps we should move towards generalizing that 
approach ... either in a Utility class where they can be statically imported, or 
into LuceneTestCase?  These days we could even use varargs for things like 
Phrase, Boolean, and SpanNear queries (we weren't using Java5 when i wrote the 
existing ones)

That way instead of things like this...

{code}
PhraseQuery q = new PhraseQuery(); // Query "this hi this is a test is"
q.add(new Term("field", "hi"), 1);
q.add(new Term("field", "test"), 5);

assertEquals("field:\"? hi ? ? ? test\"", q.toString());
{code}

...we could have ...

{code}
Query q = phraseQ("field", null, "hi", null, null, null, "test");

assertEquals("field:\"? hi ? ? ? test\"", q.toString());
{code}

And instead of this...

{code}
public void testDMQ8() throws Exception {
  DisjunctionMaxQuery q = new DisjunctionMaxQuery(0.5f);
  q.add(new BooleanQueryBuilder(FIELD)
      .addTermQuery("yy")
      .addQuery(QueryBuilderHelper.newTermQuery(FIELD, "w5", 100))
      .get());
  q.add(QueryBuilderHelper.newTermQuery(FIELD, "xx", 10));
  qtest(q, new int[] { 0,2,3 });
}
{code}

...we could have...

{code}
public void testDMQ8() throws Exception {
  DisjunctionMaxQuery q = new DisjunctionMaxQuery(0.5f);
  q.add(booleanQ(opt(termQ(FIELD, "yy")),
                 opt(termQ(FIELD, "w5", 100))));
  q.add(termQ(FIELD, "xx", 10));
  qtest(q, new int[] { 0,2,3 });
}
{code}
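
To show how a varargs helper like the proposed phraseQ could map its arguments to positions, here is a sketch that avoids any Lucene dependency. `PhraseHelperSketch` and `PositionedTerm` are hypothetical stand-ins; a real helper would call PhraseQuery.add(Term, int) for each non-null entry instead of collecting pairs.

{code}
import java.util.ArrayList;
import java.util.List;

class PhraseHelperSketch {
    /** Hypothetical stand-in for a (Term, position) pair added to a PhraseQuery. */
    static final class PositionedTerm {
        final String field;
        final String text;
        final int position;

        PositionedTerm(String field, String text, int position) {
            this.field = field;
            this.text = text;
            this.position = position;
        }
    }

    /** Each non-null argument becomes a term at its argument index; null leaves a gap. */
    static List<PositionedTerm> phraseQ(String field, String... terms) {
        List<PositionedTerm> out = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            if (terms[i] != null) {
                out.add(new PositionedTerm(field, terms[i], i));
            }
        }
        return out;
    }
}
{code}

Called as phraseQ("field", null, "hi", null, null, null, "test"), this places "hi" at position 1 and "test" at position 5, matching the explicit add() calls in the first example.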



 Convert Lucene Core tests over to a simple MockQueryParser
 --

 Key: LUCENE-3273
 URL: https://issues.apache.org/jira/browse/LUCENE-3273
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: core/other
Reporter: Chris Male
 Attachments: LUCENE-3273.patch


 Most tests use Lucene Core's QueryParser for convenience.  We want to 
 consolidate it into a QP module which we can't have as a dependency.  We 
 should add a simple MockQueryParser which does String.split() on the query 
 string, analyzes the terms and builds a BooleanQuery if necessary.  Any more 
 complex Queries (such as phrases) should be done programmatically. 
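
 A rough sketch of the described behaviour, with no Lucene dependency: lowercasing stands in for analysis, and the result is rendered as a string where the real parser would return a TermQuery or BooleanQuery. `MockQueryParserSketch` and its methods are hypothetical and are not the class in the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class MockQueryParserSketch {
    /** Splits on whitespace and lowercases each token (a stand-in for real analysis). */
    static List<String> terms(String query) {
        List<String> out = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    /** One term yields a single clause; several terms yield one optional clause each. */
    static String parse(String field, String query) {
        StringBuilder sb = new StringBuilder();
        for (String term : terms(query)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(field).append(':').append(term);
        }
        return sb.toString();
    }
}
```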




[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser

2011-07-04 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059532#comment-13059532
 ] 

Robert Muir commented on LUCENE-3273:
-

With all due respect hoss, i'd rather have the former than the latter.

The latter reminds me of solr tests which use this approach, I find them 
extremely painful to read.




RE: failonjavadocwarning to false for ant generate-maven-artifacts

2011-07-04 Thread Steven A Rowe
Hi Eric,

'ant get-maven-poms' will generate the pom.xml files for you.

'ant generate-maven-artifacts' has to generate the javadoc for each module, and 
javadoc generation fails on warnings.  When the javadoc tool fails to download 
the package list from Oracle, which seems to happen often, the resulting 
warning fails the build.

Steve

-Original Message-
From: Eric Charles [mailto:eric.char...@u-mangate.com] 
Sent: Monday, July 04, 2011 5:07 AM
To: dev@lucene.apache.org
Subject: failonjavadocwarning to false for ant generate-maven-artifacts

Hi,
In current trunk, I had to set failonjavadocwarning to false to successfully 
generate the pom (via ant generate-maven-artifacts).

(invoking ant javadoc in lucene folder also fails).

I was simply looking for the pom.xml generation, but much more was done.

I'm not worried about that (just wanted to share it).
Thx.
--
Eric




[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser

2011-07-04 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059533#comment-13059533
 ] 

Hoss Man commented on LUCENE-3273:
--

to each his own i guess.

I just think it makes sense that utilities that do the banal stuff that's not 
central to the actual methods being tested should be as short as possible and 
get the hell out of the way -- the code you actually want to test should be 
verbose and catch your eye.






[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser

2011-07-04 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059537#comment-13059537
 ] 

Robert Muir commented on LUCENE-3273:
-

the difference here is that I think in general the tests should use/look like 
the API.

this makes them readable for people (e.g. new contributors) who already know 
lucene's API, so they can understand what the tests do.

For example in the lucene tests we added various randomization, but we tried to 
make it look just like the API, except deleting a space:

{noformat}
new IndexWriterConfig() -> newIndexWriterConfig()
new Directory() -> newDirectory()
new Field() -> newField()
...
{noformat}

in some of these tests, I think it's actually *way more clear* to explicitly 
build the BQs and not use any builders or parsers, especially TestBoolean2 for 
example.

I fear sometimes, people get caught up on more lines of code == bad. I think 
this is wrong, sometimes more lines of code is good.

parsers, builder apis, and helper methods might reduce the number of lines of 
code, but they add additional layers and obfuscation that makes this a terrible 
tradeoff.





[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser

2011-07-04 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059538#comment-13059538
 ] 

Michael McCandless commented on LUCENE-3273:


I would also prefer to keep tests very straightforward, even if that makes them 
more verbose.  Ie just use the Lucene core API, and if the core API is 
insufficient we should improve it.  I don't think we should be adding very many 
special test-only APIs.

In fact, why even add a builder here for BQ?  Can't we just make the BQ and add 
the clauses?

In general I'm not a fan of builder APIs... I think they are over-applied these 
days (hammer!) and I don't think we need it here for our tests.




[jira] [Commented] (LUCENE-3167) Make lucene/solr a OSGI bundle through Ant

2011-07-04 Thread Luca Stancapiano (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059540#comment-13059540
 ] 

Luca Stancapiano commented on LUCENE-3167:
--

Here is an updated version using the correct classpath:

  <property name="bndclasspath" refid="classpath"/>
  <taskdef resource="aQute/bnd/ant/taskdef.properties" />
  <bnd
      classpath="${bndclasspath}"
      eclipse="false"
      failok="false"
      exceptions="true"
      files="${common.dir}/lucene.bnd" />

The Ant classpath differs from the Maven classpath, so the resulting 
'Export-Package' value in MANIFEST.MF differs as well, but both are OK.


 Make lucene/solr a OSGI bundle through Ant
 --

 Key: LUCENE-3167
 URL: https://issues.apache.org/jira/browse/LUCENE-3167
 Project: Lucene - Java
  Issue Type: New Feature
 Environment: bndtools
Reporter: Luca Stancapiano

 We need to make a bundle through Ant, so the binary can be published and 
 downloading the sources is no longer needed. Currently, to get an OSGi bundle 
 we need to use Maven tools and build the sources. Here is the reference for 
 the creation of the OSGi bundle through Maven:
 https://issues.apache.org/jira/browse/LUCENE-1344
 Bndtools could be used inside Ant.




[jira] [Updated] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser

2011-07-04 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3273:


Attachment: LUCENE-3273_testboolean2.patch

here's my example, TestBoolean2.

in my opinion building the queries like this makes the test much more readable.

it adds 48 lines and deletes 29 lines of code... 

I think adding these 19 lines of code to this 343 line test case is worth every 
penny, because it's much easier to see what any given test does, e.g. just 
glance real quick at testQueries06 and you see it's a BQ with one MUST and two 
MUST_NOTs, no parsing by the brain required.


 Convert Lucene Core tests over to a simple MockQueryParser
 --

 Key: LUCENE-3273
 URL: https://issues.apache.org/jira/browse/LUCENE-3273
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: core/other
Reporter: Chris Male
 Attachments: LUCENE-3273.patch, LUCENE-3273_testboolean2.patch


 Most tests use Lucene Core's QueryParser for convenience.  We want to 
 consolidate it into a QP module which we can't have as a dependency.  We 
 should add a simple MockQueryParser which does String.split() on the query 
 string, analyzes the terms and builds a BooleanQuery if necessary.  Any more 
 complex Queries (such as phrases) should be done programmatically. 




Re: revisit naming for grouping/join?

2011-07-04 Thread Michael McCandless
OK I'm sold!

I agree: let's rename this new module according to the most likely use
case, not according to its logical function, and I agree nested
documents is the compelling use case here.  Then fully generic joins
can go to a new module/join.

Maybe modules/nesteddocuments (I think that's more descriptive than
subdocuments)?

How about NestedDocumentQuery?  And NestedDocumentCollector?

See, you can use NestedDocumentQuery but collect it with any ordinary
collector if you don't care about the nesting (ie, you are only
interested in matches in the parent document space).  The
NestedDocumentCollector also collects all the nested docs matching
each parent hit.

You can of course still use this Query/Collector for any kind of
join, as long as your app is able to do this join at indexing time
and index all joined docs to a single row of the primary table as a
doc block.  But this will presumably be a less common use case so
I agree we should just name this feature according to its common use
case.

Mike McCandless

http://blog.mikemccandless.com

On Mon, Jul 4, 2011 at 1:34 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : In my example the city was parent -- I raised this example to explain
 : that index-time joining is more general than just nested docs (ie, I
 : think we should keep the name join for this module... also because
 : we should factor out more general search-time-only join capabilities
 : into it).

 i think that may be the wrong approach to take when discussing examples,
 while it's great to say there are dozens of usecases that these features
 can all support in dozens of diff ways we should really focus on
 naming/demoing these use cases in the ways where they really make the most
 sense.

 In other words, i don't think we should say "All of these types of problems
 are different types of nails, and all of these modules are specialty
 hammers that are slightly distinct from each other in how they work, but
 you can use any of these hammers on any of these nails" instead we should
 say "here are some specialty hammers, you can use them for lots of
 types of nails, but for each hammer here is the type of nail where it
 really shines"


 block-index-join as i understand it requires all the docs you want to
 join up to be in one contiguous range of docids in the index, so if you want
 to re-index one doc in a block you have to re-index the entire block -- so
 the city/doctor example doesn't sound like a good generic example of
 when/why to use this (because a doctor might change his office
 hours, or address -- maybe even leaving the city completely, while a
 city might change its population w/o the doctor being affected at all).

 The book and pages example seems much more appropriate, since in the
 real world these things change in lock step -- pages aren't added/removed to
 a book; pages don't change w/o the book itself being fundamentally
 changed.  the fields of a page document are the text of that page, and
 that is inherently data about the book -- the fields of a doctor
 document are metadata about the doctor, and that is not inherently data
 about the city the doctor lives in.

 as for the name ... i understand why it's called module/join and i
 understand why the classes are called BlockJoinQuery and
 BlockJoinCollector but i don't think those names really stand out and
 convey to end users what they do and how/why they are useful.

 Personally i think better names would be modules/subdocuments,
 ParentDocumentQuery and ChildDocumentsCollector

 I know mccandless isn't a fan of the name Nested Documents because this
 functionality *can* be used for use cases where the data being modeled is
 not strictly organized in a nested relationship, but that doesn't mean
 it's *optimal* or easy for a user to apply to other usecases, because they
 have to design their model (and their indexing strategy) in such a way
 that they think of them as nested or hierarchical documents.

 Naming it module/subdocuments would not only emphasize the usecase where
 it really shines, it would more importantly draw attention to how users
 have to model their data in order to take advantage of it -- and using
 ParentDocument and ChildDocuments in the names of the Query/Collector
 would make it clear what they match on relative to the underlying query
 that they wrap/collect

 it would also help distinguish from more general joins like what solr
 does today -- it seems like that should eventually take the name
 module/join

 At a minimum we should rename what we have now modules/block-join or
 modules/index-join (but the latter is confusing) and eventually add
 modules/query-join  (yes, yes, block joins provide a query, but the
 difference is when you have to make a decision about how you want to
 join your model, at index time or at query time).


 -Hoss


[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser

2011-07-04 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059576#comment-13059576
 ] 

Hoss Man commented on LUCENE-3273:
--

bq. In general I'm not a fan of builder APIs... I think they are over-applied 
these days (hammer!) 

bq. I think adding these 19 lines of code to this 343 line test case is worth 
every penny, because it's much easier to see what any given test does, e.g. just 
glance real quick at testQueries06 and you see it's a BQ with one MUST and two 
MUST_NOTs, no parsing by the brain required.

i don't disagree with either of you.  particularly in this test where the whole 
point is testing BooleanQueries -- so let's actually have the test showing the 
construction of a BooleanQuery.

my point was more about tests where the construction of the Query object is 
ancillary to what the test is actually for.

that said: definitely in agreement that using the core api and constructing the 
queries right in the test leaves no room for ambiguity -- my main point was 
that if we're going to have builders to simplify the tests, let's make them 
short and terse like the QP syntax that used to be in those tests.




[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser

2011-07-04 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059579#comment-13059579
 ] 

Robert Muir commented on LUCENE-3273:
-

after reviewing the core tests, I think there are really not that many tests 
using the queryparser at all.

in fact it seems the only 'non-trivial' queries being built are inside the 
explanations tests (e.g. more than just a term, boolean, or phrase or 
whatever), if these are too laborious to make manually, maybe we can just have 
whatever is needed in the base TestExplanations... but I think it would be good 
to build queries directly in most places in general.





Re: revisit naming for grouping/join?

2011-07-04 Thread Chris Hostetter

: Maybe modules/nesteddocuments (I think that's more descriptive than
: subdocuments)?

either way ... subdocuments has the advantage of being a shorter directory 
name.  

i kinda wonder about first impressions and the etymology of nested ... 
it makes me think of bird nests and russian dolls, neither of which 
really convey the point: nesting in birds is about protecting/incubating 
and is only a single layer; while russian nesting dolls are singular 
wrappers around wrappers around wrappers.

subdocuments seems like it might be better because it conveys more of a 
hierarchical nature (to me anyway).

: How about NestedDocumentQuery?  And NestedDocumentCollector?
: 
: See, you can use NestedDocumentQuery but collect it with any ordinary
: collector if you don't care about the nesting (ie, you are only
: interested in matches in the parent document space).  The
: NestedDocumentCollector also collects all the nested docs matching
: each parent hit.

Hmmm... 

My suggestion of ParentDocumentQuery was based on the understanding that 
the simplest usecase was...

  Query inner = getSomethingThatMatchesSomeChildDocs();
  Filter parents = someFilterThatMatchesAllKnownParentDocs();
  Query outer = new ParentDocumentQuery(inner, parents);
  TopDocs results = searcher.search(outer);

...and in this case results will contain the parents of the child 
documents that match inner.  is that correct?

if so, then independent of the Collector, ParentDocumentQuery (or 
ParentDocumentQueryWrapper) still seems like it makes the most sense.
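
To make the child-to-parent mapping in that usecase concrete, here is a rough 
sketch that runs without Lucene. It assumes each block is indexed as 
child, child, ..., parent (the parent is the last doc in its contiguous docid 
range) and that the parents filter is available as a bit set over docids. The 
class and method names are hypothetical and only illustrate the idea; this is 
not the module's actual code.

```java
import java.util.BitSet;
import java.util.LinkedHashSet;
import java.util.Set;

/** Sketch only: maps matching child docids to the parent docid of their block. */
final class ParentLookupSketch {

  /**
   * Assumes blocks are laid out as child, child, ..., parent, so the parent of
   * a child is the first set bit at or after the child's docid.
   */
  static Set<Integer> parentsOf(int[] matchingChildren, BitSet parentDocs) {
    Set<Integer> parents = new LinkedHashSet<>();
    for (int child : matchingChildren) {
      int parent = parentDocs.nextSetBit(child);
      if (parent >= 0) parents.add(parent); // -1 means no parent follows this doc
    }
    return parents;
  }

  public static void main(String[] args) {
    // Docids 0-2 are children of parent 3; docids 4-5 are children of parent 6.
    BitSet parentDocs = new BitSet();
    parentDocs.set(3);
    parentDocs.set(6);
    System.out.println(parentsOf(new int[] {1, 2, 5}, parentDocs)); // [3, 6]
  }
}
```

The lookup only works while a block stays contiguous, which is why re-indexing 
one doc in a block means re-indexing the whole block.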

For the Collector, i realize now that i totally misunderstood its api -- 
for some reason i thought it would wrap another Collector and proxy to the 
inner collector only the parents, independently collecting/recording the 
groups of parent-children info which could be asked for later.  

ChildDocumentsCollector definitely doesn't make sense -- it's not 
just collecting children, it's collecting Groups made up of parents 
and children ... GroupCollector is obviously too general though ... i 
would toss out ParentChildrenTopGroupCollector to make it clear that:
  a) what you can get out of it are instances of TopGroups
  b) the Groups consist of Parents and Children

...but that may be trying to convey too much in a classname.  

I certainly wouldn't complain about NestedDocumentCollector or 
SubDocumentCollector if people like those better.


-Hoss




[jira] [Commented] (SOLR-2565) Prevent IW#close and cut over to IW#commit

2011-07-04 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059587#comment-13059587
 ] 

Mark Miller commented on SOLR-2565:
---

I've still got to put a note in changes about how you should reload SolrCores 
after this change. 

 Prevent IW#close and cut over to IW#commit
 --

 Key: SOLR-2565
 URL: https://issues.apache.org/jira/browse/SOLR-2565
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Mark Miller
 Fix For: 4.0

 Attachments: SOLR-2565.patch


 Spinoff from SOLR-2193. We already have a branch to work on this issue here 
 https://svn.apache.org/repos/asf/lucene/dev/branches/solr2193 
 The main goal here is to prevent solr from closing the IW and use IW#commit 
 instead. AFAIK the main issues here are:
 The update handler needs an overhaul.
 A few goals I think we might want to look at:
 1. Expose the SolrIndexWriter in the api or add the proper abstractions to 
 get done what we now do with special casing:
 2. Stop closing the IndexWriter and start using commit (still lazy IW init 
 though).
 3. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level.
 4. Address the current issues we face because multiple original/'reloaded' 
 cores can have a different IndexWriter on the same index.
 Eventually this is a preparation for NRT support in Solr which I will create 
 a followup issue for.




[jira] [Updated] (LUCENE-2308) Separately specify a field's type

2011-07-04 Thread Nikola Tankovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikola Tankovic updated LUCENE-2308:


Attachment: LUCENE-2308-5.patch

Some tests are cut over, more to come...
This fifth patch is to monitor progress, and see if something is wrong, or 
could be better.

Cutover InstantiatedDocument along the way also

 Separately specify a field's type
 -

 Key: LUCENE-2308
 URL: https://issues.apache.org/jira/browse/LUCENE-2308
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
  Labels: gsoc2011, lucene-gsoc-11, mentor
 Fix For: 4.0

 Attachments: LUCENE-2308-2.patch, LUCENE-2308-3.patch, 
 LUCENE-2308-4.patch, LUCENE-2308-4.patch, LUCENE-2308-5.patch, 
 LUCENE-2308.patch, LUCENE-2308.patch


 This came up from discussions on IRC.  I'm summarizing here...
 Today when you make a Field to add to a document you can set things
 index or not, stored or not, analyzed or not, details like omitTfAP,
 omitNorms, index term vectors (separately controlling
 offsets/positions), etc.
 I think we should factor these out into a new class (FieldType?).
 Then you could re-use this FieldType instance across multiple fields.
 The Field instance would still hold the actual value.
 We could then do per-field analyzers by adding a setAnalyzer on the
 FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise
 for per-field codecs (with flex), where we now have
 PerFieldCodecWrapper).
 This would NOT be a schema!  It's just refactoring what we already
 specify today.  EG it's not serialized into the index.
 This has been discussed before, and I know Michael Busch opened a more
 ambitious (I think?) issue.  I think this is a good first baby step.  We could
 consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold
 off on that for starters...
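
 As a rough illustration of the proposed split, the sketch below keeps the
 index/store settings in one reusable object and leaves the value on the field.
 FieldType and Field here are hypothetical stand-ins written for this example;
 the names of the settings are assumptions, and nothing below is the API from
 the attached patches.

```java
/** Sketch only: hypothetical FieldType/Field stand-ins, not Lucene classes. */
final class FieldTypeSketch {

  /** Per-field settings, shareable across many Field instances. */
  static final class FieldType {
    final boolean indexed;
    final boolean stored;
    final boolean omitNorms;

    FieldType(boolean indexed, boolean stored, boolean omitNorms) {
      this.indexed = indexed;
      this.stored = stored;
      this.omitNorms = omitNorms;
    }
  }

  /** The field still holds the actual value; the type only describes handling. */
  static final class Field {
    final String name;
    final String value;
    final FieldType type;

    Field(String name, String value, FieldType type) {
      this.name = name;
      this.value = value;
      this.type = type;
    }
  }

  public static void main(String[] args) {
    FieldType idType = new FieldType(true, true, true);
    Field id = new Field("id", "42", idType);
    Field parentId = new Field("parentId", "7", idType);
    System.out.println(id.type == parentId.type); // true: one shared instance
  }
}
```

 Because the type object lives only in application code, it is not a schema
 and nothing about it is written into the index.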




[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9328 - Failure

2011-07-04 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9328/

All tests passed

Build Log (for compile errors):
[...truncated 17547 lines...]






[jira] [Commented] (LUCENE-2308) Separately specify a field's type

2011-07-04 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059628#comment-13059628
 ] 

Michael McCandless commented on LUCENE-2308:


Patch looks good Nikola -- I'll commit it to the branch!

I removed the 2 nocommits from oal.document2.Document -- I think they were 
leftover from copying from Document.





Re: EmbeddedSolrServer

2011-07-04 Thread Ryan McKinley
it is a bit weird, but we don't want solrj to depend on solr-core (it
is a client library, that should not need to know anything about
lucene/solr)

It might make sense to put EmbeddedSolrServer in its own source
tree/.jar but for the size/complexity, it seemed easiest to just put
it in the package that already had the right dependencies.

ryan


On Mon, Jul 4, 2011 at 12:30 PM, Clecio Varjao cleciovar...@gmail.com wrote:
 Hi,

 Shouldn't org.apache.solr.client.solrj.embedded.EmbeddedSolrServer
 class be located under
 https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/solrj
 instead of 
 https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/webapp/src
 ?

 Thanks,

 Clécio




[jira] [Updated] (LUCENE-3275) SamplingAccumulatorTest hangs on Java 1.6.0u26

2011-07-04 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-3275:
-

Summary: SamplingAccumulatorTest hangs on Java 1.6.0u26  (was: hang on 
1.6.0u26)

Robert: did you file a new bug with oracle?

If the hypothesis of the reporter for the bug you linked to is correct, then 
it's not likely to be the same bug -- in that case the debug info for the 
process suggested that it was hung getting translation info as part of a call 
to 
GraphicsEnvironment.getLocalGraphicsEnvironment().getAvailableFontFamilyNames() 
while running in headless mode.  none of which sound like they are likely to 
happen in one of our tests.

that report was also filed by someone working on proprietary code who couldn't 
post a reproducible test case.  If you can post jvm info, os info, and a lucene 
svn r# that reliably reproduces, oracle would have a much better bug report to 
work with.

 SamplingAccumulatorTest hangs on Java 1.6.0u26
 --

 Key: LUCENE-3275
 URL: https://issues.apache.org/jira/browse/LUCENE-3275
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 on the mac, a system update pushed out an upgrade to 1.6.0u26
 basically, if i run 'ant test' from the faceting module, my jre completely 
 hangs (0% cpu, won't even respond to kill -QUIT to print a stacktrace).
 This is reproducible... it always happens inside SamplingAccumulatorTest.
 Of course if i run this test by itself, or anything else, it doesn't want to 
 hang... but you should be able to reproduce by running 'ant test 
 -Dtests.threadspercpu=0' which runs all tests sequentially.
 Acts like http://forums.oracle.com/forums/thread.jspa?threadID=2246699
 I think this JRE version (update 26) is broken. If your mac asks you to 
 upgrade, just say no.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira






[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser

2011-07-04 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059642#comment-13059642
 ] 

Chris Male commented on LUCENE-3273:


In defence of builders, it's a great design pattern and I don't agree that it's 
over-applied.

With all that said, I'll move away from them.

 Convert Lucene Core tests over to a simple MockQueryParser
 --

 Key: LUCENE-3273
 URL: https://issues.apache.org/jira/browse/LUCENE-3273
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: core/other
Reporter: Chris Male
 Attachments: LUCENE-3273.patch, LUCENE-3273_testboolean2.patch


 Most tests use Lucene Core's QueryParser for convenience.  We want to 
 consolidate it into a QP module which we can't have as a dependency.  We 
 should add a simple MockQueryParser which does String.split() on the query 
 string, analyzes the terms and builds a BooleanQuery if necessary.  Any more 
 complex Queries (such as phrases) should be done programmatically. 
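
 A rough sketch of the split / analyze / build flow described above, under 
 loud assumptions: it does not use Lucene at all, "analysis" is reduced to 
 lowercasing, queries are rendered as strings, and treating the clauses as 
 optional (SHOULD) is a guess. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Stand-in sketch only: a real version would use Lucene's Analyzer, TermQuery
// and BooleanQuery. Queries are rendered as strings so the example runs
// without Lucene on the classpath.
class MockQueryParserSketch {
    private final String field;

    MockQueryParserSketch(String field) {
        this.field = field;
    }

    // Hypothetical stand-in for running a term through an Analyzer.
    private static String analyze(String raw) {
        return raw.toLowerCase(Locale.ROOT);
    }

    // Splits on whitespace. A single term yields a single term query; several
    // terms yield a boolean query whose clauses are assumed to be optional.
    String parse(String query) {
        List<String> clauses = new ArrayList<String>();
        for (String raw : query.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                clauses.add(field + ":" + analyze(raw));
            }
        }
        if (clauses.isEmpty()) {
            return "";
        }
        if (clauses.size() == 1) {
            return clauses.get(0);
        }
        return "(" + String.join(" ", clauses) + ")";
    }
}
```

 Phrases and anything else more complex are deliberately left out, in line 
 with the description.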




Re: [JENKINS] Solr-3.x - Build # 394 - Failure

2011-07-04 Thread Phillipe Ramalho
Hi,

I am having similar problems. When running ant javadocs on
contrib/queryparser, I get the following error:

javadoc: warning - Error fetching URL:
http://java.sun.com/j2se/1.6/docs/api/package-list


and the script fails. What should I do? Is there a way to fix it?

Thanks,
Phillipe Ramalho

On Wed, Jun 29, 2011 at 2:05 PM, Robert Muir rcm...@gmail.com wrote:

 On Wed, Jun 29, 2011 at 2:00 PM, Chris Hostetter
 hossman_luc...@fucit.org wrote:
 
  : Failure to fetch junit's package list yet again... but Hoss is working
  : on this I think!
 
  I posted a straw-man patch, but i haven't really had time to seriously
  test it on modules/contrib ... and i think rmuir had some reservations
  about putting the stuff in dev-tools ... but if someone is itching go
  ahead and commit. (i'm a little swamped right now)
 

 right, if the javadocs target in lucene/build.xml has a hard
 dependency on dev-tools,
 then the lucene source release won't work.

 but we could do some other things to fix this:
 * make this a soft dependency (e.g. the javadocs task will use
 dev-tools/plists when they are available, otherwise it downloads)
 * move dev-tools under lucene/ so we don't worry about this stuff
 * put the package-lists somewhere other than dev-tools (even if it's
 just on hudson)
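
 The first option (soft dependency) could look roughly like the sketch below. 
 It only builds javadoc command-line arguments: -linkoffline when a local 
 package-list file is present, -link otherwise. The class and method names are 
 hypothetical, and the real fix would more likely live in the Ant build files 
 than in Java.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the "soft dependency" idea: prefer a local
// copy of package-list, fall back to letting javadoc fetch it remotely.
class JavadocLinkArgs {
    static List<String> linkArgs(String apiUrl, File localPackageListDir) {
        List<String> result = new ArrayList<String>();
        File packageList = new File(localPackageListDir, "package-list");
        if (packageList.isFile()) {
            // -linkoffline <extdocURL> <packagelistLoc>: no network fetch needed.
            result.add("-linkoffline");
            result.add(apiUrl);
            result.add(localPackageListDir.getPath());
        } else {
            // -link <extdocURL>: javadoc downloads package-list from the URL.
            result.add("-link");
            result.add(apiUrl);
        }
        return result;
    }
}
```

 With this shape a source release without dev-tools would still build, it 
 would just go back to downloading.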





-- 
Phillipe Ramalho


Re: [JENKINS] Solr-3.x - Build # 394 - Failure

2011-07-04 Thread Chris Male
http://download.oracle.com/javase/6/docs/api/package-list apparently works
reliably.

On Tue, Jul 5, 2011 at 12:08 PM, Phillipe Ramalho 
phillipe.rama...@gmail.com wrote:

 Hi,

 I am having similar problems. When running ant javadocs on
 contrib/queryparser, I get the following error:

 javadoc: warning - Error fetching URL:
 http://java.sun.com/j2se/1.6/docs/api/package-list


 and the script fails. What should I do? Is there a way to fix it?

 Thanks,
 Phillipe Ramalho

 On Wed, Jun 29, 2011 at 2:05 PM, Robert Muir rcm...@gmail.com wrote:

 On Wed, Jun 29, 2011 at 2:00 PM, Chris Hostetter
 hossman_luc...@fucit.org wrote:
 
  : Failure to fetch junit's package list yet again... but Hoss is working
  : on this I think!
 
  I posted a straw-man patch, but i haven't really had time to seriously
  test it on modules/contrib ... and i think rmuir had some reservations
  about putting the stuff in dev-tools ... but if someone is itching go
  ahead and commit. (i'm a little swamped right now)
 

 right, if the javadocs target in lucene/build.xml has a hard
 dependency on dev-tools,
 then the lucene source release won't work.

 but we could do some other things to fix this:
 * make this a soft dependency (e.g. the javadocs task will use
 dev-tools/plists when they are available, otherwise it downloads)
 * move dev-tools under lucene/ so we don't worry about this stuff
 * put the package-lists somewhere other than dev-tools (even if it's
 just on hudson)





 --
 Phillipe Ramalho




-- 
Chris Male | Software Developer | JTeam BV.| www.jteam.nl


[jira] [Commented] (LUCENE-3275) SamplingAccumulatorTest hangs on Java 1.6.0u26

2011-07-04 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059645#comment-13059645
 ] 

Robert Muir commented on LUCENE-3275:
-

hoss, no I did not.

this is basically just a warning for other devs not
to upgrade to this broken jre on their macs.

otherwise your tests hang and you must kill -9

 SamplingAccumulatorTest hangs on Java 1.6.0u26
 --

 Key: LUCENE-3275
 URL: https://issues.apache.org/jira/browse/LUCENE-3275
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir

 on the mac, a system update pushed out an upgrade to 1.6.0u26
 basically, if i run 'ant test' from the faceting module, my jre completely 
 hangs (0% cpu, won't even respond to kill -QUIT to print a stacktrace).
 This is reproducible... it always happens inside SamplingAccumulatorTest.
 Of course if i run this test by itself, or anything else, it doesn't want to 
 hang... but you should be able to reproduce by running 'ant test 
 -Dtests.threadspercpu=0' which runs all tests sequentially.
 Acts like http://forums.oracle.com/forums/thread.jspa?threadID=2246699
 I think this JRE version (update 26) is broken. If your mac asks you to 
 upgrade, just say no.




[jira] [Updated] (LUCENE-2979) Simplify configuration API of contrib Query Parser

2011-07-04 Thread Phillipe Ramalho (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phillipe Ramalho updated LUCENE-2979:
-

Attachment: LUCENE-2979_phillipe_ramalho_3.patch

Hi Adriano,

Sorry for that, I forgot to add javadoc to those comments, they are pretty 
important.

Anyway, the problem was not really the missing javadoc, but javadoc was not 
understanding the link to inner classes (Class.InnerClass#Constant), I had to 
reference the inner class directly (InnerClass#Constant).
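
To illustrate the two link forms, here is a small sketch with made-up class and constant names (they are not the ones from the patch). Whether the qualified form fails may well depend on the javadoc version and on where the comment sits, so treat this as an illustration of the syntax only.

```java
// Hypothetical names, for illustration only.
class ConfigKeys {
    /** Holder for option constants. */
    static final class Operator {
        /** Default boolean operator name. */
        static final String DEFAULT = "OR";
    }

    /**
     * Returns the default operator.
     * Qualified form, reported above as not resolving:
     * {@link ConfigKeys.Operator#DEFAULT}.
     * Direct form, the one used in the patch:
     * {@link Operator#DEFAULT}.
     */
    static String defaultOperator() {
        return Operator.DEFAULT;
    }
}
```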

However, there is still a javadoc warning. I just sent an email to the 
mailing list reporting the problem; I hope you saw it already.

Here is the third patch with javadoc fixes.

 Simplify configuration API of contrib Query Parser
 --

 Key: LUCENE-2979
 URL: https://issues.apache.org/jira/browse/LUCENE-2979
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 2.9, 3.0
Reporter: Adriano Crestani
Assignee: Adriano Crestani
  Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor
 Fix For: 3.4, 4.0

 Attachments: LUCENE-2979_phillipe_ramalho_2.patch, 
 LUCENE-2979_phillipe_ramalho_3.patch, LUCENE-2979_phillipe_ramalho_3.patch, 
 LUCENE-2979_phillipe_reamalho.patch


 The current configuration API is very complicated and inherits the concept 
 used by Attribute API to store token information in token streams. However, 
 the requirements for both (QP config and token stream) are not the same, so 
 they shouldn't be using the same thing.
 I propose to simplify QP config and make it less scary for people intending 
 to use contrib QP. The task is not difficult, it will just require a lot of 
 code changes and figuring out the best way to do it. That's why it's a good 
 candidate for a GSoC project.
 I would like to hear good proposals about how to make the API more friendly 
 and less scary :)




[jira] [Updated] (LUCENE-2979) Simplify configuration API of contrib Query Parser

2011-07-04 Thread Phillipe Ramalho (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phillipe Ramalho updated LUCENE-2979:
-

Attachment: LUCENE-2979_phillipe_ramalho_3.patch

oops, I had forgotten to check the ASF license.

 Simplify configuration API of contrib Query Parser
 --

 Key: LUCENE-2979
 URL: https://issues.apache.org/jira/browse/LUCENE-2979
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 2.9, 3.0
Reporter: Adriano Crestani
Assignee: Adriano Crestani
  Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor
 Fix For: 3.4, 4.0

 Attachments: LUCENE-2979_phillipe_ramalho_2.patch, 
 LUCENE-2979_phillipe_ramalho_3.patch, LUCENE-2979_phillipe_ramalho_3.patch, 
 LUCENE-2979_phillipe_reamalho.patch


 The current configuration API is very complicated and inherits the concept 
 used by Attribute API to store token information in token streams. However, 
 the requirements for both (QP config and token stream) are not the same, so 
 they shouldn't be using the same thing.
 I propose to simplify QP config and make it less scary for people intending 
 to use contrib QP. The task is not difficult, it will just require a lot of 
 code changes and figuring out the best way to do it. That's why it's a good 
 candidate for a GSoC project.
 I would like to hear good proposals about how to make the API more friendly 
 and less scary :)




[jira] [Commented] (LUCENE-2979) Simplify configuration API of contrib Query Parser

2011-07-04 Thread Adriano Crestani (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059651#comment-13059651
 ] 

Adriano Crestani commented on LUCENE-2979:
--

Hi Phillipe, thanks for the quick fix!

Just committed your last patch (LUCENE-2979_phillipe_ramalho_3.patch) on 
revision 1142862

 Simplify configuration API of contrib Query Parser
 --

 Key: LUCENE-2979
 URL: https://issues.apache.org/jira/browse/LUCENE-2979
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 2.9, 3.0
Reporter: Adriano Crestani
Assignee: Adriano Crestani
  Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor
 Fix For: 3.4, 4.0

 Attachments: LUCENE-2979_phillipe_ramalho_2.patch, 
 LUCENE-2979_phillipe_ramalho_3.patch, LUCENE-2979_phillipe_ramalho_3.patch, 
 LUCENE-2979_phillipe_reamalho.patch


 The current configuration API is very complicated and inherits the concept 
 used by Attribute API to store token information in token streams. However, 
 the requirements for both (QP config and token stream) are not the same, so 
 they shouldn't be using the same thing.
 I propose to simplify QP config and make it less scary for people intending 
 to use contrib QP. The task is not difficult, it will just require a lot of 
 code changes and figuring out the best way to do it. That's why it's a good 
 candidate for a GSoC project.
 I would like to hear good proposals about how to make the API more friendly 
 and less scary :)




[jira] [Commented] (LUCENE-1768) NumericRange support for new query parser

2011-07-04 Thread Adriano Crestani (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059653#comment-13059653
 ] 

Adriano Crestani commented on LUCENE-1768:
--

Hi,

I committed the first patch from LUCENE-2979 and it completely changed the 
queryparser config API. 

It seems Vinicius is ahead of schedule with this project, is that correct? Is 
there anything else to do after documentation? If not, I would ask whether it's 
possible for you to change the way you use config API for numeric. I think it 
will not require a lot of change, the api is much simpler now. I can ask 
Phillipe to help you and explain how the new api works ;) Otherwise, Uwe will 
not be able to commit your patch, since there will be many classes missing now.

What do you think Uwe?

 NumericRange support for new query parser
 -

 Key: LUCENE-1768
 URL: https://issues.apache.org/jira/browse/LUCENE-1768
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/queryparser
Affects Versions: 2.9
Reporter: Uwe Schindler
Assignee: Adriano Crestani
  Labels: contrib, gsoc, gsoc2011, lucene-gsoc-11, mentor
 Fix For: 4.0

 Attachments: week1.patch, week2.patch, week3.patch, week4.patch, 
 week5-6.patch


 It would be good to specify some type of schema for the query parser in 
 future, to automatically create NumericRangeQuery for different numeric 
 types? It would then be possible to index a numeric value 
 (double,float,long,int) using NumericField and then the query parser knows 
 which type of field this is and so it correctly creates a NumericRangeQuery 
 for strings like [1.567..*] or (1.787..19.5].
 There is currently no way to extract if a field is numeric from the index, so 
 the user will have to configure the FieldConfig objects in the ConfigHandler. 
 But if this is done, it will not be that difficult to implement the rest.
 The only difference between the current handling of RangeQuery is then the 
 instantiation of the correct Query type and conversion of the entered numeric 
 values (simple Number.valueOf(...) cast of the user entered numbers). 
 Everything else is identical, NumericRangeQuery also supports the MTQ 
 rewrite modes (as it is a MTQ).
 Another thing is a change in Date semantics. There are some strange flags in 
 the current parser that tells it how to handle dates.
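
 As a sketch of the conversion step only, assuming the range syntax is exactly 
 what the examples above show (bracket, lower bound, "..", upper bound, 
 bracket, with * for an open end): the class below is hypothetical and uses no 
 Lucene types. It produces two bounds plus two inclusive flags, which I 
 believe is the shape NumericRangeQuery.newDoubleRange expects, with null 
 standing for an open bound.

```java
// Hypothetical stand-in: parses the range syntax quoted in the issue text into
// bounds plus inclusive flags, the inputs a numeric range query would need.
class NumericRangeSketch {
    final Double min;            // null means open-ended ("*")
    final Double max;            // null means open-ended ("*")
    final boolean minInclusive;  // true for '[', false for '('
    final boolean maxInclusive;  // true for ']', false for ')'

    private NumericRangeSketch(Double min, Double max,
                               boolean minInclusive, boolean maxInclusive) {
        this.min = min;
        this.max = max;
        this.minInclusive = minInclusive;
        this.maxInclusive = maxInclusive;
    }

    static NumericRangeSketch parse(String text) {
        String s = text.trim();
        if (s.length() < 2) {
            throw new IllegalArgumentException("Not a range: " + text);
        }
        char open = s.charAt(0);
        char close = s.charAt(s.length() - 1);
        if ((open != '[' && open != '(') || (close != ']' && close != ')')) {
            throw new IllegalArgumentException("Bad brackets: " + text);
        }
        String body = s.substring(1, s.length() - 1);
        // The first ".." separates the bounds; a single '.' stays inside a number.
        int sep = body.indexOf("..");
        if (sep < 0) {
            throw new IllegalArgumentException("Missing '..': " + text);
        }
        return new NumericRangeSketch(
            bound(body.substring(0, sep)),
            bound(body.substring(sep + 2)),
            open == '[',
            close == ']');
    }

    // "*" marks an open bound; anything else must parse as a double.
    private static Double bound(String raw) {
        String t = raw.trim();
        return "*".equals(t) ? null : Double.valueOf(t);
    }
}
```

 Only doubles are handled here; picking float, long or int per field is the 
 part that would need the FieldConfig information mentioned above.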




[jira] [Updated] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser

2011-07-04 Thread Chris Male (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Male updated LUCENE-3273:
---

Attachment: LUCENE-3273.patch

New patch, new ideas.

I've moved away from introducing anything new and have converted all the core 
tests over to instantiating Queries programmatically.  No builders / helpers 
are used.

Everything compiles and passes again.

 Convert Lucene Core tests over to a simple MockQueryParser
 --

 Key: LUCENE-3273
 URL: https://issues.apache.org/jira/browse/LUCENE-3273
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: core/other
Reporter: Chris Male
 Attachments: LUCENE-3273.patch, LUCENE-3273.patch, 
 LUCENE-3273_testboolean2.patch


 Most tests use Lucene Core's QueryParser for convenience.  We want to 
 consolidate it into a QP module which we can't have as a dependency.  We 
 should add a simple MockQueryParser which does String.split() on the query 
 string, analyzes the terms and builds a BooleanQuery if necessary.  Any more 
 complex Queries (such as phrases) should be done programmatically. 




[JENKINS] Lucene-trunk - Build # 1615 - Still Failing

2011-07-04 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-trunk/1615/

No tests ran.

Build Log (for compile errors):
[...truncated 9395 lines...]






[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9334 - Failure

2011-07-04 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9334/

All tests passed

Build Log (for compile errors):
[...truncated 17541 lines...]






[jira] [Commented] (LUCENE-3274) Collapse Common module into Lucene core util

2011-07-04 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059674#comment-13059674
 ] 

Chris Male commented on LUCENE-3274:


I'm going to commit this tomorrow.

 Collapse Common module into Lucene core util
 

 Key: LUCENE-3274
 URL: https://issues.apache.org/jira/browse/LUCENE-3274
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Chris Male
 Attachments: LUCENE-3274.patch


 It was suggested by Robert in [http://markmail.org/message/wbfuzfamtn2qdvii] 
 that we should try to limit the dependency graph between modules and where 
 there is something 'common' it should probably go into lucene core.  Given 
 that I haven't added anything to this module except the MutableValue classes, 
 I'm going to collapse them into the util package, remove the module, and 
 correct the dependencies.
