Build failed in Hudson: Solr-trunk #1253

2010-09-20 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Solr-trunk/1253/changes

Changes:

[ryan] running 'mvn generate-maven-artifacts' will put all the files in the 
same directory (dist/maven)

[yonik] SOLR-2123: group by query

[rmuir] LUCENE-2653: ThaiAnalyzer assumes things about your jre

[simonw] LUCENE-2588: Exposed indexed term prefix length to enable non-unicode 
sort order term indexes

[mikemccand] LUCENE-2647: refactor reusable components out of standard codec

--
[...truncated 6171 lines...]
[junit] Testsuite: org.apache.solr.handler.SpellCheckerRequestHandlerTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 1.235 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.StandardRequestHandlerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.716 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.TestCSVLoader
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 1.404 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.TestReplicationHandler
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 53.563 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.XmlUpdateRequestHandlerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.616 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.admin.LukeRequestHandlerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.998 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.admin.SystemInfoHandlerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.005 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.component.DebugComponentTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.798 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.component.DistributedSpellCheckComponentTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 10.587 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.component.DistributedTermsComponentTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 8.841 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.component.QueryElevationComponentTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.749 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.component.SearchHandlerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.589 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.component.SpellCheckComponentTest
[junit] Tests run: 10, Failures: 0, Errors: 0, Time elapsed: 0.967 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.component.StatsComponentTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.539 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.component.TermVectorComponentTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.701 sec
[junit] 
[junit] Testsuite: org.apache.solr.handler.component.TermsComponentTest
[junit] Tests run: 13, Failures: 0, Errors: 0, Time elapsed: 0.815 sec
[junit] 
[junit] Testsuite: org.apache.solr.highlight.FastVectorHighlighterTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.66 sec
[junit] 
[junit] Testsuite: org.apache.solr.highlight.HighlighterConfigTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.571 sec
[junit] 
[junit] Testsuite: org.apache.solr.highlight.HighlighterTest
[junit] Tests run: 23, Failures: 0, Errors: 0, Time elapsed: 1.819 sec
[junit] 
[junit] Testsuite: org.apache.solr.request.JSONWriterTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.633 sec
[junit] 
[junit] Testsuite: org.apache.solr.request.SimpleFacetsTest
[junit] Tests run: 22, Failures: 0, Errors: 0, Time elapsed: 6.174 sec
[junit] 
[junit] Testsuite: org.apache.solr.request.TestBinaryResponseWriter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.634 sec
[junit] 
[junit] Testsuite: org.apache.solr.request.TestFaceting
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 6.297 sec
[junit] 
[junit] Testsuite: org.apache.solr.request.TestWriterPerf
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.202 sec
[junit] 
[junit] Testsuite: org.apache.solr.response.TestCSVResponseWriter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.639 sec
[junit] 
[junit] Testsuite: org.apache.solr.schema.BadIndexSchemaTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.431 sec
[junit] 
[junit] Testsuite: org.apache.solr.schema.CopyFieldTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.525 sec
[junit] 
[junit] Testsuite: org.apache.solr.schema.DateFieldTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.014 sec

[jira] Updated: (SOLR-1301) Solr + Hadoop

2010-09-20 Thread Alexander Kanarsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kanarsky updated SOLR-1301:
-

Attachment: SOLR-1301.patch

The latest SOLR-1301-hadoop-0-20 patch is repackaged to be placed under 
contrib, as it was initially (build.xml is included), and tested against the 
current trunk. As usual, after applying the patch put the 4 lib jars (hadoop, 
log4j, and the two commons-logging jars) into contrib/hadoop/lib. No unit tests 
for now :) but I hope to add some soon. Here is the big question: as Andrzej 
once mentioned, the unit tests require a running Hadoop cluster. One approach 
is to make the patch and unit tests work with the Hadoop mini-cluster 
(ClusterMapReduceTestCase); however, this will bring in some extra dependencies 
needed to run the cluster (like Jetty). Another idea is to use your own 
cluster and just configure access to it in the unit tests; this approach 
seems logical but may give different test results on different clusters, and 
also may not give the low-level access to the execution that some tests need. 
So what is your opinion on how the tests for solr-hadoop should be run? I am 
not really happy with the idea of starting and running a Hadoop cluster while 
performing the Solr unit tests, but this could still be a better option than 
no unit tests at all.
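
A hedged sketch of what the mini-cluster route could look like, assuming 
Hadoop 0.20's ClusterMapReduceTestCase; the SolrOutputFormat wiring is only 
indicated in comments, since the patch defines it:

{code:java}
// ClusterMapReduceTestCase (from Hadoop's test jar) boots an in-process
// MapReduce cluster in setUp(), so no external cluster is required. Everything
// named here besides the Hadoop classes is illustrative, not from the patch.
import org.apache.hadoop.mapred.ClusterMapReduceTestCase;
import org.apache.hadoop.mapred.JobConf;

public class SolrOutputFormatMiniClusterTest extends ClusterMapReduceTestCase {
  public void testIndexingJob() throws Exception {
    JobConf conf = createJobConf();   // pre-wired to the in-process cluster
    conf.setJobName("solr-hadoop-indexing-test");
    // conf.setOutputFormat(SolrOutputFormat.class);  // from the patch
    // ... submit a tiny job over a few records, then assert that the expected
    // part-N shard directories exist on getFileSystem().
  }
}
{code}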

 Solr + Hadoop
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Andrzej Bialecki 
 Fix For: Next

 Attachments: commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop.patch, log4j-1.2.15.jar, README.txt, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301-hadoop-0-20.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SolrRecordWriter.java


 This patch contains a contrib module that provides distributed indexing 
 (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 SolrOutputFormat consumes data produced by reduce tasks directly, without 
 storing it in intermediate files. Furthermore, by using an 
 EmbeddedSolrServer, the indexing task is split into as many parts as there 
 are reducers, and the data to be indexed is not sent over the network.
 Design
 --
 Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
 which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
 instantiates an EmbeddedSolrServer, and it also instantiates an 
 implementation of SolrDocumentConverter, which is responsible for turning 
 Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
 batch, which is periodically submitted to EmbeddedSolrServer. When a reduce 
 task completes and the OutputFormat is closed, SolrRecordWriter calls 
 commit() and optimize() on the EmbeddedSolrServer.
 The API provides facilities to specify an arbitrary existing solr.home 
 directory, from which the conf/ and lib/ files will be taken.
 This process results in the creation of as many partial Solr home directories 
 as there were reduce tasks. The output shards are placed in the output 
 directory on the default filesystem (e.g. HDFS). Such part-N directories 
 can be used to run N shard servers. Additionally, users can specify the 
 number of reduce tasks, in particular 1 reduce task, in which case the output 
 will consist of a single shard.
 An example application is provided that processes large CSV files and uses 
 this API. It uses a custom CSV processing to avoid (de)serialization overhead.
 This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
 issue, you should put it in contrib/hadoop/lib.
 Note: the development of this patch was sponsored by an anonymous contributor 
 and approved for release under Apache License.
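
As a hedged illustration of the converter contract described above (the exact 
abstract class and generic signature live in the patch; the field handling 
below is hypothetical):

{code:java}
// Sketch of a SolrDocumentConverter turning a Hadoop (key, value) pair into a
// SolrInputDocument, per the design notes above. The commented-out 'extends'
// clause and the CSV handling are assumptions, not copied from the patch.
import org.apache.hadoop.io.Text;
import org.apache.solr.common.SolrInputDocument;

public class CsvLineConverter /* extends SolrDocumentConverter<Text, Text> */ {
  public SolrInputDocument convert(Text key, Text value) {
    SolrInputDocument doc = new SolrInputDocument();
    String[] cols = value.toString().split(",");  // toy CSV splitting
    doc.addField("id", key.toString());
    doc.addField("name", cols.length > 0 ? cols[0] : "");
    return doc;  // batched by SolrRecordWriter and flushed to EmbeddedSolrServer
  }
}
{code}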

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1218) maven artifact for webapp

2010-09-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912395#action_12912395
 ] 

Kjetil Ødegaard commented on SOLR-1218:
---

With the Solr WAR in the Maven repo, people would be able to easily build their 
own customized WARs with Maven WAR overlays. All you need to do is set a 
dependency to the Solr WAR from the web project with compile scope, and Maven 
handles the rest for you.

We've put the Solr WAR in our local repo and use this for our custom Solr 
deploy. If it were in central, things would be even easier.

 maven artifact for webapp
 -

 Key: SOLR-1218
 URL: https://issues.apache.org/jira/browse/SOLR-1218
 Project: Solr
  Issue Type: New Feature
Affects Versions: 1.3
Reporter: Benson Margulies

 It would be convenient to have a <packaging>war</packaging> maven project for 
 the webapp, to allow launching solr from maven via jetty.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Failed Test: junit.framework.TestSuite.org.apache.lucene.index.TestIndexWriter (from TestSuite)

2010-09-20 Thread Michael McCandless
Hmm -- I suspect TestIndexWriter.testNoWaitClose hit an exc, which
caused it not to close the dir, but the code that catches this in
LuceneTestCase fails to show that root cause?

I think we should disable the dir/IndexInput/Output not closed
checking if the test hit an exc?

Ahh so here is the root cause:
 http://gperf.ath.cx:/hudson/job/Solcene/1704/testReport/junit/org.apache.lucene.index/TestIndexWriter/testNoWaitClose/

java.io.FileNotFoundException: /home/mark/hudson_solcene/jobs/Solcene/workspace/solcene/lucene/build/test/7/test4946766365764846424tmp/_46.fnm (Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:69)
        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:90)
        at org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:56)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:351)
        at org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:299)
        at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:69)
        at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:131)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:536)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:509)
        at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:129)
        at org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:96)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:630)
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:91)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:414)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:233)
        at org.apache.lucene.index.TestIndexWriter.testNoWaitClose(TestIndexWriter.java:2174)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:805)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:778)

The test is quite awful -- creates index w/ maxBufferedDocs 2 and
mergeFactor 100 and (randomly) CFS on/off.  So it's not surprising
when it gets FSDir that it'll run out of descriptors...

We should fix this test to not use any of the FS dirs?
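
For what it's worth, a minimal sketch of that change, assuming the 
MockDirectoryWrapper(Random, Directory) constructor from trunk at the time:

{code:java}
// Back the test with a RAMDirectory so it cannot run the JVM out of OS file
// descriptors, whatever maxBufferedDocs/mergeFactor/CFS combination it draws.
import java.util.Random;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.MockDirectoryWrapper;
import org.apache.lucene.store.RAMDirectory;

public class RamBackedDirs {
  static Directory newRamBackedDir(Random random) {
    return new MockDirectoryWrapper(random, new RAMDirectory());
    // testNoWaitClose would then open its IndexWriter over this directory.
  }
}
{code}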

Mike

On Sun, Sep 19, 2010 at 8:55 PM, Mark Miller markrmil...@gmail.com wrote:

 Failed

 junit.framework.TestSuite.org.apache.lucene.index.TestIndexWriter (from 
 TestSuite)

 Failing for the past 1 build (Since #1704 )
 Took 0 ms.
 add description

 Error Message

 directory of test was not closed, opened from: 
 org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:585)

 Stacktrace

 junit.framework.AssertionFailedError: directory of test was not closed, 
 opened from: 
 org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:585)
   at 
 org.apache.lucene.util.LuceneTestCase.afterClassLuceneTestCaseJ4(LuceneTestCase.java:304)

 Standard Output

 NOTE: random codec of testcase 'testNoWaitClose' was: 
 MockFixedIntBlock(blockSize=340)
 NOTE: random locale of testcase 'testNoWaitClose' was: ar_LB
 NOTE: random timezone of testcase 'testNoWaitClose' was: EAT


 --
 - Mark

 http://www.lucidimagination.com


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1218) maven artifact for webapp

2010-09-20 Thread Stevo Slavic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912429#action_12912429
 ] 

Stevo Slavic commented on SOLR-1218:


Voted for the issue too.

As a temporary workaround, to reference solr.war but still keep solr config 
files in IDE under version control, I use following config:

{code:title=pom.xml|borderStyle=solid}
...
<plugin>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty-maven-plugin</artifactId>
  <configuration>
    <stopPort>${jetty.stop.port}</stopPort>
    <stopKey>foo</stopKey>
    <webApp>${env.SOLR_HOME}/example/webapps/solr.war</webApp>
    <tempDirectory>${project.build.directory}/jetty-tmp</tempDirectory>
    <systemProperties>
      <systemProperty>
        <name>solr.solr.home</name>
        <value>${basedir}/src/main/solr/home</value>
      </systemProperty>
      <systemProperty>
        <name>solr.data.dir</name>
        <value>${project.build.directory}/solr/data</value>
      </systemProperty>
      <systemProperty>
        <name>solr_home</name>
        <value>${env.SOLR_HOME}</value>
      </systemProperty>
    </systemProperties>
  </configuration>
  <executions>
    <execution>
      <id>start-jetty</id>
      <phase>pre-integration-test</phase>
      <goals>
        <goal>deploy-war</goal>
      </goals>
      <configuration>
        <daemon>true</daemon>
        <webAppConfig>
          <contextPath>/solr</contextPath>
          <tempDirectory>${project.build.directory}/jetty-tmp</tempDirectory>
        </webAppConfig>
        <connectors>
          <connector implementation="org.eclipse.jetty.server.nio.SelectChannelConnector">
            <port>${jetty.http.port}</port>
          </connector>
        </connectors>
      </configuration>
    </execution>
    <execution>
      <id>stop-jetty</id>
      <phase>post-integration-test</phase>
      <goals>
        <goal>stop</goal>
      </goals>
    </execution>
  </executions>
</plugin>
...
{code}

And I update the SOLR_HOME environment variable when moving to a new Solr 
installation/version. This is easy for a development environment, but not for 
CI (Hudson). That's why having solr.war in a public repo would be handy.

 maven artifact for webapp
 -

 Key: SOLR-1218
 URL: https://issues.apache.org/jira/browse/SOLR-1218
 Project: Solr
  Issue Type: New Feature
Affects Versions: 1.3
Reporter: Benson Margulies

 It would be convenient to have a <packaging>war</packaging> maven project for 
 the webapp, to allow launching solr from maven via jetty.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Resolved: (LUCENE-2491) Extend Codec with a SegmentInfos writer / reader

2010-09-20 Thread Andrzej Bialecki (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  resolved LUCENE-2491.
---

Fix Version/s: 4.0
   Resolution: Fixed

This was committed as a part of LUCENE-2373.

 Extend Codec with a SegmentInfos writer / reader
 

 Key: LUCENE-2491
 URL: https://issues.apache.org/jira/browse/LUCENE-2491
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 4.0
Reporter: Andrzej Bialecki 
 Fix For: 4.0


 I'm trying to implement a Codec that works with append-only filesystems 
 (HDFS). It's _almost_ done, except for the SegmentInfos.write(dir), which 
 uses ChecksumIndexOutput, which in turn uses IndexOutput.seek() - and seek is 
 not supported on append-only output. I propose to extend the Codec interface 
 to encapsulate also the details of SegmentInfos writing / reading. Patch to 
 follow after some feedback ;)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Assigned: (LUCENE-2482) Index sorter

2010-09-20 Thread Andrzej Bialecki (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  reassigned LUCENE-2482:
-

Assignee: Andrzej Bialecki 

 Index sorter
 

 Key: LUCENE-2482
 URL: https://issues.apache.org/jira/browse/LUCENE-2482
 Project: Lucene - Java
  Issue Type: New Feature
  Components: contrib/*
Affects Versions: 3.1
Reporter: Andrzej Bialecki 
Assignee: Andrzej Bialecki 
 Fix For: 3.1

 Attachments: indexSorter.patch


 A tool to sort an index according to a float document weight. Documents with 
 high weight are given low document numbers, which means that they will be 
 evaluated first. When using a strategy of early termination of queries (see 
 TimeLimitedCollector), such sorting significantly improves the quality of 
 partial results.
 (Originally this tool was created by Doug Cutting in Nutch, and used norms as 
 document weights - thus the ordering was limited by the limited resolution of 
 norms. This is a pure Lucene version of the tool, and it uses arbitrary 
 floats from a specified stored field.)
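
A hedged sketch of the remapping at the heart of such a tool; every name below 
is illustrative, and the attached indexSorter.patch defines the real thing:

{code:java}
// Rank documents by a stored float weight so that high-weight documents
// receive the lowest new document numbers.
import java.util.Arrays;
import java.util.Comparator;

public class SortByWeightSketch {
  /** Returns newToOld, where newToOld[n] is the old docid that becomes doc n. */
  static Integer[] sortByWeight(final float[] weight) {
    Integer[] newToOld = new Integer[weight.length];
    for (int i = 0; i < newToOld.length; i++) newToOld[i] = i;
    Arrays.sort(newToOld, new Comparator<Integer>() {
      public int compare(Integer a, Integer b) {
        return Float.compare(weight[b], weight[a]); // descending: heavy docs first
      }
    });
    // The tool then rewrites postings, stored fields, etc. with this mapping.
    return newToOld;
  }
}
{code}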

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (LUCENE-2656) If tests fail, don't report about unclosed resources

2010-09-20 Thread Robert Muir (JIRA)
If tests fail, don't report about unclosed resources


 Key: LUCENE-2656
 URL: https://issues.apache.org/jira/browse/LUCENE-2656
 Project: Lucene - Java
  Issue Type: Test
  Components: Tests
Affects Versions: 3.1, 4.0
Reporter: Robert Muir
 Fix For: 3.1, 4.0
 Attachments: LUCENE-2656.patch

LuceneTestCase checks in afterClass() that you closed all your directories, 
which in turn verifies that you have not left any files open.

This is good, as a test will fail if we have resource leaks.

But if a test truly fails, this is just confusing, because it's usually not 
going to make it to the part of its code where it would call .close().

So, if any tests fail, I think we should omit this check in afterClass()
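
A minimal sketch of the guard, with the failure flag and the check named 
hypothetically (the real LuceneTestCase internals may differ):

{code:java}
// Skip the leak assertion when a test already failed, so the report shows the
// true root cause instead of a secondary "directory was not closed" failure.
// 'testsFailed' and 'assertDirectoriesClosed()' are illustrative names only.
public static void afterClassLuceneTestCaseJ4() {
  if (!testsFailed) {              // assume the runner sets this on any failure
    assertDirectoriesClosed();     // the existing unclosed dir/IndexInput check
  }
}
{code}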

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2656) If tests fail, don't report about unclosed resources

2010-09-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-2656:


Attachment: LUCENE-2656.patch

 If tests fail, don't report about unclosed resources
 

 Key: LUCENE-2656
 URL: https://issues.apache.org/jira/browse/LUCENE-2656
 Project: Lucene - Java
  Issue Type: Test
  Components: Tests
Affects Versions: 3.1, 4.0
Reporter: Robert Muir
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2656.patch


 LuceneTestCase checks in afterClass() that you closed all your directories, 
 which in turn verifies that you have not left any files open.
 This is good, as a test will fail if we have resource leaks.
 But if a test truly fails, this is just confusing, because it's usually not 
 going to make it to the part of its code where it would call .close().
 So, if any tests fail, I think we should omit this check in afterClass()

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Failed Test: junit.framework.TestSuite.org.apache.lucene.index.TestIndexWriter (from TestSuite)

2010-09-20 Thread Robert Muir
On Mon, Sep 20, 2010 at 5:09 AM, Michael McCandless 
luc...@mikemccandless.com wrote:

 Hmm -- I suspect TestIndexWriter.testNoWaitClose hit an exc, which
 caused it not to close the dir, but the code that catches this in
 LuceneTestCase fails to show that root cause?

 I think we should disable the dir/IndexInput/Output not closed
 checking if the test hit an exc?


https://issues.apache.org/jira/browse/LUCENE-2656

-- 
Robert Muir
rcm...@gmail.com


[jira] Commented: (LUCENE-2656) If tests fail, don't report about unclosed resources

2010-09-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912445#action_12912445
 ] 

Michael McCandless commented on LUCENE-2656:


Super -- patch looks great!

 If tests fail, don't report about unclosed resources
 

 Key: LUCENE-2656
 URL: https://issues.apache.org/jira/browse/LUCENE-2656
 Project: Lucene - Java
  Issue Type: Test
  Components: Tests
Affects Versions: 3.1, 4.0
Reporter: Robert Muir
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2656.patch


 LuceneTestCase checks in afterClass() that you closed all your directories, 
 which in turn verifies that you have not left any files open.
 This is good, as a test will fail if we have resource leaks.
 But if a test truly fails, this is just confusing, because it's usually not 
 going to make it to the part of its code where it would call .close().
 So, if any tests fail, I think we should omit this check in afterClass()

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Grant Ingersoll
A little late to the party, but...

On Sep 18, 2010, at 5:09 PM, Ryan McKinley wrote:

 I cannot in good conscience sign with my key, nor vote over any maven
 artifacts. I noticed these guides only mentioned how to upload (which itself
 seems extremely complex). But nowhere do I see 'how do you test that your
 artifacts are correct'. And that's really the main problem I have with our
 maven support.
 
 I understand what you are worried about... and think we can avoid it.
 How about:
 
 1. Keep the generate-maven-artifacts in the release.  This just
 copies the official jar files to a special directory structure (same
 keys etc)

OK, I get that a lot of committers here don't like Maven; I don't think 
Lucene should switch to a Maven build, and Maven is a pain to do complex things 
in. But I use it all the time for Lucene/Solr (for non-complex things), and I 
know a lot of people in user land who use it as well b/c it makes the common 
things _users_ do really easy.  

And, as much as Hoss restarted this thread by saying the PMC releases only 
source, that simply is not what users expect.  That's why we sign all the 
artifacts: they are the RM saying "I verify this"; the PMC then votes on all 
the artifacts, and it's why we push them all up for distribution.  Of course, we 
are only required to release source, but you show me a project that does only 
that at the ASF and I'll show you a project w/ very few users.

At any rate, the big problem w/ Maven and Lucene is not that 
generate-maven-artifacts doesn't work; it's that the POM templates aren't kept 
in sync.  However, I think we now have a solution for that thanks to Steve and 
Robert's work to make it easier to bring Lucene into IntelliJ.  In other words, 
that process does much of what is needed for Maven, so it should be relatively 
straightforward to have it automatically generate the templates, too.  In fact, 
it would be just as easy for that project to simply produce POM files (which 
are well understood and have a published spec) instead of creating the IntelliJ 
project files, which are not well understood, not publicly spec'd, and subject 
to change w/ every release, and simply have IntelliJ suck in the POM file, 
since IntelliJ supports that very, very well. 

Then, to automatically test Maven, we simply need to do a few things:
1. Generate the templates
2. Build the Maven artifacts and install them (this is a Maven concept that 
copies them to your local repository, usually in ~/.m2/repository, but it can 
be in other places and it should be clean)
3. Generate a test pom that includes, as dependencies, all the Lucene Maven 
artifacts, and maybe even compile a small source tree with it

If that last step passes, you know everything is right.  However, short of #2 
and #3, as long as the POMs are being generated accurately, I think I would 
feel comfortable releasing them, whereas I agree with Robert that we probably 
shouldn't be releasing them as things stand now.

(BTW, I love the "Maven is Magic" (and really any "It's magic, therefore I 
don't like it") reasoning for not liking it, whereby everyone complains that 
b/c Maven hides a bunch of details from you (i.e. it's magic), therefore you 
don't like it.  At the same time, I'm sure said person doesn't understand every 
last detail of, oh, I don't know: the CPU, RAM, the Compiler, the JDK, etc. and 
yet they have no problem using that.  In other words, we deal with abstractions 
all the time.  It's fine if you don't get the abstraction or don't personally 
find it useful, but that doesn't make the abstraction bad.) 

-Grant
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Robert Muir
On Mon, Sep 20, 2010 at 8:23 AM, Grant Ingersoll gsing...@apache.org wrote:

 At any rate, the big problem w/ Maven and Lucene is not that
 generate-maven-artifacts doesn't work, it's that the POM templates aren't
 kept in sync.  However, I think we now have a solution for that thanks to
 Steve and Robert's work to make it easier to bring Lucene into IntelliJ.  In
 other words, that process does much of what is needed for Maven, so it
 should be relatively straightforward to have it automatically generate the
 templates, too.  In fact, it would be just as easy for that project to
 simply produce POM files (which are well understood and have a published
 spec) instead of creating the IntelliJ project files, which are not well
 understood and not publicly spec'd and subject to change w/ every release
 and simply have IntelliJ suck in the POM file since IntelliJ supports that
 very, very well.


So are you saying, instead of generating IntelliJ configuration, we generate
poms, and then we have a route, via maven, for users to automatically set up
their IntelliJ (and also eclipse?) IDEs?

If so this sounds great to me. Because it would be nice to make the IDE
configuration easier, not just for IntelliJ.


 Then, to automatically test Maven, we simply need to do a few things:
 1. Generate the templates
 2. Build the Maven artifacts and install them (this is a Maven concept
 that copies them to your local repository, usually in ~/.m2/repository, but
 it can be in other places and it should be clean)
 3. Generate a test pom that includes, as dependencies all the Lucene
 Maven artifacts and maybe even compiles a small source tree with it


+1. this would resolve all my concerns about maven, because we have a way to
test that it stands a chance of working *before release*.

I hope you don't think I am picking on maven here; I'm equally disturbed
about the demo application, and I think it should have a basic unit test too
that indexes stuff, fires itself up in jetty, and runs a search.

Like maven, I know some people don't necessarily like the demo, but as long
as we are going to ship it, I want tests so that we don't find it's completely
nonfunctional after the release. Unlike maven, I think I stand a chance of
actually being able to write the test for this one though.


-- 
Robert Muir
rcm...@gmail.com


Re: discussion about release frequency.

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 8:44 AM, Robert Muir wrote:

 
 
 On Mon, Sep 20, 2010 at 8:23 AM, Grant Ingersoll gsing...@apache.org wrote:
 At any rate, the big problem w/ Maven and Lucene is not that 
 generate-maven-artifacts doesn't work, it's that the POM templates aren't 
 kept in sync.  However, I think we now have a solution for that thanks to 
 Steve and Robert's work to make it easier to bring Lucene into IntelliJ.  In 
 other words, that process does much of what is needed for Maven, so it should 
 be relatively straightforward to have it automatically generate the 
 templates, too.  In fact, it would be just as easy for that project to simply 
 produce POM files (which are well understood and have a published spec) 
 instead of creating the IntelliJ project files, which are not well understood 
 and not publicly spec'd and subject to change w/ every release and simply 
 have IntelliJ suck in the POM file since IntelliJ supports that very, very 
 well.
 
 
 So are you saying, instead of generating IntelliJ configuration, we generate 
 poms, and then we have a route, via maven, for users to automatically set up 
 their IntelliJ (and also eclipse?) IDEs?
 
 If so this sounds great to me. Because it would be nice to make the IDE 
 configuration easier, not just for IntelliJ.

Yes.  I know for a fact IntelliJ can read the POMs.  I use it all the time.  Go 
check out Mahout and point IntelliJ at its POM.  You will be up and compiling 
(in your IDE) in less than 2 minutes, give or take.  I imagine Eclipse has 
similar support.

  
 Then, to automatically test Maven, we simply need to do a few things:
 1. Generate the templates
 2. Build the Maven artifacts and install them (this is a Maven concept that 
 copies them to your local repository, usually in ~/.m2/repository, but it 
 can be in other places and it should be clean)
 3. Generate a test pom that includes, as dependencies all the Lucene Maven 
 artifacts and maybe even compiles a small source tree with it
 
 
 +1. this would resolve all my concerns about maven, because we have a way to 
 test that it stands a chance of working *before release*.
 
 I hope you don't think I am picking on maven here; I'm equally disturbed 
 about the demo application, and I think it should have a basic unit test too 
 that indexes stuff, fires itself up in jetty, and runs a search.

I totally understand it.  I'm not some Maven fanboi (especially as the person 
who used it to put together the Mahout release, initially).  I know its warts, 
believe me, as I have lived the pain.  That being said, for _most_ users (i.e. 
not necessarily us committers) who are simply using Lucene/Solr within a much 
broader environment of dependencies, having the JARs available in the Maven 
repo w/ correct POM files is a very good thing that makes it so much easier for 
them to do their day-to-day work, and I would hate to see that go away, 
especially since it is something we have supported for quite some time, albeit 
with varying levels of success.

 
 Like maven, I know some people don't necessarily like the demo, but as long 
 as we are going to ship it, I want tests so that we don't find it's completely 
 nonfunctional after the release. Unlike maven, I think I stand a chance of 
 actually being able to write the test for this one though.

I've been wanting to do those Maven tests for a while now, but simply can't 
find the time relative to my other priorities.  I guess if the community is 
saying it's going to be dropped unless someone steps up, then I'll step up. 
I can likely commit to it before the next release. 

-Grant
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Mark Miller

 (BTW, I love the "Maven is Magic" (and really any "It's magic, therefore I 
 don't like it") reasoning for not liking it, whereby everyone complains that 
 b/c Maven hides a bunch of details from you (i.e. it's magic), therefore 
 you don't like it.  At the same time, I'm sure said person doesn't understand 
 every last detail of, oh, I don't know: the CPU, RAM, the Compiler, the JDK, 
 etc. and yet they have no problem using that.  In other words, we deal with 
 abstractions all the time.  It's fine if you don't get the abstraction or 
 don't personally find it useful, but that doesn't make the abstraction bad.) 
 
 -Grant

Maven is not bad because it's magic - magic is frigging great - I want
my software to be magic - it's bad because every 5 line program from
some open source code/project that I have tried to build with it has
gone on an absurd downloading spree that takes forever because it's
getting many tiny files. This downloading spree never corresponds to the
size of the code base I am working with, and always manages to surprise
me with the amount of time it can slurp up.

That's enough for me right there - I've heard others talk of other non
magical things that sound scary, but I won't dig any deeper into this
absurdity. Either I *really* don't like Maven, or no one knows how to
properly set it up - which makes me still not like it. When the magic is
absurd, it loses a little of its magic.

Finally, there is a difference between releasing source code, releasing
signed jars plus signed maven files, and *just* releasing signed jars.
Dropping maven doesn't get you back down to releasing only source code. I
still think Maven should be a downstream issue.

- Mark

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 8:58 AM, Grant Ingersoll wrote:

 
 On Sep 20, 2010, at 8:44 AM, Robert Muir wrote:
 
 
 
 On Mon, Sep 20, 2010 at 8:23 AM, Grant Ingersoll gsing...@apache.org wrote:
 At any rate, the big problem w/ Maven and Lucene is not that 
 generate-maven-artifacts doesn't work, it's that the POM templates aren't 
 kept in sync.  However, I think we now have a solution for that thanks to 
 Steve and Robert's work to make it easier to bring Lucene into IntelliJ.  In 
 other words, that process does much of what is needed for Maven, so it 
 should be relatively straightforward to have it automatically generate the 
 templates, too.  In fact, it would be just as easy for that project to 
 simply produce POM files (which are well understood and have a published 
 spec) instead of creating the IntelliJ project files, which are not well 
 understood and not publicly spec'd and subject to change w/ every release 
 and simply have IntelliJ suck in the POM file since IntelliJ supports that 
 very, very well.
 
 
 So are you saying, instead of generating IntelliJ configuration, we generate 
 poms, and then we have a route, via maven, for users to automatically set up 
 their IntelliJ (and also eclipse?) IDEs?
 
 If so this sounds great to me. Because it would be nice to make the IDE 
 configuration easier, not just for IntelliJ.
 
 Yes.  I know for a fact IntelliJ can read the POMs.  I use it all the time.  
 Go check out Mahout and point IntelliJ at it's POM.  You will be up and 
 compiling  (in your IDE) in less than 2 minutes give or take.  I imagine 
 Eclipse has similar support.

I should correct myself here.  While all of the above is true, it likely still 
won't work for Lucene b/c the source trees aren't in line w/ Maven conventions. 
 Thus, we will probably still need to output IntelliJ format.  I do, however, 
think it isn't much of a leap to also output a POM file.

-Grant
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Ryan McKinley
 I hope you don't think I am picking on maven here; I'm equally disturbed
 about the demo application, and I think it should have a basic unit test too
 that indexes stuff, fires itself up in jetty, and runs a search.

The solr sample app is tested -- I don't know anything about the lucene demo stuff.

Most of the solrj tests run from the example schema via jetty and embedded.

ryan

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 9:00 AM, Mark Miller wrote:

 
 (BTW, I love the "Maven is Magic" (and really any "It's magic, therefore I 
 don't like it") reasoning for not liking it, whereby everyone complains that 
 b/c Maven hides a bunch of details from you (i.e. it's magic), therefore 
 you don't like it.  At the same time, I'm sure said person doesn't 
 understand every last detail of, oh, I don't know: the CPU, RAM, the 
 Compiler, the JDK, etc. and yet they have no problem using that.  In other 
 words, we deal with abstractions all the time.  It's fine if you don't get 
 the abstraction or don't personally find it useful, but that doesn't make 
 the abstraction bad.) 
 
 -Grant
 
 Maven is not bad because it's magic - magic is frigging great - I want
 my software to be magic - it's bad because every 5 line program from
 some open source code/project that I have tried to build with it has
 gone on an absurd downloading spree that takes forever because it's
 getting many tiny files. This downloading spree never corresponds to the
 size of the code base I am working with, and always manages to surprise
 me with the amount of time it can slurp up.

Agreed, but over time, it is lessened by the fact that you already have most 
common files/jars and furthermore, you only have one copy of them instead of 
one under every source tree.  I think, over time, you actually end up 
downloading less than with other approaches and that even includes the 
downloads one gets when Maven upgrades itself.  

I do, agree, though, that Maven makes you drink the Kool-aid and it doesn't 
play well with other conventions (although it isn't horrible when it comes to 
Ant, either).  There are plenty of days I hate Maven for what it assumes, but 
there are also many days when I love the fact that the POM describes my project 
in one clear, fairly concise, validatable way.

 
 I
 still think Maven should be a downstream issue.

I don't see how it can be.  You have to be a committer to push it to the ASF 
repository for syndication on iBiblio, etc.  That being said, we really aren't 
that far from a process that we can have confidence in.

-Grant
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



removing payload.xml

2010-09-20 Thread Yonik Seeley
I was documenting some field collapsing stuff when I ran across a
response like this (using the example data):

  "grouped":{
    "price:[0 TO 99.99]":{
      "matches":8,
      "doclist":{"numFound":2,"start":0,"docs":[
        {
          "name":"CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail",
          "price":74.99},
        {
          "name":"CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail",
          "price":74.99}]
      }},


At first I thought something was horribly wrong with the grouping...
but on expanding all the fields, I realized that some of the documents
were copies of others, with the ID field changed, and a payload field
added.  I imagine others will make the same mistake, so I'm going to
simply move the payload fields to the original docs and remove
payload.xml

Does anyone know of any Solr docs that need to be adjusted as part of
this?  I couldn't find anything on payloads in our wiki.

-Yonik
http://lucenerevolution.org  Lucene/Solr Conference, Boston Oct 7-8

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Yonik Seeley
On Mon, Sep 20, 2010 at 9:00 AM, Mark Miller markrmil...@gmail.com wrote:
 I still think Maven should be a downstream issue.

+1

Maven has never been a required part of our releases, and I don't
think we should change that.

We should also keep in mind that there's nothing really official about
a release manager.
There's no reason the person(s) that signed the normal release need to
be the same person that signs the maven stuff (but it should be a PMC
member if it's hosted by the ASF).

If there are people around during a release that want to handle the
maven stuff, that seems fine.  It does *not* have to be the release
manager.  It seems fine to make reasonable accommodations if some are
working on making maven artifacts available at roughly the same time... but
if not, it should not hold up the release.

-Yonik
http://lucenerevolution.org  Lucene/Solr Conference, Boston Oct 7-8

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Andrzej Bialecki

On 2010-09-20 15:21, Grant Ingersoll wrote:

I do, agree, though, that Maven makes you drink the Kool-aid and it doesn't play well 
with other conventions (although it isn't horrible when it comes to Ant, either).  There 
are plenty of days I hate Maven for what it assumes, but there are also many days when I 
love the fact that the POM describes my project in one clear, fairly concise, 
validatable way.


We took the middle road in Nutch - we switched to ant+ivy to manage 
dependencies. This way we get single copies of all deps, and build.xml 
is still recognizable and useful. Of course, this doesn't solve the 
publishing part of Maven functionality (yet).


--
Best regards,
Andrzej Bialecki 
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: discussion about release frequency.

2010-09-20 Thread Steven A Rowe
On 9/20/2010 at 8:24 AM, Grant Ingersoll wrote:
 At any rate, the big problem w/ Maven and Lucene is not that generate-
 maven-artifacts doesn't work, it's that the POM templates aren't kept in
 sync.  However, I think we now have a solution for that thanks to Steve
 and Robert's work to make it easier to bring Lucene into IntelliJ.  In
 other words, that process does much of what is needed for Maven, so it
 should be relatively straightforward to have it automatically generate the
 templates, too.  In fact, it would be just as easy for that project to
 simply produce POM files (which are well understood and have a published
 spec) instead of creating the IntelliJ project files, which are not well
 understood and not publicly spec'd and subject to change w/ every release
 and simply have IntelliJ suck in the POM file since IntelliJ supports that
 very, very well.

Unfortunately, LUCENE-2611 does not automatically generate IntelliJ setup files 
- they are static, just like the POM template files.  I think it's possible, 
using an Ant BuildListener-extending class, to do automatic generation, but I 
haven't attempted it yet.  I'll open an issue.

Steve



Re: discussion about release frequency.

2010-09-20 Thread Yonik Seeley
On Mon, Sep 20, 2010 at 10:18 AM, Grant Ingersoll gsing...@apache.org wrote:

 On Sep 20, 2010, at 9:55 AM, Yonik Seeley wrote:

 On Mon, Sep 20, 2010 at 9:00 AM, Mark Miller markrmil...@gmail.com wrote:
 I still think Maven should be a downstream issue.

 +1

 Maven has never been a required part of our releases, and I don't
 think we should change that.

 We should also keep in mind that there's nothing really official about
 a release manager.
 There's no reason the person(s) that signed the normal release need to
 be the same person that signs the maven stuff (but it should be a PMC
 member if it's hosted by the ASF).

 If there are people around during a release that want to handle the
 maven stuff, that seems fine.  It does *not* have to be the release
 manager.  It seems fine to make reasonable accommodations if some are
 working on making maven artifacts available at roughly the same... but
 if not,  it should not hold up the release.

 I completely disagree.

With what part?  Do you mean to say you wish to make maven a required
part of our releases?
If so, perhaps you should call a vote?

  It's either a first class citizen or it's not and by moving it out

It is not a first class citizen.  Apparently the last Solr release
went out w/o working maven support.

But it's not quite so black and white either... I see no reason to
*remove* maven related stuff from ant (and it's good if people improve
it), and I've even applied patches to the maven stuff when supplied by
others.

-Yonik

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (LUCENE-2657) Auto-generate POM templates from Ant builds

2010-09-20 Thread Steven Rowe (JIRA)
Auto-generate POM templates from Ant builds
---

 Key: LUCENE-2657
 URL: https://issues.apache.org/jira/browse/LUCENE-2657
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Build
Reporter: Steven Rowe
Priority: Minor
 Fix For: 3.1, 4.0


Lucene and Solr modules' POM templates are manually maintained, and so are not 
always in sync with the dependencies used by the Ant build. 

It should be possible to auto-generate POM templates using build tools 
extending Ant's 
[BuildListener|http://api.dpml.net/ant/1.6.5/org/apache/tools/ant/BuildListener.html]
 interface, similarly to how the [ant2ide|http://gleamynode.net/articles/2234/] 
project generates eclipse project files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Resolved: (LUCENE-2482) Index sorter

2010-09-20 Thread Andrzej Bialecki (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  resolved LUCENE-2482.
---

Resolution: Fixed

Committed in rev. 998948.

 Index sorter
 

 Key: LUCENE-2482
 URL: https://issues.apache.org/jira/browse/LUCENE-2482
 Project: Lucene - Java
  Issue Type: New Feature
  Components: contrib/*
Affects Versions: 3.1
Reporter: Andrzej Bialecki 
Assignee: Andrzej Bialecki 
 Fix For: 3.1

 Attachments: indexSorter.patch


 A tool to sort an index according to a float document weight. Documents with 
 high weight are given low document numbers, which means that they will be 
 evaluated first. When using a strategy of early termination of queries (see 
 TimeLimitedCollector), such sorting significantly improves the quality of 
 partial results.
 (Originally this tool was created by Doug Cutting in Nutch, and used norms as 
 document weights - thus the ordering was limited by the limited resolution of 
 norms. This is a pure Lucene version of the tool, and it uses arbitrary 
 floats from a specified stored field.)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Issue Comment Edited: (LUCENE-2657) Auto-generate POM templates from Ant builds

2010-09-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912487#action_12912487
 ] 

Robert Muir edited comment on LUCENE-2657 at 9/20/10 10:59 AM:
---

How would the BuildListener interface know about dependencies? Does it have 
some magic way to know this?

As an example, let's take modules/analysis/icu, which has 3 dependencies:
* Lucene core itself (implicit from contrib-build.xml)
* external dependency: ICU
* internal dependency: modules/analysis/common

take a look at modules/analysis/icu's pom.xml, which has:
{noformat}
  <dependencies>
    <dependency>
      <groupId>com.ibm.icu</groupId>
      <artifactId>icu4j</artifactId>
      <version>${icu-version}</version>
    </dependency>
  </dependencies>
{noformat}

However, our ant builds (that depend on common-build/contrib-build) declare 
their dependencies in a semi-standard way:
* External dependencies:
{noformat}
<path id="additional.dependencies">
  <fileset dir="lib" includes="icu4j-*.jar"/>
</path>
{noformat}
* Internal dependencies:
{noformat}
<module-uptodate name="analysis/common" 
  jarfile="../build/common/lucene-analyzers-common-${version}.jar"
  property="analyzers-common.uptodate" 
  classpath.property="analyzers-common.jar"/>
{noformat}

The contrib-build.xml already has a 'dist-maven' target that is called 
recursively. Perhaps an alternative would be to improve contrib-build.xml to 
have a 'generate-maven' target, also called recursively.
I've already prototyped/proposed in SOLR-2002 that we migrate the solr build to 
extend the lucene build, so everywhere would use it.

Furthermore, couldn't we also make a recursive 'test-maven' target that 
generates a maven project to 'download' or whatever it needs, then tries to run 
all the tests?
If somehow the maven is broken, the tests simply won't pass.

I realize that running all of a module's tests again redundantly via 'maven' 
might not be the most elegant solution, but it seems like it would test that 
everything is working.


 Auto-generate POM templates from Ant builds
 ---

 Key: LUCENE-2657
 URL: https://issues.apache.org/jira/browse/LUCENE-2657
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Build
Reporter: Steven Rowe
Priority: Minor
 Fix For: 3.1, 4.0


 Lucene and Solr modules' POM templates are manually maintained, and so are 
 not always in sync with the dependencies used by the Ant build. 
 It should be possible to auto-generate POM templates using build tools 
 extending Ant's 
 [BuildListener|http://api.dpml.net/ant/1.6.5/org/apache/tools/ant/BuildListener.html]
  interface, similarly to how the 
 [ant2ide|http://gleamynode.net/articles/2234/] project generates eclipse 
 project files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] Commented: (LUCENE-2493) Rename lucene/solr dev jar files to -SNAPSHOT.jar

2010-09-20 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912491#action_12912491
 ] 

David Smiley commented on LUCENE-2493:
--

Of course we should do this.  I've had to do this on my end with the 
-Ddev.version=4.0-SNAPSHOT trick in the meantime.

 Rename lucene/solr dev jar files to -SNAPSHOT.jar
 -

 Key: LUCENE-2493
 URL: https://issues.apache.org/jira/browse/LUCENE-2493
 Project: Lucene - Java
  Issue Type: Task
Reporter: Ryan McKinley
Priority: Minor
 Attachments: LUCENE-2493-dev-to-SNAPSHOT.patch


 Currently the lucene dev jar files end with '-dev.jar'. This is all fine, but 
 it makes people using maven jump through a few hoops to get the -SNAPSHOT 
 naming convention required by maven.  If we want to publish snapshot builds 
 with hudson, we would need to either write some crazy scripts or run the 
 build twice.
 I suggest we switch to -SNAPSHOT.jar, hopefully for both the 3.x branch and 
 the /trunk (4.x) branch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 10:28 AM, Steven A Rowe wrote:

 On 9/20/2010 at 8:24 AM, Grant Ingersoll wrote:
 At any rate, the big problem w/ Maven and Lucene is not that generate-
 maven-artifacts doesn't work, it's that the POM templates aren't kept in
 sync.  However, I think we now have a solution for that thanks to Steve
 and Robert's work to make it easier to bring Lucene into IntelliJ.  In
 other words, that process does much of what is needed for Maven, so it
 should be relatively straightforward to have it automatically generate the
 templates, too.  In fact, it would be just as easy for that project to
 simply produce POM files (which are well understood and have a published
 spec) instead of creating the IntelliJ project files, which are not well
 understood and not publicly spec'd and subject to change w/ every release
 and simply have IntelliJ suck in the POM file since IntelliJ supports that
 very, very well.
 
 Unfortunately, LUCENE-2611 does not automatically generate IntelliJ setup 
 files - they are static, just like the POM template files.

Hmm, hadn't looked that closely.  I'd say this is going to suffer the same fate 
as the POM template files then, and I would thus be against including it.

  I think it's possible, using an Ant BuildListener-extending class, to do 
 automatic generation, but I haven't attempted it yet.  I'll open an issue.


Cool.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2657) Auto-generate POM templates from Ant builds

2010-09-20 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912495#action_12912495
 ] 

Steven Rowe commented on LUCENE-2657:
-

bq. How would the BuildListener interface know about dependencies? Does it have 
some magic way to know this? 

BuildListener has hooks for build task onset and completion events (inter 
alia).  ant2ide listens for Javac task completion, and captures from it the 
source and target directories, as well as the build classpath.  You have to 
invoke compilation from an Ant build in order for this to work.

Seems kinda magical to me :)

The missing part here is figuring out the maven groupId/artifactId/version, and 
I *think* this can be dealt with by looking at the manifest in the jar.  
Maven-produced jars also contain their POMs, and pulling from there would be 
even simpler.
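
A bare-bones sketch of such a listener (the class name and output are mine, not 
ant2ide's; it assumes only the Ant 1.6+ API):

{noformat}
import org.apache.tools.ant.BuildEvent;
import org.apache.tools.ant.BuildListener;
import org.apache.tools.ant.Task;
import org.apache.tools.ant.UnknownElement;
import org.apache.tools.ant.taskdefs.Javac;

public class PomTemplateListener implements BuildListener {
  public void taskFinished(BuildEvent event) {
    Task task = event.getTask();
    // Ant may hand us an UnknownElement wrapper; unwrap to the real task
    if (task instanceof UnknownElement) {
      Object real = ((UnknownElement) task).getRealThing();
      if (real instanceof Task) task = (Task) real;
    }
    if (task instanceof Javac) {
      Javac javac = (Javac) task;
      // The inputs a POM template needs are all captured on the task:
      System.out.println("srcdir    = " + javac.getSrcdir());
      System.out.println("destdir   = " + javac.getDestdir());
      System.out.println("classpath = " + javac.getClasspath());
    }
  }
  // the remaining BuildListener hooks are no-ops for this sketch
  public void buildStarted(BuildEvent event) {}
  public void buildFinished(BuildEvent event) {}
  public void targetStarted(BuildEvent event) {}
  public void targetFinished(BuildEvent event) {}
  public void taskStarted(BuildEvent event) {}
  public void messageLogged(BuildEvent event) {}
}
{noformat}

Registered with {{ant -listener PomTemplateListener ...}}, this prints the 
per-module compile inputs; mapping each classpath entry back to a maven 
groupId/artifactId/version (via the jar manifest or embedded POM, as above) is 
the remaining work.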

 Auto-generate POM templates from Ant builds
 ---

 Key: LUCENE-2657
 URL: https://issues.apache.org/jira/browse/LUCENE-2657
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Build
Reporter: Steven Rowe
Priority: Minor
 Fix For: 3.1, 4.0


 Lucene and Solr modules' POM templates are manually maintained, and so are 
 not always in sync with the dependencies used by the Ant build. 
 It should be possible to auto-generate POM templates using build tools 
 extending Ant's 
 [BuildListener|http://api.dpml.net/ant/1.6.5/org/apache/tools/ant/BuildListener.html]
  interface, similarly to how the 
 [ant2ide|http://gleamynode.net/articles/2234/] project generates eclipse 
 project files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-09-20 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912497#action_12912497
 ] 

Jason Rutherglen commented on SOLR-1301:


Alexander,

I think we'll need to use Hadoop's Mini Cluster in order to have a proper unit 
test.  Adding Jetty as a dependency shouldn't be too much of a problem as Solr 
already includes a small version of Jetty?  That being said, it doesn't mean 
it's fun to write the unit test.  I can assist if needed.

 Solr + Hadoop
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Andrzej Bialecki 
 Fix For: Next

 Attachments: commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop.patch, log4j-1.2.15.jar, README.txt, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301-hadoop-0-20.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SolrRecordWriter.java


 This patch contains  a contrib module that provides distributed indexing 
 (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 SolrOutputFormat consumes data produced by reduce tasks directly, without 
 storing it in intermediate files. Furthermore, by using an 
 EmbeddedSolrServer, the indexing task is split into as many parts as there 
 are reducers, and the data to be indexed is not sent over the network.
 Design
 --
 Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
 which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
 instantiates an EmbeddedSolrServer, and it also instantiates an 
 implementation of SolrDocumentConverter, which is responsible for turning 
 Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
 batch, which is periodically submitted to EmbeddedSolrServer. When the reduce 
 task completes and the OutputFormat is closed, SolrRecordWriter calls 
 commit() and optimize() on the EmbeddedSolrServer.
 The API provides facilities to specify an arbitrary existing solr.home 
 directory, from which the conf/ and lib/ files will be taken.
 This process results in the creation of as many partial Solr home directories 
 as there were reduce tasks. The output shards are placed in the output 
 directory on the default filesystem (e.g. HDFS). Such part-N directories 
 can be used to run N shard servers. Additionally, users can specify the 
 number of reduce tasks, in particular 1 reduce task, in which case the output 
 will consist of a single shard.
 An example application is provided that processes large CSV files and uses 
 this API. It uses custom CSV processing to avoid (de)serialization overhead.
 This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
 issue; you should put it in contrib/hadoop/lib.
 Note: the development of this patch was sponsored by an anonymous contributor 
 and approved for release under Apache License.
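
A minimal sketch of the batching flow described above (illustrative names, 
assuming SolrJ's EmbeddedSolrServer; not the actual patch code):

{noformat}
import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.common.SolrInputDocument;

/** Accumulates converted documents and submits them in batches. */
class BatchingSolrWriter {
  private static final int BATCH_SIZE = 1000; // illustrative setting
  private final SolrServer server;            // e.g. an EmbeddedSolrServer
  private final List<SolrInputDocument> batch =
      new ArrayList<SolrInputDocument>();

  BatchingSolrWriter(SolrServer server) { this.server = server; }

  /** Called once per (key, value) pair produced by a reduce task. */
  void write(String key, String value) throws Exception {
    SolrInputDocument doc = new SolrInputDocument(); // converter output
    doc.addField("id", key);
    doc.addField("text", value);
    batch.add(doc);
    if (batch.size() >= BATCH_SIZE) flush();
  }

  private void flush() throws Exception {
    server.add(batch); // periodic submit to the embedded server
    batch.clear();
  }

  /** Called when the OutputFormat is closed at the end of the task. */
  void close() throws Exception {
    flush();
    server.commit();
    server.optimize();
  }
}
{noformat}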

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: discussion about release frequency.

2010-09-20 Thread Steven A Rowe
On 9/20/2010 at 11:15 AM, Grant Ingersoll wrote:
 Unfortunately, LUCENE-2611 does not automatically generate IntelliJ
 setup files - they are static, just like the POM template files.
 
 Hmm, hadn't looked that closely.  I'd say this is going to suffer the same
 fate of the POM template files then and would thus be against including
 it.

It's not quite as bad as the POM template files, since IntelliJ can be told to 
find all dependencies in a directory, rather than explicitly naming every 
dependency, and LUCENE-2611 uses that facility just about everywhere (I think 
the only exception is the JUnit jar test dependency, since the other stuff in 
the same directory shouldn't necessarily be depended on during testing).

So the IntelliJ project files in LUCENE-2611 would continue to work without 
manual intervention in the face of upgraded and/or additional dependencies, but 
would require manual effort to sync up with structural changes.  While I don't 
agree that this is a deal-breaker, since the manual intervention required would 
be fairly minimal, I agree that auto-generation would be a lot more useful than 
the current static approach.

My thought process was that setting this up manually would provide a benchmark 
for auto-generation; the auto-generated version should not be less functional 
than the manually generated one.

Steve



Re: discussion about release frequency.

2010-09-20 Thread Ryan McKinley
On Mon, Sep 20, 2010 at 12:01 PM, Robert Muir rcm...@gmail.com wrote:


 On Mon, Sep 20, 2010 at 10:29 AM, Yonik Seeley yo...@lucidimagination.com
 wrote:

 With what part?  Do you mean to say you wish to make maven a required
 part of our releases?
 If so, perhaps you should call a vote?


 It sounds like maybe we should

I'm not sure it would be useful yet.  There is consensus that the
process needs to improve.  The only concrete 'vote' i could imagine
now is to drop maven.


ryan

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Robert Muir
On Mon, Sep 20, 2010 at 12:29 PM, Ryan McKinley ryan...@gmail.com wrote:


 I'm not sure it would be useful yet.  There is consensus that the
 process needs to improve.  The only concrete 'vote' i could imagine
 now is to drop maven.


I completely agree the process needs to improve, but at the end of the day,
if we are planning to support maven officially in releases, i think we
should vote on it becoming part of the actual release process.

So maybe it's premature to vote on this part, but at the same time, I have
concerns about what it would take to 'fully support' maven.

For example, if we have to reorganize our source tree to what it wants
(src/main/java, src/test/java), and rename our artifacts to what it wants
(-SNAPSHOT, etc), this is pretty important. what else might maven 'require'?

it's also my understanding that in the past, when maven is upgraded (e.g.
Maven 2), it might require you to modify your project in ways such as this
to fit its new needs.

From what I know of maven, it's quite inflexible about such things, and I
want to know what i'm getting into before we claim to 'make maven a first
class citizen'.

-- 
Robert Muir
rcm...@gmail.com


[jira] Commented: (SOLR-1722) Allowing changing the special default core name, and as a default default core name, switch to using collection1 rather than DEFAULT_CORE

2010-09-20 Thread Ephraim Ofir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912573#action_12912573
 ] 

Ephraim Ofir commented on SOLR-1722:


Tried using the defaultCoreName attribute on a 2 core setup.  After performing 
a swap, my solr.xml no longer contains the defaultCoreName attribute, and the 
core which was default is now renamed to , so after a restart of the process I 
can't access it by its former name and can't perform other operations on it 
such as rename, reload or swap...
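
A 2-core setup of roughly this shape (a sketch, not the exact config; 
defaultCoreName on <cores> is the attribute this patch adds):

{noformat}
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="core0">
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
{noformat}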

 Allowing changing the special default core name, and as a default default 
 core name, switch to using collection1 rather than DEFAULT_CORE
 ---

 Key: SOLR-1722
 URL: https://issues.apache.org/jira/browse/SOLR-1722
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 1.5, 3.1, 4.0

 Attachments: SOLR-1722.patch, SOLR-1722.patch


 see 
 http://search.lucidimagination.com/search/document/f5f2af7c5041a79e/default_core

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: discussion about release frequency.

2010-09-20 Thread Uwe Schindler
If somebody reorders the directory structure, I will shout “revert revert 
revert” ☺

 

We can only “fully” support maven by switching to maven, but most of the core 
committers don’t want this (including me). In my opinion, the approach we had 
was fine, to simply create the jar files as we do for the binary release, but 
add some (hopefully) automatically generated pom files to it.

 

One thing I don’t like in this release process (as it currently works) is 
non-repeatable maven artifact generation. With maven, it’s impossible to 
regenerate the JAR files with the *same* MD5, even the MD5’s of the jar files 
in the binary release zip are different than the maven ones. If repeatability 
is not possible, at least the JAR files in the –bin.zip should be identical to 
the maven released ones!

 

Uwe

 

-

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de/

eMail: u...@thetaphi.de

 

From: Robert Muir [mailto:rcm...@gmail.com] 
Sent: Monday, September 20, 2010 9:35 AM
To: dev@lucene.apache.org
Subject: Re: discussion about release frequency.

 

 

On Mon, Sep 20, 2010 at 12:29 PM, Ryan McKinley ryan...@gmail.com wrote:

 

I'm not sure it would be useful yet.  There is consensus that the
process needs to improve.  The only concrete 'vote' i could imagine
now is to drop maven.

 

I completely agree the process needs to improve, but at the end of the day, if 
we are planning to support maven officially in releases, i think we should vote 
on it becoming part of the actual release process.

 

So maybe it's premature to vote on this part, but at the same time, I have 
concerns about what it would take to 'fully support' maven.

For example, if we have to reorganize our source tree to what it wants 
(src/main/java, src/test/java), and rename our artifacts to what it wants 
(-SNAPSHOT, etc), this is pretty important. what else might maven 'require'?

it's also my understanding that in the past, when maven is upgraded (e.g. Maven 
2), it might require you to modify your project in ways such as this to fit its 
new needs.

From what I know of maven, it's quite inflexible about such things, and I want 
to know what i'm getting into before we claim to 'make maven a first class 
citizen'.

 

-- 
Robert Muir
rcm...@gmail.com



[jira] Commented: (SOLR-1722) Allowing changing the special default core name, and as a default default core name, switch to using collection1 rather than DEFAULT_CORE

2010-09-20 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912588#action_12912588
 ] 

Mark Miller commented on SOLR-1722:
---

Thanks for the report, Ephraim - could you make a new issue for this bug?

 Allowing changing the special default core name, and as a default default 
 core name, switch to using collection1 rather than DEFAULT_CORE
 ---

 Key: SOLR-1722
 URL: https://issues.apache.org/jira/browse/SOLR-1722
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 1.5, 3.1, 4.0

 Attachments: SOLR-1722.patch, SOLR-1722.patch


 see 
 http://search.lucidimagination.com/search/document/f5f2af7c5041a79e/default_core

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-1488) multilingual analyzer based on icu

2010-09-20 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1488:
--

Fix Version/s: 3.1

 multilingual analyzer based on icu
 --

 Key: LUCENE-1488
 URL: https://issues.apache.org/jira/browse/LUCENE-1488
 Project: Lucene - Java
  Issue Type: New Feature
  Components: contrib/analyzers
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: ICUAnalyzer.patch, LUCENE-1488.patch, LUCENE-1488.patch, 
 LUCENE-1488.patch, LUCENE-1488.patch, LUCENE-1488.txt, LUCENE-1488.txt


 The standard analyzer in lucene is not exactly unicode-friendly with regards 
 to breaking text into words, especially with respect to non-alphabetic 
 scripts.  This is because it is unaware of unicode bounds properties.
 I actually couldn't figure out how the Thai analyzer could possibly be 
 working until i looked at the jflex rules and saw that codepoint range for 
 most of the Thai block was added to the alphanum specification. defining the 
 exact codepoint ranges like this for every language could help with the 
 problem but you'd basically be reimplementing the bounds properties already 
 stated in the unicode standard. 
 in general it looks like this kind of behavior is bad in lucene for even 
 latin, for instance, the analyzer will break words around accent marks in 
 decomposed form. While most latin letter + accent combinations have composed 
 forms in unicode, some do not. (this is also an issue for asciifoldingfilter 
 i suppose). 
 I've got a partially tested standardanalyzer that uses icu Rule-based 
 BreakIterator instead of jflex. Using this method you can define word 
 boundaries according to the unicode bounds properties. After getting it into 
 some good shape i'd be happy to contribute it for contrib but I wonder if 
 theres a better solution so that out of box lucene will be more friendly to 
 non-ASCII text. Unfortunately it seems jflex does not support use of these 
 properties such as [\p{Word_Break = Extend}] so this is probably the major 
 barrier.
 Thanks,
 Robert
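
A minimal sketch of the ICU approach (word segmentation driven by the Unicode 
boundary properties; assumes ICU4J on the classpath, and the filtering here is 
deliberately crude):

{noformat}
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import com.ibm.icu.text.BreakIterator;

public class UnicodeWordSplitter {
  /** Splits text on Unicode word boundaries instead of jflex rules. */
  public static List<String> words(String text) {
    BreakIterator bi = BreakIterator.getWordInstance(Locale.ROOT);
    bi.setText(text);
    List<String> out = new ArrayList<String>();
    int start = bi.first();
    for (int end = bi.next(); end != BreakIterator.DONE;
         start = end, end = bi.next()) {
      String seg = text.substring(start, end);
      // keep only segments containing at least one letter or digit
      for (int i = 0; i < seg.length(); i++) {
        if (Character.isLetterOrDigit(seg.charAt(i))) {
          out.add(seg);
          break;
        }
      }
    }
    return out;
  }
}
{noformat}

This handles e.g. Thai text and decomposed Latin without per-language 
codepoint ranges, because the boundary rules come from the Unicode data itself.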

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley

2010-09-20 Thread Yonik Seeley
On Mon, Sep 20, 2010 at 1:01 PM, Grant Ingersoll gsing...@apache.org wrote:
 - ''Note: You need committer rights to create a new Lucene release.''
 + This page is to help a Lucene/Solr committer create a new release (you
 need committer rights for some of the steps to create an official release).
 It does not reflect official release policy - many of the items may be
 optional, or may be modified as necessary.

 I think putting this up on the wiki is a bad idea.  We should strive to have
 a repeatable release process.  By saying it is up to the person who happens
 to be doing the release is just asking for less quality in our releases.  If
 you don't think you can follow the release process, then you shouldn't be
 doing the release.  And, if we as a community can't define a repeatable
 release process, then we shouldn't have a release either.

Calling something that anyone can go and edit and add their best ideas
to "official" is silly.
It does not list iron-clad requirements - it is there simply to help.
That's pretty obvious by looking at the huge list of content on that
page.  I'd rather spend my time writing code and improving the
projects rather than engaging in bureaucratic exercises.

-Yonik

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley

2010-09-20 Thread Robert Muir
sounds like we might need to create an official release policy, vote on it,
and commit it.

On Mon, Sep 20, 2010 at 1:07 PM, Yonik Seeley yo...@lucidimagination.comwrote:

 On Mon, Sep 20, 2010 at 1:01 PM, Grant Ingersoll gsing...@apache.org
 wrote:
  - ''Note: You need committer rights to create a new Lucene release.''
  + This page is to help a Lucene/Solr committer create a new release (you
  need committer rights for some of the steps to create an official
 release).
  It does not reflect official release policy - many of the items may be
  optional, or may be modified as necessary.
 
  I think putting this up on the wiki is a bad idea.  We should strive to
 have
  a repeatable release process.  By saying it is up to the person who
 happens
  to be doing the release is just asking for less quality in our releases.
  If
  you don't think you can follow the release process, then you shouldn't be
  doing the release.  And, if we as a community can't define a repeatable
  release process, then we shouldn't have a release either.

 Calling something that anyone can go and edit and add their best ideas
 to "official" is silly.
 It does not list iron-clad requirements - it is there simply to help.
 That's pretty obvious by looking at the huge list of content on that
 page.  I'd rather spend my time writing code and improving the
 projects rather than engaging in bureaucratic exercises.

 -Yonik

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




-- 
Robert Muir
rcm...@gmail.com


[jira] Resolved: (LUCENE-2656) If tests fail, don't report about unclosed resources

2010-09-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-2656.
-

  Assignee: Robert Muir
Resolution: Fixed

Committed revision 999016, 999021 (3x)

 If tests fail, don't report about unclosed resources
 

 Key: LUCENE-2656
 URL: https://issues.apache.org/jira/browse/LUCENE-2656
 Project: Lucene - Java
  Issue Type: Test
  Components: Tests
Affects Versions: 3.1, 4.0
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2656.patch


 LuceneTestCase ensures in afterClass() if you closed all your directories, 
 which in turn will check if you have closed any open files.
 This is good, as a test will fail if we have resource leaks.
 But if a test truly fails, this is just confusing, because it's usually not 
 going to make it to the part of its code where it would call .close().
 So, if any tests fail, I think we should omit this check in afterClass().
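
The shape of the change is roughly this (names illustrative, not the committed 
patch):

{noformat}
import org.junit.AfterClass;

public abstract class LuceneTestCaseSketch {
  /** illustrative: set to true when any test in the class fails */
  protected static boolean testsFailed;

  @AfterClass
  public static void afterClassCheck() {
    // A failed test usually never reaches its own close() calls, so an
    // "unclosed resource" report on top of the failure is just noise.
    if (!testsFailed) {
      assertNoUnclosedDirectories(); // placeholder for the existing check
    }
  }

  private static void assertNoUnclosedDirectories() {
    // stand-in for LuceneTestCase's real directory/open-file check
  }
}
{noformat}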

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 1:07 PM, Yonik Seeley wrote:

 On Mon, Sep 20, 2010 at 1:01 PM, Grant Ingersoll gsing...@apache.org wrote:
 - ''Note: You need committer rights to create a new Lucene release.''
 + This page is to help a Lucene/Solr committer create a new release (you
 need committer rights for some of the steps to create an official release).
 It does not reflect official release policy - many of the items may be
 optional, or may be modified as necessary.
 
 I think putting this up on the wiki is a bad idea.  We should strive to have
 a repeatable release process.  By saying it is up to the person who happens
 to be doing the release is just asking for less quality in our releases.  If
 you don't think you can follow the release process, then you shouldn't be
 doing the release.  And, if we as a community can't define a repeatable
 release process, then we shouldn't have a release either.
 
 Calling something that anyone can go and edit and add their best ideas
 to "official" is silly.

Fine, let's lock it down then.

 It does not list iron-clad requirements - it is there simply to help.

Again, I disagree.  Having done a number of releases, it would simply be 
impossible without it, no matter how long the list is.  Unless, of course, all 
you want is the release to be the source, but even that is in doubt b/c how 
would I know where to upload it to?  For instance, how do you know which Ant 
target really gets you the right thing to distribute?

 That's pretty obvious by looking at the huge list of content on that
 page.  I'd rather spend my time writing code and improving the
 projects rather than engaging in bureaucratic exercises.

Well, part of an improved project is a release that people can consistently 
rely on.  If there is too much chaff in the current release, fine, let's get 
rid of it or automate it.  However, to suggest that a written out release 
process is not needed or is subject to whatever the RM wants is just plain 
ludicrous.  Are you really arguing that we, the writers of a massively used and 
deployed open source library, should have a release process that is subject to 
the whims of whoever happens to be doing it on that given day?  Regardless of 
whether you want to or not, we as a community need to make sure the 
community can rely on the results of us writing the code.

-Grant
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley

2010-09-20 Thread Yonik Seeley
On Mon, Sep 20, 2010 at 1:49 PM, Grant Ingersoll gsing...@apache.org wrote:
 On Sep 20, 2010, at 1:07 PM, Yonik Seeley wrote:
 It does not list iron-clad requirements - it is there simply to help.

 Again, I disagree.  Having done a number of releases, it would simply be 
 impossible without it

Usefulness certainly does not imply officialness and certainly does
not imply that everything on there is mandatory.
We've never needed anything quite so iron-clad in the past - we were
able to use our judgment to adapt as necessary.  And individuals went
and updated that page with helpful things because no one was under the
impression that anything there was binding.

-Yonik

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: discussion about release frequency.

2010-09-20 Thread karl.wright
“but again, i have serious questions about maven in general.”

Maybe you just need to drink the Maven Koolaid.  Unless they have something 
stronger… ;-)

Karl


From: ext Robert Muir [mailto:rcm...@gmail.com]
Sent: Monday, September 20, 2010 1:08 PM
To: dev@lucene.apache.org
Subject: Re: discussion about release frequency.


On Mon, Sep 20, 2010 at 12:54 PM, Uwe Schindler u...@thetaphi.de wrote:
If somebody reorders the directory structure, I will shout “revert revert 
revert” ☺

I wouldn't shout revert revert revert if by renaming stuff from src/java to 
src/main/java etc, Grant's idea would work, in that we still use ant for our 
build, but we have some way to automagically generate IDE configuration files 
for eclipse, idea, netbeans, emacs, whatever, via some maven tool.

If this was the benefit, and the tradeoff being more difficult merging, and 
having to ignore some path segments on existing patches, I might consider it 
worth the cost.

but again, i have serious questions about maven in general. for example, what 
if I wanted to add/modify a contrib that depends on a library that is not 
mavenized?   Is it my responsibility to mavenize that dependency, too? Does 
it make the release artifact invalid? is it a valid reason against adding that 
contrib, since its dependencies are not all mavenized?

the fact that maven acts like a computer virus, but requires special things of 
its hosts, means that i am pretty hesitant to vote for full support of it 
without knowing exactly what the tradeoffs are.

--
Robert Muir
rcm...@gmail.com


Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 2:04 PM, Yonik Seeley wrote:

 On Mon, Sep 20, 2010 at 1:49 PM, Grant Ingersoll gsing...@apache.org wrote:
 On Sep 20, 2010, at 1:07 PM, Yonik Seeley wrote:
 It does not list iron-clad requirements - it is there simply to help.
 
 Again, I disagree.  Having done a number of releases, it would simply be 
 impossible without it
 
 Usefulness certainly does not imply officialness and certainly does
 not imply that everything on there is mandatory.
 We've never needed anything quite so iron-clad in the past - we were
 able to use our judgment to adapt as necessary.  And individuals went
 and updated that page with helpful things because no one was under the
 impression that anything there was binding.
 

Of course it makes sense for it to be updatable to reflect that things change, 
servers get moved, ant targets get improved, but your message, on the heels of 
the Maven discussion, was interpreted by me (and please correct me if I'm 
wrong) to presume that you are saying that it is alright for the RM to decide 
what artifacts should be released.  So, if that's not the case, then fine, I 
agree, but if it is, then no, I don't think this is the right message to put on 
the page.  And it certainly isn't up to you alone to decide by placing it on 
the Wiki as a trivial update.

-Grant


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley

2010-09-20 Thread Yonik Seeley
On Mon, Sep 20, 2010 at 2:16 PM, Grant Ingersoll gsing...@apache.org wrote:
 And it certainly isn't up to you alone to decide by placing it on the Wiki as 
a trivial update.

Most of the updates to that page were made w/o consensus, just as mine
was.  It's a guide - nothing more.
Again, if you feel differently, point to where we voted on that as
official policy, or call a vote to make it official policy.

-Yonik

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 1:07 PM, Robert Muir wrote:

 
 
 On Mon, Sep 20, 2010 at 12:54 PM, Uwe Schindler u...@thetaphi.de wrote:
 If somebody reorders the directory structure, I will shout “revert revert 
 revert” ☺
 
 
 I wouldn't shout revert revert revert if by renaming stuff from src/java to 
 src/main/java etc, Grant's idea would work, in that we still use ant for our 
 build, but we have some way to automagically generate IDE configuration files 
 for eclipse, idea, netbeans, emacs, whatever, via some maven tool.
 
 If this was the benefit, and the tradeoff being more difficult merging, and 
 having to ignore some path segments on existing patches, I might consider it 
 worth the cost.
 
 but again, i have serious questions about maven in general. for example, what 
 if I wanted to add/modify a contrib that depends on a library that is not 
 mavenized?   Is it my responsibility to mavenize that dependency, too? 
 Does it make the release artifact invalid? is it a valid reason against 
 adding that contrib, since its dependencies are not all mavenized?

Typically, this is done by adding the library in question to the release, 
renamed appropriately.  For instance, in Solr, we had a trunk based version of 
Commons CSV at one point, so we put it up w/ the Solr artifacts and had the POM 
reflect that.  But yeah, it can be a pain.

 
 the fact that maven acts like a computer virus, but requires special things 
 of its hosts, means that i am pretty hesitant to vote for full support of 
 it without knowing exactly what the tradeoffs are.

I'm not saying we have to support it, but, in my view, it's pretty hard to take 
back a feature, admittedly only for some, that we have supported for a long 
time.



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 2:21 PM, Yonik Seeley wrote:

 On Mon, Sep 20, 2010 at 2:16 PM, Grant Ingersoll gsing...@apache.org wrote:
  And it certainly isn't up to you alone to decide by placing it on the Wiki 
 as a trivial update.
 
 Most of the updates to that page were made w/o consensus, just as mine
 was.  

You know there is a difference.  In the past, updates were made to the steps 
involved and subsequent RM's went and followed them or improved them.  Your 
update was to say throw all that work out, if you so desire, and do what you 
want.  While, yes, I will agree it is not official, it is the de facto standard 
by which we have done releases and RM's have always worked to it.  So, yes, we 
can argue the semantics of a wiki page, but the intent of that page, IMO, is 
that the RM follow it and that has, AFAICT, always been how RMs have acted when 
doing releases.

-Grant
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Robert Muir
On Mon, Sep 20, 2010 at 2:31 PM, Grant Ingersoll gsing...@apache.orgwrote:


 Typically, this is done by adding the library in question to the release,
 renamed appropriately.  For instance, in Solr, we had a trunk based version
 of Commons CSV at one point, so we put it up w/ the Solr artifacts and had
 the POM reflect that.  But yeah, it can be a pain.


I don't understand this, if I, as a lucene committer, can arbitrarily
publish commons CSV artifacts under maven, without being a commons CSV
committer, then why does someone have to be a lucene committer to publish
maven artifacts?!

Furthermore, if this is possible, then why does lucene itself have to
support maven, if someone else (e.g. hibernate) can simply download our jar
files and do the same?


 I'm not saying we have to support it, but, in my view, it's pretty hard to
 take back a feature, admittedly only for some, that we have supported for a
 long time.


I'm not sure we supported it, it seems to be a broken feature in nearly
every release.

-- 
Robert Muir
rcm...@gmail.com


Re: discussion about release frequency.

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 2:37 PM, Robert Muir wrote:

 
 
 On Mon, Sep 20, 2010 at 2:31 PM, Grant Ingersoll gsing...@apache.org wrote:
 
 Typically, this is done by adding the library in question to the release, 
 renamed appropriately.  For instance, in Solr, we had a trunk based version 
 of Commons CSV at one point, so we put it up w/ the Solr artifacts and had 
 the POM reflect that.  But yeah, it can be a pain.
 
 I don't understand this, if I, as a lucene committer, can arbitrarily publish 
 commons CSV artifacts under maven, without being a commons CSV committer, 
 then why does someone have to be a lucene committer to publish maven 
 artifacts?!

It's under the Solr area, not the commons CSV area.  

 
 Furthermore, if this is possible, then why does lucene itself have to support 
 maven, if someone else (e.g. hibernate) can simply download our jar files and 
 do the same?
 
 
 I'm not saying we have to support it, but, in my view, it's pretty hard to 
 take back a feature, admittedly only for some, that we have supported for a 
 long time.
 
 
 I'm not sure we supported it, it seems to be a broken feature in nearly every 
 release. 

Nah, sometimes some pieces are broken, but the core one always works, AFAICT.  
;-)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Robert Muir
On Mon, Sep 20, 2010 at 2:43 PM, Grant Ingersoll gsing...@apache.orgwrote:


 It's under the Solr area, not the commons CSV area.


sure, but this doesn't answer the question. if other projects can do this
same trick, why do we need to do any maven at all? we can just let those
that want maven support provide it themselves. Ultimately this would
probably mean they do a better job of it anyway, since they care about it
working for their project.

-- 
Robert Muir
rcm...@gmail.com


Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 2:46 PM, Yonik Seeley wrote:

 On Mon, Sep 20, 2010 at 2:36 PM, Grant Ingersoll gsing...@apache.org wrote:
 On Sep 20, 2010, at 2:21 PM, Yonik Seeley wrote:
 On Mon, Sep 20, 2010 at 2:16 PM, Grant Ingersoll gsing...@apache.org 
 wrote:
 While, yes, I will agree it is not official, it is the de facto standard by 
 which we have done releases and RM's have always worked to it.
 
 I'd wager that there has never been a single lucene or solr release
 that followed every single instruction to the T.  Which means that
 people need to use their heads and understand that many of the items
 may be optional, or may be modified as necessary.
 
 You can't point at the guide as a *reason* to do something, only *how*
 to do something.  If I knew someone would point to it and say you
 must do XYZ because it's on that HOWTO then I would have vetoed most
 changes to that page.


As I have said for the 3rd time, of course I get that people need to be 
flexible and there has always been an implied use your head.  But, as I said, 
given you wrote it on the heels of the discussion around Maven and that you 
think we shouldn't publish Maven artifacts, I think it is clear you intend it 
to imply that the RM gets to choose what artifacts are released.  Is that not 
the case?
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 2:47 PM, Robert Muir wrote:

 
 
 On Mon, Sep 20, 2010 at 2:43 PM, Grant Ingersoll gsing...@apache.org wrote:
 
 It's under the Solr area, not the commons CSV area.
 
 sure, but this doesn't answer the question. if other projects can do this 
 same trick, why do we need to do any maven at all? we can just let those that 
 want maven support provide it themselves. Ultimately this would probably 
 mean they do a better job of it anyway, since they care about it working for 
 their project.

Not following.  Joe Schmoe w/ project X doesn't have the right to go publish 
artifacts at org.apache.lucene.XXX in the iBiblio repository.  And, in many 
cases, we may not have the right to publish others, but for Apache projects, we 
can.  Otherwise, in the past, I've often asked the dependency authors to 
produce them.  Most people will if it means they are getting a wider 
distribution.  In practice, it rarely is an issue.

-Grant



Re: discussion about release frequency.

2010-09-20 Thread Robert Muir
On Mon, Sep 20, 2010 at 3:30 PM, Grant Ingersoll gsing...@apache.orgwrote:


 Not following.  Joe Schmoe w/ project X doesn't have the right to go
 publish artifacts at org.apache.lucene.XXX in the iBiblio repository.  And,
 in many cases, we may not have the right to publish others, but for Apache
 projects, we can.  Otherwise, in the past, I've often asked the dependency
 authors to produce them.  Most people will if it means they are getting a
 wider distribution.  In practice, it rarely is an issue.


right but why can't joe shmoe make joe.schmoe.luceneMaven.XXX in the iBiblio
repository?

At the end of the day, I'm trying to figure out if we can push maven
downstream as others have suggested, and it sounds like we can.

-- 
Robert Muir
rcm...@gmail.com


Re: discussion about release frequency.

2010-09-20 Thread Mark Miller
On 9/20/10 3:36 PM, Robert Muir wrote:
 
 right but why can't joe shmoe make joe.schmoe.luceneMaven.XXX in the
 iBiblio repository?
 

That sounds enticing - someone else can step up to be the authority.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley

2010-09-20 Thread Yonik Seeley
On Mon, Sep 20, 2010 at 3:27 PM, Grant Ingersoll gsing...@apache.org wrote:

 On Sep 20, 2010, at 2:46 PM, Yonik Seeley wrote:

 On Mon, Sep 20, 2010 at 2:36 PM, Grant Ingersoll gsing...@apache.org wrote:
 On Sep 20, 2010, at 2:21 PM, Yonik Seeley wrote:
 On Mon, Sep 20, 2010 at 2:16 PM, Grant Ingersoll gsing...@apache.org 
 wrote:
 While, yes, I will agree it is not official, it is the de facto standard by 
 which we have done releases and RM's have always worked to it.

 I'd wager that there has never been a single lucene or solr release
 that followed every single instruction to the T.  Which means that
 people need to use their heads and understand that many of the items
 may be optional, or may be modified as necessary.

 You can't point at the guide as a *reason* to do something, only *how*
 to do something.  If I knew someone would point to it and say you
 must do XYZ because it's on that HOWTO then I would have vetoed most
 changes to that page.

 As I have said for the 3rd time, of course I get that people need to be 
 flexible and there has always been an implied use your head.  But, as I 
 said, given you wrote it on the heels of the discussion around Maven and that 
 you think we shouldn't publish Maven artifacts, I think it is clear you 
 intend it to imply that the RM gets to choose what artifacts are released.  Is 
 that not the case?

IMO, the RM has no more power than any other PMC member.  But when
there are a lot of optional things on the list... I guess the
volunteers doing the work get to decide what parts they want to do.
The PMC as a whole gets to decide to release artifacts or not.

I am also re-asserting (as I have asserted in the past) that the Maven
artifacts are *optional*.
We've discussed maven not being mandatory before:
http://search.lucidimagination.com/search/document/bd618c89a4d458dc/lucene_2_9_again
http://search.lucidimagination.com/search/document/3b98fa9ec3073936

-Yonik

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 3:36 PM, Robert Muir wrote:

 
 
 On Mon, Sep 20, 2010 at 3:30 PM, Grant Ingersoll gsing...@apache.org wrote:
 
 Not following.  Joe Schmoe w/ project X doesn't have the right to go publish 
 artifacts at org.apache.lucene.XXX in the iBiblio repository.  And, in many 
 cases, we may not have the right to publish others, but for Apache projects, 
 we can.  Otherwise, in the past, I've often asked the dependency authors to 
 produce them.  Most people will if it means they are getting a wider 
 distribution.  In practice, it rarely is an issue.
 
 
 right but why can't joe shmoe make joe.schmoe.luceneMaven.XXX in the iBiblio 
 repository?
 
 At the end of the day, I'm trying to figure out if we can push maven 
 downstream as others have suggested, and it sounds like we can.
 

Why don't we just leave this as this:

Those of us who want Maven supported as part of the release need to get our 
stuff together by the next release or else it will be dropped.  That means 
making sure the artifacts are correct and easily testable/reproducible.  If we 
can't do that, then I agree, it should be a downstream effort, at least until 
we all realize how many people actually use it and then we revisit it at the 
next release.

-Grant



Re: discussion about release frequency.

2010-09-20 Thread Robert Muir
On Mon, Sep 20, 2010 at 3:46 PM, Grant Ingersoll gsing...@apache.orgwrote:


 Why don't we just leave this as this:

 Those of us who want Maven supported as part of the release need to get our
 stuff together by the next release or else it will be dropped.  That means
 making sure the artifacts are correct and easily testable/reproducible.  If
 we can't do that, then I agree, it should be a downstream effort, at least
 until we all realize how many people actually use it and then we revisit it
 at the next release.


But I'm not sure this is the best solution? If we can push this downstream,
so that the release manager has less to worry about (even with testable
artifacts etc, the publication etc), why wouldn't we do that instead?

-- 
Robert Muir
rcm...@gmail.com


[jira] Created: (LUCENE-2658) TestIndexWriterExceptions random failure: AIOOBE in ByteBlockPool.allocSlice

2010-09-20 Thread Robert Muir (JIRA)
TestIndexWriterExceptions random failure: AIOOBE in ByteBlockPool.allocSlice


 Key: LUCENE-2658
 URL: https://issues.apache.org/jira/browse/LUCENE-2658
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Robert Muir


TestIndexWriterExceptions threw this today, and its reproducable

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2658) TestIndexWriterExceptions random failure: AIOOBE in ByteBlockPool.allocSlice

2010-09-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-2658:


Attachment: LUCENE-2658_environment.patch

attached are my current modifications to trunk (completely unrelated to this 
failure)

this is because i have a single test seed that controls all behavior, so i 
want to make sure the random seed i give you will actually work.

if you apply the patch, just run 

ant test-core -Dtestcase=TestIndexWriterExceptions -Dtests.seed=1285011726042

{noformat}

junit-sequential:
[junit] Testsuite: org.apache.lucene.index.TestIndexWriterExceptions
[junit] Testcase: 
testRandomExceptionsThreads(org.apache.lucene.index.TestIndexWriterExceptions): 
  FAILED
[junit] thread Indexer 0: hit unexpected failure
[junit] junit.framework.AssertionFailedError: thread Indexer 0: hit 
unexpected failure
[junit] at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:773)
[junit] at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:746)
[junit] at 
org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:195)
[junit]
[junit]
[junit] Tests run: 2, Failures: 1, Errors: 0, Time elapsed: 1.257 sec
[junit]
[junit] - Standard Output ---
[junit] Indexer 2: unexpected exception3
[junit] java.lang.ArrayIndexOutOfBoundsException: 483
[junit] at 
org.apache.lucene.index.ByteSliceReader.nextSlice(ByteSliceReader.java:108)
[junit] at 
org.apache.lucene.index.ByteSliceReader.writeTo(ByteSliceReader.java:90)
[junit] at 
org.apache.lucene.index.TermVectorsTermsWriterPerField.finish(TermVectorsTermsWriterPerField.java:186)
[junit] at 
org.apache.lucene.index.TermsHashPerField.finish(TermsHashPerField.java:552)
[junit] at 
org.apache.lucene.index.TermsHashPerField.finish(TermsHashPerField.java:554)
[junit] at 
org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:208)
[junit] at 
org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:248)
[junit] at 
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:839)
[junit] at 
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:820)
[junit] at 
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2162)
[junit] at 
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2134)
[junit] at 
org.apache.lucene.index.TestIndexWriterExceptions$IndexerThread.run(TestIndexWriterExceptions.java:98)
[junit] Indexer 0: unexpected exception3
[junit] java.lang.ArrayIndexOutOfBoundsException: 507
[junit] at 
org.apache.lucene.index.ByteSliceReader.nextSlice(ByteSliceReader.java:108)
[junit] at 
org.apache.lucene.index.ByteSliceReader.writeTo(ByteSliceReader.java:90)
[junit] at 
org.apache.lucene.index.TermVectorsTermsWriterPerField.finish(TermVectorsTermsWriterPerField.java:186)
[junit] at 
org.apache.lucene.index.TermsHashPerField.finish(TermsHashPerField.java:552)
[junit] at 
org.apache.lucene.index.TermsHashPerField.finish(TermsHashPerField.java:554)
[junit] at 
org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:208)
[junit] at 
org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:248)
[junit] at 
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:839)
[junit] at 
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:820)
[junit] at 
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2162)
[junit] at 
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2134)
[junit] at 
org.apache.lucene.index.TestIndexWriterExceptions$IndexerThread.run(TestIndexWriterExceptions.java:98)
[junit] Indexer 1: unexpected exception3
[junit] java.lang.ArrayIndexOutOfBoundsException: 15
[junit] at 
org.apache.lucene.index.ByteBlockPool.allocSlice(ByteBlockPool.java:122)
[junit] at 
org.apache.lucene.index.TermsHashPerField.writeByte(TermsHashPerField.java:526)
[junit] at 
org.apache.lucene.index.TermsHashPerField.writeVInt(TermsHashPerField.java:547)
[junit] at 
org.apache.lucene.index.TermVectorsTermsWriterPerField.newTerm(TermVectorsTermsWriterPerField.java:225)
[junit] at 
org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:375)
[junit] at 
org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:513)
[junit] at 

Re: discussion about release frequency.

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 3:49 PM, Robert Muir wrote:

 
 
 On Mon, Sep 20, 2010 at 3:46 PM, Grant Ingersoll gsing...@apache.org wrote:
 
 Why don't we just leave this as this:
 
 Those of us who want Maven supported as part of the release need to get our 
 stuff together by the next release or else it will be dropped.  That means 
 making sure the artifacts are correct and easily testable/reproducible.  If 
 we can't do that, then I agree, it should be a downstream effort, at least 
 until we all realize how many people actually use it and then we revisit it 
 at the next release.
 
 
 But I'm not sure this is the best solution? If we can push this downstream, 
 so that the release manager has less to worry about (even with testable 
 artifacts etc, the publication etc), why wouldn't we do that instead?
 

Because it's not authoritative.  How would our users know which one is the 
official one?  By publishing it under the ASF one with our signatures we are 
saying this is our official version.  We would never claim that the Solr 
Commons CSV one is the official Commons jar, it's just the official one that 
Solr officially uses.  It's a big difference.   Besides, it's not like the 
iBiblio repo is open to anyone.  You have to apply and you have to have 
authority to write to it.  For the ASF, there is a whole sync process whereby 
iBiblio syncs with an ASF version.  In other words, we are the only ones who 
can publish it to the same space where it is currently published.

-Grant



Re: discussion about release frequency.

2010-09-20 Thread Robert Muir
On Mon, Sep 20, 2010 at 4:11 PM, Grant Ingersoll gsing...@apache.orgwrote:


 Because it's not authoritative.  How would our users know which one is the
 official one?  By publishing it under the ASF one with our signatures we are
 saying this is our official version.  We would never claim that the Solr
 Commons CSV one is the official Commons jar, it's just the official one that
 Solr officially uses.  It's a big difference.   Besides, it's not like the
 iBiblio repo is open to anyone.  You have to apply and you have to have
 authority to write to it.  For the ASF, there is a whole sync process
 whereby iBiblio syncs with an ASF version.  In other words, we are the only
 ones who can publish it to the same space where it is currently published.


This authoritativeness comes with a significant cost, namely the
complexity of maven in our release process.  I'm not convinced it's worth
this cost, and before we decide to have maven as part of the release, i'd
like for there to be an actual vote.

Sorry to change my tone, but I was under the impression we needed a lucene
committer to do all this releasing work to support maven; it seems that this
is not the case, and other options are available.

-- 
Robert Muir
rcm...@gmail.com


Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 3:46 PM, Yonik Seeley wrote:

 On Mon, Sep 20, 2010 at 3:27 PM, Grant Ingersoll gsing...@apache.org wrote:
 
 On Sep 20, 2010, at 2:46 PM, Yonik Seeley wrote:
 
 On Mon, Sep 20, 2010 at 2:36 PM, Grant Ingersoll gsing...@apache.org 
 wrote:
 On Sep 20, 2010, at 2:21 PM, Yonik Seeley wrote:
 On Mon, Sep 20, 2010 at 2:16 PM, Grant Ingersoll gsing...@apache.org 
 wrote:
 While, yes, I will agree it is not official, it is the de facto standard 
 by which we have done releases and RM's have always worked to it.
 
 I'd wager that there has never been a single lucene or solr release
 that followed every single instruction to the T.  Which means that
 people need to use their heads and understand that many of the items
 may be optional, or may be modified as necessary.
 
 You can't point at the guide as a *reason* to do something, only *how*
 to do something.  If I knew someone would point to it and say you
 must do XYZ because it's on that HOWTO then I would have vetoed most
 changes to that page.
 
 As I have said for the 3rd time, of course I get that people need to be 
 flexible and there has always been an implied use your head.  But, as I 
 said, given you wrote it on the heels of the discussion around Maven and 
 that you think we shouldn't publish Maven artifacts, I think it is clear you 
 intend it to imply that the RM gets to choose what artifacts are released.  
 Is that not the case?
 
 IMO, the RM has no more power than any other PMC member.  But when
 there are a lot of optional things on the list...

Perhaps you should itemize all the items that are optional and then we can mark 
them as such.  Is uploading the artifacts (maven or not) optional?  Perhaps 
next time I do a release I'll just skip that one.  Is updating the website?  
OK, so I'll give you the FreshMeat and the ServerSide posts, etc.

 I guess the
 volunteers doing the work get to decide what parts they want to do.

I'd agree that there are some things that should be optional, especially the 
post release items.  Some things, however, are not.  Perhaps we should just 
list out what we view as being required and which ones are not.

 The PMC as a whole gets to decide to release artifacts or not.

Of course.  I don't see how that is relevant to the question I asked.

 
 I am also re-asserting (as I have asserted in the past) that the Maven
 artifacts are *optional*.
 We've discussed maven not being mandatory before:
 http://search.lucidimagination.com/search/document/bd618c89a4d458dc/lucene_2_9_again
 http://search.lucidimagination.com/search/document/3b98fa9ec3073936
 

You asserting in previous threads that Maven is optional does not make it 
optional.  AFAICT, we have done them for as long as we have said we would do 
them.  I'm fine with us as a community dropping Maven releases if that is what 
is decided.  I am absolutely not fine with the RM deciding to drop them based 
on what he feels like doing as part of that release.  If you don't have time to 
do the required items, then you shouldn't be an RM.




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: discussion about release frequency.

2010-09-20 Thread Grant Ingersoll

On Sep 20, 2010, at 4:15 PM, Robert Muir wrote:

 
 
 On Mon, Sep 20, 2010 at 4:11 PM, Grant Ingersoll gsing...@apache.org wrote:
 
 Because it's not authoritative.  How would our users know which one is the 
 official one?  By publishing it under the ASF one with our signatures we are 
 saying this is our official version.  We would never claim that the Solr 
 Commons CSV one is the official Commons jar, it's just the official one that 
 Solr officially uses.  It's a big difference.   Besides, it's not like the 
 iBiblio repo is open to anyone.  You have to apply and you have to have 
 authority to write to it.  For the ASF, there is a whole sync process whereby 
 iBiblio syncs with an ASF version.  In other words, we are the only ones who 
 can publish it to the same space where it is currently published.
 
 
 This authoritativeness comes with a significant cost, namely the 
 complexity of maven in our release process.  I'm not convinced it's worth this 
 cost, and before we decide to have maven as part of the release, i'd like for 
 there to be an actual vote. 

I agree.  But, like I said, if those who want it step up and make it fully 
supported, so that there is no more cost than uploading a few extra artifacts, 
then what's the extra cost?  As usual in open source, why don't we just leave 
it to those who do the work?  If no one steps up and fixes it, then it doesn't 
get included.

 
 Sorry to change my tone, but I was under the impression we needed a lucene 
 committer to do all this releasing work to support maven; it seems that this 
 is not the case, and other options are available.
 

I'm sorry, I don't see the other options.  I think it does need to be done by a 
Lucene committer to be an official Lucene artifact.  OK, well, I suppose some 
other ASF person could do it, but short of a benevolent volunteer to do so, I 
don't think there are other options.

-Grant




Re: discussion about release frequency.

2010-09-20 Thread Robert Muir
On Mon, Sep 20, 2010 at 4:20 PM, Grant Ingersoll gsing...@apache.orgwrote:


 I'm sorry, I don't see the other options.  I think it does need to be done
 by a Lucene committer to be an official Lucene artifact.  OK, well, I
 suppose some other ASF person could do it, but short of a benevolent
 volunteer to do so, I don't think there are other options.


I will quote Ryan here: "The artifacts are the identical .jar files put
into a special directory structure."

Therefore, if we release without Maven, the jar files are still signed by our
release key. That is authoritative enough; Maven does check signatures,
correct?

I'm not buying the authoritativeness argument; it seems like any old joker can
take our signed jars and put them in Maven themselves, without us having to
do any work.

-- 
Robert Muir
rcm...@gmail.com


Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley

2010-09-20 Thread Yonik Seeley
On Mon, Sep 20, 2010 at 4:17 PM, Grant Ingersoll gsing...@apache.org wrote:
 On Sep 20, 2010, at 3:46 PM, Yonik Seeley wrote:
 I am also re-asserting (as I have asserted in the past) that the Maven
 artifacts are *optional*.
 We've discussed maven not being mandatory before:
 http://search.lucidimagination.com/search/document/bd618c89a4d458dc/lucene_2_9_again
 http://search.lucidimagination.com/search/document/3b98fa9ec3073936


 You asserting in previous threads that Maven is optional does not make it 
 optional.

I *think* that's a roundabout way of saying that you do think it's
mandatory.  But you've been unable to point to how it became
mandatory... and there seems to be a distinct lack of consensus over
it.  Certainly makes it sound optional.

-Yonik




[jira] Commented: (LUCENE-2613) spatial random test failure (TestCartesian)

2010-09-20 Thread Lee Cooper (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912736#action_12912736
 ] 

Lee Cooper commented on LUCENE-2613:


This is the only reference I can find that is similar to a problem I am 
experiencing.  I am using Lucene 2.9.2, and I am getting the exception below 
when using a GeoHashDistanceFilter.

It works most of the time, but under certain conditions, and sometimes 
intermittently, Lucene throws this exception.

Can someone tell me why the exception might be thrown, and is there anything I 
can do to stop it happening?

I see that this issue is still open; will its resolution solve my problem?

java.lang.IllegalArgumentException: null iterator
at 
org.apache.lucene.search.FilteredDocIdSetIterator.<init>(FilteredDocIdSetIterator.java:38)
at 
org.apache.lucene.search.FilteredDocIdSet$1.<init>(FilteredDocIdSet.java:72)
at 
org.apache.lucene.search.FilteredDocIdSet.iterator(FilteredDocIdSet.java:71)
at 
org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:279)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:254)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:173)
at org.apache.lucene.search.Searcher.search(Searcher.java:181)
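
A likely trigger, worth verifying against 2.9.2: DocIdSet.iterator() is allowed 
to return null to signal "no matching documents", while FilteredDocIdSetIterator 
rejects a null delegate. If that is the cause, a defensive wrapper along these 
lines can serve as a stop-gap (a sketch with the hypothetical name 
NullSafeFilter, not the LUCENE-2613 fix itself):

{code}
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.Filter;

public class NullSafeFilter extends Filter {
  private final Filter delegate;

  public NullSafeFilter(Filter delegate) { this.delegate = delegate; }

  @Override
  public DocIdSet getDocIdSet(IndexReader reader) throws IOException {
    DocIdSet inner = delegate.getDocIdSet(reader);
    // Substitute the shared empty set whenever the wrapped filter signals
    // "no documents" via a null set or a null iterator.
    if (inner == null || inner.iterator() == null) {
      return DocIdSet.EMPTY_DOCIDSET;
    }
    return inner;
  }
}
{code}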

 spatial random test failure (TestCartesian)
 ---

 Key: LUCENE-2613
 URL: https://issues.apache.org/jira/browse/LUCENE-2613
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-2613.patch


 {noformat}
 java.lang.IllegalArgumentException: null iterator
   at 
 org.apache.lucene.search.FilteredDocIdSetIterator.<init>(FilteredDocIdSetIterator.java:38)
   at 
 org.apache.lucene.search.FilteredDocIdSet$1.<init>(FilteredDocIdSet.java:72)
   at 
 org.apache.lucene.search.FilteredDocIdSet.iterator(FilteredDocIdSet.java:72)
   at 
 org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:241)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:216)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:168)
   at 
 org.apache.lucene.spatial.tier.TestCartesian.testGeoHashRange(TestCartesian.java:542)
 {noformat}
 plug in seed of -6954859807298077232L to newRandom to reproduce.
 didn't test to see if it affected 3.x also.




Document links

2010-09-20 Thread mark harwood
I've been looking at graph databases recently (neo4j, OrientDB, InfiniteGraph) 
as a faster alternative to relational stores. I notice they either embed Lucene 
for indexing node properties or (in the case of OrientDB) are talking about 
doing this. 

I think their fundamental performance advantage over relational stores is that 
they don't have to de-reference foreign keys in a b-tree index to get from a 
source node to a target node. Instead they use internally generated IDs that act 
like pointers, giving more-or-less direct references between nodes/vertexes.  As 
a result they can follow links very quickly. This got me thinking: could Lucene 
adopt the idea of creating links between documents that are equally fast, using 
Lucene doc IDs?

Maybe the user API would look something like this...

indexWriter.addLink(fromDocId, toDocId);
DocIdSet inboundLinks = reader.getInboundLinks(docId);
DocIdSet outboundLinks = reader.getOutboundLinks(docId);


Internally a new index file structure would be needed to record link info. Any 
recorded links that connect documents from different segments would need 
careful adjustment of referenced link IDs when segments merge and Lucene doc 
IDs are shuffled.

As well as handling typical graphs (social networks, web data), this could 
potentially be used to support tagging operations, where apps could create tag 
documents and then link them to existing documents being tagged, without having 
to update the target doc (see the sketch below). There are probably a ton of 
applications for this stuff.
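
To make the tagging idea concrete, here is a purely hypothetical sketch: none 
of these link methods exist in Lucene today, addLink/getOutboundLinks are just 
the proposed API from above, and only DocIdSetIterator is the existing contract.

// Create a tag document once, then link it to each document being tagged,
// without rewriting the tagged documents themselves.
void tagAndList(IndexWriter writer, IndexReader reader,
                int tagDocId, int taggedDocId) throws IOException {
  writer.addLink(tagDocId, taggedDocId);                 // proposed API
  // Follow the links back out: visit every doc carrying this tag.
  DocIdSetIterator it = reader.getOutboundLinks(tagDocId).iterator();
  for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS;
       doc = it.nextDoc()) {
    System.out.println("tagged doc: " + doc);
  }
}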

I see the graph DBs busy recreating transactional support, indexes, segment 
merging, etc., and it seems to me that Lucene has a pretty good head start on 
this stuff.
Anyone else think this might be an area worth exploring?

Cheers
Mark







[jira] Commented: (SOLR-1568) Implement Spatial Filter

2010-09-20 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912780#action_12912780
 ] 

Bill Bell commented on SOLR-1568:
-

The calculation of distance appears to be off.

Note: The radius of the sphere to be used when calculating distances on a 
sphere (i.e. haversine). Default is the Earth's mean radius in kilometers (see 
org.apache.solr.search.function.distance.Constants.EARTH_MEAN_RADIUS_KM) which 
is set to 3,958.761458084784856. Most applications will not need to set this.

The radius of the earth in km is 6371.009 km (≈3958.761 mi), so the value 
assigned to EARTH_MEAN_RADIUS_KM appears to be the radius in miles.

Also the filtering distance appears to be off - example data:

45.17614,-93.87341 to 44.9369054,-91.3929348 is approx 137 miles per Google; 
137 miles = 220 kilometers.

http://../solr/select?fl=*,score&start=0&rows=10&q={!sfilt%20fl=store_lat_lon}&qt=standard&pt=44.9369054,-91.3929348&d=280&sort=dist(2,store,vector(44.9369054,-91.3929348))%20asc

Nothing shows. d=285 shows results. This is off by a lot.

Bill





 Implement Spatial Filter
 

 Key: SOLR-1568
 URL: https://issues.apache.org/jira/browse/SOLR-1568
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: CartesianTierQParserPlugin.java, 
 SOLR-1568.Mattmann.031010.patch.txt, SOLR-1568.patch, SOLR-1568.patch, 
 SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, 
 SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, 
 SOLR-1568.patch, SOLR-1568.patch


 Given an index with spatial information (either as a geohash, 
 SpatialTileField (see SOLR-1586) or just two lat/lon pairs), we should be 
 able to pass in a filter query that takes in the field name, lat, lon and 
 distance and produces an appropriate Filter (i.e. one that is aware of the 
 underlying field type) for use by Solr. 
 The interface _could_ look like:
 {code}
 fq={!sfilt dist=20}location:49.32,-79.0
 {code}
 or it could be:
 {code}
 fq={!sfilt lat=49.32 lon=-79.0 f=location dist=20}
 {code}
 or:
 {code}
 fq={!sfilt p=49.32,-79.0 f=location dist=20}
 {code}
 or:
 {code}
 fq={!sfilt lat=49.32,-79.0 fl=lat,lon dist=20}
 {code}




[jira] Created: (SOLR-2125) Spatial filter is not accurate

2010-09-20 Thread Bill Bell (JIRA)
Spatial filter is not accurate
--

 Key: SOLR-2125
 URL: https://issues.apache.org/jira/browse/SOLR-2125
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 1.5
Reporter: Bill Bell


The calculation of distance appears to be off.

Note: The radius of the sphere to be used when calculating distances on a 
sphere (i.e. haversine). Default is the Earth's mean radius in kilometers (see 
org.apache.solr.search.function.distance.Constants.EARTH_MEAN_RADIUS_KM) which 
is set to 3,958.761458084784856. Most applications will not need to set this.

The radius of the earth in km is 6371.009 km (≈3958.761 mi).
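
(Indeed, 6371.009 km × 0.621371 mi/km ≈ 3958.761, so the value assigned to 
EARTH_MEAN_RADIUS_KM matches the Earth's mean radius in miles, not kilometers.)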

Also the filtering distance appears to be off - example data:

45.17614,-93.87341 to 44.9369054,-91.3929348 is approx 137 miles per Google; 
137 miles = 220 kilometers.

http://../solr/select?fl=*,score&start=0&rows=10&q={!sfilt%20fl=store_lat_lon}&qt=standard&pt=44.9369054,-91.3929348&d=280&sort=dist(2,store,vector(44.9369054,-91.3929348))%20asc

Nothing shows. d=285 shows results. This is off by a lot.

Bill




[jira] Commented: (SOLR-2125) Spatial filter is not accurate

2010-09-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912785#action_12912785
 ] 

Yonik Seeley commented on SOLR-2125:


Just in time, Bill!
I just started looking at spatial stuff today, since I'm planning on putting 
some of it in my Lucene Revolution presentation.  I've seen some tweets about 
people having difficulties, and I've had some problems when I tried stuff 
myself.

Anyway, I'm going to try to clean up some of this stuff over the next few days 
and make the wiki a bit more user-oriented - an extra pair of eyeballs would be 
welcome!




[jira] Commented: (SOLR-2125) Spatial filter is not accurate

2010-09-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912799#action_12912799
 ] 

Yonik Seeley commented on SOLR-2125:


Hmmm, well, I just corrected one bug that hard-coded the distance in miles, but 
it was just a check to see if we crossed the poles.
I don't think that change alone will fix your issue.

Earlier today, I switched around some fields/field-types in the example schema, 
so store is now of latlon type, and it's the only location type (having 
multiple is just confusing).

So just looking at the bounding box now, here's the URL from your example:
http://localhost:8983/solr/select?fl=*,score&start=0&rows=10&q={!sfilt%20fl=store}&qt=standard&pt=44.9369054,-91.3929348&d=280&debugQuery=true

And I can see that the generated bounding box is:
+store_0_coordinate:[43.129843715965166 TO 46.688683890119314] 
+store_1_coordinate:[-93.83266208454557 TO -88.79716545231159]

This just misses the longitude (-93.87341) of the point on the document.

Can anyone point to a webapp for checking arbitrary distances between two 
lat/lon points?
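
As a quick cross-check, here is a minimal, self-contained haversine computation 
for the two points in this issue, using the 6371.009 km mean radius cited above 
(a sketch, not Solr's implementation); it gives roughly 196.6 km, in line with 
the online calculators cited in the next comment:

{code}
public class HaversineCheck {
  static final double R = 6371.009; // km, Earth's mean radius

  static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
    double dLat = Math.toRadians(lat2 - lat1);
    double dLon = Math.toRadians(lon2 - lon1);
    double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
             + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
             * Math.sin(dLon / 2) * Math.sin(dLon / 2);
    return R * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
  }

  public static void main(String[] args) {
    // The two points from this issue; prints ~196.6 (km).
    System.out.println(haversineKm(45.17614, -93.87341, 44.9369054, -91.3929348));
  }
}
{code}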





[jira] Commented: (SOLR-2125) Spatial filter is not accurate

2010-09-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912808#action_12912808
 ] 

Yonik Seeley commented on SOLR-2125:


Bill, I found two online distance calculators that both give the distance 
between the points you provided as 196 km.
http://www.movable-type.co.uk/scripts/latlong.html
http://www.es.flinders.edu.au/~mattom/Utilities/distance.html

Now... the distance of 280 km you provided should certainly still encompass 
that, so we still have a bug anyway.




[jira] Commented: (SOLR-2125) Spatial filter is not accurate

2010-09-20 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912820#action_12912820
 ] 

Bill Bell commented on SOLR-2125:
-

Yes, there is still a bug.

Most of what I was saying was right. I just did a quick check on 
maps.google.com - click Directions - and then put the two lat,long pairs in the 
two fields.

137 miles = 220.480128 kilometers (Google)
196.6 km using http://www.movable-type.co.uk/scripts/latlong.html 

See on map: 
http://www.movable-type.co.uk/scripts/latlong-map.html?lat1=45.176140&long1=-93.873410&lat2=44.936905&long2=-91.392935

Distance:        196.6 km
Initial bearing: 096°53′44″
Final bearing:   098°39′05″
Midpoint:        45°03′48″N, 092°37′50″W

As the crow flies is less distance (which makes sense).

I even used the JS function on 
http://www.movable-type.co.uk/scripts/latlong.html:

{code}
function toRad(a) {
  return (a*Math.PI/180);
}

function hsin(lat1,lon1,lat2,lon2) {
  var R = 6371; // km
  var dLat = toRad(lat2-lat1);
  var dLon = toRad(lon2-lon1);
  var a = Math.sin(dLat/2) * Math.sin(dLat/2) +
          Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) *
          Math.sin(dLon/2) * Math.sin(dLon/2);
  var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
  var d = R * c;
  return d;
}
{code}

I call it as a JavaScript function while looping through the results, since I 
cannot find a way to output the distance automatically from the XML coming back 
from Solr:

<script>document.write(hsin(lat,lon,solr.lat,solr.lon));</script>
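
(For reference, with the two points from this issue, 
hsin(45.17614,-93.87341,44.9369054,-91.3929348) returns ≈196.6, matching the 
movable-type calculator result above.)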

I kept playing with the d= value (in km) to see at which value the filter no 
longer shows the document in the results.

sort=dist(2,store,vector(44.9369054,-91.3929348)) asc 

d=285 shows.
d=284 does not show.









[jira] Commented: (SOLR-2125) Spatial filter is not accurate

2010-09-20 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912822#action_12912822
 ] 

Chris Male commented on SOLR-2125:
--

Surely the sorting won't be the issue, though? The bug seems to be in the 
bounding-box generation that Yonik pointed out.  I can imagine some rounding 
issues in various places, but nothing that would produce such a discrepancy.




[jira] Commented: (SOLR-2125) Spatial filter is not accurate

2010-09-20 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912823#action_12912823
 ] 

Chris A. Mattmann commented on SOLR-2125:
-

Distance using the haversine function is extremely sensitive to the spatial 
reference system the data was recorded in. WGS84 isn't particularly great over 
long distances. The book PostGIS in Action has a really good explanation of 
this.




[jira] Commented: (SOLR-2125) Spatial filter is not accurate

2010-09-20 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912827#action_12912827
 ] 

Yonik Seeley commented on SOLR-2125:


bq. Distance using the haversine function is extremely sensitive to the 
spatial reference system the data was recorded in. WGS84 isn't particularly 
great over long distances.

I know nothing on this topic, but an error of 45% at 200 km?  I'm pretty 
certain the bug here has nothing to do with the accuracy of spatial reference 
systems.




[jira] Commented: (SOLR-2125) Spatial filter is not accurate

2010-09-20 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912830#action_12912830
 ] 

Chris A. Mattmann commented on SOLR-2125:
-

Umm, well, if you know nothing, then how are you pretty sure? And yes, the 
error bars are fairly high for the great-circle distance.




[jira] Commented: (SOLR-2125) Spatial filter is not accurate

2010-09-20 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912833#action_12912833
 ] 

Bill Bell commented on SOLR-2125:
-

The calculation using the hsin JavaScript is more accurate than our algorithm? 
Chris, a few percentage points maybe - but not 45%.

I will look into it some more tonight. It can't be that complicated.

Bill





[jira] Updated: (LUCENE-2659) lucenetestcase ease of use improvements

2010-09-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-2659:


Attachment: LUCENE-2659.patch

 lucenetestcase ease of use improvements
 ---

 Key: LUCENE-2659
 URL: https://issues.apache.org/jira/browse/LUCENE-2659
 Project: Lucene - Java
  Issue Type: Test
  Components: Tests
Reporter: Robert Muir
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2659.patch


 I started working on this in LUCENE-2658; here is the finished patch.
 There are some problems with LuceneTestCase:
 * a test's beforeClass, or the test itself (its @Befores and its method), 
   might have some random behavior, but only the latter can be reproduced 
   with -Dtests.seed
 * if you want to do things in beforeClass, you have to use a different API: 
   newDirectory(random) instead of newDirectory, etc.
 * for a new user, the current output can be verbose, confusing, and 
   overwhelming.
 So, I refactored this class to address these problems.
 A class still needs two seeds internally, as beforeClass will only run once, 
 but the methods or setUp() might run many times, especially when increasing 
 iterations.
 But LuceneTestCase deals with this, and the seed is 128-bit (a UUID): the 
 MSB is initialized in beforeClass, and the LSB is varied for each method 
 run. If you provide a seed with -D, both halves are fixed to the UUID you 
 provided.
 I fixed the API to be consistent, so you should be able to migrate a test 
 from setUp() to beforeClass() [JUnit 3 to JUnit 4] without changing 
 parameters.
 The codec, locale, and timezone are only printed once at the end if any 
 tests fail, as they are per-class anyway (set up in beforeClass).
 Finally, when a test fails, you get a single "reproduce with" command line 
 that you can copy and paste to reproduce it. This way you don't have to 
 spend time trying to figure out what the command line should be.
 {noformat}
 [junit] Tests run: 2, Failures: 2, Errors: 0, Time elapsed: 0.197 sec
 [junit]
 [junit] - Standard Output ---
 [junit] NOTE: reproduce with: ant test -Dtestcase=TestExample 
 -Dtestmethod=testMethodA 
   -Dtests.seed=a51e707b-6550-7800-9f8c-72622d14bf5f
 [junit] NOTE: reproduce with: ant test -Dtestcase=TestExample 
 -Dtestmethod=testMethodB 
   -Dtests.seed=a51e707b-6550-7800-f7eb-2efca3820738
 [junit] NOTE: test params are: codec=PreFlex, locale=ar_LY, 
 timezone=Etc/UCT
 [junit] -  ---
 [junit] Test org.apache.lucene.util.TestExample FAILED
 {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (LUCENE-2659) lucenetestcase ease of use improvements

2010-09-20 Thread Robert Muir (JIRA)
lucenetestcase ease of use improvements
---

 Key: LUCENE-2659
 URL: https://issues.apache.org/jira/browse/LUCENE-2659
 Project: Lucene - Java
  Issue Type: Test
  Components: Tests
Reporter: Robert Muir
 Fix For: 3.1, 4.0
 Attachments: LUCENE-2659.patch

I started working on this in LUCENE-2658; here is the finished patch.

There are some problems with LuceneTestCase:
* a test's beforeClass, or the test itself (its @Befores and its method), might 
  have some random behavior, but only the latter can be reproduced with 
  -Dtests.seed
* if you want to do things in beforeClass, you have to use a different API: 
  newDirectory(random) instead of newDirectory, etc.
* for a new user, the current output can be verbose, confusing, and 
  overwhelming.

So, I refactored this class to address these problems.
A class still needs two seeds internally, as beforeClass will only run once, 
but the methods or setUp() might run many times, especially when increasing 
iterations.

But LuceneTestCase deals with this, and the seed is 128-bit (a UUID): the MSB 
is initialized in beforeClass, and the LSB is varied for each method run. If 
you provide a seed with -D, both halves are fixed to the UUID you provided. 
(A sketch of this scheme follows the example output below.)

I fixed the API to be consistent, so you should be able to migrate a test from 
setUp() to beforeClass() [JUnit 3 to JUnit 4] without changing parameters.

The codec, locale, and timezone are only printed once at the end if any tests 
fail, as they are per-class anyway (set up in beforeClass).

Finally, when a test fails, you get a single "reproduce with" command line you 
can copy and paste to reproduce it. This way you don't have to spend time 
trying to figure out what the command line should be.

{noformat}
[junit] Tests run: 2, Failures: 2, Errors: 0, Time elapsed: 0.197 sec
[junit]
[junit] - Standard Output ---
[junit] NOTE: reproduce with: ant test -Dtestcase=TestExample 
-Dtestmethod=testMethodA 
  -Dtests.seed=a51e707b-6550-7800-9f8c-72622d14bf5f
[junit] NOTE: reproduce with: ant test -Dtestcase=TestExample 
-Dtestmethod=testMethodB 
  -Dtests.seed=a51e707b-6550-7800-f7eb-2efca3820738
[junit] NOTE: test params are: codec=PreFlex, locale=ar_LY, timezone=Etc/UCT
[junit] -  ---
[junit] Test org.apache.lucene.util.TestExample FAILED
{noformat}
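
For illustration, here is a minimal sketch of the two-part seed idea described 
above (hypothetical names; not the actual LuceneTestCase internals):

{noformat}
// Sketch of the 128-bit (UUID) seed scheme: the MSB half is fixed once per
// class, the LSB half is re-drawn for each test method. All names here are
// hypothetical.
import java.util.Random;
import java.util.UUID;

public class TwoPartSeedSketch {
  static long classSeed;   // MSB: initialized once, in beforeClass
  static long methodSeed;  // LSB: varied for each method run

  static void beforeClass() {
    classSeed = new Random().nextLong();
  }

  static void beforeMethod() {
    methodSeed = new Random().nextLong();
  }

  // The value printed in the "reproduce with" line:
  static String seedAsUuid() {
    return new UUID(classSeed, methodSeed).toString();
  }

  // -Dtests.seed=<uuid> pins both halves, so beforeClass randomness and
  // per-method randomness are reproduced together:
  static void applyFixedSeed(String uuid) {
    UUID u = UUID.fromString(uuid);
    classSeed = u.getMostSignificantBits();
    methodSeed = u.getLeastSignificantBits();
  }
}
{noformat}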


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] Resolved: (LUCENE-2482) Index sorter

2010-09-20 Thread Lance Norskog
What is the philosophy about the 3.x branch? This is an all-new feature 
added to 3.x.


Andrzej Bialecki (JIRA) wrote:

  [ 
https://issues.apache.org/jira/browse/LUCENE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki resolved LUCENE-2482.
---

 Resolution: Fixed

Committed in rev. 998948.

   

Index sorter


 Key: LUCENE-2482
 URL: https://issues.apache.org/jira/browse/LUCENE-2482
 Project: Lucene - Java
  Issue Type: New Feature
  Components: contrib/*
Affects Versions: 3.1
Reporter: Andrzej Bialecki
Assignee: Andrzej Bialecki
 Fix For: 3.1

 Attachments: indexSorter.patch


A tool to sort an index according to a float document weight. Documents with 
high weight are given low document numbers, which means that they will be 
evaluated first. When using a strategy of early termination of queries (see 
TimeLimitedCollector), such sorting significantly improves the quality of 
partial results. A small sketch of the reordering idea follows below.
(Originally this tool was created by Doug Cutting in Nutch and used norms as 
document weights; thus the ordering was limited by the limited resolution of 
norms. This is a pure Lucene version of the tool, and it uses arbitrary floats 
from a specified stored field.)
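
A minimal sketch of the reordering idea (illustrative only, not the committed 
patch), assuming the per-document weights have already been read into an array:

{noformat}
// Order old doc IDs by descending weight: after remapping, the highest-weight
// documents occupy the lowest doc numbers and are evaluated first.
import java.util.Arrays;
import java.util.Comparator;

public class WeightOrderSketch {
  /** Returns old doc IDs ordered so that index 0 holds the highest-weight doc. */
  static Integer[] orderByWeight(final float[] weights) {
    Integer[] oldDocIds = new Integer[weights.length];
    for (int i = 0; i < oldDocIds.length; i++) oldDocIds[i] = i;
    Arrays.sort(oldDocIds, new Comparator<Integer>() {
      public int compare(Integer a, Integer b) {
        return Float.compare(weights[b], weights[a]); // descending weight
      }
    });
    return oldDocIds;
  }
}
{noformat}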
 
   


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (SOLR-2126) highlighting multicore searches relying on q.alt gives NPE

2010-09-20 Thread David Smiley (JIRA)
highlighting multicore searches relying on q.alt gives NPE
--

 Key: SOLR-2126
 URL: https://issues.apache.org/jira/browse/SOLR-2126
 Project: Solr
  Issue Type: Bug
  Components: highlighter
Affects Versions: 1.4
 Environment: I'm on a trunk release from early March, but I also just 
verified this on LucidWorks 1.4 which I have handy.
Reporter: David Smiley
Priority: Minor


To reproduce this, run the example multicore Solr configuration. Then index 
each example document into each core. Now we're going to do a distributed 
search with q.alt=*:* and defType=dismax. Normally these would be set in a 
request handler config as defaults, but we'll put them in the URL to make it 
clear they need to be set, and because the default multicore example config is 
so bare-bones that it doesn't already have a dismax setup. We're going to 
enable highlighting.

http://localhost:8983/solr/core0/select?hl=true&q.alt=*:*&defType=dismax&shards=localhost:8983/solr/core0,localhost:8983/solr/core1

java.lang.NullPointerException
at 
org.apache.solr.handler.component.HighlightComponent.finishStage(HighlightComponent.java:130)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:296)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

Since I happen to be using edismax on trunk, it was easy for me to work around 
this problem by renaming the q.alt parameter in my request handler defaults to 
just q, since edismax understands raw Lucene queries. A hypothetical 
solrconfig.xml sketch of that change follows below.
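
The defaults change might look like this (a sketch only; the handler name and 
surrounding config are illustrative assumptions, not my actual setup):

{noformat}
<!-- Hypothetical solrconfig.xml request handler illustrating the workaround:
     with edismax, the match-all query can live in q instead of q.alt. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- was: <str name="q.alt">*:*</str> -->
    <str name="q">*:*</str>
  </lst>
</requestHandler>
{noformat}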


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2650) improve windows defaults in FSDirectory

2010-09-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912838#action_12912838
 ] 

Robert Muir commented on LUCENE-2650:
-

I'm going to add the extra safety here for cloned MMapIndexInputs as a 
separate commit from changing the defaults (in case we have to revert the 
defaults).

It's also good to backport (unlike the defaults).


 improve windows defaults in FSDirectory
 ---

 Key: LUCENE-2650
 URL: https://issues.apache.org/jira/browse/LUCENE-2650
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Store
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-2650.patch, LUCENE-2650.patch


 Currently Windows defaults to SimpleFSDirectory, but this is a problem due to 
 the synchronization.
 I have been benchmarking queries *sequentially* and was pretty surprised at 
 how much faster MMapDirectory is, for example for cases that do many seeks.
 I think we should change the defaults for Windows as such (a Java sketch 
 follows below):
 if (WINDOWS and UNMAP_SUPPORTED and 64-bit)
   use MMapDirectory
 else
   use SimpleFSDirectory
 I think we should just consider doing this for 4.0 only and see how it goes.
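
A Java sketch of that proposal (assuming the existing helpers 
Constants.WINDOWS, Constants.JRE_IS_64BIT, and MMapDirectory.UNMAP_SUPPORTED; 
not the final patch):

{noformat}
// Sketch of the proposed default-directory selection for Windows. The helper
// constants are assumed from org.apache.lucene.util.Constants and
// MMapDirectory; this is not the committed change.
import java.io.File;
import java.io.IOException;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.SimpleFSDirectory;
import org.apache.lucene.util.Constants;

public class WindowsDirectoryDefault {
  static FSDirectory openDefault(File path) throws IOException {
    if (Constants.WINDOWS && Constants.JRE_IS_64BIT
        && MMapDirectory.UNMAP_SUPPORTED) {
      return new MMapDirectory(path);   // no per-read synchronization
    }
    return new SimpleFSDirectory(path); // fallback: 32-bit JVM or no unmap
  }
}
{noformat}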

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org