Build failed in Hudson: Solr-trunk #1253
See https://hudson.apache.org/hudson/job/Solr-trunk/1253/changes

Changes:

[ryan] running 'mvn generate-maven-artifacts' will put all the files in the same directory (dist/maven)
[yonik] SOLR-2123: group by query
[rmuir] LUCENE-2653: ThaiAnalyzer assumes things about your jre
[simonw] LUCENE-2588: Exposed indexed term prefix length to enable none-unicode sort order term indexes
[mikemccand] LUCENE-2647: refactor reusable components out of standard codec

------------------------------------------
[...truncated 6171 lines...]
[junit] Testsuite: org.apache.solr.handler.SpellCheckerRequestHandlerTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 1.235 sec
[junit] Testsuite: org.apache.solr.handler.StandardRequestHandlerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.716 sec
[junit] Testsuite: org.apache.solr.handler.TestCSVLoader
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 1.404 sec
[junit] Testsuite: org.apache.solr.handler.TestReplicationHandler
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 53.563 sec
[junit] Testsuite: org.apache.solr.handler.XmlUpdateRequestHandlerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.616 sec
[junit] Testsuite: org.apache.solr.handler.admin.LukeRequestHandlerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.998 sec
[junit] Testsuite: org.apache.solr.handler.admin.SystemInfoHandlerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.005 sec
[junit] Testsuite: org.apache.solr.handler.component.DebugComponentTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.798 sec
[junit] Testsuite: org.apache.solr.handler.component.DistributedSpellCheckComponentTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 10.587 sec
[junit] Testsuite: org.apache.solr.handler.component.DistributedTermsComponentTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 8.841 sec
[junit] Testsuite: org.apache.solr.handler.component.QueryElevationComponentTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.749 sec
[junit] Testsuite: org.apache.solr.handler.component.SearchHandlerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.589 sec
[junit] Testsuite: org.apache.solr.handler.component.SpellCheckComponentTest
[junit] Tests run: 10, Failures: 0, Errors: 0, Time elapsed: 0.967 sec
[junit] Testsuite: org.apache.solr.handler.component.StatsComponentTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.539 sec
[junit] Testsuite: org.apache.solr.handler.component.TermVectorComponentTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.701 sec
[junit] Testsuite: org.apache.solr.handler.component.TermsComponentTest
[junit] Tests run: 13, Failures: 0, Errors: 0, Time elapsed: 0.815 sec
[junit] Testsuite: org.apache.solr.highlight.FastVectorHighlighterTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.66 sec
[junit] Testsuite: org.apache.solr.highlight.HighlighterConfigTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.571 sec
[junit] Testsuite: org.apache.solr.highlight.HighlighterTest
[junit] Tests run: 23, Failures: 0, Errors: 0, Time elapsed: 1.819 sec
[junit] Testsuite: org.apache.solr.request.JSONWriterTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.633 sec
[junit] Testsuite: org.apache.solr.request.SimpleFacetsTest
[junit] Tests run: 22, Failures: 0, Errors: 0, Time elapsed: 6.174 sec
[junit] Testsuite: org.apache.solr.request.TestBinaryResponseWriter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.634 sec
[junit] Testsuite: org.apache.solr.request.TestFaceting
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 6.297 sec
[junit] Testsuite: org.apache.solr.request.TestWriterPerf
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.202 sec
[junit] Testsuite: org.apache.solr.response.TestCSVResponseWriter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.639 sec
[junit] Testsuite: org.apache.solr.schema.BadIndexSchemaTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.431 sec
[junit] Testsuite: org.apache.solr.schema.CopyFieldTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.525 sec
[junit] Testsuite: org.apache.solr.schema.DateFieldTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.014 sec
[jira] Updated: (SOLR-1301) Solr + Hadoop
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Kanarsky updated SOLR-1301:
-------------------------------------

Attachment: SOLR-1301.patch

The latest SOLR-1301-hadoop-0-20 patch is repackaged to be placed under contrib, as it was initially (build.xml is included), and tested against the current trunk. As usual, after applying the patch put the 4 lib jars (hadoop, log4j, and the two commons-logging jars) into contrib/hadoop/lib. No unit tests as for now :) but I hope to add some soon.

Here is the big question: as Andrzej once mentioned, the unit tests require a running Hadoop cluster. One approach is to make the patch and unit tests work with the Hadoop mini-cluster (ClusterMapReduceTestCase); however, this will bring some extra dependencies needed to run the cluster (like jetty). Another idea is to use your own cluster and just configure access to it in the unit tests; this approach seems logical but may give different test results on different clusters, and also may not give some low-level access to the execution that tests need. So what is your opinion on how the tests for solr-hadoop should be run? I am not really happy with the idea of starting and running a Hadoop cluster while performing the Solr unit tests, but this could still be a better option than no unit tests at all.
Solr + Hadoop
-------------

Key: SOLR-1301
URL: https://issues.apache.org/jira/browse/SOLR-1301
Project: Solr
Issue Type: Improvement
Affects Versions: 1.4
Reporter: Andrzej Bialecki
Fix For: Next
Attachments: commons-logging-1.0.4.jar, commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, hadoop-0.20.1-core.jar, hadoop.patch, log4j-1.2.15.jar, README.txt, SOLR-1301-hadoop-0-20.patch, SOLR-1301-hadoop-0-20.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SolrRecordWriter.java

This patch contains a contrib module that provides distributed indexing (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is twofold:

* provide an API that is familiar to Hadoop developers, i.e. that of OutputFormat
* avoid unnecessary export and (de)serialization of data maintained on HDFS. SolrOutputFormat consumes data produced by reduce tasks directly, without storing it in intermediate files. Furthermore, by using an EmbeddedSolrServer, the indexing task is split into as many parts as there are reducers, and the data to be indexed is not sent over the network.

Design
------

Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, which in turn uses SolrRecordWriter to write this data. SolrRecordWriter instantiates an EmbeddedSolrServer, and it also instantiates an implementation of SolrDocumentConverter, which is responsible for turning a Hadoop (key, value) pair into a SolrInputDocument. This data is then added to a batch, which is periodically submitted to the EmbeddedSolrServer. When the reduce task completes and the OutputFormat is closed, SolrRecordWriter calls commit() and optimize() on the EmbeddedSolrServer.

The API provides facilities to specify an arbitrary existing solr.home directory, from which the conf/ and lib/ files will be taken. This process results in the creation of as many partial Solr home directories as there were reduce tasks.

The output shards are placed in the output directory on the default filesystem (e.g. HDFS). Such part-N directories can be used to run N shard servers. Additionally, users can specify the number of reduce tasks, in particular 1 reduce task, in which case the output will consist of a single shard.

An example application is provided that processes large CSV files with this API. It uses custom CSV processing to avoid (de)serialization overhead.

This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this issue; you should put it in contrib/hadoop/lib. Note: the development of this patch was sponsored by an anonymous contributor and approved for release under the Apache License.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
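[Editor's note] The record-writer flow in the design above can be sketched in plain Java. This is a hypothetical stand-in, not the patch's actual API: `DocumentConverter` and the map-based "document" merely mimic the roles that SolrDocumentConverter, SolrInputDocument, and EmbeddedSolrServer play in the description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal stand-ins for the classes named in the design; the real patch
// uses Hadoop's OutputFormat/RecordWriter and Solr's EmbeddedSolrServer.
public class SolrRecordWriterSketch {
    /** Stand-in for SolrDocumentConverter: turns a Hadoop (key, value) into a document. */
    interface DocumentConverter {
        Map<String, String> convert(String key, String value);
    }

    private final DocumentConverter converter;
    private final int batchSize;
    private final List<Map<String, String>> batch = new ArrayList<>();
    int docsCommitted = 0; // stands in for the embedded server's state

    SolrRecordWriterSketch(DocumentConverter converter, int batchSize) {
        this.converter = converter;
        this.batchSize = batchSize;
    }

    /** Called once per reduce output pair; docs are batched before submission. */
    void write(String key, String value) {
        batch.add(converter.convert(key, value));
        if (batch.size() >= batchSize) {
            flush(); // periodically submit the batch to the embedded server
        }
    }

    private void flush() {
        docsCommitted += batch.size(); // real code: server.add(batch)
        batch.clear();
    }

    /** Called when the reduce task completes: flush, then commit()/optimize(). */
    void close() {
        flush();
        // real code: server.commit(); server.optimize();
    }

    public static void main(String[] args) {
        DocumentConverter conv = (k, v) -> {
            Map<String, String> doc = new HashMap<>();
            doc.put("id", k);
            doc.put("text", v);
            return doc;
        };
        SolrRecordWriterSketch writer = new SolrRecordWriterSketch(conv, 2);
        writer.write("1", "a");
        writer.write("2", "b");
        writer.write("3", "c");
        writer.close();
        System.out.println(writer.docsCommitted); // prints 3
    }
}
```

Since each reducer runs one such writer against its own partial Solr home, the shard count falls out of the reducer count, as the description says.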
[jira] Commented: (SOLR-1218) maven artifact for webapp
[ https://issues.apache.org/jira/browse/SOLR-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912395#action_12912395 ]

Kjetil Ødegaard commented on SOLR-1218:
---------------------------------------

With the Solr WAR in the Maven repo, people would be able to easily build their own customized WARs with Maven WAR overlays. All you need to do is declare a dependency on the Solr WAR from the web project with compile scope, and Maven handles the rest for you. We've put the Solr WAR in our local repo and use it for our custom Solr deploy. If it were in central, things would be even easier.

maven artifact for webapp
-------------------------

Key: SOLR-1218
URL: https://issues.apache.org/jira/browse/SOLR-1218
Project: Solr
Issue Type: New Feature
Affects Versions: 1.3
Reporter: Benson Margulies

It would be convenient to have a <packaging>war</packaging> maven project for the webapp, to allow launching solr from maven via jetty.
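[Editor's note] The WAR overlay setup described in the comment above amounts to a single dependency in the custom web project's pom.xml. A sketch with hypothetical coordinates and version - publishing the real coordinates is exactly what this issue asks for:

```xml
<!-- Hypothetical coordinates, illustrative only: no official Solr WAR
     artifact was in central at the time of this issue. -->
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-webapp</artifactId>
  <version>1.3.0</version>
  <type>war</type>
  <scope>compile</scope>
</dependency>
```

With `<packaging>war</packaging>` in the overlaying project, the maven-war-plugin unpacks the dependency WAR and lays the project's own webapp resources on top of it.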
Re: Failed Test: junit.framework.TestSuite.org.apache.lucene.index.TestIndexWriter (from TestSuite)
Hmm -- I suspect TestIndexWriter.testNoWaitClose hit an exc, which caused it not to close the dir, but the code that catches this in LuceneTestCase fails to show that root cause? I think we should disable the dir/IndexInput/Output not closed checking if the test hit an exc?

Ahh so here is the root cause:

http://gperf.ath.cx:/hudson/job/Solcene/1704/testReport/junit/org.apache.lucene.index/TestIndexWriter/testNoWaitClose/

java.io.FileNotFoundException: /home/mark/hudson_solcene/jobs/Solcene/workspace/solcene/lucene/build/test/7/test4946766365764846424tmp/_46.fnm (Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:69)
        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:90)
        at org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:56)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:351)
        at org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:299)
        at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:69)
        at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:131)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:536)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:509)
        at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:129)
        at org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:96)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:630)
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:91)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:414)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:233)
        at org.apache.lucene.index.TestIndexWriter.testNoWaitClose(TestIndexWriter.java:2174)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:805)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:778)

The test is quite awful -- creates index w/ maxBufferedDocs 2 and mergeFactor 100 and (randomly) CFS on/off. So it's not surprising when it gets FSDir that it'll run out of descriptors... We should fix this test to not use any of the FS dirs?

Mike

On Sun, Sep 19, 2010 at 8:55 PM, Mark Miller markrmil...@gmail.com wrote:

Failed
junit.framework.TestSuite.org.apache.lucene.index.TestIndexWriter (from TestSuite)

Failing for the past 1 build (Since #1704)
Took 0 ms.

Error Message

directory of test was not closed, opened from: org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:585)

Stacktrace

junit.framework.AssertionFailedError: directory of test was not closed, opened from: org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:585)
        at org.apache.lucene.util.LuceneTestCase.afterClassLuceneTestCaseJ4(LuceneTestCase.java:304)

Standard Output

NOTE: random codec of testcase 'testNoWaitClose' was: MockFixedIntBlock(blockSize=340)
NOTE: random locale of testcase 'testNoWaitClose' was: ar_LB
NOTE: random timezone of testcase 'testNoWaitClose' was: EAT

--
- Mark
http://www.lucidimagination.com
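[Editor's note] Mike's point about descriptors can be made concrete with a rough back-of-the-envelope. The numbers here are assumptions, not measurements from the test: a pre-4.0 non-compound segment is roughly 8 files (.fnm, .fdt, .fdx, .frq, .prx, .tis, .tii, .nrm), and a merge reads mergeFactor input segments while writing one output segment.

```java
public class DescriptorEstimate {
    // Rough count of files one merge may hold open at once:
    // mergeFactor input segments plus the output segment being written,
    // each with ~filesPerSegment files (assumed, not exact).
    static int mergeOpenFiles(int mergeFactor, int filesPerSegment) {
        return (mergeFactor + 1) * filesPerSegment;
    }

    public static void main(String[] args) {
        // mergeFactor=100 as in testNoWaitClose, ~8 files per non-CFS segment
        System.out.println(mergeOpenFiles(100, 8)); // prints 808
    }
}
```

Roughly 800 open files from a single merge, plus whatever the concurrently open IndexReader holds, is already close to the common default ulimit of 1024 descriptors, so "Too many open files" is unsurprising on a real FSDirectory.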
[jira] Commented: (SOLR-1218) maven artifact for webapp
[ https://issues.apache.org/jira/browse/SOLR-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912429#action_12912429 ]

Stevo Slavic commented on SOLR-1218:
------------------------------------

Voted for the issue too. As a temporary workaround, to reference solr.war but still keep solr config files in the IDE under version control, I use the following config:

{code:title=pom.xml|borderStyle=solid}
...
<plugin>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty-maven-plugin</artifactId>
  <configuration>
    <stopPort>${jetty.stop.port}</stopPort>
    <stopKey>foo</stopKey>
    <webApp>${env.SOLR_HOME}/example/webapps/solr.war</webApp>
    <tempDirectory>${project.build.directory}/jetty-tmp</tempDirectory>
    <systemProperties>
      <systemProperty>
        <name>solr.solr.home</name>
        <value>${basedir}/src/main/solr/home</value>
      </systemProperty>
      <systemProperty>
        <name>solr.data.dir</name>
        <value>${project.build.directory}/solr/data</value>
      </systemProperty>
      <systemProperty>
        <name>solr_home</name>
        <value>${env.SOLR_HOME}</value>
      </systemProperty>
    </systemProperties>
  </configuration>
  <executions>
    <execution>
      <id>start-jetty</id>
      <phase>pre-integration-test</phase>
      <goals>
        <goal>deploy-war</goal>
      </goals>
      <configuration>
        <daemon>true</daemon>
        <webAppConfig>
          <contextPath>/solr</contextPath>
          <tempDirectory>${project.build.directory}/jetty-tmp</tempDirectory>
        </webAppConfig>
        <connectors>
          <connector implementation="org.eclipse.jetty.server.nio.SelectChannelConnector">
            <port>${jetty.http.port}</port>
          </connector>
        </connectors>
      </configuration>
    </execution>
    <execution>
      <id>stop-jetty</id>
      <phase>post-integration-test</phase>
      <goals>
        <goal>stop</goal>
      </goals>
    </execution>
  </executions>
</plugin>
...
{code}

And I update the SOLR_HOME environment variable when moving to a new Solr installation/version. This is easy for a development environment, but not for CI (Hudson). That's why solr.war on a public repo would be handy.
[jira] Resolved: (LUCENE-2491) Extend Codec with a SegmentInfos writer / reader
[ https://issues.apache.org/jira/browse/LUCENE-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki resolved LUCENE-2491.
--------------------------------------

Fix Version/s: 4.0
Resolution: Fixed

This was committed as a part of LUCENE-2373.

Extend Codec with a SegmentInfos writer / reader
------------------------------------------------

Key: LUCENE-2491
URL: https://issues.apache.org/jira/browse/LUCENE-2491
Project: Lucene - Java
Issue Type: Improvement
Components: Index
Affects Versions: 4.0
Reporter: Andrzej Bialecki
Fix For: 4.0

I'm trying to implement a Codec that works with append-only filesystems (HDFS). It's _almost_ done, except for SegmentInfos.write(dir), which uses ChecksumIndexOutput, which in turn uses IndexOutput.seek() - and seek is not supported on append-only output. I propose to extend the Codec interface to also encapsulate the details of SegmentInfos writing / reading. Patch to follow after some feedback ;)
[jira] Assigned: (LUCENE-2482) Index sorter
[ https://issues.apache.org/jira/browse/LUCENE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki reassigned LUCENE-2482:
----------------------------------------

Assignee: Andrzej Bialecki

Index sorter
------------

Key: LUCENE-2482
URL: https://issues.apache.org/jira/browse/LUCENE-2482
Project: Lucene - Java
Issue Type: New Feature
Components: contrib/*
Affects Versions: 3.1
Reporter: Andrzej Bialecki
Assignee: Andrzej Bialecki
Fix For: 3.1
Attachments: indexSorter.patch

A tool to sort an index according to a float document weight. Documents with high weight are given low document numbers, which means that they will be evaluated first. When using a strategy of early termination of queries (see TimeLimitedCollector), such sorting significantly improves the quality of partial results. (Originally this tool was created by Doug Cutting in Nutch, and used norms as document weights - thus the ordering was limited by the limited resolution of norms. This is a pure Lucene version of the tool, and it uses arbitrary floats from a specified stored field.)
[jira] Created: (LUCENE-2656) If tests fail, don't report about unclosed resources
If tests fail, don't report about unclosed resources
----------------------------------------------------

Key: LUCENE-2656
URL: https://issues.apache.org/jira/browse/LUCENE-2656
Project: Lucene - Java
Issue Type: Test
Components: Tests
Affects Versions: 3.1, 4.0
Reporter: Robert Muir
Fix For: 3.1, 4.0
Attachments: LUCENE-2656.patch

LuceneTestCase checks in afterClass() that you closed all your directories, which in turn will check whether you have closed any open files. This is good, as a test will fail if we have resource leaks. But if a test truly fails, this extra report is just confusing, because the test is usually not going to make it to the part of its code where it would call close(). So, if any tests fail, I think we should omit this check in afterClass().
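[Editor's note] The proposed behavior can be sketched in plain Java; all names here are hypothetical stand-ins (LUCENE-2656.patch is the actual implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the behavior proposed above: track whether any
// test failed, and only run the unclosed-resource check in afterClass()
// when the suite otherwise passed.
public class ResourceCheckSketch {
    static boolean testsFailed = false;
    static final List<String> openDirectories = new ArrayList<>();

    /** Set by the test runner when any test in the class fails. */
    static void onTestFailure() {
        testsFailed = true;
    }

    /** Returns a leak message, or null if the check passed or was skipped. */
    static String afterClassCheck() {
        if (testsFailed) {
            return null; // a failed test likely never reached close(); don't pile on
        }
        if (!openDirectories.isEmpty()) {
            return "directory of test was not closed, opened from: " + openDirectories.get(0);
        }
        return null;
    }

    public static void main(String[] args) {
        openDirectories.add("newDirectory()");
        onTestFailure();
        System.out.println(afterClassCheck()); // prints null: check skipped after a failure
    }
}
```

The leak check still fires for green runs, so real resource leaks keep failing the build; only the confusing secondary report after a genuine test failure goes away.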
[jira] Updated: (LUCENE-2656) If tests fail, don't report about unclosed resources
[ https://issues.apache.org/jira/browse/LUCENE-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2656:
--------------------------------

Attachment: LUCENE-2656.patch
Re: Failed Test: junit.framework.TestSuite.org.apache.lucene.index.TestIndexWriter (from TestSuite)
On Mon, Sep 20, 2010 at 5:09 AM, Michael McCandless luc...@mikemccandless.com wrote:

Hmm -- I suspect TestIndexWriter.testNoWaitClose hit an exc, which caused it not to close the dir, but the code that catches this in LuceneTestCase fails to show that root cause? I think we should disable the dir/IndexInput/Output not closed checking if the test hit an exc?

https://issues.apache.org/jira/browse/LUCENE-2656

--
Robert Muir
rcm...@gmail.com
[jira] Commented: (LUCENE-2656) If tests fail, don't report about unclosed resources
[ https://issues.apache.org/jira/browse/LUCENE-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912445#action_12912445 ]

Michael McCandless commented on LUCENE-2656:
--------------------------------------------

Super -- patch looks great!
Re: discussion about release frequency.
A little late to the party, but...

On Sep 18, 2010, at 5:09 PM, Ryan McKinley wrote:

I cannot in good conscience sign with my key, nor vote over any maven artifacts. I noticed these guides only mentioned how to upload (which itself seems extremely complex). But nowhere do i see 'how do you test that your artifacts are correct'. And thats really the main problem I have with our maven support.

I understand what you are worried about... and think we can avoid it. How about:

1. Keep the generate-maven-artifacts in the release. This just copies the official jar files to a special directory structure (same keys etc)

OK, I get that a lot of committers here don't like Maven, and I don't think Lucene should switch to a Maven build, and it's a pain to do complex things in, but I use it all the time for Lucene/Solr (for non-complex things) and I know of a lot of people in user land who use it as well b/c it makes the common things _users_ do really easy.

And, as much as Hoss restarted this thread by saying the PMC releases only source, that simply is not what users expect. That's why we sign all the artifacts: they are the RM saying "I verify this", the PMC then votes on all the artifacts, and it's why we push them all up for distribution. Of course, we are only required to release source, but you show me a project that does only that at the ASF and I'll show you a project w/ very few users.

At any rate, the big problem w/ Maven and Lucene is not that generate-maven-artifacts doesn't work, it's that the POM templates aren't kept in sync. However, I think we now have a solution for that thanks to Steve and Robert's work to make it easier to bring Lucene into IntelliJ. In other words, that process does much of what is needed for Maven, so it should be relatively straightforward to have it automatically generate the templates, too.
In fact, it would be just as easy for that project to simply produce POM files (which are well understood and have a published spec) instead of creating the IntelliJ project files (which are not well understood, not publicly spec'd, and subject to change w/ every release), and simply have IntelliJ suck in the POM file, since IntelliJ supports that very, very well.

Then, to automatically test Maven, we simply need to do a few things:

1. Generate the templates
2. Build the Maven artifacts and install them (this is a Maven concept that copies them to your local repository, usually in ~/.m2/repository, but it can be in other places and it should be clean)
3. Generate a test pom that includes, as dependencies, all the Lucene Maven artifacts and maybe even compiles a small source tree with it

If that last step passes, you know everything is right. However, short of #2 and #3, as long as the POMs are being generated accurately, I think I would feel comfortable releasing them, whereas I agree, now, with Robert, that we probably shouldn't be releasing them as things stand.

(BTW, I love the "Maven is Magic" (and really any "It's magic, therefore I don't like it") reasoning for not liking it, whereby everyone complains that b/c Maven hides a bunch of details from you (i.e. it's magic), therefore you don't like it. At the same time, I'm sure said person doesn't understand every last detail of, oh, I don't know: the CPU, RAM, the Compiler, the JDK, etc., and yet they have no problem using those. In other words, we deal with abstractions all the time. It's fine if you don't get the abstraction or don't personally find it useful, but that doesn't make the abstraction bad.)

-Grant
Re: discussion about release frequency.
On Mon, Sep 20, 2010 at 8:23 AM, Grant Ingersoll gsing...@apache.org wrote:

At any rate, the big problem w/ Maven and Lucene is not that generate-maven-artifacts doesn't work, it's that the POM templates aren't kept in sync. However, I think we now have a solution for that thanks to Steve and Robert's work to make it easier to bring Lucene into IntelliJ. In other words, that process does much of what is needed for Maven, so it should be relatively straightforward to have it automatically generate the templates, too. In fact, it would be just as easy for that project to simply produce POM files (which are well understood and have a published spec) instead of creating the IntelliJ project files, which are not well understood and not publicly spec'd and subject to change w/ every release, and simply have IntelliJ suck in the POM file since IntelliJ supports that very, very well.

So are you saying, instead of generating IntelliJ configuration, we generate poms, and then we have a route, via maven, for users to automatically set up their IntelliJ (and also eclipse?) IDEs? If so this sounds great to me. Because it would be nice to make the IDE configuration easier, not just for IntelliJ.

Then, to automatically test Maven, we simply need to do a few things:

1. Generate the templates
2. Build the Maven artifacts and install them (this is a Maven concept that copies them to your local repository, usually in ~/.m2/repository, but it can be in other places and it should be clean)
3. Generate a test pom that includes, as dependencies, all the Lucene Maven artifacts and maybe even compiles a small source tree with it

+1. this would resolve all my concerns about maven, because we have a way to test that it stands a chance of working *before release*. I hope you don't think I am picking on maven here, I'm equally disturbed about the demo application, and i think it should have a basic unit test too that indexes stuff, fires itself up in jetty, and runs a search.
Like maven, i know some people don't necessarily like the demo, but as long as we are going to ship it, I want tests so that we don't find it's completely nonfunctional after the release. Unlike maven, i think i stand a chance of actually being able to write the test for this one though.

--
Robert Muir
rcm...@gmail.com
Re: discussion about release frequency.
On Sep 20, 2010, at 8:44 AM, Robert Muir wrote:

On Mon, Sep 20, 2010 at 8:23 AM, Grant Ingersoll gsing...@apache.org wrote:

At any rate, the big problem w/ Maven and Lucene is not that generate-maven-artifacts doesn't work, it's that the POM templates aren't kept in sync. However, I think we now have a solution for that thanks to Steve and Robert's work to make it easier to bring Lucene into IntelliJ. In other words, that process does much of what is needed for Maven, so it should be relatively straightforward to have it automatically generate the templates, too. In fact, it would be just as easy for that project to simply produce POM files (which are well understood and have a published spec) instead of creating the IntelliJ project files, which are not well understood and not publicly spec'd and subject to change w/ every release, and simply have IntelliJ suck in the POM file since IntelliJ supports that very, very well.

So are you saying, instead of generating IntelliJ configuration, we generate poms, and then we have a route, via maven, for users to automatically set up their IntelliJ (and also eclipse?) IDEs? If so this sounds great to me. Because it would be nice to make the IDE configuration easier, not just for IntelliJ.

Yes. I know for a fact IntelliJ can read the POMs. I use it all the time. Go check out Mahout and point IntelliJ at its POM. You will be up and compiling (in your IDE) in less than 2 minutes, give or take. I imagine Eclipse has similar support.

Then, to automatically test Maven, we simply need to do a few things:

1. Generate the templates
2. Build the Maven artifacts and install them (this is a Maven concept that copies them to your local repository, usually in ~/.m2/repository, but it can be in other places and it should be clean)
3. Generate a test pom that includes, as dependencies, all the Lucene Maven artifacts and maybe even compiles a small source tree with it

+1.
this would resolve all my concerns about maven, because we have a way to test that it stands a chance of working *before release*. I hope you don't think I am picking on maven here, I'm equally disturbed about the demo application, and i think it should have a basic unit test too that indexes stuff, fires itself up in jetty, and runs a search.

I totally understand it. I'm not some Maven fanboi (especially as the person who used it to put together the Mahout release, initially). I know its warts, believe me, as I have lived the pain. That being said, for _most_ users (i.e. not necessarily us committers) who are simply using Lucene/Solr within a much broader environment of dependencies, having the JARs available in the Maven repo w/ correct POM files is a very good thing that makes it so much easier for them to do their day-to-day work, and I would hate to see that go away, especially since it is something we have supported for quite some time, albeit with varying levels of success.

Like maven, i know some people don't necessarily like the demo, but as long as we are going to ship it, I want tests so that we don't find it's completely nonfunctional after the release. Unlike maven, i think i stand a chance of actually being able to write the test for this one though.

I've been wanting to do those Maven tests for a while now, but simply can't find the time relative to my other priorities. I guess if the community is saying that if someone doesn't step up, it's going to be dropped, I'll step up. I can likely commit to it before the next release.

-Grant
Re: discussion about release frequency.
(BTW, I love the Maven is Magic (and really any It's magic, therefore I don't like it) reasoning for not liking it, whereby everyone complains that b/c Maven hides a bunch of details from you (i.e. it's magic), therefore you don't like it. At the same time, I'm sure said person doesn't understand every last detail of, oh, I don't know: the CPU, RAM, the Compiler, the JDK, etc. and yet they have no problem using that. In other words, we deal with abstractions all the time. It's fine if you don't get the abstraction or don't personally find it useful, but that doesn't make the abstraction bad.) -Grant

Maven is not bad because it's magic - magic is frigging great - I want my software to be magic - it's bad because every 5 line program from some open source code/project that I have tried to build with it has gone on an absurd downloading spree that takes forever because it's getting many tiny files. This downloading spree never corresponds to the size of the code base I am working with, and always manages to surprise by the amount of time it can slurp up. That's enough for me right there - I've heard others talk of other non magical things that sound scary, but I won't dig any deeper into this absurdity. Either I *really* don't like Maven, or no one knows how to properly set it up - which makes me still not like it. When the magic is absurd, it loses a little of its magic.

Finally, there is a difference between releasing source code, releasing signed jars, and signed maven files, and *just* releasing signed jars. Dropping maven doesn't get you back down to releasing source code. I still think Maven should be a downstream issue. - Mark - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: discussion about release frequency.
On Sep 20, 2010, at 8:58 AM, Grant Ingersoll wrote:

On Sep 20, 2010, at 8:44 AM, Robert Muir wrote:

On Mon, Sep 20, 2010 at 8:23 AM, Grant Ingersoll gsing...@apache.org wrote:

At any rate, the big problem w/ Maven and Lucene is not that generate-maven-artifacts doesn't work, it's that the POM templates aren't kept in sync. However, I think we now have a solution for that thanks to Steve and Robert's work to make it easier to bring Lucene into IntelliJ. In other words, that process does much of what is needed for Maven, so it should be relatively straightforward to have it automatically generate the templates, too. In fact, it would be just as easy for that project to simply produce POM files (which are well understood and have a published spec) instead of creating the IntelliJ project files (which are not well understood, not publicly spec'd, and subject to change w/ every release), and simply have IntelliJ suck in the POM file, since IntelliJ supports that very, very well.

So are you saying, instead of generating IntelliJ configuration, we generate poms, and then we have a route, via maven, for users to automatically set up their IntelliJ (and also eclipse?) IDEs? If so this sounds great to me. Because it would be nice to make the IDE configuration easier, not just for IntelliJ.

Yes. I know for a fact IntelliJ can read the POMs. I use it all the time. Go check out Mahout and point IntelliJ at its POM. You will be up and compiling (in your IDE) in less than 2 minutes, give or take. I imagine Eclipse has similar support.

I should correct myself here. While all of the above is true, it likely still won't work for Lucene b/c the source trees aren't in line w/ Maven conventions. Thus, we will probably still need to output IntelliJ format. I do, however, think it isn't much of a leap to also output a POM file. -Grant - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: discussion about release frequency.
I hope you don't think I am picking on maven here, I'm equally disturbed about the demo application, and I think it should have a basic unit test too that indexes stuff, fires itself up in jetty, and runs a search.

The solr sample app is tested -- I don't know anything about the lucene demo stuff. Most of the solrj tests run from the example schema via jetty and embedded. ryan - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: discussion about release frequency.
On Sep 20, 2010, at 9:00 AM, Mark Miller wrote:

(BTW, I love the Maven is Magic (and really any It's magic, therefore I don't like it) reasoning for not liking it, whereby everyone complains that b/c Maven hides a bunch of details from you (i.e. it's magic), therefore you don't like it. At the same time, I'm sure said person doesn't understand every last detail of, oh, I don't know: the CPU, RAM, the Compiler, the JDK, etc. and yet they have no problem using that. In other words, we deal with abstractions all the time. It's fine if you don't get the abstraction or don't personally find it useful, but that doesn't make the abstraction bad.) -Grant

Maven is not bad because it's magic - magic is frigging great - I want my software to be magic - it's bad because every 5 line program from some open source code/project that I have tried to build with it has gone on an absurd downloading spree that takes forever because it's getting many tiny files. This downloading spree never corresponds to the size of the code base I am working with, and always manages to surprise by the amount of time it can slurp up.

Agreed, but over time, it is lessened by the fact that you already have most common files/jars and furthermore, you only have one copy of them instead of one under every source tree. I think, over time, you actually end up downloading less than with other approaches, and that even includes the downloads one gets when Maven upgrades itself. I do agree, though, that Maven makes you drink the Kool-aid and it doesn't play well with other conventions (although it isn't horrible when it comes to Ant, either). There are plenty of days I hate Maven for what it assumes, but there are also many days when I love the fact that the POM describes my project in one clear, fairly concise, validatable way.

I still think Maven should be a downstream issue.

I don't see how it can be. You have to be a committer to push it to the ASF repository for syndication on iBiblio, etc.
That being said, we really aren't that far from a process that we can have confidence in. -Grant - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
removing payload.xml
I was documenting some field collapsing stuff when I ran across a response like this (using the example data):

grouped:{
  price:[0 TO 99.99]:{
    matches:8,
    doclist:{numFound:2,start:0,docs:[
      { name:CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail, price:74.99},
      { name:CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail, price:74.99}]
  }},

At first I thought something was horribly wrong with the grouping... but on expanding all the fields, I realized that some of the documents were copies of others, with the ID field changed, and a payload field added. I imagine others will make the same mistake, so I'm going to simply move the payload fields to the original docs and remove payload.xml. Does anyone know of any Solr docs that need to be adjusted as part of this? I couldn't find anything on payloads in our wiki. -Yonik http://lucenerevolution.org Lucene/Solr Conference, Boston Oct 7-8 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
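For anyone scripting against this feature, the grouped response shape can be walked like so — a Python sketch that simply mirrors the example structure above (field values abbreviated, not a live Solr call):

```python
# The grouped response above, mirrored as a Python structure (document
# fields abbreviated); the group.query string itself acts as the group key.
response = {
    "grouped": {
        "price:[0 TO 99.99]": {
            "matches": 8,
            "doclist": {
                "numFound": 2,
                "start": 0,
                "docs": [
                    {"name": "CORSAIR ValueSelect 1GB ... - Retail", "price": 74.99},
                    {"name": "CORSAIR ValueSelect 1GB ... - Retail", "price": 74.99},
                ],
            },
        }
    }
}

# Each group carries a total match count plus a page of documents.
for group_key, group in response["grouped"].items():
    doclist = group["doclist"]
    print(f"{group_key}: {group['matches']} matches, "
          f"showing {len(doclist['docs'])} of {doclist['numFound']}")
```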
Re: discussion about release frequency.
On Mon, Sep 20, 2010 at 9:00 AM, Mark Miller markrmil...@gmail.com wrote: I still think Maven should be a downstream issue.

+1 Maven has never been a required part of our releases, and I don't think we should change that. We should also keep in mind that there's nothing really official about a release manager. There's no reason the person(s) that signed the normal release need to be the same person that signs the maven stuff (but it should be a PMC member if it's hosted by the ASF). If there are people around during a release that want to handle the maven stuff, that seems fine. It does *not* have to be the release manager. It seems fine to make reasonable accommodations if some are working on making maven artifacts available at roughly the same time... but if not, it should not hold up the release. -Yonik http://lucenerevolution.org Lucene/Solr Conference, Boston Oct 7-8 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: discussion about release frequency.
On 2010-09-20 15:21, Grant Ingersoll wrote: I do agree, though, that Maven makes you drink the Kool-aid and it doesn't play well with other conventions (although it isn't horrible when it comes to Ant, either). There are plenty of days I hate Maven for what it assumes, but there are also many days when I love the fact that the POM describes my project in one clear, fairly concise, validatable way.

We took the middle road in Nutch - we switched to ant+ivy to manage dependencies. This way we get single copies of all deps, and build.xml is still recognizable and useful. Of course, this doesn't solve the publishing part of Maven functionality (yet).

-- Best regards, Andrzej Bialecki (Information Retrieval, Semantic Web; Embedded Unix, System Integration) http://www.sigram.com Contact: info at sigram dot com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
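For readers who haven't seen the ant+ivy setup: the external jars move out of the source tree into a declarative ivy.xml, roughly along these lines. This sketch is illustrative only — the module info and the single dependency shown are not Nutch's actual file:

```xml
<ivy-module version="2.0">
  <info organisation="org.apache.nutch" module="nutch"/>
  <dependencies>
    <!-- one line per external dependency; rev pins the version -->
    <dependency org="commons-httpclient" name="commons-httpclient" rev="3.1"/>
  </dependencies>
</ivy-module>
```

The `ivy:retrieve` Ant task then resolves everything into a single shared cache, which is how the "single copies of all deps" property above is achieved.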
RE: discussion about release frequency.
On 9/20/2010 at 8:24 AM, Grant Ingersoll wrote: At any rate, the big problem w/ Maven and Lucene is not that generate-maven-artifacts doesn't work, it's that the POM templates aren't kept in sync. However, I think we now have a solution for that thanks to Steve and Robert's work to make it easier to bring Lucene into IntelliJ. In other words, that process does much of what is needed for Maven, so it should be relatively straightforward to have it automatically generate the templates, too. In fact, it would be just as easy for that project to simply produce POM files (which are well understood and have a published spec) instead of creating the IntelliJ project files, which are not well understood and not publicly spec'd and subject to change w/ every release and simply have IntelliJ suck in the POM file since IntelliJ supports that very, very well.

Unfortunately, LUCENE-2611 does not automatically generate IntelliJ setup files - they are static, just like the POM template files. I think it's possible, using an Ant BuildListener-extending class, to do automatic generation, but I haven't attempted it yet. I'll open an issue. Steve
Re: discussion about release frequency.
On Mon, Sep 20, 2010 at 10:18 AM, Grant Ingersoll gsing...@apache.org wrote: On Sep 20, 2010, at 9:55 AM, Yonik Seeley wrote: On Mon, Sep 20, 2010 at 9:00 AM, Mark Miller markrmil...@gmail.com wrote: I still think Maven should be a downstream issue.

+1 Maven has never been a required part of our releases, and I don't think we should change that. We should also keep in mind that there's nothing really official about a release manager. There's no reason the person(s) that signed the normal release need to be the same person that signs the maven stuff (but it should be a PMC member if it's hosted by the ASF). If there are people around during a release that want to handle the maven stuff, that seems fine. It does *not* have to be the release manager. It seems fine to make reasonable accommodations if some are working on making maven artifacts available at roughly the same... but if not, it should not hold up the release.

I completely disagree.

With what part? Do you mean to say you wish to make maven a required part of our releases? If so, perhaps you should call a vote?

It's either a first class citizen or it's not and by moving it out

It is not a first class citizen. Apparently the last Solr release went out w/o working maven support. But it's not quite so black and white either... I see no reason to *remove* maven related stuff from ant (and it's good if people improve it), and I've even applied patches to the maven stuff when supplied by others. -Yonik - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Created: (LUCENE-2657) Auto-generate POM templates from Ant builds
Auto-generate POM templates from Ant builds --- Key: LUCENE-2657 URL: https://issues.apache.org/jira/browse/LUCENE-2657 Project: Lucene - Java Issue Type: Improvement Components: Build Reporter: Steven Rowe Priority: Minor Fix For: 3.1, 4.0 Lucene and Solr modules' POM templates are manually maintained, and so are not always in sync with the dependencies used by the Ant build. It should be possible to auto-generate POM templates using build tools extending Ant's [BuildListener|http://api.dpml.net/ant/1.6.5/org/apache/tools/ant/BuildListener.html] interface, similarly to how the [ant2ide|http://gleamynode.net/articles/2234/] project generates eclipse project files. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Resolved: (LUCENE-2482) Index sorter
[ https://issues.apache.org/jira/browse/LUCENE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki resolved LUCENE-2482. --- Resolution: Fixed Committed in rev. 998948. Index sorter Key: LUCENE-2482 URL: https://issues.apache.org/jira/browse/LUCENE-2482 Project: Lucene - Java Issue Type: New Feature Components: contrib/* Affects Versions: 3.1 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: 3.1 Attachments: indexSorter.patch A tool to sort index according to a float document weight. Documents with high weight are given low document numbers, which means that they will be first evaluated. When using a strategy of early termination of queries (see TimeLimitedCollector) such sorting significantly improves the quality of partial results. (Originally this tool was created by Doug Cutting in Nutch, and used norms as document weights - thus the ordering was limited by the limited resolution of norms. This is a pure Lucene version of the tool, and it uses arbitrary floats from a specified stored field). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Issue Comment Edited: (LUCENE-2657) Auto-generate POM templates from Ant builds
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912487#action_12912487 ] Robert Muir edited comment on LUCENE-2657 at 9/20/10 10:59 AM: ---

How would the BuildListener interface know about dependencies? Does it have some magic way to know this? As an example, let's take modules/analysis/icu, which has 3 dependencies:
* Lucene core itself (implicit from contrib-build.xml)
* external dependency: ICU
* internal dependency: modules/analysis/common

Take a look at modules/analysis/icu's pom.xml, which has:
{noformat}
<dependencies>
  <dependency>
    <groupId>com.ibm.icu</groupId>
    <artifactId>icu4j</artifactId>
    <version>${icu-version}</version>
  </dependency>
</dependencies>
{noformat}
However, our ant builds (that depend on common-build/contrib-build) declare their dependencies in a semi-standard way:
* External dependencies:
{noformat}
<path id="additional.dependencies">
  <fileset dir="lib" includes="icu4j-*.jar"/>
</path>
{noformat}
* Internal dependencies:
{noformat}
<module-uptodate name="analysis/common"
    jarfile="../build/common/lucene-analyzers-common-${version}.jar"
    property="analyzers-common.uptodate"
    classpath.property="analyzers-common.jar"/>
{noformat}
The contrib-build.xml already has a 'dist-maven' target, that is called recursively. Perhaps an alternative would be to improve contrib-build.xml so that it has a 'generate-maven' target, also called recursively. I've already prototyped/proposed in SOLR-2002 that we migrate the solr build to extend the lucene build, so everywhere would use it. Furthermore, couldn't we also make a recursive 'test-maven' target, that generates a maven project to 'download' or whatever it needs, then tries to run all the tests? If somehow the maven is broken, the tests simply won't pass. I realize that running all of a module's tests again redundantly via 'maven' might not be the most elegant solution, but it seems like it would test that everything is working.
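One way to bridge the gap Robert describes — the Ant build names only jar files, while a POM needs full groupId/artifactId/version coordinates — would be a lookup table maintained alongside the build. This is purely a hypothetical sketch for illustration; the mapping table and function are invented, not part of any patch:

```python
# Hypothetical mapping from Ant <fileset> include patterns to Maven
# coordinates; a real generator would need a table like this (or jar
# metadata), since the Ant build knows only file names.
JAR_TO_COORDS = {
    "icu4j-*.jar": ("com.ibm.icu", "icu4j", "${icu-version}"),
}

def pom_dependency(include_pattern):
    """Render one <dependency> element for a fileset include pattern."""
    group_id, artifact_id, version = JAR_TO_COORDS[include_pattern]
    return (
        "<dependency>\n"
        f"  <groupId>{group_id}</groupId>\n"
        f"  <artifactId>{artifact_id}</artifactId>\n"
        f"  <version>{version}</version>\n"
        "</dependency>"
    )

print(pom_dependency("icu4j-*.jar"))
```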
[jira] Commented: (LUCENE-2493) Rename lucene/solr dev jar files to -SNAPSHOT.jar
[ https://issues.apache.org/jira/browse/LUCENE-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912491#action_12912491 ] David Smiley commented on LUCENE-2493: -- Of course we should do this. I've had to do this on my end with that -Ddev.version=4.0-SNAPSHOT trick in the meantime.

Rename lucene/solr dev jar files to -SNAPSHOT.jar - Key: LUCENE-2493 URL: https://issues.apache.org/jira/browse/LUCENE-2493 Project: Lucene - Java Issue Type: Task Reporter: Ryan McKinley Priority: Minor Attachments: LUCENE-2493-dev-to-SNAPSHOT.patch Currently the lucene dev jar files end with '-dev.jar'; this is all fine, but it makes people using maven jump through a few hoops to get the -SNAPSHOT naming convention required by maven. If we want to publish snapshot builds with hudson, we would need to either write some crazy scripts or run the build twice. I suggest we switch to -SNAPSHOT.jar, hopefully for the 3.x branch and for the /trunk (4.x) branch. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: discussion about release frequency.
On Sep 20, 2010, at 10:28 AM, Steven A Rowe wrote:

On 9/20/2010 at 8:24 AM, Grant Ingersoll wrote: At any rate, the big problem w/ Maven and Lucene is not that generate-maven-artifacts doesn't work, it's that the POM templates aren't kept in sync. However, I think we now have a solution for that thanks to Steve and Robert's work to make it easier to bring Lucene into IntelliJ. In other words, that process does much of what is needed for Maven, so it should be relatively straightforward to have it automatically generate the templates, too. In fact, it would be just as easy for that project to simply produce POM files (which are well understood and have a published spec) instead of creating the IntelliJ project files, which are not well understood and not publicly spec'd and subject to change w/ every release and simply have IntelliJ suck in the POM file since IntelliJ supports that very, very well.

Unfortunately, LUCENE-2611 does not automatically generate IntelliJ setup files - they are static, just like the POM template files.

Hmm, hadn't looked that closely. I'd say this is going to suffer the same fate as the POM template files then, and would thus be against including it.

I think it's possible, using an Ant BuildListener-extending class, to do automatic generation, but I haven't attempted it yet. I'll open an issue.

Cool. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2657) Auto-generate POM templates from Ant builds
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912495#action_12912495 ] Steven Rowe commented on LUCENE-2657: - bq. How would the BuildListener interface know about dependencies? Does it have some magic way to know this? BuildListener has hooks for build task onset and completion events (inter alia). ant2ide listens for Javac task completion, and captures from it the source and target directories, as well as the build classpath. You have to invoke compilation from an Ant build in order for this to work. Seems kinda magical to me :) The missing part here is figuring out the maven groupId/artifactId/version, and I *think* this can be dealt with by looking at the manifest in the jar. Maven-produced jars also contain their POMs, and pulling from there would be even simpler. Auto-generate POM templates from Ant builds --- Key: LUCENE-2657 URL: https://issues.apache.org/jira/browse/LUCENE-2657 Project: Lucene - Java Issue Type: Improvement Components: Build Reporter: Steven Rowe Priority: Minor Fix For: 3.1, 4.0 Lucene and Solr modules' POM templates are manually maintained, and so are not always in sync with the dependencies used by the Ant build. It should be possible to auto-generate POM templates using build tools extending Ant's [BuildListener|http://api.dpml.net/ant/1.6.5/org/apache/tools/ant/BuildListener.html] interface, similarly to how the [ant2ide|http://gleamynode.net/articles/2234/] project generates eclipse project files. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
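Pulling coordinates from the embedded POM, as Steve suggests, is straightforward because Maven-built jars carry a pom.properties file under META-INF/maven/. A Python sketch of the idea, using a tiny synthetic jar built in memory for demonstration (the function name is invented):

```python
import io
import zipfile

def read_maven_coords(jar_bytes):
    """Pull groupId/artifactId/version from the pom.properties that
    Maven-built jars embed under META-INF/maven/."""
    coords = {}
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in jar.namelist():
            if name.startswith("META-INF/maven/") and name.endswith("pom.properties"):
                for line in jar.read(name).decode("utf-8").splitlines():
                    if "=" in line and not line.startswith("#"):
                        key, _, value = line.partition("=")
                        coords[key.strip()] = value.strip()
    return coords

# Build a tiny fake jar in memory to exercise the function.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr(
        "META-INF/maven/com.ibm.icu/icu4j/pom.properties",
        "#generated\ngroupId=com.ibm.icu\nartifactId=icu4j\nversion=4.4\n",
    )
print(read_maven_coords(buf.getvalue()))
```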
[jira] Commented: (SOLR-1301) Solr + Hadoop
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912497#action_12912497 ] Jason Rutherglen commented on SOLR-1301: Alexander, I think we'll need to use Hadoop's Mini Cluster in order to have a proper unit test. Adding Jetty as a dependency shouldn't be too much of a problem as Solr already includes a small version of Jetty? That being said, it doesn't mean it's fun to write the unit test. I can assist if needed. Solr + Hadoop - Key: SOLR-1301 URL: https://issues.apache.org/jira/browse/SOLR-1301 Project: Solr Issue Type: Improvement Affects Versions: 1.4 Reporter: Andrzej Bialecki Fix For: Next Attachments: commons-logging-1.0.4.jar, commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, hadoop-0.20.1-core.jar, hadoop.patch, log4j-1.2.15.jar, README.txt, SOLR-1301-hadoop-0-20.patch, SOLR-1301-hadoop-0-20.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SolrRecordWriter.java This patch contains a contrib module that provides distributed indexing (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is twofold: * provide an API that is familiar to Hadoop developers, i.e. that of OutputFormat * avoid unnecessary export and (de)serialization of data maintained on HDFS. SolrOutputFormat consumes data produced by reduce tasks directly, without storing it in intermediate files. Furthermore, by using an EmbeddedSolrServer, the indexing task is split into as many parts as there are reducers, and the data to be indexed is not sent over the network. Design -- Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, which in turn uses SolrRecordWriter to write this data. 
SolrRecordWriter instantiates an EmbeddedSolrServer, and it also instantiates an implementation of SolrDocumentConverter, which is responsible for turning Hadoop (key, value) into a SolrInputDocument. This data is then added to a batch, which is periodically submitted to EmbeddedSolrServer. When the reduce task completes and the OutputFormat is closed, SolrRecordWriter calls commit() and optimize() on the EmbeddedSolrServer. The API provides facilities to specify an arbitrary existing solr.home directory, from which the conf/ and lib/ files will be taken. This process results in the creation of as many partial Solr home directories as there were reduce tasks. The output shards are placed in the output directory on the default filesystem (e.g. HDFS). Such part-N directories can be used to run N shard servers. Additionally, users can specify the number of reduce tasks, in particular 1 reduce task, in which case the output will consist of a single shard. An example application is provided that processes large CSV files and uses this API. It uses custom CSV processing to avoid (de)serialization overhead. This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this issue, you should put it in contrib/hadoop/lib. Note: the development of this patch was sponsored by an anonymous contributor and approved for release under Apache License. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
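The batch-and-commit lifecycle described above can be sketched language-agnostically. The Python below uses invented names and a fake server stand-in, not the actual SolrRecordWriter/EmbeddedSolrServer API:

```python
class FakeServer:
    """Stand-in for an embedded server; just counts what it receives."""
    def __init__(self):
        self.added = 0
        self.committed = False
        self.optimized = False
    def add(self, docs):
        self.added += len(docs)
    def commit(self):
        self.committed = True
    def optimize(self):
        self.optimized = True

class BatchingWriter:
    """Accumulate documents, flush in batches, commit/optimize on close."""
    def __init__(self, server, batch_size=100):
        self.server = server
        self.batch_size = batch_size
        self.batch = []
    def write(self, doc):
        self.batch.append(doc)
        if len(self.batch) >= self.batch_size:
            self.flush()
    def flush(self):
        if self.batch:
            self.server.add(self.batch)
            self.batch = []
    def close(self):
        # Final partial batch, then the commit/optimize described above.
        self.flush()
        self.server.commit()
        self.server.optimize()

server = FakeServer()
writer = BatchingWriter(server, batch_size=2)
for i in range(5):
    writer.write({"id": i})
writer.close()
print(server.added, server.committed, server.optimized)  # 5 True True
```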
RE: discussion about release frequency.
On 9/20/2010 at 11:15 AM, Grant Ingersoll wrote: Unfortunately, LUCENE-2611 does not automatically generate IntelliJ setup files - they are static, just like the POM template files. Hmm, hadn't looked that closely. I'd say this is going to suffer the same fate of the POM template files then and would thus be against including it. It's not quite as bad as the POM template files, since IntelliJ can be told to find all dependencies in a directory, rather than explicitly naming every dependency, and LUCENE-2611 uses that facility just about everywhere (I think the only exception is the JUnit jar test dependency, since the other stuff in the same directory shouldn't necessarily be depended on during testing). So the IntelliJ project files in LUCENE-2611 would continue to work without manual intervention in the face of upgraded and/or additional dependencies, but would require manual effort to sync up with structural changes. While I don't agree that this is a deal-breaker, since the manual intervention required would be fairly minimal, I agree that auto-generation would be a lot more useful than the current static approach. My thought process was that setting this up manually would provide a benchmark for auto-generation; the auto-generated version should not be less functional than the manually generated one. Steve
Re: discussion about release frequency.
On Mon, Sep 20, 2010 at 12:01 PM, Robert Muir rcm...@gmail.com wrote: On Mon, Sep 20, 2010 at 10:29 AM, Yonik Seeley yo...@lucidimagination.com wrote: With what part? Do you mean to say you wish to make maven a required part of our releases? If so, perhaps you should call a vote?

It sounds like maybe we should

I'm not sure it would be useful yet. There is consensus that the process needs to improve. The only concrete 'vote' I could imagine now is to drop maven. ryan - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: discussion about release frequency.
On Mon, Sep 20, 2010 at 12:29 PM, Ryan McKinley ryan...@gmail.com wrote: I'm not sure it would be useful yet. There is consensus that the process needs to improve. The only concrete 'vote' I could imagine now is to drop maven.

I completely agree the process needs to improve, but at the end of the day, if we are planning to support maven officially in releases, I think we should vote on it becoming part of the actual release process. So maybe it's premature to vote on this part, but at the same time, I have concerns about what it would take to 'fully support' maven. For example, if we have to reorganize our source tree to what it wants (src/main/java, src/test/java), and rename our artifacts to what it wants (-SNAPSHOT, etc), this is pretty important. What else might maven 'require'? It's also my understanding that in the past, when maven was upgraded (e.g. Maven 2), it might require you to modify your project in ways such as this to fit its new needs. From what I know of maven, it's quite inflexible about such things, and I want to know what I'm getting into before we claim to 'make maven first class citizen'. -- Robert Muir rcm...@gmail.com
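Worth noting on the source-tree concern: Maven's layout defaults can be overridden per module, so fully supporting it would not necessarily force a reorganization. A sketch of the relevant POM section, assuming Lucene's existing src/java and src/test layout:

```xml
<build>
  <!-- point Maven at the existing Ant-era layout instead of src/main/java -->
  <sourceDirectory>src/java</sourceDirectory>
  <testSourceDirectory>src/test</testSourceDirectory>
</build>
```

The artifact naming convention (-SNAPSHOT) is harder to escape, since the repository layout depends on it.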
[jira] Commented: (SOLR-1722) Allowing changing the special default core name, and as a default default core name, switch to using collection1 rather than DEFAULT_CORE
[ https://issues.apache.org/jira/browse/SOLR-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912573#action_12912573 ] Ephraim Ofir commented on SOLR-1722: Tried using the defaultCoreName attribute on a 2 core setup. After performing a swap, my solr.xml no longer contains the defaultCoreName attribute, and the core which was default is now renamed to , so after restart of the process I can't access it by its former name and can't perform other operations on it such as rename, reload or swap...

Allowing changing the special default core name, and as a default default core name, switch to using collection1 rather than DEFAULT_CORE --- Key: SOLR-1722 URL: https://issues.apache.org/jira/browse/SOLR-1722 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 1.5, 3.1, 4.0 Attachments: SOLR-1722.patch, SOLR-1722.patch see http://search.lucidimagination.com/search/document/f5f2af7c5041a79e/default_core -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: discussion about release frequency.
If somebody reorders the directory structure, I will shout "revert revert revert" :-) We can only "fully" support maven by switching to maven, but most of the core committers don't want this (including me). In my opinion, the approach we had was fine: simply create the jar files as we do for the binary release, but add some (hopefully) automatically generated pom files to it. One thing I don't like in this release process (as it currently works) is non-repeatable maven artifact generation. With maven, it's impossible to regenerate the JAR files with the *same* MD5; even the MD5s of the jar files in the binary release zip are different than the maven ones. If repeatability is not possible, at least the JAR files in the -bin.zip should be identical to the maven released ones! Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, September 20, 2010 9:35 AM To: dev@lucene.apache.org Subject: Re: discussion about release frequency. On Mon, Sep 20, 2010 at 12:29 PM, Ryan McKinley ryan...@gmail.com wrote: I'm not sure it would be useful yet. There is consensus that the process needs to improve. The only concrete 'vote' i could imagine now is to drop maven. I completely agree the process needs to improve, but at the end of the day, if we are planning to support maven officially in releases, i think we should vote on it becoming part of the actual release process. So maybe its premature to vote on this part, but at the same time, I have concerns about what it would take to 'fully support' maven. For example, if we have to reorganize our source tree to what it wants (src/main/java, src/main/test), and rename our artifacts to what it wants (-SNAPSHOT, etc), this is pretty important. what else might maven 'require'. its also my understanding that in the past, when maven is upgraded (e.g.
Maven 2), it might require you to modify your project in ways such as this to fit its new needs. From what I know of maven, its quite inflexible about such things, and I want to know what i'm getting into before we claim to 'make maven first class citizen'. -- Robert Muir rcm...@gmail.com
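Uwe's repeatability complaint is not specific to Maven's plumbing: a jar is a zip archive, and each zip entry records its modification time, so rebuilding byte-identical class files a minute later still yields different archive bytes and therefore a different MD5. A self-contained sketch of just that effect (class and entry names are invented for illustration):

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class JarRepeatability {
    // Build an in-memory "jar" with one entry whose timestamp we control.
    static byte[] buildZip(long entryTime) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ZipOutputStream zos = new ZipOutputStream(bos)) {
                ZipEntry entry = new ZipEntry("A.class");
                entry.setTime(entryTime); // the mtime is stored in the entry header
                zos.putNextEntry(entry);
                zos.write(new byte[] {1, 2, 3, 4}); // identical "class" bytes both times
                zos.closeEntry();
            }
            return bos.toByteArray();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Hex MD5 of a byte array.
    static String md5(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String first  = md5(buildZip(1_000_000_000_000L));
        String second = md5(buildZip(1_000_000_060_000L)); // same payload, one minute later
        // Same content, different build time -> different archive bytes, different MD5.
        System.out.println(first.equals(second)); // prints "false"
    }
}
```

Byte-identical rebuilds would require normalizing entry timestamps and ordering, hence Uwe's fallback suggestion: at least ship the very same jar files in the -bin.zip and in the Maven repository.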
[jira] Commented: (SOLR-1722) Allowing changing the special default core name, and as a default default core name, switch to using collection1 rather than DEFAULT_CORE
[ https://issues.apache.org/jira/browse/SOLR-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912588#action_12912588 ] Mark Miller commented on SOLR-1722: --- Thanks for the report, Ephraim - could you make a new issue for this bug? Allowing changing the special default core name, and as a default default core name, switch to using collection1 rather than DEFAULT_CORE --- Key: SOLR-1722 URL: https://issues.apache.org/jira/browse/SOLR-1722 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 1.5, 3.1, 4.0 Attachments: SOLR-1722.patch, SOLR-1722.patch see http://search.lucidimagination.com/search/document/f5f2af7c5041a79e/default_core
[jira] Updated: (LUCENE-1488) multilingual analyzer based on icu
[ https://issues.apache.org/jira/browse/LUCENE-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1488: -- Fix Version/s: 3.1 multilingual analyzer based on icu -- Key: LUCENE-1488 URL: https://issues.apache.org/jira/browse/LUCENE-1488 Project: Lucene - Java Issue Type: New Feature Components: contrib/analyzers Reporter: Robert Muir Assignee: Robert Muir Priority: Minor Fix For: 3.1, 4.0 Attachments: ICUAnalyzer.patch, LUCENE-1488.patch, LUCENE-1488.patch, LUCENE-1488.patch, LUCENE-1488.patch, LUCENE-1488.txt, LUCENE-1488.txt The standard analyzer in lucene is not exactly unicode-friendly with regards to breaking text into words, especially with respect to non-alphabetic scripts. This is because it is unaware of unicode bounds properties. I actually couldn't figure out how the Thai analyzer could possibly be working until i looked at the jflex rules and saw that codepoint range for most of the Thai block was added to the alphanum specification. defining the exact codepoint ranges like this for every language could help with the problem but you'd basically be reimplementing the bounds properties already stated in the unicode standard. in general it looks like this kind of behavior is bad in lucene for even latin, for instance, the analyzer will break words around accent marks in decomposed form. While most latin letter + accent combinations have composed forms in unicode, some do not. (this is also an issue for asciifoldingfilter i suppose). I've got a partially tested standardanalyzer that uses icu Rule-based BreakIterator instead of jflex. Using this method you can define word boundaries according to the unicode bounds properties. After getting it into some good shape i'd be happy to contribute it for contrib but I wonder if theres a better solution so that out of box lucene will be more friendly to non-ASCII text. 
Unfortunately it seems jflex does not support use of these properties such as [\p{Word_Break = Extend}] so this is probably the major barrier. Thanks, Robert
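The issue proposes ICU's RuleBasedBreakIterator; the same UAX #29-style word segmentation can be illustrated with the JDK's built-in java.text.BreakIterator (this is only a sketch of the boundary-analysis approach, not the patch's actual code):

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordBoundaries {
    // Return the word tokens of `text` according to the locale's word-break rules.
    static List<String> words(String text, Locale locale) {
        BreakIterator bi = BreakIterator.getWordInstance(locale);
        bi.setText(text);
        List<String> out = new ArrayList<>();
        int start = bi.first();
        for (int end = bi.next(); end != BreakIterator.DONE; start = end, end = bi.next()) {
            String segment = text.substring(start, end);
            // Keep segments containing a letter or digit; skip spaces and punctuation.
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(segment);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A naive [A-Za-z0-9]+ tokenizer would split "It's" at the apostrophe;
        // word-break analysis keeps it together.
        System.out.println(words("Hello, world! It's 2010.", Locale.ROOT));
    }
}
```

Unlike hard-coded codepoint ranges, a break iterator can also segment scripts such as Thai that put no spaces between words, which is exactly the case the jflex-based StandardAnalyzer mishandled.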
Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley
On Mon, Sep 20, 2010 at 1:01 PM, Grant Ingersoll gsing...@apache.org wrote: - ''Note: You need committer rights to create a new Lucene release.'' + This page is to help a Lucene/Solr committer create a new release (you need committer rights for some of the steps to create an official release). It does not reflect official release policy - many of the items may be optional, or may be modified as necessary. I think putting this up on the wiki is a bad idea. We should strive to have a repeatable release process. By saying it is up to the person who happens to be doing the release is just asking for less quality in our releases. If you don't think you can follow the release process, then you shouldn't be doing the release. And, if we as a community can't define a repeatable release process, then we shouldn't have a release either. Calling something that anyone can go and edit and add their best ideas to official is silly. It does not list iron-clad requirements - it is there simply to help. That's pretty obvious by looking at the huge list of content on that page. I'd rather spend my time writing code and improving the projects rather than engaging in bureaucratic exercises. -Yonik - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley
sounds like we might need to create an official release policy, vote on it, and commit it. On Mon, Sep 20, 2010 at 1:07 PM, Yonik Seeley yo...@lucidimagination.comwrote: On Mon, Sep 20, 2010 at 1:01 PM, Grant Ingersoll gsing...@apache.org wrote: - ''Note: You need committer rights to create a new Lucene release.'' + This page is to help a Lucene/Solr committer create a new release (you need committer rights for some of the steps to create an official release). It does not reflect official release policy - many of the items may be optional, or may be modified as necessary. I think putting this up on the wiki is a bad idea. We should strive to have a repeatable release process. By saying it is up to the person who happens to be doing the release is just asking for less quality in our releases. If you don't think you can follow the release process, then you shouldn't be doing the release. And, if we as a community can't define a repeatable release process, then we shouldn't have a release either. Calling something that anyone can go and edit and add their best ideas to official is silly. It does not list iron-clad requirements - it is there simply to help. That's pretty obvious by looking at the huge list of content on that page. I'd rather spend my time writing code and improving the projects rather than engaging in bureaucratic exercises. -Yonik - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Robert Muir rcm...@gmail.com
[jira] Resolved: (LUCENE-2656) If tests fail, don't report about unclosed resources
[ https://issues.apache.org/jira/browse/LUCENE-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-2656. - Assignee: Robert Muir Resolution: Fixed Committed revision 999016, 999021 (3x) If tests fail, don't report about unclosed resources Key: LUCENE-2656 URL: https://issues.apache.org/jira/browse/LUCENE-2656 Project: Lucene - Java Issue Type: Test Components: Tests Affects Versions: 3.1, 4.0 Reporter: Robert Muir Assignee: Robert Muir Fix For: 3.1, 4.0 Attachments: LUCENE-2656.patch LuceneTestCase ensures in afterClass() if you closed all your directories, which in turn will check if you have closed any open files. This is good, as a test will fail if we have resource leaks. But if a test truly fails, this is just confusing, because its usually not going to make it to the part of its code where it would call .close() So, if any tests fail, I think we should omit this check in afterClass()
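The shape of the fix can be shown with a toy stand-in for LuceneTestCase's bookkeeping (all names here are invented; the real class tracks the directories the tests opened):

```java
// Toy stand-in for the LUCENE-2656 behavior: only report leaked resources
// from afterClass() when the tests themselves passed.
public class ResourceCheckSketch {
    static boolean testsFailed = false;
    static int openDirectories = 0; // incremented on open, decremented on close()

    static void onTestFailure() { testsFailed = true; }

    /** Returns a leak report, or null when there is nothing (useful) to say. */
    static String afterClass() {
        if (testsFailed) {
            return null; // the failing test never reached close(); a leak report is just noise
        }
        return openDirectories > 0 ? "unclosed resources: " + openDirectories : null;
    }

    public static void main(String[] args) {
        openDirectories = 1;
        System.out.println(afterClass()); // prints "unclosed resources: 1"
        onTestFailure();
        System.out.println(afterClass()); // prints "null" -- check suppressed after a failure
    }
}
```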
Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley
On Sep 20, 2010, at 1:07 PM, Yonik Seeley wrote: On Mon, Sep 20, 2010 at 1:01 PM, Grant Ingersoll gsing...@apache.org wrote: - ''Note: You need committer rights to create a new Lucene release.'' + This page is to help a Lucene/Solr committer create a new release (you need committer rights for some of the steps to create an official release). It does not reflect official release policy - many of the items may be optional, or may be modified as necessary. I think putting this up on the wiki is a bad idea. We should strive to have a repeatable release process. By saying it is up to the person who happens to be doing the release is just asking for less quality in our releases. If you don't think you can follow the release process, then you shouldn't be doing the release. And, if we as a community can't define a repeatable release process, then we shouldn't have a release either. Calling something that anyone can go and edit and add their best ideas to official is silly. Fine, let's lock it down then. It does not list iron-clad requirements - it is there simply to help. Again, I disagree. Having done a number of releases, it would simply be impossible without it, no matter how long the list is. Unless, of course, all you want is the release to be the source, but even that is in doubt b/c how would I know where to upload it to? For instance, how do you know which Ant target really gets you the right thing to distribute? That's pretty obvious by looking at the huge list of content on that page. I'd rather spend my time writing code and improving the projects rather than engaging in bureaucratic exercises. Well, part of an improved projects is a release that people can consistently rely on. If there is too much chaff in the current release, fine, let's get rid of it or automate it. However, to suggest that a written out release process is not needed or is subject to whatever the RM wants is just plain ludicrous. 
Are you really arguing that we, the writers of a massively used and deployed open source library, should have a release process that is subject to the whims of whoever happens to be doing it on that given day? Regardless as to whether you want to or not, we as a community need to make sure the community can rely on the results of us writing the code. -Grant
Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley
On Mon, Sep 20, 2010 at 1:49 PM, Grant Ingersoll gsing...@apache.org wrote: On Sep 20, 2010, at 1:07 PM, Yonik Seeley wrote: It does not list iron-clad requirements - it is there simply to help. Again, I disagree. Having done a number of releases, it would simply be impossible without it Usefulness certainly does not imply officialness and certainly does not imply that everything on there is mandatory. We've never needed anything quite so iron-clad in the past - we were able to use our judgment to adapt as necessary. And individuals went and updated that page with helpful things because no one was under the impression that anything there was binding. -Yonik
RE: discussion about release frequency.
"but again, i have serious questions about maven in general." Maybe you just need to drink the Maven Koolaid. Unless they have something stronger... ;-) Karl From: ext Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, September 20, 2010 1:08 PM To: dev@lucene.apache.org Subject: Re: discussion about release frequency. On Mon, Sep 20, 2010 at 12:54 PM, Uwe Schindler u...@thetaphi.de wrote: If somebody reorders the directory structure, I will shout "revert revert revert" :-) I wouldn't shout revert revert revert if by renaming stuff from src/java to src/main/java etc, Grant's idea would work, in that we still use ant for our build, but we have some way to automagically generate IDE configuration files for eclipse, idea, netbeans, emacs, whatever, via some maven tool. If this was the benefit, and the tradeoff being more difficult merging, and having to ignore some path segments on existing patches, I might consider it worth the cost. but again, i have serious questions about maven in general. for example, what if I wanted to add/modify a contrib that depends on a library that is not mavenized? Is it my responsibility to mavenize that dependency, too? Does it make the release artifact invalid? is it a valid reason against adding that contrib, since its dependencies are not all mavenized? the fact that maven acts like a computer virus, but requires special things of its hosts, means that i am pretty hesitant to vote for full support of it without knowing exactly what the tradeoffs are. -- Robert Muir rcm...@gmail.com
Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley
On Sep 20, 2010, at 2:04 PM, Yonik Seeley wrote: On Mon, Sep 20, 2010 at 1:49 PM, Grant Ingersoll gsing...@apache.org wrote: On Sep 20, 2010, at 1:07 PM, Yonik Seeley wrote: It does not list iron-clad requirements - it is there simply to help. Again, I disagree. Having done a number of releases, it would simply be impossible without it Usefulness certainly does not imply officialness and certainly does not imply that everything on there is mandatory. We've never needed anything quite so iron-clad in the past - we were able to use our judgment to adapt as necessary. And individuals went and updated that page with helpful things because no one was under the impression that anything there was binding. Of course it makes sense for it to be updatable to reflect that things change, servers get moved, ant targets get improved, but your message, on the heels of the Maven discussion, was interpreted by me (and please correct me if I'm wrong) to presume that you are saying that it is alright for the RM to decide what artifacts should be released. So, if that's not the case, then fine, I agree, but if it is, then no, I don't think this is the right message to put on the page. And it certainly isn't up to you alone to decide by placing it on the Wiki as a trivial update. -Grant
Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley
On Mon, Sep 20, 2010 at 2:16 PM, Grant Ingersoll gsing...@apache.org wrote: And it certainly isn't up to you alone to decide by placing it on the Wiki as a trivial update. Most of the updates to that page were made w/o consensus, just as mine was. It's a guide - nothing more. Again, if you feel differently, point to where we voted on that as official policy, or call a vote to make it official policy. -Yonik
Re: discussion about release frequency.
On Sep 20, 2010, at 1:07 PM, Robert Muir wrote: On Mon, Sep 20, 2010 at 12:54 PM, Uwe Schindler u...@thetaphi.de wrote: If somebody reorders the directory structure, I will shout "revert revert revert" :-) I wouldn't shout revert revert revert if by renaming stuff from src/java to src/main/java etc, Grant's idea would work, in that we still use ant for our build, but we have some way to automagically generate IDE configuration files for eclipse, idea, netbeans, emacs, whatever, via some maven tool. If this was the benefit, and the tradeoff being more difficult merging, and having to ignore some path segments on existing patches, I might consider it worth the cost. but again, i have serious questions about maven in general. for example, what if I wanted to add/modify a contrib that depends on a library that is not mavenized? Is it my responsibility to mavenize that dependency, too? Does it make the release artifact invalid? is it a valid reason against adding that contrib, since its dependencies are not all mavenized? Typically, this is done by adding the library in question to the release, renamed appropriately. For instance, in Solr, we had a trunk based version of Commons CSV at one point, so we put it up w/ the Solr artifacts and had the POM reflect that. But yeah, it can be a pain. the fact that maven acts like a computer virus, but requires special things of its hosts, means that i am pretty hesitant to vote for full support of it without knowing exactly what the tradeoffs are. I'm not saying we have to support it, but, in my view, it's pretty hard to take back a feature, admittedly only for some, that we have supported for a long time.
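Concretely, the republishing Grant describes looks like an ordinary dependency under the hosting project's own coordinates. The artifactId below matches the historical solr-commons-csv artifact; the version is illustrative:

```xml
<!-- Sketch: a trunk snapshot of Commons CSV republished under Solr's own
     groupId so Solr's POM can depend on it (version illustrative). -->
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-commons-csv</artifactId>
  <version>1.4</version>
</dependency>
```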
Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley
On Sep 20, 2010, at 2:21 PM, Yonik Seeley wrote: On Mon, Sep 20, 2010 at 2:16 PM, Grant Ingersoll gsing...@apache.org wrote: And it certainly isn't up to you alone to decide by placing it on the Wiki as a trivial update. Most of the updates to that page were made w/o consensus, just as mine was. You know there is a difference. In the past, updates were made to the steps involved and subsequent RM's went and followed them or improved them. Your update was to say throw all that work out, if you so desire, and do what you want. While, yes, I will agree it is not official, it is the de facto standard by which we have done releases and RM's have always worked to it. So, yes, we can argue the semantics of a wiki page, but the intent of that page, IMO, is that the RM follow it and that has, AFAICT, always been how RMs have acted when doing releases. -Grant
Re: discussion about release frequency.
On Mon, Sep 20, 2010 at 2:31 PM, Grant Ingersoll gsing...@apache.org wrote: Typically, this is done by adding the library in question to the release, renamed appropriately. For instance, in Solr, we had a trunk based version of Commons CSV at one point, so we put it up w/ the Solr artifacts and had the POM reflect that. But yeah, it can be a pain. I don't understand this, if I, as a lucene committer, can arbitrarily publish commons CSV artifacts under maven, without being a commons CSV committer, then why does someone have to be a lucene committer to publish maven artifacts?! Furthermore, if this is possible, then why does lucene itself have to support maven, if someone else (e.g. hibernate) can simply download our jar files and do the same? I'm not saying we have to support it, but, in my view, it's pretty hard to take back a feature, admittedly only for some, that we have supported for a long time. I'm not sure we supported it, it seems to be a broken feature in nearly every release. -- Robert Muir rcm...@gmail.com
Re: discussion about release frequency.
On Sep 20, 2010, at 2:37 PM, Robert Muir wrote: On Mon, Sep 20, 2010 at 2:31 PM, Grant Ingersoll gsing...@apache.org wrote: Typically, this is done by adding the library in question to the release, renamed appropriately. For instance, in Solr, we had a trunk based version of Commons CSV at one point, so we put it up w/ the Solr artifacts and had the POM reflect that. But yeah, it can be a pain. I don't understand this, if I, as a lucene committer, can arbitrarily publish commons CSV artifacts under maven, without being a commons CSV committer, then why does someone have to be a lucene committer to publish maven artifacts?! It's under the Solr area, not the commons CSV area. Furthermore, if this is possible, then why does lucene itself have to support maven, if someone else (e.g. hibernate) can simply download our jar files and do the same? I'm not saying we have to support it, but, in my view, it's pretty hard to take back a feature, admittedly only for some, that we have supported for a long time. I'm not sure we supported it, it seems to be a broken feature in nearly every release. Nah, some times some pieces are broken, but the core one always works, AFAICT. ;-)
Re: discussion about release frequency.
On Mon, Sep 20, 2010 at 2:43 PM, Grant Ingersoll gsing...@apache.org wrote: It's under the Solr area, not the commons CSV area. sure, but this doesn't answer the question. if other projects can do this same trick, why do we need to do any maven at all? we can just let those that want maven support provide it themselves. Ultimately this would probably mean they do a better job of it anyway, since they care about it working for their project. -- Robert Muir rcm...@gmail.com
Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley
On Sep 20, 2010, at 2:46 PM, Yonik Seeley wrote: On Mon, Sep 20, 2010 at 2:36 PM, Grant Ingersoll gsing...@apache.org wrote: On Sep 20, 2010, at 2:21 PM, Yonik Seeley wrote: On Mon, Sep 20, 2010 at 2:16 PM, Grant Ingersoll gsing...@apache.org wrote: While, yes, I will agree it is not official, it is the de facto standard by which we have done releases and RM's have always worked to it. I'd wager that there has never been a single lucene or solr release that followed every single instruction to the T. Which means that people need to use their heads and understand that many of the items may be optional, or may be modified as necessary. You can't point at the guide as a *reason* to do something, only *how* to do something. If I knew someone would point to it and say you must do XYZ because it's on that HOWTO then I would have vetoed most changes to that page. As I have said for the 3rd time, of course I get that people need to be flexible and there has always been an implied use your head. But, as I said, given you wrote it on the heels of the discussion around Maven and that you think we shouldn't publish Maven artifacts, I think it is clear you intend it to imply that the RM gets to chose what artifacts are released. Is that not the case? - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: discussion about release frequency.
On Sep 20, 2010, at 2:47 PM, Robert Muir wrote: On Mon, Sep 20, 2010 at 2:43 PM, Grant Ingersoll gsing...@apache.org wrote: It's under the Solr area, not the commons CSV area. sure, but this doesn't answer the question. if other projects can do this same trick, why do we need to do any maven at all? we can just let those that want maven support, provide it themselves. Ultimately this would probably mean they do a better job of it anyway, since they care about it working for their project to work. Not following. Joe Schmoe w/ project X doesn't have the right to go publish artifacts at org.apache.lucene.XXX in the iBiblio repository. And, in many cases, we may not have the right to publish others, but for Apache projects, we can. Otherwise, in the past, I've often asked the dependency authors to produce them. Most people will if it means they are getting a wider distribution. In practice, it rarely is an issue. -Grant
Re: discussion about release frequency.
On Mon, Sep 20, 2010 at 3:30 PM, Grant Ingersoll gsing...@apache.org wrote: Not following. Joe Schmoe w/ project X doesn't have the right to go publish artifacts at org.apache.lucene.XXX in the iBiblio repository. And, in many cases, we may not have the right to publish others, but for Apache projects, we can. Otherwise, in the past, I've often asked the dependency authors to produce them. Most people will if it means they are getting a wider distribution. In practice, it rarely is an issue. Right, but why can't Joe Schmoe make joe.schmoe.luceneMaven.XXX in the iBiblio repository? At the end of the day, I'm trying to figure out if we can push maven downstream as others have suggested, and it sounds like we can. -- Robert Muir rcm...@gmail.com
Re: discussion about release frequency.
On 9/20/10 3:36 PM, Robert Muir wrote: right, but why can't Joe Schmoe make joe.schmoe.luceneMaven.XXX in the iBiblio repository? That sounds enticing - someone else can step up to be the authority.
Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley
On Mon, Sep 20, 2010 at 3:27 PM, Grant Ingersoll gsing...@apache.org wrote: On Sep 20, 2010, at 2:46 PM, Yonik Seeley wrote: On Mon, Sep 20, 2010 at 2:36 PM, Grant Ingersoll gsing...@apache.org wrote: On Sep 20, 2010, at 2:21 PM, Yonik Seeley wrote: On Mon, Sep 20, 2010 at 2:16 PM, Grant Ingersoll gsing...@apache.org wrote: While, yes, I will agree it is not official, it is the de facto standard by which we have done releases and RM's have always worked to it. I'd wager that there has never been a single lucene or solr release that followed every single instruction to the T. Which means that people need to use their heads and understand that many of the items may be optional, or may be modified as necessary. You can't point at the guide as a *reason* to do something, only *how* to do something. If I knew someone would point to it and say you must do XYZ because it's on that HOWTO then I would have vetoed most changes to that page. As I have said for the 3rd time, of course I get that people need to be flexible and there has always been an implied use your head. But, as I said, given you wrote it on the heels of the discussion around Maven and that you think we shouldn't publish Maven artifacts, I think it is clear you intend it to imply that the RM gets to chose what artifacts are released. Is that not the case? IMO, the RM has no more power than any other PMC member. But when there are a lot of optional things on the list... I guess the volunteers doing the work get to decide what parts they want to do. The PMC as a whole gets to decide to release artifacts or not. I am also re-asserting (as I have asserted in the past) that the Maven artifacts are *optional*. 
We've discussed maven not being mandatory before: http://search.lucidimagination.com/search/document/bd618c89a4d458dc/lucene_2_9_again http://search.lucidimagination.com/search/document/3b98fa9ec3073936 -Yonik
Re: discussion about release frequency.
On Sep 20, 2010, at 3:36 PM, Robert Muir wrote: On Mon, Sep 20, 2010 at 3:30 PM, Grant Ingersoll gsing...@apache.org wrote: Not following. Joe Schmoe w/ project X doesn't have the right to go publish artifacts at org.apache.lucene.XXX in the iBiblio repository. And, in many cases, we may not have the right to publish others, but for Apache projects, we can. Otherwise, in the past, I've often asked the dependency authors to produce them. Most people will if it means they are getting a wider distribution. In practice, it rarely is an issue. right but why cant joe shmoe make joe.schmoe.luceneMaven.XXX in the iBiblio repository? At the end of the day, I'm trying to figure out if we can push maven downstream as others have suggested, and it sounds like we can. Why don't we just leave this as this: Those of us who want Maven supported as part of the release need to get our stuff together by the next release or else it will be dropped. That means making sure the artifacts are correct and easily testable/reproducible. If we can't do that, then I agree, it should be a downstream effort, at least until we all realize how many people actually use it and then we revisit it at the next release. -Grant
Re: discussion about release frequency.
On Mon, Sep 20, 2010 at 3:46 PM, Grant Ingersoll gsing...@apache.org wrote: Why don't we just leave this as this: Those of us who want Maven supported as part of the release need to get our stuff together by the next release or else it will be dropped. That means making sure the artifacts are correct and easily testable/reproducible. If we can't do that, then I agree, it should be a downstream effort, at least until we all realize how many people actually use it and then we revisit it at the next release. But I'm not sure this is the best solution? If we can push this downstream, so that the release manager has less to worry about (even with testable artifacts, the publication etc.), why wouldn't we do that instead? -- Robert Muir rcm...@gmail.com
[jira] Created: (LUCENE-2658) TestIndexWriterExceptions random failure: AIOOBE in ByteBlockPool.allocSlice
TestIndexWriterExceptions random failure: AIOOBE in ByteBlockPool.allocSlice Key: LUCENE-2658 URL: https://issues.apache.org/jira/browse/LUCENE-2658 Project: Lucene - Java Issue Type: Bug Components: Index Reporter: Robert Muir TestIndexWriterExceptions threw this today, and it's reproducible
[jira] Updated: (LUCENE-2658) TestIndexWriterExceptions random failure: AIOOBE in ByteBlockPool.allocSlice
[ https://issues.apache.org/jira/browse/LUCENE-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2658:

    Attachment: LUCENE-2658_environment.patch

Attached are my current modifications to trunk (completely unrelated to this failure). Because I have a single test seed that controls all behavior, I want to make sure the random seed I give you will actually work. If you apply the patch, just run:

ant test-core -Dtestcase=TestIndexWriterExceptions -Dtests.seed=1285011726042

{noformat}
junit-sequential:
    [junit] Testsuite: org.apache.lucene.index.TestIndexWriterExceptions
    [junit] Testcase: testRandomExceptionsThreads(org.apache.lucene.index.TestIndexWriterExceptions): FAILED
    [junit] thread Indexer 0: hit unexpected failure
    [junit] junit.framework.AssertionFailedError: thread Indexer 0: hit unexpected failure
    [junit]     at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:773)
    [junit]     at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:746)
    [junit]     at org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:195)
    [junit]
    [junit] Tests run: 2, Failures: 1, Errors: 0, Time elapsed: 1.257 sec
    [junit]
    [junit] ------------- Standard Output ---------------
    [junit] Indexer 2: unexpected exception3
    [junit] java.lang.ArrayIndexOutOfBoundsException: 483
    [junit]     at org.apache.lucene.index.ByteSliceReader.nextSlice(ByteSliceReader.java:108)
    [junit]     at org.apache.lucene.index.ByteSliceReader.writeTo(ByteSliceReader.java:90)
    [junit]     at org.apache.lucene.index.TermVectorsTermsWriterPerField.finish(TermVectorsTermsWriterPerField.java:186)
    [junit]     at org.apache.lucene.index.TermsHashPerField.finish(TermsHashPerField.java:552)
    [junit]     at org.apache.lucene.index.TermsHashPerField.finish(TermsHashPerField.java:554)
    [junit]     at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:208)
    [junit]     at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:248)
    [junit]     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:839)
    [junit]     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:820)
    [junit]     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2162)
    [junit]     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2134)
    [junit]     at org.apache.lucene.index.TestIndexWriterExceptions$IndexerThread.run(TestIndexWriterExceptions.java:98)
    [junit] Indexer 0: unexpected exception3
    [junit] java.lang.ArrayIndexOutOfBoundsException: 507
    [junit]     at org.apache.lucene.index.ByteSliceReader.nextSlice(ByteSliceReader.java:108)
    [junit]     at org.apache.lucene.index.ByteSliceReader.writeTo(ByteSliceReader.java:90)
    [junit]     at org.apache.lucene.index.TermVectorsTermsWriterPerField.finish(TermVectorsTermsWriterPerField.java:186)
    [junit]     at org.apache.lucene.index.TermsHashPerField.finish(TermsHashPerField.java:552)
    [junit]     at org.apache.lucene.index.TermsHashPerField.finish(TermsHashPerField.java:554)
    [junit]     at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:208)
    [junit]     at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:248)
    [junit]     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:839)
    [junit]     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:820)
    [junit]     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2162)
    [junit]     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2134)
    [junit]     at org.apache.lucene.index.TestIndexWriterExceptions$IndexerThread.run(TestIndexWriterExceptions.java:98)
    [junit] Indexer 1: unexpected exception3
    [junit] java.lang.ArrayIndexOutOfBoundsException: 15
    [junit]     at org.apache.lucene.index.ByteBlockPool.allocSlice(ByteBlockPool.java:122)
    [junit]     at org.apache.lucene.index.TermsHashPerField.writeByte(TermsHashPerField.java:526)
    [junit]     at org.apache.lucene.index.TermsHashPerField.writeVInt(TermsHashPerField.java:547)
    [junit]     at org.apache.lucene.index.TermVectorsTermsWriterPerField.newTerm(TermVectorsTermsWriterPerField.java:225)
    [junit]     at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:375)
    [junit]     at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:513)
    [junit]     at
Re: discussion about release frequency.
On Sep 20, 2010, at 3:49 PM, Robert Muir wrote:

> On Mon, Sep 20, 2010 at 3:46 PM, Grant Ingersoll gsing...@apache.org wrote:
>> Why don't we just leave this as this: Those of us who want Maven supported as part of the release need to get our stuff together by the next release or else it will be dropped. That means making sure the artifacts are correct and easily testable/reproducible. If we can't do that, then I agree, it should be a downstream effort, at least until we all realize how many people actually use it and then we revisit it at the next release.
>
> But I'm not sure this is the best solution? If we can push this downstream, so that the release manager has less to worry about (even with testable artifacts etc, the publication etc), why wouldn't we do that instead?

Because it's not authoritative. How would our users know which one is the official one? By publishing it under the ASF one with our signatures we are saying this is our official version. We would never claim that the Solr Commons CSV one is the official Commons jar, it's just the official one that Solr officially uses. It's a big difference.

Besides, it's not like the iBiblio repo is open to anyone. You have to apply and you have to have authority to write to it. For the ASF, there is a whole sync process whereby iBiblio syncs with an ASF version. In other words, we are the only ones who can publish it to the same space where it is currently published.

-Grant
Re: discussion about release frequency.
On Mon, Sep 20, 2010 at 4:11 PM, Grant Ingersoll gsing...@apache.org wrote:

> Because it's not authoritative. How would our users know which one is the official one? By publishing it under the ASF one with our signatures we are saying this is our official version. We would never claim that the Solr Commons CSV one is the official Commons jar, it's just the official one that Solr officially uses. It's a big difference. Besides, it's not like the iBiblio repo is open to anyone. You have to apply and you have to have authority to write to it. For the ASF, there is a whole sync process whereby iBiblio syncs with an ASF version. In other words, we are the only ones who can publish it to the same space where it is currently published.

This authoritativeness comes with a significant cost: the complexity of Maven in our release process. I'm not convinced it's worth this cost, and before we decide to have Maven as part of the release, I'd like for there to be an actual vote.

Sorry to change my tone, but I was under the impression we needed a Lucene committer to do all this releasing work to support Maven; it seems that this is not the case, and other options are available.

-- Robert Muir rcm...@gmail.com
Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley
On Sep 20, 2010, at 3:46 PM, Yonik Seeley wrote:

On Mon, Sep 20, 2010 at 3:27 PM, Grant Ingersoll gsing...@apache.org wrote:

On Sep 20, 2010, at 2:46 PM, Yonik Seeley wrote:

On Mon, Sep 20, 2010 at 2:36 PM, Grant Ingersoll gsing...@apache.org wrote:

On Sep 20, 2010, at 2:21 PM, Yonik Seeley wrote:

On Mon, Sep 20, 2010 at 2:16 PM, Grant Ingersoll gsing...@apache.org wrote:

While, yes, I will agree it is not official, it is the de facto standard by which we have done releases, and RMs have always worked to it.

I'd wager that there has never been a single Lucene or Solr release that followed every single instruction to the T. Which means that people need to use their heads and understand that many of the items may be optional, or may be modified as necessary. You can't point at the guide as a *reason* to do something, only *how* to do something. If I knew someone would point to it and say "you must do XYZ because it's on that HOWTO" then I would have vetoed most changes to that page.

As I have said for the 3rd time, of course I get that people need to be flexible, and there has always been an implied "use your head". But, as I said, given you wrote it on the heels of the discussion around Maven and that you think we shouldn't publish Maven artifacts, I think it is clear you intend it to imply that the RM gets to choose what artifacts are released. Is that not the case?

IMO, the RM has no more power than any other PMC member. But when there are a lot of optional things on the list...

Perhaps you should itemize all the items that are optional and then we can mark them as such. Is uploading the artifacts (Maven or not) optional? Perhaps next time I do a release I'll just skip that one. Is updating the website?

OK, so I'll give you the FreshMeat and the ServerSide posts, etc. I guess the volunteers doing the work get to decide what parts they want to do. I'd agree that there are some things that should be optional, especially the post-release items.

Some things, however, are not. Perhaps we should just list out what we view as being required and which ones are not.

The PMC as a whole gets to decide to release artifacts or not.

Of course. I don't see how that is relevant to the question I asked.

I am also re-asserting (as I have asserted in the past) that the Maven artifacts are *optional*. We've discussed Maven not being mandatory before:
http://search.lucidimagination.com/search/document/bd618c89a4d458dc/lucene_2_9_again
http://search.lucidimagination.com/search/document/3b98fa9ec3073936

You asserting in previous threads that Maven is optional does not make it optional. AFAICT, we have done them for as long as we have said we would do them. I'm fine with us as a community dropping Maven releases if that is what is decided. I am absolutely not fine with the RM deciding to drop them based on what he feels like doing as part of that release. If you don't have time to do the required items, then you shouldn't be an RM.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
Re: discussion about release frequency.
On Sep 20, 2010, at 4:15 PM, Robert Muir wrote:

> On Mon, Sep 20, 2010 at 4:11 PM, Grant Ingersoll gsing...@apache.org wrote:
>> Because it's not authoritative. How would our users know which one is the official one? By publishing it under the ASF one with our signatures we are saying this is our official version. We would never claim that the Solr Commons CSV one is the official Commons jar, it's just the official one that Solr officially uses. It's a big difference. Besides, it's not like the iBiblio repo is open to anyone. You have to apply and you have to have authority to write to it. For the ASF, there is a whole sync process whereby iBiblio syncs with an ASF version. In other words, we are the only ones who can publish it to the same space where it is currently published.
>
> This authoritativeness comes with a significant cost: the complexity of Maven in our release process. I'm not convinced it's worth this cost, and before we decide to have Maven as part of the release, I'd like for there to be an actual vote.

I agree. But, like I said, if those who want it step up and make it fully supported, then there is no more cost than uploading a few extra artifacts, so what's the extra cost? As usual in open source, why don't we just leave it to those who do the work? If no one steps up and fixes it, then it doesn't get included.

> Sorry to change my tone, but I was under the impression we needed a Lucene committer to do all this releasing work to support Maven; it seems that this is not the case, and other options are available.

I'm sorry, I don't see the other options. I think it does need to be done by a Lucene committer to be an official Lucene artifact. OK, well, I suppose some other ASF person could do it, but short of a benevolent volunteer to do so, I don't think there are other options.

-Grant
Re: discussion about release frequency.
On Mon, Sep 20, 2010 at 4:20 PM, Grant Ingersoll gsing...@apache.org wrote:

> I'm sorry, I don't see the other options. I think it does need to be done by a Lucene committer to be an official Lucene artifact. OK, well, I suppose some other ASF person could do it, but short of a benevolent volunteer to do so, I don't think there are other options.

I will quote Ryan here: "The artifacts are the identical .jar files put into a special directory structure."

Therefore, if we release without Maven, the jar files are signed by our release key. This is authoritative enough; Maven does check signatures, correct? I'm not buying the authoritative argument; it seems like any old joker can take our signed jars and put them in Maven themselves, without us having to do any work.

-- Robert Muir rcm...@gmail.com
Re: [Lucene-java Wiki] Trivial Update of ReleaseTodo by YonikSeeley
On Mon, Sep 20, 2010 at 4:17 PM, Grant Ingersoll gsing...@apache.org wrote:

> On Sep 20, 2010, at 3:46 PM, Yonik Seeley wrote:
>> I am also re-asserting (as I have asserted in the past) that the Maven artifacts are *optional*. We've discussed maven not being mandatory before:
>> http://search.lucidimagination.com/search/document/bd618c89a4d458dc/lucene_2_9_again
>> http://search.lucidimagination.com/search/document/3b98fa9ec3073936
>
> You asserting in previous threads that Maven is optional does not make it optional.

I *think* that's a roundabout way of saying that you do think it's mandatory. But you've been unable to point to how it became mandatory... and there seems to be a distinct lack of consensus over it. Certainly makes it sound optional.

-Yonik
[jira] Commented: (LUCENE-2613) spatial random test failure (TestCartesian)
[ https://issues.apache.org/jira/browse/LUCENE-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912736#action_12912736 ]

Lee Cooper commented on LUCENE-2613:

This is the only reference similar to a problem I am experiencing. I am using Lucene 2.9.2, and I am getting the exception below when using a GeoHashDistanceFilter. It works most of the time, but under certain conditions, and sometimes intermittently, Lucene throws this exception. Can someone tell me why the exception might be thrown, and is there anything I can do to stop it happening? I see that this issue is still open; will its resolution solve my problem?

{noformat}
java.lang.IllegalArgumentException: null iterator
    at org.apache.lucene.search.FilteredDocIdSetIterator.<init>(FilteredDocIdSetIterator.java:38)
    at org.apache.lucene.search.FilteredDocIdSet$1.<init>(FilteredDocIdSet.java:72)
    at org.apache.lucene.search.FilteredDocIdSet.iterator(FilteredDocIdSet.java:71)
    at org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:279)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:254)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:173)
    at org.apache.lucene.search.Searcher.search(Searcher.java:181)
{noformat}

spatial random test failure (TestCartesian)
-------------------------------------------
    Key: LUCENE-2613
    URL: https://issues.apache.org/jira/browse/LUCENE-2613
    Project: Lucene - Java
    Issue Type: Bug
    Components: contrib/spatial
    Affects Versions: 4.0
    Reporter: Robert Muir
    Attachments: LUCENE-2613.patch

{noformat}
java.lang.IllegalArgumentException: null iterator
    at org.apache.lucene.search.FilteredDocIdSetIterator.<init>(FilteredDocIdSetIterator.java:38)
    at org.apache.lucene.search.FilteredDocIdSet$1.<init>(FilteredDocIdSet.java:72)
    at org.apache.lucene.search.FilteredDocIdSet.iterator(FilteredDocIdSet.java:72)
    at org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:241)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:216)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:168)
    at org.apache.lucene.spatial.tier.TestCartesian.testGeoHashRange(TestCartesian.java:542)
{noformat}

Plug in a seed of -6954859807298077232L to newRandom to reproduce. Didn't test to see if it affects 3.x also.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
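[Editor's note] The "null iterator" in both traces above appears to come from the DocIdSet contract: in Lucene 2.9.x, DocIdSet.iterator() is allowed to return null to mean "matches no documents", and FilteredDocIdSet's anonymous subclass forwards that null straight into FilteredDocIdSetIterator's constructor, which rejects it. A minimal sketch of the missing guard, using invented stand-in types rather than the real Lucene classes (the interface below and NullIteratorGuard are hypothetical):

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the relevant part of Lucene 2.9.x's DocIdSet
// contract: iterator() may return null to signal "this set is empty".
interface DocIdSet {
    Iterator<Integer> iterator(); // may return null
}

public class NullIteratorGuard {
    // The guard that is absent on the failing path: mapping a null delegate
    // iterator to an empty iterator instead of forwarding it into code that
    // throws IllegalArgumentException("null iterator").
    static Iterator<Integer> safeIterator(DocIdSet set) {
        Iterator<Integer> it = set.iterator();
        return (it == null) ? Collections.<Integer>emptyIterator() : it;
    }

    public static void main(String[] args) {
        DocIdSet emptySet = () -> null;                         // signals empty via null
        DocIdSet someDocs = () -> List.of(3, 7, 42).iterator(); // ordinary set

        System.out.println(safeIterator(emptySet).hasNext());   // empty, no exception
        System.out.println(safeIterator(someDocs).next());
    }
}
```

A caller written this way degrades to zero hits instead of blowing up when the filter's delegate set is empty, which is what the intermittent behavior Lee describes would suggest.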
Document links
I've been looking at graph databases recently (Neo4j, OrientDB, InfiniteGraph) as a faster alternative to relational stores. I notice they either embed Lucene for indexing node properties or (in the case of OrientDB) are talking about doing this. I think their fundamental performance advantage over relational stores is that they don't have to dereference foreign keys in a b-tree index to get from a source node to a target node. Instead they use internally generated IDs to act like pointers, with more-or-less direct references between nodes/vertexes. As a result they can follow links very quickly.

This got me thinking: could Lucene adopt the idea of creating links between documents that were equally fast, using Lucene doc ids? Maybe the user API would look something like this:

    indexWriter.addLink(fromDocId, toDocId);
    DocIdSet reader.getInboundLinks(docId);
    DocIdSet reader.getOutboundLinks(docId);

Internally, a new index file structure would be needed to record link info. Any recorded links that connect documents from different segments would need careful adjustment of referenced link IDs when segments merge and Lucene doc IDs are shuffled.

As well as handling typical graphs (social networks, web data), this could potentially be used to support tagging operations, where apps could create tag documents and then link them to existing documents being tagged, without having to update the target doc. There are probably a ton of applications for this stuff.

I see the graph DBs busy recreating transactional support, indexes, segment merging, etc., and it seems to me that Lucene has a pretty good head start with this stuff. Anyone else think this might be an area worth exploring?

Cheers
Mark
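[Editor's note] For a sense of the bookkeeping the proposal implies, here is a hypothetical in-memory sketch of such a link store (the class DocLinkStore and its remap method are invented for illustration and are not Lucene code). It keeps both directions so inbound and outbound lookups are cheap, and shows where the doc-id renumbering on segment merge would bite:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory prototype of the proposed doc-to-doc link API.
public class DocLinkStore {
    private final Map<Integer, Set<Integer>> outbound = new HashMap<>();
    private final Map<Integer, Set<Integer>> inbound = new HashMap<>();

    // Corresponds to the proposed indexWriter.addLink(fromDocId, toDocId).
    public void addLink(int fromDocId, int toDocId) {
        outbound.computeIfAbsent(fromDocId, k -> new TreeSet<>()).add(toDocId);
        inbound.computeIfAbsent(toDocId, k -> new TreeSet<>()).add(fromDocId);
    }

    public Set<Integer> getOutboundLinks(int docId) {
        return outbound.getOrDefault(docId, Collections.emptySet());
    }

    public Set<Integer> getInboundLinks(int docId) {
        return inbound.getOrDefault(docId, Collections.emptySet());
    }

    // The hard part the mail points out: on segment merge, Lucene renumbers
    // doc ids, so every stored id must be rewritten through the merge's
    // old-id -> new-id map (ids absent from the map are kept unchanged).
    public void remap(Map<Integer, Integer> oldToNew) {
        Map<Integer, Set<Integer>> rebuilt = new HashMap<>();
        for (Map.Entry<Integer, Set<Integer>> e : outbound.entrySet()) {
            int from = oldToNew.getOrDefault(e.getKey(), e.getKey());
            for (int to : e.getValue()) {
                rebuilt.computeIfAbsent(from, k -> new TreeSet<>())
                       .add(oldToNew.getOrDefault(to, to));
            }
        }
        outbound.clear();
        outbound.putAll(rebuilt);
        inbound.clear(); // derive the reverse direction from the rebuilt forward links
        for (Map.Entry<Integer, Set<Integer>> e : outbound.entrySet()) {
            for (int to : e.getValue()) {
                inbound.computeIfAbsent(to, k -> new TreeSet<>()).add(e.getKey());
            }
        }
    }
}
```

A real on-disk version would store the adjacency lists per segment and remap only at merge time, which is exactly the complexity the proposal flags.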
[jira] Commented: (SOLR-1568) Implement Spatial Filter
[ https://issues.apache.org/jira/browse/SOLR-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12912780#action_12912780 ] Bill Bell commented on SOLR-1568: - The calculations of distance appears to be off. Note: The radius of the sphere to be used when calculating distances on a sphere (i.e. haversine). Default is the Earth's mean radius in kilometers (see org.apache.solr.search.function.distance.Constants.EARTH_MEAN_RADIUS_KM) which is set to 3,958.761458084784856. Most applications will not need to set this. The radius of the earth in KM is 6371.009 km (≈3958.761 mi). Also filtering distance appears to be off - example data: 45.17614,-93.87341 to 44.9369054,-91.3929348 Approx 137 miles Google. 169 miles = 220 kilometers http://../solr/select?fl=*,scorestart=0rows=10q={!sfilt%20fl=store_lat_lon}qt=standardpt=44.9369054,-91.3929348d=280sort=dist(2,store,vector(44.9369054,-91.3929348)) asc Nothing shows. d=285 shows results. This is off by a lot. Bill Implement Spatial Filter Key: SOLR-1568 URL: https://issues.apache.org/jira/browse/SOLR-1568 Project: Solr Issue Type: New Feature Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 3.1, 4.0 Attachments: CartesianTierQParserPlugin.java, SOLR-1568.Mattmann.031010.patch.txt, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch, SOLR-1568.patch Given an index with spatial information (either as a geohash, SpatialTileField (see SOLR-1586) or just two lat/lon pairs), we should be able to pass in a filter query that takes in the field name, lat, lon and distance and produces an appropriate Filter (i.e. one that is aware of the underlying field type for use by Solr. 
The interface _could_ look like: {code} fq={!sfilt dist=20}location:49.32,-79.0 {code} or it could be: {code} fq={!sfilt lat=49.32 lon=-79.0 f=location dist=20} {code} or: {code} fq={!sfilt p=49.32,-79.0 f=location dist=20} {code} or: {code} fq={!sfilt lat=49.32,-79.0 fl=lat,lon dist=20} {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Created: (SOLR-2125) Spatial filter is not accurate
Spatial filter is not accurate

Key: SOLR-2125 URL: https://issues.apache.org/jira/browse/SOLR-2125 Project: Solr Issue Type: Bug Components: Build Affects Versions: 1.5 Reporter: Bill Bell

The calculation of distance appears to be off. The docs note: "The radius of the sphere to be used when calculating distances on a sphere (i.e. haversine). Default is the Earth's mean radius in kilometers (see org.apache.solr.search.function.distance.Constants.EARTH_MEAN_RADIUS_KM) which is set to 3,958.761458084784856. Most applications will not need to set this." But the radius of the earth is 6371.009 km (≈3958.761 mi), so that constant is the mean radius in miles, not kilometers.

Also, filtering distance appears to be off. Example data: 45.17614,-93.87341 to 44.9369054,-91.3929348 is approx 137 miles per Google; 137 miles ≈ 220 kilometers.

http://../solr/select?fl=*,score&start=0&rows=10&q={!sfilt%20fl=store_lat_lon}&qt=standard&pt=44.9369054,-91.3929348&d=280&sort=dist(2,store,vector(44.9369054,-91.3929348)) asc

Nothing shows. d=285 shows results. This is off by a lot.

Bill
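The unit mix-up in the quoted docs can be checked directly: the constant documented as the "mean radius in kilometers" is numerically the mean radius in miles. A quick sanity check (plain Python; the IUGG mean radius and the exact km-per-mile factor are the only inputs):

```python
EARTH_MEAN_RADIUS_KM = 6371.009   # IUGG mean earth radius
KM_PER_MILE = 1.609344            # exact, international mile

radius_mi = EARTH_MEAN_RADIUS_KM / KM_PER_MILE
print(round(radius_mi, 3))        # 3958.761 -- the value the docs label "km"
```

So EARTH_MEAN_RADIUS_KM as documented (3,958.761458...) is the miles figure; using it as kilometers understates every distance by a factor of ~1.61.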
[jira] Commented: (SOLR-2125) Spatial filter is not accurate
[ https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912785#action_12912785 ] Yonik Seeley commented on SOLR-2125:

Just in time, Bill! I just started looking at spatial stuff today since I'm planning on putting some of it in my Lucene Revolution presentation. I've seen some tweets about people having difficulties, and I've had some problems when I tried stuff myself. Anyway, I'm going to try to clean up some of this stuff over the next few days and make the wiki a bit more user-oriented - an extra pair of eyeballs would be welcome!
[jira] Commented: (SOLR-2125) Spatial filter is not accurate
[ https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912799#action_12912799 ] Yonik Seeley commented on SOLR-2125:

Hmmm, well, I just corrected one bug that hard-coded the distance in miles, but it was just a check to see if we crossed the poles; I don't think that change alone will fix your issue. Earlier today I switched around some fields/field-types in the example schema, so store is now of latlon type, and it's the only location type (having multiple is just confusing). So, just looking at the bounding box now, here's the URL from your example:

http://localhost:8983/solr/select?fl=*,score&start=0&rows=10&q={!sfilt%20fl=store}&qt=standard&pt=44.9369054,-91.3929348&d=280&debugQuery=true

And I can see that the generated bounding box is:

+store_0_coordinate:[43.129843715965166 TO 46.688683890119314] +store_1_coordinate:[-93.83266208454557 TO -88.79716545231159]

which just misses the longitude of the point on the document, -93.87341. Can anyone point to a webapp for checking arbitrary distances between two lat/lon points?
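That generated box can be cross-checked with some rough arithmetic (an illustrative sketch assuming a spherical earth and a simple degrees-per-km conversion for latitude; variable names are mine, not Solr's). A 280 km radius should span about ±2.52° of latitude around the query point, but the box above spans only about ±1.75°, i.e. a box for roughly a 195 km radius:

```python
import math

EARTH_RADIUS_KM = 6371.009
d_km = 280.0
pt_lat = 44.9369054  # latitude of the query point (pt=...)

# Latitude half-span a d_km radius should produce (1 deg lat ~ R*pi/180 km).
expected_half_deg = math.degrees(d_km / EARTH_RADIUS_KM)

# Half-span of the box Solr actually generated (upper lat bound minus pt lat).
generated_half_deg = 46.688683890119314 - pt_lat

# Radius in km that the generated box actually corresponds to.
implied_d_km = math.radians(generated_half_deg) * EARTH_RADIUS_KM

print(round(expected_half_deg, 3))   # ~2.518
print(round(generated_half_deg, 3))  # ~1.752
print(round(implied_d_km, 1))        # ~194.8 -- nowhere near 280
```

If this reading is right, the box is being built for a radius of roughly 195 km when 280 km was requested, which is consistent with the d=284/285 threshold Bill observed.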
[jira] Commented: (SOLR-2125) Spatial filter is not accurate
[ https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912808#action_12912808 ] Yonik Seeley commented on SOLR-2125:

Bill, I found two online distance calculators that both give the distance between the points you provided as 196 km:

http://www.movable-type.co.uk/scripts/latlong.html
http://www.es.flinders.edu.au/~mattom/Utilities/distance.html

Now... the distance of 280 km you provided should certainly still encompass that, so we still have a bug anyway.
[jira] Commented: (SOLR-2125) Spatial filter is not accurate
[ https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912820#action_12912820 ] Bill Bell commented on SOLR-2125:

Yes, there is still a bug, but most of what I was saying was right. I just did a quick maps.google.com directions check, putting the two lat,long pairs in the two fields: 137 miles = 220.480128 kilometers (Google).

It is 196.6 km using http://www.movable-type.co.uk/scripts/latlong.html - see it on the map: http://www.movable-type.co.uk/scripts/latlong-map.html?lat1=45.176140&long1=-93.873410&lat2=44.936905&long2=-91.392935

Distance: 196.6 km; Initial bearing: 096°53′44″; Final bearing: 098°39′05″; Midpoint: 45°03′48″N, 092°37′50″W. As the crow flies should be less distance than driving, which makes sense.

I even used the JS function from http://www.movable-type.co.uk/scripts/latlong.html:

function toRad(a) { return (a*Math.PI/180); };
function hsin(lat1,lon1,lat2,lon2) {
  var R = 6371; // km
  var dLat = toRad(lat2-lat1);
  var dLon = toRad(lon2-lon1);
  var a = Math.sin(dLat/2) * Math.sin(dLat/2) +
          Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) *
          Math.sin(dLon/2) * Math.sin(dLon/2);
  var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
  var d = R * c;
  return d;
};

as a JavaScript function while looping through the results, since I cannot find a way to output the distance automatically from the XML coming back from Solr:

<script>document.write(hsin(lat, lon, solr.lat, solr.lon));</script>

I kept playing with the d= value (in km) to see at what value the filter no longer returns the result, with sort=dist(2,store,vector(44.9369054,-91.3929348)) asc: d=285 shows the document; d=284 does not.
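Bill's hsin function ports directly to Python, which makes the 196.6 km figure for the two example points easy to confirm (a sketch using the same formula and R = 6371 km as the quoted JS):

```python
import math

def hsin(lat1, lon1, lat2, lon2, r_km=6371.0):
    """Haversine great-circle distance in km, mirroring the quoted JS."""
    d_lat = math.radians(lat2 - lat1)
    d_lon = math.radians(lon2 - lon1)
    a = (math.sin(d_lat / 2) ** 2 +
         math.cos(math.radians(lat1)) * math.cos(math.radians(lat2)) *
         math.sin(d_lon / 2) ** 2)
    return r_km * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

d = hsin(45.17614, -93.87341, 44.9369054, -91.3929348)
print(round(d, 1))  # 196.6 -- matches the online calculators
```

So the true great-circle distance is ~196.6 km, yet the filter only admits the document at d=285; the filter's effective radius is well short of what is requested.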
[jira] Commented: (SOLR-2125) Spatial filter is not accurate
[ https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912822#action_12912822 ] Chris Male commented on SOLR-2125:

The sorting won't be the issue though, surely? The bug seems to be in the bounding box generation that Yonik pointed out. I can imagine there will be some rounding issues at different places, but nothing that would generate such a discrepancy.
[jira] Commented: (SOLR-2125) Spatial filter is not accurate
[ https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912823#action_12912823 ] Chris A. Mattmann commented on SOLR-2125:

Distance using the haversine function is extremely sensitive to what spatial reference system the data was recorded in. WGS84 isn't particularly great with long distances. The PostGIS in Action book has a really good explanation of this.
[jira] Commented: (SOLR-2125) Spatial filter is not accurate
[ https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912827#action_12912827 ] Yonik Seeley commented on SOLR-2125:

bq. Distance using the Haversine function is extremely sensitive to what spatial reference system the data was recorded in. WGS84 isn't particular great with long distances.

I know nothing on this topic, but an error of 45% at 200 km? I'm pretty certain there is a bug here that has nothing to do with the accuracy of spatial reference systems.
[jira] Commented: (SOLR-2125) Spatial filter is not accurate
[ https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912830#action_12912830 ] Chris A. Mattmann commented on SOLR-2125:

Umm, well, if you know nothing, then how are you pretty sure? And yes, the error bars are fairly high for the great-circle distance.
[jira] Commented: (SOLR-2125) Spatial filter is not accurate
[ https://issues.apache.org/jira/browse/SOLR-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912833#action_12912833 ] Bill Bell commented on SOLR-2125:

The calculation using the hsin JavaScript is more accurate than our algorithm? Chris, a few percentage points maybe - but not 45%. I will look into it some more tonight. It can't be that complicated.

Bill
[jira] Updated: (LUCENE-2659) lucenetestcase ease of use improvements
[ https://issues.apache.org/jira/browse/LUCENE-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2659:
--------------------------------

Attachment: LUCENE-2659.patch

lucenetestcase ease of use improvements
---------------------------------------

Key: LUCENE-2659
URL: https://issues.apache.org/jira/browse/LUCENE-2659
Project: Lucene - Java
Issue Type: Test
Components: Tests
Reporter: Robert Muir
Fix For: 3.1, 4.0
Attachments: LUCENE-2659.patch

I started working on this in LUCENE-2658; here is the finished patch. There are some problems with LuceneTestCase:

* a test's beforeClass, or the test itself (its @Befores and its method), might have some random behavior, but only the latter can be reproduced with -Dtests.seed
* if you want to do things in beforeClass, you have to use a different API: newDirectory(random) instead of newDirectory(), etc.
* for a new user, the current output can be verbose, confusing, and overwhelming.

So, I refactored this class to address these problems. A class still needs two seeds internally, as beforeClass will only run once while the methods or setUp() might run many times, especially when increasing iterations. But LuceneTestCase deals with this, and the seed is 128-bit (a UUID): the MSB is initialized in beforeClass, and the LSB is varied for each method run. If you provide a seed with a -D, both are fixed to the UUID you provided.

I fixed the API to be consistent, so you should be able to migrate a test from setUp() to beforeClass() [JUnit 3 to JUnit 4] without changing parameters. The codec, locale, and timezone are only printed once at the end if any tests fail, as they are per-class anyway (set up in beforeClass).

Finally, when a test fails, you get a single "reproduce with" command line you can copy and paste, so you don't have to spend time trying to figure out what the command line should be.
{noformat}
[junit] Tests run: 2, Failures: 2, Errors: 0, Time elapsed: 0.197 sec
[junit]
[junit] ------------- Standard Output ---------------
[junit] NOTE: reproduce with: ant test -Dtestcase=TestExample -Dtestmethod=testMethodA -Dtests.seed=a51e707b-6550-7800-9f8c-72622d14bf5f
[junit] NOTE: reproduce with: ant test -Dtestcase=TestExample -Dtestmethod=testMethodB -Dtests.seed=a51e707b-6550-7800-f7eb-2efca3820738
[junit] NOTE: test params are: codec=PreFlex, locale=ar_LY, timezone=Etc/UCT
[junit] ------------- ---------------
[junit] Test org.apache.lucene.util.TestExample FAILED
{noformat}
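The two-level seed scheme described above can be sketched in a few lines: one 128-bit UUID, the most-significant half fixed once per class, the least-significant half varied per method run, with an explicit -Dtests.seed pinning both. The field and method names below are illustrative, not LuceneTestCase's actual internals:

```java
import java.util.Random;
import java.util.UUID;

class SeedSketch {
    static long classSeed; // most-significant 64 bits, fixed once per class

    // Runs once per class; an explicit seed property pins the MSB.
    static void beforeClass(String seedProperty) {
        classSeed = (seedProperty != null)
            ? UUID.fromString(seedProperty).getMostSignificantBits()
            : new Random().nextLong();
    }

    // Each method run gets its own LSB; MSB + LSB together form the
    // single UUID printed in the "reproduce with" line.
    static UUID seedForMethod(String seedProperty, long methodLsb) {
        if (seedProperty != null) {
            return UUID.fromString(seedProperty); // -Dtests.seed fixes both halves
        }
        return new UUID(classSeed, methodLsb);
    }
}
```

This is why one copy-pasteable seed is enough to replay both the class-level randomness and the per-method randomness of a failing test.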
[jira] Created: (LUCENE-2659) lucenetestcase ease of use improvements
lucenetestcase ease of use improvements
---------------------------------------

Key: LUCENE-2659
URL: https://issues.apache.org/jira/browse/LUCENE-2659
Project: Lucene - Java
Issue Type: Test
Components: Tests
Reporter: Robert Muir
Fix For: 3.1, 4.0
Attachments: LUCENE-2659.patch
Re: [jira] Resolved: (LUCENE-2482) Index sorter
What is the philosophy about the 3.x branch? This is an all-new feature added to 3.x.

Andrzej Bialecki (JIRA) wrote:

[ https://issues.apache.org/jira/browse/LUCENE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki resolved LUCENE-2482.
--------------------------------------

Resolution: Fixed

Committed in rev. 998948.

Index sorter
------------

Key: LUCENE-2482
URL: https://issues.apache.org/jira/browse/LUCENE-2482
Project: Lucene - Java
Issue Type: New Feature
Components: contrib/*
Affects Versions: 3.1
Reporter: Andrzej Bialecki
Assignee: Andrzej Bialecki
Fix For: 3.1
Attachments: indexSorter.patch

A tool to sort an index according to a float document weight. Documents with high weight are given low document numbers, which means that they will be evaluated first. When using a strategy of early termination of queries (see TimeLimitedCollector), such sorting significantly improves the quality of partial results. (Originally this tool was created by Doug Cutting in Nutch, and used norms as document weights; thus the ordering was limited by the limited resolution of norms. This is a pure Lucene version of the tool, and it uses arbitrary floats from a specified stored field.)
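The core idea of the sorter can be sketched as an old-to-new docID mapping in which the highest stored weight receives the lowest new docID. This is an illustration of the ordering only, not the patch's actual code:

```java
import java.util.Arrays;
import java.util.Comparator;

class IndexSorterSketch {
    // weights[i] is the stored float weight of old docID i.
    // Returns map where map[oldId] == newId.
    static int[] oldToNew(float[] weights) {
        Integer[] byWeight = new Integer[weights.length];
        for (int i = 0; i < weights.length; i++) {
            byWeight[i] = i;
        }
        // Sort old docIDs by weight, highest first, so the heaviest
        // document receives new docID 0 and is evaluated first.
        Arrays.sort(byWeight, Comparator.<Integer>comparingDouble(d -> weights[d]).reversed());
        int[] map = new int[weights.length];
        for (int newId = 0; newId < byWeight.length; newId++) {
            map[byWeight[newId]] = newId;
        }
        return map;
    }
}
```

With such a mapping, a collector that stops early (as with TimeLimitedCollector) has already seen the highest-weight documents, which is what makes the truncated result set useful.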
[jira] Created: (SOLR-2126) highlighting multicore searches relying on q.alt gives NPE
highlighting multicore searches relying on q.alt gives NPE
----------------------------------------------------------

Key: SOLR-2126
URL: https://issues.apache.org/jira/browse/SOLR-2126
Project: Solr
Issue Type: Bug
Components: highlighter
Affects Versions: 1.4
Environment: I'm on a trunk release from early March, but I also just verified this on LucidWorks 1.4, which I have handy.
Reporter: David Smiley
Priority: Minor

To reproduce this, run the example multicore Solr configuration, then index each example document into each core. Now we're going to do a distributed search with q.alt=*:* and defType=dismax. Normally these would be set in a request handler config as defaults, but we'll put them in the URL to make it clear they need to be set, and because the default multicore example config is so bare-bones that it doesn't already have a dismax setup. We're going to enable highlighting.

http://localhost:8983/solr/core0/select?hl=true&q.alt=*:*&defType=dismax&shards=localhost:8983/solr/core0,localhost:8983/solr/core1

java.lang.NullPointerException
	at org.apache.solr.handler.component.HighlightComponent.finishStage(HighlightComponent.java:130)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:296)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
	at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
	at org.mortbay.jetty.Server.handle(Server.java:285)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
	at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

Since I happen to be using edismax in trunk, it was easy for me to work around this problem by renaming the q.alt parameter in my request handler defaults to just q, since edismax understands raw Lucene queries.
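The trace points at finishStage merging per-shard highlighting responses; the pattern suggests a shard answering the q.alt-only query returned no highlighting section, and the merge dereferences it unguarded. A hedged sketch of the guard that avoids that failure mode follows; the names are hypothetical, not the actual HighlightComponent code:

```java
import java.util.HashMap;
import java.util.Map;

class HighlightMergeSketch {
    // Illustrative merge of per-shard highlighting sections keyed by doc ID.
    // The null check is the point: skipping an absent section instead of
    // dereferencing it is what prevents the NPE seen in the trace above.
    @SafeVarargs
    static Map<String, Object> mergeShardHighlights(Map<String, Object>... perShard) {
        Map<String, Object> merged = new HashMap<>();
        for (Map<String, Object> shard : perShard) {
            if (shard == null) {
                continue; // shard returned no highlighting section for this query
            }
            merged.putAll(shard);
        }
        return merged;
    }
}
```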
[jira] Commented: (LUCENE-2650) improve windows defaults in FSDirectory
[ https://issues.apache.org/jira/browse/LUCENE-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912838#action_12912838 ]

Robert Muir commented on LUCENE-2650:
-------------------------------------

I'm going to add the extra safety here for cloned MMapIndexInputs as a separate commit from changing the defaults (in case we have to revert the defaults). It's also good to backport (unlike the defaults).

improve windows defaults in FSDirectory
---------------------------------------

Key: LUCENE-2650
URL: https://issues.apache.org/jira/browse/LUCENE-2650
Project: Lucene - Java
Issue Type: Improvement
Components: Store
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Robert Muir
Fix For: 4.0
Attachments: LUCENE-2650.patch, LUCENE-2650.patch

Currently Windows defaults to SimpleFSDirectory, but this is a problem due to the synchronization. I have been benchmarking queries *sequentially* and was pretty surprised at how much faster MMapDirectory is, for example for cases that do many seeks. I think we should change the defaults for Windows as such:

if (WINDOWS and UNMAP_SUPPORTED and 64-bit)
  use MMapDirectory
else
  use SimpleFSDirectory

I think we should just consider doing this for 4.0 only and see how it goes.
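The proposed default above amounts to a three-way condition. A minimal sketch of that decision, with the directory choice returned as a name; this helper is an illustrative stand-in, not the logic actually inside FSDirectory.open:

```java
class DirectoryDefaultSketch {
    // Mirrors the pseudo-code in the issue: on Windows, only prefer
    // MMapDirectory when unmapping works and the JVM is 64-bit, since
    // mmap on a 32-bit address space risks exhausting virtual memory.
    static String pickDefault(boolean windows, boolean unmapSupported, boolean is64Bit) {
        if (windows && unmapSupported && is64Bit) {
            return "MMapDirectory"; // avoids SimpleFSDirectory's synchronized reads
        }
        return "SimpleFSDirectory";
    }
}
```

The separate-commit plan in the comment makes sense with this shape: the clone-safety fix is independent of which branch of the condition ships as the default, so reverting the default would not drag the safety fix with it.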