[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229928#comment-13229928 ] Harley Parks commented on SOLR-2155: Sorry about the editing, but thanks for the feedback. So perhaps there is something wrong with my configuration, since the search results return the geohash string rather than lat,long. The maven build went great. I did finally figure out how to use the geofilt function using pt and d. I am reindexing each time; yes, I delete the data folder and reindex. I am placing the jar file into tomcat's solr/lib folder. After a restart, and after changing solrconfig and schema, the geohash string is displayed, not the lat,long.

Schema field type:
<fieldType name="geohash" class="solr2155.solr.schema.GeoHashField" length="12"/>

This is the field:
<field name="GeoTagGeoHash" type="geohash" indexed="true" stored="true" multiValued="true"/>

This is the info from solr/admin/ field types: GEOHASH; Field Type: geohash; Fields: GEOTAGGEOHASH; Tokenized: true; Class Name: solr2155.solr.schema.GeoHashField; Index Analyzer: org.apache.solr.analysis.TokenizerChain; Tokenizer Class: solr2155.solr.schema.GeoHashField$1; Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer

Still, the query returns values like:
<arr name="GeoTagGeoHash">
  <str>rw3sh9g8c6mx</str>
  <str>rw3f3xc9dnh3</str>
  <str>rw3ckbue74y7</str>
</arr>

So, if this is not right, is there anything I can do to troubleshoot?
Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (e.g. via a gazetteer) occurring on free text. None, one, or many geospatial locations might be extracted from any given document, and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the GeoHash based work in Lucene/Solr with a geohash prefix based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4, depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th. -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
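The prefix scheme described above builds on standard geohash encoding: five bits per base-32 character, alternating longitude/latitude bisections, so a shared prefix implies spatial containment, which is the property GeoHashPrefixFilter exploits when seeking through the terms index. As a hedged illustration (a from-scratch sketch, not the GeoHashUtils code in the patch), a minimal encoder looks like this:

```java
// Minimal geohash encoder: each output character narrows the bounding box,
// interleaving longitude and latitude bisections, five bits per character.
public class GeoHashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    public static String encode(double lat, double lon, int precision) {
        double minLat = -90, maxLat = 90, minLon = -180, maxLon = 180;
        StringBuilder out = new StringBuilder(precision);
        boolean evenBit = true; // even bits split longitude, odd bits latitude
        int bits = 0, ch = 0;
        while (out.length() < precision) {
            ch <<= 1;
            if (evenBit) {
                double mid = (minLon + maxLon) / 2;
                if (lon >= mid) { ch |= 1; minLon = mid; } else { maxLon = mid; }
            } else {
                double mid = (minLat + maxLat) / 2;
                if (lat >= mid) { ch |= 1; minLat = mid; } else { maxLat = mid; }
            }
            evenBit = !evenBit;
            if (++bits == 5) { // five bits accumulated: emit one base-32 character
                out.append(BASE32.charAt(ch));
                bits = 0;
                ch = 0;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Classic reference point: lat 42.6, lon -5.6 encodes to "ezs42".
        System.out.println(encode(42.6, -5.6, 5)); // prints ezs42
    }
}
```

Longer prefixes give finer grids; a 12-character geohash (the length=12 in Harley's field type above) pins a point down to roughly centimeter scale.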
[jira] [Commented] (LUCENE-3869) possible hang in UIMATypeAwareAnalyzerTest
[ https://issues.apache.org/jira/browse/LUCENE-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229939#comment-13229939 ] Tommaso Teofili commented on LUCENE-3869: - I tried to reproduce that many times (same command/seed) but with no luck so far. Which environment are you running, Robert? possible hang in UIMATypeAwareAnalyzerTest -- Key: LUCENE-3869 URL: https://issues.apache.org/jira/browse/LUCENE-3869 Project: Lucene - Java Issue Type: Bug Components: modules/analysis Affects Versions: 4.0 Reporter: Robert Muir Just testing an unrelated patch, I was hung (with 100% cpu) in UIMATypeAwareAnalyzerTest. I'll attach a stacktrace at the moment of the hang. The fact that we get a seed in the actual stacktraces for cases like this is awesome! Thanks Dawid! I don't think it reproduces 100%, but I'll try beasting this seed to see if I can reproduce the hang: should be 'ant test -Dtestcase=UIMATypeAwareAnalyzerTest -Dtests.seed=-262aada3325aa87a:-44863926cf5c87e9:5c8c471d901b98bd' from what I can see.
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229945#comment-13229945 ] Bill Bell commented on SOLR-2155: - David, What is an example URL call for a multiValued field? Does geofilt work?

/select?q=*:*&fq={!geofilt}&sort=geodist() asc&sfield=store_hash&d=10

Or do we need to use gh_geofilt, like this?

/select?q=*:*&fq={!gh_geofilt}&sort=geodist() asc&sfield=store_hash&d=10

Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley
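The `&` separators in request URLs like the ones above are frequently eaten by mail archiving, so it can help to build the query string programmatically with explicit separators and percent-encoding. A hedged sketch (the host, port, point, and distance values are illustrative assumptions, not taken from Bill's setup):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Builds a Solr select URL with explicit '&' separators and percent-encoded
// values, so local-param syntax like {!geofilt} survives transport intact.
public class GeofiltUrlSketch {
    public static String buildUrl() throws Exception {
        String[][] params = {
            {"q", "*:*"},
            {"fq", "{!geofilt}"},       // spatial filter via local params
            {"sfield", "store_hash"},   // the location field
            {"pt", "45.15,-93.85"},     // center point (illustrative value)
            {"d", "10"},                // distance (illustrative value)
            {"sort", "geodist() asc"},
        };
        StringBuilder url = new StringBuilder("http://localhost:8983/solr/select");
        char sep = '?';
        for (String[] p : params) {
            url.append(sep).append(p[0]).append('=')
               .append(URLEncoder.encode(p[1], StandardCharsets.UTF_8.name()));
            sep = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildUrl());
    }
}
```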
[jira] [Reopened] (LUCENE-3871) Check what's up with stack traces being insane.
[ https://issues.apache.org/jira/browse/LUCENE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reopened LUCENE-3871: - Check what's up with stack traces being insane. --- Key: LUCENE-3871 URL: https://issues.apache.org/jira/browse/LUCENE-3871 Project: Lucene - Java Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 4.0
[jira] [Commented] (LUCENE-3871) Check what's up with stack traces being insane.
[ https://issues.apache.org/jira/browse/LUCENE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229969#comment-13229969 ] Dawid Weiss commented on LUCENE-3871: - I dug a little deeper. The problem on ANT 1.7 is caused by broken stack filtering (the root cause is an assertion inside org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner rethrowing the original exception). I will add a workaround by disabling stack filtering; the full stack may be verbose, but it is better than a broken stack. Check what's up with stack traces being insane. --- Key: LUCENE-3871 URL: https://issues.apache.org/jira/browse/LUCENE-3871 Project: Lucene - Java Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 4.0
[jira] [Updated] (LUCENE-3871) Stack traces from failed tests are messed up on ANT 1.7.x
[ https://issues.apache.org/jira/browse/LUCENE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-3871: Issue Type: Bug (was: Task) Summary: Stack traces from failed tests are messed up on ANT 1.7.x (was: Check what's up with stack traces being insane.) Stack traces from failed tests are messed up on ANT 1.7.x - Key: LUCENE-3871 URL: https://issues.apache.org/jira/browse/LUCENE-3871 Project: Lucene - Java Issue Type: Bug Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 3.6, 4.0
[jira] [Resolved] (LUCENE-3871) Stack traces from failed tests are messed up on ANT 1.7.x
[ https://issues.apache.org/jira/browse/LUCENE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-3871. - Resolution: Fixed Fix Version/s: 3.6 Stack traces from failed tests are messed up on ANT 1.7.x - Key: LUCENE-3871 URL: https://issues.apache.org/jira/browse/LUCENE-3871 Project: Lucene - Java Issue Type: Bug Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 3.6, 4.0
[jira] [Commented] (LUCENE-3868) Thread interruptions shouldn't cause unhandled thread errors (or should they?).
[ https://issues.apache.org/jira/browse/LUCENE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229990#comment-13229990 ] Dawid Weiss commented on LUCENE-3868: - This is fairly messy in trunk; threads are interrupted either at method or at class level (depending on a sysprop). Additionally, the interruption is done like this:

{code}
t.setUncaughtExceptionHandler(null);
Thread.setDefaultUncaughtExceptionHandler(null);
if (!t.getName().startsWith("SyncThread")) // avoid zookeeper jre crash
  t.interrupt();
{code}

This doesn't restore the default handler, may cause interference with other threads (which do have handlers), etc. I'd rather fix it by switching to LUCENE-3808, where this is solved at the runner's level (and controlled via annotations). Thread interruptions shouldn't cause unhandled thread errors (or should they?). --- Key: LUCENE-3868 URL: https://issues.apache.org/jira/browse/LUCENE-3868 Project: Lucene - Java Issue Type: Bug Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 3.6, 4.0 This is a result of pulling uncaught exception catching to a rule above interrupt in internalTearDown(); check how it was before and restore previous behavior?
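One way to avoid the clobbering Dawid describes is to replace only the handler of the thread being torn down and leave the JVM-wide default handler untouched. A hedged sketch (the method name and structure are mine, not the test framework's; LUCENE-3808 solves this more cleanly at the runner level):

```java
// Interrupts a leftover test thread without touching the JVM-wide default
// uncaught-exception handler, so unrelated threads keep their error reporting.
public class InterruptSketch {
    public static void interruptQuietly(Thread t) {
        // Silence only this thread; do NOT null the default handler, which
        // the snippet above does without restoring it afterwards.
        t.setUncaughtExceptionHandler((thread, error) -> { /* expected during teardown */ });
        if (!t.getName().startsWith("SyncThread")) { // avoid zookeeper jre crash
            t.interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try { Thread.sleep(60_000); } catch (InterruptedException expected) { }
        });
        worker.start();
        interruptQuietly(worker);
        worker.join();
        System.out.println("worker stopped: " + !worker.isAlive());
    }
}
```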
[jira] [Resolved] (SOLR-3233) SolrExampleStreamingBinaryTest num results != expected exceptions (reproducible).
[ https://issues.apache.org/jira/browse/SOLR-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved SOLR-3233. --- Resolution: Cannot Reproduce Assignee: Dawid Weiss Cannot reproduce anymore with this seed; possibly fixed in between. SolrExampleStreamingBinaryTest num results != expected exceptions (reproducible). - Key: SOLR-3233 URL: https://issues.apache.org/jira/browse/SOLR-3233 Project: Solr Issue Type: Bug Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 4.0

{noformat}
git clone -b SOLR-3220 --depth 0 g...@github.com:dweiss/lucene_solr.git
git co 9b1efde7a4882caa9dd04556aa4b849db68081a5
cd solr
ant test-core -Dtests.filter=*.SolrExampleStreamingBinaryTest -Dtests.filter.method=testStatistics -Drt.seed=F57E2420CEBDC955 -Dargs=-Dfile.encoding=UTF-8
{noformat}

The number of returned committed docs is invalid; this is reproducible and occurs in many methods, not only in testStatistics.

{code}
int i = 0;
//                        0   1   2   3   4   5   6   7   8   9
int[] nums = new int[] { 23, 26, 38, 46, 55, 63, 77, 84, 92, 94 };
for (int num : nums) {
  SolrInputDocument doc = new SolrInputDocument();
  doc.setField("id", "doc" + i++);
  doc.setField("name", "doc: " + num);
  doc.setField("f", num);
  server.add(doc);
}
server.commit();
assertNumFound("*:*", nums.length); // FAILURE here. Indeed, a query via web browser shows not all docs are in?
{code}
[JENKINS] Solr-trunk - Build # 1794 - Still Failing
Build: https://builds.apache.org/job/Solr-trunk/1794/

1 tests failed.

FAILED: org.apache.solr.TestDistributedSearch.testDistribSearch

Error Message:
Uncaught exception by thread: Thread[Thread-656,5,]

Stack Trace:
org.apache.lucene.util.UncaughtExceptionsRule$UncaughtExceptionsInBackgroundThread: Uncaught exception by thread: Thread[Thread-656,5,]
	at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:60)
	at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618)
	at org.junit.rules.RunRules.evaluate(RunRules.java:18)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
	at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
	at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
	at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:20)
	at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51)
	at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21)
	at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
	at org.junit.rules.RunRules.evaluate(RunRules.java:18)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
	at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743)
Caused by: java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: http://localhost:20742/solr
	at org.apache.solr.TestDistributedSearch$1.run(TestDistributedSearch.java:374)
Caused by: org.apache.solr.client.solrj.SolrServerException: http://localhost:20742/solr
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:496)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:251)
	at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
	at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:312)
	at org.apache.solr.TestDistributedSearch$1.run(TestDistributedSearch.java:369)
Caused by: org.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 100 ms
	at org.apache.commons.httpclient.protocol.ReflectionSocketFactory.createSocket(ReflectionSocketFactory.java:155)
	at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:125)
	at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
	at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361)
	at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
	at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
	at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
	at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:426)
	... 4 more
Caused by: java.net.SocketTimeoutException: connect timed out
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
	at java.net.Socket.connect(Socket.java:546)
	at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
	at
Re: [JENKINS] Solr-trunk - Build # 1794 - Still Failing
Connection time out again.

D.

On Thu, Mar 15, 2012 at 10:10 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Solr-trunk/1794/ 1 tests failed. FAILED: org.apache.solr.TestDistributedSearch.testDistribSearch
[jira] [Resolved] (LUCENE-3856) Create docvalues based grouped facet collector
[ https://issues.apache.org/jira/browse/LUCENE-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen resolved LUCENE-3856. --- Resolution: Fixed Committed to trunk. Create docvalues based grouped facet collector -- Key: LUCENE-3856 URL: https://issues.apache.org/jira/browse/LUCENE-3856 Project: Lucene - Java Issue Type: Improvement Components: modules/grouping Reporter: Martijn van Groningen Fix For: 4.0 Attachments: LUCENE-3856.patch, LUCENE-3856.patch, LUCENE-3856.patch Create docvalues based grouped facet collector. Currently only term based collectors have been implemented (LUCENE-3802).
[jira] [Updated] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3867: -- Attachment: LUCENE-3867.patch Hi, here is a new patch using Unsafe to get the bitness (with the well-known fallback) and for compressedOops detection. Looks much cleaner. I also like that the addressSize is now detected natively and not from sysprops. The constants mentioned by Dawid are only available in Java 7, so I reflected the underlying methods from theUnsafe. I also changed the boolean JRE_USES_COMPRESSED_OOPS to an integer JRE_REFERENCE_SIZE that is used by RamUsageEstimator. We might do the same for all other native types... (this is just a start). Shai: Can you test with your JVMs and also enable/disable compressed oops/refs? RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect - Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like this: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml

{quote}
A single-dimension array is a single object. As expected, the array has the usual object header. However, this object header is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ...
{quote}

While on it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE as static, stateless methods? It's not perfect, there's some room for improvement I'm sure; here it is:

{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
 * method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}

If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
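The arithmetic in the quoted page can be made concrete. A hedged sketch of the shallow array-size formula, assuming the layout that page describes (12-byte array header, 8-byte object alignment, as on a 32-bit JVM or a 64-bit JVM with compressed oops); real JVMs vary, which is exactly why the patch probes Unsafe instead of hardcoding:

```java
// Shallow size of a char[] under the 12-byte-header / 8-byte-alignment model:
// header (object header + 4-byte length field) + 2 bytes per char, rounded up.
public class ArraySizeSketch {
    static final int ARRAY_HEADER = 12; // 8-byte object header + 4-byte length
    static final int ALIGNMENT = 8;     // object sizes are rounded to 8 bytes

    public static long shallowSizeOfCharArray(int length) {
        long raw = ARRAY_HEADER + 2L * length;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT; // align up
    }

    public static void main(String[] args) {
        System.out.println(shallowSizeOfCharArray(0));  // 16: header alone, aligned up
        System.out.println(shallowSizeOfCharArray(10)); // 32: 12 + 20, already aligned
    }
}
```

Note there is no per-element NUM_BYTES_OBJECT_REF term anywhere in this formula, which is Shai's point about the current NUM_BYTES_ARRAY_HEADER computation.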
[jira] [Commented] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230030#comment-13230030 ] Vadim Kisselmann commented on SOLR-3238: I use Solr 4.0 from trunk (latest) with tomcat6. I have been getting an error in the new Admin UI for a week now (I update every day from trunk): This interface requires that you activate the admin request handlers; add the following configuration to your solrconfig.xml:

<!-- Admin Handlers - This will register all the standard admin RequestHandlers. -->
<requestHandler name="/admin/" class="solr.admin.AdminHandlers" />

Admin request handlers are definitely activated in my solrconfig. Sequel of Admin UI -- Key: SOLR-3238 URL: https://issues.apache.org/jira/browse/SOLR-3238 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 4.0 Reporter: Stefan Matheis (steffkes) Assignee: Stefan Matheis (steffkes) Fix For: 4.0 Catch-All Issue for all upcoming Bugs/Reports/Suggestions on the Admin UI
[jira] [Commented] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230043#comment-13230043 ] Uwe Schindler commented on SOLR-3238: - I got this error with a trunk checkout using ant run-example, too. But only on the first run; later runs work. Sequel of Admin UI -- Key: SOLR-3238 URL: https://issues.apache.org/jira/browse/SOLR-3238 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 4.0 Reporter: Stefan Matheis (steffkes) Assignee: Stefan Matheis (steffkes) Fix For: 4.0 Catch-All Issue for all upcoming Bugs/Reports/Suggestions on the Admin UI
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230043#comment-13230043 ] Uwe Schindler edited comment on SOLR-3238 at 3/15/12 10:43 AM: --- I got this error with a trunk checkout using ant run-example, too. But only on the first run; later runs work. EDIT: I think this has nothing to do with the admin UI. When this happened, I got some exceptions during the startup of Solr. Can you check for them in the logs? Unfortunately I cannot reproduce it at the moment.
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230056#comment-13230056 ] Vadim Kisselmann edited comment on SOLR-3238 at 3/15/12 10:47 AM: -- It's weird :) ant run-example starts the server with Jetty, and it works. As the next step I build it one more time with ant example and start my Tomcat, and it works, too. When I update to a new Solr version from trunk and build it with ant example, I get this error again.
[jira] [Commented] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230056#comment-13230056 ] Vadim Kisselmann commented on SOLR-3238: It's weird :) ant run-example starts the server with Jetty, and it works. As the next step I build it one more time with ant example and start my Tomcat, and it works, too. When I update to a new Solr version from trunk and build it with ant example, I get this error again.
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230056#comment-13230056 ] Vadim Kisselmann edited comment on SOLR-3238 at 3/15/12 11:02 AM: -- It's weird :) ant run-example starts the server with Jetty, and it works. As the next step I build it one more time with ant example and start my Tomcat, and it works, too. When I update to a new Solr version from trunk and build it with ant example, I get this error again. EDIT: no errors at this time in my log files.
[jira] [Commented] (LUCENE-3869) possible hang in UIMATypeAwareAnalyzerTest
[ https://issues.apache.org/jira/browse/LUCENE-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230069#comment-13230069 ] Robert Muir commented on LUCENE-3869: - Linux amd64. I can try to dig into this, and I'll try upgrading my JVM etc. too; it's a bit outdated :) possible hang in UIMATypeAwareAnalyzerTest -- Key: LUCENE-3869 URL: https://issues.apache.org/jira/browse/LUCENE-3869 Project: Lucene - Java Issue Type: Bug Components: modules/analysis Affects Versions: 4.0 Reporter: Robert Muir Just testing an unrelated patch, I was hung (with 100% CPU) in UIMATypeAwareAnalyzerTest. I'll attach a stacktrace at the moment of the hang. The fact we get a seed in the actual stacktraces for cases like this is awesome! Thanks Dawid! I don't think it reproduces 100%, but I'll try beasting this seed to see if I can reproduce the hang: should be 'ant test -Dtestcase=UIMATypeAwareAnalyzerTest -Dtests.seed=-262aada3325aa87a:-44863926cf5c87e9:5c8c471d901b98bd' from what I can see.
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12754 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12754/ All tests passed Build Log (for compile errors): [...truncated 15069 lines...]
Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 12754 - Failure
I committed a fix for this. Martijn On 15 March 2012 12:11, Apache Jenkins Server jenk...@builds.apache.orgwrote: Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12754/ All tests passed Build Log (for compile errors): [...truncated 15069 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Met vriendelijke groet, Martijn van Groningen
[jira] [Commented] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230104#comment-13230104 ] Vadim Kisselmann commented on SOLR-3238: Now I have error messages:
SCHWERWIEGEND: The web application [/solr2] appears to have started a thread named [main-SendThread(zookeeper:2181)] but has failed to stop it. This is very likely to create a memory leak.
Exception in thread Thread-2 java.lang.NullPointerException
    at org.apache.solr.cloud.Overseer$CloudStateUpdater.amILeader(Overseer.java:179)
    at org.apache.solr.cloud.Overseer$CloudStateUpdater.run(Overseer.java:104)
    at java.lang.Thread.run(Thread.java:662)
15.03.2012 13:25:17 org.apache.catalina.loader.WebappClassLoader loadClass
INFO: Illegal access: this web application instance has been stopped already. Could not load org.apache.zookeeper.server.ZooTrace. The eventual following stack trace is caused by an error thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access, and has no functional impact.
java.lang.IllegalStateException
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1531)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1491)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1196)
15.03.2012 13:25:17 org.apache.coyote.http11.Http11Protocol destroy
SLF4J: The requested version 1.5.8 by your slf4j binding is not compatible with [1.6]
SLF4J: See http://www.slf4j.org/codes.html#version_mismatch for further details.
log4j:WARN No appenders could be found for logger (org.apache.solr.core.SolrResourceLoader).
log4j:WARN Please initialize the log4j system properly.
Steps: I deleted the one default core in solr.xml, because I wanted to create new cores with the CoreAdminHandler. I started Tomcat.
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230104#comment-13230104 ] Vadim Kisselmann edited comment on SOLR-3238 at 3/15/12 12:44 PM: -- Now it's completely broken. Rebuild and restart, whether Jetty or Tomcat, changes nothing.
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230104#comment-13230104 ] Vadim Kisselmann edited comment on SOLR-3238 at 3/15/12 12:53 PM: -- EDIT: I get the same problem on another server (Tomcat, sharded, without ZK). With an old revision from Feb. it works; with a new checkout from trunk it does not.
AW: AW: AW: AW: Setting Stopword Set in PyLucene (or using Set in general)
Hi, I have to add a comment to my previous mail: I'd have preferred using this option (#2) in toArray (for both JavaList and JavaSet) as it does not require the wrapping into Java Integer (etc.) objects. However this method does not work with lucene.ArrayList:

>>> x = lucene.JArray('int')([1, 2])
>>> x
JArray<int>[1, 2]
>>> y = lucene.ArrayList(x)
Traceback: lucene.InvalidArgsError: (<type 'ArrayList'>, '__init__', (JArray<int>[1, 2],))

Sorry - that's rubbish of course: ArrayList requires a collection in its constructor and JArray isn't a collection. So this can't work! The 'challenge' was to be able to use JavaSet and/or JavaList (both are collections) as an argument for ArrayList. (During init of ArrayList the toArray() method is called, however.) So I gave it a quick try again, and tried the 2nd alternative: 1) return JArray(object)([lucene.Integer()-object*]) or 2) return JArray(int)([Python-int-literal*]) but that option then fails (in the demo code) when using bool (or float) types. Attached is a revised version of collections.py with the alternative code (disabled) - if anyone's interested... The mentioned issue with the created JArray containing the same objects still remains. I'll have to look deeper into that, but as said I'm out of office next week ... BTW, sorry if this is out of scope of the PyLucene mailing list (it's more a JCC-related discussion) - we can continue with 'private' mail if that's preferred. Regards, Thomas

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from lucene import JArray, Boolean, Float, Integer, Long, String, \
     PythonSet, PythonList, PythonIterator, PythonListIterator, JavaError, \
     NoSuchElementException, IllegalStateException, IndexOutOfBoundsException

# 1 wrap via lucene.Integer etc. (string not needed) - works (almost...)
_types = {
    bool: Boolean,
    float: Float,
    int: Integer,
    long: Long,
}

# TODO: Remove this code ...
# 2 used typed JArray: works for string, but not float and bool
X_types = {
    int: 'int',
    bool: 'bool',
    float: 'float',
    long: 'long',
    str: 'string'
}


class JavaSet(PythonSet):
    """This class implements java.util.Set around a Python set instance it wraps."""

    def __init__(self, _set):
        super(JavaSet, self).__init__()
        self._set = _set

    def __contains__(self, obj):
        return obj in self._set

    def __len__(self):
        return len(self._set)

    def __iter__(self):
        return iter(self._set)

    def add(self, obj):
        if obj not in self._set:
            self._set.add(obj)
            return True
        return False

    def addAll(self, collection):
        size = len(self._set)
        self._set.update(collection)
        return len(self._set) > size

    def clear(self):
        self._set.clear()

    def contains(self, obj):
        return obj in self._set

    def containsAll(self, collection):
        for obj in collection:
            if obj not in self._set:
                return False
        return True

    def equals(self, collection):
        if type(self) is type(collection):
            return self._set == collection._set
        return False

    def isEmpty(self):
        return len(self._set) == 0

    def iterator(self):
        class _iterator(PythonIterator):
            def __init__(_self):
                super(_iterator, _self).__init__()
                _self._iterator = iter(self._set)
            def hasNext(_self):
                if hasattr(_self, '_next'):
                    return True
                try:
                    _self._next = _self._iterator.next()
                    return True
                except StopIteration:
                    return False
            def next(_self):
                if hasattr(_self, '_next'):
                    next = _self._next
                    del _self._next
                else:
                    next = _self._iterator.next()
                return next
        return _iterator()

    def remove(self, obj):
        try:
            self._set.remove(obj)
            return True
        except KeyError:
            return False

    def removeAll(self, collection):
        result = False
        for obj in collection:
            try:
                self._set.remove(obj)
                result = True
            except KeyError:
                pass
        return result

    def retainAll(self, collection):
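The java.util.Set semantics of the wrapper above (add/addAll/removeAll reporting whether the set changed) can be checked without a PyLucene runtime. The sketch below is a pure-Python stand-in: the name PlainJavaSet is made up here, and it deliberately drops the PythonSet base class so it runs on plain CPython.

```python
# Pure-Python stand-in for the JavaSet wrapper, so the java.util.Set
# change-reporting semantics can be exercised without the lucene module.
# PlainJavaSet is a hypothetical name for this sketch only.
class PlainJavaSet:
    def __init__(self, _set):
        self._set = _set

    def add(self, obj):
        # java.util.Set.add returns True only if the element was absent
        if obj not in self._set:
            self._set.add(obj)
            return True
        return False

    def addAll(self, collection):
        # True if the set changed as a result of the call
        size = len(self._set)
        self._set.update(collection)
        return len(self._set) > size

    def removeAll(self, collection):
        # True if at least one element was actually removed
        result = False
        for obj in collection:
            try:
                self._set.remove(obj)
                result = True
            except KeyError:
                pass
        return result


s = PlainJavaSet({1, 2})
print(s.add(3))          # 3 was absent, so True
print(s.add(3))          # already present, so False
print(s.addAll([3, 4]))  # 4 is new, so True
print(s.removeAll([9]))  # nothing removed, so False
```

This mirrors the contract the JCC wrapper has to satisfy when Java code calls back into the Python set.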
[jira] [Updated] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-3867: --- Attachment: LUCENE-3867.patch Thanks Uwe ! I ran the test, and now with both J9 (IBM) and Oracle, I get this print (without enabling any flag): {code} [junit] NOTE: running test testReferenceSize [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 {code} * I modified the test name to testReferenceSize (was testCompressedOops). I wrote this small test to print the differences between sizeOf(String) and estimateRamUsage(String): {code} public void testSizeOfString() throws Exception { String s = abcdefgkjdfkdsjdskljfdskfjdsf; String sub = s.substring(0, 4); System.out.println(original= + RamUsageEstimator.sizeOf(s)); System.out.println(sub= + RamUsageEstimator.sizeOf(sub)); System.out.println(checkInterned=true(orig): + new RamUsageEstimator().estimateRamUsage(s)); System.out.println(checkInterned=false(orig): + new RamUsageEstimator(false).estimateRamUsage(s)); System.out.println(checkInterned=false(sub): + new RamUsageEstimator(false).estimateRamUsage(sub)); } {code} It prints: {code} original=104 sub=56 checkInterned=true(orig): 0 checkInterned=false(orig): 98 checkInterned=false(sub): 98 {code} So clearly estimateRamUsage factors in the sub-string's larger char[]. The difference in sizes of 'orig' stem from AverageGuessMemoryModel which computes the reference size to be 4 (hardcoded), and array size to be 16 (hardcoded). I modified AverageGuess to use constants from RUE (they are best guesses themselves). Still the test prints a difference, but now I think it's because sizeOf(String) aligns the size to mod 8, while estimateRamUsage isn't. I fixed that in size(Object), and now the prints are the same. * I also fixed sizeOfArray -- if the array.length == 0, it returned 0, but it should return its header, and aligned to mod 8 as well. 
* I modified sizeOf(String[]) to sizeOf(Object[]) and compute its raw size only. I started to add sizeOf(String), fastSizeOf(String) and deepSizeOf(String[]), but reverted to avoid the hassle -- the documentation confuses even me :). * Changed all sizeOf() to return long, and align() to take and return long. I think this is ready to commit, though I'd appreciate a second look on the MemoryModel and size(Obj) changes. Also, how about renaming MemoryModel methods to: arrayHeaderSize(), classHeaderSize(), objReferenceSize() to make them more clear and accurate? For instance, getArraySize does not return the size of an array, but its object header ... RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect - Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml {quote} A single-dimension array is a single object. As expected, the array has the usual object header. However, this object head is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ... {quote} While on it, I wrote a sizeOf(String) impl, and I wonder how do people feel about including such helper methods in RUE, as static, stateless, methods? 
It's not perfect, there's some room for improvement I'm sure, here it is: {code} /** * Computes the approximate size of a String object. Note that if this object * is also referenced by another object, you should add * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this * method. */ public static int sizeOf(String str) { return 2 * str.length() + 6 // chars + additional safeness for arrays alignment + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String
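The arithmetic in the sizeOf(String) sketch above can be checked in isolation. The constant values below are assumptions for one particular 64-bit JVM layout (4-byte int, 8-byte object header, 16-byte array header), not values taken from RamUsageEstimator, which picks them per JVM:

```python
# Assumed JVM layout constants for this sketch only (real values depend on
# the JVM, 32/64-bit mode, and compressed oops).
NUM_BYTES_INT = 4
NUM_BYTES_OBJECT_HEADER = 8
NUM_BYTES_ARRAY_HEADER = 16

def align(size):
    # Round up to a multiple of 8, as the JVM aligns object sizes.
    return (size + 7) & ~7

def size_of_string(s):
    # Mirrors the formula in the comment above: char data (2 bytes per char
    # plus safety padding for array alignment), String's 3 int fields, the
    # char[] array header, and the String object header itself.
    return (2 * len(s) + 6
            + 3 * NUM_BYTES_INT
            + NUM_BYTES_ARRAY_HEADER
            + NUM_BYTES_OBJECT_HEADER)

print(size_of_string("abcd"))         # 8 + 6 + 12 + 16 + 8 = 50
print(align(size_of_string("abcd")))  # aligned up to 56
```

The align step is exactly the mod-8 rounding the patch adds to sizeOf(String), and its absence is one reason estimateRamUsage and sizeOf disagreed in the prints above.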
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230104#comment-13230104 ] Vadim Kisselmann edited comment on SOLR-3238 at 3/15/12 1:14 PM: - EDIT: I get the same problem on another server (Tomcat, sharded, without ZK). With an old revision from Feb. it works; with a new checkout from trunk it does not. EDIT2: It works when I remove the example folder, check out a new version from trunk and rebuild it. I think it's a problem with solr.xml. On server restart it breaks. With older revisions like r1292064 from Feb. it works. I think you're right, this has nothing to do with the admin UI. Sorry for the spam here. New issue?
Test weirdness
I've seen a couple of notes this morning about running/test weirdness. I was getting an error in TestSystemPropertiesInvariantsRule that made no sense whatsoever, with no changes to the code, even after the usual tricks (updating a couple of times, running 'ant clean', all that jazz). The error was about new-property-1 new-value-1 existing, which makes some sense; that's the test. But what made no sense is that it was suddenly failing. I deleted the entire tree, did a fresh checkout from the repo, and the problems vanished. Bad magic, but it worked. Mac OS X, FWIW. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230171#comment-13230171 ] Dawid Weiss commented on LUCENE-3867:
-
-1 to mixing shallow and deep sizeofs -- sizeOf(Object[] arr) is shallow and just feels wrong to me. All the other methods yield the deep total, so why make an exception? If anything, make it explicit and then do it for any type of object:
{code}
shallowSizeOf(Object t);
sizeOf(Object t);
{code}
I'm not complaining just because my sense of taste is offended. I actually use this class in my own projects, and I would hate to have to check the JavaDoc every time to be sure what a given method does (especially with multiple overloads). In other words, I would hate to see this:
{code}
Object[] o1 = new Object[] {1, 2, 3};
Object o2 = o1;
if (sizeOf(o1) != sizeOf(o2)) throw new WtfException();
{code}

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
-
Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed as NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
{quote}
A single-dimension array is a single object. As expected, the array has the usual object header. However, this object header is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ...
{quote}
While I was at it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE as static, stateless methods? It's not perfect, and there's some room for improvement I'm sure, but here it is:
{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6 // chars, plus slack for array alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}
If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
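The quoted array-layout rule reduces to a small piece of arithmetic, sketched below. This is an illustration only: the 12-byte array header and 8-byte alignment are the figures from the quote for a typical 64-bit JVM with compressed oops, not RamUsageEstimator's actual constants, and the class and method names are invented.

```java
public class ArrayMemory {
    // Assumed layout per the quote above: 12-byte array header
    // (object header plus a 4-byte length field), then the element
    // data, padded up to the JVM's 8-byte object alignment.
    static final int ARRAY_HEADER = 12;
    static final int ALIGNMENT = 8;

    static long sizeOfArray(int length, int bytesPerElement) {
        long raw = ARRAY_HEADER + (long) length * bytesPerElement;
        // round up to the next multiple of the object alignment
        return ((raw + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT;
    }

    public static void main(String[] args) {
        System.out.println(sizeOfArray(10, 4)); // int[10]: 12 + 40 = 52, padded to 56
        System.out.println(sizeOfArray(3, 2));  // char[3]: 12 + 6 = 18, padded to 24
    }
}
```

Note that no per-element NUM_BYTES_OBJECT_REF term appears in the header, which is exactly the point of the bug report.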
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230181#comment-13230181 ] Uwe Schindler commented on LUCENE-3867:
---
{quote}
I ran the test, and now with both J9 (IBM) and Oracle, I get this print (without enabling any flag):
{code}
[junit] NOTE: running test testReferenceSize
[junit] NOTE: This JVM is 64bit: true
[junit] NOTE: Reference size in this JVM: 8
{code}
{quote}
I hope with compressedOops explicitly enabled (or however they call them), you get a reference size of 4 in J9 and pre-1.6.0_23 Oracle?
[jira] [Updated] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-3867:
---
Attachment: LUCENE-3867.patch

Ok, removed sizeOf(Object[]). One can compute it by using RUE.estimateRamSize to do a deep calculation. Geez Dawid, you took away all the reasons I originally opened the issue for ;). But at least AvgGuessMemoryModel and RUE.size() are more accurate now. And we have some useful utility methods.
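The shallow-versus-deep distinction being settled here can be sketched for the array case. Everything below is an invented illustration: the constants assume a 64-bit JVM with compressed oops, the names are not from any patch, and the deep walk handles only arrays, counting each object once even when it is referenced twice.

```java
import java.lang.reflect.Array;
import java.util.IdentityHashMap;

public class SizeOfSketch {
    // Illustrative layout constants, not RamUsageEstimator's.
    static final int ARRAY_HEADER = 16, REF = 4, ALIGN = 8;

    static long align(long n) { return ((n + ALIGN - 1) / ALIGN) * ALIGN; }

    static int bytesPerElement(Class<?> component) {
        if (!component.isPrimitive()) return REF;
        if (component == long.class || component == double.class) return 8;
        if (component == int.class || component == float.class) return 4;
        if (component == short.class || component == char.class) return 2;
        return 1; // byte, boolean
    }

    // Shallow: the array object itself, ignoring what its elements point to.
    static long shallowSizeOf(Object array) {
        Class<?> c = array.getClass().getComponentType();
        return align(ARRAY_HEADER + (long) Array.getLength(array) * bytesPerElement(c));
    }

    // Deep: shallow size plus, for reference arrays, the referenced
    // arrays, each counted once (the seen map breaks shared references).
    static long sizeOf(Object array, IdentityHashMap<Object, Object> seen) {
        if (array == null || seen.put(array, array) != null) return 0;
        long total = shallowSizeOf(array);
        if (!array.getClass().getComponentType().isPrimitive())
            for (int i = 0; i < Array.getLength(array); i++)
                total += sizeOf(Array.get(array, i), seen);
        return total;
    }

    public static void main(String[] args) {
        int[] a = new int[10];                   // shallow: 16 + 40 = 56
        Object[] arr = new Object[] { a, a };    // shallow: 16 + 2*4 = 24
        System.out.println(shallowSizeOf(arr));                    // 24
        System.out.println(sizeOf(arr, new IdentityHashMap<>()));  // 24 + 56 = 80
        // The static type of the argument no longer matters, which is
        // the point of the explicit names: no overload resolution can
        // silently switch shallow and deep semantics.
        Object o2 = arr;
        System.out.println(sizeOf((Object[]) o2, new IdentityHashMap<>())); // 80
    }
}
```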
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230187#comment-13230187 ] Shai Erera commented on LUCENE-3867:

I ran
ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-XX:+UseCompressedOops
and
ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-XX:-UseCompressedOops
and get 4 (with CompressedOops) and 8 (without).
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230202#comment-13230202 ] Mark Miller commented on LUCENE-3867:
-
Oh, bummer - looks like we lost the whole history of this class...such a bummer. I really wanted to take a look at how this class had evolved since I last looked at it. I've missed the conversations around the history loss - is that gone, gone, gone, or is there still some way to find it?
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230206#comment-13230206 ] Mark Miller commented on LUCENE-3867:
-
Scratch that - I was trying to look back from the Apache git clone using git - I assumed its history matched svn - but I get a clean full history using svn.
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230208#comment-13230208 ] Uwe Schindler commented on LUCENE-3867:
---
Die, GIT, die! :-) (as usual)
[jira] [Created] (SOLR-3250) Dynamic Field capabilities based on value not name
Dynamic Field capabilities based on value not name
--
Key: SOLR-3250 URL: https://issues.apache.org/jira/browse/SOLR-3250 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll

In some situations one already knows the schema of one's content, so having to declare a schema in Solr becomes cumbersome. For instance, if you have all your content in JSON (or can easily generate it) or other typed serializations, then you already have a schema defined. It would be nice if we could have support for dynamic fields that used whatever name was passed in, but picked the appropriate FieldType for that field based on the value of the content. So, for instance, if the input is a number, it would select the appropriate numeric type. If it is a plain text string, it would pick the appropriate text field (you could even add in language detection here). If it is comma separated, it would treat the values as keywords, etc. We could also likely send in a hint as to the type. With this approach you of course have a first-in-wins situation, but assuming you have this schema defined elsewhere, that is likely fine. Supporting such cases would allow us to be schemaless when appropriate, while offering the benefits of schemas when appropriate. Naturally, one could mix and match these too.
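The "pick a FieldType from the value" idea above can be sketched as a toy dispatcher. The returned names echo common Solr type names, but the mapping, class, and method are invented for illustration and are not part of any patch.

```java
public class TypeGuesser {
    // Toy value-based type guesser: inspect the runtime type (and, for
    // strings, the shape of the content) and return a field type name.
    static String guessFieldType(Object value) {
        if (value instanceof Integer || value instanceof Long) return "tlong";
        if (value instanceof Float || value instanceof Double) return "tdouble";
        if (value instanceof Boolean) return "boolean";
        if (value instanceof String) {
            String s = (String) value;
            if (s.contains(",")) return "string";              // comma separated: treat as keywords
            if (s.trim().contains(" ")) return "text_general"; // free text: analyzed field
            return "string";                                   // single token
        }
        return "string"; // fallback; a caller-supplied hint could override this
    }

    public static void main(String[] args) {
        System.out.println(guessFieldType(42));                  // tlong
        System.out.println(guessFieldType(3.14));                // tdouble
        System.out.println(guessFieldType("plain text string")); // text_general
        System.out.println(guessFieldType("red,green,blue"));    // string
    }
}
```

The first-in-wins caveat from the description applies here too: once a name has been bound to a guessed type, later values of a different type would have to coerce or fail.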
[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1983 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1983/ 1 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest Error Message: ERROR: SolrIndexSearcher opens=91 closes=90 Stack Trace: junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=91 closes=90 at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner$3.addError(JUnitTestRunner.java:974) at junit.framework.TestResult.addError(TestResult.java:38) at junit.framework.JUnit4TestAdapterCache$1.testFailure(JUnit4TestAdapterCache.java:51) at org.junit.runner.notification.RunNotifier$4.notifyListener(RunNotifier.java:100) at org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:41) at org.junit.runner.notification.RunNotifier.fireTestFailure(RunNotifier.java:97) at org.junit.internal.runners.model.EachTestNotifier.addFailure(EachTestNotifier.java:26) at org.junit.runners.ParentRunner.run(ParentRunner.java:306) at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743) Caused by: java.lang.AssertionError: ERROR: SolrIndexSearcher opens=91 closes=90 at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:211) at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:100) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:36) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:20) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) ... 4 more Build Log (for compile errors): [...truncated 11141 lines...]
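The "SolrIndexSearcher opens=91 closes=90" assertion in that failure comes from open/close bookkeeping of this general shape; the sketch below is an invented illustration of the pattern, not SolrTestCaseJ4's actual code.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SearcherTracker {
    // Count every searcher open and close; a leaked searcher shows up
    // at the end of the suite as opens > closes.
    private final AtomicInteger opens = new AtomicInteger();
    private final AtomicInteger closes = new AtomicInteger();

    void onOpen()  { opens.incrementAndGet(); }
    void onClose() { closes.incrementAndGet(); }

    // Would be called from an @AfterClass-style hook.
    void assertBalanced() {
        int o = opens.get(), c = closes.get();
        if (o != c) {
            throw new AssertionError("ERROR: SolrIndexSearcher opens=" + o + " closes=" + c);
        }
    }

    public static void main(String[] args) {
        SearcherTracker t = new SearcherTracker();
        t.onOpen(); t.onClose();
        t.onOpen(); // leaked: never closed
        try {
            t.assertBalanced();
            System.out.println("balanced");
        } catch (AssertionError e) {
            System.out.println(e.getMessage()); // opens=2 closes=1
        }
    }
}
```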
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230214#comment-13230214 ] Uwe Schindler commented on LUCENE-3867:
---
bq. I ran ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-XX:+UseCompressedOops and ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-XX:-UseCompressedOops and get 8 and 4 (with CompressedOops).

OK, thanks. So it seems to work at least with Oracle/Sun and IBM J9. I have no other updates to this detection code.
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230219#comment-13230219 ] Dawid Weiss commented on LUCENE-3867:

bq. Geez Dawid, you took away all the reasons I originally opened the issue for

This is by no means wasted time. I think the improvements are clear?

bq. Die, GIT, die!

I disagree here -- git is a great tool, even if the learning curve may be steep at first. git-svn is a whole different story (it's a great hack, but just a hack).

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
Key: LUCENE-3867
URL: https://issues.apache.org/jira/browse/LUCENE-3867
Project: Lucene - Java
Issue Type: Bug
Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Trivial
Fix For: 3.6, 4.0
Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed as NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml

{quote} A single-dimension array is a single object. As expected, the array has the usual object header. However, this object head is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ... {quote}

While at it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE, as static, stateless methods. It's not perfect -- there's some room for improvement, I'm sure -- but here it is:

{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
 * method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}

If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
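The sizeOf(int[] / byte[] / ...) variants proposed at the end could follow the same recipe as sizeOf(String). A minimal sketch, with the constants hardcoded locally (typical HotSpot values, an assumption here) and 8-byte object alignment applied:

```java
// Illustrative sketch only: constants mirror those discussed in this issue,
// hardcoded for a typical 32-bit / compressed-oops HotSpot layout.
class ArraySizeSketch {
    static final int NUM_BYTES_OBJECT_HEADER = 8;
    static final int NUM_BYTES_INT = 4;
    // Per the quoted javamex page: object header plus a 4-byte length field.
    static final int NUM_BYTES_ARRAY_HEADER = NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT;

    /** Approximate heap size of an int[], rounded up to 8-byte alignment. */
    static long sizeOf(int[] arr) {
        long size = NUM_BYTES_ARRAY_HEADER + (long) arr.length * NUM_BYTES_INT;
        return (size + 7) & ~7L; // JVMs align objects to 8 bytes
    }

    /** Approximate heap size of a byte[], rounded up to 8-byte alignment. */
    static long sizeOf(byte[] arr) {
        long size = NUM_BYTES_ARRAY_HEADER + arr.length;
        return (size + 7) & ~7L;
    }
}
```

The real constants vary by JVM and bitness, which is exactly what the rest of this thread is about.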
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230222#comment-13230222 ] Uwe Schindler commented on LUCENE-3867:

bq. I disagree here

Calm down, was just my well-known standard answer :-)
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230229#comment-13230229 ] Shai Erera commented on LUCENE-3867:

bq. This is by no means wasted time. I think the improvements are clear?

Yes, yes. It was a joke. OK, so can I proceed with the commit, or does someone intend to review the patch later?
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230228#comment-13230228 ] Dawid Weiss commented on LUCENE-3867:

Oh, I am calm, I just know people do hate git (and I used to as well, until I started using it frequently). Robert has a strong opinion about git, for example. Besides, there's nothing wrong with having a strong opinion -- it's great that people can choose what they like and still collaborate via patches (and this seems to be the common ground between all VCSs).
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230230#comment-13230230 ] Uwe Schindler commented on LUCENE-3867:

With Unsafe we can also get all the information, like the array header size, that we currently have hardcoded. Shouldn't we try to get these the same way I did for bitness and reference size -- using Unsafe.theUnsafe.arrayBaseOffset() -- and fall back to our hardcoded defaults?
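The probe-with-fallback idea could be sketched roughly as below. The reflection path mirrors the Unsafe.theUnsafe field and arrayBaseOffset() method mentioned above; the fallback value is an assumption for illustration:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Sketch: read the array header size from sun.misc.Unsafe when available,
// falling back to a hardcoded default (the value 12 here is an assumption).
class ArrayHeaderProbe {
    static final int FALLBACK_ARRAY_HEADER = 12;

    static int arrayHeaderSize() {
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field f = unsafeClass.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Object unsafe = f.get(null);
            Method m = unsafeClass.getMethod("arrayBaseOffset", Class.class);
            // arrayBaseOffset(byte[].class) is the number of bytes before
            // element 0, i.e. exactly the array header size on this JVM.
            return ((Number) m.invoke(unsafe, byte[].class)).intValue();
        } catch (Throwable t) {
            return FALLBACK_ARRAY_HEADER; // Unsafe not accessible on this JVM
        }
    }
}
```

Either branch returns a usable value, which is the point of the fallback design: callers never need to know whether the probe succeeded.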
[jira] [Commented] (SOLR-3250) Dynamic Field capabilities based on value not name
[ https://issues.apache.org/jira/browse/SOLR-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230231#comment-13230231 ] Grant Ingersoll commented on SOLR-3250:

Note, a core reload is not something I would want to do.

Dynamic Field capabilities based on value not name
Key: SOLR-3250
URL: https://issues.apache.org/jira/browse/SOLR-3250
Project: Solr
Issue Type: Improvement
Reporter: Grant Ingersoll

In some situations one already knows the schema of their content, so having to declare a schema in Solr becomes cumbersome. For instance, if you have all your content in JSON (or can easily generate it) or other typed serializations, then you already have a schema defined. It would be nice if we could have support for dynamic fields that used whatever name was passed in, but then picked the appropriate FieldType for that field based on the value of the content. So, for instance, if the input is a number, it would select the appropriate numeric type. If it is a plain text string, it would pick the appropriate text field (you could even add in language detection here). If it is comma separated, it would treat the values as keywords, etc. We could also likely send in a hint as to the type. With this approach you of course have a first-in-wins situation, but assuming you have this schema defined elsewhere, that is likely fine. Supporting such cases would allow us to be schemaless when appropriate, while offering the benefits of schemas when appropriate. Naturally, one could mix and match these too.
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230235#comment-13230235 ] Dawid Weiss commented on LUCENE-3867:

bq. using Unsafe.theUnsafe.arrayBaseOffset()? And fallback to our hardcoded defaults?

+1. I will also try on OpenJDK with various JITs, but I'll do it in the evening.

bq. Yes, yes. It was a joke.

Joke or no joke, the truth is I did complain a lot. :)
[jira] [Commented] (SOLR-3250) Dynamic Field capabilities based on value not name
[ https://issues.apache.org/jira/browse/SOLR-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230243#comment-13230243 ] Yonik Seeley commented on SOLR-3250:

Of course, hopefully everyone knows schemaless is mostly marketing b.s. -- when people do this, there is still a schema, but it's guessed on first use (and hence generally a horrible idea for production systems). It would be easy enough on a single node... but how does one handle a cluster? Say you index price=0 on nodeA, and price=100.0 on nodeB? A quick thought on how it might work:
- have a separate file, auto_fields.json, that keeps track of the mappings; it would be the same for all cores using that schema
- when we run across a field we haven't seen before, we must guess a type for it, then grab a lock
- update auto_fields.json
- we can update our in-memory schema with any new fields we find in auto_fields.json
- it works the same in ZK mode... it's just that auto_fields.json is in ZK, and we would use something like optimistic locking to update it
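The "guess a type on first use" step, and the nodeA/nodeB conflict above, can be made concrete with a tiny guesser. The type names and rules here are invented for illustration, not Solr's:

```java
// Hypothetical first-use type guesser for schemaless indexing.
// Rules are illustrative: integer -> "long", decimal -> "double",
// comma separated -> "keywords", anything else -> "text".
class TypeGuesser {
    static String guessType(String value) {
        try {
            Long.parseLong(value);
            return "long";
        } catch (NumberFormatException ignored) { }
        try {
            Double.parseDouble(value);
            return "double";
        } catch (NumberFormatException ignored) { }
        if (value.contains(",")) return "keywords";
        return "text";
    }
}
```

Note that guessType("0") yields "long" while guessType("100.0") yields "double", which is exactly the cross-node conflict described above: whichever node sees the field first wins, so the cluster needs the shared auto_fields.json (or its ZK equivalent) to agree.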
[jira] [Created] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
Index changes are lost if you call prepareCommit() then close()
Key: LUCENE-3872
URL: https://issues.apache.org/jira/browse/LUCENE-3872
Project: Lucene - Java
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 3.6, 4.0

You are supposed to call commit() after calling prepareCommit(), but... if you forget, and call close() after prepareCommit() without calling commit(), then any changes done after the prepareCommit() are silently lost (including adding/deleting docs, but also any completed merges). Spinoff from the java-user thread "lots of .cfs (compound files) in the index directory" from Tim Bogaert. I think to fix this, IW.close should throw an IllegalStateException if prepareCommit() was called with no matching call to commit().
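The proposed fix is essentially a small state machine. A toy model of just that check, in isolation (this is not IndexWriter; names are illustrative):

```java
// Toy model of the proposed fix: close() fails if prepareCommit() was
// called with no matching commit(), instead of silently dropping changes.
class TwoPhaseCloser {
    private boolean pendingCommit = false;

    void prepareCommit() { pendingCommit = true; }   // phase 1: stage changes
    void commit()        { pendingCommit = false; }  // phase 2: finish commit

    void close() {
        if (pendingCommit) {
            throw new IllegalStateException(
                "cannot close: prepareCommit() was called with no matching call to commit()");
        }
    }
}
```

With this shape, the forgetting-to-commit bug becomes a loud exception at close() rather than silent data loss.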
[jira] [Updated] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3872:

Attachment: LUCENE-3872.patch

Patch w/ failing test showing how we silently lose indexed docs...
[jira] [Updated] (LUCENE-3778) Create a grouping convenience class
[ https://issues.apache.org/jira/browse/LUCENE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated LUCENE-3778:

Fix Version/s: 4.0

Create a grouping convenience class
Key: LUCENE-3778
URL: https://issues.apache.org/jira/browse/LUCENE-3778
Project: Lucene - Java
Issue Type: Improvement
Components: modules/grouping
Reporter: Martijn van Groningen
Fix For: 4.0
Attachments: LUCENE-3778.patch, LUCENE-3778.patch

Currently the grouping module has many collector classes with a lot of different options per class. I think it would be a good idea to have a GroupUtil (or another name?) convenience class. I think this could be a builder, because of the many options (sort, sortWithinGroup, groupOffset, groupCount and more) and implementations (term/dv/function) grouping has.
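The builder idea could look roughly like this. GroupUtil and all of its setters are invented names for illustration only, not the grouping module's actual API:

```java
// Hypothetical fluent builder collecting the grouping options named above
// (groupOffset, groupCount, sortWithinGroup) before any search is run.
class GroupUtil {
    private String groupField;
    private int groupOffset = 0;
    private int groupCount = 10;
    private boolean sortWithinGroup = false;

    GroupUtil setGroupField(String f)       { this.groupField = f; return this; }
    GroupUtil setGroupOffset(int n)         { this.groupOffset = n; return this; }
    GroupUtil setGroupCount(int n)          { this.groupCount = n; return this; }
    GroupUtil setSortWithinGroup(boolean b) { this.sortWithinGroup = b; return this; }

    /** Summarize the configured request (stand-in for building collectors). */
    String describe() {
        return groupField + "[" + groupOffset + ".." + (groupOffset + groupCount) + "]"
            + (sortWithinGroup ? " sorted" : "");
    }
}
```

The appeal of the builder is that each term/dv/function implementation can consume one configured object instead of every collector constructor repeating the same long parameter list.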
[jira] [Updated] (SOLR-3207) Add field name validation
[ https://issues.apache.org/jira/browse/SOLR-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luca Cavanna updated SOLR-3207:

Attachment: SOLR-3207.patch

First draft patch. I introduced a new FieldNameValidator class, which is used within the IndexSchema class to validate every field name. The new class also exposes some boolean methods that are used within the ReturnFields class, in order to apply the same rules there to detect a field name. That's needed to make sure we accept field names that we can handle within the fl parameter. Apparently, if you use a placeholder as a field name, IndexSchema receives the default value, which can be empty. That's why I'm allowing empty field names. I'm not even sure I understood correctly how placeholders work; can someone help me out with this? Let me know what you think about my patch!

Add field name validation
Key: SOLR-3207
URL: https://issues.apache.org/jira/browse/SOLR-3207
Project: Solr
Issue Type: Improvement
Affects Versions: 4.0
Reporter: Luca Cavanna
Fix For: 4.0
Attachments: SOLR-3207.patch

Given the SOLR-2444 updated fl syntax and the SOLR-2719 regression, it would be useful to add some kind of validation regarding the field names you can use in Solr. The objective would be adding consistency, allowing only field names that you can then use within fl, sorting, etc. The rules, taken from the actual StrParser behaviour, seem to be the following:
- same as used for Java identifiers (Character#isJavaIdentifierPart), plus the use of trailing '.' and '-'
- for the first character the rule is Character#isJavaIdentifierStart minus '$' (the dash can't be used as the first character (SOLR-3191), for example)
[jira] [Updated] (LUCENE-3778) Create a grouping convenience class
[ https://issues.apache.org/jira/browse/LUCENE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated LUCENE-3778:

Attachment: LUCENE-3778.patch

Updated patch.
* Changed package.html
* The methods that set something now have a name that begins with set.
[jira] [Updated] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3872:

Attachment: LUCENE-3872.patch

Patch, I think it's ready. One test was failing to call the commit() matching its prepareCommit()... I fixed it.
[jira] [Commented] (SOLR-3207) Add field name validation
[ https://issues.apache.org/jira/browse/SOLR-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230287#comment-13230287 ] Yonik Seeley commented on SOLR-3207:

bq. same used for java identifiers (Character#isJavaIdentifierPart), plus the use of trailing '.' and '-'

I think we should probably define it as I documented in the schema:

<!-- field names should consist of alphanumeric or underscore characters only and not start with a digit. This is not currently strictly enforced, but other field names will not have first class support from all components and back compatibility is not guaranteed. -->
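The documented rule quoted above (alphanumeric or underscore characters only, not starting with a digit) is easy to check directly. A sketch of that rule, not Solr's actual validator:

```java
// Sketch of the recommended-field-name rule from the example schema comment:
// ASCII letters, digits, or underscore, and the first character not a digit.
class FieldNameCheck {
    static boolean isRecommended(String name) {
        if (name.isEmpty()) return false;
        char first = name.charAt(0);
        if (first >= '0' && first <= '9') return false; // no leading digit
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                      || (c >= '0' && c <= '9') || c == '_';
            if (!ok) return false;
        }
        return true;
    }
}
```

This is deliberately stricter than the Character#isJavaIdentifierPart-based rules the StrParser behaviour implies; names like "my-field" pass the parser's rules but are exactly the ones the schema comment warns may lack first class support.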
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230289#comment-13230289 ] Harley Parks commented on SOLR-2155: Bill: the doc's on queryParser state the name of the function can then be used as the main query, gh_geofilt, perhaps something like: /select?q={!gh_geofilt}... but, good question. geofilt is working for me on multivalued fields. my issue is the query result returns the geohash string, not the geohash lat, long. In building the v 1.0.3 jar file for solr2155, I used jdk 6. as I didn't see any errors, so hopefully, that's fine. so, I'm going to see if solr 3.5 will perhaps resolve my issue. Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (i.e. via a gazateer) occurring on free text. None, one, or many geospatial locations might be extracted from any given document and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the GeoHash based work in Lucene/Solr with a geohash prefix based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4 depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. 
I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
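The subdivision scheme described in the issue can be sketched in a few lines. Below is an illustrative Python re-implementation of the standard geohash encoding algorithm (not the GeoHashUtils code from the patch); it also demonstrates the prefix property the filter relies on: a shorter geohash is always a prefix of every longer geohash for points inside its box.

```python
# Illustrative geohash encoder (standard algorithm; NOT the patch's
# GeoHashUtils code). Bits alternate between refining longitude and
# latitude; every 5 bits become one base-32 character.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=12):
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    ch, bit, even = 0, 0, True      # even-numbered bits refine longitude
    while len(chars) < precision:
        rng = lon_range if even else lat_range
        val = lon if even else lat
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            ch = (ch << 1) | 1      # point is in the upper half
            rng[0] = mid
        else:
            ch = ch << 1            # point is in the lower half
            rng[1] = mid
        even = not even
        bit += 1
        if bit == 5:                # 5 bits -> one base-32 character
            chars.append(BASE32[ch])
            ch, bit = 0, 0
    return "".join(chars)
```

A prefix of a geohash names the enclosing grid box, which is what lets a prefix-based filter seek through the terms dictionary one grid square at a time: encoding the same point at precision 5 yields exactly the first 5 characters of the precision-11 hash.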
[jira] [Commented] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230293#comment-13230293 ] Robert Muir commented on LUCENE-3872: - I don't have a better fix, but at the same time I feel you should be able to call close() at any time (such as when handling exceptions in your app), since we are a real Closeable here. Index changes are lost if you call prepareCommit() then close() --- Key: LUCENE-3872 URL: https://issues.apache.org/jira/browse/LUCENE-3872 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.6, 4.0 Attachments: LUCENE-3872.patch, LUCENE-3872.patch You are supposed to call commit() after calling prepareCommit(), but... if you forget and call close() after prepareCommit() without calling commit(), then any changes done after the prepareCommit() are silently lost (including adding/deleting docs, but also any completed merges). Spinoff from the java-user thread "lots of .cfs (compound files) in the index directory" from Tim Bogaert. I think to fix this, IW.close should throw an IllegalStateException if prepareCommit() was called with no matching call to commit(). -- This message is automatically generated by JIRA.
[jira] [Commented] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230307#comment-13230307 ] Michael McCandless commented on LUCENE-3872: Well, we could also easily allow skipping the call to commit... in this case IW.close would detect the missing call to commit, call commit to finish the prepared commit, and then call commit again to save any changes done after the prepareCommit and before close.
[jira] [Updated] (LUCENE-3848) basetokenstreamtestcase should fail if tokenstream starts with posinc=0
[ https://issues.apache.org/jira/browse/LUCENE-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3848: Attachment: LUCENE-3848.patch updated patch: I think it's ready to commit. I didn't integrate Mike's nice MockGraphTokenFilter *yet* but will do this under a separate issue: it's likely to expose a few bugs :) basetokenstreamtestcase should fail if tokenstream starts with posinc=0 --- Key: LUCENE-3848 URL: https://issues.apache.org/jira/browse/LUCENE-3848 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Fix For: 4.0 Attachments: LUCENE-3848-MockGraphTokenFilter.patch, LUCENE-3848.patch, LUCENE-3848.patch It is meaningless for a tokenstream to start with posinc=0. It has also caused problems and hairiness in the indexer (LUCENE-1255, LUCENE-1542), and it makes senseless tokenstreams. We should add a check and fix any that do this. Furthermore, the same bug can exist in removing-filters if they have enablePositionIncrements=false. I think this option is useful, but it shouldn't mean 'allow broken tokenstream'; it just means we don't add gaps. If you remove tokens with enablePositionIncrements=false it should not cause the TS to start with positionIncrement=0, and it shouldn't 'restructure' the tokenstream (e.g. moving synonyms on top of a different word). It should just not add any 'holes'. -- This message is automatically generated by JIRA.
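The "no holes, but never posinc=0 at the start" rule above can be illustrated with a toy model. The sketch below uses plain Python tuples rather than Lucene's TokenStream API, and the filter itself is invented for illustration: a stopword-removing filter that either folds dropped increments into the next token (creating "holes") or drops them silently, while still guaranteeing the first surviving token keeps posinc >= 1.

```python
def remove_stopwords(tokens, stopset, enable_position_increments):
    """Drop stopwords from a list of (term, position_increment) pairs.

    enable_position_increments=True: the increment of each dropped token
    is folded into the next surviving token, leaving a positional 'hole'.
    enable_position_increments=False: dropped tokens simply vanish, but
    the stream must still start with an increment >= 1, never 0.
    """
    out, pending = [], 0
    for term, inc in tokens:
        if term in stopset:
            if enable_position_increments:
                pending += inc          # accumulate the gap
            continue
        if enable_position_increments:
            out.append((term, inc + pending))
            pending = 0
        else:
            # No holes added; mid-stream inc=0 (stacked synonyms) is kept,
            # but the FIRST emitted token must not start at posinc=0.
            out.append((term, inc if out else max(inc, 1)))
    return out
```

Note the asymmetry: with increments enabled, "the quick fox" minus "the" yields quick at increment 2 (a hole); with them disabled it yields increment 1, and stacked synonyms later in the stream keep their 0 increments untouched.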
[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1984 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1984/ 1 tests failed. FAILED: org.apache.solr.client.solrj.embedded.LargeVolumeJettyTest.testMultiThreaded Error Message: Uncaught exception by thread: Thread[DocThread-1,5,] Stack Trace: org.apache.lucene.util.UncaughtExceptionsRule$UncaughtExceptionsInBackgroundThread: Uncaught exception by thread: Thread[DocThread-1,5,] at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:60) at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:20) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at 
junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743) Caused by: java.lang.AssertionError: DocThread-1---http://localhost:17096/solr at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.client.solrj.LargeVolumeTestBase$DocThread.run(LargeVolumeTestBase.java:120) Build Log (for compile errors): [...truncated 10124 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-3196) partialResults response header not propagated in distributed search
[ https://issues.apache.org/jira/browse/SOLR-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-3196. -- Resolution: Fixed Fix Version/s: 4.0 3.6 4.x r: 1301097 3.6 r: 1301096 Right, after looking a bit more closely, there's nothing here that would break back-compat; that was just my paranoia at work. partialResults response header not propagated in distributed search --- Key: SOLR-3196 URL: https://issues.apache.org/jira/browse/SOLR-3196 Project: Solr Issue Type: Bug Components: search Affects Versions: 3.5, 4.0 Reporter: Russell Black Labels: patch Fix For: 3.6, 4.0 Attachments: SOLR-3196-3x.patch, SOLR-3196-partialResults-header.patch For {{timeAllowed=true}} requests, the response contains a {{partialResults}} header that indicates when a search was terminated early due to running out of time. This header is being discarded by the collator. Patch to follow. -- This message is automatically generated by JIRA.
[jira] [Created] (LUCENE-3873) tie MockGraphTokenFilter into all analyzers tests
tie MockGraphTokenFilter into all analyzers tests - Key: LUCENE-3873 URL: https://issues.apache.org/jira/browse/LUCENE-3873 Project: Lucene - Java Issue Type: Task Components: modules/analysis Reporter: Robert Muir Mike made a MockGraphTokenFilter on LUCENE-3848. Many filters currently aren't tested with anything but a simple tokenstream. We should test them with this too; it might find bugs (zero-length terms, stacked terms/synonyms, etc.). -- This message is automatically generated by JIRA.
[jira] [Created] (SOLR-3251) dynamically add field to schema
dynamically add field to schema --- Key: SOLR-3251 URL: https://issues.apache.org/jira/browse/SOLR-3251 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley One related piece of functionality needed for SOLR-3250 is the ability to dynamically add a field to the schema. -- This message is automatically generated by JIRA.
[jira] [Commented] (LUCENE-3848) basetokenstreamtestcase should fail if tokenstream starts with posinc=0
[ https://issues.apache.org/jira/browse/LUCENE-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230341#comment-13230341 ] Michael McCandless commented on LUCENE-3848: +1
[jira] [Updated] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-3251: --- Attachment: SOLR-3251.patch Here's a quick start... no tests or external API yet.
[jira] [Commented] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230345#comment-13230345 ] Robert Muir commented on LUCENE-3872: - {quote} in this case IW.close would detect the missing call to commit, call commit, and call commit again to save any changes done after the prepareCommit and before close. {quote} I think that would make it even more lenient, complicated, and worse. I guess I feel close() should really be rollback(), but that is likely ridiculous to change. So on second thought I think the patch is good... if someone is handling exceptional cases like this they should be thinking about using rollback() anyway, and they still have that option. I wasn't really against the patch, just whining. It's definitely an improvement on the current behavior; let's do it.
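The prepareCommit/commit/close contract under debate can be modeled as a small state machine. The sketch below is a toy Python analogue, not IndexWriter itself; the class and method names are invented for illustration. It implements the fix the issue proposes: close() refuses to silently drop a prepared-but-unfinished commit.

```python
class ToyWriter:
    """Toy model of IndexWriter's prepareCommit/commit/close contract."""

    def __init__(self):
        self.pending = None      # changes staged by prepare_commit()
        self.buffered = []       # changes not yet prepared
        self.committed = []      # durable changes

    def add(self, doc):
        self.buffered.append(doc)

    def prepare_commit(self):
        # first phase of a two-phase commit: stage buffered changes
        if self.pending is not None:
            raise RuntimeError("prepareCommit already in flight")
        self.pending, self.buffered = list(self.buffered), []

    def commit(self):
        # plain commit() = prepare + finish
        if self.pending is None:
            self.prepare_commit()
        self.committed.extend(self.pending)
        self.pending = None

    def close(self):
        # the proposed fix: throw instead of silently losing the
        # prepared commit (and everything buffered after it)
        if self.pending is not None:
            raise RuntimeError(
                "prepareCommit() was called with no matching commit()")
        self.commit()            # close otherwise commits, as IW does
```

In this model, calling close() right after prepare_commit() raises instead of discarding staged changes, which mirrors the IllegalStateException the patch adds; calling commit() first makes the subsequent close() legal.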
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230346#comment-13230346 ] Harley Parks commented on SOLR-2155: Oh... I may have messed up my build, since I did not include the Solr 3.4 jar files in the classpath. Is there an environment variable that Maven will use, such as CLASSPATH, or a lib folder in the project being built?
[jira] [Created] (LUCENE-3874) bogus positions create a corrumpt index
bogus positions create a corrumpt index --- Key: LUCENE-3874 URL: https://issues.apache.org/jira/browse/LUCENE-3874 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir It's pretty common that positionIncrement can overflow; this happens really easily if people write analyzers that don't clearAttributes(). It used to be the case that if this happened (and perhaps it still is in 3.x, I didn't check), IW would throw an exception. But I couldn't find the code checking this; I wrote a test and it makes a corrupt index... -- This message is automatically generated by JIRA.
[jira] [Updated] (LUCENE-3874) bogus positions create a corrumpt index
[ https://issues.apache.org/jira/browse/LUCENE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3874: Attachment: LUCENE-3874_test.patch Simple test that overflows posinc. Output is: {noformat} junit-sequential: [junit] Testsuite: org.apache.lucene.index.TestIndexWriterExceptions [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.239 sec [junit] [junit] - Standard Output --- [junit] CheckIndex failed [junit] Segments file=segments_1 numSegments=1 version=4.0 format=FORMAT_4_0 [Lucene 4.0] [junit] 1 of 1: name=_0 docCount=1 [junit] codec=SimpleText [junit] compound=false [junit] hasProx=true [junit] numFiles=4 [junit] size (MB)=0 [junit] diagnostics = {os.version=3.0.0-14-generic, os=Linux, lucene.version=4.0-SNAPSHOT, source=flush, os.arch=amd64, java.version=1.6.0_24, java.vendor=Sun Microsystems Inc.} [junit] has deletions [delGen=-1] [junit] test: open reader.OK [junit] test: fields..OK [1 fields] [junit] test: field norms.OK [1 fields] [junit] test: terms, freq, prox...ERROR: java.lang.RuntimeException: term [66 6f 6f]: doc 0: pos -2 is out of bounds [junit] java.lang.RuntimeException: term [66 6f 6f]: doc 0: pos -2 is out of bounds [junit] at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:860) ... {noformat} bogus positions create a corrumpt index --- Key: LUCENE-3874 URL: https://issues.apache.org/jira/browse/LUCENE-3874 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir Attachments: LUCENE-3874_test.patch Its pretty common that positionIncrement can overflow, this happens really easily if people write analyzers that don't clearAttributes(). It used to be the case that if this happened (and perhaps still is in 3.x, i didnt check), that IW would throw an exception. But i couldnt find the code checking this, I wrote a test and it makes a corrumpt index... -- This message is automatically generated by JIRA. 
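The "pos -2 is out of bounds" in the CheckIndex output above is Java int overflow: the indexer accumulates absolute positions from per-token increments, and a runaway increment wraps past Integer.MAX_VALUE into negative territory. A small Python illustration of 32-bit signed wraparound (Java's int semantics; Python's own ints never overflow, so the wrap is simulated explicitly):

```python
# Simulate Java's 32-bit signed integer addition to show how a bogus
# positionIncrement turns into a negative absolute position.
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def java_int_add(a, b):
    """Add with Java int wraparound semantics (overflow wraps, no error)."""
    s = (a + b) & 0xFFFFFFFF
    return s - 2**32 if s > INT_MAX else s

def accumulate_positions(increments):
    """Fold position increments into absolute positions, as an indexer does.

    Positions start at -1 so the first token with increment 1 lands at 0.
    """
    pos, positions = -1, []
    for inc in increments:
        pos = java_int_add(pos, inc)
        positions.append(pos)
    return positions
```

With well-behaved increments the positions are 0, 1, 2, ...; feed in two huge increments (as an analyzer that forgets clearAttributes() can produce) and the running position wraps to a negative value just like the one CheckIndex reported.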
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230373#comment-13230373 ] Yonik Seeley commented on SOLR-3251: Any ideas for an external API? We could use a single entry point for all things schema related... http://localhost:8983/solr/schema {addField:{myfield:{type:int ...}}} Or more specific to fields... http://localhost:8983/solr/fields OR PUT/POST to http://localhost:8983/solr/schema/fields (nesting all schema-related stuff under schema would pollute the namespace less) {myfield:{type:int ...}} I'm leaning toward the last option. Thoughts?
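The last option in the comment above amounts to a PUT of a JSON body against /solr/schema/fields. The snippet below only builds such a payload; the endpoint, the field name "myfield", and its attributes are the hypothetical example from the comment (none of this API existed in Solr at the time):

```python
import json

# Field definition shaped like the comment's {myfield:{type:int ...}} example.
# "myfield" and its attributes are hypothetical, taken from the proposal.
payload = {
    "myfield": {
        "type": "int",
        "indexed": True,
        "stored": True,
    }
}

body = json.dumps(payload)
# The request as proposed would then look something like:
#   curl -X PUT http://localhost:8983/solr/schema/fields \
#        -H 'Content-Type: application/json' -d "$body"
print(body)
```

Keeping field definitions under /schema/fields, rather than at the core root, matches the namespace argument in the comment: everything schema-related stays under one path.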
[jira] [Updated] (LUCENE-3874) bogus positions create a corrumpt index
[ https://issues.apache.org/jira/browse/LUCENE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3874: Affects Version/s: 3.6 3.x too: just s/TextField/Field to port the test
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230381#comment-13230381 ] Ryan McKinley commented on SOLR-3251: - Does this imply that the schema would be writeable? The PUT/POST option is nicer.
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230390#comment-13230390 ] Ryan McKinley commented on SOLR-3251: - What are the thoughts on error handling? Are you only able to add fields that don't exist? What if they exist in the schema but not in the index? What about if the index analyzer is identical but the query analyzer has changed?
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230393#comment-13230393 ] Yonik Seeley commented on SOLR-3251: bq. Does this imply that the schema would be writeable? The in-memory schema object, yes. The question is how to persist changes. I was thinking it might be easiest to keep a separate file alongside schema.xml for dynamically added fields for now. The term dynamicFields has already been taken, though, and we probably shouldn't overload it. Maybe extra_fields.json? Or maybe even schema.json/schema.yaml that acts as an extension of schema.xml (and could acquire additional features over time, such as the ability to define types too)? But a separate file that just lists fields will be much quicker (and easier) to update. Reloading a full schema.xml (along with type instantiation) would currently be somewhat prohibitive.
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230398#comment-13230398 ] Harley Parks commented on SOLR-2155: All of the Class Paths in the solr1.0.3 project point to apache solr 3.4 libraries on the apache website... so no action needed, to answer my own question. I'm stumped. Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (i.e. via a gazateer) occurring on free text. None, one, or many geospatial locations might be extracted from any given document and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the GeoHash based work in Lucene/Solr with a geohash prefix based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4 depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. 
I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
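The subdivision scheme David describes is easy to see in plain Java. The sketch below is an illustrative re-implementation of standard geohash encoding, not code from the patch; the class and method names are made up for the example. The key property the filter exploits is that nearby points share a common geohash prefix, so a prefix seek in the terms dictionary jumps straight to one grid square's points.

```java
// Illustrative geohash encoder: each character narrows a lat/lon box,
// so a shared prefix means two points fall in the same grid square.
public class GeohashSketch {
    static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    static String encode(double lat, double lon, int precision) {
        double[] latRange = {-90.0, 90.0};
        double[] lonRange = {-180.0, 180.0};
        boolean evenBit = true; // bits alternate: longitude first, then latitude
        int bit = 0, ch = 0;
        StringBuilder sb = new StringBuilder();
        while (sb.length() < precision) {
            double[] range = evenBit ? lonRange : latRange;
            double val = evenBit ? lon : lat;
            double mid = (range[0] + range[1]) / 2;
            ch <<= 1;
            if (val >= mid) { ch |= 1; range[0] = mid; } else { range[1] = mid; }
            evenBit = !evenBit;
            // every 5 bits become one base-32 character, i.e. a 4x8 or 8x4 subdivision
            if (++bit == 5) { sb.append(BASE32.charAt(ch)); bit = 0; ch = 0; }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two nearby points share a long prefix; a prefix filter can seek
        // directly to that prefix in the index instead of scanning all terms.
        String a = encode(42.6051, -5.6031, 9);
        String b = encode(42.6052, -5.6032, 9);
        System.out.println(a + " / " + b);
    }
}
```

The classic reference point (42.605, -5.603) encodes to "ezs42" at precision 5, and any point inside that roughly 5 km cell carries "ezs42" as a prefix of its longer hash.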
[jira] [Updated] (LUCENE-3874) bogus positions create a corrupt index
[ https://issues.apache.org/jira/browse/LUCENE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3874: Attachment: LUCENE-3874.patch First cut at a patch; throws IllegalArgumentException and aborts the doc (ensuring fieldState never sees the overflow, since I don't trust what happens to it after this!) bogus positions create a corrupt index --- Key: LUCENE-3874 URL: https://issues.apache.org/jira/browse/LUCENE-3874 Project: Lucene - Java Issue Type: Bug Affects Versions: 3.6, 4.0 Reporter: Robert Muir Attachments: LUCENE-3874.patch, LUCENE-3874_test.patch It's pretty common for positionIncrement to overflow; this happens really easily if people write analyzers that don't clearAttributes(). It used to be the case that if this happened (and perhaps it still is in 3.x, I didn't check), IW would throw an exception. But I couldn't find the code checking this; I wrote a test and it produces a corrupt index...
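The overflow Robert describes is plain int arithmetic: the indexer accumulates position += posIncr per token, so a runaway increment wraps negative and a negative position ends up in the postings. A hedged sketch of the kind of guard the patch adds; the method and class names here are illustrative, not the actual IndexWriter code:

```java
// Sketch of guarding accumulated token positions against int overflow.
public class PositionGuard {
    static int nextPosition(int position, int posIncr) {
        if (posIncr < 0) {
            throw new IllegalArgumentException("position increment must be >= 0, got " + posIncr);
        }
        long next = (long) position + posIncr; // widen to long before adding
        if (next > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("position overflows int: " + next);
        }
        return (int) next;
    }

    public static void main(String[] args) {
        int pos = nextPosition(0, 1);  // normal first token at position 1
        pos = nextPosition(pos, 1);    // next token at position 2
        // An analyzer that forgets clearAttributes() can accumulate a huge
        // increment; without the check this would wrap to a negative position.
        try {
            nextPosition(Integer.MAX_VALUE - 1, 5);
        } catch (IllegalArgumentException expected) {
            System.out.println("caught overflow: " + expected.getMessage());
        }
    }
}
```

Failing loudly and aborting the doc, as the patch does, keeps the bad position out of fieldState entirely instead of silently writing a corrupt index.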
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230399#comment-13230399 ] Sami Siren commented on SOLR-3251: -- I like the latter option more. dynamically add field to schema --- Key: SOLR-3251 URL: https://issues.apache.org/jira/browse/SOLR-3251 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Attachments: SOLR-3251.patch One related piece of functionality needed for SOLR-3250 is the ability to dynamically add a field to the schema.
[jira] [Commented] (LUCENE-3874) bogus positions create a corrupt index
[ https://issues.apache.org/jira/browse/LUCENE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230402#comment-13230402 ] Michael McCandless commented on LUCENE-3874: +1 Crazy we don't catch this already...
[jira] [Issue Comment Edited] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230399#comment-13230399 ] Sami Siren edited comment on SOLR-3251 at 3/15/12 6:32 PM: --- bq. Any ideas for an external API? I like the latter option more. was (Author: siren): I like the latter option more.
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230406#comment-13230406 ] Ryan McKinley commented on SOLR-3251: - bq. separate file alongside schema.xml This makes sense. As is, the ad-hoc naming conventions in schema make writing out the full schema pretty daunting.
[jira] [Commented] (SOLR-3207) Add field name validation
[ https://issues.apache.org/jira/browse/SOLR-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230408#comment-13230408 ] Luca Cavanna commented on SOLR-3207: The first letter should be ok as checked in my patch. Regarding the trailing characters, do you mean we shouldn't use isJavaIdentifierPart anymore but something else? That's even more restrictive than my patch, since I've used the existing rules applied while parsing the fl parameter (ReturnFields class). No problem for me, but are we all sure we want to proceed this way? I'll update my patch later on. Then I'd document this within the Schema wiki page. That's a big change; any opinion is welcome! Add field name validation - Key: SOLR-3207 URL: https://issues.apache.org/jira/browse/SOLR-3207 Project: Solr Issue Type: Improvement Affects Versions: 4.0 Reporter: Luca Cavanna Fix For: 4.0 Attachments: SOLR-3207.patch Given the SOLR-2444 updated fl syntax and the SOLR-2719 regression, it would be useful to add some kind of validation regarding the field names you can use on Solr. The objective would be adding consistency, allowing only field names that you can then use within fl, sorting, etc. The rules, taken from the actual StrParser behaviour, seem to be the following: - same as used for Java identifiers (Character#isJavaIdentifierPart), plus the use of trailing '.' and '-' - for the first character the rule is Character#isJavaIdentifierStart minus '$' (the dash can't be used as the first character (SOLR-3191), for example)
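The rules Luca lists map directly onto the JDK's identifier predicates. A minimal sketch of a validator following those rules; the class and method names are hypothetical, not the attached patch:

```java
// Sketch of the field-name rules from the issue description:
// first char: Character.isJavaIdentifierStart minus '$';
// rest: Character.isJavaIdentifierPart plus '.' and '-'.
public class FieldNameCheck {
    static boolean isLegalFieldName(String name) {
        if (name == null || name.isEmpty()) return false;
        char first = name.charAt(0);
        if (first == '$' || !Character.isJavaIdentifierStart(first)) return false;
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c != '.' && c != '-' && !Character.isJavaIdentifierPart(c)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLegalFieldName("price"));   // legal
        System.out.println(isLegalFieldName("geo.lat")); // '.' allowed after the first char
        System.out.println(isLegalFieldName("-price"));  // dash illegal as first char (SOLR-3191)
        System.out.println(isLegalFieldName("$deref"));  // '$' reserved for variable dereferencing
    }
}
```

Note that Character.isJavaIdentifierPart accepts a wide and Unicode-version-dependent set of characters, which is exactly the looseness Yonik flags in the follow-up comment.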
[jira] [Commented] (SOLR-3207) Add field name validation
[ https://issues.apache.org/jira/browse/SOLR-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230422#comment-13230422 ] Yonik Seeley commented on SOLR-3207: bq. Regarding the trailing characters, do you mean we shouldn't use isJavaIdentifierPart anymore but something else? That was just a shortcut... looking again, it's pretty open (maybe more open than we want?), especially since Unicode changes over time. Anyway, isJavaIdentifierPart doesn't include - or . If people do need another separator-type character, we could allow $ too (just not as the first char, since that's taken by variable dereferencing). bq. That's even more restrictive than my patch since I've used the existing rules applied while parsing the fl parameter (ReturnFields class). Allowing '-' in the fl was just to resolve that regression for people who already used field names like that and are upgrading. If we want to start validating field names strictly, then we should bump the schema version number (and skip validation when the version number is less than that).
[jira] [Commented] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230436#comment-13230436 ] Dawid Weiss commented on LUCENE-3872: - bq. I guess I feel close() should really be rollback(). Yeah... I think this feeling of unease is fairly common -- see JDBC's Connection javadoc on close, for example: It is strongly recommended that an application explicitly commits or rolls back an active transaction prior to calling the close method. If the close method is called and there is an active transaction, the results are implementation-defined. Index changes are lost if you call prepareCommit() then close() --- Key: LUCENE-3872 URL: https://issues.apache.org/jira/browse/LUCENE-3872 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.6, 4.0 Attachments: LUCENE-3872.patch, LUCENE-3872.patch You are supposed to call commit() after calling prepareCommit(), but... if you forget, and call close() after prepareCommit() without calling commit(), then any changes done after the prepareCommit() are silently lost (including adding/deleting docs, but also any completed merges). Spinoff from the java-user thread "lots of .cfs (compound files) in the index directory" from Tim Bogaert. I think to fix this, IW.close should throw an IllegalStateException if prepareCommit() was called with no matching call to commit().
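Mike's proposed fix is a state check at close time. Stripped of all the Lucene machinery, the contract looks like this toy two-phase writer; it illustrates the proposed IllegalStateException, not IndexWriter's actual internals:

```java
// Toy model of the prepareCommit()/commit()/close() contract.
public class TwoPhaseWriter {
    private boolean pendingCommit = false;

    void prepareCommit() { pendingCommit = true; }  // flush + fsync; commit point not yet visible
    void commit()        { pendingCommit = false; } // make the prepared commit point live
    void rollback()      { pendingCommit = false; } // discard the prepared commit point

    void close() {
        // Without this check, changes after prepareCommit() are silently lost.
        if (pendingCommit) {
            throw new IllegalStateException(
                "prepareCommit() was called with no matching call to commit()");
        }
    }

    public static void main(String[] args) {
        TwoPhaseWriter w = new TwoPhaseWriter();
        w.prepareCommit();
        w.commit();
        w.close(); // fine: commit() matched the prepare

        TwoPhaseWriter bad = new TwoPhaseWriter();
        bad.prepareCommit();
        try {
            bad.close(); // forgot commit(): fail loudly instead of losing changes
        } catch (IllegalStateException expected) {
            System.out.println("caught: " + expected.getMessage());
        }
    }
}
```

This mirrors the JDBC guidance Dawid quotes: resolve the active transaction explicitly, and make close() refuse to paper over an unresolved one.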
[jira] [Resolved] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-3872. Resolution: Fixed Thanks Tim!
[jira] [Resolved] (LUCENE-3874) bogus positions create a corrupt index
[ https://issues.apache.org/jira/browse/LUCENE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-3874. - Resolution: Fixed Fix Version/s: 4.0 3.6
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230503#comment-13230503 ] Dawid Weiss commented on LUCENE-3867: - I just peeked at the OpenJDK sources, and addressSize() is defined as this: {code} // See comment at file start about UNSAFE_LEAF //UNSAFE_LEAF(jint, Unsafe_AddressSize()) UNSAFE_ENTRY(jint, Unsafe_AddressSize(JNIEnv *env, jobject unsafe)) UnsafeWrapper(Unsafe_AddressSize); return sizeof(void*); UNSAFE_END {code} In this light, this switch becomes interesting: {code} switch (addressSize) { case 4: is64Bit = Boolean.FALSE; break; case 8: is64Bit = Boolean.TRUE; break; } {code} Do you know of any architecture with pointers different than 4 or 8 bytes? :) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect - Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml {quote} A single-dimension array is a single object. As expected, the array has the usual object header. However, this object header is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ... 
{quote} While on it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE as static, stateless methods? It's not perfect; there's some room for improvement, I'm sure. Here it is: {code} /** * Computes the approximate size of a String object. Note that if this object * is also referenced by another object, you should add * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this * method. */ public static int sizeOf(String str) { return 2 * str.length() + 6 // chars + additional safeness for arrays alignment + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object } {code} If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
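To make Shai's formula concrete, the sketch below passes the JVM-dependent constants in explicitly, since the whole point of this thread is that they vary by JVM and pointer size. The sample values used in main (object header 8, array header 12, int 4) are assumptions typical of a 32-bit or compressed-oops HotSpot, not universal truths:

```java
// Shai's sizeOf(String) formula, with the layout constants parameterized
// because they differ across JVMs (see the addressSize() discussion above).
public class StringSizeSketch {
    static int sizeOf(String str, int objHeader, int arrayHeader, int intBytes) {
        return 2 * str.length() + 6  // chars + padding slack for array alignment
             + 3 * intBytes          // String's three int fields
             + arrayHeader           // the backing char[] header
             + objHeader;            // the String object header
    }

    public static void main(String[] args) {
        // "hello": 2*5 + 6 + 3*4 + 12 + 8 = 48 bytes under the assumed constants
        System.out.println(sizeOf("hello", 8, 12, 4));
    }
}
```

Under a 64-bit JVM without compressed oops, the header constants grow and the estimate shifts accordingly, which is why Dawid's reference-size probes across JamVM, CACAO, and HotSpot matter.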
[jira] [Commented] (LUCENE-3848) basetokenstreamtestcase should fail if tokenstream starts with posinc=0
[ https://issues.apache.org/jira/browse/LUCENE-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230510#comment-13230510 ] Robert Muir commented on LUCENE-3848: - I think this is ready to go in; I'll wait a bit. I didn't make any changes re: graph restructuring, though I still feel we should fix this, but it means dealing with backwards compatibility, etc. The changes in this patch are backwards compatible, in the sense that consumers are already correcting an initial posInc=0 to posInc=1 anyway. basetokenstreamtestcase should fail if tokenstream starts with posinc=0 --- Key: LUCENE-3848 URL: https://issues.apache.org/jira/browse/LUCENE-3848 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Fix For: 4.0 Attachments: LUCENE-3848-MockGraphTokenFilter.patch, LUCENE-3848.patch, LUCENE-3848.patch It's meaningless for a tokenstream to start with posinc=0. It's also caused problems and hairiness in the indexer (LUCENE-1255, LUCENE-1542), and it makes for senseless tokenstreams. We should add a check and fix any that do this. Furthermore, the same bug can exist in removing-filters if they have enablePositionIncrements=false. I think this option is useful, but it shouldn't mean 'allow broken tokenstream'; it just means we don't add gaps. If you remove tokens with enablePositionIncrements=false, it should not cause the TS to start with positionIncrement=0, and it shouldn't 'restructure' the tokenstream (e.g. moving synonyms on top of a different word). It should just not add any 'holes'.
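The invariant Robert wants BaseTokenStreamTestCase to enforce is small: the first emitted token must have posinc >= 1, because a leading posinc=0 claims a token stacked on a nonexistent predecessor. A sketch of that check, with plain int arrays standing in for the TokenStream attribute API:

```java
// Sketch: validate position increments the way a strict test harness would.
public class PosIncCheck {
    static void check(int[] posIncs) {
        for (int i = 0; i < posIncs.length; i++) {
            if (posIncs[i] < 0) {
                throw new IllegalStateException("negative posinc " + posIncs[i] + " at token " + i);
            }
            if (i == 0 && posIncs[0] == 0) {
                // Consumers today silently "correct" a leading 0 to 1;
                // a test harness should fail instead of papering over it.
                throw new IllegalStateException("first token must have posinc >= 1");
            }
        }
    }

    public static void main(String[] args) {
        check(new int[] {1, 1, 0, 1}); // ok: posinc=0 mid-stream is a stacked synonym
        try {
            check(new int[] {0, 1});   // broken: stream starts at posinc=0
        } catch (IllegalStateException expected) {
            System.out.println("caught: " + expected.getMessage());
        }
    }
}
```

Mid-stream posinc=0 stays legal (stacked synonyms); only the leading zero is senseless.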
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230529#comment-13230529 ] Dawid Weiss commented on LUCENE-3867: - A few more exotic JITs from OpenJDK (all seem to be using an explicit 8-byte ref size on 64-bit): {noformat} ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-jamvm [junit] JVM: OpenJDK Runtime Environment, JamVM, Robert Lougher, 1.6.0-devel, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-jamvm -XX:+UseCompressedOops [junit] JVM: OpenJDK Runtime Environment, JamVM, Robert Lougher, 1.6.0-devel, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-cacao [junit] JVM: OpenJDK Runtime Environment, CACAO, CACAOVM - Verein zur Foerderung der freien virtuellen Maschine CACAO, 1.1.0pre2, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-server [junit] JVM: OpenJDK Runtime Environment, OpenJDK 64-Bit Server VM, Sun Microsystems Inc., 20.0-b11, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 4 ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-server -XX:-UseCompressedOops [junit] JVM: OpenJDK Runtime Environment, OpenJDK 64-Bit Server VM, Sun Microsystems Inc., 20.0-b11, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 {noformat}
[jira] [Created] (LUCENE-3875) ValueSourceFilter
ValueSourceFilter - Key: LUCENE-3875 URL: https://issues.apache.org/jira/browse/LUCENE-3875 Project: Lucene - Java Issue Type: New Feature Reporter: Andrew Morrison Attachments: LUCENE-3875.patch A ValueSourceFilter is a filter that takes a ValueSource and a threshold value, filtering out documents whose value, as returned by the ValueSource, is below the threshold. We use the ValueSourceFilter for filtering documents based on their value in an ExternalFileField.
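The selection rule is easy to show outside the Filter API: given a per-document value (what the ValueSource supplies, e.g. loaded from an ExternalFileField), keep only doc ids whose value is at or above the threshold. An illustrative sketch with made-up names, not the attached patch:

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Sketch of ValueSourceFilter's rule: documents whose value from the
// ValueSource falls below the threshold are filtered out.
public class ThresholdFilterSketch {
    static int[] acceptedDocs(double[] valuePerDoc, double threshold) {
        return IntStream.range(0, valuePerDoc.length)
                        .filter(doc -> valuePerDoc[doc] >= threshold)
                        .toArray();
    }

    public static void main(String[] args) {
        // e.g. per-doc boosts loaded from an external file
        double[] boosts = {0.2, 0.9, 0.5, 0.75};
        System.out.println(Arrays.toString(acceptedDocs(boosts, 0.5))); // [1, 2, 3]
    }
}
```

In the real filter the values come lazily from the ValueSource per segment rather than from a materialized array, but the accept/reject decision is the same comparison.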
[jira] [Updated] (LUCENE-3875) ValueSourceFilter
[ https://issues.apache.org/jira/browse/LUCENE-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Morrison updated LUCENE-3875: Attachment: LUCENE-3875.patch
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230535#comment-13230535 ] Dawid Weiss commented on LUCENE-3867: - Mac: {noformat} ant test-core -Dtestcase=TestRam* -Dtests.verbose=true [junit] JVM: Java(TM) SE Runtime Environment, Java HotSpot(TM) 64-Bit Server VM, Apple Inc., 20.4-b02-402, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_29, Apple Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 4 ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-server -XX:-UseCompressedOops [junit] JVM: Java(TM) SE Runtime Environment, Java HotSpot(TM) 64-Bit Server VM, Apple Inc., 20.4-b02-402, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_29, Apple Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 {noformat}
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230543#comment-13230543 ] Mark Miller commented on LUCENE-3867: - Nooo!!! My eyes! I'm pretty sure my liver has just been virally licensed! RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect - Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed as NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml {quote} A single-dimension array is a single object. As expected, the array has the usual object header. However, this object header is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ... {quote} While I was at it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE as static, stateless methods. It's not perfect; there's some room for improvement, I'm sure. Here it is:
{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6                      // chars + additional safety for array alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT        // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER   // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}
If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
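Spelled out, the arithmetic the cited page (and the fix) describes is: an array's header is the object header plus a four-byte length, with no extra object reference. A minimal sketch of that accounting, using the 32-bit figures from the page (the class and method names are mine, and real values are VM-dependent):

```java
// Hedged sketch of the array-size arithmetic described in the issue; the
// constants are the 32-bit figures from the cited page, not authoritative.
public class ArraySize {
    static final int OBJECT_HEADER = 8; // mark word + class pointer
    static final int ARRAY_LENGTH  = 4; // the int length field; header + length = the 12-byte array header
    static final int OBJECT_REF    = 4; // one element of an Object[]

    // Total bytes for an Object[] of the given length, rounded up to 8-byte alignment.
    static long objectArrayBytes(int length) {
        long raw = OBJECT_HEADER + ARRAY_LENGTH + (long) OBJECT_REF * length;
        return (raw + 7) & ~7L; // round up to the next 8-byte boundary
    }
}
```

Note that NUM_BYTES_OBJECT_REF appears only in the per-element term, never in the header, which is exactly the bug being fixed.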
[jira] [Commented] (LUCENE-3869) possible hang in UIMATypeAwareAnalyzerTest
[ https://issues.apache.org/jira/browse/LUCENE-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230549#comment-13230549 ] Robert Muir commented on LUCENE-3869: - I think I got lucky twice yesterday... it's pretty hard to reproduce this now. Maybe a thread-safety issue? I'll look more. My computer has been known to be crazy... don't waste time on this one, Tommaso; I'll try to dig in more. possible hang in UIMATypeAwareAnalyzerTest -- Key: LUCENE-3869 URL: https://issues.apache.org/jira/browse/LUCENE-3869 Project: Lucene - Java Issue Type: Bug Components: modules/analysis Affects Versions: 4.0 Reporter: Robert Muir Just testing an unrelated patch, I was hung (with 100% CPU) in UIMATypeAwareAnalyzerTest. I'll attach a stacktrace from the moment of the hang. The fact that we get a seed in the actual stacktraces for cases like this is awesome! Thanks, Dawid! I don't think it reproduces 100%, but I'll try beasting this seed to see if I can reproduce the hang: it should be 'ant test -Dtestcase=UIMATypeAwareAnalyzerTest -Dtests.seed=-262aada3325aa87a:-44863926cf5c87e9:5c8c471d901b98bd' from what I can see.
[jira] [Resolved] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley resolved LUCENE-3795. --- Resolution: Fixed I will mark this resolved, and we can open new issues for ongoing problems. The next big step is to integrate with Solr. Replace spatial contrib module with LSP's spatial-lucene module --- Key: LUCENE-3795 URL: https://issues.apache.org/jira/browse/LUCENE-3795 Project: Lucene - Java Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.0 I propose that Lucene's spatial contrib module be replaced with the spatial-lucene module within Lucene Spatial Playground (LSP). LSP has been in development for approximately 1 year by David Smiley, Ryan McKinley, and Chris Male, and we feel it is ready. LSP is here: http://code.google.com/p/lucene-spatial-playground/ and the spatial-lucene module is intuitively in svn/trunk/spatial-lucene/. I'll add more comments to prevent the issue description from being too long.
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230565#comment-13230565 ] Ryan McKinley commented on LUCENE-3795: --- did not mean to 'resolve' the Math.toRadians issue though -- I think we should change that back to multiplication... Math.* seems to be pretty clunky
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230572#comment-13230572 ] Dawid Weiss commented on LUCENE-3867: - Ok, right, sorry, let me scramble for intellectual property protection reasons: {noformat} // See cemnmot at flie sratt abuot U_ANEESAFLF / / ULAAFEN_SEF (jnit, UfdAsnerS_zsiaedse ()) UEATERSNFN_Y (jint, UnidsdserSAasfe_ze (JNnEIv * env, jcbjeot unfsae)) UesWrpfapaner ( UdenfsSseAazs_drie ); rreutn seiozf (void * ;) UNEF_SNEAD {noformat}
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230574#comment-13230574 ] Yonik Seeley commented on LUCENE-3795: -- bq. I'd be very surprised to hear if this is true. If Math.toRadians had been written as x*(PI/180.0), then the compiler would have done constant folding and it would simply be multiplication by a constant. But it's unfortunately written as x/180.0*PI (for no good reason in this case), and the compiler/JVM is not allowed to do that simple transformation by itself. That's why we do it. Sometimes knowing how optimizers work, and the restrictions on them, allows one to know what will be faster or slower without benchmarking. I did benchmark it after the fact (after you questioned it), and it was indeed the case that Math.toRadians was much slower than a simple multiply.
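Yonik's constant-folding point can be shown concretely. A minimal sketch (the method names are mine, not the spatial module's actual code): the JIT may not rewrite the divide-then-multiply form into a single multiply because the two expressions can differ in the last bit, so the rewrite has to be done by hand.

```java
// Sketch of the two degree-to-radian forms discussed in the comment.
public class DegToRad {
    // PI/180 is folded to a single constant at compile time.
    static final double DEG_TO_RAD = Math.PI / 180.0;

    // Mirrors the expression Yonik describes in Math.toRadians: divide, then multiply.
    static double divideThenMultiply(double deg) {
        return deg / 180.0 * Math.PI;
    }

    // The hand-rewritten form: one multiply by the precomputed constant.
    static double multiplyByConstant(double deg) {
        return deg * DEG_TO_RAD;
    }
}
```

The two methods agree to within rounding error for ordinary inputs, but only the second compiles down to a single floating-point multiply.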
[jira] [Closed] (SOLR-3222) Pull optimal cache warming queries from a warm solr instance
[ https://issues.apache.org/jira/browse/SOLR-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Black closed SOLR-3222. --- Resolution: Incomplete Turns out this patch doesn't work, since there is no reliable way to turn a Query object into URL query parameters. I ended up solving the problem with a cache plugin. Let me know if you're interested in the solution and I can post the code. Pull optimal cache warming queries from a warm solr instance Key: SOLR-3222 URL: https://issues.apache.org/jira/browse/SOLR-3222 Project: Solr Issue Type: New Feature Components: search Affects Versions: 3.5, 4.0 Reporter: Russell Black Labels: patch, performance Attachments: SOLR-3222-autowarm.patch Ever wondered what queries to use to prime your cache? This patch allows you to query a warm running instance for a list of warming queries. The list is generated from the server's caches, meaning you get back an optimal set of queries. The set is optimal to the extent that the caches are optimized. The queries are returned in a format that can be consumed by the
{code:xml}
<listener event="firstSearcher" class="solr.QuerySenderListener">
{code}
section of {{solrconfig.xml}}. One can use this feature to generate a static set of good warming queries to place in {{solrconfig.xml}} under that same listener element. It can even be used in a dynamic fashion like this:
{code:xml}
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <xi:include href="http://host/solr/core/autowarm" xpointer="element(/1/2)"
      xmlns:xi="http://www.w3.org/2001/XInclude"/>
</listener>
{code}
which can work well in certain distributed load-balanced architectures, although in production it would be wise to add an {{xi:fallback}} element to the include in case the host is down. I implemented this by introducing a new request handler:
{code:xml}
<requestHandler name="/autowarm" class="solr.AutoWarmRequestHandler" />
{code}
The request handler pulls a configurable number of top keys from the {{filterCache}}, {{fieldValueCache}}, and {{queryResultCache}}. For each key, it constructs a query that will cause that key to be placed in the associated cache. The list of constructed queries is then returned in the response. Patch to follow.
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230627#comment-13230627 ] Hoss Man commented on SOLR-3251: bq. Any ideas for an external API? I think the best way to support this externally is using the existing mechanism for plugins: * a RequestHandler people can register (if they want to support external clients programmatically modifying the schema) that accepts ContentStreams containing whatever payload structure makes sense given the functionality. * an UpdateProcessor people can register (if they want to support stuff like SOLR-3250, where clients adding documents can submit any field name and a type is added based on the type of the value) which could be configured with mappings of Java types to fieldTypes and rules about other field attributes -- i.e. if a client submits a new field=value pair with a java.lang.Integer value, create a new tint field with that name and set stored=true. dynamically add field to schema --- Key: SOLR-3251 URL: https://issues.apache.org/jira/browse/SOLR-3251 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Attachments: SOLR-3251.patch One related piece of functionality needed for SOLR-3250 is the ability to dynamically add a field to the schema.
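The Java-type-to-fieldType mapping Hoss Man sketches for such an UpdateProcessor could look like the following. This is a hypothetical illustration only: the class name is mine, the field-type names ("tint" etc.) follow the stock example schema, and nothing here is actual Solr API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: choose a Solr field type for a previously unseen
// field based on the Java type of its first value, as the comment describes.
public class FieldTypeMapping {
    private static final Map<Class<?>, String> BY_JAVA_TYPE = new HashMap<>();
    static {
        BY_JAVA_TYPE.put(Integer.class, "tint");
        BY_JAVA_TYPE.put(Long.class, "tlong");
        BY_JAVA_TYPE.put(Double.class, "tdouble");
        BY_JAVA_TYPE.put(java.util.Date.class, "tdate");
    }

    // Fall back to a plain "string" field when no mapping is configured.
    static String fieldTypeFor(Object value) {
        return BY_JAVA_TYPE.getOrDefault(value.getClass(), "string");
    }
}
```

In a real processor this table would be populated from the plugin's configuration rather than hard-coded, along with rules for the other field attributes (stored, indexed, multiValued).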
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230631#comment-13230631 ] Uwe Schindler commented on LUCENE-3867: --- bq. Becomes interesting. Do you know of any architecture with pointers different than 4 or 8 bytes? When I was writing that code, I thought for a very long time about: hm, should I add a default case saying:
{noformat}
default: throw new Error("Lucene does not like architectures with pointer size " + addressSize);
{noformat}
But then I decided: if there is an architecture with a pointer size of 6, does this really break Lucene? Hm, maybe I should have added a comment there:
{noformat}
default: // this is the philosophical case of Lucene reaching an architecture returning something different here
{noformat}
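The default-case question reads concretely as a switch on the JVM's reported pointer width. A sketch with illustrative constants (the class and method names are mine, and real header sizes vary with the VM and its flags):

```java
// Hedged sketch of the pointer-width switch discussed above; not
// RamUsageEstimator's actual code, and the figures are illustrative only.
public class PointerWidth {
    static int objectRefBytes(int addressSize) {
        switch (addressSize) {
            case 4:  return 4; // 32-bit JVM, or 64-bit with compressed oops
            case 8:  return 8; // plain 64-bit JVM
            default: // "the philosophical case" of an exotic architecture
                throw new AssertionError("unexpected pointer size: " + addressSize);
        }
    }
}
```

The design question in the comment is exactly what this default branch should do: fail loudly, or silently pick a conservative value.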
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230644#comment-13230644 ] Uwe Schindler commented on LUCENE-3867: --- Maybe this for @UweSays:
{noformat}
default: throw new Error("Your processor(*) hit me with his " + addressSize + " inch dick"); // (*)Dawid
{noformat}